[
  {
    "path": ".bbp-project.yaml",
    "content": "tools:\n  ClangFormat:\n    enable: True\n    include:\n      match:\n      - coreneuron/.*\\.((cu)|(h)|([chi]pp))$\n  CMakeFormat:\n    enable: True\n"
  },
  {
    "path": ".clang-format.changes",
    "content": "IndentCaseLabels: true\nSortIncludes: false\nStatementMacros: [nrn_pragma_acc, nrn_pragma_omp]\n"
  },
  {
    "path": ".cmake-format.changes.yaml",
    "content": "additional_commands:\n  cpp_cc_build_time_copy:\n    flags: ['NO_TARGET']\n    kwargs:\n      INPUT: '1'\n      OUTPUT: '1'\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.md",
    "content": "---\nname: Bug report\nabout: Create a report to help us improve\ntitle: ''\nlabels: ''\nassignees: ''\n\n---\n\n**Describe the issue**\nA clear and concise description of what the issue is.\n\n**To Reproduce**\nSteps to reproduce the behavior:\n```bash\nA simple script\n```\n\n**Expected behavior**\nA clear and concise description of what you expected to happen.\n\n**Logs**\nIf possible attach helpful logs related to the issue.\nIf there is an issue during build `CMakeError.log`, `CMakeOutput.log` or the output of `make VERBOSE=1` would be helpful.\nOtherwise any error printed to the therminal.\n\n**System (please complete the following information)**\n - OS: [e.g. Ubuntu 20.04]\n - Compiler: [e.g. PGI 20.9]\n - Version: [e.g. master branch]\n - Backend: [e.g. CPU]\n\n**Additional context**\nAdd any other context about the problem here.\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/config.yml",
    "content": "blank_issues_enabled: true\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "content": "---\nname: Feature request\nabout: Suggest an idea for this project\ntitle: ''\nlabels: ''\nassignees: ''\n\n---\n\n**Is your feature request related to a problem? Please describe.**\nA clear and concise description of what the problem is. Ex. I'm always frustrated when [...]\n\n**Describe the solution you'd like**\nA clear and concise description of what you want to happen.\n\n**Describe alternatives you've considered**\nA clear and concise description of any alternative solutions or features you've considered.\n\n**Additional context**\nAdd any other context about the feature request here.\n"
  },
  {
    "path": ".github/problem-matchers/address.json",
    "content": "\n{\n    \"problemMatcher\": [\n        {\n            \"owner\": \"asan-problem-matcher\",\n            \"severity\": \"warning\",\n            \"pattern\": [\n                {\n                    \"regexp\": \"^.*AddressSanitizer: (.*)$\",\n                    \"message\": 1\n                }\n            ]\n        }\n    ]\n}\n"
  },
  {
    "path": ".github/problem-matchers/gcc.json",
    "content": "{\n    \"__comment\": \"Taken from vscode-cpptools's Extension/package.json gcc rule\",\n    \"problemMatcher\": [\n        {\n            \"owner\": \"gcc-problem-matcher\",\n            \"pattern\": [\n                {\n                    \"regexp\": \"^\\\\.\\\\./(.*):(\\\\d+):(\\\\d+):\\\\s+(?:fatal\\\\s+)?(warning|error):\\\\s+(.*)$\",\n                    \"file\": 1,\n                    \"line\": 2,\n                    \"column\": 3,\n                    \"severity\": 4,\n                    \"message\": 5\n                }\n            ]\n        }\n    ]\n}\n"
  },
  {
    "path": ".github/problem-matchers/undefined.json",
    "content": "{\n    \"problemMatcher\": [\n        {\n            \"owner\": \"ubsan-problem-matcher\",\n            \"severity\": \"warning\",\n            \"pattern\": [\n                {\n                    \"regexp\": \"^.*\\\\/(src\\\\/.*):(\\\\d+):(\\\\d+): runtime error: (.*)$\",\n                    \"file\": 1,\n                    \"line\": 2,\n                    \"column\": 3,\n                    \"message\": 4\n                },\n                {\n                    \"regexp\": \"^.*UndefinedBehaviorSanitizer:.*$\"\n                }\n            ]\n        }\n    ]\n}\n"
  },
  {
    "path": ".github/pull_request_template.md",
    "content": "**Description**\n\nPlease include a summary of the change and which issue is fixed or which feature is added.\n\n- [ ] Issue 1 fixed\n- [ ] Issue 2 fixed\n- [ ] Feature 1 added\n- [ ] Feature 2 added\n\nFixes # (issue)\n\n**How to test this?**\n\nPlease describe the tests that you ran to verify your changes. Provide instructions so we can reproduce if there is no integration test added with this PR. Please also list any relevant details for your test configuration\n\n```bash\ncmake ..\nmake -j8\nnrnivmodl mod\n./bin/nrnivmodl-core mod\n./x86_64/special script.py\n./x86_64/special-core --tstop=10 --datpath=coredat\n```\n\n**Test System**\n - OS: [e.g. Ubuntu 20.04]\n - Compiler: [e.g. PGI 20.9]\n - Version: [e.g. master branch]\n - Backend: [e.g. CPU]\n\n**Use certain branches in CI pipelines.**\n<!-- You can steer which versions of CoreNEURON dependencies will be used in\n     the various CI pipelines (GitLab, test-as-submodule) here. Expressions are\n     of the form PROJ_REF=VALUE, where PROJ is the relevant Spack package name,\n     transformed to upper case and with hyphens replaced with underscores.\n     REF may be BRANCH, COMMIT or TAG, with exceptions:\n      - SPACK_COMMIT and SPACK_TAG are invalid (hpc/gitlab-pipelines limitation)\n      - NEURON_COMMIT and NEURON_TAG are invalid (test-as-submodule limitation)\n     These values for NEURON, nmodl and Spack are the defaults and are given\n     for illustrative purposes; they can safely be removed.\n-->\nCI_BRANCHES:NEURON_BRANCH=master,NMODL_BRANCH=master,SPACK_BRANCH=develop\n"
  },
  {
    "path": ".github/workflows/clang_cmake_format_check.yaml",
    "content": "name: clang-cmake-format-check\n\nconcurrency:\n  group: ${{ github.workflow }}#${{ github.ref }}\n  cancel-in-progress: true\n\non:\n    push:\n\njobs:\n  build:\n    name: clang-cmake-format-check\n    runs-on: ubuntu-22.04\n    steps:\n        - name: Fetch repository\n          uses: actions/checkout@v3\n        - name: Fetch hpc-coding-conventions submodules\n          shell: bash\n          working-directory: ${{runner.workspace}}/CoreNeuron\n          run: git submodule update --init --depth 1 -- CMake/hpc-coding-conventions\n        - name: Run clang-format and cmake-format\n          shell: bash\n          working-directory: ${{runner.workspace}}/CoreNeuron\n          run: CMake/hpc-coding-conventions/bin/format -v --dry-run\n"
  },
  {
    "path": ".github/workflows/coreneuron-ci.yml",
    "content": "name: CoreNEURON CI\n\nconcurrency:\n  group: ${{ github.workflow }}#${{ github.ref }}\n  cancel-in-progress: true\n\non:\n  push:\n    branches:\n      - master\n      - release/**\n  pull_request:\n    branches:\n      - master\n      - release/**\n\nenv:\n  BUILD_TYPE: Release\n  DEFAULT_PY_VERSION: 3.8\n  MACOSX_DEPLOYMENT_TARGET: 11.0\n\njobs:\n  ci:\n    runs-on: ${{ matrix.os }}\n\n    name: ${{ matrix.os }} - ${{ toJson(matrix.config) }})\n\n    env:\n      SDK_ROOT: $(xcrun --sdk macosx --show-sdk-path)\n\n    strategy:\n      matrix:\n        os: [ubuntu-20.04, macOS-11]\n        config:\n          # Defaults: CORENRN_ENABLE_MPI=ON\n          - {cmake_option: \"-DCORENRN_ENABLE_MPI_DYNAMIC=ON\", flag_warnings: ON}\n          - {cmake_option: \"-DCORENRN_ENABLE_MPI_DYNAMIC=ON -DCORENRN_ENABLE_SHARED=OFF\"}\n          - {cmake_option: \"-DCORENRN_ENABLE_MPI=OFF\"}\n          - {use_nmodl: ON, py_version: 3.7}\n          - {use_nmodl: ON}\n        include:\n          - os: ubuntu-20.04\n            config:\n              gcc_version: 10\n          - os: ubuntu-20.04\n            config:\n              cmake_option: -DCORENRN_ENABLE_DEBUG_CODE=ON\n              documentation: ON\n          - os: ubuntu-22.04\n            config:\n              sanitizer: address\n          - os: ubuntu-22.04\n            config:\n              flag_warnings: ON\n              sanitizer: undefined\n      fail-fast: false\n\n    steps:\n\n      - name: Install homebrew packages\n        if: startsWith(matrix.os, 'macOS')\n        run: |\n          brew update\n          brew install bison boost ccache coreutils flex ninja openmpi\n          echo /usr/local/opt/flex/bin:/usr/local/opt/bison/bin >> $GITHUB_PATH\n        shell: bash\n\n      - name: Install apt packages\n        if: startsWith(matrix.os, 'ubuntu')\n        run: |\n          sudo apt-get install bison ccache doxygen flex libboost-all-dev \\\n            libfl-dev libopenmpi-dev ninja-build 
openmpi-bin\n        shell: bash\n\n      - name: Install specific apt packages\n        if: startsWith(matrix.os, 'ubuntu') && matrix.config.gcc_version\n        run: |\n          sudo apt-get install gcc-${{matrix.config.gcc_version}}\n          echo CC=\"gcc-${{matrix.config.gcc_version}}\" >> $GITHUB_ENV\n          echo CXX=\"g++-${{matrix.config.gcc_version}}\" >> $GITHUB_ENV\n        shell: bash\n\n      - name: Set up Python3\n        uses: actions/setup-python@v4\n        with:\n          python-version: ${{ env.PYTHON_VERSION }}\n        env:\n          PYTHON_VERSION: ${{matrix.config.py_version || env.DEFAULT_PY_VERSION}}\n\n      - name: Install NMODL dependencies\n        if: ${{ matrix.config.use_nmodl == 'ON' }}\n        run: |\n          python3 -m pip install --upgrade pip jinja2 pyyaml pytest sympy\n\n      - uses: actions/checkout@v3\n\n      - name: Install documentation dependencies\n        if: ${{matrix.config.documentation == 'ON'}}\n        working-directory: ${{runner.workspace}}/CoreNeuron\n        run: |\n          sudo apt-get install doxygen\n          python3 -m pip install --upgrade pip\n          python3 -m pip install --upgrade -r docs/docs_requirements.txt\n\n      - name: Register compiler warning problem matcher\n        if: ${{matrix.config.flag_warnings == 'ON'}}\n        run: echo \"::add-matcher::.github/problem-matchers/gcc.json\"\n\n      - name: Register sanitizer problem matcher\n        if: ${{matrix.config.sanitizer}}\n        run: echo \"::add-matcher::.github/problem-matchers/${{matrix.config.sanitizer}}.json\"\n\n      - name: Hash config dictionary\n        run: |\n          cat << EOF > matrix.json\n          ${{toJSON(matrix.config)}}\n          EOF\n          echo matrix.config JSON:\n          cat matrix.json\n          echo -----\n      \n      # Workaround for https://github.com/actions/cache/issues/92\n      - name: Checkout cache action\n        uses: actions/checkout@v3\n        with:\n          
repository: actions/cache\n          ref: v3\n          path: tmp/actions/cache\n          \n      - name: Make actions/cache@v3 run even on failure\n        run: |\n          sed -i'.bak' -e '/ post-if: /d' tmp/actions/cache/action.yml\n          \n      - name: Restore compiler cache\n        uses: ./tmp/actions/cache\n        with:\n          path: |\n            ${{runner.workspace}}/ccache\n          key: ${{matrix.os}}-${{hashfiles('matrix.json')}}-${{github.ref}}-${{github.sha}}\n          restore-keys: |\n            ${{matrix.os}}-${{hashfiles('matrix.json')}}-${{github.ref}}-\n            ${{matrix.os}}-${{hashfiles('matrix.json')}}-\n\n      - name: Build and Test\n        id: build-test\n        shell: bash\n        working-directory: ${{runner.workspace}}/CoreNeuron\n        run:  |\n          cmake_args=(${{matrix.config.cmake_option}})\n          if [[ \"${{ startsWith(matrix.os, 'macOS') }}\" = \"true\" ]]; then\n              cmake_args+=(-DCORENRN_ENABLE_OPENMP=OFF)\n          else\n              cmake_args+=(-DCORENRN_ENABLE_OPENMP=ON)\n          fi\n\n          if [[ \"${{matrix.config.flag_warnings}}\" == \"ON\" ]]; then\n              cmake_args+=(-DCORENRN_EXTRA_CXX_FLAGS=\"-Wall\")\n          fi\n\n          if [[ -n \"${{matrix.config.sanitizer}}\" ]]; then\n              CC=$(command -v clang-14)\n              CXX=$(command -v clang++-14)\n              symbolizer_path=$(realpath $(command -v llvm-symbolizer-14))\n              cmake_args+=(-DCMAKE_BUILD_TYPE=Custom \\\n                           -DCMAKE_C_FLAGS=\"-O1 -g -Wno-writable-strings\" \\\n                           -DCMAKE_CXX_FLAGS=\"-O1 -g -Wno-writable-strings\" \\\n                           -DLLVM_SYMBOLIZER_PATH=\"${symbolizer_path}\" \\\n                           -DCORENRN_SANITIZERS=$(echo ${{matrix.config.sanitizer}} | sed -e 's/-/,/g'))\n          else\n              CC=${CC:-gcc}\n              CXX=${CXX:-g++}\n          fi\n          \n          echo \"------- 
Build, Test and Install -------\"\n          mkdir build && cd build\n          if [[ \"$USE_NMODL\" == \"ON\" ]]; then\n              cmake_args+=(-DCORENRN_ENABLE_NMODL=ON \"-DCORENRN_NMODL_FLAGS=sympy --analytic\")\n          fi\n          cmake .. -G Ninja \"${cmake_args[@]}\" \\\n            -DCMAKE_C_COMPILER=\"${CC}\" \\\n            -DCMAKE_C_COMPILER_LAUNCHER=ccache \\\n            -DCMAKE_CXX_COMPILER=\"${CXX}\" \\\n            -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \\\n            \"-DCMAKE_INSTALL_PREFIX=${{runner.workspace}}/install\" \\\n            -DPYTHON_EXECUTABLE=$(command -v python3)\n          if ccache --version | grep -E '^ccache version 4\\.(4|4\\.1)$'\n          then\n            echo \"------- Disable ccache direct mode -------\"\n            # https://github.com/ccache/ccache/issues/935\n            export CCACHE_NODIRECT=1\n          fi\n          ccache -z\n          # Older versions don't support -v (verbose)\n          ccache -vs 2>/dev/null || ccache -s\n          cmake --build . --parallel\n          ccache -vs 2>/dev/null || ccache -s\n          ctest -T Test --output-on-failure\n          cmake --build . 
--target install\n        env:\n          CCACHE_BASEDIR: ${{runner.workspace}}/CoreNeuron\n          CCACHE_DIR: ${{runner.workspace}}/ccache\n          USE_NMODL: ${{matrix.config.use_nmodl}}\n\n      - uses: actions/upload-artifact@v3\n        with:\n          name: ctest-results-${{hashfiles('matrix.json')}}-sanitizer\n          path: ${{runner.workspace}}/CoreNeuron/build/Testing/*/Test.xml\n\n      # This step will set up an SSH connection on tmate.io for live debugging.\n      # To enable it, you have to:\n      #   * add 'live-debug-ci' to your PR title\n      #   * push something to your PR branch (note that just re-running the pipeline disregards the title update)\n      - name: live debug session on failure (manual steps required, check `.github/workflows/coreneuron-ci.yml`)\n        if: failure() && contains(github.event.pull_request.title, 'live-debug-ci')\n        uses: mxschmitt/action-tmate@v3\n\n      - name: Documentation\n        if: ${{ startsWith(matrix.os, 'ubuntu') && matrix.config.documentation == 'ON' }}\n        id: documentation\n        working-directory: ${{runner.workspace}}/CoreNeuron/build\n        run: |\n          echo \"------- Build Doxygen Documentation -------\";\n          cmake --build . --target docs\n          echo \"-------- Disable jekyll --------\";\n          pushd docs;\n          touch .nojekyll;\n          echo ::set-output name=status::done\n          \n      - name: Deploy 🚀\n        uses: JamesIves/github-pages-deploy-action@v4\n        if: steps.documentation.outputs.status == 'done' && github.ref == 'refs/heads/master'\n        with:\n          branch: gh-pages # The branch the action should deploy to.\n          folder: ${{runner.workspace}}/CoreNeuron/build/docs  # The folder the action should deploy.\n          single-commit: true #have a single commit on the deployment branch instead of maintaining the full history\n"
  },
  {
    "path": ".github/workflows/coverage.yml",
    "content": "name: Coverage\n\nconcurrency:\n  group: ${{ github.workflow }}#${{ github.ref }}\n  cancel-in-progress: true\n\non:\n  push:\n    branches:\n      - master\n      - release/**\n  pull_request:\n    branches:\n      - master\n      - release/**\n\nenv:\n  CMAKE_BUILD_PARALLEL_LEVEL: 3\n\njobs:\n  coverage:\n    runs-on: ubuntu-20.04\n    name: \"Coverage Test\"\n    steps:\n      - name: Install packages\n        run: |\n          sudo apt-get update\n          sudo apt-get install bison doxygen flex lcov libboost-all-dev \\\n            libopenmpi-dev libfl-dev ninja-build openmpi-bin python3-dev \\\n            python3-pip\n        shell: bash\n      - uses: actions/checkout@v3\n        with:\n          fetch-depth: 2\n      - name: Build and Test for Coverage\n        id: build-test\n        shell: bash\n        working-directory: ${{runner.workspace}}/CoreNeuron\n        run:  |\n          mkdir build && cd build\n          cmake .. -G Ninja \\\n            -DCMAKE_BUILD_TYPE=Debug \\\n            -DCMAKE_C_FLAGS=\"-coverage\" \\\n            -DCMAKE_CXX_FLAGS=\"-coverage\" \\\n            -DCORENRN_ENABLE_MPI=ON \\\n            -DCORENRN_ENABLE_DEBUG_CODE=ON\n          cmake --build .\n          (cd ..;  lcov --capture  --initial --directory . --no-external --output-file build/coverage-base.info)\n          ctest --output-on-failure\n          (cd ..; lcov --capture  --directory . 
--no-external --output-file build/coverage-run.info)\n          lcov --add-tracefile coverage-base.info --add-tracefile coverage-run.info --output-file coverage-combined.info\n          lcov --remove coverage-combined.info --output-file coverage.info \"*/external/*\"\n          lcov --list coverage.info\n      - name: Upload to codecov.io\n        run: |\n          # Download codecov script and perform integrity checks\n          curl https://keybase.io/codecovsecurity/pgp_keys.asc | gpg --import # One-time step \n          curl -Os https://uploader.codecov.io/latest/linux/codecov \n          curl -Os https://uploader.codecov.io/latest/linux/codecov.SHA256SUM \n          curl -Os https://uploader.codecov.io/latest/linux/codecov.SHA256SUM.sig \n          gpg --verify codecov.SHA256SUM.sig codecov.SHA256SUM \n          shasum -a 256 -c codecov.SHA256SUM \n          chmod +x codecov \n          ./codecov -f build/coverage.info\n"
  },
  {
    "path": ".github/workflows/test-as-submodule.yml",
    "content": "name: NEURON submodule\n\nconcurrency:\n  group: ${{ github.workflow }}#${{ github.ref }}\n  cancel-in-progress: true\n\non:\n  push:\n    branches:\n      - master\n      - release/**\n  pull_request:\n    branches:\n      - master\n      - release/**\n\njobs:\n  ci:\n    name: ${{ matrix.os }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      matrix:\n        include:\n          - os: ubuntu-20.04\n            cores: 2\n          - os: macOS-11\n            cores: 3\n      fail-fast: false\n    env:\n      CMAKE_BUILD_PARALLEL_LEVEL: ${{matrix.cores}}\n      SDK_ROOT: $(xcrun --sdk macosx --show-sdk-path)\n\n    steps:\n\n      - name: Install homebrew packages\n        if: startsWith(matrix.os, 'macOS')\n        run: |\n          brew install bison coreutils flex ninja openmpi\n          python3 -m pip install --upgrade numpy pytest pytest-cov\n          echo /usr/local/opt/flex/bin:/usr/local/opt/bison/bin >> $GITHUB_PATH\n          echo \"CC=gcc\" >> $GITHUB_ENV\n          echo \"CXX=g++\" >> $GITHUB_ENV\n\n      - name: Install apt packages\n        if: startsWith(matrix.os, 'ubuntu')\n        run: |\n          sudo apt-get update\n          sudo apt-get install bison cython3 flex libfl-dev libopenmpi-dev \\\n            ninja-build openmpi-bin python3-dev\n          python3 -m pip install --upgrade numpy pytest pytest-cov\n          echo \"CC=gcc\" >> $GITHUB_ENV\n          echo \"CXX=g++\" >> $GITHUB_ENV\n\n      - name: Set NEURON branch\n        id: vars\n        env:\n          GITHUB_PR_BODY: ${{ github.event.pull_request.body }}\n        run: |\n          nrn_branch=$(echo \"${GITHUB_PR_BODY}\" | grep \"^CI_BRANCHES\" \\\n                      | awk -F '[:,]{1}NEURON_BRANCH=' '{print $2}' \\\n                      | awk -F ',' '{print $1}')\n          if [ -z \"$nrn_branch\" ]; then\n              nrn_branch=master\n          fi\n          echo \"Will use neuron branch: $nrn_branch\"\n          echo ::set-output 
name=neuron_branch::\"${nrn_branch}\"\n\n      - uses: actions/checkout@v3\n        name: Checkout NEURON\n        with:\n          path: nrn\n          repository: neuronsimulator/nrn\n          ref: ${{ steps.vars.outputs.neuron_branch }}\n\n      - name: Update CoreNEURON submodule\n        run: |\n          cd ${GITHUB_WORKSPACE}/nrn\n          coreneuron_sha=${{github.event.pull_request.head.sha}}\n          if [[ -z ${coreneuron_sha} ]]; then\n          # presumably we're running on a push event\n          coreneuron_sha=${{github.sha}}\n          fi\n          echo \"Using CoreNEURON SHA ${coreneuron_sha}\"\n          # https://stackoverflow.com/a/33575837\n          git update-index --cacheinfo 160000,${coreneuron_sha},external/coreneuron\n          git submodule update --init external/coreneuron\n          echo \"NEURON status\"\n          git status\n          git log -n 1\n          cd external/coreneuron\n          echo \"CoreNEURON status\"\n          git status\n          git log -n 1\n\n      - name: Configure NEURON\n        run: |\n          cd ${GITHUB_WORKSPACE}/nrn\n          mkdir build install\n          cd build\n          # NEURON CMake assumes this is defined.\n          export SHELL=$(command -v bash)\n          openMP=\" -DCORENRN_ENABLE_OPENMP=ON\"\n          if [[ \"${{ startsWith(matrix.os, 'macOS') }}\" = \"true\" ]]; then\n            openMP=\" -DCORENRN_ENABLE_OPENMP=OFF\"\n          fi\n          cmake .. -G Ninja \\\n            -DCMAKE_BUILD_TYPE=RelWithDebInfo \\\n            -DCMAKE_INSTALL_PREFIX=../install \\\n            -DPYTHON_EXECUTABLE=$(command -v python3) \\\n            -DNRN_ENABLE_CORENEURON=ON \\\n            -DNRN_ENABLE_INTERVIEWS=OFF \\\n            -DNRN_ENABLE_RX3D=OFF \\\n            -DNRN_ENABLE_MPI_DYNAMIC=ON \\\n            -DNRN_ENABLE_TESTS=ON ${openMP}\n\n      - name: Build NEURON\n        run: |\n          cd ${GITHUB_WORKSPACE}/nrn/build\n          cmake --build . 
--parallel\n\n      - name: Test NEURON\n        run: |\n          cd ${GITHUB_WORKSPACE}/nrn/build\n          ctest --output-on-failure\n\n      - name: Install NEURON\n        run: |\n          cd ${GITHUB_WORKSPACE}/nrn/build\n          cmake --build . --target install\n\n      # This step will set up an SSH connection on tmate.io for live debugging.\n      # To enable it, you have to:\n      #   * add 'live-debug-ci' to your PR title\n      #   * push something to your PR branch (note that just re-running the pipeline disregards the title update)\n      - name: live debug session on failure (manual steps required, check `.github/workflows/test-as-submodule.yml`)\n        if: failure() && contains(github.event.pull_request.title, 'live-debug-ci')\n        uses: mxschmitt/action-tmate@v3\n"
  },
  {
    "path": ".gitignore",
    "content": "cmake-build-debug*\n*build*\nspconfig.*\n*~\n.DS_Store\n*.swp\n*.srctrl*\n\n# HPC coding conventions\n.clang-format\n.clang-tidy\n.cmake-format.yaml\n.pre-commit-config.yaml\n.bbp-project-venv/\n"
  },
  {
    "path": ".gitlab-ci.yml",
    "content": "include:\n  - project: hpc/gitlab-pipelines\n    file:\n      - spack-build-components.gitlab-ci.yml\n      - github-project-pipelines.gitlab-ci.yml\n    ref: '$GITLAB_PIPELINES_BRANCH'\n  - project: hpc/gitlab-upload-logs\n    file: enable-upload.yml\n\nvariables:\n  NEURON_BRANCH:\n    description: Branch of NEURON to build against CoreNEURON (NEURON_COMMIT and NEURON_TAG also possible)\n    value: master\n  NMODL_BRANCH:\n    description: Branch of NMODL to build CoreNEURON against (NMODL_COMMIT and NMODL_TAG also possible)\n    value: master\n  SPACK_BRANCH:\n    description: Branch of BlueBrain Spack to use for the CI pipeline\n    value: develop\n  SPACK_DEPLOYMENT_SUFFIX:\n    description: Extra path component used when finding deployed software. Set to something like `pulls/1497` use software built for https://github.com/BlueBrain/spack/pull/1497. You probably want to set SPACK_BRANCH to the branch used in the relevant PR if you set this.\n    value: ''\n\n# Set up Spack\nspack_setup:\n  extends: .spack_setup_ccache\n  variables:\n    CORENEURON_COMMIT: ${CI_COMMIT_SHA}\n    # Enable fetching GitHub PR descriptions and parsing them to find out what\n    # branches to build of other projects.\n    PARSE_GITHUB_PR_DESCRIPTIONS: \"true\"\n\nsimulation_stack:\n  stage: .pre\n  # Take advantage of GitHub PR description parsing in the spack_setup job.\n  needs: [spack_setup]\n  trigger:\n    project: hpc/sim/blueconfigs\n    # CoreNEURON CI status depends on the BlueConfigs CI status.\n    strategy: depend\n  variables:\n    GITLAB_PIPELINES_BRANCH: $GITLAB_PIPELINES_BRANCH\n    SPACK_ENV_FILE_URL: $SPACK_SETUP_COMMIT_MAPPING_URL\n\n# Performance seems to be terrible when we get too many jobs on a single node.\n.build:\n  extends: [.spack_build]\n  variables:\n    bb5_ntasks: 2   # so we block 16 cores\n    bb5_cpus_per_task: 8 # ninja -j {this}\n    bb5_memory: 76G # ~16*384/80\n\n.spack_intel:\n  variables:\n    SPACK_PACKAGE_COMPILER: 
intel\n.spack_nvhpc:\n  variables:\n    SPACK_PACKAGE_COMPILER: nvhpc\n.build_neuron:\n  extends: [.build]\n  timeout: two hours\n  variables:\n    bb5_duration: \"2:00:00\"\n    SPACK_PACKAGE: neuron\n    SPACK_PACKAGE_SPEC: +coreneuron+debug+tests~legacy-unit~rx3d model_tests=channel-benchmark,olfactory,tqperf-heavy\n.gpu_node:\n  variables:\n    bb5_constraint: volta\n    bb5_cpus_per_task: 2\n.test_neuron:\n  extends: [.ctest]\n  variables:\n    bb5_ntasks: 16\n    bb5_memory: 76G # ~16*384/80\n\n# Build NMODL once with GCC\nbuild:nmodl:\n  extends: [.build]\n  variables:\n    SPACK_PACKAGE: nmodl\n    SPACK_PACKAGE_SPEC: ~legacy-unit\n    SPACK_PACKAGE_COMPILER: gcc\n\n# Build CoreNEURON\n.build_coreneuron:\n  extends: [.build]\n  variables:\n    SPACK_PACKAGE: coreneuron\n    # NEURON depends on py-mpi4py, most of whose dependencies are pulled in by\n    # nmodl%gcc, with the exception of MPI, which is pulled in by\n    # coreneuron%{nvhpc,intel}. hpe-mpi is an external package anyway, so\n    # setting its compiler is just changing how it is labelled in the\n    # dependency graph and not changing which installation is used, but this\n    # means that in the NEURON step an existing py-mpi4py%gcc can be used.\n    # Otherwise a new py-mpi4py with hpe-mpi%{nvhpc,intel} will be built.\n    # caliper: papi%nvhpc does not build; use the caliper from the deployment\n    # TODO: fix this more robustly so we don't have to play so many games.\n    SPACK_PACKAGE_DEPENDENCIES: ^hpe-mpi%gcc ^caliper%gcc+cuda cuda_arch=70\n\n# TODO: improve coverage by switching an Intel build to be statically linked\n# TODO: improve coverage by switching an Intel build to RelWithDebInfo\n# TODO: improve coverage by enabling +openmp on an Intel build\nbuild:coreneuron:mod2c:intel:shared:debug:\n  extends: [.build_coreneuron, .spack_intel]\n  variables:\n    SPACK_PACKAGE_SPEC: +caliper~gpu~legacy-unit~nmodl~openmp+shared+tests~unified 
build_type=Debug\n\nbuild:coreneuron:nmodl:intel:debug:legacy:\n  extends: [.build_coreneuron, .spack_intel]\n  needs: [\"build:nmodl\"]\n  variables:\n    SPACK_PACKAGE_SPEC: +caliper~gpu~legacy-unit+nmodl~openmp~shared~sympy+tests~unified build_type=Debug\n\n# Disable caliper to improve coverage\nbuild:coreneuron:nmodl:intel:shared:debug:\n  extends: [.build_coreneuron, .spack_intel]\n  needs: [\"build:nmodl\"]\n  variables:\n    SPACK_PACKAGE_DEPENDENCIES: ^hpe-mpi%gcc\n    SPACK_PACKAGE_SPEC: ~caliper~gpu~legacy-unit+nmodl~openmp+shared+sympy+tests~unified build_type=Debug\n\n# Not linked to a NEURON build+test job, see\n# https://github.com/BlueBrain/CoreNeuron/issues/594\nbuild:coreneuron:mod2c:nvhpc:acc:debug:unified:\n  extends: [.build_coreneuron, .spack_nvhpc]\n  variables:\n    SPACK_PACKAGE_SPEC: +caliper+gpu~legacy-unit~nmodl+openmp~shared+tests+unified build_type=Debug\n\n# Shared + OpenACC + OpenMP host threading has problems\nbuild:coreneuron:mod2c:nvhpc:acc:shared:\n  extends: [.build_coreneuron, .spack_nvhpc]\n  variables:\n    SPACK_PACKAGE_SPEC: +caliper+gpu~legacy-unit~nmodl~openmp+shared+tests~unified build_type=RelWithDebInfo\n\nbuild:coreneuron:nmodl:nvhpc:acc:debug:legacy:\n  extends: [.build_coreneuron, .spack_nvhpc]\n  needs: [\"build:nmodl\"]\n  variables:\n    SPACK_PACKAGE_SPEC: +caliper+gpu~legacy-unit+nmodl~openmp~shared~sympy+tests~unified build_type=Debug\n\nbuild:coreneuron:nmodl:nvhpc:acc:shared:\n  extends: [.build_coreneuron, .spack_nvhpc]\n  needs: [\"build:nmodl\"]\n  variables:\n    SPACK_PACKAGE_SPEC: +caliper+gpu~legacy-unit+nmodl~openmp+shared+sympy+tests~unified build_type=RelWithDebInfo\n\nbuild:coreneuron:nmodl:nvhpc:omp:legacy:\n  extends: [.build_coreneuron, .spack_nvhpc]\n  needs: [\"build:nmodl\"]\n  variables:\n    SPACK_PACKAGE_SPEC: +caliper+gpu~legacy-unit+nmodl+openmp~shared~sympy+tests~unified build_type=RelWithDebInfo\n\nbuild:coreneuron:nmodl:nvhpc:omp:debug:\n  extends: [.build_coreneuron, .spack_nvhpc]\n  
needs: [\"build:nmodl\"]\n  variables:\n    SPACK_PACKAGE_SPEC: +caliper+gpu~legacy-unit+nmodl+openmp~shared+sympy+tests~unified build_type=Debug\n\n# Build NEURON\nbuild:neuron:mod2c:intel:shared:debug:\n  extends: [.build_neuron, .spack_intel]\n  needs: [\"build:coreneuron:mod2c:intel:shared:debug\"]\n\nbuild:neuron:nmodl:intel:debug:legacy:\n  extends: [.build_neuron, .spack_intel]\n  needs: [\"build:coreneuron:nmodl:intel:debug:legacy\"]\n\nbuild:neuron:nmodl:intel:shared:debug:\n  extends: [.build_neuron, .spack_intel]\n  needs: [\"build:coreneuron:nmodl:intel:shared:debug\"]\n\nbuild:neuron:mod2c:nvhpc:acc:shared:\n  extends: [.build_neuron, .spack_nvhpc]\n  needs: [\"build:coreneuron:mod2c:nvhpc:acc:shared\"]\n\nbuild:neuron:nmodl:nvhpc:acc:debug:legacy:\n  extends: [.build_neuron, .spack_nvhpc]\n  needs: [\"build:coreneuron:nmodl:nvhpc:acc:debug:legacy\"]\n\nbuild:neuron:nmodl:nvhpc:acc:shared:\n  extends: [.build_neuron, .spack_nvhpc]\n  needs: [\"build:coreneuron:nmodl:nvhpc:acc:shared\"]\n\nbuild:neuron:nmodl:nvhpc:omp:legacy:\n  extends: [.build_neuron, .spack_nvhpc]\n  needs: [\"build:coreneuron:nmodl:nvhpc:omp:legacy\"]\n\nbuild:neuron:nmodl:nvhpc:omp:debug:\n  extends: [.build_neuron, .spack_nvhpc]\n  needs: [\"build:coreneuron:nmodl:nvhpc:omp:debug\"]\n\n# Test CoreNEURON\ntest:coreneuron:mod2c:intel:shared:debug:\n  extends: [.ctest]\n  needs: [\"build:coreneuron:mod2c:intel:shared:debug\"]\n\ntest:coreneuron:nmodl:intel:debug:legacy:\n  extends: [.ctest]\n  needs: [\"build:coreneuron:nmodl:intel:debug:legacy\"]\n\ntest:coreneuron:nmodl:intel:shared:debug:\n  extends: [.ctest]\n  needs: [\"build:coreneuron:nmodl:intel:shared:debug\"]\n\ntest:coreneuron:mod2c:nvhpc:acc:debug:unified:\n  extends: [.ctest, .gpu_node]\n  needs: [\"build:coreneuron:mod2c:nvhpc:acc:debug:unified\"]\n\ntest:coreneuron:mod2c:nvhpc:acc:shared:\n  extends: [.ctest, .gpu_node]\n  needs: 
[\"build:coreneuron:mod2c:nvhpc:acc:shared\"]\n\ntest:coreneuron:nmodl:nvhpc:acc:debug:legacy:\n  extends: [.ctest, .gpu_node]\n  needs: [\"build:coreneuron:nmodl:nvhpc:acc:debug:legacy\"]\n\ntest:coreneuron:nmodl:nvhpc:acc:shared:\n  extends: [.ctest, .gpu_node]\n  needs: [\"build:coreneuron:nmodl:nvhpc:acc:shared\"]\n\ntest:coreneuron:nmodl:nvhpc:omp:legacy:\n  extends: [.ctest, .gpu_node]\n  needs: [\"build:coreneuron:nmodl:nvhpc:omp:legacy\"]\n\ntest:coreneuron:nmodl:nvhpc:omp:debug:\n  extends: [.ctest, .gpu_node]\n  needs: [\"build:coreneuron:nmodl:nvhpc:omp:debug\"]\n\n# Test NEURON\ntest:neuron:mod2c:intel:shared:debug:\n  extends: [.test_neuron]\n  needs: [\"build:neuron:mod2c:intel:shared:debug\"]\n\ntest:neuron:nmodl:intel:debug:legacy:\n  extends: [.test_neuron]\n  needs: [\"build:neuron:nmodl:intel:debug:legacy\"]\n\ntest:neuron:nmodl:intel:shared:debug:\n  extends: [.test_neuron]\n  needs: [\"build:neuron:nmodl:intel:shared:debug\"]\n\ntest:neuron:mod2c:nvhpc:acc:shared:\n  extends: [.test_neuron, .gpu_node]\n  needs: [\"build:neuron:mod2c:nvhpc:acc:shared\"]\n\ntest:neuron:nmodl:nvhpc:acc:debug:legacy:\n  extends: [.test_neuron, .gpu_node]\n  needs: [\"build:neuron:nmodl:nvhpc:acc:debug:legacy\"]\n\ntest:neuron:nmodl:nvhpc:acc:shared:\n  extends: [.test_neuron, .gpu_node]\n  needs: [\"build:neuron:nmodl:nvhpc:acc:shared\"]\n\ntest:neuron:nmodl:nvhpc:omp:legacy:\n  extends: [.test_neuron, .gpu_node]\n  needs: [\"build:neuron:nmodl:nvhpc:omp:legacy\"]\n\ntest:neuron:nmodl:nvhpc:omp:debug:\n  extends: [.test_neuron, .gpu_node]\n  needs: [\"build:neuron:nmodl:nvhpc:omp:debug\"]\n"
  },
  {
    "path": ".gitmodules",
    "content": "[submodule \"external/mod2c\"]\n  path = external/mod2c\n  url = https://github.com/BlueBrain/mod2c\n[submodule \"external/CLI11\"]\n  path = external/CLI11\n  url = https://github.com/CLIUtils/CLI11.git\n[submodule \"external/nmodl\"]\n  path = external/nmodl\n  url = https://github.com/BlueBrain/nmodl\n[submodule \"external/Random123\"]\n\tpath = external/Random123\n\turl = https://github.com/BlueBrain/Random123.git\n[submodule \"CMake/hpc-coding-conventions\"]\n\tpath = CMake/hpc-coding-conventions\n\turl = https://github.com/BlueBrain/hpc-coding-conventions.git\n"
  },
  {
    "path": ".readthedocs.yml",
    "content": "version: 2\n\nconda:\n  environment: docs/conda_environment.yml\n\npython:\n  install:\n    - requirements: docs/docs_requirements.txt\n"
  },
  {
    "path": ".sanitizers/undefined.supp",
    "content": "unsigned-integer-overflow:_philox4x32bumpkey(r123array2x32)\nunsigned-integer-overflow:coreneuron::TNode::mkhash()\nunsigned-integer-overflow:std::mersenne_twister_engine\n"
  },
  {
    "path": "AUTHORS.txt",
    "content": "Akiko Sato\nAleksandr Ovcharenko\nAlessandro Cattabiani\nAlexander Dietz\nAlexandru Săvulescu\nAntonio Bellotta\nBaudouin Del Marmol\nBruno Magalhaes\nChristos Kotsalos\nFabien Delalondre\nFelix Schuermann (contributor)\nFernando Pereira\nFrancesco Cremonesi\nIoannis Magkanaris\nJames Gonzalo King\nJeremy Fouriaux\nJorge Blanco Alonso\nKai Langen\nMichael Lee Hines\nNicolas Cornu\nOlli Lupton\nOmar Awile\nOren Amsalem\nPramod Shivaji Kumbhar (maintainer)\nSam Yates\nSergio Rivas-Gomez\nTapasweni Pathak\nWeina Ji\nviniciusdepadua\n"
  },
  {
    "path": "CMake/AddHpcCodingConvSubmodule.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\ninclude(FindPackageHandleStandardArgs)\nfind_package(FindPkgConfig QUIET)\n\nfind_path(\n  HpcCodingConv_PROJ\n  NAMES setup.cfg\n  PATHS \"${CORENEURON_PROJECT_SOURCE_DIR}/CMake/hpc-coding-conventions/\")\n\nfind_package_handle_standard_args(HpcCodingConv REQUIRED_VARS HpcCodingConv_PROJ)\n\nif(NOT HpcCodingConv_FOUND)\n  find_package(Git 1.8.3 QUIET)\n  if(NOT ${GIT_FOUND})\n    message(FATAL_ERROR \"git not found, clone repository with --recursive\")\n  endif()\n  message(\n    STATUS \"Sub-module CMake/hpc-coding-conventions missing: running git submodule update --init\")\n  execute_process(\n    COMMAND ${GIT_EXECUTABLE} submodule update --init --\n            ${CORENEURON_PROJECT_SOURCE_DIR}/CMake/hpc-coding-conventions\n    WORKING_DIRECTORY ${CORENEURON_PROJECT_SOURCE_DIR})\nendif()\n"
  },
  {
    "path": "CMake/AddMod2cSubmodule.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\nfind_package(FindPkgConfig QUIET)\n\nfind_path(\n  MOD2C_PROJ\n  NAMES CMakeLists.txt\n  PATHS \"${CORENEURON_PROJECT_SOURCE_DIR}/external/mod2c\")\n\nfind_package_handle_standard_args(MOD2C REQUIRED_VARS MOD2C_PROJ)\n\nif(NOT MOD2C_FOUND)\n  find_package(Git 1.8.3 QUIET)\n  if(NOT ${GIT_FOUND})\n    message(FATAL_ERROR \"git not found, clone repository with --recursive\")\n  endif()\n  message(STATUS \"Sub-module mod2c missing : running git submodule update --init --recursive\")\n  execute_process(\n    COMMAND ${GIT_EXECUTABLE} submodule update --init --recursive --\n            ${CORENEURON_PROJECT_SOURCE_DIR}/external/mod2c\n    WORKING_DIRECTORY ${CORENEURON_PROJECT_SOURCE_DIR})\nelse()\n  message(STATUS \"Using mod2c submodule from ${MOD2C_PROJ}\")\nendif()\n\nadd_subdirectory(${CORENEURON_PROJECT_SOURCE_DIR}/external/mod2c)\n"
  },
  {
    "path": "CMake/AddNmodlSubmodule.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\nfind_package(FindPkgConfig QUIET)\n\nfind_path(\n  NMODL_PROJ\n  NAMES CMakeLists.txt\n  PATHS \"${CORENEURON_PROJECT_SOURCE_DIR}/external/nmodl\")\n\nfind_package_handle_standard_args(NMODL REQUIRED_VARS NMODL_PROJ)\n\nif(NOT NMODL_FOUND)\n  find_package(Git 1.8.3 QUIET)\n  if(NOT ${GIT_FOUND})\n    message(FATAL_ERROR \"git not found, clone repository with --recursive\")\n  endif()\n  message(STATUS \"Sub-module nmodl missing : running git submodule update --init\")\n  execute_process(\n    COMMAND ${GIT_EXECUTABLE} submodule update --init --\n            ${CORENEURON_PROJECT_SOURCE_DIR}/external/nmodl\n    WORKING_DIRECTORY ${CORENEURON_PROJECT_SOURCE_DIR})\nelse()\n  message(STATUS \"Using nmodl submodule from ${NMODL_PROJ}\")\nendif()\n\nadd_subdirectory(${CORENEURON_PROJECT_SOURCE_DIR}/external/nmodl)\n"
  },
  {
    "path": "CMake/AddRandom123Submodule.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\ninclude(FindPackageHandleStandardArgs)\nfind_package(FindPkgConfig QUIET)\n\nfind_path(\n  Random123_PROJ\n  NAMES LICENSE\n  PATHS \"${CORENEURON_PROJECT_SOURCE_DIR}/external/Random123\"\n  NO_CMAKE_PATH NO_CMAKE_ENVIRONMENT_PATH NO_SYSTEM_ENVIRONMENT_PATH NO_CMAKE_SYSTEM_PATH)\n\nfind_package_handle_standard_args(Random123 REQUIRED_VARS Random123_PROJ)\n\nif(NOT Random123_FOUND)\n  find_package(Git 1.8.3 QUIET)\n  if(NOT ${GIT_FOUND})\n    message(FATAL_ERROR \"git not found, clone repository with --recursive\")\n  endif()\n  message(STATUS \"Sub-module Random123 missing: running git submodule update --init --recursive\")\n  execute_process(\n    COMMAND ${GIT_EXECUTABLE} submodule update --init --recursive --\n            ${CORENEURON_PROJECT_SOURCE_DIR}/external/Random123\n    WORKING_DIRECTORY ${CORENEURON_PROJECT_SOURCE_DIR})\nendif()\n"
  },
  {
    "path": "CMake/CrayPortability.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\nif(IS_DIRECTORY \"/opt/cray\")\n  set(CRAY_SYSTEM TRUE)\nendif()\n\nif(CRAY_SYSTEM)\n  # default build type is static for cray\n  if(NOT DEFINED COMPILE_LIBRARY_TYPE)\n    set(COMPILE_LIBRARY_TYPE \"STATIC\")\n  endif()\n\n  # Cray wrapper take care of everything!\n  set(MPI_LIBRARIES \"\")\n  set(MPI_C_LIBRARIES \"\")\n  set(MPI_CXX_LIBRARIES \"\")\n\n  # ~~~\n  # instead of -rdynamic, cray wrapper needs either -dynamic or -static(default)\n  # also cray compiler needs fPIC flag\n  # ~~~\n  if(COMPILE_LIBRARY_TYPE STREQUAL \"SHARED\")\n    set(CMAKE_SHARED_LIBRARY_LINK_CXX_FLAGS \"-dynamic\")\n    # TODO: add Cray compiler flag configurations in CompilerFlagsHelpers.cmake\n    if(CMAKE_C_COMPILER_IS_CRAY)\n      set(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS} -fPIC\")\n      set(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS} -fPIC\")\n    endif()\n\n  else()\n    set(CMAKE_SHARED_LIBRARY_LINK_CXX_FLAGS \"\")\n  endif()\nelse()\n  # default is shared library\n  if(NOT DEFINED COMPILE_LIBRARY_TYPE)\n    set(COMPILE_LIBRARY_TYPE \"SHARED\")\n  endif()\nendif()\n"
  },
  {
    "path": "CMake/GitRevision.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# ~~~\n# For now use simple approach to get version information as git is often\n# avaialble on the machine where we are building from source\n# ~~~\n\nfind_package(Git)\n\nif(GIT_FOUND)\n  # get last commit sha1\n  execute_process(\n    COMMAND ${GIT_EXECUTABLE} -c log.showSignature=false log -1 --format=%h\n    WORKING_DIRECTORY ${CORENEURON_PROJECT_SOURCE_DIR}\n    OUTPUT_VARIABLE GIT_REVISION_SHA1\n    ERROR_QUIET OUTPUT_STRIP_TRAILING_WHITESPACE)\n  # get last commit date\n  execute_process(\n    COMMAND ${GIT_EXECUTABLE} -c log.showSignature=false show -s --format=%ci\n    WORKING_DIRECTORY ${CORENEURON_PROJECT_SOURCE_DIR}\n    OUTPUT_VARIABLE GIT_REVISION_DATE\n    ERROR_QUIET OUTPUT_STRIP_TRAILING_WHITESPACE)\n  set(CN_GIT_REVISION \"${GIT_REVISION_SHA1} (${GIT_REVISION_DATE})\")\nelse()\n  set(CN_GIT_REVISION \"unknown\")\nendif()\n"
  },
  {
    "path": "CMake/MakefileBuildOptions.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2022 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# =============================================================================\n# NMODL CLI options : common and backend specific\n# =============================================================================\n# ~~~\n# if user pass arguments then use those as common arguments\n# note that inlining is done by default\n# ~~~\nset(NMODL_COMMON_ARGS \"passes --inline\")\n\nif(NOT \"${CORENRN_NMODL_FLAGS}\" STREQUAL \"\")\n  string(APPEND NMODL_COMMON_ARGS \" ${CORENRN_NMODL_FLAGS}\")\nendif()\n\nset(NMODL_CPU_BACKEND_ARGS \"host --c\")\nset(NMODL_ACC_BACKEND_ARGS \"host --c acc --oacc\")\n\n# =============================================================================\n# Construct the linker arguments that are used inside nrnivmodl-core (to build libcorenrnmech from\n# libcoreneuron-core, libcoreneuron-cuda and mechanism object files) and inside nrnivmodl (to link\n# NEURON's special against CoreNEURON's libcorenrnmech). 
These are stored in two global properties:\n# CORENRN_LIB_LINK_FLAGS (used by NEURON/nrnivmodl to link special against CoreNEURON) and\n# CORENRN_LIB_LINK_DEP_FLAGS (used by CoreNEURON/nrnivmodl-core to link libcorenrnmech.so).\n# Conceptually: CORENRN_LIB_LINK_FLAGS = -lcorenrnmech $CORENRN_LIB_LINK_DEP_FLAGS\n# =============================================================================\nif(NOT CORENRN_ENABLE_SHARED)\n  set_property(GLOBAL APPEND_STRING PROPERTY CORENRN_LIB_LINK_FLAGS \" -Wl,--whole-archive\")\nendif()\nset_property(GLOBAL APPEND_STRING PROPERTY CORENRN_LIB_LINK_FLAGS \" -lcorenrnmech\")\nif(NOT CORENRN_ENABLE_SHARED)\n  set_property(GLOBAL APPEND_STRING PROPERTY CORENRN_LIB_LINK_FLAGS \" -Wl,--no-whole-archive\")\nendif()\n# Essentially we \"just\" want to unpack the CMake dependencies of the `coreneuron-core` target into a\n# plain string that we can bake into the Makefiles in both NEURON and CoreNEURON.\nfunction(coreneuron_process_library_path library)\n  get_filename_component(library_dir \"${library}\" DIRECTORY)\n  if(NOT library_dir)\n    # In case target is not a target but is just the name of a library, e.g. \"dl\"\n    set_property(GLOBAL APPEND_STRING PROPERTY CORENRN_LIB_LINK_DEP_FLAGS \" -l${library}\")\n  elseif(\"${library_dir}\" MATCHES \"^(/lib|/lib64|/usr/lib|/usr/lib64)$\")\n    # e.g. 
/usr/lib64/libpthread.so -> -lpthread TODO: consider using\n    # https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_IMPLICIT_LINK_DIRECTORIES.html, or\n    # dropping this special case entirely\n    get_filename_component(libname ${library} NAME_WE)\n    string(REGEX REPLACE \"^lib\" \"\" libname ${libname})\n    set_property(GLOBAL APPEND_STRING PROPERTY CORENRN_LIB_LINK_DEP_FLAGS \" -l${libname}\")\n  else()\n    # It's a full path, include that on the line\n    set_property(GLOBAL APPEND_STRING PROPERTY CORENRN_LIB_LINK_DEP_FLAGS\n                                               \" -Wl,-rpath,${library_dir} ${library}\")\n  endif()\nendfunction()\nfunction(coreneuron_process_target target)\n  if(TARGET ${target})\n    if(NOT target STREQUAL \"coreneuron-core\")\n      # This is a special case: libcoreneuron-core.a is manually unpacked into .o files by the\n      # nrnivmodl-core Makefile, so we do not want to also emit an -lcoreneuron-core argument.\n      get_target_property(target_inc_dirs ${target} INTERFACE_INCLUDE_DIRECTORIES)\n      if(target_inc_dirs)\n        foreach(inc_dir_genex ${target_inc_dirs})\n          string(GENEX_STRIP \"${inc_dir_genex}\" inc_dir)\n          if(inc_dir)\n            set_property(GLOBAL APPEND_STRING PROPERTY CORENRN_EXTRA_COMPILE_FLAGS \" -I${inc_dir}\")\n          endif()\n        endforeach()\n      endif()\n      get_target_property(target_imported ${target} IMPORTED)\n      if(target_imported)\n        # In this case we can extract the full path to the library\n        get_target_property(target_location ${target} LOCATION)\n        coreneuron_process_library_path(${target_location})\n      else()\n        # This is probably another of our libraries, like -lcoreneuron-cuda. 
We might need to add -L\n        # and an RPATH later.\n        set_property(GLOBAL APPEND_STRING PROPERTY CORENRN_LIB_LINK_DEP_FLAGS \" -l${target}\")\n      endif()\n    endif()\n    get_target_property(target_libraries ${target} LINK_LIBRARIES)\n    if(target_libraries)\n      foreach(child_target ${target_libraries})\n        coreneuron_process_target(${child_target})\n      endforeach()\n    endif()\n    return()\n  endif()\n  coreneuron_process_library_path(\"${target}\")\nendfunction()\ncoreneuron_process_target(coreneuron-core)\nget_property(CORENRN_LIB_LINK_DEP_FLAGS GLOBAL PROPERTY CORENRN_LIB_LINK_DEP_FLAGS)\nset_property(GLOBAL APPEND_STRING PROPERTY CORENRN_LIB_LINK_FLAGS \" ${CORENRN_LIB_LINK_DEP_FLAGS}\")\n# In static builds then NEURON uses dlopen(nullptr, ...) to look for the corenrn_embedded_run\n# symbol, which comes from libcoreneuron-core.a and gets included in libcorenrnmech.\nif(NOT CORENRN_ENABLE_SHARED)\n  set_property(GLOBAL APPEND_STRING PROPERTY CORENRN_LIB_LINK_FLAGS \" -rdynamic\")\nendif()\nget_property(CORENRN_EXTRA_COMPILE_FLAGS GLOBAL PROPERTY CORENRN_EXTRA_COMPILE_FLAGS)\nget_property(CORENRN_LIB_LINK_FLAGS GLOBAL PROPERTY CORENRN_LIB_LINK_FLAGS)\n\n# Detect if --start-group and --end-group are valid linker arguments. 
These are typically needed\n# when linking mutually-dependent .o files (or where we don't know the correct order) on Linux, but\n# they are not needed *or* recognised by the macOS linker.\nif(CMAKE_VERSION VERSION_GREATER_EQUAL 3.18)\n  include(CheckLinkerFlag)\n  check_linker_flag(CXX -Wl,--start-group CORENRN_CXX_LINKER_SUPPORTS_START_GROUP)\nelseif(CMAKE_SYSTEM_NAME MATCHES Linux)\n  # Assume that --start-group and --end-group are only supported on Linux\n  set(CORENRN_CXX_LINKER_SUPPORTS_START_GROUP ON)\nendif()\nif(CORENRN_CXX_LINKER_SUPPORTS_START_GROUP)\n  set(CORENEURON_LINKER_START_GROUP -Wl,--start-group)\n  set(CORENEURON_LINKER_END_GROUP -Wl,--end-group)\nendif()\n\n# Things that used to be in CORENRN_LIB_LINK_FLAGS: -lrt -L${CMAKE_HOST_SYSTEM_PROCESSOR}\n# -L${caliper_LIB_DIR} -l${CALIPER_LIB}\n\n# =============================================================================\n# Turn CORENRN_COMPILE_DEFS into a list of -DFOO[=BAR] options.\n# =============================================================================\nlist(TRANSFORM CORENRN_COMPILE_DEFS PREPEND -D OUTPUT_VARIABLE CORENRN_COMPILE_DEF_FLAGS)\n\n# =============================================================================\n# Extra link flags that we need to include when linking libcorenrnmech.{a,so} in CoreNEURON but that\n# do not need to be passed to NEURON to use when linking nrniv/special (why?)\n# =============================================================================\nstring(JOIN \" \" CORENRN_COMMON_LDFLAGS ${CORENRN_LIB_LINK_DEP_FLAGS} ${CORENRN_EXTRA_LINK_FLAGS})\nif(CORENRN_SANITIZER_LIBRARY_DIR)\n  string(APPEND CORENRN_COMMON_LDFLAGS \" -Wl,-rpath,${CORENRN_SANITIZER_LIBRARY_DIR}\")\nendif()\nstring(JOIN \" \" CORENRN_SANITIZER_ENABLE_ENVIRONMENT_STRING ${CORENRN_SANITIZER_ENABLE_ENVIRONMENT})\n\n# =============================================================================\n# compile flags : common to all backend\n# 
=============================================================================\nstring(TOUPPER \"${CMAKE_BUILD_TYPE}\" _BUILD_TYPE)\nstring(\n  JOIN\n  \" \"\n  CORENRN_CXX_FLAGS\n  ${CMAKE_CXX_FLAGS}\n  ${CMAKE_CXX_FLAGS_${_BUILD_TYPE}}\n  ${CMAKE_CXX17_STANDARD_COMPILE_OPTION}\n  ${NVHPC_ACC_COMP_FLAGS}\n  ${NVHPC_CXX_INLINE_FLAGS}\n  ${CORENRN_COMPILE_DEF_FLAGS}\n  ${CORENRN_EXTRA_MECH_CXX_FLAGS}\n  ${CORENRN_EXTRA_COMPILE_FLAGS})\n\n# =============================================================================\n# nmodl/mod2c related options : TODO\n# =============================================================================\n# name of nmodl/mod2c binary\nget_filename_component(nmodl_name ${CORENRN_MOD2CPP_BINARY} NAME)\nset(nmodl_binary_name ${nmodl_name})\n"
  },
  {
    "path": "CMake/OpenAccHelper.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# Helper to parse X.Y[.{anything] into X.Y\nfunction(cnrn_parse_version FULL_VERSION)\n  cmake_parse_arguments(PARSE_ARGV 1 CNRN_PARSE_VERSION \"\" \"OUTPUT_MAJOR_MINOR\" \"\")\n  if(NOT \"${CNRN_PARSE_VERSION_UNPARSED_ARGUMENTS}\" STREQUAL \"\")\n    message(\n      FATAL_ERROR\n        \"cnrn_parse_version got unexpected arguments: ${CNRN_PARSE_VERSION_UNPARSED_ARGUMENTS}\")\n  endif()\n  string(FIND ${FULL_VERSION} . first_dot)\n  math(EXPR first_dot_plus_one \"${first_dot}+1\")\n  string(SUBSTRING ${FULL_VERSION} ${first_dot_plus_one} -1 minor_and_later)\n  string(FIND ${minor_and_later} . second_dot_relative)\n  if(${first_dot} EQUAL -1 OR ${second_dot_relative} EQUAL -1)\n    message(FATAL_ERROR \"Failed to parse major.minor from ${FULL_VERSION}\")\n  endif()\n  math(EXPR second_dot_plus_one \"${first_dot}+${second_dot_relative}+1\")\n  string(SUBSTRING ${FULL_VERSION} 0 ${second_dot_plus_one} major_minor)\n  set(${CNRN_PARSE_VERSION_OUTPUT_MAJOR_MINOR}\n      ${major_minor}\n      PARENT_SCOPE)\nendfunction()\n\n# =============================================================================\n# Prepare compiler flags for GPU target\n# =============================================================================\nif(CORENRN_ENABLE_GPU)\n  # Get the NVC++ version number for use in nrnivmodl_core_makefile.in\n  cnrn_parse_version(${CMAKE_CXX_COMPILER_VERSION} OUTPUT_MAJOR_MINOR\n                     CORENRN_NVHPC_MAJOR_MINOR_VERSION)\n  # Enable cudaProfiler{Start,Stop}() behind the Instrumentor::phase... 
APIs\n  list(APPEND CORENRN_COMPILE_DEFS CORENEURON_CUDA_PROFILING CORENEURON_ENABLE_GPU)\n  # Plain C++ code in CoreNEURON may need to use CUDA runtime APIs for, for example, starting and\n  # stopping profiling. This makes sure those headers can be found.\n  include_directories(${CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES})\n  # cuda unified memory support\n  if(CORENRN_ENABLE_CUDA_UNIFIED_MEMORY)\n    list(APPEND CORENRN_COMPILE_DEFS CORENEURON_UNIFIED_MEMORY)\n  endif()\n  if(${CMAKE_VERSION} VERSION_LESS 3.17)\n    # Hopefully we can drop this soon. Parse ${CMAKE_CUDA_COMPILER_VERSION} into a shorter X.Y\n    # version without any patch version.\n    if(NOT ${CMAKE_CUDA_COMPILER_ID} STREQUAL \"NVIDIA\")\n      message(FATAL_ERROR \"Unsupported CUDA compiler ${CMAKE_CUDA_COMPILER_ID}\")\n    endif()\n    cnrn_parse_version(${CMAKE_CUDA_COMPILER_VERSION} OUTPUT_MAJOR_MINOR CORENRN_CUDA_VERSION_SHORT)\n  else()\n    # This is a lazy way of getting the major/minor versions separately without parsing\n    # ${CMAKE_CUDA_COMPILER_VERSION}\n    find_package(CUDAToolkit 9.0 REQUIRED)\n    # Be a bit paranoid\n    if(NOT ${CMAKE_CUDA_COMPILER_VERSION} STREQUAL ${CUDAToolkit_VERSION})\n      message(\n        FATAL_ERROR\n          \"CUDA compiler (${CMAKE_CUDA_COMPILER_VERSION}) and toolkit (${CUDAToolkit_VERSION}) versions are not the same!\"\n      )\n    endif()\n    set(CORENRN_CUDA_VERSION_SHORT \"${CUDAToolkit_VERSION_MAJOR}.${CUDAToolkit_VERSION_MINOR}\")\n  endif()\n  # -cuda links CUDA libraries and also seems to be important to make the NVHPC do the device code\n  # linking. Without this, we had problems with linking between the explicit CUDA (.cu) device code\n  # and offloaded OpenACC/OpenMP code. Using -cuda when compiling seems to improve error messages in\n  # some cases, and to be recommended by NVIDIA. 
We pass -gpu=cudaX.Y to ensure that OpenACC/OpenMP\n  # code is compiled with the same CUDA version as the explicit CUDA code.\n  set(NVHPC_ACC_COMP_FLAGS \"-cuda -gpu=cuda${CORENRN_CUDA_VERSION_SHORT}\")\n  # Combining -gpu=lineinfo with -O0 -g gives a warning: Conflicting options --device-debug and\n  # --generate-line-info specified, ignoring --generate-line-info option\n  if(CMAKE_BUILD_TYPE STREQUAL \"Debug\")\n    string(APPEND NVHPC_ACC_COMP_FLAGS \",debug\")\n  else()\n    string(APPEND NVHPC_ACC_COMP_FLAGS \",lineinfo\")\n  endif()\n  # Make sure that OpenACC code is generated for the same compute capabilities as the explicit CUDA\n  # code. Otherwise there may be confusing linker errors. We cannot rely on nvcc and nvc++ using the\n  # same default compute capabilities as each other, particularly on GPU-less build machines.\n  foreach(compute_capability ${CMAKE_CUDA_ARCHITECTURES})\n    string(APPEND NVHPC_ACC_COMP_FLAGS \",cc${compute_capability}\")\n  endforeach()\n  if(CORENRN_ACCELERATOR_OFFLOAD STREQUAL \"OpenMP\")\n    # Enable OpenMP target offload to GPU and if both OpenACC and OpenMP directives are available\n    # for a region then prefer OpenMP.\n    list(APPEND CORENRN_COMPILE_DEFS CORENEURON_PREFER_OPENMP_OFFLOAD)\n    string(APPEND NVHPC_ACC_COMP_FLAGS \" -mp=gpu\")\n  elseif(CORENRN_ACCELERATOR_OFFLOAD STREQUAL \"OpenACC\")\n    # Only enable OpenACC offload for GPU\n    string(APPEND NVHPC_ACC_COMP_FLAGS \" -acc\")\n  else()\n    message(FATAL_ERROR \"${CORENRN_ACCELERATOR_OFFLOAD} not supported with NVHPC compilers\")\n  endif()\n  string(APPEND CMAKE_EXE_LINKER_FLAGS \" ${NVHPC_ACC_COMP_FLAGS}\")\n  # Use `-Mautoinline` option to compile .cpp files generated from .mod files only. 
This is\n  # especially needed when we compile with -O0 or -O1 optimisation level where we get link errors.\n  # Use of `-Mautoinline` ensure that the necessary functions like `net_receive_kernel` are inlined\n  # for OpenACC code generation.\n  set(NVHPC_CXX_INLINE_FLAGS \"-Mautoinline\")\nendif()\n\n# =============================================================================\n# Initialise global properties that will be used by NEURON to link with CoreNEURON\n# =============================================================================\nif(CORENRN_ENABLE_GPU)\n  # CORENRN_LIB_LINK_FLAGS is the full set of flags needed to link against libcorenrnmech.so:\n  # something like `-acc -lcorenrnmech ...`. CORENRN_NEURON_LINK_FLAGS only contains flags that need\n  # to be used when linking the NEURON Python module to make sure it is able to dynamically load\n  # libcorenrnmech.so.\n  set_property(GLOBAL PROPERTY CORENRN_LIB_LINK_FLAGS \"${NVHPC_ACC_COMP_FLAGS}\")\n  if(CORENRN_ENABLE_SHARED)\n    # Because of\n    # https://forums.developer.nvidia.com/t/dynamically-loading-an-openacc-enabled-shared-library-from-an-executable-compiled-with-nvc-does-not-work/210968\n    # we have to tell NEURON to pass OpenACC flags when linking special, otherwise we end up with an\n    # `nrniv` binary that cannot dynamically load CoreNEURON in shared-library builds.\n    set_property(GLOBAL PROPERTY CORENRN_NEURON_LINK_FLAGS \"${NVHPC_ACC_COMP_FLAGS}\")\n  endif()\nendif()\n\n# NEURON needs to have access to this when CoreNEURON is built as a submodule. If CoreNEURON is\n# installed externally then this is set via coreneuron-config.cmake\nset_property(GLOBAL PROPERTY CORENRN_ENABLE_SHARED ${CORENRN_ENABLE_SHARED})\n\nif(CORENRN_HAVE_NVHPC_COMPILER)\n  if(${CMAKE_CXX_COMPILER_VERSION} VERSION_GREATER_EQUAL 20.7)\n    # https://forums.developer.nvidia.com/t/many-all-diagnostic-numbers-increased-by-1-from-previous-values/146268/3\n    # changed the numbering scheme in newer versions. 
The following list is from a clean start 13\n    # August 2021. It would clearly be nicer to apply these suppressions only to relevant files.\n    # Examples of the suppressed warnings are given below.\n    # ~~~\n    # \"include/Random123/array.h\", warning #111-D: statement is unreachable\n    # \"include/Random123/features/sse.h\", warning #550-D: variable \"edx\" was set but never used\n    # ~~~\n    set(CORENEURON_CXX_WARNING_SUPPRESSIONS --diag_suppress=111,550)\n    # This one can be a bit more targeted\n    # ~~~\n    # \"boost/test/unit_test_log.hpp\", warning #612-D: overloaded virtual function \"...\" is only partially overridden in class \"...\"\n    # ~~~\n    set(CORENEURON_BOOST_UNIT_TEST_COMPILE_FLAGS --diag_suppress=612)\n    # Extra suppressions for .cpp files translated from .mod files.\n    # ~~~\n    # \"x86_64/corenrn/mod2c/pattern.cpp\", warning #161-D: unrecognized #pragma\n    # \"x86_64/corenrn/mod2c/svclmp.cpp\", warning #177-D: variable \"...\" was declared but never referenced\n    # ~~~\n    string(JOIN \" \" CORENEURON_TRANSLATED_CODE_COMPILE_FLAGS ${CORENEURON_CXX_WARNING_SUPPRESSIONS}\n           --diag_suppress=161,177)\n  endif()\nendif()\n"
  },
  {
    "path": "CMake/TestScriptUtils.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# ~~~\n# Utility functions for manipulating test labels and producing\n# tests from scripts:\n#\n# 1. add_test_class(label [label2 ...])\n#\n#    Create a target with name test-label (or test-label-label2 etc.)\n#    which runs only those tests possessing all of the supplied labels.\n#\n#\n# 2. add_test_label(name label ...)\n#\n#    Add the given labels to the test 'name'.\n#\n#\n# 3. add_test_script(name script interp)\n#\n#    Add a test 'name' that runs the given script, using the\n#    interpreter 'interp'. If no interpreter is supplied,\n#    the script will be run with /bin/sh.\n#\n#    Uses the following variables to customize the new test:\n#    * TEST_LABEL, ${NAME}_TEST_LABEL\n#          If defined, apply the label(s) in these variable to the\n#          new test.\n#    * TEST_ARGS, ${NAME}_TEST_ARGS\n#          Additional arguments to pass to the script.\n#          ${NAME}_TEST_ARGS takes priority over TEST_ARGS.\n#    * TEST_ENVIRONMENT\n#          Additional environment variables to define for the test;\n#          added to test properties.\n#    * TEST_PREFIX, ${NAME}_TEST_PREFIX\n#          If defined, preface the interpreter with this prefix.\n#          ${NAME}_TEST_PREFIX takes priority over TEST_PREFIX.\n# ~~~\n\nfunction(add_test_label NAME)\n  set_property(\n    TEST ${NAME}\n    APPEND\n    PROPERTY LABELS ${ARGN})\n  # create test classes for each label\n  foreach(L ${ARGN})\n    add_test_class(${L})\n  endforeach()\nendfunction()\n\nfunction(add_test_script NAME SCRIPT INTERP)\n  set(RUN_PREFIX ${TEST_PREFIX})\n  if(${NAME}_TEST_PREFIX)\n    set(RUN_PREFIX ${${NAME}_TEST_PREFIX})\n  endif()\n\n  if(NOT INTERP)\n    set(INTERP \"/bin/sh\")\n  endif()\n\n  set(RUN_ARGS 
${TEST_ARGS})\n  if(${NAME}_TEST_ARGS)\n    set(RUN_ARGS ${${NAME}_TEST_ARGS})\n  endif()\n\n  set(SCRIPT_PATH \"${SCRIPT}\")\n  if(NOT IS_ABSOLUTE \"${SCRIPT_PATH}\")\n    set(SCRIPT_PATH \"${CMAKE_CURRENT_SOURCE_DIR}/${SCRIPT_PATH}\")\n  endif()\n\n  add_test(\n    NAME ${NAME}\n    COMMAND ${RUN_PREFIX} ${INTERP} \"${SCRIPT_PATH}\" ${RUN_ARGS}\n    WORKING_DIRECTORY \"${CMAKE_CURRENT_BINARY_DIR}\")\n\n  # Add test labels\n  set(TEST_LABELS ${TEST_LABEL} ${${NAME}_TEST_LABEL})\n  if(TEST_LABELS)\n    add_test_label(${NAME} ${TEST_LABELS})\n  endif()\n\n  if(TEST_ENVIRONMENT)\n    set_property(TEST ${NAME} PROPERTY ENVIRONMENT ${TEST_ENVIRONMENT})\n  endif()\nendfunction()\n\nfunction(add_test_class)\n  string(REPLACE \";\" \"-\" TEST_SUFFIX \"${ARGN}\")\n  string(REPLACE \";\" \"$$;-L;^\" TEST_LOPTS \"${ARGN}\")\n\n  if(NOT TARGET test-${TEST_SUFFIX})\n    add_custom_target(\n      \"test-${TEST_SUFFIX}\"\n      COMMAND ${CMAKE_CTEST_COMMAND} -L ^${TEST_LOPTS}$$\n      WORKING_DIRECTORY ${${PROJECT_NAME}_BINARY_DIR}\n      COMMENT \"Running all ${ARGN} tests\")\n  endif()\nendfunction()\n"
  },
  {
    "path": "CMake/config/CompilerFlagsHelpers.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# ~~~\n# CompilerFlagsHelpers.cmake\n# set of Convenience functions for portable compiler flags\n# ~~~\n\nset(SUPPORTED_COMPILER_LANGUAGE_LIST \"CXX\")\n\n# detect compiler\nforeach(COMPILER_LANGUAGE ${SUPPORTED_COMPILER_LANGUAGE_LIST})\n  if(CMAKE_${COMPILER_LANGUAGE}_COMPILER_ID STREQUAL \"XL\")\n    set(CMAKE_${COMPILER_LANGUAGE}_COMPILER_IS_XLC ON)\n  elseif(CMAKE_${COMPILER_LANGUAGE}_COMPILER_ID STREQUAL \"Intel\")\n    set(CMAKE_${COMPILER_LANGUAGE}_COMPILER_IS_ICC ON)\n  elseif(\"${CMAKE_CXX_COMPILER_ID}\" STREQUAL \"MSVC\")\n    set(CMAKE_${COMPILER_LANGUAGE}_COMPILER_IS_MSVC)\n  elseif(${CMAKE_${COMPILER_LANGUAGE}_COMPILER_ID} STREQUAL \"Clang\")\n    set(CMAKE_${COMPILER_LANGUAGE}_COMPILER_IS_CLANG ON)\n  elseif(CMAKE_${COMPILER_LANGUAGE}_COMPILER_ID STREQUAL \"GNU\")\n    set(CMAKE_${COMPILER_LANGUAGE}_COMPILER_IS_GCC ON)\n  elseif(CMAKE_${COMPILER_LANGUAGE}_COMPILER_ID STREQUAL \"Cray\")\n    set(CMAKE_${COMPILER_LANGUAGE}_COMPILER_IS_CRAY ON)\n  endif()\nendforeach()\n\nforeach(COMPILER_LANGUAGE ${SUPPORTED_COMPILER_LANGUAGE_LIST})\n  # XLC compiler\n  if(CMAKE_${COMPILER_LANGUAGE}_COMPILER_IS_XLC)\n    # ~~~\n    # XLC -qinfo=all is awfully verbose on any platforms that use the GNU STL\n    # Enable by default only the relevant one\n    # ~~~\n    set(CMAKE_${COMPILER_LANGUAGE}_WARNING_ALL \"-qformat=all -qinfo=lan:trx:ret:zea:cmp:ret\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_DEBUGINFO_FLAGS \"-g\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_NONE \"-O0\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_NORMAL \"-O2\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_AGGRESSIVE \"-O3\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_FASTEST \"-O5\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_STACK_PROTECTION 
\"-qstackprotect\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_POSITION_INDEPENDENT \"-qpic=small\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_VECTORIZE \"-qhot\")\n    set(ADDITIONAL_THREADSAFE_FLAGS \"-qthreaded\")\n    set(IGNORE_UNKNOWN_PRAGMA_FLAGS \"-qsuppress=1506-224\")\n\n    # Microsoft compiler\n  elseif(CMAKE_${COMPILER_LANGUAGE}_COMPILER_IS_MSVC)\n\n    set(CMAKE_${COMPILER_LANGUAGE}_DEBUGINFO_FLAGS \"-Zi\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_NONE \"\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_NORMAL \"-O2\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_AGGRESSIVE \"-O2\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_FASTEST \"-O2\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_STACK_PROTECTION \"-GS\")\n\n    # enable by default on MSVC\n    set(CMAKE_${COMPILER_LANGUAGE}_POSITION_INDEPENDENT \"\")\n\n    # GCC\n  elseif(CMAKE_${COMPILER_LANGUAGE}_COMPILER_IS_GCC)\n\n    set(CMAKE_${COMPILER_LANGUAGE}_WARNING_ALL \"-Wall\")\n    set(CMAKE_${COMPILER_LANGUAGE}_DEBUGINFO_FLAGS \"-g\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_NONE \"-O0\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_NORMAL \"-O2\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_AGGRESSIVE \"-O3\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_FASTEST \"-Ofast -march=native\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_STACK_PROTECTION \"-fstack-protector\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_POSITION_INDEPENDENT \"-fPIC\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_VECTORIZE \"-ftree-vectorize\")\n    set(IGNORE_UNKNOWN_PRAGMA_FLAGS \"-Wno-unknown-pragmas\")\n\n    if(CMAKE_${COMPILER_LANGUAGE}_COMPILER_VERSION VERSION_GREATER \"4.7.0\")\n      set(CMAKE_${COMPILER_LANGUAGE}_LINK_TIME_OPT \"-flto\")\n    endif()\n\n    if((CMAKE_HOST_SYSTEM_PROCESSOR MATCHES \"^ppc\") OR (CMAKE_HOST_SYSTEM_PROCESSOR MATCHES \"^power\"\n                                                       ))\n      # ppc arch do not support -march= syntax\n      set(CMAKE_${COMPILER_LANGUAGE}_GEN_NATIVE \"-mcpu=native\")\n    else()\n      
set(CMAKE_${COMPILER_LANGUAGE}_GEN_NATIVE \"-march=native\")\n    endif()\n\n    # CLANG\n  elseif(CMAKE_${COMPILER_LANGUAGE}_COMPILER_IS_CLANG)\n    set(CMAKE_${COMPILER_LANGUAGE}_WARNING_ALL \"-Wall\")\n    set(CMAKE_${COMPILER_LANGUAGE}_DEBUGINFO_FLAGS \"-g\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_NONE \"-O0\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_NORMAL \"-O2\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_AGGRESSIVE \"-O3\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_FASTEST \"-Ofast -march=native\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_STACK_PROTECTION \"-fstack-protector\")\n    set(CMAKE_${COMPILER_LANGUAGE}_POSITION_INDEPENDENT \"-fPIC\")\n\n    # Force same ld behavior as when called from gcc --as-needed forces the linker to check whether\n    # a dynamic library mentioned in the command line is actually needed by the objects being\n    # linked. Symbols needed in shared objects are already linked when building that library.\n    set(CMAKE_EXE_LINKER_FLAGS \"-Wl,--as-needed\")\n    set(CMAKE_SHARED_LINKER_FLAGS \"-Wl,--as-needed\")\n\n    # rest of the world\n  else()\n    set(CMAKE_${COMPILER_LANGUAGE}_WARNING_ALL \"-Wall\")\n    set(CMAKE_${COMPILER_LANGUAGE}_DEBUGINFO_FLAGS \"-g\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_NONE \"-O0\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_NORMAL \"-O2\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_AGGRESSIVE \"-O3\")\n    set(CMAKE_${COMPILER_LANGUAGE}_OPT_FASTEST \"-O3\")\n\n    set(CMAKE_${COMPILER_LANGUAGE}_STACK_PROTECTION \"\")\n    set(CMAKE_${COMPILER_LANGUAGE}_POSITION_INDEPENDENT \"-fPIC\")\n    set(CMAKE_${COMPILER_LANGUAGE}_VECTORIZE \"\")\n\n    if(CMAKE_${COMPILER_LANGUAGE}_COMPILER_IS_ICC)\n      # unknown compiler flags produce error on Cray and hence just set this for intel now\n      set(IGNORE_UNKNOWN_PRAGMA_FLAGS \"-Wno-unknown-pragmas\")\n      # Intel O3 is extreme\n      set(CMAKE_${COMPILER_LANGUAGE}_OPT_AGGRESSIVE \"-O2\")\n    endif()\n\n    if(CMAKE_${COMPILER_LANGUAGE}_COMPILER_ID STREQUAL 
\"PGI\")\n      set(CMAKE_${COMPILER_LANGUAGE}_WARNING_ALL \"\")\n    endif()\n  endif()\n\nendforeach()\n\n# ===============================================================================\n# Allow undefined reference in shared library as mod files will be linked later\n# ===============================================================================\nif(CMAKE_CXX_COMPILER_ID MATCHES \"AppleClang\" OR ${CMAKE_SYSTEM_NAME} MATCHES \"Darwin\")\n  set(UNDEFINED_SYMBOLS_IGNORE_FLAG \"-undefined dynamic_lookup\")\n  string(APPEND CMAKE_SHARED_LIBRARY_CREATE_CXX_FLAGS \" ${UNDEFINED_SYMBOLS_IGNORE_FLAG}\")\n  string(APPEND CMAKE_SHARED_LIBRARY_CREATE_C_FLAGS \" ${UNDEFINED_SYMBOLS_IGNORE_FLAG}\")\nendif()\n"
  },
  {
    "path": "CMake/config/ReleaseDebugAutoFlags.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# ~~~\n# ReleaseDebugAutoFlags.cmake\n# Release / Debug configuration helper\n# ~~~\n\n# default configuration\nif(NOT CMAKE_BUILD_TYPE AND (NOT CMAKE_CONFIGURATION_TYPES))\n  set(CMAKE_BUILD_TYPE\n      RelWithDebInfo\n      CACHE STRING \"Choose the type of build.\" FORCE)\n  message(STATUS \"Setting build type to '${CMAKE_BUILD_TYPE}' as none was specified.\")\nendif()\n\n# =============================================================================\n# Different build types\n# =============================================================================\n# ~~~\n# Debug : Optimized for debugging, includes debug symbols\n# Release : Release mode, no debug info\n# RelWithDebInfo : Distribution mode, basic optimizations for portable code with debug info\n# Fast : Maximum level of optimization. Targets the native architecture, not portable code\n# ~~~\n\ninclude(CompilerFlagsHelpers)\n\n# ~~~\nset(CMAKE_C_FLAGS_RELEASE \"${CMAKE_C_OPT_NORMAL}\")\nset(CMAKE_C_FLAGS_DEBUG\n    \"${CMAKE_C_DEBUGINFO_FLAGS}  ${CMAKE_C_OPT_NONE} ${CMAKE_C_STACK_PROTECTION}\")\nset(CMAKE_C_FLAGS_RELWITHDEBINFO \"${CMAKE_C_DEBUGINFO_FLAGS}  ${CMAKE_C_OPT_NORMAL}\")\nset(CMAKE_C_FLAGS_FAST \" ${CMAKE_C_OPT_FASTEST} ${CMAKE_C_LINK_TIME_OPT} ${CMAKE_C_GEN_NATIVE}\")\n\nset(CMAKE_CXX_FLAGS_RELEASE \"${CMAKE_CXX_OPT_NORMAL}\")\nset(CMAKE_CXX_FLAGS_DEBUG\n    \"${CMAKE_CXX_DEBUGINFO_FLAGS}  ${CMAKE_CXX_OPT_NONE} ${CMAKE_CXX_STACK_PROTECTION}\")\nset(CMAKE_CXX_FLAGS_RELWITHDEBINFO \"${CMAKE_CXX_DEBUGINFO_FLAGS}  ${CMAKE_CXX_OPT_NORMAL}\")\nset(CMAKE_CXX_FLAGS_FAST\n    \" ${CMAKE_CXX_OPT_FASTEST} ${CMAKE_CXX_LINK_TIME_OPT} ${CMAKE_CXX_GEN_NATIVE}\")\n# ~~~\n"
  },
  {
    "path": "CMake/config/SetRpath.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# enable @rpath in the install name for any shared library being built\nset(CMAKE_MACOSX_RPATH 1)\n\n# ~~~\n# On platforms like bgq, xlc didn't like rpath with static build and similar\n# issue was seen on Cray\n# ~~~\nif(NOT CRAY_SYSTEM)\n  # use, i.e. don't skip the full RPATH for the build tree\n  set(CMAKE_SKIP_BUILD_RPATH FALSE)\n\n  # when building, don't use the install RPATH already but later on when installing\n  set(CMAKE_BUILD_WITH_INSTALL_RPATH FALSE)\n\n  # ~~~\n  # add the automatically determined parts of the RPATH which point to directories\n  # outside the build tree to the install RPATH\n  # ~~~\n  set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE)\n\n  set(LIB_INSTALL_DIR \"${CMAKE_INSTALL_PREFIX}/lib\")\n\n  # the RPATH to be used when installing, but only if it's not a system directory\n  list(FIND CMAKE_PLATFORM_IMPLICIT_LINK_DIRECTORIES \"${LIB_INSTALL_DIR}\" isSystemDir)\n  if(\"${isSystemDir}\" STREQUAL \"-1\")\n    set(CMAKE_INSTALL_RPATH \"${LIB_INSTALL_DIR}\")\n  endif(\"${isSystemDir}\" STREQUAL \"-1\")\nendif()\n"
  },
  {
    "path": "CMake/config/TestHelpers.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# ~~~\n# TestHelpers.cmake\n# Set of convenience functions for unit testing with cmake\n# ~~~\n\n# enable or disable detection of SLURM and MPIEXEC\noption(AUTO_TEST_WITH_SLURM \"Add srun as test prefix in a SLURM environment\" TRUE)\noption(AUTO_TEST_WITH_MPIEXEC \"Add mpiexec as test prefix in a MPICH2/OpenMPI environment\" TRUE)\n\n# ~~~\n# Basic SLURM support: the prefix \"srun\" is added to any test in the environment. For a\n# slurm test execution, simply run \"salloc [your_exec_parameters] ctest\"\n# ~~~\nif(AUTO_TEST_WITH_SLURM)\n  if(NOT DEFINED SLURM_SRUN_COMMAND)\n    find_program(\n      SLURM_SRUN_COMMAND\n      NAMES \"srun\"\n      HINTS \"${SLURM_ROOT}/bin\" QUIET)\n  endif()\n\n  if(SLURM_SRUN_COMMAND)\n    set(TEST_EXEC_PREFIX_DEFAULT \"${SLURM_SRUN_COMMAND}\")\n    set(TEST_MPI_EXEC_PREFIX_DEFAULT \"${SLURM_SRUN_COMMAND}\")\n    set(TEST_MPI_EXEC_BIN_DEFAULT \"${SLURM_SRUN_COMMAND}\")\n    set(TEST_WITH_SLURM ON)\n  endif()\n\nendif()\n\n# Basic mpiexec support, will just forward mpiexec as prefix\nif(AUTO_TEST_WITH_MPIEXEC AND NOT TEST_WITH_SLURM)\n  if(NOT DEFINED MPIEXEC)\n    find_program(\n      MPIEXEC\n      NAMES \"mpiexec\"\n      HINTS \"${MPI_ROOT}/bin\")\n  endif()\n\n  if(MPIEXEC)\n    set(TEST_MPI_EXEC_PREFIX_DEFAULT \"${MPIEXEC}\")\n    set(TEST_MPI_EXEC_BIN_DEFAULT \"${MPIEXEC}\")\n    set(TEST_WITH_MPIEXEC ON)\n  endif()\nendif()\n\n# ~~~\n# MPI executor program path without arguments used for testing.\n# default: srun or mpiexec if found\n# ~~~\nset(TEST_MPI_EXEC_BIN\n    \"${TEST_MPI_EXEC_BIN_DEFAULT}\"\n    CACHE STRING \"path of the MPI executor (mpiexec, mpirun) for test execution\")\n\n# ~~~\n# Test execution prefix. 
Override this variable for any execution prefix required\n# in a clustered environment\n#\n# To manually specify a command with arguments, use the cmake list syntax, e.g.\n# -DTEST_EXEC_PREFIX=\"/usr/bin/srun;-n;4\" for an srun execution with 4 tasks\n#\n# default: srun if found\n# ~~~\nset(TEST_EXEC_PREFIX\n    \"${TEST_EXEC_PREFIX_DEFAULT}\"\n    CACHE STRING \"prefix command for the test executions\")\n\n# ~~~\n# Test execution prefix specific for MPI programs.\n#\n# To manually specify a command with arguments, use the cmake list syntax, e.g.\n# -DTEST_MPI_EXEC_PREFIX=\"/usr/bin/mpiexec;-n;4\" for an MPI execution with 4 processes\n#\n# default: srun or mpiexec if found\n# ~~~\nset(TEST_MPI_EXEC_PREFIX\n    \"${TEST_MPI_EXEC_PREFIX_DEFAULT}\"\n    CACHE STRING \"prefix command for the MPI test executions\")\n"
  },
  {
    "path": "CMake/coreneuron-config.cmake.in",
    "content": "# =============================================================================\n# Copyright (C) 2016-2022 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# coreneuron-config.cmake - package configuration file\n\nget_filename_component(CONFIG_PATH \"${CMAKE_CURRENT_LIST_FILE}\" PATH)\n\nset(CORENRN_VERSION_MAJOR @PROJECT_VERSION_MAJOR@)\nset(CORENRN_VERSION_MINOR @PROJECT_VERSION_MINOR@)\nset(CORENRN_VERSION_PATCH @PROJECT_VERSION_PATCH@)\nset(CORENRN_ENABLE_GPU @CORENRN_ENABLE_GPU@)\nset(CORENRN_ENABLE_NMODL @CORENRN_ENABLE_NMODL@)\nset(CORENRN_ENABLE_REPORTING @CORENRN_ENABLE_REPORTING@)\nset(CORENRN_ENABLE_SHARED @CORENRN_ENABLE_SHARED@)\nset(CORENRN_LIB_LINK_FLAGS \"@CORENRN_LIB_LINK_FLAGS@\")\nset(CORENRN_NEURON_LINK_FLAGS \"@CORENRN_NEURON_LINK_FLAGS@\")\n\nfind_path(CORENEURON_INCLUDE_DIR \"coreneuron/coreneuron.h\" HINTS \"${CONFIG_PATH}/../../include\")\nfind_path(\n  CORENEURON_LIB_DIR\n  NAMES libcorenrnmech.a libcorenrnmech.so libcorenrnmech.dylib\n  HINTS \"${CONFIG_PATH}/../../lib\")\n\ninclude(${CONFIG_PATH}/coreneuron.cmake)\n"
  },
  {
    "path": "CMake/packages/FindSphinx.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\nfind_program(\n  SPHINX_EXECUTABLE\n  NAMES sphinx-build\n  DOC \"/path/to/sphinx-build\")\n\ninclude(FindPackageHandleStandardArgs)\n\nfind_package_handle_standard_args(Sphinx \"Failed to find sphinx-build executable\" SPHINX_EXECUTABLE)\n"
  },
  {
    "path": "CMake/packages/Findlikwid.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# ~~~\n# Findlikwid\n# -------------\n#\n# Find likwid\n#\n# Find the likwid RRZE Performance Monitoring and Benchmarking Suite\n#\n# Using likwid:\n#\n# ::\n#\n#   set(LIKWID_DIR \"\" CACHE PATH \"Path to the likwid performance monitoring and benchmarking suite\")\n#   find_package(likwid REQUIRED)\n#   include_directories(${likwid_INCLUDE_DIRS})\n#   target_link_libraries(foo ${likwid_LIBRARIES})\n#\n# This module sets the following variables:\n#\n# ::\n#\n#   likwid_FOUND        - set to true if the library is found\n#   likwid_INCLUDE_DIRS - list of required include directories\n#   likwid_LIBRARIES    - list of libraries to be linked\n# ~~~\n\nfind_path(likwid_INCLUDE_DIRS \"likwid.h\" HINTS \"${LIKWID_DIR}/include\")\nfind_library(likwid_LIBRARIES likwid HINTS \"${LIKWID_DIR}/lib\")\n\n# Checks 'REQUIRED', 'QUIET' and versions.\ninclude(FindPackageHandleStandardArgs)\n\nfind_package_handle_standard_args(likwid REQUIRED_VARS likwid_INCLUDE_DIRS likwid_LIBRARIES)\n"
  },
  {
    "path": "CMake/packages/Findnmodl.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# ~~~\n# Findnmodl\n# -------------\n#\n# Find nmodl\n#\n# Find the nmodl source-to-source compiler\n#\n# Using nmodl:\n#\n# ::\n#\n#   set(CORENRN_NMODL_DIR \"\" CACHE PATH \"Path to nmodl source-to-source compiler installation\")\n#   find_package(nmodl REQUIRED)\n#   include_directories(${nmodl_INCLUDE})\n#\n# This module sets the following variables:\n#\n# ::\n#\n#   nmodl_FOUND      - set to true if nmodl is found\n#   nmodl_INCLUDE    - list of required include directories\n#   nmodl_BINARY     - the nmodl binary\n#   nmodl_PYTHONPATH - directory containing the nmodl Python package\n# ~~~\n\n# UNIX paths are standard, no need to write.\nfind_program(\n  nmodl_BINARY\n  NAMES nmodl${CMAKE_EXECUTABLE_SUFFIX}\n  HINTS \"${CORENRN_NMODL_DIR}/bin\" QUIET)\n\nfind_path(nmodl_INCLUDE \"nmodl/fast_math.hpp\" HINTS \"${CORENRN_NMODL_DIR}/include\")\nfind_path(nmodl_PYTHONPATH \"nmodl/__init__.py\" HINTS \"${CORENRN_NMODL_DIR}/lib\")\n\n# Checks 'REQUIRED', 'QUIET' and versions.\ninclude(FindPackageHandleStandardArgs)\n\nfind_package_handle_standard_args(\n  nmodl\n  FOUND_VAR nmodl_FOUND\n  REQUIRED_VARS nmodl_BINARY nmodl_INCLUDE nmodl_PYTHONPATH)\n"
  },
  {
    "path": "CMake/packages/Findreportinglib.cmake",
    "content": "# =============================================================================\n# Copyright (C) 2016-2021 Blue Brain Project\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# ~~~\n# Findreportinglib\n# -------------\n#\n# Find reportinglib\n#\n# Find the reportinglib Blue Brain HPC utils library\n#\n# Using reportinglib:\n#\n# ::\n#\n#   find_package(reportinglib REQUIRED)\n#   include_directories(${reportinglib_INCLUDE_DIR})\n#   target_link_libraries(foo ${reportinglib_LIBRARY})\n#\n# This module sets the following variables:\n#\n# ::\n#\n#   reportinglib_FOUND - set to true if the library is found\n#   reportinglib_INCLUDE_DIR - directory containing reportinglib/Report.h\n#   reportinglib_LIBRARY - the reportinglib library to be linked\n#   reportinglib_LIB_DIR - directory containing the reportinglib library\n#   reportinglib_somaDump - path to the somaDump executable, if found\n# ~~~\n\n# UNIX paths are standard, no need to write.\nfind_path(reportinglib_INCLUDE_DIR reportinglib/Report.h)\nfind_library(reportinglib_LIBRARY reportinglib)\nget_filename_component(reportinglib_LIB_DIR ${reportinglib_LIBRARY} DIRECTORY)\nfind_program(reportinglib_somaDump somaDump ${reportinglib_LIB_DIR}/../bin)\n\n# Checks 'REQUIRED', 'QUIET' and versions.\ninclude(FindPackageHandleStandardArgs)\n\nfind_package_handle_standard_args(\n  reportinglib\n  FOUND_VAR reportinglib_FOUND\n  REQUIRED_VARS reportinglib_INCLUDE_DIR reportinglib_LIBRARY reportinglib_LIB_DIR)\n"
  },
  {
    "path": "CMakeLists.txt",
    "content": "# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\ncmake_minimum_required(VERSION 3.15 FATAL_ERROR)\n# CoreNEURON's version jumped from 1.0 to 8.2.0 with the introduction of the NRN_VERSION_* macros\n# for use in VERBATIM blocks. Starting from this version, the NEURON and CoreNEURON versions are\n# locked together. A version has to be hardcoded here to handle the case that CoreNEURON is built\n# standalone.\nproject(\n  coreneuron\n  VERSION 9.0.0\n  LANGUAGES CXX)\n\n# ~~~\n# Floating-point version numbers are a bad idea, since macros cannot handle them.\n# We therefore keep the version as an int, which is pretty much standard.\n# ~~~\nmath(EXPR CORENEURON_VERSION_COMBINED\n     \"${coreneuron_VERSION_MAJOR} * 100 + ${coreneuron_VERSION_MINOR}\")\n\n# =============================================================================\n# CMake common project settings\n# =============================================================================\nset(CMAKE_CXX_STANDARD 17)\nset(CMAKE_CXX_STANDARD_REQUIRED ON)\nset(CMAKE_CXX_EXTENSIONS OFF)\nset(CMAKE_BUILD_TYPE\n    RelWithDebInfo\n    CACHE STRING \"Empty or one of Debug, Release, RelWithDebInfo\")\n\nif(NOT \"cxx_std_17\" IN_LIST CMAKE_CXX_COMPILE_FEATURES)\n  message(\n    FATAL_ERROR\n      \"This compiler does not fully support C++17, choose a higher version or another compiler.\")\nendif()\n\n# =============================================================================\n# Settings to enable project as submodule\n# =============================================================================\nset(CORENEURON_PROJECT_BINARY_DIR ${CMAKE_CURRENT_BINARY_DIR})\nset(CORENEURON_PROJECT_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})\nset(CORENEURON_AS_SUBPROJECT OFF)\nif(NOT CMAKE_CURRENT_SOURCE_DIR STREQUAL 
CMAKE_SOURCE_DIR)\n  set(CORENEURON_AS_SUBPROJECT ON)\n  # Make these visible to the parent project (NEURON) so it can do some sanity checking.\n  set_property(GLOBAL PROPERTY CORENRN_VERSION_MAJOR ${PROJECT_VERSION_MAJOR})\n  set_property(GLOBAL PROPERTY CORENRN_VERSION_MINOR ${PROJECT_VERSION_MINOR})\n  set_property(GLOBAL PROPERTY CORENRN_VERSION_PATCH ${PROJECT_VERSION_PATCH})\nendif()\nif(NOT DEFINED NRN_VERSION_MAJOR\n   OR NOT DEFINED NRN_VERSION_MINOR\n   OR NOT DEFINED NRN_VERSION_PATCH)\n  if(CORENEURON_AS_SUBPROJECT)\n    set(level WARNING)\n  else()\n    set(level STATUS)\n  endif()\n  # Typically in this case CoreNEURON is being built standalone. In this case NRN_VERSION_* macros\n  # resolve to the CoreNEURON version, which is supposed to be moving in lockstep with the NEURON\n  # version.\n  set(NRN_VERSION_MAJOR ${PROJECT_VERSION_MAJOR})\n  set(NRN_VERSION_MINOR ${PROJECT_VERSION_MINOR})\n  set(NRN_VERSION_PATCH ${PROJECT_VERSION_PATCH})\n  message(${level} \"CoreNEURON could not determine the NEURON version, using the hardcoded \"\n          \"${NRN_VERSION_MAJOR}.${NRN_VERSION_MINOR}.${NRN_VERSION_PATCH}\")\nendif()\n# Regardless of whether we are being built as a submodule of NEURON, NRN_VERSION_{MAJOR,MINOR,PATCH}\n# are now set to the version that we should claim compatibility with when compiling translated MOD\n# files. 
Generate a header under a special `generated` prefix in the build directory, so that\n# -I/path/to/src -I/path/to/build/generated is safe (headers from the source prefix are copied\n# elsewhere under the build prefix, so there is scope for confusion)\nconfigure_file(coreneuron/config/neuron_version.hpp.in\n               generated/coreneuron/config/neuron_version.hpp)\n\n# =============================================================================\n# Include cmake modules path\n# =============================================================================\nlist(APPEND CMAKE_MODULE_PATH ${CORENEURON_PROJECT_SOURCE_DIR}/CMake\n     ${CORENEURON_PROJECT_SOURCE_DIR}/CMake/packages ${CORENEURON_PROJECT_SOURCE_DIR}/CMake/config)\n\n# =============================================================================\n# HPC Coding Conventions\n# =============================================================================\nset(CODING_CONV_PREFIX \"CORENRN\")\nset(CORENRN_3RDPARTY_DIR \"external\")\ninclude(AddHpcCodingConvSubmodule)\nadd_subdirectory(CMake/hpc-coding-conventions/cpp)\n\n# =============================================================================\n# Enable sanitizer support if the CORENRN_SANITIZERS variable is set\n# =============================================================================\ninclude(CMake/hpc-coding-conventions/cpp/cmake/sanitizers.cmake)\nset(CORENRN_EXTRA_CXX_FLAGS\n    \"\"\n    CACHE STRING \"Add extra compile flags for CoreNEURON sources\")\nseparate_arguments(CORENRN_EXTRA_CXX_FLAGS)\nset(CORENRN_EXTRA_MECH_CXX_FLAGS\n    \"\"\n    CACHE STRING \"Add extra compile flags for translated mechanisms\")\nseparate_arguments(CORENRN_EXTRA_MECH_CXX_FLAGS)\nlist(APPEND CORENRN_EXTRA_CXX_FLAGS ${CORENRN_SANITIZER_COMPILER_FLAGS})\nlist(APPEND CORENRN_EXTRA_MECH_CXX_FLAGS ${CORENRN_SANITIZER_COMPILER_FLAGS})\nlist(APPEND CORENRN_EXTRA_LINK_FLAGS ${CORENRN_SANITIZER_COMPILER_FLAGS})\n\n# 
=============================================================================\n# Include common cmake modules\n# =============================================================================\ninclude(CheckIncludeFiles)\ninclude(ReleaseDebugAutoFlags)\ninclude(CrayPortability)\ninclude(SetRpath)\ninclude(CTest)\ninclude(AddRandom123Submodule)\ninclude(GitRevision)\n\nset(CORENRN_3RDPARTY_DIR external)\ninclude(CMake/hpc-coding-conventions/cpp/cmake/3rdparty.cmake)\ncpp_cc_git_submodule(CLI11 BUILD PACKAGE CLI11 REQUIRED)\n\n# =============================================================================\n# Build options\n# =============================================================================\noption(CORENRN_ENABLE_OPENMP \"Build the CORE NEURON with OpenMP implementation\" ON)\noption(CORENRN_ENABLE_OPENMP_OFFLOAD \"Prefer OpenMP target offload to OpenACC\" ON)\noption(CORENRN_ENABLE_TIMEOUT \"Enable nrn_timeout implementation\" ON)\noption(CORENRN_ENABLE_REPORTING \"Enable use of ReportingLib for soma reports\" OFF)\noption(CORENRN_ENABLE_MPI \"Enable MPI-based execution\" ON)\noption(CORENRN_ENABLE_MPI_DYNAMIC \"Enable dynamic MPI support\" OFF)\noption(CORENRN_ENABLE_HOC_EXP \"Enable wrapping exp with hoc_exp()\" OFF)\noption(CORENRN_ENABLE_SPLAYTREE_QUEUING \"Enable use of Splay tree for spike queuing\" ON)\noption(CORENRN_ENABLE_NET_RECEIVE_BUFFER \"Enable event buffering in net_receive function\" ON)\noption(CORENRN_ENABLE_NMODL \"Enable external nmodl source-to-source compiler\" OFF)\noption(CORENRN_ENABLE_CALIPER_PROFILING \"Enable Caliper instrumentation\" OFF)\noption(CORENRN_ENABLE_LIKWID_PROFILING \"Enable LIKWID instrumentation\" OFF)\noption(CORENRN_ENABLE_CUDA_UNIFIED_MEMORY \"Enable CUDA unified memory support\" OFF)\noption(CORENRN_ENABLE_UNIT_TESTS \"Enable unit tests execution\" ON)\noption(CORENRN_ENABLE_GPU \"Enable GPU support using OpenACC or OpenMP\" OFF)\noption(CORENRN_ENABLE_SHARED \"Enable shared library build\" 
ON)\noption(CORENRN_ENABLE_LEGACY_UNITS \"Enable legacy FARADAY, R, etc\" OFF)\noption(CORENRN_ENABLE_PRCELLSTATE \"Enable NRN_PRCELLSTATE debug feature\" OFF)\n\nset(CORENRN_NMODL_DIR\n    \"\"\n    CACHE PATH \"Path to nmodl source-to-source compiler installation\")\nset(LIKWID_DIR\n    \"\"\n    CACHE PATH \"Path to likwid performance analysis suite\")\n\n# Older CMake versions label NVHPC as PGI, newer ones label it as NVHPC.\nif(${CMAKE_CXX_COMPILER_ID} STREQUAL \"PGI\" OR ${CMAKE_CXX_COMPILER_ID} STREQUAL \"NVHPC\")\n  set(CORENRN_HAVE_NVHPC_COMPILER ON)\nelse()\n  set(CORENRN_HAVE_NVHPC_COMPILER OFF)\nendif()\n\nset(CORENRN_ACCELERATOR_OFFLOAD \"Disabled\")\nif(CORENRN_ENABLE_GPU)\n  # Older CMake versions than 3.15 have not been tested for GPU/CUDA/OpenACC support after\n  # https://github.com/BlueBrain/CoreNeuron/pull/609.\n\n  # Fail hard and early if we don't have the PGI/NVHPC compiler.\n  if(NOT CORENRN_HAVE_NVHPC_COMPILER)\n    message(\n      FATAL_ERROR\n        \"GPU support is available via OpenACC using PGI/NVIDIA compilers.\"\n        \" Use NVIDIA HPC SDK with -DCMAKE_C_COMPILER=nvc -DCMAKE_CUDA_COMPILER=nvcc -DCMAKE_CXX_COMPILER=nvc++\"\n    )\n  endif()\n\n  # Set some sensible default CUDA architectures.\n  if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)\n    set(CMAKE_CUDA_ARCHITECTURES 70 80)\n    message(STATUS \"Setting default CUDA architectures to ${CMAKE_CUDA_ARCHITECTURES}\")\n  endif()\n\n  # See https://gitlab.kitware.com/cmake/cmake/-/issues/23081, this should not be needed according\n  # to the CMake documentation, but it is not clear that any version behaves as documented.\n  if(DEFINED CMAKE_CUDA_HOST_COMPILER)\n    unset(ENV{CUDAHOSTCXX})\n  endif()\n\n  # Enable CUDA language support.\n  enable_language(CUDA)\n\n  # Prefer shared libcudart.so\n  if(${CMAKE_VERSION} VERSION_LESS 3.17)\n    # Ugly workaround from https://gitlab.kitware.com/cmake/cmake/-/issues/17559, remove when\n    # possible\n    
if(CMAKE_CUDA_HOST_IMPLICIT_LINK_LIBRARIES)\n      list(REMOVE_ITEM CMAKE_CUDA_HOST_IMPLICIT_LINK_LIBRARIES \"cudart_static\")\n      list(REMOVE_ITEM CMAKE_CUDA_HOST_IMPLICIT_LINK_LIBRARIES \"cudadevrt\")\n      list(APPEND CMAKE_CUDA_HOST_IMPLICIT_LINK_LIBRARIES \"cudart\")\n    endif()\n    if(CMAKE_CUDA_IMPLICIT_LINK_LIBRARIES)\n      list(REMOVE_ITEM CMAKE_CUDA_IMPLICIT_LINK_LIBRARIES \"cudart_static\")\n      list(REMOVE_ITEM CMAKE_CUDA_IMPLICIT_LINK_LIBRARIES \"cudadevrt\")\n      list(APPEND CMAKE_CUDA_IMPLICIT_LINK_LIBRARIES \"cudart\")\n    endif()\n  else()\n    # nvc++ -cuda implicitly links dynamically to libcudart.so. Setting this makes sure that CMake\n    # does not add -lcudart_static and trigger errors due to mixed dynamic/static linkage.\n    set(CMAKE_CUDA_RUNTIME_LIBRARY Shared)\n  endif()\n\n  # Patch CUDA_ARCHITECTURES support into older CMake versions\n  if(${CMAKE_VERSION} VERSION_LESS 3.18)\n    foreach(cuda_arch ${CMAKE_CUDA_ARCHITECTURES})\n      string(\n        APPEND CMAKE_CUDA_FLAGS\n        \" --generate-code=arch=compute_${cuda_arch},code=[compute_${cuda_arch},sm_${cuda_arch}]\")\n    endforeach()\n  endif()\n\n  # ~~~\n  # Needed for the Eigen GPU support Warning suppression (Eigen GPU-related):\n  # 3057 : Warning on ignoring __host__ annotation in some functions\n  # 3085 : Warning on redeclaring a __host__ function as __host__ __device__\n  # ~~~\n  set(CMAKE_CUDA_FLAGS\n      \"${CMAKE_CUDA_FLAGS} --expt-relaxed-constexpr -Xcudafe --diag_suppress=3057,--diag_suppress=3085\"\n  )\n\n  if(CORENRN_ENABLE_NMODL)\n    # NMODL supports both OpenACC and OpenMP target offload\n    if(CORENRN_ENABLE_OPENMP AND CORENRN_ENABLE_OPENMP_OFFLOAD)\n      set(CORENRN_ACCELERATOR_OFFLOAD \"OpenMP\")\n    else()\n      set(CORENRN_ACCELERATOR_OFFLOAD \"OpenACC\")\n    endif()\n  else()\n    # MOD2C only supports OpenACC offload\n    set(CORENRN_ACCELERATOR_OFFLOAD \"OpenACC\")\n  endif()\nendif()\n\n# 
=============================================================================\n# Project version from git and project directories\n# =============================================================================\nset(CN_PROJECT_VERSION ${PROJECT_VERSION})\n\n# generate file with version number from git and nrnunits.lib file path\nconfigure_file(${CMAKE_CURRENT_SOURCE_DIR}/coreneuron/config/config.cpp.in\n               ${PROJECT_BINARY_DIR}/coreneuron/config/config.cpp @ONLY)\n\n# =============================================================================\n# Include cmake modules after cmake options\n# =============================================================================\ninclude(OpenAccHelper)\n\n# =============================================================================\n# Common dependencies\n# =============================================================================\nfind_package(PythonInterp REQUIRED)\nfind_package(Perl REQUIRED)\n\n# =============================================================================\n# Common build options\n# =============================================================================\n# build mod files for coreneuron\nlist(APPEND CORENRN_COMPILE_DEFS CORENEURON_BUILD)\nset(CMAKE_REQUIRED_QUIET TRUE)\ncheck_include_files(malloc.h have_malloc_h)\nif(have_malloc_h)\n  list(APPEND CORENRN_COMPILE_DEFS HAVE_MALLOC_H)\nendif()\n\n# =============================================================================\n# Build option specific compiler flags\n# =============================================================================\nif(CORENRN_ENABLE_NMODL)\n  # We use Eigen for \"small\" matrices with thread-level parallelism handled at a higher level; tell\n  # Eigen not to try to multithread internally\n  list(APPEND CORENRN_COMPILE_DEFS EIGEN_DONT_PARALLELIZE)\nendif()\nif(CORENRN_HAVE_NVHPC_COMPILER)\n  # PGI with llvm code generation doesn't have necessary assembly intrinsic headers\n  list(APPEND CORENRN_COMPILE_DEFS 
EIGEN_DONT_VECTORIZE=1)\n  if(NOT CORENRN_ENABLE_GPU AND ${CMAKE_CXX_COMPILER_VERSION} VERSION_GREATER_EQUAL 21.11)\n    # Random123 does not play nicely with NVHPC 21.11+'s detection of ABM features if it detects the\n    # compiler to be PGI or NVHPC, see: https://github.com/BlueBrain/CoreNeuron/issues/724 and\n    # https://github.com/DEShawResearch/random123/issues/6. In fact in GPU builds Random123\n    # (mis)detects nvc++ as nvcc because we pass the -cuda option and we therefore avoid the\n    # problem. If GPU support is disabled, we define R123_USE_INTRIN_H=0 to avoid the problem.\n    list(APPEND CORENRN_COMPILE_DEFS R123_USE_INTRIN_H=0)\n  endif()\n  # CMake versions <3.19 used to add -A when using NVHPC/PGI, which makes the compiler excessively\n  # pedantic. See https://gitlab.kitware.com/cmake/cmake/-/issues/20997.\n  if(CMAKE_VERSION VERSION_LESS 3.19)\n    list(REMOVE_ITEM CMAKE_CXX17_STANDARD_COMPILE_OPTION -A)\n  endif()\nendif()\n\nif(CORENRN_ENABLE_SHARED)\n  set(COMPILE_LIBRARY_TYPE \"SHARED\")\nelse()\n  set(COMPILE_LIBRARY_TYPE \"STATIC\")\nendif()\n\nif(CORENRN_ENABLE_MPI)\n  find_package(MPI REQUIRED)\n  list(APPEND CORENRN_COMPILE_DEFS NRNMPI=1)\n  # avoid linking to C++ bindings\n  list(APPEND CORENRN_COMPILE_DEFS MPI_NO_CPPBIND=1)\n  list(APPEND CORENRN_COMPILE_DEFS OMPI_SKIP_MPICXX=1)\n  list(APPEND CORENRN_COMPILE_DEFS MPICH_SKIP_MPICXX=1)\nelse()\n  list(APPEND CORENRN_COMPILE_DEFS NRNMPI=0)\n  list(APPEND CORENRN_COMPILE_DEFS NRN_MULTISEND=0)\nendif()\n\nif(CORENRN_ENABLE_OPENMP)\n  find_package(OpenMP QUIET)\n  if(OPENMP_FOUND)\n    set(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS} ${ADDITIONAL_THREADSAFE_FLAGS}\")\n    set(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS} ${ADDITIONAL_THREADSAFE_FLAGS}\")\n  endif()\nendif()\n\nlist(APPEND CORENRN_COMPILE_DEFS LAYOUT=0)\n\nif(NOT CORENRN_ENABLE_HOC_EXP)\n  list(APPEND CORENRN_COMPILE_DEFS DISABLE_HOC_EXP)\nendif()\n\n# splay tree required for 
net_move\nif(CORENRN_ENABLE_SPLAYTREE_QUEUING)\n  list(APPEND CORENRN_COMPILE_DEFS ENABLE_SPLAYTREE_QUEUING)\nendif()\n\nif(NOT CORENRN_ENABLE_NET_RECEIVE_BUFFER)\n  list(APPEND CORENRN_COMPILE_DEFS NET_RECEIVE_BUFFERING=0)\nendif()\n\nif(NOT CORENRN_ENABLE_TIMEOUT)\n  list(APPEND CORENRN_COMPILE_DEFS DISABLE_TIMEOUT)\nendif()\n\nif(CORENRN_ENABLE_REPORTING)\n  find_package(reportinglib)\n  find_package(sonata)\n  find_program(H5DUMP_EXECUTABLE h5dump)\n\n  if(reportinglib_FOUND)\n    list(APPEND CORENRN_COMPILE_DEFS ENABLE_BIN_REPORTS)\n    set(ENABLE_BIN_REPORTS_TESTS ON)\n  else()\n    set(reportinglib_INCLUDE_DIR \"\")\n    set(reportinglib_LIBRARY \"\")\n  endif()\n  if(sonata_FOUND)\n    if(TARGET sonata::sonata_report)\n      list(APPEND CORENRN_COMPILE_DEFS ENABLE_SONATA_REPORTS)\n      set(ENABLE_SONATA_REPORTS_TESTS ON)\n    else()\n      message(SEND_ERROR \"SONATA library was found but without reporting support\")\n    endif()\n  endif()\n\n  if(NOT reportinglib_FOUND AND NOT sonata_FOUND)\n    message(SEND_ERROR \"Neither reportinglib nor SONATA libraries were found\")\n  endif()\n\n  include_directories(${reportinglib_INCLUDE_DIR})\n  include_directories(${sonatareport_INCLUDE_DIR})\nendif()\n\nif(CORENRN_ENABLE_LEGACY_UNITS)\n  set(CORENRN_USE_LEGACY_UNITS 1)\nelse()\n  set(CORENRN_USE_LEGACY_UNITS 0)\nendif()\nlist(APPEND CORENRN_COMPILE_DEFS CORENEURON_USE_LEGACY_UNITS=${CORENRN_USE_LEGACY_UNITS})\n# Propagate Legacy Units flag to backends.\nset(MOD2C_ENABLE_LEGACY_UNITS\n    ${CORENRN_ENABLE_LEGACY_UNITS}\n    CACHE BOOL \"\" FORCE)\nset(NMODL_ENABLE_LEGACY_UNITS\n    ${CORENRN_ENABLE_LEGACY_UNITS}\n    CACHE BOOL \"\" FORCE)\n\nif(CORENRN_ENABLE_MPI_DYNAMIC)\n  if(NOT CORENRN_ENABLE_MPI)\n    message(FATAL_ERROR \"Cannot enable dynamic mpi without mpi\")\n  endif()\n  list(APPEND CORENRN_COMPILE_DEFS CORENEURON_ENABLE_MPI_DYNAMIC)\nendif()\n\nif(CORENRN_ENABLE_PRCELLSTATE)\n  set(CORENRN_NRN_PRCELLSTATE 1)\nelse()\n  set(CORENRN_NRN_PRCELLSTATE 
0)\nendif()\nif(MINGW)\n  list(APPEND CORENRN_COMPILE_DEFS MINGW)\nendif()\n\n# =============================================================================\n# NMODL specific options\n# =============================================================================\nif(CORENRN_ENABLE_NMODL)\n  find_package(nmodl)\n  if(NOT \"${CORENRN_NMODL_DIR}\" STREQUAL \"\" AND NOT nmodl_FOUND)\n    message(FATAL_ERROR \"Cannot find NMODL in ${CORENRN_NMODL_DIR}\")\n  endif()\n  if(nmodl_FOUND)\n    set(CORENRN_MOD2CPP_BINARY ${nmodl_BINARY})\n    set(CORENRN_MOD2CPP_INCLUDE ${nmodl_INCLUDE})\n    # path to python interface\n    set(ENV{PYTHONPATH} \"${nmodl_PYTHONPATH}:$ENV{PYTHONPATH}\")\n    set(CORENRN_NMODL_PYTHONPATH $ENV{PYTHONPATH})\n  else()\n    set(NMODL_ENABLE_PYTHON_BINDINGS\n        OFF\n        CACHE BOOL \"Disable NMODL python bindings\")\n    include(AddNmodlSubmodule)\n    set(CORENRN_MOD2CPP_BINARY ${CMAKE_BINARY_DIR}/bin/nmodl${CMAKE_EXECUTABLE_SUFFIX})\n    set(CORENRN_MOD2CPP_INCLUDE ${CMAKE_BINARY_DIR}/include)\n    set(ENV{PYTHONPATH} \"$ENV{PYTHONPATH}\")\n    set(nmodl_PYTHONPATH \"${CMAKE_BINARY_DIR}/lib\")\n    set(CORENRN_NMODL_PYTHONPATH \"${nmodl_PYTHONPATH}:$ENV{PYTHONPATH}\")\n    set(NMODL_TARGET_TO_DEPEND nmodl)\n  endif()\n  include_directories(${CORENRN_MOD2CPP_INCLUDE})\n  # set correct arguments for nmodl for cpu/gpu target\n  set(CORENRN_NMODL_FLAGS\n      \"\"\n      CACHE STRING \"Extra NMODL options such as passes\")\nelse()\n  include(AddMod2cSubmodule)\n  set(NMODL_TARGET_TO_DEPEND mod2c_core)\n  set(CORENRN_MOD2CPP_BINARY ${CMAKE_BINARY_DIR}/bin/mod2c_core${CMAKE_EXECUTABLE_SUFFIX})\n  set(CORENRN_MOD2CPP_INCLUDE ${CMAKE_BINARY_DIR}/include)\nendif()\n\n# =============================================================================\n# Profiler/Instrumentation Options\n# =============================================================================\nif(CORENRN_ENABLE_CALIPER_PROFILING)\n  find_package(caliper REQUIRED)\n  list(APPEND 
CORENRN_COMPILE_DEFS CORENEURON_CALIPER)\n  set(CORENRN_CALIPER_LIB caliper)\nendif()\n\nif(CORENRN_ENABLE_LIKWID_PROFILING)\n  find_package(likwid REQUIRED)\n  list(APPEND CORENRN_COMPILE_DEFS LIKWID_PERFMON)\n  # TODO: avoid this part, probably by using some likwid CMake target\n  include_directories(${likwid_INCLUDE_DIRS})\nendif()\n\n# enable debugging code with extra logs to stdout\nif(CORENRN_ENABLE_DEBUG_CODE)\n  list(APPEND CORENRN_COMPILE_DEFS CORENRN_DEBUG CHKPNTDEBUG CORENRN_DEBUG_QUEUE INTERLEAVE_DEBUG)\nendif()\n\n# =============================================================================\n# Common CXX flags : ignore unknown pragma warnings\n# =============================================================================\n# Do not set this when building wheels. The nrnivmodl workflow means that we do not know what\n# compiler will be invoked with these flags, so we have to use flags that are as generic as\n# possible.\nif(NOT DEFINED NRN_WHEEL_BUILD OR NOT NRN_WHEEL_BUILD)\n  list(APPEND CORENRN_EXTRA_CXX_FLAGS \"${IGNORE_UNKNOWN_PRAGMA_FLAGS}\")\nendif()\n\n# Add the main source directory\nadd_subdirectory(coreneuron)\n\n# Extract the various compiler option strings to use inside nrnivmodl-core. 
Sets the global property\n# CORENRN_LIB_LINK_FLAGS, which contains the arguments that must be added to the link line for\n# `special` to link against `libcorenrnmech.{a,so}`\ninclude(MakefileBuildOptions)\n\n# Generate the nrnivmodl-core script and makefile using the options from MakefileBuildOptions\nadd_subdirectory(extra)\n\nif(CORENRN_ENABLE_UNIT_TESTS)\n  add_subdirectory(tests)\nendif()\n\n# =============================================================================\n# Install cmake modules\n# =============================================================================\nget_property(CORENRN_NEURON_LINK_FLAGS GLOBAL PROPERTY CORENRN_NEURON_LINK_FLAGS)\nconfigure_file(CMake/coreneuron-config.cmake.in CMake/coreneuron-config.cmake @ONLY)\ninstall(FILES \"${CMAKE_CURRENT_BINARY_DIR}/CMake/coreneuron-config.cmake\" DESTINATION share/cmake)\ninstall(EXPORT coreneuron DESTINATION share/cmake)\n\nif(NOT CORENEURON_AS_SUBPROJECT)\n  # =============================================================================\n  # Setup Doxygen documentation\n  # =============================================================================\n  find_package(Doxygen QUIET)\n  if(DOXYGEN_FOUND)\n    # generate Doxyfile with correct source paths\n    configure_file(${PROJECT_SOURCE_DIR}/docs/Doxyfile.in ${PROJECT_BINARY_DIR}/Doxyfile)\n    add_custom_target(\n      doxygen\n      COMMAND ${DOXYGEN_EXECUTABLE} ${PROJECT_BINARY_DIR}/Doxyfile\n      WORKING_DIRECTORY ${PROJECT_BINARY_DIR}\n      COMMENT \"Generating API documentation with Doxygen\"\n      VERBATIM)\n  endif()\n\n  # =============================================================================\n  # Setup Sphinx documentation\n  # =============================================================================\n  find_package(Sphinx QUIET)\n  if(SPHINX_FOUND)\n    set(SPHINX_SOURCE ${PROJECT_SOURCE_DIR}/docs)\n    set(SPHINX_BUILD ${PROJECT_BINARY_DIR}/docs/)\n\n    add_custom_target(\n      sphinx\n      COMMAND 
${SPHINX_EXECUTABLE} -b html ${SPHINX_SOURCE} ${SPHINX_BUILD}\n      WORKING_DIRECTORY ${PROJECT_BINARY_DIR}\n      COMMENT \"Generating documentation with Sphinx\")\n  endif()\n\n  # =============================================================================\n  # Build full docs\n  # =============================================================================\n  if(DOXYGEN_FOUND AND SPHINX_FOUND)\n    add_custom_target(\n      docs\n      COMMAND ${CMAKE_COMMAND} --build ${PROJECT_BINARY_DIR} --target doxygen\n      COMMAND ${CMAKE_COMMAND} --build ${PROJECT_BINARY_DIR} --target sphinx\n      COMMENT \"Generating full documentation\")\n  else()\n    add_custom_target(\n      docs\n      VERBATIM\n      COMMAND echo \"Please install docs requirements (see docs/README.md)!\"\n      COMMENT \"Documentation generation not possible!\")\n  endif()\nendif()\n# =============================================================================\n# Build status\n# =============================================================================\nmessage(STATUS \"\")\nmessage(STATUS \"Configured CoreNEURON ${PROJECT_VERSION}\")\nmessage(STATUS \"\")\nmessage(STATUS \"You can now build CoreNEURON using:\")\nmessage(STATUS \"  cmake --build . --parallel 8 [--target TARGET]\")\nmessage(STATUS \"You might want to adjust the number of parallel build jobs for your system.\")\nmessage(STATUS \"Some non-default targets you might want to build:\")\nmessage(STATUS \"--------------------+--------------------------------------------------------\")\nmessage(STATUS \" Target             |   Description\")\nmessage(STATUS \"--------------------+--------------------------------------------------------\")\nmessage(STATUS \"install             | Will install CoreNEURON to: ${CMAKE_INSTALL_PREFIX}\")\nmessage(STATUS \"docs                | Build full docs. 
Calls targets: doxygen, sphinx\")\nmessage(STATUS \"--------------------+--------------------------------------------------------\")\nmessage(STATUS \" Build option       | Status\")\nmessage(STATUS \"--------------------+--------------------------------------------------------\")\nmessage(STATUS \"CXX COMPILER        | ${CMAKE_CXX_COMPILER}\")\nmessage(STATUS \"COMPILE FLAGS       | ${CORENRN_CXX_FLAGS}\")\nmessage(STATUS \"Build Type          | ${COMPILE_LIBRARY_TYPE}\")\nmessage(STATUS \"MPI                 | ${CORENRN_ENABLE_MPI}\")\nif(CORENRN_ENABLE_MPI)\n  message(STATUS \"  DYNAMIC           | ${CORENRN_ENABLE_MPI_DYNAMIC}\")\n  if(CORENRN_ENABLE_MPI_DYNAMIC AND NRN_MPI_LIBNAME_LIST)\n    # ~~~\n    # for dynamic mpi, rely on neuron for list of libraries to build\n    # this is to avoid cmake code duplication on the coreneuron side\n    # ~~~\n    list(LENGTH NRN_MPI_LIBNAME_LIST _num_mpi)\n    math(EXPR num_mpi \"${_num_mpi} - 1\")\n    foreach(val RANGE ${num_mpi})\n      list(GET NRN_MPI_LIBNAME_LIST ${val} libname)\n      list(GET NRN_MPI_INCLUDE_LIST ${val} include)\n      message(STATUS \"    LIBNAME         | core${libname}\")\n      message(STATUS \"    INC             | ${include}\")\n    endforeach(val)\n  else()\n    message(STATUS \"  INC               | ${MPI_CXX_INCLUDE_PATH}\")\n  endif()\nendif()\nmessage(STATUS \"OpenMP              | ${CORENRN_ENABLE_OPENMP}\")\nmessage(STATUS \"Use legacy units    | ${CORENRN_ENABLE_LEGACY_UNITS}\")\nmessage(STATUS \"NMODL               | ${CORENRN_ENABLE_NMODL}\")\nif(CORENRN_ENABLE_NMODL)\n  message(STATUS \"  FLAGS             | ${CORENRN_NMODL_FLAGS}\")\nendif()\nmessage(STATUS \"MOD2CPP PATH        | ${CORENRN_MOD2CPP_BINARY}\")\nmessage(STATUS \"GPU Support         | ${CORENRN_ENABLE_GPU}\")\nif(CORENRN_ENABLE_GPU)\n  message(STATUS \"  CUDA              | ${CUDAToolkit_LIBRARY_DIR}\")\n  message(STATUS \"  Offload           | ${CORENRN_ACCELERATOR_OFFLOAD}\")\n  message(STATUS \"  Unified Memory    
| ${CORENRN_ENABLE_CUDA_UNIFIED_MEMORY}\")\nendif()\nmessage(STATUS \"Auto Timeout        | ${CORENRN_ENABLE_TIMEOUT}\")\nmessage(STATUS \"Wrap exp()          | ${CORENRN_ENABLE_HOC_EXP}\")\nmessage(STATUS \"SplayTree Queue     | ${CORENRN_ENABLE_SPLAYTREE_QUEUING}\")\nmessage(STATUS \"NetReceive Buffer   | ${CORENRN_ENABLE_NET_RECEIVE_BUFFER}\")\nmessage(STATUS \"Caliper             | ${CORENRN_ENABLE_CALIPER_PROFILING}\")\nmessage(STATUS \"Likwid              | ${CORENRN_ENABLE_LIKWID_PROFILING}\")\nmessage(STATUS \"Unit Tests          | ${CORENRN_ENABLE_UNIT_TESTS}\")\nmessage(STATUS \"Reporting           | ${CORENRN_ENABLE_REPORTING}\")\nif(CORENRN_ENABLE_REPORTING)\n  message(STATUS \"  sonatareport_INC  | ${sonatareport_INCLUDE_DIR}\")\n  message(STATUS \"  sonatareport_LIB  | ${sonatareport_LIBRARY}\")\n  message(STATUS \"  reportinglib_INC  | ${reportinglib_INCLUDE_DIR}\")\n  message(STATUS \"  reportinglib_LIB  | ${reportinglib_LIBRARY}\")\nendif()\nmessage(STATUS \"--------------+--------------------------------------------------------------\")\nmessage(STATUS \" See documentation : https://github.com/BlueBrain/CoreNeuron/\")\nmessage(STATUS \"--------------+--------------------------------------------------------------\")\nmessage(STATUS \"\")\n"
  },
  {
    "path": "LICENSE.txt",
    "content": "Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\nAll rights reserved.\n\nRedistribution and use in source and binary forms, with or without modification,\nare permitted provided that the following conditions are met:\n1. Redistributions of source code must retain the above copyright notice,\n   this list of conditions and the following disclaimer.\n2. Redistributions in binary form must reproduce the above copyright notice,\n   this list of conditions and the following disclaimer in the documentation\n   and/or other materials provided with the distribution.\n3. Neither the name of the copyright holder nor the names of its contributors\n   may be used to endorse or promote products derived from this software\n   without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\"\nAND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE\nARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE\nLIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR\nCONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF\nSUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS\nINTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN\nCONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)\nARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF\nTHE POSSIBILITY OF SUCH DAMAGE.\n"
  },
  {
    "path": "README.md",
    "content": " :bangbang:\n **NOTE:** The CoreNEURON is now [integrated within NEURON](https://github.com/neuronsimulator/nrn/tree/master/src/coreneuron) simulator at the source level and hence all the latest development happens under the main GitHub project [neuronsimulator/nrn](https://github.com/neuronsimulator/nrn). To use CoreNEURON, see the latest NEURON documentation under [nrn.readthedocs.io](https://nrn.readthedocs.io/en/latest/).:bangbang:\n\n_______________________________________________________\n\n![CoreNEURON CI](https://github.com/BlueBrain/CoreNeuron/workflows/CoreNEURON%20CI/badge.svg) [![codecov](https://codecov.io/gh/BlueBrain/CoreNeuron/branch/master/graph/badge.svg?token=mguTdBx93p)](https://codecov.io/gh/BlueBrain/CoreNeuron)\n\n![CoreNEURON](docs/_static/bluebrain_coreneuron.jpg)\n\n\n## Citation\n\nIf you would like to know more about CoreNEURON or would like to cite it, then use the following paper:\n\n* Pramod Kumbhar, Michael Hines, Jeremy Fouriaux, Aleksandr Ovcharenko, James King, Fabien Delalondre and Felix Schürmann. CoreNEURON : An Optimized Compute Engine for the NEURON Simulator ([doi.org/10.3389/fninf.2019.00063](https://doi.org/10.3389/fninf.2019.00063))\n\n## License\n* See LICENSE.txt\n* See [NEURON](https://github.com/neuronsimulator/nrn)\n\n\n## Funding\n\nCoreNEURON is developed in a joint collaboration between the Blue Brain Project and Yale University. 
This work is supported by funding to the Blue Brain Project, a research center of the École polytechnique fédérale de Lausanne (EPFL), from the Swiss government’s ETH Board of the Swiss Federal Institutes of Technology, NIH grant number R01NS11613 (Yale University), the European Union Seventh Framework Program (FP7/2007-2013) under grant agreement n° 604102 (HBP) and the European Union’s Horizon 2020 Framework Programme for Research and Innovation under Specific Grant Agreement n° 720270 (Human Brain Project SGA1), n° 785907 (Human Brain Project SGA2) and n° 945539 (Human Brain Project SGA3).\n\nCopyright (c) 2016 - 2022 Blue Brain Project/EPFL\n"
  },
  {
    "path": "coreneuron/CMakeLists.txt",
    "content": "# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# Add compiler flags that should apply to all CoreNEURON targets, but which should not leak into\n# other included projects.\nadd_compile_definitions(${CORENRN_COMPILE_DEFS})\nadd_compile_options(${CORENRN_EXTRA_CXX_FLAGS})\nadd_link_options(${CORENRN_EXTRA_LINK_FLAGS})\n\n# put libraries (e.g. dll) in bin directory\nset(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)\n\n# =============================================================================\n# gather various source files\n# =============================================================================\nfile(\n  GLOB\n  CORENEURON_CODE_FILES\n  \"apps/main1.cpp\"\n  \"apps/corenrn_parameters.cpp\"\n  \"gpu/nrn_acc_manager.cpp\"\n  \"io/*.cpp\"\n  \"io/reports/*.cpp\"\n  \"mechanism/*.cpp\"\n  \"mpi/core/nrnmpi_def_cinc.cpp\"\n  \"network/*.cpp\"\n  \"permute/*.cpp\"\n  \"sim/*.cpp\"\n  \"sim/scopmath/abort.cpp\"\n  \"sim/scopmath/newton_thread.cpp\"\n  \"utils/*.cpp\"\n  \"utils/*/*.c\"\n  \"utils/*/*.cpp\")\nset(MPI_LIB_FILES \"mpi/lib/mpispike.cpp\" \"mpi/lib/nrnmpi.cpp\")\nif(CORENRN_ENABLE_MPI)\n  # Building these requires -ldl, which is only added if MPI is enabled.\n  list(APPEND CORENEURON_CODE_FILES \"mpi/core/resolve.cpp\" \"mpi/core/nrnmpidec.cpp\")\nendif()\nfile(COPY ${CORENEURON_PROJECT_SOURCE_DIR}/external/Random123/include/Random123\n     DESTINATION ${CMAKE_BINARY_DIR}/include)\nlist(APPEND CORENEURON_CODE_FILES ${PROJECT_BINARY_DIR}/coreneuron/config/config.cpp)\n\nset(ENGINEMECH_CODE_FILE \"mechanism/mech/enginemech.cpp\")\n\n# for external mod files we need to generate modl_ref function in mod_func.c\nset(MODFUNC_PERL_SCRIPT \"mechanism/mech/mod_func.c.pl\")\n\nset(NMODL_UNITS_FILE 
\"${CMAKE_BINARY_DIR}/share/mod2c/nrnunits.lib\")\n\n# =============================================================================\n# Copy files that are required by nrnivmodl-core to the build tree at build time.\n# =============================================================================\ncpp_cc_build_time_copy(\n  INPUT \"${CMAKE_CURRENT_SOURCE_DIR}/${MODFUNC_PERL_SCRIPT}\"\n  OUTPUT \"${CMAKE_BINARY_DIR}/share/coreneuron/mod_func.c.pl\"\n  NO_TARGET)\ncpp_cc_build_time_copy(\n  INPUT \"${CMAKE_CURRENT_SOURCE_DIR}/${ENGINEMECH_CODE_FILE}\"\n  OUTPUT \"${CMAKE_BINARY_DIR}/share/coreneuron/enginemech.cpp\"\n  NO_TARGET)\nset(nrnivmodl_core_dependencies \"${CMAKE_BINARY_DIR}/share/coreneuron/mod_func.c.pl\"\n                                \"${CMAKE_BINARY_DIR}/share/coreneuron/enginemech.cpp\")\n# Set up build rules that copy builtin mod files from\n# {source}/coreneuron/mechanism/mech/modfile/*.mod to {build_dir}/share/modfile/\nfile(GLOB builtin_modfiles\n     \"${CORENEURON_PROJECT_SOURCE_DIR}/coreneuron/mechanism/mech/modfile/*.mod\")\nforeach(builtin_modfile ${builtin_modfiles})\n  # Construct the path in the build directory.\n  get_filename_component(builtin_modfile_name \"${builtin_modfile}\" NAME)\n  set(modfile_build_path \"${CMAKE_BINARY_DIR}/share/modfile/${builtin_modfile_name}\")\n  # Create a build rule to copy the modfile there.\n  cpp_cc_build_time_copy(\n    INPUT \"${builtin_modfile}\"\n    OUTPUT \"${modfile_build_path}\"\n    NO_TARGET)\n  list(APPEND nrnivmodl_core_dependencies \"${modfile_build_path}\")\nendforeach()\nadd_custom_target(coreneuron-copy-nrnivmodl-core-dependencies ALL\n                  DEPENDS ${nrnivmodl_core_dependencies})\n# Store the build-tree modfile paths in a cache variable; these are an implicit dependency of\n# nrnivmodl-core.\nset(CORENEURON_BUILTIN_MODFILES\n    \"${nrnivmodl_core_dependencies}\"\n    CACHE STRING \"List of builtin modfiles that nrnivmodl-core implicitly depends on\" FORCE)\n\n# 
=============================================================================\n# coreneuron GPU library\n# =============================================================================\nif(CORENRN_ENABLE_GPU)\n  # ~~~\n  # artificial cells and some other cpp files (using Random123) should be compiled\n  # without OpenACC to avoid use of GPU Random123 streams\n  # OL210813: this shouldn't be needed anymore, but it may have a small performance benefit\n  # ~~~\n  set(OPENACC_EXCLUDED_FILES\n      ${CMAKE_CURRENT_BINARY_DIR}/netstim.cpp\n      ${CMAKE_CURRENT_BINARY_DIR}/netstim_inhpoisson.cpp\n      ${CMAKE_CURRENT_BINARY_DIR}/pattern.cpp\n      ${CMAKE_CURRENT_SOURCE_DIR}/io/nrn_setup.cpp\n      ${CMAKE_CURRENT_SOURCE_DIR}/io/setup_fornetcon.cpp\n      ${CMAKE_CURRENT_SOURCE_DIR}/io/corenrn_data_return.cpp\n      ${CMAKE_CURRENT_SOURCE_DIR}/io/global_vars.cpp)\n\n  set_source_files_properties(${OPENACC_EXCLUDED_FILES} PROPERTIES COMPILE_FLAGS\n                                                                   \"-DDISABLE_OPENACC\")\n  # Only compile the explicit CUDA implementation of the Hines solver in GPU builds. Because of\n  # https://forums.developer.nvidia.com/t/cannot-dynamically-load-a-shared-library-containing-both-openacc-and-cuda-code/210972\n  # this cannot be included in the same shared library as the rest of the OpenACC code.\n  set(CORENEURON_CUDA_FILES ${CMAKE_CURRENT_SOURCE_DIR}/permute/cellorder.cu)\n\n  # Eigen functions cannot be called directly from OpenACC regions, but Eigen is sort-of compatible\n  # with being compiled as CUDA code. Because of\n  # https://forums.developer.nvidia.com/t/cannot-dynamically-load-a-shared-library-containing-both-openacc-and-cuda-code/210972\n  # this has to mean `nvc++ -cuda` rather than `nvcc`. 
We explicitly instantiate Eigen functions for\n  # different matrix sizes in partial_piv_lu.cpp (with CUDA attributes but without OpenACC or OpenMP\n  # annotations) and dispatch to these from a wrapper in partial_piv_lu.h that does have\n  # OpenACC/OpenMP annotations.\n  if(CORENRN_ENABLE_NMODL AND EXISTS ${CORENRN_MOD2CPP_INCLUDE}/partial_piv_lu/partial_piv_lu.cpp)\n    list(APPEND CORENEURON_CODE_FILES ${CORENRN_MOD2CPP_INCLUDE}/partial_piv_lu/partial_piv_lu.cpp)\n    if(CORENRN_ENABLE_GPU\n       AND CORENRN_HAVE_NVHPC_COMPILER\n       AND CMAKE_BUILD_TYPE STREQUAL \"Debug\")\n      # In this case OpenAccHelper.cmake passes -gpu=debug, which makes these Eigen functions\n      # extremely slow. Downgrade that to -gpu=lineinfo for this file.\n      set_source_files_properties(${CORENRN_MOD2CPP_INCLUDE}/partial_piv_lu/partial_piv_lu.cpp\n                                  PROPERTIES COMPILE_FLAGS \"-gpu=lineinfo,nodebug -O1\")\n    endif()\n  endif()\nendif()\n\n# =============================================================================\n# create libraries\n# =============================================================================\n\n# name of coreneuron mpi objects or dynamic library\nset(CORENRN_MPI_LIB_NAME\n    \"corenrn_mpi\"\n    CACHE INTERNAL \"\")\n\n# for non-dynamic mpi mode just build object files\nif(CORENRN_ENABLE_MPI AND NOT CORENRN_ENABLE_MPI_DYNAMIC)\n  add_library(${CORENRN_MPI_LIB_NAME} OBJECT ${MPI_LIB_FILES})\n  target_include_directories(\n    ${CORENRN_MPI_LIB_NAME} PRIVATE ${MPI_INCLUDE_PATH} ${CORENEURON_PROJECT_SOURCE_DIR}\n                                    ${CORENEURON_PROJECT_BINARY_DIR}/generated)\n  target_link_libraries(${CORENRN_MPI_LIB_NAME} ${CORENRN_CALIPER_LIB})\n  set_property(TARGET ${CORENRN_MPI_LIB_NAME} PROPERTY POSITION_INDEPENDENT_CODE ON)\n  set(CORENRN_MPI_OBJ $<TARGET_OBJECTS:${CORENRN_MPI_LIB_NAME}>)\nendif()\n\n# Library containing the bulk of the non-mechanism CoreNEURON code. 
This is always created and\n# installed as a static library, and then the nrnivmodl-core workflow extracts the object files from\n# it and does one of the following:\n#\n# * shared build: creates libcorenrnmech.so from these objects plus those from the translated MOD\n#   files\n# * static build: creates a (temporary, does not get installed) libcorenrnmech.a from these objects\n#   plus those from the translated MOD files, then statically links that into special-core\n#   (nrniv-core)\n#\n# This scheme means that both core and mechanism .o files are linked in a single step, which is\n# important for GPU linking. It does, however, mean that the core code is installed twice, once in\n# libcoreneuron-core.a and once in libcorenrnmech.so (shared) or nrniv-core (static). In a GPU\n# build, libcoreneuron-cuda.{a,so} is also linked to provide the CUDA implementation of the Hines\n# solver. This cannot be included in coreneuron-core because of this issue:\n# https://forums.developer.nvidia.com/t/cannot-dynamically-load-a-shared-library-containing-both-openacc-and-cuda-code/210972\nadd_library(coreneuron-core STATIC ${CORENEURON_CODE_FILES} ${CORENRN_MPI_OBJ})\nif(CORENRN_ENABLE_GPU)\n  set(coreneuron_cuda_target coreneuron-cuda)\n  add_library(coreneuron-cuda ${COMPILE_LIBRARY_TYPE} ${CORENEURON_CUDA_FILES})\n  target_link_libraries(coreneuron-core PUBLIC coreneuron-cuda)\nendif()\n\nforeach(target coreneuron-core ${coreneuron_cuda_target})\n  target_include_directories(${target} PRIVATE ${CORENEURON_PROJECT_SOURCE_DIR}\n                                               ${CORENEURON_PROJECT_BINARY_DIR}/generated)\nendforeach()\n\n# we can link to MPI libraries in non-dynamic-mpi build\nif(CORENRN_ENABLE_MPI AND NOT CORENRN_ENABLE_MPI_DYNAMIC)\n  target_link_libraries(coreneuron-core PUBLIC ${MPI_CXX_LIBRARIES})\nendif()\n\n# ~~~\n# main coreneuron library needs to be linked to libdl.so\n# only in case of dynamic mpi build. 
But on old systems\n# like centos7, we saw the mpich library require an explicit\n# link to libdl.so. See\n#   https://github.com/neuronsimulator/nrn-build-ci/pull/51\n# ~~~\ntarget_link_libraries(coreneuron-core PUBLIC ${CMAKE_DL_LIBS})\n\n# this is where we handle dynamic mpi library build\nif(CORENRN_ENABLE_MPI AND CORENRN_ENABLE_MPI_DYNAMIC)\n  # store mpi library targets that will be built\n  list(APPEND corenrn_mpi_targets \"\")\n\n  # ~~~\n  # if coreneuron is built as a submodule of neuron then check if NEURON has created\n  # a list of libraries that need to be built. We use neuron cmake variables here because\n  # we don't need to duplicate CMake code into coreneuron (we want to have unified cmake\n  # project soon). In the absence of neuron just build a single library libcorenrn_mpi.\n  # This is mostly used for testing.\n  # ~~~\n  if(NOT CORENEURON_AS_SUBPROJECT)\n    add_library(${CORENRN_MPI_LIB_NAME} SHARED ${MPI_LIB_FILES})\n    target_link_libraries(${CORENRN_MPI_LIB_NAME} ${MPI_CXX_LIBRARIES})\n    target_include_directories(\n      ${CORENRN_MPI_LIB_NAME} PRIVATE ${MPI_INCLUDE_PATH} ${CORENEURON_PROJECT_SOURCE_DIR}\n                                      ${CORENEURON_PROJECT_BINARY_DIR}/generated)\n    set_property(TARGET ${CORENRN_MPI_LIB_NAME} PROPERTY POSITION_INDEPENDENT_CODE ON)\n    list(APPEND corenrn_mpi_targets ${CORENRN_MPI_LIB_NAME})\n  else()\n    # ~~~\n    # from neuron we know how many different libraries need to be built, their names and\n    # include paths to be used for building shared libraries. Iterate through those\n    # and build a separate library for each MPI distribution. 
For example, following\n    # libraries are created:\n    # - libcorenrn_mpich.so\n    # - libcorenrn_ompi.so\n    # - libcorenrn_mpt.so\n    # ~~~\n    list(LENGTH NRN_MPI_LIBNAME_LIST _num_mpi)\n    math(EXPR num_mpi \"${_num_mpi} - 1\")\n    foreach(val RANGE ${num_mpi})\n      list(GET NRN_MPI_INCLUDE_LIST ${val} include)\n      list(GET NRN_MPI_LIBNAME_LIST ${val} libname)\n\n      add_library(core${libname}_lib SHARED ${MPI_LIB_FILES})\n      target_link_libraries(core${libname}_lib ${CORENRN_CALIPER_LIB})\n      target_include_directories(\n        core${libname}_lib\n        PUBLIC ${include}\n        PRIVATE ${CORENEURON_PROJECT_SOURCE_DIR} ${CORENEURON_PROJECT_BINARY_DIR}/generated)\n\n      # ~~~\n      # TODO: somehow mingw requires explicit linking. This needs to be verified\n      # when we will test coreneuron on windows.\n      # ~~~\n      if(MINGW) # type msmpi only\n        add_dependencies(core${libname}_lib coreneuron-core)\n        target_link_libraries(core${libname}_lib ${MPI_C_LIBRARIES} coreneuron-core)\n      endif()\n      set_property(TARGET core${libname}_lib PROPERTY OUTPUT_NAME core${libname})\n      list(APPEND corenrn_mpi_targets \"core${libname}_lib\")\n    endforeach(val)\n  endif()\n\n  set_target_properties(\n    ${corenrn_mpi_targets}\n    PROPERTIES ARCHIVE_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib\n               LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib\n               POSITION_INDEPENDENT_CODE ON)\n  install(TARGETS ${corenrn_mpi_targets} DESTINATION lib)\nendif()\n\n# Suppress some compiler warnings.\ntarget_compile_options(coreneuron-core PRIVATE ${CORENEURON_CXX_WARNING_SUPPRESSIONS})\ntarget_link_libraries(coreneuron-core PUBLIC ${reportinglib_LIBRARY} ${sonatareport_LIBRARY}\n                                             ${CORENRN_CALIPER_LIB} ${likwid_LIBRARIES})\n\n# TODO: fix adding a dependency of coreneuron-core on CLI11::CLI11 when CLI11 is a submodule. 
Right\n# now this doesn't work because the CLI11 targets are not exported/installed but coreneuron-core is.\nget_target_property(CLI11_HEADER_DIRECTORY CLI11::CLI11 INTERFACE_INCLUDE_DIRECTORIES)\ntarget_include_directories(\n  coreneuron-core SYSTEM PRIVATE ${CLI11_HEADER_DIRECTORY}\n                                 ${CORENEURON_PROJECT_SOURCE_DIR}/external/Random123/include)\n\n# See: https://en.cppreference.com/w/cpp/filesystem#Notes\nif(CMAKE_CXX_COMPILER_IS_GCC AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS 9.1)\n  target_link_libraries(coreneuron-core PUBLIC stdc++fs)\nendif()\n\nif(CORENRN_ENABLE_GPU)\n  # nrnran123.cpp uses Boost.Pool in GPU builds if it's available.\n  find_package(Boost QUIET)\n  if(Boost_FOUND)\n    message(STATUS \"Boost found, enabling use of memory pools for Random123...\")\n    target_include_directories(coreneuron-core SYSTEM PRIVATE ${Boost_INCLUDE_DIRS})\n    target_compile_definitions(coreneuron-core PRIVATE CORENEURON_USE_BOOST_POOL)\n  endif()\nendif()\n\nset_target_properties(\n  coreneuron-core ${coreneuron_cuda_target}\n  PROPERTIES ARCHIVE_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib\n             LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib\n             POSITION_INDEPENDENT_CODE ${CORENRN_ENABLE_SHARED})\ncpp_cc_configure_sanitizers(TARGET coreneuron-core ${coreneuron_cuda_target} ${corenrn_mpi_targets})\n\n# =============================================================================\n# create special-core with halfgap.mod for tests\n# =============================================================================\nset(modfile_directory \"${CORENEURON_PROJECT_SOURCE_DIR}/tests/integration/ring_gap/mod files\")\nfile(GLOB modfiles \"${modfile_directory}/*.mod\")\n\n# We have to link things like unit tests against this because some \"core\" .cpp files refer to\n# symbols in the translated versions of default .mod files\nset(nrniv_core_prefix \"${CMAKE_BINARY_DIR}/bin/${CMAKE_SYSTEM_PROCESSOR}\")\nset(corenrn_mech_library\n    
\"${nrniv_core_prefix}/${CMAKE_${COMPILE_LIBRARY_TYPE}_LIBRARY_PREFIX}corenrnmech${CMAKE_${COMPILE_LIBRARY_TYPE}_LIBRARY_SUFFIX}\"\n)\nset(output_binaries \"${nrniv_core_prefix}/special-core\" \"${corenrn_mech_library}\")\n\nadd_custom_command(\n  OUTPUT ${output_binaries}\n  DEPENDS coreneuron-core ${NMODL_TARGET_TO_DEPEND} ${modfiles} ${CORENEURON_BUILTIN_MODFILES}\n  COMMAND ${CMAKE_BINARY_DIR}/bin/nrnivmodl-core -b ${COMPILE_LIBRARY_TYPE} -m\n          ${CORENRN_MOD2CPP_BINARY} -p 4 \"${modfile_directory}\"\n  WORKING_DIRECTORY ${CMAKE_BINARY_DIR}/bin\n  COMMENT \"Running nrnivmodl-core with halfgap.mod\")\nadd_custom_target(nrniv-core ALL DEPENDS ${output_binaries})\n\n# Build a target representing the libcorenrnmech.so that is produced under bin/x86_64, which\n# executables such as the unit tests must link against\nadd_library(builtin-libcorenrnmech SHARED IMPORTED)\nadd_dependencies(builtin-libcorenrnmech nrniv-core)\nset_target_properties(builtin-libcorenrnmech PROPERTIES IMPORTED_LOCATION \"${corenrn_mech_library}\")\n\nif(CORENRN_ENABLE_GPU)\n  separate_arguments(CORENRN_ACC_FLAGS UNIX_COMMAND \"${NVHPC_ACC_COMP_FLAGS}\")\n  target_compile_options(coreneuron-core PRIVATE ${CORENRN_ACC_FLAGS})\nendif()\n\n# Create an extra target for use by NEURON when CoreNEURON is being built as a submodule. 
NEURON\n# tests will depend on this, so it must in turn depend on everything that is needed to run nrnivmodl\n# -coreneuron.\nadd_custom_target(coreneuron-for-tests)\nadd_dependencies(coreneuron-for-tests coreneuron-core ${NMODL_TARGET_TO_DEPEND})\n# Create an extra target for internal use that unit tests and so on can depend on.\n# ${corenrn_mech_library} is libcorenrnmech.{a,so}, which contains both the compiled default\n# mechanisms and the content of libcoreneuron-core.a.\nadd_library(coreneuron-all INTERFACE)\ntarget_link_libraries(coreneuron-all INTERFACE builtin-libcorenrnmech)\n# Also copy the dependencies of libcoreneuron-core as interface dependencies of this new target\n# (example: ${corenrn_mech_library} will probably depend on MPI, so when the unit tests link against\n# ${corenrn_mech_library} they need to know to link against MPI too).\nget_target_property(coreneuron_core_deps coreneuron-core LINK_LIBRARIES)\nif(coreneuron_core_deps)\n  foreach(dep ${coreneuron_core_deps})\n    target_link_libraries(coreneuron-all INTERFACE ${dep})\n  endforeach()\nendif()\n\n# Make headers avail to build tree\nconfigure_file(engine.h.in ${CMAKE_BINARY_DIR}/include/coreneuron/engine.h @ONLY)\n\nfile(\n  GLOB_RECURSE main_headers\n  RELATIVE \"${CMAKE_CURRENT_SOURCE_DIR}\"\n  *.h *.hpp)\n\nconfigure_file(\"${CORENEURON_PROJECT_BINARY_DIR}/generated/coreneuron/config/neuron_version.hpp\"\n               \"${CMAKE_BINARY_DIR}/include/coreneuron/config/neuron_version.hpp\" COPYONLY)\nforeach(header ${main_headers})\n  configure_file(\"${header}\" \"${CMAKE_BINARY_DIR}/include/coreneuron/${header}\" COPYONLY)\nendforeach()\n\nconfigure_file(\"utils/profile/profiler_interface.h\"\n               ${CMAKE_BINARY_DIR}/include/coreneuron/nrniv/profiler_interface.h COPYONLY)\n\n# main program required for building special-core\nfile(COPY apps/coreneuron.cpp DESTINATION ${CMAKE_BINARY_DIR}/share/coreneuron)\n\n# 
=============================================================================\n# Install main targets\n# =============================================================================\n\n# coreneuron main libraries\ninstall(\n  TARGETS coreneuron-core ${coreneuron_cuda_target}\n  EXPORT coreneuron\n  LIBRARY DESTINATION lib\n  ARCHIVE DESTINATION lib\n  INCLUDES\n  DESTINATION $<INSTALL_INTERFACE:include>)\n\n# headers and some standalone code files for nrnivmodl-core\ninstall(\n  DIRECTORY ${CMAKE_BINARY_DIR}/include/coreneuron\n  DESTINATION include/\n  FILES_MATCHING\n  PATTERN \"*.h*\"\n  PATTERN \"*.ipp\")\ninstall(FILES ${MODFUNC_PERL_SCRIPT} ${ENGINEMECH_CODE_FILE} DESTINATION share/coreneuron)\n\n# copy mod2c/nmodl for nrnivmodl-core\ninstall(PROGRAMS ${CORENRN_MOD2CPP_BINARY} DESTINATION bin)\n\nif(NOT CORENRN_ENABLE_NMODL)\n  install(FILES ${NMODL_UNITS_FILE} DESTINATION share/mod2c)\nendif()\n\n# install nrniv-core app\ninstall(\n  PROGRAMS ${CMAKE_BINARY_DIR}/bin/${CMAKE_HOST_SYSTEM_PROCESSOR}/special-core\n  DESTINATION bin\n  RENAME nrniv-core)\ninstall(FILES apps/coreneuron.cpp DESTINATION share/coreneuron)\n\n# install mechanism library in shared library builds, if we're linking statically then there is no\n# need\nif(CORENRN_ENABLE_SHARED)\n  install(FILES ${corenrn_mech_library} DESTINATION lib)\nendif()\n\n# install random123 and nmodl headers\ninstall(DIRECTORY ${CMAKE_BINARY_DIR}/include/ DESTINATION include)\n\n# install mod files\ninstall(DIRECTORY ${CMAKE_BINARY_DIR}/share/modfile DESTINATION share)\n"
  },
  {
    "path": "coreneuron/apps/coreneuron.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <coreneuron/engine.h>\n#include \"coreneuron/utils/profile/profiler_interface.h\"\n\nint main(int argc, char** argv) {\n    coreneuron::Instrumentor::init_profile();\n    auto solve_core_result = solve_core(argc, argv);\n    coreneuron::Instrumentor::finalize_profile();\n    return solve_core_result;\n}\n"
  },
  {
    "path": "coreneuron/apps/corenrn_parameters.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n\n#include <CLI/CLI.hpp>\n\nnamespace coreneuron {\n\nextern std::string cnrn_version();\n\ncorenrn_parameters::corenrn_parameters()\n    : m_app{std::make_unique<CLI::App>(\"CoreNeuron - Optimised Simulator Engine for NEURON.\")} {\n    auto& app = *m_app;\n    app.set_config(\"--read-config\", \"\", \"Read parameters from ini file\", false)\n        ->check(CLI::ExistingFile);\n    app.add_option(\"--write-config\",\n                   this->writeParametersFilepath,\n                   \"Write parameters to this file\",\n                   false);\n\n    app.add_flag(\n        \"--mpi\",\n        this->mpi_enable,\n        \"Enable MPI. In order to initialize MPI environment this argument must be specified.\");\n    app.add_option(\"--mpi-lib\",\n                   this->mpi_lib,\n                   \"CoreNEURON MPI library to load for dynamic MPI support\",\n                   false);\n    app.add_flag(\"--gpu\", this->gpu, \"Activate GPU computation.\");\n    app.add_option(\"--dt\",\n                   this->dt,\n                   \"Fixed time step. The default value is set by defaults.dat or is 0.025.\",\n                   true)\n        ->check(CLI::Range(-1'000., 1e9));\n    app.add_option(\"-e, --tstop\", this->tstop, \"Stop Time in ms.\")->check(CLI::Range(0., 1e9));\n    app.add_flag(\"--show\");\n    app.add_set(\n        \"--verbose\",\n        this->verbose,\n        {verbose_level::NONE, verbose_level::ERROR, verbose_level::INFO, verbose_level::DEBUG_INFO},\n        \"Verbose level: 0 = NONE, 1 = ERROR, 2 = INFO, 3 = DEBUG. 
Default is INFO\");\n    app.add_flag(\"--model-stats\",\n                 this->model_stats,\n                 \"Print number of instances of each mechanism and detailed memory stats.\");\n\n    auto sub_gpu = app.add_option_group(\"GPU\", \"Commands relative to GPU.\");\n    sub_gpu\n        ->add_option(\"-W, --nwarp\",\n                     this->nwarp,\n                     \"Number of warps to execute in parallel the Hines solver. Each warp solves a \"\n                     \"group of cells. (Only used with cell permute 2)\",\n                     true)\n        ->check(CLI::Range(0, 1'000'000));\n    sub_gpu\n        ->add_option(\"-R, --cell-permute\",\n                     this->cell_interleave_permute,\n                     \"Cell permutation: 0 No permutation; 1 optimise node adjacency; 2 optimize \"\n                     \"parent adjacency.\",\n                     true)\n        ->check(CLI::Range(0, 2));\n    sub_gpu->add_flag(\"--cuda-interface\",\n                      this->cuda_interface,\n                      \"Activate CUDA branch of the code.\");\n    sub_gpu->add_option(\"-n, --num-gpus\", this->num_gpus, \"Number of gpus to use per node.\");\n\n    auto sub_input = app.add_option_group(\"input\", \"Input dataset options.\");\n    sub_input->add_option(\"-d, --datpath\", this->datpath, \"Path containing CoreNeuron data files.\")\n        ->check(CLI::ExistingDirectory);\n    sub_input->add_option(\"-f, --filesdat\", this->filesdat, \"Name for the distribution file.\", true)\n        ->check(CLI::ExistingFile);\n    sub_input\n        ->add_option(\"-p, --pattern\",\n                     this->patternstim,\n                     \"Apply patternstim using the specified spike file.\")\n        ->check(CLI::ExistingFile);\n    sub_input\n        ->add_option(\"-s, --seed\", this->seed, \"Initialization seed for random number generator.\")\n        ->check(CLI::Range(0, 100'000'000));\n    sub_input\n        ->add_option(\"-v, --voltage\",\n        
             this->voltage,\n                     \"Initial voltage used for nrn_finitialize(1, v_init). If 1000, then \"\n                     \"nrn_finitialize(0,...).\")\n        ->check(CLI::Range(-1e9, 1e9));\n    sub_input->add_option(\"--report-conf\", this->reportfilepath, \"Reports configuration file.\")\n        ->check(CLI::ExistingFile);\n    sub_input\n        ->add_option(\"--restore\",\n                     this->restorepath,\n                     \"Restore simulation from provided checkpoint directory.\")\n        ->check(CLI::ExistingDirectory);\n\n    auto sub_parallel = app.add_option_group(\"parallel\", \"Parallel processing options.\");\n    sub_parallel->add_flag(\"-c, --threading\",\n                           this->threading,\n                           \"Parallel threads. The default is serial threads.\");\n    sub_parallel->add_flag(\"--skip-mpi-finalize\",\n                           this->skip_mpi_finalize,\n                           \"Do not call mpi finalize.\");\n\n    auto sub_spike = app.add_option_group(\"spike\", \"Spike exchange options.\");\n    sub_spike\n        ->add_option(\"--ms-phases\", this->ms_phases, \"Number of multisend phases, 1 or 2.\", true)\n        ->check(CLI::Range(1, 2));\n    sub_spike\n        ->add_option(\"--ms-subintervals\",\n                     this->ms_subint,\n                     \"Number of multisend subintervals, 1 or 2.\",\n                     true)\n        ->check(CLI::Range(1, 2));\n    sub_spike->add_flag(\"--multisend\",\n                        this->multisend,\n                        \"Use Multisend spike exchange instead of Allgather.\");\n    sub_spike\n        ->add_option(\"--spkcompress\",\n                     this->spkcompress,\n                     \"Spike compression. 
Up to ARG are exchanged during MPI_Allgather.\",\n                     true)\n        ->check(CLI::Range(0, 100'000));\n    sub_spike->add_flag(\"--binqueue\", this->binqueue, \"Use bin queue.\");\n\n    auto sub_config = app.add_option_group(\"config\", \"Config options.\");\n    sub_config->add_option(\"-b, --spikebuf\", this->spikebuf, \"Spike buffer size.\", true)\n        ->check(CLI::Range(0, 2'000'000'000));\n    sub_config\n        ->add_option(\"-g, --prcellgid\",\n                     this->prcellgid,\n                     \"Output prcellstate information for the gid NUMBER.\")\n        ->check(CLI::Range(-1, 2'000'000'000));\n    sub_config->add_option(\"-k, --forwardskip\", this->forwardskip, \"Forwardskip to TIME\")\n        ->check(CLI::Range(0., 1e9));\n    sub_config\n        ->add_option(\n            \"-l, --celsius\",\n            this->celsius,\n            \"Temperature in degC. The default value is set in defaults.dat or else is 34.0.\",\n            true)\n        ->check(CLI::Range(-1000., 1000.));\n    sub_config\n        ->add_option(\"--mindelay\",\n                     this->mindelay,\n                     \"Maximum integration interval (likely reduced by minimum NetCon delay).\",\n                     true)\n        ->check(CLI::Range(0., 1e9));\n    sub_config\n        ->add_option(\"--report-buffer-size\",\n                     this->report_buff_size,\n                     \"Size in MB of the report buffer.\")\n        ->check(CLI::Range(1, 128));\n\n    auto sub_output = app.add_option_group(\"output\", \"Output configuration.\");\n    sub_output->add_option(\"-i, --dt_io\", this->dt_io, \"Dt of I/O.\", true)\n        ->check(CLI::Range(-1000., 1e9));\n    sub_output->add_option(\"-o, --outpath\",\n                           this->outpath,\n                           \"Path to place output data files.\",\n                           true);\n    sub_output->add_option(\"--checkpoint\",\n                           
this->checkpointpath,\n                           \"Enable checkpoint and specify directory to store related files.\");\n\n    app.add_flag(\"-v, --version\", this->show_version, \"Show version information and quit.\");\n\n    CLI::retire_option(app, \"--show\");\n}\n\n// Implementation in .cpp file where CLI types are complete.\ncorenrn_parameters::~corenrn_parameters() = default;\n\nstd::string corenrn_parameters::config_to_str(bool default_also, bool write_description) const {\n    return m_app->config_to_str(default_also, write_description);\n}\n\nvoid corenrn_parameters::reset() {\n    static_cast<corenrn_parameters_data&>(*this) = corenrn_parameters_data{};\n    m_app->clear();\n}\n\nvoid corenrn_parameters::parse(int argc, char** argv) {\n    try {\n        m_app->parse(argc, argv);\n        if (verbose == verbose_level::NONE) {\n            nrn_nobanner_ = 1;\n        }\n    } catch (const CLI::ExtrasError& e) {\n        // in case of parsing errors, show message with exception\n        std::cerr << \"CLI parsing error, see nrniv-core --help for more information. \\n\"\n                  << std::endl;\n        m_app->exit(e);\n        throw e;\n    } catch (const CLI::ParseError& e) {\n        // --help is also reported as a ParseError; in this case exit after showing all options\n        m_app->exit(e);\n        exit(0);\n    }\n\n#ifndef CORENEURON_ENABLE_GPU\n    if (gpu) {\n        std::cerr\n            << \"Error: GPU support was not enabled at build time but GPU execution was requested.\"\n            << std::endl;\n        exit(42);\n    }\n#endif\n\n    // if the user has asked for version info, print it and exit\n    if (show_version) {\n        std::cout << \"CoreNEURON Version : \" << cnrn_version() << std::endl;\n        exit(0);\n    }\n}\n\nstd::ostream& operator<<(std::ostream& os, const corenrn_parameters& corenrn_param) {\n    os << \"GENERAL PARAMETERS\" << std::endl\n       << \"--mpi=\" << (corenrn_param.mpi_enable ? 
\"true\" : \"false\") << std::endl\n       << \"--mpi-lib=\" << corenrn_param.mpi_lib << std::endl\n       << \"--gpu=\" << (corenrn_param.gpu ? \"true\" : \"false\") << std::endl\n       << \"--dt=\" << corenrn_param.dt << std::endl\n       << \"--tstop=\" << corenrn_param.tstop << std::endl\n       << std::endl\n       << \"GPU\" << std::endl\n       << \"--nwarp=\" << corenrn_param.nwarp << std::endl\n       << \"--cell-permute=\" << corenrn_param.cell_interleave_permute << std::endl\n       << \"--cuda-interface=\" << (corenrn_param.cuda_interface ? \"true\" : \"false\") << std::endl\n       << std::endl\n       << \"INPUT PARAMETERS\" << std::endl\n       << \"--voltage=\" << corenrn_param.voltage << std::endl\n       << \"--seed=\" << corenrn_param.seed << std::endl\n       << \"--datpath=\" << corenrn_param.datpath << std::endl\n       << \"--filesdat=\" << corenrn_param.filesdat << std::endl\n       << \"--pattern=\" << corenrn_param.patternstim << std::endl\n       << \"--report-conf=\" << corenrn_param.reportfilepath << std::endl\n       << std::left << std::setw(15) << \"--restore=\" << corenrn_param.restorepath << std::endl\n       << std::endl\n       << \"PARALLEL COMPUTATION PARAMETERS\" << std::endl\n       << \"--threading=\" << (corenrn_param.threading ? \"true\" : \"false\") << std::endl\n       << \"--skip_mpi_finalize=\" << (corenrn_param.skip_mpi_finalize ? \"true\" : \"false\")\n       << std::endl\n       << std::endl\n       << \"SPIKE EXCHANGE\" << std::endl\n       << \"--ms_phases=\" << corenrn_param.ms_phases << std::endl\n       << \"--ms_subintervals=\" << corenrn_param.ms_subint << std::endl\n       << \"--multisend=\" << (corenrn_param.multisend ? \"true\" : \"false\") << std::endl\n       << \"--spk_compress=\" << corenrn_param.spkcompress << std::endl\n       << \"--binqueue=\" << (corenrn_param.binqueue ? 
\"true\" : \"false\") << std::endl\n       << std::endl\n       << \"CONFIGURATION\" << std::endl\n       << \"--spikebuf=\" << corenrn_param.spikebuf << std::endl\n       << \"--prcellgid=\" << corenrn_param.prcellgid << std::endl\n       << \"--forwardskip=\" << corenrn_param.forwardskip << std::endl\n       << \"--celsius=\" << corenrn_param.celsius << std::endl\n       << \"--mindelay=\" << corenrn_param.mindelay << std::endl\n       << \"--report-buffer-size=\" << corenrn_param.report_buff_size << std::endl\n       << std::endl\n       << \"OUTPUT PARAMETERS\" << std::endl\n       << \"--dt_io=\" << corenrn_param.dt_io << std::endl\n       << \"--outpath=\" << corenrn_param.outpath << std::endl\n       << \"--checkpoint=\" << corenrn_param.checkpointpath << std::endl;\n\n    return os;\n}\n\ncorenrn_parameters corenrn_param;\nint nrn_nobanner_{0};\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/apps/corenrn_parameters.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#pragma once\n#include <cstdint>\n#include <memory>\n#include <ostream>\n#include <string>\n\n/**\n * \\class corenrn_parameters\n * \\brief Parses and contains Command Line parameters for Core Neuron\n *\n * This structure contains all the parameters that CoreNeuron fetches\n * from the Command Line. It uses the CLI11 libraries to parse these parameters\n * and saves them in an internal public structure. Each parameter can be\n * accessed or written freely. By default the constructor instantiates a\n * CLI11 object and initializes it for CoreNeuron use.\n * This object is freely accessible from any point of the program.\n * An ostream method is also provided to print out all the parameters that\n * CLI11 parses.\n * Please keep in mind that, due to the nature of the subcommands in CLI11,\n * the command line parameters for subcategories NEED to come before the relative\n * parameter. e.g. --mpi --gpu gpu --nwarp\n * Also single dash long options are not supported anymore (-mpi -> --mpi).\n */\n\nnamespace CLI {\nstruct App;\n}\n\nnamespace coreneuron {\n\nstruct corenrn_parameters_data {\n    enum verbose_level : std::uint32_t {\n        NONE = 0,\n        ERROR = 1,\n        INFO = 2,\n        DEBUG_INFO = 3,\n        DEFAULT = INFO\n    };\n\n    static constexpr int report_buff_size_default = 4;\n\n    unsigned spikebuf = 100'000;           /// Internal buffer used on every rank for spikes\n    int prcellgid = -1;                    /// Gid of cell for prcellstate\n    unsigned ms_phases = 2;                /// Number of multisend phases, 1 or 2\n    unsigned ms_subint = 2;                /// Number of multisend interval. 
1 or 2\n    unsigned spkcompress = 0;              /// Spike Compression\n    unsigned cell_interleave_permute = 0;  /// Cell interleaving permutation\n    unsigned nwarp = 65536;  /// Number of warps to balance for cell_interleave_permute == 2\n    unsigned num_gpus = 0;   /// Number of gpus to use per node\n    unsigned report_buff_size = report_buff_size_default;  /// Size in MB of the report buffer.\n    int seed = -1;  /// Initialization seed for random number generator (int)\n\n    bool mpi_enable = false;         /// Enable MPI flag.\n    bool skip_mpi_finalize = false;  /// Skip MPI finalization\n    bool multisend = false;          /// Use Multisend spike exchange instead of Allgather.\n    bool threading = false;          /// Enable pthread/openmp\n    bool gpu = false;                /// Enable GPU computation.\n    bool cuda_interface = false;     /// Enable CUDA interface (default is the OpenACC interface).\n                                  /// Branch of the code is executed through CUDA kernels instead of\n                                  /// OpenACC regions.\n    bool binqueue = false;  /// Use bin queue.\n\n    bool show_version = false;  /// Print version and exit.\n\n    bool model_stats = false;  /// Print mechanism counts and model size after initialization\n\n    verbose_level verbose{verbose_level::DEFAULT};  /// Verbosity-level\n\n    double tstop = 100;        /// Stop time of simulation in msec\n    double dt = -1000.0;       /// Timestep to use in msec\n    double dt_io = 0.1;        /// I/O timestep to use in msec\n    double dt_report;          /// I/O timestep to use in msec for reports\n    double celsius = -1000.0;  /// Temperature in degC.\n    double voltage = -65.0;    /// Initial voltage used for nrn_finitialize(1, v_init).\n    double forwardskip = 0.;   /// Forward skip to TIME.\n    double mindelay = 10.;     /// Maximum integration interval (likely reduced by minimum NetCon\n                               /// delay).\n\n    
std::string patternstim;             /// Apply patternstim using the specified spike file.\n    std::string datpath = \".\";           /// Directory path containing the .dat files\n    std::string outpath = \".\";           /// Directory where spikes will be written\n    std::string filesdat = \"files.dat\";  /// Name of file containing list of gids dat files read in\n    std::string restorepath;             /// Restore simulation from provided checkpoint directory.\n    std::string reportfilepath;          /// Reports configuration file.\n    std::string checkpointpath;  /// Enable checkpoint and specify directory to store related files.\n    std::string writeParametersFilepath;  /// Write parameters to this file\n    std::string mpi_lib;                  /// Name of CoreNEURON MPI library to load dynamically.\n};\n\nstruct corenrn_parameters: corenrn_parameters_data {\n    corenrn_parameters();   /// Constructor that initializes the CLI11 app.\n    ~corenrn_parameters();  /// Destructor defined in .cpp where CLI11 types are complete.\n\n    void parse(int argc, char* argv[]);  /// Parses the arguments with CLI11.\n\n    /** @brief Reset all parameters to their default values.\n     *\n     *  Unfortunately it is awkward to support `x = corenrn_parameters{}`\n     *  because `m_app` holds pointers to members of `corenrn_parameters`.\n     */\n    void reset();\n\n    inline bool is_quiet() {\n        return verbose == verbose_level::NONE;\n    }\n\n    /** @brief Return a string summarising the current parameter values.\n     *\n     * This forwards to the CLI11 method of the same name. 
Returns a string that\n     * could be read in as a config of the current values of the App.\n     *\n     * @param default_also Include any defaulted arguments.\n     * @param write_description Include option descriptions and the App description.\n     */\n    std::string config_to_str(bool default_also = false, bool write_description = false) const;\n\n  private:\n    // CLI app that performs CLI parsing. std::unique_ptr avoids having to\n    // include CLI11 headers from CoreNEURON headers, and therefore avoids\n    // CoreNEURON having to install CLI11 when using it from a submodule.\n    std::unique_ptr<CLI::App> m_app;\n};\n\nstd::ostream& operator<<(std::ostream& os,\n                         const corenrn_parameters& corenrn_param);  /// Printing method.\n\nextern corenrn_parameters corenrn_param;  /// Declaring global corenrn_parameters object for this\n                                          /// instance of CoreNeuron.\nextern int nrn_nobanner_;                 /// Global no banner setting\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/apps/main1.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n/**\n * @file main1.cpp\n * @date 26 Oct 2014\n * @brief File containing main driver routine for CoreNeuron\n */\n\n#include <cstring>\n#include <climits>\n#include <dlfcn.h>\n#include <memory>\n#include <vector>\n\n#include \"coreneuron/config/config.h\"\n#include \"coreneuron/utils/randoms/nrnran123.h\"\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/fast_imem.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/mechanism/register_mech.hpp\"\n#include \"coreneuron/io/output_spikes.hpp\"\n#include \"coreneuron/io/nrn_checkpoint.hpp\"\n#include \"coreneuron/utils/memory_utils.h\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n#include \"coreneuron/io/prcellstate.hpp\"\n#include \"coreneuron/utils/nrn_stats.h\"\n#include \"coreneuron/io/reports/nrnreport.hpp\"\n#include \"coreneuron/io/reports/binary_report_handler.hpp\"\n#include \"coreneuron/io/reports/report_handler.hpp\"\n#include \"coreneuron/io/reports/sonata_report_handler.hpp\"\n#include \"coreneuron/gpu/nrn_acc_manager.hpp\"\n#include \"coreneuron/utils/profile/profiler_interface.h\"\n#include \"coreneuron/network/partrans.hpp\"\n#include \"coreneuron/network/multisend.hpp\"\n#include \"coreneuron/io/nrn_setup.hpp\"\n#include \"coreneuron/io/file_utils.hpp\"\n#include \"coreneuron/io/nrn2core_direct.h\"\n#include \"coreneuron/io/core2nrn_data_return.hpp\"\n#include \"coreneuron/utils/utils.hpp\"\n\nextern \"C\" {\nconst char* corenrn_version() {\n    return coreneuron::bbcore_write_version;\n}\n\n// the CORENEURON_USE_LEGACY_UNITS determined by CORENRN_ENABLE_LEGACY_UNITS\nbool corenrn_units_use_legacy() 
{\n    return CORENEURON_USE_LEGACY_UNITS;\n}\n\nvoid (*nrn2core_part2_clean_)();\n\n/**\n * If \"export OMP_NUM_THREADS=n\" is not set then omp by default sets\n * the number of threads equal to the number of cores on this node.\n * If there are a number of mpi processes on this node as well, things\n * can go very slowly as there are so many more threads than cores.\n * Assume the NEURON users pc.nthread() is well chosen if\n * OMP_NUM_THREADS is not set.\n */\nvoid set_openmp_threads(int nthread) {\n#if defined(_OPENMP)\n    if (!getenv(\"OMP_NUM_THREADS\")) {\n        omp_set_num_threads(nthread);\n    }\n#endif\n}\n\n/**\n * Convert char* containing arguments from neuron to char* argv[] for\n * coreneuron command line argument parser.\n */\nchar* prepare_args(int& argc, char**& argv, int use_mpi, const char* mpi_lib, const char* arg) {\n    // first construct all arguments as string\n    std::string args(arg);\n    args.insert(0, \" coreneuron \");\n    args.append(\" --skip-mpi-finalize \");\n    if (use_mpi) {\n        args.append(\" --mpi \");\n    }\n\n    // if neuron has passed name of MPI library then add it to CLI\n    std::string corenrn_mpi_lib{mpi_lib};\n    if (!corenrn_mpi_lib.empty()) {\n        args.append(\" --mpi-lib \");\n        corenrn_mpi_lib += \" \";\n        args.append(corenrn_mpi_lib);\n    }\n\n    // we can't modify string with strtok, make copy\n    char* first = strdup(args.c_str());\n    const char* sep = \" \";\n\n    // first count the no of argument\n    char* token = strtok(first, sep);\n    argc = 0;\n    while (token) {\n        token = strtok(nullptr, sep);\n        argc++;\n    }\n    free(first);\n\n    // now build char*argv\n    argv = new char*[argc];\n    first = strdup(args.c_str());\n    token = strtok(first, sep);\n    for (int i = 0; token; i++) {\n        argv[i] = token;\n        token = strtok(nullptr, sep);\n    }\n\n    // return actual data to be freed\n    return first;\n}\n}\n\nnamespace coreneuron 
{\nvoid call_prcellstate_for_prcellgid(int prcellgid, int compute_gpu, int is_init);\n\n// bsize = 0 then per step transfer\n// bsize > 1 then full trajectory save into arrays.\nvoid get_nrn_trajectory_requests(int bsize) {\n    if (nrn2core_get_trajectory_requests_) {\n        for (int tid = 0; tid < nrn_nthread; ++tid) {\n            NrnThread& nt = nrn_threads[tid];\n            int n_pr;\n            int n_trajec;\n            int* types;\n            int* indices;\n            void** vpr;\n            double** varrays;\n            double** pvars;\n\n            // bsize is passed by reference, the return value will determine if\n            // per step return or entire trajectory return.\n            (*nrn2core_get_trajectory_requests_)(\n                tid, bsize, n_pr, vpr, n_trajec, types, indices, pvars, varrays);\n            delete_trajectory_requests(nt);\n            if (n_trajec) {\n                TrajectoryRequests* tr = new TrajectoryRequests;\n                nt.trajec_requests = tr;\n                tr->bsize = bsize;\n                tr->n_pr = n_pr;\n                tr->n_trajec = n_trajec;\n                tr->vsize = 0;\n                tr->vpr = vpr;\n                tr->gather = new double*[n_trajec];\n                tr->varrays = varrays;\n                tr->scatter = pvars;\n                for (int i = 0; i < n_trajec; ++i) {\n                    tr->gather[i] = stdindex2ptr(types[i], indices[i], nt);\n                }\n                delete[] types;\n                delete[] indices;\n            }\n        }\n    }\n}\n\nvoid nrn_init_and_load_data(int argc,\n                            char* argv[],\n                            CheckPoints& checkPoints,\n                            bool is_mapping_needed,\n                            bool run_setup_cleanup) {\n#if defined(NRN_FEEXCEPT)\n    nrn_feenableexcept();\n#endif\n\n    /// profiler like tau/vtune : do not measure from beginning\n    Instrumentor::stop_profile();\n\n    // 
memory footprint after mpi initialisation\n    if (!corenrn_param.is_quiet()) {\n        report_mem_usage(\"After MPI_Init\");\n    }\n\n    // initialise default coreneuron parameters\n    initnrn();\n\n    // set global variables\n    // precedence is: set by user, globals.dat, 34.0\n    celsius = corenrn_param.celsius;\n\n#if CORENEURON_ENABLE_GPU\n    if (!corenrn_param.gpu && corenrn_param.cell_interleave_permute == 2) {\n        fprintf(stderr,\n                \"compiled with CORENEURON_ENABLE_GPU does not allow the combination of \"\n                \"--cell-permute=2 and \"\n                \"missing --gpu\\n\");\n        exit(1);\n    }\n    if (!corenrn_param.gpu && corenrn_param.cuda_interface) {\n        fprintf(stderr,\n                \"compiled with OpenACC/CUDA does not allow the combination of --cuda-interface and \"\n                \"missing --gpu\\n\");\n        exit(1);\n    }\n#endif\n\n// if multi-threading enabled, make sure mpi library supports it\n#if NRNMPI\n    if (corenrn_param.mpi_enable && corenrn_param.threading) {\n        nrnmpi_check_threading_support();\n    }\n#endif\n\n    // full path of files.dat file\n    std::string filesdat(corenrn_param.datpath + \"/\" + corenrn_param.filesdat);\n\n    // read the global variable names and set their values from globals.dat\n    set_globals(corenrn_param.datpath.c_str(), (corenrn_param.seed >= 0), corenrn_param.seed);\n\n    // set global variables for start time, timestep and temperature\n    if (!corenrn_embedded) {\n        t = checkPoints.restore_time();\n    }\n\n    if (corenrn_param.dt != -1000.) {  // command line arg highest precedence\n        dt = corenrn_param.dt;\n    } else if (dt == -1000.) {  // not on command line and no dt in globals.dat\n        dt = 0.025;             // lowest precedence\n    }\n\n    corenrn_param.dt = dt;\n\n    rev_dt = (int) (1. / dt);\n\n    if (corenrn_param.celsius != -1000.) 
{  // command line arg highest precedence\n        celsius = corenrn_param.celsius;\n    } else if (celsius == -1000.) {  // not on command line and no celsius in globals.dat\n        celsius = 34.0;              // lowest precedence\n    }\n\n    corenrn_param.celsius = celsius;\n\n    // create net_cvode instance\n    mk_netcvode();\n\n    // One part done before call to nrn_setup. Other part after.\n\n    if (!corenrn_param.patternstim.empty()) {\n        nrn_set_extra_thread0_vdata();\n    }\n\n    if (!corenrn_param.is_quiet()) {\n        report_mem_usage(\"Before nrn_setup\");\n    }\n\n    // set if need to interleave cells\n    interleave_permute_type = corenrn_param.cell_interleave_permute;\n    cellorder_nwarp = corenrn_param.nwarp;\n    use_solve_interleave = corenrn_param.cell_interleave_permute;\n\n    if (corenrn_param.gpu && interleave_permute_type == 0) {\n        if (nrnmpi_myid == 0) {\n            printf(\n                \" WARNING : GPU execution requires --cell-permute type 1 or 2. Setting it to 1.\\n\");\n        }\n        interleave_permute_type = 1;\n        use_solve_interleave = true;\n    }\n\n    // multisend options\n    use_multisend_ = corenrn_param.multisend ? 1 : 0;\n    n_multisend_interval = corenrn_param.ms_subint;\n    use_phase2_ = (corenrn_param.ms_phases == 2) ? 1 : 0;\n\n    // reading *.dat files and setting up the data structures, setting mindelay\n    nrn_setup(filesdat.c_str(),\n              is_mapping_needed,\n              checkPoints,\n              run_setup_cleanup,\n              corenrn_param.datpath.c_str(),\n              checkPoints.get_restore_path().c_str(),\n              &corenrn_param.mindelay);\n\n    // Allgather spike compression and  bin queuing.\n    nrn_use_bin_queue_ = corenrn_param.binqueue;\n    int spkcompress = corenrn_param.spkcompress;\n    nrnmpi_spike_compress(spkcompress, (spkcompress ? 
true : false), use_multisend_);\n\n    if (!corenrn_param.is_quiet()) {\n        report_mem_usage(\"After nrn_setup \");\n    }\n\n    // Invoke PatternStim\n    if (!corenrn_param.patternstim.empty()) {\n        nrn_mkPatternStim(corenrn_param.patternstim.c_str(), corenrn_param.tstop);\n    }\n\n    /// Setting the timeout\n    nrn_set_timeout(200.);\n\n    // show all configuration parameters for current run\n    if (nrnmpi_myid == 0 && !corenrn_param.is_quiet()) {\n        std::cout << corenrn_param << std::endl;\n        std::cout << \" Start time (t) = \" << t << std::endl << std::endl;\n    }\n\n    // allocate buffer for mpi communication\n    mk_spikevec_buffer(corenrn_param.spikebuf);\n\n    if (!corenrn_param.is_quiet()) {\n        report_mem_usage(\"After mk_spikevec_buffer\");\n    }\n\n    // In direct mode there are likely trajectory record requests\n    // to allow processing in NEURON after simulation by CoreNEURON\n    if (corenrn_embedded) {\n        // arg is additional vector size required (how many items will be\n        // written to the double*) but NEURON can instead\n        // specify that returns will be on a per time step basis.\n        get_nrn_trajectory_requests(int((corenrn_param.tstop - t) / corenrn_param.dt) + 2);\n\n        // In direct mode, CoreNEURON has exactly the behavior of\n        // ParallelContext.psolve(tstop). Ie a sequence of such calls\n        // without an intervening h.finitialize() continues from the end\n        // of the previous call. I.e., all initial state, including\n        // the event queue has been set up in NEURON. And, at the end\n        // all final state, including the event queue will be sent back\n        // to NEURON. 
Here there is some first time only\n        // initialization and queue transfer.\n        direct_mode_initialize();\n        clear_spike_vectors();  // PreSyn send already recorded by NEURON\n        (*nrn2core_part2_clean_)();\n    }\n\n    if (corenrn_param.gpu) {\n        // Copy nrnthreads to device only after all the data are passed from NEURON and the\n        // nrnthreads on CPU are properly set up\n        setup_nrnthreads_on_device(nrn_threads, nrn_nthread);\n    }\n\n    if (corenrn_embedded) {\n        // Run nrn_init of mechanisms only to allocate any extra data needed on the GPU after\n        // nrnthreads are properly set up on the GPU\n        allocate_data_in_mechanism_nrn_init();\n    }\n\n    if (corenrn_param.gpu) {\n        if (nrn_have_gaps) {\n            nrn_partrans::copy_gap_indices_to_device();\n        }\n    }\n\n    // call prcellstate for prcellgid\n    call_prcellstate_for_prcellgid(corenrn_param.prcellgid, corenrn_param.gpu, 1);\n}\n\nvoid call_prcellstate_for_prcellgid(int prcellgid, int compute_gpu, int is_init) {\n    char prcellname[1024];\n#ifdef ENABLE_CUDA\n    const char* prprefix = \"cu\";\n#else\n    const char* prprefix = \"acc\";\n#endif\n\n    if (prcellgid >= 0) {\n        if (compute_gpu) {\n            if (is_init)\n                sprintf(prcellname, \"%s_gpu_init\", prprefix);\n            else\n                sprintf(prcellname, \"%s_gpu_t%f\", prprefix, t);\n        } else {\n            if (is_init)\n                strcpy(prcellname, \"cpu_init\");\n            else\n                sprintf(prcellname, \"cpu_t%f\", t);\n        }\n        update_nrnthreads_on_host(nrn_threads, nrn_nthread);\n        prcellstate(prcellgid, prcellname);\n    }\n}\n\n/* perform forwardskip and call prcellstate for prcellgid */\nvoid handle_forward_skip(double forwardskip, int prcellgid) {\n    double savedt = dt;\n    double savet = t;\n\n    dt = forwardskip * 0.1;\n    t = -1e9;\n    dt2thread(-1.);\n\n    for (int step = 0; 
step < 10; ++step) {\n        nrn_fixed_step_minimal();\n    }\n\n    if (prcellgid >= 0) {\n        prcellstate(prcellgid, \"fs\");\n    }\n\n    dt = savedt;\n    t = savet;\n    dt2thread(-1.);\n\n    // clear spikes generated during forward skip (with negative time)\n    clear_spike_vectors();\n}\n\nstd::string cnrn_version() {\n    return version::to_string();\n}\n\n\nstatic void trajectory_return() {\n    if (nrn2core_trajectory_return_) {\n        for (int tid = 0; tid < nrn_nthread; ++tid) {\n            NrnThread& nt = nrn_threads[tid];\n            TrajectoryRequests* tr = nt.trajec_requests;\n            if (tr && tr->varrays) {\n                (*nrn2core_trajectory_return_)(tid, tr->n_pr, tr->bsize, tr->vsize, tr->vpr, nt._t);\n            }\n        }\n    }\n}\n\nstd::unique_ptr<ReportHandler> create_report_handler(const ReportConfiguration& config,\n                                                     const SpikesInfo& spikes_info) {\n    std::unique_ptr<ReportHandler> report_handler;\n    if (config.format == \"Bin\") {\n        report_handler = std::make_unique<BinaryReportHandler>();\n    } else if (config.format == \"SONATA\") {\n        report_handler = std::make_unique<SonataReportHandler>(spikes_info);\n    } else {\n        if (nrnmpi_myid == 0) {\n            printf(\" WARNING : Report name '%s' has unknown format: '%s'.\\n\",\n                   config.name.data(),\n                   config.format.data());\n        }\n        return nullptr;\n    }\n    return report_handler;\n}\n\n}  // namespace coreneuron\n\n/// The following high-level functions are marked as \"extern C\"\n/// for compat with C, namely Neuron mod files.\n/// They split the previous solve_core so that intermediate init of external mechanisms can occur.\n/// See mech/corenrnmech.cpp for the new all-in-one solve_core (not compiled into the coreneuron\n/// lib since with nrnivmodl-core we have 'future' external mechanisms)\n\nusing namespace coreneuron;\n\n#if NRNMPI && 
defined(CORENEURON_ENABLE_MPI_DYNAMIC)\nstatic void* load_dynamic_mpi(const std::string& libname) {\n    dlerror();\n    void* handle = dlopen(libname.c_str(), RTLD_NOW | RTLD_GLOBAL);\n    const char* error = dlerror();\n    if (error) {\n        std::string err_msg = std::string(\"Could not open dynamic MPI library: \") + error + \"\\n\";\n        throw std::runtime_error(err_msg);\n    }\n    return handle;\n}\n#endif\n\nextern \"C\" void mk_mech_init(int argc, char** argv) {\n    // reset all parameters to their default values\n    corenrn_param.reset();\n\n    // read command line parameters and parameter config files\n    corenrn_param.parse(argc, argv);\n\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n#ifdef CORENEURON_ENABLE_MPI_DYNAMIC\n        // CoreNEURON relies on NEURON to detect the MPI library distribution and\n        // the name of the library itself. Make sure the library name is specified\n        // via the --mpi-lib CLI option.\n        if (corenrn_param.mpi_lib.empty()) {\n            throw std::runtime_error(\n                \"For dynamic MPI support you must pass '--mpi-lib \"\n                \"/path/libcorenrnmpi_<name>.<suffix>' argument!\\n\");\n        }\n\n        // NEURON can call CoreNEURON multiple times, so we do not\n        // want to initialize/load the MPI library more than once\n        static bool mpi_lib_loaded = false;\n        if (!mpi_lib_loaded) {\n            auto mpi_handle = load_dynamic_mpi(corenrn_param.mpi_lib);\n            mpi_manager().resolve_symbols(mpi_handle);\n            mpi_lib_loaded = true;\n        }\n#endif\n        auto ret = nrnmpi_init(&argc, &argv, corenrn_param.is_quiet());\n        nrnmpi_numprocs = ret.numprocs;\n        nrnmpi_myid = ret.myid;\n    }\n#endif\n\n#ifdef CORENEURON_ENABLE_GPU\n    if (corenrn_param.gpu) {\n        init_gpu();\n        cnrn_target_copyin(&celsius);\n        cnrn_target_copyin(&pi);\n        cnrn_target_copyin(&secondorder);\n        
nrnran123_initialise_global_state_on_device();\n    }\n#endif\n\n    if (!corenrn_param.writeParametersFilepath.empty()) {\n        std::ofstream out(corenrn_param.writeParametersFilepath, std::ios::trunc);\n        out << corenrn_param.config_to_str(false, false);\n        out.close();\n    }\n\n    // reads mechanism information from bbcore_mech.dat\n    mk_mech((corenrn_param.datpath).c_str());\n}\n\nextern \"C\" int run_solve_core(int argc, char** argv) {\n    Instrumentor::phase_begin(\"main\");\n\n    std::vector<ReportConfiguration> configs;\n    std::vector<std::unique_ptr<ReportHandler>> report_handlers;\n    SpikesInfo spikes_info;\n    bool reports_needs_finalize = false;\n\n    if (!corenrn_param.is_quiet()) {\n        report_mem_usage(\"After mk_mech\");\n    }\n\n    // Create outpath if it does not exist\n    if (nrnmpi_myid == 0) {\n        mkdir_p(corenrn_param.outpath.c_str());\n    }\n\n    if (!corenrn_param.reportfilepath.empty()) {\n        configs = create_report_configurations(corenrn_param.reportfilepath,\n                                               corenrn_param.outpath,\n                                               spikes_info);\n        reports_needs_finalize = !configs.empty();\n    }\n\n    CheckPoints checkPoints{corenrn_param.checkpointpath, corenrn_param.restorepath};\n\n    // initialization and loading functions moved to a separate function\n    {\n        Instrumentor::phase p(\"load-model\");\n        nrn_init_and_load_data(argc, argv, checkPoints, !configs.empty());\n    }\n\n    std::string output_dir = corenrn_param.outpath;\n\n    if (nrnmpi_myid == 0) {\n        mkdir_p(output_dir.c_str());\n    }\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        nrnmpi_barrier();\n    }\n#endif\n    bool compute_gpu = corenrn_param.gpu;\n\n    nrn_pragma_acc(update device(celsius, secondorder, pi) if (compute_gpu))\n    nrn_pragma_omp(target update to(celsius, secondorder, pi) if (compute_gpu))\n    {\n        double v = 
corenrn_param.voltage;\n        double dt = corenrn_param.dt;\n        double delay = corenrn_param.mindelay;\n        double tstop = corenrn_param.tstop;\n\n        if (tstop < t && nrnmpi_myid == 0) {\n            printf(\"Error: Stop time (%lf) < Start time (%lf), restoring from checkpoint? \\n\",\n                   tstop,\n                   t);\n            abort();\n        }\n\n        // TODO : if some ranks are empty then restore will go in deadlock\n        // phase (as some ranks won't have restored anything and hence return\n        // false in checkpoint_initialize\n        if (!corenrn_embedded && !checkPoints.initialize()) {\n            nrn_finitialize(v != 1000., v);\n        }\n\n        if (!corenrn_param.is_quiet()) {\n            report_mem_usage(\"After nrn_finitialize\");\n        }\n\n        // register all reports into reportinglib\n        double min_report_dt = INT_MAX;\n        for (size_t i = 0; i < configs.size(); i++) {\n            std::unique_ptr<ReportHandler> report_handler = create_report_handler(configs[i],\n                                                                                  spikes_info);\n            if (report_handler) {\n                report_handler->create_report(configs[i], dt, tstop, delay);\n                report_handlers.push_back(std::move(report_handler));\n            }\n            if (configs[i].report_dt < min_report_dt) {\n                min_report_dt = configs[i].report_dt;\n            }\n        }\n        // Set the buffer size if is not the default value. 
Otherwise use report.conf on\n        // register_report\n        if (corenrn_param.report_buff_size != corenrn_param.report_buff_size_default) {\n            set_report_buffer_size(corenrn_param.report_buff_size);\n        }\n\n        if (!configs.empty()) {\n            setup_report_engine(min_report_dt, delay);\n            configs.clear();\n        }\n\n        // call prcellstate for prcellgid\n        call_prcellstate_for_prcellgid(corenrn_param.prcellgid, compute_gpu, 0);\n\n        // handle forwardskip\n        if (corenrn_param.forwardskip > 0.0) {\n            Instrumentor::phase p(\"handle-forward-skip\");\n            handle_forward_skip(corenrn_param.forwardskip, corenrn_param.prcellgid);\n        }\n\n        /// Solver execution\n        Instrumentor::start_profile();\n        Instrumentor::phase_begin(\"simulation\");\n        BBS_netpar_solve(corenrn_param.tstop);\n        Instrumentor::phase_end(\"simulation\");\n        Instrumentor::stop_profile();\n\n        // update cpu copy of NrnThread from GPU\n        update_nrnthreads_on_host(nrn_threads, nrn_nthread);\n\n        // direct mode and full trajectory gathering on CoreNEURON, send back.\n        if (corenrn_embedded) {\n            trajectory_return();\n        }\n\n        // Report global cell statistics\n        if (!corenrn_param.is_quiet()) {\n            report_cell_stats();\n        }\n\n        // prcellstate after end of solver\n        call_prcellstate_for_prcellgid(corenrn_param.prcellgid, compute_gpu, 0);\n    }\n\n    // write spike information to outpath\n    {\n        Instrumentor::phase p(\"output-spike\");\n        output_spikes(output_dir.c_str(), spikes_info);\n    }\n\n    // copy weights back to NEURON NetCon\n    if (nrn2core_all_weights_return_) {\n        // first update weights from gpu\n        update_weights_from_gpu(nrn_threads, nrn_nthread);\n\n        // store weight pointers\n        std::vector<double*> weights(nrn_nthread, nullptr);\n\n        // could be 
one thread more (empty) than in NEURON but does not matter\n        for (int i = 0; i < nrn_nthread; ++i) {\n            weights[i] = nrn_threads[i].weights;\n        }\n        (*nrn2core_all_weights_return_)(weights);\n    }\n\n    core2nrn_data_return();\n\n    {\n        Instrumentor::phase p(\"checkpoint\");\n        checkPoints.write_checkpoint(nrn_threads, nrn_nthread);\n    }\n\n    // must be done after checkpoint (to avoid deleting events)\n    if (reports_needs_finalize) {\n        finalize_report();\n    }\n\n    // cleanup threads on GPU\n    if (corenrn_param.gpu) {\n        delete_nrnthreads_on_device(nrn_threads, nrn_nthread);\n        if (nrn_have_gaps) {\n            nrn_partrans::delete_gap_indices_from_device();\n        }\n        nrnran123_destroy_global_state_on_device();\n        cnrn_target_delete(&secondorder);\n        cnrn_target_delete(&pi);\n        cnrn_target_delete(&celsius);\n    }\n\n    // Cleaning the memory\n    nrn_cleanup();\n\n    // tau needs to resume profile\n    Instrumentor::start_profile();\n\n// mpi finalize\n#if NRNMPI\n    if (corenrn_param.mpi_enable && !corenrn_param.skip_mpi_finalize) {\n        nrnmpi_finalize();\n    }\n#endif\n\n    Instrumentor::phase_end(\"main\");\n\n    return 0;\n}\n"
  },
  {
    "path": "coreneuron/config/config.cpp.in",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"coreneuron/config/config.h\"\n\n/// Git version of the project\nconst std::string coreneuron::version::GIT_REVISION = \"@CN_GIT_REVISION@\";\n\n/// CoreNEURON version\nconst std::string coreneuron::version::CORENEURON_VERSION = \"@CN_PROJECT_VERSION@\";\n"
  },
  {
    "path": "coreneuron/config/config.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n/**\n * \\dir\n * \\brief Global project configurations\n *\n * \\file\n * \\brief Version information\n */\n\n#include <string>\n\nnamespace coreneuron {\n\n/**\n * \\brief Project version information\n */\nstruct version {\n    /// git revision id\n    static const std::string GIT_REVISION;\n\n    /// project tagged version in the cmake\n    static const std::string CORENEURON_VERSION;\n\n    /// return version string (version + git id) as a string\n    static std::string to_string() {\n        return CORENEURON_VERSION + \" \" + GIT_REVISION;\n    }\n};\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/config/neuron_version.hpp.in",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n#pragma once\n\n// This is the CoreNEURON analogue of nrnsemanticversion.h in NEURON. Hopefully\n// the duplication can go away soon.\n#define NRN_VERSION_MAJOR @NRN_VERSION_MAJOR@\n#define NRN_VERSION_MINOR @NRN_VERSION_MINOR@\n#define NRN_VERSION_PATCH @NRN_VERSION_PATCH@\n"
  },
  {
    "path": "coreneuron/config/version_macros.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n#pragma once\n\n// This is the CoreNEURON analogue of nrnversionmacros.h in NEURON. Hopefully\n// the duplication can go away soon.\n#include \"coreneuron/config/neuron_version.hpp\"\n#define NRN_VERSION_INT(maj, min, pat)  (10000 * maj + 100 * min + pat)\n#define NRN_VERSION                     NRN_VERSION_INT(NRN_VERSION_MAJOR, NRN_VERSION_MINOR, NRN_VERSION_PATCH)\n#define NRN_VERSION_EQ(maj, min, pat)   (NRN_VERSION == NRN_VERSION_INT(maj, min, pat))\n#define NRN_VERSION_NE(maj, min, pat)   (NRN_VERSION != NRN_VERSION_INT(maj, min, pat))\n#define NRN_VERSION_GT(maj, min, pat)   (NRN_VERSION > NRN_VERSION_INT(maj, min, pat))\n#define NRN_VERSION_LT(maj, min, pat)   (NRN_VERSION < NRN_VERSION_INT(maj, min, pat))\n#define NRN_VERSION_GTEQ(maj, min, pat) (NRN_VERSION >= NRN_VERSION_INT(maj, min, pat))\n#define NRN_VERSION_LTEQ(maj, min, pat) (NRN_VERSION <= NRN_VERSION_INT(maj, min, pat))\n\n// 8.2.0 is significant because all versions >=8.2.0 should contain definitions\n// of these macros, and doing #ifndef NRN_VERSION_GTEQ_8_2_0 is a more\n// descriptive way of writing #if defined(NRN_VERSION_GTEQ). Testing for 8.2.0\n// is likely to be a common pattern when adapting MOD file VERBATIM blocks for\n// C++ compatibility.\n#if NRN_VERSION_GTEQ(8, 2, 0)\n#define NRN_VERSION_GTEQ_8_2_0\n#endif\n"
  },
  {
    "path": "coreneuron/coreneuron.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#pragma once\n\n/***\n * Includes all headers required to communicate and run all methods\n * described in CoreNEURON, neurox, and mod2c C-generated mechanisms\n * functions.\n **/\n\n\n#include <cstdio>\n#include <cstdlib>\n#include <cmath>\n#include <string.h>\n#include <vector>\n#include <array>\n\n#include \"coreneuron/utils/randoms/nrnran123.h\"     //Random Number Generator\n#include \"coreneuron/sim/scopmath/newton_struct.h\"  //Newton Struct\n#include \"coreneuron/membrane_definitions.h\"        //static definitions\n#include \"coreneuron/mechanism/mechanism.hpp\"       //Memb_list and mechs info\n\n#include \"coreneuron/utils/memory.h\"  //Memory alignments and padding\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/mechanism/mech_mapping.hpp\"\n\nnamespace coreneuron {\n\n// from nrnoc/capac.c\nextern void nrn_init_capacitance(NrnThread*, Memb_list*, int);\nextern void nrn_cur_capacitance(NrnThread* _nt, Memb_list* ml, int type);\nextern void nrn_alloc_capacitance(double* data, Datum* pdata, int type);\n\n// from nrnoc/eion.c\nextern void nrn_init_ion(NrnThread*, Memb_list*, int);\nextern void nrn_cur_ion(NrnThread* _nt, Memb_list* ml, int type);\nextern void nrn_alloc_ion(double* data, Datum* pdata, int type);\nextern void second_order_cur(NrnThread* _nt, int secondorder);\n\nusing DependencyTable = std::vector<std::vector<int>>;\n\n/**\n * A class representing the CoreNEURON state, holding pointers to the various data structures\n *\n * The pointers to \"global\" data such as the NrnThread, Memb_list and Memb_func data structures\n * are managed here. 
They logically share their lifetime and runtime scope with instances of\n * this class.\n */\nclass CoreNeuron {\n    /**\n     * map if mech is a point process\n     * In the future only a field of Mechanism class\n     */\n    std::vector<char> pnt_map; /* so prop_free can know it's a point mech */\n\n    /** Vector mapping the types (IDs) of different mechanisms of mod files between NEURON and\n     * CoreNEURON\n     */\n    std::vector<int> different_mechanism_type;\n\n    /**\n     * dependency helper filled by calls to hoc_register_dparam_semantics\n     * used when nrn_mech_depend is called\n     * vector-of-vector DS. First idx is the mech, second idx is the dependent mech.\n     */\n    DependencyTable ion_write_dependency;\n\n    std::vector<Memb_func> memb_funcs;\n\n    /**\n     * Net send / Net receive\n     * only used in CoreNEURON for book keeping synapse mechs, should go into CoreNEURON class\n     */\n    std::vector<std::pair<NetBufReceive_t, int>> net_buf_receive;\n    std::vector<int> net_buf_send_type;\n\n    /**\n     * before-after-blocks from nmodl are registered here as function pointers\n     */\n    std::array<BAMech*, BEFORE_AFTER_SIZE> bamech;\n\n    /**\n     * Internal lookup tables. Number of float and int variables in each mechanism and memory layout\n     * future --> mech class\n     */\n    std::vector<int> nrn_prop_param_size;\n    std::vector<int> nrn_prop_dparam_size;\n    std::vector<int> nrn_mech_data_layout; /* 1 AoS (default), 0 SoA */\n    /* array is parallel to memb_func. All are 0 except 1 for ARTIFICIAL_CELL */\n    std::vector<short> nrn_artcell_qindex;\n    std::vector<bool> nrn_is_artificial;\n\n    /**\n     * Net Receive function pointer lookup tables\n     */\n    std::vector<pnt_receive_t> pnt_receive; /* for synaptic events. 
*/\n    std::vector<pnt_receive_t> pnt_receive_init;\n    std::vector<short> pnt_receive_size;\n\n    /**\n     * Holds function pointers for WATCH callback\n     */\n    std::vector<nrn_watch_check_t> nrn_watch_check;\n\n    /**\n     * values are type numbers of mechanisms which do net_send call\n     * related to NMODL net_event()\n     *\n     */\n    std::vector<int> nrn_has_net_event;\n\n    /**\n     * inverse of nrn_has_net_event: maps the values of nrn_has_net_event to the index of\n     * pnttype2presyn\n     */\n    std::vector<int> pnttype2presyn;\n\n\n    std::vector<bbcore_read_t> nrn_bbcore_read;\n    std::vector<bbcore_write_t> nrn_bbcore_write;\n\n  public:\n    auto& get_memb_funcs() {\n        return memb_funcs;\n    }\n\n    auto& get_memb_func(size_t idx) {\n        return memb_funcs[idx];\n    }\n\n    auto& get_different_mechanism_type() {\n        return different_mechanism_type;\n    }\n\n    auto& get_pnt_map() {\n        return pnt_map;\n    }\n\n    auto& get_ion_write_dependency() {\n        return ion_write_dependency;\n    }\n\n    auto& get_net_buf_receive() {\n        return net_buf_receive;\n    }\n\n    auto& get_net_buf_send_type() {\n        return net_buf_send_type;\n    }\n\n    auto& get_bamech() {\n        return bamech;\n    }\n\n    auto& get_prop_param_size() {\n        return nrn_prop_param_size;\n    }\n\n    auto& get_prop_dparam_size() {\n        return nrn_prop_dparam_size;\n    }\n\n    auto& get_mech_data_layout() {\n        return nrn_mech_data_layout;\n    }\n\n    auto& get_is_artificial() {\n        return nrn_is_artificial;\n    }\n\n    auto& get_artcell_qindex() {\n        return nrn_artcell_qindex;\n    }\n\n    auto& get_pnt_receive() {\n        return pnt_receive;\n    }\n\n    auto& get_pnt_receive_init() {\n        return pnt_receive_init;\n    }\n\n    auto& get_pnt_receive_size() {\n        return pnt_receive_size;\n    }\n\n    auto& get_watch_check() {\n        return nrn_watch_check;\n    }\n\n    
auto& get_has_net_event() {\n        return nrn_has_net_event;\n    }\n\n    auto& get_pnttype2presyn() {\n        return pnttype2presyn;\n    }\n\n    auto& get_bbcore_read() {\n        return nrn_bbcore_read;\n    }\n\n    auto& get_bbcore_write() {\n        return nrn_bbcore_write;\n    }\n};\n\nextern CoreNeuron corenrn;\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/engine.h.in",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n// Use MAJOR.MINOR for public version\n#define CORENEURON_VERSION @CORENEURON_VERSION_COMBINED@\n\n#ifdef __cplusplus\nextern \"C\" {\n#endif\n\n/// All-in-one initialization of mechanisms and solver\nextern int solve_core(int argc, char** argv);\n\n/// Initialize mechanisms\nextern void mk_mech_init(int argc, char** argv);\n/// Run core solver\nextern int run_solve_core(int argc, char** argv);\n\n#ifdef __cplusplus\n}\n#endif\n"
  },
  {
    "path": "coreneuron/gpu/nrn_acc_manager.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <queue>\n#include <utility>\n\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n#include \"coreneuron/gpu/nrn_acc_manager.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/network/netcon.hpp\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/utils/vrecitem.h\"\n#include \"coreneuron/utils/profile/profiler_interface.h\"\n#include \"coreneuron/permute/cellorder.hpp\"\n#include \"coreneuron/permute/data_layout.hpp\"\n#include \"coreneuron/sim/scopmath/newton_struct.h\"\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n#include \"coreneuron/mpi/nrnmpidec.h\"\n#include \"coreneuron/utils/utils.hpp\"\n\n#ifdef CRAYPAT\n#include <pat_api.h>\n#endif\n\n#if defined(CORENEURON_ENABLE_GPU) && defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && defined(_OPENMP)\n#include <cuda_runtime_api.h>\n#endif\n\n#if __has_include(<cxxabi.h>)\n#define USE_CXXABI\n#include <cxxabi.h>\n#include <memory>\n#include <string>\n#endif\n\n#ifdef CORENEURON_ENABLE_PRESENT_TABLE\n#include <cassert>\n#include <cstddef>\n#include <iostream>\n#include <map>\n#include <shared_mutex>\nnamespace {\nstruct present_table_value {\n    std::size_t ref_count{}, size{};\n    std::byte* dev_ptr{};\n};\nstd::map<std::byte const*, present_table_value> present_table;\nstd::shared_mutex present_table_mutex;\n}  // namespace\n#endif\n\nnamespace {\n/** @brief Try to demangle a type name, return the mangled name on failure.\n */\nstd::string cxx_demangle(const char* mangled) {\n#ifdef USE_CXXABI\n    int status{};\n    // Note that the third argument to abi::__cxa_demangle returns the length of\n    // the allocated buffer, which may be larger than 
strlen(demangled) + 1.\n    std::unique_ptr<char, decltype(free)*> demangled{\n        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), free};\n    return status ? mangled : demangled.get();\n#else\n    return mangled;\n#endif\n}\nbool cnrn_target_debug_output_enabled() {\n    const char* env = std::getenv(\"CORENEURON_GPU_DEBUG\");\n    if (!env) {\n        return false;\n    }\n    std::string env_s{env};\n    if (env_s == \"1\") {\n        return true;\n    } else if (env_s == \"0\") {\n        return false;\n    } else {\n        throw std::runtime_error(\"CORENEURON_GPU_DEBUG must be set to 0 or 1 (got \" + env_s + \")\");\n    }\n}\nbool cnrn_target_enable_debug{cnrn_target_debug_output_enabled()};\n}  // namespace\n\nnamespace coreneuron {\nextern InterleaveInfo* interleave_info;\nvoid nrn_ion_global_map_copyto_device();\nvoid nrn_ion_global_map_delete_from_device();\nvoid nrn_VecPlay_copyto_device(NrnThread* nt, void** d_vecplay);\nvoid nrn_VecPlay_delete_from_device(NrnThread* nt);\n\nvoid cnrn_target_copyin_debug(std::string_view file,\n                              int line,\n                              std::size_t sizeof_T,\n                              std::type_info const& typeid_T,\n                              void const* h_ptr,\n                              std::size_t len,\n                              void* d_ptr) {\n    if (!cnrn_target_enable_debug) {\n        return;\n    }\n    std::cerr << file << ':' << line << \": cnrn_target_copyin<\" << cxx_demangle(typeid_T.name())\n              << \">(\" << h_ptr << \", \" << len << \" * \" << sizeof_T << \" = \" << len * sizeof_T\n              << \") -> \" << d_ptr << std::endl;\n}\nvoid cnrn_target_delete_debug(std::string_view file,\n                              int line,\n                              std::size_t sizeof_T,\n                              std::type_info const& typeid_T,\n                              void const* h_ptr,\n                              std::size_t len) 
{\n    if (!cnrn_target_enable_debug) {\n        return;\n    }\n    std::cerr << file << ':' << line << \": cnrn_target_delete<\" << cxx_demangle(typeid_T.name())\n              << \">(\" << h_ptr << \", \" << len << \" * \" << sizeof_T << \" = \" << len * sizeof_T << ')'\n              << std::endl;\n}\nvoid cnrn_target_deviceptr_debug(std::string_view file,\n                                 int line,\n                                 std::type_info const& typeid_T,\n                                 void const* h_ptr,\n                                 void* d_ptr) {\n    if (!cnrn_target_enable_debug) {\n        return;\n    }\n    std::cerr << file << ':' << line << \": cnrn_target_deviceptr<\" << cxx_demangle(typeid_T.name())\n              << \">(\" << h_ptr << \") -> \" << d_ptr << std::endl;\n}\nvoid cnrn_target_is_present_debug(std::string_view file,\n                                  int line,\n                                  std::type_info const& typeid_T,\n                                  void const* h_ptr,\n                                  void* d_ptr) {\n    if (!cnrn_target_enable_debug) {\n        return;\n    }\n    std::cerr << file << ':' << line << \": cnrn_target_is_present<\" << cxx_demangle(typeid_T.name())\n              << \">(\" << h_ptr << \") -> \" << d_ptr << std::endl;\n}\nvoid cnrn_target_memcpy_to_device_debug(std::string_view file,\n                                        int line,\n                                        std::size_t sizeof_T,\n                                        std::type_info const& typeid_T,\n                                        void const* h_ptr,\n                                        std::size_t len,\n                                        void* d_ptr) {\n    if (!cnrn_target_enable_debug) {\n        return;\n    }\n    std::cerr << file << ':' << line << \": cnrn_target_memcpy_to_device<\"\n              << cxx_demangle(typeid_T.name()) << \">(\" << d_ptr << \", \" << h_ptr << \", \" << len\n      
        << \" * \" << sizeof_T << \" = \" << len * sizeof_T << ')' << std::endl;\n}\n\n#ifdef CORENEURON_ENABLE_PRESENT_TABLE\nstd::pair<void*, bool> cnrn_target_deviceptr_impl(bool must_be_present_or_null, void const* h_ptr) {\n    if (!h_ptr) {\n        return {nullptr, false};\n    }\n    // Concurrent calls to this method are safe, but they must be serialised\n    // w.r.t. calls to the cnrn_target_*_update_present_table methods.\n    std::shared_lock _{present_table_mutex};\n    if (present_table.empty()) {\n        return {nullptr, must_be_present_or_null};\n    }\n    // upper_bound gives the first entry whose key is greater than h_ptr (or end() if there is\n    // none); its predecessor, if any, is the last entry whose key is less than or equal to h_ptr\n    auto const upper = std::upper_bound(\n        present_table.begin(), present_table.end(), h_ptr, [](void const* hp, auto const& entry) {\n            return hp < entry.first;\n        });\n    if (upper == present_table.begin()) {\n        // every block starts after h_ptr; calling std::prev here would be undefined behaviour\n        return {nullptr, must_be_present_or_null};\n    }\n    auto const iter = std::prev(upper);\n    std::byte const* const h_byte_ptr{static_cast<std::byte const*>(h_ptr)};\n    std::byte const* const h_start_of_block{iter->first};\n    std::size_t const block_size{iter->second.size};\n    std::byte* const d_start_of_block{iter->second.dev_ptr};\n    bool const is_present{h_byte_ptr < h_start_of_block + block_size};\n    if (!is_present) {\n        return {nullptr, must_be_present_or_null};\n    }\n    return {d_start_of_block + (h_byte_ptr - h_start_of_block), false};\n}\n\nvoid cnrn_target_copyin_update_present_table(void const* h_ptr, void* d_ptr, std::size_t len) {\n    if (!h_ptr) {\n        assert(!d_ptr);\n        return;\n    }\n    std::lock_guard _{present_table_mutex};\n    // TODO include more pedantic overlap checking?\n    present_table_value new_val{};\n    new_val.size = len;\n    new_val.ref_count = 1;\n    new_val.dev_ptr = static_cast<std::byte*>(d_ptr);\n    auto const [iter, inserted] = present_table.emplace(static_cast<std::byte const*>(h_ptr),\n  
                                                      std::move(new_val));\n    if (!inserted) {\n        // Insertion didn't occur because h_ptr was already in the present table\n        assert(iter->second.size == len);\n        assert(iter->second.dev_ptr == new_val.dev_ptr);\n        ++(iter->second.ref_count);\n    }\n}\nvoid cnrn_target_delete_update_present_table(void const* h_ptr, std::size_t len) {\n    if (!h_ptr) {\n        return;\n    }\n    std::lock_guard _{present_table_mutex};\n    auto const iter = present_table.find(static_cast<std::byte const*>(h_ptr));\n    assert(iter != present_table.end());\n    assert(iter->second.size == len);\n    --(iter->second.ref_count);\n    if (iter->second.ref_count == 0) {\n        present_table.erase(iter);\n    }\n}\n#endif\n\nint cnrn_target_get_num_devices() {\n#if defined(CORENEURON_ENABLE_GPU) && !defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENACC)\n    // choose nvidia GPU by default\n    acc_device_t device_type = acc_device_nvidia;\n    // check how many gpu devices available per node\n    return acc_get_num_devices(device_type);\n#elif defined(CORENEURON_ENABLE_GPU) && defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENMP)\n    return omp_get_num_devices();\n#else\n    throw std::runtime_error(\n        \"cnrn_target_get_num_devices() not implemented without OpenACC/OpenMP and gpu build\");\n#endif\n}\n\nvoid cnrn_target_set_default_device(int device_num) {\n#if defined(CORENEURON_ENABLE_GPU) && !defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENACC)\n    acc_set_device_num(device_num, acc_device_nvidia);\n#elif defined(CORENEURON_ENABLE_GPU) && defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENMP)\n    omp_set_default_device(device_num);\n    // It seems that with NVHPC 21.9 then only setting the default OpenMP device\n    // is not enough: there were errors on some nodes when not-the-0th GPU was\n    // used. 
These seemed to be related to the NMODL instance structs, which are\n    // allocated using cudaMallocManaged.\n    auto const cuda_code = cudaSetDevice(device_num);\n    assert(cuda_code == cudaSuccess);\n#else\n    throw std::runtime_error(\n        \"cnrn_target_set_default_device() not implemented without OpenACC/OpenMP and gpu build\");\n#endif\n}\n\n#ifdef CORENEURON_ENABLE_GPU\n#ifndef CORENEURON_UNIFIED_MEMORY\nstatic Memb_list* copy_ml_to_device(const Memb_list* ml, int type) {\n    // As we never run code for artificial cell inside GPU we don't copy it.\n    int is_art = corenrn.get_is_artificial()[type];\n    if (is_art) {\n        return nullptr;\n    }\n\n    auto d_ml = cnrn_target_copyin(ml);\n\n    if (ml->global_variables) {\n        assert(ml->global_variables_size);\n        void* d_inst = cnrn_target_copyin(static_cast<std::byte*>(ml->global_variables),\n                                          ml->global_variables_size);\n        cnrn_target_memcpy_to_device(&(d_ml->global_variables), &d_inst);\n    }\n\n\n    int n = ml->nodecount;\n    int szp = corenrn.get_prop_param_size()[type];\n    int szdp = corenrn.get_prop_dparam_size()[type];\n\n    double* dptr = cnrn_target_deviceptr(ml->data);\n    cnrn_target_memcpy_to_device(&(d_ml->data), &(dptr));\n\n\n    int* d_nodeindices = cnrn_target_copyin(ml->nodeindices, n);\n    cnrn_target_memcpy_to_device(&(d_ml->nodeindices), &d_nodeindices);\n\n    if (szdp) {\n        int pcnt = nrn_soa_padded_size(n, SOA_LAYOUT) * szdp;\n        int* d_pdata = cnrn_target_copyin(ml->pdata, pcnt);\n        cnrn_target_memcpy_to_device(&(d_ml->pdata), &d_pdata);\n    }\n\n    int ts = corenrn.get_memb_funcs()[type].thread_size_;\n    if (ts) {\n        ThreadDatum* td = cnrn_target_copyin(ml->_thread, ts);\n        cnrn_target_memcpy_to_device(&(d_ml->_thread), &td);\n    }\n\n    // net_receive buffer associated with mechanism\n    NetReceiveBuffer_t* nrb = ml->_net_receive_buffer;\n\n    // if net receive 
buffer exist for mechanism\n    if (nrb) {\n        NetReceiveBuffer_t* d_nrb = cnrn_target_copyin(nrb);\n        cnrn_target_memcpy_to_device(&(d_ml->_net_receive_buffer), &d_nrb);\n\n        int* d_pnt_index = cnrn_target_copyin(nrb->_pnt_index, nrb->_size);\n        cnrn_target_memcpy_to_device(&(d_nrb->_pnt_index), &d_pnt_index);\n\n        int* d_weight_index = cnrn_target_copyin(nrb->_weight_index, nrb->_size);\n        cnrn_target_memcpy_to_device(&(d_nrb->_weight_index), &d_weight_index);\n\n        double* d_nrb_t = cnrn_target_copyin(nrb->_nrb_t, nrb->_size);\n        cnrn_target_memcpy_to_device(&(d_nrb->_nrb_t), &d_nrb_t);\n\n        double* d_nrb_flag = cnrn_target_copyin(nrb->_nrb_flag, nrb->_size);\n        cnrn_target_memcpy_to_device(&(d_nrb->_nrb_flag), &d_nrb_flag);\n\n        int* d_displ = cnrn_target_copyin(nrb->_displ, nrb->_size + 1);\n        cnrn_target_memcpy_to_device(&(d_nrb->_displ), &d_displ);\n\n        int* d_nrb_index = cnrn_target_copyin(nrb->_nrb_index, nrb->_size);\n        cnrn_target_memcpy_to_device(&(d_nrb->_nrb_index), &d_nrb_index);\n    }\n\n    /* copy NetSendBuffer_t on to GPU */\n    NetSendBuffer_t* nsb = ml->_net_send_buffer;\n\n    if (nsb) {\n        NetSendBuffer_t* d_nsb;\n        int* d_iptr;\n        double* d_dptr;\n\n        d_nsb = cnrn_target_copyin(nsb);\n        cnrn_target_memcpy_to_device(&(d_ml->_net_send_buffer), &d_nsb);\n\n        d_iptr = cnrn_target_copyin(nsb->_sendtype, nsb->_size);\n        cnrn_target_memcpy_to_device(&(d_nsb->_sendtype), &d_iptr);\n\n        d_iptr = cnrn_target_copyin(nsb->_vdata_index, nsb->_size);\n        cnrn_target_memcpy_to_device(&(d_nsb->_vdata_index), &d_iptr);\n\n        d_iptr = cnrn_target_copyin(nsb->_pnt_index, nsb->_size);\n        cnrn_target_memcpy_to_device(&(d_nsb->_pnt_index), &d_iptr);\n\n        d_iptr = cnrn_target_copyin(nsb->_weight_index, nsb->_size);\n        cnrn_target_memcpy_to_device(&(d_nsb->_weight_index), &d_iptr);\n\n        d_dptr = 
cnrn_target_copyin(nsb->_nsb_t, nsb->_size);\n        cnrn_target_memcpy_to_device(&(d_nsb->_nsb_t), &d_dptr);\n\n        d_dptr = cnrn_target_copyin(nsb->_nsb_flag, nsb->_size);\n        cnrn_target_memcpy_to_device(&(d_nsb->_nsb_flag), &d_dptr);\n    }\n\n    return d_ml;\n}\n#endif\n\nstatic void update_ml_on_host(const Memb_list* ml, int type) {\n    int is_art = corenrn.get_is_artificial()[type];\n    if (is_art) {\n        // Artificial mechanisms such as PatternStim and IntervalFire\n        // are not copied onto the GPU. They should not, therefore, be\n        // updated from the GPU.\n        return;\n    }\n\n    int n = ml->nodecount;\n    int szp = corenrn.get_prop_param_size()[type];\n    int szdp = corenrn.get_prop_dparam_size()[type];\n\n    int pcnt = nrn_soa_padded_size(n, SOA_LAYOUT) * szp;\n\n    nrn_pragma_acc(update self(ml->data[:pcnt], ml->nodeindices[:n]))\n    nrn_pragma_omp(target update from(ml->data[:pcnt], ml->nodeindices[:n]))\n\n    int dpcnt = nrn_soa_padded_size(n, SOA_LAYOUT) * szdp;\n    nrn_pragma_acc(update self(ml->pdata[:dpcnt]) if (szdp))\n    nrn_pragma_omp(target update from(ml->pdata[:dpcnt]) if (szdp))\n\n    auto nrb = ml->_net_receive_buffer;\n\n    // clang-format off\n    nrn_pragma_acc(update self(nrb->_cnt,\n                               nrb->_size,\n                               nrb->_pnt_offset,\n                               nrb->_displ_cnt,\n                               nrb->_pnt_index[:nrb->_size],\n                               nrb->_weight_index[:nrb->_size],\n                               nrb->_displ[:nrb->_size + 1],\n                               nrb->_nrb_index[:nrb->_size])\n                          if (nrb != nullptr))\n    nrn_pragma_omp(target update from(nrb->_cnt,\n                                      nrb->_size,\n                                      nrb->_pnt_offset,\n                                      nrb->_displ_cnt,\n                                      
nrb->_pnt_index[:nrb->_size],\n                                      nrb->_weight_index[:nrb->_size],\n                                      nrb->_displ[:nrb->_size + 1],\n                                      nrb->_nrb_index[:nrb->_size])\n                                 if (nrb != nullptr))\n    // clang-format on\n}\n\nstatic void delete_ml_from_device(Memb_list* ml, int type) {\n    int is_art = corenrn.get_is_artificial()[type];\n    if (is_art) {\n        return;\n    }\n    // Cleanup the net send buffer if it exists\n    {\n        NetSendBuffer_t* nsb{ml->_net_send_buffer};\n        if (nsb) {\n            cnrn_target_delete(nsb->_nsb_flag, nsb->_size);\n            cnrn_target_delete(nsb->_nsb_t, nsb->_size);\n            cnrn_target_delete(nsb->_weight_index, nsb->_size);\n            cnrn_target_delete(nsb->_pnt_index, nsb->_size);\n            cnrn_target_delete(nsb->_vdata_index, nsb->_size);\n            cnrn_target_delete(nsb->_sendtype, nsb->_size);\n            cnrn_target_delete(nsb);\n        }\n    }\n    // Cleanup the net receive buffer if it exists.\n    {\n        NetReceiveBuffer_t* nrb{ml->_net_receive_buffer};\n        if (nrb) {\n            cnrn_target_delete(nrb->_nrb_index, nrb->_size);\n            cnrn_target_delete(nrb->_displ, nrb->_size + 1);\n            cnrn_target_delete(nrb->_nrb_flag, nrb->_size);\n            cnrn_target_delete(nrb->_nrb_t, nrb->_size);\n            cnrn_target_delete(nrb->_weight_index, nrb->_size);\n            cnrn_target_delete(nrb->_pnt_index, nrb->_size);\n            cnrn_target_delete(nrb);\n        }\n    }\n    int n = ml->nodecount;\n    int szdp = corenrn.get_prop_dparam_size()[type];\n    int ts = corenrn.get_memb_funcs()[type].thread_size_;\n    if (ts) {\n        cnrn_target_delete(ml->_thread, ts);\n    }\n    if (szdp) {\n        int pcnt = nrn_soa_padded_size(n, SOA_LAYOUT) * szdp;\n        cnrn_target_delete(ml->pdata, pcnt);\n    }\n    cnrn_target_delete(ml->nodeindices, n);\n\n    if 
(ml->global_variables) {\n        assert(ml->global_variables_size);\n        cnrn_target_delete(static_cast<std::byte*>(ml->global_variables),\n                           ml->global_variables_size);\n    }\n\n    cnrn_target_delete(ml);\n}\n\n#endif\n\n/* note: threads here are corresponding to global nrn_threads array */\nvoid setup_nrnthreads_on_device(NrnThread* threads, int nthreads) {\n#ifdef CORENEURON_ENABLE_GPU\n    // initialize NrnThreads for gpu execution\n    // empty thread or only artificial cells should be on cpu\n    for (int i = 0; i < nthreads; i++) {\n        NrnThread* nt = threads + i;\n        nt->compute_gpu = (nt->end > 0) ? 1 : 0;\n        nt->_dt = dt;\n    }\n\n    nrn_ion_global_map_copyto_device();\n\n#ifdef CORENEURON_UNIFIED_MEMORY\n    for (int i = 0; i < nthreads; i++) {\n        NrnThread* nt = threads + i;  // NrnThread on host\n\n        if (nt->n_presyn) {\n            PreSyn* d_presyns = cnrn_target_copyin(nt->presyns, nt->n_presyn);\n        }\n\n        if (nt->n_vecplay) {\n            /* copy VecPlayContinuous instances */\n            /** just empty containers */\n            void** d_vecplay = cnrn_target_copyin(nt->_vecplay, nt->n_vecplay);\n            // note: we are using unified memory for NrnThread. Once VecPlay is copied to gpu,\n            // we dont want to update nt->vecplay because it will also set gpu pointer of vecplay\n            // inside nt on cpu (due to unified memory).\n\n            nrn_VecPlay_copyto_device(nt, d_vecplay);\n        }\n\n        if (!nt->_permute && nt->end > 0) {\n            printf(\"\\n WARNING: NrnThread %d not permuted, error for linear algebra?\", i);\n        }\n    }\n\n#else\n    /* -- copy NrnThread to device. 
this needs to be a contiguous vector because the offset is used to\n     * find the\n     * corresponding NrnThread using Point_process in the NET_RECEIVE block\n     */\n    NrnThread* d_threads = cnrn_target_copyin(threads, nthreads);\n\n    if (interleave_info == nullptr) {\n        printf(\"\\n Warning: No permutation data? Required for linear algebra!\");\n    }\n\n    /* pointers for data struct on device, starting with d_ */\n\n    for (int i = 0; i < nthreads; i++) {\n        NrnThread* nt = threads + i;      // NrnThread on host\n        NrnThread* d_nt = d_threads + i;  // NrnThread on device\n        if (!nt->compute_gpu) {\n            continue;\n        }\n        double* d__data;  // nrn_threads->_data on device\n\n        /* -- copy _data to device -- */\n\n        /* copy all double data for the thread */\n        d__data = cnrn_target_copyin(nt->_data, nt->_ndata);\n\n\n        /* Here is an example of using OpenACC data enter/exit\n         * Remember that we are not allowed to use nt->_data but we have to use:\n         *      double *dtmp = nt->_data;  // now use dtmp!\n                #pragma acc enter data copyin(dtmp[0:nt->_ndata]) async(nt->stream_id)\n                #pragma acc wait(nt->stream_id)\n         */\n\n        /* update d_nt._data to point to the device copy */\n        cnrn_target_memcpy_to_device(&(d_nt->_data), &d__data);\n\n        /* -- setup rhs, d, a, b, v, node_area to point to device copy -- */\n        double* dptr;\n\n        /* for padding, we have to recompute ne */\n        int ne = nrn_soa_padded_size(nt->end, 0);\n\n        dptr = d__data + 0 * ne;\n        cnrn_target_memcpy_to_device(&(d_nt->_actual_rhs), &(dptr));\n\n        dptr = d__data + 1 * ne;\n        cnrn_target_memcpy_to_device(&(d_nt->_actual_d), &(dptr));\n\n        dptr = d__data + 2 * ne;\n        cnrn_target_memcpy_to_device(&(d_nt->_actual_a), &(dptr));\n\n        dptr = d__data + 3 * ne;\n        cnrn_target_memcpy_to_device(&(d_nt->_actual_b), &(dptr));\n\n        
dptr = d__data + 4 * ne;\n        cnrn_target_memcpy_to_device(&(d_nt->_actual_v), &(dptr));\n\n        dptr = d__data + 5 * ne;\n        cnrn_target_memcpy_to_device(&(d_nt->_actual_area), &(dptr));\n\n        if (nt->_actual_diam) {\n            dptr = d__data + 6 * ne;\n            cnrn_target_memcpy_to_device(&(d_nt->_actual_diam), &(dptr));\n        }\n\n        int* d_v_parent_index = cnrn_target_copyin(nt->_v_parent_index, nt->end);\n        cnrn_target_memcpy_to_device(&(d_nt->_v_parent_index), &(d_v_parent_index));\n\n        /* nt._ml_list is used in NET_RECEIVE block and should have valid membrane list id*/\n        Memb_list** d_ml_list = cnrn_target_copyin(nt->_ml_list, corenrn.get_memb_funcs().size());\n        cnrn_target_memcpy_to_device(&(d_nt->_ml_list), &(d_ml_list));\n\n        /* -- copy NrnThreadMembList list ml to device -- */\n\n        NrnThreadMembList* d_last_tml;\n\n        bool first_tml = true;\n\n        for (auto tml = nt->tml; tml; tml = tml->next) {\n            /*copy tml to device*/\n            /*QUESTIONS: does tml will point to nullptr as in host ? : I assume so!*/\n            auto d_tml = cnrn_target_copyin(tml);\n\n            /*first tml is pointed by nt */\n            if (first_tml) {\n                cnrn_target_memcpy_to_device(&(d_nt->tml), &d_tml);\n                first_tml = false;\n            } else {\n                /*rest of tml forms linked list */\n                cnrn_target_memcpy_to_device(&(d_last_tml->next), &d_tml);\n            }\n\n            // book keeping for linked-list\n            d_last_tml = d_tml;\n\n            /* now for every tml, there is a ml. 
copy that and setup pointer */\n            Memb_list* d_ml = copy_ml_to_device(tml->ml, tml->index);\n            cnrn_target_memcpy_to_device(&(d_tml->ml), &d_ml);\n            /* setup nt._ml_list */\n            cnrn_target_memcpy_to_device(&(d_ml_list[tml->index]), &d_ml);\n        }\n\n        if (nt->shadow_rhs_cnt) {\n            double* d_shadow_ptr;\n\n            int pcnt = nrn_soa_padded_size(nt->shadow_rhs_cnt, 0);\n\n            /* copy shadow_rhs to device and fix-up the pointer */\n            d_shadow_ptr = cnrn_target_copyin(nt->_shadow_rhs, pcnt);\n            cnrn_target_memcpy_to_device(&(d_nt->_shadow_rhs), &d_shadow_ptr);\n\n            /* copy shadow_d to device and fix-up the pointer */\n            d_shadow_ptr = cnrn_target_copyin(nt->_shadow_d, pcnt);\n            cnrn_target_memcpy_to_device(&(d_nt->_shadow_d), &d_shadow_ptr);\n        }\n\n        /* Fast membrane current calculation struct */\n        if (nt->nrn_fast_imem) {\n            NrnFastImem* d_fast_imem = cnrn_target_copyin(nt->nrn_fast_imem);\n            cnrn_target_memcpy_to_device(&(d_nt->nrn_fast_imem), &d_fast_imem);\n            {\n                double* d_ptr = cnrn_target_copyin(nt->nrn_fast_imem->nrn_sav_rhs, nt->end);\n                cnrn_target_memcpy_to_device(&(d_fast_imem->nrn_sav_rhs), &d_ptr);\n            }\n            {\n                double* d_ptr = cnrn_target_copyin(nt->nrn_fast_imem->nrn_sav_d, nt->end);\n                cnrn_target_memcpy_to_device(&(d_fast_imem->nrn_sav_d), &d_ptr);\n            }\n        }\n\n        if (nt->n_pntproc) {\n            /* copy Point_processes array and fix the pointer to execute net_receive blocks on GPU\n             */\n            Point_process* pntptr = cnrn_target_copyin(nt->pntprocs, nt->n_pntproc);\n            cnrn_target_memcpy_to_device(&(d_nt->pntprocs), &pntptr);\n        }\n\n        if (nt->n_weight) {\n            /* copy weight vector used in NET_RECEIVE which is pointed by netcon.weight */\n     
       double* d_weights = cnrn_target_copyin(nt->weights, nt->n_weight);\n            cnrn_target_memcpy_to_device(&(d_nt->weights), &d_weights);\n        }\n\n        if (nt->_nvdata) {\n            /* copy vdata which is setup in bbcore_read. This contains cuda allocated\n             * nrnran123_State * */\n            void** d_vdata = cnrn_target_copyin(nt->_vdata, nt->_nvdata);\n            cnrn_target_memcpy_to_device(&(d_nt->_vdata), &d_vdata);\n        }\n\n        if (nt->n_presyn) {\n            /* copy presyn vector used for spike exchange, note we have added new PreSynHelper due\n             * to issue\n             * while updating PreSyn objects which has virtual base class. May be this is issue due\n             * to\n             * VTable and alignment */\n            PreSynHelper* d_presyns_helper = cnrn_target_copyin(nt->presyns_helper, nt->n_presyn);\n            cnrn_target_memcpy_to_device(&(d_nt->presyns_helper), &d_presyns_helper);\n            PreSyn* d_presyns = cnrn_target_copyin(nt->presyns, nt->n_presyn);\n            cnrn_target_memcpy_to_device(&(d_nt->presyns), &d_presyns);\n        }\n\n        if (nt->_net_send_buffer_size) {\n            /* copy send_receive buffer */\n            int* d_net_send_buffer = cnrn_target_copyin(nt->_net_send_buffer,\n                                                        nt->_net_send_buffer_size);\n            cnrn_target_memcpy_to_device(&(d_nt->_net_send_buffer), &d_net_send_buffer);\n        }\n\n        if (nt->n_vecplay) {\n            /* copy VecPlayContinuous instances */\n            /** just empty containers */\n            void** d_vecplay = cnrn_target_copyin(nt->_vecplay, nt->n_vecplay);\n            cnrn_target_memcpy_to_device(&(d_nt->_vecplay), &d_vecplay);\n\n            nrn_VecPlay_copyto_device(nt, d_vecplay);\n        }\n\n        if (nt->_permute) {\n            if (interleave_permute_type == 1) {\n                /* todo: not necessary to setup pointers, just copy it */\n       
         InterleaveInfo* info = interleave_info + i;\n                int* d_ptr = nullptr;\n                InterleaveInfo* d_info = cnrn_target_copyin(info);\n\n                d_ptr = cnrn_target_copyin(info->stride, info->nstride + 1);\n                cnrn_target_memcpy_to_device(&(d_info->stride), &d_ptr);\n\n                d_ptr = cnrn_target_copyin(info->firstnode, nt->ncell);\n                cnrn_target_memcpy_to_device(&(d_info->firstnode), &d_ptr);\n\n                d_ptr = cnrn_target_copyin(info->lastnode, nt->ncell);\n                cnrn_target_memcpy_to_device(&(d_info->lastnode), &d_ptr);\n\n                d_ptr = cnrn_target_copyin(info->cellsize, nt->ncell);\n                cnrn_target_memcpy_to_device(&(d_info->cellsize), &d_ptr);\n\n            } else if (interleave_permute_type == 2) {\n                /* todo: not necessary to setup pointers, just copy it */\n                InterleaveInfo* info = interleave_info + i;\n                InterleaveInfo* d_info = cnrn_target_copyin(info);\n                int* d_ptr = nullptr;\n\n                d_ptr = cnrn_target_copyin(info->stride, info->nstride);\n                cnrn_target_memcpy_to_device(&(d_info->stride), &d_ptr);\n\n                d_ptr = cnrn_target_copyin(info->firstnode, info->nwarp + 1);\n                cnrn_target_memcpy_to_device(&(d_info->firstnode), &d_ptr);\n\n                d_ptr = cnrn_target_copyin(info->lastnode, info->nwarp + 1);\n                cnrn_target_memcpy_to_device(&(d_info->lastnode), &d_ptr);\n\n                d_ptr = cnrn_target_copyin(info->stridedispl, info->nwarp + 1);\n                cnrn_target_memcpy_to_device(&(d_info->stridedispl), &d_ptr);\n\n                d_ptr = cnrn_target_copyin(info->cellsize, info->nwarp);\n                cnrn_target_memcpy_to_device(&(d_info->cellsize), &d_ptr);\n            } else {\n                printf(\"\\n ERROR: only --cell_permute = [12] implemented\");\n                abort();\n            }\n        } 
else {\n            printf(\"\\n WARNING: NrnThread %d not permuted, error for linear algebra?\", i);\n        }\n\n        {\n            TrajectoryRequests* tr = nt->trajec_requests;\n            if (tr) {\n                // Create a device-side copy of the `trajec_requests` struct and\n                // make sure the device-side NrnThread object knows about it.\n                TrajectoryRequests* d_trajec_requests = cnrn_target_copyin(tr);\n                cnrn_target_memcpy_to_device(&(d_nt->trajec_requests), &d_trajec_requests);\n                // Initialise the double** gather member of the struct.\n                double** d_tr_gather = cnrn_target_copyin(tr->gather, tr->n_trajec);\n                cnrn_target_memcpy_to_device(&(d_trajec_requests->gather), &d_tr_gather);\n                // Initialise the double** varrays member of the struct if it's\n                // set.\n                double** d_tr_varrays{nullptr};\n                if (tr->varrays) {\n                    d_tr_varrays = cnrn_target_copyin(tr->varrays, tr->n_trajec);\n                    cnrn_target_memcpy_to_device(&(d_trajec_requests->varrays), &d_tr_varrays);\n                }\n                for (int i = 0; i < tr->n_trajec; ++i) {\n                    if (tr->varrays) {\n                        // tr->varrays[i] is a buffer of tr->bsize doubles on the host,\n                        // make a device-side copy of it and store a pointer to it in\n                        // the device-side version of tr->varrays.\n                        double* d_buf_traj_i = cnrn_target_copyin(tr->varrays[i], tr->bsize);\n                        cnrn_target_memcpy_to_device(&(d_tr_varrays[i]), &d_buf_traj_i);\n                    }\n                    // tr->gather[i] is a double* referring to (host) data in the\n                    // (host) _data block\n                    auto* d_gather_i = cnrn_target_deviceptr(tr->gather[i]);\n                    
cnrn_target_memcpy_to_device(&(d_tr_gather[i]), &d_gather_i);\n                }\n                // TODO: other `double** scatter` and `void** vpr` members of\n                // the TrajectoryRequests struct are not copied to the device.\n                // The `int vsize` member is updated during the simulation but\n                // not kept up to date timestep-by-timestep on the device.\n            }\n        }\n        {\n            auto* d_fornetcon_perm_indices = cnrn_target_copyin(nt->_fornetcon_perm_indices,\n                                                                nt->_fornetcon_perm_indices_size);\n            cnrn_target_memcpy_to_device(&(d_nt->_fornetcon_perm_indices),\n                                         &d_fornetcon_perm_indices);\n        }\n        {\n            auto* d_fornetcon_weight_perm = cnrn_target_copyin(nt->_fornetcon_weight_perm,\n                                                               nt->_fornetcon_weight_perm_size);\n            cnrn_target_memcpy_to_device(&(d_nt->_fornetcon_weight_perm), &d_fornetcon_weight_perm);\n        }\n    }\n\n#endif\n#else\n    (void) threads;\n    (void) nthreads;\n#endif\n}\n\nvoid copy_ivoc_vect_to_device(const IvocVect& from, IvocVect& to) {\n#ifdef CORENEURON_ENABLE_GPU\n    /// by default `to` is the destination pointer on the device\n    IvocVect* d_iv = &to;\n\n    size_t n = from.size();\n    if (n) {\n        double* d_data = cnrn_target_copyin(from.data(), n);\n        cnrn_target_memcpy_to_device(&(d_iv->data_), &d_data);\n    }\n#else\n    (void) from;\n    (void) to;\n#endif\n}\n\nvoid delete_ivoc_vect_from_device(IvocVect& vec) {\n#ifdef CORENEURON_ENABLE_GPU\n    auto const n = vec.size();\n    if (n) {\n        cnrn_target_delete(vec.data(), n);\n    }\n#else\n    static_cast<void>(vec);\n#endif\n}\n\nvoid realloc_net_receive_buffer(NrnThread* nt, Memb_list* ml) {\n    NetReceiveBuffer_t* nrb = ml->_net_receive_buffer;\n    if (!nrb) {\n        return;\n    }\n\n#ifdef 
CORENEURON_ENABLE_GPU\n    if (nt->compute_gpu) {\n        // free existing vectors in buffers on gpu\n        cnrn_target_delete(nrb->_pnt_index, nrb->_size);\n        cnrn_target_delete(nrb->_weight_index, nrb->_size);\n        cnrn_target_delete(nrb->_nrb_t, nrb->_size);\n        cnrn_target_delete(nrb->_nrb_flag, nrb->_size);\n        cnrn_target_delete(nrb->_displ, nrb->_size + 1);\n        cnrn_target_delete(nrb->_nrb_index, nrb->_size);\n    }\n#endif\n    // Reallocate host buffers using ecalloc_align (as in phase2.cpp) and\n    // free_memory (as in nrn_setup.cpp)\n    auto const realloc = [old_size = nrb->_size, nrb](auto*& ptr, std::size_t extra_size = 0) {\n        using T = std::remove_pointer_t<std::remove_reference_t<decltype(ptr)>>;\n        static_assert(std::is_trivial<T>::value,\n                      \"Only trivially constructible and copiable types are supported.\");\n        static_assert(std::is_same<decltype(ptr), T*&>::value,\n                      \"ptr should be reference-to-pointer\");\n        auto* const new_data = static_cast<T*>(ecalloc_align((nrb->_size + extra_size), sizeof(T)));\n        std::memcpy(new_data, ptr, (old_size + extra_size) * sizeof(T));\n        free_memory(ptr);\n        ptr = new_data;\n    };\n    nrb->_size *= 2;\n    realloc(nrb->_pnt_index);\n    realloc(nrb->_weight_index);\n    realloc(nrb->_nrb_t);\n    realloc(nrb->_nrb_flag);\n    realloc(nrb->_displ, 1);\n    realloc(nrb->_nrb_index);\n#ifdef CORENEURON_ENABLE_GPU\n    if (nt->compute_gpu) {\n        // update device copy\n        nrn_pragma_acc(update device(nrb));\n        nrn_pragma_omp(target update to(nrb));\n\n        NetReceiveBuffer_t* const d_nrb{cnrn_target_deviceptr(nrb)};\n        // recopy the vectors in the buffer\n        int* const d_pnt_index{cnrn_target_copyin(nrb->_pnt_index, nrb->_size)};\n        cnrn_target_memcpy_to_device(&(d_nrb->_pnt_index), &d_pnt_index);\n\n        int* const 
d_weight_index{cnrn_target_copyin(nrb->_weight_index, nrb->_size)};\n        cnrn_target_memcpy_to_device(&(d_nrb->_weight_index), &d_weight_index);\n\n        double* const d_nrb_t{cnrn_target_copyin(nrb->_nrb_t, nrb->_size)};\n        cnrn_target_memcpy_to_device(&(d_nrb->_nrb_t), &d_nrb_t);\n\n        double* const d_nrb_flag{cnrn_target_copyin(nrb->_nrb_flag, nrb->_size)};\n        cnrn_target_memcpy_to_device(&(d_nrb->_nrb_flag), &d_nrb_flag);\n\n        int* const d_displ{cnrn_target_copyin(nrb->_displ, nrb->_size + 1)};\n        cnrn_target_memcpy_to_device(&(d_nrb->_displ), &d_displ);\n\n        int* const d_nrb_index{cnrn_target_copyin(nrb->_nrb_index, nrb->_size)};\n        cnrn_target_memcpy_to_device(&(d_nrb->_nrb_index), &d_nrb_index);\n    }\n#endif\n}\n\nusing NRB_P = std::pair<int, int>;\n\nstruct comp {\n    bool operator()(const NRB_P& a, const NRB_P& b) {\n        if (a.first == b.first) {\n            return a.second > b.second;  // same instances in original net_receive order\n        }\n        return a.first > b.first;\n    }\n};\n\nstatic void net_receive_buffer_order(NetReceiveBuffer_t* nrb) {\n    Instrumentor::phase p_net_receive_buffer_order(\"net-receive-buf-order\");\n    if (nrb->_cnt == 0) {\n        nrb->_displ_cnt = 0;\n        return;\n    }\n\n    std::priority_queue<NRB_P, std::vector<NRB_P>, comp> nrbq;\n\n    for (int i = 0; i < nrb->_cnt; ++i) {\n        nrbq.push(NRB_P(nrb->_pnt_index[i], i));\n    }\n\n    int displ_cnt = 0;\n    int index_cnt = 0;\n    int last_instance_index = -1;\n    nrb->_displ[0] = 0;\n\n    while (!nrbq.empty()) {\n        const NRB_P& p = nrbq.top();\n        nrb->_nrb_index[index_cnt++] = p.second;\n        if (p.first != last_instance_index) {\n            ++displ_cnt;\n        }\n        nrb->_displ[displ_cnt] = index_cnt;\n        last_instance_index = p.first;\n        nrbq.pop();\n    }\n    nrb->_displ_cnt = displ_cnt;\n}\n\n/* when we execute NET_RECEIVE block on GPU, we provide the index of 
synapse instances\n * which we need to execute during the current timestep. In order to do this, we have\n * to update the NetReceiveBuffer_t object on the GPU. When the size of the CPU buffer changes,\n * we set reallocated to true and hence need to reallocate the buffer on the GPU and then copy\n * the entire buffer. If reallocated is 0, the buffer size has not changed and hence we\n * only need to copy _size elements to the GPU.\n * Note: this is a very preliminary implementation; optimisations will be done after the first\n * functional version.\n */\nvoid update_net_receive_buffer(NrnThread* nt) {\n    Instrumentor::phase p_update_net_receive_buffer(\"update-net-receive-buf\");\n    for (auto tml = nt->tml; tml; tml = tml->next) {\n        int is_art = corenrn.get_is_artificial()[tml->index];\n        if (is_art) {\n            continue;\n        }\n        // net_receive buffer to copy\n        NetReceiveBuffer_t* nrb = tml->ml->_net_receive_buffer;\n\n        // if a net receive buffer exists for the mechanism\n        if (nrb && nrb->_cnt) {\n            // instance order to avoid race. 
setup _displ and _nrb_index\n            net_receive_buffer_order(nrb);\n\n            if (nt->compute_gpu) {\n                Instrumentor::phase p_net_receive_buffer_order(\"net-receive-buf-cpu2gpu\");\n                // note that dont update nrb otherwise we lose pointers\n\n                // clang-format off\n\n                /* update scalar elements */\n                nrn_pragma_acc(update device(nrb->_cnt,\n                                             nrb->_displ_cnt,\n                                             nrb->_pnt_index[:nrb->_cnt],\n                                             nrb->_weight_index[:nrb->_cnt],\n                                             nrb->_nrb_t[:nrb->_cnt],\n                                             nrb->_nrb_flag[:nrb->_cnt],\n                                             nrb->_displ[:nrb->_displ_cnt + 1],\n                                             nrb->_nrb_index[:nrb->_cnt])\n                                             async(nt->stream_id))\n                nrn_pragma_omp(target update to(nrb->_cnt,\n                                                nrb->_displ_cnt,\n                                                nrb->_pnt_index[:nrb->_cnt],\n                                                nrb->_weight_index[:nrb->_cnt],\n                                                nrb->_nrb_t[:nrb->_cnt],\n                                                nrb->_nrb_flag[:nrb->_cnt],\n                                                nrb->_displ[:nrb->_displ_cnt + 1],\n                                                nrb->_nrb_index[:nrb->_cnt]))\n                // clang-format on\n            }\n        }\n    }\n    nrn_pragma_acc(wait(nt->stream_id))\n}\n\nvoid update_net_send_buffer_on_host(NrnThread* nt, NetSendBuffer_t* nsb) {\n#ifdef CORENEURON_ENABLE_GPU\n    if (!nt->compute_gpu)\n        return;\n\n    // check if nsb->_cnt was exceeded on GPU: as the buffer can not be increased\n    // during gpu execution, we should just 
abort the execution.\n    // \\todo: this needs to be fixed with different memory allocation strategy\n    if (nsb->_cnt > nsb->_size) {\n        printf(\"ERROR: NetSendBuffer exceeded during GPU execution (rank %d)\\n\", nrnmpi_myid);\n        nrn_abort(1);\n    }\n\n    if (nsb->_cnt) {\n        Instrumentor::phase p_net_receive_buffer_order(\"net-send-buf-gpu2cpu\");\n    }\n    // clang-format off\n    nrn_pragma_acc(update self(nsb->_sendtype[:nsb->_cnt],\n                               nsb->_vdata_index[:nsb->_cnt],\n                               nsb->_pnt_index[:nsb->_cnt],\n                               nsb->_weight_index[:nsb->_cnt],\n                               nsb->_nsb_t[:nsb->_cnt],\n                               nsb->_nsb_flag[:nsb->_cnt])\n                          if (nsb->_cnt))\n    nrn_pragma_omp(target update from(nsb->_sendtype[:nsb->_cnt],\n                                      nsb->_vdata_index[:nsb->_cnt],\n                                      nsb->_pnt_index[:nsb->_cnt],\n                                      nsb->_weight_index[:nsb->_cnt],\n                                      nsb->_nsb_t[:nsb->_cnt],\n                                      nsb->_nsb_flag[:nsb->_cnt])\n                                 if (nsb->_cnt))\n    // clang-format on\n#else\n    (void) nt;\n    (void) nsb;\n#endif\n}\n\nvoid update_nrnthreads_on_host(NrnThread* threads, int nthreads) {\n#ifdef CORENEURON_ENABLE_GPU\n\n    for (int i = 0; i < nthreads; i++) {\n        NrnThread* nt = threads + i;\n\n        if (nt->compute_gpu && (nt->end > 0)) {\n            /* -- copy data to host -- */\n\n            int ne = nrn_soa_padded_size(nt->end, 0);\n\n            // clang-format off\n            nrn_pragma_acc(update self(nt->_actual_rhs[:ne],\n                                       nt->_actual_d[:ne],\n                                       nt->_actual_a[:ne],\n                                       nt->_actual_b[:ne],\n                                       
nt->_actual_v[:ne],\n                                       nt->_actual_area[:ne]))\n            nrn_pragma_omp(target update from(nt->_actual_rhs[:ne],\n                                              nt->_actual_d[:ne],\n                                              nt->_actual_a[:ne],\n                                              nt->_actual_b[:ne],\n                                              nt->_actual_v[:ne],\n                                              nt->_actual_area[:ne]))\n            // clang-format on\n\n            nrn_pragma_acc(update self(nt->_actual_diam[:ne]) if (nt->_actual_diam != nullptr))\n            nrn_pragma_omp(\n                target update from(nt->_actual_diam[:ne]) if (nt->_actual_diam != nullptr))\n\n            /* @todo: nt._ml_list[tml->index] = tml->ml; */\n\n            /* -- copy NrnThreadMembList list ml to host -- */\n            for (auto tml = nt->tml; tml; tml = tml->next) {\n                if (!corenrn.get_is_artificial()[tml->index]) {\n                    nrn_pragma_acc(update self(tml->index, tml->ml->nodecount))\n                    nrn_pragma_omp(target update from(tml->index, tml->ml->nodecount))\n                }\n                update_ml_on_host(tml->ml, tml->index);\n            }\n\n            int pcnt = nrn_soa_padded_size(nt->shadow_rhs_cnt, 0);\n            /* copy shadow_rhs to host */\n            /* copy shadow_d to host */\n            nrn_pragma_acc(\n                update self(nt->_shadow_rhs[:pcnt], nt->_shadow_d[:pcnt]) if (nt->shadow_rhs_cnt))\n            nrn_pragma_omp(target update from(\n                nt->_shadow_rhs[:pcnt], nt->_shadow_d[:pcnt]) if (nt->shadow_rhs_cnt))\n\n            // clang-format off\n            nrn_pragma_acc(update self(nt->nrn_fast_imem->nrn_sav_rhs[:nt->end],\n                                       nt->nrn_fast_imem->nrn_sav_d[:nt->end])\n                                  if (nt->nrn_fast_imem != nullptr))\n            nrn_pragma_omp(target update 
from(nt->nrn_fast_imem->nrn_sav_rhs[:nt->end],\n                                              nt->nrn_fast_imem->nrn_sav_d[:nt->end])\n                                         if (nt->nrn_fast_imem != nullptr))\n            // clang-format on\n\n            nrn_pragma_acc(update self(nt->pntprocs[:nt->n_pntproc]) if (nt->n_pntproc))\n            nrn_pragma_omp(target update from(nt->pntprocs[:nt->n_pntproc]) if (nt->n_pntproc))\n\n            nrn_pragma_acc(update self(nt->weights[:nt->n_weight]) if (nt->n_weight))\n            nrn_pragma_omp(target update from(nt->weights[:nt->n_weight]) if (nt->n_weight))\n\n            nrn_pragma_acc(update self(\n                nt->presyns_helper[:nt->n_presyn], nt->presyns[:nt->n_presyn]) if (nt->n_presyn))\n            nrn_pragma_omp(target update from(\n                nt->presyns_helper[:nt->n_presyn], nt->presyns[:nt->n_presyn]) if (nt->n_presyn))\n\n            {\n                TrajectoryRequests* tr = nt->trajec_requests;\n                if (tr && tr->varrays) {\n                    // The full buffers have `bsize` entries, but only `vsize`\n                    // of them are valid.\n                    for (int i = 0; i < tr->n_trajec; ++i) {\n                        nrn_pragma_acc(update self(tr->varrays[i][:tr->vsize]))\n                        nrn_pragma_omp(target update from(tr->varrays[i][:tr->vsize]))\n                    }\n                }\n            }\n\n            /* do not update vdata, it is a pointer array\n               nrn_pragma_acc(update self(nt->_vdata[:nt->_nvdata]) if (nt->_nvdata))\n               nrn_pragma_omp(target update from(nt->_vdata[:nt->_nvdata]) if (nt->_nvdata))\n             */\n        }\n    }\n#else\n    (void) threads;\n    (void) nthreads;\n#endif\n}\n\n/**\n * Copy weights from GPU to CPU\n *\n * User may record NetCon weights at the end of simulation.\n * For this purpose update weights of all NrnThread objects\n * from GPU to CPU.\n */\nvoid update_weights_from_gpu(NrnThread* 
threads, int nthreads) {\n#ifdef CORENEURON_ENABLE_GPU\n    for (int i = 0; i < nthreads; i++) {\n        NrnThread* nt = threads + i;\n        size_t n_weight = nt->n_weight;\n        if (nt->compute_gpu && n_weight > 0) {\n            double* weights = nt->weights;\n            nrn_pragma_acc(update host(weights [0:n_weight]))\n            nrn_pragma_omp(target update from(weights [0:n_weight]))\n        }\n    }\n#endif\n}\n\n/** Cleanup device memory that is being tracked by the OpenACC runtime.\n *\n *  This function painstakingly calls `cnrn_target_delete` in reverse order on all\n *  pointers that were passed to `cnrn_target_copyin` in `setup_nrnthreads_on_device`.\n *  This cleanup ensures that if the GPU is initialised multiple times from the\n *  same process then the OpenACC runtime will not be polluted with old\n *  pointers, which can cause errors. In particular if we do:\n *  @code\n *    {\n *      // ... some_ptr is dynamically allocated ...\n *      cnrn_target_copyin(some_ptr, some_size);\n *      // ... do some work ...\n *      // cnrn_target_delete(some_ptr);\n *      free(some_ptr);\n *    }\n *    {\n *      // ... 
same_ptr_again is dynamically allocated at the same address ...\n *      cnrn_target_copyin(same_ptr_again, some_other_size); // ERROR\n *    }\n *  @endcode\n *  the application will/may abort with an error such as:\n *    FATAL ERROR: variable in data clause is partially present on the device.\n *  The pattern above is typical of calling CoreNEURON on GPU multiple times in\n *  the same process.\n */\nvoid delete_nrnthreads_on_device(NrnThread* threads, int nthreads) {\n#ifdef CORENEURON_ENABLE_GPU\n    for (int i = 0; i < nthreads; i++) {\n        NrnThread* nt = threads + i;\n        if (!nt->compute_gpu) {\n            continue;\n        }\n        cnrn_target_delete(nt->_fornetcon_weight_perm, nt->_fornetcon_weight_perm_size);\n        cnrn_target_delete(nt->_fornetcon_perm_indices, nt->_fornetcon_perm_indices_size);\n        {\n            TrajectoryRequests* tr = nt->trajec_requests;\n            if (tr) {\n                if (tr->varrays) {\n                    for (int i = 0; i < tr->n_trajec; ++i) {\n                        cnrn_target_delete(tr->varrays[i], tr->bsize);\n                    }\n                    cnrn_target_delete(tr->varrays, tr->n_trajec);\n                }\n                cnrn_target_delete(tr->gather, tr->n_trajec);\n                cnrn_target_delete(tr);\n            }\n        }\n        if (nt->_permute) {\n            if (interleave_permute_type == 1) {\n                InterleaveInfo* info = interleave_info + i;\n                cnrn_target_delete(info->cellsize, nt->ncell);\n                cnrn_target_delete(info->lastnode, nt->ncell);\n                cnrn_target_delete(info->firstnode, nt->ncell);\n                cnrn_target_delete(info->stride, info->nstride + 1);\n                cnrn_target_delete(info);\n            } else if (interleave_permute_type == 2) {\n                InterleaveInfo* info = interleave_info + i;\n                cnrn_target_delete(info->cellsize, info->nwarp);\n                
cnrn_target_delete(info->stridedispl, info->nwarp + 1);\n                cnrn_target_delete(info->lastnode, info->nwarp + 1);\n                cnrn_target_delete(info->firstnode, info->nwarp + 1);\n                cnrn_target_delete(info->stride, info->nstride);\n                cnrn_target_delete(info);\n            }\n        }\n\n        if (nt->n_vecplay) {\n            nrn_VecPlay_delete_from_device(nt);\n            cnrn_target_delete(nt->_vecplay, nt->n_vecplay);\n        }\n\n        // Cleanup send_receive buffer.\n        if (nt->_net_send_buffer_size) {\n            cnrn_target_delete(nt->_net_send_buffer, nt->_net_send_buffer_size);\n        }\n\n        if (nt->n_presyn) {\n            cnrn_target_delete(nt->presyns, nt->n_presyn);\n            cnrn_target_delete(nt->presyns_helper, nt->n_presyn);\n        }\n\n        // Cleanup data that's setup in bbcore_read.\n        if (nt->_nvdata) {\n            cnrn_target_delete(nt->_vdata, nt->_nvdata);\n        }\n\n        // Cleanup weight vector used in NET_RECEIVE\n        if (nt->n_weight) {\n            cnrn_target_delete(nt->weights, nt->n_weight);\n        }\n\n        // Cleanup point processes\n        if (nt->n_pntproc) {\n            cnrn_target_delete(nt->pntprocs, nt->n_pntproc);\n        }\n\n        if (nt->nrn_fast_imem) {\n            cnrn_target_delete(nt->nrn_fast_imem->nrn_sav_d, nt->end);\n            cnrn_target_delete(nt->nrn_fast_imem->nrn_sav_rhs, nt->end);\n            cnrn_target_delete(nt->nrn_fast_imem);\n        }\n\n        if (nt->shadow_rhs_cnt) {\n            int pcnt = nrn_soa_padded_size(nt->shadow_rhs_cnt, 0);\n            cnrn_target_delete(nt->_shadow_d, pcnt);\n            cnrn_target_delete(nt->_shadow_rhs, pcnt);\n        }\n\n        for (auto tml = nt->tml; tml; tml = tml->next) {\n            delete_ml_from_device(tml->ml, tml->index);\n            cnrn_target_delete(tml);\n        }\n        cnrn_target_delete(nt->_ml_list, corenrn.get_memb_funcs().size());\n   
     cnrn_target_delete(nt->_v_parent_index, nt->end);\n        cnrn_target_delete(nt->_data, nt->_ndata);\n    }\n    cnrn_target_delete(threads, nthreads);\n    nrn_ion_global_map_delete_from_device();\n#endif\n}\n\n\nvoid nrn_newtonspace_copyto_device(NewtonSpace* ns) {\n#ifdef CORENEURON_ENABLE_GPU\n    // FIXME this check needs to be tweaked if we ever want to run with a mix\n    //       of CPU and GPU threads.\n    if (nrn_threads[0].compute_gpu == 0) {\n        return;\n    }\n\n    int n = ns->n * ns->n_instance;\n    // actually, the values of double do not matter, only the  pointers.\n    NewtonSpace* d_ns = cnrn_target_copyin(ns);\n\n    double* pd;\n\n    pd = cnrn_target_copyin(ns->delta_x, n);\n    cnrn_target_memcpy_to_device(&(d_ns->delta_x), &pd);\n\n    pd = cnrn_target_copyin(ns->high_value, n);\n    cnrn_target_memcpy_to_device(&(d_ns->high_value), &pd);\n\n    pd = cnrn_target_copyin(ns->low_value, n);\n    cnrn_target_memcpy_to_device(&(d_ns->low_value), &pd);\n\n    pd = cnrn_target_copyin(ns->rowmax, n);\n    cnrn_target_memcpy_to_device(&(d_ns->rowmax), &pd);\n\n    auto pint = cnrn_target_copyin(ns->perm, n);\n    cnrn_target_memcpy_to_device(&(d_ns->perm), &pint);\n\n    auto ppd = cnrn_target_copyin(ns->jacobian, ns->n);\n    cnrn_target_memcpy_to_device(&(d_ns->jacobian), &ppd);\n\n    // the actual jacobian doubles were allocated as a single array\n    double* d_jacdat = cnrn_target_copyin(ns->jacobian[0], ns->n * n);\n\n    for (int i = 0; i < ns->n; ++i) {\n        pd = d_jacdat + i * n;\n        cnrn_target_memcpy_to_device(&(ppd[i]), &pd);\n    }\n#endif\n}\n\nvoid nrn_newtonspace_delete_from_device(NewtonSpace* ns) {\n#ifdef CORENEURON_ENABLE_GPU\n    // FIXME this check needs to be tweaked if we ever want to run with a mix\n    //       of CPU and GPU threads.\n    if (nrn_threads[0].compute_gpu == 0) {\n        return;\n    }\n    int n = ns->n * ns->n_instance;\n    cnrn_target_delete(ns->jacobian[0], ns->n * n);\n    
cnrn_target_delete(ns->jacobian, ns->n);\n    cnrn_target_delete(ns->perm, n);\n    cnrn_target_delete(ns->rowmax, n);\n    cnrn_target_delete(ns->low_value, n);\n    cnrn_target_delete(ns->high_value, n);\n    cnrn_target_delete(ns->delta_x, n);\n    cnrn_target_delete(ns);\n#endif\n}\n\nvoid nrn_sparseobj_copyto_device(SparseObj* so) {\n#if defined(CORENEURON_ENABLE_GPU) && !defined(CORENEURON_UNIFIED_MEMORY)\n    // FIXME this check needs to be tweaked if we ever want to run with a mix\n    //       of CPU and GPU threads.\n    if (nrn_threads[0].compute_gpu == 0) {\n        return;\n    }\n\n    unsigned n1 = so->neqn + 1;\n    SparseObj* d_so = cnrn_target_copyin(so);\n    // only pointer fields in SparseObj that need setting up are\n    //   rowst, diag, rhs, ngetcall, coef_list\n    // only pointer fields in Elm that need setting up are\n    //   r_down, c_right, value\n    // do not care about the Elm* ptr value, just the space.\n\n    Elm** d_rowst = cnrn_target_copyin(so->rowst, n1);\n    cnrn_target_memcpy_to_device(&(d_so->rowst), &d_rowst);\n\n    Elm** d_diag = cnrn_target_copyin(so->diag, n1);\n    cnrn_target_memcpy_to_device(&(d_so->diag), &d_diag);\n\n    unsigned* pu = cnrn_target_copyin(so->ngetcall, so->_cntml_padded);\n    cnrn_target_memcpy_to_device(&(d_so->ngetcall), &pu);\n\n    double* pd = cnrn_target_copyin(so->rhs, n1 * so->_cntml_padded);\n    cnrn_target_memcpy_to_device(&(d_so->rhs), &pd);\n\n    double** d_coef_list = cnrn_target_copyin(so->coef_list, so->coef_list_size);\n    cnrn_target_memcpy_to_device(&(d_so->coef_list), &d_coef_list);\n\n    // Fill in relevant Elm pointer values\n\n    for (unsigned irow = 1; irow < n1; ++irow) {\n        for (Elm* elm = so->rowst[irow]; elm; elm = elm->c_right) {\n            Elm* pelm = cnrn_target_copyin(elm);\n\n            if (elm == so->rowst[irow]) {\n                cnrn_target_memcpy_to_device(&(d_rowst[irow]), &pelm);\n            } else {\n                Elm* d_e = 
cnrn_target_deviceptr(elm->c_left);\n                cnrn_target_memcpy_to_device(&(pelm->c_left), &d_e);\n            }\n\n            if (elm->col == elm->row) {\n                cnrn_target_memcpy_to_device(&(d_diag[irow]), &pelm);\n            }\n\n            if (irow > 1) {\n                if (elm->r_up) {\n                    Elm* d_e = cnrn_target_deviceptr(elm->r_up);\n                    cnrn_target_memcpy_to_device(&(pelm->r_up), &d_e);\n                }\n            }\n\n            pd = cnrn_target_copyin(elm->value, so->_cntml_padded);\n            cnrn_target_memcpy_to_device(&(pelm->value), &pd);\n        }\n    }\n\n    // visit all the Elm again and fill in pelm->r_down and pelm->c_left\n    for (unsigned irow = 1; irow < n1; ++irow) {\n        for (Elm* elm = so->rowst[irow]; elm; elm = elm->c_right) {\n            auto pelm = cnrn_target_deviceptr(elm);\n            if (elm->r_down) {\n                auto d_e = cnrn_target_deviceptr(elm->r_down);\n                cnrn_target_memcpy_to_device(&(pelm->r_down), &d_e);\n            }\n            if (elm->c_right) {\n                auto d_e = cnrn_target_deviceptr(elm->c_right);\n                cnrn_target_memcpy_to_device(&(pelm->c_right), &d_e);\n            }\n        }\n    }\n\n    // Fill in the d_so->coef_list\n    for (unsigned i = 0; i < so->coef_list_size; ++i) {\n        pd = cnrn_target_deviceptr(so->coef_list[i]);\n        cnrn_target_memcpy_to_device(&(d_coef_list[i]), &pd);\n    }\n#endif\n}\n\nvoid nrn_sparseobj_delete_from_device(SparseObj* so) {\n#if defined(CORENEURON_ENABLE_GPU) && !defined(CORENEURON_UNIFIED_MEMORY)\n    // FIXME this check needs to be tweaked if we ever want to run with a mix\n    //       of CPU and GPU threads.\n    if (nrn_threads[0].compute_gpu == 0) {\n        return;\n    }\n    unsigned n1 = so->neqn + 1;\n    for (unsigned irow = 1; irow < n1; ++irow) {\n        for (Elm* elm = so->rowst[irow]; elm; elm = elm->c_right) {\n            
cnrn_target_delete(elm->value, so->_cntml_padded);\n            cnrn_target_delete(elm);\n        }\n    }\n    cnrn_target_delete(so->coef_list, so->coef_list_size);\n    cnrn_target_delete(so->rhs, n1 * so->_cntml_padded);\n    cnrn_target_delete(so->ngetcall, so->_cntml_padded);\n    cnrn_target_delete(so->diag, n1);\n    cnrn_target_delete(so->rowst, n1);\n    cnrn_target_delete(so);\n#endif\n}\n\n#ifdef CORENEURON_ENABLE_GPU\n\nvoid nrn_ion_global_map_copyto_device() {\n    if (nrn_ion_global_map_size) {\n        double** d_data = cnrn_target_copyin(nrn_ion_global_map, nrn_ion_global_map_size);\n        for (int j = 0; j < nrn_ion_global_map_size; j++) {\n            if (nrn_ion_global_map[j]) {\n                double* d_mechmap = cnrn_target_copyin(nrn_ion_global_map[j],\n                                                       ion_global_map_member_size);\n                cnrn_target_memcpy_to_device(&(d_data[j]), &d_mechmap);\n            }\n        }\n    }\n}\n\nvoid nrn_ion_global_map_delete_from_device() {\n    for (int j = 0; j < nrn_ion_global_map_size; j++) {\n        if (nrn_ion_global_map[j]) {\n            cnrn_target_delete(nrn_ion_global_map[j], ion_global_map_member_size);\n        }\n    }\n    if (nrn_ion_global_map_size) {\n        cnrn_target_delete(nrn_ion_global_map, nrn_ion_global_map_size);\n    }\n}\n\nvoid init_gpu() {\n    // check how many gpu devices available per node\n    int num_devices_per_node = cnrn_target_get_num_devices();\n\n    // if no gpu found, can't run on GPU\n    if (num_devices_per_node == 0) {\n        nrn_fatal_error(\"\\n ERROR : Enabled GPU execution but couldn't find NVIDIA GPU!\\n\");\n    }\n\n    if (corenrn_param.num_gpus != 0) {\n        if (corenrn_param.num_gpus > num_devices_per_node) {\n            nrn_fatal_error(\"Fatal error: asking for '%d' GPUs per node but only '%d' available\\n\",\n                            corenrn_param.num_gpus,\n                            num_devices_per_node);\n        } 
else {\n            num_devices_per_node = corenrn_param.num_gpus;\n        }\n    }\n\n    // get local rank within a node and assign a specific gpu for this node.\n    // multiple threads within the node will use the same device.\n    int local_rank = 0;\n    int local_size = 1;\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        local_rank = nrnmpi_local_rank();\n        local_size = nrnmpi_local_size();\n    }\n#endif\n\n    cnrn_target_set_default_device(local_rank % num_devices_per_node);\n\n    if (nrnmpi_myid == 0 && !corenrn_param.is_quiet()) {\n        std::cout << \" Info : \" << num_devices_per_node << \" GPUs shared by \" << local_size\n                  << \" ranks per node\\n\";\n    }\n}\n\nvoid nrn_VecPlay_copyto_device(NrnThread* nt, void** d_vecplay) {\n    for (int i = 0; i < nt->n_vecplay; i++) {\n        VecPlayContinuous* vecplay_instance = (VecPlayContinuous*) nt->_vecplay[i];\n\n        /** just VecPlayContinuous object */\n        VecPlayContinuous* d_vecplay_instance = cnrn_target_copyin(vecplay_instance);\n        cnrn_target_memcpy_to_device((VecPlayContinuous**) (&(d_vecplay[i])), &d_vecplay_instance);\n\n        /** copy y_, t_ and discon_indices_ */\n        copy_ivoc_vect_to_device(vecplay_instance->y_, d_vecplay_instance->y_);\n        copy_ivoc_vect_to_device(vecplay_instance->t_, d_vecplay_instance->t_);\n        // OL211213: beware, the test suite does not currently include anything\n        // with a non-null discon_indices_.\n        if (vecplay_instance->discon_indices_) {\n            IvocVect* d_discon_indices = cnrn_target_copyin(vecplay_instance->discon_indices_);\n            cnrn_target_memcpy_to_device(&(d_vecplay_instance->discon_indices_), &d_discon_indices);\n            copy_ivoc_vect_to_device(*(vecplay_instance->discon_indices_),\n                                     *(d_vecplay_instance->discon_indices_));\n        }\n\n        /** copy PlayRecordEvent : todo: verify this */\n        PlayRecordEvent* d_e_ = 
cnrn_target_copyin(vecplay_instance->e_);\n\n        cnrn_target_memcpy_to_device(&(d_e_->plr_), (PlayRecord**) (&d_vecplay_instance));\n        cnrn_target_memcpy_to_device(&(d_vecplay_instance->e_), &d_e_);\n\n        /** copy pd_ : note that it's pointer inside ml->data and hence data itself is\n         * already on GPU */\n        double* d_pd_ = cnrn_target_deviceptr(vecplay_instance->pd_);\n        cnrn_target_memcpy_to_device(&(d_vecplay_instance->pd_), &d_pd_);\n    }\n}\n\nvoid nrn_VecPlay_delete_from_device(NrnThread* nt) {\n    for (int i = 0; i < nt->n_vecplay; i++) {\n        auto* vecplay_instance = static_cast<VecPlayContinuous*>(nt->_vecplay[i]);\n        cnrn_target_delete(vecplay_instance->e_);\n        if (vecplay_instance->discon_indices_) {\n            delete_ivoc_vect_from_device(*(vecplay_instance->discon_indices_));\n        }\n        delete_ivoc_vect_from_device(vecplay_instance->t_);\n        delete_ivoc_vect_from_device(vecplay_instance->y_);\n        cnrn_target_delete(vecplay_instance);\n    }\n}\n\n#endif\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/gpu/nrn_acc_manager.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n#pragma once\n\nnamespace coreneuron {\nstruct Memb_list;\nstruct NrnThread;\nstruct NetSendBuffer_t;\nvoid setup_nrnthreads_on_device(NrnThread* threads, int nthreads);\nvoid delete_nrnthreads_on_device(NrnThread* threads, int nthreads);\nvoid update_nrnthreads_on_host(NrnThread* threads, int nthreads);\n\nvoid update_net_receive_buffer(NrnThread* _nt);\n\n// Called by NModl\nvoid realloc_net_receive_buffer(NrnThread* nt, Memb_list* ml);\nvoid update_net_send_buffer_on_host(NrnThread* nt, NetSendBuffer_t* nsb);\n\nvoid update_weights_from_gpu(NrnThread* threads, int nthreads);\nvoid init_gpu();\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/core2nrn_data_return.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <sstream>\n\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/io/nrn2core_direct.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/io/core2nrn_data_return.hpp\"\n#include \"coreneuron/network/netcvode.hpp\"\n#include \"coreneuron/permute/node_permute.h\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n#include \"coreneuron/utils/vrecitem.h\"\n#include \"coreneuron/io/mem_layout_util.hpp\"\n\n/** @brief, Information from NEURON to help with copying data to NEURON.\n *  Info for copying voltage, i_membrane_, and mechanism data.\n *  See implementaton in\n *  nrn/src/nrniv/nrnbbcore_write.cpp:nrnthreads_type_return.\n *  Return is size of either the returned data pointer or the number\n *  of pointers in mdata. 
tid is the thread index.\n */\nsize_t (*nrn2core_type_return_)(int type, int tid, double*& data, double**& mdata);\n\n/** @brief, Call NEURON mechanism bbcore_read.\n *  Inverse of bbcore_write for transfer from NEURON to CoreNEURON.\n *  Mostly for transferring back the nrnran123_State sequence so psolve can\n *  continue on NEURON side (or continue psolve on CoreNEURON).\n */\nextern \"C\" {\nint (*core2nrn_corepointer_mech_)(int tid,\n                                  int type,\n                                  int icnt,\n                                  int dcnt,\n                                  int* iArray,\n                                  double* dArray);\n}\n\nnamespace coreneuron {\n\n/** @brief permuted array copied to unpermuted array\n *  If permute is NULL then just a copy\n */\nstatic void inverse_permute_copy(size_t n, double* permuted_src, double* dest, int* permute) {\n    if (permute) {\n        for (size_t i = 0; i < n; ++i) {\n            dest[i] = permuted_src[permute[i]];\n        }\n    } else {\n        std::copy(permuted_src, permuted_src + n, dest);\n    }\n}\n\n/** @brief SoA permuted mechanism data copied to unpermuted AoS data.\n *  dest is an array of n pointers to the beginning of each sz length array.\n *  src is a contiguous array of sz segments of size stride. The stride\n *  may be slightly greater than n for purposes of alignment.\n *  Each of the sz segments of src are permuted.\n */\nstatic void soa2aos_inverse_permute_copy(size_t n,\n                                         int sz,\n                                         int stride,\n                                         double* src,\n                                         double** dest,\n                                         int* permute) {\n    // src is soa and permuted. 
dest is n pointers to sz doubles (aos).\n    for (size_t instance = 0; instance < n; ++instance) {\n        double* d = dest[instance];\n        double* s = src + permute[instance];\n        for (int i = 0; i < sz; ++i) {\n            d[i] = s[i * stride];\n        }\n    }\n}\n\n/** @brief SoA unpermuted mechanism data copied to unpermuted AoS data.\n *  dest is an array of n pointers to the beginning of each sz length array.\n *  src is a contiguous array of sz segments of size stride. The stride\n *  may be slightly greater than n for purposes of alignment.\n *  Each of the sz segments of src has the same order as the n pointers\n *  of dest.\n */\nstatic void soa2aos_unpermuted_copy(size_t n, int sz, int stride, double* src, double** dest) {\n    // src is soa and unpermuted. dest is n pointers to sz doubles (aos).\n    for (size_t instance = 0; instance < n; ++instance) {\n        double* d = dest[instance];\n        double* s = src + instance;\n        for (int i = 0; i < sz; ++i) {\n            d[i] = s[i * stride];\n        }\n    }\n}\n\n/** @brief AoS mechanism data copied to AoS data.\n *  dest is an array of n pointers to the beginning of each sz length array.\n *  src is a contiguous array of n segments of size sz.\n */\nstatic void aos2aos_copy(size_t n, int sz, double* src, double** dest) {\n    for (size_t instance = 0; instance < n; ++instance) {\n        double* d = dest[instance];\n        double* s = src + (instance * sz);\n        std::copy(s, s + sz, d);\n    }\n}\n\n/** @brief Copy back COREPOINTER info to NEURON\n */\nstatic void core2nrn_corepointer(int tid, NrnThreadMembList* tml) {\n    // Based on get_bbcore_write fragment in nrn_checkpoint.cpp\n    int type = tml->index;\n    if (!corenrn.get_bbcore_write()[type]) {\n        return;\n    }\n    NrnThread& nt = nrn_threads[tid];\n    Memb_list* ml = tml->ml;\n    double* d = nullptr;\n    Datum* pd = nullptr;\n    int layout = corenrn.get_mech_data_layout()[type];\n    int dsz = 
corenrn.get_prop_param_size()[type];\n    int pdsz = corenrn.get_prop_dparam_size()[type];\n    int aln_cntml = nrn_soa_padded_size(ml->nodecount, layout);\n\n    int icnt = 0;\n    int dcnt = 0;\n    // data size and allocate\n    for (int j = 0; j < ml->nodecount; ++j) {\n        int jp = j;\n        if (ml->_permute) {\n            jp = ml->_permute[j];\n        }\n        d = ml->data + nrn_i_layout(jp, ml->nodecount, 0, dsz, layout);\n        pd = ml->pdata + nrn_i_layout(jp, ml->nodecount, 0, pdsz, layout);\n        (*corenrn.get_bbcore_write()[type])(\n            nullptr, nullptr, &dcnt, &icnt, 0, aln_cntml, d, pd, ml->_thread, &nt, ml, 0.0);\n    }\n\n    std::unique_ptr<int[]> iArray;\n    std::unique_ptr<double[]> dArray;\n    if (icnt) {\n        iArray.reset(new int[icnt]);\n    }\n    if (dcnt) {\n        dArray.reset(new double[dcnt]);\n    }\n    icnt = dcnt = 0;\n    for (int j = 0; j < ml->nodecount; j++) {\n        int jp = j;\n\n        if (ml->_permute) {\n            jp = ml->_permute[j];\n        }\n\n        d = ml->data + nrn_i_layout(jp, ml->nodecount, 0, dsz, layout);\n        pd = ml->pdata + nrn_i_layout(jp, ml->nodecount, 0, pdsz, layout);\n\n        (*corenrn.get_bbcore_write()[type])(dArray.get(),\n                                            iArray.get(),\n                                            &dcnt,\n                                            &icnt,\n                                            0,\n                                            aln_cntml,\n                                            d,\n                                            pd,\n                                            ml->_thread,\n                                            &nt,\n                                            ml,\n                                            0.0);\n    }\n\n    (*core2nrn_corepointer_mech_)(tid, type, icnt, dcnt, iArray.get(), dArray.get());\n}\n\n/** @brief Copy event queue and related state back to NEURON.\n */\nstatic 
void core2nrn_tqueue(NrnThread&);\n\n/** @brief Callback to clear NEURON thread queues.\n    In particular need to initialize bin queues to the current time before\n    transferring events.\n */\nextern \"C\" {\nvoid (*core2nrn_clear_queues_)(double t);\n}\n\n/** @brief All activated WATCH statements need activation on NEURON side.\n */\n// vector in unpermuted Memb_list index order of vector of\n// activated watch_index (the bool is whether it is above threshold).\nusing Core2NrnWatchInfoItem = std::vector<std::pair<int, bool>>;\nusing Core2NrnWatchInfo = std::vector<Core2NrnWatchInfoItem>;\n\nextern \"C\" {\nvoid (*core2nrn_watch_clear_)();\nvoid (*core2nrn_watch_activate_)(int tid, int type, int watch_begin, Core2NrnWatchInfo&);\n}\n\nstatic void core2nrn_watch();\n\n/** @brief VecPlay indices back to NEURON */\nextern \"C\" {\nvoid (*core2nrn_vecplay_)(int tid, int i_nrn, int last, int discon, int ubound);\nvoid (*core2nrn_vecplay_events_)();\n}\n\nstatic void core2nrn_vecplay();\n\n/** @brief copy data back to NEURON.\n *  Copies t, voltage, i_membrane_ if it used, and mechanism param data.\n *  Copies event queue and related state, e.g. 
WATCH, VecPlayContinuous.\n */\nvoid core2nrn_data_return() {\n    if (!nrn2core_type_return_) {\n        return;\n    }\n\n    (*core2nrn_clear_queues_)(nrn_threads[0]._t);  // all threads at same time\n\n    for (int tid = 0; tid < nrn_nthread; ++tid) {\n        size_t n = 0;\n        double* data = nullptr;\n        double** mdata = nullptr;\n        NrnThread& nt = nrn_threads[tid];\n\n        n = (*nrn2core_type_return_)(0, tid, data, mdata);  // 0 means time\n        if (n) {                                            // not the empty thread\n            data[0] = nt._t;\n        }\n\n        if (nt.end) {  // transfer voltage and possibly i_membrane_\n            n = (*nrn2core_type_return_)(voltage, tid, data, mdata);\n            assert(n == size_t(nt.end) && data);\n            inverse_permute_copy(n, nt._actual_v, data, nt._permute);\n\n            if (nt.nrn_fast_imem) {\n                n = (*nrn2core_type_return_)(i_membrane_, tid, data, mdata);\n                assert(n == size_t(nt.end) && data);\n                inverse_permute_copy(n, nt.nrn_fast_imem->nrn_sav_rhs, data, nt._permute);\n            }\n        }\n\n        for (NrnThreadMembList* tml = nt.tml; tml; tml = tml->next) {\n            int mtype = tml->index;\n            Memb_list* ml = tml->ml;\n            n = (*nrn2core_type_return_)(mtype, tid, data, mdata);\n            assert(n == size_t(ml->nodecount) && mdata);\n            if (n == 0) {\n                continue;\n            }\n            // NEURON is AoS, CoreNEURON may be SoA and may be permuted.\n            // On the NEURON side, the data is actually contiguous because of\n            // cache_efficient, but that may not be the case for ARTIFICIAL_CELL.\n            // For initial implementation simplicity, use the mdata info which gives\n            // a double* for each param_size mech instance.\n            int* permute = ml->_permute;\n            double* cndat = ml->data;\n            int layout = 
corenrn.get_mech_data_layout()[mtype];\n            int sz = corenrn.get_prop_param_size()[mtype];\n            if (layout == Layout::SoA) {\n                int stride = ml->_nodecount_padded;\n                if (permute) {\n                    soa2aos_inverse_permute_copy(n, sz, stride, cndat, mdata, permute);\n                } else {\n                    soa2aos_unpermuted_copy(n, sz, stride, cndat, mdata);\n                }\n            } else { /* AoS */\n                aos2aos_copy(n, sz, cndat, mdata);\n            }\n\n            core2nrn_corepointer(tid, tml);\n        }\n\n        // Copy the event queue and related state.\n        core2nrn_tqueue(nt);\n    }\n    core2nrn_vecplay();\n    core2nrn_watch();\n}\n\n/** @brief Callbacks into NEURON for WatchCondition.\n */\nstatic void core2nrn_watch() {\n    (*core2nrn_watch_clear_)();\n\n    // much of the following nested iterations follows the\n    // watch_activate_clear() function in sim/finitialize.cpp, though here\n    // we iterate over nt._watch_types instead of nt.tml and then picking out\n    // the WATCH relevant types with corenrn.get_watch_check().\n    for (int tid = 0; tid < nrn_nthread; ++tid) {\n        NrnThread& nt = nrn_threads[tid];\n        if (nt._watch_types) {\n            for (int i = 0; nt._watch_types[i] != 0; ++i) {\n                int type = nt._watch_types[i];\n                Memb_list& ml = *(nt._ml_list[type]);\n                int nodecount = ml.nodecount;\n                Core2NrnWatchInfo watch_info(ml.nodecount);\n                int* permute = ml._permute;\n                int* pdata = (int*) ml.pdata;\n                int dparam_size = corenrn.get_prop_dparam_size()[type];\n                int layout = corenrn.get_mech_data_layout()[type];\n                int first, last;\n                watch_datum_indices(type, first, last);\n                int watch_begin = first;\n                for (int iml = 0; iml < nodecount; ++iml) {\n                    int 
iml_permute = permute ? permute[iml] : iml;\n                    Core2NrnWatchInfoItem& wiv = watch_info[iml];\n                    for (int ix = first; ix <= last; ++ix) {\n                        int datum =\n                            pdata[nrn_i_layout(iml_permute, nodecount, ix, dparam_size, layout)];\n                        if (datum & 2) {  // activated\n                            bool above_thresh = bool(datum & 1);\n                            wiv.push_back(std::pair<int, bool>(ix, above_thresh));\n                        }\n                    }\n                }\n                (*core2nrn_watch_activate_)(tid, type, watch_begin, watch_info);\n            }\n        }\n    }\n}\n\n/** @brief Transfer VecPlay indices to NEURON.\n */\nvoid core2nrn_vecplay() {\n    for (int tid = 0; tid < nrn_nthread; ++tid) {\n        NrnThread& nt = nrn_threads[tid];\n        std::vector<int> i_nrn;\n        int ok = (*nrn2core_get_dat2_vecplay_)(tid, i_nrn);\n        if (nt.n_vecplay) {\n            assert(ok);\n        }\n        for (int i = 0; i < nt.n_vecplay; ++i) {\n            VecPlayContinuous& vp = *((VecPlayContinuous*) nt._vecplay[i]);\n            (*core2nrn_vecplay_)(tid,\n                                 i_nrn[i],\n                                 (int) vp.last_index_,\n                                 (int) vp.discon_index_,\n                                 (int) vp.ubound_index_);\n        }\n    }\n    (*core2nrn_vecplay_events_)();\n}\n\n/** @brief Callbacks into NEURON for queue event types.\n */\nextern \"C\" {\nvoid (*core2nrn_NetCon_event_)(int tid, double td, size_t nc_index);\n\n// must calculate netcon index from the weight index on this side\nvoid (*core2nrn_SelfEvent_event_)(int tid,\n                                  double td,\n                                  int tar_type,\n                                  int tar_index,\n                                  double flag,\n                                  size_t nc_index,\n              
                    int is_movable);\n// the no weight case\nvoid (*core2nrn_SelfEvent_event_noweight_)(int tid,\n                                           double td,\n                                           int tar_type,\n                                           int tar_index,\n                                           double flag,\n                                           int is_movable);\n\n// PreSyn.flag_ will be 1 if it has fired and the value it is watching\n// is still greater than threshold. (Note, is 0 no matter what after\n// finitialize so using a set to send back the flag explicitly for any\n// that are 1. Although that is not really relevant in the core2nrn\n// direction. To match up PreSyn on NEURON and CoreNEURON side, we use\n// the (unpermuted) voltage index.\nvoid (*core2nrn_PreSyn_flag_)(int tid, std::set<int> presyns_flag_true);\n// Receive the PreSyn.flag_ == true voltage indices from the neuron side.\nvoid (*nrn2core_transfer_PreSyn_flag_)(int tid, std::set<int>& presyns_flag_true);\n}\n\nstatic void core2nrn_PreSyn_flag(NrnThread& nt) {\n    std::set<int> presyns_flag_true;\n    std::unique_ptr<int[]> pinv_nt;\n    if (nt._permute) {\n        pinv_nt.reset(inverse_permute(nt._permute, nt.end));\n    }\n    for (int i = 0; i < nt.n_presyn; ++i) {\n        PreSyn& ps = nt.presyns[i];\n        PreSynHelper& psh = nt.presyns_helper[i];\n        if (psh.flag_ && ps.thvar_index_ >= 0) {\n            int index_v = pinv_nt ? 
pinv_nt[ps.thvar_index_] : ps.thvar_index_;\n            presyns_flag_true.insert(index_v);\n        }\n    }\n    // have to send even if empty so NEURON side can turn off all flag_\n    (*core2nrn_PreSyn_flag_)(nt.id, presyns_flag_true);\n}\n\nvoid nrn2core_PreSyn_flag_receive(int tid) {\n    NrnThread& nt = nrn_threads[tid];\n    // turn off all the PreSyn.flag_ as they might have been turned off\n    // on the NEURON side if NEURON integrated a bit.\n    for (int i = 0; i < nt.n_presyn; ++i) {\n        nt.presyns_helper[i].flag_ = 0;  // in case 1 from previous psolve\n    }\n    std::set<int> presyns_flag_true;\n    (*nrn2core_transfer_PreSyn_flag_)(tid, presyns_flag_true);\n    if (presyns_flag_true.empty()) {\n        return;\n    }\n    std::unique_ptr<int[]> pinv_nt;\n    if (nt._permute) {\n        pinv_nt.reset(inverse_permute(nt._permute, nt.end));\n    }\n    for (int i = 0; i < nt.n_presyn; ++i) {\n        PreSyn& ps = nt.presyns[i];\n        PreSynHelper& psh = nt.presyns_helper[i];\n        if (ps.thvar_index_ >= 0) {\n            int index_v = pinv_nt ? 
pinv_nt[ps.thvar_index_] : ps.thvar_index_;\n            if (presyns_flag_true.erase(index_v)) {\n                psh.flag_ = 1;\n                if (presyns_flag_true.empty()) {\n                    break;\n                }\n            }\n        }\n    }\n}\n\nstd::map<int, int*> type2invperm;\n\nstatic void clear_inv_perm_for_selfevent_targets() {\n    for (auto it: type2invperm) {\n        delete[] it.second;\n    }\n    type2invperm.clear();\n}\n\n\nusing SelfEventWeightMap = std::map<int, std::vector<TQItem*>>;\n\n// return false unless q is pushed to sewm\nstatic bool core2nrn_tqueue_item(TQItem* q, SelfEventWeightMap& sewm, NrnThread& nt) {\n    DiscreteEvent* d = (DiscreteEvent*) q->data_;\n    double td = q->t_;\n    bool in_sewm = false;\n\n    switch (d->type()) {\n        case NetConType: {\n            NetCon* nc = (NetCon*) d;\n            assert(nc >= nt.netcons && (nc < (nt.netcons + nt.n_netcon)));\n            size_t nc_index = nc - nt.netcons;\n            (*core2nrn_NetCon_event_)(nt.id, td, nc_index);\n            break;\n        }\n        case SelfEventType: {\n            SelfEvent* se = (SelfEvent*) d;\n            Point_process* pnt = se->target_;\n            assert(pnt->_tid == nt.id);\n            int tar_type = (int) pnt->_type;\n            Memb_list* ml = nt._ml_list[tar_type];\n            if (ml->_permute) {  // if permutation, then make inverse available\n                // Doing this here because we don't know, in general, which\n                // mechanisms use SelfEvent\n                if (type2invperm.count(tar_type) == 0) {\n                    type2invperm[tar_type] = inverse_permute(ml->_permute, ml->nodecount);\n                }\n            }\n            double flag = se->flag_;\n            TQItem** movable = (TQItem**) (se->movable_);\n            int is_movable = (movable && *movable == q) ? 
1 : 0;\n            int weight_index = se->weight_index_;\n            // the weight_index is useless on the NEURON side so we need\n            // to convert that to NetCon index  and let the NEURON side\n            // figure out the weight_index. To figure out the netcon_index\n            // construct a {weight_index : [TQItem]} here for any\n            // weight_index >= 0, otherwise send it NEURON now.\n            if (weight_index >= 0) {\n                // Potentially several SelfEvent TQItem* associated with\n                // same weight index. More importantly, collect them all\n                // so that we only need to iterate over the nt.netcons once\n                sewm[weight_index].push_back(q);\n                in_sewm = true;\n\n            } else {\n                int tar_index = pnt->_i_instance;  // correct for no permutation\n                if (ml->_permute) {\n                    tar_index = type2invperm[tar_type][tar_index];\n                }\n                (*core2nrn_SelfEvent_event_noweight_)(\n                    nt.id, td, tar_type, tar_index, flag, is_movable);\n                delete se;\n            }\n            break;\n        }\n        case PreSynType: {\n            // nothing to transfer\n            // `d` can be cast to PreSyn*\n            break;\n        }\n        case NetParEventType: {\n            // nothing to transfer\n            break;\n        }\n        case PlayRecordEventType: {\n            // nothing to transfer\n            break;\n        }\n        default: {\n            // In particular, InputPreSyn does not appear in tqueue as it\n            // immediately fans out to NetCon.\n            std::stringstream qetype;\n            qetype << d->type();\n            hoc_execerror(\"core2nrn_tqueue_item -> unimplemented queue event type:\",\n                          qetype.str().c_str());\n            break;\n        }\n    }\n    return in_sewm;\n}\n\nvoid core2nrn_tqueue(NrnThread& nt) {\n    // 
VecPlayContinuous\n\n    // PatternStim\n\n    // nrn_checkpoint.cpp has:\n    // Avoid extra spikes due to some presyn voltages above threshold\n\n    // PreSyn.flag_ that are on\n    core2nrn_PreSyn_flag(nt);\n\n    // The items on the queue\n    NetCvodeThreadData& ntd = net_cvode_instance->p[nt.id];\n    // make sure all buffered interthread events are on the queue\n    ntd.enqueue(net_cvode_instance, &nt);\n\n    TQueue<QTYPE>* tqe = ntd.tqe_;\n    TQItem* q;\n    SelfEventWeightMap sewm;\n    // TQItems from atomic_dq\n    while ((q = tqe->atomic_dq(1e20)) != nullptr) {\n        if (core2nrn_tqueue_item(q, sewm, nt) == false) {\n            delete q;\n        }\n    }\n    // TQitems from binq_\n    for (q = tqe->binq_->first(); q; q = tqe->binq_->next(q)) {\n        bool const result = core2nrn_tqueue_item(q, sewm, nt);\n        assert(result == false);\n    }\n\n    // For self events with weight, find the NetCon index and send that\n    // to NEURON.\n    // If the SelfEventWeightMap approach (and the corresponding pattern\n    // on the nrn2core side in NEURON) ends up being too expensive in space\n    // or time, it would be possible to modify SelfEvent to use the NetCon\n    // index instead of the weight index, and then directly determine the\n    // NetCon within the core2nrn_tqueue_item function above and call\n    // (*core2nrn_SelfEvent_event_) from there.\n    if (!sewm.empty()) {\n        for (int nc_index = 0; nc_index < nt.n_netcon; ++nc_index) {\n            NetCon& nc = nt.netcons[nc_index];\n            int weight_index = nc.u.weight_index_;\n            auto search = sewm.find(weight_index);\n            if (search != sewm.end()) {\n                const auto& tqitems = search->second;\n                for (auto q: tqitems) {\n                    DiscreteEvent* d = (DiscreteEvent*) (q->data_);\n                    double td = q->t_;\n                    assert(d->type() == SelfEventType);\n                    SelfEvent* se = (SelfEvent*) 
d;\n                    int tar_type = se->target_->_type;\n                    // Note that instead of getting tar_index from the permuted\n                    // pnt->_i_instance here and for the noweight case above\n                    // which then needs the possibly large inverse permutation\n                    // vectors, it would save some space to use the unpermuted\n                    // nt.pntprocs array along with a much shorter vector\n                    // of type offsets.\n                    int tar_index = se->target_->_i_instance;\n                    if (nt._ml_list[tar_type]->_permute) {\n                        tar_index = type2invperm[tar_type][tar_index];\n                    }\n                    double flag = se->flag_;\n                    TQItem** movable = (TQItem**) (se->movable_);\n                    int is_movable = (movable && *movable == q) ? 1 : 0;\n                    (*core2nrn_SelfEvent_event_)(\n                        nt.id, td, tar_type, tar_index, flag, nc_index, is_movable);\n                    delete q;\n                    delete se;\n                }\n            }\n        }\n    }\n\n    clear_inv_perm_for_selfevent_targets();\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/core2nrn_data_return.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\nnamespace coreneuron {\n\n/** @brief Copies back to NEURON everything needed to analyze and continue simulation.\n    I.e. voltage, i_membrane_, mechanism data, event queue, WATCH state,\n    Play state, etc.\n */\nextern void core2nrn_data_return();\n\n/** @brief return first and last datum indices of WATCH statements\n */\nextern void watch_datum_indices(int type, int& first, int& last);\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/file_utils.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <cstdio>\n#include <cstring>\n#include <cstdlib>\n#include <sys/stat.h>\n#include <errno.h>\n\n#if defined(MINGW)\n#define mkdir(dir_name, permission) _mkdir(dir_name)\n#endif\n\n/* adapted from : gist@jonathonreinhart/mkdir_p.c */\nint mkdir_p(const char* path) {\n    const int path_len = strlen(path);\n    if (path_len == 0) {\n        printf(\"Warning: Empty path for creating directory\\n\");\n        return -1;\n    }\n\n    char* dirpath = new char[path_len + 1];\n    strcpy(dirpath, path);\n    errno = 0;\n\n    /* iterate from the outermost to the innermost dir */\n    for (char* p = dirpath + 1; *p; p++) {\n        if (*p == '/') {\n            /* temporarily truncate to sub-dir */\n            *p = '\\0';\n\n            if (mkdir(dirpath, S_IRWXU) != 0) {\n                if (errno != EEXIST) {\n                    delete[] dirpath; /* do not leak the buffer on failure */\n                    return -1;\n                }\n            }\n            *p = '/';\n        }\n    }\n\n    if (mkdir(dirpath, S_IRWXU) != 0) {\n        if (errno != EEXIST) {\n            delete[] dirpath; /* do not leak the buffer on failure */\n            return -1;\n        }\n    }\n\n    delete[] dirpath;\n    return 0;\n}\n"
  },
  {
    "path": "coreneuron/io/file_utils.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n/**\n * @file file_utils.hpp\n * @brief Utility functions for file/directory management\n *\n */\n\n#pragma once\n\n/** @brief Creates the directory, including missing parents, if it doesn't exist (similar to mkdir -p)\n *  @param path Directory path\n *  @return 0 on success, -1 on failure\n */\nint mkdir_p(const char* path);\n"
  },
  {
    "path": "coreneuron/io/global_vars.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <cstdio>\n#include <cstring>\n#include <map>\n#include <string>\n#include <algorithm>\n\n#include \"coreneuron/utils/randoms/nrnran123.h\"\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/mechanism/membfunc.hpp\"\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/io/nrn2core_direct.h\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n\nvoid* (*nrn2core_get_global_dbl_item_)(void*, const char*& name, int& size, double*& val);\nint (*nrn2core_get_global_int_item_)(const char* name);\n\nnamespace coreneuron {\nusing PSD = std::pair<std::size_t, double*>;\nusing N2V = std::map<std::string, PSD>;\n\nstatic N2V* n2v;\n\nvoid hoc_register_var(DoubScal* ds, DoubVec* dv, VoidFunc*) {\n    if (!n2v) {\n        n2v = new N2V();\n    }\n    for (size_t i = 0; ds[i].name; ++i) {\n        (*n2v)[ds[i].name] = PSD(0, ds[i].pdoub);\n    }\n    for (size_t i = 0; dv[i].name; ++i) {\n        // vector globals take their pointer from dv, not ds\n        (*n2v)[dv[i].name] = PSD(dv[i].index1, dv[i].pdoub);\n    }\n}\n\nvoid set_globals(const char* path, bool cli_global_seed, int cli_global_seed_value) {\n    if (!n2v) {\n        n2v = new N2V();\n    }\n    (*n2v)[\"celsius\"] = PSD(0, &celsius);\n    (*n2v)[\"dt\"] = PSD(0, &dt);\n    (*n2v)[\"t\"] = PSD(0, &t);\n    (*n2v)[\"PI\"] = PSD(0, &pi);\n\n    if (corenrn_embedded) {  // CoreNEURON embedded, get info direct from NEURON\n\n        const char* name;\n        int size;\n        double* val = nullptr;\n        void* p = nullptr;\n        while (1) {\n            p = (*nrn2core_get_global_dbl_item_)(p, name, size, val);\n            // If the last item in the NEURON symbol table is a USERDOUBLE\n            // then p is NULL but val is not NULL and following fragment\n          
  // will be processed before exit from loop.\n            if (val) {\n                N2V::iterator it = n2v->find(name);\n                if (it != n2v->end()) {\n                    if (size == 0) {\n                        nrn_assert(it->second.first == 0);\n                        *(it->second.second) = val[0];\n                    } else {\n                        nrn_assert(it->second.first == (size_t) size);\n                        double* pval = it->second.second;\n                        for (int i = 0; i < size; ++i) {\n                            pval[i] = val[i];\n                        }\n                    }\n                }\n                delete[] val;\n                val = nullptr;\n            }\n            if (!p) {\n                break;\n            }\n        }\n        secondorder = (*nrn2core_get_global_int_item_)(\"secondorder\");\n        nrnran123_set_globalindex((*nrn2core_get_global_int_item_)(\"Random123_global_index\"));\n\n    } else {  // get the info from the globals.dat file\n        std::string fname = std::string(path) + std::string(\"/globals.dat\");\n        FILE* f = fopen(fname.c_str(), \"r\");\n        if (!f) {\n            printf(\"ignore: could not open %s\\n\", fname.c_str());\n            delete n2v;\n            n2v = nullptr;\n            return;\n        }\n\n        char line[256];\n\n        nrn_assert(fscanf(f, \"%s\\n\", line) == 1);\n        check_bbcore_write_version(line);\n\n        for (;;) {\n            char name[256];\n            double val;\n            int n;\n            nrn_assert(fgets(line, 256, f) != nullptr);\n            N2V::iterator it;\n            if (sscanf(line, \"%s %lf\", name, &val) == 2) {\n                if (strcmp(name, \"0\") == 0) {\n                    break;\n                }\n                it = n2v->find(name);\n                if (it != n2v->end()) {\n                    nrn_assert(it->second.first == 0);\n                    *(it->second.second) = val;\n         
       }\n            } else if (sscanf(line, \"%[^[][%d]\\n\", name, &n) == 2) {\n                if (strcmp(name, \"0\") == 0) {\n                    break;\n                }\n                it = n2v->find(name);\n                if (it != n2v->end()) {\n                    nrn_assert(it->second.first == (size_t) n);\n                    double* pval = it->second.second;\n                    for (int i = 0; i < n; ++i) {\n                        nrn_assert(fgets(line, 256, f) != nullptr);\n                        nrn_assert(sscanf(line, \"%lf\\n\", &val) == 1);\n                        pval[i] = val;\n                    }\n                }\n            } else {\n                nrn_assert(0);\n            }\n        }\n\n        while (fgets(line, 256, f)) {\n            char name[256];\n            int n;\n            if (sscanf(line, \"%s %d\", name, &n) == 2) {\n                if (strcmp(name, \"secondorder\") == 0) {\n                    secondorder = n;\n                } else if (strcmp(name, \"Random123_globalindex\") == 0) {\n                    nrnran123_set_globalindex((uint32_t) n);\n                } else if (strcmp(name, \"_nrnunit_use_legacy_\") == 0) {\n                    if (n != CORENEURON_USE_LEGACY_UNITS) {\n                        hoc_execerror(\n                            \"CORENRN_ENABLE_LEGACY_UNITS not\"\n                            \" consistent with NEURON value of\"\n                            \" nrnunit_use_legacy()\",\n                            nullptr);\n                    }\n                }\n            }\n        }\n\n        fclose(f);\n\n        // overwrite global.dat config if seed is specified on Command line\n        if (cli_global_seed) {\n            nrnran123_set_globalindex((uint32_t) cli_global_seed_value);\n        }\n    }\n\n#if CORENRN_DEBUG\n    for (const auto& item: *n2v) {\n        printf(\"%s %ld %p\\n\", item.first.c_str(), item.second.first, item.second.second);\n    }\n#endif\n\n    delete n2v;\n 
   n2v = nullptr;\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/lfp.cpp",
    "content": "#include \"coreneuron/io/lfp.hpp\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n\n#include <algorithm>\n#include <cmath>\n#include <limits>\n#include <sstream>\n#include <stdexcept>\n\n\nnamespace coreneuron {\nnamespace lfputils {\n\ndouble line_source_lfp_factor(const Point3D& e_pos,\n                              const Point3D& seg_0,\n                              const Point3D& seg_1,\n                              const double radius,\n                              const double f) {\n    nrn_assert(radius >= 0.0);\n    Point3D dx = paxpy(seg_1, -1.0, seg_0);\n    Point3D de = paxpy(e_pos, -1.0, seg_0);\n    double dx2(dot(dx, dx));\n    double dxn(std::sqrt(dx2));\n    if (dxn < std::numeric_limits<double>::epsilon()) {\n        return point_source_lfp_factor(e_pos, seg_0, radius, f);\n    }\n    double de2(dot(de, de));\n    double mu(dot(dx, de) / dx2);\n    Point3D de_star(paxpy(de, -mu, dx));\n    double de_star2(dot(de_star, de_star));\n    double q2(de_star2 / dx2);\n\n    double delta(mu * mu - (de2 - radius * radius) / dx2);\n    double one_m_mu(1.0 - mu);\n    auto log_integral = [&q2, &dxn](double a, double b) {\n        if (q2 < std::numeric_limits<double>::epsilon()) {\n            if (a * b <= 0) {\n                std::ostringstream s;\n                s << \"Log integral: invalid arguments \" << b << \" \" << a\n                  << \". 
Likely electrode exactly on the segment and \"\n                  << \"no flooring is present.\";\n                throw std::invalid_argument(s.str());\n            }\n            return std::abs(std::log(b / a)) / dxn;\n        } else {\n            return std::log((b + std::sqrt(b * b + q2)) / (a + std::sqrt(a * a + q2))) / dxn;\n        }\n    };\n    if (delta <= 0.0) {\n        return f * log_integral(-mu, one_m_mu);\n    } else {\n        double sqr_delta(std::sqrt(delta));\n        double d1(mu - sqr_delta);\n        double d2(mu + sqr_delta);\n        double parts = 0.0;\n        if (d1 > 0.0) {\n            double b(std::min(d1, 1.0) - mu);\n            parts += log_integral(-mu, b);\n        }\n        if (d2 < 1.0) {\n            double b(std::max(d2, 0.0) - mu);\n            parts += log_integral(b, one_m_mu);\n        };\n        // complement\n        double maxd1_0(std::max(d1, 0.0)), mind2_1(std::min(d2, 1.0));\n        if (maxd1_0 < mind2_1) {\n            parts += 1.0 / radius * (mind2_1 - maxd1_0);\n        }\n        return f * parts;\n    };\n}\n}  // namespace lfputils\n\nusing namespace lfputils;\n\ntemplate <LFPCalculatorType Type, typename SegmentIdTy>\nLFPCalculator<Type, SegmentIdTy>::LFPCalculator(const Point3Ds& seg_start,\n                                                const Point3Ds& seg_end,\n                                                const std::vector<double>& radius,\n                                                const std::vector<SegmentIdTy>& segment_ids,\n                                                const Point3Ds& electrodes,\n                                                double extra_cellular_conductivity)\n    : segment_ids_(segment_ids) {\n    if (seg_start.size() != seg_end.size()) {\n        throw std::invalid_argument(\"Different number of segment starts and ends.\");\n    }\n    if (seg_start.size() != radius.size()) {\n        throw std::invalid_argument(\"Different number of segments and radii.\");\n    
}\n    double f(1.0 / (extra_cellular_conductivity * 4.0 * pi));\n\n    m.resize(electrodes.size());\n    for (size_t k = 0; k < electrodes.size(); ++k) {\n        auto& ms = m[k];\n        ms.resize(seg_start.size());\n        for (size_t l = 0; l < seg_start.size(); l++) {\n            ms[l] = getFactor(electrodes[k], seg_start[l], seg_end[l], radius[l], f);\n        }\n    }\n}\n\ntemplate <LFPCalculatorType Type, typename SegmentIdTy>\ntemplate <typename Vector>\ninline void LFPCalculator<Type, SegmentIdTy>::lfp(const Vector& membrane_current) {\n    std::vector<double> res(m.size());\n    for (size_t k = 0; k < m.size(); ++k) {\n        res[k] = 0.0;\n        auto& ms = m[k];\n        for (size_t l = 0; l < ms.size(); l++) {\n            res[k] += ms[l] * membrane_current[segment_ids_[l]];\n        }\n    }\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        lfp_values_.resize(res.size());\n        int mpi_sum{1};\n        nrnmpi_dbl_allreduce_vec(res.data(), lfp_values_.data(), res.size(), mpi_sum);\n    } else\n#endif\n    {\n        std::swap(res, lfp_values_);\n    }\n}\n\n\ntemplate LFPCalculator<LineSource>::LFPCalculator(const lfputils::Point3Ds& seg_start,\n                                                  const lfputils::Point3Ds& seg_end,\n                                                  const std::vector<double>& radius,\n                                                  const std::vector<int>& segment_ids,\n                                                  const lfputils::Point3Ds& electrodes,\n                                                  double extra_cellular_conductivity);\ntemplate LFPCalculator<PointSource>::LFPCalculator(const lfputils::Point3Ds& seg_start,\n                                                   const lfputils::Point3Ds& seg_end,\n                                                   const std::vector<double>& radius,\n                                                   const std::vector<int>& segment_ids,\n                
                                   const lfputils::Point3Ds& electrodes,\n                                                   double extra_cellular_conductivity);\ntemplate void LFPCalculator<LineSource>::lfp(const DoublePtr& membrane_current);\ntemplate void LFPCalculator<PointSource>::lfp(const DoublePtr& membrane_current);\ntemplate void LFPCalculator<LineSource>::lfp(const std::vector<double>& membrane_current);\ntemplate void LFPCalculator<PointSource>::lfp(const std::vector<double>& membrane_current);\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/lfp.hpp",
    "content": "#pragma once\n\n#include <algorithm>\n#include <array>\n#include <cmath>\n#include <vector>\n\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/utils/nrn_assert.h\"\n\nnamespace coreneuron {\n\nnamespace lfputils {\n\nusing Point3D = std::array<double, 3>;\nusing Point3Ds = std::vector<Point3D>;\nusing DoublePtr = double*;\n\ninline double dot(const Point3D& p1, const Point3D& p2) {\n    return p1[0] * p2[0] + p1[1] * p2[1] + p1[2] * p2[2];\n}\n\ninline double norm(const Point3D& p1) {\n    return std::sqrt(dot(p1, p1));\n}\n\ninline Point3D barycenter(const Point3D& p1, const Point3D& p2) {\n    return {0.5 * (p1[0] + p2[0]), 0.5 * (p1[1] + p2[1]), 0.5 * (p1[2] + p2[2])};\n}\n\ninline Point3D paxpy(const Point3D& p1, const double alpha, const Point3D& p2) {\n    return {p1[0] + alpha * p2[0], p1[1] + alpha * p2[1], p1[2] + alpha * p2[2]};\n}\n\n/**\n *\n * \\param e_pos electrode position\n * \\param seg_pos segment position\n * \\param radius segment radius\n * \\param f conductivity factor 1/([4 pi] * [conductivity])\n * \\return Resistance of the medium from the segment to the electrode.\n */\ninline double point_source_lfp_factor(const Point3D& e_pos,\n                                      const Point3D& seg_pos,\n                                      const double radius,\n                                      const double f) {\n    nrn_assert(radius >= 0.0);\n    Point3D es = paxpy(e_pos, -1.0, seg_pos);\n    return f / std::max(norm(es), radius);\n}\n\n/**\n *\n * \\param e_pos electrode position\n * \\param seg_0 segment start position\n * \\param seg_1 segment end position\n * \\param radius segment radius\n * \\param f conductivity factor 1/([4 pi] * [conductivity])\n * \\return Resistance of the medium from the segment to the electrode.\n */\ndouble line_source_lfp_factor(const Point3D& e_pos,\n                              const Point3D& seg_0,\n                              const Point3D& seg_1,\n                              const double radius,\n                     
         const double f);\n}  // namespace lfputils\n\nenum LFPCalculatorType { LineSource, PointSource };\n\n/**\n * \\brief LFPCalculator allows calculation of LFP given membrane currents.\n */\ntemplate <LFPCalculatorType Ty, typename SegmentIdTy = int>\nstruct LFPCalculator {\n    /**\n     * LFP Calculator constructor\n     * \\param seg_start all segments start owned by the proc\n     * \\param seg_end all segments end owned by the proc\n     * \\param radius fence around the segment. Ensures electrode cannot be\n     * arbitrarily close to the segment\n     * \\param electrodes positions of the electrodes\n     * \\param extra_cellular_conductivity conductivity of the extra-cellular\n     * medium\n     */\n    LFPCalculator(const lfputils::Point3Ds& seg_start,\n                  const lfputils::Point3Ds& seg_end,\n                  const std::vector<double>& radius,\n                  const std::vector<SegmentIdTy>& segment_ids,\n                  const lfputils::Point3Ds& electrodes,\n                  double extra_cellular_conductivity);\n\n    template <typename Vector>\n    void lfp(const Vector& membrane_current);\n\n    const std::vector<double>& lfp_values() const noexcept {\n        return lfp_values_;\n    }\n\n  private:\n    inline double getFactor(const lfputils::Point3D& e_pos,\n                            const lfputils::Point3D& seg_0,\n                            const lfputils::Point3D& seg_1,\n                            const double radius,\n                            const double f) const;\n    std::vector<double> lfp_values_;\n    std::vector<std::vector<double>> m;\n    const std::vector<SegmentIdTy>& segment_ids_;\n};\n\ntemplate <>\ndouble LFPCalculator<LineSource>::getFactor(const lfputils::Point3D& e_pos,\n                                            const lfputils::Point3D& seg_0,\n                                            const lfputils::Point3D& seg_1,\n                                            const double radius,\n         
                                   const double f) const {\n    return lfputils::line_source_lfp_factor(e_pos, seg_0, seg_1, radius, f);\n}\n\ntemplate <>\ndouble LFPCalculator<PointSource>::getFactor(const lfputils::Point3D& e_pos,\n                                             const lfputils::Point3D& seg_0,\n                                             const lfputils::Point3D& seg_1,\n                                             const double radius,\n                                             const double f) const {\n    return lfputils::point_source_lfp_factor(e_pos, lfputils::barycenter(seg_0, seg_1), radius, f);\n}\n\nextern template LFPCalculator<LineSource>::LFPCalculator(const lfputils::Point3Ds& seg_start,\n                                                         const lfputils::Point3Ds& seg_end,\n                                                         const std::vector<double>& radius,\n                                                         const std::vector<int>& segment_ids,\n                                                         const lfputils::Point3Ds& electrodes,\n                                                         double extra_cellular_conductivity);\nextern template LFPCalculator<PointSource>::LFPCalculator(const lfputils::Point3Ds& seg_start,\n                                                          const lfputils::Point3Ds& seg_end,\n                                                          const std::vector<double>& radius,\n                                                          const std::vector<int>& segment_ids,\n                                                          const lfputils::Point3Ds& electrodes,\n                                                          double extra_cellular_conductivity);\nextern template void LFPCalculator<LineSource>::lfp(const lfputils::DoublePtr& membrane_current);\nextern template void LFPCalculator<PointSource>::lfp(const lfputils::DoublePtr& membrane_current);\nextern template void 
LFPCalculator<LineSource>::lfp(const std::vector<double>& membrane_current);\nextern template void LFPCalculator<PointSource>::lfp(const std::vector<double>& membrane_current);\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/mech_report.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <iostream>\n#include <vector>\n\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/io/nrn_setup.hpp\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n\nnamespace coreneuron {\n/** display global mechanism count */\nvoid write_mech_report() {\n    /// mechanim count across all gids, local to rank\n    const auto n_memb_func = corenrn.get_memb_funcs().size();\n    std::vector<long> local_mech_count(n_memb_func, 0);\n    std::vector<long> local_mech_size(n_memb_func, 0);\n\n    /// each gid record goes on separate row, only check non-empty threads\n    for (int i = 0; i < nrn_nthread; i++) {\n        const auto& nt = nrn_threads[i];\n        for (auto* tml = nt.tml; tml; tml = tml->next) {\n            const int type = tml->index;\n            const auto& ml = tml->ml;\n            local_mech_count[type] += ml->nodecount;\n            local_mech_size[type] = memb_list_size(tml, true);\n        }\n    }\n\n    std::vector<long> total_mech_count(n_memb_func);\n    std::vector<long> total_mech_size(n_memb_func);\n\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        /// get global sum of all mechanism instances\n        nrnmpi_long_allreduce_vec(&local_mech_count[0],\n                                  &total_mech_count[0],\n                                  local_mech_count.size(),\n                                  1);\n        nrnmpi_long_allreduce_vec(&local_mech_size[0],\n                                  &total_mech_size[0],\n                                  local_mech_size.size(),\n                                  1);\n    } else\n#endif\n    {\n        total_mech_count = local_mech_count;\n        
total_mech_size = local_mech_size;\n    }\n\n    /// print global stats to stdout\n    if (nrnmpi_myid == 0) {\n        printf(\"\\n============== MECHANISMS COUNT AND SIZE BY TYPE =============\\n\");\n        printf(\"%4s %20s %10s %25s\\n\", \"Id\", \"Name\", \"Count\", \"Total memory size (KiB)\");\n        for (size_t i = 0; i < total_mech_count.size(); i++) {\n            if (total_mech_count[i] > 0) {\n                printf(\"%4zu %20s %10ld %25.2lf\\n\",\n                       i,\n                       nrn_get_mechname(i),\n                       total_mech_count[i],\n                       static_cast<double>(total_mech_size[i]) / 1024);\n            }\n        }\n        printf(\"==============================================================\\n\");\n    }\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/mech_report.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include <string>\n\nnamespace coreneuron {\n/// write mechanism counts to stdout\nvoid write_mech_report();\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/mem_layout_util.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"mem_layout_util.hpp\"\n\nnamespace coreneuron {\n\n/// calculate size after padding for specific memory layout\n// Warning: this function is declared extern in nrniv_decl.h\nint nrn_soa_padded_size(int cnt, int layout) {\n    return soa_padded_size<NRN_SOA_PAD>(cnt, layout);\n}\n\n/// return the new offset considering the byte aligment settings\nsize_t nrn_soa_byte_align(size_t size) {\n    static_assert(NRN_SOA_BYTE_ALIGN % sizeof(double) == 0,\n                  \"NRN_SOA_BYTE_ALIGN should be a multiple of sizeof(double)\");\n    constexpr size_t dbl_align{NRN_SOA_BYTE_ALIGN / sizeof(double)};\n    size_t remainder{size % dbl_align};\n    if (remainder) {\n        size += dbl_align - remainder;\n    }\n    nrn_assert((size * sizeof(double)) % NRN_SOA_BYTE_ALIGN == 0);\n    return size;\n}\n\nint nrn_i_layout(int icnt, int cnt, int isz, int sz, int layout) {\n    switch (layout) {\n        case Layout::AoS:\n            return icnt * sz + isz;\n        case Layout::SoA:\n            int padded_cnt = nrn_soa_padded_size(cnt,\n                                                 layout);  // may want to factor out to save time\n            return icnt + isz * padded_cnt;\n    }\n\n    nrn_assert(false);\n    return 0;\n}\n\n// file data is AoS. ie.\n// organized as cnt array instances of mtype each of size sz.\n// So input index i refers to i_instance*sz + i_item offset\n// Return the corresponding SoA index -- taking into account the\n// alignment requirements. Ie. 
i_instance + i_item*align_cnt.\n\nint nrn_param_layout(int i, int mtype, Memb_list* ml) {\n    int layout = corenrn.get_mech_data_layout()[mtype];\n    switch (layout) {\n        case Layout::AoS:\n            return i;\n        case Layout::SoA:\n            nrn_assert(layout == Layout::SoA);\n            int sz = corenrn.get_prop_param_size()[mtype];\n            return nrn_i_layout(i / sz, ml->nodecount, i % sz, sz, layout);\n    }\n    nrn_assert(false);\n    return 0;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/mem_layout_util.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n\nnamespace coreneuron {\n\n#if !defined(NRN_SOA_PAD)\n// for layout 0, every range variable array must have a size which\n// is a multiple of NRN_SOA_PAD doubles\n#define NRN_SOA_PAD 8\n#endif\n\n/// return the new offset considering the byte alignment settings\nsize_t nrn_soa_byte_align(size_t i);\n\n/// This function returns the index in a flat array of a matrix coordinate (icnt, isz).\n/// The matrix size is (cnt, sz)\n/// Depending on the layout, some padding may be applied\nint nrn_i_layout(int icnt, int cnt, int isz, int sz, int layout);\n\n// file data is AoS. ie.\n// organized as cnt array instances of mtype each of size sz.\n// So input index i refers to i_instance*sz + i_item offset\n// Return the corresponding SoA index -- taking into account the\n// alignment requirements. Ie. i_instance + i_item*align_cnt.\n\nint nrn_param_layout(int i, int mtype, Memb_list* ml);\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/mk_mech.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <cstring>\n#include <map>\n#include <iostream>\n#include <fstream>\n#include <sstream>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/membrane_definitions.h\"\n#include \"coreneuron/mechanism/register_mech.hpp\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/mechanism/mech/cfile/cabvars.h\"\n#include \"coreneuron/io/nrn2core_direct.h\"\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/mechanism//eion.hpp\"\n\nstatic char banner[] = \"Duke, Yale, and the BlueBrain Project -- Copyright 1984-2020\";\n\nnamespace coreneuron {\nextern int nrn_nobanner_;\n\n// NB: this should go away\nextern std::string cnrn_version();\nstd::map<std::string, int> mech2type;\n\nextern \"C\" {\nvoid (*nrn2core_mkmech_info_)(std::ostream&);\n}\nstatic void mk_mech();\nstatic void mk_mech(std::istream&);\n\n/// Read meta data about the mechanisms and allocate corresponding mechanism management data\n/// structures\nvoid mk_mech(const char* datpath) {\n    if (corenrn_embedded) {\n        // we are embedded in NEURON\n        mk_mech();\n        return;\n    }\n    {\n        std::string fname = std::string(datpath) + \"/bbcore_mech.dat\";\n        std::ifstream fs(fname);\n\n        if (!fs.good()) {\n            fprintf(stderr,\n                    \"Error: couldn't find bbcore_mech.dat file in the dataset directory \\n\");\n            fprintf(stderr,\n                    \"       Make sure to pass full directory path of dataset using -d DIR or \"\n                    \"--datpath=DIR \\n\");\n        }\n\n        nrn_assert(fs.good());\n        mk_mech(fs);\n        
fs.close();\n    }\n}\n\n// we are embedded in NEURON, get info as stringstream from nrnbbcore_write.cpp\nstatic void mk_mech() {\n    static bool already_called = false;\n    if (already_called) {\n        return;\n    }\n    std::stringstream ss;\n    nrn_assert(nrn2core_mkmech_info_);\n    (*nrn2core_mkmech_info_)(ss);\n    mk_mech(ss);\n    already_called = true;\n}\n\nstatic void mk_mech(std::istream& s) {\n    char version[256];\n    s >> version;\n    check_bbcore_write_version(version);\n\n    //  printf(\"reading %s\\n\", fname);\n    int n = 0;\n    nrn_assert(s >> n);\n\n    /// Allocate space for mechanism related data structures\n    alloc_mech(n);\n\n    /// Read all the mechanisms and their meta data\n    for (int i = 2; i < n; ++i) {\n        char mname[100];\n        int type = 0, pnttype = 0, is_art = 0, is_ion = 0, dsize = 0, pdsize = 0;\n        nrn_assert(s >> mname >> type >> pnttype >> is_art >> is_ion >> dsize >> pdsize);\n        nrn_assert(i == type);\n#ifdef DEBUG\n        printf(\"%s %d %d %d %d %d %d\\n\", mname, type, pnttype, is_art, is_ion, dsize, pdsize);\n#endif\n        std::string str(mname);\n        corenrn.get_memb_func(type).sym = (Symbol*) strdup(mname);\n        mech2type[str] = type;\n        corenrn.get_pnt_map()[type] = (char) pnttype;\n        corenrn.get_prop_param_size()[type] = dsize;\n        corenrn.get_prop_dparam_size()[type] = pdsize;\n        corenrn.get_is_artificial()[type] = is_art;\n        if (is_ion) {\n            double charge = 0.;\n            nrn_assert(s >> charge);\n            // strip the _ion\n            char iname[100];\n            strcpy(iname, mname);\n            iname[strlen(iname) - 4] = '\\0';\n            // printf(\"%s %s\\n\", mname, iname);\n            ion_reg(iname, charge);\n        }\n        // printf(\"%s %d %d\\n\", mname, nrn_get_mechtype(mname), type);\n    }\n\n    if (nrnmpi_myid < 1 && nrn_nobanner_ == 0) {\n        fprintf(stderr, \" \\n\");\n        fprintf(stderr, \" 
%s\\n\", banner);\n        fprintf(stderr, \" Version : %s\\n\", cnrn_version().c_str());\n        fprintf(stderr, \" \\n\");\n        fflush(stderr);\n    }\n    /* will have to put this back if any mod file refers to diam */\n    //\tregister_mech(morph_mech, morph_alloc, (Pfri)0, (Pfri)0, (Pfri)0, (Pfri)0, -1, 0);\n\n    /// Calling _reg functions for the default mechanisms from the file mech/cfile/cabvars.h\n    for (int i = 0; mechanism[i]; i++) {\n        (*mechanism[i])();\n    }\n}\n\n/// Get mechanism type by the mechanism name\nint nrn_get_mechtype(const char* name) {\n    auto mapit = mech2type.find(name);\n    if (mapit == mech2type.end())\n        return -1;  // Could not find the mechanism\n    return mapit->second;\n}\n\nconst char* nrn_get_mechname(int type) {\n    for (const auto& item: mech2type) {\n        if (type == item.second) {\n            return item.first.c_str();\n        }\n    }\n    return nullptr;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/nrn2core_data_init.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#include <sstream>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/network/netpar.hpp\"\n#include \"coreneuron/network/netcvode.hpp\"\n#include \"coreneuron/sim/fast_imem.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/utils/profile/profiler_interface.h\"\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n#include \"coreneuron/io/mem_layout_util.hpp\"  // for WATCH use of nrn_i_layout\n#include \"coreneuron/utils/vrecitem.h\"\n#include \"coreneuron/io/core2nrn_data_return.hpp\"\n\nnamespace coreneuron {\n\n// helper functions defined below.\nstatic void nrn2core_tqueue();\nstatic void watch_activate_clear();\nstatic void nrn2core_transfer_watch_condition(int, int, int, int, int);\nstatic void vec_play_activate();\nstatic void nrn2core_patstim_share_info();\n\nextern \"C\" {\n/** Pointer to function in NEURON that iterates over activated\n    WATCH statements, sending each item to ...\n**/\nvoid (*nrn2core_transfer_watch_)(void (*cb)(int, int, int, int, int));\n}\n\n/**\n  All state from NEURON necessary to continue a run.\n\n  In NEURON direct mode, we desire the exact behavior of\n  ParallelContext.psolve(tstop). I.e. a sequence of such calls with and\n  without intervening calls to h.finitialize(). Most state (structure\n  and data of the substantive model) has been copied\n  from NEURON during nrn_setup. Now we need to copy the event queue\n  and set up any other invalid internal structures. I.e basically the\n  nrn_finitialize above but without changing any simulation data. 
We follow\n  some of the strategy of checkpoint_initialize.\n**/\nvoid direct_mode_initialize() {\n    dt2thread(-1.);\n    nrn_thread_table_check();\n\n    clear_event_queue();\n\n    // Reproduce present NEURON WATCH activation\n    // Start from nothing active.\n    watch_activate_clear();\n    // nrn2core_transfer_watch_condition(...) receives the WATCH activation info\n    // on a per active WatchCondition basis from NEURON.\n    (*nrn2core_transfer_watch_)(nrn2core_transfer_watch_condition);\n\n    nrn_spike_exchange_init();\n\n    // the things done by checkpoint restore at the end of Phase2::read_file\n    // vec_play_continuous n_vec_play_continuous of them\n    // patstim_index\n    // preSynConditionEventFlags nt.n_presyn of them\n    // restore_events\n    // restore_events\n    // the things done for checkpoint at the end of Phase2::populate\n    // checkpoint_restore_tqueue\n    // Lastly, if PatternStim exists, needs initialization\n    // checkpoint_restore_patternstim\n    // io/nrn_checkpoint.cpp: write_tqueue contains examples for each\n    // DiscreteEvent type with regard to the information needed for each\n    // subclass from the point of view of CoreNEURON.\n    // E.g. 
for NetConType_, just netcon_index\n    // The trick, then, is to figure out the CoreNEURON info from the\n    // NEURON queue items and that should be available in passing from\n    // the existing processing of nrncore_write.\n\n    // activate the vec_play_continuous events defined in phase2 setup.\n    vec_play_activate();\n\n    // Any PreSyn.flag_ == 1 on the NEURON side needs to be transferred\n    // or the PreSyn will spuriously fire when psolve starts.\n    extern void nrn2core_PreSyn_flag_receive(int tid);\n    for (int tid = 0; tid < nrn_nthread; ++tid) {\n        nrn2core_PreSyn_flag_receive(tid);\n    }\n\n    nrn2core_patstim_share_info();\n\n    nrn2core_tqueue();\n}\n\nvoid vec_play_activate() {\n    for (int tid = 0; tid < nrn_nthread; ++tid) {\n        NrnThread* nt = nrn_threads + tid;\n        for (int i = 0; i < nt->n_vecplay; ++i) {\n            PlayRecord* pr = (PlayRecord*) nt->_vecplay[i];\n            assert(pr->type() == VecPlayContinuousType);\n            VecPlayContinuous* vpc = (VecPlayContinuous*) pr;\n            assert(vpc->e_);\n            assert(vpc->discon_indices_ == nullptr);  // not implemented\n            vpc->e_->send(vpc->t_[vpc->ubound_index_], net_cvode_instance, nt);\n        }\n    }\n}\n\n}  // namespace coreneuron\n\n// For direct transfer of event queue information\n// Must be the same as corresponding struct NrnCoreTransferEvents in NEURON\n// Do not put this coreneuron version in the coreneuron namespace so that the\n// function pointer/callback has the same type in both NEURON and CoreNEURON.\n// Calling a function through a pointer to a function of different type is\n// undefined behaviour.\nstruct NrnCoreTransferEvents {\n    std::vector<int> type;        // DiscreteEvent type\n    std::vector<double> td;       // delivery time\n    std::vector<int> intdata;     // ints specific to the DiscreteEvent type\n    std::vector<double> dbldata;  // doubles specific to the type.\n};\n\nnamespace coreneuron 
{\n\nextern \"C\" {\n/** Pointer to function in NEURON that iterates over its tqueue **/\nNrnCoreTransferEvents* (*nrn2core_transfer_tqueue_)(int tid);\n}\n\n// for faster determination of the movable index given the type\nstatic std::unordered_map<int, int> type2movable;\nstatic void setup_type2semantics() {\n    if (type2movable.empty()) {\n        std::size_t const n_memb_func{corenrn.get_memb_funcs().size()};\n        for (std::size_t type = 0; type < n_memb_func; ++type) {\n            int* ds{corenrn.get_memb_func(type).dparam_semantics};\n            if (ds) {\n                int dparam_size = corenrn.get_prop_dparam_size()[type];\n                for (int psz = 0; psz < dparam_size; ++psz) {\n                    if (ds[psz] == -4) {  // netsend semantics\n                        type2movable[type] = psz;\n                    }\n                }\n            }\n        }\n    }\n}\n\n/** Copy each thread's queue from NEURON **/\nstatic void nrn2core_tqueue() {\n    setup_type2semantics();                        // need type2movable for SelfEvent.\n    for (int tid = 0; tid < nrn_nthread; ++tid) {  // should be parallel\n        NrnCoreTransferEvents* ncte = (*nrn2core_transfer_tqueue_)(tid);\n        if (ncte) {\n            size_t idat = 0;\n            size_t idbldat = 0;\n            NrnThread& nt = nrn_threads[tid];\n            for (size_t i = 0; i < ncte->type.size(); ++i) {\n                switch (ncte->type[i]) {\n                    case 0: {  // DiscreteEvent\n                               // Ignore\n                    } break;\n\n                    case 2: {  // NetCon\n                        int ncindex = ncte->intdata[idat++];\n                        NetCon* nc = nt.netcons + ncindex;\n#ifndef CORENRN_DEBUG_QUEUE\n#define CORENRN_DEBUG_QUEUE 0\n#endif\n#if CORENRN_DEBUG_QUEUE\n                        printf(\"nrn2core_tqueue tid=%d i=%zd type=%d tdeliver=%g NetCon %d\\n\",\n                               tid,\n                           
    i,\n                               ncte->type[i],\n                               ncte->td[i],\n                               ncindex);\n#endif\n                        nc->send(ncte->td[i], net_cvode_instance, &nt);\n                    } break;\n\n                    case 3: {  // SelfEvent\n                        // target_type, target_instance, weight_index, flag movable\n\n                        // This is a nightmare and needs to be profoundly re-imagined.\n\n                        // Determine Point_process*\n                        int target_type = ncte->intdata[idat++];\n                        int target_instance = ncte->intdata[idat++];\n                        // From target_type and target_instance (mechanism data index)\n                        // compute the nt.pntprocs index.\n                        int offset = nt._pnt_offset[target_type];\n                        Point_process* pnt = nt.pntprocs + offset + target_instance;\n                        assert(pnt->_type == target_type);\n                        Memb_list* ml = nt._ml_list[target_type];\n                        if (ml->_permute) {\n                            target_instance = ml->_permute[target_instance];\n                        }\n                        assert(pnt->_i_instance == target_instance);\n                        assert(pnt->_tid == tid);\n\n                        // Determine weight_index\n                        int netcon_index = ncte->intdata[idat++];  // via the NetCon\n                        int weight_index = -1;                     // no associated netcon\n                        if (netcon_index >= 0) {\n                            weight_index = nt.netcons[netcon_index].u.weight_index_;\n                        }\n\n                        double flag = ncte->dbldata[idbldat++];\n                        int is_movable = ncte->intdata[idat++];\n                        // If the queue item is movable, then the pointer needs to be\n                       
 // stored in the mechanism instance movable slot by net_send.\n                        // And don't overwrite if not movable. Only one SelfEvent\n                        // for a given target instance is movable.\n                        int movable_index =\n                            nrn_i_layout(target_instance,\n                                         ml->nodecount,\n                                         type2movable[target_type],\n                                         corenrn.get_prop_dparam_size()[target_type],\n                                         corenrn.get_mech_data_layout()[target_type]);\n                        void** movable_arg = nt._vdata + ml->pdata[movable_index];\n                        TQItem* old_movable_arg = (TQItem*) (*movable_arg);\n#if CORENRN_DEBUG_QUEUE\n                        printf(\"nrn2core_tqueue tid=%d i=%zd type=%d tdeliver=%g SelfEvent\\n\",\n                               tid,\n                               i,\n                               ncte->type[i],\n                               ncte->td[i]);\n                        printf(\n                            \"  target_type=%d pnt data index=%d flag=%g is_movable=%d netcon index \"\n                            \"for weight=%d\\n\",\n                            target_type,\n                            target_instance,\n                            flag,\n                            is_movable,\n                            netcon_index);\n#endif\n                        net_send(movable_arg, weight_index, pnt, ncte->td[i], flag);\n                        if (!is_movable) {\n                            *movable_arg = (void*) old_movable_arg;\n                        }\n                    } break;\n\n                    case 4: {  // PreSyn\n                        int type = ncte->intdata[idat++];\n                        if (type == 0) {  // CoreNEURON PreSyn\n                            int ps_index = ncte->intdata[idat++];\n#if CORENRN_DEBUG_QUEUE\n           
                 printf(\"nrn2core_tqueue tid=%d i=%zd type=%d tdeliver=%g PreSyn %d\\n\",\n                                   tid,\n                                   i,\n                                   ncte->type[i],\n                                   ncte->td[i],\n                                   ps_index);\n#endif\n                            PreSyn* ps = nt.presyns + ps_index;\n                            int gid = ps->output_index_;\n                            // Following assumes already sent to other machines.\n                            ps->output_index_ = -1;\n                            ps->send(ncte->td[i], net_cvode_instance, &nt);\n                            ps->output_index_ = gid;\n                        } else {  // CoreNEURON InputPreSyn\n                            int gid = ncte->intdata[idat++];\n                            InputPreSyn* ps = gid2in[gid];\n                            ps->send(ncte->td[i], net_cvode_instance, &nt);\n                        }\n                    } break;\n\n                    case 6: {  // PlayRecordEvent\n                               // Ignore as phase2 handles analogous to checkpoint restore.\n                    } break;\n\n                    case 7: {  // NetParEvent\n#if CORENRN_DEBUG_QUEUE\n                        printf(\"nrn2core_tqueue tid=%d i=%zd type=%d tdeliver=%g NetParEvent\\n\",\n                               tid,\n                               i,\n                               ncte->type[i],\n                               ncte->td[i]);\n#endif\n                    } break;\n\n                    default: {\n                        std::stringstream qetype;\n                        qetype << ncte->type[i];\n                        hoc_execerror(\"Unimplemented transfer queue event type:\",\n                                      qetype.str().c_str());\n                    } break;\n                }\n            }\n            delete ncte;\n        }\n    }\n}\n\n/** @brief return 
first and last datum indices of WATCH statements\n */\nvoid watch_datum_indices(int type, int& first, int& last) {\n    int* semantics = corenrn.get_memb_func(type).dparam_semantics;\n    int dparam_size = corenrn.get_prop_dparam_size()[type];\n    // which slots are WATCH\n    // Note that first is the WatchList item, not the WatchCondition\n    first = -1;\n    last = 0;\n    for (int i = 0; i < dparam_size; ++i) {\n        if (semantics[i] == -8) {  // WATCH\n            if (first == -1) {\n                first = i;\n            }\n            last = i;\n        }\n    }\n}\n\nvoid watch_activate_clear() {\n    // Can identify mechanisms with WATCH statements from non-NULL\n    // corenrn.get_watch_check()[type] and figure out pdata that are\n    // _watch_array items from corenrn.get_memb_func(type).dparam_semantics\n    // Ironically, all WATCH statements may already be inactivated in\n    // consequence of phase2 transfer. But, for direct mode psolve, we would\n    // eventually like to minimise that transfer (at least with respect to\n    // structure).\n\n    // Loop over threads, mechanisms and pick out the ones with WATCH statements.\n    for (int tid = 0; tid < nrn_nthread; ++tid) {\n        NrnThread& nt = nrn_threads[tid];\n        for (NrnThreadMembList* tml = nt.tml; tml; tml = tml->next) {\n            if (corenrn.get_watch_check()[tml->index]) {\n                // zero all the WATCH slots.\n                Memb_list* ml = tml->ml;\n                int type = tml->index;\n                int dparam_size = corenrn.get_prop_dparam_size()[type];\n                // which slots are WATCH\n                int first, last;\n                watch_datum_indices(type, first, last);\n                // Zero the _watch_array from first to last inclusive.\n                // Note: the first is actually unused but is there because NEURON\n                // uses it. 
There is probably a better way to do this.\n                int* pdata = ml->pdata;\n                int nodecount = ml->nodecount;\n                int layout = corenrn.get_mech_data_layout()[type];\n                for (int iml = 0; iml < nodecount; ++iml) {\n                    for (int i = first; i <= last; ++i) {\n                        int* pd = pdata + nrn_i_layout(iml, nodecount, i, dparam_size, layout);\n                        *pd = 0;\n                    }\n                }\n            }\n        }\n    }\n}\n\nvoid nrn2core_transfer_watch_condition(int tid,\n                                       int pnttype,\n                                       int pntindex,\n                                       int watch_index,\n                                       int triggered) {\n    // Note: watch_index relative to AoS _ppvar for instance.\n    NrnThread& nt = nrn_threads[tid];\n    int pntoffset = nt._pnt_offset[pnttype];\n    Point_process* pnt = nt.pntprocs + (pntoffset + pntindex);\n    assert(pnt->_type == pnttype);\n    Memb_list* ml = nt._ml_list[pnttype];\n    if (ml->_permute) {\n        pntindex = ml->_permute[pntindex];\n    }\n    assert(pnt->_i_instance == pntindex);\n    assert(pnt->_tid == tid);\n\n    // perhaps all this should be more closely associated with phase2 since\n    // we are really talking about (direct) transfer from NEURON and not able\n    // to rely on finitialize() on the CoreNEURON side which would otherwise\n    // set up all this stuff as a consequence of SelfEvents initiated\n    // and delivered at time 0.\n    // I've become shakey in regard to how this is done since the reorganization\n    // from where everything was done in nrn_setup.cpp. 
Here, I'm guessing\n    // nrn_i_layout is the relevant index transformation after finding the\n    // beginning of the mechanism pdata.\n    int* pdata = ml->pdata;\n    int iml = pntindex;\n    int nodecount = ml->nodecount;\n    int i = watch_index;\n    int dparam_size = corenrn.get_prop_dparam_size()[pnttype];\n    int layout = corenrn.get_mech_data_layout()[pnttype];\n    int* pd = pdata + nrn_i_layout(iml, nodecount, i, dparam_size, layout);\n\n    // activate the WatchCondition\n    *pd = 2 + triggered;\n}\n\n// PatternStim direct mode\n// NEURON and CoreNEURON had different definitions for struct Info but\n// the NEURON version of pattern.mod for PatternStim was changed to\n// adopt the CoreNEURON version (along with THREADSAFE so they have the\n// same param size). So they now both share the same\n// instance of Info and NEURON is responsible for constructor/destructor.\n// And in direct mode, PatternStim gets no special treatment except that\n// on the CoreNEURON side, the Info struct points to the NEURON instance.\n\n// from patstim.mod\nextern void** pattern_stim_info_ref(int icnt,\n                                    int cnt,\n                                    double* _p,\n                                    Datum* _ppvar,\n                                    ThreadDatum* _thread,\n                                    NrnThread* _nt,\n                                    Memb_list* ml,\n                                    double v);\n\nextern \"C\" {\nvoid (*nrn2core_patternstim_)(void** info);\n}\n\n// In direct mode, CoreNEURON and NEURON share the same PatternStim Info\n// Assume singleton for PatternStim but that is not really necessary in principle.\nvoid nrn2core_patstim_share_info() {\n    int type = nrn_get_mechtype(\"PatternStim\");\n    NrnThread* nt = nrn_threads + 0;\n    Memb_list* ml = nt->_ml_list[type];\n    if (ml) {\n        int layout = corenrn.get_mech_data_layout()[type];\n        int sz = corenrn.get_prop_param_size()[type];\n    
    int psz = corenrn.get_prop_dparam_size()[type];\n        int _cntml = ml->nodecount;\n        assert(ml->nodecount == 1);\n        int _iml = 0;  // Assume singleton here and in (*nrn2core_patternstim_)(info) below.\n        double* _p = ml->data;\n        Datum* _ppvar = ml->pdata;\n        if (layout == Layout::AoS) {\n            _p += _iml * sz;\n            _ppvar += _iml * psz;\n        } else if (layout == Layout::SoA) {\n            ;\n        } else {\n            assert(0);\n        }\n\n        void** info = pattern_stim_info_ref(_iml, _cntml, _p, _ppvar, nullptr, nt, ml, 0.0);\n        (*nrn2core_patternstim_)(info);\n    }\n}\n\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/nrn2core_direct.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include <iostream>\n#include <vector>\n\nextern \"C\" {\n// The callbacks into nrn/src/nrniv/nrnbbcore_write.cpp to get\n// data directly instead of via files.\n\nextern bool corenrn_embedded;\nextern int corenrn_embedded_nthread;\n\nextern void (*nrn2core_group_ids_)(int*);\n\nextern void (*nrn2core_mkmech_info_)(std::ostream&);\n\nextern void* (*nrn2core_get_global_dbl_item_)(void*, const char*& name, int& size, double*& val);\nextern int (*nrn2core_get_global_int_item_)(const char* name);\n\nextern int (*nrn2core_get_dat1_)(int tid,\n                                 int& n_presyn,\n                                 int& n_netcon,\n                                 int*& output_gid,\n                                 int*& netcon_srcgid,\n                                 std::vector<int>& netcon_negsrcgid_tid);\n\nextern int (*nrn2core_get_dat2_1_)(int tid,\n                                   int& n_real_cell,\n                                   int& ngid,\n                                   int& n_real_gid,\n                                   int& nnode,\n                                   int& ndiam,\n                                   int& nmech,\n                                   int*& tml_index,\n                                   int*& ml_nodecount,\n                                   int& nidata,\n                                   int& nvdata,\n                                   int& nweight);\n\nextern int (*nrn2core_get_dat2_2_)(int tid,\n                                   int*& v_parent_index,\n                                   double*& a,\n                                   double*& b,\n                                   double*& area,\n                  
                 double*& v,\n                                   double*& diamvec);\n\nextern int (*nrn2core_get_dat2_mech_)(int tid,\n                                      size_t i,\n                                      int dsz_inst,\n                                      int*& nodeindices,\n                                      double*& data,\n                                      int*& pdata,\n                                      std::vector<int>& pointer2type);\n\nextern int (*nrn2core_get_dat2_3_)(int tid,\n                                   int nweight,\n                                   int*& output_vindex,\n                                   double*& output_threshold,\n                                   int*& netcon_pnttype,\n                                   int*& netcon_pntindex,\n                                   double*& weights,\n                                   double*& delays);\n\nextern int (*nrn2core_get_dat2_corepointer_)(int tid, int& n);\n\nextern int (*nrn2core_get_dat2_corepointer_mech_)(int tid,\n                                                  int type,\n                                                  int& icnt,\n                                                  int& dcnt,\n                                                  int*& iarray,\n                                                  double*& darray);\n\nextern int (*nrn2core_get_dat2_vecplay_)(int tid, std::vector<int>& indices);\n\nextern int (*nrn2core_get_dat2_vecplay_inst_)(int tid,\n                                              int i,\n                                              int& vptype,\n                                              int& mtype,\n                                              int& ix,\n                                              int& sz,\n                                              double*& yvec,\n                                              double*& tvec,\n                                              int& last_index,\n                           
                   int& discon_index,\n                                              int& ubound_index);\n\nextern void (*nrn2core_part2_clean_)();\n\n/* what variables to send back to NEURON on each time step */\nextern void (*nrn2core_get_trajectory_requests_)(int tid,\n                                                 int& bsize,\n                                                 int& n_pr,\n                                                 void**& vpr,\n                                                 int& n_trajec,\n                                                 int*& types,\n                                                 int*& indices,\n                                                 double**& pvars,\n                                                 double**& varrays);\n\n/* send values to NEURON on each time step */\nextern void (*nrn2core_trajectory_values_)(int tid, int n_pr, void** vpr, double t);\n\n/* Fill the Vector data arrays and send back the sizes at end of run */\nextern void (\n    *nrn2core_trajectory_return_)(int tid, int n_pr, int bsize, int vecsz, void** vpr, double t);\n\n/* send all spike vectors to NEURON */\nextern int (*nrn2core_all_spike_vectors_return_)(std::vector<double>& spikevec,\n                                                 std::vector<int>& gidvec);\n\n/* send all weights to NEURON */\nextern void (*nrn2core_all_weights_return_)(std::vector<double*>& weights);\n\n/* get data array pointer from NEURON to copy into. */\nextern size_t (*nrn2core_type_return_)(int type, int tid, double*& data, double**& mdata);\n}  // extern \"C\"\n"
  },
  {
    "path": "coreneuron/io/nrn_checkpoint.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#include <iostream>\n#include <sstream>\n#include <cassert>\n#include <memory>\n\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/io/nrn_filehandler.hpp\"\n#include \"coreneuron/io/nrn_checkpoint.hpp\"\n#include \"coreneuron/io/nrn_setup.hpp\"\n#include \"coreneuron/network/netcvode.hpp\"\n#include \"coreneuron/network/netpar.hpp\"\n#include \"coreneuron/utils/vrecitem.h\"\n#include \"coreneuron/mechanism/mech/mod2c_core_thread.hpp\"\n#include \"coreneuron/io/file_utils.hpp\"\n#include \"coreneuron/permute/data_layout.hpp\"\n#include \"coreneuron/permute/node_permute.h\"\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n\nnamespace coreneuron {\n// Those functions comes from mod file directly\nextern int checkpoint_save_patternstim(_threadargsproto_);\nextern void checkpoint_restore_patternstim(int, double, _threadargsproto_);\n\nCheckPoints::CheckPoints(const std::string& save, const std::string& restore)\n    : save_(save)\n    , restore_(restore)\n    , restored(false) {\n    if (!save.empty()) {\n        if (nrnmpi_myid == 0) {\n            mkdir_p(save.c_str());\n        }\n    }\n}\n\n/// todo : need to broadcast this rather than all reading a double\ndouble CheckPoints::restore_time() const {\n    if (!should_restore()) {\n        return 0.;\n    }\n\n    double rtime = 0.;\n    FileHandler f;\n    std::string filename = restore_ + \"/time.dat\";\n    f.open(filename, std::ios::in);\n    f.read_array(&rtime, 1);\n    f.close();\n    return rtime;\n}\n\nvoid CheckPoints::write_checkpoint(NrnThread* nt, int nb_threads) const {\n    
if (!should_save()) {\n        return;\n    }\n\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        nrnmpi_barrier();\n    }\n#endif\n\n    /**\n     * if openmp threading needed:\n     *  #pragma omp parallel for private(i) shared(nt, nb_threads) schedule(runtime)\n     */\n    for (int i = 0; i < nb_threads; i++) {\n        if (nt[i].ncell || nt[i].tml) {\n            write_phase2(nt[i]);\n        }\n    }\n\n    if (nrnmpi_myid == 0) {\n        write_time();\n    }\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        nrnmpi_barrier();\n    }\n#endif\n}\n\n// Factor out the body of ion handling below as the same code\n// handles POINTER\nstatic int nrn_original_aos_index(int etype, int ix, NrnThread& nt, int** ml_pinv) {\n    // Determine ei_instance and ei from etype and ix.\n    // Deal with existing permutation and SoA.\n    Memb_list* eml = nt._ml_list[etype];\n    int ecnt = eml->nodecount;\n    int esz = corenrn.get_prop_param_size()[etype];\n    int elayout = corenrn.get_mech_data_layout()[etype];\n    // current index into eml->data is a  function\n    // of elayout, eml._permute, ei_instance, ei, and\n    // eml padding.\n    int p = ix - (eml->data - nt._data);\n    assert(p >= 0 && p < eml->_nodecount_padded * esz);\n    int ei_instance, ei;\n    nrn_inverse_i_layout(p, ei_instance, ecnt, ei, esz, elayout);\n    if (elayout == Layout::SoA) {\n        if (eml->_permute) {\n            if (!ml_pinv[etype]) {\n                ml_pinv[etype] = inverse_permute(eml->_permute, eml->nodecount);\n            }\n            ei_instance = ml_pinv[etype][ei_instance];\n        }\n    }\n    return ei_instance * esz + ei;\n}\n\nvoid CheckPoints::write_phase2(NrnThread& nt) const {\n    FileHandler fh;\n\n    NrnThreadChkpnt& ntc = nrnthread_chkpnt[nt.id];\n    auto filename = get_save_path() + \"/\" + std::to_string(ntc.file_id) + \"_2.dat\";\n\n    fh.open(filename, std::ios::out);\n    fh.checkpoint(2);\n\n    int n_outputgid = 0;  // calculate PreSyn 
with gid >= 0\n    for (int i = 0; i < nt.n_presyn; ++i) {\n        if (nt.presyns[i].gid_ >= 0) {\n            ++n_outputgid;\n        }\n    }\n\n    fh << nt.ncell << \" ncell\\n\";\n    fh << n_outputgid << \" ngid\\n\";\n#if CHKPNTDEBUG\n    assert(ntc.n_outputgids == n_outputgid);\n#endif\n\n    fh << nt.n_real_output << \" n_real_output\\n\";\n    fh << nt.end << \" nnode\\n\";\n    fh << ((nt._actual_diam == nullptr) ? 0 : nt.end) << \" ndiam\\n\";\n    int nmech = 0;\n    for (NrnThreadMembList* tml = nt.tml; tml; tml = tml->next) {\n        if (tml->index != patstimtype) {  // skip PatternStim\n            ++nmech;\n        }\n    }\n\n    fh << nmech << \" nmech\\n\";\n#if CHKPNTDEBUG\n    assert(nmech == ntc.nmech);\n#endif\n\n    for (NrnThreadMembList* current_tml = nt.tml; current_tml; current_tml = current_tml->next) {\n        if (current_tml->index == patstimtype) {\n            continue;\n        }\n        fh << current_tml->index << \"\\n\";\n        fh << current_tml->ml->nodecount << \"\\n\";\n    }\n\n    fh << nt._nidata << \" nidata\\n\";\n    fh << nt._nvdata << \" nvdata\\n\";\n    fh << nt.n_weight << \" nweight\\n\";\n\n    // see comment about parent in node_permute.cpp\n    int* pinv_nt = nullptr;\n    if (nt._permute) {\n        int* d = new int[nt.end];\n        pinv_nt = inverse_permute(nt._permute, nt.end);\n        for (int i = 0; i < nt.end; ++i) {\n            int x = nt._v_parent_index[nt._permute[i]];\n            if (x >= 0) {\n                d[i] = pinv_nt[x];\n            } else {\n                d[i] = 0;  // really should be -1;\n            }\n        }\n#if CHKPNTDEBUG\n        for (int i = 0; i < nt.end; ++i) {\n            assert(d[i] == ntc.parent[i]);\n        }\n#endif\n        fh.write_array<int>(d, nt.end);\n        delete[] d;\n    } else {\n#if CHKPNTDEBUG\n        for (int i = 0; i < nt.end; ++i) {\n            assert(nt._v_parent_index[i] == ntc.parent[i]);\n        }\n#endif\n        
fh.write_array<int>(nt._v_parent_index, nt.end);\n        pinv_nt = new int[nt.end];\n        for (int i = 0; i < nt.end; ++i) {\n            pinv_nt[i] = i;\n        }\n    }\n\n    data_write(fh, nt._actual_a, nt.end, 1, 0, nt._permute);\n    data_write(fh, nt._actual_b, nt.end, 1, 0, nt._permute);\n\n#if CHKPNTDEBUG\n    for (int i = 0; i < nt.end; ++i) {\n        assert(nt._actual_area[i] == ntc.area[pinv_nt[i]]);\n    }\n#endif\n\n    data_write(fh, nt._actual_area, nt.end, 1, 0, nt._permute);\n    data_write(fh, nt._actual_v, nt.end, 1, 0, nt._permute);\n\n    if (nt._actual_diam) {\n        data_write(fh, nt._actual_diam, nt.end, 1, 0, nt._permute);\n    }\n\n    auto& memb_func = corenrn.get_memb_funcs();\n    // will need the ml_pinv inverse permutation of ml._permute for ions and POINTER\n    int** ml_pinv = (int**) ecalloc(memb_func.size(), sizeof(int*));\n\n    for (NrnThreadMembList* current_tml = nt.tml; current_tml; current_tml = current_tml->next) {\n        Memb_list* ml = current_tml->ml;\n        int type = current_tml->index;\n        if (type == patstimtype) {\n            continue;\n        }\n        int cnt = ml->nodecount;\n        auto& nrn_prop_param_size_ = corenrn.get_prop_param_size();\n        auto& nrn_prop_dparam_size_ = corenrn.get_prop_dparam_size();\n        auto& nrn_is_artificial_ = corenrn.get_is_artificial();\n\n        int sz = nrn_prop_param_size_[type];\n        int layout = corenrn.get_mech_data_layout()[type];\n        int* semantics = memb_func[type].dparam_semantics;\n\n        if (!nrn_is_artificial_[type]) {\n            // ml->nodeindices values are permuted according to nt._permute\n            // and locations according to ml._permute\n            // i.e. 
according to comment in node_permute.cpp\n            // nodelist[p_m[i]] = p[nodelist_original[i]]\n            // so pinv[nodelist[p_m[i]]] = nodelist_original[i]\n            int* nd_ix = new int[cnt];\n            for (int i = 0; i < cnt; ++i) {\n                int ip = ml->_permute ? ml->_permute[i] : i;\n                int ipval = ml->nodeindices[ip];\n                nd_ix[i] = pinv_nt[ipval];\n            }\n            fh.write_array<int>(nd_ix, cnt);\n            delete[] nd_ix;\n        }\n\n        data_write(fh, ml->data, cnt, sz, layout, ml->_permute);\n\n        sz = nrn_prop_dparam_size_[type];\n        if (sz) {\n            // need to update some values according to Datum semantics.\n            int* d = soa2aos(ml->pdata, cnt, sz, layout, ml->_permute);\n            std::vector<int> pointer2type;  // voltage or mechanism type (starts empty)\n            if (!nrn_is_artificial_[type]) {\n                for (int i_instance = 0; i_instance < cnt; ++i_instance) {\n                    for (int i = 0; i < sz; ++i) {\n                        int ix = i_instance * sz + i;\n                        int s = semantics[i];\n                        if (s == -1) {  // area\n                            int p = pinv_nt[d[ix] - (nt._actual_area - nt._data)];\n                            d[ix] = p;         // relative to _actual_area\n                        } else if (s == -9) {  // diam\n                            int p = pinv_nt[d[ix] - (nt._actual_diam - nt._data)];\n\n                            d[ix] = p;         // relative to _actual_diam\n                        } else if (s == -5) {  // POINTER\n                            // looping over instances, then over sz, means that we\n                            // visit entries in the natural order of\n                            // pointer2type\n\n                            // Relevant code that this has to invert\n                            // is permute/node_permute.cpp :: update_pdata_values with\n                
            // respect to permutation, and\n                            // io/phase2.cpp :: Phase2::pdata_relocation\n                            // with respect to that AoS -> SoA\n\n                            // Step 1: what mechanism is d[ix] pointing to\n                            int ptype = type_of_ntdata(nt, d[ix], i_instance == 0);\n                            pointer2type.push_back(ptype);\n\n                            // Step 2: replace d[ix] with AoS index relative to type\n                            if (ptype == voltage) {\n                                int p = pinv_nt[d[ix] - (nt._actual_v - nt._data)];\n                                d[ix] = p;  // relative to _actual_v\n                            } else {\n                                // Since we know ptype, the situation is\n                                // identical to ion below. (which was factored\n                                // out into the following function.\n                                d[ix] = nrn_original_aos_index(ptype, d[ix], nt, ml_pinv);\n                            }\n                        } else if (s >= 0 && s < 1000) {  // ion\n                            d[ix] = nrn_original_aos_index(s, d[ix], nt, ml_pinv);\n                        }\n#if CHKPNTDEBUG\n                        if (s != -8) {  // WATCH values change\n                            assert(d[ix] ==\n                                   ntc.mlmap[type]->pdata_not_permuted[i_instance * sz + i]);\n                        }\n#endif\n                    }\n                }\n            }\n            fh.write_array<int>(d, cnt * sz);\n            delete[] d;\n            size_t s = pointer2type.size();\n            fh << s << \" npointer\\n\";\n            if (s) {\n                fh.write_array<int>(pointer2type.data(), s);\n            }\n        }\n    }\n\n    int nnetcon = nt.n_netcon;\n\n    int* output_vindex = new int[nt.n_presyn];\n    double* output_threshold = new double[nt.n_real_output];\n   
 for (int i = 0; i < nt.n_presyn; ++i) {\n        PreSyn* ps = nt.presyns + i;\n        if (ps->thvar_index_ >= 0) {\n            // real cell and index into (permuted) actual_v\n            // if any assert fails in this loop then we have faulty understanding\n            // of the for (int i = 0; i < nt.n_presyn; ++i) loop in nrn_setup.cpp\n            assert(ps->thvar_index_ < nt.end);\n            assert(ps->pntsrc_ == nullptr);\n            output_threshold[i] = ps->threshold_;\n            output_vindex[i] = pinv_nt[ps->thvar_index_];\n        } else if (i < nt.n_real_output) {  // real cell without a presyn\n            output_threshold[i] = 0.0;      // the way it was set in nrnbbcore_write.cpp\n            output_vindex[i] = -1;\n        } else {\n            Point_process* pnt = ps->pntsrc_;\n            assert(pnt);\n            int type = pnt->_type;\n            int ix = pnt->_i_instance;\n            if (nt._ml_list[type]->_permute) {\n                // pnt->_i_instance is the permuted index into pnt->_type\n                if (!ml_pinv[type]) {\n                    Memb_list* ml = nt._ml_list[type];\n                    ml_pinv[type] = inverse_permute(ml->_permute, ml->nodecount);\n                }\n                ix = ml_pinv[type][ix];\n            }\n            output_vindex[i] = -(ix * 1000 + type);\n        }\n    }\n    fh.write_array<int>(output_vindex, nt.n_presyn);\n    fh.write_array<double>(output_threshold, nt.n_real_output);\n#if CHKPNTDEBUG\n    for (int i = 0; i < nt.n_presyn; ++i) {\n        assert(ntc.output_vindex[i] == output_vindex[i]);\n    }\n    for (int i = 0; i < nt.n_real_output; ++i) {\n        assert(ntc.output_threshold[i] == output_threshold[i]);\n    }\n#endif\n    delete[] output_vindex;\n    delete[] output_threshold;\n    delete[] pinv_nt;\n\n    int synoffset = 0;\n    std::vector<int> pnt_offset(memb_func.size(), -1);\n    for (NrnThreadMembList* tml = nt.tml; tml; tml = tml->next) {\n        int type = 
tml->index;\n        if (corenrn.get_pnt_map()[type] > 0) {\n            pnt_offset[type] = synoffset;\n            synoffset += tml->ml->nodecount;\n        }\n    }\n\n    int* pnttype = new int[nnetcon];\n    int* pntindex = new int[nnetcon];\n    double* delay = new double[nnetcon];\n    for (int i = 0; i < nnetcon; ++i) {\n        NetCon& nc = nt.netcons[i];\n        Point_process* pnt = nc.target_;\n        if (pnt == nullptr) {\n            // nrn_setup.cpp allows type <=0 which generates nullptr target.\n            pnttype[i] = 0;\n            pntindex[i] = -1;\n        } else {\n            pnttype[i] = pnt->_type;\n\n            // todo: this seems most natural, but does not work. Perhaps should look\n            // into how pntindex determined in nrnbbcore_write.cpp and change there.\n            // int ix = pnt->_i_instance;\n            // if (ml_pinv[pnt->_type]) {\n            //     ix = ml_pinv[pnt->_type][ix];\n            // }\n\n            // follow the inverse of nrn_setup.cpp using pnt_offset computed above.\n            int ix = (pnt - nt.pntprocs) - pnt_offset[pnt->_type];\n            pntindex[i] = ix;\n        }\n        delay[i] = nc.delay_;\n    }\n    fh.write_array<int>(pnttype, nnetcon);\n    fh.write_array<int>(pntindex, nnetcon);\n    fh.write_array<double>(nt.weights, nt.n_weight);\n    fh.write_array<double>(delay, nnetcon);\n#if CHKPNTDEBUG\n    for (int i = 0; i < nnetcon; ++i) {\n        assert(ntc.pnttype[i] == pnttype[i]);\n        assert(ntc.pntindex[i] == pntindex[i]);\n        assert(ntc.delay[i] == delay[i]);\n    }\n#endif\n    delete[] pnttype;\n    delete[] pntindex;\n    delete[] delay;\n\n    // BBCOREPOINTER\n    int nbcp = 0;\n    for (NrnThreadMembList* tml = nt.tml; tml; tml = tml->next) {\n        if (corenrn.get_bbcore_read()[tml->index] && tml->index != patstimtype) {\n            ++nbcp;\n        }\n    }\n\n    fh << nbcp << \" bbcorepointer\\n\";\n#if CHKPNTDEBUG\n    assert(nbcp == ntc.nbcp);\n#endif\n   
 nbcp = 0;\n    for (NrnThreadMembList* tml = nt.tml; tml; tml = tml->next) {\n        if (corenrn.get_bbcore_read()[tml->index] && tml->index != patstimtype) {\n            int i = nbcp++;\n            int type = tml->index;\n            assert(corenrn.get_bbcore_write()[type]);\n            Memb_list* ml = tml->ml;\n            double* d = nullptr;\n            Datum* pd = nullptr;\n            int layout = corenrn.get_mech_data_layout()[type];\n            int dsz = corenrn.get_prop_param_size()[type];\n            int pdsz = corenrn.get_prop_dparam_size()[type];\n            int aln_cntml = nrn_soa_padded_size(ml->nodecount, layout);\n            fh << type << \"\\n\";\n            int icnt = 0;\n            int dcnt = 0;\n            // data size and allocate\n            for (int j = 0; j < ml->nodecount; ++j) {\n                int jp = j;\n                if (ml->_permute) {\n                    jp = ml->_permute[j];\n                }\n                d = ml->data + nrn_i_layout(jp, ml->nodecount, 0, dsz, layout);\n                pd = ml->pdata + nrn_i_layout(jp, ml->nodecount, 0, pdsz, layout);\n                (*corenrn.get_bbcore_write()[type])(\n                    nullptr, nullptr, &dcnt, &icnt, 0, aln_cntml, d, pd, ml->_thread, &nt, ml, 0.0);\n            }\n            fh << icnt << \"\\n\";\n            fh << dcnt << \"\\n\";\n#if CHKPNTDEBUG\n            assert(ntc.bcptype[i] == type);\n            assert(ntc.bcpicnt[i] == icnt);\n            assert(ntc.bcpdcnt[i] == dcnt);\n#endif\n            int* iArray = nullptr;\n            double* dArray = nullptr;\n            if (icnt) {\n                iArray = new int[icnt];\n            }\n            if (dcnt) {\n                dArray = new double[dcnt];\n            }\n            icnt = dcnt = 0;\n            for (int j = 0; j < ml->nodecount; j++) {\n                int jp = j;\n\n                if (ml->_permute) {\n                    jp = ml->_permute[j];\n                }\n\n                
d = ml->data + nrn_i_layout(jp, ml->nodecount, 0, dsz, layout);\n                pd = ml->pdata + nrn_i_layout(jp, ml->nodecount, 0, pdsz, layout);\n\n                (*corenrn.get_bbcore_write()[type])(\n                    dArray, iArray, &dcnt, &icnt, 0, aln_cntml, d, pd, ml->_thread, &nt, ml, 0.0);\n            }\n\n            if (icnt) {\n                fh.write_array<int>(iArray, icnt);\n                delete[] iArray;\n            }\n\n            if (dcnt) {\n                fh.write_array<double>(dArray, dcnt);\n                delete[] dArray;\n            }\n            ++i;\n        }\n    }\n\n    fh << nt.n_vecplay << \" VecPlay instances\\n\";\n    for (int i = 0; i < nt.n_vecplay; i++) {\n        PlayRecord* pr = (PlayRecord*) nt._vecplay[i];\n        int vtype = pr->type();\n        int mtype = -1;\n        int ix = -1;\n\n        // not as efficient as possible but there should not be too many\n        Memb_list* ml = nullptr;\n        for (NrnThreadMembList* tml = nt.tml; tml; tml = tml->next) {\n            ml = tml->ml;\n            int nn = corenrn.get_prop_param_size()[tml->index] * ml->nodecount;\n            if (nn && pr->pd_ >= ml->data && pr->pd_ < (ml->data + nn)) {\n                mtype = tml->index;\n                ix = (pr->pd_ - ml->data);\n                break;\n            }\n        }\n        assert(mtype >= 0);\n        int icnt, isz;\n        nrn_inverse_i_layout(ix,\n                             icnt,\n                             ml->nodecount,\n                             isz,\n                             corenrn.get_prop_param_size()[mtype],\n                             corenrn.get_mech_data_layout()[mtype]);\n        if (ml_pinv[mtype]) {\n            icnt = ml_pinv[mtype][icnt];\n        }\n        ix = nrn_i_layout(\n            icnt, ml->nodecount, isz, corenrn.get_prop_param_size()[mtype], AOS_LAYOUT);\n\n        fh << vtype << \"\\n\";\n        fh << mtype << \"\\n\";\n        fh << ix << \"\\n\";\n#if 
CHKPNTDEBUG\n        assert(ntc.vtype[i] == vtype);\n        assert(ntc.mtype[i] == mtype);\n        assert(ntc.vecplay_ix[i] == ix);\n#endif\n        if (vtype == VecPlayContinuousType) {\n            VecPlayContinuous* vpc = (VecPlayContinuous*) pr;\n            int sz = vpc->y_.size();\n            fh << sz << \"\\n\";\n            fh.write_array<double>(vpc->y_.data(), sz);\n            fh.write_array<double>(vpc->t_.data(), sz);\n        } else {\n            std::cerr << \"Error checkpointing vecplay type\" << std::endl;\n            assert(0);\n        }\n    }\n\n    for (size_t i = 0; i < memb_func.size(); ++i) {\n        if (ml_pinv[i]) {\n            delete[] ml_pinv[i];\n        }\n    }\n    free(ml_pinv);\n\n    write_tqueue(nt, fh);\n    fh.close();\n}\n\nvoid CheckPoints::write_time() const {\n    FileHandler f;\n    auto filename = get_save_path() + \"/time.dat\";\n    f.open(filename, std::ios::out);\n    f.write_array(&t, 1);\n    f.close();\n}\n\n// A call to finitialize must be avoided after restoring the checkpoint\n// as that would change all states to a voltage clamp initialization.\n// Nevertheless t and some spike exchange and other computer state needs to\n// be initialized.\n// Also it is occasionally the case that nrn_init allocates data so we\n// need to call it but avoid the internal call to initmodel.\n// Consult finitialize.c to help decide what should be here\nbool CheckPoints::initialize() {\n    dt2thread(-1.);\n    nrn_thread_table_check();\n    nrn_spike_exchange_init();\n\n    allocate_data_in_mechanism_nrn_init();\n\n    // if PatternStim exists, needs initialization\n    for (NrnThreadMembList* tml = nrn_threads[0].tml; tml; tml = tml->next) {\n        if (tml->index == patstimtype && patstim_index >= 0 && patstim_te > 0.0) {\n            Memb_list* ml = tml->ml;\n            checkpoint_restore_patternstim(patstim_index,\n                                           patstim_te,\n                                           /* 
below correct only for AoS */\n                                           0,\n                                           ml->nodecount,\n                                           ml->data,\n                                           ml->pdata,\n                                           ml->_thread,\n                                           nrn_threads,\n                                           ml,\n                                           0.0);\n            break;\n        }\n    }\n\n    // Check that bbcore_write is defined if we want to use checkpoint\n    for (NrnThreadMembList* tml = nrn_threads[0].tml; tml; tml = tml->next) {\n        auto type = tml->index;\n        if (corenrn.get_bbcore_read()[type] && !corenrn.get_bbcore_write()[type]) {\n            auto memb_func = corenrn.get_memb_func(type);\n            fprintf(stderr,\n                    \"Checkpoint is requested involving BBCOREPOINTER but there is no bbcore_write\"\n                    \" function for %s\\n\",\n                    memb_func.sym);\n            assert(corenrn.get_bbcore_write()[type]);\n        }\n    }\n\n\n    return restored;\n}\n\ntemplate <typename T>\nT* CheckPoints::soa2aos(T* data, int cnt, int sz, int layout, int* permute) const {\n    // inverse of F -> data. Just a copy if layout=1. 
If SoA,\n    // original file order depends on padding and permutation.\n    // Good for a, b, area, v, diam, Memb_list.data, or anywhere values do not change.\n    T* d = new T[cnt * sz];\n    if (layout == Layout::AoS) {\n        for (int i = 0; i < cnt * sz; ++i) {\n            d[i] = data[i];\n        }\n    } else if (layout == Layout::SoA) {\n        int align_cnt = nrn_soa_padded_size(cnt, layout);\n        for (int i = 0; i < cnt; ++i) {\n            int ip = i;\n            if (permute) {\n                ip = permute[i];\n            }\n            for (int j = 0; j < sz; ++j) {\n                d[i * sz + j] = data[ip + j * align_cnt];\n            }\n        }\n    }\n    return d;\n}\n\ntemplate <typename T>\nvoid CheckPoints::data_write(FileHandler& F, T* data, int cnt, int sz, int layout, int* permute)\n    const {\n    T* d = soa2aos(data, cnt, sz, layout, permute);\n    F.write_array<T>(d, cnt * sz);\n    delete[] d;\n}\n\nNrnThreadChkpnt* nrnthread_chkpnt;\n\nint patstimtype;\n\nvoid CheckPoints::write_tqueue(TQItem* q, NrnThread& nt, FileHandler& fh) const {\n    DiscreteEvent* d = (DiscreteEvent*) q->data_;\n\n    // printf(\"  p %.20g %d\\n\", q->t_, d->type());\n    // d->pr(\"\", q->t_, net_cvode_instance);\n\n    if (!d->require_checkpoint()) {\n        return;\n    }\n\n    fh << d->type() << \"\\n\";\n    fh.write_array(&q->t_, 1);\n\n    switch (d->type()) {\n        case NetConType: {\n            NetCon* nc = (NetCon*) d;\n            assert(nc >= nt.netcons && (nc < (nt.netcons + nt.n_netcon)));\n            fh << (nc - nt.netcons) << \"\\n\";\n            break;\n        }\n        case SelfEventType: {\n            SelfEvent* se = (SelfEvent*) d;\n            fh << int(se->target_->_type) << \"\\n\";\n            fh << se->target_ - nt.pntprocs << \"\\n\";  // index of nrnthread.pntprocs\n            fh << se->target_->_i_instance << \"\\n\";   // not needed except for assert check\n            fh.write_array(&se->flag_, 1);\n        
    fh << (se->movable_ - nt._vdata) << \"\\n\";  // DANGEROUS?\n            fh << se->weight_index_ << \"\\n\";\n            // printf(\"    %d %ld %d %g %ld %d\\n\", se->target_->_type, se->target_ - nt.pntprocs,\n            // se->target_->_i_instance, se->flag_, se->movable_ - nt._vdata, se->weight_index_);\n            break;\n        }\n        case PreSynType: {\n            PreSyn* ps = (PreSyn*) d;\n            assert(ps >= nt.presyns && (ps < (nt.presyns + nt.n_presyn)));\n            fh << (ps - nt.presyns) << \"\\n\";\n            break;\n        }\n        case NetParEventType: {\n            // nothing extra to write\n            break;\n        }\n        case PlayRecordEventType: {\n            PlayRecord* pr = ((PlayRecordEvent*) d)->plr_;\n            fh << pr->type() << \"\\n\";\n            if (pr->type() == VecPlayContinuousType) {\n                VecPlayContinuous* vpc = (VecPlayContinuous*) pr;\n                int ix = -1;\n                for (int i = 0; i < nt.n_vecplay; ++i) {\n                    // if too many for fast search, put ix in the instance\n                    if (nt._vecplay[i] == (void*) vpc) {\n                        ix = i;\n                        break;\n                    }\n                }\n                assert(ix >= 0);\n                fh << ix << \"\\n\";\n            } else {\n                assert(0);\n            }\n            break;\n        }\n        default: {\n            // In particular, InputPreSyn does not appear in tqueue as it\n            // immediately fans out to NetCon.\n            assert(0);\n            break;\n        }\n    }\n}\n\nvoid CheckPoints::restore_tqitem(int type,\n                                 std::shared_ptr<Phase2::EventTypeBase> event,\n                                 NrnThread& nt) {\n    // printf(\"restore tqitem type=%d time=%.20g\\n\", type, time);\n\n    switch (type) {\n        case NetConType: {\n            auto e = 
static_cast<Phase2::NetConType_*>(event.get());\n            // printf(\"  NetCon %d\\n\", netcon_index);\n            NetCon* nc = nt.netcons + e->netcon_index;\n            nc->send(e->time, net_cvode_instance, &nt);\n            break;\n        }\n        case SelfEventType: {\n            auto e = static_cast<Phase2::SelfEventType_*>(event.get());\n            if (e->target_type == patstimtype) {\n                if (nt.id == 0) {\n                    patstim_te = e->time;\n                }\n                break;\n            }\n            Point_process* pnt = nt.pntprocs + e->point_proc_instance;\n            // printf(\"  SelfEvent %d %d %d %g %d %d\\n\", target_type, point_proc_instance,\n            // target_instance, flag, movable, weight_index);\n            nrn_assert(e->target_instance == pnt->_i_instance);\n            nrn_assert(e->target_type == pnt->_type);\n            net_send(nt._vdata + e->movable, e->weight_index, pnt, e->time, e->flag);\n            break;\n        }\n        case PreSynType: {\n            auto e = static_cast<Phase2::PreSynType_*>(event.get());\n            // printf(\"  PreSyn %d\\n\", presyn_index);\n            PreSyn* ps = nt.presyns + e->presyn_index;\n            int gid = ps->output_index_;\n            ps->output_index_ = -1;\n            ps->send(e->time, net_cvode_instance, &nt);\n            ps->output_index_ = gid;\n            break;\n        }\n        case NetParEventType: {\n            // nothing extra to read\n            // printf(\"  NetParEvent\\n\");\n            break;\n        }\n        case PlayRecordEventType: {\n            auto e = static_cast<Phase2::PlayRecordEventType_*>(event.get());\n            VecPlayContinuous* vpc = (VecPlayContinuous*) (nt._vecplay[e->vecplay_index]);\n            vpc->e_->send(e->time, net_cvode_instance, &nt);\n            break;\n        }\n        default: {\n            assert(0);\n            break;\n        }\n    }\n}\n\nvoid 
CheckPoints::write_tqueue(NrnThread& nt, FileHandler& fh) const {\n    // VecPlayContinuous\n    fh << nt.n_vecplay << \" VecPlayContinuous state\\n\";\n    for (int i = 0; i < nt.n_vecplay; ++i) {\n        VecPlayContinuous* vpc = (VecPlayContinuous*) nt._vecplay[i];\n        fh << vpc->last_index_ << \"\\n\";\n        fh << vpc->discon_index_ << \"\\n\";\n        fh << vpc->ubound_index_ << \"\\n\";\n    }\n\n    // PatternStim\n    int patstim_index = -1;\n    for (NrnThreadMembList* tml = nrn_threads[0].tml; tml; tml = tml->next) {\n        if (tml->index == patstimtype) {\n            Memb_list* ml = tml->ml;\n            patstim_index = checkpoint_save_patternstim(\n                /* below correct only for AoS */\n                0,\n                ml->nodecount,\n                ml->data,\n                ml->pdata,\n                ml->_thread,\n                nrn_threads,\n                ml,\n                0.0);\n            break;\n        }\n    }\n    fh << patstim_index << \" PatternStim\\n\";\n\n    // Avoid extra spikes due to some presyn voltages above threshold\n    fh << -1 << \" Presyn ConditionEvent flags\\n\";\n    for (int i = 0; i < nt.n_presyn; ++i) {\n        // PreSyn.flag_ not used. 
HPC memory utilizes PreSynHelper.flag_ array\n        fh << nt.presyns_helper[i].flag_ << \"\\n\";\n    }\n\n    NetCvodeThreadData& ntd = net_cvode_instance->p[nt.id];\n    // printf(\"write_tqueue %d %p\\n\", nt.id, ndt.tqe_);\n    TQueue<QTYPE>* tqe = ntd.tqe_;\n    TQItem* q;\n\n    fh << -1 << \" TQItems from atomic_dq\\n\";\n    while ((q = tqe->atomic_dq(1e20)) != nullptr) {\n        write_tqueue(q, nt, fh);\n    }\n    fh << 0 << \"\\n\";\n    fh << -1 << \" TQItemsfrom binq_\\n\";\n    for (q = tqe->binq_->first(); q; q = tqe->binq_->next(q)) {\n        write_tqueue(q, nt, fh);\n    }\n    fh << 0 << \"\\n\";\n}\n\n// Read a tqueue/checkpoint\n// int :: should be equal to the previous n_vecplay\n// n_vecplay:\n//   int: last_index\n//   int: discon_index\n//   int: ubound_index\n// int: patstim_index\n// int: should be -1\n// n_presyn:\n//   int: flags of presyn_helper\n// int: should be -1\n// null terminated:\n//   int: type\n//   array of size 1:\n//     double: time\n//   ... depends of the type\n// int: should be -1\n// null terminated:\n//   int: TO BE DEFINED\n//   ... depends of the type\nvoid CheckPoints::restore_tqueue(NrnThread& nt, const Phase2& p2) {\n    restored = true;\n\n    for (int i = 0; i < nt.n_vecplay; ++i) {\n        VecPlayContinuous* vpc = (VecPlayContinuous*) nt._vecplay[i];\n        auto& vec = p2.vec_play_continuous[i];\n        vpc->last_index_ = vec.last_index;\n        vpc->discon_index_ = vec.discon_index;\n        vpc->ubound_index_ = vec.ubound_index;\n    }\n\n    // PatternStim\n    patstim_index = p2.patstim_index;  // PatternStim\n    if (nt.id == 0) {\n        patstim_te = -1.0;  // changed if relevant SelfEvent;\n    }\n\n    for (int i = 0; i < nt.n_presyn; ++i) {\n        nt.presyns_helper[i].flag_ = p2.preSynConditionEventFlags[i];\n    }\n\n    for (const auto& event: p2.events) {\n        restore_tqitem(event.first, event.second, nt);\n    }\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/nrn_checkpoint.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include \"coreneuron/io/phase2.hpp\"\n\nnamespace coreneuron {\nstruct NrnThread;\nclass FileHandler;\n\nclass CheckPoints {\n  public:\n    CheckPoints(const std::string& save, const std::string& restore);\n    std::string get_save_path() const {\n        return save_;\n    }\n    std::string get_restore_path() const {\n        return restore_;\n    }\n    bool should_save() const {\n        return !save_.empty();\n    }\n    bool should_restore() const {\n        return !restore_.empty();\n    }\n    double restore_time() const;\n    void write_checkpoint(NrnThread* nt, int nb_threads) const;\n    /* return true if special checkpoint initialization carried out and\n       one should not do finitialize\n     */\n    bool initialize();\n    void restore_tqueue(NrnThread&, const Phase2& p2);\n\n  private:\n    const std::string save_;\n    const std::string restore_;\n    bool restored;\n    int patstim_index;\n    double patstim_te;\n\n    void write_time() const;\n    void write_phase2(NrnThread& nt) const;\n\n    template <typename T>\n    void data_write(FileHandler& F, T* data, int cnt, int sz, int layout, int* permute) const;\n    template <typename T>\n    T* soa2aos(T* data, int cnt, int sz, int layout, int* permute) const;\n    void write_tqueue(TQItem* q, NrnThread& nt, FileHandler& fh) const;\n    void write_tqueue(NrnThread& nt, FileHandler& fh) const;\n    void restore_tqitem(int type, std::shared_ptr<Phase2::EventTypeBase> event, NrnThread& nt);\n};\n\n\nint* inverse_permute(int* p, int n);\nvoid nrn_inverse_i_layout(int i, int& icnt, int cnt, int& isz, int sz, int layout);\n\nextern int patstimtype;\n\n#ifndef CHKPNTDEBUG\n#define CHKPNTDEBUG 
0\n#endif\n\n#if CHKPNTDEBUG\n// Factored out from checkpoint changes to nrnoc/multicore.h and nrnoc/nrnoc_ml.h\n// Put here to avoid potential issues with gpu transfer and to allow\n// debugging comparison with respect to checkpoint writing to verify that\n// data is same as on reading when inverse transforming SoA and permutations.\n// Following is a mixture of substantive information which is lost during\n// nrn_setup.cpp and debugging only information which is retrievable from\n// NrnThread and Memb_list. Ideally, this should all go away\n\nstruct Memb_list_chkpnt {\n    // debug only\n    double* data_not_permuted;\n    Datum* pdata_not_permuted;\n    int* nodeindices_not_permuted;\n};\n\n#endif  // CHKPNTDEBUG but another section for it below\n\nstruct NrnThreadChkpnt {\n    int file_id;\n\n#if CHKPNTDEBUG\n    int nmech;\n    double* area;\n    int* parent;\n    Memb_list_chkpnt** mlmap;\n\n    int n_outputgids;\n    int* output_vindex;\n    double* output_threshold;\n\n    int* pnttype;\n    int* pntindex;\n    double* delay;\n\n    // BBCOREPOINTER\n    int nbcp;\n    int* bcptype;\n    int* bcpicnt;\n    int* bcpdcnt;\n\n    // VecPlay\n    int* vtype;\n    int* mtype;\n    int* vecplay_ix;\n#endif  // CHKPNTDEBUG\n};\n\nextern NrnThreadChkpnt* nrnthread_chkpnt;\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/nrn_filehandler.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <iostream>\n#include \"coreneuron/io/nrn_filehandler.hpp\"\n#include \"coreneuron/nrnconf.h\"\n\nnamespace coreneuron {\nFileHandler::FileHandler(const std::string& filename)\n    : chkpnt(0)\n    , stored_chkpnt(0) {\n    this->open(filename);\n}\n\nbool FileHandler::file_exist(const std::string& filename) {\n    struct stat buffer;\n    return (stat(filename.c_str(), &buffer) == 0);\n}\n\nvoid FileHandler::open(const std::string& filename, std::ios::openmode mode) {\n    nrn_assert((mode & (std::ios::in | std::ios::out)));\n    close();\n    F.open(filename, mode | std::ios::binary);\n    if (!F.is_open()) {\n        std::cerr << \"cannot open file '\" << filename << \"'\" << std::endl;\n    }\n    nrn_assert(F.is_open());\n    current_mode = mode;\n    char version[256];\n    if (current_mode & std::ios::in) {\n        F.getline(version, sizeof(version));\n        nrn_assert(!F.fail());\n        check_bbcore_write_version(version);\n    }\n    if (current_mode & std::ios::out) {\n        F << bbcore_write_version << \"\\n\";\n    }\n}\n\nbool FileHandler::eof() {\n    if (F.eof()) {\n        return true;\n    }\n    int a = F.get();\n    if (F.eof()) {\n        return true;\n    }\n    F.putback(a);\n    return false;\n}\n\nint FileHandler::read_int() {\n    char line_buf[max_line_length];\n\n    F.getline(line_buf, sizeof(line_buf));\n    nrn_assert(!F.fail());\n\n    int i;\n    int n_scan = sscanf(line_buf, \"%d\", &i);\n    nrn_assert(n_scan == 1);\n\n    return i;\n}\n\nvoid FileHandler::read_mapping_count(int* gid, int* nsec, int* nseg, int* nseclist) {\n    char line_buf[max_line_length];\n\n    F.getline(line_buf, sizeof(line_buf));\n    
nrn_assert(!F.fail());\n\n    /** mapping file has extra strings, ignore those */\n    int n_scan = sscanf(line_buf, \"%d %d %d %d\", gid, nsec, nseg, nseclist);\n    nrn_assert(n_scan == 4);\n}\n\nvoid FileHandler::read_mapping_cell_count(int* count) {\n    *count = read_int();\n}\n\nvoid FileHandler::read_checkpoint_assert() {\n    char line_buf[max_line_length];\n\n    F.getline(line_buf, sizeof(line_buf));\n    nrn_assert(!F.fail());\n\n    int i;\n    int n_scan = sscanf(line_buf, \"chkpnt %d\\n\", &i);\n    if (n_scan != 1) {\n        fprintf(stderr, \"no chkpnt line for %d\\n\", chkpnt);\n    }\n    nrn_assert(n_scan == 1);\n    if (i != chkpnt) {\n        fprintf(stderr, \"file chkpnt %d != expected %d\\n\", i, chkpnt);\n    }\n    nrn_assert(i == chkpnt);\n    ++chkpnt;\n}\n\nvoid FileHandler::close() {\n    F.close();\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/nrn_filehandler.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include <cstdio>\n#include <cstring>\n#include <iostream>\n#include <fstream>\n#include <string>\n#include <vector>\n#include <sys/stat.h>\n\n#include \"coreneuron/utils/nrn_assert.h\"\n\nnamespace coreneuron {\n/** Encapsulate low-level reading of coreneuron input data files.\n *\n * Error handling is simple: abort()!\n *\n * Reader will abort() if native integer size is not 4 bytes.\n *\n * All automatic allocations performed by the allocating\n * read_array(count) overload use new [].\n */\n\n// @todo: remove this static buffer\nconst int max_line_length = 1024;\n\nclass FileHandler {\n    std::fstream F;                        //!< File stream associated with reader.\n    std::ios_base::openmode current_mode;  //!< File open mode (not stored in fstream)\n    int chkpnt;                            //!< Current checkpoint number state.\n    int stored_chkpnt;                     //!< last \"remembered\" checkpoint number state.\n    /** Read a checkpoint line, bump our chkpnt counter, and assert equality.\n     *\n     * Checkpoint information is represented by a sequence \"chkpnt %d\\n\"\n     * where %d is a scanf-compatible representation of the checkpoint\n     * integer.\n     */\n    void read_checkpoint_assert();\n\n    // FileHandler is not copyable.\n    FileHandler(const FileHandler&) = delete;\n    FileHandler& operator=(const FileHandler&) = delete;\n\n  public:\n    FileHandler()\n        : chkpnt(0)\n        , stored_chkpnt(0) {}\n\n    explicit FileHandler(const std::string& filename);\n\n    /** Preserving chkpnt state, move to a new file. 
*/\n    void open(const std::string& filename, std::ios::openmode mode = std::ios::in);\n\n    /** Has the last stream operation failed (e.g. the file is not open) */\n    bool fail() const {\n        return F.fail();\n    }\n\n    static bool file_exist(const std::string& filename);\n\n    /** nothing more to read */\n    bool eof();\n\n    /** Query chkpnt state. */\n    int checkpoint() const {\n        return chkpnt;\n    }\n\n    /** Explicitly override chkpnt state. */\n    void checkpoint(int c) {\n        chkpnt = c;\n    }\n\n    /** Record current chkpnt state. */\n    void record_checkpoint() {\n        stored_chkpnt = chkpnt;\n    }\n\n    /** Restore the last recorded chkpnt state. */\n    void restore_checkpoint() {\n        chkpnt = stored_chkpnt;\n    }\n\n    /** Parse a single integer entry.\n     *\n     * Single integer entries are represented by their standard\n     * (C locale) text representation, followed by a newline.\n     * Extraneous characters following the integer but preceding\n     * the newline are ignored.\n     */\n    int read_int();\n\n    /** Parse a neuron mapping count entry\n     *\n     * Reads neuron mapping info which is represented by\n     * gid, #sections, #segments, #section lists\n     */\n    void read_mapping_count(int* gid, int* nsec, int* nseg, int* nseclist);\n\n    /** Reads number of cells in parsing file */\n    void read_mapping_cell_count(int* count);\n\n    /** Parse a neuron section segment mapping\n     *\n     * Reads the section-to-segment mappings and returns the number of segments read\n     */\n    template <typename T>\n    int read_mapping_info(T* mapinfo) {\n        int nsec, nseg, n_scan;\n        char line_buf[max_line_length], name[max_line_length];\n\n        F.getline(line_buf, sizeof(line_buf));\n        n_scan = sscanf(line_buf, \"%s %d %d\", name, &nsec, &nseg);\n\n        nrn_assert(n_scan == 3);\n\n        mapinfo->name = std::string(name);\n\n        if (nseg) {\n            // size the buffers up front: read_array() writes nseg elements into them\n            std::vector<int> sec(nseg), seg(nseg);\n\n            
read_array<int>(sec.data(), nseg);\n            read_array<int>(seg.data(), nseg);\n\n            for (int i = 0; i < nseg; i++) {\n                mapinfo->add_segment(sec[i], seg[i]);\n            }\n        }\n        return nseg;\n    }\n\n    /** Defined flag values for parse_array() */\n    enum parse_action { read, seek };\n\n    /** Generic parse function for an array of fixed length.\n     *\n     * \\tparam T the array element type: may be \\c int or \\c double.\n     * \\param p pointer to the target in memory for reading into.\n     * \\param count number of items of type \\a T to parse.\n     * \\param flag whether to validate and skip (\\c seek) or\n     *    copy array into memory (\\c read).\n     * \\return the supplied pointer value.\n     *\n     * Error if \\a count is non-zero, \\a flag is \\c read, and\n     * the supplied pointer \\a p is null.\n     *\n     * Arrays are represented by a checkpoint line followed by\n     * the array items in increasing index order, in the native binary\n     * representation of the writing process.\n     */\n    template <typename T>\n    inline T* parse_array(T* p, size_t count, parse_action flag) {\n        if (count > 0 && flag != seek)\n            nrn_assert(p != 0);\n\n        read_checkpoint_assert();\n        switch (flag) {\n            case seek:\n                F.seekg(count * sizeof(T), std::ios_base::cur);\n                break;\n            case read:\n                F.read((char*) p, count * sizeof(T));\n                break;\n        }\n\n        nrn_assert(!F.fail());\n        return p;\n    }\n\n    // convenience interfaces:\n\n    /** Read an array of fixed length. */\n    template <typename T>\n    inline T* read_array(T* p, size_t count) {\n        return parse_array(p, count, read);\n    }\n\n    /** Allocate and read an array of fixed length. 
*/\n    template <typename T>\n    inline T* read_array(size_t count) {\n        return parse_array(new T[count], count, read);\n    }\n\n    template <typename T>\n    inline std::vector<T> read_vector(size_t count) {\n        std::vector<T> vec(count);\n        parse_array(vec.data(), count, read);\n        return vec;\n    }\n\n    /** Close currently open file. */\n    void close();\n\n    /** Write a 1D array **/\n    template <typename T>\n    void write_array(T* p, size_t nb_elements) {\n        nrn_assert(F.is_open());\n        nrn_assert(current_mode & std::ios::out);\n        write_checkpoint();\n        F.write((const char*) p, nb_elements * (sizeof(T)));\n        nrn_assert(!F.fail());\n    }\n\n    /** Write a padded array. nb_elements is the number of elements to write per line,\n     * line_width is the full size of a line, in number of elements **/\n    template <typename T>\n    void write_array(T* p,\n                     size_t nb_elements,\n                     size_t line_width,\n                     size_t nb_lines,\n                     bool to_transpose = false) {\n        nrn_assert(F.is_open());\n        nrn_assert(current_mode & std::ios::out);\n        write_checkpoint();\n        T* temp_cpy = new T[nb_elements * nb_lines];\n\n        if (to_transpose) {\n            for (size_t i = 0; i < nb_lines; i++) {\n                for (size_t j = 0; j < nb_elements; j++) {\n                    temp_cpy[i + j * nb_lines] = p[i * line_width + j];\n                }\n            }\n        } else {\n            memcpy(temp_cpy, p, nb_elements * sizeof(T) * nb_lines);\n        }\n        // AoS never uses padding and SoA is transposed above, so one write\n        // operation is enough in both cases\n        F.write((const char*) temp_cpy, nb_elements * sizeof(T) * nb_lines);\n        nrn_assert(!F.fail());\n        delete[] temp_cpy;\n    }\n\n    template <typename T>\n    FileHandler& operator<<(const T& scalar) {\n        nrn_assert(F.is_open());\n        
nrn_assert(current_mode & std::ios::out);\n        F << scalar;\n        nrn_assert(!F.fail());\n        return *this;\n    }\n\n  private:\n    /* write_checkpoint is callable only for our internal uses; making it accessible to users\n     * would make the file format unpredictable */\n    void write_checkpoint() {\n        F << \"chkpnt \" << chkpnt++ << \"\\n\";\n    }\n};\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/nrn_setup.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <algorithm>\n#include <vector>\n#include <map>\n#include <cstring>\n#include <mutex>\n\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/utils/randoms/nrnran123.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/sim/fast_imem.hpp\"\n#include \"coreneuron/network/multisend.hpp\"\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/utils/nrnmutdec.hpp\"\n#include \"coreneuron/utils/memory.h\"\n#include \"coreneuron/utils/utils.hpp\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/mpi/core/nrnmpi.hpp\"\n#include \"coreneuron/io/nrn_setup.hpp\"\n#include \"coreneuron/network/partrans.hpp\"\n#include \"coreneuron/io/nrn_checkpoint.hpp\"\n#include \"coreneuron/permute/node_permute.h\"\n#include \"coreneuron/permute/cellorder.hpp\"\n#include \"coreneuron/io/nrnsection_mapping.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n#include \"coreneuron/io/phase1.hpp\"\n#include \"coreneuron/io/phase2.hpp\"\n#include \"coreneuron/io/mech_report.h\"\n#include \"coreneuron/io/reports/nrnreport.hpp\"\n\n// callbacks into nrn/src/nrniv/nrnbbcore_write.cpp\n#include \"coreneuron/sim/fast_imem.hpp\"\n#include \"coreneuron/coreneuron.hpp\"\n\n\n/// --> Coreneuron\nbool corenrn_embedded;\nint corenrn_embedded_nthread;\n\nvoid (*nrn2core_group_ids_)(int*);\n\nextern \"C\" {\nSetupTransferInfo* (*nrn2core_get_partrans_setup_info_)(int ngroup,\n                                                        int cn_nthread,\n                                                        size_t cn_sidt_size);\n}\n\nvoid (*nrn2core_get_trajectory_requests_)(int 
tid,\n                                          int& bsize,\n                                          int& n_pr,\n                                          void**& vpr,\n                                          int& n_trajec,\n                                          int*& types,\n                                          int*& indices,\n                                          double**& pvars,\n                                          double**& varrays);\n\nvoid (*nrn2core_trajectory_values_)(int tid, int n_pr, void** vpr, double t);\n\nvoid (*nrn2core_trajectory_return_)(int tid, int n_pr, int bsize, int vecsz, void** vpr, double t);\n\nint (*nrn2core_all_spike_vectors_return_)(std::vector<double>& spikevec, std::vector<int>& gidvec);\n\nvoid (*nrn2core_all_weights_return_)(std::vector<double*>& weights);\n\n// file format defined in cooperation with nrncore/src/nrniv/nrnbbcore_write.cpp\n// single integers are ascii one per line. arrays are binary int or double\n// Note that regardless of the gid contents of a group, since all gids are\n// globally unique, a filename convention which involves the first gid\n// from the group is adequate. Also note that balance is carried out from a\n// per group perspective and launching a process consists of specifying\n// a list of group ids (first gid of the group) for each process.\n//\n// <firstgid>_1.dat\n// n_presyn, n_netcon\n// output_gids (npresyn) with -(type+1000*index) for those acell with no gid\n// netcon_srcgid (nnetcon) -(type+1000*index) refers to acell with no gid\n//                         -1 means the netcon has no source (not implemented)\n// Note that the negative gids are only thread unique and not process unique.\n// We create a thread specific hash table for the negative gids for each thread\n// when <firstgid>_1.dat is read and then destroy it after <firstgid>_2.dat\n// is finished using it.  
An earlier implementation which attempted to\n// encode the thread number into the negative gid\n// (i.e -ith - nth*(type +1000*index)) failed due to not large enough\n// integer domain size.\n// Note that for file transfer it is an error if a negative srcgid is\n// not in the same thread as the target. This is because there it may\n// not be the case that threads in a NEURON process end up on same process\n// in CoreNEURON. NEURON will raise an error if this\n// is the case. However, for direct memory transfer, it is allowed that\n// a negative srcgid may be in a different thread than the target. So\n// nrn2core_get_dat1 has a last arg netcon_negsrcgid_tid that specifies\n// for the negative gids in netcon_srcgid (in that order) the source thread.\n//\n// <firstgid>_2.dat\n// n_real_cell, n_output, n_real_output, nnode\n// ndiam - 0 if no mechanism has dparam with diam semantics, or nnode\n// nmech - includes artcell mechanisms\n// for the nmech tml mechanisms\n//   type, nodecount\n// nidata, nvdata, nweight\n// v_parent_index (nnode)\n// actual_a, b, area, v (nnode)\n// diam - if ndiam > 0. 
Note that only valid diam is for those nodes with diam semantics mechanisms\n// for the nmech tml mechanisms\n//   nodeindices (nodecount) but only if not an artificial cell\n//   data (nodecount*param_size)\n//   pdata (nodecount*dparam_size) but only if dparam_size > 0 on this side.\n// output_vindex (n_presyn) >= 0 associated with voltages -(type+1000*index) for acell\n// output_threshold (n_real_output)\n// netcon_pnttype (nnetcon) <=0 if a NetCon does not have a target.\n// netcon_pntindex (nnetcon)\n// weights (nweight)\n// delays (nnetcon)\n// for the nmech tml mechanisms that have a nrn_bbcore_write method\n//   type\n//   icnt\n//   dcnt\n//   int array (number specified by the nodecount nrn_bbcore_write\n//     to be intepreted by this side's nrn_bbcore_read method)\n//   double array\n// #VectorPlay_instances, for each of these instances\n// 4 (VecPlayContinuousType)\n// mtype\n// index (from Memb_list.data)\n// vecsize\n// yvec\n// tvec\n//\n// The critical issue requiring careful attention is that a coreneuron\n// process reads many coreneuron thread files with a result that, although\n// the conceptual\n// total n_pre is the sum of all the n_presyn from each thread as is the\n// total number of output_gid, the number of InputPreSyn instances must\n// be computed here from a knowledge of all thread's netcon_srcgid after\n// all thread's output_gids have been registered. 
We want to save the\n// \"individual allocation of many small objects\" memory overhead by\n// allocating a single InputPreSyn array for the entire process.\n// For this reason cellgroup data are divided into two separate\n// files with the first containing output_gids and netcon_srcgid which are\n// stored in the nt.presyns array and nt.netcons array respectively\nnamespace coreneuron {\nstatic OMP_Mutex mut;\n\n/// Vector of maps for negative presyns\nstd::vector<std::map<int, PreSyn*>> neg_gid2out;\n/// Maps for ouput and input presyns\nstd::map<int, PreSyn*> gid2out;\nstd::map<int, InputPreSyn*> gid2in;\n\n/// InputPreSyn.nc_index_ to + InputPreSyn.nc_cnt_ give the NetCon*\nstd::vector<NetCon*> netcon_in_presyn_order_;\n\n/// Only for setup vector of netcon source gids\nstd::vector<int*> nrnthreads_netcon_srcgid;\n\n/// If a nrnthreads_netcon_srcgid is negative, need to determine the thread when\n/// in order to use the correct neg_gid2out[tid] map\nstd::vector<std::vector<int>> nrnthreads_netcon_negsrcgid_tid;\n\n/* read files.dat file and distribute cellgroups to all mpi ranks */\nvoid nrn_read_filesdat(int& ngrp, int*& grp, const char* filesdat) {\n    patstimtype = nrn_get_mechtype(\"PatternStim\");\n    if (corenrn_embedded) {\n        ngrp = corenrn_embedded_nthread;\n        grp = new int[ngrp + 1];\n        (*nrn2core_group_ids_)(grp);\n        return;\n    }\n\n    FILE* fp = fopen(filesdat, \"r\");\n\n    if (!fp) {\n        nrn_fatal_error(\"No input file ( %s ) with nrnthreads, exiting...\", filesdat);\n    }\n\n    char version[256];\n    nrn_assert(fscanf(fp, \"%s\\n\", version) == 1);\n    check_bbcore_write_version(version);\n\n    int iNumFiles = 0;\n    nrn_assert(fscanf(fp, \"%d\\n\", &iNumFiles) == 1);\n\n    // temporary strategem to figure out if model uses gap junctions while\n    // being backward compatible\n    if (iNumFiles == -1) {\n        nrn_assert(fscanf(fp, \"%d\\n\", &iNumFiles) == 1);\n        nrn_have_gaps = true;\n        if 
(nrnmpi_myid == 0) {\n            printf(\"Model uses gap junctions\\n\");\n        }\n    }\n\n    if (nrnmpi_numprocs > iNumFiles && nrnmpi_myid == 0) {\n        printf(\n            \"Info : The number of input datasets are less than ranks, some ranks will be idle!\\n\");\n    }\n\n    ngrp = 0;\n    grp = new int[iNumFiles / nrnmpi_numprocs + 1];\n\n    // irerate over gids in files.dat\n    for (int iNum = 0; iNum < iNumFiles; ++iNum) {\n        int iFile;\n\n        nrn_assert(fscanf(fp, \"%d\\n\", &iFile) == 1);\n        if ((iNum % nrnmpi_numprocs) == nrnmpi_myid) {\n            grp[ngrp] = iFile;\n            ngrp++;\n        }\n    }\n\n    fclose(fp);\n}\n\nvoid netpar_tid_gid2ps(int tid, int gid, PreSyn** ps, InputPreSyn** psi) {\n    /// for gid < 0 returns the PreSyn* in the thread (tid) specific map.\n    *ps = nullptr;\n    *psi = nullptr;\n\n    if (gid >= 0) {\n        auto gid2out_it = gid2out.find(gid);\n        if (gid2out_it != gid2out.end()) {\n            *ps = gid2out_it->second;\n        } else {\n            auto gid2in_it = gid2in.find(gid);\n            if (gid2in_it != gid2in.end()) {\n                *psi = gid2in_it->second;\n            }\n        }\n    } else {\n        auto gid2out_it = neg_gid2out[tid].find(gid);\n        if (gid2out_it != neg_gid2out[tid].end()) {\n            *ps = gid2out_it->second;\n        }\n    }\n}\n\nvoid determine_inputpresyn() {\n    // allocate the process wide InputPreSyn array\n    // all the output_gid have been registered and associated with PreSyn.\n    // now count the needed InputPreSyn by filling the netpar::gid2in map\n    gid2in.clear();\n\n    // now have to fill the new table\n    // do not need to worry about negative gid overlap since only use\n    // it to search for PreSyn in this thread.\n\n    std::vector<InputPreSyn*> inputpresyn_;\n\n    for (int ith = 0; ith < nrn_nthread; ++ith) {\n        NrnThread& nt = nrn_threads[ith];\n        // associate gid with InputPreSyn and increase 
PreSyn and InputPreSyn count\n        nt.n_input_presyn = 0;\n        // if single thread or file transfer then definitely empty.\n        std::vector<int>& negsrcgid_tid = nrnthreads_netcon_negsrcgid_tid[ith];\n        size_t i_tid = 0;\n        for (int i = 0; i < nt.n_netcon; ++i) {\n            int gid = nrnthreads_netcon_srcgid[ith][i];\n            if (gid >= 0) {\n                /// If PreSyn or InputPreSyn is already in the map\n                auto gid2out_it = gid2out.find(gid);\n                if (gid2out_it != gid2out.end()) {\n                    /// Increase PreSyn count\n                    ++gid2out_it->second->nc_cnt_;\n                    continue;\n                }\n                auto gid2in_it = gid2in.find(gid);\n                if (gid2in_it != gid2in.end()) {\n                    /// Increase InputPreSyn count\n                    ++gid2in_it->second->nc_cnt_;\n                    continue;\n                }\n\n                /// Create InputPreSyn and increase its count\n                InputPreSyn* psi = new InputPreSyn;\n                ++psi->nc_cnt_;\n                gid2in[gid] = psi;\n                inputpresyn_.push_back(psi);\n                ++nt.n_input_presyn;\n            } else {\n                int tid = nt.id;\n                if (!negsrcgid_tid.empty()) {\n                    tid = negsrcgid_tid[i_tid++];\n                }\n                auto gid2out_it = neg_gid2out[tid].find(gid);\n                if (gid2out_it != neg_gid2out[tid].end()) {\n                    /// Increase negative PreSyn count\n                    ++gid2out_it->second->nc_cnt_;\n                }\n            }\n        }\n    }\n\n    // now, we can opportunistically create the NetCon* pointer array\n    // to save some memory overhead for\n    // \"large number of small array allocation\" by\n    // counting the number of NetCons each PreSyn and InputPreSyn point to.\n    // Conceivably the nt.netcons could become a process global array\n    
// in which case the NetCon* pointer array could become an integer index\n    // array. More speculatively, the index array could be eliminated itself\n    // if the process global NetCon array were ordered properly but that\n    // would interleave NetCon from different threads. Not a problem for\n    // serial threads but the reordering would propagate to nt.pntprocs\n    // if the NetCon data pointers are also replaced by integer indices.\n\n    // First, allocate the pointer array.\n    int n_nc = 0;\n    for (int ith = 0; ith < nrn_nthread; ++ith) {\n        n_nc += nrn_threads[ith].n_netcon;\n    }\n    netcon_in_presyn_order_.resize(n_nc);\n    n_nc = 0;\n\n    // fill the indices with the offset values and reset the nc_cnt_\n    // such that we use the nc_cnt_ in the following loop to assign the NetCon\n    // to the right place\n    // for PreSyn\n    int offset = 0;\n    for (int ith = 0; ith < nrn_nthread; ++ith) {\n        NrnThread& nt = nrn_threads[ith];\n        for (int i = 0; i < nt.n_presyn; ++i) {\n            PreSyn& ps = nt.presyns[i];\n            ps.nc_index_ = offset;\n            offset += ps.nc_cnt_;\n            ps.nc_cnt_ = 0;\n        }\n    }\n    // for InputPreSyn\n    for (auto psi: inputpresyn_) {\n        psi->nc_index_ = offset;\n        offset += psi->nc_cnt_;\n        psi->nc_cnt_ = 0;\n    }\n\n    inputpresyn_.clear();\n\n    // with gid to InputPreSyn and PreSyn maps we can setup the multisend\n    // target lists.\n    if (use_multisend_) {\n#if NRN_MULTISEND\n        nrn_multisend_setup();\n#endif\n    }\n\n    // fill the netcon_in_presyn_order and recompute nc_cnt_\n    // note that not all netcon_in_presyn will be filled if there are netcon\n    // with no presyn (ie. 
nrnthreads_netcon_srcgid[nt.id][i] = -1) but that is ok since they are\n    // only used via ps.nc_index_ and ps.nc_cnt_;\n    for (int ith = 0; ith < nrn_nthread; ++ith) {\n        NrnThread& nt = nrn_threads[ith];\n        // if single thread or file transfer then definitely empty.\n        std::vector<int>& negsrcgid_tid = nrnthreads_netcon_negsrcgid_tid[ith];\n        size_t i_tid = 0;\n        for (int i = 0; i < nt.n_netcon; ++i) {\n            NetCon* nc = nt.netcons + i;\n            int gid = nrnthreads_netcon_srcgid[ith][i];\n            int tid = ith;\n            if (!negsrcgid_tid.empty() && gid < -1) {\n                tid = negsrcgid_tid[i_tid++];\n            }\n            PreSyn* ps;\n            InputPreSyn* psi;\n            netpar_tid_gid2ps(tid, gid, &ps, &psi);\n            if (ps) {\n                netcon_in_presyn_order_[ps->nc_index_ + ps->nc_cnt_] = nc;\n                ++ps->nc_cnt_;\n                ++n_nc;\n            } else if (psi) {\n                netcon_in_presyn_order_[psi->nc_index_ + psi->nc_cnt_] = nc;\n                ++psi->nc_cnt_;\n                ++n_nc;\n            }\n        }\n    }\n\n    /// Resize the vector to its actual size of the netcons put in it\n    netcon_in_presyn_order_.resize(n_nc);\n}\n\n/// Clean up\nvoid nrn_setup_cleanup() {\n    for (int ith = 0; ith < nrn_nthread; ++ith) {\n        if (nrnthreads_netcon_srcgid[ith])\n            delete[] nrnthreads_netcon_srcgid[ith];\n    }\n    nrnthreads_netcon_srcgid.clear();\n    nrnthreads_netcon_negsrcgid_tid.clear();\n    neg_gid2out.clear();\n}\n\nvoid nrn_setup(const char* filesdat,\n               bool is_mapping_needed,\n               CheckPoints& checkPoints,\n               bool run_setup_cleanup,\n               const char* datpath,\n               const char* restore_path,\n               double* mindelay) {\n    double time = nrn_wtime();\n\n    int ngroup;\n    int* gidgroups;\n    nrn_read_filesdat(ngroup, gidgroups, filesdat);\n    
UserParams userParams(ngroup,\n                          gidgroups,\n                          datpath,\n                          strlen(restore_path) == 0 ? datpath : restore_path,\n                          checkPoints);\n\n\n    // temporary bug work around. If any process has multiple threads, no\n    // process can have a single thread. So, for now, if one thread, make two.\n    // Fortunately, empty threads work fine.\n    // Allocate NrnThread* nrn_threads of size ngroup (minimum 2)\n    // Note that rank with 0 dataset/cellgroup works fine\n    nrn_threads_create(userParams.ngroup <= 1 ? 2 : userParams.ngroup);\n\n    // from nrn_has_net_event create pnttype2presyn for use in phase2.\n    auto& memb_func = corenrn.get_memb_funcs();\n    auto& pnttype2presyn = corenrn.get_pnttype2presyn();\n    auto& nrn_has_net_event_ = corenrn.get_has_net_event();\n    pnttype2presyn.clear();\n    pnttype2presyn.resize(memb_func.size(), -1);\n    for (size_t i = 0; i < nrn_has_net_event_.size(); ++i) {\n        pnttype2presyn[nrn_has_net_event_[i]] = i;\n    }\n\n    nrnthread_chkpnt = new NrnThreadChkpnt[nrn_nthread];\n\n    if (nrn_nthread > 1) {\n        // NetCvode construction assumed one thread. Need nrn_nthread instances\n        // of NetCvodeThreadData. Here since possible checkpoint restore of\n        // tqueue at end of phase2.\n        nrn_p_construct();\n    }\n\n    if (use_solve_interleave) {\n        create_interleave_info();\n    }\n\n    /// Reserve vector of maps of size ngroup for negative gid-s\n    /// std::vector< std::map<int, PreSyn*> > neg_gid2out;\n    neg_gid2out.resize(userParams.ngroup);\n\n    // bug fix. 
gid2out is cumulative over all threads and so do not\n    // know how many there are til after phase1\n    // A process's complete set of output gids and allocation of each thread's\n    // nt.presyns and nt.netcons arrays.\n    // Generates the gid2out map which is needed\n    // to later count the required number of InputPreSyn\n    /// gid2out - map of output presyn-s\n    /// std::map<int, PreSyn*> gid2out;\n    gid2out.clear();\n\n    nrnthreads_netcon_srcgid.resize(nrn_nthread);\n    for (int i = 0; i < nrn_nthread; ++i)\n        nrnthreads_netcon_srcgid[i] = nullptr;\n\n    // Gap junctions used to be done first in the sense of reading files\n    // and calling gap_mpi_setup. But during phase2, gap_thread_setup and\n    // gap_indices_permute were called after NrnThread.data was in its final\n    // layout and mechanism permutation was determined. This is no longer\n    // ideal as it necessitates keeping setup_info_ in existence to the end\n    // of phase2.  So gap junction setup is deferred to after phase2.\n\n    nrnthreads_netcon_negsrcgid_tid.resize(nrn_nthread);\n    if (!corenrn_embedded) {\n        coreneuron::phase_wrapper<coreneuron::phase::one>(userParams);\n    } else {\n        nrn_multithread_job([](NrnThread* n) {\n            Phase1 p1{n->id};\n            NrnThread& nt = *n;\n            p1.populate(nt, mut);\n        });\n    }\n\n    // from the gid2out map and the nrnthreads_netcon_srcgid array,\n    // fill the gid2in, and from the number of entries,\n    // allocate the process wide InputPreSyn array\n    determine_inputpresyn();\n\n    // read the rest of the gidgroup's data and complete the setup for each\n    // thread.\n    /* nrn_multithread_job supports serial, pthread, and openmp. 
*/\n    coreneuron::phase_wrapper<coreneuron::phase::two>(userParams, corenrn_embedded);\n\n    // gap junctions\n    // Gaps are done after phase2, in order to use layout and permutation\n    // information via calls to stdindex2ptr.\n    if (nrn_have_gaps) {\n        nrn_partrans::transfer_thread_data_ = new nrn_partrans::TransferThreadData[nrn_nthread];\n        if (!corenrn_embedded) {\n            nrn_partrans::setup_info_ = new SetupTransferInfo[nrn_nthread];\n            coreneuron::phase_wrapper<coreneuron::gap>(userParams);\n        } else {\n            nrn_partrans::setup_info_ = (*nrn2core_get_partrans_setup_info_)(userParams.ngroup,\n                                                                             nrn_nthread,\n                                                                             sizeof(sgid_t));\n        }\n\n        nrn_multithread_job(nrn_partrans::gap_data_indices_setup);\n        nrn_partrans::gap_mpi_setup(userParams.ngroup);\n\n        // Whether allocated in NEURON or here, delete here.\n        delete[] nrn_partrans::setup_info_;\n        nrn_partrans::setup_info_ = nullptr;\n    }\n\n    if (is_mapping_needed)\n        coreneuron::phase_wrapper<coreneuron::phase::three>(userParams);\n\n    *mindelay = set_mindelay(*mindelay);\n\n    if (run_setup_cleanup)  // if run_setup_cleanup==false, user must call nrn_setup_cleanup() later\n        nrn_setup_cleanup();\n\n#if INTERLEAVE_DEBUG\n    // mk_cell_indices debug code is supposed to be used with cell-per-core permutations\n    if (corenrn_param.cell_interleave_permute == 1) {\n        mk_cell_indices();\n    }\n#endif\n\n    /// Allocate memory for fast_imem calculation\n    nrn_fast_imem_alloc();\n\n    /// Generally, tables depend on a few parameters. And if those parameters change,\n    /// then the table needs to be recomputed. This is obviously important in NEURON\n    /// since the user can change those parameters at any time. 
However, there is no\n    /// c example for CoreNEURON so can't see what it looks like in that context.\n    /// Boils down to setting up a function pointer of the function _check_table_thread(),\n    /// which is only executed by StochKV.c.\n    nrn_mk_table_check();  // was done in nrn_thread_memblist_setup in multicore.c\n\n    size_t model_size_bytes;\n\n    if (corenrn_param.model_stats) {\n        write_mech_report();\n        model_size_bytes = model_size(true);\n    } else {\n        model_size_bytes = model_size(false);\n    }\n\n    if (nrnmpi_myid == 0 && !corenrn_param.is_quiet()) {\n        printf(\" Setup Done   : %.2lf seconds \\n\", nrn_wtime() - time);\n\n        if (model_size_bytes < 1024) {\n            printf(\" Model size   : %ld bytes\\n\", model_size_bytes);\n        } else if (model_size_bytes < 1024 * 1024) {\n            printf(\" Model size   : %.2lf kB\\n\", model_size_bytes / 1024.);\n        } else if (model_size_bytes < 1024 * 1024 * 1024) {\n            printf(\" Model size   : %.2lf MB\\n\", model_size_bytes / (1024. * 1024.));\n        } else {\n            printf(\" Model size   : %.2lf GB\\n\", model_size_bytes / (1024. * 1024. 
* 1024.));\n        }\n    }\n\n    delete[] userParams.gidgroups;\n}\n\nvoid setup_ThreadData(NrnThread& nt) {\n    for (NrnThreadMembList* tml = nt.tml; tml; tml = tml->next) {\n        Memb_func& mf = corenrn.get_memb_func(tml->index);\n        Memb_list* ml = tml->ml;\n        if (mf.thread_size_) {\n            ml->_thread = (ThreadDatum*) ecalloc_align(mf.thread_size_, sizeof(ThreadDatum));\n            if (mf.thread_mem_init_) {\n                {\n                    const std::lock_guard<OMP_Mutex> lock(mut);\n                    (*mf.thread_mem_init_)(ml->_thread);\n                }\n            }\n        } else {\n            ml->_thread = nullptr;\n        }\n    }\n}\n\nvoid read_phasegap(NrnThread& nt, UserParams& userParams) {\n    auto& F = userParams.file_reader[nt.id];\n    if (F.fail()) {\n        return;\n    }\n\n    F.checkpoint(0);\n\n    int sidt_size = F.read_int();\n    assert(sidt_size == int(sizeof(sgid_t)));\n    std::size_t ntar = F.read_int();\n    std::size_t nsrc = F.read_int();\n\n    auto& si = nrn_partrans::setup_info_[nt.id];\n    si.src_sid.resize(nsrc);\n    si.src_type.resize(nsrc);\n    si.src_index.resize(nsrc);\n    if (nsrc) {\n        F.read_array<sgid_t>(si.src_sid.data(), nsrc);\n        F.read_array<int>(si.src_type.data(), nsrc);\n        F.read_array<int>(si.src_index.data(), nsrc);\n    }\n\n    si.tar_sid.resize(ntar);\n    si.tar_type.resize(ntar);\n    si.tar_index.resize(ntar);\n    if (ntar) {\n        F.read_array<sgid_t>(si.tar_sid.data(), ntar);\n        F.read_array<int>(si.tar_type.data(), ntar);\n        F.read_array<int>(si.tar_index.data(), ntar);\n    }\n\n#if CORENRN_DEBUG\n    printf(\"%d read_phasegap tid=%d nsrc=%zu ntar=%zu\\n\", nrnmpi_myid, nt.id, nsrc, ntar);\n    for (int i = 0; i < nsrc; ++i) {\n        printf(\"src %zu %d %d\\n\", size_t(si.src_sid[i]), si.src_type[i], si.src_index[i]);\n    }\n    for (int i = 0; i < ntar; ++i) {\n        printf(\"tar %zu %d %d\\n\", size_t(si.tar_sid[i]), 
si.tar_type[i], si.tar_index[i]);\n    }\n#endif\n}\n\n// This function is related to nrn_dblpntr2nrncore in NEURON to determine which values should\n// be transferred from CoreNEURON. Types correspond to the value to be transferred based on\n// mech_type enum or non-artificial cell mechanisms.\n// Takes into account alignment, layout and permutation.\n// Only voltage, i_membrane_ or mechanism data index allowed. (mtype 0 means time)\ndouble* stdindex2ptr(int mtype, int index, NrnThread& nt) {\n    if (mtype == voltage) {  // voltage\n        int ix{index};       // relative to _actual_v\n        nrn_assert((ix >= 0) && (ix < nt.end));\n        if (nt._permute) {\n            node_permute(&ix, 1, nt._permute);\n        }\n        return nt._actual_v + ix;\n    } else if (mtype == i_membrane_) {  // membrane current from fast_imem calculation\n        int ix{index};                  // relative to nrn_fast_imem->nrn_sav_rhs\n        nrn_assert((ix >= 0) && (ix < nt.end));\n        if (nt._permute) {\n            node_permute(&ix, 1, nt._permute);\n        }\n        return nt.nrn_fast_imem->nrn_sav_rhs + ix;\n    } else if (mtype > 0 && mtype < static_cast<int>(corenrn.get_memb_funcs().size())) {  //\n        Memb_list* ml = nt._ml_list[mtype];\n        nrn_assert(ml);\n        int ix = nrn_param_layout(index, mtype, ml);\n        if (ml->_permute) {\n            ix = nrn_index_permute(ix, mtype, ml);\n        }\n        return ml->data + ix;\n    } else if (mtype == 0) {  // time\n        return &nt._t;\n    } else {\n        printf(\"stdindex2ptr does not handle mtype=%d\\n\", mtype);\n        nrn_assert(0);\n    }\n    return nullptr;\n}\n\n// from i to (icnt, isz)\nvoid nrn_inverse_i_layout(int i, int& icnt, int cnt, int& isz, int sz, int layout) {\n    if (layout == Layout::AoS) {\n        icnt = i / sz;\n        isz = i % sz;\n    } else if (layout == Layout::SoA) {\n        int padded_cnt = nrn_soa_padded_size(cnt, layout);\n        icnt = i % padded_cnt;\n        
isz = i / padded_cnt;\n    } else {\n        assert(0);\n    }\n}\n\n/**\n * Cleanup global ion map created during mechanism registration\n *\n * In case of coreneuron standalone execution nrn_ion_global_map\n * can be deleted at the end of execution. But in case embedded\n * run via neuron, mechanisms are registered only once i.e. during\n * first call to coreneuron. This is why we call cleanup only in\n * case of standalone coreneuron execution via nrniv-core or\n * special-core.\n *\n * @todo coreneuron should have finalise callback which can be\n * called from NEURON for final memory cleanup including global\n * state like registered mechanisms and ions map.\n */\nvoid nrn_cleanup_ion_map() {\n    for (int i = 0; i < nrn_ion_global_map_size; i++) {\n        free_memory(nrn_ion_global_map[i]);\n    }\n    free_memory(nrn_ion_global_map);\n    nrn_ion_global_map = nullptr;\n    nrn_ion_global_map_size = 0;\n}\n\nvoid delete_fornetcon_info(NrnThread& nt) {\n    delete[] std::exchange(nt._fornetcon_perm_indices, nullptr);\n    delete[] std::exchange(nt._fornetcon_weight_perm, nullptr);\n}\n\n/* nrn_threads_free() presumes all NrnThread and NrnThreadMembList data is\n * allocated with malloc(). This is not the case here, so let's try and fix\n * things up first. 
*/\n\nvoid nrn_cleanup() {\n    clear_event_queue();  // delete left-over TQItem\n    for (auto psi: gid2in) {\n        delete psi.second;\n    }\n    gid2in.clear();\n    gid2out.clear();\n\n    // clean nrnthread_chkpnt\n    if (nrnthread_chkpnt) {\n        delete[] nrnthread_chkpnt;\n        nrnthread_chkpnt = nullptr;\n    }\n\n    // clean NrnThreads\n    for (int it = 0; it < nrn_nthread; ++it) {\n        NrnThread* nt = nrn_threads + it;\n        NrnThreadMembList* next_tml = nullptr;\n        delete_fornetcon_info(*nt);\n        delete_trajectory_requests(*nt);\n        for (NrnThreadMembList* tml = nt->tml; tml; tml = next_tml) {\n            Memb_list* ml = tml->ml;\n\n            mod_f_t s = corenrn.get_memb_func(tml->index).destructor;\n            if (s) {\n                (*s)(nt, ml, tml->index);\n            }\n\n            ml->data = nullptr;  // this was pointing into memory owned by nt\n            free_memory(ml->pdata);\n            ml->pdata = nullptr;\n            free_memory(ml->nodeindices);\n            ml->nodeindices = nullptr;\n            if (ml->_permute) {\n                delete[] ml->_permute;\n                ml->_permute = nullptr;\n            }\n\n            if (ml->_thread) {\n                free_memory(ml->_thread);\n                ml->_thread = nullptr;\n            }\n\n            // Destroy the global variables struct allocated in nrn_init\n            if (auto* const priv_dtor = corenrn.get_memb_func(tml->index).private_destructor) {\n                (*priv_dtor)(nt, ml, tml->index);\n                assert(!ml->instance);\n                assert(!ml->global_variables);\n                assert(ml->global_variables_size == 0);\n            }\n\n            NetReceiveBuffer_t* nrb = ml->_net_receive_buffer;\n            if (nrb) {\n                if (nrb->_size) {\n                    free_memory(nrb->_pnt_index);\n                    free_memory(nrb->_weight_index);\n                    free_memory(nrb->_nrb_t);\n    
                free_memory(nrb->_nrb_flag);\n                    free_memory(nrb->_displ);\n                    free_memory(nrb->_nrb_index);\n                }\n                free_memory(nrb);\n                ml->_net_receive_buffer = nullptr;\n            }\n\n            NetSendBuffer_t* nsb = ml->_net_send_buffer;\n            if (nsb) {\n                delete nsb;\n                ml->_net_send_buffer = nullptr;\n            }\n\n            if (tml->dependencies)\n                free(tml->dependencies);\n\n            next_tml = tml->next;\n            free_memory(tml->ml);\n            free_memory(tml);\n        }\n\n        nt->_actual_rhs = nullptr;\n        nt->_actual_d = nullptr;\n        nt->_actual_a = nullptr;\n        nt->_actual_b = nullptr;\n\n        free_memory(nt->_v_parent_index);\n        nt->_v_parent_index = nullptr;\n\n        free_memory(nt->_data);\n        nt->_data = nullptr;\n\n        free(nt->_idata);\n        nt->_idata = nullptr;\n\n        free_memory(nt->_vdata);\n        nt->_vdata = nullptr;\n\n        if (nt->_permute) {\n            delete[] nt->_permute;\n            nt->_permute = nullptr;\n        }\n\n        if (nt->presyns_helper) {\n            free_memory(nt->presyns_helper);\n            nt->presyns_helper = nullptr;\n        }\n\n        if (nt->pntprocs) {\n            free_memory(nt->pntprocs);\n            nt->pntprocs = nullptr;\n        }\n\n        if (nt->presyns) {\n            delete[] nt->presyns;\n            nt->presyns = nullptr;\n        }\n\n        if (nt->pnt2presyn_ix) {\n            for (size_t i = 0; i < corenrn.get_has_net_event().size(); ++i) {\n                if (nt->pnt2presyn_ix[i]) {\n                    free(nt->pnt2presyn_ix[i]);\n                }\n            }\n            free_memory(nt->pnt2presyn_ix);\n        }\n\n        if (nt->netcons) {\n            delete[] nt->netcons;\n            nt->netcons = nullptr;\n        }\n\n        if (nt->weights) {\n            
free_memory(nt->weights);\n            nt->weights = nullptr;\n        }\n\n        if (nt->_shadow_rhs) {\n            free_memory(nt->_shadow_rhs);\n            nt->_shadow_rhs = nullptr;\n        }\n\n        if (nt->_shadow_d) {\n            free_memory(nt->_shadow_d);\n            nt->_shadow_d = nullptr;\n        }\n\n        if (nt->_net_send_buffer_size) {\n            free_memory(nt->_net_send_buffer);\n            nt->_net_send_buffer = nullptr;\n            nt->_net_send_buffer_size = 0;\n        }\n\n        if (nt->_watch_types) {\n            free(nt->_watch_types);\n            nt->_watch_types = nullptr;\n        }\n\n        // mapping information is available only for non-empty NrnThread\n        if (nt->mapping && nt->ncell) {\n            delete ((NrnThreadMappingInfo*) nt->mapping);\n        }\n\n        free_memory(nt->_ml_list);\n\n        if (nt->nrn_fast_imem) {\n            fast_imem_free();\n        }\n    }\n\n#if NRN_MULTISEND\n    nrn_multisend_cleanup();\n#endif\n\n    netcon_in_presyn_order_.clear();\n\n    nrn_threads_free();\n\n    if (!corenrn.get_pnttype2presyn().empty()) {\n        corenrn.get_pnttype2presyn().clear();\n    }\n\n    destroy_interleave_info();\n\n    nrn_partrans::gap_cleanup();\n}\n\nvoid delete_trajectory_requests(NrnThread& nt) {\n    if (nt.trajec_requests) {\n        TrajectoryRequests* tr = nt.trajec_requests;\n        if (tr->n_trajec) {\n            delete[] tr->vpr;\n            if (tr->scatter) {\n                delete[] tr->scatter;\n            }\n            if (tr->varrays) {\n                delete[] tr->varrays;\n            }\n            delete[] tr->gather;\n        }\n        delete nt.trajec_requests;\n        nt.trajec_requests = nullptr;\n    }\n}\n\nvoid read_phase1(NrnThread& nt, UserParams& userParams) {\n    Phase1 p1{userParams.file_reader[nt.id]};\n\n    // Protect gid2in, gid2out and neg_gid2out\n    p1.populate(nt, mut);\n}\n\nvoid read_phase2(NrnThread& nt, UserParams& userParams) 
{\n    Phase2 p2;\n    if (corenrn_embedded) {\n        p2.read_direct(nt.id, nt);\n    } else {\n        p2.read_file(userParams.file_reader[nt.id], nt);\n    }\n    p2.populate(nt, userParams);\n}\n\n/** read mapping information for neurons */\nvoid read_phase3(NrnThread& nt, UserParams& userParams) {\n    /** restore checkpoint state (before restoring queue items) */\n    auto& F = userParams.file_reader[nt.id];\n    F.restore_checkpoint();\n\n    /** mapping information for all neurons in a single NrnThread */\n    NrnThreadMappingInfo* ntmapping = new NrnThreadMappingInfo();\n\n    int count = 0;\n\n    F.read_mapping_cell_count(&count);\n\n    /** number of cells in the mapping file should equal the number of cells in NrnThread */\n    nrn_assert(count == nt.ncell);\n\n    /** for every neuron */\n    for (int i = 0; i < nt.ncell; i++) {\n        int gid, nsec, nseg, nseclist;\n\n        // read counts\n        F.read_mapping_count(&gid, &nsec, &nseg, &nseclist);\n\n        CellMapping* cmap = new CellMapping(gid);\n\n        // read section-segment mapping for every section list\n        for (int j = 0; j < nseclist; j++) {\n            SecMapping* smap = new SecMapping();\n            F.read_mapping_info(smap);\n            cmap->add_sec_map(smap);\n        }\n\n        ntmapping->add_cell_mapping(cmap);\n    }\n\n    // check that the number of cells matches the mapping size\n    nrn_assert((int) ntmapping->size() == nt.ncell);\n\n    // set pointer in NrnThread\n    nt.mapping = (void*) ntmapping;\n    nt.summation_report_handler_ = std::make_unique<SummationReportMapping>();\n}\n\n/* Returns the size of the dynamically allocated memory for NrnThreadMembList\n * Includes:\n *  - Size of NrnThreadMembList\n *  - Size of Memb_list\n *  - Size of nodeindices\n *  - Size of _permute\n *  - Size of _thread\n *  - Size of NetReceive and NetSend Buffers\n *  - Size of int variables\n *  - Size of double variables (If include_data is enabled. 
Those variables are already counted\n * since they point to nt->_data.)\n */\nsize_t memb_list_size(NrnThreadMembList* tml, bool include_data) {\n    size_t nbyte = sizeof(NrnThreadMembList) + sizeof(Memb_list);\n    nbyte += tml->ml->nodecount * sizeof(int);\n    if (tml->ml->_permute) {\n        nbyte += tml->ml->nodecount * sizeof(int);\n    }\n    if (tml->ml->_thread) {\n        Memb_func& mf = corenrn.get_memb_func(tml->index);\n        nbyte += mf.thread_size_ * sizeof(ThreadDatum);\n    }\n    if (tml->ml->_net_receive_buffer) {\n        nbyte += sizeof(NetReceiveBuffer_t) + tml->ml->_net_receive_buffer->size_of_object();\n    }\n    if (tml->ml->_net_send_buffer) {\n        nbyte += sizeof(NetSendBuffer_t) + tml->ml->_net_send_buffer->size_of_object();\n    }\n    if (include_data) {\n        nbyte += corenrn.get_prop_param_size()[tml->index] * tml->ml->nodecount * sizeof(double);\n    }\n    nbyte += corenrn.get_prop_dparam_size()[tml->index] * tml->ml->nodecount * sizeof(Datum);\n#ifdef DEBUG\n    int i = tml->index;\n    printf(\"%s %d psize=%d ppsize=%d cnt=%d nbyte=%ld\\n\",\n           corenrn.get_memb_func(i).sym,\n           i,\n           corenrn.get_prop_param_size()[i],\n           corenrn.get_prop_dparam_size()[i],\n           tml->ml->nodecount,\n           nbyte);\n#endif\n    return nbyte;\n}\n\n/// Approximate count of number of bytes for the gid2out map\nsize_t output_presyn_size(void) {\n    if (gid2out.empty()) {\n        return 0;\n    }\n    size_t nbyte = sizeof(gid2out) + sizeof(int) * gid2out.size() +\n                   sizeof(PreSyn*) * gid2out.size();\n#ifdef DEBUG\n    printf(\" gid2out table bytes=~%ld size=%ld\\n\", nbyte, gid2out.size());\n#endif\n    return nbyte;\n}\n\nsize_t input_presyn_size(void) {\n    if (gid2in.empty()) {\n        return 0;\n    }\n    size_t nbyte = sizeof(gid2in) + sizeof(int) * gid2in.size() +\n                   sizeof(InputPreSyn*) * gid2in.size();\n#ifdef DEBUG\n    printf(\" gid2in table 
bytes=~%ld size=%ld\\n\", nbyte, gid2in.size());\n#endif\n    return nbyte;\n}\n\nsize_t model_size(bool detailed_report) {\n    long nbyte = 0;\n    size_t sz_nrnThread = sizeof(NrnThread);\n    size_t sz_presyn = sizeof(PreSyn);\n    size_t sz_input_presyn = sizeof(InputPreSyn);\n    size_t sz_netcon = sizeof(NetCon);\n    size_t sz_pntproc = sizeof(Point_process);\n    size_t nccnt = 0;\n\n    std::vector<long> size_data(13, 0);\n    std::vector<long> global_size_data_min(13, 0);\n    std::vector<long> global_size_data_max(13, 0);\n    std::vector<long> global_size_data_sum(13, 0);\n    std::vector<float> global_size_data_avg(13, 0.0);\n\n    for (int i = 0; i < nrn_nthread; ++i) {\n        NrnThread& nt = nrn_threads[i];\n        size_t nb_nt = 0;  // per thread\n        nccnt += nt.n_netcon;\n\n        // Memb_list size\n        int nmech = 0;\n        for (auto tml = nt.tml; tml; tml = tml->next) {\n            nb_nt += memb_list_size(tml, false);\n            ++nmech;\n        }\n\n        // basic thread size includes mechanism data and G*V=I matrix\n        nb_nt += sz_nrnThread;\n        nb_nt += nt._ndata * sizeof(double) + nt._nidata * sizeof(int) + nt._nvdata * sizeof(void*);\n        nb_nt += nt.end * sizeof(int);  // _v_parent_index\n\n        // network connectivity\n        nb_nt += nt.n_pntproc * sz_pntproc + nt.n_netcon * sz_netcon + nt.n_presyn * sz_presyn +\n                 nt.n_input_presyn * sz_input_presyn + nt.n_weight * sizeof(double);\n        nbyte += nb_nt;\n\n#ifdef DEBUG\n        printf(\"ncell=%d end=%d nmech=%d\\n\", nt.ncell, nt.end, nmech);\n        printf(\"ndata=%ld nidata=%ld nvdata=%ld\\n\", nt._ndata, nt._nidata, nt._nvdata);\n        printf(\"nbyte so far %ld\\n\", nb_nt);\n        printf(\"n_presyn = %d sz=%ld nbyte=%ld\\n\", nt.n_presyn, sz_presyn, nt.n_presyn * sz_presyn);\n        printf(\"n_input_presyn = %d sz=%ld nbyte=%ld\\n\",\n               nt.n_input_presyn,\n               sz_input_presyn,\n               
nt.n_input_presyn * sz_input_presyn);\n        printf(\"n_pntproc=%d sz=%ld nbyte=%ld\\n\",\n               nt.n_pntproc,\n               sz_pntproc,\n               nt.n_pntproc * sz_pntproc);\n        printf(\"n_netcon=%d sz=%ld nbyte=%ld\\n\", nt.n_netcon, sz_netcon, nt.n_netcon * sz_netcon);\n        printf(\"n_weight = %d\\n\", nt.n_weight);\n\n        printf(\"%d thread %d total bytes %ld\\n\", nrnmpi_myid, i, nb_nt);\n#endif\n\n        if (detailed_report) {\n            size_data[0] += nt.ncell;\n            size_data[1] += nt.end;\n            size_data[2] += nmech;\n            size_data[3] += nt._ndata;\n            size_data[4] += nt._nidata;\n            size_data[5] += nt._nvdata;\n            size_data[6] += nt.n_presyn;\n            size_data[7] += nt.n_input_presyn;\n            size_data[8] += nt.n_pntproc;\n            size_data[9] += nt.n_netcon;\n            size_data[10] += nt.n_weight;\n            size_data[11] += nb_nt;\n        }\n    }\n\n    nbyte += nccnt * sizeof(NetCon*);\n    nbyte += output_presyn_size();\n    nbyte += input_presyn_size();\n\n    nbyte += nrnran123_instance_count() * nrnran123_state_size();\n\n#ifdef DEBUG\n    printf(\"%d netcon pointers %ld  nbyte=%ld\\n\", nrnmpi_myid, nccnt, nccnt * sizeof(NetCon*));\n    printf(\"nrnran123 size=%ld cnt=%ld nbyte=%ld\\n\",\n           nrnran123_state_size(),\n           nrnran123_instance_count(),\n           nrnran123_instance_count() * nrnran123_state_size());\n    printf(\"%d total bytes %ld\\n\", nrnmpi_myid, nbyte);\n#endif\n    if (detailed_report) {\n        size_data[12] = nbyte;\n#if NRNMPI\n        if (corenrn_param.mpi_enable) {\n            // last arg is op type where 1 is sum, 2 is max and any other value is min\n            nrnmpi_long_allreduce_vec(&size_data[0], &global_size_data_sum[0], 13, 1);\n            nrnmpi_long_allreduce_vec(&size_data[0], &global_size_data_max[0], 13, 2);\n            nrnmpi_long_allreduce_vec(&size_data[0], &global_size_data_min[0], 
13, 3);\n            for (int i = 0; i < 13; i++) {\n                global_size_data_avg[i] = global_size_data_sum[i] / float(nrnmpi_numprocs);\n            }\n        } else\n#endif\n        {\n            global_size_data_max = size_data;\n            global_size_data_min = size_data;\n            global_size_data_avg.assign(size_data.cbegin(), size_data.cend());\n        }\n        // now print the collected data:\n        if (nrnmpi_myid == 0) {\n            printf(\"Memory size information for all NrnThreads per rank\\n\");\n            printf(\"------------------------------------------------------------------\\n\");\n            printf(\"%22s %12s %12s %12s\\n\", \"field\", \"min\", \"max\", \"avg\");\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"n_cell\",\n                   global_size_data_min[0],\n                   global_size_data_max[0],\n                   global_size_data_avg[0]);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"n_compartment\",\n                   global_size_data_min[1],\n                   global_size_data_max[1],\n                   global_size_data_avg[1]);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"n_mechanism\",\n                   global_size_data_min[2],\n                   global_size_data_max[2],\n                   global_size_data_avg[2]);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"_ndata\",\n                   global_size_data_min[3],\n                   global_size_data_max[3],\n                   global_size_data_avg[3]);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"_nidata\",\n                   global_size_data_min[4],\n                   global_size_data_max[4],\n                   global_size_data_avg[4]);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"_nvdata\",\n                   global_size_data_min[5],\n                   
global_size_data_max[5],\n                   global_size_data_avg[5]);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"n_presyn\",\n                   global_size_data_min[6],\n                   global_size_data_max[6],\n                   global_size_data_avg[6]);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"n_presyn (bytes)\",\n                   global_size_data_min[6] * sz_presyn,\n                   global_size_data_max[6] * sz_presyn,\n                   global_size_data_avg[6] * sz_presyn);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"n_input_presyn\",\n                   global_size_data_min[7],\n                   global_size_data_max[7],\n                   global_size_data_avg[7]);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"n_input_presyn (bytes)\",\n                   global_size_data_min[7] * sz_input_presyn,\n                   global_size_data_max[7] * sz_input_presyn,\n                   global_size_data_avg[7] * sz_input_presyn);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"n_pntproc\",\n                   global_size_data_min[8],\n                   global_size_data_max[8],\n                   global_size_data_avg[8]);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"n_pntproc (bytes)\",\n                   global_size_data_min[8] * sz_pntproc,\n                   global_size_data_max[8] * sz_pntproc,\n                   global_size_data_avg[8] * sz_pntproc);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"n_netcon\",\n                   global_size_data_min[9],\n                   global_size_data_max[9],\n                   global_size_data_avg[9]);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"n_netcon (bytes)\",\n                   global_size_data_min[9] * sz_netcon,\n                   global_size_data_max[9] * 
sz_netcon,\n                   global_size_data_avg[9] * sz_netcon);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"n_weight\",\n                   global_size_data_min[10],\n                   global_size_data_max[10],\n                   global_size_data_avg[10]);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"NrnThread (bytes)\",\n                   global_size_data_min[11],\n                   global_size_data_max[11],\n                   global_size_data_avg[11]);\n            printf(\"%22s %12ld %12ld %15.2f\\n\",\n                   \"model size (bytes)\",\n                   global_size_data_min[12],\n                   global_size_data_max[12],\n                   global_size_data_avg[12]);\n        }\n    }\n\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        long global_nbyte = 0;\n        nrnmpi_long_allreduce_vec(&nbyte, &global_nbyte, 1, 1);\n        nbyte = global_nbyte;\n    }\n#endif\n\n    return nbyte;\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/nrn_setup.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include <string>\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/io/nrn_filehandler.hpp\"\n#include \"coreneuron/io/nrn2core_direct.h\"\n#include \"coreneuron/io/user_params.hpp\"\n#include \"coreneuron/io/mem_layout_util.hpp\"\n#include \"coreneuron/io/nrn_checkpoint.hpp\"\n\nnamespace coreneuron {\nvoid read_phase1(NrnThread& nt, UserParams& userParams);\nvoid read_phase2(NrnThread& nt, UserParams& userParams);\nvoid read_phase3(NrnThread& nt, UserParams& userParams);\nvoid read_phasegap(NrnThread& nt, UserParams& userParams);\nvoid setup_ThreadData(NrnThread& nt);\n\nvoid nrn_setup(const char* filesdat,\n               bool is_mapping_needed,\n               CheckPoints& checkPoints,\n               bool run_setup_cleanup = true,\n               const char* datapath = \"\",\n               const char* restore_path = \"\",\n               double* mindelay = nullptr);\n\n// Functions to load and clean data;\nextern void nrn_init_and_load_data(int argc,\n                                   char** argv,\n                                   CheckPoints& checkPoints,\n                                   bool is_mapping_needed = false,\n                                   bool run_setup_cleanup = true);\nextern void allocate_data_in_mechanism_nrn_init();\nextern void nrn_setup_cleanup();\n\nextern int nrn_i_layout(int i, int cnt, int j, int size, int layout);\n\nsize_t memb_list_size(NrnThreadMembList* tml, bool include_data);\n\nsize_t model_size(bool detailed_report);\n\nnamespace coreneuron {\n\n\n/// Reading phase number.\nenum phase { one = 1, two, three, gap };\n\n/// Get the phase number in form of the string.\ntemplate <phase P>\ninline std::string 
getPhaseName();\n\ntemplate <>\ninline std::string getPhaseName<one>() {\n    return \"1\";\n}\n\ntemplate <>\ninline std::string getPhaseName<two>() {\n    return \"2\";\n}\n\ntemplate <>\ninline std::string getPhaseName<three>() {\n    return \"3\";\n}\n\ntemplate <>\ninline std::string getPhaseName<gap>() {\n    return \"gap\";\n}\n\n/// Reading phase selector.\ntemplate <phase P>\ninline void read_phase_aux(NrnThread& nt, UserParams&);\n\ntemplate <>\ninline void read_phase_aux<one>(NrnThread& nt, UserParams& userParams) {\n    read_phase1(nt, userParams);\n}\n\ntemplate <>\ninline void read_phase_aux<two>(NrnThread& nt, UserParams& userParams) {\n    read_phase2(nt, userParams);\n}\n\ntemplate <>\ninline void read_phase_aux<three>(NrnThread& nt, UserParams& userParams) {\n    read_phase3(nt, userParams);\n}\n\ntemplate <>\ninline void read_phase_aux<gap>(NrnThread& nt, UserParams& userParams) {\n    read_phasegap(nt, userParams);\n}\n\n/// Reading phase wrapper for each neuron group.\ntemplate <phase P>\ninline void* phase_wrapper_w(NrnThread* nt, UserParams& userParams, bool in_memory_transfer) {\n    int i = nt->id;\n    if (i < userParams.ngroup) {\n        if (!in_memory_transfer) {\n            const char* data_dir = userParams.path;\n            // directory to read could be different for phase 2 if we are restoring\n            // all other phases still read from dataset directory because the data\n            // is constant\n            if (P == 2) {\n                data_dir = userParams.restore_path;\n            }\n\n            std::string fname = std::string(data_dir) + \"/\" +\n                                std::to_string(userParams.gidgroups[i]) + \"_\" + getPhaseName<P>() +\n                                \".dat\";\n\n            // Avoid trying to open the gid_gap.dat file if it doesn't exist when there are no\n            // gap junctions in this gid.\n            // Note that we still need to close `userParams.file_reader[i]`\n            
// because files are opened in the order of `gid_1.dat`, `gid_2.dat` and `gid_gap.dat`.\n            // When we open next file, `gid_gap.dat` in this case, we are supposed to close the\n            // handle for `gid_2.dat` even though file doesn't exist.\n            if (P == gap && !FileHandler::file_exist(fname)) {\n                userParams.file_reader[i].close();\n            } else {\n                // if no file failed to open or not opened at all\n                userParams.file_reader[i].open(fname);\n            }\n        }\n        read_phase_aux<P>(*nt, userParams);\n        if (!in_memory_transfer) {\n            userParams.file_reader[i].close();\n        }\n        if (P == 2) {\n            setup_ThreadData(*nt);\n        }\n    }\n    return nullptr;\n}\n\n/// Specific phase reading executed by threads.\ntemplate <phase P>\ninline static void phase_wrapper(UserParams& userParams, int direct = 0) {\n    nrn_multithread_job(phase_wrapper_w<P>, userParams, direct != 0);\n}\n}  // namespace coreneuron\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/nrnsection_mapping.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include <numeric>\n#include <string>\n#include <utility>\n#include <vector>\n#include <map>\n#include <iostream>\n\nnamespace coreneuron {\n\n/** type to store every section and associated segments */\nusing segvec_type = std::vector<int>;\nusing secseg_map_type = std::map<int, segvec_type>;\nusing secseg_it_type = secseg_map_type::iterator;\n\n/** @brief Section to segment mapping\n *\n *  For a section list (of a particulat type), store mapping\n *  of section to segments\n *  a section is a arbitrary user classification to recognize some segments (ex: api, soma, dend,\n * axon)\n *\n */\nstruct SecMapping {\n    /** name of section list */\n    std::string name;\n\n    /** map of section and associated segments */\n    secseg_map_type secmap;\n\n    SecMapping() = default;\n\n    explicit SecMapping(std::string s)\n        : name(std::move(s)) {}\n\n    /** @brief return total number of sections in section list */\n    size_t num_sections() const noexcept {\n        return secmap.size();\n    }\n\n    /** @brief return number of segments in section list */\n    size_t num_segments() const {\n        return std::accumulate(secmap.begin(), secmap.end(), 0, [](int psum, const auto& item) {\n            return psum + item.second.size();\n        });\n    }\n\n    /** @brief add section to associated segment */\n    void add_segment(int sec, int seg) {\n        secmap[sec].push_back(seg);\n    }\n};\n\n/** @brief Compartment mapping information for a cell\n *\n * A cell can have multiple section list types like\n * soma, axon, apic, dend etc. 
User will add these\n * section lists using HOC interface.\n */\nstruct CellMapping {\n    /** gid of a cell */\n    int gid;\n\n    /** list of section lists (like soma, axon, apic) */\n    std::vector<SecMapping*> secmapvec;\n\n    CellMapping(int g)\n        : gid(g) {}\n\n    /** @brief total number of sections in a cell */\n    int num_sections() const {\n        return std::accumulate(secmapvec.begin(),\n                               secmapvec.end(),\n                               0,\n                               [](int psum, const auto& secmap) {\n                                   return psum + secmap->num_sections();\n                               });\n    }\n\n    /** @brief return number of segments in a cell */\n    int num_segments() const {\n        return std::accumulate(secmapvec.begin(),\n                               secmapvec.end(),\n                               0,\n                               [](int psum, const auto& secmap) {\n                                   return psum + secmap->num_segments();\n                               });\n    }\n\n    /** @brief number of section lists */\n    size_t size() const noexcept {\n        return secmapvec.size();\n    }\n\n    /** @brief add new SecMapping */\n    void add_sec_map(SecMapping* s) {\n        secmapvec.push_back(s);\n    }\n\n    /** @brief return section list mapping with given name */\n    SecMapping* get_seclist_mapping(const std::string& name) const {\n        for (auto& secmap: secmapvec) {\n            if (name == secmap->name) {\n                return secmap;\n            }\n        }\n\n        std::cout << \"Warning: Section mapping list \" << name << \" doesn't exist! 
\\n\";\n        return nullptr;\n    }\n\n    /** @brief return segment count for specific section list with given name */\n    size_t get_seclist_segment_count(const std::string& name) const {\n        SecMapping* s = get_seclist_mapping(name);\n        size_t count = 0;\n        if (s) {\n            count = s->num_segments();\n        }\n        return count;\n    }\n    /** @brief return section count for specific section list with given name */\n    size_t get_seclist_section_count(const std::string& name) const {\n        SecMapping* s = get_seclist_mapping(name);\n        size_t count = 0;\n        if (s) {\n            count = s->num_sections();\n        }\n        return count;\n    }\n\n    ~CellMapping() {\n        for (size_t i = 0; i < secmapvec.size(); i++) {\n            delete secmapvec[i];\n        }\n    }\n};\n\n/** @brief Compartment mapping information for NrnThread\n *\n * NrnThread could have more than one cell in cellgroup\n * and we store this in vector.\n */\nstruct NrnThreadMappingInfo {\n    /** list of cells mapping */\n    std::vector<CellMapping*> mappingvec;\n\n    /** @brief number of cells */\n    size_t size() const {\n        return mappingvec.size();\n    }\n\n    /** @brief memory cleanup */\n    ~NrnThreadMappingInfo() {\n        for (size_t i = 0; i < mappingvec.size(); i++) {\n            delete mappingvec[i];\n        }\n    }\n\n    /** @brief get cell mapping information for given gid\n     *\tif it exists, otherwise return nullptr.\n     */\n    CellMapping* get_cell_mapping(int gid) const {\n        for (const auto& mapping: mappingvec) {\n            if (mapping->gid == gid) {\n                return mapping;\n            }\n        }\n        return nullptr;\n    }\n\n    /** @brief add mapping information of new cell */\n    void add_cell_mapping(CellMapping* c) {\n        mappingvec.push_back(c);\n    }\n};\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/output_spikes.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <iostream>\n#include <sstream>\n#include <cstring>\n#include <stdexcept>  // std::lenght_error\n#include <vector>\n#include <algorithm>\n#include <numeric>\n#include <limits>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/io/nrn2core_direct.h\"\n#include \"coreneuron/io/output_spikes.hpp\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/mpi/core/nrnmpi.hpp\"\n#include \"coreneuron/utils/nrnmutdec.hpp\"\n#include \"coreneuron/mpi/nrnmpidec.h\"\n#include \"coreneuron/utils/string_utils.h\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n#ifdef ENABLE_SONATA_REPORTS\n#include \"bbp/sonata/reports.h\"\n#endif  // ENABLE_SONATA_REPORTS\n\n/**\n * @brief Return all spike vectors to NEURON\n *\n * @param spiketvec - vector of spikes at the end of CORENEURON simulation\n * @param spikegidvec - vector of gids at the end of CORENEURON simulation\n * @return true if we are in embedded_run and NEURON has successfully retrieved the vectors\n */\nstatic bool all_spikes_return(std::vector<double>& spiketvec, std::vector<int>& spikegidvec) {\n    return corenrn_embedded && nrn2core_all_spike_vectors_return_ &&\n           (*nrn2core_all_spike_vectors_return_)(spiketvec, spikegidvec);\n}\n\nnamespace coreneuron {\n/// --> Coreneuron as SpikeBuffer class\nstd::vector<double> spikevec_time;\nstd::vector<int> spikevec_gid;\n\nstatic OMP_Mutex mut;\n\nvoid mk_spikevec_buffer(int sz) {\n    try {\n        spikevec_time.reserve(sz);\n        spikevec_gid.reserve(sz);\n    } catch (const std::length_error& le) {\n        std::cerr << \"Lenght error\" << le.what() << std::endl;\n    }\n}\n\nvoid spikevec_lock() {\n    mut.lock();\n}\n\nvoid spikevec_unlock() {\n    
mut.unlock();\n}\n\nstatic void local_spikevec_sort(std::vector<double>& isvect,\n                                std::vector<int>& isvecg,\n                                std::vector<double>& osvect,\n                                std::vector<int>& osvecg) {\n    osvect.resize(isvect.size());\n    osvecg.resize(isvecg.size());\n    // first build a permutation vector\n    std::vector<std::size_t> perm(isvect.size());\n    std::iota(perm.begin(), perm.end(), 0);\n    // sort by gid (second predicate first)\n    std::stable_sort(perm.begin(), perm.end(), [&](std::size_t i, std::size_t j) {\n        return isvecg[i] < isvecg[j];\n    });\n    // then sort by time\n    std::stable_sort(perm.begin(), perm.end(), [&](std::size_t i, std::size_t j) {\n        return isvect[i] < isvect[j];\n    });\n    // now apply permutation to time and gid output vectors\n    std::transform(perm.begin(), perm.end(), osvect.begin(), [&](std::size_t i) {\n        return isvect[i];\n    });\n    std::transform(perm.begin(), perm.end(), osvecg.begin(), [&](std::size_t i) {\n        return isvecg[i];\n    });\n}\n\n#if NRNMPI\n\nstatic void sort_spikes(std::vector<double>& spikevec_time, std::vector<int>& spikevec_gid) {\n    double lmin_time = std::numeric_limits<double>::max();\n    double lmax_time = std::numeric_limits<double>::min();\n    if (!spikevec_time.empty()) {\n        lmin_time = *(std::min_element(spikevec_time.begin(), spikevec_time.end()));\n        lmax_time = *(std::max_element(spikevec_time.begin(), spikevec_time.end()));\n    }\n    double min_time = nrnmpi_dbl_allmin(lmin_time);\n    double max_time = nrnmpi_dbl_allmax(lmax_time);\n\n    // allocate send and receive counts and displacements for MPI_Alltoallv\n    std::vector<int> snd_cnts(nrnmpi_numprocs);\n    std::vector<int> rcv_cnts(nrnmpi_numprocs);\n    std::vector<int> snd_dsps(nrnmpi_numprocs);\n    std::vector<int> rcv_dsps(nrnmpi_numprocs);\n\n    double bin_t = (max_time - min_time) / nrnmpi_numprocs;\n   
 bin_t = bin_t ? bin_t : 1;\n    // first find number of spikes in each time window\n    for (const auto& st: spikevec_time) {\n        int idx = (int) (st - min_time) / bin_t;\n        snd_cnts[idx]++;\n    }\n    for (int i = 1; i < nrnmpi_numprocs; i++) {\n        snd_dsps[i] = snd_dsps[i - 1] + snd_cnts[i - 1];\n    }\n\n    // now let each rank know how many spikes they will receive\n    // and get in turn all the buffer sizes to receive\n    nrnmpi_int_alltoall(&snd_cnts[0], &rcv_cnts[0], 1);\n    for (int i = 1; i < nrnmpi_numprocs; i++) {\n        rcv_dsps[i] = rcv_dsps[i - 1] + rcv_cnts[i - 1];\n    }\n    std::size_t new_sz = 0;\n    for (const auto& r: rcv_cnts) {\n        new_sz += r;\n    }\n    // prepare new sorted vectors\n    std::vector<double> svt_buf(new_sz, 0.0);\n    std::vector<int> svg_buf(new_sz, 0);\n\n    // now exchange data\n    nrnmpi_dbl_alltoallv(spikevec_time.data(),\n                         &snd_cnts[0],\n                         &snd_dsps[0],\n                         svt_buf.data(),\n                         &rcv_cnts[0],\n                         &rcv_dsps[0]);\n    nrnmpi_int_alltoallv(spikevec_gid.data(),\n                         &snd_cnts[0],\n                         &snd_dsps[0],\n                         svg_buf.data(),\n                         &rcv_cnts[0],\n                         &rcv_dsps[0]);\n\n    local_spikevec_sort(svt_buf, svg_buf, spikevec_time, spikevec_gid);\n}\n\n#ifdef ENABLE_SONATA_REPORTS\n/** Split spikevec_time and spikevec_gid by populations\n *  Add spike data with population name and gid offset tolibsonatareport API\n */\nvoid output_spike_populations(const SpikesInfo& spikes_info) {\n    // Write spikes with default population name and offset\n    if (spikes_info.population_info.empty()) {\n        sonata_add_spikes_population(\"All\",\n                                     0,\n                                     spikevec_time.data(),\n                                     spikevec_time.size(),\n  
                                   spikevec_gid.data(),\n                                     spikevec_gid.size());\n        return;\n    }\n    int n_populations = spikes_info.population_info.size();\n    for (int idx = 0; idx < n_populations; idx++) {\n        const auto& curr_pop = spikes_info.population_info[idx];\n        std::string population_name = curr_pop.first;\n        int population_offset = curr_pop.second;\n        int gid_lower = population_offset;\n        int gid_upper = std::numeric_limits<int>::max();\n        if (idx != n_populations - 1) {\n            gid_upper = spikes_info.population_info[idx + 1].second - 1;\n        }\n        std::vector<double> pop_spikevec_time;\n        std::vector<int> pop_spikevec_gid;\n        for (int j = 0; j < spikevec_gid.size(); j++) {\n            if (spikevec_gid[j] >= gid_lower && spikevec_gid[j] <= gid_upper) {\n                pop_spikevec_time.push_back(spikevec_time[j]);\n                pop_spikevec_gid.push_back(spikevec_gid[j]);\n            }\n        }\n        sonata_add_spikes_population(population_name.data(),\n                                     population_offset,\n                                     pop_spikevec_time.data(),\n                                     pop_spikevec_time.size(),\n                                     pop_spikevec_gid.data(),\n                                     pop_spikevec_gid.size());\n    }\n}\n#endif  // ENABLE_SONATA_REPORTS\n\n/** Write generated spikes to out.dat using mpi parallel i/o.\n *  \\todo : MPI related code should be factored into nrnmpi.c\n *          Check spike record length which is set to 64 chars\n */\nstatic void output_spikes_parallel(const char* outpath, const SpikesInfo& spikes_info) {\n    std::stringstream ss;\n    ss << outpath << \"/out.dat\";\n    std::string fname = ss.str();\n\n    // remove if file already exist\n    if (nrnmpi_myid == 0) {\n        remove(fname.c_str());\n    }\n#ifdef ENABLE_SONATA_REPORTS\n    
sonata_create_spikefile(outpath, spikes_info.file_name.data());\n    output_spike_populations(spikes_info);\n    sonata_write_spike_populations();\n    sonata_close_spikefile();\n#endif  // ENABLE_SONATA_REPORTS\n\n    sort_spikes(spikevec_time, spikevec_gid);\n    nrnmpi_barrier();\n\n    // each spike record in the file is time + gid (64 chars sufficient)\n    const int SPIKE_RECORD_LEN = 64;\n    size_t num_spikes = spikevec_gid.size();\n    size_t num_bytes = (sizeof(char) * num_spikes * SPIKE_RECORD_LEN);\n    char* spike_data = (char*) malloc(num_bytes);\n\n    if (spike_data == nullptr) {\n        printf(\"Error while writing spikes due to memory allocation\\n\");\n        return;\n    }\n\n    // empty if no spikes\n    strcpy(spike_data, \"\");\n\n    // populate buffer with all spike entries\n    char spike_entry[SPIKE_RECORD_LEN];\n    size_t spike_data_offset = 0;\n    for (size_t i = 0; i < num_spikes; i++) {\n        int spike_entry_chars =\n            snprintf(spike_entry, 64, \"%.8g\\t%d\\n\", spikevec_time[i], spikevec_gid[i]);\n        spike_data_offset =\n            strcat_at_pos(spike_data, spike_data_offset, spike_entry, spike_entry_chars);\n    }\n\n    // calculate offset into global file. 
note that we don't write\n    // all num_bytes but only \"populated\" buffer\n    size_t num_chars = strlen(spike_data);\n\n    nrnmpi_write_file(fname, spike_data, num_chars);\n\n    free(spike_data);\n}\n#endif\n\nstatic void output_spikes_serial(const char* outpath) {\n    std::stringstream ss;\n    ss << outpath << \"/out.dat\";\n    std::string fname = ss.str();\n\n    // reserve some space for sorted spikevec buffers\n    std::vector<double> sorted_spikevec_time(spikevec_time.size());\n    std::vector<int> sorted_spikevec_gid(spikevec_gid.size());\n    local_spikevec_sort(spikevec_time, spikevec_gid, sorted_spikevec_time, sorted_spikevec_gid);\n\n    // remove if file already exist\n    remove(fname.c_str());\n\n    FILE* f = fopen(fname.c_str(), \"w\");\n    if (!f && nrnmpi_myid == 0) {\n        std::cout << \"WARNING: Could not open file for writing spikes.\" << std::endl;\n        return;\n    }\n\n    for (std::size_t i = 0; i < sorted_spikevec_gid.size(); ++i)\n        if (sorted_spikevec_gid[i] > -1)\n            fprintf(f, \"%.8g\\t%d\\n\", sorted_spikevec_time[i], sorted_spikevec_gid[i]);\n\n    fclose(f);\n}\n\nvoid output_spikes(const char* outpath, const SpikesInfo& spikes_info) {\n    // try to transfer spikes to NEURON. 
If successful, don't write out.dat\n    if (all_spikes_return(spikevec_time, spikevec_gid)) {\n        clear_spike_vectors();\n        return;\n    }\n#if NRNMPI\n    if (corenrn_param.mpi_enable && nrnmpi_initialized()) {\n        output_spikes_parallel(outpath, spikes_info);\n    } else\n#endif\n    {\n        output_spikes_serial(outpath);\n    }\n    clear_spike_vectors();\n}\n\nvoid clear_spike_vectors() {\n    auto spikevec_time_capacity = spikevec_time.capacity();\n    auto spikevec_gid_capacity = spikevec_gid.capacity();\n    spikevec_time.clear();\n    spikevec_gid.clear();\n    spikevec_time.reserve(spikevec_time_capacity);\n    spikevec_gid.reserve(spikevec_gid_capacity);\n}\n\nvoid validation(std::vector<std::pair<double, int>>& res) {\n    for (unsigned i = 0; i < spikevec_gid.size(); ++i)\n        if (spikevec_gid[i] > -1)\n            res.push_back(std::make_pair(spikevec_time[i], spikevec_gid[i]));\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/output_spikes.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include <string>\n#include <vector>\n#include <utility>\n#include \"coreneuron/io/reports/nrnreport.hpp\"\nnamespace coreneuron {\nvoid output_spikes(const char* outpath, const SpikesInfo& spikes_info);\nvoid mk_spikevec_buffer(int);\n\nextern std::vector<double> spikevec_time;\nextern std::vector<int> spikevec_gid;\n\nvoid clear_spike_vectors();\nvoid validation(std::vector<std::pair<double, int>>& res);\n\nvoid spikevec_lock();\nvoid spikevec_unlock();\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/phase1.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <cassert>\n#include <mutex>\n\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/io/phase1.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n\nint (*nrn2core_get_dat1_)(int tid,\n                          int& n_presyn,\n                          int& n_netcon,\n                          int*& output_gid,\n                          int*& netcon_srcgid,\n                          std::vector<int>& netcon_negsrcgid_tid);\n\nnamespace coreneuron {\nPhase1::Phase1(FileHandler& F) {\n    assert(!F.fail());\n    int n_presyn = F.read_int();  /// Number of PreSyn-s in NrnThread nt\n    int n_netcon = F.read_int();  /// Number of NetCon-s in NrnThread nt\n\n    this->output_gids = F.read_vector<int>(n_presyn);\n    this->netcon_srcgids = F.read_vector<int>(n_netcon);\n    // For file mode transfer, it is not allowed that negative gids exist\n    // in different threads. 
So this->netcon_tids remains clear.\n\n    F.close();\n}\n\nPhase1::Phase1(int thread_id) {\n    int* output_gids;\n    int* netcon_srcgid;\n    int n_presyn;\n    int n_netcon;\n\n    // TODO : check error codes for NEURON - CoreNEURON communication\n    int valid = (*nrn2core_get_dat1_)(\n        thread_id, n_presyn, n_netcon, output_gids, netcon_srcgid, this->netcon_negsrcgid_tid);\n    if (!valid) {\n        return;\n    }\n\n    this->output_gids = std::vector<int>(output_gids, output_gids + n_presyn);\n    delete[] output_gids;\n    this->netcon_srcgids = std::vector<int>(netcon_srcgid, netcon_srcgid + n_netcon);\n    delete[] netcon_srcgid;\n}\n\nvoid Phase1::populate(NrnThread& nt, OMP_Mutex& mut) {\n    nt.n_presyn = this->output_gids.size();\n    nt.n_netcon = this->netcon_srcgids.size();\n\n    nrnthreads_netcon_srcgid[nt.id] = new int[nt.n_netcon];\n    std::copy(this->netcon_srcgids.begin(),\n              this->netcon_srcgids.end(),\n              nrnthreads_netcon_srcgid[nt.id]);\n\n    // netcon_negsrcgid_tid is empty if file transfer or single thread\n    coreneuron::nrnthreads_netcon_negsrcgid_tid[nt.id] = this->netcon_negsrcgid_tid;\n\n    nt.netcons = new NetCon[nt.n_netcon];\n\n    if (nt.n_presyn) {\n        nt.presyns_helper = (PreSynHelper*) ecalloc_align(nt.n_presyn, sizeof(PreSynHelper));\n        nt.presyns = new PreSyn[nt.n_presyn];\n    }\n\n    PreSyn* ps = nt.presyns;\n    /// go through all presyns\n    for (auto& gid: this->output_gids) {\n        if (gid == -1) {\n            ++ps;\n            continue;\n        }\n\n        {\n            const std::lock_guard<OMP_Mutex> lock(mut);\n            // Note that the negative (type, index)\n            // coded information goes into the neg_gid2out[tid] hash table.\n            // See netpar.cpp for the netpar_tid_... 
function implementations.\n            // Both that table and the process wide gid2out table can be deleted\n            // before the end of setup\n\n            /// Put gid into the gid2out hash table with correspondent output PreSyn\n            /// Or to the negative PreSyn map\n            if (gid >= 0) {\n                char m[200];\n                if (gid2in.find(gid) != gid2in.end()) {\n                    sprintf(m, \"gid=%d already exists as an input port\", gid);\n                    hoc_execerror(m,\n                                  \"Setup all the output ports on this process before using them as \"\n                                  \"input ports.\");\n                }\n                if (gid2out.find(gid) != gid2out.end()) {\n                    sprintf(m, \"gid=%d already exists on this process as an output port\", gid);\n                    hoc_execerror(m, 0);\n                }\n                ps->gid_ = gid;\n                ps->output_index_ = gid;\n                gid2out[gid] = ps;\n            } else {\n                nrn_assert(neg_gid2out[nt.id].find(gid) == neg_gid2out[nt.id].end());\n                ps->output_index_ = -1;\n                neg_gid2out[nt.id][gid] = ps;\n            }\n        }  // end of the mutex\n\n        ++ps;\n    }\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/phase1.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include <vector>\n\n#include \"coreneuron/io/nrn_filehandler.hpp\"\n#include \"coreneuron/utils/nrnmutdec.hpp\"\n\nnamespace coreneuron {\n\nstruct NrnThread;\n\nclass Phase1 {\n  public:\n    Phase1(FileHandler& F);\n    Phase1(int thread_id);\n    void populate(NrnThread& nt, OMP_Mutex& mut);\n\n  private:\n    std::vector<int> output_gids;\n    std::vector<int> netcon_srcgids;\n    std::vector<int> netcon_negsrcgid_tid;  // entries only for negative srcgids\n};\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/phase2.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"coreneuron/io/phase2.hpp\"\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/io/nrn_checkpoint.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n#include \"coreneuron/permute/cellorder.hpp\"\n#include \"coreneuron/permute/data_layout.hpp\"\n#include \"coreneuron/permute/node_permute.h\"\n#include \"coreneuron/utils/utils.hpp\"\n#include \"coreneuron/utils/vrecitem.h\"\n#include \"coreneuron/io/mem_layout_util.hpp\"\n#include \"coreneuron/io/setup_fornetcon.hpp\"\n\n#if defined(_OPENMP)\n#include <omp.h>\n#endif\n\nint (*nrn2core_get_dat2_1_)(int tid,\n                            int& n_real_cell,\n                            int& ngid,\n                            int& n_real_gid,\n                            int& nnode,\n                            int& ndiam,\n                            int& nmech,\n                            int*& tml_index,\n                            int*& ml_nodecount,\n                            int& nidata,\n                            int& nvdata,\n                            int& nweight);\n\nint (*nrn2core_get_dat2_2_)(int tid,\n                            int*& v_parent_index,\n                            double*& a,\n                            double*& b,\n                            double*& area,\n                            double*& v,\n                            double*& diamvec);\n\nint (*nrn2core_get_dat2_mech_)(int tid,\n                               size_t i,\n                               int dsz_inst,\n                               int*& nodeindices,\n                               double*& data,\n                               int*& pdata,\n                           
    std::vector<int>& pointer2type);\n\nint (*nrn2core_get_dat2_3_)(int tid,\n                            int nweight,\n                            int*& output_vindex,\n                            double*& output_threshold,\n                            int*& netcon_pnttype,\n                            int*& netcon_pntindex,\n                            double*& weights,\n                            double*& delays);\n\nint (*nrn2core_get_dat2_corepointer_)(int tid, int& n);\n\nint (*nrn2core_get_dat2_corepointer_mech_)(int tid,\n                                           int type,\n                                           int& icnt,\n                                           int& dcnt,\n                                           int*& iarray,\n                                           double*& darray);\n\nint (*nrn2core_get_dat2_vecplay_)(int tid, std::vector<int>& indices);\n\nint (*nrn2core_get_dat2_vecplay_inst_)(int tid,\n                                       int i,\n                                       int& vptype,\n                                       int& mtype,\n                                       int& ix,\n                                       int& sz,\n                                       double*& yvec,\n                                       double*& tvec,\n                                       int& last_index,\n                                       int& discon_index,\n                                       int& ubound_index);\n\nnamespace coreneuron {\ntemplate <typename T>\ninline void mech_data_layout_transform(T* data, int cnt, int sz, int layout) {\n    if (layout == Layout::AoS) {\n        return;\n    }\n    // layout is equal to Layout::SoA\n    int align_cnt = nrn_soa_padded_size(cnt, layout);\n    std::vector<T> d(cnt * sz);\n    // copy matrix\n    for (int i = 0; i < cnt; ++i) {\n        for (int j = 0; j < sz; ++j) {\n            d[i * sz + j] = data[i * sz + j];\n        }\n    }\n    // transform memory layout\n    for 
(int i = 0; i < cnt; ++i) {\n        for (int j = 0; j < sz; ++j) {\n            data[i + j * align_cnt] = d[i * sz + j];\n        }\n    }\n}\n\nvoid Phase2::read_file(FileHandler& F, const NrnThread& nt) {\n    n_real_cell = F.read_int();\n    n_output = F.read_int();\n    n_real_output = F.read_int();\n    n_node = F.read_int();\n    n_diam = F.read_int();\n    n_mech = F.read_int();\n    mech_types = std::vector<int>(n_mech, 0);\n    nodecounts = std::vector<int>(n_mech, 0);\n    for (int i = 0; i < n_mech; ++i) {\n        mech_types[i] = F.read_int();\n        nodecounts[i] = F.read_int();\n    }\n\n    // check mechanism compatibility before reading data\n    check_mechanism();\n\n    n_idata = F.read_int();\n    n_vdata = F.read_int();\n    int n_weight = F.read_int();\n    v_parent_index = (int*) ecalloc_align(n_node, sizeof(int));\n    F.read_array<int>(v_parent_index, n_node);\n\n    int n_data_padded = nrn_soa_padded_size(n_node, SOA_LAYOUT);\n    {\n        {  // Compute size of _data and allocate\n            int n_data = 6 * n_data_padded;\n            if (n_diam > 0) {\n                n_data += n_data_padded;\n            }\n            for (int i = 0; i < n_mech; ++i) {\n                int layout = corenrn.get_mech_data_layout()[mech_types[i]];\n                int n = nodecounts[i];\n                int sz = corenrn.get_prop_param_size()[mech_types[i]];\n                n_data = nrn_soa_byte_align(n_data);\n                n_data += nrn_soa_padded_size(n, layout) * sz;\n            }\n            _data = (double*) ecalloc_align(n_data, sizeof(double));\n        }\n        F.read_array<double>(_data + 2 * n_data_padded, n_node);\n        F.read_array<double>(_data + 3 * n_data_padded, n_node);\n        F.read_array<double>(_data + 5 * n_data_padded, n_node);\n        F.read_array<double>(_data + 4 * n_data_padded, n_node);\n        if (n_diam > 0) {\n            F.read_array<double>(_data + 6 * n_data_padded, n_node);\n        }\n    }\n\n    
size_t offset = 6 * n_data_padded;\n    if (n_diam > 0) {\n        offset += n_data_padded;\n    }\n    for (int i = 0; i < n_mech; ++i) {\n        int layout = corenrn.get_mech_data_layout()[mech_types[i]];\n        int n = nodecounts[i];\n        int sz = corenrn.get_prop_param_size()[mech_types[i]];\n        int dsz = corenrn.get_prop_dparam_size()[mech_types[i]];\n        offset = nrn_soa_byte_align(offset);\n        std::vector<int> nodeindices;\n        if (!corenrn.get_is_artificial()[mech_types[i]]) {\n            nodeindices = F.read_vector<int>(n);\n        }\n        F.read_array<double>(_data + offset, sz * n);\n        offset += nrn_soa_padded_size(n, layout) * sz;\n        std::vector<int> pdata;\n        if (dsz > 0) {\n            pdata = F.read_vector<int>(dsz * n);\n        }\n        tmls.emplace_back(TML{nodeindices, pdata, mech_types[i], {}, {}});\n        if (dsz > 0) {\n            int sz = F.read_int();\n            if (sz) {\n                auto& p2t = tmls.back().pointer2type;\n                p2t = F.read_vector<int>(sz);\n            }\n        }\n    }\n    output_vindex = F.read_vector<int>(nt.n_presyn);\n    output_threshold = F.read_vector<double>(n_real_output);\n    pnttype = F.read_vector<int>(nt.n_netcon);\n    pntindex = F.read_vector<int>(nt.n_netcon);\n    weights = F.read_vector<double>(n_weight);\n    delay = F.read_vector<double>(nt.n_netcon);\n    num_point_process = F.read_int();\n\n    for (int i = 0; i < n_mech; ++i) {\n        if (!corenrn.get_bbcore_read()[mech_types[i]]) {\n            continue;\n        }\n        tmls[i].type = F.read_int();\n        int icnt = F.read_int();\n        int dcnt = F.read_int();\n        if (icnt > 0) {\n            tmls[i].iArray = F.read_vector<int>(icnt);\n        }\n        if (dcnt > 0) {\n            tmls[i].dArray = F.read_vector<double>(dcnt);\n        }\n    }\n\n    int n_vec_play_continuous = F.read_int();\n    vec_play_continuous.reserve(n_vec_play_continuous);\n    for 
(int i = 0; i < n_vec_play_continuous; ++i) {\n        VecPlayContinuous_ item;\n        item.vtype = F.read_int();\n        item.mtype = F.read_int();\n        item.ix = F.read_int();\n        int sz = F.read_int();\n        item.yvec = IvocVect(sz);\n        item.tvec = IvocVect(sz);\n        F.read_array<double>(item.yvec.data(), sz);\n        F.read_array<double>(item.tvec.data(), sz);\n        vec_play_continuous.push_back(std::move(item));\n    }\n\n    // store current checkpoint state to continue reading mapping\n    // The checkpoint numbering in phase 3 is a continuation of phase 2, and so will be restored\n    F.record_checkpoint();\n\n    if (F.eof())\n        return;\n\n    nrn_assert(F.read_int() == n_vec_play_continuous);\n\n    for (int i = 0; i < n_vec_play_continuous; ++i) {\n        auto& vecPlay = vec_play_continuous[i];\n        vecPlay.last_index = F.read_int();\n        vecPlay.discon_index = F.read_int();\n        vecPlay.ubound_index = F.read_int();\n    }\n\n    patstim_index = F.read_int();\n\n    nrn_assert(F.read_int() == -1);\n\n    for (int i = 0; i < nt.n_presyn; ++i) {\n        preSynConditionEventFlags.push_back(F.read_int());\n    }\n\n    nrn_assert(F.read_int() == -1);\n    restore_events(F);\n\n    nrn_assert(F.read_int() == -1);\n    restore_events(F);\n}\n\nvoid Phase2::read_direct(int thread_id, const NrnThread& nt) {\n    int* types_ = nullptr;\n    int* nodecounts_ = nullptr;\n    int n_weight;\n    (*nrn2core_get_dat2_1_)(thread_id,\n                            n_real_cell,\n                            n_output,\n                            n_real_output,\n                            n_node,\n                            n_diam,\n                            n_mech,\n                            types_,\n                            nodecounts_,\n                            n_idata,\n                            n_vdata,\n                            n_weight);\n    mech_types = std::vector<int>(types_, types_ + n_mech);\n    
delete[] types_;\n\n    nodecounts = std::vector<int>(nodecounts_, nodecounts_ + n_mech);\n    delete[] nodecounts_;\n\n    check_mechanism();\n\n    // TODO: fix it in the future\n    int n_data_padded = nrn_soa_padded_size(n_node, SOA_LAYOUT);\n    int n_data = 6 * n_data_padded;\n    if (n_diam > 0) {\n        n_data += n_data_padded;\n    }\n    for (int i = 0; i < n_mech; ++i) {\n        int layout = corenrn.get_mech_data_layout()[mech_types[i]];\n        int n = nodecounts[i];\n        int sz = corenrn.get_prop_param_size()[mech_types[i]];\n        n_data = nrn_soa_byte_align(n_data);\n        n_data += nrn_soa_padded_size(n, layout) * sz;\n    }\n    _data = (double*) ecalloc_align(n_data, sizeof(double));\n\n    v_parent_index = (int*) ecalloc_align(n_node, sizeof(int));\n    double* actual_a = _data + 2 * n_data_padded;\n    double* actual_b = _data + 3 * n_data_padded;\n    double* actual_v = _data + 4 * n_data_padded;\n    double* actual_area = _data + 5 * n_data_padded;\n    double* actual_diam = n_diam > 0 ? 
_data + 6 * n_data_padded : nullptr;\n    (*nrn2core_get_dat2_2_)(\n        thread_id, v_parent_index, actual_a, actual_b, actual_area, actual_v, actual_diam);\n\n    tmls.resize(n_mech);\n\n    auto& param_sizes = corenrn.get_prop_param_size();\n    auto& dparam_sizes = corenrn.get_prop_dparam_size();\n    int dsz_inst = 0;\n    size_t offset = 6 * n_data_padded;\n    if (n_diam > 0)\n        offset += n_data_padded;\n    for (int i = 0; i < n_mech; ++i) {\n        auto& tml = tmls[i];\n        int type = mech_types[i];\n        int layout = corenrn.get_mech_data_layout()[type];\n        offset = nrn_soa_byte_align(offset);\n\n        tml.type = type;\n        // artificial cells don't use nodeindices\n        if (!corenrn.get_is_artificial()[type]) {\n            tml.nodeindices.resize(nodecounts[i]);\n        }\n        tml.pdata.resize(nodecounts[i] * dparam_sizes[type]);\n\n        int* nodeindices_ = nullptr;\n        double* data_ = _data + offset;\n        int* pdata_ = const_cast<int*>(tml.pdata.data());\n        (*nrn2core_get_dat2_mech_)(thread_id,\n                                   i,\n                                   dparam_sizes[type] > 0 ? 
dsz_inst : 0,\n                                   nodeindices_,\n                                   data_,\n                                   pdata_,\n                                   tml.pointer2type);\n        if (dparam_sizes[type] > 0)\n            dsz_inst++;\n        offset += nrn_soa_padded_size(nodecounts[i], layout) * param_sizes[type];\n        if (nodeindices_) {\n            std::copy(nodeindices_, nodeindices_ + nodecounts[i], tml.nodeindices.data());\n            free(nodeindices_);  // not free_memory because this is allocated by NEURON?\n        }\n        if (corenrn.get_is_artificial()[type]) {\n            assert(nodeindices_ == nullptr);\n        }\n    }\n\n    int* output_vindex_ = nullptr;\n    double* output_threshold_ = nullptr;\n    int* pnttype_ = nullptr;\n    int* pntindex_ = nullptr;\n    double* weight_ = nullptr;\n    double* delay_ = nullptr;\n    (*nrn2core_get_dat2_3_)(thread_id,\n                            n_weight,\n                            output_vindex_,\n                            output_threshold_,\n                            pnttype_,\n                            pntindex_,\n                            weight_,\n                            delay_);\n\n    output_vindex = std::vector<int>(output_vindex_, output_vindex_ + nt.n_presyn);\n    delete[] output_vindex_;\n\n    output_threshold = std::vector<double>(output_threshold_, output_threshold_ + n_real_output);\n    delete[] output_threshold_;\n\n    int n_netcon = nt.n_netcon;\n    pnttype = std::vector<int>(pnttype_, pnttype_ + n_netcon);\n    delete[] pnttype_;\n\n    pntindex = std::vector<int>(pntindex_, pntindex_ + n_netcon);\n    delete[] pntindex_;\n\n    weights = std::vector<double>(weight_, weight_ + n_weight);\n    delete[] weight_;\n\n    delay = std::vector<double>(delay_, delay_ + n_netcon);\n    delete[] delay_;\n\n    (*nrn2core_get_dat2_corepointer_)(nt.id, num_point_process);\n\n    for (int i = 0; i < n_mech; ++i) {\n        // not all mod 
files have BBCOREPOINTER data to read\n        if (!corenrn.get_bbcore_read()[mech_types[i]]) {\n            continue;\n        }\n        int icnt;\n        int* iArray_ = nullptr;\n        int dcnt;\n        double* dArray_ = nullptr;\n        (*nrn2core_get_dat2_corepointer_mech_)(nt.id, tmls[i].type, icnt, dcnt, iArray_, dArray_);\n        tmls[i].iArray.resize(icnt);\n        std::copy(iArray_, iArray_ + icnt, tmls[i].iArray.begin());\n        delete[] iArray_;\n\n        tmls[i].dArray.resize(dcnt);\n        std::copy(dArray_, dArray_ + dcnt, tmls[i].dArray.begin());\n        delete[] dArray_;\n    }\n\n    // Get from NEURON, the VecPlayContinuous indices in\n    // NetCvode::fixed_play_ for this thread.\n    std::vector<int> indices_vec_play_continuous;\n    (*nrn2core_get_dat2_vecplay_)(thread_id, indices_vec_play_continuous);\n\n    // i is an index into NEURON's NetCvode::fixed_play_ for this thread.\n    for (auto i: indices_vec_play_continuous) {\n        VecPlayContinuous_ item;\n        // yvec_ and tvec_ are not deleted as that space is within\n        // NEURON Vector\n        double *yvec_, *tvec_;\n        int sz;\n        (*nrn2core_get_dat2_vecplay_inst_)(thread_id,\n                                           i,\n                                           item.vtype,\n                                           item.mtype,\n                                           item.ix,\n                                           sz,\n                                           yvec_,\n                                           tvec_,\n                                           item.last_index,\n                                           item.discon_index,\n                                           item.ubound_index);\n        item.yvec = IvocVect(sz);\n        item.tvec = IvocVect(sz);\n        std::copy(yvec_, yvec_ + sz, item.yvec.data());\n        std::copy(tvec_, tvec_ + sz, item.tvec.data());\n        vec_play_continuous.push_back(std::move(item));\n  
  }\n}\n\n/// Check if MOD file used between NEURON and CoreNEURON is same\nvoid Phase2::check_mechanism() {\n    int diff_mech_count = 0;\n    for (int i = 0; i < n_mech; ++i) {\n        if (std::any_of(corenrn.get_different_mechanism_type().begin(),\n                        corenrn.get_different_mechanism_type().end(),\n                        [&](int e) { return e == mech_types[i]; })) {\n            if (nrnmpi_myid == 0) {\n                printf(\"Error: %s is a different MOD file than used by NEURON!\\n\",\n                       nrn_get_mechname(mech_types[i]));\n            }\n            diff_mech_count++;\n        }\n    }\n\n    if (diff_mech_count > 0) {\n        if (nrnmpi_myid == 0) {\n            printf(\n                \"Error : NEURON and CoreNEURON must use same mod files for compatibility, %d \"\n                \"different mod file(s) found. Re-compile special and special-core!\\n\",\n                diff_mech_count);\n            nrn_abort(1);\n        }\n    }\n}\n\n/// Perform in memory transformation between AoS<>SoA for integer data\nvoid Phase2::transform_int_data(int elem0,\n                                int nodecount,\n                                int* pdata,\n                                int i,\n                                int dparam_size,\n                                int layout,\n                                int n_node_) {\n    for (int iml = 0; iml < nodecount; ++iml) {\n        int* pd = pdata + nrn_i_layout(iml, nodecount, i, dparam_size, layout);\n        int ix = *pd;  // relative to beginning of _actual_*\n        nrn_assert((ix >= 0) && (ix < n_node_));\n        *pd = elem0 + ix;  // relative to nt._data\n    }\n}\n\nvoid Phase2::set_net_send_buffer(Memb_list** ml_list, const std::vector<int>& pnt_offset) {\n    // NetReceiveBuffering\n    for (auto& net_buf_receive: corenrn.get_net_buf_receive()) {\n        int type = net_buf_receive.second;\n        // Does this thread have this type.\n        Memb_list* ml 
= ml_list[type];\n        if (ml) {  // needs a NetReceiveBuffer\n            NetReceiveBuffer_t* nrb =\n                (NetReceiveBuffer_t*) ecalloc_align(1, sizeof(NetReceiveBuffer_t));\n            assert(!ml->_net_receive_buffer);\n            ml->_net_receive_buffer = nrb;\n            nrb->_pnt_offset = pnt_offset[type];\n\n            // begin with a size equal to the number of instances, or at least 8\n            nrb->_size = std::max(8, ml->nodecount);\n            nrb->_pnt_index = (int*) ecalloc_align(nrb->_size, sizeof(int));\n            nrb->_displ = (int*) ecalloc_align(nrb->_size + 1, sizeof(int));\n            nrb->_nrb_index = (int*) ecalloc_align(nrb->_size, sizeof(int));\n            nrb->_weight_index = (int*) ecalloc_align(nrb->_size, sizeof(int));\n            nrb->_nrb_t = (double*) ecalloc_align(nrb->_size, sizeof(double));\n            nrb->_nrb_flag = (double*) ecalloc_align(nrb->_size, sizeof(double));\n        }\n    }\n\n    // NetSendBuffering\n    for (int type: corenrn.get_net_buf_send_type()) {\n        // Does this thread have this type.\n        Memb_list* ml = ml_list[type];\n        if (ml) {  // needs a NetSendBuffer\n            assert(!ml->_net_send_buffer);\n            // begin with a size equal to twice number of instances\n            NetSendBuffer_t* nsb = new NetSendBuffer_t(ml->nodecount * 2);\n            ml->_net_send_buffer = nsb;\n        }\n    }\n}\n\nvoid Phase2::restore_events(FileHandler& F) {\n    int type;\n    while ((type = F.read_int()) != 0) {\n        double time;\n        F.read_array(&time, 1);\n        switch (type) {\n            case NetConType: {\n                auto event = std::make_shared<NetConType_>();\n                event->time = time;\n                event->netcon_index = F.read_int();\n                events.emplace_back(type, event);\n                break;\n            }\n            case SelfEventType: {\n                auto event = std::make_shared<SelfEventType_>();\n          
      event->time = time;\n                event->target_type = F.read_int();\n                event->point_proc_instance = F.read_int();\n                event->target_instance = F.read_int();\n                F.read_array(&event->flag, 1);\n                event->movable = F.read_int();\n                event->weight_index = F.read_int();\n                events.emplace_back(type, event);\n                break;\n            }\n            case PreSynType: {\n                auto event = std::make_shared<PreSynType_>();\n                event->time = time;\n                event->presyn_index = F.read_int();\n                events.emplace_back(type, event);\n                break;\n            }\n            case NetParEventType: {\n                auto event = std::make_shared<NetParEvent_>();\n                event->time = time;\n                events.emplace_back(type, event);\n                break;\n            }\n            case PlayRecordEventType: {\n                auto event = std::make_shared<PlayRecordEventType_>();\n                event->time = time;\n                event->play_record_type = F.read_int();\n                if (event->play_record_type == VecPlayContinuousType) {\n                    event->vecplay_index = F.read_int();\n                    events.emplace_back(type, event);\n                } else {\n                    nrn_assert(0);\n                }\n                break;\n            }\n            default: {\n                nrn_assert(0);\n                break;\n            }\n        }\n    }\n}\n\nvoid Phase2::fill_before_after_lists(NrnThread& nt, const std::vector<Memb_func>& memb_func) {\n    /// Fill the BA lists\n    std::vector<BAMech*> before_after_map(memb_func.size());\n    for (int i = 0; i < BEFORE_AFTER_SIZE; ++i) {\n        for (size_t ii = 0; ii < memb_func.size(); ++ii) {\n            before_after_map[ii] = nullptr;\n        }\n        // Save first before-after block only. 
In case of multiple before-after blocks with the\n        // same mech type, we will get subsequent ones using linked list below.\n        for (auto bam = corenrn.get_bamech()[i]; bam; bam = bam->next) {\n            if (!before_after_map[bam->type]) {\n                before_after_map[bam->type] = bam;\n            }\n        }\n        // necessary to keep in order wrt multiple BAMech with same mech type\n        NrnThreadBAList** ptbl = nt.tbl + i;\n        for (auto tml = nt.tml; tml; tml = tml->next) {\n            if (before_after_map[tml->index]) {\n                int mtype = tml->index;\n                for (auto bam = before_after_map[mtype]; bam && bam->type == mtype;\n                     bam = bam->next) {\n                    auto tbl = (NrnThreadBAList*) emalloc(sizeof(NrnThreadBAList));\n                    *ptbl = tbl;\n                    tbl->next = nullptr;\n                    tbl->bam = bam;\n                    tbl->ml = tml->ml;\n                    ptbl = &(tbl->next);\n                }\n            }\n        }\n    }\n}\n\nvoid Phase2::pdata_relocation(const NrnThread& nt, const std::vector<Memb_func>& memb_func) {\n    // Some pdata may index into data which has been reordered from AoS to\n    // SoA. 
The four possibilities are if semantics is -1 (area), -5 (pointer),\n    // -9 (diam), or 0-999 (ion variables).\n    // Note that pdata has a layout and the type block in nt.data into which\n    // it indexes also has a layout.\n\n    // For faster search of tmls[i].type == type, use a map.\n    // (perhaps would be better to replace tmls so that we can use tmls[type]).\n    std::map<int, size_t> type2itml;\n    for (size_t i = 0; i < tmls.size(); ++i) {\n        if (tmls[i].pointer2type.size()) {\n            type2itml[tmls[i].type] = i;\n        }\n    }\n\n    for (auto tml = nt.tml; tml; tml = tml->next) {\n        int type = tml->index;\n        int layout = corenrn.get_mech_data_layout()[type];\n        int* pdata = tml->ml->pdata;\n        int cnt = tml->ml->nodecount;\n        int szdp = corenrn.get_prop_dparam_size()[type];\n        int* semantics = memb_func[type].dparam_semantics;\n\n        // skip ARTIFICIAL_CELL (its area pointer with semantics=-1 is not useful)\n        if (!corenrn.get_is_artificial()[type]) {\n            if (szdp) {\n                if (!semantics)\n                    continue;  // temporary for HDFReport, Binreport which will be skipped in\n                // bbcore_write of HBPNeuron\n                nrn_assert(semantics);\n            }\n\n            for (int i = 0; i < szdp; ++i) {\n                int s = semantics[i];\n                switch (s) {\n                    case -1:  // area\n                        transform_int_data(\n                            nt._actual_area - nt._data, cnt, pdata, i, szdp, layout, nt.end);\n                        break;\n                    case -9:  // diam\n                        transform_int_data(\n                            nt._actual_diam - nt._data, cnt, pdata, i, szdp, layout, nt.end);\n                        break;\n                    case -5:  // pointer assumes a pointer to membrane voltage\n                        // or mechanism data in this thread. 
The value of the\n                        // pointer on the NEURON side was analyzed by\n                        // nrn_dblpntr2nrncore which returned the\n                        // mechanism index and type. At this moment the index\n                        // is in pdata and the type is in tmls[type].pointer2type.\n                        // However the latter order is according to the nested\n                        // iteration for nodecount { for szdp {}}\n                        // Also the nodecount POINTER instances of mechanism\n                        // might possibly point to different range variables.\n                        // Therefore it is not possible to use transform_int_data\n                        // and the transform must be done one at a time.\n                        // So we do nothing here and separately iterate\n                        // after this loop instead of the former voltage only\n                        /**\n                        transform_int_data(\n                            nt._actual_v - nt._data, cnt, pdata, i, szdp, layout, nt.end);\n                         **/\n                        break;\n                    default:\n                        if (s >= 0 && s < 1000) {  // ion\n                            int etype = s;\n                            /* if ion is SoA, must recalculate pdata values */\n                            /* if ion is AoS, have to deal with offset */\n                            Memb_list* eml = nt._ml_list[etype];\n                            int edata0 = eml->data - nt._data;\n                            int ecnt = eml->nodecount;\n                            int esz = corenrn.get_prop_param_size()[etype];\n                            for (int iml = 0; iml < cnt; ++iml) {\n                                int* pd = pdata + nrn_i_layout(iml, cnt, i, szdp, layout);\n                                int ix = *pd;  // relative to the ion data\n                                nrn_assert((ix >= 0) 
&& (ix < ecnt * esz));\n                                /* Original pd order assumed ecnt groups of esz */\n                                *pd = edata0 + nrn_param_layout(ix, etype, eml);\n                            }\n                        }\n                }\n            }\n            // Handle case -5 POINTER transformation (see comment above)\n            auto search = type2itml.find(type);\n            if (search != type2itml.end()) {\n                auto& ptypes = tmls[type2itml[type]].pointer2type;\n                assert(ptypes.size());\n                size_t iptype = 0;\n                for (int iml = 0; iml < cnt; ++iml) {\n                    for (int i = 0; i < szdp; ++i) {\n                        if (semantics[i] == -5) {  // POINTER\n                            int* pd = pdata + nrn_i_layout(iml, cnt, i, szdp, layout);\n                            int ix = *pd;  // relative to elem0\n                            int ptype = ptypes[iptype++];\n                            if (ptype == voltage) {\n                                nrn_assert((ix >= 0) && (ix < nt.end));\n                                int elem0 = nt._actual_v - nt._data;\n                                *pd = elem0 + ix;\n                            } else {\n                                Memb_list* pml = nt._ml_list[ptype];\n                                int pcnt = pml->nodecount;\n                                int psz = corenrn.get_prop_param_size()[ptype];\n                                nrn_assert((ix >= 0) && (ix < pcnt * psz));\n                                int elem0 = pml->data - nt._data;\n                                *pd = elem0 + nrn_param_layout(ix, ptype, pml);\n                            }\n                        }\n                    }\n                }\n                ptypes.clear();\n            }\n        }\n    }\n}\n\nvoid Phase2::set_dependencies(const NrnThread& nt, const std::vector<Memb_func>& memb_func) {\n    /* here we setup the 
mechanism dependencies. if there is a mechanism dependency\n     * then we allocate an array for tml->dependencies otherwise set it to nullptr.\n     * In order to find out the \"real\" dependencies i.e. dependent mechanism\n     * exist at the same compartment, we compare the nodeindices of mechanisms\n     * returned by nrn_mech_depend.\n     */\n\n    /* temporary array for dependencies */\n    int* mech_deps = (int*) ecalloc(memb_func.size(), sizeof(int));\n\n    for (auto tml = nt.tml; tml; tml = tml->next) {\n        /* initialize to null */\n        tml->dependencies = nullptr;\n        tml->ndependencies = 0;\n\n        /* get dependencies from the models */\n        int deps_cnt = nrn_mech_depend(tml->index, mech_deps);\n\n        /* if dependencies, setup dependency array */\n        if (deps_cnt) {\n            /* store \"real\" dependencies in the vector */\n            std::vector<int> actual_mech_deps;\n\n            Memb_list* ml = tml->ml;\n            int* nodeindices = ml->nodeindices;\n\n            /* iterate over dependencies */\n            for (int j = 0; j < deps_cnt; j++) {\n                /* memb_list of dependency mechanism */\n                Memb_list* dml = nt._ml_list[mech_deps[j]];\n\n                /* dependency mechanism may not exist in the model */\n                if (!dml)\n                    continue;\n\n                /* take nodeindices for comparison */\n                int* dnodeindices = dml->nodeindices;\n\n                /* set_intersection function needs temp vector to push the common values */\n                std::vector<int> node_intersection;\n\n                /* make sure they have non-zero nodes and find their intersection */\n                if ((ml->nodecount > 0) && (dml->nodecount > 0)) {\n                    std::set_intersection(nodeindices,\n                                          nodeindices + ml->nodecount,\n                                          dnodeindices,\n                                 
         dnodeindices + dml->nodecount,\n                                          std::back_inserter(node_intersection));\n                }\n\n                /* if they intersect in the nodeindices, it's real dependency */\n                if (!node_intersection.empty()) {\n                    actual_mech_deps.push_back(mech_deps[j]);\n                }\n            }\n\n            /* copy actual_mech_deps to dependencies */\n            if (!actual_mech_deps.empty()) {\n                tml->ndependencies = actual_mech_deps.size();\n                tml->dependencies = (int*) ecalloc(actual_mech_deps.size(), sizeof(int));\n                std::copy(actual_mech_deps.begin(), actual_mech_deps.end(), tml->dependencies);\n            }\n        }\n    }\n\n    /* free temp dependency array */\n    free(mech_deps);\n}\n\nvoid Phase2::handle_weights(NrnThread& nt, int n_netcon, NrnThreadChkpnt& ntc) {\n    nt.n_weight = weights.size();\n    // weights in netcons order in groups defined by Point_process target type.\n    nt.weights = (double*) ecalloc_align(nt.n_weight, sizeof(double));\n    std::copy(weights.begin(), weights.end(), nt.weights);\n\n    int iw = 0;\n    for (int i = 0; i < n_netcon; ++i) {\n        NetCon& nc = nt.netcons[i];\n        nc.u.weight_index_ = iw;\n        if (pnttype[i] != 0) {\n            iw += corenrn.get_pnt_receive_size()[pnttype[i]];\n        } else {\n            iw += 1;\n        }\n    }\n    assert(iw == nt.n_weight);\n\n    // Nontrivial if FOR_NETCON in use by some mechanisms\n    setup_fornetcon_info(nt);\n\n\n#if CHKPNTDEBUG\n    ntc.delay = new double[n_netcon];\n    memcpy(ntc.delay, delay.data(), n_netcon * sizeof(double));\n#endif\n    for (int i = 0; i < n_netcon; ++i) {\n        NetCon& nc = nt.netcons[i];\n        nc.delay_ = delay[i];\n    }\n}\n\nvoid Phase2::get_info_from_bbcore(NrnThread& nt,\n                                  const std::vector<Memb_func>& memb_func,\n                                  
NrnThreadChkpnt& ntc) {\n    // BBCOREPOINTER information\n#if CHKPNTDEBUG\n    ntc.nbcp = num_point_process;\n    ntc.bcpicnt = new int[n_mech];\n    ntc.bcpdcnt = new int[n_mech];\n    ntc.bcptype = new int[n_mech];\n    size_t point_proc_id = 0;\n#endif\n    for (int i = 0; i < n_mech; ++i) {\n        int type = mech_types[i];\n        if (!corenrn.get_bbcore_read()[type]) {\n            continue;\n        }\n        type = tmls[i].type;  // This is not an error, but it has to be fixed I think\n#if CHKPNTDEBUG\n        ntc.bcptype[point_proc_id] = type;\n        ntc.bcpicnt[point_proc_id] = tmls[i].iArray.size();\n        ntc.bcpdcnt[point_proc_id] = tmls[i].dArray.size();\n        point_proc_id++;\n#endif\n        int ik = 0;\n        int dk = 0;\n        Memb_list* ml = nt._ml_list[type];\n        int dsz = corenrn.get_prop_param_size()[type];\n        int pdsz = corenrn.get_prop_dparam_size()[type];\n        int cntml = ml->nodecount;\n        int layout = corenrn.get_mech_data_layout()[type];\n        for (int j = 0; j < cntml; ++j) {\n            int jp = j;\n            if (ml->_permute) {\n                jp = ml->_permute[j];\n            }\n            double* d = ml->data;\n            Datum* pd = ml->pdata;\n            d += nrn_i_layout(jp, cntml, 0, dsz, layout);\n            pd += nrn_i_layout(jp, cntml, 0, pdsz, layout);\n            int aln_cntml = nrn_soa_padded_size(cntml, layout);\n            (*corenrn.get_bbcore_read()[type])(tmls[i].dArray.data(),\n                                               tmls[i].iArray.data(),\n                                               &dk,\n                                               &ik,\n                                               0,\n                                               aln_cntml,\n                                               d,\n                                               pd,\n                                               ml->_thread,\n                                               
&nt,\n                                               ml,\n                                               0.0);\n        }\n        assert(dk == static_cast<int>(tmls[i].dArray.size()));\n        assert(ik == static_cast<int>(tmls[i].iArray.size()));\n    }\n}\n\nvoid Phase2::set_vec_play(NrnThread& nt, NrnThreadChkpnt& ntc) {\n    // VecPlayContinuous instances\n    // No attempt at memory efficiency\n    nt.n_vecplay = vec_play_continuous.size();\n    if (nt.n_vecplay) {\n        nt._vecplay = new void*[nt.n_vecplay];\n    } else {\n        nt._vecplay = nullptr;\n    }\n#if CHKPNTDEBUG\n    ntc.vecplay_ix = new int[nt.n_vecplay];\n    ntc.vtype = new int[nt.n_vecplay];\n    ntc.mtype = new int[nt.n_vecplay];\n#endif\n    for (int i = 0; i < nt.n_vecplay; ++i) {\n        auto& vecPlay = vec_play_continuous[i];\n        nrn_assert(vecPlay.vtype == VecPlayContinuousType);\n#if CHKPNTDEBUG\n        ntc.vtype[i] = vecPlay.vtype;\n#endif\n#if CHKPNTDEBUG\n        ntc.mtype[i] = vecPlay.mtype;\n#endif\n        Memb_list* ml = nt._ml_list[vecPlay.mtype];\n#if CHKPNTDEBUG\n        ntc.vecplay_ix[i] = vecPlay.ix;\n#endif\n\n        vecPlay.ix = nrn_param_layout(vecPlay.ix, vecPlay.mtype, ml);\n        if (ml->_permute) {\n            vecPlay.ix = nrn_index_permute(vecPlay.ix, vecPlay.mtype, ml);\n        }\n        nt._vecplay[i] = new VecPlayContinuous(ml->data + vecPlay.ix,\n                                               std::move(vecPlay.yvec),\n                                               std::move(vecPlay.tvec),\n                                               nullptr,\n                                               nt.id);\n    }\n}\n\nvoid Phase2::populate(NrnThread& nt, const UserParams& userParams) {\n    NrnThreadChkpnt& ntc = nrnthread_chkpnt[nt.id];\n    ntc.file_id = userParams.gidgroups[nt.id];\n\n    nt.ncell = n_real_cell;\n    nt.end = n_node;\n    nt.n_real_output = n_real_output;\n\n#if CHKPNTDEBUG\n    ntc.n_outputgids = n_output;\n    ntc.nmech = 
n_mech;\n#endif\n\n    /// Checkpoint in coreneuron is defined for both phase 1 and phase 2 since they are written\n    /// together\n    nt._ml_list = (Memb_list**) ecalloc_align(corenrn.get_memb_funcs().size(), sizeof(Memb_list*));\n\n    auto& memb_func = corenrn.get_memb_funcs();\n#if CHKPNTDEBUG\n    ntc.mlmap = new Memb_list_chkpnt*[memb_func.size()];\n    for (int i = 0; i < memb_func.size(); ++i) {\n        ntc.mlmap[i] = nullptr;\n    }\n#endif\n\n    nt.stream_id = 0;\n    nt.compute_gpu = 0;\n    auto& nrn_prop_param_size_ = corenrn.get_prop_param_size();\n    auto& nrn_prop_dparam_size_ = corenrn.get_prop_dparam_size();\n\n/* read_phase2 is being called from openmp region\n * and hence we can set the stream equal to current thread id.\n * In fact we could set gid as stream_id when we will have nrn threads\n * greater than number of omp threads.\n */\n#if defined(_OPENMP)\n    nt.stream_id = omp_get_thread_num();\n#endif\n\n    int shadow_rhs_cnt = 0;\n    nt.shadow_rhs_cnt = 0;\n\n    NrnThreadMembList* tml_last = nullptr;\n    for (int i = 0; i < n_mech; ++i) {\n        auto tml =\n            create_tml(nt, i, memb_func[mech_types[i]], shadow_rhs_cnt, mech_types, nodecounts);\n\n        nt._ml_list[tml->index] = tml->ml;\n\n#if CHKPNTDEBUG\n        Memb_list_chkpnt* mlc = new Memb_list_chkpnt;\n        ntc.mlmap[tml->index] = mlc;\n#endif\n\n        if (nt.tml) {\n            tml_last->next = tml;\n        } else {\n            nt.tml = tml;\n        }\n        tml_last = tml;\n    }\n\n    if (shadow_rhs_cnt) {\n        nt._shadow_rhs = (double*) ecalloc_align(nrn_soa_padded_size(shadow_rhs_cnt, 0),\n                                                 sizeof(double));\n        nt._shadow_d = (double*) ecalloc_align(nrn_soa_padded_size(shadow_rhs_cnt, 0),\n                                               sizeof(double));\n        nt.shadow_rhs_cnt = shadow_rhs_cnt;\n    }\n\n    nt.mapping = nullptr;  // section segment mapping\n\n    nt._nidata = 
n_idata;\n    if (nt._nidata)\n        nt._idata = (int*) ecalloc(nt._nidata, sizeof(int));\n    else\n        nt._idata = nullptr;\n    // see patternstim.cpp\n    int extra_nv = (&nt == nrn_threads) ? nrn_extra_thread0_vdata : 0;\n    nt._nvdata = n_vdata;\n    if (nt._nvdata + extra_nv)\n        nt._vdata = (void**) ecalloc_align(nt._nvdata + extra_nv, sizeof(void*));\n    else\n        nt._vdata = nullptr;\n\n    // The data format begins with the matrix data\n    int n_data_padded = nrn_soa_padded_size(nt.end, SOA_LAYOUT);\n    nt._data = _data;\n    nt._actual_rhs = nt._data + 0 * n_data_padded;\n    nt._actual_d = nt._data + 1 * n_data_padded;\n    nt._actual_a = nt._data + 2 * n_data_padded;\n    nt._actual_b = nt._data + 3 * n_data_padded;\n    nt._actual_v = nt._data + 4 * n_data_padded;\n    nt._actual_area = nt._data + 5 * n_data_padded;\n    nt._actual_diam = n_diam ? nt._data + 6 * n_data_padded : nullptr;\n\n    size_t offset = 6 * n_data_padded;\n    if (n_diam) {\n        // in the rare case that a mechanism has dparam with diam semantics\n        // then actual_diam array added after matrix in nt._data\n        // Generally wasteful since only a few diam are pointed to.\n        // Probably better to move the diam semantics to the p array of the mechanism\n        offset += n_data_padded;\n    }\n\n    // Memb_list.data points into the nt._data array.\n    // Also count the number of Point_process\n    int num_point_process = 0;\n    for (auto tml = nt.tml; tml; tml = tml->next) {\n        Memb_list* ml = tml->ml;\n        int type = tml->index;\n        int layout = corenrn.get_mech_data_layout()[type];\n        int n = ml->nodecount;\n        int sz = nrn_prop_param_size_[type];\n        offset = nrn_soa_byte_align(offset);\n        ml->data = nt._data + offset;\n        offset += nrn_soa_padded_size(n, layout) * sz;\n        if (corenrn.get_pnt_map()[type] > 0) {\n            num_point_process += n;\n        }\n    }\n    nt.pntprocs = 
(Point_process*) ecalloc_align(num_point_process,\n                                                 sizeof(Point_process));  // includes acell with and\n                                                                          // without gid\n    nt.n_pntproc = num_point_process;\n    nt._ndata = offset;\n\n\n    // matrix info\n    nt._v_parent_index = v_parent_index;\n\n#if CHKPNTDEBUG\n    ntc.parent = new int[nt.end];\n    memcpy(ntc.parent, nt._v_parent_index, nt.end * sizeof(int));\n    ntc.area = new double[nt.end];\n    memcpy(ntc.area, nt._actual_area, nt.end * sizeof(double));\n#endif\n\n    int synoffset = 0;\n    std::vector<int> pnt_offset(memb_func.size());\n\n    // All the mechanism data and pdata.\n    // Also fill in the pnt_offset\n    // Complete spec of Point_process except for the acell presyn_ field.\n    int itml = 0;\n    for (auto tml = nt.tml; tml; tml = tml->next, ++itml) {\n        int type = tml->index;\n        Memb_list* ml = tml->ml;\n        int n = ml->nodecount;\n        int szp = nrn_prop_param_size_[type];\n        int szdp = nrn_prop_dparam_size_[type];\n        int layout = corenrn.get_mech_data_layout()[type];\n\n        ml->nodeindices = (int*) ecalloc_align(ml->nodecount, sizeof(int));\n        std::copy(tmls[itml].nodeindices.begin(), tmls[itml].nodeindices.end(), ml->nodeindices);\n\n        mech_data_layout_transform<double>(ml->data, n, szp, layout);\n\n        if (szdp) {\n            ml->pdata = (int*) ecalloc_align(nrn_soa_padded_size(n, layout) * szdp, sizeof(int));\n            std::copy(tmls[itml].pdata.begin(), tmls[itml].pdata.end(), ml->pdata);\n            mech_data_layout_transform<int>(ml->pdata, n, szdp, layout);\n\n#if CHKPNTDEBUG  // Not substantive. 
Only for debugging.\n            Memb_list_chkpnt* mlc = ntc.mlmap[type];\n            mlc->pdata_not_permuted = (int*) coreneuron::ecalloc_align(n * szdp, sizeof(int));\n            if (layout == Layout::AoS) {  // only copy\n                for (int i = 0; i < n; ++i) {\n                    for (int j = 0; j < szdp; ++j) {\n                        mlc->pdata_not_permuted[i * szdp + j] = ml->pdata[i * szdp + j];\n                    }\n                }\n            } else if (layout == Layout::SoA) {  // transpose and unpad\n                int align_cnt = nrn_soa_padded_size(n, layout);\n                for (int i = 0; i < n; ++i) {\n                    for (int j = 0; j < szdp; ++j) {\n                        mlc->pdata_not_permuted[i * szdp + j] = ml->pdata[i + j * align_cnt];\n                    }\n                }\n            }\n#endif\n        } else {\n            ml->pdata = nullptr;\n        }\n        if (corenrn.get_pnt_map()[type] > 0) {  // POINT_PROCESS mechanism including acell\n            int cnt = ml->nodecount;\n            Point_process* pnt = nullptr;\n            pnt = nt.pntprocs + synoffset;\n            pnt_offset[type] = synoffset;\n            synoffset += cnt;\n            for (int i = 0; i < cnt; ++i) {\n                Point_process* pp = pnt + i;\n                pp->_type = type;\n                pp->_i_instance = i;\n                nt._vdata[ml->pdata[nrn_i_layout(i, cnt, 1, szdp, layout)]] = pp;\n                pp->_tid = nt.id;\n            }\n        }\n    }\n\n    // pnt_offset needed for SelfEvent transfer from NEURON. Not needed on GPU.\n    // Ugh. Related but not same as NetReceiveBuffer._pnt_offset\n    nt._pnt_offset = pnt_offset;\n\n    pdata_relocation(nt, memb_func);\n\n    /* if desired, apply the node permutation. This involves permuting\n       at least the node parameter arrays for a, b, and area (and diam) and all\n       integer vector values that index into nodes. 
This could have been done\n       when originally filling the arrays with AoS ordered data, but can also\n       be done now, after the SoA transformation. The latter has the advantage\n       that the present order is consistent with all the layout values. Note\n       that after this portion of the permutation, a number of other node index\n       vectors will be read and will need to be permuted as well in subsequent\n       sections of this function.\n    */\n    if (interleave_permute_type) {\n        nt._permute = interleave_order(nt.id, nt.ncell, nt.end, nt._v_parent_index);\n    }\n    if (nt._permute) {\n        int* p = nt._permute;\n        permute_data(nt._actual_a, nt.end, p);\n        permute_data(nt._actual_b, nt.end, p);\n        permute_data(nt._actual_area, nt.end, p);\n        permute_data(nt._actual_v,\n                     nt.end,\n                     p);  // need if restore or finitialize does not initialize voltage\n        if (nt._actual_diam) {\n            permute_data(nt._actual_diam, nt.end, p);\n        }\n        // index values change as well as ordering\n        permute_ptr(nt._v_parent_index, nt.end, p);\n        node_permute(nt._v_parent_index, nt.end, p);\n\n#if CORENRN_DEBUG\n        for (int i = 0; i < nt.end; ++i) {\n            printf(\"parent[%d] = %d\\n\", i, nt._v_parent_index[i]);\n        }\n#endif\n\n        // specify the ml->_permute and sort the nodeindices\n        // Have to calculate all the permute before updating pdata in case\n        // POINTER to data of other mechanisms exist.\n        for (auto tml = nt.tml; tml; tml = tml->next) {\n            if (tml->ml->nodeindices) {  // not artificial\n                permute_nodeindices(tml->ml, p);\n            }\n        }\n        for (auto tml = nt.tml; tml; tml = tml->next) {\n            if (tml->ml->nodeindices) {  // not artificial\n                permute_ml(tml->ml, tml->index, nt);\n            }\n        }\n\n        // permute the 
Point_process._i_instance\n        for (int i = 0; i < nt.n_pntproc; ++i) {\n            Point_process& pp = nt.pntprocs[i];\n            Memb_list* ml = nt._ml_list[pp._type];\n            if (ml->_permute) {\n                pp._i_instance = ml->_permute[pp._i_instance];\n            }\n        }\n    }\n\n    set_dependencies(nt, memb_func);\n\n    fill_before_after_lists(nt, memb_func);\n\n    // for fast watch statement checking\n    // setup a list of types that have WATCH statement\n    {\n        int sz = 0;  // count the types with WATCH\n        for (auto tml = nt.tml; tml; tml = tml->next) {\n            if (corenrn.get_watch_check()[tml->index]) {\n                ++sz;\n            }\n        }\n        if (sz) {\n            nt._watch_types = (int*) ecalloc(sz + 1, sizeof(int));  // nullptr terminated\n            sz = 0;\n            for (auto tml = nt.tml; tml; tml = tml->next) {\n                if (corenrn.get_watch_check()[tml->index]) {\n                    nt._watch_types[sz++] = tml->index;\n                }\n            }\n        }\n    }\n    auto& pnttype2presyn = corenrn.get_pnttype2presyn();\n    auto& nrn_has_net_event_ = corenrn.get_has_net_event();\n    // create the nt.pnt2presyn_ix array of arrays.\n    nt.pnt2presyn_ix = (int**) ecalloc(nrn_has_net_event_.size(), sizeof(int*));\n    for (size_t i = 0; i < nrn_has_net_event_.size(); ++i) {\n        Memb_list* ml = nt._ml_list[nrn_has_net_event_[i]];\n        if (ml && ml->nodecount > 0) {\n            nt.pnt2presyn_ix[i] = (int*) ecalloc(ml->nodecount, sizeof(int));\n        }\n    }\n\n    // Real cells are at the beginning of the nt.presyns followed by\n    // acells (with and without gids mixed together)\n    // Here we associate the real cells with voltage pointers and\n    // acell PreSyn with the Point_process.\n    // nt.presyns order same as output_vindex order\n#if CHKPNTDEBUG\n    ntc.output_vindex = new int[nt.n_presyn];\n    memcpy(ntc.output_vindex, 
output_vindex.data(), nt.n_presyn * sizeof(int));\n#endif\n    if (nt._permute) {\n        // only indices >= 0 (i.e. _actual_v indices) will be changed.\n        node_permute(output_vindex.data(), nt.n_presyn, nt._permute);\n    }\n#if CHKPNTDEBUG\n    ntc.output_threshold = new double[n_real_output];\n    memcpy(ntc.output_threshold, output_threshold.data(), n_real_output * sizeof(double));\n#endif\n\n    for (int i = 0; i < nt.n_presyn; ++i) {  // real cells\n        PreSyn* ps = nt.presyns + i;\n\n        int ix = output_vindex[i];\n        if (ix == -1 && i < n_real_output) {  // real cell without a presyn\n            continue;\n        }\n        if (ix < 0) {\n            ix = -ix;\n            int index = ix / 1000;\n            int type = ix % 1000;\n            Point_process* pnt = nt.pntprocs + (pnt_offset[type] + index);\n            ps->pntsrc_ = pnt;\n            // pnt->_presyn = ps;\n            int ip2ps = pnttype2presyn[pnt->_type];\n            if (ip2ps >= 0) {\n                nt.pnt2presyn_ix[ip2ps][pnt->_i_instance] = i;\n            }\n            if (ps->gid_ < 0) {\n                ps->gid_ = -1;\n            }\n        } else {\n            assert(ps->gid_ > -1);\n            ps->thvar_index_ = ix;  // index into _actual_v\n            assert(ix < nt.end);\n            ps->threshold_ = output_threshold[i];\n        }\n    }\n\n    // initial net_send_buffer size about 1% of number of presyns\n    // nt._net_send_buffer_size = nt.ncell/100 + 1;\n    // but, to avoid reallocation complexity on GPU ...\n    nt._net_send_buffer_size = n_real_output;\n    nt._net_send_buffer = (int*) ecalloc_align(nt._net_send_buffer_size, sizeof(int));\n\n    int nnetcon = nt.n_netcon;\n\n    // it may happen that Point_process structures will be made unnecessary\n    // by factoring into NetCon.\n\n#if CHKPNTDEBUG\n    ntc.pnttype = new int[nnetcon];\n    ntc.pntindex = new int[nnetcon];\n    memcpy(ntc.pnttype, pnttype.data(), nnetcon * sizeof(int));\n    
memcpy(ntc.pntindex, pntindex.data(), nnetcon * sizeof(int));\n#endif\n    for (int i = 0; i < nnetcon; ++i) {\n        int type = pnttype[i];\n        if (type > 0) {\n            int index = pnt_offset[type] + pntindex[i];  /// Potentially uninitialized pnt_offset[],\n                                                         /// check for previous assignments\n            NetCon& nc = nt.netcons[i];\n            nc.target_ = nt.pntprocs + index;\n            nc.active_ = true;\n        }\n    }\n\n    handle_weights(nt, nnetcon, ntc);\n\n    get_info_from_bbcore(nt, memb_func, ntc);\n\n    set_vec_play(nt, ntc);\n\n    if (!events.empty()) {\n        userParams.checkPoints.restore_tqueue(nt, *this);\n    }\n\n    set_net_send_buffer(nt._ml_list, pnt_offset);\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/phase2.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include \"coreneuron/io/nrn_filehandler.hpp\"\n#include \"coreneuron/io/user_params.hpp\"\n#include \"coreneuron/utils/ivocvect.hpp\"\n\n#include <memory>\n\nnamespace coreneuron {\nstruct NrnThread;\nstruct NrnThreadMembList;\nstruct Memb_func;\nstruct Memb_list;\nstruct NrnThreadChkpnt;\n\nclass Phase2 {\n  public:\n    void read_file(FileHandler& F, const NrnThread& nt);\n    void read_direct(int thread_id, const NrnThread& nt);\n    void populate(NrnThread& nt, const UserParams& userParams);\n\n    std::vector<int> preSynConditionEventFlags;\n\n    // All of this is public for nrn_checkpoint\n    struct EventTypeBase {\n        double time;\n    };\n    struct NetConType_: public EventTypeBase {\n        int netcon_index;\n    };\n    struct SelfEventType_: public EventTypeBase {\n        int target_type;\n        int point_proc_instance;\n        int target_instance;\n        double flag;\n        int movable;\n        int weight_index;\n    };\n    struct PreSynType_: public EventTypeBase {\n        int presyn_index;\n    };\n    struct NetParEvent_: public EventTypeBase {};\n    struct PlayRecordEventType_: public EventTypeBase {\n        int play_record_type;\n        int vecplay_index;\n    };\n\n    struct VecPlayContinuous_ {\n        int vtype;\n        int mtype;\n        int ix;\n        IvocVect yvec;\n        IvocVect tvec;\n\n        int last_index;\n        int discon_index;\n        int ubound_index;\n    };\n    std::vector<VecPlayContinuous_> vec_play_continuous;\n    int patstim_index;\n\n    std::vector<std::pair<int, std::shared_ptr<EventTypeBase>>> events;\n\n  private:\n    void check_mechanism();\n    void transform_int_data(int elem0,\n    
                        int nodecount,\n                            int* pdata,\n                            int i,\n                            int dparam_size,\n                            int layout,\n                            int n_node_);\n    void set_net_send_buffer(Memb_list** ml_list, const std::vector<int>& pnt_offset);\n    void restore_events(FileHandler& F);\n    void fill_before_after_lists(NrnThread& nt, const std::vector<Memb_func>& memb_func);\n    void pdata_relocation(const NrnThread& nt, const std::vector<Memb_func>& memb_func);\n    void set_dependencies(const NrnThread& nt, const std::vector<Memb_func>& memb_func);\n    void handle_weights(NrnThread& nt, int n_netcon, NrnThreadChkpnt& ntc);\n    void get_info_from_bbcore(NrnThread& nt,\n                              const std::vector<Memb_func>& memb_func,\n                              NrnThreadChkpnt& ntc);\n    void set_vec_play(NrnThread& nt, NrnThreadChkpnt& ntc);\n\n    int n_real_cell;\n    int n_output;\n    int n_real_output;\n    int n_node;\n    int n_diam;  // 0 if not needed, else n_node\n    int n_mech;\n    std::vector<int> mech_types;\n    std::vector<int> nodecounts;\n    int n_idata;\n    int n_vdata;\n    int* v_parent_index;\n    /* TO DO: when this is fixed use it like that\n    std::vector<double> actual_a;\n    std::vector<double> actual_b;\n    std::vector<double> actual_area;\n    std::vector<double> actual_v;\n    std::vector<double> actual_diam;\n    */\n    double* _data;\n    struct TML {\n        std::vector<int> nodeindices;\n        std::vector<int> pdata;\n        int type;\n        std::vector<int> iArray;\n        std::vector<double> dArray;\n        std::vector<int> pointer2type;\n    };\n    std::vector<TML> tmls;\n    std::vector<int> output_vindex;\n    std::vector<double> output_threshold;\n    std::vector<int> pnttype;\n    std::vector<int> pntindex;\n    std::vector<double> weights;\n    std::vector<double> delay;\n    int num_point_process;\n};\n}  
// namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/prcellstate.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <vector>\n#include <map>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/io/nrn_setup.hpp\"\n#include \"coreneuron/network/netcon.hpp\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n\n#define precision 15\nnamespace coreneuron {\nstatic std::map<Point_process*, int> pnt2index;  // for deciding if NetCon is to be printed\nstatic int pntindex;                             // running count of printed point processes.\nstatic std::map<NetCon*, DiscreteEvent*> map_nc2src;\nstatic std::vector<int>* inv_permute_;\n\nstatic int permute(int i, NrnThread& nt) {\n    return nt._permute ? nt._permute[i] : i;\n}\n\nstatic int inv_permute(int i, NrnThread& nt) {\n    nrn_assert(i >= 0 && i < nt.end);\n    if (!nt._permute) {\n        return i;\n    }\n    if (!inv_permute_) {\n        inv_permute_ = new std::vector<int>(nt.end);\n        for (int i = 0; i < nt.end; ++i) {\n            (*inv_permute_)[nt._permute[i]] = i;\n        }\n    }\n    return (*inv_permute_)[i];\n}\n\nstatic int ml_permute(int i, Memb_list* ml) {\n    return ml->_permute ? 
ml->_permute[i] : i;\n}\n\n// Note: cellnodes array is in unpermuted order.\n\nstatic void pr_memb(int type, Memb_list* ml, int* cellnodes, NrnThread& nt, FILE* f) {\n    if (corenrn.get_is_artificial()[type])\n        return;\n\n    bool header_printed = false;\n    int size = corenrn.get_prop_param_size()[type];\n    int psize = corenrn.get_prop_dparam_size()[type];\n    bool receives_events = corenrn.get_pnt_receive()[type];\n    int layout = corenrn.get_mech_data_layout()[type];\n    int cnt = ml->nodecount;\n    for (int iorig = 0; iorig < ml->nodecount; ++iorig) {  // original index\n        int i = ml_permute(iorig, ml);                     // present index\n        int inode = ml->nodeindices[i];                    // inode is the permuted node\n        int cix = cellnodes[inv_permute(inode, nt)];       // original index relative to this cell\n        if (cix >= 0) {\n            if (!header_printed) {\n                header_printed = true;\n                fprintf(f, \"type=%d %s size=%d\\n\", type, corenrn.get_memb_func(type).sym, size);\n            }\n            if (receives_events) {\n                fprintf(f, \"%d nri %d\\n\", cix, pntindex);\n                int k = nrn_i_layout(i, cnt, 1, psize, layout);\n                Point_process* pp = (Point_process*) nt._vdata[ml->pdata[k]];\n                pnt2index[pp] = pntindex;\n                ++pntindex;\n            }\n            for (int j = 0; j < size; ++j) {\n                int k = nrn_i_layout(i, cnt, j, size, layout);\n                fprintf(f, \" %d %d %.*g\\n\", cix, j, precision, ml->data[k]);\n            }\n        }\n    }\n}\n\nstatic void pr_netcon(NrnThread& nt, FILE* f) {\n    if (pntindex == 0) {\n        return;\n    }\n    // pnt2index table has been filled\n\n    // List of NetCon for each of the NET_RECEIVE point process instances\n    // Also create the initial map of NetCon <-> DiscreteEvent (PreSyn)\n    std::vector<std::vector<NetCon*>> nclist(pntindex);\n    
map_nc2src.clear();\n    int nc_cnt = 0;\n    for (int i = 0; i < nt.n_netcon; ++i) {\n        NetCon* nc = nt.netcons + i;\n        Point_process* pp = nc->target_;\n        std::map<Point_process*, int>::iterator it = pnt2index.find(pp);\n        if (it != pnt2index.end()) {\n            nclist[it->second].push_back(nc);\n            map_nc2src[nc] = nullptr;\n            ++nc_cnt;\n        }\n    }\n    fprintf(f, \"netcons %d\\n\", nc_cnt);\n    fprintf(f, \" pntindex srcgid active delay weights\\n\");\n\n    /// Fill the NetCon <-> DiscreteEvent map with PreSyn-s\n    // presyns can come from any thread\n    for (int ith = 0; ith < nrn_nthread; ++ith) {\n        NrnThread& ntps = nrn_threads[ith];\n        for (int i = 0; i < ntps.n_presyn; ++i) {\n            PreSyn* ps = ntps.presyns + i;\n            for (int j = 0; j < ps->nc_cnt_; ++j) {\n                NetCon* nc = netcon_in_presyn_order_[ps->nc_index_ + j];\n                auto it_nc2src = map_nc2src.find(nc);\n                if (it_nc2src != map_nc2src.end()) {\n                    it_nc2src->second = ps;\n                }\n            }\n        }\n    }\n\n    /// Fill the NetCon <-> DiscreteEvent map with InputPreSyn-s\n    /// Traverse gid <-> InputPreSyn map and loop over NetCon-s of the\n    /// correspondent InputPreSyn. 
If NetCon is in the nc2src map,\n    /// remember its ips and the gid\n    std::map<NetCon*, int> map_nc2gid;\n    for (const auto& gid: gid2in) {\n        InputPreSyn* ips = gid.second;  /// input presyn\n        for (int i = 0; i < ips->nc_cnt_; ++i) {\n            NetCon* nc = netcon_in_presyn_order_[ips->nc_index_ + i];\n            auto it_nc2src = map_nc2src.find(nc);\n            if (it_nc2src != map_nc2src.end()) {\n                it_nc2src->second = ips;\n                map_nc2gid[nc] = gid.first;  /// src gid of the input presyn\n            }\n        }\n    }\n\n    for (int i = 0; i < pntindex; ++i) {\n        for (int j = 0; j < (int) (nclist[i].size()); ++j) {\n            NetCon* nc = nclist[i][j];\n            int srcgid = -3;\n            auto it_nc2src = map_nc2src.find(nc);\n            if (it_nc2src != map_nc2src.end()) {  // seems like there should be no NetCon which is\n                                                  // not in the map\n                DiscreteEvent* de = it_nc2src->second;\n                if (de && de->type() == PreSynType) {\n                    PreSyn* ps = (PreSyn*) de;\n                    srcgid = ps->gid_;\n                    Point_process* pnt = ps->pntsrc_;\n                    if (srcgid < 0 && pnt) {\n                        int type = pnt->_type;\n                        fprintf(f,\n                                \"%d %s %d %.*g\",\n                                i,\n                                corenrn.get_memb_func(type).sym,\n                                nc->active_ ? 1 : 0,\n                                precision,\n                                nc->delay_);\n                    } else if (srcgid < 0 && ps->thvar_index_ > 0) {\n                        fprintf(\n                            f, \"%d %s %d %.*g\", i, \"v\", nc->active_ ? 
1 : 0, precision, nc->delay_);\n                    } else {\n                        fprintf(f,\n                                \"%d %d %d %.*g\",\n                                i,\n                                srcgid,\n                                nc->active_ ? 1 : 0,\n                                precision,\n                                nc->delay_);\n                    }\n                } else {\n                    fprintf(f,\n                            \"%d %d %d %.*g\",\n                            i,\n                            map_nc2gid[nc],\n                            nc->active_ ? 1 : 0,\n                            precision,\n                            nc->delay_);\n                }\n            } else {\n                fprintf(f, \"%d %d %d %.*g\", i, srcgid, nc->active_ ? 1 : 0, precision, nc->delay_);\n            }\n            int wcnt = corenrn.get_pnt_receive_size()[nc->target_->_type];\n            for (int k = 0; k < wcnt; ++k) {\n                fprintf(f, \" %.*g\", precision, nt.weights[nc->u.weight_index_ + k]);\n            }\n            fprintf(f, \"\\n\");\n        }\n    }\n    // cleanup\n    nclist.clear();\n}\n\nstatic void pr_realcell(PreSyn& ps, NrnThread& nt, FILE* f) {\n    // for associating NetCons with Point_process identifiers\n\n    pntindex = 0;\n\n    // threshold variable is a voltage\n    printf(\"thvar_index_=%d end=%d\\n\", inv_permute(ps.thvar_index_, nt), nt.end);\n    if (ps.thvar_index_ < 0 || ps.thvar_index_ >= nt.end) {\n        hoc_execerror(\"gid not associated with a voltage\", 0);\n    }\n    int inode = ps.thvar_index_;\n\n    // and the root node is ...\n    int rnode = inode;\n    while (rnode >= nt.ncell) {\n        rnode = nt._v_parent_index[rnode];\n    }\n\n    // count the number of nodes in the cell\n    // do not assume all cell nodes except the root are contiguous\n    // cellnodes is an unpermuted vector\n    int* cellnodes = new int[nt.end];\n    for (int i = 0; i < 
nt.end; ++i) {\n        cellnodes[i] = -1;\n    }\n    int cnt = 0;\n    cellnodes[inv_permute(rnode, nt)] = cnt++;\n    for (int i = nt.ncell; i < nt.end; ++i) {  // think of it as unpermuted order\n        if (cellnodes[inv_permute(nt._v_parent_index[permute(i, nt)], nt)] >= 0) {\n            cellnodes[i] = cnt++;\n        }\n    }\n    fprintf(f, \"%d nodes  %d is the threshold node\\n\", cnt, cellnodes[inv_permute(inode, nt)] - 1);\n    fprintf(f, \" threshold %.*g\\n\", precision, ps.threshold_);\n    fprintf(f, \"inode parent area a b\\n\");\n    for (int iorig = 0; iorig < nt.end; ++iorig)\n        if (cellnodes[iorig] >= 0) {\n            int i = permute(iorig, nt);\n            int ip = nt._v_parent_index[i];\n            fprintf(f,\n                    \"%d %d %.*g %.*g %.*g\\n\",\n                    cellnodes[iorig],\n                    ip >= 0 ? cellnodes[inv_permute(ip, nt)] : -1,\n                    precision,\n                    nt._actual_area[i],\n                    precision,\n                    nt._actual_a[i],\n                    precision,\n                    nt._actual_b[i]);\n        }\n    fprintf(f, \"inode v\\n\");\n    for (int i = 0; i < nt.end; ++i)\n        if (cellnodes[i] >= 0) {\n            fprintf(f, \"%d %.*g\\n\", cellnodes[i], precision, nt._actual_v[permute(i, nt)]);\n        }\n\n    // each mechanism\n    for (NrnThreadMembList* tml = nt.tml; tml; tml = tml->next) {\n        pr_memb(tml->index, tml->ml, cellnodes, nt, f);\n    }\n\n    // the NetCon info (uses pnt2index)\n    pr_netcon(nt, f);\n\n    delete[] cellnodes;\n    pnt2index.clear();\n    if (inv_permute_) {\n        delete inv_permute_;\n        inv_permute_ = nullptr;\n    }\n}\n\nint prcellstate(int gid, const char* suffix) {\n    // search the NrnThread.presyns for the gid\n    for (int ith = 0; ith < nrn_nthread; ++ith) {\n        NrnThread& nt = nrn_threads[ith];\n        for (int ip = 0; ip < nt.n_presyn; ++ip) {\n            PreSyn& ps = 
nt.presyns[ip];\n            if (ps.output_index_ == gid) {\n                // found it so create a <gid>_<suffix>.corenrn file\n                std::string filename = std::to_string(gid) + \"_\" + suffix + \".corenrn\";\n                FILE* f = fopen(filename.c_str(), \"w\");\n                assert(f);\n                fprintf(f, \"gid = %d\\n\", gid);\n                fprintf(f, \"t = %.*g\\n\", precision, nt._t);\n                fprintf(f, \"celsius = %.*g\\n\", precision, celsius);\n                if (ps.thvar_index_ >= 0) {\n                    pr_realcell(ps, nt, f);\n                }\n                fclose(f);\n                return 1;\n            }\n        }\n    }\n    return 0;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/prcellstate.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\nnamespace coreneuron {\n\nextern int prcellstate(int gid, const char* suffix);\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/reports/binary_report_handler.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"binary_report_handler.hpp\"\n#ifdef ENABLE_BIN_REPORTS\n#include \"reportinglib/Records.h\"\n#endif  // ENABLE_BIN_REPORTS\n\nnamespace coreneuron {\n\nvoid BinaryReportHandler::create_report(ReportConfiguration& config,\n                                        double dt,\n                                        double tstop,\n                                        double delay) {\n#ifdef ENABLE_BIN_REPORTS\n    records_set_atomic_step(dt);\n#endif  // ENABLE_BIN_REPORTS\n    ReportHandler::create_report(config, dt, tstop, delay);\n}\n\n#ifdef ENABLE_BIN_REPORTS\nstatic void create_soma_extra(const CellMapping& mapping, std::array<int, 5>& extra) {\n    extra = {1, 0, 0, 0, 0};\n    /* report extra \"mask\" all infos not written in report: here only soma count is reported */\n    extra[1] = mapping.get_seclist_segment_count(\"soma\");\n}\n\nstatic void create_compartment_extra(const CellMapping& mapping, std::array<int, 5>& extra) {\n    extra[1] = mapping.get_seclist_section_count(\"soma\");\n    extra[2] = mapping.get_seclist_section_count(\"axon\");\n    extra[3] = mapping.get_seclist_section_count(\"dend\");\n    extra[4] = mapping.get_seclist_section_count(\"apic\");\n    extra[0] = std::accumulate(extra.begin() + 1, extra.end(), 0);\n}\n\nstatic void create_custom_extra(const CellMapping& mapping, std::array<int, 5>& extra) {\n    extra = {1, 0, 0, 0, 1};\n    extra[1] = mapping.get_seclist_section_count(\"soma\");\n    // extra[2] and extra[3]\n    extra[4] = mapping.get_seclist_section_count(\"apic\");\n    extra[0] = std::accumulate(extra.begin() + 1, extra.end(), 0);\n}\n\nvoid BinaryReportHandler::register_section_report(const NrnThread& nt,\n                   
                               const ReportConfiguration& config,\n                                                  const VarsToReport& vars_to_report,\n                                                  bool is_soma_target) {\n    create_extra_func create_extra = is_soma_target ? create_soma_extra : create_compartment_extra;\n    register_report(nt, config, vars_to_report, create_extra);\n}\n\nvoid BinaryReportHandler::register_custom_report(const NrnThread& nt,\n                                                 const ReportConfiguration& config,\n                                                 const VarsToReport& vars_to_report) {\n    create_extra_func create_extra = create_custom_extra;\n    register_report(nt, config, vars_to_report, create_extra);\n}\n\nvoid BinaryReportHandler::register_report(const NrnThread& nt,\n                                          const ReportConfiguration& config,\n                                          const VarsToReport& vars_to_report,\n                                          create_extra_func& create_extra) {\n    int sizemapping = 1;\n    int extramapping = 5;\n    std::array<int, 1> mapping = {0};\n    std::array<int, 5> extra;\n    for (const auto& var: vars_to_report) {\n        int gid = var.first;\n        auto& vars = var.second;\n        if (vars.empty()) {\n            continue;\n        }\n        const auto* mapinfo = static_cast<NrnThreadMappingInfo*>(nt.mapping);\n        const CellMapping* m = mapinfo->get_cell_mapping(gid);\n        extra[0] = vars.size();\n        create_extra(*m, extra);\n        records_add_report(config.output_path.data(),\n                           gid,\n                           gid,\n                           gid,\n                           config.start,\n                           config.stop,\n                           config.report_dt,\n                           sizemapping,\n                           config.type_str.data(),\n                           extramapping,\n        
                   config.unit.data());\n\n        records_set_report_max_buffer_size_hint(config.output_path.data(), config.buffer_size);\n        records_extra_mapping(config.output_path.data(), gid, 5, extra.data());\n        for (const auto& var: vars) {\n            mapping[0] = var.id;\n            records_add_var_with_mapping(\n                config.output_path.data(), gid, var.var_value, sizemapping, mapping.data());\n        }\n    }\n}\n#endif  // ENABLE_BIN_REPORTS\n\n}  // Namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/reports/binary_report_handler.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include <functional>\n#include <memory>\n#include <vector>\n#include <array>\n\n#include \"report_handler.hpp\"\n#include \"coreneuron/io/nrnsection_mapping.hpp\"\n\nnamespace coreneuron {\n\nclass BinaryReportHandler: public ReportHandler {\n  public:\n    void create_report(ReportConfiguration& config, double dt, double tstop, double delay) override;\n#ifdef ENABLE_BIN_REPORTS\n    void register_section_report(const NrnThread& nt,\n                                 const ReportConfiguration& config,\n                                 const VarsToReport& vars_to_report,\n                                 bool is_soma_target) override;\n    void register_custom_report(const NrnThread& nt,\n                                const ReportConfiguration& config,\n                                const VarsToReport& vars_to_report) override;\n\n  private:\n    using create_extra_func = std::function<void(const CellMapping&, std::array<int, 5>&)>;\n    void register_report(const NrnThread& nt,\n                         const ReportConfiguration& config,\n                         const VarsToReport& vars_to_report,\n                         create_extra_func& create_extra);\n#endif  // ENABLE_BIN_REPORTS\n};\n\n}  // Namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/reports/nrnreport.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <iostream>\n#include <vector>\n#include <algorithm>\n#include <map>\n#include <set>\n#include <cmath>\n\n#include \"coreneuron/network/netcon.hpp\"\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/network/netcvode.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/io/reports/nrnreport.hpp\"\n#include \"coreneuron/io/nrnsection_mapping.hpp\"\n#include \"coreneuron/mechanism/mech_mapping.hpp\"\n#include \"coreneuron/mechanism/membfunc.hpp\"\n#ifdef ENABLE_BIN_REPORTS\n#include \"reportinglib/Records.h\"\n#endif\n#ifdef ENABLE_SONATA_REPORTS\n#include \"bbp/sonata/reports.h\"\n#endif\n\nnamespace coreneuron {\n\n// Size in MB of the report buffer\nstatic int size_report_buffer = 4;\n\nvoid nrn_flush_reports(double t) {\n    // flush before buffer is full\n#ifdef ENABLE_BIN_REPORTS\n    records_end_iteration(t);\n#endif\n#ifdef ENABLE_SONATA_REPORTS\n    sonata_check_and_flush(t);\n#endif\n}\n\n/** in the current implementation, we call flush during every spike exchange\n *  interval. Hence there should be sufficient buffer to hold all reports\n *  for the duration of mindelay interval. 
In the call below we specify the\n *  number of timesteps that we have to buffer.\n *  TODO: revisit this because spike exchange can happen a few steps before/after\n *  the mindelay interval and hence adding two extra timesteps to buffer.\n */\nvoid setup_report_engine(double dt_report, double mindelay) {\n    int min_steps_to_record = static_cast<int>(std::round(mindelay / dt_report));\n    static_cast<void>(min_steps_to_record);\n#ifdef ENABLE_BIN_REPORTS\n    records_set_min_steps_to_record(min_steps_to_record);\n    records_setup_communicator();\n    records_finish_and_share();\n#endif\n#ifdef ENABLE_SONATA_REPORTS\n    sonata_set_min_steps_to_record(min_steps_to_record);\n    sonata_setup_communicators();\n    sonata_prepare_datasets();\n#endif\n}\n\n// Size in MB of the report buffers\nvoid set_report_buffer_size(int n) {\n    size_report_buffer = n;\n#ifdef ENABLE_BIN_REPORTS\n    records_set_max_buffer_size_hint(size_report_buffer);\n#endif\n#ifdef ENABLE_SONATA_REPORTS\n    sonata_set_max_buffer_size_hint(size_report_buffer);\n#endif\n}\n\nvoid finalize_report() {\n#ifdef ENABLE_BIN_REPORTS\n    records_flush(nrn_threads[0]._t);\n#endif\n#ifdef ENABLE_SONATA_REPORTS\n    sonata_flush(nrn_threads[0]._t);\n#endif\n}\n}  // Namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/reports/nrnreport.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n/**\n * @file nrnreport.h\n * @brief interface with reportinglib for soma reports\n */\n\n#ifndef _H_NRN_REPORT_\n#define _H_NRN_REPORT_\n\n#include <string>\n#include <vector>\n#include <set>\n#include <unordered_map>\n#include <cstdint>\n\n#define REPORT_MAX_NAME_LEN     256\n#define REPORT_MAX_FILEPATH_LEN 4096\n\nnamespace coreneuron {\n\nstruct SummationReport {\n    // Contains the values of the summation with index == segment_id\n    std::vector<double> summation_ = {};\n    // Map containing the pointers of the currents and its scaling factor for every segment_id\n    std::unordered_map<size_t, std::vector<std::pair<double*, int>>> currents_;\n    // Map containing the list of segment_ids per gid\n    std::unordered_map<int, std::vector<size_t>> gid_segments_;\n};\n\nstruct SummationReportMapping {\n    // Map containing a SummationReport object per report\n    std::unordered_map<std::string, SummationReport> summation_reports_;\n};\n\nstruct SpikesInfo {\n    std::string file_name = \"out\";\n    std::vector<std::pair<std::string, int>> population_info;\n};\n\n// name of the variable in mod file that is used to indicate which synapse\n// is enabled or disable for reporting\n#define SELECTED_VAR_MOD_NAME \"selected_for_report\"\n\n/// name of the variable in mod file used for setting synapse id\n#define SYNAPSE_ID_MOD_NAME \"synapseID\"\n\n/*\n * Defines the type of target, as per the following syntax:\n *   0=Compartment, 1=Cell/Soma, Section { 2=Axon, 3=Dendrite, 4=Apical }\n * The \"Comp\" variations are compartment-based (all segments, not middle only)\n */\nenum class TargetType {\n    Compartment = 0,\n    Cell = 1,\n    SectionSoma = 2,\n    SectionAxon = 3,\n    
SectionDendrite = 4,\n    SectionApical = 5,\n    SectionSomaAll = 6,\n    SectionAxonAll = 7,\n    SectionDendriteAll = 8,\n    SectionApicalAll = 9,\n};\n\n// enumeration that defines the type of report requested\nenum ReportType {\n    SomaReport,\n    CompartmentReport,\n    SynapseReport,\n    IMembraneReport,\n    SectionReport,\n    SummationReport\n};\n\n// enumeration that defines the section type for a Section report\nenum SectionType { Cell, Soma, Axon, Dendrite, Apical, All };\n\nstruct ReportConfiguration {\n    std::string name;                     // name of the report\n    std::string output_path;              // full path of the report\n    std::string target_name;              // target of the report\n    std::vector<std::string> mech_names;  // mechanism names\n    std::vector<std::string> var_names;   // variable names\n    std::vector<int> mech_ids;            // mechanisms\n    std::string unit;                     // unit of the report\n    std::string format;                   // format of the report (Bin, hdf5, SONATA)\n    std::string type_str;                 // type of report string\n    TargetType target_type;               // type of the target\n    ReportType type;                      // type of the report\n    SectionType section_type;             // type of section report\n    bool section_all_compartments;        // flag for section report (all values)\n    double report_dt;                     // reporting timestep\n    double start;                         // start time of report\n    double stop;                          // stop time of report\n    int num_gids;                         // total number of gids\n    int buffer_size;                      // hint on buffer size used for this report\n    std::vector<int> target;              // list of gids for this report\n};\n\nvoid setup_report_engine(double dt_report, double mindelay);\nstd::vector<ReportConfiguration> create_report_configurations(const std::string& 
filename,\n                                                              const std::string& output_dir,\n                                                              SpikesInfo& spikes_info);\nvoid finalize_report();\nvoid nrn_flush_reports(double t);\nvoid set_report_buffer_size(int n);\n\n}  // namespace coreneuron\n\n#endif  //_H_NRN_REPORT_\n"
  },
  {
    "path": "coreneuron/io/reports/report_configuration_parser.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <algorithm>\n#include <cstdio>\n#include <cstdlib>\n#include <cstring>\n#include <fstream>\n#include <iostream>\n#include <limits>\n#include <sstream>\n#include <string>\n#include <vector>\n\n#include \"coreneuron/io/reports/nrnreport.hpp\"\n#include \"coreneuron/mechanism/mech_mapping.hpp\"\n#include \"coreneuron/sim/fast_imem.hpp\"\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/utils/utils.hpp\"\n\nnamespace coreneuron {\n\n\n/*\n * Split filter comma separated strings (\"mech.var_name\") into mech_name and var_name\n */\nvoid parse_filter_string(const std::string& filter, ReportConfiguration& config) {\n    std::vector<std::string> mechanisms;\n    std::stringstream ss(filter);\n    std::string mechanism;\n    // Multiple report variables are separated by `,`\n    while (getline(ss, mechanism, ',')) {\n        mechanisms.push_back(mechanism);\n\n        // Split mechanism name and corresponding reporting variable\n        std::string mech_name;\n        std::string var_name;\n        std::istringstream iss(mechanism);\n        std::getline(iss, mech_name, '.');\n        std::getline(iss, var_name, '.');\n        if (var_name.empty()) {\n            var_name = \"i\";\n        }\n        config.mech_names.emplace_back(mech_name);\n        config.var_names.emplace_back(var_name);\n        if (mech_name == \"i_membrane\") {\n            nrn_use_fast_imem = true;\n        }\n    }\n}\n\nvoid register_target_type(ReportConfiguration& report, ReportType report_type) {\n    report.type = report_type;\n    switch (report.target_type) {\n        case TargetType::Compartment:\n            report.section_type = All;\n            report.section_all_compartments = 
true;\n            break;\n        case TargetType::Cell:\n            report.section_type = Cell;\n            report.section_all_compartments = false;\n            break;\n        case TargetType::SectionSoma:\n            report.section_type = Soma;\n            report.section_all_compartments = false;\n            break;\n        case TargetType::SectionSomaAll:\n            report.section_type = Soma;\n            report.section_all_compartments = true;\n            break;\n        case TargetType::SectionAxon:\n            report.section_type = Axon;\n            report.section_all_compartments = false;\n            break;\n        case TargetType::SectionAxonAll:\n            report.section_type = Axon;\n            report.section_all_compartments = true;\n            break;\n        case TargetType::SectionDendrite:\n            report.section_type = Dendrite;\n            report.section_all_compartments = false;\n            break;\n        case TargetType::SectionDendriteAll:\n            report.section_type = Dendrite;\n            report.section_all_compartments = true;\n            break;\n        case TargetType::SectionApical:\n            report.section_type = Apical;\n            report.section_all_compartments = false;\n            break;\n        case TargetType::SectionApicalAll:\n            report.section_type = Apical;\n            report.section_all_compartments = true;\n            break;\n        default:\n            std::cerr << \"Report error: unsupported target type\" << std::endl;\n            nrn_abort(1);\n    }\n}\n\nstd::vector<ReportConfiguration> create_report_configurations(const std::string& conf_file,\n                                                              const std::string& output_dir,\n                                                              SpikesInfo& spikes_info) {\n    std::string report_on;\n    int target;\n    std::ifstream report_conf(conf_file);\n\n    int num_reports = 0;\n    report_conf >> 
num_reports;\n    std::vector<ReportConfiguration> reports(num_reports);\n    for (auto& report: reports) {\n        report.buffer_size = 4;  // default size to 4 Mb\n\n        report_conf >> report.name >> report.target_name >> report.type_str >> report_on >>\n            report.unit >> report.format >> target >> report.report_dt >> report.start >>\n            report.stop >> report.num_gids >> report.buffer_size;\n\n        report.target_type = static_cast<TargetType>(target);\n        std::transform(report.type_str.begin(),\n                       report.type_str.end(),\n                       report.type_str.begin(),\n                       [](unsigned char c) { return std::tolower(c); });\n        report.output_path = output_dir + \"/\" + report.name;\n        ReportType report_type;\n        if (report.type_str == \"compartment\") {\n            report_type = SectionReport;\n            if (report_on == \"i_membrane\") {\n                nrn_use_fast_imem = true;\n                report_type = IMembraneReport;\n            }\n        } else if (report.type_str == \"synapse\") {\n            report_type = SynapseReport;\n        } else if (report.type_str == \"summation\") {\n            report_type = SummationReport;\n        } else {\n            std::cerr << \"Report error: unsupported type \" << report.type_str << std::endl;\n            nrn_abort(1);\n        }\n        register_target_type(report, report_type);\n        if (report.type == SynapseReport || report.type == SummationReport) {\n            parse_filter_string(report_on, report);\n        }\n        if (report.num_gids) {\n            report.target.resize(report.num_gids);\n            report_conf.ignore(std::numeric_limits<std::streamsize>::max(), '\\n');\n            report_conf.read(reinterpret_cast<char*>(report.target.data()),\n                             report.num_gids * sizeof(int));\n            // extra new line: skip\n            
report_conf.ignore(std::numeric_limits<std::streamsize>::max(), '\\n');\n        }\n    }\n    // read population information for spike report\n    int num_populations;\n    std::string spikes_population_name;\n    int spikes_population_offset;\n    if (report_conf.peek() == '\\n') {\n        // skip newline and move forward to spike reports\n        report_conf.ignore(std::numeric_limits<std::streamsize>::max(), '\\n');\n    }\n    if (isdigit(report_conf.peek())) {\n        report_conf >> num_populations;\n    } else {\n        // support old format: one single line \"All\"\n        num_populations = 1;\n    }\n    for (int i = 0; i < num_populations; i++) {\n        if (!(report_conf >> spikes_population_name >> spikes_population_offset)) {\n            // support old format: one single line \"All\"\n            report_conf >> spikes_population_name;\n            spikes_population_offset = 0;\n        }\n        spikes_info.population_info.emplace_back(\n            std::make_pair(spikes_population_name, spikes_population_offset));\n    }\n    report_conf >> spikes_info.file_name;\n\n    return reports;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/reports/report_event.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"report_event.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/io/reports/nrnreport.hpp\"\n#include \"coreneuron/utils/nrn_assert.h\"\n#ifdef ENABLE_BIN_REPORTS\n#include \"reportinglib/Records.h\"\n#endif  // ENABLE_BIN_REPORTS\n#ifdef ENABLE_SONATA_REPORTS\n#include \"bbp/sonata/reports.h\"\n#endif  // ENABLE_SONATA_REPORTS\n\nnamespace coreneuron {\n\n#if defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\nReportEvent::ReportEvent(double dt,\n                         double tstart,\n                         const VarsToReport& filtered_gids,\n                         const char* name,\n                         double report_dt)\n    : dt(dt)\n    , tstart(tstart)\n    , report_path(name)\n    , report_dt(report_dt)\n    , vars_to_report(filtered_gids) {\n    nrn_assert(filtered_gids.size());\n    step = tstart / dt;\n    reporting_period = static_cast<int>(report_dt / dt);\n    gids_to_report.reserve(filtered_gids.size());\n    for (const auto& gid: filtered_gids) {\n        gids_to_report.push_back(gid.first);\n    }\n    std::sort(gids_to_report.begin(), gids_to_report.end());\n}\n\nvoid ReportEvent::summation_alu(NrnThread* nt) {\n    // Sum currents only on reporting steps\n    if (step > 0 && (static_cast<int>(step) % reporting_period) == 0) {\n        auto& summation_report = nt->summation_report_handler_->summation_reports_[report_path];\n        // Add currents of all variables in each segment\n        double sum = 0.0;\n        for (const auto& kv: summation_report.currents_) {\n            int segment_id = kv.first;\n            for (const auto& value: kv.second) {\n                double current_value = *value.first;\n         
       int scale = value.second;\n                sum += current_value * scale;\n            }\n            summation_report.summation_[segment_id] = sum;\n            sum = 0.0;\n        }\n        // Add all currents in the soma\n        // Only when type summation and soma target\n        if (!summation_report.gid_segments_.empty()) {\n            double sum_soma = 0.0;\n            for (const auto& kv: summation_report.gid_segments_) {\n                int gid = kv.first;\n                for (const auto& segment_id: kv.second) {\n                    sum_soma += summation_report.summation_[segment_id];\n                }\n                *(vars_to_report[gid].front().var_value) = sum_soma;\n                sum_soma = 0.0;\n            }\n        }\n    }\n}\n\n/** on deliver, call ReportingLib and setup next event */\nvoid ReportEvent::deliver(double t, NetCvode* nc, NrnThread* nt) {\n/* reportinglib is not thread safe */\n#pragma omp critical\n    {\n        summation_alu(nt);\n        // each thread needs to know its own step\n#ifdef ENABLE_BIN_REPORTS\n        records_nrec(step, gids_to_report.size(), gids_to_report.data(), report_path.data());\n#endif\n#ifdef ENABLE_SONATA_REPORTS\n        sonata_record_node_data(step,\n                                gids_to_report.size(),\n                                gids_to_report.data(),\n                                report_path.data());\n#endif\n        send(t + dt, nc, nt);\n        step++;\n    }\n}\n\nbool ReportEvent::require_checkpoint() {\n    return false;\n}\n#endif  // defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\n\n}  // Namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/reports/report_event.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include <algorithm>\n#include <unordered_map>\n#include <vector>\n#include <string>\n\n#include \"coreneuron/network/netcon.hpp\"\n#include \"coreneuron/network/netcvode.hpp\"\n\nnamespace coreneuron {\n\n#if defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\nstruct VarWithMapping {\n    uint32_t id;\n    double* var_value;\n    VarWithMapping(int id_, double* v_)\n        : id(id_)\n        , var_value(v_) {}\n};\n\n// mapping the set of variables pointers to report to its gid\nusing VarsToReport = std::unordered_map<uint64_t, std::vector<VarWithMapping>>;\n\nclass ReportEvent: public DiscreteEvent {\n  public:\n    ReportEvent(double dt,\n                double tstart,\n                const VarsToReport& filtered_gids,\n                const char* name,\n                double report_dt);\n\n    /** on deliver, call ReportingLib and setup next event */\n    void deliver(double t, NetCvode* nc, NrnThread* nt) override;\n    bool require_checkpoint() override;\n    void summation_alu(NrnThread* nt);\n\n  private:\n    double dt;\n    double step;\n    std::string report_path;\n    double report_dt;\n    int reporting_period;\n    std::vector<int> gids_to_report;\n    double tstart;\n    VarsToReport vars_to_report;\n};\n#endif  // defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\n\n}  // Namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/reports/report_handler.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"report_handler.hpp\"\n#include \"coreneuron/io/nrnsection_mapping.hpp\"\n#include \"coreneuron/mechanism/mech_mapping.hpp\"\n#include \"coreneuron/utils/utils.hpp\"\n\nnamespace coreneuron {\n\ntemplate <typename T>\nstd::vector<T> intersection_gids(const NrnThread& nt, std::vector<T>& target_gids) {\n    std::vector<int> thread_gids;\n    for (int i = 0; i < nt.ncell; i++) {\n        thread_gids.push_back(nt.presyns[i].gid_);\n    }\n    std::vector<T> intersection;\n\n    std::sort(thread_gids.begin(), thread_gids.end());\n    std::sort(target_gids.begin(), target_gids.end());\n\n    std::set_intersection(thread_gids.begin(),\n                          thread_gids.end(),\n                          target_gids.begin(),\n                          target_gids.end(),\n                          back_inserter(intersection));\n\n    return intersection;\n}\n\nvoid ReportHandler::create_report(ReportConfiguration& report_config,\n                                  double dt,\n                                  double tstop,\n                                  double delay) {\n#if defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\n    if (report_config.start < t) {\n        report_config.start = t;\n    }\n    report_config.stop = std::min(report_config.stop, tstop);\n\n    for (const auto& mech: report_config.mech_names) {\n        report_config.mech_ids.emplace_back(nrn_get_mechtype(mech.data()));\n    }\n    if (report_config.type == SynapseReport && report_config.mech_ids.empty()) {\n        std::cerr << \"[ERROR] mechanism to report: \" << report_config.mech_names[0]\n                  << \" is not mapped in this simulation, cannot report on it \\n\";\n        
nrn_abort(1);\n    }\n    for (int ith = 0; ith < nrn_nthread; ++ith) {\n        NrnThread& nt = nrn_threads[ith];\n        double* report_variable = nt._actual_v;\n        if (!nt.ncell) {\n            continue;\n        }\n        const std::vector<int>& nodes_to_gid = map_gids(nt);\n        const std::vector<int> gids_to_report = intersection_gids(nt, report_config.target);\n        VarsToReport vars_to_report;\n        bool is_soma_target;\n        switch (report_config.type) {\n            case IMembraneReport:\n                report_variable = nt.nrn_fast_imem->nrn_sav_rhs;\n            case SectionReport:\n                vars_to_report = get_section_vars_to_report(nt,\n                                                            gids_to_report,\n                                                            report_variable,\n                                                            report_config.section_type,\n                                                            report_config.section_all_compartments);\n                is_soma_target = report_config.section_type == SectionType::Soma ||\n                                 report_config.section_type == SectionType::Cell;\n                register_section_report(nt, report_config, vars_to_report, is_soma_target);\n                break;\n            case SummationReport:\n                vars_to_report =\n                    get_summation_vars_to_report(nt, gids_to_report, report_config, nodes_to_gid);\n                register_custom_report(nt, report_config, vars_to_report);\n                break;\n            default:\n                vars_to_report =\n                    get_synapse_vars_to_report(nt, gids_to_report, report_config, nodes_to_gid);\n                register_custom_report(nt, report_config, vars_to_report);\n        }\n        if (!vars_to_report.empty()) {\n            auto report_event = std::make_unique<ReportEvent>(\n                dt, t, vars_to_report, 
report_config.output_path.data(), report_config.report_dt);\n            report_event->send(t, net_cvode_instance, &nt);\n            m_report_events.push_back(std::move(report_event));\n        }\n    }\n#else\n    if (nrnmpi_myid == 0) {\n        std::cerr << \"[WARNING] : Reporting is disabled. Please recompile with either libsonata or \"\n                     \"reportinglib. \\n\";\n    }\n#endif  // defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\n}\n\n#if defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\nvoid ReportHandler::register_section_report(const NrnThread& nt,\n                                            const ReportConfiguration& config,\n                                            const VarsToReport& vars_to_report,\n                                            bool is_soma_target) {\n    if (nrnmpi_myid == 0) {\n        std::cerr << \"[WARNING] : Format '\" << config.format << \"' in report '\"\n                  << config.output_path << \"' not supported.\\n\";\n    }\n}\nvoid ReportHandler::register_custom_report(const NrnThread& nt,\n                                           const ReportConfiguration& config,\n                                           const VarsToReport& vars_to_report) {\n    if (nrnmpi_myid == 0) {\n        std::cerr << \"[WARNING] : Format '\" << config.format << \"' in report '\"\n                  << config.output_path << \"' not supported.\\n\";\n    }\n}\n\nstd::string getSectionTypeStr(SectionType type) {\n    switch (type) {\n        case All:\n            return \"All\";\n        case Cell:\n        case Soma:\n            return \"soma\";\n        case Axon:\n            return \"axon\";\n        case Dendrite:\n            return \"dend\";\n        case Apical:\n            return \"apic\";\n        default:\n            std::cerr << \"SectionType not handled in getSectionTypeStr\" << std::endl;\n            nrn_abort(1);\n    }\n}\n\nvoid register_sections_to_report(const SecMapping* 
sections,\n                                 std::vector<VarWithMapping>& to_report,\n                                 double* report_variable,\n                                 bool all_compartments) {\n    for (const auto& section: sections->secmap) {\n        // compartment_id\n        int section_id = section.first;\n        const auto& segment_ids = section.second;\n\n        // get all compartment values (otherwise, just middle point)\n        if (all_compartments) {\n            for (const auto& segment_id: segment_ids) {\n                // corresponding voltage in coreneuron voltage array\n                double* variable = report_variable + segment_id;\n                to_report.emplace_back(VarWithMapping(section_id, variable));\n            }\n        } else {\n            nrn_assert(segment_ids.size() % 2);\n            // corresponding voltage in coreneuron voltage array\n            const auto segment_id = segment_ids[segment_ids.size() / 2];\n            double* variable = report_variable + segment_id;\n            to_report.emplace_back(VarWithMapping(section_id, variable));\n        }\n    }\n}\n\nVarsToReport ReportHandler::get_section_vars_to_report(const NrnThread& nt,\n                                                       const std::vector<int>& gids_to_report,\n                                                       double* report_variable,\n                                                       SectionType section_type,\n                                                       bool all_compartments) const {\n    VarsToReport vars_to_report;\n    const auto& section_type_str = getSectionTypeStr(section_type);\n    const auto* mapinfo = static_cast<NrnThreadMappingInfo*>(nt.mapping);\n    if (!mapinfo) {\n        std::cerr << \"[COMPARTMENTS] Error : mapping information is missing for a Cell group \"\n                  << nt.ncell << '\\n';\n        nrn_abort(1);\n    }\n\n    for (const auto& gid: gids_to_report) {\n        const auto& 
cell_mapping = mapinfo->get_cell_mapping(gid);\n        if (cell_mapping == nullptr) {\n            std::cerr\n                << \"[COMPARTMENTS] Error : Compartment mapping information is missing for gid \"\n                << gid << '\\n';\n            nrn_abort(1);\n        }\n        std::vector<VarWithMapping> to_report;\n        to_report.reserve(cell_mapping->size());\n\n        if (section_type_str == \"All\") {\n            const auto& section_mapping = cell_mapping->secmapvec;\n            for (const auto& sections: section_mapping) {\n                register_sections_to_report(sections, to_report, report_variable, all_compartments);\n            }\n        } else {\n            /** get section list mapping for the type, if available */\n            if (cell_mapping->get_seclist_section_count(section_type_str) > 0) {\n                const auto& sections = cell_mapping->get_seclist_mapping(section_type_str);\n                register_sections_to_report(sections, to_report, report_variable, all_compartments);\n            }\n        }\n        vars_to_report[gid] = to_report;\n    }\n    return vars_to_report;\n}\n\nVarsToReport ReportHandler::get_summation_vars_to_report(\n    const NrnThread& nt,\n    const std::vector<int>& gids_to_report,\n    const ReportConfiguration& report,\n    const std::vector<int>& nodes_to_gids) const {\n    VarsToReport vars_to_report;\n    const auto* mapinfo = static_cast<NrnThreadMappingInfo*>(nt.mapping);\n    auto& summation_report = nt.summation_report_handler_->summation_reports_[report.output_path];\n    if (!mapinfo) {\n        std::cerr << \"[COMPARTMENTS] Error : mapping information is missing for a Cell group \"\n                  << nt.ncell << '\\n';\n        nrn_abort(1);\n    }\n\n    for (const auto& gid: gids_to_report) {\n        bool has_imembrane = false;\n        // In case we need conversion of units\n        int scale = 1;\n        for (auto i = 0; i < report.mech_ids.size(); ++i) {\n            auto 
mech_id = report.mech_ids[i];\n            auto var_name = report.var_names[i];\n            auto mech_name = report.mech_names[i];\n            if (mech_name != \"i_membrane\") {\n                // need special handling for Clamp processes to flip the current value\n                if (mech_name == \"IClamp\" || mech_name == \"SEClamp\") {\n                    scale = -1;\n                }\n                Memb_list* ml = nt._ml_list[mech_id];\n                if (!ml) {\n                    continue;\n                }\n\n                for (int j = 0; j < ml->nodecount; j++) {\n                    auto segment_id = ml->nodeindices[j];\n                    if ((nodes_to_gids[ml->nodeindices[j]] == gid)) {\n                        double* var_value =\n                            get_var_location_from_var_name(mech_id, var_name.data(), ml, j);\n                        summation_report.currents_[segment_id].push_back(\n                            std::make_pair(var_value, scale));\n                    }\n                }\n            } else {\n                has_imembrane = true;\n            }\n        }\n        const auto& cell_mapping = mapinfo->get_cell_mapping(gid);\n        if (cell_mapping == nullptr) {\n            std::cerr << \"[SUMMATION] Error : Compartment mapping information is missing for gid \"\n                      << gid << '\\n';\n            nrn_abort(1);\n        }\n        std::vector<VarWithMapping> to_report;\n        to_report.reserve(cell_mapping->size());\n        summation_report.summation_.resize(nt.end);\n        double* report_variable = summation_report.summation_.data();\n        const auto& section_type_str = getSectionTypeStr(report.section_type);\n        if (report.section_type != SectionType::All) {\n            if (cell_mapping->get_seclist_section_count(section_type_str) > 0) {\n                const auto& sections = cell_mapping->get_seclist_mapping(section_type_str);\n                
register_sections_to_report(sections,\n                                            to_report,\n                                            report_variable,\n                                            report.section_all_compartments);\n            }\n        }\n        const auto& section_mapping = cell_mapping->secmapvec;\n        for (const auto& sections: section_mapping) {\n            for (auto& section: sections->secmap) {\n                // compartment_id\n                int section_id = section.first;\n                auto& segment_ids = section.second;\n                for (const auto& segment_id: segment_ids) {\n                    // corresponding voltage in coreneuron voltage array\n                    if (has_imembrane) {\n                        summation_report.currents_[segment_id].push_back(\n                            std::make_pair(nt.nrn_fast_imem->nrn_sav_rhs + segment_id, 1));\n                    }\n                    if (report.section_type == SectionType::All) {\n                        double* variable = report_variable + segment_id;\n                        to_report.emplace_back(VarWithMapping(section_id, variable));\n                    } else if (report.section_type == SectionType::Cell) {\n                        summation_report.gid_segments_[gid].push_back(segment_id);\n                    }\n                }\n            }\n        }\n        vars_to_report[gid] = to_report;\n    }\n    return vars_to_report;\n}\n\nVarsToReport ReportHandler::get_synapse_vars_to_report(\n    const NrnThread& nt,\n    const std::vector<int>& gids_to_report,\n    const ReportConfiguration& report,\n    const std::vector<int>& nodes_to_gids) const {\n    VarsToReport vars_to_report;\n    for (const auto& gid: gids_to_report) {\n        // There can only be 1 mechanism\n        nrn_assert(report.mech_ids.size() == 1);\n        auto mech_id = report.mech_ids[0];\n        auto var_name = report.var_names[0];\n        Memb_list* ml = 
nt._ml_list[mech_id];\n        if (!ml) {\n            continue;\n        }\n        std::vector<VarWithMapping> to_report;\n        to_report.reserve(ml->nodecount);\n\n        for (int j = 0; j < ml->nodecount; j++) {\n            double* is_selected =\n                get_var_location_from_var_name(mech_id, SELECTED_VAR_MOD_NAME, ml, j);\n            bool report_variable = false;\n\n            /// if there is no variable in mod file then report on every compartment\n            /// otherwise check the flag set in mod file\n            if (is_selected == nullptr) {\n                report_variable = true;\n            } else {\n                report_variable = *is_selected != 0.;\n            }\n            if ((nodes_to_gids[ml->nodeindices[j]] == gid) && report_variable) {\n                double* var_value = get_var_location_from_var_name(mech_id, var_name.data(), ml, j);\n                double* synapse_id =\n                    get_var_location_from_var_name(mech_id, SYNAPSE_ID_MOD_NAME, ml, j);\n                nrn_assert(synapse_id && var_value);\n                to_report.emplace_back(static_cast<int>(*synapse_id), var_value);\n            }\n        }\n        if (!to_report.empty()) {\n            vars_to_report[gid] = to_report;\n        }\n    }\n    return vars_to_report;\n}\n\n// map GIDs of every compartment, it consist in a backward sweep then forward sweep algorithm\nstd::vector<int> ReportHandler::map_gids(const NrnThread& nt) const {\n    std::vector<int> nodes_gid(nt.end, -1);\n    // backward sweep: from presyn compartment propagate back GID to parent\n    for (int i = 0; i < nt.n_presyn; i++) {\n        const int gid = nt.presyns[i].gid_;\n        const int thvar_index = nt.presyns[i].thvar_index_;\n        // only for non artificial cells\n        if (thvar_index >= 0) {\n            // setting all roots gids of the presyns nodes,\n            // index 0 have parent set to 0, so we must stop at j > 0\n            // also 0 is the parent 
of all, so it is an error to attribute a GID to it.\n            nodes_gid[thvar_index] = gid;\n            for (int j = thvar_index; j > 0; j = nt._v_parent_index[j]) {\n                nodes_gid[nt._v_parent_index[j]] = gid;\n            }\n        }\n    }\n    // forward sweep: set each compartment node to the GID of its root,\n    //  which was set in the loop above. This works only because compartments are stored in order:\n    //  parents followed by children\n    for (int i = nt.ncell + 1; i < nt.end; i++) {\n        nodes_gid[i] = nodes_gid[nt._v_parent_index[i]];\n    }\n    return nodes_gid;\n}\n#endif  // defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\n\n}  // Namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/reports/report_handler.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include <memory>\n#include <vector>\n\n#include \"nrnreport.hpp\"\n#include \"coreneuron/io/reports/report_event.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\n\nnamespace coreneuron {\n\nclass ReportHandler {\n  public:\n    virtual ~ReportHandler() = default;\n\n    virtual void create_report(ReportConfiguration& config, double dt, double tstop, double delay);\n#if defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\n    virtual void register_section_report(const NrnThread& nt,\n                                         const ReportConfiguration& config,\n                                         const VarsToReport& vars_to_report,\n                                         bool is_soma_target);\n    virtual void register_custom_report(const NrnThread& nt,\n                                        const ReportConfiguration& config,\n                                        const VarsToReport& vars_to_report);\n    VarsToReport get_section_vars_to_report(const NrnThread& nt,\n                                            const std::vector<int>& gids_to_report,\n                                            double* report_variable,\n                                            SectionType section_type,\n                                            bool all_compartments) const;\n    VarsToReport get_summation_vars_to_report(const NrnThread& nt,\n                                              const std::vector<int>& gids_to_report,\n                                              const ReportConfiguration& report,\n                                              const std::vector<int>& nodes_to_gids) const;\n    VarsToReport get_synapse_vars_to_report(const NrnThread& 
nt,\n                                            const std::vector<int>& gids_to_report,\n                                            const ReportConfiguration& report,\n                                            const std::vector<int>& nodes_to_gids) const;\n    std::vector<int> map_gids(const NrnThread& nt) const;\n#endif  // defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\n  protected:\n#if defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\n    std::vector<std::unique_ptr<ReportEvent>> m_report_events;\n#endif  // defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\n};\n\n}  // Namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/reports/sonata_report_handler.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"sonata_report_handler.hpp\"\n#include \"coreneuron/network/netcvode.hpp\"\n#include \"coreneuron/network/netcon.hpp\"\n#include \"coreneuron/io/nrnsection_mapping.hpp\"\n#include \"coreneuron/mechanism/mech_mapping.hpp\"\n#ifdef ENABLE_SONATA_REPORTS\n#include \"bbp/sonata/reports.h\"\n#endif  // ENABLE_SONATA_REPORTS\n\nnamespace coreneuron {\n\nvoid SonataReportHandler::create_report(ReportConfiguration& config,\n                                        double dt,\n                                        double tstop,\n                                        double delay) {\n#ifdef ENABLE_SONATA_REPORTS\n    sonata_set_atomic_step(dt);\n#endif  // ENABLE_SONATA_REPORTS\n    ReportHandler::create_report(config, dt, tstop, delay);\n}\n\n#ifdef ENABLE_SONATA_REPORTS\nvoid SonataReportHandler::register_section_report(const NrnThread& nt,\n                                                  const ReportConfiguration& config,\n                                                  const VarsToReport& vars_to_report,\n                                                  bool is_soma_target) {\n    register_report(nt, config, vars_to_report);\n}\n\nvoid SonataReportHandler::register_custom_report(const NrnThread& nt,\n                                                 const ReportConfiguration& config,\n                                                 const VarsToReport& vars_to_report) {\n    register_report(nt, config, vars_to_report);\n}\n\nstd::pair<std::string, int> SonataReportHandler::get_population_info(int gid) {\n    if (m_spikes_info.population_info.empty()) {\n        return std::make_pair(\"All\", 0);\n    }\n    std::pair<std::string, int> prev = 
m_spikes_info.population_info.front();\n    for (const auto& name_offset: m_spikes_info.population_info) {\n        std::string pop_name = name_offset.first;\n        int pop_offset = name_offset.second;\n        if (pop_offset > gid) {\n            break;\n        }\n        prev = name_offset;\n    }\n    return prev;\n}\n\nvoid SonataReportHandler::register_report(const NrnThread& nt,\n                                          const ReportConfiguration& config,\n                                          const VarsToReport& vars_to_report) {\n    sonata_create_report(config.output_path.data(),\n                         config.start,\n                         config.stop,\n                         config.report_dt,\n                         config.unit.data(),\n                         config.type_str.data());\n    sonata_set_report_max_buffer_size_hint(config.output_path.data(), config.buffer_size);\n\n    for (const auto& kv: vars_to_report) {\n        uint64_t gid = kv.first;\n        const std::vector<VarWithMapping>& vars = kv.second;\n        if (!vars.size())\n            continue;\n\n        const auto& pop_info = get_population_info(gid);\n        std::string population_name = pop_info.first;\n        int population_offset = pop_info.second;\n        sonata_add_node(config.output_path.data(), population_name.data(), population_offset, gid);\n        sonata_set_report_max_buffer_size_hint(config.output_path.data(), config.buffer_size);\n        for (const auto& variable: vars) {\n            sonata_add_element(config.output_path.data(),\n                               population_name.data(),\n                               gid,\n                               variable.id,\n                               variable.var_value);\n        }\n    }\n}\n#endif  // ENABLE_SONATA_REPORTS\n}  // Namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/reports/sonata_report_handler.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include <memory>\n#include <vector>\n\n#include \"report_handler.hpp\"\n\nnamespace coreneuron {\n\nclass SonataReportHandler: public ReportHandler {\n  public:\n    SonataReportHandler(const SpikesInfo& spikes_info)\n        : m_spikes_info(spikes_info) {}\n\n    void create_report(ReportConfiguration& config, double dt, double tstop, double delay) override;\n#ifdef ENABLE_SONATA_REPORTS\n    void register_section_report(const NrnThread& nt,\n                                 const ReportConfiguration& config,\n                                 const VarsToReport& vars_to_report,\n                                 bool is_soma_target) override;\n    void register_custom_report(const NrnThread& nt,\n                                const ReportConfiguration& config,\n                                const VarsToReport& vars_to_report) override;\n\n  private:\n    void register_report(const NrnThread& nt,\n                         const ReportConfiguration& config,\n                         const VarsToReport& vars_to_report);\n    std::pair<std::string, int> get_population_info(int gid);\n#endif  // ENABLE_SONATA_REPORTS\n\n  private:\n    SpikesInfo m_spikes_info;\n};\n\n}  // Namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/setup_fornetcon.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/io/setup_fornetcon.hpp\"\n#include \"coreneuron/network/netcon.hpp\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include <map>\n#include <utility>\n\nnamespace coreneuron {\n\n/**\n   If FOR_NETCON in use, setup NrnThread fornetcon related info.\n\n   i.e NrnThread._fornetcon_perm_indices, NrnThread._fornetcon_weight_perm,\n   and the relevant dparam element of each mechanism instance that uses\n   a FOR_NETCONS statement.\n\n   Makes use of nrn_fornetcon_cnt_, nrn_fornetcon_type_,\n   and nrn_fornetcon_index_ that were specified during registration of\n   mechanisms that use FOR_NETCONS.\n\n   nrn_fornetcon_cnt_ is the number of mechanisms that use FOR_NETCONS,\n   nrn_fornetcon_type_ is an int array of size nrn_fornetcon_cnt, that specifies\n   the mechanism type.\n   nrn_fornetcon_index_ is an int array of size nrn_fornetcon_cnt, that\n   specifies the index into an instance's dparam int array having the\n   fornetcon semantics.\n\n   FOR_NETCONS (args) means to loop over all NetCon connecting to this\n   target instance and args are the names of the items of each NetCon's\n   weight vector (same as the enclosing NET_RECEIVE but possible different\n   local names).\n\n   NrnThread._weights is a vector of weight groups where the number of groups\n   is the number of NetCon in this thread and each group has a size\n   equal to the number of args in the target NET_RECEIVE block. 
The order\n   of these groups is the NetCon Object order in HOC (the construction order).\n   So the weight vector indices for the NetCons in the FOR_NETCONS loop\n   are not adjacent.\n\n   NrnThread._fornetcon_weight_perm is an index vector into the\n   NrnThread._weight vector such that the list of indices that targets a\n   mechanism instance are adjacent.\n   NrnThread._fornetcon_perm_indices is an index vector into the\n   NrnThread._fornetcon_weight_perm to the first of the list of NetCon weights\n   that target the instance. The index of _fornetcon_perm_indices\n   containing this first in the list is stored in the mechanism instances\n   dparam at the dparam's semantic fornetcon slot. (Note that the next index\n   points to the first index of the next target instance.)\n\n**/\n\nstatic int* fornetcon_slot(const int mtype,\n                           const int instance,\n                           const int fnslot,\n                           const NrnThread& nt) {\n    int layout = corenrn.get_mech_data_layout()[mtype];\n    int sz = corenrn.get_prop_dparam_size()[mtype];\n    Memb_list* ml = nt._ml_list[mtype];\n    int* fn = nullptr;\n    if (layout == Layout::AoS) {\n        fn = ml->pdata + (instance * sz + fnslot);\n    } else if (layout == Layout::SoA) {\n        int padded_cnt = nrn_soa_padded_size(ml->nodecount, layout);\n        fn = ml->pdata + (fnslot * padded_cnt + instance);\n    }\n    return fn;\n}\n\nvoid setup_fornetcon_info(NrnThread& nt) {\n    if (nrn_fornetcon_cnt_ == 0) {\n        return;\n    }\n\n    // Mechanism types in use that have FOR_NETCONS statements\n    // Nice to have the dparam fornetcon slot as well so use map\n    // instead of set\n    std::map<int, int> type_to_slot;\n    for (int i = 0; i < nrn_fornetcon_cnt_; ++i) {\n        int type = nrn_fornetcon_type_[i];\n        Memb_list* ml = nt._ml_list[type];\n        if (ml && ml->nodecount) {\n            type_to_slot[type] = nrn_fornetcon_index_[i];\n        }\n    }\n 
   if (type_to_slot.empty()) {\n        return;\n    }\n\n    // How many NetCons (weight groups) are involved.\n    // Also count how many weight groups for each target instance.\n    // For the latter we can count in the dparam fornetcon slot.\n\n    // zero the dparam fornetcon slot for counting and count number of slots.\n    size_t n_perm_indices = 0;\n    for (const auto& kv: type_to_slot) {\n        int mtype = kv.first;\n        int fnslot = kv.second;\n        int nodecount = nt._ml_list[mtype]->nodecount;\n        for (int i = 0; i < nodecount; ++i) {\n            int* fn = fornetcon_slot(mtype, i, fnslot, nt);\n            *fn = 0;\n            n_perm_indices += 1;\n        }\n    }\n\n    // Count how many weight groups for each slot and total number of weight groups\n    size_t n_weight_perm = 0;\n    for (int i = 0; i < nt.n_netcon; ++i) {\n        NetCon& nc = nt.netcons[i];\n        int mtype = nc.target_->_type;\n        auto search = type_to_slot.find(mtype);\n        if (search != type_to_slot.end()) {\n            int i_instance = nc.target_->_i_instance;\n            int* fn = fornetcon_slot(mtype, i_instance, search->second, nt);\n            *fn += 1;\n            n_weight_perm += 1;\n        }\n    }\n\n    // Displacement vector has an extra element since the number for last item\n    // at n-1 is x[n] - x[n-1] and number for first is x[0] = 0.\n    delete[] std::exchange(nt._fornetcon_perm_indices, nullptr);\n    delete[] std::exchange(nt._fornetcon_weight_perm, nullptr);\n    // Manual memory management because of needing to copy NrnThread to the GPU\n    // and update device-side pointers there. 
Note the {} ensure the allocated\n    // arrays are zero-initalised.\n    nt._fornetcon_perm_indices_size = n_perm_indices + 1;\n    nt._fornetcon_perm_indices = new size_t[nt._fornetcon_perm_indices_size]{};\n    nt._fornetcon_weight_perm_size = n_weight_perm;\n    nt._fornetcon_weight_perm = new size_t[nt._fornetcon_weight_perm_size]{};\n\n    // From dparam fornetcon slots, compute displacement vector, and\n    // set the dparam fornetcon slot to the index of the displacement vector\n    // to allow later filling the _fornetcon_weight_perm.\n    size_t i_perm_indices = 0;\n    nt._fornetcon_perm_indices[0] = 0;\n    for (const auto& kv: type_to_slot) {\n        int mtype = kv.first;\n        int fnslot = kv.second;\n        int nodecount = nt._ml_list[mtype]->nodecount;\n        for (int i = 0; i < nodecount; ++i) {\n            int* fn = fornetcon_slot(mtype, i, fnslot, nt);\n            nt._fornetcon_perm_indices[i_perm_indices + 1] =\n                nt._fornetcon_perm_indices[i_perm_indices] + size_t(*fn);\n            *fn = int(nt._fornetcon_perm_indices[i_perm_indices]);\n            i_perm_indices += 1;\n        }\n    }\n\n    // One more iteration over NetCon to fill in weight index for\n    // nt._fornetcon_weight_perm. 
To help with this we increment the\n    // dparam fornetcon slot on each use.\n    for (int i = 0; i < nt.n_netcon; ++i) {\n        NetCon& nc = nt.netcons[i];\n        int mtype = nc.target_->_type;\n        auto search = type_to_slot.find(mtype);\n        if (search != type_to_slot.end()) {\n            int i_instance = nc.target_->_i_instance;\n            int* fn = fornetcon_slot(mtype, i_instance, search->second, nt);\n            size_t nc_w_index = size_t(nc.u.weight_index_);\n            nt._fornetcon_weight_perm[size_t(*fn)] = nc_w_index;\n            *fn += 1;  // next item conceptually adjacent\n        }\n    }\n\n    // Put back the proper values into the dparam fornetcon slot\n    i_perm_indices = 0;\n    for (const auto& kv: type_to_slot) {\n        int mtype = kv.first;\n        int fnslot = kv.second;\n        int nodecount = nt._ml_list[mtype]->nodecount;\n        for (int i = 0; i < nodecount; ++i) {\n            int* fn = fornetcon_slot(mtype, i, fnslot, nt);\n            *fn = int(i_perm_indices);\n            i_perm_indices += 1;\n        }\n    }\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/setup_fornetcon.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include \"coreneuron/sim/multicore.hpp\"\n\nnamespace coreneuron {\n\n/**\n   If FOR_NETCONS is in use, set up the NrnThread fornetcon-related info.\n**/\n\nvoid setup_fornetcon_info(NrnThread& nt);\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/io/user_params.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\nnamespace coreneuron {\n\nclass CheckPoints;\n\n/// This structure holds data needed in several parts of nrn_setup, phase1 and phase2.\n/// These were previously global variables; they are grouped so that they can be passed as a\n/// single argument. For the most part they are unrelated to each other.\nstruct UserParams {\n    UserParams(int ngroup_,\n               int* gidgroups_,\n               const char* path_,\n               const char* restore_path_,\n               CheckPoints& checkPoints_)\n        : ngroup(ngroup_)\n        , gidgroups(gidgroups_)\n        , path(path_)\n        , restore_path(restore_path_)\n        , file_reader(ngroup_)\n        , checkPoints(checkPoints_) {}\n\n    /// direct memory mode with neuron, do not open files\n    /// Number of local cell groups\n    const int ngroup;\n    /// Array of cell group numbers (indices)\n    const int* const gidgroups;\n    /// path to dataset file\n    const char* const path;\n    /// Dataset path from where simulation is being restored\n    const char* const restore_path;\n    std::vector<FileHandler> file_reader;\n    CheckPoints& checkPoints;\n};\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mechanism/capac.cpp",
    "content": "/***\n  THIS FILE IS AUTO GENERATED DONT MODIFY IT.\n ***/\n/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/permute/data_layout.hpp\"\n\n#define _PRAGMA_FOR_INIT_ACC_LOOP_                                                               \\\n    nrn_pragma_acc(parallel loop present(vdata [0:_cntml_padded * nparm]) if (_nt->compute_gpu)) \\\n    nrn_pragma_omp(target teams distribute parallel for simd if(_nt->compute_gpu))\n#define _STRIDE _cntml_padded + _iml\n\nnamespace coreneuron {\n\nstatic const char* mechanism[] = {\"0\", \"capacitance\", \"cm\", 0, \"i_cap\", 0, 0};\nvoid nrn_alloc_capacitance(double*, Datum*, int);\nvoid nrn_init_capacitance(NrnThread*, Memb_list*, int);\nvoid nrn_jacob_capacitance(NrnThread*, Memb_list*, int);\nvoid nrn_div_capacity(NrnThread*, Memb_list*, int);\nvoid nrn_mul_capacity(NrnThread*, Memb_list*, int);\n\n#define nparm 2\n\nvoid capacitance_reg(void) {\n    /* all methods deal with capacitance in special ways */\n    register_mech(mechanism,\n                  nrn_alloc_capacitance,\n                  nullptr,\n                  nullptr,\n                  nullptr,\n                  nrn_init_capacitance,\n                  nullptr,\n                  nullptr,\n                  -1,\n                  1);\n    int mechtype = nrn_get_mechtype(mechanism[1]);\n    _nrn_layout_reg(mechtype, SOA_LAYOUT);\n    hoc_register_prop_size(mechtype, nparm, 0);\n}\n\n#define cm    vdata[0 * _STRIDE]\n#define i_cap vdata[1 * _STRIDE]\n\n/*\ncj is analogous to 1/dt for cvode and daspk\nfor fixed step second order it is 2/dt and\nfor pure implicit fixed step it is 1/dt\nIt used to be static but is now a thread data variable\n*/\n\nvoid 
nrn_jacob_capacitance(NrnThread* _nt, Memb_list* ml, int /* type */) {\n    int _cntml_actual = ml->nodecount;\n    int _cntml_padded = ml->_nodecount_padded;\n    int _iml;\n    double* vdata;\n    double cfac = .001 * _nt->cj;\n    (void) _cntml_padded; /* unused when layout=1*/\n\n    double* _vec_d = _nt->_actual_d;\n\n    { /*if (use_cachevec) {*/\n        int* ni = ml->nodeindices;\n\n        vdata = ml->data;\n        nrn_pragma_acc(parallel loop present(vdata [0:_cntml_padded * nparm],\n                                             ni [0:_cntml_actual],\n                                             _vec_d [0:_nt->end]) if (_nt->compute_gpu)\n                           async(_nt->stream_id))\n        nrn_pragma_omp(target teams distribute parallel for simd if(_nt->compute_gpu))\n        for (_iml = 0; _iml < _cntml_actual; _iml++) {\n            _vec_d[ni[_iml]] += cfac * cm;\n        }\n    }\n}\n\nvoid nrn_init_capacitance(NrnThread* _nt, Memb_list* ml, int /* type */) {\n    int _cntml_actual = ml->nodecount;\n    int _cntml_padded = ml->_nodecount_padded;\n    double* vdata;\n    (void) _cntml_padded; /* unused */\n\n    // skip initialization if restoring from checkpoint\n    if (_nrn_skip_initmodel == 1) {\n        return;\n    }\n\n    vdata = ml->data;\n    _PRAGMA_FOR_INIT_ACC_LOOP_\n    for (int _iml = 0; _iml < _cntml_actual; _iml++) {\n        i_cap = 0;\n    }\n}\n\nvoid nrn_cur_capacitance(NrnThread* _nt, Memb_list* ml, int /* type */) {\n    int _cntml_actual = ml->nodecount;\n    int _cntml_padded = ml->_nodecount_padded;\n    double* vdata;\n    double cfac = .001 * _nt->cj;\n\n    /*@todo: verify cfac is being copied !! 
*/\n\n    (void) _cntml_padded; /* unused when layout=1*/\n\n    /* since rhs is dvm for a full or half implicit step */\n    /* (nrn_update_2d() replaces dvi by dvi-dvx) */\n    /* no need to distinguish secondorder */\n    int* ni = ml->nodeindices;\n    double* _vec_rhs = _nt->_actual_rhs;\n\n    vdata = ml->data;\n    nrn_pragma_acc(parallel loop present(vdata [0:_cntml_padded * nparm],\n                                         ni [0:_cntml_actual],\n                                         _vec_rhs [0:_nt->end]) if (_nt->compute_gpu)\n                       async(_nt->stream_id))\n    nrn_pragma_omp(target teams distribute parallel for simd if(_nt->compute_gpu))\n    for (int _iml = 0; _iml < _cntml_actual; _iml++) {\n        i_cap = cfac * cm * _vec_rhs[ni[_iml]];\n    }\n}\n\n/* the rest can be constructed automatically from the above info*/\n\nvoid nrn_alloc_capacitance(double* data, Datum* pdata, int type) {\n    (void) pdata;\n    (void) type;      /* unused */\n    data[0] = DEF_cm; /*default capacitance/cm^2*/\n}\n\nvoid nrn_div_capacity(NrnThread* _nt, Memb_list* ml, int type) {\n    (void) type;\n    int _cntml_actual = ml->nodecount;\n    int _cntml_padded = ml->_nodecount_padded;\n    int _iml;\n    double* vdata;\n    (void) _nt;\n    (void) type;\n    (void) _cntml_padded; /* unused */\n\n    int* ni = ml->nodeindices;\n\n    vdata = ml->data;\n    _PRAGMA_FOR_INIT_ACC_LOOP_\n    for (_iml = 0; _iml < _cntml_actual; _iml++) {\n        i_cap = VEC_RHS(ni[_iml]);\n        VEC_RHS(ni[_iml]) /= 1.e-3 * cm;\n        // fprintf(stderr, \"== nrn_div_cap: RHS[%d]=%.12f\\n\", ni[_iml], VEC_RHS(ni[_iml])) ;\n    }\n}\n\nvoid nrn_mul_capacity(NrnThread* _nt, Memb_list* ml, int type) {\n    (void) type;\n    int _cntml_actual = ml->nodecount;\n    int _cntml_padded = ml->_nodecount_padded;\n    int _iml;\n    double* vdata;\n    (void) _nt;\n    (void) type;\n    (void) _cntml_padded; /* unused */\n\n    int* ni = ml->nodeindices;\n\n    const double cfac = 
.001 * _nt->cj;\n\n    vdata = ml->data;\n    _PRAGMA_FOR_INIT_ACC_LOOP_\n    for (_iml = 0; _iml < _cntml_actual; _iml++) {\n        VEC_RHS(ni[_iml]) *= cfac * cm;\n    }\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mechanism/eion.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n/// THIS FILE IS AUTO GENERATED DONT MODIFY IT.\n\n#include <math.h>\n#include <string.h>\n\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/mechanism/membfunc.hpp\"\n#include \"coreneuron/permute/data_layout.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n\n#define _STRIDE _cntml_padded + _iml\n\nnamespace coreneuron {\n\n// for each ion it refers to internal concentration, external concentration, and charge,\nconst int ion_global_map_member_size = 3;\n\n\n#define nparm 5\nstatic const char* mechanism[] = {/*just a template*/\n                                  \"0\",\n                                  \"na_ion\",\n                                  \"ena\",\n                                  \"nao\",\n                                  \"nai\",\n                                  0,\n                                  \"ina\",\n                                  \"dina_dv_\",\n                                  0,\n                                  0};\n\nvoid nrn_init_ion(NrnThread*, Memb_list*, int);\nvoid nrn_alloc_ion(double*, Datum*, int);\n\nstatic int na_ion, k_ion, ca_ion; /* will get type for these special ions */\n\nint nrn_is_ion(int type) {\n    // Old: commented to remove dependency on memb_func and alloc function\n    // return (memb_func[type].alloc == ion_alloc);\n    return (type < nrn_ion_global_map_size            // type smaller than largest ion's\n            && nrn_ion_global_map[type] != nullptr);  // allocated ion charge variables\n}\n\nint nrn_ion_global_map_size;\ndouble** nrn_ion_global_map;\n#define global_conci(type)  nrn_ion_global_map[type][0]\n#define global_conco(type)  
nrn_ion_global_map[type][1]\n#define global_charge(type) nrn_ion_global_map[type][2]\n\ndouble nrn_ion_charge(int type) {\n    return global_charge(type);\n}\n\nvoid ion_reg(const char* name, double valence) {\n    char buf[7][50];\n#define VAL_SENTINAL -10000.\n\n    sprintf(buf[0], \"%s_ion\", name);\n    sprintf(buf[1], \"e%s\", name);\n    sprintf(buf[2], \"%si\", name);\n    sprintf(buf[3], \"%so\", name);\n    sprintf(buf[5], \"i%s\", name);\n    sprintf(buf[6], \"di%s_dv_\", name);\n    for (int i = 0; i < 7; i++) {\n        mechanism[i + 1] = buf[i];\n    }\n    mechanism[5] = nullptr; /* buf[4] not used above */\n    int mechtype = nrn_get_mechtype(buf[0]);\n    if (mechtype >= nrn_ion_global_map_size ||\n        nrn_ion_global_map[mechtype] == nullptr) {  // if hasn't yet been allocated\n\n        // allocates mem for ion in ion_map and sets null all non-ion types\n        if (nrn_ion_global_map_size <= mechtype) {\n            int size = mechtype + 1;\n            nrn_ion_global_map = (double**) erealloc(nrn_ion_global_map, sizeof(double*) * size);\n\n            for (int i = nrn_ion_global_map_size; i < mechtype; i++) {\n                nrn_ion_global_map[i] = nullptr;\n            }\n            nrn_ion_global_map_size = mechtype + 1;\n        }\n        nrn_ion_global_map[mechtype] = (double*) emalloc(ion_global_map_member_size *\n                                                         sizeof(double));\n\n        register_mech((const char**) mechanism,\n                      nrn_alloc_ion,\n                      nrn_cur_ion,\n                      nullptr,\n                      nullptr,\n                      nrn_init_ion,\n                      nullptr,\n                      nullptr,\n                      -1,\n                      1);\n        mechtype = nrn_get_mechtype(mechanism[1]);\n        _nrn_layout_reg(mechtype, SOA_LAYOUT);\n        hoc_register_prop_size(mechtype, nparm, 1);\n        hoc_register_dparam_semantics(mechtype, 0, 
\"iontype\");\n        nrn_writes_conc(mechtype, 1);\n\n        {\n            // See https://en.cppreference.com/w/cpp/io/c/fprintf: If a call to\n            // sprintf or snprintf causes copying to take place between objects\n            // that overlap, the behavior is undefined.\n            std::string const old_buf_0{buf[0]};\n            sprintf(buf[0], \"%si0_%s\", name, old_buf_0.c_str());\n        }\n        sprintf(buf[1], \"%so0_%s\", name, buf[0]);\n        if (strcmp(\"na\", name) == 0) {\n            na_ion = mechtype;\n            global_conci(mechtype) = DEF_nai;\n            global_conco(mechtype) = DEF_nao;\n            global_charge(mechtype) = 1.;\n        } else if (strcmp(\"k\", name) == 0) {\n            k_ion = mechtype;\n            global_conci(mechtype) = DEF_ki;\n            global_conco(mechtype) = DEF_ko;\n            global_charge(mechtype) = 1.;\n        } else if (strcmp(\"ca\", name) == 0) {\n            ca_ion = mechtype;\n            global_conci(mechtype) = DEF_cai;\n            global_conco(mechtype) = DEF_cao;\n            global_charge(mechtype) = 2.;\n        } else {\n            global_conci(mechtype) = DEF_ioni;\n            global_conco(mechtype) = DEF_iono;\n            global_charge(mechtype) = VAL_SENTINAL;\n        }\n    }\n    double val = global_charge(mechtype);\n    if (valence != VAL_SENTINAL && val != VAL_SENTINAL && valence != val) {\n        fprintf(stderr,\n                \"%s ion valence defined differently in\\n\\\ntwo USEION statements (%g and %g)\\n\",\n                buf[0],\n                valence,\n                global_charge(mechtype));\n        nrn_exit(1);\n    } else if (valence == VAL_SENTINAL && val == VAL_SENTINAL) {\n        fprintf(stderr,\n                \"%s ion valence must be defined in\\n\\\nthe USEION statement of any model using this ion\\n\",\n                buf[0]);\n        nrn_exit(1);\n    } else if (valence != VAL_SENTINAL) {\n        global_charge(mechtype) = 
valence;\n    }\n}\n\n#if VECTORIZE\n#define erev   pd[0 * _STRIDE] /* From Eion */\n#define conci  pd[1 * _STRIDE]\n#define conco  pd[2 * _STRIDE]\n#define cur    pd[3 * _STRIDE]\n#define dcurdv pd[4 * _STRIDE]\n\n/*\n handle erev, conci, conc0 \"in the right way\" according to ion_style\n default. See nrn/lib/help/nrnoc.help.\nion_style(\"name_ion\", [c_style, e_style, einit, eadvance, cinit])\n\n ica is assigned\n eca is parameter but if conc exists then eca is assigned\n if conc is nrnocCONST then eca calculated on finitialize\n if conc is STATE then eca calculated on fadvance and conc finitialize\n        with global nai0, nao0\n\n nernst(ci, co, charge) and ghk(v, ci, co, charge) available to hoc\n and models.\n*/\n\n#define iontype ppd[_iml] /* how _AMBIGUOUS is to be handled */\n/*the bitmap is\n03\tconcentration unused, nrnocCONST, DEP, STATE\n04\tinitialize concentrations\n030\treversal potential unused, nrnocCONST, DEP, STATE\n040\tinitialize reversal potential\n0100\tcalc reversal during fadvance\n0200\tci being written by a model\n0400\tco being written by a model\n*/\n\n#define charge global_charge(type)\n#define conci0 global_conci(type)\n#define conco0 global_conco(type)\n\ndouble nrn_nernst_coef(int type) {\n    /* for computing jacobian element dconc'/dconc */\n    return ktf(celsius) / charge;\n}\n\n/* Must be called prior to any channels which update the currents */\nvoid nrn_cur_ion(NrnThread* nt, Memb_list* ml, int type) {\n    int _cntml_actual = ml->nodecount;\n    double* pd;\n    Datum* ppd;\n    (void) nt; /* unused */\n    /*printf(\"ion_cur %s\\n\", memb_func[type].sym->name);*/\n    int _cntml_padded = ml->_nodecount_padded;\n    pd = ml->data;\n    ppd = ml->pdata;\n    // clang-format off\n    nrn_pragma_acc(parallel loop present(pd[0:_cntml_padded * 5],\n                                         ppd[0:_cntml_actual],\n                                         nrn_ion_global_map[0:nrn_ion_global_map_size]\n                              
                             [0:ion_global_map_member_size])\n                                 if (nt->compute_gpu)\n                                 async(nt->stream_id))\n    // clang-format on\n    nrn_pragma_omp(target teams distribute parallel for simd if(nt->compute_gpu))\n    for (int _iml = 0; _iml < _cntml_actual; ++_iml) {\n        dcurdv = 0.;\n        cur = 0.;\n        if (iontype & 0100) {\n            erev = nrn_nernst(conci, conco, charge, celsius);\n        }\n    };\n}\n\n/* Must be called prior to other models which possibly also initialize\n        concentrations based on their own states\n*/\nvoid nrn_init_ion(NrnThread* nt, Memb_list* ml, int type) {\n    int _cntml_actual = ml->nodecount;\n    double* pd;\n    Datum* ppd;\n    (void) nt; /* unused */\n\n    // skip initialization if restoring from checkpoint\n    if (_nrn_skip_initmodel == 1) {\n        return;\n    }\n\n    /*printf(\"ion_init %s\\n\", memb_func[type].sym->name);*/\n    int _cntml_padded = ml->_nodecount_padded;\n    pd = ml->data;\n    ppd = ml->pdata;\n    // There was no async(...) clause in the initial OpenACC implementation, so\n    // no `nowait` clause has been added to the OpenMP implementation. 
TODO:\n    // verify if this can be made asynchronous or if there is a strong reason it\n    // needs to be like this.\n    // clang-format off\n    nrn_pragma_acc(parallel loop present(pd[0:_cntml_padded * 5],\n                                         ppd[0:_cntml_actual],\n                                         nrn_ion_global_map[0:nrn_ion_global_map_size]\n                                                           [0:ion_global_map_member_size])\n                                 if (nt->compute_gpu))\n    // clang-format on\n    nrn_pragma_omp(target teams distribute parallel for simd if(nt->compute_gpu))\n    for (int _iml = 0; _iml < _cntml_actual; ++_iml) {\n        if (iontype & 04) {\n            conci = conci0;\n            conco = conco0;\n        }\n        if (iontype & 040) {\n            erev = nrn_nernst(conci, conco, charge, celsius);\n        }\n    }\n}\n\nvoid nrn_alloc_ion(double* p, Datum* ppvar, int _type) {\n    assert(0);\n}\n\nvoid second_order_cur(NrnThread* _nt, int secondorder) {\n    int _cntml_padded;\n    double* pd;\n    (void) _nt; /* unused */\n    double* _vec_rhs = _nt->_actual_rhs;\n\n    if (secondorder == 2) {\n        for (NrnThreadMembList* tml = _nt->tml; tml; tml = tml->next)\n            if (nrn_is_ion(tml->index)) {\n                Memb_list* ml = tml->ml;\n                int _cntml_actual = ml->nodecount;\n                int* ni = ml->nodeindices;\n                _cntml_padded = ml->_nodecount_padded;\n                pd = ml->data;\n                nrn_pragma_acc(parallel loop present(pd [0:_cntml_padded * 5],\n                                                     ni [0:_cntml_actual],\n                                                     _vec_rhs [0:_nt->end]) if (_nt->compute_gpu)\n                                   async(_nt->stream_id))\n                nrn_pragma_omp(target teams distribute parallel for simd if(_nt->compute_gpu))\n                for (int _iml = 0; _iml < _cntml_actual; ++_iml) {\n            
        cur += dcurdv * (_vec_rhs[ni[_iml]]);\n                }\n            }\n    }\n}\n}  // namespace coreneuron\n#endif\n"
  },
  {
    "path": "coreneuron/mechanism/eion.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n/// THIS FILE IS AUTO GENERATED DONT MODIFY IT.\n\n#pragma once\n\nnamespace coreneuron {\n\nextern int nrn_is_ion(int);\nextern void ion_reg(const char*, double);\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mechanism/mech/cfile/cabvars.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\nnamespace coreneuron {\n\nextern void capacitance_reg(void), _passive_reg(void),\n#if EXTRACELLULAR\n    extracell_reg_(void),\n#endif\n    _stim_reg(void), _hh_reg(void), _netstim_reg(void), _expsyn_reg(void), _exp2syn_reg(void),\n    _svclmp_reg(void);\n\nstatic void (*mechanism[])(void) = {/* type will start at 3 */\n                                    capacitance_reg,\n                                    _passive_reg,\n#if EXTRACELLULAR\n                                    /* extracellular requires special handling and must be type 5 */\n                                    extracell_reg_,\n#endif\n                                    nullptr};\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mechanism/mech/enginemech.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n/**\n * \\file\n * \\brief Provides interface function for CoreNEURON mechanism library and NEURON\n *\n * libcorenrnmech is a interface library provided to building standalone executable\n * special-core. Also, it is used by NEURON to run CoreNEURON via dlopen to execute\n * models via in-memory transfer.\n */\n\n#include <cstdlib>\n#include <coreneuron/engine.h>\n\nnamespace coreneuron {\n\n/** Mechanism registration function\n *\n * If external mechanisms present then use modl_reg function generated\n * in mod_func.cpp otherwise use empty one.\n */\n#ifdef ADDITIONAL_MECHS\nextern void modl_reg();\n#else\nvoid modl_reg() {}\n#endif\n\n/// variables defined in coreneuron library\nextern bool nrn_have_gaps;\nextern bool nrn_use_fast_imem;\n\n/// function defined in coreneuron library\nextern void nrn_cleanup_ion_map();\n}  // namespace coreneuron\n\n/** Initialize mechanisms and run simulation using CoreNEURON\n *\n * This is mainly used to build nrniv-core executable\n */\nint solve_core(int argc, char** argv) {\n    mk_mech_init(argc, argv);\n    coreneuron::modl_reg();\n    int ret = run_solve_core(argc, argv);\n    coreneuron::nrn_cleanup_ion_map();\n    return ret;\n}\n\nextern \"C\" {\n\n/// global variables from coreneuron library\nextern bool corenrn_embedded;\nextern int corenrn_embedded_nthread;\n\n/// parse arguments from neuron and prepare new one for coreneuron\nchar* prepare_args(int& argc, char**& argv, int use_mpi, const char* mpi_lib, const char* nrn_arg);\n\n/// initialize standard mechanisms from coreneuron\nvoid mk_mech_init(int argc, char** argv);\n\n/// set openmp threads equal to neuron's pthread\nvoid set_openmp_threads(int nthread);\n\n/** Run CoreNEURON in 
embedded mode with NEURON\n *\n * @param nthread Number of Pthreads on NEURON side\n * @param have_gaps True if gap junctions are used\n * @param use_mpi True if MPI is used on NEURON side\n * @param use_fast_imem True if fast imembrane calculation is enabled\n * @param nrn_arg Command line arguments passed by NEURON\n * @return 1 if embedded mode is used otherwise 0\n * \\todo Change return type semantics\n */\nint corenrn_embedded_run(int nthread,\n                         int have_gaps,\n                         int use_mpi,\n                         int use_fast_imem,\n                         const char* mpi_lib,\n                         const char* nrn_arg) {\n    // set coreneuron's internal variable based on neuron arguments\n    corenrn_embedded = true;\n    corenrn_embedded_nthread = nthread;\n    coreneuron::nrn_have_gaps = have_gaps != 0;\n    coreneuron::nrn_use_fast_imem = use_fast_imem != 0;\n\n    // set number of openmp threads\n    set_openmp_threads(nthread);\n\n    // pre-process arguments from neuron and prepare new ones for coreneuron\n    int argc;\n    char** argv;\n    char* new_arg = prepare_args(argc, argv, use_mpi, mpi_lib, nrn_arg);\n\n    // initialize internal arguments\n    mk_mech_init(argc, argv);\n\n    // initialize extra arguments built into special-core\n    static bool modl_reg_called = false;\n    if (!modl_reg_called) {\n        coreneuron::modl_reg();\n        modl_reg_called = true;\n    }\n    // run simulation\n    run_solve_core(argc, argv);\n\n    // free temporary string created from prepare_args\n    free(new_arg);\n\n    // delete array for argv\n    delete[] argv;\n\n    return corenrn_embedded ? 1 : 0;\n}\n}\n"
  },
  {
    "path": "coreneuron/mechanism/mech/mod2c_core_thread.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/mechanism/mechanism.hpp\"\n#include \"coreneuron/utils/offload.hpp\"\n\nnamespace coreneuron {\n\n#define _STRIDE _cntml_padded + _iml\n\n#define _threadargscomma_ _iml, _cntml_padded, _p, _ppvar, _thread, _nt, _ml, _v,\n#define _threadargsprotocomma_                                                                    \\\n    int _iml, int _cntml_padded, double *_p, Datum *_ppvar, ThreadDatum *_thread, NrnThread *_nt, \\\n        Memb_list *_ml, double _v,\n#define _threadargs_ _iml, _cntml_padded, _p, _ppvar, _thread, _nt, _ml, _v\n#define _threadargsproto_                                                                         \\\n    int _iml, int _cntml_padded, double *_p, Datum *_ppvar, ThreadDatum *_thread, NrnThread *_nt, \\\n        Memb_list *_ml, double _v\n\nstruct Elm {\n    unsigned row;        /* Row location */\n    unsigned col;        /* Column location */\n    double* value;       /* The value SOA  _cntml_padded of them*/\n    struct Elm* r_up;    /* Link to element in same column */\n    struct Elm* r_down;  /*       in solution order */\n    struct Elm* c_left;  /* Link to left element in same row */\n    struct Elm* c_right; /*       in solution order (see getelm) */\n};\n\nstruct Item {\n    Elm* elm{};\n    unsigned norder{}; /* order of a row */\n    Item* next{};\n    Item* prev{};\n};\n\nusing List = Item; /* list of mixed items */\n\nstruct SparseObj {            /* all the state information */\n    Elm** rowst{};            /* link to first element in row (solution order)*/\n    Elm** diag{};             /* link to pivot element in row (solution order)*/\n    void* elmpool{};   
       /* no interthread cache line sharing for elements */\n    unsigned neqn{};          /* number of equations */\n    unsigned _cntml_padded{}; /* number of instances */\n    unsigned* varord{};       /* row and column order for pivots */\n    double* rhs{};            /* initially- right hand side        finally - answer */\n    unsigned* ngetcall{};     /* per instance counter for number of calls to _getelm */\n    int phase{};              /* 0-solution phase; 1-count phase; 2-build list phase */\n    int numop{};\n    unsigned coef_list_size{};\n    double** coef_list{}; /* pointer to (first instance) value in _getelm order */\n    /* don't really need the rest */\n    int nroworder{};   /* just for freeing */\n    Item** roworder{}; /* roworder[i] is pointer to order item for row i.\n                             Does not have to be in orderlist */\n    List* orderlist{}; /* list of rows sorted by norder\n                             that haven't been used */\n    int do_flag{};\n};\n\nextern void _nrn_destroy_sparseobj_thread(SparseObj* so);\n\n// derived from nrn/src/scopmath/euler.c\n// updated for aos/soa layout index\ntemplate <typename F>\nint euler_thread(int neqn, int* var, int* der, F fun, _threadargsproto_) {\n    double const dt{_nt->_dt};\n    /* calculate the derivatives */\n    fun(_threadargs_);  // std::invoke in C++17\n    /* update dependent variables */\n    for (int i = 0; i < neqn; i++) {\n        _p[var[i] * _STRIDE] += dt * (_p[der[i] * _STRIDE]);\n    }\n    return 0;\n}\n\ntemplate <typename F>\nint derivimplicit_thread(int n, int* slist, int* dlist, F fun, _threadargsproto_) {\n    fun(_threadargs_);  // std::invoke in C++17\n    return 0;\n}\n\nvoid nrn_sparseobj_copyto_device(SparseObj* so);\nvoid nrn_sparseobj_delete_from_device(SparseObj* so);\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mechanism/mech/mod_func.c.pl",
    "content": "#!/usr/bin/perl\n#\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n#Construct the modl_reg() function from a provided list\n#of modules.\n\n#Usage : mod_func.c.pl[MECH1.mod MECH2.mod...]\n\n@mods = @ARGV;\ns/\\.mod$// foreach @mods;\n\n@mods=sort @mods;\n\nif(!@mods) {\n    print STDERR \"mod_func.c.pl: No mod files provided\";\n    print \"// No mod files provided\nnamespace coreneuron {\n  void modl_reg() {}\n}\n\";\n    exit 0;\n}\n\nprint << \"__eof\";\n#include <cstdio>\nnamespace coreneuron {\nextern int nrnmpi_myid;\nextern int nrn_nobanner_;\nextern int @{[join \",\\n  \", map{\"_${_}_reg(void)\"} @mods]};\n\nvoid modl_reg() {\n    if (!nrn_nobanner_ && nrnmpi_myid < 1) {\n        fprintf(stderr, \" Additional mechanisms from files\\\\n\");\n        @{[join \"\\n        \",\n           map{\"fprintf(stderr, \\\" $_.mod\\\");\"} @mods] }\n        fprintf(stderr, \"\\\\n\\\\n\");\n    }\n\n    @{[join \"\\n    \", map{\"_${_}_reg();\"} @mods] }\n}\n} //namespace coreneuron\n__eof\n"
  },
  {
    "path": "coreneuron/mechanism/mech/modfile/exp2syn.mod",
    "content": "COMMENT\nTwo state kinetic scheme synapse described by rise time tau1,\nand decay time constant tau2. The normalized peak conductance is 1.\nDecay time MUST be greater than rise time.\n\nThe solution of A->G->bath with rate constants 1/tau1 and 1/tau2 is\n A = a*exp(-t/tau1) and\n G = a*tau2/(tau2-tau1)*(-exp(-t/tau1) + exp(-t/tau2))\n    where tau1 < tau2\n\nIf tau2-tau1 is very small compared to tau1, this is an alphasynapse with time constant tau2.\nIf tau1/tau2 is very small, this is a single exponential decay with time constant tau2.\n\nThe factor is evaluated in the initial block \nsuch that an event of weight 1 generates a\npeak conductance of 1.\n\nBecause the solution is a sum of exponentials, the\ncoupled equations can be solved as a pair of independent equations\nby the more efficient cnexp method.\n\nENDCOMMENT\n\nNEURON {\n    POINT_PROCESS Exp2Syn\n    RANGE tau1, tau2, e, i\n    NONSPECIFIC_CURRENT i\n\n    RANGE g\n}\n\nUNITS {\n    (nA) = (nanoamp)\n    (mV) = (millivolt)\n    (uS) = (microsiemens)\n}\n\nPARAMETER {\n    tau1 = 0.1 (ms) <1e-9,1e9>\n    tau2 = 10 (ms) <1e-9,1e9>\n    e=0 (mV)\n}\n\nASSIGNED {\n    v (mV)\n    i (nA)\n    g (uS)\n    factor\n}\n\nSTATE {\n    A (uS)\n    B (uS)\n}\n\nINITIAL {\n    LOCAL tp\n    if (tau1/tau2 > 0.9999) {\n        tau1 = 0.9999*tau2\n    }\n    if (tau1/tau2 < 1e-9) {\n        tau1 = tau2*1e-9\n    }\n    A = 0\n    B = 0\n    tp = (tau1*tau2)/(tau2 - tau1) * log(tau2/tau1)\n    factor = -exp(-tp/tau1) + exp(-tp/tau2)\n    factor = 1/factor\n}\n\nBREAKPOINT {\n    SOLVE state METHOD cnexp\n    g = B - A\n    i = g*(v - e)\n}\n\nDERIVATIVE state {\n    A' = -A/tau1\n    B' = -B/tau2\n}\n\nNET_RECEIVE(weight (uS)) {\n    A = A + weight*factor\n    B = B + weight*factor\n}\n"
  },
  {
    "path": "coreneuron/mechanism/mech/modfile/expsyn.mod",
    "content": "NEURON {\n\tPOINT_PROCESS ExpSyn\n\tRANGE tau, e, i\n\tNONSPECIFIC_CURRENT i\n}\n\nUNITS {\n\t(nA) = (nanoamp)\n\t(mV) = (millivolt)\n\t(uS) = (microsiemens)\n}\n\nPARAMETER {\n\ttau = 0.1 (ms) <1e-9,1e9>\n\te = 0\t(mV)\n}\n\nASSIGNED {\n\tv (mV)\n\ti (nA)\n}\n\nSTATE {\n\tg (uS)\n}\n\nINITIAL {\n\tg=0\n}\n\nBREAKPOINT {\n\tSOLVE state METHOD cnexp\n\ti = g*(v - e)\n}\n\nDERIVATIVE state {\n\tg' = -g/tau\n}\n\nNET_RECEIVE(weight (uS)) {\n\tg = g + weight\n}\n"
  },
  {
    "path": "coreneuron/mechanism/mech/modfile/hh.mod",
    "content": "TITLE hh.mod   squid sodium, potassium, and leak channels\n \nCOMMENT\n This is the original Hodgkin-Huxley treatment for the set of sodium, \n  potassium, and leakage channels found in the squid giant axon membrane.\n  (\"A quantitative description of membrane current and its application \n  conduction and excitation in nerve\" J.Physiol. (Lond.) 117:500-544 (1952).)\n Membrane voltage is in absolute mV and has been reversed in polarity\n  from the original HH convention and shifted to reflect a resting potential\n  of -65 mV.\n Remember to set celsius=6.3 (or whatever) in your HOC file.\n See squid.hoc for an example of a simulation using this model.\n SW Jaslove  6 March, 1992\nENDCOMMENT\n \nUNITS {\n        (mA) = (milliamp)\n        (mV) = (millivolt)\n\t(S) = (siemens)\n}\n \n? interface\nNEURON {\n        SUFFIX hh\n        USEION na READ ena WRITE ina\n        USEION k READ ek WRITE ik\n        NONSPECIFIC_CURRENT il\n        RANGE gnabar, gkbar, gl, el, gna, gk\n        :GLOBAL minf, hinf, ninf, mtau, htau, ntau\n        RANGE minf, hinf, ninf, mtau, htau, ntau\n\tTHREADSAFE : assigned GLOBALs will be per thread\n}\n \nPARAMETER {\n        gnabar = .12 (S/cm2)\t<0,1e9>\n        gkbar = .036 (S/cm2)\t<0,1e9>\n        gl = .0003 (S/cm2)\t<0,1e9>\n        el = -54.3 (mV)\n}\n \nSTATE {\n        m h n\n}\n \nASSIGNED {\n        v (mV)\n        celsius (degC)\n        ena (mV)\n        ek (mV)\n\n\tgna (S/cm2)\n\tgk (S/cm2)\n        ina (mA/cm2)\n        ik (mA/cm2)\n        il (mA/cm2)\n        minf hinf ninf\n\tmtau (ms) htau (ms) ntau (ms)\n}\n \n? currents\nBREAKPOINT {\n        SOLVE states METHOD cnexp\n        gna = gnabar*m*m*m*h\n\tina = gna*(v - ena)\n        gk = gkbar*n*n*n*n\n\tik = gk*(v - ek)      \n        il = gl*(v - el)\n}\n \n \nINITIAL {\n\trates(v)\n\tm = minf\n\th = hinf\n\tn = ninf\n}\n\n? 
states\nDERIVATIVE states {  \n        rates(v)\n        m' =  (minf-m)/mtau\n        h' = (hinf-h)/htau\n        n' = (ninf-n)/ntau\n}\n \n:LOCAL q10\n\n\n? rates\nPROCEDURE rates(v(mV)) {  :Computes rate and other constants at current v.\n                      :Call once from HOC to initialize inf at resting v.\n        LOCAL  alpha, beta, sum, q10\n:        TABLE minf, mtau, hinf, htau, ninf, ntau DEPEND celsius FROM -100 TO 100 WITH 200\n\nUNITSOFF\n        q10 = 3^((celsius - 6.3)/10)\n                :\"m\" sodium activation system\n        alpha = .1 * vtrap(-(v+40),10)\n        beta =  4 * exp(-(v+65)/18)\n        sum = alpha + beta\n\tmtau = 1/(q10*sum)\n        minf = alpha/sum\n                :\"h\" sodium inactivation system\n        alpha = .07 * exp(-(v+65)/20)\n        beta = 1 / (exp(-(v+35)/10) + 1)\n        sum = alpha + beta\n\thtau = 1/(q10*sum)\n        hinf = alpha/sum\n                :\"n\" potassium activation system\n        alpha = .01*vtrap(-(v+55),10) \n        beta = .125*exp(-(v+65)/80)\n\tsum = alpha + beta\n        ntau = 1/(q10*sum)\n        ninf = alpha/sum\n}\n \nFUNCTION vtrap(x,y) {  :Traps for 0 in denominator of rate eqns.\n        if (fabs(x/y) < 1e-6) {\n                vtrap = y*(1 - x/y/2)\n        }else{\n                vtrap = x/(exp(x/y) - 1)\n        }\n}\n \nUNITSON\n"
  },
  {
    "path": "coreneuron/mechanism/mech/modfile/netstim.mod",
    "content": ": $Id: netstim.mod 2212 2008-09-08 14:32:26Z hines $\n: comments at end\n\n: the Random idiom has been extended to support CoreNEURON.\n\n: For backward compatibility, noiseFromRandom(hocRandom) can still be used\n: as well as the default low-quality scop_exprand generator.\n: However, CoreNEURON will not accept usage of the low-quality generator,\n: and, if noiseFromRandom is used to specify the random stream, that stream\n: must be using the Random123 generator.\n\n: The recommended idiom for specfication of the random stream is to use\n: noiseFromRandom123(id1, id2[, id3])\n\n: If any instance uses noiseFromRandom123, then no instance can use noiseFromRandom\n: and vice versa.\n\nNEURON\t{ \n  ARTIFICIAL_CELL NetStim\n  RANGE interval, number, start\n  RANGE noise\n  THREADSAFE : only true if every instance has its own distinct Random\n  BBCOREPOINTER donotuse\n}\n\nPARAMETER {\n\tinterval\t= 10 (ms) <1e-9,1e9>: time between spikes (msec)\n\tnumber\t= 10 <0,1e9>\t: number of spikes (independent of noise)\n\tstart\t\t= 50 (ms)\t: start of first spike\n\tnoise\t\t= 0 <0,1>\t: amount of randomness (0.0 - 1.0)\n}\n\nASSIGNED {\n\tevent (ms)\n\ton\n\tispike\n\tdonotuse\n}\n\nVERBATIM\n#if NRNBBCORE /* running in CoreNEURON */\n\n#define IFNEWSTYLE(arg) arg\n\n#else /* running in NEURON */\n\n/*\n   1 means noiseFromRandom was called when _ran_compat was previously 0 .\n   2 means noiseFromRandom123 was called when _ran_compat was previously 0.\n*/\nstatic int _ran_compat; /* specifies the noise style for all instances */\n#define IFNEWSTYLE(arg) if(_ran_compat == 2) { arg }\n\n#endif /* running in NEURON */\nENDVERBATIM\n\n:backward compatibility\nPROCEDURE seed(x) {\nVERBATIM\n#if !NRNBBCORE\nENDVERBATIM\n\tset_seed(x)\nVERBATIM\n#endif\nENDVERBATIM\n}\n\nINITIAL {\n\n\tVERBATIM\n\t  if (_p_donotuse) {\n\t    /* only this style initializes the stream on finitialize */\n\t    IFNEWSTYLE(nrnran123_setseq((nrnran123_State*)_p_donotuse, 0, 0);)\n\t  
}\n\tENDVERBATIM\n\n\ton = 0 : off\n\tispike = 0\n\tif (noise < 0) {\n\t\tnoise = 0\n\t}\n\tif (noise > 1) {\n\t\tnoise = 1\n\t}\n\tif (start >= 0 && number > 0) {\n\t\ton = 1\n\t\t: randomize the first spike so on average it occurs at\n\t\t: start + noise*interval\n\t\tevent = start + invl(interval) - interval*(1. - noise)\n\t\t: but not earlier than 0\n\t\tif (event < 0) {\n\t\t\tevent = 0\n\t\t}\n\t\tnet_send(event, 3)\n\t}\n}\t\n\nPROCEDURE init_sequence(t(ms)) {\n\tif (number > 0) {\n\t\ton = 1\n\t\tevent = 0\n\t\tispike = 0\n\t}\n}\n\nFUNCTION invl(mean (ms)) (ms) {\n\tif (mean <= 0.) {\n\t\tmean = .01 (ms) : I would worry if it were 0.\n\t}\n\tif (noise == 0) {\n\t\tinvl = mean\n\t}else{\n\t\tinvl = (1. - noise)*mean + noise*mean*erand()\n\t}\n}\nVERBATIM\n#include \"nrnran123.h\"\n\n#if !NRNBBCORE\n/* backward compatibility */\ndouble nrn_random_pick(void* r);\nvoid* nrn_random_arg(int argpos);\nint nrn_random_isran123(void* r, uint32_t* id1, uint32_t* id2, uint32_t* id3);\nint nrn_random123_setseq(void* r, uint32_t seq, char which);\nint nrn_random123_getseq(void* r, uint32_t* seq, char* which);\n#endif\nENDVERBATIM\n\nFUNCTION erand() {\nVERBATIM\n\tif (_p_donotuse) {\n\t\t/*\n\t\t:Supports separate independent but reproducible streams for\n\t\t: each instance. However, the corresponding hoc Random\n\t\t: distribution MUST be set to Random.negexp(1)\n\t\t*/\n#if !NRNBBCORE\n\t\tif (_ran_compat == 2) {\n\t\t\t_lerand = nrnran123_negexp((nrnran123_State*)_p_donotuse);\n\t\t}else{\n\t\t\t_lerand = nrn_random_pick(_p_donotuse);\n\t\t}\n#else\n\t\t_lerand = nrnran123_negexp((nrnran123_State*)_p_donotuse);\n#endif\n\t\treturn _lerand;\n\t}else{\n#if NRNBBCORE\n\t\tassert(0);\n#else\n\t\t/*\n\t\t: the old standby. 
Cannot use if reproducible parallel sim\n\t\t: independent of nhost or which host this instance is on\n\t\t: is desired, since each instance on this cpu draws from\n\t\t: the same stream\n\t\t*/\n#endif\n\t}\n#if !NRNBBCORE\nENDVERBATIM\n\terand = exprand(1)\nVERBATIM\n#endif\nENDVERBATIM\n}\n\nPROCEDURE noiseFromRandom() {\nVERBATIM\n#if !NRNBBCORE\n {\n\tvoid** pv = (void**)(&_p_donotuse);\n\tif (_ran_compat == 2) {\n\t\tfprintf(stderr, \"NetStim.noiseFromRandom123 was previously called\\n\");\n\t\tassert(0);\n\t}\n\t_ran_compat = 1;\n\tif (ifarg(1)) {\n\t\t*pv = nrn_random_arg(1);\n\t}else{\n\t\t*pv = (void*)0;\n\t}\n }\n#endif\nENDVERBATIM\n}\n\n\nPROCEDURE noiseFromRandom123() {\nVERBATIM\n#if !NRNBBCORE\n {\n\tnrnran123_State** pv = (nrnran123_State**)(&_p_donotuse);\n\tif (_ran_compat == 1) {\n\t\tfprintf(stderr, \"NetStim.noiseFromRandom was previously called\\n\");\n\t\tassert(0);\n\t}\n\t_ran_compat = 2;\n\tif (*pv) {\n\t\tnrnran123_deletestream(*pv);\n\t\t*pv = (nrnran123_State*)0;\n\t}\n\tif (ifarg(3)) {\n\t\t*pv = nrnran123_newstream3((uint32_t)*getarg(1), (uint32_t)*getarg(2), (uint32_t)*getarg(3));\n\t}else if (ifarg(2)) {\n\t\t*pv = nrnran123_newstream((uint32_t)*getarg(1), (uint32_t)*getarg(2));\n\t}\n }\n#endif\nENDVERBATIM\n}\n\nDESTRUCTOR {\nVERBATIM\n\tif (!noise) { return; }\n\tif (_p_donotuse) {\n#if NRNBBCORE\n\t\t{ /* but note that mod2c does not translate DESTRUCTOR */\n#else\n\t\tif (_ran_compat == 2) {\n#endif\n\t\t\tnrnran123_State** pv = (nrnran123_State**)(&_p_donotuse);\n\t\t\tnrnran123_deletestream(*pv);\n\t\t\t*pv = (nrnran123_State*)0;\n\t\t}\n\t}\nENDVERBATIM\n}\n\nVERBATIM\nstatic void bbcore_write(double* x, int* d, int* xx, int *offset, _threadargsproto_) {\n\tif (!noise) { return; }\n\t/* error if using the legacy scop_exprand */\n\tif (!_p_donotuse) {\n\t\tfprintf(stderr, \"NetStim: cannot use the legacy scop_negexp generator for the random stream.\\n\");\n\t\tassert(0);\n\t}\n\tif (d) {\n\t\tchar which;\n\t\tuint32_t* di = 
((uint32_t*)d) + *offset;\n#if !NRNBBCORE\n\t\tif (_ran_compat == 1) {\n\t\t\tvoid** pv = (void**)(&_p_donotuse);\n\t\t\t/* error if not using Random123 generator */\n\t\t\tif (!nrn_random_isran123(*pv, di, di+1, di+2)) {\n\t\t\t\tfprintf(stderr, \"NetStim: Random123 generator is required\\n\");\n\t\t\t\tassert(0);\n\t\t\t}\n\t\t\tnrn_random123_getseq(*pv, di+3, &which);\n\t\t\tdi[4] = (int)which;\n\t\t}else{\n#else\n    {\n#endif\n\t\t\tnrnran123_State** pv = (nrnran123_State**)(&_p_donotuse);\n\t\t\tnrnran123_getids3(*pv, di, di+1, di+2);\n\t\t\tnrnran123_getseq(*pv, di+3, &which);\n\t\t\tdi[4] = (int)which;\n#if NRNBBCORE\n\t\t\t/* CORENeuron does not call DESTRUCTOR so... */\n\t\t\tnrnran123_deletestream(*pv);\n                        *pv = (nrnran123_State*)0;\n#endif\n\t\t}\n\t\t/*printf(\"Netstim bbcore_write %d %d %d\\n\", di[0], di[1], di[3]);*/\n\t}\n\t*offset += 5;\n}\n\nstatic void bbcore_read(double* x, int* d, int* xx, int* offset, _threadargsproto_) {\n\tif (!noise) { return; }\n\t/* Generally, CoreNEURON, in the context of psolve, begins with\n           an empty model so this call takes place in the context of a freshly\n           created instance and _p_donotuse is not NULL.\n\t   However, this function\n           is also now called from NEURON at the end of coreneuron psolve\n           in order to transfer back the nrnran123 sequence state. That\n           allows continuation with a subsequent psolve within NEURON or\n           properly transfer back to CoreNEURON if we continue the psolve\n           there. 
So now, extra logic is needed for this call to work in\n           a NEURON context.\n        */\n\n\tuint32_t* di = ((uint32_t*)d) + *offset;\n#if NRNBBCORE\n\tnrnran123_State** pv = (nrnran123_State**)(&_p_donotuse);\n\tassert(!_p_donotuse);\n\t*pv = nrnran123_newstream3(di[0], di[1], di[2]);\n\tnrnran123_setseq(*pv, di[3], (char)di[4]);\n#else\n\tuint32_t id1, id2, id3;\n\tassert(_p_donotuse);\n\tif (_ran_compat == 1) { /* Hoc Random.Random123 */\n\t\tvoid** pv = (void**)(&_p_donotuse);\n\t\tint b = nrn_random_isran123(*pv, &id1, &id2, &id3);\n\t\tassert(b);\n\t\tnrn_random123_setseq(*pv, di[3], (char)di[4]);\n\t}else{\n\t\tassert(_ran_compat == 2);\n\t\tnrnran123_State** pv = (nrnran123_State**)(&_p_donotuse);\n\t\tnrnran123_getids3(*pv, &id1, &id2, &id3);\n\t\tnrnran123_setseq(*pv, di[3], (char)di[4]);\n\t}\n        /* Random123 on NEURON side has same ids as on CoreNEURON side */\n\tassert(di[0] == id1 && di[1] == id2 && di[2] == id3);\n#endif\n\t*offset += 5;\n}\nENDVERBATIM\n\nPROCEDURE next_invl() {\n\tif (number > 0) {\n\t\tevent = invl(interval)\n\t}\n\tif (ispike >= number) {\n\t\ton = 0\n\t}\n}\n\nNET_RECEIVE (w) {\n\tif (flag == 0) { : external event\n\t\tif (w > 0 && on == 0) { : turn on spike sequence\n\t\t\t: but not if a netsend is on the queue\n\t\t\tinit_sequence(t)\n\t\t\t: randomize the first spike so on average it occurs at\n\t\t\t: noise*interval (most likely interval is always 0)\n\t\t\tnext_invl()\n\t\t\tevent = event - interval*(1. 
- noise)\n\t\t\tnet_send(event, 1)\n\t\t}else if (w < 0) { : turn off spiking definitively\n\t\t\ton = 0\n\t\t}\n\t}\n\tif (flag == 3) { : from INITIAL\n\t\tif (on == 1) { : but ignore if turned off by external event\n\t\t\tinit_sequence(t)\n\t\t\tnet_send(0, 1)\n\t\t}\n\t}\n\tif (flag == 1 && on == 1) {\n\t\tispike = ispike + 1\n\t\tnet_event(t)\n\t\tnext_invl()\n\t\tif (on == 1) {\n\t\t\tnet_send(event, 1)\n\t\t}\n\t}\n}\n\nFUNCTION bbsavestate() {\n  bbsavestate = 0\n  : limited to noiseFromRandom123\nVERBATIM\n#if !NRNBBCORE\n  if (_ran_compat == 2) {\n    nrnran123_State** pv = (nrnran123_State**)(&_p_donotuse);\n    if (!*pv) { return 0.0; }\n    char which;\n    uint32_t seq;\n    double *xdir, *xval;\n    xdir = hoc_pgetarg(1);\n    if (*xdir == -1.) { *xdir = 2; return 0.0; }\n    xval = hoc_pgetarg(2);\n    if (*xdir == 0.) {\n      nrnran123_getseq(*pv, &seq, &which);\n      xval[0] = (double)seq;\n      xval[1] = (double)which;\n    }\n    if (*xdir == 1) {\n      nrnran123_setseq(*pv, (uint32_t)xval[0], (char)xval[1]);\n    }\n  } /* else do nothing */\n#endif\nENDVERBATIM\n}\n\n\nCOMMENT\nPresynaptic spike generator\n---------------------------\n\nThis mechanism has been written to be able to use synapses in a single\nneuron receiving various types of presynaptic trains.  This is a \"fake\"\npresynaptic compartment containing a spike generator.  The trains\nof spikes can be either periodic or noisy (Poisson-distributed)\n\nParameters;\n   noise: \tbetween 0 (no noise-periodic) and 1 (fully noisy)\n   interval: \tmean time between spikes (ms)\n   number: \tnumber of spikes (independent of noise)\n\nWritten by Z. Mainen, modified by A. 
Destexhe, The Salk Institute\n\nModified by Michael Hines for use with CVode\nThe intrinsic bursting parameters have been removed since\ngenerators can stimulate other generators to create complicated bursting\npatterns with independent statistics (see below)\n\nModified by Michael Hines to use logical event style with NET_RECEIVE\nThis stimulator can also be triggered by an input event.\nIf the stimulator is in the on==0 state (no net_send events on queue)\n and receives a positive weight\nevent, then the stimulator changes to the on==1 state and goes through\nits entire spike sequence before changing to the on==0 state. During\nthat time it ignores any positive weight events. If, in an on!=0 state,\nthe stimulator receives a negative weight event, the stimulator will\nchange to the on==0 state. In the on==0 state, it will ignore any arriving\nnet_send events. A change to the on==1 state immediately fires the first spike of\nits sequence.\n\nENDCOMMENT\n\n"
  },
  {
    "path": "coreneuron/mechanism/mech/modfile/passive.mod",
    "content": "TITLE passive membrane channel\n\nUNITS {\n\t(mV) = (millivolt)\n\t(mA) = (milliamp)\n\t(S) = (siemens)\n}\n\nNEURON {\n\tSUFFIX pas\n\tNONSPECIFIC_CURRENT i\n\tRANGE g, e\n}\n\nPARAMETER {\n\tg = .001\t(S/cm2)\t<0,1e9>\n\te = -70\t(mV)\n}\n\nASSIGNED {v (mV)  i (mA/cm2)}\n\nBREAKPOINT {\n\ti = g*(v - e)\n}\n"
  },
  {
    "path": "coreneuron/mechanism/mech/modfile/pattern.mod",
    "content": ": The spikeout pairs (t, gid) resulting from a parallel network simulation\n: can become the stimulus for any single cpu subnet as long as the gid's are\n: consistent.\n: Note: hoc must retain references to the tvec and gidvec vectors\n: to prevent the Info from going out of existence\n\nNEURON {\n\tARTIFICIAL_CELL PatternStim\n\tRANGE fake_output\n\tTHREADSAFE\n\tBBCOREPOINTER ptr\n}\n\nPARAMETER {\n\tfake_output = 0\n}\n\nASSIGNED {\n\tptr\n}\n\nINITIAL {\n\tif (initps() > 0) { net_send(0, 1) }\n}\n\nNET_RECEIVE (w) {LOCAL nst\n\tif (flag == 1) {\n\t\tnst = sendgroup()\n\t\tif (nst >= t) {net_send(nst - t, 1)}\n\t}\n}\n\nVERBATIM\n\nstruct Info {\n\tint size;\n\tdouble* tvec;\n\tint* gidvec;\n\tint index;\n};\n\n#define INFOCAST Info** ip = (Info**)(&(_p_ptr))\n\nENDVERBATIM\n\n\nVERBATIM\nInfo* mkinfo(_threadargsproto_) {\n\tINFOCAST;\n\tInfo* info = (Info*)hoc_Emalloc(sizeof(Info)); hoc_malchk();\n\tinfo->size = 0;\n\tinfo->tvec = nullptr;\n\tinfo->gidvec = nullptr;\n\tinfo->index = 0;\n\treturn info;\n}\n/* for CoreNEURON checkpoint save and restore */\nnamespace coreneuron {\nint checkpoint_save_patternstim(_threadargsproto_) {\n\tINFOCAST; Info* info = *ip;\n\treturn info->index;\n}\nvoid checkpoint_restore_patternstim(int _index, double _te, _threadargsproto_) {\n    INFOCAST; Info* info = *ip;\n    info->index = _index;\n    artcell_net_send(_tqitem, -1, (Point_process*)_nt->_vdata[_ppvar[1*_STRIDE]], _te, 1.0);\n}\n} //namespace coreneuron\nENDVERBATIM\n\nFUNCTION initps() {\nVERBATIM {\n\tINFOCAST; Info* info = *ip;\n\tinfo->index = 0;\n\tif (info && info->tvec) {\n\t\t_linitps = 1.;\n\t}else{\n\t\t_linitps = 0.;\n\t}\n}\nENDVERBATIM\n}\n\nFUNCTION sendgroup() {\nVERBATIM {\n\tINFOCAST; Info* info = *ip;\n\tint size = info->size;\n\tint fake_out;\n\tdouble* tvec = info->tvec;\n\tint* gidvec = info->gidvec;\n\tint i;\n\tfake_out = fake_output ? 
1 : 0;\n\tfor (i=0; info->index < size; ++i) {\n\t\t/* only if the gid is NOT on this machine */\n\t\tnrn_fake_fire(gidvec[info->index], tvec[info->index], fake_out);\n\t\t++info->index;\n\t\t/* stop after a batch of events once the next spike lies in the future;\n\t\t   check the index first so tvec is never read past its end */\n\t\tif (i > 100 && info->index < size && t < tvec[info->index]) { break; }\n\t}\n\tif (info->index >= size) {\n\t\t_lsendgroup = t - 1.;\n\t}else{\n\t\t_lsendgroup = tvec[info->index];\n\t}\n}\nENDVERBATIM\n}\n\nVERBATIM\nstatic void bbcore_write(double* x, int* d, int* xx, int *offset, _threadargsproto_){}\nstatic void bbcore_read(double* x, int* d, int* xx, int* offset, _threadargsproto_){}\nnamespace coreneuron {\nvoid pattern_stim_setup_helper(int size, double* tv, int* gv, _threadargsproto_) {\n\tINFOCAST;\n\tInfo* info = mkinfo(_threadargs_);\n\t*ip = info;\n\tinfo->size = size;\n\tinfo->tvec = tv;\n\tinfo->gidvec = gv;\n    // initiate event chain (needed in case of restore)\n\tartcell_net_send ( _tqitem, -1, (Point_process*) _nt->_vdata[_ppvar[1*_STRIDE]], t +  0.0 , 1.0 ) ;\n}\n\nInfo** pattern_stim_info_ref(_threadargsproto_) {\n    // Info shared with NEURON.\n    // So nrn <-> corenrn needs no actual transfer for direct mode psolve.\n    INFOCAST;\n    return ip; // Caller sets *ip to NEURON's PatternStim Info*\n}\n\n} // namespace coreneuron\nENDVERBATIM\n\n"
  },
  {
    "path": "coreneuron/mechanism/mech/modfile/stim.mod",
    "content": "COMMENT\nSince this is an electrode current, positive values of i depolarize the cell\nand in the presence of the extracellular mechanism there will be a change\nin vext since i is not a transmembrane current but a current injected\ndirectly to the inside of the cell.\nENDCOMMENT\n\nNEURON {\n\tPOINT_PROCESS IClamp\n\tRANGE del, dur, amp, i\n\tELECTRODE_CURRENT i\n}\nUNITS {\n\t(nA) = (nanoamp)\n}\n\nPARAMETER {\n\tdel (ms)\n\tdur (ms)\t<0,1e9>\n\tamp (nA)\n}\nASSIGNED { i (nA) }\n\nINITIAL {\n\ti = 0\n}\n\nBREAKPOINT {\n    : for fixed step methos, we can ignore at_time, was introduced for variable timestep, will be deprecated anyway. \n\t: at_time(del)\n\t: at_time(del+dur)\n\n\tif (t < del + dur && t >= del) {\n\t\ti = amp\n\t}else{\n\t\ti = 0\n\t}\n}\n"
  },
  {
    "path": "coreneuron/mechanism/mech/modfile/svclmp.mod",
    "content": "TITLE svclmp.mod\nCOMMENT\nSingle electrode Voltage clamp with three levels.\nClamp is on at time 0, and off at time\ndur1+dur2+dur3. When clamp is off the injected current is 0.\nThe clamp levels are amp1, amp2, amp3.\ni is the injected current, vc measures the control voltage)\nDo not insert several instances of this model at the same location in order to\nmake level changes. That is equivalent to independent clamps and they will\nhave incompatible internal state values.\nThe electrical circuit for the clamp is exceedingly simple:\nvc ---'\\/\\/`--- cell\n        rs\n\nNote that since this is an electrode current model v refers to the\ninternal potential which is equivalent to the membrane potential v when\nthere is no extracellular membrane mechanism present but is v+vext when\none is present.\nAlso since i is an electrode current,\npositive values of i depolarize the cell. (Normally, positive membrane currents\nare outward and thus hyperpolarize the cell)\nENDCOMMENT\n\nINDEPENDENT {t FROM 0 TO 1 WITH 1 (ms)}\n\nDEFINE NSTEP 3\n\nNEURON {\n\tPOINT_PROCESS SEClamp\n\tELECTRODE_CURRENT i\n\tRANGE dur1, amp1, dur2, amp2, dur3, amp3, rs, vc, i\n}\n\nUNITS {\n\t(nA) = (nanoamp)\n\t(mV) = (millivolt)\n\t(uS) = (microsiemens)\n}\n\n\nPARAMETER {\n\trs = 1 (megohm) <1e-9, 1e9>\n\tdur1 (ms) \t  amp1 (mV)\n\tdur2 (ms) <0,1e9> amp2 (mV)\n\tdur3 (ms) <0,1e9> amp3 (mV)\n}\n\nASSIGNED {\n\tv (mV)\t: automatically v + vext when extracellular is present\n\ti (nA)\n\tvc (mV)\n\ttc2 (ms)\n\ttc3 (ms)\n\ton\n}\n\nINITIAL {\n\ttc2 = dur1 + dur2\n\ttc3 = tc2 + dur3\n\ton = 0\n}\n\nBREAKPOINT {\n\tSOLVE icur METHOD after_cvode\n\tvstim()\n}\n\nPROCEDURE icur() {\n\tif (on) {\n\t\ti = (vc - v)/rs\n\t}else{\n\t\ti = 0\n\t}\n}\n\nCOMMENT\nThe SOLVE of icur() in the BREAKPOINT block is necessary to compute\ni=(vc - v(t))/rs instead of i=(vc - v(t-dt))/rs\nThis is important for time varying vc because the actual i used in\nthe implicit method is equivalent to (vc - v(t)/rs 
due to the\ncalculation of di/dv from the BREAKPOINT block.\nThe reason this works is because the SOLVE statement in the BREAKPOINT block\nis executed after the membrane potential is advanced.\n\nIt is a shame that vstim has to be called twice but putting the call\nin a SOLVE block would cause playing a Vector into vc to be off by one\ntime step.\nENDCOMMENT\n\nPROCEDURE vstim() {\n\ton = 1\n\tif (dur1) {at_time(dur1)}\n\tif (dur2) {at_time(tc2)}\n\tif (dur3) {at_time(tc3)}\n\tif (t < dur1) {\n\t\tvc = amp1\n\t}else if (t < tc2) {\n\t\tvc = amp2\n\t}else if (t < tc3) {\n\t\tvc = amp3\n\t}else {\n\t\tvc = 0\n\t\ton = 0\n\t}\n\ticur()\n}\n\n"
  },
  {
    "path": "coreneuron/mechanism/mech_mapping.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <cstring>\n#include <cstdlib>\n#include <iostream>\n#include <map>\n\n#include \"coreneuron/mechanism/mech_mapping.hpp\"\n#include \"coreneuron/mechanism/mechanism.hpp\"\n#include \"coreneuron/permute/data_layout.hpp\"\n\nnamespace coreneuron {\nusing Offset = size_t;\nusing MechId = int;\nusing VariableName = const char*;\n\nstruct cmp_str {\n    bool operator()(char const* a, char const* b) const {\n        return std::strcmp(a, b) < 0;\n    }\n};\n\n/*\n * Structure that map variable names of mechanisms to their value's location (offset) in memory\n */\nusing MechNamesMapping = std::map<MechId, std::map<VariableName, Offset, cmp_str>>;\nstatic MechNamesMapping mechNamesMapping;\n\nstatic void set_an_offset(int mech_id, const char* variable_name, int offset) {\n    mechNamesMapping[mech_id][variable_name] = offset;\n}\n\ndouble* get_var_location_from_var_name(int mech_id,\n                                       const char* variable_name,\n                                       Memb_list* ml,\n                                       int node_index) {\n    if (mechNamesMapping.find(mech_id) == mechNamesMapping.end()) {\n        std::cerr << \"ERROR : no variable name mapping exist for mechanism id: \" << mech_id\n                  << std::endl;\n        abort();\n    }\n    if (mechNamesMapping.at(mech_id).find(variable_name) == mechNamesMapping.at(mech_id).end()) {\n        std::cerr << \"ERROR : no value associtated to variable name: \" << variable_name\n                  << std::endl;\n        abort();\n    }\n    int variable_rank = mechNamesMapping.at(mech_id).at(variable_name);\n    int ix = get_data_index(node_index, variable_rank, mech_id, ml);\n    return 
&(ml->data[ix]);\n}\n\nvoid register_all_variables_offsets(int mech_id, SerializedNames variable_names) {\n    int idx = 0;\n    int nb_parsed_variables = 0;\n    int current_categorie = 1;\n    while (current_categorie < NB_MECH_VAR_CATEGORIES) {\n        if (variable_names[idx]) {\n            set_an_offset(mech_id, variable_names[idx], nb_parsed_variables);\n            nb_parsed_variables++;\n        } else {\n            current_categorie++;\n        }\n        idx++;\n    }\n    idx++;\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mechanism/mech_mapping.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n/*\n * todo : currently mod2c has exactly 4 different variable categories\n * that are registered to coreneuron.\n */\n#define NB_MECH_VAR_CATEGORIES 4\n\n/*\n * SerializedNames\n *\n * names are passed serialized using the following format:\n * SerializedNames : {\"0\",[[<CategorieNames>,]*0,]* [[<CategorieNames>,]* 0]}\n * All categories must be filled, if they are emtpy, just an other 0 follow.\n *\n * ex: {\"0\", \"name1\", \"name2\", 0, \"name3, \"name4\", 0,0,0}\n *     This means the first categorie with names {name1,name2},\n *     the second categorie with {name3, name4}, 2 last categories are empty\n */\nnamespace coreneuron {\nstruct Memb_list;\n\nusing SerializedNames = const char**;\n\n// return pointer to value of a variable's mechanism, or nullptr if not found\nextern double* get_var_location_from_var_name(int mech_id,\n                                              const char* variable_name,\n                                              Memb_list* ml,\n                                              int local_index);\n\n// initialize mapping of variable names of mechanism, to their places in memory\nextern void register_all_variables_offsets(int mech_id, SerializedNames variable_names);\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mechanism/mechanism.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include <string.h>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/utils/memory.h\"\n\nnamespace coreneuron {\n// OpenACC with PGI compiler has issue when union is used and hence use struct\n// \\todo check if newer PGI versions has resolved this issue\n#if defined(_OPENACC)\nstruct ThreadDatum {\n    int i;\n    double* pval;\n    void* _pvoid;\n};\n#else\nunion ThreadDatum {\n    double val;\n    int i;\n    double* pval;\n    void* _pvoid;\n};\n#endif\n\n/* will go away at some point */\nstruct Point_process {\n    int _i_instance;\n    short _type;\n    short _tid; /* NrnThread id */\n};\n\nstruct NetReceiveBuffer_t {\n    int* _displ;     /* _displ_cnt + 1 of these */\n    int* _nrb_index; /* _cnt of these (order of increasing _pnt_index) */\n\n    int* _pnt_index;\n    int* _weight_index;\n    double* _nrb_t;\n    double* _nrb_flag;\n    int _cnt;\n    int _displ_cnt; /* number of unique _pnt_index */\n    int _size;      /* capacity */\n    int _pnt_offset;\n    size_t size_of_object() {\n        size_t nbytes = 0;\n        nbytes += _size * sizeof(int) * 3;\n        nbytes += (_size + 1) * sizeof(int);\n        nbytes += _size * sizeof(double) * 2;\n        return nbytes;\n    }\n};\n\nstruct NetSendBuffer_t: MemoryManaged {\n    int* _sendtype;  // net_send, net_event, net_move\n    int* _vdata_index;\n    int* _pnt_index;\n    int* _weight_index;\n    double* _nsb_t;\n    double* _nsb_flag;\n    int _cnt;\n    int _size;       /* capacity */\n    int reallocated; /* if buffer resized/reallocated, needs to be copy to cpu */\n\n    NetSendBuffer_t(int size)\n        : _size(size) {\n        _cnt = 0;\n\n        _sendtype = (int*) 
ecalloc_align(_size, sizeof(int));\n        _vdata_index = (int*) ecalloc_align(_size, sizeof(int));\n        _pnt_index = (int*) ecalloc_align(_size, sizeof(int));\n        _weight_index = (int*) ecalloc_align(_size, sizeof(int));\n        // when == 1, NetReceiveBuffer_t is newly allocated (i.e. we need to free previous copy\n        // and recopy new data\n        reallocated = 1;\n        _nsb_t = (double*) ecalloc_align(_size, sizeof(double));\n        _nsb_flag = (double*) ecalloc_align(_size, sizeof(double));\n    }\n\n    size_t size_of_object() {\n        size_t nbytes = 0;\n        nbytes += _size * sizeof(int) * 4;\n        nbytes += _size * sizeof(double) * 2;\n        return nbytes;\n    }\n\n    ~NetSendBuffer_t() {\n        free_memory(_sendtype);\n        free_memory(_vdata_index);\n        free_memory(_pnt_index);\n        free_memory(_weight_index);\n        free_memory(_nsb_t);\n        free_memory(_nsb_flag);\n    }\n\n    void grow() {\n#ifdef CORENEURON_ENABLE_GPU\n        int cannot_reallocate_on_device = 0;\n        assert(cannot_reallocate_on_device);\n#else\n        int new_size = _size * 2;\n        grow_buf(&_sendtype, _size, new_size);\n        grow_buf(&_vdata_index, _size, new_size);\n        grow_buf(&_pnt_index, _size, new_size);\n        grow_buf(&_weight_index, _size, new_size);\n        grow_buf(&_nsb_t, _size, new_size);\n        grow_buf(&_nsb_flag, _size, new_size);\n        _size = new_size;\n#endif\n    }\n\n  private:\n    template <typename T>\n    void grow_buf(T** buf, int size, int new_size) {\n        T* new_buf = nullptr;\n        new_buf = (T*) ecalloc_align(new_size, sizeof(T));\n        memcpy(new_buf, *buf, size * sizeof(T));\n        free(*buf);\n        *buf = new_buf;\n    }\n};\n\nstruct Memb_list {\n    /* nodeindices contains all nodes this extension is responsible for,\n     * ordered according to the matrix. 
This allows the matrix to be accessed\n     * directly via the nrn_actual_* arrays instead of accessing it in the\n     * order of insertion and via the node-structure, making it more\n     * cache-efficient */\n    int* nodeindices = nullptr;\n    int* _permute = nullptr;\n    double* data = nullptr;\n    Datum* pdata = nullptr;\n    ThreadDatum* _thread = nullptr; /* thread specific data (when static is no good) */\n    NetReceiveBuffer_t* _net_receive_buffer = nullptr;\n    NetSendBuffer_t* _net_send_buffer = nullptr;\n    int nodecount; /* actual node count */\n    int _nodecount_padded;\n    void* instance{nullptr}; /* mechanism instance struct */\n    // nrn_acc_manager.cpp handles data movement to/from the accelerator as the\n    // \"private constructor\" in the translated MOD file code is called before\n    // the main nrn_acc_manager methods that copy thread/mechanism data to the\n    // device\n    void* global_variables{nullptr};\n    std::size_t global_variables_size{};\n};\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mechanism/membfunc.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#pragma once\n\n#include \"coreneuron/mechanism/mechanism.hpp\"\n#include \"coreneuron/utils/offload.hpp\"\n#include \"coreneuron/utils/units.hpp\"\n\n#include <cmath>\n#include <vector>\n\nnamespace coreneuron {\n\nusing Pfrpdat = Datum* (*) (void);\n\nstruct NrnThread;\n\nusing mod_alloc_t = void (*)(double*, Datum*, int);\nusing mod_f_t = void (*)(NrnThread*, Memb_list*, int);\nusing pnt_receive_t = void (*)(Point_process*, int, double);\nusing thread_table_check_t =\n    void (*)(int, int, double*, Datum*, ThreadDatum*, NrnThread*, Memb_list*, int);\n\n/*\n * Memb_func structure contains all related informations of a mechanism\n */\nstruct Memb_func {\n    mod_alloc_t alloc;\n    mod_f_t current;\n    mod_f_t jacob;\n    mod_f_t state;\n    mod_f_t initialize;\n    mod_f_t constructor;\n    mod_f_t destructor; /* only for point processes */\n    // These are used for CoreNEURON-internal allocation/cleanup; they are kept\n    // separate from the CONSTRUCTOR/DESTRUCTOR functions just above (one of\n    // which is apparently only for point processes) for simplicity.\n    mod_f_t private_constructor;\n    mod_f_t private_destructor;\n    Symbol* sym;\n    int vectorized;\n    int thread_size_;                       /* how many Datum needed in Memb_list if vectorized */\n    void (*thread_mem_init_)(ThreadDatum*); /* after Memb_list._thread is allocated */\n    void (*thread_cleanup_)(ThreadDatum*);  /* before Memb_list._thread is freed */\n    thread_table_check_t thread_table_check_;\n    int is_point;\n    void (*setdata_)(double*, Datum*);\n    int* dparam_semantics; /* for nrncore writing. 
*/\n    ~Memb_func();\n};\n\n#define VINDEX       -1\n#define CABLESECTION 1\n#define MORPHOLOGY   2\n#define CAP          3\n#define EXTRACELL    5\n\n#define nrnocCONST 1\n#define DEP        2\n#define STATE      3 /*See init.c and cabvars.h for order of nrnocCONST, DEP, and STATE */\n\n#define BEFORE_INITIAL    0\n#define AFTER_INITIAL     1\n#define BEFORE_BREAKPOINT 2\n#define AFTER_SOLVE       3\n#define BEFORE_STEP       4\n#define BEFORE_AFTER_SIZE 5 /* 1 more than the previous */\nstruct BAMech {\n    mod_f_t f;\n    int type;\n    struct BAMech* next;\n};\n\nextern int nrn_ion_global_map_size;\nextern double** nrn_ion_global_map;\nextern const int ion_global_map_member_size;\n\n#define NRNPOINTER                                                            \\\n    4 /* added on to list of mechanism variables.These are                    \\\npointers which connect variables  from other mechanisms via the _ppval array. \\\n*/\n\n#define _AMBIGUOUS 5\n\n\nextern int nrn_get_mechtype(const char*);\nextern const char* nrn_get_mechname(int);  // slow. 
use memb_func[i].sym if posible\nextern int register_mech(const char** m,\n                         mod_alloc_t alloc,\n                         mod_f_t cur,\n                         mod_f_t jacob,\n                         mod_f_t stat,\n                         mod_f_t initialize,\n                         mod_f_t private_constructor,\n                         mod_f_t private_destructor,\n                         int nrnpointerindex,\n                         int vectorized);\nextern int point_register_mech(const char**,\n                               mod_alloc_t alloc,\n                               mod_f_t cur,\n                               mod_f_t jacob,\n                               mod_f_t stat,\n                               mod_f_t initialize,\n                               mod_f_t private_constructor,\n                               mod_f_t private_destructor,\n                               int nrnpointerindex,\n                               mod_f_t constructor,\n                               mod_f_t destructor,\n                               int vectorized);\nextern void register_constructor(mod_f_t constructor);\nusing NetBufReceive_t = void (*)(NrnThread*);\nextern void hoc_register_net_receive_buffering(NetBufReceive_t, int);\n\nextern void hoc_register_net_send_buffering(int);\n\nusing nrn_watch_check_t = void (*)(NrnThread*, Memb_list*);\nextern void hoc_register_watch_check(nrn_watch_check_t, int);\n\nextern void nrn_jacob_capacitance(NrnThread*, Memb_list*, int);\nextern void nrn_writes_conc(int, int);\nconstexpr double ktf(double celsius) {\n    return 1000. * units::gasconstant * (celsius + 273.15) / units::faraday;\n}\n// std::log isn't constexpr, but there are argument values for which nrn_nernst\n// is a constant expression\nconstexpr double nrn_nernst(double ci, double co, double z, double celsius) {\n    if (z == 0) {\n        return 0.;\n    }\n    if (ci <= 0.) {\n        return 1e6;\n    } else if (co <= 0.) 
{\n        return -1e6;\n    } else {\n        return ktf(celsius) / z * std::log(co / ci);\n    }\n}\nconstexpr void nrn_wrote_conc(int type,\n                              double* p1,\n                              int p2,\n                              int it,\n                              double** gimap,\n                              double celsius,\n                              int _cntml_padded) {\n    if (it & 040) {\n        constexpr int _iml = 0;\n        int const STRIDE{_cntml_padded + _iml};\n        /* passing _nt to this function causes cray compiler to segfault during compilation\n         * hence passing _cntml_padded\n         */\n        double* pe = p1 - p2 * STRIDE;\n        pe[0] = nrn_nernst(pe[1 * STRIDE], pe[2 * STRIDE], gimap[type][2], celsius);\n    }\n}\ninline double nrn_ghk(double v, double ci, double co, double z, double celsius) {\n    auto const efun = [](double x) {\n        if (std::abs(x) < 1e-4) {\n            return 1. - x / 2.;\n        } else {\n            return x / (std::exp(x) - 1.);\n        }\n    };\n    double const temp{z * v / ktf(celsius)};\n    double const eco{co * efun(+temp)};\n    double const eci{ci * efun(-temp)};\n    return .001 * z * units::faraday * (eci - eco);\n}\nextern void hoc_register_prop_size(int, int, int);\nextern void hoc_register_dparam_semantics(int type, int, const char* name);\nextern void hoc_reg_ba(int, mod_f_t, int);\n\nstruct DoubScal {\n    const char* name;\n    double* pdoub;\n};\nstruct DoubVec {\n    const char* name;\n    double* pdoub;\n    int index1;\n};\nstruct VoidFunc {\n    const char* name;\n    void (*func)(void);\n};\nextern void hoc_register_var(DoubScal*, DoubVec*, VoidFunc*);\n\nextern void _nrn_layout_reg(int, int);\nextern void _nrn_thread_reg0(int i, void (*f)(ThreadDatum*));\nextern void _nrn_thread_reg1(int i, void (*f)(ThreadDatum*));\n\nusing bbcore_read_t = void (*)(double*,\n                               int*,\n                               int*,\n      
                         int*,\n                               int,\n                               int,\n                               double*,\n                               Datum*,\n                               ThreadDatum*,\n                               NrnThread*,\n                               Memb_list*,\n                               double);\n\nusing bbcore_write_t = void (*)(double*,\n                                int*,\n                                int*,\n                                int*,\n                                int,\n                                int,\n                                double*,\n                                Datum*,\n                                ThreadDatum*,\n                                NrnThread*,\n                                Memb_list*,\n                                double);\n\nextern int nrn_mech_depend(int type, int* dependencies);\nextern int nrn_fornetcon_cnt_;\nextern int* nrn_fornetcon_type_;\nextern int* nrn_fornetcon_index_;\nextern void add_nrn_fornetcons(int, int);\nextern void add_nrn_has_net_event(int);\nextern void net_event(Point_process*, double);\nextern void net_send(void**, int, Point_process*, double, double);\nextern void net_move(void**, Point_process*, double);\nextern void artcell_net_send(void**, int, Point_process*, double, double);\nextern void artcell_net_move(void**, Point_process*, double);\nextern void nrn2ncs_outputevent(int netcon_output_index, double firetime);\nextern bool nrn_use_localgid_;\nextern void net_sem_from_gpu(int sendtype, int i_vdata, int, int ith, int ipnt, double, double);\n\n// _OPENACC and/or NET_RECEIVE_BUFFERING\nextern void net_sem_from_gpu(int, int, int, int, int, double, double);\n\nextern void hoc_malchk(void); /* just a stub */\nextern void* hoc_Emalloc(size_t);\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mechanism/patternstim.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n// Want to have the classical NEURON PatternStim functionality available\n// in coreneuron to allow debugging and trajectory verification on\n// desktop single process tests.  Since pattern.mod provides most of what\n// we need even in the coreneuron context, we placed a minimally modified\n// version of that in coreneuron/mechanism/mech/modfile/pattern.mod and this file\n// provides an interface that creates an instance of the\n// PatternStim ARTIFICIAL_CELL in thread 0 and attaches the spike raster\n// data to it.\n\n#include <algorithm>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/io/output_spikes.hpp\"\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n#include \"coreneuron/coreneuron.hpp\"\n\nnamespace coreneuron {\n// from translated patstim.mod\nvoid _pattern_reg(void);\n// from patstim.mod\nextern void pattern_stim_setup_helper(int size,\n                                      double* tvec,\n                                      int* gidvec,\n                                      int icnt,\n                                      int cnt,\n                                      double* _p,\n                                      Datum* _ppvar,\n                                      ThreadDatum* _thread,\n                                      NrnThread* _nt,\n                                      Memb_list* ml,\n                                      double v);\n\nstatic size_t read_raster_file(const char* fname, double** tvec, int** gidvec, double tstop);\n\nint nrn_extra_thread0_vdata;\n\nvoid nrn_set_extra_thread0_vdata() {\n    // limited 
to PatternStim for now.\n    // if called, must be called before nrn_setup and after mk_mech.\n    int type = nrn_get_mechtype(\"PatternStim\");\n    if (!corenrn.get_memb_func(type).initialize) {\n        _pattern_reg();\n    }\n    nrn_extra_thread0_vdata = corenrn.get_prop_dparam_size()[type];\n}\n\n// fname is the filename of an output_spikes.h format raster file.\n// todo : add function for memory cleanup (to be called at the end of simulation)\nvoid nrn_mkPatternStim(const char* fname, double tstop) {\n    int type = nrn_get_mechtype(\"PatternStim\");\n    if (!corenrn.get_memb_func(type).sym) {\n        printf(\"nrn_set_extra_thread0_vdata must be called (after mk_mech and before nrn_setup)\\n\");\n        assert(0);\n    }\n\n    // if thread 0 is missing or has no cells, PatternStim is not needed, so return\n    if (nrn_threads == nullptr || nrn_threads->ncell == 0) {\n        return;\n    }\n\n    double* tvec;\n    int* gidvec;\n\n    // todo : handle when spike raster will be very large (int < size_t)\n    size_t size = read_raster_file(fname, &tvec, &gidvec, tstop);\n\n    Point_process* pnt = nrn_artcell_instantiate(\"PatternStim\");\n    NrnThread* nt = nrn_threads + pnt->_tid;\n\n    Memb_list* ml = nt->_ml_list[type];\n    int layout = corenrn.get_mech_data_layout()[type];\n    int sz = corenrn.get_prop_param_size()[type];\n    int psz = corenrn.get_prop_dparam_size()[type];\n    int _cntml = ml->nodecount;\n    int _iml = pnt->_i_instance;\n    double* _p = ml->data;\n    Datum* _ppvar = ml->pdata;\n    if (layout == Layout::AoS) {\n        _p += _iml * sz;\n        _ppvar += _iml * psz;\n    } else if (layout == Layout::SoA) {\n        ;\n    } else {\n        assert(0);\n    }\n    pattern_stim_setup_helper(size, tvec, gidvec, _iml, _cntml, _p, _ppvar, nullptr, nt, ml, 0.0);\n}\n\nsize_t read_raster_file(const char* fname, double** tvec, int** gidvec, double tstop) {\n    FILE* f = fopen(fname, \"r\");\n    nrn_assert(f);\n\n    // skip first line containing 
\"scatter\" string\n    char dummy[100];\n    nrn_assert(fgets(dummy, 100, f));\n\n    std::vector<std::pair<double, int>> spikes;\n    spikes.reserve(10000);\n\n    double stime;\n    int gid;\n\n    while (fscanf(f, \"%lf %d\\n\", &stime, &gid) == 2) {\n        if (stime >= t && stime <= tstop) {\n            spikes.push_back(std::make_pair(stime, gid));\n        }\n    }\n\n    fclose(f);\n\n    // pattern.mod expects sorted spike raster (this is to avoid\n    // injecting all events at the begining of the simulation).\n    // sort spikes according to time\n    std::sort(spikes.begin(), spikes.end());\n\n    // fill gid and time vectors\n    *tvec = (double*) emalloc(spikes.size() * sizeof(double));\n    *gidvec = (int*) emalloc(spikes.size() * sizeof(int));\n\n    for (size_t i = 0; i < spikes.size(); i++) {\n        (*tvec)[i] = spikes[i].first;\n        (*gidvec)[i] = spikes[i].second;\n    }\n\n    return spikes.size();\n}\n\n// see nrn_setup.cpp:read_phase2 for how it creates NrnThreadMembList instances.\nstatic NrnThreadMembList* alloc_nrn_thread_memb(NrnThread* nt, int type) {\n    NrnThreadMembList* tml = (NrnThreadMembList*) ecalloc(1, sizeof(NrnThreadMembList));\n    tml->dependencies = nullptr;\n    tml->ndependencies = 0;\n    tml->index = type;\n    tml->next = nullptr;\n\n    // fill in tml->ml info. 
The data is not in the cache efficient\n    // NrnThread arrays but there should not be many of these instances.\n    int psize = corenrn.get_prop_param_size()[type];\n    int dsize = corenrn.get_prop_dparam_size()[type];\n    int layout = corenrn.get_mech_data_layout()[type];\n    tml->ml = (Memb_list*) ecalloc(1, sizeof(Memb_list));\n    tml->ml->nodecount = 1;\n    tml->ml->_nodecount_padded = tml->ml->nodecount;\n    tml->ml->nodeindices = nullptr;\n    tml->ml->data = (double*) ecalloc(tml->ml->nodecount * psize, sizeof(double));\n    tml->ml->pdata = (Datum*) ecalloc(nrn_soa_padded_size(tml->ml->nodecount, layout) * dsize,\n                                      sizeof(Datum));\n    tml->ml->_thread = nullptr;\n    tml->ml->_net_receive_buffer = nullptr;\n    tml->ml->_net_send_buffer = nullptr;\n    tml->ml->_permute = nullptr;\n\n    if (auto* const priv_ctor = corenrn.get_memb_func(tml->index).private_constructor) {\n        priv_ctor(nt, tml->ml, tml->index);\n    }\n\n    return tml;\n}\n\n// Opportunistically implemented to create a single PatternStim.\n// So only does enough to get that functionally incorporated into the model\n// and other types may require additional work. 
In particular, we\n// append a new NrnThreadMembList with one item to the thread 0 tml list\n// in order for the artificial cell to get its INITIAL block called but\n// we do not modify any of the other thread 0 data arrays or counts.\n\nPoint_process* nrn_artcell_instantiate(const char* mechname) {\n    int type = nrn_get_mechtype(mechname);\n    NrnThread* nt = nrn_threads + 0;\n\n    // printf(\"nrn_artcell_instantiate %s type=%d\\n\", mechname, type);\n\n    // create and append to nt.tml\n    auto tml = alloc_nrn_thread_memb(nt, type);\n\n    assert(nt->_ml_list[type] == nullptr);  // FIXME\n    nt->_ml_list[type] = tml->ml;\n\n    if (!nt->tml) {\n        nt->tml = tml;\n    } else {\n        for (NrnThreadMembList* i = nt->tml; i; i = i->next) {\n            if (!i->next) {\n                i->next = tml;\n                break;\n            }\n        }\n    }\n\n    // Here we have a problem with no easy general solution. ml->pdata are\n    // integer indexes into the nt->_data, nt->_idata, or nt->_vdata arrays,\n    // depending on context, but nrn_setup.cpp allocated these to have exactly\n    // the size needed by the file-defined model (at least for _vdata), so there\n    // are no slots for pdata to index into for this new instance.\n    // So nrn_setup.cpp:phase2 needs to\n    // be notified that some extra space will be required. For now, defer\n    // the general situation of several instances for several types and\n    // demand that this method is never called more than once. 
We introduce\n    // an int nrn_extra_thread0_vdata (only that is needed by PatternStim)\n    // which will be used by\n    // nrn_setup.cpp:phase2 to allocate the appropriately larger\n    // _vdata arrays for thread 0 (without changing _nvdata so\n    // that we can fill in the indices here)\n    static int cnt = 0;\n    if (++cnt > 1) {\n        printf(\"nrn_artcell_instantiate cannot be called more than once\\n\");\n        assert(0);\n    }\n    // note that PatternStim internal usage for the 4 ppvar values is:\n    // #define _nd_area  _nt->_data[_ppvar[0]]  (not used since ARTIFICIAL_CELL)\n    // #define _p_ptr  _nt->_vdata[_ppvar[2]] (the BBCORE_POINTER)\n    // #define _tqitem &(_nt->_vdata[_ppvar[3]]) (for net_send)\n    // and general external usage is:\n    // _nt->_vdata[_ppvar[1]] = Point_process*\n    //\n\n    Point_process* pnt = new Point_process;\n    pnt->_type = type;\n    pnt->_tid = nt->id;\n    pnt->_i_instance = 0;\n    // as though all dparam index into _vdata\n    int dsize = corenrn.get_prop_dparam_size()[type];\n    assert(dsize <= nrn_extra_thread0_vdata);\n    for (int i = 0; i < dsize; ++i) {\n        tml->ml->pdata[i] = nt->_nvdata + i;\n    }\n    nt->_vdata[nt->_nvdata + 1] = (void*) pnt;\n\n    return pnt;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mechanism/register_mech.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <cstring>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/membrane_definitions.h\"\n#include \"coreneuron/mechanism/eion.hpp\"\n#include \"coreneuron/mechanism/mech_mapping.hpp\"\n#include \"coreneuron/mechanism/membfunc.hpp\"\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n\nnamespace coreneuron {\nint secondorder = 0;\ndouble t, dt, celsius, pi;\nint rev_dt;\n\nusing Pfrv = void (*)();\n\nstatic void ion_write_depend(int type, int etype);\n\nvoid hoc_reg_bbcore_read(int type, bbcore_read_t f) {\n    if (type == -1) {\n        return;\n    }\n    corenrn.get_bbcore_read()[type] = f;\n}\nvoid hoc_reg_bbcore_write(int type, bbcore_write_t f) {\n    if (type == -1) {\n        return;\n    }\n    corenrn.get_bbcore_write()[type] = f;\n}\n\nvoid add_nrn_has_net_event(int type) {\n    if (type == -1) {\n        return;\n    }\n    corenrn.get_has_net_event().push_back(type);\n}\n\n/* values are type numbers of mechanisms which have FOR_NETCONS statement */\nint nrn_fornetcon_cnt_;    /* how many models have a FOR_NETCONS statement */\nint* nrn_fornetcon_type_;  /* what are the type numbers */\nint* nrn_fornetcon_index_; /* what is the index into the ppvar array */\n\nvoid add_nrn_fornetcons(int type, int indx) {\n    if (type == -1)\n        return;\n\n    int i = nrn_fornetcon_cnt_++;\n    nrn_fornetcon_type_ = (int*) erealloc(nrn_fornetcon_type_, (i + 1) * sizeof(int));\n    nrn_fornetcon_index_ = (int*) erealloc(nrn_fornetcon_index_, (i + 1) * sizeof(int));\n    nrn_fornetcon_type_[i] = type;\n    nrn_fornetcon_index_[i] = indx;\n}\n\nvoid add_nrn_artcell(int type, int qi) {\n    if 
(type == -1) {\n        return;\n    }\n\n    corenrn.get_is_artificial()[type] = 1;\n    corenrn.get_artcell_qindex()[type] = qi;\n}\n\nvoid set_pnt_receive(int type,\n                     pnt_receive_t pnt_receive,\n                     pnt_receive_t pnt_receive_init,\n                     short size) {\n    if (type == -1) {\n        return;\n    }\n    corenrn.get_pnt_receive()[type] = pnt_receive;\n    corenrn.get_pnt_receive_init()[type] = pnt_receive_init;\n    corenrn.get_pnt_receive_size()[type] = size;\n}\n\nvoid alloc_mech(int memb_func_size_) {\n    corenrn.get_memb_funcs().resize(memb_func_size_);\n    corenrn.get_pnt_map().resize(memb_func_size_);\n    corenrn.get_pnt_receive().resize(memb_func_size_);\n    corenrn.get_pnt_receive_init().resize(memb_func_size_);\n    corenrn.get_pnt_receive_size().resize(memb_func_size_);\n    corenrn.get_watch_check().resize(memb_func_size_);\n    corenrn.get_is_artificial().resize(memb_func_size_, false);\n    corenrn.get_artcell_qindex().resize(memb_func_size_);\n    corenrn.get_prop_param_size().resize(memb_func_size_);\n    corenrn.get_prop_dparam_size().resize(memb_func_size_);\n    corenrn.get_mech_data_layout().resize(memb_func_size_, 1);\n    corenrn.get_bbcore_read().resize(memb_func_size_);\n    corenrn.get_bbcore_write().resize(memb_func_size_);\n}\n\nvoid initnrn() {\n    secondorder = DEF_secondorder; /* >0 means crank-nicolson. 
2 means currents\n                              adjusted to t+dt/2 */\n    t = 0.;                        /* msec */\n    dt = DEF_dt;                   /* msec */\n    rev_dt = (int) (DEF_rev_dt);   /* 1/msec */\n    celsius = DEF_celsius;         /* degrees celsius */\n}\n\n/* if vectorized then thread_data_size added to it */\nint register_mech(const char** m,\n                  mod_alloc_t alloc,\n                  mod_f_t cur,\n                  mod_f_t jacob,\n                  mod_f_t stat,\n                  mod_f_t initialize,\n                  mod_f_t private_constructor,\n                  mod_f_t private_destructor,\n                  int /* nrnpointerindex */,\n                  int vectorized) {\n    auto& memb_func = corenrn.get_memb_funcs();\n\n    int type = nrn_get_mechtype(m[1]);\n\n    // No mechanism in the .dat files\n    if (type == -1)\n        return type;\n\n    assert(type);\n#ifdef DEBUG\n    printf(\"register_mech %s %d\\n\", m[1], type);\n#endif\n    if (memb_func[type].sym) {\n        assert(strcmp(memb_func[type].sym, m[1]) == 0);\n    } else {\n        memb_func[type].sym = (char*) emalloc(strlen(m[1]) + 1);\n        strcpy(memb_func[type].sym, m[1]);\n    }\n    memb_func[type].current = cur;\n    memb_func[type].jacob = jacob;\n    memb_func[type].alloc = alloc;\n    memb_func[type].state = stat;\n    memb_func[type].initialize = initialize;\n    memb_func[type].constructor = nullptr;\n    memb_func[type].destructor = nullptr;\n    memb_func[type].private_constructor = private_constructor;\n    memb_func[type].private_destructor = private_destructor;\n#if VECTORIZE\n    memb_func[type].vectorized = vectorized ? 1 : 0;\n    memb_func[type].thread_size_ = vectorized ? 
(vectorized - 1) : 0;\n    memb_func[type].thread_mem_init_ = nullptr;\n    memb_func[type].thread_cleanup_ = nullptr;\n    memb_func[type].thread_table_check_ = nullptr;\n    memb_func[type].is_point = 0;\n    memb_func[type].setdata_ = nullptr;\n    memb_func[type].dparam_semantics = nullptr;\n#endif\n    register_all_variables_offsets(type, &m[2]);\n    return type;\n}\n\nvoid nrn_writes_conc(int type, int /* unused */) {\n    static int lastion = EXTRACELL + 1;\n    if (type == -1)\n        return;\n\n#if CORENRN_DEBUG\n    printf(\"%s reordered from %d to %d\\n\", corenrn.get_memb_func(type).sym, type, lastion);\n#endif\n    if (nrn_is_ion(type)) {\n        ++lastion;\n    }\n}\n\nvoid _nrn_layout_reg(int type, int layout) {\n    corenrn.get_mech_data_layout()[type] = layout;\n}\n\nvoid hoc_register_net_receive_buffering(NetBufReceive_t f, int type) {\n    corenrn.get_net_buf_receive().emplace_back(f, type);\n}\n\nvoid hoc_register_net_send_buffering(int type) {\n    corenrn.get_net_buf_send_type().push_back(type);\n}\n\nvoid hoc_register_watch_check(nrn_watch_check_t nwc, int type) {\n    corenrn.get_watch_check()[type] = nwc;\n}\n\nvoid hoc_register_prop_size(int type, int psize, int dpsize) {\n    if (type == -1)\n        return;\n\n    int pold = corenrn.get_prop_param_size()[type];\n    int dpold = corenrn.get_prop_dparam_size()[type];\n    if (psize != pold || dpsize != dpold) {\n        corenrn.get_different_mechanism_type().push_back(type);\n    }\n    corenrn.get_prop_param_size()[type] = psize;\n    corenrn.get_prop_dparam_size()[type] = dpsize;\n    if (dpsize) {\n        corenrn.get_memb_func(type).dparam_semantics = (int*) ecalloc(dpsize, sizeof(int));\n    }\n}\nvoid hoc_register_dparam_semantics(int type, int ix, const char* name) {\n    /* needed for SoA to possibly reorder name_ion and some \"pointer\" pointers. 
*/\n    /* only interested in area, iontype, cvodeieq,\n       netsend, pointer, pntproc, bbcorepointer, watch, diam, fornetcon,\n       xx_ion and #xx_ion which will get\n       a semantics value of -1, -2, -3,\n       -4, -5, -6, -7, -8, -9, -10,\n       type, and type+1000 respectively\n    */\n    auto& memb_func = corenrn.get_memb_funcs();\n    if (strcmp(name, \"area\") == 0) {\n        memb_func[type].dparam_semantics[ix] = -1;\n    } else if (strcmp(name, \"iontype\") == 0) {\n        memb_func[type].dparam_semantics[ix] = -2;\n    } else if (strcmp(name, \"cvodeieq\") == 0) {\n        memb_func[type].dparam_semantics[ix] = -3;\n    } else if (strcmp(name, \"netsend\") == 0) {\n        memb_func[type].dparam_semantics[ix] = -4;\n    } else if (strcmp(name, \"pointer\") == 0) {\n        memb_func[type].dparam_semantics[ix] = -5;\n    } else if (strcmp(name, \"pntproc\") == 0) {\n        memb_func[type].dparam_semantics[ix] = -6;\n    } else if (strcmp(name, \"bbcorepointer\") == 0) {\n        memb_func[type].dparam_semantics[ix] = -7;\n    } else if (strcmp(name, \"watch\") == 0) {\n        memb_func[type].dparam_semantics[ix] = -8;\n    } else if (strcmp(name, \"diam\") == 0) {\n        memb_func[type].dparam_semantics[ix] = -9;\n    } else if (strcmp(name, \"fornetcon\") == 0) {\n        memb_func[type].dparam_semantics[ix] = -10;\n    } else {\n        int i = name[0] == '#' ? 
1 : 0;\n        int etype = nrn_get_mechtype(name + i);\n        memb_func[type].dparam_semantics[ix] = etype + i * 1000;\n        /* note that if style is needed (i==1), then we are writing a concentration */\n        if (i) {\n            ion_write_depend(type, etype);\n        }\n    }\n#if CORENRN_DEBUG\n    printf(\"dparam semantics %s ix=%d %s %d\\n\",\n           memb_func[type].sym,\n           ix,\n           name,\n           memb_func[type].dparam_semantics[ix]);\n#endif\n}\n\n/* only ion type ion_write_depend_ are non-nullptr */\n/* and those are array of integers with first integer being array size */\n/* and remaining size-1 integers containing the mechanism types that write concentrations to that\n * ion */\nstatic void ion_write_depend(int type, int etype) {\n    auto& memb_func = corenrn.get_memb_funcs();\n    auto& ion_write_depend_ = corenrn.get_ion_write_dependency();\n    if (ion_write_depend_.size() < memb_func.size()) {\n        ion_write_depend_.resize(memb_func.size());\n    }\n\n    int size = !ion_write_depend_[etype].empty() ? 
ion_write_depend_[etype][0] + 1 : 2;\n\n    ion_write_depend_[etype].resize(size, 0);\n    ion_write_depend_[etype][0] = size;\n    ion_write_depend_[etype][size - 1] = type;\n}\n\nstatic int depend_append(int idep, int* dependencies, int deptype, int type) {\n    /* append only if not already in dependencies and != type*/\n    bool add = true;\n    if (deptype == type) {\n        return idep;\n    }\n    for (int i = 0; i < idep; ++i) {\n        if (deptype == dependencies[i]) {\n            add = false;\n            break;\n        }\n    }\n    if (add) {\n        dependencies[idep++] = deptype;\n    }\n    return idep;\n}\n\n/* return list of types that this type depends on (10 should be more than enough) */\n/* dependencies must be an array that is large enough to hold that array */\n/* number of dependencies is returned */\nint nrn_mech_depend(int type, int* dependencies) {\n    int dpsize = corenrn.get_prop_dparam_size()[type];\n    int* ds = corenrn.get_memb_func(type).dparam_semantics;\n    int idep = 0;\n    if (ds)\n        for (int i = 0; i < dpsize; ++i) {\n            if (ds[i] > 0 && ds[i] < 1000) {\n                int deptype = ds[i];\n                int idepnew = depend_append(idep, dependencies, deptype, type);\n                if ((idepnew > idep) && !corenrn.get_ion_write_dependency().empty() &&\n                    !corenrn.get_ion_write_dependency()[deptype].empty()) {\n                    auto& iwd = corenrn.get_ion_write_dependency()[deptype];\n                    int size = iwd[0];\n                    for (int j = 1; j < size; ++j) {\n                        idepnew = depend_append(idepnew, dependencies, iwd[j], type);\n                    }\n                }\n                idep = idepnew;\n            }\n        }\n    return idep;\n}\n\nvoid register_constructor(mod_f_t c) {\n    corenrn.get_memb_funcs().back().constructor = c;\n}\n\nvoid register_destructor(mod_f_t d) {\n    corenrn.get_memb_funcs().back().destructor = d;\n}\n\nint 
point_reg_helper(const Symbol* s2) {\n    static int next_pointtype = 1; /* starts at 1 since 0 means not point in pnt_map */\n    int type = nrn_get_mechtype(s2);\n\n    // No mechanism in the .dat files\n    if (type == -1)\n        return type;\n\n    corenrn.get_pnt_map()[type] = next_pointtype++;\n    corenrn.get_memb_func(type).is_point = 1;\n\n    return corenrn.get_pnt_map()[type];\n}\n\nint point_register_mech(const char** m,\n                        mod_alloc_t alloc,\n                        mod_f_t cur,\n                        mod_f_t jacob,\n                        mod_f_t stat,\n                        mod_f_t initialize,\n                        mod_f_t private_constructor,\n                        mod_f_t private_destructor,\n                        int nrnpointerindex,\n                        mod_f_t constructor,\n                        mod_f_t destructor,\n                        int vectorized) {\n    const Symbol* s = m[1];\n    register_mech(m,\n                  alloc,\n                  cur,\n                  jacob,\n                  stat,\n                  initialize,\n                  private_constructor,\n                  private_destructor,\n                  nrnpointerindex,\n                  vectorized);\n    register_constructor(constructor);\n    register_destructor(destructor);\n    return point_reg_helper(s);\n}\n\nvoid _modl_cleanup() {}\n\nint state_discon_allowed_;\nint state_discon_flag_ = 0;\nvoid state_discontinuity(int /* i */, double* pd, double d) {\n    if (state_discon_allowed_ && state_discon_flag_ == 0) {\n        *pd = d;\n        /*printf(\"state_discontinuity t=%g pd=%lx d=%g\\n\", t, (long)pd, d);*/\n    }\n}\n\nvoid hoc_reg_ba(int mt, mod_f_t f, int type) {\n    if (type == -1)\n        return;\n\n    switch (type) { /* see bablk in src/nmodl/nocpout.c */\n        case 11:\n            type = BEFORE_BREAKPOINT;\n            break;\n        case 22:\n            type = AFTER_SOLVE;\n            break;\n     
   case 13:\n            type = BEFORE_INITIAL;\n            break;\n        case 23:\n            type = AFTER_INITIAL;\n            break;\n        case 14:\n            type = BEFORE_STEP;\n            break;\n        default:\n            printf(\"before-after processing type %d for %s not implemented\\n\",\n                   type,\n                   corenrn.get_memb_func(mt).sym);\n            nrn_exit(1);\n    }\n    auto bam = (BAMech*) emalloc(sizeof(BAMech));\n    bam->f = f;\n    bam->type = mt;\n    bam->next = nullptr;\n    // keep in call order\n    if (!corenrn.get_bamech()[type]) {\n        corenrn.get_bamech()[type] = bam;\n    } else {\n        BAMech* last;\n        for (last = corenrn.get_bamech()[type]; last->next; last = last->next) {\n        }\n        last->next = bam;\n    }\n}\n\nvoid _nrn_thread_reg0(int i, void (*f)(ThreadDatum*)) {\n    if (i == -1)\n        return;\n\n    corenrn.get_memb_func(i).thread_cleanup_ = f;\n}\n\nvoid _nrn_thread_reg1(int i, void (*f)(ThreadDatum*)) {\n    if (i == -1)\n        return;\n\n    corenrn.get_memb_func(i).thread_mem_init_ = f;\n}\n\nvoid _nrn_thread_table_reg(int i, thread_table_check_t f) {\n    if (i == -1)\n        return;\n\n    corenrn.get_memb_func(i).thread_table_check_ = f;\n}\n\nvoid _nrn_setdata_reg(int i, void (*call)(double*, Datum*)) {\n    if (i == -1)\n        return;\n\n    corenrn.get_memb_func(i).setdata_ = call;\n}\n\nMemb_func::~Memb_func() {\n    if (sym != nullptr) {\n        free(sym);\n    }\n    if (dparam_semantics != nullptr) {\n        free(dparam_semantics);\n    }\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mechanism/register_mech.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n#pragma once\n\nnamespace coreneuron {\nvoid add_nrn_artcell(int type, int qi);\nvoid set_pnt_receive(int type,\n                     pnt_receive_t pnt_receive,\n                     pnt_receive_t pnt_receive_init,\n                     short size);\nextern void initnrn(void);\nextern void hoc_reg_bbcore_read(int type, bbcore_read_t f);\nextern void hoc_reg_bbcore_write(int type, bbcore_write_t f);\nextern void _nrn_thread_table_reg(\n    int i,\n    void (*f)(int, int, double*, Datum*, ThreadDatum*, NrnThread*, Memb_list*, int));\nextern void alloc_mech(int);\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/membrane_definitions.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n/* /local/src/master/nrn/src/nrnoc/membdef.h,v 1.2 1995/02/13 20:20:42 hines Exp */\n\n/* numerical parameters */\n#define DEF_nseg   1           /* default number of segments per section*/\n#define DEF_dt     .025        /* ms */\n#define DEF_rev_dt 1. / DEF_dt /* 1/ms */\n#define DEF_secondorder                           \\\n    0 /* >0 means crank-nicolson. 2 means current \\\n      adjusted to t+dt/2 */\n\n/*global parameters */\n#define DEF_Ra      35.4 /* ohm-cm */ /*changed from 34.5 on 1/6/95*/\n#define DEF_celsius 6.3               /* deg-C */\n\n#define DEF_vrest -65. /* mV */\n\n/* old point process parameters */\n/* fclamp */\n#define DEF_clamp_resist 1e-3 /* megohm */\n\n/* Parameters that are used in mechanism _alloc() procedures */\n/* cable */\n#define DEF_L          100. /* microns */\n#define DEF_rallbranch 1.\n\n/* morphology */\n#define DEF_diam 500. /* microns */\n\n/* capacitance */\n#define DEF_cm 1. /* uF/cm^2 */\n\n/* fast passive (e_p and g_p)*/\n#define DEF_e DEF_vrest /* mV */\n#define DEF_g 5.e-4     /* S/cm^2 */\n\n/* na_ion */\n#define DEF_nai 10.                /* mM */\n#define DEF_nao 140.               /* mM */\n#define DEF_ena (115. + DEF_vrest) /* mV */\n\n/* k_ion */\n#define DEF_ki 54.4               /* mM */\n#define DEF_ko 2.5                /* mM */\n#define DEF_ek (-12. + DEF_vrest) /* mV */\n\n/* ca_ion -> any program that uses DEF_eca must include <math.h> */\n#define DEF_cai 5.e-5 /* mM */\n#define DEF_cao 2.    /* mM */\n#include <math.h>\n#define DEF_eca 12.5 * log(DEF_cao / DEF_cai) /* mV */\n\n/* default ion values */\n#define DEF_ioni 1. /* mM */\n#define DEF_iono 1. /* mM */\n#define DEF_eion 0. /* mV */\n"
  },
  {
    "path": "coreneuron/mpi/core/nrnmpi.hpp",
    "content": "#pragma once\n\nnamespace coreneuron {\nextern int nrnmpi_numprocs;\nextern int nrnmpi_myid;\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mpi/core/nrnmpi_def_cinc.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\nnamespace coreneuron {\nint nrnmpi_numprocs = 1; /* size */\nint nrnmpi_myid = 0;     /* rank */\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mpi/core/nrnmpidec.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include \"../nrnmpi.h\"\n\nnamespace coreneuron {\n\n\n/* from nrnmpi.cpp */\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_init_impl)> nrnmpi_init{\"nrnmpi_init_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_finalize_impl)> nrnmpi_finalize{\n    \"nrnmpi_finalize_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_check_threading_support_impl)>\n    nrnmpi_check_threading_support{\"nrnmpi_check_threading_support_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_write_file_impl)> nrnmpi_write_file{\n    \"nrnmpi_write_file_impl\"};\n\n/* from mpispike.c */\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_spike_exchange_impl)> nrnmpi_spike_exchange{\n    \"nrnmpi_spike_exchange_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_spike_exchange_compressed_impl)>\n    nrnmpi_spike_exchange_compressed{\"nrnmpi_spike_exchange_compressed_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_int_allmax_impl)> nrnmpi_int_allmax{\n    \"nrnmpi_int_allmax_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_int_allgather_impl)> nrnmpi_int_allgather{\n    \"nrnmpi_int_allgather_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_int_alltoall_impl)> nrnmpi_int_alltoall{\n    \"nrnmpi_int_alltoall_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_int_alltoallv_impl)> nrnmpi_int_alltoallv{\n    \"nrnmpi_int_alltoallv_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_dbl_alltoallv_impl)> nrnmpi_dbl_alltoallv{\n    \"nrnmpi_dbl_alltoallv_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_dbl_allmin_impl)> nrnmpi_dbl_allmin{\n    
\"nrnmpi_dbl_allmin_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_dbl_allmax_impl)> nrnmpi_dbl_allmax{\n    \"nrnmpi_dbl_allmax_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_barrier_impl)> nrnmpi_barrier{\n    \"nrnmpi_barrier_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_dbl_allreduce_impl)> nrnmpi_dbl_allreduce{\n    \"nrnmpi_dbl_allreduce_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_dbl_allreduce_vec_impl)> nrnmpi_dbl_allreduce_vec{\n    \"nrnmpi_dbl_allreduce_vec_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_long_allreduce_vec_impl)>\n    nrnmpi_long_allreduce_vec{\"nrnmpi_long_allreduce_vec_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_initialized_impl)> nrnmpi_initialized{\n    \"nrnmpi_initialized_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_abort_impl)> nrnmpi_abort{\"nrnmpi_abort_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_wtime_impl)> nrnmpi_wtime{\"nrnmpi_wtime_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_local_rank_impl)> nrnmpi_local_rank{\n    \"nrnmpi_local_rank_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_local_size_impl)> nrnmpi_local_size{\n    \"nrnmpi_local_size_impl\"};\n#if NRN_MULTISEND\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_multisend_comm_impl)> nrnmpi_multisend_comm{\n    \"nrnmpi_multisend_comm_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_multisend_impl)> nrnmpi_multisend{\n    \"nrnmpi_multisend_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_multisend_single_advance_impl)>\n    nrnmpi_multisend_single_advance{\"nrnmpi_multisend_single_advance_impl\"};\nmpi_function<cnrn_make_integral_constant_t(nrnmpi_multisend_conserve_impl)>\n    nrnmpi_multisend_conserve{\"nrnmpi_multisend_conserve_impl\"};\n#endif  // NRN_MULTISEND\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mpi/core/resolve.cpp",
    "content": "#include <dlfcn.h>\n#include <sstream>\n#include \"../nrnmpi.h\"\n\nnamespace coreneuron {\n// Those functions are part of a mechanism to dynamically load mpi or not\nvoid mpi_manager_t::resolve_symbols(void* handle) {\n    for (auto* ptr: m_function_ptrs) {\n        assert(!(*ptr));\n        ptr->resolve(handle);\n        assert(*ptr);\n    }\n}\n\nvoid mpi_function_base::resolve(void* handle) {\n    dlerror();\n    void* ptr = dlsym(handle, m_name);\n    const char* error = dlerror();\n    if (error) {\n        std::ostringstream oss;\n        oss << \"Could not get symbol \" << m_name << \" from handle \" << handle << \": \" << error;\n        throw std::runtime_error(oss.str());\n    }\n    assert(ptr);\n    m_fptr = ptr;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mpi/lib/mpispike.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include \"coreneuron/nrnconf.h\"\n/* do not want the redef in the dynamic load case */\n#include \"coreneuron/mpi/nrnmpiuse.h\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/mpi/nrnmpidec.h\"\n#include \"nrnmpi.hpp\"\n#include \"coreneuron/utils/profile/profiler_interface.h\"\n#include \"coreneuron/utils/nrn_assert.h\"\n\n#include <mpi.h>\n\n#include <cstring>\n\nnamespace coreneuron {\nextern MPI_Comm nrnmpi_comm;\n\nstatic int np;\nstatic int* displs{nullptr};\nstatic int* byteovfl{nullptr}; /* for the compressed transfer method */\nstatic MPI_Datatype spike_type;\n\nstatic void* emalloc(size_t size) {\n    void* memptr = malloc(size);\n    assert(memptr);\n    return memptr;\n}\n\n// Register type NRNMPI_Spike\nvoid nrnmpi_spike_initialize() {\n    NRNMPI_Spike s;\n    int block_lengths[2] = {1, 1};\n    MPI_Aint addresses[3];\n\n    MPI_Get_address(&s, &addresses[0]);\n    MPI_Get_address(&(s.gid), &addresses[1]);\n    MPI_Get_address(&(s.spiketime), &addresses[2]);\n\n    MPI_Aint displacements[2] = {addresses[1] - addresses[0], addresses[2] - addresses[0]};\n\n    MPI_Datatype typelist[2] = {MPI_INT, MPI_DOUBLE};\n    MPI_Type_create_struct(2, block_lengths, displacements, typelist, &spike_type);\n    MPI_Type_commit(&spike_type);\n}\n\n#if nrn_spikebuf_size > 0\n\nstatic MPI_Datatype spikebuf_type;\n\n// Register type NRNMPI_Spikebuf\nstatic void make_spikebuf_type() {\n    NRNMPI_Spikebuf s;\n    int block_lengths[3] = {1, nrn_spikebuf_size, nrn_spikebuf_size};\n    MPI_Datatype typelist[3] = {MPI_INT, MPI_INT, MPI_DOUBLE};\n\n    MPI_Aint addresses[4];\n    MPI_Get_address(&s, &addresses[0]);\n    MPI_Get_address(&(s.nspike), &addresses[1]);\n    
MPI_Get_address(&(s.gid[0]), &addresses[2]);\n    MPI_Get_address(&(s.spiketime[0]), &addresses[3]);\n\n    MPI_Aint displacements[3] = {addresses[1] - addresses[0],\n                                 addresses[2] - addresses[0],\n                                 addresses[3] - addresses[0]};\n\n    MPI_Type_create_struct(3, block_lengths, displacements, typelist, &spikebuf_type);\n    MPI_Type_commit(&spikebuf_type);\n}\n#endif\n\nvoid wait_before_spike_exchange() {\n    MPI_Barrier(nrnmpi_comm);\n}\n\nint nrnmpi_spike_exchange_impl(int* nin,\n                               NRNMPI_Spike* spikeout,\n                               int icapacity,\n                               NRNMPI_Spike** spikein,\n                               int& ovfl,\n                               int nout,\n                               NRNMPI_Spikebuf* spbufout,\n                               NRNMPI_Spikebuf* spbufin) {\n    nrn_assert(spikein);\n    Instrumentor::phase_begin(\"spike-exchange\");\n\n    {\n        Instrumentor::phase p(\"imbalance\");\n        wait_before_spike_exchange();\n    }\n\n    Instrumentor::phase_begin(\"communication\");\n    if (!displs) {\n        np = nrnmpi_numprocs_;\n        displs = (int*) emalloc(np * sizeof(int));\n        displs[0] = 0;\n#if nrn_spikebuf_size > 0\n        make_spikebuf_type();\n#endif\n    }\n#if nrn_spikebuf_size == 0\n    MPI_Allgather(&nout, 1, MPI_INT, nin, 1, MPI_INT, nrnmpi_comm);\n    int n = nin[0];\n    for (int i = 1; i < np; ++i) {\n        displs[i] = n;\n        n += nin[i];\n    }\n    if (n) {\n        if (icapacity < n) {\n            icapacity = n + 10;\n            free(*spikein);\n            *spikein = (NRNMPI_Spike*) emalloc(icapacity * sizeof(NRNMPI_Spike));\n        }\n        MPI_Allgatherv(spikeout, nout, spike_type, *spikein, nin, displs, spike_type, nrnmpi_comm);\n    }\n#else\n    MPI_Allgather(spbufout, 1, spikebuf_type, spbufin, 1, spikebuf_type, nrnmpi_comm);\n    int novfl = 0;\n    int n = 
spbufin[0].nspike;\n    if (n > nrn_spikebuf_size) {\n        nin[0] = n - nrn_spikebuf_size;\n        novfl += nin[0];\n    } else {\n        nin[0] = 0;\n    }\n    for (int i = 1; i < np; ++i) {\n        displs[i] = novfl;\n        int n1 = spbufin[i].nspike;\n        n += n1;\n        if (n1 > nrn_spikebuf_size) {\n            nin[i] = n1 - nrn_spikebuf_size;\n            novfl += nin[i];\n        } else {\n            nin[i] = 0;\n        }\n    }\n    if (novfl) {\n        if (icapacity < novfl) {\n            icapacity = novfl + 10;\n            free(*spikein);\n            *spikein = (NRNMPI_Spike*) emalloc(icapacity * sizeof(NRNMPI_Spike));\n        }\n        int n1 = (nout > nrn_spikebuf_size) ? nout - nrn_spikebuf_size : 0;\n        MPI_Allgatherv(spikeout, n1, spike_type, *spikein, nin, displs, spike_type, nrnmpi_comm);\n    }\n    ovfl = novfl;\n#endif\n    Instrumentor::phase_end(\"communication\");\n    Instrumentor::phase_end(\"spike-exchange\");\n    return n;\n}\n\n/*\nThe compressed spike format is restricted to the fixed step method and is\na sequence of unsigned char.\nnspike = buf[0]*256 + buf[1]\na sequence of spiketime, localgid pairs. There are nspike of them.\n        spiketime is relative to the last transfer time in units of dt.\n        note that this requires a mindelay < 256*dt.\n        localgid is an unsigned int, unsigned short,\n        or unsigned char in size depending on the range and thus takes\n        4, 2, or 1 byte respectively. To be machine independent we do our\n        own byte coding. When the localgid range is smaller than the true\n        gid range, the gid->PreSyn are remapped into\n        hostid specific\tmaps. 
If there are not many holes, i.e just about every\n        spike from a source machine is delivered to some cell on a\n        target machine, then instead of\ta hash map, a vector is used.\nThe allgather sends the first part of the buf and the allgatherv buffer\nsends any overflow.\n*/\nint nrnmpi_spike_exchange_compressed_impl(int localgid_size,\n                                          unsigned char*& spfixin_ovfl,\n                                          int send_nspike,\n                                          int* nin,\n                                          int ovfl_capacity,\n                                          unsigned char* spikeout_fixed,\n                                          int ag_send_size,\n                                          unsigned char* spikein_fixed,\n                                          int& ovfl) {\n    if (!displs) {\n        np = nrnmpi_numprocs_;\n        displs = (int*) emalloc(np * sizeof(int));\n        displs[0] = 0;\n    }\n    if (!byteovfl) {\n        byteovfl = (int*) emalloc(np * sizeof(int));\n    }\n    MPI_Allgather(\n        spikeout_fixed, ag_send_size, MPI_BYTE, spikein_fixed, ag_send_size, MPI_BYTE, nrnmpi_comm);\n    int novfl = 0;\n    int ntot = 0;\n    int bstot = 0;\n    for (int i = 0; i < np; ++i) {\n        displs[i] = bstot;\n        int idx = i * ag_send_size;\n        int n = spikein_fixed[idx++] * 256;\n        n += spikein_fixed[idx++];\n        ntot += n;\n        nin[i] = n;\n        if (n > send_nspike) {\n            int bs = 2 + n * (1 + localgid_size) - ag_send_size;\n            byteovfl[i] = bs;\n            bstot += bs;\n            novfl += n - send_nspike;\n        } else {\n            byteovfl[i] = 0;\n        }\n    }\n    if (novfl) {\n        if (ovfl_capacity < novfl) {\n            ovfl_capacity = novfl + 10;\n            free(spfixin_ovfl);\n            spfixin_ovfl = (unsigned char*) emalloc(ovfl_capacity * (1 + localgid_size) *\n                                  
                  sizeof(unsigned char));\n        }\n        int bs = byteovfl[nrnmpi_myid_];\n        /*\n        note that the spikeout_fixed buffer is one since the overflow\n        is contiguous to the first part. But the spfixin_ovfl is\n        completely separate from the spikein_fixed since the latter\n        dynamically changes its size during a run.\n        */\n        MPI_Allgatherv(spikeout_fixed + ag_send_size,\n                       bs,\n                       MPI_BYTE,\n                       spfixin_ovfl,\n                       byteovfl,\n                       displs,\n                       MPI_BYTE,\n                       nrnmpi_comm);\n    }\n    ovfl = novfl;\n    return ntot;\n}\n\nint nrnmpi_int_allmax_impl(int x) {\n    int result;\n    MPI_Allreduce(&x, &result, 1, MPI_INT, MPI_MAX, nrnmpi_comm);\n    return result;\n}\n\nextern void nrnmpi_int_alltoall_impl(int* s, int* r, int n) {\n    MPI_Alltoall(s, n, MPI_INT, r, n, MPI_INT, nrnmpi_comm);\n}\n\nextern void nrnmpi_int_alltoallv_impl(const int* s,\n                                      const int* scnt,\n                                      const int* sdispl,\n                                      int* r,\n                                      int* rcnt,\n                                      int* rdispl) {\n    MPI_Alltoallv(s, scnt, sdispl, MPI_INT, r, rcnt, rdispl, MPI_INT, nrnmpi_comm);\n}\n\nextern void nrnmpi_dbl_alltoallv_impl(double* s,\n                                      int* scnt,\n                                      int* sdispl,\n                                      double* r,\n                                      int* rcnt,\n                                      int* rdispl) {\n    MPI_Alltoallv(s, scnt, sdispl, MPI_DOUBLE, r, rcnt, rdispl, MPI_DOUBLE, nrnmpi_comm);\n}\n\n/* following are for the partrans */\n\nvoid nrnmpi_int_allgather_impl(int* s, int* r, int n) {\n    MPI_Allgather(s, n, MPI_INT, r, n, MPI_INT, nrnmpi_comm);\n}\n\ndouble 
nrnmpi_dbl_allmin_impl(double x) {\n    double result;\n    MPI_Allreduce(&x, &result, 1, MPI_DOUBLE, MPI_MIN, nrnmpi_comm);\n    return result;\n}\n\ndouble nrnmpi_dbl_allmax_impl(double x) {\n    double result;\n    MPI_Allreduce(&x, &result, 1, MPI_DOUBLE, MPI_MAX, nrnmpi_comm);\n    return result;\n}\n\nvoid nrnmpi_barrier_impl() {\n    MPI_Barrier(nrnmpi_comm);\n}\n\ndouble nrnmpi_dbl_allreduce_impl(double x, int type) {\n    double result;\n    MPI_Op tt;\n    if (type == 1) {\n        tt = MPI_SUM;\n    } else if (type == 2) {\n        tt = MPI_MAX;\n    } else {\n        tt = MPI_MIN;\n    }\n    MPI_Allreduce(&x, &result, 1, MPI_DOUBLE, tt, nrnmpi_comm);\n    return result;\n}\n\nvoid nrnmpi_dbl_allreduce_vec_impl(double* src, double* dest, int cnt, int type) {\n    MPI_Op tt;\n    assert(src != dest);\n    if (type == 1) {\n        tt = MPI_SUM;\n    } else if (type == 2) {\n        tt = MPI_MAX;\n    } else {\n        tt = MPI_MIN;\n    }\n    MPI_Allreduce(src, dest, cnt, MPI_DOUBLE, tt, nrnmpi_comm);\n    return;\n}\n\nvoid nrnmpi_long_allreduce_vec_impl(long* src, long* dest, int cnt, int type) {\n    MPI_Op tt;\n    assert(src != dest);\n    if (type == 1) {\n        tt = MPI_SUM;\n    } else if (type == 2) {\n        tt = MPI_MAX;\n    } else {\n        tt = MPI_MIN;\n    }\n    MPI_Allreduce(src, dest, cnt, MPI_LONG, tt, nrnmpi_comm);\n    return;\n}\n\n#if NRN_MULTISEND\n\nstatic MPI_Comm multisend_comm;\n\nvoid nrnmpi_multisend_comm_impl() {\n    if (!multisend_comm) {\n        MPI_Comm_dup(MPI_COMM_WORLD, &multisend_comm);\n    }\n}\n\nvoid nrnmpi_multisend_impl(NRNMPI_Spike* spk, int n, int* hosts) {\n    MPI_Request r;\n    for (int i = 0; i < n; ++i) {\n        MPI_Isend(spk, 1, spike_type, hosts[i], 1, multisend_comm, &r);\n        MPI_Request_free(&r);\n    }\n}\n\nint nrnmpi_multisend_single_advance_impl(NRNMPI_Spike* spk) {\n    int flag = 0;\n    MPI_Status status;\n    MPI_Iprobe(MPI_ANY_SOURCE, 1, multisend_comm, &flag, &status);\n    
if (flag) {\n        MPI_Recv(spk, 1, spike_type, MPI_ANY_SOURCE, 1, multisend_comm, &status);\n    }\n    return flag;\n}\n\nint nrnmpi_multisend_conserve_impl(int nsend, int nrecv) {\n    int tcnts[2];\n    tcnts[0] = nsend - nrecv;\n    MPI_Allreduce(tcnts, tcnts + 1, 1, MPI_INT, MPI_SUM, multisend_comm);\n    return tcnts[1];\n}\n\n#endif /*NRN_MULTISEND*/\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mpi/lib/nrnmpi.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <iostream>\n#include <string>\n#include <tuple>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"nrnmpi.hpp\"\n#if _OPENMP\n#include <omp.h>\n#endif\n#include <mpi.h>\nnamespace coreneuron {\n\nMPI_Comm nrnmpi_world_comm;\nMPI_Comm nrnmpi_comm;\nint nrnmpi_numprocs_;\nint nrnmpi_myid_;\n\nstatic bool nrnmpi_under_nrncontrol_{false};\n\nstatic void nrn_fatal_error(const char* msg) {\n    if (nrnmpi_myid_ == 0) {\n        printf(\"%s\\n\", msg);\n    }\n    nrnmpi_abort_impl(-1);\n}\n\nnrnmpi_init_ret_t nrnmpi_init_impl(int* pargc, char*** pargv, bool is_quiet) {\n    // Execute at most once per launch. Avoid memory leak.\n    static bool executed = false;\n    if (executed) {\n        return {nrnmpi_numprocs_, nrnmpi_myid_};\n    }\n\n    nrnmpi_under_nrncontrol_ = true;\n\n    if (!nrnmpi_initialized_impl()) {\n#if defined(_OPENMP)\n        int required = MPI_THREAD_FUNNELED;\n        int provided;\n        nrn_assert(MPI_Init_thread(pargc, pargv, required, &provided) == MPI_SUCCESS);\n\n        nrn_assert(required <= provided);\n#else\n        nrn_assert(MPI_Init(pargc, pargv) == MPI_SUCCESS);\n#endif\n    }\n    nrn_assert(MPI_Comm_dup(MPI_COMM_WORLD, &nrnmpi_world_comm) == MPI_SUCCESS);\n    nrn_assert(MPI_Comm_dup(nrnmpi_world_comm, &nrnmpi_comm) == MPI_SUCCESS);\n    nrn_assert(MPI_Comm_rank(nrnmpi_world_comm, &nrnmpi_myid_) == MPI_SUCCESS);\n    nrn_assert(MPI_Comm_size(nrnmpi_world_comm, &nrnmpi_numprocs_) == MPI_SUCCESS);\n    nrnmpi_spike_initialize();\n\n    if (nrnmpi_myid_ == 0 && !is_quiet) {\n#if defined(_OPENMP)\n        printf(\" num_mpi=%d\\n num_omp_thread=%d\\n\\n\", 
nrnmpi_numprocs_, omp_get_max_threads());\n#else\n        printf(\" num_mpi=%d\\n\\n\", nrnmpi_numprocs_);\n#endif\n    }\n\n    executed = true;\n    return {nrnmpi_numprocs_, nrnmpi_myid_};\n}\n\nvoid nrnmpi_finalize_impl(void) {\n    if (nrnmpi_under_nrncontrol_) {\n        if (nrnmpi_initialized_impl()) {\n            MPI_Comm_free(&nrnmpi_world_comm);\n            MPI_Comm_free(&nrnmpi_comm);\n            MPI_Finalize();\n        }\n    }\n}\n\n// check if appropriate threading level supported (i.e. MPI_THREAD_FUNNELED)\nvoid nrnmpi_check_threading_support_impl() {\n    int th = 0;\n    MPI_Query_thread(&th);\n    if (th < MPI_THREAD_FUNNELED) {\n        nrn_fatal_error(\n            \"\\n Current MPI library doesn't support MPI_THREAD_FUNNELED,\\\n                    \\n Run without enabling multi-threading!\");\n    }\n}\n\nbool nrnmpi_initialized_impl() {\n    int flag = 0;\n    MPI_Initialized(&flag);\n    return flag != 0;\n}\n\nvoid nrnmpi_abort_impl(int errcode) {\n    MPI_Abort(MPI_COMM_WORLD, errcode);\n}\n\ndouble nrnmpi_wtime_impl() {\n    return MPI_Wtime();\n}\n\n/**\n * Return local mpi rank within a shared memory node\n *\n * When performing certain operations, we need to know the rank of mpi\n * process on a given node. 
This function uses MPI 3 MPI_Comm_split_type\n * function and MPI_COMM_TYPE_SHARED key to find out the local rank.\n */\nint nrnmpi_local_rank_impl() {\n    int local_rank = 0;\n    if (nrnmpi_initialized_impl()) {\n        MPI_Comm local_comm;\n        MPI_Comm_split_type(\n            MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, nrnmpi_myid_, MPI_INFO_NULL, &local_comm);\n        MPI_Comm_rank(local_comm, &local_rank);\n        MPI_Comm_free(&local_comm);\n    }\n    return local_rank;\n}\n\n/**\n * Return number of ranks running on single shared memory node\n *\n * We use MPI 3 MPI_Comm_split_type function and MPI_COMM_TYPE_SHARED key to\n * determine number of mpi ranks within a shared memory node.\n */\nint nrnmpi_local_size_impl() {\n    int local_size = 1;\n    if (nrnmpi_initialized_impl()) {\n        MPI_Comm local_comm;\n        MPI_Comm_split_type(\n            MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, nrnmpi_myid_, MPI_INFO_NULL, &local_comm);\n        MPI_Comm_size(local_comm, &local_size);\n        MPI_Comm_free(&local_comm);\n    }\n    return local_size;\n}\n\n/**\n * Write given buffer to a new file using MPI collective I/O\n *\n * For output like spikes, each rank has to write spike timing\n * information to a single file. This routine writes buffers\n * of length len1, len2, len3... at the offsets 0, 0+len1,\n * 0+len1+len2... offsets. 
This write operation is collective across\n * all ranks of the common MPI communicator used for spike exchange.\n *\n * @param filename Name of the file to write\n * @param buffer Buffer to write\n * @param length Length of the buffer to write\n */\nvoid nrnmpi_write_file_impl(const std::string& filename, const char* buffer, size_t length) {\n    MPI_File fh;\n    MPI_Status status;\n\n    // global offset into file\n    unsigned long offset = 0;\n    MPI_Exscan(&length, &offset, 1, MPI_UNSIGNED_LONG, MPI_SUM, nrnmpi_comm);\n\n    int op_status = MPI_File_open(\n        nrnmpi_comm, filename.c_str(), MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);\n    if (op_status != MPI_SUCCESS && nrnmpi_myid_ == 0) {\n        std::cerr << \"Error while opening output file \" << filename << std::endl;\n        abort();\n    }\n\n    op_status = MPI_File_write_at_all(fh, offset, buffer, length, MPI_BYTE, &status);\n    if (op_status != MPI_SUCCESS && nrnmpi_myid_ == 0) {\n        std::cerr << \"Error while writing output \" << std::endl;\n        abort();\n    }\n\n    MPI_File_close(&fh);\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mpi/lib/nrnmpi.hpp",
    "content": "#pragma once\n\n// This file contains functions that are not used outside of the MPI library\nnamespace coreneuron {\nextern int nrnmpi_numprocs_;\nextern int nrnmpi_myid_;\nvoid nrnmpi_spike_initialize();\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mpi/nrnmpi.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include <cassert>\n#include <string>\n#include <type_traits>\n#include <vector>\n\n#include \"coreneuron/mpi/nrnmpiuse.h\"\n\n#ifndef nrn_spikebuf_size\n#define nrn_spikebuf_size 0\n#endif\n\nnamespace coreneuron {\nstruct NRNMPI_Spikebuf {\n    int nspike;\n    int gid[nrn_spikebuf_size];\n    double spiketime[nrn_spikebuf_size];\n};\n}  // namespace coreneuron\n\nnamespace coreneuron {\nstruct NRNMPI_Spike {\n    int gid;\n    double spiketime;\n};\n\n// Those functions and classes are part of a mechanism to dynamically or statically load mpi\n// functions\nstruct mpi_function_base;\n\nstruct mpi_manager_t {\n    void register_function(mpi_function_base* ptr) {\n        m_function_ptrs.push_back(ptr);\n    }\n    void resolve_symbols(void* dlsym_handle);\n\n  private:\n    std::vector<mpi_function_base*> m_function_ptrs;\n    // true when symbols are resolved\n};\n\ninline mpi_manager_t& mpi_manager() {\n    static mpi_manager_t x;\n    return x;\n}\n\nstruct mpi_function_base {\n    void resolve(void* dlsym_handle);\n    operator bool() const {\n        return m_fptr;\n    }\n    mpi_function_base(const char* name)\n        : m_name{name} {\n        mpi_manager().register_function(this);\n    }\n\n  protected:\n    void* m_fptr{};\n    const char* m_name;\n};\n\n// This could be done with a simpler\n//   template <auto fptr> struct function : function_base { ... 
};\n// pattern in C++17...\ntemplate <typename>\nstruct mpi_function {};\n\n#define cnrn_make_integral_constant_t(x) std::integral_constant<std::decay_t<decltype(x)>, x>\n\ntemplate <typename function_ptr, function_ptr fptr>\nstruct mpi_function<std::integral_constant<function_ptr, fptr>>: mpi_function_base {\n    using mpi_function_base::mpi_function_base;\n    template <typename... Args>  // in principle deducible from `function_ptr`\n    auto operator()(Args&&... args) const {\n#ifdef CORENEURON_ENABLE_MPI_DYNAMIC\n        // Dynamic MPI, m_fptr should have been initialised via dlsym.\n        assert(m_fptr);\n        return (*reinterpret_cast<decltype(fptr)>(m_fptr))(std::forward<Args>(args)...);\n#else\n        // No dynamic MPI, use `fptr` directly. Will produce link errors if libmpi.so is not linked.\n        return (*fptr)(std::forward<Args>(args)...);\n#endif\n    }\n};\n}  // namespace coreneuron\n#include \"coreneuron/mpi/nrnmpidec.h\"\n"
  },
  {
    "path": "coreneuron/mpi/nrnmpidec.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n/*\nThis file is processed by mkdynam.sh and so it is important that\nthe prototypes be of the form \"type foo(type arg, ...)\"\n*/\n\n#pragma once\n\n#include <stdlib.h>\n\nnamespace coreneuron {\n/* from nrnmpi.cpp */\nstruct nrnmpi_init_ret_t {\n    int numprocs;\n    int myid;\n};\nextern \"C\" nrnmpi_init_ret_t nrnmpi_init_impl(int* pargc, char*** pargv, bool is_quiet);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_init_impl)> nrnmpi_init;\nextern \"C\" void nrnmpi_finalize_impl(void);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_finalize_impl)> nrnmpi_finalize;\nextern \"C\" void nrnmpi_check_threading_support_impl();\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_check_threading_support_impl)>\n    nrnmpi_check_threading_support;\n// Write given buffer to a new file using MPI collective I/O\nextern \"C\" void nrnmpi_write_file_impl(const std::string& filename,\n                                       const char* buffer,\n                                       size_t length);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_write_file_impl)> nrnmpi_write_file;\n\n\n/* from mpispike.cpp */\nextern \"C\" int nrnmpi_spike_exchange_impl(int* nin,\n                                          NRNMPI_Spike* spikeout,\n                                          int icapacity,\n                                          NRNMPI_Spike** spikein,\n                                          int& ovfl,\n                                          int nout,\n                                          NRNMPI_Spikebuf* spbufout,\n                                          NRNMPI_Spikebuf* spbufin);\nextern 
mpi_function<cnrn_make_integral_constant_t(nrnmpi_spike_exchange_impl)>\n    nrnmpi_spike_exchange;\nextern \"C\" int nrnmpi_spike_exchange_compressed_impl(int,\n                                                     unsigned char*&,\n                                                     int,\n                                                     int*,\n                                                     int,\n                                                     unsigned char*,\n                                                     int,\n                                                     unsigned char*,\n                                                     int& ovfl);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_spike_exchange_compressed_impl)>\n    nrnmpi_spike_exchange_compressed;\nextern \"C\" int nrnmpi_int_allmax_impl(int i);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_int_allmax_impl)> nrnmpi_int_allmax;\nextern \"C\" void nrnmpi_int_allgather_impl(int* s, int* r, int n);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_int_allgather_impl)> nrnmpi_int_allgather;\nextern \"C\" void nrnmpi_int_alltoall_impl(int* s, int* r, int n);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_int_alltoall_impl)> nrnmpi_int_alltoall;\nextern \"C\" void nrnmpi_int_alltoallv_impl(const int* s,\n                                          const int* scnt,\n                                          const int* sdispl,\n                                          int* r,\n                                          int* rcnt,\n                                          int* rdispl);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_int_alltoallv_impl)> nrnmpi_int_alltoallv;\nextern \"C\" void nrnmpi_dbl_alltoallv_impl(double* s,\n                                          int* scnt,\n                                          int* sdispl,\n                                          double* r,\n                                          int* 
rcnt,\n                                          int* rdispl);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_dbl_alltoallv_impl)> nrnmpi_dbl_alltoallv;\nextern \"C\" double nrnmpi_dbl_allmin_impl(double x);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_dbl_allmin_impl)> nrnmpi_dbl_allmin;\nextern \"C\" double nrnmpi_dbl_allmax_impl(double x);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_dbl_allmax_impl)> nrnmpi_dbl_allmax;\nextern \"C\" void nrnmpi_barrier_impl(void);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_barrier_impl)> nrnmpi_barrier;\nextern \"C\" double nrnmpi_dbl_allreduce_impl(double x, int type);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_dbl_allreduce_impl)> nrnmpi_dbl_allreduce;\nextern \"C\" void nrnmpi_dbl_allreduce_vec_impl(double* src, double* dest, int cnt, int type);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_dbl_allreduce_vec_impl)>\n    nrnmpi_dbl_allreduce_vec;\nextern \"C\" void nrnmpi_long_allreduce_vec_impl(long* src, long* dest, int cnt, int type);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_long_allreduce_vec_impl)>\n    nrnmpi_long_allreduce_vec;\nextern \"C\" bool nrnmpi_initialized_impl();\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_initialized_impl)> nrnmpi_initialized;\nextern \"C\" void nrnmpi_abort_impl(int);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_abort_impl)> nrnmpi_abort;\nextern \"C\" double nrnmpi_wtime_impl();\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_wtime_impl)> nrnmpi_wtime;\nextern \"C\" int nrnmpi_local_rank_impl();\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_local_rank_impl)> nrnmpi_local_rank;\nextern \"C\" int nrnmpi_local_size_impl();\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_local_size_impl)> nrnmpi_local_size;\n#if NRN_MULTISEND\nextern \"C\" void nrnmpi_multisend_comm_impl();\nextern 
mpi_function<cnrn_make_integral_constant_t(nrnmpi_multisend_comm_impl)>\n    nrnmpi_multisend_comm;\nextern \"C\" void nrnmpi_multisend_impl(NRNMPI_Spike* spk, int n, int* hosts);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_multisend_impl)> nrnmpi_multisend;\nextern \"C\" int nrnmpi_multisend_single_advance_impl(NRNMPI_Spike* spk);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_multisend_single_advance_impl)>\n    nrnmpi_multisend_single_advance;\nextern \"C\" int nrnmpi_multisend_conserve_impl(int nsend, int nrecv);\nextern mpi_function<cnrn_make_integral_constant_t(nrnmpi_multisend_conserve_impl)>\n    nrnmpi_multisend_conserve;\n#endif\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/mpi/nrnmpiuse.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n/* define to 1 if you want MPI specific features activated\n   (optionally provided by CMake option NRNMPI) */\n#ifndef NRNMPI\n#define NRNMPI 1\n#endif\n\n/* define to 1 if you want multisend spike exchange available */\n#ifndef NRN_MULTISEND\n#define NRN_MULTISEND 1\n#endif\n\n/* define to 1 if you want parallel distributed cells (and gap junctions) */\n#define PARANEURON 1\n\n/* define to 1 if you want the MUSIC - MUlti SImulation Coordinator */\n#undef NRN_MUSIC\n\n/* define to the dll path if you want to load automatically */\n#undef DLL_DEFAULT_FNAME\n\n/* Number of times to retry a failed open */\n#undef FILE_OPEN_RETRY\n\n/* Define to 1 for possibility of rank 0 xopen/ropen a file and broadcast everywhere */\n#undef USE_NRNFILEWRAP\n"
  },
  {
    "path": "coreneuron/network/cvodestb.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n// solver CVode stub to allow cvode as dll for mswindows version.\n\n#include \"coreneuron/network/netcvode.hpp\"\n#include \"coreneuron/utils/vrecitem.h\"\n\n#include \"coreneuron/gpu/nrn_acc_manager.hpp\"\n\nnamespace coreneuron {\n\n// for fixed step thread\n// check thresholds and deliver all (including binqueue) events\n// up to t+dt/2\nvoid deliver_net_events(NrnThread* nt) {\n    if (net_cvode_instance) {\n        net_cvode_instance->check_thresh(nt);\n        net_cvode_instance->deliver_net_events(nt);\n    }\n}\n\n// deliver events (but not binqueue)  up to nt->_t\nvoid nrn_deliver_events(NrnThread* nt) {\n    double tsav = nt->_t;\n    if (net_cvode_instance) {\n        net_cvode_instance->deliver_events(tsav, nt);\n    }\n    nt->_t = tsav;\n\n    /*before executing on gpu, we have to update the NetReceiveBuffer_t on GPU */\n    update_net_receive_buffer(nt);\n\n    for (auto& net_buf_receive: corenrn.get_net_buf_receive()) {\n        (*net_buf_receive.first)(nt);\n    }\n}\n\nvoid clear_event_queue() {\n    if (net_cvode_instance) {\n        net_cvode_instance->clear_events();\n    }\n}\n\nvoid init_net_events() {\n    if (net_cvode_instance) {\n        net_cvode_instance->init_events();\n    }\n\n#ifdef CORENEURON_ENABLE_GPU\n    /* weight vectors could be updated (from INITIAL block of NET_RECEIVE, update those on GPU's */\n    for (int ith = 0; ith < nrn_nthread; ++ith) {\n        NrnThread* nt = nrn_threads + ith;\n        double* weights = nt->weights;\n        int n_weight = nt->n_weight;\n        if (n_weight && nt->compute_gpu) {\n            
nrn_pragma_acc(update device(weights [0:n_weight]))\n            nrn_pragma_omp(target update to(weights [0:n_weight]))\n        }\n    }\n#endif\n}\n\nvoid nrn_play_init() {\n    for (int ith = 0; ith < nrn_nthread; ++ith) {\n        NrnThread* nt = nrn_threads + ith;\n        for (int i = 0; i < nt->n_vecplay; ++i) {\n            ((PlayRecord*) nt->_vecplay[i])->play_init();\n        }\n    }\n}\n\nvoid fixed_play_continuous(NrnThread* nt) {\n    for (int i = 0; i < nt->n_vecplay; ++i) {\n        ((PlayRecord*) nt->_vecplay[i])->continuous(nt->_t);\n    }\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/have2want.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n/*\nTo be included by a file that desires rendezvous rank exchange functionality.\nNeed to define HAVEWANT_t, HAVEWANT_alltoallv, and HAVEWANT2Int\n*/\n\n#ifdef have2want_h\n#error \"This implementation can only be included once\"\n/* The static function names could involve a macro name. */\n#endif\n\n#define have2want_h\n\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n#include \"coreneuron/mpi/core/nrnmpi.hpp\"\n\n/*\n\nA rank owns a set of HAVEWANT_t keys and wants information associated with\na set of HAVEWANT_t keys owned by unknown ranks.  Owners do not know which\nranks want their information. Ranks that want info do not know which ranks\nown that info.\n\nThe have_to_want function returns two new vectors of keys along with\nassociated count and displacement vectors of length nrnmpi_numprocs and nrnmpi_numprocs+1\nrespectively. 
Note that a send_to_want_displ[i+1] =\n  send_to_want_cnt[i] + send_to_want_displ[i] .\n\nsend_to_want[send_to_want_displ[i] to send_to_want_displ[i+1]] contains\nthe keys from this rank for which rank i wants information.\n\nrecv_from_have[recv_from_have_displ[i] to recv_from_have_displ[i+1] contains\nthe keys from which rank i is sending information to this rank.\n\nNote that on rank i, the order of keys in the rank j area of send_to_want\nis the same order of keys on rank j in the ith area in recv_from_have.\n\nThe rendezvous_rank function is used to parallelize this computation\nand minimize memory usage so that no single rank ever needs to know all keys.\n*/\n\n#ifndef HAVEWANT_t\n#define HAVEWANT_t int\n#endif\nnamespace coreneuron {\n// round robin default rendezvous rank function\nstatic int default_rendezvous(HAVEWANT_t key) {\n    return key % nrnmpi_numprocs;\n}\n\nstatic int* cnt2displ(int* cnt) {\n    int* displ = new int[nrnmpi_numprocs + 1];\n    displ[0] = 0;\n    for (int i = 0; i < nrnmpi_numprocs; ++i) {\n        displ[i + 1] = displ[i] + cnt[i];\n    }\n    return displ;\n}\n\nstatic int* srccnt2destcnt(int* srccnt) {\n    int* destcnt = new int[nrnmpi_numprocs];\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        nrnmpi_int_alltoall(srccnt, destcnt, 1);\n    } else\n#endif\n    {\n        for (int i = 0; i < nrnmpi_numprocs; ++i) {\n            destcnt[i] = srccnt[i];\n        }\n    }\n    return destcnt;\n}\n\nstatic void rendezvous_rank_get(HAVEWANT_t* data,\n                                int size,\n                                HAVEWANT_t*& sdata,\n                                int*& scnt,\n                                int*& sdispl,\n                                HAVEWANT_t*& rdata,\n                                int*& rcnt,\n                                int*& rdispl,\n                                int (*rendezvous_rank)(HAVEWANT_t)) {\n    // count what gets sent\n    scnt = new int[nrnmpi_numprocs];\n    for (int i 
= 0; i < nrnmpi_numprocs; ++i) {\n        scnt[i] = 0;\n    }\n    for (int i = 0; i < size; ++i) {\n        int r = (*rendezvous_rank)(data[i]);\n        ++scnt[r];\n    }\n\n    sdispl = cnt2displ(scnt);\n    rcnt = srccnt2destcnt(scnt);\n    rdispl = cnt2displ(rcnt);\n    sdata = new HAVEWANT_t[sdispl[nrnmpi_numprocs]];\n    rdata = new HAVEWANT_t[rdispl[nrnmpi_numprocs]];\n    // scatter data into sdata by recalculating scnt.\n    for (int i = 0; i < nrnmpi_numprocs; ++i) {\n        scnt[i] = 0;\n    }\n    for (int i = 0; i < size; ++i) {\n        int r = (*rendezvous_rank)(data[i]);\n        sdata[sdispl[r] + scnt[r]] = data[i];\n        ++scnt[r];\n    }\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        HAVEWANT_alltoallv(sdata, scnt, sdispl, rdata, rcnt, rdispl);\n    } else\n#endif\n    {\n        for (int i = 0; i < sdispl[nrnmpi_numprocs]; ++i) {\n            rdata[i] = sdata[i];\n        }\n    }\n}\n\nstatic void have_to_want(HAVEWANT_t* have,\n                         int have_size,\n                         HAVEWANT_t* want,\n                         int want_size,\n                         HAVEWANT_t*& send_to_want,\n                         int*& send_to_want_cnt,\n                         int*& send_to_want_displ,\n                         HAVEWANT_t*& recv_from_have,\n                         int*& recv_from_have_cnt,\n                         int*& recv_from_have_displ,\n                         int (*rendezvous_rank)(HAVEWANT_t)) {\n    // 1) Send have and want to the rendezvous ranks.\n    // 2) Rendezvous rank matches have and want.\n    // 3) Rendezvous ranks tell the want ranks which ranks own the keys\n    // 4) Ranks that want tell owner ranks where to send.\n\n    // 1) Send have and want to the rendezvous ranks.\n    HAVEWANT_t *have_s_data, *have_r_data;\n    int *have_s_cnt, *have_s_displ, *have_r_cnt, *have_r_displ;\n    rendezvous_rank_get(have,\n                        have_size,\n                        have_s_data,\n      
                  have_s_cnt,\n                        have_s_displ,\n                        have_r_data,\n                        have_r_cnt,\n                        have_r_displ,\n                        rendezvous_rank);\n    // assume it is an error if two ranks have the same key so create\n    // hash table of key2rank. Will also need it for matching have and want\n    HAVEWANT2Int havekey2rank = HAVEWANT2Int();\n    for (int r = 0; r < nrnmpi_numprocs; ++r) {\n        for (int i = 0; i < have_r_cnt[r]; ++i) {\n            HAVEWANT_t key = have_r_data[have_r_displ[r] + i];\n            if (havekey2rank.find(key) != havekey2rank.end()) {\n                char buf[200];\n                sprintf(buf, \"key %lld owned by multiple ranks\\n\", (long long) key);\n                hoc_execerror(buf, 0);\n            }\n            havekey2rank[key] = r;\n        }\n    }\n    delete[] have_s_data;\n    delete[] have_s_cnt;\n    delete[] have_s_displ;\n    delete[] have_r_data;\n    delete[] have_r_cnt;\n    delete[] have_r_displ;\n\n    HAVEWANT_t *want_s_data, *want_r_data;\n    int *want_s_cnt, *want_s_displ, *want_r_cnt, *want_r_displ;\n    rendezvous_rank_get(want,\n                        want_size,\n                        want_s_data,\n                        want_s_cnt,\n                        want_s_displ,\n                        want_r_data,\n                        want_r_cnt,\n                        want_r_displ,\n                        rendezvous_rank);\n\n    // 2) Rendezvous rank matches have and want.\n    //    we already have made the havekey2rank map.\n    // Create an array parallel to want_r_data which contains the ranks that\n    // have that data.\n    int n = want_r_displ[nrnmpi_numprocs];\n    int* want_r_ownerranks = new int[n];\n    for (int r = 0; r < nrnmpi_numprocs; ++r) {\n        for (int i = 0; i < want_r_cnt[r]; ++i) {\n            int ix = want_r_displ[r] + i;\n            HAVEWANT_t key = want_r_data[ix];\n            if 
(havekey2rank.find(key) == havekey2rank.end()) {\n                char buf[200];\n                sprintf(buf, \"key = %lld is wanted but does not exist\\n\", (long long) key);\n                hoc_execerror(buf, 0);\n            }\n            want_r_ownerranks[ix] = havekey2rank[key];\n        }\n    }\n    delete[] want_r_data;\n\n    // 3) Rendezvous ranks tell the want ranks which ranks own the keys\n    // The ranks that want keys need to know the ranks that own those keys.\n    // The want_s_ownerranks will be parallel to the want_s_data.\n    // That is, each item defines the rank from which information associated\n    // with that key is coming.\n    int* want_s_ownerranks = new int[want_s_displ[nrnmpi_numprocs]];\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        nrnmpi_int_alltoallv(want_r_ownerranks,\n                             want_r_cnt,\n                             want_r_displ,\n                             want_s_ownerranks,\n                             want_s_cnt,\n                             want_s_displ);\n    } else\n#endif\n    {\n        for (int i = 0; i < want_r_displ[nrnmpi_numprocs]; ++i) {\n            want_s_ownerranks[i] = want_r_ownerranks[i];\n        }\n    }\n    delete[] want_r_ownerranks;\n    delete[] want_r_cnt;\n    delete[] want_r_displ;\n\n    // 4) Ranks that want tell owner ranks where to send.\n    // Finished with the rendezvous ranks. The ranks that want keys know the\n    // owner ranks for those keys. The next step is for the want ranks to\n    // tell the owner ranks where to send.\n    // The parallel want_s_ownerranks and want_s_data are now uselessly ordered\n    // by rendezvous rank. 
Reorganize so that want ranks can tell owner ranks\n    // what they want.\n    n = want_s_displ[nrnmpi_numprocs];\n    delete[] want_s_displ;\n    for (int i = 0; i < nrnmpi_numprocs; ++i) {\n        want_s_cnt[i] = 0;\n    }\n    HAVEWANT_t* old_want_s_data = want_s_data;\n    want_s_data = new HAVEWANT_t[n];\n    // compute the counts\n    for (int i = 0; i < n; ++i) {\n        int r = want_s_ownerranks[i];\n        ++want_s_cnt[r];\n    }\n    want_s_displ = cnt2displ(want_s_cnt);\n    for (int i = 0; i < nrnmpi_numprocs; ++i) {\n        want_s_cnt[i] = 0;\n    }  // recount while filling\n    for (int i = 0; i < n; ++i) {\n        int r = want_s_ownerranks[i];\n        HAVEWANT_t key = old_want_s_data[i];\n        want_s_data[want_s_displ[r] + want_s_cnt[r]] = key;\n        ++want_s_cnt[r];\n    }\n    delete[] want_s_ownerranks;\n    delete[] old_want_s_data;\n    want_r_cnt = srccnt2destcnt(want_s_cnt);\n    want_r_displ = cnt2displ(want_r_cnt);\n    want_r_data = new HAVEWANT_t[want_r_displ[nrnmpi_numprocs]];\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        HAVEWANT_alltoallv(\n            want_s_data, want_s_cnt, want_s_displ, want_r_data, want_r_cnt, want_r_displ);\n    } else\n#endif\n    {\n        for (int i = 0; i < want_s_displ[nrnmpi_numprocs]; ++i) {\n            want_r_data[i] = want_s_data[i];\n        }\n    }\n    // now the want_r_data on the have_ranks are grouped according to the ranks\n    // that want those keys.\n\n    send_to_want = want_r_data;\n    send_to_want_cnt = want_r_cnt;\n    send_to_want_displ = want_r_displ;\n    recv_from_have = want_s_data;\n    recv_from_have_cnt = want_s_cnt;\n    recv_from_have_displ = want_s_displ;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/multisend.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/network/multisend.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/network/netcon.hpp\"\n#include \"coreneuron/network/netcvode.hpp\"\n\n/*\nOverall exchange strategy\n\nWhen a cell spikes, it immediately does a multisend of\n(int gid, double spiketime) to all the target machines that have\ncells that need to receive this spike by spiketime + delay.\nThe MPI implementation does not block due to use of MPI_Isend.\n\nIn order to minimize the number of nrnmpi_multisend_conserve tests\n(and potentially abandon them altogether if I can ever guarantee\nthat exchange time is less than half the computation time), I divide the\nminimum delay integration intervals into two equal subintervals.\nSo if a spike is generated in an even subinterval, I do not have\nto include it in the conservation check until the end of the next even\nsubinterval.\n\nWhen a spike is received (generally MPI_Iprobe, MPI_Recv) it is placed in\neven or odd buffers (depending on whether the coded gid is positive or negative).\n\nAt the end of a computation subinterval the even or odd buffer spikes\nare enqueued in the priority queue after checking that the number\nof spikes sent is equal to the number of spikes received.\n*/\n\n// The initial idea behind the optional phase2 is to avoid the large overhead of\n// initiating a send of the up to 10k list of target hosts when a cell fires.\n// I.e. 
when there are a small number of cells on a processor, this causes\n// load balance problems.\n// Load balance should be better if the send is distributed to a much smaller\n// set of targets, which, when they receive the spike, pass it on to a neighbor\n// set. A non-exclusive alternative to this is the use of RECORD_REPLAY\n// which gives a very fast initiation but we have not been able to get that\n// to complete in the sense of all the targets receiving their spikes before\n// the conservation step.\n// We expect that phase2 will work best in combination with ENQUEUE=2\n// which has the greatest amount of overlap between computation\n// and communication.\nnamespace coreneuron {\nbool use_multisend_;\nbool use_phase2_;\nint n_multisend_interval = 2;\n\n#if NRN_MULTISEND\n\nstatic int n_xtra_cons_check_;\n#define MAXNCONS 10\n#if MAXNCONS\nstatic int xtra_cons_hist_[MAXNCONS + 1];\n#endif\n\n// ENQUEUE 0 means to Multisend_ReceiveBuffer buffer -> InputPreSyn.send\n// ENQUEUE 1 means to Multisend_ReceiveBuffer buffer -> psbuf -> InputPreSyn.send\n// ENQUEUE 2 means to Multisend_ReceiveBuffer.incoming -> InputPreSyn.send\n// Note that ENQUEUE 2 gives more overlap between computation and exchange\n// since the enqueuing takes place during computation except for those\n// remaining during conservation.\n#define ENQUEUE 2\n\n#if ENQUEUE == 2\nstatic unsigned long enq2_find_time_;\nstatic unsigned long enq2_enqueue_time_;  // includes enq2_find_time_\n#endif\n\n#define PHASE2BUFFER_SIZE 2048  // power of 2\n#define PHASE2BUFFER_MASK (PHASE2BUFFER_SIZE - 1)\nstruct Phase2Buffer {\n    InputPreSyn* ps;\n    double spiketime;\n    int gid;\n};\n\n#define MULTISEND_RECEIVEBUFFER_SIZE 10000\nclass Multisend_ReceiveBuffer {\n  public:\n    Multisend_ReceiveBuffer();\n    virtual ~Multisend_ReceiveBuffer();\n    void init(int index);\n    void incoming(int gid, double spiketime);\n    void enqueue();\n    int index_{};\n    int size_{MULTISEND_RECEIVEBUFFER_SIZE};\n    int 
count_{};\n    int maxcount_{};\n    bool busy_{};\n    int nsend_{}, nrecv_{};  // for checking conservation\n    int nsend_cell_{};       // cells that spiked this interval.\n    NRNMPI_Spike** buffer_{};\n\n    void enqueue1();\n    void enqueue2();\n    InputPreSyn** psbuf_{};\n\n    void phase2send();\n    int phase2_head_{};\n    int phase2_tail_{};\n    int phase2_nsend_cell_{}, phase2_nsend_{};\n    Phase2Buffer* phase2_buffer_{};\n};\n\n#define MULTISEND_INTERVAL 2\nstatic Multisend_ReceiveBuffer* multisend_receive_buffer[MULTISEND_INTERVAL];\nstatic int current_rbuf, next_rbuf;\n#if MULTISEND_INTERVAL == 2\n// note that if a spike is supposed to be received by multisend_receive_buffer[1]\n// then during transmission its gid is complemented.\n#endif\n\nstatic int* targets_phase1_;\nstatic int* targets_phase2_;\n\nvoid nrn_multisend_send(PreSyn* ps, double t, NrnThread* nt) {\n    int i = ps->multisend_index_;\n    if (i >= 0) {\n        // format is cnt, cnt_phase1, array of target ranks.\n        // Valid for one or two phase.\n        int* ranks = targets_phase1_ + i;\n        int cnt = ranks[0];\n        int cnt_phase1 = ranks[1];\n        ranks += 2;\n        NRNMPI_Spike spk;\n        spk.gid = ps->output_index_;\n        spk.spiketime = t;\n        if (next_rbuf == 1) {\n            spk.gid = ~spk.gid;\n        }\n        if (nt == nrn_threads) {\n            multisend_receive_buffer[next_rbuf]->nsend_ += cnt;\n            multisend_receive_buffer[next_rbuf]->nsend_cell_ += 1;\n            nrnmpi_multisend(&spk, cnt_phase1, ranks);\n        } else {\n            assert(0);\n        }\n    }\n}\n\nstatic void multisend_send_phase2(InputPreSyn* ps, int gid, double t) {\n    int i = ps->multisend_phase2_index_;\n    assert(i >= 0);\n    // format is cnt_phase2, array of target ranks\n    int* ranks = targets_phase2_ + i;\n    int cnt_phase2 = ranks[0];\n    ranks += 1;\n    NRNMPI_Spike spk;\n    spk.gid = gid;\n    spk.spiketime = t;\n    
nrnmpi_multisend(&spk, cnt_phase2, ranks);\n}\n\nMultisend_ReceiveBuffer::Multisend_ReceiveBuffer()\n    : buffer_ {\n    new NRNMPI_Spike*[size_]\n}\n#if ENQUEUE == 1\n, psbuf_ {\n    new InputPreSyn*[size_]\n}\n#endif\n, phase2_buffer_{new Phase2Buffer[PHASE2BUFFER_SIZE]} {}\n\nMultisend_ReceiveBuffer::~Multisend_ReceiveBuffer() {\n    nrn_assert(!busy_);\n    for (int i = 0; i < count_; ++i) {\n        delete buffer_[i];\n    }\n    delete[] buffer_;\n    if (psbuf_)\n        delete[] psbuf_;\n    delete[] phase2_buffer_;\n}\nvoid Multisend_ReceiveBuffer::init(int index) {\n    index_ = index;\n    nsend_cell_ = nsend_ = nrecv_ = maxcount_ = 0;\n    busy_ = false;\n    for (int i = 0; i < count_; ++i) {\n        delete buffer_[i];\n    }\n    count_ = 0;\n\n    phase2_head_ = phase2_tail_ = 0;\n    phase2_nsend_cell_ = phase2_nsend_ = 0;\n}\nvoid Multisend_ReceiveBuffer::incoming(int gid, double spiketime) {\n    // printf(\"%d %p.incoming %g %g %d\\n\", nrnmpi_myid, this, t, spk->spiketime, spk->gid);\n    nrn_assert(!busy_);\n    busy_ = true;\n\n    if (count_ >= size_) {\n        size_ *= 2;\n        NRNMPI_Spike** newbuf = new NRNMPI_Spike*[size_];\n        for (int i = 0; i < count_; ++i) {\n            newbuf[i] = buffer_[i];\n        }\n        delete[] buffer_;\n        buffer_ = newbuf;\n        if (psbuf_) {\n            delete[] psbuf_;\n            psbuf_ = new InputPreSyn*[size_];\n        }\n    }\n    NRNMPI_Spike* spk = new NRNMPI_Spike();\n    spk->gid = gid;\n    spk->spiketime = spiketime;\n    buffer_[count_++] = spk;\n    if (maxcount_ < count_) {\n        maxcount_ = count_;\n    }\n\n    ++nrecv_;\n    busy_ = false;\n}\nvoid Multisend_ReceiveBuffer::enqueue() {\n    // printf(\"%d %p.enqueue count=%d t=%g nrecv=%d nsend=%d\\n\", nrnmpi_myid, this, t, count_,\n    // nrecv_, nsend_);\n    nrn_assert(!busy_);\n    busy_ = true;\n\n    for (int i = 0; i < count_; ++i) {\n        NRNMPI_Spike* spk = buffer_[i];\n\n        auto gid2in_it = 
gid2in.find(spk->gid);\n        assert(gid2in_it != gid2in.end());\n        InputPreSyn* ps = gid2in_it->second;\n\n        if (use_phase2_ && ps->multisend_phase2_index_ >= 0) {\n            Phase2Buffer& pb = phase2_buffer_[phase2_head_++];\n            phase2_head_ &= PHASE2BUFFER_MASK;\n            assert(phase2_head_ != phase2_tail_);\n            pb.ps = ps;\n            pb.spiketime = spk->spiketime;\n            pb.gid = spk->gid;\n        }\n\n        ps->send(spk->spiketime, net_cvode_instance, nrn_threads);\n        delete spk;\n    }\n\n    count_ = 0;\n#if ENQUEUE != 2\n    nrecv_ = 0;\n    nsend_ = 0;\n    nsend_cell_ = 0;\n#endif\n    busy_ = false;\n    phase2send();\n}\n\nvoid Multisend_ReceiveBuffer::enqueue1() {\n    // printf(\"%d %lx.enqueue count=%d t=%g nrecv=%d nsend=%d\\n\", nrnmpi_myid, (long)this, t,\n    // count_, nrecv_, nsend_);\n    nrn_assert(!busy_);\n    busy_ = true;\n    for (int i = 0; i < count_; ++i) {\n        NRNMPI_Spike* spk = buffer_[i];\n\n        auto gid2in_it = gid2in.find(spk->gid);\n        assert(gid2in_it != gid2in.end());\n        InputPreSyn* ps = gid2in_it->second;\n        psbuf_[i] = ps;\n        if (use_phase2_ && ps->multisend_phase2_index_ >= 0) {\n            Phase2Buffer& pb = phase2_buffer_[phase2_head_++];\n            phase2_head_ &= PHASE2BUFFER_MASK;\n            assert(phase2_head_ != phase2_tail_);\n            pb.ps = ps;\n            pb.spiketime = spk->spiketime;\n            pb.gid = spk->gid;\n        }\n    }\n    busy_ = false;\n    phase2send();\n}\n\nvoid Multisend_ReceiveBuffer::enqueue2() {\n    // printf(\"%d %lx.enqueue count=%d t=%g nrecv=%d nsend=%d\\n\", nrnmpi_myid, (long)this, t,\n    // count_, nrecv_, nsend_);\n    nrn_assert(!busy_);\n    busy_ = true;  // mark the buffer busy while draining it, as enqueue() and enqueue1() do\n    for (int i = 0; i < count_; ++i) {\n        NRNMPI_Spike* spk = buffer_[i];\n        InputPreSyn* ps = psbuf_[i];\n        ps->send(spk->spiketime, net_cvode_instance, nrn_threads);\n        delete spk;\n    }\n    count_ 
= 0;\n    nrecv_ = 0;\n    nsend_ = 0;\n    nsend_cell_ = 0;\n    busy_ = false;\n}\n\nvoid Multisend_ReceiveBuffer::phase2send() {\n    while (phase2_head_ != phase2_tail_) {\n        Phase2Buffer& pb = phase2_buffer_[phase2_tail_++];\n        phase2_tail_ &= PHASE2BUFFER_MASK;\n        int gid = pb.gid;\n        if (index_) {\n            gid = ~gid;\n        }\n        multisend_send_phase2(pb.ps, gid, pb.spiketime);\n    }\n}\n\nstatic int max_ntarget_host;\n// For one phase sending, max_multisend_targets is max_ntarget_host.\n// For two phase sending, it is the maximum of all the\n// ntarget_hosts_phase1 and ntarget_hosts_phase2.\nstatic int max_multisend_targets;\n\nvoid nrn_multisend_init() {\n    for (int i = 0; i < n_multisend_interval; ++i) {\n        multisend_receive_buffer[i]->init(i);\n    }\n    current_rbuf = 0;\n    next_rbuf = n_multisend_interval - 1;\n#if ENQUEUE == 2\n    enq2_find_time_ = enq2_enqueue_time_ = 0;\n#endif\n    n_xtra_cons_check_ = 0;\n#if MAXNCONS\n    for (int i = 0; i <= MAXNCONS; ++i) {\n        xtra_cons_hist_[i] = 0;\n    }\n#endif  // MAXNCONS\n}\n\nstatic int multisend_advance() {\n    NRNMPI_Spike spk;\n    int i = 0;\n    while (nrnmpi_multisend_single_advance(&spk)) {\n        i += 1;\n        int j = 0;\n#if MULTISEND_INTERVAL == 2\n        if (spk.gid < 0) {\n            spk.gid = ~spk.gid;\n            j = 1;\n        }\n#endif\n        multisend_receive_buffer[j]->incoming(spk.gid, spk.spiketime);\n    }\n    return i;\n}\n\n#if NRN_MULTISEND\nvoid nrn_multisend_advance() {\n    if (use_multisend_) {\n        multisend_advance();\n#if ENQUEUE == 2\n        multisend_receive_buffer[current_rbuf]->enqueue();\n#endif\n    }\n}\n#endif\n\nvoid nrn_multisend_receive(NrnThread* nt) {\n    //\tnrn_spike_exchange();\n    assert(nt == nrn_threads);\n    //\tdouble w1, w2;\n    int ncons = 0;\n    int& s = multisend_receive_buffer[current_rbuf]->nsend_;\n    int& r = multisend_receive_buffer[current_rbuf]->nrecv_;\n//\tw1 = 
nrn_wtime();\n#if NRN_MULTISEND & 1\n    if (use_multisend_) {\n        nrn_multisend_advance();\n        nrnmpi_barrier();\n        nrn_multisend_advance();\n        // with two phase we expect conservation to hold and ncons should\n        // be 0.\n        while (nrnmpi_multisend_conserve(s, r) != 0) {\n            nrn_multisend_advance();\n            ++ncons;\n        }\n    }\n#endif\n    //\tw1 = nrn_wtime() - w1;\n    //\tw2 = nrn_wtime();\n\n#if ENQUEUE == 0\n    multisend_receive_buffer[current_rbuf]->enqueue();\n#endif\n#if ENQUEUE == 1\n    multisend_receive_buffer[current_rbuf]->enqueue1();\n    multisend_receive_buffer[current_rbuf]->enqueue2();\n#endif\n#if ENQUEUE == 2\n    multisend_receive_buffer[current_rbuf]->enqueue();\n    s = r = multisend_receive_buffer[current_rbuf]->nsend_cell_ = 0;\n\n    multisend_receive_buffer[current_rbuf]->phase2_nsend_cell_ = 0;\n    multisend_receive_buffer[current_rbuf]->phase2_nsend_ = 0;\n\n    enq2_find_time_ = 0;\n    enq2_enqueue_time_ = 0;\n#endif  // ENQUEUE == 2\n//\twt1_ = nrn_wtime() - w2;\n//\twt_ = w1;\n#if MULTISEND_INTERVAL == 2\n    // printf(\"%d reverse buffers %g\\n\", nrnmpi_myid, t);\n    if (n_multisend_interval == 2) {\n        current_rbuf = next_rbuf;\n        next_rbuf = ((next_rbuf + 1) & 1);\n    }\n#endif\n}\n\nvoid nrn_multisend_cleanup() {\n    if (targets_phase1_) {\n        delete[] targets_phase1_;\n        targets_phase1_ = nullptr;\n    }\n\n    if (targets_phase2_) {\n        delete[] targets_phase2_;\n        targets_phase2_ = nullptr;\n    }\n\n    // cleanup MultisendReceiveBuffer here as well\n}\n\nvoid nrn_multisend_setup() {\n    nrn_multisend_cleanup();\n    if (!use_multisend_) {\n        return;\n    }\n    nrnmpi_multisend_comm();\n    // if (nrnmpi_myid == 0) printf(\"multisend_setup()\\n\");\n    // although we only care about the set of hosts that gid2out_\n    // sends spikes to (source centric). 
We do not want to send\n    // the entire list of gid2in (which may be 10000 times larger\n    // than gid2out) from every machine to every machine.\n    // so we accomplish the task in two phases the first of which\n    // involves allgather with a total receive buffer size of number\n    // of cells (even that is too large and we will split it up\n    // into chunks). And the second, an\n    // allreduce with receive buffer size of number of hosts.\n    max_ntarget_host = 0;\n    max_multisend_targets = 0;\n\n    // completely new algorithm does one and two phase.\n    nrn_multisend_setup_targets(use_phase2_, targets_phase1_, targets_phase2_);\n\n    if (!multisend_receive_buffer[0]) {\n        multisend_receive_buffer[0] = new Multisend_ReceiveBuffer();\n    }\n#if MULTISEND_INTERVAL == 2\n    if (n_multisend_interval == 2 && !multisend_receive_buffer[1]) {\n        multisend_receive_buffer[1] = new Multisend_ReceiveBuffer();\n    }\n#endif\n}\n#endif  // NRN_MULTISEND\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/multisend.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include \"coreneuron/mpi/nrnmpiuse.h\"\nnamespace coreneuron {\nextern bool use_multisend_;\nextern int n_multisend_interval;\nextern bool use_phase2_;\n\nclass PreSyn;\nstruct NrnThread;\n\nvoid nrn_multisend_send(PreSyn*, double t, NrnThread*);\nvoid nrn_multisend_receive(NrnThread*);  // must be thread 0\nvoid nrn_multisend_advance();\nvoid nrn_multisend_init();\n\nvoid nrn_multisend_cleanup();\nvoid nrn_multisend_setup();\n\nvoid nrn_multisend_setup_targets(bool use_phase2, int*& targets_phase1, int*& targets_phase2);\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/multisend_setup.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <cstdio>\n#include <cmath>\n#include <numeric>\n\n#if CORENRN_DEBUG\n#include <fstream>\n#include <iomanip>\n#endif\n\n#include \"coreneuron/utils/randoms/nrnran123.h\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/network/multisend.hpp\"\n#include \"coreneuron/mpi/nrnmpidec.h\"\n#include \"coreneuron/mpi/core/nrnmpi.hpp\"\n#include \"coreneuron/utils/memory_utils.h\"\n#include \"coreneuron/utils/utils.hpp\"\n/*\nFor very large numbers of processors and cells and fanout, it is taking\na long time to figure out each cell's target list given the input gids\n(gid2in) on each host. e.g. 240 seconds for 2^25 cells, 1k connections\nper cell, and 128K cores; and 340 seconds for two phase exchange.\nTo reduce this setup time we experiment with a very different algorithm in which\nwe construct a gid target host list on host gid%nhost and copy that list to\nthe source host owning the gid.\n*/\n\n#if NRN_MULTISEND\nnamespace coreneuron {\nusing Gid2IPS = std::map<int, InputPreSyn*>;\nusing Gid2PS = std::map<int, PreSyn*>;\n\n#if CORENRN_DEBUG\ntemplate <typename T>\nstatic void celldebug(const char* p, T& map) {\n    std::string fname = std::string(\"debug.\") + std::to_string(nrnmpi_myid);\n    std::ofstream f(fname, std::ios::app);\n    f << std::endl << p << std::endl;\n    int rank = nrnmpi_myid;\n    f << \"  \" << std::setw(2) << std::setfill('0') << rank << \":\";\n    for (const auto& m: map) {\n        int gid = m.first;\n        f << \"  \" << std::setw(2) << std::setfill('0') << gid << \":\";\n    }\n    f << std::endl;\n}\n\nstatic void alltoalldebug(const char* p,\n                          const std::vector<int>& s,\n                          const 
std::vector<int>& scnt,\n                          const std::vector<int>& sdispl,\n                          const std::vector<int>& r,\n                          const std::vector<int>& rcnt,\n                          const std::vector<int>& rdispl) {\n    std::string fname = std::string(\"debug.\") + std::to_string(nrnmpi_myid);\n    std::ofstream f(fname, std::ios::app);\n    f << std::endl << p << std::endl;\n    int rank = nrnmpi_myid;\n    f << \"  rank \" << rank << std::endl;\n    for (int i = 0; i < nrnmpi_numprocs; ++i) {\n        f << \"    s\" << i << \" : \" << scnt[i] << \" \" << sdispl[i] << \" :\";\n        for (int j = sdispl[i]; j < sdispl[i + 1]; ++j) {\n            f << \"  \" << std::setw(2) << std::setfill('0') << s[j] << \":\";\n        }\n        f << std::endl;\n    }\n    for (int i = 0; i < nrnmpi_numprocs; ++i) {\n        f << \"    r\" << i << \" : \" << rcnt[i] << \" \" << rdispl[i] << \" :\";\n        for (int j = rdispl[i]; j < rdispl[i + 1]; ++j) {\n            f << \"  \" << std::setw(2) << std::setfill('0') << r[j] << \":\";\n        }\n        f << std::endl;\n    }\n}\n#else\ntemplate <typename T>\nstatic void celldebug(const char*, T&) {}\nstatic void alltoalldebug(const char*,\n                          const std::vector<int>&,\n                          const std::vector<int>&,\n                          const std::vector<int>&,\n                          const std::vector<int>&,\n                          const std::vector<int>&,\n                          const std::vector<int>&) {}\n#endif\n\n#if CORENRN_DEBUG\nvoid phase1debug(int* targets_phase1) {\n    std::string fname = std::string(\"debug.\") + std::to_string(nrnmpi_myid);\n    std::ofstream f(fname, std::ios::app);\n    f << std::endl << \"phase1debug \" << nrnmpi_myid;\n    for (auto& g: gid2out) {\n        PreSyn* ps = g.second;\n        f << std::endl << \" \" << std::setw(2) << std::setfill('0') << ps->gid_ << \":\";\n        int* ranks = targets_phase1 + 
ps->multisend_index_;\n        int n = ranks[1];\n        ranks += 2;\n        for (int i = 0; i < n; ++i) {\n            f << \" \" << std::setw(2) << std::setfill('0') << ranks[i];\n        }\n    }\n    f << std::endl;\n}\n\nvoid phase2debug(int* targets_phase2) {\n    std::string fname = std::string(\"debug.\") + std::to_string(nrnmpi_myid);\n    std::ofstream f(fname, std::ios::app);\n    f << std::endl << \"phase2debug \" << nrnmpi_myid;\n    for (auto& g: gid2in) {\n        int gid = g.first;\n        InputPreSyn* ps = g.second;\n        f << std::endl << \" \" << std::setw(2) << std::setfill('0') << gid << \":\";\n        int j = ps->multisend_phase2_index_;\n        if (j >= 0) {\n            int* ranks = targets_phase2 + j;\n            int cnt = ranks[0];\n            ranks += 1;\n            for (int i = 0; i < cnt; ++i) {\n                f << \" \" << std::setw(2) << std::setfill('0') << ranks[i];\n            }\n        }\n    }\n    f << std::endl;\n}\n#endif\n\nstatic std::vector<int> newoffset(const std::vector<int>& acnt) {\n    std::vector<int> aoff(acnt.size() + 1);\n    aoff[0] = 0;\n    std::partial_sum(acnt.begin(), acnt.end(), aoff.begin() + 1);\n    return aoff;\n}\n\n// input: scnt, sdispl; output: rcnt, rdispl\nstatic std::pair<std::vector<int>, std::vector<int>> all2allv_helper(const std::vector<int>& scnt) {\n    int np = nrnmpi_numprocs;\n    std::vector<int> c(np, 1);\n    std::vector<int> rdispl = newoffset(c);\n    std::vector<int> rcnt(np, 0);\n    nrnmpi_int_alltoallv(\n        scnt.data(), c.data(), rdispl.data(), rcnt.data(), c.data(), rdispl.data());\n    rdispl = newoffset(rcnt);\n    return std::make_pair(std::move(rcnt), std::move(rdispl));\n}\n\n/*\ndefine following to 1 if desire space/performance information such as:\nall2allv_int gidin to intermediate space=1552 total=37345104 time=0.000495835\nall2allv_int gidout space=528 total=37379376 time=1.641e-05\nall2allv_int lists space=3088 total=37351312 
time=4.4708e-05\n*/\n\n#define all2allv_perf 0\n\n// input: s, scnt, sdispl; output: r, rdispl\nstatic std::pair<std::vector<int>, std::vector<int>> all2allv_int(const std::vector<int>& s,\n                                                                  const std::vector<int>& scnt,\n                                                                  const std::vector<int>& sdispl,\n                                                                  const char* dmes) {\n#if all2allv_perf\n    double tm = nrn_wtime();\n#endif\n    int np = nrnmpi_numprocs;\n\n    std::vector<int> rcnt;\n    std::vector<int> rdispl;\n    std::tie(rcnt, rdispl) = all2allv_helper(scnt);\n    std::vector<int> r(rdispl[np], 0);\n    nrnmpi_int_alltoallv(\n        s.data(), scnt.data(), sdispl.data(), r.data(), rcnt.data(), rdispl.data());\n    alltoalldebug(dmes, s, scnt, sdispl, r, rcnt, rdispl);\n\n#if all2allv_perf\n    if (nrnmpi_myid == 0) {\n        int nb = 4 * nrnmpi_numprocs + sdispl[nrnmpi_numprocs] + rdispl[nrnmpi_numprocs];\n        tm = nrn_wtime() - tm;\n        printf(\"all2allv_int %s space=%d total=%g time=%g\\n\", dmes, nb, nrn_mallinfo(), tm);\n    }\n#endif\n    return std::make_pair(std::move(r), std::move(rdispl));\n}\n\nclass TarList {\n  public:\n    TarList();\n    virtual ~TarList();\n    virtual void alloc();\n    int size;\n    int* list;\n    int rank;\n\n    int* indices;  // indices of list for groups of phase2 targets.\n                   // If indices is not null, then size is one less than\n                   // the size of the indices list where indices[size] = the size of\n                   // the list. 
Indices[0] is 0 and list[indices[i]] is the rank\n                   // to send the ith group of phase2 targets.\n};\n\nusing Int2TarList = std::map<int, TarList*>;\n\nTarList::TarList()\n    : size(0)\n    , list(nullptr)\n    , rank(-1)\n    , indices(nullptr) {}\n\nTarList::~TarList() {\n    delete[] list;\n    delete[] indices;\n}\n\nvoid TarList::alloc() {\n    if (size) {\n        list = new int[size];\n    }\n}\n\n// for two phase\n\nstatic nrnran123_State* ranstate{nullptr};\n\nstatic void random_init(int i) {\n    if (!ranstate) {\n        ranstate = nrnran123_newstream(i, 0);\n    }\n}\n\nstatic unsigned int get_random() {\n    return nrnran123_ipick(ranstate);\n}\n\n// Avoid warnings if the global index is changed on subsequent psolve.\nstatic void random_delete() {\n    if (ranstate) {\n        nrnran123_deletestream(ranstate);\n        ranstate = nullptr;\n    }\n}\n\nstatic int iran(int i1, int i2) {\n    // discrete uniform random integer from i1 to i2 inclusive. Must\n    // work if i1 == i2\n    if (i1 == i2) {\n        return i1;\n    }\n    int i3 = i1 + get_random() % (i2 - i1 + 1);\n    return i3;\n}\n\nstatic void phase2organize(TarList* tl) {\n    int nt = tl->size;\n    int n = int(sqrt(double(nt)));\n    // change to about 20\n    if (n > 1) {  // do not bother if not many connections\n        // equal as possible group sizes\n        tl->indices = new int[n + 1];\n        tl->indices[n] = tl->size;\n        tl->size = n;\n        for (int i = 0; i < n; ++i) {\n            tl->indices[i] = (i * nt) / n;\n        }\n        // Note: not sure the following is true anymore but it could be.\n        // This distribution is very biased (if 0 is a phase1 target\n        // it is always a phase2 sender). 
So now choose a random\n        // target in the subset and make that the phase2 sender\n        // (need to switch the indices[i] target and the one chosen)\n        for (int i = 0; i < n; ++i) {\n            int i1 = tl->indices[i];\n            int i2 = tl->indices[i + 1] - 1;\n            // need discrete uniform random integer from i1 to i2\n            int i3 = iran(i1, i2);\n            int itar = tl->list[i1];\n            tl->list[i1] = tl->list[i3];\n            tl->list[i3] = itar;\n        }\n    }\n}\n\n// end of twophase\n\n/*\nSetting up target lists uses a lot of temporary memory. It is conceivable\nthat this can be done prior to creating any cells or connections. I.e.\ngid2out is presently known from pc.set_gid2node(gid,...). Gid2in is presently\nknown from NetCon = pc.gid_connect(gid, target) and it is quite a style\nand hoc network programming change to use something like pc.need_gid(gid)\nbefore cells with their synapses are created since one would have to imagine\nthat the hoc network setup code would have to be executed in a virtual\nor 'abstract' fashion without actually creating cells, targets, or NetCons.\nAnyway, to potentially support this in the future, we write setup_target_lists\nto not use any PreSyn information.\n*/\n\nstatic std::vector<int> setup_target_lists(bool);\nstatic void fill_multisend_lists(bool, const std::vector<int>&, int*&, int*&);\n\nvoid nrn_multisend_setup_targets(bool use_phase2, int*& targets_phase1, int*& targets_phase2) {\n    auto r = setup_target_lists(use_phase2);\n\n    // initialize as unused\n    for (auto& g: gid2out) {\n        PreSyn* ps = g.second;\n        ps->multisend_index_ = -1;\n    }\n\n    // Only will be not -1 if non-nullptr input is a phase 2 sender.\n    for (auto& g: gid2in) {\n        InputPreSyn* ps = g.second;\n        ps->multisend_phase2_index_ = -1;\n    }\n\n    fill_multisend_lists(use_phase2, r, targets_phase1, targets_phase2);\n\n    // phase1debug(targets_phase1);\n    // 
phase2debug(targets_phase2);\n}\n\n// Some notes about threads and the rank lists.\n// Assume all MPI messages are sent and received from a single thread (0).\n// gid2in and gid2out are rank wide lists for all threads\n//\nstatic void fill_multisend_lists(bool use_phase2,\n                                 const std::vector<int>& r,\n                                 int*& targets_phase1,\n                                 int*& targets_phase2) {\n    // sequence of gid, size, [totalsize], list\n    // Note that totalsize is there only for output gid's and use_phase2.\n    // Using this sequence, copy lists to proper phase\n    // 1 and phase 2 lists. (Phase one lists found in gid2out_ and phase\n    // two lists found in gid2in_.)\n    int phase1_index = 0;\n    int phase2_index = 0;\n    // Count and fill in multisend_index_ and multisend_phase2_index_\n    // From the counts can allocate targets_phase1 and targets_phase2\n    // Then can iterate again and copy r to proper target locations.\n    for (std::size_t i = 0; i < r.size();) {\n        InputPreSyn* ips = nullptr;\n        int gid = r[i++];\n        int size = r[i++];\n        if (use_phase2) {  // look in gid2in first\n            auto gid2in_it = gid2in.find(gid);\n            if (gid2in_it != gid2in.end()) {  // phase 2 target list\n                ips = gid2in_it->second;\n                ips->multisend_phase2_index_ = phase2_index;\n                phase2_index += 1 + size;  // count + ranks\n                i += size;\n            }\n        }\n        if (!ips) {  // phase 1 target list (or whole list if use_phase2 is 0)\n            auto gid2out_it = gid2out.find(gid);\n            assert(gid2out_it != gid2out.end());\n            PreSyn* ps = gid2out_it->second;\n            ps->multisend_index_ = phase1_index;\n            phase1_index += 2 + size;  // total + count + ranks\n            if (use_phase2) {\n                i++;\n            }\n            i += size;\n        }\n    }\n\n    targets_phase1 = 
new int[phase1_index];\n    targets_phase2 = new int[phase2_index];\n\n    // printf(\"%d sz=%d\\n\", nrnmpi_myid, r.size());\n    for (std::size_t i = 0; i < r.size();) {\n        InputPreSyn* ips = nullptr;\n        int gid = r[i++];\n        int size = r[i++];\n        if (use_phase2) {  // look in gid2in first\n            auto gid2in_it = gid2in.find(gid);\n            if (gid2in_it != gid2in.end()) {  // phase 2 target list\n                ips = gid2in_it->second;\n                int p = ips->multisend_phase2_index_;\n                int* ranks = targets_phase2 + p;\n                ranks[0] = size;\n                ranks += 1;\n                // printf(\"%d i=%d gid=%d phase2 size=%d\\n\", nrnmpi_myid, i, gid, size);\n                for (int j = 0; j < size; ++j) {\n                    ranks[j] = r[i++];\n                    // printf(\"%d   j=%d rank=%d\\n\", nrnmpi_myid, j, ranks[j]);\n                    assert(ranks[j] != nrnmpi_myid);\n                }\n            }\n        }\n        if (!ips) {  // phase 1 target list (or whole list if use_phase2 is 0)\n            auto gid2out_it = gid2out.find(gid);\n            assert(gid2out_it != gid2out.end());\n            PreSyn* ps = gid2out_it->second;\n            int p = ps->multisend_index_;\n            int* ranks = targets_phase1 + p;\n            int total = size;\n            if (use_phase2) {\n                total = r[i++];\n            }\n            ranks[0] = total;\n            ranks[1] = size;\n            ranks += 2;\n            // printf(\"%d i=%d gid=%d phase1 size=%d total=%d\\n\", nrnmpi_myid, i, gid, size, total);\n            for (int j = 0; j < size; ++j) {\n                ranks[j] = r[i++];\n                // printf(\"%d   j=%d rank=%d\\n\", nrnmpi_myid, j, ranks[j]);\n                // There never was a possibility of send2self\n                // because an output presyn is never in gid2in_.\n                assert(ranks[j] != nrnmpi_myid);\n            }\n        }\n    
}\n\n    // compute max_ntarget_host and max_multisend_targets\n    int max_ntarget_host = 0;\n    int max_multisend_targets = 0;\n    for (auto& g: gid2out) {\n        PreSyn* ps = g.second;\n        if (ps->output_index_ >= 0) {  // only ones that generate spikes\n            int i = ps->multisend_index_;\n            if (i >= 0) {  // only if the gid has targets on other ranks.\n                max_ntarget_host = std::max(targets_phase1[i], max_ntarget_host);\n                max_multisend_targets = std::max(targets_phase1[i + 1], max_multisend_targets);\n            }\n        }\n    }\n    if (use_phase2) {\n        for (auto& g: gid2in) {\n            InputPreSyn* ps = g.second;\n            int i = ps->multisend_phase2_index_;\n            if (i >= 0) {\n                max_multisend_targets = std::max(max_multisend_targets, targets_phase2[i]);\n            }\n        }\n    }\n}\n\n// Return the vector encoding a sequence of gid, target list size, and target list\nstatic std::vector<int> setup_target_lists(bool use_phase2) {\n    int nhost = nrnmpi_numprocs;\n\n    // Construct hash table for finding the target rank list for a given gid.\n    Int2TarList gid2tarlist;\n\n    celldebug<Gid2PS>(\"output gid\", gid2out);\n    celldebug<Gid2IPS>(\"input gid\", gid2in);\n\n    // What are the target ranks for a given input gid. All the ranks\n    // with the same input gid send that gid to the intermediate\n    // gid%nhost rank. 
The intermediate rank can then construct the\n    // list of target ranks for the gids it gets.\n\n    {\n        // scnt1 is the number of input gids from target\n        std::vector<int> scnt1(nhost, 0);\n        for (const auto& g: gid2in) {\n            int gid = g.first;\n            ++scnt1[gid % nhost];\n        }\n\n        // s1 are the input gids from target to be sent to the various intermediates\n        const std::vector<int> sdispl1 = newoffset(scnt1);\n        // Make a mutable copy (used as running write offsets)\n        auto sdispl1_ = sdispl1;\n        std::vector<int> s1(sdispl1[nhost], 0);\n        for (const auto& g: gid2in) {\n            int gid = g.first;\n            s1[sdispl1_[gid % nhost]++] = gid;\n        }\n\n        std::vector<int> r1;\n        std::vector<int> rdispl1;\n        std::tie(r1, rdispl1) = all2allv_int(s1, scnt1, sdispl1, \"gidin to intermediate\");\n        // r1 is the gids received by this intermediate rank from all other ranks.\n\n        // Now figure out the size of the target list for each distinct gid in r1.\n        for (const auto& gid: r1) {\n            if (gid2tarlist.find(gid) == gid2tarlist.end()) {\n                gid2tarlist[gid] = new TarList{};\n                gid2tarlist[gid]->size = 0;\n            }\n            auto tar = gid2tarlist[gid];\n            ++(tar->size);\n        }\n\n        // Conceptually, now the intermediate is the mpi source and the gid\n        // sources are the mpi destination in regard to target lists.\n        // It would be possible at this point, but confusing,\n        // to allocate an s[rdispl1[nhost]] and figure out scnt and sdispl\n        // by getting the counts and gids from the ranks that own the source\n        // gids. In this way we could organize s without having to allocate\n        // individual target lists on the intermediate and then allocate\n        // another large s buffer to receive a copy of them. 
However for\n        // this processing we already require two large buffers for input\n        // gid's so there is no real savings of space.\n        // So let's do the simple obvious sequence and now complete the\n        // target lists.\n\n        // Allocate the target lists (and set size to 0 (we will recount when filling).\n        for (const auto& g: gid2tarlist) {\n            TarList* tl = g.second;\n            tl->alloc();\n            tl->size = 0;\n        }\n\n        // fill the target lists\n        for (int rank = 0; rank < nhost; ++rank) {\n            int b = rdispl1[rank];\n            int e = rdispl1[rank + 1];\n            for (int i = b; i < e; ++i) {\n                const auto itl_it = gid2tarlist.find(r1[i]);\n                if (itl_it != gid2tarlist.end()) {\n                    TarList* tl = itl_it->second;\n                    tl->list[tl->size] = rank;\n                    tl->size++;\n                }\n            }\n        }\n    }\n\n    {\n        // Now the intermediate hosts have complete target lists and\n        // the sources know the intermediate host from the gid2out_ map.\n        // We could potentially organize here for two-phase exchange as well.\n\n        // Which target lists are desired by the source rank?\n\n        // Ironically, for round robin distributions, the target lists are\n        // already on the proper source rank so the following code should\n        // be tested for random distributions of gids.\n        // How many on the source rank?\n        std::vector<int> scnt2(nhost, 0);\n        for (auto& g: gid2out) {\n            int gid = g.first;\n            PreSyn* ps = g.second;\n            if (ps->output_index_ >= 0) {  // only ones that generate spikes\n                ++scnt2[gid % nhost];\n            }\n        }\n        const auto sdispl2 = newoffset(scnt2);\n        auto sdispl2_ = sdispl2;\n\n        // what are the gids of those target lists\n        std::vector<int> s2(sdispl2[nhost], 
0);\n        for (auto& g: gid2out) {\n            int gid = g.first;\n            PreSyn* ps = g.second;\n            if (ps->output_index_ >= 0) {  // only ones that generate spikes\n                s2[sdispl2_[gid % nhost]++] = gid;\n            }\n        }\n        std::vector<int> r2;\n        std::vector<int> rdispl2;\n        std::tie(r2, rdispl2) = all2allv_int(s2, scnt2, sdispl2, \"gidout\");\n\n        // fill in the tl->rank for phase 1 target lists\n        // r2 is an array of source spiking gids\n        // tl is list associating input gids with list of target ranks.\n        for (int rank = 0; rank < nhost; ++rank) {\n            int b = rdispl2[rank];\n            int e = rdispl2[rank + 1];\n            for (int i = b; i < e; ++i) {\n                // note that there may be input gids with no corresponding\n                // output gid so that the find may not return true and in\n                // that case the tl->rank remains -1.\n                // For example multisplit gids or simulation of a subset of\n                // cells.\n                const auto itl_it = gid2tarlist.find(r2[i]);\n                if (itl_it != gid2tarlist.end()) {\n                    TarList* tl = itl_it->second;\n                    tl->rank = rank;\n                }\n            }\n        }\n    }\n\n    if (use_phase2) {\n        random_init(nrnmpi_myid + 1);\n        for (const auto& gid2tar: gid2tarlist) {\n            TarList* tl = gid2tar.second;\n            if (tl->rank >= 0) {  // only if output gid is spike generating\n                phase2organize(tl);\n            }\n        }\n        random_delete();\n    }\n\n    // For clarity, use the all2allv_int style of information flow\n    // from source to destination as above\n    // and also use a uniform code\n    // for copying one and two phase information from a TarList to\n    // develop the s, scnt, and sdispl3 buffers. 
That is, a buffer list\n    // section in s for either a one-phase list or the much shorter\n    // (individually) lists for first and second phases, has a\n    // gid, size, totalsize header for each list where totalsize\n    // is only present if the gid is an output gid (for\n    // NrnMultisend_Send.ntarget_host used for conservation).\n    // Note that totalsize is tl->indices[tl->size]\n\n    // how much to send to each rank\n    std::vector<int> scnt3(nhost, 0);\n    for (const auto& gid2tar: gid2tarlist) {\n        TarList* tl = gid2tar.second;\n        if (tl->rank < 0) {\n            // When the output gid does not generate spikes, that rank\n            // is not interested if there is a target list for it.\n            // If the output gid does not exist, there is no rank.\n            // In either case ignore this target list.\n            continue;\n        }\n        if (tl->indices) {\n            // indices[size] is the size of list but size of those\n            // are the sublist phase 2 destination ranks which\n            // don't get sent as part of the phase 2 target list.\n            // Also there is a phase 1 target list of size so there\n            // are altogether size+1 target lists.\n            // (one phase 1 list and size phase 2 lists)\n            scnt3[tl->rank] += tl->size + 2;  // gid, size, list\n            for (int i = 0; i < tl->size; ++i) {\n                scnt3[tl->list[tl->indices[i]]] += tl->indices[i + 1] - tl->indices[i] + 1;\n                // gid, size, list\n            }\n        } else {\n            // gid, list size, list\n            scnt3[tl->rank] += tl->size + 2;\n        }\n        if (use_phase2) {\n            // The phase 1 header has as its third element, the\n            // total list size (needed for conservation);\n            scnt3[tl->rank] += 1;\n        }\n    }\n    const auto sdispl4 = newoffset(scnt3);\n    auto sdispl4_ = sdispl4;\n    std::vector<int> s3(sdispl4[nhost], 0);\n    // what 
to send to each rank\n    for (const auto& gid2tar: gid2tarlist) {\n        int gid = gid2tar.first;\n        TarList* tl = gid2tar.second;\n        if (tl->rank < 0) {\n            continue;\n        }\n        if (tl->indices) {\n            s3[sdispl4_[tl->rank]++] = gid;\n            s3[sdispl4_[tl->rank]++] = tl->size;\n            if (use_phase2) {\n                s3[sdispl4_[tl->rank]++] = tl->indices[tl->size];\n            }\n            for (int i = 0; i < tl->size; ++i) {\n                s3[sdispl4_[tl->rank]++] = tl->list[tl->indices[i]];\n            }\n            for (int i = 0; i < tl->size; ++i) {\n                int rank = tl->list[tl->indices[i]];\n                s3[sdispl4_[rank]++] = gid;\n                assert(tl->indices[i + 1] > tl->indices[i]);\n                s3[sdispl4_[rank]++] = tl->indices[i + 1] - tl->indices[i] - 1;\n                for (int j = tl->indices[i] + 1; j < tl->indices[i + 1]; ++j) {\n                    s3[sdispl4_[rank]++] = tl->list[j];\n                }\n            }\n        } else {\n            // gid, list size, list\n            s3[sdispl4_[tl->rank]++] = gid;\n            s3[sdispl4_[tl->rank]++] = tl->size;\n            if (use_phase2) {\n                s3[sdispl4_[tl->rank]++] = tl->size;\n            }\n            for (int i = 0; i < tl->size; ++i) {\n                s3[sdispl4_[tl->rank]++] = tl->list[i];\n            }\n        }\n        delete tl;\n    }\n    std::vector<int> r_return;\n    std::vector<int> rdispl3;\n    std::tie(r_return, rdispl3) = all2allv_int(s3, scnt3, sdispl4, \"lists\");\n    return r_return;\n}\n}  // namespace coreneuron\n#endif  // NRN_MULTISEND\n"
  },
  {
    "path": "coreneuron/network/netcon.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include \"coreneuron/mpi/nrnmpi.h\"\n\n#undef check\n#if MAC\n#define NetCon nrniv_Dinfo\n#endif\nnamespace coreneuron {\nclass PreSyn;\nclass InputPreSyn;\nclass TQItem;\nstruct NrnThread;\nstruct Point_process;\nclass NetCvode;\n\n#define DiscreteEventType 0\n#define TstopEventType    1\n#define NetConType        2\n#define SelfEventType     3\n#define PreSynType        4\n#define NetParEventType   7\n#define InputPreSynType   20\n\nstruct DiscreteEvent {\n    DiscreteEvent() = default;\n    virtual ~DiscreteEvent() = default;\n    virtual void send(double deliverytime, NetCvode*, NrnThread*);\n    virtual void deliver(double t, NetCvode*, NrnThread*);\n    virtual int type() const {\n        return DiscreteEventType;\n    }\n    virtual bool require_checkpoint() {\n        return true;\n    }\n    virtual void pr(const char*, double t, NetCvode*);\n};\n\nclass NetCon: public DiscreteEvent {\n  public:\n    bool active_{};\n    double delay_{1.0};\n    Point_process* target_{};\n    union {\n        int weight_index_{};\n        int srcgid_;  // only to help InputPreSyn during setup\n        // before weights are read and stored. Saves on transient\n        // memory requirements by avoiding storage of all group file\n        // netcon_srcgid lists. ie. 
that info is copied into here.\n    } u;\n\n    NetCon() = default;\n    virtual ~NetCon() = default;\n    virtual void send(double sendtime, NetCvode*, NrnThread*) override;\n    virtual void deliver(double, NetCvode* ns, NrnThread*) override;\n    virtual int type() const override {\n        return NetConType;\n    }\n    virtual void pr(const char*, double t, NetCvode*) override;\n};\n\nclass SelfEvent: public DiscreteEvent {\n  public:\n    double flag_;\n    Point_process* target_;\n    void** movable_;  // actually a TQItem**\n    int weight_index_;\n\n    SelfEvent() = default;\n    virtual ~SelfEvent() = default;\n    virtual void deliver(double, NetCvode*, NrnThread*) override;\n    virtual int type() const override {\n        return SelfEventType;\n    }\n\n    virtual void pr(const char*, double t, NetCvode*) override;\n\n  private:\n    void call_net_receive(NetCvode*);\n};\n\nclass ConditionEvent: public DiscreteEvent {\n  public:\n    // condition detection factored out of PreSyn for re-use\n    ConditionEvent() = default;\n    virtual ~ConditionEvent() = default;\n    virtual bool check(NrnThread*);\n    virtual double value(NrnThread*) {\n        return -1.;\n    }\n\n    int flag_{};  // true when below, false when above. 
(changed from bool to int to avoid cray acc\n                  // bug(?))\n};\n\nclass PreSyn: public ConditionEvent {\n  public:\n#if NRNMPI\n    unsigned char localgid_{};  // compressed gid for spike transfer\n#endif\n    int nc_index_{};  // replaces dil_, index into global NetCon** netcon_in_presyn_order_\n    int nc_cnt_{};    // how many netcon starting at nc_index_\n    int output_index_{};\n    int gid_{-1};\n    double threshold_{10.};\n    int thvar_index_{-1};  // >=0 points into NrnThread._actual_v\n    Point_process* pntsrc_{};\n\n    PreSyn() = default;\n    virtual ~PreSyn() = default;\n    virtual void send(double sendtime, NetCvode*, NrnThread*) override;\n    virtual void deliver(double, NetCvode*, NrnThread*) override;\n    virtual int type() const override {\n        return PreSynType;\n    }\n\n    virtual double value(NrnThread*) override;\n    void record(double t);\n#if NRN_MULTISEND\n    int multisend_index_{-1};\n#endif\n};\n\nclass InputPreSyn: public DiscreteEvent {\n  public:\n    int nc_index_{-1};  // replaces dil_, index into global NetCon** netcon_in_presyn_order_\n    int nc_cnt_{};      // how many netcon starting at nc_index_\n\n    InputPreSyn() = default;\n    virtual ~InputPreSyn() = default;\n    virtual void send(double sendtime, NetCvode*, NrnThread*) override;\n    virtual void deliver(double, NetCvode*, NrnThread*) override;\n    virtual int type() const override {\n        return InputPreSynType;\n    }\n#if NRN_MULTISEND\n    int multisend_phase2_index_{-1};\n#endif\n};\n\nclass NetParEvent: public DiscreteEvent {\n  public:\n    int ithread_;     // for pr()\n    double wx_, ws_;  // exchange time and \"spikes to Presyn\" time\n\n    NetParEvent();\n    virtual ~NetParEvent() = default;\n    virtual void send(double, NetCvode*, NrnThread*) override;\n    virtual void deliver(double, NetCvode*, NrnThread*) override;\n    virtual int type() const override {\n        return NetParEventType;\n    }\n\n    virtual void 
pr(const char*, double t, NetCvode*) override;\n};\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/netcvode.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <float.h>\n#include <map>\n#include <mutex>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/network/netcon.hpp\"\n#include \"coreneuron/network/netcvode.hpp\"\n#include \"coreneuron/network/netpar.hpp\"\n#include \"coreneuron/utils/ivocvect.hpp\"\n#include \"coreneuron/utils/profile/profiler_interface.h\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/io/output_spikes.hpp\"\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/gpu/nrn_acc_manager.hpp\"\n#include \"coreneuron/network/multisend.hpp\"\n#include \"coreneuron/mechanism/membfunc.hpp\"\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n\nnamespace coreneuron {\n#define PP2NT(pp) (nrn_threads + (pp)->_tid)\n#define PP2t(pp)  (PP2NT(pp)->_t)\n//#define POINT_RECEIVE(type, tar, w, f) (*pnt_receive[type])(tar, w, f)\n\ndouble NetCvode::eps_;\nNetCvode* net_cvode_instance;\nbool cvode_active_;\n\n/// Flag to use the bin queue\nbool nrn_use_bin_queue_ = 0;\n\nvoid mk_netcvode() {\n    if (!net_cvode_instance) {\n        net_cvode_instance = new NetCvode();\n    }\n}\n\n#ifdef DEBUG\n// temporary\nstatic int nrn_errno_check(int type) {\n    printf(\"nrn_errno_check() was called on pid %d: errno=%d type=%d\\n\", nrnmpi_myid, errno, type);\n    //  assert(0);\n    type = 0;\n    return 1;\n}\n#endif\n\n// for _OPENACC and/or NET_RECEIVE_BUFFERING\n// sem 0:3 send event move\nvoid net_sem_from_gpu(int sendtype,\n                      int i_vdata,\n                      int weight_index_,\n                      int ith,\n                      int ipnt,\n                      double td,\n                  
    double flag) {\n    NrnThread& nt = nrn_threads[ith];\n    Point_process* pnt = (Point_process*) nt._vdata[ipnt];\n    if (sendtype == 0) {\n        net_send(nt._vdata + i_vdata, weight_index_, pnt, td, flag);\n    } else if (sendtype == 2) {\n        net_move(nt._vdata + i_vdata, pnt, td);\n    } else {\n        net_event(pnt, td);\n    }\n}\n\nvoid net_send(void** v, int weight_index_, Point_process* pnt, double td, double flag) {\n    NrnThread* nt = PP2NT(pnt);\n    NetCvodeThreadData& p = net_cvode_instance->p[nt->id];\n    SelfEvent* se = new SelfEvent;\n    se->flag_ = flag;\n    se->target_ = pnt;\n    se->weight_index_ = weight_index_;\n    if (v >= nt->_vdata) {\n        se->movable_ = v;  // needed for SaveState\n    }\n    assert(net_cvode_instance);\n    ++p.unreffed_event_cnt_;\n    if (td < nt->_t) {\n        char buf[100];\n        sprintf(buf, \"net_send td-t = %g\", td - nt->_t);\n        se->pr(buf, td, net_cvode_instance);\n        abort();\n        hoc_execerror(\"net_send delay < 0\", 0);\n    }\n    TQItem* q = net_cvode_instance->event(td, se, nt);\n    if (flag == 1.0 && v >= nt->_vdata) {\n        *v = (void*) q;\n    }\n    // printf(\"net_send %g %s %g %p\\n\", td, pnt_name(pnt), flag, *v);\n}\n\nvoid artcell_net_send(void** v, int weight_index_, Point_process* pnt, double td, double flag) {\n    net_send(v, weight_index_, pnt, td, flag);\n}\n\nvoid net_event(Point_process* pnt, double time) {\n    NrnThread* nt = PP2NT(pnt);\n    PreSyn* ps = nt->presyns +\n                 nt->pnt2presyn_ix[corenrn.get_pnttype2presyn()[pnt->_type]][pnt->_i_instance];\n    if (ps) {\n        if (time < nt->_t) {\n            char buf[100];\n            sprintf(buf, \"net_event time-t = %g\", time - nt->_t);\n            ps->pr(buf, time, net_cvode_instance);\n            hoc_execerror(\"net_event time < t\", 0);\n        }\n        ps->send(time, net_cvode_instance, nt);\n    }\n}\n\nNetCvodeThreadData::NetCvodeThreadData()\n    : tqe_{new 
TQueue<QTYPE>()} {\n    inter_thread_events_.reserve(1000);\n}\n\nNetCvodeThreadData::~NetCvodeThreadData() {\n    delete tqe_;\n}\n\n/// If the PreSyn is on a different thread than the target,\n/// we have to lock the buffer\nvoid NetCvodeThreadData::interthread_send(double td, DiscreteEvent* db, NrnThread* /* nt */) {\n    std::lock_guard<OMP_Mutex> lock(mut);\n    inter_thread_events_.emplace_back(InterThreadEvent{db, td});\n}\n\nvoid interthread_enqueue(NrnThread* nt) {\n    net_cvode_instance->p[nt->id].enqueue(net_cvode_instance, nt);\n}\n\nvoid NetCvodeThreadData::enqueue(NetCvode* nc, NrnThread* nt) {\n    std::lock_guard<OMP_Mutex> lock(mut);\n    for (const auto& ite: inter_thread_events_) {\n        nc->bin_event(ite.t_, ite.de_, nt);\n    }\n    inter_thread_events_.clear();\n}\n\nNetCvode::NetCvode() {\n    eps_ = 100. * DBL_EPSILON;\n#if PRINT_EVENT\n    print_event_ = 1;\n#else\n    print_event_ = 0;\n#endif\n    pcnt_ = 0;\n    p = nullptr;\n    p_construct(1);\n    // eventually these should not have to be thread safe\n    // for parallel network simulations hardly any presyns have\n    // a threshold and it can be very inefficient to check the entire\n    // presyn list for thresholds during the fixed step method.\n    // So keep a threshold list.\n}\n\nNetCvode::~NetCvode() {\n    if (net_cvode_instance == this) {\n        net_cvode_instance = nullptr;\n    }\n\n    p_construct(0);\n}\n\nvoid nrn_p_construct() {\n    net_cvode_instance->p_construct(nrn_nthread);\n}\n\nvoid NetCvode::p_construct(int n) {\n    if (pcnt_ != n) {\n        if (p) {\n            delete[] p;\n            p = nullptr;\n        }\n\n        if (n > 0)\n            p = new NetCvodeThreadData[n];\n        else\n            p = nullptr;\n\n        pcnt_ = n;\n    }\n\n    for (int i = 0; i < n; ++i)\n        p[i].unreffed_event_cnt_ = 0;\n}\n\nTQItem* NetCvode::bin_event(double td, DiscreteEvent* db, NrnThread* nt) {\n    if (nrn_use_bin_queue_) {\n#if PRINT_EVENT\n        
if (print_event_) {\n            db->pr(\"binq send\", td, this);\n        }\n#endif\n        return p[nt->id].tqe_->enqueue_bin(td, db);\n    } else {\n#if PRINT_EVENT\n        if (print_event_) {\n            db->pr(\"send\", td, this);\n        }\n#endif\n        return p[nt->id].tqe_->insert(td, db);\n    }\n}\n\nTQItem* NetCvode::event(double td, DiscreteEvent* db, NrnThread* nt) {\n#if PRINT_EVENT\n    if (print_event_) {\n        db->pr(\"send\", td, this);\n    }\n#endif\n    return p[nt->id].tqe_->insert(td, db);\n}\n\nvoid NetCvode::clear_events() {\n    // DiscreteEvents may already have gone out of existence so the tqe_\n    // may contain many invalid item data pointers\n    enqueueing_ = 0;\n    for (int i = 0; i < nrn_nthread; ++i) {\n        NetCvodeThreadData& d = p[i];\n        delete d.tqe_;\n        d.tqe_ = new TQueue<QTYPE>();\n        d.unreffed_event_cnt_ = 0;\n        d.inter_thread_events_.clear();\n        d.tqe_->nshift_ = -1;\n        d.tqe_->shift_bin(nrn_threads->_t - 0.5 * nrn_threads->_dt);\n    }\n}\n\nvoid NetCvode::init_events() {\n    for (int i = 0; i < nrn_nthread; ++i) {\n        p[i].tqe_->nshift_ = -1;\n        p[i].tqe_->shift_bin(nrn_threads->_t - 0.5 * nrn_threads->_dt);\n    }\n\n    for (int tid = 0; tid < nrn_nthread; ++tid) {  // can be done in parallel\n        NrnThread* nt = nrn_threads + tid;\n\n        for (int ipre = 0; ipre < nt->n_presyn; ++ipre) {\n            PreSyn* ps = nt->presyns + ipre;\n            ps->flag_ = false;\n        }\n\n        for (int inetc = 0; inetc < nt->n_netcon; ++inetc) {\n            NetCon* d = nt->netcons + inetc;\n            if (d->target_) {\n                int type = d->target_->_type;\n                if (corenrn.get_pnt_receive_init()[type]) {\n                    (*corenrn.get_pnt_receive_init()[type])(d->target_, d->u.weight_index_, 0);\n                } else {\n                    int cnt = corenrn.get_pnt_receive_size()[type];\n                    double* wt = 
nt->weights + d->u.weight_index_;\n                    // not the first\n                    for (int j = 1; j < cnt; ++j) {\n                        wt[j] = 0.;\n                    }\n                }\n            }\n        }\n    }\n}\n\nbool NetCvode::deliver_event(double til, NrnThread* nt) {\n    TQItem* q = p[nt->id].tqe_->atomic_dq(til);\n    if (q == nullptr) {\n        return false;\n    }\n\n    DiscreteEvent* de = q->data_;\n    double tt = q->t_;\n    delete q;\n#if PRINT_EVENT\n    if (print_event_) {\n        de->pr(\"deliver\", tt, this);\n    }\n#endif\n    de->deliver(tt, this, nt);\n\n    /// In case of a self event we need to delete the self event\n    if (de->type() == SelfEventType) {\n        delete static_cast<SelfEvent*>(de);\n    }\n    return true;\n}\n\nvoid net_move(void** v, Point_process* pnt, double tt) {\n    // assert, if possible that *v == pnt->movable.\n    if (!(*v))\n        hoc_execerror(\"No event with flag=1 for net_move in \",\n                      corenrn.get_memb_func(pnt->_type).sym);\n\n    TQItem* q = (TQItem*) (*v);\n    // printf(\"net_move tt=%g %s *v=%p\\n\", tt, memb_func[pnt->_type].sym, *v);\n    if (tt < PP2t(pnt))\n        nrn_assert(0);\n\n    net_cvode_instance->move_event(q, tt, PP2NT(pnt));\n}\n\nvoid artcell_net_move(void** v, Point_process* pnt, double tt) {\n    net_move(v, pnt, tt);\n}\n\nvoid NetCvode::move_event(TQItem* q, double tnew, NrnThread* nt) {\n    int tid = nt->id;\n\n#if PRINT_EVENT\n    if (print_event_) {\n        SelfEvent* se = (SelfEvent*) q->data_;\n        printf(\"NetCvode::move_event self event target %s t=%g, old=%g new=%g\\n\",\n               corenrn.get_memb_func(se->target_->_type).sym,\n               nt->_t,\n               q->t_,\n               tnew);\n    }\n#endif\n\n    p[tid].tqe_->move(q, tnew);\n}\n\nvoid NetCvode::deliver_events(double til, NrnThread* nt) {\n    // printf(\"deliver_events til %20.15g\\n\", til);\n    /// Enqueue any outstanding events in the 
interthread event buffer\n    p[nt->id].enqueue(this, nt);\n\n    /// Deliver events. When the map is used, the loop is explicit\n    while (deliver_event(til, nt))\n        ;\n}\n\nvoid PreSyn::record(double tt) {\n    spikevec_lock();\n    if (gid_ > -1) {\n        spikevec_gid.push_back(gid_);\n        spikevec_time.push_back(tt);\n    }\n    spikevec_unlock();\n}\n\nbool ConditionEvent::check(NrnThread* nt) {\n    if (value(nt) > 0.0) {\n        if (flag_ == false) {\n            flag_ = true;\n            return true;\n        }\n    } else {\n        flag_ = false;\n    }\n    return false;\n}\n\nvoid DiscreteEvent::send(double tt, NetCvode* ns, NrnThread* nt) {\n    ns->event(tt, this, nt);\n}\n\nvoid DiscreteEvent::deliver(double /* tt */, NetCvode* /* ns */, NrnThread* /* nt */) {}\n\nvoid DiscreteEvent::pr(const char* s, double tt, NetCvode* /* ns */) {\n    printf(\"%s DiscreteEvent %.15g\\n\", s, tt);\n}\n\nvoid NetCon::send(double tt, NetCvode* ns, NrnThread* nt) {\n    if (active_ && target_) {\n        nrn_assert(PP2NT(target_) == nt);\n        ns->bin_event(tt, this, PP2NT(target_));\n    }\n}\n\nvoid NetCon::deliver(double tt, NetCvode* /* ns */, NrnThread* nt) {\n    nrn_assert(target_);\n\n    if (PP2NT(target_) != nt)\n        printf(\"NetCon::deliver nt=%d target=%d\\n\", nt->id, PP2NT(target_)->id);\n\n    nrn_assert(PP2NT(target_) == nt);\n    int typ = target_->_type;\n    nt->_t = tt;\n\n    // printf(\"NetCon::deliver t=%g tt=%g %s\\n\", t, tt, pnt_name(target_));\n    std::string ss(\"net-receive-\");\n    ss += nrn_get_mechname(typ);\n    Instrumentor::phase p_get_pnt_receive(ss.c_str());\n    (*corenrn.get_pnt_receive()[typ])(target_, u.weight_index_, 0);\n#ifdef DEBUG\n    if (errno && nrn_errno_check(typ))\n        hoc_warning(\"errno set during NetCon deliver to NET_RECEIVE\", (char*) 0);\n#endif\n}\n\nvoid NetCon::pr(const char* s, double tt, NetCvode* /* ns */) {\n    Point_process* pp = target_;\n    printf(\"%s NetCon 
target=%s[%d] %.15g\\n\",\n           s,\n           corenrn.get_memb_func(pp->_type).sym,\n           pp->_i_instance,\n           tt);\n}\n\nvoid PreSyn::send(double tt, NetCvode* ns, NrnThread* nt) {\n    record(tt);\n    for (int i = nc_cnt_ - 1; i >= 0; --i) {\n        NetCon* d = netcon_in_presyn_order_[nc_index_ + i];\n        if (d->active_ && d->target_) {\n            NrnThread* n = PP2NT(d->target_);\n\n            if (nt == n)\n                ns->bin_event(tt + d->delay_, d, n);\n            else\n                ns->p[n->id].interthread_send(tt + d->delay_, d, n);\n        }\n    }\n\n#if NRNMPI\n    if (output_index_ >= 0) {\n#if NRN_MULTISEND\n        if (use_multisend_) {\n            nrn_multisend_send(this, tt, nt);\n        } else {\n#else\n        {\n#endif\n            if (nrn_use_localgid_) {\n                nrn_outputevent(localgid_, tt);\n            } else {\n                nrn2ncs_outputevent(output_index_, tt);\n            }\n        }\n    }\n#endif  // NRNMPI\n}\n\nvoid InputPreSyn::send(double tt, NetCvode* ns, NrnThread* nt) {\n    for (int i = nc_cnt_ - 1; i >= 0; --i) {\n        NetCon* d = netcon_in_presyn_order_[nc_index_ + i];\n        if (d->active_ && d->target_) {\n            NrnThread* n = PP2NT(d->target_);\n\n            if (nt == n)\n                ns->bin_event(tt + d->delay_, d, n);\n            else\n                ns->p[n->id].interthread_send(tt + d->delay_, d, n);\n        }\n    }\n}\n\nvoid PreSyn::deliver(double, NetCvode*, NrnThread*) {\n    assert(0);  // no PreSyn delay.\n}\n\nvoid InputPreSyn::deliver(double, NetCvode*, NrnThread*) {\n    assert(0);  // no InputPreSyn delay.\n}\n\nvoid SelfEvent::deliver(double tt, NetCvode* ns, NrnThread* nt) {\n    nrn_assert(nt == PP2NT(target_));\n    PP2t(target_) = tt;\n    // printf(\"SelfEvent::deliver t=%g tt=%g %s\\n\", PP2t(target_), tt, pnt_name(target_));\n    call_net_receive(ns);\n}\n\nvoid SelfEvent::call_net_receive(NetCvode* ns) {\n    
(*corenrn.get_pnt_receive()[target_->_type])(target_, weight_index_, flag_);\n\n#ifdef DEBUG\n    if (errno && nrn_errno_check(target_->_type))\n        hoc_warning(\"errno set during SelfEvent deliver to NET_RECEIVE\", (char*) 0);\n#endif\n\n    NetCvodeThreadData& nctd = ns->p[PP2NT(target_)->id];\n    --nctd.unreffed_event_cnt_;\n}\n\nvoid SelfEvent::pr(const char* s, double tt, NetCvode*) {\n    printf(\"%s\", s);\n    printf(\" SelfEvent target=%s %.15g flag=%g\\n\", pnt_name(target_), tt, flag_);\n}\n\nvoid ncs2nrn_integrate(double tstop) {\n    int total_sim_steps = static_cast<int>((tstop - nrn_threads->_t) / dt + 1e-9);\n\n    if (total_sim_steps > 3 && !nrn_have_gaps) {\n        nrn_fixed_step_group_minimal(total_sim_steps);\n    } else {\n        nrn_fixed_single_steps_minimal(total_sim_steps, tstop);\n    }\n\n    // handle all the pending flag=1 self events\n    for (int i = 0; i < nrn_nthread; ++i)\n        nrn_assert(nrn_threads[i]._t == nrn_threads->_t);\n}\n\n// factored this out from deliver_net_events so we can\n// stay in the cache\n// net_send_buffer added so checking can be done on gpu\n// while event queueing is on cpu.\n// Remember: passing a reference variable causes a cray\n// compiler bug\n\nstatic bool pscheck(double var, double thresh, int* flag) {\n    if (var > thresh) {\n        if (*flag == false) {\n            *flag = true;\n            return true;\n        }\n    } else {\n        *flag = false;\n    }\n    return false;\n}\n\ndouble PreSyn::value(NrnThread* nt) {\n    return nt->_actual_v[thvar_index_] - threshold_;\n}\n\nvoid NetCvode::check_thresh(NrnThread* nt) {  // for default method\n    Instrumentor::phase p(\"check-threshold\");\n    double teps = 1e-10;\n\n    nt->_net_send_buffer_cnt = 0;\n    int net_send_buf_count = 0;\n    PreSyn* presyns = nt->presyns;\n    PreSynHelper* presyns_helper = nt->presyns_helper;\n    double* actual_v = nt->_actual_v;\n\n    if (nt->ncell == 0)\n        return;\n\n    
nrn_pragma_acc(parallel loop present(\n        nt [0:1], presyns_helper [0:nt->n_presyn], presyns [0:nt->n_presyn], actual_v [0:nt->end])\n                       copy(net_send_buf_count) if (nt->compute_gpu) async(nt->stream_id))\n    nrn_pragma_omp(target teams distribute parallel for map(tofrom: net_send_buf_count) if(nt->compute_gpu))\n    for (int i = 0; i < nt->n_real_output; ++i) {\n        PreSyn* ps = presyns + i;\n        PreSynHelper* psh = presyns_helper + i;\n        int idx = 0;\n        int thidx = ps->thvar_index_;\n        double v = actual_v[thidx];\n        double threshold = ps->threshold_;\n        int* flag = &(psh->flag_);\n\n        if (pscheck(v, threshold, flag)) {\n#ifndef CORENEURON_ENABLE_GPU\n            nt->_net_send_buffer_cnt = net_send_buf_count;\n            if (nt->_net_send_buffer_cnt >= nt->_net_send_buffer_size) {\n                nt->_net_send_buffer_size *= 2;\n                nt->_net_send_buffer = (int*) erealloc(nt->_net_send_buffer,\n                                                       nt->_net_send_buffer_size * sizeof(int));\n            }\n#endif\n\n            nrn_pragma_acc(atomic capture)\n            nrn_pragma_omp(atomic capture)\n            idx = net_send_buf_count++;\n\n            nt->_net_send_buffer[idx] = i;\n        }\n    }\n    nrn_pragma_acc(wait(nt->stream_id))\n    nt->_net_send_buffer_cnt = net_send_buf_count;\n\n    if (nt->compute_gpu && nt->_net_send_buffer_cnt) {\n#ifdef CORENEURON_ENABLE_GPU\n        int* nsbuffer = nt->_net_send_buffer;\n#endif\n        nrn_pragma_acc(update host(nsbuffer [0:nt->_net_send_buffer_cnt]) async(nt->stream_id))\n        nrn_pragma_acc(wait(nt->stream_id))\n        nrn_pragma_omp(target update from(nsbuffer [0:nt->_net_send_buffer_cnt]))\n    }\n\n    // on CPU...\n    for (int i = 0; i < nt->_net_send_buffer_cnt; ++i) {\n        PreSyn* ps = nt->presyns + nt->_net_send_buffer[i];\n        ps->send(nt->_t + teps, net_cvode_instance, nt);\n    }\n\n    // Types that 
have WATCH statements. If any exist, the last element is 0.\n    if (nt->_watch_types) {\n        for (int i = 0; nt->_watch_types[i] != 0; ++i) {\n            int type = nt->_watch_types[i];\n            (*corenrn.get_watch_check()[type])(nt, nt->_ml_list[type]);\n            // may generate net_send events (with 0 (teps) delay)\n        }\n    }\n}\n\n// WATCH statements are rare. Conceptually they are very similar to\n// PreSyn thresholds as above but an optimal performance implementation for GPU is\n// not obvious. Each WATCH statement threshold test could make use of\n// pscheck.  Note that it is possible that there are several active WATCH\n// statements for a given POINT_PROCESS instance as well as none active.\n// Also WATCH statements switch between active and inactive state.\n//\n// In NEURON,\n// both PreSyn and WatchCondition were subclasses of ConditionEvent. When\n// a WatchCondition fired in the fixed step method, it was placed on the queue\n// with a delivery time of t+teps. WatchCondition::deliver called the NET_RECEIVE\n// block with proper flag (but nullptr weight vector). WatchConditions\n// were created, added/removed, and destroyed from a list as necessary.\n// Perhaps the most commonly used WATCH statement is in the context of a\n// ThresholdDetect Point_process which watches voltage and compares to\n// an instance specific threshold parameter. A firing ThresholdDetect instance\n// would call net_event(tdeliver) which then feeds into the standard\n// artcell PreSyn sequence (using pntsrc_ instead of thvar_index_).\n//\n// So... the PreSyns have the same order as they are checked (although PreSyn\n// data is AoS instead of SoA and nested 'if' means a failure of SIMD.)\n// But if multiple WATCH, there is (from one kind of implementation viewpoint),\n// yet another 'if' with regard to whether a WATCH is active. 
And if there\n// are multiple WATCH, the size of the list is dynamic.\n//\n// An experimental implementation is to check all WATCH of all instances\n// of a type with the proviso that there is an active flag for each WATCH.\n// i.e. active, below, var1, var2 are all SoA (except one of the vars may\n// be voltage). Can use 'if (active && pscheck(var1, var2, &below))'.\n// The mod file net_send_buffering fragments can be used which\n// ultimately call net_send using a transient SelfEvent. i.e. all\n// checking computation takes place in the context of the mod file without\n// using explicit WatchCondition instances.\n\n// events including binqueue events up to t+dt/2\nvoid NetCvode::deliver_net_events(NrnThread* nt) {  // for default method\n#if NRN_MULTISEND\n    if (use_multisend_ && nt->id == 0) {\n        nrn_multisend_advance();\n    }\n#endif\n    int tid = nt->id;\n    double tsav = nt->_t;\n    double tm = nt->_t + 0.5 * nt->_dt;\ntryagain:\n    // one of the events on the main queue may be a NetParEvent\n    // which due to dt round off error can result in an event\n    // placed on the bin queue to be delivered now, which\n    // can put 0 delay events on to the main queue. So loop until\n    // no events. The alternative would be to deliver an idt=0 event\n    // immediately but that would very much change the sequence\n    // with respect to what is being done here and it is unclear\n    // how to fix the value of t there. 
This can be a do while loop\n    // but I do not want to affect the case of not using a bin queue.\n\n    if (nrn_use_bin_queue_) {\n        TQItem* q;\n        while ((q = p[tid].tqe_->dequeue_bin()) != 0) {\n            DiscreteEvent* db = q->data_;\n\n#if PRINT_EVENT\n            if (print_event_) {\n                db->pr(\"binq deliver\", nrn_threads->_t, this);\n            }\n#endif\n\n            delete q;\n            db->deliver(nt->_t, this, nt);\n        }\n        // assert(int(tm/nt->_dt)%1000 == p[tid].tqe_->nshift_);\n    }\n\n    deliver_events(tm, nt);\n\n    if (nrn_use_bin_queue_) {\n        if (p[tid].tqe_->top()) {\n            goto tryagain;\n        }\n        p[tid].tqe_->shift_bin(tm);\n    }\n\n    nt->_t = tsav;\n\n    /*before executing on gpu, we have to update the NetReceiveBuffer_t on GPU */\n    update_net_receive_buffer(nt);\n\n    for (auto& net_buf_receive: corenrn.get_net_buf_receive()) {\n        std::string ss(\"net-buf-receive-\");\n        ss += nrn_get_mechname(net_buf_receive.second);\n        Instrumentor::phase p_net_buf_receive(ss.c_str());\n        (*net_buf_receive.first)(nt);\n    }\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/netcvode.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include \"coreneuron/utils/nrnmutdec.hpp\"\n#include \"coreneuron/network/tqueue.hpp\"\n\n#define PRINT_EVENT 0\n\n/** QTYPE options include: spltree, pq_que\n *  STL priority queue is used instead of the splay tree by default.\n *  @todo: check if stl queue works with move_event functions.\n */\n\n#ifdef ENABLE_SPLAYTREE_QUEUING\n#define QTYPE spltree\n#else\n#define QTYPE pq_que\n#endif\nnamespace coreneuron {\n\n// defined in coreneuron/network/cvodestb.cpp\nextern void init_net_events(void);\nextern void nrn_play_init(void);\nextern void deliver_net_events(NrnThread*);\nextern void nrn_deliver_events(NrnThread*);\nextern void fixed_play_continuous(NrnThread*);\n\nstruct DiscreteEvent;\nclass NetCvode;\n\nextern NetCvode* net_cvode_instance;\nextern void interthread_enqueue(NrnThread*);\n\nstruct InterThreadEvent {\n    DiscreteEvent* de_;\n    double t_;\n};\n\nclass NetCvodeThreadData {\n  public:\n    int unreffed_event_cnt_ = 0;\n    TQueue<QTYPE>* tqe_;\n    std::vector<InterThreadEvent> inter_thread_events_;\n    OMP_Mutex mut;\n\n    NetCvodeThreadData();\n    virtual ~NetCvodeThreadData();\n    void interthread_send(double, DiscreteEvent*, NrnThread*);\n    void enqueue(NetCvode*, NrnThread*);\n};\n\nclass NetCvode {\n  public:\n    int print_event_;\n    int pcnt_;\n    int enqueueing_;\n    NetCvodeThreadData* p;\n    static double eps_;\n\n    NetCvode(void);\n    virtual ~NetCvode();\n    void p_construct(int);\n    void check_thresh(NrnThread*);\n    static double eps(double x) {\n        return eps_ * fabs(x);\n    }\n    TQItem* event(double tdeliver, DiscreteEvent*, NrnThread*);\n    void move_event(TQItem*, double, NrnThread*);\n    TQItem* 
bin_event(double tdeliver, DiscreteEvent*, NrnThread*);\n    void deliver_net_events(NrnThread*);          // for default staggered time step method\n    void deliver_events(double til, NrnThread*);  // for initialization events\n    bool deliver_event(double til, NrnThread*);   // uses TQueue atomically\n    void clear_events();\n    void init_events();\n    void point_receive(int, Point_process*, double*, double);\n};\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/netpar.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <cstdio>\n#include <cstdlib>\n#include <map>\n#include <mutex>\n#include <vector>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/mpi/nrnmpidec.h\"\n\n#include \"coreneuron/network/netcon.hpp\"\n#include \"coreneuron/network/netcvode.hpp\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/utils/ivocvect.hpp\"\n#include \"coreneuron/network/multisend.hpp\"\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n#include \"coreneuron/utils/profile/profiler_interface.h\"\n#include \"coreneuron/utils/utils.hpp\"\n\n#if NRNMPI\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/mpi/core/nrnmpi.hpp\"\nint localgid_size_;\nint ag_send_nspike;\nnamespace coreneuron {\nint* nrnmpi_nin_;\n}\nint ovfl_capacity;\nint icapacity;\nunsigned char* spikeout_fixed;\nunsigned char* spfixin_ovfl_;\nunsigned char* spikein_fixed;\nint ag_send_size;\nint ovfl;\nint nout;\ncoreneuron::NRNMPI_Spikebuf* spbufout;\ncoreneuron::NRNMPI_Spikebuf* spbufin;\n#endif\n\nnamespace coreneuron {\nclass PreSyn;\nclass InputPreSyn;\n\nvoid nrn_spike_exchange_init();\n\n#if NRNMPI\nstatic double t_exchange_;\nstatic double dt1_;  // 1/dt\n\nNRNMPI_Spike* spikeout;\nNRNMPI_Spike* spikein;\n\nvoid nrn_timeout(int);\nvoid nrn_spike_exchange(NrnThread*);\nvoid nrn2ncs_outputevent(int netcon_output_index, double firetime);\n\n// for compressed gid info during spike exchange\nbool nrn_use_localgid_;\nvoid nrn_outputevent(unsigned char localgid, double firetime);\nstd::vector<std::map<int, InputPreSyn*>> localmaps;\n\nstatic int ocapacity_;  // for 
spikeout\n// require it to be smaller than  min_interprocessor_delay.\nstatic double wt_;   // wait time for nrnmpi_spike_exchange\nstatic double wt1_;  // time to find the PreSyns and send the spikes.\nstatic bool use_compress_;\nstatic int spfixout_capacity_;\nstatic int idxout_;\nstatic void nrn_spike_exchange_compressed(NrnThread*);\n\n#endif  // NRNMPI\n\nstatic bool active_ = false;\nstatic double usable_mindelay_;\nstatic double mindelay_;  // the one actually used. Some of our optional algorithms\nstatic double last_maxstep_arg_;\nstatic std::vector<NetParEvent> npe_;  // nrn_nthread of them\n\n#if NRNMPI\n// for combination of threads and mpi.\nstatic OMP_Mutex mut;\n#endif\n\n/// Allocate space for spikes: 200 structs of {int gid; double time}\n/// coming from nrnmpi.h and array of int of the global domain size\nstatic void alloc_mpi_space() {\n#if NRNMPI\n    if (corenrn_param.mpi_enable && !spikeout) {\n        ocapacity_ = 100;\n        spikeout = (NRNMPI_Spike*) emalloc(ocapacity_ * sizeof(NRNMPI_Spike));\n        icapacity = 100;\n        spikein = (NRNMPI_Spike*) malloc(icapacity * sizeof(NRNMPI_Spike));\n        nrnmpi_nin_ = (int*) emalloc(nrnmpi_numprocs * sizeof(int));\n#if nrn_spikebuf_size > 0\n        spbufout = (NRNMPI_Spikebuf*) emalloc(sizeof(NRNMPI_Spikebuf));\n        spbufin = (NRNMPI_Spikebuf*) emalloc(nrnmpi_numprocs * sizeof(NRNMPI_Spikebuf));\n#endif\n    }\n#endif\n}\n\nNetParEvent::NetParEvent()\n    : ithread_(-1)\n    , wx_(0.)\n    , ws_(0.) 
{}\n\nvoid NetParEvent::send(double tt, NetCvode* nc, NrnThread* nt) {\n    nc->event(tt + usable_mindelay_, this, nt);\n}\n\nvoid NetParEvent::deliver(double tt, NetCvode* nc, NrnThread* nt) {\n    net_cvode_instance->deliver_events(tt, nt);\n    nt->_stop_stepping = 1;\n    nt->_t = tt;\n    send(tt, nc, nt);\n}\n\nvoid NetParEvent::pr(const char* m, double tt, NetCvode*) {\n    printf(\"%s NetParEvent %d t=%.15g tt-t=%g\\n\", m, ithread_, tt, tt - nrn_threads[ithread_]._t);\n}\n\n#if NRNMPI\ninline static void sppk(unsigned char* c, int gid) {\n    for (int i = localgid_size_ - 1; i >= 0; --i) {\n        c[i] = gid & 255;\n        gid >>= 8;\n    }\n}\ninline static int spupk(unsigned char* c) {\n    int gid = *c++;\n    for (int i = 1; i < localgid_size_; ++i) {\n        gid <<= 8;\n        gid += *c++;\n    }\n    return gid;\n}\n\nvoid nrn_outputevent(unsigned char localgid, double firetime) {\n    if (!active_) {\n        return;\n    }\n    std::lock_guard<OMP_Mutex> lock(mut);\n    nout++;\n    int i = idxout_;\n    idxout_ += 2;\n    if (idxout_ >= spfixout_capacity_) {\n        spfixout_capacity_ *= 2;\n        spikeout_fixed = (unsigned char*) erealloc(spikeout_fixed,\n                                                   spfixout_capacity_ * sizeof(unsigned char));\n    }\n    spikeout_fixed[i++] = (unsigned char) ((firetime - t_exchange_) * dt1_ + .5);\n    spikeout_fixed[i] = localgid;\n    // printf(\"%d idx=%d lgid=%d firetime=%g t_exchange_=%g [0]=%d [1]=%d\\n\", nrnmpi_myid, i,\n    // (int)localgid, firetime, t_exchange_, (int)spikeout_fixed[i-1], (int)spikeout_fixed[i]);\n}\n\nvoid nrn2ncs_outputevent(int gid, double firetime) {\n    if (!active_) {\n        return;\n    }\n    std::lock_guard<OMP_Mutex> lock(mut);\n    if (use_compress_) {\n        nout++;\n        int i = idxout_;\n        idxout_ += 1 + localgid_size_;\n        if (idxout_ >= spfixout_capacity_) {\n            spfixout_capacity_ *= 2;\n            spikeout_fixed = (unsigned 
char*) erealloc(spikeout_fixed,\n                                                       spfixout_capacity_ * sizeof(unsigned char));\n        }\n        // printf(\"%d nrnncs_outputevent %d %.20g %.20g %d\\n\", nrnmpi_myid, gid, firetime,\n        // t_exchange_,\n        //(int)((unsigned char)((firetime - t_exchange_)*dt1_ + .5)));\n        spikeout_fixed[i++] = (unsigned char) ((firetime - t_exchange_) * dt1_ + .5);\n        // printf(\"%d idx=%d firetime=%g t_exchange_=%g spfixout=%d\\n\", nrnmpi_myid, i, firetime,\n        // t_exchange_, (int)spikeout_fixed[i-1]);\n        sppk(spikeout_fixed + i, gid);\n        // printf(\"%d idx=%d gid=%d spupk=%d\\n\", nrnmpi_myid, i, gid, spupk(spikeout_fixed+i));\n    } else {\n#if nrn_spikebuf_size == 0\n        int i = nout++;\n        if (i >= ocapacity_) {\n            ocapacity_ *= 2;\n            spikeout = (NRNMPI_Spike*) erealloc(spikeout, ocapacity_ * sizeof(NRNMPI_Spike));\n        }\n        // printf(\"%d cell %d in slot %d fired at %g\\n\", nrnmpi_myid, gid, i, firetime);\n        spikeout[i].gid = gid;\n        spikeout[i].spiketime = firetime;\n#else\n        int i = nout++;\n        if (i >= nrn_spikebuf_size) {\n            i -= nrn_spikebuf_size;\n            if (i >= ocapacity_) {\n                ocapacity_ *= 2;\n                spikeout = (NRNMPI_Spike*) hoc_Erealloc(spikeout,\n                                                        ocapacity_ * sizeof(NRNMPI_Spike));\n                hoc_malchk();\n            }\n            spikeout[i].gid = gid;\n            spikeout[i].spiketime = firetime;\n        } else {\n            spbufout->gid[i] = gid;\n            spbufout->spiketime[i] = firetime;\n        }\n#endif\n    }\n    // printf(\"%d cell %d in slot %d fired at %g\\n\", nrnmpi_myid, gid, i, firetime);\n}\n#endif  // NRNMPI\n\nstatic bool nrn_need_npe() {\n    if (active_ || nrn_nthread > 1) {\n        if (last_maxstep_arg_ == 0) {\n            last_maxstep_arg_ = 100.;\n        }\n        
return true;\n    } else {\n        if (!npe_.empty()) {\n            npe_.clear();\n            npe_.shrink_to_fit();\n        }\n        return false;\n    }\n}\n\n#define TBUFSIZE 0\n\nvoid nrn_spike_exchange_init() {\n    // printf(\"nrn_spike_exchange_init\\n\");\n    if (!nrn_need_npe()) {\n        return;\n    }\n    alloc_mpi_space();\n    usable_mindelay_ = mindelay_;\n#if NRN_MULTISEND\n    if (use_multisend_ && n_multisend_interval == 2) {\n        usable_mindelay_ *= 0.5;\n    }\n#endif\n    if (nrn_nthread > 1) {\n        usable_mindelay_ -= dt;\n    }\n    if ((usable_mindelay_ < 1e-9) || (usable_mindelay_ < dt)) {\n        if (nrnmpi_myid == 0) {\n            hoc_execerror(\"usable mindelay is 0\", \"(or less than dt for fixed step method)\");\n        } else {\n            return;\n        }\n    }\n\n#if TBUFSIZE\n    itbuf_ = 0;\n#endif\n\n#if NRN_MULTISEND\n    if (use_multisend_) {\n        nrn_multisend_init();\n    }\n#endif\n\n    if (npe_.size() != static_cast<std::size_t>(nrn_nthread)) {\n        if (!npe_.empty()) {\n            npe_.clear();\n            npe_.shrink_to_fit();\n        }\n        npe_.resize(nrn_nthread);\n    }\n    for (int i = 0; i < nrn_nthread; ++i) {\n        npe_[i].ithread_ = i;\n        npe_[i].wx_ = 0.;\n        npe_[i].ws_ = 0.;\n        npe_[i].send(t, net_cvode_instance, nrn_threads + i);\n    }\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        if (use_compress_) {\n            idxout_ = 2;\n            t_exchange_ = t;\n            dt1_ = rev_dt;\n            usable_mindelay_ = floor(mindelay_ * dt1_ + 1e-9) * dt;\n            if (usable_mindelay_ * dt1_ >= 255.) {\n                usable_mindelay_ = 255. 
/ dt1_;\n            }\n            assert(usable_mindelay_ >= dt && (usable_mindelay_ * dt1_) <= 255.);\n        } else {\n#if nrn_spikebuf_size > 0\n            if (spbufout) {\n                spbufout->nspike = 0;\n            }\n#endif\n        }\n        nout = 0;\n    }\n#endif  // NRNMPI\n        // if (nrnmpi_myid == 0){printf(\"usable_mindelay_ = %g\\n\", usable_mindelay_);}\n}\n\n#if NRNMPI\nvoid nrn_spike_exchange(NrnThread* nt) {\n    Instrumentor::phase p_spike_exchange(\"spike-exchange\");\n    if (!active_) {\n        return;\n    }\n#if NRN_MULTISEND\n    if (use_multisend_) {\n        nrn_multisend_receive(nt);\n        return;\n    }\n#endif\n    if (use_compress_) {\n        nrn_spike_exchange_compressed(nt);\n        return;\n    }\n#if TBUFSIZE\n    nrnmpi_barrier();\n#endif\n\n#if nrn_spikebuf_size > 0\n    spbufout->nspike = nout;\n#endif\n    double wt = nrn_wtime();\n\n    int n = nrnmpi_spike_exchange(\n        nrnmpi_nin_, spikeout, icapacity, &spikein, ovfl, nout, spbufout, spbufin);\n\n    wt_ = nrn_wtime() - wt;\n    wt = nrn_wtime();\n#if TBUFSIZE\n    tbuf_[itbuf_++] = (unsigned long) nout;\n    tbuf_[itbuf_++] = (unsigned long) n;\n#endif\n\n    errno = 0;\n    // if (n > 0) {\n    // printf(\"%d nrn_spike_exchange sent %d received %d\\n\", nrnmpi_myid, nout, n);\n    //}\n    nout = 0;\n    if (n == 0) {\n        return;\n    }\n#if nrn_spikebuf_size > 0\n    for (int i = 0; i < nrnmpi_numprocs; ++i) {\n        int nn = spbufin[i].nspike;\n        if (nn > nrn_spikebuf_size) {\n            nn = nrn_spikebuf_size;\n        }\n        for (int j = 0; j < nn; ++j) {\n            auto gid2in_it = gid2in.find(spbufin[i].gid[j]);\n            if (gid2in_it != gid2in.end()) {\n                InputPreSyn* ps = gid2in_it->second;\n                ps->send(spbufin[i].spiketime[j], net_cvode_instance, nt);\n            }\n        }\n    }\n    n = ovfl;\n#endif  // nrn_spikebuf_size > 0\n    for (int i = 0; i < n; ++i) {\n        auto 
gid2in_it = gid2in.find(spikein[i].gid);\n        if (gid2in_it != gid2in.end()) {\n            InputPreSyn* ps = gid2in_it->second;\n            ps->send(spikein[i].spiketime, net_cvode_instance, nt);\n        }\n    }\n    nrn_multithread_job(interthread_enqueue);\n    wt1_ = nrn_wtime() - wt;\n}\n\nvoid nrn_spike_exchange_compressed(NrnThread* nt) {\n    if (!active_) {\n        return;\n    }\n#if TBUFSIZE\n    nrnmpi_barrier();\n#endif\n\n    assert(nout < 0x10000);\n    spikeout_fixed[1] = (unsigned char) (nout & 0xff);\n    spikeout_fixed[0] = (unsigned char) (nout >> 8);\n\n    double wt = nrn_wtime();\n\n    int n = nrnmpi_spike_exchange_compressed(localgid_size_,\n                                             spfixin_ovfl_,\n                                             ag_send_nspike,\n                                             nrnmpi_nin_,\n                                             ovfl_capacity,\n                                             spikeout_fixed,\n                                             ag_send_size,\n                                             spikein_fixed,\n                                             ovfl);\n    wt_ = nrn_wtime() - wt;\n    wt = nrn_wtime();\n#if TBUFSIZE\n    tbuf_[itbuf_++] = (unsigned long) nout;\n    tbuf_[itbuf_++] = (unsigned long) n;\n#endif\n    errno = 0;\n    // if (n > 0) {\n    // printf(\"%d nrn_spike_exchange sent %d received %d\\n\", nrnmpi_myid, nout, n);\n    //}\n    nout = 0;\n    idxout_ = 2;\n    if (n == 0) {\n        t_exchange_ = nrn_threads->_t;\n        return;\n    }\n    if (nrn_use_localgid_) {\n        int idxov = 0;\n        for (int i = 0; i < nrnmpi_numprocs; ++i) {\n            int j, nnn;\n            int nn = nrnmpi_nin_[i];\n            if (nn) {\n                if (i == nrnmpi_myid) {  // skip but may need to increment idxov.\n                    if (nn > ag_send_nspike) {\n                        idxov += (nn - ag_send_nspike) * (1 + localgid_size_);\n                    
}\n                    continue;\n                }\n                std::map<int, InputPreSyn*> gps = localmaps[i];\n                if (nn > ag_send_nspike) {\n                    nnn = ag_send_nspike;\n                } else {\n                    nnn = nn;\n                }\n                int idx = 2 + i * ag_send_size;\n                for (j = 0; j < nnn; ++j) {\n                    // order is (firetime,gid) pairs.\n                    double firetime = spikein_fixed[idx++] * dt + t_exchange_;\n                    int lgid = (int) spikein_fixed[idx];\n                    idx += localgid_size_;\n                    auto gid2in_it = gps.find(lgid);\n                    if (gid2in_it != gps.end()) {\n                        InputPreSyn* ps = gid2in_it->second;\n                        ps->send(firetime + 1e-10, net_cvode_instance, nt);\n                    }\n                }\n                for (; j < nn; ++j) {\n                    double firetime = spfixin_ovfl_[idxov++] * dt + t_exchange_;\n                    int lgid = (int) spfixin_ovfl_[idxov];\n                    idxov += localgid_size_;\n                    auto gid2in_it = gps.find(lgid);\n                    if (gid2in_it != gps.end()) {\n                        InputPreSyn* ps = gid2in_it->second;\n                        ps->send(firetime + 1e-10, net_cvode_instance, nt);\n                    }\n                }\n            }\n        }\n    } else {\n        for (int i = 0; i < nrnmpi_numprocs; ++i) {\n            int nn = nrnmpi_nin_[i];\n            if (nn > ag_send_nspike) {\n                nn = ag_send_nspike;\n            }\n            int idx = 2 + i * ag_send_size;\n            for (int j = 0; j < nn; ++j) {\n                // order is (firetime,gid) pairs.\n                double firetime = spikein_fixed[idx++] * dt + t_exchange_;\n                int gid = spupk(spikein_fixed + idx);\n                idx += localgid_size_;\n                auto gid2in_it = gid2in.find(gid);\n  
              if (gid2in_it != gid2in.end()) {\n                    InputPreSyn* ps = gid2in_it->second;\n                    ps->send(firetime + 1e-10, net_cvode_instance, nt);\n                }\n            }\n        }\n        n = ovfl;\n        int idx = 0;\n        for (int i = 0; i < n; ++i) {\n            double firetime = spfixin_ovfl_[idx++] * dt + t_exchange_;\n            int gid = spupk(spfixin_ovfl_ + idx);\n            idx += localgid_size_;\n            auto gid2in_it = gid2in.find(gid);\n            if (gid2in_it != gid2in.end()) {\n                InputPreSyn* ps = gid2in_it->second;\n                ps->send(firetime + 1e-10, net_cvode_instance, nt);\n            }\n        }\n    }\n    // In case of multiple threads some above ps->send events put\n    // NetCon events into interthread buffers. Some of those may\n    // need to be delivered early enough that the interthread buffers\n    // need transfer to the thread event queues before the next dequeue_bin\n    // while loop in deliver_net_events. 
So enqueue now...\n    nrn_multithread_job(interthread_enqueue);\n    t_exchange_ = nrn_threads->_t;\n    wt1_ = nrn_wtime() - wt;\n}\n\nstatic void mk_localgid_rep() {\n    // how many gids are there on this machine\n    // and can they be compressed into one byte\n    int ngid = 0;\n    for (const auto& gid2out_elem: gid2out) {\n        if (gid2out_elem.second->output_index_ >= 0) {\n            ++ngid;\n        }\n    }\n\n    int ngidmax = nrnmpi_int_allmax(ngid);\n    if (ngidmax > 256) {\n        // do not compress\n        return;\n    }\n    localgid_size_ = sizeof(unsigned char);\n    nrn_use_localgid_ = true;\n\n    // allocate Allgather receive buffer (send is the nrnmpi_myid one)\n    int* rbuf = new int[nrnmpi_numprocs * (ngidmax + 1)];\n    int* sbuf = new int[ngidmax + 1];\n\n    sbuf[0] = ngid;\n    ++sbuf;\n    ngid = 0;\n    // define the local gid and fill with the gids on this machine\n    for (const auto& gid2out_elem: gid2out) {\n        if (gid2out_elem.second->output_index_ >= 0) {\n            gid2out_elem.second->localgid_ = (unsigned char) ngid;\n            sbuf[ngid] = gid2out_elem.second->output_index_;\n            ++ngid;\n        }\n    }\n    --sbuf;\n\n    // exchange everything\n    nrnmpi_int_allgather(sbuf, rbuf, ngidmax + 1);\n    delete[] sbuf;\n    errno = 0;\n\n    // create the maps\n    // there is a lot of potential for efficiency here. i.e. 
use of\n    // perfect hash functions, or even simple Vectors.\n    localmaps.clear();\n    localmaps.resize(nrnmpi_numprocs);\n\n    // fill in the maps\n    for (int i = 0; i < nrnmpi_numprocs; ++i)\n        if (i != nrnmpi_myid) {\n            sbuf = rbuf + i * (ngidmax + 1);\n            ngid = *(sbuf++);\n            for (int k = 0; k < ngid; ++k) {\n                auto gid2in_it = gid2in.find(int(sbuf[k]));\n                if (gid2in_it != gid2in.end()) {\n                    localmaps[i][k] = gid2in_it->second;\n                }\n            }\n        }\n\n    // cleanup\n    delete[] rbuf;\n}\n\n#endif  // NRNMPI\n\n// may stimulate a gid for a cell not owned by this cpu. This allows\n// us to run single cells or subnets and stimulate exactly according to\n// their input in a full parallel net simulation.\n// For some purposes, it may be useful to simulate a spike from a\n// cell that does exist and would normally send its own spike, eg.\n// recurrent stimulation. This can be useful in debugging where the\n// spike raster comes from another implementation and one wants to\n// get complete control of all input spikes without the confounding\n// effects of output spikes from the simulated cells. 
In this case\n// set the third arg to 1 and set the output cell thresholds very\n// high so that they do not themselves generate spikes.\n// Can only be called by thread 0 because of the ps->send.\nvoid nrn_fake_fire(int gid, double spiketime, int fake_out) {\n    auto gid2in_it = gid2in.find(gid);\n    if (gid2in_it != gid2in.end()) {\n        InputPreSyn* psi = gid2in_it->second;\n        assert(psi);\n        // printf(\"nrn_fake_fire %d %g\\n\", gid, spiketime);\n        psi->send(spiketime, net_cvode_instance, nrn_threads);\n    } else if (fake_out) {\n        std::map<int, PreSyn*>::iterator gid2out_it;\n        gid2out_it = gid2out.find(gid);\n        if (gid2out_it != gid2out.end()) {\n            PreSyn* ps = gid2out_it->second;\n            assert(ps);\n            // printf(\"nrn_fake_fire fake_out %d %g\\n\", gid, spiketime);\n            ps->send(spiketime, net_cvode_instance, nrn_threads);\n        }\n    }\n}\n\nstatic int timeout_ = 0;\nint nrn_set_timeout(int timeout) {\n    int tt = timeout_;\n    timeout_ = timeout;\n    return tt;\n}\n\nvoid BBS_netpar_solve(double tstop) {\n    double time = nrn_wtime();\n\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        tstopunset;\n        double mt = dt;\n        double md = mindelay_ - 1e-10;\n        if (md < mt) {\n            if (nrnmpi_myid == 0) {\n                hoc_execerror(\"mindelay is 0\", \"(or less than dt for fixed step method)\");\n            } else {\n                return;\n            }\n        }\n\n        nrn_timeout(timeout_);\n        nrn_multithread_job(interthread_enqueue);\n        ncs2nrn_integrate(tstop * (1. 
+ 1e-11));\n        nrn_spike_exchange(nrn_threads);\n        nrn_timeout(0);\n        if (!npe_.empty()) {\n            npe_[0].wx_ = npe_[0].ws_ = 0.;\n        }\n        // printf(\"%d netpar_solve exit t=%g tstop=%g mindelay_=%g\\n\",nrnmpi_myid, t, tstop,\n        // mindelay_);\n        nrnmpi_barrier();\n    } else\n#endif\n    {\n        ncs2nrn_integrate(tstop);\n    }\n    tstopunset;\n\n    if (nrnmpi_myid == 0 && !corenrn_param.is_quiet()) {\n        printf(\"\\nSolver Time : %g\\n\", nrn_wtime() - time);\n    }\n}\n\ndouble set_mindelay(double maxdelay) {\n    double mindelay = maxdelay;\n    last_maxstep_arg_ = maxdelay;\n\n    // if all==1 then minimum delay of all NetCon no matter the source.\n    // except if src in same thread as NetCon\n    int all = (nrn_nthread > 1);\n    // minimum delay of all NetCon having an InputPreSyn source\n\n    /** we have removed nt_ from PreSyn. Build local map of PreSyn\n     *  and NrnThread which will be used to find out if src in same thread as NetCon */\n    std::map<PreSyn*, NrnThread*> presynmap;\n\n    for (int ith = 0; ith < nrn_nthread; ++ith) {\n        NrnThread& nt = nrn_threads[ith];\n        for (int i = 0; i < nt.n_presyn; ++i) {\n            presynmap[nt.presyns + i] = nrn_threads + ith;\n        }\n    }\n\n    for (int ith = 0; ith < nrn_nthread; ++ith) {\n        NrnThread& nt = nrn_threads[ith];\n        // if single thread or file transfer then definitely empty.\n        std::vector<int>& negsrcgid_tid = nrnthreads_netcon_negsrcgid_tid[ith];\n        size_t i_tid = 0;\n        for (int i = 0; i < nt.n_netcon; ++i) {\n            NetCon* nc = nt.netcons + i;\n            bool chk = false;  // ignore nc.delay_\n            int gid = nrnthreads_netcon_srcgid[ith][i];\n            int tid = ith;\n            if (!negsrcgid_tid.empty() && gid < -1) {\n                tid = negsrcgid_tid[i_tid++];\n            }\n            PreSyn* ps;\n            InputPreSyn* psi;\n            
netpar_tid_gid2ps(tid, gid, &ps, &psi);\n            if (psi) {\n                chk = true;\n            } else if (all) {\n                chk = true;\n                // but ignore if src in same thread as NetCon\n                if (ps && presynmap[ps] == &nt) {\n                    chk = false;\n                }\n            }\n            if (chk && nc->delay_ < mindelay) {\n                mindelay = nc->delay_;\n            }\n        }\n    }\n\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        active_ = true;\n        if (use_compress_) {\n            if (mindelay / dt > 255) {\n                mindelay = 255 * dt;\n            }\n        }\n\n        // printf(\"%d netpar_mindelay local %g now calling nrnmpi_mindelay\\n\", nrnmpi_myid,\n        // mindelay);\n        //\tdouble st = time();\n        mindelay_ = nrnmpi_dbl_allmin(mindelay);\n        //\tadd_wait_time(st);\n        // printf(\"%d local min=%g  global min=%g\\n\", nrnmpi_myid, mindelay, mindelay_);\n        errno = 0;\n    } else\n#endif  // NRNMPI\n    {\n        mindelay_ = mindelay;\n    }\n    return mindelay_;\n}\n\n/*  08-Nov-2010\nThe workhorse for spike exchange on up to 10K machines is MPI_Allgather\nbut as the number of machines becomes far greater than the fanout per\ncell we have been exploring a class of exchange methods called multisend\nwhere the spikes only go to those machines that need them and there is\noverlap between communication and computation.  The number of variants of\nmultisend has grown so that some method selection function is needed\nthat makes sense.\n\nThe situation that needs to be captured by xchng_meth is\n\nAllgather\nmultisend implemented as MPI_ISend\nmultisend DCMF (only for Blue Gene/P)\nmultisend record_replay (only for Blue Gene/P with recordreplay_v1r4m2.patch)\n\nNote that Allgather allows spike compression and an allgather spike buffer\n with size chosen at setup time.  
All methods allow bin queueing.\n\nAll the multisend methods should allow two phase multisend.\n\nNote that, in principle, MPI_ISend allows the source to send the index\n of the target PreSyn to avoid a hash table lookup (even with a two phase\n variant)\n\nRecordReplay should be best on the BG/P. The whole point is to make the\nspike transfer initiation as lowcost as possible since that is what causes\nmost load imbalance. I.e. since 10K more spikes arrive than are sent, spikes\nreceived per processor per interval are much more statistically\nbalanced than spikes sent per processor per interval. And presently\nDCMF multisend injects 10000 messages per spike into the network which\nis quite expensive. record replay avoids this overhead and the idea of\ntwo phase multisend distributes the injection.\n*/\n\nint nrnmpi_spike_compress(int nspike, bool gid_compress, int xchng_meth) {\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n#if NRN_MULTISEND\n        if (xchng_meth > 0) {\n            use_multisend_ = 1;\n            return 0;\n        }\n#endif\n        nrn_assert(xchng_meth == 0);\n        if (nspike >= 0) {\n            ag_send_nspike = 0;\n            if (spikeout_fixed) {\n                free(spikeout_fixed);\n                spikeout_fixed = nullptr;\n            }\n            if (spikein_fixed) {\n                free(spikein_fixed);\n                spikein_fixed = nullptr;\n            }\n            if (spfixin_ovfl_) {\n                free(spfixin_ovfl_);\n                spfixin_ovfl_ = nullptr;\n            }\n            localmaps.clear();\n        }\n        if (nspike == 0) {  // turn off\n            use_compress_ = false;\n            nrn_use_localgid_ = false;\n        } else if (nspike > 0) {  // turn on\n            use_compress_ = true;\n            ag_send_nspike = nspike;\n            nrn_use_localgid_ = false;\n            if (gid_compress) {\n                // we can only do this after everything is set up\n                
mk_localgid_rep();\n                if (!nrn_use_localgid_ && nrnmpi_myid == 0) {\n                    printf(\n                        \"Notice: gid compression did not succeed. Probably more than 255 cells on \"\n                        \"one \"\n                        \"cpu.\\n\");\n                }\n            }\n            if (!nrn_use_localgid_) {\n                localgid_size_ = sizeof(unsigned int);\n            }\n            ag_send_size = 2 + ag_send_nspike * (1 + localgid_size_);\n            spfixout_capacity_ = ag_send_size + 50 * (1 + localgid_size_);\n            spikeout_fixed = (unsigned char*) emalloc(spfixout_capacity_);\n            spikein_fixed = (unsigned char*) emalloc(nrnmpi_numprocs * ag_send_size);\n            ovfl_capacity = 100;\n            spfixin_ovfl_ = (unsigned char*) emalloc(ovfl_capacity * (1 + localgid_size_));\n        }\n        return ag_send_nspike;\n    } else\n#endif\n    {\n        return 0;\n    }\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/netpar.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include \"coreneuron/network/partrans.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\n\nnamespace coreneuron {\n\nextern void nrn_spike_exchange_init(void);\nextern void nrn_spike_exchange(NrnThread* nt);\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/partrans.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/mpi/core/nrnmpi.hpp\"\n#include \"coreneuron/network/partrans.hpp\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n\n// This is the computational code for src->target transfer (e.g. gap junction)\n// simulation.\n// The setup code is in partrans_setup.cpp\n\nnamespace coreneuron {\nbool nrn_have_gaps;\n\nusing namespace nrn_partrans;\n\nTransferThreadData* nrn_partrans::transfer_thread_data_;\n\n// MPI_Alltoallv buffer info\ndouble* nrn_partrans::insrc_buf_;   // Receive buffer for gap voltages\ndouble* nrn_partrans::outsrc_buf_;  // Send buffer for gap voltages\nint* nrn_partrans::insrccnt_;\nint* nrn_partrans::insrcdspl_;\nint* nrn_partrans::outsrccnt_;\nint* nrn_partrans::outsrcdspl_;\n\nvoid nrnmpi_v_transfer() {\n    // copy source values to outsrc_buf_ and mpi transfer to insrc_buf\n\n    // note that same source value (usually voltage) may get copied to\n    // several locations in outsrc_buf\n\n    // gather the source values. 
can be done in parallel\n    for (int tid = 0; tid < nrn_nthread; ++tid) {\n        auto& ttd = transfer_thread_data_[tid];\n        auto* nt = &nrn_threads[tid];\n        int n = int(ttd.outsrc_indices.size());\n        if (n == 0) {\n            continue;\n        }\n        double* src_data = nt->_data;\n        int* src_indices = ttd.src_indices.data();\n\n        // gather sources on gpu and copy to cpu, cpu scatters to outsrc_buf\n        double* src_gather = ttd.src_gather.data();\n        size_t n_src_gather = ttd.src_gather.size();\n\n        nrn_pragma_acc(parallel loop present(src_indices [0:n_src_gather],\n                                             src_data [0:nt->_ndata],\n                                             src_gather [0:n_src_gather]) if (nt->compute_gpu)\n                           async(nt->stream_id))\n        nrn_pragma_omp(target teams distribute parallel for simd if(nt->compute_gpu))\n        for (std::size_t i = 0; i < n_src_gather; ++i) {\n            src_gather[i] = src_data[src_indices[i]];\n        }\n        nrn_pragma_acc(update host(src_gather [0:n_src_gather]) if (nt->compute_gpu)\n                           async(nt->stream_id))\n        nrn_pragma_omp(target update from(src_gather [0:n_src_gather]) if (nt->compute_gpu))\n    }\n\n    // copy gathered source values to outsrc_buf_\n    bool compute_gpu = false;\n    for (int tid = 0; tid < nrn_nthread; ++tid) {\n        if (nrn_threads[tid].compute_gpu) {\n            compute_gpu = true;\n            nrn_pragma_acc(wait(nrn_threads[tid].stream_id))\n        }\n        TransferThreadData& ttd = transfer_thread_data_[tid];\n        size_t n_outsrc_indices = ttd.outsrc_indices.size();\n        int* outsrc_indices = ttd.outsrc_indices.data();\n        double* src_gather = ttd.src_gather.data();\n        int* src_gather_indices = ttd.gather2outsrc_indices.data();\n        for (size_t i = 0; i < n_outsrc_indices; ++i) {\n            outsrc_buf_[outsrc_indices[i]] = 
src_gather[src_gather_indices[i]];\n        }\n    }\n    static_cast<void>(compute_gpu);\n\n    // transfer\n    int n_insrc_buf = insrcdspl_[nrnmpi_numprocs];\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {  // otherwise insrc_buf_ == outsrc_buf_\n        nrnmpi_barrier();\n        nrnmpi_dbl_alltoallv(\n            outsrc_buf_, outsrccnt_, outsrcdspl_, insrc_buf_, insrccnt_, insrcdspl_);\n    } else\n#endif\n    {  // Use the multiprocess code even for one process to aid debugging\n        // For nrnmpi_numprocs == 1, insrc_buf_ and outsrc_buf_ are same size.\n        for (int i = 0; i < n_insrc_buf; ++i) {\n            insrc_buf_[i] = outsrc_buf_[i];\n        }\n    }\n\n    // insrc_buf_ will get copied to targets via nrnthread_v_transfer\n    nrn_pragma_acc(update device(insrc_buf_ [0:n_insrc_buf]) if (compute_gpu))\n    nrn_pragma_omp(target update to(insrc_buf_ [0:n_insrc_buf]) if (compute_gpu))\n}\n\nvoid nrnthread_v_transfer(NrnThread* _nt) {\n    // Copy insrc_buf_ values to the target locations. 
(An insrc_buf_ value\n    // may be copied to several target locations.\n    TransferThreadData& ttd = transfer_thread_data_[_nt->id];\n    size_t ntar = ttd.tar_indices.size();\n    int* tar_indices = ttd.tar_indices.data();\n    int* insrc_indices = ttd.insrc_indices.data();\n    double* tar_data = _nt->_data;\n    // last element in the displacement vector gives total length\n#if defined(CORENEURON_ENABLE_GPU) && !defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENACC)\n    int n_insrc_buf = insrcdspl_[nrnmpi_numprocs];\n    int ndata = _nt->_ndata;\n#endif\n    nrn_pragma_acc(parallel loop copyin(tar_indices [0:ntar])\n                       present(insrc_indices [0:ntar],\n                               tar_data [0:ndata],\n                               insrc_buf_ [0:n_insrc_buf]) if (_nt->compute_gpu)\n                           async(_nt->stream_id))\n    nrn_pragma_omp(target teams distribute parallel for simd map(to: tar_indices[0:ntar]) if(_nt->compute_gpu))\n    for (size_t i = 0; i < ntar; ++i) {\n        tar_data[tar_indices[i]] = insrc_buf_[insrc_indices[i]];\n    }\n}\n\nvoid nrn_partrans::copy_gap_indices_to_device() {\n    // Ensure index vectors, src_gather, and insrc_buf_ are on the gpu.\n    if (insrcdspl_) {\n        // TODO: we don't actually need to copy here, just allocate + associate\n        // storage on the device\n        cnrn_target_copyin(insrc_buf_, insrcdspl_[nrnmpi_numprocs]);\n    }\n    for (int tid = 0; tid < nrn_nthread; ++tid) {\n        const NrnThread* nt = nrn_threads + tid;\n        if (!nt->compute_gpu) {\n            continue;\n        }\n\n        const TransferThreadData& ttd = transfer_thread_data_[tid];\n\n        if (!ttd.src_indices.empty()) {\n            cnrn_target_copyin(ttd.src_indices.data(), ttd.src_indices.size());\n            // TODO: we don't actually need to copy here, just allocate +\n            // associate storage on the device.\n            cnrn_target_copyin(ttd.src_gather.data(), 
ttd.src_gather.size());\n        }\n\n        if (ttd.insrc_indices.size()) {\n            cnrn_target_copyin(ttd.insrc_indices.data(), ttd.insrc_indices.size());\n        }\n    }\n}\n\nvoid nrn_partrans::delete_gap_indices_from_device() {\n    if (insrcdspl_) {\n        int n_insrc_buf = insrcdspl_[nrnmpi_numprocs];\n        cnrn_target_delete(insrc_buf_, n_insrc_buf);\n    }\n    for (int tid = 0; tid < nrn_nthread; ++tid) {\n        const NrnThread* nt = nrn_threads + tid;\n        if (!nt->compute_gpu) {\n            continue;\n        }\n\n        TransferThreadData& ttd = transfer_thread_data_[tid];\n\n        if (!ttd.src_indices.empty()) {\n            cnrn_target_delete(ttd.src_indices.data(), ttd.src_indices.size());\n            cnrn_target_delete(ttd.src_gather.data(), ttd.src_gather.size());\n        }\n\n        if (!ttd.insrc_indices.empty()) {\n            cnrn_target_delete(ttd.insrc_indices.data(), ttd.insrc_indices.size());\n        }\n    }\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/partrans.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include \"coreneuron/sim/multicore.hpp\"\n\n#ifndef NRNLONGSGID\n#define NRNLONGSGID 0\n#endif\n\n#if NRNLONGSGID\nusing sgid_t = int64_t;\n#else\nusing sgid_t = int;\n#endif\n\nnamespace coreneuron {\nstruct Memb_list;\n\nextern bool nrn_have_gaps;\nextern void nrnmpi_v_transfer();\nextern void nrnthread_v_transfer(NrnThread*);\n\nnamespace nrn_partrans {\n\n/** The basic problem is to copy sources to targets.\n *  It may be the case that a source gets copied to several targets.\n *  Sources and targets are a set of indices in NrnThread.data.\n *  A copy may be intrathread, interthread, interprocess.\n *  Copies happen every time step so efficiency is desirable.\n *  SetupTransferInfo gives us the source and target (sid, type, index) triples\n *  for a thread and all the global threads define what gets copied where.\n *  Need to process that info into TransferThreadData for each thread and\n *  the interprocessor mpi buffers insrc_buf_ and outsrc_buf_ transferred with\n *  MPI_Alltoallv, hopefully with a more or less optimal ordering.\n *  The compute strategy is: 1) Each thread copies its NrnThread.data source\n *  items to outsrc_buf_. 2) MPI_Alltoallv transfers outsrc_buf_ to insrc_buf_.\n *  3) Each thread copies insrc_buf_ values to NrnThread.data target.\n *\n *  Optimal ordering is probably beyond our reach but a few considerations\n *  may be useful. The typical use is for gap junctions where only voltage\n *  is transferred and all instances of the HalfGap Point_process receive a\n *  voltage. Two situations are common. Voltage transfer is sparse and one\n *  to one, i.e. many compartments do not have gap junctions, and those that do\n *  have only one. 
The other situation is that all compartments have gap\n *  junctions (e.g. syncytium of single compartment cells in the heart) and\n *  the voltage needs to be transferred to all neighboring cells (e.g. 6-18\n *  cells can be neighbors to the central cell). So on the target side, it\n *  might be good to copy to the target in target index order from the\n *  insrc_buf_. And on the source side, it is certainly simple to scatter\n *  to the outsrc_buf_ in NrnThread.data order.  Note that one expects a wide\n *  scatter to the outsrc_buf_ and also a wide scatter within the insrc_buf_.\n **/\n\n/*\n * In partrans.cpp: nrnmpi_v_transfer\n *   Copy NrnThread.data to outsrc_buf_ for all threads via\n *     gpu: gather src_gather[i] = NrnThread._data[src_indices[i]];\n *     gpu to host src_gather\n *     cpu: outsrc_buf_[outsrc_indices[i]] = src_gather[gather2outsrc_indices[i]];\n *\n *   MPI_Alltoallv outsrc_buf_ to insrc_buf_\n *\n *   host to gpu insrc_buf_\n *\n * In partrans.cpp: nrnthread_v_transfer\n *   insrc_buf_ to NrnThread._data via\n *   NrnThread.data[tar_indices[i]] = insrc_buf_[insrc_indices[i]];\n *     where tar_indices depends on layout, type, etc.\n */\n\nstruct TransferThreadData {\n    std::vector<int> src_indices;            // indices into NrnThread._data\n    std::vector<double> src_gather;          // copy of NrnThread._data[src_indices]\n    std::vector<int> gather2outsrc_indices;  // ix of src_gather that send into outsrc_indices\n    std::vector<int> outsrc_indices;         // ix of outsrc_buf that receive src_gather values\n\n    std::vector<int> insrc_indices;  // insrc_buf_ indices copied to ...\n    std::vector<int> tar_indices;    // indices of NrnThread.data.\n};\nextern TransferThreadData* transfer_thread_data_; /* array for threads */\n\n}  // namespace nrn_partrans\n}  // namespace coreneuron\n\n// For direct transfer,\n// must be same as corresponding struct SetupTransferInfo in NEURON\nstruct SetupTransferInfo {\n    
std::vector<sgid_t> src_sid;\n    std::vector<int> src_type;\n    std::vector<int> src_index;\n    std::vector<sgid_t> tar_sid;\n    std::vector<int> tar_type;\n    std::vector<int> tar_index;\n};\n\nnamespace coreneuron {\nnamespace nrn_partrans {\n\nextern SetupTransferInfo* setup_info_; /* array for threads exists only during setup*/\n\nextern void gap_mpi_setup(int ngroup);\nextern void gap_data_indices_setup(NrnThread* nt);\nextern void copy_gap_indices_to_device();\nextern void delete_gap_indices_from_device();\nextern void gap_cleanup();\n\nextern double* insrc_buf_;   // Receive buffer for gap voltages\nextern double* outsrc_buf_;  // Send buffer for gap voltages\nextern int *insrccnt_, *insrcdspl_, *outsrccnt_, *outsrcdspl_;\n}  // namespace nrn_partrans\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/partrans_setup.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <map>\n#include <vector>\n\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/mpi/core/nrnmpi.hpp\"\n#include \"coreneuron/network/partrans.hpp\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n\nnamespace coreneuron {\nusing namespace coreneuron::nrn_partrans;\n\nSetupTransferInfo* nrn_partrans::setup_info_;\n\nclass SidInfo {\n  public:\n    std::vector<int> tids_;\n    std::vector<int> indices_;\n};\n\n}  // namespace coreneuron\n#if NRNLONGSGID\n#define sgid_alltoallv nrnmpi_long_alltoallv\n#else\n#define sgid_alltoallv nrnmpi_int_alltoallv\n#endif\n\n#define HAVEWANT_t         sgid_t\n#define HAVEWANT_alltoallv sgid_alltoallv\n#define HAVEWANT2Int       std::map<sgid_t, int>\n#include \"coreneuron/network/have2want.h\"\n\nnamespace coreneuron {\nusing namespace coreneuron::nrn_partrans;\n\nvoid nrn_partrans::gap_mpi_setup(int ngroup) {\n    // printf(\"%d gap_mpi_setup ngroup=%d\\n\", nrnmpi_myid, ngroup);\n\n    // count total_nsrc, total_ntar and allocate.\n    // Possible either or both are 0 on this process.\n    size_t total_nsrc = 0, total_ntar = 0;\n    for (int tid = 0; tid < ngroup; ++tid) {\n        auto& si = setup_info_[tid];\n        total_nsrc += si.src_sid.size();\n        total_ntar += si.tar_sid.size();\n    }\n\n    // have and want arrays (add 1 to guarantee new ... is an array.)\n    sgid_t* have = new sgid_t[total_nsrc + 1];\n    sgid_t* want = new sgid_t[total_ntar + 1];\n\n    // map from source sid to (tid, index), ie.  
NrnThread[tid]._data[index].\n    // and target sid to lists of (tid, index) for memb_list\n    // also count the map sizes and fill have and want arrays\n    std::map<sgid_t, SidInfo> src2info;\n    std::map<sgid_t, SidInfo> tar2info;\n\n    int src2info_size = 0, tar2info_size = 0;  // number of unique sids\n    for (int tid = 0; tid < ngroup; ++tid) {\n        auto& si = setup_info_[tid];\n        // Sgid has unique source.\n\n        for (size_t i = 0; i < si.src_sid.size(); ++i) {\n            sgid_t sid = si.src_sid[i];\n            SidInfo sidinfo;\n            sidinfo.tids_.push_back(tid);\n            sidinfo.indices_.push_back(i);\n            src2info[sid] = sidinfo;\n            have[src2info_size] = sid;\n            src2info_size++;\n        }\n        // Possibly many targets of same sid\n        // Only want unique sids. From each, can obtain all its targets.\n        for (size_t i = 0; i < si.tar_sid.size(); ++i) {\n            sgid_t sid = si.tar_sid[i];\n            if (tar2info.find(sid) == tar2info.end()) {\n                tar2info[sid] = SidInfo();\n                want[tar2info_size] = sid;\n                tar2info_size++;\n            }\n            SidInfo& sidinfo = tar2info[sid];\n            sidinfo.tids_.push_back(tid);\n            sidinfo.indices_.push_back(i);\n        }\n    }\n\n    // 2) Call the have_to_want function.\n    sgid_t* send_to_want;\n    sgid_t* recv_from_have;\n\n    have_to_want(have,\n                 src2info_size,\n                 want,\n                 tar2info_size,\n                 send_to_want,\n                 outsrccnt_,\n                 outsrcdspl_,\n                 recv_from_have,\n                 insrccnt_,\n                 insrcdspl_,\n                 default_rendezvous);\n\n    int nhost = nrnmpi_numprocs;\n\n    // sanity check. 
all the sgids we are asked to send, we actually have\n    for (int i = 0; i < outsrcdspl_[nhost]; ++i) {\n        sgid_t sgid = send_to_want[i];\n        assert(src2info.find(sgid) != src2info.end());\n    }\n\n    // sanity check. all the sgids we receive, we actually need.\n    for (int i = 0; i < insrcdspl_[nhost]; ++i) {\n        sgid_t sgid = recv_from_have[i];\n        assert(tar2info.find(sgid) != tar2info.end());\n    }\n\n#if CORENRN_DEBUG\n    printf(\"%d mpi outsrccnt_, outsrcdspl_, insrccnt, insrcdspl_\\n\", nrnmpi_myid);\n    for (int i = 0; i < nrnmpi_numprocs; ++i) {\n        printf(\"%d : %d %d %d %d\\n\",\n               nrnmpi_myid,\n               outsrccnt_[i],\n               outsrcdspl_[i],\n               insrccnt_[i],\n               insrcdspl_[i]);\n    }\n#endif\n\n    // clean up a little\n    delete[] have;\n    delete[] want;\n\n    insrc_buf_ = new double[insrcdspl_[nhost]];\n    outsrc_buf_ = new double[outsrcdspl_[nhost]];\n\n    // for i: src_gather[i] = NrnThread._data[src_indices[i]]\n    // for j: outsrc_buf[outsrc_indices[j]] = src_gather[gather2outsrc_indices[j]]\n    // src_indices point into NrnThread._data\n    // Many outsrc_indices elements can point to the same src_gather element\n    // but only if an sgid src datum is destined for multiple ranks.\n    for (int i = 0; i < outsrcdspl_[nhost]; ++i) {\n        sgid_t sgid = send_to_want[i];\n        SidInfo& sidinfo = src2info[sgid];\n        // only one item in the lists.\n        int tid = sidinfo.tids_[0];\n        int setup_info_index = sidinfo.indices_[0];\n\n        auto& si = setup_info_[tid];\n        auto& ttd = transfer_thread_data_[tid];\n\n        // Note that src_index points into NrnThread.data, as it has already\n        // been transformed using original src_type and src_index via\n        // stdindex2ptr.\n        // For copying into outsrc_buf from src_gather. 
This is from\n        // NrnThread._data, fixup to \"from src_gather\" below.\n        ttd.gather2outsrc_indices.push_back(si.src_index[setup_info_index]);\n        ttd.outsrc_indices.push_back(i);\n    }\n\n    // Need to know src_gather index given NrnThread._data index\n    // to compute gather2outsrc_indices. And the update outsrc_indices so that\n    // for a given thread\n    // for j: outsrc_buf[outsrc_indices[j]] = src_gather[gather2outsrc_indices[j]]\n    for (int tid = 0; tid < ngroup; ++tid) {\n        auto& ttd = transfer_thread_data_[tid];\n        std::map<int, int> data2gather_indices;\n        for (size_t i = 0; i < ttd.src_indices.size(); ++i) {\n            data2gather_indices[ttd.src_indices[i]] = i;\n        }\n\n        for (size_t i = 0; i < ttd.outsrc_indices.size(); ++i) {\n            ttd.gather2outsrc_indices[i] = data2gather_indices[ttd.gather2outsrc_indices[i]];\n        }\n    }\n\n    // Which insrc_indices point into which NrnThread.data\n    // An sgid occurs at most once in the process recv_from_have.\n    // But it might get distributed to more than one thread and to\n    // several targets in a thread (specified by tar2info)\n    // insrc_indices is parallel to tar_indices and has size ntar of the thread.\n    // insrc_indices[i] is the index into insrc_buf\n    // tar_indices[i] is the index into NrnThread.data\n    // i.e. 
NrnThead._data[tar_indices[i]] = insrc_buf[insrc_indices[i]]\n    for (int i = 0; i < insrcdspl_[nhost]; ++i) {\n        sgid_t sgid = recv_from_have[i];\n        SidInfo& sidinfo = tar2info[sgid];\n        // there may be several items in the lists.\n        for (size_t j = 0; j < sidinfo.tids_.size(); ++j) {\n            int tid = sidinfo.tids_[j];\n            int index = sidinfo.indices_[j];\n\n            transfer_thread_data_[tid].insrc_indices[index] = i;\n        }\n    }\n\n#if CORENRN_DEBUG\n    // things look ok so far?\n    for (int tid = 0; tid < ngroup; ++tid) {\n        SetupTransferInfo& si = setup_info_[tid];\n        nrn_partrans::TransferThreadData& ttd = transfer_thread_data_[tid];\n        for (size_t i = 0; i < si.src_sid.size(); ++i) {\n            printf(\"%d %d src sid=%d v_index=%d %g\\n\",\n                   nrnmpi_myid,\n                   tid,\n                   si.src_sid[i],\n                   ttd.src_indices[i],\n                   nrn_threads[tid]._data[ttd.src_indices[i]]);\n        }\n        for (size_t i = 0; i < ttd.tar_indices.size(); ++i) {\n            printf(\"%d %d src sid=i%zd tar_index=%d %g\\n\",\n                   nrnmpi_myid,\n                   tid,\n                   i,\n                   ttd.tar_indices[i],\n                   nrn_threads[tid]._data[ttd.tar_indices[i]]);\n        }\n    }\n#endif\n\n    delete[] send_to_want;\n    delete[] recv_from_have;\n}\n\n/**\n *  For now, until conceptualization of the ordering is clear,\n *  just replace src setup_info_ indices values with stdindex2ptr determined\n *  index into NrnThread._data\n **/\nvoid nrn_partrans::gap_data_indices_setup(NrnThread* n) {\n    NrnThread& nt = *n;\n    auto& ttd = transfer_thread_data_[nt.id];\n    auto& sti = setup_info_[nt.id];\n\n    ttd.src_gather.resize(sti.src_sid.size());\n    ttd.src_indices.resize(sti.src_sid.size());\n    ttd.insrc_indices.resize(sti.tar_sid.size());\n    ttd.tar_indices.resize(sti.tar_sid.size());\n\n    
// For copying into src_gather from NrnThread._data\n    for (size_t i = 0; i < sti.src_sid.size(); ++i) {\n        double* d = stdindex2ptr(sti.src_type[i], sti.src_index[i], nt);\n        sti.src_index[i] = int(d - nt._data);\n    }\n\n    // For copying into NrnThread._data from insrc_buf.\n    for (size_t i = 0; i < sti.tar_sid.size(); ++i) {\n        double* d = stdindex2ptr(sti.tar_type[i], sti.tar_index[i], nt);\n        // todo : this should be revisited once nt._data will be broken\n        // into mechanism specific data\n        sti.tar_index[i] = int(d - nt._data);\n    }\n\n    // Here we could reorder sti.src_... according to NrnThread._data index\n    // order\n\n    // copy into TransferThreadData\n    ttd.src_indices = sti.src_index;\n    ttd.tar_indices = sti.tar_index;\n}\n\nvoid nrn_partrans::gap_cleanup() {\n    if (transfer_thread_data_) {\n        delete[] transfer_thread_data_;\n        transfer_thread_data_ = nullptr;\n    }\n    if (insrc_buf_) {\n        delete[] insrc_buf_;\n        insrc_buf_ = nullptr;\n        delete[] insrccnt_;\n        insrccnt_ = nullptr;\n        delete[] insrcdspl_;\n        insrcdspl_ = nullptr;\n        delete[] outsrc_buf_;\n        outsrc_buf_ = nullptr;\n        delete[] outsrccnt_;\n        outsrccnt_ = nullptr;\n        delete[] outsrcdspl_;\n        outsrcdspl_ = nullptr;\n    }\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/tnode.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include <vector>\n\n// experiment with ordering strategies for Tree Nodes\nnamespace coreneuron {\nclass TNode;\n\nusing VecTNode = std::vector<TNode*>;\n\n/**\n * \\class TNode\n * \\brief TNode is the tree node that represents the tree of the compartments\n */\nclass TNode {\n  public:\n    TNode(int ix);\n    virtual ~TNode();\n    TNode* parent;\n    VecTNode children;\n    size_t mkhash();  /// Hash algorithm that generates a hash based on the hash of the children and\n                      /// the number of compartments of the children\n    size_t hash;      /// Hash value generated by mkhash\n    size_t treesize;  /// Total number of compartments from the current node and below\n    size_t nodevec_index;   /// index in nodevec that is set in check()\n                            /// In cell permute 2 this is set as Breadth First traversal\n    size_t treenode_order;  /// For cell permute 1 (Interleaved):\n                            /// - This is the id given to the compartments based on a Breadth First\n                            /// access on the tree that is created in the original circuit\n                            /// - This is what makes the cell ordering interleaved\n                            /// For cell permute 2 (Constant Depth):\n                            /// VVVTN: Vector (groups of cells) of vector (levels of this group of\n                            /// cells. Maxsize = maxlevel) of vector of TNodes. This changes 3 times\n                            /// during cell permute 2:\n                            /// 1. According to the sorting of the nodes of each level\n                            /// 2. 
According to the sorting of the parents' treenode_order of the\n                            /// previous ordering\n                            /// 3. According to children and parents data races. Parents and\n                            /// children of the tree are moved by question2() so that threads that\n                            /// exist on the same warp don't have data races when updating the\n                            /// children and parent variables, so that threads have to wait in\n                            /// atomic instructions. If there are any races then those are solved by\n                            /// atomic instructions.\n    size_t level;           /// level of this compartment in the tree\n    size_t cellindex;       /// Cell ID that this compartment belongs to\n    size_t groupindex;      /// Initialized index / groupsize\n    int nodeindex;\n};\n\nsize_t level_from_leaf(VecTNode&);\nsize_t level_from_root(VecTNode&);\n\n/**\n * \\brief Implementation of the advanced interleaving strategy (interleave_permute_type == 2)\n *\n * The main steps are the following:\n * 1. warp_balance function creates balanced groups of cells.\n * 2. The compartments/tree nodes populate the groups vector (VVVTN) based on their groupindex and\n * their level (see level_from_root).\n * 3. 
The analyze() & question2() functions (operating per group) make sure that each cell is still\n * a tree (treenode_order) and that the dependent nodes belong to separate warps.\n */\nvoid group_order2(VecTNode&, size_t groupsize, size_t ncell);\nsize_t dist2child(TNode* nd);\n\n/**\n * \\brief Use of the LPT (Longest Processing Time) algorithm to create balanced groups of cells.\n *\n * Competing objectives are to keep identical cells together and also balance warps.\n *\n * \\param ncell number of cells\n * \\param nodevec vector of compartments from all cells\n * \\return number of warps\n */\nsize_t warp_balance(size_t ncell, VecTNode& nodevec);\n\n#define warpsize 32\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/tqueue.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <cstdio>\n#include <cstdlib>\n#include <cstring>\n#include <cstdarg>\n\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/network/tqueue.hpp\"\n\nnamespace coreneuron {\n// splay tree + bin queue limited to fixed step method\n// for event-sets or priority queues\n// this starts from the sptqueue.cpp file and adds a bin queue\n\n/* Derived from David Brower's c translation of pascal code by\nDouglas Jones.\n*/\n/* The original c code is included from this file but note that instead\nof struct _spblk, we are really using TQItem\n*/\n\nBinQ::BinQ() {\n    nbin_ = 1000;\n    bins_ = new TQItem*[nbin_];\n    for (int i = 0; i < nbin_; ++i) {\n        bins_[i] = 0;\n    }\n    qpt_ = 0;\n    tt_ = 0.;\n}\n\nBinQ::~BinQ() {\n    for (int i = 0; i < nbin_; ++i) {\n        assert(!bins_[i]);\n    }\n    delete[] bins_;\n    vec_bins.clear();\n}\n\nvoid BinQ::resize(int size) {\n    // printf(\"BinQ::resize from %d to %d\\n\", nbin_, size);\n    assert(size >= nbin_);\n    TQItem** bins = new TQItem*[size];\n    for (int i = nbin_; i < size; ++i) {\n        bins[i] = 0;\n    }\n    for (int i = 0, j = qpt_; i < nbin_; ++i, ++j) {\n        if (j >= nbin_) {\n            j = 0;\n        }\n        bins[i] = bins_[j];\n        for (auto q = bins[i]; q; q = q->left_) {\n            q->cnt_ = i;\n        }\n    }\n    delete[] bins_;\n    bins_ = bins;\n    nbin_ = size;\n    qpt_ = 0;\n}\nvoid BinQ::enqueue(double td, TQItem* q) {\n    int idt = (int) ((td - tt_) * rev_dt + 1.e-10);\n    assert(idt >= 0);\n    if (idt >= nbin_) {\n        resize(idt + 1000);\n    }\n    // assert (idt < nbin_);\n    idt += qpt_;\n    if (idt >= nbin_) {\n        idt -= nbin_;\n    }\n    
// printf(\"enqueue: idt=%d qpt=%d nbin_=%d\\n\", idt, qpt_, nbin_);\n    assert(idt < nbin_);\n    q->cnt_ = idt;  // only for iteration\n    q->left_ = bins_[idt];\n    bins_[idt] = q;\n}\nTQItem* BinQ::dequeue() {\n    TQItem* q = bins_[qpt_];\n    if (q) {\n        bins_[qpt_] = q->left_;\n    }\n    return q;\n}\n\nTQItem* BinQ::first() {\n    for (int i = 0; i < nbin_; ++i) {\n        if (bins_[i]) {\n            return bins_[i];\n        }\n    }\n    return 0;\n}\nTQItem* BinQ::next(TQItem* q) {\n    if (q->left_) {\n        return q->left_;\n    }\n    for (int i = q->cnt_ + 1; i < nbin_; ++i) {\n        if (bins_[i]) {\n            return bins_[i];\n        }\n    }\n    return 0;\n}\n\nvoid BinQ::remove(TQItem* q) {\n    TQItem* q1 = bins_[q->cnt_];\n    if (q1 == q) {\n        bins_[q->cnt_] = q->left_;\n        return;\n    }\n    for (TQItem* q2 = q1->left_; q2; q1 = q2, q2 = q2->left_) {\n        if (q2 == q) {\n            q1->left_ = q->left_;\n            return;\n        }\n    }\n}\n\n//#include \"coreneuron/nrniv/sptree.h\"\n\n/*\n *  The following code implements the basic operations on\n *  an event-set or priority-queue implemented using splay trees:\n *\nHines changed to void spinit(SPTREE**) for use with TQueue.\n *  SPTREE *spinit( compare )\tMake a new tree\n *  SPBLK *spenq( n, q )\tInsert n in q after all equal keys.\n *  SPBLK *spdeq( np )\t\tReturn first key under *np, removing it.\n *  void splay( n, q )\t\tn (already in q) becomes the root.\n *  int n = sphead( q )         n is the head item in q (not removed).\n *  spdelete( n, q )\t\tn is removed from q.\n *\n *  In the above, n points to an SPBLK type, while q points to an\n *  SPTREE.\n *\n *  The implementation used here is based on the implementation\n *  which was used in the tests of splay trees reported in:\n *\n *    An Empirical Comparison of Priority-Queue and Event-Set Implementations,\n *\tby Douglas W. Jones, Comm. ACM 29, 4 (Apr. 
1986) 300-311.\n *\n *  The changes made include the addition of the enqprior\n *  operation and the addition of up-links to allow for the splay\n *  operation.  The basic splay tree algorithms were originally\n *  presented in:\n *\n *\tSelf Adjusting Binary Trees,\n *\t\tby D. D. Sleator and R. E. Tarjan,\n *\t\t\tProc. ACM SIGACT Symposium on Theory\n *\t\t\tof Computing (Boston, Apr 1983) 235-245.\n *\n *  The enq and enqprior routines use variations on the\n *  top-down splay operation, while the splay routine is bottom-up.\n *  All are coded for speed.\n *\n *  Written by:\n *    Douglas W. Jones\n *\n *  Translated to C by:\n *    David Brower, daveb@rtech.uucp\n *\n * Thu Oct  6 12:11:33 PDT 1988 (daveb) Fixed spdeq, which was broken\n *\thandling one-node trees.  I botched the pascal translation of\n *\ta VAR parameter.\n */\n\n/*----------------\n *\n * spinit() -- initialize an empty splay tree\n *\n */\nvoid spinit(SPTREE* q) {\n    q->enqcmps = 0;\n    q->root = nullptr;\n}\n\n/*----------------\n *\n *  spenq() -- insert item in a tree.\n *\n *  put n in q after all other nodes with the same key; when this is\n *  done, n will be the root of the splay tree representing q, all nodes\n *  in q with keys less than or equal to that of n will be in the\n *  left subtree, all with greater keys will be in the right subtree;\n *  the tree is split into these subtrees from the top down, with rotations\n *  performed along the way to shorten the left branch of the right subtree\n *  and the right branch of the left subtree\n */\nSPBLK* spenq(SPBLK* n, SPTREE* q) {\n    SPBLK* left;  /* the rightmost node in the left tree */\n    SPBLK* right; /* the leftmost node in the right tree */\n    SPBLK* next;  /* the root of the unsplit part */\n    SPBLK* temp;\n\n    double key;\n\n    n->uplink = nullptr;\n    next = q->root;\n    q->root = n;\n    if (next == nullptr) /* trivial enq */\n    {\n        n->leftlink = nullptr;\n        n->rightlink = nullptr;\n    } 
else /* difficult enq */\n    {\n        key = n->key;\n        left = n;\n        right = n;\n\n        /* n's left and right children will hold the right and left\n       splayed trees resulting from splitting on n->key;\n       note that the children will be reversed! */\n\n        q->enqcmps++;\n        if (STRCMP(next->key, key) > 0)\n            goto two;\n\n    one: /* assert next->key <= key */\n\n        do /* walk to the right in the left tree */\n        {\n            temp = next->rightlink;\n            if (temp == nullptr) {\n                left->rightlink = next;\n                next->uplink = left;\n                right->leftlink = nullptr;\n                goto done; /* job done, entire tree split */\n            }\n\n            q->enqcmps++;\n            if (STRCMP(temp->key, key) > 0) {\n                left->rightlink = next;\n                next->uplink = left;\n                left = next;\n                next = temp;\n                goto two; /* change sides */\n            }\n\n            next->rightlink = temp->leftlink;\n            if (temp->leftlink != nullptr)\n                temp->leftlink->uplink = next;\n            left->rightlink = temp;\n            temp->uplink = left;\n            temp->leftlink = next;\n            next->uplink = temp;\n            left = temp;\n            next = temp->rightlink;\n            if (next == nullptr) {\n                right->leftlink = nullptr;\n                goto done; /* job done, entire tree split */\n            }\n\n            q->enqcmps++;\n\n        } while (STRCMP(next->key, key) <= 0); /* change sides */\n\n    two: /* assert next->key > key */\n\n        do /* walk to the left in the right tree */\n        {\n            temp = next->leftlink;\n            if (temp == nullptr) {\n                right->leftlink = next;\n                next->uplink = right;\n                left->rightlink = nullptr;\n                goto done; /* job done, entire tree split */\n            
}\n\n            q->enqcmps++;\n            if (STRCMP(temp->key, key) <= 0) {\n                right->leftlink = next;\n                next->uplink = right;\n                right = next;\n                next = temp;\n                goto one; /* change sides */\n            }\n            next->leftlink = temp->rightlink;\n            if (temp->rightlink != nullptr)\n                temp->rightlink->uplink = next;\n            right->leftlink = temp;\n            temp->uplink = right;\n            temp->rightlink = next;\n            next->uplink = temp;\n            right = temp;\n            next = temp->leftlink;\n            if (next == nullptr) {\n                left->rightlink = nullptr;\n                goto done; /* job done, entire tree split */\n            }\n\n            q->enqcmps++;\n\n        } while (STRCMP(next->key, key) > 0); /* change sides */\n\n        goto one;\n\n    done: /* split is done, branches of n need reversal */\n\n        temp = n->leftlink;\n        n->leftlink = n->rightlink;\n        n->rightlink = temp;\n    }\n\n    return (n);\n\n} /* spenq */\n\n/*----------------\n *\n *  spdeq() -- return and remove head node from a subtree.\n *\n *  remove and return the head node from the node set; this deletes\n *  (and returns) the leftmost node from q, replacing it with its right\n *  subtree (if there is one); on the way to the leftmost node, rotations\n *  are performed to shorten the left branch of the tree\n */\nSPBLK* spdeq(SPBLK** np) /* pointer to a node pointer */\n\n{\n    SPBLK* deq;        /* one to return */\n    SPBLK* next;       /* the next thing to deal with */\n    SPBLK* left;       /* the left child of next */\n    SPBLK* farleft;    /* the left child of left */\n    SPBLK* farfarleft; /* the left child of farleft */\n\n    if (np == nullptr || *np == nullptr) {\n        deq = nullptr;\n    } else {\n        next = *np;\n        left = next->leftlink;\n        if (left == nullptr) {\n            deq = next;\n  
          *np = next->rightlink;\n\n            if (*np != nullptr)\n                (*np)->uplink = nullptr;\n\n        } else\n            for (;;) /* left is not null */\n            {\n                /* next is not it, left is not nullptr, might be it */\n                farleft = left->leftlink;\n                if (farleft == nullptr) {\n                    deq = left;\n                    next->leftlink = left->rightlink;\n                    if (left->rightlink != nullptr)\n                        left->rightlink->uplink = next;\n                    break;\n                }\n\n                /* next, left are not it, farleft is not nullptr, might be it */\n                farfarleft = farleft->leftlink;\n                if (farfarleft == nullptr) {\n                    deq = farleft;\n                    left->leftlink = farleft->rightlink;\n                    if (farleft->rightlink != nullptr)\n                        farleft->rightlink->uplink = left;\n                    break;\n                }\n\n                /* next, left, farleft are not it, rotate */\n                next->leftlink = farleft;\n                farleft->uplink = next;\n                left->leftlink = farleft->rightlink;\n                if (farleft->rightlink != nullptr)\n                    farleft->rightlink->uplink = left;\n                farleft->rightlink = left;\n                left->uplink = farleft;\n                next = farleft;\n                left = farfarleft;\n            }\n    }\n\n    return (deq);\n\n} /* spdeq */\n\n/*----------------\n *\n *  splay() -- reorganize the tree.\n *\n *  the tree is reorganized so that n is the root of the\n *  splay tree representing q; results are unpredictable if n is not\n *  in q to start with; q is split from n up to the old root, with all\n *  nodes to the left of n ending up in the left subtree, and all nodes\n *  to the right of n ending up in the right subtree; the left branch of\n *  the right subtree and the 
right branch of the left subtree are\n *  shortened in the process\n *\n *  this code assumes that n is not nullptr and is in q; it can sometimes\n *  detect n not in q and complain\n */\n\nvoid splay(SPBLK* n, SPTREE* q) {\n    SPBLK* up;     /* points to the node being dealt with */\n    SPBLK* prev;   /* a descendent of up, already dealt with */\n    SPBLK* upup;   /* the parent of up */\n    SPBLK* upupup; /* the grandparent of up */\n    SPBLK* left;   /* the top of left subtree being built */\n    SPBLK* right;  /* the top of right subtree being built */\n\n    left = n->leftlink;\n    right = n->rightlink;\n    prev = n;\n    up = prev->uplink;\n\n    while (up != nullptr) {\n        /* walk up the tree towards the root, splaying all to the left of\n       n into the left subtree, all to right into the right subtree */\n\n        upup = up->uplink;\n        if (up->leftlink == prev) /* up is to the right of n */\n        {\n            if (upup != nullptr && upup->leftlink == up) /* rotate */\n            {\n                upupup = upup->uplink;\n                upup->leftlink = up->rightlink;\n                if (upup->leftlink != nullptr)\n                    upup->leftlink->uplink = upup;\n                up->rightlink = upup;\n                upup->uplink = up;\n                if (upupup == nullptr)\n                    q->root = up;\n                else if (upupup->leftlink == upup)\n                    upupup->leftlink = up;\n                else\n                    upupup->rightlink = up;\n                up->uplink = upupup;\n                upup = upupup;\n            }\n            up->leftlink = right;\n            if (right != nullptr)\n                right->uplink = up;\n            right = up;\n\n        } else /* up is to the left of n */\n        {\n            if (upup != nullptr && upup->rightlink == up) /* rotate */\n            {\n                upupup = upup->uplink;\n                upup->rightlink = up->leftlink;\n                
if (upup->rightlink != nullptr)\n                    upup->rightlink->uplink = upup;\n                up->leftlink = upup;\n                upup->uplink = up;\n                if (upupup == nullptr)\n                    q->root = up;\n                else if (upupup->rightlink == upup)\n                    upupup->rightlink = up;\n                else\n                    upupup->leftlink = up;\n                up->uplink = upupup;\n                upup = upupup;\n            }\n            up->rightlink = left;\n            if (left != nullptr)\n                left->uplink = up;\n            left = up;\n        }\n        prev = up;\n        up = upup;\n    }\n\n#ifdef DEBUG\n    if (q->root != prev) {\n        /*\tfprintf(stderr, \" *** bug in splay: n not in q *** \" ); */\n        abort();\n    }\n#endif\n\n    n->leftlink = left;\n    n->rightlink = right;\n    if (left != nullptr)\n        left->uplink = n;\n    if (right != nullptr)\n        right->uplink = n;\n    q->root = n;\n    n->uplink = nullptr;\n\n} /* splay */\n\n/*----------------\n *\n * sphead() --  return the \"lowest\" element in the tree.\n *\n *      returns a reference to the head event in the event-set q,\n *      represented as a splay tree; q->root ends up pointing to the head\n *      event, and the old left branch of q is shortened, as if q had\n *      been splayed about the head element; this is done by dequeueing\n *      the head and then making the resulting queue the right son of\n *      the head returned by spdeq; an alternative is provided which\n *      avoids splaying but just searches for and returns a pointer to\n *      the bottom of the left branch\n */\nSPBLK* sphead(SPTREE* q) {\n    SPBLK* x;\n\n    /* splay version, good amortized bound */\n    x = spdeq(&q->root);\n    if (x != nullptr) {\n        x->rightlink = q->root;\n        x->leftlink = nullptr;\n        x->uplink = nullptr;\n        if (q->root != nullptr)\n            q->root->uplink = x;\n    }\n    
q->root = x;\n\n    /* alternative version, bad amortized bound,\n       but faster on the average */\n\n    return (x);\n\n} /* sphead */\n\n/*----------------\n *\n * spdelete() -- Delete node from a tree.\n *\n *\tn is deleted from q; the resulting splay tree has been splayed\n *\taround its new root, which is the successor of n\n *\n */\nvoid spdelete(SPBLK* n, SPTREE* q) {\n    SPBLK* x;\n\n    splay(n, q);\n    x = spdeq(&q->root->rightlink);\n    if (x == nullptr) /* empty right subtree */\n    {\n        q->root = q->root->leftlink;\n        if (q->root)\n            q->root->uplink = nullptr;\n    } else /* non-empty right subtree */\n    {\n        x->uplink = nullptr;\n        x->leftlink = q->root->leftlink;\n        x->rightlink = q->root->rightlink;\n        if (x->leftlink != nullptr)\n            x->leftlink->uplink = x;\n        if (x->rightlink != nullptr)\n            x->rightlink->uplink = x;\n        q->root = x;\n    }\n\n} /* spdelete */\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/network/tqueue.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n/*\n**  SPTREE:  The following type declarations provide the binary tree\n**  representation of event-sets or priority queues needed by splay trees\n**\n**  assumes that data and datb will be provided by the application\n**  to hold all application specific information\n**\n**  assumes that key will be provided by the application, comparable\n**  with the compare function applied to the addresses of two keys.\n*/\n// bin queue for the fixed step method for NetCons and PreSyns. Splay tree\n// for others.\n// fifo for the NetCons and PreSyns with same delay. Splay tree for\n// others (especially SelfEvents).\n// note that most methods below assume a TQItem is in the splay tree\n// For the bin part, only insert_fifo, and remove make sense,\n// The bin part assumes a fixed step method.\n\n#include <cstdio>\n#include <cassert>\n#include <queue>\n#include <vector>\n#include <map>\n#include <utility>\n\nnamespace coreneuron {\n#define STRCMP(a, b) (a - b)\n\nclass TQItem;\n#define SPBLK     TQItem\n#define leftlink  left_\n#define rightlink right_\n#define uplink    parent_\n#define cnt       cnt_\n#define key       t_\n\nstruct SPTREE {\n    SPBLK* root; /* root node */\n\n    /* Statistics, not strictly necessary, but handy for tuning  */\n    int enqcmps; /* compares in spenq */\n};\n\n#define spinit   sptq_spinit\n#define spenq    sptq_spenq\n#define spdeq    sptq_spdeq\n#define splay    sptq_splay\n#define sphead   sptq_sphead\n#define spdelete sptq_spdelete\n\nextern void spinit(SPTREE*);           /* init tree */\nextern SPBLK* spenq(SPBLK*, SPTREE*);  /* insert item into the tree */\nextern SPBLK* spdeq(SPBLK**);          /* return and remove lowest item in subtree 
*/\nextern void splay(SPBLK*, SPTREE*);    /* reorganize tree */\nextern SPBLK* sphead(SPTREE*);         /* return first node in tree */\nextern void spdelete(SPBLK*, SPTREE*); /* delete node from tree */\n\nstruct DiscreteEvent;\nclass TQItem {\n  public:\n    DiscreteEvent* data_ = nullptr;\n    double t_ = 0;\n    TQItem* left_ = nullptr;\n    TQItem* right_ = nullptr;\n    TQItem* parent_ = nullptr;\n    int cnt_ = 0;  // reused: -1 means it is in the splay tree, >=0 gives bin\n};\n\nusing TQPair = std::pair<double, TQItem*>;\n\nstruct less_time {\n    bool operator()(const TQPair& x, const TQPair& y) const {\n        return x.first > y.first;\n    }\n};\n\n// helper class for the TQueue (SplayTBinQueue).\nclass BinQ {\n  public:\n    BinQ();\n    ~BinQ();\n    void enqueue(double tt, TQItem*);\n    void shift(double tt) {\n        assert(!bins_[qpt_]);\n        tt_ = tt;\n        if (++qpt_ >= nbin_) {\n            qpt_ = 0;\n        }\n    }\n    TQItem* top() {\n        return bins_[qpt_];\n    }\n    TQItem* dequeue();\n    double tbin() {\n        return tt_;\n    }\n    // for iteration\n    TQItem* first();\n    TQItem* next(TQItem*);\n    void remove(TQItem*);\n    void resize(int);\n\n  private:\n    double tt_;  // time at beginning of qpt_ interval\n    int nbin_, qpt_;\n    TQItem** bins_;\n    std::vector<std::vector<TQItem*>> vec_bins;\n};\n\nenum container { spltree, pq_que };\n\ntemplate <container C = spltree>\nclass TQueue {\n  public:\n    TQueue();\n    ~TQueue();\n\n    inline TQItem* least() {\n        return least_;\n    }\n    inline TQItem* insert(double t, DiscreteEvent* data);\n    inline TQItem* enqueue_bin(double t, DiscreteEvent* data);\n    inline TQItem* dequeue_bin() {\n        return binq_->dequeue();\n    }\n    inline void shift_bin(double _t_) {\n        ++nshift_;\n        binq_->shift(_t_);\n    }\n    inline TQItem* top() {\n        return binq_->top();\n    }\n\n    inline TQItem* atomic_dq(double til);\n    inline void 
remove(TQItem*);\n    inline void move(TQItem*, double tnew);\n    int nshift_;\n\n    /// Priority queue of vectors for queuing the events. enqueuing for move() and\n    /// move_least_nolock() is not implemented\n    std::priority_queue<TQPair, std::vector<TQPair>, less_time> pq_que_;\n    /// Types of queuing statistics\n    enum qtype { enq = 0, spike, ite, deq };\n\n  private:\n    double least_t_nolock() {\n        if (least_) {\n            return least_->t_;\n        } else {\n            return 1e15;\n        }\n    }\n    void move_least_nolock(double tnew);\n    SPTREE* sptree_;\n\n  public:\n    BinQ* binq_;\n\n  private:\n    TQItem* least_;\n    TQPair make_TQPair(TQItem* p) {\n        return TQPair(p->t_, p);\n    }\n};\n}  // namespace coreneuron\n#include \"coreneuron/network/tqueue.ipp\"\n"
  },
  {
    "path": "coreneuron/network/tqueue.ipp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#ifndef tqueue_ipp_\n#define tqueue_ipp_\n\n#include <cstdio>\n#include <cstdlib>\n#include <cstring>\n#include <cstdarg>\n\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/network/tqueue.hpp\"\n\nnamespace coreneuron {\n// splay tree + bin queue limited to fixed step method\n// for event-sets or priority queues\n// this starts from the sptqueue.cpp file and adds a bin queue\n\n/* Derived from David Brower's c translation of pascal code by\nDouglas Jones.\n*/\n/* The original c code is included from this file but note that instead\nof struct _spblk, we are really using TQItem\n*/\n\ntemplate <container C>\nTQueue<C>::TQueue() {\n    nshift_ = 0;\n    sptree_ = new SPTREE;\n    spinit(sptree_);\n    binq_ = new BinQ;\n    least_ = 0;\n}\n\ntemplate <container C>\nTQueue<C>::~TQueue() {\n    SPBLK *q, *q2;\n    /// Clear the binq\n    for (q = binq_->first(); q; q = q2) {\n        q2 = binq_->next(q);\n        binq_->remove(q);\n        delete q;\n    }\n    delete binq_;\n\n    if (least_) {\n        delete least_;\n        least_ = nullptr;\n    }\n\n    /// Clear the splay tree\n    while ((q = spdeq(&sptree_->root)) != nullptr) {\n        delete q;\n    }\n    delete sptree_;\n\n    /// Clear the priority queue\n    while (pq_que_.size()) {\n        delete pq_que_.top().second;\n        pq_que_.pop();\n    }\n}\n\ntemplate <container C>\nTQItem* TQueue<C>::enqueue_bin(double td, DiscreteEvent* d) {\n    TQItem* i = new TQItem;\n    i->data_ = d;\n    i->t_ = td;\n    binq_->enqueue(td, i);\n    return i;\n}\n\n/// Splay tree priority queue implementation\ntemplate <>\ninline void TQueue<spltree>::move_least_nolock(double tnew) {\n    TQItem* b = least();\n    if 
(b) {\n        b->t_ = tnew;\n        TQItem* nl;\n        nl = sphead(sptree_);\n        if (nl && (tnew > nl->t_)) {\n            least_ = spdeq(&sptree_->root);\n            spenq(b, sptree_);\n        }\n    }\n}\n\n/// STL priority queue implementation\ntemplate <>\ninline void TQueue<pq_que>::move_least_nolock(double tnew) {\n    TQItem* b = least();\n    if (b) {\n        b->t_ = tnew;\n        TQItem* nl;\n        nl = pq_que_.top().second;\n        if (nl && (tnew > nl->t_)) {\n            least_ = nl;\n            pq_que_.pop();\n            pq_que_.push(make_TQPair(b));\n        }\n    }\n}\n\n/// Splay tree priority queue implementation\ntemplate <>\ninline void TQueue<spltree>::move(TQItem* i, double tnew) {\n    if (i == least_) {\n        move_least_nolock(tnew);\n    } else if (tnew < least_->t_) {\n        spdelete(i, sptree_);\n        i->t_ = tnew;\n        spenq(least_, sptree_);\n        least_ = i;\n    } else {\n        spdelete(i, sptree_);\n        i->t_ = tnew;\n        spenq(i, sptree_);\n    }\n}\n\n/// STL priority queue implementation\ntemplate <>\ninline void TQueue<pq_que>::move(TQItem* i, double tnew) {\n    if (i == least_) {\n        move_least_nolock(tnew);\n    } else if (tnew < least_->t_) {\n        TQItem* qmove = new TQItem;\n        qmove->data_ = i->data_;\n        qmove->t_ = tnew;\n        qmove->cnt_ = i->cnt_;\n        i->t_ = -1.;\n        pq_que_.push(make_TQPair(least_));\n        least_ = qmove;\n    } else {\n        TQItem* qmove = new TQItem;\n        qmove->data_ = i->data_;\n        qmove->t_ = tnew;\n        qmove->cnt_ = i->cnt_;\n        i->t_ = -1.;\n        pq_que_.push(make_TQPair(qmove));\n    }\n}\n\n/// Splay tree priority queue implementation\ntemplate <>\ninline TQItem* TQueue<spltree>::insert(double tt, DiscreteEvent* d) {\n    TQItem* i = new TQItem;\n    i->data_ = d;\n    i->t_ = tt;\n    i->cnt_ = -1;\n    if (tt < least_t_nolock()) {\n        if (least_) {\n            /// Probably storing 
both time and event which has the time is redundant, but the event\n            /// is then returned\n            /// to the upper level call stack function. If we were to eliminate i->t_ and i->cnt_\n            /// fields,\n            /// we need to make sure we are not breaking anything.\n            spenq(least_, sptree_);\n        }\n        least_ = i;\n    } else {\n        spenq(i, sptree_);\n    }\n    return i;\n}\n\n/// STL priority queue implementation\ntemplate <>\ninline TQItem* TQueue<pq_que>::insert(double tt, DiscreteEvent* d) {\n    TQItem* i = new TQItem;\n    i->data_ = d;\n    i->t_ = tt;\n    i->cnt_ = -1;\n    if (tt < least_t_nolock()) {\n        if (least_) {\n            /// Probably storing both time and event which has the time is redundant, but the event\n            /// is then returned\n            /// to the upper level call stack function. If we were to eliminate i->t_ and i->cnt_\n            /// fields,\n            /// we need to make sure we are not breaking anything.\n            pq_que_.push(make_TQPair(least_));\n        }\n        least_ = i;\n    } else {\n        pq_que_.push(make_TQPair(i));\n    }\n    return i;\n}\n\n/// Splay tree priority queue implementation\ntemplate <>\ninline void TQueue<spltree>::remove(TQItem* q) {\n    if (q) {\n        if (q == least_) {\n            if (sptree_->root) {\n                least_ = spdeq(&sptree_->root);\n            } else {\n                least_ = nullptr;\n            }\n        } else {\n            spdelete(q, sptree_);\n        }\n        delete q;\n    }\n}\n\n/// STL priority queue implementation\ntemplate <>\ninline void TQueue<pq_que>::remove(TQItem* q) {\n    if (q) {\n        if (q == least_) {\n            if (pq_que_.size()) {\n                least_ = pq_que_.top().second;\n                pq_que_.pop();\n            } else {\n                least_ = nullptr;\n            }\n        } else {\n            q->t_ = -1.;\n        }\n    }\n}\n\n/// Splay tree 
priority queue implementation\ntemplate <>\ninline TQItem* TQueue<spltree>::atomic_dq(double tt) {\n    TQItem* q = nullptr;\n    if (least_ && least_->t_ <= tt) {\n        q = least_;\n        if (sptree_->root) {\n            least_ = spdeq(&sptree_->root);\n        } else {\n            least_ = nullptr;\n        }\n    }\n    return q;\n}\n\n/// STL priority queue implementation\ntemplate <>\ninline TQItem* TQueue<pq_que>::atomic_dq(double tt) {\n    TQItem* q = nullptr;\n    if (least_ && least_->t_ <= tt) {\n        q = least_;\n        //        int qsize = pq_que_.size();\n        //        printf(\"map size: %d\\n\", msize);\n        /// This while loop is to delete events whose times have been moved with the ::move\n        /// function,\n        /// but in fact events were left in the queue since the only function available is pop\n        while (pq_que_.size() && pq_que_.top().second->t_ < 0.) {\n            delete pq_que_.top().second;\n            pq_que_.pop();\n        }\n        if (pq_que_.size()) {\n            least_ = pq_que_.top().second;\n            pq_que_.pop();\n        } else {\n            least_ = nullptr;\n        }\n    }\n    return q;\n}\n}  // namespace coreneuron\n#endif\n"
  },
  {
    "path": "coreneuron/nrnconf.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#pragma once\n\n#include \"coreneuron/config/version_macros.hpp\"\n#include \"coreneuron/utils/offload.hpp\"\n\n#include <cstdio>\n#include <cmath>\n#include <cassert>\n#include <cerrno>\n#include <cstdint>\n\nnamespace coreneuron {\n\n#define NRNBBCORE 1\n\nusing Datum = int;\nusing Pfri = int (*)();\nusing Symbol = char;\n\n#define VEC_A(i)    (_nt->_actual_a[(i)])\n#define VEC_B(i)    (_nt->_actual_b[(i)])\n#define VEC_D(i)    (_nt->_actual_d[(i)])\n#define VEC_RHS(i)  (_nt->_actual_rhs[(i)])\n#define VEC_V(i)    (_nt->_actual_v[(i)])\n#define VEC_AREA(i) (_nt->_actual_area[(i)])\n#define VECTORIZE   1\n\nextern double celsius;\nextern double pi;\nextern int secondorder;\n\nextern double t, dt;\nextern int rev_dt;\nextern bool stoprun;\nextern const char* bbcore_write_version;\n#define tstopbit   (1 << 15)\n#define tstopset   stoprun |= tstopbit\n#define tstopunset stoprun &= (~tstopbit)\n\nextern void* nrn_cacheline_alloc(void** memptr, size_t size);\nextern void* emalloc_align(size_t size, size_t alignment);\nextern void* ecalloc_align(size_t n, size_t size, size_t alignment);\nextern void check_bbcore_write_version(const char*);\n\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/nrniv/nrniv_decl.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include <vector>\n#include <map>\n#include \"coreneuron/network/netcon.hpp\"\nnamespace coreneuron {\n\n/// Mechanism type to be used from stdindex2ptr and nrn_dblpntr2nrncore (in Neuron)\n/// Values of the mechanism types should be negative numbers to avoid any conflict with\n/// mechanism types of Memb_list(>0) or time(0) passed from Neuron\nenum mech_type { voltage = -1, i_membrane_ = -2 };\n\nextern bool cvode_active_;\n/// Vector of maps for negative presyns\nextern std::vector<std::map<int, PreSyn*>> neg_gid2out;\n/// Maps for ouput and input presyns\nextern std::map<int, PreSyn*> gid2out;\nextern std::map<int, InputPreSyn*> gid2in;\n\n/// InputPreSyn.nc_index_ to + InputPreSyn.nc_cnt_ give the NetCon*\nextern std::vector<NetCon*> netcon_in_presyn_order_;\n/// Only for setup vector of netcon source gids and mindelay determination\nextern std::vector<int*> nrnthreads_netcon_srcgid;\n/// Companion to nrnthreads_netcon_srcgid when src gid is negative to allow\n/// determination of the NrnThread of the source PreSyn.\nextern std::vector<std::vector<int>> nrnthreads_netcon_negsrcgid_tid;\n\nextern void mk_mech(const char* path);\nextern void set_globals(const char* path, bool cli_global_seed, int cli_global_seed_value);\nextern void mk_netcvode(void);\nextern void nrn_p_construct(void);\nextern double* stdindex2ptr(int mtype, int index, NrnThread&);\nextern void delete_trajectory_requests(NrnThread&);\nextern void nrn_cleanup();\nextern void nrn_cleanup_ion_map();\nextern void BBS_netpar_solve(double);\nextern void nrn_mkPatternStim(const char* filename, double tstop);\nextern int nrn_extra_thread0_vdata;\nextern void nrn_set_extra_thread0_vdata(void);\nextern 
Point_process* nrn_artcell_instantiate(const char* mechname);\nextern int nrnmpi_spike_compress(int nspike, bool gidcompress, int xchng);\nextern bool nrn_use_bin_queue_;\n\nextern void nrn_outputevent(unsigned char, double);\nextern void ncs2nrn_integrate(double tstop);\n\nextern void handle_forward_skip(double forwardskip, int prcellgid);\n\nextern int nrn_set_timeout(int);\nextern void nrn_fake_fire(int gid, double spiketime, int fake_out);\n\nextern void netpar_tid_gid2ps(int tid, int gid, PreSyn** ps, InputPreSyn** psi);\nextern double set_mindelay(double maxdelay);\n\nextern int nrn_soa_padded_size(int cnt, int layout);\n\nextern int interleave_permute_type;\nextern int cellorder_nwarp;\n\n// Mechanism pdata index values into _actual_v and _actual_area data need to be updated.\nenum Layout { SoA = 0, AoS = 1 };\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/nrnoc/md1redef.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#define v        _v\n#define area     _area\n#define thisnode _thisnode\n#define GC       _GC\n#define EC       _EC\n#define extnode  _extnode\n#define xain     _xain\n#define xbout    _xbout\n#define i        _i\n#define sec      _sec\n\n#undef Memb_list\n#undef nodelist\n#undef nodeindices\n#undef data\n#undef pdata\n#undef prop\n#undef nodecount\n#undef pval\n#undef id\n#undef weights\n#undef weight_index_\n\n#define nodelist      _nodelist\n#define nodeindices   _nodeindices\n#define data          _data\n#define pdata         _pdata\n#define prop          _prop\n#define nodecount     _nodecount\n#define pval          _pval\n#define id            _id\n#define weights       _weights\n#define weight_index_ _weight_index\n"
  },
  {
    "path": "coreneuron/nrnoc/md2redef.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#undef v\n#undef area\n#undef thisnode\n#undef GC\n#undef EC\n#undef extnode\n#undef xain\n#undef xbout\n#undef i\n#undef sec\n\n#undef NrnThread\n#undef Memb_list\n#undef nodelist\n#undef nodeindices\n#undef data\n#undef pdata\n#undef prop\n#undef nodecount\n#undef pval\n#undef weights\n#undef weight_index_\n\n#undef id\n"
  },
  {
    "path": "coreneuron/permute/balance.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n// use LPT algorithm to balance cells so all warps have similar number\n// of compartments.\n// NB: Ideally we'd balance so that warps have similar ncycle. But we do not\n// know how to predict warp quality without an apriori set of cells to\n// fill the warp. For large numbers of cells in a warp,\n// it is a justifiable speculation to presume that there will be very\n// few holes in warp filling. I.e., ncycle = ncompart/warpsize\n\n#include <algorithm>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/network/tnode.hpp\"\n#include \"coreneuron/utils/lpt.hpp\"\n\nnamespace coreneuron {\nint cellorder_nwarp = 0;  // 0 means do not balance\n\n// ordering by warp, then old order\nbool warpcmp(const TNode* a, const TNode* b) {\n    if (a->groupindex < b->groupindex) {\n        return true;\n    } else if (a->groupindex == b->groupindex && a->nodevec_index < b->nodevec_index) {\n        return true;\n    }\n    return false;\n}\n\n// order the ncell nodevec roots for balance and return a displacement\n// vector specifying the contiguous roots for a warp.\n// The return vector should be freed by the caller.\n// On entry, nodevec is ordered so that each cell type is together and\n// largest cells first. 
On exit, nodevec is ordered so that warp i\n// should contain roots nodevec[displ[i]:displ[i+1]]\n\nsize_t warp_balance(size_t ncell, VecTNode& nodevec) {\n    if (ncell == 0) {\n        return 0;\n    }\n\n    if (cellorder_nwarp == 0) {\n        return 0;\n    }\n    size_t nwarp = size_t(cellorder_nwarp);\n    // cannot be more warps than cells\n    nwarp = std::min(ncell, nwarp);\n\n    // cellsize vector and location of types.\n    std::vector<size_t> cellsize(ncell);\n    std::vector<size_t> typedispl;\n    size_t total_compart = 0;\n    typedispl.push_back(0);  // types are already in order\n    for (size_t i = 0; i < ncell; ++i) {\n        cellsize[i] = nodevec[i]->treesize;\n        total_compart += cellsize[i];\n        if (i == 0 || nodevec[i]->hash != nodevec[i - 1]->hash) {\n            typedispl.push_back(typedispl.back() + 1);\n        } else {\n            typedispl.back() += 1;\n        }\n    }\n\n    size_t ideal_compart_per_warp = total_compart / nwarp;\n\n    size_t min_cells_per_warp = 0;\n    for (size_t i = 0, sz = 0; sz < ideal_compart_per_warp; ++i) {\n        ++min_cells_per_warp;\n        sz += cellsize[i];\n    }\n\n    // balance when order is unrestricted (identical cells not together)\n    // i.e. 
pieces are cellsize\n    double best_balance = 0.0;\n    auto inwarp = lpt(nwarp, cellsize, &best_balance);\n    printf(\"best_balance=%g ncell=%zu ntype=%zu nwarp=%zu\\n\",\n           best_balance,\n           ncell,\n           typedispl.size() - 1,\n           nwarp);\n\n    // order the roots for balance\n    for (size_t i = 0; i < ncell; ++i) {\n        TNode* nd = nodevec[i];\n        nd->groupindex = inwarp[i];\n    }\n    std::sort(nodevec.begin(), nodevec.begin() + ncell, warpcmp);\n    for (size_t i = 0; i < nodevec.size(); ++i) {\n        TNode* nd = nodevec[i];\n        for (size_t j = 0; j < nd->children.size(); ++j) {\n            nd->children[j]->groupindex = nd->groupindex;\n        }\n        nd->nodevec_index = i;\n    }\n\n    return nwarp;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/permute/cellorder.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/permute/cellorder.hpp\"\n#include \"coreneuron/network/tnode.hpp\"\n#include \"coreneuron/utils/lpt.hpp\"\n#include \"coreneuron/utils/memory.h\"\n#include \"coreneuron/utils/offload.hpp\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n\n#include \"coreneuron/permute/node_permute.h\"  // for print_quality\n\n#ifdef _OPENACC\n#include <openacc.h>\n#endif\n\n#include <set>\n\nnamespace coreneuron {\nint interleave_permute_type;\nInterleaveInfo* interleave_info;  // nrn_nthread array\n\n\nvoid InterleaveInfo::swap(InterleaveInfo& info) {\n    std::swap(nwarp, info.nwarp);\n    std::swap(nstride, info.nstride);\n\n    std::swap(stridedispl, info.stridedispl);\n    std::swap(stride, info.stride);\n    std::swap(firstnode, info.firstnode);\n    std::swap(lastnode, info.lastnode);\n    std::swap(cellsize, info.cellsize);\n\n    std::swap(nnode, info.nnode);\n    std::swap(ncycle, info.ncycle);\n    std::swap(idle, info.idle);\n    std::swap(cache_access, info.cache_access);\n    std::swap(child_race, info.child_race);\n}\n\nInterleaveInfo::InterleaveInfo(const InterleaveInfo& info) {\n    nwarp = info.nwarp;\n    nstride = info.nstride;\n\n    copy_align_array(stridedispl, info.stridedispl, nwarp + 1);\n    copy_align_array(stride, info.stride, nstride);\n    copy_align_array(firstnode, info.firstnode, nwarp + 1);\n    copy_align_array(lastnode, info.lastnode, nwarp + 1);\n    copy_align_array(cellsize, info.cellsize, nwarp);\n\n    copy_array(nnode, info.nnode, nwarp);\n    copy_array(ncycle, info.ncycle, nwarp);\n    copy_array(idle, 
info.idle, nwarp);\n    copy_array(cache_access, info.cache_access, nwarp);\n    copy_array(child_race, info.child_race, nwarp);\n}\n\nInterleaveInfo& InterleaveInfo::operator=(const InterleaveInfo& info) {\n    // self assignment\n    if (this == &info)\n        return *this;\n\n    InterleaveInfo temp(info);\n\n    this->swap(temp);\n    return *this;\n}\n\nInterleaveInfo::~InterleaveInfo() {\n    if (stride) {\n        free_memory(stride);\n        free_memory(firstnode);\n        free_memory(lastnode);\n        free_memory(cellsize);\n    }\n    if (stridedispl) {\n        free_memory(stridedispl);\n    }\n    if (idle) {\n        delete[] nnode;\n        delete[] ncycle;\n        delete[] idle;\n        delete[] cache_access;\n        delete[] child_race;\n    }\n}\n\nvoid create_interleave_info() {\n    destroy_interleave_info();\n    interleave_info = new InterleaveInfo[nrn_nthread];\n}\n\nvoid destroy_interleave_info() {\n    if (interleave_info) {\n        delete[] interleave_info;\n        interleave_info = nullptr;\n    }\n}\n\n// more precise visualization of the warp quality\n// can be called after admin2\nstatic void print_quality2(int iwarp, InterleaveInfo& ii, int* p) {\n    int pc = (iwarp == 0);  // print warp 0\n    pc = 0;                 // turn off printing\n    int nodebegin = ii.lastnode[iwarp];\n    int* stride = ii.stride + ii.stridedispl[iwarp];\n    int ncycle = ii.cellsize[iwarp];\n\n    int inode = nodebegin;\n\n    size_t nn = 0;  // number of nodes in warp. '.'\n    size_t nx = 0;  // number of idle cores on all cycles. 'X'\n    size_t ncacheline = 0;  // number of parent memory cacheline accesses.\n                            //   assume warpsize is max number in a cacheline so all o\n    size_t ncr = 0;  // number of child race. 
nchild-1 of same parent in same cycle\n\n    for (int icycle = 0; icycle < ncycle; ++icycle) {\n        int s = stride[icycle];\n        int lastp = -2;\n        if (pc)\n            printf(\"  \");\n        std::set<int> crace;  // how many children have same parent in a cycle\n        for (int icore = 0; icore < warpsize; ++icore) {\n            char ch = '.';\n            if (icore < s) {\n                int par = p[inode];\n                if (crace.find(par) != crace.end()) {\n                    ch = 'r';\n                    ++ncr;\n                } else {\n                    crace.insert(par);\n                }\n\n                if (par != lastp + 1) {\n                    ch = (ch == 'r') ? 'R' : 'o';\n                    ++ncacheline;\n                }\n                lastp = p[inode++];\n                ++nn;\n            } else {\n                ch = 'X';\n                ++nx;\n            }\n            if (pc)\n                printf(\"%c\", ch);\n        }\n        if (pc)\n            printf(\"\\n\");\n    }\n\n    ii.nnode[iwarp] = nn;\n    ii.ncycle[iwarp] = size_t(ncycle);\n    ii.idle[iwarp] = nx;\n    ii.cache_access[iwarp] = ncacheline;\n    ii.child_race[iwarp] = ncr;\n    if (pc)\n        printf(\"warp %d:  %ld nodes, %d cycles, %ld idle, %ld cache access, %ld child races\\n\",\n               iwarp,\n               nn,\n               ncycle,\n               nx,\n               ncacheline,\n               ncr);\n}\n\nstatic void print_quality1(int iwarp, InterleaveInfo& ii, int ncell, int* p) {\n    int pc = ((iwarp == 0) || iwarp == (ii.nwarp - 1));  // warp not to skip printing\n    pc = 0;                                              // turn off printing.\n    int* stride = ii.stride;\n    int cellbegin = iwarp * warpsize;\n    int cellend = cellbegin + warpsize;\n    cellend = (cellend < stride[0]) ? 
cellend : stride[0];\n\n    int ncycle = 0;\n    for (int i = cellbegin; i < cellend; ++i) {\n        if (ncycle < ii.cellsize[i]) {\n            ncycle = ii.cellsize[i];\n        }\n    }\n    nrn_assert(ncycle == ii.cellsize[cellend - 1]);\n    nrn_assert(ncycle <= ii.nstride);\n\n    int ncell_in_warp = cellend - cellbegin;\n\n    size_t n = 0;   // number of nodes in warp (not including roots)\n    size_t nx = 0;  // number of idle cores on all cycles. X\n    size_t ncacheline = 0;\n    ;  // number of parent memory cacheline accesses.\n       // assume warpsize is max number in a cachline so\n       // first core has all o\n\n    int inode = ii.firstnode[cellbegin];\n    for (int icycle = 0; icycle < ncycle; ++icycle) {\n        int sbegin = ncell - stride[icycle] - cellbegin;\n        int lastp = -2;\n        if (pc)\n            printf(\"  \");\n        for (int icore = 0; icore < warpsize; ++icore) {\n            char ch = '.';\n            if (icore < ncell_in_warp && icore >= sbegin) {\n                int par = p[inode + icore];\n                if (par != lastp + 1) {\n                    ch = 'o';\n                    ++ncacheline;\n                }\n                lastp = par;\n                ++n;\n            } else {\n                ch = 'X';\n                ++nx;\n            }\n            if (pc)\n                printf(\"%c\", ch);\n        }\n        if (pc)\n            printf(\"\\n\");\n        inode += ii.stride[icycle + 1];\n    }\n\n    ii.nnode[iwarp] = n;\n    ii.ncycle[iwarp] = (size_t) ncycle;\n    ii.idle[iwarp] = nx;\n    ii.cache_access[iwarp] = ncacheline;\n    ii.child_race[iwarp] = 0;\n    if (pc)\n        printf(\"warp %d:  %ld nodes, %d cycles, %ld idle, %ld cache access\\n\",\n               iwarp,\n               n,\n               ncycle,\n               nx,\n               ncacheline);\n}\n\nstatic void warp_balance(int ith, InterleaveInfo& ii) {\n    size_t nwarp = size_t(ii.nwarp);\n    size_t smm[4][3];  // 
sum_min_max see cp below\n    for (size_t j = 0; j < 4; ++j) {\n        smm[j][0] = 0;\n        smm[j][1] = 1000000000;\n        smm[j][2] = 0;\n    }\n    double emax = 0.0, emin = 1.0;\n    for (size_t i = 0; i < nwarp; ++i) {\n        size_t n = ii.nnode[i];\n        double e = double(n) / (n + ii.idle[i]);\n        if (emax < e) {\n            emax = e;\n        }\n        if (emin > e) {\n            emin = e;\n        }\n        size_t s[4] = {n, ii.idle[i], ii.cache_access[i], ii.child_race[i]};\n        for (size_t j = 0; j < 4; ++j) {\n            smm[j][0] += s[j];\n            if (smm[j][1] > s[j]) {\n                smm[j][1] = s[j];\n            }\n            if (smm[j][2] < s[j]) {\n                smm[j][2] = s[j];\n            }\n        }\n    }\n    std::vector<size_t> v(nwarp);\n    for (size_t i = 0; i < nwarp; ++i) {\n        v[i] = ii.ncycle[i];\n    }\n    double bal = load_balance(v);\n#ifdef DEBUG\n    printf(\n        \"thread %d nwarp=%ld  balance=%g  warp_efficiency %g to %g\\n\", ith, nwarp, bal, emin, emax);\n    const char* cp[4] = {\"nodes\", \"idle\", \"ca\", \"cr\"};\n    for (size_t i = 0; i < 4; ++i) {\n        printf(\"  %s=%ld (%ld:%ld)\", cp[i], smm[i][0], smm[i][1], smm[i][2]);\n    }\n    printf(\"\\n\");\n#else\n    (void) bal;  // Remove warning about unused\n#endif\n}\n\nint* interleave_order(int ith, int ncell, int nnode, int* parent) {\n    // return if there are no nodes to permute\n    if (nnode <= 0)\n        return nullptr;\n\n    // ensure parent of root = -1\n    for (int i = 0; i < ncell; ++i) {\n        if (parent[i] == 0) {\n            parent[i] = -1;\n        }\n    }\n\n    int nwarp = 0, nstride = 0, *stride = nullptr, *firstnode = nullptr;\n    int *lastnode = nullptr, *cellsize = nullptr, *stridedispl = nullptr;\n\n    int* order = node_order(\n        ncell, nnode, parent, nwarp, nstride, stride, firstnode, lastnode, cellsize, stridedispl);\n\n    if (interleave_info) {\n        InterleaveInfo& ii = 
interleave_info[ith];\n        ii.nwarp = nwarp;\n        ii.nstride = nstride;\n        ii.stridedispl = stridedispl;\n        ii.stride = stride;\n        ii.firstnode = firstnode;\n        ii.lastnode = lastnode;\n        ii.cellsize = cellsize;\n        if (0 && ith == 0 && interleave_permute_type == 1) {\n            printf(\"ith=%d nstride=%d ncell=%d nnode=%d\\n\", ith, nstride, ncell, nnode);\n            for (int i = 0; i < ncell; ++i) {\n                printf(\"icell=%d cellsize=%d first=%d last=%d\\n\",\n                       i,\n                       cellsize[i],\n                       firstnode[i],\n                       lastnode[i]);\n            }\n            for (int i = 0; i < nstride; ++i) {\n                printf(\"istride=%d stride=%d\\n\", i, stride[i]);\n            }\n        }\n        if (ith == 0) {\n            // needed for print_quality[12] and done once here to save time\n            int* p = new int[nnode];\n            for (int i = 0; i < nnode; ++i) {\n                p[i] = parent[i];\n            }\n            permute_ptr(p, nnode, order);\n            node_permute(p, nnode, order);\n\n            ii.nnode = new size_t[nwarp];\n            ii.ncycle = new size_t[nwarp];\n            ii.idle = new size_t[nwarp];\n            ii.cache_access = new size_t[nwarp];\n            ii.child_race = new size_t[nwarp];\n            for (int i = 0; i < nwarp; ++i) {\n                if (interleave_permute_type == 1) {\n                    print_quality1(i, interleave_info[ith], ncell, p);\n                }\n                if (interleave_permute_type == 2) {\n                    print_quality2(i, interleave_info[ith], p);\n                }\n            }\n            delete[] p;\n            warp_balance(ith, interleave_info[ith]);\n        }\n    }\n\n    return order;\n}\n\n#if INTERLEAVE_DEBUG  // only the cell per core style\nstatic int** cell_indices_debug(NrnThread& nt, InterleaveInfo& ii) {\n    int ncell = nt.ncell;\n    int 
nnode = nt.end;\n    int* parents = nt._v_parent_index;\n\n    // we expect the nodes to be interleave ordered with smallest cell first\n    // establish consistency with ii.\n    // first ncell parents are -1\n    for (int i = 0; i < ncell; ++i) {\n        nrn_assert(parents[i] == -1);\n    }\n    int* sz = new int[ncell];\n    int* cell = new int[nnode];\n    for (int i = 0; i < ncell; ++i) {\n        sz[i] = 0;\n        cell[i] = i;\n    }\n    for (int i = ncell; i < nnode; ++i) {\n        cell[i] = cell[parents[i]];\n        sz[cell[i]] += 1;\n    }\n\n    // cells are in inceasing sz order;\n    for (int i = 1; i < ncell; ++i) {\n        nrn_assert(sz[i - 1] <= sz[i]);\n    }\n    // same as ii.cellsize\n    for (int i = 0; i < ncell; ++i) {\n        nrn_assert(sz[i] == ii.cellsize[i]);\n    }\n\n    int** cellindices = new int*[ncell];\n    for (int i = 0; i < ncell; ++i) {\n        cellindices[i] = new int[sz[i]];\n        sz[i] = 0;  // restart sz counts\n    }\n    for (int i = ncell; i < nnode; ++i) {\n        cellindices[cell[i]][sz[cell[i]]] = i;\n        sz[cell[i]] += 1;\n    }\n    // cellindices first and last same as ii first and last\n    for (int i = 0; i < ncell; ++i) {\n        nrn_assert(cellindices[i][0] == ii.firstnode[i]);\n        nrn_assert(cellindices[i][sz[i] - 1] == ii.lastnode[i]);\n    }\n\n    delete[] sz;\n    delete[] cell;\n\n    return cellindices;\n}\n\nstatic int*** cell_indices_threads;\nvoid mk_cell_indices() {\n    cell_indices_threads = new int**[nrn_nthread];\n    for (int i = 0; i < nrn_nthread; ++i) {\n        NrnThread& nt = nrn_threads[i];\n        if (nt.ncell) {\n            cell_indices_threads[i] = cell_indices_debug(nt, interleave_info[i]);\n        } else {\n            cell_indices_threads[i] = nullptr;\n        }\n    }\n}\n#endif  // INTERLEAVE_DEBUG\n\n#define GPU_V(i)      nt->_actual_v[i]\n#define GPU_A(i)      nt->_actual_a[i]\n#define GPU_B(i)      nt->_actual_b[i]\n#define GPU_D(i)      
nt->_actual_d[i]\n#define GPU_RHS(i)    nt->_actual_rhs[i]\n#define GPU_PARENT(i) nt->_v_parent_index[i]\n\n// How does the interleaved permutation with stride get used in\n// triagularization?\n\n// each cell in parallel regardless of inhomogeneous topology\nstatic void triang_interleaved(NrnThread* nt,\n                               int icell,\n                               int icellsize,\n                               int nstride,\n                               int* stride,\n                               int* lastnode) {\n    int i = lastnode[icell];\n    for (int istride = nstride - 1; istride >= 0; --istride) {\n        if (istride < icellsize) {  // only first icellsize strides matter\n            // what is the index\n            int ip = GPU_PARENT(i);\n#ifndef CORENEURON_ENABLE_GPU\n            nrn_assert(ip >= 0);  // if (ip < 0) return;\n#endif\n            double p = GPU_A(i) / GPU_D(i);\n            GPU_D(ip) -= p * GPU_B(i);\n            GPU_RHS(ip) -= p * GPU_RHS(i);\n            i -= stride[istride];\n        }\n    }\n}\n\n// back substitution?\nstatic void bksub_interleaved(NrnThread* nt,\n                              int icell,\n                              int icellsize,\n                              int /* nstride */,\n                              int* stride,\n                              int* firstnode) {\n    int i = firstnode[icell];\n    GPU_RHS(icell) /= GPU_D(icell);  // the root\n    for (int istride = 0; istride < icellsize; ++istride) {\n        int ip = GPU_PARENT(i);\n#ifndef CORENEURON_ENABLE_GPU\n        nrn_assert(ip >= 0);\n#endif\n        GPU_RHS(i) -= GPU_B(i) * GPU_RHS(ip);\n        GPU_RHS(i) /= GPU_D(i);\n        i += stride[istride + 1];\n    }\n}\n\n// icore ranges [0:warpsize) ; stride[ncycle]\nnrn_pragma_acc(routine vector)\nstatic void triang_interleaved2(NrnThread* nt, int icore, int ncycle, int* stride, int lastnode) {\n    int icycle = ncycle - 1;\n    int istride = stride[icycle];\n    int i = lastnode - 
istride + icore;\n    int ii = i;\n\n    // execute until all tree depths are executed\n    bool has_subtrees_to_compute = true;\n\n    // clang-format off\n    nrn_pragma_acc(loop seq)\n    for (; has_subtrees_to_compute; ) {  // ncycle loop\n        // serial test, gpu does this in parallel\n        nrn_pragma_acc(loop vector)\n        nrn_pragma_omp(loop bind(parallel))\n        for (int icore = 0; icore < warpsize; ++icore) {\n            int i = ii + icore;\n            if (icore < istride) {  // most efficient if istride equal  warpsize\n                // what is the index\n                int ip = GPU_PARENT(i);\n                double p = GPU_A(i) / GPU_D(i);\n                nrn_pragma_acc(atomic update)\n                nrn_pragma_omp(atomic update)\n                GPU_D(ip) -= p * GPU_B(i);\n                nrn_pragma_acc(atomic update)\n                nrn_pragma_omp(atomic update)\n                GPU_RHS(ip) -= p * GPU_RHS(i);\n            }\n        }\n        // if finished with all tree depths then ready to break\n        // (note that break is not allowed in OpenACC)\n        if (icycle == 0) {\n            has_subtrees_to_compute = false;\n            continue;\n        }\n        --icycle;\n        istride = stride[icycle];\n        i -= istride;\n        ii -= istride;\n    }\n}\n\n// icore ranges [0:warpsize) ; stride[ncycle]\nnrn_pragma_acc(routine vector)\nstatic void bksub_interleaved2(NrnThread* nt,\n                               int root,\n                               int lastroot,\n                               int icore,\n                               int ncycle,\n                               int* stride,\n                               int firstnode) {\n    nrn_pragma_acc(loop seq)\n    for (int i = root; i < lastroot; i += 1) {\n        GPU_RHS(i) /= GPU_D(i);  // the root\n    }\n\n    int i = firstnode + icore;\n    int ii = i;\n    nrn_pragma_acc(loop seq)\n    for (int icycle = 0; icycle < ncycle; ++icycle) {\n        int 
istride = stride[icycle];\n        // serial test, gpu does this in parallel\n        nrn_pragma_acc(loop vector)\n        nrn_pragma_omp(loop bind(parallel))\n        for (int icore = 0; icore < warpsize; ++icore) {\n            int i = ii + icore;\n            if (icore < istride) {\n                int ip = GPU_PARENT(i);\n                GPU_RHS(i) -= GPU_B(i) * GPU_RHS(ip);\n                GPU_RHS(i) /= GPU_D(i);\n            }\n            i += istride;\n        }\n        ii += istride;\n    }\n}\n\n/**\n * \\brief Solve Hines matrices/cells with compartment-based granularity.\n *\n * The node ordering/permutation guarantees cell interleaving (as much coalesced memory access as\n * possible) and balanced warps (through the use of the lpt algorithm to define the groups/warps).\n * Every warp deals with a group of cells, therefore multiple compartments (finer level of\n * parallelism).\n */\nvoid solve_interleaved2(int ith) {\n    NrnThread* nt = nrn_threads + ith;\n    InterleaveInfo& ii = interleave_info[ith];\n    int nwarp = ii.nwarp;\n    if (nwarp == 0)\n        return;\n\n    int ncore = nwarp * warpsize;\n\n#ifdef _OPENACC\n    if (corenrn_param.gpu && corenrn_param.cuda_interface) {\n        auto* d_nt = static_cast<NrnThread*>(acc_deviceptr(nt));\n        auto* d_info = static_cast<InterleaveInfo*>(acc_deviceptr(interleave_info + ith));\n        solve_interleaved2_launcher(d_nt, d_info, ncore, acc_get_cuda_stream(nt->stream_id));\n    } else {\n#endif\n        int* ncycles = ii.cellsize;         // nwarp of these\n        int* stridedispl = ii.stridedispl;  // nwarp+1 of these\n        int* strides = ii.stride;           // sum ncycles of these (bad since ncompart/warpsize)\n        int* rootbegin = ii.firstnode;      // nwarp+1 of these\n        int* nodebegin = ii.lastnode;       // nwarp+1 of these\n#if defined(CORENEURON_ENABLE_GPU) && !defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENACC)\n        int nstride = 
stridedispl[nwarp];\n#endif\n        /* Compared with the loop in cellorder.cu (CUDA version), the parallelism here is exposed\n         * in steps, while in the CUDA version all the parallelism is exposed from the very\n         * beginning of the loop. In more detail, here we first distribute the outermost loop,\n         * e.g. over the CUDA blocks, and for the innermost loops we explicitly use multiple\n         * threads for the parallelization (see for example the loop directives in\n         * triang/bksub_interleaved2). In the CUDA version, on the other hand, the outermost loop\n         * is distributed to all the available threads, so there is no need for the innermost\n         * loops. Here, the loop/icore jumps every warpsize, while in the CUDA version the icore\n         * increases by one. Other than this, the two loop versions are equivalent (same results).\n         */\n        nrn_pragma_acc(parallel loop gang present(nt [0:1],\n                              strides [0:nstride],\n                              ncycles [0:nwarp],\n                              stridedispl [0:nwarp + 1],\n                              rootbegin [0:nwarp + 1],\n                              nodebegin [0:nwarp + 1]) if (nt->compute_gpu) async(nt->stream_id))\n        nrn_pragma_omp(target teams loop if(nt->compute_gpu))\n        for (int icore = 0; icore < ncore; icore += warpsize) {\n            int iwarp = icore / warpsize;     // figure out the >> value\n            int ic = icore & (warpsize - 1);  // figure out the & mask\n            int ncycle = ncycles[iwarp];\n            int* stride = strides + stridedispl[iwarp];\n            int root = rootbegin[iwarp];  // cell ID -> [0, ncell)\n            int lastroot = rootbegin[iwarp + 1];\n            int firstnode = nodebegin[iwarp];\n            int lastnode = nodebegin[iwarp + 1];\n\n            triang_interleaved2(nt, ic, ncycle, stride, 
lastnode);\n            bksub_interleaved2(nt, root + ic, lastroot, ic, ncycle, stride, firstnode);\n        }\n        nrn_pragma_acc(wait(nt->stream_id))\n#ifdef _OPENACC\n    }\n#endif\n}\n\n/**\n * \\brief Solve Hines matrices/cells with cell-based granularity.\n *\n * The node ordering guarantees cell interleaving (as much coalesced memory access as possible),\n * but parallelism granularity is limited to a per cell basis. Therefore every execution stream\n * is mapped to a cell/tree.\n */\nvoid solve_interleaved1(int ith) {\n    NrnThread* nt = nrn_threads + ith;\n    int ncell = nt->ncell;\n    if (ncell == 0) {\n        return;\n    }\n    InterleaveInfo& ii = interleave_info[ith];\n    int nstride = ii.nstride;\n    int* stride = ii.stride;\n    int* firstnode = ii.firstnode;\n    int* lastnode = ii.lastnode;\n    int* cellsize = ii.cellsize;\n\n    // OL211123: can we preserve the error checking behaviour of OpenACC's\n    // present clause with OpenMP? It is a bug if these data are not present,\n    // so diagnostics are helpful...\n    nrn_pragma_acc(parallel loop present(nt [0:1],\n                                         stride [0:nstride],\n                                         firstnode [0:ncell],\n                                         lastnode [0:ncell],\n                                         cellsize [0:ncell]) if (nt->compute_gpu)\n                       async(nt->stream_id))\n    nrn_pragma_omp(target teams distribute parallel for simd if(nt->compute_gpu))\n    for (int icell = 0; icell < ncell; ++icell) {\n        int icellsize = cellsize[icell];\n        triang_interleaved(nt, icell, icellsize, nstride, stride, lastnode);\n        bksub_interleaved(nt, icell, icellsize, nstride, stride, firstnode);\n    }\n    nrn_pragma_acc(wait(nt->stream_id))\n}\n\nvoid solve_interleaved(int ith) {\n    if (interleave_permute_type != 1) {\n        solve_interleaved2(ith);\n    } else {\n        solve_interleaved1(ith);\n    }\n}\n}  // namespace 
coreneuron\n"
  },
  {
    "path": "coreneuron/permute/cellorder.cu",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"coreneuron/utils/utils_cuda.h\"\n#include \"coreneuron/permute/cellorder.hpp\"\n#include \"coreneuron/network/tnode.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\n\nnamespace coreneuron {\n\n__device__ void triang_interleaved2_device(NrnThread* nt,\n                                           int icore,\n                                           int ncycle,\n                                           int* stride,\n                                           int lastnode) {\n    int icycle = ncycle - 1;\n    int istride = stride[icycle];\n    int i = lastnode - istride + icore;\n\n    int ip;\n    double p;\n    while (icycle >= 0) {\n        // most efficient if istride equal warpsize, else branch divergence!\n        if (icore < istride) {\n            ip = nt->_v_parent_index[i];\n            p = nt->_actual_a[i] / nt->_actual_d[i];\n            atomicAdd(&nt->_actual_d[ip], -p * nt->_actual_b[i]);\n            atomicAdd(&nt->_actual_rhs[ip], -p * nt->_actual_rhs[i]);\n        }\n        --icycle;\n        // do not read stride[-1] once the last cycle has been processed\n        if (icycle >= 0) {\n            istride = stride[icycle];\n            i -= istride;\n        }\n    }\n}\n\n__device__ void bksub_interleaved2_device(NrnThread* nt,\n                                          int root,\n                                          int lastroot,\n                                          int icore,\n                                          int ncycle,\n                                          int* stride,\n                                          int firstnode) {\n    for (int i = root; i < lastroot; i += warpsize) {\n        nt->_actual_rhs[i] /= nt->_actual_d[i];  // the root\n    }\n\n    int i = firstnode + icore;\n\n    int ip;\n    for (int icycle = 0; icycle < ncycle; 
++icycle) {\n        int istride = stride[icycle];\n        if (icore < istride) {\n            ip = nt->_v_parent_index[i];\n            nt->_actual_rhs[i] -= nt->_actual_b[i] * nt->_actual_rhs[ip];\n            nt->_actual_rhs[i] /= nt->_actual_d[i];\n        }\n        i += istride;\n    }\n}\n\n__global__ void solve_interleaved2_kernel(NrnThread* nt, InterleaveInfo* ii, int ncore) {\n    int icore = blockDim.x * blockIdx.x + threadIdx.x;\n\n    int* ncycles = ii->cellsize;         // nwarp of these\n    int* stridedispl = ii->stridedispl;  // nwarp+1 of these\n    int* strides = ii->stride;           // sum ncycles of these (bad since ncompart/warpsize)\n    int* rootbegin = ii->firstnode;      // nwarp+1 of these\n    int* nodebegin = ii->lastnode;       // nwarp+1 of these\n\n    while (icore < ncore) {\n        int iwarp = icore / warpsize;     // figure out the >> value\n        int ic = icore & (warpsize - 1);  // figure out the & mask\n        int ncycle = ncycles[iwarp];\n        int* stride = strides + stridedispl[iwarp];\n        int root = rootbegin[iwarp];\n        int lastroot = rootbegin[iwarp + 1];\n        int firstnode = nodebegin[iwarp];\n        int lastnode = nodebegin[iwarp + 1];\n\n        triang_interleaved2_device(nt, ic, ncycle, stride, lastnode);\n        bksub_interleaved2_device(nt, root + ic, lastroot, ic, ncycle, stride, firstnode);\n\n        icore += blockDim.x * gridDim.x;\n    }\n}\n\nvoid solve_interleaved2_launcher(NrnThread* nt, InterleaveInfo* info, int ncore, void* stream) {\n    auto cuda_stream = static_cast<cudaStream_t>(stream);\n\n    /// the selection of these parameters has been done after running the channel-benchmark for\n    /// typical production runs, i.e. 1 MPI task with 1440 cells & 6 MPI tasks with 8800 cells.\n    /// In the OpenACC/OpenMP implementations threadsPerBlock is set to 32. 
From profiling the\n    /// channel-benchmark circuits mentioned above we figured out that the best performance was\n    /// achieved with this configuration\n    int threadsPerBlock = warpsize;\n    /// Max number of blocksPerGrid for NVIDIA GPUs is 65535, so we need to make sure that the\n    /// blocksPerGrid we launch the CUDA kernel with doesn't exceed this number\n    const auto maxBlocksPerGrid = 65535;\n    int provisionalBlocksPerGrid = (ncore + threadsPerBlock - 1) / threadsPerBlock;\n    int blocksPerGrid = provisionalBlocksPerGrid <= maxBlocksPerGrid ? provisionalBlocksPerGrid\n                                                                     : maxBlocksPerGrid;\n\n    solve_interleaved2_kernel<<<blocksPerGrid, threadsPerBlock, 0, cuda_stream>>>(nt, info, ncore);\n\n    cudaStreamSynchronize(cuda_stream);\n\n    CHECKLAST(\"solve_interleaved2_launcher\");\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/permute/cellorder.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include \"coreneuron/utils/memory.h\"\n#include <algorithm>\nnamespace coreneuron {\n\n/**\n * \\brief Function that performs the permutation of the cells such that the\n *        execution threads access coalesced memory.\n *\n * \\param ith NrnThread to access\n * \\param ncell number of cells in NrnThread\n * \\param nnode number of compartments in the ncells\n * \\param parent parent indices of cells\n *\n * \\return int* order, interleaved order of the cells\n */\nint* interleave_order(int ith, int ncell, int nnode, int* parent);\n\nvoid create_interleave_info();\nvoid destroy_interleave_info();\n\n/**\n *\n * \\brief Solve the Hines matrices based on the interleave_permute_type (1 or 2).\n *\n * For interleave_permute_type == 1 : Naive interleaving -> Each execution thread deals with one\n * Hines matrix (cell)\n * For interleave_permute_type == 2 : Advanced interleaving -> Each Hines matrix is solved by\n * multiple execution threads (with coalesced memory access as well)\n */\nextern void solve_interleaved(int ith);\n\nclass InterleaveInfo;  // forward declaration\n/**\n *\n * \\brief CUDA branch of the solve_interleaved with interleave_permute_type == 2.\n *\n * This branch is activated at runtime with the --cuda-interface CLI flag\n */\nvoid solve_interleaved2_launcher(NrnThread* nt, InterleaveInfo* info, int ncore, void* stream);\n\nclass InterleaveInfo: public MemoryManaged {\n  public:\n    InterleaveInfo() = default;\n    InterleaveInfo(const InterleaveInfo&);\n    InterleaveInfo& operator=(const InterleaveInfo&);\n    ~InterleaveInfo();\n    int nwarp = 0;  // used only by interleave2\n    int nstride = 0;\n    int* stridedispl = nullptr;  // interleave2: 
nwarp+1\n    int* stride = nullptr;       // interleave2: stride  length is stridedispl[nwarp]\n    int* firstnode = nullptr;    // interleave2: rootbegin nwarp+1 displacements\n    int* lastnode = nullptr;     // interleave2: nodebegin nwarp+1 displacements\n    int* cellsize = nullptr;     // interleave2: ncycles nwarp\n\n    // statistics (nwarp of each)\n    size_t* nnode = nullptr;\n    size_t* ncycle = nullptr;\n    size_t* idle = nullptr;\n    size_t* cache_access = nullptr;\n    size_t* child_race = nullptr;\n\n  private:\n    void swap(InterleaveInfo& info);\n};\n\n/**\n * \\brief Function that returns a permutation of length nnode.\n *\n * There are two permutation strategies:\n * For interleave_permute_type == 1 : Naive interleaving -> Each execution thread deals with one\n * Hines matrix (cell) For interleave_permute_type == 2 : Advanced interleaving -> Each Hines matrix\n * is solved by multiple execution threads (with coalesced memory access as well)\n *\n * \\param ncell number of cells\n * \\param nnode number of compartments in the ncells\n * \\param parents parent indices of the cells\n * \\param nwarp number of warps\n * \\param nstride nstride is the maximum cell size (not counting root)\n * \\param stride stride[i] is the number of cells with an ith node:\n *               using stride[i] we know how many positions to move in order to\n *               access the next element of the same cell (given that the cells are\n *               ordered with the treenode_order).\n * \\param firstnode firstnode[i] is the index of the first nonroot node of the cell\n * \\param lastnode lastnode[i] is the index of the last node of the cell\n * \\param cellsize cellsize is the number of nodes in the cell not counting root.\n * \\param stridedispl\n * \\return int* : a permutation of length nnode\n */\nint* node_order(int ncell,\n                int nnode,\n                int* parents,\n                int& nwarp,\n                int& nstride,\n             
   int*& stride,\n                int*& firstnode,\n                int*& lastnode,\n                int*& cellsize,\n                int*& stridedispl);\n\n// copy src array to dest with new allocation\ntemplate <typename T>\nvoid copy_array(T*& dest, T* src, size_t n) {\n    dest = new T[n];\n    std::copy(src, src + n, dest);\n}\n\n// copy src array to dest with NRN_SOA_BYTE_ALIGN ecalloc_align allocation\ntemplate <typename T>\nvoid copy_align_array(T*& dest, T* src, size_t n) {\n    dest = static_cast<T*>(ecalloc_align(n, sizeof(T)));\n    std::copy(src, src + n, dest);\n}\n\n#ifndef INTERLEAVE_DEBUG\n#define INTERLEAVE_DEBUG 0\n#endif\n\n#if INTERLEAVE_DEBUG\nvoid mk_cell_indices();\n#endif\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/permute/cellorder1.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <cstdio>\n#include <map>\n#include <set>\n#include <algorithm>\n#include <cstring>\n\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/permute/cellorder.hpp\"\n#include \"coreneuron/network/tnode.hpp\"\n\n// just for interleave_permute_type\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/utils/memory.h\"\n\n\nnamespace coreneuron {\nstatic size_t groupsize = 32;\n\n/**\n * \\brief Function to order trees by size, hash and nodeindex\n */\nstatic bool tnode_earlier(TNode* a, TNode* b) {\n    bool result = false;\n    if (a->treesize < b->treesize) {  // treesize dominates\n        result = true;\n    } else if (a->treesize == b->treesize) {\n        if (a->hash < b->hash) {  // if treesize same, keep identical trees together\n            result = true;\n        } else if (a->hash == b->hash) {\n            result = a->nodeindex < b->nodeindex;  // identical trees ordered by nodeindex\n        }\n    }\n    return result;\n}\n\nstatic bool ptr_tnode_earlier(TNode* a, TNode* b) {\n    return tnode_earlier(a, b);\n}\n\nTNode::TNode(int ix) {\n    nodeindex = ix;\n    cellindex = 0;\n    groupindex = 0;\n    level = 0;\n    hash = 0;\n    treesize = 1;\n    nodevec_index = 0;\n    treenode_order = 0;\n    parent = nullptr;\n    children.reserve(2);\n}\n\nTNode::~TNode() {}\n\nsize_t TNode::mkhash() {  // call on all nodes in leaf to root order\n    // concept from http://stackoverflow.com/questions/20511347/a-good-hash-function-for-a-vector\n    std::sort(children.begin(), children.end(), ptr_tnode_earlier);\n    hash = children.size();\n    treesize = 1;\n    for (size_t i = 0; i < children.size(); ++i) {  // need sorted by child hash\n        
hash ^= children[i]->hash + 0x9e3779b9 + (hash << 6) + (hash >> 2);\n        treesize += children[i]->treesize;\n    }\n    return hash;  // hash of leaf nodes is 0\n}\n\nstatic void tree_analysis(int* parent, int nnode, int ncell, VecTNode&);\nstatic void node_interleave_order(int ncell, VecTNode&);\nstatic void admin1(int ncell,\n                   VecTNode& nodevec,\n                   int& nwarp,\n                   int& nstride,\n                   int*& stride,\n                   int*& firstnode,\n                   int*& lastnode,\n                   int*& cellsize);\nstatic void admin2(int ncell,\n                   VecTNode& nodevec,\n                   int& nwarp,\n                   int& nstride,\n                   int*& stridedispl,\n                   int*& strides,\n                   int*& rootbegin,\n                   int*& nodebegin,\n                   int*& ncycles);\nstatic void check(VecTNode&);\n#if CORENRN_DEBUG\nstatic void prtree(VecTNode&);\n#endif\n\nusing TNI = std::pair<TNode*, int>;\nusing HashCnt = std::map<size_t, std::pair<TNode*, int>>;\nusing TNIVec = std::vector<TNI>;\n\n/*\nassess the quality of the ordering. The measure is the size of a contiguous\nlist of nodes whose parents have the same order. How many contiguous lists\nhave that same size. How many nodes participate in that size list.\nModify the quality measure from experience with performance. 
Start with\nlist of (nnode, size_participation)\n*/\nstatic void quality(VecTNode& nodevec, size_t max = 32) {\n    size_t qcnt = 0;  // how many contiguous nodes have contiguous parents\n\n    // first ncell nodes are by definition in contiguous order\n    for (const auto& n: nodevec) {\n        if (n->parent != nullptr) {\n            break;\n        }\n        qcnt += 1;\n    }\n    size_t ncell = qcnt;\n\n    // key is how many parents in contiguous order\n    // value is number of nodes that participate in that\n    std::map<size_t, size_t> qual;\n    size_t ip_last = 10000000000;\n    for (size_t i = ncell; i < nodevec.size(); ++i) {\n        size_t ip = nodevec[i]->parent->nodevec_index;\n        // i%max == 0 means that if we start a warp with 8 and then have 32\n        // the 32 is broken into 24 and 8. (modify if the arrangement during\n        // gaussian elimination becomes more sophisticated.)\n        if (ip == ip_last + 1 && i % max != 0) {  // contiguous\n            qcnt += 1;\n        } else {\n            if (qcnt == 1) {\n                // printf(\"unique %ld p=%ld ix=%d\\n\", i, ip, nodevec[i]->nodeindex);\n            }\n            qual[max] += (qcnt / max) * max;\n            size_t x = qcnt % max;\n            if (x) {\n                qual[x] += x;\n            }\n            qcnt = 1;\n        }\n        ip_last = ip;\n    }\n    qual[max] += (qcnt / max) * max;\n    size_t x = qcnt % max;\n    if (x) {\n        qual[x] += x;\n    }\n\n    // print result\n    qcnt = 0;\n#if CORENRN_DEBUG\n    for (const auto& q: qual) {\n        qcnt += q.second;\n        printf(\"%6ld %6ld\\n\", q.first, q.second);\n    }\n#endif\n#if CORENRN_DEBUG\n    printf(\"qual.size=%ld  qual total nodes=%ld  nodevec.size=%ld\\n\",\n           qual.size(),\n           qcnt,\n           nodevec.size());\n#endif\n\n    // how many race conditions. 
ie refer to same parent on different core\n    // of warp (max cores) or parent in same group of max.\n    size_t maxip = ncell;\n    size_t nrace1 = 0;\n    size_t nrace2 = 0;\n    std::set<size_t> ipused;\n    for (size_t i = ncell; i < nodevec.size(); ++i) {\n        TNode* nd = nodevec[i];\n        size_t ip = nd->parent->nodevec_index;\n        if (i % max == 0) {\n            maxip = i;\n            ipused.clear();\n        }\n        if (ip >= maxip) {\n            nrace1 += 1;\n        } /*else*/\n        {\n            if (ipused.find(ip) != ipused.end()) {\n                nrace2 += 1;\n                if (ip >= maxip) {\n                    // printf(\"race for parent %ld (parent in same group as multiple users))\\n\",\n                    // ip);\n                }\n            } else {\n                ipused.insert(ip);\n            }\n        }\n    }\n    static_cast<void>(nrace1);\n    static_cast<void>(nrace2);\n#if CORENRN_DEBUG\n    printf(\"nrace = %ld (parent in same group of %ld nodes)\\n\", nrace1, max);\n    printf(\"nrace = %ld (parent used more than once by same group of %ld nodes)\\n\", nrace2, max);\n#endif\n}\n\nsize_t level_from_root(VecTNode& nodevec) {\n    size_t maxlevel = 0;\n    for (auto& nd: nodevec) {\n        if (nd->parent) {\n            nd->level = nd->parent->level + 1;\n            if (maxlevel < nd->level) {\n                maxlevel = nd->level;\n            }\n        } else {\n            nd->level = 0;\n        }\n    }\n    return maxlevel;\n}\n\nsize_t level_from_leaf(VecTNode& nodevec) {\n    size_t maxlevel = 0;\n    for (size_t i = nodevec.size() - 1; true; --i) {\n        TNode* nd = nodevec[i];\n        size_t lmax = 0;\n        for (auto& child: nd->children) {\n            if (lmax <= child->level) {\n                lmax = child->level + 1;\n            }\n        }\n        nd->level = lmax;\n        if (maxlevel < lmax) {\n            maxlevel = lmax;\n        }\n        if (i == 0) {\n            
break;\n        }\n    }\n    return maxlevel;\n}\n\n/**\n * \\brief Set the cellindex to distinguish the different cells.\n */\nstatic void set_cellindex(int ncell, VecTNode& nodevec) {\n    for (int i = 0; i < ncell; ++i) {\n        nodevec[i]->cellindex = i;\n    }\n    for (size_t i = 0; i < nodevec.size(); ++i) {\n        TNode& nd = *nodevec[i];\n        for (size_t j = 0; j < nd.children.size(); ++j) {\n            TNode* cnode = nd.children[j];\n            cnode->cellindex = nd.cellindex;\n        }\n    }\n}\n\n/**\n * \\brief Initialization of the groupindex (groups)\n *\n * The cells are grouped at a later stage based on a load balancing algorithm.\n * This is just an initialization function.\n */\nstatic void set_groupindex(VecTNode& nodevec) {\n    for (size_t i = 0; i < nodevec.size(); ++i) {\n        TNode* nd = nodevec[i];\n        if (nd->parent) {\n            nd->groupindex = nd->parent->groupindex;\n        } else {\n            nd->groupindex = i / groupsize;\n        }\n    }\n}\n\n// how many identical trees and their levels\n// print when more than one instance of a type\n// reverse the sense of levels (all leaves are level 0) to get a good\n// idea of the depth of identical subtrees.\nstatic void ident_statistic(VecTNode& nodevec, size_t ncell) {\n    // reverse sense of levels\n    //  size_t maxlevel = level_from_leaf(nodevec);\n    size_t maxlevel = level_from_root(nodevec);\n\n    // # in each level\n    std::vector<std::vector<size_t>> n_in_level(maxlevel + 1);\n    for (auto& n: n_in_level) {\n        n.resize(ncell / groupsize);\n    }\n    for (const auto& n: nodevec) {\n        n_in_level[n->level][n->groupindex]++;\n    }\n    printf(\"n_in_level.size = %ld\\n\", n_in_level.size());\n    for (size_t i = 0; i < n_in_level.size(); ++i) {\n        printf(\"%5ld\\n\", i);\n        for (const auto& n: n_in_level[i]) {\n            printf(\" %5ld\", n);\n        }\n        printf(\"\\n\");\n    }\n}\n#undef MSS\n\nint* node_order(int 
ncell,\n                int nnode,\n                int* parent,\n                int& nwarp,\n                int& nstride,\n                int*& stride,\n                int*& firstnode,\n                int*& lastnode,\n                int*& cellsize,\n                int*& stridedispl) {\n    VecTNode nodevec;\n\n    // nodevec[0:ncell] in increasing size, with identical trees together,\n    // and otherwise nodeindex order\n    // nodevec.size = nnode\n    tree_analysis(parent, nnode, ncell, nodevec);\n    check(nodevec);\n\n    set_cellindex(ncell, nodevec);\n    set_groupindex(nodevec);\n    level_from_root(nodevec);\n\n    // nodevec[ncell:nnode] cells are interleaved in nodevec[0:ncell] cell order\n    if (interleave_permute_type == 1) {\n        node_interleave_order(ncell, nodevec);\n    } else {\n        group_order2(nodevec, groupsize, ncell);\n    }\n    check(nodevec);\n\n#if CORENRN_DEBUG\n    for (int i = 0; i < ncell; ++i) {\n        TNode& nd = *nodevec[i];\n        printf(\"%d size=%ld hash=%ld ix=%d\\n\", i, nd.treesize, nd.hash, nd.nodeindex);\n    }\n#endif\n\n    if (0)\n        ident_statistic(nodevec, ncell);\n    quality(nodevec);\n\n    // the permutation\n    int* nodeorder = new int[nnode];\n    for (int i = 0; i < nnode; ++i) {\n        TNode& nd = *nodevec[i];\n        nodeorder[nd.nodeindex] = i;\n    }\n\n    // administrative statistics for gauss elimination\n    if (interleave_permute_type == 1) {\n        admin1(ncell, nodevec, nwarp, nstride, stride, firstnode, lastnode, cellsize);\n    } else {\n        //  admin2(ncell, nodevec, nwarp, nstride, stridedispl, stride, rootbegin, nodebegin,\n        //  ncycles);\n        admin2(ncell, nodevec, nwarp, nstride, stridedispl, stride, firstnode, lastnode, cellsize);\n    }\n\n    int ntopol = 1;\n    for (int i = 1; i < ncell; ++i) {\n        if (nodevec[i - 1]->hash != nodevec[i]->hash) {\n            ntopol += 1;\n        }\n    }\n    static_cast<void>(ntopol);\n#ifdef DEBUG\n    
printf(\"%d distinct tree topologies\\n\", ntopol);\n#endif\n\n    for (size_t i = 0; i < nodevec.size(); ++i) {\n        delete nodevec[i];\n    }\n\n    return nodeorder;\n}\n\nvoid check(VecTNode& nodevec) {\n    // printf(\"check\\n\");\n    size_t nnode = nodevec.size();\n    size_t ncell = 0;\n    for (size_t i = 0; i < nnode; ++i) {\n        nodevec[i]->nodevec_index = i;\n        if (nodevec[i]->parent == nullptr) {\n            ncell++;\n        }\n    }\n    ///  Check that the first compartments of nodevec are the root nodes (cells)\n    for (size_t i = 0; i < ncell; ++i) {\n        nrn_assert(nodevec[i]->parent == nullptr);\n    }\n    for (size_t i = ncell; i < nnode; ++i) {\n        TNode& nd = *nodevec[i];\n        if (nd.parent->nodevec_index >= nd.nodevec_index) {\n            printf(\"error i=%ld nodevec_index=%ld parent=%ld\\n\",\n                   i,\n                   nd.nodevec_index,\n                   nd.parent->nodevec_index);\n        }\n        nrn_assert(nd.nodevec_index > nd.parent->nodevec_index);\n    }\n}\n\n#if CORENRN_DEBUG\nvoid prtree(VecTNode& nodevec) {\n    size_t nnode = nodevec.size();\n    for (size_t i = 0; i < nnode; ++i) {\n        nodevec[i]->nodevec_index = i;\n    }\n    for (size_t i = 0; i < nnode; ++i) {\n        TNode& nd = *nodevec[i];\n        printf(\"%ld p=%d   c=%ld l=%ld o=%ld   ix=%d pix=%d\\n\",\n               i,\n               nd.parent ? int(nd.parent->nodevec_index) : -1,\n               nd.cellindex,\n               nd.level,\n               nd.treenode_order,\n               nd.nodeindex,\n               nd.parent ? 
int(nd.parent->nodeindex) : -1);\n    }\n}\n#endif\n\n/**\n * \\brief Perform tree preparation for interleaving strategies\n *\n * \\param parent vector of parent indices\n * \\param nnode number of compartments in the cells\n * \\param ncell number of cells\n */\nvoid tree_analysis(int* parent, int nnode, int ncell, VecTNode& nodevec) {\n    // create empty TNodes (knowing only their index)\n    nodevec.reserve(nnode);\n    for (int i = 0; i < nnode; ++i) {\n        nodevec.push_back(new TNode(i));\n    }\n\n    // determine the (sorted by hash) children of each node\n    for (int i = nnode - 1; i >= ncell; --i) {\n        nodevec[i]->parent = nodevec[parent[i]];\n        nodevec[i]->mkhash();\n        nodevec[parent[i]]->children.push_back(nodevec[i]);\n    }\n\n    // determine hash of the cells\n    for (int i = 0; i < ncell; ++i) {\n        nodevec[i]->mkhash();\n    }\n\n    // sort it by tree size (from smaller to larger)\n    std::sort(nodevec.begin(), nodevec.begin() + ncell, tnode_earlier);\n}\n\nstatic bool interleave_comp(TNode* a, TNode* b) {\n    bool result = false;\n    if (a->treenode_order < b->treenode_order) {\n        result = true;\n    } else if (a->treenode_order == b->treenode_order) {\n        if (a->cellindex < b->cellindex) {\n            result = true;\n        }\n    }\n    return result;\n}\n\n/**\n * \\brief Naive interleaving strategy (interleave_permute_type == 1)\n *\n * Sort so nodevec[ncell:nnode] cell instances are interleaved. 
Keep the\n * secondary ordering with respect to treenode_order so each cell is still a tree.\n *\n * \\param ncell number of cells (trees)\n * \\param nodevec vector that contains compartments (nodes of the trees)\n */\nvoid node_interleave_order(int ncell, VecTNode& nodevec) {\n    int* order = new int[ncell];\n    for (int i = 0; i < ncell; ++i) {\n        order[i] = 0;\n        nodevec[i]->treenode_order = order[i]++;\n    }\n    for (size_t i = 0; i < nodevec.size(); ++i) {\n        TNode& nd = *nodevec[i];\n        for (size_t j = 0; j < nd.children.size(); ++j) {\n            TNode* cnode = nd.children[j];\n            cnode->treenode_order = order[nd.cellindex]++;\n        }\n    }\n    delete[] order;\n\n    //  std::sort(nodevec.begin() + ncell, nodevec.end(), contig_comp);\n    // Traversal of nodevec: From root to leaves (this is why we compute the tree node order)\n    std::sort(nodevec.begin() + ncell, nodevec.end(), interleave_comp);\n\n#if CORENRN_DEBUG\n    for (size_t i = 0; i < nodevec.size(); ++i) {\n        TNode& nd = *nodevec[i];\n        printf(\"%ld cell=%ld ix=%d\\n\", i, nd.cellindex, nd.nodeindex);\n    }\n#endif\n}\n\nstatic void admin1(int ncell,\n                   VecTNode& nodevec,\n                   int& nwarp,\n                   int& nstride,\n                   int*& stride,\n                   int*& firstnode,\n                   int*& lastnode,\n                   int*& cellsize) {\n    firstnode = (int*) ecalloc_align(ncell, sizeof(int));\n    lastnode = (int*) ecalloc_align(ncell, sizeof(int));\n    cellsize = (int*) ecalloc_align(ncell, sizeof(int));\n\n    nwarp = (ncell % warpsize == 0) ? 
(ncell / warpsize) : (ncell / warpsize + 1);\n\n    for (int i = 0; i < ncell; ++i) {\n        firstnode[i] = -1;\n        lastnode[i] = -1;\n        cellsize[i] = 0;\n    }\n\n    nstride = 0;\n    for (size_t i = ncell; i < nodevec.size(); ++i) {\n        TNode& nd = *nodevec[i];\n        size_t ci = nd.cellindex;\n        if (firstnode[ci] == -1) {\n            firstnode[ci] = i;\n        }\n        lastnode[ci] = i;\n        cellsize[ci] += 1;\n        if (nstride < cellsize[ci]) {\n            nstride = cellsize[ci];\n        }\n    }\n\n    // this vector is used to move from one compartment to the other (per cell)\n    // its length is equal to the cell with the highest number of compartments\n    stride = static_cast<int*>(ecalloc_align(nstride + 1, sizeof(int)));\n    for (size_t i = ncell; i < nodevec.size(); ++i) {\n        TNode& nd = *nodevec[i];\n        // compute how many compartments with the same order\n        // treenode_order : defined in breadth first fashion (for each cell separately)\n        stride[nd.treenode_order - 1] += 1;  // -1 because treenode order includes root\n    }\n}\n\n// for admin2 we allow the node organisation in warps of (say 4 cores per warp)\n// ...............  ideal warp but unbalanced relative to warp with max cycles\n// ...............  ncycle = 15, icore [0:4), all strides are 4.\n// ...............\n// ...............\n//\n// ..........       unbalanced relative to warp with max cycles\n// ..........       ncycle = 10, not all strides the same because\n// ..........       of need to avoid occasional race conditions.\n//  .  . ..         icore [4:8) only 4 strides of 4\n//\n// ....................  ncycle = 20, uses only one core in the warp (cable)\n//                       icore 8, all ncycle strides are 1\n\n// One thing to be unhappy about is the large stride vector of size about\n// number of compartments/warpsize. 
There are a lot of models where the\n// stride for a warp is constant except for one cycle in the warp and that\n// is easy to obtain when there are more than warpsize cells per warp.\n\nstatic size_t stride_length(size_t begin, size_t end, VecTNode& nodevec) {\n    // return stride length starting at begin. Do not go past end.\n    // max stride is warpsize.\n    // At this time, only assume vicious parent race conditions matter.\n    if (end - begin > warpsize) {\n        end = begin + warpsize;\n    }\n    for (size_t i = begin; i < end; ++i) {\n        TNode* nd = nodevec[i];\n        nrn_assert(nd->nodevec_index == i);\n        size_t diff = dist2child(nd);\n        if (i + diff < end) {\n            end = i + diff;\n        }\n    }\n    return end - begin;\n}\n\n/**\n * \\brief Prepare for solve_interleaved2\n *\n * One group of cells per warp.\n *\n * warp[i] has a number of compute cycles (ncycle[i])\n * the index of its first root (rootbegin[i], last rootbegin[nwarp] = ncell)\n * the index of its first node (nodebegin[i], last nodebegin[nwarp] = nnode)\n *\n * Each compute cycle has a stride\n * A stride is how many nodes are processed by a warp in one compute cycle\n * There are nstride strides. nstride is the sum of ncycles of all warps.\n * warp[i] has ncycle[i] strides\n * same as sum of ncycle\n * warp[i] has a stridedispl[i] which is stridedispl[i-1] + ncycle[i-1].\n * ie. 
The zeroth cycle of warp[j] works on stride[stridedispl[j]]\n * The value of a stride beginning at node i (node i is computed by core 0 of\n * some warp for some cycle) is determined by stride_length(i, j, nodevec)\n *\n */\nstatic void admin2(int ncell,\n                   VecTNode& nodevec,\n                   int& nwarp,\n                   int& nstride,\n                   int*& stridedispl,\n                   int*& strides,\n                   int*& rootbegin,\n                   int*& nodebegin,\n                   int*& ncycles) {\n    // the number of groups is the number of warps needed\n    // ncore is the number of warps * warpsize\n    nwarp = nodevec[ncell - 1]->groupindex + 1;\n\n    ncycles = (int*) ecalloc_align(nwarp, sizeof(int));\n    stridedispl = (int*) ecalloc_align(nwarp + 1,\n                                       sizeof(int));  // running sum of ncycles (start at 0)\n    rootbegin = (int*) ecalloc_align(nwarp + 1, sizeof(int));  // index (+1) of first root in warp.\n    nodebegin = (int*) ecalloc_align(nwarp + 1, sizeof(int));  // index (+1) of first node in warp.\n\n    // rootbegin and nodebegin are the root index values + 1 of the last of\n    // the sequence of constant groupindex\n    rootbegin[0] = 0;\n    for (size_t i = 0; i < size_t(ncell); ++i) {\n        rootbegin[nodevec[i]->groupindex + 1] = i + 1;\n    }\n    nodebegin[0] = ncell;\n    // We start from the leaves and go backwards towards the root\n    for (size_t i = size_t(ncell); i < nodevec.size(); ++i) {\n        nodebegin[nodevec[i]->groupindex + 1] = i + 1;\n    }\n\n    // ncycles, stridedispl, and nstride\n    nstride = 0;\n    stridedispl[0] = 0;\n    for (size_t iwarp = 0; iwarp < (size_t) nwarp; ++iwarp) {\n        size_t j = size_t(nodebegin[iwarp + 1]);\n        int nc = 0;\n        size_t i = nodebegin[iwarp];\n        // in this loop we traverse all the children of all the cells in the current warp (iwarp)\n        while (i < j) {\n            i += 
stride_length(i, j, nodevec);\n            ++nc;  // how many times the warp should loop in order to finish with all the tree\n                   // depths (for all the trees of the warp/group)\n        }\n        ncycles[iwarp] = nc;\n        stridedispl[iwarp + 1] = stridedispl[iwarp] + nc;\n        nstride += nc;\n    }\n\n    // strides\n    strides = (int*) ecalloc_align(nstride, sizeof(int));\n    nstride = 0;\n    for (size_t iwarp = 0; iwarp < (size_t) nwarp; ++iwarp) {\n        size_t j = size_t(nodebegin[iwarp + 1]);\n        size_t i = nodebegin[iwarp];\n        while (i < j) {\n            int k = stride_length(i, j, nodevec);\n            i += k;\n            strides[nstride++] = k;\n        }\n    }\n\n#if CORENRN_DEBUG\n    printf(\"warp rootbegin nodebegin stridedispl\\n\");\n    for (int i = 0; i <= nwarp; ++i) {\n        printf(\"%4d %4d %4d %4d\\n\", i, rootbegin[i], nodebegin[i], stridedispl[i]);\n    }\n#endif\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/permute/cellorder2.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <cstdio>\n#include <map>\n#include <set>\n#include <algorithm>\n#include <cstring>\n#include <numeric>\n\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/permute/cellorder.hpp\"\n#include \"coreneuron/network/tnode.hpp\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n\n// experiment starting with identical cell ordering\n// groupindex already defined that keeps identical cells together\n// begin with leaf to root ordering\nnamespace coreneuron {\nusing VTN = VecTNode;             // level of nodes\nusing VVTN = std::vector<VTN>;    // group of levels\nusing VVVTN = std::vector<VVTN>;  // groups\n\n// verify level in groups of nident identical nodes\nvoid chklevel(VTN& level, size_t nident = 8) {}\n\n// first child before second child, etc\n// if same parent level, then parent order\n// if not same parent, then earlier parent (no parent earlier than parent)\n// if same parents, then children order\n// if no parents then nodevec_index order.\nstatic bool sortlevel_cmp(TNode* a, TNode* b) {\n    // when starting with leaf to root order\n    // note that leaves are at max level and all roots at level 0\n    bool result = false;\n    // since cannot have an index < 0, just add 1 to level\n    size_t palevel = a->parent ? 1 + a->parent->level : 0;\n    size_t pblevel = b->parent ? 
1 + b->parent->level : 0;\n    if (palevel < pblevel) {          // only used when starting leaf to root order\n        result = true;                // earlier level first\n    } else if (palevel == pblevel) {  // always true when starting root to leaf\n        if (palevel == 0) {           // a and b are roots\n            if (a->nodevec_index < b->nodevec_index) {\n                result = true;\n            }\n        } else {  // parent order (already sorted with proper treenode_order)\n            if (a->treenode_order < b->treenode_order) {  // children order\n                result = true;\n            } else if (a->treenode_order == b->treenode_order) {\n                if (a->parent->treenode_order < b->parent->treenode_order) {\n                    result = true;\n                }\n            }\n        }\n    }\n    return result;\n}\n\nstatic void sortlevel(VTN& level) {\n    std::sort(level.begin(), level.end(), sortlevel_cmp);\n\n    for (size_t i = 0; i < level.size(); ++i) {\n        level[i]->treenode_order = i;\n    }\n}\n\n// TODO: refactor since sortlevel() is traversing the nodes in same order\nstatic void set_treenode_order(VVTN& levels) {\n    size_t order = 0;\n    for (auto& level: levels) {\n        for (auto* nd: level) {\n            nd->treenode_order = order++;\n        }\n    }\n}\n\n#if CORENRN_DEBUG\n// every level starts out with no race conditions involving both\n// parent and child in the same level. 
Can we arrange things so that\n// every level has at least 32 nodes?\nstatic size_t g32(TNode* nd) {\n    return nd->nodevec_index / warpsize;\n}\n\nstatic bool is_parent_race(TNode* nd) {  // vitiating\n    size_t pg = g32(nd);\n    for (const auto& child: nd->children) {\n        if (pg == g32(child)) {\n            return true;\n        }\n    }\n    return false;\n}\n#endif\n\n// less than 32 apart\nstatic bool is_parent_race2(TNode* nd) {  // vitiating\n    size_t pi = nd->nodevec_index;\n    for (const auto& child: nd->children) {\n        if (child->nodevec_index - pi < warpsize) {\n            return true;\n        }\n    }\n    return false;\n}\n\n#if CORENRN_DEBUG\nstatic bool is_child_race(TNode* nd) {  // potentially handleable by atomic\n    if (nd->children.size() < 2) {\n        return false;\n    }\n    if (nd->children.size() == 2) {\n        return g32(nd->children[0]) == g32(nd->children[1]);\n    }\n    std::set<size_t> s;\n    for (const auto& child: nd->children) {\n        std::size_t gc = g32(child);\n        if (s.find(gc) != s.end()) {\n            return true;\n        }\n        s.insert(gc);\n    }\n    return false;\n}\n#endif\n\nstatic bool is_child_race2(TNode* nd) {  // potentially handleable by atomic\n    if (nd->children.size() < 2) {\n        return false;\n    }\n    if (nd->children.size() == 2) {\n        size_t c0 = nd->children[0]->nodevec_index;\n        size_t c1 = nd->children[1]->nodevec_index;\n        c0 = (c0 < c1) ? 
(c1 - c0) : (c0 - c1);\n        return c0 < warpsize;\n    }\n    size_t ic0 = nd->children[0]->nodevec_index;\n    for (size_t i = 1; i < nd->children.size(); ++i) {\n        size_t ic = nd->children[i]->nodevec_index;\n        if (ic - ic0 < warpsize) {\n            return true;\n        }\n        ic0 = ic;\n    }\n    return false;\n}\n\nsize_t dist2child(TNode* nd) {\n    size_t d = 1000;\n    size_t pi = nd->nodevec_index;\n    for (const auto& child: nd->children) {\n        std::size_t d1 = child->nodevec_index - pi;\n        if (d1 < d) {\n            d = d1;\n        }\n    }\n    return d;\n}\n\n// from stackoverflow.com\ntemplate <typename T>\nstatic void move_range(size_t start, size_t length, size_t dst, std::vector<T>& v) {\n    typename std::vector<T>::iterator first, middle, last;\n    if (start < dst) {\n        first = v.begin() + start;\n        middle = first + length;\n        last = v.begin() + dst;\n    } else {\n        first = v.begin() + dst;\n        middle = v.begin() + start;\n        last = middle + length;\n    }\n    std::rotate(first, middle, last);\n}\n\nstatic void move_nodes(size_t start, size_t length, size_t dst, VTN& nodes) {\n    nrn_assert(dst <= nodes.size());\n    nrn_assert(start + length <= dst);\n    move_range(start, length, dst, nodes);\n\n    // check correctness of move\n    for (size_t i = start; i < dst - length; ++i) {\n        nrn_assert(nodes[i]->nodevec_index == i + length);\n    }\n    for (size_t i = dst - length; i < dst; ++i) {\n        nrn_assert(nodes[i]->nodevec_index == start + (i - (dst - length)));\n    }\n\n    // update nodevec_index\n    for (size_t i = start; i < dst; ++i) {\n        nodes[i]->nodevec_index = i;\n    }\n}\n\n#if CORENRN_DEBUG\n// least number of nodes to move after nd to eliminate prace\nstatic size_t need2move(TNode* nd) {\n    size_t d = dist2child(nd);\n    return warpsize - ((nd->nodevec_index % warpsize) + d);\n}\n\nstatic void how_many_warpsize_groups_have_only_leaves(VTN& 
nodes) {\n    size_t n = 0;\n    for (size_t i = 0; i < nodes.size(); i += warpsize) {\n        bool r = true;\n        for (size_t j = 0; j < warpsize; ++j) {\n            if (!nodes[i + j]->children.empty()) {\n                r = false;\n                break;\n            }\n        }\n        if (r) {\n            printf(\"warpsize group %ld starting at level %ld\\n\", i / warpsize, nodes[i]->level);\n            ++n;\n        }\n    }\n    printf(\"number of warpsize groups with only leaves = %ld\\n\", n);\n}\n\nstatic void pr_race_situation(VTN& nodes) {\n    size_t prace2 = 0;\n    size_t prace = 0;\n    size_t crace = 0;\n    for (size_t i = nodes.size() - 1; nodes[i]->level != 0; --i) {\n        TNode* nd = nodes[i];\n        if (is_parent_race2(nd)) {\n            ++prace2;\n        }\n        if (is_parent_race(nd)) {\n            printf(\"level=%ld i=%ld d=%ld n=%ld\",\n                   nd->level,\n                   nd->nodevec_index,\n                   dist2child(nd),\n                   need2move(nd));\n            for (const auto& cnd: nd->children) {\n                printf(\"   %ld %ld\", cnd->level, cnd->nodevec_index);\n            }\n            printf(\"\\n\");\n            ++prace;\n        }\n        if (is_child_race(nd)) {\n            ++crace;\n        }\n    }\n    printf(\"prace=%ld  crace=%ld prace2=%ld\\n\", prace, crace, prace2);\n}\n#endif\n\nstatic size_t next_leaf(TNode* nd, VTN& nodes) {\n    size_t i = 0;\n    for (i = nd->nodevec_index - 1; i > 0; --i) {\n        if (nodes[i]->children.empty()) {\n            return i;\n        }\n    }\n    //  nrn_assert(i > 0);\n    return 0;\n}\n\nstatic void checkrace(TNode* nd, VTN& nodes) {\n    for (size_t i = nd->nodevec_index; i < nodes.size(); ++i) {\n        if (is_parent_race2(nodes[i])) {\n            //      printf(\"checkrace %ld\\n\", i);\n        }\n    }\n}\n\nstatic bool eliminate_race(TNode* nd, size_t d, VTN& nodes, TNode* look) {\n    // printf(\"eliminate_race %ld 
%ld\\n\", nd->nodevec_index, d);\n    // opportunistically move that number of leaves\n    // error if no leaves left to move.\n    size_t i = look->nodevec_index;\n    while (d > 0) {\n        i = next_leaf(nodes[i], nodes);\n        if (i == 0) {\n            return false;\n        }\n        size_t n = 1;\n        while (nodes[i - 1]->children.empty() && n < d) {\n            --i;\n            ++n;\n        }\n        // printf(\"  move_nodes src=%ld len=%ld dest=%ld\\n\", i, n, nd->nodevec_index);\n        move_nodes(i, n, nd->nodevec_index + 1, nodes);\n        d -= n;\n    }\n    checkrace(nd, nodes);\n    return true;\n}\n\nstatic void eliminate_prace(TNode* nd, VTN& nodes) {\n    size_t d = warpsize - dist2child(nd);\n    bool b = eliminate_race(nd, d, nodes, nd);\n    if (0 && !b) {\n        printf(\"could not eliminate prace for g=%ld  c=%ld l=%ld o=%ld   %ld\\n\",\n               nd->groupindex,\n               nd->cellindex,\n               nd->level,\n               nd->treenode_order,\n               nd->hash);\n    }\n}\n\nstatic void eliminate_crace(TNode* nd, VTN& nodes) {\n    size_t c0 = nd->children[0]->nodevec_index;\n    size_t c1 = nd->children[1]->nodevec_index;\n    size_t d = warpsize - ((c0 > c1) ? 
(c0 - c1) : (c1 - c0));\n    TNode* cnd = nd->children[0];\n    bool b = eliminate_race(cnd, d, nodes, nd);\n    if (0 && !b) {\n        printf(\"could not eliminate crace for g=%ld  c=%ld l=%ld o=%ld   %ld\\n\",\n               nd->groupindex,\n               nd->cellindex,\n               nd->level,\n               nd->treenode_order,\n               nd->hash);\n    }\n}\n\nstatic void question2(VVTN& levels) {\n    // number of compartments in the group\n    std::size_t nnode = std::accumulate(levels.begin(),\n                                        levels.end(),\n                                        0,\n                                        [](std::size_t s, const VTN& l) { return s + l.size(); });\n    VTN nodes(nnode);  // store the sorted nodes from analyze function\n    nnode = 0;\n    for (const auto& level: levels) {\n        for (const auto& l: level) {\n            nodes[nnode++] = l;\n        }\n    }\n    for (size_t i = 0; i < nodes.size(); ++i) {\n        nodes[i]->nodevec_index = i;\n    }\n\n    //  how_many_warpsize_groups_have_only_leaves(nodes);\n\n    // Here we need to make sure that the dependent nodes\n    // belong to separate warps\n\n    // work backward and check the distance from parent to children.\n    // if parent in different group (warp?) then there is no vitiating race.\n    // if children in different group (warp?) then there is no race (satisfied by\n    // atomic).\n    // If there is a vitiating race, then figure out how many nodes\n    // need to be inserted just before the parent to avoid the race.\n    //   It is not clear if we should prioritize safe nodes (when moved they\n    //   do not introduce a race) and/or contiguous nodes (probably, to keep\n    //   the low hanging fruit together).\n    //   At least, moved nodes should have proper tree order and not themselves\n    //   introduce a race at their new location.  
Leaves are nice in that there\n    //   are no restrictions in movement toward higher indices.\n    //   Note that unless groups of 32 are inserted, it may be the case that\n    //   races are generated at greater indices since otherwise a portion of\n    //   each group is placed into the next group. This would not be an issue\n    //   if, in fact, the stronger requirement of every parent having\n    //   pi (parent index) + 32 <= ci (child index) is demanded instead of merely being in different\n    //   warpsize. One nice thing about adding warpsize nodes is that it does not disturb any\n    //   existing contiguous groups except the moved group which gets divided between parent\n    //   warpsize and child, where the nodes past the parent get same relative indices in the next\n    //   warpsize\n\n    //  let's see how well we can do by opportunistically moving leaves to\n    //  separate parents from children by warpsize (ie is_parent_race2 is false)\n    //  Hopefully, we won't run out of leaves before eliminating all\n    //  is_parent_race2\n\n    if (0 && nodes.size() % warpsize != 0) {\n        size_t nnode = nodes.size() - levels[0].size();\n        printf(\"warp of %ld cells has %ld nodes in last cycle %ld\\n\",\n               levels[0].size(),\n               nnode % warpsize,\n               nnode / warpsize + 1);\n    }\n\n    //  pr_race_situation(nodes);\n\n    // eliminate parent and children races using leaves\n    // traverse all the children (no roots)\n    for (size_t i = nodes.size() - 1; i >= levels[0].size(); --i) {\n        TNode* nd = nodes[i];\n        if (is_child_race2(nd)) {\n            eliminate_crace(nd, nodes);\n            i = nd->nodevec_index;\n        }\n        if (is_parent_race2(nd)) {\n            eliminate_prace(nd, nodes);\n            i = nd->nodevec_index;\n        }\n    }\n    // copy nodes indices to treenode_order\n    for (size_t i = 0; i < nodes.size(); ++i) {\n        nodes[i]->treenode_order = i;\n    
}\n}\n\n// analyze each group of cells\n// the cells are grouped based on warp balance (lpt) algorithm\nstatic void analyze(VVTN& levels) {\n    // sort each level with respect to parent level order\n    // earliest parent level first.\n\n    // treenode order can be anything as long as first children < second\n    // children etc.. After sorting a level, the order will be correct for\n    // that level, ranging from [0:level.size]\n    for (auto& level: levels) {\n        chklevel(level);  // does nothing\n        for (const auto& nd: level) {\n            for (size_t k = 0; k < nd->children.size(); ++k) {\n                nd->children[k]->treenode_order = k;\n            }\n        }\n    }\n\n    for (auto& level: levels) {\n        sortlevel(level);\n        chklevel(level);  // does nothing\n    }\n\n    set_treenode_order(levels);\n}\n\nvoid prgroupsize(VVVTN& groups) {\n#if CORENRN_DEBUG\n    for (size_t i = 0; i < groups[0].size(); ++i) {\n        printf(\"%5ld\\n\", i);\n        for (const auto& group: groups) {\n            printf(\" %5ld\", group[i].size());\n        }\n        printf(\"\\n\");\n    }\n#endif\n}\n\n// group index primary, treenode_order secondary\nstatic bool final_nodevec_cmp(TNode* a, TNode* b) {\n    bool result = false;\n    if (a->groupindex < b->groupindex) {\n        result = true;\n    } else if (a->groupindex == b->groupindex) {\n        if (a->treenode_order < b->treenode_order) {\n            result = true;\n        }\n    }\n    return result;\n}\n\nstatic void set_nodeindex(VecTNode& nodevec) {\n    for (size_t i = 0; i < nodevec.size(); ++i) {\n        nodevec[i]->nodevec_index = i;\n    }\n}\n\nvoid group_order2(VecTNode& nodevec, size_t groupsize, size_t ncell) {\n    size_t maxlevel = level_from_root(nodevec);\n\n    // reset TNode.groupindex\n    size_t nwarp = warp_balance(ncell, nodevec);\n\n    // work on a cellgroup as a vector of levels. 
ie only possible race is\n    // two children in same warpsize\n\n    // every warp deals with a group of cells\n    // the cell dispatching to the available groups is done through the warp_balance function (lpt\n    // algo)\n    VVVTN groups(nwarp ? nwarp : (ncell / groupsize + ((ncell % groupsize) ? 1 : 0)));\n\n    for (auto& group: groups) {\n        group.resize(maxlevel + 1);\n    }\n\n    // group the cells according to their groupindex and according to their level (see\n    // level_from_root)\n    for (const auto& nd: nodevec) {\n        groups[nd->groupindex][nd->level].push_back(nd);\n    }\n\n    prgroupsize(groups);  // debugging\n\n    // deal with each group\n    for (auto& group: groups) {\n        analyze(group);\n        question2(group);\n    }\n\n    // final nodevec order according to group_index and treenode_order\n    std::sort(nodevec.begin() + ncell, nodevec.end(), final_nodevec_cmp);\n    set_nodeindex(nodevec);\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/permute/data_layout.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/permute/data_layout.hpp\"\n#include \"coreneuron/mechanism/mechanism.hpp\"\n#include \"coreneuron/permute/node_permute.h\"\n#include \"coreneuron/mechanism/membfunc.hpp\"\n\nnamespace coreneuron {\n/*\n * Return the index to mechanism variable based Original input files are organized in AoS\n */\nint get_data_index(int node_index, int variable_index, int mtype, Memb_list* ml) {\n    int layout = corenrn.get_mech_data_layout()[mtype];\n    nrn_assert(layout == SOA_LAYOUT);\n    return variable_index * ml->_nodecount_padded + node_index;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/permute/data_layout.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#define SOA_LAYOUT 0\n#define AOS_LAYOUT 1\nnamespace coreneuron {\nstruct Memb_list;\nint get_data_index(int node_index, int variable_index, int mtype, Memb_list* ml);\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/permute/node_permute.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n/*\nBelow, the sense of permutation, is reversed. Though consistent, forward\npermutation should be defined as (and the code should eventually transformed)\nso that\n  v: original vector\n  p: forward permutation\n  pv: permuted vector\n  pv[i] = v[p[i]]\nand\n  pinv: inverse permutation\n  pv[pinv[i]] = v[i]\nNote: pinv[p[i]] = i = p[pinv[i]]\n*/\n\n/*\nPermute nodes.\n\nTo make gaussian elimination on gpu more efficient.\n\nPermutation vector p[i] applied to a data vector, moves the data_original[i]\nto data[p[i]].\nThat suffices for node properties such as area[i], a[i], b[i]. e.g.\n  area[p[i]] <- area_original[i]\n\nNotice that p on the left side is a forward permutation. On the right side\nit serves as the inverse permutation.\narea_original[i] <- area_permuted[p[i]]\n\nbut things\nget a bit more complicated when the data is an integer index into the\noriginal data.\n\nFor example:\n\nparent[i] needs to be transformed so that\nparent[p[i]] <- p[parent_original[i]] except that if parent_original[j] = -1\n  then parent[p[j]] = -1\n\nmembrane mechanism nodelist ( a subset of nodes) needs to be at least\nminimally transformed so that\nnodelist_new[k] <- p[nodelist_original[k]]\nThis does not affect the order of the membrane mechanism property data.\n\nHowever, computation is more efficient to permute (sort) nodelist_new so that\nit follows as much as possible the permuted node ordering, ie in increasing\nnode order.  Consider this further mechanism specific nodelist permutation,\nwhich is to be applied to the above nodelist_new, to be p_m, which has the same\nsize as nodelist. 
ie.\nnodelist[p_m[k]] <- nodelist_new[k].\n\nNotice the similarity to the parent case...\nnodelist[p_m[k]] = p[nodelist_original[k]]\n\nand now the membrane mechanism node data does need to be permuted to have an\norder consistent with the new nodelist. There are nm instances of the\nmechanism, each with sz data values (consider AoS layout).\nThe data permutation is\nfor k=[0:nm] for isz=[0:sz]\n  data_m[p_m[k]*sz + isz] = data_m_original[k*sz + isz]\n\nFor an SoA layout the indexing is k + isz*nm (where nm may include padding).\n\nA more complicated case is a mechanism's dparam array (nm instances each with\ndsz values). Some of those values are indices into another mechanism (eg\npointers to ion properties) or voltage or area depending on the semantics of\nthe value. We can use the above data_m permutation but then need to update\nthe values according to the permutation of the object the value indexes into.\nConsider the permutation of the target object to be p_t . Then a value\niold = pdata_m(k, isz) - data_t in AoS format\nrefers to k_t = iold / sz_t and isz_t = iold % sz_t\nand for a target in SoA format k_t = iold % nm_t and isz_t = iold / nm_t\nie k_t_new = p_t[k_t] so, for AoS, inew = k_t_new*sz_t + isz_t\nor, for SoA, inew = k_t_new + isz_t*nm_t\nso pdata_m(k, isz) = inew + data_t\n\n\n*/\n\n#include <vector>\n#include <utility>\n#include <algorithm>\n\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/io/nrn_setup.hpp\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/coreneuron.hpp\"\nnamespace coreneuron {\ntemplate <typename T>\nvoid permute(T* data, int cnt, int sz, int layout, int* p) {\n    // data(p[icnt], isz) <- data(icnt, isz)\n    // this does not change data, merely permutes it.\n    // assert len(p) == cnt\n    if (!p) {\n        return;\n    }\n    int n = cnt * sz;\n    if (n < 1) {\n        return;\n    }\n\n    if (layout == Layout::SoA) {  // 
for SoA, n might be larger due to cnt padding\n        n = nrn_soa_padded_size(cnt, layout) * sz;\n    }\n\n    T* data_orig = new T[n];\n    for (int i = 0; i < n; ++i) {\n        data_orig[i] = data[i];\n    }\n\n    for (int icnt = 0; icnt < cnt; ++icnt) {\n        for (int isz = 0; isz < sz; ++isz) {\n            // note that when layout==0, nrn_i_layout takes into account SoA padding.\n            int i = nrn_i_layout(icnt, cnt, isz, sz, layout);\n            int ip = nrn_i_layout(p[icnt], cnt, isz, sz, layout);\n            data[ip] = data_orig[i];\n        }\n    }\n\n    delete[] data_orig;\n}\n\nint* inverse_permute(int* p, int n) {\n    int* pinv = new int[n];\n    for (int i = 0; i < n; ++i) {\n        pinv[p[i]] = i;\n    }\n    return pinv;\n}\n\nstatic void invert_permute(int* p, int n) {\n    int* pinv = inverse_permute(p, n);\n    for (int i = 0; i < n; ++i) {\n        p[i] = pinv[i];\n    }\n    delete[] pinv;\n}\n\n// type_of_ntdata: Return the mechanism type (or voltage)  for nt._data[i].\n// Used for updating POINTER. Analogous to nrn_dblpntr2nrncore in NEURON.\n// To reduce search time, consider voltage first, then the type_hints, which\n// store a few of the previous search result types to try next.\n// Most usage is for voltage. Most of the rest is likely for a specific type.\n// Occasionally, eg. axial current, there are two types oscillating between\n// a SUFFIX (for non-zero area node) and POINT_PROCESS (for zero area nodes)\n// version\n// full_search: helper for type_of_ntdata. 
Return mech type for nt._data[i].\n// Update type_hints.\n\nstatic std::vector<int> type_hints;\n\nstatic int full_search(NrnThread& nt, double* pd) {\n    int type = -1;\n    for (NrnThreadMembList* tml = nt.tml; tml; tml = tml->next) {\n        Memb_list* ml = tml->ml;\n        int n = corenrn.get_prop_param_size()[tml->index] * ml->_nodecount_padded;\n        if (pd >= ml->data && pd < ml->data + n) {\n            type = tml->index;\n            // insert into type_hints\n            int i = 0;\n            for (int type_hint: type_hints) {\n                if (type < type_hint) {\n                    break;\n                }\n                i++;\n            }\n            type_hints.insert(type_hints.begin() + i, type);\n            break;\n        }\n    }\n    assert(type > 0);\n    return type;\n}\n\n// no longer static because also used by POINTER in nrn_checkpoint.cpp\nint type_of_ntdata(NrnThread& nt, int i, bool reset) {\n    double* pd = nt._data + i;\n    assert(pd >= nt._actual_v);\n    if (pd < nt._actual_area) {  // voltage first (area just after voltage)\n        return voltage;\n    }\n    assert(size_t(i) < nt._ndata);\n    // then check the type hints. 
When inserting a hint, keep in type order\n    if (reset) {\n        type_hints.clear();\n    }\n    for (int type: type_hints) {\n        Memb_list* ml = nt._ml_list[type];\n        if (pd >= ml->data) {  // this or later\n            int n = corenrn.get_prop_param_size()[type] * ml->_nodecount_padded;\n            if (pd < ml->data + n) {  // this is the one\n                return type;\n            }\n        } else {  // earlier\n            return full_search(nt, pd);\n        }\n    }\n    // after the last type_hints\n    return full_search(nt, pd);\n}\n\nstatic void update_pdata_values(Memb_list* ml, int type, NrnThread& nt) {\n    // assumes AoS to SoA transformation already made since we are using\n    // nrn_i_layout to determine indices into both ml->pdata and into target data\n    int psz = corenrn.get_prop_dparam_size()[type];\n    if (psz == 0) {\n        return;\n    }\n    if (corenrn.get_is_artificial()[type]) {\n        return;\n    }\n    int* semantics = corenrn.get_memb_func(type).dparam_semantics;\n    if (!semantics) {\n        return;\n    }\n    int* pdata = ml->pdata;\n    int layout = corenrn.get_mech_data_layout()[type];\n    int cnt = ml->nodecount;\n    // ml padding does not matter (but target padding does matter)\n\n    // interesting semantics are -1 (area), -5 (pointer), -9 (diam), or 0-999 (ion variables)\n    for (int i = 0; i < psz; ++i) {\n        int s = semantics[i];\n        if (s == -1) {                               // area\n            int area0 = nt._actual_area - nt._data;  // includes padding if relevant\n            int* p_target = nt._permute;\n            for (int iml = 0; iml < cnt; ++iml) {\n                int* pd = pdata + nrn_i_layout(iml, cnt, i, psz, layout);\n                // *pd is the original integer into nt._data . 
Needs to be replaced\n                // by the permuted value\n\n                // This is ok whether or not area changed by padding?\n                // since old *pd updated appropriately by earlier AoS to SoA\n                // transformation\n                int ix = *pd - area0;  // original integer into area array.\n                nrn_assert((ix >= 0) && (ix < nt.end));\n                int ixnew = p_target[ix];\n                *pd = ixnew + area0;\n            }\n        } else if (s == -9) {                        // diam\n            int diam0 = nt._actual_diam - nt._data;  // includes padding if relevant\n            int* p_target = nt._permute;\n            for (int iml = 0; iml < cnt; ++iml) {\n                int* pd = pdata + nrn_i_layout(iml, cnt, i, psz, layout);\n                // *pd is the original integer into nt._data . Needs to be replaced\n                // by the permuted value\n\n                // This is ok whether or not diam changed by padding?\n                // since old *pd updated appropriately by earlier AoS to SoA\n                // transformation\n                int ix = *pd - diam0;  // original integer into actual_diam array.\n                nrn_assert((ix >= 0) && (ix < nt.end));\n                int ixnew = p_target[ix];\n                *pd = ixnew + diam0;\n            }\n        } else if (s == -5) {  // POINTER\n            // assume pointer into nt._data. 
Most likely voltage.\n            // If not voltage, most likely same mechanism for all indices.\n            for (int iml = 0; iml < cnt; ++iml) {\n                int* pd = pdata + nrn_i_layout(iml, cnt, i, psz, layout);\n                int etype = type_of_ntdata(nt, *pd, iml == 0);\n                if (etype == voltage) {\n                    int v0 = nt._actual_v - nt._data;\n                    int* e_target = nt._permute;\n                    int ix = *pd - v0;  // original integer into voltage array.\n                    nrn_assert((ix >= 0) && (ix < nt.end));\n                    int ixnew = e_target[ix];\n                    *pd = ixnew + v0;\n                } else if (etype > 0) {\n                    // about same as for ion below but check each instance\n                    Memb_list* eml = nt._ml_list[etype];\n                    int edata0 = eml->data - nt._data;\n                    int ecnt = eml->nodecount;\n                    int esz = corenrn.get_prop_param_size()[etype];\n                    int elayout = corenrn.get_mech_data_layout()[etype];\n                    int* e_permute = eml->_permute;\n                    int i_ecnt, i_esz, padded_ecnt;\n                    int ix = *pd - edata0;\n                    if (elayout == Layout::AoS) {\n                        padded_ecnt = ecnt;\n                        i_ecnt = ix / esz;\n                        i_esz = ix % esz;\n                    } else {  // SoA\n                        assert(elayout == Layout::SoA);\n                        padded_ecnt = nrn_soa_padded_size(ecnt, elayout);\n                        i_ecnt = ix % padded_ecnt;\n                        i_esz = ix / padded_ecnt;\n                    }\n                    int i_ecnt_new = e_permute ? 
e_permute[i_ecnt] : i_ecnt;\n                    int ix_new = nrn_i_layout(i_ecnt_new, ecnt, i_esz, esz, elayout);\n                    *pd = ix_new + edata0;\n                } else {\n                    nrn_assert(0);\n                }\n            }\n        } else if (s >= 0 && s < 1000) {  // ion\n            int etype = s;\n            int elayout = corenrn.get_mech_data_layout()[etype];\n            Memb_list* eml = nt._ml_list[etype];\n            int edata0 = eml->data - nt._data;\n            int ecnt = eml->nodecount;\n            int esz = corenrn.get_prop_param_size()[etype];\n            int* e_permute = eml->_permute;\n            for (int iml = 0; iml < cnt; ++iml) {\n                int* pd = pdata + nrn_i_layout(iml, cnt, i, psz, layout);\n                int ix = *pd - edata0;\n                // from ix determine i_ecnt and i_esz (need to permute i_ecnt)\n                int i_ecnt, i_esz, padded_ecnt;\n                if (elayout == Layout::AoS) {\n                    padded_ecnt = ecnt;\n                    i_ecnt = ix / esz;\n                    i_esz = ix % esz;\n                } else {  // SoA\n                    assert(elayout == Layout::SoA);\n                    padded_ecnt = nrn_soa_padded_size(ecnt, elayout);\n                    i_ecnt = ix % padded_ecnt;\n                    i_esz = ix / padded_ecnt;\n                }\n                int i_ecnt_new = e_permute[i_ecnt];\n                int ix_new = nrn_i_layout(i_ecnt_new, ecnt, i_esz, esz, elayout);\n                *pd = ix_new + edata0;\n            }\n        }\n    }\n}\n\nvoid node_permute(int* vec, int n, int* permute) {\n    for (int i = 0; i < n; ++i) {\n        if (vec[i] >= 0) {\n            vec[i] = permute[vec[i]];\n        }\n    }\n}\n\nvoid permute_ptr(int* vec, int n, int* p) {\n    permute(vec, n, 1, 1, p);\n}\n\nvoid permute_data(double* vec, int n, int* p) {\n    permute(vec, n, 1, 1, p);\n}\n\nvoid permute_ml(Memb_list* ml, int type, NrnThread& nt) {\n    
int sz = corenrn.get_prop_param_size()[type];\n    int psz = corenrn.get_prop_dparam_size()[type];\n    int layout = corenrn.get_mech_data_layout()[type];\n    permute(ml->data, ml->nodecount, sz, layout, ml->_permute);\n    permute(ml->pdata, ml->nodecount, psz, layout, ml->_permute);\n\n    update_pdata_values(ml, type, nt);\n}\n\nint nrn_index_permute(int ix, int type, Memb_list* ml) {\n    int* p = ml->_permute;\n    if (!p) {\n        return ix;\n    }\n    int layout = corenrn.get_mech_data_layout()[type];\n    if (layout == Layout::AoS) {\n        int sz = corenrn.get_prop_param_size()[type];\n        int i_cnt = ix / sz;\n        int i_sz = ix % sz;\n        return p[i_cnt] * sz + i_sz;\n    } else {\n        assert(layout == Layout::SoA);\n        int padded_cnt = nrn_soa_padded_size(ml->nodecount, layout);\n        int i_cnt = ix % padded_cnt;\n        int i_sz = ix / padded_cnt;\n        return i_sz * padded_cnt + p[i_cnt];\n    }\n}\n\n#if CORENRN_DEBUG\nstatic void pr(const char* s, int* x, int n) {\n    printf(\"%s:\", s);\n    for (int i = 0; i < n; ++i) {\n        printf(\"  %d %d\", i, x[i]);\n    }\n    printf(\"\\n\");\n}\n\nstatic void pr(const char* s, double* x, int n) {\n    printf(\"%s:\", s);\n    for (int i = 0; i < n; ++i) {\n        printf(\"  %d %g\", i, x[i]);\n    }\n    printf(\"\\n\");\n}\n#endif\n\n// note that sort_indices has the sense of an inverse permutation in that\n// the value of sort_indices[0] is the index with the smallest value in the\n// indices array\n\nstatic bool nrn_index_sort_cmp(const std::pair<int, int>& a, const std::pair<int, int>& b) {\n    bool result = false;\n    if (a.first < b.first) {\n        result = true;\n    } else if (a.first == b.first) {\n        if (a.second < b.second) {\n            result = true;\n        }\n    }\n    return result;\n}\n\nstatic int* nrn_index_sort(int* values, int n) {\n    std::vector<std::pair<int, int>> vi(n);\n    for (int i = 0; i < n; ++i) {\n        vi[i].first = 
values[i];\n        vi[i].second = i;\n    }\n    std::sort(vi.begin(), vi.end(), nrn_index_sort_cmp);\n    int* sort_indices = new int[n];\n    for (int i = 0; i < n; ++i) {\n        sort_indices[i] = vi[i].second;\n    }\n    return sort_indices;\n}\n\nvoid permute_nodeindices(Memb_list* ml, int* p) {\n    // nodeindices values are permuted according to p (that per se does\n    //  not affect vec).\n\n    node_permute(ml->nodeindices, ml->nodecount, p);\n\n    // Then the new node indices are sorted by\n    // increasing index. Instances using the same node stay in same\n    // original relative order so that their contributions to rhs, d (if any)\n    // remain in same order (except for gpu parallelism).\n    // That becomes ml->_permute\n\n    ml->_permute = nrn_index_sort(ml->nodeindices, ml->nodecount);\n    invert_permute(ml->_permute, ml->nodecount);\n    permute_ptr(ml->nodeindices, ml->nodecount, ml->_permute);\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/permute/node_permute.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include \"coreneuron/sim/multicore.hpp\"\n\nnamespace coreneuron {\n// determine ml->_permute and permute the ml->nodeindices accordingly\nvoid permute_nodeindices(Memb_list* ml, int* permute);\n\n// vec values >= 0 updated according to permutation\nvoid node_permute(int* vec, int n, int* permute);\n\n// moves values to new location but does not change those values\nvoid permute_ptr(int* vec, int n, int* permute);\n\nvoid permute_data(double* vec, int n, int* permute);\nvoid permute_ml(Memb_list* ml, int type, NrnThread& nt);\nint nrn_index_permute(int, int type, Memb_list* ml);\n\nint* inverse_permute(int* p, int n);\n\nint type_of_ntdata(NrnThread&, int index, bool reset);\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/fadvance_core.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <functional>\n\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/sim/fast_imem.hpp\"\n#include \"coreneuron/gpu/nrn_acc_manager.hpp\"\n#include \"coreneuron/io/reports/nrnreport.hpp\"\n#include \"coreneuron/network/netcvode.hpp\"\n#include \"coreneuron/network/netpar.hpp\"\n#include \"coreneuron/network/partrans.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n#include \"coreneuron/utils/progressbar/progressbar.hpp\"\n#include \"coreneuron/utils/profile/profiler_interface.h\"\n#include \"coreneuron/io/nrn2core_direct.h\"\n\nnamespace coreneuron {\nstatic void* nrn_fixed_step_thread(NrnThread*);\nstatic void nrn_fixed_step_group_thread(NrnThread*, int, int, int&);\n\n\nnamespace {\n\nclass ProgressBar final {\n    progressbar* pbar;\n    int current_step = 0;\n    bool show;\n    constexpr static int progressbar_update_steps = 5;\n\n  public:\n    ProgressBar(int nsteps)\n        : show(nrnmpi_myid == 0 && !corenrn_param.is_quiet()) {\n        if (show) {\n            printf(\"\\n\");\n            pbar = progressbar_new(\"psolve\", nsteps);\n        }\n    }\n\n    void update(int step, double time) {\n        current_step = step;\n        if (show && (current_step % progressbar_update_steps) == 0) {\n            progressbar_update(pbar, current_step, time);\n        }\n    }\n\n    void step(double time) {\n        update(current_step + 1, time);\n    }\n\n    ~ProgressBar() {\n        if (show) {\n            progressbar_finish(pbar);\n        }\n    }\n};\n\n}  // unnamed 
namespace\n\n\nvoid dt2thread(double adt) { /* copied from nrnoc/fadvance.c */\n    if (adt != nrn_threads[0]._dt) {\n        for (int i = 0; i < nrn_nthread; ++i) {\n            NrnThread* nt = nrn_threads + i;\n            nt->_t = t;\n            nt->_dt = dt;\n            if (secondorder) {\n                nt->cj = 2.0 / dt;\n            } else {\n                nt->cj = 1.0 / dt;\n            }\n            nrn_pragma_acc(update device(nt->_t, nt->_dt, nt->cj)\n                               async(nt->stream_id) if (nt->compute_gpu))\n            // clang-format off\n            nrn_pragma_omp(target update to(nt->_t, nt->_dt, nt->cj)\n                                         if(nt->compute_gpu))\n            // clang-format on\n        }\n    }\n}\n\nvoid nrn_fixed_step_minimal() { /* not so minimal anymore with gap junctions */\n    Instrumentor::phase p_timestep(\"timestep\");\n    if (t != nrn_threads->_t) {\n        dt2thread(-1.);\n    } else {\n        dt2thread(dt);\n    }\n    nrn_thread_table_check();\n    nrn_multithread_job(nrn_fixed_step_thread);\n    if (nrn_have_gaps) {\n        {\n            Instrumentor::phase p_gap(\"gap-v-transfer\");\n            nrnmpi_v_transfer();\n        }\n        nrn_multithread_job(nrn_fixed_step_lastpart);\n    }\n#if NRNMPI\n    if (nrn_threads[0]._stop_stepping) {\n        nrn_spike_exchange(nrn_threads);\n    }\n#endif\n\n#if defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\n    {\n        Instrumentor::phase p(\"flush_reports\");\n        nrn_flush_reports(nrn_threads[0]._t);\n    }\n#endif\n    t = nrn_threads[0]._t;\n}\n\n/* better cache efficiency since a thread can do an entire minimum delay\nintegration interval before joining\n*/\n/// --> Coreneuron\n\n\nvoid nrn_fixed_single_steps_minimal(int total_sim_steps, double tstop) {\n    ProgressBar progress_bar(total_sim_steps);\n#if NRNMPI\n    double updated_tstop = tstop - dt;\n    nrn_assert(nrn_threads->_t <= tstop);\n    // It may very 
well be the case that we do not advance at all\n    while (nrn_threads->_t <= updated_tstop) {\n#else\n    double updated_tstop = tstop - .5 * dt;\n    while (nrn_threads->_t < updated_tstop) {\n#endif\n        nrn_fixed_step_minimal();\n        if (stoprun) {\n            break;\n        }\n        progress_bar.step(nrn_threads[0]._t);\n    }\n}\n\n\nvoid nrn_fixed_step_group_minimal(int total_sim_steps) {\n    dt2thread(dt);\n    nrn_thread_table_check();\n    int step_group_n = total_sim_steps;\n    int step_group_begin = 0;\n    int step_group_end = 0;\n\n    ProgressBar progress_bar(step_group_n);\n    while (step_group_end < step_group_n) {\n        nrn_multithread_job(nrn_fixed_step_group_thread,\n                            step_group_n,\n                            step_group_begin,\n                            step_group_end);\n#if NRNMPI\n        nrn_spike_exchange(nrn_threads);\n#endif\n\n#if defined(ENABLE_BIN_REPORTS) || defined(ENABLE_SONATA_REPORTS)\n        {\n            Instrumentor::phase p(\"flush_reports\");\n            nrn_flush_reports(nrn_threads[0]._t);\n        }\n#endif\n        if (stoprun) {\n            break;\n        }\n        step_group_begin = step_group_end;\n        progress_bar.update(step_group_end, nrn_threads[0]._t);\n    }\n    t = nrn_threads[0]._t;\n}\n\nstatic void nrn_fixed_step_group_thread(NrnThread* nth,\n                                        int step_group_max,\n                                        int step_group_begin,\n                                        int& step_group_end) {\n    nth->_stop_stepping = 0;\n    for (int i = step_group_begin; i < step_group_max; ++i) {\n        Instrumentor::phase p_timestep(\"timestep\");\n        nrn_fixed_step_thread(nth);\n        if (nth->_stop_stepping) {\n            if (nth->id == 0) {\n                step_group_end = i + 1;\n            }\n            nth->_stop_stepping = 0;\n            return;\n        }\n    }\n    if (nth->id == 0) {\n        
step_group_end = step_group_max;\n    }\n}\n\nvoid update(NrnThread* _nt) {\n    double* vec_v = &(VEC_V(0));\n    double* vec_rhs = &(VEC_RHS(0));\n    int i2 = _nt->end;\n\n    /* do not need to worry about linmod or extracellular*/\n    if (secondorder) {\n        nrn_pragma_acc(parallel loop present(vec_v [0:i2], vec_rhs [0:i2]) if (_nt->compute_gpu)\n                           async(_nt->stream_id))\n        nrn_pragma_omp(target teams distribute parallel for simd if(_nt->compute_gpu))\n        for (int i = 0; i < i2; ++i) {\n            vec_v[i] += 2. * vec_rhs[i];\n        }\n    } else {\n        nrn_pragma_acc(parallel loop present(vec_v [0:i2], vec_rhs [0:i2]) if (_nt->compute_gpu)\n                           async(_nt->stream_id))\n        nrn_pragma_omp(target teams distribute parallel for simd if(_nt->compute_gpu))\n        for (int i = 0; i < i2; ++i) {\n            vec_v[i] += vec_rhs[i];\n        }\n    }\n\n    if (_nt->tml) {\n        assert(_nt->tml->index == CAP);\n        nrn_cur_capacitance(_nt, _nt->tml->ml, _nt->tml->index);\n    }\n    if (nrn_use_fast_imem) {\n        nrn_calc_fast_imem(_nt);\n    }\n}\n\nvoid nonvint(NrnThread* _nt) {\n    if (nrn_have_gaps) {\n        Instrumentor::phase p(\"gap-v-transfer\");\n        nrnthread_v_transfer(_nt);\n    }\n    errno = 0;\n\n    Instrumentor::phase_begin(\"state-update\");\n    for (auto tml = _nt->tml; tml; tml = tml->next)\n        if (corenrn.get_memb_func(tml->index).state) {\n            mod_f_t s = corenrn.get_memb_func(tml->index).state;\n            std::string ss(\"state-\");\n            ss += nrn_get_mechname(tml->index);\n            {\n                Instrumentor::phase p(ss.c_str());\n                (*s)(_nt, tml->ml, tml->index);\n            }\n#ifdef DEBUG\n            if (errno) {\n                hoc_warning(\"errno set during calculation of states\", nullptr);\n            }\n#endif\n        }\n    Instrumentor::phase_end(\"state-update\");\n}\n\nvoid nrn_ba(NrnThread* 
nt, int bat) {\n    for (auto tbl = nt->tbl[bat]; tbl; tbl = tbl->next) {\n        mod_f_t f = tbl->bam->f;\n        int type = tbl->bam->type;\n        Memb_list* ml = tbl->ml;\n        (*f)(nt, ml, type);\n    }\n}\n\nvoid nrncore2nrn_send_init() {\n    if (nrn2core_trajectory_values_ == nullptr) {\n        // standalone execution : no callbacks\n        return;\n    }\n    // if per time step transfer, need to call nrn_record_init() in NEURON.\n    // if storing full trajectories in CoreNEURON, need to initialize\n    // vsize for all the trajectory requests.\n    (*nrn2core_trajectory_values_)(-1, 0, nullptr, 0.0);\n    for (int tid = 0; tid < nrn_nthread; ++tid) {\n        NrnThread& nt = nrn_threads[tid];\n        if (nt.trajec_requests) {\n            nt.trajec_requests->vsize = 0;\n        }\n    }\n}\n\nvoid nrncore2nrn_send_values(NrnThread* nth) {\n    if (nrn2core_trajectory_values_ == nullptr) {\n        // standalone execution : no callbacks\n        return;\n    }\n\n    TrajectoryRequests* tr = nth->trajec_requests;\n    if (tr) {\n        if (tr->varrays) {  // full trajectories into Vector data\n            int vs = tr->vsize++;\n            // make sure we do not overflow the `varrays` buffers\n            assert(vs < tr->bsize);\n\n            nrn_pragma_acc(parallel loop present(tr [0:1]) if (nth->compute_gpu)\n                               async(nth->stream_id))\n            nrn_pragma_omp(target teams distribute parallel for simd if(nth->compute_gpu))\n            for (int i = 0; i < tr->n_trajec; ++i) {\n                tr->varrays[i][vs] = *tr->gather[i];\n            }\n        } else if (tr->scatter) {  // scatter to NEURON and notify each step.\n            nrn_assert(nrn2core_trajectory_values_);\n            // Note that this is rather inefficient: we generate one `acc update\n            // self` call for each `double` value (voltage, membrane current,\n            // mechanism property, ...) 
that is being recorded, even though in most\n            // cases these values will actually fall in a small number of contiguous\n            // ranges in memory. A better solution, if the performance of this\n            // branch becomes limiting, might be to offload this loop to the\n            // device and populate some `scatter_values` array there and copy it\n            // back with a single transfer. Note that the `async` clause here\n            // should guarantee that correct values are reported even for\n            // mechanism data that is updated in `nrn_state`. See also:\n            // https://github.com/BlueBrain/CoreNeuron/issues/611\n            for (int i = 0; i < tr->n_trajec; ++i) {\n                double* gather_i = tr->gather[i];\n                static_cast<void>(gather_i);\n                nrn_pragma_acc(update self(gather_i [0:1]) if (nth->compute_gpu)\n                                   async(nth->stream_id))\n                nrn_pragma_omp(target update from(gather_i [0:1]) if (nth->compute_gpu))\n            }\n            nrn_pragma_acc(wait(nth->stream_id))\n            for (int i = 0; i < tr->n_trajec; ++i) {\n                *(tr->scatter[i]) = *(tr->gather[i]);\n            }\n            (*nrn2core_trajectory_values_)(nth->id, tr->n_pr, tr->vpr, nth->_t);\n        }\n    }\n}\n\nstatic void* nrn_fixed_step_thread(NrnThread* nth) {\n    /* check thresholds and deliver all (including binqueue)\n       events up to t+dt/2 */\n    {\n        Instrumentor::phase p(\"deliver-events\");\n        deliver_net_events(nth);\n    }\n\n    nth->_t += .5 * nth->_dt;\n\n    if (nth->ncell) {\n        /*@todo: do we need to update nth->_t on GPU: Yes (Michael, but can\n        launch kernel) */\n        nrn_pragma_acc(update device(nth->_t) if (nth->compute_gpu) async(nth->stream_id))\n        nrn_pragma_acc(wait(nth->stream_id))\n        nrn_pragma_omp(target update to(nth->_t) if (nth->compute_gpu))\n        fixed_play_continuous(nth);\n\n 
       {\n            Instrumentor::phase p(\"setup-tree-matrix\");\n            setup_tree_matrix_minimal(nth);\n        }\n\n        {\n            Instrumentor::phase p(\"matrix-solver\");\n            nrn_solve_minimal(nth);\n        }\n\n        {\n            Instrumentor::phase p(\"second-order-cur\");\n            second_order_cur(nth, secondorder);\n        }\n\n        {\n            Instrumentor::phase p(\"update\");\n            update(nth);\n        }\n    }\n    if (!nrn_have_gaps) {\n        nrn_fixed_step_lastpart(nth);\n    }\n    return nullptr;\n}\n\nvoid* nrn_fixed_step_lastpart(NrnThread* nth) {\n    nth->_t += .5 * nth->_dt;\n\n    if (nth->ncell) {\n        /*@todo: do we need to update nth->_t on GPU */\n        nrn_pragma_acc(update device(nth->_t) if (nth->compute_gpu) async(nth->stream_id))\n        nrn_pragma_acc(wait(nth->stream_id))\n        nrn_pragma_omp(target update to(nth->_t) if (nth->compute_gpu))\n        fixed_play_continuous(nth);\n        nonvint(nth);\n        nrn_ba(nth, AFTER_SOLVE);\n        nrn_ba(nth, BEFORE_STEP);\n        nrncore2nrn_send_values(nth);  // consistent with NEURON. (after BEFORE_STEP)\n    } else {\n        nrncore2nrn_send_values(nth);\n    }\n\n    {\n        Instrumentor::phase p(\"deliver-events\");\n        nrn_deliver_events(nth); /* up to but not past texit */\n    }\n\n    return nullptr;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/fast_imem.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/fast_imem.hpp\"\n#include \"coreneuron/utils/memory.h\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n\nnamespace coreneuron {\n\nextern int nrn_nthread;\nextern NrnThread* nrn_threads;\nbool nrn_use_fast_imem;\n\nvoid fast_imem_free() {\n    for (auto nt = nrn_threads; nt < nrn_threads + nrn_nthread; ++nt) {\n        if (nt->nrn_fast_imem) {\n            free_memory(nt->nrn_fast_imem->nrn_sav_rhs);\n            free_memory(nt->nrn_fast_imem->nrn_sav_d);\n            free_memory(nt->nrn_fast_imem);\n            nt->nrn_fast_imem = nullptr;\n        }\n    }\n}\n\nvoid nrn_fast_imem_alloc() {\n    if (nrn_use_fast_imem) {\n        fast_imem_free();\n        for (auto nt = nrn_threads; nt < nrn_threads + nrn_nthread; ++nt) {\n            int n = nt->end;\n            nt->nrn_fast_imem = (NrnFastImem*) ecalloc_align(1, sizeof(NrnFastImem));\n            nt->nrn_fast_imem->nrn_sav_rhs = (double*) ecalloc_align(n, sizeof(double));\n            nt->nrn_fast_imem->nrn_sav_d = (double*) ecalloc_align(n, sizeof(double));\n        }\n    }\n}\n\nvoid nrn_calc_fast_imem(NrnThread* nt) {\n    int i1 = 0;\n    int i3 = nt->end;\n\n    double* vec_rhs = nt->_actual_rhs;\n    double* vec_area = nt->_actual_area;\n\n    double* fast_imem_d = nt->nrn_fast_imem->nrn_sav_d;\n    double* fast_imem_rhs = nt->nrn_fast_imem->nrn_sav_rhs;\n    nrn_pragma_acc(\n        parallel loop present(vec_rhs, vec_area, fast_imem_d, fast_imem_rhs) if (nt->compute_gpu)\n            async(nt->stream_id))\n    nrn_pragma_omp(target teams distribute parallel for simd if(nt->compute_gpu))\n    for (int i = i1; i < i3; ++i) {\n 
       fast_imem_rhs[i] = (fast_imem_d[i] * vec_rhs[i] + fast_imem_rhs[i]) * vec_area[i] * 0.01;\n    }\n}\n\nvoid nrn_calc_fast_imem_init(NrnThread* nt) {\n    // See the corresponding NEURON nrn_calc_fast_imem_fixedstep_init\n    int i1 = 0;\n    int i3 = nt->end;\n\n    double* vec_rhs = nt->_actual_rhs;\n    double* vec_area = nt->_actual_area;\n\n    double* fast_imem_rhs = nt->nrn_fast_imem->nrn_sav_rhs;\n    nrn_pragma_acc(parallel loop present(vec_rhs, vec_area, fast_imem_rhs) if (nt->compute_gpu)\n                       async(nt->stream_id))\n    nrn_pragma_omp(target teams distribute parallel for simd if(nt->compute_gpu))\n    for (int i = i1; i < i3; ++i) {\n        fast_imem_rhs[i] = (vec_rhs[i] + fast_imem_rhs[i]) * vec_area[i] * 0.01;\n    }\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/fast_imem.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include \"coreneuron/sim/multicore.hpp\"\n\nnamespace coreneuron {\n\n/* Global flag that controls whether the fast_imem\n * calculations are enabled.\n */\nextern bool nrn_use_fast_imem;\n\n/* Free memory allocated for the fast membrane current calculation.\n * Found in src/nrnoc/multicore.c in NEURON.\n */\nvoid fast_imem_free();\n\n/* fast_imem_alloc() wrapper.\n * Found in src/nrnoc/multicore.c in NEURON.\n */\nvoid nrn_fast_imem_alloc();\n\n/* Calculate the new values of rhs array at every timestep.\n * Found in src/nrnoc/fadvance.cpp in NEURON.\n */\nvoid nrn_calc_fast_imem(NrnThread* _nt);\n\n/* Initialization used only in offline (file) mode.\n * See NEURON nrn_calc_fast_imem_fixedstep_init in src/nrnoc/fadvance.cpp\n */\nvoid nrn_calc_fast_imem_init(NrnThread* _nt);\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/finitialize.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/network/netpar.hpp\"\n#include \"coreneuron/network/netcvode.hpp\"\n#include \"coreneuron/sim/fast_imem.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/utils/profile/profiler_interface.h\"\n#include \"coreneuron/coreneuron.hpp\"\n\nnamespace coreneuron {\n\nbool _nrn_skip_initmodel;\n\nvoid allocate_data_in_mechanism_nrn_init() {\n    // Some nrn_init functions allocate data that we need. In that case\n    // we want to call nrn_init but not execute initmodel, i.e. the INITIAL\n    // block. To do so, set _nrn_skip_initmodel to true temporarily,\n    // execute nrn_init, and reset the flag afterwards.\n    _nrn_skip_initmodel = true;\n    for (int i = 0; i < nrn_nthread; ++i) {  // could be parallel\n        NrnThread& nt = nrn_threads[i];\n        for (NrnThreadMembList* tml = nt.tml; tml; tml = tml->next) {\n            Memb_list* ml = tml->ml;\n            mod_f_t s = corenrn.get_memb_func(tml->index).initialize;\n            if (s) {\n                (*s)(&nt, ml, tml->index);\n            }\n        }\n    }\n    _nrn_skip_initmodel = false;\n}\n\nvoid nrn_finitialize(int setv, double v) {\n    Instrumentor::phase_begin(\"finitialize\");\n    t = 0.;\n    dt2thread(-1.);\n    nrn_thread_table_check();\n    clear_event_queue();\n    nrn_spike_exchange_init();\n#if VECTORIZE\n    nrn_play_init(); /* Vector.play */\n                     /// Play events should be executed before initializing events\n    for (int i = 0; i < nrn_nthread; ++i) {\n        nrn_deliver_events(nrn_threads + i); /* The play events at t=0 */\n    }\n    if (setv) {\n        for (auto _nt = nrn_threads; _nt < nrn_threads + nrn_nthread; ++_nt) {\n        
    double* vec_v = &(VEC_V(0));\n            nrn_pragma_acc(\n                parallel loop present(_nt [0:1], vec_v [0:_nt->end]) if (_nt->compute_gpu))\n            nrn_pragma_omp(target teams distribute parallel for simd if(_nt->compute_gpu))\n            for (int i = 0; i < _nt->end; ++i) {\n                vec_v[i] = v;\n            }\n        }\n    }\n\n    if (nrn_have_gaps) {\n        Instrumentor::phase p(\"gap-v-transfer\");\n        nrnmpi_v_transfer();\n        for (int i = 0; i < nrn_nthread; ++i) {\n            nrnthread_v_transfer(nrn_threads + i);\n        }\n    }\n\n    for (int i = 0; i < nrn_nthread; ++i) {\n        nrn_ba(nrn_threads + i, BEFORE_INITIAL);\n    }\n    /* the INITIAL blocks are ordered so that mechanisms that write\n       concentrations are after ions and before mechanisms that read\n       concentrations.\n    */\n    /* the memblist list in NrnThread is already so ordered */\n    for (int i = 0; i < nrn_nthread; ++i) {\n        NrnThread* nt = nrn_threads + i;\n        for (auto tml = nt->tml; tml; tml = tml->next) {\n            mod_f_t s = corenrn.get_memb_func(tml->index).initialize;\n            if (s) {\n                (*s)(nt, tml->ml, tml->index);\n            }\n        }\n    }\n#endif\n\n    init_net_events();\n    for (int i = 0; i < nrn_nthread; ++i) {\n        nrn_ba(nrn_threads + i, AFTER_INITIAL);\n    }\n    for (int i = 0; i < nrn_nthread; ++i) {\n        nrn_deliver_events(nrn_threads + i); /* The INITIAL sent events at t=0 */\n    }\n    for (int i = 0; i < nrn_nthread; ++i) {\n        setup_tree_matrix_minimal(nrn_threads + i);\n        if (nrn_use_fast_imem) {\n            nrn_calc_fast_imem_init(nrn_threads + i);\n        }\n    }\n    for (int i = 0; i < nrn_nthread; ++i) {\n        nrn_ba(nrn_threads + i, BEFORE_STEP);\n    }\n    nrncore2nrn_send_init();\n    for (int i = 0; i < nrn_nthread; ++i) {\n        nrncore2nrn_send_values(nrn_threads + i);\n    }\n    // Consistent with NEURON. 
BEFORE_STEP and fixed_record_continuous before nrn_deliver_events.\n    for (int i = 0; i < nrn_nthread; ++i) {\n        nrn_deliver_events(nrn_threads + i); /* The record events at t=0 */\n    }\n#if NRNMPI\n    nrn_spike_exchange(nrn_threads);\n#endif\n    Instrumentor::phase_end(\"finitialize\");\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/multicore.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <cstdlib>\n#include <vector>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/utils/memory.h\"\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n\n/*\nNow that threads have taken over the actual_v, v_node, etc, it might\nbe a good time to regularize the method of freeing, allocating, and\nupdating those arrays. To recapitulate the history, Node used to\nbe the structure that held direct values for v, area, d, rhs, etc.\nThat continued to hold for the cray vectorization project which\nintroduced v_node, v_parent, memb_list. Cache efficiency introduced\nactual_v, actual_area, actual_d, etc and the Node started pointing\ninto those arrays. Additional nodes after allocation required updating\npointers to v and area since those arrays were freed and reallocated.\nNow, the threads hold all these arrays and we want to update them\nproperly under the circumstances of changing topology, changing\nnumber of threads, and changing distribution of cells on threads.\nNote there are no longer global versions of any of these arrays.\nWe do not want to update merely due to a change in area. Recently\nwe have dealt with diam, area, ri on a section basis. We generally\ndesire an update just before a simulation when the efficient\nstructures are necessary. This is reasonably well handled by the\nv_structure_change flag which historically freed and reallocated\nv_node and v_parent and, just before this comment,\nended up setting the NrnThread tml. 
This makes most of the old\nmemb_list vestigial and we have now gotten rid of it except for\nthe artificial cells (and it is possibly not really necessary there).\nSwitching between sparse and tree matrix just causes freeing and\nreallocation of actual_rhs.\n\nIf we can get the freeing, reallocation, and pointer update correct\nfor _actual_v, I am guessing everything else can be dragged along with\nit. We have two major cases: a call to pc.nthread and a change in\nmodel structure. We want to use Node* as much as possible and defer\nthe handling of v_structure_change as long as possible.\n*/\n\nnamespace coreneuron {\n\nCoreNeuron corenrn;\n\nint nrn_nthread = 0;\nNrnThread* nrn_threads = nullptr;\nvoid (*nrn_mk_transfer_thread_data_)();\n\n/// --> CoreNeuron class\nstatic int table_check_cnt_;\nstatic ThreadDatum* table_check_;\n\n\nNrnThreadMembList* create_tml(NrnThread& nt,\n                              int mech_id,\n                              Memb_func& memb_func,\n                              int& shadow_rhs_cnt,\n                              const std::vector<int>& mech_types,\n                              const std::vector<int>& nodecounts) {\n    auto tml = (NrnThreadMembList*) emalloc_align(sizeof(NrnThreadMembList), 0);\n    tml->next = nullptr;\n    tml->index = mech_types[mech_id];\n\n    tml->ml = (Memb_list*) ecalloc_align(1, sizeof(Memb_list), 0);\n    tml->ml->_net_receive_buffer = nullptr;\n    tml->ml->_net_send_buffer = nullptr;\n    tml->ml->_permute = nullptr;\n    if (memb_func.alloc == nullptr) {\n        hoc_execerror(memb_func.sym, \"mechanism does not exist\");\n    }\n    tml->ml->nodecount = nodecounts[mech_id];\n    if (!memb_func.sym) {\n        printf(\"%s (type %d) is not available\\n\", nrn_get_mechname(tml->index), tml->index);\n        exit(1);\n    }\n    tml->ml->_nodecount_padded = nrn_soa_padded_size(tml->ml->nodecount,\n                                                     corenrn.get_mech_data_layout()[tml->index]);\n    if 
(memb_func.is_point && corenrn.get_is_artificial()[tml->index] == 0) {\n        // Avoid race for multiple PointProcess instances in same compartment.\n        if (tml->ml->nodecount > shadow_rhs_cnt) {\n            shadow_rhs_cnt = tml->ml->nodecount;\n        }\n    }\n\n    if (auto* const priv_ctor = corenrn.get_memb_func(tml->index).private_constructor) {\n        priv_ctor(&nt, tml->ml, tml->index);\n    }\n\n    return tml;\n}\n\nvoid nrn_threads_create(int n) {\n    if (nrn_nthread != n) {\n        /*printf(\"sizeof(NrnThread)=%d   sizeof(Memb_list)=%d\\n\", sizeof(NrnThread),\n         * sizeof(Memb_list));*/\n\n        nrn_threads = nullptr;\n        nrn_nthread = n;\n        if (n > 0) {\n            nrn_threads = new NrnThread[n];\n            for (int i = 0; i < nrn_nthread; ++i) {\n                NrnThread& nt = nrn_threads[i];\n                nt.id = i;\n                for (int j = 0; j < BEFORE_AFTER_SIZE; ++j) {\n                    nt.tbl[j] = nullptr;\n                }\n            }\n        }\n        v_structure_change = 1;\n        diam_changed = 1;\n    }\n    /*printf(\"nrn_threads_create %d %d\\n\", nrn_nthread, nrn_thread_parallel_);*/\n}\n\nvoid nrn_threads_free() {\n    if (nrn_nthread) {\n        delete[] nrn_threads;\n        nrn_threads = nullptr;\n        nrn_nthread = 0;\n    }\n}\n\nvoid nrn_mk_table_check() {\n    if (table_check_) {\n        free((void*) table_check_);\n        table_check_ = nullptr;\n    }\n    auto& memb_func = corenrn.get_memb_funcs();\n    // Allocate int array of size of mechanism types\n    std::vector<int> ix(memb_func.size(), -1);\n    table_check_cnt_ = 0;\n    for (int id = 0; id < nrn_nthread; ++id) {\n        auto& nt = nrn_threads[id];\n        for (auto tml = nt.tml; tml; tml = tml->next) {\n            int index = tml->index;\n            if (memb_func[index].thread_table_check_ && ix[index] == -1) {\n                ix[index] = id;\n                table_check_cnt_ += 2;\n            }\n     
   }\n    }\n    if (table_check_cnt_) {\n        table_check_ = (ThreadDatum*) emalloc(table_check_cnt_ * sizeof(ThreadDatum));\n    }\n    int i = 0;\n    for (int id = 0; id < nrn_nthread; ++id) {\n        auto& nt = nrn_threads[id];\n        for (auto tml = nt.tml; tml; tml = tml->next) {\n            int index = tml->index;\n            if (memb_func[index].thread_table_check_ && ix[index] == id) {\n                table_check_[i++].i = id;\n                table_check_[i++]._pvoid = (void*) tml;\n            }\n        }\n    }\n}\n\nvoid nrn_thread_table_check() {\n    for (int i = 0; i < table_check_cnt_; i += 2) {\n        auto& nt = nrn_threads[table_check_[i].i];\n        auto tml = static_cast<NrnThreadMembList*>(table_check_[i + 1]._pvoid);\n        Memb_list* ml = tml->ml;\n        (*corenrn.get_memb_func(tml->index).thread_table_check_)(\n            0, ml->_nodecount_padded, ml->data, ml->pdata, ml->_thread, &nt, ml, tml->index);\n    }\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/multicore.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/mechanism/membfunc.hpp\"\n#include \"coreneuron/utils/memory.h\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/mpi/core/nrnmpi.hpp\"\n#include \"coreneuron/io/reports/nrnreport.hpp\"\n#include <vector>\n#include <memory>\n\nnamespace coreneuron {\nclass NetCon;\nclass PreSyn;\n\nextern bool use_solve_interleave;\n\n/*\n   Point_process._presyn, used only if its NET_RECEIVE sends a net_event, is\n   eliminated. Needed only by net_event function. Replaced by\n   PreSyn* = nt->presyns + nt->pnt2presyn_ix[pnttype2presyn[pnt->_type]][pnt->_i_instance];\n*/\n\nstruct NrnThreadMembList { /* patterned after CvMembList in cvodeobj.h */\n    NrnThreadMembList* next;\n    Memb_list* ml;\n    int index;\n    int* dependencies; /* list of mechanism types that this mechanism depends on*/\n    int ndependencies; /* for scheduling we need to know the dependency count */\n};\nNrnThreadMembList* create_tml(NrnThread& nt,\n                              int mech_id,\n                              Memb_func& memb_func,\n                              int& shadow_rhs_cnt,\n                              const std::vector<int>& mech_types,\n                              const std::vector<int>& nodecounts);\n\nstruct NrnThreadBAList {\n    Memb_list* ml; /* an item in the NrnThreadMembList */\n    BAMech* bam;\n    NrnThreadBAList* next;\n};\n\nstruct NrnFastImem {\n    double* nrn_sav_rhs;\n    double* nrn_sav_d;\n};\n\nstruct TrajectoryRequests {\n    void** vpr;       /* PlayRecord Objects known by NEURON */\n    double** scatter; /* if bsize == 0, each time step */\n    double** varrays; /* if bsize > 0, the Vector 
data pointers. */\n    double** gather;  /* pointers to values that get scattered to NEURON */\n    int n_pr;         /* number of PlayRecord instances */\n    int n_trajec;     /* number of trajectories requested */\n    int bsize;        /* buffer size of the Vector data */\n    int vsize;        /* number of elements in varrays so far */\n};\n\n/* for OpenACC, in order to avoid an error while update PreSyn, with virtual base\n * class, we are adding helper with flag variable which could be updated on GPU\n */\nstruct PreSynHelper {\n    int flag_;\n};\n\nstruct NrnThread: public MemoryManaged {\n    double _t = 0;\n    double _dt = -1e9;\n    double cj = 0.0;\n\n    NrnThreadMembList* tml = nullptr;\n    Memb_list** _ml_list = nullptr;\n    Point_process* pntprocs = nullptr;  // synapses and artificial cells with and without gid\n    PreSyn* presyns = nullptr;          // all the output PreSyn with and without gid\n    PreSynHelper* presyns_helper = nullptr;\n    int** pnt2presyn_ix = nullptr;  // eliminates Point_process._presyn used only by net_event\n                                    // sender.\n    NetCon* netcons = nullptr;\n    double* weights = nullptr;  // size n_weight. NetCon.weight_ points into this array.\n\n    int n_pntproc = 0;\n    int n_weight = 0;\n    int n_netcon = 0;\n    int n_input_presyn = 0;\n    int n_presyn = 0;       // only for model_size\n    int n_real_output = 0;  // for checking their thresholds.\n\n    int ncell = 0; /* analogous to old rootnodecount */\n    int end = 0;   /* 1 + position of last in v_node array. Now v_node_count. 
*/\n    int id = 0;    /* this is nrn_threads[id] */\n    int _stop_stepping = 0;\n    int n_vecplay = 0; /* number of instances of VecPlayContinuous */\n\n    size_t _ndata = 0;\n    size_t _nvdata = 0;\n    size_t _nidata = 0;        /* sizes */\n    double* _data = nullptr;   /* all the other double* and Datum to doubles point into here*/\n    int* _idata = nullptr;     /* all the Datum to ints index into here */\n    void** _vdata = nullptr;   /* all the Datum to pointers index into here */\n    void** _vecplay = nullptr; /* array of instances of VecPlayContinuous */\n\n    double* _actual_rhs = nullptr;\n    double* _actual_d = nullptr;\n    double* _actual_a = nullptr;\n    double* _actual_b = nullptr;\n    double* _actual_v = nullptr;\n    double* _actual_area = nullptr;\n    double* _actual_diam = nullptr; /* nullptr if no mechanism has dparam with diam semantics */\n    double* _shadow_rhs = nullptr;  /* Not pointer into _data. Avoid race for multiple POINT_PROCESS\n                             in same  compartment */\n    double* _shadow_d = nullptr; /* Not pointer into _data. 
Avoid race for multiple POINT_PROCESS in\n                          same compartment */\n\n    /* Fast membrane current calculation struct */\n    NrnFastImem* nrn_fast_imem = nullptr;\n\n    int* _v_parent_index = nullptr;\n    int* _permute = nullptr;\n    char* _sp13mat = nullptr;              /* handle to general sparse matrix */\n    Memb_list* _ecell_memb_list = nullptr; /* normally nullptr */\n\n    double _ctime = 0.0; /* computation time in seconds (using nrnmpi_wtime) */\n\n    NrnThreadBAList* tbl[BEFORE_AFTER_SIZE]; /* wasteful since almost all empty */\n\n    int shadow_rhs_cnt = 0; /* added to facilitate the NrnThread transfer to GPU */\n    int compute_gpu = 0;    /* define whether to compute with gpus */\n    int stream_id = 0;      /* define where the kernel will be launched on GPU stream */\n    int _net_send_buffer_size = 0;\n    int _net_send_buffer_cnt = 0;\n    int* _net_send_buffer = nullptr;\n\n    int* _watch_types = nullptr; /* nullptr or 0 terminated array of integers */\n    void* mapping = nullptr;     /* section to segment mapping information */\n    std::unique_ptr<SummationReportMapping> summation_report_handler_; /* report to ALU (values of\n                                                                          the current summation */\n    TrajectoryRequests* trajec_requests = nullptr; /* per time step values returned to NEURON */\n\n    /* Needed in case there are FOR_NETCON statements in use. 
*/\n    std::size_t _fornetcon_perm_indices_size{}; /* length of _fornetcon_perm_indices */\n    size_t* _fornetcon_perm_indices{};          /* displacement like list of indices */\n    std::size_t _fornetcon_weight_perm_size{};  /* length of _fornetcon_weight_perm */\n    size_t* _fornetcon_weight_perm{};           /* permutation indices into weight */\n\n    std::vector<int> _pnt_offset; /* for SelfEvent queue transfer */\n};\n\nextern void nrn_threads_create(int n);\nextern int nrn_nthread;\nextern NrnThread* nrn_threads;\ntemplate <typename F, typename... Args>\nvoid nrn_multithread_job(F&& job, Args&&... args) {\n    int i;\n    // clang-format off\n\n    #pragma omp parallel for private(i) shared(nrn_threads, job, nrn_nthread, \\\n                                           nrnmpi_myid) schedule(static, 1)\n    // FIXME: multiple forwarding of the same arguments...\n    for (i = 0; i < nrn_nthread; ++i) {\n        job(nrn_threads + i, std::forward<Args>(args)...);\n    }\n    // clang-format on\n}\n\nextern void nrn_thread_table_check(void);\n\nextern void nrn_threads_free(void);\n\nextern bool _nrn_skip_initmodel;\n\n\nextern void dt2thread(double);\nextern void clear_event_queue(void);\nextern void nrn_ba(NrnThread*, int);\nextern void* nrn_fixed_step_lastpart(NrnThread*);\nextern void nrn_solve_minimal(NrnThread*);\nextern void nrncore2nrn_send_init();\nextern void* setup_tree_matrix_minimal(NrnThread*);\nextern void nrncore2nrn_send_values(NrnThread*);\nextern void nrn_fixed_step_group_minimal(int total_sim_steps);\nextern void nrn_fixed_single_steps_minimal(int total_sim_steps, double tstop);\nextern void nrn_fixed_step_minimal(void);\nextern void nrn_finitialize(int setv, double v);\nextern void direct_mode_initialize();\nextern void nrn_mk_table_check(void);\nextern void nonvint(NrnThread* _nt);\nextern void update(NrnThread*);\n\nconstexpr int at_time(NrnThread* nt, double te) {\n    double x = te - 1e-11;\n    if (x <= nt->_t && x > (nt->_t - 
nt->_dt)) {\n        return 1;\n    }\n    return 0;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/scopmath/abort.cpp",
    "content": "/******************************************************************************\n *\n * File: abort.c\n *\n * Copyright (c) 1984, 1985, 1986, 1987, 1988, 1989, 1990\n *   Duke University\n *\n ******************************************************************************/\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n\n/*-----------------------------------------------------------------------------\n *\n * ABORT_RUN()\n *\n *    Prints out an error message and returns to the main menu if a solver\n *    routine returns a nonzero error code.\n *\n * Calling sequence: abort_run(code)\n *\n * Argument:\tcode\tint\tflag for error\n *\n * Returns:\n *\n * Functions called: abs(), cls(), cursrpos(), puts(), gets()\n *\n * Files accessed:\n *---------------------------------------------------------------------------*/\n\n#include <setjmp.h>\n#include <stdio.h>\n#include \"errcodes.h\"\nnamespace coreneuron {\nint abort_run(int code) {\n    switch ((code >= 0) ? code : -code) {\n        case EXCEED_ITERS:\n            puts(\"Convergence not achieved in maximum number of iterations\");\n            break;\n        case SINGULAR:\n            puts(\"The matrix in the solution method is singular or ill-conditioned\");\n            break;\n        case PRECISION:\n            puts(\n                \"The increment in the independent variable is less than machine \"\n                \"roundoff error\");\n            break;\n        case CORR_FAIL:\n            puts(\"The corrector failed to satisfy the error check\");\n            break;\n        case DIVERGED:\n            puts(\"The corrector iteration diverged\");\n            break;\n        case INCONSISTENT:\n            puts(\"Inconsistent boundary conditions\");\n            puts(\"Convergence not acheived in maximum number of iterations\");\n            break;\n        case BAD_START:\n            puts(\"Poor starting estimate for initial conditions\");\n            puts(\"The matrix in the solution 
method is singular or ill-conditioned\");\n            break;\n        case NODATA:\n            puts(\"No data found in data file\");\n            break;\n        case NO_SOLN:\n            puts(\"No solution was obtained for the coefficients\");\n            break;\n        case LOWMEM:\n            puts(\"Insufficient memory to run the model\");\n            break;\n        case DIVCHECK:\n            puts(\"Attempt to divide by zero\");\n            break;\n        case NOFORCE:\n            puts(\n                \"Could not open forcing function file\\nThe model cannot be run \"\n                \"without the forcing function\");\n            break;\n        case NEG_ARG:\n            puts(\"Cannot compute factorial of negative argument\");\n            break;\n        case RANGE:\n            puts(\n                \"Value of variable is outside the range of the forcing function data \"\n                \"table\");\n            break;\n        default:\n            puts(\"Origin of error is unknown\");\n    }\n    hoc_execerror(\"scopmath library error\", (char*) 0);\n    return 0;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/scopmath/crout_thread.hpp",
    "content": "/*\n# =============================================================================\n# Originally crout.c from SCoP library, Copyright (c) 1987-90 Duke University\n# =============================================================================\n# Subsequent extensive prototype and memory layout changes for CoreNEURON\n#\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#pragma once\n#include \"coreneuron/sim/scopmath/errcodes.h\"\n#include \"coreneuron/sim/scopmath/newton_struct.h\"\n\nnamespace coreneuron {\n#if defined(scopmath_crout_ix) || defined(scopmath_crout_y) || defined(scopmath_crout_b)\n#error \"naming clash on crout_thread.hpp-internal macros\"\n#endif\n#define scopmath_crout_b(arg)  b[scopmath_crout_ix(arg)]\n#define scopmath_crout_ix(arg) ((arg) *_STRIDE)\n#define scopmath_crout_y(arg)  _p[y[arg] * _STRIDE]\n\n/**\n * Performs an LU triangular factorization of a real matrix by the Crout\n * algorithm using partial pivoting. Rows are not normalized; implicit\n * equilibration is used. 
ROUNDOFF is the minimal value for a pivot element\n * without its being considered too close to zero (currently set to 1.0E-20).\n *\n * @return 0 if no error; 2 if matrix is singular or ill-conditioned\n * @param n number of rows of the matrix\n * @param a double precision matrix to be factored\n * @param[out] a factors required to transform the constant vector in the set of\n *               simultaneous equations are stored in the lower triangle;\n *               factors for back substitution are stored in the upper triangle.\n * @param[out] perm permutation vector to store row interchanges\n *\n * @note Having a different permutation per instance may not be a good idea.\n */\ninline int nrn_crout_thread(NewtonSpace* ns, int n, double** a, int* perm, _threadargsproto_) {\n    int save_i = 0;\n\n    /* Initialize permutation and rowmax vectors */\n    double* rowmax = ns->rowmax;\n    for (int i = 0; i < n; i++) {\n        perm[scopmath_crout_ix(i)] = i;\n        int k = 0;\n        for (int j = 1; j < n; j++)\n            if (fabs(a[i][scopmath_crout_ix(j)]) > fabs(a[i][scopmath_crout_ix(k)]))\n                k = j;\n        rowmax[scopmath_crout_ix(i)] = a[i][scopmath_crout_ix(k)];\n    }\n\n    /* Loop over rows and columns r */\n    for (int r = 0; r < n; r++) {\n        /*\n         * Operate on rth column.  
This produces the lower triangular matrix\n         * of terms needed to transform the constant vector.\n         */\n\n        for (int i = r; i < n; i++) {\n            double sum = 0.0;\n            int irow = perm[scopmath_crout_ix(i)];\n            for (int k = 0; k < r; k++) {\n                int krow = perm[scopmath_crout_ix(k)];\n                sum += a[irow][scopmath_crout_ix(k)] * a[krow][scopmath_crout_ix(r)];\n            }\n            a[irow][scopmath_crout_ix(r)] -= sum;\n        }\n\n        /* Find row containing the pivot in the rth column */\n        int pivot = perm[scopmath_crout_ix(r)];\n        double equil_1 = fabs(a[pivot][scopmath_crout_ix(r)] / rowmax[scopmath_crout_ix(pivot)]);\n        for (int i = r + 1; i < n; i++) {\n            int irow = perm[scopmath_crout_ix(i)];\n            double equil_2 = fabs(a[irow][scopmath_crout_ix(r)] / rowmax[scopmath_crout_ix(irow)]);\n            if (equil_2 > equil_1) {\n                /* make irow the new pivot row */\n\n                pivot = irow;\n                save_i = i;\n                equil_1 = equil_2;\n            }\n        }\n\n        /* Interchange entries in permutation vector if necessary */\n        if (pivot != perm[scopmath_crout_ix(r)]) {\n            perm[scopmath_crout_ix(save_i)] = perm[scopmath_crout_ix(r)];\n            perm[scopmath_crout_ix(r)] = pivot;\n        }\n\n        /* Check that pivot element is not too small */\n        if (fabs(a[pivot][scopmath_crout_ix(r)]) < ROUNDOFF)\n            return SINGULAR;\n\n        /*\n         * Operate on row in rth position.  
This produces the upper\n         * triangular matrix whose diagonal elements are assumed to be unity.\n         * This matrix is used in the back substitution algorithm.\n         */\n        for (int j = r + 1; j < n; j++) {\n            double sum = 0.0;\n            for (int k = 0; k < r; k++) {\n                int krow = perm[scopmath_crout_ix(k)];\n                sum += a[pivot][scopmath_crout_ix(k)] * a[krow][scopmath_crout_ix(j)];\n            }\n            a[pivot][scopmath_crout_ix(j)] = (a[pivot][scopmath_crout_ix(j)] - sum) /\n                                             a[pivot][scopmath_crout_ix(r)];\n        }\n    }\n    return SUCCESS;\n}\n\n/**\n * Performs forward substitution algorithm to transform the constant vector in\n * the linear simultaneous equations to be consistent with the factored matrix.\n * Then performs back substitution to find the solution to the simultaneous\n * linear equations.\n *\n * @param n number of rows of the matrix\n * @param a double precision matrix containing the factored matrix of\n *          coefficients of the linear equations\n * @param b vector of function values\n * @param perm permutation vector to store row interchanges\n * @param[out] p[y[i]] contains the solution vector\n */\ninline void nrn_scopmath_solve_thread(int n,\n                                      double** a,\n                                      double* b,\n                                      int* perm,\n                                      double* p,\n                                      int* y,\n                                      _threadargsproto_) {\n    /* Perform forward substitution with pivoting */\n    // if (y) { // pgacc bug. 
nullptr on cpu but not on GPU\n    if (0) {\n        for (int i = 0; i < n; i++) {\n            int pivot = perm[scopmath_crout_ix(i)];\n            double sum = 0.0;\n            for (int j = 0; j < i; j++)\n                sum += a[pivot][scopmath_crout_ix(j)] * (scopmath_crout_y(j));\n            scopmath_crout_y(i) = (scopmath_crout_b(pivot) - sum) / a[pivot][scopmath_crout_ix(i)];\n        }\n\n        /*\n         * Note that the y vector is already in the correct order for back\n         * substitution.  Perform back substitution, pivoting the matrix but not\n         * the y vector.  There is no need to divide by the diagonal element as\n         * this is assumed to be unity.\n         */\n\n        for (int i = n - 1; i >= 0; i--) {\n            int pivot = perm[scopmath_crout_ix(i)];\n            double sum = 0.0;\n            for (int j = i + 1; j < n; j++)\n                sum += a[pivot][scopmath_crout_ix(j)] * (scopmath_crout_y(j));\n            scopmath_crout_y(i) -= sum;\n        }\n    } else {\n        for (int i = 0; i < n; i++) {\n            int pivot = perm[scopmath_crout_ix(i)];\n            double sum = 0.0;\n            if (i > 0) {  // pgacc bug. with i==0 the following loop executes once\n                for (int j = 0; j < i; j++) {\n                    sum += a[pivot][scopmath_crout_ix(j)] * (p[scopmath_crout_ix(j)]);\n                }\n            }\n            p[scopmath_crout_ix(i)] = (scopmath_crout_b(pivot) - sum) /\n                                      a[pivot][scopmath_crout_ix(i)];\n        }\n\n        /*\n         * Note that the y vector is already in the correct order for back\n         * substitution.  Perform back substitution, pivoting the matrix but not\n         * the y vector.  
There is no need to divide by the diagonal element as\n         * this is assumed to be unity.\n         */\n        for (int i = n - 1; i >= 0; i--) {\n            int pivot = perm[scopmath_crout_ix(i)];\n            double sum = 0.0;\n            for (int j = i + 1; j < n; j++)\n                sum += a[pivot][scopmath_crout_ix(j)] * (p[scopmath_crout_ix(j)]);\n            p[scopmath_crout_ix(i)] -= sum;\n        }\n    }\n}\n#undef scopmath_crout_b\n#undef scopmath_crout_ix\n#undef scopmath_crout_y\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/scopmath/errcodes.h",
    "content": "/*\n# =============================================================================\n# Originally errcodes.h from SCoP library, Copyright (c) 1984-90 Duke University\n# =============================================================================\n# Subsequent extensive prototype and memory layout changes for CoreNEURON\n#\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#pragma once\nnamespace coreneuron {\nextern int abort_run(int);\nnamespace scopmath {\n/** @brief Flag to disable some code sections at compile time.\n *\n *  Some methods, such as coreneuron::scopmath::sparse::getelm(...), decide at\n *  runtime whether they are simply accessors, or if they dynamically modify the\n *  matrix in question, possibly allocating new memory. Typically the second\n *  mode will be used during model initialisation, while the first will be used\n *  during computation/simulation. Compiling the more complicated code for the\n *  second mode can be problematic for targets such as GPU, where dynamic\n *  allocation and global state are complex. 
This enum is intended to be used as\n *  a template parameter to flag (at compile time) when this code can be\n *  omitted.\n */\nenum struct enabled_code { all, compute_only };\n}  // namespace scopmath\n}  // namespace coreneuron\n#define ROUNDOFF       1.e-20\n#define ZERO           1.e-8\n#define STEP           1.e-6\n#define CONVERGE       1.e-6\n#define MAXCHANGE      0.05\n#define INITSIMPLEX    0.25\n#define MAXITERS       50\n#define MAXSMPLXITERS  100\n#define MAXSTEPS       20\n#define MAXHALVE       15\n#define MAXORDER       6\n#define MAXTERMS       3\n#define MAXFAIL        10\n#define MAX_JAC_ITERS  20\n#define MAX_GOOD_ORDER 2\n#define MAX_GOOD_STEPS 3\n\n#define SUCCESS      0\n#define EXCEED_ITERS 1\n#define SINGULAR     2\n#define PRECISION    3\n#define CORR_FAIL    4\n#define INCONSISTENT 5\n#define BAD_START    6\n#define NODATA       7\n#define NO_SOLN      8\n#define LOWMEM       9\n#define DIVCHECK     10\n#define NOFORCE      11\n#define DIVERGED     12\n#define NEG_ARG      13\n#define RANGE        14\n"
  },
  {
    "path": "coreneuron/sim/scopmath/newton_struct.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n#pragma once\n#include \"coreneuron/mechanism/mech/mod2c_core_thread.hpp\"\n\nnamespace coreneuron {\n\n/* avoid incessant alloc/free memory */\nstruct NewtonSpace {\n    int n;\n    int n_instance;\n    double* delta_x;\n    double** jacobian;\n    int* perm;\n    double* high_value;\n    double* low_value;\n    double* rowmax;\n};\n\nvoid nrn_newtonspace_copyto_device(NewtonSpace* ns);\nvoid nrn_newtonspace_delete_from_device(NewtonSpace* ns);\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/scopmath/newton_thread.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n#include <math.h>\n#include <stdlib.h>\n\n#include \"coreneuron/sim/scopmath/newton_thread.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n\nnamespace coreneuron {\nNewtonSpace* nrn_cons_newtonspace(int n, int n_instance) {\n    NewtonSpace* ns = (NewtonSpace*) emalloc(sizeof(NewtonSpace));\n    ns->n = n;\n    ns->n_instance = n_instance;\n    ns->delta_x = makevector(n * n_instance * sizeof(double));\n    ns->jacobian = makematrix(n, n * n_instance);\n    ns->perm = (int*) emalloc((unsigned) (n * n_instance * sizeof(int)));\n    ns->high_value = makevector(n * n_instance * sizeof(double));\n    ns->low_value = makevector(n * n_instance * sizeof(double));\n    ns->rowmax = makevector(n * n_instance * sizeof(double));\n    nrn_newtonspace_copyto_device(ns);\n    return ns;\n}\n\nvoid nrn_destroy_newtonspace(NewtonSpace* ns) {\n    nrn_newtonspace_delete_from_device(ns);\n    free((char*) ns->perm);\n    freevector(ns->delta_x);\n    freematrix(ns->jacobian);\n    freevector(ns->high_value);\n    freevector(ns->low_value);\n    freevector(ns->rowmax);\n    free((char*) ns);\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/scopmath/newton_thread.hpp",
    "content": "/*\n# =============================================================================\n# Originally newton.c from SCoP library, Copyright (c) 1987-90 Duke University\n# =============================================================================\n# Subsequent extensive prototype and memory layout changes for CoreNEURON\n#\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#pragma once\n#include \"coreneuron/sim/scopmath/errcodes.h\"\n#include \"coreneuron/sim/scopmath/newton_struct.h\"\n#include \"coreneuron/sim/scopmath/crout_thread.hpp\"\n\n#include <algorithm>\n#include <cmath>\n\nnamespace coreneuron {\n#if defined(scopmath_newton_ix) || defined(scopmath_newton_s) || defined(scopmath_newton_x)\n#error \"naming clash on newton_thread.hpp-internal macros\"\n#endif\n#define scopmath_newton_ix(arg) ((arg) *_STRIDE)\n#define scopmath_newton_s(arg)  _p[s[arg] * _STRIDE]\n#define scopmath_newton_x(arg)  _p[(arg) *_STRIDE]\nnamespace detail {\n/**\n * @brief Calculate the Jacobian matrix using finite central differences.\n *\n * Creates the Jacobian matrix by computing partial derivatives by finite\n * central differences. If the column variable is nonzero, an increment of 2% of\n * the variable is used. 
STEP is the minimum increment allowed; it is currently\n * set to 1.0E-6.\n *\n * @param n number of variables\n * @param x pointer to array of addresses of the solution vector elements\n * @param p array of parameter values\n * @param func callable that computes the deviation from zero of each equation\n *             in the model\n * @param value pointer to array of addresses of function values\n * @param[out] jacobian computed jacobian matrix\n */\ntemplate <typename F>\nvoid nrn_buildjacobian_thread(NewtonSpace* ns,\n                              int n,\n                              int* index,\n                              F const& func,\n                              double* value,\n                              double** jacobian,\n                              _threadargsproto_) {\n    double* high_value = ns->high_value;\n    double* low_value = ns->low_value;\n\n    /* Compute partial derivatives by central finite differences */\n\n    for (int j = 0; j < n; j++) {\n        double increment = std::max(std::fabs(0.02 * (scopmath_newton_x(index[j]))), STEP);\n        scopmath_newton_x(index[j]) += increment;\n        func(_threadargs_);  // std::invoke in C++17\n        for (int i = 0; i < n; i++)\n            high_value[scopmath_newton_ix(i)] = value[scopmath_newton_ix(i)];\n        scopmath_newton_x(index[j]) -= 2.0 * increment;\n        func(_threadargs_);  // std::invoke in C++17\n        for (int i = 0; i < n; i++) {\n            low_value[scopmath_newton_ix(i)] = value[scopmath_newton_ix(i)];\n\n            /* Insert partials into jth column of Jacobian matrix */\n\n            jacobian[i][scopmath_newton_ix(j)] = (high_value[scopmath_newton_ix(i)] -\n                                                  low_value[scopmath_newton_ix(i)]) /\n                                                 (2.0 * increment);\n        }\n\n        /* Restore original variable and function values. 
*/\n\n        scopmath_newton_x(index[j]) += increment;\n        func(_threadargs_);  // std::invoke in C++17\n    }\n}\n#undef scopmath_newton_x\n}  // namespace detail\n\n/**\n * Iteratively solves simultaneous nonlinear equations by Newton's method, using\n * a Jacobian matrix computed by finite differences.\n *\n * @return 0 if no error; 2 if matrix is singular or ill-conditioned; 1 if\n *         maximum iterations exceeded.\n * @param n number of variables to solve for\n * @param x pointer to array of the solution vector elements possibly indexed by\n *          index\n * @param p array of parameter values\n * @param func callable that computes the deviation from zero of each equation\n *             in the model\n * @param value pointer to array of the function values\n * @param[out] x contains the solution value or the most recent iteration's\n *               result in the event of an error.\n */\ntemplate <typename F>\ninline int nrn_newton_thread(NewtonSpace* ns,\n                             int n,\n                             int* s,\n                             F func,\n                             double* value,\n                             _threadargsproto_) {\n    int count = 0, error = 0;\n    double change = 1.0, max_dev, temp;\n    int done = 0;\n    /*\n     * Create arrays for Jacobian, variable increments, function values, and\n     * permutation vector\n     */\n    double* delta_x = ns->delta_x;\n    double** jacobian = ns->jacobian;\n    int* perm = ns->perm;\n    /* Iteration loop */\n    while (!done) {\n        if (count++ >= MAXITERS) {\n            error = EXCEED_ITERS;\n            done = 2;\n        }\n        if (!done && change > MAXCHANGE) {\n            /*\n             * Recalculate Jacobian matrix if solution has changed by more\n             * than MAXCHANGE\n             */\n            detail::nrn_buildjacobian_thread(ns, n, s, func, value, jacobian, _threadargs_);\n            for (int i = 0; i < n; i++)\n      
          value[scopmath_newton_ix(i)] = -value[scopmath_newton_ix(i)]; /* Required correction\n                                                                               * to\n                                                                               * function values */\n            error = nrn_crout_thread(ns, n, jacobian, perm, _threadargs_);\n            if (error != SUCCESS) {\n                done = 2;\n            }\n        }\n\n        if (!done) {\n            nrn_scopmath_solve_thread(n, jacobian, value, perm, delta_x, (int*) 0, _threadargs_);\n\n            /* Update solution vector and compute norms of delta_x and value */\n\n            change = 0.0;\n            if (s) {\n                for (int i = 0; i < n; i++) {\n                    if (std::fabs(scopmath_newton_s(i)) > ZERO &&\n                        (temp = std::fabs(delta_x[scopmath_newton_ix(i)] /\n                                          (scopmath_newton_s(i)))) > change)\n                        change = temp;\n                    scopmath_newton_s(i) += delta_x[scopmath_newton_ix(i)];\n                }\n            } else {\n                for (int i = 0; i < n; i++) {\n                    if (std::fabs(scopmath_newton_s(i)) > ZERO &&\n                        (temp = std::fabs(delta_x[scopmath_newton_ix(i)] /\n                                          (scopmath_newton_s(i)))) > change)\n                        change = temp;\n                    scopmath_newton_s(i) += delta_x[scopmath_newton_ix(i)];\n                }\n            }\n            // Evaluate function values with new solution.\n            func(_threadargs_);  // std::invoke in C++17\n            max_dev = 0.0;\n            for (int i = 0; i < n; i++) {\n                value[scopmath_newton_ix(i)] = -value[scopmath_newton_ix(i)]; /* Required correction\n                                                                               * to function\n                                                                 
              * values */\n                if ((temp = std::fabs(value[scopmath_newton_ix(i)])) > max_dev)\n                    max_dev = temp;\n            }\n\n            /* Check for convergence or maximum iterations */\n\n            if (change <= CONVERGE && max_dev <= ZERO) {\n                // break;\n                done = 1;\n            }\n        }\n    } /* end of while loop */\n\n    return (error);\n}\n#undef scopmath_newton_ix\n#undef scopmath_newton_s\n\nNewtonSpace* nrn_cons_newtonspace(int n, int n_instance);\nvoid nrn_destroy_newtonspace(NewtonSpace* ns);\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/scopmath/sparse_thread.hpp",
    "content": "/*\n# =============================================================================\n# Originally sparse.c from SCoP library, Copyright (c) 1989-90 Duke University\n# =============================================================================\n# Subsequent extensive prototype and memory layout changes for CoreNEURON\n#\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#pragma once\n#include \"coreneuron/mechanism/mech/mod2c_core_thread.hpp\"\n#include \"coreneuron/sim/scopmath/errcodes.h\"\n\nnamespace coreneuron {\nnamespace scopmath {\nnamespace sparse {\n// Methods that may be called from offloaded regions are declared inline.\ninline void delete_item(Item* item) {\n    item->next->prev = item->prev;\n    item->prev->next = item->next;\n    item->prev = nullptr;\n    item->next = nullptr;\n}\n\n/*link ii before item*/\ninline void linkitem(Item* item, Item* ii) {\n    ii->prev = item->prev;\n    ii->next = item;\n    item->prev = ii;\n    ii->prev->next = ii;\n}\n\ninline void insert(SparseObj* so, Item* item) {\n    Item* ii{};\n    for (ii = so->orderlist->next; ii != so->orderlist; ii = ii->next) {\n        if (ii->norder >= item->norder) {\n            break;\n        }\n    }\n    linkitem(ii, item);\n}\n\n/* note: solution order refers to the following\n        diag[varord[row]]->row = row = diag[varord[row]]->col\n        rowst[varord[row]]->row = row\n        varord[el->row] < varord[el->c_right->row]\n        varord[el->col] < varord[el->r_down->col]\n*/\ninline void increase_order(SparseObj* so, unsigned row) {\n    /* order of row increases by 1. Maintain the orderlist. 
*/\n    if (!so->do_flag)\n        return;\n    Item* order = so->roworder[row];\n    delete_item(order);\n    order->norder++;\n    insert(so, order);\n}\n\n/**\n * Return pointer to (row, col) element maintaining order in rows.\n *\n * See check_assert in minorder for info about how this matrix is supposed to\n * look. If new_elem is nonzero and an element would otherwise be created, new\n * is used instead. This is because linking an element is highly nontrivial. The\n * biggest difference is that elements are no longer removed and this saves much\n * time allocating and freeing during the solve phase.\n */\ntemplate <enabled_code code_to_enable = enabled_code::all>\nElm* getelm(SparseObj* so, unsigned row, unsigned col, Elm* new_elem) {\n    Elm *el, *elnext;\n\n    unsigned vrow = so->varord[row];\n    unsigned vcol = so->varord[col];\n\n    if (vrow == vcol) {\n        return so->diag[vrow]; /* a common case */\n    }\n    if (vrow > vcol) { /* in the lower triangle */\n        /* search downward from diag[vcol] */\n        for (el = so->diag[vcol];; el = elnext) {\n            elnext = el->r_down;\n            if (!elnext) {\n                break;\n            } else if (elnext->row == row) { /* found it */\n                return elnext;\n            } else if (so->varord[elnext->row] > vrow) {\n                break;\n            }\n        }\n        /* insert below el */\n        if (!new_elem) {\n            if constexpr (code_to_enable == enabled_code::compute_only) {\n                // Dynamic allocation should not happen during the compute phase.\n                assert(false);\n            } else {\n                new_elem = new Elm{};\n                new_elem->value = new double[so->_cntml_padded];\n                increase_order(so, row);\n            }\n        }\n        new_elem->r_down = el->r_down;\n        el->r_down = new_elem;\n        new_elem->r_up = el;\n        if (new_elem->r_down) {\n            new_elem->r_down->r_up = 
new_elem;\n        }\n        /* search leftward from diag[vrow] */\n        for (el = so->diag[vrow];; el = elnext) {\n            elnext = el->c_left;\n            if (!elnext) {\n                break;\n            } else if (so->varord[elnext->col] < vcol) {\n                break;\n            }\n        }\n        /* insert to left of el */\n        new_elem->c_left = el->c_left;\n        el->c_left = new_elem;\n        new_elem->c_right = el;\n        if (new_elem->c_left) {\n            new_elem->c_left->c_right = new_elem;\n        } else {\n            so->rowst[vrow] = new_elem;\n        }\n    } else { /* in the upper triangle */\n        /* search upward from diag[vcol] */\n        for (el = so->diag[vcol];; el = elnext) {\n            elnext = el->r_up;\n            if (!elnext) {\n                break;\n            } else if (elnext->row == row) { /* found it */\n                return elnext;\n            } else if (so->varord[elnext->row] < vrow) {\n                break;\n            }\n        }\n        /* insert above el */\n        if (!new_elem) {\n            if constexpr (code_to_enable == enabled_code::compute_only) {\n                assert(false);\n            } else {\n                new_elem = new Elm{};\n                new_elem->value = new double[so->_cntml_padded];\n                increase_order(so, row);\n            }\n        }\n        new_elem->r_up = el->r_up;\n        el->r_up = new_elem;\n        new_elem->r_down = el;\n        if (new_elem->r_up) {\n            new_elem->r_up->r_down = new_elem;\n        }\n        /* search right from diag[vrow] */\n        for (el = so->diag[vrow];; el = elnext) {\n            elnext = el->c_right;\n            if (!elnext) {\n                break;\n            } else if (so->varord[elnext->col] > vcol) {\n                break;\n            }\n        }\n        /* insert to right of el */\n        new_elem->c_right = el->c_right;\n        el->c_right = new_elem;\n        
new_elem->c_left = el;\n        if (new_elem->c_right) {\n            new_elem->c_right->c_left = new_elem;\n        }\n    }\n    new_elem->row = row;\n    new_elem->col = col;\n    return new_elem;\n}\n\n/**\n * The following routines support the concept of a list. Modified from modl. The\n * list is a doubly linked list. A special item with element 0 is always at the\n * tail of the list and is denoted as the List pointer itself. list->next\n * points to the first item in the list and list->prev points to the last item\n * in the list, i.e. the list is circular. Note that in an empty list next and\n * prev point to the list itself.\n *\n * It is intended that this implementation be hidden from the user via the\n * following function calls.\n */\ninline List* newlist() {\n    auto* ii = new Item{};\n    ii->prev = ii;\n    ii->next = ii;\n    return ii;\n}\n\n/*free the list but not the elements*/\ninline void freelist(List* list) {\n    Item* i2;\n    for (Item* i1 = list->next; i1 != list; i1 = i2) {\n        i2 = i1->next;\n        delete i1;\n    }\n    delete list;\n}\n\ninline void check_assert(SparseObj* so) {\n    /* check that all links are consistent */\n    for (unsigned i = 1; i <= so->neqn; i++) {\n        assert(so->diag[i]);\n        assert(so->diag[i]->row == so->diag[i]->col);\n        assert(so->varord[so->diag[i]->row] == i);\n        assert(so->rowst[i]->row == so->diag[i]->row);\n        for (Elm* el = so->rowst[i]; el; el = el->c_right) {\n            if (el == so->rowst[i]) {\n                assert(el->c_left == nullptr);\n            } else {\n                assert(el->c_left->c_right == el);\n                assert(so->varord[el->c_left->col] < so->varord[el->col]);\n            }\n        }\n        for (Elm* el = so->diag[i]->r_down; el; el = el->r_down) {\n            assert(el->r_up->r_down == el);\n            assert(so->varord[el->r_up->row] < so->varord[el->row]);\n        }\n        for (Elm* el = so->diag[i]->r_up; el; el = el->r_up) 
{\n            assert(el->r_down->r_up == el);\n            assert(so->varord[el->r_down->row] > so->varord[el->row]);\n        }\n    }\n}\n\n/* at this point row links are out of order for diag[i]->col\n   and col links are out of order for diag[i]->row */\ninline void re_link(SparseObj* so, unsigned i) {\n    for (Elm* el = so->rowst[i]; el; el = el->c_right) {\n        /* repair hole */\n        if (el->r_up)\n            el->r_up->r_down = el->r_down;\n        if (el->r_down)\n            el->r_down->r_up = el->r_up;\n    }\n\n    for (Elm* el = so->diag[i]->r_down; el; el = el->r_down) {\n        /* repair hole */\n        if (el->c_right)\n            el->c_right->c_left = el->c_left;\n        if (el->c_left)\n            el->c_left->c_right = el->c_right;\n        else\n            so->rowst[so->varord[el->row]] = el->c_right;\n    }\n\n    for (Elm* el = so->diag[i]->r_up; el; el = el->r_up) {\n        /* repair hole */\n        if (el->c_right)\n            el->c_right->c_left = el->c_left;\n        if (el->c_left)\n            el->c_left->c_right = el->c_right;\n        else\n            so->rowst[so->varord[el->row]] = el->c_right;\n    }\n\n    /* matrix is consistent except that diagonal row elements are unlinked from\n    their columns and the diagonal column elements are unlinked from their\n    rows.\n    For simplicity discard all knowledge of links and use getelm to relink\n    */\n    Elm *dright, *dleft, *dup, *ddown, *elnext;\n\n    so->rowst[i] = so->diag[i];\n    dright = so->diag[i]->c_right;\n    dleft = so->diag[i]->c_left;\n    dup = so->diag[i]->r_up;\n    ddown = so->diag[i]->r_down;\n    so->diag[i]->c_right = so->diag[i]->c_left = nullptr;\n    so->diag[i]->r_up = so->diag[i]->r_down = nullptr;\n    for (Elm* el = dright; el; el = elnext) {\n        elnext = el->c_right;\n        getelm(so, el->row, el->col, el);\n    }\n    for (Elm* el = dleft; el; el = elnext) {\n        elnext = el->c_left;\n        getelm(so, el->row, el->col, 
el);\n    }\n    for (Elm* el = dup; el; el = elnext) {\n        elnext = el->r_up;\n        getelm(so, el->row, el->col, el);\n    }\n    for (Elm* el = ddown; el; el = elnext) {\n        elnext = el->r_down;\n        getelm(so, el->row, el->col, el);\n    }\n}\n\ninline void free_elm(SparseObj* so) {\n    /* free all elements */\n    for (unsigned i = 1; i <= so->neqn; i++) {\n        so->rowst[i] = nullptr;\n        so->diag[i] = nullptr;\n    }\n}\n\ninline void init_minorder(SparseObj* so) {\n    /* matrix has been set up. Construct the orderlist and orderfind\n       vector.\n    */\n\n    so->do_flag = 1;\n    if (so->roworder) {\n        for (unsigned i = 1; i <= so->nroworder; ++i) {\n            delete so->roworder[i];\n        }\n        delete[] so->roworder;\n    }\n    so->roworder = new Item* [so->neqn + 1] {};\n    so->nroworder = so->neqn;\n    if (so->orderlist) {\n        freelist(so->orderlist);\n    }\n    so->orderlist = newlist();\n    for (unsigned i = 1; i <= so->neqn; i++) {\n        so->roworder[i] = new Item{};\n    }\n    for (unsigned i = 1; i <= so->neqn; i++) {\n        unsigned j = 0;\n        for (auto el = so->rowst[i]; el; el = el->c_right) {\n            j++;\n        }\n        so->roworder[so->diag[i]->row]->elm = so->diag[i];\n        so->roworder[so->diag[i]->row]->norder = j;\n        insert(so, so->roworder[so->diag[i]->row]);\n    }\n}\n\ninline void reduce_order(SparseObj* so, unsigned row) {\n    /* order of row decreases by 1. Maintain the orderlist. */\n\n    if (!so->do_flag)\n        return;\n    Item* order = so->roworder[row];\n    delete_item(order);\n    order->norder--;\n    insert(so, order);\n}\n\ninline void get_next_pivot(SparseObj* so, unsigned i) {\n    /* get varord[i], etc. from the head of the orderlist. 
*/\n    Item* order = so->orderlist->next;\n    assert(order != so->orderlist);\n\n    unsigned j;\n    if ((j = so->varord[order->elm->row]) != i) {\n        /* push order lists down by 1 and put new diag in empty slot */\n        assert(j > i);\n        Elm* el = so->rowst[j];\n        for (; j > i; j--) {\n            so->diag[j] = so->diag[j - 1];\n            so->rowst[j] = so->rowst[j - 1];\n            so->varord[so->diag[j]->row] = j;\n        }\n        so->diag[i] = order->elm;\n        so->rowst[i] = el;\n        so->varord[so->diag[i]->row] = i;\n        /* at this point row links are out of order for diag[i]->col\n           and col links are out of order for diag[i]->row */\n        re_link(so, i);\n    }\n\n    /* now make sure all needed elements exist */\n    for (Elm* el = so->diag[i]->r_down; el; el = el->r_down) {\n        for (Elm* pivot = so->diag[i]->c_right; pivot; pivot = pivot->c_right) {\n            getelm(so, el->row, pivot->col, nullptr);\n        }\n        reduce_order(so, el->row);\n    }\n    delete_item(order);\n}\n\n/* reallocate space for matrix */\ninline void initeqn(SparseObj* so, unsigned maxeqn) {\n    if (maxeqn == so->neqn)\n        return;\n    free_elm(so);\n    so->neqn = maxeqn;\n    delete[] so->rowst;\n    delete[] so->diag;\n    delete[] so->varord;\n    delete[] so->rhs;\n    delete[] so->ngetcall;\n    so->elmpool = nullptr;\n    so->rowst = new Elm*[maxeqn + 1];\n    so->diag = new Elm*[maxeqn + 1];\n    so->varord = new unsigned[maxeqn + 1];\n    so->rhs = new double[(maxeqn + 1) * so->_cntml_padded];\n    so->ngetcall = new unsigned[so->_cntml_padded];\n    for (unsigned i = 1; i <= maxeqn; i++) {\n        so->varord[i] = i;\n        so->diag[i] = new Elm{};\n        so->diag[i]->value = new double[so->_cntml_padded];\n        so->rowst[i] = so->diag[i];\n        so->diag[i]->row = i;\n        so->diag[i]->col = i;\n        so->diag[i]->r_down = so->diag[i]->r_up = nullptr;\n        so->diag[i]->c_right = 
so->diag[i]->c_left = nullptr;\n    }\n    unsigned nn = so->neqn * so->_cntml_padded;\n    for (unsigned i = 0; i < nn; ++i) {\n        so->rhs[i] = 0.;\n    }\n}\n\n/**\n * Minimum ordering algorithm to determine the order that the matrix should be\n * solved. Also make sure all needed elements are present. This does not mess up\n * the matrix.\n */\ninline void spar_minorder(SparseObj* so) {\n    check_assert(so);\n    init_minorder(so);\n    for (unsigned i = 1; i <= so->neqn; i++) {\n        get_next_pivot(so, i);\n    }\n    so->do_flag = 0;\n    check_assert(so);\n}\n\ninline void init_coef_list(SparseObj* so, int _iml) {\n    so->ngetcall[_iml] = 0;\n    for (unsigned i = 1; i <= so->neqn; i++) {\n        for (Elm* el = so->rowst[i]; el; el = el->c_right) {\n            el->value[_iml] = 0.;\n        }\n    }\n}\n\n#if defined(scopmath_sparse_d) || defined(scopmath_sparse_ix) || defined(scopmath_sparse_s) || \\\n    defined(scopmath_sparse_x)\n#error \"naming clash on sparse_thread.hpp-internal macros\"\n#endif\n#define scopmath_sparse_ix(arg) ((arg) *_STRIDE)\ninline void subrow(SparseObj* so, Elm* pivot, Elm* rowsub, int _iml) {\n    unsigned int const _cntml_padded{so->_cntml_padded};\n    double const r{rowsub->value[_iml] / pivot->value[_iml]};\n    so->rhs[scopmath_sparse_ix(rowsub->row)] -= so->rhs[scopmath_sparse_ix(pivot->row)] * r;\n    so->numop++;\n    for (auto el = pivot->c_right; el; el = el->c_right) {\n        for (rowsub = rowsub->c_right; rowsub->col != el->col; rowsub = rowsub->c_right) {\n        }\n        rowsub->value[_iml] -= el->value[_iml] * r;\n        so->numop++;\n    }\n}\n\ninline void bksub(SparseObj* so, int _iml) {\n    int _cntml_padded = so->_cntml_padded;\n    for (unsigned i = so->neqn; i >= 1; i--) {\n        for (Elm* el = so->diag[i]->c_right; el; el = el->c_right) {\n            so->rhs[scopmath_sparse_ix(el->row)] -= el->value[_iml] *\n                                                    
so->rhs[scopmath_sparse_ix(el->col)];\n            so->numop++;\n        }\n        so->rhs[scopmath_sparse_ix(so->diag[i]->row)] /= so->diag[i]->value[_iml];\n        so->numop++;\n    }\n}\n\ninline int matsol(SparseObj* so, int _iml) {\n    /* Upper triangularization */\n    so->numop = 0;\n    for (unsigned i = 1; i <= so->neqn; i++) {\n        Elm* pivot{so->diag[i]};\n        if (fabs(pivot->value[_iml]) <= ROUNDOFF) {\n            return SINGULAR;\n        }\n        // Eliminate all elements in pivot column. The OpenACC annotation here\n        // is to avoid problems with nvc++'s automatic parallelisation; see:\n        // https://forums.developer.nvidia.com/t/device-kernel-hangs-at-o-and-above/212733\n        nrn_pragma_acc(loop seq)\n        for (auto el = pivot->r_down; el; el = el->r_down) {\n            subrow(so, pivot, el, _iml);\n        }\n    }\n    bksub(so, _iml);\n    return SUCCESS;\n}\n\ntemplate <typename SPFUN>\nvoid create_coef_list(SparseObj* so, int n, SPFUN fun, _threadargsproto_) {\n    initeqn(so, (unsigned) n);\n    so->phase = 1;\n    so->ngetcall[0] = 0;\n    fun(so, so->rhs, _threadargs_);  // std::invoke in C++17\n    if (so->coef_list) {\n        // coef_list is allocated with new[] below, so release it with delete[]\n        delete[] so->coef_list;\n    }\n    so->coef_list_size = so->ngetcall[0];\n    so->coef_list = new double*[so->coef_list_size];\n    spar_minorder(so);\n    so->phase = 2;\n    so->ngetcall[0] = 0;\n    fun(so, so->rhs, _threadargs_);  // std::invoke in C++17\n    so->phase = 0;\n}\n\ntemplate <enabled_code code_to_enable = enabled_code::all>\ndouble* thread_getelm(SparseObj* so, int row, int col, int _iml) {\n    if (!so->phase) {\n        return so->coef_list[so->ngetcall[_iml]++];\n    }\n    Elm* el = scopmath::sparse::getelm<code_to_enable>(so, (unsigned) row, (unsigned) col, nullptr);\n    if (so->phase == 1) {\n        so->ngetcall[_iml]++;\n    } else {\n        so->coef_list[so->ngetcall[_iml]++] = el->value;\n    }\n    return el->value;\n}\n}  // namespace sparse\n}  // 
namespace scopmath\n\n// Methods that may be called from translated MOD files are kept outside the\n// scopmath::sparse namespace.\n#define scopmath_sparse_s(arg) _p[scopmath_sparse_ix(s[arg])]\n#define scopmath_sparse_d(arg) _p[scopmath_sparse_ix(d[arg])]\n\n/**\n * sparse matrix dynamic allocation: create_coef_list makes a list for fast\n * setup, does minimum ordering and ensures all elements needed are present.\n * This could easily be made recursive but it isn't right now.\n */\ntemplate <typename SPFUN>\nvoid* nrn_cons_sparseobj(SPFUN fun, int n, Memb_list* ml, _threadargsproto_) {\n    // fill in the unset _threadargsproto_ assuming _iml = 0;\n    _iml = 0; /* from _threadargsproto_ */\n    _p = ml->data;\n    _ppvar = ml->pdata;\n    _v = _nt->_actual_v[ml->nodeindices[_iml]];\n    SparseObj* so{new SparseObj};\n    so->_cntml_padded = _cntml_padded;\n    scopmath::sparse::create_coef_list(so, n, fun, _threadargs_);\n    nrn_sparseobj_copyto_device(so);\n    return so;\n}\n\n/**\n * This is an experimental numerical method for SCoP-3 which integrates kinetic\n * rate equations.  It is intended to be used only by models generated by MODL,\n * and its identity is meant to be concealed from the user.\n *\n * @param n number of state variables\n * @param s array of pointers to the state variables\n * @param d array of pointers to the derivatives of states\n * @param t pointer to the independent variable\n * @param dt the time step\n * @param fun callable corresponding to the kinetic block equations\n * @param prhs pointer to right hand side vector (answer on return) does not\n *             have to be allocated by caller. 
(this is no longer quite right)\n * @param linflag solve as linear equations, when nonlinear, all states are\n *                forced >= 0\n */\ntemplate <typename F>\nint sparse_thread(SparseObj* so,\n                  int n,\n                  int* s,\n                  int* d,\n                  double* t,\n                  double dt,\n                  F fun,\n                  int linflag,\n                  _threadargsproto_) {\n    int i, j, ierr;\n    double err;\n\n    for (i = 0; i < n; i++) { /*save old state*/\n        scopmath_sparse_d(i) = scopmath_sparse_s(i);\n    }\n    for (err = 1, j = 0; err > CONVERGE; j++) {\n        scopmath::sparse::init_coef_list(so, _iml);\n        fun(so, so->rhs, _threadargs_);  // std::invoke in C++17\n        if ((ierr = scopmath::sparse::matsol(so, _iml))) {\n            return ierr;\n        }\n        for (err = 0., i = 1; i <= n; i++) { /* why oh why did I write it from 1 */\n            scopmath_sparse_s(i - 1) += so->rhs[scopmath_sparse_ix(i)];\n            if (!linflag && scopmath_sparse_s(i - 1) < 0.) 
{\n                scopmath_sparse_s(i - 1) = 0.;\n            }\n            err += fabs(so->rhs[scopmath_sparse_ix(i)]);\n        }\n        if (j > MAXSTEPS) {\n            return EXCEED_ITERS;\n        }\n        if (linflag)\n            break;\n    }\n    scopmath::sparse::init_coef_list(so, _iml);\n    fun(so, so->rhs, _threadargs_);  // std::invoke in C++17\n    for (i = 0; i < n; i++) {        /*restore Dstate at t+dt*/\n        scopmath_sparse_d(i) = (scopmath_sparse_s(i) - scopmath_sparse_d(i)) / dt;\n    }\n    return SUCCESS;\n}\n#undef scopmath_sparse_d\n#undef scopmath_sparse_ix\n#undef scopmath_sparse_s\n#define scopmath_sparse_x(arg) _p[x[arg] * _STRIDE]\n/* for solving ax=b */\ntemplate <typename SPFUN>\nint _cvode_sparse_thread(void** vpr, int n, int* x, SPFUN fun, _threadargsproto_) {\n    SparseObj* so = (SparseObj*) (*vpr);\n    if (!so) {\n        so = new SparseObj{};\n        *vpr = so;\n    }\n    scopmath::sparse::create_coef_list(so, n, fun, _threadargs_); /* calls fun twice */\n    scopmath::sparse::init_coef_list(so, _iml);\n    fun(so, so->rhs, _threadargs_);  // std::invoke in C++17\n    int ierr;\n    if ((ierr = scopmath::sparse::matsol(so, _iml))) {\n        return ierr;\n    }\n    for (int i = 1; i <= n; i++) { /* why oh why did I write it from 1 */\n        scopmath_sparse_x(i - 1) = so->rhs[i];\n    }\n    return SUCCESS;\n}\n#undef scopmath_sparse_x\n\ninline void _nrn_destroy_sparseobj_thread(SparseObj* so) {\n    if (!so) {\n        return;\n    }\n    nrn_sparseobj_delete_from_device(so);\n    delete[] so->rowst;\n    delete[] so->diag;\n    delete[] so->varord;\n    delete[] so->rhs;\n    delete[] so->coef_list;\n    if (so->roworder) {\n        for (int ii = 1; ii <= so->nroworder; ++ii) {\n            delete so->roworder[ii];\n        }\n        delete[] so->roworder;\n    }\n    if (so->orderlist) {\n        scopmath::sparse::freelist(so->orderlist);\n    }\n    delete so;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/scopmath/ssimplic_thread.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n#pragma once\n#include \"coreneuron/mechanism/mech/mod2c_core_thread.hpp\"\n\nnamespace coreneuron {\n\n#if defined(scopmath_ssimplic_s)\n#error \"naming clash on ssimplic_thread.hpp-internal macros\"\n#endif\n#define scopmath_ssimplic_s(arg) _p[s[arg] * _STRIDE]\nstatic int check_state(int n, int* s, _threadargsproto_) {\n    bool flag{true};\n    for (int i = 0; i < n; i++) {\n        if (scopmath_ssimplic_s(i) < -1e-6) {\n            scopmath_ssimplic_s(i) = 0.;\n            flag = false;\n        }\n    }\n    return flag;\n}\n#undef scopmath_ssimplic_s\n\ntemplate <typename SPFUN>\nint _ss_sparse_thread(SparseObj* so,\n                      int n,\n                      int* s,\n                      int* d,\n                      double* t,\n                      double dt,\n                      SPFUN fun,\n                      int linflag,\n                      _threadargsproto_) {\n    int err;\n    double ss_dt{1e9};\n    _nt->_dt = ss_dt;\n\n    if (linflag) { /*iterate linear solution*/\n        err = sparse_thread(so, n, s, d, t, ss_dt, fun, 0, _threadargs_);\n    } else {\n        int ii{7};\n        err = 0;\n        while (ii) {\n            err = sparse_thread(so, n, s, d, t, ss_dt, fun, 1, _threadargs_);\n            if (!err) {\n                if (check_state(n, s, _threadargs_)) {\n                    err = sparse_thread(so, n, s, d, t, ss_dt, fun, 0, _threadargs_);\n                }\n            }\n            --ii;\n            if (!err) {\n                ii = 0;\n            }\n        }\n    }\n\n    _nt->_dt = dt;\n    return err;\n}\n\ntemplate <typename DIFUN>\nint _ss_derivimplicit_thread(int n, int* slist, int* dlist, DIFUN fun, _threadargsproto_) {\n  
  double const dtsav{_nt->_dt};\n    _nt->_dt = 1e-9;\n    int err = fun(_threadargs_);\n    _nt->_dt = dtsav;\n    return err;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/solve_core.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/permute/cellorder.hpp\"\n#include \"coreneuron/sim/multicore.hpp\"\nnamespace coreneuron {\nbool use_solve_interleave;\n\nstatic void triang(NrnThread*), bksub(NrnThread*);\n\n/* solve the matrix equation */\nvoid nrn_solve_minimal(NrnThread* _nt) {\n    if (use_solve_interleave) {\n        solve_interleaved(_nt->id);\n    } else {\n        triang(_nt);\n        bksub(_nt);\n    }\n}\n\n/** @todo OpenACC GPU offload is sequential/slow. Because --cell-permute=0 and\n *  --gpu is forbidden anyway, no OpenMP target offload equivalent is implemented.\n */\n\n/* triangularization of the matrix equations */\nstatic void triang(NrnThread* _nt) {\n    int i2 = _nt->ncell;\n    int i3 = _nt->end;\n\n    double* vec_a = &(VEC_A(0));\n    double* vec_b = &(VEC_B(0));\n    double* vec_d = &(VEC_D(0));\n    double* vec_rhs = &(VEC_RHS(0));\n    int* parent_index = _nt->_v_parent_index;\n\n    nrn_pragma_acc(parallel loop seq present(\n        vec_a [0:i3], vec_b [0:i3], vec_d [0:i3], vec_rhs [0:i3], parent_index [0:i3])\n                       async(_nt->stream_id) if (_nt->compute_gpu))\n    nrn_pragma_omp(target if (_nt->compute_gpu))\n    for (int i = i3 - 1; i >= i2; --i) {\n        double p = vec_a[i] / vec_d[i];\n        vec_d[parent_index[i]] -= p * vec_b[i];\n        vec_rhs[parent_index[i]] -= p * vec_rhs[i];\n    }\n}\n\n/* back substitution to finish solving the matrix equations */\nstatic void bksub(NrnThread* _nt) {\n    int i1 = 0;\n    int i2 = i1 + _nt->ncell;\n    int i3 = _nt->end;\n\n    double* vec_b = &(VEC_B(0));\n    double* vec_d = &(VEC_D(0));\n    double* vec_rhs = &(VEC_RHS(0));\n    int* parent_index = 
_nt->_v_parent_index;\n\n    nrn_pragma_acc(parallel loop seq present(vec_d [0:i2], vec_rhs [0:i2])\n                       async(_nt->stream_id) if (_nt->compute_gpu))\n    nrn_pragma_omp(target if (_nt->compute_gpu))\n    for (int i = i1; i < i2; ++i) {\n        vec_rhs[i] /= vec_d[i];\n    }\n\n    nrn_pragma_acc(\n        parallel loop seq present(vec_b [0:i3], vec_d [0:i3], vec_rhs [0:i3], parent_index [0:i3])\n            async(_nt->stream_id) if (_nt->compute_gpu))\n    nrn_pragma_omp(target if (_nt->compute_gpu))\n    for (int i = i2; i < i3; ++i) {\n        vec_rhs[i] -= vec_b[i] * vec_rhs[parent_index[i]];\n        vec_rhs[i] /= vec_d[i];\n    }\n\n    if (_nt->compute_gpu) {\n        nrn_pragma_acc(wait(_nt->stream_id))\n    }\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/sim/treeset_core.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <string>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/utils/profile/profiler_interface.h\"\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n\nnamespace coreneuron {\n/*\nFixed step method with threads and cache efficiency. No extracellular,\nsparse matrix, multisplit, or legacy features.\n*/\n\nstatic void nrn_rhs(NrnThread* _nt) {\n    int i1 = 0;\n    int i2 = i1 + _nt->ncell;\n    int i3 = _nt->end;\n\n    double* vec_rhs = &(VEC_RHS(0));\n    double* vec_d = &(VEC_D(0));\n    double* vec_a = &(VEC_A(0));\n    double* vec_b = &(VEC_B(0));\n    double* vec_v = &(VEC_V(0));\n    int* parent_index = _nt->_v_parent_index;\n\n    nrn_pragma_acc(parallel loop present(vec_rhs [0:i3], vec_d [0:i3]) if (_nt->compute_gpu)\n                       async(_nt->stream_id))\n    nrn_pragma_omp(target teams distribute parallel for if(_nt->compute_gpu))\n    for (int i = i1; i < i3; ++i) {\n        vec_rhs[i] = 0.;\n        vec_d[i] = 0.;\n    }\n\n    if (_nt->nrn_fast_imem) {\n        double* fast_imem_d = _nt->nrn_fast_imem->nrn_sav_d;\n        double* fast_imem_rhs = _nt->nrn_fast_imem->nrn_sav_rhs;\n        nrn_pragma_acc(\n            parallel loop present(fast_imem_d [i1:i3], fast_imem_rhs [i1:i3]) if (_nt->compute_gpu)\n                async(_nt->stream_id))\n        nrn_pragma_omp(target teams distribute parallel for if(_nt->compute_gpu))\n        for (int i = i1; i < i3; ++i) {\n            fast_imem_d[i] = 0.;\n            fast_imem_rhs[i] = 0.;\n        }\n    }\n\n    nrn_ba(_nt, BEFORE_BREAKPOINT);\n    /* note that CAP has no current */\n    for (auto tml = _nt->tml; tml; tml = 
tml->next)\n        if (corenrn.get_memb_func(tml->index).current) {\n            mod_f_t s = corenrn.get_memb_func(tml->index).current;\n            std::string ss(\"cur-\");\n            ss += nrn_get_mechname(tml->index);\n            Instrumentor::phase p(ss.c_str());\n            (*s)(_nt, tml->ml, tml->index);\n#ifdef DEBUG\n            if (errno) {\n                hoc_warning(\"errno set during calculation of currents\", nullptr);\n            }\n#endif\n        }\n\n    if (_nt->nrn_fast_imem) {\n        /* _nrn_save_rhs has only the contribution of electrode current\n           so here we transform so it only has membrane current contribution\n        */\n        double* p = _nt->nrn_fast_imem->nrn_sav_rhs;\n        nrn_pragma_acc(parallel loop present(p, vec_rhs) if (_nt->compute_gpu)\n                           async(_nt->stream_id))\n        nrn_pragma_omp(target teams distribute parallel for if(_nt->compute_gpu))\n        for (int i = i1; i < i3; ++i) {\n            p[i] -= vec_rhs[i];\n        }\n    }\n\n    /* now the internal axial currents.\n    The extracellular mechanism contribution is already done.\n            rhs += ai_j*(vi_j - vi)\n    */\n    nrn_pragma_acc(parallel loop present(vec_rhs [0:i3],\n                                         vec_d [0:i3],\n                                         vec_a [0:i3],\n                                         vec_b [0:i3],\n                                         vec_v [0:i3],\n                                         parent_index [0:i3]) if (_nt->compute_gpu)\n                       async(_nt->stream_id))\n    nrn_pragma_omp(target teams distribute parallel for if(_nt->compute_gpu))\n    for (int i = i2; i < i3; ++i) {\n        double dv = vec_v[parent_index[i]] - vec_v[i];\n        /* our connection coefficients are negative so */\n        nrn_pragma_acc(atomic update)\n        nrn_pragma_omp(atomic update)\n        vec_rhs[i] -= vec_b[i] * dv;\n        nrn_pragma_acc(atomic update)\n        
nrn_pragma_omp(atomic update)\n        vec_rhs[parent_index[i]] += vec_a[i] * dv;\n    }\n}\n\n/* calculate left hand side of\ncm*dvm/dt = -i(vm) + is(vi) + ai_j*(vi_j - vi)\ncx*dvx/dt - cm*dvm/dt = -gx*(vx - ex) + i(vm) + ax_j*(vx_j - vx)\nwith a matrix so that the solution is of the form [dvm+dvx,dvx] on the right\nhand side after solving.\nThis is a common operation for fixed step, cvode, and daspk methods\n*/\n\nstatic void nrn_lhs(NrnThread* _nt) {\n    int i1 = 0;\n    int i2 = i1 + _nt->ncell;\n    int i3 = _nt->end;\n\n    /* note that CAP has no jacob */\n    for (auto tml = _nt->tml; tml; tml = tml->next)\n        if (corenrn.get_memb_func(tml->index).jacob) {\n            mod_f_t s = corenrn.get_memb_func(tml->index).jacob;\n            std::string ss(\"cur-\");\n            ss += nrn_get_mechname(tml->index);\n            Instrumentor::phase p(ss.c_str());\n            (*s)(_nt, tml->ml, tml->index);\n#ifdef DEBUG\n            if (errno) {\n                hoc_warning(\"errno set during calculation of jacobian\", (char*) 0);\n            }\n#endif\n        }\n    /* now the cap current can be computed because any change to cm by another model\n    has taken effect\n    */\n    /* note, the first is CAP if there are any nodes*/\n    if (_nt->end && _nt->tml) {\n        assert(_nt->tml->index == CAP);\n        nrn_jacob_capacitance(_nt, _nt->tml->ml, _nt->tml->index);\n    }\n\n    double* vec_d = &(VEC_D(0));\n    double* vec_a = &(VEC_A(0));\n    double* vec_b = &(VEC_B(0));\n    int* parent_index = _nt->_v_parent_index;\n\n    if (_nt->nrn_fast_imem) {\n        /* _nrn_save_d has only the contribution of electrode current\n           so here we transform so it only has membrane current contribution\n        */\n        double* p = _nt->nrn_fast_imem->nrn_sav_d;\n        nrn_pragma_acc(parallel loop present(p, vec_d) if (_nt->compute_gpu) async(_nt->stream_id))\n        nrn_pragma_omp(target teams distribute parallel for if(_nt->compute_gpu))\n        
for (int i = i1; i < i3; ++i) {\n            p[i] += vec_d[i];\n        }\n    }\n\n    /* now add the axial currents */\n    nrn_pragma_acc(parallel loop present(\n        vec_d [0:i3], vec_a [0:i3], vec_b [0:i3], parent_index [0:i3]) if (_nt->compute_gpu)\n                       async(_nt->stream_id))\n    nrn_pragma_omp(target teams distribute parallel for if(_nt->compute_gpu))\n    for (int i = i2; i < i3; ++i) {\n        nrn_pragma_acc(atomic update)\n        nrn_pragma_omp(atomic update)\n        vec_d[i] -= vec_b[i];\n        nrn_pragma_acc(atomic update)\n        nrn_pragma_omp(atomic update)\n        vec_d[parent_index[i]] -= vec_a[i];\n    }\n}\n\n/* for the fixed step method */\nvoid* setup_tree_matrix_minimal(NrnThread* _nt) {\n    nrn_rhs(_nt);\n    nrn_lhs(_nt);\n    return nullptr;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/ivocvect.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include \"coreneuron/utils/ivocvect.hpp\"\n#include \"coreneuron/utils/offload.hpp\"\n\nnamespace coreneuron {\nIvocVect* vector_new(int n) {\n    return new IvocVect(n);\n}\nint vector_capacity(IvocVect* v) {\n    return v->size();\n}\ndouble* vector_vec(IvocVect* v) {\n    return v->data();\n}\n\n/*\n * Retro-compatibility implementations\n */\nIvocVect* vector_new1(int n) {\n    return new IvocVect(n);\n}\n\nnrn_pragma_acc(routine seq)\nint vector_capacity(void* v) {\n    return ((IvocVect*) v)->size();\n}\n\nnrn_pragma_acc(routine seq)\ndouble* vector_vec(void* v) {\n    return ((IvocVect*) v)->data();\n}\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/ivocvect.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include \"coreneuron/utils/offload.hpp\"\n\n#include <cstdio>\n#include <utility>\n\nnamespace coreneuron {\ntemplate <typename T>\nclass fixed_vector {\n    size_t n_;\n\n  public:\n    T* data_; /*making public for openacc copying */\n\n    fixed_vector() = default;\n\n    fixed_vector(size_t n)\n        : n_(n) {\n        data_ = new T[n_];\n    }\n\n    fixed_vector(const fixed_vector& vec) = delete;\n    fixed_vector& operator=(const fixed_vector& vec) = delete;\n    fixed_vector(fixed_vector&& vec)\n        : n_{vec.n_}\n        , data_{nullptr} {\n        std::swap(data_, vec.data_);\n    }\n    fixed_vector& operator=(fixed_vector&& vec) {\n        data_ = nullptr;\n        std::swap(data_, vec.data_);\n        n_ = vec.n_;\n        return *this;\n    }\n\n    ~fixed_vector() {\n        delete[] data_;\n    }\n\n    const T& operator[](int i) const {\n        return data_[i];\n    }\n    T& operator[](int i) {\n        return data_[i];\n    }\n\n    nrn_pragma_acc(routine seq)\n    const T* data(void) const {\n        return data_;\n    }\n\n    nrn_pragma_acc(routine seq)\n    T* data(void) {\n        return data_;\n    }\n\n    nrn_pragma_acc(routine seq)\n    size_t size() const {\n        return n_;\n    }\n};\n\nusing IvocVect = fixed_vector<double>;\n\nextern IvocVect* vector_new(int n);\nextern int vector_capacity(IvocVect* v);\nextern double* vector_vec(IvocVect* v);\n\n// retro-compatibility API\nextern IvocVect* vector_new1(int n);\nnrn_pragma_acc(routine seq)\nextern int vector_capacity(void* v);\nnrn_pragma_acc(routine seq)\nextern double* vector_vec(void* v);\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/lpt.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <algorithm>\n#include <functional>\n#include <numeric>\n#include <queue>\n\n#include \"coreneuron/nrnconf.h\"  // for size_t\n#include \"coreneuron/utils/lpt.hpp\"\n#include \"coreneuron/utils/nrn_assert.h\"\n\nusing P = std::pair<size_t, size_t>;\n\n// lpt Least Processing Time algorithm.\n// Largest piece goes into least size bag.\n// in: number of bags, vector of sizes\n// return: a new vector of bag indices parallel to the vector of sizes.\n\nstd::vector<std::size_t> lpt(std::size_t nbag, std::vector<std::size_t>& pieces, double* bal) {\n    nrn_assert(nbag > 0);\n    nrn_assert(!pieces.empty());\n\n    std::vector<P> pvec;\n    for (size_t i = 0; i < pieces.size(); ++i) {\n        pvec.push_back(P(i, pieces[i]));\n    }\n\n    auto P_comp = [](const P& a, const P& b) { return a.second > b.second; };\n\n    std::sort(pvec.begin(), pvec.end(), P_comp);\n\n    std::vector<std::size_t> bagindices(pieces.size());\n\n    std::priority_queue<P, std::vector<P>, decltype(P_comp)> bagq(P_comp);\n    for (size_t i = 0; i < nbag; ++i) {\n        bagq.push(P(i, 0));\n    }\n\n    for (const auto& p: pvec) {\n        P bagqitem = bagq.top();\n        bagq.pop();\n        bagindices[p.first] = bagqitem.first;\n        bagqitem.second += p.second;\n        bagq.push(bagqitem);\n    }\n\n    // load balance average/max (1.0 is perfect)\n    std::vector<size_t> v(bagq.size());\n    for (size_t i = 1; i < nbag; ++i) {\n        v[i] = bagq.top().second;\n        bagq.pop();\n    }\n    double b = load_balance(v);\n    if (bal) {\n        *bal = b;\n    } else {\n        printf(\"load balance = %g for %ld pieces in %ld bags\\n\", b, pieces.size(), nbag);\n    }\n\n    return 
bagindices;\n}\n\ndouble load_balance(std::vector<size_t>& v) {\n    nrn_assert(!v.empty());\n    // use a size_t initial value so the sum is not accumulated in an int\n    std::size_t sum = std::accumulate(v.begin(), v.end(), std::size_t{0});\n    std::size_t max = *std::max_element(v.begin(), v.end());\n    return (double(sum) / v.size()) / max;\n}\n"
  },
  {
    "path": "coreneuron/utils/lpt.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include <vector>\n\nstd::vector<std::size_t> lpt(std::size_t nbag,\n                             std::vector<std::size_t>& pieces,\n                             double* bal = nullptr);\n\ndouble load_balance(std::vector<size_t>&);\n"
  },
  {
    "path": "coreneuron/utils/memory.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n#include \"coreneuron/utils/memory.h\"\n\n#ifdef CORENEURON_ENABLE_GPU\n#include <cuda_runtime_api.h>\n#endif\n\n#include <cassert>\n\nnamespace coreneuron {\nbool gpu_enabled() {\n#ifdef CORENEURON_ENABLE_GPU\n    return corenrn_param.gpu;\n#else\n    return false;\n#endif\n}\n\nvoid* allocate_unified(std::size_t num_bytes) {\n#ifdef CORENEURON_ENABLE_GPU\n    // The build supports GPU execution, check if --gpu was passed to actually\n    // enable it. We should not call CUDA APIs in GPU builds if --gpu was not passed.\n    if (corenrn_param.gpu) {\n        // Allocate managed/unified memory.\n        void* ptr{nullptr};\n        auto const code = cudaMallocManaged(&ptr, num_bytes);\n        assert(code == cudaSuccess);\n        return ptr;\n    }\n#endif\n    // Either the build does not have GPU support or --gpu was not passed.\n    // Allocate using standard operator new.\n    // When we have C++17 support then propagate `alignment` here.\n    return ::operator new(num_bytes);\n}\n\nvoid deallocate_unified(void* ptr, std::size_t num_bytes) {\n    // See comments in allocate_unified to understand the different branches.\n#ifdef CORENEURON_ENABLE_GPU\n    if (corenrn_param.gpu) {\n        // Deallocate managed/unified memory.\n        auto const code = cudaFree(ptr);\n        assert(code == cudaSuccess);\n        return;\n    }\n#endif\n#ifdef __cpp_sized_deallocation\n    ::operator delete(ptr, num_bytes);\n#else\n    ::operator delete(ptr);\n#endif\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/memory.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include <cstdint>\n#include <cstring>\n#include <memory>\n\n#include \"coreneuron/utils/nrn_assert.h\"\n#include \"coreneuron/nrniv/nrniv_decl.h\"\n\n#if !defined(NRN_SOA_BYTE_ALIGN)\n// for layout 0, every range variable array must be aligned by at least 16 bytes (the size of the\n// simd memory bus)\n#define NRN_SOA_BYTE_ALIGN (8 * sizeof(double))\n#endif\n\nnamespace coreneuron {\n/**\n * @brief Check if GPU support is enabled.\n *\n * This returns true if GPU support was enabled at compile time and at runtime\n * via coreneuron.gpu = True and/or --gpu, otherwise it returns false.\n */\nbool gpu_enabled();\n\n/** @brief Allocate unified memory in GPU builds iff GPU enabled, otherwise new\n */\nvoid* allocate_unified(std::size_t num_bytes);\n\n/** @brief Deallocate memory allocated by `allocate_unified`.\n */\nvoid deallocate_unified(void* ptr, std::size_t num_bytes);\n\n/** @brief C++ allocator that uses [de]allocate_unified.\n */\ntemplate <typename T>\nstruct unified_allocator {\n    using value_type = T;\n\n    unified_allocator() = default;\n\n    template <typename U>\n    unified_allocator(unified_allocator<U> const&) noexcept {}\n\n    value_type* allocate(std::size_t n) {\n        return static_cast<value_type*>(allocate_unified(n * sizeof(value_type)));\n    }\n\n    void deallocate(value_type* p, std::size_t n) noexcept {\n        deallocate_unified(p, n * sizeof(value_type));\n    }\n};\n\ntemplate <typename T, typename U>\nbool operator==(unified_allocator<T> const&, unified_allocator<U> const&) noexcept {\n    return true;\n}\n\ntemplate <typename T, typename U>\nbool operator!=(unified_allocator<T> const& x, unified_allocator<U> const& y) noexcept 
{\n    return !(x == y);\n}\n\n/** @brief Allocator-aware deleter for use with std::unique_ptr.\n *\n *  This is copied from https://stackoverflow.com/a/23132307. See also\n *  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0316r0.html,\n *  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0211r3.html, and\n *  boost::allocate_unique<...>.\n *  Hopefully std::allocate_unique will be included in C++23.\n */\ntemplate <typename Alloc>\nstruct alloc_deleter {\n    alloc_deleter() = default;  // OL210813 addition\n    alloc_deleter(const Alloc& a)\n        : a(a) {}\n\n    using pointer = typename std::allocator_traits<Alloc>::pointer;\n\n    void operator()(pointer p) const {\n        Alloc aa(a);\n        std::allocator_traits<Alloc>::destroy(aa, std::addressof(*p));\n        std::allocator_traits<Alloc>::deallocate(aa, p, 1);\n    }\n\n  private:\n    Alloc a;\n};\n\ntemplate <typename T, typename Alloc, typename... Args>\nauto allocate_unique(const Alloc& alloc, Args&&... args) {\n    using AT = std::allocator_traits<Alloc>;\n    static_assert(std::is_same<typename AT::value_type, std::remove_cv_t<T>>{}(),\n                  \"Allocator has the wrong value_type\");\n\n    Alloc a(alloc);\n    auto p = AT::allocate(a, 1);\n    try {\n        AT::construct(a, std::addressof(*p), std::forward<Args>(args)...);\n        using D = alloc_deleter<Alloc>;\n        return std::unique_ptr<T, D>(p, D(a));\n    } catch (...) 
{\n        AT::deallocate(a, p, 1);\n        throw;\n    }\n}\n}  // namespace coreneuron\n\n/// for gpu builds with unified memory support\n#ifdef CORENEURON_UNIFIED_MEMORY\n\n#include <cuda_runtime_api.h>\n\n// TODO : error handling for CUDA routines\ninline void alloc_memory(void*& pointer, size_t num_bytes, size_t /*alignment*/) {\n    cudaMallocManaged(&pointer, num_bytes);\n}\n\ninline void calloc_memory(void*& pointer, size_t num_bytes, size_t /*alignment*/) {\n    alloc_memory(pointer, num_bytes, 64);\n    cudaMemset(pointer, 0, num_bytes);\n}\n\ninline void free_memory(void* pointer) {\n    cudaFree(pointer);\n}\n\n/**\n * A base class providing overloaded new and delete operators for CUDA allocation\n *\n * Classes that should be allocated on the GPU should inherit from this class. Additionally they\n * may need to implement a special copy-construtor. This is documented here:\n * \\link: https://devblogs.nvidia.com/unified-memory-in-cuda-6/\n */\nclass MemoryManaged {\n  public:\n    void* operator new(size_t len) {\n        void* ptr;\n        cudaMallocManaged(&ptr, len);\n        cudaDeviceSynchronize();\n        return ptr;\n    }\n\n    void* operator new[](size_t len) {\n        void* ptr;\n        cudaMallocManaged(&ptr, len);\n        cudaDeviceSynchronize();\n        return ptr;\n    }\n\n    void operator delete(void* ptr) {\n        cudaDeviceSynchronize();\n        cudaFree(ptr);\n    }\n\n    void operator delete[](void* ptr) {\n        cudaDeviceSynchronize();\n        cudaFree(ptr);\n    }\n};\n\n\n/// for cpu builds use posix memalign\n#else\nclass MemoryManaged {\n    // does nothing by default\n};\n\n#include <cstdlib>\n\ninline void alloc_memory(void*& pointer, size_t num_bytes, size_t alignment) {\n    size_t fill = 0;\n    if (alignment > 0) {\n        if (num_bytes % alignment != 0) {\n            size_t multiple = num_bytes / alignment;\n            fill = alignment * (multiple + 1) - num_bytes;\n        }\n        
nrn_assert((pointer = std::aligned_alloc(alignment, num_bytes + fill)) != nullptr);\n    } else {\n        nrn_assert((pointer = std::malloc(num_bytes)) != nullptr);\n    }\n}\n\ninline void calloc_memory(void*& pointer, size_t num_bytes, size_t alignment) {\n    alloc_memory(pointer, num_bytes, alignment);\n    memset(pointer, 0, num_bytes);\n}\n\ninline void free_memory(void* pointer) {\n    free(pointer);\n}\n\n#endif\n\nnamespace coreneuron {\n\n/** Independent function to compute the padded size of an SoA array,\n    the chunk argument is the number of doubles that cnt is padded to a multiple of.\n*/\ntemplate <int chunk>\ninline int soa_padded_size(int cnt, int layout) {\n    int imod = cnt % chunk;\n    if (layout == Layout::AoS)\n        return cnt;\n    if (imod) {\n        int idiv = cnt / chunk;\n        return (idiv + 1) * chunk;\n    }\n    return cnt;\n}\n\n/** Check for the pointer alignment.\n */\ninline bool is_aligned(void* pointer, std::size_t alignment) {\n    return (reinterpret_cast<std::uintptr_t>(pointer) % alignment) == 0;\n}\n\n/**\n * Allocate aligned memory. This will be unified memory if the corresponding\n * CMake option is set. This must be freed with the free_memory method.\n *\n * \\param size      Size of buffer to allocate in bytes.\n * \\param alignment Memory alignment, defaults to NRN_SOA_BYTE_ALIGN. Pass 0 for no alignment.\n */\ninline void* emalloc_align(size_t size, size_t alignment = NRN_SOA_BYTE_ALIGN) {\n    void* memptr;\n    alloc_memory(memptr, size, alignment);\n    if (alignment != 0) {\n        nrn_assert(is_aligned(memptr, alignment));\n    }\n    return memptr;\n}\n\n/**\n * Allocate the aligned memory and set it to 0. This will be unified memory if\n * the corresponding CMake option is set. 
This must be freed with the\n * free_memory method.\n *\n * \\param n         Number of objects to allocate\n * \\param size      Size of buffer for each object to allocate in bytes.\n * \\param alignment Memory alignment, defaults to NRN_SOA_BYTE_ALIGN. Pass 0 for no alignment.\n *\n * \\note the allocated size will be \\c n*size\n */\ninline void* ecalloc_align(size_t n, size_t size, size_t alignment = NRN_SOA_BYTE_ALIGN) {\n    void* p;\n    if (n == 0) {\n        return nullptr;\n    }\n    calloc_memory(p, n * size, alignment);\n    if (alignment != 0) {\n        nrn_assert(is_aligned(p, alignment));\n    }\n    return p;\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/memory_utils.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n/**\n * @file memory_utils.cpp\n * @date 25th Oct 2014\n *\n * @brief Provides functionality to report current memory usage\n * of the simulator using interface provided by malloc.h\n *\n * Memory utilisation report is based on the use of mallinfo\n * interface defined in malloc.h. For 64 bit platform, this\n * is not portable and hence it will be replaced with new\n * glibc implementation of malloc_info.\n *\n * @see http://man7.org/linux/man-pages/man3/malloc_info.3.html\n */\n\n#include <stdio.h>\n#include <fstream>\n#include <unistd.h>\n#include \"coreneuron/utils/memory_utils.h\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/mpi/core/nrnmpi.hpp\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n\n#if defined(__APPLE__) && defined(__MACH__)\n#include <mach/mach.h>\n#elif defined HAVE_MALLOC_H\n#include <malloc.h>\n#endif\n\n#ifdef CORENEURON_ENABLE_GPU\n#include \"cuda_profiler_api.h\"\n#endif\n\nnamespace coreneuron {\ndouble nrn_mallinfo(void) {\n    // -ve mem usage for non-supported platforms\n    double mbs = -1.0;\n\n// on os x returns the current resident set size (physical memory in use)\n#if defined(__APPLE__) && defined(__MACH__)\n    struct mach_task_basic_info info;\n    mach_msg_type_number_t infoCount = MACH_TASK_BASIC_INFO_COUNT;\n    if (task_info(mach_task_self(), MACH_TASK_BASIC_INFO, (task_info_t) &info, &infoCount) !=\n        KERN_SUCCESS)\n        return (size_t) 0L; /* Can't access? 
*/\n    return info.resident_size / (1024.0 * 1024.0);\n#elif defined(MINGW)\n    mbs = -1;\n#else\n    std::ifstream file(\"/proc/self/statm\");\n    if (file.is_open()) {\n        unsigned long long int data_size;\n        file >> data_size >> data_size;\n        file.close();\n        mbs = (data_size * sysconf(_SC_PAGESIZE)) / (1024.0 * 1024.0);\n    } else {\n#if defined HAVE_MALLOC_H\n// The mallinfo2() function was added in glibc 2.33\n#if defined(__GLIBC__) && (__GLIBC__ >= 2 && __GLIBC_MINOR__ >= 33)\n        struct mallinfo2 m = mallinfo2();\n#else\n        struct mallinfo m = mallinfo();\n#endif\n        mbs = (m.hblkhd + m.uordblks) / (1024.0 * 1024.0);\n#endif\n    }\n#endif\n    return mbs;\n}\n\nvoid report_mem_usage(const char* message, bool all_ranks) {\n    double mem_max, mem_min, mem_avg;  // min, max, avg memory\n\n    // current memory usage on this rank\n    double cur_mem = nrn_mallinfo();\n\n/* @todo: avoid three all reduce class */\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        mem_avg = nrnmpi_dbl_allreduce(cur_mem, 1) / nrnmpi_numprocs;\n        mem_max = nrnmpi_dbl_allreduce(cur_mem, 2);\n        mem_min = nrnmpi_dbl_allreduce(cur_mem, 3);\n    } else\n#endif\n    {\n        mem_avg = mem_max = mem_min = cur_mem;\n    }\n\n    // all ranks prints information if all_ranks is true\n    if (all_ranks) {\n        printf(\" Memory (MBs) (Rank : %2d) : %30s : Cur %.4lf, Max %.4lf, Min %.4lf, Avg %.4lf \\n\",\n               nrnmpi_myid,\n               message,\n               cur_mem,\n               mem_max,\n               mem_min,\n               mem_avg);\n    } else if (nrnmpi_myid == 0) {\n        printf(\" Memory (MBs) : %25s : Max %.4lf, Min %.4lf, Avg %.4lf \\n\",\n               message,\n               mem_max,\n               mem_min,\n               mem_avg);\n#ifdef CORENEURON_ENABLE_GPU\n        if (corenrn_param.gpu) {\n            size_t free_byte, total_byte;\n            cudaError_t cuda_status = 
cudaMemGetInfo(&free_byte, &total_byte);\n            if (cudaSuccess != cuda_status) {\n                std::printf(\"cudaMemGetInfo failed: %s\\n\", cudaGetErrorString(cuda_status));\n            }\n            constexpr double MiB{1. / (1024. * 1024.)};\n            std::printf(\" GPU Memory (MiBs) : Used = %f, Free = %f, Total = %f\\n\",\n                        (total_byte - free_byte) * MiB,\n                        free_byte * MiB,\n                        total_byte * MiB);\n        }\n#endif\n    }\n    fflush(stdout);\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/memory_utils.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n/**\n * @file memory_utils.h\n * @date 25th Oct 2014\n * @brief Function prototypes for the functions providing\n * information about simulator memory usage\n *\n */\n\n#pragma once\n\nnamespace coreneuron {\n/** @brief Reports current memory usage of the simulator to stdout\n *\n *  Current implementation is based on mallinfo. This routine prints\n *  min, max and avg memory usage across mpi comm world\n *  @param message string indicating current stage of the simulation\n *  @param all_ranks indicate whether to print info from all ranks\n *  @return Void\n */\nvoid report_mem_usage(const char* message, bool all_ranks = false);\n\n/** @brief Returns current memory usage in MBs\n *  @param Void\n *  @return memory usage in MBs, or a negative value on unsupported platforms\n */\ndouble nrn_mallinfo(void);\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/nrn_assert.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include <cstdio>\n#include <cstdlib>\n#include <cstdarg>\n\n/* Preserving original behaviour requires that we abort() on\n * parse failures.\n *\n * Relying on assert() (as in the original code) is fragile,\n * as this becomes a NOP if the source is compiled with\n * NDEBUG defined.\n */\n\n/** Emit formatted message to stderr, then abort(). */\nstatic void abortf(const char* fmt, ...) {\n    va_list va;\n    va_start(va, fmt);\n    vfprintf(stderr, fmt, va);\n    va_end(va);\n    abort();\n}\n\n/** assert()-like macro, independent of NDEBUG status */\n#define nrn_assert(x) \\\n    ((x) || (abortf(\"%s:%d: Assertion '%s' failed.\\n\", __FILE__, __LINE__, #x), 0))\n"
  },
  {
    "path": "coreneuron/utils/nrn_stats.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n/**\n * @file nrn_stats.cpp\n * @date 25th Dec 2014\n * @brief Function declarations for the cell statistics\n *\n */\n\n#include <algorithm>\n#include <cstdio>\n#include <climits>\n#include <vector>\n#include \"coreneuron/utils/nrn_stats.h\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/network/netcvode.hpp\"\n#include \"coreneuron/network/partrans.hpp\"\n#include \"coreneuron/io/output_spikes.hpp\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\nnamespace coreneuron {\nconst int NUM_STATS = 13;\n\nvoid report_cell_stats() {\n    long stat_array[NUM_STATS] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};\n\n    for (int ith = 0; ith < nrn_nthread; ++ith) {\n        stat_array[0] += nrn_threads[ith].ncell;           // number of cells\n        stat_array[10] += nrn_threads[ith].end;            // number of compartments\n        stat_array[1] += nrn_threads[ith].n_presyn;        // number of presyns\n        stat_array[2] += nrn_threads[ith].n_input_presyn;  // number of input presyns\n        stat_array[3] += nrn_threads[ith].n_netcon;        // number of netcons, synapses\n        stat_array[4] += nrn_threads[ith].n_pntproc;       // number of point processes\n        if (nrn_partrans::transfer_thread_data_) {\n            size_t n = nrn_partrans::transfer_thread_data_[ith].tar_indices.size();\n            stat_array[11] += n;  // number of transfer targets\n            n = nrn_partrans::transfer_thread_data_[ith].src_indices.size();\n            stat_array[12] += n;  // number of transfer sources\n        }\n    }\n    stat_array[5] = spikevec_gid.size();  // number of spikes\n\n    stat_array[6] = 
std::count_if(spikevec_gid.cbegin(), spikevec_gid.cend(), [](const int& s) {\n        return s > -1;\n    });  // number of non-negative gid spikes\n\n#if NRNMPI\n    long gstat_array[NUM_STATS];\n    if (corenrn_param.mpi_enable) {\n        nrnmpi_long_allreduce_vec(stat_array, gstat_array, NUM_STATS, 1);\n    } else {\n        assert(sizeof(stat_array) == sizeof(gstat_array));\n        std::memcpy(gstat_array, stat_array, sizeof(stat_array));\n    }\n#else\n    const long(&gstat_array)[NUM_STATS] = stat_array;\n#endif\n\n    if (nrnmpi_myid == 0) {\n        printf(\"\\n\\n Simulation Statistics\\n\");\n        printf(\" Number of cells: %ld\\n\", gstat_array[0]);\n        printf(\" Number of compartments: %ld\\n\", gstat_array[10]);\n        printf(\" Number of presyns: %ld\\n\", gstat_array[1]);\n        printf(\" Number of input presyns: %ld\\n\", gstat_array[2]);\n        printf(\" Number of synapses: %ld\\n\", gstat_array[3]);\n        printf(\" Number of point processes: %ld\\n\", gstat_array[4]);\n        printf(\" Number of transfer sources: %ld\\n\", gstat_array[12]);\n        printf(\" Number of transfer targets: %ld\\n\", gstat_array[11]);\n        printf(\" Number of spikes: %ld\\n\", gstat_array[5]);\n        printf(\" Number of spikes with non negative gid-s: %ld\\n\", gstat_array[6]);\n    }\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/nrn_stats.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n/**\n * @file nrn_stats.h\n * @date 25th Dec 2014\n * @brief Function declarations for the cell statistics\n *\n */\n\n#pragma once\nnamespace coreneuron {\n/** @brief Reports global cell statistics of the simulation\n *\n *  This routine prints the global number of cells, synapses of the simulation\n *  @param void\n *  @return void\n */\nvoid report_cell_stats();\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/nrnmutdec.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#pragma once\n\n#if defined(_OPENMP)\n#include <omp.h>\n\n// This class respects the requirement *Mutex*\nclass OMP_Mutex {\n  public:\n    // Default constructible\n    OMP_Mutex() {\n        omp_init_lock(&mut_);\n    }\n\n    // Destructible\n    ~OMP_Mutex() {\n        omp_destroy_lock(&mut_);\n    }\n\n    // Not copyable\n    OMP_Mutex(const OMP_Mutex&) = delete;\n    OMP_Mutex& operator=(const OMP_Mutex&) = delete;\n\n    // Not movable\n    OMP_Mutex(const OMP_Mutex&&) = delete;\n    OMP_Mutex& operator=(const OMP_Mutex&&) = delete;\n\n    // Basic Lockable\n    void lock() {\n        omp_set_lock(&mut_);\n    }\n\n    void unlock() {\n        omp_unset_lock(&mut_);\n    }\n\n    // Lockable\n    bool try_lock() {\n        return omp_test_lock(&mut_) != 0;\n    }\n\n  private:\n    omp_lock_t mut_;\n};\n\n#else\n\n// This class respects the requirement *Mutex*\nclass OMP_Mutex {\n  public:\n    // Default constructible\n    OMP_Mutex() = default;\n\n    // Destructible\n    ~OMP_Mutex() = default;\n\n    // Not copyable\n    OMP_Mutex(const OMP_Mutex&) = delete;\n    OMP_Mutex& operator=(const OMP_Mutex&) = delete;\n\n    // Not movable\n    OMP_Mutex(const OMP_Mutex&&) = delete;\n    OMP_Mutex& operator=(const OMP_Mutex&&) = delete;\n\n    // Basic Lockable\n    void lock() {}\n\n    void unlock() {}\n\n    // Lockable\n    bool try_lock() {\n        return true;\n    }\n};\n#endif\n"
  },
  {
    "path": "coreneuron/utils/nrnoc_aux.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <cstdlib>\n#include <cstring>\n\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/coreneuron.hpp\"\n#include \"coreneuron/utils/nrnoc_aux.hpp\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n\nnamespace coreneuron {\nbool stoprun;\nint v_structure_change;\nint diam_changed;\n#define MAXERRCOUNT 5\nint hoc_errno_count;\nconst char* bbcore_write_version = \"1.6\";  // Allow multiple gid and PreSyn per real cell.\n\nchar* pnt_name(Point_process* pnt) {\n    return corenrn.get_memb_func(pnt->_type).sym;\n}\n\nvoid nrn_exit(int err) {\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        nrnmpi_finalize();\n    }\n#endif\n    exit(err);\n}\n\nvoid hoc_execerror(const char* s1, const char* s2) {\n    printf(\"error: %s %s\\n\", s1, s2 ? s2 : \"\");\n    abort();\n}\n\nvoid hoc_warning(const char* s1, const char* s2) {\n    printf(\"warning: %s %s\\n\", s1, s2 ? 
s2 : \"\");\n}\n\ndouble* makevector(size_t size) {\n    return (double*) ecalloc(size, sizeof(char));\n}\n\nvoid freevector(double* p) {\n    if (p) {\n        free(p);\n    }\n}\n\ndouble** makematrix(size_t nrows, size_t ncols) {\n    double** matrix = (double**) emalloc(nrows * sizeof(double*));\n    *matrix = (double*) emalloc(nrows * ncols * sizeof(double));\n    for (size_t i = 1; i < nrows; i++)\n        matrix[i] = matrix[i - 1] + ncols;\n    return (matrix);\n}\n\nvoid freematrix(double** matrix) {\n    if (matrix != nullptr) {\n        free(*matrix);\n        free(matrix);\n    }\n}\n\nvoid* emalloc(size_t size) {\n    void* memptr = malloc(size);\n    assert(memptr);\n    return memptr;\n}\n\n/* some user mod files may use this in VERBATIM */\nvoid* hoc_Emalloc(size_t size) {\n    return emalloc(size);\n}\nvoid hoc_malchk(void) {}\n\nvoid* ecalloc(size_t n, size_t size) {\n    if (n == 0) {\n        return nullptr;\n    }\n    void* p = calloc(n, size);\n    assert(p);\n    return p;\n}\n\nvoid* erealloc(void* ptr, size_t size) {\n    if (!ptr) {\n        return emalloc(size);\n    }\n    void* p = realloc(ptr, size);\n    assert(p);\n    return p;\n}\n\nvoid* nrn_cacheline_alloc(void** memptr, size_t size) {\n    alloc_memory(*memptr, size, 64);\n    return *memptr;\n}\n\n/* used by nmodl and other c, c++ code */\ndouble hoc_Exp(double x) {\n    if (x < -700.) 
{\n        return 0.;\n    } else if (x > 700) {\n        errno = ERANGE;\n        if (++hoc_errno_count < MAXERRCOUNT) {\n            fprintf(stderr, \"exp(%g) out of range, returning exp(700)\\n\", x);\n        }\n        if (hoc_errno_count == MAXERRCOUNT) {\n            fprintf(stderr, \"No more errno warnings during this execution\\n\");\n        }\n        return exp(700.);\n    }\n    return exp(x);\n}\n\n/* check that the bbcore_write version matches between NEURON and CoreNEURON\n * abort in case of mismatch\n */\nvoid check_bbcore_write_version(const char* version) {\n    if (strcmp(version, bbcore_write_version) != 0) {\n        if (nrnmpi_myid == 0)\n            fprintf(stderr,\n                    \"Error: Incompatible binary input dataset version (expected %s, input %s)\\n\",\n                    bbcore_write_version,\n                    version);\n        abort();\n    }\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/nrnoc_aux.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include <cstddef>\n#include \"coreneuron/mechanism/mechanism.hpp\"\n\nnamespace coreneuron {\n\nextern int v_structure_change;\nextern int diam_changed;\nextern int structure_change_cnt;\n\nextern char* pnt_name(Point_process* pnt);\n\nextern void nrn_exit(int);\n\nextern void* emalloc(size_t size);\nextern void* ecalloc(size_t n, size_t size);\nextern void* erealloc(void* ptr, size_t size);\n\nextern double* makevector(size_t size); /* size in bytes */\nextern double** makematrix(size_t nrow, size_t ncol);\nvoid freevector(double*);\nvoid freematrix(double**);\n\nextern void hoc_execerror(const char*, const char*); /* print and abort */\nextern void hoc_warning(const char*, const char*);\n\nextern double hoc_Exp(double x);\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/nrntimeout.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/utils/utils.hpp\"\n\n#if NRNMPI\n\n#include <csignal>\n#include <sys/time.h>\n\n/* if you are using any sampling based profiling tool,\nsetitimer will conflict with profiler. In that case,\nuser can disable setitimer which is just safety for\ndeadlock situations */\nnamespace coreneuron {\n#if (defined(DISABLE_TIMEOUT) || defined(MINGW))\n\nvoid nrn_timeout(int seconds) {}\n\n#else\n\nvoid (*nrntimeout_call)();\nstatic double told;\nstatic struct itimerval value;\nstatic struct sigaction act, oact;\n\nstatic void timed_out(int sig) {\n    (void) sig; /* unused */\n#if CORENRN_DEBUG\n    printf(\"timed_out told=%g t=%g\\n\", told, t);\n#endif\n    if (nrn_threads->_t == told) { /* nothing has been accomplished since last signal*/\n        printf(\"nrn_timeout t=%g\\n\", nrn_threads->_t);\n        if (nrntimeout_call) {\n            (*nrntimeout_call)();\n        }\n        nrn_abort(0);\n    }\n    told = nrn_threads->_t;\n}\n\nvoid nrn_timeout(int seconds) {\n    if (nrnmpi_myid != 0) {\n        return;\n    }\n#if CORENRN_DEBUG\n    printf(\"nrn_timeout %d\\n\", seconds);\n#endif\n    if (seconds) {\n        told = nrn_threads->_t;\n        act.sa_handler = timed_out;\n        act.sa_flags = SA_RESTART;\n        if (sigaction(SIGALRM, &act, &oact)) {\n            printf(\"sigaction failed\\n\");\n            nrn_abort(0);\n        }\n    } else {\n        sigaction(SIGALRM, &oact, (struct sigaction*) 0);\n    }\n    value.it_interval.tv_sec = seconds;\n    value.it_interval.tv_usec = 0;\n    value.it_value.tv_sec = seconds;\n    
value.it_value.tv_usec = 0;\n    if (setitimer(ITIMER_REAL, &value, (struct itimerval*) 0)) {\n        printf(\"setitimer failed\\n\");\n        nrn_abort(0);\n    }\n}\n\n#endif /* DISABLE_TIMEOUT */\n}  // namespace coreneuron\n\n#endif /*NRNMPI*/\n"
  },
  {
    "path": "coreneuron/utils/offload.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n#pragma once\n#define nrn_pragma_stringify(x) #x\n#if defined(CORENEURON_ENABLE_GPU) && defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && defined(_OPENMP)\n#define nrn_pragma_acc(x)\n#define nrn_pragma_omp(x) _Pragma(nrn_pragma_stringify(omp x))\n#include <omp.h>\n#elif defined(CORENEURON_ENABLE_GPU) && !defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENACC)\n#define nrn_pragma_acc(x) _Pragma(nrn_pragma_stringify(acc x))\n#define nrn_pragma_omp(x)\n#include <openacc.h>\n#else\n#define nrn_pragma_acc(x)\n#define nrn_pragma_omp(x)\n#endif\n\n#include <cstddef>\n#include <stdexcept>\n#include <string_view>\n\nnamespace coreneuron {\nvoid cnrn_target_copyin_debug(std::string_view file,\n                              int line,\n                              std::size_t sizeof_T,\n                              std::type_info const& typeid_T,\n                              void const* h_ptr,\n                              std::size_t len,\n                              void* d_ptr);\nvoid cnrn_target_delete_debug(std::string_view file,\n                              int line,\n                              std::size_t sizeof_T,\n                              std::type_info const& typeid_T,\n                              void const* h_ptr,\n                              std::size_t len);\nvoid cnrn_target_deviceptr_debug(std::string_view file,\n                                 int line,\n                                 std::type_info const& typeid_T,\n                                 void const* h_ptr,\n                                 void* d_ptr);\nvoid cnrn_target_is_present_debug(std::string_view file,\n                                  int line,\n                               
   std::type_info const& typeid_T,\n                                  void const* h_ptr,\n                                  void* d_ptr);\nvoid cnrn_target_memcpy_to_device_debug(std::string_view file,\n                                        int line,\n                                        std::size_t sizeof_T,\n                                        std::type_info const& typeid_T,\n                                        void const* h_ptr,\n                                        std::size_t len,\n                                        void* d_ptr);\n#if defined(CORENEURON_ENABLE_GPU) && !defined(CORENEURON_UNIFIED_MEMORY) && \\\n    defined(__NVCOMPILER_MAJOR__) && defined(__NVCOMPILER_MINOR__) &&        \\\n    (__NVCOMPILER_MAJOR__ <= 22) && (__NVCOMPILER_MINOR__ <= 3)\n// Homegrown implementation for buggy NVHPC versions (<=22.3), see\n// https://forums.developer.nvidia.com/t/acc-deviceptr-does-not-work-in-openacc-code-dynamically-loaded-from-a-shared-library/211599\n#define CORENEURON_ENABLE_PRESENT_TABLE\nstd::pair<void*, bool> cnrn_target_deviceptr_impl(bool must_be_present_or_null, void const* h_ptr);\nvoid cnrn_target_copyin_update_present_table(void const* h_ptr, void* d_ptr, std::size_t len);\nvoid cnrn_target_delete_update_present_table(void const* h_ptr, std::size_t len);\n#endif\n\ntemplate <typename T>\nT* cnrn_target_deviceptr_or_present(std::string_view file,\n                                    int line,\n                                    bool must_be_present_or_null,\n                                    const T* h_ptr) {\n    T* d_ptr{};\n    bool error{false};\n#ifdef CORENEURON_ENABLE_PRESENT_TABLE\n    auto const d_ptr_and_error = cnrn_target_deviceptr_impl(must_be_present_or_null, h_ptr);\n    d_ptr = static_cast<T*>(d_ptr_and_error.first);\n    error = d_ptr_and_error.second;\n#elif defined(CORENEURON_ENABLE_GPU) && !defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENACC)\n    d_ptr = 
static_cast<T*>(acc_deviceptr(const_cast<T*>(h_ptr)));\n#elif defined(CORENEURON_ENABLE_GPU) && defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENMP)\n    if (must_be_present_or_null || omp_target_is_present(h_ptr, omp_get_default_device())) {\n        nrn_pragma_omp(target data use_device_ptr(h_ptr))\n        { d_ptr = const_cast<T*>(h_ptr); }\n    }\n#else\n    if (must_be_present_or_null && h_ptr) {\n        throw std::runtime_error(\n            \"cnrn_target_deviceptr() not implemented without OpenACC/OpenMP and gpu build\");\n    }\n#endif\n    if (must_be_present_or_null) {\n        cnrn_target_deviceptr_debug(file, line, typeid(T), h_ptr, d_ptr);\n    } else {\n        cnrn_target_is_present_debug(file, line, typeid(T), h_ptr, d_ptr);\n    }\n    if (error) {\n        throw std::runtime_error(\n            \"cnrn_target_deviceptr() encountered an error, you may want to try setting \"\n            \"CORENEURON_GPU_DEBUG=1\");\n    }\n    return d_ptr;\n}\n\ntemplate <typename T>\nT* cnrn_target_copyin(std::string_view file, int line, const T* h_ptr, std::size_t len = 1) {\n    T* d_ptr{};\n#if defined(CORENEURON_ENABLE_GPU) && !defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENACC)\n    d_ptr = static_cast<T*>(acc_copyin(const_cast<T*>(h_ptr), len * sizeof(T)));\n#elif defined(CORENEURON_ENABLE_GPU) && defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENMP)\n    nrn_pragma_omp(target enter data map(to : h_ptr[:len]))\n    nrn_pragma_omp(target data use_device_ptr(h_ptr))\n    { d_ptr = const_cast<T*>(h_ptr); }\n#else\n    throw std::runtime_error(\n        \"cnrn_target_copyin() not implemented without OpenACC/OpenMP and gpu build\");\n#endif\n#ifdef CORENEURON_ENABLE_PRESENT_TABLE\n    cnrn_target_copyin_update_present_table(h_ptr, d_ptr, len * sizeof(T));\n#endif\n    cnrn_target_copyin_debug(file, line, sizeof(T), typeid(T), h_ptr, len, d_ptr);\n    return d_ptr;\n}\n\ntemplate <typename T>\nvoid 
cnrn_target_delete(std::string_view file, int line, T* h_ptr, std::size_t len = 1) {\n    cnrn_target_delete_debug(file, line, sizeof(T), typeid(T), h_ptr, len);\n#ifdef CORENEURON_ENABLE_PRESENT_TABLE\n    cnrn_target_delete_update_present_table(h_ptr, len * sizeof(T));\n#endif\n#if defined(CORENEURON_ENABLE_GPU) && !defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENACC)\n    acc_delete(h_ptr, len * sizeof(T));\n#elif defined(CORENEURON_ENABLE_GPU) && defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENMP)\n    nrn_pragma_omp(target exit data map(delete : h_ptr[:len]))\n#else\n    throw std::runtime_error(\n        \"cnrn_target_delete() not implemented without OpenACC/OpenMP and gpu build\");\n#endif\n}\n\ntemplate <typename T>\nvoid cnrn_target_memcpy_to_device(std::string_view file,\n                                  int line,\n                                  T* d_ptr,\n                                  const T* h_ptr,\n                                  std::size_t len = 1) {\n    cnrn_target_memcpy_to_device_debug(file, line, sizeof(T), typeid(T), h_ptr, len, d_ptr);\n#if defined(CORENEURON_ENABLE_GPU) && !defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENACC)\n    acc_memcpy_to_device(d_ptr, const_cast<T*>(h_ptr), len * sizeof(T));\n#elif defined(CORENEURON_ENABLE_GPU) && defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENMP)\n    omp_target_memcpy(d_ptr,\n                      const_cast<T*>(h_ptr),\n                      len * sizeof(T),\n                      0,\n                      0,\n                      omp_get_default_device(),\n                      omp_get_initial_device());\n#else\n    throw std::runtime_error(\n        \"cnrn_target_memcpy_to_device() not implemented without OpenACC/OpenMP and gpu build\");\n#endif\n}\n\ntemplate <typename T>\nvoid cnrn_target_update_on_device(std::string_view file,\n                                  int line,\n                                  
const T* h_ptr,\n                                  std::size_t len = 1) {\n    auto* d_ptr = cnrn_target_deviceptr_or_present(file, line, true, h_ptr);\n    cnrn_target_memcpy_to_device(file, line, d_ptr, h_ptr, len);\n}\n\n// Replace with std::source_location once we have C++20\n#define cnrn_target_copyin(...) cnrn_target_copyin(__FILE__, __LINE__, __VA_ARGS__)\n#define cnrn_target_delete(...) cnrn_target_delete(__FILE__, __LINE__, __VA_ARGS__)\n#define cnrn_target_is_present(...) \\\n    cnrn_target_deviceptr_or_present(__FILE__, __LINE__, false, __VA_ARGS__)\n#define cnrn_target_deviceptr(...) \\\n    cnrn_target_deviceptr_or_present(__FILE__, __LINE__, true, __VA_ARGS__)\n#define cnrn_target_memcpy_to_device(...) \\\n    cnrn_target_memcpy_to_device(__FILE__, __LINE__, __VA_ARGS__)\n#define cnrn_target_update_on_device(...) \\\n    cnrn_target_update_on_device(__FILE__, __LINE__, __VA_ARGS__)\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/profile/profiler_interface.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#pragma once\n\n#include <initializer_list>\n#include <type_traits>\n\n#if defined(CORENEURON_CALIPER)\n#include <caliper/cali.h>\n#endif\n\n#ifdef CORENEURON_CUDA_PROFILING\n#include <cuda_profiler_api.h>\n#endif\n\n#if defined(CRAYPAT)\n#include <pat_api.h>\n#endif\n\n#if defined(TAU)\n#include <TAU.h>\n#endif\n\n#if defined(LIKWID_PERFMON)\n#include <likwid.h>\n#endif\n\nnamespace coreneuron {\n\nnamespace detail {\n\n/*! \\class Instrumentor\n *  \\brief Instrumentation infrastructure for benchmarking and profiling.\n *\n *  The Instrumentor class exposes static methods that can be used to\n *  toggle with fine-grained resolution the profiling of specific\n *  areas within the code.\n */\ntemplate <class... TProfilerImpl>\nstruct Instrumentor {\n#pragma clang diagnostic push\n#pragma clang diagnostic ignored \"-Wunused-value\"\n    /*! \\fn phase_begin\n     *  \\brief Activate the collection of profiling data within a code region.\n     *\n     *  This function semantically defines the beginning of a region\n     *  of code that the user wishes to profile.\n     *  Loops through all enabled profilers and calls the relevant\n     *  `phase_begin` function.\n     *  This function should have a non-empty implementation only for\n     *  profilers that allow multiple code regions with different names\n     *  to be profiled concurrently.\n     *\n     *  @param name the (unique) identifier of the code region to be profiled\n     */\n    inline static void phase_begin(const char* name) {\n        std::initializer_list<int>{(TProfilerImpl::phase_begin(name), 0)...};\n    }\n\n    /*! 
\\fn phase_end\n     *  \\brief Deactivate the collection of profiling data within a code region.\n     *\n     *  This function semantically defines the end of a region\n     *  of code that the user wishes to profile.\n     *  Loops through all enabled profilers and calls the relevant\n     *  `phase_end` function.\n     *  This function should have a non-empty implementation only for\n     *  profilers that allow multiple code regions with different names\n     *  to be profiled concurrently.\n     *\n     *  @param name the (unique) identifier of the code region to be profiled\n     */\n    inline static void phase_end(const char* name) {\n        std::initializer_list<int>{(TProfilerImpl::phase_end(name), 0)...};\n    }\n\n    /*! \\fn start_profile\n     *  \\brief Globally activate the collection of profiling data.\n     *\n     *  Activate the collection of profiler data without defining\n     *  a region of interest with a given name, as opposed to `phase_begin`.\n     *  Loops through all enabled profilers and calls the relevant\n     *  `start_profile` function.\n     *  This function should have a non-empty implementation only for\n     *  profilers that expose simply a global begin/end interface, without\n     *  named regions.\n     */\n    inline static void start_profile() {\n        std::initializer_list<int>{(TProfilerImpl::start_profile(), 0)...};\n    }\n\n    /*! 
\\fn stop_profile\n     *  \\brief Globally deactivate the collection of profiling data.\n     *\n     *  Deactivate the collection of profiler data without defining\n     *  a region of interest with a given name, as opposed to `phase_end`.\n     *  Loops through all enabled profilers and calls the relevant\n     *  `stop_profile` function.\n     *  This function should have a non-empty implementation only for\n     *  profilers that expose simply a global begin/end interface, without\n     *  named regions.\n     */\n    inline static void stop_profile() {\n        std::initializer_list<int>{(TProfilerImpl::stop_profile(), 0)...};\n    }\n\n    /*! \\fn init_profile\n     *  \\brief Initialize the profiler.\n     *\n     *  Initialize a profiler's internal structure, without activating yet\n     *  any data collection, similar in concept to MPI_Init.\n     *  Loops through all enabled profilers and calls the relevant\n     *  `init_profile` function.\n     *  This function should have a non-empty implementation only for\n     *  profilers that require special initialization, typically before\n     *  any memory allocation is done.\n     */\n    inline static void init_profile() {\n        std::initializer_list<int>{(TProfilerImpl::init_profile(), 0)...};\n    }\n\n    /*! 
\\fn finalize_profile\n     *  \\brief Finalize the profiler.\n     *\n     *  Finalize a profiler's internal structure once data collection is\n     *  complete, similar in concept to MPI_Finalize.\n     *  Loops through all enabled profilers and calls the relevant\n     *  `finalize_profile` function.\n     *  This function should have a non-empty implementation only for\n     *  profilers that require special finalization.\n     */\n    inline static void finalize_profile() {\n        std::initializer_list<int>{(TProfilerImpl::finalize_profile(), 0)...};\n    }\n#pragma clang diagnostic pop\n};\n\n#if defined(CORENEURON_CALIPER)\n\nstruct Caliper {\n    inline static void phase_begin(const char* name) {\n        CALI_MARK_BEGIN(name);\n    };\n\n    inline static void phase_end(const char* name) {\n        CALI_MARK_END(name);\n    };\n\n    inline static void start_profile(){};\n\n    inline static void stop_profile(){};\n\n    inline static void init_profile(){};\n\n    inline static void finalize_profile(){};\n};\n\n#endif\n\n#ifdef CORENEURON_CUDA_PROFILING\n\nstruct CudaProfiling {\n    inline static void phase_begin(const char* name){};\n\n    inline static void phase_end(const char* name){};\n\n    inline static void start_profile() {\n        cudaProfilerStart();\n    };\n\n    inline static void stop_profile() {\n        cudaProfilerStop();\n    };\n\n    inline static void init_profile(){};\n\n    inline static void finalize_profile(){};\n};\n\n#endif\n\n#if defined(CRAYPAT)\n\nstruct CrayPat {\n    inline static void phase_begin(const char* name){};\n\n    inline static void phase_end(const char* name){};\n\n    inline static void start_profile() {\n        PAT_record(PAT_STATE_ON);\n    };\n\n    inline static void stop_profile() {\n        PAT_record(PAT_STATE_OFF);\n    };\n\n    inline static void init_profile(){};\n\n    inline static void finalize_profile(){};\n};\n#endif\n\n#if defined(TAU)\n\nstruct Tau {\n    inline static void 
phase_begin(const char* name){};\n\n    inline static void phase_end(const char* name){};\n\n    inline static void start_profile() {\n        TAU_ENABLE_INSTRUMENTATION();\n    };\n\n    inline static void stop_profile() {\n        TAU_DISABLE_INSTRUMENTATION();\n    };\n\n    inline static void init_profile(){};\n\n    inline static void finalize_profile(){};\n};\n\n#endif\n\n#if defined(LIKWID_PERFMON)\n\nstruct Likwid {\n    inline static void phase_begin(const char* name) {\n        LIKWID_MARKER_START(name);\n    };\n\n    inline static void phase_end(const char* name) {\n        LIKWID_MARKER_STOP(name);\n    };\n\n    inline static void start_profile(){};\n\n    inline static void stop_profile(){};\n\n    inline static void init_profile() {\n        LIKWID_MARKER_INIT;\n\n#pragma omp parallel\n        { LIKWID_MARKER_THREADINIT; }\n    };\n\n    inline static void finalize_profile() {\n        LIKWID_MARKER_CLOSE;\n    };\n};\n\n#endif\n\nstruct NullInstrumentor {\n    inline static void phase_begin(const char* name){};\n    inline static void phase_end(const char* name){};\n    inline static void start_profile(){};\n    inline static void stop_profile(){};\n    inline static void init_profile(){};\n    inline static void finalize_profile(){};\n};\n\nusing InstrumentorImpl = detail::Instrumentor<\n#if defined CORENEURON_CALIPER\n    detail::Caliper,\n#endif\n#ifdef CORENEURON_CUDA_PROFILING\n    detail::CudaProfiling,\n#endif\n#if defined(CRAYPAT)\n    detail::CrayPat,\n#endif\n#if defined(TAU)\n    detail::Tau,\n#endif\n#if defined(LIKWID_PERFMON)\n    detail::Likwid,\n#endif\n    detail::NullInstrumentor>;\n}  // namespace detail\n\nnamespace Instrumentor {\nstruct phase {\n    const char* phase_name;\n    phase(const char* name)\n        : phase_name(name) {\n        detail::InstrumentorImpl::phase_begin(phase_name);\n    }\n    ~phase() {\n        detail::InstrumentorImpl::phase_end(phase_name);\n    }\n};\n\ninline static void start_profile() {\n    
detail::InstrumentorImpl::start_profile();\n}\n\ninline static void stop_profile() {\n    detail::InstrumentorImpl::stop_profile();\n}\n\ninline static void phase_begin(const char* name) {\n    detail::InstrumentorImpl::phase_begin(name);\n}\n\ninline static void phase_end(const char* name) {\n    detail::InstrumentorImpl::phase_end(name);\n}\n\ninline static void init_profile() {\n    detail::InstrumentorImpl::init_profile();\n}\n\ninline static void finalize_profile() {\n    detail::InstrumentorImpl::finalize_profile();\n}\n}  // namespace Instrumentor\n\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/progressbar/progressbar.cpp",
    "content": "/**\n * \\file\n * \\author Trevor Fountain\n * \\author Johannes Buchner\n * \\author Erik Garrison\n * \\date 2010-2014\n * \\copyright BSD 3-Clause\n *\n * progressbar -- a C class (by convention) for displaying progress\n * on the command line (to stdout).\n */\n#include \"coreneuron/utils/progressbar/progressbar.hpp\"\n\n#include <cassert>\n#include <cstddef>\n#include <climits>\n#include <unistd.h>\n\n///  How wide we assume the screen is if termcap fails.\nenum { DEFAULT_SCREEN_WIDTH = 80 };\n\n/// The smallest that the bar can ever be (not including borders)\nenum { MINIMUM_BAR_WIDTH = 10 };\n\n/// The format in which the estimated remaining time will be reported\nstatic const char* const ETA_FORMAT = \"t: %-6.2f ETA:%2dh%02dm%02ds\";\n\n/// The maximum number of characters that the ETA_FORMAT can ever yield\nenum { ETA_FORMAT_LENGTH = 13 };\n\n/// Amount of screen width taken up by whitespace (i.e. whitespace between label/bar/ETA components)\nenum { WHITESPACE_LENGTH = 2 };\n\n/// The amount of width taken up by the border of the bar component.\nenum { BAR_BORDER_WIDTH = 2 };\n\n/// The maximum number of bar redraws (to avoid frequent output in long runs)\nenum { BAR_DRAW_COUNT_MAX = 500 };\n\nenum { BAR_DRAW_INTERVAL = 1, BAR_DRAW_INTERVAL_NOTTY = 5 };\n\n/// Models a duration of time broken into hour/minute/second components. The number of seconds\n/// should be less than the\n/// number of seconds in one minute, and the number of minutes should be less than the number of\n/// minutes in one hour.\nstruct progressbar_time_components {\n    int hours;\n    int minutes;\n    int seconds;\n};\n\nstatic void progressbar_draw(const progressbar* bar);\nstatic int progressbar_remaining_seconds(const progressbar* bar);\n\n/**\n * Create a new progress bar with the specified label, max number of steps, and format string.\n * Note that `format` must be exactly three characters long, e.g. \"<->\" to render a progress\n * bar like \"<---------->\". 
Returns nullptr if there isn't enough memory to allocate a progressbar\n */\nprogressbar* progressbar_new_with_format(const char* label, unsigned long max, const char* format) {\n    auto* new_bar = static_cast<progressbar*>(malloc(sizeof(progressbar)));\n    if (new_bar == nullptr) {\n        return nullptr;\n    }\n\n    new_bar->max = max;\n    new_bar->value = 0;\n    new_bar->draw_time_interval = isatty(STDOUT_FILENO) ? BAR_DRAW_INTERVAL\n                                                        : BAR_DRAW_INTERVAL_NOTTY;\n    new_bar->t = 0;\n    new_bar->start = time(nullptr);\n    assert(3 == strlen(format) && \"format must be 3 characters in length\");\n    new_bar->format.begin = format[0];\n    new_bar->format.fill = format[1];\n    new_bar->format.end = format[2];\n\n    progressbar_update_label(new_bar, label);\n    progressbar_draw(new_bar);\n    new_bar->prev_t = difftime(time(nullptr), new_bar->start);\n    new_bar->drawn_count = 1;\n\n    return new_bar;\n}\n\n/**\n * Create a new progress bar with the specified label and max number of steps.\n */\nprogressbar* progressbar_new(const char* label, unsigned long max) {\n    return progressbar_new_with_format(label, max, \"|=|\");\n}\n\nvoid progressbar_update_label(progressbar* bar, const char* label) {\n    bar->label = label;\n}\n\n/**\n * Delete an existing progress bar.\n */\nvoid progressbar_free(progressbar* bar) {\n    free(bar);\n}\n\n/**\n * Increment an existing progressbar by `value` steps.\n * Additionally issues a redraw in case a certain time interval has elapsed (min: 1sec)\n * Reasons for a larger interval are:\n *  - Stdout is not TTY\n *  - Respect BAR_DRAW_COUNT_MAX\n */\nvoid progressbar_update(progressbar* bar, unsigned long value, double t) {\n    bar->value = value;\n    bar->t = t;\n    int sim_time = difftime(time(nullptr), bar->start);\n\n    // If there is not enough time passed to redraw the progress bar return\n    if ((sim_time - bar->prev_t) < bar->draw_time_interval) {\n  
      return;\n    }\n\n    progressbar_draw(bar);\n\n    bar->drawn_count++;\n    bar->prev_t = sim_time;\n\n    if (bar->drawn_count >= BAR_DRAW_COUNT_MAX || sim_time < 15) {\n        // Don't change the interval after the limit. Simulation should be over any moment and\n        // avoid the calc of draw_time_interval which could raise DIV/0\n        // Also, don't do it the first 15sec to avoid really bad estimates which could potentially\n        // delay a better estimate too far away in the future.\n        return;\n    }\n\n    // Sample ETA to calculate the next interval until the redraw of the progressbar\n    int eta_s = progressbar_remaining_seconds(bar);\n    bar->draw_time_interval = eta_s / (BAR_DRAW_COUNT_MAX - bar->drawn_count);\n\n    if (bar->draw_time_interval < BAR_DRAW_INTERVAL_NOTTY) {\n        bar->draw_time_interval = isatty(STDOUT_FILENO)\n                                      ? ((bar->draw_time_interval < BAR_DRAW_INTERVAL)\n                                             ? BAR_DRAW_INTERVAL\n                                             : bar->draw_time_interval)\n                                      : BAR_DRAW_INTERVAL_NOTTY;\n    }\n}\n\n/**\n * Increment an existing progressbar by a single step.\n */\nvoid progressbar_inc(progressbar* bar, double t) {\n    progressbar_update(bar, bar->value + 1, t);\n}\n\nstatic void progressbar_write_char(FILE* file, const int ch, const size_t times) {\n    for (std::size_t i = 0; i < times; ++i) {\n        fputc(ch, file);\n    }\n}\n\nstatic int progressbar_max(int x, int y) {\n    return x > y ? 
x : y;\n}\n\nstatic unsigned int get_screen_width(void) {\n    return DEFAULT_SCREEN_WIDTH;\n}\n\nstatic int progressbar_bar_width(int screen_width, int label_length) {\n    return progressbar_max(MINIMUM_BAR_WIDTH,\n                           screen_width - label_length - ETA_FORMAT_LENGTH - WHITESPACE_LENGTH);\n}\n\nstatic int progressbar_label_width(int screen_width, int label_length, int bar_width) {\n    int eta_width = ETA_FORMAT_LENGTH;\n\n    // If the progressbar is too wide to fit on the screen, we must sacrifice the label.\n    if (label_length + 1 + bar_width + 1 + ETA_FORMAT_LENGTH > screen_width) {\n        return progressbar_max(0, screen_width - bar_width - eta_width - WHITESPACE_LENGTH);\n    } else {\n        return label_length;\n    }\n}\n\nstatic int progressbar_remaining_seconds(const progressbar* bar) {\n    double offset = difftime(time(nullptr), bar->start);\n    if (bar->value > 0 && offset > 0) {\n        return (offset / (double) bar->value) * (bar->max - bar->value);\n    } else {\n        return 0;\n    }\n}\n\nstatic progressbar_time_components progressbar_calc_time_components(int seconds) {\n    int hours = seconds / 3600;\n    seconds -= hours * 3600;\n    int minutes = seconds / 60;\n    seconds -= minutes * 60;\n\n    progressbar_time_components components = {hours, minutes, seconds};\n    return components;\n}\n\nstatic void progressbar_draw(const progressbar* bar) {\n    int screen_width = get_screen_width();\n    int label_length = strlen(bar->label);\n    int bar_width = progressbar_bar_width(screen_width, label_length);\n    int label_width = progressbar_label_width(screen_width, label_length, bar_width);\n\n    int progressbar_completed = (bar->value >= bar->max);\n    int bar_piece_count = bar_width - BAR_BORDER_WIDTH;\n    int bar_piece_current = (progressbar_completed)\n                                ? 
bar_piece_count\n                                : bar_piece_count * ((double) bar->value / bar->max);\n\n    progressbar_time_components eta =\n        (progressbar_completed)\n            ? progressbar_calc_time_components(difftime(time(nullptr), bar->start))\n            : progressbar_calc_time_components(progressbar_remaining_seconds(bar));\n\n    if (label_width == 0) {\n        // The label would usually have a trailing space, but in the case that we don't print\n        // a label, the bar can use that space instead.\n        bar_width += 1;\n    } else {\n        // Draw the label\n        fwrite(bar->label, 1, label_width, stdout);\n        fputc(' ', stdout);\n    }\n\n    // Draw the progressbar\n    fputc(bar->format.begin, stdout);\n    progressbar_write_char(stdout, bar->format.fill, bar_piece_current);\n    progressbar_write_char(stdout, ' ', bar_piece_count - bar_piece_current);\n    fputc(bar->format.end, stdout);\n\n    // Draw the ETA\n    fputc(' ', stdout);\n    fprintf(stdout, ETA_FORMAT, bar->t, eta.hours, eta.minutes, eta.seconds);\n    fputc('\\r', stdout);\n    fflush(stdout);\n}\n\n/**\n * Finish a progressbar, indicating 100% completion, and free it.\n */\nvoid progressbar_finish(progressbar* bar) {\n    // Make sure we fill the progressbar so things look complete.\n    progressbar_draw(bar);\n\n    // Print a newline, so that future outputs to stdout look prettier\n    fprintf(stdout, \"\\n\");\n\n    // We've finished with this progressbar, so go ahead and free it.\n    progressbar_free(bar);\n}\n"
  },
  {
    "path": "coreneuron/utils/progressbar/progressbar.hpp",
    "content": "/**\n * \\file\n * \\author Trevor Fountain\n * \\author Johannes Buchner\n * \\author Erik Garrison\n * \\date 2010-2014\n * \\copyright BSD 3-Clause\n *\n * progressbar -- a C class (by convention) for displaying progress\n * on the command line (to stderr).\n */\n#pragma once\n#include <time.h>\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n/**\n * Progressbar data structure (do not modify or create directly)\n */\nstruct progressbar {\n    /// maximum value\n    unsigned long max;\n\n    /// current value\n    unsigned long value;\n\n    /// value of the previous progress bar drawn in output\n    unsigned long prev_sample_value;\n\n    /// time interval between consecutive bar redraws (seconds)\n    time_t draw_time_interval;\n\n    /// number of redrawn bars\n    unsigned long drawn_count;\n\n    /// time progressbar was started\n    time_t start;\n\n    /// time progressbar was drawn for last time\n    time_t prev_t;\n\n    /// label\n    const char* label;\n\n    /// current time (added for simulation)\n    double t;\n\n    /// characters for the beginning, filling and end of the\n    /// progressbar. E.g. |###    | has |#|\n    struct {\n        char begin;\n        char fill;\n        char end;\n    } format;\n};\n\n/// Create a new progressbar with the specified label and number of steps.\n///\n/// @param label The label that will prefix the progressbar.\n/// @param max The number of times the progressbar must be incremented before it is considered\n/// complete, or, in other words, the number of tasks that this progressbar is tracking.\n/// @return A progressbar configured with the provided arguments. 
Note that the user is responsible\n/// for disposing of the progressbar via progressbar_finish when finished with the object.\nprogressbar* progressbar_new(const char* label, unsigned long max);\n\n/// Create a new progressbar with the specified label, number of steps, and format string.\n///\n/// @param label The label that will prefix the progressbar.\n/// @param max The number of times the progressbar must be incremented before it is considered\n/// complete, or, in other words, the number of tasks that this progressbar is tracking.\n/// @param format The format of the progressbar. The string provided must be three characters, and\n/// it will be interpreted with the first character as the left border of the bar, the second\n/// character as the fill of the bar and the third character as the right border of the bar. For example,\n/// \"<->\" would result in a bar formatted like \"<------     >\".\n///\n/// @return A progressbar configured with the provided arguments. Note that the user is responsible\n/// for disposing of the progressbar via progressbar_finish when finished with the object.\nprogressbar* progressbar_new_with_format(const char* label, unsigned long max, const char* format);\n\n/// Free an existing progress bar. Don't call this directly; call *progressbar_finish* instead.\nvoid progressbar_free(progressbar* bar);\n\n/// Increment the given progressbar. Don't increment past the initialized # of steps, though.\nvoid progressbar_inc(progressbar* bar, double t);\n\n/// Set the current status on the given progressbar.\nvoid progressbar_update(progressbar* bar, unsigned long value, double t);\n\n/// Set the label of the progressbar. Note that no rendering is done. The label is simply set so\n/// that the next rendering will use the new label. To immediately see the new label, call\n/// progressbar_draw.\n/// Does not update display or copy the label\nvoid progressbar_update_label(progressbar* bar, const char* label);\n\n/// Finalize (and free!) a progressbar. 
Call this when you're done, or if you break out\n/// partway through.\nvoid progressbar_finish(progressbar* bar);\n"
  },
  {
    "path": "coreneuron/utils/randoms/nrnran123.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#include \"coreneuron/gpu/nrn_acc_manager.hpp\"\n#include \"coreneuron/mpi/core/nrnmpi.hpp\"\n#include \"coreneuron/utils/memory.h\"\n#include \"coreneuron/utils/nrnmutdec.hpp\"\n#include \"coreneuron/utils/randoms/nrnran123.h\"\n\n#ifdef CORENEURON_USE_BOOST_POOL\n#include <boost/pool/pool_alloc.hpp>\n#include <unordered_map>\n#endif\n\n#include <cmath>\n#include <iostream>\n#include <memory>\n#include <mutex>\n\n// Defining these attributes seems to help nvc++ in OpenMP target offload mode.\n#if defined(CORENEURON_ENABLE_GPU) && defined(CORENEURON_PREFER_OPENMP_OFFLOAD) && \\\n    defined(_OPENMP) && defined(__CUDACC__)\n#define CORENRN_HOST_DEVICE __host__ __device__\n#else\n#define CORENRN_HOST_DEVICE\n#endif\n\nnamespace {\n#ifdef CORENEURON_USE_BOOST_POOL\n/** Tag type for use with boost::fast_pool_allocator that forwards to\n *  coreneuron::[de]allocate_unified(). Using a Random123-specific type here\n *  makes sure that allocations do not come from the same global pool as other\n *  usage of boost pools for objects with sizeof == sizeof(nrnran123_State).\n *\n *  The messy m_block_sizes map is just because `deallocate_unified` uses sized\n *  deallocations, but the Boost pool allocators don't. 
Because this is hidden\n *  behind the pool mechanism, these methods are not called very often and the\n *  overhead is minimal.\n */\nstruct random123_allocate_unified {\n    using size_type = std::size_t;\n    using difference_type = std::size_t;\n    static char* malloc(const size_type bytes) {\n        std::lock_guard<std::mutex> const lock{m_mutex};\n        static_cast<void>(lock);\n        auto* buffer = coreneuron::allocate_unified(bytes);\n        m_block_sizes[buffer] = bytes;\n        return reinterpret_cast<char*>(buffer);\n    }\n    static void free(char* const block) {\n        std::lock_guard<std::mutex> const lock{m_mutex};\n        static_cast<void>(lock);\n        auto const iter = m_block_sizes.find(block);\n        assert(iter != m_block_sizes.end());\n        auto const size = iter->second;\n        m_block_sizes.erase(iter);\n        return coreneuron::deallocate_unified(block, size);\n    }\n    static std::mutex m_mutex;\n    static std::unordered_map<void*, std::size_t> m_block_sizes;\n};\n\nstd::mutex random123_allocate_unified::m_mutex{};\nstd::unordered_map<void*, std::size_t> random123_allocate_unified::m_block_sizes{};\n\nusing random123_allocator =\n    boost::fast_pool_allocator<coreneuron::nrnran123_State, random123_allocate_unified>;\n#else\nusing random123_allocator = coreneuron::unified_allocator<coreneuron::nrnran123_State>;\n#endif\n/* Global data structure per process. Using a unique_ptr here causes [minor]\n * problems because its destructor can be called very late during application\n * shutdown. 
If the destructor calls cudaFree and the CUDA runtime has already\n * been shut down then tools like cuda-memcheck report errors.\n */\nOMP_Mutex g_instance_count_mutex;\nstd::size_t g_instance_count{};\n\n#ifdef __CUDACC__\n#define g_k_qualifiers __device__ __constant__\n#else\n#define g_k_qualifiers\n#endif\ng_k_qualifiers philox4x32_key_t g_k{};\n// Cannot refer to g_k directly from a nrn_pragma_acc(routine seq) method like\n// coreneuron_random123_philox4x32_helper, and cannot have this inlined there at\n// higher optimisation levels\n__attribute__((noinline)) philox4x32_key_t& global_state() {\n    return g_k;\n}\n}  // namespace\n\nCORENRN_HOST_DEVICE philox4x32_ctr_t\ncoreneuron_random123_philox4x32_helper(coreneuron::nrnran123_State* s) {\n    return philox4x32(s->c, global_state());\n}\n\nnamespace coreneuron {\nstd::size_t nrnran123_instance_count() {\n    return g_instance_count;\n}\n\n/* if one sets the global, one should reset all the stream sequences. */\nuint32_t nrnran123_get_globalindex() {\n    return global_state().v[0];\n}\n\n/* nrn123 streams are created from cpu launcher routine */\nvoid nrnran123_set_globalindex(uint32_t gix) {\n    // If the global seed is changing then we shouldn't have any active streams.\n    auto& g_k = global_state();\n    {\n        std::lock_guard<OMP_Mutex> _{g_instance_count_mutex};\n        if (g_instance_count != 0 && nrnmpi_myid == 0) {\n            std::cout\n                << \"nrnran123_set_globalindex(\" << gix\n                << \") called when a non-zero number of Random123 streams (\" << g_instance_count\n                << \") were active. 
This is not safe, some streams will remember the old value (\"\n                << g_k.v[0] << ')' << std::endl;\n        }\n    }\n    if (g_k.v[0] != gix) {\n        g_k.v[0] = gix;\n        if (coreneuron::gpu_enabled()) {\n#ifdef __CUDACC__\n            {\n                auto const code = cudaMemcpyToSymbol(g_k, &g_k, sizeof(g_k));\n                assert(code == cudaSuccess);\n            }\n            {\n                auto const code = cudaDeviceSynchronize();\n                assert(code == cudaSuccess);\n            }\n#else\n            nrn_pragma_acc(update device(g_k))\n            nrn_pragma_omp(target update to(g_k))\n#endif\n        }\n    }\n}\n\nvoid nrnran123_initialise_global_state_on_device() {\n    if (coreneuron::gpu_enabled()) {\n#ifndef __CUDACC__\n        nrn_pragma_acc(enter data copyin(g_k))\n#endif\n    }\n}\n\nvoid nrnran123_destroy_global_state_on_device() {\n    if (coreneuron::gpu_enabled()) {\n#ifndef __CUDACC__\n        nrn_pragma_acc(exit data delete (g_k))\n#endif\n    }\n}\n\n/** @brief Allocate a new Random123 stream.\n *  @todo  It would be nicer if the API return type was\n *  std::unique_ptr<nrnran123_State, ...not specified...>, so we could use a\n *  custom allocator/deleter and avoid the (fragile) need for matching\n *  nrnran123_deletestream calls.\n */\nnrnran123_State* nrnran123_newstream3(uint32_t id1,\n                                      uint32_t id2,\n                                      uint32_t id3,\n                                      bool use_unified_memory) {\n    // The `use_unified_memory` argument is an implementation detail to keep the\n    // old behaviour that some Random123 streams that are known to only be used\n    // from the CPU are allocated using new/delete instead of unified memory.\n    // See OPENACC_EXCLUDED_FILES in coreneuron/CMakeLists.txt. 
If we dropped\n    // this feature then we could always use coreneuron::unified_allocator.\n#ifndef CORENEURON_ENABLE_GPU\n    if (use_unified_memory) {\n        throw std::runtime_error(\"Tried to use CUDA unified memory in a non-GPU build.\");\n    }\n#endif\n    nrnran123_State* s{nullptr};\n    if (use_unified_memory) {\n        s = coreneuron::allocate_unique<nrnran123_State>(random123_allocator{}).release();\n    } else {\n        s = new nrnran123_State{};\n    }\n    s->c.v[0] = 0;\n    s->c.v[1] = id3;\n    s->c.v[2] = id1;\n    s->c.v[3] = id2;\n    nrnran123_setseq(s, 0, 0);\n    {\n        std::lock_guard<OMP_Mutex> _{g_instance_count_mutex};\n        ++g_instance_count;\n    }\n    return s;\n}\n\n/* nrn123 streams are destroyed from cpu launcher routine */\nvoid nrnran123_deletestream(nrnran123_State* s, bool use_unified_memory) {\n#ifndef CORENEURON_ENABLE_GPU\n    if (use_unified_memory) {\n        throw std::runtime_error(\"Tried to use CUDA unified memory in a non-GPU build.\");\n    }\n#endif\n    {\n        std::lock_guard<OMP_Mutex> _{g_instance_count_mutex};\n        --g_instance_count;\n    }\n    if (use_unified_memory) {\n        std::unique_ptr<nrnran123_State, coreneuron::alloc_deleter<random123_allocator>> _{s};\n    } else {\n        delete s;\n    }\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/randoms/nrnran123.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#pragma once\n\n/* interface to Random123 */\n/* http://www.thesalmons.org/john/random123/papers/random123sc11.pdf */\n\n/*\nThe 4x32 generators utilize a uint32x4 counter and uint32x4 key to transform\ninto an almost cryptographic quality uint32x4 random result.\nThere are many possibilities for balancing the sharing of the internal\nstate instances while reserving a uint32 counter for the stream sequence\nand reserving other portions of the counter vector for stream identifiers\nand global index used by all streams.\n\nWe currently provide a single instance by default in which the policy is\nto use the 0th counter uint32 as the stream sequence, words 2 and 3 as the\nstream identifier, and word 0 of the key as the global index. Unused words\nare constant uint32 0.\n\nIt is also possible to use Random123 directly without reference to this\ninterface. See Random123-1.02/docs/html/index.html\nof the full distribution available from\nhttp://www.deshawresearch.com/resources_random123.html\n*/\n\n#ifdef __bgclang__\n#define R123_USE_MULHILO64_MULHI_INTRIN 0\n#define R123_USE_GNU_UINT128            1\n#endif\n\n#include \"coreneuron/utils/offload.hpp\"\n\n#include <Random123/philox.h>\n#include <inttypes.h>\n\n#include <cmath>\n\n// Some files are compiled with DISABLE_OPENACC, and some builds have no GPU\n// support at all. 
In these two cases, request that the random123 state is\n// allocated using new/delete instead of CUDA unified memory.\n#if defined(CORENEURON_ENABLE_GPU) && !defined(DISABLE_OPENACC)\n#define CORENRN_RAN123_USE_UNIFIED_MEMORY true\n#else\n#define CORENRN_RAN123_USE_UNIFIED_MEMORY false\n#endif\n\nnamespace coreneuron {\n\nstruct nrnran123_State {\n    philox4x32_ctr_t c;\n    philox4x32_ctr_t r;\n    char which_;\n};\n\n}  // namespace coreneuron\n\n/** @brief Provide a helper function in global namespace that is declared target for OpenMP\n * offloading to function correctly with NVHPC\n */\nnrn_pragma_acc(routine seq)\nnrn_pragma_omp(declare target)\nphilox4x32_ctr_t coreneuron_random123_philox4x32_helper(coreneuron::nrnran123_State* s);\nnrn_pragma_omp(end declare target)\n\nnamespace coreneuron {\nvoid nrnran123_initialise_global_state_on_device();\nvoid nrnran123_destroy_global_state_on_device();\n\n/* global index. eg. run number */\n/* all generator instances share this global index */\nvoid nrnran123_set_globalindex(uint32_t gix);\nuint32_t nrnran123_get_globalindex();\n\n// Utilities used for calculating model size, only called from the CPU.\nstd::size_t nrnran123_instance_count();\ninline std::size_t nrnran123_state_size() {\n    return sizeof(nrnran123_State);\n}\n\n/* routines for creating and deleting streams are called from cpu */\nnrnran123_State* nrnran123_newstream3(uint32_t id1,\n                                      uint32_t id2,\n                                      uint32_t id3,\n                                      bool use_unified_memory = CORENRN_RAN123_USE_UNIFIED_MEMORY);\ninline nrnran123_State* nrnran123_newstream(\n    uint32_t id1,\n    uint32_t id2,\n    bool use_unified_memory = CORENRN_RAN123_USE_UNIFIED_MEMORY) {\n    return nrnran123_newstream3(id1, id2, 0, use_unified_memory);\n}\nvoid nrnran123_deletestream(nrnran123_State* s,\n                            bool use_unified_memory = CORENRN_RAN123_USE_UNIFIED_MEMORY);\n\n/* 
minimal data stream */\nconstexpr void nrnran123_getseq(nrnran123_State* s, uint32_t* seq, char* which) {\n    *seq = s->c.v[0];\n    *which = s->which_;\n}\nconstexpr void nrnran123_getids(nrnran123_State* s, uint32_t* id1, uint32_t* id2) {\n    *id1 = s->c.v[2];\n    *id2 = s->c.v[3];\n}\nconstexpr void nrnran123_getids3(nrnran123_State* s, uint32_t* id1, uint32_t* id2, uint32_t* id3) {\n    *id3 = s->c.v[1];\n    *id1 = s->c.v[2];\n    *id2 = s->c.v[3];\n}\n\n// Uniform 0 to 2^32-1\ninline uint32_t nrnran123_ipick(nrnran123_State* s) {\n    char which = s->which_;\n    uint32_t rval{s->r.v[int{which++}]};\n    if (which > 3) {\n        which = 0;\n        s->c.v[0]++;\n        s->r = coreneuron_random123_philox4x32_helper(s);\n    }\n    s->which_ = which;\n    return rval;\n}\n\nconstexpr double nrnran123_uint2dbl(uint32_t u) {\n    constexpr double SHIFT32 = 1.0 / 4294967297.0; /* 1/(2^32 + 1) */\n    /* 0 to 2^32-1 transforms to double value in open (0,1) interval */\n    /* min 2.3283064e-10 to max (1 - 2.3283064e-10) */\n    return (static_cast<double>(u) + 1.0) * SHIFT32;\n}\n\n// Uniform open interval (0,1), minimum value is 2.3283064e-10 and max value is 1-min\ninline double nrnran123_dblpick(nrnran123_State* s) {\n    return nrnran123_uint2dbl(nrnran123_ipick(s));\n}\n\n/* this could be called from openacc parallel construct (in INITIAL block) */\ninline void nrnran123_setseq(nrnran123_State* s, uint32_t seq, char which) {\n    if (which > 3) {\n        s->which_ = 0;\n    } else {\n        s->which_ = which;\n    }\n    s->c.v[0] = seq;\n    s->r = coreneuron_random123_philox4x32_helper(s);\n}\n\n// nrnran123_negexp min value is 2.3283064e-10, max is 22.18071, mean 1.0\ninline double nrnran123_negexp(nrnran123_State* s) {\n    return -std::log(nrnran123_dblpick(s));\n}\n\n/* at cost of a cached value we could compute two at a time. 
*/\ninline double nrnran123_normal(nrnran123_State* s) {\n    double w, u1;\n    do {\n        u1 = nrnran123_dblpick(s);\n        double u2{nrnran123_dblpick(s)};\n        u1 = 2. * u1 - 1.;\n        u2 = 2. * u2 - 1.;\n        w = (u1 * u1) + (u2 * u2);\n    } while (w > 1);\n    double y{std::sqrt((-2. * std::log(w)) / w)};\n    return u1 * y;\n}\n\n// nrnran123_gauss, nrnran123_iran were declared but not defined in CoreNEURON\n// nrnran123_array4x32 was declared but not used in CoreNEURON\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/string_utils.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n#include <cstring>\n\nunsigned strcat_at_pos(char* dest, unsigned start_position, char* src, unsigned src_length) {\n    memcpy(dest + start_position, src, src_length);\n    dest[start_position + src_length] = '\\0';\n    return start_position + src_length;\n}\n"
  },
  {
    "path": "coreneuron/utils/string_utils.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n\n/**\n * @file string_utils.h\n * @brief Utility functions for strings\n *\n */\n\n#pragma once\n\n/** @brief Appends a copy of the source string to the destination string.\n *\n *  Copies src_length characters of src into dest starting at start_position and writes a\n * terminating null character after them. It behaves like strcat but does not rescan dest to find\n * its end, which performs better when repeatedly appending to a very large char array.\n *\n *  @param dest Destination string, with room for start_position + src_length + 1 characters\n *  @param start_position Position of dest to start writing src\n *  @param src Source string\n *  @param src_length Length of src to append to dest\n *  @return Position in dest of the terminating null character written after src, i.e.\n * start_position + src_length\n */\nunsigned strcat_at_pos(char* dest, unsigned start_position, char* src, unsigned src_length);\n"
  },
  {
    "path": "coreneuron/utils/units.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n*/\n#pragma once\nnamespace coreneuron {\nnamespace units {\n#if CORENEURON_USE_LEGACY_UNITS == 1\nconstexpr double faraday{96485.309};\nconstexpr double gasconstant{8.3134};\n#else\n/* NMODL translated MOD files get unit constants typically from\n * share/lib/nrnunits.lib.in. But there were other source files that hardcode\n * some of the constants. Here we gather a few modern units into a single place\n * (but, unfortunately, also in nrnunits.lib.in). Legacy units cannot be\n * gathered here because they can differ slightly from place to place.\n *\n * These come from https://physics.nist.gov/cuu/Constants/index.html.\n * Termed the \"2018 CODATA recommended values\", they became available\n * on 20 May 2019 and replace the 2014 CODATA set.\n *\n * See oc/hoc_init.c, nrnoc/eion.c, nrniv/kschan.h\n */\nnamespace detail {\nconstexpr double electron_charge{1.602176634e-19};  // coulomb exact\nconstexpr double avogadro_number{6.02214076e+23};   // exact\nconstexpr double boltzmann{1.380649e-23};           // joule/K exact\n}  // namespace detail\nconstexpr double faraday{detail::electron_charge * detail::avogadro_number};  // 96485.33212...\n                                                                              // coulomb/mol\nconstexpr double gasconstant{detail::boltzmann * detail::avogadro_number};    // 8.314462618...\n                                                                              // joule/mol-K\n#endif\n}  // namespace units\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/utils.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2021-22 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#include <cstdlib>\n#include <sys/time.h>\n#include \"utils.hpp\"\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n\nnamespace coreneuron {\n[[noreturn]] void nrn_abort(int errcode) {\n#if NRNMPI\n    if (corenrn_param.mpi_enable && nrnmpi_initialized()) {\n        nrnmpi_abort(errcode);\n    }\n#endif\n    std::abort();\n}\n\ndouble nrn_wtime() {\n#if NRNMPI\n    if (corenrn_param.mpi_enable) {\n        return nrnmpi_wtime();\n    } else\n#endif\n    {\n        struct timeval time1;\n        gettimeofday(&time1, nullptr);\n        return (time1.tv_sec + time1.tv_usec / 1.e6);\n    }\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/utils.hpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2021-22 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include <cstdio>\n#include <utility>\n#include \"coreneuron/mpi/nrnmpi.h\"\n#include \"coreneuron/mpi/core/nrnmpi.hpp\"\n\nnamespace coreneuron {\n[[noreturn]] void nrn_abort(int errcode);\ntemplate <typename... Args>\nvoid nrn_fatal_error(const char* msg, Args&&... args) {\n    if (nrnmpi_myid == 0) {\n        printf(msg, std::forward<Args>(args)...);\n    }\n    nrn_abort(-1);\n}\nextern double nrn_wtime(void);\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/utils_cuda.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include <stdio.h>\n#include <cuda_runtime_api.h>\n\n// From Random123 lib\n#define CHECKLAST(MSG)                             \\\n    do {                                           \\\n        cudaError_t e = cudaGetLastError();        \\\n        if (e != cudaSuccess) {                    \\\n            fprintf(stderr,                        \\\n                    \"%s:%d: CUDA Error: %s: %s\\n\", \\\n                    __FILE__,                      \\\n                    __LINE__,                      \\\n                    (MSG),                         \\\n                    cudaGetErrorString(e));        \\\n            exit(1);                               \\\n        }                                          \\\n    } while (0)\n#define CHECKCALL(RET)                                                                             \\\n    do {                                                                                           \\\n        cudaError_t e = (RET);                                                                     \\\n        if (e != cudaSuccess) {                                                                    \\\n            fprintf(stderr, \"%s:%d: CUDA Error: %s\\n\", __FILE__, __LINE__, cudaGetErrorString(e)); \\\n            exit(1);                                                                               \\\n        }                                                                                          \\\n    } while (0)\n"
  },
  {
    "path": "coreneuron/utils/vrecitem.h",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#pragma once\n\n#include \"coreneuron/network/netcon.hpp\"\n#include \"coreneuron/utils/ivocvect.hpp\"\nnamespace coreneuron {\nclass PlayRecord;\n\n#define PlayRecordType        0\n#define VecPlayContinuousType 4\n#define PlayRecordEventType   21\n\n// used by PlayRecord subclasses that utilize discrete events\nclass PlayRecordEvent: public DiscreteEvent {\n  public:\n    PlayRecordEvent() = default;\n    virtual ~PlayRecordEvent() = default;\n    virtual void deliver(double, NetCvode*, NrnThread*) override;\n    virtual void pr(const char*, double t, NetCvode*) override;\n    virtual NrnThread* thread();\n    PlayRecord* plr_;\n    static unsigned long playrecord_send_;\n    static unsigned long playrecord_deliver_;\n    virtual int type() const override {\n        return PlayRecordEventType;\n    }\n};\n\n// common interface for Play and Record for all integration methods.\nclass PlayRecord {\n  public:\n    PlayRecord(double* pd, int ith);\n    virtual ~PlayRecord() = default;\n    virtual void play_init() {}  // called near beginning of finitialize\n    virtual void continuous(double) {\n    }  // play - every f(y, t) or res(y', y, t); record - advance_tn and initialize flag\n    virtual void deliver(double, NetCvode*) {}  // at associated DiscreteEvent\n    virtual PlayRecordEvent* event() {\n        return nullptr;\n    }\n    virtual void pr();  // print identifying info\n    virtual int type() const {\n        return PlayRecordType;\n    }\n\n    double* pd_;\n    int ith_;  // The thread index\n};\n\nclass VecPlayContinuous: public PlayRecord {\n  public:\n    VecPlayContinuous(double*, IvocVect&& yvec, IvocVect&& tvec, IvocVect* discon, int ith);\n    virtual 
~VecPlayContinuous();\n    virtual void play_init() override;\n    virtual void deliver(double tt, NetCvode*) override;\n    virtual PlayRecordEvent* event() override {\n        return e_;\n    }\n    virtual void pr() override;\n\n    void continuous(double tt) override;\n    double interpolate(double tt);\n    double interp(double th, double x0, double x1) {\n        return x0 + (x1 - x0) * th;\n    }\n    void search(double tt);\n\n    virtual int type() const override {\n        return VecPlayContinuousType;\n    }\n\n    IvocVect y_;\n    IvocVect t_;\n    IvocVect* discon_indices_;\n    std::size_t last_index_{};\n    std::size_t discon_index_{};\n    std::size_t ubound_index_{};\n\n    PlayRecordEvent* e_ = nullptr;  // Need to be a raw pointer for acc\n};\n}  // namespace coreneuron\n"
  },
  {
    "path": "coreneuron/utils/vrecord.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n\n#include <cstdio>\n\n#include \"coreneuron/nrnconf.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n#include \"coreneuron/utils/ivocvect.hpp\"\n#include \"coreneuron/network/netcvode.hpp\"\n#include \"coreneuron/utils/vrecitem.h\"\nnamespace coreneuron {\nextern NetCvode* net_cvode_instance;\n\nvoid PlayRecordEvent::deliver(double tt, NetCvode* ns, NrnThread*) {\n    plr_->deliver(tt, ns);\n}\n\nNrnThread* PlayRecordEvent::thread() {\n    return nrn_threads + plr_->ith_;\n}\n\nvoid PlayRecordEvent::pr(const char* s, double tt, NetCvode*) {\n    printf(\"%s PlayRecordEvent %.15g \", s, tt);\n    plr_->pr();\n}\n\nPlayRecord::PlayRecord(double* pd, int ith)\n    : pd_(pd)\n    , ith_(ith) {}\n\nvoid PlayRecord::pr() {\n    printf(\"PlayRecord\\n\");\n}\n\nVecPlayContinuous::VecPlayContinuous(double* pd,\n                                     IvocVect&& yvec,\n                                     IvocVect&& tvec,\n                                     IvocVect* discon,\n                                     int ith)\n    : PlayRecord(pd, ith)\n    , y_(std::move(yvec))\n    , t_(std::move(tvec))\n    , discon_indices_(discon)\n    , e_(new PlayRecordEvent{}) {\n    e_->plr_ = this;\n}\n\nVecPlayContinuous::~VecPlayContinuous() {\n    delete e_;\n}\n\nvoid VecPlayContinuous::play_init() {\n    NrnThread* nt = nrn_threads + ith_;\n    last_index_ = 0;\n    discon_index_ = 0;\n    if (discon_indices_) {\n        if (discon_indices_->size() > 0) {\n            ubound_index_ = (int) (*discon_indices_)[discon_index_++];\n            // printf(\"play_init %d %g\\n\", ubound_index_, t_->elem(ubound_index_));\n            e_->send(t_[ubound_index_], net_cvode_instance, nt);\n        } else {\n      
      ubound_index_ = t_.size() - 1;\n        }\n    } else {\n        ubound_index_ = 0;\n        e_->send(t_[ubound_index_], net_cvode_instance, nt);\n    }\n}\n\nvoid VecPlayContinuous::deliver(double tt, NetCvode* ns) {\n    NrnThread* nt = nrn_threads + ith_;\n    // printf(\"deliver %g\\n\", tt);\n    last_index_ = ubound_index_;\n    // clang-format off\n\n    nrn_pragma_acc(update device(last_index_) if (nt->compute_gpu))\n    nrn_pragma_omp(target update to(last_index_) if (nt->compute_gpu))\n    // clang-format on\n    if (discon_indices_) {\n        if (discon_index_ < discon_indices_->size()) {\n            ubound_index_ = (int) (*discon_indices_)[discon_index_++];\n            // printf(\"after deliver:send %d %g\\n\", ubound_index_, t_->elem(ubound_index_));\n            e_->send(t_[ubound_index_], ns, nt);\n        } else {\n            ubound_index_ = t_.size() - 1;\n        }\n    } else {\n        if (ubound_index_ < t_.size() - 1) {\n            ubound_index_++;\n            e_->send(t_[ubound_index_], ns, nt);\n        }\n    }\n    // clang-format off\n\n    nrn_pragma_acc(update device(ubound_index_) if (nt->compute_gpu))\n    nrn_pragma_omp(target update to(ubound_index_) if (nt->compute_gpu))\n    // clang-format on\n    continuous(tt);\n}\n\nvoid VecPlayContinuous::continuous(double tt) {\n#ifdef CORENEURON_ENABLE_GPU\n    NrnThread* nt = nrn_threads + ith_;\n#endif\n    // clang-format off\n\n    nrn_pragma_acc(kernels present(this) if(nt->compute_gpu))\n    nrn_pragma_omp(target if(nt->compute_gpu))\n    {\n        *pd_ = interpolate(tt);\n    }\n    // clang-format on\n}\n\ndouble VecPlayContinuous::interpolate(double tt) {\n    if (tt >= t_[ubound_index_]) {\n        last_index_ = ubound_index_;\n        if (last_index_ == 0) {\n            // printf(\"return last tt=%g ubound=%g y=%g\\n\", tt, t_->elem(ubound_index_),\n            // y_->elem(last_index_));\n            return y_[last_index_];\n        }\n    } else if (tt <= t_[0]) 
{\n        last_index_ = 0;\n        // printf(\"return elem(0) tt=%g t0=%g y=%g\\n\", tt, t_->elem(0), y_->elem(0));\n        return y_[0];\n    } else {\n        search(tt);\n    }\n    double x0 = y_[last_index_ - 1];\n    double x1 = y_[last_index_];\n    double t0 = t_[last_index_ - 1];\n    double t1 = t_[last_index_];\n    // printf(\"IvocVectRecorder::continuous tt=%g t0=%g t1=%g theta=%g x0=%g x1=%g\\n\", tt, t0, t1,\n    // (tt - t0)/(t1 - t0), x0, x1);\n    if (t0 == t1) {\n        return (x0 + x1) / 2.;\n    }\n    return interp((tt - t0) / (t1 - t0), x0, x1);\n}\n\nvoid VecPlayContinuous::search(double tt) {\n    //\tassert (tt > t_->elem(0) && tt < t_->elem(t_->size() - 1))\n    while (tt < t_[last_index_]) {\n        --last_index_;\n    }\n    while (tt >= t_[last_index_]) {\n        ++last_index_;\n    }\n}\n\nvoid VecPlayContinuous::pr() {\n    printf(\"VecPlayContinuous \");\n    // printf(\"%s.x[%d]\\n\", hoc_object_name(y_->obj_), last_index_);\n}\n}  // namespace coreneuron\n"
  },
  {
    "path": "docs/Doxyfile.in",
    "content": "# Doxyfile 1.8.15\n\n# This file describes the settings to be used by the documentation system\n# doxygen (www.doxygen.org) for a project.\n#\n# All text after a double hash (##) is considered a comment and is placed in\n# front of the TAG it is preceding.\n#\n# All text after a single hash (#) is considered a comment and will be ignored.\n# The format is:\n# TAG = value [value, ...]\n# For lists, items can also be appended using:\n# TAG += value [value, ...]\n# Values that contain spaces should be placed between quotes (\\\" \\\").\n\n#---------------------------------------------------------------------------\n# Project related configuration options\n#---------------------------------------------------------------------------\n\n# This tag specifies the encoding used for all characters in the configuration\n# file that follow. The default is UTF-8 which is also the encoding used for all\n# text before the first occurrence of this tag. Doxygen uses libiconv (or the\n# iconv built into libc) for the transcoding. See\n# https://www.gnu.org/software/libiconv/ for the list of possible encodings.\n# The default value is: UTF-8.\n\nDOXYFILE_ENCODING      = UTF-8\n\n# The PROJECT_NAME tag is a single word (or a sequence of words surrounded by\n# double-quotes, unless you are using Doxywizard) that should identify the\n# project for which the documentation is generated. This name is used in the\n# title of most generated pages and in a few other places.\n# The default value is: My Project.\n\nPROJECT_NAME           = \"CoreNEURON\"\n\n# The PROJECT_NUMBER tag can be used to enter a project or revision number. This\n# could be handy for archiving the generated documentation or if some version\n# control system is used.\n\nPROJECT_NUMBER         =\n\n# Using the PROJECT_BRIEF tag one can provide an optional one line description\n# for a project that appears at the top of each page and should give viewer a\n# quick idea about the purpose of the project. 
Keep the description short.\n\nPROJECT_BRIEF          =\n\n# With the PROJECT_LOGO tag one can specify a logo or an icon that is included\n# in the documentation. The maximum height of the logo should not exceed 55\n# pixels and the maximum width should not exceed 200 pixels. Doxygen will copy\n# the logo to the output directory.\n\n#PROJECT_LOGO           = @PROJECT_SOURCE_DIR@/docs/logo.png\n\n# The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute) path\n# into which the generated documentation will be written. If a relative path is\n# entered, it will be relative to the location where doxygen was started. If\n# left blank the current directory will be used.\n\nOUTPUT_DIRECTORY       = @CMAKE_CURRENT_BINARY_DIR@/docs\n\n# If the CREATE_SUBDIRS tag is set to YES then doxygen will create 4096 sub-\n# directories (in 2 levels) under the output directory of each output format and\n# will distribute the generated files over these directories. Enabling this\n# option can be useful when feeding doxygen a huge amount of source files, where\n# putting all generated files in the same directory would otherwise causes\n# performance problems for the file system.\n# The default value is: NO.\n\nCREATE_SUBDIRS         = NO\n\n# If the ALLOW_UNICODE_NAMES tag is set to YES, doxygen will allow non-ASCII\n# characters to appear in the names of generated files. If set to NO, non-ASCII\n# characters will be escaped, for example _xE3_x81_x84 will be used for Unicode\n# U+3044.\n# The default value is: NO.\n\nALLOW_UNICODE_NAMES    = NO\n\n# The OUTPUT_LANGUAGE tag is used to specify the language in which all\n# documentation generated by doxygen is written. 
Doxygen will use this\n# information to generate all constant output in the proper language.\n# Possible values are: Afrikaans, Arabic, Armenian, Brazilian, Catalan, Chinese,\n# Chinese-Traditional, Croatian, Czech, Danish, Dutch, English (United States),\n# Esperanto, Farsi (Persian), Finnish, French, German, Greek, Hungarian,\n# Indonesian, Italian, Japanese, Japanese-en (Japanese with English messages),\n# Korean, Korean-en (Korean with English messages), Latvian, Lithuanian,\n# Macedonian, Norwegian, Persian (Farsi), Polish, Portuguese, Romanian, Russian,\n# Serbian, Serbian-Cyrillic, Slovak, Slovene, Spanish, Swedish, Turkish,\n# Ukrainian and Vietnamese.\n# The default value is: English.\n\nOUTPUT_LANGUAGE        = English\n\n# The OUTPUT_TEXT_DIRECTION tag is used to specify the direction in which all\n# documentation generated by doxygen is written. Doxygen will use this\n# information to generate all generated output in the proper direction.\n# Possible values are: None, LTR, RTL and Context.\n# The default value is: None.\n\nOUTPUT_TEXT_DIRECTION  = None\n\n# If the BRIEF_MEMBER_DESC tag is set to YES, doxygen will include brief member\n# descriptions after the members that are listed in the file and class\n# documentation (similar to Javadoc). Set to NO to disable this.\n# The default value is: YES.\n\nBRIEF_MEMBER_DESC      = YES\n\n# If the REPEAT_BRIEF tag is set to YES, doxygen will prepend the brief\n# description of a member or function before the detailed description\n#\n# Note: If both HIDE_UNDOC_MEMBERS and BRIEF_MEMBER_DESC are set to NO, the\n# brief descriptions will be completely suppressed.\n# The default value is: YES.\n\nREPEAT_BRIEF           = YES\n\n# This tag implements a quasi-intelligent brief description abbreviator that is\n# used to form the text in various listings. 
Each string in this list, if found\n# as the leading text of the brief description, will be stripped from the text\n# and the result, after processing the whole list, is used as the annotated\n# text. Otherwise, the brief description is used as-is. If left blank, the\n# following values are used ($name is automatically replaced with the name of\n# the entity):The $name class, The $name widget, The $name file, is, provides,\n# specifies, contains, represents, a, an and the.\n\nABBREVIATE_BRIEF       = \"The $name class\" \\\n                         \"The $name widget\" \\\n                         \"The $name file\" \\\n                         is \\\n                         provides \\\n                         specifies \\\n                         contains \\\n                         represents \\\n                         a \\\n                         an \\\n                         the\n\n# If the ALWAYS_DETAILED_SEC and REPEAT_BRIEF tags are both set to YES then\n# doxygen will generate a detailed section even if there is only a brief\n# description.\n# The default value is: NO.\n\nALWAYS_DETAILED_SEC    = NO\n\n# If the INLINE_INHERITED_MEMB tag is set to YES, doxygen will show all\n# inherited members of a class in the documentation of that class as if those\n# members were ordinary class members. Constructors, destructors and assignment\n# operators of the base classes will not be shown.\n# The default value is: NO.\n\nINLINE_INHERITED_MEMB  = NO\n\n# If the FULL_PATH_NAMES tag is set to YES, doxygen will prepend the full path\n# before files name in the file list and in the header files. If set to NO the\n# shortest path that makes the file name unique will be used\n# The default value is: YES.\n\nFULL_PATH_NAMES        = YES\n\n# The STRIP_FROM_PATH tag can be used to strip a user-defined part of the path.\n# Stripping is only done if one of the specified strings matches the left-hand\n# part of the path. 
The tag can be used to show relative paths in the file list.\n# If left blank the directory from which doxygen is run is used as the path to\n# strip.\n#\n# Note that you can specify absolute paths here, but also relative paths, which\n# will be relative from the directory where doxygen is started.\n# This tag requires that the tag FULL_PATH_NAMES is set to YES.\n\nSTRIP_FROM_PATH        =\n\n# The STRIP_FROM_INC_PATH tag can be used to strip a user-defined part of the\n# path mentioned in the documentation of a class, which tells the reader which\n# header file to include in order to use a class. If left blank only the name of\n# the header file containing the class definition is used. Otherwise one should\n# specify the list of include paths that are normally passed to the compiler\n# using the -I flag.\n\nSTRIP_FROM_INC_PATH    =\n\n# If the SHORT_NAMES tag is set to YES, doxygen will generate much shorter (but\n# less readable) file names. This can be useful is your file systems doesn't\n# support long names like on DOS, Mac, or CD-ROM.\n# The default value is: NO.\n\nSHORT_NAMES            = NO\n\n# If the JAVADOC_AUTOBRIEF tag is set to YES then doxygen will interpret the\n# first line (until the first dot) of a Javadoc-style comment as the brief\n# description. If set to NO, the Javadoc-style will behave just like regular Qt-\n# style comments (thus requiring an explicit @brief command for a brief\n# description.)\n# The default value is: NO.\n\nJAVADOC_AUTOBRIEF      = YES\n\n# If the QT_AUTOBRIEF tag is set to YES then doxygen will interpret the first\n# line (until the first dot) of a Qt-style comment as the brief description. If\n# set to NO, the Qt-style will behave just like regular Qt-style comments (thus\n# requiring an explicit \\brief command for a brief description.)\n# The default value is: NO.\n\nQT_AUTOBRIEF           = NO\n\n# The MULTILINE_CPP_IS_BRIEF tag can be set to YES to make doxygen treat a\n# multi-line C++ special comment block (i.e. 
a block of //! or /// comments) as\n# a brief description. This used to be the default behavior. The new default is\n# to treat a multi-line C++ comment block as a detailed description. Set this\n# tag to YES if you prefer the old behavior instead.\n#\n# Note that setting this tag to YES also means that rational rose comments are\n# not recognized any more.\n# The default value is: NO.\n\nMULTILINE_CPP_IS_BRIEF = NO\n\n# If the INHERIT_DOCS tag is set to YES then an undocumented member inherits the\n# documentation from any documented member that it re-implements.\n# The default value is: YES.\n\nINHERIT_DOCS           = YES\n\n# If the SEPARATE_MEMBER_PAGES tag is set to YES then doxygen will produce a new\n# page for each member. If set to NO, the documentation of a member will be part\n# of the file/class/namespace that contains it.\n# The default value is: NO.\n\nSEPARATE_MEMBER_PAGES  = NO\n\n# The TAB_SIZE tag can be used to set the number of spaces in a tab. Doxygen\n# uses this value to replace tabs by spaces in code fragments.\n# Minimum value: 1, maximum value: 16, default value: 4.\n\nTAB_SIZE               = 4\n\n# This tag can be used to specify a number of aliases that act as commands in\n# the documentation. An alias has the form:\n# name=value\n# For example adding\n# \"sideeffect=@par Side Effects:\\n\"\n# will allow you to put the command \\sideeffect (or @sideeffect) in the\n# documentation, which will result in a user-defined paragraph with heading\n# \"Side Effects:\". You can put \\n's in the value part of an alias to insert\n# newlines (in the resulting output). 
You can put ^^ in the value part of an\n# alias to insert a newline as if a physical newline was in the original file.\n# When you need a literal { or } or , in the value part of an alias you have to\n# escape them by means of a backslash (\\), this can lead to conflicts with the\n# commands \\{ and \\} for these it is advised to use the version @{ and @} or use\n# a double escape (\\\\{ and \\\\})\n\nALIASES                =\n\n# This tag can be used to specify a number of word-keyword mappings (TCL only).\n# A mapping has the form \"name=value\". For example adding \"class=itcl::class\"\n# will allow you to use the command class in the itcl::class meaning.\n\nTCL_SUBST              =\n\n# Set the OPTIMIZE_OUTPUT_FOR_C tag to YES if your project consists of C sources\n# only. Doxygen will then generate output that is more tailored for C. For\n# instance, some of the names that are used will be different. The list of all\n# members will be omitted, etc.\n# The default value is: NO.\n\nOPTIMIZE_OUTPUT_FOR_C  = NO\n\n# Set the OPTIMIZE_OUTPUT_JAVA tag to YES if your project consists of Java or\n# Python sources only. Doxygen will then generate output that is more tailored\n# for that language. For instance, namespaces will be presented as packages,\n# qualified scopes will look different, etc.\n# The default value is: NO.\n\nOPTIMIZE_OUTPUT_JAVA   = NO\n\n# Set the OPTIMIZE_FOR_FORTRAN tag to YES if your project consists of Fortran\n# sources. Doxygen will then generate output that is tailored for Fortran.\n# The default value is: NO.\n\nOPTIMIZE_FOR_FORTRAN   = NO\n\n# Set the OPTIMIZE_OUTPUT_VHDL tag to YES if your project consists of VHDL\n# sources. Doxygen will then generate output that is tailored for VHDL.\n# The default value is: NO.\n\nOPTIMIZE_OUTPUT_VHDL   = NO\n\n# Set the OPTIMIZE_OUTPUT_SLICE tag to YES if your project consists of Slice\n# sources only. Doxygen will then generate output that is more tailored for that\n# language. 
For instance, namespaces will be presented as modules, types will be\n# separated into more groups, etc.\n# The default value is: NO.\n\nOPTIMIZE_OUTPUT_SLICE  = NO\n\n# Doxygen selects the parser to use depending on the extension of the files it\n# parses. With this tag you can assign which parser to use for a given\n# extension. Doxygen has a built-in mapping, but you can override or extend it\n# using this tag. The format is ext=language, where ext is a file extension, and\n# language is one of the parsers supported by doxygen: IDL, Java, Javascript,\n# Csharp (C#), C, C++, D, PHP, md (Markdown), Objective-C, Python, Slice,\n# Fortran (fixed format Fortran: FortranFixed, free formatted Fortran:\n# FortranFree, unknown formatted Fortran: Fortran. In the latter case the parser\n# tries to guess whether the code is fixed or free formatted code, this is the\n# default for Fortran type files), VHDL, tcl. For instance to make doxygen treat\n# .inc files as Fortran files (default is PHP), and .f files as C (default is\n# Fortran), use: inc=Fortran f=C.\n#\n# Note: For files without extension you can use no_extension as a placeholder.\n#\n# Note that for custom extensions you also need to set FILE_PATTERNS otherwise\n# the files are not read by doxygen.\n\nEXTENSION_MAPPING      = .yaml=Python\n\n# If the MARKDOWN_SUPPORT tag is enabled then doxygen pre-processes all comments\n# according to the Markdown format, which allows for more readable\n# documentation. See https://daringfireball.net/projects/markdown/ for details.\n# The output of markdown processing is further processed by doxygen, so you can\n# mix doxygen, HTML, and XML commands with Markdown formatting. 
Disable only in\n# case of backward compatibility issues.\n# The default value is: YES.\n\nMARKDOWN_SUPPORT       = YES\n\n# When the TOC_INCLUDE_HEADINGS tag is set to a non-zero value, all headings up\n# to that level are automatically included in the table of contents, even if\n# they do not have an id attribute.\n# Note: This feature currently applies only to Markdown headings.\n# Minimum value: 0, maximum value: 99, default value: 0.\n# This tag requires that the tag MARKDOWN_SUPPORT is set to YES.\n\nTOC_INCLUDE_HEADINGS   = 0\n\n# When enabled doxygen tries to link words that correspond to documented\n# classes, or namespaces to their corresponding documentation. Such a link can\n# be prevented in individual cases by putting a % sign in front of the word or\n# globally by setting AUTOLINK_SUPPORT to NO.\n# The default value is: YES.\n\nAUTOLINK_SUPPORT       = YES\n\n# If you use STL classes (i.e. std::string, std::vector, etc.) but do not want\n# to include (a tag file for) the STL sources as input, then you should set this\n# tag to YES in order to let doxygen match function declarations and\n# definitions whose arguments contain STL classes (e.g. func(std::string);\n# versus func(std::string) {}). This also makes the inheritance and collaboration\n# diagrams that involve STL classes more complete and accurate.\n# The default value is: NO.\n\nBUILTIN_STL_SUPPORT    = YES\n\n# If you use Microsoft's C++/CLI language, you should set this option to YES to\n# enable parsing support.\n# The default value is: NO.\n\nCPP_CLI_SUPPORT        = NO\n\n# Set the SIP_SUPPORT tag to YES if your project consists of sip (see:\n# https://www.riverbankcomputing.com/software/sip/intro) sources only. 
Doxygen\n# will parse them like normal C++ but will assume all classes use public instead\n# of private inheritance when no explicit protection keyword is present.\n# The default value is: NO.\n\nSIP_SUPPORT            = NO\n\n# For Microsoft's IDL there are propget and propput attributes to indicate\n# getter and setter methods for a property. Setting this option to YES will make\n# doxygen replace the get and set methods by a property in the documentation.\n# This will only work if the methods are indeed getting or setting a simple\n# type. If this is not the case, or you want to show the methods anyway, you\n# should set this option to NO.\n# The default value is: YES.\n\nIDL_PROPERTY_SUPPORT   = YES\n\n# If member grouping is used in the documentation and the DISTRIBUTE_GROUP_DOC\n# tag is set to YES then doxygen will reuse the documentation of the first\n# member in the group (if any) for the other members of the group. By default\n# all members of a group must be documented explicitly.\n# The default value is: NO.\n\nDISTRIBUTE_GROUP_DOC   = NO\n\n# If one adds a struct or class to a group and this option is enabled, then also\n# any nested class or struct is added to the same group. By default this option\n# is disabled and one has to add nested compounds explicitly via \\ingroup.\n# The default value is: NO.\n\nGROUP_NESTED_COMPOUNDS = NO\n\n# Set the SUBGROUPING tag to YES to allow class member groups of the same type\n# (for instance a group of public functions) to be put as a subgroup of that\n# type (e.g. under the Public Functions section). Set it to NO to prevent\n# subgrouping. Alternatively, this can be done per class using the\n# \\nosubgrouping command.\n# The default value is: YES.\n\nSUBGROUPING            = YES\n\n# When the INLINE_GROUPED_CLASSES tag is set to YES, classes, structs and unions\n# are shown inside the group in which they are included (e.g. 
using \\ingroup)\n# instead of on a separate page (for HTML and Man pages) or section (for LaTeX\n# and RTF).\n#\n# Note that this feature does not work in combination with\n# SEPARATE_MEMBER_PAGES.\n# The default value is: NO.\n\nINLINE_GROUPED_CLASSES = NO\n\n# When the INLINE_SIMPLE_STRUCTS tag is set to YES, structs, classes, and unions\n# with only public data fields or simple typedef fields will be shown inline in\n# the documentation of the scope in which they are defined (i.e. file,\n# namespace, or group documentation), provided this scope is documented. If set\n# to NO, structs, classes, and unions are shown on a separate page (for HTML and\n# Man pages) or section (for LaTeX and RTF).\n# The default value is: NO.\n\nINLINE_SIMPLE_STRUCTS  = NO\n\n# When TYPEDEF_HIDES_STRUCT tag is enabled, a typedef of a struct, union, or\n# enum is documented as struct, union, or enum with the name of the typedef. So\n# typedef struct TypeS {} TypeT, will appear in the documentation as a struct\n# with name TypeT. When disabled the typedef will appear as a member of a file,\n# namespace, or class. And the struct will be named TypeS. This can typically be\n# useful for C code in case the coding convention dictates that all compound\n# types are typedef'ed and only the typedef is referenced, never the tag name.\n# The default value is: NO.\n\nTYPEDEF_HIDES_STRUCT   = NO\n\n# The size of the symbol lookup cache can be set using LOOKUP_CACHE_SIZE. This\n# cache is used to resolve symbols given their name and scope. Since this can be\n# an expensive process and often the same symbol appears multiple times in the\n# code, doxygen keeps a cache of pre-resolved symbols. If the cache is too small\n# doxygen will become slower. If the cache is too large, memory is wasted. The\n# cache size is given by this formula: 2^(16+LOOKUP_CACHE_SIZE). The valid range\n# is 0..9, the default is 0, corresponding to a cache size of 2^16=65536\n# symbols. 
At the end of a run doxygen will report the cache usage and suggest\n# the optimal cache size from a speed point of view.\n# Minimum value: 0, maximum value: 9, default value: 0.\n\nLOOKUP_CACHE_SIZE      = 0\n\n#---------------------------------------------------------------------------\n# Build related configuration options\n#---------------------------------------------------------------------------\n\n# If the EXTRACT_ALL tag is set to YES, doxygen will assume all entities in\n# documentation are documented, even if no documentation was available. Private\n# class members and static file members will be hidden unless the\n# EXTRACT_PRIVATE respectively EXTRACT_STATIC tags are set to YES.\n# Note: This will also disable the warnings about undocumented members that are\n# normally produced when WARNINGS is set to YES.\n# The default value is: NO.\n\nEXTRACT_ALL            = YES\n\n# If the EXTRACT_PRIVATE tag is set to YES, all private members of a class will\n# be included in the documentation.\n# The default value is: NO.\n\nEXTRACT_PRIVATE        = YES\n\n# If the EXTRACT_PACKAGE tag is set to YES, all members with package or internal\n# scope will be included in the documentation.\n# The default value is: NO.\n\nEXTRACT_PACKAGE        = YES\n\n# If the EXTRACT_STATIC tag is set to YES, all static members of a file will be\n# included in the documentation.\n# The default value is: NO.\n\nEXTRACT_STATIC         = YES\n\n# If the EXTRACT_LOCAL_CLASSES tag is set to YES, classes (and structs) defined\n# locally in source files will be included in the documentation. If set to NO,\n# only classes defined in header files are included. Does not have any effect\n# for Java sources.\n# The default value is: YES.\n\nEXTRACT_LOCAL_CLASSES  = YES\n\n# This flag is only useful for Objective-C code. If set to YES, local methods,\n# which are defined in the implementation section but not in the interface are\n# included in the documentation. 
If set to NO, only methods in the interface are\n# included.\n# The default value is: NO.\n\nEXTRACT_LOCAL_METHODS  = NO\n\n# If this flag is set to YES, the members of anonymous namespaces will be\n# extracted and appear in the documentation as a namespace called\n# 'anonymous_namespace{file}', where file will be replaced with the base name of\n# the file that contains the anonymous namespace. By default anonymous namespace\n# are hidden.\n# The default value is: NO.\n\nEXTRACT_ANON_NSPACES   = NO\n\n# If the HIDE_UNDOC_MEMBERS tag is set to YES, doxygen will hide all\n# undocumented members inside documented classes or files. If set to NO these\n# members will be included in the various overviews, but no documentation\n# section is generated. This option has no effect if EXTRACT_ALL is enabled.\n# The default value is: NO.\n\nHIDE_UNDOC_MEMBERS     = NO\n\n# If the HIDE_UNDOC_CLASSES tag is set to YES, doxygen will hide all\n# undocumented classes that are normally visible in the class hierarchy. If set\n# to NO, these classes will be included in the various overviews. This option\n# has no effect if EXTRACT_ALL is enabled.\n# The default value is: NO.\n\nHIDE_UNDOC_CLASSES     = NO\n\n# If the HIDE_FRIEND_COMPOUNDS tag is set to YES, doxygen will hide all friend\n# (class|struct|union) declarations. If set to NO, these declarations will be\n# included in the documentation.\n# The default value is: NO.\n\nHIDE_FRIEND_COMPOUNDS  = NO\n\n# If the HIDE_IN_BODY_DOCS tag is set to YES, doxygen will hide any\n# documentation blocks found inside the body of a function. If set to NO, these\n# blocks will be appended to the function's detailed documentation block.\n# The default value is: NO.\n\nHIDE_IN_BODY_DOCS      = NO\n\n# The INTERNAL_DOCS tag determines if documentation that is typed after a\n# \\internal command is included. If the tag is set to NO then the documentation\n# will be excluded. 
Set it to YES to include the internal documentation.\n# The default value is: NO.\n\nINTERNAL_DOCS          = NO\n\n# If the CASE_SENSE_NAMES tag is set to NO then doxygen will only generate file\n# names in lower-case letters. If set to YES, upper-case letters are also\n# allowed. This is useful if you have classes or files whose names only differ\n# in case and if your file system supports case sensitive file names. Windows\n# and Mac users are advised to set this option to NO.\n# The default value is: system dependent.\n\nCASE_SENSE_NAMES       = NO\n\n# If the HIDE_SCOPE_NAMES tag is set to NO then doxygen will show members with\n# their full class and namespace scopes in the documentation. If set to YES, the\n# scope will be hidden.\n# The default value is: NO.\n\nHIDE_SCOPE_NAMES       = NO\n\n# If the HIDE_COMPOUND_REFERENCE tag is set to NO (default) then doxygen will\n# append additional text to a page's title, such as Class Reference. If set to\n# YES the compound reference will be hidden.\n# The default value is: NO.\n\nHIDE_COMPOUND_REFERENCE= NO\n\n# If the SHOW_INCLUDE_FILES tag is set to YES then doxygen will put a list of\n# the files that are included by a file in the documentation of that file.\n# The default value is: YES.\n\nSHOW_INCLUDE_FILES     = YES\n\n# If the SHOW_GROUPED_MEMB_INC tag is set to YES then Doxygen will add for each\n# grouped member an include statement to the documentation, telling the reader\n# which file to include in order to use the member.\n# The default value is: NO.\n\nSHOW_GROUPED_MEMB_INC  = NO\n\n# If the FORCE_LOCAL_INCLUDES tag is set to YES then doxygen will list include\n# files with double quotes in the documentation rather than with sharp brackets.\n# The default value is: NO.\n\nFORCE_LOCAL_INCLUDES   = NO\n\n# If the INLINE_INFO tag is set to YES then a tag [inline] is inserted in the\n# documentation for inline members.\n# The default value is: YES.\n\nINLINE_INFO            = YES\n\n# If the 
SORT_MEMBER_DOCS tag is set to YES then doxygen will sort the\n# (detailed) documentation of file and class members alphabetically by member\n# name. If set to NO, the members will appear in declaration order.\n# The default value is: YES.\n\nSORT_MEMBER_DOCS       = YES\n\n# If the SORT_BRIEF_DOCS tag is set to YES then doxygen will sort the brief\n# descriptions of file, namespace and class members alphabetically by member\n# name. If set to NO, the members will appear in declaration order. Note that\n# this will also influence the order of the classes in the class list.\n# The default value is: NO.\n\nSORT_BRIEF_DOCS        = NO\n\n# If the SORT_MEMBERS_CTORS_1ST tag is set to YES then doxygen will sort the\n# (brief and detailed) documentation of class members so that constructors and\n# destructors are listed first. If set to NO the constructors will appear in the\n# respective orders defined by SORT_BRIEF_DOCS and SORT_MEMBER_DOCS.\n# Note: If SORT_BRIEF_DOCS is set to NO this option is ignored for sorting brief\n# member documentation.\n# Note: If SORT_MEMBER_DOCS is set to NO this option is ignored for sorting\n# detailed member documentation.\n# The default value is: NO.\n\nSORT_MEMBERS_CTORS_1ST = NO\n\n# If the SORT_GROUP_NAMES tag is set to YES then doxygen will sort the hierarchy\n# of group names into alphabetical order. If set to NO the group names will\n# appear in their defined order.\n# The default value is: NO.\n\nSORT_GROUP_NAMES       = NO\n\n# If the SORT_BY_SCOPE_NAME tag is set to YES, the class list will be sorted by\n# fully-qualified names, including namespaces. 
If set to NO, the class list will\n# be sorted only by class name, not including the namespace part.\n# Note: This option is not very useful if HIDE_SCOPE_NAMES is set to YES.\n# Note: This option applies only to the class list, not to the alphabetical\n# list.\n# The default value is: NO.\n\nSORT_BY_SCOPE_NAME     = NO\n\n# If the STRICT_PROTO_MATCHING option is enabled and doxygen fails to do proper\n# type resolution of all parameters of a function it will reject a match between\n# the prototype and the implementation of a member function even if there is\n# only one candidate or it is obvious which candidate to choose by doing a\n# simple string match. By disabling STRICT_PROTO_MATCHING doxygen will still\n# accept a match between prototype and implementation in such cases.\n# The default value is: NO.\n\nSTRICT_PROTO_MATCHING  = NO\n\n# The GENERATE_TODOLIST tag can be used to enable (YES) or disable (NO) the todo\n# list. This list is created by putting \\todo commands in the documentation.\n# The default value is: YES.\n\nGENERATE_TODOLIST      = YES\n\n# The GENERATE_TESTLIST tag can be used to enable (YES) or disable (NO) the test\n# list. This list is created by putting \\test commands in the documentation.\n# The default value is: YES.\n\nGENERATE_TESTLIST      = YES\n\n# The GENERATE_BUGLIST tag can be used to enable (YES) or disable (NO) the bug\n# list. This list is created by putting \\bug commands in the documentation.\n# The default value is: YES.\n\nGENERATE_BUGLIST       = YES\n\n# The GENERATE_DEPRECATEDLIST tag can be used to enable (YES) or disable (NO)\n# the deprecated list. This list is created by putting \\deprecated commands in\n# the documentation.\n# The default value is: YES.\n\nGENERATE_DEPRECATEDLIST= YES\n\n# The ENABLED_SECTIONS tag can be used to enable conditional documentation\n# sections, marked by \\if <section_label> ... \\endif and \\cond <section_label>\n# ... 
\\endcond blocks.\n\nENABLED_SECTIONS       =\n\n# The MAX_INITIALIZER_LINES tag determines the maximum number of lines that the\n# initial value of a variable or macro / define can have for it to appear in the\n# documentation. If the initializer consists of more lines than specified here\n# it will be hidden. Use a value of 0 to hide initializers completely. The\n# appearance of the value of individual variables and macros / defines can be\n# controlled using \\showinitializer or \\hideinitializer command in the\n# documentation regardless of this setting.\n# Minimum value: 0, maximum value: 10000, default value: 30.\n\nMAX_INITIALIZER_LINES  = 30\n\n# Set the SHOW_USED_FILES tag to NO to disable the list of files generated at\n# the bottom of the documentation of classes and structs. If set to YES, the\n# list will mention the files that were used to generate the documentation.\n# The default value is: YES.\n\nSHOW_USED_FILES        = YES\n\n# Set the SHOW_FILES tag to NO to disable the generation of the Files page. This\n# will remove the Files entry from the Quick Index and from the Folder Tree View\n# (if specified).\n# The default value is: YES.\n\nSHOW_FILES             = YES\n\n# Set the SHOW_NAMESPACES tag to NO to disable the generation of the Namespaces\n# page. This will remove the Namespaces entry from the Quick Index and from the\n# Folder Tree View (if specified).\n# The default value is: YES.\n\nSHOW_NAMESPACES        = YES\n\n# The FILE_VERSION_FILTER tag can be used to specify a program or script that\n# doxygen should invoke to get the current version for each file (typically from\n# the version control system). Doxygen will invoke the program by executing (via\n# popen()) the command command input-file, where command is the value of the\n# FILE_VERSION_FILTER tag, and input-file is the name of an input file provided\n# by doxygen. Whatever the program writes to standard output is used as the file\n# version. 
For an example see the documentation.\n\nFILE_VERSION_FILTER    =\n\n# The LAYOUT_FILE tag can be used to specify a layout file which will be parsed\n# by doxygen. The layout file controls the global structure of the generated\n# output files in an output format independent way. To create the layout file\n# that represents doxygen's defaults, run doxygen with the -l option. You can\n# optionally specify a file name after the option, if omitted DoxygenLayout.xml\n# will be used as the name of the layout file.\n#\n# Note that if you run doxygen from a directory containing a file called\n# DoxygenLayout.xml, doxygen will parse it automatically even if the LAYOUT_FILE\n# tag is left empty.\n\nLAYOUT_FILE            =  @PROJECT_SOURCE_DIR@/docs/DoxygenLayout.xml\n\n# The CITE_BIB_FILES tag can be used to specify one or more bib files containing\n# the reference definitions. This must be a list of .bib files. The .bib\n# extension is automatically appended if omitted. This requires the bibtex tool\n# to be installed. See also https://en.wikipedia.org/wiki/BibTeX for more info.\n# For LaTeX the style of the bibliography can be controlled using\n# LATEX_BIB_STYLE. To use this feature you need bibtex and perl available in the\n# search path. See also \\cite for info how to create references.\n\nCITE_BIB_FILES         =\n\n#---------------------------------------------------------------------------\n# Configuration options related to warning and progress messages\n#---------------------------------------------------------------------------\n\n# The QUIET tag can be used to turn on/off the messages that are generated to\n# standard output by doxygen. If QUIET is set to YES this implies that the\n# messages are off.\n# The default value is: NO.\n\nQUIET                  = YES\n\n# The WARNINGS tag can be used to turn on/off the warning messages that are\n# generated to standard error (stderr) by doxygen. 
If WARNINGS is set to YES\n# this implies that the warnings are on.\n#\n# Tip: Turn warnings on while writing the documentation.\n# The default value is: YES.\n\nWARNINGS               = YES\n\n# If the WARN_IF_UNDOCUMENTED tag is set to YES then doxygen will generate\n# warnings for undocumented members. If EXTRACT_ALL is set to YES then this flag\n# will automatically be disabled.\n# The default value is: YES.\n\nWARN_IF_UNDOCUMENTED   = YES\n\n# If the WARN_IF_DOC_ERROR tag is set to YES, doxygen will generate warnings for\n# potential errors in the documentation, such as not documenting some parameters\n# in a documented function, or documenting parameters that don't exist or using\n# markup commands wrongly.\n# The default value is: YES.\n\nWARN_IF_DOC_ERROR      = YES\n\n# This WARN_NO_PARAMDOC option can be enabled to get warnings for functions that\n# are documented, but have no documentation for their parameters or return\n# value. If set to NO, doxygen will only warn about wrong or incomplete\n# parameter documentation, but not about the absence of documentation. If\n# EXTRACT_ALL is set to YES then this flag will automatically be disabled.\n# The default value is: NO.\n\nWARN_NO_PARAMDOC       = NO\n\n# If the WARN_AS_ERROR tag is set to YES then doxygen will immediately stop when\n# a warning is encountered.\n# The default value is: NO.\n\nWARN_AS_ERROR          = NO\n\n# The WARN_FORMAT tag determines the format of the warning messages that doxygen\n# can produce. The string should contain the $file, $line, and $text tags, which\n# will be replaced by the file and line number from which the warning originated\n# and the warning text. 
Optionally the format may contain $version, which will\n# be replaced by the version of the file (if it could be obtained via\n# FILE_VERSION_FILTER)\n# The default value is: $file:$line: $text.\n\nWARN_FORMAT            = \"$file:$line: $text\"\n\n# The WARN_LOGFILE tag can be used to specify a file to which warning and error\n# messages should be written. If left blank the output is written to standard\n# error (stderr).\n\nWARN_LOGFILE           =\n\n#---------------------------------------------------------------------------\n# Configuration options related to the input files\n#---------------------------------------------------------------------------\n\n# The INPUT tag is used to specify the files and/or directories that contain\n# documented source files. You may enter file names like myfile.cpp or\n# directories like /usr/src/myproject. Separate the files or directories with\n# spaces. See also FILE_PATTERNS and EXTENSION_MAPPING\n# Note: If this tag is empty the current directory is searched.\n\nINPUT                  = @PROJECT_SOURCE_DIR@/coreneuron\nINPUT                 += @PROJECT_SOURCE_DIR@/tests\n\n# This tag can be used to specify the character encoding of the source files\n# that doxygen parses. Internally doxygen uses the UTF-8 encoding. Doxygen uses\n# libiconv (or the iconv built into libc) for the transcoding. 
See the libiconv\n# documentation (see: https://www.gnu.org/software/libiconv/) for the list of\n# possible encodings.\n# The default value is: UTF-8.\n\nINPUT_ENCODING         = UTF-8\n\n# If the value of the INPUT tag contains directories, you can use the\n# FILE_PATTERNS tag to specify one or more wildcard patterns (like *.cpp and\n# *.h) to filter out the source-files in the directories.\n#\n# Note that for custom extensions or not directly supported extensions you also\n# need to set EXTENSION_MAPPING for the extension otherwise the files are not\n# read by doxygen.\n#\n# If left blank the following patterns are tested:*.c, *.cc, *.cxx, *.cpp,\n# *.c++, *.java, *.ii, *.ixx, *.ipp, *.i++, *.inl, *.idl, *.ddl, *.odl, *.h,\n# *.hh, *.hxx, *.hpp, *.h++, *.cs, *.d, *.php, *.php4, *.php5, *.phtml, *.inc,\n# *.m, *.markdown, *.md, *.mm, *.dox, *.py, *.pyw, *.f90, *.f95, *.f03, *.f08,\n# *.f, *.for, *.tcl, *.vhd, *.vhdl, *.ucf, *.qsf and *.ice.\n\nFILE_PATTERNS          = *.c \\\n                         *.cc \\\n                         *.cxx \\\n                         *.cpp \\\n                         *.c++ \\\n                         *.ipp \\\n                         *.h \\\n                         *.hh \\\n                         *.hxx \\\n                         *.hpp \\\n                         *.h++ \\\n                         *.markdown \\\n                         *.md \\\n                         *.mm \\\n                         *.dox \\\n                         *.yaml \\\n\n# The RECURSIVE tag can be used to specify whether or not subdirectories should\n# be searched for input files as well.\n# The default value is: NO.\n\nRECURSIVE              = YES\n\n# The EXCLUDE tag can be used to specify files and/or directories that should be\n# excluded from the INPUT source files. 
This way you can easily exclude a\n# subdirectory from a directory tree whose root is specified with the INPUT tag.\n#\n# Note that relative paths are relative to the directory from which doxygen is\n# run.\n\n\n\n# The EXCLUDE_SYMLINKS tag can be used to select whether or not files or\n# directories that are symbolic links (a Unix file system feature) are excluded\n# from the input.\n# The default value is: NO.\n\nEXCLUDE_SYMLINKS       = NO\n\n# If the value of the INPUT tag contains directories, you can use the\n# EXCLUDE_PATTERNS tag to specify one or more wildcard patterns to exclude\n# certain files from those directories.\n#\n# Note that the wildcards are matched against the file with absolute path, so to\n# exclude all test directories for example use the pattern */test/*\n\nEXCLUDE_PATTERNS       =\n\n# The EXCLUDE_SYMBOLS tag can be used to specify one or more symbol names\n# (namespaces, classes, functions, etc.) that should be excluded from the\n# output. The symbol name can be a fully qualified name, a word, or if the\n# wildcard * is used, a substring. Examples: ANamespace, AClass,\n# AClass::ANamespace, ANamespace::*Test\n#\n# Note that the wildcards are matched against the file with absolute path, so to\n# exclude all test directories use the pattern */test/*\n\nEXCLUDE_SYMBOLS        =\n\n# The EXAMPLE_PATH tag can be used to specify one or more files or directories\n# that contain example code fragments that are included (see the \\include\n# command).\n\nEXAMPLE_PATH           =\n\n# If the value of the EXAMPLE_PATH tag contains directories, you can use the\n# EXAMPLE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp and\n# *.h) to filter out the source-files in the directories. 
If left blank all\n# files are included.\n\nEXAMPLE_PATTERNS       = *\n\n# If the EXAMPLE_RECURSIVE tag is set to YES then subdirectories will be\n# searched for input files to be used with the \\include or \\dontinclude commands\n# irrespective of the value of the RECURSIVE tag.\n# The default value is: NO.\n\nEXAMPLE_RECURSIVE      = NO\n\n# The IMAGE_PATH tag can be used to specify one or more files or directories\n# that contain images that are to be included in the documentation (see the\n# \\image command).\n\nIMAGE_PATH             =\n\n# The INPUT_FILTER tag can be used to specify a program that doxygen should\n# invoke to filter for each input file. Doxygen will invoke the filter program\n# by executing (via popen()) the command:\n#\n# <filter> <input-file>\n#\n# where <filter> is the value of the INPUT_FILTER tag, and <input-file> is the\n# name of an input file. Doxygen will then use the output that the filter\n# program writes to standard output. If FILTER_PATTERNS is specified, this tag\n# will be ignored.\n#\n# Note that the filter must not add or remove lines; it is applied before the\n# code is scanned, but not when the output code is generated. If lines are added\n# or removed, the anchors will not be placed correctly.\n#\n# Note that for custom extensions or not directly supported extensions you also\n# need to set EXTENSION_MAPPING for the extension otherwise the files are not\n# properly processed by doxygen.\n\nINPUT_FILTER           =\n\n# The FILTER_PATTERNS tag can be used to specify filters on a per file pattern\n# basis. Doxygen will compare the file name with each pattern and apply the\n# filter if there is a match. The filters are a list of the form: pattern=filter\n# (like *.cpp=my_cpp_filter). See INPUT_FILTER for further information on how\n# filters are used. 
If the FILTER_PATTERNS tag is empty or if none of the\n# patterns match the file name, INPUT_FILTER is applied.\n#\n# Note that for custom extensions or not directly supported extensions you also\n# need to set EXTENSION_MAPPING for the extension otherwise the files are not\n# properly processed by doxygen.\n\nFILTER_PATTERNS        =\n\n# If the FILTER_SOURCE_FILES tag is set to YES, the input filter (if set using\n# INPUT_FILTER) will also be used to filter the input files that are used for\n# producing the source files to browse (i.e. when SOURCE_BROWSER is set to YES).\n# The default value is: NO.\n\nFILTER_SOURCE_FILES    = NO\n\n# The FILTER_SOURCE_PATTERNS tag can be used to specify source filters per file\n# pattern. A pattern will override the setting for FILTER_PATTERNS (if any) and\n# it is also possible to disable source filtering for a specific pattern using\n# *.ext= (so without naming a filter).\n# This tag requires that the tag FILTER_SOURCE_FILES is set to YES.\n\nFILTER_SOURCE_PATTERNS =\n\n# If the USE_MDFILE_AS_MAINPAGE tag refers to the name of a markdown file that\n# is part of the input, its contents will be placed on the main page\n# (index.html). This can be useful if you have a project on for instance GitHub\n# and want to reuse the introduction page also for the doxygen output.\n\nINPUT += ../README.md\nUSE_MDFILE_AS_MAINPAGE = ../README.md\n\n#---------------------------------------------------------------------------\n# Configuration options related to source browsing\n#---------------------------------------------------------------------------\n\n# If the SOURCE_BROWSER tag is set to YES then a list of source files will be\n# generated. 
Documented entities will be cross-referenced with these sources.\n#\n# Note: To get rid of all source code in the generated output, make sure that\n# also VERBATIM_HEADERS is set to NO.\n# The default value is: NO.\n\nSOURCE_BROWSER         = YES\n\n# Setting the INLINE_SOURCES tag to YES will include the body of functions,\n# classes and enums directly into the documentation.\n# The default value is: NO.\n\nINLINE_SOURCES         = NO\n\n# Setting the STRIP_CODE_COMMENTS tag to YES will instruct doxygen to hide any\n# special comment blocks from generated source code fragments. Normal C, C++ and\n# Fortran comments will always remain visible.\n# The default value is: YES.\n\nSTRIP_CODE_COMMENTS    = NO\n\n# If the REFERENCED_BY_RELATION tag is set to YES then for each documented\n# entity all documented functions referencing it will be listed.\n# The default value is: NO.\n\nREFERENCED_BY_RELATION = NO\n\n# If the REFERENCES_RELATION tag is set to YES then for each documented function\n# all documented entities called/used by that function will be listed.\n# The default value is: NO.\n\nREFERENCES_RELATION    = NO\n\n# If the REFERENCES_LINK_SOURCE tag is set to YES and SOURCE_BROWSER tag is set\n# to YES then the hyperlinks from functions in REFERENCES_RELATION and\n# REFERENCED_BY_RELATION lists will link to the source code. Otherwise they will\n# link to the documentation.\n# The default value is: YES.\n\nREFERENCES_LINK_SOURCE = YES\n\n# If SOURCE_TOOLTIPS is enabled (the default) then hovering a hyperlink in the\n# source code will show a tooltip with additional information such as prototype,\n# brief description and links to the definition and documentation. 
Since this\n# will make the HTML file larger and loading of large files a bit slower, you\n# can opt to disable this feature.\n# The default value is: YES.\n# This tag requires that the tag SOURCE_BROWSER is set to YES.\n\nSOURCE_TOOLTIPS        = YES\n\n# If the USE_HTAGS tag is set to YES then the references to source code will\n# point to the HTML generated by the htags(1) tool instead of doxygen built-in\n# source browser. The htags tool is part of GNU's global source tagging system\n# (see https://www.gnu.org/software/global/global.html). You will need version\n# 4.8.6 or higher.\n#\n# To use it do the following:\n# - Install the latest version of global\n# - Enable SOURCE_BROWSER and USE_HTAGS in the configuration file\n# - Make sure the INPUT points to the root of the source tree\n# - Run doxygen as normal\n#\n# Doxygen will invoke htags (and that will in turn invoke gtags), so these\n# tools must be available from the command line (i.e. in the search path).\n#\n# The result: instead of the source browser generated by doxygen, the links to\n# source code will now point to the output of htags.\n# The default value is: NO.\n# This tag requires that the tag SOURCE_BROWSER is set to YES.\n\nUSE_HTAGS              = NO\n\n# If the VERBATIM_HEADERS tag is set to YES then doxygen will generate a\n# verbatim copy of the header file for each class for which an include is\n# specified. Set to NO to disable this.\n# See also: Section \\class.\n# The default value is: YES.\n\nVERBATIM_HEADERS       = YES\n\n#---------------------------------------------------------------------------\n# Configuration options related to the alphabetical class index\n#---------------------------------------------------------------------------\n\n# If the ALPHABETICAL_INDEX tag is set to YES, an alphabetical index of all\n# compounds will be generated. 
Enable this if the project contains a lot of\n# classes, structs, unions or interfaces.\n# The default value is: YES.\n\nALPHABETICAL_INDEX     = YES\n\n# The COLS_IN_ALPHA_INDEX tag can be used to specify the number of columns in\n# which the alphabetical index list will be split.\n# Minimum value: 1, maximum value: 20, default value: 5.\n# This tag requires that the tag ALPHABETICAL_INDEX is set to YES.\n\nCOLS_IN_ALPHA_INDEX    = 5\n\n# In case all classes in a project start with a common prefix, all classes will\n# be put under the same header in the alphabetical index. The IGNORE_PREFIX tag\n# can be used to specify a prefix (or a list of prefixes) that should be ignored\n# while generating the index headers.\n# This tag requires that the tag ALPHABETICAL_INDEX is set to YES.\n\nIGNORE_PREFIX          =\n\n#---------------------------------------------------------------------------\n# Configuration options related to the HTML output\n#---------------------------------------------------------------------------\n\n# If the GENERATE_HTML tag is set to YES, doxygen will generate HTML output\n# The default value is: YES.\n\nGENERATE_HTML          = YES\n\n# The HTML_OUTPUT tag is used to specify where the HTML docs will be put. If a\n# relative path is entered the value of OUTPUT_DIRECTORY will be put in front of\n# it.\n# The default directory is: html.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nHTML_OUTPUT            = doxygen\n\n# The HTML_FILE_EXTENSION tag can be used to specify the file extension for each\n# generated HTML page (for example: .htm, .php, .asp).\n# The default value is: .html.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nHTML_FILE_EXTENSION    = .html\n\n# The HTML_HEADER tag can be used to specify a user-defined HTML header file for\n# each generated HTML page. 
If the tag is left blank doxygen will generate a\n# standard header.\n#\n# To get valid HTML the header file that includes any scripts and style sheets\n# that doxygen needs, which is dependent on the configuration options used (e.g.\n# the setting GENERATE_TREEVIEW). It is highly recommended to start with a\n# default header using\n# doxygen -w html new_header.html new_footer.html new_stylesheet.css\n# YourConfigFile\n# and then modify the file new_header.html. See also section \"Doxygen usage\"\n# for information on how to generate the default header that doxygen normally\n# uses.\n# Note: The header is subject to change so you typically have to regenerate the\n# default header when upgrading to a newer version of doxygen. For a description\n# of the possible markers and block names see the documentation.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nHTML_HEADER            =\n\n# The HTML_FOOTER tag can be used to specify a user-defined HTML footer for each\n# generated HTML page. If the tag is left blank doxygen will generate a standard\n# footer. See HTML_HEADER for more information on how to generate a default\n# footer and what special commands can be used inside the footer. See also\n# section \"Doxygen usage\" for information on how to generate the default footer\n# that doxygen normally uses.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nHTML_FOOTER            = @PROJECT_SOURCE_DIR@/docs/footer.html\n\n# The HTML_STYLESHEET tag can be used to specify a user-defined cascading style\n# sheet that is used by each HTML page. It can be used to fine-tune the look of\n# the HTML output. 
If left blank doxygen will generate a default style sheet.\n# See also section \"Doxygen usage\" for information on how to generate the style\n# sheet that doxygen normally uses.\n# Note: It is recommended to use HTML_EXTRA_STYLESHEET instead of this tag, as\n# it is more robust and this tag (HTML_STYLESHEET) will in the future become\n# obsolete.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nHTML_STYLESHEET        =\n\n# The HTML_EXTRA_STYLESHEET tag can be used to specify additional user-defined\n# cascading style sheets that are included after the standard style sheets\n# created by doxygen. Using this option one can overrule certain style aspects.\n# This is preferred over using HTML_STYLESHEET since it does not replace the\n# standard style sheet and is therefore more robust against future updates.\n# Doxygen will copy the style sheet files to the output directory.\n# Note: The order of the extra style sheet files is of importance (e.g. the last\n# style sheet in the list overrules the setting of the previous ones in the\n# list). For an example see the documentation.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nHTML_EXTRA_STYLESHEET  =\n\n# The HTML_EXTRA_FILES tag can be used to specify one or more extra images or\n# other source files which should be copied to the HTML output directory. Note\n# that these files will be copied to the base HTML output directory. Use the\n# $relpath^ marker in the HTML_HEADER and/or HTML_FOOTER files to load these\n# files. In the HTML_STYLESHEET file, use the file name only. Also note that the\n# files will be copied as-is; there are no commands or markers available.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\n# HTML_EXTRA_FILES       =\n\n# The HTML_COLORSTYLE_HUE tag controls the color of the HTML output. Doxygen\n# will adjust the colors in the style sheet and background images according to\n# this color. 
Hue is specified as an angle on a colorwheel, see\n# https://en.wikipedia.org/wiki/Hue for more information. For instance the value\n# 0 represents red, 60 is yellow, 120 is green, 180 is cyan, 240 is blue, 300\n# purple, and 360 is red again.\n# Minimum value: 0, maximum value: 359, default value: 220.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nHTML_COLORSTYLE_HUE    = 344\n\n# The HTML_COLORSTYLE_SAT tag controls the purity (or saturation) of the colors\n# in the HTML output. For a value of 0 the output will use grayscales only. A\n# value of 255 will produce the most vivid colors.\n# Minimum value: 0, maximum value: 255, default value: 100.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nHTML_COLORSTYLE_SAT    = 100\n\n# The HTML_COLORSTYLE_GAMMA tag controls the gamma correction applied to the\n# luminance component of the colors in the HTML output. Values below 100\n# gradually make the output lighter, whereas values above 100 make the output\n# darker. The value divided by 100 is the actual gamma applied, so 80 represents\n# a gamma of 0.8, The value 220 represents a gamma of 2.2, and 100 does not\n# change the gamma.\n# Minimum value: 40, maximum value: 240, default value: 80.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nHTML_COLORSTYLE_GAMMA  = 80\n\n# If the HTML_TIMESTAMP tag is set to YES then the footer of each generated HTML\n# page will contain the date and time when the page was generated. Setting this\n# to YES can help to show when doxygen was last run and thus if the\n# documentation is up to date.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nHTML_TIMESTAMP         = NO\n\n# If the HTML_DYNAMIC_MENUS tag is set to YES then the generated HTML\n# documentation will contain a main index with vertical navigation menus that\n# are dynamically created via Javascript. 
If disabled, the navigation index will\n# consist of multiple levels of tabs that are statically embedded in every HTML\n# page. Disable this option to support browsers that do not have Javascript,\n# like the Qt help browser.\n# The default value is: YES.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nHTML_DYNAMIC_MENUS     = YES\n\n# If the HTML_DYNAMIC_SECTIONS tag is set to YES then the generated HTML\n# documentation will contain sections that can be hidden and shown after the\n# page has loaded.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nHTML_DYNAMIC_SECTIONS  = NO\n\n# With HTML_INDEX_NUM_ENTRIES one can control the preferred number of entries\n# shown in the various tree structured indices initially; the user can expand\n# and collapse entries dynamically later on. Doxygen will expand the tree to\n# such a level that at most the specified number of entries are visible (unless\n# a fully collapsed tree already exceeds this amount). So setting the number of\n# entries to 1 will produce a fully collapsed tree by default. 0 is a special value\n# representing an infinite number of entries and will result in a fully expanded\n# tree by default.\n# Minimum value: 0, maximum value: 9999, default value: 100.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nHTML_INDEX_NUM_ENTRIES = 100\n\n# If the GENERATE_DOCSET tag is set to YES, additional index files will be\n# generated that can be used as input for Apple's Xcode 3 integrated development\n# environment (see: https://developer.apple.com/xcode/), introduced with OSX\n# 10.5 (Leopard). To create a documentation set, doxygen will generate a\n# Makefile in the HTML output directory. Running make will produce the docset in\n# that directory and running make install will install the docset in\n# ~/Library/Developer/Shared/Documentation/DocSets so that Xcode will find it at\n# startup. 
See https://developer.apple.com/library/archive/featuredarticles/Doxy\n# genXcode/_index.html for more information.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nGENERATE_DOCSET        = NO\n\n# This tag determines the name of the docset feed. A documentation feed provides\n# an umbrella under which multiple documentation sets from a single provider\n# (such as a company or product suite) can be grouped.\n# The default value is: Doxygen generated docs.\n# This tag requires that the tag GENERATE_DOCSET is set to YES.\n\nDOCSET_FEEDNAME        = \"Doxygen generated docs\"\n\n# This tag specifies a string that should uniquely identify the documentation\n# set bundle. This should be a reverse domain-name style string, e.g.\n# com.mycompany.MyDocSet. Doxygen will append .docset to the name.\n# The default value is: org.doxygen.Project.\n# This tag requires that the tag GENERATE_DOCSET is set to YES.\n\nDOCSET_BUNDLE_ID       = org.doxygen.Project\n\n# The DOCSET_PUBLISHER_ID tag specifies a string that should uniquely identify\n# the documentation publisher. This should be a reverse domain-name style\n# string, e.g. com.mycompany.MyDocSet.documentation.\n# The default value is: org.doxygen.Publisher.\n# This tag requires that the tag GENERATE_DOCSET is set to YES.\n\nDOCSET_PUBLISHER_ID    = org.doxygen.Publisher\n\n# The DOCSET_PUBLISHER_NAME tag identifies the documentation publisher.\n# The default value is: Publisher.\n# This tag requires that the tag GENERATE_DOCSET is set to YES.\n\nDOCSET_PUBLISHER_NAME  = Publisher\n\n# If the GENERATE_HTMLHELP tag is set to YES then doxygen generates three\n# additional HTML index files: index.hhp, index.hhc, and index.hhk. 
The\n# index.hhp is a project file that can be read by Microsoft's HTML Help Workshop\n# (see: https://www.microsoft.com/en-us/download/details.aspx?id=21138) on\n# Windows.\n#\n# The HTML Help Workshop contains a compiler that can convert all HTML output\n# generated by doxygen into a single compiled HTML file (.chm). Compiled HTML\n# files are now used as the Windows 98 help format, and will replace the old\n# Windows help format (.hlp) on all Windows platforms in the future. Compressed\n# HTML files also contain an index, a table of contents, and you can search for\n# words in the documentation. The HTML workshop also contains a viewer for\n# compressed HTML files.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nGENERATE_HTMLHELP      = NO\n\n# The CHM_FILE tag can be used to specify the file name of the resulting .chm\n# file. You can add a path in front of the file if the result should not be\n# written to the html output directory.\n# This tag requires that the tag GENERATE_HTMLHELP is set to YES.\n\nCHM_FILE               =\n\n# The HHC_LOCATION tag can be used to specify the location (absolute path\n# including file name) of the HTML help compiler (hhc.exe). 
If non-empty,\n# doxygen will try to run the HTML help compiler on the generated index.hhp.\n# The file has to be specified with full path.\n# This tag requires that the tag GENERATE_HTMLHELP is set to YES.\n\nHHC_LOCATION           =\n\n# The GENERATE_CHI flag controls if a separate .chi index file is generated\n# (YES) or that it should be included in the master .chm file (NO).\n# The default value is: NO.\n# This tag requires that the tag GENERATE_HTMLHELP is set to YES.\n\nGENERATE_CHI           = NO\n\n# The CHM_INDEX_ENCODING is used to encode HtmlHelp index (hhk), content (hhc)\n# and project file content.\n# This tag requires that the tag GENERATE_HTMLHELP is set to YES.\n\nCHM_INDEX_ENCODING     =\n\n# The BINARY_TOC flag controls whether a binary table of contents is generated\n# (YES) or a normal table of contents (NO) in the .chm file. Furthermore it\n# enables the Previous and Next buttons.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_HTMLHELP is set to YES.\n\nBINARY_TOC             = NO\n\n# The TOC_EXPAND flag can be set to YES to add extra items for group members to\n# the table of contents of the HTML help documentation and to the tree view.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_HTMLHELP is set to YES.\n\nTOC_EXPAND             = NO\n\n# If the GENERATE_QHP tag is set to YES and both QHP_NAMESPACE and\n# QHP_VIRTUAL_FOLDER are set, an additional index file will be generated that\n# can be used as input for Qt's qhelpgenerator to generate a Qt Compressed Help\n# (.qch) of the generated HTML documentation.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nGENERATE_QHP           = NO\n\n# If the QHG_LOCATION tag is specified, the QCH_FILE tag can be used to specify\n# the file name of the resulting .qch file. 
The path specified is relative to\n# the HTML output folder.\n# This tag requires that the tag GENERATE_QHP is set to YES.\n\nQCH_FILE               =\n\n# The QHP_NAMESPACE tag specifies the namespace to use when generating Qt Help\n# Project output. For more information please see Qt Help Project / Namespace\n# (see: http://doc.qt.io/archives/qt-4.8/qthelpproject.html#namespace).\n# The default value is: org.doxygen.Project.\n# This tag requires that the tag GENERATE_QHP is set to YES.\n\nQHP_NAMESPACE          = org.doxygen.Project\n\n# The QHP_VIRTUAL_FOLDER tag specifies the namespace to use when generating Qt\n# Help Project output. For more information please see Qt Help Project / Virtual\n# Folders (see: http://doc.qt.io/archives/qt-4.8/qthelpproject.html#virtual-\n# folders).\n# The default value is: doc.\n# This tag requires that the tag GENERATE_QHP is set to YES.\n\nQHP_VIRTUAL_FOLDER     = doc\n\n# If the QHP_CUST_FILTER_NAME tag is set, it specifies the name of a custom\n# filter to add. For more information please see Qt Help Project / Custom\n# Filters (see: http://doc.qt.io/archives/qt-4.8/qthelpproject.html#custom-\n# filters).\n# This tag requires that the tag GENERATE_QHP is set to YES.\n\nQHP_CUST_FILTER_NAME   =\n\n# The QHP_CUST_FILTER_ATTRS tag specifies the list of the attributes of the\n# custom filter to add. For more information please see Qt Help Project / Custom\n# Filters (see: http://doc.qt.io/archives/qt-4.8/qthelpproject.html#custom-\n# filters).\n# This tag requires that the tag GENERATE_QHP is set to YES.\n\nQHP_CUST_FILTER_ATTRS  =\n\n# The QHP_SECT_FILTER_ATTRS tag specifies the list of the attributes this\n# project's filter section matches. 
Qt Help Project / Filter Attributes (see:\n# http://doc.qt.io/archives/qt-4.8/qthelpproject.html#filter-attributes).\n# This tag requires that the tag GENERATE_QHP is set to YES.\n\nQHP_SECT_FILTER_ATTRS  =\n\n# The QHG_LOCATION tag can be used to specify the location of Qt's\n# qhelpgenerator. If non-empty doxygen will try to run qhelpgenerator on the\n# generated .qhp file.\n# This tag requires that the tag GENERATE_QHP is set to YES.\n\nQHG_LOCATION           =\n\n# If the GENERATE_ECLIPSEHELP tag is set to YES, additional index files will be\n# generated, together with the HTML files, they form an Eclipse help plugin. To\n# install this plugin and make it available under the help contents menu in\n# Eclipse, the contents of the directory containing the HTML and XML files needs\n# to be copied into the plugins directory of eclipse. The name of the directory\n# within the plugins directory should be the same as the ECLIPSE_DOC_ID value.\n# After copying Eclipse needs to be restarted before the help appears.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nGENERATE_ECLIPSEHELP   = NO\n\n# A unique identifier for the Eclipse help plugin. When installing the plugin\n# the directory name containing the HTML and XML files should also have this\n# name. Each documentation set should have its own identifier.\n# The default value is: org.doxygen.Project.\n# This tag requires that the tag GENERATE_ECLIPSEHELP is set to YES.\n\nECLIPSE_DOC_ID         = org.doxygen.Project\n\n# If you want full control over the layout of the generated HTML pages it might\n# be necessary to disable the index and replace it with your own. The\n# DISABLE_INDEX tag can be used to turn on/off the condensed index (tabs) at top\n# of each HTML page. A value of NO enables the index and the value YES disables\n# it. 
Since the tabs in the index contain the same information as the navigation\n# tree, you can set this option to YES if you also set GENERATE_TREEVIEW to YES.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nDISABLE_INDEX          = NO\n\n# The GENERATE_TREEVIEW tag is used to specify whether a tree-like index\n# structure should be generated to display hierarchical information. If the tag\n# value is set to YES, a side panel will be generated containing a tree-like\n# index structure (just like the one that is generated for HTML Help). For this\n# to work a browser that supports JavaScript, DHTML, CSS and frames is required\n# (i.e. any modern browser). Windows users are probably better off using the\n# HTML help feature. Via custom style sheets (see HTML_EXTRA_STYLESHEET) one can\n# further fine-tune the look of the index. As an example, the default style\n# sheet generated by doxygen has an example that shows how to put an image at\n# the root of the tree instead of the PROJECT_NAME. 
Since the tree basically has\n# the same information as the tab index, you could consider setting\n# DISABLE_INDEX to YES when enabling this option.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nGENERATE_TREEVIEW      = YES\n\n# The ENUM_VALUES_PER_LINE tag can be used to set the number of enum values that\n# doxygen will group on one line in the generated HTML documentation.\n#\n# Note that a value of 0 will completely suppress the enum values from appearing\n# in the overview section.\n# Minimum value: 0, maximum value: 20, default value: 4.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nENUM_VALUES_PER_LINE   = 4\n\n# If the treeview is enabled (see GENERATE_TREEVIEW) then this tag can be used\n# to set the initial width (in pixels) of the frame in which the tree is shown.\n# Minimum value: 0, maximum value: 1500, default value: 250.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nTREEVIEW_WIDTH         = 250\n\n# If the EXT_LINKS_IN_WINDOW option is set to YES, doxygen will open links to\n# external symbols imported via tag files in a separate window.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nEXT_LINKS_IN_WINDOW    = NO\n\n# Use this tag to change the font size of LaTeX formulas included as images in\n# the HTML documentation. When you change the font size after a successful\n# doxygen run you need to manually remove any form_*.png images from the HTML\n# output directory to force them to be regenerated.\n# Minimum value: 8, maximum value: 50, default value: 10.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nFORMULA_FONTSIZE       = 10\n\n# Use the FORMULA_TRANSPARENT tag to determine whether or not the images\n# generated for formulas are transparent PNGs. 
Transparent PNGs are not\n# supported properly for IE 6.0, but are supported on all modern browsers.\n#\n# Note that when changing this option you need to delete any form_*.png files in\n# the HTML output directory before the changes take effect.\n# The default value is: YES.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nFORMULA_TRANSPARENT    = YES\n\n# Enable the USE_MATHJAX option to render LaTeX formulas using MathJax (see\n# https://www.mathjax.org) which uses client side Javascript for the rendering\n# instead of using pre-rendered bitmaps. Use this if you do not have LaTeX\n# installed or if you want formulas to look prettier in the HTML output. When\n# enabled you may also need to install MathJax separately and configure the path\n# to it using the MATHJAX_RELPATH option.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nUSE_MATHJAX            = YES\n\n# When MathJax is enabled you can set the default output format to be used for\n# the MathJax output. See the MathJax site (see:\n# http://docs.mathjax.org/en/latest/output.html) for more details.\n# Possible values are: HTML-CSS (which is slower, but has the best\n# compatibility), NativeMML (i.e. MathML) and SVG.\n# The default value is: HTML-CSS.\n# This tag requires that the tag USE_MATHJAX is set to YES.\n\nMATHJAX_FORMAT         = HTML-CSS\n\n# When MathJax is enabled you need to specify the location relative to the HTML\n# output directory using the MATHJAX_RELPATH option. The destination directory\n# should contain the MathJax.js script. For instance, if the mathjax directory\n# is located at the same level as the HTML output directory, then\n# MATHJAX_RELPATH should be ../mathjax. The default value points to the MathJax\n# Content Delivery Network so you can quickly see the result without installing\n# MathJax. 
However, it is strongly recommended to install a local copy of\n# MathJax from https://www.mathjax.org before deployment.\n# The default value is: https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/.\n# This tag requires that the tag USE_MATHJAX is set to YES.\n\nMATHJAX_RELPATH        = https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/\n\n# The MATHJAX_EXTENSIONS tag can be used to specify one or more MathJax\n# extension names that should be enabled during MathJax rendering. For example\n# MATHJAX_EXTENSIONS = TeX/AMSmath TeX/AMSsymbols\n# This tag requires that the tag USE_MATHJAX is set to YES.\n\nMATHJAX_EXTENSIONS     =\n\n# The MATHJAX_CODEFILE tag can be used to specify a file with javascript pieces\n# of code that will be used on startup of the MathJax code. See the MathJax site\n# (see: http://docs.mathjax.org/en/latest/output.html) for more details. For an\n# example see the documentation.\n# This tag requires that the tag USE_MATHJAX is set to YES.\n\nMATHJAX_CODEFILE       =\n\n# When the SEARCHENGINE tag is enabled doxygen will generate a search box for\n# the HTML output. The underlying search engine uses javascript and DHTML and\n# should work on any modern browser. Note that when using HTML help\n# (GENERATE_HTMLHELP), Qt help (GENERATE_QHP), or docsets (GENERATE_DOCSET)\n# there is already a search function so this one should typically be disabled.\n# For large projects the javascript based search engine can be slow, then\n# enabling SERVER_BASED_SEARCH may provide a better solution. It is possible to\n# search using the keyboard; to jump to the search box use <access key> + S\n# (what the <access key> is depends on the OS and browser, but it is typically\n# <CTRL>, <ALT>/<option>, or both). Inside the search box use the <cursor down\n# key> to jump into the search results window, the results can be navigated\n# using the <cursor keys>. Press <Enter> to select an item or <escape> to cancel\n# the search. 
The filter options can be selected when the cursor is inside the\n# search box by pressing <Shift>+<cursor down>. Also here use the <cursor keys>\n# to select a filter and <Enter> or <escape> to activate or cancel the filter\n# option.\n# The default value is: YES.\n# This tag requires that the tag GENERATE_HTML is set to YES.\n\nSEARCHENGINE           = YES\n\n# When the SERVER_BASED_SEARCH tag is enabled the search engine will be\n# implemented using a web server instead of a web client using Javascript. There\n# are two flavors of web server based searching depending on the EXTERNAL_SEARCH\n# setting. When disabled, doxygen will generate a PHP script for searching and\n# an index file used by the script. When EXTERNAL_SEARCH is enabled the indexing\n# and searching needs to be provided by external tools. See the section\n# \"External Indexing and Searching\" for details.\n# The default value is: NO.\n# This tag requires that the tag SEARCHENGINE is set to YES.\n\nSERVER_BASED_SEARCH    = NO\n\n# When EXTERNAL_SEARCH tag is enabled doxygen will no longer generate the PHP\n# script for searching. Instead the search results are written to an XML file\n# which needs to be processed by an external indexer. 
Doxygen will invoke an\n# external search engine pointed to by the SEARCHENGINE_URL option to obtain the\n# search results.\n#\n# Doxygen ships with an example indexer (doxyindexer) and search engine\n# (doxysearch.cgi) which are based on the open source search engine library\n# Xapian (see: https://xapian.org/).\n#\n# See the section \"External Indexing and Searching\" for details.\n# The default value is: NO.\n# This tag requires that the tag SEARCHENGINE is set to YES.\n\nEXTERNAL_SEARCH        = NO\n\n# The SEARCHENGINE_URL should point to a search engine hosted by a web server\n# which will return the search results when EXTERNAL_SEARCH is enabled.\n#\n# Doxygen ships with an example indexer (doxyindexer) and search engine\n# (doxysearch.cgi) which are based on the open source search engine library\n# Xapian (see: https://xapian.org/). See the section \"External Indexing and\n# Searching\" for details.\n# This tag requires that the tag SEARCHENGINE is set to YES.\n\nSEARCHENGINE_URL       =\n\n# When SERVER_BASED_SEARCH and EXTERNAL_SEARCH are both enabled the unindexed\n# search data is written to a file for indexing by an external tool. With the\n# SEARCHDATA_FILE tag the name of this file can be specified.\n# The default file is: searchdata.xml.\n# This tag requires that the tag SEARCHENGINE is set to YES.\n\nSEARCHDATA_FILE        = searchdata.xml\n\n# When SERVER_BASED_SEARCH and EXTERNAL_SEARCH are both enabled the\n# EXTERNAL_SEARCH_ID tag can be used as an identifier for the project. This is\n# useful in combination with EXTRA_SEARCH_MAPPINGS to search through multiple\n# projects and redirect the results back to the right project.\n# This tag requires that the tag SEARCHENGINE is set to YES.\n\nEXTERNAL_SEARCH_ID     =\n\n# The EXTRA_SEARCH_MAPPINGS tag can be used to enable searching through doxygen\n# projects other than the one defined by this configuration file, but that are\n# all added to the same external search index. 
Each project needs to have a\n# unique id set via EXTERNAL_SEARCH_ID. The search mapping then maps each id\n# to a relative location where the documentation can be found. The format is:\n# EXTRA_SEARCH_MAPPINGS = tagname1=loc1 tagname2=loc2 ...\n# This tag requires that the tag SEARCHENGINE is set to YES.\n\nEXTRA_SEARCH_MAPPINGS  =\n\n#---------------------------------------------------------------------------\n# Configuration options related to the LaTeX output\n#---------------------------------------------------------------------------\n\n# If the GENERATE_LATEX tag is set to YES, doxygen will generate LaTeX output.\n# The default value is: YES.\n\nGENERATE_LATEX         = NO\n\n# The LATEX_OUTPUT tag is used to specify where the LaTeX docs will be put. If a\n# relative path is entered the value of OUTPUT_DIRECTORY will be put in front of\n# it.\n# The default directory is: latex.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nLATEX_OUTPUT           = latex\n\n# The LATEX_CMD_NAME tag can be used to specify the LaTeX command name to be\n# invoked.\n#\n# Note that when not enabling USE_PDFLATEX the default is latex; when enabling\n# USE_PDFLATEX the default is pdflatex, and when in the latter case latex is\n# chosen this is overwritten by pdflatex. 
For specific output languages the\n# default can have been set differently, this depends on the implementation of\n# the output language.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nLATEX_CMD_NAME         =\n\n# The MAKEINDEX_CMD_NAME tag can be used to specify the command name to generate\n# index for LaTeX.\n# Note: This tag is used in the Makefile / make.bat.\n# See also: LATEX_MAKEINDEX_CMD for the part in the generated output file\n# (.tex).\n# The default file is: makeindex.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nMAKEINDEX_CMD_NAME     = makeindex\n\n# The LATEX_MAKEINDEX_CMD tag can be used to specify the command name to\n# generate index for LaTeX. In case there is no backslash (\\) as first character\n# it will be automatically added in the LaTeX code.\n# Note: This tag is used in the generated output file (.tex).\n# See also: MAKEINDEX_CMD_NAME for the part in the Makefile / make.bat.\n# The default value is: makeindex.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nLATEX_MAKEINDEX_CMD    = makeindex\n\n# If the COMPACT_LATEX tag is set to YES, doxygen generates more compact LaTeX\n# documents. This may be useful for small projects and may help to save some\n# trees in general.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nCOMPACT_LATEX          = NO\n\n# The PAPER_TYPE tag can be used to set the paper type that is used by the\n# printer.\n# Possible values are: a4 (210 x 297 mm), letter (8.5 x 11 inches), legal (8.5 x\n# 14 inches) and executive (7.25 x 10.5 inches).\n# The default value is: a4.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nPAPER_TYPE             = a4\n\n# The EXTRA_PACKAGES tag can be used to specify one or more LaTeX package names\n# that should be included in the LaTeX output. 
The package can be specified just\n# by its name or with the correct syntax as to be used with the LaTeX\n# \\usepackage command. To get the times font for instance you can specify :\n# EXTRA_PACKAGES=times or EXTRA_PACKAGES={times}\n# To use the option intlimits with the amsmath package you can specify:\n# EXTRA_PACKAGES=[intlimits]{amsmath}\n# If left blank no extra packages will be included.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nEXTRA_PACKAGES         =\n\n# The LATEX_HEADER tag can be used to specify a personal LaTeX header for the\n# generated LaTeX document. The header should contain everything until the first\n# chapter. If it is left blank doxygen will generate a standard header. See\n# section \"Doxygen usage\" for information on how to let doxygen write the\n# default header to a separate file.\n#\n# Note: Only use a user-defined header if you know what you are doing! The\n# following commands have a special meaning inside the header: $title,\n# $datetime, $date, $doxygenversion, $projectname, $projectnumber,\n# $projectbrief, $projectlogo. Doxygen will replace $title with the empty\n# string, for the replacement values of the other commands the user is referred\n# to HTML_HEADER.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nLATEX_HEADER           =\n\n# The LATEX_FOOTER tag can be used to specify a personal LaTeX footer for the\n# generated LaTeX document. The footer should contain everything after the last\n# chapter. If it is left blank doxygen will generate a standard footer. 
See\n# LATEX_HEADER for more information on how to generate a default footer and what\n# special commands can be used inside the footer.\n#\n# Note: Only use a user-defined footer if you know what you are doing!\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nLATEX_FOOTER           =\n\n# The LATEX_EXTRA_STYLESHEET tag can be used to specify additional user-defined\n# LaTeX style sheets that are included after the standard style sheets created\n# by doxygen. Using this option one can overrule certain style aspects. Doxygen\n# will copy the style sheet files to the output directory.\n# Note: The order of the extra style sheet files is of importance (e.g. the last\n# style sheet in the list overrules the setting of the previous ones in the\n# list).\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nLATEX_EXTRA_STYLESHEET =\n\n# The LATEX_EXTRA_FILES tag can be used to specify one or more extra images or\n# other source files which should be copied to the LATEX_OUTPUT output\n# directory. Note that the files will be copied as-is; there are no commands or\n# markers available.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nLATEX_EXTRA_FILES      =\n\n# If the PDF_HYPERLINKS tag is set to YES, the LaTeX that is generated is\n# prepared for conversion to PDF (using ps2pdf or pdflatex). The PDF file will\n# contain links (just like the HTML output) instead of page references. This\n# makes the output suitable for online browsing using a PDF viewer.\n# The default value is: YES.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nPDF_HYPERLINKS         = YES\n\n# If the USE_PDFLATEX tag is set to YES, doxygen will use pdflatex to generate\n# the PDF file directly from the LaTeX files. 
Set this option to YES, to get a\n# higher quality PDF documentation.\n# The default value is: YES.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nUSE_PDFLATEX           = YES\n\n# If the LATEX_BATCHMODE tag is set to YES, doxygen will add the \\batchmode\n# command to the generated LaTeX files. This will instruct LaTeX to keep running\n# if errors occur, instead of asking the user for help. This option is also used\n# when generating formulas in HTML.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nLATEX_BATCHMODE        = NO\n\n# If the LATEX_HIDE_INDICES tag is set to YES then doxygen will not include the\n# index chapters (such as File Index, Compound Index, etc.) in the output.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nLATEX_HIDE_INDICES     = NO\n\n# If the LATEX_SOURCE_CODE tag is set to YES then doxygen will include source\n# code with syntax highlighting in the LaTeX output.\n#\n# Note that which sources are shown also depends on other settings such as\n# SOURCE_BROWSER.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nLATEX_SOURCE_CODE      = NO\n\n# The LATEX_BIB_STYLE tag can be used to specify the style to use for the\n# bibliography, e.g. plainnat, or ieeetr. See\n# https://en.wikipedia.org/wiki/BibTeX and \\cite for more info.\n# The default value is: plain.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nLATEX_BIB_STYLE        = plain\n\n# If the LATEX_TIMESTAMP tag is set to YES then the footer of each generated\n# page will contain the date and time when the page was generated. 
Setting this\n# to NO can help when comparing the output of multiple runs.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nLATEX_TIMESTAMP        = NO\n\n# The LATEX_EMOJI_DIRECTORY tag is used to specify the (relative or absolute)\n# path from which the emoji images will be read. If a relative path is entered,\n# it will be relative to the LATEX_OUTPUT directory. If left blank the\n# LATEX_OUTPUT directory will be used.\n# This tag requires that the tag GENERATE_LATEX is set to YES.\n\nLATEX_EMOJI_DIRECTORY  =\n\n#---------------------------------------------------------------------------\n# Configuration options related to the RTF output\n#---------------------------------------------------------------------------\n\n# If the GENERATE_RTF tag is set to YES, doxygen will generate RTF output. The\n# RTF output is optimized for Word 97 and may not look too pretty with other RTF\n# readers/editors.\n# The default value is: NO.\n\nGENERATE_RTF           = NO\n\n# The RTF_OUTPUT tag is used to specify where the RTF docs will be put. If a\n# relative path is entered the value of OUTPUT_DIRECTORY will be put in front of\n# it.\n# The default directory is: rtf.\n# This tag requires that the tag GENERATE_RTF is set to YES.\n\nRTF_OUTPUT             = rtf\n\n# If the COMPACT_RTF tag is set to YES, doxygen generates more compact RTF\n# documents. This may be useful for small projects and may help to save some\n# trees in general.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_RTF is set to YES.\n\nCOMPACT_RTF            = NO\n\n# If the RTF_HYPERLINKS tag is set to YES, the RTF that is generated will\n# contain hyperlink fields. The RTF file will contain links (just like the HTML\n# output) instead of page references. 
This makes the output suitable for online\n# browsing using Word or some other Word compatible readers that support those\n# fields.\n#\n# Note: WordPad (write) and others do not support links.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_RTF is set to YES.\n\nRTF_HYPERLINKS         = NO\n\n# Load stylesheet definitions from file. Syntax is similar to doxygen's\n# configuration file, i.e. a series of assignments. You only have to provide\n# replacements, missing definitions are set to their default value.\n#\n# See also section \"Doxygen usage\" for information on how to generate the\n# default style sheet that doxygen normally uses.\n# This tag requires that the tag GENERATE_RTF is set to YES.\n\nRTF_STYLESHEET_FILE    =\n\n# Set optional variables used in the generation of an RTF document. Syntax is\n# similar to doxygen's configuration file. A template extensions file can be\n# generated using doxygen -e rtf extensionFile.\n# This tag requires that the tag GENERATE_RTF is set to YES.\n\nRTF_EXTENSIONS_FILE    =\n\n# If the RTF_SOURCE_CODE tag is set to YES then doxygen will include source code\n# with syntax highlighting in the RTF output.\n#\n# Note that which sources are shown also depends on other settings such as\n# SOURCE_BROWSER.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_RTF is set to YES.\n\nRTF_SOURCE_CODE        = NO\n\n#---------------------------------------------------------------------------\n# Configuration options related to the man page output\n#---------------------------------------------------------------------------\n\n# If the GENERATE_MAN tag is set to YES, doxygen will generate man pages for\n# classes and files.\n# The default value is: NO.\n\nGENERATE_MAN           = NO\n\n# The MAN_OUTPUT tag is used to specify where the man pages will be put. If a\n# relative path is entered the value of OUTPUT_DIRECTORY will be put in front of\n# it. 
A directory man3 will be created inside the directory specified by\n# MAN_OUTPUT.\n# The default directory is: man.\n# This tag requires that the tag GENERATE_MAN is set to YES.\n\nMAN_OUTPUT             = man\n\n# The MAN_EXTENSION tag determines the extension that is added to the generated\n# man pages. In case the manual section does not start with a number, the number\n# 3 is prepended. The dot (.) at the beginning of the MAN_EXTENSION tag is\n# optional.\n# The default value is: .3.\n# This tag requires that the tag GENERATE_MAN is set to YES.\n\nMAN_EXTENSION          = .3\n\n# The MAN_SUBDIR tag determines the name of the directory created within\n# MAN_OUTPUT in which the man pages are placed. It defaults to man followed by\n# MAN_EXTENSION with the initial . removed.\n# This tag requires that the tag GENERATE_MAN is set to YES.\n\nMAN_SUBDIR             =\n\n# If the MAN_LINKS tag is set to YES and doxygen generates man output, then it\n# will generate one additional man file for each entity documented in the real\n# man page(s). These additional files only source the real man page, but without\n# them the man command would be unable to find the correct page.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_MAN is set to YES.\n\nMAN_LINKS              = NO\n\n#---------------------------------------------------------------------------\n# Configuration options related to the XML output\n#---------------------------------------------------------------------------\n\n# If the GENERATE_XML tag is set to YES, doxygen will generate an XML file that\n# captures the structure of the code including all documentation.\n# The default value is: NO.\n\nGENERATE_XML           = NO\n\n# The XML_OUTPUT tag is used to specify where the XML pages will be put. 
If a\n# relative path is entered the value of OUTPUT_DIRECTORY will be put in front of\n# it.\n# The default directory is: xml.\n# This tag requires that the tag GENERATE_XML is set to YES.\n\nXML_OUTPUT             = xml\n\n# If the XML_PROGRAMLISTING tag is set to YES, doxygen will dump the program\n# listings (including syntax highlighting and cross-referencing information) to\n# the XML output. Note that enabling this will significantly increase the size\n# of the XML output.\n# The default value is: YES.\n# This tag requires that the tag GENERATE_XML is set to YES.\n\nXML_PROGRAMLISTING     = YES\n\n# If the XML_NS_MEMB_FILE_SCOPE tag is set to YES, doxygen will include\n# namespace members in file scope as well, matching the HTML output.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_XML is set to YES.\n\nXML_NS_MEMB_FILE_SCOPE = NO\n\n#---------------------------------------------------------------------------\n# Configuration options related to the DOCBOOK output\n#---------------------------------------------------------------------------\n\n# If the GENERATE_DOCBOOK tag is set to YES, doxygen will generate Docbook files\n# that can be used to generate PDF.\n# The default value is: NO.\n\nGENERATE_DOCBOOK       = NO\n\n# The DOCBOOK_OUTPUT tag is used to specify where the Docbook pages will be put.\n# If a relative path is entered the value of OUTPUT_DIRECTORY will be put in\n# front of it.\n# The default directory is: docbook.\n# This tag requires that the tag GENERATE_DOCBOOK is set to YES.\n\nDOCBOOK_OUTPUT         = docbook\n\n# If the DOCBOOK_PROGRAMLISTING tag is set to YES, doxygen will include the\n# program listings (including syntax highlighting and cross-referencing\n# information) to the DOCBOOK output. 
Note that enabling this will significantly\n# increase the size of the DOCBOOK output.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_DOCBOOK is set to YES.\n\nDOCBOOK_PROGRAMLISTING = NO\n\n#---------------------------------------------------------------------------\n# Configuration options for the AutoGen Definitions output\n#---------------------------------------------------------------------------\n\n# If the GENERATE_AUTOGEN_DEF tag is set to YES, doxygen will generate an\n# AutoGen Definitions (see http://autogen.sourceforge.net/) file that captures\n# the structure of the code including all documentation. Note that this feature\n# is still experimental and incomplete at the moment.\n# The default value is: NO.\n\nGENERATE_AUTOGEN_DEF   = NO\n\n#---------------------------------------------------------------------------\n# Configuration options related to the Perl module output\n#---------------------------------------------------------------------------\n\n# If the GENERATE_PERLMOD tag is set to YES, doxygen will generate a Perl module\n# file that captures the structure of the code including all documentation.\n#\n# Note that this feature is still experimental and incomplete at the moment.\n# The default value is: NO.\n\nGENERATE_PERLMOD       = NO\n\n# If the PERLMOD_LATEX tag is set to YES, doxygen will generate the necessary\n# Makefile rules, Perl scripts and LaTeX code to be able to generate PDF and DVI\n# output from the Perl module output.\n# The default value is: NO.\n# This tag requires that the tag GENERATE_PERLMOD is set to YES.\n\nPERLMOD_LATEX          = NO\n\n# If the PERLMOD_PRETTY tag is set to YES, the Perl module output will be nicely\n# formatted so it can be parsed by a human reader. This is useful if you want to\n# understand what is going on. 
On the other hand, if this tag is set to NO, the\n# size of the Perl module output will be much smaller and Perl will parse it\n# just the same.\n# The default value is: YES.\n# This tag requires that the tag GENERATE_PERLMOD is set to YES.\n\nPERLMOD_PRETTY         = YES\n\n# The names of the make variables in the generated doxyrules.make file are\n# prefixed with the string contained in PERLMOD_MAKEVAR_PREFIX. This is useful\n# so different doxyrules.make files included by the same Makefile don't\n# overwrite each other's variables.\n# This tag requires that the tag GENERATE_PERLMOD is set to YES.\n\nPERLMOD_MAKEVAR_PREFIX =\n\n#---------------------------------------------------------------------------\n# Configuration options related to the preprocessor\n#---------------------------------------------------------------------------\n\n# If the ENABLE_PREPROCESSING tag is set to YES, doxygen will evaluate all\n# C-preprocessor directives found in the sources and include files.\n# The default value is: YES.\n\nENABLE_PREPROCESSING   = YES\n\n# If the MACRO_EXPANSION tag is set to YES, doxygen will expand all macro names\n# in the source code. If set to NO, only conditional compilation will be\n# performed. 
Macro expansion can be done in a controlled way by setting\n# EXPAND_ONLY_PREDEF to YES.\n# The default value is: NO.\n# This tag requires that the tag ENABLE_PREPROCESSING is set to YES.\n\nMACRO_EXPANSION        = NO\n\n# If the EXPAND_ONLY_PREDEF and MACRO_EXPANSION tags are both set to YES then\n# the macro expansion is limited to the macros specified with the PREDEFINED and\n# EXPAND_AS_DEFINED tags.\n# The default value is: NO.\n# This tag requires that the tag ENABLE_PREPROCESSING is set to YES.\n\nEXPAND_ONLY_PREDEF     = NO\n\n# If the SEARCH_INCLUDES tag is set to YES, the include files in the\n# INCLUDE_PATH will be searched if a #include is found.\n# The default value is: YES.\n# This tag requires that the tag ENABLE_PREPROCESSING is set to YES.\n\nSEARCH_INCLUDES        = YES\n\n# The INCLUDE_PATH tag can be used to specify one or more directories that\n# contain include files that are not input files but should be processed by the\n# preprocessor.\n# This tag requires that the tag SEARCH_INCLUDES is set to YES.\n\nINCLUDE_PATH           =\n\n# You can use the INCLUDE_FILE_PATTERNS tag to specify one or more wildcard\n# patterns (like *.h and *.hpp) to filter out the header-files in the\n# directories. If left blank, the patterns specified with FILE_PATTERNS will be\n# used.\n# This tag requires that the tag ENABLE_PREPROCESSING is set to YES.\n\nINCLUDE_FILE_PATTERNS  =\n\n# The PREDEFINED tag can be used to specify one or more macro names that are\n# defined before the preprocessor is started (similar to the -D option of e.g.\n# gcc). The argument of the tag is a list of macros of the form: name or\n# name=definition (no spaces). If the definition and the \"=\" are omitted, \"=1\"\n# is assumed. 
To prevent a macro definition from being undefined via #undef or\n# recursively expanded use the := operator instead of the = operator.\n# This tag requires that the tag ENABLE_PREPROCESSING is set to YES.\n\nPREDEFINED             =\n\n# If the MACRO_EXPANSION and EXPAND_ONLY_PREDEF tags are set to YES then this\n# tag can be used to specify a list of macro names that should be expanded. The\n# macro definition that is found in the sources will be used. Use the PREDEFINED\n# tag if you want to use a different macro definition that overrules the\n# definition found in the source code.\n# This tag requires that the tag ENABLE_PREPROCESSING is set to YES.\n\nEXPAND_AS_DEFINED      =\n\n# If the SKIP_FUNCTION_MACROS tag is set to YES then doxygen's preprocessor will\n# remove all references to function-like macros that are alone on a line, have\n# an all uppercase name, and do not end with a semicolon. Such function macros\n# are typically used for boiler-plate code, and will confuse the parser if not\n# removed.\n# The default value is: YES.\n# This tag requires that the tag ENABLE_PREPROCESSING is set to YES.\n\nSKIP_FUNCTION_MACROS   = YES\n\n#---------------------------------------------------------------------------\n# Configuration options related to external references\n#---------------------------------------------------------------------------\n\n# The TAGFILES tag can be used to specify one or more tag files. For each tag\n# file the location of the external documentation should be added. The format of\n# a tag file without this location is as follows:\n# TAGFILES = file1 file2 ...\n# Adding location for the tag files is done as follows:\n# TAGFILES = file1=loc1 \"file2 = loc2\" ...\n# where loc1 and loc2 can be relative or absolute paths or URLs. See the\n# section \"Linking to external documentation\" for more information about the use\n# of tag files.\n# Note: Each tag file must have a unique name (where the name does NOT include\n# the path). 
If a tag file is not located in the directory in which doxygen is\n# run, you must also specify the path to the tagfile here.\n\nTAGFILES               =\n# TAGFILES              += \"cppreference-doxygen-web.tag.xml=http://en.cppreference.com/w/\"\n\n# When a file name is specified after GENERATE_TAGFILE, doxygen will create a\n# tag file that is based on the input files it reads. See section \"Linking to\n# external documentation\" for more information about the usage of tag files.\n\nGENERATE_TAGFILE       =\n\n# If the ALLEXTERNALS tag is set to YES, all external classes will be listed in\n# the class index. If set to NO, only the inherited external classes will be\n# listed.\n# The default value is: NO.\n\nALLEXTERNALS           = NO\n\n# If the EXTERNAL_GROUPS tag is set to YES, all external groups will be listed\n# in the modules index. If set to NO, only the current project's groups will be\n# listed.\n# The default value is: YES.\n\nEXTERNAL_GROUPS        = YES\n\n# If the EXTERNAL_PAGES tag is set to YES, all external pages will be listed in\n# the related pages index. If set to NO, only the current project's pages will\n# be listed.\n# The default value is: YES.\n\nEXTERNAL_PAGES         = YES\n\n# The PERL_PATH should be the absolute path and name of the perl script\n# interpreter (i.e. the result of 'which perl').\n# The default file (with absolute path) is: /usr/bin/perl.\n\nPERL_PATH              = /usr/bin/perl\n\n#---------------------------------------------------------------------------\n# Configuration options related to the dot tool\n#---------------------------------------------------------------------------\n\n# If the CLASS_DIAGRAMS tag is set to YES, doxygen will generate a class diagram\n# (in HTML and LaTeX) for classes with base or super classes. Setting the tag to\n# NO turns the diagrams off. 
Note that this option also works with HAVE_DOT\n# disabled, but it is recommended to install and use dot, since it yields more\n# powerful graphs.\n# The default value is: YES.\n\nCLASS_DIAGRAMS         = YES\n\n# You can define message sequence charts within doxygen comments using the \\msc\n# command. Doxygen will then run the mscgen tool (see:\n# http://www.mcternan.me.uk/mscgen/) to produce the chart and insert it in the\n# documentation. The MSCGEN_PATH tag allows you to specify the directory where\n# the mscgen tool resides. If left empty the tool is assumed to be found in the\n# default search path.\n\nMSCGEN_PATH            =\n\n# You can include diagrams made with dia in doxygen documentation. Doxygen will\n# then run dia to produce the diagram and insert it in the documentation. The\n# DIA_PATH tag allows you to specify the directory where the dia binary resides.\n# If left empty dia is assumed to be found in the default search path.\n\nDIA_PATH               =\n\n# If set to YES the inheritance and collaboration graphs will hide inheritance\n# and usage relations if the target is undocumented or is not a class.\n# The default value is: YES.\n\nHIDE_UNDOC_RELATIONS   = YES\n\n# If you set the HAVE_DOT tag to YES then doxygen will assume the dot tool is\n# available from the path. This tool is part of Graphviz (see:\n# http://www.graphviz.org/), a graph visualization toolkit from AT&T and Lucent\n# Bell Labs. The other options in this section have no effect if this option is\n# set to NO.\n# The default value is: NO.\n\nHAVE_DOT               = NO\n\n# The DOT_NUM_THREADS specifies the number of dot invocations doxygen is allowed\n# to run in parallel. When set to 0 doxygen will base this on the number of\n# processors available in the system. 
You can set it explicitly to a value\n# larger than 0 to get control over the balance between CPU load and processing\n# speed.\n# Minimum value: 0, maximum value: 32, default value: 0.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nDOT_NUM_THREADS        = 0\n\n# When you want a differently looking font in the dot files that doxygen\n# generates you can specify the font name using DOT_FONTNAME. You need to make\n# sure dot is able to find the font, which can be done by putting it in a\n# standard location or by setting the DOTFONTPATH environment variable or by\n# setting DOT_FONTPATH to the directory containing the font.\n# The default value is: Helvetica.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nDOT_FONTNAME           = Helvetica\n\n# The DOT_FONTSIZE tag can be used to set the size (in points) of the font of\n# dot graphs.\n# Minimum value: 4, maximum value: 24, default value: 10.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nDOT_FONTSIZE           = 10\n\n# By default doxygen will tell dot to use the default font as specified with\n# DOT_FONTNAME. 
If you specify a different font using DOT_FONTNAME you can set\n# the path where dot can find it using this tag.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nDOT_FONTPATH           =\n\n# If the CLASS_GRAPH tag is set to YES then doxygen will generate a graph for\n# each documented class showing the direct and indirect inheritance relations.\n# Setting this tag to YES will force the CLASS_DIAGRAMS tag to NO.\n# The default value is: YES.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nCLASS_GRAPH            = YES\n\n# If the COLLABORATION_GRAPH tag is set to YES then doxygen will generate a\n# graph for each documented class showing the direct and indirect implementation\n# dependencies (inheritance, containment, and class references variables) of the\n# class with other documented classes.\n# The default value is: YES.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nCOLLABORATION_GRAPH    = YES\n\n# If the GROUP_GRAPHS tag is set to YES then doxygen will generate a graph for\n# groups, showing the direct groups dependencies.\n# The default value is: YES.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nGROUP_GRAPHS           = YES\n\n# If the UML_LOOK tag is set to YES, doxygen will generate inheritance and\n# collaboration diagrams in a style similar to the OMG's Unified Modeling\n# Language.\n# The default value is: NO.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nUML_LOOK               = NO\n\n# If the UML_LOOK tag is enabled, the fields and methods are shown inside the\n# class node. If there are many fields or methods and many nodes the graph may\n# become too big to be useful. The UML_LIMIT_NUM_FIELDS threshold limits the\n# number of items for each type to make the size more manageable. Set this to 0\n# for no limit. Note that the threshold may be exceeded by 50% before the limit\n# is enforced. 
So when you set the threshold to 10, up to 15 fields may appear,\n# but if the number exceeds 15, the total amount of fields shown is limited to\n# 10.\n# Minimum value: 0, maximum value: 100, default value: 10.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nUML_LIMIT_NUM_FIELDS   = 10\n\n# If the TEMPLATE_RELATIONS tag is set to YES then the inheritance and\n# collaboration graphs will show the relations between templates and their\n# instances.\n# The default value is: NO.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nTEMPLATE_RELATIONS     = NO\n\n# If the INCLUDE_GRAPH, ENABLE_PREPROCESSING and SEARCH_INCLUDES tags are set to\n# YES then doxygen will generate a graph for each documented file showing the\n# direct and indirect include dependencies of the file with other documented\n# files.\n# The default value is: YES.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nINCLUDE_GRAPH          = YES\n\n# If the INCLUDED_BY_GRAPH, ENABLE_PREPROCESSING and SEARCH_INCLUDES tags are\n# set to YES then doxygen will generate a graph for each documented file showing\n# the direct and indirect include dependencies of the file with other documented\n# files.\n# The default value is: YES.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nINCLUDED_BY_GRAPH      = YES\n\n# If the CALL_GRAPH tag is set to YES then doxygen will generate a call\n# dependency graph for every global function or class method.\n#\n# Note that enabling this option will significantly increase the time of a run.\n# So in most cases it will be better to enable call graphs for selected\n# functions only using the \\callgraph command. 
Disabling a call graph can be\n# accomplished by means of the command \\hidecallgraph.\n# The default value is: NO.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nCALL_GRAPH             = NO\n\n# If the CALLER_GRAPH tag is set to YES then doxygen will generate a caller\n# dependency graph for every global function or class method.\n#\n# Note that enabling this option will significantly increase the time of a run.\n# So in most cases it will be better to enable caller graphs for selected\n# functions only using the \\callergraph command. Disabling a caller graph can be\n# accomplished by means of the command \\hidecallergraph.\n# The default value is: NO.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nCALLER_GRAPH           = NO\n\n# If the GRAPHICAL_HIERARCHY tag is set to YES then doxygen will generate a\n# graphical hierarchy of all classes instead of a textual one.\n# The default value is: YES.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nGRAPHICAL_HIERARCHY    = YES\n\n# If the DIRECTORY_GRAPH tag is set to YES then doxygen will show the\n# dependencies a directory has on other directories in a graphical way. The\n# dependency relations are determined by the #include relations between the\n# files in the directories.\n# The default value is: YES.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nDIRECTORY_GRAPH        = YES\n\n# The DOT_IMAGE_FORMAT tag can be used to set the image format of the images\n# generated by dot. 
For an explanation of the image formats see the section\n# output formats in the documentation of the dot tool (Graphviz (see:\n# http://www.graphviz.org/)).\n# Note: If you choose svg you need to set HTML_FILE_EXTENSION to xhtml in order\n# to make the SVG files visible in IE 9+ (other browsers do not have this\n# requirement).\n# Possible values are: png, jpg, gif, svg, png:gd, png:gd:gd, png:cairo,\n# png:cairo:gd, png:cairo:cairo, png:cairo:gdiplus, png:gdiplus and\n# png:gdiplus:gdiplus.\n# The default value is: png.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nDOT_IMAGE_FORMAT       = png\n\n# If DOT_IMAGE_FORMAT is set to svg, then this option can be set to YES to\n# enable generation of interactive SVG images that allow zooming and panning.\n#\n# Note that this requires a modern browser other than Internet Explorer. Tested\n# and working are Firefox, Chrome, Safari, and Opera.\n# Note: For IE 9+ you need to set HTML_FILE_EXTENSION to xhtml in order to make\n# the SVG files visible. Older versions of IE do not have SVG support.\n# The default value is: NO.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nINTERACTIVE_SVG        = NO\n\n# The DOT_PATH tag can be used to specify the path where the dot tool can be\n# found. 
If left blank, it is assumed the dot tool can be found in the path.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nDOT_PATH               =\n\n# The DOTFILE_DIRS tag can be used to specify one or more directories that\n# contain dot files that are included in the documentation (see the \\dotfile\n# command).\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nDOTFILE_DIRS           =\n\n# The MSCFILE_DIRS tag can be used to specify one or more directories that\n# contain msc files that are included in the documentation (see the \\mscfile\n# command).\n\nMSCFILE_DIRS           =\n\n# The DIAFILE_DIRS tag can be used to specify one or more directories that\n# contain dia files that are included in the documentation (see the \\diafile\n# command).\n\nDIAFILE_DIRS           =\n\n# When using plantuml, the PLANTUML_JAR_PATH tag should be used to specify the\n# path where java can find the plantuml.jar file. If left blank, it is assumed\n# PlantUML is not used or called during a preprocessing step. Doxygen will\n# generate a warning when it encounters a \\startuml command in this case and\n# will not generate output for the diagram.\n\nPLANTUML_JAR_PATH      =\n\n# When using plantuml, the PLANTUML_CFG_FILE tag can be used to specify a\n# configuration file for plantuml.\n\nPLANTUML_CFG_FILE      =\n\n# When using plantuml, the specified paths are searched for files specified by\n# the !include statement in a plantuml block.\n\nPLANTUML_INCLUDE_PATH  =\n\n# The DOT_GRAPH_MAX_NODES tag can be used to set the maximum number of nodes\n# that will be shown in the graph. If the number of nodes in a graph becomes\n# larger than this value, doxygen will truncate the graph, which is visualized\n# by representing a node as a red box. Note that doxygen if the number of direct\n# children of the root node in a graph is already larger than\n# DOT_GRAPH_MAX_NODES then the graph will not be shown at all. 
Also note that\n# the size of a graph can be further restricted by MAX_DOT_GRAPH_DEPTH.\n# Minimum value: 0, maximum value: 10000, default value: 50.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nDOT_GRAPH_MAX_NODES    = 50\n\n# The MAX_DOT_GRAPH_DEPTH tag can be used to set the maximum depth of the graphs\n# generated by dot. A depth value of 3 means that only nodes reachable from the\n# root by following a path via at most 3 edges will be shown. Nodes that lay\n# further from the root node will be omitted. Note that setting this option to 1\n# or 2 may greatly reduce the computation time needed for large code bases. Also\n# note that the size of a graph can be further restricted by\n# DOT_GRAPH_MAX_NODES. Using a depth of 0 means no depth restriction.\n# Minimum value: 0, maximum value: 1000, default value: 0.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nMAX_DOT_GRAPH_DEPTH    = 0\n\n# Set the DOT_TRANSPARENT tag to YES to generate images with a transparent\n# background. This is disabled by default, because dot on Windows does not seem\n# to support this out of the box.\n#\n# Warning: Depending on the platform used, enabling this option may lead to\n# badly anti-aliased labels on the edges of a graph (i.e. they become hard to\n# read).\n# The default value is: NO.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nDOT_TRANSPARENT        = NO\n\n# Set the DOT_MULTI_TARGETS tag to YES to allow dot to generate multiple output\n# files in one run (i.e. multiple -o and -T options on the command line). 
This\n# makes dot run faster, but since only newer versions of dot (>1.8.10) support\n# this, this feature is disabled by default.\n# The default value is: NO.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nDOT_MULTI_TARGETS      = NO\n\n# If the GENERATE_LEGEND tag is set to YES doxygen will generate a legend page\n# explaining the meaning of the various boxes and arrows in the dot generated\n# graphs.\n# The default value is: YES.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nGENERATE_LEGEND        = YES\n\n# If the DOT_CLEANUP tag is set to YES, doxygen will remove the intermediate dot\n# files that are used to generate the various graphs.\n# The default value is: YES.\n# This tag requires that the tag HAVE_DOT is set to YES.\n\nDOT_CLEANUP            = YES\n\n"
  },
  {
    "path": "docs/DoxygenLayout.xml",
    "content": "<doxygenlayout version=\"1.0\">\n  <!-- Generated by doxygen 1.8.15 -->\n  <!-- adapted to doxygen 1.8.13 (Katta) -->\n  <!-- Navigation index tabs for HTML output -->\n  <navindex>\n    <tab type=\"mainpage\" visible=\"yes\" title=\"Overview\"/>\n    <!-- <tab type=\"pages\" visible=\"yes\" title=\"Tutorials\" intro=\"\"/> -->\n    <tab type=\"modules\" visible=\"yes\" title=\"Components\" intro=\"\"/>\n    <tab type=\"namespaces\" visible=\"yes\" title=\"\">\n      <tab type=\"namespacelist\" visible=\"yes\" title=\"\" intro=\"\"/>\n      <tab type=\"namespacemembers\" visible=\"yes\" title=\"\" intro=\"\"/>\n    </tab>\n    <tab type=\"classes\" visible=\"yes\" title=\"\">\n      <tab type=\"classlist\" visible=\"yes\" title=\"\" intro=\"\"/>\n      <tab type=\"classindex\" visible=\"$ALPHABETICAL_INDEX\" title=\"\"/>\n      <tab type=\"hierarchy\" visible=\"yes\" title=\"\" intro=\"\"/>\n      <tab type=\"classmembers\" visible=\"yes\" title=\"\" intro=\"\"/>\n    </tab>\n    <tab type=\"files\" visible=\"yes\" title=\"\">\n      <tab type=\"filelist\" visible=\"yes\" title=\"\" intro=\"\"/>\n      <tab type=\"globals\" visible=\"yes\" title=\"\" intro=\"\"/>\n    </tab>\n    <tab type=\"examples\" visible=\"yes\" title=\"\" intro=\"\"/>\n  </navindex>\n\n  <!-- Layout definition for a class page -->\n  <class>\n    <briefdescription visible=\"yes\"/>\n    <detaileddescription title=\"\"/>\n    <includes visible=\"$SHOW_INCLUDE_FILES\"/>\n    <inheritancegraph visible=\"$CLASS_GRAPH\"/>\n    <collaborationgraph visible=\"$COLLABORATION_GRAPH\"/>\n    <memberdecl>\n      <nestedclasses visible=\"yes\" title=\"\"/>\n      <publictypes title=\"\"/>\n      <services title=\"\"/>\n      <interfaces title=\"\"/>\n      <publicslots title=\"\"/>\n      <signals title=\"\"/>\n      <publicmethods title=\"\"/>\n      <publicstaticmethods title=\"\"/>\n      <publicattributes title=\"\"/>\n      <publicstaticattributes title=\"\"/>\n      <protectedtypes 
title=\"\"/>\n      <protectedslots title=\"\"/>\n      <protectedmethods title=\"\"/>\n      <protectedstaticmethods title=\"\"/>\n      <protectedattributes title=\"\"/>\n      <protectedstaticattributes title=\"\"/>\n      <packagetypes title=\"\"/>\n      <packagemethods title=\"\"/>\n      <packagestaticmethods title=\"\"/>\n      <packageattributes title=\"\"/>\n      <packagestaticattributes title=\"\"/>\n      <properties title=\"\"/>\n      <events title=\"\"/>\n      <privatetypes title=\"\"/>\n      <privateslots title=\"\"/>\n      <privatemethods title=\"\"/>\n      <privatestaticmethods title=\"\"/>\n      <privateattributes title=\"\"/>\n      <privatestaticattributes title=\"\"/>\n      <friends title=\"\"/>\n      <related title=\"\" subtitle=\"\"/>\n      <membergroups visible=\"yes\"/>\n    </memberdecl>\n    <memberdef>\n      <inlineclasses title=\"\"/>\n      <typedefs title=\"\"/>\n      <enums title=\"\"/>\n      <services title=\"\"/>\n      <interfaces title=\"\"/>\n      <constructors title=\"\"/>\n      <functions title=\"\"/>\n      <related title=\"\"/>\n      <variables title=\"\"/>\n      <properties title=\"\"/>\n      <events title=\"\"/>\n    </memberdef>\n    <allmemberslink visible=\"yes\"/>\n    <usedfiles visible=\"$SHOW_USED_FILES\"/>\n    <authorsection visible=\"yes\"/>\n  </class>\n\n  <!-- Layout definition for a namespace page -->\n  <namespace>\n    <briefdescription visible=\"yes\"/>\n    <detaileddescription title=\"\"/>\n    <memberdecl>\n      <nestednamespaces visible=\"yes\" title=\"\"/>\n      <constantgroups visible=\"yes\" title=\"\"/>\n      <classes visible=\"yes\" title=\"\"/>\n      <typedefs title=\"\"/>\n      <enums title=\"\"/>\n      <functions title=\"\"/>\n      <variables title=\"\"/>\n      <membergroups visible=\"yes\"/>\n    </memberdecl>\n    <memberdef>\n      <inlineclasses title=\"\"/>\n      <typedefs title=\"\"/>\n      <enums title=\"\"/>\n      <functions title=\"\"/>\n      <variables 
title=\"\"/>\n    </memberdef>\n    <authorsection visible=\"yes\"/>\n  </namespace>\n\n  <!-- Layout definition for a file page -->\n  <file>\n    <briefdescription visible=\"yes\"/>\n    <detaileddescription title=\"\"/>\n    <includes visible=\"$SHOW_INCLUDE_FILES\"/>\n    <includegraph visible=\"$INCLUDE_GRAPH\"/>\n    <includedbygraph visible=\"$INCLUDED_BY_GRAPH\"/>\n    <sourcelink visible=\"yes\"/>\n    <memberdecl>\n      <classes visible=\"yes\" title=\"\"/>\n      <namespaces visible=\"yes\" title=\"\"/>\n      <constantgroups visible=\"yes\" title=\"\"/>\n      <defines title=\"\"/>\n      <typedefs title=\"\"/>\n      <enums title=\"\"/>\n      <functions title=\"\"/>\n      <variables title=\"\"/>\n      <membergroups visible=\"yes\"/>\n    </memberdecl>\n    <memberdef>\n      <inlineclasses title=\"\"/>\n      <defines title=\"\"/>\n      <typedefs title=\"\"/>\n      <enums title=\"\"/>\n      <functions title=\"\"/>\n      <variables title=\"\"/>\n    </memberdef>\n    <authorsection/>\n  </file>\n\n  <!-- Layout definition for a group page -->\n  <group>\n    <briefdescription visible=\"yes\"/>\n    <detaileddescription title=\"\"/>\n    <groupgraph visible=\"$GROUP_GRAPHS\"/>\n    <memberdecl>\n      <nestedgroups visible=\"yes\" title=\"\"/>\n      <dirs visible=\"yes\" title=\"\"/>\n      <files visible=\"yes\" title=\"\"/>\n      <namespaces visible=\"yes\" title=\"\"/>\n      <classes visible=\"yes\" title=\"\"/>\n      <defines title=\"\"/>\n      <typedefs title=\"\"/>\n      <enums title=\"\"/>\n      <enumvalues title=\"\"/>\n      <functions title=\"\"/>\n      <variables title=\"\"/>\n      <signals title=\"\"/>\n      <publicslots title=\"\"/>\n      <protectedslots title=\"\"/>\n      <privateslots title=\"\"/>\n      <events title=\"\"/>\n      <properties title=\"\"/>\n      <friends title=\"\"/>\n      <membergroups visible=\"yes\"/>\n    </memberdecl>\n    <memberdef>\n      <pagedocs/>\n      <inlineclasses title=\"\"/>\n      
<defines title=\"\"/>\n      <typedefs title=\"\"/>\n      <enums title=\"\"/>\n      <enumvalues title=\"\"/>\n      <functions title=\"\"/>\n      <variables title=\"\"/>\n      <signals title=\"\"/>\n      <publicslots title=\"\"/>\n      <protectedslots title=\"\"/>\n      <privateslots title=\"\"/>\n      <events title=\"\"/>\n      <properties title=\"\"/>\n      <friends title=\"\"/>\n    </memberdef>\n    <authorsection visible=\"yes\"/>\n  </group>\n\n  <!-- Layout definition for a directory page -->\n  <directory>\n    <briefdescription visible=\"yes\"/>\n    <detaileddescription title=\"\"/>\n    <directorygraph visible=\"yes\"/>\n    <memberdecl>\n      <dirs visible=\"yes\"/>\n      <files visible=\"yes\"/>\n    </memberdecl>\n  </directory>\n</doxygenlayout>\n"
  },
  {
    "path": "docs/README.md",
    "content": "## CoreNEURON Documentation\n\n### Local build\n\nIt is recommended using a `virtualenv`, for example:\n\n```\npip3 install virtualenv\npython3 -m virtualenv venv\nsource venv/bin/activate\n```\n\nIn order to build documentation locally, you need to pip install the [docs_requirements](docs_requirements.txt) :\n```\npip3 install --user -r docs/docs_requirements.txt --upgrade\n```\n\nThen in your CMake build folder:\n```\nmake docs\n```  \nThat will build everything in the `build/docs` folder and you can then open `index.html` locally.\n\nWhen working locally on documentation, be aware of the following targets to speed up building process:\n\n* `doxygen` - build the API documentation only\n* `sphinx` - build Sphinx documentation\n\n"
  },
  {
    "path": "docs/_static/custom.css",
    "content": ".wy-nav-content {\n    max-width: 1000px;\n    margin-right: auto;\n}\n\n#notebook-container {\n    width: inherit;\n}\n"
  },
  {
    "path": "docs/conda_environment.yml",
    "content": "name: base\nchannels:\n  - conda-forge\n  - defaults\ndependencies:\n  - bison\n  - cmake\n  - doxygen\n"
  },
  {
    "path": "docs/conf.py",
    "content": "# Configuration file for the Sphinx documentation builder.\n#\n# This file only contains a selection of the most common options. For a full\n# list see the documentation:\n# https://www.sphinx-doc.org/en/master/usage/configuration.html\n\n# -- Path setup --------------------------------------------------------------\n\n# If extensions (or modules to document with autodoc) are in another directory,\n# add these directories to sys.path here. If the directory is relative to the\n# documentation root, use os.path.abspath to make it absolute, like shown here.\n#\n# import os\n# import sys\n# sys.path.insert(0, os.path.abspath('.'))\n\n\n# -- Project information -----------------------------------------------------\n\nproject = 'CoreNEURON'\ncopyright = 'Duke, Yale, and the BlueBrain Project -- Copyright 1984-2020'\nauthor = 'Michael Hines and the BlueBrain Project'\n\n\n# -- General configuration ---------------------------------------------------\n\n# Add any Sphinx extension module names here, as strings. They can be\n# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom\n# ones.\nextensions = [\n    'sphinx.ext.autodoc',\n    'sphinx.ext.autosummary',\n    'sphinx.ext.autosectionlabel',\n    'recommonmark',\n    'sphinx.ext.mathjax'\n]\n\nsource_suffix = {\n    '.rst': 'restructuredtext',\n    '.txt': 'markdown',\n    '.md': 'markdown',\n}\n\n# Add any paths that contain templates here, relative to this directory.\n# templates_path = ['_templates']\n\n# List of patterns, relative to source directory, that match files and\n# directories to ignore when looking for source files.\n# This pattern also affects html_static_path and html_extra_path.\nexclude_patterns = ['_build', 'Thumbs.db', '.DS_Store', 'python/venv']\n\n\n# -- Options for HTML output -------------------------------------------------\n\n# The theme to use for HTML and HTML Help pages.  
See the documentation for\n# a list of builtin themes.\n#\nhtml_theme = 'sphinx_rtd_theme'\n\n# Sphinx expects the master doc to be contents\nmaster_doc = 'index'\n\n# Add any paths that contain custom static files (such as style sheets) here,\n# relative to this directory. They are copied after the builtin static files,\n# so a file named \"default.css\" will overwrite the builtin \"default.css\".\nhtml_static_path = ['_static']\n\nhtml_css_files = [\n    'custom.css',\n]\n\nnbsphinx_allow_errors = True\n\nimport os\nif os.environ.get(\"READTHEDOCS\"):\n    os.system(\"rm -rf BUILD && mkdir BUILD && cd BUILD && cmake -DCORENRN_ENABLE_MPI=OFF ../.. && make doxygen\")\n    html_extra_path = ['BUILD/docs']\n"
  },
  {
    "path": "docs/docs_requirements.txt",
    "content": "sphinx\nsphinx_rtd_theme\nrecommonmark"
  },
  {
    "path": "docs/doxygen.rst",
    "content": "C++ API\n===========\n\nLink to doxygen `C++ API`_ \n\n.. _C++ API: doxygen/index.html\n"
  },
  {
    "path": "docs/footer.html",
    "content": "<!-- HTML footer for doxygen 1.8.15-->\n<!-- start footer part -->\n<!--BEGIN GENERATE_TREEVIEW-->\n<div id=\"nav-path\" class=\"navpath\">\n  <ul>\n    $navpath\n  </ul>\n</div>\n<hr class=\"footer\"/>\n<address class=\"footer\">\n    <small>\n    </small>\n</address>\n<!--END !GENERATE_TREEVIEW-->\n</body>\n</html>\n"
  },
  {
    "path": "docs/index.rst",
    "content": "Welcome to CoreNEURON's documentation!\n==================================\n\n.. toctree::\n   :maxdepth: 2\n   :caption: User documentation:\n\n   userdoc/BinaryFormat/BinaryFormat.md\n   userdoc/MemoryManagement/bbcorepointer.md\n\n.. toctree::\n   :maxdepth: 2\n   :caption: Developer documentation:\n\n   doxygen\n\nIndices and tables\n==================\n\n* :ref:`genindex`\n* :ref:`modindex`\n* :ref:`search`\n"
  },
  {
    "path": "docs/userdoc/BinaryFormat/BinaryFormat.md",
    "content": "## CoreNEURON Input Binary File Format\n\nNEURON is used for building in-memory model of the network. The in-memory representation of model is then dumped to binary files and read by CoreNEURON. The abstract structure of these binary files is shown : ![Binary File Format](binary_file_format.jpg).\n\n> Note : additional datasets are being added for additional functionality (e.g. Gap Junctions). This dcoumentation / format will be updated in the future.\n"
  },
  {
    "path": "docs/userdoc/MemoryManagement/bbcorepointer.md",
    "content": "\n## Transferring dynamically allocated data between NEURON and CoreNEURON\n\n\nUser-allocated data can be managed in NMODL using the `POINTER` type. It allows the\nprogrammer to reference data that has been allocated in HOC or in VERBATIM blocks. This\nallows for more advanced data-structures that are not natively supported in NMODL.\n\nSince NEURON itself has no knowledge of the layout and size of this data it cannot\ntransfer `POINTER` data automatically to CoreNEURON. Furtheremore, in many cases there\nis no need to transfer the data between the two instances. In some cases, however, the\nprogrammer would like to transfer certain user-defined data into CoreNEURON. The most\nprominent example are random123 RNG stream parameters used in synapse mechanisms. To\nsupport this use-case the `BBCOREPOINTER` type was introduced. Variables that are declared as\n`BBCOREPOINTER` behave exactly the same as `POINTER` but are additionally taken into account\nwhen NEURON is serializing mechanism data (for file writing or direct-memory transfer).\nFor NEURON to be able to write (and indeed CoreNEURON to be able to read) `BBCOREPOINTER`\ndata, the programmer has to additionally provide two C functions that are called as part\nof the serialization/deserialization.\n\n```\nstatic void bbcore_write(double* x, int* d, int* d_offset, int* x_offset, _threadargsproto_);\n\nstatic void bbcore_read(double* x, int* d, int* d_offset, int* x_offset, _threadargsproto_);\n```\n\nThe implementation of `bbcore_write` and `bbcore_read` determines the serialization and\ndeserialization of the per-instance mechanism data referenced through the various\n`BBCOREPOINTER`s.\n\nNEURON will call `bbcore_write` twice per mechanism instance. In a first sweep, the call is used to\ndetermine the required memory to be allocated on the serialization arrays. 
In the second sweep, the\ncall is used to fill in the data per mechanism instance.\n\nThe functions take the following arguments:\n\n* `x`: A `double` type array that will be allocated by NEURON to fill with real-valued data. In the\n  first call, `x` is NULL as it has not been allocated yet.\n* `d`: An `int` type array that will be allocated by NEURON to fill with integer-valued data. In the\n  first call, `d` is NULL as it has not been allocated yet.\n* `x_offset`: The offset in `x` at which the mechanism instance should write its real-valued\n  `BBCOREPOINTER` data. In the first call this is an output argument that is expected to be incremented\n  by the per-instance size to be allocated.\n* `d_offset`: The offset in `d` at which the mechanism instance should write its integer-valued\n  `BBCOREPOINTER` data. In the first call this is an output argument that is expected to be incremented\n  by the per-instance size to be allocated.\n* `_threadargsproto_`: a macro placeholder for NEURON/CoreNEURON data-structure parameters. They\n  are typically only used through generated defines and not by the programmer. 
The macro is defined\n  as follows:\n\n```\n#define _threadargsproto_                                                                         \\\n    int _iml, int _cntml_padded, double *_p, Datum *_ppvar, ThreadDatum *_thread, NrnThread *_nt, \\\n    double _v\n```\n\nPutting all of this together, the following is a minimal MOD using BBCOREPOINTER:\n\n```\nTITLE A BBCOREPOINTER Example \n\nNEURON {\n    BBCOREPOINTER my_data\n}\n\nASSIGNED {\n    my_data\n}\n\n: Do something interesting with my_data ...\n\nVERBATIM\nstatic void bbcore_write(double* x, int* d, int* x_offset, int* d_offset, _threadargsproto_) {\n    if (x) {\n        double* x_i = x + *x_offset;\n        x_i[0] = _p_my_data[0];\n        x_i[1] = _p_my_data[1];\n    }\n    *x_offset += 2; // reserve 2 doubles on serialization buffer x\n}\n\nstatic void bbcore_read(double* x, int* d, int* x_offset, int* d_offset, _threadargsproto_) {\n    assert(!_p_my_data);\n    double* x_i = x + *x_offset;\n    // my_data needs to be allocated somehow\n    _p_my_data = (double*)malloc(sizeof(double)*2); \n    _p_my_data[0] = x_i[0];\n    _p_my_data[1] = x_i[1];\n    *x_offset += 2;\n}\nENDVERBATIM\n```\n\n"
  },
  {
    "path": "extra/CMakeLists.txt",
    "content": "# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\n# =============================================================================\n# Copy first into build directory as it will be used for special-core\n# =============================================================================\nconfigure_file(nrnivmodl_core_makefile.in\n               ${CMAKE_BINARY_DIR}/share/coreneuron/nrnivmodl_core_makefile @ONLY)\nconfigure_file(nrnivmodl-core.in ${CMAKE_BINARY_DIR}/bin/nrnivmodl-core @ONLY)\n# nrnivmodl-core depends on the building of NMODL_TARGET_TO_DEPEND and the configuration of the\n# nrnivmodl-core and nrnivmodl_core_makefile this doesn't imply that whenever there is a change in\n# one of those files then the prebuilt mod files are going to be rebuilt\nadd_custom_target(\n  nrnivmodl-core ALL\n  DEPENDS ${CMAKE_BINARY_DIR}/bin/nrnivmodl-core\n          ${CMAKE_BINARY_DIR}/share/coreneuron/nrnivmodl_core_makefile ${NMODL_TARGET_TO_DEPEND})\n\n# =============================================================================\n# Install for end users\n# =============================================================================\ninstall(FILES ${CMAKE_BINARY_DIR}/share/coreneuron/nrnivmodl_core_makefile\n        DESTINATION share/coreneuron)\ninstall(PROGRAMS ${CMAKE_BINARY_DIR}/bin/nrnivmodl-core DESTINATION bin)\n"
  },
  {
    "path": "extra/instrumentation.tau",
    "content": "BEGIN_INCLUDE_LIST\n double nrnmpi_dbl_allreduce(double, int)\n int coreneuron::main(int, char **, char **)\n int coreneuron::nrnmpi_bgp_conserve(int, int)\n int coreneuron::nrnmpi_bgp_single_advance(NRNMPI_Spike *)\n int coreneuron::nrnmpi_spike_exchange(int*, NRNMPI_Spike*)\n int main(int, char **, char **)\n size_t nrnbbcore_write()\n void coreneuron::*nrn_fixed_step_group_thread(coreneuron::NrnThread *)\n void coreneuron::*nrn_fixed_step_lastpart(coreneuron::NrnThread *)\n void coreneuron::*nrn_fixed_step_thread(coreneuron::NrnThread *)\n void coreneuron::*nrn_ms_bksub(coreneuron::NrnThread *)\n void coreneuron::*nrn_ms_bksub_through_triang(coreneuron::NrnThread *)\n void coreneuron::*nrn_ms_reduce_solve(coreneuron::NrnThread *)\n void coreneuron::*nrn_ms_treeset_through_triang(coreneuron::NrnThread *)\n void coreneuron::*setup_tree_matrix(coreneuron::NrnThread *)\n void coreneuron::*setup_tree_matrix_minimal(coreneuron::NrnThread *)\n void coreneuron::BBS::netpar_solve(double)\n void coreneuron::BBS_netpar_solve(double)\n void coreneuron::NetParEvent::deliver(double, NetCvode *, coreneuron::NrnThread *)\n void coreneuron::NetParEvent::send(double, NetCvode *, coreneuron::NrnThread *)\n void coreneuron::_nrn_cur#(coreneuron::NrnThread *, coreneuron::Memb_list *, int)\n void coreneuron::_nrn_jacob#(coreneuron::NrnThread *, coreneuron::Memb_list *, int)\n void coreneuron::_nrn_state#(coreneuron::NrnThread *, coreneuron::Memb_list *, int)\n void coreneuron::all_wait_for_spike_exchange()\n void coreneuron::bksub(coreneuron::NrnThread *)\n void coreneuron::deliver_net_events(coreneuron::NrnThread *)\n void coreneuron::determine_inputpresyn()\n void coreneuron::finitialize(void)\n void coreneuron::ncs2nrn_integrate(double)\n void coreneuron::nonvint(coreneuron::NrnThread *)\n void coreneuron::nrn2ncs_outputevent(int, double)\n void coreneuron::nrn_cap_jacob(coreneuron::NrnThread *, Memb_list *)\n void coreneuron::nrn_cleanup_presyn(PreSyn *)\n void 
coreneuron::nrn_deliver_events(coreneuron::NrnThread *)\n void coreneuron::nrn_finitialize(int, double)\n void coreneuron::nrn_fixed_step_group(int)\n void coreneuron::nrn_fixed_step_group_minimal(int)\n void coreneuron::nrn_fixed_single_steps_minimal(int, double)\n void coreneuron::nrn_flush_reports(double)\n void coreneuron::nrn_lhs(coreneuron::NrnThread *)\n void coreneuron::nrn_multithread_job(void *(*)(coreneuron::NrnThread *))\n void coreneuron::nrn_promote()\n void coreneuron::nrn_rhs(coreneuron::NrnThread *)\n void coreneuron::nrn_setup(const char *, const char *, int, int)\n void coreneuron::nrn_solve(coreneuron::NrnThread *)\n void coreneuron::nrn_solve_minimal(coreneuron::NrnThread *)\n void coreneuron::nrn_spike_exchange(coreneuron::NrnThread *)\n void coreneuron::nrn_spike_exchange_init()\n void coreneuron::nrnmpi_barrier()\n void coreneuron::nrnmpi_bgp_multisend(NRNMPI_Spike *, int, int *)\n void coreneuron::nrnmpi_int_gather(int *, int *, int, int)\n void coreneuron::nrnmpi_int_gatherv(int *, int, int *, int *, int *, int)\n void coreneuron::nrnmpi_postrecv_doubles(double *, int, int, int, void **)\n void coreneuron::nrnmpi_send_doubles(double *, int, int, int)\n void coreneuron::nrnmpi_spike_initialize()\n void coreneuron::nrnmpi_wait(void **)\n void coreneuron::output_spikes(const char *)\n void coreneuron::output_spikes_parallel(const char *)\n void coreneuron::read_phase1(data_reader &, coreneuron::NrnThread &)\n void coreneuron::read_phase2(data_reader &, coreneuron::NrnThread &)\n void coreneuron::setup_report_engine(double, double)\n void coreneuron::solve_interleaved1(int)\n void coreneuron::triang(coreneuron::NrnThread *)\n void coreneuron::triang_interleaved(coreneuron::NrnThread *, int, int, int, int *, int *)\n void coreneuron::update(coreneuron::NrnThread *)\n void coreneuron::write_checkpoint(coreneuron::NrnThread *, int, const char *, bool)\n void coreneuron::write_checkpoint(coreneuron::NrnThread *, int, const char*, bool)\n void 
coreneuron::write_nrnthread(const char *, coreneuron::NrnThread &, nrncore_CellGroup &)\n void coreneuron::write_nrnthread_task(const char *, nrncore_CellGroup *)\nEND_INCLUDE_LIST\n"
  },
  {
    "path": "extra/nrnivmodl-core.in",
    "content": "#!/bin/bash\n\n# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\nset -e\n\n# TODO : mod2c_core can be linked with (HPE-)MPI library\n# and running that under slurm allocation result into\n# runtime error. For now, unset PMI_RANK variable\n# which is sufficint to avoid issue with HPE-MPI+SLURM.\nunset PMI_RANK\n\n# name of the script\nAPP_NAME=\"$(basename \"$0\")\"\n\n# directory and parent directory of this script\nPARENT_DIR=\"$(dirname \"$BASH_SOURCE\")/..\"\n\n# prefer perl exe set by neuron wrappers in case of wheel\nPERL_EXE=\"${CORENRN_PERLEXE:-@PERL_EXECUTABLE@}\"\n# in case of mac installer, wrapper is not used and hence\n# check if binary exist. otherwise, just rely on perl being\n# in default $PATH\nif [ ! -f \"${PERL_EXE}\" ]; then PERL_EXE=\"$(which perl)\"; fi\n\nROOT_DIR=\"$(\"${PERL_EXE}\" -e \"use Cwd 'abs_path'; print abs_path('$PARENT_DIR')\")\"\n\n# default arguments : number of parallel builds and default mod file path\nPARALLEL_BUILDS=4\nparams_MODS_PATH=\".\"\nparams_BUILD_TYPE=\"@COMPILE_LIBRARY_TYPE@\"\nparams_NRN_PRCELLSTATE=\"@CORENRN_NRN_PRCELLSTATE@\"\n\n# prefix for common options : make sure to rename these if options are changed.\nMAKE_OPTIONS=\"MECHLIB_SUFFIX MOD2CPP_BINARY MOD2CPP_RUNTIME_FLAGS DESTDIR INCFLAGS LINKFLAGS MODS_PATH VERBOSE BUILD_TYPE NRN_PRCELLSTATE\"\n\n# parse CLI args\nwhile getopts \"n:m:a:d:i:l:Vp:r:b:h\" OPT; do\n    case \"$OPT\" in\n    n)\n        # suffix for mechanism library\n        params_MECHLIB_SUFFIX=\"$OPTARG\";;\n    m)\n        # nmodl or mod2c binary to use\n        params_MOD2CPP_BINARY=\"$OPTARG\";;\n    a)\n        # additional nmodl flags to be used\n        params_MOD2CPP_RUNTIME_FLAGS=\"$OPTARG\";;\n    d)\n        # destination install directory\n        
params_DESTDIR=\"$OPTARG\";;\n    i)\n        # extra include flags\n        params_INCFLAGS=\"$OPTARG\";;\n    l)\n        # extra link flags\n        params_LINKFLAGS=\"$OPTARG\";;\n    V)\n        # verbose make output\n        params_VERBOSE=1;;\n    p)\n        # option for parallel build (with -j)\n        PARALLEL_BUILDS=\"$OPTARG\";;\n    b)\n        # libcorenrnmech library type (STATIC or SHARED)\n        params_BUILD_TYPE=\"$OPTARG\";;\n    r)\n        # enable NRN_PRCELLSTATE mechanism\n        params_NRN_PRCELLSTATE=\"$OPTARG\";;\n    h)\n        echo \"$APP_NAME [options, ...] [mods_path]\"\n        echo \"Options:\"\n        echo \"  -n <name>                 The model name, used as a suffix in the shared library\"\n        echo \"  -m <nmodl_bin>            NMODL/mod2c code generation compiler path\"\n        echo \"  -a <nmodl_runtime_flags>  Runtime flags for NMODL/mod2c\"\n        echo \"  -i <incl_flags>           Definitions passed to the compiler, typically '-I dir..'\"\n        echo \"  -l <link_flags>           Definitions passed to the linker, typically '-Lx -lylib..'\"\n        echo \"  -d <dest_dir>             Install to dest_dir. Default: Off.\"\n        echo \"  -r <0|1>                  Enable NRN_PRCELLSTATE mechanism. Default: @CORENRN_NRN_PRCELLSTATE@.\"\n        echo \"  -V                        Verbose: show commands executed by make\"\n        echo \"  -p <n_procs>              Number of parallel builds (Default: $PARALLEL_BUILDS)\"\n        echo \"  -b <STATIC|SHARED>        libcorenrnmech library type\"\n        exit 0;;\n    ?)\n        exit 1;;\n    esac\ndone\n\n# drop the parsed options, leaving only positional arguments\nshift $(($OPTIND - 1))\n\n# only one mod files directory is supported in neuron and coreneuron\nif [ $# -gt 1 ]; then\n    echo \"[ERROR] $APP_NAME expects at most one mod dir. 
See syntax: '$APP_NAME -h' \"\n    exit 1\nfi\n\n# if given, the mods dir is in $1\nif [ $# -eq 1 ]; then\n    params_MODS_PATH=\"$1\"\nfi\n\nshopt -s nullglob\n# warn if no mod files provided\nif [ -d \"$params_MODS_PATH\" ]; then\n    files=( \"$params_MODS_PATH\"/*.mod )\n    # ${#files[@]} is the number of matched files (empty array under nullglob)\n    if [ ${#files[@]} -eq 0 ]; then\n        echo \"WARNING: No mod files found in '$(realpath \"${params_MODS_PATH}\")', compiling default ones only!\"\n    fi\nelse\n    echo \"FATAL: Invalid mods directory: '$params_MODS_PATH'\"\n    exit 1\nfi\n\n# temporary directory where mod files will be copied\ntemp_mod_dir=\"@CMAKE_HOST_SYSTEM_PROCESSOR@/corenrn/mod2c\"\nmkdir -p \"$temp_mod_dir\"\n\n# copy mod files with include files. note that ${ROOT_DIR}/share\n# has inbuilt mod files and user provided mod files are in $params_MODS_PATH.\nset +e\nfor mod_dir in \"${ROOT_DIR}/share/modfile\" \"$params_MODS_PATH\" ;\ndo\n    # copy mod files and include files\n    files=( \"$mod_dir/\"*.mod \"$mod_dir/\"*.inc \"$mod_dir/\"*.h* )\n    for f in \"${files[@]}\";\n    do\n        # copy a file only if it has changed (to avoid rebuilds)\n        target_file_path=\"$temp_mod_dir/$(basename \"$f\")\"\n        if ! 
diff -q \"$f\" \"$target_file_path\" &>/dev/null;  then\n            cp \"$f\" \"$target_file_path\"\n        fi\n    done\ndone\nset -e\n\n# use new mod files directory for compilation\nparams_MODS_PATH=\"$temp_mod_dir\"\n\n# build params to make command\nmake_params=(\"ROOT=${ROOT_DIR}\")\nfor param in $MAKE_OPTIONS; do\n    var=\"params_${param}\"\n    if [ \"${!var+x}\" ]; then\n        make_params+=(\"$param=${!var}\")\n    fi\ndone\n\n# if -d (deploy) provided, call \"make install\"\nif [ \"$params_DESTDIR\" ]; then\n    make_params+=(\"install\")\nfi\n\nif [ \"$params_VERBOSE\" ]; then\n    make_params+=(\"VERBOSE=1\")\nfi\n\n# run makefile\necho \"[INFO] Running: make -j$PARALLEL_BUILDS -f ${ROOT_DIR}/share/coreneuron/nrnivmodl_core_makefile ${make_params[@]}\"\nmake -j$PARALLEL_BUILDS -f \"${ROOT_DIR}/share/coreneuron/nrnivmodl_core_makefile\" \"${make_params[@]}\"\necho \"[INFO] MOD files built successfully for CoreNEURON\"\n"
  },
  {
    "path": "extra/nrnivmodl_core_makefile.in",
    "content": "# This Makefile has the rules necessary for making the custom version of\n# CoreNEURON executable called \"special-core\" from the provided mod files.\n# Mod files are looked up in the MODS_PATH directory.\n\n# Current system OS\nOS_NAME := $(shell uname)\n\n# \",\"\" is an argument separator, never as a literal for Makefile rule\nCOMMA_OP =,\n\n# Default variables for various targets\nMECHLIB_SUFFIX =\nMODS_PATH = .\nOUTPUT_DIR = @CMAKE_HOST_SYSTEM_PROCESSOR@\nDESTDIR =\nTARGET_LIB_TYPE = $(BUILD_TYPE)\n\n# required for OSX to execute nrnivmodl-core\nifeq ($(origin SDKROOT), undefined)\n  export SDKROOT := $(shell xcrun --sdk macosx --show-sdk-path)\nendif\n\n# CoreNEURON installation directories\nCORENRN_BIN_DIR := $(ROOT)/bin\nCORENRN_LIB_DIR := $(ROOT)/lib\nCORENRN_INC_DIR := $(ROOT)/include\nCORENRN_SHARE_CORENRN_DIR:= $(ROOT)/share/coreneuron\nCORENRN_SHARE_MOD2CPP_DIR := $(ROOT)/share/mod2c\n\n# name of the CoreNEURON binary\nSPECIAL_EXE  = $(OUTPUT_DIR)/special-core\n\n# Directory where cpp files are generated for each mod file\nMOD_TO_CPP_DIR = $(OUTPUT_DIR)/corenrn/mod2c\n\n# Directory where cpp files are compiled\nMOD_OBJS_DIR = $(OUTPUT_DIR)/corenrn/build\n\n# Linked libraries gathered by CMake\nLDFLAGS = $(LINKFLAGS) @CORENRN_COMMON_LDFLAGS@\n\n# Includes paths gathered by CMake\n# coreneuron/utils/randoms goes first because it needs to override the NEURON\n# directory in INCFLAGS\nINCLUDES = -I$(CORENRN_INC_DIR)/coreneuron/utils/randoms $(INCFLAGS) -I$(CORENRN_INC_DIR)\nifeq (@CORENRN_ENABLE_MPI_DYNAMIC@, OFF)\n  INCLUDES += $(if @MPI_CXX_INCLUDE_PATH@, -I$(subst ;, -I,@MPI_CXX_INCLUDE_PATH@),)\nendif\nINCLUDES += $(if @reportinglib_INCLUDE_DIR@, -I$(subst ;, -I,@reportinglib_INCLUDE_DIR@),)\n\n# CXX is always defined. 
If the definition comes from default change it\nifeq ($(origin CXX), default)\n    CXX = @CMAKE_CXX_COMPILER@\nendif\n\nifeq (@CORENRN_ENABLE_GPU@, ON)\n  ifneq ($(shell $(CXX) --version | grep -o nvc++), nvc++)\n    $(error GPU wheels are only compatible with the NVIDIA C++ compiler nvc++, but CXX=$(CXX) and --version gives $(shell $(CXX) --version))\n  endif\n  # nvc++ -dumpversion is simpler, but only available from 22.2\n  ifeq ($(findstring nvc++ @CORENRN_NVHPC_MAJOR_MINOR_VERSION@, $(shell $(CXX) --version)),)\n    $(error GPU wheels are currently not compatible across NVIDIA HPC SDK versions. You have $(shell $(CXX) -V | grep nvc++) but this wheel was built with @CORENRN_NVHPC_MAJOR_MINOR_VERSION@.)\n  endif\nendif\n\n# In case of wheel, python and perl exe paths are from the build machine.\n# First prefer env variables set by neuron's nrnivmodl wrapper then check\n# binary used during build. If they don't exist then simply use python and\n# perl as the name of binaries.\nCORENRN_PYTHONEXE ?= @PYTHON_EXECUTABLE@\nCORENRN_PERLEXE ?= @PERL_EXECUTABLE@\nifeq ($(wildcard $(CORENRN_PYTHONEXE)),)\n  CORENRN_PYTHONEXE=python\nendif\nifeq ($(wildcard $(CORENRN_PERLEXE)),)\n  CORENRN_PERLEXE=perl\nendif\n\nCXXFLAGS = @CORENRN_CXX_FLAGS@\nCXX_COMPILE_CMD = $(CXX) $(CXXFLAGS) @CMAKE_CXX_COMPILE_OPTIONS_PIC@ $(INCLUDES)\nCXX_LINK_EXE_CMD = $(CXX) $(CXXFLAGS) @CMAKE_EXE_LINKER_FLAGS@\nCXX_SHARED_LIB_CMD = $(CXX) $(CXXFLAGS) @CMAKE_SHARED_LIBRARY_CREATE_CXX_FLAGS@ @CMAKE_SHARED_LIBRARY_CXX_FLAGS@ @CMAKE_SHARED_LINKER_FLAGS@\n\n# env variables required for mod2c or nmodl\nMOD2CPP_ENV_VAR = @CORENRN_SANITIZER_ENABLE_ENVIRONMENT_STRING@ PYTHONPATH=@CORENRN_NMODL_PYTHONPATH@:${CORENRN_LIB_DIR}/python MODLUNIT=$(CORENRN_SHARE_MOD2CPP_DIR)/nrnunits.lib\n\n# nmodl options\nifeq (@CORENRN_ENABLE_NMODL@, ON)\n    ifeq (@CORENRN_ENABLE_GPU@, ON)\n        nmodl_arguments_c=@NMODL_ACC_BACKEND_ARGS@ @NMODL_COMMON_ARGS@\n    else\n        nmodl_arguments_c=@NMODL_CPU_BACKEND_ARGS@ 
@NMODL_COMMON_ARGS@\n    endif\nendif\n\n# name of the mechanism library with suffix if provided\nCOREMECH_LIB_NAME = corenrnmech$(if $(MECHLIB_SUFFIX),_$(MECHLIB_SUFFIX),)\nCOREMECH_LIB_PATH = $(OUTPUT_DIR)/lib$(COREMECH_LIB_NAME)$(LIB_SUFFIX)\n\n# Various header and C++/Object file\nMOD_FUNC_CPP = $(MOD_TO_CPP_DIR)/_mod_func.cpp\nMOD_FUNC_OBJ = $(MOD_OBJS_DIR)/_mod_func.o\nENGINEMECH_OBJ = $(MOD_OBJS_DIR)/enginemech.o\n\n# Depending on static/shared build, determine library name and it's suffix\nifeq ($(TARGET_LIB_TYPE), STATIC)\n    LIB_SUFFIX = @CMAKE_STATIC_LIBRARY_SUFFIX@\n    corenrnmech_lib_target = coremech_lib_static\nelse\n    LIB_SUFFIX = @CMAKE_SHARED_LIBRARY_SUFFIX@\n    corenrnmech_lib_target = coremech_lib_shared\nendif\n\n# Binary of MOD2C/NMODL depending on CMake option activated\nifeq (@nmodl_FOUND@, TRUE)\n    MOD2CPP_BINARY_PATH = $(if $(MOD2CPP_BINARY),$(MOD2CPP_BINARY), @CORENRN_MOD2CPP_BINARY@)\n    INCLUDES += -I@CORENRN_MOD2CPP_INCLUDE@\nelse\n    MOD2CPP_BINARY_PATH = $(if $(MOD2CPP_BINARY),$(MOD2CPP_BINARY), $(CORENRN_BIN_DIR)/@nmodl_binary_name@)\nendif\n\n# MOD files with full path, without path and names without .mod extension\nmod_files_paths = $(sort $(wildcard $(MODS_PATH)/*.mod))\nmod_files_names = $(sort $(notdir $(wildcard $(MODS_PATH)/*.mod)))\nmod_files_no_ext = $(mod_files_names:.mod=)\nmod_files_for_cpp_backend = $(foreach mod_file, $(mod_files_paths), $(addprefix $(MOD_TO_CPP_DIR)/, $(notdir $(mod_file))))\n\n# CPP files and their obkects\nmod_cpp_files = $(patsubst %.mod,%.cpp,$(mod_files_for_cpp_backend))\nmod_cpp_objs = $(addprefix $(MOD_OBJS_DIR)/,$(addsuffix .o,$(basename $(mod_files_no_ext))))\n\n# We use $ORIGIN (@loader_path in OSX)\nORIGIN_RPATH := $(if $(filter Darwin,$(OS_NAME)),@loader_path,$$ORIGIN)\nSONAME_OPTION := -Wl,$(if $(filter Darwin,$(OS_NAME)),-install_name${COMMA_OP}@rpath/,-soname${COMMA_OP})$(notdir ${COREMECH_LIB_PATH})\nLIB_RPATH = $(if $(DESTDIR),$(DESTDIR)/lib,$(ORIGIN_RPATH))\n\n# When 
special-core is installed, it needs to find library in the\n# lib folder of install prefix. We use relative path in order it\n# to be portable when files are moved (e.g. python wheel)\nINSTALL_LIB_RPATH = $(ORIGIN_RPATH)/../lib\n\n# All objects used during build\nALL_OBJS = $(MOD_FUNC_OBJ) $(mod_cpp_objs)\n\n# Colors for pretty printing\nC_RESET := \\033[0m\nC_GREEN := \\033[32m\n\n# Default nmodl flags. Override if MOD2CPP_RUNTIME_FLAGS is not empty\nifeq (@CORENRN_ENABLE_NMODL@, ON)\n    MOD2CPP_FLAGS_C = $(if $(MOD2CPP_RUNTIME_FLAGS),$(MOD2CPP_RUNTIME_FLAGS),$(nmodl_arguments_c))\nendif\n\n$(info Default NMODL flags: @nmodl_arguments_c@)\n\nifneq ($(MOD2CPP_RUNTIME_FLAGS),)\n    $(warning Runtime nmodl flags (they replace the default ones): $(MOD2CPP_RUNTIME_FLAGS))\nendif\n\n# ======== MAIN BUILD RULES ============\n\n\n# main target to build binary\n$(SPECIAL_EXE): $(corenrnmech_lib_target)\n\t@printf \" => $(C_GREEN)Binary$(C_RESET) creating $(SPECIAL_EXE)\\n\"\n\t$(CXX_LINK_EXE_CMD) -o $(SPECIAL_EXE) $(CORENRN_SHARE_CORENRN_DIR)/coreneuron.cpp \\\n\t  -I$(CORENRN_INC_DIR) $(INCFLAGS) \\\n\t  -L$(OUTPUT_DIR) -l$(COREMECH_LIB_NAME) $(LDFLAGS) \\\n\t  -L$(CORENRN_LIB_DIR) \\\n\t  -Wl,-rpath,'$(LIB_RPATH)' -Wl,-rpath,$(CORENRN_LIB_DIR) -Wl,-rpath,'$(INSTALL_LIB_RPATH)'\n\n$(ENGINEMECH_OBJ): $(CORENRN_SHARE_CORENRN_DIR)/enginemech.cpp | $(MOD_OBJS_DIR)\n\t$(CXX_COMPILE_CMD) -c -DADDITIONAL_MECHS $(CORENRN_SHARE_CORENRN_DIR)/enginemech.cpp -o $(ENGINEMECH_OBJ)\n\n# build shared library of mechanisms\ncoremech_lib_shared: $(ALL_OBJS) $(ENGINEMECH_OBJ) build_always\n\t# extract the object files from libcoreneuron-core.a\n\tmkdir -p $(MOD_OBJS_DIR)/libcoreneuron-core\n\trm -f $(MOD_OBJS_DIR)/libcoreneuron-core/*.o\n\t# --output is only supported by modern versions of ar\n\t(cd $(MOD_OBJS_DIR)/libcoreneuron-core && ar x $(CORENRN_LIB_DIR)/libcoreneuron-core.a)\n\t$(CXX_SHARED_LIB_CMD) $(ENGINEMECH_OBJ) -o ${COREMECH_LIB_PATH} $(ALL_OBJS) \\\n\t  -I$(CORENRN_INC_DIR) 
$(INCFLAGS) \\\n\t  @CORENEURON_LINKER_START_GROUP@ \\\n\t  $(MOD_OBJS_DIR)/libcoreneuron-core/*.o @CORENEURON_LINKER_END_GROUP@ \\\n\t\t$(LDFLAGS) ${SONAME_OPTION} \\\n\t\t-Wl,-rpath,$(CORENRN_LIB_DIR) -L$(CORENRN_LIB_DIR)\n\t# cleanup\n\trm $(MOD_OBJS_DIR)/libcoreneuron-core/*.o\n\n# build static library of mechanisms\ncoremech_lib_static: $(ALL_OBJS) $(ENGINEMECH_OBJ) build_always\n\t# make a libcorenrnmech.a by copying libcoreneuron-core.a and then appending\n\t# the newly compiled objects\n\tcp $(CORENRN_LIB_DIR)/libcoreneuron-core.a ${COREMECH_LIB_PATH}\n\tar r ${COREMECH_LIB_PATH} $(ENGINEMECH_OBJ) $(ALL_OBJS)\n\n# compile cpp files to .o\n$(MOD_OBJS_DIR)/%.o: $(MOD_TO_CPP_DIR)/%.cpp | $(MOD_OBJS_DIR)\n\t$(CXX_COMPILE_CMD) -c $< -o $@ -DNRN_PRCELLSTATE=$(NRN_PRCELLSTATE) @CORENEURON_TRANSLATED_CODE_COMPILE_FLAGS@\n\n# translate MOD files to CPP using mod2c/NMODL\n$(mod_cpp_files): $(MOD_TO_CPP_DIR)/%.cpp: $(MODS_PATH)/%.mod | $(MOD_TO_CPP_DIR)\n\t$(MOD2CPP_ENV_VAR) $(MOD2CPP_BINARY_PATH) $< -o $(MOD_TO_CPP_DIR)/ $(MOD2CPP_FLAGS_C)\n\n# generate mod registration function. Dont overwrite if it's not changed\n$(MOD_FUNC_CPP): build_always | $(MOD_TO_CPP_DIR)\n\t$(CORENRN_PERLEXE) $(CORENRN_SHARE_CORENRN_DIR)/mod_func.c.pl $(mod_files_names) > $(MOD_FUNC_CPP).tmp\n\tdiff -q $(MOD_FUNC_CPP).tmp $(MOD_FUNC_CPP) || \\\n\tmv $(MOD_FUNC_CPP).tmp $(MOD_FUNC_CPP)\n\n# symlink to cpp files provided by coreneuron\n$(MOD_TO_CPP_DIR)/%.cpp: $(CORENRN_SHARE_MOD2CPP_DIR)/%.cpp | $(MOD_TO_CPP_DIR)\n\tln -s $< $@\n\n# create directories needed\n$(MOD_TO_CPP_DIR):\n\tmkdir -p $(MOD_TO_CPP_DIR)\n\n$(MOD_OBJS_DIR):\n\tmkdir -p $(MOD_OBJS_DIR)\n\n# install binary and libraries\ninstall: $(SPECIAL_EXE)\n\tinstall -d $(DESTDIR)/bin $(DESTDIR)/lib\n\tinstall ${COREMECH_LIB_PATH} $(DESTDIR)/lib\n\tinstall $(SPECIAL_EXE) $(DESTDIR)/bin\n\n.PHONY: build_always\n\n$(VERBOSE).SILENT:\n\n# delete cpp files if mod2c error, otherwise they are not generated again\n.DELETE_ON_ERROR:\n"
  },
  {
    "path": "tests/CMakeLists.txt",
    "content": "# =============================================================================\n# Copyright (c) 2016 - 2021 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\ninclude(TestHelpers)\n\ninclude_directories(${CORENEURON_PROJECT_SOURCE_DIR} ${CORENEURON_PROJECT_BINARY_DIR}/generated\n                    ${Boost_INCLUDE_DIRS})\n\n# Add compiler flags that should apply to all CoreNEURON targets, but which should not leak into\n# other included projects.\nadd_compile_definitions(${CORENRN_COMPILE_DEFS})\nadd_compile_options(${CORENRN_EXTRA_CXX_FLAGS})\nadd_link_options(${CORENRN_EXTRA_LINK_FLAGS})\n\nif(NOT Boost_USE_STATIC_LIBS)\n  add_definitions(-DBOOST_TEST_DYN_LINK=TRUE)\nendif()\n\nset(CMAKE_BUILD_RPATH ${CMAKE_BINARY_DIR}/bin/${CMAKE_HOST_SYSTEM_PROCESSOR})\n\nset(Boost_NO_BOOST_CMAKE TRUE)\n# Minimum set by needing the multi-argument version of BOOST_AUTO_TEST_CASE.\nfind_package(Boost 1.59 QUIET COMPONENTS filesystem system atomic unit_test_framework)\n\nif(Boost_FOUND)\n  if(CORENRN_ENABLE_UNIT_TESTS)\n    add_library(coreneuron-unit-test INTERFACE)\n    target_compile_options(coreneuron-unit-test\n                           INTERFACE ${CORENEURON_BOOST_UNIT_TEST_COMPILE_FLAGS})\n    target_include_directories(coreneuron-unit-test SYSTEM INTERFACE ${Boost_INCLUDE_DIRS})\n    target_link_libraries(coreneuron-unit-test INTERFACE coreneuron-all)\n    add_subdirectory(unit/cmdline_interface)\n    add_subdirectory(unit/interleave_info)\n    add_subdirectory(unit/alignment)\n    add_subdirectory(unit/queueing)\n    add_subdirectory(unit/solver)\n    # lfp test uses nrnmpi_* wrappers but does not load the dynamic MPI library TODO: re-enable\n    # after NEURON and CoreNEURON dynamic MPI are merged\n    if(NOT CORENRN_ENABLE_MPI_DYNAMIC)\n      add_subdirectory(unit/lfp)\n    endif()\n  endif()\n  message(STATUS \"Boost found, unit tests 
enabled\")\nelse()\n  message(STATUS \"Boost not found, unit tests disabled\")\nendif()\n\nadd_subdirectory(integration)\n"
  },
  {
    "path": "tests/integration/CMakeLists.txt",
    "content": "# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\n\nif(CORENRN_ENABLE_MPI_DYNAMIC)\n  # ~~~\n  # In case of submodule building we don't know the MPI launcher and mpi\n  # distribution being used. So for now just skip these tests and rely on\n  # neuron to test dynamic mpi mode. For coreneuron build assume are just\n  # building single generic mpi library libcorenrn_mpi.<suffix>\n  # ~~~\n  if(CORENEURON_AS_SUBPROJECT)\n    message(STATUS \"CoreNEURON integration tests are disabled with dynamic MPI\")\n    return()\n  else()\n    set(CORENRN_MPI_LIB_ARG\n        \"--mpi-lib ${PROJECT_BINARY_DIR}/lib/lib${CORENRN_MPI_LIB_NAME}${CMAKE_SHARED_LIBRARY_SUFFIX}\"\n    )\n  endif()\nendif()\n\nset(COMMON_ARGS \"--tstop 100. --celsius 6.3 --mpi ${CORENRN_MPI_LIB_ARG}\")\nset(MODEL_STATS_ARG \"--model-stats\")\nset(RING_DATASET_DIR \"${CMAKE_CURRENT_SOURCE_DIR}/ring\")\nset(RING_COMMON_ARGS \"--datpath ${RING_DATASET_DIR} ${COMMON_ARGS}\")\nset(RING_GAP_COMMON_ARGS \"--datpath ${CMAKE_CURRENT_SOURCE_DIR}/ring_gap ${COMMON_ARGS}\")\nset(PERMUTE1_ARGS \"--cell-permute 1\")\nset(PERMUTE2_ARGS \"--cell-permute 2\")\nset(CUDA_INTERFACE \"--cuda-interface\")\nif(CORENRN_ENABLE_GPU)\n  set(GPU_ARGS \"--gpu\")\n  set(permutation_modes 1 2)\nelse()\n  set(permutation_modes 0 1)\nendif()\n\n# List of tests with arguments\nset(TEST_CASES_WITH_ARGS\n    \"ring!${RING_COMMON_ARGS} ${MODEL_STATS_ARG} ${GPU_ARGS} --outpath ${CMAKE_CURRENT_BINARY_DIR}/ring\"\n    \"ring_binqueue!${RING_COMMON_ARGS} ${GPU_ARGS} --outpath ${CMAKE_CURRENT_BINARY_DIR}/ring_binqueue --binqueue\"\n    \"ring_multisend!${RING_COMMON_ARGS} ${GPU_ARGS} --outpath ${CMAKE_CURRENT_BINARY_DIR}/ring_multisend --multisend\"\n    \"ring_spike_buffer!${RING_COMMON_ARGS} ${GPU_ARGS} --outpath 
${CMAKE_CURRENT_BINARY_DIR}/ring_spike_buffer --spikebuf 1\"\n    \"ring_gap!${RING_GAP_COMMON_ARGS} ${GPU_ARGS} --outpath ${CMAKE_CURRENT_BINARY_DIR}/ring_gap\"\n    \"ring_gap_binqueue!${RING_GAP_COMMON_ARGS} ${GPU_ARGS} --outpath ${CMAKE_CURRENT_BINARY_DIR}/ring_gap_binqueue --binqueue\"\n    \"ring_gap_multisend!${RING_GAP_COMMON_ARGS} ${GPU_ARGS} --outpath ${CMAKE_CURRENT_BINARY_DIR}/ring_gap_multisend --multisend\"\n)\nset(test_suffixes \"\" \"_binqueue\" \"_multisend\")\nforeach(cell_permute ${permutation_modes})\n  list(APPEND test_suffixes \"_permute${cell_permute}\")\n  list(\n    APPEND\n    TEST_CASES_WITH_ARGS\n    \"ring_permute${cell_permute}!${RING_COMMON_ARGS} ${GPU_ARGS} --outpath ${CMAKE_CURRENT_BINARY_DIR}/ring_permute${cell_permute} --cell-permute=${cell_permute}\"\n    \"ring_gap_permute${cell_permute}!${RING_GAP_COMMON_ARGS} ${GPU_ARGS} --outpath ${CMAKE_CURRENT_BINARY_DIR}/ring_gap_permute${cell_permute} --cell-permute=${cell_permute}\"\n  )\n  # As reports require MPI, do not add test if report is enabled.\n  if(NOT CORENRN_ENABLE_REPORTING)\n    list(APPEND test_suffixes \"_serial_permute${cell_permute}\")\n    list(\n      APPEND\n      TEST_CASES_WITH_ARGS\n      \"ring_serial_permute${cell_permute}!${GPU_ARGS} --cell-permute=${cell_permute} --tstop 100. 
--celsius 6.3 --datpath ${RING_DATASET_DIR} ${MODEL_STATS_ARG} --outpath ${CMAKE_CURRENT_BINARY_DIR}/ring_serial_permute${cell_permute}\"\n    )\n  endif()\nendforeach()\n\nif(CORENRN_ENABLE_GPU)\n  list(APPEND test_suffixes \"_permute2_cudaInterface\")\n  list(\n    APPEND\n    TEST_CASES_WITH_ARGS\n    \"ring_permute2_cudaInterface!${RING_COMMON_ARGS} ${GPU_ARGS} --outpath ${CMAKE_CURRENT_BINARY_DIR}/ring_permute2_cudaInterface ${PERMUTE2_ARGS} ${CUDA_INTERFACE}\"\n    \"ring_gap_permute2_cudaInterface!${RING_GAP_COMMON_ARGS} ${GPU_ARGS} --outpath ${CMAKE_CURRENT_BINARY_DIR}/ring_gap_permute2_cudaInterface ${PERMUTE2_ARGS} ${CUDA_INTERFACE}\"\n  )\nendif()\n\n# ~~~\n# There are no directories for permute and multisend related tests,\n# create them and copy reference spikes\n# ~~~\nforeach(data_dir \"ring\" \"ring_gap\")\n  # Naïve foreach(test_suffix ${test_suffixes}) does not seem to handle empty suffixes correctly.\n  list(LENGTH test_suffixes num_suffixes)\n  math(EXPR num_suffixes_m1 \"${num_suffixes} - 1\")\n  foreach(suffix_index RANGE 0 ${num_suffixes_m1})\n    list(GET test_suffixes ${suffix_index} test_suffix)\n    file(COPY \"${CMAKE_CURRENT_SOURCE_DIR}/${data_dir}/out.dat.ref\"\n         DESTINATION \"${CMAKE_CURRENT_BINARY_DIR}/${data_dir}${test_suffix}/\")\n  endforeach()\nendforeach()\n# test without ring_gap version\nfile(COPY \"${CMAKE_CURRENT_SOURCE_DIR}/ring/out.dat.ref\"\n     DESTINATION \"${CMAKE_CURRENT_BINARY_DIR}/ring_spike_buffer/\")\n\n# names of all tests added\nset(CORENRN_TEST_NAMES \"\")\n\n# Configure test scripts\nforeach(args_line ${TEST_CASES_WITH_ARGS})\n  string(REPLACE \"!\" \";\" string_line ${args_line})\n  set(test_num_processors 1)\n  if(MPI_FOUND)\n    # serial test run without srun or mpiexec\n    if(args_line MATCHES \"ring_serial.*\")\n      string(REPLACE \";\" \" \" SRUN_PREFIX \"\")\n    else()\n      set(test_num_processors 2)\n      string(REPLACE \";\" \" \" SRUN_PREFIX 
\"${TEST_MPI_EXEC_BIN};-n;${test_num_processors}\")\n    endif()\n  endif()\n  list(GET string_line 0 TEST_NAME)\n  list(GET string_line 1 TEST_ARGS)\n  set(SIM_NAME ${TEST_NAME})\n  configure_file(integration_test.sh.in ${TEST_NAME}/integration_test.sh @ONLY)\n  add_test(\n    NAME ${TEST_NAME}_TEST\n    COMMAND \"/bin/sh\" ${CMAKE_CURRENT_BINARY_DIR}/${TEST_NAME}/integration_test.sh\n    WORKING_DIRECTORY \"${CMAKE_CURRENT_BINARY_DIR}/${TEST_NAME}\")\n  set_tests_properties(${TEST_NAME}_TEST PROPERTIES PROCESSORS ${test_num_processors})\n  cpp_cc_configure_sanitizers(TEST ${TEST_NAME}_TEST)\n  list(APPEND CORENRN_TEST_NAMES ${TEST_NAME}_TEST)\nendforeach()\n\nif(CORENRN_ENABLE_REPORTING)\n  foreach(TEST_NAME \"1\")\n    set(SIM_NAME \"reporting_${TEST_NAME}\")\n    set(CONFIG_ARG \"${TEST_NAME}\")\n    configure_file(reportinglib/${TEST_NAME}.conf.in ${SIM_NAME}/${TEST_NAME}.conf @ONLY)\n    configure_file(reportinglib/reporting_test.sh.in ${SIM_NAME}/reporting_test.sh @ONLY)\n    configure_file(reportinglib/${TEST_NAME}.check.in ${SIM_NAME}/${TEST_NAME}.check @ONLY)\n    file(COPY \"${CMAKE_CURRENT_SOURCE_DIR}/reportinglib/test_ref.out\" DESTINATION \"${SIM_NAME}/\")\n    add_test(\n      NAME ${SIM_NAME}\n      COMMAND \"/bin/sh\" ${CMAKE_CURRENT_BINARY_DIR}/${SIM_NAME}/reporting_test.sh\n      WORKING_DIRECTORY \"${CMAKE_CURRENT_BINARY_DIR}/${SIM_NAME}\")\n    cpp_cc_configure_sanitizers(TEST ${SIM_NAME})\n    list(APPEND CORENRN_TEST_NAMES ${SIM_NAME})\n  endforeach()\nendif()\n"
  },
  {
    "path": "tests/integration/README.md",
    "content": "# Generating the Test Input Dataset\n\nThere are two integration tests under the `tests/integration/` directory. The input dataset is generated using NEURON. Follow the steps below to generate the test data.\n\nOnce you have the latest NEURON installed, clone the [ringtest](https://github.com/nrnhines/ringtest) model from GitHub:\n\n```bash\ngit clone https://github.com/nrnhines/ringtest.git\n```\n\nBuild `special` as usual with NEURON:\n\n```bash\nnrnivmodl mod\n```\n\nNow generate the data for the `ring` test:\n\n\n```bash\nmpirun -n 2 ./x86_64/special ringtest.py -nring 1 -ncell 20 -tstop 100 -mpi -dumpmodel\n\n# sort spikes and remove old spike output\nsortspike spk2.std coredat/out.dat.ref\nrm spk2.std\n```\n\nThe generated dataset can be copied to `tests/integration/ring/`:\n\n```bash\nmv coredat/* <external>/coreneuron/tests/integration/ring/\n```\n\n\nSimilarly, the dataset for the `ring_gap` test can be generated with:\n\n```bash\nmpirun -n 2 ./x86_64/special ringtest.py -nring 1 -ncell 20 -tstop 100 -gap -mpi -dumpmodel\n\n# sort spikes and remove old spike output\nsortspike spk2.std coredat/out.dat.ref\nrm spk2.std\nmv coredat/* <external>/coreneuron/tests/integration/ring_gap/\n```\n"
  },
  {
    "path": "tests/integration/integration_test.sh.in",
    "content": "#!/usr/bin/env bash\nset -e\n\nexport OMP_NUM_THREADS=1\nexport LIBSONATA_ZERO_BASED_GIDS=true\n\n# Run the executable\nSRUN_EXTRA=\nif [ -n \"$VALGRIND\" -a -n \"$VALGRIND_PRELOAD\" ]; then\n    echo \"Running with valgrind\"\n    LD_PRELOAD=$VALGRIND_PRELOAD \\\n    @SRUN_PREFIX@ $SRUN_EXTRA $VALGRIND @CMAKE_BINARY_DIR@/bin/@CMAKE_SYSTEM_PROCESSOR@/special-core @TEST_ARGS@\nelse\n    @SRUN_PREFIX@ $SRUN_EXTRA @CMAKE_BINARY_DIR@/bin/@CMAKE_SYSTEM_PROCESSOR@/special-core @TEST_ARGS@\nfi\nexitvalue=$?\n\n# Check for error result\nif [ $exitvalue -ne 0 ]; then\n  echo \"Error status value: $exitvalue\"\n  exit $exitvalue\nfi\n\n# diff output files against the reference\ncd @CMAKE_CURRENT_BINARY_DIR@/@SIM_NAME@\n\n# We convert spikes to out.dat format\nreports=@ENABLE_SONATA_REPORTS_TESTS@\nif [ \"$reports\" = \"ON\" ]\nthen\n  data=$(@H5DUMP_EXECUTABLE@ -d /spikes/All/timestamps -d /spikes/All/node_ids -y -O out.h5 | sed 's/\"ms\"//g;s/,/\\n/g')\n  echo $data | awk '{n=NF/2; for (i=1;i<=n;i++) print $i \"\\t\" $(n+i) }' > out_SONATA.dat\n\n  if [ ! -f out_SONATA.dat ]\n  then\n    echo \"[ERROR] No SONATA output files. Test failed!\" >&2\n    exit 1\n  fi\n  diff -w out_SONATA.dat out.dat.ref > diff_SONATA.dat 2>&1 || true\n  if [ -s diff_SONATA.dat ]\n  then\n    echo \"[ERROR] SONATA Results are different, check the file diff_SONATA.dat. Test failed!\" >&2\n    exit 1\n  fi\nfi\n\nif [ ! -f out.dat ]\nthen\n  echo \"[ERROR] No output files. Test failed!\" >&2\n  exit 1\nfi\n\ndiff -w out.dat out.dat.ref > diff.dat 2>&1 || true\n\nif [ -s diff.dat ]\nthen\n  echo \"[ERROR] Results are different, check the file diff.dat. Test failed!\" >&2\n  exit 1\nelse\n  echo \"Results are the same, test passed\"\n  rm -f *.dat\n  exit 0\nfi\n"
  },
  {
    "path": "tests/integration/reportinglib/1.check.in",
    "content": "#!/bin/sh\n\nOK=0\nFAILED=1\nsonata_reports=@ENABLE_SONATA_REPORTS_TESTS@\nbin_reports=@ENABLE_BIN_REPORTS_TESTS@\ntest_ref=@CMAKE_CURRENT_BINARY_DIR@/@SIM_NAME@/test_ref.out\n\nif [ \"$bin_reports\" = \"ON\" ]\nthen\n  if [ -f test_1.bbp ]\n  then\n    somaDump_diff=$(@reportinglib_somaDump@ test_1.bbp 1 | sed 's/ //g' | diff $test_ref -)\n    \n    if [ $? -ne 0 ]\n    then\n      echo -e \"[ERROR] The report output generated by Reportinglib differs!\\n$somaDump_diff\" >&2\n      exit $FAILED\n    fi\n  else\n     echo \"[ERROR] Expected ReportingLib soma file 'test_1.bbp' is missing. Test failed!\" >&2\n     exit $FAILED\n  fi\nfi\n\nif [ \"$sonata_reports\" = \"ON\" ]\nthen\n  if [ -f test_2.h5 ]\n  then\n    h5dump_diff=$(@H5DUMP_EXECUTABLE@ -d /report/PopA/data -y -O test_2.h5 | sed '1d;$d;s/,//g;s/ //g' | diff $test_ref -)\n    \n    if [ $? -ne 0 ]\n    then\n      echo -e \"[ERROR] The report output generated by Libsonata differs!\\n$h5dump_diff\" >&2\n      exit $FAILED\n    fi\n  else\n     echo \"[ERROR] Expected SONATA soma file 'test_2.h5' doesn't exist. Test failed!\" >&2\n     exit $FAILED\n  fi\n  if [ ! -f spikes.h5 ]\n  then\n     echo \"[ERROR] Expected SONATA spike file 'spikes.h5' doesn't exist. Test failed!\" >&2\n     exit $FAILED\n  fi\nfi\n\n# If we reach this point, all tests were successful\nexit $OK\n"
  },
  {
    "path": "tests/integration/reportinglib/1.conf.in",
    "content": "outpath = ./\ndatpath = @CMAKE_CURRENT_SOURCE_DIR@/ring/\ntstop = 10.000000\ndt = 0.025000\nforwardskip = 0.000000\nprcellgid = -1\nreport-conf = @CMAKE_CURRENT_SOURCE_DIR@/reportinglib/1.report\ncell-permute = 0\n"
  },
  {
    "path": "tests/integration/reportinglib/reporting_test.sh.in",
    "content": "#! /bin/sh\n\nset -e -o pipefail\n\nexport OMP_NUM_THREADS=1\nexport LIBSONATA_ZERO_BASED_GIDS=true\n\n@SRUN_PREFIX@ @CMAKE_BINARY_DIR@/bin/@CMAKE_SYSTEM_PROCESSOR@/special-core --mpi --read-config @CMAKE_CURRENT_BINARY_DIR@/@SIM_NAME@/@TEST_NAME@.conf\nchmod +x @CMAKE_CURRENT_BINARY_DIR@/@SIM_NAME@/@TEST_NAME@.check\nexit `@CMAKE_CURRENT_BINARY_DIR@/@SIM_NAME@/@TEST_NAME@.check`\n"
  },
  {
    "path": "tests/integration/reportinglib/test_ref.out",
    "content": "-65\n-64.9973\n-64.9951\n-64.9932\n-64.9916\n-64.9902\n-64.9889\n-64.9877\n-64.9867\n-64.9858\n-64.985\n-64.9842\n-64.9836\n-64.9829\n-64.9824\n-64.9819\n-64.9815\n-64.9811\n-64.9807\n-64.9804\n-64.9802\n-64.9799\n-64.9797\n-64.9796\n-64.9794\n-64.9793\n-64.9792\n-64.9791\n-64.979\n-64.979\n-64.979\n-64.979\n-64.979\n-64.979\n-64.979\n-64.9791\n-64.9791\n-64.7371\n-63.6264\n-62.1068\n-60.4682\n-58.847\n-57.2905\n-55.7913\n-54.3056\n-52.7594\n-51.044\n-48.9961\n-46.3491\n-42.6233\n-36.8741\n-27.1665\n-10.1852\n13.977\n31.4561\n36.143\n35.2487\n32.4239\n28.6338\n24.2472\n19.4933\n14.5405\n9.51339\n4.50006\n-0.440951\n-5.27461\n-9.98373\n-14.5648\n-19.0258\n-23.3868\n-27.6838\n-31.9759\n-36.353\n-40.9401\n-45.8855\n-51.303\n-57.1176\n-62.8313\n-67.5469\n-70.6416\n-72.2969\n-73.0829\n-73.4434\n-73.6102\n-73.6866\n-73.7171\n-73.7212\n-73.7082\n-73.6828\n-73.6479\n-73.6053\n-73.5561\n-73.5012\n-73.4414\n-73.3771\n-73.3089\n-73.237\n-73.1618\n-73.0836\n-73.0025\n"
  },
  {
    "path": "tests/integration/ring/out.dat.ref",
    "content": "2.65 0\n5.3 1\n7.95 2\n10.6 3\n13.25 4\n15.9 5\n18.55 6\n21.2 7\n23.85 8\n26.5 9\n29.15 10\n31.8 11\n34.45 12\n37.1 13\n39.75 14\n42.4 15\n45.05 16\n47.7 17\n50.35 18\n53 19\n55.65 0\n58.3 1\n60.95 2\n63.6 3\n66.25 4\n68.9 5\n71.55 6\n74.2 7\n76.85 8\n79.5 9\n82.15 10\n84.8 11\n87.45 12\n90.1 13\n92.75 14\n95.4 15\n98.05 16\n"
  },
  {
    "path": "tests/integration/ring_gap/mod files/halfgap.mod",
    "content": ": ggap.mod\n: This is a conductance based gap junction to allow setting g = 0\nNEURON {\n\tPOINT_PROCESS HalfGap\n\tRANGE g, i, vgap\n\tELECTRODE_CURRENT i\n}\nPARAMETER { g = 0 (1/megohm) }\nASSIGNED {\n\tv (millivolt)\n\tvgap (millivolt)\n\ti (nanoamp)\n}\nBREAKPOINT { i = (vgap - v)*g }\n"
  },
  {
    "path": "tests/integration/ring_gap/out.dat.ref",
    "content": "3.275 19\n4.325 0\n4.425 18\n5.5 1\n5.575 17\n6.65 2\n6.75 16\n7.825 3\n7.9 15\n8.975 4\n9.05 14\n10.15 5\n10.225 13\n11.325 6\n11.4 12\n12.475 7\n12.55 11\n13.625 8\n13.7 10\n14.25 9\n"
  },
  {
    "path": "tests/unit/alignment/CMakeLists.txt",
    "content": "# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\nadd_executable(alignment_test_bin alignment.cpp)\ntarget_link_libraries(alignment_test_bin coreneuron-unit-test)\nadd_test(NAME alignment_test COMMAND $<TARGET_FILE:alignment_test_bin>)\ncpp_cc_configure_sanitizers(TARGET alignment_test_bin TEST alignment_test)\n"
  },
  {
    "path": "tests/unit/alignment/alignment.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#include \"coreneuron/utils/memory.h\"\n\n#include <boost/mpl/list.hpp>\n#define BOOST_TEST_MODULE PaddingCheck\n#include <boost/test/included/unit_test.hpp>\n\n#include <cstdint>\n#include <cstring>\n\ntemplate <class T, int n = 1>\nstruct data {\n    typedef T value_type;\n    static const int chunk = n;\n};\n\ntypedef boost::mpl::list<data<double>, data<long long int>> chunk_default_data_type;\n\ntypedef boost::mpl::list<data<double, 2>,\n                         data<double, 4>,\n                         data<double, 8>,\n                         data<double, 16>,\n                         data<double, 32>,\n                         data<int, 2>,\n                         data<int, 4>,\n                         data<int, 8>,\n                         data<int, 16>,\n                         data<int, 32>>\n    chunk_data_type;\n\nBOOST_AUTO_TEST_CASE(padding_simd) {\n    /** AOS test */\n    int pad = coreneuron::soa_padded_size<1>(11, 1);\n    BOOST_CHECK_EQUAL(pad, 11);\n\n    /** SOA tests with 11 */\n    pad = coreneuron::soa_padded_size<1>(11, 0);\n    BOOST_CHECK_EQUAL(pad, 11);\n\n    pad = coreneuron::soa_padded_size<2>(11, 0);\n    BOOST_CHECK_EQUAL(pad, 12);\n\n    pad = coreneuron::soa_padded_size<4>(11, 0);\n    BOOST_CHECK_EQUAL(pad, 12);\n\n    pad = coreneuron::soa_padded_size<8>(11, 0);\n    BOOST_CHECK_EQUAL(pad, 16);\n\n    pad = coreneuron::soa_padded_size<16>(11, 0);\n    BOOST_CHECK_EQUAL(pad, 16);\n\n    pad = coreneuron::soa_padded_size<32>(11, 0);\n    BOOST_CHECK_EQUAL(pad, 32);\n\n    /** SOA tests with 32 */\n    pad = coreneuron::soa_padded_size<1>(32, 0);\n    BOOST_CHECK_EQUAL(pad, 32);\n\n    pad = coreneuron::soa_padded_size<2>(32, 0);\n    
BOOST_CHECK_EQUAL(pad, 32);\n\n    pad = coreneuron::soa_padded_size<4>(32, 0);\n    BOOST_CHECK_EQUAL(pad, 32);\n\n    pad = coreneuron::soa_padded_size<8>(32, 0);\n    BOOST_CHECK_EQUAL(pad, 32);\n\n    pad = coreneuron::soa_padded_size<16>(32, 0);\n    BOOST_CHECK_EQUAL(pad, 32);\n\n    pad = coreneuron::soa_padded_size<32>(32, 0);\n    BOOST_CHECK_EQUAL(pad, 32);\n\n    /** SOA tests with 33 */\n    pad = coreneuron::soa_padded_size<1>(33, 0);\n    BOOST_CHECK_EQUAL(pad, 33);\n\n    pad = coreneuron::soa_padded_size<2>(33, 0);\n    BOOST_CHECK_EQUAL(pad, 34);\n\n    pad = coreneuron::soa_padded_size<4>(33, 0);\n    BOOST_CHECK_EQUAL(pad, 36);\n\n    pad = coreneuron::soa_padded_size<8>(33, 0);\n    BOOST_CHECK_EQUAL(pad, 40);\n\n    pad = coreneuron::soa_padded_size<16>(33, 0);\n    BOOST_CHECK_EQUAL(pad, 48);\n\n    pad = coreneuron::soa_padded_size<32>(33, 0);\n    BOOST_CHECK_EQUAL(pad, 64);\n}\n\n/// Even number is randomly depends of the TYPE!!! and the number of elements.\n/// This test work for 64 bits type not for 32 bits.\nBOOST_AUTO_TEST_CASE_TEMPLATE(memory_alignment_simd_false, T, chunk_default_data_type) {\n    const int c = T::chunk;\n    int total_size_chunk = coreneuron::soa_padded_size<c>(247, 0);\n    int ne = 6 * total_size_chunk;\n\n    typename T::value_type* data =\n        (typename T::value_type*) coreneuron::ecalloc_align(ne, sizeof(typename T::value_type), 16);\n\n    for (int i = 1; i < 6; i += 2) {\n        bool b = coreneuron::is_aligned((data + i * total_size_chunk), 16);\n        BOOST_CHECK_EQUAL(b, 0);\n    }\n\n    for (int i = 0; i < 6; i += 2) {\n        bool b = coreneuron::is_aligned((data + i * total_size_chunk), 16);\n        BOOST_CHECK_EQUAL(b, 1);\n    }\n\n    free_memory(data);\n}\n\nBOOST_AUTO_TEST_CASE_TEMPLATE(memory_alignment_simd_true, T, chunk_data_type) {\n    const int c = T::chunk;\n    int total_size_chunk = coreneuron::soa_padded_size<c>(247, 0);\n    int ne = 6 * total_size_chunk;\n\n    typename 
T::value_type* data =\n        (typename T::value_type*) coreneuron::ecalloc_align(ne, sizeof(typename T::value_type), 16);\n\n    for (int i = 0; i < 6; ++i) {\n        bool b = coreneuron::is_aligned((data + i * total_size_chunk), 16);\n        BOOST_CHECK_EQUAL(b, 1);\n    }\n\n    free_memory(data);\n}\n"
  },
  {
    "path": "tests/unit/cmdline_interface/CMakeLists.txt",
    "content": "# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\nadd_executable(cmd_interface_test_bin test_cmdline_interface.cpp)\ntarget_link_libraries(cmd_interface_test_bin coreneuron-unit-test)\nadd_test(NAME cmd_interface_test COMMAND $<TARGET_FILE:cmd_interface_test_bin>)\ncpp_cc_configure_sanitizers(TARGET cmd_interface_test_bin TEST cmd_interface_test)\n"
  },
  {
    "path": "tests/unit/cmdline_interface/test_cmdline_interface.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n\n#define BOOST_TEST_MODULE cmdline_interface\n#include <boost/test/included/unit_test.hpp>\n\n#include <cfloat>\n\nusing namespace coreneuron;\n\nBOOST_AUTO_TEST_CASE(cmdline_interface) {\n    const char* argv[] = {\n\n        \"nrniv-core\",\n\n        \"--mpi\",\n\n        \"--dt\",\n        \"0.02\",\n\n        \"--tstop\",\n        \"0.1\",\n#ifdef CORENEURON_ENABLE_GPU\n        \"--gpu\",\n#endif\n        \"--cell-permute\",\n        \"2\",\n\n        \"--nwarp\",\n        \"8\",\n\n        \"-d\",\n        \"./\",\n\n        \"--voltage\",\n        \"-32\",\n\n        \"--threading\",\n\n        \"--ms-phases\",\n        \"1\",\n\n        \"--ms-subintervals\",\n        \"2\",\n\n        \"--multisend\",\n\n        \"--spkcompress\",\n        \"32\",\n\n        \"--binqueue\",\n\n        \"--spikebuf\",\n        \"100\",\n\n        \"--prcellgid\",\n        \"12\",\n\n        \"--forwardskip\",\n        \"0.02\",\n\n        \"--celsius\",\n        \"25.12\",\n\n        \"--mindelay\",\n        \"0.1\",\n\n        \"--dt_io\",\n        \"0.2\"};\n    constexpr int argc = sizeof argv / sizeof argv[0];\n\n    corenrn_parameters corenrn_param_test;\n\n    corenrn_param_test.parse(argc, const_cast<char**>(argv));  // discarding const as CLI11\n                                                               // interface is not const\n\n    BOOST_CHECK(corenrn_param_test.seed == -1);  // testing default value\n\n    BOOST_CHECK(corenrn_param_test.spikebuf == 100);\n\n    BOOST_CHECK(corenrn_param_test.threading == true);\n\n    BOOST_CHECK(corenrn_param_test.dt == 0.02);\n\n    BOOST_CHECK(corenrn_param_test.tstop == 0.1);\n\n    
BOOST_CHECK(corenrn_param_test.prcellgid == 12);\n#ifdef CORENEURON_ENABLE_GPU\n    BOOST_CHECK(corenrn_param_test.gpu == true);\n#else\n    BOOST_CHECK(corenrn_param_test.gpu == false);\n#endif\n    BOOST_CHECK(corenrn_param_test.dt_io == 0.2);\n\n    BOOST_CHECK(corenrn_param_test.forwardskip == 0.02);\n\n    BOOST_CHECK(corenrn_param_test.celsius == 25.12);\n\n    BOOST_CHECK(corenrn_param_test.mpi_enable == true);\n\n    BOOST_CHECK(corenrn_param_test.cell_interleave_permute == 2);\n\n    BOOST_CHECK(corenrn_param_test.voltage == -32);\n\n    BOOST_CHECK(corenrn_param_test.nwarp == 8);\n\n    BOOST_CHECK(corenrn_param_test.multisend == true);\n\n    BOOST_CHECK(corenrn_param_test.mindelay == 0.1);\n\n    BOOST_CHECK(corenrn_param_test.ms_phases == 1);\n\n    BOOST_CHECK(corenrn_param_test.ms_subint == 2);\n\n    BOOST_CHECK(corenrn_param_test.spkcompress == 32);\n\n    BOOST_CHECK(corenrn_param_test.multisend == true);\n\n    // Reset all parameters to their default values.\n    corenrn_param_test.reset();\n\n    // Should match a default-constructed set of parameters.\n    BOOST_CHECK_EQUAL(corenrn_param_test.voltage, corenrn_parameters{}.voltage);\n\n    // Everything has its default value, and the first `false` says not to\n    // include default values in the output, so this should be empty\n    BOOST_CHECK(corenrn_param_test.config_to_str(false, false).empty());\n}\n"
  },
  {
    "path": "tests/unit/interleave_info/CMakeLists.txt",
    "content": "# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\nadd_executable(interleave_info_bin check_constructors.cpp)\ntarget_link_libraries(interleave_info_bin coreneuron-unit-test)\nadd_test(NAME interleave_info_constructor_test COMMAND $<TARGET_FILE:interleave_info_bin>)\ncpp_cc_configure_sanitizers(TARGET interleave_info_bin TEST interleave_info_constructor_test)\n"
  },
  {
    "path": "tests/unit/interleave_info/check_constructors.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#include \"coreneuron/permute/cellorder.hpp\"\n\n#define BOOST_TEST_MODULE interleave_info\n#include <boost/test/included/unit_test.hpp>\n\n#include <vector>\n\nusing namespace coreneuron;\n\nBOOST_AUTO_TEST_CASE(interleave_info_test) {\n    size_t nwarp = 4;\n    size_t nstride = 6;\n\n    InterleaveInfo info1;\n\n    int data1[] = {11, 37, 45, 2, 18, 37, 7, 39, 66, 33};\n    size_t data2[] = {111, 137, 245, 12, 118, 237, 199, 278, 458};\n\n    info1.nwarp = nwarp;\n    info1.nstride = nstride;\n\n    // to avoid identical values, different sub-arrays are used to initialize different members\n    copy_align_array(info1.stridedispl, data1, nwarp + 1);\n    copy_align_array(info1.stride, data1 + 1, nstride);\n    copy_align_array(info1.firstnode, data1 + 1, nwarp + 1);\n    copy_align_array(info1.lastnode, data1 + 1, nwarp + 1);\n\n    // check that copy_align_array produced distinct buffers with equal contents\n    BOOST_CHECK_NE(info1.firstnode, info1.lastnode);\n    BOOST_CHECK_EQUAL_COLLECTIONS(info1.firstnode,\n                                  info1.firstnode + nwarp + 1,\n                                  info1.lastnode,\n                                  info1.lastnode + nwarp + 1);\n\n    copy_align_array(info1.cellsize, data1 + 4, nwarp);\n    copy_array(info1.nnode, data2, nwarp);\n    copy_array(info1.ncycle, data2 + 1, nwarp);\n    copy_array(info1.idle, data2 + 2, nwarp);\n    copy_array(info1.cache_access, data2 + 3, nwarp);\n    copy_array(info1.child_race, data2 + 4, nwarp);\n\n    // copy constructor\n    InterleaveInfo info2(info1);\n\n    // assignment operator\n    InterleaveInfo info3;\n    info3 = info1;\n\n    std::vector<InterleaveInfo*> infos;\n\n    infos.push_back(&info2);\n    infos.push_back(&info3);\n\n    // test a few members\n 
   for (size_t i = 0; i < infos.size(); i++) {\n        BOOST_CHECK_EQUAL(info1.nwarp, infos[i]->nwarp);\n        BOOST_CHECK_EQUAL(info1.nstride, infos[i]->nstride);\n\n        BOOST_CHECK_EQUAL_COLLECTIONS(info1.stridedispl,\n                                      info1.stridedispl + nwarp + 1,\n                                      infos[i]->stridedispl,\n                                      infos[i]->stridedispl + nwarp + 1);\n\n        BOOST_CHECK_EQUAL_COLLECTIONS(info1.stride,\n                                      info1.stride + nstride,\n                                      infos[i]->stride,\n                                      infos[i]->stride + nstride);\n\n        BOOST_CHECK_EQUAL_COLLECTIONS(info1.cellsize,\n                                      info1.cellsize + nwarp,\n                                      infos[i]->cellsize,\n                                      infos[i]->cellsize + nwarp);\n\n        BOOST_CHECK_EQUAL_COLLECTIONS(info1.child_race,\n                                      info1.child_race + nwarp,\n                                      infos[i]->child_race,\n                                      infos[i]->child_race + nwarp);\n    }\n}\n"
  },
  {
    "path": "tests/unit/lfp/CMakeLists.txt",
    "content": "# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\nadd_executable(lfp_test_bin lfp.cpp)\ntarget_link_libraries(lfp_test_bin coreneuron-unit-test)\nadd_test(NAME lfp_test COMMAND $<TARGET_FILE:lfp_test_bin>)\ncpp_cc_configure_sanitizers(TARGET lfp_test_bin TEST lfp_test)\nset_property(\n  TEST lfp_test\n  APPEND\n  PROPERTY ENVIRONMENT OMP_NUM_THREADS=1)\n"
  },
  {
    "path": "tests/unit/lfp/lfp.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#include \"coreneuron/io/lfp.hpp\"\n#include \"coreneuron/mpi/nrnmpi.h\"\n\n#define BOOST_TEST_MODULE LFPTest\n#include <boost/test/included/unit_test.hpp>\n\n#include <iostream>\n\nusing namespace coreneuron;\nusing namespace coreneuron::lfputils;\n\ntemplate <typename F>\ndouble integral(F f, double a, double b, int n) {\n    double step = (b - a) / n;  // width of each small rectangle\n    double area = 0.0;          // signed area\n    for (int i = 0; i < n; i++) {\n        area += f(a + (i + 0.5) * step) * step;  // sum up each small rectangle\n    }\n    return area;\n}\n\n\nBOOST_AUTO_TEST_CASE(LFP_PointSource_LineSource) {\n#if NRNMPI\n    nrnmpi_init(nullptr, nullptr, false);\n#endif\n    double segment_length{1.0e-6};\n    double segment_start_val{1.0e-6};\n    std::array<double, 3> segment_start = std::array<double, 3>{0.0, 0.0, segment_start_val};\n    std::array<double, 3> segment_end =\n        paxpy(segment_start, 1.0, std::array<double, 3>{0.0, 0.0, segment_length});\n    double floor{1.0e-6};\n    pi = 3.141592653589;\n\n    std::array<double, 10> vals;\n    double circling_radius{1.0e-6};\n    std::array<double, 3> segment_middle{0.0, 0.0, 1.5e-6};\n    double medium_resistivity_fac{1.0};\n    for (auto k = 0; k < 10; k++) {\n        std::array<double, 3> approaching_elec =\n            paxpy(segment_middle, 1.0, std::array<double, 3>{0.0, 1.0e-5 - k * 1.0e-6, 0.0});\n        std::array<double, 3> circling_elec =\n            paxpy(segment_middle,\n                  1.0,\n                  std::array<double, 3>{0.0,\n                                        circling_radius * std::cos(2.0 * pi * k / 10),\n                                        circling_radius * 
std::sin(2.0 * pi * k / 10)});\n\n        double analytic_approaching_lfp = line_source_lfp_factor(\n            approaching_elec, segment_start, segment_end, floor, medium_resistivity_fac);\n        double analytic_circling_lfp = line_source_lfp_factor(\n            circling_elec, segment_start, segment_end, floor, medium_resistivity_fac);\n        double numeric_circling_lfp = integral(\n            [&](double x) {\n                return 1.0 / std::max(floor,\n                                      norm(paxpy(circling_elec,\n                                                 -1.0,\n                                                 paxpy(segment_end,\n                                                       x,\n                                                       paxpy(segment_start, -1.0, segment_end)))));\n            },\n            0.0,\n            1.0,\n            10000);\n        // TEST of analytic vs numerical integration\n        std::clog << \"ANALYTIC line source \" << analytic_circling_lfp\n                  << \" vs NUMERIC line source LFP \" << numeric_circling_lfp << \"\\n\";\n        BOOST_REQUIRE_CLOSE(analytic_circling_lfp, numeric_circling_lfp, 1.0e-6);\n        // TEST of LFP Flooring\n        BOOST_REQUIRE((approaching_elec[1] < 0.866e-6) ? 
analytic_approaching_lfp == 1.0e6 : true);\n        vals[k] = analytic_circling_lfp;\n    }\n    // TEST of SYMMETRY of LFP FORMULA\n    for (size_t k = 0; k < 5; k++) {\n        BOOST_REQUIRE(std::abs((vals[k] - vals[k + 5]) /\n                               std::max(std::abs(vals[k]), std::abs(vals[k + 5]))) < 1.0e-12);\n    }\n    std::vector<std::array<double, 3>> segments_starts = {{0., 0., 1.},\n                                                          {0., 0., 0.5},\n                                                          {0.0, 0.0, 0.0},\n                                                          {0.0, 0.0, -0.5}};\n    std::vector<std::array<double, 3>> segments_ends = {{0., 0., 0.},\n                                                        {0., 0., 1.},\n                                                        {0., 0., 0.5},\n                                                        {0.0, 0.0, 0.0}};\n    std::vector<double> radii{0.1, 0.1, 0.1, 0.1};\n    std::vector<std::array<double, 3>> electrodes = {{0.0, 0.3, 0.0}, {0.0, 0.7, 0.8}};\n    std::vector<int> indices = {0, 1, 2, 3};\n    LFPCalculator<LineSource> lfp(segments_starts, segments_ends, radii, indices, electrodes, 1.0);\n    lfp.template lfp<std::vector<double>>({0.0, 1.0, 2.0, 3.0});\n    std::vector<double> res_line_source = lfp.lfp_values();\n    LFPCalculator<PointSource> lfpp(\n        segments_starts, segments_ends, radii, indices, electrodes, 1.0);\n    lfpp.template lfp<std::vector<double>>({0.0, 1.0, 2.0, 3.0});\n    std::vector<double> res_point_source = lfpp.lfp_values();\n    BOOST_REQUIRE_CLOSE(res_line_source[0], res_point_source[0], 1.0);\n    BOOST_REQUIRE_CLOSE(res_line_source[1], res_point_source[1], 1.0);\n#if NRNMPI\n    nrnmpi_finalize();\n#endif\n}\n"
  },
  {
    "path": "tests/unit/queueing/CMakeLists.txt",
    "content": "# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\nadd_executable(queuing_test_bin test_queueing.cpp)\ntarget_link_libraries(queuing_test_bin coreneuron-unit-test)\nadd_test(NAME queuing_test COMMAND $<TARGET_FILE:queuing_test_bin>)\ncpp_cc_configure_sanitizers(TARGET queuing_test_bin TEST queuing_test)\n"
  },
  {
    "path": "tests/unit/queueing/test_queueing.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2016 - 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#include \"coreneuron/network/netcvode.hpp\"\n#include \"coreneuron/network/tqueue.hpp\"\n\n#define BOOST_TEST_MODULE QueueingTest\n#include <boost/test/included/unit_test.hpp>\n\n#include <cstdlib>\n#include <vector>\n#include <iostream>\n\nusing namespace coreneuron;\n// UNIT TESTS\nBOOST_AUTO_TEST_CASE(priority_queue_nq_dq) {\n    TQueue<pq_que> tq = TQueue<pq_que>();\n    const int num = 8;\n    int cnter = 0;\n    // enqueue 8 items with increasing time\n    for (int i = 0; i < num; ++i)\n        tq.insert(static_cast<double>(i), NULL);\n\n    BOOST_CHECK(tq.pq_que_.size() == (num - 1));\n\n    // dequeue items with time <= 5.0. Should be 6 events: from 0. to 5.\n    TQItem* item = NULL;\n    while ((item = tq.atomic_dq(5.0)) != NULL) {\n        ++cnter;\n        delete item;\n    }\n    BOOST_CHECK(cnter == 6);\n    BOOST_CHECK(tq.pq_que_.size() == (num - 6 - 1));\n\n    // dequeue the rest\n    while ((item = tq.atomic_dq(8.0)) != NULL) {\n        ++cnter;\n        delete item;\n    }\n\n    BOOST_CHECK(cnter == num);\n    BOOST_CHECK(tq.pq_que_.empty());\n    BOOST_CHECK(tq.least() == NULL);\n}\n\nBOOST_AUTO_TEST_CASE(tqueue_ordered_test) {\n    TQueue<pq_que> tq = TQueue<pq_que>();\n    const int num = 10;\n    int cnter = 0;\n    double time = double();\n\n    // insert N items with time < N\n    for (int i = 0; i < num; ++i) {\n        time = static_cast<double>(rand() % num);\n        tq.insert(time, NULL);\n    }\n\n    time = 0.0;\n    TQItem* item = NULL;\n    // dequeue all items and check that previous item time <= current item time\n    while ((item = tq.atomic_dq(10.0)) != NULL) {\n        BOOST_CHECK(time <= item->t_);\n        ++cnter;\n        time = item->t_;\n  
      delete item;\n    }\n    BOOST_CHECK(cnter == num);\n    BOOST_CHECK(tq.pq_que_.empty());\n    BOOST_CHECK(tq.least() == NULL);\n}\n\nBOOST_AUTO_TEST_CASE(tqueue_move_nolock) {}\n\nBOOST_AUTO_TEST_CASE(tqueue_remove) {}\n\nBOOST_AUTO_TEST_CASE(threaddata_interthread_send) {\n    NetCvodeThreadData nt{};\n    const size_t num = 6;\n    for (size_t i = 0; i < num; ++i)\n        nt.interthread_send(static_cast<double>(i), NULL, NULL);\n\n    BOOST_CHECK(nt.inter_thread_events_.size() == num);\n}\n/*\nBOOST_AUTO_TEST_CASE(threaddata_enqueue){\n    NetCvode n = NetCvode();\n    const int num = 6;\n    for(int i = 0; i < num; ++i)\n        n.p[1].interthread_send(static_cast<double>(i), NULL, NULL);\n\n    BOOST_CHECK(n.p[1].inter_thread_events_.size() == num);\n\n    //enqueue the inter_thread_events_\n    n.p[1].enqueue(&n, &(n.p[1]));\n    BOOST_CHECK(n.p[1].inter_thread_events_.empty());\n    BOOST_CHECK(n.p[1].tqe_->pq_que_.size() == num);\n\n    //cleanup priority queue\n    TQItem* item = NULL;\n    while((item = n.p[1].tqe_->atomic_dq(6.0)) != NULL)\n        delete item;\n}*/\n"
  },
  {
    "path": "tests/unit/solver/CMakeLists.txt",
    "content": "# =============================================================================\n# Copyright (c) 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================\nadd_executable(test-solver test_solver.cpp)\ntarget_link_libraries(test-solver coreneuron-unit-test)\nadd_test(NAME test-solver COMMAND $<TARGET_FILE:test-solver>)\ncpp_cc_configure_sanitizers(TARGET test-solver TEST test-solver)\n"
  },
  {
    "path": "tests/unit/solver/test_solver.cpp",
    "content": "/*\n# =============================================================================\n# Copyright (c) 2022 Blue Brain Project/EPFL\n#\n# See top-level LICENSE file for details.\n# =============================================================================.\n*/\n#include \"coreneuron/apps/corenrn_parameters.hpp\"\n#include \"coreneuron/gpu/nrn_acc_manager.hpp\"\n#include \"coreneuron/permute/cellorder.hpp\"\n#include \"coreneuron/permute/node_permute.h\"\n#include \"coreneuron/sim/multicore.hpp\"\n\n#define BOOST_TEST_MODULE CoreNEURON solver\n#include <boost/test/included/unit_test.hpp>\n\n#include <iostream>\n#include <functional>\n#include <map>\n#include <random>\n#include <utility>\n#include <vector>\n\nusing namespace coreneuron;\nnamespace utf = boost::unit_test;\n\n\nstruct SolverData {\n    std::vector<double> d, rhs;\n    std::vector<int> parent_index;\n};\n\nconstexpr auto magic_index_value = -2;\nconstexpr auto magic_double_value = std::numeric_limits<double>::lowest();\n\nenum struct SolverImplementation {\n    CellPermute0_CPU,\n    CellPermute0_GPU,\n    CellPermute1_CPU,\n    CellPermute1_GPU,\n    CellPermute2_CPU,\n    CellPermute2_GPU,\n    CellPermute2_CUDA\n};\n\nstd::ostream& operator<<(std::ostream& os, SolverImplementation impl) {\n    if (impl == SolverImplementation::CellPermute0_CPU) {\n        return os << \"SolverImplementation::CellPermute0_CPU\";\n    } else if (impl == SolverImplementation::CellPermute0_GPU) {\n        return os << \"SolverImplementation::CellPermute0_GPU\";\n    } else if (impl == SolverImplementation::CellPermute1_CPU) {\n        return os << \"SolverImplementation::CellPermute1_CPU\";\n    } else if (impl == SolverImplementation::CellPermute1_GPU) {\n        return os << \"SolverImplementation::CellPermute1_GPU\";\n    } else if (impl == SolverImplementation::CellPermute2_CPU) {\n        return os << \"SolverImplementation::CellPermute2_CPU\";\n    } else if (impl == 
SolverImplementation::CellPermute2_GPU) {\n        return os << \"SolverImplementation::CellPermute2_GPU\";\n    } else if (impl == SolverImplementation::CellPermute2_CUDA) {\n        return os << \"SolverImplementation::CellPermute2_CUDA\";\n    } else {\n        throw std::runtime_error(\"Invalid SolverImplementation\");\n    }\n}\n\nstruct ToyModelConfig {\n    int num_threads{1};\n    int num_cells{1};\n    int num_segments_per_cell{3};\n    std::function<double(int, int)> produce_a{[](auto, auto) { return 3.14159; }},\n        produce_b{[](auto, auto) { return 42.0; }}, produce_d{[](auto, auto) { return 7.0; }},\n        produce_rhs{[](auto, auto) { return -16.0; }};\n};\n\n// TODO include some global lock as a sanity check (only one instance of\n// SetupThreads should exist at any given time)\nstruct SetupThreads {\n    SetupThreads(SolverImplementation impl, ToyModelConfig config = {}) {\n        corenrn_param.cuda_interface = false;\n        corenrn_param.gpu = false;\n        switch (impl) {\n            case SolverImplementation::CellPermute0_GPU:\n                corenrn_param.gpu = true;\n                [[fallthrough]];\n            case SolverImplementation::CellPermute0_CPU:\n                interleave_permute_type = 0;\n                break;\n            case SolverImplementation::CellPermute1_GPU:\n                corenrn_param.gpu = true;\n                [[fallthrough]];\n            case SolverImplementation::CellPermute1_CPU:\n                interleave_permute_type = 1;\n                break;\n            case SolverImplementation::CellPermute2_CUDA:\n                corenrn_param.cuda_interface = true;\n                [[fallthrough]];\n            case SolverImplementation::CellPermute2_GPU:\n                corenrn_param.gpu = true;\n                [[fallthrough]];\n            case SolverImplementation::CellPermute2_CPU:\n                interleave_permute_type = 2;\n                break;\n        }\n        use_solve_interleave = 
interleave_permute_type > 0;\n        nrn_threads_create(config.num_threads);\n        create_interleave_info();\n        int num_cells_remaining{config.num_cells}, total_cells{};\n        for (auto ithread = 0; ithread < nrn_nthread; ++ithread) {\n            auto& nt = nrn_threads[ithread];\n            // How many cells to distribute on this thread, trying to get the right\n            // total even if num_threads does not exactly divide num_cells.\n            nt.ncell = num_cells_remaining / (nrn_nthread - ithread);\n            total_cells += nt.ncell;\n            num_cells_remaining -= nt.ncell;\n            // How many segments are there in this thread?\n            nt.end = nt.ncell * config.num_segments_per_cell;\n            auto const padded_size = nrn_soa_padded_size(nt.end, 0);\n            // Allocate one big block because the GPU data transfer code assumes this.\n            nt._ndata = padded_size * 4;\n            nt._data = static_cast<double*>(emalloc_align(nt._ndata * sizeof(double)));\n            auto* vec_rhs = (nt._actual_rhs = nt._data + 0 * padded_size);\n            auto* vec_d = (nt._actual_d = nt._data + 1 * padded_size);\n            auto* vec_a = (nt._actual_a = nt._data + 2 * padded_size);\n            auto* vec_b = (nt._actual_b = nt._data + 3 * padded_size);\n            auto* parent_indices =\n                (nt._v_parent_index = static_cast<int*>(emalloc_align(padded_size * sizeof(int))));\n            // Magic value to check against later.\n            std::fill(parent_indices, parent_indices + nt.end, magic_index_value);\n            // Put all the root nodes first, then put the other segments\n            // in blocks. i.e. 
ABCDAAAABBBBCCCCDDDD\n            auto const get_index = [ncell = nt.ncell,\n                                    nseg = config.num_segments_per_cell](auto icell, auto iseg) {\n                if (iseg == 0) {\n                    return icell;\n                } else {\n                    return ncell + icell * (nseg - 1) + iseg - 1;\n                }\n            };\n            for (auto icell = 0; icell < nt.ncell; ++icell) {\n                for (auto iseg = 0; iseg < config.num_segments_per_cell; ++iseg) {\n                    auto const global_index = get_index(icell, iseg);\n                    vec_a[global_index] = config.produce_a(icell, iseg);\n                    vec_b[global_index] = config.produce_b(icell, iseg);\n                    vec_d[global_index] = config.produce_d(icell, iseg);\n                    vec_rhs[global_index] = config.produce_rhs(icell, iseg);\n                    // 0th element is the root node, which has no parent\n                    // other elements are attached in a binary tree configuration\n                    // |      0      |\n                    // |    /   \\    |\n                    // |   1     2   |\n                    // |  / \\   / \\  |\n                    // | 3   4 5   6 |\n                    // TODO: include some other topologies, e.g. a long straight line, or\n                    // an unbalanced tree.\n                    auto const parent_id = iseg ? 
get_index(icell, (iseg - 1) / 2) : -1;\n                    parent_indices[global_index] = parent_id;\n                }\n            }\n            // Check we didn't mess up populating any parent indices\n            for (auto i = 0; i < nt.end; ++i) {\n                BOOST_REQUIRE(parent_indices[i] != magic_index_value);\n                // Root nodes should come first for --cell-permute=0\n                if (i < nt.ncell) {\n                    BOOST_REQUIRE(parent_indices[i] == -1);\n                }\n            }\n            if (interleave_permute_type) {\n                nt._permute = interleave_order(nt.id, nt.ncell, nt.end, parent_indices);\n                BOOST_REQUIRE(nt._permute);\n                permute_data(vec_a, nt.end, nt._permute);\n                permute_data(vec_b, nt.end, nt._permute);\n                // This isn't done in CoreNEURON because these are reset every\n                // time step, but permute d/rhs here so that the initial values\n                // set by produce_d and produce_rhs are propagated consistently\n                // to all of the solver implementations.\n                permute_data(vec_d, nt.end, nt._permute);\n                permute_data(vec_rhs, nt.end, nt._permute);\n                // index values change as well as ordering\n                permute_ptr(parent_indices, nt.end, nt._permute);\n                node_permute(parent_indices, nt.end, nt._permute);\n            }\n        }\n        if (impl == SolverImplementation::CellPermute0_GPU) {\n            std::cout << \"CellPermute0_GPU is a nonstandard configuration, copying data to the \"\n                         \"device may produce warnings:\";\n        }\n        if (corenrn_param.gpu) {\n            setup_nrnthreads_on_device(nrn_threads, nrn_nthread);\n        }\n        if (impl == SolverImplementation::CellPermute0_GPU) {\n            std::cout << \"\\n...no more warnings expected\" << std::endl;\n        }\n        // Make sure we produced 
the number of cells we were aiming for\n        BOOST_REQUIRE(total_cells == config.num_cells);\n        BOOST_REQUIRE(num_cells_remaining == 0);\n    }\n\n    ~SetupThreads() {\n        if (corenrn_param.gpu) {\n            delete_nrnthreads_on_device(nrn_threads, nrn_nthread);\n        }\n        for (auto& nt: *this) {\n            free_memory(std::exchange(nt._data, nullptr));\n            delete[] std::exchange(nt._permute, nullptr);\n            free_memory(std::exchange(nt._v_parent_index, nullptr));\n        }\n        destroy_interleave_info();\n        nrn_threads_free();\n    }\n\n    auto dump_solver_data() {\n        std::vector<SolverData> ret{static_cast<std::size_t>(nrn_nthread)};\n        // Sync the solver data from GPU to host\n        update_nrnthreads_on_host(nrn_threads, nrn_nthread);\n        // Un-permute the data and store it in ret.{d,parent_index,rhs}\n        for (auto i = 0; i < nrn_nthread; ++i) {\n            auto& nt = nrn_threads[i];\n            auto& sd = ret[i];\n            sd.d.resize(nt.end, magic_double_value);\n            sd.parent_index.resize(nt.end, magic_index_value);\n            sd.rhs.resize(nt.end, magic_double_value);\n            auto* inv_permute = nt._permute ? inverse_permute(nt._permute, nt.end) : nullptr;\n            for (auto i = 0; i < nt.end; ++i) {\n                // index in permuted vectors\n                auto const p_i = nt._permute ? nt._permute[i] : i;\n                // parent index in permuted vectors\n                auto const p_parent = nt._v_parent_index[p_i];\n                // parent index in unpermuted vectors (i.e. on the same scale as `i`)\n                auto const parent = p_parent == -1\n                                        ? -1\n                                        : (inv_permute ? 
inv_permute[p_parent] : p_parent);\n                // Save the values to the de-permuted return structure\n                sd.d[i] = nt._actual_d[p_i];\n                sd.parent_index[i] = parent;\n                sd.rhs[i] = nt._actual_rhs[p_i];\n            }\n            delete[] inv_permute;\n            for (auto i = 0; i < nt.end; ++i) {\n                BOOST_REQUIRE(sd.d[i] != magic_double_value);\n                BOOST_REQUIRE(sd.parent_index[i] != magic_index_value);\n                BOOST_REQUIRE(sd.rhs[i] != magic_double_value);\n            }\n        }\n        return ret;\n    }\n\n    void solve() {\n        for (auto& thread: *this) {\n            nrn_solve_minimal(&thread);\n        }\n    }\n\n    NrnThread* begin() const {\n        return nrn_threads;\n    }\n    NrnThread* end() const {\n        return nrn_threads + nrn_nthread;\n    }\n};\n\ntemplate <typename... Args>\nauto solve_and_dump(Args&&... args) {\n    SetupThreads threads{std::forward<Args>(args)...};\n    threads.solve();\n    return threads.dump_solver_data();\n}\n\nauto active_implementations() {\n    // These are always available\n    std::vector<SolverImplementation> ret{SolverImplementation::CellPermute0_CPU,\n                                          SolverImplementation::CellPermute1_CPU,\n                                          SolverImplementation::CellPermute2_CPU};\n#ifdef CORENEURON_ENABLE_GPU\n    // Consider making these steerable via a runtime switch in GPU builds\n    ret.push_back(SolverImplementation::CellPermute0_GPU);\n    ret.push_back(SolverImplementation::CellPermute1_GPU);\n    ret.push_back(SolverImplementation::CellPermute2_GPU);\n    ret.push_back(SolverImplementation::CellPermute2_CUDA);\n#endif\n    return ret;\n}\n\nvoid compare_solver_data(\n    std::map<SolverImplementation, std::vector<SolverData>> const& solver_data) {\n    // CellPermute0_CPU is the simplest version of the solver, it should always\n    // be present and it's a good reference 
to use\n    constexpr auto ref_impl = SolverImplementation::CellPermute0_CPU;\n    BOOST_REQUIRE(solver_data.find(ref_impl) != solver_data.end());\n    auto const& ref_data = solver_data.at(ref_impl);\n    for (auto const& [impl, impl_data]: solver_data) {\n        // Must have compatible numbers of threads.\n        BOOST_REQUIRE(impl_data.size() == ref_data.size());\n        std::cout << \"Comparing \" << impl << \" to \" << ref_impl << std::endl;\n        for (auto n_thread = 0ul; n_thread < impl_data.size(); ++n_thread) {\n            // Must have compatible numbers of segments/data entries\n            BOOST_REQUIRE(impl_data[n_thread].d.size() == ref_data[n_thread].d.size());\n            BOOST_REQUIRE(impl_data[n_thread].parent_index.size() ==\n                          ref_data[n_thread].parent_index.size());\n            BOOST_REQUIRE(impl_data[n_thread].rhs.size() == ref_data[n_thread].rhs.size());\n            BOOST_TEST(impl_data[n_thread].d == ref_data[n_thread].d,\n                       boost::test_tools::per_element());\n            BOOST_TEST(impl_data[n_thread].parent_index == ref_data[n_thread].parent_index,\n                       boost::test_tools::per_element());\n            BOOST_TEST(impl_data[n_thread].rhs == ref_data[n_thread].rhs,\n                       boost::test_tools::per_element());\n        }\n    }\n}\n\ntemplate <typename... Args>\nauto compare_all_active_implementations(Args&&... 
args) {\n    std::map<SolverImplementation, std::vector<SolverData>> solver_data;\n    for (auto impl: active_implementations()) {\n        solver_data[impl] = solve_and_dump(impl, std::forward<Args>(args)...);\n    }\n    compare_solver_data(solver_data);\n    return solver_data;\n}\n\n// *Roughly* tuned to accommodate NVHPC 22.3 at -O0; the largest differences come\n// from the pseudorandom seeded tests.\nconstexpr double default_tolerance = 2e-11;\n\n// May need to add some different tolerances here\nBOOST_AUTO_TEST_CASE(SingleCellAndThread, *utf::tolerance(default_tolerance)) {\n    constexpr std::size_t segments = 32;\n    ToyModelConfig config{};\n    config.num_segments_per_cell = segments;\n    auto const solver_data = compare_all_active_implementations(config);\n    for (auto const& [impl, data]: solver_data) {\n        BOOST_REQUIRE(data.size() == 1);  // nthreads\n        BOOST_REQUIRE(data[0].d.size() == segments);\n        BOOST_REQUIRE(data[0].parent_index.size() == segments);\n        BOOST_REQUIRE(data[0].rhs.size() == segments);\n    }\n}\n\nBOOST_AUTO_TEST_CASE(UnbalancedCellSingleThread, *utf::tolerance(default_tolerance)) {\n    ToyModelConfig config{};\n    config.num_segments_per_cell = 19;  // not a nice round number\n    compare_all_active_implementations(config);\n}\n\nBOOST_AUTO_TEST_CASE(LargeCellSingleThread, *utf::tolerance(default_tolerance)) {\n    ToyModelConfig config{};\n    config.num_segments_per_cell = 4096;\n    compare_all_active_implementations(config);\n}\n\nBOOST_AUTO_TEST_CASE(ManySmallCellsSingleThread, *utf::tolerance(default_tolerance)) {\n    ToyModelConfig config{};\n    config.num_cells = 1024;\n    compare_all_active_implementations(config);\n}\n\nBOOST_AUTO_TEST_CASE(ManySmallCellsMultiThread, *utf::tolerance(default_tolerance)) {\n    ToyModelConfig config{};\n    config.num_cells = 1024;\n    config.num_threads = 2;\n    compare_all_active_implementations(config);\n}\n\nauto random_config() {\n    std::mt19937_64 
gen{42};\n    ToyModelConfig config{};\n    config.produce_a = [g = gen, d = std::normal_distribution{1.0, 0.1}](int icell,\n                                                                         int iseg) mutable {\n        return d(g);\n    };\n    config.produce_b = [g = gen, d = std::normal_distribution{7.0, 0.2}](int, int) mutable {\n        return d(g);\n    };\n    config.produce_d = [g = gen, d = std::normal_distribution{-0.1, 0.01}](int, int) mutable {\n        return d(g);\n    };\n    config.produce_rhs = [g = gen, d = std::normal_distribution{-15.0, 2.0}](int, int) mutable {\n        return d(g);\n    };\n    return config;\n}\n\nBOOST_AUTO_TEST_CASE(LargeCellSingleThreadRandom, *utf::tolerance(default_tolerance)) {\n    auto config = random_config();\n    config.num_segments_per_cell = 4096;\n    compare_all_active_implementations(config);\n}\n\nBOOST_AUTO_TEST_CASE(ManySmallCellsSingleThreadRandom, *utf::tolerance(default_tolerance)) {\n    auto config = random_config();\n    config.num_cells = 1024;\n    compare_all_active_implementations(config);\n}\n"
  }
]