Full Code of HigherOrderCO/HVM2 for AI

Repository: HigherOrderCO/HVM2
Branch: main
Commit: 654276018084
Files: 84
Total size: 422.5 KB

Directory structure:
gitextract_cj26vbd4/
├── .github/
│   ├── ISSUE_TEMPLATE/
│   │   ├── bug_report.yml
│   │   ├── config.yml
│   │   └── feature_request.md
│   └── workflows/
│       ├── bench.yml
│       ├── checks.yml
│       └── delete-cancelled.yml
├── .gitignore
├── Cargo.toml
├── LICENSE
├── README.md
├── build.rs
├── examples/
│   ├── demo_io/
│   │   ├── main.bend
│   │   └── main.hvm
│   ├── sort_bitonic/
│   │   ├── main.bend
│   │   └── main.hvm
│   ├── sort_radix/
│   │   ├── main.bend
│   │   └── main.hvm
│   ├── stress/
│   │   ├── README.md
│   │   ├── main.bend
│   │   ├── main.hvm
│   │   ├── main.js
│   │   └── main.py
│   ├── sum_rec/
│   │   ├── main.bend
│   │   ├── main.hvm
│   │   ├── main.js
│   │   └── sum.js
│   ├── sum_tree/
│   │   ├── main.bend
│   │   └── main.hvm
│   └── tuples/
│       ├── tuples.bend
│       └── tuples.hvm
├── paper/
│   ├── HVM2.typst
│   ├── README.md
│   └── inet.typ
├── src/
│   ├── ast.rs
│   ├── cmp.rs
│   ├── hvm.c
│   ├── hvm.cu
│   ├── hvm.cuh
│   ├── hvm.h
│   ├── hvm.rs
│   ├── lib.rs
│   ├── main.rs
│   ├── run.c
│   └── run.cu
└── tests/
    ├── programs/
    │   ├── empty.hvm
    │   ├── hello-world.hvm
    │   ├── io/
    │   │   ├── basic.bend
    │   │   ├── basic.hvm
    │   │   ├── invalid-name.bend
    │   │   ├── invalid-name.hvm
    │   │   ├── open1.bend
    │   │   ├── open1.hvm
    │   │   ├── open2.bend
    │   │   ├── open2.hvm
    │   │   ├── open3.bend
    │   │   └── open3.hvm
    │   ├── list.hvm
    │   ├── numeric-casts.hvm
    │   ├── numerics/
    │   │   ├── f24.hvm
    │   │   ├── i24.hvm
    │   │   └── u24.hvm
    │   └── safety-check.hvm
    ├── run.rs
    └── snapshots/
        ├── run__file@empty.hvm.snap
        ├── run__file@hello-world.hvm.snap
        ├── run__file@list.hvm.snap
        ├── run__file@numeric-casts.hvm.snap
        ├── run__file@numerics__f24.hvm.snap
        ├── run__file@numerics__i24.hvm.snap
        ├── run__file@numerics__u24.hvm.snap
        ├── run__file@safety-check.hvm.snap
        ├── run__file@sort_bitonic__main.hvm.snap
        ├── run__file@sort_radix__main.hvm.snap
        ├── run__file@stress__main.hvm.snap
        ├── run__file@sum_rec__main.hvm.snap
        ├── run__file@sum_tree__main.hvm.snap
        ├── run__file@tuples__tuples.hvm.snap
        ├── run__io_file@demo_io__main.hvm.snap
        ├── run__io_file@io__basic.hvm.snap
        ├── run__io_file@io__invalid-name.hvm.snap
        ├── run__io_file@io__open1.hvm.snap
        ├── run__io_file@io__open2.hvm.snap
        ├── run__io_file@io__open3.hvm.snap
        └── run__io_file@io__read_and_print.hvm.snap

================================================
FILE CONTENTS
================================================

================================================
FILE: .github/ISSUE_TEMPLATE/bug_report.yml
================================================
name: Bug report
description: Create a report to help us improve.
body:
 - type: markdown
   attributes: 
     value: |
      ### Bug Report
      Note: your issue may already have been reported, so please check the existing [issues](https://github.com/HigherOrderCO/HVM/issues) first. If you find a similar issue, respond with a reaction or add any information that you feel may be helpful.

      ### For Windows Users
      There is currently no native way to install HVM on Windows. As a temporary workaround, please use [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install).
  
 - type: textarea
   attributes:
    label: Reproducing the behavior
    description: A clear and concise description of what the bug is.
    value: |
      Example:
       Running command...
       With code....
       Error...
       Expected behavior....
   validations:
    required: true
 
 - type: textarea
   attributes:
    label: System Settings
    description: Your system's settings
    value: |
     Example:
      - OS: [e.g. Linux (Ubuntu 22.04)]
      - CPU: [e.g. Intel i9-14900KF]
      - GPU: [e.g. RTX 4090]
      - CUDA version: [e.g. release 12.4, V12.4.131]
   validations:
     required: true
 - type: textarea
   attributes:
    label: Additional context
    description: Add any other context about the problem here (Optional). 
   


================================================
FILE: .github/ISSUE_TEMPLATE/config.yml
================================================
blank_issues_enabled: false
contact_links: 
    - name: Bend Related Issues
      url: https://github.com/HigherOrderCO/Bend/issues/new/choose
      about: For Bend-related issues, please report them on the Bend repository.


================================================
FILE: .github/ISSUE_TEMPLATE/feature_request.md
================================================
---
name: Feature request
about: Suggest a feature that you think should be added.
title: ''
labels: ''

---

**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm frustrated when [...]

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.


================================================
FILE: .github/workflows/bench.yml
================================================
name: Bench

on:
  pull_request:

concurrency:
  group: bench-${{ github.ref }}
  cancel-in-progress: true

jobs:
  bench:
    runs-on: [self-hosted, cuda]
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v3
      - name: compare perf
        run: |
          git fetch origin main
          git clone https://github.com/HigherOrderCO/hvm-bench
          cd hvm-bench
          NO_COLOR=1 cargo run bench --repo-dir ../ -r main --timeout 20 > ../table
        shell: bash -l {0}
      - name: write comment
        run: |
          echo 'Perf run for [`'`git rev-parse --short ${{ github.sha }}`'`](https://github.com/higherorderco/HVM/commit/${{ github.sha }}):' >> comment
          echo '```' >> comment
          cat table >> comment
          echo '```' >> comment
      - name: post comment
        run: gh pr comment ${{ github.event.number }} -F comment
        env:
          GH_TOKEN: ${{ secrets.PAT }}
      - name: hide old comment
        env:
          GH_TOKEN: ${{ secrets.PAT }}
        run: |
          COMMENT_ID=$(
            gh api graphql -F pr=${{ github.event.number }} -f query='
              query($pr: Int!) {
                organization(login: "higherorderco") {
                  repository(name: "HVM") {
                    pullRequest(number: $pr) {
                      comments(last: 100) {
                        nodes { id author { login } }
                      }
                    }
                  }
                }
            }
            ' \
            | jq -r '
              [
                .data.organization.repository.pullRequest.comments.nodes | .[]
                | select(.author.login == "HigherOrderBot")
                | .id
              ] | .[-2]
            '
          )

          if [ -n "$COMMENT_ID" ] && [ "$COMMENT_ID" != "null" ]
          then
            gh api graphql -F id=$COMMENT_ID -f query='
              mutation($id: ID!) {
                minimizeComment(input: {
                  subjectId: $id,
                  classifier: OUTDATED,
                }) { minimizedComment { ...on Comment { id } } }
              }
            '
          fi
      - name: delete on cancel
        if: ${{ cancelled() }}
        run: gh workflow run delete-cancelled.yml -f run_id=${{ github.run_id }}
        env:
          GH_TOKEN: ${{ secrets.PAT }}


================================================
FILE: .github/workflows/checks.yml
================================================
name: Checks

on:
  pull_request:
  merge_group:
  push:
    branches:
      - main

jobs:
  check:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v3
      - uses: actions/cache@v2
        with:
          path: |
            ~/.cargo/registry
            ~/.cargo/git
            target
          key: ${{ runner.os }}-check-${{ hashFiles('**/Cargo.lock') }}
      - run: RUSTFLAGS="-D warnings" cargo check --all-targets
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v3
      - uses: actions/cache@v2
        with:
          path: |
            ~/.cargo/registry
            ~/.cargo/git
            target
          key: ${{ runner.os }}-test-${{ hashFiles('**/Cargo.lock') }}
      - run: cargo test --release
  test-cuda:
    needs: test # don't bother the cuda machine if other tests are failing
    runs-on: [self-hosted, cuda]
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v3
      - run: cargo test --release
        shell: bash -l {0}



================================================
FILE: .github/workflows/delete-cancelled.yml
================================================
name: Delete Cancelled Benchmarks

on:
  workflow_dispatch:
    inputs:
      run_id:
        type: string
        description: "" 

jobs:
  delete:
    runs-on: ubuntu-latest
    steps:
      - run: gh api "repos/higherorderco/hvm-core/actions/runs/${{ inputs.run_id }}" -X DELETE
        env:
          GH_TOKEN: ${{ secrets.PAT }}


================================================
FILE: .gitignore
================================================
.fill.tmp
hvm-cuda-experiments/
src/hvm
src/old_cmp.rs
src/tmp/
target/
tmp/
.hvm/
examples/**/main
examples/**/*.c
examples/**/*.cu
.out.hvm

# nix-direnv
/.direnv/
/.envrc


================================================
FILE: Cargo.toml
================================================
[package]
name = "hvm"
description = "A massively parallel, optimal functional runtime in Rust."
license = "Apache-2.0"
version = "2.0.22"
edition = "2021"
rust-version = "1.74"
build = "build.rs"
repository = "https://github.com/HigherOrderCO/HVM"

[lib]
name = "hvm"
path = "src/lib.rs"

[dependencies]
TSPL = "0.0.13"
clap = "4.5.2"
highlight_error = "0.1.1"
num_cpus = "1.0"

[build-dependencies]
cc = "1.0"
num_cpus = "1.0"

[features]
default = []
# C and CUDA features are determined during build
c = []
cuda = []

[dev-dependencies]
insta = { version = "1.39.0", features = ["glob"] }


================================================
FILE: LICENSE
================================================
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS


================================================
FILE: README.md
================================================
Higher-order Virtual Machine 2 (HVM2)
=====================================

**Higher-order Virtual Machine 2 (HVM2)** is a massively parallel [Interaction
Combinator](https://www.semanticscholar.org/paper/Interaction-Combinators-Lafont/6cfe09aa6e5da6ce98077b7a048cb1badd78cc76)
evaluator.

By compiling programs from high-level languages (such as Python and Haskell) to
HVM, one can run these languages directly on massively parallel hardware, like
GPUs, with near-ideal speedup.

HVM2 is the successor to [HVM1](https://github.com/HigherOrderCO/HVM1), a 2022
prototype of this concept. Compared to its predecessor, HVM2 is simpler, faster
and, most importantly, more correct. [HOC](https://HigherOrderCO.com/) provides
long-term support for all features listed in the [PAPER](./paper/HVM2.pdf).

This repository provides a low-level IR language for specifying the HVM2 nets
and a compiler from that language to C and CUDA. It is not meant for direct
human usage. If you're looking for a high-level language to interface with HVM2,
check [Bend](https://github.com/HigherOrderCO/Bend) instead.

Usage
-----

> DISCLAIMER: Windows is currently not supported. For now, please use [WSL](https://learn.microsoft.com/en-us/windows/wsl/install) as a workaround.

First install the dependencies:
* If you want to use the C runtime, install a C11-compatible compiler like GCC or Clang.
* If you want to use the CUDA runtime, install CUDA and nvcc (the CUDA compiler).
  - _HVM requires CUDA 12.x and currently only works on Nvidia GPUs._

Install HVM2:

```sh
cargo install hvm
```

There are multiple ways to run an HVM program:

```sh
hvm run    <file.hvm> # interpret via Rust
hvm run-c  <file.hvm> # interpret via C
hvm run-cu <file.hvm> # interpret via CUDA
hvm gen-c  <file.hvm> # compile to standalone C
hvm gen-cu <file.hvm> # compile to standalone CUDA
```

All modes produce the same output. The compiled modes require you to compile the
generated file (with `gcc file.c -o file`, for example), but are faster to run.
The CUDA versions have much higher peak performance but are less stable. As a
rule of thumb, `gen-c` should be used in production.

Language
--------

HVM is a low-level compile target for high-level languages. It provides a raw
syntax for wiring interaction nets. For example:

```javascript
@main = a
  & @sum ~ (28 (0 a))

@sum = (?(((a a) @sum__C0) b) b)

@sum__C0 = ({c a} ({$([*2] $([+1] d)) $([*2] $([+0] b))} f))
  &! @sum ~ (a (b $([+] $(e f))))
  &! @sum ~ (c (d e))
```

The file above implements a recursive sum. If that looks unreadable to you,
don't worry: it isn't meant to be read by humans. [Bend](https://github.com/HigherOrderCO/Bend) is
the human-readable language and should be used both by end users and by languages
aiming to target the HVM. If you're looking to learn more about the core
syntax and tech, though, please check the [PAPER](./paper/HVM2.pdf).
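
As a rough guide to what the example net computes, here is a minimal Rust
sketch of the same recursion. It is one reading of the `@sum` and `@sum__C0`
definitions, not code from this repository: the function name `sum`, the
`U24_MASK` constant, and the assumption that arithmetic wraps at 24 bits are
all illustrative. `@main` calls the net with depth 28; the sketch uses a small
depth so the result is easy to check by hand.

```rust
/// Assumption: numbers behave like unsigned 24-bit integers that wrap on overflow.
const U24_MASK: u32 = 0x00FF_FFFF;

/// Hypothetical reading of `@sum`: at depth 0 return `x`; otherwise recurse
/// on `x * 2` and `x * 2 + 1` with depth - 1 and add the two results.
fn sum(depth: u32, x: u32) -> u32 {
    if depth == 0 {
        return x & U24_MASK;
    }
    let doubled = x.wrapping_mul(2) & U24_MASK;
    // The two sub-calls do not depend on each other.
    let left = sum(depth - 1, doubled);
    let right = sum(depth - 1, doubled.wrapping_add(1) & U24_MASK);
    left.wrapping_add(right) & U24_MASK
}

fn main() {
    // Depth 4 starting from 0 visits the leaves 0..=15, whose sum is 120.
    println!("{}", sum(4, 0));
}
```

Each call at a non-zero depth produces two independent sub-calls, which is the
kind of work a parallel evaluator can split across threads.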


================================================
FILE: build.rs
================================================
fn main() {
  let cores = num_cpus::get();
  let tpcl2 = (cores as f64).log2().floor() as u32;

  println!("cargo:rerun-if-changed=src/run.c");
  println!("cargo:rerun-if-changed=src/hvm.c");
  println!("cargo:rerun-if-changed=src/run.cu");
  println!("cargo:rerun-if-changed=src/hvm.cu");
  println!("cargo:rustc-link-arg=-rdynamic");

  match cc::Build::new()
      .file("src/run.c")
      .opt_level(3)
      .warnings(false)
      .define("TPC_L2", &*tpcl2.to_string())
      .define("IO", None)
      .try_compile("hvm-c") {
    Ok(_) => println!("cargo:rustc-cfg=feature=\"c\""),
    Err(e) => {
      println!("cargo:warning=\x1b[1m\x1b[31mWARNING: Failed to compile src/run.c:\x1b[0m {}", e);
      println!("cargo:warning=Ignoring src/run.c and proceeding with build. \x1b[1mThe C runtime will not be available.\x1b[0m");
    }
  }

  // Builds the CUDA runtime (src/run.cu) if nvcc can be launched
  if std::process::Command::new("nvcc").arg("--version").stdout(std::process::Stdio::null()).stderr(std::process::Stdio::null()).status().is_ok() {
    if let Ok(cuda_path) = std::env::var("CUDA_HOME") {
      println!("cargo:rustc-link-search=native={}/lib64", cuda_path);
    } else {
      println!("cargo:rustc-link-search=native=/usr/local/cuda/lib64");
    }

    cc::Build::new()
      .cuda(true)
      .file("src/run.cu")
      .define("IO", None)
      .flag("-diag-suppress=177") // variable was declared but never referenced
      .flag("-diag-suppress=550") // variable was set but never used
      .flag("-diag-suppress=20039") // a __host__ function redeclared with __device__, hence treated as a __host__ __device__ function
      .compile("hvm-cu");

    println!("cargo:rustc-cfg=feature=\"cuda\"");
  }
  else {
    println!("cargo:warning=\x1b[1m\x1b[31mWARNING: CUDA compiler not found.\x1b[0m \x1b[1mHVM will not be able to run on GPU.\x1b[0m");
  }
}


================================================
FILE: examples/demo_io/main.bend
================================================
test-io = 1

def unwrap(res):
  match res:
    case Result/Ok:
      return res.val
    case Result/Err:
      return res.val

def open():
  return call("OPEN", ("./LICENSE", "r"))

def read(f):
  return call("READ", (f, 47))

def print(bytes):
  with IO:
    * <- call("WRITE", (1, bytes))
    * <- call("WRITE", (1, "\n"))

    return wrap(*)

def close(f):
  return call("CLOSE", f)

def main():
  with IO:
    f <- open()
    f = unwrap(f)
    bytes <- read(f)
    bytes = unwrap(bytes)
    * <- print(bytes)
    res <- close(f)

    return wrap(res)


================================================
FILE: examples/demo_io/main.hvm
================================================
@IO/Call = (a (b (c (d ((@IO/Call/tag (a (b (c (d e))))) e)))))

@IO/Call/tag = 1

@IO/Done = (a (b ((@IO/Done/tag (a (b c))) c)))

@IO/Done/tag = 0

@IO/MAGIC = (13683217 16719857)

@IO/bind = ((@IO/bind__C2 a) a)

@IO/bind__C0 = (* (b (a c)))
  & @undefer ~ (a (b c))

@IO/bind__C1 = (* (* (a (b ((c d) (e g))))))
  & @IO/Call ~ (@IO/MAGIC (a (b ((c f) g))))
  & @IO/bind ~ (d (e f))

@IO/bind__C2 = (?((@IO/bind__C0 @IO/bind__C1) a) a)

@IO/wrap = a
  & @IO/Done ~ (@IO/MAGIC a)

@String/Cons = (a (b ((@String/Cons/tag (a (b c))) c)))

@String/Cons/tag = 1

@String/Nil = ((@String/Nil/tag a) a)

@String/Nil/tag = 0

@call = (a (b c))
  & @IO/Call ~ (@IO/MAGIC (a (b (@call__C0 c))))

@call__C0 = a
  & @IO/Done ~ (@IO/MAGIC a)

@close = f
  & @call ~ (e f)
  & @String/Cons ~ (67 (d e))
  & @String/Cons ~ (76 (c d))
  & @String/Cons ~ (79 (b c))
  & @String/Cons ~ (83 (a b))
  & @String/Cons ~ (69 (@String/Nil a))

@main = w
  & @IO/bind ~ (@open ((((s (a u)) (@IO/wrap v)) v) w))
  & @IO/bind ~ (c ((((n (o (d q))) (r (s t))) t) u))
  & @unwrap ~ (a {b r})
  & @read ~ (b c)
  & @IO/bind ~ (f ((((g (k (* m))) (n (o p))) p) q))
  & @print ~ (e f)
  & @unwrap ~ (d e)
  & @IO/bind ~ (h ((((i i) (k l)) l) m))
  & @close ~ (g h)

@open = o
  & @call ~ (d ((m n) o))
  & @String/Cons ~ (79 (c d))
  & @String/Cons ~ (80 (b c))
  & @String/Cons ~ (69 (a b))
  & @String/Cons ~ (78 (@String/Nil a))
  & @String/Cons ~ (46 (l m))
  & @String/Cons ~ (47 (k l))
  & @String/Cons ~ (76 (j k))
  & @String/Cons ~ (73 (i j))
  & @String/Cons ~ (67 (h i))
  & @String/Cons ~ (69 (g h))
  & @String/Cons ~ (78 (f g))
  & @String/Cons ~ (83 (e f))
  & @String/Cons ~ (69 (@String/Nil e))
  & @String/Cons ~ (114 (@String/Nil n))

@print = (f h)
  & @IO/bind ~ (g (@print__C3 h))
  & @call ~ (e ((1 f) g))
  & @String/Cons ~ (87 (d e))
  & @String/Cons ~ (82 (c d))
  & @String/Cons ~ (73 (b c))
  & @String/Cons ~ (84 (a b))
  & @String/Cons ~ (69 (@String/Nil a))

@print__C0 = ((* a) (* a))

@print__C1 = g
  & @call ~ (e ((1 f) g))
  & @String/Cons ~ (87 (d e))
  & @String/Cons ~ (82 (c d))
  & @String/Cons ~ (73 (b c))
  & @String/Cons ~ (84 (a b))
  & @String/Cons ~ (69 (@String/Nil a))
  & @String/Cons ~ (10 (@String/Nil f))

@print__C2 = (a (* c))
  & @IO/bind ~ (@print__C1 (((@print__C0 (a b)) b) c))

@print__C3 = ((@print__C2 (@IO/wrap a)) a)

@read = (e f)
  & @call ~ (d ((e 47) f))
  & @String/Cons ~ (82 (c d))
  & @String/Cons ~ (69 (b c))
  & @String/Cons ~ (65 (a b))
  & @String/Cons ~ (68 (@String/Nil a))

@test-io = 1

@undefer = (((a a) b) b)

@unwrap = ((@unwrap__C0 a) a)

@unwrap__C0 = (?(((a a) (* (b b))) c) c)




================================================
FILE: examples/sort_bitonic/main.bend
================================================
def gen(d, x):
  switch d:
    case 0:
      return x
    case _:
      return (gen(d-1, x * 2 + 1), gen(d-1, x * 2))

def sum(d, t):
  switch d:
    case 0:
      return t
    case _:
      (t.a, t.b) = t
      return sum(d-1, t.a) + sum(d-1, t.b)

def swap(s, a, b):
  switch s:
    case 0:
      return (a,b)
    case _:
      return (b,a)

def warp(d, s, a, b):
  switch d:
    case 0:
      return swap(s ^ (a > b), a, b)
    case _:
      (a.a,a.b) = a
      (b.a,b.b) = b
      (A.a,A.b) = warp(d-1, s, a.a, b.a)
      (B.a,B.b) = warp(d-1, s, a.b, b.b)
      return ((A.a,B.a),(A.b,B.b))

def flow(d, s, t):
  switch d:
    case 0:
      return t
    case _:
      (t.a, t.b) = t
      return down(d, s, warp(d-1, s, t.a, t.b))

def down(d,s,t):
  switch d:
    case 0:
      return t
    case _:
      (t.a, t.b) = t
      return (flow(d-1, s, t.a), flow(d-1, s, t.b))

def sort(d, s, t):
  switch d:
    case 0:
      return t
    case _:
      (t.a, t.b) = t
      return flow(d, s, (sort(d-1, 0, t.a), sort(d-1, 1, t.b)))

def main:
  return sum(12, sort(12, 0, gen(12, 0)))


================================================
FILE: examples/sort_bitonic/main.hvm
================================================
@down = (?(((a (* a)) @down__C0) (b (c d))) (c (b d)))

@down__C0 = ({a e} ((c g) ({b f} (d h))))
  &! @flow ~ (a (b (c d)))
  &! @flow ~ (e (f (g h)))

@flow = (?(((a (* a)) @flow__C0) (b (c d))) (c (b d)))

@flow__C0 = ({$([+1] a) c} ((e f) ({b d} h)))
  & @down ~ (a (b (g h)))
  & @warp ~ (c (d (e (f g))))

@gen = (?(((a a) @gen__C0) b) b)

@gen__C0 = ({a d} ({$([*2] $([+1] b)) $([*2] e)} (c f)))
  &! @gen ~ (a (b c))
  &! @gen ~ (d (e f))

@main = a
  & @sum ~ (12 (@main__C1 a))

@main__C0 = a
  & @gen ~ (12 (0 a))

@main__C1 = a
  & @sort ~ (12 (0 (@main__C0 a)))

@sort = (?(((a (* a)) @sort__C0) (b (c d))) (c (b d)))

@sort__C0 = ({$([+1] a) {c f}} ((d g) (b i)))
  & @flow ~ (a (b ((e h) i)))
  &! @sort ~ (c (0 (d e)))
  &! @sort ~ (f (1 (g h)))

@sum = (?(((a a) @sum__C0) b) b)

@sum__C0 = ({a c} ((b d) f))
  &! @sum ~ (a (b $([+] $(e f))))
  &! @sum ~ (c (d e))

@swap = (?((@swap__C0 @swap__C1) (a (b c))) (b (a c)))

@swap__C0 = (b (a (a b)))

@swap__C1 = (* (a (b (a b))))

@warp = (?((@warp__C0 @warp__C1) (a (b (c d)))) (c (b (a d))))

@warp__C0 = ({a e} ({$([>] $(a b)) d} ($([^] $(b c)) f)))
  & @swap ~ (c (d (e f)))

@warp__C1 = ({a f} ((d i) ((c h) ({b g} ((e j) (k l))))))
  &! @warp ~ (f (g (h (i (j l)))))
  &! @warp ~ (a (b (c (d (e k)))))


================================================
FILE: examples/sort_radix/main.bend
================================================

# data Arr = Empty | (Single x) | (Concat x0 x1)
Empty  =         λempty λsingle λconcat empty
Single = λx      λempty λsingle λconcat (single x)
Concat = λx0 λx1 λempty λsingle λconcat (concat x0 x1)

# data Map = Free | Busy | (Node x0 x1)
Free = λfree λbusy λnode free
Busy = λfree λbusy λnode busy
Node = λx0 λx1 λfree λbusy λnode (node x0 x1)

# gen : u32 -> Arr
gen = λn switch n {
  0: λx (Single x)
  _: λx
    let x0 = (* x 2)
    let x1 = (+ x0 1)
    (Concat (gen n-1 x1) (gen n-1 x0))
}

# sum : Arr -> u32
sum = λa
  let a_empty  = 0
  let a_single = λx x
  let a_concat = λx0 λx1 (+ (sum x0) (sum x1))
  (a a_empty a_single a_concat)

# sort : Arr -> Arr
sort = λt (to_arr (to_map t) 0)

# to_arr : Map -> u32 -> Arr
to_arr = λa
  let a_free = λk Empty
  let a_busy = λk (Single k)
  let a_node = λx0 λx1 λk
    let x0 = (to_arr x0 (+ (* k 2) 0))
    let x1 = (to_arr x1 (+ (* k 2) 1))
    (Concat x0 x1)
  (a a_free a_busy a_node)

# to_map : Arr -> Map
to_map = λa
  let a_empty  = Free
  let a_single = λx (radix 24 x 1 Busy)
  let a_concat = λx0 λx1 (merge (to_map x0) (to_map x1))
  (a a_empty a_single a_concat)

# merge : Map -> Map -> Map
merge = λa
  let a_free = λb
    let b_free = Free
    let b_busy = Busy
    let b_node = λb0 λb1 (Node b0 b1)
    (b b_free b_busy b_node)
  let a_busy = λb
    let b_free = Busy
    let b_busy = Busy
    let b_node = λb0 λb1 0
    (b b_free b_busy b_node)
  let a_node = λa0 λa1 λb
    let b_free = λa0 λa1 (Node a0 a1)
    let b_busy = λa0 λa1 0
    let b_node = λb0 λb1 λa0 λa1 (Node (merge a0 b0) (merge a1 b1))
    (b b_free b_busy b_node a0 a1)
  (a a_free a_busy a_node)

# radix : u32 -> Map
radix = λi λn λk λr switch i {
  0: r
  _: (radix i-1 n (* k 2) (swap (& n k) r Free))
}

# swap : u32 -> Map -> Map -> Map
swap = λn switch n {
  0: λx0 λx1 (Node x0 x1)
  _: λx0 λx1 (Node x1 x0)
}

# main : u32
main = (sum (sort (gen 16 0)))


================================================
FILE: examples/sort_radix/main.hvm
================================================
@Busy = (* (a (* a)))

@Concat = (a (b (* (* ((a (b c)) c)))))

@Empty = (a (* (* a)))

@Free = (a (* (* a)))

@Node = (a (b (* (* ((a (b c)) c)))))

@Single = (a (* ((a b) (* b))))

@gen = (?((@gen__C0 @gen__C1) a) a)

@gen__C0 = a
  & @Single ~ a

@gen__C1 = ({a d} ($([*2] {e $([+1] b)}) g))
  & @Concat ~ (c (f g))
  &! @gen ~ (a (b c))
  &! @gen ~ (d (e f))

@main = a
  & @sum ~ (@main__C1 a)

@main__C0 = a
  & @gen ~ (16 (0 a))

@main__C1 = a
  & @sort ~ (@main__C0 a)

@merge = ((@merge__C5 (@merge__C4 (@merge__C3 a))) a)

@merge__C0 = (b (e (a (d g))))
  & @Node ~ (c (f g))
  &! @merge ~ (a (b c))
  &! @merge ~ (d (e f))

@merge__C1 = a
  & @Node ~ a

@merge__C2 = a
  & @Node ~ a

@merge__C3 = (a (b ((@merge__C1 ((* (* 0)) (@merge__C0 (a (b c))))) c)))

@merge__C4 = ((@Busy (@Busy ((* (* 0)) a))) a)

@merge__C5 = ((@Free (@Busy (@merge__C2 a))) a)

@radix = (?(((* (* (a a))) @radix__C0) b) b)

@radix__C0 = (a ({b $([&] $(d e))} ({$([*2] c) d} (f h))))
  & @radix ~ (a (b (c (g h))))
  & @swap ~ (e (f (@Free g)))

@sort = (a c)
  & @to_arr ~ (b (0 c))
  & @to_map ~ (a b)

@sum = ((0 ((a a) (@sum__C0 b))) b)

@sum__C0 = (a (b d))
  &! @sum ~ (a $([+] $(c d)))
  &! @sum ~ (b c)

@swap = (?((@swap__C0 @swap__C1) a) a)

@swap__C0 = a
  & @Node ~ a

@swap__C1 = (* (b (a c)))
  & @Node ~ (a (b c))

@to_arr = (((* @Empty) (@to_arr__C1 (@to_arr__C0 a))) a)

@to_arr__C0 = (a (d ({$([*2] $([+1] e)) $([*2] $([+0] b))} g)))
  & @Concat ~ (c (f g))
  &! @to_arr ~ (a (b c))
  &! @to_arr ~ (d (e f))

@to_arr__C1 = a
  & @Single ~ a

@to_map = ((@Free (@to_map__C1 (@to_map__C0 a))) a)

@to_map__C0 = (a (c e))
  & @merge ~ (b (d e))
  &! @to_map ~ (a b)
  &! @to_map ~ (c d)

@to_map__C1 = (a b)
  & @radix ~ (24 (a (1 (@Busy b))))


================================================
FILE: examples/stress/README.md
================================================
# stress

This is the basic stress-test used to test an implementation's maximum IPS. It
recursively creates a tree with a given depth, and then performs a recursive
computation with a given length:

```
def sum(n):
  if n == 0:
    return 0
  else:
    return n + sum(n - 1)

def fun(n):
  if n == 0:
    return sum(LENGTH)
  else:
    return fun(n - 1) + fun(n - 1)

fun(DEPTH)
```

This lets us test both the parallel and sequential performance of a runtime. For
example, with a tree of depth 14 and a length of 2^20, we have enough
parallelism to use all 32k threads of an RTX 4090, and enough sequential work
(about 1M calls) to keep each thread busy for a long time.
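
As a rough way to size a run, the amount of work can be estimated from the two parameters. The sketch below is only an illustration in plain Python (it is not part of the benchmark): `count_calls` instruments the pseudocode above, and `predicted_calls` is a hypothetical helper giving the closed-form counts.

```python
def count_calls(depth: int, length: int) -> dict:
    """Run the stress pseudocode and count how often each function is called."""
    calls = {"fun": 0, "sum": 0}

    def sum_(n: int) -> int:
        calls["sum"] += 1
        return 0 if n == 0 else n + sum_(n - 1)

    def fun(n: int) -> int:
        calls["fun"] += 1
        return sum_(length) if n == 0 else fun(n - 1) + fun(n - 1)

    calls["result"] = fun(depth)
    return calls


def predicted_calls(depth: int, length: int) -> dict:
    """Closed-form counts: 2^depth leaves, each running sum(length)."""
    leaves = 2 ** depth
    return {
        "fun": 2 * leaves - 1,            # every node of the binary call tree
        "sum": leaves * (length + 1),     # length + 1 calls per leaf
        "result": leaves * length * (length + 1) // 2,
    }


# Small parameters keep the recursion shallow enough for CPython.
print(count_calls(3, 10) == predicted_calls(3, 10))
```

The parallel work grows with `2^depth` while the sequential chain per leaf grows with `length`, which is why the two parameters can be tuned independently.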


================================================
FILE: examples/stress/main.bend
================================================
def loop(n):
  switch n:
    case 0:
      return 0
    case _:
      return loop(n-1)

def fun(n):
  switch n:
    case 0:
      return loop(0x10000)
    case _:
      return fun(n-1) + fun(n-1)

def main:
  return fun(8)


================================================
FILE: examples/stress/main.hvm
================================================
@fun = (?((@fun__C0 @fun__C1) a) a)

@fun__C0 = a
  & @loop ~ (65536 a)

@fun__C1 = ({a b} d)
  &! @fun ~ (a $([+] $(c d)))
  &! @fun ~ (b c)

@loop = (?((0 @loop__C0) a) a)

@loop__C0 = a
  & @loop ~ a

@main = a
  & @fun ~ (8 a)


================================================
FILE: examples/stress/main.js
================================================
function sum(n) {
  if (n === 0) {
    return 0;
  } else {
    return n + sum(n - 1);
  }
}

function fun(n) {
  if (n === 0) {
    return sum(4096);
  } else {
    return fun(n - 1) + fun(n - 1);
  }
}

console.log(fun(18));


================================================
FILE: examples/stress/main.py
================================================
def sum(n):
  if n == 0:
    return 0
  else:
    return n + sum(n - 1)

def fun(n):
  if n == 0:
    return sum(16)
  else:
    return fun(n - 1) + fun(n - 1)

print(fun(8))

# Demo Micro-Benchmark / Stress-Test
# 
# Complexity: 120,264,589,303 Interactions
#
# CPython: 640s on Apple M3 Max (1 thread) *
# HVM-CPU: 268s on Apple M3 Max (1 thread)
# Node.js: 128s on Apple M3 Max (1 thread) *
# HVM-CPU:  14s on Apple M3 Max (12 threads)
# HVM-GPU:   2s on NVIDIA RTX 4090 (32k threads)
#
# * estimated due to stack overflow
















================================================
FILE: examples/sum_rec/main.bend
================================================
#flavor core

sum = λn λx switch n {
  0: x
  _: let fst = (sum n-1 (+ (* x 2) 0))
     let snd = (sum n-1 (+ (* x 2) 1))
     (+ fst snd)
}

main = (sum 20 0)


================================================
FILE: examples/sum_rec/main.hvm
================================================
@main = a
  & @sum ~ (20 (0 a))

@sum = (?(((a a) @sum__C0) b) b)

@sum__C0 = ({c a} ({$([*2] $([+1] d)) $([*2] $([+0] b))} f))
  &! @sum ~ (a (b $([+] $(e f))))
  &! @sum ~ (c (d e))


================================================
FILE: examples/sum_rec/main.js
================================================
function sum(a, b) {
  if (a === b) {
    return a;
  } else {
    let mid = Math.floor((a + b) / 2);
    let fst = sum(a, mid + 0);
    let snd = sum(mid + 1, b);
    return fst + snd;
  }
}

console.log(sum(0, 10000000));


================================================
FILE: examples/sum_rec/sum.js
================================================
var sum = 0;

for (var i = 0; i < 2**30; ++i) {
  sum += i;
}

console.log(sum);


================================================
FILE: examples/sum_tree/main.bend
================================================
gen = λd switch d {
  0: λx x
  _: λx ((gen d-1 (+ (* x 2) 1)), (gen d-1 (* x 2)))
}

sum = λd λt switch d {
  0: 1
  _: let (t.a,t.b) = t
    (+ (sum d-1 t.a) (sum d-1 t.b))
}

main = (sum 20 (gen 20 0))


================================================
FILE: examples/sum_tree/main.hvm
================================================
@gen = (?(((a a) @gen__C0) b) b)

@gen__C0 = ({a d} ({$([*2] $([+1] b)) $([*2] e)} (c f)))
  &! @gen ~ (a (b c))
  &! @gen ~ (d (e f))

@main = a
  & @sum ~ (20 (@main__C0 a))

@main__C0 = a
  & @gen ~ (20 (0 a))

@sum = (?(((* 1) @sum__C0) a) a)

@sum__C0 = ({a c} ((b d) f))
  &! @sum ~ (a (b $([+] $(e f))))
  &! @sum ~ (c (d e))


================================================
FILE: examples/tuples/tuples.bend
================================================
type Tup8:
  New { a, b, c, d, e, f, g, h }

rot = λx match x {
  Tup8/New: (Tup8/New x.b x.c x.d x.e x.f x.g x.h x.a)
}

app = λn switch n {
  0: λf λx x
  _: λf λx (app n-1 f (f x))
}

main = (app 1234 rot (Tup8/New 1 2 3 4 5 6 7 8))


================================================
FILE: examples/tuples/tuples.hvm
================================================
@Tup8/New = (a (b (c (d (e (f (g (h ((0 (a (b (c (d (e (f (g (h i))))))))) i)))))))))

@app = (?(((* (a a)) @app__C0) b) b)

@app__C0 = (a ({b (c d)} (c e)))
  & @app ~ (a (b (d e)))

@main = b
  & @app ~ (1234 (@rot (a b)))
  & @Tup8/New ~ (1 (2 (3 (4 (5 (6 (7 (8 a))))))))

@rot = ((@rot__C1 a) a)

@rot__C0 = (h (a (b (c (d (e (f (g i))))))))
  & @Tup8/New ~ (a (b (c (d (e (f (g (h i))))))))

@rot__C1 = (?((@rot__C0 *) a) a)


================================================
FILE: paper/HVM2.typst
================================================
#import "@preview/unequivocal-ams:0.1.0": ams-article, theorem, proof
#import "@preview/cetz:0.2.2": canvas, draw
#import "inet.typ"

#show link: underline
#set cite(form: "normal", style: "iso-690-author-date")


#show: ams-article.with(
  title: [HVM2: A Parallel Evaluator for Interaction Combinators],
  authors: ((name: "Victor Taelin", company: "Higher Order Company", email: "taelin@HigherOrderCO.com"),),
  abstract:[
    We present HVM2, an efficient, massively parallel evaluator for extended interaction combinators. When compiling non-sequential programs from a high-level programming language to C and CUDA, we achieved a near-ideal parallel speedup as a function of cores available (within a single device), scaling from 400 million interactions per second (MIPS) (Apple M3 Max; single thread), to 5,200 MIPS (Apple M3 Max; 16 threads), to 74,000 MIPS (NVIDIA RTX 4090; 32,768 threads). In this paper we describe HVM2's architecture, present snippets of the reference implementation in Rust, share early benchmarks and experimental results, and discuss current limitations and future plans.
  ],
  bibliography: bibliography("refs.bib", style: "annual-reviews"),
)

*This paper is a work in progress. See the #link("https://github.com/HigherOrderCO/HVM")[HVM repo] for the latest version.*

= Introduction

Interaction Nets (IN's) @Lafont_1990 and Interaction Combinators (IC's) @Lafont_1997 were introduced by Lafont as a minimal and concurrent model of computation. Lafont proved that IC's were not only Turing Complete, but that they also preserve the complexity class and degree of parallelism. Moreover, Lafont argued that while Turing Machines are a universal model of sequential computation, IC's are a universal model of #emph[distributed] computation. The locality and strong confluence of Lafont's IC's make them suitable for massively parallel computation. This heavily implied that IC's are an optimal model of computation, in a very fundamental sense. Yet, it remained to be seen if this system could be implemented efficiently in practice.

In this paper, we answer this question positively. By storing Interaction Combinator nodes in a memory-efficient format, we're able to implement its core operations (annihilation, commutation, and erasure) as lightweight C procedures and CUDA kernels. Furthermore, by representing wires as atomic variables, we're able to perform interactions atomically, in a lock-free fashion and with minimal synchronization. We also extend our system with global definitions (for fast function applications) and native numbers (for fast numeric operations). The result, HVM2, is an efficient, massively parallel evaluator for ICs that achieves near-ideal speedup, up to at least 16,384 concurrent cores, peaking at 74 billion interactions per second on an NVIDIA RTX 4090.

This level of performance makes it compelling to propose HVM2 as a general
framework for parallel computing. By translating constructs such as functions, algebraic data types, pattern matching, and recursion to HVM2, we see it as a potential compilation target for modern programming languages such as Python and Haskell. As a demonstration of this possibility, we also introduce #link("https://github.com/HigherOrderCO/bend")[Bend], a high-level programming language that compiles to HVM2. We explain how some of these translations work, and set up a general framework to translate arbitrary languages, procedural or functional, to HVM2.

#pagebreak()

= Similar Works

*Work In Progress*

#pagebreak()

= Syntax

HVM2's syntax consists of an Interaction Calculus system which textually represents an Interaction Combinator system @Calculus_1999. This textual system is capable of representing any arbitrary Interaction Net, and it is therefore possible to represent "vicious circles" @Lafont_1997. We only consider HVM2 programs which do not contain any vicious circles.

HVM2's syntax has seven different types of _agents_ (in Lafont's terms) or _Nodes_. Additionally, HVM2 also has _Variables_ to represent wires which connect ports across Nodes. _Trees_ are either Variables or Nodes. They are represented syntactically as:

#align(center)[
  ```
<Node> ::=
    | "*"                    -- (ERA)ser
    | "@" <alphanumeric>     -- (REF)erence
    | <Numeric>              -- (NUM)eric
    | "(" <Tree> <Tree> ")"  -- (CON)structor
    | "{" <Tree> <Tree> "}"  -- (DUP)licator
    | "$(" <Tree> <Tree> ")" -- (OPE)rator
    | "?(" <Tree> <Tree> ")" -- (SWI)tch

<Tree> ::=
    | <alphanumeric>         -- (VAR)iable
    | <Node>

<alphanumeric> ::= [a-zA-Z0-9_.-/]+
  ``` \
  _(For details on the `<Numeric>` syntax, see Numbers (@numbers))_
] \
  \

Notice that Nodes form a tree-like structure. Throughout this document we will often blur the distinction between Nodes and Trees, and make it explicit only where it matters. In Interactions (@interactions), it is critical to know whether a Variable is permissible when referring to arbitrary Nodes. In Memory Layout (@memory), however, we purposefully blur the line between Nodes and Variables, as the memory layout is greatly simplified by doing so.

The first three node types (`ERA`, `REF`, `NUM`) are nullary, and the last four types (`CON`, `DUP`, `OPE`, `SWI`) are binary. As implied above, `VAR` can be seen as an additional node type. As in Lafont's Interaction Nets, every node has an extra distinguished edge, called the main or principal port @Lafont_1997. Thus, nullary nodes have one port (one main and zero auxiliary), while binary nodes have three ports (one main and two auxiliary).

With the syntax above, the main port of the root node is "free", as it is not wired to another port. We can connect two main ports and form a reducible expression (redex), using the following syntax:

#align(center)[
  ```
  <Redex> ::= <Tree> "~" <Tree>
  ```
]

A Net consists of a root tree and a (possibly empty) list of `&`-separated redexes:

#align(center)[
  ```
  <Net> ::= <Tree> ("&" <Redex>)*
  ```
]

#pagebreak()

HVM2 Nets represent what are known as _configurations_ in the literature. A Net like
#align(center)[
  ```
  t1 & v1 ~ w1
     & ..
     & vn ~ wn
  ```
]

graphically represents a net like,

#pad(1em)[
  #figure(
    image("configuration.png", width: 50%),
    caption: [
      A configuration #footnote[This is a modified image
 of a configuration with multiple free main ports @Salikhmetov_2016.]
    ],
  )
]

where $omega$ is a wiring. Thus, HVM2 Nets contain only a single free main port. Wirings are possible between trees through pairs of `VAR` nodes with the same names. Note, however, that a variable can only occur twice. This aligns with Interaction Nets in that a wire can only connect two ports.
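
To make the grammar and the "exactly two occurrences" rule concrete, the following Python sketch parses a single tree into nested tuples and counts variable occurrences. It is not HVM2's parser: the names `tokenize`, `parse`, and `var_counts` are hypothetical, only the binary forms of the grammar are handled, and an all-digit token is assumed to be a `NUM` because the `<Numeric>` syntax is not covered here.

```python
import re
from collections import Counter

# Tokens: "$(" and "?(" openers, single-character delimiters, "*",
# and (optionally "@"-prefixed) alphanumeric names.
TOKEN = re.compile(r"\s*(\$\(|\?\(|[(){}*]|@?[A-Za-z0-9_./-]+)")
OPEN = {"(": ("CON", ")"), "{": ("DUP", "}"), "$(": ("OPE", ")"), "?(": ("SWI", ")")}


def tokenize(src):
    src, pos, out = src.strip(), 0, []
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        out.append(match.group(1))
        pos = match.end()
    return out


def parse(tokens, i=0):
    """Parse one <Tree> starting at tokens[i]; return (tree, next_index)."""
    tok = tokens[i]
    if tok in OPEN:
        tag, close = OPEN[tok]
        left, i = parse(tokens, i + 1)
        right, i = parse(tokens, i)
        if tokens[i] != close:
            raise SyntaxError(f"expected {close!r}, got {tokens[i]!r}")
        return (tag, left, right), i + 1
    if tok in (")", "}"):
        raise SyntaxError(f"unexpected {tok!r}")
    if tok == "*":
        return ("ERA",), i + 1
    if tok.startswith("@"):
        return ("REF", tok[1:]), i + 1
    if tok.isdigit():  # assumption: plain decimal literals only
        return ("NUM", int(tok)), i + 1
    return ("VAR", tok), i + 1


def var_counts(tree, counts=None):
    """Count how many times each variable name occurs in a tree."""
    counts = Counter() if counts is None else counts
    if tree[0] == "VAR":
        counts[tree[1]] += 1
    elif tree[0] in ("CON", "DUP", "OPE", "SWI"):
        var_counts(tree[1], counts)
        var_counts(tree[2], counts)
    return counts


tree, _ = parse(tokenize("({(a b) (b R)} (a R))"))
counts = var_counts(tree)
print(all(n == 2 for n in counts.values()))
```

For the closed tree used here, every variable is counted exactly twice, which is the linearity condition stated above.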

Lastly, a Book consists of a list of top-level definitions, or "named" Nets:

#align(center)[
  ```
  <Book> ::= ("@" <name> "=" <Net>)*
  ```
]

Each `.hvm` file contains a book, which is executed by HVM2. The entry point for HVM2 programs is the `@main` definition.

== An Example

The following definition:
#align(center)[
  ```
  @succ = ({(a b) (b R)} (a R))
  ```
]

represents the HVM2 encoding of the $lambda$-calculus term `λs λz (s (s z))`, and can be drawn as the following Interaction Combinator net:
/*
          :
         /_\
      ...: :...
      :       :
     /#\     /_\
   ..: :..   a R
  /_\   /_\  : :
  a b...b R..:.:
  :..........:
*/
#figure(caption: [An example $lambda$-calculus term.], canvas({
  import draw: *

  // inet.con(name: "a", pos: (0, 2), rot: -90deg)
  // inet.con(name: "b", pos: (2, 0), rot: -180deg)
  // inet.con(name: "c", pos: (0, -2), rot: 90deg)
  inet.con(name: "a", pos: (0, 0), rot: 90deg)
  inet.dup(name: "b", pos: (1.5, -1), rot: 90deg)
  inet.con(name: "c", pos: (1.5, 1), rot: 90deg)
  inet.con(name: "d", pos: (3, -2), rot: 90deg)
  inet.con(name: "e", pos: (3, 0), rot: 90deg)
  inet.link("a.0", "a.0")

  inet.link("a.1", "b.0")
  inet.link("a.2", "c.0")
  inet.link("b.1", "d.0")
  inet.link("b.2", "e.0")

  inet.link("d.2", "e.1")
  content((3.5, -1), [`b`])

  // "c.2" -> "e.2"
  inet.port("R", (3.45, 1 + 1/6), -90deg)
  inet.port("R'", (3.45, 1 + 1/6), 90deg)
  inet.link("e.2", "R")
  inet.link("c.2", "R'")
  content((3, 1.5), [`R`])

  // "c.1" -> "d.1"
  inet.port("A1", (4.1, -1), -180deg)
  inet.port("A2", (4.1, 0), 0deg)
  inet.port("A3", (3.45, 1 - 1/6), 90deg)
  inet.port("A3'", (3.45, 1 - 1/6), -90deg)
  inet.link("d.1", "A1")
  inet.link("A1", "A2")
  inet.link("A2", "A3'")
  inet.link("c.1", "A3")
  content((4.4, -0.5), [`a`])
}))

Notice how `CON`/`DUP` nodes in HVM2 correspond directly to constructor and duplicator nodes in Lafont's Interaction Combinators @Lafont_1997. Aux-to-main wires are implicit through the tree-like structure of the syntax, while aux-to-aux wires are explicit through variable nodes, which are always paired.

Additionally, main ports being implicit is critical to storing nodes efficiently in memory. Nodes can be represented as just two ports (the HVM2 memory model for wires) rather than three. In HVM2, since every port is 32 bits, this allows us to store a single node in a 64-bit word. This compact representation lets us use built-in atomic operations in various parts of the code, which was key to making parallel C and CUDA versions efficient. For details on the precise memory representation, see @architecture.
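
The packing described above can be sketched in a few lines of Python. This is an illustration only: the helper names `pack_node` and `unpack_node` are hypothetical, and placing the first port in the low 32 bits is an assumption rather than HVM2's documented layout.

```python
PORT_BITS = 32
PORT_MASK = (1 << PORT_BITS) - 1


def pack_node(port1: int, port2: int) -> int:
    """Pack two 32-bit ports into one 64-bit word (port1 in the low half)."""
    if not (0 <= port1 <= PORT_MASK and 0 <= port2 <= PORT_MASK):
        raise ValueError("each port must fit in 32 bits")
    return (port2 << PORT_BITS) | port1


def unpack_node(word: int) -> tuple:
    """Recover the two ports from a 64-bit word."""
    return word & PORT_MASK, word >> PORT_BITS


word = pack_node(0x0000_0007, 0xFFFF_FFFF)
print(hex(word), unpack_node(word))
```

Since the main port is never stored, both auxiliary ports of a node fit in one 64-bit word, which is the size that built-in atomic operations typically work on.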

== Interpretation of the Syntax

Semantically, `CON`, `DUP`, and `ERA` nodes correspond respectively to Lafont's #emph[constructor], #emph[duplicator], and #emph[eraser] symbols @Lafont_1997, and behave like Mazza's Symmetric Interaction Combinators @Mazza_2007. The `VAR` node represents a wiring in the graph, connecting two ports of a Net. They are linear and paired in the sense that (except for the free main port) every port is connected to exactly one other port, and therefore each variable occurs exactly twice.

`REF` nodes are an extension to Lafont's IC's, and they represent an immutable net that is expanded in a single interaction. While not essential for the expressivity of the system, `REF` nodes are essential for performance, as they enable fast global functions, a degree of laziness in a strict setup (critical to making GPU implementations viable), and allow us to represent tail recursion in constant space.

`NUM`, `OPE` and `SWI` nodes are also not essential expressivity-wise, but are important for performance. Modern processors are equipped with native machine integer operations. Emulating these operations with IC constructs analogous to Church or Scott Numerals would be very inefficient. Thus, these numeric nodes are necessary for HVM2 to be efficient in practice.

#pagebreak()

= Interactions <interactions>

The AST above specifies HVM2's data format. As a virtual machine, it also provides a mechanism to compute with that data. In traditional VMs, these are called *instructions*. In term rewriting systems, they are usually called *reductions*. In HVM2, the mechanism for computation is called *interactions*. There are ten of them. All interactions are listed below using Gentzen-style rules (redexes in the line above reduce to ones in the line below).

#v(1em)
$
  & "Arbitrary Nodes (including" #raw("VAR") ")" #h(2em) & #raw("A"), #raw("B"), #raw("C"), #raw("D") &:= #raw("<Tree>") \
  & "Binary Nodes" #footnote[In the interaction rules `()` and `{}` refer to arbitrary binary nodes, not just `CON` and `DUP`.] & #raw("()"), #raw("{}") &:= #raw("CON") | #raw("DUP") | #raw("OPE") | #raw("SWI") \
  & "Nullary Nodes" & circle.filled.small, circle.stroked.small &:= #raw("ERA") | #raw("REF") | #raw("NUM") \
  & "Numeric Nodes" & #raw("N"), #raw("M") &:= #raw("NUM") "(Numbers or Operations)"\
  & "Numeric Value Nodes (Not Operators)" & #raw("#n"), #raw("#m") &:= #raw("NUM") "where" #raw("n"), #raw("m") in bb(Q) \
  & "Erasure Nodes" & #raw("*") &:= #raw("ERA") \
  & "Variables" & #raw("x"), #raw("y"), #raw("z"), #raw("w") &:= #raw("VAR") \
$

#v(1em)
#v(1em)
#show math.equation: set text(15pt)
#grid(
  align: center,
  columns: (1fr, 1fr),
  gutter: 1pt,
  [
    #smallcaps[(#strong[link])]
    #math.frac(
      [
        `B` contains `x` \
        `x ~ A`
      ],
      [`B[x` $arrow.l$ `A]`]
    )
  ],
  [
    #smallcaps[(#strong[call])]
    #math.frac(
      [
      `A` is not a `VAR` node
      #v(1em) \
      `@foo ~ A`
    ],
    `expand(@foo) ~ A`)
  ],

)
#v(1em)
#grid(
  align: center,
  columns: (1fr, 1fr),
  gutter: 1pt,
  [
    #smallcaps[(#strong[void])]
    #math.frac([$circle.filled.small$ `~` $circle.stroked.small$], ``)
  ],
  [
    #smallcaps[(#strong[eras]e)]
    #math.frac([$circle.filled.small$ `~ (A B)`],
      [
        $circle.filled.small$ `~ A` \
        $circle.filled.small$ `~ B`
      ])
  ],
)
#v(1em)
#grid(
  align: center,
  columns: (1fr, 1fr),
  gutter: 1pt,
  [
    #smallcaps[(#strong[comm]ute)]
    #math.frac(`(A B) ~ {C D}`,
      ```
      {x y} ~ A
      {z w} ~ B
      (x z) ~ C
      (y w) ~ D
      ```
    )
  ],
  [
    #smallcaps[(#strong[anni]hilate)]
    #math.frac(
      `(A B) ~ (C D)`,
      ```
      A ~ C
      B ~ D
      ```
    )
  ],

)
#v(1em)
#grid(
  align: center,
  columns: (1fr, 1fr),
  gutter: 1pt,
[
    #smallcaps[(#strong[oper]ate 1)]
    #math.frac(`N ~ $(M A)`, `op(N, M) ~ A`)
  ],
  [
    #smallcaps[(#strong[swit]ch 1)]
    #math.frac(`#0 ~ ?(A B)`, `A ~ (B *)`)

  ],

)
#v(1em)
#grid(
  align: center,
  columns: (1fr, 1fr),
  gutter: 1pt,
  [
    #smallcaps[(#strong[oper]ate 2)]
    #math.frac(
      [
        `A` is not a `NUM` node
        #v(1em) \
        `N ~ $(A B)`
      ],
      `A ~ $(N B)`
    )
  ],
  [
    #smallcaps[(#strong[swit]ch 2)]
    #math.frac(`#n+1 ~ ?(A B)`, `A ~ (* (#n B))`)
  ],
)
#v(2em)
#show math.equation: set text(10pt)

Note that rules are #emph[symmetric]: if a rule applies to a redex `A ~ B` then it also applies to `B ~ A`. Explanations and implementation details of each of the rules follow.

== Link
"Links" two ports where at least one is a `VAR` Node. A *global substitution* is performed, replacing the single other occurrence of `x` with `A`. Recall that there is _exactly_ one other occurrence of `x` in the net. In the graph-rewriting system of Interaction Nets, linking two ports isn't technically an interaction, as wires are not named. However, for the term-rewriting calculus, wires are named and must be substituted for. When applying this rule, the redex `x ~ A` is removed, and the occurrence of `x` in node `B` is replaced with `A`. This is the only rule where nodes "far apart" can affect each other. See Substitution Map & Atomic Linker (@linker) for details on the `link` function.

== Call
Expands a `REF`, replacing it with its definition. The definition is essentially copied from the static Book to the global memory, allocating its nodes and creating fresh variables. This operation is key to enable fast function application, since, without it, one would need to use duplicator nodes for the same purpose, which brings considerable overhead. It also introduces some laziness in a strict evaluator, allowing for global recursive functions, and constant-space tail-calls.
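
A minimal Python sketch of the fresh-variable part of this expansion follows. It simplifies heavily: a definition is assumed to be a single tree of nested tuples (no attached redexes), nothing is allocated, and the names `expand` and `fresh` are hypothetical.

```python
import itertools


def expand(book, name, fresh):
    """Copy the definition `name`, renaming each variable to a fresh one."""
    renamed = {}

    def copy(tree):
        if tree[0] == "VAR":
            if tree[1] not in renamed:
                renamed[tree[1]] = f"v{next(fresh)}"
            return ("VAR", renamed[tree[1]])
        if tree[0] in ("CON", "DUP", "OPE", "SWI"):
            return (tree[0], copy(tree[1]), copy(tree[2]))
        return tree  # ERA, REF and NUM are nullary and copied as-is

    return copy(book[name])


book = {"id": ("CON", ("VAR", "x"), ("VAR", "x"))}
fresh = itertools.count()
first = expand(book, "id", fresh)
second = expand(book, "id", fresh)
print(first, second)
```

The two expansions share no variable names, so each copy of the definition is wired independently of the others, while the entry in the book itself is left untouched.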

== Void
Erases two nullary nodes connected to each other. The result is nothing: both nodes are consumed, fully cleared from memory. The `VOID` rule completes a garbage collection process.

== Erase
Erases a binary node `(A B)` connected to a nullary node, propagating the nullary node towards both nodes `A` and `B`. The rule performs a granular, parallel garbage collection of nets that go out of scope.

When the nullary node is a `NUM` or a `REF`, the #smallcaps[erase] rule actually behaves as a copy operation, cloning the `NUM` or `REF`, and connecting to both ports. #emph[However], when a copy operation is applied to a `REF` which contains `DUP` nodes, it instead is computed as a normal #smallcaps[call] operation. This allows us to perform fast copy of "function pointers", while still preserving Interaction Combinator semantics.
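
The basic shape of the two nullary rules can be sketched in Python as functions from a redex to the list of redexes that replace it. The nested-tuple encoding and the names `void` and `erase` are assumptions made for illustration, and the sketch deliberately leaves out the special case of a `REF` that contains `DUP` nodes.

```python
def void(left, right):
    """Two nullary nodes meet: both are consumed and nothing remains."""
    return []


def erase(nullary, binary):
    """A nullary node meets a binary node: propagate it to both children."""
    _tag, a, b = binary
    return [(nullary, a), (nullary, b)]


era = ("ERA",)
num = ("NUM", 7)
pair = ("CON", ("VAR", "a"), ("VAR", "b"))

print(erase(era, pair))  # the eraser is pushed into both sub-trees
print(erase(num, pair))  # the same rule acts as a copy for a NUM
```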

== Commute
Commutes two binary nodes of different types, essentially cloning them. The #smallcaps[commute] rule can be used to clone data and to perform loops and recursion, although these are preferably done via #smallcaps[call]s: Cloning large networks is faster through the #smallcaps[call] interaction, as it can be done in a single pass, as opposed to the incremental #smallcaps[commute] interactions that would have to propagate throughout the network.

== Annihilate
Annihilates two binary nodes of the same type connected to each other, replacing them with two redexes. The #smallcaps[annihilate] rule is the most essential computation rule, and is used to implement beta-reduction and pattern-matching.
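
The two rules for a pair of binary nodes can be sketched in Python as follows. This is an illustration under assumptions: trees are nested tuples such as `("CON", A, B)`, a redex is a pair of trees, and the names `annihilate`, `commute`, `interact`, and `fresh` are hypothetical.

```python
import itertools


def annihilate(left, right):
    """(A B) ~ (C D)  =>  A ~ C, B ~ D."""
    (_, a, b), (_, c, d) = left, right
    return [(a, c), (b, d)]


def commute(left, right, fresh):
    """(A B) ~ {C D}  =>  {x y} ~ A, {z w} ~ B, (x z) ~ C, (y w) ~ D."""
    tag_l, a, b = left
    tag_r, c, d = right
    x, y, z, w = (("VAR", f"v{next(fresh)}") for _ in range(4))
    return [
        ((tag_r, x, y), a),
        ((tag_r, z, w), b),
        ((tag_l, x, z), c),
        ((tag_l, y, w), d),
    ]


def interact(left, right, fresh):
    """Pick the rule from the node types: same type annihilates, else commutes."""
    if left[0] == right[0]:
        return annihilate(left, right)
    return commute(left, right, fresh)


A, B, C, D = (("VAR", n) for n in "ABCD")
fresh = itertools.count()
print(interact(("CON", A, B), ("CON", C, D), fresh))
print(interact(("CON", A, B), ("DUP", C, D), fresh))
```

Annihilation only rewires existing ports, whereas commutation creates four new nodes and four fresh variables, which is why cloning large nets this way is incremental.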

== Operate
Performs a numeric operation between two `NUM` nodes `N` and `M` connected by an `OPE` node. Note that `N` and `M` should not both be numeric values for this interaction to perform a sensible numeric operation. Dispatching to different native numeric operations depends on the `N` and `M` nodes themselves. See Numbers (@numbers) for details.

Note that when counting the number of interactions, #smallcaps[operate 2] is not counted, as this would cause the number of interactions to be non-deterministic.

== Switch
Performs a switch on a `NUM` node `#n` connected to a `SWI` node, treating it like a `Nat ::= Zero | (Succ pred)`. Here, `A` is expected to be a pair holding both cases, encoded as a `CON` node `(zero succ)`, and `B` is the return port. If `n` is 0, we return the `zero` case, and erase the `succ` case. Otherwise, we return the `succ` case applied to `n-1`, and erase the `zero` case.
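
The two switch rules can be read as a function from a redex to its replacement. The Python sketch below assumes the nested-tuple encoding `("NUM", n)`, `("SWI", A, B)`, `("CON", l, r)` and `("ERA",)`, which is an illustration and not HVM2's memory representation; the name `switch` is hypothetical.

```python
ERA = ("ERA",)


def switch(num, swi):
    """Apply the switch rules to the redex `num ~ swi`."""
    _, n = num
    _, a, b = swi
    if n == 0:
        # #0 ~ ?(A B)    =>  A ~ (B *)
        return [(a, ("CON", b, ERA))]
    # #n+1 ~ ?(A B)  =>  A ~ (* (#n B))
    return [(a, ("CON", ERA, ("CON", ("NUM", n - 1), b)))]


A, B = ("VAR", "A"), ("VAR", "B")
print(switch(("NUM", 0), ("SWI", A, B)))
print(switch(("NUM", 3), ("SWI", A, B)))
```

In both cases the result is a single redex between `A` and a `CON` node; annihilating it against a `CON` node in `A` then selects one branch and erases the other.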

#pagebreak()

== Interaction Table

Since there are eight node types, there is a total of 64 possible pairwise node interactions. The interaction rule for an active pair is _uniquely_ determined by the types of the two nodes in the pair. The table below shows which interaction rule is triggered for each possible pair of nodes that form a redex. Since the interaction rules are symmetric, this table is symmetric across the main diagonal. #highlight[`CON`-`SWI` being #smallcaps[comm] is problematic; should be removed or the reduction for #smallcaps[swit] should change].
#v(1em)

#align(center)[
  ```
  | A\B |  VAR |  REF |  ERA |  NUM |  CON |  DUP |  OPE |  SWI |
  |-----|------|------|------|------|------|------|------|------|
  | VAR | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK |
  | REF | LINK | VOID | VOID | VOID | CALL | ERAS | CALL | CALL |
  | ERA | LINK | VOID | VOID | VOID | ERAS | ERAS | ERAS | ERAS |
  | NUM | LINK | VOID | VOID | VOID | ERAS | ERAS | OPER | SWIT |
  | CON | LINK | CALL | ERAS | ERAS | ANNI | COMM | COMM | COMM |
  | DUP | LINK | ERAS | ERAS | ERAS | COMM | ANNI | COMM | COMM |
  | OPE | LINK | CALL | ERAS | OPER | COMM | COMM | ANNI | COMM |
  | SWI | LINK | CALL | ERAS | SWIT | COMM | COMM | COMM | ANNI |
  ```
]

\

Because for each active pair exactly one rule applies, HVM2 retains the same _strong confluence_ that Lafont's Interaction Combinators do @Lafont_1997. This implies that not only can HVM2 programs be reduced completely in parallel, but also that the number of reductions is invariant to the order in which interaction rules are applied. This ensures that HVM2 can reduce redexes in any order without any risk of complexity blowups.
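
The claim that the rule is determined by the two node types, and that the table is symmetric, can be checked mechanically. The Python sketch below transcribes the table into a dictionary, writing the operator node as `OPE` as in the grammar; the names `TABLE`, `rule_for`, and `is_symmetric` are hypothetical.

```python
TYPES = ["VAR", "REF", "ERA", "NUM", "CON", "DUP", "OPE", "SWI"]

# One row per left-hand node type, in the same order as TYPES.
ROWS = """
LINK LINK LINK LINK LINK LINK LINK LINK
LINK VOID VOID VOID CALL ERAS CALL CALL
LINK VOID VOID VOID ERAS ERAS ERAS ERAS
LINK VOID VOID VOID ERAS ERAS OPER SWIT
LINK CALL ERAS ERAS ANNI COMM COMM COMM
LINK ERAS ERAS ERAS COMM ANNI COMM COMM
LINK CALL ERAS OPER COMM COMM ANNI COMM
LINK CALL ERAS SWIT COMM COMM COMM ANNI
""".split()

TABLE = {
    (a, b): ROWS[i * len(TYPES) + j]
    for i, a in enumerate(TYPES)
    for j, b in enumerate(TYPES)
}


def rule_for(a: str, b: str) -> str:
    """Look up the interaction rule for a redex between node types a and b."""
    return TABLE[(a, b)]


def is_symmetric() -> bool:
    """Check that swapping the two sides of a redex never changes the rule."""
    return all(TABLE[(a, b)] == TABLE[(b, a)] for a in TYPES for b in TYPES)


print(len(TABLE), is_symmetric(), rule_for("CON", "DUP"))
```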

#pagebreak()

= Substitution Map & Atomic Linker <linker>

While HVM2 retains the strong confluence property of Lafont's IC's, _locality_ is more difficult to obtain. This is due to HVM2's variables. Variables link two different parts of the program, and thus can cause interference when two threads attempt to reduce two redexes in parallel. For example, consider a subset of a Net:

#v(1em)
#align(center)[
  ```
  & (a b) ~ (d c)
  & (c d) ~ (f e)
  ```
]
#v(1em)

Two threads attempting to reduce this Net can be represented as follows:

#v(1em)
#align(center)[
  ```
       Thread_0     Thread_1
  --a--|\____/|--c--|\____/|--e--
  --b--|/    \|--d--|/    \|--f--
  ```
]
#v(1em)

Notice that in the reduction of these two redexes, both `Thread_0` and `Thread_1` will need to access variables `c` and `d`. This requires synchronization.

In HVM2, there is a global collection of redexes that is mutated in parallel by a variety of threads. See Architecture (@architecture) for details. Since every variable occurs exactly twice, the #smallcaps[link] interaction with a variable `x` will also occur twice, but _possibly at very different times_. The first time is when this variable is first encountered, and somehow a substitution must be "deferred" until the #smallcaps[link] interaction rule is applied to the second occurrence of `x`.

This is accomplished by a global, atomic substitution map, which tracks these deferred substitutions. When a variable is linked to a node, or to another variable, it is inserted into the substitution map. When that same variable is linked again, it will already have an entry in the substitution map, and then the proper redex will be constructed.

The substitution map can be represented efficiently with a flat buffer, where the index is the variable name, and the value is the node that has been substituted. This can be done atomically, via a simple lock-free linker. In pseudocode, this roughly looks like:

#pagebreak()

#align(center)[
```python
# Attempts to link A and B.
def link(subst: Dict[str, Node], A: Node, B: Node):
    while True:
        # If A is not a VAR: swap A and B.
        if type(A) != VAR:
            A, B = B, A

        # If A is still not a VAR: both are non-vars.
        # Create a new redex and stop.
        if type(A) != VAR:
            push_redex(A, B)
            break

        # Here, A is a VAR. Create an `A: B` entry in the map,
        # atomically retrieving the previous entry (if any).
        got: Node = subst.set_atomic(A, B)

        # If there was no `A` entry, stop.
        if got is None:
            break

        # Otherwise, delete `A` and link `got` to `B`.
        del subst[A]
        A = got
```
]
#v(1em)

To see how this algorithm works, let's consider, again, the scenario above:

#v(1em)
#align(center)[
  ```
       Thread_0     Thread_1
  --a--|\____/|--c--|\____/|--e--
  --b--|/    \|--d--|/    \|--f--
  ```
]
#v(1em)

Assume we start with a substitution `a` $arrow.l$ `#42`, and let both threads reduce a redex in parallel. Both interactions are `ANNI` rules, whose effect is to link the corresponding ports; thus, the resulting wire connected to `#42` (with the wires `a`, `d`, and `e` labelled for clarity) should be

#v(1em)
#align(center)[
  ```
  #42 ---a---.         .---e---
              '---d---'
  ```
]
#v(1em)

That is, `e` must be directly linked to `#42`. Let's now evaluate the algorithm in an arbitrary order, step-by-step. Recall that the initial Net is:

#v(1em)
#align(center)[
  ```
  & (a b) ~ (d c)
  & (c d) ~ (f e)
  ```
]
#v(1em)

And for simplicity we're observing only ports `a`, `d`, and `e`. `Thread_0` will attempt to perform `link(a, d)` and `Thread_1` will attempt `link(d, e)`. There are many possible orders of execution:

#pagebreak()

== Possible Execution Order 1

#pad(left: 2em)[
  ```
  - a: #42
  ======= Thread_1: link(d, e)
  - a: #42
  - d: e
  ======= Thread_0: link(a, d)
  - a: d
  - d: e
  ======= Thread_0: got `a: #42`, thus, delete `a` and link(d, #42)
  - d: #42
  ======= Thread_0: got `d: e`, thus, delete `d` and link(e, #42)
  - e: #42
  ```
]

The resulting substitution map links `e` to `#42`, as required.

== Possible Execution Order 2

#pad(left: 2em)[
  ```
  - a: #42
  ======= Thread_0: link(d, a)
  - a: #42
  - d: a
  ======= Thread_1: link(d, e)
  - a: #42
  - d: e
  ======= Thread_1: got `d: a`, thus, delete `d` and link(a, e)
  - a: e
  ======= Thread_1: got `a: #42`, thus, delete `a` and link(e, #42)
  - e: #42
  ```
]

The resulting substitution map, again, links `e` to `#42`, as required.

== Possible Execution Order 3

#pad(left: 2em)[
  ```
  - a: #42
  ======= Thread_0: link(d, a)
  - a: #42
  - d: a
  ======= Thread_1: link(e, d)
  - a: #42
  - d: a
  - e: d
  ```
]

In this case, the result isn't directly linking `e` to `#42`. But it does link `e` to `d`, which links to `a`, which links to `#42`. Thus, `e` is, indirectly, linked to `#42`. While it does temporarily use more memory in this case, it is, semantically, the same result. Additionally, the indirect links will be cleared as soon as `e` is linked to something else. It is easy enough to see that this holds for all possible evaluation orders.
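
The claim above can also be checked mechanically. The sketch below is a sequential Python model of the linker, not the reference implementation: the tuple encoding of ports and the `run` and `resolve` helpers are assumptions made for illustration. Since it is single-threaded, it only covers orders in which each `link` call runs to completion before the next one starts.

```python
VAR, NUM = "VAR", "NUM"

def var(name):
    return (VAR, name)

def num(value):
    return (NUM, value)

def link(subst, redexes, a, b):
    """Sequential model of the linker: `subst` maps variable names to ports."""
    while True:
        # Make sure A is the variable, if either side is one.
        if a[0] != VAR:
            a, b = b, a
        # Both sides are non-variables: record a new redex and stop.
        if a[0] != VAR:
            redexes.append((a, b))
            return
        # A is a variable: store `A: B`, remembering any previous entry.
        got = subst.get(a[1])
        subst[a[1]] = b
        if got is None:
            return
        # A was already substituted: delete it and link `got` to B.
        del subst[a[1]]
        a = got

def resolve(subst, port):
    """Follows variable entries until reaching an unsubstituted port."""
    while port[0] == VAR and port[1] in subst:
        port = subst[port[1]]
    return port

def run(order):
    """Starts from `a: #42`, performs the given links, then resolves `e`."""
    subst, redexes = {"a": num(42)}, []
    for x, y in order:
        link(subst, redexes, var(x), var(y))
    return resolve(subst, var("e"))
```

Calling `run` with the two links in either order, and with either argument order, resolves `e` to `#42`, sometimes directly and sometimes through a chain of variable entries as in the third execution order.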

#pagebreak()

= Numbers <numbers>
HVM2 has built-in support for 32-bit numerics and operations, represented by the `NUM` node type. `NUM` nodes in HVM2 have a 5-bit tag and a 24-bit payload. Depending on the tag, numbers can represent unsigned integers (U24), signed integers (I24), IEEE 754 binary32 floats (F24), or (possibly partially applied) operators. These choices mean any numeric node can be represented in 29 bits, which can be unboxed in a 32-bit port with a 3-bit tag for the node type.

== Syntax

Numeric nodes can either be a number or a (possibly partially applied) operator. Syntactically,

#align(center)[
  ```
  <Numeric> ::=
      | <Number>
      | <Operator>

  <Number> ::=
      | <Nat>
      | <Int>
      | <Float>

  <Operator> ::=
      | "[" <Operation> "]"           -- (unapplied)
      | "[" <Operation> <Number> "]"  -- (partially applied)

  <Operation> ::=
      | "+"   -- (ADD)
      | "-"   -- (SUB)
      | "*"   -- (MUL)
      | "/"   -- (DIV)
      | "%"   -- (REM)
      | "="   -- (EQ)
      | "!"   -- (NEQ)
      | "<"   -- (LT)
      | ">"   -- (GT)
      | "&"   -- (AND)
      | "|"   -- (OR)
      | "^"   -- (XOR)
      | ">>"  -- (SHR)
      | "<<"  -- (SHL)
      | ":-"  -- (FP_SUB)
      | ":/"  -- (FP_DIV)
      | ":%"  -- (FP_REM)
      | ":>>" -- (FP_SHR)
      | ":<<" -- (FP_SHL)
  ```
]

where `<Int>` is disambiguated from `<Nat>` by requiring a sign prefix `+`/`-`, and flipped versions of non-commutative operators (`FP_*`) are provided for convenience.

#pagebreak()

== Numeric Node Memory Layout

In order to understand how numeric operations are derived from the numeric nodes, we must look at the memory layout of the different numeric values and operations. As stated above, numeric nodes fit into 29 bits, so they can be inlined into a 32-bit port. Numeric nodes have the following layout:

#align(center)[
  ```
  VVVVVVVVVVVVVVVVVVVVVVVVTTTTT
  ```
]

Where the 5-bit tag `T` has 23 possible values: one of the three number types (`U24`, `I24`, `F24`), one of the 19 operations listed in the syntax above, or a special value `SYM`. The 24-bit value `V` has a few possible interpretations:
- If $#raw("T") in {#raw("U24"), #raw("I24"), #raw("F24")}$, then `V` is interpreted as a number with the type `T`. For example: `123`, `-123`, `1.0`.
- If $#raw("T") in #raw("<Operation>")$, then `V` is an untyped number. This is a partially applied operation. The type of the second argument (when applied) will dictate the interpretation of `V`. For example: `[/123]`, `[*-123]`, `[%1.0]`.
- If $#raw("T") = #raw("SYM")$, then $#raw("V") in #raw("<Operation>")$. This is an unapplied operator, like `[+]`.
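
As a rough illustration of this layout, the following Python sketch packs a payload and a tag into 29 bits and reads them back. The tag numbering is hypothetical and chosen only for this example, and the helper names are not taken from the reference implementation.

```python
# Hypothetical tag numbering, chosen only for this example.
SYM, U24, I24, F24 = 0, 1, 2, 3

def pack(tag, payload):
    """Builds a 29-bit numeric: a 24-bit payload above a 5-bit tag."""
    assert 0 <= tag < 32
    return ((payload & 0xFFFFFF) << 5) | tag

def tag_of(num):
    """Extracts the low 5 bits."""
    return num & 0x1F

def payload_of(num):
    """Extracts the 24 bits above the tag."""
    return (num >> 5) & 0xFFFFFF

def i24_decode(payload):
    """Reads a 24-bit payload as a two's complement integer."""
    return payload - (1 << 24) if payload & 0x800000 else payload

# The I24 number -123, packed and then read back.
n = pack(I24, -123)
value = i24_decode(payload_of(n))
```

Masking the payload to 24 bits is what turns a negative Python integer into its two's complement form, so the same `pack` helper serves both `U24` and `I24`.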

== U24 - Unsigned 24-bit Integer

U24 numbers represent unsigned integers from 0 to 16,777,215 ($2^24 - 1$).

The 24-bit payload directly encodes the integer value. For example:

#align(center)[
  ```
  0000 0000 0000 0000 0000 0001 = 1
  0000 0000 0000 0000 0000 0010 = 2
  1111 1111 1111 1111 1111 1111 = 16,777,215
  ```
]

== I24 - Signed 24-bit Integer

I24 numbers represent signed integers from -8,388,608 to 8,388,607.

The 24-bit payload uses two's complement encoding. For example:

#align(center)[
  ```
  0000 0000 0000 0000 0000 0000 = 0
  0000 0000 0000 0000 0000 0001 = 1
  0111 1111 1111 1111 1111 1111 = 8,388,607
  1000 0000 0000 0000 0000 0000 = -8,388,608
  1111 1111 1111 1111 1111 1111 = -1
  ```
]

== F24 - 24-bit IEEE 754 binary32 Float

F24 numbers represent a subset of IEEE 754 binary32 floating point numbers; it supports approximately the same range, but with less precision. The 24-bit payload is laid out as follows:

#align(center)[
  ```
  SEEE EEEE EMMM MMMM MMMM MMMM
  ```
]

Where:
- S is the sign bit (1 = negative, 0 = positive)
- E is the 8-bit exponent, with a bias of 127 (range of −126 to +127)
- M is the 15-bit significand (mantissa)

The value is calculated as:
- If E = 0 and M = 0, the value is signed zero
- If E = 0 and M $eq.not$ 0, the value is a subnormal number: \
  $(-1)^S times 2^(-126) times (0.M "in base 2")$
- If 0 < E < 255, the value is a normal number: \
  $(-1)^S times 2^(E-127) times (1.M "in base 2")$
- If E = 255 and M = 0, the value is signed infinity
- If E = 255 and M $eq.not$ 0, the value is NaN (Not-a-Number)

F24 supports a range of approximately $plus.minus 3.4 times 10^38$. The smallest positive normal number is $2^(-126) approx 1.2 times 10^(-38)$, while the smallest subnormal numbers go down to $2^(-141) approx 3.6 times 10^(-43)$.
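
The limits above can be reproduced with a small Python sketch. It assumes that an F24 payload is simply the top 24 bits of a binary32 value, obtained by truncation; the actual runtime may round instead, and the function names here are illustrative.

```python
import struct

def f24_encode(x):
    """Keeps the sign, the exponent and the top 15 mantissa bits of a binary32."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 8

def f24_decode(payload):
    """Widens a 24-bit payload back to binary32 by zero-filling the low 8 bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (payload & 0xFFFFFF) << 8))
    return x

one = f24_encode(1.0)                   # sign 0, exponent 127, mantissa 0
smallest_subnormal = f24_decode(0x000001)
smallest_normal = f24_decode(0x008000)
largest_finite = f24_decode(0x7F7FFF)
```

Under this assumption, the payload `0x000001` decodes to $2^(-141)$ and `0x008000` to $2^(-126)$, matching the figures quoted above.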

== Numeric Operations

When two `NUM` nodes are connected by an `OPR` node, as shown in Interaction Rules (@interactions), a numeric operation is performed using the `op` function. The operation to be performed depends on the tags of each numeric node.

Some operations `op(N, M)` are invalid, and simply return `0`:
- If both numeric tags are types.
- If both numeric tags are operations.
- If both numeric tags are `SYM`.

Otherwise:
- If one of the tags is `SYM`, the output has the tag represented by the `SYM` numeric node and the payload of the other operand. For example,

#align(center)[
  ```
  op([+], 10) = [+10]
  op(-1, [*]) = [*0xffffff]
  ```
]

- If one of the tags is an operation, and the other is a type, a native
  operation is performed, according to the following table:

#align(center)[
  ```
  |   | U24 | I24 | F24   |
  |---|-----|-----|-------|
  |ADD| +   | +   | +     |
  |SUB| -   | -   | -     |
  |MUL| *   | *   | *     |
  |DIV| /   | /   | /     |
  |REM| %   | %   | %     |
  |EQ | ==  | ==  | ==    |
  |NEQ| !=  | !=  | !=    |
  |LT | <   | <   | <     |
  |GT | >   | >   | >     |
  |AND| &   | &   | atan2 |
  |OR | |   | |   | log   |
  |XOR| ^   | ^   | pow   |
  |SHR| >>  |     |       |
  |SHL| <<  |     |       |
  ```
]
Where empty cells are intentionally left unspecified.

The resulting number type is the same as the input type, except for comparison operators (`EQ`, `NEQ`, `LT`, `GT`) which always return `U24` `0` or `1`.

#pagebreak()

The number tagged with the operation is the left operand of the native operation, and the number tagged with the type is the right operand. For example,

#align(center)[
  ```
  op([/1.2], 3.0) = op(3.0, [/1.2]) = 1.2 / 3.0
  ```
]

Note that this means that the number type used in an operation is always determined by the right operand; if the left operand is of a different type, its bits will be reinterpreted.

Finally, flipped operations (such as `FP_SUB`) interpret their operands in the opposite order (e.g. `SUB` represents `a - b` whereas `FP_SUB` represents `b - a`). This allows representing e.g. both `1 - x` and `x - 1` with partially-applied operations (`[-1]` and `[:-1]` respectively).

#align(center)[
```
op([-2], +1) = +1
op([:-2], +1) = -1
```
]

Note that `op` is a symmetric function (since the order of the operands is determined by their tags). That is, `op(N, M) = op(M, N)`. This makes the "swap" interaction in #smallcaps[operate 2] valid.
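
To make the dispatch concrete, here is a minimal Python sketch of `op` restricted to integer types and four operations. Numerics are modelled as `(tag, payload)` tuples with string tags, which is an assumption of this example rather than the runtime's encoding, and returning a `U24` zero for invalid combinations is likewise an assumption.

```python
SYM, U24, I24 = "SYM", "U24", "I24"

# Native operations; the first argument is the operand carried by the operator.
OPS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "MUL": lambda a, b: a * b,
    "FP_SUB": lambda a, b: b - a,   # flipped subtraction
}

def kind(tag):
    """Orders tags as SYM (0), operation (1), type (2)."""
    return 0 if tag == SYM else 1 if tag in OPS else 2

def decode(ty, payload):
    """Reinterprets a 24-bit payload according to the type tag."""
    return payload - (1 << 24) if ty == I24 and payload & 0x800000 else payload

def op(n, m):
    # Sorting by tag kind makes the result independent of argument order.
    (nt, nv), (mt, mv) = sorted((n, m), key=lambda x: kind(x[0]))
    if kind(nt) == kind(mt):
        return (U24, 0)               # invalid combination
    if nt == SYM:
        return (nv, mv)               # e.g. [+] and 10 give [+10]
    left, right = decode(mt, nv), decode(mt, mv)
    return (mt, OPS[nt](left, right) & 0xFFFFFF)
```

For instance, `op(("SUB", 2), (I24, 1))` yields the `I24` payload `1`, while `op(("FP_SUB", 2), (I24, 1))` yields `0xFFFFFF`, the two's complement encoding of `-1`. Swapping the arguments gives the same results.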

#pagebreak()

= The 32-bit Architecture <architecture>

The initial version of HVM2, as presented in this paper, is based on a 32-bit architecture. In this section, we'll use snippets of the Rust reference implementation of the interpreter, available at the #link("https://github.com/HigherOrderCO/hvm2")[HVM2 repo], to document this architecture.

== Memory Layout <memory>

Ports are 32-bit values that represent a wire connected to a main port. The low 3 bits are reserved to identify the type of the node (`VAR`, `REF`, `ERA`, etc.) whose main port the wire is connected to. This is a port's _tag_. The upper 29 bits hold a port's _value_. The interpretation of the value depends on the tag. The value is either an address (for binary `CON`, `DUP`, `OPR`, and `SWI` nodes), a virtual function address (for `REF` nodes), an unboxed 29-bit number (for `NUM` nodes), a variable name (for `VAR` nodes), or `0` (for `ERA` nodes). Binary nodes are represented in memory as a pair of two ports. Notice that _ports store the kind of node they are connecting to_; nodes don't store their own type.

#align(center)[
```rust
pub type Tag = u8;  // 3 bits  (rounded up to u8)
pub type Val = u32; // 29 bits (rounded up to u32)

pub struct Port(pub u32); //  Tag + Val  (32 bits)
pub struct Pair(pub u64); // Port + Port (64 bits)
```
]
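
The bit manipulation implied by these types can be sketched in Python as follows. The tag numbering and the choice of placing the first port in the low half of a pair are assumptions of this example; the helper names mirror accessors used later in this section, but the bodies are illustrative.

```python
# Assumed tag numbering; only the 3-bit width comes from the text above.
VAR, REF, ERA, NUM, CON, DUP, OPR, SWI = range(8)

def port(tag, val):
    """Packs a 3-bit tag and a 29-bit value into one 32-bit port."""
    assert 0 <= tag < 8 and 0 <= val < (1 << 29)
    return (val << 3) | tag

def get_tag(p):
    return p & 0b111

def get_val(p):
    return p >> 3

def pair(fst, snd):
    """Packs two 32-bit ports into one 64-bit pair (first port in the low half)."""
    return (snd << 32) | fst

def get_fst(pr):
    return pr & 0xFFFFFFFF

def get_snd(pr):
    return pr >> 32

# A binary node whose two auxiliary ports are the variables 1 and 0.
node = pair(port(VAR, 1), port(VAR, 0))
```

Because a whole node fits in 64 bits, it can be read or written with a single atomic operation on common hardware.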

The Global Net structure includes three shared memory buffers: `node`, a node buffer where nodes are allocated; `vars`, a buffer representing the current substitution map (with the 29-bit key being the variable name, and the value being the current substitution); and `rbag`, a collection of active redexes. `APair` and `APort` are atomic variants of `Pair` and `Port`, respectively:

#align(center)[
```rust
pub struct GNet<'a> {
  pub node: &'a mut [APair],
  pub vars: &'a mut [APort],
  pub rbag: &'a mut [APair],
}
```
]

Since this 32-bit architecture has 29-bit values, that means we can address a total of $2^29$ nodes and variables, making the `node` buffer at most `4 GB` long, and the `vars` buffer at most `2 GB` long. A 64-bit architecture would increase this limit to match the overwhelming majority of use cases, and will be incorporated in a future revision of the HVM2 runtime.
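
The buffer limits follow directly from the sizes of `Pair` and `Port`. The short Python calculation below spells out the arithmetic; the constant names are illustrative.

```python
VAL_BITS = 29     # bits available for an address or variable name
PAIR_BYTES = 8    # one node is a 64-bit pair
PORT_BYTES = 4    # one substitution entry is a 32-bit port

max_entries = 1 << VAL_BITS              # 536,870,912 addressable slots
node_bytes = max_entries * PAIR_BYTES    # size limit of the `node` buffer
vars_bytes = max_entries * PORT_BYTES    # size limit of the `vars` buffer

GIB = 1 << 30
print(node_bytes // GIB, "GiB of nodes,", vars_bytes // GIB, "GiB of vars")
```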

Since top-level definitions are just static nets, they are stored in a similar structure, with the key difference that they include an explicit `root` port, which is used to connect the expanded definition to its target port in the graph. We also include a `safe` flag, which indicates whether this definition is free of duplicator nodes. This affects the `DUP-REF` interaction, which will just copy the `REF` port, rather than expanding the definition, when it is `safe`.

#align(center)[
```rust
pub struct Def {
  pub safe: bool,      // has no dups
  pub root: Port,      // root port
  pub rbag: Vec<Pair>, // def redex bag
  pub node: Vec<Pair>, // def node buffer
}
```
]

Finally, the global Book is just a map of names to `Def`s:

#align(center)[
```rust
pub struct Book {
  pub defs: Vec<Def>,
}
```
]

Notice that, like variables, names are simply indexes into the global `Book`.

This concludes HVM2's memory layout. For more details, check the reference Rust implementation in the #link("https://github.com/HigherOrderCO/hvm2")[HVM2 repo].

== Example Interaction Net Layout

Consider, again, the following net:

#align(center)[
```
(a b)
& (b a) ~ (x (y *))
& {y x} ~ @foo
```
]

In HVM2's memory, it would be represented as:

#align(center)[
```
RBAG | FST-TREE | SND-TREE
---- | -------- | --------
0800 | CON 0002 | CON 0003 // '& (b a) ~ (x (y *))'
1800 | DUP 0005 | REF 0000 // '& {y x} ~ @foo'
---- | -------- | --------
NODE | PORT-1   | PORT-2
---- | -------- | --------
0001 | VAR 0000 | VAR 0001 // '(a b)' node (root)
0002 | VAR 0001 | VAR 0000 // '(b a)' node
0003 | VAR 0002 | CON 0004 // '(x (y *))' node
0004 | VAR 0003 | ERA 0000 // '(y *)' node
0005 | VAR 0003 | VAR 0002 // '{y x}' node
---- | -------- | --------
VARS | VALUE    |
---- | -------- |
FFFF | CON 0001 | // points to root node
```
]

Note that the `VARS` buffer has only one entry, because there are no substitutions, but we always use the last variable to represent the root port, serving as an entry point to the graph.

== Example Interaction

Interactions can be implemented in five steps:
1. Allocate the needed resources.

2. Load nodes from global memory to registers.

3. Initialize fresh variables on the substitution map.

4. Store fresh nodes on the node buffer.

5. Atomically link outgoing wires.

For example, the #smallcaps[commute] interaction is implemented in Rust as:

#align(center)[
```rust
pub fn interact_comm(&mut self, net: &GNet, a: Port, b: Port) -> bool {
  // Allocates needed resources.
  if !self.get_resources(net, 4, 4, 4) {
    return false;
  }

  // Loads nodes from global memory.
  let a_ = net.node_take(a.get_val() as usize);
  let a1 = a_.get_fst();
  let a2 = a_.get_snd();
  let b_ = net.node_take(b.get_val() as usize);
  let b1 = b_.get_fst();
  let b2 = b_.get_snd();

  // Stores new vars.
  net.vars_create(self.v0, NONE);
  net.vars_create(self.v1, NONE);
  net.vars_create(self.v2, NONE);
  net.vars_create(self.v3, NONE);

  // Stores new nodes.
  net.node_create(self.n0, pair(port(VAR, self.v0), port(VAR, self.v1)));
  net.node_create(self.n1, pair(port(VAR, self.v2), port(VAR, self.v3)));
  net.node_create(self.n2, pair(port(VAR, self.v0), port(VAR, self.v2)));
  net.node_create(self.n3, pair(port(VAR, self.v1), port(VAR, self.v3)));

  // Links.
  self.link_pair(net, pair(port(b.get_tag(), self.n0), a1));
  self.link_pair(net, pair(port(b.get_tag(), self.n1), a2));
  self.link_pair(net, pair(port(a.get_tag(), self.n2), b1));
  self.link_pair(net, pair(port(a.get_tag(), self.n3), b2));

  return true;
}
```
]
#v(1em)

Note that, other than the linking, all operations here are local. Taking nodes
from global memory is safe, because the thread that holds a redex implicitly
owns both trees it contains, and storing vars and nodes is safe, because these
spaces have been allocated by the thread. A fast concurrent allocator for small
values is assumed. In HVM2, we just use a simple linear bump allocator, which is
fast and fragmentation-free in a context where all allocations are at most
64-bit values (the size of a single node).
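
A linear bump allocator of this kind can be sketched in a few lines of Python. This is an illustrative, single-threaded model under the assumption that a free slot is marked by `None`; the class and method names are hypothetical and do not come from the reference implementation.

```python
class BumpAllocator:
    """Hands out indices of free slots by scanning forward from a cursor."""

    def __init__(self, buffer):
        self.buffer = buffer   # shared slot array; None marks a free slot
        self.next = 0          # position where the next scan starts

    def alloc(self, count):
        """Returns `count` free slot indices, or None if too few are free."""
        found, scanned = [], 0
        while len(found) < count and scanned < len(self.buffer):
            idx = self.next
            self.next = (self.next + 1) % len(self.buffer)
            scanned += 1
            if self.buffer[idx] is None:
                found.append(idx)
        return found if len(found) == count else None

slots = [None] * 4
allocator = BumpAllocator(slots)
first = allocator.alloc(2)     # reserves the first two free indices
for i in first:
    slots[i] = "node"          # the caller fills what it reserved
```

A failed allocation corresponds to the early `return false` in `interact_comm` above: the interaction is not performed, and the redex can be retried later.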

= Massively Parallel Evaluation

Given the architecture we just constructed, evaluating an HVM2 program in
parallel is surprisingly easy: *just compute global redexes concurrently, until
there is no more work to do*.

HVM2's local interactions expose the original program's full degree of
parallelism, ensuring that all work that *can* be done in parallel *will* be
done in parallel. In other words, it maximizes the theoretical speedup, per
Amdahl's law. The atomic linking procedure ensures that points of
synchronization that emerge from the original program are solved safely and
efficiently, with no room for race conditions. Finally, the strong confluence
property ensures that the total work done is independent of the order in which
redexes are computed, giving us freedom to evaluate in parallel without
generating extra work.

== Redex Sharing

An additional question is how to actually distribute that workload across all
cores of a modern processor. The act of sharing a redex is, itself, a point
of synchronization. If this is done without enough caution, it can result in
contention, slowing down execution. HVM2 solves this with two different
approaches:

On _CPUs_, a simple task-stealing queue is used, where each thread pushes and
pops from its own local redex bag, while a starving neighbor thread actively
attempts to steal a redex from it. Since a redex is just a 64-bit value, stealing
can be done with a single `atomic_exchange` operation, making it very lightweight.
To reduce contention, and to force threads to steal "old redexes", which are
more likely to produce long independent workloads, this stealing is done from
the other end of the bag. In our experience, this works extremely well in
practice, achieving full CPU occupancy in all cases tested, with minimal
overhead, and low impact on non-parallelizable programs.
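
The two-ended access pattern can be illustrated with a small Python sketch. It is single-threaded and uses `collections.deque` in place of the atomic buffer, so it shows only which end each party touches; the class and method names are hypothetical.

```python
from collections import deque

class RedexBag:
    """Owner works on the newest redexes; thieves take the oldest ones."""

    def __init__(self):
        self.items = deque()

    def push(self, redex):
        """Owner: adds a freshly created redex."""
        self.items.append(redex)

    def pop(self):
        """Owner: takes the most recently pushed redex, if any."""
        return self.items.pop() if self.items else None

    def steal(self):
        """Thief: takes the oldest redex from the opposite end, if any."""
        return self.items.popleft() if self.items else None

bag = RedexBag()
for redex in ["old", "mid", "new"]:
    bag.push(redex)
stolen = bag.steal()   # the thief gets the oldest redex
mine = bag.pop()       # the owner keeps working on the newest one
```

Since the owner and the thief operate on opposite ends, they only touch the same slot when the bag is nearly empty, which is where the atomic exchange matters.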

On _GPUs_, this matter is more complex in many ways. First, there are two scales
on which we want sharing to occur:

1. Within a running block, where stealing between local threads can be
   accomplished by fast shared-memory operations and warp-sync primitives.

2. Across global blocks, where sharing requires either a global synchronization
   (i.e., calling the kernel again) or direct communication via global memory.

Unfortunately, the cost of global synchronization (i.e., across blocks) is very
high, so, having a globally shared redex bag, as in the C version, and accessing
it within the context of a kernel, would greatly impact performance. To improve
this, we, initially, attempted to implement a fast block-wise scheduler, which
simply lets local threads pass redexes to starving ones with warp syncs:

```c
__device__ void share_redexes(TM* tm) {
  __shared__ Pair pool[TPB];
  Pair send, recv;
  u32*  ini = &tm->rbag.lo_ini;
  u32*  end = &tm->rbag.lo_end;
  Pair* bag = tm->rbag.lo_buf;
  for (u32 off = 1; off < 32; off *= 2) {
    send = (*end - *ini) > 1 ? bag[*ini%RLEN] : 0;
    recv = __shfl_xor_sync(__activemask(), send, off);
    if (!send &&  recv) bag[((*end)++)%RLEN] = recv;
    if ( send && !recv) ++(*ini);
  }
  for (u32 off = 32; off < TPB; off *= 2) {
    u32 a = TID();
    u32 b = a ^ off;
    send = (*end - *ini) > 1 ? bag[*ini%RLEN] : 0;
    pool[a] = send;
    __syncthreads();
    recv = pool[b];
    if (!send &&  recv) bag[((*end)++)%RLEN] = recv;
    if ( send && !recv) ++(*ini);
  }
}
```

Such a procedure is efficient enough to be called between every few interactions,
allowing redexes to quickly fill the whole block. With that, all we had to do was
let the kernel perform a constant number of local interactions (usually in the
range of $2^9$ to $2^13$), and, once it completed, i.e., across kernel invocations,
the global redex bag was transposed (rows become columns), letting the entire
GPU fill naturally by just the block-wise sharing function above, and nothing
else. This approach worked very well in practice, and let us achieve a peak of
74,000 MIPS in a shader-like functional program (i.e., a "tree-map" operation).
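
The exchange pattern of `share_redexes` can be modelled on the CPU to see how redexes spread. The Python sketch below is a simplification under stated assumptions: each thread's bag is a plain list, the warp-shuffle and shared-memory phases are merged into one loop, and the number of threads is a power of two.

```python
def share_redexes(bags):
    """One pass of the XOR-butterfly exchange over all thread-local bags."""
    off = 1
    while off < len(bags):
        # A thread offers its oldest redex only if it holds more than one.
        sends = [bag[0] if len(bag) > 1 else None for bag in bags]
        for tid, bag in enumerate(bags):
            send, recv = sends[tid], sends[tid ^ off]
            if send is None and recv is not None:
                bag.append(recv)   # nothing to offer: accept the partner's redex
            elif send is not None and recv is None:
                bag.pop(0)         # partner offered nothing: give ours away
        off *= 2

# Eight threads, with all eight redexes initially held by thread 0.
bags = [list(range(1, 9))] + [[] for _ in range(7)]
for _ in range(4):
    share_redexes(bags)
```

A single pass does not balance the load, which is why the procedure is meant to be called repeatedly. In this small model, a few passes are enough for every thread to hold a redex, and no redex is lost or duplicated along the way.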

Unfortunately, it didn't work so well in cases where the implied communication
was more involved. For example, consider the following implementation of a
purely functional Bitonic Sort:

```javascript
data Tree = (Leaf val) | (Node fst snd)

// Swaps distant values in parallel; corresponds to a Red Box
(warp s (Leaf a)   (Leaf b))   = (U60.swap (^ (> a b) s) (Leaf a) (Leaf b))
(warp s (Node a b) (Node c d)) = (join (warp s a c) (warp s b d))

// Rebuilds the warped tree in the original order
(join (Node a b) (Node c d)) = (Node (Node a c) (Node b d))

// Recursively warps each sub-tree; corresponds to a Blue/Green Box
(flow s (Leaf a))   = (Leaf a)
(flow s (Node a b)) = (down s (warp s a b))

// Propagates Flow downwards
(down s (Leaf a))   = (Leaf a)
(down s (Node a b)) = (Node (flow s a) (flow s b))

// Bitonic Sort
(sort s (Leaf a))   = (Leaf a)
(sort s (Node a b)) = (flow s (Node (sort 0 a) (sort 1 b)))
```

Since this is an $O(n log^2 n)$ algorithm, its recursive structure unfolds in
a manner that is much less regular than a tree-map. As such, the naive task
sharing approach had a consequence that greatly impacted performance on GPUs:
threads would give and receive "misaligned" redexes, causing warp-local threads
to compute different calls at any given point. For example, a given warp thread
might be processing `(flow 5 (Node _ _))`, while another might be processing
`(down 0 (Leaf _))` instead. This divergence forces the GPU, whose warp-local
threads run in lockstep, to execute the diverging paths sequentially.

To improve this, a different task-sharing mechanism has been implemented, which
requires a minimal annotation: redexes corresponding to branching recursive
calls are flagged with a `!` on the global Book. With this annotation, the GPU
evaluator will then only share redexes from functions that recurse in a
parallelizable fashion. This is extremely effective, as it allows threads to
always get "equivalent" redexes in a regular recursive algorithm. For example,
if a given thread is processing `(flow 5 (Node _ _))`, it is very likely that
another warp-local thread is too. This minimizes warp divergence, and has a
profound impact on performance across many cases. On the `bitonic_sort` example,
this new policy alone resulted in a jump from 1,300 MIPS to 12,000 MIPS (9x).

== Optimization: Shared Memory Interactions

GPUs also have another particularity that, if exploited properly, can result in
significant speedups: shared memory. The NVIDIA RTX 4090, for example, includes
an L1 cache memory space of about 128KB, and GPU languages like CUDA usually
allow a programmer to manually read and write from that cache, in a shared
memory buffer that is accessible by all threads of a block. Reading and writing
from shared memory can be up to 2 orders of magnitude faster than doing so from
global memory, so using that space properly is essential to fully harness a
GPU's computing power.

On HVM32, we use that space to store a local node buffer, and a local subst map,
with 8192 nodes and variables. This occupies exactly 96KB, just enough to fit in
the shared memory of most modern GPUs. When a thread allocates a fresh node or
variable, that
allocation occurs in the shared memory, rather than the global memory. With a
configuration of 128 threads per block, each thread has a "scratchpad" of 64
nodes and vars to work locally, with no global memory access. This is often
enough to compute long-running tail-loops, which is what makes HVM2 so efficient
on shader-like programs.
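
The figures in this paragraph can be checked with a short Python calculation, using the 64-bit node and 32-bit variable sizes from the memory layout section. The constant names are illustrative.

```python
LOCAL_NODES = 8192         # nodes kept in shared memory
LOCAL_VARS = 8192          # substitution entries kept in shared memory
PAIR_BYTES = 8             # one node is a 64-bit pair
PORT_BYTES = 4             # one substitution entry is a 32-bit port
THREADS_PER_BLOCK = 128

shared_bytes = LOCAL_NODES * PAIR_BYTES + LOCAL_VARS * PORT_BYTES
nodes_per_thread = LOCAL_NODES // THREADS_PER_BLOCK
vars_per_thread = LOCAL_VARS // THREADS_PER_BLOCK

print(shared_bytes // 1024, "KiB shared,", nodes_per_thread, "nodes per thread")
```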

There is one problem, though: what happens when an interaction links a locally
allocated node to a global variable? This would cause a pointer to a local node
to "leak" to another block, which would then be unable to retrieve its
information, causing a runtime error. To handle this situation, we extend the
LINK interaction with a "LEAK" sub-interaction, which is specific to GPUs only.
That interaction essentially allocates a "global view" of the local node, filled
with two placeholder variables, such that one copy is local, and the other copy
is global (remember: variables are always paired). That way, we can continue the
local reduction without interruptions. If another block does get this "leaked"
node, it will be filled with two variables, which, in this case, act as "future"
values which will be resolved when the local thread links it.

```
^A ~ (b1 b2)
------------- LEAK
^X ~ b1
^Y ~ b2
^A ~ ^(^X ^Y)
```

The LEAK interaction allows us to safely work locally for as long as desired,
which has great impact on performance. On the stress test benchmark, the
throughput jumps from about 13,000 MIPS to 54,000 MIPS by this change alone.

= Garbage Collection

Since HVM2 is based on Interaction Combinators, which are fully linear, there is
no global "garbage collection" pass required. By IC evaluation semantics alone,
data is granularly allocated as needed, and freed as soon as it becomes
unreachable. This is specifically accomplished by the ERAS and VOID
interactions, which consume sub-nets that go out of scope in parallel, clearing
them from memory. When HVM2 completes evaluation (i.e., after all redexes have
been processed), the memory should be left with just the final result of the
program, and no remaining garbage or byproduct. No further work is needed.

= IO

TODO: explain how we do IO

= Benchmarks

TODO: include some benchmarks

= Translations

TODO: include example translations from high-level languages to HVM2

= Limitations

The HVM2 architecture, as currently presented, is capable of evaluating modern,
high-level programs on massively parallel hardware with near-ideal speedup,
which is a remarkable feat. That said, it has severe limitations that
must be understood. Below is a list of many of these limitations, and how they
can be addressed in the future.

== Only one Duplicator Node

Since HVM2 is affine, duplicator nodes are often used to copy non-linear
variables. For example, when translating the $lambda$-term `λx.x+x`, which is not
linear (because `x` occurs twice), one might use a `DUP` node to clone `x`.
Due to Interaction Net semantics, though, `DUP` nodes don't always match the
behavior of cloning a variable in traditional languages. This, if not handled
properly, can lead to *unsound reductions*. For example, the $lambda$-term:

```
C4 = (λf.λx.(f (f x)) λf.λx.(f (f x)))
```

Which computes the Church-encoded exponentiation `2^2`, cannot be soundly reduced by HVM2. To handle this, a source language must use either a type system or a similar mechanism to verify that the following invariant holds:
#v(-1em)
#quote(block: true)[
  _A higher-order lambda that clones its variable cannot be cloned._
]

While restrictive, it is important to stress that this limitation only applies
to cloning higher-order functions. A language that targets the HVM can still
clone data types with no restrictions, and it is still able to perform loops,
recursion and pattern-matches with no limitations. In other words, HVM is Turing
Complete, and can evaluate procedural languages with no restrictions, and
functional languages with some restrictions. That said, this can be amended by:

1. Adding more duplicator nodes. This would allow "nested copies" of
   higher-order functions. With a proper type system (such as EAL inference),
   this can greatly reduce the set of practical programs that are affected by
   this restriction.

2. Adding bookkeeping nodes. These nodes, originally proposed by Lamping (1990),
   allow interaction systems to evaluate the full λ-calculus with no
   restrictions. Adding bookkeeping to HVM should be easy. Sadly, this has the
   consequence of bringing a constant-time overhead, decreasing performance by
   about 10x. Because of that, it wasn't included in this model.

Ideally, a combination of both approaches should be used: a type-checker that
flags safe programs, which can be evaluated safely on HVM2, and a fallback
bookkeeper, which ensures sound reductions of programs that are not. Implementing
such a system is outside the scope of this work, and should be done as a future
extension. [cite: the optimizing optimal evaluation paper]
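
For reference, the value that a sound reducer is expected to produce for the `C4` term above can be computed in Python, where closures may be copied freely. This only shows the intended result under ordinary λ-calculus semantics; it says nothing about how HVM2 reduces the term, and the `to_int` helper is introduced here for illustration.

```python
# Church numeral 2: applies `f` twice.
two = lambda f: lambda x: f(f(x))

# C4 = (two two), i.e. the Church-encoded exponentiation 2^2.
c4 = two(two)

def to_int(church):
    """Converts a Church numeral to a Python int by counting applications."""
    return church(lambda k: k + 1)(0)

result = to_int(c4)
```

Here the outer `two` uses its argument `f` twice, and that argument is itself a lambda that uses its own `f` twice, which is the situation the invariant above rules out.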

== Ultra-Eager Evaluation Only

In our first implementation, HVM1, we used a lazy evaluation model. This not
only ensured that no unnecessary work was done, but also allowed one to compute
with infinite structures, like lists. Since the implementation presented here
reduces *all* available redexes eagerly, that means neither of these holds. For
example, if you allocate a big structure, but only read one branch, HVM2 will
allocate the entire structure, while HVM1 wouldn't. And if you do have an
infinite structure, HVM2 will never halt (because the redex bag will never be
empty). This applies even to code that doesn't look like it is an infinite
structure. For example, consider the JavaScript function below:

    ```
    foo = x => x == 0 ? 0 : 1 + foo(x-1);
    ```

In JavaScript, this is a perfectly valid function. In HVM2, if called as-is,
this would hang, because `foo(x-1)` would unroll infinitely, as we do not
"detect" that it is in a branch. To make recursive functions computable, the
usual approach is to split it into multiple definitions, as in:

    ```
    foo   = x => x == 0 ? foo_Z : foo_S(x - 1);
    foo_Z = 0;
    foo_S = p => 1 + foo(p);
    ```

Since REFs unfold lazily, the program above will properly erase the `foo_S`
branch when it reaches the base case, avoiding the infinite recursion.
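
The difference between the two versions can be imitated in Python with a strict conditional, which evaluates both branches before choosing one. This is only an analogy for the eager behavior described above: the `select` helper is hypothetical, and Python reports the runaway recursion as a `RecursionError` rather than hanging.

```python
def select(cond, if_true, if_false):
    """A strict conditional: both arguments are evaluated before the call."""
    return if_true if cond else if_false

def foo_eager(x):
    # The recursive call is evaluated even when x == 0, so it never stops.
    return select(x == 0, 0, 1 + foo_eager(x - 1))

def foo_z():
    return 0

def foo_s(pred):
    return 1 + foo(pred)

def foo(x):
    # Only a reference to each branch is passed; the chosen one is expanded after.
    branch = select(x == 0, foo_z, lambda: foo_s(x - 1))
    return branch()

value = foo(5)
```

In the second version the branches are passed as unexpanded references, playing the role of the lazily unfolded `REF`s, so the unused branch is discarded before it can recurse.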

Extending HVM2 with a full lazy mode would require us to store uplinks,
allowing threads to navigate through the graph and only reduce redexes that are
reachable from the root port. While not technically hard to do, doing so would
make the task scheduler far more complex to implement efficiently, especially in
the GPU version. We reserve this for a future extension.

== Single Core Inefficiency

While HVM2 achieves near-linear speedup, allowing it to make programs run
arbitrarily faster by just using more cores (as long as there is a sufficient
degree of parallelism), its compiler is still extremely immature, and not nearly
as fast as state-of-the-art alternatives like GCC or GHC. In single-thread CPU
evaluation, HVM2 is, at baseline, still about 5x slower than GHC, and this number
can grow to 100x on programs that involve loops and mutable arrays, since HVM2
doesn't feature these yet.

For example, a single-core C program that adds numbers from 0 to a few billions
will easily outperform an HVM2 one that uses thousands of threads, given the C
version is doing no allocation, while HVM2 is allocating a tree-like recursive
stack. That said, not every program can be implemented as an allocation-free,
register-mutating loop. For real programs that allocate tons of short memory
objects, HVM2 is expected to perform extremely well.

Moreover, contrary to what some might think, HVM2 is not incompatible with loops or
mutable types, because it isn't a functional runtime, but one based on
interaction combinators, which are fully linear. Extending HVM2 with arrays is
as easy as creating nodes for it, and implementing the interactions, and can be
done in a timely fashion as a fork of this repository. Similarly, loops can be
implemented by optimizing tail-calls. We plan to add such optimization soon.

Finally, there are many other low-hanging fruits that could improve HVM2's
performance considerably. For example, currently, we do not have native
constructors, which means that algebraic datatypes have to be λ-encoded, which
brings a 2x-5x memory overhead. Adding proper constructors and eliminating this
overhead would likely bring a proportional speedup. Similarly, adding more
numeric types like vectors would allow using more of the available GPU
instructions, and adding read-only types like immutable strings and textures
with 1-interaction reads would allow one to implement many algorithms that,
currently, wouldn't be practical, especially for graphics rendering.

== 32-bit Architecture Limitations

Since this architecture is 32-bit, and since 3 bits are reserved for a tag, that
leaves us with a 29-bit addressable space. That amounts to a total of about 500
million nodes, or about 4 GB. Modern GPUs come with as much as 256 GB integrated
memory, so HVM2 isn't able to fully use the available space, due to addressing
constraints. Moreover, its 29-bit unboxed numbers only allow for 24-bit machine
ints and floats, which may not be enough for many applications.

All these problems should be solved by extending ports to 64-bit and nodes to
128-bits, but this requires some additional considerations, since modern GPUs
don't come with 128-bit atomic operations. We'll do this in a future extension.

== More

*Work In Progress*

= Conclusion

By starting from a solid, inherently concurrent model of computation,
Interaction Combinators, carefully designing an efficient memory format,
implementing lock-free concurrent interactions via lightweight atomic
primitives, and granularly distributing workload across all cores, we were able
to design a parallel compiler and evaluator for high-level programming languages
that achieves near-linear speedup as a function of core count (within a single device).
While the resulting system still has many limitations, we proposed sensible plans to
address them in the future.

This work creates a solid foundation for parallel programming languages that are
able to harness the massively parallel capabilities of modern hardware, without
demanding explicit low-level management of threads, locks, mutexes and other
complex synchronization primitives by the programmer.


================================================
FILE: paper/README.md
================================================
# HVM - Paper(s)

- HVM2 (Work in progress)
  - HVM2's theoretical foundations, implementation, early benchmarks, current limitations, and future work.
- Extended Abstract
  - Accepted to [FProPer 2024][1].
  - A much shorter version of the main paper.

[1]: https://icfp24.sigplan.org/home/fproper-2024


================================================
FILE: paper/inet.typ
================================================
#import "@preview/cetz:0.2.2": draw, canvas

#let port = (name, pos, dir) => {
  import draw: *
  group(name: name, {
    translate(pos)
    rotate(dir)
    // scale(1.5)
    anchor("p", (0, 0))
    anchor("c", (0, 0.5))
  })
}

#let agent = (..agent) => (..args) => {
  import draw: *
  let style = agent.named().at("style", default: ())
  let name = args.named().at("name")
  let pos = args.named().at("pos")
  let rot = args.named().at("rot", default: 0deg)
  group(name: name, {
    translate(pos)
    rotate(rot)
    translate((0, -calc.sqrt(3)/4))
    stroke(2pt)
    line((-.5, 0), (.5, 0), (0, calc.sqrt(3)/2), close: true, ..style, stroke: 0.5pt)
    port("0", (0, calc.sqrt(3)/2), 0deg)
    port("1", (-1/2+1/3, 0), 180deg)
    port("2", (+1/2-1/3, 0), 180deg)
  })
}

#let link = (a, b) => {
  import draw: *
  stroke(2pt)
  bezier(a + ".p", b + ".p", a + ".c", b + ".c", stroke: 0.5pt)
}

#let con = agent()
#let dup = agent(style: (fill: black))


================================================
FILE: src/ast.rs
================================================
use TSPL::{new_parser, Parser, ParseError};
use highlight_error::highlight_error;
use crate::hvm;
use std::fmt::{Debug, Display};
use std::collections::{BTreeMap, BTreeSet};

// Types
// -----

#[derive(Clone, Hash, PartialEq, Eq, Debug)]
pub struct Numb(pub u32);

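// A tree of the textual syntax. Each variant notes the form `parse_tree` accepts.
#[derive(Clone, Hash, PartialEq, Eq, Debug)]
pub enum Tree {
  // Variable: `name`
  Var { nam: String },
  // Reference to a definition: `@name`
  Ref { nam: String },
  // Eraser: `*`
  Era,
  // Number literal or bracketed symbol such as `[+]`
  Num { val: Numb },
  // Constructor node: `(fst snd)`
  Con { fst: Box<Tree>, snd: Box<Tree> },
  // Duplicator node: `{fst snd}`
  Dup { fst: Box<Tree>, snd: Box<Tree> },
  // Numeric operator node: `$(fst snd)`
  Opr { fst: Box<Tree>, snd: Box<Tree> },
  // Switch node: `?(fst snd)`
  Swi { fst: Box<Tree>, snd: Box<Tree> },
}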

// A redex written `& fst ~ snd`; the `bool` is true for the `&!` form.
pub type Redex = (bool, Tree, Tree);

#[derive(Clone, Hash, PartialEq, Eq, Debug)]
pub struct Net {
  pub root: Tree,
  pub rbag: Vec<Redex>,
}

pub struct Book {
  pub defs: BTreeMap<String, Net>,
}

// Parser
// ------

pub type ParseResult<T> = std::result::Result<T, ParseError>;

new_parser!(CoreParser);

impl<'i> CoreParser<'i> {

  pub fn parse_numb_sym(&mut self) -> ParseResult<Numb> {
    self.consume("[")?;

    // numeric casts
    if let Some(cast) = match () {
      _ if self.try_consume("u24") => Some(hvm::TY_U24),
      _ if self.try_consume("i24") => Some(hvm::TY_I24),
      _ if self.try_consume("f24") => Some(hvm::TY_F24),
      _ => None
    } {
      // Casts can't be partially applied, so nothing should follow.
      self.consume("]")?;

      return Ok(Numb(hvm::Numb::new_sym(cast).0));
    }

    // Parses the symbol
    let op = hvm::Numb::new_sym(match () {
      // numeric operations
      _ if self.try_consume("+")   => hvm::OP_ADD,
      _ if self.try_consume("-")   => hvm::OP_SUB,
      _ if self.try_consume(":-")  => hvm::FP_SUB,
      _ if self.try_consume("*")   => hvm::OP_MUL,
      _ if self.try_consume("/")   => hvm::OP_DIV,
      _ if self.try_consume(":/")  => hvm::FP_DIV,
      _ if self.try_consume("%")   => hvm::OP_REM,
      _ if self.try_consume(":%")  => hvm::FP_REM,
      _ if self.try_consume("=")   => hvm::OP_EQ,
      _ if self.try_consume("!")   => hvm::OP_NEQ,
      _ if self.try_consume("<<")  => hvm::OP_SHL,
      _ if self.try_consume(":<<") => hvm::FP_SHL,
      _ if self.try_consume(">>")  => hvm::OP_SHR,
      _ if self.try_consume(":>>") => hvm::FP_SHR,
      _ if self.try_consume("<")   => hvm::OP_LT,
      _ if self.try_consume(">")   => hvm::OP_GT,
      _ if self.try_consume("&")   => hvm::OP_AND,
      _ if self.try_consume("|")   => hvm::OP_OR,
      _ if self.try_consume("^")   => hvm::OP_XOR,
      _ => self.expected("operator symbol")?,
    });

    self.skip_trivia();
    // Syntax for partial operations, like `[*2]`
    let num = if self.peek_one() != Some(']') {
      hvm::Numb::partial(op, hvm::Numb(self.parse_numb_lit()?.0))
    } else {
      op
    };

    // Closes symbol bracket
    self.consume("]")?;

    // Returns the symbol
    return Ok(Numb(num.0));
  }

  pub fn parse_numb_lit(&mut self) -> ParseResult<Numb> {
    let ini = self.index;
    let num = self.take_while(|x| x.is_alphanumeric() || x == '+' || x == '-' || x == '.');
    let mut num_parser = CoreParser::new(num);
    let end = self.index;
    Ok(Numb(if num.contains('.') || num.contains("inf") || num.contains("NaN") {
      let val: f32 = num.parse()
        .map_err(|err| {
          let msg = format!("invalid number literal: {}\n{}", err, highlight_error(ini, end, self.input));
          self.expected_and::<Numb>("number literal", &msg).unwrap_err()
        })?;
      hvm::Numb::new_f24(val)
    } else if num.starts_with('+') || num.starts_with('-') {
      *num_parser.index() += 1;
      let val = num_parser.parse_u64()? as i32;
      hvm::Numb::new_i24(if num.starts_with('-') { -val } else { val })
    } else {
      let val = num_parser.parse_u64()? as u32;
      hvm::Numb::new_u24(val)
    }.0))
  }

  pub fn parse_numb(&mut self) -> ParseResult<Numb> {
    self.skip_trivia();

    // Parses symbols (SYM)
    if let Some('[') = self.peek_one() {
      return self.parse_numb_sym();
    // Parses numbers (U24,I24,F24)
    } else {
      return self.parse_numb_lit();
    }
  }

  pub fn parse_tree(&mut self) -> ParseResult<Tree> {
    self.skip_trivia();
    //println!("aaa ||{}", &self.input[self.index..]);
    match self.peek_one() {
      Some('(') => {
        self.advance_one();
        let fst = Box::new(self.parse_tree()?);
        self.skip_trivia();
        let snd = Box::new(self.parse_tree()?);
        self.consume(")")?;
        Ok(Tree::Con { fst, snd })
      }
      Some('{') => {
        self.advance_one();
        let fst = Box::new(self.parse_tree()?);
        self.skip_trivia();
        let snd = Box::new(self.parse_tree()?);
        self.consume("}")?;
        Ok(Tree::Dup { fst, snd })
      }
      Some('$') => {
        self.advance_one();
        self.consume("(")?;
        let fst = Box::new(self.parse_tree()?);
        self.skip_trivia();
        let snd = Box::new(self.parse_tree()?);
        self.consume(")")?;
        Ok(Tree::Opr { fst, snd })
      }
      Some('?') => {
        self.advance_one();
        self.consume("(")?;
        let fst = Box::new(self.parse_tree()?);
        self.skip_trivia();
        let snd = Box::new(self.parse_tree()?);
        self.consume(")")?;
        Ok(Tree::Swi { fst, snd })
      }
      Some('@') => {
        self.advance_one();
        let nam = self.parse_name()?;
        Ok(Tree::Ref { nam })
      }
      Some('*') => {
        self.advance_one();
        Ok(Tree::Era)
      }
      _ => {
        if let Some(c) = self.peek_one() {
          if "0123456789+-[".contains(c) {
            return Ok(Tree::Num { val: self.parse_numb()? });
          }
        }
        let nam = self.parse_name()?;
        Ok(Tree::Var { nam })
      }
    }
  }

  pub fn parse_net(&mut self) -> ParseResult<Net> {
    let root = self.parse_tree()?;
    let mut rbag = Vec::new();
    self.skip_trivia();
    while self.peek_one() == Some('&') {
      self.consume("&")?;
      let par = if let Some('!') = self.peek_one() { self.consume("!")?; true } else { false };
      let fst = self.parse_tree()?;
      self.consume("~")?;
      let snd = self.parse_tree()?;
      rbag.push((par,fst,snd));
      self.skip_trivia();
    }
    Ok(Net { root, rbag })
  }

  pub fn parse_book(&mut self) -> ParseResult<Book> {
    let mut defs = BTreeMap::new();
    while !self.is_eof() {
      self.consume("@")?;
      let name = self.parse_name()?;
      self.consume("=")?;
      let net = self.parse_net()?;
      defs.insert(name, net);
    }
    Ok(Book { defs })
  }

  fn try_consume(&mut self, str: &str) -> bool {
    let matches = self.peek_many(str.len()) == Some(str);
    if matches {
      self.advance_many(str.len());
    }
    matches
  }
}

// Stringifier
// -----------

impl Numb {
  pub fn show(&self) -> String {
    let numb = hvm::Numb(self.0);
    match numb.get_typ() {
      hvm::TY_SYM => match numb.get_sym() as hvm::Tag {
        // casts
        hvm::TY_U24 => "[u24]".to_string(),
        hvm::TY_I24 => "[i24]".to_string(),
        hvm::TY_F24 => "[f24]".to_string(),
        // operations
        hvm::OP_ADD => "[+]".to_string(),
        hvm::OP_SUB => "[-]".to_string(),
        hvm::FP_SUB => "[:-]".to_string(),
        hvm::OP_MUL => "[*]".to_string(),
        hvm::OP_DIV => "[/]".to_string(),
        hvm::FP_DIV => "[:/]".to_string(),
        hvm::OP_REM => "[%]".to_string(),
        hvm::FP_REM => "[:%]".to_string(),
        hvm::OP_EQ  => "[=]".to_string(),
        hvm::OP_NEQ => "[!]".to_string(),
        hvm::OP_LT  => "[<]".to_string(),
        hvm::OP_GT  => "[>]".to_string(),
        hvm::OP_AND => "[&]".to_string(),
        hvm::OP_OR  => "[|]".to_string(),
        hvm::OP_XOR => "[^]".to_string(),
        hvm::OP_SHL => "[<<]".to_string(),
        hvm::FP_SHL => "[:<<]".to_string(),
        hvm::OP_SHR => "[>>]".to_string(),
        hvm::FP_SHR => "[:>>]".to_string(),
        _           => "[?]".to_string(),
      }
      hvm::TY_U24 => {
        let val = numb.get_u24();
        format!("{}", val)
      }
      hvm::TY_I24 => {
        let val = numb.get_i24();
        format!("{:+}", val)
      }
      hvm::TY_F24 => {
        let val = numb.get_f24();
        if val.is_infinite() {
          if val.is_sign_positive() {
            format!("+inf")
          } else {
            format!("-inf")
          }
        } else if val.is_nan() {
          format!("+NaN")
        } else {
          format!("{:?}", val)
        }
      }
      _ => {
        let typ = numb.get_typ();
        let val = numb.get_u24();
        format!("[{}0x{:07X}]", match typ {
          hvm::OP_ADD => "+",
          hvm::OP_SUB => "-",
          hvm::FP_SUB => ":-",
          hvm::OP_MUL => "*",
          hvm::OP_DIV => "/",
          hvm::FP_DIV => ":/",
          hvm::OP_REM => "%",
          hvm::FP_REM => ":%",
          hvm::OP_EQ  => "=",
          hvm::OP_NEQ => "!",
          hvm::OP_LT  => "<",
          hvm::OP_GT  => ">",
          hvm::OP_AND => "&",
          hvm::OP_OR  => "|",
          hvm::OP_XOR => "^",
          hvm::OP_SHL => "<<",
          hvm::FP_SHL => ":<<",
          hvm::OP_SHR => ">>",
          hvm::FP_SHR => ":>>",
          _           => "?",
        }, val)
      }
    }
  }
}

impl Tree {
  pub fn show(&self) -> String {
    match self {
      Tree::Var { nam } => nam.to_string(),
      Tree::Ref { nam } => format!("@{}", nam),
      Tree::Era => "*".to_string(),
      Tree::Num { val } => format!("{}", val.show()),
      Tree::Con { fst, snd } => format!("({} {})", fst.show(), snd.show()),
      Tree::Dup { fst, snd } => format!("{{{} {}}}", fst.show(), snd.show()),
      Tree::Opr { fst, snd } => format!("$({} {})", fst.show(), snd.show()),
      Tree::Swi { fst, snd } => format!("?({} {})", fst.show(), snd.show()),
    }
  }
}

impl Net {
  pub fn show(&self) -> String {
    let mut s = self.root.show();
    for (par, fst, snd) in &self.rbag {
      s.push_str(" &");
      s.push_str(if *par { "!" } else { " " });
      s.push_str(&fst.show());
      s.push_str(" ~ ");
      s.push_str(&snd.show());
    }
    s
  }
}

impl Book {
  pub fn show(&self) -> String {
    let mut s = String::new();
    for (name, net) in &self.defs {
      s.push_str("@");
      s.push_str(name);
      s.push_str(" = ");
      s.push_str(&net.show());
      s.push('\n');
    }
    s
  }
}

// Readback
// --------

impl Tree {
  pub fn readback(net: &hvm::GNet, port: hvm::Port, fids: &BTreeMap<hvm::Val, String>) -> Option<Tree> {
    //println!("reading {}", port.show());
    match port.get_tag() {
      hvm::VAR => {
        let got = net.enter(port);
        if got != port {
          return Tree::readback(net, got, fids);
        } else {
          return Some(Tree::Var { nam: format!("v{:x}", port.get_val()) });
        }
      }
      hvm::REF => {
        return Some(Tree::Ref { nam: fids.get(&port.get_val())?.clone() });
      }
      hvm::ERA => {
        return Some(Tree::Era);
      }
      hvm::NUM => {
        return Some(Tree::Num { val: Numb(port.get_val()) });
      }
      hvm::CON => {
        let pair = net.node_load(port.get_val() as usize);
        let fst = Tree::readback(net, pair.get_fst(), fids)?;
        let snd = Tree::readback(net, pair.get_snd(), fids)?;
        return Some(Tree::Con { fst: Box::new(fst), snd: Box::new(snd) });
      }
      hvm::DUP => {
        let pair = net.node_load(port.get_val() as usize);
        let fst = Tree::readback(net, pair.get_fst(), fids)?;
        let snd = Tree::readback(net, pair.get_snd(), fids)?;
        return Some(Tree::Dup { fst: Box::new(fst), snd: Box::new(snd) });
      }
      hvm::OPR => {
        let pair = net.node_load(port.get_val() as usize);
        let fst = Tree::readback(net, pair.get_fst(), fids)?;
        let snd = Tree::readback(net, pair.get_snd(), fids)?;
        return Some(Tree::Opr { fst: Box::new(fst), snd: Box::new(snd) });
      }
      hvm::SWI => {
        let pair = net.node_load(port.get_val() as usize);
        let fst = Tree::readback(net, pair.get_fst(), fids)?;
        let snd = Tree::readback(net, pair.get_snd(), fids)?;
        return Some(Tree::Swi { fst: Box::new(fst), snd: Box::new(snd) });
      }
      _ => {
        unreachable!()
      }
    }
  }
}

impl Net {
  pub fn readback(net: &hvm::GNet, book: &hvm::Book) -> Option<Net> {
    let mut fids = BTreeMap::new();
    for (fid, def) in book.defs.iter().enumerate() {
      fids.insert(fid as hvm::Val, def.name.clone());
    }
    let root = net.enter(hvm::ROOT);
    let root = Tree::readback(net, root, &fids)?;
    let rbag = Vec::new();
    return Some(Net { root, rbag });
  }
}

// Def Builder
// -----------

impl Tree {
  pub fn build(&self, def: &mut hvm::Def, fids: &BTreeMap<String, hvm::Val>, vars: &mut BTreeMap<String, hvm::Val>) -> hvm::Port {
    match self {
      Tree::Var { nam } => {
        if !vars.contains_key(nam) {
          vars.insert(nam.clone(), vars.len() as hvm::Val);
          def.vars += 1;
        }
        return hvm::Port::new(hvm::VAR, *vars.get(nam).unwrap());
      }
      Tree::Ref { nam } => {
        if let Some(fid) = fids.get(nam) {
          return hvm::Port::new(hvm::REF, *fid);
        } else {
          panic!("Unbound definition: {}", nam);
        }
      }
      Tree::Era => {
        return hvm::Port::new(hvm::ERA, 0);
      }
      Tree::Num { val } => {
        return hvm::Port::new(hvm::NUM, val.0);
      }
      Tree::Con { fst, snd } => {
        let index = def.node.len();
        def.node.push(hvm::Pair(0));
        let p1 = fst.build(def, fids, vars);
        let p2 = snd.build(def, fids, vars);
        def.node[index] = hvm::Pair::new(p1, p2);
        return hvm::Port::new(hvm::CON, index as hvm::Val);
      }
      Tree::Dup { fst, snd } => {
        def.safe = false;
        let index = def.node.len();
        def.node.push(hvm::Pair(0));
        let p1 = fst.build(def, fids, vars);
        let p2 = snd.build(def, fids, vars);
        def.node[index] = hvm::Pair::new(p1, p2);
        return hvm::Port::new(hvm::DUP, index as hvm::Val);
      },
      Tree::Opr { fst, snd } => {
        let index = def.node.len();
        def.node.push(hvm::Pair(0));
        let p1 = fst.build(def, fids, vars);
        let p2 = snd.build(def, fids, vars);
        def.node[index] = hvm::Pair::new(p1, p2);
        return hvm::Port::new(hvm::OPR, index as hvm::Val);
      },
      Tree::Swi { fst, snd } => {
        let index = def.node.len();
        def.node.push(hvm::Pair(0));
        let p1 = fst.build(def, fids, vars);
        let p2 = snd.build(def, fids, vars);
        def.node[index] = hvm::Pair::new(p1, p2);
        return hvm::Port::new(hvm::SWI, index as hvm::Val);
      },
    }
  }

  pub fn direct_dependencies<'name>(&'name self) -> BTreeSet<&'name str> {
    let mut stack: Vec<&Tree> = vec![self];
    let mut acc: BTreeSet<&'name str> = BTreeSet::new();
    
    while let Some(curr) = stack.pop() {
      match curr {
        Tree::Ref { nam } => { acc.insert(nam); },
        Tree::Con { fst, snd } => { stack.push(fst); stack.push(snd); },
        Tree::Dup { fst, snd } => { stack.push(fst); stack.push(snd); },
        Tree::Opr { fst, snd } => { stack.push(fst); stack.push(snd); },
        Tree::Swi { fst, snd } => { stack.push(fst); stack.push(snd); },
        Tree::Num { .. } => {},
        Tree::Var { .. } => {},
        Tree::Era => {},
      };
    }
    acc
  }
}

impl Net {
  pub fn build(&self, def: &mut hvm::Def, fids: &BTreeMap<String, hvm::Val>, vars: &mut BTreeMap<String, hvm::Val>) {
    def.root = self.root.build(def, fids, vars);
    for (par, fst, snd) in &self.rbag {
      let index = def.rbag.len();
      def.rbag.push(hvm::Pair(0));
      let p1 = fst.build(def, fids, vars);
      let p2 = snd.build(def, fids, vars);
      let rx = hvm::Pair::new(p1, p2);
      let rx = if *par { rx.set_par_flag() } else { rx };
      def.rbag[index] = rx;
    }
  }
}

impl Book {
  pub fn parse(code: &str) -> ParseResult<Self> {
    CoreParser::new(code).parse_book()
  }

  pub fn build(&self) -> hvm::Book {
    let mut name_to_fid = BTreeMap::new();
    let mut fid_to_name = BTreeMap::new();
    fid_to_name.insert(0, "main".to_string());
    name_to_fid.insert("main".to_string(), 0);
    for (_i, (name, _)) in self.defs.iter().enumerate() {
      if name != "main" {
        fid_to_name.insert(name_to_fid.len() as hvm::Val, name.clone());
        name_to_fid.insert(name.clone(), name_to_fid.len() as hvm::Val);
      }
    }
    let mut book = hvm::Book { defs: Vec::new() };
    for (fid, name) in &fid_to_name {
      let ast_def = self.defs.get(name).expect("missing `@main` definition");
      let mut def = hvm::Def {
        name: name.clone(),
        safe: true,
        root: hvm::Port(0),
        rbag: vec![],
        node: vec![],
        vars: 0,
      };
      ast_def.build(&mut def, &name_to_fid, &mut BTreeMap::new());
      book.defs.push(def);
    }
    self.propagate_safety(&mut book, &name_to_fid);
    return book;
  }

  /// Propagate unsafe definitions to those that reference them.
  ///
  /// When calling this function, it is expected that definitions that are directly
  /// unsafe are already marked as such in the `compiled_book`.
  /// 
  /// This does not completely solve the cloning safety in HVM. It only stops invalid
  /// **global** definitions from being cloned, but local unsafe code can still be
  /// cloned and can generate seemingly unexpected results, such as placing eraser
  /// nodes in weird places. See HVM issue [#362](https://github.com/HigherOrderCO/HVM/issues/362)
  /// for an example.
  fn propagate_safety(&self, compiled_book: &mut hvm::Book, lookup: &BTreeMap<String, u32>) {
    let dependents = self.direct_dependents();
    let mut stack: Vec<&str> = Vec::new();

    for (name, _) in self.defs.iter() {
      let def = &mut compiled_book.defs[lookup[name] as usize];
      if !def.safe {
        for next in dependents[name.as_str()].iter() {
          stack.push(next);
        }
      }
    }

    while let Some(curr) = stack.pop() {
      let def = &mut compiled_book.defs[lookup[curr] as usize];
      if !def.safe {
        // Already visited, skip this
        continue;
      }

      def.safe = false;

      for &next in dependents[curr].iter() {
        stack.push(next);
      }
    }
  }

  /// Calculates the dependents of each definition, that is, if definition `A`
  /// requires `B`, `B: A` is in the return map. This is used to propagate unsafe
  /// definitions to others that depend on them.
  /// 
  /// This solution has linear complexity on the number of definitions in the
  /// book and the number of direct references in each definition, but it also
  /// traverses each definition's trees entirely once.
  ///
  /// Complexity: O(d*t + r)
  /// - `d` is the number of definitions in the book
  /// - `r` is the number of direct references in each definition
  /// - `t` is the number of nodes in each tree
  fn direct_dependents<'name>(&'name self) -> BTreeMap<&'name str, BTreeSet<&'name str>> {
    let mut result = BTreeMap::new();
    for (name, _) in self.defs.iter() {
      result.insert(name.as_str(), BTreeSet::new());
    }

    let mut process = |tree: &'name Tree, name: &'name str| {
      for dependency in tree.direct_dependencies() {
        result
          .get_mut(dependency)
          .expect("global definition depends on undeclared reference")
          .insert(name);
      }
    };

    for (name, net) in self.defs.iter() {
      process(&net.root, name);
      for (_, r1, r2) in net.rbag.iter() {
        process(r1, name);
        process(r2, name);
      }
    }
    result
  }
}


================================================
FILE: src/cmp.rs
================================================
use crate::ast;
use crate::hvm;

use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq)]
pub enum Target { CUDA, C }

// Compiles a whole Book.
pub fn compile_book(trg: Target, book: &hvm::Book) -> String {
  let mut code = String::new();

  // Compiles functions
  for fid in 0..book.defs.len() {
    compile_def(trg, &mut code, book, 0, fid as hvm::Val);
    code.push_str(&format!("\n"));
  }

  // Compiles interact_call
  if trg == Target::CUDA {
    code.push_str(&format!("__device__ "));
  }
  code.push_str(&format!("bool interact_call(Net *net, TM *tm, Port a, Port b) {{\n"));
  code.push_str(&format!("  u32 fid = get_val(a) & 0xFFFFFFF;\n"));
  code.push_str(&format!("  switch (fid) {{\n"));
  for (fid, def) in book.defs.iter().enumerate() {
    code.push_str(&format!("    case {}: return interact_call_{}(net, tm, a, b);\n", fid, &def.name.replace("/","_").replace(".","_").replace("-","_")));
  }
  code.push_str(&format!("    default: return false;\n"));
  code.push_str(&format!("  }}\n"));
  code.push_str(&format!("}}"));

  return code;
}

// Compiles a single Def.
pub fn compile_def(trg: Target, code: &mut String, book: &hvm::Book, tab: usize, fid: hvm::Val) {
  let def = &book.defs[fid as usize];
  let fun = &def.name.replace("/","_").replace(".","_").replace("-","_");

  // Initializes context
  let neo = &mut 0;
  
  // Generates function
  if trg == Target::CUDA {
    code.push_str(&format!("__device__ "));
  }
  code.push_str(&format!("{}bool interact_call_{}(Net *net, TM *tm, Port a, Port b) {{\n", indent(tab), fun));
  // Fast DUP-REF
  if def.safe {
    code.push_str(&format!("{}if (get_tag(b) == DUP) {{\n", indent(tab+1)));
    code.push_str(&format!("{}return interact_eras(net, tm, a, b);\n", indent(tab+2)));
    code.push_str(&format!("{}}}\n", indent(tab+1)));
  }
  code.push_str(&format!("{}u32 vl = 0;\n", indent(tab+1)));
  code.push_str(&format!("{}u32 nl = 0;\n", indent(tab+1)));

  // Allocs resources (using fast allocator)
  for i in 0 .. def.vars {
    code.push_str(&format!("{}Val v{:x} = vars_alloc_1(net, tm, &vl);\n", indent(tab+1), i));
  }
  for i in 0 .. def.node.len() {
    code.push_str(&format!("{}Val n{:x} = node_alloc_1(net, tm, &nl);\n", indent(tab+1), i));
  }
  code.push_str(&format!("{}if (0", indent(tab+1)));
  for i in 0 .. def.vars {
    code.push_str(&format!(" || !v{:x}", i));
  }
  for i in 0 .. def.node.len() {
    code.push_str(&format!(" || !n{:x}", i));
  }
  code.push_str(&format!(") {{\n"));
  code.push_str(&format!("{}return false;\n", indent(tab+2)));
  code.push_str(&format!("{}}}\n", indent(tab+1)));
  for i in 0 .. def.vars {
    code.push_str(&format!("{}vars_create(net, v{:x}, NONE);\n", indent(tab+1), i));
  }

  // Allocs resources (using slow allocator)
  //code.push_str(&format!("{}// Allocates needed resources.\n", indent(tab+1)));
  //code.push_str(&format!("{}if (!get_resources(net, tm, {}, {}, {})) {{\n", indent(tab+1), def.rbag.len()+1, def.node.len(), def.vars));
  //code.push_str(&format!("{}return false;\n", indent(tab+2)));
  //code.push_str(&format!("{}}}\n", indent(tab+1)));
  //for i in 0 .. def.node.len() {
    //code.push_str(&format!("{}Val n{:x} = tm->nloc[0x{:x}];\n", indent(tab+1), i, i));
  //}
  //for i in 0 .. def.vars {
    //code.push_str(&format!("{}Val v{:x} = tm->vloc[0x{:x}];\n", indent(tab+1), i, i));
  //}
  //for i in 0 .. def.vars {
    //code.push_str(&format!("{}vars_create(net, v{:x}, NONE);\n", indent(tab+1), i));
  //}

  // Compiles root
  compile_link_fast(trg, code, book, neo, tab+1, def, def.root, "b");

  // Compiles rbag
  for redex in &def.rbag {
    let fun = compile_node(trg, code, book, neo, tab+1, def, redex.get_fst());
    let arg = compile_node(trg, code, book, neo, tab+1, def, redex.get_snd());
    code.push_str(&format!("{}link(net, tm, {}, {});\n", indent(tab+1), &fun, &arg));
  }

  // Return
  code.push_str(&format!("{}return true;\n", indent(tab+1)));
  code.push_str(&format!("{}}}\n", indent(tab)));
}

// Compiles a link, performing some pre-defined static reductions.
pub fn compile_link_fast(trg: Target, code: &mut String, book: &hvm::Book, neo: &mut usize, tab: usize, def: &hvm::Def, a: hvm::Port, b: &str) {

  // (<?(a111 a112) a12> a2) <~ (#X R)
  // --------------------------------- fast SWITCH
  // if X == 0:
  //   a111 <~ R
  //   a112 <~ ERAS
  // else:
  //   a111 <~ ERAS
  //   a112 <~ (#(X-1) R)
  if trg != Target::CUDA && a.get_tag() == hvm::CON {
    let a_ = &def.node[a.get_val() as usize];
    let a1 = a_.get_fst();
    let a2 = a_.get_snd();
    if a1.get_tag() == hvm::SWI {
      let a1_ = &def.node[a1.get_val() as usize];
      let a11 = a1_.get_fst();
      let a12 = a1_.get_snd();
      if a11.get_tag() == hvm::CON && a2.get_tag() == hvm::VAR && a12.0 == a2.0 {
        let a11_ = &def.node[a11.get_val() as usize];
        let a111 = a11_.get_fst();
        let a112 = a11_.get_snd();
        let op   = fresh(neo);
        let bv   = fresh(neo);
        let x1   = fresh(neo);
        let x2   = fresh(neo);
        let nu   = fresh(neo);
        code.push_str(&format!("{}bool {} = 0;\n", indent(tab), &op));
        code.push_str(&format!("{}Pair {} = 0;\n", indent(tab), &bv));
        code.push_str(&format!("{}Port {} = NONE;\n", indent(tab), &nu));
        code.push_str(&format!("{}Port {} = NONE;\n", indent(tab), &x1));
        code.push_str(&format!("{}Port {} = NONE;\n", indent(tab), &x2));
        code.push_str(&format!("{}//fast switch\n", indent(tab)));
        code.push_str(&format!("{}if (get_tag({}) == CON) {{\n", indent(tab), b));
        code.push_str(&format!("{}{} = node_load(net, get_val({}));\n", indent(tab+1), &bv, b)); // recycled
        code.push_str(&format!("{}{} = enter(net,get_fst({}));\n", indent(tab+1), &nu, &bv));
        code.push_str(&format!("{}if (get_tag({}) == NUM) {{\n", indent(tab+1), &nu));
        code.push_str(&format!("{}tm->itrs += 3;\n", indent(tab+2)));
        code.push_str(&format!("{}vars_take(net, v{:x});\n", indent(tab+2), a2.get_val()));
        code.push_str(&format!("{}{} = 1;\n", indent(tab+2), &op));
        code.push_str(&format!("{}if (get_u24(get_val({})) == 0) {{\n", indent(tab+2), &nu));
        code.push_str(&format!("{}node_take(net, get_val({}));\n", indent(tab+3), b));
        code.push_str(&format!("{}{} = get_snd({});\n", indent(tab+3), &x1, &bv));
        code.push_str(&format!("{}{} = new_port(ERA,0);\n", indent(tab+3), &x2));
        code.push_str(&format!("{}}} else {{\n", indent(tab+2)));
        code.push_str(&format!("{}node_store(net, get_val({}), new_pair(new_port(NUM,new_u24(get_u24(get_val({}))-1)), get_snd({})));\n", indent(tab+3), b, &nu, &bv));
        code.push_str(&format!("{}{} = new_port(ERA,0);\n", indent(tab+3), &x1));
        code.push_str(&format!("{}{} = {};\n", indent(tab+3), &x2, b));
        code.push_str(&format!("{}}}\n", indent(tab+2)));
        code.push_str(&format!("{}}} else {{\n", indent(tab+1)));
        code.push_str(&format!("{}node_store(net, get_val({}), new_pair({},get_snd({})));\n", indent(tab+2), b, &nu, &bv)); // update "entered" var
        code.push_str(&format!("{}}}\n", indent(tab+1)));
        code.push_str(&format!("{}}}\n", indent(tab+0)));
        compile_link_fast(trg, code, book, neo, tab, def, a111, &x1);
        compile_link_fast(trg, code, book, neo, tab, def, a112, &x2);
        code.push_str(&format!("{}if (!{}) {{\n", indent(tab), &op));
        code.push_str(&format!("{}node_create(net, n{:x}, new_pair(new_port(SWI,n{:x}),new_port(VAR,v{:x})));\n", indent(tab+1), a.get_val(), a1.get_val(), a2.get_val()));
        code.push_str(&format!("{}node_create(net, n{:x}, new_pair(new_port(CON,n{:x}),new_port(VAR,v{:x})));\n", indent(tab+1), a1.get_val(), a11.get_val(), a12.get_val()));
        code.push_str(&format!("{}node_create(net, n{:x}, new_pair({},{}));\n", indent(tab+1), a11.get_val(), &x1, &x2));
        link_or_store(trg, code, book, neo, tab+1, def, &format!("new_port(CON, n{:x})", a.get_val()), b);
        code.push_str(&format!("{}}}\n", indent(tab)));
        return;
      }
    }
  }

  // FIXME: REVIEW
  // <+ #B r> <~ #A
  // --------------- fast OPER
  // r <~ #(op(A,B))
  if trg != Target::CUDA && a.get_tag() == hvm::OPR {
    let a_ = &def.node[a.get_val() as usize];
    let a1 = a_.get_fst();
    let a2 = a_.get_snd();
    let op = fresh(neo);
    let x1 = compile_node(trg, code, book, neo, tab, def, a1);
    let x2 = fresh(neo);
    code.push_str(&format!("{}bool {} = 0;\n", indent(tab), &op));
    code.push_str(&format!("{}Port {} = NONE;\n", indent(tab), &x2));
    code.push_str(&format!("{}// fast oper\n", indent(tab)));
    code.push_str(&format!("{}if (get_tag({}) == NUM && get_tag({}) == NUM) {{\n", indent(tab), b, &x1));
    code.push_str(&format!("{}tm->itrs += 1;\n", indent(tab+1)));
    code.push_str(&format!("{}{} = 1;\n", indent(tab+1), &op));
    code.push_str(&format!("{}{} = new_port(NUM, operate(get_val({}), get_val({})));\n", indent(tab+1), &x2, b, &x1));
    code.push_str(&format!("{}}}\n", indent(tab)));
    compile_link_fast(trg, code, book, neo, tab, def, a2, &x2);
    code.push_str(&format!("{}if (!{}) {{\n", indent(tab), &op));
    code.push_str(&format!("{}node_create(net, n{:x}, new_pair({},{}));\n", indent(tab+1), a.get_val(), &x1, &x2));
    link_or_store(trg, code, book, neo, tab+1, def, &format!("new_port(OPR, n{:x})", a.get_val()), b);
    code.push_str(&format!("{}}}\n", indent(tab)));
    return;
  }

  // FIXME: REVIEW
  // {a1 a2} <~ #v
  // ------------- Fast COPY
  // a1 <~ #v
  // a2 <~ #v
  if trg != Target::CUDA && a.get_tag() == hvm::DUP {
    let a_ = &def.node[a.get_val() as usize];
    let p1 = a_.get_fst();
    let p2 = a_.get_snd();
    let op = fresh(neo);
    let x1 = fresh(neo);
    let x2 = fresh(neo);
    code.push_str(&format!("{}bool {} = 0;\n", indent(tab), &op));
    code.push_str(&format!("{}Port {} = NONE;\n", indent(tab), &x1));
    code.push_str(&format!("{}Port {} = NONE;\n", indent(tab), &x2));
    code.push_str(&format!("{}// fast copy\n", indent(tab)));
    code.push_str(&format!("{}if (get_tag({}) == NUM) {{\n", indent(tab), b));
    code.push_str(&format!("{}tm->itrs += 1;\n", indent(tab+1)));
    code.push_str(&format!("{}{} = 1;\n", indent(tab+1), &op));
    code.push_str(&format!("{}{} = {};\n", indent(tab+1), &x1, b));
    code.push_str(&format!("{}{} = {};\n", indent(tab+1), &x2, b));
    code.push_str(&format!("{}}}\n", indent(tab)));
    compile_link_fast(trg, code, book, neo, tab, def, p2, &x2);
    compile_link_fast(trg, code, book, neo, tab, def, p1, &x1);
    code.push_str(&format!("{}if (!{}) {{\n", indent(tab), &op));
    code.push_str(&format!("{}node_create(net, n{:x}, new_pair({},{}));\n", indent(tab+1), a.get_val(), x1, x2));
    link_or_store(trg, code, book, neo, tab+1, def, &format!("new_port(DUP,n{:x})", a.get_val()), b);
    code.push_str(&format!("{}}}\n", indent(tab)));
    return;
  }

  // (a1 a2) <~ (x1 x2)
  // ------------------ Fast ANNI
  // a1 <~ x1
  // a2 <~ x2
  if trg != Target::CUDA && a.get_tag() == hvm::CON {
    let a_ = &def.node[a.get_val() as usize];
    let a1 = a_.get_fst();
    let a2 = a_.get_snd();
    let op = fresh(neo);
    let bv = fresh(neo);
    let x1 = fresh(neo);
    let x2 = fresh(neo);
    code.push_str(&format!("{}bool {} = 0;\n", indent(tab), &op));
    code.push_str(&format!("{}Pair {} = 0;\n", indent(tab), &bv));
    code.push_str(&format!("{}Port {} = NONE;\n", indent(tab), &x1));
    code.push_str(&format!("{}Port {} = NONE;\n", indent(tab), &x2));
    code.push_str(&format!("{}// fast anni\n", indent(tab)));
    code.push_str(&format!("{}if (get_tag({}) == CON && node_load(net, get_val({})) != 0) {{\n", indent(tab), b, b));
    //code.push_str(&format!("{}atomic_fetch_add(&FAST, 1);\n", indent(tab+1)));
    code.push_str(&format!("{}tm->itrs += 1;\n", indent(tab+1)));
    code.push_str(&format!("{}{} = 1;\n", indent(tab+1), &op));
    code.push_str(&format!("{}{} = node_take(net, get_val({}));\n", indent(tab+1), &bv, b));
    code.push_str(&format!("{}{} = get_fst({});\n", indent(tab+1), x1, &bv));
    code.push_str(&format!("{}{} = get_snd({});\n", indent(tab+1), x2, &bv));
    code.push_str(&format!("{}}}\n", indent(tab)));
    //code.push_str(&format!("{}else {{ atomic_fetch_add(&SLOW, 1); }}\n", indent(tab)));
    compile_link_fast(trg, code, book, neo, tab, def, a2, &x2);
    compile_link_fast(trg, code, book, neo, tab, def, a1, &x1);
    code.push_str(&format!("{}if (!{}) {{\n", indent(tab), &op));
    code.push_str(&format!("{}node_create(net, n{:x}, new_pair({},{}));\n", indent(tab+1), a.get_val(), x1, x2));
    link_or_store(trg, code, book, neo, tab+1, def, &format!("new_port(CON,n{:x})", a.get_val()), b);
    code.push_str(&format!("{}}}\n", indent(tab)));
    return;
  }

  // FIXME: NONE is 0xFFFFFFFF, so get_tag(NONE) returns 7, which is the SWI tag in src/hvm.c, not
  // a dedicated value. Comparing a port's tag against that value is therefore also true when the
  // port is NONE. This caused a bug in the optimization below. In general, this is a potential
  // source of bugs across the entire implementation, so we always need to check that case. An
  // alternative, of course, would be to make get_tag handle this, but I'm concerned about the
  // performance issues. So, instead, we should make sure that, across the entire codebase, we
  // never use get_tag to test for that tag on something that might be NONE.
  
  // ATOM <~ *
  // --------- Fast VOID
  // nothing
  if trg != Target::CUDA && (a.get_tag() == hvm::NUM || a.get_tag() == hvm::ERA) {
    code.push_str(&format!("{}// fast void\n", indent(tab)));
    code.push_str(&format!("{}if (get_tag({}) == ERA || get_tag({}) == NUM) {{\n", indent(tab), b, b));
    code.push_str(&format!("{}tm->itrs += 1;\n", indent(tab+1)));
    code.push_str(&format!("{}}} else {{\n", indent(tab)));
    compile_link_slow(trg, code, book, neo, tab+1, def, a, b);
    code.push_str(&format!("{}}}\n", indent(tab)));
    return;
  }

  compile_link_slow(trg, code, book, neo, tab, def, a, b);
}

// Compiles a link, without pre-defined reductions.
pub fn compile_link_slow(trg: Target, code: &mut String, book: &hvm::Book, neo: &mut usize, tab: usize, def: &hvm::Def, a: hvm::Port, b: &str) {
  let a_node = compile_node(trg, code, book, neo, tab, def, a);
  link_or_store(trg, code, book, neo, tab, def, &a_node, b);
}

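// Emits C code that links `a` to `b` when `b` already holds a port, or stores `a` into `b` when
// `b` is still NONE.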
pub fn link_or_store(trg: Target, code: &mut String, book: &hvm::Book, neo: &mut usize, tab: usize, def: &hvm::Def, a: &str, b: &str) {
  code.push_str(&format!("{}if ({} != NONE) {{\n", indent(tab), b));
  code.push_str(&format!("{}link(net, tm, {}, {});\n", indent(tab+1), a, b));
  code.push_str(&format!("{}}} else {{\n", indent(tab)));
  code.push_str(&format!("{}{} = {};\n", indent(tab+1), b, a));
  code.push_str(&format!("{}}}\n", indent(tab)));
}
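
// Illustrative sketch of the text emitted by link_or_store, assuming tab = 1,
// a = "new_port(CON,n0)" and b = "k3" (both names are hypothetical):
//
//   if (k3 != NONE) {
//     link(net, tm, new_port(CON,n0), k3);
//   } else {
//     k3 = new_port(CON,n0);
//   }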

// Compiles just a node.
pub fn compile_node(trg: Target, code: &mut String, book: &hvm::Book, neo: &mut usize, tab: usize, def: &hvm::Def, a: hvm::Port) -> String {
  if a.is_nod() {
    let nd = &def.node[a.get_val() as usize];
    let p1 = compile_node(trg, code, book, neo, tab, def, nd.get_fst());
    let p2 = compile_node(trg, code, book, neo, tab, def, nd.get_snd());
    code.push_str(&format!("{}node_create(net, n{:x}, new_pair({},{}));\n", indent(tab), a.get_val(), p1, p2));
    return format!("new_port({},n{:x})", compile_tag(trg, a.get_tag()), a.get_val());
  } else if a.is_var() {
    return format!("new_port(VAR,v{:x})", a.get_val());
  } else {
    return format!("new_port({},0x{:08x})", compile_tag(trg, a.get_tag()), a.get_val());
  }
}

// Compiles an atomic port.
//fn compile_atom(trg: Target, port: hvm::Port) -> String {
  //return format!("new_port({},0x{:08x})/*atom*/", compile_tag(trg, port.get_tag()), port.get_val());
//}

// Compiles a tag.
pub fn compile_tag(trg: Target, tag: hvm::Tag) -> &'static str {
  match tag {
    hvm::VAR => "VAR",
    hvm::REF => "REF",
    hvm::ERA => "ERA",
    hvm::NUM => "NUM",
    hvm::OPR => "OPR",
    hvm::SWI => "SWI",
    hvm::CON => "CON",
    hvm::DUP => "DUP",
    _ => unreachable!(),
  }
}

// Creates indentation.
pub fn indent(tab: usize) -> String {
  return "  ".repeat(tab);
}

// Generates a fresh name.
fn fresh(count: &mut usize) -> String {
  *count += 1;
  format!("k{}", count)
}


================================================
FILE: src/hvm.c
================================================
#include <inttypes.h>
#include <math.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#ifdef DEBUG
  #define debug(...) fprintf(stderr, __VA_ARGS__)
#else
  #define debug(...)
#endif

#define INTERPRETED
#define WITHOUT_MAIN

// Types
// --------

typedef  uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;
typedef  int32_t i32;
typedef    float f32;
typedef   double f64;

typedef  _Atomic(u8)  a8;
typedef _Atomic(u16) a16;
typedef _Atomic(u32) a32;
typedef _Atomic(u64) a64;

// Configuration
// -------------

// Threads per CPU
#ifndef TPC_L2
#define TPC_L2 0
#endif
#define TPC (1ul << TPC_L2)

// Types
// -----

// Local Types
typedef u8  Tag;  // Tag  ::= 3-bit (rounded up to u8)
typedef u32 Val;  // Val  ::= 29-bit (rounded up to u32)
typedef u32 Port; // Port ::= Tag + Val (fits a u32)
typedef u64 Pair; // Pair ::= Port + Port (fits a u64)

typedef a32 APort; // atomic Port
typedef a64 APair; // atomic Pair

// Rules
typedef u8 Rule; // Rule ::= 3-bit (rounded up to u8)

// Numbs
typedef u32 Numb; // Numb ::= 29-bit (rounded up to u32)

// Tags
#define VAR 0x0 // variable
#define REF 0x1 // reference
#define ERA 0x2 // eraser
#define NUM 0x3 // number
#define CON 0x4 // constructor
#define DUP 0x5 // duplicator
#define OPR 0x6 // operator
#define SWI 0x7 // switch

// Interaction Rule Values
#define LINK 0x0
#define CALL 0x1
#define VOID 0x2
#define ERAS 0x3
#define ANNI 0x4
#define COMM 0x5
#define OPER 0x6
#define SWIT 0x7

// Numbers
static const f32 U24_MAX = (f32) (1 << 24) - 1;
static const f32 U24_MIN = 0.0;
static const f32 I24_MAX = (f32) (1 << 23) - 1;
static const f32 I24_MIN = (f32) (i32) ((-1u) << 23);
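// Type tags (TY_*) and operation tags (OP_*). Each FP_* tag is the flipped variant of the
// matching OP_*: `operate` applies it with the two operands swapped.
#define TY_SYM 0x00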
#define TY_U24 0x01
#define TY_I24 0x02
#define TY_F24 0x03
#define OP_ADD 0x04
#define OP_SUB 0x05
#define FP_SUB 0x06
#define OP_MUL 0x07
#define OP_DIV 0x08
#define FP_DIV 0x09
#define OP_REM 0x0A
#define FP_REM 0x0B
#define OP_EQ  0x0C
#define OP_NEQ 0x0D
#define OP_LT  0x0E
#define OP_GT  0x0F
#define OP_AND 0x10
#define OP_OR  0x11
#define OP_XOR 0x12
#define OP_SHL 0x13
#define FP_SHL 0x14
#define OP_SHR 0x15
#define FP_SHR 0x16

// Constants
#define FREE 0x00000000
#define ROOT 0xFFFFFFF8
#define NONE 0xFFFFFFFF

// Cache Padding
#define CACHE_PAD 64

// Global Net
#define HLEN (1ul << 16) // max 64k high-priority redexes
#define RLEN (1ul << 24) // max 16m low-priority redexes
#define G_NODE_LEN (1ul << 29) // max 536m nodes
#define G_VARS_LEN (1ul << 29) // max 536m vars
#define G_RBAG_LEN (TPC * RLEN)

typedef struct Net {
  APair node_buf[G_NODE_LEN]; // global node buffer
  APort vars_buf[G_VARS_LEN]; // global vars buffer
  APair rbag_buf[G_RBAG_LEN]; // global rbag buffer
  a64 itrs; // interaction count
  a32 idle; // idle thread counter
} Net;

#define DEF_RBAG_LEN 0xFFF
#define DEF_NODE_LEN 0xFFF

// Top-Level Definition
typedef struct Def {
  char name[256];
  bool safe;
  u32  rbag_len;
  u32  node_len;
  u32  vars_len;
  Port root;
  Pair node_buf[DEF_NODE_LEN];
  Pair rbag_buf[DEF_RBAG_LEN];
} Def;

typedef struct Book Book;

// A Foreign Function
typedef struct {
  char name[256];
  Port (*func)(Net*, Book*, Port);
} FFn;

// Book of Definitions
typedef struct Book {
  u32 defs_len;
  Def defs_buf[0x4000];
  u32 ffns_len;
  FFn ffns_buf[0x4000];
} Book;

// Local Thread Memory
typedef struct TM {
  u32  tid; // thread id
  u32  itrs; // interaction count
  u32  nput; // next node allocation attempt index
  u32  vput; // next vars allocation attempt index
  u32  hput; // next hbag push index
  u32  rput; // next rbag push index
  u32  sidx; // steal index
  u32  nloc[0xFFF]; // global node allocation indices
  u32  vloc[0xFFF]; // global vars allocation indices
  Pair hbag_buf[HLEN]; // high-priority redexes
} TM;

// Debugger
// --------

typedef struct {
  char x[13];
} Show;

void put_u16(char* B, u16 val);
Show show_port(Port port);
Show show_rule(Rule rule);
void print_net(Net* net);
void pretty_print_numb(Numb word);
void pretty_print_port(Net* net, Book* book, Port port);

// Port: Constructor and Getters
// -----------------------------

static inline Port new_port(Tag tag, Val val) {
  return (val << 3) | tag;
}

static inline Tag get_tag(Port port) {
  return port & 7;
}

static inline Val get_val(Port port) {
  return port >> 3;
}
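
// Worked example (illustrative, derived from the definitions above):
//   new_port(CON, 5) == (5 << 3) | 0x4 == 0x2C
//   get_tag(0x2C) == 0x4 (CON), get_val(0x2C) == 5
//   get_tag(NONE) == 0x7 and get_val(NONE) == 0x1FFFFFFF, since NONE is 0xFFFFFFFF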

// Pair: Constructor and Getters
// -----------------------------

static inline const Pair new_pair(Port fst, Port snd) {
  return ((u64)snd << 32) | fst;
}

static inline Port get_fst(Pair pair) {
  return pair & 0xFFFFFFFF;
}

static inline Port get_snd(Pair pair) {
  return pair >> 32;
}
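
// Worked example (illustrative): the first port sits in the low 32 bits of a
// Pair and the second port in the high 32 bits.
//   new_pair(0x0000002C, 0x0000000B) == 0x0000000B0000002C
//   get_fst(0x0000000B0000002C) == 0x2C, get_snd(0x0000000B0000002C) == 0x0B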

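// The par flag is bit 28 (the top bit of the 29-bit Val) of a REF held in the first port of a pair.
// The three helpers below set, clear and read it; pairs whose first port is not a REF are left as-is.
Pair set_par_flag(Pair pair) {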
  Port p1 = get_fst(pair);
  Port p2 = get_snd(pair);
  if (get_tag(p1) == REF) {
    return new_pair(new_port(get_tag(p1), get_val(p1) | 0x10000000), p2);
  } else {
    return pair;
  }
}

Pair clr_par_flag(Pair pair) {
  Port p1 = get_fst(pair);
  Port p2 = get_snd(pair);
  if (get_tag(p1) == REF) {
    return new_pair(new_port(get_tag(p1), get_val(p1) & 0xFFFFFFF), p2);
  } else {
    return pair;
  }
}

bool get_par_flag(Pair pair) {
  Port p1 = get_fst(pair);
  if (get_tag(p1) == REF) {
    return (get_val(p1) >> 28) == 1;
  } else {
    return false;
  }
}

// Utils
// -----

// Swaps two ports.
static inline void swap(Port *a, Port *b) {
  Port x = *a; *a = *b; *b = x;
}

static inline u32 min(u32 a, u32 b) {
  return (a < b) ? a : b;
}

static inline f32 clamp(f32 x, f32 min, f32 max) {
  const f32 t = x < min ? min : x;
  return (t > max) ? max : t;
}

// A simple spin-wait barrier using atomic operations
a64 a_reached = 0; // number of threads that reached the current barrier
a64 a_barrier = 0; // number of barriers passed during this program
void sync_threads() {
  u64 barrier_old = atomic_load_explicit(&a_barrier, memory_order_relaxed);
  if (atomic_fetch_add_explicit(&a_reached, 1, memory_order_relaxed) == (TPC - 1)) {
    // Last thread to reach the barrier resets the counter and advances the barrier
    atomic_store_explicit(&a_reached, 0, memory_order_relaxed);
    atomic_store_explicit(&a_barrier, barrier_old + 1, memory_order_release);
  } else {
    while (atomic_load_explicit(&a_barrier, memory_order_acquire) == barrier_old) {
      sched_yield();
    }
  }
}

// Global sum function
static a32 GLOBAL_SUM = 0;
u32 global_sum(u32 x) {
  atomic_fetch_add_explicit(&GLOBAL_SUM, x, memory_order_relaxed);
  sync_threads();
  u32 sum = atomic_load_explicit(&GLOBAL_SUM, memory_order_relaxed);
  sync_threads();
  atomic_store_explicit(&GLOBAL_SUM, 0, memory_order_relaxed);
  return sum;
}

// Returns the current monotonic clock reading in nanoseconds, as a u64.
static inline u64 time64() {
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (u64)ts.tv_sec * 1000000000ULL + (u64)ts.tv_nsec;
}

// Ports / Pairs / Rules
// ---------------------

// True if this port has a pointer to a node.
static inline bool is_nod(Port a) {
  return get_tag(a) >= CON;
}

// True if this port is a variable.
static inline bool is_var(Port a) {
  return get_tag(a) == VAR;
}

// Given two ports, gets the interaction rule for their tags.
static inline Rule get_rule(Port a, Port b) {
  const u8 table[8][8] = {
    //VAR  REF  ERA  NUM  CON  DUP  OPR  SWI
    {LINK,LINK,LINK,LINK,LINK,LINK,LINK,LINK}, // VAR
    {LINK,VOID,VOID,VOID,CALL,CALL,CALL,CALL}, // REF
    {LINK,VOID,VOID,VOID,ERAS,ERAS,ERAS,ERAS}, // ERA
    {LINK,VOID,VOID,VOID,ERAS,ERAS,OPER,SWIT}, // NUM
    {LINK,CALL,ERAS,ERAS,ANNI,COMM,COMM,COMM}, // CON
    {LINK,CALL,ERAS,ERAS,COMM,ANNI,COMM,COMM}, // DUP
    {LINK,CALL,ERAS,OPER,COMM,COMM,ANNI,COMM}, // OPR
    {LINK,CALL,ERAS,SWIT,COMM,COMM,COMM,ANNI}, // SWI
  };
  return table[get_tag(a)][get_tag(b)];
}
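
// Worked example (illustrative): the table is indexed by the two tags, and it
// is symmetric, so the argument order does not change the rule.
//   get_rule(new_port(NUM, 0), new_port(OPR, 1)) == table[NUM][OPR] == OPER
//   get_rule(new_port(OPR, 1), new_port(NUM, 0)) == table[OPR][NUM] == OPER
//   get_rule(new_port(CON, 1), new_port(DUP, 2)) == table[CON][DUP] == COMM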

// Same as above, but receiving a pair.
static inline Rule get_pair_rule(Pair AB) {
  return get_rule(get_fst(AB), get_snd(AB));
}

// Should we swap ports A and B before reducing this rule?
static inline bool should_swap(Port A, Port B) {
  return get_tag(B) < get_tag(A);
}

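// Gets a rule's priority. LINK, VOID, ERAS and ANNI are high priority.
static inline bool is_high_priority(Rule rule) {
  // One bit per rule value; equivalent to the mask 0b00011101.
  const u32 mask = (1u << LINK) | (1u << VOID) | (1u << ERAS) | (1u << ANNI);
  return (bool)((mask >> rule) & 1);
}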

// Adjusts a newly allocated port.
static inline Port adjust_port(Net* net, TM* tm, Port port) {
  Tag tag = get_tag(port);
  Val val = get_val(port);
  if (is_nod(port)) return new_port(tag, tm->nloc[val]);
  if (is_var(port)) return new_port(tag, tm->vloc[val]);
  return new_port(tag, val);
}

// Adjusts a newly allocated pair.
static inline Pair adjust_pair(Net* net, TM* tm, Pair pair) {
  Port p1 = adjust_port(net, tm, get_fst(pair));
  Port p2 = adjust_port(net, tm, get_snd(pair));
  return new_pair(p1, p2);
}

// Numbs
// -----

// Constructor and getters for SYM (operation selector)
static inline Numb new_sym(u32 val) {
  return (val << 5) | TY_SYM;
}

static inline u32 get_sym(Numb word) {
  return (word >> 5);
}

// Constructor and getters for U24 (unsigned 24-bit integer)
static inline Numb new_u24(u32 val) {
  return (val << 5) | TY_U24;
}

static inline u32 get_u24(Numb word) {
  return word >> 5;
}
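
// Worked example (illustrative): a Numb keeps its type in the low 5 bits and
// its payload in the bits above them.
//   new_u24(7) == (7 << 5) | TY_U24 == 0xE1
//   get_u24(0xE1) == 7, and (0xE1 & 0x1F) == 0x01 (TY_U24)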

// Constructor and getters for I24 (signed 24-bit integer)
static inline Numb new_i24(i32 val) {
  return ((u32)val << 5) | TY_I24;
}

static inline i32 get_i24(Numb word) {
  return ((i32)word) << 3 >> 8;
}

// Constructor and getters for F24 (24-bit float)
static inline Numb new_f24(float val) {
  u32 bits = *(u32*)&val;
  u32 shifted_bits = bits >> 8;
  u32 lost_bits = bits & 0xFF;
  // round ties to even: on an exact tie (lost_bits == 0x80), round up only if the kept bits are odd
  shifted_bits += (!isnan(val)) & ((lost_bits - ((lost_bits >> 7) & ~shifted_bits)) >> 7);
  // ensure NaNs don't become infinities
  shifted_bits |= isnan(val);
  return (shifted_bits << 5) | TY_F24;
}

static inline float get_f24(Numb word) {
  u32 bits = (word << 3) & 0xFFFFFF00;
  return *(float*)&bits;
}

// Gets the 5-bit type (or operation) tag of a Numb
static inline Tag get_typ(Numb word) {
  return word & 0x1F;
}

static inline bool is_num(Numb word) {
  return get_typ(word) >= TY_U24 && get_typ(word) <= TY_F24;
}

static inline bool is_cast(Numb word) {
  return get_typ(word) == TY_SYM && get_sym(word) >= TY_U24 && get_sym(word) <= TY_F24;
}

// Partial application
static inline Numb partial(Numb a, Numb b) {
  return (b & ~0x1F) | get_sym(a);
}

// Cast a number to another type.
// The semantics are meant to spiritually resemble rust's numeric casts:
// - i24 <-> u24: is just reinterpretation of bits
// - f24  -> i24,
//   f24  -> u24: casts to the "closest" integer representing this float,
//                saturating if out of range and 0 if NaN
// - i24  -> f24,
//   u24  -> f24: casts to the "closest" float representing this integer.
static inline Numb cast(Numb a, Numb b) {
  if (get_sym(a) == TY_U24 && get_typ(b) == TY_U24) return b;
  if (get_sym(a) == TY_U24 && get_typ(b) == TY_I24) {
    // reinterpret bits
    i32 val = get_i24(b);
    return new_u24(*(u32*) &val);
  }
  if (get_sym(a) == TY_U24 && get_typ(b) == TY_F24) {
    f32 val = get_f24(b);
    if (isnan(val)) {
      return new_u24(0);
    }
    return new_u24((u32) clamp(val, U24_MIN, U24_MAX));
  }

  if (get_sym(a) == TY_I24 && get_typ(b) == TY_U24) {
    // reinterpret bits
    u32 val = get_u24(b);
    return new_i24(*(i32*) &val);
  }
  if (get_sym(a) == TY_I24 && get_typ(b) == TY_I24) return b;
  if (get_sym(a) == TY_I24 && get_typ(b) == TY_F24) {
    f32 val = get_f24(b);
    if (isnan(val)) {
      return new_i24(0);
    }
    return new_i24((i32) clamp(val, I24_MIN, I24_MAX));
  }

  if (get_sym(a) == TY_F24 && get_typ(b) == TY_U24) return new_f24((f32) get_u24(b));
  if (get_sym(a) == TY_F24 && get_typ(b) == TY_I24) return new_f24((f32) get_i24(b));
  if (get_sym(a) == TY_F24 && get_typ(b) == TY_F24) return b;

  return new_u24(0);
}

// Operate function
static inline Numb operate(Numb a, Numb b) {
  Tag at = get_typ(a);
  Tag bt = get_typ(b);
  if (at == TY_SYM && bt == TY_SYM) {
    return new_u24(0);
  }
  if (is_cast(a) && is_num(b)) {
    return cast(a, b);
  }
  if (is_cast(b) && is_num(a)) {
    return cast(b, a);
  }
  if (at == TY_SYM && bt != TY_SYM) {
    return partial(a, b);
  }
  if (at != TY_SYM && bt == TY_SYM) {
    return partial(b, a);
  }
  if (at >= OP_ADD && bt >= OP_ADD) {
    return new_u24(0);
  }
  if (at < OP_ADD && bt < OP_ADD) {
    return new_u24(0);
  }
  Tag op, ty;
  Numb swp;
  if (at >= OP_ADD) {
    op = at; ty = bt;
  } else {
    op = bt; ty = at; swp = a; a = b; b = swp;
  }
  switch (ty) {
    case TY_U24: {
      u32 av = get_u24(a);
      u32 bv = get_u24(b);
      switch (op) {
        case OP_ADD: return new_u24(av + bv);
        case OP_SUB: return new_u24(av - bv);
        case FP_SUB: return new_u24(bv - av);
        case OP_MUL: return new_u24(av * bv);
        case OP_DIV: return new_u24(av / bv);
        case FP_DIV: return new_u24(bv / av);
        case OP_REM: return new_u24(av % bv);
        case FP_REM: return new_u24(bv % av);
        case OP_EQ:  return new_u24(av == bv);
        case OP_NEQ: return new_u24(av != bv);
        case OP_LT:  return new_u24(av < bv);
        case OP_GT:  return new_u24(av > bv);
        case OP_AND: return new_u24(av & bv);
        case OP_OR:  return new_u24(av | bv);
        case OP_XOR: return new_u24(av ^ bv);
        case OP_SHL: return new_u24(av << (bv & 31));
        case FP_SHL: return new_u24(bv << (av & 31));
        case OP_SHR: return new_u24(av >> (bv & 31));
        case FP_SHR: return new_u24(bv >> (av & 31));
        default:     return new_u24(0);
      }
    }
    case TY_I24: {
      i32 av = get_i24(a);
      i32 bv = get_i24(b);
      switch (op) {
        case OP_ADD: return new_i24(av + bv);
        case OP_SUB: return new_i24(av - bv);
        case FP_SUB: return new_i24(bv - av);
        case OP_MUL: return new_i24(av * bv);
        case OP_DIV: return new_i24(av / bv);
        case FP_DIV: return new_i24(bv / av);
        case OP_REM: return new_i24(av % bv);
        case FP_REM: return new_i24(bv % av);
        case OP_EQ:  return new_u24(av == bv);
        case OP_NEQ: return new_u24(av != bv);
        case OP_LT:  return new_u24(av < bv);
        case OP_GT:  return new_u24(av > bv);
        case OP_AND: return new_i24(av & bv);
        case OP_OR:  return new_i24(av | bv);
        case OP_XOR: return new_i24(av ^ bv);
        default:     return new_i24(0);
      }
    }
    case TY_F24: {
      float av = get_f24(a);
      float bv = get_f24(b);
      switch (op) {
        case OP_ADD: return new_f24(av + bv);
        case OP_SUB: return new_f24(av - bv);
        case FP_SUB: return new_f24(bv - av);
        case OP_MUL: return new_f24(av * bv);
        case OP_DIV: return new_f24(av / bv);
        case FP_DIV: return new_f24(bv / av);
        case OP_REM: return new_f24(fmodf(av, bv));
        case FP_REM: return new_f24(fmodf(bv, av));
        case OP_EQ:  return new_u24(av == bv);
        case OP_NEQ: return new_u24(av != bv);
        case OP_LT:  return new_u24(av < bv);
        case OP_GT:  return new_u24(av > bv);
        case OP_AND: return new_f24(atan2f(av, bv));
        case OP_OR:  return new_f24(logf(bv) / logf(av));
        case OP_XOR: return new_f24(powf(av, bv));
        case OP_SHL: return new_f24(sin(av + bv));
        case OP_SHR: return new_f24(tan(av + bv));
        default:     return new_f24(0);
      }
    }
    default: return new_u24(0);
  }
}
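
// Worked example (illustrative, traced by hand through partial and operate above):
//   Numb add = new_sym(OP_ADD);          // (0x04 << 5) | TY_SYM == 0x80
//   Numb x   = new_u24(3);               // (3 << 5) | TY_U24    == 0x61
//   Numb px  = operate(add, x);          // partial: (0x61 & ~0x1F) | 0x04 == 0x64
//   Numb r   = operate(px, new_u24(5));  // OP_ADD on U24: new_u24(3 + 5) == 0x101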

// RBag
// ----

// FIXME: what about some bound checks?

static inline void push_redex(Net* net, TM* tm, Pair redex) {
  #ifdef DEBUG
  bool free_local = tm->hput < HLEN;
  bool free_global = tm->rput < RLEN;
  if (!free_global || !free_local) {
    debug("push_redex: limited resources, maybe corrupting memory\n");
  }
  #endif

  if (is_high_priority(get_pair_rule(redex))) {
    tm->hbag_buf[tm->hput++] = redex;
  } else {
    atomic_store_explicit(&net->rbag_buf[tm->tid*(G_RBAG_LEN/TPC) + (tm->rput++)], redex, memory_order_relaxed);
  }
}

static inline Pair pop_redex(Net* net, TM* tm) {
  if (tm->hput > 0) {
    return tm->hbag_buf[--tm->hput];
  } else if (tm->rput > 0) {
    return atomic_exchange_explicit(&net->rbag_buf[tm->tid*(G_RBAG_LEN/TPC) + (--tm->rput)], 0, memory_order_relaxed);
  } else {
    return 0;
  }
}

static inline u32 rbag_len(Net* net, TM* tm) {
  return tm->rput + tm->hput;
}

// TM
// --

static TM* tm[TPC];

TM* tm_new(u32 tid) {
  TM* tm   = malloc(sizeof(TM));
  tm->tid  = tid;
  tm->itrs = 0;
  tm->nput = 1;
  tm->vput = 1;
  tm->rput = 0;
  tm->hput = 0;
  tm->sidx = 0;
  return tm;
}

void alloc_static_tms() {
  for (u32 t = 0; t < TPC; ++t) {
    tm[t] = tm_new(t);
  }
}

void free_static_tms() {
  for (u32 t = 0; t < TPC; ++t) {
    free(tm[t]);
  }
}

// Net
// ----

// Stores a new node on global.
static inline void node_create(Net* net, u32 loc, Pair val) {
  atomic_store_explicit(&net->node_buf[loc], val, memory_order_relaxed);
}

// Stores a var on global.
static inline void vars_create(Net* net, u32 var, Port val) {
  atomic_store_explicit(&net->vars_buf[var], val, memory_order_relaxed);
}

// Reads a node from global.
static inline Pair node_load(Net* net, u32 loc) {
  return atomic_load_explicit(&net->node_buf[loc], memory_order_relaxed);
}

// Reads a var from global.
static inline Port vars_load(Net* net, u32 var) {
  return atomic_load_explicit(&net->vars_buf[var], memory_order_relaxed);
}

// Stores a node on global.
static inline void node_store(Net* net, u32 loc, Pair val) {
  atomic_store_explicit(&net->node_buf[loc], val, memory_order_relaxed);
}

// Exchanges a node on global by a value. Returns old.
static inline Pair node_exchange(Net* net, u32 loc, Pair val) {
  return atomic_exchange_explicit(&net->node_buf[loc], val, memory_order_relaxed);
}

// Exchanges a var on global by a value. Returns old.
static inline Port vars_exchange(Net* net, u32 var, Port val) {
  return atomic_exchange_explicit(&net->vars_buf[var], val, memory_order_relaxed);
}

// Takes a node.
static inline Pair node_take(Net* net, u32 loc) {
  return node_exchange(net, loc, 0);
}

// Takes a var.
static inline Port vars_take(Net* net, u32 var) {
  return vars_exchange(net, var, 0);
}


// Net
// ---

// Initializes a net.
static inline Net* net_new() {
  Net* net = calloc(1, sizeof(Net));

  atomic_store(&net->itrs, 0);
  atomic_store(&net->idle, 0);

  return net;
}

// Allocator
// ---------

u32 node_alloc_1(Net* net, TM* tm, u32* lps) {
  while (true) {
    u32 lc = tm->tid*(G_NODE_LEN/TPC) + (tm->nput%(G_NODE_LEN/TPC));
    Pair elem = net->node_buf[lc];
    tm->nput += 1;
    if (lc > 0 && elem == 0) {
      return lc;
    }
    // FIXME: check this decently
    if (++(*lps) >= G_NODE_LEN/TPC) printf("OOM\n");
  }
}

u32 vars_alloc_1(Net* net, TM* tm, u32* lps) {
  while (true) {
    u32 lc = tm->tid*(G_VARS_LEN/TPC) + (tm->vput%(G_VARS_LEN/TPC));
    Port elem = net->vars_buf[lc];
    tm->vput += 1;
    if (lc > 0 && elem == 0) {
      return lc;
    }
    // FIXME: check this decently
    if (++(*lps) >= G_VARS_LEN/TPC) printf("OOM\n");
  }
}

u32 node_alloc(Net* net, TM* tm, u32 num) {
  u32 got = 0;
  u32 lps = 0;
  while (got < num) {
    u32 lc = tm->tid*(G_NODE_LEN/TPC) + (tm->nput%(G_NODE_LEN/TPC));
    Pair elem = net->node_buf[lc];
    tm->nput += 1;
    if (lc > 0 && elem == 0) {
      tm->nloc[got++] = lc;
    }
    // FIXME: check this decently
    if (++lps >= G_NODE_LEN/TPC) printf("OOM\n");
  }
  return got;
}

u32 vars_alloc(Net* net, TM* tm, u32 num) {
  u32 got = 0;
  u32 lps = 0;
  while (got < num) {
    u32 lc = tm->tid*(G_VARS_LEN/TPC) + (tm->vput%(G_VARS_LEN/TPC));
    Port elem = net->vars_buf[lc];
    tm->vput += 1;
    if (lc > 0 && elem == 0) {
      tm->vloc[got++] = lc;
    }
    // FIXME: check this decently
    if (++lps >= G_VARS_LEN/TPC) printf("OOM\n");
  }
  return got;
}
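
// Worked example (illustrative, assuming a hypothetical build with TPC_L2 = 2):
//   TPC == 4, so each thread scans its own slice of G_NODE_LEN/TPC == (1ul << 27) entries.
//   Thread tid == 1 with nput == 1 first probes lc == 1*(1ul << 27) + 1 == 134217729.
//   With the default TPC_L2 == 0, thread 0 with nput == 1 probes lc == 1; index 0 is
//   never handed out, because of the `lc > 0` test.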

// Gets the necessary resources for an interaction. Returns success.
static inline bool get_resources(Net* net, TM* tm, u32 need_rbag, u32 need_node, u32 need_vars) {
  u32 got_rbag = min(RLEN - tm->rput, HLEN - tm->hput);
  u32 got_node = node_alloc(net, tm, need_node);
  u32 got_vars = vars_alloc(net, tm, need_vars);

  return got_rbag >= need_rbag && got_node >= need_node && got_vars >= need_vars;
}

// Linking
// -------

// Peeks a variable's final target without modifying it.
static inline Port peek(Net* net, Port var) {
  while (get_tag(var) == VAR) {
    Port val = vars_load(net, get_val(var));
    if (val == NONE) break;
    if (val == 0) break;
    var = val;
  }
  return var;
}

// Finds a variable's value.
static inline Port enter(Net* net, Port var) {
  // While `var` is a VAR: follow its substitution (as an optimization)
  while (get_tag(var) == VAR) {
    // Takes the current `var` substitution as `val`
    Port val = vars_exchange(net, get_val(var), NONE);
    // If there was no `val`, stop, as there is nothing to follow
    if (val == NONE || val == 0) {
      break;
    }
    // Otherwise, delete `var` (we own both) and continue
    vars_take(net, get_val(var));
    var = val;
  }
  return var;
}

// Atomically Links `A ~ B`.
static inline void link(Net* net, TM* tm, Port A, Port B) {
  // Attempts to directionally point `A ~> B`
  while (true) {
    // If `A` is not a VAR but `B` is: swap `A` and `B`, so the VAR comes first
    if (get_tag(A) != VAR && get_tag(B) == VAR) {
      Port X = A; A = B; B = X;
    }

    // If `A` is still not a VAR (so neither port is): create the `A ~ B` redex
    if (get_tag(A) != VAR) {
      push_redex(net, tm, new_pair(A, B)); // TODO: move global ports to local
      break;
    }

    // Extends B (as an optimization)
    B = enter(net, B);

    // Since `A` is VAR: point `A ~> B`.
    // Stores `A -> B`, taking the current `A` subst as `A'`
    Port A_ = vars_exchange(net, get_val(A), B);
    // If there was no `A'`, stop, as we lost B's ownership
    if (A_ == NONE) {
      break;
    }
    //if (A_ == 0) { ? } // FIXME: must handle on the move-to-global algo
    // Otherwise, delete `A` (we own both) and link `A' ~ B`
    vars_take(net, get_val(A));
    A = A_;
  }
}
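
// Worked trace (illustrative, single thread, assuming variable `x` starts as NONE):
//   link(VAR x, CON n): vars_exchange(x, CON n) returns NONE, so `x` now holds
//                       CON n and link stops.
//   link(VAR x, DUP m): vars_exchange(x, DUP m) returns CON n; `x` is cleared with
//                       vars_take, `A` becomes CON n, and the next loop iteration
//                       pushes the redex CON n ~ DUP m.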

// Links `A ~ B` (as a pair).
static inline void link_pair(Net* net, TM* tm, Pair AB) {
  link(net, tm, get_fst(AB), get_snd(AB));
}

// Interactions
// ------------

// The Link Interaction.
static inline bool interact_link(Net* net, TM* tm, Port a, Port b) {
  // Allocates needed nodes and vars.
  if (!get_resources(net, tm, 1, 0, 0)) {
    debug("interact_link: get_resources failed\n");
    return false;
  }

  // Links.
  link_pair(net, tm, new_pair(a, b));

  return true;
}

// Declared here for use in call interactions.
static inline bool interact_eras(Net* net, TM* tm, Port a, Port b);

// The Call Interaction.
#ifdef COMPILED
///COMPILED_INTERACT_CALL///
#else
static inline bool interact_call(Net* net, TM* tm, Port a, Port b, Book* book) {
  // Loads Definition. The mask drops the par flag (bit 28) from the REF's value.
  u32  fid = get_val(a) & 0xFFFFFFF;
  Def* def = &book->defs_buf[fid];

  // Copy Optimization.
  if (def->safe && get_tag(b) == DUP) {
    return interact_eras(net, tm, a, b);
  }

  // Allocates needed nodes and vars.
  if (!get_resources(net, tm, def->rbag_len + 1, def->node_len, def->vars_len)) {
    debug("interact_call: get_resources failed\n");
    return false;
  }

  // Stores new vars.
  for (u32 i = 0; i < def->vars_len; ++i) {
    vars_create(net, tm->vloc[i], NONE);
  }

  // Stores new nodes.
  for (u32 i = 0; i < def->node_len; ++i) {
    node_create(net, tm->nloc[i], adjust_pair(net, tm, def->node_buf[i]));
  }

  // Links.
  for (u32 i = 0; i < def->rbag_len; ++i) {
    link_pair(net, tm, adjust_pair(net, tm, def->rbag_buf[i]));
  }
  link_pair(net, tm, new_pair(adjust_port(net, tm, def->root), b));

  return true;
}
#endif

// The Void Interaction.
static inline bool interact_void(Net* net, TM* tm, Port a, Port b) {
  return true;
}

// The Eras Interaction.
static inline bool interact_eras(Net* net, TM* tm, Port a, Port b) {
  // Allocates needed nodes and vars.
  if (!get_resources(net, tm, 2, 0, 0)) {
    debug("interact_eras: get_resources failed\n");
    return false;
  }

  // Checks availability
  if (node_load(net, get_val(b)) == 0) {
    return false;
  }

  // Loads ports.
  Pair B  = node_exchange(net, get_val(b), 0);
  Port B1 = get_fst(B);
  Port B2 = get_snd(B);

  // Links.
  link_pair(net, tm, new_pair(a, B1));
  link_pair(net, tm, new_pair(a, B2));

  return true;
}

// The Anni Interaction.
static inline bool interact_anni(Net* net, TM* tm, Port a, Port b) {
  // Allocates needed nodes and vars.
  if (!get_resources(net, tm, 2, 0, 0)) {
    debug("interact_anni: get_resources failed\n");
    return false;
  }

  // Checks availability
  if (node_load(net, get_val(a)) == 0 || node_load(net, get_val(b)) == 0) {
    return false;
  }

  // Loads ports.
  Pair A  = node_take(net, get_val(a));
  Port A1 = get_fst(A);
  Port A2 = get_snd(A);
  Pair B  = node_take(net, get_val(b));
  Port B1 = get_fst(B);
  Port B2 = get_snd(B);

  // Links.
  link_pair(net, tm, new_pair(A1, B1));
  link_pair(net, tm, new_pair(A2, B2));

  return true;
}

// The Comm Interaction.
static inline bool interact_comm(Net* net, TM* tm, Port a, Port b) {
  // Allocates needed nodes and vars.
  if (!get_resources(net, tm, 4, 4, 4)) {
    debug("interact_comm: get_resources failed\n");
    return false;
  }

  // Checks availability
  if (node_load(net, get_val(a)) == 0 || node_load(net, get_val(b)) == 0) {
    return false;
  }

  // Loads ports.
  Pair A  = node_take(net, get_val(a));
  Port A1 = get_fst(A);
  Port A2 = get_snd(A);
  Pair B  = node_take(net, get_val(b));
  Port B1 = get_fst(B);
  Port B2 = get_snd(B);

  // Stores new vars.
  vars_create(net, tm->vloc[0], NONE);
  vars_create(net, tm->vloc[1], NONE);
  vars_create(net, tm->vloc[2], NONE);
  vars_create(net, tm->vloc[3], NONE);

  // Stores new nodes.
  node_create(net, tm->nloc[0], new_pair(new_port(VAR, tm->vloc[0]), new_port(VAR, tm->vloc[1])));
  node_create(net, tm->nloc[1], new_pair(new_port(VAR, tm->vloc[2]), new_port(VAR, tm->vloc[3])));
  node_create(net, tm->nloc[2], new_pair(new_port(VAR, tm->vloc[0]), new_port(VAR, tm->vloc[2])));
  node_create(net, tm->nloc[3], new_pair(new_port(VAR, tm->vloc[1]), new_port(VAR, tm->vloc[3])));

  // Links.
  link_pair(net, tm, new_pair(new_port(get_tag(b), tm->nloc[0]), A1));
  link_pair(net, tm, new_pair(new_port(get_tag(b), tm->nloc[1]), A2));
  link_pair(net, tm, new_pair(new_port(get_tag(a), tm->nloc[2]), B1));
  link_pair(net, tm, new_pair(new_port(get_tag(a), tm->nloc[3]), B2));

  return true;
}

// The Oper Interaction.
static inline bool interact_oper(Net* net, TM* tm, Port a, Port b) {
  // Reserves 1 redex slot and 1 node; no new vars are needed.
  if (!get_resources(net, tm, 1, 1, 0)) {
    debug("interact_oper: get_resources failed\n");
    return false;
  }

  // Checks availability
  if (node_load(net, get_val(b)) == 0) {
    return false;
  }

  // Loads ports.
  Val  av = get_val(a);
  Pair B  = node_take(net, get_val(b));
  Port B1 = get_fst(B);
  Port B2 = enter(net, get_snd(B));

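  // Performs operation. If the other operand (B1) is already a number, the
  // result is computed now. Otherwise, a new OPR node holding `a` is linked
  // to B1, so the operation resumes once B1 becomes a number.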
  if (get_tag(B1) == NUM) {
    Val  bv = get_val(B1);
    Numb cv = operate(av, bv);
    link_pair(net, tm, new_pair(new_port(NUM, cv), B2));
  } else {
    node_create(net, tm->nloc[0], new_pair(a, B2));
    link_pair(net, tm, new_pair(B1, new_port(OPR, tm->nloc[0])));
  }

  return true;
}

// The Swit Interaction.
static inline bool interact_swit(Net* net, TM* tm, Port a, Port b) {
  // Reserves 1 redex slot and 2 nodes; no new vars are needed.
  if (!get_resources(net, tm, 1, 2, 0)) {
    debug("interact_swit: get_resources failed\n");
    return false;
  }

  // Checks availability
  if (node_load(net, get_val(b)) == 0) {
    return false;
  }

  // Loads ports.
  u32  av = get_u24(get_val(a));
  Pair B  = node_take(net, get_val(b));
  Port B1 = get_fst(B);
  Port B2 = get_snd(B);

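  // Stores new nodes. On zero, selects the first branch: B1 ~ (B2 *).
  // Otherwise, selects the second branch, passing the predecessor:
  // B1 ~ (* (av-1 B2)).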
  if (av == 0) {
    node_create(net, tm->nloc[0], new_pair(B2, new_port(ERA,0)));
    link_pair(net, tm, new_pair(new_port(CON, tm->nloc[0]), B1));
  } else {
    node_create(net, tm->nloc[0], new_pair(new_port(ERA,0), new_port(CON, tm->nloc[1])));
    node_create(net, tm->nloc[1], new_pair(new_port(NUM, new_u24(av-1)), B2));
    link_pair(net, tm, new_pair(new_port(CON, tm->nloc[0]), B1));
  }

  return true;
}

// Pops a local redex and performs a single interaction.
static inline bool interact(Net* net, TM* tm, Book* book) {
  // Pops a redex.
  Pair redex = pop_redex(net, tm);

  // Proceeds only if a redex was popped; an empty bag is not an error.
  if (redex != 0) {
    // Gets redex ports A and B.
    Port a = get_fst(redex);
    Port b = get_snd(redex);

    // Gets the rule type.
    Rule rule = get_rule(a, b);

    // Used for root redex.
    if (get_tag(a) == REF && b == ROOT) {
      rule = CALL;
    // Swaps ports if necessary.
    } else if (should_swap(a,b)) {
      swap(&a, &b);
    }

    // Dispatches interaction rule.
    bool success;
    switch (rule) {
      case LINK: success = interact_link(net, tm, a, b); break;
      #ifdef COMPILED
      case CALL: success = interact_call(net, tm, a, b); break;
      #else
      case CALL: success = interact_call(net, tm, a, b, book); break;
      #endif
      case VOID: success = interact_void(net, tm, a, b); break;
      case ERAS: success = interact_eras(net, tm, a, b); break;
      case ANNI: success = interact_anni(net, tm, a, b); break;
      case COMM: success = interact_comm(net, tm, a, b); break;
      case OPER: success = interact_oper(net, tm, a, b); break;
      case SWIT: success = interact_swit(net, tm, a, b); break;
    }

    // If error, pushes redex back.
    if (!success) {
      push_redex(net, tm, redex);
      return false;
    // Else, increments the interaction count.
    } else if (rule != LINK) {
      tm->itrs += 1;
    }
  }

  return true;
}

// Evaluator
// ---------

void evaluator(Net* net, TM* tm, Book* book) {
  // Initializes the global idle counter. Thread 0 starts busy (see `busy`
  // below), so only TPC - 1 threads count as idle at this point.
  atomic_store_explicit(&net->idle, TPC - 1, memory_order_relaxed);
  sync_threads();

  // Performs some interactions
  u32  tick = 0;
  bool busy = tm->tid == 0;
  while (true) {
    tick += 1;

    // If we have redexes...
    if (rbag_len(net, tm) > 0) {
      // Update global idle counter
      if (!busy) atomic_fetch_sub_explicit(&net->idle, 1, memory_order_relaxed);
      busy = true;

      // Perform an interaction
      #ifdef DEBUG
      if (!interact(net, tm, book)) debug("interaction failed\n");
      #else
      interact(net, tm, book);
      #endif
    // If we have no redexes...
    } else {
      // Update global idle counter
      if (busy) atomic_fetch_add_explicit(&net->idle, 1, memory_order_relaxed);
      busy = false;

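      // Picks the steal target: the previous thread (wrapping around),
      // scanning its slice of the global redex bag one slot per attempt.
      u32 sid = (tm->tid + TPC - 1) % TPC;
      u32 idx = sid*(G_RBAG_LEN/TPC) + (tm->sidx++);

      // Stealing Everything: atomically takes whatever redex is in that slot,
      // so repeated attempts can drain all of the target's redexes.
      Pair got = atomic_exchange_explicit(&net->rbag_buf[idx], 0, memory_order_relaxed);
      if (got != 0) {
        push_redex(net, tm, got);
        continue;
      } else {
        // Empty slot: restarts the scan from the target's first slot.
        tm->sidx = 0;
      }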

      // Chill...
      sched_yield();
      // Halt if all threads are idle
      if (tick % 256 == 0) {
        if (atomic_load_explicit(&net->idle, memory_order_relaxed) == TPC) {
          break;
        }
      }
    }
  }

  sync_threads();

  atomic_fetch_add(&net->itrs, tm->itrs);
  tm->itrs = 0;
}

// Normalizer
// ----------

// Thread data
typedef struct {
  Net*  net;
  TM*   tm;
  Book* book;
} ThreadArg;

void* thread_func(void* arg) {
  ThreadArg* data = (ThreadArg*)arg;
  evaluator(data->net, data->tm, data->book);
  return NULL;
}

// Sets the initial redex.
void boot_redex(Net* net, Pair redex) {
  net->vars_buf[get_val(ROOT)] = NONE;
  net->rbag_buf[0] = redex;
}

// Evaluates all redexes.
// TODO: cache threads to avoid spawning overhead
void normalize(Net* net, Book* book) {
  // Inits thread_arg objects
  ThreadArg thread_arg[TPC];
  for (u32 t = 0; t < TPC; ++t) {
    thread_arg[t].net  = net;
    thread_arg[t].tm   = tm[t];
    thread_arg[t].book = book;
  }

  // Spawns the evaluation threads
  pthread_t threads[TPC];
  for (u32 t = 0; t < TPC; ++t) {
    pthread_create(&threads[t], NULL, thread_func, &thread_arg[t]);
  }

  // Wait for the threads to finish
  for (u32 t = 0; t < TPC; ++t) {
    pthread_join(threads[t], NULL);
  }
}

// Util: expands a REF Port.
Port expand(Net* net, Book* book, Port port) {
  Port old = vars_load(net, get_val(ROOT));
  Port got = peek(net, port);
  while (get_tag(got) == REF) {
    boot_redex(net, new_pair(got, ROOT));
    normalize(net, book);
    got = peek(net, vars_load(net, get_val(ROOT)));
  }
  vars_create(net, get_val(ROOT), old);
  return got;
}

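// Reads back an image into a width*height ARGB buffer.
// Encoding: (<tree>,<tree>) | #RRGGBB
// Even levels split the current rectangle along x, odd levels along y.
// Each NUM leaf fills its rectangle with an opaque color.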
void read_img(Net* net, Port port, u32 width, u32 height, u32* buffer) {
  //pretty_print_port(net, port);
  //printf("\n");
  typedef struct {
    Port port; u32 lv;
    u32 x0; u32 x1;
    u32 y0; u32 y1;
  } Rect;
  Rect stk[24];
  u32 pos = 0;
  stk[pos++] = (Rect){port, 0, 0, width, 0, height};
  while (pos > 0) {
    Rect rect = stk[--pos];
    Port port = enter(net, rect.port);
    u32  lv   = rect.lv;
    u32  x0   = rect.x0;
    u32  x1   = rect.x1;
    u32  y0   = rect.y0;
    u32  y1   = rect.y1;
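    if (get_tag(port) == CON) {
      // Stops instead of overflowing the fixed-size stack on very deep trees.
      if (pos + 2 > sizeof(stk) / sizeof(stk[0])) {
        break;
      }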
      Pair nd = node_load(net, get_val(port));
      Port p1 = get_fst(nd);
      Port p2 = get_snd(nd);
      u32  xm = (x0 + x1) / 2;
      u32  ym = (y0 + y1) / 2;
      if (lv % 2 == 0) {
        stk[pos++] = (Rect){p2, lv+1, xm, x1, y0, y1};
        stk[pos++] = (Rect){p1, lv+1, x0, xm, y0, y1};
      } else {
        stk[pos++] = (Rect){p2, lv+1, x0, x1, ym, y1};
        stk[pos++] = (Rect){p1, lv+1, x0, x1, y0, ym};
      }
      continue;
    }
    if (get_tag(port) == NUM) {
      u32 color = get_u24(get_val(port));
      printf("COL=%08x x0=%04u x1=%04u y0=%04u y1=%04u | %s\n", color, x0, x1, y0, y1, show_port(port).x);
      for (u32 y = y0; y < y1; y++) {
        for (u32 x = x0; x < x1; x++) {
          buffer[y*width + x] = 0xFF000000 | color;
        }
      }
      continue;
    }
    break;
  }
}


//#ifdef IO_DRAWIMAGE
//// Global variables for the window and renderer
//static SDL_Window *window = NULL;
//static SDL_Renderer *renderer = NULL;
//static SDL_Texture *texture = NULL;
//// Function to close the SDL window and clean up resources
//void close_sdl(void) {
  //if (texture != NULL) {
    //SDL_DestroyTexture(texture);
    //texture = NULL;
  //}
  //if (renderer != NULL) {
    //SDL_DestroyRenderer(renderer);
    //renderer = NULL;
  //}
  //if (window != NULL) {
    //SDL_DestroyWindow(window);
    //window = NULL;
  //}
  //SDL_Quit();
//}
//// Function to render an image to the SDL window
//void render(uint32_t width, uint32_t height, uint32_t *buffer) {
  //// Initialize SDL if it hasn't been initialized
  //if (SDL_WasInit(SDL_INIT_VIDEO) == 0) {
    //if (SDL_Init(SDL_INIT_VIDEO) < 0) {
      //fprintf(stderr, "SDL could not initialize! SDL Error: %s\n", SDL_GetError());
      //return;
    //}
  //}
  //// Create window and renderer if they don't exist
  //if (window == NULL) {
    //window = SDL_CreateWindow("SDL Window", SDL_WINDOWPOS_UNDEFINED, SDL_WINDOWPOS_UNDEFINED, width, height, SDL_WINDOW_SHOWN);
    //if (window == NULL) {
      //fprintf(stderr, "Window could not be created! SDL Error: %s\n", SDL_GetError());
      //return;
    //}
    //renderer = SDL_CreateRenderer(window, -1, SDL_RENDERER_ACCELERATED | SDL_RENDERER_PRESENTVSYNC);
    //if (renderer == NULL) {
      //SDL_DestroyWindow(window);
      //window = NULL;
      //fprintf(stderr, "Renderer could not be created! SDL Error: %s\n", SDL_GetError());
      //return;
    //}
  //}
  //// Create or recreate the texture if necessary
  //if (texture == NULL) {
    //texture = SDL_CreateTexture(renderer, SDL_PIXELFORMAT_ARGB8888, SDL_TEXTUREACCESS_STREAMING, width, height);
    //if (texture == NULL) {
      //fprintf(stderr, "Texture could not be created! SDL Error: %s\n", SDL_GetError());
      //return;
    //}
  //}
  //// Update the texture with the new buffer
  //if (SDL_UpdateTexture(texture, NULL, buffer, width * sizeof(uint32_t)) < 0) {
    //fprintf(stderr, "Texture could not be updated! SDL Error: %s\n", SDL_GetError());
    //return;
  //}
  //// Clear the renderer
  //SDL_RenderClear(renderer);
  //// Copy the texture to the renderer
  //SDL_RenderCopy(renderer, texture, NULL, NULL);
  //// Update the screen
  //SDL_RenderPresent(renderer);
  //// Process events to prevent the OS from thinking the application is unresponsive
  //SDL_Event e;
  //while (SDL_PollEvent(&e)) {
    //if (e.type == SDL_QUIT) {
      //close_sdl();
      //exit(0);
    //}
  //}
//}
//// IO: DrawImage
//Port io_put_image(Net* net, Book* book, u32 argc, Port* argv) {
  //u32 width = 256;
  //u32 height = 256;
  //// Create a buffer
  //uint32_t *buffer = (uint32_t *)malloc(width * height * sizeof(uint32_t));
  //if (buffer == NULL) {
    //fprintf(stderr, "Failed to allocate memory for buffer\n");
    //return 1;
  //}
  //// Initialize buffer to a dark blue background
  //for (int i = 0; i < width * height; ++i) {
    //buffer[i] = 0xFF000030; // Dark blue background
  //}
  //// Converts a HVM2 tuple-encoded quadtree to a color buffer
  //read_img(net, argv[0], width, height, buffer);
  //// Render the buffer to the screen
  //render(width, height, buffer);
  //// Wait some time
  //SDL_Delay(2000);
  //// Free the buffer
  //free(buffer);
  //return new_port(ERA, 0);
//}
//#else
//// IO: DrawImage
//Port io_put_image(Net* net, Book* book, u32 argc, Port* argv) {
  //printf("DRAWIMAGE: disabled.\n");
  //printf("Image rendering is a WIP. For now, to enable it, you must:\n");
  //printf("1. Generate a C file, with `hvm gen-c your_file.hvm`.\n");
  //printf("2. Manually un-comment the '#define IO_DRAWIMAGE' line on it.\n");
  //printf("3. Have SDL installed and compile it with '-lSDL2'.\n");
  //return new_port(ERA, 0);
//}
//#endif

// Book Loader
// -----------

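// Loads a serialized book from a buffer of u32 words. Layout, as read below:
//   defs_len, then for each def:
//   fid, name (64 words), safe, rbag_len, node_len, vars_len, root,
//   rbag_buf (rbag_len pairs), node_buf (node_len pairs)
// Returns false if a def exceeds the redex or node capacity.
bool book_load(Book* book, u32* buf) {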
  // Reads defs_len
  book->defs_len = *buf++;

  // Parses each def
  for (u32 i = 0; i < book->defs_len; ++i) {
    // Reads fid
    u32 fid = *buf++;

    // Gets def
    Def* def = &book->defs_buf[fid];

    // Reads name (256 bytes = 64 u32 words)
    memcpy(def->name, buf, 256);
    buf += 64;

    // Reads safe flag
    def->safe = *buf++;

    // Reads lengths
    def->rbag_len = *buf++;
    def->node_len = *buf++;
    def->vars_len = *buf++;

    if (def->rbag_len > DEF_RBAG_LEN) {
      fprintf(stderr, "def '%s' has too many redexes: %u\n", def->name, def->rbag_len);
      return false;
    }

    if (def->node_len > DEF_NODE_LEN) {
      fprintf(stderr, "def '%s' has too many nodes: %u\n", def->name, def->node_len);
      return false;
    }

    // Reads root
    def->root = *buf++;

    // Reads rbag_buf (each pair is 8 bytes = 2 u32 words)
    memcpy(def->rbag_buf, buf, 8*def->rbag_len);
    buf += def->rbag_len * 2;

    // Reads node_buf (each pair is 8 bytes = 2 u32 words)
    memcpy(def->node_buf, buf, 8*def->node_len);
    buf += def->node_len * 2;
  }

  return true;
}

// Debug Printing
// --------------

void put_u32(char* B, u32 val) {
  for (int i = 0; i < 8; i++, val >>= 4) {
    B[8-i-1] = "0123456789ABCDEF"[val & 0xF];
  }
}

Show show_port(Port port) {
  // NOTE: the string is built manually because sprintf did not seem to work here
  Show s;
  switch (get_tag(port)) {
    case VAR: memcpy(s.x, "VAR:", 4); put_u32(s.x+4, get_val(port)); break;
    case REF: memcpy(s.x, "REF:", 4); put_u32(s.x+4, get_val(port)); break;
    case ERA: memcpy(s.x, "ERA:________", 12); break;
    case NUM: memcpy(s.x, "NUM:", 4); put_u32(s.x+4, get_val(port)); break;
    case CON: memcpy(s.x, "CON:", 4); put_u32(s.x+4, get_val(port)); break;
    case DUP: memcpy(s.x, "DUP:", 4); put_u32(s.x+4, get_val(port)); break;
    case OPR: memcpy(s.x, "OPR:", 4); put_u32(s.x+4, get_val(port)); break;
    case SWI: memcpy(s.x, "SWI:", 4); put_u32(s.x+4, get_val(port)); break;
  }
  s.x[12] = '\0';
  return s;
}

Show show_rule(Rule rule) {
  Show s;
  switch (rule) {
    case LINK: memcpy(s.x, "LINK", 4); break;
    case VOID: memcpy(s.x, "VOID", 4); break;
    case ERAS: memcpy(s.x, "ERAS", 4); break;
    case ANNI: memcpy(s.x, "ANNI", 4); break;
    case COMM: memcpy(s.x, "COMM", 4); break;
    case OPER: memcpy(s.x, "OPER", 4); break;
    case SWIT: memcpy(s.x, "SWIT", 4); break;
    case CALL: memcpy(s.x, "CALL", 4); break;
    default  : memcpy(s.x, "????", 4); break;
  }
  s.x[4] = '\0';
  return s;
}

//void print_rbag(RBag* rbag) {
  //printf("RBAG | FST-TREE     | SND-TREE    \n");
  //printf("---- | ------------ | ------------\n");
  //for (u32 i = rbag->lo_ini; i < rbag->lo_end; ++i) {
    //Pair redex = rbag->lo_buf[i%RLEN];
    //printf("%04X | %s | %s\n", i, show_port(get_fst(redex)).x, show_port(get_snd(redex)).x);
  //}
  //for (u32 i = 0; i < rbag->hi_end; ++i) {
    //Pair redex = rbag->hi_buf[i];
    //printf("%04X | %s | %s\n", i, show_port(get_fst(redex)).x, show_port(get_snd(redex)).x);
  //}
  //printf("==== | ============ | ============\n");
//}

void print_net(Net* net) {
  printf("NODE | PORT-1       | PORT-2      \n");
  printf("---- | ------------ | ------------\n");
  for (u32 i = 0; i < G_NODE_LEN; ++i) {
    Pair node = node_load(net, i);
    if (node != 0) {
      printf("%04X | %s | %s\n", i, show_port(get_fst(node)).x, show_port(get_snd(node)).x);
    }
  }
  printf("==== | ============ |\n");
  printf("VARS | VALUE        |\n");
  printf("---- | ------------ |\n");
  for (u32 i = 0; i < G_VARS_LEN; ++i) {
    Port var = vars_load(net,i);
    if (var != 0) {
      // Prints the value already loaded, instead of loading the slot again.
      printf("%04X | %s |\n", i, show_port(var).x);
    }
  }
  printf("==== | ============ |\n");
}

void pretty_print_numb(Numb word) {
  switch (get_typ(word)) {
    case TY_SYM: {
      switch (get_sym(word)) {
        // types
        case TY_U24: printf("[u24]"); break;
        case TY_I24: printf("[i24]"); break;
        case TY_F24: printf("[f24]"); break;
        // operations
        case OP_ADD: printf("[+]"); break;
        case OP_SUB: printf("[-]"); break;
        case FP_SUB: printf("[:-]"); break;
        case OP_MUL: printf("[*]"); break;
        case OP_DIV: printf("[/]"); break;
        case FP_DIV: printf("[:/]"); break;
        case OP_REM: printf("[%%]"); break;
        case FP_REM: printf("[:%%]"); break;
        case OP_EQ:  printf("[=]"); break;
        case OP_NEQ: printf("[!]"); break;
        case OP_LT:  printf("[<]"); break;
        case OP_GT:  printf("[>]"); break;
        case OP_AND: printf("[&]"); break;
        case OP_OR:  printf("[|]"); break;
        case OP_XOR: printf("[^]"); break;
        case OP_SHL: printf("[<<]"); break;
        case FP_SHL: printf("[:<<]"); break;
        case OP_SHR: printf("[>>]"); break;
        case FP_SHR: printf("[:>>]"); break;
        default:     printf("[?]"); break;
      }
      break;
    }
    case TY_U24: {
      printf("%u", get_u24(word));
      break;
    }
    case TY_I24: {
      printf("%+d", get_i24(word));
      break;
    }
    case TY_F24: {
      if (isinf(get_f24(word))) {
        if (signbit(get_f24(word))) {
          printf("-inf");
        } else {
          printf("+inf");
        }
      } else if (isnan(get_f24(word))) {
        printf("+NaN");
      } else {
        printf("%.7e", get_f24(word));
      }
      break;
    }
    default: {
      switch (get_typ(word)) {
        case OP_ADD: printf("[+0x%07X]", get_u24(word)); break;
        case OP_SUB: printf("[-0x%07X]", get_u24(word)); break;
        case FP_SUB: printf("[:-0x%07X]", get_u24(word)); break;
        case OP_MUL: printf("[*0x%07X]", get_u24(word)); break;
        case OP_DIV: printf("[/0x%07X]", get_u24(word)); break;
        case FP_DIV: printf("[:/0x%07X]", get_u24(word)); break;
        case OP_REM: printf("[%%0x%07X]", get_u24(word)); break;
        case FP_REM: printf("[:%%0x%07X]", get_u24(word)); break;
        case OP_EQ:  printf("[=0x%07X]", get_u24(word)); break;
        case OP_NEQ: printf("[!0x%07X]", get_u24(word)); break;
        case OP_LT:  printf("[<0x%07X]", get_u24(word)); break;
        case OP_GT:  printf("[>0x%07X]", get_u24(word)); break;
        case OP_AND: printf("[&0x%07X]", get_u24(word)); break;
        case OP_OR:  printf("[|0x%07X]", get_u24(word)); break;
        case OP_XOR: printf("[^0x%07X]", get_u24(word)); break;
        case OP_SHL: printf("[<<0x%07X]", get_u24(word)); break;
        case FP_SHL: printf("[:<<0x%07X]", get_u24(word)); break;
        case OP_SHR: printf("[>>0x%07X]", get_u24(word)); break;
        case FP_SHR: printf("[:>>0x%07X]", get_u24(word)); break;
        default:     printf("[?0x%07X]", get_u24(word)); break;
      }
      break;
    }
  }

}

void pretty_print_port(Net* net, Book* book, Port port) {
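  // Explicit DFS stack. ERA ports with a non-zero value are used as markers
  // carrying a literal character (' ', ')' or '}') to print later.
  Port stack[4096];
  stack[0] = port;
  u32 len = 1;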
  while (len > 0) {
    Port cur = stack[--len];
    switch (get_tag(cur)) {
      case CON: {
        Pair node = node_load(net,get_val(cur));
        Port p2   = get_snd(node);
        Port p1   = get_fst(node);
        printf("(");
        stack[len++] = new_port(ERA, (u32)(')'));
        stack[len++] = p2;
        stack[len++] = new_port(ERA, (u32)(' '));
        stack[len++] = p1;
        break;
      }
      case ERA: {
        if (get_val(cur) != 0) {
          printf("%c", (char)get_val(cur));
        } else {
          printf("*");
        }
        break;
      }
      case VAR: {
        Port got = vars_load(net, get_val(cur));
        if (got != NONE) {
          stack[len++] = got;
        } else {
          printf("x%x", get_val(cur));
        }
        break;
      }
      case NUM: {
        pretty_print_numb(get_val(cur));
        break;
      }
      case DUP: {
        Pair node = node_load(net,get_val(cur));
        Port p2   = get_snd(node);
        Port p1   = get_fst(node);
        printf("{");
        stack[len++] = new_port(ERA, (u32)('}'));
        stack[len++] = p2;
        stack[len++] = new_port(ERA, (u32)(' '));
        stack[len++] = p1;
        break;
      }
      case OPR: {
        Pair node = node_load(net,get_val(cur));
        Port p2   = get_snd(node);
        Port p1   = get_fst(node);
        printf("$(");
        stack[len++] = new_port(ERA, (u32)(')'));
        stack[len++] = p2;
        stack[len++] = new_port(ERA, (u32)(' '));
        stack[len++] = p1;
        break;
      }
      case SWI: {
        Pair node = node_load(net,get_val(cur));
        Port p2   = get_snd(node);
        Port p1   = get_fst(node);
        printf("?(");
        stack[len++] = new_port(ERA, (u32)(')'));
        stack[len++] = p2;
        stack[len++] = new_port(ERA, (u32)(' '));
        stack[len++] = p1;
        break;
      }
      case REF: {
        u32  fid = get_val(cur) & 0xFFFFFFF;
        Def* def = &book->defs_buf[fid];
        printf("@%s", def->name);
        break;
      }
    }
  }
}

//void pretty_print_rbag(Net* net, RBag* rbag) {
  //for (u32 i = rbag->lo_ini; i < rbag->lo_end; ++i) {
    //Pair redex = rbag->lo_buf[i];
    //if (redex != 0) {
      //pretty_print_port(net, get_fst(redex));
      //printf(" ~ ");
      //pretty_print_port(net, get_snd(redex));
      //printf("\n");
    //}
  //}
  //for (u32 i = 0; i < rbag->hi_end; ++i) {
    //Pair redex = rbag->hi_buf[i];
    //if (redex != 0) {
      //pretty_print_port(net, get_fst(redex));
      //printf(" ~ ");
      //pretty_print_port(net, get_snd(redex));
      //printf("\n");
    //}
  //}
//}

// Demos
// -----

  // stress_test 2^10 x 65536
  //static const u8 BOOK_BUF[] = {6, 0, 0, 0, 0, 0, 0, 0, 109, 97, 105, 110, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 4, 0, 0, 0, 11, 10, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 102, 117, 110, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 1, 0, 0, 0, 4, 0, 0, 0, 15, 0, 0, 0, 0, 0, 0, 0, 20, 0, 0, 0, 0, 0, 0, 0, 17, 0, 0, 0, 25, 0, 0, 0, 2, 0, 0, 0, 102, 117, 110, 95, 95, 67, 48, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 33, 0, 0, 0, 4, 0, 0, 0, 11, 0, 0, 1, 0, 0, 0, 0, 3, 0, 0, 0, 102, 117, 110, 95, 95, 67, 49, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 6, 0, 0, 0, 4, 0, 0, 0, 4, 0, 0, 0, 9, 0, 0, 128, 20, 0, 0, 0, 9, 0, 0, 128, 44, 0, 0, 0, 13, 0, 0, 0, 16, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 30, 0, 0, 0, 3, 4, 0, 0, 38, 0, 0, 0, 24, 0, 0, 0, 16, 0, 0, 0, 8, 0, 0, 0, 24, 0, 0, 0, 4, 0, 0, 0, 108, 111, 111, 112, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 1, 0, 0, 0, 4, 0, 0, 0, 15, 0, 0, 0, 0, 0, 0, 0, 20, 0, 0, 0, 0, 0, 0, 0, 11, 0, 0, 0, 41, 0, 0, 0, 5, 0, 0, 0, 108, 111, 111, 112, 95, 95, 67, 48, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 33, 0, 0, 0, 0, 0, 0, 0};

  // stress_test 2^18 x 65536
  //static const u8 BOOK_BUF[] = {6, 0, 0, 0, 0, 0, 0, 0, 109, 97, 105, 110, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 4, 0, 0, 0, 11, 18, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 102, 117, 110, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 1, 0, 0, 0, 4, 0, 0, 0, 15, 0, 0, 0, 0, 0, 0, 0, 20, 0, 0, 0, 0, 0, 0, 0, 17, 0, 0, 0, 25, 0, 0, 0, 2, 0, 0, 0, 102, 117, 110, 95, 95, 67, 48, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 33, 0, 0, 0, 4, 0, 0, 0, 11, 0, 0, 1, 0, 0, 0, 0, 3, 0, 0, 0, 102, 117, 110, 95, 95, 67, 49, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 6, 0, 0, 0, 4, 0, 0, 0, 4, 0, 0, 0, 9, 0, 0, 128, 20, 0, 0, 0, 9, 0, 0, 128, 44, 0, 0, 0, 13, 0, 0, 0, 16, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 30, 0, 0, 0, 3, 4, 0, 0, 38, 0, 0, 0, 24, 0, 0, 0, 16, 0, 0, 0, 8, 0, 0, 0, 24, 0, 0, 0, 4, 0, 0, 0, 108, 111, 111, 112, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 1, 0, 0, 0, 4, 0, 0, 0, 15, 0, 0, 0, 0, 0, 0, 0, 20, 0, 0, 0, 0, 0, 0, 0, 11, 0, 0, 0, 41, 0, 0, 0, 5, 0, 0, 0, 108, 111, 111, 112, 95, 95, 67, 48, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 33, 0, 0, 0, 0, 0, 0, 0};

  // bitonic_sort 2^20
  //static const u8 BOOK_BUF[] = {19, 0, 0, 0, 0, 0, 0, 0, 109, 97, 105, 110, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 89, 0, 0, 0, 4, 0, 0, 0, 11, 18, 0, 0, 12, 0, 0, 0, 65, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 100, 111, 119, 110, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 4, 0, 0, 0, 4, 0, 0, 0, 15, 0, 0, 0, 60, 0, 0, 0, 20, 0, 0, 0, 44, 0, 0, 0, 28, 0, 0, 0, 17, 0, 0, 0, 0, 0, 0, 0, 36, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 52, 0, 0, 0, 16, 0, 0, 0, 24, 0, 0, 0, 
16, 0, 0, 0, 68, 0, 0, 0, 8, 0, 0, 0, 24, 0, 0, 0, 2, 0, 0, 0, 100, 111, 119, 110, 95, 95, 67, 48, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 13, 0, 0, 0, 8, 0, 0, 0, 4, 0, 0, 0, 25, 0, 0, 128, 60, 0, 0, 0, 25, 0, 0, 128, 84, 0, 0, 0, 13, 0, 0, 0, 20, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 28, 0, 0, 0, 36, 0, 0, 0, 16, 0, 0, 0, 24, 0, 0, 0, 45, 0, 0, 0, 52, 0, 0, 0, 32, 0, 0, 0, 40, 0, 0, 0, 48, 0, 0, 0, 56, 0, 0, 0, 0, 0, 0, 0, 68, 0, 0, 0, 32, 0, 0, 0, 76, 0, 0, 0, 16, 0, 0, 0, 48, 0, 0, 0, 8, 0, 0, 0, 92, 0, 0, 0, 40, 0, 0, 0, 100, 0, 0, 0, 24, 0, 0, 0, 56, 0, 0, 0, 3, 0, 0, 0, 102, 108, 111, 119, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 4, 0, 0, 0, 4, 0, 0, 0, 15, 0, 0, 0, 60, 0, 0, 0, 20, 0, 0, 0, 44, 0, 0, 0, 28, 0, 0, 0, 33, 0, 0, 0, 0, 0, 0, 0, 36, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 52, 0, 0, 0, 16, 0, 0, 0, 24, 0, 0, 0, 16, 0, 0, 0, 68, 0, 0, 0, 8, 0, 0, 0, 24, 0, 0, 0, 4, 0, 0, 0, 102, 108, 111, 119, 95, 95, 67, 48, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 14, 0, 0, 0, 8, 0, 0, 0, 4, 0, 0, 0, 9, 0, 0, 0, 60, 0, 0, 0, 129, 0, 0, 0, 84, 0, 0, 0, 13, 0, 0, 0, 28, 0, 0, 0, 22, 0, 0, 0, 8, 0, 0, 0, 35, 1, 0, 0, 0, 0, 0, 0, 36, 0, 0, 0, 44, 0, 0, 0, 16, 0, 0, 0, 24, 0, 0, 0, 53, 0, 0, 0, 48, 0, 0, 0, 32, 0, 0, 0, 40, 0, 0, 0, 0, 0, 0, 0, 68, 0, 0, 0, 32, 0, 0, 0, 76, 0, 0, 0, 56, 0, 0, 0, 48, 0, 0, 0, 8, 0, 0, 0, 92, 0, 0, 0, 40, 0, 0, 0, 100, 0, 0, 0, 16, 0, 0, 0, 108, 0, 0, 0, 24, 0, 0, 0, 56, 0, 0, 0, 5, 0, 0, 0, 103, 101, 110, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 2, 0, 0, 0, 4, 0, 0, 0, 15, 0, 0, 0, 8, 0, 0, 0, 20, 0, 0, 0, 8, 0, 0, 0, 28, 0, 0, 0, 49, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 103, 101, 110, 95, 95, 67, 48, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 12, 0, 0, 0, 6, 0, 0, 0, 4, 0, 0, 0, 41, 0, 0, 128, 68, 0, 0, 0, 41, 0, 0, 128, 84, 0, 0, 0, 13, 0, 0, 0, 20, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 29, 0, 0, 0, 60, 0, 0, 0, 38, 0, 0, 0, 54, 0, 0, 0, 59, 2, 0, 0, 46, 0, 0, 0, 35, 1, 0, 0, 16, 0, 0, 0, 59, 2, 0, 0, 24, 0, 0, 0, 32, 0, 0, 0, 40, 0, 0, 0, 0, 0, 0, 0, 76, 0, 0, 0, 16, 0, 0, 0, 32, 0, 0, 0, 8, 0, 0, 0, 92, 0, 0, 0, 24, 0, 0, 0, 40, 0, 0, 0, 7, 0, 0, 0, 109, 97, 105, 110, 95, 95, 67, 48, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 41, 0, 0, 0, 4, 0, 0, 0, 11, 18, 0, 0, 12, 0, 0, 0, 11, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 109, 97, 105, 110, 95, 95, 67, 49, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
SYMBOL INDEX (372 symbols across 12 files)

FILE: build.rs
  function main (line 1) | fn main() {

FILE: examples/stress/main.js
  function sum (line 1) | function sum(n) {
  function fun (line 9) | function fun(n) {

FILE: examples/stress/main.py
  function sum (line 1) | def sum(n):
  function fun (line 7) | def fun(n):

FILE: examples/sum_rec/main.js
  function sum (line 1) | function sum(a, b) {

FILE: src/ast.rs
  type Numb (line 11) | pub struct Numb(pub u32);
    method show (line 236) | pub fn show(&self) -> String {
  type Tree (line 14) | pub enum Tree {
    method show (line 319) | pub fn show(&self) -> String {
    method readback (line 365) | pub fn readback(net: &hvm::GNet, port: hvm::Port, fids: &BTreeMap<hvm:...
    method build (line 433) | pub fn build(&self, def: &mut hvm::Def, fids: &BTreeMap<String, hvm::V...
    method direct_dependencies (line 491) | pub fn direct_dependencies<'name>(&'name self) -> BTreeSet<&'name str> {
  type Redex (line 25) | pub type Redex = (bool, Tree, Tree);
  type Net (line 28) | pub struct Net {
    method show (line 334) | pub fn show(&self) -> String {
    method readback (line 417) | pub fn readback(net: &hvm::GNet, book: &hvm::Book) -> Option<Net> {
    method build (line 512) | pub fn build(&self, def: &mut hvm::Def, fids: &BTreeMap<String, hvm::V...
  type Book (line 33) | pub struct Book {
    method show (line 348) | pub fn show(&self) -> String {
    method parse (line 528) | pub fn parse(code: &str) -> ParseResult<Self> {
    method build (line 532) | pub fn build(&self) -> hvm::Book {
    method propagate_safety (line 571) | fn propagate_safety(&self, compiled_book: &mut hvm::Book, lookup: &BTr...
    method direct_dependents (line 611) | fn direct_dependents<'name>(&'name self) -> BTreeMap<&'name str, BTree...
  type ParseResult (line 40) | pub type ParseResult<T> = std::result::Result<T, ParseError>;
  function parse_numb_sym (line 46) | pub fn parse_numb_sym(&mut self) -> ParseResult<Numb> {
  function parse_numb_lit (line 102) | pub fn parse_numb_lit(&mut self) -> ParseResult<Numb> {
  function parse_numb (line 124) | pub fn parse_numb(&mut self) -> ParseResult<Numb> {
  function parse_tree (line 136) | pub fn parse_tree(&mut self) -> ParseResult<Tree> {
  function parse_net (line 195) | pub fn parse_net(&mut self) -> ParseResult<Net> {
  function parse_book (line 211) | pub fn parse_book(&mut self) -> ParseResult<Book> {
  function try_consume (line 223) | fn try_consume(&mut self, str: &str) -> bool {

FILE: src/cmp.rs
  type Target (line 7) | pub enum Target { CUDA, C }
  function compile_book (line 10) | pub fn compile_book(trg: Target, book: &hvm::Book) -> String {
  function compile_def (line 37) | pub fn compile_def(trg: Target, code: &mut String, book: &hvm::Book, tab...
  function compile_link_fast (line 110) | pub fn compile_link_fast(trg: Target, code: &mut String, book: &hvm::Boo...
  function compile_link_slow (line 293) | pub fn compile_link_slow(trg: Target, code: &mut String, book: &hvm::Boo...
  function link_or_store (line 299) | pub fn link_or_store(trg: Target, code: &mut String, book: &hvm::Book, n...
  function compile_node (line 308) | pub fn compile_node(trg: Target, code: &mut String, book: &hvm::Book, ne...
  function compile_tag (line 328) | pub fn compile_tag(trg: Target, tag: hvm::Tag) -> &'static str {
  function indent (line 343) | pub fn indent(tab: usize) -> String {
  function fresh (line 348) | fn fresh(count: &mut usize) -> String {

FILE: src/hvm.c
  type u8 (line 23) | typedef  uint8_t  u8;
  type u16 (line 24) | typedef uint16_t u16;
  type u32 (line 25) | typedef uint32_t u32;
  type u64 (line 26) | typedef uint64_t u64;
  type i32 (line 27) | typedef  int32_t i32;
  type f32 (line 28) | typedef    float f32;
  type f64 (line 29) | typedef   double f64;
  type a8 (line 31) | typedef  _Atomic(u8)  a8;
  type a16 (line 32) | typedef _Atomic(u16) a16;
  type a32 (line 33) | typedef _Atomic(u32) a32;
  type a64 (line 34) | typedef _Atomic(u64) a64;
  type u8 (line 49) | typedef u8  Tag;
  type u32 (line 50) | typedef u32 Val;
  type u32 (line 51) | typedef u32 Port;
  type u64 (line 52) | typedef u64 Pair;
  type a32 (line 54) | typedef a32 APort;
  type a64 (line 55) | typedef a64 APair;
  type u8 (line 58) | typedef u8 Rule;
  type u32 (line 61) | typedef u32 Numb;
  type Net (line 127) | typedef struct Net {
  type Def (line 139) | typedef struct Def {
  type Book (line 150) | typedef struct Book Book;
  type FFn (line 153) | typedef struct {
  type Book (line 159) | typedef struct Book {
  type TM (line 167) | typedef struct TM {
  type Show (line 183) | typedef struct {
  function Port (line 197) | static inline Port new_port(Tag tag, Val val) {
  function Tag (line 201) | static inline Tag get_tag(Port port) {
  function Val (line 205) | static inline Val get_val(Port port) {
  function Pair (line 212) | static inline const Pair new_pair(Port fst, Port snd) {
  function Port (line 216) | static inline Port get_fst(Pair pair) {
  function Port (line 220) | static inline Port get_snd(Pair pair) {
  function Pair (line 224) | Pair set_par_flag(Pair pair) {
  function Pair (line 234) | Pair clr_par_flag(Pair pair) {
  function get_par_flag (line 244) | bool get_par_flag(Pair pair) {
  function swap (line 257) | static inline void swap(Port *a, Port *b) {
  function u32 (line 261) | static inline u32 min(u32 a, u32 b) {
  function f32 (line 265) | static inline f32 clamp(f32 x, f32 min, f32 max) {
  function sync_threads (line 273) | void sync_threads() {
  function u32 (line 289) | u32 global_sum(u32 x) {
  function u64 (line 299) | static inline u64 time64() {
  function is_nod (line 309) | static inline bool is_nod(Port a) {
  function is_var (line 314) | static inline bool is_var(Port a) {
  function Rule (line 319) | static inline Rule get_rule(Port a, Port b) {
  function Rule (line 335) | static inline Rule get_pair_rule(Pair AB) {
  function should_swap (line 340) | static inline bool should_swap(Port A, Port B) {
  function is_high_priority (line 345) | static inline bool is_high_priority(Rule rule) {
  function Port (line 351) | static inline Port adjust_port(Net* net, TM* tm, Port port) {
  function Pair (line 360) | static inline Pair adjust_pair(Net* net, TM* tm, Pair pair) {
  function Numb (line 370) | static inline Numb new_sym(u32 val) {
  function u32 (line 374) | static inline u32 get_sym(Numb word) {
  function Numb (line 379) | static inline Numb new_u24(u32 val) {
  function u32 (line 383) | static inline u32 get_u24(Numb word) {
  function Numb (line 388) | static inline Numb new_i24(i32 val) {
  function i32 (line 392) | static inline i32 get_i24(Numb word) {
  function Numb (line 397) | static inline Numb new_f24(float val) {
  function get_f24 (line 408) | static inline float get_f24(Numb word) {
  function Tag (line 414) | static inline Tag get_typ(Numb word) {
  function is_num (line 418) | static inline bool is_num(Numb word) {
  function is_cast (line 422) | static inline bool is_cast(Numb word) {
  function Numb (line 427) | static inline Numb partial(Numb a, Numb b) {
  function Numb (line 439) | static inline Numb cast(Numb a, Numb b) {
  function Numb (line 476) | static inline Numb operate(Numb a, Numb b) {
  function push_redex (line 589) | static inline void push_redex(Net* net, TM* tm, Pair redex) {
  function Pair (line 605) | static inline Pair pop_redex(Net* net, TM* tm) {
  function u32 (line 615) | static inline u32 rbag_len(Net* net, TM* tm) {
  function TM (line 624) | TM* tm_new(u32 tid) {
  function alloc_static_tms (line 636) | void alloc_static_tms() {
  function free_static_tms (line 642) | void free_static_tms() {
  function node_create (line 652) | static inline void node_create(Net* net, u32 loc, Pair val) {
  function vars_create (line 657) | static inline void vars_create(Net* net, u32 var, Port val) {
  function Pair (line 662) | static inline Pair node_load(Net* net, u32 loc) {
  function Port (line 667) | static inline Port vars_load(Net* net, u32 var) {
  function node_store (line 672) | static inline void node_store(Net* net, u32 loc, Pair val) {
  function Pair (line 677) | static inline Pair node_exchange(Net* net, u32 loc, Pair val) {
  function Port (line 682) | static inline Port vars_exchange(Net* net, u32 var, Port val) {
  function Pair (line 687) | static inline Pair node_take(Net* net, u32 loc) {
  function Port (line 692) | static inline Port vars_take(Net* net, u32 var) {
  function Net (line 701) | static inline Net* net_new() {
  function u32 (line 713) | u32 node_alloc_1(Net* net, TM* tm, u32* lps) {
  function u32 (line 726) | u32 vars_alloc_1(Net* net, TM* tm, u32* lps) {
  function u32 (line 739) | u32 node_alloc(Net* net, TM* tm, u32 num) {
  function u32 (line 755) | u32 vars_alloc(Net* net, TM* tm, u32 num) {
  function get_resources (line 772) | static inline bool get_resources(Net* net, TM* tm, u32 need_rbag, u32 ne...
  function Port (line 784) | static inline Port peek(Net* net, Port var) {
  function Port (line 795) | static inline Port enter(Net* net, Port var) {
  function link (line 812) | static inline void link(Net* net, TM* tm, Port A, Port B) {
  function link_pair (line 844) | static inline void link_pair(Net* net, TM* tm, Pair AB) {
  function interact_link (line 852) | static inline bool interact_link(Net* net, TM* tm, Port a, Port b) {
  function interact_call (line 872) | static inline bool interact_call(Net* net, TM* tm, Port a, Port b, Book*...
  function interact_void (line 909) | static inline bool interact_void(Net* net, TM* tm, Port a, Port b) {
  function interact_eras (line 914) | static inline bool interact_eras(Net* net, TM* tm, Port a, Port b) {
  function interact_anni (line 939) | static inline bool interact_anni(Net* net, TM* tm, Port a, Port b) {
  function interact_comm (line 967) | static inline bool interact_comm(Net* net, TM* tm, Port a, Port b) {
  function interact_oper (line 1009) | static inline bool interact_oper(Net* net, TM* tm, Port a, Port b) {
  function interact_swit (line 1041) | static inline bool interact_swit(Net* net, TM* tm, Port a, Port b) {
  function interact (line 1073) | static inline bool interact(Net* net, TM* tm, Book* book) {
  function evaluator (line 1127) | void evaluator(Net* net, TM* tm, Book* book) {
  type ThreadArg (line 1191) | typedef struct {
  function boot_redex (line 1204) | void boot_redex(Net* net, Pair redex) {
  function normalize (line 1211) | void normalize(Net* net, Book* book) {
  function Port (line 1233) | Port expand(Net* net, Book* book, Port port) {
  function read_img (line 1247) | void read_img(Net* net, Port port, u32 width, u32 height, u32* buffer) {
  function book_load (line 1408) | bool book_load(Book* book, u32* buf) {
  function put_u32 (line 1460) | void put_u32(char* B, u32 val) {
  function Show (line 1466) | Show show_port(Port port) {
  function Show (line 1483) | Show show_rule(Rule rule) {
  function print_net (line 1514) | void print_net(Net* net) {
  function pretty_print_numb (line 1535) | void pretty_print_numb(Numb word) {
  function pretty_print_port (line 1618) | void pretty_print_port(Net* net, Book* book, Port port) {
  function hvm_c (line 1743) | void hvm_c(u32* book_buffer) {
  function main (line 1794) | int main() {

FILE: src/hvm.h
  type u8 (line 11) | typedef  uint8_t  u8;
  type u16 (line 12) | typedef uint16_t u16;
  type u32 (line 13) | typedef uint32_t u32;
  type i32 (line 14) | typedef  int32_t i32;
  type u64 (line 15) | typedef uint64_t u64;
  type f32 (line 16) | typedef    float f32;
  type f64 (line 17) | typedef   double f64;
  type u8 (line 20) | typedef u8  Tag;
  type u32 (line 21) | typedef u32 Val;
  type u32 (line 22) | typedef u32 Port;
  type u64 (line 23) | typedef u64 Pair;
  type u32 (line 26) | typedef u32 Numb;
  type Net (line 67) | typedef struct Net Net;
  type Book (line 68) | typedef struct Book Book;
  type Show (line 73) | typedef struct {
  function Port (line 86) | static inline Port new_port(Tag tag, Val val) {
  function Tag (line 90) | static inline Tag get_tag(Port port) {
  function Val (line 94) | static inline Val get_val(Port port) {
  function Pair (line 101) | static inline const Pair new_pair(Port fst, Port snd) {
  function Port (line 105) | static inline Port get_fst(Pair pair) {
  function Port (line 109) | static inline Port get_snd(Pair pair) {
  function swap (line 117) | static inline void swap(Port *a, Port *b) {
  function u32 (line 121) | static inline u32 min(u32 a, u32 b) {
  function f32 (line 125) | static inline f32 clamp(f32 x, f32 min, f32 max) {
  function Numb (line 134) | static inline Numb new_sym(u32 val) {
  function u32 (line 138) | static inline u32 get_sym(Numb word) {
  function Numb (line 143) | static inline Numb new_u24(u32 val) {
  function u32 (line 147) | static inline u32 get_u24(Numb word) {
  function Numb (line 152) | static inline Numb new_i24(i32 val) {
  function i32 (line 156) | static inline i32 get_i24(Numb word) {
  function Numb (line 161) | static inline Numb new_f24(float val) {
  function get_f24 (line 172) | static inline float get_f24(Numb word) {
  function Tag (line 177) | static inline Tag get_typ(Numb word) {
  function is_num (line 181) | static inline bool is_num(Numb word) {
  function is_cast (line 185) | static inline bool is_cast(Numb word) {
  function Numb (line 190) | static inline Numb partial(Numb a, Numb b) {
  type Tup (line 198) | typedef struct Tup {
  type Str (line 208) | typedef struct Str {
  type Bytes (line 217) | typedef struct Bytes {

FILE: src/hvm.rs
  type Tag (line 9) | pub type Tag  = u8;
  type Lab (line 10) | pub type Lab  = u32;
  type Val (line 11) | pub type Val  = u32;
  type Rule (line 12) | pub type Rule = u8;
  type Port (line 16) | pub struct Port(pub Val);
    method new (line 127) | pub fn new(tag: Tag, val: Val) -> Self {
    method get_tag (line 131) | pub fn get_tag(&self) -> Tag {
    method get_val (line 135) | pub fn get_val(&self) -> Val {
    method is_nod (line 139) | pub fn is_nod(&self) -> bool {
    method is_var (line 143) | pub fn is_var(&self) -> bool {
    method get_rule (line 147) | pub fn get_rule(a: Port, b: Port) -> Rule {
    method should_swap (line 162) | pub fn should_swap(a: Port, b: Port) -> bool {
    method is_high_priority (line 166) | pub fn is_high_priority(rule: Rule) -> bool {
    method adjust_port (line 170) | pub fn adjust_port(&self, tm: &TMem) -> Port {
    method show (line 1009) | pub fn show(&self) -> String {
  type Pair (line 19) | pub struct Pair(pub u64);
    method new (line 185) | pub fn new(fst: Port, snd: Port) -> Self {
    method get_fst (line 189) | pub fn get_fst(&self) -> Port {
    method get_snd (line 193) | pub fn get_snd(&self) -> Port {
    method adjust_pair (line 197) | pub fn adjust_pair(&self, tm: &TMem) -> Pair {
    method set_par_flag (line 203) | pub fn set_par_flag(&self) -> Self {
    method get_par_flag (line 213) | pub fn get_par_flag(&self) -> bool {
    method show (line 1025) | pub fn show(&self) -> String {
  type AVal (line 22) | pub type AVal = AtomicU32;
  type APort (line 23) | pub struct APort(pub AVal);
  type APair (line 24) | pub struct APair(pub AtomicU64);
  type Numb (line 27) | pub struct Numb(pub Val);
    method new_sym (line 227) | pub fn new_sym(val: Tag) -> Self {
    method get_sym (line 231) | pub fn get_sym(&self) -> Tag {
    method new_u24 (line 237) | pub fn new_u24(val: u32) -> Self {
    method get_u24 (line 241) | pub fn get_u24(&self) -> u32 {
    method new_i24 (line 247) | pub fn new_i24(val: i32) -> Self {
    method get_i24 (line 251) | pub fn get_i24(&self) -> i32 {
    method new_f24 (line 257) | pub fn new_f24(val: f32) -> Self {
    method get_f24 (line 268) | pub fn get_f24(&self) -> f32 {
    method get_typ (line 273) | pub fn get_typ(&self) -> Tag {
    method is_num (line 277) | pub fn is_num(&self) -> bool {
    method is_cast (line 281) | pub fn is_cast(&self) -> bool {
    method partial (line 286) | pub fn partial(a: Self, b: Self) -> Self {
    method cast (line 298) | pub fn cast(a: Self, b: Self) -> Self {
    method operate (line 316) | pub fn operate(a: Self, b: Self) -> Self {
  constant U24_MAX (line 28) | const U24_MAX : u32 = (1 << 24) - 1;
  constant U24_MIN (line 29) | const U24_MIN : u32 = 0;
  constant I24_MAX (line 30) | const I24_MAX : i32 = (1 << 23) - 1;
  constant I24_MIN (line 31) | const I24_MIN : i32 = (-1) << 23;
  constant VAR (line 34) | pub const VAR : Tag = 0x0;
  constant REF (line 35) | pub const REF : Tag = 0x1;
  constant ERA (line 36) | pub const ERA : Tag = 0x2;
  constant NUM (line 37) | pub const NUM : Tag = 0x3;
  constant CON (line 38) | pub const CON : Tag = 0x4;
  constant DUP (line 39) | pub const DUP : Tag = 0x5;
  constant OPR (line 40) | pub const OPR : Tag = 0x6;
  constant SWI (line 41) | pub const SWI : Tag = 0x7;
  constant LINK (line 44) | pub const LINK : Rule = 0x0;
  constant CALL (line 45) | pub const CALL : Rule = 0x1;
  constant VOID (line 46) | pub const VOID : Rule = 0x2;
  constant ERAS (line 47) | pub const ERAS : Rule = 0x3;
  constant ANNI (line 48) | pub const ANNI : Rule = 0x4;
  constant COMM (line 49) | pub const COMM : Rule = 0x5;
  constant OPER (line 50) | pub const OPER : Rule = 0x6;
  constant SWIT (line 51) | pub const SWIT : Rule = 0x7;
  constant TY_SYM (line 54) | pub const TY_SYM : Tag = 0x00;
  constant TY_U24 (line 55) | pub const TY_U24 : Tag = 0x01;
  constant TY_I24 (line 56) | pub const TY_I24 : Tag = 0x02;
  constant TY_F24 (line 57) | pub const TY_F24 : Tag = 0x03;
  constant OP_ADD (line 58) | pub const OP_ADD : Tag = 0x04;
  constant OP_SUB (line 59) | pub const OP_SUB : Tag = 0x05;
  constant FP_SUB (line 60) | pub const FP_SUB : Tag = 0x06;
  constant OP_MUL (line 61) | pub const OP_MUL : Tag = 0x07;
  constant OP_DIV (line 62) | pub const OP_DIV : Tag = 0x08;
  constant FP_DIV (line 63) | pub const FP_DIV : Tag = 0x09;
  constant OP_REM (line 64) | pub const OP_REM : Tag = 0x0A;
  constant FP_REM (line 65) | pub const FP_REM : Tag = 0x0B;
  constant OP_EQ (line 66) | pub const OP_EQ  : Tag = 0x0C;
  constant OP_NEQ (line 67) | pub const OP_NEQ : Tag = 0x0D;
  constant OP_LT (line 68) | pub const OP_LT  : Tag = 0x0E;
  constant OP_GT (line 69) | pub const OP_GT  : Tag = 0x0F;
  constant OP_AND (line 70) | pub const OP_AND : Tag = 0x10;
  constant OP_OR (line 71) | pub const OP_OR  : Tag = 0x11;
  constant OP_XOR (line 72) | pub const OP_XOR : Tag = 0x12;
  constant OP_SHL (line 73) | pub const OP_SHL : Tag = 0x13;
  constant FP_SHL (line 74) | pub const FP_SHL : Tag = 0x14;
  constant OP_SHR (line 75) | pub const OP_SHR : Tag = 0x15;
  constant FP_SHR (line 76) | pub const FP_SHR : Tag = 0x16;
  constant FREE (line 79) | pub const FREE : Port = Port(0x0);
  constant ROOT (line 80) | pub const ROOT : Port = Port(0xFFFFFFF8);
  constant NONE (line 81) | pub const NONE : Port = Port(0xFFFFFFFF);
  type RBag (line 84) | pub struct RBag {
    method new (line 421) | pub fn new() -> Self {
    method push_redex (line 428) | pub fn push_redex(&mut self, redex: Pair) {
    method pop_redex (line 437) | pub fn pop_redex(&mut self) -> Option<Pair> {
    method len (line 445) | pub fn len(&self) -> usize {
    method has_highs (line 449) | pub fn has_highs(&self) -> bool {
    method show (line 1031) | pub fn show(&self) -> String {
  type GNet (line 90) | pub struct GNet<'a> {
  type TMem (line 99) | pub struct TMem {
    method new (line 545) | pub fn new(tid: u32, tids: u32) -> Self {
    method node_alloc (line 559) | pub fn node_alloc(&mut self, net: &GNet, num: usize) -> usize {
    method vars_alloc (line 575) | pub fn vars_alloc(&mut self, net: &GNet, num: usize) -> usize {
    method get_resources (line 591) | pub fn get_resources(&mut self, net: &GNet, _need_rbag: usize, need_no...
    method link (line 598) | pub fn link(&mut self, net: &GNet, a: Port, b: Port) {
    method link_pair (line 635) | pub fn link_pair(&mut self, net: &GNet, ab: Pair) {
    method interact_link (line 641) | pub fn interact_link(&mut self, net: &GNet, a: Port, b: Port) -> bool {
    method interact_call (line 654) | pub fn interact_call(&mut self, net: &GNet, a: Port, b: Port, book: &B...
    method interact_void (line 700) | pub fn interact_void(&mut self, _net: &GNet, _a: Port, _b: Port) -> bo...
    method interact_eras (line 705) | pub fn interact_eras(&mut self, net: &GNet, a: Port, b: Port) -> bool {
    method interact_anni (line 729) | pub fn interact_anni(&mut self, net: &GNet, a: Port, b: Port) -> bool {
    method interact_comm (line 756) | pub fn interact_comm(&mut self, net: &GNet, a: Port, b: Port) -> bool {
    method interact_oper (line 797) | pub fn interact_oper(&mut self, net: &GNet, a: Port, b: Port) -> bool {
    method interact_swit (line 829) | pub fn interact_swit(&mut self, net: &GNet, a: Port, b: Port) -> bool {
    method interact (line 860) | pub fn interact(&mut self, net: &GNet, book: &Book) -> bool {
    method evaluator (line 909) | pub fn evaluator(&mut self, net: &GNet, book: &Book) {
  type Def (line 112) | pub struct Def {
  type Book (line 122) | pub struct Book {
    method to_buffer (line 958) | pub fn to_buffer(&self, buf: &mut Vec<u8>) {
    method show (line 1077) | pub fn show(&self) -> String {
  function new (line 455) | pub fn new(nlen: usize, vlen: usize) -> Self {
  function node_create (line 465) | pub fn node_create(&self, loc: usize, val: Pair) {
  function vars_create (line 469) | pub fn vars_create(&self, var: usize, val: Port) {
  function node_load (line 473) | pub fn node_load(&self, loc: usize) -> Pair {
  function vars_load (line 477) | pub fn vars_load(&self, var: usize) -> Port {
  function node_store (line 481) | pub fn node_store(&self, loc: usize, val: Pair) {
  function vars_store (line 485) | pub fn vars_store(&self, var: usize, val: Port) {
  function node_exchange (line 489) | pub fn node_exchange(&self, loc: usize, val: Pair) -> Pair {
  function vars_exchange (line 493) | pub fn vars_exchange(&self, var: usize, val: Port) -> Port {
  function node_take (line 497) | pub fn node_take(&self, loc: usize) -> Pair {
  function vars_take (line 501) | pub fn vars_take(&self, var: usize) -> Port {
  function is_node_free (line 505) | pub fn is_node_free(&self, loc: usize) -> bool {
  function is_vars_free (line 509) | pub fn is_vars_free(&self, var: usize) -> bool {
  function enter (line 513) | pub fn enter(&self, mut var: Port) -> Port {
  method drop (line 532) | fn drop(&mut self) {
  function show (line 1048) | pub fn show(&self) -> String {
  function test_f24 (line 1161) | fn test_f24() {

FILE: src/main.rs
  function hvm_c (line 14) | fn hvm_c(book_buffer: *const u32);
  function hvm_cu (line 19) | fn hvm_cu(book_buffer: *const u32);
  function main (line 22) | fn main() {
  function run (line 160) | pub fn run(book: &hvm::Book) {

FILE: src/run.c
  type Ctr (line 7) | typedef struct Ctr {
  type Tup (line 14) | typedef struct Tup {
  type Str (line 21) | typedef struct Str {
  type Bytes (line 27) | typedef struct Bytes {
  type IOError (line 53) | typedef struct IOError {
  function readback_ctr (line 67) | Ctr readback_ctr(Net* net, Book* book, Port port) {
  function readback_tup (line 102) | extern Tup readback_tup(Net* net, Book* book, Port port, u32 size) {
  function readback_bytes (line 123) | Bytes readback_bytes(Net* net, Book* book, Port port) {
  function readback_str (line 169) | Str readback_str(Net* net, Book* book, Port port) {
  function inject_nil (line 184) | Port inject_nil(Net* net) {
  function inject_cons (line 201) | Port inject_cons(Net* net, Port head, Port tail) {
  function inject_bytes (line 224) | Port inject_bytes(Net* net, Bytes *bytes) {
  function inject_ok (line 249) | Port inject_ok(Net* net, Port val) {
  function inject_err (line 272) | Port inject_err(Net* net, Port err) {
  function inject_io_err (line 295) | Port inject_io_err(Net* net, IOError err) {
  function inject_io_err_type (line 338) | Port inject_io_err_type(Net* net) {
  function inject_io_err_name (line 347) | Port inject_io_err_name(Net* net) {
  function inject_io_err_inner (line 356) | Port inject_io_err_inner(Net* net, Port val) {
  function inject_io_err_str (line 367) | Port inject_io_err_str(Net* net, char* err) {
  function readback_file (line 392) | FILE* readback_file(Port port) {
  function io_read (line 433) | Port io_read(Net* net, Book* book, Port argm) {
  function io_open (line 467) | Port io_open(Net* net, Book* book, Port argm) {
  function io_close (line 500) | Port io_close(Net* net, Book* book, Port argm) {
  function io_write (line 519) | Port io_write(Net* net, Book* book, Port argm) {
  function io_flush (line 547) | Port io_flush(Net* net, Book* book, Port argm) {
  function io_seek (line 569) | Port io_seek(Net* net, Book* book, Port argm) {
  function io_get_time (line 602) | Port io_get_time(Net* net, Book* book, Port argm) {
  function io_sleep (line 620) | Port io_sleep(Net* net, Book* book, Port argm) {
  function io_dl_open (line 647) | Port io_dl_open(Net* net, Book* book, Port argm) {
  function io_dl_call (line 681) | Port io_dl_call(Net* net, Book* book, Port argm) {
  function io_dl_close (line 703) | Port io_dl_close(Net* net, Book* book, Port argm) {
  function book_init (line 722) | void book_init(Book* book) {
  function do_run_io (line 740) | void do_run_io(Net* net, Book* book, Port port) {

FILE: tests/run.rs
  function test_run_programs (line 16) | fn test_run_programs() {
  function test_run_examples (line 21) | fn test_run_examples() {
  function test_dir (line 25) | fn test_dir(dir: &Path) {
  function manifest_relative (line 29) | fn manifest_relative(sub: &str) -> PathBuf {
  function test_file (line 33) | fn test_file(path: &Path) {
  function test_io_file (line 67) | fn test_io_file(path: &Path) {
  function execute_hvm (line 79) | fn execute_hvm(args: &[&OsStr], send_io: bool) -> Result<String, Box<dyn...
  function parse_output (line 112) | fn parse_output(output: &str) -> Result<String, String> {
  function normalize_vars (line 135) | fn normalize_vars(tree: &mut Tree, vars: &mut HashMap<String, usize>) {
Condensed preview: 84 files, each showing path, character count, and a content snippet.
[
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.yml",
    "chars": 1325,
    "preview": "name: Bug report\ndescription: Create a report to help us improve.\nbody:\n - type: markdown\n   attributes: \n     value: |\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/config.yml",
    "chars": 226,
    "preview": "blank_issues_enabled: false\ncontact_links: \n    - name: Bend Related Issues\n      url: https://github.com/HigherOrderCO/"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "chars": 591,
    "preview": "---\nname: Feature request\nabout: Suggest a feature that you think should be added.\ntitle: ''\nlabels: ''\n\n---\n\n**Is your "
  },
  {
    "path": ".github/workflows/bench.yml",
    "chars": 2326,
    "preview": "name: Bench\n\non:\n  pull_request:\n\nconcurrency:\n  group: bench-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  benc"
  },
  {
    "path": ".github/workflows/checks.yml",
    "chars": 1066,
    "preview": "name: Checks\n\non:\n  pull_request:\n  merge_group:\n  push:\n    branches:\n      - main\n\njobs:\n  check:\n    runs-on: ubuntu-"
  },
  {
    "path": ".github/workflows/delete-cancelled.yml",
    "chars": 334,
    "preview": "name: Delete Cancelled Benchmarks\n\non:\n  workflow_dispatch:\n    inputs:\n      run_id:\n        type: string\n        descr"
  },
  {
    "path": ".gitignore",
    "chars": 174,
    "preview": ".fill.tmp\nhvm-cuda-experiments/\nsrc/hvm\nsrc/old_cmp.rs\nsrc/tmp/\ntarget/\ntmp/\n.hvm/\nexamples/**/main\nexamples/**/*.c\nexam"
  },
  {
    "path": "Cargo.toml",
    "chars": 593,
    "preview": "[package]\nname = \"hvm\"\ndescription = \"A massively parallel, optimal functional runtime in Rust.\"\nlicense = \"Apache-2.0\"\n"
  },
  {
    "path": "LICENSE",
    "chars": 10173,
    "preview": "                                 Apache License\n                           Version 2.0, January 2004\n                   "
  },
  {
    "path": "README.md",
    "chars": 2884,
    "preview": "Higher-order Virtual Machine 2 (HVM2)\n=====================================\n\n**Higher-order Virtual Machine 2 (HVM2)** i"
  },
  {
    "path": "build.rs",
    "chars": 1828,
    "preview": "fn main() {\n  let cores = num_cpus::get();\n  let tpcl2 = (cores as f64).log2().floor() as u32;\n\n  println!(\"cargo:rerun-"
  },
  {
    "path": "examples/demo_io/main.bend",
    "chars": 555,
    "preview": "test-io = 1\n\ndef unwrap(res):\n  match res:\n    case Result/Ok:\n      return res.val\n    case Result/Err:\n      return re"
  },
  {
    "path": "examples/demo_io/main.hvm",
    "chars": 2641,
    "preview": "@IO/Call = (a (b (c (d ((@IO/Call/tag (a (b (c (d e))))) e)))))\n\n@IO/Call/tag = 1\n\n@IO/Done = (a (b ((@IO/Done/tag (a (b"
  },
  {
    "path": "examples/sort_bitonic/main.bend",
    "chars": 1087,
    "preview": "def gen(d, x):\n  switch d:\n    case 0:\n      return x\n    case _:\n      return (gen(d-1, x * 2 + 1), gen(d-1, x * 2))\n\nd"
  },
  {
    "path": "examples/sort_bitonic/main.hvm",
    "chars": 1274,
    "preview": "@down = (?(((a (* a)) @down__C0) (b (c d))) (c (b d)))\n\n@down__C0 = ({a e} ((c g) ({b f} (d h))))\n  &! @flow ~ (a (b (c "
  },
  {
    "path": "examples/sort_radix/main.bend",
    "chars": 1907,
    "preview": "\n# data Arr = Empty | (Single x) | (Concat x0 x1)\nEmpty  =         λempty λsingle λconcat empty\nSingle = λx      λempty "
  },
  {
    "path": "examples/sort_radix/main.hvm",
    "chars": 1746,
    "preview": "@Busy = (* (a (* a)))\n\n@Concat = (a (b (* (* ((a (b c)) c)))))\n\n@Empty = (a (* (* a)))\n\n@Free = (a (* (* a)))\n\n@Node = ("
  },
  {
    "path": "examples/stress/README.md",
    "chars": 685,
    "preview": "# stress\n\nThis is the basic stress-test used to test an implementation's maximum IPS. It\nrecursively creates a tree with"
  },
  {
    "path": "examples/stress/main.bend",
    "chars": 223,
    "preview": "def loop(n):\n  switch n:\n    case 0:\n      return 0\n    case _:\n      return loop(n-1)\n\ndef fun(n):\n  switch n:\n    case"
  },
  {
    "path": "examples/stress/main.hvm",
    "chars": 231,
    "preview": "@fun = (?((@fun__C0 @fun__C1) a) a)\n\n@fun__C0 = a\n  & @loop ~ (65536 a)\n\n@fun__C1 = ({a b} d)\n  &! @fun ~ (a $([+] $(c d"
  },
  {
    "path": "examples/stress/main.js",
    "chars": 227,
    "preview": "function sum(n) {\n  if (n === 0) {\n    return 0;\n  } else {\n    return n + sum(n - 1);\n  }\n}\n\nfunction fun(n) {\n  if (n "
  },
  {
    "path": "examples/stress/main.py",
    "chars": 540,
    "preview": "def sum(n):\n  if n == 0:\n    return 0\n  else:\n    return n + sum(n - 1)\n\ndef fun(n):\n  if n == 0:\n    return sum(16)\n  e"
  },
  {
    "path": "examples/sum_rec/main.bend",
    "chars": 160,
    "preview": "#flavor core\n\nsum = λn λx switch n {\n  0: x\n  _: let fst = (sum n-1 (+ (* x 2) 0))\n     let snd = (sum n-1 (+ (* x 2) 1)"
  },
  {
    "path": "examples/sum_rec/main.hvm",
    "chars": 184,
    "preview": "@main = a\n  & @sum ~ (20 (0 a))\n\n@sum = (?(((a a) @sum__C0) b) b)\n\n@sum__C0 = ({c a} ({$([*2] $([+1] d)) $([*2] $([+0] b"
  },
  {
    "path": "examples/sum_rec/main.js",
    "chars": 224,
    "preview": "function sum(a, b) {\n  if (a === b) {\n    return a;\n  } else {\n    let mid = Math.floor((a + b) / 2);\n    let fst = sum("
  },
  {
    "path": "examples/sum_rec/sum.js",
    "chars": 81,
    "preview": "var sum = 0;\n\nfor (var i = 0; i < 2**30; ++i) {\n  sum += i;\n}\n\nconsole.log(sum);\n"
  },
  {
    "path": "examples/sum_tree/main.bend",
    "chars": 205,
    "preview": "gen = λd switch d {\n  0: λx x\n  _: λx ((gen d-1 (+ (* x 2) 1)), (gen d-1 (* x 2)))\n}\n\nsum = λd λt switch d {\n  0: 1\n  _:"
  },
  {
    "path": "examples/sum_tree/main.hvm",
    "chars": 333,
    "preview": "@gen = (?(((a a) @gen__C0) b) b)\n\n@gen__C0 = ({a d} ({$([*2] $([+1] b)) $([*2] e)} (c f)))\n  &! @gen ~ (a (b c))\n  &! @g"
  },
  {
    "path": "examples/tuples/tuples.bend",
    "chars": 236,
    "preview": "type Tup8:\n  New { a, b, c, d, e, f, g, h }\n\nrot = λx match x {\n  Tup8/New: (Tup8/New x.b x.c x.d x.e x.f x.g x.h x.a)\n}"
  },
  {
    "path": "examples/tuples/tuples.hvm",
    "chars": 430,
    "preview": "@Tup8/New = (a (b (c (d (e (f (g (h ((0 (a (b (c (d (e (f (g (h i))))))))) i)))))))))\n\n@app = (?(((* (a a)) @app__C0) b)"
  },
  {
    "path": "paper/HVM2.typst",
    "chars": 55375,
    "preview": "#import \"@preview/unequivocal-ams:0.1.0\": ams-article, theorem, proof\n#import \"@preview/cetz:0.2.2\": canvas, draw\n#impor"
  },
  {
    "path": "paper/README.md",
    "chars": 304,
    "preview": "# HVM - Paper(s)\n\n- HVM2 (Work in progress)\n  - HVM2's theoretical foundations, implementation, early benchmarks, curren"
  },
  {
    "path": "paper/inet.typ",
    "chars": 959,
    "preview": "#import \"@preview/cetz:0.2.2\": draw, canvas\n\n#let port = (name, pos, dir) => {\n  import draw: *\n  group(name: name, {\n  "
  },
  {
    "path": "src/ast.rs",
    "chars": 19716,
    "preview": "use TSPL::{new_parser, Parser, ParseError};\nuse highlight_error::highlight_error;\nuse crate::hvm;\nuse std::fmt::{Debug, "
  },
  {
    "path": "src/cmp.rs",
    "chars": 16307,
    "preview": "use crate::ast;\nuse crate::hvm;\n\nuse std::collections::HashMap;\n\n#[derive(Clone, Copy, PartialEq, Eq)]\npub enum Target {"
  },
  {
    "path": "src/hvm.c",
    "chars": 80962,
    "preview": "#include <inttypes.h>\n#include <math.h>\n#include <pthread.h>\n#include <stdatomic.h>\n#include <stdbool.h>\n#include <stdin"
  },
  {
    "path": "src/hvm.cu",
    "chars": 98148,
    "preview": "#define INTERPRETED\n#define WITHOUT_MAIN\n\n#ifdef DEBUG\n  #define debug(...) printf(__VA_ARGS__)\n#else\n  #define debug(.."
  },
  {
    "path": "src/hvm.cuh",
    "chars": 5169,
    "preview": "#ifndef hvm_cuh_INCLUDED\n#define hvm_cuh_INCLUDED\n\n#include <math.h>\n#include <stdint.h>\n\n// Types\n// -----\n\ntypedef  ui"
  },
  {
    "path": "src/hvm.h",
    "chars": 5268,
    "preview": "#ifndef hvm_h_INCLUDED\n#define hvm_h_INCLUDED\n\n#include <math.h>\n#include <stdint.h>\n#include <stdbool.h>\n\n// Types\n// -"
  },
  {
    "path": "src/hvm.rs",
    "chars": 36232,
    "preview": "use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};\nuse std::alloc::{alloc, dealloc, Layout};\nuse std::mem;\n\n// Run"
  },
  {
    "path": "src/lib.rs",
    "chars": 115,
    "preview": "#![allow(dead_code)]\n#![allow(unused_imports)]\n#![allow(unused_variables)]\n\npub mod ast;\npub mod cmp;\npub mod hvm;\n"
  },
  {
    "path": "src/main.rs",
    "chars": 7350,
    "preview": "#![allow(dead_code)]\n#![allow(unused_imports)]\n#![allow(unused_variables)]\n\nuse clap::{Arg, ArgAction, Command};\nuse ::h"
  },
  {
    "path": "src/run.c",
    "chars": 21867,
    "preview": "#include <dlfcn.h>\n#include <errno.h>\n#include <stdio.h>\n#include \"hvm.c\"\n\n// Readback: λ-Encoded Ctr\ntypedef struct Ctr"
  },
  {
    "path": "src/run.cu",
    "chars": 24750,
    "preview": "#include <dlfcn.h>\n#include <errno.h>\n#include <stdio.h>\n#include \"hvm.cu\"\n\n// Readback: λ-Encoded Ctr\nstruct Ctr {\n  u3"
  },
  {
    "path": "tests/programs/empty.hvm",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "tests/programs/hello-world.hvm",
    "chars": 531,
    "preview": "@String/Cons = (a (b ((@String/Cons/tag (a (b c))) c)))\n\n@String/Cons/tag = 1\n\n@String/Nil = ((@String/Nil/tag a) a)\n\n@S"
  },
  {
    "path": "tests/programs/io/basic.bend",
    "chars": 555,
    "preview": "test-io = 1\n\ndef unwrap(res):\n  match res:\n    case Result/Ok:\n      return res.val\n    case Result/Err:\n      return re"
  },
  {
    "path": "tests/programs/io/basic.hvm",
    "chars": 2641,
    "preview": "@IO/Call = (a (b (c (d ((@IO/Call/tag (a (b (c (d e))))) e)))))\n\n@IO/Call/tag = 1\n\n@IO/Done = (a (b ((@IO/Done/tag (a (b"
  },
  {
    "path": "tests/programs/io/invalid-name.bend",
    "chars": 108,
    "preview": "test-io = 1\n\ndef main():\n  with IO:\n    f <- call(\"INVALID-NAME\", (\"./README.md\", \"r\"))\n\n    return wrap(f)\n"
  },
  {
    "path": "tests/programs/io/invalid-name.hvm",
    "chars": 1617,
    "preview": "@IO/Call = (a (b (c (d ((@IO/Call/tag (a (b (c (d e))))) e)))))\n\n@IO/Call/tag = 1\n\n@IO/Done = (a (b ((@IO/Done/tag (a (b"
  },
  {
    "path": "tests/programs/io/open1.bend",
    "chars": 98,
    "preview": "test-io = 1\n\ndef main():\n  with IO:\n    f <- call(\"OPEN\", (\"./LICENSE\", \"r\"))\n\n    return wrap(f)\n"
  },
  {
    "path": "tests/programs/io/open1.hvm",
    "chars": 1311,
    "preview": "@IO/Call = (a (b (c (d ((@IO/Call/tag (a (b (c (d e))))) e)))))\n\n@IO/Call/tag = 1\n\n@IO/Done = (a (b ((@IO/Done/tag (a (b"
  },
  {
    "path": "tests/programs/io/open2.bend",
    "chars": 98,
    "preview": "test-io = 1\n\ndef main():\n  with IO:\n    f <- call(\"OPEN\", (\"fake-file\", \"r\"))\n\n    return wrap(f)\n"
  },
  {
    "path": "tests/programs/io/open2.hvm",
    "chars": 1318,
    "preview": "@IO/Call = (a (b (c (d ((@IO/Call/tag (a (b (c (d e))))) e)))))\n\n@IO/Call/tag = 1\n\n@IO/Done = (a (b ((@IO/Done/tag (a (b"
  },
  {
    "path": "tests/programs/io/open3.bend",
    "chars": 133,
    "preview": "test-io = 1\n\ndef main():\n  with IO:\n    # calling open with an unexpected type of arg\n    f <- call(\"OPEN\", 123)\n\n    re"
  },
  {
    "path": "tests/programs/io/open3.hvm",
    "chars": 988,
    "preview": "@IO/Call = (a (b (c (d ((@IO/Call/tag (a (b (c (d e))))) e)))))\n\n@IO/Call/tag = 1\n\n@IO/Done = (a (b ((@IO/Done/tag (a (b"
  },
  {
    "path": "tests/programs/list.hvm",
    "chars": 552,
    "preview": "@List/Cons = (a (b ((@List/Cons/tag (a (b c))) c)))\n\n@List/Cons/tag = 1\n\n@List/Nil = ((@List/Nil/tag a) a)\n\n@List/Nil/ta"
  },
  {
    "path": "tests/programs/numeric-casts.hvm",
    "chars": 2508,
    "preview": "@main = x & @tu0 ~ (* x)\n\n// casting to u24\n@tu0 = (* {n x}) & @tu1 ~ (* x) &  0          ~ $([u24] n) // 0\n@tu1 = (* {n"
  },
  {
    "path": "tests/programs/numerics/f24.hvm",
    "chars": 1998,
    "preview": "@half = xN & 1.0 ~ $([/] $(2.0 xN))\n@nan = xN & 0.0 ~ $([/] $(0.0 xN))\n\n@main = x & @t0 ~ (* x)\n\n// nan and inf division"
  },
  {
    "path": "tests/programs/numerics/i24.hvm",
    "chars": 1168,
    "preview": "@main = x & @t0 ~ (* x)\n\n// all ops\n@t0 = (* {n x}) & @t1 ~ (* x) & +10 ~ $([+]  $(+2 n)) // 12\n@t1 = (* {n x}) & @t2 ~ "
  },
  {
    "path": "tests/programs/numerics/u24.hvm",
    "chars": 1084,
    "preview": "@main = x & @t0 ~ (* x)\n\n// all ops\n@t0 = (* {n x}) & @t1 ~ (* x) & 10 ~ $([+]  $(2 n)) // 12\n@t1 = (* {n x}) & @t2 ~ (*"
  },
  {
    "path": "tests/programs/safety-check.hvm",
    "chars": 532,
    "preview": "@List/Cons = (a (b ((@List/Cons/tag (a (b c))) c)))\n\n@List/Cons/tag = 1\n\n@List/Nil = ((@List/Nil/tag a) a)\n\n@List/Nil/ta"
  },
  {
    "path": "tests/run.rs",
    "chars": 4264,
    "preview": "use std::{\n  collections::HashMap,\n  error::Error,\n  ffi::OsStr,\n  fs,\n  io::{Read, Write},\n  path::{Path, PathBuf},\n  p"
  },
  {
    "path": "tests/snapshots/run__file@empty.hvm.snap",
    "chars": 257,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: tests/programs/empty.hvm\n---\nexit status: 101\nthread 'main'"
  },
  {
    "path": "tests/snapshots/run__file@hello-world.hvm.snap",
    "chars": 538,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: tests/programs/hello-world.hvm\n---\nResult: ((@String/Cons/t"
  },
  {
    "path": "tests/snapshots/run__file@list.hvm.snap",
    "chars": 239,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: tests/programs/list.hvm\n---\nResult: ((@List/Cons/tag (((@Li"
  },
  {
    "path": "tests/snapshots/run__file@numeric-casts.hvm.snap",
    "chars": 390,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: tests/programs/numeric-casts.hvm\n---\nResult: {0 {1234 {4321"
  },
  {
    "path": "tests/snapshots/run__file@numerics__f24.hvm.snap",
    "chars": 317,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: tests/programs/numerics/f24.hvm\n---\nResult: {+inf {-inf {+N"
  },
  {
    "path": "tests/snapshots/run__file@numerics__i24.hvm.snap",
    "chars": 199,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: tests/programs/i24.hvm\n---\nResult: {+12 {+8 {+20 {+5 {+0 {0"
  },
  {
    "path": "tests/snapshots/run__file@numerics__u24.hvm.snap",
    "chars": 177,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: tests/programs/u24.hvm\n---\nResult: {12 {8 {20 {5 {0 {0 {1 {"
  },
  {
    "path": "tests/snapshots/run__file@safety-check.hvm.snap",
    "chars": 152,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: tests/programs/safety-check.hvm\n---\nERROR: attempt to clone"
  },
  {
    "path": "tests/snapshots/run__file@sort_bitonic__main.hvm.snap",
    "chars": 112,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: examples/sort_bitonic/main.hvm\n---\nResult: 8386560\n"
  },
  {
    "path": "tests/snapshots/run__file@sort_radix__main.hvm.snap",
    "chars": 111,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: examples/sort_radix/main.hvm\n---\nResult: 16744448\n"
  },
  {
    "path": "tests/snapshots/run__file@stress__main.hvm.snap",
    "chars": 100,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: examples/stress/main.hvm\n---\nResult: 0\n"
  },
  {
    "path": "tests/snapshots/run__file@sum_rec__main.hvm.snap",
    "chars": 108,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: examples/sum_rec/main.hvm\n---\nResult: 16252928\n"
  },
  {
    "path": "tests/snapshots/run__file@sum_tree__main.hvm.snap",
    "chars": 108,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: examples/sum_tree/main.hvm\n---\nResult: 1048576\n"
  },
  {
    "path": "tests/snapshots/run__file@tuples__tuples.hvm.snap",
    "chars": 144,
    "preview": "---\nsource: tests/run.rs\nexpression: rust_output\ninput_file: examples/tuples/tuples.hvm\n---\nResult: ((0 (3 (4 (5 (6 (7 ("
  },
  {
    "path": "tests/snapshots/run__io_file@demo_io__main.hvm.snap",
    "chars": 197,
    "preview": "---\nsource: tests/run.rs\nexpression: c_output\ninput_file: examples/demo_io/main.hvm\n---\n                                "
  },
  {
    "path": "tests/snapshots/run__io_file@io__basic.hvm.snap",
    "chars": 199,
    "preview": "---\nsource: tests/run.rs\nexpression: c_output\ninput_file: tests/programs/io/basic.hvm\n---\n                              "
  },
  {
    "path": "tests/snapshots/run__io_file@io__invalid-name.hvm.snap",
    "chars": 168,
    "preview": "---\nsource: tests/run.rs\nexpression: c_output\ninput_file: tests/programs/io/invalid-name.hvm\n---\nResult: ((@IO/Done/tag "
  },
  {
    "path": "tests/snapshots/run__io_file@io__open1.hvm.snap",
    "chars": 151,
    "preview": "---\nsource: tests/run.rs\nexpression: c_output\ninput_file: tests/programs/io/open1.hvm\n---\nResult: ((@IO/Done/tag (@IO/MA"
  },
  {
    "path": "tests/snapshots/run__io_file@io__open2.hvm.snap",
    "chars": 166,
    "preview": "---\nsource: tests/run.rs\nexpression: c_output\ninput_file: tests/programs/io/open2.hvm\n---\nResult: ((@IO/Done/tag (@IO/MA"
  },
  {
    "path": "tests/snapshots/run__io_file@io__open3.hvm.snap",
    "chars": 161,
    "preview": "---\nsource: tests/run.rs\nexpression: c_output\ninput_file: tests/programs/io/open3.hvm\n---\nResult: ((@IO/Done/tag (@IO/MA"
  },
  {
    "path": "tests/snapshots/run__io_file@io__read_and_print.hvm.snap",
    "chars": 140,
    "preview": "---\nsource: tests/run.rs\nexpression: c_output\ninput_file: tests/programs/io/read_and_print.hvm\n---\nWhat is your name?\nio"
  }
]

About this extraction

This page contains the full source code of the HigherOrderCO/HVM2 GitHub repository, extracted and formatted as plain text. The extraction includes 84 files (422.5 KB), approximately 183.5k tokens, and a symbol index with 372 extracted functions, classes, methods, constants, and types.
