Repository: blainehansen/magma
Branch: main
Commit: a6ced658d623
Files: 58
Total size: 591.0 KB

Directory structure:
magma/

├── .editorconfig
├── .github/
│   └── FUNDING.yml
├── .gitignore
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── Cargo.toml
├── README.future.md
├── README.md
├── iris-notes.md
├── justfile
├── lab.ll
├── mg_examples/
│   └── main.mg
├── notes/
│   ├── 2019-popl-iron-final.md
│   ├── assembly-proofs.md
│   ├── category-theory-for-programmers.md
│   ├── coq-coq-correct.md
│   ├── coq-metacoq.md
│   ├── indexing-foundational-proof-carrying-code.md
│   ├── indexing-indexed-model.md
│   ├── indexing-modal-model.md
│   ├── iris-from-the-ground-up.md
│   ├── iris-lecture-notes.md
│   ├── jung-thesis.md
│   ├── known_types.md
│   ├── pony-reference-capabilities.md
│   ├── tarjan/
│   │   ├── README.md
│   │   ├── _CoqProject
│   │   ├── extra_nocolors.v
│   │   └── tarjan_nocolors.v
│   └── tarjan.md
├── notes.md
├── old/
│   ├── checker.rs
│   ├── inductive_serde.v
│   ├── machine.md
│   ├── machine.v
│   ├── main.md
│   ├── main.v
│   └── parser_low.rs
├── posts/
│   ├── approachable-language-design.md
│   ├── comparisons-with-other-projects.md
│   ├── coq-for-engineers.md
│   ├── crossing-no-mans-land.md
│   ├── design-of-magmide.md
│   ├── intro-verification-logic-in-magmide.md
│   ├── iris-in-plain-terms.md
│   ├── toward-termination-vcgen.md
│   └── what-is-magmide.md
├── src/
│   ├── ast.rs
│   ├── checker.rs
│   ├── lib.rs
│   ├── main.rs
│   ├── old.md
│   └── parser.rs
└── theory/
    ├── _CoqProject
    ├── list_assertions.v
    ├── main.v
    ├── playground.v
    └── utils.v

================================================
FILE CONTENTS
================================================

================================================
FILE: .editorconfig
================================================
root = true

[*]
charset = utf-8
indent_style = tab
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true

[*.{md,yml,yaml}]
indent_style = space
indent_size = 2


================================================
FILE: .github/FUNDING.yml
================================================
github: [blainehansen]


================================================
FILE: .gitignore
================================================
\#*.v\#
*.glob
*.vo
*.vok
*.vos
*.aux
*.d

*.cmi
*.cmo
*.out
_build

Makefile
Makefile.conf
*.cache
theorems/*.ml
theorems/*.mli

*.local.*
*.bc

# Added by cargo

/target


================================================
FILE: CODE_OF_CONDUCT.md
================================================
# Code of Conduct

We're using the exact same Code of Conduct as the Rust project, which [can be found online](https://www.rust-lang.org/conduct.html).


================================================
FILE: CONTRIBUTING.md
================================================
Hey there!

Right now this project is optimized to be easy for me (Blaine Hansen) to work in. This means it might not be easy for anyone else to jump right in, and syntax or workflow may disregard certain community standards if I find them inconvenient. I'm not really concerned with the different standards of different language communities, and if I feel a language community has made a standard choice that makes code more difficult to work with, I will completely ignore it.

Although I will gladly accept pull requests that add conveniences for other setups, **I will deny any that disrupt my workflow**. I wish I had time to support other setups, but unfortunately working in Coq is very difficult and nit-picky, so the usual developer niceties such as using docker for local development aren't really practical.

Here are the main things I can think of:

- I run Ubuntu, so I have arranged all the scripts and build files to assume that. If you're interested in running on other systems, I'm afraid I have to leave you to your own devices. If a pull request makes a change that breaks the build on my system, I won't accept it. **I will gladly accept pull requests that make it possible to build everywhere!** However an important constraint is that [Coq interactive mode](https://packagecontrol.io/packages/Coq) must continue to work for me. If you can guide me toward a setup that allows other systems to run the build while working with Coq interactive mode, I'm happy to hear it.
- [I only ever use tabs over spaces for indentation, always.](https://adamtuttle.codes/blog/2021/tabs-vs-spaces-its-an-accessibility-issue/) I will only use spaces if some irreplaceable piece of the system will literally not work if I don't (`yml` is an example). I'm more likely to simply [not use](https://github.com/avh4/elm-format/issues/158) a language if it requires spaces. You can see this choice being made in all the `dune` files throughout the project. The OCaml ecosystem seems to think that a *single* space is easy enough to read, whereas I find it extremely difficult to read (which highlights the real reason tabs are better: everyone can configure their own tab display width).
- If some syntactic structure is "list-like" and supports one item per line, I will write it in a way that allows quickly adding and reordering lines without having to change the location of ending braces/parens. You can also see this in the `dune` files, where instead of using the lisp standard of placing closing parens on the same line as the last item, I place them on a new deindented line.

These probably seem trite and nit-picky, and maybe they are. I just don't want to fight with this code more than is necessary.

Thank you for your understanding!


================================================
FILE: Cargo.toml
================================================
[package]
name = "magmide"
version = "0.1.0"
edition = "2021"

# [lib]
# name = "magmide"
# path = "src/lib.rs"
# crate-type = ["staticlib", "cdylib"]

# [build]
# # https://doc.rust-lang.org/cargo/reference/config.html
# rustflags = ["-l", "LLVM-13", "-C", "link-args=-Wl,-undefined,dynamic_lookup"]

# # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
nom = "7"
# anyhow = "1.0.57"


================================================
FILE: README.future.md
================================================
# Magmide

> Correct, Fast, Productive: pick three.

Magmide is the first language built from the ground up to allow software engineers to productively write extremely high performance software for any computational environment, logically prove the software correct, and run/compile that code all within the same tool.

The goal of the project is to spread the so-far purely academic knowledge of software verification and formal logic to a broad audience. It should be normal for engineers to create programs that are truly correct, safe, secure, robust, and performant.

This file is a "by example" style reference for the features and interface of Magmide. It doesn't try to explain any of the underlying concepts, it just documents decisions, so you might want to read one of these other resources:

- If you want to be convinced the goal of this project is both possible and necessary, please read [What is Magmide and Why is it Important?]()
- If you want to learn about software verification and formal logic using Magmide, please read [Intro to Verification and Logic with Magmide]().
- If you want to contribute and need the nitty-gritty technical details and current roadmap, please read [The Technical Design of Magmide]().

## Install and Use

Magmide is heavily inspired by Rust and its commitment to ergonomic tooling and straightforward documentation.

```bash
# install magmide and its tools
curl --proto '=https' --tlsv1.2 -sSf https://sh.magmide.dev | sh

# create a new project
magmide new hello-world
cd hello-world

magmide check <entry>
magmide run
magmide build
```

## Syntax

Here's what we can do:

- calling is just placing things next to each other with no commas. An *explicit* comma-separated list is always a tuple, which is why function arguments are always specified that way
- piping-style calling uses `>functionname`. It seems that because of precedence and indentation rules, which expressions are function names is always inferable?
- this works inline too, so `data>functionname` or `data >infix something`
- `>> arg arg2; expr` defines an anonymous function and immediately calls it in piping style. `>>;` is then the equivalent of your old `do` idea
- `--` is the "bumper" for an indented expression
- the sections of keywords are delimited by semicolons
- nested function calls are just indented, since function calling is
- `/` is the *keyword continuation operator*, so all keywords, even possibly multi-line ones, can be defined metaprogrammatically within the language

```
if yo; --
  function_name arg arg
  >whatevs
  >another thing
  >> something; yo different something
  >> hm; abb >hm diff
/elif yoyo; whatevs
/else; dude

if yo; yoyo /else; dude

let thingy = if some >whatevs hmm; dude /else; yo
```

- piping custom keywords can be done with a leading `;`? and standalone statement-style ones are something else like `$`?
- custom keywords are called with a leading `;`? so something like `;route_get yoyo something; whatevs /err; dude`
- calling macros/known functions is indicated with something like a `~` or just the backtick thing? which means it can be done
- include the "backpassing" idea? or simplify it by somehow creating an "implicit callback defining pipe operator" such as `>>>`?






Some guiding syntax principles:

- Magmide is whitespace/indentation sensitive.
- Anywhere a `;` can be used, an opening indent can be used *additionally*.
- Anywhere a `,` can be used, a newline can be used *instead*.
- The `:` operator is always used in some way to indicate type-like assertions.
- Precedence is decided using nesting with parentheses or indentation, never operator power.
- "Wrapping" delimiters are avoided.
- "Pipeability" is strongly valued.
- Operators are rarely used to represent actions that could be defined within the language; instead they prioritize adding new capabilities.

```
// defining computational types
data Unit
data Tuple;


data Macro (S=undefined);
  | Block; BlockMacroFn
  | Function; FunctionMacroFn
  | Decorator; DecoratorMacroFn
  | Import; ImportMacroFn(S)


alias SourceChannel S; Dict<S> -> void

fn non_existent_err macroName: str; str, str;
  return "Macro non-existent", "The macro "${macroName}" doesn't exist.

fn incorrect_type_err
  macroName: str
  macroType: str
  expectedType: str
;
  str
  str
;
  return "Macro type mismatch", "The macro "${macroName}" is a ${macroType} type, but here it's being used as a ${expectedType} type."

data CompileContext S;
  macros: Dict(Macro(S))
  fileContext: FileContext
  sourceChannel: SourceChannel(S)
  handleScript: { path: str source: str } -> void
  readFile: str -> str | undefined
  joinPath: ..str -> str
  subsume: @T -> SpanResult<T> -> Result<T, void>
  Err: (ts.TextRange, str, str) -> Result<any, void>
  macroCtx: MacroContext

data MacroContext;
  Ok: @T, (T, SpanWarning[]?) -> SpanResult<T>
  TsNodeErr: (ts.TextRange, str, ..str) -> SpanResult<any>
  Err: (fileName: str, title: str, ..str) -> SpanResult<any>
  tsNodeWarn: (node: ts.TextRange, str, ..str[]) -> void
  warn: (str, str, ..str[]) -> void
  subsume: @T, SpanResult T -> Result T, void


data u8; bitarray(8)

ideal Day;
  | monday | tuesday | wednesday | thursday
  | friday | saturday | sunday

  use Day.*

  rec next_weekday day: Day; match day;
    monday; tuesday, tuesday; wednesday, wednesday; thursday, thursday; friday
    friday; monday, saturday; monday, sunday; monday

ideal Bool;
  | true
  | false

  use Bool.*

  rec negate b: Bool :: bool;
    match b;
      true; false
      false; true

  rec and b1: bool, b2: bool :: bool;
    match b1;
      true; b2
      false; false

  rec or b1: bool, b2: bool :: bool;
    match b1;
      true; true
      false; b2

  impl core.testable;
    rec test b: Bool :: bool;
      match b; true; testable.true, false; testable.false

  rec negate_using_test b: Bool :: bool;
    test b;
      false
      true


ideal IndexList<A: ideal> :: nat;
  | Nil :: IndexList(0)
  | Cons :: @n A IndexList(n) -> IndexList(n;next)

  rec append n1, ls1: IndexList(n1), n2, ls2: IndexList(n2) :: IndexList(n1 ;add n2);
    match ls1;
      Nil; ls2
      Cons(_, x, ls1'); Cons(x, append(ls1', ls2))

prop even :: nat;
  | zero: even(0)
  | add_two: @n, even(n) -> even(n;next;next)

  use even.*
  thm four_is: even(4); prf;
    + add_two; + add_two; + zero

  thm four_is__next: even(4); prf;
    + (add_two 2 (add_two 0 zero))

  thm plus_four: @n, even n -> even (4 ;add n); prf;
    => n; >>; => Hn;
    + add_two; + add_two; + Hn

  thm inversion:
    @n: nat, even n -> (n = 0) ;or (exists m; n = m;next;next ;and even m)
  ; prf;
    => n [| n' E']
      left; _
      --
        right; exists n'; split
        _; + E'

```



## Metaprogramming

## Interactive Tactic Mode



## Module system

```
// use a module whose location has been specified in the manifest
// the manifest is essentially sugar for a handful of macros
use lang{logic, compute}

// the libraries 'lang', 'core', and 'std' are spoken for. perhaps though we can allow people to specify external packages with these names, we'll just give a warning that they're shadowing builtin modules

// use a local module
// files/directories/internal modules are all accessed with .
// `__mod.mg` can act as an "module entry" for a directory, you can't shadow child files or directories
// the `mod` keyword can create modules inside a file, you can't shadow sibling files or directories
// `_file.mg` means that module is private, but since this is a verified language this is just a hint to not show the module in tooling, any true invariants should be fully specified with `&`
use .local.nested{thing, further{nested.more, stuff}}

// can do indented instead
use .local.nested
  thing
  further{nested.more, stuff}
  whatever
    stuff.thingy

// goes up to the project root
use ~local.whatever

// the module system allows full qualification of libraries, even to git repositories
// the format 'name/something' defaults to namespaced libraries on the main package manager
// a full git url obviously refers to that repo
use person/lib.whatever

// the above could be equivalent to:
let person_lib = lang.pull_lib$(git: "https://github.com/person/lib")
use person_lib.whatever
```


```
use lang.{ logic, compute }

// all inductive definitions use the `ind` keyword
// the different kinds of types are included by default and automatically desugared to be the more "pure" versions of themselves

// a union-like inductive
ind Day
  | monday | tuesday | wednesday | thursday
  | friday | saturday | sunday

// a record-like inductive
ind Date
  year: logic.Nat
  month: logic.Nat & between(1, 12)
  day: logic.Nat

// a tuple-like inductive
ind IpAddress; logic.Byte, logic.Byte, logic.Byte, logic.Byte

// the same as above but with a helper macro
ind IpAddress; logic.tuple_repeat(logic.Byte, 4)

// a unit-like inductive
ind Unit

rec next_weekday day
  // bring all the constructors of Day into scope
  use Day.*
  match day
    monday; tuesday, tuesday; wednesday, wednesday; thursday, thursday; friday
    friday; monday, saturday; monday, sunday; monday


let next_weekday_computable = compute.logic_computable(next_weekday)
let DayComputable = compute.type(next_weekday_computable).args[0].type

dbg next_weekday_computable(DayComputable.monday)
// outputs "Day.tuesday"


// what if we were to define the above types and function in the computable language?
// it's as simple as changing "ind" to "type", "rec" to "fn", and ensuring all types are computable
// all of these "creation" keywords are ultimately just some kind of sugar for a "let"

type Day
  | monday | tuesday | wednesday | thursday
  | friday | saturday | sunday

type Date
  year: u16
  month: u8 & between(1, 12)
  day: u8

type Name; first: str, last: str

type Pair U, T; U, T

type IpAddress; u8, u8, u8, u8

type IpAddress; compute.tuple_repeat(u8, 4)

type Unit

fn next_weekday day
  use Day.*
  // a match implicitly takes discriminee, arms, proof of completeness
  match day
    monday; tuesday, tuesday; wednesday, wednesday; thursday, thursday; friday
    friday; monday, saturday; monday, sunday; monday

// now no need to convert it first
dbg next_weekday(Day.monday)
// outputs "Day.tuesday"
```

In general, `;` is an inline delimiter between tuples, and `,` is an inline delimiter between tuple elements. Since basically every positional item in a programming language is a tuple (or the tuple-equivalent record), the alternation of these two can delimit everything. Note these are only *inline* delimiters: indents are the equivalent of `;` and newlines are the equivalent of `,`.
Why `;`? Because `:` is for type specification.
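Applying those rules to the `negate` example above, these two forms should be equivalent (a sketch reusing only constructs already shown):

```
// inline: `;` separates a pattern from its result, `,` separates arms
rec negate b: Bool :: bool;
  match b; true; false, false; true

// indented: the indent stands in for `;`, newlines stand in for `,`
rec negate b: Bool :: bool;
  match b;
    true; false
    false; true
```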

`==` is for equality, and maps to the two different kinds of equality depending on whether it's used in a logical or computational context.


### trait system in host magmide
We don't need an orphan rule, just explicit impl import and usage. The default impl is the bare one defined alongside the type; either you always have to manually include/specify a different impl, or it's a semver violation to add a bare impl alongside a type that previously didn't have one.
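For context, here's a sketch of what today's Rust forbids and the workaround it forces (plain current-Rust behavior, not Magmide syntax):

```rust
use std::fmt;

// Rust's orphan rule rejects this impl (error[E0117]) because both the
// trait and the type are foreign to the current crate:
//
//     impl fmt::Display for Vec<u8> { ... }
//
// The usual workaround is a local newtype:
struct Bytes(Vec<u8>);

impl fmt::Display for Bytes {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?}", self.0)
    }
}
```

Dropping the orphan rule would mean impls become ordinary named items that each use site imports explicitly, so conflicting impls in different crates never silently collide.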



### example: converting a "logical" inductive type into an actual computable type

### example: adding an option to a computable discriminated union

### example: proving termination of a

## The embedded `core` language


## Testing

talk about quickcheck and working up to a proof

## Metaprogramming

Known strings given to a function
Keyword macros



================================================
FILE: README.md
================================================
# :construction: Magmide is purely a research project at this point :construction:

This repo is still very early and rough; it's mostly just notes, speculative writing, and exploratory theorem proving. Most of the files in this repo are just "mad scribblings" that I haven't refined enough to actually stand by!

If you prefer video, this presentation talks about the core ideas that make formal verification and Magmide possible, and the design goals and intentions of the project:

[![magmide talk](https://img.youtube.com/vi/Lf7ML_ErWvQ/0.jpg)](https://www.youtube.com/watch?v=Lf7ML_ErWvQ)

In this readme I give a broad overview and answer a few possible questions. Enjoy!

---

The goal of this project is to **create a programming language capable of making formal verification and provably correct software practical and mainstream**. The language and its surrounding education/tooling ecosystem should provide a foundation strong enough to create verified software for any system or environment.

Software is an increasingly critical component of our society, underpinning almost everything we do. It's also extremely vulnerable and unreliable. Software vulnerabilities and errors have likely caused humanity [trillions of dollars](https://www.it-cisq.org/pdf/CPSQ-2020-report.pdf) in damage, [social harm](https://findstack.com/hacking-statistics/), waste, and [lost growth opportunity](https://raygun.com/blog/cost-of-software-errors/) in the digital age (it seems clear [Tony Hoare's estimate](https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retractions) is way too conservative, especially if you include more than `null` errors).

What would it look like if it were both possible and tractable for working software engineers to build and deploy software that was *provably correct*? Using [proof assistant languages](https://en.wikipedia.org/wiki/Proof_assistant) such as [Coq](https://en.wikipedia.org/wiki/Coq) it's possible to define logical assertions as code, and then write proofs of those assertions that can be automatically checked for consistency and correctness. Systems like this are extremely powerful, but were suited only for niche academic applications until the fairly recent invention of [separation logic](http://www0.cs.ucl.ac.uk/staff/p.ohearn/papers/Marktoberdorf11LectureNotes.pdf).
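For a concrete taste, here's what a complete machine-checked development looks like in Coq; the assertion is the `Theorem` line, and the checker mechanically verifies that the proof script below it actually establishes it:

```coq
(* A logical assertion stated as code, plus a proof the checker verifies. *)
Theorem plus_comm : forall n m : nat, n + m = m + n.
Proof.
  intros n m.
  induction n as [| n' IH].
  - (* base case: 0 + m = m + 0 *)
    simpl. rewrite <- plus_n_O. reflexivity.
  - (* inductive case: S n' + m = m + S n' *)
    simpl. rewrite IH. rewrite plus_n_Sm. reflexivity.
Qed.
```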

Separation logic isn't a tool, but a paradigm for making logical assertions about mutable and destructible state. The Rust ownership system was directly inspired by separation logic, which shows us that it really can be used to unlock revolutionary levels of productivity and excitement. Separation logic makes it possible to verify things about practical imperative code, rather than simply outlawing mutation and side effects as is done in functional languages.
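To give a flavor of the formalism (standard separation-logic notation, nothing Magmide-specific): the points-to assertion $x \mapsto v$ claims *exclusive* ownership of one memory cell, the separating conjunction $P \ast Q$ says $P$ and $Q$ hold over disjoint pieces of state, and the frame rule is what makes reasoning local, letting any proof about a command $c$ be reused unchanged alongside untouched state $R$:

$$
\frac{\{P\}\; c\; \{Q\}}{\{P \ast R\}\; c\; \{Q \ast R\}}
$$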

However Rust only exposes a simplified subset of separation logic, rather than exposing the full power of the paradigm. [The Iris separation logic](https://people.mpi-sws.org/~dreyer/papers/iris-ground-up/paper.pdf) was recently created by a team of academics to fully verify the correctness of the Rust type system and several core implementations that use `unsafe`. Iris is a fully powered separation logic, making it uniquely capable of verifying the kind of complex, concurrent, arbitrarily flexible assertions that could be implied by practical Rust code, even those that use `unsafe`. Iris could do the same for any other practical and realistic language.

Isn't that amazing?!? A system that can prove completely and eternally that a use of `unsafe` isn't actually unsafe??!! You'd think the entire Rust and systems programming community would be over the moon!

But as is common with academic projects, it's only being used to write papers rather than build real software systems. All the existing uses of Iris perform the proofs "on the side", analyzing [manual transcriptions of the source code as Coq notation](https://coq.inria.fr/refman/user-extensions/syntax-extensions.html) rather than directly reading the original source. And although the papers are more approachable than most academic papers, they're still academic papers, and so basically no working engineers have even heard of any of this.

This is why I'm building Magmide, which is intended to be to Coq what Rust has been to C. There are quite a few proof languages capable of proving logical assertions in code, but none exist that are specifically designed to be used by working engineers to build real imperative programs. None have placed a full separation logic, particularly one as powerful as Iris, at the heart of their design, but instead are overly dogmatic about the pure functional paradigm. And all existing proof languages are hopelessly mired in the obtuse and unapproachable fog of [research debt](https://distill.pub/2017/research-debt/) created by the culture of academia. Even if formal verification is already capable of producing [provably safe and secure code](https://www.quantamagazine.org/formal-verification-creates-hacker-proof-code-20160920/), it isn't good enough if only professors have the time to gain the necessary expertise. We need to pull all this amazing knowledge out of the ivory tower and finally put it to work to make computing truly safe and robust.

I strongly believe a world with mainstream formal verification would not only see a significant improvement in *magnitude* of social good produced by software, but a significant improvement in *kind* of social good. In the same way that Rust gave engineers much more capability to safely compose pieces of software therefore enabling them to confidently build much more ambitious systems, a language that gives them the ability to automatically check arbitrary conditions will make safe composition and ambitious design arbitrarily easier to do correctly.

What kinds of ambitious software projects have been conceived but not pursued because getting them working would simply be too difficult? With machine checkable proofs in many more hands could we finally build *truly secure* operating systems, trustless networks, or electronic voting methods? How many people could be making previously unimagined contributions to computer science, mathematics, and even other logical fields such as economics and philosophy if only they had approachable tools to do so? I speculate about some possibilities at the end of this readme.

To achieve this goal I've chosen an architecture I call the "split Logic/Host" architecture, where the two domains of software thinking are separated into two languages:

- Logic, the dependently typed lambda calculus of constructions. This is where "imaginary" types are defined and proofs are conducted.
- Host, the imperative language that actually runs on real machines.

These two components must have a symbiotic relationship with one another: Logic is used to define and make assertions about Host, and Host computationally represents and implements both Logic and Host itself.

```
         represents and
           implements
  +------------+------------+
  |            |            |
  |            |            |
  v            |            |
Logic          +---------> Host
  |                         ^
  |                         |
  |                         |
  +-------------------------+
        logically defines
          and verifies
```

The easiest way to understand this is to think of Logic as the type system of Host. Logic is "imaginary" and only exists at compile time, and constrains/defines the behavior of Host. Logic just happens to itself be a dependently typed functional programming language! This design takes the concept of [self-hosting](https://en.wikipedia.org/wiki/Self-hosting_(compilers)) to its logical extreme.

We intend to achieve this goal by [building Magmide as the Logic portion with Rust as Host, then defining the semantics of Rust *within* Magmide, and finally building a "reflective proof rule" into Magmide to allow it to use verified Rust code during proof checking.](https://github.com/magmide/magmide/blob/main/posts/design-of-magmide.md) This seems the most realistic way to bootstrap the project!

I'm convinced this general architecture is the only one that can achieve Magmide's extremely ambitious goal. It feels like an optimal point in the design space, since I can't imagine another architecture that would allow all of the language components (proof checker, code compiler, target code being compiled) the possibility to be both bare metal and fully verified.

But it's not good enough for the architecture to *allow* a great language design. Everything else about the design has to be chosen correctly as well. I claim that in order for the language to achieve its goal, it has to meet all these descriptions:

## Capable of arbitrary logic

In order to really deliver the kind of truly transformative correctness guarantees that will inspire working engineers to learn and use a difficult new language, it doesn't make sense to stop short and only give them an "easy mode" verification tool. It should be possible to formalize and attempt to prove any proposition humanity is capable of representing logically, not only those that a fully automated tool like an [SMT solver](https://liquid.kosmikus.org/01-intro.html) can figure out. **A language with full logical expressiveness and manual proofs can still use convenient automation as well**, but the opposite isn't true.

To meet this description, the language will be fully dependently typed and use the [Calculus of Constructions](https://en.wikipedia.org/wiki/Calculus_of_constructions) much like [Coq](https://en.wikipedia.org/wiki/Coq). I find [Adam Chlipala's "Why Coq?"](http://adam.chlipala.net/cpdt/html/Cpdt.Intro.html) arguments convincing in regard to this choice. Coq will also be used to bootstrap the first version of the compiler, allowing it to be self-hosting and even self-verifying using a minimally small trusted theory base. Read more about the design and bootstrapping plan in [`posts/design-of-magmide.md`](./posts/design-of-magmide.md). The [metacoq](https://github.com/MetaCoq/metacoq) and ["Coq Coq Correct!"](https://metacoq.github.io/coqcoqcorrect) projects have already done the work of formalizing and verifying Coq using Coq, so they will be very helpful.

It's absolutely possible for mainstream engineers to learn and use these powerful logical concepts. The core ideas of formal verification (dependent types, proof objects, higher order logic, separation logic) aren't actually that complicated. They just haven't ever been properly explained because of [research debt](https://distill.pub/2017/research-debt/), and they weren't even all that practical before separation logic and Iris. I've been working on better explanations in the (extremely rough and early) [`posts/intro-verification-logic-in-magmide.md`](./posts/intro-verification-logic-in-magmide.md) and [`posts/coq-for-engineers.md`](./posts/coq-for-engineers.md).

## Capable of bare metal performance

Software needs to perform well! Not all software has the same requirements, but often performance is intrinsically tied to correct execution. Very often the software that most importantly needs to be correct also most importantly needs to perform well. **If the language is capable of truly bare metal performance, it can still choose to create easy abstractions that sacrifice performance where that makes sense.**

To meet this description Magmide will be built in and deeply integrated with Rust. Excitingly, because of the inherent power and flexibility of a proof assistant, this integration with Rust doesn't have to be permanent, and we could build other languages to act as Host as long as we can specify their semantics and make them interoperable!

Because of separation logic and Iris, it is finally possible to verify code as low-level as Rust and more!

## Gradually verifiable

Just because it's *possible* to fully verify all code doesn't mean it should be *required*. It simply isn't practical to try to completely rewrite a legacy system in order to verify it. **We must be able to write code without needing to prove it's perfectly correct**, otherwise iteration and incremental adoption are impossible. Existing languages with goals of increased rigor such as Rust and Typescript strategically use concessions in the language such as `unsafe` and `any` to allow more rigorous code to coexist with legacy code as it's incrementally replaced. The only problem is that these concessions introduce genuine soundness gaps into the language, and it's often difficult or impossible to really understand how exposed your program is to these safety gaps.

We can get both practical incremental adoption and complete understanding of the current safety of our program by leveraging work done in the [Iron obligation management logic](https://iris-project.org/pdfs/2019-popl-iron-final.pdf) built using Iris. We can use a concept of trackable effects to allow some safety conditions to be *optional*.

Trackable effects will work by requiring a piece of some "correctness token" to be forever given up in order to perform a dangerous operation without justifying its safety with a proof. This would infect the violating code block with an effect type that will bubble up through any parent blocks. Defining effects in this way makes them completely composable *resources* rather than *wrappers*, meaning that they're more flexible and powerful than existing effect systems. Systems like algebraic effects or effect monads could be implemented using this resource paradigm, but the opposite isn't true.
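Here's a minimal Rust-flavored sketch of that token idea (every name here is a hypothetical illustration of the paragraph above, not Magmide's actual design):

```rust
/// Hypothetical witness that no unjustified dangerous operation has
/// occurred. It isn't Clone or Copy, so it behaves as a resource.
struct MemorySafe(());

/// The trackable effect left behind once the token is given up.
struct Tainted(());

/// A dangerous operation whose safety is justified by a proof (elided):
/// the token is threaded through and survives.
fn write_with_proof(token: MemorySafe /* , proof: InBounds */) -> MemorySafe {
    // ... perform the write, justified by the proof ...
    token
}

/// The same operation without justification: the token is consumed
/// forever, and the `Tainted` effect bubbles up through every caller
/// that wanted to keep claiming memory safety.
fn write_unchecked(token: MemorySafe) -> Tainted {
    let MemorySafe(unit) = token; // give up the resource permanently
    // ... perform the unjustified write ...
    Tainted(unit)
}
```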

If the trackable effect system is defined in a sufficiently generic way then custom trackable effects could be created, allowing different projects to introduce new kinds of safety and correctness tracking, such as ensuring asynchronous code doesn't block the executor, or a web app doesn't render raw untrusted input, or a server doesn't leak secrets.

Even if a project chooses to ignore some effects, they'll always know those effects are there, which means other possible users of the project will know as well. Project teams could choose to fail compilation if their program isn't memory safe or could panic, while others could tolerate some possible effects or write proofs to assert they only happen in certain well-defined circumstances. It would even be possible to create code that provably sandboxes an effect by ensuring it can't be detected at any higher level if contained within the sandbox. With all these systems in place, we can finally have a genuinely secure software ecosystem!

## Fully reusable

We can't write all software in assembly language! Including first-class support for powerful metaprogramming, alongside a [query-based compiler](https://ollef.github.io/blog/posts/query-based-compilers.html), will allow users of this language to build verified abstractions that "combine upward" into higher levels, while still allowing the possibility for those higher levels to "drop down" back into the lower levels. Being a proof assistant, these escape hatches don't have to be unsafe, as higher level code can provide proofs to the lower level to justify its actions.

This ability to create fully verifiable higher level abstractions means we can create a "verification pyramid", with excruciatingly verified software forming a foundation for a spectrum of software that decreases in importance and rigor. **Not all software has the same constraints, and it would be dumb to verify a recipe app as rigorously as a cryptography function.** But even a recipe app would benefit from its foundations removing the need to worry about whole classes of safety and soundness conditions. And wouldn't it be great to prove your app will never leak memory or throw exceptions or enter an infinite loop/recursion?

Magmide *itself* doesn't have to achieve mainstream success to massively improve the quality of all downstream software; merely some sub-language of it does. Many engineers have never heard of LLVM, but they still implicitly rely on it every day. Magmide would seek to do the same. We don't have to make formal verification fully mainstream, we just have to make it available for the handful of people willing to do the work. If a full theorem prover is sitting right below the high-level language you're currently working in, you don't have to bother with it most of the time, but you still have the option to do so when it makes sense.

The metaprogramming can of course also be used directly in the dependently typed language, allowing compile-time manipulation of proofs, functions, and data. Verified proof tactics, macros, and higher-level embedded programming languages are all possible. This is the layer where absolutely essential proof automation tactics similar to Coq's `auto` or [Adam Chlipala's `crush`](http://adam.chlipala.net/cpdt/html/Cpdt.Intro.html), or fast counter-example searchers such as `quickcheck`, or [computational reflection systems](./posts/design-of-magmide.md#heavy-use-of-computational-reflection-to-improve-proof-performance) would be implemented.

Importantly, the language will be self-hosting, so metaprogramming functions will benefit from the same bare metal performance and full verifiability.

You can find rough notes about the current design thinking for the metaprogramming interface in [`posts/design-of-magmide.md`](./posts/design-of-magmide.md).

## Practical and ergonomic

My experience using languages like Coq has been extremely painful, and the interface is "more knife than handle". I've been astounded how willing academics seem to be to use extremely clunky workflows and syntaxes just to avoid having to build better tools.

To meet this description, this project will learn heavily from `cargo` and other excellent projects. **It should be possible to verify, interactively prove, and query Magmide code with a single tool.** The split Logic/Host architecture will likely make it easier to understand and use Magmide.

It will also fully embrace ergonomic type inference, and use techniques such as those from ["Flux: Liquid Types for Rust"](https://arxiv.org/abs/2207.04034) to allow even many *proof* conditions to be inferred.

## Taught effectively

**Working engineers are resource constrained and don't have years of free time to wade through arcane and disconnected academic papers.** Academics aren't incentivized to properly explain and expose their amazing work, and a massive amount of [research debt](https://distill.pub/2017/research-debt/) has accrued in many fields, including formal verification.

To meet this description, this project will enshrine the following values in regard to teaching materials:

- Speak to a person who wants to get something done and not a review committee evaluating academic merit.
- Put concrete examples front and center.
- Point the audience toward truly necessary prerequisites rather than assuming shared knowledge.
- Prefer graspable human words to represent ideas, never use opaque and unsearchable non-ascii symbols, and only use symbolic notations when it's both truly useful and properly explained.
- Prioritize the hard work of finding clear and distilled explanations.

---

Read [`posts/design-of-magmide.md`](./posts/design-of-magmide.md) or [`posts/comparisons-with-other-projects.md`](./posts/comparisons-with-other-projects.md) to more deeply understand the intended design and how it's different from other projects.

Building such a language is a massively ambitious goal. It might even be too ambitious! But we have to also consider the opposite: perhaps previous projects haven't been ambitious enough, and that's why formal verification is still niche! Software has been broken for too long, and we won't have truly solved the problem until it's at least *possible* for all software to be verified.

<!--
# Project values

[The long term path of a project is determined by its values](TODO), so we should define ours.

- Fidelity. This value combines performance and full verifiability

For a language to both perform as well as possible and be verifiable as deeply as possible, we must make the language as faithful and accurate a model of the real underlying computation as possible. We can still use our very low level and granular models to build up verified abstractions for higher levels of reasoning, but the whole thing must have an accurate foundation that ties directly into the bare metal. Fidelity can be used to get us several other desirable things, such as performance and increased safety.
- Practicality. If we want to make the software of our world safer and more robust, then we have to build a language that can actually be used to achieve useful work in real applications. This means the language should allow compatibility with existing systems and incremental adoption.
- Approachability. I genuinely believe the culture and working patterns of academia aren't just inefficient in regards to producing usable knowledge for society, but are toxic and exclusionary. [Research debt]()

To create a language that can possibly have all the above design qualities, I claim we have to max out
-->

# FAQ

## Is it technically possible to build a language like this?

Yes! None of the technical details of this idea are untested or novel. Dependently typed proof languages, higher-order separation logic, query-based compilers, introspective metaprogramming, and abstract assembly languages are all ideas that have been proven in other contexts. Magmide would merely attempt to combine them into one unified and practical package.

## Is this language trying to replace Rust?

No! My perfect outcome of this project would be for it to sit *underneath* Rust, acting as a new verified toolchain that Rust could "drop into". The concepts and api of Rust are awesome and widely loved, so Magmide would just try to give it a more solid foundation.

## If this is such a good idea why hasn't it happened yet?

Mostly because this idea exists in an "incentive no man's land".

Academics aren't incentivized to create something like this, because doing so is just "applied" research which tends not to be as prestigious. You don't get to write many groundbreaking papers by taking a bunch of existing ideas and putting them together nicely.

Software engineers aren't incentivized to create something like this, because a programming language is a pure public good and there aren't any truly viable business models that can support it while still remaining open. Even amazing public good ideas like the [interplanetary filesystem](https://en.wikipedia.org/wiki/InterPlanetary_File_System) can be productized by applying the protocol to markets of networked computers, but a programming language can't really pull off that kind of maneuver.

Although the software startup ecosystem does routinely build pure public goods such as databases and web frameworks, those projects tend to have an obvious and relatively short path to being useful in revenue-generating SaaS companies. The problems they solve are clear and visible enough that well-funded engineers can both recognize them and justify the time to fix them. In contrast the path to usefulness for a project like Magmide is absolutely not short, and despite promising immense benefits to both our industry and society as a whole, most engineers capable of building it can't clearly see those benefits behind the impenetrable fog of research debt.

<!-- The problem of not properly funding pure public goods is much bigger than just this project. We do a bad job of this in every industry and so our society has to tolerate a lot of missed opportunity and negative externalities. The costs of broken software are more often borne by society than the companies at fault since insurance and limited-liability structures and PR shenanigans and expensive lawyers can all help a company wriggle out of fully internalizing the cost of their mistakes. Profit-motivated actors are extremely short-sighted and don't have to care if they leave society better off, they just have to get marketshare. -->

We only got Rust because Mozilla has been investing in dedicated research for a long time, and it still doesn't seem to have really financially paid off for them in the way you might hope.

## Will working engineers actually use it?

Maybe! We can't force people or guarantee it will be successful, but we can learn a lot from how Rust has been able to successfully teach quite complex ideas to a huge and excited audience. I think Rust has succeeded by:

- *Making big promises* in terms of how performant/robust/safe the final code can be.
- *Delivering on those promises* by building something awesome. I hope that since the entire project will have verification in mind from the start it will be easier to ship something excellent and robust with less churn than usual.
- *Respecting people's time* by making the teaching materials clear and distilled and the tooling simple and ergonomic.

All of those things are easier said than done! Fully achieving those goals will require work from a huge community of contributors.

## Won't writing verified software be way more expensive? Do you actually think this is worth it?

**Emphatically yes it is worth it.** As alluded to earlier, broken software is a massive drain on our society. Even if it were much more expensive to write verified software, it would still be worth it. Rust has already taught us that it's almost always worth it to [have the hangover first](https://www.youtube.com/watch?v=ylOpCXI2EMM&t=565s&ab_channel=Rust) rather than wastefully churn on a problem after you thought you could move on.

Verification is obviously very difficult. Although I have some modest theories about ways to speed up/improve automatic theorem proving, and how to teach verification concepts in a more intuitive way that can thereby involve a larger body of engineers, we still can't avoid the fact that refining our abstractions and proving theorems is hard and will remain so.

But we don't have to make verification completely easy and approachable to still get massive improvements. We only have to make proof labor more *available* and *reusable*. Since Magmide will be inherently metaprogrammable and integrate programming and proving, developments in one project can quickly disseminate through the entire language community. Research would be much less likely to remain trapped in the ivory tower, and could be usefully deployed in real software much more quickly.

And of course, a big goal of the project is to make verification less expensive! Tooling, better education, better algorithms and abstractions can all decrease verification burden. If the project ever reaches maturity these kinds of improvements will likely be most of the continued effort for a long time.

Besides, many projects already write [absolutely gobs of unit tests](https://softwareengineering.stackexchange.com/questions/156883/what-is-a-normal-functional-lines-of-code-to-test-lines-of-code-ratio), and a proof is literally *infinitely* better than a unit test. At this point I'm actually hopeful that proofs will *decrease* the cost of writing software. We'll see.

## Is it actually useful to prove code meets some specification if we still have to trust the specification?

In a way yes this is true: when we prove an implementation meets some specification we're mostly just shifting uncertainty/trust from the implementation to the specification. This is part of why it's impossible for our systems to ever be completely perfect (whatever "perfect" means).

However I assert that this shifting of trust from code to specifications (or put another way, from trusted code to trusted theory) is worth the effort and a huge improvement over the status quo for these reasons:

- Specifications can refer to each other and be built upon, thereby revealing inconsistent assumptions and shaking out errors. Every time an incorrect specification in any way interfaces with a correct one, the incompatibility between them will be revealed at compile time. It's likely you've already experienced exactly this dynamic when you incorrectly define a *type* (type systems are just very simple proof systems!). If you mistakenly define a type field as an unsigned integer when it needs to be a signed integer, when you try to use the incorrect type in other code that expects a signed integer your mistake will be revealed (a minimal sketch follows this list). This won't always happen, but with deeper proof systems it has the opportunity to happen even more often than it happens in type systems.
- Specifications can be much smaller and terser than implementations, and therefore easier to audit. When we audit a specification we only have to audit the type signatures of our theorems and functions, rather than all the code inside them. Implementations have to worry about performance and many internal details that don't need to be revealed, whereas specifications only have to make assertions about whatever visible behavior is desired. Specifications can be stated in whatever naive, simple, pure functional form makes the assertion easy to understand, whereas implementations often need to use arcane tricks and confusingly evolving mutable structures to make the algorithm efficient. If the specification is larger than the implementation I would tend to suspect one or both of them could be structured more intelligently.
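Here's a minimal Rust illustration of the first point above (hypothetical types; the last line deliberately fails to compile):

```rust
// Mistake: deltas can be negative, so this field should be `i32`.
struct Reading { delta: u32 }

fn apply_delta(base: i32, delta: i32) -> i32 { base + delta }

fn main() {
    let r = Reading { delta: 5 };
    // error[E0308]: mismatched types: expected `i32`, found `u32`.
    // The incorrect specification is revealed the moment it interfaces
    // with code that expects the correct one.
    let _ = apply_delta(100, r.delta);
}
```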

## Do you think this language will make all software perfectly secure?

No! Although it's certainly [very exciting to see how truly secure verified software can be](https://www.quantamagazine.org/formal-verification-creates-hacker-proof-code-20160920/), there will always be a long tail of hacking risk. Not all code will be written in securable languages, not all engineers will have the diligence or the oversight to write secure code, people can make bad assumptions, and brilliant hackers might invent entirely new *types* of attack vectors that aren't considered by our safety specifications (although inventing new attack vectors is obviously way more difficult than just doing some web searches and running scripts, which is all a hacker has to do today).

However *any* verified software is better than *none*, and right now it's basically impossible for a security-conscious team to even attempt to prove their code secure. Hopefully the "verification pyramid" referred to earlier will enable almost all software to quickly reuse secure foundations provided by someone else.

And of course, social engineering and hardware tampering are never going away, no matter how perfect our software is.

## Is logically verifying code even useful if that code relies on possibly faulty software/hardware?

This is nuanced, but the answer is still yes!

First let's get something out of the way: software is *literally nothing more* than a mathematical/logical machine. It is one of the very few things in the world that can actually be perfect. Of course this perfection is in regard to an axiomatic model of a real machine rather than the true machine itself. But isn't it better to have an implementation that's provably correct according to a model rather than what we have now, an implementation that's obviously flawed according to a model? Formal verification is really just the next level of type checking, and type checking is still incredibly useful despite also only relating to a model.

If you don't think a logical model can be accurate enough to model a real machine in sufficient detail, please check out these papers discussing [separation logic](http://www0.cs.ucl.ac.uk/staff/p.ohearn/papers/Marktoberdorf11LectureNotes.pdf), extremely high fidelity formalizations of the [x86](http://nickbenton.name/hlsl.pdf) and [arm](https://www.cl.cam.ac.uk/~mom22/arm-hoare-logic.pdf) instruction sets, and [Iris](https://people.mpi-sws.org/~dreyer/papers/iris-ground-up/paper.pdf). Academics have been busy doing amazing stuff, even if they haven't been sharing it very well.

If you think we'll constantly be tripping over problems in incorrectly implemented operating systems or web browsers, well, you're missing the whole point of this project. These systems provide environments for other software, yes, but they're still just software themselves. Even if they aren't perfectly reliable *now*, the entire ambition of this project is to *make* them reliable.

We would however need hardware axioms to model the abstractions provided by a concrete computer architecture, and this layer is trickier to be completely confident in. Hardware faults and ambient problems of all kinds can absolutely cause unavoidable data corruption. Hardware is intentionally designed with layers of error correction and redundancy to avoid propagating corruption, but it still gets through sometimes. There's one big reason to press on with formal verification nonetheless: the possibility of corruption or failure can be included in our axioms!

Firmware and operating systems already include state consistency assertions and [error correction codes](https://en.wikipedia.org/wiki/Error_detection_and_correction), and it would be nice if those checks themselves could be verified. The entire purpose of trackable effects is to allow environmental assumptions to be as high fidelity and stringent as possible without requiring every piece of software to actually care about all that detail. This means the lowest levels of our verification pyramid can fully include the possibility of corruption and carefully prove it can only cause a certain amount of damage in a few well-understood places. Then the higher levels of the pyramid can build on top of that much sturdier foundation. Additionally the concept of [corruption panics](./posts/design-of-magmide.md#corruption-panics) would allow software to include consistency checks even in situations that are logically impossible, to account for situations where the hardware has failed.

Yes it's true that we can only go so far with formal verification, so we should always remain humble and remember that real machines in the real world fail for lots of reasons we can't control. But we can go much much farther with formal verification than we can with testing alone! Proving correctness against a mere model with possible caveats is incalculably more robust than doing the same thing we've been doing for decades.

## Why can't you just teach people how to use existing proof languages like Coq?

The short answer is that languages like Coq weren't designed with the intent of making formal verification mainstream, so they're all pretty mismatched to the task. If you want a deep answer to this question both for Coq and several other projects, check out [`posts/comparisons-with-other-projects.md`](./posts/comparisons-with-other-projects.md).

This question is a lot like asking the Rust project creators "why not just write better tooling and teaching materials for C"? Because instead of making something *awesome* we'd have to drag around a bunch of frustrating design decisions. Sometimes it's worth it to start fresh.

<!-- ## How would trackable effects compare with algebraic effects?

There's a ton of overlap between the algebraic effects used in a language like [Koka](https://koka-lang.github.io/koka/doc/index.html) and the trackable effects planned for Magmide. Trackable effects are actually general enough to *implement* algebraic effects, so there are some subtle differences.

On the surface level the actual theoretical structure is different. Algebraic effects are "created" by certain operations and then "wrap" the results of functions. Trackable effects are defined by *starting* with some token representing a "clean slate", and then pieces of that token are given up to perform possibly effectful operations, and only given back if a proof that the operation is in fact "safe" is given.

This design means that trackable effects can be used for *any* kind of program aspect, from signaling conditions that can't be "caught" or "intercepted" (such as leaking memory), to notifying callers of the presence of some polymorphic control flow entrypoint that can be "hijacked".

It's important to also note that the polymorphic control flow use cases of algebraic effects could be achieved with many different patterns that no one would strictly call "algebraic effects". For example a type system could simply treat all the implicitly "captured" global symbols as the default arguments of an implicit call signature of a function, allowing those captured global signals to be swapped out by callers (if a function uses a `print` function, you could detect that capture and supply a new `print` function without the function author needing to explicitly support that ability). Or you could simply use metaprogramming to ingest foreign code and replace existing structures. For this reason trackable effects would be more focused on effects related to correctness and safety rather than control flow, despite the relationships between the two. -->

## Isn't it undecidable to prove a program terminates or is correct?

If I was claiming Magmide could somehow ignore the problem of [undecidability](https://en.wikipedia.org/wiki/Decidability_(logic)) (or the [halting problem](https://en.wikipedia.org/wiki/Halting_problem), or [Rice's theorem](https://en.wikipedia.org/wiki/Rice%27s_theorem), or [Godel's incompleteness theorems](https://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness_theorems)) then this question would be a useful one. However I'm *not* claiming that, which means you just haven't understood Magmide and its goals.

It's impossible to write an algorithm that can *automatically* and *without any guidance* determine whether *any arbitrary program* terminates/meets some non-trivial semantic condition. However it is possible to write algorithms that can do so *some* of the time. And it's *always* possible to use dependent type theory to check whether a proof object successfully proves some proposition. *Checking* proofs is decidable, it's only *constructing* proofs that's in general undecidable. Researchers routinely prove that *particular* programs terminate or have certain characteristics, and they often have to manually write proofs to do so.
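A tiny Coq example makes the distinction concrete. Constructing the proof term below is the part that's undecidable in general; checking that it has type `even 4` is a terminating, mechanical procedure:

```coq
(* An inductive definition of evenness as a proposition. *)
Inductive even : nat -> Prop :=
  | even_0 : even 0
  | even_SS : forall n, even n -> even (S (S n)).

(* A concrete proof object: the type checker decidably confirms that
   this term proves `even 4`. *)
Definition four_is_even : even 4 :=
  even_SS 2 (even_SS 0 even_0).
```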

Nothing in any of these documents claims we can ignore proven truths of logic. Magmide is just trying to integrate proven concepts (proof assistants and bare metal compilers) into a nice package.

I'm not an expert logician, and I'm happy to be corrected by more knowledgeable people. But if you're asking questions like this, you've simply misunderstood either Magmide or the referenced theorems.

## Isn't formal verification impractical in practice?

Historically, yes, verification systems have been very impractical, with three commonly cited issues:

- Extreme difficulty of composing proofs.
- Overly long and burdensome correctness annotations.
- Combinatorial explosion of proof terms or constraints, leading to unacceptable proof checking time.

I'm not terribly worried about composability, since separation logic systems such as Iris have demonstrated how much improvement the right abstractions can give. And I'm betting design features such as [asserted types](./posts/design-of-magmide.md#builtin-asserted-types), [inferred annotations](https://arxiv.org/abs/2207.04034), and [inferred proof holes](./posts/design-of-magmide.md#inferred-proof-holes) would make composing verified functions much more ergonomic. Ergonomics and abstractions can be improved over time, especially for specific classes of problems. We shouldn't throw out the entire idea of verification just because previous systems have had poor ergonomics.

I'm extremely excited about the already mentioned ["Flux: Liquid Types for Rust"](https://arxiv.org/abs/2207.04034) project, which demonstrated it's possible to ergonomically infer proof annotations. Essentially (mostly) all a programmer must do is add correctness conditions to *types* (just like [asserted types](./posts/design-of-magmide.md#builtin-asserted-types)) and (basically) all the other program annotations can be inferred. Flux then sends all those conditions to a solver and doesn't allow manual proofs for more complex conditions, but Magmide would allow manual proofs, meaning the correctness conditions could be arbitrarily interesting.
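
As a rough analogy in plain Coq (Magmide's asserted-type syntax is still speculative, so this is only the shape of the idea), the correctness condition lives on the type itself, and every construction site owes a proof obligation, which is exactly what a Flux-style system tries to infer:

```coq
(* a sigma type standing in for an "asserted type": the condition
   `n < 256` is part of the type, so building a `byte` requires a proof *)
Require Import Lia.

Definition byte := { n : nat | n < 256 }.

(* here the obligation is trivial and `lia` discharges it; a Flux-style
   system would infer and discharge obligations like this automatically *)
Definition a_byte : byte.
Proof. exists 200. lia. Defined.
```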

As for combinatorial explosion of verification conditions, it's absolutely true that the computational work necessary to verify software can be very expensive, especially if the proof system in question is fully automated and just generates a massive list of constraints to solve.

A few techniques can help us improve the situation:

- [Incremental compilation of proof terms](https://github.com/salsa-rs/salsa).
- [Computational reflection](https://gmalecha.github.io/reflections/2017/speeding-up-proofs-with-computational-reflection) (see the sketch after this list). For many specific problem domains it's possible to write very targeted decidable algorithms to find proofs or at least discharge many trivial proof obligations (the Rust borrow checker is an example!). Since such algorithms are narrowly targeted at a specific domain, they can perform much better than a general purpose tactic or constraint solver.
- Allowing manual/interactive proofs rather than requiring full automation. This may seem like a cop-out, and it certainly adds work for engineers, but if some theorem is simple to manually prove but would lead an automated system on a costly run through a massive search space, it's probably worth the time.
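
To illustrate the computational reflection point from the list above, here's a minimal Coq sketch: we write a boolean checker once, prove it sound once, and then each individual proof reduces to just running the checker.

```coq
(* proof by reflection in miniature: a decidable checker plus a one-time
   soundness lemma turns individual proofs into mere computation *)
Fixpoint evenb (n : nat) : bool :=
  match n with
  | 0 => true
  | 1 => false
  | S (S n') => evenb n'
  end.

Inductive even : nat -> Prop :=
  | even_O : even 0
  | even_SS : forall n, even n -> even (S (S n)).

Lemma evenb_sound : forall n, evenb n = true -> even n.
Proof.
  intros n.
  enough (G : (evenb n = true -> even n) /\
              (evenb (S n) = true -> even (S n))) by exact (proj1 G).
  induction n as [|n [IH1 IH2]]; split; intros H.
  - constructor.
  - discriminate H.
  - exact (IH2 H).
  - simpl in H. constructor. exact (IH1 H).
Qed.

(* the obligation is discharged by running `evenb`, not by manually
   stacking 500 applications of `even_SS` *)
Example even_1000 : even 1000.
Proof. apply evenb_sound. reflexivity. Qed.
```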

Just like ergonomics, compiler performance can be improved over time. Type systems can potentially add a huge amount of usability pain and compilation cost, but if the right design tradeoffs are found then type systems are well worth the trouble. Proof systems are simply much more advanced type systems, and I'm willing to bet the combination of Iris and a few of the design ideas I've referenced can achieve a worthwhile set of tradeoffs.

## Do you really think non-experts can meaningfully contribute here? Aren't you ignoring the difficult problems that researchers still haven't solved?

This question is a useful one to ask, but I ultimately think it's wrong-headed.

I make this claim: **the most important bottleneck to the broader adoption and application of formal methods isn't unsolved research problems, but the "day one" problems of ergonomic usability and connected reusability.** Importantly, I only make this claim because Iris exists, which demonstrated the ability to verify extremely complex and realistic Rust code.

Most of the software that's written every day isn't that complicated. Most of the correctness conditions people will actually care to prove will either relate to safety/security or to general robustness (not leaking memory, not throwing exceptions, not going into infinite loops/recursions), conditions that have been very rigorously explored by researchers. The research cutting edge is lightyears ahead of engineering practice, and we don't have to apply the full depth of theory to get huge payoffs in the general safety and stability of software.

Researchers will continue to find solutions to difficult theoretical problems, which is great. But as long as their solutions only exist in difficult to reuse media such as Coq or pdf papers, those solutions will barely matter. Amazing theoretical progress hasn't truly fulfilled its purpose until it has *somehow* been applied to the real world.

So instead of saying "we should wait for researchers to solve all these difficult problems", I propose we build a highly usable system *now* with the theory we already have. If such a system existed, even researchers would benefit, since they would have a place to contribute further breakthroughs that would give them more visibility and support and return contributions. Magmide just wants to give both industrial engineers and academic researchers a solid foundation, one they can share and build up together.

<!--
Furthermore, this question reveals a fundamental lack of respect for industrial engineers. It's certainly true that market incentives and a loose startup culture have made many programmers undisciplined and flippant about quality and robustness. But not all practitioners have the same incentives and culture, and a large body of them (including myself!) care deeply about these questions. These engineers might have even realized that their life gets *easier* and their code velocity *faster* when they use more robust systems, and so will be glad to ["get the hangover first"](https://www.youtube.com/watch?v=ylOpCXI2EMM&t=565s&ab_channel=Rust), especially if they can do so incrementally.

Academic researchers are not a separate race of super geniuses who are the only ones capable of understanding formal methods. Academics are simply given access to the time and social network necessary to understand a literature that seems to intentionally shun outsiders.
-->

## Why build a system focused on engineers when even academics don't always use proof assistants? Shouldn't we try to build a system researchers will use first?

No. If you create a tool that allows practical verification of real software systems, primarily intended for approachable use by engineers, you'll necessarily have created a theorem prover that's enjoyable and ergonomic to use, and that supports easy sharing and reuse of proof labor across an entire community.

That design doesn't in any way preclude supporting the patterns that researchers like (using/supporting homotopy type theory, allowing concise notation using a flexible metaprogramming system, rendering proofs as latex/pdf/html/whatever documents). A highly metaprogrammable bare metal proof assistant would attract researchers, but a beautiful theorem prover without any special capability to reason about or compile bare metal code wouldn't attract engineers.

Think about it: tons of researchers use Python to analyze data or automate common tasks, or focus their research on the details of C or Rust or some specific instruction set architecture. Many fewer use Coq or do research about Coq. In general, at least in computing, researchers tend to follow industrial engineers.

The verification use cases engineers care about are more specific and demanding than those researchers care about, and fully imply them. If we nail the use cases engineers care about, we'll get the use cases researchers care about basically for free.

## Isn't most software too fuzzy or quickly evolving to make verification worth the effort?

Yes, many systems don't really have a clear definition of "correct", but that doesn't mean *aspects* of the system aren't worth verifying, or that it wouldn't be worth building that system *using* verified tools.

We don't have to be able to verify every facet of every program to make verification worth the effort, we just have to be able to prove enough useful things that we can't already prove with existing type systems.

Refer to the concept of the [verification pyramid discussed above](https://github.com/magmide/magmide#fully-reusable).

## Why bother writing code and then verifying it when we could instead simply generate code from specifications?

Generating code based on specifications is an extremely cool idea! [Some researchers have already made extremely interesting strides in that direction.](https://plv.csail.mit.edu/fiat/)

It seems impossible to always generate code for *any* specification, since some specifications are unsatisfiable or undecidable. I'm not even sure it would always be possible for even relatively mundane code (reach out to me if you know more about the related theory!).

Regardless of the theoretical limits of the approach, deductive synthesis systems have to be built *from* something, and compile *to* something. That something ought to be a proof language capable of bare metal performance, so Magmide would be a perfect fit for creating deductive synthesis systems.

## How far are you? What remains to be done?

Very early, and basically everything remains to be done! I've been playing with models of very simple assembly languages to get my arms around formalization of truly imperative execution. Especially interesting has been what it looks like to prove some specific assembly language program will always terminate, and to ergonomically discover paths in the control flow graph which require extra proof justification. I have some raw notes and thoughts about this in [`posts/toward-termination-vcgen.md`](./posts/toward-termination-vcgen.md). Basically I've been playing with the design for the foundational computational theory.

In [`posts/design-of-magmide.md`](./posts/design-of-magmide.md) I outline my guess at the project's major milestones. Obviously a project as gigantic as this can only be achieved by inspiring a lot of hardworking people to come and make contributions, so each milestone will have to show exciting enough capability to make the next milestone happen.

Read [this blog post discussing my journey to this project](https://blainehansen.me/post/my-path-to-magmide/) if you're interested in a more personal view.

<!--
## Should I financially support this project?

I (Blaine Hansen, the maintainer and author of this document) have recently enabled Github Sponsors for Magmide, but **you likely shouldn't sponsor the project yet**. You are likely to be disappointed at how little influence a small sponsorship has on the speed of progress.

The ambition of this project means it's pretty "all or nothing". If the project never reaches the point where we've bootstrapped an initial version of the compiler, it would be difficult to say the project has provided any value at all. The chasm between here and there is pretty wide, with the most harrowing step being the definition of the language theory (operational semantics and custom weakest-precondition proposition to instantiate Iris).

I'm confident I could get the project to that point if I had some help/pointers from Iris experts *and the freedom to work on this project full-time*. I'm not at all confident I can do so in my nights and weekends, even with occasional code contributions from others. I've actually been looking around at various ways to support the project at the level of a full-time pursuit, but haven't been able to find anything that makes sense. The most natural path to complete a project like this would be to pursue it in a PhD program, and as exciting as that would be it isn't possible because of a flurry of personal constraints.

If you know of a company that would be willing to make a series of short bets on an unproven researcher, then please let me know. Otherwise the volume of support that will come through Github Sponsors is unlikely to materially affect how much time I or anyone else will have to work on this project. I'm not going to make anyone a promise I'm not sure I can keep.
-->

## This is an exciting idea! How can I help?

Just reach out! Since things are so early there are many questions to be answered, and I welcome any useful help. Feedback and encouragement are welcome, and you're free to reach out to me directly if you think you can contribute in some substantial way.

If you would like to get up to speed with formal verification and Coq enough to contribute at this stage, you ought to read [Software Foundations](https://softwarefoundations.cis.upenn.edu/), [Certified Programming with Dependent Types](http://adam.chlipala.net/cpdt/html/Cpdt.Intro.html), [this introduction to separation logic](http://www0.cs.ucl.ac.uk/staff/p.ohearn/papers/Marktoberdorf11LectureNotes.pdf), and sections 1, 2, and 3 of the [Iris from the ground up](https://people.mpi-sws.org/~dreyer/papers/iris-ground-up/paper.pdf) paper. You might also find my unfinished [introduction to verification and logic in Magmide](./posts/intro-verification-logic-in-magmide.md) useful, even if it's still very rough.

Here's a broad map of all the mad scribblings in this repo:

- `theory` contains exploratory Coq code, much of which is unfinished. This is where I've been playing with designs for the foundational computational theory.
- `src`, `plugins`, and `test_theory` contain Rust, Ocaml, and Coq code representing the current skeleton of the [initial bootstrapping toolchain](./posts/design-of-magmide.md#project-plan).
- `posts` has a lot of speculative writing, mostly to help me nail down the goals and design of the project.
- `notes` has papers on relevant topics and notes I've made purely for my own learning.
- `notes.md` is a scratchpad for raw ideas, usually ripped right from my brain with very little editing.
- `README.future.md` is speculative writing about a "by example" introduction to the language. I've been toying with different syntax ideas there, and have unsurprisingly found those decisions to be the most difficult and annoying :cry:

Thank you! Hope to see you around!

---

# What could we build with Magmide?

A proof checker with builtin support for metaprogramming and verification of assembly languages would allow us to build any logically representable software system imaginable. Here are some rough ideas I think are uniquely empowered by the blend of capabilities that would be afforded by Magmide. Not all of these ideas are *only* possible with full verification, but I feel they would become much more tractable.

## Truly eternal software

This is a general quality, one that could apply to any piece of software. With machine checked proofs, it's possible to write software *that never has to be rewritten or maintained*. Of course in practice we often want to add features or improve the interface or performance of a piece of software, and those kinds of expected improvements can't be anticipated enough to prove them ahead of time.

But if the intended function of a piece of software is completely understood and won't significantly evolve, it's possible to get it right *once and for all*. This would be especially valuable wherever the software is hard to reach once deployed, such as in many embedded systems: firmware, IOT applications, software in spacecraft, etc.

## Safe foreign code execution without sandboxing

If it's possible to prove a piece of code is well-behaved in arbitrary ways then it's possible to simply run foreign and untrusted code without any kind of sandboxing or resource limitations, as long as that foreign code provides a consistent proof object demonstrating it won't cause trouble.

What kind of performance improvements and increased flexibility could we gain if layers like operating systems, hypervisors, or even internet browsers only had to type check foreign code to know it was safe to execute with arbitrary system access? Of course we still might deem this too large a risk, but it's interesting to imagine.
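
As a minimal sketch of the shape of this idea (the names `program`, `safe`, and `verified_program` are hypothetical stand-ins, not a real Magmide API): the host defines a safety predicate, and foreign code is only accepted bundled with a proof of it, so the check happens once at load time instead of continuously at runtime.

```coq
(* hypothetical sketch of proof-carrying code: `program` and `safe`
   stand in for a real machine model and safety specification *)
Section ProofCarryingCode.
  Variable program : Type.
  Variable safe : program -> Prop.

  (* foreign code arrives together with its proof object; the host
     typechecks `code_safe` once and can then run `code` with
     arbitrary system access, no sandbox required *)
  Record verified_program := {
    code : program;
    code_safe : safe code
  }.
End ProofCarryingCode.
```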

## Verified critical systems

Many software applications are critical for safety of people and property. It would be nice if applications in aeronautics, medicine, industrial automation, cars, banking and finance, decentralized ledgers, and all the others were fully verified.

## Secure voting protocols

It isn't good enough for voting machines to be provably secure; the voting system itself must be cryptographically transparent and auditable. The [ideal requirements](https://en.wikipedia.org/wiki/End-to-end_auditable_voting_systems) are extremely complex, and would be very difficult to get right without machine checked proofs.

Voting is sufficiently high stakes that it's extremely important for a voting infrastructure to not simply be correct, but be *undeniably* correct. I imagine it will be much easier to assert the fairness and legitimacy of voting results if all the underlying code is much more than merely audited and tested.

## Universally applicable type systems

Things like the [Underlay](https://research.protocol.ai/talks/the-underlay-a-distributed-public-knowledge-graph/) or the [Intercranial Abstraction System](https://research.protocol.ai/talks/the-inter-cranial-abstraction-system-icas/) get much more exciting in a world with a standardized proof checker syntax to describe binary type formats. If a piece of data can be annotated with its precise logical format, including things like endianness and layout semantics, then many more pieces of software can automatically interoperate.
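
As a hypothetical sketch (these constructor names are mine, not a proposed standard), a binary format description could be ordinary first-class data in the proof language, so endianness and layout semantics travel with the bits they describe:

```coq
(* a toy self-describing binary layout; real descriptions would also
   carry proofs relating the layout to its logical interpretation *)
Require Import String List.
Import ListNotations.

Inductive endianness := Big | Little.

Inductive layout : Type :=
  | UInt (bits : nat) (e : endianness)
  | Array (len : nat) (elem : layout)
  | Struct (fields : list (string * layout)).

(* e.g. a little-endian RGBA pixel *)
Definition pixel : layout :=
  Struct [("r"%string, UInt 8 Little); ("g"%string, UInt 8 Little);
          ("b"%string, UInt 8 Little); ("a"%string, UInt 8 Little)].
```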

I'm particularly excited by the possibility of improving the universality of self-describing apis, ones that allow consumers to merely point at some endpoint and metaprogrammatically understand the protocol and type interface.

## Truly universal interoperability

All computer programs in our world operate on bits, and those bits are commonly interpreted as the same few types of values (numbers, strings, booleans, lists, structures of those things, standardized media types). In a world where all common computation environments are formalized and programs can be verified to correctly model common logical types in any of those common computation environments, then correct interoperation between those environments can also be verified!

It would be very exciting to know with deep rigorous certainty that a program can be compiled for a broad host of architectures and model the same logical behavior on all of them.

## Semver enforcing and truly secure package management

Since so much more knowledge of a package's api can be had with proof checking and trackable effects, we can have distributed package management systems that can enforce semver protocols at a much greater granularity and ensure unwanted program effects don't accidentally (or maliciously!) sneak into our dependency graphs.

## Invariant protection without data hiding

In many languages some idea of encapsulation or data hiding is supported by the language, to allow component authors to ensure outside components don't reach into data structures and break invariants. With proof checking available, it's possible to simply encode invariants directly alongside data, effectively making arbitrary invariants a part of the type system. When this is true, data no longer has to be hidden at the type system level. We can still choose to make some data hidden from documentation, but doing so would simply be for clarity rather than necessity.
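
In plain Coq the pattern looks like this sketch: the invariant is a field sitting right next to the data, so outside code can read `items` freely but can never construct a value that breaks `items_sorted`.

```coq
(* invariants carried alongside data instead of hidden behind an opaque
   module boundary: the data is fully visible, the invariant unbreakable *)
Require Import List.

Definition sorted (l : list nat) : Prop :=
  forall i j, i < j < length l -> nth i l 0 <= nth j l 0.

Record sorted_list := {
  items : list nat;
  items_sorted : sorted items
}.
```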

Removing the need for data hiding allows us to reconsider almost all common software architectures, since most are simply trying to enforce consistency with extra separation. Correct composition can be easy and flexible, so we can architect systems for greatest performance or clarity and remove unnecessary walls. For example strict microservice architectures might lose much of their usefulness.

## Flattened async executor micro-kernel operating system

The process model is a very good abstraction, but the main reason it's useful is because it creates hard boundaries around different programs to prevent them from corrupting each other's state. Related to the above point, what if we don't have to do that anymore? What if code from different sources could simply inhabit the same memory space without much intervention?

The Rust community has made some very innovative strides with their asynchronous executor implementations, and I am one person who believes the "async task" paradigm is an extremely natural way to think about system concurrency and separation. What if an async task executor could simply be the entire operating system, doing nothing but managing task scheduling and type checking new code to ensure it will be well-behaved? In this paradigm, the abstractions offered by the operating system can be moved into a *library* instead of being offered at runtime, and can use arbitrary capability types to enforce permissions or other requirements. Might such a system be both much more performant and simpler to reason about?

## Metaprogrammable multi-persistence database

Most databases are designed to run as an isolated service to ensure the persistence layer is always in a consistent state that can't accidentally be violated by user code. With proof invariants this isn't necessary, and databases can be implemented as mere libraries.

Immutable update logs have proven their value, and with proof checking it would be much easier to correctly build "mutable seeming" materialized views based on update commands. Databases could more easily save multiple materialized views at different scales in different formats.

## More advanced memory ownership models

Rust has inspired many engineers with the beautiful and powerful ideas of ownership and reference lifetimes, rooting out many tricky problems before they arise.

However the model is too simple for many obviously correct scenarios, such as mutation of a value from multiple places within the same thread, or pointers in complex data structures that still only point to ownership ancestors or strict siblings, as is the case in doubly-linked lists. More advanced invariants and arbitrary proofs can solve these problems.

## Reactivity systems that are provably free from leaks, deadlocks, and cycles

Reactive programming models have become ubiquitous in most user interface ecosystems, but in order to make sense they often rely on the tacit assumption that user code doesn't introduce resource leaks or deadlocks or infinite cycles between reactive tasks. Verification can step in here, and produce algorithms that enforce tree-like structures for arbitrary code.


================================================
FILE: iris-notes.md
================================================
>
  “number of steps of computation that the program may perform”. This intuition is not entirely
  correct, but it is close enough.

  V⟦A⟧δ is now a predicate over both a natural number k ∈ N and a closed value v.
  Intuitively, (k, v) ∈ V⟦A⟧δ means that no well-typed program using v at type A will "go
  wrong" in k steps (or less).

what does it mean for something to hold for k steps?


>
  iProp is obtained from a more general construction: uniform predicates over
  a unital resource algebra M, written UPred(M).

  The type UPred(M) consists of predicates over step-indices and resources (from M) which
  are down-closed with respect to the step-index and up-closed with respect to the resource:

  UPred(M) := {P ∈ Prop(N, M) | ∀(n, a) ∈ P. ∀m, b. m ≤ n ⇒ a ≼ b ⇒ (m, b) ∈ P}

so if some (n, a) is "proven", then so is any (m, b) where both m is `<=` (earlier than or same?) n and b is `>=` (includes or same) a
so you can take a valid (n, a) and make it either closer in number of steps or involving a larger piece of resource algebra state?
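
trying to restate that in toy Coq to check my understanding (with a generic `included` relation standing in for resource-algebra inclusion):

```coq
(* toy restatement of UPred(M): predicates over (step-index, resource)
   pairs, down-closed in the index and up-closed in the resource *)
Record UPred (M : Type) (included : M -> M -> Prop) := {
  holds : nat -> M -> Prop;
  holds_closed : forall n a m b,
    holds n a -> m <= n -> included a b -> holds m b
}.
```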


- how does step-indexing *actually* relate to program steps?
- are the step indexes only ever `infinity` or `1`?

```
In the base case, when the argument is a value v, we have to prove the postcondition Q(v)
(after potentially updating the ghost state). Otherwise, if e is a proper expression, we get
to assume the state interpretation SI(h) (explained below) and have to show two conditions:
(1) the current expression e can make progress in the heap h, where
progress(e, h) := ∃e′, h′. (e, h) ❀ (e′, h′), and (2) for any successor expression e′ and
heap h′, we have to show the weakest precondition and the state interpretation after an
update to the ghost state and after a later.
The updates in both cases make sure that we can always update our ghost state when
we prove a weakest precondition. These updates are instrumental for working with the
state interpretation below and for verifying code which relies on auxiliary ghost state.
The later in the second case ensures that the weakest precondition can be defined as a
guarded fixpoint. Moreover, it ties program steps to laters in our program logic (i.e., in the
rules LaterPureStep, LaterNew, LaterLoad, and LaterStore). In fact, this later in the
definition of the weakest precondition is responsible for the intuition: "▷ P means P holds
after the next step of computation". More concretely, if one proves a weakest precondition
wp e {v. Q(v)} under the assumption ▷ P then, after the next step of computation, the goal
becomes ▷ wp e′ {v. Q(v)}. We can then use the rule LaterMono to remove the later in
front of wp e′ {v. Q(v)} and in front of ▷ P.
```



the prefix `TC` is "typeclass" and comes from stdpp. it seems they've redefined a bunch of the basic operators in coq (eq, and, or, forall, etc) as typeclasses?




`bi` == bunched implications, which is just the logical ideas of separation logic (* operator as resource composition, -* like a "resource function" that can take resources and transform them, etc)

`si` == step-indexed, still don't entirely get the intuition behind step indexed relations, but whatever

`coPset` == set of positive binary numbers. `co` is for the idea of "cofiniteness"? a subset is `co`finite if its `co`mplement is finite.
it looks like `coPset`s are used as the "masks"? the sets that hold ghost variable/invariant names?

`E` is generally used for masks

`Canonical` is just a command for making some typeclass instance available to coq's type inference, so it can be found automatically

`Structure` is the same as `Record`!!!!

`lb` == lower bound

`%I` means to resolve in `bi_scope`

Leibniz equality is the kind where two things are equal if all propositions that are true for one are true for the other


`|==>` is `bupd`, or basic update

`P ==∗ Q` is `(P ⊢ |==> Q)`, so P entails you can get an updatable Q, using separation logic entailment
confusingly it can also mean `(P -∗ |==> Q)` in bi_scope?

```
Class BUpd (PROP : Type) : Type := bupd : PROP → PROP.
Notation "|==> Q" := (bupd Q) : bi_scope.
Notation "P ==∗ Q" := (P ⊢ |==> Q) (only parsing) : stdpp_scope.
Notation "P ==∗ Q" := (P -∗ |==> Q)%I : bi_scope.

Class FUpd (PROP : Type) : Type := fupd : coPset → coPset → PROP → PROP.
Notation "|={ E1 , E2 }=> Q" := (fupd E1 E2 Q) : bi_scope.
Notation "P ={ E1 , E2 }=∗ Q" := (P -∗ |={E1,E2}=> Q)%I : bi_scope.
Notation "P ={ E1 , E2 }=∗ Q" := (P -∗ |={E1,E2}=> Q) : stdpp_scope.

Notation "|={ E }=> Q" := (fupd E E Q) : bi_scope.
Notation "P ={ E }=∗ Q" := (P -∗ |={E}=> Q)%I : bi_scope.
Notation "P ={ E }=∗ Q" := (P -∗ |={E}=> Q) : stdpp_scope.
```

In general the `▷=>^ n` syntax indicates a number of steps `n` accompanying the mask update?


`wsat` is world satisfaction



in the context of ofes `dist` means distance
> The type `A -n> B` packages a function with a non-expansiveness proof

> When an OFE structure on a function type is required but the domain is discrete,
one can use the type `A -d> B`.  This has the advantage of not bundling any
proofs, i.e., this is notation for a plain Coq function type.

> When writing `(P)%I`, notations in `P` are resolved in `bi_scope`

so it looks like the suffix `I` means internal


`■ (P)` means "plainly P", meaning P holds when no resources are available

`Λ` is generally an instance of a `language`

it seems `tp` is generally a thread pool?

it seems `upd` is update
and `bupd` is basic update
and `fupd` is fancy update


It seems the suffix `G` is used to mean "in global"


the only purpose of "later" is to prevent the kinds of infinite loops that can make a logic invalid (able to prove False). it's used to define propositions like weakest preconditions that must somehow bake the idea of "the program takes a step" into their meaning

ordered families of equivalences (ofe's) are just a "convenient" (if you can call them that) way of encoding "steps" into the system. ofe's make the equivalence of some pieces of data dependent on a step index, so pieces of data might be equivalent at some indexes but not others.
but most of the time the step indexes don't matter! most actual *data types* aren't recursive or hold some concept of computational steps in them, so the "equivalences" hold for *all* step indexes!

a "cmra" or "camera" is the fully general version of a resource algebra that actually uses the idea of step-indexed equality.



just copying a chunk of `docs/resource_algebras.md`:

>
  The type of Iris propositions `iProp Σ` is parameterized by a *global* list `Σ:
  gFunctors` of resource algebras that the proof may use.  (Actually this list
  contains functors instead of resource algebras, but you only need to worry about
  that when dealing with higher-order ghost state -- see "Camera functors" below.)

  In our proofs, we always keep the `Σ` universally quantified to enable composition of proofs.
  Each proof just assumes that some particular resource algebras are contained in that global list.
  This is expressed via the `inG Σ R` typeclass, which roughly says that `R ∈ Σ`
  ("`R` is in the `G`lobal list of RAs `Σ` -- hence the `G`).



iris
  program_logic: it seems to contain files related to the instantiation of iris and weakest preconditions for the general "language" concept with exprs and vals etc. I don't think I care except to look for patterns and examples

  base_logic: is all the pay dirt in here?

  bi: contains files related to bunched implications logic?
  si_logic: contains files related to step-indexed logic?

  algebra: contains files related to resource algebras?










So I'll have to define some `magmideG` typeclass and `magmideΣ` list of resource algebras and a `subG_magmideΣ` instance

`inG` asserts some resource algebra is in a list
`subG` asserts a list of resource algebras is contained in a list

> The trailing `S` here is for "singleton"

hmm

```coq
Class magmideG Σ := {
  magmide_inG: inG Σ magmideR;
  magmide_some_other_library: some_other_libraryG Σ
}.
Local Existing Instances magmide_inG.
Local Existing Instances magmide_some_other_library.
(* ... other fields *)

Definition magmideΣ: gFunctors := #[GFunctor magmideR; some_other_libraryΣ].

Instance subG_magmideΣ {Σ}: subG magmideΣ Σ → magmideG Σ.
Proof. solve_inG. Qed.

Section proof.
  Context `{!magmideG Σ, !otherthingsG Σ}.
End proof.
```

> The backtick (`` ` ``) is used to make anonymous assumptions and to automatically
generalize the `Σ`.  When adding assumptions with backtick, you should most of
the time also add a `!` in front of every assumption.  If you do not then Coq
will also automatically generalize all indices of type-classes that you are
assuming.  This can easily lead to making more assumptions than you are aware
of, and often it leads to duplicate assumptions which breaks type class
resolutions.



================================================
FILE: justfile
================================================
# build:
# 	dune build

# wget --no-check-certificate -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
# add-apt-repository 'deb http://apt.llvm.org/bionic/   llvm-toolchain-bionic-13  main'
# sudo apt install llvm-13 libclang-common-13-dev
lab:
	#!/usr/bin/env bash
	# lli-13 lab.ll
	cargo run
	lli-13 lab.bc
	echo $?

test:
	cargo test
	dune runtest

dev:
	cargo test test_lex -- --nocapture


clean:
	#!/usr/bin/env bash
	pushd theory
	make clean
	rm -f *.glob
	rm -f *.vo
	rm -f *.vok
	rm -f *.vos
	rm -f .*.aux
	rm -f .*.d
	rm -f Makefile*
	rm -f .lia.cache
	rm -f *.ml*
	popd

build:
	#!/usr/bin/env bash
	pushd theory
	make
	popd

fullbuild:
	#!/usr/bin/env bash
	pushd theory
	coq_makefile -f _CoqProject *.v -o Makefile
	make clean
	make
	popd


================================================
FILE: lab.ll
================================================
; https://stackoverflow.com/questions/41716079/llvm-how-do-i-write-ir-to-file-and-run-it/41833643
; https://stackoverflow.com/questions/7773194/is-it-possible-to-use-llvm-assembly-directly

; https://ecksit.wordpress.com/2011/01/01/hello-world-in-llvm/
; https://kripken.github.io/llvm.js/demo.html

; @str = internal constant [19 x i8] c"Hello LLVM-C world!"

; declare i32 @puts(i8*)

define i32 @main() {
doit:
	; https://blog.yossarian.net/2020/09/19/LLVMs-getelementptr-by-example
	; %0 = call i32 @puts(i8* getelementptr inbounds ([19 x i8], [19 x i8]* @str, i32 0, i32 0))
	%0 = add i32 3, 4
	%1 = add i32 %0, %0
	ret i32 %1
}


================================================
FILE: mg_examples/main.mg
================================================
type Day;
	| Monday
	| Tuesday
	| Wednesday
	| Thursday
	| Friday
	| Saturday
	| Sunday

proc next_weekday(d: Day): Day;
	match d;
		Day.Monday => Day.Tuesday
		Day.Tuesday => Day.Wednesday
		Day.Wednesday => Day.Thursday
		Day.Thursday => Day.Friday
		_ => Day.Monday

proc same_day(d: Day): Day;
	d

prop Eq(@T: type): [T, T];
	(t: T): [t, t]

thm example_next_weekday: Eq[next_weekday(Day.Saturday), Day.Monday];
	Eq(Day.Monday)


================================================
FILE: notes/2019-popl-iron-final.md
================================================
pretty simple so far, just saying none of the concurrent separation logics enable tracking *obligations*, merely correctness in the sense of not *doing* something incorrect, rather than *incorrectly forgetting to do something necessary*.

this is a problem whenever we're using persistent/duplicable/shareable invariants, which can be copied arbitrarily to be given to different threads. doing this is necessary in fork-style concurrency (vs "structured" concurrency in which the language syntax itself determines where invariants exist).
since they're duplicable, they can be thrown away

the main way they're going to solve this problem is with what they're calling "trackable resources"
the first one is the "trackable points-to connective" `l ->_pi v`, where pi is a rational number describing what fraction of the heap we have control or knowledge of. `pi = 1` means we own the whole thing, and `pi < 1` means someone else has some control

then they define Iron++, which defines "trackable invariants" (rather than resources), and Iron++ is linear rather than affine (it doesn't have the weakening rule, so you can't throw away resources). this means these invariants aren't duplicable, but instead have to be "split"

getting into it, they define some rules, in which the `e_pi` proposition is like an empty heap, equivalent to the permission to allocate.

emp-split:
`e_pi1 ∗ e_pi2 <-> e_(pi1 + pi2)`

pt-split:
`(l -->_pi1 v) * (e_pi2) <-> (l -->_(pi1 + pi2) v)`

since `e_pi` propositions allow us to demonstrate we've deallocated memory, we can prove a program doesn't leak memory by giving it a hoare triple of `{ e_pi } program { e_pi }` where pi is equal in pre and post, for any pi


I got all I needed from this paper I think


================================================
FILE: notes/assembly-proofs.md
================================================
this paper is mostly just a reimplementation of vale in f*, but with a more efficient proof reflection style verification condition generator
the generator is more efficient just because it's a polynomial time algorithm that checks all the easily decidable stuff and defers everything else to a solver. whatevs


================================================
FILE: notes/category-theory-for-programmers.md
================================================


================================================
FILE: notes/coq-coq-correct.md
================================================
something I have to look at is how metaprogramming works in a bunch of these other languages, metacoq and f* metaprogramming

> This paper proposes to switch from a trusted code base to a trusted theory base paradigm!

okay I can't read this yet, I have to read metacoq


================================================
FILE: notes/coq-metacoq.md
================================================
okay reading this has actually been helpful
I'm still a little hazy on all the typing rules of cic, I guess mostly that they seem to be not as complex as I would assume them to be. honestly the coq reference is almost certainly a better place to understand all of that.

however now that I actually understand metacoq and how to use it, I intend to use it to play around with a simpler way of declaring everything, such as a `type A = ` oriented way of doing things


================================================
FILE: notes/indexing-foundational-proof-carrying-code.md
================================================
so far this paper is really simple, it's just saying what proof-carrying-code (PCC) is and why it's valuable. he's also saying it would be great for these systems to not assume a particular type-system, but instead just be rooted in mathematics/logic.

VC generator: verification condition generator (akin to a tactic that examines code and infers hoare triples?)

so the first 4 sections of this paper are just talking about how we can specify the operational semantics of a physical machine and instruction set, then define program state safety and program safety in terms of the step relation given by the operational semantics. pretty simple! especially interesting is the idea of a safe *program*, which depends on the program being written in a *position independent* manner (which I suppose would mean all instructions merely reference offsets from the program counter).

see now in section 5 he's talking about *typed* intermediate representations, which is dumb! metaprogrammable recombination forever!

he's also talking about the difference between syntactic and semantic type representation. I guess the core difference is that syntactic type representation is *opaque*, the syntax rules are basically assigned axiomatically. whereas semantic ones are rooted in actual logic, so all the transformation rules can be derived from the underlying meaning of the types.

but now we're getting to "recursive contravariance?" and how it makes step-indexing necessary? I'm almost there.

Instead of saying a type is a set of values, we say it is a set of pairs `<k, v>`, where k is an approximation index and v is a value. The judgement `<k, v> ∈ τ` means, "v approximately has type τ, and any program that runs for fewer than k instructions can't tell the difference." The indices k allow the construction of a well founded recursion, even when modeling contravariant recursive types.

So I guess the k-indexing is just a wrapper of some kind? I think contravariant recursion is just another way of saying it has to be strictly positive in the coq sense. an inductive constructor can't accept as an argument a function that itself takes the inductive type being defined as an argument, because this allows for infinite recursion and therefore unsoundness.


================================================
FILE: notes/indexing-indexed-model.md
================================================
this one is actually getting somewhere. it's basically the same paper as `indexing-foundational-proof-carrying-code` but actually gives some intuitions for what they're talking about with recursive types

the important thing it seems is this mu operator

```
µF ≡ {<k, v> | <k, v> ∈ F^(k+1)(⊥)}

µ(F) = λk. λv. ∀τ. ncomp(F, k + 1, ⊥, τ) ⇒ τ k v

where ncomp(F, k, g, h) means informally that F^k(g) = h

ncomp(f, n, x, y) can be defined as,
  ∀g.
    (∀z. g(f, 0, z, z))
    ⇒ (∀m, z1, z2. m > 0 ⇒ g(f, m − 1, z1, z2) ⇒ g(f, m, z1, f(z2)))
    ⇒ g(f, n, x, y).
```


================================================
FILE: notes/indexing-modal-model.md
================================================
Before getting into the real paper, I'm going to quickly try to gain some clue about what modal logic and kripke semantics are, why they're useful, and how they might relate to step-indexing.

# https://plato.stanford.edu/entries/logic-modal/
In general, modal logic is a logic where the truth of a statement is "qualified", using some "mode" like "necessarily" or "possibly"

There's a weak logic called `K` (after Saul Kripke) that includes ~, -> as usual, but also the `nec` operator for "necessarily". (written with the annoying box symbol □)

`K` is just normal propositional logic with these rules added relating to the `nec`

Necessitation Rule: If A is a theorem of K, then so is `nec(A)`.

Distribution Axiom: `nec(A -> B) -> (nec(A) -> nec(B))`.

Then there's the `may` operator (for "possibly" or "maybe", written with the annoying diamond symbol ◊).
It can be defined from `nec` by letting `may(A) = ~nec(~A)`, or "not necessarily not A". This means `nec` and `may` mirror each other in the same way `forall` and `exists` do.

Uh oh, there's a whole family of modal logics based on which axioms of "simplification" they include? They're saying which ones make sense depends on what area you're working in. I'm sure this will lead to fun situations in step-indexing.

The important part! **Possible Worlds**

Every proposition is given a truth value *in every possible world*, and different worlds might have different "truthiness".

`v(p, w)` means that for some valuation `v`, propositional variable `p` is true in world `w`.

~ := `v(∼A, w) = True <-> v(A, w) = False`
-> := `v(A -> B, w) = True <-> v(A, w) = False or v(B, w) = True`
theorem 5 := `v(□A, w) = True <-> forall w': W, v(A, w') = True`
^^^^^^^^^
theorem 5 is important! it seems this is the thing that makes it all make sense.
since `nec` and `may` are equivalent to "all" and "some" when thinking about possible worlds, theorem 5 implies that `may` is similar to `exists`, `◊A = ∼□∼A`
`may` is true when the proposition is true in *some* worlds, but not necessarily all of them, or that we merely know that A isn't necessarily false *everywhere*.

Ah yeah hold on, theorem 5 isn't always reasonable for every kind of modal logic. in temporal logic, where a "world" is really just an "instant" (hint, this is almost certainly what we're dealing with in step-indexing), `nec` really means that something will *continue* to be true into the future, but may not have been in the past.

in these cases, we have to define some relation R to define "earlier than"

theorem K := `v(□A, w) = True <-> forall w', (R(w, w') -> v(A, w')) = True`

so essentially A is necessarily true in w if and only if forall worlds *that are later than w* A is still true

so then a kripke frame `<W, R>` is a pair of a set of worlds W and a relation R.

I'm skipping over a bunch of stuff that doesn't seem relevant for getting to step-indexing.

Okay bisimulation is a place where this is useful.
labeled transition systems (LTSs) represent computation pathways between different machine states.
An easily understood quote:

```
LTSs are generalizations of Kripke frames, consisting of a set W of states, and a collection of i-accessibility relations Ri, one for each computer process i. Intuitively, Ri(w, w') holds exactly when w' is a state that results from applying the process i to state w.
```

The last important thing I'll say: the properties (such as transitivity, or being a total preorder) of the *accessibility relation* R (it defines accessibility!) define what axioms are reasonable to use in some context.

# moving onto the paper!
https://www.irif.fr/~vouillon//smot/Proofs.html

okay they're just talking about what they're trying to achieve, especially how we need recursive and quantified types (quantified means that they may be generic or unknown, as is the case with something like `forall t: T`, where t is quantified) in order to represent tree structures in memory and other such things. types need to allow impredicativity, so types can refer to themselves

they talk a little about the difference between syntactic and semantic interpretations. The way I choose to understand this distinction is that syntactic rules can only refer to themselves and can't derive value from other systems, whereas semantic ones are merely embedded in some larger logical system that itself can be used to extend the rules.

This seems to point to an important distinction I've been missing:

```
We start from the idea of approximation
which pervades all the semantic research in the area. If we
type-check v : τ in order to guarantee safety of just the
next k computation steps, we need only a k-approximation
of the typing-judgment v : τ .
```

The important part is *next* k computation steps. It seems this implies that the type judgment may become false *after* k. This isn't how I was thinking about it, which was that the judgment *will become* true in k steps. The less-than relationship to k makes a lot more sense with this interpretation.

This also seems important:

```
We express this idea here using a Kripke semantics whose possible-worlds accessibility
relation R is well-founded: every path from a world w into
the future must terminate. In this work, worlds characterize
abstract computation states, giving an upper bound on the
number of future computations steps and constraining the
contents of memory.
```

I'm a little scared about the implications of this "every path must terminate" thing. I'm hoping that doesn't mean we can't prove things about possibly non-terminating programs (maybe we could define infinite divergence as a terminating "world"?). Nope! They specify in a later section of the paper that we can still use this idea to prove things about *any finite prefix* of any program.

I'll write down some of their base rules to help me remember:

w ||- v: T
means v has type T in world w

U |- T
means every value u of type U in any world w also has type T in w. this seems equivalent to saying U is a subtype of T.

Then the modal operator "later"!
`lat` quantifies over all worlds (all times) *strictly in the future*

they point out that `nec` instead applies to *now* as well as the future. I guess this contradicts my intuition that the "less than k" steps thing is meaningful here

More seemingly important stuff

```
Indeed, the combination of a well-founded R with a strict modal operator `lat` provides a clean induction principle to the logic, called the Löb rule,

lat(T) |- T
-----------
    |- T
```

So if assuming a prop holds later is enough to prove it holds now, then it simply holds, always.
It seems this just means the later is meaningless, *or that there's nothing in the prop that depends on the world*.

> In this section we will interpret worlds as characterizing abstract properties of the current state of computation. In particular, in a  system with mutable references, each world contains a memory typing

In different types of machines, a "world" is a different thing (a lambda calculus with a store is a pair of an expression and that store, a von Neumann machine is the pair of registers (including program pointer) and memory)

```
Clearly, the same value v may or may not have type T depending on the world w, that is, depending on the data structures in memory that v points to. Accordingly, we call a pair (w, v) a configuration (abbreviated "config"):

Config = W x V,

and define a type T ∈ Type as a set of configurations. Then,

(w, v) ∈ T
and
w |- v : T

are two alternative notations expressing the same fact.
```

> We will show how our semantics connects the relation R between worlds and the relation >-> between states.

I guess they're saying there's some sort of correspondence between the R relation showing how "worlds" are accessible in time from one-another and the small step relation `>->` that shows how computation states are accessible from one-another. This makes sense since worlds and states are the same thing.


So a type is just a set of configurations, or a set of values pointing to something in some world. This is basically saying that a type is all values *that exist in a world* that makes the type assertions true. Yeah, they say "a type is just any set of configurations"
basic stuff like the top/bottom types, "logical" intersection/union, and function types are pretty easy to describe then.

I'll put the first few in my own words:

top := {(w, v) | True}
the top type describes all configs! so it is a subtype of all configs

bot := {}
the bot type is the empty set, so it describes no configs, so it is unrepresentable

T /\ U := T intersection U
type and set intersection are equivalent, since the intersection of types T and U is only the configs that are described by both conditions

T \/ U := T union U
similar idea, we smush together the types, which means any config described by either of them is valid
related, discriminated unions then are the union of types which have no intersection, or where the description of each type necessarily precludes the other

U => T := {(w, v) | (w, v) ∈ U => (w, v) ∈ T }
this is slightly more involved, but only because I'm not sure if he's talking about implication or functions. I'm going to guess implication, since there's no talk of substitution or anything like that
all configs such that if the config is in U, it is also in T


Now he gets into how quantification is represented in the type system. These are more interesting.
importantly, in the below, A can be either Type, Loc, or Mem.

forall x:A.T := global_intersection<a in A>(T[a/x])
okay, first parsing it:
forall x which is an A, then T is defined as
the global intersection of all items a in A, for each which we've subsituted the a in the set which our variable x
that basically means that forall is the intersection set of all configs (or locs or mems) where...
I'm not sure I get it yet. the exists below is similar just with union, so I'll wait until later to understand what's going on. hopefully he gives an applied example.

exists x:A.T := global_union<a in A>(T[a/x])

Quantification over values in a world.
pretty simple,

!T := {(w, v) | forall v'. (w, v') in T}
all values in the current world have type T

?T := {(w, v) | exists v'. (w, v') in T}
some value in the current world has type T


Then they brag how they can define types in terms of their primitives without using the underlying logic.

T <=> U = T => U /\ U => T
T iff U, pretty simple (this confirms my suspicion that => was meant to indicate implication, although implication is isomorphic with functions, so there's something there as well)

teq(T, U) = !(T <=> U)
basically type equality, since for all values in (the current world) the types are equivalent to each other
the dependence on the current world is the only part I don't love....

world types (which teq(T, U) is one of) are types that only depend on the world, not the value (I'm guessing persistent types are ones that depend on neither)



Okay Vector Values is where I got kind of stuck before, let's write things out as we go to keep it clear.

```
we have locations `l: Loc`, that index a mutable store m;
storable values `u: SV` that are the range of m (contents of memory cells);
and values `v: V`.

We assume Loc subset SV (meaning locations are storable values, but there are more storable values than just locations)

On a von Neumann machine, SV = Loc (so locations *do* in fact fully describe storable values)
and v is a vector of locations (one could think of a register-bank) indexed by a natural number j.
That is, if v is a value, then v(j) is a Loc. (meaning a "value" is a terrible name for what they're talking about! a value is a register bank, so v(j) is choosing a particular register to grab a Loc from. but they're using value in the config sense of a (w, v), or a world and a *value*. this means they're saying the world is the state of memory and the value is the state of the registers, at least on a von Neumann machine. Magmide will make this clearer by just making all things byte arrays and lists of byte arrays)
```

This part is where is gets hairier:

```
In order to type locations, we choose an injective function (a function that is one-to-one) `.->` from storable values to values (ints to register banks), for instance
`u-> := lambda j. u`
This way the same set of types can be used for all kinds of values.
```

The "in order to type locations" is important. I'm hoping this will become more clear. I understand all the parts of that sentence, but not the purpose of the sentence.

I think it becomes clearer with the "This way the same set of types can be used for all kinds of values." They're talking about *world/value/config* values in this context, so I guess this injective function is trying to produce some kind of equivalence between von Neumann machines and lambda calculus.

This is even less clear

```
In lambda-calculus, `SV = V` is the usual set of values, so we have `Loc strictsubset SV` by syntactic inclusion, and we take `u-> := u`
```

again I understand the parts but not the sentence.
perhaps they're saying that in lambda calculus the store can hold anything, and the "value" of a lambda calculus machine is the current expression being reduced, so there isn't a need for this injective function? I'm still not sure what the injective function is for.


Singletons and slots.

I don't want to get stuck on this stuff.

based on this definition of the single type `just u` (the single storable value (SV) u)

just u := {(w, v) | v = u->}

I'm going to choose to believe that the injective arrow function is just saying that the value (register bank) v *can possibly produce u*????

and then

u: T = !(just u => T)
: here means "has type" in the more traditional sense
so for all values in the current world, if the value is u, then it has type T


exists l: Loc. just l /\ w(l)


Okay this makes more sense:

The type `slot(j, T)` characterizes values v such that the jth slot has type T.

slot(j, T) := {(w, v) | w ||- v(j): T}
all configs such that in the current world the storable value at slot j has type T

I think all this stuff was simpler than they made it seem, by this sentence:

> To say that register 2 has the value 3 we write slot(2, just 3).


Now on to the important stuff,

## Necessity and the modal operator "later"

Given two types U and T, we write
U |- T
when the type U is a subset of the type T, meaning
for every world w and value v,

w ||- v: U
implies
w ||- v: T

(if a value is a U, then it is also a T, so Us can be replaced with Ts, U is a subtype of T)

We write
|- T
to mean
top |- T
(so we don't assume any (useful) types are subsets of T, only the top type, which is a subtype of all types)


The accessibility relation R has to be transitive and well-founded, such as the less-than (<) relation.

So R(w, w') means the world w' comes at a strictly later stage than the world w.

From this we can define the later operator:

later(T) := {(w, v) | forall w'. R(w, w') => (w', v) in T}
so for all worlds strictly later than now (so w < w', or the step-index of w is less than w')
so v has type later(T) when v has type T in all worlds strictly later than now (the world w)

Some stuff can be proven about later,

it's monotone (if U is a subtype of T, then later(U) is a subtype of later(T))
it distributes over intersection:
  `later(global_intersection(Ti)) = global_intersection(later(Ti))`

now the necessity operator (the box), `nec`
`nec` means now and later, and is defined simply:

nec(T) = T /\ later(T)

also monotone,
forall T, nec(T) subtype of T
if nec(U) subtype T, then nec(U) subtype nec(T)
also distributes over intersection


necessary types
types that, once true in some world w, are true forever.

necessary(T) = T subtype later(T)
so if T is true then also later(T)
or T is a subtype of later(T)
or T can be used as later(T)

this won't always be true, since the store evolves from one world to the next, possibly destroying some type

forall T, necessary(nec(T))
since nec(T) simply contains lat(T) so we can grab it

forall T, necessary(lat(T))


The lob rule
since R is well-founded, this induction principle is true:

```
later(T) |- T
-------------
    |- T
```



Recursive types

I'm not going to go over this in detail.
Basically, let's say we have a *type* operator F, which maps types to types.
such an operator is contractive if, roughly, F(T) at any world depends only on T at strictly later worlds (later(T = U) implies F(T) = F(U)); since R is well-founded, every contractive F then has a fixed point mu F with mu F = F(mu F), which is the recursive type



A Kripke semantics of stores


I think this sentence is what makes the later operator make sense:

In this definition we write `later(m(l): T)`. There is some value u in memory at address l, and we guarantee to every future world that `u: T`. We don’t need to guarantee `u: T` in the current world because it takes one step just to dereference `m(l)`, and in that step we move to a future world.
This use of the later operator rather than the nec operator is crucial in order to solve the cardinality issue. Indeed, for a configuration ((n, Ψ), v), only the configurations of index strictly less than n are then relevant in the type Ψ(l).

So basically types can safely refer to themselves because the assertions on memory locations only apply to future states.
All types can only refer (at least in regards to memory) to worlds strictly later than the current one.

this especially applies to reference types: accessing the referenced value necessarily takes a step of computation, so `ref T` just means that some location has type T *later*.




Oh my god, they say in section 11 that a type T describes *the entire register bank*. It's the type of the whole machine! Since reference types are attached to the locations stored in the "value" (the register bank), we can assert the state of memory just by the type of the register bank.
We can type stack arguments by making assertions about the state of memory around the stack pointer.

A minimal machine could get by with just a program counter and memory, since even the return address can be put in stack arguments in memory


This paper hasn't heard of separation logic or something ha. They keep saying they have to specify that other registers aren't changed. no thanks.


This paper still doesn't explain why *props* have to have step-indexing when they are self-referencing!!


I get it all, at least at a high level, but I'm unsatisfied. maybe cpdt will help.

## cpdt because I said so (Universes)

> A predicative system enforces the constraint that, when an object is defined using some sort of quantifier, none of the quantifiers may ever be instantiated with the object itself.

so in an impredicative system, an object can be passed itself as an argument.
but what counts as "itself"?
I guess Prop gets around this by not taking *itself*, but *instances* of itself.
Okay, all he really says is that since Prop is always eliminated at extraction, it can't produce the infinite regressions that would let infinite loops prove anything, so it doesn't matter that it's impredicative.

so why can't iris use them directly!!!???


================================================
FILE: notes/iris-from-the-ground-up.md
================================================
An affine logic seems to just mean that the logic includes the weakening rule `P * Q -> P`: you can *throw away* knowledge/resources.

Resource algebras seem to be the important thing.

A resource algebra is a tuple

(M, V: M → Prop, |−|: M → M?, (·): M × M → M)

rules:

RA-associative: forall a, b, c. (a · b) · c = a · (b · c)
it doesn't matter how the compositions are grouped

RA-commutative: forall a, b. a · b = b · a
it doesn't matter what order the variables are composed in

RA-core-composition-identity: forall a. |a|: M ⇒ |a| · a = a
if the core of a value is defined (lands in M rather than being undefined), then composing the core with the original value gives back the original value

RA-core-idempotent: forall a. |a|: M ⇒ ||a|| = |a|
if the core of a value is defined, then the core of the core is the same as the core
(this also implies the core of the core composed with the original value is the same as the original value)

RA-core-monotonic: forall a, b. |a|: M ∧ a << b ⇒ |b|: M ∧ |a| << |b|
if a has a core and b extends a, then b also has a core, and b's core extends a's core

M? := M union {False}
M? is just the carrier M extended with an undefined/contradiction element

a? · False = False · a? = a?
composition is extended to M? by making False act as a unit: composing with False changes nothing

a << b := exists c: M. b = a · c
a is "less than" b, or b "extends" a:
some c exists that fills the gap between a and b in terms of composition

a --> B := forall c?: M?. V(a · c?) -> exists b: B. V(b · c?)
a --> b := a --> {b}
(this is the frame-preserving update, explained below)


a unital resource algebra (uRA) is a resource algebra M with an element ep satisfying these propositions:

V(ep)
ep is valid

forall a: M. ep · a = a
ep can be composed with anything without changing the original thing

|ep| = ep
the core of ep is ep itself


a frame-preserving update is an update from some resource a to some resource b, such that every frame c?: M? that is compatible with a (according to the V function) is also compatible with b
this essentially means that you can only update a resource in ways that can't invalidate whatever frame some other party might be holding



the core function |−| is basically the *duplication* function: it extracts the duplicable part of a resource. it is partial because some variants of a type aren't duplicable at all

The validity function V: M -> Prop basically defines what variants of the type are valid or acceptable

the composition function · defines what happens when you combine resources from different threads, or maybe more correctly it's equivalent to the separating conjunction `*` from separation logic
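
Here's a minimal Coq transcription of the rules above (my own sketch: the partial core |−|: M → M? becomes an option, `a << b` is rendered inline as an exists, and the laws are checked against a toy instance, nat under addition with core constantly 0):

```coq
Require Import Lia.

Record ra_laws (M: Type) (core: M -> option M) (op: M -> M -> M) := {
  ra_assoc: forall a b c, op (op a b) c = op a (op b c);
  ra_comm: forall a b, op a b = op b a;
  ra_core_id: forall a ca, core a = Some ca -> op ca a = a;
  ra_core_idem: forall a ca, core a = Some ca -> core ca = Some ca;
  (* |a|: M /\ a << b  =>  |b|: M /\ |a| << |b| *)
  ra_core_mono: forall a b ca,
    core a = Some ca -> (exists c, b = op a c) ->
    exists cb, core b = Some cb /\ (exists c, cb = op ca c);
}.

(* toy instance: nat under addition, everything "duplicable" only down to 0 *)
Example nat_ra: ra_laws nat (fun _ => Some 0) Nat.add.
Proof.
  constructor.
  - intros; lia.
  - intros; lia.
  - intros a ca H; simpl in H; injection H; intro; subst; reflexivity.
  - intros a ca H; exact H.
  - intros a b ca H _; simpl in H; injection H; intro; subst.
    exists 0; split; [reflexivity | exists 0; reflexivity].
Qed.
```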


ghost state view shifts are *consuming*: to update `P ==>_ep Q` you have to update the state, consuming or destroying P. normal propositions `A -> B` are *constructive*, and wands `A -* B`, being the resource-aware implication, similarly consume the `A` they're applied to

a mask on a hoare triple is like a set or map keeping track of which invariants are in force. accessing an invariant removes that invariant's *namespace* from the mask.





================================================
FILE: notes/iris-lecture-notes.md
================================================
https://gitlab.mpi-sws.org/iris/examples/-/tree/master/theories/lecture_notes

iris invariants let different threads read/write to the same locations, as long as they don't violate the invariant
iris ghost state lets invariants evolve over time, and keep track of information that doesn't exist in the actual program

# lambda,ref,conc

> A configuration consists of a heap and a thread pool, and a thread pool is a mapping from thread identifiers (natural numbers) to expressions, i.e., a finite set of named threads. Note that reduction of configurations is nondeterministic: we may choose to reduce in any thread in the thread pool. This reflects that we are modelling a kind of preemptive concurrent system.

> In the case of Iris the underlying language of “things” is simple type theory with a number of basic constants. These basic constants are given by the signature S.

This signature concept is probably going to be important.


> The types of Iris are built up from the following grammar, where T stands for additional base types which we will add later, Val and Exp are types of values and expressions in the language, and Prop is the type of Iris propositions.

τ ::= T | Z | Val | Exp | Prop | 1 | τ + τ | τ × τ | τ → τ

1 is basically just shorthand for unit? I guess?
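
Transcribed as a Coq inductive (my sketch; and yes, 1 is the unit type):

```coq
(* the grammar above: T stands for the additional base types *)
Inductive itype (T: Type): Type :=
  | Base (t: T)                (* T: additional base types *)
  | Int                        (* Z *)
  | Val | Exp | IProp          (* Val, Exp, Prop (Prop is a Coq keyword) *)
  | Unit                       (* 1 *)
  | Sum (a b: itype T)         (* τ + τ *)
  | Prod (a b: itype T)        (* τ × τ *)
  | Arrow (a b: itype T).      (* τ → τ *)
```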

> The judgments take the form Γ |-S t: τ and express when a term t has type τ in context Γ, given signature S. The variable context Γ assigns types to variables of the logic. It is a list of pairs of a variable x and a type τ such that all the variables are distinct. We write contexts in the usual way, e.g., x1: τ1, x2: τ2 is a context.


> The magic wand P −∗ Q is akin to the difference of resources in Q and those in P : it is the set of all those resources which when combined with any resource in P are in Q



Then they go on for a long time discussing pretty obvious rules that I already understand (basic logic, separation logic, basic lambda calculus stuff).


================================================
FILE: notes/jung-thesis.md
================================================
<< is an *inclusion relation*. a << b means that b is a "bigger resource" than a, or that we obtain b by composing a with some other resource


================================================
FILE: notes/known_types.md
================================================
```coq
Inductive typ: Type :=
  | Unit: typ
  | Nat: typ
  | Bool: typ
  | Arrow: typ -> typ -> typ
.

Fixpoint typeDenote (t: typ): Set :=
  match t with
    | Unit => unit
    | Nat => nat
    | Bool => bool
    | Arrow arg ret => typeDenote arg -> typeDenote ret
  end.

(*Definition typctx := list type.*)

Inductive exp: list typ -> typ -> Type :=
| Const: forall env newtyp (value: typeDenote newtyp), exp env newtyp
| Var: forall env newtyp, member newtyp env -> exp env newtyp
| App: forall env arg ret, exp env (Arrow arg ret) -> exp env arg -> exp env ret
| Abs: forall env arg ret, exp (arg :: env) ret -> exp env (Arrow arg ret).

Arguments Const [env].

(*Definition a: exp hlist Bool := Const HNil true.*)

Fixpoint expDenote env t (e: exp env t): hlist typeDenote env -> typeDenote t :=
  match e with
    | Const _ value => fun _ => value (* the constant denotes itself *)

    | Var _ _ mem => fun s => hget s mem
    | App _ _ _ e1 e2 => fun s => (expDenote e1 s) (expDenote e2 s)
    | Abs _ _ _ e' => fun s => fun x => expDenote e' (HCons x s)
  end.

(*Eval simpl in expDenote Const HNil.*)






(*
  okay I feel like I want to have a `compile` function that takes terms and just reduces the knowns, typechecks them, and outputs a string representing the "compiled" program
  then a `run` function that reduces the knowns and typechecks the program, but then reduces all the terms and outputs the "stdout" of the program
  this is presupposing that you'll have some kind of effectful commands that append some string to the "stdout". that seems like the more natural way I would prefer to structure a language that I'll eventually be using to learn while making a real imperative language
*)

(*Require Import Coq.Strings.String.
Require Import theorems.Maps.

Inductive typ: Type :=
  (*| Generic*)
  | Bool
  | Nat
  | Arrow (input output: typ)
  | UnionNil
  | UnionCons (arm_name: string) (arm_type: typ) (rest: typ)
  | TupleNil
  | TupleCons (left right: typ)
  (*| KnownType (type_value: trm)*)
  (*| KnownValue (value: trm)*)
.

Inductive Arm: Type :=
  | arm (arm_name: string).

Inductive trm: Type :=
  | tru | fls
  | debug_bool
  (*| nat_const (n: nat)*)
  (*| nat_plus (left right: trm)*)
  (*| debug_nat*)
  | binding (decl_name: string) (after: trm)
  | usage (var_name: string)
  | test (conditional iftru iffls: trm)
  | fn (args_name: string) (output_type: typ) (body: trm)
  | call (target_fn args: trm)
  | union_nil
  | union_cons (arm_name: string) (arm_value: trm) (rest_type: typ)
  | union_match (tr: trm) (arms: list (string * trm))
  | tuple_nil
  | tuple_cons (left right: trm)
  | tuple_access (tup: trm) (index: nat)
.


Fixpoint tuple_lookup (n: nat) (tr: trm): option trm :=
  match tr with
  | tuple_cons t tr' => match n with
    | 0 => Some t
    | S n' => tuple_lookup n' tr'
    end
  | _ => None
  end
.

Fixpoint union_lookup (tr: trm) (arms: list (string * (string * trm))): option trm :=
  match tr with
  | union_cons tr_arm_name tr_arm_value _ => match arms with
    | (arm_name, (arm_var, arm_body)) :: arms' => if eqb_string tr_arm_name arm_name
      then Some (substitute arm_var tr_arm_value arm_body)
      else union_lookup tr arms'
    | [] => None
    end
  | _ => None
  end
.
*)






(*Require Import Coq.Strings.String.
Require Import theorems.Maps.

(*Notation memarr := (@list string).*)


Inductive typ: Type :=
  | Base: string -> typ
  | Arrow: typ -> typ -> typ
  | TupleNil: typ
  | TupleCons: typ -> typ -> typ.


Inductive trm: Type :=
  | var: string -> trm
  | call: trm -> trm -> trm
  | fn: string -> typ -> trm -> trm
  (* tuples *)
  | tuple_proj: trm -> nat -> trm
  | tuple_nil: trm
  | tuple_cons: trm -> trm -> trm.


Inductive tuple_typ: typ -> Prop :=
  | TTnil:
    tuple_typ TupleNil
  | TTcons: forall T1 T2,
    tuple_typ (TupleCons T1 T2).

Inductive well_formed_typ: typ -> Prop :=
  | wfBase: forall i,
    well_formed_typ (Base i)
  | wfArrow: forall T1 T2,
    well_formed_typ T1 ->
    well_formed_typ T2 ->
    well_formed_typ (Arrow T1 T2)
  | wfTupleNil:
    well_formed_typ TupleNil
  | wfTupleCons: forall T1 T2,
    well_formed_typ T1 ->
    well_formed_typ T2 ->
    tuple_typ T2 ->
    well_formed_typ (TupleCons T1 T2).

Hint Constructors tuple_typ well_formed_typ.

Inductive tuple_trm: trm -> Prop :=
  | tuple_tuple_nil:
    tuple_trm tuple_nil
  | tuple_trm_tuple_cons: forall t1 t2,
    tuple_trm (tuple_cons t1 t2).

Hint Constructors tuple_trm.

(*Notation "x :: l" := (cons x l)
                     (at level 60, right associativity).*)
Notation "{ }" := tuple_nil.
Notation "{ x ; .. ; y }" := (tuple_cons x .. (tuple_cons y tuple_nil) ..).


Fixpoint subst (prev: string) (next: trm) (target: trm) : trm :=
  match target with
  | var y => if eqb_string prev y then next else target
  | fn y T t1 => fn y T (if eqb_string prev y then t1 else (subst prev next t1))
  | call t1 t2 => call (subst prev next t1) (subst prev next t2)
  | tuple_proj t1 i => tuple_proj (subst prev next t1) i
  | tuple_nil => tuple_nil
  | tuple_cons t1 tup => tuple_cons (subst prev next t1) (subst prev next tup)
  end.

Notation "'[' prev ':=' next ']' target" := (subst prev next target) (at level 20).


Inductive value: trm -> Prop :=
  | v_fn: forall x T11 t12,
    value (fn x T11 t12)
  | v_tuple_nil: value tuple_nil
  | v_tuple_cons: forall v1 vtup,
    value v1 ->
    value vtup ->
    value (tuple_cons v1 vtup).

Hint Constructors value.

Fixpoint tuple_lookup (n: nat) (tr: trm): option trm :=
  match tr with
  | tuple_cons t tr' => match n with
    | 0 => Some t
    | S n' => tuple_lookup n' tr'
    end
  | _ => None
  end.


Open Scope string_scope.

Notation a := (var "a").
Notation b := (var "b").
Notation c := (var "c").
Notation d := (var "d").
Notation e := (var "e").
Notation f := (var "f").
Notation g := (var "g").
Notation l := (var "l").
Notation A := (Base "A").
Notation B := (Base "B").
Notation k := (var "k").
Notation i1 := (var "i1").
Notation i2 := (var "i2").


Example test_tuple_lookup_nil_0:
  (tuple_lookup 0 {}) = None.
Proof. reflexivity. Qed.

Example test_tuple_lookup_nil_1:
  (tuple_lookup 1 {}) = None.
Proof. reflexivity. Qed.

Example test_tuple_lookup_cons_valid_0_a:
  (tuple_lookup 0 { a }) = Some a.
Proof. reflexivity. Qed.

Example test_tuple_lookup_cons_valid_0_a_b:
  (tuple_lookup 0 { a; b }) = Some a.
Proof. reflexivity. Qed.

Example test_tuple_lookup_cons_invalid:
  (tuple_lookup 3 { a; b; c }) = None.
Proof. reflexivity. Qed.
*)

```



```
Add LoadPath "/home/blaine/lab/cpdtlib" as Cpdt.
Set Implicit Arguments. Set Asymmetric Patterns.
Require Import List Cpdt.CpdtTactics Cpdt.DepList theorems.Maps Coq.Strings.String.

(*blaine, you need to write examples of what you'd like to accomplish in the near term*)
(*some concrete examples of "metaprogramming" in some abstract language is all you need*)
(*you don't have to prove almost anything about them, at least not at first, just get them working as expected and then prove things about them*)

(*the term type you create *is* the meta datatype! syntactic macros are just functions that operate on the same objects as the compiler*)

Inductive ty: Type :=
  | Ty_Bool: ty
  | Ty_Arrow (domain: ty) (range: ty): ty.

Inductive tm: Type :=
  | tm_var (name: string): tm
  | tm_call (fn: tm) (arg: tm): tm
  | tm_fn (argname: string) (argty: ty) (body: tm): tm
  | tm_true: tm
  | tm_false: tm
  | tm_if (test: tm) (tbody: tm) (fbody: tm): tm.

Declare Custom Entry stlc.
Notation "<{ e }>" := e (e custom stlc at level 99).
Notation "( x )" := x (in custom stlc, x at level 99).
Notation "x" := x (in custom stlc at level 0, x constr at level 0).
Notation "U -> T" := (Ty_Arrow U T) (in custom stlc at level 50, right associativity).
Notation "x y" := (tm_call x y) (in custom stlc at level 1, left associativity).
Notation "\ x : t , y" := (tm_fn x t y) (
  in custom stlc at level 90, x at level 99,
  t custom stlc at level 99,
  y custom stlc at level 99,
  left associativity
).
Coercion tm_var : string >-> tm.
Notation "'Bool'" := Ty_Bool (in custom stlc at level 0).
Notation "'if' x 'then' y 'else' z" := (tm_if x y z) (
  in custom stlc at level 89,
  x custom stlc at level 99,
  y custom stlc at level 99,
  z custom stlc at level 99,
  left associativity
).
Notation "'true'" := true (at level 1).
Notation "'true'" := tm_true (in custom stlc at level 0).
Notation "'false'" := false (at level 1).
Notation "'false'" := tm_false (in custom stlc at level 0).

Definition x: string := "x".
Definition y: string := "y".
Definition z: string := "z".
Hint Unfold x: core.
Hint Unfold y: core.
Hint Unfold z: core.

Notation idB := <{\x:Bool, x}>.
Notation idBB := <{\x:Bool -> Bool, x}>.

Inductive value: tm -> Prop :=
  | v_fn: forall arg T body,
      value <{\arg:T, body}>
  | v_true:
      value <{true}>
  | v_false:
      value <{false}>.
Hint Constructors value: core.


Reserved Notation "'[' old ':=' new ']' target" (in custom stlc at level 20, old constr).
Fixpoint subst (old: string) (new: tm) (target: tm): tm :=
  match target with
  | <{true}> => <{true}>
  | <{false}> => <{false}>
  | tm_var varname =>
      if string_dec old varname then new else target
  | <{\var:T, body}> =>
      if string_dec old var then target else <{\var:T, [old:=new] body}>
  | <{fn arg}> =>
      <{([old:=new] fn) ([old:=new] arg)}>
  | <{if test then tbody else fbody}> =>
      <{if ([old:=new] test) then ([old:=new] tbody) else ([old:=new] fbody)}>
  end

where "'[' old ':=' new ']' target" := (subst old new target) (in custom stlc).
Hint Unfold subst: core.

Check <{[x:=true] x}>.
Compute <{[x:=true] x}>.

Inductive substi (old: string) (new: tm): tm -> tm -> Prop :=
  | s_true: substi old new <{true}> <{true}>
  | s_false: substi old new <{false}> <{false}>
  | s_var_matches:
      substi old new (tm_var old) new
  | s_var_not_matches: forall varname,
      let varitem := (tm_var varname) in
      old <> varname -> substi old new varitem varitem
  | s_fn_matches: forall T body,
      let fn := <{\old:T, body}> in
      substi old new fn fn
  | s_fn_not_matches: forall var T body newbody,
      old <> var
      -> substi old new body newbody
      -> substi old new <{\var:T, body}> <{\var:T, newbody}>
  | s_fn_call: forall fn newfn arg newarg,
      substi old new fn newfn
      -> substi old new arg newarg
      -> substi old new <{fn arg}> <{newfn newarg}>
  | s_if: forall test tbody fbody newtest newtbody newfbody,
      substi old new test newtest
      -> substi old new tbody newtbody
      -> substi old new fbody newfbody
      -> substi old new
        <{if test then tbody else fbody}>
        <{if newtest then newtbody else newfbody}>
.
Hint Constructors substi: core.

(*Theorem substi_correct: forall old new before after,
  <{ [old:=new]before }> = after <-> substi old new before after.
Proof.
  intros. split; generalize after.
  induction before; if_crush.
  induction 1; if_crush.
Qed.*)


Reserved Notation "t '-->' t'" (at level 40).
Inductive step: tm -> tm -> Prop :=
  | ST_AppAbs: forall x T2 t1 v2,
      value v2
      -> <{(\x:T2, t1) v2}> --> <{ [x:=v2]t1 }>
  | ST_App1: forall t1 t1' t2,
      t1 --> t1' ->
      <{t1 t2}> --> <{t1' t2}>
  | ST_App2: forall v1 t2 t2',
      value v1
      -> t2 --> t2'
      -> <{ v1 t2}> --> <{v1 t2'}>
  | ST_IfTrue: forall t1 t2,
      <{if true then t1 else t2}> --> t1
  | ST_IfFalse: forall t1 t2,
      <{if false then t1 else t2}> --> t2
  | ST_If: forall t1 t1' t2 t3,
      t1 --> t1'
      -> <{ if t1 then t2 else t3}> --> <{if t1' then t2 else t3}>

where "t '-->' t'" := (step t t').

Definition relation (X: Type) := X -> X -> Prop.
Inductive multi {X: Type} (R: relation X): relation X :=
  | multi_refl: forall (x: X), multi R x x
  | multi_step: forall (x y z: X),
      R x y
      -> multi R y z
      -> multi R x z.

Hint Constructors step: core.
Notation multistep := (multi step).
Notation "t1 '-->*' t2" := (multistep t1 t2) (at level 40).

Tactic Notation "print_goal" :=
  match goal with |- ?x => idtac x end.
Tactic Notation "normalize" :=
  repeat (
    print_goal; eapply multi_step;
    [ (eauto 10; fail) | (instantiate; simpl)]
  );
  apply multi_refl.

Lemma step_example1':
  <{idBB idB}> -->* idB.
Proof. normalize. Qed.

Definition context := partial_map ty.

Inductive typed: context -> tm -> ty -> Prop :=
  | T_True: forall ctx, typed ctx <{true}> <{Bool}>
  | T_False: forall ctx, typed ctx <{false}> <{Bool}>
  | T_Var: forall ctx varname T,
      ctx varname = Some T ->
      typed ctx varname T
  | T_Abs: forall ctx var Tvar body Tbody,
      typed (update ctx var Tvar) body Tbody ->
      typed ctx <{\var:Tvar, body}> <{Tvar -> Tbody}>
  | T_App: forall ctx fn arg domain range,
      typed ctx fn <{domain -> range}> ->
      typed ctx arg domain ->
      typed ctx <{fn arg}> range
  | T_If: forall test tbody fbody T ctx,
       typed ctx test <{Bool}> ->
       typed ctx tbody T ->
       typed ctx fbody T ->
       typed ctx <{if test then tbody else fbody}> T
.
Hint Constructors typed: core.

Example typing_example_1:
  typed empty <{\x:Bool, x}> <{Bool -> Bool}>.
Proof. auto. Qed.


Fixpoint types_equal (T1 T2: ty): {T1 = T2} + {T1 <> T2}.
  decide equality.
Defined.


Notation "x <- e1 -- e2" := (match e1 with | Some x => e2 | None => None end)
  (right associativity, at level 60).

Fixpoint type_check (ctx: context) (t: tm): option ty :=
  match t with
  | <{true}> => Some <{ Bool }>
  | <{false}> => Some <{ Bool }>
  | tm_var varname => ctx varname
  | <{\var:Tvar, body}> =>
      Tbody <- type_check (update ctx var Tvar) body --
      Some <{Tvar -> Tbody}>
  | <{fn arg}> =>
      Tfn <- type_check ctx fn --
      Targ <- type_check ctx arg --
      match Tfn with
      | <{Tdomain -> Trange}> =>
          if types_equal Tdomain Targ then Some Trange else None
      | _ => None
      end
  | <{if test then tbody else fbody}> =>
      Ttest <- type_check ctx test --
      Ttbody <- type_check ctx tbody --
      Tfbody <- type_check ctx fbody --
      match Ttest with
      | <{ Bool }> =>
          if types_equal Ttbody Tfbody then Some Ttbody else None
      | _ => None
      end
  end.
Hint Unfold type_check.

Ltac solve_by_inverts n :=
  match goal with | H : ?T |- _ =>
  match type of T with Prop =>
    solve [
      inversion H;
      match n with S (S (?n')) => subst; solve_by_inverts (S n') end ]
  end end.

Ltac solve_by_invert :=
  solve_by_inverts 1.

Ltac if_crush :=
  crush; repeat match goal with
    | [ |- context[if ?X then _ else _] ] => destruct X
  end; crush.

Theorem type_checking_complete: forall ctx t T,
  typed ctx t T -> type_check ctx t = Some T.
Proof.
  intros. induction H; if_crush.
Qed.
Hint Resolve type_checking_complete: core.

Theorem type_checking_sound: forall ctx t T,
  type_check ctx t = Some T -> typed ctx t T.
Proof.
  intros ctx t. generalize dependent ctx.
  induction t; intros ctx T; inversion 1; crush.
  - rename t1 into fn, t2 into arg.
    remember (type_check ctx fn) as Fnchk.
    destruct Fnchk as [TFn|]; try solve_by_invert;
    destruct TFn as [|Tdomain Trange]; try solve_by_invert;
    remember (type_check ctx arg) as Argchk;
    destruct Argchk as [TArg|]; try solve_by_invert.
    destruct (types_equal Tdomain TArg) eqn: Hd; crush.
    apply T_App.
  (* the remaining cases of this soundness proof were never finished *)
Admitted.
Hint Resolve type_checking_sound.


Theorem type_checking_correct: forall ctx t T,
  type_check ctx t = Some T <-> typed ctx t T.
Proof. crush. Qed.

```





You should probably write out this whole (almost) blog post informally before you really dig into the formal stuff. This is just such a huge undertaking, first understanding what you even precisely want to accomplish is a good idea.

Think of it like writing the documentation before you write the code! You do that all the time since it helps clarify what's special and useful about the code, and what features it needs to have.












So I guess this whole project has a few beliefs:

- We can and should bring formally verified programming with dependent types to the mainstream.
- We can and should make a bedrock language with a dependent type system that is defined in the smallest and most primitive constructs of machine computation, because all the code we actually write is intended for such systems.
- We should design some set of "known" combinators that let someone write a compiler in bedrock translating the terms of some language into bedrock, so that arbitrarily convenient and powerful languages can be implemented from these bedrock building blocks. By doing so we can have all languages be truly safe and also truly interoperable. Formalizing and implementing the algorithms for a type system in bedrock allows you to prove that all of your derived forms are valid in bedrock! Dependent types and the ability to prove arbitrary statements are *most* powerful at this lowest level of abstraction: they let us build literally any language construct we can imagine, since the derived types people build bottom out in bytes and propositions, the most flexible constructs for machine computation.








So far you've considered "generics" as something that exists in the "computable" set of terms, but that's not really correct
a generic function is actually two function calls: the first is a call to a "known" function that takes some function containing type variables and a type substitution mapping those type variables to concrete types (or to other type variables! which allows partially applying generics). there should probably be two such functions, at least for now: one that expects all type variables to be resolved and returns a concrete function, and one that allows partial application and returns a known function. both of these functions can resolve to either their intended type or a compilation error term


so you should probably have these inductives: concrete types (which include the types that encode type variables in a "computable" way. there's some thinking to do here, but I think this means that you can pass any concrete term to a known function as long as it meets some "known" criteria, which for functions is assumed but for other values simply means they have to be constants) and concrete terms (basically just the base lambda calculus stuff), then known types and known terms (which are the "inductive" step, since they can take both concrete things as well as other knowns, creating the unbounded but finite dag of compilation)

all of this means that bedrock itself won't actually have "primitive syntactic" generics like other languages do, but syntactic generics will of course be possible by means of translation in any theoretical derived language.
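
A quick hypothetical Coq sketch of that idea (all names invented here): applying a type substitution leaves unresolved variables in place, which is what partial application of a generic amounts to, and "concrete" is the check that a fully-applied generic must pass:

```coq
Inductive typ: Type :=
  | TVar (n: nat)       (* a type variable *)
  | TBool
  | TArrow (a b: typ).

(* apply a type substitution; unresolved variables survive *)
Fixpoint resolve (s: nat -> option typ) (t: typ): typ :=
  match t with
  | TVar n => match s n with Some t' => t' | None => TVar n end
  | TBool => TBool
  | TArrow a b => TArrow (resolve s a) (resolve s b)
  end.

(* fully concrete: every type variable resolved *)
Fixpoint concrete (t: typ): bool :=
  match t with
  | TVar _ => false
  | TBool => true
  | TArrow a b => concrete a && concrete b
  end.

Example fully_applied:
  concrete (resolve (fun _ => Some TBool) (TArrow (TVar 0) (TVar 1))) = true.
Proof. reflexivity. Qed.

(* a partial application: only variable 0 is mapped, so the result is a
   "known" function still waiting on variable 1 *)
Example partially_applied:
  concrete (resolve (fun n => if Nat.eqb n 0 then Some TBool else None)
                    (TArrow (TVar 0) (TVar 1))) = false.
Proof. reflexivity. Qed.
```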




It is actually possible to have "dynamic" functions! By the time bedrock is done, *everything* will just be bytes, and *instructions* are just bytes! All you need in order to allow dynamic functions is to "include" the typechecker or compiler in your final "computable" binary! All we've done here is "move up" the known steps, since what is typically known and performed at compile time is still "dynamic" in the sense that actual machine computation is being performed, just like it will be at runtime! compile time is just a special case of runtime!







Known types are simply all about how we're able to produce code.

One of the first things we need is a "bedrock type". This is the actual

If we implement this as a simply typed lambda calculus, then the "ordering" of everything is taken care of?
It's also less interesting, but that's okay, at least for now.

Really this first version to validate everything is basically just a simply typed lambda calculus but where there's some kind of "known" system that allows the functions to operate on types.


You need to sit and draw out how different types relate to each other.

Then you basically do all the work he does in STLC. Define preservation and progress and all that.





First you have "computable terms". These are basically just terms that have been reduced enough that they can actually be "run", whatever that means in the context you're talking about. In a "compiled" language that means something that's been reduced enough to be output as llvm and run. In these more theoretical contexts it's just reduced down to a subset of terms that have been deemed computable.

The interesting part of the "computable term" definition is what terms it reveals as *not* being computable. These are basically all the "known" structures. Those known structures need to be reduced all the way to computable ones before they're ready to actually compute. But the *bodies* of the known structures *themselves* also need to be reduced as well! This produces a directed acyclic graph of "known" terms that need to be reduced in order all the way down to computable terms.
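
A hypothetical mini-model in Coq (names and shape are mine, just to pin the idea down): `Known` nodes are the non-computable ones, and "compilation" is the fueled pass that expands them away:

```coq
(* terms with a compile-time ("known") node *)
Inductive tm: Type :=
  | Lit (n: nat)
  | Add (a b: tm)
  | Known (f: nat -> tm) (arg: nat).  (* a compile-time computation producing a term *)

(* a term is computable only once no Known nodes remain *)
Fixpoint computable (t: tm): bool :=
  match t with
  | Lit _ => true
  | Add a b => computable a && computable b
  | Known _ _ => false
  end.

(* "compilation": expand Known nodes; fueled, since expansions can nest *)
Fixpoint expand (fuel: nat) (t: tm): tm :=
  match fuel with
  | 0 => t
  | S fuel' =>
    match t with
    | Lit n => Lit n
    | Add a b => Add (expand fuel' a) (expand fuel' b)
    | Known f arg => expand fuel' (f arg)
    end
  end.

Example knowns_reduce:
  computable (expand 2 (Add (Lit 1) (Known (fun n => Lit n) 2))) = true.
Proof. reflexivity. Qed.
```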


Does this mean that the only "types" we actually *need* are computable ones? It certainly seems that way, since we can simply say that the only thing we need to "typecheck" is a computable term that we're about to compute. Having more "advanced" higher order types is merely useful for a more ergonomic version of the language that we can do a "higher order" typecheck on before even bothering to reduce any terms. Higher order typechecks probably also play right into a full proof-capable language, one where you can prove that your higher order functions will always reduce to things that will typecheck.

For now it seems all this version needs is an initial "dag" check, if it even allows recursion that is.


Does this mean that the typing relation is something like this?

```v
Inductive ty : Type :=
  | Bool: ty
  | Arrow: ty -> ty -> ty
  | Known: ty -> ty.

(* I think this really is it! At least for formally defining it, all this "Known" type
   needs to do to work is to "reduce" in a different way. It yields an abstract description
   of the type or value or whatever rather than another term. Or rather the term it reduces
   to *is* the type.

   Is this true? I need to keep thinking. *)

Inductive tm : Type :=
  | var : string -> tm
  | call : tm -> tm -> tm
  | fn : string -> ty -> tm -> tm
  | tru : tm
  | fls : tm
  | test : tm -> tm -> tm -> tm.
```




maybe we define types not inherently, but as things that reduce from known terms?
or maybe our typechecking function and relation aren't total, we can't (and don't want to bother to) typecheck terms that haven't reduced all the way to computable terms. the typechecking function should return `option` on all terms that aren't computable







So let's say we had a language that had these types

bool: typ; obvious, computable
nat: typ; obvious, computable
arrow: typ -> typ -> typ; obvious, computable
typvalue: booltyp | nattyp | arrowtyp; hmmmm, this is computable since we need to compute based on it to progress and output something
need union (variant) and tuple and unit
known: (tm -> tm) -> typ?; not computable directly, but we can reduce it to being computable

and these terms:

tru: tm; obvious, computable
fls: tm; obvious, computable
n: nat -> tm; obvious, computable
known






While reading Types and Programming Languages, something's occurring to me.

The base "bedrock" language has to be fully strict and exact in the way it defines the calculable language, which can basically only consist of arrays of bytes and propositions on those arrays of bytes.

However once we've done that, we can build all kinds of convenient language forms and theorems about them by simply defining them as meta-functions in that bedrock language.

For example, in the strict "bedrock" sense, subtyping is basically never valid, since subtyping ignores the very concrete byte-level representation of the structures. But if we have a "meta-language" (which is just a "compiler" that is itself a bedrock program taking the terms of the meta-language and computing them down to bedrock), then we can allow subtyping simply by saying that whenever we encounter an action involving a subtype, we compile that action into the actually valid byte-level action that satisfies the propositions of bedrock. In this way we have a *provably correct* desugaring process.


================================================
FILE: notes/pony-reference-capabilities.md
================================================
http://jtfmumm.com/blog/2016/03/06/safely-sharing-data-pony-reference-capabilities/


`iso`: writeable/readable, only one reference exists (this one). can be used to read or write locally. can be converted to anything, including giving it up to pass to another actor
`val`: readable, only immutable aliases exist, so can be shared for reading with anyone.
`tag`: neither, the address of an actor, can be shared anywhere, but can't be read or written.
`ref`: writeable/readable but only locally, an unknown number of mutable local aliases exist, so this is just like a typical alias. since we don't know how many aliases exist, we can only possibly share this thing if we somehow destroy those other aliases.
`trn`: a local reference we can write/read, but can only create readable references from. this allows us to eventually convert this type to a `val`.
`box`: readable locally, we don't know how many other people are looking at this thing

the subtyping (or "can be substituted for") relation

```
               --> ref --
              /          \
iso --> trn --            --> box --> tag
              \          /
               --> val --
```


1) A mutable reference capability denies neither read nor write permissions. This category includes `iso`, `ref`, and `trn`.

2) An immutable reference capability denies write permissions but not read permissions. This category includes `val` and `box`.

3) An opaque reference capability denies both read and write permissions. The only example is `tag`.



https://tutorial.ponylang.io/reference-capabilities/reference-capabilities.html#isolated-data-may-be-complex

```
Isolated data may be complex
An isolated piece of data may be a single byte. But it can also be a large data structure with multiple references between the various objects in that structure. What matters for the data to be isolated is that there is only a single reference to that structure as a whole. We talk about the isolation boundary of a data structure. For the structure to be isolated:

There must only be a single reference outside the boundary that points to an object inside.
There can be any number of references inside the boundary, but none of them must point to an object outside.



Isolated, written iso. This is for references to isolated data structures. If you have an iso variable then you know that there are no other variables that can access that data. So you can change it however you like and give it to another actor.

Value, written val. This is for references to immutable data structures. If you have a val variable then you know that no-one can change the data. So you can read it and share it with other actors.

Reference, written ref. This is for references to mutable data structures that are not isolated, in other words, “normal” data. If you have a ref variable then you can read and write the data however you like and you can have multiple variables that can access the same data. But you can’t share it with other actors.

Box. This is for references to data that is read-only to you. That data might be immutable and shared with other actors or there may be other variables using it in your actor that can change the data. Either way, the box variable can be used to safely read the data. This may sound a little pointless, but it allows you to write code that can work for both val and ref variables, as long as it doesn’t write to the object.

Transition, written trn. This is used for data structures that you want to write to, while also holding read-only (box) variables for them. You can also convert the trn variable to a val variable later if you wish, which stops anyone from changing the data and allows it to be shared with other actors.

Tag. This is for references used only for identification. You cannot read or write data using a tag variable. But you can store and compare tags to check object identity and share tag variables with other actors.

Note that if you have a variable referring to an actor then you can send messages to that actor regardless of what reference capability that variable has.
```



so reference capabilities have these qualities:
readable/writeable *to you*
readable/writeable *to others*
writeable *locally*
shareable *locally*
shareable *globally*


https://tutorial.ponylang.io/reference-capabilities/guarantees.html


`iso`: others/local read/write unique
`trn`: others/local write unique, others read unique
`ref`: others read/write unique

`val`: others/local immutable
`box`: others immutable

`tag`: opaque

|                       | Deny global read/write   | Deny global write | None denied      |
|-----------------------|--------------------------|-------------------|------------------|
| Deny local read/write | `iso` (sendable)         |                   |                  |
| Deny local write      | `trn`                    | `val` (sendable)  |                  |
| None denied           | `ref`                    | `box`             | `tag` (sendable) |
|                       | (Mutable)                | (Immutable)       | (Opaque)         |

Sendable capabilities. If we want to send references to a different actor, we must make sure that the global and local aliases make the same guarantees. It’d be unsafe to send a trn to another actor, since we could possibly hold box references locally. Only iso, val, and tag have the same global and local restrictions – all of which are in the main diagonal of the matrix.
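
A little Coq sketch of that deny matrix (my own encoding, nothing from Pony itself): each capability is a pair of local and global deny levels, and sendable is exactly "local deny = global deny", the main diagonal:

```coq
Require Import List.
Import ListNotations.

Inductive deny := DenyRW | DenyW | DenyNone.
Inductive cap := Iso | Trn | Ref | Val | Box | Tag.

(* rows of the matrix: what the capability denies to local aliases *)
Definition local_deny (c: cap): deny :=
  match c with
  | Iso => DenyRW
  | Trn | Val => DenyW
  | Ref | Box | Tag => DenyNone
  end.

(* columns of the matrix: what it denies to global (other-actor) aliases *)
Definition global_deny (c: cap): deny :=
  match c with
  | Iso | Trn | Ref => DenyRW
  | Val | Box => DenyW
  | Tag => DenyNone
  end.

Definition deny_eqb (a b: deny): bool :=
  match a, b with
  | DenyRW, DenyRW | DenyW, DenyW | DenyNone, DenyNone => true
  | _, _ => false
  end.

Definition sendable (c: cap): bool := deny_eqb (local_deny c) (global_deny c).

(* exactly iso, val and tag are sendable *)
Example sendables:
  map sendable [Iso; Trn; Ref; Val; Box; Tag]
  = [true; false; false; true; false; true].
Proof. reflexivity. Qed.
```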


================================================
FILE: notes/tarjan/README.md
================================================
Tarjan and Kosaraju
-------------------

# Main files

## Proofs of Tarjan strongly connected component algorithm (independent from each other)
* `tarjan_rank.v` *(751 sloc)*: proof with rank
* `tarjan_rank_bigmin.v` *(806 sloc)*: same proof but with a `\min_` instead of multiple inequalities on the output rank
* `tarjan_num.v` *(1029 sloc)*: same proof as `tarjan_rank_bigmin.v` but with serial numbers instead of ranks
* `tarjan_nocolor.v` *(548 sloc)*: new proof, with ranks and without colors, fewer fields in the environment and fewer invariants, preconditions and postconditions.
* `tarjan_nocolor_optim.v` *(560 sloc)*: same proof as `tarjan_nocolor.v`, but with the serial number field of the environment restored, and passing around stack extensions as sets.

## Proof of Kosaraju strongly connected component algorithm
* `Kosaraju.v` *(679 sloc)*: proof of Kosaraju connected component algorithm

## Extra library files
* `bigmin.v` *(137 sloc)*: extra library to deal with \min(i in A) F i
* `extra.v` *(265 sloc)*: naive definitions of strongly connected components and various basic extensions of mathcomp libraries on paths and fintypes.

# Authors:

Cyril Cohen, Jean-Jacques Lévy and Laurent Théry


================================================
FILE: notes/tarjan/_CoqProject
================================================
-R . mathcomp.tarjan
-arg -w -arg -notation-overridden

tarjan_nocolors.v
extra_nocolors.v



================================================
FILE: notes/tarjan/extra_nocolors.v
================================================
From mathcomp Require Import all_ssreflect.

Set Implicit Arguments.
Unset Strict Implicit.
Unset Printing Implicit Defensive.

Lemma ord_minn_le n (i j : 'I_n) : minn i j < n.
Proof. by rewrite gtn_min ltn_ord. Qed.
Definition ord_minn {n} (i j : 'I_n) := Ordinal (ord_minn_le i j).

Section ord_min.
Variable (n : nat).
Notation T := (ord_max : 'I_n.+1).
Notation min := (@ord_minn n.+1).

Lemma minTo : left_id T min.
Proof. by move=> i; apply/val_inj; rewrite /= (minn_idPr _) ?leq_ord. Qed.

Lemma minoT : right_id T min.
Proof. by move=> i; apply/val_inj; rewrite /= (minn_idPl _) ?leq_ord. Qed.

Lemma minoA : associative min.
Proof. by move=> ???; apply/val_inj/minnA. Qed.

Lemma minoC : commutative min.
Proof. by move=> ??; apply/val_inj/minnC. Qed.

Canonical ord_minn_monoid := Monoid.Law minoA minTo minoT.
Canonical ord_minn_comoid := Monoid.ComLaw minoC.

End ord_min.

Notation "\min_ ( i | P ) F" := (\big[ord_minn/ord_max]_(i | P%B) F%N)
  (at level 41, F at level 41, i at level 50,
   format "'[' \min_ ( i  |  P ) '/  '  F ']'") : nat_scope.
Notation "\min_ i F" := (\big[ord_minn/ord_max]_i F%N) 
  (at level 41, F at level 41, i at level 0,
   format "'[' \min_ i '/  '  F ']'") : nat_scope.
Notation "\min_ ( i 'in' A | P ) F" :=
 (\big[ord_minn/ord_max]_(i in A | P%B) F%N)
  (at level 41, F at level 41, i, A at level 50,
   format "'[' \min_ ( i  'in'  A  |  P ) '/  '  F ']'") : nat_scope.
Notation "\min_ ( i 'in' A ) F" :=
 (\big[ord_minn/ord_max]_(i in A) F%N)
  (at level 41, F at level 41, i, A at level 50,
   format "'[' \min_ ( i  'in'  A ) '/  '  F ']'") : nat_scope.

Section extra_bigmin.

Variables (n : nat) (I : finType).
Implicit Type (F : I -> 'I_n.+1).

Lemma geq_bigmin_cond (P : pred I) F i0 :
  P i0 -> F i0 >= \min_(i | P i) F i.
Proof. by move=> Pi0; rewrite (bigD1 i0) //= geq_minl. Qed.
Arguments geq_bigmin_cond [P F].

Lemma geq_bigmin F (i0 : I) : F i0 >= \min_i F i.
Proof. exact: geq_bigmin_cond. Qed.

Lemma bigmin_geqP (P : pred I) (m : 'I_n.+1) F :
  reflect (forall i, P i -> F i >= m) (\min_(i | P i) F i >= m).
Proof.
apply: (iffP idP) => leFm => [i Pi|].
  by apply: leq_trans leFm _; apply: geq_bigmin_cond.
by elim/big_ind: _; rewrite ?leq_ord // => m1 m2; rewrite leq_min => ->.
Qed.

Lemma bigmin_inf i0 (P : pred I) (m : 'I_n.+1) F :
  P i0 -> m >= F i0 -> m >= \min_(i | P i) F i.
Proof.
by move=> Pi0 le_m_Fi0; apply: leq_trans (geq_bigmin_cond i0 Pi0) _.
Qed.

Lemma bigmin_eq_arg i0 (P : pred I) F :
  P i0 -> \min_(i | P i) F i = F [arg min_(i < i0 | P i) F i].
Proof.
move=> Pi0; case: arg_minP => //= i Pi minFi.
by apply/val_inj/eqP; rewrite eqn_leq geq_bigmin_cond //=; apply/bigmin_geqP.
Qed.

Lemma eq_bigmin_cond (A : pred I) F :
  #|A| > 0 -> {i0 | i0 \in A & \min_(i in A) F i = F i0}.
Proof.
case: (pickP A) => [i0 Ai0 _ | ]; last by move/eq_card0->.
by exists [arg min_(i < i0 in A) F i]; [case: arg_minP | apply: bigmin_eq_arg].
Qed.

Lemma eq_bigmin F : #|I| > 0 -> {i0 : I | \min_i F i = F i0}.
Proof. by case/(eq_bigmin_cond F) => x _ ->; exists x. Qed.

Lemma bigmin_setU (A B : {set I}) F :
  \min_(i in (A :|: B)) F i =
  ord_minn (\min_(i in A) F i) (\min_(i in B) F i).
Proof.
have d : [disjoint A :\: B & B] by rewrite -setI_eq0 setIDAC setDIl setDv setI0.
rewrite (eq_bigl [predU (A :\: B) & B]) ?bigU//=; last first.
  by move=> y; rewrite !inE; case: (_ \in _) (_ \in _) => [] [].
symmetry; rewrite (big_setID B) /= [X in ord_minn X _]minoC -minoA.
congr (ord_minn _ _); apply: val_inj; rewrite /= (minn_idPr _)//.
by apply/bigmin_geqP=> i; rewrite inE => /andP[iA iB]; rewrite (bigmin_inf iB).
Qed.

End extra_bigmin.

Arguments geq_bigmin_cond [n I P F].
Arguments geq_bigmin [n I F].
Arguments bigmin_geqP [n I P m F].
Arguments bigmin_inf [n I] i0 [P m F].
Arguments bigmin_eq_arg [n I] i0 [P F].

Section extra_fintype.

Variable V : finType.

Definition relto (a : pred V) (g : rel V) := [rel x y | (y \in a) && g x y].
Definition relfrom (a : pred V) (g : rel V) := [rel x y | (x \in a) && g x y].

Lemma connect_rev (g : rel V) :
  connect g =2 (fun x => connect (fun x => g^~ x) ^~ x).
Proof.
move=> x y; apply/connectP/connectP=> [] [p gp ->];
[exists (rev (belast x p))|exists (rev (belast y p))]; rewrite ?rev_path //;
by case: (lastP p) => //= ??; rewrite belast_rcons rev_cons last_rcons.
Qed.

Lemma path_to a g z p : path (relto a g) z p = (path g z p) && (all a p).
Proof.
apply/(pathP z)/idP => [fgi|/andP[/pathP gi] /allP ga]; last first.
  by move=> i i_lt /=; rewrite gi ?andbT ?[_ \in _]ga // mem_nth.
rewrite (appP (pathP z) idP) //=; last by move=> i /fgi /= /andP[_ ->].
by apply/(all_nthP z) => i /fgi /andP [].
Qed.

Lemma path_from a g z p :
  path (relfrom a g) z p = (path g z p) && (all a (belast z p)).
Proof. by rewrite -rev_path path_to all_rev rev_path. Qed.


Lemma connect_to (a : pred V) (g : rel V) x z : connect g x z ->
  exists y, [/\ (y \in a) ==> (x == y) && (x \in a),
                 connect g x y & connect (relto a g) y z].
Proof.
move=> /connectP [p gxp ->].
pose P := [pred i | let y := nth x (x :: p) i in
  [&& connect g x y & connect (relto a g) y (last x p)]].
have [] := @ex_minnP P.
  by exists (size p); rewrite /= nth_last (path_connect gxp) //= mem_last.
move=> i /= /andP[g1 g2] i_min; exists (nth x (x :: p) i); split=> //.
case: i => [|i] //= in g1 g2 i_min *; first by rewrite eqxx /= implybb.
have i_lt : i < size p.
  by rewrite i_min // !nth_last /= (path_connect gxp) //= mem_last.
have [<-/=|neq_xpi /=] := altP eqP; first by rewrite implybb.
have := i_min i; rewrite ltnn => /contraNF /(_ isT) <-; apply/implyP=> axpi.
rewrite (connect_trans _ g2) ?andbT //; last first.
  by rewrite connect1 //= [_ \in _]axpi /= (pathP x _).
by rewrite (path_connect gxp) //= mem_nth //= ltnW.
Qed.

Lemma connect_from (a : pred V) (g : rel V) x z : connect g x z ->
  exists y, [/\ (y \in a) ==> (z == y) && (z \in a),
                connect (relfrom a g) x y & connect g y z].
Proof.
rewrite connect_rev => cgxz; have [y [ayaz]]//= := connect_to a cgxz.
by exists y; split; rewrite // connect_rev.
Qed.

Lemma connect1l (g : rel V) x z :
  connect g x z -> z != x -> exists2 y, g x y & connect g y z.
Proof.
move=> /connectP [[|y p] //= xyp ->]; first by rewrite eqxx.
by move: xyp=> /andP[]; exists y => //; apply/connectP; exists p.
Qed.

Lemma connect1r (g : rel V) x z :
  connect g x z -> z != x -> exists2 y, connect g x y & g y z.
Proof.
move=> xz zNx; move: xz; rewrite connect_rev => /connect1l.
by rewrite eq_sym => /(_ zNx) [y]; exists y; rewrite // connect_rev.
Qed.

Section connected.

Variable (g : rel V).

Definition connected := forall x y, connect g x y.

Lemma cover1U (A : {set V}) P : cover (A |: P) = A :|: cover P.
Proof. by apply/setP => x; rewrite /cover bigcup_setU big_set1. Qed.

Lemma connectedU (A B : {set V}) : {in A &, connected} -> {in B &, connected} ->
  {in A & B, connected} -> {in B & A, connected} -> {in A :|: B &, connected}.
Proof.
move=> cA cB cAB cBA z t; rewrite !inE => /orP[zA|zB] /orP[tA|tB];
by[apply: cA|apply: cB|apply: cAB|apply: cBA].
Qed.

End connected.

Section Symconnect.

Variable r : rel V.

(* x is symconnected to y *)
Definition symconnect x y := connect r x y && connect r y x.

Lemma symconnect0 : reflexive symconnect.
Proof. by move=> x; apply/andP. Qed.

Lemma symconnect_sym : symmetric symconnect.
Proof. by move=> x y; apply/andP/andP=> [] []. Qed.

Lemma symconnect_trans : transitive symconnect.
Proof.
move=> x y z /andP[Cyx Cxy] /andP[Cxz Czx].
by rewrite /symconnect (connect_trans Cyx) ?(connect_trans Czx).
Qed.
Hint Resolve symconnect0 symconnect_sym symconnect_trans.

Lemma symconnect_equiv : equivalence_rel symconnect.
Proof. by apply/equivalence_relP; split; last apply/sym_left_transitive. Qed.

(*************************************************)
(* Connected components of the graph, abstractly *)
(*************************************************)

Definition sccs := equivalence_partition symconnect setT.

Lemma sccs_partition : partition sccs setT.
Proof. by apply: equivalence_partitionP => ?*; apply: symconnect_equiv. Qed.

Definition cover_sccs := cover_partition sccs_partition.

Lemma trivIset_sccs : trivIset sccs.
Proof. by case/and3P: sccs_partition. Qed.
Hint Resolve trivIset_sccs.

Notation scc_of := (pblock sccs).

Lemma mem_scc x y : x \in scc_of y = symconnect y x.
Proof.
by rewrite pblock_equivalence_partition // => ?*; apply: symconnect_equiv.
Qed.

Definition def_scc scc x := @def_pblock _ _ scc x trivIset_sccs.

Definition is_subscc (A : {set V}) := A != set0 /\
                                      {in A &, forall x y, connect r x y}.

Lemma is_subscc_in_scc (A : {set V}) :
  is_subscc A -> exists2 scc, scc \in sccs & A \subset scc.
Proof.
move=> []; have [->|[x xA]] := set_0Vmem A; first by rewrite eqxx.
move=> AN0 A_sub; exists (scc_of x); first by rewrite pblock_mem ?cover_sccs.
by apply/subsetP => y yA; rewrite mem_scc /symconnect !A_sub.
Qed.

Lemma is_subscc1 x (A : {set V}) : x \in A ->
  (forall y, y \in A -> connect r x y /\ connect r y x) -> is_subscc A.
Proof.
move=> xA AP; split; first by apply: contraTneq xA => ->; rewrite inE.
by move=> y z /AP [xy yx] /AP [xz zx]; rewrite (connect_trans yx).
Qed.

End Symconnect.

Lemma setUD (B A C : {set V}) : B \subset A -> C \subset B -> 
  (A :\: B) :|: (B :\: C) = (A :\: C).
Proof.
move=> subBA subCB; apply/setP=> x; rewrite !inE.
have /implyP  := subsetP subBA x; have /implyP  := subsetP subCB x.
by do !case: (_ \in _).
Qed.

Lemma setUDl (T : finType) (A B : {set T}) : A :|: B :\: A = A :|: B.
Proof. by apply/setP=> x; rewrite !inE; do !case: (_ \in _). Qed.

Lemma subset_cover (sccs sccs' : {set {set V}}) :
  sccs \subset sccs' -> cover sccs \subset cover sccs'.
Proof.
move=> /subsetP subsccs; apply/subsetP=> x /bigcupP [scc /subsccs].
by move=> scc' x_in; apply/bigcupP; exists scc.
Qed.

Lemma disjoint1s (A : pred V) (x : V) : [disjoint [set x] & A] = (x \notin A).
Proof.
apply/pred0P/idP=> [/(_ x)/=|]; first by rewrite inE eqxx /= => ->.
by move=> xNA y; rewrite !inE; case: eqP => //= ->; apply/negbTE.
Qed.

Lemma disjoints1 (A : pred V) (x : V) : [disjoint A & [set x]] = (x \notin A).
Proof. by rewrite disjoint_sym disjoint1s. Qed.

End extra_fintype.


================================================
FILE: notes/tarjan/tarjan_nocolors.v
================================================
From mathcomp Require Import all_ssreflect.
Require Import extra_nocolors.

Set Implicit Arguments.
Unset Strict Implicit.
Unset Printing Implicit Defensive.

Section tarjan.

Variable (V : finType) (successor_seq : V -> seq V).
Notation successors x := [set y in successor_seq x].
Notation infty := #|V|.

(*************************************************************)
(*               Tarjan 72 algorithm,                        *)
(* rewritten in a functional style  with extra modifications *)
(*************************************************************)

Record env := Env {esccs : {set {set V}}; num: {ffun V -> nat}}.

Definition visited e := [set x | num e x <= infty].
Notation sn e := #|visited e|.
Definition stack e := [set x | num e x < sn e].

Definition visit x e :=
  Env (esccs e) (finfun [eta num e with x |-> sn e]).
Definition store scc e :=
  Env (scc |: esccs e) [ffun x => if x \in scc then infty else num e x].

Definition dfs1 dfs x e :=
    let: (n1, e1) as res := dfs (successors x) (visit x e) in
    if n1 < sn e then res else (infty, store (stack e1 :\: stack e) e1).

Definition dfs dfs1 dfs (roots : {set V}) e :=
  if [pick x in roots] isn't Some x then (infty, e) else
  let: (n1, e1) := if num e x <= infty then (num e x, e) else dfs1 x e in
  let: (n2, e2) := dfs (roots :\ x) e1 in (minn n1 n2, e2).

Fixpoint rec k r e :=
  if k is k.+1 then dfs (dfs1 (rec k)) (rec k) r e
  else (infty, e).

Definition e0 := (Env set0 [ffun _ => infty.+1]).
Definition tarjan := let: (_, e) := rec (infty * infty.+2) setT e0 in esccs e.

(*****************)
(* Abbreviations *)
(*****************)

Notation edge := (grel successor_seq).
Notation gconnect := (connect edge).
Notation gsymconnect := (symconnect edge).
Notation gsccs := (sccs edge).
Notation gscc_of := (pblock gsccs).
Notation gconnected := (connected edge).
Notation new_stack e1 e2 := (stack e2 :\: stack e1).
Notation new_visited e1 e2 := (visited e2 :\: visited e1).
Notation inord := (@inord infty).

(*******************)
(* next, and nexts *)
(*******************)

Section Nexts.
Variable (D : {set V}).

Definition nexts (A : {set V}) :=
  \bigcup_(v in A) [set w in connect (relfrom (mem D) edge) v].

Lemma nexts0 : nexts set0 = set0.
Proof. by rewrite /nexts big_set0. Qed.

Lemma nexts1 x :
  nexts [set x] = x |: (if x \in D then nexts (successors x) else set0).
Proof.
apply/setP=> y; rewrite /nexts big_set1 !inE.
have [->|neq_yx/=] := altP eqP; first by rewrite connect0.
apply/idP/idP=> [/connect1l[]// z/=/andP[/= xD xz zy]|].
  by rewrite xD; apply/bigcupP; exists z; rewrite !inE.
case: ifPn; rewrite ?inE// => xD /bigcupP[z]; rewrite !inE.
by move=> xz; apply/connect_trans/connect1; rewrite /= xD.
Qed.

Lemma nextsU A B : nexts (A :|: B) = nexts A :|: nexts B.
Proof. exact: bigcup_setU. Qed.

Lemma nextsS (A : {set V}) : A \subset nexts A.
Proof. by apply/subsetP=> a aA; apply/bigcupP; exists a; rewrite ?inE. Qed.

Lemma nextsT : nexts setT = setT.
Proof. by apply/eqP; rewrite eqEsubset nextsS subsetT. Qed.

Lemma nexts_id (A : {set V}) : nexts (nexts A) = nexts A.
Proof.
apply/eqP; rewrite eqEsubset nextsS andbT; apply/subsetP=> x.
move=> /bigcupP[y /bigcupP[z zA]]; rewrite !inE => /connect_trans yto /yto zx.
by apply/bigcupP; exists z; rewrite ?inE.
Qed.

Lemma in_nextsW A y : y \in nexts A -> exists2 x, x \in A & gconnect x y.
Proof.
move=>/bigcupP[x xA]; rewrite inE => xy; exists x => //.
by apply: connect_sub xy => u v /andP[_ /connect1].
Qed.

End Nexts.

Lemma sub_nexts (D D' A B : {set V}) :
  D \subset D' -> A \subset B -> nexts D A \subset nexts D' B.
Proof.
move=> /subsetP subD /subsetP subAB; apply/subsetP => v /bigcupP[a /subAB aB].
rewrite !inE => av; apply/bigcupP; exists a; rewrite ?inE //=.
by apply: connect_sub av => x y /andP[xD xy]; rewrite connect1//= subD.
Qed.

Lemma nextsUI A B C : nexts B A \subset A ->
  A :|: nexts (B :&: ~: A) C = A :|: nexts B C.
Proof.
move=> subA; apply/setP=> y; rewrite !inE; have [//|/= yNA] := boolP (y \in A).
apply/idP/idP; first by apply: subsetP; rewrite sub_nexts// subsetIl.
move=> /bigcupP[z zr zy]; apply/bigcupP; exists z; first by [].
rewrite !inE; apply: contraTT isT => Nzy; move: zy; rewrite !inE.
move=> /(connect_from (mem (~: A))) /= [t].
rewrite !inE => -[xtxy zt ty]; move: zt.
rewrite (@eq_connect _ _ (relfrom (mem (B :&: ~: A)) edge)); last first.
  by move=> u v /=; rewrite !inE andbCA andbA.
case: (altP eqP) xtxy => /= [<-|neq_yt]; first by rewrite (negPf Nzy).
rewrite implybF negbK => tA zt; rewrite -(negPf yNA) (subsetP subA)//.
by apply/bigcupP; exists t; rewrite // inE.
Qed.

Lemma nexts1_split (A : {set V}) x : x \in A ->
  nexts A [set x] = x |: nexts (A :\ x) (successors x).
Proof.
move=> xA; apply/setP=> y; apply/idP/idP; last first.
  rewrite nexts1 !inE xA; case: (_ == _); rewrite //=.
  by apply: subsetP; rewrite sub_nexts// subsetDl.
move=> /bigcupP[z]; rewrite !inE => /eqP {z}->.
move=> /connectP[p /shortenP[[_ _ _ /eqP->//|z q/=/andP[/andP[_ xz]]]]].
rewrite path_from => /andP[zq] /allP/= qA.
move=> /and3P[xNzq _ _] _ ->; apply/orP; right.
apply/bigcupP; exists z; rewrite !inE//.
apply/connectP; exists q; rewrite // path_from zq/=.
apply/allP=> t tq; rewrite !inE qA ?andbT//.
by apply: contraNneq xNzq=> <-; apply: mem_belast tq.
Qed.

(*******************)
(* Well formed env *)
(*******************)

Lemma num_le_infty e x : num e x <= infty = (x \in visited e).
Proof. by rewrite inE. Qed.

Lemma num_lt_sn e x : num e x < sn e = (x \in stack e).
Proof. by rewrite inE. Qed.

Lemma visited_visit e x : visited (visit x e) = x |: visited e.
Proof.
by apply/setP=> y; rewrite !inE ffunE/=; case: (altP eqP); rewrite ?max_card.
Qed.

Lemma sub_stack_visited e : stack e \subset visited e.
Proof.
by apply/subsetP => x; rewrite !inE => /ltnW /leq_trans ->//; rewrite max_card.
Qed.

Lemma sub_new_stack_visited e1 e2: new_stack e1 e2 \subset visited e2.
Proof. by rewrite (subset_trans _ (sub_stack_visited _)) ?subsetDl. Qed.

Section wfenv.

Record wf_env e := WfEnv {
  sub_gsccs : esccs e \subset gsccs;
  num_lt_V_is_stack : forall x, num e x < infty -> num e x < sn e;
  num_sccs : forall x, (num e x == infty) = (x \in cover (esccs e));
  le_connect : forall x y, num e x <= num e y < sn e -> gconnect x y;
}.

Variables (e : env) (e_wf : wf_env e).

Lemma num_gt_V x : x \notin visited e -> num e x > infty.
Proof. by rewrite inE -ltnNge. Qed.

Lemma num_lt_V x : (num e x < infty) = (num e x < sn e).
Proof.
apply/idP/idP => [/num_lt_V_is_stack//|]; first exact.
by move=> /leq_trans; apply; rewrite max_card.
Qed.

Lemma num_lt_card x (A : pred V) : visited e \subset A ->
  (num e x < #|A|) = (num e x < sn e).
Proof.
move=> subeA; apply/idP/idP => /leq_trans.
  by rewrite -num_lt_V; apply; rewrite max_card.
by apply; rewrite subset_leq_card.
Qed.

Lemma visitedE : visited e = stack e :|: cover (esccs e).
Proof. by apply/setP=> x; rewrite !inE leq_eqVlt -num_sccs// num_lt_V orbC. Qed.

Lemma sub_sccs_visited : cover (esccs e) \subset visited e.
Proof. by apply/subsetP => x; rewrite !inE -num_sccs// => /eqP->. Qed.

Lemma stack_visit x : x \notin visited e -> stack (visit x e) = x |: stack e.
Proof.
move=> xNvisited; apply/setP=> y; rewrite !inE/= ffunE/= visited_visit.
have [->|neq_yx]//= := altP eqP; first by rewrite cardsU1 xNvisited ltnS ?leqnn.
by rewrite num_lt_card// subsetUr.
Qed.

End wfenv.

Lemma wf_visit e x : wf_env e ->
   (forall y, num e y < sn e -> gconnect y x) ->
   x \notin visited e -> wf_env (visit x e).
Proof.
move=> e_wf x_connected xNvisited.
constructor=> [|y|y|] //=; rewrite ?inE ?ffunE/=.
- exact: sub_gsccs.
- rewrite visited_visit cardsU1 xNvisited; case: ifPn => // _.
  by rewrite num_lt_V// ltnS => /ltnW.
- have [->|] := altP (y =P x); last by rewrite num_sccs.
  rewrite -num_sccs// eq_sym !gtn_eqF ?num_gt_V//.
  by rewrite (@leq_trans #|x |: visited e|) ?max_card// cardsU1 xNvisited.
move=> y z; rewrite !ffunE/=.
have sub_visit : visited e \subset visited (visit x e).
  by apply/subsetP => ?; rewrite visited_visit !inE orbC => ->.
have [{y}->|neq_yx] := altP eqP; have [{z}->|neq_zx]//= := altP eqP.
+ by rewrite num_lt_card//; case: ltngtP.
+ move=> /andP[/leq_ltn_trans lt/lt].
  by rewrite num_lt_card//; apply: x_connected.
+ by rewrite num_lt_card//; apply: le_connect.
Qed.

Definition subenv e1 e2 := [&&
  esccs e1 \subset esccs e2,
  [forall x, (num e1 x <= infty) ==> (num e2 x == num e1 x)] &
  [forall x, (num e2 x < sn e1) ==> (num e1 x < sn e1)]].

Lemma sub_sccs e1 e2 : subenv e1 e2 -> esccs e1 \subset esccs e2.
Proof. by move=> /and3P[]. Qed.

Lemma sub_snum e1 e2 : subenv e1 e2 -> forall x, num e1 x <= infty ->
  num e2 x = num e1 x.
Proof. by move=> /and3P[_ /forall_inP /(_ _ _) /eqP]. Qed.

Lemma sub_vnum e1 e2 : subenv e1 e2 -> forall x, num e1 x < sn e1 ->
  num e2 x = num e1 x.
Proof.
move=> sube12 x /ltnW num_lt; rewrite (sub_snum sube12)//.
by rewrite (leq_trans num_lt)// max_card.
Qed.

Lemma sub_num_lt e1 e2 : subenv e1 e2 ->
  forall x, (num e1 x < sn e1) = (num e2 x < sn e1).
Proof.
move=> /and3P[_ /forall_inP /(_ _ _)/eqP num_eq /forall_inP] num_lt x.
have nume1_lt := num_lt x; apply/idP/idP => // {nume1_lt}nume1_lt.
by rewrite num_eq ?inE// (leq_trans (ltnW nume1_lt))//  max_card.
Qed.

Lemma sub_visited e1 e2 : subenv e1 e2 -> visited e1 \subset visited e2.
Proof.
move=> sube12; apply/subsetP=> x; rewrite !inE => x_visited1.
by rewrite (sub_snum sube12)// inE.
Qed.

Lemma leq_sn e1 e2 : subenv e1 e2 -> sn e1 <= sn e2.
Proof. by move=> sube12; rewrite subset_leq_card// sub_visited. Qed.

Lemma sub_stack e1 e2 : subenv e1 e2 -> stack e1 \subset stack e2.
Proof.
move=> sube12; apply/subsetP=> x; rewrite !inE => x_stack.
by rewrite (sub_vnum sube12)// (leq_trans x_stack)// leq_sn.
Qed.

Lemma new_stackE e1 e2 : subenv e1 e2 ->
  new_stack e1 e2 = [set x | sn e1 <= num e2 x < sn e2].
Proof.
move=> sube12; apply/setP=> x; rewrite !inE.
have [x_e2|] := ltnP (num e2 x) (sn e2); rewrite ?andbT ?andbF//.
have [e1_after|e1_before] /= := leqP (sn e1) (num e1 x).
  by rewrite leqNgt -sub_num_lt// -leqNgt.
by rewrite leqNgt -sub_num_lt// e1_before.
Qed.

Lemma new_visitedE e1 e2 : wf_env e1 -> wf_env e2 -> subenv e1 e2 ->
  (new_visited e1 e2) =
    (new_stack e1 e2) :|: cover (esccs e2) :\: cover (esccs e1).
Proof.
move=> e1_wf e2_wf sube12; rewrite !visitedE//; apply/setP=> x.
rewrite !inE -!num_sccs -?num_lt_V//; do 2!case: ltngtP => //=.
  by rewrite num_lt_V// (sub_num_lt sube12)// => ->; rewrite ltnNge max_card.
by move=> xe2 xe1; move: xe2; rewrite (sub_snum sube12)// ?xe1// ltnn.
Qed.

Lemma sub_new_stack_new_visited e1 e2 :
    subenv e1 e2 -> wf_env e1 -> wf_env e2 ->
  (new_stack e1 e2) \subset (new_visited e1 e2).
Proof.
by move=> e1wf e2wf sube12; rewrite (@new_visitedE e1 e2)// subsetUl.
Qed.

Lemma sub_refl e : subenv e e.
Proof. by rewrite /subenv !subxx /=; apply/andP; split; apply/forall_inP. Qed.
Hint Resolve sub_refl.

Lemma sub_trans : transitive subenv.
Proof.
move=> e2 e1 e3 sub12 sub23; rewrite /subenv.
rewrite (subset_trans (sub_sccs sub12))// ?sub_sccs//=.
apply/andP; split; apply/forall_inP=> x xP.
  by rewrite (sub_snum sub23) ?(sub_snum sub12)//.
have x2 : num e3 x < sn e2 by rewrite (leq_trans xP)// leq_sn.
by rewrite (sub_num_lt sub12)// -(sub_vnum sub23)// (sub_num_lt sub23).
Qed.

Lemma sub_visit e x : x \notin visited e -> subenv e (visit x e).
Proof.
move=> xNvisited; rewrite /subenv subxx/=; apply/andP; split; last first.
  by apply/forall_inP => y; rewrite !ffunE/=; case: ifP; rewrite ?ltnn.
apply/forall_inP => y y_in; rewrite !ffunE/=.
by case: (altP (y =P x)) xNvisited => // <-; rewrite inE y_in.
Qed.

Lemma visited_store (A : {set V}) e : A \subset visited e ->
  visited (store A e) = visited e.
Proof.
move=> A_sub; apply/setP=> x; rewrite !inE/= ffunE.
by case: ifPn => // /(subsetP A_sub); rewrite inE leqnn => ->.
Qed.

Lemma stack_store (A : {set V}) e : A \subset visited e ->
  stack (store A e) = stack e :\: A.
Proof.
move=> A_sub; apply/setP => x; rewrite !inE visited_store//= ffunE.
by case: (x \in A); rewrite //= ltnNge max_card.
Qed.

(*********************)
(* DFS specification *)
(*********************)

Definition outenv (roots : {set V}) (e e' : env) := [/\
  {in new_stack e e' &, gconnected},
  {in new_stack e e', forall x, exists2 y, y \in stack e & gconnect x y} &
  visited e' = visited e :|: nexts (~: visited e) roots ].

Variant dfs_spec_def (dfs : nat * env) (roots : {set V}) e :
  (nat * env) -> nat -> env -> Type := DfsSpec ne' (n : nat) e' of
    ne' = (n, e') &
    n = \min_(x in nexts (~: visited e) roots) inord (num e' x) &
    wf_env e' & subenv e e' & outenv roots e e' :
  dfs_spec_def dfs roots e ne' n e'.
Notation dfs_spec ne' roots e := (dfs_spec_def ne' roots e ne' ne'.1 ne'.2).

Definition dfs_correct dfs (roots : {set V}) e := wf_env e ->
  {in stack e & roots, gconnected} -> dfs_spec (dfs roots e) roots e.
Definition dfs1_correct dfs1 x e := wf_env e -> x \notin visited e ->
  {in stack e & [set x], gconnected} -> dfs_spec (dfs1 x e) [set x] e.

(*****************)
(* Correctness *)
(*****************)

Lemma dfsP dfs1 dfsrec (roots : {set V}) e:
  (forall x, x \in roots -> dfs1_correct dfs1 x e) ->
  (forall x, x \in roots -> forall e1, subenv e e1 ->
     dfs_correct dfsrec (roots :\ x) e1) ->
  dfs_correct (dfs dfs1 dfsrec) roots e.
Proof.
rewrite /dfs => dfs1P dfsP e_wf roots_connected.
case: pickP => /= [x x_roots|]; last first.
  move=> r0; have {r0}r_eq0 : roots = set0 by apply/setP=> x; rewrite inE.
  do ?constructor=> //=;
    rewrite ?setDv ?r_eq0 ?nexts0 ?sub0set ?eqxx ?setU0 ?big_set0 //=;
    by move=> ?; rewrite inE.
have [numx_gt|numx_le]/= := ltnP; last first.
  have x_visited : x \in visited e by rewrite inE.
  case: dfsP => //= [u v ve|_ _ e1 ->-> e1_wf subee1 [new1c new1old visited1E]].
    by rewrite inE => /andP[_ v_roots]; rewrite roots_connected.
  constructor => //=.
    rewrite -[in RHS](setD1K x_roots) nextsU nexts1 inE x_visited/= setU0.
    by rewrite bigmin_setU /= big_set1/= (@sub_snum e e1)// inordK//.
  constructor=> //=; rewrite -(setD1K x_roots) nextsU nexts1 inE x_visited/=.
  by rewrite setU0 setUCA setUA [x |: _](setUidPr _) ?sub1set.
case: dfs1P => //=; first by rewrite inE -ltnNge.
  by move=> u v ue; rewrite inE => /eqP->; apply: roots_connected.
move=> _ _  e1 -> -> e1_wf subee1 [new1c new1old visited1E].
case: dfsP => //= [u v ue1|_ _ e2 -> -> e2_wf sube12 [new2c new2old visited2E]].
  rewrite inE => /andP[_ v_roots].
  have [ue|uNe] := boolP (u \in stack e); first by rewrite roots_connected.
  have [|w we] := new1old u; first by rewrite inE ue1 uNe.
  by move=> /connect_trans->//; rewrite roots_connected//.
have sube2 : subenv e e2 by exact: sub_trans sube12.
have nexts_split : nexts (~: visited e) roots =
      nexts (~: visited e) [set x] :|: nexts (~: visited e1) (roots :\ x).
  rewrite -[in LHS](setD1K x_roots) nextsU visited1E.
  by rewrite setCU nextsUI// nexts_id.
constructor => //=.
  rewrite (eq_bigr (inord \o num e2)).
   by rewrite -[LHS]/(val (ord_minn _ _)) -bigmin_setU /= -nexts_split.
  move=> y y_in; rewrite /= (@sub_snum e1 e2)// num_le_infty.
  by rewrite visited1E setUC inE y_in.
constructor => /=.
+ rewrite -(@setUD _ (stack e1)) ?sub_stack//.
  apply: connectedU => // y z; last first.
    rewrite !new_stackE// ?inE => /andP[y_ge y_lt] /andP[z_ge z_lt].
    rewrite (@le_connect e2) // z_lt (leq_trans _ z_ge)//.
    by rewrite (sub_vnum sube12)// ltnW.
  rewrite !new_stackE// ?inE => /andP[y_ge y_lt] /andP[z_ge z_lt].
  have [|r] := new2old y; rewrite ?new_stackE ?inE ?y_ge//.
  move=> r_lt /connect_trans->//; have [rz|zr] := leqP (num e1 r) (num e1 z).
    by rewrite (@le_connect e1)// rz/=.
  by rewrite new1c ?new_stackE ?inE ?z_ge ?z_lt //= (leq_trans z_ge)// ltnW.
+ move=> y; rewrite ?new_stackE ?inE// => /andP[y_ge y_lt].
  have [y_lt1|y_ge1] := ltnP (num e1 y) (sn e1).
    have [|r] := new1old y; last by exists r.
    by rewrite new_stackE ?inE// ?y_lt1 -(sub_vnum sube12) ?y_ge.
  have [|r r_lt1 yr] := new2old y; first by rewrite !inE -leqNgt y_ge1//.
  rewrite ?inE in r_lt1; have [r_lt|r_ge] := ltnP (num e r) (sn e).
    by exists r; rewrite ?inE.
  have [|r' r's rr'] := new1old r; first by rewrite ?inE -leqNgt r_ge r_lt1.
  by exists r'; rewrite // (connect_trans yr rr').
+ by rewrite visited2E {1}visited1E nexts_split setUA.
Qed.

Lemma dfs1P dfs x e (A := successors x) :
  dfs_correct dfs A (visit x e) -> dfs1_correct (dfs1 dfs) x e.
Proof.
rewrite /dfs1 => dfsP e_wf xNvisited x_connected.
have subexe: subenv e (visit x e) by exact: sub_visit.
have numx : num e x > infty by apply: num_gt_V.
have xNstack : x \notin stack e.
  by rewrite inE -leqNgt (leq_trans _ numx) ?leqW ?max_card.
have xe_wf : wf_env (visit x e).
  by apply: wf_visit => // y y_lt; rewrite x_connected ?inE.
have nexts1E : nexts (~: visited e) [set x] =
    x |: nexts (~: (x |: visited e)) A.
  by rewrite nexts1_split ?setDE ?setCU 1?setIC 1?inE.
case: dfsP => //=.
  rewrite stack_visit// => u v; rewrite in_setU1=> /predU1P[->|ue];
  rewrite inE => /(@connect1 _ edge)// /(connect_trans _)->//.
  by rewrite x_connected// set11.
move=> _ _ e1 //= -> -> e1_wf subxe1 [newc new_old visited1E].
have sube1 : subenv e e1 by apply: sub_trans subxe1.
have num1x : num e1 x = sn e.
  by rewrite (sub_snum subxe1)// ?inE ?ffunE/= ?eqxx// max_card.
rewrite visited_visit in visited1E *.
have lt_sn_sn1 : sn e < sn e1.
  by rewrite (leq_trans _ (leq_sn subxe1))// visited_visit cardsU1 xNvisited.
have x_visited1 : x \in visited e1 by rewrite visited1E inE setU11.
have x_stack : x \in stack e1.
  by rewrite (subsetP (sub_stack subxe1))//= stack_visit// setU11.
have [min_after|min_before] := leqP; last first.
  constructor => //=.
    rewrite nexts1E bigmin_setU big_set1 /= inordK ?num1x ?ltnS ?max_card//.
    by rewrite (minn_idPr _)// ltnW.
  constructor=> //=; last by rewrite nexts1E setUCA setUA visited1E.
    move=> y z; have [-> _|neq_yx] := eqVneq y x.
      by rewrite new_stackE ?inE// -num1x; apply: le_connect.
    rewrite -(@setUD _ (stack (visit x e))) ?sub_stack//.
    rewrite [in X in _ :|: X]stack_visit// setDUl setDv setU0.
    rewrite [_ :\: stack e](setDidPl _) ?disjoint1s//.
    rewrite setUC !in_setU1 (negPf neq_yx)/=.
    move=> y_e1 /predU1P[->|]; last exact: newc y_e1.
    have [t] := new_old y y_e1; rewrite !inE => t_le /connect_trans->//.
    rewrite (@le_connect (visit x e))// andbC; move: t_le.
    by rewrite visited_visit !ffunE /= eqxx cardsU1 xNvisited add1n !ltnS leqnn.
  move=> y; have [v ve xv] : exists2 v, v \in stack e & gconnect x v.
    have [|v] := @eq_bigmin_cond _ _ (mem (nexts (~: (x |: visited e)) A))
                               (inord \o num e1).
      rewrite card_gt0; apply: contraTneq min_before => ->.
      by rewrite big_set0 -leqNgt max_card.
    rewrite !inE => v_in min_is_v; move: min_before; rewrite min_is_v/=.
    rewrite inordK; last by rewrite ltnS num_le_infty visited1E inE v_in orbT.
    rewrite -sub_num_lt// => v_lt; exists v; rewrite ?inE//.
    move: v_in => /in_nextsW[z]; rewrite inE => /(@connect1 _ edge).
    by apply: connect_trans.
  rewrite -(@setUD _ (stack (visit x e))) ?sub_stack//.
  rewrite [in X in _ :|: X]stack_visit// setDUl setDv setU0.
  rewrite [_ :\: stack e](setDidPl _) ?disjoint1s// setUC !in_setU1.
  move=> /predU1P[->|]; first by exists v.
  move=> /new_old[z]; rewrite stack_visit// in_setU1.
  move=> /predU1P[->|]; last by exists z.
  by move=> yx; exists v; rewrite // (connect_trans yx).
have all_geq y : y \in nexts (~: visited e) [set x] ->
  (#|visited e| <= num e1 y) * (num e1 y <= infty).
  have := min_after; have sn_inord : sn e = inord (sn e).
    by rewrite inordK// ltnS max_card.
  rewrite {1}sn_inord; move/bigmin_geqP => /(_ y) y_ge.
  rewrite nexts1E !inE => /predU1P[->|yA]; rewrite ?num1x ?max_card ?leqnn//.
  rewrite sn_inord (leq_trans (y_ge _))// ?inordK//;
  by rewrite ?ltnS num_le_infty visited1E 2!inE yA orbT.
constructor => //=.
- rewrite big1// => y xy; rewrite ffunE new_stackE ?inE//=.
  have y_visited1 : num e1 y <= infty.
    by rewrite num_le_infty visited1E -setUA setUCA -nexts1E inE xy orbT.
  apply/val_inj=> /=; case: ifPn; rewrite ?inordK//.
  by rewrite all_geq//= -num_lt_V// -leqNgt; move: y_visited1; case: ltngtP.
- constructor => //=; rewrite ?visited_store ?sub_new_stack_visited//.
  + rewrite subUset sub_gsccs// andbT sub1set.
    suff -> : new_stack e e1 = gscc_of x by rewrite pblock_mem ?cover_sccs.
    apply/setP=> y; rewrite mem_scc /symconnect.
    have [->|neq_yx] := eqVneq y x.
      by rewrite connect0 inE xNstack inE num1x lt_sn_sn1.
    apply/idP/andP=> [|[xy yx]].
      move=> y_ee1; have y_xee1 : y \in new_stack (visit x e) e1.
        by rewrite inE stack_visit// in_setU1 (negPf neq_yx)/= -in_setD.
      split; last first.
        have [z] := new_old _ y_xee1.
        rewrite stack_visit// in_setU1 => /predU1P[->//|/x_connected].
        by move=> /(_ _ (set11 x))/(connect_trans _) xz /xz.
      have: y \in new_visited (visit x e) e1.
        by apply: subsetP y_xee1; rewrite sub_new_stack_new_visited.
      rewrite inE visited1E in_setU visited_visit//; case: (y \in _) => //=.
      move=> /in_nextsW[z]; rewrite inE=> /(@connect1 _ edge).
      exact: connect_trans.
    have /(connect_from (mem (~: visited e))) [z []] := xy; rewrite inE.
    move=> eq_yz xz zy; have /all_geq [] : z \in nexts (~: visited e) [set x].
      by apply/bigcupP; exists x; rewrite !inE.
    rewrite leqNgt -sub_num_lt// -num_lt_V// -leqNgt => zNstack.
    have zNcover e' : wf_env e' -> z \in cover (esccs e') ->
                      x \in cover (esccs e').
      move=> e'_wf /bigcupP[C] Ce zC; apply/bigcupP; exists C => //.
      have /def_scc: C \in gsccs by apply: subsetP Ce; apply: sub_gsccs.
      move=> /(_ _ zC)<-; rewrite mem_scc /= /symconnect (connect_trans zy)//=.
      by apply: connect_sub xz => ?? /andP[_ /connect1].
    rewrite leq_eqVlt num_sccs// num_lt_V// => /orP[|z_stack].
       move=> /zNcover; rewrite -num_sccs// num1x => /(_ _) /eqP eq_V.
       by rewrite eq_V// ltnNge max_card in lt_sn_sn1.
    have zNvisited : z \notin visited e.
      rewrite inE -ltnNge ltn_neqAle zNstack andbT/= eq_sym num_sccs//.
      by apply: contraTN isT => /(zNcover _ e_wf); rewrite -num_sccs// gtn_eqF.
    move: eq_yz; rewrite zNvisited /= => /andP[/eqP eq_yz _].
    rewrite -eq_yz in zNstack z_stack.
    by rewrite !inE -num_lt_V// -leqNgt zNstack.
  + move=> v; rewrite ffunE/=; case: ifPn; rewrite ?ltnn// => vNe12.
    by rewrite num_lt_V// visited_store.
  + move=> v; rewrite ffunE /= cover1U [in RHS]inE.
    by case: ifPn; rewrite ?eqxx//= => vNe12; rewrite -num_sccs//.
  + move=> y z; rewrite !ffunE; case: ifPn => _.
      by move=> /andP[/leq_ltn_trans Vsmall/Vsmall]; rewrite ltnNge max_card.
    by case: ifPn => _; [by rewrite ltnNge max_card andbF|exact : le_connect].
- rewrite /subenv /= (subset_trans (sub_sccs sube1)) ?subsetUr//=.
  apply/andP; split; apply/forallP => v; apply/implyP;
  rewrite ffunE/= new_stackE// ?inE.
    move=> vs; rewrite (sub_snum sube1)// leqNgt -!num_lt_V// -leqNgt ifN//.
    by apply/negP => /andP[/leq_ltn_trans Vlt/Vlt]; rewrite ltnNge max_card.
  by case: ifPn; [move=> _; rewrite ltnNge max_card|rewrite -sub_num_lt].
- rewrite /outenv stack_store ?visited_store ?sub_new_stack_visited//.
  rewrite setDDr setDUl setDv set0D set0U setDIl !setDv setI0.
  split; do ?by move=> ?; rewrite inE.
  by rewrite visited1E -setUA setUCA -nexts1E.
Qed.

Theorem rec_terminates k (roots : {set V}) e :
  k >= #|~: visited e| * infty.+1 + #|roots| -> dfs_correct (rec k) roots e.
Proof.
move=> k_ge; elim: k => [|k IHk/=] in roots e k_ge *.
  move: k_ge; rewrite leqn0 addn_eq0 cards_eq0 => /andP[_ /eqP-> e_wf _]/=.
  constructor=> //=; rewrite /outenv ?nexts0 ?setDv ?big_set0// ?setU0.
  by split=> // ?; rewrite inE.
apply: dfsP=> x x_roots; last first.
  move=> e1 subee1; apply: IHk; rewrite -ltnS (leq_trans _ k_ge)//.
  rewrite (cardsD1 x roots) x_roots add1n -addSnnS ltn_add2r ltnS.
  by rewrite leq_mul2r //= subset_leq_card// setCS sub_visited.
move=> e_wf xNvisited; apply: dfs1P => //; apply: IHk.
rewrite visited_visit setCU setIC -setDE -ltnS (leq_trans _ k_ge)//.
rewrite (cardsD1 x (~: _)) inE xNvisited add1n mulSnr -addnA ltn_add2l.
by rewrite ltn_addr// ltnS max_card.
Qed.

Lemma visited0 : visited e0 = set0.
Proof. by apply/setP=> y; rewrite !inE ffunE ltnn. Qed.

Lemma stack0 : stack e0 = set0.
Proof. by apply/setP=> y; rewrite !inE ffunE ltnNge leqW ?max_card. Qed.

Theorem tarjan_correct : tarjan = gsccs.
Proof.
rewrite /tarjan mulnSr; case: rec_terminates.
- by rewrite visited0 setC0 cardsT.
- constructor; rewrite /= ?sub0set// => x; rewrite !ffunE//.
  + by rewrite ltnNge leqW//.
  + by rewrite gtn_eqF// /cover big_set0 inE.
  + by move=> y; rewrite !ffunE//= andbC ltnNge leqW// ?max_card.
- by move=> y; rewrite !inE !ffunE/= ltnNge leqW// max_card.
move=> _ _ e -> _ e_wf _ [_]; rewrite stack0 setD0.
have [stacke _|[x xe]] := set_0Vmem (stack e); last first.
  by move=> /(_ _ xe)[?]; rewrite inE.
rewrite visited0 set0U setC0 nextsT => visitede.
have numE x : num e x = infty.
  apply/eqP; have /setP/(_ x) := visitede.
  by rewrite visitedE// stacke set0U !inE -num_sccs.
apply/eqP; rewrite eqEsubset sub_gsccs//=; apply/subsetP => _/imsetP[/=x _->].
have: x \in cover (esccs e) by rewrite -num_sccs ?numE//.
move=> /bigcupP [C Csccs /(def_scc (subsetP (sub_gsccs e_wf) _ Csccs))] eqC.
rewrite -eqC (_ : [set _ in _ | _] = gscc_of x)// in Csccs *.
by apply/setP => y; rewrite !inE mem_scc /=.
Qed.

End tarjan.


================================================
FILE: notes/tarjan.md
================================================
so as we're going through the depth-first search, we store: a stack of visited but not yet assigned vertices, pushed onto the stack in the order they are visited; a set of finalized components; the current serial number; and a "function" (map?) from vertices to serial numbers.

two mutually recursive functions, which they call `dfs1` and `dfs`, but those are awful names; I'll rename them once I fully understand them

- `dfs` takes a set of roots and an initial environment, and returns a pair of an integer and the modified environment. if the roots set is empty the integer is `infinity` (what they should have done is just used `option` or something here). Otherwise the returned integer is the minimum of the results of the calls to `dfs1` on non-visited vertices in `r` and of the serial numbers of the already visited ones.
- `dfs1` visits a single unvisited vertex: it pushes it onto the stack with the next serial number, explores its successors with `dfs`, and if the returned minimum is not older than the vertex's own number, pops the vertex's whole component off the stack and finalizes it

the main function creates the initial environment with an empty stack, empty set of components, serial number 0, and an empty map assigning numbers to vertices


```
let rec dfs1 vertex e =
  let n0 = e.cur in
  let (n1, e1) = dfs (successors vertex) (add_stack_incr vertex e) in
  if n1 < n0
    then (n1, e1)
    else
      let (s2, s3) = split vertex e1.stack in
      (+∞, {stack = s3; sccs = add (elements s2) e1.sccs; cur = e1.cur; num = set_infty s2 e1.num})

with dfs r e =
  if is_empty r
    then (+∞, e)
    else
      let x = choose r in
      let r’ = remove x r in
      let (n1, e1) = if e.num[x] != -1
        then (e.num[x], e)
        else dfs1 x e in
      let (n2, e2) = dfs r’ e1 in (min n1 n2, e2)

let tarjan () =
  let e = {stack = []; sccs = empty; cur = 0; num = const (-1)} in
  let (_, e’) = dfs vertices e in e’.sccs

let add_stack_incr x e =
  let n = e.cur in
  {stack = x :: e.stack; sccs = e.sccs; cur = n+1; num = e.num[x ← n]}

let rec set_infty s f = match s with
  | [] → f
  | x :: s’ → (set_infty s’ f)[x ← +∞] end

let rec split x s = match s with
  | [] → ([], [])
  | y :: s’ → if x = y
    then ([x], s’)
    else
      let (s1’, s2) = split x s’ in
      (y :: s1’, s2) end
```
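
here's a minimal runnable Rust rendering of that pseudocode, just to pin the moving parts down — all the names are mine, none of this is the repo's code; vertices are `0..n`, `succs[v]` is the successor list, and two `usize` sentinels stand in for the paper's `-1` and `+∞`:

```rust
const UNSEEN: usize = usize::MAX;     // the pseudocode's num[x] = -1
const INFTY: usize = usize::MAX - 1;  // num of vertices in finalized sccs

struct Env {
	stack: Vec<usize>,     // visited but not yet assigned, most recent on top
	sccs: Vec<Vec<usize>>, // finalized components
	cur: usize,            // next serial number to hand out
	num: Vec<usize>,       // serial number of each vertex
}

// add_stack_incr: push x with the next serial number
fn add_stack_incr(x: usize, e: &mut Env) {
	e.num[x] = e.cur;
	e.cur += 1;
	e.stack.push(x);
}

fn dfs1(succs: &[Vec<usize>], vertex: usize, e: &mut Env) -> usize {
	let n0 = e.cur;
	add_stack_incr(vertex, e);
	let n1 = dfs(succs, &succs[vertex], e);
	if n1 < n0 {
		// the exploration reached a vertex older than `vertex`,
		// so `vertex` is not the root of its component
		n1
	} else {
		// `vertex` is a root: split the stack at it and finalize (set_infty)
		let mut component = Vec::new();
		loop {
			let y = e.stack.pop().expect("vertex is still on the stack");
			e.num[y] = INFTY;
			component.push(y);
			if y == vertex { break; }
		}
		e.sccs.push(component);
		INFTY
	}
}

fn dfs(succs: &[Vec<usize>], roots: &[usize], e: &mut Env) -> usize {
	let mut min = INFTY; // the paper's +∞ when roots is empty
	for &x in roots {
		let n1 = if e.num[x] != UNSEEN { e.num[x] } else { dfs1(succs, x, e) };
		min = min.min(n1);
	}
	min
}

fn tarjan(succs: &[Vec<usize>]) -> Vec<Vec<usize>> {
	let mut e = Env {
		stack: vec![], sccs: vec![], cur: 0, num: vec![UNSEEN; succs.len()],
	};
	let all: Vec<usize> = (0..succs.len()).collect();
	dfs(succs, &all, &mut e);
	e.sccs
}

fn main() {
	// 0 -> 1 -> 2 -> 0 is a cycle; 3 points into it
	let succs = vec![vec![1], vec![2], vec![0], vec![2]];
	println!("{:?}", tarjan(&succs)); // [[2, 1, 0], [3]]
}
```

the `n1 < n0` test is the whole trick: if exploring from `vertex` reached a serial number older than `vertex`'s own, then `vertex` can't yet be the root of its component, so the stack is left alone.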

looks like I need to check out their better version
https://www-sop.inria.fr/marelle/Tarjan/
https://math-comp.github.io/mcb/

Łukasz Czajka and Cezary Kaliszyk. Hammer for Coq: Automation for dependent type theory. J. Autom. Reasoning, 61(1-4):423–453, 2018.


================================================
FILE: notes.md
================================================
modified orphan rule:
trait authors can define *automatic derive implementations* that "kick in" when a type the trait author hasn't explicitly defined an implementation for "requests" one. this means such trait authors could merely define manual implementations for the primitive types. the automatic implementation can be superseded by an explicit implementation in the type's crate
what about derive arguments for third-party crates? honestly still easy to solve using the newtype pattern (sketched just below)
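
a tiny sketch of that workaround (hypothetical types, nothing from any real crate): both `Vec<String>` and `Display` are foreign, so the local wrapper is what makes the impl legal:

```rust
use std::fmt;

// a local newtype over a foreign type
struct CommaList(Vec<String>);

// legal because CommaList is ours, even though Display and Vec are not
impl fmt::Display for CommaList {
	fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
		write!(f, "[{}]", self.0.join(", "))
	}
}

fn main() {
	let list = CommaList(vec!["a".into(), "b".into()]);
	println!("{}", list); // prints: [a, b]
}
```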


https://people.mpi-sws.org/~dreyer/papers/sandboxing/paper.pdf


https://people.mpi-sws.org/~beta/papers/unicoq.pdf
https://www.sciencedirect.com/science/article/pii/S089054010300138X?via%3Dihub
https://golem.ph.utexas.edu/category/2021/08/you_could_have_invented_de_bru.html
https://proofassistants.stackexchange.com/questions/900/when-should-i-use-de-bruijn-levels-instead-of-indices
https://www.sciencedirect.com/science/article/pii/0167642395000216
https://arxiv.org/pdf/1102.2405.pdf
https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.156.1170
https://davidchristiansen.dk/tutorials/nbe/



when defining the "handy" or base equality proposition (`==`), why not make it an `or` over strict definitional equality (`===`) and an existence proof over a more complex setoid equality? or simply not include it as a base at all and let operator chaining allow people to use more complex custom equality concepts themselves?
maybe we should pull apart "definitional" or "shallow" equality from "computable" equality? is that useful?




https://inria.hal.science/hal-01094195/preview/CIC.pdf
notes on `match`
`match t as x` could probably just be rewritten with a `let` beforehand?
or do like rust!
`match let x: I[y1, ... yp] = t {}`

the `in I y1 ... yp` is basically a destructuring of the *type* of the match target. it allows you to use the *index* values of the type in the body.
a better syntax that makes this clearer would be `match t: I y1 ... yp { ...match arms... }`

every arm can have a different type, which means the *type* of the entire match *is itself another `match`!*, but one that returns a type rather than a value that has a type

strictly speaking the `return P` isn't necessary, but it allows you to use the `in whatever` type you've destructured in other places

A match type checks mostly based on its `return` clause.
to type check a match, you have to:

- check all constructors are accounted for
- check that each arm's type aligns with the `return` clause, including when the `return` is a function in which case you run the concrete pattern through that function

this is how "absurdity" is possible, when the `return` clause gives a *different* type for the *actual* input data than it does for the concrete constructor arms! this can only happen when the input data is impossible to construct, because otherwise the constructor arms would definitely handle it since by construction they cover every possibility.

```
(* in real Coq syntax, with a nat-like N and the le relation from these notes *)
Inductive N: Type := Z | S (n: N).
Inductive le: N -> N -> Prop :=
  | lez x: le Z x
  | leS x y: le x y -> le (S x) (S y).

(* False exactly on the impossible index pair: le (S _) Z *)
Definition do_inversion (x y: N): Prop :=
  match x, y with S _, Z => False | _, _ => True end.

Definition le_inversion n (H: le (S n) Z): False :=
  match H in le x y return do_inversion x y with
  (* both arms' return types reduce to True, so the trivial proof I works *)
  | lez x => I
  | leS x y p => I
  end.
```

there are also rules about which *sort* the match target and the `return` type live in, and it *broadly* seems you just have to keep `Type` and `Prop` separate? that's a massive oversimplification (these are CIC's elimination restrictions, which stop you from computing `Type`-level data out of a mere `Prop`), but...


`fixpoint` definitions just have to typecheck and pass the syntactic termination check (structurally decreasing on some argument)



it seems that inductive types with indexed values are *usefully conceptually different from functions*
syntactically they're similar, but their constructors are almost like "primitive" functions that don't have a body, and are instead just "declared" to give a value of a certain type
you could definitely frame these as something similar to a generic type, but there's still a difference between that concept and an indexed type

A universal asserted type constructor would really help
basically you don't need `exists` or special-case subset types like `vec` if you have a general-purpose asserted type that's directly supported in the syntax

you *don't need index values at all* for `Type` definitions if you have a universal asserted type system.
this means you could do index values for `Prop` very differently, not treating them like functions that eventually return `Prop` but something more similar to generics

```
@(A: Type, R: (A, A) -> Prop)
prop RT(A, A);
  // | RTrefl: @(x), RT x x
  | RTrefl(x): RT x x

  // | RTR: @(x, y), R x y -> RT x y
  // @() specifies names, types optional, () specifies types without names
  | RTR@(x, y)(R x y): RT x y

  // | RTtran: @(x, y, z), RT x y -> RT y z -> RT x z
  | RTtran@(x, y, z)(RT x y, RT y z): RT x z

prop le(N, N);
  | lez: @(x) -> le(0, x)
  | leS: @(x, y), le x y -> le(S x, S y)


prop le[N, N];
  | lez(x): [0, x]
  | leS(x, y)&(le[x, y]): [S x, S y]

or you could use <> instead of []
```


to demonstrate that an imaginary type "models" a real type, you have to provide:

- a function that can always convert (the separation logic representation of) the real type to the imaginary type
- a function that can convert an imaginary type into a real type, with a proof that this function is a mirror image of the above function (a toy sketch follows)
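
a toy round trip making those two obligations concrete (everything here is hypothetical, and Rust can only spot-check the mirror-image law rather than prove it): take the "real" type to be a little-endian byte pair and the "imaginary" type to be `u16`:

```rust
// real representation -> imaginary type; total, always succeeds
fn interpret(real: [u8; 2]) -> u16 {
	u16::from_le_bytes(real)
}

// imaginary type -> real representation
fn represent(imaginary: u16) -> [u8; 2] {
	imaginary.to_le_bytes()
}

fn main() {
	// the mirror-image obligation, sampled instead of proven
	for i in [0u16, 1, 255, 256, u16::MAX] {
		assert_eq!(interpret(represent(i)), i);
	}
	println!("round trip holds on all samples");
}
```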




https://www.cs.cmu.edu/~fp/papers/mfps89.pdf
https://github.com/VictorTaelin/calculus-of-constructions

https://github.com/coq/coq/wiki/TheoryBehindCoq

https://softwarefoundations.cis.upenn.edu/lf-current/ProofObjects.html
https://softwarefoundations.cis.upenn.edu/lf-current/Logic.html
https://www.researchgate.net/figure/Sketch-of-type-checking-rules-in-Coq_fig17_221336389
https://www.williamjbowman.com/tmp/wjb-sized-coq.pdf
https://www.labri.fr/perso/casteran/CoqArt/Tsinghua/C5.pdf
https://hal.science/hal-02380196/document
https://coq.inria.fr/refman/language/cic.html



https://arxiv.org/pdf/2105.12077.pdf



need to look at xcap paper and other references in the bedrock paper

https://plv.csail.mit.edu/blog/iris-intro.html#iris-intro
https://plv.csail.mit.edu/blog/alectryon.html#alectryon











Verified hardware simulators would be easy with magmide

Engineers want tools that can give them stronger guarantees about safety, robustness, and performance, but those tools have to be tractably usable and respect their time

This idea exists in incentive no man's land. Academics won't think about it or care about it, because it merely applies existing work; they'll trudge along in their tenure tracks and keep publishing post hoc verifications of existing systems. Engineers won't think about it or care about it, because it can't make money quickly, be made into a service, or even very quickly be used to improve some service.
This is an idea that carries basically zero short-term benefits but incalculable long-term ones, mainly in the way it could shift the culture of software, and even of mathematics and logic, if successful.
This project is hoping and gambling that it itself won't even be the truly exciting innovation, but that some other project will be, one that builds upon it and wouldn't have happened otherwise. I'm merely hoping to be the pair of shoulders someone else stands on, and I hope the paradigm shift this project creates comes to be assumed as obvious, that future engineers will think we were insane to ever write programs without proving them correct.




https://mattkimber.co.uk/avoiding-growth-by-accretion/
Most effects aren't really effects but environmental capabilities, although sometimes those capabilities come with effects
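
one way to make that concrete in code (a hypothetical sketch, not anything defined here): the "effect" of reading the clock becomes a capability value that a function must be handed explicitly:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

trait Clock {
	fn now_millis(&self) -> u128;
}

struct SystemClock;

impl Clock for SystemClock {
	fn now_millis(&self) -> u128 {
		// the actual effect is confined to the capability's implementation
		SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_millis()
	}
}

// this function can only observe time because it was *given* the capability
fn log_with_time(clock: &impl Clock, msg: &str) {
	println!("[{}] {}", clock.now_millis(), msg);
}

fn main() {
	log_with_time(&SystemClock, "hello");
}
```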



Traits, shapes, and the next level of type inference

Discriminated unions and procedural macros make dynamically typed languages pointless, and they've existed for decades. So what gives?

What's better than a standard? An automatically checkable and enforceable standard


https://project-oak.github.io/rust-verification-tools/2021/09/01/retrospective.html
we have to go all the way. anything less than the capabilities given by a full proof checker, proving theorems about the literal environment abstractions, isn't going to be good enough; it will always have bugs and hard edges and cases that can't be handled. but those full capabilities can *contain* other more "ad hoc" things like fuzzers, quickcheck libraries, test generators, etc. we must build upon a magmide!





stop trying to make functional programming happen, it isn't going to happen

## project values

- **Correctness**: this project should be a flexible toolkit capable of verifying and compiling any software for any architecture or environment. It should make it as easy as possible to model the abstractions presented by any hardware or host system with full and complete fidelity.
- **Clarity**: this project should be accessible to as many people as possible, because it doesn't matter how powerful a tool is if no one can understand it. To guide us in this pursuit we have a few maxims: speak plainly, and don't use jargon when simpler words could be just as precise; don't use a term unless you've given some path for the reader to understand it, and if a topic has prerequisites, point readers toward them; assume your reader is capable but busy; use fully descriptive words, not vague abbreviations and symbols.
- **Practicality**: a tool must be usable, both in terms of the demands it makes and its design. This tool is intended to be used by busy people building real things with real stakes.
- **Performance**: often those programs which absolutely must be fast are also those which absolutely must be correct. Infrastructural software is constantly depended on, and must perform well.

These values inherently reinforce one another. As we gain more ability to guarantee correctness, we can make programs faster and solve more problems. As our tools become faster, they become more usable. Guaranteeing correctness saves others time and headache dealing with our bugs. As we improve clarity, more people gather to help improve the project, making it even better in turn.