Full Code of cslarsen/stack-machine for AI

Repository: cslarsen/stack-machine
Branch: master
Commit: da2c5b325784
Files: 31
Total size: 49.1 KB

Directory structure:
gitextract_6bwf0lp8/

├── .gitignore
├── Makefile
├── README.md
├── compiler.cpp
├── error.cpp
├── fileptr.cpp
├── include/
│   ├── compiler.hpp
│   ├── error.hpp
│   ├── fileptr.hpp
│   ├── instructions.hpp
│   ├── label.hpp
│   ├── machine.hpp
│   ├── parser.hpp
│   ├── upper.hpp
│   └── version.hpp
├── instructions.cpp
├── machine.cpp
├── parser.cpp
├── sm.cpp
├── smc.cpp
├── smd.cpp
├── smr.cpp
├── tests/
│   ├── core-test.src
│   ├── core.src
│   ├── fib.src
│   ├── forward-goto.src
│   ├── func.src
│   ├── hello.src
│   ├── todo-print.src
│   └── yo.src
└── upper.cpp

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
*.o
*.sm
sm
smc
smd
smr


================================================
FILE: Makefile
================================================
CXXFLAGS = -g -W -Wall -Weffc++ -Iinclude
LINK.o = $(LINK.cc)

TARGETS = instructions.o parser.o error.o upper.o fileptr.o machine.o compiler.o sm.o smr.o smc.o smd.o sm smr smc smd

all: $(TARGETS)
	@echo Run \"make check\" to test package

%.sm: tests/%.src
	./smc $<

smr: instructions.o machine.o upper.o fileptr.o smr.o

smc: instructions.o machine.o upper.o error.o fileptr.o parser.o compiler.o smc.o

smd: instructions.o machine.o upper.o error.o fileptr.o smd.o

sm: instructions.o machine.o upper.o error.o fileptr.o parser.o compiler.o sm.o

check: all
	./sm tests/fib.src
	./smc tests/fib.src
	./smr tests/fib.sm
	./smc tests/hello.src
	./smr tests/hello.sm
	./smc tests/forward-goto.src
	./smr tests/forward-goto.sm
	./sm tests/yo.src
	./sm tests/func.src
	cat tests/core-test.src tests/core.src | ./sm -

clean:
	rm -f $(TARGETS) *.stackdump tests/*.sm


================================================
FILE: README.md
================================================
Stack-Machine
=============

This project contains

  * A simple, stack-based virtual machine for executing low-level instructions
  * An assembler supporting a Forth / PostScript like language
  * An interpreter able to run compiled programs

Architecture and design
-----------------------

The instructions are fixed-width at 32 bits, and so are the arithmetic
operands.

By default, programs have 1 million cells available for both program text
and data.  This means that a virtual machine memory takes up 4MB plus the
data and instruction stacks.

The text and data regions overlap, so you can easily write
self-modifying code (early versions actually required self-modification to
be able to return from subroutine calls, just like Knuth's MIX, but I've
since taken the liberty of adding that modern convenience to the core
instruction set).

There are no registers.  This _is_ a stack machine, after all.

As we know from theoretical computer science, a pushdown automaton needs
_two_ stacks to be Turing equivalent.  Therefore we employ two as well; one
for the instruction pointer and one for the data.  They live separately from
the text and data region, and are only limited by the host process heap
size.

The machine contains no special facilities besides this:  It's inherently
single-threaded and has no protection mechanisms.  Its operation is
completely sandboxed, though, except for access to standard output.

Aim
---

The project aim was to create a simple machine and language to play around
with.  You can benefit from it by reading the source code, playing with a
language similar to Forth, but conceptually simpler, and finally by seeing
how easy it is to build your own system.

The programming language
========================

The language is very similar to Forth and PostScript:  You basically write
in RPN (reverse Polish notation).  Numbers and character literals are put on
the data stack, so to put the numbers 3 and 2 on the stack, just write

    3 2

To multiply them, just append an asterisk:

    3 2 * ; multiplication

This operation pops the topmost two numbers on the stack and replaces them
with the result of the multiplication.  To run such a program, you'd need to
include the core library first, since multiplication is defined as a
function:

    $ cat tests/core.src your-file.src | sm
    6
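
As a rough illustration of the evaluation order, here is a minimal C++ sketch
of an RPN evaluator.  It is not the project's code: `eval_rpn` is a
hypothetical helper that only understands non-negative integer literals, `+`
and `*`, and it assumes a C++11 compiler for `std::stoi`.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: evaluate a whitespace-separated RPN expression.
// Supports non-negative integer literals, "+" and "*" only.
int32_t eval_rpn(const std::string& source)
{
  std::vector<int32_t> stack;
  std::istringstream in(source);
  std::string token;

  while ( in >> token ) {
    if ( token == "+" || token == "*" ) {
      assert(stack.size() >= 2); // both operands must already be on the stack
      int32_t b = stack.back(); stack.pop_back();
      int32_t a = stack.back(); stack.pop_back();
      stack.push_back(token == "+" ? a + b : a * b);
    } else {
      // Anything else is assumed to be a number; std::stoi throws otherwise
      stack.push_back(static_cast<int32_t>(std::stoi(token)));
    }
  }

  assert(!stack.empty());
  return stack.back(); // the result is whatever is left on top
}
```

With this sketch, `eval_rpn("3 2 *")` returns 6, matching the output shown
above.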

Labels, addresses and their values
----------------------------------

Labels are identifiers ending with a colon.

They refer to a particular cell in the machine, and you can access their
position, value or execute code from that cell location:

    label:      ; create a label for the cell at this location
    &label      ; put ADDRESS of label on top of stack
    &label LOAD ; put VALUE of label's cell "label" on top-of-stack
    label       ; EXECUTE code from label position

So, to put the _address_ of a label on the top of the data stack, just
prepend the label name with an ampersand.

If you want the _value_ of an address, put the address on the TOS (top of
stack) and use the `LOAD` instruction to replace the TOS with the value at
the given cell location.

When executing code at a given label position, the machine first puts the
address of the next instruction on top of the instruction stack.  This way
you can return from a function call by using the instruction `POPIP`:

    main:       ; program start
      print-dot
      print-dot
      HALT

    print-dot:
      '.' OUT
      '\n' OUT
      POPIP     ; return from "function"
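
The call and return mechanism can be sketched in C++ roughly as follows.
This is a simplified model under stated assumptions, not the real machine:
`MiniOp`, `run_mini` and `print_dot_program` are hypothetical names, and the
single `M_CALL` opcode stands in for the `PUSHIP`, `PUSH`, `JMP` sequence
that `compile_function_call` in `compiler.cpp` emits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical mini-machine: just enough opcodes to show call and return.
enum MiniOp { M_PUSH, M_OUT, M_CALL, M_POPIP, M_HALT };

// Run `program` from cell 0 and return everything it printed.
std::string run_mini(const std::vector<int>& program)
{
  std::vector<int> data;    // data stack
  std::vector<int> ipstack; // instruction pointer (return address) stack
  std::string out;
  size_t ip = 0;

  for (;;) {
    assert(ip < program.size());
    switch ( program[ip] ) {
    case M_PUSH:
      data.push_back(program[ip + 1]);
      ip += 2;
      break;
    case M_OUT:
      out += static_cast<char>(data.back());
      data.pop_back();
      ++ip;
      break;
    case M_CALL:
      ipstack.push_back(static_cast<int>(ip + 2)); // remember the next instruction
      ip = static_cast<size_t>(program[ip + 1]);   // jump to the function
      break;
    case M_POPIP:
      ip = static_cast<size_t>(ipstack.back());    // return to the caller
      ipstack.pop_back();
      break;
    default: // M_HALT or an unknown opcode: stop
      return out;
    }
  }
}

// main: print-dot print-dot HALT     print-dot: '.' OUT '\n' OUT POPIP
std::vector<int> print_dot_program()
{
  const int code[] = {
    M_CALL, 5, M_CALL, 5, M_HALT,                    // cells 0..4: main
    M_PUSH, '.', M_OUT, M_PUSH, '\n', M_OUT, M_POPIP // cells 5..11: print-dot
  };
  return std::vector<int>(code, code + sizeof(code) / sizeof(code[0]));
}
```

Running `run_mini(print_dot_program())` yields a dot and a newline, twice,
because each call pushes its own return address before jumping.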

Variables and subroutines
-------------------------

An idiom for creating a variable is to create a label and put a `NOP` at
that location, which reserves one memory cell to hold the value.  An example
of using a counter variable to implement a loop is given below.

    counter: NOP                     ; reserve 1 word for the variable "counter"

    program: 2 &counter STOR                       ; set counter to two
             &counter LOAD 1 ADD &counter STOR     ; increment counter by one

    ; loop counter+1 times

    display: '\n' '*' OUT OUT                      ; print an asterisk
             1 &counter LOAD SUB &counter STOR     ; decrement counter by one
             &display &counter LOAD JNZ            ; jump to display if not zero

The output of the above program is three stars:

    $ ./sm foo.src
    *
    *
    *
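
In C++ terms the loop behaves like a do-while: the body runs once before the
counter is tested.  The sketch below is only an analogy; `display_loop` is a
hypothetical name, and the output is collected in a string instead of being
written to standard output.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical C++ rendering of the "display" loop from the listing.
std::string display_loop(int32_t counter)
{
  assert(counter >= 1);     // a do-while body always runs at least once
  std::string out;

  do {
    out += "*\n";           // '\n' '*' OUT OUT prints '*' first, then the newline
    counter -= 1;           // 1 &counter LOAD SUB &counter STOR
  } while ( counter != 0 ); // &display &counter LOAD JNZ

  return out;
}
```

The program stores 2 in `counter` and increments it once, so the equivalent
call is `display_loop(3)`.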

You can forward-reference labels.  In fact, another idiom is to jump to the
main part of the program at the start of the source.
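
Forward references work because the compiler records where each unresolved
label is used and fills in the address once all labels are known (see
`resolve_forwards` in `compiler.cpp`).  The following is a minimal sketch of
that backpatching idea, assuming a plain `std::map` for the label table;
`Fixup` and `backpatch` are hypothetical names, not the project's API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record of a cell that must be patched once its label is known.
struct Fixup {
  std::string label;
  size_t cell;
};

// Fill every placeholder cell with the address of the label it refers to.
// Returns false if some label was never defined.
bool backpatch(std::vector<int>& memory,
               const std::map<std::string, int>& labels,
               const std::vector<Fixup>& fixups)
{
  for ( size_t n = 0; n < fixups.size(); ++n ) {
    std::map<std::string, int>::const_iterator it =
      labels.find(fixups[n].label);

    if ( it == labels.end() )
      return false; // label used but never defined

    assert(fixups[n].cell < memory.size());
    memory[fixups[n].cell] = it->second; // overwrite the placeholder
  }

  return true;
}
```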

Hello, world!
-------------

You can do `72 OUT` to print the letter "H" (72 is the ASCII code for "H").
Cutting to the chase, a program to print "Hello!" would be:

    ; Labels are written as a name without whitespace
    ; and a colon at the end.

    main:
       72 out          ; "H"
      101 out          ; "e"
      108 dup out out  ; "ll"
      111 out          ; "o"
       33 out          ; "!"

      ; newline
      '\n' out

      42 outnum     ; print a number
      '\n' out      ; and newline

      ; stop program
      halt

Notice the use of the `HALT` instruction to stop the program.

Multiplication and core library
-------------------------------

I've implemented a multiplication function in the core library in
`tests/core.src`:

    mul:            ; ( a b -- (a*b) )
      mul-res: nop  ; placeholder for result
      mul-cnt: nop  ; placeholder for counter
      mul-num: nop

      &mul-cnt stor ; b to cnt
      dup
      &mul-res stor ; a to res
      &mul-num stor ; and to num

      mul-loop:
        ; calculate res += a
        &mul-res load
        &mul-num load +
        &mul-res stor

        ; decrement counter
        &mul-cnt load
        -1
        &mul-cnt stor

        ; loop until counter is zero
        &mul-cnt load
        &mul-loop swap -1 jnz

      &mul-res load
      popip

    ; ...

    *:        ; alias for mul
      mul
      popip

Note that this function needs definitions for the functions `+` and `-1`.

Recall the program to multiply two numbers.  Put the following in a file
`hey.src`:

    3 2 * outnum
    '\n' out
    halt

If we concatenate the core library with our program, we get:

    $ cat tests/core.src hey.src | ./sm
    6

You could implement the whole program without depending on the core library:

    ; semi-obfuscated multiply and print
    ; does not depend on any libraries

    ; re-inventing the wheel can be very educational!

    main:
      12345 67890 * outnum
      '\n' out
      halt

    ; multiplication function w/inner loop
    *:
      R: nop C: nop N: nop
      &C stor dup &R stor &N stor

      *-loop:
        &R load &N load add &R stor
        1 &C load sub &C stor
        &C load &*-loop swap 1 swap sub jnz

      &R load
      popip

While implementing the Karatsuba algorithm should be quite easy, Toom-Cook
multiplication is left as an exercise for the reader.
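
For comparison, the underlying idea (multiplication by repeated addition)
looks like this in C++.  This is a sketch of the technique rather than a
line-by-line translation of the listings: `mul_by_addition` is a
hypothetical helper, and it assumes `b >= 0` and a result that fits in 32
bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: multiply by repeated addition, assuming b >= 0
// and a result that fits in 32 bits.
int32_t mul_by_addition(int32_t a, int32_t b)
{
  assert(b >= 0);
  int32_t result = 0;

  // Add a to the result b times
  for ( int32_t count = b; count != 0; --count )
    result += a;

  return result;
}
```

The running time grows linearly with `b`, which is the cost that faster
multiplication algorithms avoid.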

It's not a joke
---------------

I think I need to clarify that this project is actually not a joke.  Fun,
absolutely, but not a joke.

I just wanted to create a simple virtual machine and from that I grew a
language.  It's very similar to Forth and PostScript, and we all know those
are extremely powerful --- particularly Forth!

Building stuff yourself is a powerful way of learning.

A Fibonacci program
-------------------

The following is a program to generate and print Fibonacci numbers, taken
from `tests/fib.src`:

    ; Print the well-known Fibonacci sequence
    ;
    ; Our word size is only 32-bits, so we can't
    ; count very far.

    ; Program starts at main, so jump there

    &main jmp

    ; Create label 'count', which refers to this memory
    ; address.
    ;
    ; The NOP (no operation; do nothing) is only used
    ; to reserve memory space for a variable.

    count:
      nop

    ; Initialize the counter by storing 46 at the address of 'count'.
    ;
    ; POPIP will pop the instruction pointer, effectively jumping to
    ; the next location (probably the caller).

    count-init:
      46 &count stor
      popip

    ; Shorthand for loading the number at 'count' onto the top of the stack.
    ;
    ; The "( -- counter)" comment is similar to Forth's comments, explaining
    ; that no number is expected on the stack, and after running this function,
    ; a number ("counter") will be on the stack.

    count-get: ; ( -- counter )
      &count load     ; load number
      popip

    ; Shorthand for decrementing the number on the stack

    dec: ; ( a -- a-1 )
      1 swap sub
      popip

    ; Store top of stack to 'count', do not alter stack

    count-set: ; ( counter -- counter )
      dup &count stor
      popip

    ; Decrement counter and return it

    count-dec: ; ( -- counter )
      count-get dec
      count-set
      popip

    ; Print number with a newline without altering stack

    show: ; ( number -- number )
      dup outnum
      '\n' out
      popip

    ; Duplicate two top-most numbers on stack

    dup2: ; ( a b -- a b a b )
      swap       ; b a
      dup        ; b a a
      rol3       ; a a b
      dup        ; a a b b
      rol3       ; a b b a
      swap       ; a b a b
      popip

    jump-if-nonzero: ; ( dest_address predicate -- )
      swap jnz
      popip

    ; The start of our Fibonacci printing program

    main:
      count-init

      0 show  ; first Fibonacci number
      1       ; second Fibonacci number

      loop:
        ; add top numbers and show
        ; a b -> a b a b -> a b (a + b)
        dup2 add show

        ; decrement, loop if non-zero
        count-dec &loop jump-if-nonzero

Convenience features
--------------------

I've added a `HALT` instruction.  This replaces the old idiom of looping
forever to signal that a program was finished:

    stop: stop      ; form 1
    stop: &stop jmp ; form 2
    halt            ; convenience form

Originally, I left out a halt instruction for the sake of minimalism.

Secondly, I've added a `POPIP` instruction along with automatically storing
the address of the next instruction before performing a jump.  This
effectively lets you call and return from subroutines:

    boot:
      &main jmp halt

    foo: bar: baz:
      '\n' '!' 'e' 'c' 'i' 'u' 'j' 'e' 'l' 't' 'e' 'e' 'B'
      out out out out out out out out out out out out out
      popip

    main:
      foo bar baz

Third, I never bothered to write my own print number function, because it
would require me to write both division and modulus functions in source
first.  So I implemented `OUTNUM` that prints a number to the output:

    123 OUTNUM '\n' OUT ; prints "123\n"
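
To show why division and modulus are needed, here is a small C++ sketch of
printing a number digit by digit.  It is only an illustration: `to_digits`
is a hypothetical helper, whereas the actual `OUTNUM` in `machine.cpp`
simply calls `fprintf` with `%u`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: render n in the given base (2..10) using only
// division and modulus.
std::string to_digits(uint32_t n, uint32_t base = 10)
{
  assert(base >= 2 && base <= 10);
  std::string digits;

  do {
    // Peel off the least significant digit and prepend it
    digits.insert(digits.begin(), static_cast<char>('0' + n % base));
    n /= base;
  } while ( n != 0 );

  return digits;
}
```

For example, `to_digits(123)` produces the string `"123"`.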

Lacking is proper string handling.  One could say that string handling is
not this language's strongest point.

Compiling the project
=====================

To compile and run the examples:

    $ make all check

To see the low-level machine instructions:

    $ ./smr -h

To execute source code on-the-fly:

    $ ./sm filename

To compile source to bytecode:

    $ ./smc filename

The assembly language is not documented other than in code, because I'm
actively playing with it.

Although the interpreter is slow, it should be possible to convert stack
operations to a register machine.  In fact, it should be trivial to compile
programs to native machine code, e.g. x86.

Instruction set
---------------

The instructions are found in `include/instructions.hpp`:

    VALUE       OPCODE  EXPLANATION
    0x00000000  NOP     do nothing
    0x00000001  ADD     pop a, pop b, push a + b
    0x00000002  SUB     pop a, pop b, push a - b
    0x00000003  AND     pop a, pop b, push a & b
    0x00000004  OR      pop a, pop b, push a | b
    0x00000005  XOR     pop a, pop b, push a ^ b
    0x00000006  NOT     pop a, push !a
    0x00000007  IN      read one byte from stdin, push as word on stack
    0x00000008  OUT     pop one word and write to stream as one byte
    0x00000009  LOAD    pop a, push word read from address a
    0x0000000A  STOR    pop a, pop b, write b to address a
    0x0000000B  JMP     pop a, goto a
    0x0000000C  JZ      pop a, pop b, if a == 0 goto b
    0x0000000D  PUSH    push next word
    0x0000000E  DUP     duplicate word on stack
    0x0000000F  SWAP    swap top two words on stack
    0x00000010  ROL3    rotate top three words on stack once left, (a b c) -> (b c a)
    0x00000011  OUTNUM  pop one word and write to stream as number
    0x00000012  JNZ     pop a, pop b, if a != 0 goto b
    0x00000013  DROP    remove top of stack
    0x00000014  PUSHIP  push a in IP stack
    0x00000015  POPIP   pop IP stack to current IP, effectively performing a jump
    0x00000016  DROPIP  pop IP, but do not jump
    0x00000017  COMPL   pop a, push the complement of a

The instruction set could easily be more minimal, even more so if we allowed
registers.  Also, we have taken absolutely no care about the machine code
values for each instruction.  A good design would do something cool with
that.
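
The operand order of `SUB` is easy to get wrong, so here is a short C++
sketch of it on a bare data stack.  It follows the table entry (pop a, pop
b, push a - b); `sub` and `decrement` are hypothetical helpers, not the
machine's own functions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the SUB instruction on a bare data stack:
// pop a (top of stack), pop b, push a - b.
void sub(std::vector<int32_t>& stack)
{
  assert(stack.size() >= 2);
  int32_t a = stack.back(); stack.pop_back();
  int32_t b = stack.back(); stack.pop_back();
  stack.push_back(a - b);
}

// Mirrors "1 <value> SUB": push 1, push the value, subtract.
int32_t decrement(int32_t value)
{
  std::vector<int32_t> stack;
  stack.push_back(1);
  stack.push_back(value);
  sub(stack); // value - 1, because value is on top
  return stack.back();
}
```

This is why the examples write `1 &counter LOAD SUB` to decrement: the
value on top of the stack is the minuend.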

License and author
==================

Placed in the public domain in 2010 by the author, Christian Stigen Larsen
http://csl.sublevel3.org


================================================
FILE: compiler.cpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#include <ctype.h>  // isdigit
#include <stdint.h> // int32_t
#include <stdlib.h>
#include "compiler.hpp"
#include "parser.hpp"
#include "machine.hpp"
#include "label.hpp"
#include "upper.hpp"

void compiler::error(const std::string& s)
{
  if ( callback )
    callback(s.c_str());
}

bool compiler::islabel(const std::string& s)
{
  size_t l = s.length();
  return l<1? false : s[l-1] == ':';
}

bool compiler::iscomment(const std::string& s)
{
  return s[0] == ';';
}

Op compiler::tok2op(const std::string& s)
{
  return from_s(s.c_str());
}

bool compiler::isliteral(const std::string& s)
{
  if ( islabel(s) )
    return false;

  return tok2op(s) == NOP_END;
}

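bool compiler::isnumber(const char* s)
{
  // Cast to unsigned char: passing a negative char to isdigit() is undefined
  while ( *s )
    if ( !isdigit(static_cast<unsigned char>(*s++)) )
      return false;

  return true;
}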

bool compiler::ischar(const std::string& s)
{
  size_t l = s.length();

  if ( l==3 && s[0]=='\'' && s[2]=='\'' && s[1]!='\\' )
    return true;

  if ( l==4 && s[0]=='\'' && s[3]=='\'' && s[1]=='\\' 
            && (s[2]=='t' || s[2]=='r' || s[2]=='n' || s[2]=='0') )
    return true;

  return false;
}

char compiler::to_ord(const std::string& s)
{
  size_t l = s.length();

  if ( l == 3 ) // 'x'
    return s[1];
 
  if ( l == 4 ) // '\x'
    switch ( s[2] ) {
    case 't': return '\t';
    case 'r': return '\r';
    case 'n': return '\n';
    case '0': return '\0';
    }

  error("Unknown character literal: " + s);
  return '\0';
}

bool compiler::islabel_ref(const std::string& s)
{
  return s[0] == '&';
}

int32_t compiler::to_literal(const std::string& s)
{
  if ( isnumber(s.c_str()) )
    return atoi(s.c_str());

  if ( ischar(s) )
    return to_ord(s);

  return -1;
}

bool compiler::ishalt(const std::string& s)
{
  return s.empty() || upper(s)=="HALT";
}

void compiler::check_label_name(const std::string& label)
{
  if ( upper(label) == "HERE" )
    error("Label is reserved: HERE");
}

compiler::compiler(void (*cb)(const char*)) :
  m(cb),
  forwards(),
  callback(cb)
{
}

void compiler::set_error_callback(void (*error_callback)(const char* message))
{
  callback = error_callback;
}

void compiler::compile_label(const std::string& label)
{
  int32_t address = m.get_label_address(label);

  m.load(PUSH);

  // if label not found, mark it for update
  if ( address == -1 ) {
    check_label_name(label);
    forwards.push_back(label_t(label, m.pos()));
  }

  m.load(address);
}

void compiler::compile_function_call(const std::string& function)
{
  // Return address is here plus four instructions
  m.load(PUSHIP); m.load(m.pos() + 4*m.wordsize());

  // Push function destination address -- update it later
  m.load(PUSH);
  forwards.push_back(label_t(function, m.pos()));
  m.load(-1); // just push an arbitrary number

  // Jump to function
  m.load(JMP);

  // This is the return point
}

void compiler::compile_literal(const std::string& token)
{
  if ( islabel_ref(token) ) {
    compile_label(token.substr(1));
    return;
  }

  int32_t literal = to_literal(token);

  // Literals are pushed on to the stack
  if ( literal != -1 ) {
    m.load(PUSH);
    m.load(literal);
    return;
  }

  // Unknown literals are treated as forward function calls
  compile_function_call(token);
}

void compiler::resolve_forwards()
{
  for ( size_t n=0; n<forwards.size(); ++n ) {
    std::string label = forwards[n].name;
    int32_t address = m.get_label_address(label);

    if ( address == -1 )
      error("Code label not found: " + label);

    // update label jump to address
    m.set_mem(forwards[n].pos, address);
  }
}

// Return FALSE when compilation has finished
bool compiler::compile_token(const std::string& s, parser& p)
{
  if ( s.empty() ) {
    m.load_halt();
    resolve_forwards();
    return false;
  }
  else if ( ishalt(s) )    m.load_halt();
  else if ( iscomment(s) ) p.skip_line();
  else if ( isliteral(s) ) compile_literal(s);
  else if ( islabel(s) )   m.addlabel(s.c_str(), m.pos());
  else {
    Op op = tok2op(s);

    if ( op == NOP_END )
      error("Unknown operation: " + s);

    m.load(op);
  }

  return true;
}

machine_t& compiler::get_program()
{
  return m;
}

compiler::compiler(parser& p, void (*fp)(const char*)) :
  m(fp), forwards(), callback(fp)
{
  // Perform complete compilation
  while ( compile_token(p.next_token(), p) )
    ; // loop
}


================================================
FILE: error.cpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#include <stdlib.h>
#include <stdio.h>
#include "error.hpp"

void error(const char* s)
{
  fprintf(stderr, "\n%s\n", s);
  exit(1);
}


================================================
FILE: fileptr.cpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#include <stdexcept>
#include "fileptr.hpp"

fileptr::fileptr(FILE *file) : f(file)
{
  if ( f == NULL )
    throw std::runtime_error("Could not open file");
}

fileptr::~fileptr()
{
  fclose(f);
}

fileptr::operator FILE*() const
{
  return f;
}


================================================
FILE: include/compiler.hpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#include "instructions.hpp"
#include "parser.hpp"
#include "machine.hpp"

#ifndef INC_COMPILER_HPP
#define INC_COMPILER_HPP

class compiler
{
  machine_t m;
  std::vector<label_t> forwards;
  void (*callback)(const char*);

  void error(const std::string& s);
  char to_ord(const std::string& s);
  int32_t to_literal(const std::string& s);
  void check_label_name(const std::string& label);

  static bool islabel(const std::string& s);
  static bool iscomment(const std::string& s);
  static Op tok2op(const std::string& s);
  static bool isliteral(const std::string& s);
  static bool isnumber(const char* s);
  static bool ischar(const std::string& s);
  static bool islabel_ref(const std::string& s);
  static bool ishalt(const std::string& s);

public:
  compiler(void (*error_callback)(const char* message) = NULL);
  compiler(parser& p, void (*error_callback)(const char* message) = NULL);

  void set_error_callback(void (*error_callback)(const char* message));
  void compile_label(const std::string& label);
  void compile_function_call(const std::string& function);
  void compile_literal(const std::string& token);
  void resolve_forwards();
  bool compile_token(const std::string& s, parser& p);
  machine_t& get_program();
};

#endif


================================================
FILE: include/error.hpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

void error(const char* s);


================================================
FILE: include/fileptr.hpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#include <stdio.h>

#ifndef INC_FILEPTR_HPP
#define INC_FILEPTR_HPP

class fileptr {
  FILE* f;
  fileptr(const fileptr&); // deny
  fileptr& operator=(const fileptr&); // deny
public:
  fileptr(FILE *file);
  ~fileptr();
  operator FILE*() const;
};

#endif


================================================
FILE: include/instructions.hpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#ifndef INC_SMCORE_H
#define INC_SMCORE_H

enum Op {
  NOP,  // do nothing
  ADD,  // pop a, pop b, push a + b
  SUB,  // pop a, pop b, push a - b
  AND,  // pop a, pop b, push a & b
  OR,   // pop a, pop b, push a | b
  XOR,  // pop a, pop b, push a ^ b
  NOT,  // pop a, push !a
  IN,   // push one byte read from stream
  OUT,  // pop one word and write to stream as one byte
  LOAD, // pop a, push word read from address a
  STOR, // pop a, pop b, write b to address a
  JMP,  // pop a, goto a
  JZ,   // pop a, pop b, if a == 0 goto b
  PUSH, // push next word
  DUP,  // duplicate word on stack
  SWAP, // swap top two words on stack
  ROL3, // rotate top three words on stack once left, (a b c) -> (b c a)
  OUTNUM, // pop one word and write to stream as number
  JNZ,  // pop a, pop b, if a != 0 goto b
  DROP, // remove top of stack
  PUSHIP, // push a in IP stack
  POPIP,  // pop IP stack to current IP, effectively performing a jump
  DROPIP, // pop IP, but do not jump
  COMPL,  // pop a, push the complement of a
  NOP_END // placeholder for end of enum; MUST BE LAST
};

extern const char* OpStr[];

const char* to_s(Op op);
Op from_s(const char* s);

#endif


================================================
FILE: include/label.hpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#include <stdint.h> // int32_t
#include <stdlib.h>
#include <string>

#ifndef INC_LABEL_HPP
#define INC_LABEL_HPP

struct label_t {
  std::string name;
  int32_t pos;

  label_t(const std::string& name_, int32_t position)
    : name(name_), pos(position)
  {
  }
};

#endif


================================================
FILE: include/machine.hpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#include <stdio.h>
#include <vector>
#include <string>
#include "instructions.hpp"
#include "label.hpp"

#ifndef INC_MACHINE_HPP
#define INC_MACHINE_HPP

class machine_t {
  std::vector<int32_t> stack;
  std::vector<int32_t> stackip;
  std::vector<label_t> labels;
  size_t memsize;
  int32_t *memory;
  int32_t ip; // instruction pointer
  FILE* fin;
  FILE* fout;
  bool running;
  void (*error_cb)(const char*);

public:
  machine_t(void (*error_callback)(const char* msg));
  machine_t(
    const size_t memory_size = 1024*1000/sizeof(int32_t),
    FILE* out = stdout,
    FILE* in  = stdin,
    void (*error_callback)(const char* msg) = NULL);
  machine_t(const machine_t& p, void (*error_callback)(const char* msg) = NULL);
  machine_t& operator=(const machine_t& p);
  ~machine_t();
  void reset();
  void error(const char* s) const;
  void push(const int32_t& n);
  int32_t pop();
  void puship(const int32_t&);
  int32_t popip();
  void check_bounds(int32_t n, const char* msg) const; 
  void next();
  void prev();
  void load(Op);
  void load(int32_t n);
  int run(int32_t start_address = 0);
  void exec(Op);
  int32_t* find_end() const;
  void load_image(FILE* f);
  void save_image(FILE* f) const;
  void load_halt();
  void showstack() const;

  size_t size() const;
  int32_t cur() const;
  int32_t pos() const;

  int32_t get_label_address(const std::string& label) const;
  void addlabel(const char* name, int32_t pos, int lineno = -1);

  bool isrunning() const;
  void set_fout(FILE*);
  void set_fin(FILE*);

  void set_mem(int32_t adr, int32_t val);
  int32_t get_mem(int32_t adr) const;
  int32_t wordsize() const;

  // instructions
  void instr_nop();    
  void instr_add();    
  void instr_sub();    
  void instr_and();    
  void instr_or();     
  void instr_xor();    
  void instr_not();    
  void instr_in();     
  void instr_out();    
  void instr_outnum(); 
  void instr_load();   
  void instr_stor();   
  void instr_jmp();    
  void instr_jz();     
  void instr_drop();   
  void instr_popip();  
  void instr_dropip(); 
  void instr_jnz();    
  void instr_push();   
  void instr_puship();  
  void instr_dup();    
  void instr_swap();   
  void instr_rol3();   
  void instr_compl();
};

#endif


================================================
FILE: include/parser.hpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#include <stdio.h> // FILE
#include <string>

#ifndef INC_PARSER_HPP
#define INC_PARSER_HPP

class parser
{
  FILE* f;
  int lineno;
  int update_lineno(int c);
  int fgetchar();
  void move_back(int c);
  void skip_whitespace();

public:
  parser(FILE* f);
  int get_lineno() const;
  std::string next_token();
  void skip_line();
};

#endif


================================================
FILE: include/upper.hpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#include <string>

std::string upper(const std::string& s);


================================================
FILE: include/version.hpp
================================================
#define VERSION "Public domain, 2010-2011 by Christian Stigen Larsen"


================================================
FILE: instructions.cpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#include <stdio.h>
#include "instructions.hpp"
#include "machine.hpp"
#include "upper.hpp"

const char* OpStr[] = {
  "NOP",
  "ADD",
  "SUB",
  "AND",
  "OR",
  "XOR",
  "NOT",
  "IN",
  "OUT",
  "LOAD",
  "STOR",
  "JMP",
  "JZ",
  "PUSH",
  "DUP",
  "SWAP",
  "ROL3",
  "OUTNUM",
  "JNZ",
  "DROP",
  "PUSHIP",
  "POPIP",
  "DROPIP",
  "COMPL",
  "NOP_END"
};

const char* to_s(Op op)
{
  if ( op >= NOP && op < NOP_END )
    return OpStr[op];

  return "<?>";
}

Op from_s(const char* str)
{
  std::string s(upper(str));

  // slow, O(n/2) seek... :-)
  for ( int n=0; n<NOP_END; ++n )
    if ( s == OpStr[n] )
      return static_cast<Op>(n);

  return NOP_END;
}


================================================
FILE: machine.cpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#include <stdlib.h>
#include <memory.h>
#include "machine.hpp"
#include "label.hpp"
#include "upper.hpp"

machine_t::machine_t(
  const machine_t& p,
  void (*error_callback)(const char*))
:
  stack(p.stack),
  stackip(p.stackip),
  labels(p.labels),
  memsize(p.memsize),
  memory(new int32_t[p.memsize]),
  ip(p.ip),
  fin(p.fin),
  fout(p.fout),
  running(p.running),
  error_cb(error_callback)
{
  memmove(memory, p.memory, memsize*sizeof(int32_t));
}

machine_t::machine_t(const size_t memory_size,
  FILE* out,
  FILE* in,
  void (*error_callback)(const char*))
:
  stack(),
  stackip(),
  labels(),
  memsize(memory_size),
  memory(new int32_t[memory_size]),
  ip(0),
  fin(in),
  fout(out),
  running(true),
  error_cb(error_callback)
{
  reset();
}

machine_t::machine_t(void (*error_callback)(const char*))
:
  stack(),
  stackip(),
  labels(),
  memsize(1000*1024*sizeof(int32_t)),
  memory(new int32_t[memsize]),
  ip(0),
  fin(stdin),
  fout(stdout),
  running(true),
  error_cb(error_callback)
{
  reset();
}

machine_t& machine_t::operator=(const machine_t& p)
{
  if ( &p == this )
    return *this;

  delete[](memory);

  stack = p.stack;
  stackip = p.stackip;
  labels = p.labels;
  memsize = p.memsize;
  memory = new int32_t[memsize];
  memcpy(memory, p.memory, memsize*sizeof(int32_t));
  ip = p.ip;
  fin = p.fin;
  fout = p.fout;
  running = p.running;
  error_cb = p.error_cb;

  return *this;
}

void machine_t::reset()
{
  memset(memory, NOP, memsize*sizeof(int32_t));
  stack.clear();
  ip = 0;
}

machine_t::~machine_t()
{
  delete[](memory);
}

void machine_t::error(const char* s) const
{
  if ( error_cb )
    error_cb(s);
}

void machine_t::push(const int32_t& n)
{
  stack.push_back(n);
}

void machine_t::puship(const int32_t& n)
{
  stackip.push_back(n);
}

int32_t machine_t::popip()
{
  if ( stackip.empty() ) {
    error("POP empty IP stack");
    return 0;
  }

  int32_t n = stackip.back();
  stackip.pop_back();
  return n;
}

int32_t machine_t::pop()
{
  if ( stack.empty() ) {
    error("POP empty stack");
    return 0; // avoid reading from an empty vector if the callback returns
  }

  int32_t n = stack.back();
  stack.pop_back();
  return n;
}

void machine_t::check_bounds(int32_t n, const char* msg) const
{
  if ( n < 0 || static_cast<size_t>(n) >= memsize )
    error(msg);
}

void machine_t::next()
{
  ip += sizeof(int32_t);

  if ( ip < 0 )
    error("IP < 0");

  if ( static_cast<size_t>(ip) >= memsize )
    ip = 0; // TODO: Halt instead of wrap-around?
}

void machine_t::prev()
{
  if ( ip == 0 )
    error("prev() reached zero");

  ip -= sizeof(int32_t);
}

void machine_t::load(Op op)
{
  memory[ip] = op;
  next();
}

void machine_t::load(int32_t n)
{
  memory[ip] = n;
  next();
}

int machine_t::run(int32_t start_address)
{
  ip = start_address;

  while(running)
    exec(static_cast<Op>(memory[ip]));

  return 0; // TODO: exit-code ?
}

void machine_t::instr_nop()
{
  next();
}

void machine_t::instr_add()
{
  push(pop() + pop());
  next();
}

void machine_t::instr_sub()
{
  /*
   * This operation is not primitive.  It can
   * be implemented by adding the minuend to
   * the two's complement of the subtrahend:
   *
   * SUB: ; ( a b -- (b-a))
   *   swap  ; b a
   *   compl ; b ~a
   *   1 add ; b (~a+1), or b -a
   *   add   ; b-a
   *   popip
   *
   * The problem is that IF the underlying
   * architecture does not use two's complement
   * to represent negative values, stuff like
   * printing will fail miserably (at least in
   * the current implementation on top of C).
   */

  // TODO: Consider reversing the operands for SUB
  //       (it's currently unnatural)

  int32_t tos = pop();
  push(tos - pop());
  next();
}
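
// A minimal sketch of the identity described in the comment inside
// instr_sub(): b - a equals b + (~a + 1) on a two's complement machine.
// The helper names sub_via_complement, sub_direct and check_sub_identity
// are hypothetical and not part of this repository. The arithmetic is done
// on uint32_t so that wrap-around is well defined; converting the result
// back to int32_t assumes a two's complement target (guaranteed since C++20).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes b - a as b + (~a + 1), mirroring the
// "swap compl 1 add add" sequence from the comment in instr_sub().
int32_t sub_via_complement(int32_t a, int32_t b)
{
  const uint32_t ua = static_cast<uint32_t>(a);
  const uint32_t ub = static_cast<uint32_t>(b);
  return static_cast<int32_t>(ub + (~ua + 1u));
}

// Plain subtraction with the same wrap-around behaviour, for comparison.
int32_t sub_direct(int32_t a, int32_t b)
{
  return static_cast<int32_t>(static_cast<uint32_t>(b) -
                              static_cast<uint32_t>(a));
}

// Checks that both routes agree for a few sample operands.
void check_sub_identity()
{
  assert(sub_via_complement(3, 10) == sub_direct(3, 10));
  assert(sub_via_complement(10, 3) == sub_direct(10, 3));
  assert(sub_via_complement(-5, 2) == sub_direct(-5, 2));
}
```

// Note the operand order: with a below b on the stack (b is the top),
// both helpers return b - a, matching the ( a b -- (b-a) ) stack comment.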

void machine_t::instr_and()
{
  push(pop() & pop());
  next();
}

void machine_t::instr_or()
{
  push(pop() | pop());
  next();
}

void machine_t::instr_xor()
{
  push(pop() ^ pop());
  next();
}

void machine_t::instr_not()
{
  // TODO: this probably does not work as intended
  push(!pop());
  next();
}

void machine_t::instr_compl()
{
  push(~pop());
  next();
}

void machine_t::instr_in()
{
  /*
   * The IN/OUT functions should be implemented
   * using something akin to x86 INT or SYSCALL or
   * similar.  E.g.:
   *
   * 123 SYSCALL ; exec system call 123
   *
   */
  push(getc(fin));
  next();
}

void machine_t::instr_out()
{
  putc(pop(), fout);
  fflush(fout);
  next();
}

void machine_t::instr_outnum()
{
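  // Numbers are printed as unsigned 32-bit values; the cast makes the
  // argument type match the %u conversion.
  fprintf(fout, "%u", static_cast<unsigned int>(pop()));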
  next();
}

void machine_t::instr_load()
{
  int32_t a = pop();
  check_bounds(a, "LOAD");
  push(memory[a]);
  next();
}

void machine_t::instr_stor()
{
  int32_t a = pop();
  check_bounds(a, "STOR");
  memory[a] = pop();
  next();
}

void machine_t::instr_jmp()
{
  /*
   * This function is not primitive.
   * If we have e.g. JZ, we can always
   * do "0 JZ" to perform the jump.
   *
   * (Note that this will break the
   * HALT-idiom)
   *
   */

  // TODO: Implement as library function

  //push(0);
  //instr_jz();

  int32_t a = pop();
  check_bounds(a, "JMP");  

  // check if we are halting, i.e. jumping to current
  // address -- if so, quit
  if ( a == ip )
    running = false;
  else
    ip = a;
}

void machine_t::instr_jz()
{
  int32_t a = pop();
  int32_t b = pop();

  if ( a != 0 )
    next();
  else {
    check_bounds(b, "JZ");
    ip = b; // perform jump
  }
}

void machine_t::instr_drop()
{
  pop();
  next();
}

void machine_t::instr_popip()
{
  int32_t a = popip();
  check_bounds(a, "POPIP");
  ip = a;
}

void machine_t::instr_dropip()
{
  popip();
  next();
}

void machine_t::instr_jnz()
{
  /*
   * Only one of JNZ and JZ is needed as
   * a primitive -- one can be implemented
   * in terms of the other with a negation
   * of the TOS.
   *
   * (Note that this will break the HALT-idiom)
   */

  /*
  instr_puship();
  instr_compl();
  instr_popip();
  instr_jz();
  */

  int32_t a = pop();
  int32_t b = pop();

  if ( a == 0 )
    next();
  else {
    check_bounds(b, "JNZ");
    ip = b; // jump
  }
}

void machine_t::instr_push()
{
  next();
  push(memory[ip]);
  next();
}

void machine_t::instr_puship()
{
  next();
  puship(memory[ip]);
  next();
}

void machine_t::instr_dup()
{
  /*
   * This function is not primitive.
   * It can be replaced with a "function":
   *
   * ; ( a -- a a )
   * dup:  nop       ; placeholder <- nop
   *       &dup stor ; placeholder <- a
   *       &dup load ; tos <- a
   *       &dup load ; tos <- a
   *       popip
   */

  // TODO: Implement as library function

  int32_t a = pop();
  push(a);
  push(a);
  next();
}

void machine_t::instr_swap()
{
  /*
   * This function is not primitive.
   * It can be replaced with a "function",
   * something like:
   *
   * ; ( a b -- b a )
   * swap:
   *   swap-b: nop  ; placeholder
   *   swap-a: nop  ; placeholder
   *   &swap-b stor ; swap-b <- b
   *   &swap-a stor ; swap-a <- a
   *   &swap-b load ; tos <- b
   *   &swap-a load ; tos <- a
   *   popip
   *
   */

  // TODO: Implement as library function

  // a, b -- b, a
  int32_t b = pop();
  int32_t a = pop();
  push(b);
  push(a);
  next();
}

void machine_t::instr_rol3()
{
  /*
   * This function is not primitive.
   * It can be replaced with "functions",
   * something like:
   *
   * rol3:
   *   rol3-var: nop  ; stack = a b c
   *   &rol3-var stor ; stack = a b, var = c
   *   swap           ; stack = b a, var = c
   *   &rol3-var load ; stack = b a c
   *   swap           ; stack = b c a
   *   popip
   *
   */

  // TODO: Implement as library function

  // abc -> bca
  int32_t c = pop(); // TOS
  int32_t b = pop();
  int32_t a = pop();
  push(b);
  push(c);
  push(a);
  next();
}
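
// A small sketch of the stack effects of DUP, SWAP and ROL3, assuming a
// plain std::vector as the data stack. The names Stack, pop_value, op_dup,
// op_swap, op_rol3 and op_dup2 are hypothetical; op_dup2 replays the
// swap/dup/rol3/dup/rol3/swap sequence that tests/fib.src uses for dup2.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef std::vector<int32_t> Stack;

// Pops and returns the top of the stack; the stack must not be empty.
int32_t pop_value(Stack& s)
{
  assert(!s.empty());
  const int32_t n = s.back();
  s.pop_back();
  return n;
}

// ( a -- a a )
void op_dup(Stack& s)
{
  const int32_t a = pop_value(s);
  s.push_back(a);
  s.push_back(a);
}

// ( a b -- b a )
void op_swap(Stack& s)
{
  const int32_t b = pop_value(s);
  const int32_t a = pop_value(s);
  s.push_back(b);
  s.push_back(a);
}

// ( a b c -- b c a )
void op_rol3(Stack& s)
{
  const int32_t c = pop_value(s);
  const int32_t b = pop_value(s);
  const int32_t a = pop_value(s);
  s.push_back(b);
  s.push_back(c);
  s.push_back(a);
}

// ( a b -- a b a b ), composed from the three operations above.
void op_dup2(Stack& s)
{
  op_swap(s); // b a
  op_dup(s);  // b a a
  op_rol3(s); // a a b
  op_dup(s);  // a a b b
  op_rol3(s); // a b b a
  op_swap(s); // a b a b
}
```

// Tracing op_dup2 by hand is a quick way to confirm that ROL3 rotates the
// third element to the top rather than the top element to the bottom.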

void machine_t::exec(Op operation)
{
  switch(operation) {
  default:     error("Unknown instruction"); break;
  case NOP:    instr_nop();    break;

  // Strictly speaking, SUB can be implemented
  // by ADDing the minuend with the two's complement
  // of the subtrahend -- but that's not necessarily
  // portable down to native code

  case ADD:    instr_add();    break;
  case SUB:    instr_sub();    break; // non-primitive

  // Strictly speaking, all but NOT and AND are
  // non-primitive (or some other combination of
  // two operations)

  case AND:    instr_and();    break;
  case OR:     instr_or();     break;
  case XOR:    instr_xor();    break;
  case NOT:    instr_not();    break;
  case COMPL:  instr_compl();  break;

  // Should be replaced with x86 INT-like operations

  case IN:     instr_in();     break;
  case OUT:    instr_out();    break;

  case LOAD:   instr_load();   break;   
  case STOR:   instr_stor();   break;   

  case PUSH:   instr_push();   break;   
  case DROP:   instr_drop();   break;   

  case PUSHIP: instr_puship(); break; 
  case POPIP:  instr_popip();  break;  
  case DROPIP: instr_dropip(); break; 

  case JZ:     instr_jz();     break;     
  case JMP:    instr_jmp();    break; // non-primitive
  case JNZ:    instr_jnz();    break; // non-primitive
  case DUP:    instr_dup();    break; // non-primitive
  case SWAP:   instr_swap();   break; // non-primitive 
  case ROL3:   instr_rol3();   break; // non-primitive
  case OUTNUM: instr_outnum(); break; // non-primitive
  }
}

int32_t* machine_t::find_end() const
{
  // find end of program by scanning
  // backwards until non-NOP is found
  int32_t *p = &memory[memsize-1];
  // stop at the first word so an all-NOP memory cannot underrun the buffer
  while ( p > memory && *p == NOP ) --p;
  return p;
}

void machine_t::load_image(FILE* f)
{
  reset();

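  // Loop on the result of fread(): feof() only becomes true after a
  // failed read, so testing it first would load one spurious NOP.
  Op op = NOP;
  while ( fread(&op, sizeof(Op), 1, f) == 1 )
    load(op);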

  ip = 0;
}

void machine_t::save_image(FILE* f) const
{
  int32_t *start = memory;
  int32_t *end = find_end() + sizeof(int32_t);

  while ( start != end ) {
    fwrite(start, sizeof(Op), 1, f);
    start += sizeof(int32_t);
  }
}

void machine_t::load_halt()
{
  load(PUSH);
  load(ip + sizeof(int32_t));
  load(JMP);
}
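
// A sketch of the memory layout that load_halt() produces, assuming the
// instruction pointer advances by sizeof(int32_t) per word as in next().
// The names WORD, halt_layout, layout_halt and jmp_halts are hypothetical.
// The pushed operand is the address of the JMP itself, which is what makes
// instr_jmp() stop the machine (it halts when the target equals ip).

```cpp
#include <cassert>
#include <cstdint>

// Address step between consecutive words, as used by machine_t::next().
const int32_t WORD = static_cast<int32_t>(sizeof(int32_t));

// Addresses and operand value of a halt sequence starting at some ip.
struct halt_layout {
  int32_t push_at;    // address of the PUSH opcode
  int32_t operand_at; // address of the PUSH operand
  int32_t operand;    // value pushed: the address of JMP
  int32_t jmp_at;     // address of the JMP opcode
};

// Computes where each word of the halt idiom ends up.
halt_layout layout_halt(int32_t ip)
{
  halt_layout h;
  h.push_at = ip;
  h.operand_at = ip + WORD;
  h.jmp_at = ip + 2 * WORD;
  h.operand = h.jmp_at;
  return h;
}

// Mirrors the test in instr_jmp(): a jump to the current address halts.
bool jmp_halts(int32_t target, int32_t jmp_address)
{
  return target == jmp_address;
}
```

// For a program that starts with the halt idiom at address 0, this gives
// PUSH at 0x0, its operand 0x8 at 0x4, and JMP at 0x8.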

size_t machine_t::size() const
{
  return find_end() - &memory[0];
}

int32_t machine_t::cur() const
{
  return memory[ip];
}

int32_t machine_t::pos() const
{
  return ip;
}

void machine_t::addlabel(const char* name, int32_t pos, int)
{
  std::string n = upper(name);

  if ( n.empty() )
    error("Empty label");
  else {
    n.erase(n.length()-1, 1); // remove ":"
    labels.push_back(label_t(n.c_str(), pos));
  }
}

int32_t machine_t::get_label_address(const std::string& s) const
{
  std::string p(upper(s));

  // special label address "here" returns current position
  if ( p == "HERE" )
    return ip;

  for ( size_t n=0; n < labels.size(); ++n )
    if ( upper(labels[n].name.c_str()) == p )
      return labels[n].pos;
  
  return -1; // not found
}

bool machine_t::isrunning() const
{
  return running;
}

void machine_t::set_fout(FILE* f)
{
  fout = f;
}

void machine_t::set_fin(FILE* f)
{
  fin = f;
}

void machine_t::set_mem(int32_t adr, int32_t val)
{
  check_bounds(adr, "set_mem out of bounds");
  memory[adr] = val;
}

int32_t machine_t::get_mem(int32_t adr) const
{
  check_bounds(adr, "get_mem out of bounds");
  return memory[adr];
}

int32_t machine_t::wordsize() const
{
  return sizeof(int32_t);
}


================================================
FILE: parser.cpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#include <stdio.h>
#include <ctype.h>
#include "parser.hpp"

int parser::update_lineno(int c)
{
  if ( c == '\n' )
    ++lineno;

  return c;
}

int parser::fgetchar()
{
  return update_lineno(fgetc(f));
}

void parser::move_back(int c)
{
  if ( c == '\n' )
    --lineno;

  ungetc(c, f);
}

void parser::skip_whitespace()
{
  int c;
  while ( (c = fgetchar()) != EOF && isspace(c) )
    ;
  move_back(c);
}

parser::parser(FILE* file) :
  f(file),
  lineno(1)
{
}

int parser::get_lineno() const
{
  return lineno;
}

std::string parser::next_token()
{
  int c;
  std::string s;

  skip_whitespace();

  while ( (c = fgetchar()) != EOF && !isspace(c) )
      s += c;

  return s;
}

void parser::skip_line()
{
  int c;
  while ( (c = fgetchar()) != EOF && c != '\n' )
    ;
}


================================================
FILE: sm.cpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 * Synopsis:  Compile and run code on-the-fly.
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdexcept>
#include "instructions.hpp"
#include "fileptr.hpp"
#include "compiler.hpp"
#include "error.hpp"
#include "upper.hpp"

void compile_and_run(FILE* f)
{
  parser p(f);
  compiler c(p, error);
  c.get_program().run();
}

void help()
{
  printf("Usage: sm [ file(s) ]\n");
  printf("Compiles and runs source files on the fly.\n\n");
  exit(1);
}

int main(int argc, char** argv)
{
  try {
    if ( argc == 1 ) // by default, read standard input
      compile_and_run(stdin);
  
    for ( int n=1; n<argc; ++n )
      if ( argv[n][0]=='-' ) {
        if ( argv[n][1] == '\0' )
          compile_and_run(stdin);
        else
          help();
      } else
        compile_and_run(fileptr(fopen(argv[n], "rt")));

    return 0;
  }
  catch(const std::exception& e) {
    error(e.what());
  }
}


================================================
FILE: smc.cpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 * Synopsis:  Compile source to bytecode.
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdexcept>
#include <string>
#include "version.hpp"
#include "instructions.hpp"
#include "fileptr.hpp"
#include "compiler.hpp"
#include "error.hpp"

const char* file = "";
parser *p = NULL;

// Return the filename without its last extension,
// i.e. the '<this part>' of '<this part>.<ext>'
static std::string sbasename(const std::string& s)
{
  using namespace std;
  const string::size_type p = s.rfind('.');
  return p == string::npos ? s : s.substr(0, p);
}
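
// A short sketch of how the rfind('.') approach above behaves on a few
// inputs. strip_extension is a hypothetical standalone copy of the same
// logic, written only for illustration. Because the search covers the
// whole string, a dot in a directory name or a leading "./" also counts.

```cpp
#include <cassert>
#include <string>

// Hypothetical copy of the sbasename() logic: cut at the last '.'.
std::string strip_extension(const std::string& s)
{
  const std::string::size_type p = s.rfind('.');
  return p == std::string::npos ? s : s.substr(0, p);
}

// Exercises the typical case and the edge cases mentioned above.
void check_strip_extension()
{
  assert(strip_extension("tests/fib.src") == "tests/fib");
  assert(strip_extension("noext") == "noext");
  assert(strip_extension("a.b.c") == "a.b");
  assert(strip_extension("dir.d/file") == "dir"); // dot in directory name
  assert(strip_extension("./file") == "");        // leading "./"
}
```

// One possible refinement, not taken from this repository, would be to
// search for the dot only after the last path separator.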

static void compile_error(const char* msg)
{
  fprintf(stderr, "%s:%d:%s\n", file, p->get_lineno(), msg);
  exit(1);
}

void compile(FILE* f, const std::string& out)
{
  delete(p);
  p = new parser(f);
  compiler c(*p, compile_error);
  c.get_program().save_image( fileptr(fopen(out.c_str(), "wb")));
}

int main(int argc, char** argv)
{
  try {
    if ( argc < 2 )
      error("Usage: smc [ filename(s) | - ]\n" VERSION);

    for ( int n=1; n<argc; ++n ) {
      if ( !strcmp(argv[n], "-") ) {
        file = "<stdin>";
        compile(stdin, "out.sm");
      } else {
        file = argv[n];
        compile(fileptr(fopen(argv[n], "rt")),
                sbasename(argv[n]) + ".sm");
      }
    }

    return 0;
  }
  catch(const std::exception& e) {
    error(e.what());
  }
}


================================================
FILE: smd.cpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 * Synopsis:  Disassemble bytecode.
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include <stdexcept>
#include "instructions.hpp"
#include "machine.hpp"
#include "fileptr.hpp"
#include "error.hpp"

static bool isprintable(int c)
{
  return (c>=32 && c<=126) // 127 is DEL, which is not printable
    || c=='\n'
    || c=='\r'
    || c=='\t';
}

static const char* to_s(char c)
{
  static char buf[2];
  buf[0] = c;
  buf[1] = '\0';

  switch ( c ) {
  default: return buf;
  case '\t': return "\\t";
  case '\n': return "\\n";
  case '\r': return "\\r";
  }
}

static void disassemble(machine_t &m)
{
  int32_t end = m.size();

  while ( m.pos() <= end ) {
    Op op = static_cast<Op>(m.cur());
    printf("0x%x %s", m.pos(), to_s(op));

    if ( (op==PUSH || op==PUSHIP) && m.pos()<=end ) {
        m.next();
        printf(" 0x%x", m.cur());

        if ( isprintable(m.cur()) )
          printf(" ('%s')", to_s(m.cur()));
    }

    printf("\n");
    m.next();
  }
}

int help()
{
  printf("Usage: smd [ file(s) ]\n\n");
  printf("Disassembles compiled bytecode files.\n");
  exit(1);
}

int main(int argc, char** argv)
{
  try {
    for ( int n=1; n<argc; ++n ) {
      if ( argv[n][0] == '-' ) {
        if ( argv[n][1] != '\0' )
          help();
        continue;
      }

      machine_t m;
      m.load_image(fileptr(fopen(argv[n], "rb")));
      printf("; File %s --- %lu bytes\n", argv[n],
             static_cast<unsigned long>(m.size()));
      disassemble(m);
    }
    return 0;
  }
  catch(const std::exception& e) {
    error(e.what());
  }
}


================================================
FILE: smr.cpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 * Synopsis:  Run compiled bytecode.
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include "version.hpp"
#include "instructions.hpp"
#include "machine.hpp"
#include "fileptr.hpp"

static void help()
{
  printf("smr -- stack-machine run\n");
  printf("%s\n\n", VERSION);

  printf("Opcodes:\n\n");

  Op op=NOP; 
  do {
    printf("0x%x = %s\n", op, to_s(op));
    op = static_cast<Op>(op+1);
  } while ( op != NOP_END );

  printf("\nTo halt program, jump to current position:\n\n");
  // PUSH occupies two words (opcode and operand), so JMP sits two words in
  printf("0x0 PUSH 0x%x\n", (unsigned int)(2*sizeof(int32_t)));
  printf("0x%x JMP\n\n", (unsigned int)(2*sizeof(int32_t)));
  printf("Word size is %lu bytes\n", (unsigned long)sizeof(int32_t));

  exit(0);
}

int main(int argc, char** argv)
{
  try {
    bool found_file = false;

    for ( int n=1; n<argc; ++n ) {
      if ( argv[n][0] == '-' ) {
        help();
        continue;
      }
      
      found_file = true;
      machine_t m;
      m.load_image(fileptr(fopen(argv[n], "rb")));
      m.run();
    }

    if ( !found_file ) {
      machine_t m;
      m.load_image(stdin);
      m.run();
    }

    return 0;
  }
  catch(const std::exception& e) {
    fprintf(stderr, "%s\n", e.what());
    return 1;
  }
}


================================================
FILE: tests/core-test.src
================================================
; To run this example using the core.src library:
;
;   cat tests/core-test.src tests/core.src | ./sm
;

1 outnum '+' out
2 outnum '+' out
3 outnum '+' out
4 outnum '+' out
5 outnum '=' out

1 _dup +1 3 4 5
             +
           +
         +
       +
outnum ; should be 15
'\n' out

6 outnum '*' out
7 outnum '=' out
6 7 * outnum ; should be 42
'\n' out

halt


================================================
FILE: tests/core.src
================================================
; Some helpful functions that can be implemented
; by primitives

; only parse this code, don't execute it
&inc-core jmp

_nop:  ; do nothing
  popip

_jnz:
  not jz
  popip

_jmp:
  0 jz

_swap:          ; ( a b -- b a)
  swap-a: nop   ; placeholder for a
  swap-b: nop   ; placeholder for b
  &swap-b stor  ; swap-b <- b
  &swap-a stor  ; swap-a <- a
  &swap-b load  ; tos <- b
  &swap-a load  ; tos <- a
  popip

_dup:           ; ( a -- a a )
  nop           ; placeholder <- nop
  &_dup stor ; placeholder <- a
  &_dup load ; tos <- a
  &_dup load ; tos <- a
  popip

_sub-two's-complementB: ; ( a b -- (b-a))
  swap  ; b a
  compl ; b ~a
  1 add ; b (~a+1), or b -a
  add   ; b-a
  popip

_rol3: ; ( a b c -- b c a )
  rol3-var: nop  ; stack = a b c
  &rol3-var stor ; stack = a b, var = c
  swap           ; stack = b a, var = c
  &rol3-var load ; stack = b a c
  swap           ; stack = b c a
  popip

+:              ; ( a b -- (a+b) )
  add           ; 32-bit native addition
  popip

-:              ; ( a b -- (b-a) )
  sub           ; 32-bit native subtraction
  popip

*:              ; ( a b -- (a*b) )
  mul           ; calls the mul routine below (no native MUL opcode)
  popip

-1:             ; ( a -- (a-1))
  1 swap sub
  popip

+1:             ; ( a -- (a+1))
  1 add
  popip

mul:            ; ( a b -- (a*b) )
  mul-res: nop  ; placeholder for result
  mul-cnt: nop  ; placeholder for counter
  mul-num: nop

  &mul-cnt stor ; b to cnt
  dup
  &mul-res stor ; a to res
  &mul-num stor ; and to num

  mul-loop:
    ; calculate res += a
    &mul-res load
    &mul-num load +
    &mul-res stor

    ; decrement counter
    &mul-cnt load
    -1
    &mul-cnt stor

    ; loop until counter is zero
    &mul-cnt load
    &mul-loop swap -1 jnz

  &mul-res load
  popip

&:              ; ( a b -- (a AND b) )
  and           ; 32-bit bitwise AND operation
  popip

|:              ; ( a b -- (a OR b) )
  or            ; 32-bit bitwise OR operation
  popip

^:              ; ( a b -- (a XOR b) )
  xor           ; 32-bit bitwise EXCLUSIVE OR operation
  popip

~:              ; ( a -- complement of a )
  compl         ; 32-bit bitwise complement (NOT is a logical negation)
  popip

<:              ; ( -- a )
  in            ; read one byte from the input stream into the low 8 bits
  popip

>:              ; ( a -- )
  out           ; write the low 8 bits of the value to the output stream
  popip

@:              ; ( address -- value at address )
  load
  popip

inc-core:
  nop


================================================
FILE: tests/fib.src
================================================
; Print the well-known Fibonacci sequence
;
; Our word size is only 32 bits, so we can't
; count very far.

; Program starts at main, so jump there

&main jmp

; Create label 'count', which refers to this memory
; address.
;
; The NOP (no operation; do nothing) is only used
; to reserve memory space for a variable.

count:
  nop

; Initialize the counter by storing 46 at the address of 'count'.
;
; POPIP will pop the instruction pointer, effectively jumping to
; the next location (probably the caller).

count-init:
  46 &count stor
  popip

; Shorthand for loading the number at 'count' onto the top of the stack.
;
; The "( -- counter)" comment is similar to Forth's comments, explaining
; that no number is expected on the stack, and after running this function,
; a number ("counter") will be on the stack.

count-get: ; ( -- counter )
  &count load     ; load number
  popip

; Shorthand for decrementing the number on the stack

dec: ; ( a -- a-1 )
  1 swap sub
  popip

; Store top of stack to 'count', do not alter stack

count-set: ; ( counter -- counter )
  dup &count stor
  popip

; Decrement counter and return it

count-dec: ; ( -- counter )
  count-get dec
  count-set
  popip

; Print number with a newline without altering stack

show: ; ( number -- number )
  dup outnum
  '\n' out
  popip

; Duplicate two top-most numbers on stack

dup2: ; ( a b -- a b a b )
  swap       ; b a
  dup        ; b a a
  rol3       ; a a b
  dup        ; a a b b
  rol3       ; a b b a
  swap       ; a b a b
  popip

jump-if-nonzero: ; ( predicate dest_address -- )
  swap jnz
  popip
  
; The start of our Fibonacci printing program

main:
  count-init

  0 show  ; first Fibonacci number
  1       ; second Fibonacci number

  loop:
    ; add top numbers and show
    ; a b -> a b a b -> a b (a + b)
    dup2 add show

    ; decrement, loop if non-zero
    count-dec &loop jump-if-nonzero


================================================
FILE: tests/forward-goto.src
================================================
; simple test of forward labels

start:
  &cause jmp

effect:
  'e' out
  'f' out
  'f' out
  'e' out
  'c' out
  't' out
  '\n' out
  halt

cause:
  'c' out
  'a' out
  'u' out
  's' out
  'e' out
  32 out
  '-' out
  '>' out
  32 out
  &effect
  jmp


================================================
FILE: tests/func.src
================================================
; program starts at main program
  main
  '4' out '\n' out
  halt

three:
  '3' out '\n' out
  popip

main:
  one
  two
  three
  popip

one:
  '1' out '\n' out
  popip

two:
  '2' out '\n' out
  popip


================================================
FILE: tests/hello.src
================================================
; Labels are written as a name without whitespace
; and a colon at the end

main:
   72 out          ; "H"
  101 out          ; "e"
  108 dup out out  ; "ll"
  111 out          ; "o"
   33 out          ; "!"

  ; newline
  10 13 out out

  42 outnum     ; print a number
  10 13 out out ; ... and CRLF

  ; stop program
  halt


================================================
FILE: tests/todo-print.src
================================================
; This is a suggestion for a new "embed" keyword,
; as well as support for parsing strings.

&main jmp

msg: embed "Hello, world!\n"
num: embed 40

printstr:         ; ( adr -- )
  print-src: nop  ; placeholder
  &print-src stor ; store src address to print-src

  print-loop:
    &print-src load        ; get ptr
    dup +1 &print-src stor ; save ptr + 1
    load                   ; get char
    dup &print-exit jz     ; stop if '\0'
    out                    ; print character
    &print-loop jmp        ; loop

  print-exit: popip

main:
  &msg printstr                   ; print "Hello, world!\n"
  &num load +1 +1 outnum '\n' out ; print "42"


================================================
FILE: tests/yo.src
================================================
; simple code to demonstrate compile-and-run
'y' out
'o' out
'!' out 
'\n' out


================================================
FILE: upper.cpp
================================================
/*
 * Made in 2010 by Christian Stigen Larsen
 * http://csl.sublevel3.org
 *
 * Placed in the public domain by the author.
 *
 */

#include <ctype.h>
#include "upper.hpp"

std::string upper(const std::string& s)
{
  std::string r(s);

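  // toupper() requires a value representable as unsigned char
  for ( std::string::size_type n=0; n<r.length(); ++n )
    r[n] = static_cast<char>(toupper(static_cast<unsigned char>(r[n])));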

  return r;
}

SYMBOL INDEX (26 symbols across 15 files)

FILE: compiler.cpp
  function Op (line 33) | Op compiler::tok2op(const std::string& s)
  function machine_t (line 216) | machine_t& compiler::get_program()

FILE: error.cpp
  function error (line 13) | void error(const char* s)

FILE: include/compiler.hpp
  class compiler (line 16) | class compiler

FILE: include/fileptr.hpp
  class fileptr (line 14) | class fileptr {

FILE: include/instructions.hpp
  type Op (line 12) | enum Op {

FILE: include/label.hpp
  type label_t (line 15) | struct label_t {
    method label_t (line 19) | label_t(const std::string& name_, int32_t position)

FILE: include/machine.hpp
  class machine_t (line 18) | class machine_t {

FILE: include/parser.hpp
  class parser (line 14) | class parser

FILE: instructions.cpp
  function Op (line 50) | Op from_s(const char* str)

FILE: machine.cpp
  function machine_t (line 68) | machine_t& machine_t::operator=(const machine_t& p)

FILE: sm.cpp
  function compile_and_run (line 19) | void compile_and_run(FILE* f)
  function help (line 26) | void help()
  function main (line 33) | int main(int argc, char** argv)

FILE: smc.cpp
  function sbasename (line 24) | static std::string sbasename(const std::string& s)
  function compile_error (line 31) | static void compile_error(const char* msg)
  function compile (line 37) | void compile(FILE* f, const std::string& out)
  function main (line 45) | int main(int argc, char** argv)

FILE: smd.cpp
  function isprintable (line 17) | static bool isprintable(int c)
  function disassemble (line 39) | static void disassemble(machine_t &m)
  function help (line 60) | int help()
  function main (line 67) | int main(int argc, char** argv)

FILE: smr.cpp
  function help (line 18) | static void help()
  function main (line 39) | int main(int argc, char** argv)

FILE: upper.cpp
  function upper (line 12) | std::string upper(const std::string& s)

