[
  {
    "path": ".gitignore",
    "content": "*.o\n*.sm\nsm\nsmc\nsmd\nsmr\n"
  },
  {
    "path": "Makefile",
    "content": "CXXFLAGS = -g -W -Wall -Weffc++ -Iinclude\nLINK.o = $(LINK.cc)\n\nTARGETS = instructions.o parser.o error.o upper.o fileptr.o machine.o compiler.o sm.o smr.o smc.o smd.o sm smr smc smd\n\nall: $(TARGETS)\n\t@echo Run \\\"make check\\\" to test package\n\n%.sm: tests/%.src\n\t./smc $<\n\nsmr: instructions.o machine.o upper.o fileptr.o smr.o\n\nsmc: instructions.o machine.o upper.o error.o fileptr.o parser.o compiler.o smc.o\n\nsmd: instructions.o machine.o upper.o error.o fileptr.o smd.o\n\nsm: instructions.o machine.o upper.o error.o fileptr.o parser.o compiler.o sm.o\n\ncheck: all\n\t./sm tests/fib.src\n\t./smc tests/fib.src\n\t./smr tests/fib.sm\n\t./smc tests/hello.src\n\t./smr tests/hello.sm\n\t./smc tests/forward-goto.src\n\t./smr tests/forward-goto.sm\n\t./sm tests/yo.src\n\t./sm tests/func.src\n\tcat tests/core-test.src tests/core.src | ./sm -\n\nclean:\n\trm -f $(TARGETS) *.stackdump tests/*.sm\n"
  },
  {
    "path": "README.md",
    "content": "Stack-Machine\n=============\n\nThis project contains\n\n  * A simple, stack-based virtual machine for executing low-level instructions\n  * An assembler supporting a Forth / PostScript like language\n  * An interpreter able to run compiled programs\n\nArchitecture and design\n-----------------------\n\nThe instructions are fixed-width at 32-bits and so are the arithmetic\noperands.\n\nBy default, programs have 1 million cells available for both program text\nand data.  This means that a virtual machine memory takes up 4MB plus the\ndata and instruction stacks.\n\nThe text and data regions are overlapped, so you can easily write\nself-modifying code (early versions actually required self-modification to\nbe able to return from subroutine calls, just like Knuth's MIX, but I've\nsince taken liberty to add such modern convenience into the core instruction\nset).\n\nThere are no registers.  This _is_ a stack machine, after all.\n\nAs we know from theoretical computer science, a pushdown automaton needs\n_two_ stacks to be Turing equivalent.  Therefore we employ two as well; one\nfor the instruction pointer and one for the data.  They live separately from\nthe text and data region, and are only limited by the host process heap\nsize.\n\nThe machine contains no special facilities besides this:  It's inherently\nsingle-threaded and has no protection mechanisms.  Its operation is\ncompletely sandboxed, though, except for access to standard output.\n\nAim\n---\n\nThe project aim was to create a simple machine and language to play around\nwith.  You can benefit from it by reading the source code, playing with a\nlanguage similar to Forth, but conceptually simpler, and finally by seeing\nhow easy it is to build your own system.\n\nThe programming language\n========================\n\nThe language is very similar to Forth and PostScript:  You basically write\nin RPN --- reverse Polish notation.  
Anything not recognized as an\ninstruction is put on the data stack, so to put the numbers 3 and 2 on the\nstack, just write\n\n    3 2\n\nTo multiply them, just append an asterisk:\n\n    3 2 * ; multiplication\n\nThis operation pops the topmost two numbers on the stack and replaces them\nwith the result of the multiplication.  To run such a program, you'd need to\ninclude the core library first, since multiplication is defined as a\nfunction:\n\n    $ cat tests/core.src your-file.src | sm\n    6\n\nLabels, addresses and their values\n----------------------------------\n\nLabels are identifiers ending with a colon.\n\nThey refer to a particular cell in the machine, and you can access their\nposition, value or execute code from that cell location:\n\n    label:      ; create a label for the cell at this location\n    &label      ; put ADDRESS of label on top of stack\n    &label LOAD ; put VALUE of label's cell \"label\" on top-of-stack\n    label       ; EXECUTE code from label position\n\nSo, to put the _address_ of a label on the top of the data stack, just\nprepend the label name with an ampersand.\n\nIf you want the _value_ of an address, put the address on the TOS (top of\nstack) and use the `LOAD` instruction to replace the TOS with the value at\nthe given cell location.\n\nWhen executing code at a given label position, the machine first puts the\naddress of the next instruction on top of the instruction stack.  This way\nyou can return from a function call by using the instruction `POPIP`:\n\n    main:       ; program start\n      print-dot\n      print-dot\n      HALT\n\n    print-dot:\n      '.' OUT\n      '\\n' OUT\n      POPIP     ; return from \"function\"\n\nVariables and subroutines\n-------------------------\n\nAn idiom for creating variables is to create a label and put a `NOP` at\nthat location to reserve one memory cell to hold the variable.  
An example of\nusing a counter variable to implement a loop is given below.\n\n    counter: NOP                     ; reserve 1 word for the variable \"counter\"\n\n    program: 2 &counter STOR                       ; set counter to two\n             &counter LOAD 1 ADD &counter STOR     ; increment counter by one\n\n    ; loop counter+1 times\n\n    display: '\\n' '*' OUT OUT                      ; print an asterisk\n             1 &counter LOAD SUB &counter STOR     ; decrement counter by one\n             &display &counter LOAD JNZ            ; jump to display if not zero\n\nThe output of the above program is three stars:\n\n    $ ./sm foo.src\n    *\n    *\n    *\n\nYou can forward-reference labels.  In fact, another idiom is to jump to the\nmain part of the program at the start of the source.\n\nHello, world!\n-------------\n\nYou can do `72 OUT` to print the letter \"H\" (72 is the ASCII code for \"H\").\nCutting to the chase, a program to print \"Hello!\" would be:\n\n    ; Labels are written as a name without whitespace\n    ; and a colon at the end.\n\n    main:\n       72 out          ; \"H\"\n      101 out          ; \"e\"\n      108 dup out out  ; \"ll\"\n      111 out          ; \"o\"\n       33 out          ; \"!\"\n\n      ; newline\n      '\\n' out\n\n      42 outnum     ; print a number\n      '\\n' out      ; and newline\n\n      ; stop program\n      halt\n\nNotice the use of the `HALT` instruction to stop the program.\n\nMultiplication and core library\n-------------------------------\n\nI've implemented a multiplication function in the core library in\n`tests/core.src`:\n\n    mul:            ; ( a b -- (a*b) )\n      mul-res: nop  ; placeholder for result\n      mul-cnt: nop  ; placeholder for counter\n      mul-num: nop\n\n      &mul-cnt stor ; b to cnt\n      dup\n      &mul-res stor ; a to res\n      &mul-num stor ; and to num\n\n      mul-loop:\n        ; calculate res += a\n        &mul-res load\n        &mul-num load +\n        &mul-res 
stor\n\n        ; decrement counter\n        &mul-cnt load\n        -1\n        &mul-cnt stor\n\n        ; loop until counter is zero\n        &mul-cnt load\n        &mul-loop swap -1 jnz\n\n      &mul-res load\n      popip\n\n    ; ...\n\n    *:        ; alias for mul\n      mul\n      popip\n\nNote that this function needs definitions for the functions `+` and `-1`.\n\nRecall the program to multiply two numbers.  Put the following in a file\n`hey.src`:\n\n    3 2 * outnum\n    '\\n' out\n    halt\n\nIf we concatenate the core library with our program, we get:\n\n    $ cat tests/core.src hey.src | ./sm\n    6\n\nYou could implement the whole program without depending on the core library:\n\n    ; semi-obfuscated multiply and print\n    ; does not depend on any libraries\n\n    ; re-inventing the wheel can be very educational!\n\n    main:\n      12345 67890 * outnum\n      '\\n' out\n      halt\n\n    ; multiplication function w/inner loop\n    *:\n      R: nop C: nop N: nop\n      &C stor dup &R stor &N stor\n\n      *-loop:\n        &R load &N load add &R stor\n        1 &C load sub &C stor\n        &C load &*-loop swap 1 swap sub jnz\n\n      &R load\n      popip\n\nWhile implementing the Karatsuba algorithm should be quite easy, Toom-Cook\nmultiplication is left as an exercise for the reader.\n\nIt's not a joke\n---------------\n\nI think I need to clarify that this project is actually not a joke.  Fun,\nabsolutely, but not a joke.\n\nI just wanted to create a simple virtual machine and from that I grew a\nlanguage.  
It's very similar to Forth and PostScript, and we all know those\nare extremely powerful --- particularly Forth!\n\nBuilding stuff yourself is a powerful way of learning.\n\nA Fibonacci program\n-------------------\n\nThe following is a program to generate and print Fibonacci numbers, taken\nfrom `tests/fib.src`:\n\n    ; Print the well-known Fibonacci sequence\n    ;\n    ; Our word size is only 32-bits, so we can't\n    ; count very far.\n\n    ; Program starts at main, so jump there\n\n    &main jmp\n\n    ; Create label 'count', which refers to this memory\n    ; address.\n    ;\n    ; The NOP (no operation; do nothing) is only used\n    ; to reserve memory space for a variable.\n\n    count:\n      nop\n\n    ; Initialize the counter by storing 46 at the address of 'count'.\n    ;\n    ; POPIP will pop the instruction pointer, effectively jumping to\n    ; the next location (probably the caller).\n\n    count-init:\n      46 &count stor\n      popip\n\n    ; Shorthand for loading the number at 'count' onto the top of the stack.\n    ;\n    ; The \"( -- counter)\" comment is similar to Forth's comments, explaining\n    ; that no number is expected on the stack, and after running this function,\n    ; a number (\"counter\") will be on the stack.\n\n    count-get: ; ( -- counter )\n      &count load     ; load number\n      popip\n\n    ; Shorthand for decrementing the number on the stack\n\n    dec: ; ( a -- a-1 )\n      1 swap sub\n      popip\n\n    ; Store top of stack to 'count', do not alter stack\n\n    count-set: ; ( counter -- counter )\n      dup &count stor\n      popip\n\n    ; Decrement counter and return it\n\n    count-dec: ; ( -- counter )\n      count-get dec\n      count-set\n      popip\n\n    ; Print number with a newline without altering stack\n\n    show: ; ( number -- number )\n      dup outnum\n      '\\n' out\n      popip\n\n    ; Duplicate two top-most numbers on stack\n\n    dup2: ; ( a b -- a b a b )\n      swap       ; b a\n      dup  
      ; b a a\n      rol3       ; a a b\n      dup        ; a a b b\n      rol3       ; a b b a\n      swap       ; a b a b\n      popip\n\n    jump-if-nonzero: ; ( dest_address predicate -- )\n      swap jnz\n      popip\n\n    ; The start of our Fibonacci printing program\n\n    main:\n      count-init\n\n      0 show  ; first Fibonacci number\n      1       ; second Fibonacci number\n\n      loop:\n        ; add top numbers and show\n        ; a b -> a b a b -> a b (a + b)\n        dup2 add show\n\n        ; decrement, loop if non-zero\n        count-dec &loop jump-if-nonzero\n\nConvenience features\n--------------------\n\nI've added a `HALT` instruction.  This replaces the old idiom of looping\nforever to signal that a program was finished:\n\n    stop: stop      ; form 1\n    stop: &stop jmp ; form 2\n    halt            ; convenience form\n\nOriginally, I left out a halt instruction for the sake of minimalism.\n\nSecondly, I've added a `POPIP` instruction along with automatically storing\nthe address of the next instruction before performing a jump.  This\neffectively lets you call and return from subroutines:\n\n    boot:\n      &main jmp halt\n\n    foo: bar: baz:\n      '\\n' '!' 'e' 'c' 'i' 'u' 'j' 'e' 'l' 't' 'e' 'e' 'B'\n      out out out out out out out out out out out out out\n      popip\n\n    main:\n      foo bar baz\n\nThird, I never bothered to write my own print number function, because it\nwould require me to write both division and modulus functions in source\nfirst.  So I implemented `OUTNUM` that prints a number to the output:\n\n    123 OUTNUM '\\n' OUT ; prints \"123\\n\"\n\nLacking is proper string handling.  
One could say that string handling is\nnot this language's strongest point.\n\nCompiling the project\n=====================\n\nTo compile and run the examples:\n\n    $ make all check\n\nTo see the low-level machine instructions:\n\n    $ ./smr -h\n\nTo execute source code on-the-fly:\n\n    $ ./sm filename\n\nTo compile source to bytecode:\n\n    $ ./smc filename\n\nThe assembly language is not documented other than in code, because I'm\nactively playing with it.\n\nAlthough the interpreter is slow, it should be possible to convert stack\noperations to a register machine.  In fact, it should be trivial to compile\nprograms to native machine code, e.g. x86.\n\nInstruction set\n---------------\n\nThe instructions are found in `include/instructions.hpp`:\n\n    VALUE       OPCODE  EXPLANATION\n    0x00000000  NOP     do nothing\n    0x00000001  ADD     pop a, pop b, push a + b\n    0x00000002  SUB     pop a, pop b, push a - b\n    0x00000003  AND     pop a, pop b, push a & b\n    0x00000004  OR      pop a, pop b, push a | b\n    0x00000005  XOR     pop a, pop b, push a ^ b\n    0x00000006  NOT     pop a, push !a\n    0x00000007  IN      read one byte from stdin, push as word on stack\n    0x00000008  OUT     pop one word and write to stream as one byte\n    0x00000009  LOAD    pop a, push word read from address a\n    0x0000000A  STOR    pop a, pop b, write b to address a\n    0x0000000B  JMP     pop a, goto a\n    0x0000000C  JZ      pop a, pop b, if a == 0 goto b\n    0x0000000D  PUSH    push next word\n    0x0000000E  DUP     duplicate word on stack\n    0x0000000F  SWAP    swap top two words on stack\n    0x00000010  ROL3    rotate top three words on stack once left, (a b c) -> (b c a)\n    0x00000011  OUTNUM  pop one word and write to stream as number\n    0x00000012  JNZ     pop a, pop b, if a != 0 goto b\n    0x00000013  DROP    remove top of stack\n    0x00000014  PUSHIP  push a in IP stack\n    0x00000015  POPIP   pop IP stack to current IP, effectively 
performing a jump\n    0x00000016  DROPIP  pop IP, but do not jump\n    0x00000017  COMPL   pop a, push the complement of a\n\nThe instruction set could easily be more minimal, even more so if we allowed\nregisters.  Also, we have taken absolutely no care about the machine code\nvalues for each instruction.  A good design would do something cool with\nthat.\n\nLicense and author\n==================\n\nPlaced in the public domain in 2010 by the author, Christian Stigen Larsen\nhttp://csl.sublevel3.org\n"
  },
  {
    "path": "compiler.cpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#include <stdlib.h>\n#include \"compiler.hpp\"\n#include \"parser.hpp\"\n#include \"machine.hpp\"\n#include \"label.hpp\"\n#include \"upper.hpp\"\n\nvoid compiler::error(const std::string& s)\n{\n  if ( callback )\n    callback(s.c_str());\n}\n\nbool compiler::islabel(const std::string& s)\n{\n  size_t l = s.length();\n  return l<1? false : s[l-1] == ':';\n}\n\nbool compiler::iscomment(const std::string& s)\n{\n  return s[0] == ';';\n}\n\nOp compiler::tok2op(const std::string& s)\n{\n  return from_s(s.c_str());\n}\n\nbool compiler::isliteral(const std::string& s)\n{\n  if ( islabel(s) )\n    return false;\n\n  return tok2op(s) == NOP_END;\n}\n\nbool compiler::isnumber(const char* s)\n{\n  while ( *s )\n    if ( !isdigit(*s++) )\n      return false;\n\n  return true;\n}\n\nbool compiler::ischar(const std::string& s)\n{\n  size_t l = s.length();\n\n  if ( l==3 && s[0]=='\\'' && s[2]=='\\'' && s[1]!='\\\\' )\n    return true;\n\n  if ( l==4 && s[0]=='\\'' && s[3]=='\\'' && s[1]=='\\\\' \n            && (s[2]=='t' || s[2]=='r' || s[2]=='n' || s[2]=='0') )\n    return true;\n\n  return false;\n}\n\nchar compiler::to_ord(const std::string& s)\n{\n  size_t l = s.length();\n\n  if ( l == 3 ) // 'x'\n    return s[1];\n \n  if ( l == 4 ) // '\\x'\n    switch ( s[2] ) {\n    case 't': return '\\t';\n    case 'r': return '\\r';\n    case 'n': return '\\n';\n    case '0': return '\\0';\n    }\n\n  error(\"Unknown character literal: \" + s);\n  return '\\0';\n}\n\nbool compiler::islabel_ref(const std::string& s)\n{\n  return s[0] == '&';\n}\n\nint32_t compiler::to_literal(const std::string& s)\n{\n  if ( isnumber(s.c_str()) )\n    return atoi(s.c_str());\n\n  if ( ischar(s) )\n    return to_ord(s);\n\n  return -1;\n}\n\nbool compiler::ishalt(const std::string& s)\n{\n  return s.empty() || upper(s)==\"HALT\";\n}\n\nvoid 
compiler::check_label_name(const std::string& label)\n{\n  if ( upper(label) == \"HERE\" )\n    error(\"Label is reserved: HERE\");\n}\n\ncompiler::compiler(void (*cb)(const char*)) :\n  m(cb),\n  forwards(),\n  callback(cb)\n{\n}\n\nvoid compiler::set_error_callback(void (*error_callback)(const char* message))\n{\n  callback = error_callback;\n}\n\nvoid compiler::compile_label(const std::string& label)\n{\n  int32_t address = m.get_label_address(label);\n\n  m.load(PUSH);\n\n  // if label not found, mark it for update\n  if ( address == -1 ) {\n    check_label_name(label);\n    forwards.push_back(label_t(label, m.pos()));\n  }\n\n  m.load(address);\n}\n\nvoid compiler::compile_function_call(const std::string& function)\n{\n  // Return address is here plus four instructions\n  m.load(PUSHIP); m.load(m.pos() + 4*m.wordsize());\n\n  // Push function destination address -- update it later\n  m.load(PUSH);\n  forwards.push_back(label_t(function, m.pos()));\n  m.load(-1); // just push an arbitrary number\n\n  // Jump to function\n  m.load(JMP);\n\n  // This is the return point\n}\n\nvoid compiler::compile_literal(const std::string& token)\n{\n  if ( islabel_ref(token) ) {\n    compile_label(token.substr(1));\n    return;\n  }\n\n  int32_t literal = to_literal(token);\n\n  // Literals are pushed on to the stack\n  if ( literal != -1 ) {\n    m.load(PUSH);\n    m.load(literal);\n    return;\n  }\n\n  // Unknown literals are treated as forward function calls\n  compile_function_call(token);\n}\n\nvoid compiler::resolve_forwards()\n{\n  for ( size_t n=0; n<forwards.size(); ++n ) {\n    std::string label = forwards[n].name;\n    int32_t address = m.get_label_address(label);\n\n    if ( address == -1 )\n      error(\"Code label not found: \" + label);\n\n    // update label jump to address\n    m.set_mem(forwards[n].pos, address);\n  }\n}\n\n// Return FALSE when compilation has finished\nbool compiler::compile_token(const std::string& s, parser& p)\n{\n  if ( s.empty() ) {\n  
  m.load_halt();\n    resolve_forwards();\n    return false;\n  }\n  else if ( ishalt(s) )    m.load_halt();\n  else if ( iscomment(s) ) p.skip_line();\n  else if ( isliteral(s) ) compile_literal(s);\n  else if ( islabel(s) )   m.addlabel(s.c_str(), m.pos());\n  else {\n    Op op = tok2op(s);\n\n    if ( op == NOP_END )\n      error(\"Unknown operation: \" + s);\n\n    m.load(op);\n  }\n\n  return true;\n}\n\nmachine_t& compiler::get_program()\n{\n  return m;\n}\n\ncompiler::compiler(parser& p, void (*fp)(const char*)) :\n  m(fp), forwards(), callback(fp)\n{\n  // Perform complete compilation\n  while ( compile_token(p.next_token(), p) )\n    ; // loop\n}\n"
  },
  {
    "path": "error.cpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#include <stdlib.h>\n#include <stdio.h>\n#include \"error.hpp\"\n\nvoid error(const char* s)\n{\n  fprintf(stderr, \"\\n%s\\n\", s);\n  exit(1);\n}\n"
  },
  {
    "path": "fileptr.cpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#include <stdexcept>\n#include \"fileptr.hpp\"\n\nfileptr::fileptr(FILE *file) : f(file)\n{\n  if ( f == NULL )\n    throw std::runtime_error(\"Could not open file\");\n}\n\nfileptr::~fileptr()\n{\n  fclose(f);\n}\n\nfileptr::operator FILE*() const\n{\n  return f;\n}\n"
  },
  {
    "path": "include/compiler.hpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#include \"instructions.hpp\"\n#include \"parser.hpp\"\n#include \"machine.hpp\"\n\n#ifndef INC_COMPILER_HPP\n#define INC_COMPILER_HPP\n\nclass compiler\n{\n  machine_t m;\n  std::vector<label_t> forwards;\n  void (*callback)(const char*);\n\n  void error(const std::string& s);\n  char to_ord(const std::string& s);\n  int32_t to_literal(const std::string& s);\n  void check_label_name(const std::string& label);\n\n  static bool islabel(const std::string& s);\n  static bool iscomment(const std::string& s);\n  static Op tok2op(const std::string& s);\n  static bool isliteral(const std::string& s);\n  static bool isnumber(const char* s);\n  static bool ischar(const std::string& s);\n  static bool islabel_ref(const std::string& s);\n  static bool ishalt(const std::string& s);\n\npublic:\n  compiler(void (*error_callback)(const char* message) = NULL);\n  compiler(parser& p, void (*error_callback)(const char* message) = NULL);\n\n  void set_error_callback(void (*error_callback)(const char* message));\n  void compile_label(const std::string& label);\n  void compile_function_call(const std::string& function);\n  void compile_literal(const std::string& token);\n  void resolve_forwards();\n  bool compile_token(const std::string& s, parser& p);\n  machine_t& get_program();\n};\n\n#endif\n"
  },
  {
    "path": "include/error.hpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\nvoid error(const char* s);\n"
  },
  {
    "path": "include/fileptr.hpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#include <stdio.h>\n\n#ifndef INC_FILEPTR_HPP\n#define INC_FILEPTR_HPP\n\nclass fileptr {\n  FILE* f;\n  fileptr(const fileptr&); // deny\n  fileptr& operator=(const fileptr&); // deny\npublic:\n  fileptr(FILE *file);\n  ~fileptr();\n  operator FILE*() const;\n};\n\n#endif\n"
  },
  {
    "path": "include/instructions.hpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#ifndef INC_SMCORE_H\n#define INC_SMCORE_H\n\nenum Op {\n  NOP,  // do nothing\n  ADD,  // pop a, pop b, push a + b\n  SUB,  // pop a, pop b, push a - b\n  AND,  // pop a, pop b, push a & b\n  OR,   // pop a, pop b, push a | b\n  XOR,  // pop a, pop b, push a ^ b\n  NOT,  // pop a, push !a\n  IN,   // read one byte from stream, push as word on stack\n  OUT,  // pop one word and write to stream as one byte\n  LOAD, // pop a, push word read from address a\n  STOR, // pop a, pop b, write b to address a\n  JMP,  // pop a, goto a\n  JZ,   // pop a, pop b, if a == 0 goto b\n  PUSH, // push next word\n  DUP,  // duplicate word on stack\n  SWAP, // swap top two words on stack\n  ROL3, // rotate top three words on stack once left, (a b c) -> (b c a)\n  OUTNUM, // pop one word and write to stream as number\n  JNZ,  // pop a, pop b, if a != 0 goto b\n  DROP, // remove top of stack\n  PUSHIP, // push next word on IP stack\n  POPIP,  // pop IP stack to current IP, effectively performing a jump\n  DROPIP, // pop IP, but do not jump\n  COMPL,  // pop a, push the complement of a\n  NOP_END // placeholder for end of enum; MUST BE LAST\n};\n\nextern const char* OpStr[];\n\nconst char* to_s(Op op);\nOp from_s(const char* s);\n\n#endif\n"
  },
  {
    "path": "include/label.hpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#include <stdint.h> // int32_t\n#include <stdlib.h>\n#include <string>\n\n#ifndef INC_LABEL_HPP\n#define INC_LABEL_HPP\n\nstruct label_t {\n  std::string name;\n  int32_t pos;\n\n  label_t(const std::string& name_, int32_t position)\n    : name(name_), pos(position)\n  {\n  }\n};\n\n#endif\n"
  },
  {
    "path": "include/machine.hpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#include <stdio.h>\n#include <vector>\n#include <string>\n#include \"instructions.hpp\"\n#include \"label.hpp\"\n\n#ifndef INC_MACHINE_HPP\n#define INC_MACHINE_HPP\n\nclass machine_t {\n  std::vector<int32_t> stack;\n  std::vector<int32_t> stackip;\n  std::vector<label_t> labels;\n  size_t memsize;\n  int32_t *memory;\n  int32_t ip; // instruction pointer\n  FILE* fin;\n  FILE* fout;\n  bool running;\n  void (*error_cb)(const char*);\n\npublic:\n  machine_t(void (*error_callback)(const char* msg));\n  machine_t(\n    const size_t memory_size = 1024*1000/sizeof(int32_t),\n    FILE* out = stdout,\n    FILE* in  = stdin,\n    void (*error_callback)(const char* msg) = NULL);\n  machine_t(const machine_t& p, void (*error_callback)(const char* msg) = NULL);\n  machine_t& operator=(const machine_t& p);\n  ~machine_t();\n  void reset();\n  void error(const char* s) const;\n  void push(const int32_t& n);\n  int32_t pop();\n  void puship(const int32_t&);\n  int32_t popip();\n  void check_bounds(int32_t n, const char* msg) const; \n  void next();\n  void prev();\n  void load(Op);\n  void load(int32_t n);\n  int run(int32_t start_address = 0);\n  void exec(Op);\n  int32_t* find_end() const;\n  void load_image(FILE* f);\n  void save_image(FILE* f) const;\n  void load_halt();\n  void showstack() const;\n\n  size_t size() const;\n  int32_t cur() const;\n  int32_t pos() const;\n\n  int32_t get_label_address(const std::string& label) const;\n  void addlabel(const char* name, int32_t pos, int lineno = -1);\n\n  bool isrunning() const;\n  void set_fout(FILE*);\n  void set_fin(FILE*);\n\n  void set_mem(int32_t adr, int32_t val);\n  int32_t get_mem(int32_t adr) const;\n  int32_t wordsize() const;\n\n  // instructions\n  void instr_nop();    \n  void instr_add();    \n  void instr_sub();    \n  void instr_and();    \n  void 
instr_or();     \n  void instr_xor();    \n  void instr_not();    \n  void instr_in();     \n  void instr_out();    \n  void instr_outnum(); \n  void instr_load();   \n  void instr_stor();   \n  void instr_jmp();    \n  void instr_jz();     \n  void instr_drop();   \n  void instr_popip();  \n  void instr_dropip(); \n  void instr_jnz();    \n  void instr_push();   \n  void instr_puship();  \n  void instr_dup();    \n  void instr_swap();   \n  void instr_rol3();   \n  void instr_compl();\n};\n\n#endif\n"
  },
  {
    "path": "include/parser.hpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#include <stdio.h> // FILE\n#include <string>\n\n#ifndef INC_PARSER_HPP\n#define INC_PARSER_HPP\n\nclass parser\n{\n  FILE* f;\n  int lineno;\n  int update_lineno(int c);\n  int fgetchar();\n  void move_back(int c);\n  void skip_whitespace();\n\npublic:\n  parser(FILE* f);\n  int get_lineno() const;\n  std::string next_token();\n  void skip_line();\n};\n\n#endif\n"
  },
  {
    "path": "include/upper.hpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#include <string>\n\nstd::string upper(const std::string& s);\n"
  },
  {
    "path": "include/version.hpp",
    "content": "#define VERSION \"Public domain, 2010-2011 by Christian Stigen Larsen\"\n"
  },
  {
    "path": "instructions.cpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#include <stdio.h>\n#include \"instructions.hpp\"\n#include \"machine.hpp\"\n#include \"upper.hpp\"\n\nconst char* OpStr[] = {\n  \"NOP\",\n  \"ADD\",\n  \"SUB\",\n  \"AND\",\n  \"OR\",\n  \"XOR\",\n  \"NOT\",\n  \"IN\",\n  \"OUT\",\n  \"LOAD\",\n  \"STOR\",\n  \"JMP\",\n  \"JZ\",\n  \"PUSH\",\n  \"DUP\",\n  \"SWAP\",\n  \"ROL3\",\n  \"OUTNUM\",\n  \"JNZ\",\n  \"DROP\",\n  \"PUSHIP\",\n  \"POPIP\",\n  \"DROPIP\",\n  \"COMPL\",\n  \"NOP_END\"\n};\n\nconst char* to_s(Op op)\n{\n  if ( op >= NOP && op < NOP_END )\n    return OpStr[op];\n\n  return \"<?>\";\n}\n\nOp from_s(const char* str)\n{\n  std::string s(upper(str));\n\n  // slow, O(n/2) seek... :-)\n  for ( int n=0; n<NOP_END; ++n )\n    if ( s == OpStr[n] )\n      return static_cast<Op>(n);\n\n  return NOP_END;\n}\n"
  },
  {
    "path": "machine.cpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#include <stdlib.h>\n#include <memory.h>\n#include \"machine.hpp\"\n#include \"label.hpp\"\n#include \"upper.hpp\"\n\nmachine_t::machine_t(\n  const machine_t& p,\n  void (*error_callback)(const char*))\n:\n  stack(p.stack),\n  stackip(p.stackip),\n  labels(p.labels),\n  memsize(p.memsize),\n  memory(new int32_t[p.memsize]),\n  ip(p.ip),\n  fin(p.fin),\n  fout(p.fout),\n  running(p.running),\n  error_cb(error_callback)\n{\n  memmove(memory, p.memory, memsize*sizeof(int32_t));\n}\n\nmachine_t::machine_t(const size_t memory_size,\n  FILE* out,\n  FILE* in,\n  void (*error_callback)(const char*))\n:\n  stack(),\n  stackip(),\n  labels(),\n  memsize(memory_size),\n  memory(new int32_t[memory_size]),\n  ip(0),\n  fin(in),\n  fout(out),\n  running(true),\n  error_cb(error_callback)\n{\n  reset();\n}\n\nmachine_t::machine_t(void (*error_callback)(const char*))\n:\n  stack(),\n  stackip(),\n  labels(),\n  memsize(1000*1024*sizeof(int32_t)),\n  memory(new int32_t[memsize]),\n  ip(0),\n  fin(stdin),\n  fout(stdout),\n  running(true),\n  error_cb(error_callback)\n{\n  reset();\n}\n\nmachine_t& machine_t::operator=(const machine_t& p)\n{\n  if ( &p == this )\n    return *this;\n\n  delete[](memory);\n\n  stack = p.stack;\n  stackip = p.stackip;\n  labels = p.labels;\n  memsize = p.memsize;\n  memory = new int32_t[memsize];\n  memcpy(memory, p.memory, memsize*sizeof(int32_t));\n  ip = p.ip;\n  fin = p.fin;\n  fout = p.fout;\n  running = p.running;\n  error_cb = p.error_cb;\n\n  return *this;\n}\n\nvoid machine_t::reset()\n{\n  memset(memory, NOP, memsize*sizeof(int32_t));\n  stack.clear();\n  ip = 0;\n}\n\nmachine_t::~machine_t()\n{\n  delete[](memory);\n}\n\nvoid machine_t::error(const char* s) const\n{\n  if ( error_cb )\n    error_cb(s);\n}\n\nvoid machine_t::push(const int32_t& n)\n{\n  stack.push_back(n);\n}\n\nvoid 
machine_t::puship(const int32_t& n)\n{\n  stackip.push_back(n);\n}\n\nint32_t machine_t::popip()\n{\n  if ( stackip.empty() ) {\n    error(\"POP empty IP stack\");\n    return 0;\n  }\n\n  int32_t n = stackip.back();\n  stackip.pop_back();\n  return n;\n}\n\nint32_t machine_t::pop()\n{\n  if ( stack.empty() )\n    error(\"POP empty stack\");\n\n  int32_t n = stack.back();\n  stack.pop_back();\n  return n;\n}\n\nvoid machine_t::check_bounds(int32_t n, const char* msg) const\n{\n  if ( n < 0 || static_cast<size_t>(n) >= memsize )\n    error(msg);\n}\n\nvoid machine_t::next()\n{\n  ip += sizeof(int32_t);\n\n  if ( ip < 0 )\n    error(\"IP < 0\");\n\n  if ( static_cast<size_t>(ip) >= memsize )\n    ip = 0; // TODO: Halt instead of wrap-around?\n}\n\nvoid machine_t::prev()\n{\n  if ( ip == 0 )\n    error(\"prev() reached zero\");\n\n  ip -= sizeof(int32_t);\n}\n\nvoid machine_t::load(Op op)\n{\n  memory[ip] = op;\n  next();\n}\n\nvoid machine_t::load(int32_t n)\n{\n  memory[ip] = n;\n  next();\n}\n\nint machine_t::run(int32_t start_address)\n{\n  ip = start_address;\n\n  while(running)\n    exec(static_cast<Op>(memory[ip]));\n\n  return 0; // TODO: exit-code ?\n}\n\nvoid machine_t::instr_nop()\n{\n  next();\n}\n\nvoid machine_t::instr_add()\n{\n  push(pop() + pop());\n  next();\n}\n\nvoid machine_t::instr_sub()\n{\n  /*\n   * This operation is not primitive.  
It can\n   * be implemented by adding the minuend to\n   * the two's complement of the subtrahend:\n   *\n   * SUB: ; ( a b -- (b-a))\n   *   swap  ; b a\n   *   compl ; b ~a\n   *   1 add ; b (~a+1), or b -a\n   *   add   ; b-a\n   *   popip\n   *\n   * The problem is that IF the underlying\n   * architecture does not use two's complement\n   * to represent negative values, stuff like\n   * printing will fail miserably (at least in\n   * the current implementation on top of C).\n   */\n\n  // TODO: Consider reversing the operands for SUB\n  //       (it's currently unnatural)\n\n  int32_t tos = pop();\n  push(tos - pop());\n  next();\n}\n\nvoid machine_t::instr_and()\n{\n  push(pop() & pop());\n  next();\n}\n\nvoid machine_t::instr_or()\n{\n  push(pop() | pop());\n  next();\n}\n\nvoid machine_t::instr_xor()\n{\n  push(pop() ^ pop());\n  next();\n}\n\nvoid machine_t::instr_not()\n{\n  // TODO: this probably does not work as intended\n  push(!pop());\n  next();\n}\n\nvoid machine_t::instr_compl()\n{\n  push(~pop());\n  next();\n}\n\nvoid machine_t::instr_in()\n{\n  /*\n   * The IN/OUT functions should be implemented\n   * using something akin to x86 INT or SYSCALL or\n   * similar.  E.g.:\n   *\n   * 123 SYSCALL ; exec system call 123\n   *\n   */\n  push(getc(fin));\n  next();\n}\n\nvoid machine_t::instr_out()\n{\n  putc(pop(), fout);\n  fflush(fout);\n  next();\n}\n\nvoid machine_t::instr_outnum()\n{\n  fprintf(fout, \"%u\", pop());\n  next();\n}\n\nvoid machine_t::instr_load()\n{\n  int32_t a = pop();\n  check_bounds(a, \"LOAD\");\n  push(memory[a]);\n  next();\n}\n\nvoid machine_t::instr_stor()\n{\n  int32_t a = pop();\n  check_bounds(a, \"STOR\");\n  memory[a] = pop();\n  next();\n}\n\nvoid machine_t::instr_jmp()\n{\n  /*\n   * This function is not primitive.\n   * If we have e.g. 
JZ, we can always\n   * do \"0 JZ\" to perform the jump.\n   *\n   * (Note that this will break the\n   * HALT-idiom)\n   *\n   */\n\n  // TODO: Implement as library function\n\n  //push(0);\n  //instr_jz();\n\n  int32_t a = pop();\n  check_bounds(a, \"JMP\");  \n\n  // check if we are halting, i.e. jumping to current\n  // address -- if so, quit\n  if ( a == ip )\n    running = false;\n  else\n    ip = a;\n}\n\nvoid machine_t::instr_jz()\n{\n  int32_t a = pop();\n  int32_t b = pop();\n\n  if ( a != 0 )\n    next();\n  else {\n    check_bounds(b, \"JZ\");\n    ip = b; // perform jump\n  }\n}\n\nvoid machine_t::instr_drop()\n{\n  pop();\n  next();\n}\n\nvoid machine_t::instr_popip()\n{\n  int32_t a = popip();\n  check_bounds(a, \"POPIP\");\n  ip = a;\n}\n\nvoid machine_t::instr_dropip()\n{\n  popip();\n  next();\n}\n\nvoid machine_t::instr_jnz()\n{\n  /*\n   * Only one of JNZ and JZ is needed as\n   * a primitive -- one can be implemented\n   * in terms of the other with a negation\n   * of the TOS.\n   *\n   * (Note that this will break the HALT-idiom)\n   */\n\n  /*\n  instr_puship();\n  instr_compl();\n  instr_popip();\n  instr_jz();\n  */\n\n  int32_t a = pop();\n  int32_t b = pop();\n\n  if ( a == 0 )\n    next();\n  else {\n    check_bounds(b, \"JNZ\");\n    ip = b; // jump\n  }\n}\n\nvoid machine_t::instr_push()\n{\n  next();\n  push(memory[ip]);\n  next();\n}\n\nvoid machine_t::instr_puship()\n{\n  next();\n  puship(memory[ip]);\n  next();\n}\n\nvoid machine_t::instr_dup()\n{\n  /*\n   * This function is not primitive.\n   * It can be replaced with a \"function\":\n   *\n   * ; ( a -- a a )\n   * dup:  nop       ; placeholder <- nop\n   *       &dup stor ; placeholder <- a\n   *       &dup load ; tos <- a\n   *       &dup load ; tos <- a\n   *       popip\n   */\n\n  // TODO: Implement as library function\n\n  int32_t a = pop();\n  push(a);\n  push(a);\n  next();\n}\n\nvoid machine_t::instr_swap()\n{\n  /*\n   * This function is not primitive.\n   * It can 
be replaced with a \"function\",\n   * something like:\n   *\n   * ; ( a b -- b a )\n   * swap:\n   *   swap-b: nop  ; placeholder\n   *   swap-a: nop  ; placeholder\n   *   &swap-b stor ; swap-b <- b\n   *   &swap-a stor ; swap-a <- a\n   *   &swap-b load ; tos <- a\n   *   &swap-a load ; tos <- b\n   *   popip\n   *\n   */\n\n  // TODO: Implement as library function\n\n  // a, b -- b, a\n  int32_t b = pop();\n  int32_t a = pop();\n  push(b);\n  push(a);\n  next();\n}\n\nvoid machine_t::instr_rol3()\n{\n  /*\n   * This function is not primitive.\n   * It can be replaced with \"functions\",\n   * something like:\n   *\n   * rol3:\n   *   rol3-var: nop  ; stack = a b c\n   *   &rol3-var stor ; stack = a b, var = c\n   *   swap           ; stack = b a, var = c\n   *   &rol3-var load ; stack = b a c\n   *   swap           ; stack = b c a\n   *   popip\n   *\n   */\n\n  // TODO: Implement as library function\n\n  // abc -> bca\n  int32_t c = pop(); // TOS\n  int32_t b = pop();\n  int32_t a = pop();\n  push(b);\n  push(c);\n  push(a);\n  next();\n}\n\nvoid machine_t::exec(Op operation)\n{\n  switch(operation) {\n  default:     error(\"Unknown instruction\"); break;\n  case NOP:    instr_nop();    break;\n\n  // Strictly speaking, SUB can be implemented\n  // by ADDing the minuend with the two's complement\n  // of the subtrahend -- but that's not necessarily\n  // portable down to native code\n\n  case ADD:    instr_add();    break;\n  case SUB:    instr_sub();    break; // non-primitive\n\n  // Strictly speaking, all but NOT and AND are\n  // non-primitive (or some other combination of\n  // two operations)\n\n  case AND:    instr_and();    break;\n  case OR:     instr_or();     break;\n  case XOR:    instr_xor();    break;\n  case NOT:    instr_not();    break;\n  case COMPL:  instr_compl();  break;\n\n  // Should be replaced with x86 INT-like operations\n\n  case IN:     instr_in();     break;\n  case OUT:    instr_out();    break;\n\n  case LOAD:   instr_load();   
break;   \n  case STOR:   instr_stor();   break;   \n\n  case PUSH:   instr_push();   break;   \n  case DROP:   instr_drop();   break;   \n\n  case PUSHIP: instr_puship(); break; \n  case POPIP:  instr_popip();  break;  \n  case DROPIP: instr_dropip(); break; \n\n  case JZ:     instr_jz();     break;     \n  case JMP:    instr_jmp();    break; // non-primitive\n  case JNZ:    instr_jnz();    break; // non-primitive\n  case DUP:    instr_dup();    break; // non-primitive\n  case SWAP:   instr_swap();   break; // non-primitive \n  case ROL3:   instr_rol3();   break; // non-primitive\n  case OUTNUM: instr_outnum(); break; // non-primitive\n  }\n}\n\nint32_t* machine_t::find_end() const\n{\n  // find end of program by scanning\n  // backwards until non-NOP is found;\n  // stop at the first cell so an all-NOP\n  // memory does not run off the front\n  int32_t *p = &memory[memsize-1];\n  while ( p > memory && *p == NOP ) --p;\n  return p;\n}\n\nvoid machine_t::load_image(FILE* f)\n{\n  reset();\n\n  // Loop on the fread() result rather than feof(), so a read\n  // error or short read ends the loop instead of spinning.\n  Op op = NOP;\n  while ( fread(&op, sizeof(Op), 1, f) == 1 )\n    load(op);\n\n  ip = 0;\n}\n\nvoid machine_t::save_image(FILE* f) const\n{\n  int32_t *start = memory;\n  int32_t *end = find_end() + sizeof(int32_t);\n\n  while ( start != end ) {\n    fwrite(start, sizeof(Op), 1, f);\n    start += sizeof(int32_t);\n  }\n}\n\nvoid machine_t::load_halt()\n{\n  load(PUSH);\n  load(ip + sizeof(int32_t));\n  load(JMP);\n}\n\nsize_t machine_t::size() const\n{\n  return find_end() - &memory[0];\n}\n\nint32_t machine_t::cur() const\n{\n  return memory[ip];\n}\n\nint32_t machine_t::pos() const\n{\n  return ip;\n}\n\nvoid machine_t::addlabel(const char* name, int32_t pos, int)\n{\n  std::string n = upper(name);\n\n  if ( n.empty() )\n    error(\"Empty label\");\n  else {\n    n.erase(n.length()-1, 1); // remove \":\"\n    labels.push_back(label_t(n.c_str(), pos));\n  }\n}\n\nint32_t machine_t::get_label_address(const std::string& s) const\n{\n  std::string p(upper(s));\n\n  // special label address \"here\" returns current position\n  if ( p == \"HERE\" )\n    
return ip;\n\n  for ( size_t n=0; n < labels.size(); ++n )\n    if ( upper(labels[n].name.c_str()) == p )\n      return labels[n].pos;\n  \n  return -1; // not found\n}\n\nbool machine_t::isrunning() const\n{\n  return running;\n}\n\nvoid machine_t::set_fout(FILE* f)\n{\n  fout = f;\n}\n\nvoid machine_t::set_fin(FILE* f)\n{\n  fin = f;\n}\n\nvoid machine_t::set_mem(int32_t adr, int32_t val)\n{\n  check_bounds(adr, \"set_mem out of bounds\");\n  memory[adr] = val;\n}\n\nint32_t machine_t::get_mem(int32_t adr) const\n{\n  check_bounds(adr, \"get_mem out of bounds\");\n  return memory[adr];\n}\n\nint32_t machine_t::wordsize() const\n{\n  return sizeof(int32_t);\n}\n"
  },
  {
    "path": "parser.cpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#include <stdio.h>\n#include <ctype.h>\n#include \"parser.hpp\"\n\nint parser::update_lineno(int c)\n{\n  if ( c == '\\n' )\n    ++lineno;\n\n  return c;\n}\n\nint parser::fgetchar()\n{\n  return update_lineno(fgetc(f));\n}\n\nvoid parser::move_back(int c)\n{\n  if ( c == '\\n' )\n    --lineno;\n\n  ungetc(c, f);\n}\n\nvoid parser::skip_whitespace()\n{\n  int c;\n  while ( (c = fgetchar()) != EOF && isspace(c) )\n    ;\n  move_back(c);\n}\n\nparser::parser(FILE* file) :\n  f(file),\n  lineno(1)\n{\n}\n\nint parser::get_lineno() const\n{\n  return lineno;\n}\n\nstd::string parser::next_token()\n{\n  int c;\n  std::string s;\n\n  skip_whitespace();\n\n  while ( (c = fgetchar()) != EOF && !isspace(c) )\n      s += c;\n\n  return s;\n}\n\nvoid parser::skip_line()\n{\n  int c;\n  while ( (c = fgetchar()) != EOF && c != '\\n' )\n    ;\n}\n"
  },
  {
    "path": "sm.cpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n * Synopsis:  Compile and run code on-the-fly.\n *\n */\n\n#include <stdio.h>\n#include <string.h>\n#include \"instructions.hpp\"\n#include \"fileptr.hpp\"\n#include \"compiler.hpp\"\n#include \"error.hpp\"\n#include \"upper.hpp\"\n\nvoid compile_and_run(FILE* f)\n{\n  parser p(f);\n  compiler c(p, error);\n  c.get_program().run();\n}\n\nvoid help()\n{\n  printf(\"Usage: sm [ file(s] ]\\n\");\n  printf(\"Compiles and runs source files on the fly.\\n\\n\");\n  exit(1);\n}\n\nint main(int argc, char** argv)\n{\n  try {\n    if ( argc == 1 ) // by default, read standard input\n      compile_and_run(stdin);\n  \n    for ( int n=1; n<argc; ++n )\n      if ( argv[n][0]=='-' ) {\n        if ( argv[n][1] == '\\0' )\n          compile_and_run(stdin);\n        else\n          help();\n      } else\n        compile_and_run(fileptr(fopen(argv[n], \"rt\")));\n\n    return 0;\n  }\n  catch(const std::exception& e) {\n    error(e.what());\n  }\n}\n"
  },
  {
    "path": "smc.cpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n * Synopsis:  Compile source to bydecode.\n *\n */\n\n#include <stdlib.h>\n#include <string.h>\n#include <stdexcept>\n#include \"version.hpp\"\n#include \"instructions.hpp\"\n#include \"fileptr.hpp\"\n#include \"compiler.hpp\"\n#include \"error.hpp\"\n\nconst char* file = \"\";\nparser *p = NULL;\n\n// Return '<this part>.<ext>' of a filename\nstatic std::string sbasename(const std::string& s)\n{\n  using namespace std;\n  const string::size_type p = s.rfind('.');\n  return p == string::npos ? s : s.substr(0, p);\n}\n\nstatic void compile_error(const char* msg)\n{\n  fprintf(stderr, \"%s:%d:%s\\n\", file, p->get_lineno(), msg);\n  exit(1);\n}\n\nvoid compile(FILE* f, const std::string& out)\n{\n  delete(p);\n  p = new parser(f);\n  compiler c(*p, compile_error);\n  c.get_program().save_image( fileptr(fopen(out.c_str(), \"wb\")));\n}\n\nint main(int argc, char** argv)\n{\n  try {\n    if ( argc < 2 )\n      error(\"Usage: smc [ filename(s) | - ]\\n\" VERSION);\n\n    for ( int n=1; n<argc; ++n ) {\n      if ( !strcmp(argv[n], \"-\") ) {\n        file = \"<stdin>\";\n        compile(stdin, \"out.sm\");\n      } else {\n        file = argv[n];\n        compile(fileptr(fopen(argv[n], \"rt\")),\n                sbasename(argv[n]) + \".sm\");\n      }\n    }\n\n    return 0;\n  }\n  catch(const std::exception& e) {\n    error(e.what());\n  }\n}\n"
  },
  {
    "path": "smd.cpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n * Synopsis:  Disassemble bytecode.\n *\n */\n\n#include <stdio.h>\n#include \"instructions.hpp\"\n#include \"machine.hpp\"\n#include \"fileptr.hpp\"\n#include \"error.hpp\"\n\nstatic bool isprintable(int c)\n{\n  return (c>=32 && c<=127)\n    || c=='\\n'\n    || c=='\\r'\n    || c=='\\t';\n}\n\nstatic const char* to_s(char c)\n{\n  static char buf[2];\n  buf[0] = c;\n  buf[1] = '\\0';\n\n  switch ( c ) {\n  default: return buf;\n  case '\\t': return \"\\\\t\";\n  case '\\n': return \"\\\\n\";\n  case '\\r': return \"\\\\r\";\n  }\n}\n\nstatic void disassemble(machine_t &m)\n{\n  int32_t end = m.size();\n\n  while ( m.pos() <= end ) {\n    Op op = static_cast<Op>(m.cur());\n    printf(\"0x%x %s\", m.pos(), to_s(op));\n\n    if ( (op==PUSH || op==PUSHIP) && m.pos()<=end ) {\n        m.next();\n        printf(\" 0x%x\", m.cur());\n\n        if ( isprintable(m.cur()) )\n          printf(\" ('%s')\", to_s(m.cur()));\n    }\n\n    printf(\"\\n\");\n    m.next();\n  }\n}\n\nint help()\n{\n  printf(\"Usage: smd [ file(s) }\\n\\n\");\n  printf(\"Disassembles compiled bytecode files.\\n\");\n  exit(1);\n}\n\nint main(int argc, char** argv)\n{\n  try {\n    for ( int n=1; n<argc; ++n ) {\n      if ( argv[n][0] == '-' ) {\n        if ( argv[n][1] != '\\0' )\n          help();\n        continue;\n      }\n\n      machine_t m;\n      m.load_image(fileptr(fopen(argv[n], \"rb\")));\n      printf(\"; File %s --- %lu bytes\\n\", argv[n], m.size());\n      disassemble(m);\n    }\n    return 0;\n  }\n  catch(const std::exception& e) {\n    error(e.what());\n  }\n}\n"
  },
  {
    "path": "smr.cpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n * Synopsis:  Run compiled bytecode.\n *\n */\n\n#include <stdio.h>\n#include <stdlib.h>\n#include \"version.hpp\"\n#include \"instructions.hpp\"\n#include \"machine.hpp\"\n#include \"fileptr.hpp\"\n\nstatic void help()\n{\n  printf(\"smr -- stack-machine run\\n\");\n  printf(\"%s\\n\\n\", VERSION);\n\n  printf(\"Opcodes:\\n\\n\");\n\n  Op op=NOP; \n  do {\n    printf(\"0x%x = %s\\n\", op, to_s(op));\n    op = static_cast<Op>(op+1);\n  } while ( op != NOP_END );\n\n  printf(\"\\nTo halt program, jump to current position:\\n\\n\");\n  printf(\"0x0 PUSH 0x%x\\n\", (unsigned int)sizeof(int32_t));\n  printf(\"0x%x JMP\\n\\n\", (unsigned int)sizeof(int32_t));\n  printf(\"Word size is %lu bytes\\n\", sizeof(int32_t));\n\n  exit(0);\n}\n\nint main(int argc, char** argv)\n{\n  try {\n    bool found_file = false;\n\n    for ( int n=1; n<argc; ++n ) {\n      if ( argv[n][0] == '-' ) {\n        help();\n        continue;\n      }\n      \n      found_file = true;\n      machine_t m;\n      m.load_image(fileptr(fopen(argv[n], \"rb\")));\n      m.run();\n    }\n\n    if ( !found_file ) {\n      machine_t m;\n      m.load_image(stdin);\n      m.run();\n    }\n\n    return 0;\n  }\n  catch(const std::exception& e) {\n    fprintf(stderr, \"%s\\n\", e.what());\n    return 1;\n  }\n}\n"
  },
  {
    "path": "tests/core-test.src",
    "content": "; To run this example using the core.src library:\n;\n;   cat tests/core-test.src tests/core.src | ./sm\n;\n\n1 outnum '+' out\n2 outnum '+' out\n3 outnum '+' out\n4 outnum '+' out\n5 outnum '=' out\n\n1 _dup +1 3 4 5\n             +\n           +\n         +\n       +\noutnum ; should be 15\n'\\n' out\n\n6 outnum '*' out\n7 outnum '=' out\n6 7 * outnum ; should be 42\n'\\n' out\n\nhalt\n"
  },
  {
    "path": "tests/core.src",
    "content": "; Some helpful functions that can be implemented\n; by primitives\n\n; only parse this code, don't execute it\n&inc-core jmp\n\n_nop:  ; do nothing\n  popip\n\n_jnz:\n  not jz\n  popip\n\n_jmp:\n  0 jz\n\n_swap:          ; ( a b -- b a)\n  swap-a: nop   ; placeholder for a\n  swap-b: nop   ; placeholder for b\n  load &swap-b  ; pop to b\n  load &swap-a  ; pop to a\n  push &swap-a  ; push a\n  push &swap-b  ; push b\n  popip\n\n_dup:           ; ( a -- a a )\n  nop           ; placeholder <- nop\n  &_dup stor ; placeholder <- a\n  &_dup load ; tos <- a\n  &_dup load ; tos <- a\n  popip\n\n_sub-two's-complementB: ; ( a b -- (b-a))\n  swap  ; b a\n  compl ; b ~a\n  1 add ; b (~a+1), or b -a\n  add   ; b-a\n  popip\n\n_rol3: ; ( a b c -- b c a )\n  rol3-var: nop  ; stack = a b c\n  &rol3-var stor ; stack = a b, var = c\n  swap           ; stack = b a, var = c\n  &rol3-var load ; stack = b a c\n  swap           ; stack = b c a\n  popip\n\n+:              ; ( a b -- (a+b) )\n  add           ; 32-bit native addition\n  popip\n\n-:              ; ( a b -- (b-a) )\n  sub           ; 32-bit native subtraction\n  popip\n\n*:              ; ( a b -- (a*b) )\n  mul           ; 32-bit native multiplication\n  popip\n\n-1:             ; ( a -- (a-1))\n  1 swap sub\n  popip\n\n+1:             ; ( a -- (a+1))\n  1 add\n  popip\n\nmul:            ; ( a b -- (a*b) )\n  mul-res: nop  ; placeholder for result\n  mul-cnt: nop  ; placeholder for counter\n  mul-num: nop\n\n  &mul-cnt stor ; b to cnt\n  dup\n  &mul-res stor ; a to res\n  &mul-num stor ; and to num\n\n  mul-loop:\n    ; calculate res += a\n    &mul-res load\n    &mul-num load +\n    &mul-res stor\n\n    ; decrement counter\n    &mul-cnt load\n    -1\n    &mul-cnt stor\n\n    ; loop until counter is zero\n    &mul-cnt load\n    &mul-loop swap -1 jnz\n\n  &mul-res load\n  popip\n\n&:              ; ( a b -- (a AND b) )\n  and           ; 32-bit bitwise AND operation\n  popip\n\n|:              ; ( a b -- (a OR 
b) )\n  or            ; 32-bit bitwise OR operation\n  popip\n\n^:              ; ( a b -- (a XOR b) )\n  xor           ; 32-bit bitwise EXCLUSIVE OR operation\n  popip\n\n~:              ; ( a -- complement of a )\n  compl         ; 32-bit bitwise complement (NOT is the logical negation)\n  popip\n\n<:              ; ( -- a )\n  in            ; read 8 bits from the input stream into the LSB of a 32-bit value\n  popip\n\n>:              ; ( a -- )\n  out           ; write the low 8 bits of a 32-bit value to the output stream\n  popip\n\n@:              ; ( address -- value at address )\n  load\n  popip\n\ninc-core:\n  nop\n"
  },
  {
    "path": "tests/fib.src",
    "content": "; Print the well-known Fibonacci sequence\n;\n; Our word size is only 32-bits, so we can't\n; count very far.\n\n; Program starts at main, so jump there\n\n&main jmp\n\n; Create label 'count', which refers to this memory\n; address.\n;\n; The NOP (no operation; do nothing) is only used\n; to reserve memory space for a variable.\n\ncount:\n  nop\n\n; Initialize the counter by storing 46 at the address of 'count'.\n;\n; POPIP will pop the instruction pointer, effectively jumping to\n; the next location (probably the caller).\n\ncount-init:\n  46 &count stor\n  popip\n\n; Shorthand for loading the number at 'count' onto the top of the stack.\n;\n; The \"( -- counter)\" comment is similar to Forth's comments, explaining\n; that no number is expected on the stack, and after running this function,\n; a number (\"counter\") will be on the stack.\n\ncount-get: ; ( -- counter )\n  &count load     ; load number\n  popip\n\n; Shorthand for decrementing the number on the stack\n\ndec: ; ( a -- a-1 )\n  1 swap sub\n  popip\n\n; Store top of stack to 'count', do not alter stack\n\ncount-set: ; ( counter -- counter )\n  dup &count stor\n  popip\n\n; Decrement counter and return it\n\ncount-dec: ; ( -- counter )\n  count-get dec\n  count-set\n  popip\n\n; Print number with a newline without altering stack\n\nshow: ; ( number -- number )\n  dup outnum\n  '\\n' out\n  popip\n\n; Duplicate two top-most numbers on stack\n\ndup2: ; ( a b -- a b a b )\n  swap       ; b a\n  dup        ; b a a\n  rol3       ; a a b\n  dup        ; a a b b\n  rol3       ; a b b a\n  swap       ; a b a b\n  popip\n\njump-if-nonzero: ; ( dest_address predicate -- )\n  swap jnz\n  popip\n  \n; The start of our Fibonacci printing program\n\nmain:\n  count-init\n\n  0 show  ; first Fibonacci number\n  1       ; second Fibonacci number\n\n  loop:\n    ; add top numbers and show\n    ; a b -> a b a b -> a b (a + b)\n    dup2 add show\n\n    ; decrement, loop if non-zero\n    count-dec &loop 
jump-if-nonzero\n"
  },
  {
    "path": "tests/forward-goto.src",
    "content": "; simple test of forward labels\n\nstart:\n  &cause jmp\n\neffect:\n  'e' out\n  'f' out\n  'f' out\n  'e' out\n  'c' out\n  't' out\n  '\\n' out\n  halt\n\ncause:\n  'c' out\n  'a' out\n  'u' out\n  's' out\n  'e' out\n  32 out\n  '-' out\n  '>' out\n  32 out\n  &effect\n  jmp\n"
  },
  {
    "path": "tests/func.src",
    "content": "; program starts at main program\n  main\n  '4' out '\\n' out\n  halt\n\nthree:\n  '3' out '\\n' out\n  popip\n\nmain:\n  one\n  two\n  three\n  popip\n\none:\n  '1' out '\\n' out\n  popip\n\ntwo:\n  '2' out '\\n' out\n  popip\n"
  },
  {
    "path": "tests/hello.src",
    "content": "; Labels are written as a name without whitespace\n; and a colon at the end\n\nmain:\n   72 out          ; \"H\"\n  101 out          ; \"e\"\n  108 dup out out  ; \"ll\"\n  111 out          ; \"o\"\n   33 out          ; \"!\"\n\n  ; newline\n  10 13 out out\n\n  42 outnum     ; print a number\n  10 13 out out ; ... and CRLF\n\n  ; stop program\n  halt\n"
  },
  {
    "path": "tests/todo-print.src",
    "content": "; This is a suggestion for a new \"embed\" keyword,\n; as well as support for parsing strings.\n\n&main jmp\n\nmsg: embed \"Hello, world!\\n\"\nnum: embed 40\n\nprintstr:         ; ( adr -- )\n  print-src: nop  ; placeholder\n  &print-src stor ; store src address to print-src\n\n  print-loop:\n    &print-src load        ; get ptr\n    dup +1 &print-src stor ; save ptr + 1\n    load                   ; get char\n    dup &print-exit jz     ; stop if '\\0'\n    out                    ; print character\n    &print-loop jmp        ; loop\n\n  print-exit: popip\n\nmain:\n  &msg printstr              ; print \"Hello, world\\n\"\n  &num +1 +1 outnum '\\n' out ; print \"42\"\n"
  },
  {
    "path": "tests/yo.src",
    "content": "; simple code to demonstrate compile-and-run\n'y' out\n'o' out\n'!' out \n'\\n' out\n"
  },
  {
    "path": "upper.cpp",
    "content": "/*\n * Made in 2010 by Christian Stigen Larsen\n * http://csl.sublevel3.org\n *\n * Placed in the public domain by the author.\n *\n */\n\n#include <ctype.h>\n#include \"upper.hpp\"\n\nstd::string upper(const std::string& s)\n{\n  std::string r(s);\n\n  for ( int n=0, l=s.length(); n<l; ++n )\n    r[n] = toupper(r[n]);\n\n  return r;\n}\n\n"
  }
]