[
  {
    "path": ".gitignore",
    "content": "# Intermediate and built stuff.\n.sass-cache/\n/build/\n/gen/\nclox\n*.class\nexercises/chapter01_introduction/3/linked_list\n.idea/\n\n# I keep a scratch file at the top level to try stuff out.\ntemp.lox\n\n# XCode user-specific stuff.\nxcuserdata/\n\n# Dart stuff.\n/tool/.dart_tool/\n/tool/.packages\n"
  },
  {
    "path": "LICENSE",
    "content": "Copyright (c) 2015 Robert Nystrom\n\n---------------------------------- Commentary ----------------------------------\n\nThe licensing story for this repository is a little complex. Here's my\nmotivation:\n\n* I want you to get as much use out of the material here as possible. I wrote\n  this book to help you, and I don't want you to be encumbered when it comes to\n  making the most of it. That's also why I put it online for free.\n\n* With my previous book, collaboration on GitHub was immensely helpful. I want\n  to ensure people can fork the repo, send me fixes, etc. without violating the\n  license or feeling weird.\n\n* When it comes to code, I'm completely comfortable with people redistributing,\n  remixing, changing, whatever with it. I've been using the MIT license for open\n  source stuff for decades.\n\n  This book contains two complete interpreters and I would be delighted for them\n  to be the jumping-off point for any number of real full-featured language\n  implementations.\n\n* When it comes to my prose, illustrations, and the visual design of the site,\n  that feels a little more, I don't know, *me* than the code. The words are in\n  my voice, the drawings are literally my handwriting, and the look of the site\n  is part of the book's and, by extension, my brand.\n\n  I feel weird thinking about someone, say taking one of the chapters and making\n  significant changes to it to fit their writing style while still having some\n  of it read like it came from me. Likewise, I'd be sad to see another site\n  online that looked exactly like mine because it reuses my stylesheets.\n\n* My previous book ended up being translated into several languages. 
I want to\n  be careful to not be so permissive that it prevents me from signing typical\n  contracts that give them exclusive translation rights to certain territories\n  and languages.\n\n* If I allow the prose and illustrations to be redistributed commercially, there\n  is nothing preventing someone from slapping together a cheap print or ebook\n  version of the book and putting it up for sale. I'm not too worried about my\n  own sales being undercut, but I very much want to avoid readers finding\n  themselves with a low quality book that they incorrectly think is from me.\n\n  I worked very hard on this book. I want you to get the best possible\n  experience.\n\nAll of this is way more complex than I'd like, especially since my brain isn't\nwired to care about intellectual property. I like thinking about making stuff,\nnot thinking about the legal rights around the stuff I made. (If your brain is\nwired to think about legal stuff and you see that I'm doing something dumb,\nplease do let me know.)\n\nThe best solution I've been able to come up with is to use two licenses:\n\n---------------------------------- License(s) ----------------------------------\n\nEach file in this repository falls under one of two licenses. 
Files whose\nextension is \".c\", \".dart\", \".h\", \".java\", or \".lox\" use the MIT license:\n\n    Permission is hereby granted, free of charge, to any person obtaining a copy\n    of this software and associated documentation files (the \"Software\"), to\n    deal in the Software without restriction, including without limitation the\n    rights to use, copy, modify, merge, publish, distribute, sublicense, and/or\n    sell copies of the Software, and to permit persons to whom the Software is\n    furnished to do so, subject to the following conditions:\n\n    The above copyright notice and this permission notice shall be included in\n    all copies or substantial portions of the Software.\n\n    THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\n    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING\n    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS\n    IN THE SOFTWARE.\n\nAll other files, including (but not limited to) \".md\" (except for\n\"book/appendix-i.md\" which uses the MIT license above), \".png\", \".jpg\", \".html\",\n\".scss\", \".css\", and \".txt\" use this Creative Commons license:\n\n    Attribution-NonCommercial-NoDerivatives 4.0\n    International (CC BY-NC-ND 4.0)\n\n    https://creativecommons.org/licenses/by-nc-nd/4.0/\n"
  },
  {
    "path": "Makefile",
    "content": "BUILD_DIR := build\nTOOL_SOURCES := tool/pubspec.lock $(shell find tool -name '*.dart')\nBUILD_SNAPSHOT := $(BUILD_DIR)/build.dart.snapshot\nTEST_SNAPSHOT := $(BUILD_DIR)/test.dart.snapshot\n\ndefault: book clox jlox\n\n# Run dart pub get on tool directory.\nget:\n\t@ cd ./tool; dart pub get\n\n# Remove all build outputs and intermediate files.\nclean:\n\t@ rm -rf $(BUILD_DIR)\n\t@ rm -rf gen\n\n# Build the site.\nbook: $(BUILD_SNAPSHOT)\n\t@ dart $(BUILD_SNAPSHOT)\n\n# Run a local development server for the site that rebuilds automatically.\nserve: $(BUILD_SNAPSHOT)\n\t@ dart $(BUILD_SNAPSHOT) --serve\n\n$(BUILD_SNAPSHOT): $(TOOL_SOURCES)\n\t@ mkdir -p build\n\t@ echo \"Compiling Dart snapshot...\"\n\t@ dart --snapshot=$@ --snapshot-kind=app-jit tool/bin/build.dart >/dev/null\n\n# Run the tests for the final versions of clox and jlox.\ntest: debug jlox $(TEST_SNAPSHOT)\n\t@- dart $(TEST_SNAPSHOT) clox\n\t@ dart $(TEST_SNAPSHOT) jlox\n\n# Run the tests for the final version of clox.\ntest_clox: debug $(TEST_SNAPSHOT)\n\t@ dart $(TEST_SNAPSHOT) clox\n\n# Run the tests for the final version of jlox.\ntest_jlox: jlox $(TEST_SNAPSHOT)\n\t@ dart $(TEST_SNAPSHOT) jlox\n\n# Run the tests for every chapter's version of clox.\ntest_c: debug c_chapters $(TEST_SNAPSHOT)\n\t@ dart $(TEST_SNAPSHOT) c\n\n# Run the tests for every chapter's version of jlox.\ntest_java: jlox java_chapters $(TEST_SNAPSHOT)\n\t@ dart $(TEST_SNAPSHOT) java\n\n# Run the tests for every chapter's version of clox and jlox.\ntest_all: debug jlox c_chapters java_chapters compile_snippets $(TEST_SNAPSHOT)\n\t@ dart $(TEST_SNAPSHOT) all\n\n$(TEST_SNAPSHOT): $(TOOL_SOURCES)\n\t@ mkdir -p build\n\t@ echo \"Compiling Dart snapshot...\"\n\t@ dart --snapshot=$@ --snapshot-kind=app-jit tool/bin/test.dart clox >/dev/null\n\n# Compile a debug build of clox.\ndebug:\n\t@ $(MAKE) -f util/c.make NAME=cloxd MODE=debug SOURCE_DIR=c\n\n# Compile the C interpreter.\nclox:\n\t@ $(MAKE) -f util/c.make 
NAME=clox MODE=release SOURCE_DIR=c\n\t@ cp build/clox clox # For convenience, copy the interpreter to the top level.\n\n# Compile the C interpreter as ANSI standard C++.\ncpplox:\n\t@ $(MAKE) -f util/c.make NAME=cpplox MODE=debug CPP=true SOURCE_DIR=c\n\n# Compile and run the AST generator.\ngenerate_ast:\n\t@ $(MAKE) -f util/java.make DIR=java PACKAGE=tool\n\t@ java -cp build/java com.craftinginterpreters.tool.GenerateAst \\\n\t\t\tjava/com/craftinginterpreters/lox\n\n# Compile the Java interpreter .java files to .class files.\njlox: generate_ast\n\t@ $(MAKE) -f util/java.make DIR=java PACKAGE=lox\n\nrun_generate_ast = @ java -cp build/gen/$(1) \\\n\t\t\tcom.craftinginterpreters.tool.GenerateAst \\\n\t\t\tgen/$(1)/com/craftinginterpreters/lox\n\njava_chapters: split_chapters\n\t@ $(MAKE) -f util/java.make DIR=gen/chap04_scanning PACKAGE=lox\n\n\t@ $(MAKE) -f util/java.make DIR=gen/chap05_representing PACKAGE=tool\n\t$(call run_generate_ast,chap05_representing)\n\t@ $(MAKE) -f util/java.make DIR=gen/chap05_representing PACKAGE=lox\n\n\t@ $(MAKE) -f util/java.make DIR=gen/chap06_parsing PACKAGE=tool\n\t$(call run_generate_ast,chap06_parsing)\n\t@ $(MAKE) -f util/java.make DIR=gen/chap06_parsing PACKAGE=lox\n\n\t@ $(MAKE) -f util/java.make DIR=gen/chap07_evaluating PACKAGE=tool\n\t$(call run_generate_ast,chap07_evaluating)\n\t@ $(MAKE) -f util/java.make DIR=gen/chap07_evaluating PACKAGE=lox\n\n\t@ $(MAKE) -f util/java.make DIR=gen/chap08_statements PACKAGE=tool\n\t$(call run_generate_ast,chap08_statements)\n\t@ $(MAKE) -f util/java.make DIR=gen/chap08_statements PACKAGE=lox\n\n\t@ $(MAKE) -f util/java.make DIR=gen/chap09_control PACKAGE=tool\n\t$(call run_generate_ast,chap09_control)\n\t@ $(MAKE) -f util/java.make DIR=gen/chap09_control PACKAGE=lox\n\n\t@ $(MAKE) -f util/java.make DIR=gen/chap10_functions PACKAGE=tool\n\t$(call run_generate_ast,chap10_functions)\n\t@ $(MAKE) -f util/java.make DIR=gen/chap10_functions PACKAGE=lox\n\n\t@ $(MAKE) -f util/java.make 
DIR=gen/chap11_resolving PACKAGE=tool\n\t$(call run_generate_ast,chap11_resolving)\n\t@ $(MAKE) -f util/java.make DIR=gen/chap11_resolving PACKAGE=lox\n\n\t@ $(MAKE) -f util/java.make DIR=gen/chap12_classes PACKAGE=tool\n\t$(call run_generate_ast,chap12_classes)\n\t@ $(MAKE) -f util/java.make DIR=gen/chap12_classes PACKAGE=lox\n\n\t@ $(MAKE) -f util/java.make DIR=gen/chap13_inheritance PACKAGE=tool\n\t$(call run_generate_ast,chap13_inheritance)\n\t@ $(MAKE) -f util/java.make DIR=gen/chap13_inheritance PACKAGE=lox\n\nc_chapters: split_chapters\n\t@ $(MAKE) -f util/c.make NAME=chap14_chunks MODE=release SOURCE_DIR=gen/chap14_chunks\n\t@ $(MAKE) -f util/c.make NAME=chap15_virtual MODE=release SOURCE_DIR=gen/chap15_virtual\n\t@ $(MAKE) -f util/c.make NAME=chap16_scanning MODE=release SOURCE_DIR=gen/chap16_scanning\n\t@ $(MAKE) -f util/c.make NAME=chap17_compiling MODE=release SOURCE_DIR=gen/chap17_compiling\n\t@ $(MAKE) -f util/c.make NAME=chap18_types MODE=release SOURCE_DIR=gen/chap18_types\n\t@ $(MAKE) -f util/c.make NAME=chap19_strings MODE=release SOURCE_DIR=gen/chap19_strings\n\t@ $(MAKE) -f util/c.make NAME=chap20_hash MODE=release SOURCE_DIR=gen/chap20_hash\n\t@ $(MAKE) -f util/c.make NAME=chap21_global MODE=release SOURCE_DIR=gen/chap21_global\n\t@ $(MAKE) -f util/c.make NAME=chap22_local MODE=release SOURCE_DIR=gen/chap22_local\n\t@ $(MAKE) -f util/c.make NAME=chap23_jumping MODE=release SOURCE_DIR=gen/chap23_jumping\n\t@ $(MAKE) -f util/c.make NAME=chap24_calls MODE=release SOURCE_DIR=gen/chap24_calls\n\t@ $(MAKE) -f util/c.make NAME=chap25_closures MODE=release SOURCE_DIR=gen/chap25_closures\n\t@ $(MAKE) -f util/c.make NAME=chap26_garbage MODE=release SOURCE_DIR=gen/chap26_garbage\n\t@ $(MAKE) -f util/c.make NAME=chap27_classes MODE=release SOURCE_DIR=gen/chap27_classes\n\t@ $(MAKE) -f util/c.make NAME=chap28_methods MODE=release SOURCE_DIR=gen/chap28_methods\n\t@ $(MAKE) -f util/c.make NAME=chap29_superclasses MODE=release 
SOURCE_DIR=gen/chap29_superclasses\n\t@ $(MAKE) -f util/c.make NAME=chap30_optimization MODE=release SOURCE_DIR=gen/chap30_optimization\n\ncpp_chapters: split_chapters\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap14_chunks MODE=release CPP=true SOURCE_DIR=gen/chap14_chunks\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap15_virtual MODE=release CPP=true SOURCE_DIR=gen/chap15_virtual\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap16_scanning MODE=release CPP=true SOURCE_DIR=gen/chap16_scanning\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap17_compiling MODE=release CPP=true SOURCE_DIR=gen/chap17_compiling\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap18_types MODE=release CPP=true SOURCE_DIR=gen/chap18_types\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap19_strings MODE=release CPP=true SOURCE_DIR=gen/chap19_strings\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap20_hash MODE=release CPP=true SOURCE_DIR=gen/chap20_hash\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap21_global MODE=release CPP=true SOURCE_DIR=gen/chap21_global\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap22_local MODE=release CPP=true SOURCE_DIR=gen/chap22_local\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap23_jumping MODE=release CPP=true SOURCE_DIR=gen/chap23_jumping\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap24_calls MODE=release CPP=true SOURCE_DIR=gen/chap24_calls\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap25_closures MODE=release CPP=true SOURCE_DIR=gen/chap25_closures\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap26_garbage MODE=release CPP=true SOURCE_DIR=gen/chap26_garbage\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap27_classes MODE=release CPP=true SOURCE_DIR=gen/chap27_classes\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap28_methods MODE=release CPP=true SOURCE_DIR=gen/chap28_methods\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap29_superclasses MODE=release CPP=true SOURCE_DIR=gen/chap29_superclasses\n\t@ $(MAKE) -f util/c.make NAME=cpp_chap30_optimization MODE=release CPP=true SOURCE_DIR=gen/chap30_optimization\n\ndiffs: split_chapters java_chapters\n\t@ mkdir -p 
build/diffs\n\t@ -diff --recursive --new-file nonexistent/ gen/chap04_scanning/com/craftinginterpreters/ > build/diffs/chap04_scanning.diff\n\t@ -diff --recursive --new-file gen/chap04_scanning/com/craftinginterpreters/ gen/chap05_representing/com/craftinginterpreters/ > build/diffs/chap05_representing.diff\n\t@ -diff --recursive --new-file gen/chap05_representing/com/craftinginterpreters/ gen/chap06_parsing/com/craftinginterpreters/ > build/diffs/chap06_parsing.diff\n\t@ -diff --recursive --new-file gen/chap06_parsing/com/craftinginterpreters/ gen/chap07_evaluating/com/craftinginterpreters/ > build/diffs/chap07_evaluating.diff\n\t@ -diff --recursive --new-file gen/chap07_evaluating/com/craftinginterpreters/ gen/chap08_statements/com/craftinginterpreters/ > build/diffs/chap08_statements.diff\n\t@ -diff --recursive --new-file gen/chap08_statements/com/craftinginterpreters/ gen/chap09_control/com/craftinginterpreters/ > build/diffs/chap09_control.diff\n\t@ -diff --recursive --new-file gen/chap09_control/com/craftinginterpreters/ gen/chap10_functions/com/craftinginterpreters/ > build/diffs/chap10_functions.diff\n\t@ -diff --recursive --new-file gen/chap10_functions/com/craftinginterpreters/ gen/chap11_resolving/com/craftinginterpreters/ > build/diffs/chap11_resolving.diff\n\t@ -diff --recursive --new-file gen/chap11_resolving/com/craftinginterpreters/ gen/chap12_classes/com/craftinginterpreters/ > build/diffs/chap12_classes.diff\n\t@ -diff --recursive --new-file gen/chap12_classes/com/craftinginterpreters/ gen/chap13_inheritance/com/craftinginterpreters/ > build/diffs/chap13_inheritance.diff\n\n\t@ -diff --new-file nonexistent/ gen/chap14_chunks/ > build/diffs/chap14_chunks.diff\n\t@ -diff --new-file gen/chap14_chunks/ gen/chap15_virtual/ > build/diffs/chap15_virtual.diff\n\t@ -diff --new-file gen/chap15_virtual/ gen/chap16_scanning/ > build/diffs/chap16_scanning.diff\n\t@ -diff --new-file gen/chap16_scanning/ gen/chap17_compiling/ > 
build/diffs/chap17_compiling.diff\n\t@ -diff --new-file gen/chap17_compiling/ gen/chap18_types/ > build/diffs/chap18_types.diff\n\t@ -diff --new-file gen/chap18_types/ gen/chap19_strings/ > build/diffs/chap19_strings.diff\n\t@ -diff --new-file gen/chap19_strings/ gen/chap20_hash/ > build/diffs/chap20_hash.diff\n\t@ -diff --new-file gen/chap20_hash/ gen/chap21_global/ > build/diffs/chap21_global.diff\n\t@ -diff --new-file gen/chap21_global/ gen/chap22_local/ > build/diffs/chap22_local.diff\n\t@ -diff --new-file gen/chap22_local/ gen/chap23_jumping/ > build/diffs/chap23_jumping.diff\n\t@ -diff --new-file gen/chap23_jumping/ gen/chap24_calls/ > build/diffs/chap24_calls.diff\n\t@ -diff --new-file gen/chap24_calls/ gen/chap25_closures/ > build/diffs/chap25_closures.diff\n\t@ -diff --new-file gen/chap25_closures/ gen/chap26_garbage/ > build/diffs/chap26_garbage.diff\n\t@ -diff --new-file gen/chap26_garbage/ gen/chap27_classes/ > build/diffs/chap27_classes.diff\n\t@ -diff --new-file gen/chap27_classes/ gen/chap28_methods/ > build/diffs/chap28_methods.diff\n\t@ -diff --new-file gen/chap28_methods/ gen/chap29_superclasses/ > build/diffs/chap29_superclasses.diff\n\t@ -diff --new-file gen/chap29_superclasses/ gen/chap30_optimization/ > build/diffs/chap30_optimization.diff\n\nsplit_chapters:\n\t@ dart tool/bin/split_chapters.dart\n\ncompile_snippets:\n\t@ dart tool/bin/compile_snippets.dart\n\n# Generate the XML for importing into InDesign.\nxml: $(TOOL_SOURCES)\n\t@ dart --enable-asserts tool/bin/build_xml.dart\n\n.PHONY: book c_chapters clean clox compile_snippets debug default diffs \\\n\tget java_chapters jlox serve split_chapters test test_all test_c test_java\n"
  },
  {
    "path": "README.md",
    "content": "This is the repo used for the in-progress book \"[Crafting Interpreters][]\". It\ncontains the Markdown text of the book, full implementations of both\ninterpreters, as well as the build system to weave the two together into the\nfinal site.\n\n[crafting interpreters]: http://craftinginterpreters.com\n\nIf you find an error or have a suggestion, please do file an issue here. Thank\nyou!\n\n## Contributing\n\nOne of the absolute best things about writing a book online and putting it out\nthere before it's done is that people like you have been kind enough to give me\nfeedback, point out typos, and find other errors or unclear text.\n\nIf you'd like to do that, great! You can just file bugs here on the repo, or\nsend a pull request if you're so inclined. If you want to send a pull request,\nbut don't want to get the build system set up to regenerate the HTML too, don't\nworry about it. I'll do that when I pull it in.\n\n## Ports and implementations\n\nAnother way to get involved is by sharing your own implementation of Lox. Ports\nto other languages are particularly useful since not every reader likes Java and\nC. Feel free to add your Lox port or implementation to the wiki:\n\n* [Lox implementations][]\n\n[lox implementations]: https://github.com/munificent/craftinginterpreters/wiki/Lox-implementations\n\n## Building Stuff\n\nI am a terribly forgetful, error-prone mammal, so I automated as much as I\ncould.\n\n### Prerequisites\n\nI develop on an OS X machine, but any POSIX system should work too. With a\nlittle extra effort, you should be able to get this working on Windows as well,\nthough I can't help you out much.\n\nMost of the work is orchestrated by make. The build scripts, test runner, and\nother utilities are all written in [Dart][]. Instructions to install Dart are\n[here][install]. 
Once you have Dart installed and on your path, run:\n\n```sh\n$ make get\n```\n\n[dart]: https://dart.dev/\n[install]: https://dart.dev/get-dart\n\nThis downloads all of the packages used by the build and test scripts.\n\nIn order to compile the two interpreters, you also need a C compiler on your\npath as well as `javac`.\n\n### Building\n\nOnce you've got that set up, try:\n\n```sh\n$ make\n```\n\nIf everything is working, that will generate the site for the book as well as\ncompile the two interpreters clox and jlox. You can run either interpreter\nright from the root of the repo:\n\n```sh\n$ ./clox\n$ ./jlox\n```\n\n### Hacking on the book\n\nThe Markdown and snippets of source code are woven together into the final HTML\nusing a hand-written static site generator that started out as a [single tiny\nPython script][py] for [my first book][gpp] and somehow grew into something\napproximating a real program.\n\n[py]: https://github.com/munificent/game-programming-patterns/blob/master/script/format.py\n[gpp]: http://gameprogrammingpatterns.com/\n\nThe generated HTML is committed in the repo under `site/`. It is built from a\ncombination of Markdown for prose, which lives in `book/`, and snippets of code\nthat are woven in from the Java and C implementations in `java/` and `c/`. (All\nof those funny-looking comments in the source code are how it knows which\nsnippet goes where.)\n\nThe script that does all the magic is `tool/bin/build.dart`. You can run that\ndirectly, or run:\n\n```sh\n$ make book\n```\n\nThat generates the entire site in one batch. If you are incrementally working\non it, you'll want to run the development server:\n\n```sh\n$ make serve\n```\n\nThis runs a little HTTP server on localhost rooted at the `site/` directory.\nAny time you request a page, it regenerates any files whose sources have been\nchanged, including Markdown files, interpreter source files, templates, and\nassets. 
Just let that keep running, edit files locally, and refresh your\nbrowser to see the changes.\n\n### Building the interpreters\n\nYou can build each interpreter like so:\n\n```sh\n$ make clox\n$ make jlox\n```\n\nThis builds the final version of each interpreter as it appears at the end of\nits part in the book.\n\nYou can also see what the interpreters look like at the end of each chapter. (I\nuse this to make sure they are working even in the middle of the book.) This is\ndriven by a script, `tool/bin/split_chapters.dart` that uses the same comment\nmarkers for the code snippets to determine which chunks of code are present in\neach chapter. It takes only the snippets that have been seen by the end of each\nchapter and produces a new copy of the source in `gen/`, one directory for each\nchapter's code. (These are also an easier way to view the source code since they\nhave all of the distracting marker comments stripped out.)\n\nThen, each of those can be built separately. Run:\n\n```sh\n$ make c_chapters\n```\n\nAnd in the `build/` directory, you'll get an executable for each chapter, like\n`chap14_chunks`, etc. Likewise:\n\n```sh\n$ make java_chapters\n```\n\nThis compiles the Java code to classfiles in `build/gen/` in a subdirectory for\neach chapter.\n\n## Testing\n\nI have a full Lox test suite that I use to ensure the interpreters in the book\ndo what they're supposed to do. The test cases live in `test/`. 
The Dart\nprogram `tool/bin/test.dart` is a test runner that runs each of those test\nfiles on a Lox interpreter, parses the result, and validates that the test\ndoes what it's expected to do.\n\nThere are various interpreters you can run the tests against:\n\n```sh\n$ make test       # The final versions of clox and jlox.\n$ make test_clox  # The final version of clox.\n$ make test_jlox  # The final version of jlox.\n$ make test_c     # Every chapter's version of clox.\n$ make test_java  # Every chapter's version of jlox.\n$ make test_all   # All of the above.\n```\n\n### Testing your implementation\n\nYou are welcome to use the test suite and the test runner to test your own Lox\nimplementation. The test runner is at `tool/bin/test.dart` and can be given a\ncustom interpreter executable to run using `--interpreter`. For example, if you\nhad an interpreter executable at `my_code/boblox`, you could test it like:\n\n```sh\n$ dart tool/bin/test.dart clox --interpreter my_code/boblox\n```\n\nYou still need to tell it which suite of tests to run because that determines\nthe test expectations. If your interpreter should behave like jlox, use \"jlox\"\nas the suite name. If it behaves like clox, use \"clox\". If your interpreter is\nonly complete up to the end of one of the chapters in the book, you can use\nthat chapter as the suite, like \"chap10_functions\". See the Makefile for the\nnames of all of the chapters.\n\nIf your interpreter needs other command line arguments passed to it, pass them\nto the test runner using `--arguments` and it will forward them to your\ninterpreter.\n\n## Repository Layout\n\n*   `asset/` – Sass files and Mustache templates used to generate the site.\n*   `book/` – Markdown files for the text of each chapter.\n*   `build/` – Intermediate files and other build output (except for the site\n    itself) go here. Not committed to Git.\n*   `c/` – Source code of clox, the interpreter written in C. 
Also contains an\n    Xcode project, if that's your thing.\n*   `gen/` – Java source files generated by GenerateAst.java and the\n    per-chapter copies of the source produced by\n    `tool/bin/split_chapters.dart` go here. Not committed.\n*   `java/` – Source code of jlox, the interpreter written in Java.\n*   `note/` – Various research, notes, TODOs, and other miscellanea.\n*   `note/answers` – Sample answers for the challenges. No cheating!\n*   `site/` – The final generated site. The contents of this directory directly\n    mirror craftinginterpreters.com. Most content here is generated by\n    `tool/bin/build.dart`, but fonts, images, and JS only live here. Everything\n    is committed, even the generated content.\n*   `test/` – Test cases for the Lox implementations.\n*   `tool/` – Dart package containing the build, test, and other scripts.\n"
  },
  {
    "path": "asset/index.scss",
    "content": "@import 'sass/shared';\n@import 'sass/sign-up';\n\nbody, h1, h2, h3, h4, p, blockquote, code, ul, ol, dl, dd, img {\n  margin: 0;\n}\n\nbody {\n  background: $dark url('image/background.png') top center / 100% auto no-repeat;\n  color: #222;\n  font: normal 16px/24px $serif;\n}\n\na {\n  color: $primary;\n  text-decoration: none;\n\n  border-bottom: solid 1px transparentize($light, 1.0);\n\n  transition: color 0.2s ease,\n              border-color 0.4s ease;\n}\n\na:hover {\n  color: $primary;\n  border-bottom: solid 1px opacify($light, 1.0);\n}\n\narticle {\n  margin: 0 auto;\n  padding: 0 0 12px 0;\n  max-width: $col * 20;\n  background: #fff;\n}\n\nheader {\n  margin: 0 0 $col 0;\n  color: $warm-dark;\n  background: $warm-5;\n  border-bottom: solid 1px $warm-4;\n}\n\nmain {\n  margin: 0 $col;\n}\n\nimg.header {\n  display: block;\n  width: 100%;\n}\n\nimg.small {\n  display: none;\n}\n\ndiv.intro {\n  display: flex;\n\n  blockquote {\n    flex-basis: 40%;\n    margin: 0 $col 0 0;\n    font: italic 28px/42px $serif;\n  }\n\n  div.text {\n    flex-basis: 60%;\n    margin: 8px 0 24px 0;\n  }\n}\n\np + p {\n  margin-top: 24px;\n}\n\n.format {\n  margin: 0 -12px 24px -12px;\n  padding: 12px 12px 8px 12px;\n  height: 244px;\n\n  box-sizing: border-box;\n  background: $lighter;\n  background-size: cover;\n  background-position: left;\n\n  color: #444;\n  border-radius: 3px;\n  font: normal 16px/24px $nav;\n\n  h3 {\n    margin: 0;\n    padding: 0 0 4px 0;\n    font: 600 16px/24px $nav;\n    text-transform: uppercase;\n    letter-spacing: 1px;\n  }\n\n  p {\n    margin-bottom: 8px;\n  }\n}\n\n.format.print, .format.pdf {\n  background-position: right;\n  text-align: right;\n}\n\n.format-info {\n  display: inline-block;\n  width: $col * 8;\n  text-align: left;\n\n  table {\n    width: 100%;\n    border-collapse: collapse;\n\n    td + td {\n      padding-left: 5px;\n    }\n  }\n}\n\n.format.print { background-image: url(\"image/format-print.jpg\"); 
}\n.format.ebook { background-image: url(\"image/format-ebook.jpg\"); }\n.format.pdf {   background-image: url(\"image/format-pdf.jpg\"); }\n.format.web {   background-image: url(\"image/format-web.jpg\"); }\n\na.action {\n  display: block;\n\n  margin: 0 0 4px 0;\n  padding: 4px 0;\n  text-align: center;\n  border-radius: 3px;\n  background: $primary;\n\n  transition: background-color 0.2s ease,\n              color 0.2s ease;\n\n  font: 400 17px/24px $nav;\n  color: white;\n\n  small {\n    font-size: 14px;\n    padding: 4px;\n    color: hsla(0, 0, 100%, 0.7);\n    transition: color 0.2s ease;\n  }\n}\n\na.action:hover {\n  background-color: hsl(200, 85%, 55%);\n\n  small {\n    color: white;\n  }\n}\n\n  h3 {\n    font: italic 24px/24px $serif;\n    margin: 12px 0;\n  }\n\nimg.author {\n  float: left;\n  width: 240px;\n  margin: 0 12px 0 -12px;\n  padding: 12px;\n\n  background: $warm-5;\n  border-radius: 3px;\n}\n\ndiv.author {\n  vertical-align: top;\n  margin: 36px 0 0 240px + $col;\n}\n\nfooter {\n  position: relative;\n  border-top: solid 1px $light;\n  color: $gray-4;\n  font: 400 15px $nav;\n  text-align: center;\n  margin: 12px 0 36px 0;\n  padding-top: 48px;\n\n  a, a:hover {\n    border: none;\n  }\n}\n\n@media only screen and (max-width: 700px) {\n  main {\n    margin: 0 24px;\n  }\n\n  header {\n    margin-bottom: 24px;\n  }\n\n  img.big {\n    display: none;\n  }\n\n  img.small {\n    display: block;\n  }\n\n  div.intro {\n    display: block;\n\n    blockquote {\n      display: block;\n      font: italic 24px/36px $serif;\n    }\n\n    div.text {\n      display: block;\n      margin: 24px 0 24px 0;\n    }\n  }\n\n  .format {\n    margin-bottom: 12px;\n    height: auto;\n    background-blend-mode: lighten;\n  }\n\n  .format-info {\n    display: block;\n    width: 100%;\n  }\n\n  .format.print { background-color: #a6a29f; }\n  .format.ebook { background-color: #97a2aa; }\n  .format.pdf {   background-color: #cfccca; }\n  .format.web {   
background-color: #d6dbd3; }\n\n  img.author {\n    float: none;\n  }\n\n  div.author {\n    margin: 0 0 0 0;\n  }\n}\n"
  },
  {
    "path": "asset/mustache/contents-nav.html",
    "content": "<h2><a href=\"#top\"><small>&nbsp;</small> Table of Contents</a></h2>\n<ul>\n  <li><a href=\"#welcome\"><small>I</small>Welcome</a></li>\n  <li><a href=\"#a-tree-walk-interpreter\"><small>II</small>A Tree-Walk Interpreter</a></li>\n  <li><a href=\"#a-bytecode-virtual-machine\"><small>III</small>A Bytecode Virtual Machine</a></li>\n  <li><a href=\"#backmatter\"><small>&#10087;</small>Backmatter</a></li>\n</ul>\n{{> prev-next }}\n"
  },
  {
    "path": "asset/mustache/contents-part.html",
    "content": "<h2><span class=\"num\">{{ number }}.</span><a href=\"{{ file }}.html\" name=\"{{ file }}\">{{ title }}</a></h2>\n<ul>\n{{# chapters }}\n  <li><span class=\"num\">{{ number }}.</span><a href=\"{{ file }}.html\">{{ title }}</a>\n  </li>\n  {{# design_note }}\n  <li class=\"design-note\">\n  <span class=\"num\">&nbsp;</span><a href=\"{{ file }}.html#design-note\">Design Note: {{{ design_note }}}</a>\n  </li>\n  {{/ design_note }}\n{{/ chapters }}\n</ul>"
  },
  {
    "path": "asset/mustache/contents.html",
    "content": "{{> header }}\n\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n    {{> contents-nav }}\n  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n{{# has_prev }}\n<a href=\"{{ prev_file }}.html\" title=\"{{ prev }}\" class=\"prev\">←</a>\n{{/ has_prev }}\n{{# has_next }}\n<a href=\"{{ next_file }}.html\" title=\"{{ next }}\" class=\"next\">→</a>\n{{/ has_next }}\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n    {{> contents-nav }}\n  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"contents\">\n\n<h1 class=\"part\">{{title}}</h1>\n\n<div class=\"chapters\">\n  <div class=\"row\">\n    <div class=\"first\">\n    <h2><span class=\"num\">&#10087;</span>Frontmatter</h2>\n    <ul>\n      <li><span class=\"num\">&nbsp;</span><a href=\"dedication.html\">Dedication</a></li>\n      <li><span class=\"num\">&nbsp;</span><a href=\"acknowledgements.html\">Acknowledgements</a></li>\n    </ul>\n\n    {{# part_1 }}\n      {{> contents-part }}\n    {{/ part_1 }}\n    {{# part_2 }}\n      {{> contents-part }}\n    {{/ part_2 }}\n    </div>\n    <div class=\"second\">\n    {{# part_3 }}\n      {{> contents-part }}\n    {{/ part_3 }}\n\n    <h2><span class=\"num\">&#10087;</span><a href=\"backmatter.html\" name=\"backmatter\">Backmatter</a></h2>\n    <ul>\n      <li><span class=\"num\">A1.</span><a href=\"appendix-i.html\">Appendix I: Lox Grammar</a></li>\n      <li><span class=\"num\">A2.</span><a href=\"appendix-ii.html\">Appendix II: Generated Syntax Tree Classes</a></li>\n    </ul>\n    </div>\n  </div>\n</div>\n\n<footer>\n  <a href=\"{{ next_file }}.html\" class=\"next\">\n    First {{ next_type }}: &ldquo;{{ next }}&rdquo; 
&rarr;\n  </a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2020</a>\n</footer>\n</article>\n\n{{> footer }}\n"
  },
  {
    "path": "asset/mustache/footer.html",
    "content": "</div>\n</body>\n</html>\n"
  },
  {
    "path": "asset/mustache/header.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>{{title}} &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n"
  },
  {
    "path": "asset/mustache/in_design.html",
    "content": "<chapter>\n<chapter-number>{{ number }}</chapter-number>\n<title>{{ title }}</title>\n<part>{{ part }}</part>\n{{{ body }}}\n</chapter>\n"
  },
  {
    "path": "asset/mustache/index.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Crafting Interpreters</title>\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"index.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body>\n\n<article>\n\n<header>\n  <a href=\"dedication.html\"><img class=\"header big\" src=\"image/header.png\" alt=\"Crafting Interpreters by Robert Nystrom\" /><img class=\"header small\" src=\"image/header-small.png\" alt=\"Crafting Interpreters by Robert Nystrom\" /></a>\n</header>\n\n<main>\n\n<div class=\"intro\">\n\n<blockquote><p>Ever wanted to make your own programming language or wondered how\nthey are designed and built?</p><p>If so, this book is for you.</p></blockquote>\n\n<div class=\"text\">\n\n<p><em>Crafting Interpreters</em> contains everything you need to implement a\nfull-featured, efficient scripting language. You&#8217;ll learn both high-level\nconcepts around parsing and semantics and gritty details like bytecode\nrepresentation and garbage collection. 
Your brain will light up with new ideas,\nand your hands will get dirty and calloused. It&#8217;s a blast.</p>\n\n<p>Starting from <code>main()</code>, you build a language that features rich\nsyntax, dynamic typing, garbage collection, lexical scope, first-class\nfunctions, closures, classes, and inheritance. All packed into a few thousand\nlines of clean, fast code that you thoroughly understand because you write each\none yourself.</p>\n\n<p>The book is available in four delectable formats:</p>\n\n</div>\n\n</div>\n\n<div class=\"format print\">\n  <div class=\"format-info\">\n    <h3>Print</h3>\n    <p>640 pages of beautiful typography and high resolution hand-drawn\n    illustrations. Each page lovingly typeset by the author. The premiere reading\n    experience.</p>\n    <table>\n    <tr>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.com/dp/0990582930\" target=\"_blank\">Amazon<small>.com</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.ca/dp/0990582930\" target=\"_blank\"><small>.ca</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.co.uk/dp/0990582930\" target=\"_blank\"><small>.uk</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.com.au/dp/0990582930\" target=\"_blank\"><small>.au</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.de/dp/0990582930\" target=\"_blank\"><small>.de</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.fr/dp/0990582930\" target=\"_blank\"><small>.fr</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.es/dp/0990582930\" target=\"_blank\"><small>.es</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.it/dp/0990582930\" target=\"_blank\"><small>.it</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.co.jp/dp/0990582930\" 
target=\"_blank\"><small>.jp</small></a>\n    </td>\n    </tr>\n    </table>\n    <table>\n    <tr>\n    <td>\n      <a class=\"action\" href=\"https://www.barnesandnoble.com/w/crafting-interpreters-robert-nystrom/1139915245?ean=9780990582939\" target=\"_blank\">Barnes and Noble</a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.bookdepository.com/Crafting-Interpreters-Robert-Nystrom/9780990582939\" target=\"_blank\">Book Depository</a>\n    </td>\n    </tr>\n    </table>\n    <a class=\"action\" href=\"/sample.pdf\" target=\"_blank\">Download Sample <small>PDF</small></a>\n  </div>\n</div>\n<div class=\"format ebook\">\n  <div class=\"format-info\">\n    <h3>eBook</h3>\n    <p>Carefully tuned CSS fits itself to your ebook reader and screen size.\n    Full-color syntax highlighting and live hyperlinks. Like Alan Kay's Dynabook\n    but real.</p>\n    <table>\n    <tr>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.com/dp/B09BCCVLCL\" target=\"_blank\">Kindle <small class=\"hide-small\"><span class=\"hide-medium\">Amazon</span>.com</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.co.uk/dp/B09BCCVLCL\" target=\"_blank\"><small>.uk</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.ca/dp/B09BCCVLCL\" target=\"_blank\"><small>.ca</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.com.au/dp/B09BCCVLCL\" target=\"_blank\"><small>.au</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.de/dp/B09BCCVLCL\" target=\"_blank\"><small>.de</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.in/dp/B09BCCVLCL\" target=\"_blank\"><small>.in</small></a>\n    </td>\n    </tr>\n    </table>\n    <table>\n    <tr>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.fr/dp/B09BCCVLCL\" target=\"_blank\"><small>.fr</small></a>\n    </td>\n    <td>\n      <a 
class=\"action\" href=\"https://www.amazon.es/dp/B09BCCVLCL\" target=\"_blank\"><small>.es</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.it/dp/B09BCCVLCL\" target=\"_blank\"><small>.it</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.co.jp/dp/B09BCCVLCL\" target=\"_blank\"><small>.jp</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.com.br/dp/B09BCCVLCL\" target=\"_blank\"><small>.br</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.com.mx/dp/B09BCCVLCL\" target=\"_blank\"><small>.mx</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://books.apple.com/us/book/crafting-interpreters/id1578795812\" target=\"_blank\">Apple Books</a>\n    </td>\n    </tr>\n    </table>\n    <table>\n    <tr>\n    <td>\n      <a class=\"action\" href=\"https://play.google.com/store/books/details?id=q0c6EAAAQBAJ\" target=\"_blank\">Play Books <small class=\"hide-small\">Google</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.barnesandnoble.com/w/crafting-interpreters-robert-nystrom/1139915245?ean=2940164977092\" target=\"_blank\">Nook <small class=\"hide-small\">B&amp;N</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.smashwords.com/books/view/1096463\" target=\"_blank\">EPUB <small class=\"hide-small\">Smashwords</small></a>\n    </td>\n    </tr>\n    </table>\n  </div>\n</div>\n<div class=\"format pdf\">\n  <div class=\"format-info\">\n    <h3>PDF</h3>\n    <p>Perfectly mirrors the hand-crafted typesetting and sharp illustrations of\n    the print book, but much easier to carry around.</p>\n    <a class=\"action\" href=\"https://payhip.com/b/F0zkr\" target=\"_blank\">Buy from Payhip</a>\n    <a class=\"action\" href=\"/sample.pdf\" target=\"_blank\">Download Free Sample</a>\n  </div>\n</div>\n<div class=\"format web\">\n  <div class=\"format-info\">\n    
<h3>Web</h3>\n    <p>Meticulous responsive design looks great from your desktop down to your\n    phone. Every chapter, aside, and illustration is there. Read the whole book\n    for free. Really.</p>\n    <a class=\"action\" href=\"contents.html\">Read Now</a>\n  </div>\n</div>\n\n<img src=\"image/dogshot.jpg\" class=\"author\" />\n\n<div class=\"author\">\n<h3>About Robert Nystrom</h3>\n\n<p>I got bitten by the language bug years ago while on paternity leave between\nmidnight feedings. I cobbled together a <a href=\"http://wren.io/\"\ntarget=\"_blank\">number</a> <a href=\"http://magpie-lang.org/\"\ntarget=\"_blank\">of</a> <a href=\"http://finch.stuffwithstuff.com/\"\ntarget=\"_blank\">hobby</a> <a href=\"https://github.com/munificent/vigil\"\ntarget=\"_blank\">languages</a> before worming my way into an honest-to-God,\nfull-time programming language job. Today, I work at Google on the <a\nhref=\"http://dart.dev/\" target=\"_blank\">Dart language</a>.</p>\n\n<p>Before I fell in love with languages, I developed games at Electronic Arts\nfor eight years. I wrote the best-selling book <em><a\nhref=\"http://gameprogrammingpatterns.com/\" target=\"_blank\">Game Programming\nPatterns</a></em> based on what I learned there. 
You can read that book for free\ntoo.</p>\n\n<p>If you want more, you can find me on Twitter (<a\nhref=\"https://twitter.com/intent/user?screen_name=munificentbob\"\ntarget=\"_blank\"><code>@munificentbob</code></a>), email me at <code>bob</code>\nat this site's domain (though I am slow to respond), read <a\nhref=\"http://journal.stuffwithstuff.com/\" target=\"_blank\">my blog</a>, or join\nmy low frequency mailing list:</p>\n\n<div class=\"sign-up\">\n  <!-- Begin MailChimp Signup Form -->\n  <div id=\"mc_embed_signup\">\n  <form action=\"//gameprogrammingpatterns.us7.list-manage.com/subscribe/post?u=0952ca43ed2536d6717766b88&amp;id=6e96334109\" method=\"post\" id=\"mc-embedded-subscribe-form\" name=\"mc-embedded-subscribe-form\" class=\"validate\" target=\"_blank\" novalidate>\n    <input type=\"email\" value=\"\" name=\"EMAIL\" class=\"email\" id=\"mce-EMAIL\" placeholder=\"Your email address\" required>\n    <!-- real people should not fill this in and expect good things - do not remove this or risk form bot signups -->\n    <div style=\"position: absolute; left: -5000px;\" aria-hidden=\"true\"><input type=\"text\" name=\"b_0952ca43ed2536d6717766b88_6e96334109\" tabindex=\"-1\" value=\"\"></div>\n    <input type=\"submit\" value=\"Sign me up!\" name=\"subscribe\" id=\"mc-embedded-subscribe\" class=\"button\">\n  </form>\n  </div>\n  <!--End mc_embed_signup -->\n</div>\n\n</div>\n\n<footer>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</main>\n</article>\n</body>\n</html>\n"
  },
  {
    "path": "asset/mustache/nav.html",
    "content": "{{# is_chapter }}\n<h3><a href=\"#top\">{{ title }}<small>{{ number }}</small></a></h3>\n\n<ul>\n  {{# sections }}\n    <li><a href=\"#{{ anchor }}\"><small>{{ number }}.{{ index }}</small> {{ name }}</a></li>\n  {{/ sections }}\n  {{# has_challenges_or_design_note }}\n    <li class=\"divider\"></li>\n  {{/ has_challenges_or_design_note }}\n  {{# has_challenges }}\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n  {{/ has_challenges }}\n  {{# has_design_note }}\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>{{ design_note }}</a></li>\n  {{/ has_design_note }}\n</ul>\n\n{{/ is_chapter }}\n{{# is_part }}\n<h2><small>{{ number }}</small>{{ title }}</h2>\n\n<ul>\n  {{# chapters }}\n    <li><a href=\"{{ file }}.html\"><small>{{ number }}</small>{{ title }}</a></li>\n  {{/ chapters }}\n</ul>\n\n{{/ is_part }}\n{{# is_frontmatter }}\n<h2><small>{{ number }}</small>{{ title }}</h2>\n<hr>\n{{/ is_frontmatter }}\n\n{{> prev-next }}"
  },
  {
    "path": "asset/mustache/page.html",
    "content": "{{> header }}\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n{{> nav }}  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n{{# has_prev }}\n<a href=\"{{ prev_file }}.html\" title=\"{{ prev }}\" class=\"prev\">←</a>\n{{/ has_prev }}\n{{# has_next }}\n<a href=\"{{ next_file }}.html\" title=\"{{ next }}\" class=\"next\">→</a>\n{{/ has_next }}\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n{{> nav }}\n  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n{{# has_number }}\n  <div class=\"number\">{{ number }}</div>\n{{/ has_number }}\n{{# is_chapter }}\n  <h1>{{ title }}</h1>\n{{/ is_chapter }}\n{{^ is_chapter }}\n  <h1 class=\"part\">{{ title }}</h1>\n{{/ is_chapter }}\n\n{{{ body }}}\n<footer>\n{{# has_next }}\n<a href=\"{{ next_file }}.html\" class=\"next\">\n  Next {{ next_type }}: &ldquo;{{ next }}&rdquo; &rarr;\n</a>\n{{/ has_next }}\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n{{> footer }}\n"
  },
  {
    "path": "asset/mustache/prev-next.html",
    "content": "<div class=\"prev-next\">\n  {{# has_prev }}\n    <a href=\"{{ prev_file }}.html\" title=\"{{ prev }}\" class=\"left\">&larr;&nbsp;Previous</a>\n  {{/ has_prev }}\n  {{# has_up }}\n    <a href=\"{{ up_file }}.html\" title=\"{{ up }}\">&uarr;&nbsp;Up</a>\n  {{/ has_up }}\n  {{# has_next }}\n    <a href=\"{{ next_file }}.html\" title=\"{{ next }}\" class=\"right\">Next&nbsp;&rarr;</a>\n  {{/ has_next }}\n</div>"
  },
  {
    "path": "asset/sass/chapter.scss",
    "content": "article.chapter {\n  h2 {\n    font: 600 30px/24px $serif;\n    margin: 69px 0 0 0;\n    padding-bottom: 3px;\n\n    small {\n      font: 800 22px/24px $serif;\n      float: right;\n    }\n  }\n\n  h3 {\n    font: italic 24px/24px $serif;\n    margin: 71px 0 0 0;\n    padding-bottom: 1px;\n\n    small {\n      font: 600 16px/24px $serif;\n      float: right;\n    }\n  }\n\n  h2 a, h3 a {\n    color: #222;\n    border-bottom: none;\n  }\n\n  h2 a:hover, h3 a:hover {\n    border-bottom: none;\n    color: inherit;\n  }\n\n  h2 a::before, h3 a::before {\n    position: absolute;\n    left: -$col;\n    width: $col;\n    content: \"\\000A7\";\n    color: #fff;\n    transition: color 0.2s ease;\n    text-align: center;\n  }\n\n  h2 a:hover::before, h3 a:hover::before {\n    color: #ddd;\n  }\n\n  .challenges, .design-note {\n    border-radius: 3px;\n    padding: 12px;\n    margin: -2px -12px 26px -12px;\n\n    font: normal 16px/24px $nav;\n    color: #444;\n\n    h2 {\n      margin: 0 0 -12px 0;\n      padding: 0;\n      font: 600 16px/24px $nav;\n      text-transform: uppercase;\n      letter-spacing: 1px;\n    }\n\n    h2 a {\n      color: inherit;\n    }\n\n    h2 a::before {\n      content: none;\n    }\n\n    ol {\n      padding: 0 0 0 18px;\n\n      li {\n        padding: 0 0 0 6px;\n        font-weight: 600;\n\n        p {\n          font-weight: 400;\n        }\n      }\n    }\n\n    pre {\n      margin: 0;\n    }\n\n    // Chapter 23 has some blockquotes in the design note.\n    > blockquote {\n      p {\n        margin: 0 24px;\n        font: italic 16px/24px $nav;\n        color: #444;\n      }\n\n      &::before, &::after {\n        content: none;\n      }\n    }\n\n    // Use the regular code colors in asides, and not the tinted versions used\n    // inside the challenge or design notes boxes themselves.\n    aside {\n      code, .codehilite {\n        color: $warm-dark;\n        background: $warm-light;\n      }\n    }\n\n    // Remove the 
extra padding at the bottom of the box.\n    *:last-child {\n      margin-bottom: 0;\n    }\n  }\n\n  .challenges .codehilite,\n  .design-note .codehilite {\n    margin: -12px 0 -12px 0;\n  }\n\n  .challenges {\n    background: $lighter;\n\n    code, .codehilite {\n      background: hsl(195, 30%, 92%);\n    }\n  }\n\n  .design-note {\n    background: hsl(80, 30%, 96%);\n\n    code, .codehilite {\n      background: hsl(80, 20%, 93%);\n    }\n  }\n\n  table {\n    width: 100%;\n    border-collapse: collapse;\n\n    thead {\n      font: 700 15px $serif;\n    }\n\n    td {\n      border-bottom: solid 1px $light;\n      line-height: 22px;\n      padding: 3px 0 0 0;\n      margin: 0;\n    }\n\n    td + td {\n      padding-left: 12px;\n    }\n  }\n}\n\n// Tablets and mobile go single column.\n@media only screen and (max-width: $col * 20) {\n  article.chapter {\n\n    // Now that the asides are inline, make them match the challenge/design-note\n    // colors and font.\n    .challenges, .design-note {\n      aside {\n        font: normal 15px/24px $nav;\n        padding-bottom: 4px;\n      }\n    }\n\n    .challenges {\n      aside {\n        code, .codehilite {\n          background: hsl(195, 30%, 92%);\n        }\n      }\n    }\n\n    .design-note {\n      aside {\n        code, .codehilite {\n          background: hsl(80, 20%, 93%);\n        }\n      }\n    }\n  }\n}\n\n// Then bring the margins in some.\n// The cut-off sizes here are based on trying to get 72 columns of code to fit.\n@media only screen and (max-width: 630px) {\n  article.chapter {\n    h2 a::before, h3 a::before {\n      left: -($col / 2);\n      width: $col / 2;\n    }\n  }\n}\n\n// Finally start shrinking text.\n@media only screen and (max-width: 580px) {\n  article.chapter {\n    h2 {\n      margin-top: 64px;\n      padding-bottom: 2px;\n      font-size: 22px;\n      line-height: 22px;\n    }\n\n    h3 {\n      margin-top: 64px;\n      padding-bottom: 0;\n      font-size: 20px;\n    }\n\n    
.challenges, .design-note {\n      padding: 11px 11px 8px 11px;\n      margin: 25px 0 0 0;\n\n      font-size: 15px;\n      line-height: 22px;\n\n      code, .codehilite {\n        font-size: 14px;\n      }\n\n      h2 {\n        padding: 5px 0 4px 6px;\n        font-size: 17px;\n        line-height: 22px;\n      }\n\n      aside {\n        line-height: 22px;\n      }\n    }\n  }\n}\n"
  },
  {
    "path": "asset/sass/contents.scss",
    "content": "article.contents {\n  h2 {\n    margin: 22px 0 6px 0;\n    font: 600 normal 18px/24px $nav;\n    text-transform: uppercase;\n    letter-spacing: 1px;\n\n    .num {\n      display: inline-block;\n      width: 36px;\n    }\n  }\n\n  ul {\n    margin: -12px 0 0 0;\n    padding: 6px 0 14px 0;\n  }\n\n  li {\n    padding: 12px 0 0 36px;\n    font: normal 16px/24px $nav;\n    color: $gray-4;\n\n    list-style-type: none;\n\n    .num {\n      display: inline-block;\n      letter-spacing: 1px;\n      width: 36px;\n    }\n\n    a {\n      font: 600 17px/24px $nav;\n    }\n  }\n\n  li.design-note {\n    padding-top: 0;\n\n    a {\n      font: 400 16px/23px $nav;\n    }\n  }\n\n  // Format the chapter list in two columns.\n  .chapters {\n    display: table;\n    width: $col * 18;\n  }\n\n  .row {\n    display: table-row;\n  }\n\n  .first, .second {\n    display: table-cell;\n    vertical-align: top;\n  }\n\n  .second {\n    padding-left: $col;\n  }\n\n  footer {\n    width: $col * 18;\n  }\n}\n\n// Go single-column with the chapter list.\n@media only screen and (max-width: 1344px) {\n  article.contents {\n    .chapters, .row, .first, .second {\n      display: block;\n      width: auto;\n    }\n\n    .second {\n      padding-left: 0;\n    }\n\n    footer {\n      width: inherit;\n    }\n  }\n}\n\n// Then bring the margins in some.\n@media only screen and (max-width: 630px) {\n  article.contents {\n    h2 .num, li .num {\n      width: 28px;\n    }\n\n    ol, ul {\n      margin-left: 0;\n    }\n\n    li {\n      padding-left: 0;\n    }\n  }\n}\n\n// Finally start shrinking text.\n@media only screen and (max-width: 580px) {\n  article.contents {\n    h2 {\n      margin: 19px 0 6px 0;\n      font-size: 17px;\n      line-height: 22px;\n    }\n\n    h3 {\n      padding: 1px 0 2px 0;\n      font-size: 17px;\n      line-height: 22px;\n    }\n\n    p {\n      font-size: 15px;\n      line-height: 22px;\n    }\n\n    ol, ul {\n      padding-bottom: 8px;\n    }\n\n    li 
{\n      font-size: 14px;\n      line-height: 22px;\n      padding: 4px 0 3px 0;\n    }\n  }\n}\n"
  },
  {
    "path": "asset/sass/print.scss",
    "content": "@import 'shared';\n\n@media print {\n  // Pure black text.\n  body, a, code {\n    color: #000 !important;\n    background: none !important;\n  }\n\n  // Hide non-content stuff.\n  nav, .sign-up {\n    display: none;\n  }\n\n  // Get rid of extra margins. The page margin will handle this.\n  .page {\n    margin: 0 !important;\n  }\n\n  // Tweak how code is formatted since we don't want to use a background color.\n  .codehilite {\n    pre {\n      color: #000 !important;\n    }\n\n    margin: 0 !important;\n\n    // Borders on the left and right and no background.\n    background: none !important;\n    border-radius: 0 !important;\n    border-left: solid 1px $warm-4;\n    border-right: solid 1px $warm-4;\n\n    // Show thicker borders on the left and right instead of a background.\n    .insert {\n      border-left: solid 3px $warm-4 !important;\n      border-right: solid 3px $warm-4 !important;\n      background: none !important;\n    }\n\n    .delete {\n      -webkit-print-color-adjust: exact;\n      color-adjust: exact;\n    }\n\n    // Browsers don't honor the specific authored colors when printing if the\n    // color is too close to the background. Tell the browser not to do that.\n    .insert-before span, .insert-after span {\n      -webkit-print-color-adjust: exact;\n      color-adjust: exact;\n    }\n  }\n}\n"
  },
  {
    "path": "asset/sass/shared.scss",
    "content": "// Font stacks.\n$serif:   \"Crimson\", Georgia, serif;\n$mono:    \"Source Code Pro\", Menlo, Consolas, Monaco, monospace;\n$nav:     \"Source Sans Pro\", sans-serif;\n\n// The main intense primary accent color.\n$primary:       hsl(200, 80%, 40%);\n$primary-dark:  hsl(200, 100%, 20%);\n$primary-light: hsl(200, 70%, 60%);\n\n// A ramp of washed out blues from dark to light.\n$dark:    hsl(215, 20%, 20%);\n$gray-1:  hsl(212, 23%, 30%);\n$gray-2:  hsl(209, 26%, 40%);\n$gray-3:  hsl(206, 30%, 50%);\n$gray-4:  hsl(203, 30%, 60%);\n$light:   hsl(195, 30%, 90%);\n$lighter: hsl(195, 35%, 95%);\n\n// An opposing warm light color (code background).\n$warm-dark:  hsl(40, 0%, 35%);\n$warm-light: hsl(40, 30%, 97%);\n$warm-1:     mix($warm-light, $warm-dark, 15%);\n$warm-2:     mix($warm-light, $warm-dark, 40%);\n$warm-3:     mix($warm-light, $warm-dark, 60%);\n$warm-4:     mix($warm-light, $warm-dark, 80%);\n$warm-5:     hsl(40, 20%, 95%);\n\n// The full-size design is 28 units wide, in three columns:\n// [][][][][][][][][][][][][][][][][][][][][][][][][][][][]\n//   (   5    )    (          12          )  (    6     )\n// They are asymmetric because the left column has a dark background, which\n// requires a double margin.\n$col: 48px;\n\n@font-face {\n  font-family: 'Crimson';\n  src: url('font/crimson-roman.woff') format('woff');\n}\n\n@font-face {\n  font-family: 'Crimson';\n  src: url('font/crimson-italic.woff') format('woff');\n  font-style: italic;\n}\n\n@font-face {\n  font-family: 'Crimson';\n  src: url('font/crimson-semibold.woff') format('woff');\n  font-weight: 600;\n}\n\n@font-face {\n  font-family: 'Crimson';\n  src: url('font/crimson-semibolditalic.woff') format('woff');\n  font-style: italic;\n  font-weight: 600;\n}\n\n@font-face {\n  font-family: 'Crimson';\n  src: url('font/crimson-bold.woff') format('woff');\n  font-weight: bold;\n}\n\n@font-face {\n  font-family: 'Crimson';\n  src: url('font/crimson-bolditalic.woff') format('woff');\n  
font-style: italic;\n  font-weight: bold;\n}\n\n// Reset stuff.\n\nbody, h1, h2, h3, h4, p, blockquote, code, ul, ol, dl, dd, img {\n  margin: 0;\n}\n\nimg {\n  outline: none;\n}\n\nimg.arrow {\n  width: auto;\n  height: 11px;\n}\n\nimg.dot {\n  width: auto;\n  height: 18px;\n  vertical-align: text-bottom;\n}\n\n// Basic styles.\n\nbody {\n  color: #222;\n  font: normal 16px/24px $serif;\n}\n"
  },
  {
    "path": "asset/sass/sign-up.scss",
    "content": ".sign-up {\n  padding: 12px;\n  margin: 24px 0 24px 0;\n  background: hsl(40, 80%, 95%);\n  color: hsl(40, 50%, 50%);\n  border-radius: 3px;\n\n  form {\n    display: flex;\n  }\n\n  input {\n    padding: 4px;\n    font: 16px $nav;\n    outline: none;\n    border-radius: 3px;\n    border: solid 2px hsl(40, 100%, 75%);\n    color: hsl(40, 70%, 30%);\n    height: 32px;\n  }\n\n  input.email {\n    display: block;\n    box-sizing: border-box;\n    width: 100%;\n  }\n\n  input.button {\n    margin-left: 8px;\n    padding: 4px 8px;\n    font: 600 13px $nav;\n    text-transform: uppercase;\n    letter-spacing: 1px;\n    background: hsl(40, 100%, 60%);\n    border: none;\n\n    transition: background-color 0.2s ease;\n  }\n\n  input.button:hover {\n    background: hsl(40, 100%, 75%);\n  }\n\n  input:focus {\n    border-color: hsl(40, 100%, 50%);\n  }\n}"
  },
  {
    "path": "asset/style.scss",
    "content": "@import 'sass/shared';\n@import 'sass/chapter';\n@import 'sass/contents';\n@import 'sass/sign-up';\n@import 'sass/print';\n\n// Make sure we don't split on the thin spaces around an em dash.\n.emdash {\n  white-space: nowrap;\n}\n\n.scrim {\n  position: absolute;\n  width: 100%;\n  height: 10000px;\n\n  z-index: 4;\n\n  // background: url('columns.png');\n  background: url('rows.png');\n}\n\n// Used for drawing the bitwise operators \"AND\", \"OR\", and \"NOT\" in small caps.\n.small-caps {\n  font-weight: 600;\n  font-size: 13px;\n}\n\na {\n  color: $primary;\n  text-decoration: none;\n\n  border-bottom: solid 1px transparentize($light, 1.0);\n\n  transition: color 0.2s ease,\n              border-color 0.4s ease;\n}\n\na:hover {\n  color: $primary;\n  border-bottom: solid 1px opacify($light, 1.0);\n}\n\nnav {\n  font: 300 15px/24px $nav;\n  background: $dark;\n  color: $gray-2;\n\n  a, h2 a {\n    color: $gray-4;\n    text-decoration: none;\n    border-bottom: none;\n  }\n\n  a:hover {\n    color: $light;\n    text-decoration: none;\n    border-bottom: none;\n  }\n\n  img {\n    box-sizing: border-box;\n    width: 100%;\n    padding: 55px $col 23px $col;\n  }\n\n  h2 {\n    font: 400 16px/24px $nav;\n    text-transform: uppercase;\n    letter-spacing: 1px;\n    color: $gray-4;\n  }\n\n  h3 {\n    font: 400 18px/24px $nav;\n    color: $gray-4;\n  }\n\n  h2 small, h3 small {\n    float: right;\n    font-size: 16px;\n    color: $gray-2;\n  }\n\n  ol, ul {\n    margin: 6px 0 3px 0;\n    padding: 6px 0 4px 24px;\n    border-top: solid 1px $gray-1;\n    border-bottom: solid 1px $gray-1;\n  }\n\n  ul {\n    list-style-type: none;\n    padding-left: 0;\n  }\n\n  hr {\n    border: none;\n    border-top: solid 1px $gray-1;\n    margin: 6px 0 0 0;\n    padding: 0 0 3px 0;\n  }\n\n  li small {\n    float: right;\n    font-size: 14px;\n    color: $gray-2;\n  }\n\n  li.divider {\n    margin: 5px 0 7px 0;\n    border-top: solid 1px $gray-1;\n  }\n\n  li.end-part 
{\n    font-size: 12px;\n    font-weight: 400;\n    text-transform: uppercase;\n    letter-spacing: 1px;\n\n    small {\n      font-weight: 300;\n      text-transform: none;\n      letter-spacing: 0;\n    }\n  }\n\n  .prev-next {\n    padding-top: 7px;\n    font: 400 12px/18px $nav;\n    text-align: center;\n    text-transform: uppercase;\n    letter-spacing: 1px;\n  }\n}\n\nnav.wide {\n  position: fixed;\n  width: $col * 7;\n  height: 100%;\n\n  .contents {\n    margin: 24px $col;\n  }\n}\n\n// This is needed to make the nav fixed (not scrolling with the content) but\n// still positioned horizontally based on the page.\n// See: http://stackoverflow.com/a/11833892/9457\n.nav-wrapper {\n  position: absolute;\n  right: $col * 6;\n}\n\n// For medium-sized screens, the navigation floats over the same column as the\n// asides.\nnav.floating {\n  // Only shown on narrower screens.\n  display: none;\n\n  z-index: 2;\n  position: absolute;\n  width: $col * 6;\n\n  border-bottom-left-radius: 3px;\n  border-bottom-right-radius: 3px;\n\n  #expand-nav {\n    padding: 0 0 4px 0;\n    display: block;\n    font-size: 20px;\n    text-align: center;\n    color: $gray-2;\n    cursor: pointer;\n\n    transition: padding 0.2s ease,\n                margin 0.2s ease,\n                color 0.2s ease;\n  }\n\n  #expand-nav, #expand-nav:hover {\n    border-bottom: none;\n  }\n\n  #expand-nav:hover {\n    color: $light;\n  }\n\n  .expandable {\n    overflow: hidden;\n    padding: 0 12px;\n\n    // Using max-height instead of height to allow the navigation list to\n    // automatically choose its height based on the size of the list while\n    // still transitioning.\n    // See: http://stackoverflow.com/a/8331169/9457\n    max-height: 0;\n    transition: margin 0.2s ease,\n                max-height 1.0s ease;\n\n    .prev-next {\n      padding-bottom: 6px;\n    }\n  }\n\n  .expandable.shown {\n    // This should be as small as possible while still being large enough for\n    // the 
worst case chapter.\n    max-height: 550px;\n  }\n\n  img {\n    padding: 110px $col/2 23px $col/2;\n  }\n}\n\nnav.floating.pinned {\n  position: fixed;\n\n  top: -85px;\n\n  .expandable {\n    margin-top: -13px;\n  }\n\n  #expand-nav {\n    margin-top: -14px;\n  }\n}\n\nnav.narrow {\n  display: none;\n\n  text-align: center;\n\n  img {\n    box-sizing: content-box;\n    padding: 11px 0 3px 0;\n    width: auto;\n    height: 27px;\n  }\n\n  .prev, .next {\n    font-size: 32px;\n    position: absolute;\n    top: 12px;\n    padding: 0 $col;\n  }\n\n  .prev {\n    left: 0;\n  }\n\n  .next {\n    right: 0;\n  }\n}\n\n.left {\n  float: left;\n}\n\n.right {\n  float: right;\n}\n\n.page {\n  position: relative;\n\n  width: $col * 19;\n  margin: 0 auto 0 $col * 8;\n}\n\n// Make em dashes look pretty. Goals:\n//\n// - Add a tiny bit of space on either side. Completely unspaced em dashes\n//   look too tight to me.\n// - Allow an em dash at the end of a line.\n// - Prevent an em dash at the beginning of a line.\n//\n// Wrapping each `&mdash;` in a span with this class and consuming the\n// preceding whitespace seems to accomplish that.\n.em {\n  padding: 0 .1em;\n  white-space: nowrap;\n}\n\n// Make ellipses follow Chicago style. The `&hellip;` entity puts a tiny amount\n// of space between each `.`, but not as much as Chicago style specifies. It\n// also doesn't put any space before. Instead, the build system writes a span\n// of this class with thin-space separated dots. 
This class here ensures there\n// is no splitting between the dots.\n.ellipse {\n  white-space: nowrap;\n}\n\ncode {\n  font: normal 16px $mono;\n  color: $warm-1;\n  white-space: nowrap;\n  padding: 2px;\n}\n\nstrong code {\n  font-weight: bold;\n  color: inherit;\n}\n\na code {\n  color: $primary;\n}\n\n.codehilite {\n  color: $warm-dark;\n  background: $warm-light;\n  border-radius: 3px;\n  padding: 12px;\n  margin: -12px;\n}\n\npre {\n  font: normal 13px/20px $mono;\n  margin: 0;\n  padding: 0;\n\n  // If the code doesn't fit, just force it to wrap instead of cropping it. It\n  // doesn't look great, but it ensures the code is visible and can be correctly\n  // copy-pasted.\n  white-space: pre-wrap;\n  overflow-wrap: anywhere;\n}\n\n// If the chapter ends with code, don't overlap the challenges box.\ndiv.codehilite + div.challenges {\n  margin-top: $col / 2;\n}\n\narticle {\n  position: relative;\n  width: $col * 12;\n\n  h1 {\n    position: relative;\n    font: 48px/48px $serif;\n    padding: 109px 0 19px 0;\n    z-index: 2;\n  }\n\n  h1.part {\n    font: 600 36px/48px $nav;\n    padding: 108px 0 20px 0;\n    text-transform: uppercase;\n    letter-spacing: 1px;\n  }\n\n  .number {\n    position: absolute;\n    top: 50px;\n    left: $col * 13;\n\n    z-index: 1;\n\n    font: 300 96px $nav;\n    color: $light;\n  }\n\n  p {\n    margin: 24px 0;\n  }\n\n  ol, ul {\n    margin: 24px 0;\n    padding: 0 0 0 24px;\n  }\n\n  img {\n    max-width: 100%;\n  }\n\n  img.wide {\n    max-width: none;\n    width: $col * 19;\n  }\n}\n\naside {\n  position: absolute;\n  right: -$col * 7;\n  width: $col * 6;\n\n  font: normal 14px/20px $serif;\n\n  border-top: solid 1px $light;\n\n  p {\n    margin: 20px 0;\n  }\n\n  p:first-child,\n  img:first-child {\n    margin-top: 4px;\n  }\n\n  p:last-child {\n    margin-bottom: 4px;\n  }\n\n  code {\n    font-size: 14px;\n    border-radius: 2px;\n    padding: 1px 2px;\n  }\n\n  .codehilite {\n    padding: 6px;\n    margin: -12px 0;\n  
}\n\n  .codehilite:last-child {\n    margin-bottom: 4px;\n  }\n\n  img.above {\n    position: absolute;\n    bottom: 100%;\n    margin-bottom: 16px;\n  }\n\n  blockquote {\n    margin: 20px 0;\n\n    &::before, &::after {\n      content: none;\n    }\n\n    p {\n      margin: 0 12px;\n      font: italic 15px/20px $serif;\n      color: inherit;\n    }\n  }\n}\n\n// Sometimes there isn't room to hang the aside *down* next to the content it's\n// annotating, so support asides where the bottom is aligned with the content.\naside.bottom {\n  border-top: none;\n  border-bottom: solid 1px $light;\n}\n\nblockquote {\n  position: relative;\n\n  margin: 29px 0 31px 0;\n\n  &::before, &::after {\n    position: absolute;\n    top: -20px;\n    font: italic 72px $serif;\n    color: $light;\n  }\n\n  &::before {\n    content: \"\\201c\";\n    left: -7px;\n  }\n\n  &::after {\n    content: \"\\201d\";\n    right: 8px;\n  }\n\n  p {\n    margin: 0 $col;\n    font: italic 24px/36px $serif;\n    color: $gray-3;\n  \n    em {\n      font-style: normal;\n    }\n  }\n\n  cite {\n    display: block;\n    text-align: right;\n    color: $gray-4;\n    font-style: normal;\n    font-size: 18px;\n\n    &::before {\n      content: \"\\2014\\00a0\";\n      color: $light;\n    }\n\n    em {\n      font-style: italic;\n    }\n  }\n}\n\nfooter {\n  position: relative;\n  border-top: solid 1px $light;\n  color: $gray-4;\n  font: 400 15px $nav;\n  text-align: center;\n  margin: 48px 0;\n  padding-top: 48px;\n\n  a, a:hover {\n    border: none;\n  }\n\n  .next {\n    position: absolute;\n    right: 0;\n    top: -13px;\n\n    padding-left: 4px;\n    background: #fff;\n\n    font: 400 17px/24px $nav;\n    text-transform: uppercase;\n    letter-spacing: 1px;\n  }\n\n  .next:hover {\n    color: $primary-dark;\n    border: none;\n  }\n}\n\n.dedication {\n  margin: 96px 0 128px 0;\n  text-align: center;\n\n  img {\n    width: 50%;\n  }\n}\n\n.source-file, .source-file-narrow {\n  font: normal 11px/16px 
$mono;\n  color: $warm-3;\n\n  em {\n    color: $warm-2;\n    font-style: normal;\n  }\n}\n\n.source-file-narrow {\n  // Don't show unless in single-column.\n  display: none;\n\n  margin: 0px -12px 0 0;\n  padding: 14px 0 0 0;\n  text-align: right;\n}\n\n.source-file {\n  position: absolute;\n  right: -$col * 7;\n  width: $col * 6;\n  padding: 2px 0 0 0;\n\n  &::before {\n    content: \"<<\";\n    color: $warm-4;\n    position: absolute;\n    left: -($col - 12px);\n    width: $col - 12px;\n    text-align: center;\n  }\n}\n\n// Syntax highlighting.\n.codehilite {\n  pre { color: mix($warm-light, $warm-dark, 20%); }\n\n  .k { color: hsl(200, 100%,  45%); }               // Keyword.\n  .n { color: hsl( 20,  70%,  55%); }               // Number.\n  .s { color: hsl( 40,  70%,  45%); }               // String.\n  .e { color: hsl( 45,  80%,  55%); }               // String escape.\n  .c { color: mix($warm-light, $warm-dark, 50%); }  // Comment.\n  .a { color: hsl(270,  50%,  60%); }               // Preprocessor, annotation.\n  .i { color: hsl(200,  70%,  35%); }               // Identifier.\n  .t { color: hsl(185, 100%,  35%); }               // Type name.\n\n  .insert {\n    margin: -2px -12px;\n    padding: 2px 10px;\n    border-left: solid 2px $warm-4;\n    border-right: solid 2px $warm-4;\n    background: $warm-5;\n  }\n\n  .delete {\n    margin: -2px -12px;\n    padding: 2px 10px;\n    border-left: solid 2px $warm-4;\n    border-right: solid 2px $warm-4;\n    // Hatched lines.\n    background: repeating-linear-gradient(\n      -45deg,\n      $warm-4,\n      $warm-4 1px,\n      rgba(0, 0, 0, 0.0) 1px,\n      rgba(0, 0, 0, 0.0) 6px\n    );\n\n    span {\n      color: $warm-3;\n    }\n  }\n\n  // Snippets of code before and after real code to show where to insert it.\n  .insert-before, .insert-after {\n    color: $warm-3;\n  }\n\n  // When we just add a trailing comma to a line, highlight it specially.\n  .insert-before .insert-comma {\n    margin: -2px -1px;\n    
padding: 2px 1px;\n    border-radius: 2px;\n\n    background: $warm-5;\n    color: $warm-dark;\n  }\n}\n\n// On a not-entirely-large screen, don't show the fixed nav on the left.\n@media only screen and (max-width: 1344px) {\n  nav.wide { display: none; }\n  nav.floating { display: block; }\n\n  body {\n    margin: 0 24px;\n  }\n\n  .page {\n    position: relative;\n    width: inherit;\n    max-width: $col * 19;\n    margin: 0 auto;\n  }\n\n  article {\n    width: inherit;\n    margin-right: $col * 7;\n\n    // Move the number over to not be hidden behind the navigation.\n    .number {\n      top: 73px;\n      left: inherit;\n      right: 0;\n      font-size: 72px;\n    }\n\n    h1 {\n      padding: 110px 0 18px 0;\n      font-size: 44px;\n    }\n  }\n}\n\n// Tablets and mobile go single column.\n@media only screen and (max-width: $col * 20) {\n  body {\n    margin: 0;\n  }\n\n  nav.floating {\n    display: none;\n  }\n\n  nav.narrow {\n    display: block;\n  }\n\n  .page {\n    margin: 0 $col;\n    width: inherit;\n  }\n\n  article {\n    margin: 0;\n\n    // Size wide images to fit inside the column again.\n    img.wide {\n      width: inherit;\n      max-width: 100%;\n    }\n  }\n\n  aside {\n    position: inherit;\n    right: inherit;\n    width: inherit;\n\n    border-bottom: solid 1px $light;\n\n    p:first-child {\n      margin-top: 8px;\n    }\n\n    p:last-child {\n      margin-bottom: 8px;\n    }\n\n    // If an aside ends with code (like in \"classes.html\"), then make sure we\n    // give it some margin.\n    div.codehilite:last-child {\n      margin-bottom: 12px;\n    }\n\n    // Make sure aside images don't get too big when the asides are inlined\n    // in single column mode.\n    img {\n      display: block;\n      max-width: $col * 6;\n      margin: 0 auto;\n    }\n\n    img.above {\n      position: relative;\n    }\n  }\n\n  // If aside is right before a code block (when the asides are inline), make\n  // sure they don't overlap.\n  aside + 
div.codehilite {\n    margin-top: 12px;\n  }\n\n  div.codehilite + aside {\n    margin-top: 24px;\n  }\n\n  .source-file {\n    display: none;\n  }\n\n  .source-file-narrow {\n    display: block;\n  }\n}\n\n// Then bring the margins in some.\n// The cut-off sizes here are based on trying to get 72 columns of code to fit.\n@media only screen and (max-width: 630px) {\n  .page {\n    margin: 0 $col / 2;\n    width: inherit;\n  }\n\n  nav.narrow {\n    .prev, .next {\n      padding: 0 $col / 2;\n    }\n  }\n}\n\n// Finally, shrink the grid to 22px and shrink the text.\n@media only screen and (max-width: 580px) {\n  body {\n    font-size: 15px;\n    line-height: 22px;\n  }\n\n  .small-caps {\n    font-size: 12px;\n  }\n\n  .scrim {\n    background: url('rows-22.png');\n  }\n\n  nav.narrow {\n    img {\n      padding: 9px 0 1px 0;\n      height: 27px;\n    }\n\n    .prev, .next {\n      top: 11px;\n    }\n  }\n\n  article {\n    h1 {\n      font-size: 36px;\n      padding: 100px 0 14px 0;\n    }\n\n    h1.part {\n      font-size: 30px;\n      padding: 97px 0 17px 0;\n    }\n\n    .number {\n      top: 61px;\n      font-size: 72px;\n    }\n\n    p {\n      margin: 22px 0;\n    }\n\n    ol, ul {\n      margin: 22px 0;\n      padding: 0 0 0 22px;\n    }\n  }\n\n  blockquote {\n    margin: 27px 0 28px 0;\n\n    &::before, &::after {\n      top: -17px;\n      font-size: 52px;\n    }\n\n    p {\n      margin: 0 22px;\n      font-size: 20px;\n      line-height: 33px;\n    }\n  }\n\n  footer {\n    .next {\n      font-size: 15px;\n    }\n  }\n}\n"
  },
  {
    "path": "book/a-bytecode-virtual-machine.md",
    "content": "Our Java interpreter, jlox, taught us many of the fundamentals of programming\nlanguages, but we still have much to learn. First, if you run any interesting\nLox programs in jlox, you'll discover it's achingly slow. The style of\ninterpretation it uses -- walking the AST directly -- is good enough for *some*\nreal-world uses, but leaves a lot to be desired for a general-purpose scripting\nlanguage.\n\nAlso, we implicitly rely on runtime features of the JVM itself. We take for\ngranted that things like `instanceof` in Java work *somehow*. And we never for a\nsecond worry about memory management because the JVM's garbage collector takes\ncare of it for us.\n\nWhen we were focused on high-level concepts, it was fine to gloss over those.\nBut now that we know our way around an interpreter, it's time to dig down to\nthose lower layers and build our own virtual machine from scratch using nothing\nmore than the C standard library...\n"
  },
  {
    "path": "book/a-map-of-the-territory.md",
    "content": "> You must have a map, no matter how rough. Otherwise you wander all over the\n> place. In *The Lord of the Rings* I never made anyone go farther than he could\n> on a given day.\n>\n> <cite>J. R. R. Tolkien</cite>\n\nWe don't want to wander all over the place, so before we set off, let's scan\nthe territory charted by previous language implementers. It will help us\nunderstand where we are going and the alternate routes others have taken.\n\nFirst, let me establish a shorthand. Much of this book is about a language's\n*implementation*, which is distinct from the *language itself* in some sort of\nPlatonic ideal form. Things like \"stack\", \"bytecode\", and \"recursive descent\",\nare nuts and bolts one particular implementation might use. From the user's\nperspective, as long as the resulting contraption faithfully follows the\nlanguage's specification, it's all implementation detail.\n\nWe're going to spend a lot of time on those details, so if I have to write\n\"language *implementation*\" every single time I mention them, I'll wear my\nfingers off. Instead, I'll use \"language\" to refer to either a language or an\nimplementation of it, or both, unless the distinction matters.\n\n## The Parts of a Language\n\nEngineers have been building programming languages since the Dark Ages of\ncomputing. As soon as we could talk to computers, we discovered doing so was too\nhard, and we enlisted their help. I find it fascinating that even though today's\nmachines are literally a million times faster and have orders of magnitude more\nstorage, the way we build programming languages is virtually unchanged.\n\nThough the area explored by language designers is vast, the trails they've\ncarved through it are <span name=\"dead\">few</span>. 
Not every language takes the\nexact same path -- some take a shortcut or two -- but otherwise they are\nreassuringly similar, from Rear Admiral Grace Hopper's first COBOL compiler all\nthe way to some hot, new, transpile-to-JavaScript language whose \"documentation\"\nconsists entirely of a single, poorly edited README in a Git repository\nsomewhere.\n\n<aside name=\"dead\">\n\nThere are certainly dead ends, sad little cul-de-sacs of CS papers with zero\ncitations and now-forgotten optimizations that only made sense when memory was\nmeasured in individual bytes.\n\n</aside>\n\nI visualize the network of paths an implementation may choose as climbing a\nmountain. You start off at the bottom with the program as raw source text,\nliterally just a string of characters. Each phase analyzes the program and\ntransforms it to some higher-level representation where the semantics -- what\nthe author wants the computer to do -- become more apparent.\n\nEventually we reach the peak. We have a bird's-eye view of the user's program\nand can see what their code *means*. We begin our descent down the other side of\nthe mountain. We transform this highest-level representation down to\nsuccessively lower-level forms to get closer and closer to something we know how\nto make the CPU actually execute.\n\n<img src=\"image/a-map-of-the-territory/mountain.png\" alt=\"The branching paths a language may take over the mountain.\" class=\"wide\" />\n\nLet's trace through each of those trails and points of interest. Our journey\nbegins on the left with the bare text of the user's source code:\n\n<img src=\"image/a-map-of-the-territory/string.png\" alt=\"var average = (min + max) / 2;\" />\n\n### Scanning\n\nThe first step is **scanning**, also known as **lexing**, or (if you're trying\nto impress someone) **lexical analysis**. They all mean pretty much the same\nthing. 
I like \"lexing\" because it sounds like something an evil supervillain\nwould do, but I'll use \"scanning\" because it seems to be marginally more\ncommonplace.\n\nA **scanner** (or **lexer**) takes in the linear stream of characters and chunks\nthem together into a series of something more akin to <span\nname=\"word\">\"words\"</span>. In programming languages, each of these words is\ncalled a **token**. Some tokens are single characters, like `(` and `,`. Others\nmay be several characters long, like numbers (`123`), string literals (`\"hi!\"`),\nand identifiers (`min`).\n\n<aside name=\"word\">\n\n\"Lexical\" comes from the Greek root \"lex\", meaning \"word\".\n\n</aside>\n\nSome characters in a source file don't actually mean anything. Whitespace is\noften insignificant, and comments, by definition, are ignored by the language.\nThe scanner usually discards these, leaving a clean sequence of meaningful\ntokens.\n\n<img src=\"image/a-map-of-the-territory/tokens.png\" alt=\"[var] [average] [=] [(] [min] [+] [max] [)] [/] [2] [;]\" />\n\n### Parsing\n\nThe next step is **parsing**. This is where our syntax gets a **grammar** -- the\nability to compose larger expressions and statements out of smaller parts. Did\nyou ever diagram sentences in English class? If so, you've done what a parser\ndoes, except that English has thousands and thousands of \"keywords\" and an\noverflowing cornucopia of ambiguity. Programming languages are much simpler.\n\nA **parser** takes the flat sequence of tokens and builds a tree structure that\nmirrors the nested nature of the grammar. These trees have a couple of different\nnames -- **parse tree** or **abstract syntax tree** -- depending on how\nclose to the bare syntactic structure of the source language they are. 
In\npractice, language hackers usually call them **syntax trees**, **ASTs**, or\noften just **trees**.\n\n<img src=\"image/a-map-of-the-territory/ast.png\" alt=\"An abstract syntax tree.\" />\n\nParsing has a long, rich history in computer science that is closely tied to the\nartificial intelligence community. Many of the techniques used today to parse\nprogramming languages were originally conceived to parse *human* languages by AI\nresearchers who were trying to get computers to talk to us.\n\nIt turns out human languages were too messy for the rigid grammars those parsers\ncould handle, but they were a perfect fit for the simpler artificial grammars of\nprogramming languages. Alas, we flawed humans still manage to use those simple\ngrammars incorrectly, so the parser's job also includes letting us know when we\ndo by reporting **syntax errors**.\n\n### Static analysis\n\nThe first two stages are pretty similar across all implementations. Now, the\nindividual characteristics of each language start coming into play. At this\npoint, we know the syntactic structure of the code -- things like which\nexpressions are nested in which -- but we don't know much more than that.\n\nIn an expression like `a + b`, we know we are adding `a` and `b`, but we don't\nknow what those names refer to. Are they local variables? Global? Where are they\ndefined?\n\nThe first bit of analysis that most languages do is called **binding** or\n**resolution**. For each **identifier**, we find out where that name is defined\nand wire the two together. This is where **scope** comes into play -- the region\nof source code where a certain name can be used to refer to a certain\ndeclaration.\n\nIf the language is <span name=\"type\">statically typed</span>, this is when we\ntype check. Once we know where `a` and `b` are declared, we can also figure out\ntheir types. 
Then if those types don't support being added to each other, we\nreport a **type error**.\n\n<aside name=\"type\">\n\nThe language we'll build in this book is dynamically typed, so it will do its\ntype checking later, at runtime.\n\n</aside>\n\nTake a deep breath. We have attained the summit of the mountain and a sweeping\nview of the user's program. All this semantic insight that is visible to us from\nanalysis needs to be stored somewhere. There are a few places we can squirrel it\naway:\n\n* Often, it gets stored right back as **attributes** on the syntax tree\n  itself -- extra fields in the nodes that aren't initialized during parsing\n  but get filled in later.\n\n* Other times, we may store data in a lookup table off to the side. Typically,\n  the keys to this table are identifiers -- names of variables and declarations.\n  In that case, we call it a **symbol table** and the values it associates with\n  each key tell us what that identifier refers to.\n\n* The most powerful bookkeeping tool is to transform the tree into an entirely\n  new data structure that more directly expresses the semantics of the code.\n  That's the next section.\n\nEverything up to this point is considered the **front end** of the\nimplementation. You might guess everything after this is the **back end**, but\nno. Back in the days of yore when \"front end\" and \"back end\" were coined,\ncompilers were much simpler. Later researchers invented new phases to stuff\nbetween the two halves. Rather than discard the old terms, William Wulf and\ncompany lumped those new phases into the charming but spatially paradoxical name\n**middle end**.\n\n### Intermediate representations\n\nYou can think of the compiler as a pipeline where each stage's job is to\norganize the data representing the user's code in a way that makes the next\nstage simpler to implement. The front end of the pipeline is specific to the\nsource language the program is written in. 
The back end is concerned with the\nfinal architecture where the program will run.\n\nIn the middle, the code may be stored in some <span name=\"ir\">**intermediate\nrepresentation**</span> (**IR**) that isn't tightly tied to either the source or\ndestination forms (hence \"intermediate\"). Instead, the IR acts as an interface\nbetween these two languages.\n\n<aside name=\"ir\">\n\nThere are a few well-established styles of IRs out there. Hit your search engine\nof choice and look for \"control flow graph\", \"static single-assignment\",\n\"continuation-passing style\", and \"three-address code\".\n\n</aside>\n\nThis lets you support multiple source languages and target platforms with less\neffort. Say you want to implement Pascal, C, and Fortran compilers, and you want\nto target x86, ARM, and, I dunno, SPARC. Normally, that means you're signing up\nto write *nine* full compilers: Pascal&rarr;x86, C&rarr;ARM, and every other\ncombination.\n\nA <span name=\"gcc\">shared</span> intermediate representation reduces that\ndramatically. You write *one* front end for each source language that produces\nthe IR. Then *one* back end for each target architecture. Now you can mix and\nmatch those to get every combination.\n\n<aside name=\"gcc\">\n\nIf you've ever wondered how [GCC][] supports so many crazy languages and\narchitectures, like Modula-3 on Motorola 68k, now you know. Language front ends\ntarget one of a handful of IRs, mainly [GIMPLE][] and [RTL][]. 
Target back ends\nlike the one for 68k then take those IRs and produce native code.\n\n[gcc]: https://en.wikipedia.org/wiki/GNU_Compiler_Collection\n[gimple]: https://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html\n[rtl]: https://gcc.gnu.org/onlinedocs/gccint/RTL.html\n\n</aside>\n\nThere's another big reason we might want to transform the code into a form that\nmakes the semantics more apparent...\n\n### Optimization\n\nOnce we understand what the user's program means, we are free to swap it out\nwith a different program that has the *same semantics* but implements them more\nefficiently -- we can **optimize** it.\n\nA simple example is **constant folding**: if some expression always evaluates to\nthe exact same value, we can do the evaluation at compile time and replace the\ncode for the expression with its result. If the user typed in this:\n\n```java\npennyArea = 3.14159 * (0.75 / 2) * (0.75 / 2);\n```\n\nwe could do all of that arithmetic in the compiler and change the code to:\n\n```java\npennyArea = 0.4417860938;\n```\n\nOptimization is a huge part of the programming language business. Many language\nhackers spend their entire careers here, squeezing every drop of performance\nthey can out of their compilers to get their benchmarks a fraction of a percent\nfaster. It can become a sort of obsession.\n\nWe're mostly going to <span name=\"rathole\">hop over that rathole</span> in this\nbook. Many successful languages have surprisingly few compile-time\noptimizations. 
For example, Lua and CPython generate relatively unoptimized\ncode, and focus most of their performance effort on the runtime.\n\n<aside name=\"rathole\">\n\nIf you can't resist poking your foot into that hole, some keywords to get you\nstarted are \"constant propagation\", \"common subexpression elimination\", \"loop\ninvariant code motion\", \"global value numbering\", \"strength reduction\", \"scalar\nreplacement of aggregates\", \"dead code elimination\", and \"loop unrolling\".\n\n</aside>\n\n### Code generation\n\nWe have applied all of the optimizations we can think of to the user's program.\nThe last step is converting it to a form the machine can actually run. In other\nwords, **generating code** (or **code gen**), where \"code\" here usually refers to\nthe kind of primitive assembly-like instructions a CPU runs and not the kind of\n\"source code\" a human might want to read.\n\nFinally, we are in the **back end**, descending the other side of the mountain.\nFrom here on out, our representation of the code becomes more and more\nprimitive, like evolution run in reverse, as we get closer to something our\nsimple-minded machine can understand.\n\nWe have a decision to make. Do we generate instructions for a real CPU or a\nvirtual one? If we generate real machine code, we get an executable that the OS\ncan load directly onto the chip. Native code is lightning fast, but generating\nit is a lot of work. Today's architectures have piles of instructions, complex\npipelines, and enough <span name=\"aad\">historical baggage</span> to fill a 747's\nluggage bay.\n\nSpeaking the chip's language also means your compiler is tied to a specific\narchitecture. If your compiler targets [x86][] machine code, it's not going to\nrun on an [ARM][] device. 
All the way back in the '60s, during the\nCambrian explosion of computer architectures, that lack of portability was a\nreal obstacle.\n\n<aside name=\"aad\">\n\nFor example, the [AAD][] (\"ASCII Adjust AX Before Division\") instruction lets\nyou perform division, which sounds useful. Except that instruction takes, as\noperands, two binary-coded decimal digits packed into a single 16-bit register.\nWhen was the last time *you* needed BCD on a 16-bit machine?\n\n[aad]: http://www.felixcloutier.com/x86/AAD.html\n\n</aside>\n\n[x86]: https://en.wikipedia.org/wiki/X86\n[arm]: https://en.wikipedia.org/wiki/ARM_architecture\n\nTo get around that, hackers like Martin Richards and Niklaus Wirth, of BCPL and\nPascal fame, respectively, made their compilers produce *virtual* machine code.\nInstead of instructions for some real chip, they produced code for a\nhypothetical, idealized machine. Wirth called this **p-code** for *portable*,\nbut today, we generally call it **bytecode** because each instruction is often a\nsingle byte long.\n\nThese synthetic instructions are designed to map a little more closely to the\nlanguage's semantics, and not be so tied to the peculiarities of any one\ncomputer architecture and its accumulated historical cruft. You can think of it\nlike a dense, binary encoding of the language's low-level operations.\n\n### Virtual machine\n\nIf your compiler produces bytecode, your work isn't over once that's done. Since\nthere is no chip that speaks that bytecode, it's your job to translate. Again,\nyou have two options. You can write a little mini-compiler for each target\narchitecture that converts the bytecode to native code for that machine. You\nstill have to do work for <span name=\"shared\">each</span> chip you support, but\nthis last stage is pretty simple and you get to reuse the rest of the compiler\npipeline across all of the machines you support. 
You're basically using your\nbytecode as an intermediate representation.\n\n<aside name=\"shared\" class=\"bottom\">\n\nThe basic principle here is that the farther down the pipeline you push the\narchitecture-specific work, the more of the earlier phases you can share across\narchitectures.\n\nThere is a tension, though. Many optimizations, like register allocation and\ninstruction selection, work best when they know the strengths and capabilities\nof a specific chip. Figuring out which parts of your compiler can be shared and\nwhich should be target-specific is an art.\n\n</aside>\n\nOr you can write a <span name=\"vm\">**virtual machine**</span> (**VM**), a\nprogram that emulates a hypothetical chip supporting your virtual architecture\nat runtime. Running bytecode in a VM is slower than translating it to native\ncode ahead of time because every instruction must be simulated at runtime each\ntime it executes. In return, you get simplicity and portability. Implement your\nVM in, say, C, and you can run your language on any platform that has a C\ncompiler. This is how the second interpreter we build in this book works.\n\n<aside name=\"vm\">\n\nThe term \"virtual machine\" also refers to a different kind of abstraction. A\n**system virtual machine** emulates an entire hardware platform and operating\nsystem in software. This is how you can play Windows games on your Linux\nmachine, and how cloud providers give customers the user experience of\ncontrolling their own \"server\" without needing to physically allocate separate\ncomputers for each user.\n\nThe kind of VMs we'll talk about in this book are **language virtual machines**\nor **process virtual machines** if you want to be unambiguous.\n\n</aside>\n\n### Runtime\n\nWe have finally hammered the user's program into a form that we can execute. The\nlast step is running it. If we compiled it to machine code, we simply tell the\noperating system to load the executable and off it goes. 
If we compiled it to\nbytecode, we need to start up the VM and load the program into that.\n\nIn both cases, for all but the basest of low-level languages, we usually need\nsome services that our language provides while the program is running. For\nexample, if the language automatically manages memory, we need a garbage\ncollector going in order to reclaim unused bits. If our language supports\n\"instance of\" tests so you can see what kind of object you have, then we need\nsome representation to keep track of the type of each object during execution.\n\nAll of this stuff is going at runtime, so it's called, appropriately, the\n**runtime**. In a fully compiled language, the code implementing the runtime\ngets inserted directly into the resulting executable. In, say, [Go][], each\ncompiled application has its own copy of Go's runtime directly embedded in it.\nIf the language is run inside an interpreter or VM, then the runtime lives\nthere. This is how most implementations of languages like Java, Python, and\nJavaScript work.\n\n[go]: https://golang.org/\n\n## Shortcuts and Alternate Routes\n\nThat's the long path covering every possible phase you might implement. Many\nlanguages do walk the entire route, but there are a few shortcuts and alternate\npaths.\n\n### Single-pass compilers\n\nSome simple compilers interleave parsing, analysis, and code generation so that\nthey produce output code directly in the parser, without ever allocating any\nsyntax trees or other IRs. These <span name=\"sdt\">**single-pass\ncompilers**</span> restrict the design of the language. You have no intermediate\ndata structures to store global information about the program, and you don't\nrevisit any previously parsed part of the code. That means as soon as you see\nsome expression, you need to know enough to correctly compile it.\n\n<aside name=\"sdt\">\n\n[**Syntax-directed translation**][pass] is a structured technique for building\nthese all-at-once compilers. 
You associate an *action* with each piece of the\ngrammar, usually one that generates output code. Then, whenever the parser\nmatches that chunk of syntax, it executes the action, building up the target\ncode one rule at a time.\n\n[pass]: https://en.wikipedia.org/wiki/Syntax-directed_translation\n\n</aside>\n\nPascal and C were designed around this limitation. At the time, memory was so\nprecious that a compiler might not even be able to hold an entire *source file*\nin memory, much less the whole program. This is why Pascal's grammar requires\ntype declarations to appear first in a block. It's why in C you can't call a\nfunction above the code that defines it unless you have an explicit forward\ndeclaration that tells the compiler what it needs to know to generate code for a\ncall to the later function.\n\n### Tree-walk interpreters\n\nSome programming languages begin executing code right after parsing it to an AST\n(with maybe a bit of static analysis applied). To run the program, the\ninterpreter traverses the syntax tree one branch and leaf at a time, evaluating\neach node as it goes.\n\nThis implementation style is common for student projects and little languages,\nbut is not widely used for <span name=\"ruby\">general-purpose</span> languages\nsince it tends to be slow. Some people use \"interpreter\" to mean only these\nkinds of implementations, but others define that word more generally, so I'll\nuse the inarguably explicit **tree-walk interpreter** to refer to these. Our\nfirst interpreter rolls this way.\n\n<aside name=\"ruby\">\n\nA notable exception is early versions of Ruby, which were tree walkers. At 1.9,\nthe canonical implementation of Ruby switched from the original MRI (Matz's Ruby\nInterpreter) to Koichi Sasada's YARV (Yet Another Ruby VM). YARV is a\nbytecode virtual machine.\n\n</aside>\n\n### Transpilers\n\n<span name=\"gary\">Writing</span> a complete back end for a language can be a lot\nof work. 
If you have some existing generic IR to target, you could bolt your\nfront end onto that. Otherwise, it seems like you're stuck. But what if you\ntreated some other *source language* as if it were an intermediate\nrepresentation?\n\nYou write a front end for your language. Then, in the back end, instead of doing\nall the work to *lower* the semantics to some primitive target language, you\nproduce a string of valid source code for some other language that's about as\nhigh level as yours. Then, you use the existing compilation tools for *that*\nlanguage as your escape route off the mountain and down to something you can\nexecute.\n\nThey used to call this a **source-to-source compiler** or a **transcompiler**.\nAfter the rise of languages that compile to JavaScript in order to run in the\nbrowser, they've affected the hipster sobriquet **transpiler**.\n\n<aside name=\"gary\">\n\nThe first transcompiler, XLT86, translated 8080 assembly into 8086 assembly.\nThat might seem straightforward, but keep in mind the 8080 was an 8-bit chip and\nthe 8086 a 16-bit chip that could use each register as a pair of 8-bit ones.\nXLT86 did data flow analysis to track register usage in the source program and\nthen efficiently map it to the register set of the 8086.\n\nIt was written by Gary Kildall, a tragic hero of computer science if there\never was one. One of the first people to recognize the promise of\nmicrocomputers, he created PL/M and CP/M, the first high-level language and OS\nfor them.\n\nHe was a sea captain, business owner, licensed pilot, and motorcyclist. A TV\nhost with the Kris Kristofferson-esque look sported by dashing bearded dudes in\nthe '80s. He took on Bill Gates and, like many, lost, before meeting his end in\na biker bar under mysterious circumstances. He died too young, but sure as hell\nlived before he did.\n\n</aside>\n\nWhile the first transcompiler translated one assembly language to another,\ntoday, most transpilers work on higher-level languages. 
After the viral spread\nof UNIX to machines various and sundry, there began a long tradition of\ncompilers that produced C as their output language. C compilers were available\neverywhere UNIX was and produced efficient code, so targeting C was a good way\nto get your language running on a lot of architectures.\n\nWeb browsers are the \"machines\" of today, and their \"machine code\" is\nJavaScript, so these days it seems [almost every language out there][js] has a\ncompiler that targets JS since that's the <span name=\"js\">main</span> way to get\nyour code running in a browser.\n\n[js]: https://github.com/jashkenas/coffeescript/wiki/list-of-languages-that-compile-to-js\n\n<aside name=\"js\">\n\nJS used to be the *only* way to execute code in a browser. Thanks to\n[WebAssembly][], compilers now have a second, lower-level language they can\ntarget that runs on the web.\n\n[webassembly]: https://github.com/webassembly/\n\n</aside>\n\nThe front end -- scanner and parser -- of a transpiler looks like other\ncompilers. Then, if the source language is only a simple syntactic skin over the\ntarget language, it may skip analysis entirely and go straight to outputting the\nanalogous syntax in the destination language.\n\nIf the two languages are more semantically different, you'll see more of the\ntypical phases of a full compiler including analysis and possibly even\noptimization. Then, when it comes to code generation, instead of outputting some\nbinary language like machine code, you produce a string of grammatically correct\nsource (well, destination) code in the target language.\n\nEither way, you then run that resulting code through the output language's\nexisting compilation pipeline, and you're good to go.\n\n### Just-in-time compilation\n\nThis last one is less a shortcut and more a dangerous alpine scramble best\nreserved for experts. 
The fastest way to execute code is by compiling it to\nmachine code, but you might not know what architecture your end user's machine\nsupports. What to do?\n\nYou can do the same thing that the HotSpot Java Virtual Machine (JVM),\nMicrosoft's Common Language Runtime (CLR), and most JavaScript interpreters do.\nOn the end user's machine, when the program is loaded -- either from source in\nthe case of JS, or platform-independent bytecode for the JVM and CLR -- you\ncompile it to native code for the architecture their computer supports.\nNaturally enough, this is called **just-in-time compilation**. Most hackers just\nsay \"JIT\", pronounced like it rhymes with \"fit\".\n\nThe most sophisticated JITs insert profiling hooks into the generated code to\nsee which regions are most performance critical and what kind of data is flowing\nthrough them. Then, over time, they will automatically recompile those <span\nname=\"hot\">hot spots</span> with more advanced optimizations.\n\n<aside name=\"hot\">\n\nThis is, of course, exactly where the HotSpot JVM gets its name.\n\n</aside>\n\n## Compilers and Interpreters\n\nNow that I've stuffed your head with a dictionary's worth of programming\nlanguage jargon, we can finally address a question that's plagued coders since\ntime immemorial: What's the difference between a compiler and an interpreter?\n\nIt turns out this is like asking the difference between a fruit and a vegetable.\nThat seems like a binary either-or choice, but actually \"fruit\" is a *botanical*\nterm and \"vegetable\" is *culinary*. One does not strictly imply the negation of\nthe other. 
There are fruits that aren't vegetables (apples) and vegetables that\naren't fruits (carrots), but also edible plants that are both fruits *and*\nvegetables, like tomatoes.\n\n<span name=\"veg\"></span>\n\n<img src=\"image/a-map-of-the-territory/plants.png\" alt=\"A Venn diagram of edible plants\" />\n\n<aside name=\"veg\">\n\nPeanuts (which are not even nuts) and cereals like wheat are actually fruit, but\nI got this drawing wrong. What can I say, I'm a software engineer, not a\nbotanist. I should probably erase the little peanut guy, but he's so cute that I\ncan't bear to.\n\nNow *pine nuts*, on the other hand, are plant-based foods that are neither\nfruits nor vegetables. At least as far as I can tell.\n\n</aside>\n\nSo, back to languages:\n\n* **Compiling** is an *implementation technique* that involves translating a\n  source language to some other -- usually lower-level -- form. When you\n  generate bytecode or machine code, you are compiling. When you transpile to\n  another high-level language, you are compiling too.\n\n* When we say a language implementation \"is a **compiler**\", we mean it\n  translates source code to some other form but doesn't execute it. The user has\n  to take the resulting output and run it themselves.\n\n* Conversely, when we say an implementation \"is an **interpreter**\", we mean it\n  takes in source code and executes it immediately. It runs programs \"from\n  source\".\n\nLike apples and oranges, some implementations are clearly compilers and *not*\ninterpreters. GCC and Clang take your C code and compile it to machine code. An\nend user runs that executable directly and may never even know which tool was\nused to compile it. So those are *compilers* for C.\n\nIn older versions of Matz's canonical implementation of Ruby, the user ran Ruby\nfrom source. The implementation parsed it and executed it directly by traversing\nthe syntax tree. No other translation occurred, either internally or in any\nuser-visible form. 
So this was definitely an *interpreter* for Ruby.\n\nBut what of CPython? When you run your Python program using it, the code is\nparsed and converted to an internal bytecode format, which is then executed\ninside the VM. From the user's perspective, this is clearly an interpreter --\nthey run their program from source. But if you look under CPython's scaly skin,\nyou'll see that there is definitely some compiling going on.\n\nThe answer is that it is <span name=\"go\">both</span>. CPython *is* an\ninterpreter, and it *has* a compiler. In practice, most scripting languages work\nthis way, as you can see:\n\n<aside name=\"go\">\n\nThe [Go tool][] is even more of a horticultural curiosity. If you run `go\nbuild`, it compiles your Go source code to machine code and stops. If you type\n`go run`, it does that, then immediately executes the generated executable.\n\nSo `go` *is* a compiler (you can use it as a tool to compile code without\nrunning it), *is* an interpreter (you can invoke it to immediately run a program\nfrom source), and also *has* a compiler (when you use it as an interpreter, it\nis still compiling internally).\n\n[go tool]: https://golang.org/cmd/go/\n\n</aside>\n\n<img src=\"image/a-map-of-the-territory/venn.png\" alt=\"A Venn diagram of compilers and interpreters\" />\n\nThat overlapping region in the center is where our second interpreter lives too,\nsince it internally compiles to bytecode. So while this book is nominally about\ninterpreters, we'll cover some compilation too.\n\n## Our Journey\n\nThat's a lot to take in all at once. Don't worry. This isn't the chapter where\nyou're expected to *understand* all of these pieces and parts. I just want you\nto know that they are out there and roughly how they fit together.\n\nThis map should serve you well as you explore the territory beyond the guided\npath we take in this book. 
I want to leave you yearning to strike out on your\nown and wander all over that mountain.\n\nBut, for now, it's time for our own journey to begin. Tighten your bootlaces,\ncinch up your pack, and come along. From <span name=\"here\">here</span> on out,\nall you need to focus on is the path in front of you.\n\n<aside name=\"here\">\n\nHenceforth, I promise to tone down the whole mountain metaphor thing.\n\n</aside>\n\n<div class=\"challenges\">\n\n## Challenges\n\n1. Pick an open source implementation of a language you like. Download the\n   source code and poke around in it. Try to find the code that implements the\n   scanner and parser. Are they handwritten, or generated using tools like\n   Lex and Yacc? (`.l` or `.y` files usually imply the latter.)\n\n1. Just-in-time compilation tends to be the fastest way to implement dynamically\n   typed languages, but not all of them use it. What reasons are there to *not*\n   JIT?\n\n1. Most Lisp implementations that compile to C also contain an interpreter that\n   lets them execute Lisp code on the fly as well. Why?\n\n</div>\n"
  },
  {
    "path": "book/a-tree-walk-interpreter.md",
    "content": "With this part, we begin jlox, the first of our two interpreters. Programming\nlanguages are a huge topic with piles of concepts and terminology to cram into\nyour brain all at once. Programming language theory requires a level of mental\nrigor that you probably haven't had to summon since your last calculus final.\n(Fortunately there isn't too much theory in this book.)\n\nImplementing an interpreter uses a few architectural tricks and design\npatterns uncommon in other kinds of applications, so we'll be getting used to\nthe engineering side of things too. Given all of that, we'll keep the code we\nhave to write as simple and plain as possible.\n\nIn less than two thousand lines of clean Java code, we'll build a complete\ninterpreter for Lox that implements every single feature of the language,\nexactly as we've specified. The first few chapters work front-to-back through\nthe phases of the interpreter -- [scanning][], [parsing][], and\n[evaluating code][]. After that, we add language features one at a time,\ngrowing a simple calculator into a full-fledged scripting language.\n\n[scanning]: scanning.html\n[parsing]: parsing-expressions.html\n[evaluating code]: evaluating-expressions.html\n"
  },
  {
    "path": "book/a-virtual-machine.md",
    "content": "> Magicians protect their secrets not because the secrets are large and\n> important, but because they are so small and trivial. The wonderful effects\n> created on stage are often the result of a secret so absurd that the magician\n> would be embarrassed to admit that that was how it was done.\n>\n> <cite>Christopher Priest, <em>The Prestige</em></cite>\n\nWe've spent a lot of time talking about how to represent a program as a sequence\nof bytecode instructions, but it feels like learning biology using only stuffed,\ndead animals. We know what instructions are in theory, but we've never seen them\nin action, so it's hard to really understand what they *do*. It would be hard to\nwrite a compiler that outputs bytecode when we don't have a good understanding\nof how that bytecode behaves.\n\nSo, before we go and build the front end of our new interpreter, we will begin\nwith the back end -- the virtual machine that executes instructions. It breathes\nlife into the bytecode. Watching the instructions prance around gives us a\nclearer picture of how a compiler might translate the user's source code into a\nseries of them.\n\n## An Instruction Execution Machine\n\nThe virtual machine is one part of our interpreter's internal architecture. You\nhand it a chunk of code -- literally a Chunk -- and it runs it. The code and\ndata structures for the VM reside in a new module.\n\n^code vm-h\n\nAs usual, we start simple. The VM will gradually acquire a whole pile of state\nit needs to keep track of, so we define a struct now to stuff that all in.\nCurrently, all we store is the chunk that it executes.\n\nLike we do with most of the data structures we create, we also define functions\nto create and tear down a VM. Here's the implementation:\n\n^code vm-c\n\nOK, calling those functions \"implementations\" is a stretch. We don't have any\ninteresting state to initialize or free yet, so the functions are empty. 
Trust\nme, we'll get there.\n\nThe slightly more interesting line here is that declaration of `vm`. This module\nis eventually going to have a slew of functions and it would be a chore to pass\naround a pointer to the VM to all of them. Instead, we declare a single global\nVM object. We need only one anyway, and this keeps the code in the book a little\nlighter on the page.\n\n<aside name=\"one\">\n\nThe choice to have a static VM instance is a concession for the book, but not\nnecessarily a sound engineering choice for a real language implementation. If\nyou're building a VM that's designed to be embedded in other host applications,\nit gives the host more flexibility if you *do* explicitly take a VM pointer\nand pass it around.\n\nThat way, the host app can control when and where memory for the VM is\nallocated, run multiple VMs in parallel, etc.\n\nWhat I'm doing here is a global variable, and [everything bad you've heard about\nglobal variables][global] is still true when programming in the large. But when\nkeeping things small for a book...\n\n[global]: http://gameprogrammingpatterns.com/singleton.html\n\n</aside>\n\nBefore we start pumping fun code into our VM, let's go ahead and wire it up to\nthe interpreter's main entrypoint.\n\n^code main-init-vm (1 before, 1 after)\n\nWe spin up the VM when the interpreter first starts. Then when we're about to\nexit, we wind it down.\n\n^code main-free-vm (1 before, 1 after)\n\nOne last ceremonial obligation:\n\n^code main-include-vm (1 before, 2 after)\n\nNow when you run clox, it starts up the VM before it creates that hand-authored\nchunk from the [last chapter][]. The VM is ready and waiting, so let's teach it\nto do something.\n\n[last chapter]: chunks-of-bytecode.html#disassembling-chunks\n\n### Executing instructions\n\nThe VM springs into action when we command it to interpret a chunk of bytecode.\n\n^code main-interpret (1 before, 1 after)\n\nThis function is the main entrypoint into the VM. 
It's declared like so:\n\n^code interpret-h (1 before, 2 after)\n\nThe VM runs the chunk and then responds with a value from this enum:\n\n^code interpret-result (2 before, 2 after)\n\nWe aren't using the result yet, but when we have a compiler that reports static\nerrors and a VM that detects runtime errors, the interpreter will use this to\nknow how to set the exit code of the process.\n\nWe're inching towards some actual implementation.\n\n^code interpret\n\nFirst, we store the chunk being executed in the VM. Then we call `run()`, an\ninternal helper function that actually runs the bytecode instructions. Between\nthose two parts is an intriguing line. What is this `ip` business?\n\nAs the VM works its way through the bytecode, it keeps track of where it is --\nthe location of the instruction currently being executed. We don't use a <span\nname=\"local\">local</span> variable inside `run()` for this because eventually\nother functions will need to access it. Instead, we store it as a field in VM.\n\n<aside name=\"local\">\n\nIf we were trying to squeeze every ounce of speed out of our bytecode\ninterpreter, we would store `ip` in a local variable. It gets modified so often\nduring execution that we want the C compiler to keep it in a register.\n\n</aside>\n\n^code ip (2 before, 1 after)\n\nIts type is a byte pointer. We use an actual real C pointer pointing right into\nthe middle of the bytecode array instead of something like an integer index\nbecause it's faster to dereference a pointer than look up an element in an array\nby index.\n\nThe name \"IP\" is traditional, and -- unlike many traditional names in CS --\nactually makes sense: it's an **[instruction pointer][ip]**. Almost every\ninstruction set in the <span name=\"ip\">world</span>, real and virtual, has a\nregister or variable like this.\n\n[ip]: https://en.wikipedia.org/wiki/Program_counter\n\n<aside name=\"ip\">\n\nx86, x64, and the CLR call it \"IP\". 
68k, PowerPC, ARM, p-code, and the JVM call\nit \"PC\", for **program counter**.\n\n</aside>\n\nWe initialize `ip` by pointing it at the first byte of code in the chunk. We\nhaven't executed that instruction yet, so `ip` points to the instruction *about\nto be executed*. This will be true during the entire time the VM is running: the\nIP always points to the next instruction, not the one currently being handled.\n\nThe real fun happens in `run()`.\n\n^code run\n\nThis is the single most <span name=\"important\">important</span> function in all\nof clox, by far. When the interpreter executes a user's program, it will spend\nsomething like 90% of its time inside `run()`. It is the beating heart of the\nVM.\n\n<aside name=\"important\">\n\nOr, at least, it *will* be in a few chapters when it has enough content to be\nuseful. Right now, it's not exactly a wonder of software wizardry.\n\n</aside>\n\nDespite that dramatic intro, it's conceptually pretty simple. We have an outer\nloop that goes and goes. Each turn through that loop, we read and execute a\nsingle bytecode instruction.\n\nTo process an instruction, we first figure out what kind of instruction we're\ndealing with. The `READ_BYTE` macro reads the byte currently pointed at by `ip`\nand then <span name=\"next\">advances</span> the instruction pointer. The first\nbyte of any instruction is the opcode. Given a numeric opcode, we need to get to\nthe right C code that implements that instruction's semantics. This process is\ncalled **decoding** or **dispatching** the instruction.\n\n<aside name=\"next\">\n\nNote that `ip` advances as soon as we read the opcode, before we've actually\nstarted executing the instruction. So, again, `ip` points to the *next*\nbyte of code to be used.\n\n</aside>\n\nWe do that process for every single instruction, every single time one is\nexecuted, so this is the most performance critical part of the entire virtual\nmachine. 
Programming language lore is filled with <span\nname=\"dispatch\">clever</span> techniques to do bytecode dispatch efficiently,\ngoing all the way back to the early days of computers.\n\n<aside name=\"dispatch\">\n\nIf you want to learn some of these techniques, look up \"direct threaded code\",\n\"jump table\", and \"computed goto\".\n\n</aside>\n\nAlas, the fastest solutions require either non-standard extensions to C, or\nhandwritten assembly code. For clox, we'll keep it simple. Just like our\ndisassembler, we have a single giant `switch` statement with a case for each\nopcode. The body of each case implements that opcode's behavior.\n\nSo far, we handle only a single instruction, `OP_RETURN`, and the only thing it\ndoes is exit the loop entirely. Eventually, that instruction will be used to\nreturn from the current Lox function, but we don't have functions yet, so we'll\nrepurpose it temporarily to end the execution.\n\nLet's go ahead and support our one other instruction.\n\n^code op-constant (1 before, 1 after)\n\nWe don't have enough machinery in place yet to do anything useful with a\nconstant. For now, we'll just print it out so we interpreter hackers can see\nwhat's going on inside our VM. That call to `printf()` necessitates an include.\n\n^code vm-include-stdio (1 after)\n\nWe also have a new macro to define.\n\n^code read-constant (1 before, 2 after)\n\n`READ_CONSTANT()` reads the next byte from the bytecode, treats the resulting\nnumber as an index, and looks up the corresponding Value in the chunk's constant\ntable. In later chapters, we'll add a few more instructions with operands that\nrefer to constants, so we're setting up this helper macro now.\n\nLike the previous `READ_BYTE` macro, `READ_CONSTANT` is only used inside\n`run()`. To make that scoping more explicit, the macro definitions themselves\nare confined to that function. 
We <span name=\"macro\">define</span> them at the\nbeginning and -- because we care -- undefine them at the end.\n\n^code undef-read-constant (1 before, 1 after)\n\n<aside name=\"macro\">\n\nUndefining these macros explicitly might seem needlessly fastidious, but C tends\nto punish sloppy users, and the C preprocessor doubly so.\n\n</aside>\n\n### Execution tracing\n\nIf you run clox now, it executes the chunk we hand-authored in the last chapter\nand spits out `1.2` to your terminal. We can see that it's working, but that's\nonly because our implementation of `OP_CONSTANT` has temporary code to log the\nvalue. Once that instruction is doing what it's supposed to do and plumbing that\nconstant along to other operations that want to consume it, the VM will become a\nblack box. That makes our lives as VM implementers harder.\n\nTo help ourselves out, now is a good time to add some diagnostic logging to the\nVM like we did with chunks themselves. In fact, we'll even reuse the same code.\nWe don't want this logging enabled all the time -- it's just for us VM hackers,\nnot Lox users -- so first we create a flag to hide it behind.\n\n^code define-debug-trace (1 before, 2 after)\n\nWhen this flag is defined, the VM disassembles and prints each instruction right\nbefore executing it. Where our previous disassembler walked an entire chunk\nonce, statically, this disassembles instructions dynamically, on the fly.\n\n^code trace-execution (1 before, 1 after)\n\nSince `disassembleInstruction()` takes an integer byte *offset* and we store the\ncurrent instruction reference as a direct pointer, we first do a little pointer\nmath to convert `ip` back to a relative offset from the beginning of the\nbytecode. 
Then we disassemble the instruction that begins at that byte.\n\nAs ever, we need to bring in the declaration of the function before we can call\nit.\n\n^code vm-include-debug (1 before, 1 after)\n\nI know this code isn't super impressive so far -- it's literally a switch\nstatement wrapped in a `for` loop -- but, believe it or not, this is one of the two\nmajor components of our VM. With this, we can imperatively execute instructions.\nIts simplicity is a virtue -- the less work it does, the faster it can do it.\nContrast this with all of the complexity and overhead we had in jlox with the\nVisitor pattern for walking the AST.\n\n## A Value Stack Manipulator\n\nIn addition to imperative side effects, Lox has expressions that produce,\nmodify, and consume values. Thus, our compiled bytecode needs a way to shuttle\nvalues around between the different instructions that need them. For example:\n\n```lox\nprint 3 - 2;\n```\n\nWe obviously need instructions for the constants 3 and 2, the `print` statement,\nand the subtraction. But how does the subtraction instruction know that 3 is\nthe <span name=\"word\">minuend</span> and 2 is the subtrahend? How does the print\ninstruction know to print the result of that?\n\n<aside name=\"word\">\n\nYes, I did have to look up \"subtrahend\" and \"minuend\" in a dictionary. But\naren't they delightful words? \"Minuend\" sounds like a kind of Elizabethan dance\nand \"subtrahend\" might be some sort of underground Paleolithic monument.\n\n</aside>\n\nTo put a finer point on it, look at this thing right here:\n\n```lox\nfun echo(n) {\n  print n;\n  return n;\n}\n\nprint echo(echo(1) + echo(2)) + echo(echo(4) + echo(5));\n```\n\nI wrapped each subexpression in a call to `echo()` that prints and returns its\nargument. That side effect means we can see the exact order of operations.\n\nDon't worry about the VM for a minute. Think about just the semantics of Lox\nitself. 
The operands to an arithmetic operator obviously need to be evaluated\nbefore we can perform the operation itself. (It's pretty hard to add `a + b` if\nyou don't know what `a` and `b` are.) Also, when we implemented expressions in\njlox, we <span name=\"undefined\">decided</span> that the left operand must be\nevaluated before the right.\n\n<aside name=\"undefined\">\n\nWe could have left evaluation order unspecified and let each implementation\ndecide. That leaves the door open for optimizing compilers to reorder arithmetic\nexpressions for efficiency, even in cases where the operands have visible side\neffects. C and Scheme leave evaluation order unspecified. Java specifies\nleft-to-right evaluation like we do for Lox.\n\nI think nailing down stuff like this is generally better for users. When\nexpressions are not evaluated in the order users intuit -- possibly in different\norders across different implementations! -- it can be a burning hellscape of\npain to figure out what's going on.\n\n</aside>\n\nHere is the syntax tree for the `print` statement:\n\n<img src=\"image/a-virtual-machine/ast.png\" alt=\"The AST for the example\nstatement, with numbers marking the order that the nodes are evaluated.\" />\n\nGiven left-to-right evaluation, and the way the expressions are nested, any\ncorrect Lox implementation *must* print these numbers in this order:\n\n```text\n1  // from echo(1)\n2  // from echo(2)\n3  // from echo(1 + 2)\n4  // from echo(4)\n5  // from echo(5)\n9  // from echo(4 + 5)\n12 // from print 3 + 9\n```\n\nOur old jlox interpreter accomplishes this by recursively traversing the AST. It\ndoes a postorder traversal. First it recurses down the left operand branch,\nthen the right operand, then finally it evaluates the node itself.\n\nAfter evaluating the left operand, jlox needs to store that result somewhere\ntemporarily while it's busy traversing down through the right operand tree. We\nuse a local variable in Java for that. 
Our recursive tree-walk interpreter\ncreates a unique Java call frame for each node being evaluated, so we could have\nas many of these local variables as we needed.\n\nIn clox, our `run()` function is not recursive -- the nested expression tree is\nflattened out into a linear series of instructions. We don't have the luxury of\nusing C local variables, so how and where should we store these temporary\nvalues? You can probably <span name=\"guess\">guess</span> already, but I want to\nreally drill into this because it's an aspect of programming that we take for\ngranted, but we rarely learn *why* computers are architected this way.\n\n<aside name=\"guess\">\n\nHint: it's in the name of this section, and it's how Java and C manage recursive\ncalls to functions.\n\n</aside>\n\nLet's do a weird exercise. We'll walk through the execution of the above program\na step at a time:\n\n<img src=\"image/a-virtual-machine/bars.png\" alt=\"The series of instructions with\nbars showing which numbers need to be preserved across which instructions.\" />\n\nOn the left are the steps of code. On the right are the values we're tracking.\nEach bar represents a number. It starts when the value is first produced --\neither a constant or the result of an addition. The length of the bar tracks\nwhen a previously produced value needs to be kept around, and it ends when that\nvalue finally gets consumed by an operation.\n\nAs you step through, you see values appear and then later get eaten. The\nlongest-lived ones are the values produced from the left-hand side of an\naddition. Those stick around while we work through the right-hand operand\nexpression.\n\nIn the above diagram, I gave each unique number its own visual column. Let's be\na little more parsimonious. Once a number is consumed, we allow its column to be\nreused for another later value. 
In other words, we take all of those gaps\nup there and fill them in, pushing in numbers from the right:\n\n<img src=\"image/a-virtual-machine/bars-stacked.png\" alt=\"Like the previous\ndiagram, but with number bars pushed to the left, forming a stack.\" />\n\nThere's some interesting stuff going on here. When we shift everything over,\neach number still manages to stay in a single column for its entire life. Also,\nthere are no gaps left. In other words, whenever a number appears earlier than\nanother, then it will live at least as long as that second one. The first number\nto appear is the last to be consumed. Hmm... last-in, first-out... why, that's a\n<span name=\"pancakes\">stack</span>!\n\n<aside name=\"pancakes\">\n\nThis is also a stack:\n\n<img src=\"image/a-virtual-machine/pancakes.png\" alt=\"A stack... of pancakes.\" />\n\n</aside>\n\nIn the second diagram, each time we introduce a number, we push it onto the\nstack from the right. When numbers are consumed, they are always popped off from\nrightmost to left.\n\nSince the temporary values we need to track naturally have stack-like behavior,\nour VM will use a stack to manage them. When an instruction \"produces\" a value,\nit pushes it onto the stack. When it needs to consume one or more values, it\ngets them by popping them off the stack.\n\n### The VM's Stack\n\nMaybe this doesn't seem like a revelation, but I *love* stack-based VMs. When\nyou first see a magic trick, it feels like something actually magical. But then\nyou learn how it works -- usually some mechanical gimmick or misdirection -- and\nthe sense of wonder evaporates. There are a <span name=\"wonder\">couple</span> of\nideas in computer science where even after I pulled them apart and learned all\nthe ins and outs, some of the initial sparkle remained. Stack-based VMs are one\nof those.\n\n<aside name=\"wonder\">\n\nHeaps -- [the data structure][heap], not [the memory management thing][heap mem]\n-- are another. 
And Vaughan Pratt's top-down operator precedence parsing scheme,\nwhich we'll learn about [in due time][pratt].\n\n[heap]: https://en.wikipedia.org/wiki/Heap_(data_structure)\n[heap mem]: https://en.wikipedia.org/wiki/Memory_management#HEAP\n[pratt]: compiling-expressions.html\n\n</aside>\n\nAs you'll see in this chapter, executing instructions in a stack-based VM is\ndead <span name=\"cheat\">simple</span>. In later chapters, you'll also discover\nthat compiling a source language to a stack-based instruction set is a piece of\ncake. And yet, this architecture is fast enough to be used by production\nlanguage implementations. It almost feels like cheating at the programming\nlanguage game.\n\n<aside name=\"cheat\">\n\nTo take a bit of the sheen off: stack-based interpreters aren't a silver bullet.\nThey're often *adequate*, but modern implementations of the JVM, the CLR, and\nJavaScript all use sophisticated [just-in-time compilation][jit] pipelines to\ngenerate *much* faster native code on the fly.\n\n[jit]: https://en.wikipedia.org/wiki/Just-in-time_compilation\n\n</aside>\n\nAlrighty, it's codin' time! Here's the stack:\n\n^code vm-stack (3 before, 1 after)\n\nWe implement the stack semantics ourselves on top of a raw C array. The bottom\nof the stack -- the first value pushed and the last to be popped -- is at\nelement zero in the array, and later pushed values follow it. If we push the\nletters of \"crepe\" -- my favorite stackable breakfast item -- onto the stack, in\norder, the resulting C array looks like this:\n\n<img src=\"image/a-virtual-machine/array.png\" alt=\"An array containing the\nletters in 'crepe' in order starting at element 0.\" />\n\nSince the stack grows and shrinks as values are pushed and popped, we need to\ntrack where the top of the stack is in the array. 
As with `ip`, we use a direct\npointer instead of an integer index since it's faster to dereference the pointer\nthan calculate the offset from the index each time we need it.\n\nThe pointer points at the array element just *past* the element containing the\ntop value on the stack. That seems a little odd, but almost every implementation\ndoes this. It means we can indicate that the stack is empty by pointing at\nelement zero in the array.\n\n<img src=\"image/a-virtual-machine/stack-empty.png\" alt=\"An empty array with\nstackTop pointing at the first element.\" />\n\nIf we pointed to the top element, then for an empty stack we'd need to point at\nelement -1. That's <span name=\"defined\">undefined</span> in C. As we push values\nonto the stack...\n\n<aside name=\"defined\">\n\nWhat about when the stack is *full*, you ask, Clever Reader? The C standard is\none step ahead of you. It *is* allowed and well-specified to have an array\npointer that points just past the end of an array.\n\n</aside>\n\n<img src=\"image/a-virtual-machine/stack-c.png\" alt=\"An array with 'c' at element\nzero.\" />\n\n...`stackTop` always points just past the last item.\n\n<img src=\"image/a-virtual-machine/stack-crepe.png\" alt=\"An array with 'c', 'r',\n'e', 'p', and 'e' in the first five elements.\" />\n\nI remember it like this: `stackTop` points to where the next value to be pushed\nwill go. The maximum number of values we can store on the stack (for now, at\nleast) is:\n\n^code stack-max (1 before, 2 after)\n\nGiving our VM a fixed stack size means it's possible for some sequence of\ninstructions to push too many values and run out of stack space -- the classic\n\"stack overflow\". We could grow the stack dynamically as needed, but for now\nwe'll keep it simple. 
Since VM uses Value, we need to include its declaration.\n\n^code vm-include-value (1 before, 2 after)\n\nNow that VM has some interesting state, we get to initialize it.\n\n^code call-reset-stack (1 before, 1 after)\n\nThat uses this helper function:\n\n^code reset-stack\n\nSince the stack array is declared directly inline in the VM struct, we don't\nneed to allocate it. We don't even need to clear the unused cells in the\narray -- we simply won't access them until after values have been stored in\nthem. The only initialization we need is to set `stackTop` to point to the\nbeginning of the array to indicate that the stack is empty.\n\nThe stack protocol supports two operations:\n\n^code push-pop (1 before, 2 after)\n\nYou can push a new value onto the top of the stack, and you can pop the most\nrecently pushed value back off. Here's the first function:\n\n^code push\n\nIf you're rusty on your C pointer syntax and operations, this is a good warm-up.\nThe first line stores `value` in the array element at the top of the stack.\nRemember, `stackTop` points just *past* the last used element, at the next\navailable one. This stores the value in that slot. Then we increment the pointer\nitself to point to the next unused slot in the array now that the previous slot\nis occupied.\n\nPopping is the mirror image.\n\n^code pop\n\nFirst, we move the stack pointer *back* to get to the most recent used slot in\nthe array. Then we look up the value at that index and return it. We don't need\nto explicitly \"remove\" it from the array -- moving `stackTop` down is enough to\nmark that slot as no longer in use.\n\n### Stack tracing\n\nWe have a working stack, but it's hard to *see* that it's working. When we start\nimplementing more complex instructions and compiling and running larger pieces\nof code, we'll end up with a lot of values crammed into that array. 
It would\nmake our lives as VM hackers easier if we had some visibility into the stack.\n\nTo that end, whenever we're tracing execution, we'll also show the current\ncontents of the stack before we interpret each instruction.\n\n^code trace-stack (1 before, 1 after)\n\nWe loop, printing each value in the array, starting at the first (bottom of the\nstack) and ending when we reach the top. This lets us observe the effect of each\ninstruction on the stack. The output is pretty verbose, but it's useful when\nwe're surgically extracting a nasty bug from the bowels of the interpreter.\n\nStack in hand, let's revisit our two instructions. First up:\n\n^code push-constant (2 before, 1 after)\n\nIn the last chapter, I was hand-wavey about how the `OP_CONSTANT` instruction\n\"loads\" a constant. Now that we have a stack you know what it means to actually\nproduce a value: it gets pushed onto the stack.\n\n^code print-return (1 before, 1 after)\n\nThen we make `OP_RETURN` pop the stack and print the top value before exiting.\nWhen we add support for real functions to clox, we'll change this code. But, for\nnow, it gives us a way to get the VM executing simple instruction sequences and\ndisplaying the result.\n\n## An Arithmetic Calculator\n\nThe heart and soul of our VM are in place now. The bytecode loop dispatches and\nexecutes instructions. The stack grows and shrinks as values flow through it.\nThe two halves work, but it's hard to get a feel for how cleverly they interact\nwith only the two rudimentary instructions we have so far. So let's teach our\ninterpreter to do arithmetic.\n\nWe'll start with the simplest arithmetic operation, unary negation.\n\n```lox\nvar a = 1.2;\nprint -a; // -1.2.\n```\n\nThe prefix `-` operator takes one operand, the value to negate. It produces a\nsingle result. 
We aren't fussing with a parser yet, but we can add the\nbytecode instruction that the above syntax will compile to.\n\n^code negate-op (1 before, 1 after)\n\nWe execute it like so:\n\n^code op-negate (1 before, 1 after)\n\nThe instruction needs a value to operate on, which it gets by popping from the\nstack. It negates that, then pushes the result back on for later instructions to\nuse. Doesn't get much easier than that. We can disassemble it too.\n\n^code disassemble-negate (2 before, 1 after)\n\nAnd we can try it out in our test chunk.\n\n^code main-negate (1 before, 2 after)\n\nAfter loading the constant, but before returning, we execute the negate\ninstruction. That replaces the constant on the stack with its negation. Then the\nreturn instruction prints that out:\n\n```text\n-1.2\n```\n\nMagical!\n\n### Binary operators\n\nOK, unary operators aren't *that* impressive. We still only ever have a single\nvalue on the stack. To really see some depth, we need binary operators. Lox has\nfour binary <span name=\"ops\">arithmetic</span> operators: addition, subtraction,\nmultiplication, and division. We'll go ahead and implement them all at the same\ntime.\n\n<aside name=\"ops\">\n\nLox has some other binary operators -- comparison and equality -- but those\ndon't produce numbers as a result, so we aren't ready for them yet.\n\n</aside>\n\n^code binary-ops (1 before, 1 after)\n\nBack in the bytecode loop, they are executed like this:\n\n^code op-binary (1 before, 1 after)\n\nThe only difference between these four instructions is which underlying C\noperator they ultimately use to combine the two operands. Surrounding that core\narithmetic expression is some boilerplate code to pull values off the stack and\npush the result. 
When we later add dynamic typing, that boilerplate will grow.\nTo avoid repeating that code four times, I wrapped it up in a macro.\n\n^code binary-op (1 before, 2 after)\n\nI admit this is a fairly <span name=\"operator\">adventurous</span> use of the C\npreprocessor. I hesitated to do this, but you'll be glad in later chapters when\nwe need to add the type checking for each operand and stuff. It would be a chore\nto walk you through the same code four times.\n\n<aside name=\"operator\">\n\nDid you even know you can pass an *operator* as an argument to a macro? Now you\ndo. The preprocessor doesn't care that operators aren't first class in C. As far\nas it's concerned, it's all just text tokens.\n\nI know, you can just *feel* the temptation to abuse this, can't you?\n\n</aside>\n\nIf you aren't familiar with the trick already, that outer `do while` loop\nprobably looks really weird. This macro needs to expand to a series of\nstatements. To be careful macro authors, we want to ensure those statements all\nend up in the same scope when the macro is expanded. Imagine if you defined:\n\n```c\n#define WAKE_UP() makeCoffee(); drinkCoffee();\n```\n\nAnd then used it like:\n\n```c\nif (morning) WAKE_UP();\n```\n\nThe intent is to execute both statements of the macro body only if `morning` is\ntrue. But it expands to:\n\n```c\nif (morning) makeCoffee(); drinkCoffee();;\n```\n\nOops. The `if` attaches only to the *first* statement. You might think you could\nfix this using a block.\n\n```c\n#define WAKE_UP() { makeCoffee(); drinkCoffee(); }\n```\n\nThat's better, but you still risk:\n\n```c\nif (morning)\n  WAKE_UP();\nelse\n  sleepIn();\n```\n\nNow you get a compile error on the `else` because of that trailing `;` after the\nmacro's block. Using a `do while` loop in the macro looks funny, but it gives\nyou a way to contain multiple statements inside a block that *also* permits a\nsemicolon at the end.\n\nWhere were we? 
Right, so what the body of that macro does is straightforward. A\nbinary operator takes two operands, so it pops twice. It performs the operation\non those two values and then pushes the result.\n\nPay close attention to the *order* of the two pops. Note that we assign the\nfirst popped operand to `b`, not `a`. It looks backwards. When the operands\nthemselves are calculated, the left is evaluated first, then the right. That\nmeans the left operand gets pushed before the right operand. So the right\noperand will be on top of the stack. Thus, the first value we pop is `b`.\n\nFor example, if we compile `3 - 1`, the data flow between the instructions looks\nlike so:\n\n<img src=\"image/a-virtual-machine/reverse.png\" alt=\"A sequence of instructions\nwith the stack for each showing how pushing and then popping values reverses\ntheir order.\" />\n\nAs we did with the other macros inside `run()`, we clean up after ourselves at\nthe end of the function.\n\n^code undef-binary-op (1 before, 1 after)\n\nLast is disassembler support.\n\n^code disassemble-binary (2 before, 1 after)\n\nThe arithmetic instruction formats are simple, like `OP_RETURN`. Even though the\narithmetic *operators* take operands -- which are found on the stack -- the\narithmetic *bytecode instructions* do not.\n\nLet's put some of our new instructions through their paces by evaluating a\nlarger expression:\n\n<img src=\"image/a-virtual-machine/chunk.png\" alt=\"The expression being\nevaluated: -((1.2 + 3.4) / 5.6)\" />\n\nBuilding on our existing example chunk, here's the additional instructions we\nneed to hand-compile that AST to bytecode.\n\n^code main-chunk (3 before, 3 after)\n\nThe addition goes first. The instruction for the left constant, 1.2, is already\nthere, so we add another for 3.4. Then we add those two using `OP_ADD`, leaving\nit on the stack. That covers the left side of the division. Next we push the\n5.6, and divide the result of the addition by it. 
Finally, we negate the result\nof that.\n\nNote how the output of the `OP_ADD` implicitly flows into being an operand of\n`OP_DIVIDE` without either instruction being directly coupled to each other.\nThat's the magic of the stack. It lets us freely compose instructions without\nthem needing any complexity or awareness of the data flow. The stack acts like a\nshared workspace that they all read from and write to.\n\nIn this tiny example chunk, the stack still only gets two values tall, but when\nwe start compiling Lox source to bytecode, we'll have chunks that use much more\nof the stack. In the meantime, try playing around with this hand-authored chunk\nto calculate different nested arithmetic expressions and see how values flow\nthrough the instructions and stack.\n\nYou may as well get it out of your system now. This is the last chunk we'll\nbuild by hand. When we next revisit bytecode, we will be writing a compiler to\ngenerate it for us.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  What bytecode instruction sequences would you generate for the following\n    expressions:\n\n    ```lox\n    1 * 2 + 3\n    1 + 2 * 3\n    3 - 2 - 1\n    1 + 2 * 3 - 4 / -5\n    ```\n\n    (Remember that Lox does not have a syntax for negative number literals, so\n    the `-5` is negating the number 5.)\n\n1.  If we really wanted a minimal instruction set, we could eliminate either\n    `OP_NEGATE` or `OP_SUBTRACT`. Show the bytecode instruction sequence you\n    would generate for:\n\n    ```lox\n    4 - 3 * -2\n    ```\n\n    First, without using `OP_NEGATE`. Then, without using `OP_SUBTRACT`.\n\n    Given the above, do you think it makes sense to have both instructions? Why\n    or why not? Are there any other redundant instructions you would consider\n    including?\n\n1.  Our VM's stack has a fixed size, and we don't check if pushing a value\n    overflows it. 
This means the wrong series of instructions could cause our\n    interpreter to crash or go into undefined behavior. Avoid that by\n    dynamically growing the stack as needed.\n\n    What are the costs and benefits of doing so?\n\n1.  To interpret `OP_NEGATE`, we pop the operand, negate the value, and then\n    push the result. That's a simple implementation, but it increments and\n    decrements `stackTop` unnecessarily, since the stack ends up the same height\n    in the end. It might be faster to simply negate the value in place on the\n    stack and leave `stackTop` alone. Try that and see if you can measure a\n    performance difference.\n\n    Are there other instructions where you can do a similar optimization?\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: Register-Based Bytecode\n\nFor the remainder of this book, we'll meticulously implement an interpreter\naround a stack-based bytecode instruction set. There's another family of\nbytecode architectures out there -- *register-based*. Despite the name, these\nbytecode instructions aren't quite as difficult to work with as the registers in\nan actual chip like <span name=\"x64\">x64</span>. With real hardware registers,\nyou usually have only a handful for the entire program, so you spend a lot of\neffort [trying to use them efficiently and shuttling stuff in and out of\nthem][register allocation].\n\n[register allocation]: https://en.wikipedia.org/wiki/Register_allocation\n\n<aside name=\"x64\">\n\nRegister-based bytecode is a little closer to the [*register windows*][window]\nsupported by SPARC chips.\n\n[window]: https://en.wikipedia.org/wiki/Register_window\n\n</aside>\n\nIn a register-based VM, you still have a stack. Temporary values still get\npushed onto it and popped when no longer needed. 
The main difference is that\ninstructions can read their inputs from anywhere in the stack and can store\ntheir outputs into specific stack slots.\n\nTake this little Lox script:\n\n```lox\nvar a = 1;\nvar b = 2;\nvar c = a + b;\n```\n\nIn our stack-based VM, the last statement will get compiled to something like:\n\n```lox\nload <a>  // Read local variable a and push onto stack.\nload <b>  // Read local variable b and push onto stack.\nadd       // Pop two values, add, push result.\nstore <c> // Pop value and store in local variable c.\n```\n\n(Don't worry if you don't fully understand the load and store instructions yet.\nWe'll go over them in much greater detail [when we implement\nvariables][variables].) We have four separate instructions. That means four\ntimes through the bytecode interpret loop, four instructions to decode and\ndispatch. It's at least seven bytes of code -- four for the opcodes and another\nthree for the operands identifying which locals to load and store. Three pushes\nand three pops. A lot of work!\n\n[variables]: global-variables.html\n\nIn a register-based instruction set, instructions can read from and store\ndirectly into local variables. The bytecode for the last statement above looks\nlike:\n\n```lox\nadd <a> <b> <c> // Read values from a and b, add, store in c.\n```\n\nThe add instruction is bigger -- it has three instruction operands that define\nwhere in the stack it reads its inputs from and writes the result to. But since\nlocal variables live on the stack, it can read directly from `a` and `b` and\nthen store the result right into `c`.\n\nThere's only a single instruction to decode and dispatch, and the whole thing\nfits in four bytes. Decoding is more complex because of the additional operands,\nbut it's still a net win. There's no pushing and popping or other stack\nmanipulation.\n\nThe main implementation of Lua used to be stack-based. 
For <span name=\"lua\">Lua\n5.0</span>, the implementers switched to a register instruction set and noted a\nspeed improvement. The amount of improvement, naturally, depends heavily on the\ndetails of the language semantics, specific instruction set, and compiler\nsophistication, but that should get your attention.\n\n<aside name=\"lua\">\n\nThe Lua dev team -- Roberto Ierusalimschy, Waldemar Celes, and Luiz Henrique de\nFigueiredo -- wrote a *fantastic* paper on this, one of my all time favorite\ncomputer science papers, \"[The Implementation of Lua 5.0][lua]\" (PDF).\n\n[lua]: https://www.lua.org/doc/jucs05.pdf\n\n</aside>\n\nThat raises the obvious question of why I'm going to spend the rest of the book\ndoing a stack-based bytecode. Register VMs are neat, but they are quite a bit\nharder to write a compiler for. For what is likely to be your very first\ncompiler, I wanted to stick with an instruction set that's easy to generate and\neasy to execute. Stack-based bytecode is marvelously simple.\n\nIt's also *much* better known in the literature and the community. Even though\nyou may eventually move to something more advanced, it's a good common ground to\nshare with the rest of your language hacker peers.\n\n</div>\n"
  },
  {
    "path": "book/acknowledgements.md",
    "content": "When the first copy of \"[Game Programming Patterns][gpp]\" sold, I guess I had\nthe right to call myself an author. But it took time to feel comfortable with\nthat label. Thank you to everyone who bought copies of my first book, and to the\npublishers and translators who brought it to other languages. You gave me the\nconfidence to believe I could tackle a project of this scope. Well, that, and\nmassively underestimating what I was getting myself into, but that's on me.\n\n[gpp]: https://gameprogrammingpatterns.com/\n\nA fear particular to technical writing is *getting stuff wrong*. Tests and\nstatic analysis only get you so far. Once the code and prose are in ink on paper,\nthere's no fixing it. I am deeply grateful to the many people who filed issues\nand pull requests on the [open source repo][repo] for the book. Special thanks\ngo to cm1776, who filed 145 tactfully worded issues pointing out hundreds of\ncode errors, typos, and unclear sentences. The book is more accurate and\nreadable because of you all.\n\n[repo]: https://github.com/munificent/craftinginterpreters\n\nI'm grateful to my copy editor Kari Somerton who braved a heap of computer\nscience jargon and an unfamiliar workflow in order to fix my many grammar errors\nand stylistic inconsistencies.\n\nWhen the pandemic turned everyone's life upside down, a number of people reached\nout to tell me that my book provided a helpful distraction. This book that I\nspent six years writing forms a chapter in my own life's story and I'm grateful\nto the readers who contacted me and made that chapter more meaningful.\n\nFinally, the deepest thanks go to my wife Megan and my daughters Lily and\nGretchen. You patiently endured the time I had to sink into the book, and my\nstress while writing it. There's no one I'd rather be stuck at home with.\n"
  },
  {
    "path": "book/appendix-i.md",
    "content": "Here is a complete grammar for Lox. The chapters that introduce each part of the\nlanguage include the grammar rules there, but this collects them all into one\nplace.\n\n## Syntax Grammar\n\nThe syntactic grammar is used to parse the linear sequence of tokens into the\nnested syntax tree structure. It starts with the first rule that matches an\nentire Lox program (or a single REPL entry).\n\n```ebnf\nprogram        → declaration* EOF ;\n```\n\n### Declarations\n\nA program is a series of declarations, which are the statements that bind new\nidentifiers or any of the other statement types.\n\n```ebnf\ndeclaration    → classDecl\n               | funDecl\n               | varDecl\n               | statement ;\n\nclassDecl      → \"class\" IDENTIFIER ( \"<\" IDENTIFIER )?\n                 \"{\" function* \"}\" ;\nfunDecl        → \"fun\" function ;\nvarDecl        → \"var\" IDENTIFIER ( \"=\" expression )? \";\" ;\n```\n\n### Statements\n\nThe remaining statement rules produce side effects, but do not introduce\nbindings.\n\n```ebnf\nstatement      → exprStmt\n               | forStmt\n               | ifStmt\n               | printStmt\n               | returnStmt\n               | whileStmt\n               | block ;\n\nexprStmt       → expression \";\" ;\nforStmt        → \"for\" \"(\" ( varDecl | exprStmt | \";\" )\n                           expression? \";\"\n                           expression? \")\" statement ;\nifStmt         → \"if\" \"(\" expression \")\" statement\n                 ( \"else\" statement )? ;\nprintStmt      → \"print\" expression \";\" ;\nreturnStmt     → \"return\" expression? \";\" ;\nwhileStmt      → \"while\" \"(\" expression \")\" statement ;\nblock          → \"{\" declaration* \"}\" ;\n```\n\nNote that `block` is a statement rule, but is also used as a nonterminal in a\ncouple of other rules for things like function bodies.\n\n### Expressions\n\nExpressions produce values. 
Lox has a number of unary and binary operators with\ndifferent levels of precedence. Some grammars for languages do not directly\nencode the precedence relationships and specify that elsewhere. Here, we use a\nseparate rule for each precedence level to make it explicit.\n\n```ebnf\nexpression     → assignment ;\n\nassignment     → ( call \".\" )? IDENTIFIER \"=\" assignment\n               | logic_or ;\n\nlogic_or       → logic_and ( \"or\" logic_and )* ;\nlogic_and      → equality ( \"and\" equality )* ;\nequality       → comparison ( ( \"!=\" | \"==\" ) comparison )* ;\ncomparison     → term ( ( \">\" | \">=\" | \"<\" | \"<=\" ) term )* ;\nterm           → factor ( ( \"-\" | \"+\" ) factor )* ;\nfactor         → unary ( ( \"/\" | \"*\" ) unary )* ;\n\nunary          → ( \"!\" | \"-\" ) unary | call ;\ncall           → primary ( \"(\" arguments? \")\" | \".\" IDENTIFIER )* ;\nprimary        → \"true\" | \"false\" | \"nil\" | \"this\"\n               | NUMBER | STRING | IDENTIFIER | \"(\" expression \")\"\n               | \"super\" \".\" IDENTIFIER ;\n```\n\n### Utility rules\n\nIn order to keep the above rules a little cleaner, some of the grammar is\nsplit out into a few reused helper rules.\n\n```ebnf\nfunction       → IDENTIFIER \"(\" parameters? \")\" block ;\nparameters     → IDENTIFIER ( \",\" IDENTIFIER )* ;\narguments      → expression ( \",\" expression )* ;\n```\n\n## Lexical Grammar\n\nThe lexical grammar is used by the scanner to group characters into tokens.\nWhere the syntax is [context free][], the lexical grammar is [regular][] -- note\nthat there are no recursive rules.\n\n[context free]: https://en.wikipedia.org/wiki/Context-free_grammar\n[regular]: https://en.wikipedia.org/wiki/Regular_grammar\n\n```ebnf\nNUMBER         → DIGIT+ ( \".\" DIGIT+ )? ;\nSTRING         → \"\\\"\" <any char except \"\\\"\">* \"\\\"\" ;\nIDENTIFIER     → ALPHA ( ALPHA | DIGIT )* ;\nALPHA          → \"a\" ... \"z\" | \"A\" ... 
\"Z\" | \"_\" ;\nDIGIT          → \"0\" ... \"9\" ;\n```\n"
  },
  {
    "path": "book/appendix-ii.md",
    "content": "For your edification, here is the code produced by [the little script\nwe built][generator] to automate generating the syntax tree classes for jlox.\n\n[generator]: representing-code.html#metaprogramming-the-trees\n\n## Expressions\n\nExpressions are the first syntax tree nodes we see, introduced in \"[Representing\nCode](representing-code.html)\". The main Expr class defines the visitor\ninterface used to dispatch against the specific expression types, and contains\nthe other expression subclasses as nested classes.\n\n^code expr\n\n### Assign expression\n\nVariable assignment is introduced in \"[Statements and\nState](statements-and-state.html#assignment)\".\n\n^code expr-assign\n\n### Binary expression\n\nBinary operators are introduced in \"[Representing\nCode](representing-code.html)\".\n\n^code expr-binary\n\n### Call expression\n\nFunction call expressions are introduced in\n\"[Functions](functions.html#function-calls)\".\n\n^code expr-call\n\n### Get expression\n\nProperty access, or \"get\" expressions are introduced in\n\"[Classes](classes.html#properties-on-instances)\".\n\n^code expr-get\n\n### Grouping expression\n\nUsing parentheses to group expressions is introduced in \"[Representing\nCode](representing-code.html)\".\n\n^code expr-grouping\n\n### Literal expression\n\nLiteral value expressions are introduced in \"[Representing\nCode](representing-code.html)\".\n\n^code expr-literal\n\n### Logical expression\n\nThe logical `and` and `or` operators are introduced in \"[Control\nFlow](control-flow.html#logical-operators)\".\n\n^code expr-logical\n\n### Set expression\n\nProperty assignment, or \"set\" expressions are introduced in\n\"[Classes](classes.html#properties-on-instances)\".\n\n^code expr-set\n\n### Super expression\n\nThe `super` expression is introduced in\n\"[Inheritance](inheritance.html#calling-superclass-methods)\".\n\n^code expr-super\n\n### This expression\n\nThe `this` expression is introduced in 
\"[Classes](classes.html#this)\".\n\n^code expr-this\n\n### Unary expression\n\nUnary operators are introduced in \"[Representing Code](representing-code.html)\".\n\n^code expr-unary\n\n### Variable expression\n\nVariable access expressions are introduced in \"[Statements and\nState](statements-and-state.html#variable-syntax)\".\n\n^code expr-variable\n\n## Statements\n\nStatements form a second hierarchy of syntax tree nodes independent of\nexpressions. We add the first couple of them in \"[Statements and\nState](statements-and-state.html)\".\n\n^code stmt\n\n### Block statement\n\nThe curly-braced block statement that defines a local scope is introduced in\n\"[Statements and State](statements-and-state.html#block-syntax-and-semantics)\".\n\n^code stmt-block\n\n### Class statement\n\nClass declarations are introduced in, unsurprisingly,\n\"[Classes](classes.html#class-declarations)\".\n\n^code stmt-class\n\n### Expression statement\n\nThe expression statement is introduced in \"[Statements and\nState](statements-and-state.html#statements)\".\n\n^code stmt-expression\n\n### Function statement\n\nFunction declarations are introduced in, you guessed it,\n\"[Functions](functions.html#function-declarations)\".\n\n^code stmt-function\n\n### If statement\n\nThe `if` statement is introduced in \"[Control\nFlow](control-flow.html#conditional-execution)\".\n\n^code stmt-if\n\n### Print statement\n\nThe `print` statement is introduced in \"[Statements and\nState](statements-and-state.html#statements)\".\n\n^code stmt-print\n\n### Return statement\n\nYou need a function to return from, so `return` statements are introduced in\n\"[Functions](functions.html#return-statements)\".\n\n^code stmt-return\n\n### Variable statement\n\nVariable declarations are introduced in \"[Statements and\nState](statements-and-state.html#variable-syntax)\".\n\n^code stmt-var\n\n### While statement\n\nThe `while` statement is introduced in 
\"[Control\nFlow](control-flow.html#while-loops)\".\n\n^code stmt-while\n"
  },
  {
    "path": "book/backmatter.md",
    "content": "You've reached the end of the book! There are two pieces of supplementary\nmaterial you may find helpful:\n\n* **[Appendix I][]** contains a complete grammar for Lox, all in one place.\n\n* **[Appendix II][]** shows the Java classes produced by [the AST generator][]\n  we use for jlox.\n\n[appendix i]: appendix-i.html\n[appendix ii]: appendix-ii.html\n[the ast generator]: representing-code.html#metaprogramming-the-trees\n"
  },
  {
    "path": "book/calls-and-functions.md",
    "content": "> Any problem in computer science can be solved with another level of\n> indirection. Except for the problem of too many layers of indirection.\n>\n> <cite>David Wheeler</cite>\n\nThis chapter is a beast. I try to break features into bite-sized pieces, but\nsometimes you gotta swallow the whole <span name=\"eat\">meal</span>. Our next\ntask is functions. We could start with only function declarations, but that's\nnot very useful when you can't call them. We could do calls, but there's nothing\nto call. And all of the runtime support needed in the VM to support both of\nthose isn't very rewarding if it isn't hooked up to anything you can see. So\nwe're going to do it all. It's a lot, but we'll feel good when we're done.\n\n<aside name=\"eat\">\n\nEating -- consumption -- is a weird metaphor for a creative act. But most of the\nbiological processes that produce \"output\" are a little less, ahem, decorous.\n\n</aside>\n\n## Function Objects\n\nThe most interesting structural change in the VM is around the stack. We already\n*have* a stack for local variables and temporaries, so we're partway there. But\nwe have no notion of a *call* stack. Before we can make much progress, we'll\nhave to fix that. But first, let's write some code. I always feel better once I\nstart moving. We can't do much without having some kind of representation for\nfunctions, so we'll start there. From the VM's perspective, what is a function?\n\nA function has a body that can be executed, so that means some bytecode. We\ncould compile the entire program and all of its function declarations into one\nbig monolithic Chunk. Each function would have a pointer to the first\ninstruction of its code inside the Chunk.\n\nThis is roughly how compilation to native code works where you end up with one\nsolid blob of machine code. But for our bytecode VM, we can do something a\nlittle higher level. I think a cleaner model is to give each function its own\nChunk. 
We'll want some other metadata too, so let's go ahead and stuff it all in\na struct now.\n\n^code obj-function (2 before, 2 after)\n\nFunctions are first class in Lox, so they need to be actual Lox objects. Thus\nObjFunction has the same Obj header that all object types share. The `arity`\nfield stores the number of parameters the function expects. Then, in addition to\nthe chunk, we store the function's <span name=\"name\">name</span>. That will be\nhandy for reporting readable runtime errors.\n\n<aside name=\"name\">\n\nHumans don't seem to find numeric bytecode offsets particularly illuminating in\ncrash dumps.\n\n</aside>\n\nThis is the first time the \"object\" module has needed to reference Chunk, so we\nget an include.\n\n^code object-include-chunk (1 before, 1 after)\n\nLike we did with strings, we define some accessories to make Lox functions\neasier to work with in C. Sort of a poor man's object orientation. First, we'll\ndeclare a C function to create a new Lox function.\n\n^code new-function-h (3 before, 1 after)\n\nThe implementation is over here:\n\n^code new-function\n\nWe use our friend `ALLOCATE_OBJ()` to allocate memory and initialize the\nobject's header so that the VM knows what type of object it is. Instead of\npassing in arguments to initialize the function like we did with ObjString, we\nset the function up in a sort of blank state -- zero arity, no name, and no\ncode. That will get filled in later after the function is created.\n\nSince we have a new kind of object, we need a new object type in the enum.\n\n^code obj-type-function (1 before, 2 after)\n\nWhen we're done with a function object, we must return the bits it borrowed back\nto the operating system.\n\n^code free-function (1 before, 1 after)\n\nThis switch case is <span name=\"free-name\">responsible</span> for freeing the\nObjFunction itself as well as any other memory it owns. 
Functions own their\nchunk, so we call Chunk's destructor-like function.\n\n<aside name=\"free-name\">\n\nWe don't need to explicitly free the function's name because it's an ObjString.\nThat means we can let the garbage collector manage its lifetime for us. Or, at\nleast, we'll be able to once we [implement a garbage collector][gc].\n\n[gc]: garbage-collection.html\n\n</aside>\n\nLox lets you print any object, and functions are first-class objects, so we\nneed to handle them too.\n\n^code print-function (1 before, 1 after)\n\nThis calls out to:\n\n^code print-function-helper\n\nSince a function knows its name, it may as well say it.\n\nFinally, we have a couple of macros for converting values to functions. First,\nmake sure your value actually *is* a function.\n\n^code is-function (2 before, 1 after)\n\nAssuming that evaluates to true, you can then safely cast the Value to an\nObjFunction pointer using this:\n\n^code as-function (2 before, 1 after)\n\nWith that, our object model knows how to represent functions. I'm feeling warmed\nup now. You ready for something a little harder?\n\n## Compiling to Function Objects\n\nRight now, our compiler assumes it is always compiling to one single chunk. With\neach function's code living in separate chunks, that gets more complex. When the\ncompiler reaches a function declaration, it needs to emit code into the\nfunction's chunk when compiling its body. At the end of the function body, the\ncompiler needs to return to the previous chunk it was working with.\n\nThat's fine for code inside function bodies, but what about code that isn't? The\n\"top level\" of a Lox program is also imperative code and we need a chunk to\ncompile that into. We can simplify the compiler and VM by placing that top-level\ncode inside an automatically defined function too. That way, the compiler is\nalways within some kind of function body, and the VM always runs code by\ninvoking a function. 
It's as if the entire program is <span\nname=\"wrap\">wrapped</span> inside an implicit `main()` function.\n\n<aside name=\"wrap\">\n\nOne semantic corner where that analogy breaks down is global variables. They\nhave special scoping rules different from local variables, so in that way, the\ntop level of a script isn't like a function body.\n\n</aside>\n\nBefore we get to user-defined functions, then, let's do the reorganization to\nsupport that implicit top-level function. It starts with the Compiler struct.\nInstead of pointing directly to a Chunk that the compiler writes to, it instead\nhas a reference to the function object being built.\n\n^code function-fields (1 before, 1 after)\n\nWe also have a little FunctionType enum. This lets the compiler tell when it's\ncompiling top-level code versus the body of a function. Most of the compiler\ndoesn't care about this -- that's why it's a useful abstraction -- but in one or\ntwo places the distinction is meaningful. We'll get to one later.\n\n^code function-type-enum\n\nEvery place in the compiler that was writing to the Chunk now needs to go\nthrough that `function` pointer. Fortunately, many <span\nname=\"current\">chapters</span> ago, we encapsulated access to the chunk in the\n`currentChunk()` function. We only need to fix that and the rest of the compiler\nis happy.\n\n<aside name=\"current\">\n\nIt's almost like I had a crystal ball that could see into the future and knew\nwe'd need to change the code later. But, really, it's because I wrote all the\ncode for the book before any of the text.\n\n</aside>\n\n^code current-chunk (1 before, 2 after)\n\nThe current chunk is always the chunk owned by the function we're in the middle\nof compiling. Next, we need to actually create that function. Previously, the VM\npassed a Chunk to the compiler which filled it with code. 
Instead, the compiler\nwill create and return a function that contains the compiled top-level code --\nwhich is all we support right now -- of the user's program.\n\n### Creating functions at compile time\n\nWe start threading this through in `compile()`, which is the main entry point\ninto the compiler.\n\n^code call-init-compiler (1 before, 2 after)\n\nThere are a bunch of changes in how the compiler is initialized. First, we\ninitialize the new Compiler fields.\n\n^code init-compiler (1 after)\n\nThen we allocate a new function object to compile into.\n\n^code init-function (1 before, 1 after)\n\n<span name=\"null\"></span>\n\n<aside name=\"null\">\n\nI know, it looks dumb to null the `function` field only to immediately assign it\na value a few lines later. More garbage collection-related paranoia.\n\n</aside>\n\nCreating an ObjFunction in the compiler might seem a little strange. A function\nobject is the *runtime* representation of a function, but here we are creating\nit at compile time. The way to think of it is that a function is similar to a\nstring or number literal. It forms a bridge between the compile time and runtime\nworlds. When we get to function *declarations*, those really *are* literals\n-- they are a notation that produces values of a built-in type. So the <span\nname=\"closure\">compiler</span> creates function objects during compilation.\nThen, at runtime, they are simply invoked.\n\n<aside name=\"closure\">\n\nWe can create functions at compile time because they contain only data available\nat compile time. The function's code, name, and arity are all fixed. 
When we add\nclosures in the [next chapter][closures], which capture variables at runtime,\nthe story gets more complex.\n\n[closures]: closures.html\n\n</aside>\n\nHere is another strange piece of code:\n\n^code init-function-slot (1 before, 1 after)\n\nRemember that the compiler's `locals` array keeps track of which stack slots are\nassociated with which local variables or temporaries. From now on, the compiler\nimplicitly claims stack slot zero for the VM's own internal use. We give it an\nempty name so that the user can't write an identifier that refers to it. I'll\nexplain what this is about when it becomes useful.\n\nThat's the initialization side. We also need a couple of changes on the other\nend when we finish compiling some code.\n\n^code end-compiler (1 after)\n\nPreviously, when `interpret()` called into the compiler, it passed in a Chunk to\nbe written to. Now that the compiler creates the function object itself, we\nreturn that function. We grab it from the current compiler here:\n\n^code end-function (1 before, 1 after)\n\nAnd then return it to `compile()` like so:\n\n^code return-function (1 before, 1 after)\n\nNow is a good time to make another tweak in this function. Earlier, we added\nsome diagnostic code to have the VM dump the disassembled bytecode so we could\ndebug the compiler. We should fix that to keep working now that the generated\nchunk is wrapped in a function.\n\n^code disassemble-end (2 before, 2 after)\n\nNotice the check in here to see if the function's name is `NULL`? 
User-defined\nfunctions have names, but the implicit function we create for the top-level code\ndoes not, and we need to handle that gracefully even in our own diagnostic code.\nSpeaking of which:\n\n^code print-script (1 before, 1 after)\n\nThere's no way for a *user* to get a reference to the top-level function and try\nto print it, but our `DEBUG_TRACE_EXECUTION` <span\nname=\"debug\">diagnostic</span> code that prints the entire stack can and does.\n\n<aside name=\"debug\">\n\nIt is no fun if the diagnostic code we use to find bugs itself causes the VM to\nsegfault!\n\n</aside>\n\nBumping up a level to `compile()`, we adjust its signature.\n\n^code compile-h (2 before, 2 after)\n\nInstead of taking a chunk, now it returns a function. Over in the\nimplementation:\n\n^code compile-signature (1 after)\n\nFinally we get to some actual code. We change the very end of the function to\nthis:\n\n^code call-end-compiler (4 before, 1 after)\n\nWe get the function object from the compiler. If there were no compile errors,\nwe return it. Otherwise, we signal an error by returning `NULL`. This way, the\nVM doesn't try to execute a function that may contain invalid bytecode.\n\nEventually, we will update `interpret()` to handle the new declaration of\n`compile()`, but first we have some other changes to make.\n\n## Call Frames\n\nIt's time for a big conceptual leap. Before we can implement function\ndeclarations and calls, we need to get the VM ready to handle them. There are\ntwo main problems we need to worry about:\n\n### Allocating local variables\n\nThe compiler allocates stack slots for local variables. How should that work\nwhen the set of local variables in a program is distributed across multiple\nfunctions?\n\nOne option would be to keep them totally separate. Each function would get its\nown dedicated set of slots in the VM stack that it would own <span\nname=\"static\">forever</span>, even when the function isn't being called. 
Each\nlocal variable in the entire program would have a bit of memory in the VM that\nit keeps to itself.\n\n<aside name=\"static\">\n\nIt's basically what you'd get if you declared every local variable in a C\nprogram using `static`.\n\n</aside>\n\nBelieve it or not, early programming language implementations worked this way.\nThe first Fortran compilers statically allocated memory for each variable. The\nobvious problem is that it's really inefficient. Most functions are not in the\nmiddle of being called at any point in time, so sitting on unused memory for\nthem is wasteful.\n\nThe more fundamental problem, though, is recursion. With recursion, you can be\n\"in\" multiple calls to the same function at the same time. Each needs its <span\nname=\"fortran\">own</span> memory for its local variables. In jlox, we solved\nthis by dynamically allocating memory for an environment each time a function\nwas called or a block entered. In clox, we don't want that kind of performance\ncost on every function call.\n\n<aside name=\"fortran\">\n\nFortran avoided this problem by disallowing recursion entirely. Recursion was\nconsidered an advanced, esoteric feature at the time.\n\n</aside>\n\nInstead, our solution lies somewhere between Fortran's static allocation and\njlox's dynamic approach. The value stack in the VM works on the observation that\nlocal variables and temporaries behave in a last-in first-out fashion.\nFortunately for us, that's still true even when you add function calls into the\nmix. 
Here's an example:\n\n```lox\nfun first() {\n  var a = 1;\n  second();\n  var b = 2;\n}\n\nfun second() {\n  var c = 3;\n  var d = 4;\n}\n\nfirst();\n```\n\nStep through the program and look at which variables are in memory at each point\nin time:\n\n<img src=\"image/calls-and-functions/calls.png\" alt=\"Tracing through the execution of the previous program, showing the stack of variables at each step.\" />\n\nAs execution flows through the two calls, every local variable obeys the\nprinciple that any variable declared after it will be discarded before the first\nvariable needs to be. This is true even across calls. We know we'll be done with\n`c` and `d` before we are done with `a`. It seems we should be able to allocate\nlocal variables on the VM's value stack.\n\nIdeally, we still determine *where* on the stack each variable will go at\ncompile time. That keeps the bytecode instructions for working with variables\nsimple and fast. In the above example, we could <span\nname=\"imagine\">imagine</span> doing so in a straightforward way, but that\ndoesn't always work out. Consider:\n\n<aside name=\"imagine\">\n\nI say \"imagine\" because the compiler can't actually figure this out. Because\nfunctions are first class in Lox, we can't determine which functions call which\nothers at compile time.\n\n</aside>\n\n```lox\nfun first() {\n  var a = 1;\n  second();\n  var b = 2;\n  second();\n}\n\nfun second() {\n  var c = 3;\n  var d = 4;\n}\n\nfirst();\n```\n\nIn the first call to `second()`, `c` and `d` would go into slots 1 and 2. But in\nthe second call, we need to have made room for `b`, so `c` and `d` need to be in\nslots 2 and 3. Thus the compiler can't pin down an exact slot for each local\nvariable across function calls. But *within* a given function, the *relative*\nlocations of each local variable are fixed. Variable `d` is always in the slot\nright after `c`. 
This is the key insight.\n\nWhen a function is called, we don't know where the top of the stack will be\nbecause it can be called from different contexts. But, wherever that top happens\nto be, we do know where all of the function's local variables will be relative\nto that starting point. So, like many problems, we solve our allocation problem\nwith a level of indirection.\n\nAt the beginning of each function call, the VM records the location of the first\nslot where that function's own locals begin. The instructions for working with\nlocal variables access them by a slot index relative to that, instead of\nrelative to the bottom of the stack like they do today. At compile time, we\ncalculate those relative slots. At runtime, we convert that relative slot to an\nabsolute stack index by adding the function call's starting slot.\n\nIt's as if the function gets a \"window\" or \"frame\" within the larger stack where\nit can store its locals. The position of the **call frame** is determined at\nruntime, but within and relative to that region, we know where to find things.\n\n<img src=\"image/calls-and-functions/window.png\" alt=\"The stack at the two points when second() is called, with a window hovering over each one showing the pair of stack slots used by the function.\" />\n\nThe historical name for this recorded location where the function's locals start\nis a **frame pointer** because it points to the beginning of the function's call\nframe. Sometimes you hear **base pointer**, because it points to the base stack\nslot on top of which all of the function's variables live.\n\nThat's the first piece of data we need to track. Every time we call a function,\nthe VM determines the first stack slot where that function's variables begin.\n\n### Return addresses\n\nRight now, the VM works its way through the instruction stream by incrementing\nthe `ip` field. The only interesting behavior is around control flow\ninstructions which offset the `ip` by larger amounts. 
*Calling* a function is\npretty straightforward -- simply set `ip` to point to the first instruction in\nthat function's chunk. But what about when the function is done?\n\nThe VM needs to <span name=\"return\">return</span> back to the chunk where the\nfunction was called from and resume execution at the instruction immediately\nafter the call. Thus, for each function call, we need to track where we jump\nback to when the call completes. This is called a **return address** because\nit's the address of the instruction that the VM returns to after the call.\n\nAgain, thanks to recursion, there may be multiple return addresses for a single\nfunction, so this is a property of each *invocation* and not the function\nitself.\n\n<aside name=\"return\">\n\nThe authors of early Fortran compilers had a clever trick for implementing\nreturn addresses. Since they *didn't* support recursion, any given function\nneeded only a single return address at any point in time. So when a function was\ncalled at runtime, the program would *modify its own code* to change a jump\ninstruction at the end of the function to jump back to its caller. Sometimes the\nline between genius and madness is hair thin.\n\n</aside>\n\n### The call stack\n\nSo for each live function invocation -- each call that hasn't returned yet -- we\nneed to track where on the stack that function's locals begin, and where the\ncaller should resume. We'll put this, along with some other stuff, in a new\nstruct.\n\n^code call-frame (1 before, 2 after)\n\nA CallFrame represents a single ongoing function call. The `slots` field points\ninto the VM's value stack at the first slot that this function can use. I gave\nit a plural name because -- thanks to C's weird \"pointers are sort of arrays\"\nthing -- we'll treat it like an array.\n\nThe implementation of return addresses is a little different from what I\ndescribed above. Instead of storing the return address in the callee's frame,\nthe caller stores its own `ip`. 
When we return from a function, the VM will jump\nto the `ip` of the caller's CallFrame and resume from there.\n\nI also stuffed a pointer to the function being called in here. We'll use that to\nlook up constants and for a few other things.\n\nEach time a function is called, we create one of these structs. We could <span\nname=\"heap\">dynamically</span> allocate them on the heap, but that's slow.\nFunction calls are a core operation, so they need to be as fast as possible.\nFortunately, we can make the same observation we made for variables: function\ncalls have stack semantics. If `first()` calls `second()`, the call to\n`second()` will complete before `first()` does.\n\n<aside name=\"heap\">\n\nMany Lisp implementations dynamically allocate stack frames because it\nsimplifies implementing [continuations][cont]. If your language supports\ncontinuations, then function calls do *not* always have stack semantics.\n\n[cont]: https://en.wikipedia.org/wiki/Continuation\n\n</aside>\n\nSo over in the VM, we create an array of these CallFrame structs up front and\ntreat it as a stack, like we do with the value array.\n\n^code frame-array (1 before, 1 after)\n\nThis array replaces the `chunk` and `ip` fields we used to have directly in the\nVM. Now each CallFrame has its own `ip` and its own pointer to the ObjFunction\nthat it's executing. From there, we can get to the function's chunk.\n\nThe new `frameCount` field in the VM stores the current height of the CallFrame\nstack -- the number of ongoing function calls. To keep clox simple, the array's\ncapacity is fixed. This means, as in many language implementations, there is a\nmaximum call depth we can handle. 
For clox, it's defined here:\n\n^code frame-max (2 before, 2 after)\n\nWe also redefine the value stack's <span name=\"plenty\">size</span> in terms of\nthat to make sure we have plenty of stack slots even in very deep call trees.\nWhen the VM starts up, the CallFrame stack is empty.\n\n<aside name=\"plenty\">\n\nIt is still possible to overflow the stack if enough function calls use enough\ntemporaries in addition to locals. A robust implementation would guard against\nthis, but I'm trying to keep things simple.\n\n</aside>\n\n^code reset-frame-count (1 before, 1 after)\n\nThe \"vm.h\" header needs access to ObjFunction, so we add an include.\n\n^code vm-include-object (2 before, 1 after)\n\nNow we're ready to move over to the VM's implementation file. We've got some\ngrunt work ahead of us. We've moved `ip` out of the VM struct and into\nCallFrame. We need to fix every line of code in the VM that touches `ip` to\nhandle that. Also, the instructions that access local variables by stack slot\nneed to be updated to do so relative to the current CallFrame's `slots` field.\n\nWe'll start at the top and plow through it.\n\n^code run (1 before, 1 after)\n\nFirst, we store the current topmost CallFrame in a <span\nname=\"local\">local</span> variable inside the main bytecode execution function.\nThen we replace the bytecode access macros with versions that access `ip`\nthrough that variable.\n\n<aside name=\"local\">\n\nWe could access the current frame by going through the CallFrame array every\ntime, but that's verbose. More importantly, storing the frame in a local\nvariable encourages the C compiler to keep that pointer in a register. That\nspeeds up access to the frame's `ip`. 
There's no *guarantee* that the compiler\nwill do this, but there's a good chance it will.\n\n</aside>\n\nNow onto each instruction that needs a little tender loving care.\n\n^code push-local (2 before, 1 after)\n\nPreviously, `OP_GET_LOCAL` read the given local slot directly from the VM's\nstack array, which meant it indexed the slot starting from the bottom of the\nstack. Now, it accesses the current frame's `slots` array, which means it\naccesses the given numbered slot relative to the beginning of that frame.\n\nSetting a local variable works the same way.\n\n^code set-local (2 before, 1 after)\n\nThe jump instructions used to modify the VM's `ip` field. Now, they do the same\nfor the current frame's `ip`.\n\n^code jump (2 before, 1 after)\n\nSame with the conditional jump:\n\n^code jump-if-false (2 before, 1 after)\n\nAnd our backward-jumping loop instruction:\n\n^code loop (2 before, 1 after)\n\nWe have some diagnostic code that prints each instruction as it executes to help\nus debug our VM. That needs to work with the new structure too.\n\n^code trace-execution (1 before, 1 after)\n\nInstead of passing in the VM's `chunk` and `ip` fields, now we read from the\ncurrent CallFrame.\n\nYou know, that wasn't too bad, actually. Most instructions just use the macros\nso didn't need to be touched. Next, we jump up a level to the code that calls\n`run()`.\n\n^code interpret-stub (1 before, 2 after)\n\nWe finally get to wire up our earlier compiler changes to the back-end changes\nwe just made. First, we pass the source code to the compiler. It returns us a\nnew ObjFunction containing the compiled top-level code. If we get `NULL` back,\nit means there was some compile-time error which the compiler has already\nreported. In that case, we bail out since we can't run anything.\n\nOtherwise, we store the function on the stack and prepare an initial CallFrame\nto execute its code. 
Now you can see why the compiler sets aside stack slot zero\n-- that stores the function being called. In the new CallFrame, we point to the\nfunction, initialize its `ip` to point to the beginning of the function's\nbytecode, and set up its stack window to start at the very bottom of the VM's\nvalue stack.\n\nThis gets the interpreter ready to start executing code. After finishing, the VM\nused to free the hardcoded chunk. Now that the ObjFunction owns that code, we\ndon't need to do that anymore, so the end of `interpret()` is simply this:\n\n^code end-interpret (2 before, 1 after)\n\nThe last piece of code referring to the old VM fields is `runtimeError()`. We'll\nrevisit that later in the chapter, but for now let's change it to this:\n\n^code runtime-error-temp (2 before, 1 after)\n\nInstead of reading the chunk and `ip` directly from the VM, it pulls those from\nthe topmost CallFrame on the stack. That should get the function working again\nand behaving as it did before.\n\nAssuming we did all of that correctly, we got clox back to a runnable\nstate. Fire it up and it does... exactly what it did before. We haven't added\nany new features yet, so this is kind of a let down. But all of the\ninfrastructure is there and ready for us now. Let's take advantage of it.\n\n## Function Declarations\n\nBefore we can do call expressions, we need something to call, so we'll do\nfunction declarations first. The <span name=\"fun\">fun</span> starts with a\nkeyword.\n\n<aside name=\"fun\">\n\nYes, I am going to make a dumb joke about the `fun` keyword every time it\ncomes up.\n\n</aside>\n\n^code match-fun (1 before, 1 after)\n\nThat passes control to here:\n\n^code fun-declaration\n\nFunctions are first-class values, and a function declaration simply creates and\nstores one in a newly declared variable. So we parse the name just like any\nother variable declaration. A function declaration at the top level will bind\nthe function to a global variable. 
Inside a block or other function, a function\ndeclaration creates a local variable.\n\nIn an earlier chapter, I explained how variables [get defined in two\nstages][stage]. This ensures you can't access a variable's value inside the\nvariable's own initializer. That would be bad because the variable doesn't\n*have* a value yet.\n\n[stage]: local-variables.html#another-scope-edge-case\n\nFunctions don't suffer from this problem. It's safe for a function to refer to\nits own name inside its body. You can't *call* the function and execute the body\nuntil after it's fully defined, so you'll never see the variable in an\nuninitialized state. Practically speaking, it's useful to allow this in order to\nsupport recursive local functions.\n\nTo make that work, we mark the function declaration's variable \"initialized\" as\nsoon as we compile the name, before we compile the body. That way the name can\nbe referenced inside the body without generating an error.\n\nWe do need one check, though.\n\n^code check-depth (1 before, 1 after)\n\nBefore, we called `markInitialized()` only when we already knew we were in a\nlocal scope. Now, a top-level function declaration will also call this function.\nWhen that happens, there is no local variable to mark initialized -- the\nfunction is bound to a global variable.\n\nNext, we compile the function itself -- its parameter list and block body. For\nthat, we use a separate helper function. That helper generates code that\nleaves the resulting function object on top of the stack. After that, we call\n`defineVariable()` to store that function back into the variable we declared for\nit.\n\nI split out the code to compile the parameters and body because we'll reuse it\nlater for parsing method declarations inside classes. Let's build it\nincrementally, starting with this:\n\n^code compile-function\n\n<aside name=\"no-end-scope\">\n\nThis `beginScope()` doesn't have a corresponding `endScope()` call. 
Because we\nend Compiler completely when we reach the end of the function body, there's no\nneed to close the lingering outermost scope.\n\n</aside>\n\nFor now, we won't worry about parameters. We parse an empty pair of parentheses\nfollowed by the body. The body starts with a left curly brace, which we parse\nhere. Then we call our existing `block()` function, which knows how to compile\nthe rest of a block including the closing brace.\n\n### A stack of compilers\n\nThe interesting parts are the compiler stuff at the top and bottom. The Compiler\nstruct stores data like which slots are owned by which local variables, how many\nblocks of nesting we're currently in, etc. All of that is specific to a single\nfunction. But now the front end needs to handle compiling multiple functions\n<span name=\"nested\">nested</span> within each other.\n\n<aside name=\"nested\">\n\nRemember that the compiler treats top-level code as the body of an implicit\nfunction, so as soon as we add *any* function declarations, we're in a world of\nnested functions.\n\n</aside>\n\nThe trick for managing that is to create a separate Compiler for each function\nbeing compiled. When we start compiling a function declaration, we create a new\nCompiler on the C stack and initialize it. `initCompiler()` sets that Compiler\nto be the current one. Then, as we compile the body, all of the functions that\nemit bytecode write to the chunk owned by the new Compiler's function.\n\nAfter we reach the end of the function's block body, we call `endCompiler()`.\nThat yields the newly compiled function object, which we store as a constant in\nthe *surrounding* function's constant table. But, wait, how do we get back to\nthe surrounding function? We lost it when `initCompiler()` overwrote the current\ncompiler pointer.\n\nWe fix that by treating the series of nested Compiler structs as a stack. Unlike\nthe Value and CallFrame stacks in the VM, we won't use an array. Instead, we use\na linked list. 
Each Compiler points back to the Compiler for the function that\nencloses it, all the way back to the root Compiler for the top-level code.\n\n^code enclosing-field (2 before, 1 after)\n\nInside the Compiler struct, we can't reference the Compiler *typedef* since that\ndeclaration hasn't finished yet. Instead, we give a name to the struct itself\nand use that for the field's type. C is weird.\n\nWhen initializing a new Compiler, we capture the about-to-no-longer-be-current\none in that pointer.\n\n^code store-enclosing (1 before, 1 after)\n\nThen when a Compiler finishes, it pops itself off the stack by restoring the\nprevious compiler to be the new current one.\n\n^code restore-enclosing (2 before, 1 after)\n\nNote that we don't even need to <span name=\"compiler\">dynamically</span>\nallocate the Compiler structs. Each is stored as a local variable in the C stack\n-- either in `compile()` or `function()`. The linked list of Compilers threads\nthrough the C stack. The reason we can get an unbounded number of them is\nbecause our compiler uses recursive descent, so `function()` ends up calling\nitself recursively when you have nested function declarations.\n\n<aside name=\"compiler\">\n\nUsing the native stack for Compiler structs does mean our compiler has a\npractical limit on how deeply nested function declarations can be. Go too far\nand you could overflow the C stack. If we want the compiler to be more robust\nagainst pathological or even malicious code -- a real concern for tools like\nJavaScript VMs -- it would be good to have our compiler artificially limit the\namount of function nesting it permits.\n\n</aside>\n\n### Function parameters\n\nFunctions aren't very useful if you can't pass arguments to them, so let's do\nparameters next.\n\n^code parameters (1 before, 1 after)\n\nSemantically, a parameter is simply a local variable declared in the outermost\nlexical scope of the function body. 
We get to use the existing compiler support\nfor declaring named local variables to parse and compile parameters. Unlike\nlocal variables, which have initializers, there's no code here to initialize the\nparameter's value. We'll see how they are initialized later when we do argument\npassing in function calls.\n\nWhile we're at it, we note the function's arity by counting how many parameters\nwe parse. The other piece of metadata we store with a function is its name. When\ncompiling a function declaration, we call `initCompiler()` right after we parse\nthe function's name. That means we can grab the name right then from the\nprevious token.\n\n^code init-function-name (1 before, 2 after)\n\nNote that we're careful to create a copy of the name string. Remember, the\nlexeme points directly into the original source code string. That string may get\nfreed once the code is finished compiling. The function object we create in the\ncompiler outlives the compiler and persists until runtime. So it needs its own\nheap-allocated name string that it can keep around.\n\nRad. Now we can compile function declarations, like this:\n\n```lox\nfun areWeHavingItYet() {\n  print \"Yes we are!\";\n}\n\nprint areWeHavingItYet;\n```\n\nWe just can't do anything <span name=\"useful\">useful</span> with them.\n\n<aside name=\"useful\">\n\nWe can print them! I guess that's not very useful, though.\n\n</aside>\n\n## Function Calls\n\nBy the end of this section, we'll start to see some interesting behavior. The\nnext step is calling functions. We don't usually think of it this way, but a\nfunction call expression is kind of an infix `(` operator. You have a\nhigh-precedence expression on the left for the thing being called -- usually\njust a single identifier. 
Then the `(` in the middle, followed by the argument\nexpressions separated by commas, and a final `)` to wrap it up at the end.\n\nThat odd grammatical perspective explains how to hook the syntax into our\nparsing table.\n\n^code infix-left-paren (1 before, 1 after)\n\nWhen the parser encounters a left parenthesis following an expression, it\ndispatches to a new parser function.\n\n^code compile-call\n\nWe've already consumed the `(` token, so next we compile the arguments using a\nseparate `argumentList()` helper. That function returns the number of arguments\nit compiled. Each argument expression generates code that leaves its value on\nthe stack in preparation for the call. After that, we emit a new `OP_CALL`\ninstruction to invoke the function, using the argument count as an operand.\n\nWe compile the arguments using this friend:\n\n^code argument-list\n\nThat code should look familiar from jlox. We chew through arguments as long as\nwe find commas after each expression. Once we run out, we consume the final\nclosing parenthesis and we're done.\n\nWell, almost. Back in jlox, we added a compile-time check that you don't pass\nmore than 255 arguments to a call. At the time, I said that was because clox\nwould need a similar limit. Now you can see why -- since we stuff the argument\ncount into the bytecode as a single-byte operand, we can only go up to 255. We\nneed to verify that in this compiler too.\n\n^code arg-limit (1 before, 1 after)\n\nThat's the front end. Let's skip over to the back end, with a quick stop in the\nmiddle to declare the new instruction.\n\n^code op-call (1 before, 1 after)\n\n### Binding arguments to parameters\n\nBefore we get to the implementation, we should think about what the stack looks\nlike at the point of a call and what we need to do from there. When we reach the\ncall instruction, we have already executed the expression for the function being\ncalled, followed by its arguments. 
Say our program looks like this:\n\n```lox\nfun sum(a, b, c) {\n  return a + b + c;\n}\n\nprint 4 + sum(5, 6, 7);\n```\n\nIf we pause the VM right on the `OP_CALL` instruction for that call to `sum()`,\nthe stack looks like this:\n\n<img src=\"image/calls-and-functions/argument-stack.png\" alt=\"Stack: 4, fn sum, 5, 6, 7.\" />\n\nPicture this from the perspective of `sum()` itself. When the compiler compiled\n`sum()`, it automatically allocated slot zero. Then, after that, it allocated\nlocal slots for the parameters `a`, `b`, and `c`, in order. To perform a call to\n`sum()`, we need a CallFrame initialized with the function being called and a\nregion of stack slots that it can use. Then we need to collect the arguments\npassed to the function and get them into the corresponding slots for the\nparameters.\n\nWhen the VM starts executing the body of `sum()`, we want its stack window to\nlook like this:\n\n<img src=\"image/calls-and-functions/parameter-window.png\" alt=\"The same stack with the sum() function's call frame window surrounding fn sum, 5, 6, and 7.\" />\n\nDo you notice how the argument slots that the caller sets up and the parameter\nslots the callee needs are both in exactly the right order? How convenient! This\nis no coincidence. When I talked about each CallFrame having its own window into\nthe stack, I never said those windows must be *disjoint*. There's nothing\npreventing us from overlapping them, like this:\n\n<img src=\"image/calls-and-functions/overlapping-windows.png\" alt=\"The same stack with the top-level call frame covering the entire stack and the sum() function's call frame window surrounding fn sum, 5, 6, and 7.\" />\n\n<span name=\"lua\">The</span> top of the caller's stack contains the function\nbeing called followed by the arguments in order. We know the caller doesn't have\nany other slots above those in use because any temporaries needed when\nevaluating argument expressions have been discarded by now. 
The bottom of the\ncallee's stack overlaps so that the parameter slots exactly line up with where\nthe argument values already live.\n\n<aside name=\"lua\">\n\nDifferent bytecode VMs and real CPU architectures have different *calling\nconventions*, which is the specific mechanism they use to pass arguments, store\nthe return address, etc. The mechanism I use here is based on Lua's clean, fast\nvirtual machine.\n\n</aside>\n\nThis means that we don't need to do *any* work to \"bind an argument to a\nparameter\". There's no copying values between slots or across environments. The\narguments are already exactly where they need to be. It's hard to beat that for\nperformance.\n\nTime to implement the call instruction.\n\n^code interpret-call (1 before, 1 after)\n\nWe need to know the function being called and the number of arguments passed to\nit. We get the latter from the instruction's operand. That also tells us where\nto find the function on the stack by counting past the argument slots from the\ntop of the stack. We hand that data off to a separate `callValue()` function. If\nthat returns `false`, it means the call caused some sort of runtime error. When\nthat happens, we abort the interpreter.\n\nIf `callValue()` is successful, there will be a new frame on the CallFrame stack\nfor the called function. The `run()` function has its own cached pointer to the\ncurrent frame, so we need to update that.\n\n^code update-frame-after-call (2 before, 1 after)\n\nSince the bytecode dispatch loop reads from that `frame` variable, when the VM\ngoes to execute the next instruction, it will read the `ip` from the newly\ncalled function's CallFrame and jump to its code. 
The work for executing that\ncall begins here:\n\n^code call-value\n\n<aside name=\"switch\">\n\nUsing a `switch` statement to check a single type is overkill now, but will make\nsense when we add cases to handle other callable types.\n\n</aside>\n\nThere's more going on here than just initializing a new CallFrame. Because Lox\nis dynamically typed, there's nothing to prevent a user from writing bad code\nlike:\n\n```lox\nvar notAFunction = 123;\nnotAFunction();\n```\n\nIf that happens, the runtime needs to safely report an error and halt. So the\nfirst thing we do is check the type of the value that we're trying to call. If\nit's not a function, we error out. Otherwise, the actual call happens here:\n\n^code call\n\nThis simply initializes the next CallFrame on the stack. It stores a pointer to\nthe function being called and points the frame's `ip` to the beginning of the\nfunction's bytecode. Finally, it sets up the `slots` pointer to give the frame\nits window into the stack. The arithmetic there ensures that the arguments\nalready on the stack line up with the function's parameters:\n\n<img src=\"image/calls-and-functions/arithmetic.png\" alt=\"The arithmetic to calculate frame-&gt;slots from stackTop and argCount.\" />\n\nThe funny little `- 1` is to account for stack slot zero which the compiler set\naside for when we add methods later. The parameters start at slot one so we\nmake the window start one slot earlier to align them with the arguments.\n\nBefore we move on, let's add the new instruction to our disassembler.\n\n^code disassemble-call (1 before, 1 after)\n\nAnd one more quick side trip. 
Now that we have a handy function for initiating a\nCallFrame, we may as well use it to set up the first frame for executing the\ntop-level code.\n\n^code interpret (1 before, 2 after)\n\nOK, now back to calls...\n\n### Runtime error checking\n\nThe overlapping stack windows work based on the assumption that a call passes\nexactly one argument for each of the function's parameters. But, again, because\nLox ain't statically typed, a foolish user could pass too many or too few\narguments. In Lox, we've defined that to be a runtime error, which we report\nlike so:\n\n^code check-arity (1 before, 1 after)\n\nPretty straightforward. This is why we store the arity of each function inside\nthe ObjFunction for it.\n\nThere's another error we need to report that's less to do with the user's\nfoolishness than our own. Because the CallFrame array has a fixed size, we need\nto ensure a deep call chain doesn't overflow it.\n\n^code check-overflow (2 before, 1 after)\n\nIn practice, if a program gets anywhere close to this limit, there's most likely\na bug in some runaway recursive code.\n\n### Printing stack traces\n\nWhile we're on the subject of runtime errors, let's spend a little time making\nthem more useful. Stopping on a runtime error is important to prevent the VM\nfrom crashing and burning in some ill-defined way. But simply aborting doesn't\nhelp the user fix their code that *caused* that error.\n\nThe classic tool to aid debugging runtime failures is a **stack trace** -- a\nprint out of each function that was still executing when the program died, and\nwhere the execution was at the point that it died. Now that we have a call stack\nand we've conveniently stored each function's name, we can show that entire\nstack when a runtime error disrupts the harmony of the user's existence. 
It\nlooks like this:\n\n^code runtime-error-stack (2 before, 2 after)\n\n<aside name=\"minus\">\n\nThe `- 1` is because the IP is already sitting on the next instruction to be\nexecuted but we want the stack trace to point to the previous failed\ninstruction.\n\n</aside>\n\nAfter printing the error message itself, we walk the call stack from <span\nname=\"top\">top</span> (the most recently called function) to bottom (the\ntop-level code). For each frame, we find the line number that corresponds to the\ncurrent `ip` inside that frame's function. Then we print that line number along\nwith the function name.\n\n<aside name=\"top\">\n\nThere is some disagreement on which order stack frames should be shown in a\ntrace. Most put the innermost function as the first line and work their way\ntowards the bottom of the stack. Python prints them out in the opposite order.\nSo reading from top to bottom tells you how your program got to where it is, and\nthe last line is where the error actually occurred.\n\nThere's a logic to that style. It ensures you can always see the innermost\nfunction even if the stack trace is too long to fit on one screen. On the other\nhand, the \"[inverted pyramid][]\" from journalism tells us we should put the most\nimportant information *first* in a block of text. In a stack trace, that's the\nfunction where the error actually occurred. Most other language implementations\ndo that.\n\n[inverted pyramid]: https://en.wikipedia.org/wiki/Inverted_pyramid_(journalism)\n\n</aside>\n\nFor example, if you run this broken program:\n\n```lox\nfun a() { b(); }\nfun b() { c(); }\nfun c() {\n  c(\"too\", \"many\");\n}\n\na();\n```\n\nIt prints out:\n\n```text\nExpected 0 arguments but got 2.\n[line 4] in c()\n[line 2] in b()\n[line 1] in a()\n[line 7] in script\n```\n\nThat doesn't look too bad, does it?\n\n### Returning from functions\n\nWe're getting close. We can call functions, and the VM will execute them. But we\ncan't *return* from them yet. 
We've had an `OP_RETURN` instruction for quite\nsome time, but it's always had some kind of temporary code hanging out in it\njust to get us out of the bytecode loop. The time has arrived for a real\nimplementation.\n\n^code interpret-return (1 before, 1 after)\n\nWhen a function returns a value, that value will be on top of the stack. We're\nabout to discard the called function's entire stack window, so we pop that\nreturn value off and hang on to it. Then we discard the CallFrame for the\nreturning function. If that was the very last CallFrame, it means we've finished\nexecuting the top-level code. The entire program is done, so we pop the main\nscript function from the stack and then exit the interpreter.\n\nOtherwise, we discard all of the slots the callee was using for its parameters\nand local variables. That includes the same slots the caller used to pass the\narguments. Now that the call is done, the caller doesn't need them anymore. This\nmeans the top of the stack ends up right at the beginning of the returning\nfunction's stack window.\n\nWe push the return value back onto the stack at that new, lower location. Then\nwe update the `run()` function's cached pointer to the current frame. Just like\nwhen we began a call, on the next iteration of the bytecode dispatch loop, the\nVM will read `ip` from that frame, and execution will jump back to the caller,\nright where it left off, immediately after the `OP_CALL` instruction.\n\n<img src=\"image/calls-and-functions/return.png\" alt=\"Each step of the return process: popping the return value, discarding the call frame, pushing the return value.\" />\n\nNote that we assume here that the function *did* actually return a value, but\na function can implicitly return by reaching the end of its body:\n\n```lox\nfun noReturn() {\n  print \"Do stuff\";\n  // No return here.\n}\n\nprint noReturn(); // ???\n```\n\nWe need to handle that correctly too. The language is specified to implicitly\nreturn `nil` in that case. 
To make that happen, we add this:\n\n^code return-nil (1 before, 2 after)\n\nThe compiler calls `emitReturn()` to write the `OP_RETURN` instruction at the\nend of a function body. Now, before that, it emits an instruction to push `nil`\nonto the stack. And with that, we have working function calls! They can even\ntake parameters! It almost looks like we know what we're doing here.\n\n## Return Statements\n\nIf you want a function that returns something other than the implicit `nil`, you\nneed a `return` statement. Let's get that working.\n\n^code match-return (1 before, 1 after)\n\nWhen the compiler sees a `return` keyword, it goes here:\n\n^code return-statement\n\nThe return value expression is optional, so the parser looks for a semicolon\ntoken to tell if a value was provided. If there is no return value, the\nstatement implicitly returns `nil`. We implement that by calling `emitReturn()`,\nwhich emits an `OP_NIL` instruction. Otherwise, we compile the return value\nexpression and return it with an `OP_RETURN` instruction.\n\nThis is the same `OP_RETURN` instruction we've already implemented -- we don't\nneed any new runtime code. This is quite a difference from jlox. There, we had\nto use exceptions to unwind the stack when a `return` statement was executed.\nThat was because you could return from deep inside some nested blocks. Since\njlox recursively walks the AST, that meant there were a bunch of Java method\ncalls we needed to escape out of.\n\nOur bytecode compiler flattens that all out. We do recursive descent during\nparsing, but at runtime, the VM's bytecode dispatch loop is completely flat.\nThere is no recursion going on at the C level at all. So returning, even from\nwithin some nested blocks, is as straightforward as returning from the end of\nthe function's body.\n\nWe're not totally done, though. The new `return` statement gives us a new\ncompile error to worry about. 
Returns are useful for returning from functions\nbut the top level of a Lox program is imperative code too. You shouldn't be able\nto <span name=\"worst\">return</span> from there.\n\n```lox\nreturn \"What?!\";\n```\n\n<aside name=\"worst\">\n\nAllowing `return` at the top level isn't the worst idea in the world. It would\ngive you a natural way to terminate a script early. You could maybe even use a\nreturned number to indicate the process's exit code.\n\n</aside>\n\nWe've specified that it's a compile error to have a `return` statement outside\nof any function, which we implement like so:\n\n^code return-from-script (1 before, 1 after)\n\nThis is one of the reasons we added that FunctionType enum to the compiler.\n\n## Native Functions\n\nOur VM is getting more powerful. We've got functions, calls, parameters,\nreturns. You can define lots of different functions that can call each other in\ninteresting ways. But, ultimately, they can't really *do* anything. The only\nuser-visible thing a Lox program can do, regardless of its complexity, is print.\nTo add more capabilities, we need to expose them to the user.\n\nA programming language implementation reaches out and touches the material world\nthrough **native functions**. If you want to be able to write programs that\ncheck the time, read user input, or access the file system, we need to add\nnative functions -- callable from Lox but implemented in C -- that expose those\ncapabilities.\n\nAt the language level, Lox is fairly complete -- it's got closures, classes,\ninheritance, and other fun stuff. One reason it feels like a toy language is\nbecause it has almost no native capabilities. We could turn it into a real\nlanguage by adding a long list of them.\n\nHowever, grinding through a pile of OS operations isn't actually very\neducational. Once you've seen how to bind one piece of C code to Lox, you get\nthe idea. 
But you do need to see *one*, and even a single native function\nrequires us to build out all the machinery for interfacing Lox with C. So we'll\ngo through that and do all the hard work. Then, when that's done, we'll add one\ntiny native function just to prove that it works.\n\nThe reason we need new machinery is because, from the implementation's\nperspective, native functions are different from Lox functions. When they are\ncalled, they don't push a CallFrame, because there's no bytecode for that\nframe to point to. They have no bytecode chunk. Instead, they somehow reference\na piece of native C code.\n\nWe handle this in clox by defining native functions as an entirely different\nobject type.\n\n^code obj-native (1 before, 2 after)\n\nThe representation is simpler than ObjFunction -- merely an Obj header and a\npointer to the C function that implements the native behavior. The native\nfunction takes the argument count and a pointer to the first argument on the\nstack. It accesses the arguments through that pointer. Once it's done, it\nreturns the result value.\n\nAs always, a new object type carries some accoutrements with it. To create an\nObjNative, we declare a constructor-like function.\n\n^code new-native-h (1 before, 1 after)\n\nWe implement that like so:\n\n^code new-native\n\nThe constructor takes a C function pointer to wrap in an ObjNative. It sets up\nthe object header and stores the function. For the header, we need a new object\ntype.\n\n^code obj-type-native (2 before, 2 after)\n\nThe VM also needs to know how to deallocate a native function object.\n\n^code free-native (1 before, 1 after)\n\nThere isn't much here since ObjNative doesn't own any extra memory. 
The other\ncapability all Lox objects support is being printed.\n\n^code print-native (1 before, 1 after)\n\nIn order to support dynamic typing, we have a macro to see if a value is a\nnative function.\n\n^code is-native (1 before, 1 after)\n\nAssuming that returns true, this macro extracts the C function pointer from a\nValue representing a native function:\n\n^code as-native (1 before, 1 after)\n\nAll of this baggage lets the VM treat native functions like any other object.\nYou can store them in variables, pass them around, throw them birthday parties,\netc. Of course, the operation we actually care about is *calling* them -- using\none as the left-hand operand in a call expression.\n\nOver in `callValue()` we add another type case.\n\n^code call-native (2 before, 1 after)\n\nIf the object being called is a native function, we invoke the C function right\nthen and there. There's no need to muck with CallFrames or anything. We just\nhand off to C, get the result, and stuff it back in the stack. This makes native\nfunctions as fast as we can get.\n\nWith this, users should be able to call native functions, but there aren't any\nto call. Without something like a foreign function interface, users can't define\ntheir own native functions. That's our job as VM implementers. We'll start with\na helper to define a new native function exposed to Lox programs.\n\n^code define-native\n\nIt takes a pointer to a C function and the name it will be known as in Lox.\nWe wrap the function in an ObjNative and then store that in a global variable\nwith the given name.\n\nYou're probably wondering why we push and pop the name and function on the\nstack. That looks weird, right? This is the kind of stuff you have to worry\nabout when <span name=\"worry\">garbage</span> collection gets involved. Both\n`copyString()` and `newNative()` dynamically allocate memory. That means once we\nhave a GC, they can potentially trigger a collection. 
If that happens, we need\nto ensure the collector knows we're not done with the name and ObjNative so\nthat it doesn't free them out from under us. Storing them on the value stack\naccomplishes that.\n\n<aside name=\"worry\">\n\nDon't worry if you didn't follow all that. It will make a lot more sense once we\nget around to [implementing the GC][gc].\n\n[gc]: garbage-collection.html\n\n</aside>\n\nIt feels silly, but after all of that work, we're going to add only one\nlittle native function.\n\n^code clock-native\n\nThis returns the elapsed time since the program started running, in seconds. It's\nhandy for benchmarking Lox programs. In Lox, we'll name it `clock()`.\n\n^code define-native-clock (1 before, 1 after)\n\nTo get to the C standard library `clock()` function, the \"vm\" module needs an\ninclude.\n\n^code vm-include-time (1 before, 2 after)\n\nThat was a lot of material to work through, but we did it! Type this in and try\nit out:\n\n```lox\nfun fib(n) {\n  if (n < 2) return n;\n  return fib(n - 2) + fib(n - 1);\n}\n\nvar start = clock();\nprint fib(35);\nprint clock() - start;\n```\n\nWe can write a really inefficient recursive Fibonacci function. Even better, we\ncan measure just <span name=\"faster\">*how*</span> inefficient it is. This is, of\ncourse, not the smartest way to calculate a Fibonacci number. But it is a good\nway to stress test a language implementation's support for function calls. On my\nmachine, running this in clox is about five times faster than in jlox. That's\nquite an improvement.\n\n<aside name=\"faster\">\n\nIt's a little slower than a comparable Ruby program run in Ruby 2.4.3p205, and\nabout 3x faster than one run in Python 3.7.3. And we still have a lot of simple\noptimizations we can do in our VM.\n\n</aside>\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  Reading and writing the `ip` field is one of the most frequent operations\n    inside the bytecode loop. 
Right now, we access it through a pointer to the\n    current CallFrame. That requires a pointer indirection which may force the\n    CPU to bypass the cache and hit main memory. That can be a real performance\n    sink.\n\n    Ideally, we'd keep the `ip` in a native CPU register. C doesn't let us\n    *require* that without dropping into inline assembly, but we can structure\n    the code to encourage the compiler to make that optimization. If we store\n    the `ip` directly in a C local variable and mark it `register`, there's a\n    good chance the C compiler will accede to our polite request.\n\n    This does mean we need to be careful to load and store the local `ip` back\n    into the correct CallFrame when starting and ending function calls.\n    Implement this optimization. Write a couple of benchmarks and see how it\n    affects the performance. Do you think the extra code complexity is worth it?\n\n2.  Native function calls are fast in part because we don't validate that the\n    call passes as many arguments as the function expects. We really should, or\n    an incorrect call to a native function without enough arguments could cause\n    the function to read uninitialized memory. Add arity checking.\n\n3.  Right now, there's no way for a native function to signal a runtime error.\n    In a real implementation, this is something we'd need to support because\n    native functions live in the statically typed world of C but are called\n    from dynamically typed Lox land. If a user, say, tries to pass a string to\n    `sqrt()`, that native function needs to report a runtime error.\n\n    Extend the native function system to support that. How does this capability\n    affect the performance of native calls?\n\n4.  Add some more native functions to do things you find useful. Write some\n    programs using those. What did you add? How do they affect the feel of the\n    language and how practical it is?\n\n</div>\n"
  },
  {
    "path": "book/chunks-of-bytecode.md",
    "content": "> If you find that you're spending almost all your time on theory, start turning\n> some attention to practical things; it will improve your theories. If you find\n> that you're spending almost all your time on practice, start turning some\n> attention to theoretical things; it will improve your practice.\n>\n> <cite>Donald Knuth</cite>\n\nWe already have ourselves a complete implementation of Lox with jlox, so why\nisn't the book over yet? Part of this is because jlox relies on the <span\nname=\"metal\">JVM</span> to do lots of things for us. If we want to understand\nhow an interpreter works all the way down to the metal, we need to build those\nbits and pieces ourselves.\n\n<aside name=\"metal\">\n\nOf course, our second interpreter relies on the C standard library for basics\nlike memory allocation, and the C compiler frees us from details of the\nunderlying machine code we're running it on. Heck, that machine code is probably\nimplemented in terms of microcode on the chip. And the C runtime relies on the\noperating system to hand out pages of memory. But we have to stop *somewhere* if\nthis book is going to fit on your bookshelf.\n\n</aside>\n\nAn even more fundamental reason that jlox isn't sufficient is that it's too damn\nslow. A tree-walk interpreter is fine for some kinds of high-level, declarative\nlanguages. But for a general-purpose, imperative language -- even a \"scripting\"\nlanguage like Lox -- it won't fly. Take this little script:\n\n```lox\nfun fib(n) {\n  if (n < 2) return n;\n  return fib(n - 1) + fib(n - 2); // [fib]\n}\n\nvar before = clock();\nprint fib(40);\nvar after = clock();\nprint after - before;\n```\n\n<aside name=\"fib\">\n\nThis is a comically inefficient way to actually calculate Fibonacci numbers.\nOur goal is to see how fast the *interpreter* runs, not to see how fast of a\nprogram we can write. 
A slow program that does a lot of work -- pointless or not\n-- is a good test case for that.\n\n</aside>\n\nOn my laptop, that takes jlox about 72 seconds to execute. An equivalent C\nprogram finishes in half a second. Our dynamically typed scripting language is\nnever going to be as fast as a statically typed language with manual memory\nmanagement, but we don't need to settle for more than *two orders of magnitude*\nslower.\n\nWe could take jlox and run it in a profiler and start tuning and tweaking\nhotspots, but that will only get us so far. The execution model -- walking the\nAST -- is fundamentally the wrong design. We can't micro-optimize that to the\nperformance we want any more than you can polish an AMC Gremlin into an SR-71\nBlackbird.\n\nWe need to rethink the core model. This chapter introduces that model, bytecode,\nand begins our new interpreter, clox.\n\n## Bytecode?\n\nIn engineering, few choices are without trade-offs. To best understand why we're\ngoing with bytecode, let's stack it up against a couple of alternatives.\n\n### Why not walk the AST?\n\nOur existing interpreter has a couple of things going for it:\n\n*   Well, first, we already wrote it. It's done. And the main reason it's done\n    is because this style of interpreter is *really simple to implement*. The\n    runtime representation of the code directly maps to the syntax. It's\n    virtually effortless to get from the parser to the data structures we need\n    at runtime.\n\n*   It's *portable*. Our current interpreter is written in Java and runs on any\n    platform Java supports. We could write a new implementation in C using the\n    same approach and compile and run our language on basically every platform\n    under the sun.\n\nThose are real advantages. But, on the other hand, it's *not memory-efficient*.\nEach piece of syntax becomes an AST node. 
A tiny Lox expression like `1 + 2`\nturns into a slew of objects with lots of pointers between them, something like:\n\n<span name=\"header\"></span>\n\n<aside name=\"header\">\n\nThe \"(header)\" parts are the bookkeeping information the Java virtual machine\nuses to support memory management and store the object's type. Those take up\nspace too!\n\n</aside>\n\n<img src=\"image/chunks-of-bytecode/ast.png\" alt=\"The tree of Java objects created to represent '1 + 2'.\" />\n\nEach of those pointers adds an extra 32 or 64 bits of overhead to the object.\nWorse, sprinkling our data across the heap in a loosely connected web of objects\ndoes bad things for <span name=\"locality\">*spatial locality*</span>.\n\n<aside name=\"locality\">\n\nI wrote [an entire chapter][gpp locality] about this exact problem in my first\nbook, *Game Programming Patterns*, if you want to really dig in.\n\n[gpp locality]: http://gameprogrammingpatterns.com/data-locality.html\n\n</aside>\n\nModern CPUs process data way faster than they can pull it from RAM. To\ncompensate for that, chips have multiple layers of caching. If a piece of memory\nit needs is already in the cache, it can be loaded more quickly. We're talking\nupwards of 100 *times* faster.\n\nHow does data get into that cache? The machine speculatively stuffs things in\nthere for you. Its heuristic is pretty simple. Whenever the CPU reads a bit of\ndata from RAM, it pulls in a whole little bundle of adjacent bytes and stuffs\nthem in the cache.\n\nIf our program next requests some data close enough to be inside that cache\nline, our CPU runs like a well-oiled conveyor belt in a factory. We *really*\nwant to take advantage of this. To use the cache effectively, the way we\nrepresent code in memory should be dense and ordered like it's read.\n\nNow look up at that tree. Those sub-objects could be <span\nname=\"anywhere\">*anywhere*</span>. 
Every step the tree-walker takes where it\nfollows a reference to a child node may step outside the bounds of the cache and\nforce the CPU to stall until a new lump of data can be slurped in from RAM. Just\nthe *overhead* of those tree nodes with all of their pointer fields and object\nheaders tends to push objects away from each other and out of the cache.\n\n<aside name=\"anywhere\">\n\nEven if the objects happened to be allocated in sequential memory when the\nparser first produced them, after a couple of rounds of garbage collection --\nwhich may move objects around in memory -- there's no telling where they'll be.\n\n</aside>\n\nOur AST walker has other overhead too around interface dispatch and the Visitor\npattern, but the locality issues alone are enough to justify a better code\nrepresentation.\n\n### Why not compile to native code?\n\nIf you want to go *real* fast, you want to get all of those layers of\nindirection out of the way. Right down to the metal. Machine code. It even\n*sounds* fast. *Machine code.*\n\nCompiling directly to the native instruction set the chip supports is what the\nfastest languages do. Targeting native code has been the most efficient option\nsince way back in the early days when engineers actually <span\nname=\"hand\">handwrote</span> programs in machine code.\n\n<aside name=\"hand\">\n\nYes, they actually wrote machine code by hand. On punched cards. Which,\npresumably, they punched *with their fists*.\n\n</aside>\n\nIf you've never written any machine code, or its slightly more human-palatable\ncousin assembly code before, I'll give you the gentlest of introductions. Native\ncode is a dense series of operations, encoded directly in binary. Each\ninstruction is between one and a few bytes long, and is almost mind-numbingly\nlow level. 
\"Move a value from this address to this register.\" \"Add the integers\nin these two registers.\" Stuff like that.\n\nThe CPU cranks through the instructions, decoding and executing each one in\norder. There is no tree structure like our AST, and control flow is handled by\njumping from one point in the code directly to another. No indirection, no\noverhead, no unnecessary skipping around or chasing pointers.\n\nLightning fast, but that performance comes at a cost. First of all, compiling to\nnative code ain't easy. Most chips in wide use today have sprawling Byzantine\narchitectures with heaps of instructions that accreted over decades. They\nrequire sophisticated register allocation, pipelining, and instruction\nscheduling.\n\nAnd, of course, you've thrown <span name=\"back\">portability</span> out. Spend a\nfew years mastering some architecture and that still only gets you onto *one* of\nthe several popular instruction sets out there. To get your language on all of\nthem, you need to learn all of their instruction sets and write a separate back\nend for each one.\n\n<aside name=\"back\">\n\nThe situation isn't entirely dire. A well-architected compiler lets you\nshare the front end and most of the middle layer optimization passes across the\ndifferent architectures you support. It's mainly the code generation and some of\nthe details around instruction selection that you'll need to write afresh each\ntime.\n\nThe [LLVM][] project gives you some of this out of the box. If your compiler\noutputs LLVM's own special intermediate language, LLVM in turn compiles that to\nnative code for a plethora of architectures.\n\n[llvm]: https://llvm.org/\n\n</aside>\n\n### What is bytecode?\n\nFix those two points in your mind. On one end, a tree-walk interpreter is\nsimple, portable, and slow. On the other, native code is complex and\nplatform-specific but fast. Bytecode sits in the middle. 
It retains the\nportability of a tree-walker -- we won't be getting our hands dirty with\nassembly code in this book. It sacrifices *some* simplicity to get a performance\nboost in return, though not as fast as going fully native.\n\nStructurally, bytecode resembles machine code. It's a dense, linear sequence of\nbinary instructions. That keeps overhead low and plays nice with the cache.\nHowever, it's a much simpler, higher-level instruction set than any real chip\nout there. (In many bytecode formats, each instruction is only a single byte\nlong, hence \"bytecode\".)\n\nImagine you're writing a native compiler from some source language and you're\ngiven carte blanche to define the easiest possible architecture to target.\nBytecode is kind of like that. It's an idealized fantasy instruction set that\nmakes your life as the compiler writer easier.\n\nThe problem with a fantasy architecture, of course, is that it doesn't exist. We\nsolve that by writing an *emulator* -- a simulated chip written in software that\ninterprets the bytecode one instruction at a time. A *virtual machine (VM)*, if\nyou will.\n\nThat emulation layer adds <span name=\"p-code\">overhead</span>, which is a key\nreason bytecode is slower than native code. But in return, it gives us\nportability. Write our VM in a language like C that is already supported on all\nthe machines we care about, and we can run our emulator on top of any hardware\nwe like.\n\n<aside name=\"p-code\">\n\nOne of the first bytecode formats was [p-code][], developed for Niklaus Wirth's\nPascal language. You might think a PDP-11 running at 15MHz couldn't afford the\noverhead of emulating a virtual machine. But back then, computers were in their\nCambrian explosion and new architectures appeared every day. Keeping up with the\nlatest chips was worth more than squeezing the maximum performance from each\none. 
That's why the \"p\" in p-code doesn't stand for \"Pascal\", but \"portable\".\n\n[p-code]: https://en.wikipedia.org/wiki/P-code_machine\n\n</aside>\n\nThis is the path we'll take with our new interpreter, clox. We'll follow in the\nfootsteps of the main implementations of Python, Ruby, Lua, OCaml, Erlang, and\nothers. In many ways, our VM's design will parallel the structure of our\nprevious interpreter:\n\n<img src=\"image/chunks-of-bytecode/phases.png\" alt=\"Phases of the two\nimplementations. jlox is Parser to Syntax Trees to Interpreter. clox is Compiler\nto Bytecode to Virtual Machine.\" />\n\nOf course, we won't implement the phases strictly in order. Like our previous\ninterpreter, we'll bounce around, building up the implementation one language\nfeature at a time. In this chapter, we'll get the skeleton of the application in\nplace and create the data structures needed to store and represent a chunk of\nbytecode.\n\n## Getting Started\n\nWhere else to begin, but at `main()`? <span name=\"ready\">Fire</span> up your\ntrusty text editor and start typing.\n\n<aside name=\"ready\">\n\nNow is a good time to stretch, maybe crack your knuckles. A little montage music\nwouldn't hurt either.\n\n</aside>\n\n^code main-c\n\nFrom this tiny seed, we will grow our entire VM. Since C provides us with so\nlittle, we first need to spend some time amending the soil. Some of that goes\ninto this header:\n\n^code common-h\n\nThere are a handful of types and constants we'll use throughout the interpreter,\nand this is a convenient place to put them. For now, it's the venerable `NULL`,\n`size_t`, the nice C99 Boolean `bool`, and explicit-sized integer types --\n`uint8_t` and friends.\n\n## Chunks of Instructions\n\nNext, we need a module to define our code representation. 
I've been using\n\"chunk\" to refer to sequences of bytecode, so let's make that the official name\nfor that module.\n\n^code chunk-h\n\nIn our bytecode format, each instruction has a one-byte **operation code**\n(universally shortened to **opcode**). That number controls what kind of\ninstruction we're dealing with -- add, subtract, look up variable, etc. We\ndefine those here:\n\n^code op-enum (1 before, 2 after)\n\nFor now, we start with a single instruction, `OP_RETURN`. When we have a\nfull-featured VM, this instruction will mean \"return from the current function\".\nI admit this isn't exactly useful yet, but we have to start somewhere, and this\nis a particularly simple instruction, for reasons we'll get to later.\n\n### A dynamic array of instructions\n\nBytecode is a series of instructions. Eventually, we'll store some other data\nalong with the instructions, so let's go ahead and create a struct to hold it\nall.\n\n^code chunk-struct (1 before, 2 after)\n\nAt the moment, this is simply a wrapper around an array of bytes. Since we don't\nknow how big the array needs to be before we start compiling a chunk, it must be\ndynamic. Dynamic arrays are one of my favorite data structures. That sounds like\nclaiming vanilla is my favorite ice cream <span name=\"flavor\">flavor</span>, but\nhear me out. Dynamic arrays provide:\n\n<aside name=\"flavor\">\n\nButter pecan is actually my favorite.\n\n</aside>\n\n* Cache-friendly, dense storage\n\n* Constant-time indexed element lookup\n\n* Constant-time appending to the end of the array\n\nThose features are exactly why we used dynamic arrays all the time in jlox under\nthe guise of Java's ArrayList class. Now that we're in C, we get to roll our\nown. If you're rusty on dynamic arrays, the idea is pretty simple. 
In addition\nto the array itself, we keep two numbers: the number of elements in the array we\nhave allocated (\"capacity\") and how many of those allocated entries are actually\nin use (\"count\").\n\n^code count-and-capacity (1 before, 2 after)\n\nWhen we add an element, if the count is less than the capacity, then there is\nalready available space in the array. We store the new element right in there\nand bump the count.\n\n<img src=\"image/chunks-of-bytecode/insert.png\" alt=\"Storing an element in an\narray that has enough capacity.\" />\n\nIf we have no spare capacity, then the process is a little more involved.\n\n<img src=\"image/chunks-of-bytecode/grow.png\" alt=\"Growing the dynamic array\nbefore storing an element.\" class=\"wide\" />\n\n1.  <span name=\"amortized\">Allocate</span> a new array with more capacity.\n2.  Copy the existing elements from the old array to the new one.\n3.  Store the new `capacity`.\n4.  Delete the old array.\n5.  Update `code` to point to the new array.\n6.  Store the element in the new array now that there is room.\n7.  Update the `count`.\n\n<aside name=\"amortized\">\n\nCopying the existing elements when you grow the array makes it seem like\nappending an element is *O(n)*, not *O(1)* like I said above. However, you need\nto do this copy step only on *some* of the appends. Most of the time, there is\nalready extra capacity, so you don't need to copy.\n\nTo understand how this works, we need [**amortized\nanalysis**](https://en.wikipedia.org/wiki/Amortized_analysis). That shows us\nthat as long as we grow the array by a multiple of its current size, when we\naverage out the cost of a *sequence* of appends, each append is *O(1)*.\n\n</aside>\n\nWe have our struct ready, so let's implement the functions to work with it. 
C\ndoesn't have constructors, so we declare a function to initialize a new chunk.\n\n^code init-chunk-h (1 before, 2 after)\n\nAnd implement it thusly:\n\n^code chunk-c\n\nThe dynamic array starts off completely empty. We don't even allocate a raw\narray yet. To append a byte to the end of the chunk, we use a new function.\n\n^code write-chunk-h (1 before, 2 after)\n\nThis is where the interesting work happens.\n\n^code write-chunk\n\nThe first thing we need to do is see if the current array already has capacity\nfor the new byte. If it doesn't, then we first need to grow the array to make\nroom. (We also hit this case on the very first write when the array is `NULL`\nand `capacity` is 0.)\n\nTo grow the array, first we figure out the new capacity and grow the array to\nthat size. Both of those lower-level memory operations are defined in a new\nmodule.\n\n^code chunk-c-include-memory (1 before, 2 after)\n\nThis is enough to get us started.\n\n^code memory-h\n\nThis macro calculates a new capacity based on a given current capacity. In order\nto get the performance we want, the important part is that it *scales* based on\nthe old size. We grow by a factor of two, which is pretty typical. 1.5&times; is\nanother common choice.\n\nWe also handle when the current capacity is zero. In that case, we jump straight\nto eight elements instead of starting at one. That <span\nname=\"profile\">avoids</span> a little extra memory churn when the array is very\nsmall, at the expense of wasting a few bytes on very small chunks.\n\n<aside name=\"profile\">\n\nI picked the number eight somewhat arbitrarily for the book. Most dynamic array\nimplementations have a minimum threshold like this. 
The right way to pick a\nvalue for this is to profile against real-world usage and see which constant\nmakes the best performance trade-off between extra grows versus wasted space.\n\n</aside>\n\nOnce we know the desired capacity, we create or grow the array to that size\nusing `GROW_ARRAY()`.\n\n^code grow-array (2 before, 2 after)\n\nThis macro pretties up a function call to `reallocate()` where the real work\nhappens. The macro itself takes care of getting the size of the array's element\ntype and casting the resulting `void*` back to a pointer of the right type.\n\nThis `reallocate()` function is the single function we'll use for all dynamic\nmemory management in clox -- allocating memory, freeing it, and changing the\nsize of an existing allocation. Routing all of those operations through a single\nfunction will be important later when we add a garbage collector that needs to\nkeep track of how much memory is in use.\n\nThe two size arguments passed to `reallocate()` control which operation to\nperform:\n\n<table>\n  <thead>\n    <tr>\n      <td>oldSize</td>\n      <td>newSize</td>\n      <td>Operation</td>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <td>0</td>\n      <td>Non&#8209;zero</td>\n      <td>Allocate new block.</td>\n    </tr>\n    <tr>\n      <td>Non&#8209;zero</td>\n      <td>0</td>\n      <td>Free allocation.</td>\n    </tr>\n    <tr>\n      <td>Non&#8209;zero</td>\n      <td>Smaller&nbsp;than&nbsp;<code>oldSize</code></td>\n      <td>Shrink existing allocation.</td>\n    </tr>\n    <tr>\n      <td>Non&#8209;zero</td>\n      <td>Larger&nbsp;than&nbsp;<code>oldSize</code></td>\n      <td>Grow existing allocation.</td>\n    </tr>\n  </tbody>\n</table>\n\nThat sounds like a lot of cases to handle, but here's the implementation:\n\n^code memory-c\n\nWhen `newSize` is zero, we handle the deallocation case ourselves by calling\n`free()`. 
Otherwise, we rely on the C standard library's `realloc()` function.\nThat function conveniently supports the other three aspects of our policy. When\n`oldSize` is zero, `realloc()` is equivalent to calling `malloc()`.\n\nThe interesting cases are when both `oldSize` and `newSize` are not zero. Those\ntell `realloc()` to resize the previously allocated block. If the new size is\nsmaller than the existing block of memory, it simply <span\nname=\"shrink\">updates</span> the size of the block and returns the same pointer\nyou gave it. If the new size is larger, it attempts to grow the existing block\nof memory.\n\nIt can do that only if the memory after that block isn't already in use. If\nthere isn't room to grow the block, `realloc()` instead allocates a *new* block\nof memory of the desired size, copies over the old bytes, frees the old block,\nand then returns a pointer to the new block. Remember, that's exactly the\nbehavior we want for our dynamic array.\n\nBecause computers are finite lumps of matter and not the perfect mathematical\nabstractions computer science theory would have us believe, allocation can fail\nif there isn't enough memory and `realloc()` will return `NULL`. We should\nhandle that.\n\n^code out-of-memory (1 before, 1 after)\n\nThere's not really anything *useful* that our VM can do if it can't get the\nmemory it needs, but we at least detect that and abort the process immediately\ninstead of returning a `NULL` pointer and letting it go off the rails later.\n\n<aside name=\"shrink\">\n\nSince all we passed in was a bare pointer to the first byte of memory, what does\nit mean to \"update\" the block's size? Under the hood, the memory allocator\nmaintains additional bookkeeping information for each block of heap-allocated\nmemory, including its size.\n\nGiven a pointer to some previously allocated memory, it can find this\nbookkeeping information, which is necessary to be able to cleanly free it. 
It's\nthis size metadata that `realloc()` updates.\n\nMany implementations of `malloc()` store the allocated size in memory right\n*before* the returned address.\n\n</aside>\n\nOK, we can create new chunks and write instructions to them. Are we done? Nope!\nWe're in C now, remember, we have to manage memory ourselves, like in Ye Olden\nTimes, and that means *freeing* it too.\n\n^code free-chunk-h (1 before, 1 after)\n\nThe implementation is:\n\n^code free-chunk\n\nWe deallocate all of the memory and then call `initChunk()` to zero out the\nfields leaving the chunk in a well-defined empty state. To free the memory, we\nadd one more macro.\n\n^code free-array (3 before, 2 after)\n\nLike `GROW_ARRAY()`, this is a wrapper around a call to `reallocate()`. This one\nfrees the memory by passing in zero for the new size. I know, this is a lot of\nboring low-level stuff. Don't worry, we'll get a lot of use out of these in\nlater chapters and will get to program at a higher level. Before we can do that,\nthough, we gotta lay our own foundation.\n\n## Disassembling Chunks\n\nNow we have a little module for creating chunks of bytecode. Let's try it out by\nhand-building a sample chunk.\n\n^code main-chunk (1 before, 1 after)\n\nDon't forget the include.\n\n^code main-include-chunk (1 before, 2 after)\n\nRun that and give it a try. Did it work? Uh... who knows? All we've done is push\nsome bytes around in memory. We have no human-friendly way to see what's\nactually inside that chunk we made.\n\nTo fix this, we're going to create a **disassembler**. An **assembler** is an\nold-school program that takes a file containing human-readable mnemonic names\nfor CPU instructions like \"ADD\" and \"MULT\" and translates them to their binary\nmachine code equivalent. A *dis*assembler goes in the other direction -- given a\nblob of machine code, it spits out a textual listing of the instructions.\n\nWe'll implement something <span name=\"printer\">similar</span>. 
Given a chunk, it\nwill print out all of the instructions in it. A Lox *user* won't use this, but\nwe Lox *maintainers* will certainly benefit since it gives us a window into the\ninterpreter's internal representation of code.\n\n<aside name=\"printer\">\n\nIn jlox, our analogous tool was the [AstPrinter class][].\n\n[astprinter class]: representing-code.html#a-not-very-pretty-printer\n\n</aside>\n\nIn `main()`, after we create the chunk, we pass it to the disassembler.\n\n^code main-disassemble-chunk (2 before, 1 after)\n\nAgain, we whip up <span name=\"module\">yet another</span> module.\n\n<aside name=\"module\">\n\nI promise you we won't be creating this many new files in later chapters.\n\n</aside>\n\n^code main-include-debug (1 before, 2 after)\n\nHere's that header:\n\n^code debug-h\n\nIn `main()`, we call `disassembleChunk()` to disassemble all of the instructions\nin the entire chunk. That's implemented in terms of the other function, which\njust disassembles a single instruction. It shows up here in the header because\nwe'll call it from the VM in later chapters.\n\nHere's a start at the implementation file:\n\n^code debug-c\n\nTo disassemble a chunk, we print a little header (so we can tell *which* chunk\nwe're looking at) and then crank through the bytecode, disassembling each\ninstruction. The way we iterate through the code is a little odd. Instead of\nincrementing `offset` in the loop, we let `disassembleInstruction()` do it for\nus. When we call that function, after disassembling the instruction at the given\noffset, it returns the offset of the *next* instruction. This is because, as\nwe'll see later, instructions can have different sizes.\n\nThe core of the \"debug\" module is this function:\n\n^code disassemble-instruction\n\nFirst, it prints the byte offset of the given instruction -- that tells us where\nin the chunk this instruction is. 
This will be a helpful signpost when we start\ndoing control flow and jumping around in the bytecode.\n\nNext, it reads a single byte from the bytecode at the given offset. That's our\nopcode. We <span name=\"switch\">switch</span> on that. For each kind of\ninstruction, we dispatch to a little utility function for displaying it. On the\noff chance that the given byte doesn't look like an instruction at all -- a bug\nin our compiler -- we print that too. For the one instruction we do have,\n`OP_RETURN`, the display function is:\n\n<aside name=\"switch\">\n\nWe have only one instruction right now, but this switch will grow throughout the\nrest of the book.\n\n</aside>\n\n^code simple-instruction\n\nThere isn't much to a return instruction, so all it does is print the name of\nthe opcode, then return the next byte offset past this instruction. Other\ninstructions will have more going on.\n\nIf we run our nascent interpreter now, it actually prints something:\n\n```text\n== test chunk ==\n0000 OP_RETURN\n```\n\nIt worked! This is sort of the \"Hello, world!\" of our code representation. We\ncan create a chunk, write an instruction to it, and then extract that\ninstruction back out. Our encoding and decoding of the binary bytecode is\nworking.\n\n## Constants\n\nNow that we have a rudimentary chunk structure working, let's start making it\nmore useful. We can store *code* in chunks, but what about *data*? Many values\nthe interpreter works with are created at runtime as the result of operations.\n\n```lox\n1 + 2;\n```\n\nThe value 3 appears nowhere in the code here. However, the literals `1` and `2`\ndo. To compile that statement to bytecode, we need some sort of instruction that\nmeans \"produce a constant\" and those literal values need to get stored in the\nchunk somewhere. In jlox, the Expr.Literal AST node held the value. 
We need a\ndifferent solution now that we don't have a syntax tree.\n\n### Representing values\n\nWe won't be *running* any code in this chapter, but since constants have a foot\nin both the static and dynamic worlds of our interpreter, they force us to start\nthinking at least a little bit about how our VM should represent values.\n\nFor now, we're going to start as simple as possible -- we'll support only\ndouble-precision, floating-point numbers. This will obviously expand over time,\nso we'll set up a new module to give ourselves room to grow.\n\n^code value-h\n\nThis typedef abstracts how Lox values are concretely represented in C. That way,\nwe can change that representation without needing to go back and fix existing\ncode that passes around values.\n\nBack to the question of where to store constants in a chunk. For small\nfixed-size values like integers, many instruction sets store the value directly\nin the code stream right after the opcode. These are called **immediate\ninstructions** because the bits for the value are immediately after the opcode.\n\nThat doesn't work well for large or variable-sized constants like strings. In a\nnative compiler to machine code, those bigger constants get stored in a separate\n\"constant data\" region in the binary executable. Then, the instruction to load a\nconstant has an address or offset pointing to where the value is stored in that\nsection.\n\nMost virtual machines do something similar. For example, the Java Virtual\nMachine [associates a **constant pool**][jvm const] with each compiled class.\nThat sounds good enough for clox to me. Each chunk will carry with it a list of\nthe values that appear as literals in the program. 
To keep things <span\nname=\"immediate\">simpler</span>, we'll put *all* constants in there, even simple\nintegers.\n\n[jvm const]: https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html#jvms-4.4\n\n<aside name=\"immediate\">\n\nIn addition to needing two kinds of constant instructions -- one for immediate\nvalues and one for constants in the constant table -- immediates also force us\nto worry about alignment, padding, and endianness. Some architectures aren't\nhappy if you try to, say, stuff a 4-byte integer at an odd address.\n\n</aside>\n\n### Value arrays\n\nThe constant pool is an array of values. The instruction to load a constant\nlooks up the value by index in that array. As with our <span\nname=\"generic\">bytecode</span> array, the compiler doesn't know how big the\narray needs to be ahead of time. So, again, we need a dynamic one. Since C\ndoesn't have generic data structures, we'll write another dynamic array data\nstructure, this time for Value.\n\n<aside name=\"generic\">\n\nDefining a new struct and manipulation functions each time we need a dynamic\narray of a different type is a chore. We could cobble together some preprocessor\nmacros to fake generics, but that's overkill for clox. We won't need many more\nof these.\n\n</aside>\n\n^code value-array (1 before, 2 after)\n\nAs with the bytecode array in Chunk, this struct wraps a pointer to an array\nalong with its allocated capacity and the number of elements in use. We also\nneed the same three functions to work with value arrays.\n\n^code array-fns-h (1 before, 2 after)\n\nThe implementations will probably give you déjà vu. 
First, to create a new one:\n\n^code value-c\n\nOnce we have an initialized array, we can start <span name=\"add\">adding</span>\nvalues to it.\n\n<aside name=\"add\">\n\nFortunately, we don't need other operations like insertion and removal.\n\n</aside>\n\n^code write-value-array\n\nThe memory-management macros we wrote earlier do let us reuse some of the logic\nfrom the code array, so this isn't too bad. Finally, to release all memory used\nby the array:\n\n^code free-value-array\n\nNow that we have growable arrays of values, we can add one to Chunk to store the\nchunk's constants.\n\n^code chunk-constants (1 before, 1 after)\n\nDon't forget the include.\n\n^code chunk-h-include-value (1 before, 2 after)\n\nAh, C, and its Stone Age modularity story. Where were we? Right. When we\ninitialize a new chunk, we initialize its constant list too.\n\n^code chunk-init-constant-array (1 before, 1 after)\n\nLikewise, we free the constants when we free the chunk.\n\n^code chunk-free-constants (1 before, 1 after)\n\nNext, we define a convenience method to add a new constant to the chunk. Our\nyet-to-be-written compiler could write to the constant array inside Chunk\ndirectly -- it's not like C has private fields or anything -- but it's a little\nnicer to add an explicit function.\n\n^code add-constant-h (1 before, 2 after)\n\nThen we implement it.\n\n^code add-constant\n\nAfter we add the constant, we return the index where the constant was appended\nso that we can locate that same constant later.\n\n### Constant instructions\n\nWe can *store* constants in chunks, but we also need to *execute* them. In a\npiece of code like:\n\n```lox\nprint 1;\nprint 2;\n```\n\nThe compiled chunk needs to not only contain the values 1 and 2, but know *when*\nto produce them so that they are printed in the right order. 
Thus, we need an\ninstruction that produces a particular constant.\n\n^code op-constant (1 before, 1 after)\n\nWhen the VM executes a constant instruction, it <span name=\"load\">\"loads\"</span>\nthe constant for use. This new instruction is a little more complex than\n`OP_RETURN`. In the above example, we load two different constants. A single\nbare opcode isn't enough to know *which* constant to load.\n\n<aside name=\"load\">\n\nI'm being vague about what it means to \"load\" or \"produce\" a constant because we\nhaven't learned how the virtual machine actually executes code at runtime yet.\nFor that, you'll have to wait until you get to (or skip ahead to, I suppose) the\n[next chapter][vm].\n\n[vm]: a-virtual-machine.html\n\n</aside>\n\nTo handle cases like this, our bytecode -- like most others -- allows\ninstructions to have <span name=\"operand\">**operands**</span>. These are stored\nas binary data immediately after the opcode in the instruction stream and let us\nparameterize what the instruction does.\n\n<img src=\"image/chunks-of-bytecode/format.png\" alt=\"OP_CONSTANT is a byte for\nthe opcode followed by a byte for the constant index.\" />\n\nEach opcode determines how many operand bytes it has and what they mean. For\nexample, a simple operation like \"return\" may have no operands, where an\ninstruction for \"load local variable\" needs an operand to identify which\nvariable to load. Each time we add a new opcode to clox, we specify what its\noperands look like -- its **instruction format**.\n\n<aside name=\"operand\">\n\nBytecode instruction operands are *not* the same as the operands passed to an\narithmetic operator. You'll see when we get to expressions that arithmetic\noperand values are tracked separately. 
Instruction operands are a lower-level\nnotion that modifies how the bytecode instruction itself behaves.\n\n</aside>\n\nIn this case, `OP_CONSTANT` takes a single byte operand that specifies which\nconstant to load from the chunk's constant array. Since we don't have a compiler\nyet, we \"hand-compile\" an instruction in our test chunk.\n\n^code main-constant (1 before, 1 after)\n\nWe add the constant value itself to the chunk's constant pool. That returns the\nindex of the constant in the array. Then we write the constant instruction,\nstarting with its opcode. After that, we write the one-byte constant index\noperand. Note that `writeChunk()` can write opcodes or operands. It's all raw\nbytes as far as that function is concerned.\n\nIf we try to run this now, the disassembler is going to yell at us because it\ndoesn't know how to decode the new instruction. Let's fix that.\n\n^code disassemble-constant (1 before, 1 after)\n\nThis instruction has a different instruction format, so we write a new helper\nfunction to disassemble it.\n\n^code constant-instruction\n\nThere's more going on here. As with `OP_RETURN`, we print out the name of the\nopcode. Then we pull out the constant index from the subsequent byte in the\nchunk. We print that index, but that isn't super useful to us human readers. So\nwe also look up the actual constant value -- since constants *are* known at\ncompile time after all -- and display the value itself too.\n\nThis requires some way to print a clox Value. That function will live in the\n\"value\" module, so we include that.\n\n^code debug-include-value (1 before, 2 after)\n\nOver in that header, we declare:\n\n^code print-value-h (1 before, 2 after)\n\nAnd here's an implementation:\n\n^code print-value\n\nMagnificent, right? 
As you can imagine, this is going to get more complex once\nwe add dynamic typing to Lox and have values of different types.\n\nBack in `constantInstruction()`, the only remaining piece is the return value.\n\n^code return-after-operand (1 before, 1 after)\n\nRemember that `disassembleInstruction()` also returns a number to tell the\ncaller the offset of the beginning of the *next* instruction. Where `OP_RETURN`\nwas only a single byte, `OP_CONSTANT` is two -- one for the opcode and one for\nthe operand.\n\n## Line Information\n\nChunks contain almost all of the information that the runtime needs from the\nuser's source code. It's kind of crazy to think that we can reduce all of the\ndifferent AST classes that we created in jlox down to an array of bytes and an\narray of constants. There's only one piece of data we're missing. We need it,\neven though the user hopes to never see it.\n\nWhen a runtime error occurs, we show the user the line number of the offending\nsource code. In jlox, those numbers live in tokens, which we in turn store in\nthe AST nodes. We need a different solution for clox now that we've ditched\nsyntax trees in favor of bytecode. Given any bytecode instruction, we need to be\nable to determine the line of the user's source program that it was compiled\nfrom.\n\nThere are a lot of clever ways we could encode this. I took the absolute <span\nname=\"side\">simplest</span> approach I could come up with, even though it's\nembarrassingly inefficient with memory. In the chunk, we store a separate array\nof integers that parallels the bytecode. Each number in the array is the line\nnumber for the corresponding byte in the bytecode. When a runtime error occurs,\nwe look up the line number at the same index as the current instruction's offset\nin the code array.\n\n<aside name=\"side\">\n\nThis braindead encoding does do one thing right: it keeps the line information\nin a *separate* array instead of interleaving it in the bytecode itself. 
Since\nline information is only used when a runtime error occurs, we don't want it\nbetween the instructions, taking up precious space in the CPU cache and causing\nmore cache misses as the interpreter skips past it to get to the opcodes and\noperands it cares about.\n\n</aside>\n\nTo implement this, we add another array to Chunk.\n\n^code chunk-lines (1 before, 1 after)\n\nSince it exactly parallels the bytecode array, we don't need a separate count or\ncapacity. Every time we touch the code array, we make a corresponding change to\nthe line number array, starting with initialization.\n\n^code chunk-null-lines (1 before, 1 after)\n\nAnd likewise deallocation:\n\n^code chunk-free-lines (1 before, 1 after)\n\nWhen we write a byte of code to the chunk, we need to know what source line it\ncame from, so we add an extra parameter in the declaration of `writeChunk()`.\n\n^code write-chunk-with-line-h (1 before, 1 after)\n\nAnd in the implementation:\n\n^code write-chunk-with-line (1 after)\n\nWhen we allocate or grow the code array, we do the same for the line info too.\n\n^code write-chunk-line (2 before, 1 after)\n\nFinally, we store the line number in the array.\n\n^code chunk-write-line (1 before, 1 after)\n\n### Disassembling line information\n\nAlright, let's try this out with our little, uh, artisanal chunk. First, since\nwe added a new parameter to `writeChunk()`, we need to fix those calls to pass\nin some -- arbitrary at this point -- line number.\n\n^code main-chunk-line (1 before, 2 after)\n\nOnce we have a real front end, of course, the compiler will track the current\nline as it parses and pass that in.\n\nNow that we have line information for every instruction, let's put it to good\nuse. In our disassembler, it's helpful to show which source line each\ninstruction was compiled from. That gives us a way to map back to the original\ncode when we're trying to figure out what some blob of bytecode is supposed to\ndo. 
After printing the offset of the instruction -- the number of bytes from the\nbeginning of the chunk -- we show its source line.\n\n^code show-location (2 before, 2 after)\n\nBytecode instructions tend to be pretty fine-grained. A single line of source\ncode often compiles to a whole sequence of instructions. To make that more\nvisually clear, we show a `|` for any instruction that comes from the same\nsource line as the preceding one. The resulting output for our handwritten\nchunk looks like:\n\n```text\n== test chunk ==\n0000  123 OP_CONSTANT         0 '1.2'\n0002    | OP_RETURN\n```\n\nWe have a three-byte chunk. The first two bytes are a constant instruction that\nloads 1.2 from the chunk's constant pool. The first byte is the `OP_CONSTANT`\nopcode and the second is the index in the constant pool. The third byte (at\noffset 2) is a single-byte return instruction.\n\nIn the remaining chapters, we will flesh this out with lots more kinds of\ninstructions. But the basic structure is here, and we have everything we need\nnow to completely represent an executable piece of code at runtime in our\nvirtual machine. Remember that whole family of AST classes we defined in jlox?\nIn clox, we've reduced that down to three arrays: bytes of code, constant\nvalues, and line information for debugging.\n\nThis reduction is a key reason why our new interpreter will be faster than jlox.\nYou can think of bytecode as a sort of compact serialization of the AST, highly\noptimized for how the interpreter will deserialize it in the order it needs as\nit executes. In the [next chapter][vm], we will see how the virtual machine does\nexactly that.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  Our encoding of line information is hilariously wasteful of memory. 
Given\n    that a series of instructions often correspond to the same source line, a\n    natural solution is something akin to [run-length encoding][rle] of the line\n    numbers.\n\n    Devise an encoding that compresses the line information for a\n    series of instructions on the same line. Change `writeChunk()` to write this\n    compressed form, and implement a `getLine()` function that, given the index\n    of an instruction, determines the line where the instruction occurs.\n\n    *Hint: It's not necessary for `getLine()` to be particularly efficient.\n    Since it is called only when a runtime error occurs, it is well off the\n    critical path where performance matters.*\n\n2.  Because `OP_CONSTANT` uses only a single byte for its operand, a chunk may\n    only contain up to 256 different constants. That's small enough that people\n    writing real-world code will hit that limit. We could use two or more bytes\n    to store the operand, but that makes *every* constant instruction take up\n    more space. Most chunks won't need that many unique constants, so that\n    wastes space and sacrifices some locality in the common case to support the\n    rare case.\n\n    To balance those two competing aims, many instruction sets feature multiple\n    instructions that perform the same operation but with operands of different\n    sizes. Leave our existing one-byte `OP_CONSTANT` instruction alone, and\n    define a second `OP_CONSTANT_LONG` instruction. It stores the operand as a\n    24-bit number, which should be plenty.\n\n    Implement this function:\n\n    ```c\n    void writeConstant(Chunk* chunk, Value value, int line) {\n      // Implement me...\n    }\n    ```\n\n    It adds `value` to `chunk`'s constant array and then writes an appropriate\n    instruction to load the constant. Also add support to the disassembler for\n    `OP_CONSTANT_LONG` instructions.\n\n    Defining two instructions seems to be the best of both worlds. 
What\n    sacrifices, if any, does it force on us?\n\n3.  Our `reallocate()` function relies on the C standard library for dynamic\n    memory allocation and freeing. `malloc()` and `free()` aren't magic. Find\n    a couple of open source implementations of them and explain how they work.\n    How do they keep track of which bytes are allocated and which are free?\n    What is required to allocate a block of memory? Free it? How do they make\n    that efficient? What do they do about fragmentation?\n\n    *Hardcore mode:* Implement `reallocate()` without calling `realloc()`,\n    `malloc()`, or `free()`. You are allowed to call `malloc()` *once*, at the\n    beginning of the interpreter's execution, to allocate a single big block of\n    memory, which your `reallocate()` function has access to. It parcels out\n    blobs of memory from that single region, your own personal heap. It's your\n    job to define how it does that.\n\n</div>\n\n[rle]: https://en.wikipedia.org/wiki/Run-length_encoding\n\n<div class=\"design-note\">\n\n## Design Note: Test Your Language\n\nWe're almost halfway through the book and one thing we haven't talked about is\n*testing* your language implementation. That's not because testing isn't\nimportant. I can't possibly stress enough how vital it is to have a good,\ncomprehensive test suite for your language.\n\nI wrote a [test suite for Lox][tests] (which you are welcome to use on your own\nLox implementation) before I wrote a single word of this book. 
Those tests found\ncountless bugs in my implementations.\n\n[tests]: https://github.com/munificent/craftinginterpreters/tree/master/test\n\nTests are important in all software, but they're even more important for a\nprogramming language for at least a couple of reasons:\n\n*   **Users expect their programming languages to be rock solid.** We are so\n    used to mature, stable compilers and interpreters that \"It's your code, not\n    the compiler\" is [an ingrained part of software culture][fault]. If there\n    are bugs in your language implementation, users will go through the full\n    five stages of grief before they can figure out what's going on, and you\n    don't want to put them through all that.\n\n*   **A language implementation is a deeply interconnected piece of software.**\n    Some codebases are broad and shallow. If the file loading code is broken in\n    your text editor, it -- hopefully! -- won't cause failures in the text\n    rendering on screen. Language implementations are narrower and deeper,\n    especially the core of the interpreter that handles the language's actual\n    semantics. That makes it easy for subtle bugs to creep in caused by weird\n    interactions between various parts of the system. It takes good tests to\n    flush those out.\n\n*   **The input to a language implementation is, by design, combinatorial.**\n    There are an infinite number of possible programs a user could write, and\n    your implementation needs to run them all correctly. You obviously can't\n    test that exhaustively, but you need to work hard to cover as much of the\n    input space as you can.\n\n*   **Language implementations are often complex, constantly changing, and full\n    of optimizations.** That leads to gnarly code with lots of dark corners\n    where bugs can hide.\n\n[fault]: https://blog.codinghorror.com/the-first-rule-of-programming-its-always-your-fault/\n\nAll of that means you're gonna want a lot of tests. But *what* tests? 
Projects\nI've seen focus mostly on end-to-end \"language tests\". Each test is a program\nwritten in the language along with the output or errors it is expected to\nproduce. Then you have a test runner that pushes the test program through your\nlanguage implementation and validates that it does what it's supposed to.\nWriting your tests in the language itself has a few nice advantages:\n\n*   The tests aren't coupled to any particular API or internal architecture\n    decisions of the implementation. This frees you to reorganize or rewrite\n    parts of your interpreter or compiler without needing to update a slew of\n    tests.\n\n*   You can use the same tests for multiple implementations of the language.\n\n*   Tests can often be terse and easy to read and maintain since they are\n    simply scripts in your language.\n\nIt's not all rosy, though:\n\n*   End-to-end tests help you determine *if* there is a bug, but not *where* the\n    bug is. It can be harder to figure out where the erroneous code in the\n    implementation is because all the test tells you is that the right output\n    didn't appear.\n\n*   It can be a chore to craft a valid program that tickles some obscure corner\n    of the implementation. This is particularly true for highly optimized\n    compilers where you may need to write convoluted code to ensure that you\n    end up on just the right optimization path where a bug may be hiding.\n\n*   The overhead can be high to fire up the interpreter, parse, compile, and\n    run each test script. With a big suite of tests -- which you *do* want,\n    remember -- that can mean a lot of time spent waiting for the tests to\n    finish running.\n\nI could go on, but I don't want this to turn into a sermon. Also, I don't\npretend to be an expert on *how* to test languages. I just want you to\ninternalize how important it is *that* you test yours. Seriously. Test your\nlanguage. You'll thank me for it.\n\n</div>\n"
  },
  {
    "path": "book/classes-and-instances.md",
    "content": "> Caring too much for objects can destroy you. Only -- if you care for a thing\n> enough, it takes on a life of its own, doesn't it? And isn’t the whole point\n> of things -- beautiful things -- that they connect you to some larger beauty?\n>\n> <cite>Donna Tartt, <em>The Goldfinch</em></cite>\n\nThe last area left to implement in clox is object-oriented programming. <span\nname=\"oop\">OOP</span> is a bundle of intertwined features: classes, instances,\nfields, methods, initializers, and inheritance. Using relatively high-level\nJava, we packed all that into two chapters. Now that we're coding in C, which\nfeels like building a model of the Eiffel tower out of toothpicks, we'll devote\nthree chapters to covering the same territory. This makes for a leisurely stroll\nthrough the implementation. After strenuous chapters like [closures][] and the\n[garbage collector][], you have earned a rest. In fact, the book should be easy\nfrom here on out.\n\n<aside name=\"oop\">\n\nPeople who have strong opinions about object-oriented programming -- read\n\"everyone\" -- tend to assume OOP means some very specific list of language\nfeatures, but really there's a whole space to explore, and each language has its\nown ingredients and recipes.\n\nSelf has objects but no classes. CLOS has methods but doesn't attach them to\nspecific classes. C++ initially had no runtime polymorphism -- no virtual\nmethods. Python has multiple inheritance, but Java does not. Ruby attaches\nmethods to classes, but you can also define methods on a single object.\n\n</aside>\n\nIn this chapter, we cover the first three features: classes, instances, and\nfields. This is the stateful side of object orientation. Then in the next two\nchapters, we will hang behavior and code reuse off of those objects.\n\n[closures]: closures.html\n[garbage collector]: garbage-collection.html\n\n## Class Objects\n\nIn a class-based object-oriented language, everything begins with classes. 
They\ndefine what sorts of objects exist in the program and are the factories used to\nproduce new instances. Going bottom-up, we'll start with their runtime\nrepresentation and then hook that into the language.\n\nBy this point, we're well-acquainted with the process of adding a new object\ntype to the VM. We start with a struct.\n\n^code obj-class (1 before, 2 after)\n\nAfter the Obj header, we store the class's name. This isn't strictly needed for\nthe user's program, but it lets us show the name at runtime for things like\nstack traces.\n\nThe new type needs a corresponding case in the ObjType enum.\n\n^code obj-type-class (1 before, 1 after)\n\nAnd that type gets a corresponding pair of macros. First, for testing an\nobject's type:\n\n^code is-class (2 before, 1 after)\n\nAnd then for casting a Value to an ObjClass pointer:\n\n^code as-class (2 before, 1 after)\n\nThe VM creates new class objects using this function:\n\n^code new-class-h (2 before, 1 after)\n\nThe implementation lives over here:\n\n^code new-class\n\nPretty much all boilerplate. It takes in the class's name as a string and stores\nit. Every time the user declares a new class, the VM will create a new one of\nthese ObjClass structs to represent it.\n\n<aside name=\"klass\">\n\n<img src=\"image/classes-and-instances/klass.png\" alt=\"'Klass' in a zany kidz font.\"/>\n\nI named the variable \"klass\" not just to give the VM a zany preschool \"Kidz\nKorner\" feel. 
It makes it easier to get clox compiling as C++ where \"class\" is\na reserved word.\n\n</aside>\n\nWhen the VM no longer needs a class, it frees it like so:\n\n^code free-class (1 before, 1 after)\n\n<aside name=\"braces\">\n\nThe braces here are pointless now, but will be useful in the next chapter when\nwe add some more code to the switch case.\n\n</aside>\n\nWe have a memory manager now, so we also need to support tracing through class\nobjects.\n\n^code blacken-class (1 before, 1 after)\n\nWhen the GC reaches a class object, it marks the class's name to keep that\nstring alive too.\n\nThe last operation the VM can perform on a class is printing it.\n\n^code print-class (1 before, 1 after)\n\nA class simply says its own name.\n\n## Class Declarations\n\nRuntime representation in hand, we are ready to add support for classes to the\nlanguage. Next, we move into the parser.\n\n^code match-class (1 before, 1 after)\n\nClass declarations are statements, and the parser recognizes one by the leading\n`class` keyword. The rest of the compilation happens over here:\n\n^code class-declaration\n\nImmediately after the `class` keyword is the class's name. We take that\nidentifier and add it to the surrounding function's constant table as a string.\nAs you just saw, printing a class shows its name, so the compiler needs to stuff\nthe name string somewhere that the runtime can find. The constant table is the\nway to do that.\n\nThe class's <span name=\"variable\">name</span> is also used to bind the class\nobject to a variable of the same name. So we declare a variable with that\nidentifier right after consuming its token.\n\n<aside name=\"variable\">\n\nWe could have made class declarations be *expressions* instead of statements --\nthey are essentially a literal that produces a value after all. Then users would\nhave to explicitly bind the class to a variable themselves like:\n\n```lox\nvar Pie = class {}\n```\n\nSort of like lambda functions but for classes. 
But since we generally want\nclasses to be named anyway, it makes sense to treat them as declarations.\n\n</aside>\n\nNext, we emit a new instruction to actually create the class object at runtime.\nThat instruction takes the constant table index of the class's name as an\noperand.\n\nAfter that, but before compiling the body of the class, we define the variable\nfor the class's name. *Declaring* the variable adds it to the scope, but recall\nfrom [a previous chapter][scope] that we can't *use* the variable until it's\n*defined*. For classes, we define the variable before the body. That way, users\ncan refer to the containing class inside the bodies of its own methods. That's\nuseful for things like factory methods that produce new instances of the class.\n\n[scope]: local-variables.html#another-scope-edge-case\n\nFinally, we compile the body. We don't have methods yet, so right now it's\nsimply an empty pair of braces. Lox doesn't require fields to be declared in the\nclass, so we're done with the body -- and the parser -- for now.\n\nThe compiler is emitting a new instruction, so let's define that.\n\n^code class-op (1 before, 1 after)\n\nAnd add it to the disassembler:\n\n^code disassemble-class (2 before, 1 after)\n\nFor such a large-seeming feature, the interpreter support is minimal.\n\n^code interpret-class (2 before, 1 after)\n\nWe load the string for the class's name from the constant table and pass that to\n`newClass()`. That creates a new class object with the given name. We push that\nonto the stack and we're good. If the class is bound to a global variable, then\nthe compiler's call to `defineVariable()` will emit code to store that object\nfrom the stack into the global variable table. Otherwise, it's right where it\nneeds to be on the stack for a new <span name=\"local\">local</span> variable.\n\n<aside name=\"local\">\n\n\"Local\" classes -- classes declared inside the body of a function or\nblock -- are an unusual concept. 
Many languages don't allow them at all. But since Lox is a\ndynamically typed scripting language, it treats the top level of a program and\nthe bodies of functions and blocks uniformly. Classes are just another kind of\ndeclaration, and since you can declare variables and functions inside blocks,\nyou can declare classes in there too.\n\n</aside>\n\nThere you have it, our VM supports classes now. You can run this:\n\n```lox\nclass Brioche {}\nprint Brioche;\n```\n\nUnfortunately, printing is about *all* you can do with classes, so next is\nmaking them more useful.\n\n## Instances of Classes\n\nClasses serve two main purposes in a language:\n\n*   **They are how you create new instances.** Sometimes this involves a `new`\n    keyword, other times it's a method call on the class object, but you usually\n    mention the class by name *somehow* to get a new instance.\n\n*   **They contain methods.** These define how all instances of the class\n    behave.\n\nWe won't get to methods until the next chapter, so for now we will only worry\nabout the first part. Before classes can create instances, we need a\nrepresentation for them.\n\n^code obj-instance (1 before, 2 after)\n\nInstances know their class -- each instance has a pointer to the class that it\nis an instance of.  We won't use this much in this chapter, but it will become\ncritical when we add methods.\n\nMore important to this chapter is how instances store their state. Lox lets\nusers freely add fields to an instance at runtime. This means we need a storage\nmechanism that can grow. We could use a dynamic array, but we also want to look\nup fields by name as quickly as possible. There's a data structure that's just\nperfect for quickly accessing a set of values by name and\n-- even more conveniently -- we've already implemented it. 
Each instance stores\nits fields using a hash table.\n\n<aside name=\"fields\">\n\nBeing able to freely add fields to an object at runtime is a big practical\ndifference between most dynamic and static languages. Statically typed languages\nusually require fields to be explicitly declared. This way, the compiler knows\nexactly what fields each instance has. It can use that to determine the precise\namount of memory needed for each instance and the offsets in that memory where\neach field can be found.\n\nIn Lox and other dynamic languages, accessing a field is usually a hash table\nlookup. Constant time, but still pretty heavyweight. In a language like C++,\naccessing a field is as fast as offsetting a pointer by an integer constant.\n\n</aside>\n\nWe only need to add an include, and we've got it.\n\n^code object-include-table (1 before, 1 after)\n\nThis new struct gets a new object type.\n\n^code obj-type-instance (1 before, 1 after)\n\nI want to slow down a bit here because the Lox *language's* notion of \"type\" and\nthe VM *implementation's* notion of \"type\" brush against each other in ways that\ncan be confusing. Inside the C code that makes clox, there are a number of\ndifferent types of Obj -- ObjString, ObjClosure, etc. Each has its own internal\nrepresentation and semantics.\n\nIn the Lox *language*, users can define their own classes -- say Cake and Pie --\nand then create instances of those classes. From the user's perspective, an\ninstance of Cake is a different type of object than an instance of Pie. But,\nfrom the VM's perspective, every class the user defines is simply another value\nof type ObjClass. Likewise, each instance in the user's program, no matter what\nclass it is an instance of, is an ObjInstance. That one VM object type covers\ninstances of all classes. 
The two worlds map to each other something like this:\n\n<img src=\"image/classes-and-instances/lox-clox.png\" alt=\"A set of class declarations and instances, and the runtime representations each maps to.\"/>\n\nGot it? OK, back to the implementation. We also get our usual macros.\n\n^code is-instance (1 before, 1 after)\n\nAnd:\n\n^code as-instance (1 before, 1 after)\n\nSince fields are added after the instance is created, the \"constructor\" function\nonly needs to know the class.\n\n^code new-instance-h (1 before, 1 after)\n\nWe implement that function here:\n\n^code new-instance\n\nWe store a reference to the instance's class. Then we initialize the field\ntable to an empty hash table. A new baby object is born!\n\nAt the sadder end of the instance's lifespan, it gets freed.\n\n^code free-instance (3 before, 1 after)\n\nThe instance owns its field table so when freeing the instance, we also free the\ntable. We don't explicitly free the entries *in* the table, because there may\nbe other references to those objects. The garbage collector will take care of\nthose for us. Here we free only the entry array of the table itself.\n\nSpeaking of the garbage collector, it needs support for tracing through\ninstances.\n\n^code blacken-instance (3 before, 1 after)\n\nIf the instance is alive, we need to keep its class around. Also, we need to\nkeep every object referenced by the instance's fields. Most live objects that\nare not roots are reachable because some instance refers to the object in a\nfield. 
Fortunately, we already have a nice `markTable()` function to make\ntracing them easy.\n\nLess critical but still important is printing.\n\n^code print-instance (1 before, 1 after)\n\n<span name=\"print\">An</span> instance prints its class's name followed by\n\"instance\". (The \"instance\" part is mainly so that classes and instances don't\nprint the same.)\n\n<aside name=\"print\">\n\nMost object-oriented languages let a class define some sort of `toString()`\nmethod that lets the class specify how its instances are converted to a string\nand printed. If Lox were less of a toy language, I would want to support that\ntoo.\n\n</aside>\n\nThe real fun happens over in the interpreter. Lox has no special `new` keyword.\nThe way to create an instance of a class is to invoke the class itself as if it\nwere a function. The runtime already supports function calls, and it checks the\ntype of object being called to make sure the user doesn't try to invoke a number\nor other invalid type.\n\nWe extend that runtime checking with a new case.\n\n^code call-class (1 before, 1 after)\n\nIf the value being called -- the object that results when evaluating the\nexpression to the left of the opening parenthesis -- is a class, then we treat\nit as a constructor call. We <span name=\"args\">create</span> a new instance of\nthe called class and store the result on the stack.\n\n<aside name=\"args\">\n\nWe ignore any arguments passed to the call for now. We'll revisit this code in\nthe [next chapter][next] when we add support for initializers.\n\n[next]: methods-and-initializers.html\n\n</aside>\n\nWe're one step farther. Now we can define classes and create instances of them.\n\n```lox\nclass Brioche {}\nprint Brioche();\n```\n\nNote the parentheses after `Brioche` on the second line now. This prints\n\"Brioche instance\".\n\n## Get and Set Expressions\n\nOur object representation for instances can already store state, so all that\nremains is exposing that functionality to the user. 
Fields are accessed and\nmodified using get and set expressions. Not one to break with tradition, Lox\nuses the classic \"dot\" syntax:\n\n```lox\neclair.filling = \"pastry creme\";\nprint eclair.filling;\n```\n\nThe period -- full stop for my English friends -- works <span\nname=\"sort\">sort</span> of like an infix operator. There is an expression to the\nleft that is evaluated first and produces an instance. After that is the `.`\nfollowed by a field name. Since there is a preceding operand, we hook this into\nthe parse table as an infix expression.\n\n<aside name=\"sort\">\n\nI say \"sort of\" because the right-hand side after the `.` is not an expression,\nbut a single identifier whose semantics are handled by the get or set expression\nitself. It's really closer to a postfix expression.\n\n</aside>\n\n^code table-dot (1 before, 1 after)\n\nAs in other languages, the `.` operator binds tightly, with precedence as high\nas the parentheses in a function call. After the parser consumes the dot token,\nit dispatches to a new parse function.\n\n^code compile-dot\n\nThe parser expects to find a <span name=\"prop\">property</span> name immediately\nafter the dot. We load that token's lexeme into the constant table as a string\nso that the name is available at runtime.\n\n<aside name=\"prop\">\n\nThe compiler uses \"property\" instead of \"field\" here because, remember, Lox also\nlets you use dot syntax to access a method without calling it. \"Property\" is the\ngeneral term we use to refer to any named entity you can access on an instance.\nFields are the subset of properties that are backed by the instance's state.\n\n</aside>\n\nWe have two new expression forms -- getters and setters -- that this one\nfunction handles. If we see an equals sign after the field name, it must be a\nset expression that is assigning to a field. But we don't *always* allow an\nequals sign after the field to be compiled. 
Consider:\n\n```lox\na + b.c = 3\n```\n\nThis is syntactically invalid according to Lox's grammar, which means our Lox\nimplementation is obligated to detect and report the error. If `dot()` silently\nparsed the `= 3` part, we would incorrectly interpret the code as if the user\nhad written:\n\n```lox\na + (b.c = 3)\n```\n\nThe problem is that the `=` side of a set expression has much lower precedence\nthan the `.` part. The parser may call `dot()` in a context that is too high\nprecedence to permit a setter to appear. To avoid incorrectly allowing that, we\nparse and compile the equals part only when `canAssign` is true. If an equals\ntoken appears when `canAssign` is false, `dot()` leaves it alone and returns. In\nthat case, the compiler will eventually unwind up to `parsePrecedence()`, which\nstops at the unexpected `=` still sitting as the next token and reports an\nerror.\n\nIf we find an `=` in a context where it *is* allowed, then we compile the\nexpression that follows. After that, we emit a new <span\nname=\"set\">`OP_SET_PROPERTY`</span> instruction. That takes a single operand for\nthe index of the property name in the constant table. 
If we didn't compile a set\nexpression, we assume it's a getter and emit an `OP_GET_PROPERTY` instruction,\nwhich also takes an operand for the property name.\n\n<aside name=\"set\">\n\nYou can't *set* a non-field property, so I suppose that instruction could have\nbeen `OP_SET_FIELD`, but I thought it looked nicer to be consistent with the get\ninstruction.\n\n</aside>\n\nNow is a good time to define these two new instructions.\n\n^code property-ops (1 before, 1 after)\n\nAnd add support for disassembling them:\n\n^code disassemble-property-ops (1 before, 1 after)\n\n### Interpreting getter and setter expressions\n\nSliding over to the runtime, we'll start with get expressions since those are a\nlittle simpler.\n\n^code interpret-get-property (1 before, 1 after)\n\nWhen the interpreter reaches this instruction, the expression to the left of the\ndot has already been executed and the resulting instance is on top of the stack.\nWe read the field name from the constant pool and look it up in the instance's\nfield table. If the hash table contains an entry with that name, we pop the\ninstance and push the entry's value as the result.\n\nOf course, the field might not exist. In Lox, we've defined that to be a runtime\nerror. So we add a check for that and abort if it happens.\n\n^code get-undefined (3 before, 2 after)\n\n<span name=\"field\">There</span> is another failure mode to handle which you've\nprobably noticed. The above code assumes the expression to the left of the dot\ndid evaluate to an ObjInstance. But there's nothing preventing a user from\nwriting this:\n\n```lox\nvar obj = \"not an instance\";\nprint obj.field;\n```\n\nThe user's program is wrong, but the VM still has to handle it with some grace.\nRight now, it will misinterpret the bits of the ObjString as an ObjInstance and,\nI don't know, catch on fire or something definitely not graceful.\n\nIn Lox, only instances are allowed to have fields. You can't stuff a field onto\na string or number. 
So we need to check that the value is an instance before\naccessing any fields on it.\n\n<aside name=\"field\">\n\nLox *could* support adding fields to values of other types. It's our language\nand we can do what we want. But it's likely a bad idea. It significantly\ncomplicates the implementation in ways that hurt performance -- for example,\nstring interning gets a lot harder.\n\nAlso, it raises gnarly semantic questions around the equality and identity of\nvalues. If I attach a field to the number `3`, does the result of `1 + 2` have\nthat field as well? If so, how does the implementation track that? If not, are\nthose two resulting \"threes\" still considered equal?\n\n</aside>\n\n^code get-not-instance (1 before, 1 after)\n\nIf the value on the stack isn't an instance, we report a runtime error and\nsafely exit.\n\nOf course, get expressions are not very useful when no instances have any\nfields. For that we need setters.\n\n^code interpret-set-property (2 before, 1 after)\n\nThis is a little more complex than `OP_GET_PROPERTY`. When this executes, the\ntop of the stack has the instance whose field is being set and above that, the\nvalue to be stored. Like before, we read the instruction's operand and find the\nfield name string. Using that, we store the value on top of the stack into the\ninstance's field table.\n\nAfter that is a little <span name=\"stack\">stack</span> juggling. We pop the\nstored value off, then pop the instance, and finally push the value back on. In\nother words, we remove the *second* element from the stack while leaving the top\nalone. A setter is itself an expression whose result is the assigned value, so\nwe need to leave that value on the stack. 
Here's what I mean:\n\n<aside name=\"stack\">\n\nThe stack operations go like this:\n\n<img src=\"image/classes-and-instances/stack.png\" alt=\"Popping two values and then pushing the first value back on the stack.\"/>\n\n</aside>\n\n```lox\nclass Toast {}\nvar toast = Toast();\nprint toast.jam = \"grape\"; // Prints \"grape\".\n```\n\nUnlike when reading a field, we don't need to worry about the hash table not\ncontaining the field. A setter implicitly creates the field if needed. We do\nneed to handle the user incorrectly trying to store a field on a value that\nisn't an instance.\n\n^code set-not-instance (1 before, 1 after)\n\nExactly like with get expressions, we check the value's type and report a\nruntime error if it's invalid. And, with that, the stateful side of Lox's\nsupport for object-oriented programming is in place. Give it a try:\n\n```lox\nclass Pair {}\n\nvar pair = Pair();\npair.first = 1;\npair.second = 2;\nprint pair.first + pair.second; // 3.\n```\n\nThis doesn't really feel very *object*-oriented. It's more like a strange,\ndynamically typed variant of C where objects are loose struct-like bags of data.\nSort of a dynamic procedural language. But this is a big step in expressiveness.\nOur Lox implementation now lets users freely aggregate data into bigger units.\nIn the next chapter, we will breathe life into those inert blobs.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  Trying to access a non-existent field on an object immediately aborts the\n    entire VM. The user has no way to recover from this runtime error, nor is\n    there any way to see if a field exists *before* trying to access it. It's up\n    to the user to ensure on their own that only valid fields are read.\n\n    How do other dynamically typed languages handle missing fields? What do you\n    think Lox should do? Implement your solution.\n\n2.  Fields are accessed at runtime by their *string* name. 
But that name must\n    always appear directly in the source code as an *identifier token*. A user\n    program cannot imperatively build a string value and then use that as the\n    name of a field. Do you think they should be able to? Devise a language\n    feature that enables that and implement it.\n\n3.  Conversely, Lox offers no way to *remove* a field from an instance. You can\n    set a field's value to `nil`, but the entry in the hash table is still\n    there. How do other languages handle this? Choose and implement a strategy\n    for Lox.\n\n4.  Because fields are accessed by name at runtime, working with instance state\n    is slow. It's technically a constant-time operation -- thanks, hash tables\n    -- but the constant factors are relatively large. This is a major component\n    of why dynamic languages are slower than statically typed ones.\n\n    How do sophisticated implementations of dynamically typed languages cope\n    with and optimize this?\n\n</div>\n"
  },
  {
    "path": "book/classes.md",
    "content": "> One has no right to love or hate anything if one has not acquired a thorough\n> knowledge of its nature. Great love springs from great knowledge of the\n> beloved object, and if you know it but little you will be able to love it only\n> a little or not at all.\n>\n> <cite>Leonardo da Vinci</cite>\n\nWe're eleven chapters in, and the interpreter sitting on your machine is nearly\na complete scripting language. It could use a couple of built-in data structures\nlike lists and maps, and it certainly needs a core library for file I/O, user\ninput, etc. But the language itself is sufficient. We've got a little procedural\nlanguage in the same vein as BASIC, Tcl, Scheme (minus macros), and early\nversions of Python and Lua.\n\nIf this were the '80s, we'd stop here. But today, many popular languages support\n\"object-oriented programming\". Adding that to Lox will give users a familiar set\nof tools for writing larger programs. Even if you personally don't <span\nname=\"hate\">like</span> OOP, this chapter and [the next][inheritance] will help\nyou understand how others design and build object systems.\n\n[inheritance]: inheritance.html\n\n<aside name=\"hate\">\n\nIf you *really* hate classes, though, you can skip these two chapters. They are\nfairly isolated from the rest of the book. Personally, I find it's good to learn\nmore about the things I dislike. Things look simple at a distance, but as I get\ncloser, details emerge and I gain a more nuanced perspective.\n\n</aside>\n\n## OOP and Classes\n\nThere are three broad paths to object-oriented programming: classes,\n[prototypes][], and <span name=\"multimethods\">[multimethods][]</span>. Classes\ncame first and are the most popular style. With the rise of JavaScript (and to a\nlesser extent [Lua][]), prototypes are more widely known than they used to be.\nI'll talk more about those [later][]. 
For Lox, we're taking the, ahem, classic\napproach.\n\n[prototypes]: http://gameprogrammingpatterns.com/prototype.html\n[multimethods]: https://en.wikipedia.org/wiki/Multiple_dispatch\n[lua]: https://www.lua.org/pil/13.4.1.html\n[later]: #design-note\n\n<aside name=\"multimethods\">\n\nMultimethods are the approach you're least likely to be familiar with. I'd love\nto talk more about them -- I designed [a hobby language][magpie] around them\nonce and they are *super rad* -- but there are only so many pages I can fit in.\nIf you'd like to learn more, take a look at [CLOS][] (the object system in\nCommon Lisp), [Dylan][], [Julia][], or [Raku][].\n\n[clos]: https://en.wikipedia.org/wiki/Common_Lisp_Object_System\n[magpie]: http://magpie-lang.org/\n[dylan]: https://opendylan.org/\n[julia]: https://julialang.org/\n[raku]: https://docs.raku.org/language/functions#Multi-dispatch\n\n</aside>\n\nSince you've written about a thousand lines of Java code with me already, I'm\nassuming you don't need a detailed introduction to object orientation. The main\ngoal is to bundle data with the code that acts on it. Users do that by declaring\na *class* that:\n\n<span name=\"circle\"></span>\n\n1. Exposes a *constructor* to create and initialize new *instances* of the\n   class\n\n1. Provides a way to store and access *fields* on instances\n\n1. Defines a set of *methods* shared by all instances of the class that\n   operate on each instance's state.\n\nThat's about as minimal as it gets. Most object-oriented languages, all the way\nback to Simula, also do inheritance to reuse behavior across classes. We'll add\nthat in the [next chapter][inheritance]. Even kicking that out, we still have a\nlot to get through. 
This is a big chapter and everything doesn't quite come\ntogether until we have all of the above pieces, so gather your stamina.\n\n<aside name=\"circle\">\n\n<img src=\"image/classes/circle.png\" alt=\"The relationships between classes, methods, instances, constructors, and fields.\" />\n\nIt's like the circle of life, *sans* Sir Elton John.\n\n</aside>\n\n[inheritance]: inheritance.html\n\n## Class Declarations\n\nLike we do, we're gonna start with syntax. A `class` statement introduces a new\nname, so it lives in the `declaration` grammar rule.\n\n```ebnf\ndeclaration    → classDecl\n               | funDecl\n               | varDecl\n               | statement ;\n\nclassDecl      → \"class\" IDENTIFIER \"{\" function* \"}\" ;\n```\n\nThe new `classDecl` rule relies on the `function` rule we defined\n[earlier][function rule]. To refresh your memory:\n\n[function rule]: functions.html#function-declarations\n\n```ebnf\nfunction       → IDENTIFIER \"(\" parameters? \")\" block ;\nparameters     → IDENTIFIER ( \",\" IDENTIFIER )* ;\n```\n\nIn plain English, a class declaration is the `class` keyword, followed by the\nclass's name, then a curly-braced body. Inside that body is a list of method\ndeclarations. Unlike function declarations, methods don't have a leading <span\nname=\"fun\">`fun`</span> keyword. Each method is a name, parameter list, and\nbody. Here's an example:\n\n<aside name=\"fun\">\n\nNot that I'm trying to say methods aren't fun or anything.\n\n</aside>\n\n```lox\nclass Breakfast {\n  cook() {\n    print \"Eggs a-fryin'!\";\n  }\n\n  serve(who) {\n    print \"Enjoy your breakfast, \" + who + \".\";\n  }\n}\n```\n\nLike most dynamically typed languages, fields are not explicitly listed in the\nclass declaration. 
Instances are loose bags of data and you can freely add\nfields to them as you see fit using normal imperative code.\n\nOver in our AST generator, the `classDecl` grammar rule gets its own statement\n<span name=\"class-ast\">node</span>.\n\n^code class-ast (1 before, 1 after)\n\n<aside name=\"class-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-class].\n\n[appendix-class]: appendix-ii.html#class-statement\n\n</aside>\n\nIt stores the class's name and the methods inside its body. Methods are\nrepresented by the existing Stmt.Function class that we use for function\ndeclaration AST nodes. That gives us all the bits of state that we need for a\nmethod: name, parameter list, and body.\n\nA class can appear anywhere a named declaration is allowed, triggered by the\nleading `class` keyword.\n\n^code match-class (1 before, 1 after)\n\nThat calls out to:\n\n^code parse-class-declaration\n\nThere's more meat to this than most of the other parsing methods, but it roughly\nfollows the grammar. We've already consumed the `class` keyword, so we look for\nthe expected class name next, followed by the opening curly brace. Once inside\nthe body, we keep parsing method declarations until we hit the closing brace.\nEach method declaration is parsed by a call to `function()`, which we defined\nback in the [chapter where functions were introduced][functions].\n\n[functions]: functions.html\n\nLike we do in any open-ended loop in the parser, we also check for hitting the\nend of the file. 
That won't happen in correct code since a class should have a\nclosing brace at the end, but it ensures the parser doesn't get stuck in an\ninfinite loop if the user has a syntax error and forgets to correctly end the\nclass body.\n\nWe wrap the name and list of methods into a Stmt.Class node and we're done.\nPreviously, we would jump straight into the interpreter, but now we need to\nplumb the node through the resolver first.\n\n^code resolver-visit-class\n\nWe aren't going to worry about resolving the methods themselves yet, so for now\nall we need to do is declare the class using its name. It's not common to\ndeclare a class as a local variable, but Lox permits it, so we need to handle it\ncorrectly.\n\nNow we interpret the class declaration.\n\n^code interpreter-visit-class\n\nThis looks similar to how we execute function declarations. We declare the\nclass's name in the current environment. Then we turn the class *syntax node*\ninto a LoxClass, the *runtime* representation of a class. We circle back and\nstore the class object in the variable we previously declared. That two-stage\nvariable binding process allows references to the class inside its own methods.\n\nWe will refine it throughout the chapter, but the first draft of LoxClass looks\nlike this:\n\n^code lox-class\n\nLiterally a wrapper around a name. We don't even store the methods yet. Not\nsuper useful, but it does have a `toString()` method so we can write a trivial\nscript and test that class objects are actually being parsed and executed.\n\n```lox\nclass DevonshireCream {\n  serveOn() {\n    return \"Scones\";\n  }\n}\n\nprint DevonshireCream; // Prints \"DevonshireCream\".\n```\n\n## Creating Instances\n\nWe have classes, but they don't do anything yet. Lox doesn't have \"static\"\nmethods that you can call right on the class itself, so without actual\ninstances, classes are useless. 
Thus instances are the next step.\n\nWhile some syntax and semantics are fairly standard across OOP languages, the\nway you create new instances isn't. Ruby, following Smalltalk, creates instances\nby calling a method on the class object itself, a <span\nname=\"turtles\">recursively</span> graceful approach. Some, like C++ and Java,\nhave a `new` keyword dedicated to birthing a new object. Python has you \"call\"\nthe class itself like a function. (JavaScript, ever weird, sort of does both.)\n\n<aside name=\"turtles\">\n\nIn Smalltalk, even *classes* are created by calling methods on an existing\nobject, usually the desired superclass. It's sort of a turtles-all-the-way-down\nthing. It ultimately bottoms out on a few magical classes like Object and\nMetaclass that the runtime conjures into being *ex nihilo*.\n\n</aside>\n\nI took a minimal approach with Lox. We already have class objects, and we\nalready have function calls, so we'll use call expressions on class objects to\ncreate new instances. It's as if a class is a factory function that generates\ninstances of itself. This feels elegant to me, and also spares us the need to\nintroduce syntax like `new`. Therefore, we can skip past the front end straight\ninto the runtime.\n\nRight now, if you try this:\n\n```lox\nclass Bagel {}\nBagel();\n```\n\nYou get a runtime error. `visitCallExpr()` checks to see if the called object\nimplements `LoxCallable` and reports an error since LoxClass doesn't. Not *yet*,\nthat is.\n\n^code lox-class-callable (2 before, 1 after)\n\nImplementing that interface requires two methods.\n\n^code lox-class-call-arity\n\nThe interesting one is `call()`. When you \"call\" a class, it instantiates a new\nLoxInstance for the called class and returns it. The `arity()` method is how the\ninterpreter validates that you passed the right number of arguments to a\ncallable. For now, we'll say you can't pass any. 
When we get to user-defined\nconstructors, we'll revisit this.\n\nThat leads us to LoxInstance, the runtime representation of an instance of a Lox\nclass. Again, our first implementation starts small.\n\n^code lox-instance\n\nLike LoxClass, it's pretty bare bones, but we're only getting started. If you\nwant to give it a try, here's a script to run:\n\n```lox\nclass Bagel {}\nvar bagel = Bagel();\nprint bagel; // Prints \"Bagel instance\".\n```\n\nThis program doesn't do much, but it's starting to do *something*.\n\n## Properties on Instances\n\nWe have instances, so we should make them useful. We're at a fork in the road.\nWe could add behavior first -- methods -- or we could start with state --\nproperties. We're going to take the latter because, as we'll see, the two get\nentangled in an interesting way and it will be easier to make sense of them if\nwe get properties working first.\n\nLox follows JavaScript and Python in how it handles state. Every instance is an\nopen collection of named values. Methods on the instance's class can access and\nmodify properties, but so can <span name=\"outside\">outside</span> code.\nProperties are accessed using a `.` syntax.\n\n<aside name=\"outside\">\n\nAllowing code outside of the class to directly modify an object's fields goes\nagainst the object-oriented credo that a class *encapsulates* state. Some\nlanguages take a more principled stance. In Smalltalk, fields are accessed using\nsimple identifiers -- essentially, variables that are only in scope inside a\nclass's methods. Ruby uses `@` followed by a name to access a field in an\nobject. That syntax is only meaningful inside a method and always accesses state\non the current object.\n\nLox, for better or worse, isn't quite so pious about its OOP faith.\n\n</aside>\n\n```lox\nsomeObject.someProperty\n```\n\nAn expression followed by `.` and an identifier reads the property with that\nname from the object the expression evaluates to. 
That dot has the same\nprecedence as the parentheses in a function call expression, so we slot it into\nthe grammar by replacing the existing `call` rule with:\n\n```ebnf\ncall           → primary ( \"(\" arguments? \")\" | \".\" IDENTIFIER )* ;\n```\n\nAfter a primary expression, we allow a series of any mixture of parenthesized\ncalls and dotted property accesses. \"Property access\" is a mouthful, so from\nhere on out, we'll call these \"get expressions\".\n\n### Get expressions\n\nThe <span name=\"get-ast\">syntax tree node</span> is:\n\n^code get-ast (1 before, 1 after)\n\n<aside name=\"get-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-get].\n\n[appendix-get]: appendix-ii.html#get-expression\n\n</aside>\n\nFollowing the grammar, the new parsing code goes in our existing `call()`\nmethod.\n\n^code parse-property (3 before, 4 after)\n\nThe outer `while` loop there corresponds to the `*` in the grammar rule. We zip\nalong the tokens building up a chain of calls and gets as we find parentheses\nand dots, like so:\n\n<img src=\"image/classes/zip.png\" alt=\"Parsing a series of '.' and '()' expressions to an AST.\" />\n\nInstances of the new Expr.Get node feed into the resolver.\n\n^code resolver-visit-get\n\nOK, not much to that. Since properties are looked up <span\nname=\"dispatch\">dynamically</span>, they don't get resolved. During resolution,\nwe recurse only into the expression to the left of the dot. The actual property\naccess happens in the interpreter.\n\n<aside name=\"dispatch\">\n\nYou can literally see that property dispatch in Lox is dynamic since we don't\nprocess the property name during the static resolution pass.\n\n</aside>\n\n^code interpreter-visit-get\n\nFirst, we evaluate the expression whose property is being accessed. In Lox, only\ninstances of classes have properties. 
If the object is some other type like a\nnumber, invoking a getter on it is a runtime error.\n\nIf the object is a LoxInstance, then we ask it to look up the property. It must\nbe time to give LoxInstance some actual state. A map will do fine.\n\n^code lox-instance-fields (1 before, 2 after)\n\nEach key in the map is a property name and the corresponding value is the\nproperty's value. To look up a property on an instance:\n\n^code lox-instance-get-property\n\n<aside name=\"hidden\">\n\nDoing a hash table lookup for every field access is fast enough for many\nlanguage implementations, but not ideal. High performance VMs for languages like\nJavaScript use sophisticated optimizations like \"[hidden classes][]\" to avoid\nthat overhead.\n\nParadoxically, many of the optimizations invented to make dynamic languages fast\nrest on the observation that -- even in those languages -- most code is fairly\nstatic in terms of the types of objects it works with and their fields.\n\n[hidden classes]: http://richardartoul.github.io/jekyll/update/2015/04/26/hidden-classes.html\n\n</aside>\n\nAn interesting edge case we need to handle is what happens if the instance\ndoesn't *have* a property with the given name. We could silently return some\ndummy value like `nil`, but my experience with languages like JavaScript is that\nthis behavior masks bugs more often than it does anything useful. Instead, we'll\nmake it a runtime error.\n\nSo the first thing we do is see if the instance actually has a field with the\ngiven name. Only then do we return it. Otherwise, we raise an error.\n\nNote how I switched from talking about \"properties\" to \"fields\". There is a\nsubtle difference between the two. Fields are named bits of state stored\ndirectly in an instance. Properties are the named, uh, *things*, that a get\nexpression may return. 
Every field is a property, but as we'll see <span\nname=\"foreshadowing\">later</span>, not every property is a field.\n\n<aside name=\"foreshadowing\">\n\nOoh, foreshadowing. Spooky!\n\n</aside>\n\nIn theory, we can now read properties on objects. But since there's no way to\nactually stuff any state into an instance, there are no fields to access. Before\nwe can test out reading, we must support writing.\n\n### Set expressions\n\nSetters use the same syntax as getters, except they appear on the left side of\nan assignment.\n\n```lox\nsomeObject.someProperty = value;\n```\n\nIn grammar land, we extend the rule for assignment to allow dotted identifiers\non the left-hand side.\n\n```ebnf\nassignment     → ( call \".\" )? IDENTIFIER \"=\" assignment\n               | logic_or ;\n```\n\nUnlike getters, setters don't chain. However, the reference to `call` allows any\nhigh-precedence expression before the last dot, including any number of\n*getters*, as in:\n\n<img src=\"image/classes/setter.png\" alt=\"breakfast.omelette.filling.meat = ham\" />\n\nNote here that only the *last* part, the `.meat` is the *setter*. The\n`.omelette` and `.filling` parts are both *get* expressions.\n\nJust as we have two separate AST nodes for variable access and variable\nassignment, we need a <span name=\"set-ast\">second setter node</span> to\ncomplement our getter node.\n\n^code set-ast (1 before, 1 after)\n\n<aside name=\"set-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-set].\n\n[appendix-set]: appendix-ii.html#set-expression\n\n</aside>\n\nIn case you don't remember, the way we handle assignment in the parser is a\nlittle funny. We can't easily tell that a series of tokens is the left-hand side\nof an assignment until we reach the `=`. 
Now that our assignment grammar rule\nhas `call` on the left side, which can expand to arbitrarily large expressions,\nthat final `=` may be many tokens away from the point where we need to know\nwe're parsing an assignment.\n\nInstead, the trick we do is parse the left-hand side as a normal expression.\nThen, when we stumble onto the equal sign after it, we take the expression we\nalready parsed and transform it into the correct syntax tree node for the\nassignment.\n\nWe add another clause to that transformation to handle turning an Expr.Get\nexpression on the left into the corresponding Expr.Set.\n\n^code assign-set (1 before, 1 after)\n\nThat's parsing our syntax. We push that node through into the resolver.\n\n^code resolver-visit-set\n\nAgain, like Expr.Get, the property itself is dynamically evaluated, so there's\nnothing to resolve there. All we need to do is recurse into the two\nsubexpressions of Expr.Set, the object whose property is being set, and the\nvalue it's being set to.\n\nThat leads us to the interpreter.\n\n^code interpreter-visit-set\n\nWe evaluate the object whose property is being set and check to see if it's a\nLoxInstance. If not, that's a runtime error. Otherwise, we evaluate the value\nbeing set and store it on the instance. That relies on a new method in\nLoxInstance.\n\n<aside name=\"order\">\n\nThis is another semantic edge case. There are three distinct operations:\n\n1. Evaluate the object.\n\n2. Raise a runtime error if it's not an instance of a class.\n\n3. Evaluate the value.\n\nThe order that those are performed in could be user visible, which means we need\nto carefully specify it and ensure our implementations do these in the same\norder.\n\n</aside>\n\n^code lox-instance-set-property\n\nNo real magic here. We stuff the values straight into the Java map where fields\nlive. 
Since Lox allows freely creating new fields on instances, there's no need\nto see if the key is already present.\n\n## Methods on Classes\n\nYou can create instances of classes and stuff data into them, but the class\nitself doesn't really *do* anything. Instances are just maps and all instances\nare more or less the same. To make them feel like instances *of classes*, we\nneed behavior -- methods.\n\nOur helpful parser already parses method declarations, so we're good there. We\nalso don't need to add any new parser support for method *calls*. We already\nhave `.` (getters) and `()` (function calls). A \"method call\" simply chains\nthose together.\n\n<img src=\"image/classes/method.png\" alt=\"The syntax tree for 'object.method(argument)'\" />\n\nThat raises an interesting question. What happens when those two expressions are\npulled apart? Assuming that `method` in this example is a method on the class of\n`object` and not a field on the instance, what should the following piece of\ncode do?\n\n```lox\nvar m = object.method;\nm(argument);\n```\n\nThis program \"looks up\" the method and stores the result -- whatever that is --\nin a variable and then calls that object later. Is this allowed? Can you treat a\nmethod like it's a function on the instance?\n\nWhat about the other direction?\n\n```lox\nclass Box {}\n\nfun notMethod(argument) {\n  print \"called function with \" + argument;\n}\n\nvar box = Box();\nbox.function = notMethod;\nbox.function(\"argument\");\n```\n\nThis program creates an instance and then stores a function in a field on it.\nThen it calls that function using the same syntax as a method call. Does that\nwork?\n\nDifferent languages have different answers to these questions. One could write a\ntreatise on it. For Lox, we'll say the answer to both of these is yes, it does\nwork. We have a couple of reasons to justify that. 
For the second example --\ncalling a function stored in a field -- we want to support that because\nfirst-class functions are useful and storing them in fields is a perfectly\nnormal thing to do.\n\nThe first example is more obscure. One motivation is that users generally expect\nto be able to hoist a subexpression out into a local variable without changing\nthe meaning of the program. You can take this:\n\n```lox\nbreakfast(omelette.filledWith(cheese), sausage);\n```\n\nAnd turn it into this:\n\n```lox\nvar eggs = omelette.filledWith(cheese);\nbreakfast(eggs, sausage);\n```\n\nAnd it does the same thing. Likewise, since the `.` and the `()` in a method\ncall *are* two separate expressions, it seems you should be able to hoist the\n*lookup* part into a variable and then call it <span\nname=\"callback\">later</span>. We need to think carefully about what the *thing*\nyou get when you look up a method is, and how it behaves, even in weird cases\nlike:\n\n<aside name=\"callback\">\n\nA motivating use for this is callbacks. Often, you want to pass a callback whose\nbody simply invokes a method on some object. Being able to look up the method and\npass it directly saves you the chore of manually declaring a function to wrap\nit. Compare this:\n\n```lox\nfun callback(a, b, c) {\n  object.method(a, b, c);\n}\n\ntakeCallback(callback);\n```\n\nWith this:\n\n```lox\ntakeCallback(object.method);\n```\n\n</aside>\n\n```lox\nclass Person {\n  sayName() {\n    print this.name;\n  }\n}\n\nvar jane = Person();\njane.name = \"Jane\";\n\nvar method = jane.sayName;\nmethod(); // ?\n```\n\nIf you grab a handle to a method on some instance and call it later, does it\n\"remember\" the instance it was pulled off from? 
Does `this` inside the method\nstill refer to that original object?\n\nHere's a more pathological example to bend your brain:\n\n```lox\nclass Person {\n  sayName() {\n    print this.name;\n  }\n}\n\nvar jane = Person();\njane.name = \"Jane\";\n\nvar bill = Person();\nbill.name = \"Bill\";\n\nbill.sayName = jane.sayName;\nbill.sayName(); // ?\n```\n\nDoes that last line print \"Bill\" because that's the instance that we *called*\nthe method through, or \"Jane\" because it's the instance where we first grabbed\nthe method?\n\nEquivalent code in Lua and JavaScript would print \"Bill\". Those languages don't\nreally have a notion of \"methods\". Everything is sort of functions-in-fields, so\nit's not clear that `jane` \"owns\" `sayName` any more than `bill` does.\n\nLox, though, has real class syntax so we do know which callable things are\nmethods and which are functions. Thus, like Python, C#, and others, we will have\nmethods \"bind\" `this` to the original instance when the method is first grabbed.\nPython calls <span name=\"bound\">these</span> **bound methods**.\n\n<aside name=\"bound\">\n\nI know, imaginative name, right?\n\n</aside>\n\nIn practice, that's usually what you want. If you take a reference to a method\non some object so you can use it as a callback later, you want to remember the\ninstance it belonged to, even if that callback happens to be stored in a field\non some other object.\n\nOK, that's a lot of semantics to load into your head. Forget about the edge\ncases for a bit. We'll get back to those. For now, let's get basic method calls\nworking. 
We're already parsing the method declarations inside the class body, so\nthe next step is to resolve them.\n\n^code resolve-methods (1 before, 1 after)\n\n<aside name=\"local\">\n\nStoring the function type in a local variable is pointless right now, but we'll\nexpand this code before too long and it will make more sense.\n\n</aside>\n\nWe iterate through the methods in the class body and call the\n`resolveFunction()` method we wrote for handling function declarations already.\nThe only difference is that we pass in a new FunctionType enum value.\n\n^code function-type-method (1 before, 1 after)\n\nThat's going to be important when we resolve `this` expressions. For now, don't\nworry about it. The interesting stuff is in the interpreter.\n\n^code interpret-methods (1 before, 1 after)\n\nWhen we interpret a class declaration statement, we turn the syntactic\nrepresentation of the class -- its AST node -- into its runtime representation.\nNow, we need to do that for the methods contained in the class as well. Each\nmethod declaration blossoms into a LoxFunction object.\n\nWe take all of those and wrap them up into a map, keyed by the method names.\nThat gets stored in LoxClass.\n\n^code lox-class-methods (1 before, 3 after)\n\nWhere an instance stores state, the class stores behavior. LoxInstance has its\nmap of fields, and LoxClass gets a map of methods. Even though methods are\nowned by the class, they are still accessed through instances of that class.\n\n^code lox-instance-get-method (5 before, 2 after)\n\nWhen looking up a property on an instance, if we don't <span\nname=\"shadow\">find</span> a matching field, we look for a method with that name\non the instance's class. If found, we return that. This is where the distinction\nbetween \"field\" and \"property\" becomes meaningful. 
When accessing a property,\nyou might get a field -- a bit of state stored on the instance -- or you could\nhit a method defined on the instance's class.\n\nThe method is looked up using this:\n\n<aside name=\"shadow\">\n\nLooking for a field first implies that fields shadow methods, a subtle but\nimportant semantic point.\n\n</aside>\n\n^code lox-class-find-method\n\nYou can probably guess this method is going to get more interesting later. For\nnow, a simple map lookup on the class's method table is enough to get us\nstarted. Give it a try:\n\n<span name=\"crunch\"></span>\n\n```lox\nclass Bacon {\n  eat() {\n    print \"Crunch crunch crunch!\";\n  }\n}\n\nBacon().eat(); // Prints \"Crunch crunch crunch!\".\n```\n\n<aside name=\"crunch\">\n\nApologies if you prefer chewy bacon over crunchy. Feel free to adjust the script\nto your taste.\n\n</aside>\n\n## This\n\nWe can define both behavior and state on objects, but they aren't tied together\nyet. Inside a method, we have no way to access the fields of the \"current\"\nobject -- the instance that the method was called on -- nor can we call other\nmethods on that same object.\n\nTo get at that instance, it needs a <span name=\"i\">name</span>. Smalltalk,\nRuby, and Swift use \"self\". Simula, C++, Java, and others use \"this\". Python\nuses \"self\" by convention, but you can technically call it whatever you like.\n\n<aside name=\"i\">\n\n\"I\" would have been a great choice, but using \"i\" for loop variables predates\nOOP and goes all the way back to Fortran. We are victims of the incidental\nchoices of our forebears.\n\n</aside>\n\nFor Lox, since we generally hew to Java-ish style, we'll go with \"this\". Inside\na method body, a `this` expression evaluates to the instance that the method was\ncalled on. Or, more specifically, since methods are accessed and then invoked as\ntwo steps, it will refer to the object that the method was *accessed* from.\n\nThat makes our job harder. 
Peep at:\n\n```lox\nclass Egotist {\n  speak() {\n    print this;\n  }\n}\n\nvar method = Egotist().speak;\nmethod();\n```\n\nOn the second-to-last line, we grab a reference to the `speak()` method off an\ninstance of the class. That returns a function, and that function needs to\nremember the instance it was pulled off of so that *later*, on the last line, it\ncan still find it when the function is called.\n\nWe need to take `this` at the point that the method is accessed and attach it to\nthe function somehow so that it stays around as long as we need it to. Hmm... a\nway to store some extra data that hangs around a function, eh? That sounds an\nawful lot like a *closure*, doesn't it?\n\nIf we defined `this` as a sort of hidden variable in an environment that\nsurrounds the function returned when looking up a method, then uses of `this` in\nthe body would be able to find it later. LoxFunction already has the ability to\nhold on to a surrounding environment, so we have the machinery we need.\n\nLet's walk through an example to see how it works:\n\n```lox\nclass Cake {\n  taste() {\n    var adjective = \"delicious\";\n    print \"The \" + this.flavor + \" cake is \" + adjective + \"!\";\n  }\n}\n\nvar cake = Cake();\ncake.flavor = \"German chocolate\";\ncake.taste(); // Prints \"The German chocolate cake is delicious!\".\n```\n\nWhen we first evaluate the class definition, we create a LoxFunction for\n`taste()`. Its closure is the environment surrounding the class, in this case\nthe global one. So the LoxFunction we store in the class's method map looks\nlike so:\n\n<img src=\"image/classes/closure.png\" alt=\"The initial closure for the method.\" />\n\nWhen we evaluate the `cake.taste` get expression, we create a new environment\nthat binds `this` to the object the method is accessed from (here, `cake`). 
Then\nwe make a *new* LoxFunction with the same code as the original one but using\nthat new environment as its closure.\n\n<img src=\"image/classes/bound-method.png\" alt=\"The new closure that binds 'this'.\" />\n\nThis is the LoxFunction that gets returned when evaluating the get expression\nfor the method name. When that function is later called by a `()` expression,\nwe create an environment for the method body as usual.\n\n<img src=\"image/classes/call.png\" alt=\"Calling the bound method and creating a new environment for the method body.\" />\n\nThe parent of the body environment is the environment we created earlier to bind\n`this` to the current object. Thus any use of `this` inside the body\nsuccessfully resolves to that instance.\n\nReusing our environment code for implementing `this` also takes care of\ninteresting cases where methods and functions interact, like:\n\n```lox\nclass Thing {\n  getCallback() {\n    fun localFunction() {\n      print this;\n    }\n\n    return localFunction;\n  }\n}\n\nvar callback = Thing().getCallback();\ncallback();\n```\n\nIn, say, JavaScript, it's common to return a callback from inside a method. That\ncallback may want to hang on to and retain access to the original object -- the\n`this` value -- that the method was associated with. Our existing support for\nclosures and environment chains should do all this correctly.\n\nLet's code it up. 
The first step is adding <span name=\"this-ast\">new\nsyntax</span> for `this`.\n\n^code this-ast (1 before, 1 after)\n\n<aside name=\"this-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-this].\n\n[appendix-this]: appendix-ii.html#this-expression\n\n</aside>\n\nParsing is simple since it's a single token which our lexer already\nrecognizes as a reserved word.\n\n^code parse-this (2 before, 2 after)\n\nYou can start to see how `this` works like a variable when we get to the\nresolver.\n\n^code resolver-visit-this\n\nWe resolve it exactly like any other local variable using \"this\" as the name for\nthe \"variable\". Of course, that's not going to work right now, because \"this\"\n*isn't* declared in any scope. Let's fix that over in `visitClassStmt()`.\n\n^code resolver-begin-this-scope (2 before, 1 after)\n\nBefore we step in and start resolving the method bodies, we push a new scope and\ndefine \"this\" in it as if it were a variable. Then, when we're done, we discard\nthat surrounding scope.\n\n^code resolver-end-this-scope (2 before, 1 after)\n\nNow, whenever a `this` expression is encountered (at least inside a method) it\nwill resolve to a \"local variable\" defined in an implicit scope just outside of\nthe block for the method body.\n\nThe resolver has a new *scope* for `this`, so the interpreter needs to create a\ncorresponding *environment* for it. Remember, we always have to keep the\nresolver's scope chains and the interpreter's linked environments in sync with\neach other. At runtime, we create the environment after we find the method on\nthe instance. We replace the previous line of code that simply returned the\nmethod's LoxFunction with this:\n\n^code lox-instance-bind-method (1 before, 3 after)\n\nNote the new call to `bind()`. That looks like so:\n\n^code bind-instance\n\nThere isn't much to it. We create a new environment nestled inside the method's\noriginal closure. Sort of a closure-within-a-closure. 
When the method is called,\nthat will become the parent of the method body's environment.\n\nWe declare \"this\" as a variable in that environment and bind it to the given\ninstance, the instance that the method is being accessed from. *Et voilà*, the\nreturned LoxFunction now carries around its own little persistent world where\n\"this\" is bound to the object.\n\nThe remaining task is interpreting those `this` expressions. Similar to the\nresolver, it is the same as interpreting a variable expression.\n\n^code interpreter-visit-this\n\nGo ahead and give it a try using that cake example from earlier. With less than\ntwenty lines of code, our interpreter handles `this` inside methods even in all\nof the weird ways it can interact with nested classes, functions inside methods,\nhandles to methods, etc.\n\n### Invalid uses of this\n\nWait a minute. What happens if you try to use `this` *outside* of a method? What\nabout:\n\n```lox\nprint this;\n```\n\nOr:\n\n```lox\nfun notAMethod() {\n  print this;\n}\n```\n\nThere is no instance for `this` to point to if you're not in a method. We could\ngive it some default value like `nil` or make it a runtime error, but the user\nhas clearly made a mistake. The sooner they find and fix that mistake, the\nhappier they'll be.\n\nOur resolution pass is a fine place to detect this error statically. It already\ndetects `return` statements outside of functions. We'll do something similar for\n`this`. In the vein of our existing FunctionType enum, we define a new ClassType\none.\n\n^code class-type (1 before, 1 after)\n\nYes, it could be a Boolean. When we get to inheritance, it will get a third\nvalue, hence the enum right now. We also add a corresponding field,\n`currentClass`. Its value tells us if we are currently inside a class\ndeclaration while traversing the syntax tree. 
It starts out `NONE` which means\nwe aren't in one.\n\nWhen we begin to resolve a class declaration, we change that.\n\n^code set-current-class (1 before, 1 after)\n\nAs with `currentFunction`, we store the previous value of the field in a local\nvariable. This lets us piggyback onto the JVM to keep a stack of `currentClass`\nvalues. That way we don't lose track of the previous value if one class nests\ninside another.\n\nOnce the methods have been resolved, we \"pop\" that stack by restoring the old\nvalue.\n\n^code restore-current-class (2 before, 1 after)\n\nWhen we resolve a `this` expression, the `currentClass` field gives us the bit\nof data we need to report an error if the expression doesn't occur nestled\ninside a method body.\n\n^code this-outside-of-class (1 before, 1 after)\n\nThat should help users use `this` correctly, and it saves us from having to\nhandle misuse at runtime in the interpreter.\n\n## Constructors and Initializers\n\nWe can do almost everything with classes now, and as we near the end of the\nchapter we find ourselves strangely focused on a beginning. Methods and fields\nlet us encapsulate state and behavior together so that an object always *stays*\nin a valid configuration. But how do we ensure a brand new object *starts* in a\ngood state?\n\nFor that, we need constructors. I find them one of the trickiest parts of a\nlanguage to design, and if you peer closely at most other languages, you'll see\n<span name=\"cracks\">cracks</span> around object construction where the seams of\nthe design don't quite fit together perfectly. Maybe there's something\nintrinsically messy about the moment of birth.\n\n<aside name=\"cracks\">\n\nA few examples: In Java, even though final fields must be initialized, it is\nstill possible to read one *before* it has been. 
Exceptions -- a huge, complex\nfeature -- were added to C++ mainly as a way to emit errors from constructors.\n\n</aside>\n\n\"Constructing\" an object is actually a pair of operations:\n\n1.  The runtime <span name=\"allocate\">*allocates*</span> the memory required for\n    a fresh instance. In most languages, this operation is at a fundamental\n    level beneath what user code is able to access.\n\n    <aside name=\"allocate\">\n\n    C++'s \"[placement new][]\" is a rare example where the bowels of allocation\n    are laid bare for the programmer to prod.\n\n    </aside>\n\n2.  Then, a user-provided chunk of code is called which *initializes* the\n    unformed object.\n\n[placement new]: https://en.wikipedia.org/wiki/Placement_syntax\n\nThe latter is what we tend to think of when we hear \"constructor\", but the\nlanguage itself has usually done some groundwork for us before we get to that\npoint. In fact, our Lox interpreter already has that covered when it creates a\nnew LoxInstance object.\n\nWe'll do the remaining part -- user-defined initialization -- now. Languages\nhave a variety of notations for the chunk of code that sets up a new object for\na class. C++, Java, and C# use a method whose name matches the class name. Ruby\ncalls it `initialize`, and Python calls it `__init__()`. A plain `init()` is\nnice and short, so we'll do that.\n\nIn LoxClass's implementation of LoxCallable, we add a few more lines.\n\n^code lox-class-call-initializer (2 before, 1 after)\n\nWhen a class is called, after the LoxInstance is created, we look for an \"init\"\nmethod. If we find one, we immediately bind and invoke it just like a normal\nmethod call. The argument list is forwarded along.\n\nThat argument list means we also need to tweak how a class declares its arity.\n\n^code lox-initializer-arity (1 before, 1 after)\n\nIf there is an initializer, that method's arity determines how many arguments\nyou must pass when you call the class itself. 
We don't *require* a class to\ndefine an initializer, though, as a convenience. If you don't have an\ninitializer, the arity is still zero.\n\nThat's basically it. Since we bind the `init()` method before we call it, it has\naccess to `this` inside its body. That, along with the arguments passed to the\nclass, is all you need to be able to set up the new instance however you\ndesire.\n\n### Invoking init() directly\n\nAs usual, exploring this new semantic territory rustles up a few weird\ncreatures. Consider:\n\n```lox\nclass Foo {\n  init() {\n    print this;\n  }\n}\n\nvar foo = Foo();\nprint foo.init();\n```\n\nCan you \"re-initialize\" an object by directly calling its `init()` method? If\nyou do, what does it return? A <span name=\"compromise\">reasonable</span> answer\nwould be `nil` since that's what it appears the body returns.\n\nHowever -- and I generally dislike compromising to satisfy the\nimplementation -- it will make clox's implementation of constructors much\neasier if we say that `init()` methods always return `this`, even when\ndirectly called. In order to keep jlox compatible with that, we add a little\nspecial case code in LoxFunction.\n\n<aside name=\"compromise\">\n\nMaybe \"dislike\" is too strong a claim. It's reasonable to have the constraints\nand resources of your implementation affect the design of the language. There\nare only so many hours in the day, and if a cut corner here or there lets you get\nmore features to users in less time, it may very well be a net win for their\nhappiness and productivity. The trick is figuring out *which* corners to cut\nthat won't cause your users and future self to curse your shortsightedness.\n\n</aside>\n\n^code return-this (2 before, 1 after)\n\nIf the function is an initializer, we override the actual return value and\nforcibly return `this`. 
That relies on a new `isInitializer` field.\n\n^code is-initializer-field (2 before, 2 after)\n\nWe can't simply see if the name of the LoxFunction is \"init\" because the user\ncould have defined a *function* with that name. In that case, there *is* no\n`this` to return. To avoid *that* weird edge case, we'll directly store whether\nthe LoxFunction represents an initializer method. That means we need to go back\nand fix the few places where we create LoxFunctions.\n\n^code construct-function (1 before, 1 after)\n\nFor actual function declarations, `isInitializer` is always false. For methods,\nwe check the name.\n\n^code interpreter-method-initializer (1 before, 1 after)\n\nAnd then in `bind()` where we create the closure that binds `this` to a method,\nwe pass along the original method's value.\n\n^code lox-function-bind-with-initializer (1 before, 1 after)\n\n### Returning from init()\n\nWe aren't out of the woods yet. We've been assuming that a user-written\ninitializer doesn't explicitly return a value because most constructors don't.\nWhat should happen if a user tries:\n\n```lox\nclass Foo {\n  init() {\n    return \"something else\";\n  }\n}\n```\n\nIt's definitely not going to do what they want, so we may as well make it a\nstatic error. Back in the resolver, we add another case to FunctionType.\n\n^code function-type-initializer (1 before, 1 after)\n\nWe use the visited method's name to determine if we're resolving an initializer\nor not.\n\n^code resolver-initializer-type (1 before, 1 after)\n\nWhen we later traverse into a `return` statement, we check that field and make\nit an error to return a value from inside an `init()` method.\n\n^code return-in-initializer (1 before, 1 after)\n\nWe're *still* not done. 
We statically disallow returning a *value* from an\ninitializer, but you can still use an empty early `return`.\n\n```lox\nclass Foo {\n  init() {\n    return;\n  }\n}\n```\n\nThat is actually kind of useful sometimes, so we don't want to disallow it\nentirely. Instead, it should return `this` rather than `nil`. That's an easy\nfix over in LoxFunction.\n\n^code early-return-this (1 before, 1 after)\n\nIf we're in an initializer and execute a `return` statement, instead of\nreturning the value (which will always be `nil`), we again return `this`.\n\nPhew! That was a whole list of tasks but our reward is that our little\ninterpreter has grown an entire programming paradigm. Classes, methods, fields,\n`this`, and constructors. Our baby language is looking awfully grown-up.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  We have methods on instances, but there is no way to define \"static\" methods\n    that can be called directly on the class object itself. Add support for\n    them. Use a `class` keyword preceding the method to indicate a static method\n    that hangs off the class object.\n\n    ```lox\n    class Math {\n      class square(n) {\n        return n * n;\n      }\n    }\n\n    print Math.square(3); // Prints \"9\".\n    ```\n\n    You can solve this however you like, but the \"[metaclasses][]\" used by\n    Smalltalk and Ruby are a particularly elegant approach. *Hint: Make LoxClass\n    extend LoxInstance and go from there.*\n\n2.  Most modern languages support \"getters\" and \"setters\" -- members on a class\n    that look like field reads and writes but that actually execute user-defined\n    code. Extend Lox to support getter methods. These are declared without a\n    parameter list. 
The body of the getter is executed when a property with that\n    name is accessed.\n\n    ```lox\n    class Circle {\n      init(radius) {\n        this.radius = radius;\n      }\n\n      area {\n        return 3.141592653 * this.radius * this.radius;\n      }\n    }\n\n    var circle = Circle(4);\n    print circle.area; // Prints roughly \"50.2655\".\n    ```\n\n3.  Python and JavaScript allow you to freely access an object's fields from\n    outside of its own methods. Ruby and Smalltalk encapsulate instance state.\n    Only methods on the class can access the raw fields, and it is up to the\n    class to decide which state is exposed. Most statically typed languages\n    offer modifiers like `private` and `public` to control which parts of a\n    class are externally accessible on a per-member basis.\n\n    What are the trade-offs between these approaches and why might a language\n    prefer one or the other?\n\n[metaclasses]: https://en.wikipedia.org/wiki/Metaclass\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: Prototypes and Power\n\nIn this chapter, we introduced two new runtime entities, LoxClass and\nLoxInstance. The former is where behavior for objects lives, and the latter is\nfor state. What if you could define methods right on a single object, inside\nLoxInstance? In that case, we wouldn't need LoxClass at all. LoxInstance would\nbe a complete package for defining the behavior and state of an object.\n\nWe'd still want some way, without classes, to reuse behavior across multiple\ninstances. We could let a LoxInstance [*delegate*][delegate] directly to another\nLoxInstance to reuse its fields and methods, sort of like inheritance.\n\nUsers would model their program as a constellation of objects, some of which\ndelegate to each other to reflect commonality. Objects used as delegates\nrepresent \"canonical\" or \"prototypical\" objects that others refine. 
The result\nis a simpler runtime with only a single internal construct, LoxInstance.\n\nThat's where the name **[prototypes][proto]** comes from for this paradigm. It\nwas invented by David Ungar and Randall Smith in a language called [Self][].\nThey came up with it by starting with Smalltalk and following the above mental\nexercise to see how much they could pare it down.\n\nPrototypes were an academic curiosity for a long time, a fascinating one that\ngenerated interesting research but didn't make a dent in the larger world of\nprogramming. That is, until Brendan Eich crammed prototypes into JavaScript,\nwhich then promptly took over the world. Many (many) <span\nname=\"words\">words</span> have been written about prototypes in JavaScript.\nWhether that shows that prototypes are brilliant or confusing -- or both! -- is\nan open question.\n\n<aside name=\"words\">\n\nIncluding [more than a handful][prototypes] by yours truly.\n\n</aside>\n\nI won't get into whether or not I think prototypes are a good idea for a\nlanguage. I've made languages that are [prototypal][finch] and\n[class-based][wren], and my opinions of both are complex. What I want to discuss\nis the role of *simplicity* in a language.\n\nPrototypes are simpler than classes -- less code for the language implementer to\nwrite, and fewer concepts for the user to learn and understand. Does that make\nthem better? We language nerds have a tendency to fetishize minimalism.\nPersonally, I think simplicity is only part of the equation. What we really want\nto give the user is *power*, which I define as:\n\n```text\npower = breadth × ease ÷ complexity\n```\n\nNone of these are precise numeric measures. I'm using math as analogy here, not\nactual quantification.\n\n*   **Breadth** is the range of different things the language lets you express.\n    C has a lot of breadth -- it's been used for everything from operating\n    systems to user applications to games. 
Domain-specific languages like\n    AppleScript and MATLAB have less breadth.\n\n*   **Ease** is how little effort it takes to make the language do what you\n    want. \"Usability\" might be another term, though it carries more baggage than\n    I want to bring in. \"Higher-level\" languages tend to have more ease than\n    \"lower-level\" ones. Most languages have a \"grain\" to them where some things\n    feel easier to express than others.\n\n*   **Complexity** is how big the language (including its runtime, core libraries,\n    tools, ecosystem, etc.) is. People talk about how many pages are in a\n    language's spec, or how many keywords it has. It's how much the user has to\n    load into their wetware before they can be productive in the system. It is\n    the antonym of simplicity.\n\n[proto]: https://en.wikipedia.org/wiki/Prototype-based_programming\n\nReducing complexity *does* increase power. The smaller the denominator, the\nlarger the resulting value, so our intuition that simplicity is good is valid.\nHowever, when reducing complexity, we must take care not to sacrifice breadth or\nease in the process, or the total power may go down. Java would be a strictly\n*simpler* language if it removed strings, but it probably wouldn't handle text\nmanipulation tasks well, nor would it be as easy to get things done.\n\nThe art, then, is finding *accidental* complexity that can be omitted --\nlanguage features and interactions that don't carry their weight by increasing\nthe breadth or ease of using the language.\n\nIf users want to express their program in terms of categories of objects, then\nbaking classes into the language increases the ease of doing that, hopefully by\na large enough margin to pay for the added complexity. 
But if that isn't how\nusers are using your language, then by all means leave classes out.\n\n</div>\n\n[delegate]: https://en.wikipedia.org/wiki/Prototype-based_programming#Delegation\n[prototypes]: http://gameprogrammingpatterns.com/prototype.html\n[self]: http://www.selflanguage.org/\n[finch]: http://finch.stuffwithstuff.com/\n[wren]: http://wren.io/\n"
  },
  {
    "path": "book/closures.md",
    "content": "> As the man said, for every complex problem there's a simple solution, and it's\n> wrong.\n>\n> <cite>Umberto Eco, <em>Foucault's Pendulum</em></cite>\n\nThanks to our diligent labor in [the last chapter][last], we have a virtual\nmachine with working functions. What it lacks is closures. Aside from global\nvariables, which are their own breed of animal, a function has no way to\nreference a variable declared outside of its own body.\n\n[last]: calls-and-functions.html\n\n```lox\nvar x = \"global\";\nfun outer() {\n  var x = \"outer\";\n  fun inner() {\n    print x;\n  }\n  inner();\n}\nouter();\n```\n\nRun this example now and it prints \"global\". It's supposed to print \"outer\". To\nfix this, we need to include the entire lexical scope of all surrounding\nfunctions when resolving a variable.\n\nThis problem is harder in clox than it was in jlox because our bytecode VM\nstores locals on a stack. We used a stack because I claimed locals have stack\nsemantics -- variables are discarded in the reverse order that they are created.\nBut with closures, that's only *mostly* true.\n\n```lox\nfun makeClosure() {\n  var local = \"local\";\n  fun closure() {\n    print local;\n  }\n  return closure;\n}\n\nvar closure = makeClosure();\nclosure();\n```\n\nThe outer function `makeClosure()` declares a variable, `local`. It also creates\nan inner function, `closure()` that captures that variable. Then `makeClosure()`\nreturns a reference to that function. Since the closure <span\nname=\"flying\">escapes</span> while holding on to the local variable, `local` must\noutlive the function call where it was created.\n\n<aside name=\"flying\">\n\n<img src=\"image/closures/flying.png\" class=\"above\" alt=\"A local variable flying away from the stack.\"/>\n\nOh no, it's escaping!\n\n</aside>\n\nWe could solve this problem by dynamically allocating memory for all local\nvariables. 
That's what jlox does by putting everything in those Environment\nobjects that float around in Java's heap. But we don't want to. Using a <span\nname=\"stack\">stack</span> is *really* fast. Most local variables are *not*\ncaptured by closures and do have stack semantics. It would suck to make all of\nthose slower for the benefit of the rare local that is captured.\n\n<aside name=\"stack\">\n\nThere is a reason that C and Java use the stack for their local variables, after\nall.\n\n</aside>\n\nThis means a more complex approach than we used in our Java interpreter. Because\nsome locals have very different lifetimes, we will have two implementation\nstrategies. For locals that aren't used in closures, we'll keep them just as\nthey are on the stack. When a local is captured by a closure, we'll adopt\nanother solution that lifts them onto the heap where they can live as long as\nneeded.\n\nClosures have been around since the early Lisp days when bytes of memory and CPU\ncycles were more precious than emeralds. Over the intervening decades, hackers\ndevised all <span name=\"lambda\">manner</span> of ways to compile closures to\noptimized runtime representations. Some are more efficient but require a more\ncomplex compilation process than we could easily retrofit into clox.\n\n<aside name=\"lambda\">\n\nSearch for \"closure conversion\" or \"lambda lifting\" to start exploring.\n\n</aside>\n\nThe technique I explain here comes from the design of the Lua VM. It is fast,\nparsimonious with memory, and implemented with relatively little code. Even more\nimpressive, it fits naturally into the single-pass compilers clox and Lua both\nuse. It is somewhat intricate, though. It might take a while before all the\npieces click together in your mind. We'll build them one step at a time, and\nI'll try to introduce the concepts in stages.\n\n## Closure Objects\n\nOur VM represents functions at runtime using ObjFunction. These objects are\ncreated by the front end during compilation. 
At runtime, all the VM does is load\nthe function object from a constant table and bind it to a name. There is no\noperation to \"create\" a function at runtime. Much like string and number <span\nname=\"literal\">literals</span>, they are constants instantiated purely at\ncompile time.\n\n<aside name=\"literal\">\n\nIn other words, a function declaration in Lox *is* a kind of literal -- a piece\nof syntax that defines a constant value of a built-in type.\n\n</aside>\n\nThat made sense because all of the data that composes a function is known at\ncompile time: the chunk of bytecode compiled from the function's body, and the\nconstants used in the body. Once we introduce closures, though, that\nrepresentation is no longer sufficient. Take a gander at:\n\n```lox\nfun makeClosure(value) {\n  fun closure() {\n    print value;\n  }\n  return closure;\n}\n\nvar doughnut = makeClosure(\"doughnut\");\nvar bagel = makeClosure(\"bagel\");\ndoughnut();\nbagel();\n```\n\nThe `makeClosure()` function defines and returns a function. We call it twice\nand get two closures back. They are created by the same nested function\ndeclaration, `closure`, but close over different values. When we call the two\nclosures, each prints a different string. That implies we need some runtime\nrepresentation for a closure that captures the local variables surrounding the\nfunction as they exist when the function declaration is *executed*, not just\nwhen it is compiled.\n\nWe'll work our way up to capturing variables, but a good first step is defining\nthat object representation. Our existing ObjFunction type represents the <span\nname=\"raw\">\"raw\"</span> compile-time state of a function declaration, since all\nclosures created from a single declaration share the same code and constants. At\nruntime, when we execute a function declaration, we wrap the ObjFunction in a\nnew ObjClosure structure. 
The latter has a reference to the underlying bare\nfunction along with runtime state for the variables the function closes over.\n\n<aside name=\"raw\">\n\nThe Lua implementation refers to the raw function object containing the bytecode\nas a \"prototype\", which is a great word to describe this, except that word also\ngets overloaded to refer to [prototypal inheritance][].\n\n[prototypal inheritance]: https://en.wikipedia.org/wiki/Prototype-based_programming\n\n</aside>\n\n<img src=\"image/closures/obj-closure.png\" alt=\"An ObjClosure with a reference to an ObjFunction.\"/>\n\nWe'll wrap every function in an ObjClosure, even if the function doesn't\nactually close over and capture any surrounding local variables. This is a\nlittle wasteful, but it simplifies the VM because we can always assume that the\nfunction we're calling is an ObjClosure. That new struct starts out like this:\n\n^code obj-closure\n\nRight now, it simply points to an ObjFunction and adds the necessary object\nheader stuff. Grinding through the usual ceremony for adding a new object type\nto clox, we declare a C function to create a new closure.\n\n^code new-closure-h (2 before, 1 after)\n\nThen we implement it here:\n\n^code new-closure\n\nIt takes a pointer to the ObjFunction it wraps. It also initializes the type\nfield to a new type.\n\n^code obj-type-closure (1 before, 1 after)\n\nAnd when we're done with a closure, we release its memory.\n\n^code free-closure (1 before, 1 after)\n\nWe free only the ObjClosure itself, not the ObjFunction. That's because the\nclosure doesn't *own* the function. There may be multiple closures that all\nreference the same function, and none of them claims any special privilege over\nit. We can't free the ObjFunction until *all* objects referencing it are gone --\nincluding even the surrounding function whose constant table contains it.\nTracking that sounds tricky, and it is! 
That's why we'll write a garbage\ncollector soon to manage it for us.\n\nWe also have the usual <span name=\"macro\">macros</span> for checking a value's\ntype.\n\n<aside name=\"macro\">\n\nPerhaps I should have defined a macro to make it easier to generate these\nmacros. Maybe that would be a little too meta.\n\n</aside>\n\n^code is-closure (2 before, 1 after)\n\nAnd to cast a value:\n\n^code as-closure (2 before, 1 after)\n\nClosures are first-class objects, so you can print them.\n\n^code print-closure (1 before, 1 after)\n\nThey display exactly as ObjFunction does. From the user's perspective, the\ndifference between ObjFunction and ObjClosure is purely a hidden implementation\ndetail. With that out of the way, we have a working but empty representation for\nclosures.\n\n### Compiling to closure objects\n\nWe have closure objects, but our VM never creates them. The next step is getting\nthe compiler to emit instructions to tell the runtime when to create a new\nObjClosure to wrap a given ObjFunction. This happens right at the end of a\nfunction declaration.\n\n^code emit-closure (1 before, 1 after)\n\nBefore, the final bytecode for a function declaration was a single `OP_CONSTANT`\ninstruction to load the compiled function from the surrounding function's\nconstant table and push it onto the stack. Now we have a new instruction.\n\n^code closure-op (1 before, 1 after)\n\nLike `OP_CONSTANT`, it takes a single operand that represents a constant table\nindex for the function. But when we get over to the runtime implementation, we\ndo something more interesting.\n\nFirst, let's be diligent VM hackers and slot in disassembler support for the\ninstruction.\n\n^code disassemble-closure (2 before, 1 after)\n\nThere's more going on here than we usually have in the disassembler. By the end\nof the chapter, you'll discover that `OP_CLOSURE` is quite an unusual\ninstruction. It's straightforward right now -- just a single byte operand -- but\nwe'll be adding to it. 
This code here anticipates that future.\n\n### Interpreting function declarations\n\nMost of the work we need to do is in the runtime. We have to handle the new\ninstruction, naturally. But we also need to touch every piece of code in the VM\nthat works with ObjFunction and change it to use ObjClosure instead -- function\ncalls, call frames, etc. We'll start with the instruction, though.\n\n^code interpret-closure (1 before, 1 after)\n\nLike the `OP_CONSTANT` instruction we used before, first we load the compiled\nfunction from the constant table. The difference now is that we wrap that\nfunction in a new ObjClosure and push the result onto the stack.\n\nOnce you have a closure, you'll eventually want to call it.\n\n^code call-value-closure (1 before, 1 after)\n\nWe remove the code for calling objects whose type is `OBJ_FUNCTION`. Since we\nwrap all functions in ObjClosures, the runtime will never try to invoke a bare\nObjFunction anymore. Those objects live only in constant tables and get\nimmediately <span name=\"naked\">wrapped</span> in closures before anything else\nsees them.\n\n<aside name=\"naked\">\n\nWe don't want any naked functions wandering around the VM! What would the\nneighbors say?\n\n</aside>\n\nWe replace the old code with very similar code for calling a closure instead.\nThe only difference is the type of object we pass to `call()`. The real changes\nare over in that function. First, we update its signature.\n\n^code call-signature (1 after)\n\nThen, in the body, we need to fix everything that referenced the function to\nhandle the fact that we've introduced a layer of indirection. We start with the\narity checking:\n\n^code check-arity (1 before, 1 after)\n\nThe only change is that we unwrap the closure to get to the underlying function.\nThe next thing `call()` does is create a new CallFrame. 
We change that code to\nstore the closure in the CallFrame and get the bytecode pointer from the\nclosure's function.\n\n^code call-init-closure (1 before, 1 after)\n\nThis necessitates changing the declaration of CallFrame too.\n\n^code call-frame-closure (1 before, 1 after)\n\nThat change triggers a few other cascading changes. Every place in the VM that\naccessed CallFrame's function needs to use a closure instead. First, the macro\nfor reading a constant from the current function's constant table:\n\n^code read-constant (2 before, 2 after)\n\nWhen `DEBUG_TRACE_EXECUTION` is enabled, it needs to get to the chunk from the\nclosure.\n\n^code disassemble-instruction (1 before, 1 after)\n\nLikewise when reporting a runtime error:\n\n^code runtime-error-function (1 before, 1 after)\n\nAlmost there. The last piece is the blob of code that sets up the very first\nCallFrame to begin executing the top-level code for a Lox script.\n\n^code interpret (1 before, 2 after)\n\n<span name=\"pop\">The</span> compiler still returns a raw ObjFunction when\ncompiling a script. That's fine, but it means we need to wrap it in an\nObjClosure here, before the VM can execute it.\n\n<aside name=\"pop\">\n\nThe code looks a little silly because we still push the original ObjFunction\nonto the stack. Then we pop it after creating the closure, only to then push the\nclosure. Why put the ObjFunction on there at all? As usual, when you see weird\nstack stuff going on, it's to keep the [forthcoming garbage collector][gc] aware\nof some heap-allocated objects.\n\n[gc]: garbage-collection.html\n\n</aside>\n\nWe are back to a working interpreter. The *user* can't tell any difference, but\nthe compiler now generates code telling the VM to create a closure for each\nfunction declaration. Every time the VM executes a function declaration, it\nwraps the ObjFunction in a new ObjClosure. The rest of the VM now handles those\nObjClosures floating around. That's the boring stuff out of the way. 
Now we're\nready to make these closures actually *do* something.\n\n## Upvalues\n\nOur existing instructions for reading and writing local variables are limited to\na single function's stack window. Locals from a surrounding function are outside\nof the inner function's window. We're going to need some new instructions.\n\nThe easiest approach might be an instruction that takes a relative stack slot\noffset that can reach *before* the current function's window. That would work if\nclosed-over variables were always on the stack. But as we saw earlier, these\nvariables sometimes outlive the function where they are declared. That means\nthey won't always be on the stack.\n\nThe next easiest approach, then, would be to take any local variable that gets\nclosed over and have it always live on the heap. When the local variable\ndeclaration in the surrounding function is executed, the VM would allocate\nmemory for it dynamically. That way it could live as long as needed.\n\nThis would be a fine approach if clox didn't have a single-pass compiler. But\nthat restriction we chose in our implementation makes things harder. Take a look\nat this example:\n\n```lox\nfun outer() {\n  var x = 1;    // (1)\n  x = 2;        // (2)\n  fun inner() { // (3)\n    print x;\n  }\n  inner();\n}\n```\n\nHere, the compiler compiles the declaration of `x` at `(1)` and emits code for\nthe assignment at `(2)`. It does that before reaching the declaration of\n`inner()` at `(3)` and discovering that `x` is in fact closed over. We don't\nhave an easy way to go back and fix that already-emitted code to treat `x`\nspecially. Instead, we want a solution that allows a closed-over variable to\nlive on the stack exactly like a normal local variable *until the point that it\nis closed over*.\n\nFortunately, thanks to the Lua dev team, we have a solution. We use a level of\nindirection that they call an **upvalue**. An upvalue refers to a local variable\nin an enclosing function. 
Every closure maintains an array of upvalues, one for\neach surrounding local variable that the closure uses.\n\nThe upvalue points back into the stack to where the variable it captured lives.\nWhen the closure needs to access a closed-over variable, it goes through the\ncorresponding upvalue to reach it. When a function declaration is first executed\nand we create a closure for it, the VM creates the array of upvalues and wires\nthem up to \"capture\" the surrounding local variables that the closure needs.\n\nFor example, if we throw this program at clox,\n\n```lox\n{\n  var a = 3;\n  fun f() {\n    print a;\n  }\n}\n```\n\nthe compiler and runtime will conspire together to build up a set of objects in\nmemory like this:\n\n<img src=\"image/closures/open-upvalue.png\" alt=\"The object graph of the stack, ObjClosure, ObjFunction, and upvalue array.\"/>\n\n\nThat might look overwhelming, but fear not. We'll work our way through it. The\nimportant part is that upvalues serve as the layer of indirection needed to\ncontinue to find a captured local variable even after it moves off the stack.\nBut before we get to all that, let's focus on compiling captured variables.\n\n### Compiling upvalues\n\nAs usual, we want to do as much work as possible during compilation to keep\nexecution simple and fast. Since local variables are lexically scoped in Lox, we\nhave enough knowledge at compile time to resolve which surrounding local\nvariables a function accesses and where those locals are declared. That, in\nturn, means we know *how many* upvalues a closure needs, *which* variables they\ncapture, and *which stack slots* contain those variables in the declaring\nfunction's stack window.\n\nCurrently, when the compiler resolves an identifier, it walks the block scopes\nfor the current function from innermost to outermost. If we don't find the\nvariable in that function, we assume the variable must be a global. 
We don't\nconsider the local scopes of enclosing functions -- they get skipped right over.\nThe first change, then, is inserting a resolution step for those outer local\nscopes.\n\n^code named-variable-upvalue (3 before, 1 after)\n\nThis new `resolveUpvalue()` function looks for a local variable declared in any\nof the surrounding functions. If it finds one, it returns an \"upvalue index\" for\nthat variable. (We'll get into what that means later.) Otherwise, it returns -1\nto indicate the variable wasn't found. If it was found, we use these two new\ninstructions for reading or writing to the variable through its upvalue:\n\n^code upvalue-ops (1 before, 1 after)\n\nWe're implementing this sort of top-down, so I'll show you how these work at\nruntime soon. The part to focus on now is how the compiler actually resolves the\nidentifier.\n\n^code resolve-upvalue\n\nWe call this after failing to resolve a local variable in the current function's\nscope, so we know the variable isn't in the current compiler. Recall that\nCompiler stores a pointer to the Compiler for the enclosing function, and these\npointers form a linked chain that goes all the way to the root Compiler for the\ntop-level code. Thus, if the enclosing Compiler is `NULL`, we know we've reached\nthe outermost function without finding a local variable. The variable must be\n<span name=\"undefined\">global</span>, so we return -1.\n\n<aside name=\"undefined\">\n\nIt might end up being an entirely undefined variable and not even global. But in\nLox, we don't detect that error until runtime, so from the compiler's\nperspective, it's \"hopefully global\".\n\n</aside>\n\nOtherwise, we try to resolve the identifier as a *local* variable in the\n*enclosing* compiler. In other words, we look for it right outside the current\nfunction. 
For example:\n\n```lox\nfun outer() {\n  var x = 1;\n  fun inner() {\n    print x; // (1)\n  }\n  inner();\n}\n```\n\nWhen compiling the identifier expression at `(1)`, `resolveUpvalue()` looks for\na local variable `x` declared in `outer()`. If found -- like it is in this\nexample -- then we've successfully resolved the variable. We create an upvalue\nso that the inner function can access the variable through that. The upvalue is\ncreated here:\n\n^code add-upvalue\n\nThe compiler keeps an array of upvalue structures to track the closed-over\nidentifiers that it has resolved in the body of each function. Remember how the\ncompiler's Local array mirrors the stack slot indexes where locals live at\nruntime? This new upvalue array works the same way. The indexes in the\ncompiler's array match the indexes where upvalues will live in the ObjClosure at\nruntime.\n\nThis function adds a new upvalue to that array. It also keeps track of the\nnumber of upvalues the function uses. It stores that count directly in the\nObjFunction itself because we'll also <span name=\"bridge\">need</span> that\nnumber for use at runtime.\n\n<aside name=\"bridge\">\n\nLike constants and function arity, the upvalue count is another one of those\nlittle pieces of data that form the bridge between the compiler and runtime.\n\n</aside>\n\nThe `index` field tracks the closed-over local variable's slot index. That way\nthe compiler knows *which* variable in the enclosing function needs to be\ncaptured. We'll circle back to what that `isLocal` field is for before too long.\nFinally, `addUpvalue()` returns the index of the created upvalue in the\nfunction's upvalue list. That index becomes the operand to the `OP_GET_UPVALUE`\nand `OP_SET_UPVALUE` instructions.\n\nThat's the basic idea for resolving upvalues, but the function isn't fully\nbaked. A closure may reference the same variable in a surrounding function\nmultiple times. 
In that case, we don't want to waste time and memory creating a\nseparate upvalue for each identifier expression. To fix that, before we add a\nnew upvalue, we first check to see if the function already has an upvalue that\ncloses over that variable.\n\n^code existing-upvalue (1 before, 1 after)\n\nIf we find an upvalue in the array whose slot index matches the one we're\nadding, we just return that *upvalue* index and reuse it. Otherwise, we fall\nthrough and add the new upvalue.\n\nThese two functions access and modify a bunch of new state, so let's define\nthat. First, we add the upvalue count to ObjFunction.\n\n^code upvalue-count (1 before, 1 after)\n\nWe're conscientious C programmers, so we zero-initialize that when an\nObjFunction is first allocated.\n\n^code init-upvalue-count (1 before, 1 after)\n\nIn the compiler, we add a field for the upvalue array.\n\n^code upvalues-array (1 before, 1 after)\n\nFor simplicity, I gave it a fixed size. The `OP_GET_UPVALUE` and\n`OP_SET_UPVALUE` instructions encode an upvalue index using a single byte\noperand, so there's a restriction on how many upvalues a function can have --\nhow many unique variables it can close over. Given that, we can afford a static\narray that large. We also need to make sure the compiler doesn't overflow that\nlimit.\n\n^code too-many-upvalues (5 before, 1 after)\n\nFinally, the Upvalue struct type itself.\n\n^code upvalue-struct\n\nThe `index` field stores which local slot the upvalue is capturing. The\n`isLocal` field deserves its own section, which we'll get to next.\n\n### Flattening upvalues\n\nIn the example I showed before, the closure is accessing a variable declared in\nthe immediately enclosing function. Lox also supports accessing local variables\ndeclared in *any* enclosing scope, as in:\n\n```lox\nfun outer() {\n  var x = 1;\n  fun middle() {\n    fun inner() {\n      print x;\n    }\n  }\n}\n```\n\nHere, we're accessing `x` in `inner()`. 
That variable is defined not in\n`middle()`, but all the way out in `outer()`. We need to handle cases like this\ntoo. You *might* think that this isn't much harder since the variable will\nsimply be somewhere farther down on the stack. But consider this <span\nname=\"devious\">devious</span> example:\n\n<aside name=\"devious\">\n\nIf you work on programming languages long enough, you will develop a\nfinely honed skill at creating bizarre programs like this that are technically\nvalid but likely to trip up an implementation written by someone with a less\nperverse imagination than you.\n\n</aside>\n\n```lox\nfun outer() {\n  var x = \"value\";\n  fun middle() {\n    fun inner() {\n      print x;\n    }\n\n    print \"create inner closure\";\n    return inner;\n  }\n\n  print \"return from outer\";\n  return middle;\n}\n\nvar mid = outer();\nvar in = mid();\nin();\n```\n\nWhen you run this, it should print:\n\n```text\nreturn from outer\ncreate inner closure\nvalue\n```\n\nI know, it's convoluted. The important part is that `outer()` -- where `x` is\ndeclared -- returns and pops all of its variables off the stack before the\n*declaration* of `inner()` executes. So, at the point in time that we create the\nclosure for `inner()`, `x` is already off the stack.\n\nHere, I traced out the execution flow for you:\n\n<img src=\"image/closures/execution-flow.png\" alt=\"Tracing through the previous example program.\"/>\n\nSee how `x` is popped &#9312; before it is captured &#9313; and then later\naccessed &#9314;? We really have two problems:\n\n1.  We need to resolve local variables that are declared in surrounding\n    functions beyond the immediately enclosing one.\n\n2.  We need to be able to capture variables that have already left the stack.\n\nFortunately, we're in the middle of adding upvalues to the VM, and upvalues are\nexplicitly designed for tracking variables that have escaped the stack. 
So, in a\nclever bit of self-reference, we can use upvalues to allow upvalues to capture\nvariables declared outside of the immediately surrounding function.\n\nThe solution is to allow a closure to capture either a local variable or *an\nexisting upvalue* in the immediately enclosing function. If a deeply nested\nfunction references a local variable declared several hops away, we'll thread it\nthrough all of the intermediate functions by having each function capture an\nupvalue for the next function to grab.\n\n<img src=\"image/closures/linked-upvalues.png\" alt=\"An upvalue in inner() points to an upvalue in middle(), which points to a local variable in outer().\"/>\n\nIn the above example, `middle()` captures the local variable `x` in the\nimmediately enclosing function `outer()` and stores it in its own upvalue. It\ndoes this even though `middle()` itself doesn't reference `x`. Then, when the\ndeclaration of `inner()` executes, its closure grabs the *upvalue* from the\nObjClosure for `middle()` that captured `x`. A function captures -- either a\nlocal or upvalue -- *only* from the immediately surrounding function, which is\nguaranteed to still be around at the point that the inner function declaration\nexecutes.\n\nIn order to implement this, `resolveUpvalue()` becomes recursive.\n\n^code resolve-upvalue-recurse (4 before, 1 after)\n\nIt's only another three lines of code, but I found this function really\nchallenging to get right the first time. This in spite of the fact that I wasn't\ninventing anything new, just porting the concept over from Lua. Most recursive\nfunctions either do all their work before the recursive call (a **pre-order\ntraversal**, or \"on the way down\"), or they do all the work after the recursive\ncall (a **post-order traversal**, or \"on the way back up\"). This function does\nboth. The recursive call is right in the middle.\n\nWe'll walk through it slowly. First, we look for a matching local variable in\nthe enclosing function. 
If we find one, we capture that local and return. That's\nthe <span name=\"base\">base</span> case.\n\n<aside name=\"base\">\n\nThe other base case, of course, is if there is no enclosing function. In that\ncase, the variable can't be resolved lexically and is treated as global.\n\n</aside>\n\nOtherwise, we look for a local variable beyond the immediately enclosing\nfunction. We do that by recursively calling `resolveUpvalue()` on the\n*enclosing* compiler, not the current one. This series of `resolveUpvalue()`\ncalls works its way along the chain of nested compilers until it hits one of\nthe base cases -- either it finds an actual local variable to capture or it\nruns out of compilers.\n\nWhen a local variable is found, the most deeply <span name=\"outer\">nested</span>\ncall to `resolveUpvalue()` captures it and returns the upvalue index. That\nreturns to the next call for the inner function declaration. That call captures\nthe *upvalue* from the surrounding function, and so on. As each nested call to\n`resolveUpvalue()` returns, we drill back down into the innermost function\ndeclaration where the identifier we are resolving appears. At each step along\nthe way, we add an upvalue to the intervening function and pass the resulting\nupvalue index down to the next call.\n\n<aside name=\"outer\">\n\nEach recursive call to `resolveUpvalue()` walks *out* one level of function\nnesting. So an inner *recursive call* refers to an *outer* nested declaration.\nThe innermost recursive call to `resolveUpvalue()` that finds the local variable\nwill be for the *outermost* function, just inside the enclosing function where\nthat variable is actually declared.\n\n</aside>\n\nIt might help to walk through the original example when resolving `x`:\n\n<img src=\"image/closures/recursion.png\" alt=\"Tracing through a recursive call to resolveUpvalue().\"/>\n\nNote that the new call to `addUpvalue()` passes `false` for the `isLocal`\nparameter. 
Now you see that that flag controls whether the closure captures a\nlocal variable or an upvalue from the surrounding function.\n\nBy the time the compiler reaches the end of a function declaration, every\nvariable reference has been resolved as either a local, an upvalue, or a global.\nEach upvalue may in turn capture a local variable from the surrounding function,\nor an upvalue in the case of transitive closures. We finally have enough data to\nemit bytecode which creates a closure at runtime that captures all of the\ncorrect variables.\n\n^code capture-upvalues (1 before, 1 after)\n\nThe `OP_CLOSURE` instruction is unique in that it has a variably sized encoding.\nFor each upvalue the closure captures, there are two single-byte operands. Each\npair of operands specifies what that upvalue captures. If the first byte is one,\nit captures a local variable in the enclosing function. If zero, it captures one\nof the function's upvalues. The next byte is the local slot or upvalue index to\ncapture.\n\nThis odd encoding means we need some bespoke support in the disassembly code\nfor `OP_CLOSURE`.\n\n^code disassemble-upvalues (1 before, 1 after)\n\nFor example, take this script:\n\n```lox\nfun outer() {\n  var a = 1;\n  var b = 2;\n  fun middle() {\n    var c = 3;\n    var d = 4;\n    fun inner() {\n      print a + c + b + d;\n    }\n  }\n}\n```\n\nIf we disassemble the instruction that creates the closure for `inner()`, it\nprints this:\n\n```text\n0004    9 OP_CLOSURE          2 <fn inner>\n0006      |                     upvalue 0\n0008      |                     local 1\n0010      |                     upvalue 1\n0012      |                     local 2\n```\n\nWe have two other, simpler instructions to add disassembler support for.\n\n^code disassemble-upvalue-ops (2 before, 1 after)\n\nThese both have a single-byte operand, so there's nothing exciting going on. 
We\ndo need to add an include so the debug module can get to `AS_FUNCTION()`.\n\n^code debug-include-object (1 before, 1 after)\n\nWith that, our compiler is where we want it. For each function declaration, it\noutputs an `OP_CLOSURE` instruction followed by a series of operand byte pairs\nfor each upvalue it needs to capture at runtime. It's time to hop over to that\nside of the VM and get things running.\n\n## Upvalue Objects\n\nEach `OP_CLOSURE` instruction is now followed by the series of bytes that\nspecify the upvalues the ObjClosure should own. Before we process those\noperands, we need a runtime representation for upvalues.\n\n^code obj-upvalue\n\nWe know upvalues must manage closed-over variables that no longer live on the\nstack, which implies some amount of dynamic allocation. The easiest way to do\nthat in our VM is by building on the object system we already have. That way,\nwhen we implement a garbage collector in [the next chapter][gc], the GC can\nmanage memory for upvalues too.\n\n[gc]: garbage-collection.html\n\nThus, our runtime upvalue structure is an ObjUpvalue with the typical Obj header\nfield. Following that is a `location` field that points to the closed-over\nvariable. Note that this is a *pointer* to a Value, not a Value itself. It's a\nreference to a *variable*, not a *value*. This is important because it means\nthat when we assign to the variable the upvalue captures, we're assigning to the\nactual variable, not a copy. For example:\n\n```lox\nfun outer() {\n  var x = \"before\";\n  fun inner() {\n    x = \"assigned\";\n  }\n  inner();\n  print x;\n}\nouter();\n```\n\nThis program should print \"assigned\" even though the closure assigns to `x` and\nthe surrounding function accesses it.\n\nBecause upvalues are objects, we've got all the usual object machinery, starting\nwith a constructor-like function:\n\n^code new-upvalue-h (1 before, 1 after)\n\nIt takes the address of the slot where the closed-over variable lives. 
Here is\nthe implementation:\n\n^code new-upvalue\n\nWe simply initialize the object and store the pointer. That requires a new\nobject type.\n\n^code obj-type-upvalue (1 before, 1 after)\n\nAnd on the back side, a destructor-like function:\n\n^code free-upvalue (3 before, 1 after)\n\nMultiple closures can close over the same variable, so ObjUpvalue does not own\nthe variable it references. Thus, the only thing to free is the ObjUpvalue\nitself.\n\nAnd, finally, to print:\n\n^code print-upvalue (3 before, 1 after)\n\nPrinting isn't useful to end users. Upvalues are objects only so that we can\ntake advantage of the VM's memory management. They aren't first-class values\nthat a Lox user can directly access in a program. So this code will never\nactually execute... but it keeps the compiler from yelling at us about an\nunhandled switch case, so here we are.\n\n### Upvalues in closures\n\nWhen I first introduced upvalues, I said each closure has an array of them.\nWe've finally worked our way back to implementing that.\n\n^code upvalue-fields (1 before, 1 after)\n\n<span name=\"count\">Different</span> closures may have different numbers of\nupvalues, so we need a dynamic array. The upvalues themselves are dynamically\nallocated too, so we end up with a double pointer -- a pointer to a dynamically\nallocated array of pointers to upvalues. We also store the number of elements in\nthe array.\n\n<aside name=\"count\">\n\nStoring the upvalue count in the closure is redundant because the ObjFunction\nthat the ObjClosure references also keeps that count. As usual, this weird code\nis to appease the GC. 
The collector may need to know an ObjClosure's upvalue\narray size after the closure's corresponding ObjFunction has already been freed.\n\n</aside>\n\nWhen we create an ObjClosure, we allocate an upvalue array of the proper size,\nwhich we determined at compile time and stored in the ObjFunction.\n\n^code allocate-upvalue-array (1 before, 1 after)\n\nBefore creating the closure object itself, we allocate the array of upvalues and\ninitialize them all to `NULL`. This weird ceremony around memory is a careful\ndance to please the (forthcoming) garbage collection deities. It ensures the\nmemory manager never sees uninitialized memory.\n\nThen we store the array in the new closure, as well as copy the count over from\nthe ObjFunction.\n\n^code init-upvalue-fields (1 before, 1 after)\n\nWhen we free an ObjClosure, we also free the upvalue array.\n\n^code free-upvalues (1 before, 1 after)\n\nObjClosure does not own the ObjUpvalue objects themselves, but it does own *the\narray* containing pointers to those upvalues.\n\nWe fill the upvalue array over in the interpreter when it creates a closure.\nThis is where we walk through all of the operands after `OP_CLOSURE` to see what\nkind of upvalue each slot captures.\n\n^code interpret-capture-upvalues (1 before, 1 after)\n\nThis code is the magic moment when a closure comes to life. We iterate over each\nupvalue the closure expects. For each one, we read a pair of operand bytes. If\nthe upvalue closes over a local variable in the enclosing function, we let\n`captureUpvalue()` do the work.\n\nOtherwise, we capture an upvalue from the surrounding function. An `OP_CLOSURE`\ninstruction is emitted at the end of a function declaration. At the moment that\nwe are executing that declaration, the *current* function is the surrounding\none. That means the current function's closure is stored in the CallFrame at the\ntop of the callstack. 
So, to grab an upvalue from the enclosing function, we can\nread it right from the `frame` local variable, which caches a reference to that\nCallFrame.\n\nClosing over a local variable is more interesting. Most of the work happens in a\nseparate function, but first we calculate the argument to pass to it. We need to\ngrab a pointer to the captured local's slot in the surrounding function's stack\nwindow. That window begins at `frame->slots`, which points to slot zero. Adding\n`index` offsets that to the local slot we want to capture. We pass that pointer\nhere:\n\n^code capture-upvalue\n\nThis seems a little silly. All it does is create a new ObjUpvalue that captures\nthe given stack slot and returns it. Did we need a separate function for this?\nWell, no, not *yet*. But you know we are going to end up sticking more code in\nhere.\n\nFirst, let's wrap up what we're working on. Back in the interpreter code for\nhandling `OP_CLOSURE`, we eventually finish iterating through the upvalue\narray and initialize each one. When that completes, we have a new closure with\nan array full of upvalues pointing to variables.\n\nWith that in hand, we can implement the instructions that work with those\nupvalues.\n\n^code interpret-get-upvalue (1 before, 1 after)\n\nThe operand is the index into the current function's upvalue array. So we simply\nlook up the corresponding upvalue and dereference its location pointer to read\nthe value in that slot. Setting a variable is similar.\n\n^code interpret-set-upvalue (1 before, 1 after)\n\nWe <span name=\"assign\">take</span> the value on top of the stack and store it\ninto the slot pointed to by the chosen upvalue. Just as with the instructions\nfor local variables, it's important that these instructions are fast. User\nprograms are constantly reading and writing variables, so if that's slow,\neverything is slow. And, as usual, the way we make them fast is by keeping them\nsimple. 
These two new instructions are pretty good: no control flow, no complex\narithmetic, just a couple of pointer indirections and a `push()`.\n\n<aside name=\"assign\">\n\nThe set instruction doesn't *pop* the value from the stack because, remember,\nassignment is an expression in Lox. So the result of the assignment -- the\nassigned value -- needs to remain on the stack for the surrounding expression.\n\n</aside>\n\nThis is a milestone. As long as all of the variables remain on the stack, we\nhave working closures. Try this:\n\n```lox\nfun outer() {\n  var x = \"outside\";\n  fun inner() {\n    print x;\n  }\n  inner();\n}\nouter();\n```\n\nRun this, and it correctly prints \"outside\".\n\n## Closed Upvalues\n\nOf course, a key feature of closures is that they hold on to the variable as\nlong as needed, even after the function that declares the variable has returned.\nHere's another example that *should* work:\n\n```lox\nfun outer() {\n  var x = \"outside\";\n  fun inner() {\n    print x;\n  }\n\n  return inner;\n}\n\nvar closure = outer();\nclosure();\n```\n\nBut if you run it right now... who knows what it does? At runtime, it will end\nup reading from a stack slot that no longer contains the closed-over variable.\nLike I've mentioned a few times, the crux of the issue is that variables in\nclosures don't have stack semantics. That means we've got to hoist them off the\nstack when the function where they were declared returns. This final section of\nthe chapter does that.\n\n### Values and variables\n\nBefore we get to writing code, I want to dig into an important semantic point.\nDoes a closure close over a *value* or a *variable?* This isn't purely an <span\nname=\"academic\">academic</span> question. 
I'm not just splitting hairs.\nConsider:\n\n<aside name=\"academic\">\n\nIf Lox didn't allow assignment, it *would* be an academic question.\n\n</aside>\n\n```lox\nvar globalSet;\nvar globalGet;\n\nfun main() {\n  var a = \"initial\";\n\n  fun set() { a = \"updated\"; }\n  fun get() { print a; }\n\n  globalSet = set;\n  globalGet = get;\n}\n\nmain();\nglobalSet();\nglobalGet();\n```\n\nThe outer `main()` function creates two closures and stores them in <span\nname=\"global\">global</span> variables so that they outlive the execution of\n`main()` itself. Both of those closures capture the same variable. The first\nclosure assigns a new value to it and the second closure reads the variable.\n\n<aside name=\"global\">\n\nThe fact that I'm using a couple of global variables isn't significant. I needed\nsome way to return two values from a function, and without any kind of\ncollection type in Lox, my options were limited.\n\n</aside>\n\nWhat does the call to `globalGet()` print? If closures capture *values* then\neach closure gets its own copy of `a` with the value that `a` had at the point\nin time that the closure's function declaration executed. The call to\n`globalSet()` will modify `set()`'s copy of `a`, but `get()`'s copy will be\nunaffected. Thus, the call to `globalGet()` will print \"initial\".\n\nIf closures close over variables, then `get()` and `set()` will both capture --\nreference -- the *same mutable variable*. When `set()` changes `a`, it changes\nthe same `a` that `get()` reads from. There is only one `a`. That, in turn,\nimplies the call to `globalGet()` will print \"updated\".\n\nWhich is it? The answer for Lox and most other languages I know with closures is\nthe latter. Closures capture variables. You can think of them as capturing *the\nplace the value lives*. This is important to keep in mind as we deal with\nclosed-over variables that are no longer on the stack. 
When a variable moves to\nthe heap, we need to ensure that all closures capturing that variable retain a\nreference to its *one* new location. That way, when the variable is mutated, all\nclosures see the change.\n\n### Closing upvalues\n\nWe know that local variables always start out on the stack. This is faster, and\nlets our single-pass compiler emit code before it discovers the variable has\nbeen captured. We also know that closed-over variables need to move to the heap\nif the closure outlives the function where the captured variable is declared.\n\nFollowing Lua, we'll use **open upvalue** to refer to an upvalue that points to\na local variable still on the stack. When a variable moves to the heap, we are\n*closing* the upvalue and the result is, naturally, a **closed upvalue**. The\ntwo questions we need to answer are:\n\n1.  Where on the heap does the closed-over variable go?\n\n2.  When do we close the upvalue?\n\nThe answer to the first question is easy. We already have a convenient object on\nthe heap that represents a reference to a variable -- ObjUpvalue itself. The\nclosed-over variable will move into a new field right inside the ObjUpvalue\nstruct. That way we don't need to do any additional heap allocation to close an\nupvalue.\n\nThe second question is straightforward too. As long as the variable is on the\nstack, there may be code that refers to it there, and that code must work\ncorrectly. So the logical time to hoist the variable to the heap is as late as\npossible. 
If we move the local variable right when it goes out of scope, we are\ncertain that no code after that point will try to access it from the stack.\n<span name=\"after\">After</span> the variable is out of scope, the compiler will\nhave reported an error if any code tried to use it.\n\n<aside name=\"after\">\n\nBy \"after\" here, I mean in the lexical or textual sense -- code past the `}`\nfor the block containing the declaration of the closed-over variable.\n\n</aside>\n\nThe compiler already emits an `OP_POP` instruction when a local variable goes\nout of scope. If a variable is captured by a closure, we will instead emit a\ndifferent instruction to hoist that variable out of the stack and into its\ncorresponding upvalue. To do that, the compiler needs to know which <span\nname=\"param\">locals</span> are closed over.\n\n<aside name=\"param\">\n\nThe compiler doesn't pop parameters and locals declared immediately inside the\nbody of a function. We'll handle those too, in the runtime.\n\n</aside>\n\nThe compiler already maintains an array of Upvalue structs for each local\nvariable in the function to track exactly that state. That array is good for\nanswering \"Which variables does this closure use?\" But it's poorly suited for\nanswering, \"Does *any* function capture this local variable?\" In particular,\nonce the Compiler for some closure has finished, the Compiler for the enclosing\nfunction whose variable has been captured no longer has access to any of the\nupvalue state.\n\nIn other words, the compiler maintains pointers from upvalues to the locals they\ncapture, but not in the other direction. So we first need to add some extra\ntracking inside the existing Local struct so that we can tell if a given local\nis captured by a closure.\n\n^code is-captured-field (1 before, 1 after)\n\nThis field is `true` if the local is captured by any later nested function\ndeclaration. 
Initially, all locals are not captured.\n\n^code init-is-captured (1 before, 1 after)\n\n<span name=\"zero\">Likewise</span>, the special \"slot zero local\" that the\ncompiler implicitly declares is not captured.\n\n<aside name=\"zero\">\n\nLater in the book, it *will* become possible for a user to capture this\nvariable. Just building some anticipation here.\n\n</aside>\n\n^code init-zero-local-is-captured (1 before, 1 after)\n\nWhen resolving an identifier, if we end up creating an upvalue for a local\nvariable, we mark it as captured.\n\n^code mark-local-captured (1 before, 1 after)\n\nNow, at the end of a block scope when the compiler emits code to free the stack\nslots for the locals, we can tell which ones need to get hoisted onto the heap.\nWe'll use a new instruction for that.\n\n^code end-scope (3 before, 2 after)\n\nThe instruction requires no operand. We know that the variable will always be\nright on top of the stack at the point that this instruction executes. We\ndeclare the instruction.\n\n^code close-upvalue-op (1 before, 1 after)\n\nAnd add trivial disassembler support for it:\n\n^code disassemble-close-upvalue (1 before, 1 after)\n\nExcellent. Now the generated bytecode tells the runtime exactly when each\ncaptured local variable must move to the heap. Better, it does so only for the\nlocals that *are* used by a closure and need this special treatment. This aligns\nwith our general performance goal that we want users to pay only for\nfunctionality that they use. Variables that aren't used by closures live and die\nentirely on the stack just as they did before.\n\n### Tracking open upvalues\n\nLet's move over to the runtime side. Before we can interpret `OP_CLOSE_UPVALUE`\ninstructions, we have an issue to resolve. Earlier, when I talked about whether\nclosures capture variables or values, I said it was important that if multiple\nclosures access the same variable that they end up with a reference to the\nexact same storage location in memory. 
That way if one closure writes to the\nvariable, the other closure sees the change.\n\nRight now, if two closures capture the same <span name=\"indirect\">local</span>\nvariable, the VM creates a separate ObjUpvalue for each one. The necessary sharing\nis missing. When we move the variable off the stack, if we move it into only one\nof the upvalues, the other upvalue will have an orphaned value.\n\n<aside name=\"indirect\">\n\nThe VM *does* share upvalues if one closure captures an *upvalue* from a\nsurrounding function. The nested case works correctly. But if two *sibling*\nclosures capture the same local variable, they each create a separate\nObjUpvalue.\n\n</aside>\n\nTo fix that, whenever the VM needs an upvalue that captures a particular local\nvariable slot, we will first search for an existing upvalue pointing to that\nslot. If found, we reuse that. The challenge is that all of the previously\ncreated upvalues are squirreled away inside the upvalue arrays of the various\nclosures. Those closures could be anywhere in the VM's memory.\n\nThe first step is to give the VM its own list of all open upvalues that point to\nvariables still on the stack. Searching a list each time the VM needs an upvalue\nsounds like it might be slow, but in practice, it's not bad. The number of\nvariables on the stack that actually get closed over tends to be small. And\nfunction declarations that <span name=\"create\">create</span> closures are rarely\non performance-critical execution paths in the user's program.\n\n<aside name=\"create\">\n\nClosures are frequently *invoked* inside hot loops. Think about the closures\npassed to typical higher-order functions on collections like [`map()`][map] and\n[`filter()`][filter]. That should be fast. 
But the function declaration that\n*creates* the closure happens only once and is usually outside of the loop.\n\n[map]: https://en.wikipedia.org/wiki/Map_(higher-order_function)\n[filter]: https://en.wikipedia.org/wiki/Filter_(higher-order_function)\n\n</aside>\n\nEven better, we can order the list of open upvalues by the stack slot index they\npoint to. The common case is that a slot has *not* already been captured --\nsharing variables between closures is uncommon -- and closures tend to capture\nlocals near the top of the stack. If we store the open upvalue array in stack\nslot order, as soon as we step past the slot where the local we're capturing\nlives, we know it won't be found. When that local is near the top of the stack,\nwe can exit the loop pretty early.\n\nMaintaining a sorted list requires inserting elements in the middle efficiently.\nThat suggests using a linked list instead of a dynamic array. Since we defined\nthe ObjUpvalue struct ourselves, the easiest implementation is an intrusive list\nthat puts the next pointer right inside the ObjUpvalue struct itself.\n\n^code next-field (1 before, 1 after)\n\nWhen we allocate an upvalue, it is not attached to any list yet so the link is\n`NULL`.\n\n^code init-next (1 before, 1 after)\n\nThe VM owns the list, so the head pointer goes right inside the main VM struct.\n\n^code open-upvalues-field (1 before, 1 after)\n\nThe list starts out empty.\n\n^code init-open-upvalues (1 before, 1 after)\n\nStarting with the first upvalue pointed to by the VM, each open upvalue points\nto the next open upvalue that references a local variable farther down the\nstack. 
This script, for example,\n\n```lox\n{\n  var a = 1;\n  fun f() {\n    print a;\n  }\n  var b = 2;\n  fun g() {\n    print b;\n  }\n  var c = 3;\n  fun h() {\n    print c;\n  }\n}\n```\n\nshould produce a series of linked upvalues like so:\n\n<img src=\"image/closures/linked-list.png\" alt=\"Three upvalues in a linked list.\"/>\n\nWhenever we close over a local variable, before creating a new upvalue, we look\nfor an existing one in the list.\n\n^code look-for-existing-upvalue (1 before, 1 after)\n\nWe start at the <span name=\"head\">head</span> of the list, which is the upvalue\nclosest to the top of the stack. We walk through the list, using a little\npointer comparison to iterate past every upvalue pointing to slots above the one\nwe're looking for. While we do that, we keep track of the preceding upvalue on\nthe list. We'll need to update that node's `next` pointer if we end up inserting\na node after it.\n\n<aside name=\"head\">\n\nIt's a singly linked list. It's not like we have any other choice than to start\nat the head and go forward from there.\n\n</aside>\n\nThere are three reasons we can exit the loop:\n\n1.  **The local slot we stopped at *is* the slot we're looking for.** We found\n    an existing upvalue capturing the variable, so we reuse that upvalue.\n\n2.  **We ran out of upvalues to search.** When `upvalue` is `NULL`, it means\n    every open upvalue in the list points to locals above the slot we're looking\n    for, or (more likely) the upvalue list is empty. Either way, we didn't find\n    an upvalue for our slot.\n\n3.  **We found an upvalue whose local slot is *below* the one we're looking\n    for.** Since the list is sorted, that means we've gone past the slot we are\n    closing over, and thus there must not be an existing upvalue for it.\n\nIn the first case, we're done and we've returned. 
Otherwise, we create a new\nupvalue for our local slot and insert it into the list at the right location.\n\n^code insert-upvalue-in-list (1 before, 1 after)\n\nThe current incarnation of this function already creates the upvalue, so we only\nneed to add code to insert the upvalue into the list. We exited the list\ntraversal by either going past the end of the list, or by stopping on the first\nupvalue whose stack slot is below the one we're looking for. In either case,\nthat means we need to insert the new upvalue *before* the object pointed at by\n`upvalue` (which may be `NULL` if we hit the end of the list).\n\nAs you may have learned in Data Structures 101, to insert a node into a linked\nlist, you set the `next` pointer of the previous node to point to your new one.\nWe have been conveniently keeping track of that preceding node as we walked the\nlist. We also need to handle the <span name=\"double\">special</span> case where\nwe are inserting a new upvalue at the head of the list, in which case the \"next\"\npointer is the VM's head pointer.\n\n<aside name=\"double\">\n\nThere is a shorter implementation that handles updating either the head pointer\nor the previous upvalue's `next` pointer uniformly by using a pointer to a\npointer, but that kind of code confuses almost everyone who hasn't reached some\nZen master level of pointer expertise. I went with the basic `if` statement\napproach.\n\n</aside>\n\nWith this updated function, the VM now ensures that there is only ever a single\nObjUpvalue for any given local slot. If two closures capture the same variable,\nthey will get the same upvalue. We're ready to move those upvalues off the\nstack now.\n\n### Closing upvalues at runtime\n\nThe compiler helpfully emits an `OP_CLOSE_UPVALUE` instruction to tell the VM\nexactly when a local variable should be hoisted onto the heap. 
Executing that\ninstruction is the interpreter's responsibility.\n\n^code interpret-close-upvalue (1 before, 1 after)\n\nWhen we reach the instruction, the variable we are hoisting is right on top of\nthe stack. We call a helper function, passing the address of that stack slot.\nThat function is responsible for closing the upvalue and moving the local from\nthe stack to the heap. After that, the VM is free to discard the stack slot,\nwhich it does by calling `pop()`.\n\nThe fun stuff happens here:\n\n^code close-upvalues\n\nThis function takes a pointer to a stack slot. It closes every open upvalue it\ncan find that points to that slot or any slot above it on the stack. Right now,\nwe pass a pointer only to the top slot on the stack, so the \"or above it\" part\ndoesn't come into play, but it will soon.\n\nTo do this, we walk the VM's list of open upvalues, again from top to bottom. If\nan upvalue's location points into the range of slots we're closing, we close the\nupvalue. Otherwise, once we reach an upvalue outside of the range, we know the\nrest will be too, so we stop iterating.\n\nThe way an upvalue gets closed is pretty <span name=\"cool\">cool</span>. First,\nwe copy the variable's value into the `closed` field in the ObjUpvalue. That's\nwhere closed-over variables live on the heap. The `OP_GET_UPVALUE` and\n`OP_SET_UPVALUE` instructions need to look for the variable there after it's\nbeen moved. We could add some conditional logic in the interpreter code for\nthose instructions to check some flag for whether the upvalue is open or closed.\n\nBut there is already a level of indirection in play -- those instructions\ndereference the `location` pointer to get to the variable's value. When the\nvariable moves from the stack to the `closed` field, we simply update that\n`location` to the address of the ObjUpvalue's *own* `closed` field.\n\n<aside name=\"cool\">\n\nI'm not praising myself here. 
This is all the Lua dev team's innovation.\n\n</aside>\n\n<img src=\"image/closures/closing.png\" alt=\"Moving a value from the stack to the upvalue's 'closed' field and then pointing the 'location' field to it.\"/>\n\nWe don't need to change how `OP_GET_UPVALUE` and `OP_SET_UPVALUE` are\ninterpreted at all. That keeps them simple, which in turn keeps them fast. We do\nneed to add the new field to ObjUpvalue, though.\n\n^code closed-field (1 before, 1 after)\n\nAnd we should zero it out when we create an ObjUpvalue so there's no\nuninitialized memory floating around.\n\n^code init-closed (1 before, 1 after)\n\nWhenever the compiler reaches the end of a block, it discards all local\nvariables in that block and emits an `OP_CLOSE_UPVALUE` for each local variable\nthat was closed over. The compiler <span name=\"close\">does</span> *not* emit any\ninstructions at the end of the outermost block scope that defines a function\nbody. That scope contains the function's parameters and any locals declared\nimmediately inside the function. Those need to get closed too.\n\n<aside name=\"close\">\n\nThere's nothing *preventing* us from closing the outermost function scope in the\ncompiler and emitting `OP_POP` and `OP_CLOSE_UPVALUE` instructions. Doing so is\njust unnecessary because the runtime discards all of the stack slots used by the\nfunction implicitly when it pops the call frame.\n\n</aside>\n\nThis is the reason `closeUpvalues()` accepts a pointer to a stack slot. When a\nfunction returns, we call that same helper and pass in the first stack slot\nowned by the function.\n\n^code return-close-upvalues (1 before, 1 after)\n\nBy passing the first slot in the function's stack window, we close every\nremaining open upvalue owned by the returning function. And with that, we now\nhave a fully functioning closure implementation. Closed-over variables live as\nlong as they are needed by the functions that capture them.\n\nThis was a lot of work! 
In jlox, closures fell out naturally from our\nenvironment representation. In clox, we had to add a lot of code -- new bytecode\ninstructions, more data structures in the compiler, and new runtime objects. The\nVM very much treats variables in closures as different from other variables.\n\nThere is a rationale for that. In terms of implementation complexity, jlox gave\nus closures \"for free\". But in terms of *performance*, jlox's closures are\nanything but. By allocating *all* environments on the heap, jlox pays a\nsignificant performance price for *all* local variables, even the majority which\nare never captured by closures.\n\nWith clox, we have a more complex system, but that allows us to tailor the\nimplementation to fit the two use patterns we observe for local variables. For\nmost variables which do have stack semantics, we allocate them entirely on the\nstack which is simple and fast. Then, for the few local variables where that\ndoesn't work, we have a second slower path we can opt in to as needed.\n\nFortunately, users don't perceive the complexity. From their perspective, local\nvariables in Lox are simple and uniform. The *language itself* is as simple as\njlox's implementation. But under the hood, clox is watching what the user does\nand optimizing for their specific uses. As your language implementations grow in\nsophistication, you'll find yourself doing this more. A large fraction of\n\"optimization\" is about adding special case code that detects certain uses and\nprovides a custom-built, faster path for code that fits that pattern.\n\nWe have lexical scoping fully working in clox now, which is a major milestone.\nAnd, now that we have functions and variables with complex lifetimes, we also\nhave a *lot* of objects floating around in clox's heap, with a web of pointers\nstringing them together. 
The [next step][] is figuring out how to manage that\nmemory so that we can free some of those objects when they're no longer needed.\n\n[next step]: garbage-collection.html\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  Wrapping every ObjFunction in an ObjClosure introduces a level of\n    indirection that has a performance cost. That cost isn't necessary for\n    functions that do not close over any variables, but it does let the runtime\n    treat all calls uniformly.\n\n    Change clox to only wrap functions in ObjClosures that need upvalues. How\n    does the code complexity and performance compare to always wrapping\n    functions? Take care to benchmark programs that do and do not use closures.\n    How should you weight the importance of each benchmark? If one gets slower\n    and one faster, how do you decide what trade-off to make to choose an\n    implementation strategy?\n\n2.  Read the design note below. I'll wait. Now, how do you think Lox *should*\n    behave? Change the implementation to create a new variable for each loop\n    iteration.\n\n3.  A [famous koan][koan] teaches us that \"objects are a poor man's closure\"\n    (and vice versa). Our VM doesn't support objects yet, but now that we have\n    closures we can approximate them. Using closures, write a Lox program that\n    models two-dimensional vector \"objects\". It should:\n\n    *   Define a \"constructor\" function to create a new vector with the given\n        *x* and *y* coordinates.\n\n    *   Provide \"methods\" to access the *x* and *y* coordinates of values\n        returned from that constructor.\n\n    *   Define an addition \"method\" that adds two vectors and produces a third.\n\n\n[koan]: http://wiki.c2.com/?ClosuresAndObjectsAreEquivalent\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: Closing Over the Loop Variable\n\nClosures capture variables. When two closures capture the same variable, they\nshare a reference to the same underlying storage location. 
This fact is visible\nwhen new values are assigned to the variable. Obviously, if two closures capture\n*different* variables, there is no sharing.\n\n```lox\nvar globalOne;\nvar globalTwo;\n\nfun main() {\n  {\n    var a = \"one\";\n    fun one() {\n      print a;\n    }\n    globalOne = one;\n  }\n\n  {\n    var a = \"two\";\n    fun two() {\n      print a;\n    }\n    globalTwo = two;\n  }\n}\n\nmain();\nglobalOne();\nglobalTwo();\n```\n\nThis prints \"one\" then \"two\". In this example, it's pretty clear that the two\n`a` variables are different. But it's not always so obvious. Consider:\n\n```lox\nvar globalOne;\nvar globalTwo;\n\nfun main() {\n  for (var a = 1; a <= 2; a = a + 1) {\n    fun closure() {\n      print a;\n    }\n    if (globalOne == nil) {\n      globalOne = closure;\n    } else {\n      globalTwo = closure;\n    }\n  }\n}\n\nmain();\nglobalOne();\nglobalTwo();\n```\n\nThe code is convoluted because Lox has no collection types. The important part\nis that the `main()` function does two iterations of a `for` loop. Each time\nthrough the loop, it creates a closure that captures the loop variable. It\nstores the first closure in `globalOne` and the second in `globalTwo`.\n\nThere are definitely two different closures. Do they close over two different\nvariables? Is there only one `a` for the entire duration of the loop, or does\neach iteration get its own distinct `a` variable?\n\nThe script here is strange and contrived, but this does show up in real code\nin languages that aren't as minimal as clox. Here's a JavaScript example:\n\n```js\nvar closures = [];\nfor (var i = 1; i <= 2; i++) {\n  closures.push(function () { console.log(i); });\n}\n\nclosures[0]();\nclosures[1]();\n```\n\nDoes this print \"1\" then \"2\", or does it print <span name=\"three\">\"3\"</span>\ntwice? You may be surprised to hear that it prints \"3\" twice. 
In this JavaScript\nprogram, there is only a single `i` variable whose lifetime includes all\niterations of the loop, including the final exit.\n\n<aside name=\"three\">\n\nYou're wondering how *three* enters the picture? After the second iteration,\n`i++` is executed, which increments `i` to three. That's what causes `i <= 2` to\nevaluate to false and end the loop. If `i` never reached three, the loop would\nrun forever.\n\n</aside>\n\nIf you're familiar with JavaScript, you probably know that variables declared\nusing `var` are implicitly *hoisted* to the surrounding function or top-level\nscope. It's as if you really wrote this:\n\n```js\nvar closures = [];\nvar i;\nfor (i = 1; i <= 2; i++) {\n  closures.push(function () { console.log(i); });\n}\n\nclosures[0]();\nclosures[1]();\n```\n\nAt that point, it's clearer that there is only a single `i`. Now consider if\nyou change the program to use the newer `let` keyword:\n\n```js\nvar closures = [];\nfor (let i = 1; i <= 2; i++) {\n  closures.push(function () { console.log(i); });\n}\n\nclosures[0]();\nclosures[1]();\n```\n\nDoes this new program behave the same? Nope. In this case, it prints \"1\" then\n\"2\". Each closure gets its own `i`. That's sort of strange when you think about\nit. The increment clause is `i++`. That looks very much like it is assigning to\nand mutating an existing variable, not creating a new one.\n\nLet's try some other languages. Here's Python:\n\n```python\nclosures = []\nfor i in range(1, 3):\n  closures.append(lambda: print(i))\n\nclosures[0]()\nclosures[1]()\n```\n\nPython doesn't really have block scope. Variables are implicitly declared and\nare automatically scoped to the surrounding function. Kind of like hoisting in\nJS, now that I think about it. So both closures capture the same variable.\nUnlike C, though, we don't exit the loop by incrementing `i` *past* the last\nvalue, so this prints \"2\" twice.\n\nWhat about Ruby? Ruby has two typical ways to iterate numerically. 
Here's the\nclassic imperative style:\n\n```ruby\nclosures = []\nfor i in 1..2 do\n  closures << lambda { puts i }\nend\n\nclosures[0].call\nclosures[1].call\n```\n\nThis, like Python, prints \"2\" twice. But the more idiomatic Ruby style is using\na higher-order `each()` method on range objects:\n\n```ruby\nclosures = []\n(1..2).each do |i|\n  closures << lambda { puts i }\nend\n\nclosures[0].call\nclosures[1].call\n```\n\nIf you're not familiar with Ruby, the `do |i| ... end` part is basically a\nclosure that gets created and passed to the `each()` method. The `|i|` is the\nparameter signature for the closure. The `each()` method invokes that closure\ntwice, passing in 1 for `i` the first time and 2 the second time.\n\nIn this case, the \"loop variable\" is really a function parameter. And, since\neach iteration of the loop is a separate invocation of the function, those are\ndefinitely separate variables for each call. So this prints \"1\" then \"2\".\n\nIf a language has a higher-level iterator-based looping structure like `foreach`\nin C#, Java's \"enhanced for\", `for-of` in JavaScript, `for-in` in Dart, etc.,\nthen I think it's natural to the reader to have each iteration create a new\nvariable. The code *looks* like a new variable because the loop header looks\nlike a variable declaration. And there's no increment expression that looks like\nit's mutating that variable to advance to the next step.\n\nIf you dig around StackOverflow and other places, you find evidence that this is\nwhat users expect, because they are very surprised when they *don't* get it. In\nparticular, C# originally did *not* create a new loop variable for each\niteration of a `foreach` loop. This was such a frequent source of user confusion\nthat they took the very rare step of shipping a breaking change to the language.\nIn C# 5, each iteration creates a fresh variable.\n\nOld C-style `for` loops are harder. The increment clause really does look like\nmutation. 
That implies there is a single variable that's getting updated each\nstep. But it's almost never *useful* for each iteration to share a loop\nvariable. The only time you can even detect this is when closures capture it.\nAnd it's rarely helpful to have a closure that references a variable whose value\nis whatever value caused you to exit the loop.\n\nThe pragmatically useful answer is probably to do what JavaScript does with\n`let` in `for` loops. Make it look like mutation but actually create a new\nvariable each time, because that's what users want. It is kind of weird when you\nthink about it, though.\n\n</div>\n"
  },
  {
    "path": "book/compiling-expressions.md",
    "content": "> In the middle of the journey of our life I found myself within a dark woods\n> where the straight way was lost.\n>\n> <cite>Dante Alighieri, <em>Inferno</em></cite>\n\nThis chapter is exciting for not one, not two, but *three* reasons. First, it\nprovides the final segment of our VM's execution pipeline. Once in place, we can\nplumb the user's source code from scanning all the way through to executing it.\n\n<img src=\"image/compiling-expressions/pipeline.png\" alt=\"Lowering the 'compiler' section of pipe between 'scanner' and 'VM'.\" />\n\nSecond, we get to write an actual, honest-to-God *compiler*. It parses source\ncode and outputs a low-level series of binary instructions. Sure, it's <span\nname=\"wirth\">bytecode</span> and not some chip's native instruction set, but\nit's way closer to the metal than jlox was. We're about to be real language\nhackers.\n\n<aside name=\"wirth\">\n\nBytecode was good enough for Niklaus Wirth, and no one questions his street\ncred.\n\n</aside>\n\n<span name=\"pratt\">Third</span> and finally, I get to show you one of my\nabsolute favorite algorithms: Vaughan Pratt's \"top-down operator precedence\nparsing\". It's the most elegant way I know to parse expressions. It gracefully\nhandles prefix operators, postfix, infix, *mixfix*, any kind of *-fix* you got.\nIt deals with precedence and associativity without breaking a sweat. I love it.\n\n<aside name=\"pratt\">\n\nPratt parsers are a sort of oral tradition in industry. No compiler or language\nbook I've read teaches them. Academia is very focused on generated parsers, and\nPratt's technique is for handwritten ones, so it gets overlooked.\n\nBut in production compilers, where hand-rolled parsers are common, you'd be\nsurprised how many people know it. 
Ask where they learned it, and it's always,\n\"Oh, I worked on this compiler years ago and my coworker said they took it from\nthis old front end...\"\n\n</aside>\n\nAs usual, before we get to the fun stuff, we've got some preliminaries to work\nthrough. You have to eat your vegetables before you get dessert. First, let's\nditch that temporary scaffolding we wrote for testing the scanner and replace it\nwith something more useful.\n\n^code interpret-chunk (1 before, 1 after)\n\nWe create a new empty chunk and pass it over to the compiler. The compiler will\ntake the user's program and fill up the chunk with bytecode. At least, that's\nwhat it will do if the program doesn't have any compile errors. If it does\nencounter an error, `compile()` returns `false` and we discard the unusable\nchunk.\n\nOtherwise, we send the completed chunk over to the VM to be executed. When the\nVM finishes, we free the chunk and we're done. As you can see, the signature to\n`compile()` is different now.\n\n^code compile-h (2 before, 2 after)\n\nWe pass in the chunk where the compiler will write the code, and then\n`compile()` returns whether or not compilation succeeded. We make the same\nchange to the signature in the implementation.\n\n^code compile-signature (2 before, 1 after)\n\nThat call to `initScanner()` is the only line that survives this chapter. Rip\nout the temporary code we wrote to test the scanner and replace it with these\nthree lines:\n\n^code compile-chunk (1 before, 1 after)\n\nThe call to `advance()` \"primes the pump\" on the scanner. We'll see what it does\nsoon. Then we parse a single expression. We aren't going to do statements yet,\nso that's the only subset of the grammar we support. We'll revisit this when we\n[add statements in a few chapters][globals]. 
After we compile the expression, we\nshould be at the end of the source code, so we check for the sentinel EOF token.\n\n[globals]: global-variables.html\n\nWe're going to spend the rest of the chapter making this function work,\nespecially that little `expression()` call. Normally, we'd dive right into that\nfunction definition and work our way through the implementation from top to\nbottom.\n\nThis chapter is <span name=\"blog\">different</span>. Pratt's parsing technique is\nremarkably simple once you have it all loaded in your head, but it's a little\ntricky to break into bite-sized pieces. It's recursive, of course, which is part\nof the problem. But it also relies on a big table of data. As we build up the\nalgorithm, that table grows additional columns.\n\n<aside name=\"blog\">\n\nIf this chapter isn't clicking with you and you'd like another take on the\nconcepts, I wrote an article that teaches the same algorithm but using Java and\nan object-oriented style: [\"Pratt Parsing: Expression Parsing Made Easy\"][blog].\n\n[blog]: http://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-expression-parsing-made-easy/\n\n</aside>\n\nI don't want to revisit 40-something lines of code each time we extend the\ntable. So we're going to work our way into the core of the parser from the\noutside and cover all of the surrounding bits before we get to the juicy center.\nThis will require a little more patience and mental scratch space than most\nchapters, but it's the best I could do.\n\n## Single-Pass Compilation\n\nA compiler has roughly two jobs. It parses the user's source code to understand\nwhat it means. Then it takes that knowledge and outputs low-level instructions\nthat produce the same semantics. Many languages split those two roles into two\nseparate <span name=\"passes\">passes</span> in the implementation. 
A parser\nproduces an AST -- just like jlox does -- and then a code generator traverses\nthe AST and outputs target code.\n\n<aside name=\"passes\">\n\nIn fact, most sophisticated optimizing compilers have a heck of a lot more than\ntwo passes. Determining not just *what* optimization passes to have, but how to\norder them to squeeze the most performance out of the compiler -- since the\noptimizations often interact in complex ways -- is somewhere between an \"open\narea of research\" and a \"dark art\".\n\n</aside>\n\nIn clox, we're taking an old-school approach and merging these two passes into\none. Back in the day, language hackers did this because computers literally\ndidn't have enough memory to store an entire source file's AST. We're doing it\nbecause it keeps our compiler simpler, which is a real asset when programming in\nC.\n\nSingle-pass compilers like we're going to build don't work well for all\nlanguages. Since the compiler has only a peephole view into the user's program\nwhile generating code, the language must be designed such that you don't need\nmuch surrounding context to understand a piece of syntax. Fortunately, tiny,\ndynamically typed Lox is <span name=\"lox\">well-suited</span> to that.\n\n<aside name=\"lox\">\n\nNot that this should come as much of a surprise. I did design the language\nspecifically for this book after all.\n\n<img src=\"image/compiling-expressions/keyhole.png\" alt=\"Peering through a keyhole at 'var x;'\" />\n\n</aside>\n\nWhat this means in practical terms is that our \"compiler\" C module has\nfunctionality you'll recognize from jlox for parsing -- consuming tokens,\nmatching expected token types, etc. And it also has functions for code gen --\nemitting bytecode and adding constants to the destination chunk. (And it means\nI'll use \"parsing\" and \"compiling\" interchangeably throughout this and later\nchapters.)\n\nWe'll build the parsing and code generation halves first. 
Then we'll stitch them\ntogether with the code in the middle that uses Pratt's technique to parse Lox's\nparticular grammar and output the right bytecode.\n\n## Parsing Tokens\n\nFirst up, the front half of the compiler. This function's name should sound\nfamiliar.\n\n^code advance (1 before)\n\nJust like in jlox, it steps forward through the token stream. It asks the\nscanner for the next token and stores it for later use. Before doing that, it\ntakes the old `current` token and stashes that in a `previous` field. That will\ncome in handy later so that we can get at the lexeme after we match a token.\n\nThe code to read the next token is wrapped in a loop. Remember, clox's scanner\ndoesn't report lexical errors. Instead, it creates special *error tokens* and\nleaves it up to the parser to report them. We do that here.\n\nWe keep looping, reading tokens and reporting the errors, until we hit a\nnon-error one or reach the end. That way, the rest of the parser sees only valid\ntokens. The current and previous token are stored in this struct:\n\n^code parser (1 before, 2 after)\n\nLike we did in other modules, we have a single global variable of this struct\ntype so we don't need to pass the state around from function to function in the\ncompiler.\n\n### Handling syntax errors\n\nIf the scanner hands us an error token, we need to actually tell the user. That\nhappens using this:\n\n^code error-at-current\n\nWe pull the location out of the current token in order to tell the user where\nthe error occurred and forward it to `errorAt()`. More often, we'll report an\nerror at the location of the token we just consumed, so we give the shorter name\nto this other function:\n\n^code error\n\nThe actual work happens here:\n\n^code error-at\n\nFirst, we print where the error occurred. We try to show the lexeme if it's\nhuman-readable. Then we print the error message itself. After that, we set this\n`hadError` flag. 
That records whether any errors occurred during compilation.\nThis field also lives in the parser struct.\n\n^code had-error-field (1 before, 1 after)\n\nEarlier I said that `compile()` should return `false` if an error occurred. Now\nwe can make it do that.\n\n^code return-had-error (1 before, 1 after)\n\nI've got another flag to introduce for error handling. We want to avoid error\ncascades. If the user has a mistake in their code and the parser gets confused\nabout where it is in the grammar, we don't want it to spew out a whole pile of\nmeaningless knock-on errors after the first one.\n\nWe fixed that in jlox using panic mode error recovery. In the Java interpreter,\nwe threw an exception to unwind out of all of the parser code to a point where\nwe could skip tokens and resynchronize. We don't have <span\nname=\"setjmp\">exceptions</span> in C. Instead, we'll do a little smoke and\nmirrors. We add a flag to track whether we're currently in panic mode.\n\n<aside name=\"setjmp\">\n\nThere is `setjmp()` and `longjmp()`, but I'd rather not go there. Those make it\ntoo easy to leak memory, forget to maintain invariants, or otherwise have a Very\nBad Day.\n\n</aside>\n\n^code panic-mode-field (1 before, 1 after)\n\nWhen an error occurs, we set it.\n\n^code set-panic-mode (1 before, 1 after)\n\nAfter that, we go ahead and keep compiling as normal as if the error never\noccurred. The bytecode will never get executed, so it's harmless to keep on\ntrucking. The trick is that while the panic mode flag is set, we simply suppress\nany other errors that get detected.\n\n^code check-panic-mode (1 before, 1 after)\n\nThere's a good chance the parser will go off in the weeds, but the user won't\nknow because the errors all get swallowed. Panic mode ends when the parser\nreaches a synchronization point. 
For Lox, we chose statement boundaries, so when\nwe later add those to our compiler, we'll clear the flag there.\n\nThese new fields need to be initialized.\n\n^code init-parser-error (1 before, 1 after)\n\nAnd to display the errors, we need a standard header.\n\n^code compiler-include-stdlib (1 before, 2 after)\n\nThere's one last parsing function, another old friend from jlox.\n\n^code consume\n\nIt's similar to `advance()` in that it reads the next token. But it also\nvalidates that the token has an expected type. If not, it reports an error. This\nfunction is the foundation of most syntax errors in the compiler.\n\nOK, that's enough on the front end for now.\n\n## Emitting Bytecode\n\nAfter we parse and understand a piece of the user's program, the next step is to\ntranslate that to a series of bytecode instructions. It starts with the easiest\npossible step: appending a single byte to the chunk.\n\n^code emit-byte\n\nIt's hard to believe great things will flow through such a simple function. It\nwrites the given byte, which may be an opcode or an operand to an instruction.\nIt sends in the previous token's line information so that runtime errors are\nassociated with that line.\n\nThe chunk that we're writing gets passed into `compile()`, but it needs to make\nits way to `emitByte()`. To do that, we rely on this intermediary function:\n\n^code compiling-chunk (1 before, 1 after)\n\nRight now, the chunk pointer is stored in a module-level variable like we store\nother global state. Later, when we start compiling user-defined functions, the\nnotion of \"current chunk\" gets more complicated. 
To avoid having to go back and\nchange a lot of code, I encapsulate that logic in the `currentChunk()` function.\n\nWe initialize this new module variable before we write any bytecode:\n\n^code init-compile-chunk (2 before, 2 after)\n\nThen, at the very end, when we're done compiling the chunk, we wrap things up.\n\n^code finish-compile (1 before, 1 after)\n\nThat calls this:\n\n^code end-compiler\n\nIn this chapter, our VM deals only with expressions. When you run clox, it will\nparse, compile, and execute a single expression, then print the result. To print\nthat value, we are temporarily using the `OP_RETURN` instruction. So we have the\ncompiler add one of those to the end of the chunk.\n\n^code emit-return\n\nWhile we're here in the back end we may as well make our lives easier.\n\n^code emit-bytes\n\nOver time, we'll have enough cases where we need to write an opcode followed by\na one-byte operand that it's worth defining this convenience function.\n\n## Parsing Prefix Expressions\n\nWe've assembled our parsing and code generation utility functions. The missing\npiece is the code in the middle that connects those together.\n\n<img src=\"image/compiling-expressions/mystery.png\" alt=\"Parsing functions on the left, bytecode emitting functions on the right. What goes in the middle?\" />\n\nThe only step in `compile()` that we have left to implement is this function:\n\n^code expression\n\nWe aren't ready to implement every kind of expression in Lox yet. Heck, we don't\neven have Booleans. 
For this chapter, we're only going to worry about four:\n\n* Number literals: `123`\n* Parentheses for grouping: `(123)`\n* Unary negation: `-123`\n* The Four Horsemen of the Arithmetic: `+`, `-`, `*`, `/`\n\nAs we work through the functions to compile each of those kinds of expressions,\nwe'll also assemble the requirements for the table-driven parser that calls\nthem.\n\n### Parsers for tokens\n\nFor now, let's focus on the Lox expressions that are each only a single token.\nIn this chapter, that's just number literals, but there will be more later. Here's\nhow we can compile them:\n\nWe map each token type to a different kind of expression. We define a function\nfor each expression that outputs the appropriate bytecode. Then we build an\narray of function pointers. The indexes in the array correspond to the\n`TokenType` enum values, and the function at each index is the code to compile\nan expression of that token type.\n\nTo compile number literals, we store a pointer to the following function at the\n`TOKEN_NUMBER` index in the array.\n\n^code number\n\nWe assume the token for the number literal has already been consumed and is\nstored in `previous`. We take that lexeme and use the C standard library to\nconvert it to a double value. Then we generate the code to load that value using\nthis function:\n\n^code emit-constant\n\nFirst, we add the value to the constant table, then we emit an `OP_CONSTANT`\ninstruction that pushes it onto the stack at runtime. To insert an entry in the\nconstant table, we rely on:\n\n^code make-constant\n\nMost of the work happens in `addConstant()`, which we defined back in an\n[earlier chapter][bytecode]. That adds the given value to the end of the chunk's\nconstant table and returns its index. The new function's job is mostly to make\nsure we don't have too many constants. 
Since the `OP_CONSTANT` instruction uses\na single byte for the index operand, we can store and load only up to <span\nname=\"256\">256</span> constants in a chunk.\n\n[bytecode]: chunks-of-bytecode.html\n\n<aside name=\"256\">\n\nYes, that limit is pretty low. If this were a full-sized language\nimplementation, we'd want to add another instruction like `OP_CONSTANT_16` that\nstores the index as a two-byte operand so we could handle more constants when\nneeded.\n\nThe code to support that isn't particularly illuminating, so I omitted it from\nclox, but you'll want your VMs to scale to larger programs.\n\n</aside>\n\nThat's basically all it takes. Provided there is some suitable code that\nconsumes a `TOKEN_NUMBER` token, looks up `number()` in the function pointer\narray, and then calls it, we can now compile number literals to bytecode.\n\n### Parentheses for grouping\n\nOur as-yet-imaginary array of parsing function pointers would be great if every\nexpression was only a single token long. Alas, most are longer. However, many\nexpressions *start* with a particular token. We call these *prefix* expressions.\nFor example, when we're parsing an expression and the current token is `(`, we\nknow we must be looking at a parenthesized grouping expression.\n\nIt turns out our function pointer array handles those too. The parsing function\nfor an expression type can consume any additional tokens that it wants to, just\nlike in a regular recursive descent parser. Here's how parentheses work:\n\n^code grouping\n\nAgain, we assume the initial `(` has already been consumed. 
We <span\nname=\"recursive\">recursively</span> call back into `expression()` to compile the\nexpression between the parentheses, then parse the closing `)` at the end.\n\n<aside name=\"recursive\">\n\nA Pratt parser isn't a recursive *descent* parser, but it's still recursive.\nThat's to be expected since the grammar itself is recursive.\n\n</aside>\n\nAs far as the back end is concerned, there's literally nothing to a grouping\nexpression. Its sole function is syntactic -- it lets you insert a\nlower-precedence expression where a higher precedence is expected. Thus, it has\nno runtime semantics on its own and therefore doesn't emit any bytecode. The\ninner call to `expression()` takes care of generating bytecode for the\nexpression inside the parentheses.\n\n### Unary negation\n\nUnary minus is also a prefix expression, so it works with our model too.\n\n^code unary\n\nThe leading `-` token has been consumed and is sitting in `parser.previous`. We\ngrab the token type from that to note which unary operator we're dealing with.\nIt's unnecessary right now, but this will make more sense when we use this same\nfunction to compile the `!` operator in [the next chapter][next].\n\n[next]: types-of-values.html\n\nAs in `grouping()`, we recursively call `expression()` to compile the operand.\nAfter that, we emit the bytecode to perform the negation. It might seem a little\nweird to write the negate instruction *after* its operand's bytecode since the\n`-` appears on the left, but think about it in terms of order of execution:\n\n1. We evaluate the operand first which leaves its value on the stack.\n\n2. 
Then we pop that value, negate it, and push the result.\n\nSo the `OP_NEGATE` instruction should be emitted <span name=\"line\">last</span>.\nThis is part of the compiler's job -- parsing the program in the order it\nappears in the source code and rearranging it into the order that execution\nhappens.\n\n<aside name=\"line\">\n\nEmitting the `OP_NEGATE` instruction after the operands does mean that the\ncurrent token when the bytecode is written is *not* the `-` token. That mostly\ndoesn't matter, except that we use that token for the line number to associate\nwith that instruction.\n\nThis means if you have a multi-line negation expression, like:\n\n```lox\nprint -\n  true;\n```\n\nThen the runtime error will be reported on the wrong line. Here, it would show\nthe error on line 2, even though the `-` is on line 1. A more robust approach\nwould be to store the token's line before compiling the operand and then pass\nthat into `emitByte()`, but I wanted to keep things simple for the book.\n\n</aside>\n\nThere is one problem with this code, though. The `expression()` function it\ncalls will parse any expression for the operand, regardless of precedence. Once\nwe add binary operators and other syntax, that will do the wrong thing.\nConsider:\n\n```lox\n-a.b + c;\n```\n\nHere, the operand to `-` should be just the `a.b` expression, not the entire\n`a.b + c`. But if `unary()` calls `expression()`, the latter will happily chew\nthrough all of the remaining code including the `+`. It will erroneously treat\nthe `-` as lower precedence than the `+`.\n\nWhen parsing the operand to unary `-`, we need to compile only expressions at a\ncertain precedence level or higher. In jlox's recursive descent parser we\naccomplished that by calling into the parsing method for the lowest-precedence\nexpression we wanted to allow (in this case, `call()`). 
Each method for parsing\na specific expression also parsed any expressions of higher precedence too, so\nthat included the rest of the precedence table.\n\nThe parsing functions like `number()` and `unary()` here in clox are different.\nEach only parses exactly one type of expression. They don't cascade to include\nhigher-precedence expression types too. We need a different solution, and it\nlooks like this:\n\n^code parse-precedence\n\nThis function -- once we implement it -- starts at the current token and parses\nany expression at the given precedence level or higher. We have some other setup\nto get through before we can write the body of this function, but you can\nprobably guess that it will use that table of parsing function pointers I've\nbeen talking about. For now, don't worry too much about how it works. In order\nto take the \"precedence\" as a parameter, we define it numerically.\n\n^code precedence (1 before, 2 after)\n\nThese are all of Lox's precedence levels in order from lowest to highest. Since\nC implicitly gives successively larger numbers for enums, this means that\n`PREC_CALL` is numerically larger than `PREC_UNARY`. For example, say the\ncompiler is sitting on a chunk of code like:\n\n```lox\n-a.b + c\n```\n\nIf we call `parsePrecedence(PREC_ASSIGNMENT)`, then it will parse the entire\nexpression because `+` has higher precedence than assignment. If instead we\ncall `parsePrecedence(PREC_UNARY)`, it will compile the `-a.b` and stop there.\nIt doesn't keep going through the `+` because the addition has lower precedence\nthan unary operators.\n\nWith this function in hand, it's a snap to fill in the missing body for\n`expression()`.\n\n^code expression-body (1 before, 1 after)\n\nWe simply parse the lowest precedence level, which subsumes all of the\nhigher-precedence expressions too. 
Now, to compile the operand for a unary\nexpression, we call this new function and limit it to the appropriate level:\n\n^code unary-operand (1 before, 2 after)\n\nWe use the unary operator's own `PREC_UNARY` precedence to permit <span\nname=\"useful\">nested</span> unary expressions like `!!doubleNegative`. Since\nunary operators have pretty high precedence, that correctly excludes things like\nbinary operators. Speaking of which...\n\n<aside name=\"useful\">\n\nNot that nesting unary expressions is particularly useful in Lox. But other\nlanguages let you do it, so we do too.\n\n</aside>\n\n## Parsing Infix Expressions\n\nBinary operators are different from the previous expressions because they are\n*infix*. With the other expressions, we know what we are parsing from the very\nfirst token. With infix expressions, we don't know we're in the middle of a\nbinary operator until *after* we've parsed its left operand and then stumbled\nonto the operator token in the middle.\n\nHere's an example:\n\n```lox\n1 + 2\n```\n\nLet's walk through trying to compile it with what we know so far:\n\n1.  We call `expression()`. That in turn calls\n    `parsePrecedence(PREC_ASSIGNMENT)`.\n\n2.  That function (once we implement it) sees the leading number token and\n    recognizes it is parsing a number literal. It hands off control to\n    `number()`.\n\n3.  `number()` creates a constant, emits an `OP_CONSTANT`, and returns back to\n    `parsePrecedence()`.\n\nNow what? The call to `parsePrecedence()` should consume the entire addition\nexpression, so it needs to keep going somehow. Fortunately, the parser is right\nwhere we need it to be. Now that we've compiled the leading number expression,\nthe next token is `+`. 
That's the exact token that `parsePrecedence()` needs to\ndetect that we're in the middle of an infix expression and to realize that the\nexpression we already compiled is actually an operand to that.\n\nSo this hypothetical array of function pointers doesn't just list functions to\nparse expressions that start with a given token. Instead, it's a *table* of\nfunction pointers. One column associates prefix parser functions with token\ntypes. The second column associates infix parser functions with token types.\n\nThe function we will use as the infix parser for `TOKEN_PLUS`, `TOKEN_MINUS`,\n`TOKEN_STAR`, and `TOKEN_SLASH` is this:\n\n^code binary\n\nWhen a prefix parser function is called, the leading token has already been\nconsumed. An infix parser function is even more *in medias res* -- the entire\nleft-hand operand expression has already been compiled and the subsequent infix\noperator consumed.\n\nThe fact that the left operand gets compiled first works out fine. It means at\nruntime, that code gets executed first. When it runs, the value it produces will\nend up on the stack. That's right where the infix operator is going to need it.\n\nThen we come here to `binary()` to handle the rest of the arithmetic operators.\nThis function compiles the right operand, much like how `unary()` compiles its\nown trailing operand. Finally, it emits the bytecode instruction that performs\nthe binary operation.\n\nWhen run, the VM will execute the left and right operand code, in that order,\nleaving their values on the stack. Then it executes the instruction for the\noperator. That pops the two values, computes the operation, and pushes the\nresult.\n\nThe code that probably caught your eye here is that `getRule()` line. When we\nparse the right-hand operand, we again need to worry about precedence. 
Take an\nexpression like:\n\n```lox\n2 * 3 + 4\n```\n\nWhen we parse the right operand of the `*` expression, we need to just capture\n`3`, and not `3 + 4`, because `+` is lower precedence than `*`. We could define\na separate function for each binary operator. Each would call\n`parsePrecedence()` and pass in the correct precedence level for its operand.\n\nBut that's kind of tedious. Each binary operator's right-hand operand precedence\nis one level <span name=\"higher\">higher</span> than its own. We can look that up\ndynamically with this `getRule()` thing we'll get to soon. Using that, we call\n`parsePrecedence()` with one level higher than this operator's level.\n\n<aside name=\"higher\">\n\nWe use one *higher* level of precedence for the right operand because the binary\noperators are left-associative. Given a series of the *same* operator, like:\n\n```lox\n1 + 2 + 3 + 4\n```\n\nWe want to parse it like:\n\n```lox\n((1 + 2) + 3) + 4\n```\n\nThus, when parsing the right-hand operand to the first `+`, we want to consume\nthe `2`, but not the rest, so we use one level above `+`'s precedence. But if\nour operator was *right*-associative, this would be wrong. Given:\n\n```lox\na = b = c = d\n```\n\nSince assignment is right-associative, we want to parse it as:\n\n```lox\na = (b = (c = d))\n```\n\nTo enable that, we would call `parsePrecedence()` with the *same* precedence as\nthe current operator.\n\n</aside>\n\nThis way, we can use a single `binary()` function for all binary operators even\nthough they have different precedences.\n\n## A Pratt Parser\n\nWe now have all of the pieces and parts of the compiler laid out. We have a\nfunction for each grammar production: `number()`, `grouping()`, `unary()`, and\n`binary()`. We still need to implement `parsePrecedence()` and `getRule()`. 
We\nalso know we need a table that, given a token type, lets us find\n\n*   the function to compile a prefix expression starting with a token of that\n    type,\n\n*   the function to compile an infix expression whose left operand is followed\n    by a token of that type, and\n\n*   the precedence of an <span name=\"prefix\">infix</span> expression that uses\n    that token as an operator.\n\n<aside name=\"prefix\">\n\nWe don't need to track the precedence of the *prefix* expression starting with a\ngiven token because all prefix operators in Lox have the same precedence.\n\n</aside>\n\nWe wrap these three properties in a little struct which represents a single row\nin the parser table.\n\n^code parse-rule (1 before, 2 after)\n\nThat ParseFn type is a simple <span name=\"typedef\">typedef</span> for a function\ntype that takes no arguments and returns nothing.\n\n<aside name=\"typedef\" class=\"bottom\">\n\nC's syntax for function pointer types is so bad that I always hide it behind a\ntypedef. I understand the intent behind the syntax -- the whole \"declaration\nreflects use\" thing -- but I think it was a failed syntactic experiment.\n\n</aside>\n\n^code parse-fn-type (1 before, 2 after)\n\nThe table that drives our whole parser is an array of ParseRules. We've been\ntalking about it forever, and finally you get to see it.\n\n^code rules\n\n<aside name=\"big\">\n\nSee what I mean about not wanting to revisit the table each time we needed a new\ncolumn? It's a beast.\n\nIf you haven't seen the `[TOKEN_DOT] = ` syntax in a C array literal, that is\nC99's designated initializer syntax. It's clearer than having to count array\nindexes by hand.\n\n</aside>\n\nYou can see how `grouping` and `unary` are slotted into the prefix parser column\nfor their respective token types. In the next column, `binary` is wired up to\nthe four arithmetic infix operators. 
Those infix operators also have their\nprecedences set in the last column.\n\nAside from those, the rest of the table is full of `NULL` and `PREC_NONE`. Most\nof those empty cells are because there is no expression associated with those\ntokens. You can't start an expression with, say, `else`, and `}` would make for\na pretty confusing infix operator.\n\nBut, also, we haven't filled in the entire grammar yet. In later chapters, as we\nadd new expression types, some of these slots will get functions in them. One of\nthe things I like about this approach to parsing is that it makes it very easy\nto see which tokens are in use by the grammar and which are available.\n\nNow that we have the table, we are finally ready to write the code that uses it.\nThis is where our Pratt parser comes to life. The easiest function to define is\n`getRule()`.\n\n^code get-rule\n\nIt simply returns the rule at the given index. It's called by `binary()` to look\nup the precedence of the current operator. This function exists solely to handle\na declaration cycle in the C code. `binary()` is defined *before* the rules\ntable so that the table can store a pointer to it. That means the body of\n`binary()` cannot access the table directly.\n\nInstead, we wrap the lookup in a function. That lets us forward declare\n`getRule()` before the definition of `binary()`, and <span\nname=\"forward\">then</span> *define* `getRule()` after the table. We'll need a\ncouple of other forward declarations to handle the fact that our grammar is\nrecursive, so let's get them all out of the way.\n\n<aside name=\"forward\">\n\nThis is what happens when you write your VM in a language that was designed to\nbe compiled on a PDP-11.\n\n</aside>\n\n^code forward-declarations (2 before, 1 after)\n\nIf you're following along and implementing clox yourself, pay close attention to\nthe little annotations that tell you where to put these code snippets. 
Don't\nworry, though, if you get it wrong, the C compiler will be happy to tell you.\n\n### Parsing with precedence\n\nNow we're getting to the fun stuff. The maestro that orchestrates all of the\nparsing functions we've defined is `parsePrecedence()`. Let's start with parsing\nprefix expressions.\n\n^code precedence-body (1 before, 1 after)\n\nWe read the next token and look up the corresponding ParseRule. If there is no\nprefix parser, then the token must be a syntax error. We report that and return\nto the caller.\n\nOtherwise, we call that prefix parse function and let it do its thing. That\nprefix parser compiles the rest of the prefix expression, consuming any other\ntokens it needs, and returns back here. Infix expressions are where it gets\ninteresting since precedence comes into play. The implementation is remarkably\nsimple.\n\n^code infix (1 before, 1 after)\n\nThat's the whole thing. Really. Here's how the entire function works: At the\nbeginning of `parsePrecedence()`, we look up a prefix parser for the current\ntoken. The first token is *always* going to belong to some kind of prefix\nexpression, by definition. It may turn out to be nested as an operand inside one\nor more infix expressions, but as you read the code from left to right, the\nfirst token you hit always belongs to a prefix expression.\n\nAfter parsing that, which may consume more tokens, the prefix expression is\ndone. Now we look for an infix parser for the next token. If we find one, it\nmeans the prefix expression we already compiled might be an operand for it. But\nonly if the call to `parsePrecedence()` has a `precedence` that is low enough to\npermit that infix operator.\n\nIf the next token is too low precedence, or isn't an infix operator at all,\nwe're done. We've parsed as much expression as we can. Otherwise, we consume the\noperator and hand off control to the infix parser we found. 
It consumes whatever\nother tokens it needs (usually the right operand) and returns back to\n`parsePrecedence()`. Then we loop back around and see if the *next* token is\nalso a valid infix operator that can take the entire preceding expression as its\noperand. We keep looping like that, crunching through infix operators and their\noperands until we hit a token that isn't an infix operator or is too low\nprecedence and stop.\n\nThat's a lot of prose, but if you really want to mind meld with Vaughan Pratt\nand fully understand the algorithm, step through the parser in your debugger as\nit works through some expressions. Maybe a picture will help. There's only a\nhandful of functions, but they are marvelously intertwined:\n\n<span name=\"connections\"></span>\n\n<img src=\"image/compiling-expressions/connections.png\" alt=\"The various parsing\nfunctions and how they call each other.\" />\n\n<aside name=\"connections\">\n\nThe <img src=\"image/compiling-expressions/calls.png\" alt=\"A solid arrow.\"\nclass=\"arrow\" /> arrow connects a function to another function it directly\ncalls. The <img src=\"image/compiling-expressions/points-to.png\" alt=\"An open\narrow.\" class=\"arrow\" /> arrow shows the table's pointers to the parsing\nfunctions.\n\n</aside>\n\nLater, we'll need to tweak the code in this chapter to handle assignment. But,\notherwise, what we wrote covers all of our expression compiling needs for the\nrest of the book. We'll plug additional parsing functions into the table when we\nadd new kinds of expressions, but `parsePrecedence()` is complete.\n\n## Dumping Chunks\n\nWhile we're here in the core of our compiler, we should put in some\ninstrumentation. To help debug the generated bytecode, we'll add support for\ndumping the chunk once the compiler finishes. We had some temporary logging\nearlier when we hand-authored the chunk. 
Now we'll put in some real code so that\nwe can enable it whenever we want.\n\nSince this isn't for end users, we hide it behind a flag.\n\n^code define-debug-print-code (2 before, 1 after)\n\nWhen that flag is defined, we use our existing \"debug\" module to print out the\nchunk's bytecode.\n\n^code dump-chunk (1 before, 1 after)\n\nWe do this only if the code was free of errors. After a syntax error, the\ncompiler keeps on going but it's in kind of a weird state and might produce\nbroken code. That's harmless because it won't get executed, but we'll just\nconfuse ourselves if we try to read it.\n\nFinally, to access `disassembleChunk()`, we need to include its header.\n\n^code include-debug (1 before, 2 after)\n\nWe made it! This was the last major section to install in our VM's compilation\nand execution pipeline. Our interpreter doesn't *look* like much, but inside it\nis scanning, parsing, compiling to bytecode, and executing.\n\nFire up the VM and type in an expression. If we did everything right, it should\ncalculate and print the result. We now have a very over-engineered arithmetic\ncalculator. We have a lot of language features to add in the coming chapters,\nbut the foundation is in place.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  To really understand the parser, you need to see how execution threads\n    through the interesting parsing functions -- `parsePrecedence()` and the\n    parser functions stored in the table. Take this (strange) expression:\n\n    ```lox\n    (-1 + 2) * 3 - -4\n    ```\n\n    Write a trace of how those functions are called. Show the order they are\n    called, which calls which, and the arguments passed to them.\n\n2.  The ParseRule row for `TOKEN_MINUS` has both prefix and infix function\n    pointers. That's because `-` is both a prefix operator (unary negation) and\n    an infix one (subtraction).\n\n    In the full Lox language, what other tokens can be used in both prefix and\n    infix positions? 
What about in C or in another language of your choice?\n\n3.  You might be wondering about complex \"mixfix\" expressions that have more\n    than two operands separated by tokens. C's conditional or \"ternary\"\n    operator, `?:`, is a widely known one.\n\n    Add support for that operator to the compiler. You don't have to generate\n    any bytecode, just show how you would hook it up to the parser and handle\n    the operands.\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: It's Just Parsing\n\nI'm going to make a claim here that will be unpopular with some compiler and\nlanguage people. It's OK if you don't agree. Personally, I learn more from\nstrongly stated opinions that I disagree with than I do from several pages of\nqualifiers and equivocation. My claim is that *parsing doesn't matter*.\n\nOver the years, many programming language people, especially in academia, have\ngotten *really* into parsers and taken them very seriously. Initially, it was\nthe compiler folks who got into <span name=\"yacc\">compiler-compilers</span>,\nLALR, and other stuff like that. The first half of the dragon book is a long\nlove letter to the wonders of parser generators.\n\n<aside name=\"yacc\">\n\nAll of us suffer from the vice of \"when all you have is a hammer, everything\nlooks like a nail\", but perhaps none so visibly as compiler people. You wouldn't\nbelieve the breadth of software problems that miraculously seem to require a new\nlittle language in their solution as soon as you ask a compiler hacker for help.\n\nYacc and other compiler-compilers are the most delightfully recursive example.\n\"Wow, writing compilers is a chore. I know, let's write a compiler to write our\ncompiler for us.\"\n\nFor the record, I don't claim immunity to this affliction.\n\n</aside>\n\nLater, the functional programming folks got into parser combinators, packrat\nparsers, and other sorts of things. 
Because, obviously, if you give a functional\nprogrammer a problem, the first thing they'll do is whip out a pocketful of\nhigher-order functions.\n\nOver in math and algorithm analysis land, there is a long legacy of research\ninto proving time and memory usage for various parsing techniques, transforming\nparsing problems into other problems and back, and assigning complexity classes\nto different grammars.\n\nAt one level, this stuff is important. If you're implementing a language, you\nwant some assurance that your parser won't go exponential and take 7,000 years\nto parse a weird edge case in the grammar. Parser theory gives you that bound.\nAs an intellectual exercise, learning about parsing techniques is also fun and\nrewarding.\n\nBut if your goal is just to implement a language and get it in front of users,\nalmost all of that stuff doesn't matter. It's really easy to get worked up by\nthe enthusiasm of the people who *are* into it and think that your front end\n*needs* some whiz-bang generated combinator-parser-factory thing. I've seen\npeople burn tons of time writing and rewriting their parser using whatever\ntoday's hot library or technique is.\n\nThat's time that doesn't add any value to your user's life. If you're just\ntrying to get your parser done, pick one of the bog-standard techniques, use it,\nand move on. Recursive descent, Pratt parsing, and the popular parser generators\nlike ANTLR or Bison are all fine.\n\nTake the extra time you saved not rewriting your parsing code and spend it\nimproving the compile error messages your compiler shows users. Good error\nhandling and reporting is more valuable to users than almost anything else you\ncan put time into in the front end.\n\n</div>\n"
  },
  {
    "path": "book/contents.md",
    "content": "This text is not used. All of the content is in the contents.html template.\n"
  },
  {
    "path": "book/control-flow.md",
    "content": "> Logic, like whiskey, loses its beneficial effect when taken in too large\n> quantities.\n>\n> <cite>Edward John Moreton Drax Plunkett, Lord Dunsany</cite>\n\nCompared to [last chapter's][statements] grueling marathon, today is a\nlighthearted frolic through a daisy meadow. But while the work is easy, the\nreward is surprisingly large.\n\n[statements]: statements-and-state.html\n\nRight now, our interpreter is little more than a calculator. A Lox program can\nonly do a fixed amount of work before completing. To make it run twice as long\nyou have to make the source code twice as lengthy. We're about to fix that. In\nthis chapter, our interpreter takes a big step towards the programming\nlanguage major leagues: *Turing-completeness*.\n\n## Turing Machines (Briefly)\n\nIn the early part of last century, mathematicians stumbled into a series of\nconfusing <span name=\"paradox\">paradoxes</span> that led them to doubt the\nstability of the foundation they had built their work upon. To address that\n[crisis][], they went back to square one. Starting from a handful of axioms,\nlogic, and set theory, they hoped to rebuild mathematics on top of an\nimpervious foundation.\n\n[crisis]: https://en.wikipedia.org/wiki/Foundations_of_mathematics#Foundational_crisis\n\n<aside name=\"paradox\">\n\nThe most famous is [**Russell's paradox**][russell]. Initially, set theory\nallowed you to define any sort of set. If you could describe it in English, it\nwas valid. Naturally, given mathematicians' predilection for self-reference,\nsets can contain other sets. So Russell, rascal that he was, came up with:\n\n*R is the set of all sets that do not contain themselves.*\n\nDoes R contain itself? If it doesn't, then according to the second half of the\ndefinition it should. 
But if it does, then it no longer meets the definition.\nCue mind exploding.\n\n[russell]: https://en.wikipedia.org/wiki/Russell%27s_paradox\n\n</aside>\n\nThey wanted to rigorously answer questions like, \"Can all true statements be\nproven?\", \"Can we [compute][] all functions that we can define?\", or even the\nmore general question, \"What do we mean when we claim a function is\n'computable'?\"\n\n[compute]: https://en.wikipedia.org/wiki/Computable_function\n\nThey presumed the answer to the first two questions would be \"yes\". All that\nremained was to prove it. It turns out that the answer to both is \"no\", and\nastonishingly, the two questions are deeply intertwined. This is a fascinating\ncorner of mathematics that touches fundamental questions about what brains are\nable to do and how the universe works. I can't do it justice here.\n\nWhat I do want to note is that in the process of proving that the answer to the\nfirst two questions is \"no\", Alan Turing and Alonzo Church devised a precise\nanswer to the last question -- a definition of what kinds of functions are <span\nname=\"uncomputable\">computable</span>. They each crafted a tiny system with a\nminimum set of machinery that is still powerful enough to compute any of a\n(very) large class of functions.\n\n<aside name=\"uncomputable\">\n\nThey proved the answer to the first question is \"no\" by showing that the\nfunction that returns the truth value of a given statement is *not* a computable\none.\n\n</aside>\n\nThese are now considered the \"computable functions\". Turing's system is called a\n<span name=\"turing\">**Turing machine**</span>. Church's is the **lambda\ncalculus**. Both are still widely used as the basis for models of computation\nand, in fact, many modern functional programming languages use the lambda\ncalculus at their core.\n\n<aside name=\"turing\">\n\nTuring called his inventions \"a-machines\" for \"automatic\". 
He wasn't so\nself-aggrandizing as to put his *own* name on them. Later mathematicians did\nthat for him. That's how you get famous while still retaining some modesty.\n\n</aside>\n\n<img src=\"image/control-flow/turing-machine.png\" alt=\"A Turing machine.\" />\n\nTuring machines have better name recognition -- there's no Hollywood film about\nAlonzo Church yet -- but the two formalisms are [equivalent in power][thesis].\nIn fact, any programming language with some minimal level of expressiveness is\npowerful enough to compute *any* computable function.\n\n[thesis]: https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis\n\nYou can prove that by writing a simulator for a Turing machine in your language.\nSince Turing proved his machine can compute any computable function, by\nextension, that means your language can too. All you need to do is translate the\nfunction into a Turing machine, and then run that on your simulator.\n\nIf your language is expressive enough to do that, it's considered\n**Turing-complete**. Turing machines are pretty dang simple, so it doesn't take\nmuch power to do this. You basically need arithmetic, a little control flow,\nand the ability to allocate and use (theoretically) arbitrary amounts of memory.\nWe've got the first. By the end of this chapter, we'll have the <span\nname=\"memory\">second</span>.\n\n<aside name=\"memory\">\n\nWe *almost* have the third too. You can create and concatenate strings of\narbitrary size, so you can *store* unbounded memory. But we don't have any way\nto access parts of a string.\n\n</aside>\n\n## Conditional Execution\n\nEnough history, let's jazz up our language. We can divide control flow roughly\ninto two kinds:\n\n*   **Conditional** or **branching control flow** is used to *not* execute\n    some piece of code. Imperatively, you can think of it as jumping *ahead*\n    over a region of code.\n\n*   **Looping control flow** executes a chunk of code more than once. 
It jumps\n    *back* so that you can do something again. Since you don't usually want\n    *infinite* loops, it typically has some conditional logic to know when to\n    stop looping as well.\n\nBranching is simpler, so we'll start there. C-derived languages have two main\nconditional execution features, the `if` statement and the perspicaciously named\n\"conditional\" <span name=\"ternary\">operator</span> (`?:`). An `if` statement\nlets you conditionally execute statements and the conditional operator lets you\nconditionally execute expressions.\n\n<aside name=\"ternary\">\n\nThe conditional operator is also called the \"ternary\" operator because it's the\nonly operator in C that takes three operands.\n\n</aside>\n\nFor simplicity's sake, Lox doesn't have a conditional operator, so let's get our\n`if` statement on. Our statement grammar gets a new production.\n\n<span name=\"semicolon\"></span>\n\n```ebnf\nstatement      → exprStmt\n               | ifStmt\n               | printStmt\n               | block ;\n\nifStmt         → \"if\" \"(\" expression \")\" statement\n               ( \"else\" statement )? ;\n```\n\n<aside name=\"semicolon\">\n\nThe semicolons in the rules aren't quoted, which means they are part of the\ngrammar metasyntax, not Lox's syntax. A block does not have a `;` at the end and\nan `if` statement doesn't either, unless the then or else statement happens to\nbe one that ends in a semicolon.\n\n</aside>\n\nAn `if` statement has an expression for the condition, then a statement to execute\nif the condition is truthy. Optionally, it may also have an `else` keyword and a\nstatement to execute if the condition is falsey. 
The <span name=\"if-ast\">syntax\ntree node</span> has fields for each of those three pieces.\n\n^code if-ast (1 before, 1 after)\n\n<aside name=\"if-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-if].\n\n[appendix-if]: appendix-ii.html#if-statement\n\n</aside>\n\nLike other statements, the parser recognizes an `if` statement by the leading\n`if` keyword.\n\n^code match-if (1 before, 1 after)\n\nWhen it finds one, it calls this new method to parse the rest:\n\n^code if-statement\n\n<aside name=\"parens\">\n\nThe parentheses around the condition are only half useful. You need some kind of\ndelimiter *between* the condition and the then statement, otherwise the parser\ncan't tell when it has reached the end of the condition expression. But the\n*opening* parenthesis after `if` doesn't do anything useful. Dennis Ritchie put\nit there so he could use `)` as the ending delimiter without having unbalanced\nparentheses.\n\nOther languages like Lua and some BASICs use a keyword like `then` as the ending\ndelimiter and don't have anything before the condition. Go and Swift instead\nrequire the statement to be a braced block. That lets them use the `{` at the\nbeginning of the statement to tell when the condition is done.\n\n</aside>\n\nAs usual, the parsing code hews closely to the grammar. It detects an else\nclause by looking for the preceding `else` keyword. If there isn't one, the\n`elseBranch` field in the syntax tree is `null`.\n\nThat seemingly innocuous optional else has, in fact, opened up an ambiguity in\nour grammar. Consider:\n\n```lox\nif (first) if (second) whenTrue(); else whenFalse();\n```\n\nHere's the riddle: Which `if` statement does that else clause belong to? This\nisn't just a theoretical question about how we notate our grammar. 
It actually\naffects how the code executes:\n\n*   If we attach the else to the first `if` statement, then `whenFalse()` is\n    called if `first` is falsey, regardless of what value `second` has.\n\n*   If we attach it to the second `if` statement, then `whenFalse()` is only\n    called if `first` is truthy and `second` is falsey.\n\nSince else clauses are optional, and there is no explicit delimiter marking the\nend of the `if` statement, the grammar is ambiguous when you nest `if`s in this\nway. This classic pitfall of syntax is called the **[dangling else][]** problem.\n\n[dangling else]: https://en.wikipedia.org/wiki/Dangling_else\n\n<span name=\"else\"></span>\n\n<img class=\"above\" src=\"image/control-flow/dangling-else.png\" alt=\"Two ways the else can be interpreted.\" />\n\n<aside name=\"else\">\n\nHere, formatting highlights the two ways the else could be parsed. But note that\nsince whitespace characters are ignored by the parser, this is only a guide to\nthe human reader.\n\n</aside>\n\nIt *is* possible to define a context-free grammar that avoids the ambiguity\ndirectly, but it requires splitting most of the statement rules into pairs, one\nthat allows an `if` with an `else` and one that doesn't. It's annoying.\n\nInstead, most languages and parsers avoid the problem in an ad hoc way. No\nmatter what hack they use to get themselves out of the trouble, they always\nchoose the same interpretation -- the `else` is bound to the nearest `if` that\nprecedes it.\n\nOur parser conveniently does that already. Since `ifStatement()` eagerly looks\nfor an `else` before returning, the innermost call to a nested series will claim\nthe else clause for itself before returning to the outer `if` statements.\n\nSyntax in hand, we are ready to interpret.\n\n^code visit-if\n\nThe interpreter implementation is a thin wrapper around the self-same Java code.\nIt evaluates the condition. If truthy, it executes the then branch. 
Otherwise,\nif there is an else branch, it executes that.\n\nIf you compare this code to how the interpreter handles other syntax we've\nimplemented, the part that makes control flow special is that Java `if`\nstatement. Most other syntax trees always evaluate their subtrees. Here, we may\nnot evaluate the then or else statement. If either of those has a side effect,\nthe choice not to evaluate it becomes user visible.\n\n## Logical Operators\n\nSince we don't have the conditional operator, you might think we're done with\nbranching, but no. Even without the ternary operator, there are two other\noperators that are technically control flow constructs -- the logical operators\n`and` and `or`.\n\nThese aren't like other binary operators because they **short-circuit**. If,\nafter evaluating the left operand, we know what the result of the logical\nexpression must be, we don't evaluate the right operand. For example:\n\n```lox\nfalse and sideEffect();\n```\n\nFor an `and` expression to evaluate to something truthy, both operands must be\ntruthy. We can see as soon as we evaluate the left `false` operand that that\nisn't going to be the case, so there's no need to evaluate `sideEffect()` and it\ngets skipped.\n\nThis is why we didn't implement the logical operators with the other binary\noperators. Now we're ready. The two new operators are low in the precedence\ntable. Similar to `||` and `&&` in C, they each have their <span\nname=\"logical\">own</span> precedence with `or` lower than `and`. 
We slot them\nright between `assignment` and `equality`.\n\n<aside name=\"logical\">\n\nI've always wondered why they don't have the same precedence, like the various\ncomparison or equality operators do.\n\n</aside>\n\n```ebnf\nexpression     → assignment ;\nassignment     → IDENTIFIER \"=\" assignment\n               | logic_or ;\nlogic_or       → logic_and ( \"or\" logic_and )* ;\nlogic_and      → equality ( \"and\" equality )* ;\n```\n\nInstead of falling back to `equality`, `assignment` now cascades to `logic_or`.\nThe two new rules, `logic_or` and `logic_and`, are <span\nname=\"same\">similar</span> to other binary operators. Then `logic_and` calls\nout to `equality` for its operands, and we chain back to the rest of the\nexpression rules.\n\n<aside name=\"same\">\n\nThe *syntax* doesn't care that they short-circuit. That's a semantic concern.\n\n</aside>\n\nWe could reuse the existing Expr.Binary class for these two new expressions\nsince they have the same fields. But then `visitBinaryExpr()` would have to\ncheck to see if the operator is one of the logical operators and use a different\ncode path to handle the short circuiting. I think it's cleaner to define a <span\nname=\"logical-ast\">new class</span> for these operators so that they get their\nown visit method.\n\n^code logical-ast (1 before, 1 after)\n\n<aside name=\"logical-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-logical].\n\n[appendix-logical]: appendix-ii.html#logical-expression\n\n</aside>\n\nTo weave the new expressions into the parser, we first change the parsing code\nfor assignment to call `or()`.\n\n^code or-in-assignment (1 before, 2 after)\n\nThe code to parse a series of `or` expressions mirrors other binary operators.\n\n^code or\n\nIts operands are the next higher level of precedence, the new `and` expression.\n\n^code and\n\nThat calls `equality()` for its operands, and with that, the expression parser\nis all tied back together again. 
We're ready to interpret.\n\n^code visit-logical\n\nIf you compare this to the [earlier chapter's][evaluating] `visitBinaryExpr()`\nmethod, you can see the difference. Here, we evaluate the left operand first. We\nlook at its value to see if we can short-circuit. If not, and only then, do we\nevaluate the right operand.\n\n[evaluating]: evaluating-expressions.html\n\nThe other interesting piece here is deciding what actual value to return. Since\nLox is dynamically typed, we allow operands of any type and use truthiness to\ndetermine what each operand represents. We apply similar reasoning to the\nresult. Instead of promising to literally return `true` or `false`, a logic\noperator merely guarantees it will return a value with appropriate truthiness.\n\nFortunately, we have values with proper truthiness right at hand -- the results\nof the operands themselves. So we use those. For example:\n\n```lox\nprint \"hi\" or 2; // \"hi\".\nprint nil or \"yes\"; // \"yes\".\n```\n\nOn the first line, `\"hi\"` is truthy, so the `or` short-circuits and returns\nthat. On the second line, `nil` is falsey, so it evaluates and returns the\nsecond operand, `\"yes\"`.\n\nThat covers all of the branching primitives in Lox. We're ready to jump ahead to\nloops. You see what I did there? *Jump. Ahead.* Get it? See, it's like a\nreference to... oh, forget it.\n\n## While Loops\n\nLox features two looping control flow statements, `while` and `for`. The `while`\nloop is the simpler one, so we'll start there. Its grammar is the same as in C.\n\n```ebnf\nstatement      → exprStmt\n               | ifStmt\n               | printStmt\n               | whileStmt\n               | block ;\n\nwhileStmt      → \"while\" \"(\" expression \")\" statement ;\n```\n\nWe add another clause to the statement rule that points to the new rule for\nwhile. It takes a `while` keyword, followed by a parenthesized condition\nexpression, then a statement for the body. 
That new grammar rule gets a <span\nname=\"while-ast\">syntax tree node</span>.\n\n^code while-ast (1 before, 1 after)\n\n<aside name=\"while-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-while].\n\n[appendix-while]: appendix-ii.html#while-statement\n\n</aside>\n\nThe node stores the condition and body. Here you can see why it's nice to have\nseparate base classes for expressions and statements. The field declarations\nmake it clear that the condition is an expression and the body is a statement.\n\nOver in the parser, we follow the same process we used for `if` statements.\nFirst, we add another case in `statement()` to detect and match the leading\nkeyword.\n\n^code match-while (1 before, 1 after)\n\nThat delegates the real work to this method:\n\n^code while-statement\n\nThe grammar is dead simple and this is a straight translation of it to Java.\nSpeaking of translating straight to Java, here's how we execute the new syntax:\n\n^code visit-while\n\nLike the visit method for `if`, this visitor uses the corresponding Java\nfeature. This method isn't complex, but it makes Lox much more powerful. We can\nfinally write a program whose running time isn't strictly bound by the length of\nthe source code.\n\n## For Loops\n\nWe're down to the last control flow construct, <span name=\"for\">Ye Olde</span>\nC-style `for` loop. I probably don't need to remind you, but it looks like this:\n\n```lox\nfor (var i = 0; i < 10; i = i + 1) print i;\n```\n\nIn grammarese, that's:\n\n```ebnf\nstatement      → exprStmt\n               | forStmt\n               | ifStmt\n               | printStmt\n               | whileStmt\n               | block ;\n\nforStmt        → \"for\" \"(\" ( varDecl | exprStmt | \";\" )\n                 expression? \";\"\n                 expression? \")\" statement ;\n```\n\n<aside name=\"for\">\n\nMost modern languages have a higher-level looping statement for iterating over\narbitrary user-defined sequences. 
C# has `foreach`, Java has \"enhanced for\",\neven C++ has range-based `for` statements now. Those offer cleaner syntax than\nC's `for` statement by implicitly calling into an iteration protocol that the\nobject being looped over supports.\n\nI love those. For Lox, though, we're limited by building up the interpreter a\nchapter at a time. We don't have objects and methods yet, so we have no way of\ndefining an iteration protocol that the `for` loop could use. So we'll stick\nwith the old school C `for` loop. Think of it as \"vintage\". The fixie of control\nflow statements.\n\n</aside>\n\nInside the parentheses, you have three clauses separated by semicolons:\n\n1.  The first clause is the *initializer*. It is executed exactly once, before\n    anything else. It's usually an expression, but for convenience, we also\n    allow a variable declaration. In that case, the variable is scoped to the\n    rest of the `for` loop -- the other two clauses and the body.\n\n2.  Next is the *condition*. As in a `while` loop, this expression controls when\n    to exit the loop. It's evaluated once at the beginning of each iteration,\n    including the first. If the result is truthy, it executes the loop body.\n    Otherwise, it bails.\n\n3.  The last clause is the *increment*. It's an arbitrary expression that does\n    some work at the end of each loop iteration. The result of the expression is\n    discarded, so it must have a side effect to be useful. In practice, it\n    usually increments a variable.\n\nAny of these clauses can be omitted. Following the closing parenthesis is a\nstatement for the body, which is typically a block.\n\n### Desugaring\n\nThat's a lot of machinery, but note that none of it does anything you couldn't\ndo with the statements we already have. If `for` loops didn't support\ninitializer clauses, you could just put the initializer expression before the\n`for` statement. 
Without an increment clause, you could simply put the increment\nexpression at the end of the body yourself.\n\nIn other words, Lox doesn't *need* `for` loops, they just make some common code\npatterns more pleasant to write. These kinds of features are called <span\nname=\"sugar\">**syntactic sugar**</span>. For example, the previous `for` loop\ncould be rewritten like so:\n\n<aside name=\"sugar\">\n\nThis delightful turn of phrase was coined by Peter J. Landin in 1964 to describe\nhow some of the nice expression forms supported by languages like ALGOL were a\nsweetener sprinkled over the more fundamental -- but presumably less palatable\n-- lambda calculus underneath.\n\n<img class=\"above\" src=\"image/control-flow/sugar.png\" alt=\"Slightly more than a spoonful of sugar.\" />\n\n</aside>\n\n```lox\n{\n  var i = 0;\n  while (i < 10) {\n    print i;\n    i = i + 1;\n  }\n}\n```\n\nThis script has the exact same semantics as the previous one, though it's not as\neasy on the eyes. Syntactic sugar features like Lox's `for` loop make a language\nmore pleasant and productive to work in. But, especially in sophisticated\nlanguage implementations, every language feature that requires back-end support\nand optimization is expensive.\n\nWe can have our cake and eat it too by <span\nname=\"caramel\">**desugaring**</span>. That funny word describes a process where\nthe front end takes code using syntax sugar and translates it to a more\nprimitive form that the back end already knows how to execute.\n\n<aside name=\"caramel\">\n\nOh, how I wish the accepted term for this was \"caramelization\". Why introduce a\nmetaphor if you aren't going to stick with it?\n\n</aside>\n\nWe're going to desugar `for` loops to the `while` loops and other statements the\ninterpreter already handles. In our simple interpreter, desugaring really\ndoesn't save us much work, but it does give me an excuse to introduce you to the\ntechnique. 
So, unlike the previous statements, we *won't* add a new syntax tree\nnode. Instead, we go straight to parsing. First, add an import we'll need soon.\n\n^code import-arrays (1 before, 1 after)\n\nLike every statement, we start parsing a `for` loop by matching its keyword.\n\n^code match-for (1 before, 1 after)\n\nHere is where it gets interesting. The desugaring is going to happen here, so\nwe'll build this method a piece at a time, starting with the opening parenthesis\nbefore the clauses.\n\n^code for-statement\n\nThe first clause following that is the initializer.\n\n^code for-initializer (2 before, 1 after)\n\nIf the token following the `(` is a semicolon then the initializer has been\nomitted. Otherwise, we check for a `var` keyword to see if it's a <span\nname=\"variable\">variable</span> declaration. If neither of those matched, it\nmust be an expression. We parse that and wrap it in an expression statement so\nthat the initializer is always of type Stmt.\n\n<aside name=\"variable\">\n\nIn a previous chapter, I said we can split expression and statement syntax trees\ninto two separate class hierarchies because there's no single place in the\ngrammar that allows both an expression and a statement. That wasn't *entirely*\ntrue, I guess.\n\n</aside>\n\nNext up is the condition.\n\n^code for-condition (2 before, 1 after)\n\nAgain, we look for a semicolon to see if the clause has been omitted. The last\nclause is the increment.\n\n^code for-increment (1 before, 1 after)\n\nIt's similar to the condition clause except this one is terminated by the\nclosing parenthesis. All that remains is the <span name=\"body\">body</span>.\n\n<aside name=\"body\">\n\nIs it just me or does that sound morbid? \"All that remained... was the *body*\".\n\n</aside>\n\n^code for-body (1 before, 1 after)\n\nWe've parsed all of the various pieces of the `for` loop and the resulting AST\nnodes are sitting in a handful of Java local variables. This is where the\ndesugaring comes in. 
We take those and use them to synthesize syntax tree nodes\nthat express the semantics of the `for` loop, like the hand-desugared example I\nshowed you earlier.\n\nThe code is a little simpler if we work backward, so we start with the increment\nclause.\n\n^code for-desugar-increment (2 before, 1 after)\n\nThe increment, if there is one, executes after the body in each iteration of the\nloop. We do that by replacing the body with a little block that contains the\noriginal body followed by an expression statement that evaluates the increment.\n\n^code for-desugar-condition (2 before, 1 after)\n\nNext, we take the condition and the body and build the loop using a primitive\n`while` loop. If the condition is omitted, we jam in `true` to make an infinite\nloop.\n\n^code for-desugar-initializer (2 before, 1 after)\n\nFinally, if there is an initializer, it runs once before the entire loop. We do\nthat by, again, replacing the whole statement with a block that runs the\ninitializer and then executes the loop.\n\nThat's it. Our interpreter now supports C-style `for` loops and we didn't have\nto touch the Interpreter class at all. Since we desugared to nodes the\ninterpreter already knows how to visit, there is no more work to do.\n\nFinally, Lox is powerful enough to entertain us, at least for a few minutes.\nHere's a tiny program to print the first 21 elements in the Fibonacci\nsequence:\n\n```lox\nvar a = 0;\nvar temp;\n\nfor (var b = 1; a < 10000; b = temp + b) {\n  print a;\n  temp = a;\n  a = b;\n}\n```\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  A few chapters from now, when Lox supports first-class functions and dynamic\n    dispatch, we technically won't *need* branching statements built into the\n    language. Show how conditional execution can be implemented in terms of\n    those. Name a language that uses this technique for its control flow.\n\n2.  
Likewise, looping can be implemented using those same tools, provided our\n    interpreter supports an important optimization. What is it, and why is it\n    necessary? Name a language that uses this technique for iteration.\n\n3.  Unlike Lox, most other C-style languages also support `break` and `continue`\n    statements inside loops. Add support for `break` statements.\n\n    The syntax is a `break` keyword followed by a semicolon. It should be a\n    syntax error to have a `break` statement appear outside of any enclosing\n    loop. At runtime, a `break` statement causes execution to jump to the end of\n    the nearest enclosing loop and proceed from there. Note that the `break`\n    may be nested inside other blocks and `if` statements that also need to be\n    exited.\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: Spoonfuls of Syntactic Sugar\n\nWhen you design your own language, you choose how much syntactic sugar to pour\ninto the grammar. Do you make an unsweetened health food where each semantic\noperation maps to a single syntactic unit, or some decadent dessert where every\nbit of behavior can be expressed ten different ways? Successful languages\ninhabit all points along this continuum.\n\nOn the extreme acrid end are those with ruthlessly minimal syntax like Lisp,\nForth, and Smalltalk. Lispers famously claim their language \"has no syntax\",\nwhile Smalltalkers proudly show that you can fit the entire grammar on an index\ncard. This tribe has the philosophy that the *language* doesn't need syntactic\nsugar. Instead, the minimal syntax and semantics it provides are powerful enough\nto let library code be as expressive as if it were part of the language itself.\n\nNear these are languages like C, Lua, and Go. They aim for simplicity and\nclarity over minimalism. Some, like Go, deliberately eschew both syntactic sugar\nand the kind of syntactic extensibility of the previous category. 
They want the\nsyntax to get out of the way of the semantics, so they focus on keeping both the\ngrammar and libraries simple. Code should be obvious more than beautiful.\n\nSomewhere in the middle you have languages like Java, C#, and Python. Eventually\nyou reach Ruby, C++, Perl, and D -- languages which have stuffed so much syntax\ninto their grammar, they are running out of punctuation characters on the\nkeyboard.\n\nTo some degree, location on the spectrum correlates with age. It's relatively\neasy to add bits of syntactic sugar in later releases. New syntax is a crowd\npleaser, and it's less likely to break existing programs than mucking with the\nsemantics. Once added, you can never take it away, so languages tend to sweeten\nwith time. One of the main benefits of creating a new language from scratch is\nit gives you an opportunity to scrape off those accumulated layers of frosting\nand start over.\n\nSyntactic sugar has a bad rap among the PL intelligentsia. There's a real fetish\nfor minimalism in that crowd. There is some justification for that. Poorly\ndesigned, unneeded syntax raises the cognitive load without adding enough\nexpressiveness to carry its weight. Since there is always pressure to cram new\nfeatures into the language, it takes discipline and a focus on simplicity to\navoid bloat. Once you add some syntax, you're stuck with it, so it's smart to be\nparsimonious.\n\nAt the same time, most successful languages do have fairly complex grammars, at\nleast by the time they are widely used. Programmers spend a ton of time in their\nlanguage of choice, and a few niceties here and there really can improve the\ncomfort and efficiency of their work.\n\nStriking the right balance -- choosing the right level of sweetness for your\nlanguage -- relies on your own sense of taste.\n\n</div>\n"
  },
  {
    "path": "book/dedication.md",
    "content": "<div class=\"dedication\">\n\n<img src=\"image/ginny.png\" alt=\"My beloved dog and her stupid face.\" />\n\nTo Ginny, I miss your stupid face.\n\n</div>"
  },
  {
    "path": "book/evaluating-expressions.md",
    "content": "> You are my creator, but I am your master; Obey!\n>\n> <cite>Mary Shelley, <em>Frankenstein</em></cite>\n\nIf you want to properly set the mood for this chapter, try to conjure up a\nthunderstorm, one of those swirling tempests that likes to yank open shutters at\nthe climax of the story. Maybe toss in a few bolts of lightning. In this\nchapter, our interpreter will take breath, open its eyes, and execute some code.\n\n<span name=\"spooky\"></span>\n\n<img src=\"image/evaluating-expressions/lightning.png\" alt=\"A bolt of lightning strikes a Victorian mansion. Spooky!\" />\n\n<aside name=\"spooky\">\n\nA decrepit Victorian mansion is optional, but adds to the ambiance.\n\n</aside>\n\nThere are all manner of ways that language implementations make a computer do\nwhat the user's source code commands. They can compile it to machine code,\ntranslate it to another high-level language, or reduce it to some bytecode\nformat for a virtual machine to run. For our first interpreter, though, we are\ngoing to take the simplest, shortest path and execute the syntax tree itself.\n\nRight now, our parser only supports expressions. So, to \"execute\" code, we will\nevaluate an expression and produce a value. For each kind of expression syntax\nwe can parse -- literal, operator, etc. -- we need a corresponding chunk of code\nthat knows how to evaluate that tree and produce a result. That raises two\nquestions:\n\n1. What kinds of values do we produce?\n\n2. How do we organize those chunks of code?\n\nTaking them on one at a time...\n\n## Representing Values\n\nIn Lox, <span name=\"value\">values</span> are created by literals, computed by\nexpressions, and stored in variables. The user sees these as *Lox* objects, but\nthey are implemented in the underlying language our interpreter is written in.\nThat means bridging the lands of Lox's dynamic typing and Java's static types. 
A\nvariable in Lox can store a value of any (Lox) type, and can even store values\nof different types at different points in time. What Java type might we use to\nrepresent that?\n\n<aside name=\"value\">\n\nHere, I'm using \"value\" and \"object\" pretty much interchangeably.\n\nLater in the C interpreter we'll make a slight distinction between them, but\nthat's mostly to have unique terms for two different corners of the\nimplementation -- in-place versus heap-allocated data. From the user's\nperspective, the terms are synonymous.\n\n</aside>\n\nGiven a Java variable with that static type, we must also be able to determine\nwhich kind of value it holds at runtime. When the interpreter executes a `+`\noperator, it needs to tell if it is adding two numbers or concatenating two\nstrings. Is there a Java type that can hold numbers, strings, Booleans, and\nmore? Is there one that can tell us what its runtime type is? There is! Good old\njava.lang.Object.\n\nIn places in the interpreter where we need to store a Lox value, we can use\nObject as the type. Java has boxed versions of its primitive types that all\nsubclass Object, so we can use those for Lox's built-in types:\n\n<table>\n<thead>\n<tr>\n  <td>Lox type</td>\n  <td>Java representation</td>\n</tr>\n</thead>\n<tbody>\n<tr>\n  <td>Any Lox value</td>\n  <td>Object</td>\n</tr>\n<tr>\n  <td><code>nil</code></td>\n  <td><code>null</code></td>\n</tr>\n<tr>\n  <td>Boolean</td>\n  <td>Boolean</td>\n</tr>\n<tr>\n  <td>number</td>\n  <td>Double</td>\n</tr>\n<tr>\n  <td>string</td>\n  <td>String</td>\n</tr>\n</tbody>\n</table>\n\nGiven a value of static type Object, we can determine if the runtime value is a\nnumber or a string or whatever using Java's built-in `instanceof` operator. 
In\nother words, the <span name=\"jvm\">JVM</span>'s own object representation\nconveniently gives us everything we need to implement Lox's built-in types.\nWe'll have to do a little more work later when we add Lox's notions of\nfunctions, classes, and instances, but Object and the boxed primitive classes\nare sufficient for the types we need right now.\n\n<aside name=\"jvm\">\n\nAnother thing we need to do with values is manage their memory, and Java does\nthat too. A handy object representation and a really nice garbage collector are\nthe main reasons we're writing our first interpreter in Java.\n\n</aside>\n\n## Evaluating Expressions\n\nNext, we need blobs of code to implement the evaluation logic for each kind of\nexpression we can parse. We could stuff that code into the syntax tree classes\nin something like an `interpret()` method. In effect, we could tell each syntax\ntree node, \"Interpret thyself\". This is the Gang of Four's\n[Interpreter design pattern][]. It's a neat pattern, but like I mentioned\nearlier, it gets messy if we jam all sorts of logic into the tree classes.\n\n[interpreter design pattern]: https://en.wikipedia.org/wiki/Interpreter_pattern\n\nInstead, we're going to reuse our groovy [Visitor pattern][]. In the previous\nchapter, we created an AstPrinter class. It took in a syntax tree and\nrecursively traversed it, building up a string which it ultimately returned.\nThat's almost exactly what a real interpreter does, except instead of\nconcatenating strings, it computes values.\n\n[visitor pattern]: representing-code.html#the-visitor-pattern\n\nWe start with a new class.\n\n^code interpreter-class\n\nThe class declares that it's a visitor. The return type of the visit methods\nwill be Object, the root class that we use to refer to a Lox value in our Java\ncode. To satisfy the Visitor interface, we need to define visit methods for each\nof the four expression tree classes our parser produces. 
We'll start with the\nsimplest...\n\n### Evaluating literals\n\nThe leaves of an expression tree -- the atomic bits of syntax that all other\nexpressions are composed of -- are <span name=\"leaf\">literals</span>. Literals\nare almost values already, but the distinction is important. A literal is a *bit\nof syntax* that produces a value. A literal always appears somewhere in the\nuser's source code. Lots of values are produced by computation and don't exist\nanywhere in the code itself. Those aren't literals. A literal comes from the\nparser's domain. Values are an interpreter concept, part of the runtime's world.\n\n<aside name=\"leaf\">\n\nIn the [next chapter][vars], when we implement variables, we'll add identifier\nexpressions, which are also leaf nodes.\n\n[vars]: statements-and-state.html\n\n</aside>\n\nSo, much like we converted a literal *token* into a literal *syntax tree node*\nin the parser, now we convert the literal tree node into a runtime value. That\nturns out to be trivial.\n\n^code visit-literal\n\nWe eagerly produced the runtime value way back during scanning and stuffed it in\nthe token. The parser took that value and stuck it in the literal tree node,\nso to evaluate a literal, we simply pull it back out.\n\n### Evaluating parentheses\n\nThe next simplest node to evaluate is grouping -- the node you get as a result\nof using explicit parentheses in an expression.\n\n^code visit-grouping\n\nA <span name=\"grouping\">grouping</span> node has a reference to an inner node\nfor the expression contained inside the parentheses. To evaluate the grouping\nexpression itself, we recursively evaluate that subexpression and return it.\n\nWe rely on this helper method which simply sends the expression back into the\ninterpreter's visitor implementation:\n\n<aside name=\"grouping\">\n\nSome parsers don't define tree nodes for parentheses. 
Instead, when parsing a\nparenthesized expression, they simply return the node for the inner expression.\nWe do create a node for parentheses in Lox because we'll need it later to\ncorrectly handle the left-hand sides of assignment expressions.\n\n</aside>\n\n^code evaluate\n\n### Evaluating unary expressions\n\nLike grouping, unary expressions have a single subexpression that we must\nevaluate first. The difference is that the unary expression itself does a little\nwork afterwards.\n\n^code visit-unary\n\nFirst, we evaluate the operand expression. Then we apply the unary operator\nitself to the result of that. There are two different unary expressions,\nidentified by the type of the operator token.\n\nShown here is `-`, which negates the result of the subexpression. The\nsubexpression must be a number. Since we don't *statically* know that in Java,\nwe <span name=\"cast\">cast</span> it before performing the operation. This type\ncast happens at runtime when the `-` is evaluated. That's the core of what makes\na language dynamically typed right there.\n\n<aside name=\"cast\">\n\nYou're probably wondering what happens if the cast fails. Fear not, we'll get\ninto that soon.\n\n</aside>\n\nYou can start to see how evaluation recursively traverses the tree. We can't\nevaluate the unary operator itself until after we evaluate its operand\nsubexpression. That means our interpreter is doing a **post-order traversal** --\neach node evaluates its children before doing its own work.\n\nThe other unary operator is logical not.\n\n^code unary-bang (1 before, 1 after)\n\nThe implementation is simple, but what is this \"truthy\" thing about? 
We need to\nmake a little side trip to one of the great questions of Western philosophy:\n*What is truth?*\n\n### Truthiness and falsiness\n\nOK, maybe we're not going to really get into the universal question, but at\nleast inside the world of Lox, we need to decide what happens when you use\nsomething other than `true` or `false` in a logic operation like `!` or any\nother place where a Boolean is expected.\n\nWe *could* just say it's an error because we don't roll with implicit\nconversions, but most dynamically typed languages aren't that ascetic. Instead,\nthey take the universe of values of all types and partition them into two sets,\none of which they define to be \"true\", or \"truthful\", or (my favorite) \"truthy\",\nand the rest which are \"false\" or \"falsey\". This partitioning is somewhat\narbitrary and gets <span name=\"weird\">weird</span> in a few languages.\n\n<aside name=\"weird\" class=\"bottom\">\n\nIn JavaScript, strings are truthy, but empty strings are not. Arrays are truthy\nbut empty arrays are... also truthy. The number `0` is falsey, but the *string*\n`\"0\"` is truthy.\n\nIn Python, empty strings are falsey like in JS, but other empty sequences are\nfalsey too.\n\nIn PHP, both the number `0` and the string `\"0\"` are falsey. Most other\nnon-empty strings are truthy.\n\nGet all that?\n\n</aside>\n\nLox follows Ruby's simple rule: `false` and `nil` are falsey, and everything else\nis truthy. We implement that like so:\n\n^code is-truthy\n\n### Evaluating binary operators\n\nOn to the last expression tree class, binary operators. There's a handful of\nthem, and we'll start with the arithmetic ones.\n\n^code visit-binary\n\n<aside name=\"left\">\n\nDid you notice we pinned down a subtle corner of the language semantics here?\nIn a binary expression, we evaluate the operands in left-to-right order. 
If\nthose operands have side effects, that choice is user visible, so this isn't\nsimply an implementation detail.\n\nIf we want our two interpreters to be consistent (hint: we do), we'll need to\nmake sure clox does the same thing.\n\n</aside>\n\nI think you can figure out what's going on here. The main difference from the\nunary negation operator is that we have two operands to evaluate.\n\nI left out one arithmetic operator because it's a little special.\n\n^code binary-plus (3 before, 1 after)\n\nThe `+` operator can also be used to concatenate two strings. To handle that, we\ndon't just assume the operands are a certain type and *cast* them, we\ndynamically *check* the type and choose the appropriate operation. This is why\nwe need our object representation to support `instanceof`.\n\n<aside name=\"plus\">\n\nWe could have defined an operator specifically for string concatenation. That's\nwhat Perl (`.`), Lua (`..`), Smalltalk (`,`), Haskell (`++`), and others do.\n\nI thought it would make Lox a little more approachable to use the same syntax as\nJava, JavaScript, Python, and others. This means that the `+` operator is\n**overloaded** to support both adding numbers and concatenating strings. Even in\nlanguages that don't use `+` for strings, they still often overload it for\nadding both integers and floating-point numbers.\n\n</aside>\n\nNext up are the comparison operators.\n\n^code binary-comparison (1 before, 1 after)\n\nThey are basically the same as arithmetic. The only difference is that where the\narithmetic operators produce a value whose type is the same as the operands\n(numbers or strings), the comparison operators always produce a Boolean.\n\nThe last pair of operators are equality.\n\n^code binary-equality\n\nUnlike the comparison operators which require numbers, the equality operators\nsupport operands of any type, even mixed ones. 
You can't ask Lox if 3 is *less*\nthan `\"three\"`, but you can ask if it's <span name=\"equal\">*equal*</span> to\nit.\n\n<aside name=\"equal\">\n\nSpoiler alert: it's not.\n\n</aside>\n\nLike truthiness, the equality logic is hoisted out into a separate method.\n\n^code is-equal\n\nThis is one of those corners where the details of how we represent Lox objects\nin terms of Java matter. We need to correctly implement *Lox's* notion of\nequality, which may be different from Java's.\n\nFortunately, the two are pretty similar. Lox doesn't do implicit conversions in\nequality and Java does not either. We do have to handle `nil`/`null` specially\nso that we don't throw a NullPointerException if we try to call `equals()` on\n`null`. Otherwise, we're fine. Java's <span name=\"nan\">`equals()`</span> methods\non Boolean, Double, and String have the behavior we want for Lox.\n\n<aside name=\"nan\">\n\nWhat do you expect this to evaluate to:\n\n```lox\n(0 / 0) == (0 / 0)\n```\n\nAccording to [IEEE 754][], which specifies the behavior of double-precision\nnumbers, dividing zero by zero gives you the special **NaN** (\"not a number\")\nvalue. Strangely enough, NaN is *not* equal to itself.\n\nIn Java, the `==` operator on primitive doubles preserves that behavior, but the\n`equals()` method on the Double class does not. Lox uses the latter, so doesn't\nfollow IEEE. These kinds of subtle incompatibilities occupy a dismaying fraction\nof language implementers' lives.\n\n[ieee 754]: https://en.wikipedia.org/wiki/IEEE_754\n\n</aside>\n\nAnd that's it! That's all the code we need to correctly interpret a valid Lox\nexpression. But what about an *invalid* one? In particular, what happens when a\nsubexpression evaluates to an object of the wrong type for the operation being\nperformed?\n\n## Runtime Errors\n\nI was cavalier about jamming casts in whenever a subexpression produces an\nObject and the operator requires it to be a number or a string. Those casts can\nfail. 
Even though the user's code is erroneous, if we want to make a <span\nname=\"fail\">usable</span> language, we are responsible for handling that error\ngracefully.\n\n<aside name=\"fail\">\n\nWe could simply not detect or report a type error at all. This is what C does if\nyou cast a pointer to some type that doesn't match the data that is actually\nbeing pointed to. C gains flexibility and speed by allowing that, but is\nalso famously dangerous. Once you misinterpret bits in memory, all bets are off.\n\nFew modern languages accept unsafe operations like that. Instead, most are\n**memory safe** and ensure -- through a combination of static and runtime checks\n-- that a program can never incorrectly interpret the value stored in a piece of\nmemory.\n\n</aside>\n\nIt's time for us to talk about **runtime errors**. I spilled a lot of ink in the\nprevious chapters talking about error handling, but those were all *syntax* or\n*static* errors. Those are detected and reported before *any* code is executed.\nRuntime errors are failures that the language semantics demand we detect and\nreport while the program is running (hence the name).\n\nRight now, if an operand is the wrong type for the operation being performed,\nthe Java cast will fail and the JVM will throw a ClassCastException. That\nunwinds the whole stack and exits the application, vomiting a Java stack trace\nonto the user. That's probably not what we want. The fact that Lox is\nimplemented in Java should be a detail hidden from the user. Instead, we want\nthem to understand that a *Lox* runtime error occurred, and give them an error\nmessage relevant to our language and their program.\n\nThe Java behavior does have one thing going for it, though. It correctly stops\nexecuting any code when the error occurs. 
Let's say the user enters some\nexpression like:\n\n```lox\n2 * (3 / -\"muffin\")\n```\n\nYou can't negate a <span name=\"muffin\">muffin</span>, so we need to report a\nruntime error at that inner `-` expression. That in turn means we can't evaluate\nthe `/` expression since it has no meaningful right operand. Likewise for the\n`*`. So when a runtime error occurs deep in some expression, we need to escape\nall the way out.\n\n<aside name=\"muffin\">\n\nI don't know, man, *can* you negate a muffin?\n\n<img src=\"image/evaluating-expressions/muffin.png\" alt=\"A muffin, negated.\" />\n\n</aside>\n\nWe could print a runtime error and then abort the process and exit the\napplication entirely. That has a certain melodramatic flair. Sort of the\nprogramming language interpreter equivalent of a mic drop.\n\nTempting as that is, we should probably do something a little less cataclysmic.\nWhile a runtime error needs to stop evaluating the *expression*, it shouldn't\nkill the *interpreter*. If a user is running the REPL and has a typo in a line\nof code, they should still be able to keep the session going and enter more code\nafter that.\n\n### Detecting runtime errors\n\nOur tree-walk interpreter evaluates nested expressions using recursive method\ncalls, and we need to unwind out of all of those. Throwing an exception in Java\nis a fine way to accomplish that. However, instead of using Java's own cast\nfailure, we'll define a Lox-specific one so that we can handle it how we want.\n\nBefore we do the cast, we check the object's type ourselves. So, for unary `-`,\nwe add:\n\n^code check-unary-operand (1 before, 1 after)\n\nThe code to check the operand is:\n\n^code check-operand\n\nWhen the check fails, it throws one of these:\n\n^code runtime-error-class\n\nUnlike the Java cast exception, our <span name=\"class\">class</span> tracks the\ntoken that identifies where in the user's code the runtime error came from. 
As\nwith static errors, this helps the user know where to fix their code.\n\n<aside name=\"class\">\n\nI admit the name \"RuntimeError\" is confusing since Java defines a\nRuntimeException class. An annoying thing about building interpreters is your\nnames often collide with ones already taken by the implementation language. Just\nwait until we support Lox classes.\n\n</aside>\n\nWe need similar checking for the binary operators. Since I promised you every\nsingle line of code needed to implement the interpreters, I'll run through them\nall.\n\nGreater than:\n\n^code check-greater-operand (1 before, 1 after)\n\nGreater than or equal to:\n\n^code check-greater-equal-operand (1 before, 1 after)\n\nLess than:\n\n^code check-less-operand (1 before, 1 after)\n\nLess than or equal to:\n\n^code check-less-equal-operand (1 before, 1 after)\n\nSubtraction:\n\n^code check-minus-operand (1 before, 1 after)\n\nDivision:\n\n^code check-slash-operand (1 before, 1 after)\n\nMultiplication:\n\n^code check-star-operand (1 before, 1 after)\n\nAll of those rely on this validator, which is virtually the same as the unary\none:\n\n^code check-operands\n\n<aside name=\"operand\">\n\nAnother subtle semantic choice: We evaluate *both* operands before checking the\ntype of *either*. Imagine we have a function `say()` that prints its argument\nthen returns it. Using that, we write:\n\n```lox\nsay(\"left\") - say(\"right\");\n```\n\nOur interpreter prints \"left\" and \"right\" before reporting the runtime error. We\ncould have instead specified that the left operand is checked before even\nevaluating the right.\n\n</aside>\n\nThe last remaining operator, again the odd one out, is addition. Since `+` is\noverloaded for numbers and strings, it already has code to check the types. All\nwe need to do is fail if neither of the two success cases match.\n\n^code string-wrong-type (3 before, 1 after)\n\nThat gets us detecting runtime errors deep in the innards of the evaluator. 
The\nerrors are getting thrown. The next step is to write the code that catches them.\nFor that, we need to wire up the Interpreter class into the main Lox class that\ndrives it.\n\n## Hooking Up the Interpreter\n\nThe visit methods are sort of the guts of the Interpreter class, where the real\nwork happens. We need to wrap a skin around them to interface with the rest of\nthe program. The Interpreter's public API is simply one method.\n\n^code interpret\n\nThis takes in a syntax tree for an expression and evaluates it. If that\nsucceeds, `evaluate()` returns an object for the result value. `interpret()`\nconverts that to a string and shows it to the user. To convert a Lox value to a\nstring, we rely on:\n\n^code stringify\n\nThis is another of those pieces of code like `isTruthy()` that crosses the\nmembrane between the user's view of Lox objects and their internal\nrepresentation in Java.\n\nIt's pretty straightforward. Since Lox was designed to be familiar to someone\ncoming from Java, things like Booleans look the same in both languages. The two\nedge cases are `nil`, which we represent using Java's `null`, and numbers.\n\nLox uses double-precision numbers even for integer values. In that case, they\nshould print without a decimal point. Since Java has both floating point and\ninteger types, it wants you to know which one you're using. It tells you by\nadding an explicit `.0` to integer-valued doubles. We don't care about that, so\nwe <span name=\"number\">hack</span> it off the end.\n\n<aside name=\"number\">\n\nYet again, we take care of this edge case with numbers to ensure that jlox and\nclox work the same. 
Handling weird corners of the language like this will drive\nyou crazy but is an important part of the job.\n\nUsers rely on these details -- either deliberately or inadvertently -- and if\nthe implementations aren't consistent, their program will break when they run it\non different interpreters.\n\n</aside>\n\n### Reporting runtime errors\n\nIf a runtime error is thrown while evaluating the expression, `interpret()`\ncatches it. This lets us report the error to the user and then gracefully\ncontinue. All of our existing error reporting code lives in the Lox class, so we\nput this method there too:\n\n^code runtime-error-method\n\nWe use the token associated with the RuntimeError to tell the user what line of\ncode was executing when the error occurred. Even better would be to give the\nuser an entire call stack to show how they *got* to be executing that code. But\nwe don't have function calls yet, so I guess we don't have to worry about it.\n\nAfter showing the error, `runtimeError()` sets this field:\n\n^code had-runtime-error-field (1 before, 1 after)\n\nThat field plays a small but important role.\n\n^code check-runtime-error (4 before, 1 after)\n\nIf the user is running a Lox <span name=\"repl\">script from a file</span> and a\nruntime error occurs, we set an exit code when the process quits to let the\ncalling process know. Not everyone cares about shell etiquette, but we do.\n\n<aside name=\"repl\">\n\nIf the user is running the REPL, we don't care about tracking runtime errors.\nAfter they are reported, we simply loop around and let them input new code and\nkeep going.\n\n</aside>\n\n### Running the interpreter\n\nNow that we have an interpreter, the Lox class can start using it.\n\n^code interpreter-instance (1 before, 1 after)\n\nWe make the field static so that successive calls to `run()` inside a REPL\nsession reuse the same interpreter. That doesn't make a difference now, but it\nwill later when the interpreter stores global variables. 
Those variables should\npersist throughout the REPL session.\n\nFinally, we remove the line of temporary code from the [last chapter][] for\nprinting the syntax tree and replace it with this:\n\n[last chapter]: parsing-expressions.html\n\n^code interpreter-interpret (3 before, 1 after)\n\nWe have an entire language pipeline now: scanning, parsing, and\nexecution. Congratulations, you now have your very own arithmetic calculator.\n\nAs you can see, the interpreter is pretty bare bones. But the Interpreter class\nand the Visitor pattern we've set up today form the skeleton that later chapters\nwill stuff full of interesting guts -- variables, functions, etc. Right now, the\ninterpreter doesn't do very much, but it's alive!\n\n<img src=\"image/evaluating-expressions/skeleton.png\" alt=\"A skeleton waving hello.\" />\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  Allowing comparisons on types other than numbers could be useful. The\n    operators might have a reasonable interpretation for strings. Even\n    comparisons among mixed types, like `3 < \"pancake\"` could be handy to enable\n    things like ordered collections of heterogeneous types. Or it could simply\n    lead to bugs and confusion.\n\n    Would you extend Lox to support comparing other types? If so, which pairs of\n    types do you allow and how do you define their ordering? Justify your\n    choices and compare them to other languages.\n\n2.  Many languages define `+` such that if *either* operand is a string, the\n    other is converted to a string and the results are then concatenated. For\n    example, `\"scone\" + 4` would yield `scone4`. Extend the code in\n    `visitBinaryExpr()` to support that.\n\n3.  What happens right now if you divide a number by zero? What do you think\n    should happen? Justify your choice. 
How do other languages you know handle\n    division by zero, and why do they make the choices they do?\n\n    Change the implementation in `visitBinaryExpr()` to detect and report a\n    runtime error for this case.\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: Static and Dynamic Typing\n\nSome languages, like Java, are statically typed, which means type errors are\ndetected and reported at compile time before any code is run. Others, like Lox,\nare dynamically typed and defer checking for type errors until runtime right\nbefore an operation is attempted. We tend to consider this a black-and-white\nchoice, but there is actually a continuum between them.\n\nIt turns out even most statically typed languages do *some* type checks at\nruntime. The type system checks most type rules statically, but inserts runtime\nchecks in the generated code for other operations.\n\nFor example, in Java, the *static* type system assumes a cast expression will\nalways safely succeed. After you cast some value, you can statically treat it as\nthe destination type and not get any compile errors. But downcasts can fail,\nobviously. The only reason the static checker can presume that casts always\nsucceed without violating the language's soundness guarantees is that the\ncast is checked *at runtime* and throws an exception on failure.\n\nA more subtle example is [covariant arrays][] in Java and C#. The static\nsubtyping rules for arrays allow operations that are not sound. Consider:\n\n[covariant arrays]: https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)#Covariant_arrays_in_Java_and_C.23\n\n```java\nObject[] stuff = new Integer[1];\nstuff[0] = \"not an int!\";\n```\n\nThis code compiles without any errors. The first line upcasts the Integer array\nand stores it in a variable of type Object array. The second line stores a\nstring in one of its cells. 
The Object array type statically allows that\n-- strings *are* Objects -- but the actual Integer array that `stuff` refers to\nat runtime should never have a string in it! To avoid that catastrophe, when you\nstore a value in an array, the JVM does a *runtime* check to make sure it's an\nallowed type. If not, it throws an ArrayStoreException.\n\nJava could have avoided the need to check this at runtime by disallowing the\ncast on the first line. It could make arrays *invariant* such that an array of\nIntegers is *not* an array of Objects. That's statically sound, but it prohibits\ncommon and safe patterns of code that only read from arrays. Covariance is safe\nif you never *write* to the array. Those patterns were particularly important\nfor usability in Java 1.0 before it supported generics. James Gosling and the\nother Java designers traded off a little static safety and performance -- those\narray store checks take time -- in return for some flexibility.\n\nThere are few modern statically typed languages that don't make that trade-off\n*somewhere*. Even Haskell will let you run code with non-exhaustive matches. If\nyou find yourself designing a statically typed language, keep in mind that you\ncan sometimes give users more flexibility without sacrificing *too* many of the\nbenefits of static safety by deferring some type checks until runtime.\n\nOn the other hand, a key reason users choose statically typed languages is\nthe confidence the language gives them that certain kinds of errors\ncan *never* occur when their program is run. Defer too many type checks until\nruntime, and you erode that confidence.\n\n</div>\n"
  },
  {
    "path": "book/functions.md",
    "content": "> And that is also the way the human mind works -- by the compounding of old\n> ideas into new structures that become new ideas that can themselves be used in\n> compounds, and round and round endlessly, growing ever more remote from the\n> basic earthbound imagery that is each language's soil.\n>\n> <cite>Douglas R. Hofstadter, <em>I Am a Strange Loop</em></cite>\n\nThis chapter marks the culmination of a lot of hard work. The previous chapters\nadd useful functionality in their own right, but each also supplies a piece of a\n<span name=\"lambda\">puzzle</span>. We'll take those pieces -- expressions,\nstatements, variables, control flow, and lexical scope -- add a couple more, and\nassemble them all into support for real user-defined functions and function\ncalls.\n\n<aside name=\"lambda\">\n\n<img src=\"image/functions/lambda.png\" alt=\"A lambda puzzle.\" />\n\n</aside>\n\n## Function Calls\n\nYou're certainly familiar with C-style function call syntax, but the grammar is\nmore subtle than you may realize. Calls are typically to named functions like:\n\n```lox\naverage(1, 2);\n```\n\nBut the <span name=\"pascal\">name</span> of the function being called isn't\nactually part of the call syntax. The thing being called -- the **callee** --\ncan be any expression that evaluates to a function. (Well, it does have to be a\npretty *high precedence* expression, but parentheses take care of that.) For\nexample:\n\n<aside name=\"pascal\">\n\nThe name *is* part of the call syntax in Pascal. You can call only named\nfunctions or functions stored directly in variables.\n\n</aside>\n\n```lox\ngetCallback()();\n```\n\nThere are two call expressions here. The first pair of parentheses has\n`getCallback` as its callee. But the second call has the entire `getCallback()`\nexpression as its callee. It is the parentheses following an expression that\nindicate a function call. 
You can think of a call as sort of like a postfix\noperator that starts with `(`.\n\nThis \"operator\" has higher precedence than any other operator, even the unary\nones. So we slot it into the grammar by having the `unary` rule bubble up to a\nnew `call` rule.\n\n<span name=\"curry\"></span>\n\n```ebnf\nunary          → ( \"!\" | \"-\" ) unary | call ;\ncall           → primary ( \"(\" arguments? \")\" )* ;\n```\n\nThis rule matches a primary expression followed by zero or more function calls.\nIf there are no parentheses, this parses a bare primary expression. Otherwise,\neach call is recognized by a pair of parentheses with an optional list of\narguments inside. The argument list grammar is:\n\n<aside name=\"curry\">\n\nThe rule uses `*` to allow matching a series of calls like `fn(1)(2)(3)`. Code\nlike that isn't common in C-style languages, but it is in the family of\nlanguages derived from ML. There, the normal way of defining a function that\ntakes multiple arguments is as a series of nested functions. Each function takes\none argument and returns a new function. That function consumes the next\nargument, returns yet another function, and so on. Eventually, once all of the\narguments are consumed, the last function completes the operation.\n\nThis style, called **currying**, after Haskell Curry (the same guy whose first\nname graces that *other* well-known functional language), is baked directly into\nthe language syntax so it's not as weird looking as it would be here.\n\n</aside>\n\n```ebnf\narguments      → expression ( \",\" expression )* ;\n```\n\nThis rule requires at least one argument expression, followed by zero or more\nother expressions, each preceded by a comma. To handle zero-argument calls, the\n`call` rule itself considers the entire `arguments` production to be optional.\n\nI admit, this seems more grammatically awkward than you'd expect for the\nincredibly common \"zero or more comma-separated things\" pattern. 
There are some\nsophisticated metasyntaxes that handle this better, but in our BNF and in many\nlanguage specs I've seen, it is this cumbersome.\n\nOver in our syntax tree generator, we add a <span name=\"call-ast\">new\nnode</span>.\n\n^code call-expr (1 before, 1 after)\n\n<aside name=\"call-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-call].\n\n[appendix-call]: appendix-ii.html#call-expression\n\n</aside>\n\nIt stores the callee expression and a list of expressions for the arguments. It\nalso stores the token for the closing parenthesis. We'll use that token's\nlocation when we report a runtime error caused by a function call.\n\nCrack open the parser. Where `unary()` used to jump straight to `primary()`,\nchange it to call, well, `call()`.\n\n^code unary-call (3 before, 1 after)\n\nIts definition is:\n\n^code call\n\nThe code here doesn't quite line up with the grammar rules. I moved a few things\naround to make the code cleaner -- one of the luxuries we have with a\nhandwritten parser. But it's roughly similar to how we parse infix operators.\nFirst, we parse a primary expression, the \"left operand\" to the call. Then, each\ntime we see a `(`, we call `finishCall()` to parse the call expression using the\npreviously parsed expression as the callee. The returned expression becomes the\nnew `expr` and we loop to see if the result is itself called.\n\n<aside name=\"while-true\">\n\nThis code would be simpler as `while (match(LEFT_PAREN))` instead of the silly\n`while (true)` and `break`. Don't worry, it will make sense when we expand the\nparser later to handle properties on objects.\n\n</aside>\n\nThe code to parse the argument list is in this helper:\n\n^code finish-call\n\nThis is more or less the `arguments` grammar rule translated to code, except\nthat we also handle the zero-argument case. We check for that case first by\nseeing if the next token is `)`. 
If it is, we don't try to parse any arguments.\n\nOtherwise, we parse an expression, then look for a comma indicating that there\nis another argument after that. We keep doing that as long as we find commas\nafter each expression. When we don't find a comma, then the argument list must\nbe done and we consume the expected closing parenthesis. Finally, we wrap the\ncallee and those arguments up into a call AST node.\n\n### Maximum argument counts\n\nRight now, the loop where we parse arguments has no bound. If you want to call a\nfunction and pass a million arguments to it, the parser would have no problem\nwith it. Do we want to limit that?\n\nOther languages have various approaches. The C standard says a conforming\nimplementation has to support *at least* 127 arguments to a function, but\ndoesn't say there's any upper limit. The Java specification says a method can\naccept *no more than* <span name=\"254\">255</span> arguments.\n\n<aside name=\"254\">\n\nThe limit is 25*4* arguments if the method is an instance method. That's because\n`this` -- the receiver of the method -- works like an argument that is\nimplicitly passed to the method, so it claims one of the slots.\n\n</aside>\n\nOur Java interpreter for Lox doesn't really need a limit, but having a maximum\nnumber of arguments will simplify our bytecode interpreter in [Part III][]. We\nwant our two interpreters to be compatible with each other, even in weird corner\ncases like this, so we'll add the same limit to jlox.\n\n[part iii]: a-bytecode-virtual-machine.html\n\n^code check-max-arity (1 before, 1 after)\n\nNote that the code here *reports* an error if it encounters too many arguments,\nbut it doesn't *throw* the error. Throwing is how we kick into panic mode which\nis what we want if the parser is in a confused state and doesn't know where it\nis in the grammar anymore. But here, the parser is still in a perfectly valid\nstate -- it just found too many arguments. 
So it reports the error and keeps on\nkeepin' on.\n\n### Interpreting function calls\n\nWe don't have any functions we can call, so it seems weird to start implementing\ncalls first, but we'll worry about that when we get there. First, our\ninterpreter needs a new import.\n\n^code import-array-list (1 after)\n\nAs always, interpretation starts with a new visit method for our new call\nexpression node.\n\n^code visit-call\n\nFirst, we evaluate the expression for the callee. Typically, this expression is\njust an identifier that looks up the function by its name, but it could be\nanything. Then we evaluate each of the argument expressions in order and store\nthe resulting values in a list.\n\n<aside name=\"in-order\">\n\nThis is another one of those subtle semantic choices. Since argument expressions\nmay have side effects, the order they are evaluated could be user visible. Even\nso, some languages like Scheme and C don't specify an order. This gives\ncompilers freedom to reorder them for efficiency, but means users may be\nunpleasantly surprised if arguments aren't evaluated in the order they expect.\n\n</aside>\n\nOnce we've got the callee and the arguments ready, all that remains is to\nperform the call. We do that by casting the callee to a <span\nname=\"callable\">LoxCallable</span> and then invoking a `call()` method on it.\nThe Java representation of any Lox object that can be called like a function\nwill implement this interface. That includes user-defined functions, naturally,\nbut also class objects since classes are \"called\" to construct new instances.\nWe'll also use it for one more purpose shortly.\n\n<aside name=\"callable\">\n\nI stuck \"Lox\" before the name to distinguish it from the Java standard library's\nown Callable interface. Alas, all the good simple names are already taken.\n\n</aside>\n\nThere isn't too much to this new interface.\n\n^code callable\n\nWe pass in the interpreter in case the class implementing `call()` needs it. 
We\nalso give it the list of evaluated argument values. The implementer's job is\nthen to return the value that the call expression produces.\n\n### Call type errors\n\nBefore we get to implementing LoxCallable, we need to make the visit method a\nlittle more robust. It currently ignores a couple of failure modes that we can't\npretend won't occur. First, what happens if the callee isn't actually something\nyou can call? What if you try to do this:\n\n```lox\n\"totally not a function\"();\n```\n\nStrings aren't callable in Lox. The runtime representation of a Lox string is a\nJava string, so when we cast that to LoxCallable, the JVM will throw a\nClassCastException. We don't want our interpreter to vomit out some nasty Java\nstack trace and die. Instead, we need to check the type ourselves first.\n\n^code check-is-callable (2 before, 1 after)\n\nWe still throw an exception, but now we're throwing our own exception type, one\nthat the interpreter knows to catch and report gracefully.\n\n### Checking arity\n\nThe other problem relates to the function's **arity**. Arity is the fancy term\nfor the number of arguments a function or operation expects. Unary operators\nhave arity one, binary operators two, etc. With functions, the arity is\ndetermined by the number of parameters it declares.\n\n```lox\nfun add(a, b, c) {\n  print a + b + c;\n}\n```\n\nThis function defines three parameters, `a`, `b`, and `c`, so its arity is\nthree and it expects three arguments. So what if you try to call it like this:\n\n```lox\nadd(1, 2, 3, 4); // Too many.\nadd(1, 2);       // Too few.\n```\n\nDifferent languages take different approaches to this problem. Of course, most\nstatically typed languages check this at compile time and refuse to compile the\ncode if the argument count doesn't match the function's arity. JavaScript\ndiscards any extra arguments you pass. 
If you don't pass enough, it fills in the\nmissing parameters with the magic sort-of-like-null-but-not-really value\n`undefined`. Python is stricter. It raises a runtime error if the argument list\nis too short or too long.\n\nI think the latter is a better approach. Passing the wrong number of arguments\nis almost always a bug, and it's a mistake I do make in practice. Given that,\nthe sooner the implementation draws my attention to it, the better. So for Lox,\nwe'll take Python's approach. Before invoking the callable, we check to see if\nthe argument list's length matches the callable's arity.\n\n^code check-arity (1 before, 1 after)\n\nThat requires a new method on the LoxCallable interface to ask it its arity.\n\n^code callable-arity (1 before, 1 after)\n\nWe *could* push the arity checking into the concrete implementation of `call()`.\nBut, since we'll have multiple classes implementing LoxCallable, that would end\nup with redundant validation spread across a few classes. Hoisting it up into\nthe visit method lets us do it in one place.\n\n## Native Functions\n\nWe can theoretically call functions, but we have no functions to call yet.\nBefore we get to user-defined functions, now is a good time to introduce a vital\nbut often overlooked facet of language implementations -- <span\nname=\"native\">**native functions**</span>. These are functions that the\ninterpreter exposes to user code but that are implemented in the host language\n(in our case Java), not the language being implemented (Lox).\n\nSometimes these are called **primitives**, **external functions**, or **foreign\nfunctions**. Since these functions can be called while the user's program is\nrunning, they form part of the implementation's runtime. A lot of programming\nlanguage books gloss over these because they aren't conceptually interesting.\nThey're mostly grunt work.\n\n<aside name=\"native\">\n\nCuriously, two names for these functions -- \"native\" and \"foreign\" -- are\nantonyms. 
Maybe it depends on the perspective of the person choosing the term.\nIf you think of yourself as \"living\" within the runtime's implementation (in our\ncase, Java) then functions written in that are \"native\". But if you have the\nmindset of a *user* of your language, then the runtime is implemented in some\nother \"foreign\" language.\n\nOr it may be that \"native\" refers to the machine code language of the underlying\nhardware. In Java, \"native\" methods are ones implemented in C or C++ and\ncompiled to native machine code.\n\n<img src=\"image/functions/foreign.png\" class=\"above\" alt=\"All a matter of perspective.\" />\n\n</aside>\n\nBut when it comes to making your language actually good at doing useful stuff,\nthe native functions your implementation provides are key. They provide access\nto the fundamental services that all programs are defined in terms of. If you\ndon't provide native functions to access the file system, a user's going to have\na hell of a time writing a program that reads and <span\nname=\"print\">displays</span> a file.\n\n<aside name=\"print\">\n\nA classic native function almost every language provides is one to print text to\nstdout. In Lox, I made `print` a built-in statement so that we could get stuff\non screen in the chapters before this one.\n\nOnce we have functions, we could simplify the language by tearing out the old\nprint syntax and replacing it with a native function. But that would mean that\nexamples early in the book wouldn't run on the interpreter from later chapters\nand vice versa. So, for the book, I'll leave it alone.\n\nIf you're building an interpreter for your *own* language, though, you may want\nto consider it.\n\n</aside>\n\nMany languages also allow users to provide their own native functions. 
The\nmechanism for doing so is called a **foreign function interface** (**FFI**),\n**native extension**, **native interface**, or something along those lines.\nThese are nice because they free the language implementer from providing access\nto every single capability the underlying platform supports. We won't define an\nFFI for jlox, but we will add one native function to give you an idea of what it\nlooks like.\n\n### Telling time\n\nWhen we get to [Part III][] and start working on a much more efficient\nimplementation of Lox, we're going to care deeply about performance. Performance\nwork requires measurement, and that in turn means **benchmarks**. These are\nprograms that measure the time it takes to exercise some corner of the\ninterpreter.\n\nWe could measure the time it takes to start up the interpreter, run the\nbenchmark, and exit, but that adds a lot of overhead -- JVM startup time, OS\nshenanigans, etc. That stuff does matter, of course, but if you're just trying\nto validate an optimization to some piece of the interpreter, you don't want\nthat overhead obscuring your results.\n\nA nicer solution is to have the benchmark script itself measure the time elapsed\nbetween two points in the code. To do that, a Lox program needs to be able to\ntell time. There's no way to do that now -- you can't implement a useful clock\n\"from scratch\" without access to the underlying clock on the computer.\n\nSo we'll add `clock()`, a native function that returns the number of seconds\nthat have passed since some fixed point in time. The difference between two\nsuccessive invocations tells you how much time elapsed between the two calls.\nThis function is defined in the global scope, so let's ensure the interpreter\nhas access to that.\n\n^code global-environment (2 before, 2 after)\n\nThe `environment` field in the interpreter changes as we enter and exit local\nscopes. It tracks the *current* environment. 
This new `globals` field holds a\nfixed reference to the outermost global environment.\n\nWhen we instantiate an Interpreter, we stuff the native function in that global\nscope.\n\n^code interpreter-constructor (2 before, 1 after)\n\nThis defines a <span name=\"lisp-1\">variable</span> named \"clock\". Its value is a\nJava anonymous class that implements LoxCallable. The `clock()` function takes\nno arguments, so its arity is zero. The implementation of `call()` calls the\ncorresponding Java function and converts the result to a double value in\nseconds.\n\n<aside name=\"lisp-1\">\n\nIn Lox, functions and variables occupy the same namespace. In Common Lisp, the\ntwo live in their own worlds. A function and variable with the same name don't\ncollide. If you call the name, it looks up the function. If you refer to it, it\nlooks up the variable. This does require jumping through some hoops when you do\nwant to refer to a function as a first-class value.\n\nRichard P. Gabriel and Kent Pitman coined the terms \"Lisp-1\" to refer to\nlanguages like Scheme that put functions and variables in the same namespace,\nand \"Lisp-2\" for languages like Common Lisp that partition them. Despite being\ntotally opaque, those names have since stuck. Lox is a Lisp-1.\n\n</aside>\n\nIf we wanted to add other native functions -- reading input from the user,\nworking with files, etc. -- we could add them each as their own anonymous class\nthat implements LoxCallable. But for the book, this one is really all we need.\n\nLet's get ourselves out of the function-defining business and let our users\ntake over...\n\n## Function Declarations\n\nWe finally get to add a new production to the `declaration` rule we introduced\nback when we added variables. Function declarations, like variables, bind a new\n<span name=\"name\">name</span>. 
That means they are allowed only in places where\na declaration is permitted.\n\n<aside name=\"name\">\n\nA named function declaration isn't really a single primitive operation. It's\nsyntactic sugar for two distinct steps: (1) creating a new function object, and\n(2) binding a new variable to it. If Lox had syntax for anonymous functions, we\nwouldn't need function declaration statements. You could just do:\n\n```lox\nvar add = fun (a, b) {\n  print a + b;\n};\n```\n\nHowever, since named functions are the common case, I went ahead and gave Lox\nnice syntax for them.\n\n</aside>\n\n```ebnf\ndeclaration    → funDecl\n               | varDecl\n               | statement ;\n```\n\nThe updated `declaration` rule references this new rule:\n\n```ebnf\nfunDecl        → \"fun\" function ;\nfunction       → IDENTIFIER \"(\" parameters? \")\" block ;\n```\n\nThe main `funDecl` rule uses a separate helper rule `function`. A function\n*declaration statement* is the `fun` keyword followed by the actual function-y\nstuff. When we get to classes, we'll reuse that `function` rule for declaring\nmethods. Those look similar to function declarations, but aren't preceded by\n<span name=\"fun\">`fun`</span>.\n\n<aside name=\"fun\">\n\nMethods are too classy to have fun.\n\n</aside>\n\nThe function itself is a name followed by the parenthesized parameter list and\nthe body. The body is always a braced block, using the same grammar rule that\nblock statements use. The parameter list uses this rule:\n\n```ebnf\nparameters     → IDENTIFIER ( \",\" IDENTIFIER )* ;\n```\n\nIt's like the earlier `arguments` rule, except that each parameter is an\nidentifier, not an expression. 
That's a lot of new syntax for the parser to chew\nthrough, but the resulting AST <span name=\"fun-ast\">node</span> isn't too bad.\n\n^code function-ast (1 before, 1 after)\n\n<aside name=\"fun-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-fun].\n\n[appendix-fun]: appendix-ii.html#function-statement\n\n</aside>\n\nA function node has a name, a list of parameters (their names), and then the\nbody. We store the body as the list of statements contained inside the curly\nbraces.\n\nOver in the parser, we weave in the new declaration.\n\n^code match-fun (1 before, 1 after)\n\nLike other statements, a function is recognized by the leading keyword. When we\nencounter `fun`, we call `function`. That corresponds to the `function` grammar\nrule since we already matched and consumed the `fun` keyword. We'll build the\nmethod up a piece at a time, starting with this:\n\n^code parse-function\n\nRight now, it only consumes the identifier token for the function's name. You\nmight be wondering about that funny little `kind` parameter. Just like we reuse\nthe grammar rule, we'll reuse the `function()` method later to parse methods\ninside classes. When we do that, we'll pass in \"method\" for `kind` so that the\nerror messages are specific to the kind of declaration being parsed.\n\nNext, we parse the parameter list and the pair of parentheses wrapped around it.\n\n^code parse-parameters (1 before, 1 after)\n\nThis is like the code for handling arguments in a call, except not split out\ninto a helper method. The outer `if` statement handles the zero parameter case,\nand the inner `while` loop parses parameters as long as we find commas to\nseparate them. 
The result is the list of tokens for each parameter's name.\n\nJust like we do with arguments to function calls, we validate at parse time\nthat you don't exceed the maximum number of parameters a function is allowed to\nhave.\n\nFinally, we parse the body and wrap it all up in a function node.\n\n^code parse-body (1 before, 1 after)\n\nNote that we consume the `{` at the beginning of the body here before calling\n`block()`. That's because `block()` assumes the brace token has already been\nmatched. Consuming it here lets us report a more precise error message if the\n`{` isn't found since we know it's in the context of a function declaration.\n\n## Function Objects\n\nWe've got some syntax parsed so usually we're ready to interpret, but first we\nneed to think about how to represent a Lox function in Java. We need to keep\ntrack of the parameters so that we can bind them to argument values when the\nfunction is called. And, of course, we need to keep the code for the body of the\nfunction so that we can execute it.\n\nThat's basically what the Stmt.Function class is. Could we just use that?\nAlmost, but not quite. We also need a class that implements LoxCallable so that\nwe can call it. We don't want the runtime phase of the interpreter to bleed into\nthe front end's syntax classes so we don't want Stmt.Function itself to\nimplement that. Instead, we wrap it in a new class.\n\n^code lox-function\n\nWe implement the `call()` of LoxCallable like so:\n\n^code function-call\n\nThis handful of lines of code is one of the most fundamental, powerful pieces of\nour interpreter. As we saw in [the chapter on statements and <span\nname=\"env\">state</span>][statements], managing name environments is a core part\nof a language implementation. 
Functions are deeply tied to that.\n\n[statements]: statements-and-state.html\n\n<aside name=\"env\">\n\nWe'll dig even deeper into environments in the [next chapter][].\n\n[next chapter]: resolving-and-binding.html\n\n</aside>\n\nParameters are core to functions, especially the fact that a function\n*encapsulates* its parameters -- no other code outside of the function can see\nthem. This means each function gets its own environment where it stores those\nvariables.\n\nFurther, this environment must be created dynamically. Each function *call* gets\nits own environment. Otherwise, recursion would break. If there are multiple\ncalls to the same function in play at the same time, each needs its *own*\nenvironment, even though they are all calls to the same function.\n\nFor example, here's a convoluted way to count to three:\n\n```lox\nfun count(n) {\n  if (n > 1) count(n - 1);\n  print n;\n}\n\ncount(3);\n```\n\nImagine we pause the interpreter right at the point where it's about to print 1\nin the innermost nested call. The outer calls to print 2 and 3 haven't printed\ntheir values yet, so there must be environments somewhere in memory that still\nstore the fact that `n` is bound to 3 in one context, 2 in another, and 1 in the\ninnermost, like:\n\n<img src=\"image/functions/recursion.png\" alt=\"A separate environment for each recursive call.\" />\n\nThat's why we create a new environment at each *call*, not at the function\n*declaration*. The `call()` method we saw earlier does that. At the beginning of\nthe call, it creates a new environment. Then it walks the parameter and argument\nlists in lockstep. 
For each pair, it creates a new variable with the parameter's\nname and binds it to the argument's value.\n\nSo, for a program like this:\n\n```lox\nfun add(a, b, c) {\n  print a + b + c;\n}\n\nadd(1, 2, 3);\n```\n\nAt the point of the call to `add()`, the interpreter creates something like\nthis:\n\n<img src=\"image/functions/binding.png\" alt=\"Binding arguments to their parameters.\" />\n\nThen `call()` tells the interpreter to execute the body of the function in this\nnew function-local environment. Up until now, the current environment was the\nenvironment where the function was being called. Now, we teleport from there\ninside the new parameter space we've created for the function.\n\nThis is all that's required to pass data into the function. By using different\nenvironments when we execute the body, calls to the same function with the\nsame code can produce different results.\n\nOnce the body of the function has finished executing, `executeBlock()` discards\nthat function-local environment and restores the previous one that was active\nback at the callsite. Finally, `call()` returns `null`, which returns `nil` to\nthe caller. (We'll add return values later.)\n\nMechanically, the code is pretty simple. Walk a couple of lists. Bind some new\nvariables. Call a method. But this is where the crystalline *code* of the\nfunction declaration becomes a living, breathing *invocation*. This is one of my\nfavorite snippets in this entire book. Feel free to take a moment to meditate on\nit if you're so inclined.\n\nDone? OK. Note when we bind the parameters, we assume the parameter and argument\nlists have the same length. This is safe because `visitCallExpr()` checks the\narity before calling `call()`. It relies on the function reporting its arity to\ndo that.\n\n^code function-arity\n\nThat's most of our object representation. 
While we're in here, we may as well\nimplement `toString()`.\n\n^code function-to-string\n\nThis gives nicer output if a user decides to print a function value.\n\n```lox\nfun add(a, b) {\n  print a + b;\n}\n\nprint add; // \"<fn add>\".\n```\n\n### Interpreting function declarations\n\nWe'll come back and refine LoxFunction soon, but that's enough to get started.\nNow we can visit a function declaration.\n\n^code visit-function\n\nThis is similar to how we interpret other literal expressions. We take a\nfunction *syntax node* -- a compile-time representation of the function -- and\nconvert it to its runtime representation. Here, that's a LoxFunction that wraps\nthe syntax node.\n\nFunction declarations are different from other literal nodes in that the\ndeclaration *also* binds the resulting object to a new variable. So, after\ncreating the LoxFunction, we create a new binding in the current environment and\nstore a reference to it there.\n\nWith that, we can define and call our own functions all within Lox. Give it a\ntry:\n\n```lox\nfun sayHi(first, last) {\n  print \"Hi, \" + first + \" \" + last + \"!\";\n}\n\nsayHi(\"Dear\", \"Reader\");\n```\n\nI don't know about you, but that looks like an honest-to-God programming\nlanguage to me.\n\n## Return Statements\n\nWe can get data into functions by passing parameters, but we've got no way to\nget results back <span name=\"hotel\">*out*</span>. If Lox were an\nexpression-oriented language like Ruby or Scheme, the body would be an\nexpression whose value is implicitly the function's result. But in Lox, the body\nof a function is a list of statements which don't produce values, so we need\ndedicated syntax for emitting a result. In other words, `return` statements. 
I'm\nsure you can guess the grammar already.\n\n<aside name=\"hotel\">\n\nThe Hotel California of data.\n\n</aside>\n\n```ebnf\nstatement      → exprStmt\n               | forStmt\n               | ifStmt\n               | printStmt\n               | returnStmt\n               | whileStmt\n               | block ;\n\nreturnStmt     → \"return\" expression? \";\" ;\n```\n\nWe've got one more -- the final, in fact -- production under the venerable\n`statement` rule. A `return` statement is the `return` keyword followed by an\noptional expression and terminated with a semicolon.\n\nThe return value is optional to support exiting early from a function that\ndoesn't return a useful value. In statically typed languages, \"void\" functions\ndon't return a value and non-void ones do. Since Lox is dynamically typed, there\nare no true void functions. The compiler has no way of preventing you from\ntaking the result value of a call to a function that doesn't contain a `return`\nstatement.\n\n```lox\nfun procedure() {\n  print \"don't return anything\";\n}\n\nvar result = procedure();\nprint result; // ?\n```\n\nThis means every Lox function must return *something*, even if it contains no\n`return` statements at all. We use `nil` for this, which is why LoxFunction's\nimplementation of `call()` returns `null` at the end. In that same vein, if you\nomit the value in a `return` statement, we simply treat it as equivalent to:\n\n```lox\nreturn nil;\n```\n\nOver in our AST generator, we add a <span name=\"return-ast\">new node</span>.\n\n^code return-ast (1 before, 1 after)\n\n<aside name=\"return-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-return].\n\n[appendix-return]: appendix-ii.html#return-statement\n\n</aside>\n\nIt keeps the `return` keyword token so we can use its location for error\nreporting, and the value being returned, if any. 
We parse it like other\nstatements, first by recognizing the initial keyword.\n\n^code match-return (1 before, 1 after)\n\nThat branches out to:\n\n^code parse-return-statement\n\nAfter snagging the previously consumed `return` keyword, we look for a value\nexpression. Since many different tokens can potentially start an expression,\nit's hard to tell if a return value is *present*. Instead, we check if it's\n*absent*. Since a semicolon can't begin an expression, if the next token is\nthat, we know there must not be a value.\n\n### Returning from calls\n\nInterpreting a `return` statement is tricky. You can return from anywhere within\nthe body of a function, even deeply nested inside other statements. When the\nreturn is executed, the interpreter needs to jump all the way out of whatever\ncontext it's currently in and cause the function call to complete, like some\nkind of jacked up control flow construct.\n\nFor example, say we're running this program and we're about to execute the\n`return` statement:\n\n```lox\nfun count(n) {\n  while (n < 100) {\n    if (n == 3) return n; // <--\n    print n;\n    n = n + 1;\n  }\n}\n\ncount(1);\n```\n\nThe Java call stack currently looks roughly like this:\n\n```text\nInterpreter.visitReturnStmt()\nInterpreter.visitIfStmt()\nInterpreter.executeBlock()\nInterpreter.visitBlockStmt()\nInterpreter.visitWhileStmt()\nInterpreter.executeBlock()\nLoxFunction.call()\nInterpreter.visitCallExpr()\n```\n\nWe need to get from the top of the stack all the way back to `call()`. I don't\nknow about you, but to me that sounds like exceptions. When we execute a\n`return` statement, we'll use an exception to unwind the interpreter past the\nvisit methods of all of the containing statements back to the code that began\nexecuting the body.\n\nThe visit method for our new AST node looks like this:\n\n^code visit-return\n\nIf we have a return value, we evaluate it, otherwise, we use `nil`. 
Then we take\nthat value and wrap it in a custom exception class and throw it.\n\n^code return-exception\n\nThis class wraps the return value with the accoutrements Java requires for a\nruntime exception class. The weird super constructor call with those `null` and\n`false` arguments disables some JVM machinery that we don't need. Since we're\nusing our exception class for <span name=\"exception\">control flow</span> and not\nactual error handling, we don't need overhead like stack traces.\n\n<aside name=\"exception\">\n\nFor the record, I'm not generally a fan of using exceptions for control flow.\nBut inside a heavily recursive tree-walk interpreter, it's the way to go. Since\nour own syntax tree evaluation is so heavily tied to the Java call stack, we're\npressed to do some heavyweight call stack manipulation occasionally, and\nexceptions are a handy tool for that.\n\n</aside>\n\nWe want this to unwind all the way to where the function call began, the\n`call()` method in LoxFunction.\n\n^code catch-return (3 before, 1 after)\n\nWe wrap the call to `executeBlock()` in a try-catch block. When it catches a\nreturn exception, it pulls out the value and makes that the return value from\n`call()`. If it never catches one of these exceptions, it means the function\nreached the end of its body without hitting a `return` statement. In that case,\nit implicitly returns `nil`.\n\nLet's try it out. 
We finally have enough power to support this classic\nexample -- a recursive function to calculate Fibonacci numbers:\n\n<span name=\"slow\"></span>\n\n```lox\nfun fib(n) {\n  if (n <= 1) return n;\n  return fib(n - 2) + fib(n - 1);\n}\n\nfor (var i = 0; i < 20; i = i + 1) {\n  print fib(i);\n}\n```\n\nThis tiny program exercises almost every language feature we have spent the past\nseveral chapters implementing -- expressions, arithmetic, branching, looping,\nvariables, functions, function calls, parameter binding, and returns.\n\n<aside name=\"slow\">\n\nYou might notice this is pretty slow. Obviously, recursion isn't the most\nefficient way to calculate Fibonacci numbers, but as a microbenchmark, it does\na good job of stress testing how fast our interpreter implements function calls.\n\nAs you can see, the answer is \"not very fast\". That's OK. Our C interpreter will\nbe faster.\n\n</aside>\n\n## Local Functions and Closures\n\nOur functions are pretty full featured, but there is one hole to patch. In fact,\nit's a big enough gap that we'll spend most of the [next chapter][] sealing it\nup, but we can get started here.\n\nLoxFunction's implementation of `call()` creates a new environment where it\nbinds the function's parameters. When I showed you that code, I glossed over one\nimportant point: What is the *parent* of that environment?\n\nRight now, it is always `globals`, the top-level global environment. That way,\nif an identifier isn't defined inside the function body itself, the interpreter\ncan look outside the function in the global scope to find it. In the Fibonacci\nexample, that's how the interpreter is able to look up the recursive call to\n`fib` inside the function's own body -- `fib` is a global variable.\n\nBut recall that in Lox, function declarations are allowed *anywhere* a name can\nbe bound. That includes the top level of a Lox script, but also the inside of\nblocks or other functions. 
Lox supports **local functions** that are defined\ninside another function, or nested inside a block.\n\nConsider this classic example:\n\n```lox\nfun makeCounter() {\n  var i = 0;\n  fun count() {\n    i = i + 1;\n    print i;\n  }\n\n  return count;\n}\n\nvar counter = makeCounter();\ncounter(); // \"1\".\ncounter(); // \"2\".\n```\n\nHere, `count()` uses `i`, which is declared outside of itself in the containing\nfunction `makeCounter()`. `makeCounter()` returns a reference to the `count()`\nfunction and then its own body finishes executing completely.\n\nMeanwhile, the top-level code invokes the returned `count()` function. That\nexecutes the body of `count()`, which assigns to and reads `i`, even though the\nfunction where `i` was defined has already exited.\n\nIf you've never encountered a language with nested functions before, this might\nseem crazy, but users do expect it to work. Alas, if you run it now, you get an\nundefined variable error in the call to `counter()` when the body of `count()`\ntries to look up `i`. That's because the environment chain in effect looks like\nthis:\n\n<img src=\"image/functions/global.png\" alt=\"The environment chain from count()'s body to the global scope.\" />\n\nWhen we call `count()` (through the reference to it stored in `counter`), we\ncreate a new empty environment for the function body. The parent of that is the\nglobal environment. We lost the environment for `makeCounter()` where `i` is\nbound.\n\nLet's go back in time a bit. Here's what the environment chain looked like right\nwhen we declared `count()` inside the body of `makeCounter()`:\n\n<img src=\"image/functions/body.png\" alt=\"The environment chain inside the body of makeCounter().\" />\n\nSo at the point where the function is declared, we can see `i`. But when we\nreturn from `makeCounter()` and exit its body, the interpreter discards that\nenvironment. 
Since the interpreter doesn't keep the environment surrounding\n`count()` around, it's up to the function object itself to hang on to it.\n\nThis data structure is called a <span name=\"closure\">**closure**</span> because\nit \"closes over\" and holds on to the surrounding variables where the function is\ndeclared. Closures have been around since the early Lisp days, and language\nhackers have come up with all manner of ways to implement them. For jlox, we'll\ndo the simplest thing that works. In LoxFunction, we add a field to store an\nenvironment.\n\n<aside name=\"closure\">\n\n\"Closure\" is yet another term coined by Peter J. Landin. I assume before he came\nalong that computer scientists communicated with each other using only primitive\ngrunts and pawing hand gestures.\n\n</aside>\n\n^code closure-field (1 before, 1 after)\n\nWe initialize that in the constructor.\n\n^code closure-constructor (1 after)\n\nWhen we create a LoxFunction, we capture the current environment.\n\n^code visit-closure (1 before, 1 after)\n\nThis is the environment that is active when the function is *declared*, not when\nit's *called*, which is what we want. It represents the lexical scope\nsurrounding the function declaration. Finally, when we call the function, we use\nthat environment as the call's parent instead of going straight to `globals`.\n\n^code call-closure (1 before, 1 after)\n\nThis creates an environment chain that goes from the function's body out through\nthe environments where the function is declared, all the way out to the global\nscope. The runtime environment chain matches the textual nesting of the source\ncode like we want. The end result when we call that function looks like this:\n\n<img src=\"image/functions/closure.png\" alt=\"The environment chain with the closure.\" />\n\nNow, as you can see, the interpreter can still find `i` when it needs to because\nit's in the middle of the environment chain. Try running that `makeCounter()`\nexample now. 
It works!\n\nFunctions let us abstract over, reuse, and compose code. Lox is much more\npowerful than the rudimentary arithmetic calculator it used to be. Alas, in our\nrush to cram closures in, we have let a tiny bit of dynamic scoping leak into\nthe interpreter. In the [next chapter][], we will explore deeper into lexical\nscope and close that hole.\n\n[next chapter]: resolving-and-binding.html\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  Our interpreter carefully checks that the number of arguments passed to a\n    function matches the number of parameters it expects. Since this check is\n    done at runtime on every call, it has a performance cost. Smalltalk\n    implementations don't have that problem. Why not?\n\n1.  Lox's function declaration syntax performs two independent operations. It\n    creates a function and also binds it to a name. This improves usability for\n    the common case where you do want to associate a name with the function.\n    But in functional-styled code, you often want to create a function to\n    immediately pass it to some other function or return it. In that case, it\n    doesn't need a name.\n\n    Languages that encourage a functional style usually support **anonymous\n    functions** or **lambdas** -- an expression syntax that creates a function\n    without binding it to a name. Add anonymous function syntax to Lox so that\n    this works:\n\n    ```lox\n    fun thrice(fn) {\n      for (var i = 1; i <= 3; i = i + 1) {\n        fn(i);\n      }\n    }\n\n    thrice(fun (a) {\n      print a;\n    });\n    // \"1\".\n    // \"2\".\n    // \"3\".\n    ```\n\n    How do you handle the tricky case of an anonymous function expression\n    occurring in an expression statement:\n\n    ```lox\n    fun () {};\n    ```\n\n1.  
Is this program valid?\n\n    ```lox\n    fun scope(a) {\n      var a = \"local\";\n    }\n    ```\n\n    In other words, are a function's parameters in the *same* scope as its local\n    variables, or in an outer scope? What does Lox do? What about other\n    languages you are familiar with? What do you think a language *should* do?\n\n</div>\n"
  },
  {
    "path": "book/garbage-collection.md",
    "content": "> I wanna, I wanna,<br />\n> I wanna, I wanna,<br />\n> I wanna be trash.<br />\n>\n> <cite>The Whip, &ldquo;Trash&rdquo;</cite>\n\nWe say Lox is a \"high-level\" language because it frees programmers from worrying\nabout details irrelevant to the problem they're solving. The user becomes an\nexecutive, giving the machine abstract goals and letting the lowly computer\nfigure out how to get there.\n\nDynamic memory allocation is a perfect candidate for automation. It's necessary\nfor a working program, tedious to do by hand, and yet still error-prone. The\ninevitable mistakes can be catastrophic, leading to crashes, memory corruption,\nor security violations. It's the kind of risky-yet-boring work that machines\nexcel at over humans.\n\nThis is why Lox is a **managed language**, which means that the language\nimplementation manages memory allocation and freeing on the user's behalf. When\na user performs an operation that requires some dynamic memory, the VM\nautomatically allocates it. The programmer never worries about deallocating\nanything. The machine ensures any memory the program is using sticks around as\nlong as needed.\n\nLox provides the illusion that the computer has an infinite amount of memory.\nUsers can allocate and allocate and allocate and never once think about where\nall these bytes are coming from. Of course, computers do not yet *have* infinite\nmemory. So the way managed languages maintain this illusion is by going behind\nthe programmer's back and reclaiming memory that the program no longer needs.\nThe component that does this is called a **garbage <span\nname=\"recycle\">collector</span>**.\n\n<aside name=\"recycle\">\n\nRecycling would really be a better metaphor for this. The GC doesn't *throw\naway* the memory, it reclaims it to be reused for new data. 
But managed\nlanguages are older than Earth Day, so the inventors went with the analogy they\nknew.\n\n<img src=\"image/garbage-collection/recycle.png\" class=\"above\" alt=\"A recycle bin full of bits.\" />\n\n</aside>\n\n## Reachability\n\nThis raises a surprisingly difficult question: how does a VM tell what memory is\n*not* needed? Memory is only needed if it is read in the future, but short of\nhaving a time machine, how can an implementation tell what code the program\n*will* execute and which data it *will* use? Spoiler alert: VMs cannot travel\ninto the future. Instead, the language makes a <span\nname=\"conservative\">conservative</span> approximation: it considers a piece of\nmemory to still be in use if it *could possibly* be read in the future.\n\n<aside name=\"conservative\">\n\nI'm using \"conservative\" in the general sense. There is such a thing as a\n\"conservative garbage collector\" which means something more specific. All\ngarbage collectors are \"conservative\" in that they keep memory alive if it\n*could* be accessed, instead of having a Magic 8-Ball that lets them more\nprecisely know what data *will* be accessed.\n\nA **conservative GC** is a special kind of collector that considers any piece of\nmemory to be a pointer if the value in there looks like it could be an address.\nThis is in contrast to a **precise GC** -- which is what we'll implement -- that\nknows exactly which words in memory are pointers and which store other kinds of\nvalues like numbers or strings.\n\n</aside>\n\nThat sounds *too* conservative. Couldn't *any* bit of memory potentially be\nread? Actually, no, at least not in a memory-safe language like Lox. Here's an\nexample:\n\n```lox\nvar a = \"first value\";\na = \"updated\";\n// GC here.\nprint a;\n```\n\nSay we run the GC after the assignment has completed on the second line. The\nstring \"first value\" is still sitting in memory, but there is no way for the\nuser's program to ever get to it. 
Once `a` got reassigned, the program lost any\nreference to that string. We can safely free it. A value is **reachable** if\nthere is some way for a user program to reference it. Otherwise, like the string\n\"first value\" here, it is **unreachable**.\n\nMany values can be directly accessed by the VM. Take a look at:\n\n```lox\nvar global = \"string\";\n{\n  var local = \"another\";\n  print global + local;\n}\n```\n\nPause the program right after the two strings have been concatenated but before\nthe `print` statement has executed. The VM can reach `\"string\"` by looking\nthrough the global variable table and finding the entry for `global`. It can\nfind `\"another\"` by walking the value stack and hitting the slot for the local\nvariable `local`. It can even find the concatenated string `\"stringanother\"`\nsince that temporary value is also sitting on the VM's stack at the point when\nwe paused our program.\n\nAll of these values are called **roots**. A root is any object that the VM can\nreach directly without going through a reference in some other object. Most\nroots are global variables or on the stack, but as we'll see, there are a couple\nof other places the VM stores references to objects that it can find.\n\nOther values can be found by going through a reference inside another value.\n<span name=\"class\">Fields</span> on instances of classes are the most obvious\ncase, but we don't have those yet. Even without those, our VM still has indirect\nreferences. Consider:\n\n<aside name=\"class\">\n\nWe'll get there [soon][classes], though!\n\n[classes]: classes-and-instances.html\n\n</aside>\n\n```lox\nfun makeClosure() {\n  var a = \"data\";\n\n  fun f() { print a; }\n  return f;\n}\n\n{\n  var closure = makeClosure();\n  // GC here.\n  closure();\n}\n```\n\nSay we pause the program on the marked line and run the garbage collector. When\nthe collector is done and the program resumes, it will call the closure, which\nwill in turn print `\"data\"`. 
So the collector needs to *not* free that string.\nBut here's what the stack looks like when we pause the program:\n\n<img src=\"image/garbage-collection/stack.png\" alt=\"The stack, containing only the script and closure.\" />\n\nThe `\"data\"` string is nowhere on it. It has already been hoisted off the stack\nand moved into the closed upvalue that the closure uses. The closure itself is\non the stack. But to get to the string, we need to trace through the closure and\nits upvalue array. Since it *is* possible for the user's program to do that, all\nof these indirectly accessible objects are also considered reachable.\n\n<img src=\"image/garbage-collection/reachable.png\" class=\"wide\" alt=\"All of the referenced objects from the closure, and the path to the 'data' string from the stack.\" />\n\nThis gives us an inductive definition of reachability:\n\n*   All roots are reachable.\n\n*   Any object referred to from a reachable object is itself reachable.\n\nThese are the values that are still \"live\" and need to stay in memory. Any value\nthat *doesn't* meet this definition is fair game for the collector to reap.\nThat recursive pair of rules hints at a recursive algorithm we can use to free\nup unneeded memory:\n\n1.  Starting with the roots, traverse through object references to find the\n    full set of reachable objects.\n\n2.  Free all objects *not* in that set.\n\nMany <span name=\"handbook\">different</span> garbage collection algorithms are in\nuse today, but they all roughly follow that same structure. Some may interleave\nthe steps or mix them, but the two fundamental operations are there. They mostly\ndiffer in *how* they perform each step.\n\n<aside name=\"handbook\">\n\nIf you want to explore other GC algorithms,\n[*The Garbage Collection Handbook*][gc book] (Jones, et al.) is the canonical\nreference. For a large book on such a deep, narrow topic, it is quite enjoyable\nto read. 
Or perhaps I have a strange idea of fun.\n\n[gc book]: http://gchandbook.org/\n\n</aside>\n\n## Mark-Sweep Garbage Collection\n\nThe first managed language was Lisp, the second \"high-level\" language to be\ninvented, right after Fortran. John McCarthy considered using manual memory\nmanagement or reference counting, but <span\nname=\"procrastination\">eventually</span> settled on (and coined) garbage\ncollection -- once the program was out of memory, it would go back and find\nunused storage it could reclaim.\n\n<aside name=\"procrastination\">\n\nIn John McCarthy's \"History of Lisp\", he notes: \"Once we decided on garbage\ncollection, its actual implementation could be postponed, because only toy\nexamples were being done.\" Our choice to procrastinate adding the GC to clox\nfollows in the footsteps of giants.\n\n</aside>\n\nHe designed the very first, simplest garbage collection algorithm, called\n**mark-and-sweep** or just **mark-sweep**. Its description fits in three short\nparagraphs in the initial paper on Lisp. Despite its age and simplicity, the\nsame fundamental algorithm underlies many modern memory managers. Some corners\nof CS seem to be timeless.\n\nAs the name implies, mark-sweep works in two phases:\n\n*   **Marking:** We start with the roots and traverse or <span\n    name=\"trace\">*trace*</span> through all of the objects those roots refer to.\n    This is a classic graph traversal of all of the reachable objects. Each time\n    we visit an object, we *mark* it in some way. (Implementations differ in how\n    they record the mark.)\n\n*   **Sweeping:** Once the mark phase completes, every reachable object\n    in the heap has been marked. That means any unmarked object is unreachable and\n    ripe for reclamation. 
We go through all the unmarked objects and free each\n    one.\n\nIt looks something like this:\n\n<img src=\"image/garbage-collection/mark-sweep.png\" class=\"wide\" alt=\"Starting from a graph of objects, first the reachable ones are marked, the remaining are swept, and then only the reachable remain.\" />\n\n<aside name=\"trace\">\n\nA **tracing garbage collector** is any algorithm that traces through the graph\nof object references. This is in contrast with reference counting, which has a\ndifferent strategy for tracking the reachable objects.\n\n</aside>\n\nThat's what we're gonna implement. Whenever we decide it's time to reclaim some\nbytes, we'll trace everything and mark all the reachable objects, free what\ndidn't get marked, and then resume the user's program.\n\n### Collecting garbage\n\nThis entire chapter is about implementing this one <span\nname=\"one\">function</span>:\n\n<aside name=\"one\">\n\nOf course, we'll end up adding a bunch of helper functions too.\n\n</aside>\n\n^code collect-garbage-h (1 before, 1 after)\n\nWe'll work our way up to a full implementation starting with this empty shell:\n\n^code collect-garbage\n\nThe first question you might ask is, When does this function get called? It\nturns out that's a subtle question that we'll spend some time on later in the\nchapter. For now we'll sidestep the issue and build ourselves a handy diagnostic\ntool in the process.\n\n^code define-stress-gc (1 before, 2 after)\n\nWe'll add an optional \"stress test\" mode for the garbage collector. When this\nflag is defined, the GC runs as often as it possibly can. This is, obviously,\nhorrendous for performance. But it's great for flushing out memory management\nbugs that occur only when a GC is triggered at just the right moment. If *every*\nmoment triggers a GC, you're likely to find those bugs.\n\n^code call-collect (1 before, 1 after)\n\nWhenever we call `reallocate()` to acquire more memory, we force a collection to\nrun. 
The `if` check is there because `reallocate()` is also called to free or shrink\nan allocation. We don't want to trigger a GC for that -- in particular because the\nGC itself will call `reallocate()` to free memory.\n\nCollecting right before <span name=\"demand\">allocation</span> is the classic way\nto wire a GC into a VM. You're already calling into the memory manager, so it's\nan easy place to hook in the code. Also, allocation is the only time when you\nreally *need* some freed-up memory so that you can reuse it. If you *don't* use\nallocation to trigger a GC, you have to make sure every possible place in code\nwhere you can loop and allocate memory also has a way to trigger the collector.\nOtherwise, the VM can get into a starved state where it needs more memory but\nnever collects any.\n\n<aside name=\"demand\">\n\nMore sophisticated collectors might run on a separate thread or be interleaved\nperiodically during program execution -- often at function call boundaries or\nwhen a backward jump occurs.\n\n</aside>\n\n### Debug logging\n\nWhile we're on the subject of diagnostics, let's put some more in. A real\nchallenge I've found with garbage collectors is that they are opaque. We've been\nrunning lots of Lox programs just fine without any GC *at all* so far. Once we\nadd one, how do we tell if it's doing anything useful? Can we tell only if we\nwrite programs that plow through acres of memory? How do we debug that?\n\nAn easy way to shine a light into the GC's inner workings is with some logging.\n\n^code define-log-gc (1 before, 2 after)\n\nWhen this is enabled, clox prints information to the console when it does\nsomething with dynamic memory.\n\nWe need a couple of includes.\n\n^code debug-log-includes (1 before, 2 after)\n\nWe don't have a collector yet, but we can start putting in some of the logging\nnow. 
We'll want to know when a collection run starts.\n\n^code log-before-collect (1 before, 1 after)\n\nEventually we will log some other operations during the collection, so we'll\nalso want to know when the show's over.\n\n^code log-after-collect (2 before, 1 after)\n\nWe don't have any code for the collector yet, but we do have functions for\nallocating and freeing, so we can instrument those now.\n\n^code debug-log-allocate (1 before, 1 after)\n\nAnd at the end of an object's lifespan:\n\n^code log-free-object (1 before, 1 after)\n\nWith these two flags, we should be able to see that we're making progress as we\nwork through the rest of the chapter.\n\n## Marking the Roots\n\nObjects are scattered across the heap like stars in the inky night sky. A\nreference from one object to another forms a connection, and these\nconstellations are the graph that the mark phase traverses. Marking begins at\nthe roots.\n\n^code call-mark-roots (3 before, 2 after)\n\nMost roots are local variables or temporaries sitting right in the VM's stack,\nso we start by walking that.\n\n^code mark-roots\n\nTo mark a Lox value, we use this new function:\n\n^code mark-value-h (1 before, 1 after)\n\nIts implementation is here:\n\n^code mark-value\n\nSome Lox values -- numbers, Booleans, and `nil` -- are stored directly inline in\nValue and require no heap allocation. The garbage collector doesn't need to\nworry about them at all, so the first thing we do is ensure that the value is an\nactual heap object. If so, the real work happens in this function:\n\n^code mark-object-h (1 before, 1 after)\n\nWhich is defined here:\n\n^code mark-object\n\nThe `NULL` check is unnecessary when called from `markValue()`. A Lox Value that\nis some kind of Obj type will always have a valid pointer. But later we will\ncall this function directly from other code, and in some of those places, the\nobject being pointed to is optional.\n\nAssuming we do have a valid object, we mark it by setting a flag. 
That new field\nlives in the Obj header struct all objects share.\n\n^code is-marked-field (1 before, 1 after)\n\nEvery new object begins life unmarked because we haven't yet determined if it is\nreachable or not.\n\n^code init-is-marked (1 before, 2 after)\n\nBefore we go any farther, let's add some logging to `markObject()`.\n\n^code log-mark-object (2 before, 1 after)\n\nThis way we can see what the mark phase is doing. Marking the stack takes care\nof local variables and temporaries. The other main source of roots is the\nglobal variables.\n\n^code mark-globals (2 before, 1 after)\n\nThose live in a hash table owned by the VM, so we'll declare another helper\nfunction for marking all of the objects in a table.\n\n^code mark-table-h (2 before, 2 after)\n\nWe implement that in the \"table\" module here:\n\n^code mark-table\n\nPretty straightforward. We walk the entry array. For each one, we mark its\nvalue. We also mark the key strings for each entry since the GC manages those\nstrings too.\n\n### Less obvious roots\n\nThose cover the roots that we typically think of -- the values that are\nobviously reachable because they're stored in variables the user's program can\nsee. But the VM has a few of its own hidey-holes where it squirrels away\nreferences to values that it directly accesses.\n\nMost function call state lives in the value stack, but the VM maintains a\nseparate stack of CallFrames. Each CallFrame contains a pointer to the closure\nbeing called. The VM uses those pointers to access constants and upvalues, so\nthose closures need to be kept around too.\n\n^code mark-closures (1 before, 2 after)\n\nSpeaking of upvalues, the open upvalue list is another set of values that the\nVM can directly reach.\n\n^code mark-open-upvalues (3 before, 2 after)\n\nRemember also that a collection can begin during *any* allocation. Those\nallocations don't just happen while the user's program is running. 
The compiler\nitself periodically grabs memory from the heap for literals and the constant\ntable. If the GC runs while we're in the middle of compiling, then any values\nthe compiler directly accesses need to be treated as roots too.\n\nTo keep the compiler module cleanly separated from the rest of the VM, we'll do\nthat in a separate function.\n\n^code call-mark-compiler-roots (1 before, 1 after)\n\nIt's declared here:\n\n^code mark-compiler-roots-h (1 before, 2 after)\n\nWhich means the \"memory\" module needs an include.\n\n^code memory-include-compiler (2 before, 1 after)\n\nAnd the definition is over in the \"compiler\" module.\n\n^code mark-compiler-roots\n\nFortunately, the compiler doesn't have too many values that it hangs on to. The\nonly object it uses is the ObjFunction it is compiling into. Since function\ndeclarations can nest, the compiler has a linked list of those and we walk the\nwhole list.\n\nSince the \"compiler\" module is calling `markObject()`, it also needs an include.\n\n^code compiler-include-memory (1 before, 1 after)\n\nThose are all the roots. After running this, every object that the VM -- runtime\nand compiler -- can get to *without* going through some other object has its\nmark bit set.\n\n## Tracing Object References\n\nThe next step in the marking process is tracing through the graph of references\nbetween objects to find the indirectly reachable values. We don't have instances\nwith fields yet, so there aren't many objects that contain references, but we do\nhave <span name=\"some\">some</span>. In particular, ObjClosure has the list of\nObjUpvalues it closes over as well as a reference to the raw ObjFunction that it\nwraps. ObjFunction, in turn, has a constant table containing references to all\nof the literals created in the function's body. 
This is enough to build a fairly\ncomplex web of objects for our collector to crawl through.\n\n<aside name=\"some\">\n\nI slotted this chapter into the book right here specifically *because* we now\nhave closures, which give us interesting objects for the garbage collector to\nprocess.\n\n</aside>\n\nNow it's time to implement that traversal. We can go breadth-first, depth-first,\nor in some other order. Since we just need to find the *set* of all reachable\nobjects, the order we visit them <span name=\"dfs\">mostly</span> doesn't matter.\n\n<aside name=\"dfs\">\n\nI say \"mostly\" because some garbage collectors move objects in the order that\nthey are visited, so traversal order determines which objects end up adjacent in\nmemory. That impacts performance because the CPU uses locality to determine\nwhich memory to preload into the caches.\n\nEven when traversal order does matter, it's not clear which order is *best*.\nIt's very difficult to determine the order in which objects will be used in the\nfuture, so it's hard for the GC to know which order will help performance.\n\n</aside>\n\n### The tricolor abstraction\n\nAs the collector wanders through the graph of objects, we need to make sure it\ndoesn't lose track of where it is or get stuck going in circles. This is\nparticularly a concern for advanced implementations like incremental GCs that\ninterleave marking with running pieces of the user's program. The collector\nneeds to be able to pause and then pick up where it left off later.\n\nTo help us soft-brained humans reason about this complex process, VM hackers\ncame up with a metaphor called the <span name=\"color\"></span>**tricolor\nabstraction**. Each object has a conceptual \"color\" that tracks what state the\nobject is in, and what work is left to do.\n\n<aside name=\"color\">\n\nAdvanced garbage collection algorithms often add other colors to the\nabstraction. 
I've seen multiple shades of gray, and even purple in some designs.\nMy puce-chartreuse-fuchsia-malachite collector paper was, alas, not accepted for\npublication.\n\n</aside>\n\n*   **<img src=\"image/garbage-collection/white.png\" alt=\"A white circle.\"\n    class=\"dot\" /> White:** At the beginning of a garbage collection, every\n    object is white. This color means we have not reached or processed the\n    object at all.\n\n*   **<img src=\"image/garbage-collection/gray.png\" alt=\"A gray circle.\"\n    class=\"dot\" /> Gray:** During marking, when we first reach an object, we\n    darken it gray. This color means we know the object itself is reachable and\n    should not be collected. But we have not yet traced *through* it to see what\n    *other* objects it references. In graph algorithm terms, this is the\n    *worklist* -- the set of objects we know about but haven't processed yet.\n\n*   **<img src=\"image/garbage-collection/black.png\" alt=\"A black circle.\"\n    class=\"dot\" /> Black:** When\n    we take a gray object and mark all of the objects it references, we then\n    turn the gray object black. This color means the mark phase is done\n    processing that object.\n\nIn terms of that abstraction, the marking process now looks like this:\n\n1.  Start off with all objects white.\n\n2.  Find all the roots and mark them gray.\n\n3.  Repeat as long as there are still gray objects:\n\n    1.  Pick a gray object. Turn any white objects that the object mentions to\n        gray.\n\n    2.  Mark the original gray object black.\n\nI find it helps to visualize this. You have a web of objects with references\nbetween them. Initially, they are all little white dots. Off to the side are\nsome incoming edges from the VM that point to the roots. 
Those roots turn gray.\nThen the objects that each gray object references turn gray while that object\nitself turns black.\nThe full effect is a gray wavefront that passes through the graph, leaving a\nfield of reachable black objects behind it. Unreachable objects are not touched\nby the wavefront and stay white.\n\n<img src=\"image/garbage-collection/tricolor-trace.png\" class=\"wide\" alt=\"A gray wavefront working through a graph of nodes.\" />\n\nAt the <span name=\"invariant\">end</span>, you're left with a sea of reached,\nblack objects sprinkled with islands of white objects that can be swept up and\nfreed. Once the unreachable objects are freed, the remaining objects -- all\nblack -- are reset to white for the next garbage collection cycle.\n\n<aside name=\"invariant\">\n\nNote that at every step of this process no black node ever points to a white\nnode. This property is called the **tricolor invariant**. The traversal process\nmaintains this invariant to ensure that no reachable object is ever collected.\n\n</aside>\n\n### A worklist for gray objects\n\nIn our implementation we have already marked the roots. They're all gray. The\nnext step is to start picking them and traversing their references. But we don't\nhave any easy way to find them. We set a field on the object, but that's it. We\ndon't want to have to traverse the entire object list looking for objects with\nthat field set.\n\nInstead, we'll create a separate worklist to keep track of all of the gray\nobjects. When an object turns gray, in addition to setting the mark field we'll\nalso add it to the worklist.\n\n^code add-to-gray-stack (1 before, 1 after)\n\nWe could use any kind of data structure that lets us put items in and take them\nout easily. I picked a stack because that's the simplest to implement with a\ndynamic array in C. It works mostly like other dynamic arrays we've built in\nclox, *except* that it calls the *system* `realloc()` function and not our\nown `reallocate()` wrapper. 
The memory for the gray stack itself is *not*\nmanaged by the garbage collector. We don't want growing the gray stack during a\nGC to cause the GC to recursively start a new GC. That could tear a hole in the\nspace-time continuum.\n\nWe'll manage its memory ourselves, explicitly. The VM owns the gray stack.\n\n^code vm-gray-stack (1 before, 1 after)\n\nIt starts out empty.\n\n^code init-gray-stack (1 before, 2 after)\n\nAnd we need to free it when the VM shuts down.\n\n^code free-gray-stack (2 before, 1 after)\n\n<span name=\"robust\">We</span> take full responsibility for this array. That\nincludes allocation failure. If we can't create or grow the gray stack, then we\ncan't finish the garbage collection. This is bad news for the VM, but\nfortunately rare since the gray stack tends to be pretty small. It would be nice\nto do something more graceful, but to keep the code in this book simple, we just\nabort.\n\n<aside name=\"robust\">\n\nTo be more robust, we can allocate a \"rainy day fund\" block of memory when we\nstart the VM. If the gray stack allocation fails, we free the rainy day block\nand try again. That may give us enough wiggle room on the heap to create the\ngray stack, finish the GC, and free up more memory.\n\n</aside>\n\n^code exit-gray-stack (2 before, 1 after)\n\n### Processing gray objects\n\nOK, now when we're done marking the roots, we have both set a bunch of fields\nand filled our work list with objects to chew through. It's time for the next\nphase.\n\n^code call-trace-references (1 before, 2 after)\n\nHere's the implementation:\n\n^code trace-references\n\nIt's as close to that textual algorithm as you can get. Until the stack empties,\nwe keep pulling out gray objects, traversing their references, and then marking\nthem black. Traversing an object's references may turn up new white objects that\nget marked gray and added to the stack. 
So this function swings back and forth\nbetween turning white objects gray and gray objects black, gradually advancing\nthe entire wavefront forward.\n\nHere's where we traverse a single object's references:\n\n^code blacken-object\n\nEach object <span name=\"leaf\">kind</span> has different fields that might\nreference other objects, so we need a specific blob of code for each type. We\nstart with the easy ones -- strings and native function objects contain no\noutgoing references so there is nothing to traverse.\n\n<aside name=\"leaf\">\n\nAn easy optimization we could do in `markObject()` is to skip adding strings and\nnative functions to the gray stack at all since we know they don't need to be\nprocessed. Instead, they could darken from white straight to black.\n\n</aside>\n\nNote that we don't set any state in the traversed object itself. There is no\ndirect encoding of \"black\" in the object's state. A black object is any object\nwhose `isMarked` field is <span name=\"field\">set</span> and that is no longer in\nthe gray stack.\n\n<aside name=\"field\">\n\nYou may rightly wonder why we have the `isMarked` field at all. All in good\ntime, friend.\n\n</aside>\n\nNow let's start adding in the other object types. The simplest is upvalues.\n\n^code blacken-upvalue (2 before, 1 after)\n\nWhen an upvalue is closed, it contains a reference to the closed-over value.\nSince the value is no longer on the stack, we need to make sure we trace the\nreference to it from the upvalue.\n\nNext are functions.\n\n^code blacken-function (1 before, 1 after)\n\nEach function has a reference to an ObjString containing the function's name.\nMore importantly, the function has a constant table packed full of references to\nother objects. 
We trace all of those using this helper:\n\n^code mark-array\n\nThe last object type we have now -- we'll add more in later chapters -- is\nclosures.\n\n^code blacken-closure (1 before, 1 after)\n\nEach closure has a reference to the bare function it wraps, as well as an array\nof pointers to the upvalues it captures. We trace all of those.\n\nThat's the basic mechanism for processing a gray object, but there are two loose\nends to tie up. First, some logging.\n\n^code log-blacken-object (1 before, 1 after)\n\nThis way, we can watch the tracing percolate through the object graph. Speaking\nof which, note that I said *graph*. References between objects are directed, but\nthat doesn't mean they're *acyclic!* It's entirely possible to have cycles of\nobjects. When that happens, we need to ensure our collector doesn't get stuck in\nan infinite loop as it continually re-adds the same series of objects to the\ngray stack.\n\nThe fix is easy.\n\n^code check-is-marked (1 before, 1 after)\n\nIf the object is already marked, we don't mark it again and thus don't add it to\nthe gray stack. This ensures that an already-gray object is not redundantly\nadded and that a black object is not inadvertently turned back to gray. In other\nwords, it keeps the wavefront moving forward through only the white objects.\n\n## Sweeping Unused Objects\n\nWhen the loop in `traceReferences()` exits, we have processed all the objects we\ncould get our hands on. The gray stack is empty, and every object in the heap is\neither black or white. The black objects are reachable, and we want to hang on to\nthem. Anything still white never got touched by the trace and is thus garbage.\nAll that's left is to reclaim them.\n\n^code call-sweep (1 before, 2 after)\n\nAll of the logic lives in one function.\n\n^code sweep\n\nI know that's kind of a lot of code and pointer shenanigans, but there isn't\nmuch to it once you work through it. 
The outer `while` loop walks the linked\nlist of every object in the heap, checking their mark bits. If an object is\nmarked (black), we leave it alone and continue past it. If it is unmarked\n(white), we unlink it from the list and free it using the `freeObject()`\nfunction we already wrote.\n\n<img src=\"image/garbage-collection/unlink.png\" alt=\"An unmarked object being unlinked from the list of objects.\" />\n\nMost of the other code in here deals with the fact that removing a node from a\nsingly linked list is cumbersome. We have to continuously remember the previous\nnode so we can unlink its next pointer, and we have to handle the edge case\nwhere we are freeing the first node. But, otherwise, it's pretty simple --\ndelete every node in a linked list that doesn't have a bit set in it.\n\nThere's one little addition:\n\n^code unmark (1 before, 1 after)\n\nAfter `sweep()` completes, the only remaining objects are the live black ones\nwith their mark bits set. That's correct, but when the *next* collection cycle\nstarts, we need every object to be white. So whenever we reach a black object,\nwe go ahead and clear the bit now in anticipation of the next run.\n\n### Weak references and the string pool\n\nWe are almost done collecting. There is one remaining corner of the VM that has\nsome unusual requirements around memory. Recall that when we added strings to\nclox we made the VM intern them all. That means the VM has a hash table\ncontaining a pointer to every single string in the heap. The VM uses this to\nde-duplicate strings.\n\nDuring the mark phase, we deliberately did *not* treat the VM's string table as\na source of roots. If we had, no <span name=\"intern\">string</span> would *ever*\nbe collected. The string table would grow and grow and never yield a single byte\nof memory back to the operating system. That would be bad.\n\n<aside name=\"intern\">\n\nThis can be a real problem. Java does not intern *all* strings, but it does\nintern string *literals*. 
It also provides an API to add strings to the string\ntable. For many years, the capacity of that table was fixed, and strings added\nto it could never be removed. If users weren't careful about their use of\n`String.intern()`, they could run out of memory and crash.\n\nRuby had a similar problem for years where symbols -- interned string-like\nvalues -- were not garbage collected. Both eventually enabled the GC to collect\nthese strings.\n\n</aside>\n\nAt the same time, if we *do* let the GC free strings, then the VM's string table\nwill be left with dangling pointers to freed memory. That would be even worse.\n\nThe string table is special and we need special support for it. In particular,\nit needs a special kind of reference. The table should be able to refer to a\nstring, but that link should not be considered a root when determining\nreachability. That implies that the referenced object can be freed. When that\nhappens, the dangling reference must be fixed too, sort of like a magic,\nself-clearing pointer. This particular set of semantics comes up frequently\nenough that it has a name: a [**weak reference**][weak].\n\n[weak]: https://en.wikipedia.org/wiki/Weak_reference\n\nWe have already implicitly implemented half of the string table's unique\nbehavior by virtue of the fact that we *don't* traverse it during marking. That\nmeans it doesn't force strings to be reachable. The remaining piece is clearing\nout any dangling pointers for strings that are freed.\n\nTo remove references to unreachable strings, we need to know which strings *are*\nunreachable. We don't know that until after the mark phase has completed. But we\ncan't wait until after the sweep phase is done because by then the objects --\nand their mark bits -- are no longer around to check. 
So the right time is\nexactly between the marking and sweeping phases.\n\n^code sweep-strings (1 before, 1 after)\n\nThe logic for removing the about-to-be-deleted strings exists in a new function\nin the \"table\" module.\n\n^code table-remove-white-h (2 before, 2 after)\n\nThe implementation is here:\n\n^code table-remove-white\n\nWe walk every entry in the table. The string intern table uses only the key of\neach entry -- it's basically a hash *set* not a hash *map*. If the key string\nobject's mark bit is not set, then it is a white object that is moments from\nbeing swept away. We delete it from the hash table first and thus ensure we\nwon't see any dangling pointers.\n\n## When to Collect\n\nWe have a fully functioning mark-sweep garbage collector now. When the stress\ntesting flag is enabled, it gets called all the time, and with the logging\nenabled too, we can watch it do its thing and see that it is indeed reclaiming\nmemory. But, when the stress testing flag is off, it never runs at all. It's\ntime to decide when the collector should be invoked during normal program\nexecution.\n\nAs far as I can tell, this question is poorly answered by the literature. When\ngarbage collectors were first invented, computers had a tiny, fixed amount of\nmemory. Many of the early GC papers assumed that you set aside a few thousand\nwords of memory -- in other words, most of it -- and invoked the collector\nwhenever you ran out. Simple.\n\nModern machines have gigs of physical RAM, hidden behind the operating system's\neven larger virtual memory abstraction, which is shared among a slew of other\nprograms all fighting for their chunk of memory. The operating system will let\nyour program request as much as it wants and then page in and out from the disc\nwhen physical memory gets full. 
You never really \"run out\" of memory, you just\nget slower and slower.\n\n### Latency and throughput\n\nIt no longer makes sense to wait until you \"have to\" run the GC, so we need\na more subtle timing strategy. To reason about this more precisely, it's time to\nintroduce two fundamental numbers used when measuring a memory manager's\nperformance: *throughput* and *latency*.\n\nEvery managed language pays a performance price compared to explicit,\nuser-authored deallocation. The time spent actually freeing memory is the same,\nbut the GC spends cycles figuring out *which* memory to free. That is time *not*\nspent running the user's code and doing useful work. In our implementation,\nthat's the entirety of the mark phase. The goal of a sophisticated garbage\ncollector is to minimize that overhead.\n\nThere are two key metrics we can use to understand that cost better:\n\n*   **Throughput** is the total fraction of time spent running user code versus\n    doing garbage collection work. Say you run a clox program for ten seconds\n    and it spends a second of that inside `collectGarbage()`. That means the\n    throughput is 90% -- it spent 90% of the time running the program and 10%\n    on GC overhead.\n\n    Throughput is the most fundamental measure because it tracks the total cost\n    of collection overhead. All else being equal, you want to maximize\n    throughput. Up until this chapter, clox had no GC at all and thus <span\n    name=\"hundred\">100%</span> throughput. That's pretty hard to beat. Of\n    course, it came at the slight expense of potentially running out of memory\n    and crashing if the user's program ran long enough. You can look at the goal\n    of a GC as fixing that \"glitch\" while sacrificing as little throughput as\n    possible.\n\n<aside name=\"hundred\">\n\nWell, not *exactly* 100%. 
It did still put the allocated objects into a linked\nlist, so there was some tiny overhead for setting those pointers.\n\n</aside>\n\n*   **Latency** is the longest *continuous* chunk of time where the user's\n    program is completely paused while garbage collection happens. It's a\n    measure of how \"chunky\" the collector is. Latency is an entirely different\n    metric than throughput.\n\n    Consider two runs of a clox program that both take ten seconds. In the first\n    run, the GC kicks in once and spends a solid second in `collectGarbage()` in\n    one massive collection. In the second run, the GC gets invoked five times,\n    each for a fifth of a second. The *total* amount of time spent collecting is\n    still a second, so the throughput is 90% in both cases. But in the second\n    run, the latency is only 1/5th of a second, five times less than in the\n    first.\n\n<span name=\"latency\"></span>\n\n<img src=\"image/garbage-collection/latency-throughput.png\" alt=\"A bar representing execution time with slices for running user code and running the GC. The largest GC slice is latency. The size of all of the user code slices is throughput.\" />\n\n<aside name=\"latency\">\n\nThe bar represents the execution of a program, divided into time spent running\nuser code and time spent in the GC. The size of the largest single slice of time\nrunning the GC is the latency. The size of all of the user code slices added up\nis the throughput.\n\n</aside>\n\nIf you like analogies, imagine your program is a bakery selling fresh-baked\nbread to customers. Throughput is the total number of warm, crusty baguettes you\ncan serve to customers in a single day. Latency is how long the unluckiest\ncustomer has to wait in line before they get served.\n\n<span name=\"dishwasher\">Running</span> the garbage collector is like shutting\ndown the bakery temporarily to go through all of the dishes, sort out the dirty\nfrom the clean, and then wash the used ones. 
In our analogy, we don't have\ndedicated dishwashers, so while this is going on, no baking is happening. The\nbaker is washing up.\n\n<aside name=\"dishwasher\">\n\nIf each person represents a thread, then an obvious optimization is to have\nseparate threads running garbage collection, giving you a **concurrent garbage\ncollector**. In other words, hire some dishwashers to clean while others bake.\nThis is how very sophisticated GCs work because it does let the bakers\n-- the worker threads -- keep running user code with little interruption.\n\nHowever, coordination is required. You don't want a dishwasher grabbing a bowl\nout of a baker's hands! This coordination adds overhead and a lot of complexity.\nConcurrent collectors are fast, but challenging to implement correctly.\n\n<img src=\"image/garbage-collection/baguette.png\" class=\"above\" alt=\"Un baguette.\" />\n\n</aside>\n\nSelling fewer loaves of bread a day is bad, and making any particular customer\nsit and wait while you clean all the dishes is too. The goal is to maximize\nthroughput and minimize latency, but there is no free lunch, even inside a\nbakery. Garbage collectors make different trade-offs between how much throughput\nthey sacrifice and latency they tolerate.\n\nBeing able to make these trade-offs is useful because different user programs\nhave different needs. An overnight batch job that is generating a report from a\nterabyte of data just needs to get as much work done as fast as possible.\nThroughput is queen. Meanwhile, an app running on a user's smartphone needs to\nalways respond immediately to user input so that dragging on the screen feels\n<span name=\"butter\">buttery</span> smooth. 
The app can't freeze for a few\nseconds while the GC mucks around in the heap.\n\n<aside name=\"butter\">\n\nClearly the baking analogy is going to my head.\n\n</aside>\n\nAs a garbage collector author, you control some of the trade-off between\nthroughput and latency by your choice of collection algorithm. But even within a\nsingle algorithm, we have a lot of control over *how frequently* the collector\nruns.\n\nOur collector is a <span name=\"incremental\">**stop-the-world GC**</span> which\nmeans the user's program is paused until the entire garbage collection process\nhas completed. If we wait a long time before we run the collector, then a large\nnumber of dead objects will accumulate. That leads to a very long pause while\nthe collector runs, and thus high latency. So, clearly, we want to run the\ncollector really frequently.\n\n<aside name=\"incremental\">\n\nIn contrast, an **incremental garbage collector** can do a little collection,\nthen run some user code, then collect a little more, and so on.\n\n</aside>\n\nBut every time the collector runs, it spends some time visiting live objects.\nThat doesn't really *do* anything useful (aside from ensuring that they don't\nincorrectly get deleted). Time visiting live objects is time not freeing memory\nand also time not running user code. If you run the GC *really* frequently, then\nthe user's program doesn't have enough time to even generate new garbage for the\nVM to collect. The VM will spend all of its time obsessively revisiting the same\nset of live objects over and over, and throughput will suffer. So, clearly, we\nwant to run the collector really *in*frequently.\n\nIn fact, we want something in the middle, and the frequency of when the\ncollector runs is one of our main knobs for tuning the trade-off between latency\nand throughput.\n\n### Self-adjusting heap\n\nWe want our GC to run frequently enough to minimize latency but infrequently\nenough to maintain decent throughput. 
But how do we find the balance between\nthese when we have no idea how much memory the user's program needs and how\noften it allocates? We could pawn the problem onto the user and force them to\npick by exposing GC tuning parameters. Many VMs do this. But if we, the GC\nauthors, don't know how to tune it well, odds are good most users won't either.\nThey deserve a reasonable default behavior.\n\nI'll be honest with you, this is not my area of expertise. I've talked to a\nnumber of professional GC hackers -- this is something you can build an entire\ncareer on -- and read a lot of the literature, and all of the answers I got\nwere... vague. The strategy I ended up picking is common, pretty simple, and (I\nhope!) good enough for most uses.\n\nThe idea is that the collector frequency automatically adjusts based on the live\nsize of the heap. We track the total number of bytes of managed memory that the\nVM has allocated. When it goes above some threshold, we trigger a GC. After\nthat, we note how many bytes of memory remain -- how many were *not* freed. Then\nwe adjust the threshold to some value larger than that.\n\nThe result is that as the amount of live memory increases, we collect less\nfrequently in order to avoid sacrificing throughput by re-traversing the growing\npile of live objects. As the amount of live memory goes down, we collect more\nfrequently so that we don't lose too much latency by waiting too long.\n\nThe implementation requires two new bookkeeping fields in the VM.\n\n^code vm-fields (1 before, 1 after)\n\nThe first is a running total of the number of bytes of managed memory the VM has\nallocated. The second is the threshold that triggers the next collection. We\ninitialize them when the VM starts up.\n\n^code init-gc-fields (1 before, 2 after)\n\nThe starting threshold here is <span name=\"lab\">arbitrary</span>. It's similar\nto the initial capacity we picked for our various dynamic arrays. 
The goal is to\nnot trigger the first few GCs *too* quickly but also to not wait too long. If we\nhad some real-world Lox programs, we could profile those to tune this. But since\nall we have are toy programs, I just picked a number.\n\n<aside name=\"lab\">\n\nA challenge with learning garbage collectors is that it's *very* hard to\ndiscover the best practices in an isolated lab environment. You don't see how a\ncollector actually performs unless you run it on the kind of large, messy\nreal-world programs it is actually intended for. It's like tuning a rally car\n-- you need to take it out on the course.\n\n</aside>\n\nEvery time we allocate or free some memory, we adjust the counter by that delta.\n\n^code updated-bytes-allocated (1 before, 1 after)\n\nWhen the total crosses the limit, we run the collector.\n\n^code collect-on-next (2 before, 1 after)\n\nNow, finally, our garbage collector actually does something when the user runs a\nprogram without our hidden diagnostic flag enabled. The sweep phase frees\nobjects by calling `reallocate()`, which lowers the value of `bytesAllocated`,\nso after the collection completes, we know how many live bytes remain. We adjust\nthe threshold of the next GC based on that.\n\n^code update-next-gc (1 before, 2 after)\n\nThe threshold is a multiple of the heap size. This way, as the amount of memory\nthe program uses grows, the threshold moves farther out to limit the total time\nspent re-traversing the larger live set. Like other numbers in this chapter, the\nscaling factor is basically arbitrary.\n\n^code heap-grow-factor (1 before, 2 after)\n\nYou'd want to tune this in your implementation once you had some real programs\nto benchmark it on. Right now, we can at least log some of the statistics that\nwe have. 
We capture the heap size before the collection.\n\n^code log-before-size (1 before, 1 after)\n\nAnd then print the results at the end.\n\n^code log-collected-amount (1 before, 1 after)\n\nThis way we can see how much the garbage collector accomplished while it ran.\n\n## Garbage Collection Bugs\n\nIn theory, we are all done now. We have a GC. It kicks in periodically, collects\nwhat it can, and leaves the rest. If this were a typical textbook, we would wipe\nthe dust from our hands and bask in the soft glow of the flawless marble edifice\nwe have created.\n\nBut I aim to teach you not just the theory of programming languages but the\nsometimes painful reality. I am going to roll over a rotten log and show you the\nnasty bugs that live under it, and garbage collector bugs really are some of the\ngrossest invertebrates out there.\n\nThe collector's job is to free dead objects and preserve live ones. Mistakes are\neasy to make in both directions. If the VM fails to free objects that aren't\nneeded, it slowly leaks memory. If it frees an object that is in use, the user's\nprogram can access invalid memory. These failures often don't immediately cause\na crash, which makes it hard for us to trace backward in time to find the bug.\n\nThis is made harder by the fact that we don't know when the collector will run.\nAny call that eventually allocates some memory is a place in the VM where a\ncollection could happen. It's like musical chairs. At any point, the GC might\nstop the music. Every single heap-allocated object that we want to keep needs to\nfind a chair quickly -- get marked as a root or stored as a reference in some\nother object -- before the sweep phase comes to kick it out of the game.\n\nHow is it possible for the VM to use an object later -- one that the GC itself\ndoesn't see? How can the VM find it? The most common answer is through a pointer\nstored in some local variable on the C stack. 
The GC walks the *VM's* value and\nCallFrame stacks, but the C stack is <span name=\"c\">hidden</span> to it.\n\n<aside name=\"c\">\n\nOur GC can't find addresses in the C stack, but many can. Conservative garbage\ncollectors look all through memory, including the native stack. The most\nwell-known of this variety is the [**Boehm–Demers–Weiser garbage\ncollector**][boehm], usually just called the \"Boehm collector\". (The shortest\npath to fame in CS is a last name that's alphabetically early so that it shows\nup first in sorted lists of names.)\n\n[boehm]: https://en.wikipedia.org/wiki/Boehm_garbage_collector\n\nMany precise GCs walk the C stack too. Even those have to be careful about\npointers to live objects that exist only in *CPU registers*.\n\n</aside>\n\nIn previous chapters, we wrote seemingly pointless code that pushed an object\nonto the VM's value stack, did a little work, and then popped it right back off.\nMost times, I said this was for the GC's benefit. Now you see why. The code\nbetween pushing and popping potentially allocates memory and thus can trigger a\nGC. We had to make sure the object was on the value stack so that the\ncollector's mark phase would find it and keep it alive.\n\nI wrote the entire clox implementation before splitting it into chapters and\nwriting the prose, so I had plenty of time to find all of these corners and\nflush out most of these bugs. The stress testing code we put in at the beginning\nof this chapter and a pretty good test suite were very helpful.\n\nBut I fixed only *most* of them. I left a couple in because I want to give you a\nhint of what it's like to encounter these bugs in the wild. If you enable the\nstress test flag and run some toy Lox programs, you can probably stumble onto a\nfew. Give it a try and *see if you can fix any yourself*.\n\n\n### Adding to the constant table\n\nYou are very likely to hit the first bug. The constant table each chunk owns is\na dynamic array. 
When the compiler adds a new constant to the current function's\ntable, that array may need to grow. The constant itself may also be some\nheap-allocated object like a string or a nested function.\n\nThe new object being added to the constant table is passed to `addConstant()`.\nAt that moment, the object can be found only in the parameter to that function\non the C stack. That function appends the object to the constant table. If the\ntable doesn't have enough capacity and needs to grow, it calls `reallocate()`.\nThat in turn triggers a GC, which fails to mark the new constant object and\nthus sweeps it right before we have a chance to add it to the table. Crash.\n\nThe fix, as you've seen in other places, is to push the constant onto the stack\ntemporarily.\n\n^code add-constant-push (1 before, 1 after)\n\nOnce the constant table contains the object, we pop it off the stack.\n\n^code add-constant-pop (1 before, 1 after)\n\nWhen the GC is marking roots, it walks the chain of compilers and marks each of\ntheir functions, so the new constant is reachable now. We do need an include\nto call into the VM from the \"chunk\" module.\n\n^code chunk-include-vm (1 before, 2 after)\n\n### Interning strings\n\nHere's another similar one. All strings are interned in clox, so whenever we\ncreate a new string, we also add it to the intern table. You can see where this\nis going. Since the string is brand new, it isn't reachable anywhere. And\nresizing the string pool can trigger a collection. Again, we go ahead and stash\nthe string on the stack first.\n\n^code push-string (2 before, 1 after)\n\nAnd then pop it back off once it's safely nestled in the table.\n\n^code pop-string (1 before, 2 after)\n\nThis ensures the string is safe while the table is being resized. 
Once it\nsurvives that, `allocateString()` will return it to some caller which can then\ntake responsibility for ensuring the string is still reachable before the next\nheap allocation occurs.\n\n### Concatenating strings\n\nOne last example: Over in the interpreter, the `OP_ADD` instruction can be used\nto concatenate two strings. As it does with numbers, it pops the two operands\nfrom the stack, computes the result, and pushes that new value back onto the\nstack. For numbers that's perfectly safe.\n\nBut concatenating two strings requires allocating a new character array on the\nheap, which can in turn trigger a GC. Since we've already popped the operand\nstrings by that point, they can potentially be missed by the mark phase and get\nswept away. Instead of popping them off the stack eagerly, we peek them.\n\n^code concatenate-peek (1 before, 2 after)\n\nThat way, they are still hanging out on the stack when we create the result\nstring. Once that's done, we can safely pop them off and replace them with the\nresult.\n\n^code concatenate-pop (1 before, 1 after)\n\nThose were all pretty easy, especially because I *showed* you where the fix was.\nIn practice, *finding* them is the hard part. All you see is an object that\n*should* be there but isn't. It's not like other bugs where you're looking for\nthe code that *causes* some problem. You're looking for the *absence* of code\nwhich fails to *prevent* a problem, and that's a much harder search.\n\nBut, for now at least, you can rest easy. As far as I know, we've found all of\nthe collection bugs in clox, and now we have a working, robust, self-tuning,\nmark-sweep garbage collector.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  The Obj header struct at the top of each object now has three fields:\n    `type`, `isMarked`, and `next`. How much memory do those take up (on your\n    machine)? Can you come up with something more compact? Is there a runtime\n    cost to doing so?\n\n1.  
When the sweep phase traverses a live object, it clears the `isMarked`\n    field to prepare it for the next collection cycle. Can you come up with a\n    more efficient approach?\n\n1.  Mark-sweep is only one of a variety of garbage collection algorithms out\n    there. Explore those by replacing or augmenting the current collector with\n    another one. Good candidates to consider are reference counting, Cheney's\n    algorithm, or the Lisp 2 mark-compact algorithm.\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: Generational Collectors\n\nA collector loses throughput if it spends a long time re-visiting objects that\nare still alive. But it can increase latency if it avoids collecting and\naccumulates a large pile of garbage to wade through. If only there were some way\nto tell which objects were likely to be long-lived and which weren't. Then the\nGC could avoid revisiting the long-lived ones as often and clean up the\nephemeral ones more frequently.\n\nIt turns out there kind of is. Many years ago, GC researchers gathered metrics\non the lifetime of objects in real-world running programs. They tracked every\nobject when it was allocated, and eventually when it was no longer needed, and\nthen graphed out how long objects tended to live.\n\nThey discovered something they called the **generational hypothesis**, or the\nmuch less tactful term **infant mortality**. Their observation was that most\nobjects are very short-lived but once they survive beyond a certain age, they\ntend to stick around quite a long time. The longer an object *has* lived, the\nlonger it likely will *continue* to live. This observation is powerful because\nit gave them a handle on how to partition objects into groups that benefit from\nfrequent collections and those that don't.\n\nThey designed a technique called **generational garbage collection**. 
It works\nlike this: Every time a new object is allocated, it goes into a special,\nrelatively small region of the heap called the \"nursery\". Since objects tend to\ndie young, the garbage collector is invoked <span\nname=\"nursery\">frequently</span> over the objects just in this region.\n\n<aside name=\"nursery\">\n\nNurseries are also usually managed using a copying collector which is faster at\nallocating and freeing objects than a mark-sweep collector.\n\n</aside>\n\nEach time the GC runs over the nursery is called a \"generation\". Any objects\nthat are no longer needed get freed. Those that survive are now considered one\ngeneration older, and the GC tracks this for each object. If an object survives\na certain number of generations -- often just a single collection -- it gets\n*tenured*. At this point, it is copied out of the nursery into a much larger\nheap region for long-lived objects. The garbage collector runs over that region\ntoo, but much less frequently since odds are good that most of those objects\nwill still be alive.\n\nGenerational collectors are a beautiful marriage of empirical data -- the\nobservation that object lifetimes are *not* evenly distributed -- and clever\nalgorithm design that takes advantage of that fact. They're also conceptually\nquite simple. You can think of one as just two separately tuned GCs and a pretty\nsimple policy for moving objects from one to the other.\n\n</div>\n"
  },
  {
    "path": "book/global-variables.md",
    "content": "> If only there could be an invention that bottled up a memory, like scent. And\n> it never faded, and it never got stale. And then, when one wanted it, the\n> bottle could be uncorked, and it would be like living the moment all over\n> again.\n>\n> <cite>Daphne du Maurier, <em>Rebecca</em></cite>\n\nThe [previous chapter][hash] was a long exploration of one big, deep,\nfundamental computer science data structure. Heavy on theory and concept. There\nmay have been some discussion of big-O notation and algorithms. This chapter has\nfewer intellectual pretensions. There are no large ideas to learn. Instead, it's\na handful of straightforward engineering tasks. Once we've completed them, our\nvirtual machine will support variables.\n\nActually, it will support only *global* variables. Locals are coming in the\n[next chapter][]. In jlox, we managed to cram them both into a single chapter\nbecause we used the same implementation technique for all variables. We built a\nchain of environments, one for each scope, all the way up to the top. That was a\nsimple, clean way to learn how to manage state.\n\n[next chapter]: local-variables.html\n\nBut it's also *slow*. Allocating a new hash table each time you enter a block or\ncall a function is not the road to a fast VM. Given how much code is concerned\nwith using variables, if variables go slow, everything goes slow. For clox,\nwe'll improve that by using a much more efficient strategy for <span\nname=\"different\">local</span> variables, but globals aren't as easily optimized.\n\n<aside name=\"different\">\n\nThis is a common meta-strategy in sophisticated language implementations. Often,\nthe same language feature will have multiple implementation techniques, each\ntuned for different use patterns. For example, JavaScript VMs often have a\nfaster representation for objects that are used more like instances of classes\ncompared to other objects whose set of properties is more freely modified. 
C and\nC++ compilers usually have a variety of ways to compile `switch` statements\nbased on the number of cases and how densely packed the case values are.\n\n</aside>\n\n[hash]: hash-tables.html\n\nA quick refresher on Lox semantics: Global variables in Lox are \"late bound\", or\nresolved dynamically. This means you can compile a chunk of code that refers to\na global variable before it's defined. As long as the code doesn't *execute*\nbefore the definition happens, everything is fine. In practice, that means you\ncan refer to later variables inside the body of functions.\n\n```lox\nfun showVariable() {\n  print global;\n}\n\nvar global = \"after\";\nshowVariable();\n```\n\nCode like this might seem odd, but it's handy for defining mutually recursive\nfunctions. It also plays nicer with the REPL. You can write a little function in\none line, then define the variable it uses in the next.\n\nLocal variables work differently. Since a local variable's declaration *always*\noccurs before it is used, the VM can resolve them at compile time, even in a\nsimple single-pass compiler. That will let us use a smarter representation for\nlocals. But that's for the next chapter. Right now, let's just worry about\nglobals.\n\n## Statements\n\nVariables come into being using variable declarations, which means now is also\nthe time to add support for statements to our compiler. If you recall, Lox\nsplits statements into two categories. \"Declarations\" are those statements that\nbind a new name to a value. The other kinds of statements -- control flow,\nprint, etc. -- are just called \"statements\". 
We disallow declarations directly\ninside control flow statements, like this:\n\n```lox\nif (monday) var croissant = \"yes\"; // Error.\n```\n\nAllowing it would raise confusing questions around the scope of the variable.\nSo, like other languages, we prohibit it syntactically by having a separate\ngrammar rule for the subset of statements that *are* allowed inside a control\nflow body.\n\n```ebnf\nstatement      → exprStmt\n               | forStmt\n               | ifStmt\n               | printStmt\n               | returnStmt\n               | whileStmt\n               | block ;\n```\n\nThen we use a separate rule for the top level of a script and inside a block.\n\n```ebnf\ndeclaration    → classDecl\n               | funDecl\n               | varDecl\n               | statement ;\n```\n\nThe `declaration` rule contains the statements that declare names, and also\nincludes `statement` so that all statement types are allowed. Since `block`\nitself is in `statement`, you can put declarations <span\nname=\"parens\">inside</span> a control flow construct by nesting them inside a\nblock.\n\n<aside name=\"parens\">\n\nBlocks work sort of like parentheses do for expressions. A block lets you put\nthe \"lower-precedence\" declaration statements in places where only a\n\"higher-precedence\" non-declaring statement is allowed.\n\n</aside>\n\nIn this chapter, we'll cover only a couple of statements and one\ndeclaration.\n\n```ebnf\nstatement      → exprStmt\n               | printStmt ;\n\ndeclaration    → varDecl\n               | statement ;\n```\n\nUp to now, our VM considered a \"program\" to be a single expression since that's\nall we could parse and compile. In a full Lox implementation, a program is a\nsequence of declarations. We're ready to support that now.\n\n^code compile (1 before, 1 after)\n\nWe keep compiling declarations until we hit the end of the source file. 
We\ncompile a single declaration using this:\n\n^code declaration\n\nWe'll get to variable declarations later in the chapter, so for now, we simply\nforward to `statement()`.\n\n^code statement\n\nBlocks can contain declarations, and control flow statements can contain other\nstatements. That means these two functions will eventually be recursive. We may\nas well write out the forward declarations now.\n\n^code forward-declarations (1 before, 1 after)\n\n### Print statements\n\nWe have two statement types to support in this chapter. Let's start with `print`\nstatements, which begin, naturally enough, with a `print` token. We detect that\nusing this helper function:\n\n^code match\n\nYou may recognize it from jlox. If the current token has the given type, we\nconsume the token and return `true`. Otherwise we leave the token alone and\nreturn `false`. This <span name=\"turtles\">helper</span> function is implemented\nin terms of this other helper:\n\n<aside name=\"turtles\">\n\nIt's helpers all the way down!\n\n</aside>\n\n^code check\n\nThe `check()` function returns `true` if the current token has the given type.\nIt seems a little <span name=\"read\">silly</span> to wrap this in a function, but\nwe'll use it more later, and I think short verb-named functions like this make\nthe parser easier to read.\n\n<aside name=\"read\">\n\nThis sounds trivial, but handwritten parsers for non-toy languages get pretty\nbig. When you have thousands of lines of code, a utility function that turns two\nlines into one and makes the result a little more readable easily earns its\nkeep.\n\n</aside>\n\nIf we did match the `print` token, then we compile the rest of the statement\nhere:\n\n^code print-statement\n\nA `print` statement evaluates an expression and prints the result, so we first\nparse and compile that expression. The grammar expects a semicolon after that,\nso we consume it. 
Finally, we emit a new instruction to print the result.\n\n^code op-print (1 before, 1 after)\n\nAt runtime, we execute this instruction like so:\n\n^code interpret-print (1 before, 1 after)\n\nWhen the interpreter reaches this instruction, it has already executed the code\nfor the expression, leaving the result value on top of the stack. Now we simply\npop and print it.\n\nNote that we don't push anything else after that. This is a key difference\nbetween expressions and statements in the VM. Every bytecode instruction has a\n<span name=\"effect\">**stack effect**</span> that describes how the instruction\nmodifies the stack. For example, `OP_ADD` pops two values and pushes one,\nleaving the stack one element smaller than before.\n\n<aside name=\"effect\">\n\nThe stack is one element shorter after an `OP_ADD`, so its effect is -1:\n\n<img src=\"image/global-variables/stack-effect.png\" alt=\"The stack effect of an OP_ADD instruction.\" />\n\n</aside>\n\nYou can sum the stack effects of a series of instructions to get their total\neffect. When you add the stack effects of the series of instructions compiled\nfrom any complete expression, it will total one. Each expression leaves one\nresult value on the stack.\n\nThe bytecode for an entire statement has a total stack effect of zero. Since a\nstatement produces no values, it ultimately leaves the stack unchanged, though\nit of course uses the stack while it's doing its thing. This is important\nbecause when we get to control flow and looping, a program might execute a long\nseries of statements. If each statement grew or shrank the stack, it might\neventually overflow or underflow.\n\nWhile we're in the interpreter loop, we should delete a bit of code.\n\n^code op-return (1 before, 1 after)\n\nWhen the VM only compiled and evaluated a single expression, we had some\ntemporary code in `OP_RETURN` to output the value. Now that we have statements\nand `print`, we don't need that anymore. 
We're one <span\nname=\"return\">step</span> closer to the complete implementation of clox.\n\n<aside name=\"return\">\n\nWe're only one step closer, though. We will revisit `OP_RETURN` again when we\nadd functions. Right now, it exits the entire interpreter loop.\n\n</aside>\n\nAs usual, a new instruction needs support in the disassembler.\n\n^code disassemble-print (1 before, 1 after)\n\nThat's our `print` statement. If you want, give it a whirl:\n\n```lox\nprint 1 + 2;\nprint 3 * 4;\n```\n\nExciting! OK, maybe not thrilling, but we can build scripts that contain as many\nstatements as we want now, which feels like progress.\n\n### Expression statements\n\nWait until you see the next statement. If we *don't* see a `print` keyword, then\nwe must be looking at an expression statement.\n\n^code parse-expressions-statement (1 before, 1 after)\n\nIt's parsed like so:\n\n^code expression-statement\n\nAn \"expression statement\" is simply an expression followed by a semicolon.\nThey're how you write an expression in a context where a statement is expected.\nUsually, it's so that you can call a function or evaluate an assignment for its\nside effect, like this:\n\n```lox\nbrunch = \"quiche\";\neat(brunch);\n```\n\nSemantically, an expression statement evaluates the expression and discards the\nresult. The compiler directly encodes that behavior. It compiles the expression,\nand then emits an `OP_POP` instruction.\n\n^code pop-op (1 before, 1 after)\n\nAs the name implies, that instruction pops the top value off the stack and\nforgets it.\n\n^code interpret-pop (1 before, 1 after)\n\nWe can disassemble it too.\n\n^code disassemble-pop (1 before, 1 after)\n\nExpression statements aren't very useful yet since we can't create any\nexpressions that have side effects, but they'll be essential when we\n[add functions later][functions]. 
The <span name=\"majority\">majority</span> of\nstatements in real-world code in languages like C are expression statements.\n\n<aside name=\"majority\">\n\nBy my count, 80 of the 149 statements in the version of \"compiler.c\" that we\nhave at the end of this chapter are expression statements.\n\n</aside>\n\n[functions]: calls-and-functions.html\n\n### Error synchronization\n\nWhile we're getting this initial work done in the compiler, we can tie off a\nloose end we left [several chapters back][errors]. Like jlox, clox uses panic\nmode error recovery to minimize the number of cascaded compile errors that it\nreports. The compiler exits panic mode when it reaches a synchronization point.\nFor Lox, we chose statement boundaries as that point. Now that we have\nstatements, we can implement synchronization.\n\n[errors]: compiling-expressions.html#handling-syntax-errors\n\n^code call-synchronize (1 before, 1 after)\n\nIf we hit a compile error while parsing the previous statement, we enter panic\nmode. When that happens, we start synchronizing after the statement.\n\n^code synchronize\n\nWe skip tokens indiscriminately until we reach something that looks like a\nstatement boundary. We recognize the boundary by looking for a preceding token\nthat can end a statement, like a semicolon. Or we'll look for a subsequent token\nthat begins a statement, usually one of the control flow or declaration\nkeywords.\n\n## Variable Declarations\n\nMerely being able to *print* doesn't win your language any prizes at the\nprogramming language <span name=\"fair\">fair</span>, so let's move on to\nsomething a little more ambitious and get variables going. There are three\noperations we need to support:\n\n<aside name=\"fair\">\n\nI can't help but imagine a \"language fair\" like some country 4H thing. 
Rows of\nstraw-lined stalls full of baby languages *moo*ing and *baa*ing at each other.\n\n</aside>\n\n*   Declaring a new variable using a `var` statement.\n*   Accessing the value of a variable using an identifier expression.\n*   Storing a new value in an existing variable using an assignment expression.\n\nWe can't do either of the last two until we have some variables, so we start\nwith declarations.\n\n^code match-var (1 before, 2 after)\n\nThe placeholder parsing function we sketched out for the declaration grammar\nrule has an actual production now. If we match a `var` token, we jump here:\n\n^code var-declaration\n\nThe keyword is followed by the variable name. That's compiled by\n`parseVariable()`, which we'll get to in a second. Then we look for an `=`\nfollowed by an initializer expression. If the user doesn't initialize the\nvariable, the compiler implicitly initializes it to <span\nname=\"nil\">`nil`</span> by emitting an `OP_NIL` instruction. Either way, we\nexpect the statement to be terminated with a semicolon.\n\n<aside name=\"nil\" class=\"bottom\">\n\nEssentially, the compiler desugars a variable declaration like:\n\n```lox\nvar a;\n```\n\ninto:\n\n```lox\nvar a = nil;\n```\n\nThe code it generates for the former is identical to what it produces for the\nlatter.\n\n</aside>\n\nThere are two new functions here for working with variables and identifiers.\nHere is the first:\n\n^code parse-variable (2 before)\n\nIt requires the next token to be an identifier, which it consumes and sends\nhere:\n\n^code identifier-constant (2 before)\n\nThis function takes the given token and adds its lexeme to the chunk's constant\ntable as a string. It then returns the index of that constant in the constant\ntable.\n\nGlobal variables are looked up *by name* at runtime. That means the VM -- the\nbytecode interpreter loop -- needs access to the name. A whole string is too big\nto stuff into the bytecode stream as an operand. 
Instead, we store the string in\nthe constant table and the instruction then refers to the name by its index in\nthe table.\n\nThis function returns that index all the way to `varDeclaration()` which later\nhands it over to here:\n\n^code define-variable\n\n<span name=\"helper\">This</span> outputs the bytecode instruction that defines\nthe new variable and stores its initial value. The index of the variable's name\nin the constant table is the instruction's operand. As usual in a stack-based\nVM, we emit this instruction last. At runtime, we execute the code for the\nvariable's initializer first. That leaves the value on the stack. Then this\ninstruction takes that value and stores it away for later.\n\n<aside name=\"helper\">\n\nI know some of these functions seem pretty pointless right now. But we'll get\nmore mileage out of them as we add more language features for working with\nnames. Function and class declarations both declare new variables, and variable\nand assignment expressions access them.\n\n</aside>\n\nOver in the runtime, we begin with this new instruction:\n\n^code define-global-op (1 before, 1 after)\n\nThanks to our handy-dandy hash table, the implementation isn't too hard.\n\n^code interpret-define-global (1 before, 1 after)\n\nWe get the name of the variable from the constant table. Then we <span\nname=\"pop\">take</span> the value from the top of the stack and store it in a\nhash table with that name as the key.\n\n<aside name=\"pop\">\n\nNote that we don't *pop* the value until *after* we add it to the hash table.\nThat ensures the VM can still find the value if a garbage collection is\ntriggered right in the middle of adding it to the hash table. That's a distinct\npossibility since the hash table requires dynamic allocation when it resizes.\n\n</aside>\n\nThis code doesn't check to see if the key is already in the table. Lox is pretty\nlax with global variables and lets you redefine them without error. 
That's\nuseful in a REPL session, so the VM supports that by simply overwriting the\nvalue if the key happens to already be in the hash table.\n\nThere's another little helper macro:\n\n^code read-string (1 before, 1 after)\n\nIt reads a one-byte operand from the bytecode chunk. It treats that as an index\ninto the chunk's constant table and returns the string at that index. It doesn't\ncheck that the value *is* a string -- it just indiscriminately casts it. That's\nsafe because the compiler never emits an instruction that refers to a non-string\nconstant.\n\nBecause we care about lexical hygiene, we also undefine this macro at the end of\nthe interpret function.\n\n^code undef-read-string (1 before, 1 after)\n\nI keep saying \"the hash table\", but we don't actually have one yet. We need a\nplace to store these globals. Since we want them to persist as long as clox is\nrunning, we store them right in the VM.\n\n^code vm-globals (1 before, 1 after)\n\nAs we did with the string table, we need to initialize the hash table to a valid\nstate when the VM boots up.\n\n^code init-globals (1 before, 1 after)\n\nAnd we <span name=\"tear\">tear</span> it down when we exit.\n\n<aside name=\"tear\">\n\nThe process will free everything on exit, but it feels undignified to require\nthe operating system to clean up our mess.\n\n</aside>\n\n^code free-globals (1 before, 1 after)\n\nAs usual, we want to be able to disassemble the new instruction too.\n\n^code disassemble-define-global (1 before, 1 after)\n\nAnd with that, we can define global variables. Not that users can *tell* that\nthey've done so, because they can't actually *use* them. So let's fix that next.\n\n## Reading Variables\n\nAs in every programming language ever, we access a variable's value using its\nname. 
We hook up identifier tokens to the expression parser here:\n\n^code table-identifier (1 before, 1 after)\n\nThat calls this new parser function:\n\n^code variable-without-assign\n\nLike with declarations, there are a couple of tiny helper functions that seem\npointless now but will become more useful in later chapters. I promise.\n\n^code read-named-variable\n\nThis calls the same `identifierConstant()` function from before to take the\ngiven identifier token and add its lexeme to the chunk's constant table as a\nstring. All that remains is to emit an instruction that loads the global\nvariable with that name. Here's the instruction:\n\n^code get-global-op (1 before, 1 after)\n\nOver in the interpreter, the implementation mirrors `OP_DEFINE_GLOBAL`.\n\n^code interpret-get-global (1 before, 1 after)\n\nWe pull the constant table index from the instruction's operand and get the\nvariable name. Then we use that as a key to look up the variable's value in the\nglobals hash table.\n\nIf the key isn't present in the hash table, it means that global variable has\nnever been defined. That's a runtime error in Lox, so we report it and exit the\ninterpreter loop if that happens. Otherwise, we take the value and push it\nonto the stack.\n\n^code disassemble-get-global (1 before, 1 after)\n\nA little bit of disassembling, and we're done. Our interpreter is now able to\nrun code like this:\n\n```lox\nvar beverage = \"cafe au lait\";\nvar breakfast = \"beignets with \" + beverage;\nprint breakfast;\n```\n\nThere's only one operation left.\n\n## Assignment\n\nThroughout this book, I've tried to keep you on a fairly safe and easy path. I\ndon't avoid hard *problems*, but I try to not make the *solutions* more complex\nthan they need to be. 
Alas, other design choices in our <span\nname=\"jlox\">bytecode</span> compiler make assignment annoying to implement.\n\n<aside name=\"jlox\">\n\nIf you recall, assignment was pretty easy in jlox.\n\n</aside>\n\nOur bytecode VM uses a single-pass compiler. It parses and generates bytecode\non the fly without any intermediate AST. As soon as it recognizes a piece of\nsyntax, it emits code for it. Assignment doesn't naturally fit that. Consider:\n\n```lox\nmenu.brunch(sunday).beverage = \"mimosa\";\n```\n\nIn this code, the parser doesn't realize `menu.brunch(sunday).beverage` is the\ntarget of an assignment and not a normal expression until it reaches `=`, many\ntokens after the first `menu`. By then, the compiler has already emitted\nbytecode for the whole thing.\n\nThe problem is not as dire as it might seem, though. Look at how the parser sees that example:\n\n<img src=\"image/global-variables/setter.png\" alt=\"The 'menu.brunch(sunday).beverage = &quot;mimosa&quot;' statement, showing that 'menu.brunch(sunday)' is an expression.\" />\n\nEven though the `.beverage` part must not be compiled as a get expression,\neverything to the left of the `.` is an expression, with the normal expression\nsemantics. The `menu.brunch(sunday)` part can be compiled and executed as usual.\n\nFortunately for us, the only semantic differences on the left side of an\nassignment appear at the very right-most end of the tokens, immediately\npreceding the `=`. Even though the receiver of a setter may be an arbitrarily\nlong expression, the part whose behavior differs from a get expression is only\nthe trailing identifier, which is right before the `=`. We don't need much\nlookahead to realize `beverage` should be compiled as a set expression and not a\ngetter.\n\nVariables are even easier since they are just a single bare identifier before an\n`=`. 
The idea then is that right *before* compiling an expression that can also\nbe used as an assignment target, we look for a subsequent `=` token. If we see\none, we compile it as an assignment or setter instead of a variable access or\ngetter.\n\nWe don't have setters to worry about yet, so all we need to handle are variables.\n\n^code named-variable (1 before, 1 after)\n\nIn the parse function for identifier expressions, we look for an equals sign\nafter the identifier. If we find one, instead of emitting code for a variable\naccess, we compile the assigned value and then emit an assignment instruction.\n\nThat's the last instruction we need to add in this chapter.\n\n^code set-global-op (1 before, 1 after)\n\nAs you'd expect, its runtime behavior is similar to defining a new variable.\n\n^code interpret-set-global (1 before, 1 after)\n\nThe main difference is what happens when the key doesn't already exist in the\nglobals hash table. If the variable hasn't been defined yet, it's a runtime\nerror to try to assign to it. Lox [doesn't do implicit variable\ndeclaration][implicit].\n\n<aside name=\"delete\">\n\nThe call to `tableSet()` stores the value in the global variable table even if\nthe variable wasn't previously defined. That fact is visible in a REPL session,\nsince it keeps running even after the runtime error is reported. So we also take\ncare to delete that zombie value from the table.\n\n</aside>\n\nThe other difference is that setting a variable doesn't pop the value off the\nstack. Remember, assignment is an expression, so it needs to leave that value\nthere in case the assignment is nested inside some larger expression.\n\n[implicit]: statements-and-state.html#design-note\n\nAdd a dash of disassembly:\n\n^code disassemble-set-global (2 before, 1 after)\n\nSo we're done, right? Well... not quite. We've made a mistake! 
Take a gander at:\n\n```lox\na * b = c + d;\n```\n\nAccording to Lox's grammar, `=` has the lowest precedence, so this should be\nparsed roughly like:\n\n<img src=\"image/global-variables/ast-good.png\" alt=\"The expected parse, like '(a * b) = (c + d)'.\" />\n\nObviously, `a * b` isn't a <span name=\"do\">valid</span> assignment target, so\nthis should be a syntax error. But here's what our parser does:\n\n<aside name=\"do\">\n\nWouldn't it be wild if `a * b` *was* a valid assignment target, though? You\ncould imagine some algebra-like language that tried to divide the assigned value\nup in some reasonable way and distribute it to `a` and `b`... that's probably\na terrible idea.\n\n</aside>\n\n1.  First, `parsePrecedence()` parses `a` using the `variable()` prefix parser.\n1.  After that, it enters the infix parsing loop.\n1.  It reaches the `*` and calls `binary()`.\n1.  That recursively calls `parsePrecedence()` to parse the right-hand operand.\n1.  That calls `variable()` again for parsing `b`.\n1.  Inside that call to `variable()`, it looks for a trailing `=`. It sees one\n    and thus parses the rest of the line as an assignment.\n\nIn other words, the parser sees the above code like:\n\n<img src=\"image/global-variables/ast-bad.png\" alt=\"The actual parse, like 'a * (b = c + d)'.\" />\n\nWe've messed up the precedence handling because `variable()` doesn't take into\naccount the precedence of the surrounding expression that contains the variable.\nIf the variable happens to be the right-hand side of an infix operator, or the\noperand of a unary operator, then that containing expression is too high\nprecedence to permit the `=`.\n\nTo fix this, `variable()` should look for and consume the `=` only if it's in\nthe context of a low-precedence expression. The code that knows the current\nprecedence is, logically enough, `parsePrecedence()`. The `variable()` function\ndoesn't need to know the actual level. 
It just cares that the precedence is low\nenough to allow assignment, so we pass that fact in as a Boolean.\n\n^code prefix-rule (4 before, 2 after)\n\nSince assignment is the lowest-precedence expression, the only time we allow an\nassignment is when parsing an assignment expression or top-level expression like\nin an expression statement. That flag makes its way to the parser function here:\n\n^code variable\n\nWhich passes it through a new parameter:\n\n^code named-variable-signature (1 after)\n\nAnd then finally uses it here:\n\n^code named-variable-can-assign (2 before, 1 after)\n\nThat's a lot of plumbing to get literally one bit of data to the right place in\nthe compiler, but arrived it has. If the variable is nested inside some\nexpression with higher precedence, `canAssign` will be `false` and this will\nignore the `=` even if there is one there. Then `namedVariable()` returns, and\nexecution eventually makes its way back to `parsePrecedence()`.\n\nThen what? What does the compiler do with our broken example from before? Right\nnow, `variable()` won't consume the `=`, so that will be the current token. The\ncompiler returns back to `parsePrecedence()` from the `variable()` prefix parser\nand then tries to enter the infix parsing loop. There is no parsing function\nassociated with `=`, so it skips that loop.\n\nThen `parsePrecedence()` silently returns back to the caller. That also isn't\nright. If the `=` doesn't get consumed as part of the expression, nothing else\nis going to consume it. It's an error and we should report it.\n\n^code invalid-assign (2 before, 1 after)\n\nWith that, the previous bad program correctly gets an error at compile time. OK,\n*now* are we done? Still not quite. See, we're passing an argument to one of the\nparse functions. But those functions are stored in a table of function pointers,\nso all of the parse functions need to have the same type. 
Even though most parse\nfunctions don't support being used as an assignment target -- setters are the\n<span name=\"index\">only</span> other one -- our friendly C compiler requires\nthem *all* to accept the parameter.\n\n<aside name=\"index\">\n\nIf Lox had arrays and subscript operators like `array[index]` then an infix `[`\nwould also allow assignment to support `array[index] = value`.\n\n</aside>\n\nSo we're going to finish off this chapter with some grunt work. First, let's go\nahead and pass the flag to the infix parse functions.\n\n^code infix-rule (1 before, 1 after)\n\nWe'll need that for setters eventually. Then we'll fix the typedef for the\nfunction type.\n\n^code parse-fn-type (2 before, 2 after)\n\nAnd some completely tedious code to accept this parameter in all of our existing\nparse functions. Here:\n\n^code binary (1 after)\n\nAnd here:\n\n^code parse-literal (1 after)\n\nAnd here:\n\n^code grouping (1 after)\n\nAnd here:\n\n^code number (1 after)\n\nAnd here too:\n\n^code string (1 after)\n\nAnd, finally:\n\n^code unary (1 after)\n\nPhew! We're back to a C program we can compile. Fire it up and now you can run\nthis:\n\n```lox\nvar breakfast = \"beignets\";\nvar beverage = \"cafe au lait\";\nbreakfast = \"beignets with \" + beverage;\n\nprint breakfast;\n```\n\nIt's starting to look like real code for an actual language!\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  The compiler adds a global variable's name to the constant table as a string\n    every time an identifier is encountered. It creates a new constant each\n    time, even if that variable name is already in a previous slot in the\n    constant table. That's wasteful in cases where the same variable is\n    referenced multiple times by the same function. That, in turn, increases the\n    odds of filling up the constant table and running out of slots since we\n    allow only 256 constants in a single chunk.\n\n    Optimize this. 
How does your optimization affect the performance of the\n    compiler compared to the runtime? Is this the right trade-off?\n\n2.  Looking up a global variable by name in a hash table each time it is used\n    is pretty slow, even with a good hash table. Can you come up with a more\n    efficient way to store and access global variables without changing the\n    semantics?\n\n3.  When running in the REPL, a user might write a function that references an\n    unknown global variable. Then, in the next line, they declare the variable.\n    Lox should handle this gracefully by not reporting an \"unknown variable\"\n    compile error when the function is first defined.\n\n    But when a user runs a Lox *script*, the compiler has access to the full\n    text of the entire program before any code is run. Consider this program:\n\n    ```lox\n    fun useVar() {\n      print oops;\n    }\n\n    var ooops = \"too many o's!\";\n    ```\n\n    Here, we can tell statically that `oops` will not be defined because there\n    is *no* declaration of that global anywhere in the program. Note that\n    `useVar()` is never called either, so even though the variable isn't\n    defined, no runtime error will occur because it's never used either.\n\n    We could report mistakes like this as compile errors, at least when running\n    from a script. Do you think we should? Justify your answer. What do other\n    scripting languages you know do?\n\n</div>\n"
  },
  {
    "path": "book/hash-tables.md",
    "content": "> Hash, x. There is no definition for this word -- nobody knows what hash is.\n>\n> <cite>Ambrose Bierce, <em>The Unabridged Devil's Dictionary</em></cite>\n\nBefore we can add variables to our burgeoning virtual machine, we need some way\nto look up a value given a variable's name. Later, when we add classes, we'll\nalso need a way to store fields on instances. The perfect data structure for\nthese problems and others is a hash table.\n\nYou probably already know what a hash table is, even if you don't know it by\nthat name. If you're a Java programmer, you call them \"HashMaps\". C# and Python\nusers call them \"dictionaries\". In C++, it's an \"unordered map\". \"Objects\" in\nJavaScript and \"tables\" in Lua are hash tables under the hood, which is what\ngives them their flexibility.\n\nA hash table, whatever your language calls it, associates a set of **keys** with\na set of **values**. Each key/value pair is an **entry** in the table. Given a\nkey, you can look up its corresponding value. You can add new key/value pairs\nand remove entries by key. If you add a new value for an existing key, it\nreplaces the previous entry.\n\nHash tables appear in so many languages because they are incredibly powerful.\nMuch of this power comes from one metric: given a key, a hash table returns the\ncorresponding value in <span name=\"constant\">constant time</span>, *regardless\nof how many keys are in the hash table*.\n\n<aside name=\"constant\">\n\nMore specifically, the *average-case* lookup time is constant. Worst-case\nperformance can be, well, worse. In practice, it's easy to avoid degenerate\nbehavior and stay on the happy path.\n\n</aside>\n\nThat's pretty remarkable when you think about it. Imagine you've got a big stack\nof business cards and I ask you to find a certain person. The bigger the pile\nis, the longer it will take. 
Even if the pile is nicely sorted and you've got\nthe manual dexterity to do a binary search by hand, you're still talking\n*O(log n)*. But with a <span name=\"rolodex\">hash table</span>, it takes the\nsame time to find that business card when the stack has ten cards as when it has\na million.\n\n<aside name=\"rolodex\">\n\nStuff all those cards in a Rolodex -- does anyone even remember those things\nanymore? -- with dividers for each letter, and you improve your speed\ndramatically. As we'll see, that's not too far from the trick a hash table uses.\n\n</aside>\n\n## An Array of Buckets\n\nA complete, fast hash table has a couple of moving parts. I'll introduce them\none at a time by working through a couple of toy problems and their solutions.\nEventually, we'll build up to a data structure that can associate any set of\nnames with their values.\n\nFor now, imagine if Lox was a *lot* more restricted in variable names. What if a\nvariable's name could only be a <span name=\"basic\">single</span> lowercase\nletter? How could we very efficiently represent a set of variable names and\ntheir values?\n\n<aside name=\"basic\">\n\nThis limitation isn't *too* far-fetched. The initial versions of BASIC out of\nDartmouth allowed variable names to be only a single letter followed by one\noptional digit.\n\n</aside>\n\nWith only 26 possible variables (27 if you consider underscore a \"letter\", I\nguess), the answer is easy. Declare a fixed-size array with 26 elements. We'll\nfollow tradition and call each element a **bucket**. Each represents a variable\nwith `a` starting at index zero. If there's a value in the array at some\nletter's index, then that key is present with that value. 
Otherwise, the bucket\nis empty and that key/value pair isn't in the data structure.\n\n<aside name=\"bucket\">\n\n<img src=\"image/hash-tables/bucket-array.png\" alt=\"A row of buckets, each\nlabeled with a letter of the alphabet.\" />\n\n</aside>\n\nMemory usage is great -- just a single, reasonably sized <span\nname=\"bucket\">array</span>. There's some waste from the empty buckets, but it's\nnot huge. There's no overhead for node pointers, padding, or other stuff you'd\nget with something like a linked list or tree.\n\nPerformance is even better. Given a variable name -- its character -- you can\nsubtract the ASCII value of `a` and use the result to index directly into the\narray. Then you can either look up the existing value or store a new value\ndirectly into that slot. It doesn't get much faster than that.\n\nThis is sort of our Platonic ideal data structure. Lightning fast, dead simple,\nand compact in memory. As we add support for more complex keys, we'll have to\nmake some concessions, but this is what we're aiming for. Even once you add in\nhash functions, dynamic resizing, and collision resolution, this is still the\ncore of every hash table out there -- a contiguous array of buckets that you\nindex directly into.\n\n### Load factor and wrapped keys\n\nConfining Lox to single-letter variables would make our job as implementers\neasier, but it's probably no fun programming in a language that gives you only\n26 storage locations. What if we loosened it a little and allowed variables up\nto <span name=\"six\">eight</span> characters long?\n\n<aside name=\"six\">\n\nAgain, this restriction isn't so crazy. Early linkers for C treated only the\nfirst six characters of external identifiers as meaningful. Everything after\nthat was ignored. If you've ever wondered why the C standard library is so\nenamored of abbreviation -- looking at you, `strncmp()` -- it turns out it\nwasn't entirely because of the small screens (or teletypes!) 
of the day.\n\n</aside>\n\nThat's small enough that we can pack all eight characters into a 64-bit integer\nand easily turn the string into a number. We can then use it as an array index.\nOr, at least, we could if we could somehow allocate a 295,148 *petabyte* array.\nMemory's gotten cheaper over time, but not quite *that* cheap. Even if we could\nmake an array that big, it would be heinously wasteful. Almost every bucket\nwould be empty unless users started writing way bigger Lox programs than we've\nanticipated.\n\nEven though our variable keys cover the full 64-bit numeric range, we clearly\ndon't need an array that large. Instead, we allocate an array with more than\nenough capacity for the entries we need, but not unreasonably large. We map the\nfull 64-bit keys down to that smaller range by taking the value modulo the size\nof the array. Doing that essentially folds the larger numeric range onto itself\nuntil it fits the smaller range of array elements.\n\nFor example, say we want to store \"bagel\". We allocate an array with eight\nelements, plenty enough to store it and more later. We treat the key string as a\n64-bit integer. On a little-endian machine like Intel, packing those characters\ninto a 64-bit word puts the first letter, \"b\" (ASCII value 98), in the\nleast-significant byte. We take that integer modulo the array size (<span\nname=\"power-of-two\">8</span>) to fit it in the bounds and get a bucket index, 2.\nThen we store the value there as usual.\n\n<aside name=\"power-of-two\">\n\nI'm using powers of two for the array sizes here, but they don't need to be.\nSome styles of hash tables work best with powers of two, including the one we'll\nbuild in this book. Others prefer prime number array sizes or have other rules.\n\n</aside>\n\nUsing the array size as a modulus lets us map the key's numeric range down to\nfit an array of any size. We can thus control the number of buckets\nindependently of the key range. 
That solves our waste problem, but introduces a\nnew one. Any two variables whose key number has the same remainder when divided\nby the array size will end up in the same bucket. Keys can **collide**. For\nexample, if we try to add \"jam\", it also ends up in bucket 2.\n\n<img src=\"image/hash-tables/collision.png\" alt=\"'Bagel' and 'jam' both end up in bucket index 2.\" />\n\nWe have some control over this by tuning the array size. The bigger the array,\nthe fewer the indexes that get mapped to the same bucket and the fewer the\ncollisions that are likely to occur. Hash table implementers track this\ncollision likelihood by measuring the table's **load factor**. It's defined as\nthe number of entries divided by the number of buckets. So a hash table with\nfive entries and an array of 16 elements has a load factor of 0.3125. The higher\nthe load factor, the greater the chance of collisions.\n\nOne way we mitigate collisions is by resizing the array. Just like the dynamic\narrays we implemented earlier, we reallocate and grow the hash table's array as\nit fills up. Unlike a regular dynamic array, though, we won't wait until the\narray is *full*. Instead, we pick a desired load factor and grow the array when\nit goes over that.\n\n## Collision Resolution\n\nEven with a very low load factor, collisions can still occur. The [*birthday\nparadox*][birthday] tells us that as the number of entries in the hash table\nincreases, the chance of collision increases very quickly. We can pick a large\narray size to reduce that, but it's a losing game. Say we wanted to store a\nhundred items in a hash table. To keep the chance of collision below a\nstill-pretty-high 10%, we need an array with at least 47,015 elements. 
Getting\nthe chance below 1% requires an array with 492,555 elements, over 4,000 empty\nbuckets for each one in use.\n\n[birthday]: https://en.wikipedia.org/wiki/Birthday_problem\n\nA low load factor can make collisions <span name=\"pigeon\">rarer</span>, but the\n[*pigeonhole principle*][pigeon] tells us we can never eliminate them entirely.\nIf you've got five pet pigeons and four holes to put them in, at least one hole\nis going to end up with more than one pigeon. With 18,446,744,073,709,551,616\ndifferent variable names, any reasonably sized array can potentially end up with\nmultiple keys in the same bucket.\n\n[pigeon]: https://en.wikipedia.org/wiki/Pigeonhole_principle\n\nThus we still have to handle collisions gracefully when they occur. Users don't\nlike it when their programming language can look up variables correctly only\n*most* of the time.\n\n<aside name=\"pigeon\">\n\nPut these two funny-named mathematical rules together and you get this\nobservation: Take a birdhouse containing 365 pigeonholes, and use each pigeon's\nbirthday to assign it to a pigeonhole. You'll need only 23 randomly chosen\npigeons before you get a greater than 50% chance of two pigeons in the same box.\n\n<img src=\"image/hash-tables/pigeons.png\" alt=\"Two pigeons in the same hole.\" />\n\n</aside>\n\n### Separate chaining\n\nTechniques for resolving collisions fall into two broad categories. The first is\n**separate chaining**. Instead of each bucket containing a single entry, we let\nit contain a collection of them. In the classic implementation, each bucket\npoints to a linked list of entries. To look up an entry, you find its bucket and\nthen walk the list until you find an entry with the matching key.\n\n<img src=\"image/hash-tables/chaining.png\" alt=\"An array with eight buckets. Bucket 2 links to a chain of two nodes. 
Bucket 5 links to a single node.\" />\n\nIn catastrophically bad cases where every entry collides in the same bucket, the\ndata structure degrades into a single unsorted linked list with *O(n)* lookup.\nIn practice, it's easy to avoid that by controlling the load factor and how\nentries get scattered across buckets. In typical separate-chained hash tables,\nit's rare for a bucket to have more than one or two entries.\n\nSeparate chaining is conceptually simple -- it's literally an array of linked\nlists. Most operations are straightforward to implement, even deletion which, as\nwe'll see, can be a pain. But it's not a great fit for modern CPUs. It has a lot\nof overhead from pointers and tends to scatter little linked list <span\nname=\"node\">nodes</span> around in memory which isn't great for cache usage.\n\n<aside name=\"node\">\n\nThere are a few tricks to optimize this. Many implementations store the first\nentry right in the bucket so that in the common case where there's only one, no\nextra pointer indirection is needed. You can also make each linked list node\nstore a few entries to reduce the pointer overhead.\n\n</aside>\n\n### Open addressing\n\nThe other technique is <span name=\"open\">called</span> **open addressing** or\n(confusingly) **closed hashing**. With this technique, all entries live directly\nin the bucket array, with one entry per bucket. If two entries collide in the\nsame bucket, we find a different empty bucket to use instead.\n\n<aside name=\"open\">\n\nIt's called \"open\" addressing because the entry may end up at an address\n(bucket) outside of its preferred one. It's called \"closed\" hashing because all\nof the entries stay inside the array of buckets.\n\n</aside>\n\nStoring all entries in a single, big, contiguous array is great for keeping the\nmemory representation simple and fast. But it makes all of the operations on the\nhash table more complex. 
When inserting an entry, its bucket may be full,\nsending us to look at another bucket. That bucket itself may be occupied and so\non. This process of finding an available bucket is called **probing**, and the\norder that you examine buckets is a **probe sequence**.\n\nThere are a <span name=\"probe\">number</span> of algorithms for determining\nwhich buckets to probe and how to decide which entry goes in which bucket.\nThere's been a ton of research here because even slight tweaks can have a large\nperformance impact. And, on a data structure as heavily used as hash tables,\nthat performance impact touches a very large number of real-world programs\nacross a range of hardware capabilities.\n\n<aside name=\"probe\">\n\nIf you'd like to learn more (and you should, because some of these are really\ncool), look into \"double hashing\", \"cuckoo hashing\", \"Robin Hood hashing\", and\nanything those lead you to.\n\n</aside>\n\nAs usual in this book, we'll pick the simplest one that gets the job done\nefficiently. That's good old **linear probing**. When looking for an entry, we\nlook in the first bucket its key maps to. If it's not in there, we look in the\nvery next element in the array, and so on. If we reach the end, we wrap back\naround to the beginning.\n\nThe good thing about linear probing is that it's cache friendly. Since you walk\nthe array directly in memory order, it keeps the CPU's cache lines full and\nhappy. The bad thing is that it's prone to **clustering**. If you have a lot of\nentries with numerically similar key values, you can end up with a lot of\ncolliding, overflowing buckets right next to each other.\n\nCompared to separate chaining, open addressing can be harder to wrap your head\naround. I think of open addressing as similar to separate chaining except that\nthe \"list\" of nodes is threaded through the bucket array itself. 
Instead of\nstoring the links between them in pointers, the connections are calculated\nimplicitly by the order that you look through the buckets.\n\nThe tricky part is that more than one of these implicit lists may be interleaved\ntogether. Let's walk through an example that covers all the interesting cases.\nWe'll ignore values for now and just worry about a set of keys. We start with an\nempty array of 8 buckets.\n\n<img src=\"image/hash-tables/insert-1.png\" alt=\"An array with eight empty buckets.\" class=\"wide\" />\n\nWe decide to insert \"bagel\". The first letter, \"b\" (ASCII value 98), modulo the\narray size (8) puts it in bucket 2.\n\n<img src=\"image/hash-tables/insert-2.png\" alt=\"Bagel goes into bucket 2.\" class=\"wide\" />\n\nNext, we insert \"jam\". That also wants to go in bucket 2 (106 mod 8 = 2), but\nthat bucket's taken. We keep probing to the next bucket. It's empty, so we put\nit there.\n\n<img src=\"image/hash-tables/insert-3.png\" alt=\"Jam goes into bucket 3, since 2 is full.\" class=\"wide\" />\n\nWe insert \"fruit\", which happily lands in bucket 6.\n\n<img src=\"image/hash-tables/insert-4.png\" alt=\"Fruit goes into bucket 6.\" class=\"wide\" />\n\nLikewise, \"migas\" can go in its preferred bucket 5.\n\n<img src=\"image/hash-tables/insert-5.png\" alt=\"Migas goes into bucket 5.\" class=\"wide\" />\n\nWhen we try to insert \"eggs\", it also wants to be in bucket 5. That's full, so we\nskip to 6. Bucket 6 is also full. Note that the entry in there is *not* part of\nthe same probe sequence. \"Fruit\" is in its preferred bucket, 6. So the 5 and 6\nsequences have collided and are interleaved. We skip over that and finally put\n\"eggs\" in bucket 7.\n\n<img src=\"image/hash-tables/insert-6.png\" alt=\"Eggs goes into bucket 7 because 5 and 6 are full.\" class=\"wide\" />\n\nWe run into a similar problem with \"nuts\". It can't land in 6 like it wants to.\nNor can it go into 7. So we keep going. 
But we've reached the end of the array,\nso we wrap back around to 0 and put it there.\n\n<img src=\"image/hash-tables/insert-7.png\" alt=\"Nuts wraps around to bucket 0 because 6 and 7 are full.\" class=\"wide\" />\n\nIn practice, the interleaving turns out to not be much of a problem. Even in\nseparate chaining, we need to walk the list to check each entry's key because\nmultiple keys can reduce to the same bucket. With open addressing, we need to do\nthat same check, and that also covers the case where you are stepping over\nentries that \"belong\" to a different original bucket.\n\n## Hash Functions\n\nWe can now build ourselves a reasonably efficient table for storing variable\nnames up to eight characters long, but that limitation is still annoying. In\norder to relax the last constraint, we need a way to take a string of any length\nand convert it to a fixed-size integer.\n\nFinally, we get to the \"hash\" part of \"hash table\". A **hash function** takes\nsome larger blob of data and \"hashes\" it to produce a fixed-size integer **hash\ncode** whose value depends on all of the bits of the original data. A <span\nname=\"crypto\">good</span> hash function has three main goals:\n\n<aside name=\"crypto\">\n\nHash functions are also used for cryptography. In that domain, \"good\" has a\n*much* more stringent definition to avoid exposing details about the data being\nhashed. We, thankfully, don't need to worry about those concerns for this book.\n\n</aside>\n\n*   **It must be *deterministic*.** The same input must always hash to the same\n    number. If the same variable ends up in different buckets at different\n    points in time, it's gonna get really hard to find it.\n\n*   **It must be *uniform*.** Given a typical set of inputs, it should produce a\n    wide and evenly distributed range of output numbers, with as few clumps or\n    patterns as possible. 
We want it to <span name=\"scatter\">scatter</span>\n    values across the whole numeric range to minimize collisions and clustering.\n\n*   **It must be *fast*.** Every operation on the hash table requires us to hash\n    the key first. If hashing is slow, it can potentially cancel out the speed\n    of the underlying array storage.\n\n<aside name=\"scatter\">\n\nOne of the original names for a hash table was \"scatter table\" because it takes\nthe entries and scatters them throughout the array. The word \"hash\" came from\nthe idea that a hash function takes the input data, chops it up, and tosses it\nall together into a pile to come up with a single number from all of those bits.\n\n</aside>\n\nThere is a veritable pile of hash functions out there. Some are old and\noptimized for architectures no one uses anymore. Some are designed to be fast,\nothers cryptographically secure. Some take advantage of vector instructions and\ncache sizes for specific chips, others aim to maximize portability.\n\nThere are people out there for whom designing and evaluating hash functions is,\nlike, their *jam*. I admire them, but I'm not mathematically astute enough to\n*be* one. So for clox, I picked a simple, well-worn hash function called\n[FNV-1a][] that's served me fine over the years. Consider <span\nname=\"thing\">trying</span> out different ones in your code and see if they make\na difference.\n\n[fnv-1a]: http://www.isthe.com/chongo/tech/comp/fnv/\n\n<aside name=\"thing\">\n\nWho knows, maybe hash functions could turn out to be your thing too?\n\n</aside>\n\nOK, that's a quick run through of buckets, load factors, open addressing,\ncollision resolution, and hash functions. That's an awful lot of text and not a\nlot of real code. Don't worry if it still seems vague. 
Once we're done coding it\nup, it will all click into place.\n\n## Building a Hash Table\n\nThe great thing about hash tables compared to other classic techniques like\nbalanced search trees is that the actual data structure is so simple. Ours goes\ninto a new module.\n\n^code table-h\n\nA hash table is an array of entries. As in our dynamic array earlier, we keep\ntrack of both the allocated size of the array (`capacity`) and the number of\nkey/value pairs currently stored in it (`count`). The ratio of count to capacity\nis exactly the load factor of the hash table.\n\nEach entry is one of these:\n\n^code entry (1 before, 2 after)\n\nIt's a simple key/value pair. Since the key is always a <span\nname=\"string\">string</span>, we store the ObjString pointer directly instead of\nwrapping it in a Value. It's a little faster and smaller this way.\n\n<aside name=\"string\">\n\nIn clox, we only need to support keys that are strings. Handling other types of\nkeys doesn't add much complexity. As long as you can compare two objects for\nequality and reduce them to sequences of bits, it's easy to use them as hash\nkeys.\n\n</aside>\n\nTo create a new, empty hash table, we declare a constructor-like function.\n\n^code init-table-h (2 before, 2 after)\n\nWe need a new implementation file to define that. While we're at it, let's get\nall of the pesky includes out of the way.\n\n^code table-c\n\nAs in our dynamic value array type, a hash table initially starts with zero\ncapacity and a `NULL` array. We don't allocate anything until needed. Assuming\nwe do eventually allocate something, we need to be able to free it too.\n\n^code free-table-h (1 before, 2 after)\n\nAnd its glorious implementation:\n\n^code free-table\n\nAgain, it looks just like a dynamic array. In fact, you can think of a hash\ntable as basically a dynamic array with a really strange policy for inserting\nitems. 
We don't need to check for `NULL` here since `FREE_ARRAY()` already\nhandles that gracefully.\n\n### Hashing strings\n\nBefore we can start putting entries in the table, we need to, well, hash them.\nTo ensure that the entries get distributed uniformly throughout the array, we\nwant a good hash function that looks at all of the bits of the key string. If it\nlooked at, say, only the first few characters, then a series of strings that all\nshared the same prefix would end up colliding in the same bucket.\n\nOn the other hand, walking the entire string to calculate the hash is kind of\nslow. We'd lose some of the performance benefit of the hash table if we had to\nwalk the string every time we looked for a key in the table. So we'll do the\nobvious thing: cache it.\n\nOver in the \"object\" module in ObjString, we add:\n\n^code obj-string-hash (1 before, 1 after)\n\nEach ObjString stores the hash code for its string. Since strings are immutable\nin Lox, we can calculate the hash code once up front and be certain that it will\nnever get invalidated. Caching it eagerly makes a kind of sense: allocating the\nstring and copying its characters over is already an *O(n)* operation, so it's a\ngood time to also do the *O(n)* calculation of the string's hash.\n\nWhenever we call the internal function to allocate a string, we pass in its\nhash code.\n\n^code allocate-string (1 after)\n\nThat function simply stores the hash in the struct.\n\n^code allocate-store-hash (1 before, 2 after)\n\nThe fun happens over at the callers. `allocateString()` is called from two\nplaces: the function that copies a string and the one that takes ownership of an\nexisting dynamically allocated string. We'll start with the first.\n\n^code copy-string-hash (1 before, 1 after)\n\nNo magic here. 
We calculate the hash code and then pass it along.\n\n^code copy-string-allocate (2 before, 1 after)\n\nThe other string function is similar.\n\n^code take-string-hash (1 before, 1 after)\n\nThe interesting code is over here:\n\n^code hash-string\n\nThis is the actual bona fide \"hash function\" in clox. The algorithm is called\n\"FNV-1a\", and is the shortest decent hash function I know. Brevity is certainly\na virtue in a book that aims to show you every line of code.\n\nThe basic idea is pretty simple, and many hash functions follow the same\npattern. You start with some initial hash value, usually a constant with certain\ncarefully chosen mathematical properties. Then you walk the data to be hashed.\nFor each byte (or sometimes word), you mix the bits into the hash value somehow,\nand then scramble the resulting bits around some.\n\nWhat it means to \"mix\" and \"scramble\" can get pretty sophisticated. Ultimately,\nthough, the basic goal is *uniformity* -- we want the resulting hash values to\nbe as widely scattered around the numeric range as possible to avoid collisions\nand clustering.\n\n### Inserting entries\n\nNow that string objects know their hash code, we can start putting them into\nhash tables.\n\n^code table-set-h (1 before, 2 after)\n\nThis function adds the given key/value pair to the given hash table. If an entry\nfor that key is already present, the new value overwrites the old value. The\nfunction returns `true` if a new entry was added. Here's the implementation:\n\n^code table-set\n\nMost of the interesting logic is in `findEntry()` which we'll get to soon. That\nfunction's job is to take a key and figure out which bucket in the array it\nshould go in. It returns a pointer to that bucket -- the address of the Entry in\nthe array.\n\nOnce we have a bucket, inserting is straightforward. We update the hash table's\nsize, taking care to not increase the count if we overwrote the value for an\nalready-present key. 
Then we copy the key and value into the corresponding\nfields in the Entry.\n\nWe're missing a little something here, though. We haven't actually allocated the\nEntry array yet. Oops! Before we can insert anything, we need to make sure we\nhave an array, and that it's big enough.\n\n^code table-set-grow (1 before, 1 after)\n\nThis is similar to the code we wrote a while back for growing a dynamic array.\nIf we don't have enough capacity to insert an item, we reallocate and grow the\narray. The `GROW_CAPACITY()` macro takes an existing capacity and grows it by\na multiple to ensure that we get amortized constant performance over a series\nof inserts.\n\nThe interesting difference here is that `TABLE_MAX_LOAD` constant.\n\n^code max-load (2 before, 1 after)\n\nThis is how we manage the table's <span name=\"75\">load</span> factor. We don't\ngrow when the capacity is completely full. Instead, we grow the array before\nthen, when the array becomes at least 75% full.\n\n<aside name=\"75\">\n\nIdeal max load factor varies based on the hash function, collision-handling\nstrategy, and typical keysets you'll see. Since a toy language like Lox doesn't\nhave \"real world\" data sets, it's hard to optimize this, and I picked 75%\nsomewhat arbitrarily. When you build your own hash tables, benchmark and tune\nthis.\n\n</aside>\n\nWe'll get to the implementation of `adjustCapacity()` soon. First, let's look\nat that `findEntry()` function you've been wondering about.\n\n^code find-entry\n\nThis function is the real core of the hash table. It's responsible for taking a\nkey and an array of buckets, and figuring out which bucket the entry belongs in.\nThis function is also where linear probing and collision handling come into\nplay. We'll use `findEntry()` both to look up existing entries in the hash\ntable and to decide where to insert new ones.\n\nFor all that, there isn't much to it. First, we use modulo to map the key's hash\ncode to an index within the array's bounds. 
That gives us a bucket index where,\nideally, we'll be able to find or place the entry.\n\nThere are a few cases to check for:\n\n*   If the key for the Entry at that array index is `NULL`, then the bucket is\n    empty. If we're using `findEntry()` to look up something in the hash table,\n    this means it isn't there. If we're using it to insert, it means we've found\n    a place to add the new entry.\n\n*   If the key in the bucket is <span name=\"equal\">equal</span> to the key we're\n    looking for, then that key is already present in the table. If we're doing a\n    lookup, that's good -- we've found the key we seek. If we're doing an insert,\n    this means we'll be replacing the value for that key instead of adding a new\n    entry.\n\n<aside name=\"equal\">\n\nIt looks like we're using `==` to see if two strings are equal. That doesn't\nwork, does it? There could be two copies of the same string at different places\nin memory. Fear not, astute reader. We'll solve this further on. And, strangely\nenough, it's a hash table that provides the tool we need.\n\n</aside>\n\n*   Otherwise, the bucket has an entry in it, but with a different key. This is\n    a collision. In that case, we start probing. That's what that `for` loop\n    does. We start at the bucket where the entry would ideally go. If that\n    bucket is empty or has the same key, we're done. Otherwise, we advance to\n    the next element -- this is the *linear* part of \"linear probing\" -- and\n    check there. If we go past the end of the array, that second modulo operator\n    wraps us back around to the beginning.\n\nWe exit the loop when we find either an empty bucket or a bucket with the same\nkey as the one we're looking for. You might be wondering about an infinite loop.\nWhat if we collide with *every* bucket? Fortunately, that can't happen thanks to\nour load factor. 
Because we grow the array as soon as it gets close to being\nfull, we know there will always be empty buckets.\n\nWe return directly from within the loop, yielding a pointer to the found Entry\nso the caller can either insert something into it or read from it. Way back in\n`tableSet()`, the function that first kicked this off, we store the new entry in\nthat returned bucket and we're done.\n\n### Allocating and resizing\n\nBefore we can put entries in the hash table, we do need a place to actually\nstore them. We need to allocate an array of buckets. That happens in this\nfunction:\n\n^code table-adjust-capacity\n\nWe create a bucket array with `capacity` entries. After we allocate the array,\nwe initialize every element to be an empty bucket and then store the array (and\nits capacity) in the hash table's main struct. This code is fine for when we\ninsert the very first entry into the table, and we require the first allocation\nof the array. But what about when we already have one and we need to grow it?\n\nBack when we were doing a dynamic array, we could just use `realloc()` and let\nthe C standard library copy everything over. That doesn't work for a hash table.\nRemember that to choose the bucket for each entry, we take its hash key *modulo\nthe array size*. That means that when the array size changes, entries may end up\nin different buckets.\n\nThose new buckets may have new collisions that we need to deal with. So the\nsimplest way to get every entry where it belongs is to rebuild the table from\nscratch by re-inserting every entry into the new empty array.\n\n^code re-hash (2 before, 2 after)\n\nWe walk through the old array front to back. Any time we find a non-empty\nbucket, we insert that entry into the new array. We use `findEntry()`, passing\nin the *new* array instead of the one currently stored in the Table. (This is\nwhy `findEntry()` takes a pointer directly to an Entry array and not the whole\n`Table` struct. 
That way, we can pass the new array and capacity before we've\nstored those in the struct.)\n\nAfter that's done, we can release the memory for the old array.\n\n^code free-old-array (3 before, 1 after)\n\nWith that, we have a hash table that we can stuff as many entries into as we\nlike. It handles overwriting existing keys and growing itself as needed to\nmaintain the desired load factor.\n\nWhile we're at it, let's also define a helper function for copying all of the\nentries of one hash table into another.\n\n^code table-add-all-h (1 before, 2 after)\n\nWe won't need this until much later when we support method inheritance, but we\nmay as well implement it now while we've got all the hash table stuff fresh in\nour minds.\n\n^code table-add-all\n\nThere's not much to say about this. It walks the bucket array of the source hash\ntable. Whenever it finds a non-empty bucket, it adds the entry to the\ndestination hash table using the `tableSet()` function we recently defined.\n\n### Retrieving values\n\nNow that our hash table contains some stuff, let's start pulling things back\nout. Given a key, we can look up the corresponding value, if there is one, with\nthis function:\n\n^code table-get-h (1 before, 1 after)\n\nYou pass in a table and a key. If it finds an entry with that key, it returns\n`true`; otherwise, it returns `false`. If the entry exists, the `value` output\nparameter points to the resulting value.\n\nSince `findEntry()` already does the hard work, the implementation isn't bad.\n\n^code table-get\n\nIf the table is completely empty, we definitely won't find the entry, so we\ncheck for that first. This isn't just an optimization -- it also ensures that we\ndon't try to access the bucket array when the array is `NULL`. Otherwise, we let\n`findEntry()` work its magic. That returns a pointer to a bucket. If the bucket\nis empty, which we detect by seeing if the key is `NULL`, then we didn't find an\nEntry with our key. 
If `findEntry()` does return a non-empty Entry, then that's\nour match. We take the Entry's value and copy it to the output parameter so the\ncaller can get it. Piece of cake.\n\n### Deleting entries\n\nThere is one more fundamental operation a full-featured hash table needs to\nsupport: removing an entry. This seems pretty obvious: if you can add things,\nyou should be able to *un*-add them, right? But you'd be surprised how many\ntutorials on hash tables omit this.\n\nI could have taken that route too. In fact, we use deletion in clox only in a\ntiny edge case in the VM. But if you want to actually understand how to\ncompletely implement a hash table, this feels important. I can sympathize with\ntheir desire to overlook it. As we'll see, deleting from a hash table that uses\n<span name=\"delete\">open</span> addressing is tricky.\n\n<aside name=\"delete\">\n\nWith separate chaining, deleting is as easy as removing a node from a linked\nlist.\n\n</aside>\n\nAt least the declaration is simple.\n\n^code table-delete-h (1 before, 1 after)\n\nThe obvious approach is to mirror insertion. Use `findEntry()` to look up the\nentry's bucket. Then clear out the bucket. Done!\n\nIn cases where there are no collisions, that works fine. But if a collision has\noccurred, then the bucket where the entry lives may be part of one or more\nimplicit probe sequences. For example, here's a hash table containing three keys\nall with the same preferred bucket, 2:\n\n<img src=\"image/hash-tables/delete-1.png\" alt=\"A hash table containing 'bagel' in bucket 2, 'biscuit' in bucket 3, and 'jam' in bucket 4.\" />\n\nRemember that when we're walking a probe sequence to find an entry, we know\nwe've reached the end of a sequence and that the entry isn't present when we hit\nan empty bucket. 
It's like the probe sequence is a list of entries and an empty\nentry terminates that list.\n\nIf we delete \"biscuit\" by simply clearing the Entry, then we break that probe\nsequence in the middle, leaving the trailing entries orphaned and unreachable.\nSort of like removing a node from a linked list without relinking the pointer\nfrom the previous node to the next one.\n\nIf we later try to look for \"jam\", we'd start at \"bagel\", stop at the next\nempty Entry, and never find it.\n\n<img src=\"image/hash-tables/delete-2.png\" alt=\"The 'biscuit' entry has been deleted from the hash table, breaking the chain.\" />\n\nTo solve this, most implementations use a trick called <span\nname=\"tombstone\">**tombstones**</span>. Instead of clearing the entry on\ndeletion, we replace it with a special sentinel entry called a \"tombstone\". When\nwe are following a probe sequence during a lookup, and we hit a tombstone, we\n*don't* treat it like an empty slot and stop iterating. Instead, we keep going\nso that deleting an entry doesn't break any implicit collision chains and we can\nstill find entries after it.\n\n<img src=\"image/hash-tables/delete-3.png\" alt=\"Instead of deleting 'biscuit', it's replaced with a tombstone.\" />\n\nThe code looks like this:\n\n^code table-delete\n\nFirst, we find the bucket containing the entry we want to delete. (If we don't\nfind it, there's nothing to delete, so we bail out.) We replace the entry with a\ntombstone. In clox, we use a `NULL` key and a `true` value to represent that,\nbut any representation that can't be confused with an empty bucket or a valid\nentry works.\n\n<aside name=\"tombstone\">\n\n<img src=\"image/hash-tables/tombstone.png\" alt=\"A tombstone inscribed 'Here lies entry biscuit &rarr; 3.75, gone but not deleted'.\" />\n\n</aside>\n\nThat's all we need to do to delete an entry. Simple and fast. But all of the\nother operations need to correctly handle tombstones too. A tombstone is a sort\nof \"half\" entry. 
It has some of the characteristics of a present entry, and some\nof the characteristics of an empty one.\n\nWhen we are following a probe sequence during a lookup, and we hit a tombstone,\nwe note it and keep going.\n\n^code find-tombstone (2 before, 2 after)\n\nThe first time we pass a tombstone, we store it in this local variable:\n\n^code find-entry-tombstone (1 before, 1 after)\n\nIf we reach a truly empty entry, then the key isn't present. In that case, if we\nhave passed a tombstone, we return its bucket instead of the later empty one. If\nwe're calling `findEntry()` in order to insert a node, that lets us treat the\ntombstone bucket as empty and reuse it for the new entry.\n\nReusing tombstone slots automatically like this helps reduce the number of\ntombstones wasting space in the bucket array. In typical use cases where there\nis a mixture of insertions and deletions, the number of tombstones grows for a\nwhile and then tends to stabilize.\n\nEven so, there's no guarantee that a large number of deletes won't cause the\narray to be full of tombstones. In the very worst case, we could end up with\n*no* empty buckets. That would be bad because, remember, the only thing\npreventing an infinite loop in `findEntry()` is the assumption that we'll\neventually hit an empty bucket.\n\nSo we need to be thoughtful about how tombstones interact with the table's load\nfactor and resizing. The key question is, when calculating the load factor,\nshould we treat tombstones like full buckets or empty ones?\n\n### Counting tombstones\n\nIf we treat tombstones like full buckets, then we may end up with a bigger array\nthan we probably need because it artificially inflates the load factor. 
There\nare tombstones we could reuse, but they aren't treated as unused so we end up\ngrowing the array prematurely.\n\nBut if we treat tombstones like empty buckets and *don't* include them in the\nload factor, then we run the risk of ending up with *no* actual empty buckets to\nterminate a lookup. An infinite loop is a much worse problem than a few extra\narray slots, so for load factor, we consider tombstones to be full buckets.\n\nThat's why we don't reduce the count when deleting an entry in the previous\ncode. The count is no longer the number of entries in the hash table, it's the\nnumber of entries plus tombstones. That implies that we increment the count\nduring insertion only if the new entry goes into an entirely empty bucket.\n\n^code set-increment-count (1 before, 2 after)\n\nIf we are replacing a tombstone with a new entry, the bucket has already been\naccounted for and the count doesn't change.\n\nWhen we resize the array, we allocate a new array and re-insert all of the\nexisting entries into it. During that process, we *don't* copy the tombstones\nover. They don't add any value since we're rebuilding the probe sequences\nanyway, and would just slow down lookups. That means we need to recalculate the\ncount since it may change during a resize. So we clear it out:\n\n^code resize-init-count (2 before, 1 after)\n\nThen each time we find a non-tombstone entry, we increment it.\n\n^code resize-increment-count (1 before, 1 after)\n\nThis means that when we grow the capacity, we may end up with *fewer* entries in\nthe resulting larger array because all of the tombstones get discarded. That's a\nlittle wasteful, but not a huge practical problem.\n\nI find it interesting that much of the work to support deleting entries is in\n`findEntry()` and `adjustCapacity()`. The actual delete logic is quite simple\nand fast. 
In practice, deletions tend to be rare, so you'd expect a hash table\nto do as much work as it can in the delete function and leave the other\nfunctions alone to keep them faster. With our tombstone approach, deletes are\nfast, but lookups get penalized.\n\nI did a little benchmarking to test this out in a few different deletion\nscenarios. I was surprised to discover that tombstones did end up being faster\noverall compared to doing all the work during deletion to reinsert the affected\nentries.\n\nBut if you think about it, it's not that the tombstone approach pushes the work\nof fully deleting an entry to other operations, it's more that it makes deleting\n*lazy*. At first, it does the minimal work to turn the entry into a tombstone.\nThat can cause a penalty when later lookups have to skip over it. But it also\nallows that tombstone bucket to be reused by a later insert too. That reuse is a\nvery efficient way to avoid the cost of rearranging all of the following\naffected entries. You basically recycle a node in the chain of probed entries.\nIt's a neat trick.\n\n## String Interning\n\nWe've got ourselves a hash table that mostly works, though it has a critical\nflaw in its center. Also, we aren't using it for anything yet. It's time to\naddress both of those and, in the process, learn a classic technique used by\ninterpreters.\n\nThe reason the hash table doesn't totally work is that when `findEntry()` checks\nto see if an existing key matches the one it's looking for, it uses `==` to\ncompare two strings for equality. That only returns true if the two keys are the\nexact same string in memory. Two separate strings with the same characters\nshould be considered equal, but aren't.\n\nRemember, back when we added strings in the last chapter, we added [explicit\nsupport to compare the strings character-by-character][equals] in order to get\ntrue value equality. 
We could do that in `findEntry()`, but that's <span\nname=\"hash-collision\">slow</span>.\n\n[equals]: strings.html#operations-on-strings\n\n<aside name=\"hash-collision\">\n\nIn practice, we would first compare the hash codes of the two strings. That\nquickly detects almost all different strings -- it wouldn't be a very good hash\nfunction if it didn't. But when the two hashes are the same, we still have to\ncompare characters to make sure we didn't have a hash collision on different\nstrings.\n\n</aside>\n\nInstead, we'll use a technique called **string interning**. The core problem is\nthat it's possible to have different strings in memory with the same characters.\nThose need to behave like equivalent values even though they are distinct\nobjects. They're essentially duplicates, and we have to compare all of their\nbytes to detect that.\n\n<span name=\"intern\">String interning</span> is a process of deduplication. We\ncreate a collection of \"interned\" strings. Any string in that collection is\nguaranteed to be textually distinct from all others. When you intern a string,\nyou look for a matching string in the collection. If found, you use that\noriginal one. Otherwise, the string you have is unique, so you add it to the\ncollection.\n\n<aside name=\"intern\">\n\nI'm guessing \"intern\" is short for \"internal\". I think the idea is that the\nlanguage's runtime keeps its own \"internal\" collection of these strings, whereas\nother strings could be user created and floating around in memory. When you\nintern a string, you ask the runtime to add the string to that internal\ncollection and return a pointer to it.\n\nLanguages vary in how much string interning they do and how it's exposed to the\nuser. Lua interns *all* strings, which is what clox will do too. Lisp, Scheme,\nSmalltalk, Ruby and others have a separate string-like type called \"symbol\" that\nis implicitly interned. 
(This is why they say symbols are \"faster\" in Ruby.)\nJava interns constant strings by default, and provides an API to let you\nexplicitly intern any string you give it.\n\n</aside>\n\nIn this way, you know that each sequence of characters is represented by only\none string in memory. This makes value equality trivial. If two strings point\nto the same address in memory, they are obviously the same string and must be\nequal. And, because we know strings are unique, if two strings point to\ndifferent addresses, they must be distinct strings.\n\nThus, pointer equality exactly matches value equality. Which in turn means that\nour existing `==` in `findEntry()` does the right thing. Or, at least, it will\nonce we intern all the strings. In order to reliably deduplicate all strings,\nthe VM needs to be able to find every string that's created. We do that by\ngiving it a hash table to store them all.\n\n^code vm-strings (1 before, 1 after)\n\nAs usual, we need an include.\n\n^code vm-include-table (1 before, 1 after)\n\nWhen we spin up a new VM, the string table is empty.\n\n^code init-strings (1 before, 1 after)\n\nAnd when we shut down the VM, we clean up any resources used by the table.\n\n^code free-strings (1 before, 1 after)\n\nSome languages have a separate type or an explicit step to intern a string. For\nclox, we'll automatically intern every one. That means whenever we create a new\nunique string, we add it to the table.\n\n^code allocate-store-string (1 before, 1 after)\n\nWe're using the table more like a hash *set* than a hash *table*. The keys are\nthe strings and those are all we care about, so we just use `nil` for the\nvalues.\n\nThis gets a string into the table assuming that it's unique, but we need to\nactually check for duplication before we get here. We do that in the two\nhigher-level functions that call `allocateString()`. 
Here's one:\n\n^code copy-string-intern (1 before, 1 after)\n\nWhen copying a string into a new ObjString, we look it up in the string table\nfirst. If we find it, instead of \"copying\", we just return a reference to that\nstring. Otherwise, we fall through, allocate a new string, and store it in the\nstring table.\n\nTaking ownership of a string is a little different.\n\n^code take-string-intern (1 before, 1 after)\n\nAgain, we look up the string in the string table first. If we find it, before we\nreturn it, we free the memory for the string that was passed in. Since ownership\nis being passed to this function and we no longer need the duplicate string,\nit's up to us to free it.\n\nBefore we get to the new function we need to write, there's one more include.\n\n^code object-include-table (1 before, 1 after)\n\nTo look for a string in the table, we can't use the normal `tableGet()` function\nbecause that calls `findEntry()`, which has the exact problem with duplicate\nstrings that we're trying to fix right now. Instead, we use this new function:\n\n^code table-find-string-h (1 before, 2 after)\n\nThe implementation looks like so:\n\n^code table-find-string\n\nIt appears we have copy-pasted `findEntry()`. There is a lot of redundancy, but\nalso a couple of key differences. First, we pass in the raw character array of\nthe key we're looking for instead of an ObjString. At the point that we call\nthis, we haven't created an ObjString yet.\n\nSecond, when checking to see if we found the key, we look at the actual strings.\nWe first see if they have matching lengths and hashes. Those are quick to check\nand if they aren't equal, the strings definitely aren't the same.\n\nIf there is a hash collision, we do an actual character-by-character string\ncomparison. This is the one place in the VM where we actually test strings for\ntextual equality. 
We do it here to deduplicate strings and then the rest of the\nVM can take for granted that any two strings at different addresses in memory\nmust have different contents.\n\nIn fact, now that we've interned all the strings, we can take advantage of it in\nthe bytecode interpreter. When a user does `==` on two objects that happen to be\nstrings, we don't need to test the characters any more.\n\n^code equal (1 before, 1 after)\n\nWe've added a little overhead when creating strings to intern them. But in\nreturn, at runtime, the equality operator on strings is much faster. With that,\nwe have a full-featured hash table ready for us to use for tracking variables,\ninstances, or any other key-value pairs that might show up.\n\nWe also sped up testing strings for equality. This is nice for when the user\ndoes `==` on strings. But it's even more critical in a dynamically typed\nlanguage like Lox where method calls and instance fields are looked up by name\nat runtime. If testing a string for equality is slow, then that means looking up\na method by name is slow. And if *that's* slow in your object-oriented language,\nthen *everything* is slow.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  In clox, we happen to only need keys that are strings, so the hash table we\n    built is hardcoded for that key type. If we exposed hash tables to Lox users\n    as a first-class collection, it would be useful to support different kinds\n    of keys.\n\n    Add support for keys of the other primitive types: numbers, Booleans, and\n    `nil`. Later, clox will support user-defined classes. If we want to support\n    keys that are instances of those classes, what kind of complexity does that\n    add?\n\n1.  Hash tables have a lot of knobs you can tweak that affect their performance.\n    You decide whether to use separate chaining or open addressing. 
Depending on\n    which fork in that road you take, you can tune how many entries are stored\n    in each node, or the probing strategy you use. You control the hash\n    function, load factor, and growth rate.\n\n    All of this variety wasn't created just to give CS doctoral candidates\n    something to <span name=\"publish\">publish</span> theses on: each has its\n    uses in the many varied domains and hardware scenarios where hashing comes\n    into play. Look up a few hash table implementations in different open source\n    systems, research the choices they made, and try to figure out why they did\n    things that way.\n\n    <aside name=\"publish\">\n\n    Well, at least that wasn't the *only* reason they were created. Whether that\n    was the *main* reason is up for debate.\n\n    </aside>\n\n1.  Benchmarking a hash table is notoriously difficult. A hash table\n    implementation may perform well with some keysets and poorly with others. It\n    may work well at small sizes but degrade as it grows, or vice versa. It may\n    choke when deletions are common, but fly when they aren't. Creating\n    benchmarks that accurately represent how your users will use the hash table\n    is a challenge.\n\n    Write a handful of different benchmark programs to validate our hash table\n    implementation. How does the performance vary between them? Why did you\n    choose the specific test cases you chose?\n\n</div>\n"
  },
  {
    "path": "book/index.md",
    "content": "This text is not used. All of the content is in the index.html template.\n"
  },
  {
    "path": "book/inheritance.md",
    "content": "> Once we were blobs in the sea, and then fishes, and then lizards and rats and\n> then monkeys, and hundreds of things in between. This hand was once a fin,\n> this hand once had claws! In my human mouth I have the pointy teeth of a wolf\n> and the chisel teeth of a rabbit and the grinding teeth of a cow! Our blood is\n> as salty as the sea we used to live in! When we're frightened, the hair on our\n> skin stands up, just like it did when we had fur. We are history! Everything\n> we've ever been on the way to becoming us, we still are.\n>\n> <cite>Terry Pratchett, <em>A Hat Full of Sky</em></cite>\n\nCan you believe it? We've reached the last chapter of [Part II][]. We're almost\ndone with our first Lox interpreter. The [previous chapter][] was a big ball of\nintertwined object-orientation features. I couldn't separate those from each\nother, but I did manage to untangle one piece. In this chapter, we'll finish\noff Lox's class support by adding inheritance.\n\n[part ii]: a-tree-walk-interpreter.html\n[previous chapter]: classes.html\n\nInheritance appears in object-oriented languages all the way back to the <span\nname=\"inherited\">first</span> one, [Simula][]. Early on, Kristen Nygaard and\nOle-Johan Dahl noticed commonalities across classes in the simulation programs\nthey wrote. Inheritance gave them a way to reuse the code for those similar\nparts.\n\n[simula]: https://en.wikipedia.org/wiki/Simula\n\n<aside name=\"inherited\">\n\nYou could say all those other languages *inherited* it from Simula. Hey-ooo!\nI'll, uh, see myself out.\n\n</aside>\n\n## Superclasses and Subclasses\n\nGiven that the concept is \"inheritance\", you would hope they would pick a\nconsistent metaphor and call them \"parent\" and \"child\" classes, but that would\nbe too easy. Way back when, C. A. R. Hoare coined the term \"<span\nname=\"subclass\">subclass</span>\" to refer to a record type that refines another\ntype. 
Simula borrowed that term to refer to a *class* that inherits from\nanother. I don't think it was until Smalltalk came along that someone flipped\nthe Latin prefix to get \"superclass\" to refer to the other side of the\nrelationship. From C++, you also hear \"base\" and \"derived\" classes. I'll mostly\nstick with \"superclass\" and \"subclass\".\n\n<aside name=\"subclass\">\n\n\"Super-\" and \"sub-\" mean \"above\" and \"below\" in Latin, respectively. Picture an\ninheritance tree like a family tree with the root at the top -- subclasses are\nbelow their superclasses on the diagram. More generally, \"sub-\" refers to things\nthat refine or are contained by some more general concept. In zoology, a\nsubclass is a finer categorization of a larger class of living things.\n\nIn set theory, a subset is contained by a larger superset which has all of the\nelements of the subset and possibly more. Set theory and programming languages\nmeet each other in type theory. There, you have \"supertypes\" and \"subtypes\".\n\nIn statically typed object-oriented languages, a subclass is also often a\nsubtype of its superclass. Say we have a Doughnut superclass and a BostonCream\nsubclass. Every BostonCream is also an instance of Doughnut, but there may be\ndoughnut objects that are not BostonCreams (like Crullers).\n\nThink of a type as the set of all values of that type. The set of all Doughnut\ninstances contains the set of all BostonCream instances since every BostonCream\nis also a Doughnut. So BostonCream is a subclass, and a subtype, and its\ninstances are a subset. It all lines up.\n\n<img src=\"image/inheritance/doughnuts.png\" alt=\"Boston cream &lt;: doughnut.\" />\n\n</aside>\n\nOur first step towards supporting inheritance in Lox is a way to specify a\nsuperclass when declaring a class. There's a lot of variety in syntax for this.\nC++ and C# place a `:` after the subclass's name, followed by the superclass\nname. Java uses `extends` instead of the colon. 
Python puts the superclass(es)\nin parentheses after the class name. Simula puts the superclass's name *before*\nthe `class` keyword.\n\nThis late in the game, I'd rather not add a new reserved word or token to the\nlexer. We don't have `extends` or even `:`, so we'll follow Ruby and use a\nless-than sign (`<`).\n\n```lox\nclass Doughnut {\n  // General doughnut stuff...\n}\n\nclass BostonCream < Doughnut {\n  // Boston Cream-specific stuff...\n}\n```\n\nTo work this into the grammar, we add a new optional clause in our existing\n`classDecl` rule.\n\n```ebnf\nclassDecl      → \"class\" IDENTIFIER ( \"<\" IDENTIFIER )?\n                 \"{\" function* \"}\" ;\n```\n\nAfter the class name, you can have a `<` followed by the superclass's name. The\nsuperclass clause is optional because you don't *have* to have a superclass.\nUnlike some other object-oriented languages like Java, Lox has no root \"Object\"\nclass that everything inherits from, so when you omit the superclass clause, the\nclass has *no* superclass, not even an implicit one.\n\nWe want to capture this new syntax in the class declaration's AST node.\n\n^code superclass-ast (1 before, 1 after)\n\nYou might be surprised that we store the superclass name as an Expr.Variable,\nnot a Token. The grammar restricts the superclass clause to a single identifier,\nbut at runtime, that identifier is evaluated as a variable access. Wrapping the\nname in an Expr.Variable early on in the parser gives us an object that the\nresolver can hang the resolution information off of.\n\nThe new parser code follows the grammar directly.\n\n^code parse-superclass (1 before, 1 after)\n\nOnce we've (possibly) parsed a superclass declaration, we store it in the AST.\n\n^code construct-class-ast (2 before, 1 after)\n\nIf we didn't parse a superclass clause, the superclass expression will be\n`null`. We'll have to make sure the later passes check for that. 
The first of\nthose is the resolver.\n\n^code resolve-superclass (1 before, 2 after)\n\nThe class declaration AST node has a new subexpression, so we traverse into and\nresolve that. Since classes are usually declared at the top level, the\nsuperclass name will most likely be a global variable, so this doesn't usually\ndo anything useful. However, Lox allows class declarations even inside blocks,\nso it's possible the superclass name refers to a local variable. In that case,\nwe need to make sure it's resolved.\n\nBecause even well-intentioned programmers sometimes write weird code, there's a\nsilly edge case we need to worry about while we're in here. Take a look at this:\n\n```lox\nclass Oops < Oops {}\n```\n\nThere's no way this will do anything useful, and if we let the runtime try to\nrun this, it will break the expectation the interpreter has about there not\nbeing cycles in the inheritance chain. The safest thing is to detect this case\nstatically and report it as an error.\n\n^code inherit-self (2 before, 1 after)\n\nAssuming the code resolves without error, the AST travels to the interpreter.\n\n^code interpret-superclass (1 before, 1 after)\n\nIf the class has a superclass expression, we evaluate it. Since that could\npotentially evaluate to some other kind of object, we have to check at runtime\nthat the thing we want to be the superclass is actually a class. Bad things\nwould happen if we allowed code like:\n\n```lox\nvar NotAClass = \"I am totally not a class\";\n\nclass Subclass < NotAClass {} // ?!\n```\n\nAssuming that check passes, we continue on. Executing a class declaration turns\nthe syntactic representation of a class -- its AST node -- into its runtime\nrepresentation, a LoxClass object. We need to plumb the superclass through to\nthat too. 
We pass the superclass to the constructor.\n\n^code interpreter-construct-class (3 before, 1 after)\n\nThe constructor stores it in a field.\n\n^code lox-class-constructor (1 after)\n\nWhich we declare here:\n\n^code lox-class-superclass-field (1 before, 1 after)\n\nWith that, we can define classes that are subclasses of other classes. Now, what\ndoes having a superclass actually *do?*\n\n## Inheriting Methods\n\nInheriting from another class means that everything that's <span\nname=\"liskov\">true</span> of the superclass should be true, more or less, of the\nsubclass. In statically typed languages, that carries a lot of implications. The\nsub*class* must also be a sub*type*, and the memory layout is controlled so that\nyou can pass an instance of a subclass to a function expecting a superclass and\nit can still access the inherited fields correctly.\n\n<aside name=\"liskov\">\n\nA fancier name for this hand-wavey guideline is the [*Liskov substitution\nprinciple*][liskov]. Barbara Liskov introduced it in a keynote during the\nformative period of object-oriented programming.\n\n[liskov]: https://en.wikipedia.org/wiki/Liskov_substitution_principle\n\n</aside>\n\nLox is a dynamically typed language, so our requirements are much simpler.\nBasically, it means that if you can call some method on an instance of the\nsuperclass, you should be able to call that method when given an instance of the\nsubclass. In other words, methods are inherited from the superclass.\n\nThis lines up with one of the goals of inheritance -- to give users a way to\nreuse code across classes. Implementing this in our interpreter is\nastonishingly easy.\n\n^code find-method-recurse-superclass (3 before, 1 after)\n\nThat's literally all there is to it. When we are looking up a method on an\ninstance, if we don't find it on the instance's class, we recurse up through the\nsuperclass chain and look there. 
Give it a try:\n\n```lox\nclass Doughnut {\n  cook() {\n    print \"Fry until golden brown.\";\n  }\n}\n\nclass BostonCream < Doughnut {}\n\nBostonCream().cook();\n```\n\nThere we go, half of our inheritance features are complete with only three lines\nof Java code.\n\n## Calling Superclass Methods\n\nIn `findMethod()` we look for a method on the current class *before* walking up\nthe superclass chain. If a method with the same name exists in both the subclass\nand the superclass, the subclass one takes precedence or **overrides** the\nsuperclass method. Sort of like how variables in inner scopes shadow outer ones.\n\nThat's great if the subclass wants to *replace* some superclass behavior\ncompletely. But, in practice, subclasses often want to *refine* the superclass's\nbehavior. They want to do a little work specific to the subclass, but also\nexecute the original superclass behavior too.\n\nHowever, since the subclass has overridden the method, there's no way to refer\nto the original one. If the subclass method tries to call it by name, it will\njust recursively hit its own override. We need a way to say \"Call this method,\nbut look for it directly on my superclass and ignore my override\". Java uses\n`super` for this, and we'll use that same syntax in Lox. Here is an example:\n\n```lox\nclass Doughnut {\n  cook() {\n    print \"Fry until golden brown.\";\n  }\n}\n\nclass BostonCream < Doughnut {\n  cook() {\n    super.cook();\n    print \"Pipe full of custard and coat with chocolate.\";\n  }\n}\n\nBostonCream().cook();\n```\n\nIf you run this, it should print:\n\n```text\nFry until golden brown.\nPipe full of custard and coat with chocolate.\n```\n\nWe have a new expression form. The `super` keyword, followed by a dot and an\nidentifier, looks for a method with that name. Unlike calls on `this`, the search\nstarts at the superclass.\n\n### Syntax\n\nWith `this`, the keyword works sort of like a magic variable, and the expression\nis that one lone token. 
But with `super`, the subsequent `.` and property name\nare inseparable parts of the `super` expression. You can't have a bare `super`\ntoken all by itself.\n\n```lox\nprint super; // Syntax error.\n```\n\nSo the new clause we add to the `primary` rule in our grammar includes the\nproperty access as well.\n\n```ebnf\nprimary        → \"true\" | \"false\" | \"nil\" | \"this\"\n               | NUMBER | STRING | IDENTIFIER | \"(\" expression \")\"\n               | \"super\" \".\" IDENTIFIER ;\n```\n\nTypically, a `super` expression is used for a method call, but, as with regular\nmethods, the argument list is *not* part of the expression. Instead, a super\n*call* is a super *access* followed by a function call. Like other method calls,\nyou can get a handle to a superclass method and invoke it separately.\n\n```lox\nvar method = super.cook;\nmethod();\n```\n\nSo the `super` expression itself contains only the token for the `super` keyword\nand the name of the method being looked up. The corresponding <span\nname=\"super-ast\">syntax tree node</span> is thus:\n\n^code super-expr (1 before, 1 after)\n\n<aside name=\"super-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-super].\n\n[appendix-super]: appendix-ii.html#super-expression\n\n</aside>\n\nFollowing the grammar, the new parsing code goes inside our existing `primary()`\nmethod.\n\n^code parse-super (2 before, 2 after)\n\nA leading `super` keyword tells us we've hit a `super` expression. After that we\nconsume the expected `.` and method name.\n\n### Semantics\n\nEarlier, I said a `super` expression starts the method lookup from \"the\nsuperclass\", but *which* superclass? The naïve answer is the superclass of\n`this`, the object the surrounding method was called on. 
That coincidentally\nproduces the right behavior in a lot of cases, but that's not actually correct.\nGaze upon:\n\n```lox\nclass A {\n  method() {\n    print \"A method\";\n  }\n}\n\nclass B < A {\n  method() {\n    print \"B method\";\n  }\n\n  test() {\n    super.method();\n  }\n}\n\nclass C < B {}\n\nC().test();\n```\n\nTranslate this program to Java, C#, or C++ and it will print \"A method\", which\nis what we want Lox to do too. When this program runs, inside the body of\n`test()`, `this` is an instance of C. The superclass of C is B, but that is\n*not* where the lookup should start. If it did, we would hit B's `method()`.\n\nInstead, lookup should start on the superclass of *the class containing the\n`super` expression*. In this case, since `test()` is defined inside B, the\n`super` expression inside it should start the lookup on *B*&rsquo;s superclass\n-- A.\n\n<span name=\"flow\"></span>\n\n<img src=\"image/inheritance/classes.png\" alt=\"The call chain flowing through the classes.\" />\n\n<aside name=\"flow\">\n\nThe execution flow looks something like this:\n\n1. We call `test()` on an instance of C.\n\n2. That enters the `test()` method inherited from B. That calls\n   `super.method()`.\n\n3. The superclass of B is A, so that chains to `method()` on A, and the program\n   prints \"A method\".\n\n</aside>\n\nThus, in order to evaluate a `super` expression, we need access to the\nsuperclass of the class definition surrounding the call. Alack and alas, at the\npoint in the interpreter where we are executing a `super` expression, we don't\nhave that easily available.\n\nWe *could* add a field to LoxFunction to store a reference to the LoxClass that\nowns that method. The interpreter would keep a reference to the\ncurrently executing LoxFunction so that we could look it up later when we hit a\n`super` expression. From there, we'd get the LoxClass of the method, then its\nsuperclass.\n\nThat's a lot of plumbing. 
In the [last chapter][], we had a similar problem when\nwe needed to add support for `this`. In that case, we used our existing\nenvironment and closure mechanism to store a reference to the current object.\nCould we do something similar for storing the superclass<span\nname=\"rhetorical\">?</span> Well, I probably wouldn't be talking about it if the\nanswer was no, so... yes.\n\n<aside name=\"rhetorical\">\n\nDoes anyone even like rhetorical questions?\n\n</aside>\n\n[last chapter]: classes.html\n\nOne important difference is that we bound `this` when the method was *accessed*.\nThe same method can be called on different instances and each needs its own\n`this`. With `super` expressions, the superclass is a fixed property of the\n*class declaration itself*. Every time you evaluate some `super` expression, the\nsuperclass is always the same.\n\nThat means we can create the environment for the superclass once, when the class\ndefinition is executed. Immediately before we define the methods, we make a new\nenvironment to bind the class's superclass to the name `super`.\n\n<img src=\"image/inheritance/superclass.png\" alt=\"The superclass environment.\" />\n\nWhen we create the LoxFunction runtime representation for each method, that is\nthe environment they will capture in their closure. Later, when a method is\ninvoked and `this` is bound, the superclass environment becomes the parent for\nthe method's environment, like so:\n\n<img src=\"image/inheritance/environments.png\" alt=\"The environment chain including the superclass environment.\" />\n\nThat's a lot of machinery, but we'll get through it a step at a time. Before we\ncan get to creating the environment at runtime, we need to handle the\ncorresponding scope chain in the resolver.\n\n^code begin-super-scope (2 before, 2 after)\n\nIf the class declaration has a superclass, then we create a new scope\nsurrounding all of its methods. In that scope, we define the name \"super\". 
Once\nwe're done resolving the class's methods, we discard that scope.\n\n^code end-super-scope (2 before, 1 after)\n\nIt's a minor optimization, but we only create the superclass environment if the\nclass actually *has* a superclass. There's no point creating it when there isn't\na superclass since there'd be no superclass to store in it anyway.\n\nWith \"super\" defined in a scope chain, we are able to resolve the `super`\nexpression itself.\n\n^code resolve-super-expr\n\nWe resolve the `super` token exactly as if it were a variable. The resolution\nstores the number of hops along the environment chain that the interpreter needs\nto walk to find the environment where the superclass is stored.\n\nThis code is mirrored in the interpreter. When we evaluate a subclass\ndefinition, we create a new environment.\n\n^code begin-superclass-environment (6 before, 2 after)\n\nInside that environment, we store a reference to the superclass -- the actual\nLoxClass object for the superclass which we have now that we are in the runtime.\nThen we create the LoxFunctions for each method. Those will capture the current\nenvironment -- the one where we just bound \"super\" -- as their closure, holding\non to the superclass like we need. Once that's done, we pop the environment.\n\n^code end-superclass-environment (2 before, 2 after)\n\nWe're ready to interpret `super` expressions themselves. There are a few moving\nparts, so we'll build this method up in pieces.\n\n^code interpreter-visit-super\n\nFirst, the work we've been leading up to. We look up the surrounding class's\nsuperclass by looking up \"super\" in the proper environment.\n\nWhen we access a method, we also need to bind `this` to the object the method is\naccessed from. In an expression like `doughnut.cook`, the object is whatever we\nget from evaluating `doughnut`. In a `super` expression like `super.cook`, the\ncurrent object is implicitly the *same* current object that we're using. In\nother words, `this`. 
Even though we are looking up the *method* on the\nsuperclass, the *instance* is still `this`.\n\nUnfortunately, inside the `super` expression, we don't have a convenient node\nfor the resolver to hang the number of hops to `this` on. Fortunately, we do\ncontrol the layout of the environment chains. The environment where \"this\" is\nbound is always right inside the environment where we store \"super\".\n\n^code super-find-this (2 before, 1 after)\n\nOffsetting the distance by one looks up \"this\" in that inner environment. I\nadmit this isn't the most <span name=\"elegant\">elegant</span> code, but it\nworks.\n\n<aside name=\"elegant\">\n\nWriting a book that includes every single line of code for a program means I\ncan't hide the hacks by leaving them as an \"exercise for the reader\".\n\n</aside>\n\nNow we're ready to look up and bind the method, starting at the superclass.\n\n^code super-find-method (2 before, 1 after)\n\nThis is almost exactly like the code for looking up a method of a get\nexpression, except that we call `findMethod()` on the superclass instead of on\nthe class of the current object.\n\nThat's basically it. Except, of course, that we might *fail* to find the method.\nSo we check for that too.\n\n^code super-no-method (2 before, 2 after)\n\nThere you have it! Take that BostonCream example earlier and give it a try.\nAssuming you and I did everything right, it should fry it first, then stuff it\nwith cream.\n\n### Invalid uses of super\n\nAs with previous language features, our implementation does the right thing when\nthe user writes correct code, but we haven't bulletproofed the interpreter\nagainst bad code. In particular, consider:\n\n```lox\nclass Eclair {\n  cook() {\n    super.cook();\n    print \"Pipe full of crème pâtissière.\";\n  }\n}\n```\n\nThis class has a `super` expression, but no superclass. 
At runtime, the code for\nevaluating `super` expressions assumes that \"super\" was successfully resolved\nand will be found in the environment. That's going to fail here because there is\nno surrounding environment for the superclass since there is no superclass. The\nJVM will throw an exception and bring our interpreter to its knees.\n\nHeck, there are even simpler broken uses of super:\n\n```lox\nsuper.notEvenInAClass();\n```\n\nWe could handle errors like these at runtime by checking to see if the lookup\nof \"super\" succeeded. But we can tell statically -- just by looking at the\nsource code -- that Eclair has no superclass and thus no `super` expression will\nwork inside it. Likewise, in the second example, we know that the `super`\nexpression is not even inside a method body.\n\nEven though Lox is dynamically typed, that doesn't mean we want to defer\n*everything* to runtime. If the user made a mistake, we'd like to help them find\nit sooner rather than later. So we'll report these errors statically, in the\nresolver.\n\nFirst, we add a new case to the enum we use to keep track of what kind of class\nis surrounding the current code being visited.\n\n^code class-type-subclass (1 before, 1 after)\n\nWe'll use that to distinguish when we're inside a class that has a superclass\nversus one that doesn't. When we resolve a class declaration, we set that if the\nclass is a subclass.\n\n^code set-current-subclass (1 before, 1 after)\n\nThen, when we resolve a `super` expression, we check to see that we are\ncurrently inside a scope where that's allowed.\n\n^code invalid-super (1 before, 1 after)\n\nIf not -- oopsie! -- the user made a mistake.\n\n## Conclusion\n\nWe made it! That final bit of error handling is the last chunk of code needed to\ncomplete our Java implementation of Lox. This is a real <span\nname=\"superhero\">accomplishment</span> and one you should be proud of. 
In the\npast dozen chapters and a thousand or so lines of code, we have learned and\nimplemented...\n\n* [tokens and lexing][4],\n* [abstract syntax trees][5],\n* [recursive descent parsing][6],\n* prefix and infix expressions,\n* runtime representation of objects,\n* [interpreting code using the Visitor pattern][7],\n* [lexical scope][8],\n* environment chains for storing variables,\n* [control flow][9],\n* [functions with parameters][10],\n* closures,\n* [static variable resolution and error detection][11],\n* [classes][12],\n* constructors,\n* fields,\n* methods, and finally,\n* inheritance.\n\n[4]: scanning.html\n[5]: representing-code.html\n[6]: parsing-expressions.html\n[7]: evaluating-expressions.html\n[8]: statements-and-state.html\n[9]: control-flow.html\n[10]: functions.html\n[11]: resolving-and-binding.html\n[12]: classes.html\n\n<aside name=\"superhero\">\n\n<img src=\"image/inheritance/superhero.png\" alt=\"You, being your bad self.\" />\n\n</aside>\n\nWe did all of that from scratch, with no external dependencies or magic tools.\nJust you and I, our respective text editors, a couple of collection classes in\nthe Java standard library, and the JVM runtime.\n\nThis marks the end of Part II, but not the end of the book. Take a break. Maybe\nwrite a few fun Lox programs and run them in your interpreter. (You may want to\nadd a few more native methods for things like reading user input.) When you're\nrefreshed and ready, we'll embark on our [next adventure][].\n\n[next adventure]: a-bytecode-virtual-machine.html\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  Lox supports only *single inheritance* -- a class may have a single\n    superclass and that's the only way to reuse methods across classes. 
Other\n    languages have explored a variety of ways to more freely reuse and share\n    capabilities across classes: mixins, traits, multiple inheritance, virtual\n    inheritance, extension methods, etc.\n\n    If you were to add some feature along these lines to Lox, which would you\n    pick and why? If you're feeling courageous (and you should be at this\n    point), go ahead and add it.\n\n1.  In Lox, as in most other object-oriented languages, when looking up a\n    method, we start at the bottom of the class hierarchy and work our way up --\n    a subclass's method is preferred over a superclass's. In order to get to the\n    superclass method from within an overriding method, you use `super`.\n\n    The language [BETA][] takes the [opposite approach][inner]. When you call a\n    method, it starts at the *top* of the class hierarchy and works *down*. A\n    superclass method wins over a subclass method. In order to get to the\n    subclass method, the superclass method can call `inner`, which is sort of\n    like the inverse of `super`. It chains to the next method down the\n    hierarchy.\n\n    The superclass method controls when and where the subclass is allowed to\n    refine its behavior. If the superclass method doesn't call `inner` at all,\n    then the subclass has no way of overriding or modifying the superclass's\n    behavior.\n\n    Take out Lox's current overriding and `super` behavior and replace it with\n    BETA's semantics. In short:\n\n    *   When calling a method on a class, prefer the method *highest* on the\n        class's inheritance chain.\n\n    *   Inside the body of a method, a call to `inner` looks for a method with\n        the same name in the nearest subclass along the inheritance chain\n        between the class containing the `inner` and the class of `this`. 
If\n        there is no matching method, the `inner` call does nothing.\n\n    For example:\n\n    ```lox\n    class Doughnut {\n      cook() {\n        print \"Fry until golden brown.\";\n        inner();\n        print \"Place in a nice box.\";\n      }\n    }\n\n    class BostonCream < Doughnut {\n      cook() {\n        print \"Pipe full of custard and coat with chocolate.\";\n      }\n    }\n\n    BostonCream().cook();\n    ```\n\n    This should print:\n\n    ```text\n    Fry until golden brown.\n    Pipe full of custard and coat with chocolate.\n    Place in a nice box.\n    ```\n\n1.  In the chapter where I introduced Lox, [I challenged you][challenge] to\n    come up with a couple of features you think the language is missing. Now\n    that you know how to build an interpreter, implement one of those features.\n\n[challenge]: the-lox-language.html#challenges\n[inner]: http://journal.stuffwithstuff.com/2012/12/19/the-impoliteness-of-overriding-methods/\n[beta]: https://beta.cs.au.dk/\n\n</div>\n"
  },
  {
    "path": "book/introduction.md",
    "content": "> Fairy tales are more than true: not because they tell us that dragons exist,\n> but because they tell us that dragons can be beaten.\n>\n> <cite>G.K. Chesterton by way of Neil Gaiman, <em>Coraline</em></cite>\n\nI'm really excited we're going on this journey together. This is a book on\nimplementing interpreters for programming languages. It's also a book on how to\ndesign a language worth implementing. It's the book I wish I'd had when I first\nstarted getting into languages, and it's the book I've been writing in my <span\nname=\"head\">head</span> for nearly a decade.\n\n<aside name=\"head\">\n\nTo my friends and family, sorry I've been so absentminded!\n\n</aside>\n\nIn these pages, we will walk step-by-step through two complete interpreters for\na full-featured language. I assume this is your first foray into languages, so\nI'll cover each concept and line of code you need to build a complete, usable,\nfast language implementation.\n\nIn order to cram two full implementations inside one book without it turning\ninto a doorstop, this text is lighter on theory than others. As we build each\npiece of the system, I will introduce the history and concepts behind it. I'll\ntry to get you familiar with the lingo so that if you ever find yourself at a\n<span name=\"party\">cocktail party</span> full of PL (programming language)\nresearchers, you'll fit in.\n\n<aside name=\"party\">\n\nStrangely enough, a situation I have found myself in multiple times. You\nwouldn't believe how much some of them can drink.\n\n</aside>\n\nBut we're mostly going to spend our brain juice getting the language up and\nrunning. This is not to say theory isn't important. Being able to reason\nprecisely and <span name=\"formal\">formally</span> about syntax and semantics is\na vital skill when working on a language. But, personally, I learn best by\ndoing. It's hard for me to wade through paragraphs full of abstract concepts and\nreally absorb them. 
But if I've coded something, run it, and debugged it, then I\n*get* it.\n\n<aside name=\"formal\">\n\nStatic type systems in particular require rigorous formal reasoning. Hacking on\na type system has the same feel as proving a theorem in mathematics.\n\nIt turns out this is no coincidence. In the early half of last century, Haskell\nCurry and William Alvin Howard showed that they are two sides of the same coin:\n[the Curry-Howard isomorphism][].\n\n[the curry-howard isomorphism]: https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence\n\n</aside>\n\nThat's my goal for you. I want you to come away with a solid intuition of how a\nreal language lives and breathes. My hope is that when you read other, more\ntheoretical books later, the concepts there will firmly stick in your mind,\nadhered to this tangible substrate.\n\n## Why Learn This Stuff?\n\nEvery introduction to every compiler book seems to have this section. I don't\nknow what it is about programming languages that causes such existential doubt.\nI don't think ornithology books worry about justifying their existence. They\nassume the reader loves birds and start teaching.\n\nBut programming languages are a little different. I suppose it is true that the\nodds of any of us creating a broadly successful, general-purpose programming\nlanguage are slim. The designers of the world's widely used languages could fit\nin a Volkswagen bus, even without putting the pop-top camper up. If joining that\nelite group was the *only* reason to learn languages, it would be hard to\njustify. Fortunately, it isn't.\n\n### Little languages are everywhere\n\nFor every successful general-purpose language, there are a thousand successful\nniche ones. We used to call them \"little languages\", but inflation in the jargon\neconomy led to the name \"domain-specific languages\". These are pidgins\ntailor-built to a specific task. 
Think application scripting languages, template\nengines, markup formats, and configuration files.\n\n<span name=\"little\"></span><img src=\"image/introduction/little-languages.png\" alt=\"A random selection of little languages.\" />\n\n<aside name=\"little\">\n\nA random selection of some little languages you might run into.\n\n</aside>\n\nAlmost every large software project needs a handful of these. When you can, it's\ngood to reuse an existing one instead of rolling your own. Once you factor in\ndocumentation, debuggers, editor support, syntax highlighting, and all of the\nother trappings, doing it yourself becomes a tall order.\n\nBut there's still a good chance you'll find yourself needing to whip up a parser\nor other tool when there isn't an existing library that fits your needs. Even\nwhen you are reusing some existing implementation, you'll inevitably end up\nneeding to debug and maintain it and poke around in its guts.\n\n### Languages are great exercise\n\nLong distance runners sometimes train with weights strapped to their ankles or\nat high altitudes where the atmosphere is thin. When they later unburden\nthemselves, the new relative ease of light limbs and oxygen-rich air enables\nthem to run farther and faster.\n\nImplementing a language is a real test of programming skill. The code is complex\nand performance critical. You must master recursion, dynamic arrays, trees,\ngraphs, and hash tables. You probably use hash tables at least in your\nday-to-day programming, but do you *really* understand them? Well, after we've\ncrafted our own from scratch, I guarantee you will.\n\nWhile I intend to show you that an interpreter isn't as daunting as you might\nbelieve, implementing one well is still a challenge. 
Rise to it, and you'll come\naway a stronger programmer, and smarter about how you use data structures and\nalgorithms in your day job.\n\n### One more reason\n\nThis last reason is hard for me to admit, because it's so close to my heart.\nEver since I learned to program as a kid, I felt there was something magical\nabout languages. When I first tapped out BASIC programs one key at a time I\ncouldn't conceive how BASIC *itself* was made.\n\nLater, the mixture of awe and terror on my college friends' faces when talking\nabout their compilers class was enough to convince me language hackers were a\ndifferent breed of human -- some sort of wizards granted privileged access to\narcane arts.\n\nIt's a charming <span name=\"image\">image</span>, but it has a darker side. *I*\ndidn't feel like a wizard, so I was left thinking I lacked some inborn quality\nnecessary to join the cabal. Though I've been fascinated by languages ever since\nI doodled made-up keywords in my school notebook, it took me decades to muster\nthe courage to try to really learn them. That \"magical\" quality, that sense of\nexclusivity, excluded *me*.\n\n<aside name=\"image\">\n\nAnd its practitioners don't hesitate to play up this image. Two of the seminal\ntexts on programming languages feature a [dragon][] and a [wizard][] on their\ncovers.\n\n[dragon]: https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools\n[wizard]: https://mitpress.mit.edu/sites/default/files/sicp/index.html\n\n</aside>\n\nWhen I did finally start cobbling together my own little interpreters, I quickly\nlearned that, of course, there is no magic at all. It's just code, and the\npeople who hack on languages are just people.\n\nThere *are* a few techniques you don't often encounter outside of languages, and\nsome parts are a little difficult. But not more difficult than other obstacles\nyou've overcome. 
My hope is that if you've felt intimidated by languages and\nthis book helps you overcome that fear, maybe I'll leave you just a tiny bit\nbraver than you were before.\n\nAnd, who knows, maybe you *will* make the next great language. Someone has to.\n\n## How the Book Is Organized\n\nThis book is broken into three parts. You're reading the first one now. It's a\ncouple of chapters to get you oriented, teach you some of the lingo that\nlanguage hackers use, and introduce you to Lox, the language we'll be\nimplementing.\n\nEach of the other two parts builds one complete Lox interpreter. Within those\nparts, each chapter is structured the same way. The chapter takes a single\nlanguage feature, teaches you the concepts behind it, and walks you through an\nimplementation.\n\nIt took a good bit of trial and error on my part, but I managed to carve up the\ntwo interpreters into chapter-sized chunks that build on the previous chapters\nbut require nothing from later ones. From the very first chapter, you'll have a\nworking program you can run and play with. With each passing chapter, it grows\nincreasingly full-featured until you eventually have a complete language.\n\nAside from copious, scintillating English prose, chapters have a few other\ndelightful facets:\n\n### The code\n\nWe're about *crafting* interpreters, so this book contains real code. Every\nsingle line of code needed is included, and each snippet tells you where to\ninsert it in your ever-growing implementation.\n\nMany other language books and language implementations use tools like [Lex][]\nand <span name=\"yacc\">[Yacc][]</span>, so-called **compiler-compilers**, that\nautomatically generate some of the source files for an implementation from some\nhigher-level description. 
There are pros and cons to tools like those, and\nstrong opinions -- some might say religious convictions -- on both sides.\n\n<aside name=\"yacc\">\n\nYacc is a tool that takes in a grammar file and produces a source file for a\ncompiler, so it's sort of like a \"compiler\" that outputs a compiler, which is\nwhere we get the term \"compiler-compiler\".\n\nYacc wasn't the first of its ilk, which is why it's named \"Yacc\" -- *Yet\nAnother* Compiler-Compiler. A later similar tool is [Bison][], named as a pun on\nthe pronunciation of Yacc like \"yak\".\n\n<img src=\"image/introduction/yak.png\" alt=\"A yak.\" />\n\n[bison]: https://en.wikipedia.org/wiki/GNU_bison\n\nIf you find all of these little self-references and puns charming and fun,\nyou'll fit right in here. If not, well, maybe the language nerd sense of humor\nis an acquired taste.\n\n</aside>\n\nWe will abstain from using them here. I want to ensure there are no dark corners\nwhere magic and confusion can hide, so we'll write everything by hand. As you'll\nsee, it's not as bad as it sounds, and it means you really will understand each\nline of code and how both interpreters work.\n\n[lex]: https://en.wikipedia.org/wiki/Lex_(software)\n[yacc]: https://en.wikipedia.org/wiki/Yacc\n\nA book has different constraints from the \"real world\" and so the coding style\nhere might not always reflect the best way to write maintainable production\nsoftware. If I seem a little cavalier about, say, omitting `private` or\ndeclaring a global variable, understand I do so to keep the code easier on your\neyes. The pages here aren't as wide as your IDE and every character counts.\n\nAlso, the code doesn't have many comments. That's because each handful of lines\nis surrounded by several paragraphs of honest-to-God prose explaining it. When\nyou write a book to accompany your program, you are welcome to omit comments\ntoo. 
Otherwise, you should probably use `//` a little more than I do.\n\nWhile the book contains every line of code and teaches what each means, it does\nnot describe the machinery needed to compile and run the interpreter. I assume\nyou can slap together a makefile or a project in your IDE of choice in order to\nget the code to run. Those kinds of instructions get out of date quickly, and\nI want this book to age like XO brandy, not backyard hooch.\n\n### Snippets\n\nSince the book contains literally every line of code needed for the\nimplementations, the snippets are quite precise. Also, because I try to keep the\nprogram in a runnable state even when major features are missing, sometimes we\nadd temporary code that gets replaced in later snippets.\n\nA snippet with all the bells and whistles looks like this:\n\n<div class=\"codehilite\"><pre class=\"insert-before\">\n      default:\n</pre><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nin <em>scanToken</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">\n        <span class=\"k\">if</span> (<span class=\"i\">isDigit</span>(<span class=\"i\">c</span>)) {\n          <span class=\"i\">number</span>();\n        } <span class=\"k\">else</span> {\n          <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">line</span>, <span class=\"s\">&quot;Unexpected character.&quot;</span>);\n        }\n</pre><pre class=\"insert-after\">\n        break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, in <em>scanToken</em>(), replace 1 line</div>\n\nIn the center, you have the new code to add. It may have a few faded out lines\nabove or below to show where it goes in the existing surrounding code. There is\nalso a little blurb telling you in which file and where to place the snippet. 
If\nthat blurb says \"replace _ lines\", there is some existing code between the faded\nlines that you need to remove and replace with the new snippet.\n\n### Asides\n\n<span name=\"joke\">Asides</span> contain biographical sketches, historical\nbackground, references to related topics, and suggestions of other areas to\nexplore. There's nothing that you *need* to know in them to understand later\nparts of the book, so you can skip them if you want. I won't judge you, but I\nmight be a little sad.\n\n<aside name=\"joke\">\n\nWell, some asides do, at least. Most of them are just dumb jokes and amateurish\ndrawings.\n\n</aside>\n\n### Challenges\n\nEach chapter ends with a few exercises. Unlike textbook problem sets, which tend\nto review material you already covered, these are to help you learn *more* than\nwhat's in the chapter. They force you to step off the guided path and explore on\nyour own. They will make you research other languages, figure out how to\nimplement features, or otherwise get you out of your comfort zone.\n\n<span name=\"warning\">Vanquish</span> the challenges and you'll come away with a\nbroader understanding and possibly a few bumps and scrapes. Or skip them if you\nwant to stay inside the comfy confines of the tour bus. It's your book.\n\n<aside name=\"warning\">\n\nA word of warning: the challenges often ask you to make changes to the\ninterpreter you're building. You'll want to implement those in a copy of your\ncode. The later chapters assume your interpreter is in a pristine\n(\"unchallenged\"?) state.\n\n</aside>\n\n### Design notes\n\nMost \"programming language\" books are strictly programming language\n*implementation* books. They rarely discuss how one might happen to *design* the\nlanguage being implemented. Implementation is fun because it is so <span\nname=\"benchmark\">precisely defined</span>. 
We programmers seem to have an\naffinity for things that are black and white, ones and zeroes.\n\n<aside name=\"benchmark\">\n\nI know a lot of language hackers whose careers are based on this. You slide a\nlanguage spec under their door, wait a few months, and code and benchmark\nresults come out.\n\n</aside>\n\nPersonally, I think the world needs only so many implementations of <span\nname=\"fortran\">FORTRAN 77</span>. At some point, you find yourself designing a\n*new* language. Once you start playing *that* game, then the softer, human side\nof the equation becomes paramount. Things like which features are easy to learn,\nhow to balance innovation and familiarity, what syntax is more readable and to\nwhom.\n\n<aside name=\"fortran\">\n\nHopefully your new language doesn't hardcode assumptions about the width of a\npunched card into its grammar.\n\n</aside>\n\nAll of that stuff profoundly affects the success of your new language. I want\nyour language to succeed, so in some chapters I end with a \"design note\", a\nlittle essay on some corner of the human aspect of programming languages. I'm no\nexpert on this -- I don't know if anyone really is -- so take these with a large\npinch of salt. That should make them tastier food for thought, which is my main\naim.\n\n## The First Interpreter\n\nWe'll write our first interpreter, jlox, in <span name=\"lang\">Java</span>. The\nfocus is on *concepts*. We'll write the simplest, cleanest code we can to\ncorrectly implement the semantics of the language. This will get us comfortable\nwith the basic techniques and also hone our understanding of exactly how the\nlanguage is supposed to behave.\n\n<aside name=\"lang\">\n\nThe book uses Java and C, but readers have ported the code to [many other\nlanguages][port]. If the languages I picked aren't your bag, take a look at\nthose.\n\n[port]: https://github.com/munificent/craftinginterpreters/wiki/Lox-implementations\n\n</aside>\n\nJava is a great language for this. 
It's high level enough that we don't get\noverwhelmed by fiddly implementation details, but it's still pretty explicit.\nUnlike in scripting languages, there tends to be less complex machinery hiding\nunder the hood, and you've got static types to see what data structures you're\nworking with.\n\nI also chose Java specifically because it is an object-oriented language. That\nparadigm swept the programming world in the '90s and is now the dominant way of\nthinking for millions of programmers. Odds are good you're already used to\norganizing code into classes and methods, so we'll keep you in that comfort\nzone.\n\nWhile academic language folks sometimes look down on object-oriented languages,\nthe reality is that they are widely used even for language work. GCC and LLVM\nare written in C++, as are most JavaScript virtual machines. Object-oriented\nlanguages are ubiquitous, and the tools and compilers *for* a language are often\nwritten *in* the <span name=\"host\">same language</span>.\n\n<aside name=\"host\">\n\nA compiler reads files in one language, translates them, and outputs files in\nanother language. You can implement a compiler in any language, including the\nsame language it compiles, a process called **self-hosting**.\n\nYou can't compile your compiler using itself yet, but if you have another\ncompiler for your language written in some other language, you use *that* one to\ncompile your compiler once. Now you can use the compiled version of your own\ncompiler to compile future versions of itself, and you can discard the original\none compiled from the other compiler. This is called **bootstrapping**, from\nthe image of pulling yourself up by your own bootstraps.\n\n<img src=\"image/introduction/bootstrap.png\" alt=\"Fact: This is the primary mode of transportation of the American cowboy.\" />\n\n</aside>\n\nAnd, finally, Java is hugely popular. 
That means there's a good chance you\nalready know it, so there's less for you to learn to get going in the book. If\nyou aren't that familiar with Java, don't freak out. I try to stick to a fairly\nminimal subset of it. I use the diamond operator from Java 7 to make things a\nlittle more terse, but that's about it as far as \"advanced\" features go. If you\nknow another object-oriented language, like C# or C++, you can muddle through.\n\nBy the end of part II, we'll have a simple, readable implementation. It's not\nvery fast, but it's correct. However, we are only able to accomplish that by\nbuilding on the Java virtual machine's own runtime facilities. We want to learn\nhow Java *itself* implements those things.\n\n## The Second Interpreter\n\nSo in the next part, we start all over again, but this time in C. C is the\nperfect language for understanding how an implementation *really* works, all the\nway down to the bytes in memory and the code flowing through the CPU.\n\nA big reason that we're using C is so I can show you things C is particularly\ngood at, but that *does* mean you'll need to be pretty comfortable with it. You\ndon't have to be the reincarnation of Dennis Ritchie, but you shouldn't be\nspooked by pointers either.\n\nIf you aren't there yet, pick up an introductory book on C and chew through it,\nthen come back here when you're done. In return, you'll come away from this book\nan even stronger C programmer. That's useful given how many language\nimplementations are written in C: Lua, CPython, and Ruby's MRI, to name a few.\n\nIn our C interpreter, <span name=\"clox\">clox</span>, we are forced to implement\nfor ourselves all the things Java gave us for free. We'll write our own dynamic\narray and hash table. 
We'll decide how objects are represented in memory, and\nbuild a garbage collector to reclaim them.\n\n<aside name=\"clox\">\n\nI pronounce the name like \"sea-locks\", but you can say it \"clocks\" or even\n\"cloch\", where you pronounce the \"x\" like the Greeks do if it makes you happy.\n\n</aside>\n\nOur Java implementation was focused on being correct. Now that we have that\ndown, we'll turn to also being *fast*. Our C interpreter will contain a <span\nname=\"compiler\">compiler</span> that translates Lox to an efficient bytecode\nrepresentation (don't worry, I'll get into what that means soon), which it then\nexecutes. This is the same technique used by implementations of Lua, Python,\nRuby, PHP, and many other successful languages.\n\n<aside name=\"compiler\">\n\nDid you think this was just an interpreter book? It's a compiler book as well.\nTwo for the price of one!\n\n</aside>\n\nWe'll even try our hand at benchmarking and optimization. By the end, we'll have\na robust, accurate, fast interpreter for our language, able to keep up with\nother professional caliber implementations out there. Not bad for one book and a\nfew thousand lines of code.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  There are at least six domain-specific languages used in the [little system\n    I cobbled together][repo] to write and publish this book. What are they?\n\n1.  Get a \"Hello, world!\" program written and running in Java. Set up whatever\n    makefiles or IDE projects you need to get it working. If you have a\n    debugger, get comfortable with it and step through your program as it runs.\n\n1.  Do the same thing for C. To get some practice with pointers, define a\n    [doubly linked list][] of heap-allocated strings. Write functions to insert,\n    find, and delete items from it. 
Test them.\n\n[repo]: https://github.com/munificent/craftinginterpreters\n[doubly linked list]: https://en.wikipedia.org/wiki/Doubly_linked_list\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: What's in a Name?\n\nOne of the hardest challenges in writing this book was coming up with a name for\nthe language it implements. I went through *pages* of candidates before I found\none that worked. As you'll discover on the first day you start building your own\nlanguage, naming is deviously hard. A good name satisfies a few criteria:\n\n1.  **It isn't in use.** You can run into all sorts of trouble, legal and\n    social, if you inadvertently step on someone else's name.\n\n2.  **It's easy to pronounce.** If things go well, hordes of people will be\n    saying and writing your language's name. Anything longer than a couple of\n    syllables or a handful of letters will annoy them to no end.\n\n3.  **It's distinct enough to search for.** People will Google your language's\n    name to learn about it, so you want a word that's rare enough that most\n    results point to your docs. Though, with the amount of AI search engines are\n    packing today, that's less of an issue. Still, you won't be doing your users\n    any favors if you name your language \"for\".\n\n4.  **It doesn't have negative connotations across a number of cultures.** This\n    is hard to be on guard for, but it's worth considering. The designer of\n    Nimrod ended up renaming his language to \"Nim\" because too many people\n    remember that Bugs Bunny used \"Nimrod\" as an insult. (Bugs was using it\n    ironically.)\n\nIf your potential name makes it through that gauntlet, keep it. Don't get hung\nup on trying to find an appellation that captures the quintessence of your\nlanguage. If the names of the world's other successful languages teach us\nanything, it's that the name doesn't matter much. All you need is a reasonably\nunique token.\n\n</div>\n"
  },
  {
    "path": "book/jumping-back-and-forth.md",
    "content": "> The order that our mind imagines is like a net, or like a ladder, built to\n> attain something. But afterward you must throw the ladder away, because you\n> discover that, even if it was useful, it was meaningless.\n>\n> <cite>Umberto Eco, <em>The Name of the Rose</em></cite>\n\nIt's taken a while to get here, but we're finally ready to add control flow to\nour virtual machine. In the tree-walk interpreter we built for jlox, we\nimplemented Lox's control flow in terms of Java's. To execute a Lox `if`\nstatement, we used a Java `if` statement to run the chosen branch. That works,\nbut isn't entirely satisfying. By what magic does the *JVM itself* or a native\nCPU implement `if` statements? Now that we have our own bytecode VM to hack on,\nwe can answer that.\n\nWhen we talk about \"control flow\", what are we referring to? By \"flow\" we mean\nthe way execution moves through the text of the program. Almost like there is a\nlittle robot inside the computer wandering through our code, executing bits and\npieces here and there. Flow is the path that robot takes, and by *controlling*\nthe robot, we drive which pieces of code it executes.\n\nIn jlox, the robot's locus of attention -- the *current* bit of code -- was\nimplicit based on which AST nodes were stored in various Java variables and what\nJava code we were in the middle of running. In clox, it is much more explicit.\nThe VM's `ip` field stores the address of the current bytecode instruction. The\nvalue of that field is exactly \"where we are\" in the program.\n\nExecution proceeds normally by incrementing the `ip`. But we can mutate that\nvariable however we want to. In order to implement control flow, all that's\nnecessary is to change the `ip` in more interesting ways. The simplest control\nflow construct is an `if` statement with no `else` clause:\n\n```lox\nif (condition) print(\"condition was truthy\");\n```\n\nThe VM evaluates the bytecode for the condition expression. 
If the result is\ntruthy, then it continues along and executes the `print` statement in the body.\nThe interesting case is when the condition is falsey. When that happens,\nexecution skips over the then branch and proceeds to the next statement.\n\nTo skip over a chunk of code, we simply set the `ip` field to the address of the\nbytecode instruction following that code. To *conditionally* skip over some\ncode, we need an instruction that looks at the value on top of the stack. If\nit's falsey, it adds a given offset to the `ip` to jump over a range of\ninstructions. Otherwise, it does nothing and lets execution proceed to the next\ninstruction as usual.\n\nWhen we compile to bytecode, the explicit nested block structure of the code\nevaporates, leaving only a flat series of instructions behind. Lox is a\n[structured programming][] language, but clox bytecode isn't. The right -- or\nwrong, depending on how you look at it -- set of bytecode instructions could\njump into the middle of a block, or from one scope into another.\n\nThe VM will happily execute that, even if the result leaves the stack in an\nunknown, inconsistent state. So even though the bytecode is unstructured, we'll\ntake care to ensure that our compiler only generates clean code that maintains\nthe same structure and nesting that Lox itself does.\n\nThis is exactly how real CPUs behave. Even though we might program them using\nhigher-level languages that mandate structured control flow, the compiler lowers\nthat down to raw jumps. At the bottom, it turns out goto is the only real\ncontrol flow.\n\n[structured programming]: https://en.wikipedia.org/wiki/Structured_programming\n\nAnyway, I didn't mean to get all philosophical. The important bit is that if we\nhave that one conditional jump instruction, that's enough to implement Lox's\n`if` statement, as long as it doesn't have an `else` clause. 
So let's go ahead\nand get started with that.\n\n## If Statements\n\nThis many chapters in, you know the drill. Any new feature starts in the front\nend and works its way through the pipeline. An `if` statement is, well, a\nstatement, so that's where we hook it into the parser.\n\n^code parse-if (2 before, 1 after)\n\nWhen we see an `if` keyword, we hand off compilation to this function:\n\n^code if-statement\n\n<aside name=\"paren\">\n\nHave you ever noticed that the `(` after the `if` keyword doesn't actually do\nanything useful? The language would be just as unambiguous and easy to parse\nwithout it, like:\n\n```lox\nif condition) print(\"looks weird\");\n```\n\nThe closing `)` is useful because it separates the condition expression from the\nbody. Some languages use a `then` keyword instead. But the opening `(` doesn't\ndo anything. It's just there because unmatched parentheses look bad to us\nhumans.\n\n</aside>\n\nFirst we compile the condition expression, bracketed by parentheses. At runtime,\nthat will leave the condition value on top of the stack. We'll use that to\ndetermine whether to execute the then branch or skip it.\n\nThen we emit a new `OP_JUMP_IF_FALSE` instruction. It has an operand for how\nmuch to offset the `ip` -- how many bytes of code to skip. If the condition is\nfalsey, it adjusts the `ip` by that amount. Something like this:\n\n<aside name=\"legend\">\n\nThe boxes with the torn edges here represent the blob of bytecode generated by\ncompiling some sub-clause of a control flow construct. So the \"condition\nexpression\" box is all of the instructions emitted when we compiled that\nexpression.\n\n</aside>\n\n<span name=\"legend\"></span>\n\n<img src=\"image/jumping-back-and-forth/if-without-else.png\" alt=\"Flowchart of the compiled bytecode of an if statement.\" />\n\nBut we have a problem. When we're writing the `OP_JUMP_IF_FALSE` instruction's\noperand, how do we know how far to jump? 
We haven't compiled the then branch\nyet, so we don't know how much bytecode it contains.\n\nTo fix that, we use a classic trick called **backpatching**. We emit the jump\ninstruction first with a placeholder offset operand. We keep track of where that\nhalf-finished instruction is. Next, we compile the then body. Once that's done,\nwe know how far to jump. So we go back and replace that placeholder offset with\nthe real one now that we can calculate it. Sort of like sewing a patch onto the\nexisting fabric of the compiled code.\n\n<img src=\"image/jumping-back-and-forth/patch.png\" alt=\"A patch containing a number being sewn onto a sheet of bytecode.\" />\n\nWe encode this trick into two helper functions.\n\n^code emit-jump\n\nThe first emits a bytecode instruction and writes a placeholder operand for the\njump offset. We pass in the opcode as an argument because later we'll have two\ndifferent instructions that use this helper. We use two bytes for the jump\noffset operand. A 16-bit <span name=\"offset\">offset</span> lets us jump over up\nto 65,535 bytes of code, which should be plenty for our needs.\n\n<aside name=\"offset\">\n\nSome instruction sets have separate \"long\" jump instructions that take larger\noperands for when you need to jump a greater distance.\n\n</aside>\n\nThe function returns the offset of the emitted instruction in the chunk. After\ncompiling the then branch, we take that offset and pass it to this:\n\n^code patch-jump\n\nThis goes back into the bytecode and replaces the operand at the given location\nwith the calculated jump offset. We call `patchJump()` right before we emit the\nnext instruction that we want the jump to land on, so it uses the current\nbytecode count to determine how far to jump. In the case of an `if` statement,\nthat means right after we compile the then branch and before we compile the next\nstatement.\n\nThat's all we need at compile time. 
Let's define the new instruction.\n\n^code jump-if-false-op (1 before, 1 after)\n\nOver in the VM, we get it working like so:\n\n^code op-jump-if-false (2 before, 1 after)\n\nThis is the first instruction we've added that takes a 16-bit operand. To read\nthat from the chunk, we use a new macro.\n\n^code read-short (1 before, 1 after)\n\nIt yanks the next two bytes from the chunk and builds a 16-bit unsigned integer\nout of them. As usual, we clean up our macro when we're done with it.\n\n^code undef-read-short (1 before, 1 after)\n\nAfter reading the offset, we check the condition value on top of the stack.\n<span name=\"if\">If</span> it's falsey, we apply this jump offset to the `ip`.\nOtherwise, we leave the `ip` alone and execution will automatically proceed to\nthe next instruction following the jump instruction.\n\nIn the case where the condition is falsey, we don't need to do any other work.\nWe've offset the `ip`, so when the outer instruction dispatch loop turns again,\nit will pick up execution at that new instruction, past all of the code in the\nthen branch.\n\n<aside name=\"if\">\n\nI said we wouldn't use C's `if` statement to implement Lox's control flow, but\nwe do use one here to determine whether or not to offset the instruction\npointer. But we aren't really using C for *control flow*. If we wanted to, we\ncould do the same thing purely arithmetically. Let's assume we have a function\n`falsey()` that takes a Lox Value and returns 1 if it's falsey or 0 otherwise.\nThen we could implement the jump instruction like:\n\n```c\ncase OP_JUMP_IF_FALSE: {\n  uint16_t offset = READ_SHORT();\n  vm.ip += falsey(peek(0)) * offset;\n  break;\n}\n```\n\nThe `falsey()` function would probably use some control flow to handle the\ndifferent value types, but that's an implementation detail of that function and\ndoesn't affect how our VM does its own control flow.\n\n</aside>\n\nNote that the jump instruction doesn't pop the condition value off the stack. 
So\nwe aren't totally done here, since this leaves an extra value floating around on\nthe stack. We'll clean that up soon. Ignoring that for the moment, we do have a\nworking `if` statement in Lox now, with only one little instruction required to\nsupport it at runtime in the VM.\n\n### Else clauses\n\nAn `if` statement without support for `else` clauses is like Morticia Addams\nwithout Gomez. So, after we compile the then branch, we look for an `else`\nkeyword. If we find one, we compile the else branch.\n\n^code compile-else (1 before, 1 after)\n\nWhen the condition is falsey, we'll jump over the then branch. If there's an\nelse branch, the `ip` will land right at the beginning of its code. That's\nnot enough, though. Here's the flow that leads to:\n\n<img src=\"image/jumping-back-and-forth/bad-else.png\" alt=\"Flowchart of the compiled bytecode with the then branch incorrectly falling through to the else branch.\" />\n\nIf the condition is truthy, we execute the then branch like we want. But after\nthat, execution rolls right on through into the else branch. Oops! When the\ncondition is truthy, after we run the then branch, we need to jump over the else\nbranch. That way, in either case, we only execute a single branch, like this:\n\n<img src=\"image/jumping-back-and-forth/if-else.png\" alt=\"Flowchart of the compiled bytecode for an if with an else clause.\" />\n\nTo implement that, we need another jump from the end of the then branch.\n\n^code jump-over-else (2 before, 1 after)\n\nWe patch that offset after the end of the else body.\n\n^code patch-else (1 before, 1 after)\n\nAfter executing the then branch, this jumps to the next statement after the else\nbranch. Unlike the other jump, this jump is unconditional. 
We always take it, so\nwe need another instruction that expresses that.\n\n^code jump-op (1 before, 1 after)\n\nWe interpret it like so:\n\n^code op-jump (2 before, 1 after)\n\nNothing too surprising here -- the only difference is that it doesn't check a\ncondition and always applies the offset.\n\nWe have then and else branches working now, so we're close. The last bit is to\nclean up that condition value we left on the stack. Remember, each statement is\nrequired to have zero stack effect -- after the statement is finished executing,\nthe stack should be as tall as it was before.\n\nWe could have the `OP_JUMP_IF_FALSE` instruction pop the condition itself, but\nsoon we'll use that same instruction for the logical operators where we don't\nwant the condition popped. Instead, we'll have the compiler emit a couple of\nexplicit `OP_POP` instructions when compiling an `if` statement. We need to take\ncare that every execution path through the generated code pops the condition.\n\nWhen the condition is truthy, we pop it right before the code inside the then\nbranch.\n\n^code pop-then (1 before, 1 after)\n\nOtherwise, we pop it at the beginning of the else branch.\n\n^code pop-end (1 before, 2 after)\n\nThis little instruction here also means that every `if` statement has an\nimplicit else branch even if the user didn't write an `else` clause. In the case\nwhere they left it off, all the branch does is discard the condition value.\n\nThe full correct flow looks like this:\n\n<img src=\"image/jumping-back-and-forth/full-if-else.png\" alt=\"Flowchart of the compiled bytecode including necessary pop instructions.\" />\n\nIf you trace through, you can see that it always executes a single branch and\nensures the condition is popped first. 
All that remains is a little disassembler\nsupport.\n\n^code disassemble-jump (1 before, 1 after)\n\nThese two instructions have a new format with a 16-bit operand, so we add a new\nutility function to disassemble them.\n\n^code jump-instruction\n\nThere we go, that's one complete control flow construct. If this were an '80s\nmovie, the montage music would kick in and the rest of the control flow syntax\nwould take care of itself. Alas, the <span name=\"80s\">'80s</span> are long over,\nso we'll have to grind it out ourselves.\n\n<aside name=\"80s\">\n\nMy enduring love of Depeche Mode notwithstanding.\n\n</aside>\n\n## Logical Operators\n\nYou probably remember this from jlox, but the logical operators `and` and `or`\naren't just another pair of binary operators like `+` and `-`. Because they\nshort-circuit and may not evaluate their right operand depending on the value of\nthe left one, they work more like control flow expressions.\n\nThey're basically a little variation on an `if` statement with an `else` clause.\nThe easiest way to explain them is to just show you the compiler code and the\ncontrol flow it produces in the resulting bytecode. Starting with `and`, we hook\nit into the expression parsing table here:\n\n^code table-and (1 before, 1 after)\n\nThat hands off to a new parser function.\n\n^code and\n\nAt the point this is called, the left-hand side expression has already been\ncompiled. That means at runtime, its value will be on top of the stack. If that\nvalue is falsey, then we know the entire `and` must be false, so we skip the\nright operand and leave the left-hand side value as the result of the entire\nexpression. Otherwise, we discard the left-hand value and evaluate the right\noperand which becomes the result of the whole `and` expression.\n\nThose four lines of code right there produce exactly that. 
The flow looks like\nthis:\n\n<img src=\"image/jumping-back-and-forth/and.png\" alt=\"Flowchart of the compiled bytecode of an 'and' expression.\" />\n\nNow you can see why `OP_JUMP_IF_FALSE` <span name=\"instr\">leaves</span> the\nvalue on top of the stack. When the left-hand side of the `and` is falsey, that\nvalue sticks around to become the result of the entire expression.\n\n<aside name=\"instr\">\n\nWe've got plenty of space left in our opcode range, so we could have separate\ninstructions for conditional jumps that implicitly pop and those that don't, I\nsuppose. But I'm trying to keep things minimal for the book. In your bytecode\nVM, it's worth exploring adding more specialized instructions and seeing how\nthey affect performance.\n\n</aside>\n\n### Logical or operator\n\nThe `or` operator is a little more complex. First we add it to the parse table.\n\n^code table-or (1 before, 1 after)\n\nWhen that parser consumes an infix `or` token, it calls this:\n\n^code or\n\nIn an `or` expression, if the left-hand side is *truthy*, then we skip over the\nright operand. Thus we need to jump when a value is truthy. We could add a\nseparate instruction, but just to show how our compiler is free to map the\nlanguage's semantics to whatever instruction sequence it wants, I implemented it\nin terms of the jump instructions we already have.\n\nWhen the left-hand side is falsey, it does a tiny jump over the next\ninstruction. That instruction is an unconditional jump over the code for the\nright operand. This little dance effectively does a jump when the value is\ntruthy. The flow looks like this:\n\n<img src=\"image/jumping-back-and-forth/or.png\" alt=\"Flowchart of the compiled bytecode of a logical or expression.\" />\n\nIf I'm honest with you, this isn't the best way to do this. There are more\ninstructions to dispatch and more overhead. There's no good reason why `or`\nshould be slower than `and`. 
But it is kind of fun to see that it's possible to\nimplement both operators without adding any new instructions. Forgive me my\nindulgences.\n\nOK, those are the three *branching* constructs in Lox. By that, I mean, these\nare the control flow features that only jump *forward* over code. Other\nlanguages often have some kind of multi-way branching statement like `switch`\nand maybe a conditional expression like `?:`, but Lox keeps it simple.\n\n## While Statements\n\nThat takes us to the *looping* statements, which jump *backward* so that code\ncan be executed more than once. Lox only has two loop constructs, `while` and\n`for`. A `while` loop is (much) simpler, so we start the party there.\n\n^code parse-while (1 before, 1 after)\n\nWhen we reach a `while` token, we call:\n\n^code while-statement\n\nMost of this mirrors `if` statements -- we compile the condition expression,\nsurrounded by mandatory parentheses. That's followed by a jump instruction that\nskips over the subsequent body statement if the condition is falsey.\n\nWe patch the jump after compiling the body and take care to <span\nname=\"pop\">pop</span> the condition value from the stack on either path. The\nonly difference from an `if` statement is the loop. That looks like this:\n\n<aside name=\"pop\">\n\nReally starting to second-guess my decision to use the same jump instructions\nfor the logical operators.\n\n</aside>\n\n^code loop (1 before, 2 after)\n\nAfter the body, we call this function to emit a \"loop\" instruction. That\ninstruction needs to know how far back to jump. When jumping forward, we had to\nemit the instruction in two stages since we didn't know how far we were going to\njump until after we emitted the jump instruction. We don't have that problem\nnow. 
We've already compiled the point in code that we want to jump back to --\nit's right before the condition expression.\n\nAll we need to do is capture that location as we compile it.\n\n^code loop-start (1 before, 1 after)\n\nAfter executing the body of a `while` loop, we jump all the way back to before\nthe condition. That way, we re-evaluate the condition expression on each\niteration. We store the chunk's current instruction count in `loopStart` to\nrecord the offset in the bytecode right before the condition expression we're\nabout to compile. Then we pass that into this helper function:\n\n^code emit-loop\n\nIt's a bit like `emitJump()` and `patchJump()` combined. It emits a new loop\ninstruction, which unconditionally jumps *backwards* by a given offset. Like the\njump instructions, after that we have a 16-bit operand. We calculate the offset\nfrom the instruction we're currently at to the `loopStart` point that we want to\njump back to. The `+ 2` is to take into account the size of the `OP_LOOP`\ninstruction's own operands which we also need to jump over.\n\nFrom the VM's perspective, there really is no semantic difference between\n`OP_LOOP` and `OP_JUMP`. Both just add an offset to the `ip`. We could have used\na single instruction for both and given it a signed offset operand. But I\nfigured it was a little easier to sidestep the annoying bit twiddling required\nto manually pack a signed 16-bit integer into two bytes, and we've got the\nopcode space available, so why not use it?\n\nThe new instruction is here:\n\n^code loop-op (1 before, 1 after)\n\nAnd in the VM, we implement it thusly:\n\n^code op-loop (1 before, 1 after)\n\nThe only difference from `OP_JUMP` is a subtraction instead of an addition.\nDisassembly is similar too.\n\n^code disassemble-loop (1 before, 1 after)\n\nThat's our `while` statement. 
It contains two jumps -- a conditional forward one\nto escape the loop when the condition is not met, and an unconditional loop\nbackward after we have executed the body. The flow looks like this:\n\n<img src=\"image/jumping-back-and-forth/while.png\" alt=\"Flowchart of the compiled bytecode of a while statement.\" />\n\n## For Statements\n\nThe other looping statement in Lox is the venerable `for` loop, inherited from\nC. It's got a lot more going on with it compared to a `while` loop. It has three\nclauses, all of which are optional:\n\n<span name=\"detail\"></span>\n\n*   The initializer can be a variable declaration or an expression. It runs once\n    at the beginning of the statement.\n\n*   The condition clause is an expression. Like in a `while` loop, we exit the\n    loop when it evaluates to something falsey.\n\n*   The increment expression runs once at the end of each loop iteration.\n\n<aside name=\"detail\">\n\nIf you want a refresher, the corresponding chapter in part II goes through the\nsemantics [in more detail][jlox].\n\n[jlox]: control-flow.html#for-loops\n\n</aside>\n\nIn jlox, the parser desugared a `for` loop to a synthesized AST for a `while`\nloop with some extra stuff before it and at the end of the body. We'll do\nsomething similar, though we won't go through anything like an AST. Instead,\nour bytecode compiler will use the jump and loop instructions we already have.\n\nWe'll work our way through the implementation a piece at a time, starting with\nthe `for` keyword.\n\n^code parse-for (1 before, 1 after)\n\nIt calls a helper function. If we only supported `for` loops with empty clauses\nlike `for (;;)`, then we could implement it like this:\n\n^code for-statement\n\nThere's a bunch of mandatory punctuation at the top. Then we compile the body.\nLike we did for `while` loops, we record the bytecode offset at the top of the\nbody and emit a loop to jump back to that point after it. 
We've got a working\nimplementation of <span name=\"infinite\">infinite</span> loops now.\n\n<aside name=\"infinite\">\n\nAlas, without `return` statements, there isn't any way to terminate it short of\na runtime error.\n\n</aside>\n\n### Initializer clause\n\nNow we'll add the first clause, the initializer. It executes only once, before\nthe body, so compiling is straightforward.\n\n^code for-initializer (1 before, 2 after)\n\nThe syntax is a little complex since we allow either a variable declaration or\nan expression. We use the presence of the `var` keyword to tell which we have.\nFor the expression case, we call `expressionStatement()` instead of\n`expression()`. That looks for a semicolon, which we need here too, and also\nemits an `OP_POP` instruction to discard the value. We don't want the\ninitializer to leave anything on the stack.\n\nIf a `for` statement declares a variable, that variable should be scoped to the\nloop body. We ensure that by wrapping the whole statement in a scope.\n\n^code for-begin-scope (1 before, 1 after)\n\nThen we close it at the end.\n\n^code for-end-scope (1 before, 1 after)\n\n### Condition clause\n\nNext is the condition expression that can be used to exit the loop.\n\n^code for-exit (1 before, 1 after)\n\nSince the clause is optional, we need to see if it's actually present. If the\nclause is omitted, the next token must be a semicolon, so we look for that to\ntell. If there isn't a semicolon, there must be a condition expression.\n\nIn that case, we compile it. Then, just like with `while`, we emit a conditional\njump that exits the loop if the condition is falsey. Since the jump leaves the\nvalue on the stack, we pop it before executing the body. That ensures we discard\nthe value when the condition is true.\n\nAfter the loop body, we need to patch that jump.\n\n^code exit-jump (1 before, 2 after)\n\nWe do this only when there is a condition clause. 
If there isn't, there's no\njump to patch and no condition value on the stack to pop.\n\n### Increment clause\n\nI've saved the best for last, the increment clause. It's pretty convoluted. It\nappears textually before the body, but executes *after* it. If we parsed to an\nAST and generated code in a separate pass, we could simply traverse into and\ncompile the `for` statement AST's body field before its increment clause.\n\nUnfortunately, we can't compile the increment clause later, since our compiler\nonly makes a single pass over the code. Instead, we'll *jump over* the\nincrement, run the body, jump *back* up to the increment, run it, and then go to\nthe next iteration.\n\nI know, a little weird, but hey, it beats manually managing ASTs in memory in C,\nright? Here's the code:\n\n^code for-increment (2 before, 2 after)\n\nAgain, it's optional. Since this is the last clause, when omitted, the next\ntoken will be the closing parenthesis. When an increment is present, we need to\ncompile it now, but it shouldn't execute yet. So, first, we emit an\nunconditional jump that hops over the increment clause's code to the body of the\nloop.\n\nNext, we compile the increment expression itself. This is usually an assignment.\nWhatever it is, we only execute it for its side effect, so we also emit a pop to\ndiscard its value.\n\nThe last part is a little tricky. First, we emit a loop instruction. This is the\nmain loop that takes us back to the top of the `for` loop -- right before the\ncondition expression if there is one. That loop happens right after the\nincrement, since the increment executes at the end of each loop iteration.\n\nThen we change `loopStart` to point to the offset where the increment expression\nbegins. Later, when we emit the loop instruction after the body statement, this\nwill cause it to jump up to the *increment* expression instead of the top of the\nloop like it does when there is no increment. 
This is how we weave the\nincrement in to run after the body.\n\nIt's convoluted, but it all works out. A complete loop with all the clauses\ncompiles to a flow like this:\n\n<img src=\"image/jumping-back-and-forth/for.png\" alt=\"Flowchart of the compiled bytecode of a for statement.\" />\n\nAs with implementing `for` loops in jlox, we didn't need to touch the runtime.\nIt all gets compiled down to primitive control flow operations the VM already\nsupports. In this chapter, we've taken a big <span name=\"leap\">leap</span>\nforward -- clox is now Turing complete. We've also covered quite a bit of new\nsyntax: three statements and two expression forms. Even so, it only took three\nnew simple instructions. That's a pretty good effort-to-reward ratio for the\narchitecture of our VM.\n\n<aside name=\"leap\">\n\nI couldn't resist the pun. I regret nothing.\n\n</aside>\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  In addition to `if` statements, most C-family languages have a multi-way\n    `switch` statement. Add one to clox. The grammar is:\n\n    ```ebnf\n    switchStmt     → \"switch\" \"(\" expression \")\"\n                     \"{\" switchCase* defaultCase? \"}\" ;\n    switchCase     → \"case\" expression \":\" statement* ;\n    defaultCase    → \"default\" \":\" statement* ;\n    ```\n\n    To execute a `switch` statement, first evaluate the parenthesized switch\n    value expression. Then walk the cases. For each case, evaluate its value\n    expression. If the case value is equal to the switch value, execute the\n    statements under the case and then exit the `switch` statement. Otherwise,\n    try the next case. If no case matches and there is a `default` clause,\n    execute its statements.\n\n    To keep things simpler, we're omitting fallthrough and `break` statements.\n    Each case automatically jumps to the end of the switch statement after its\n    statements are done.\n\n1.  In jlox, we had a challenge to add support for `break` statements. 
This\n    time, let's do `continue`:\n\n    ```ebnf\n    continueStmt   → \"continue\" \";\" ;\n    ```\n\n    A `continue` statement jumps directly to the top of the nearest enclosing\n    loop, skipping the rest of the loop body. Inside a `for` loop, a `continue`\n    jumps to the increment clause, if there is one. It's a compile-time error to\n    have a `continue` statement not enclosed in a loop.\n\n    Make sure to think about scope. What should happen to local variables\n    declared inside the body of the loop or in blocks nested inside the loop\n    when a `continue` is executed?\n\n1.  Control flow constructs have been mostly unchanged since Algol 68. Language\n    evolution since then has focused on making code more declarative and high\n    level, so imperative control flow hasn't gotten much attention.\n\n    For fun, try to invent a useful novel control flow feature for Lox. It can\n    be a refinement of an existing form or something entirely new. In practice,\n    it's hard to come up with something useful enough at this low expressiveness\n    level to outweigh the cost of forcing a user to learn an unfamiliar notation\n    and behavior, but it's a good chance to practice your design skills.\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: Considering Goto Harmful\n\nDiscovering that all of our beautiful structured control flow in Lox is actually\ncompiled to raw unstructured jumps is like the moment in Scooby Doo when the\nmonster rips the mask off their face. It was goto all along! Except in this\ncase, the monster is *under* the mask. We all know goto is evil. But... why?\n\nIt is true that you can write outrageously unmaintainable code using goto. But I\ndon't think most programmers around today have seen that first hand. It's been a\nlong time since that style was common. 
These days, it's a bogeyman we invoke\nin scary stories around the campfire.\n\nThe reason we rarely confront that monster in person is that Edsger Dijkstra\nslayed it with his famous letter \"Go To Statement Considered Harmful\", published\nin *Communications of the ACM* (March 1968). Debate around structured\nprogramming had been fierce for some time with adherents on both sides, but I\nthink Dijkstra deserves the most credit for effectively ending it. Most new\nlanguages today have no unstructured jump statements.\n\nA one-and-a-half page letter that almost single-handedly destroyed a language\nfeature must be pretty impressive stuff. If you haven't read it, I encourage you\nto do so. It's a seminal piece of computer science lore, one of our tribe's\nancestral songs. Also, it's a nice, short bit of practice for reading academic\nCS <span name=\"style\">writing</span>, which is a useful skill to develop.\n\n<aside name=\"style\">\n\nThat is, if you can get past Dijkstra's insufferable faux-modest\nself-aggrandizing writing style:\n\n> More recently I discovered why the use of the go to statement has such\n> disastrous effects. ...At that time I did not attach too much importance to\n> this discovery; I now submit my considerations for publication because in very\n> recent discussions in which the subject turned up, I have been urged to do so.\n\nAh, yet another one of my many discoveries. I couldn't even be bothered to write\nit up until the clamoring masses begged me to.\n\n</aside>\n\nI've read it through a number of times, along with a few critiques, responses,\nand commentaries. I ended up with mixed feelings, at best. At a very high level,\nI'm with him. His general argument is something like this:\n\n1.  As programmers, we write programs -- static text -- but what we care about\n    is the actual running program -- its dynamic behavior.\n\n2.  We're better at reasoning about static things than dynamic things. 
(He\n    doesn't provide any evidence to support this claim, but I accept it.)\n\n3.  Thus, the more we can make the dynamic execution of the program reflect its\n    textual structure, the better.\n\nThis is a good start. Drawing our attention to the separation between the code\nwe write and the code as it runs inside the machine is an interesting insight.\nThen he tries to define a \"correspondence\" between program text and execution.\nFor someone who spent literally his entire career advocating greater rigor in\nprogramming, his definition is pretty hand-wavey. He says:\n\n> Let us now consider how we can characterize the progress of a process. (You\n> may think about this question in a very concrete manner: suppose that a\n> process, considered as a time succession of actions, is stopped after an\n> arbitrary action, what data do we have to fix in order that we can redo the\n> process until the very same point?)\n\nImagine it like this. You have two computers with the same program running on\nthe exact same inputs -- so totally deterministic. You pause one of them at an\narbitrary point in its execution. What data would you need to send to the other\ncomputer to be able to stop it exactly as far along as the first one was?\n\nIf your program allows only simple statements like assignment, it's easy. You\njust need to know the point after the last statement you executed. Basically a\nbreakpoint, the `ip` in our VM, or the line number in an error message. Adding\nbranching control flow like `if` and `switch` doesn't add any more to this. Even\nif the marker points inside a branch, we can still tell where we are.\n\nOnce you add function calls, you need something more. You could have paused the\nfirst computer in the middle of a function, but that function may be called from\nmultiple places. 
To pause the second machine at exactly the same point in *the\nentire program's* execution, you need to pause it on the *right* call to that\nfunction.\n\nSo you need to know not just the current statement, but, for function calls that\nhaven't returned yet, you need to know the locations of the callsites. In other\nwords, a call stack, though I don't think that term existed when Dijkstra wrote\nthis. Groovy.\n\nHe notes that loops make things harder. If you pause in the middle of a loop\nbody, you don't know how many iterations have run. So he says you also need to\nkeep an iteration count. And, since loops can nest, you need a stack of those\n(presumably interleaved with the call stack pointers since you can be in loops\nin outer calls too).\n\nThis is where it gets weird. So we're really building to something now, and you\nexpect him to explain how goto breaks all of this. Instead, he just says:\n\n> The unbridled use of the go to statement has an immediate consequence that it\n> becomes terribly hard to find a meaningful set of coordinates in which to\n> describe the process progress.\n\nHe doesn't prove that this is hard, or say why. He just says it. He does say\nthat one approach is unsatisfactory:\n\n> With the go to statement one can, of course, still describe the progress\n> uniquely by a counter counting the number of actions performed since program\n> start (viz. a kind of normalized clock). The difficulty is that such a\n> coordinate, although unique, is utterly unhelpful.\n\nBut... that's effectively what loop counters do, and he was fine with those.\nIt's not like every loop is a simple \"for every integer from 0 to 10\"\nincrementing count. Many are `while` loops with complex conditionals.\n\nTaking an example close to home, consider the core bytecode execution loop at\nthe heart of clox. Dijkstra argues that that loop is tractable because we can\nsimply count how many times the loop has run to reason about its progress. 
But\nthat loop runs once for each executed instruction in some user's compiled Lox\nprogram. Does knowing that it executed 6,201 bytecode instructions really tell\nus VM maintainers *anything* edifying about the state of the interpreter?\n\nIn fact, this particular example points to a deeper truth. Böhm and Jacopini\n[proved][] that *any* control flow using goto can be transformed into one using\njust sequencing, loops, and branches. Our bytecode interpreter loop is a living\nexample of that proof: it implements the unstructured control flow of the clox\nbytecode instruction set without using any gotos itself.\n\n[proved]: https://en.wikipedia.org/wiki/Structured_program_theorem\n\nThat seems to offer a counter-argument to Dijkstra's claim: you *can* define a\ncorrespondence for a program using gotos by transforming it to one that doesn't\nand then use the correspondence from that program, which -- according to him --\nis acceptable because it uses only branches and loops.\n\nBut, honestly, my argument here is also weak. I think both of us are basically\ndoing pretend math and using fake logic to make what should be an empirical,\nhuman-centered argument. Dijkstra is right that some code using goto is really\nbad. Much of that could and should be turned into clearer code by using\nstructured control flow.\n\nBy eliminating goto completely from languages, you're definitely prevented from\nwriting bad code using gotos. It may be that forcing users to use structured\ncontrol flow and making it an uphill battle to write goto-like code using those\nconstructs is a net win for all of our productivity.\n\nBut I do wonder sometimes if we threw out the baby with the bathwater. In the\nabsence of goto, we often resort to more complex structured patterns. The\n\"switch inside a loop\" is a classic one. 
Another is using a guard variable to\nexit out of a series of nested loops:\n\n<span name=\"break\">\n</span>\n\n```c\n// See if the matrix contains a zero.\nbool found = false;\nfor (int x = 0; x < xSize; x++) {\n  for (int y = 0; y < ySize; y++) {\n    for (int z = 0; z < zSize; z++) {\n      if (matrix[x][y][z] == 0) {\n        printf(\"found\");\n        found = true;\n        break;\n      }\n    }\n    if (found) break;\n  }\n  if (found) break;\n}\n```\n\nIs that really better than:\n\n```c\nfor (int x = 0; x < xSize; x++) {\n  for (int y = 0; y < ySize; y++) {\n    for (int z = 0; z < zSize; z++) {\n      if (matrix[x][y][z] == 0) {\n        printf(\"found\");\n        goto done;\n      }\n    }\n  }\n}\ndone:\n```\n\n<aside name=\"break\">\n\nYou could do this without `break` statements -- themselves a limited goto-ish\nconstruct -- by inserting `!found &&` at the beginning of the condition clause\nof each loop.\n\n</aside>\n\nI guess what I really don't like is that we're making language design and\nengineering decisions today based on fear. Few people today have any subtle\nunderstanding of the problems and benefits of goto. Instead, we just think it's\n\"considered harmful\". Personally, I've never found dogma a good starting place\nfor quality creative work.\n\n</div>\n"
  },
  {
    "path": "book/local-variables.md",
    "content": "> And as imagination bodies forth<br />\n> The forms of things unknown, the poet's pen<br />\n> Turns them to shapes and gives to airy nothing<br />\n> A local habitation and a name.\n>\n> <cite>William Shakespeare, <em>A Midsummer Night's Dream</em></cite>\n\nThe [last chapter][] introduced variables to clox, but only of the <span\nname=\"global\">global</span> variety. In this chapter, we'll extend that to\nsupport blocks, block scope, and local variables. In jlox, we managed to pack\nall of that and globals into one chapter. For clox, that's two chapters worth of\nwork partially because, frankly, everything takes more effort in C.\n\n<aside name=\"global\">\n\nThere's probably some dumb \"think globally, act locally\" joke here, but I'm\nstruggling to find it.\n\n</aside>\n\n[last chapter]: global-variables.html\n\nBut an even more important reason is that our approach to local variables will\nbe quite different from how we implemented globals. Global variables are late\nbound in Lox. \"Late\" in this context means \"resolved after compile time\". That's\ngood for keeping the compiler simple, but not great for performance. Local\nvariables are one of the most-used <span name=\"params\">parts</span> of a\nlanguage. If locals are slow, *everything* is slow. So we want a strategy for\nlocal variables that's as efficient as possible.\n\n<aside name=\"params\">\n\nFunction parameters are also heavily used. They work like local variables too,\nso we'll use the same implementation technique for them.\n\n</aside>\n\nFortunately, lexical scoping is here to help us. As the name implies, lexical\nscope means we can resolve a local variable just by looking at the text of the\nprogram -- locals are *not* late bound. 
Any processing work we do in the\ncompiler is work we *don't* have to do at runtime, so our implementation of\nlocal variables will lean heavily on the compiler.\n\n## Representing Local Variables\n\nThe nice thing about hacking on a programming language in modern times is\nthere's a long lineage of other languages to learn from. So how do C and Java\nmanage their local variables? Why, on the stack, of course! They typically use\nthe native stack mechanisms supported by the chip and OS. That's a little too\nlow level for us, but inside the virtual world of clox, we have our own stack we\ncan use.\n\nRight now, we only use it for holding on to **temporaries** -- short-lived blobs\nof data that we need to remember while computing an expression. As long as we\ndon't get in the way of those, we can stuff our local variables onto the stack\ntoo. This is great for performance. Allocating space for a new local requires\nonly incrementing the `stackTop` pointer, and freeing is likewise a decrement.\nAccessing a variable from a known stack slot is an indexed array lookup.\n\nWe do need to be careful, though. The VM expects the stack to behave like, well,\na stack. We have to be OK with allocating new locals only on the top of the\nstack, and we have to accept that we can discard a local only when nothing is\nabove it on the stack. Also, we need to make sure temporaries don't interfere.\n\nConveniently, the design of Lox is in <span name=\"harmony\">harmony</span> with\nthese constraints. New locals are always created by declaration statements.\nStatements don't nest inside expressions, so there are never any temporaries on\nthe stack when a statement begins executing. Blocks are strictly nested. When a\nblock ends, it always takes the innermost, most recently declared locals with\nit. Since those are also the locals that came into scope last, they should be on\ntop of the stack where we need them.\n\n<aside name=\"harmony\">\n\nThis alignment obviously isn't coincidental. 
I designed Lox to be amenable to\nsingle-pass compilation to stack-based bytecode. But I didn't have to tweak the\nlanguage too much to fit in those restrictions. Most of its design should feel\npretty natural.\n\nThis is in large part because the history of languages is deeply tied to\nsingle-pass compilation and -- to a lesser degree -- stack-based architectures.\nLox's block scoping follows a tradition stretching back to BCPL. As programmers,\nour intuition of what's \"normal\" in a language is informed even today by the\nhardware limitations of yesteryear.\n\n</aside>\n\nStep through this example program and watch how the local variables come in and\ngo out of scope:\n\n<img src=\"image/local-variables/scopes.png\" alt=\"A series of local variables come into and out of scope in a stack-like fashion.\" />\n\nSee how they fit a stack perfectly? It seems that the stack will work for\nstoring locals at runtime. But we can go further than that. Not only do we know\n*that* they will be on the stack, but we can even pin down precisely *where*\nthey will be on the stack. Since the compiler knows exactly which local\nvariables are in scope at any point in time, it can effectively simulate the\nstack during compilation and note <span name=\"fn\">where</span> in the stack each\nvariable lives.\n\nWe'll take advantage of this by using these stack offsets as operands for the\nbytecode instructions that read and store local variables. This makes working\nwith locals deliciously fast -- as simple as indexing into an array.\n\n<aside name=\"fn\">\n\nIn this chapter, locals start at the bottom of the VM's stack array and are\nindexed from there. When we add [functions][], that scheme gets a little more\ncomplex. Each function needs its own region of the stack for its parameters and\nlocal variables. 
But, as we'll see, that doesn't add as much complexity as you\nmight expect.\n\n[functions]: calls-and-functions.html\n\n</aside>\n\nThere's a lot of state we need to track in the compiler to make this whole thing\ngo, so let's get started there. In jlox, we used a linked chain of \"environment\"\nHashMaps to track which local variables were currently in scope. That's sort of\nthe classic, schoolbook way of representing lexical scope. For clox, as usual,\nwe're going a little closer to the metal. All of the state lives in a new\nstruct.\n\n^code compiler-struct (1 before, 2 after)\n\nWe have a simple, flat array of all locals that are in scope at each point in\nthe compilation process. They are <span name=\"order\">ordered</span> in the array\nin the order that their declarations appear in the code. Since the instruction\noperand we'll use to encode a local is a single byte, our VM has a hard limit on\nthe number of locals that can be in scope at once. That means we can also give\nthe locals array a fixed size.\n\n<aside name=\"order\">\n\nWe're writing a single-pass compiler, so it's not like we have *too* many other\noptions for how to order them in the array.\n\n</aside>\n\n^code uint8-count (1 before, 2 after)\n\nBack in the Compiler struct, the `localCount` field tracks how many locals are\nin scope -- how many of those array slots are in use. We also track the \"scope\ndepth\". This is the number of blocks surrounding the current bit of code we're\ncompiling.\n\nOur Java interpreter used a chain of maps to keep each block's variables\nseparate from other blocks'. This time, we'll simply number variables with the\nlevel of nesting where they appear. Zero is the global scope, one is the first\ntop-level block, two is inside that, you get the idea. 
We use this to track\nwhich block each local belongs to so that we know which locals to discard when a\nblock ends.\n\nEach local in the array is one of these:\n\n^code local-struct (1 before, 2 after)\n\nWe store the name of the variable. When we're resolving an identifier, we\ncompare the identifier's lexeme with each local's name to find a match. It's\npretty hard to resolve a variable if you don't know its name. The `depth` field\nrecords the scope depth of the block where the local variable was declared.\nThat's all the state we need for now.\n\nThis is a very different representation from what we had in jlox, but it still\nlets us answer all of the same questions our compiler needs to ask of the\nlexical environment. The next step is figuring out how the compiler *gets* at\nthis state. If we were <span name=\"thread\">principled</span> engineers, we'd\ngive each function in the front end a parameter that accepts a pointer to a\nCompiler. We'd create a Compiler at the beginning and carefully thread it\nthrough each function call... but that would mean a lot of boring changes to\nthe code we already wrote, so here's a global variable instead:\n\n<aside name=\"thread\">\n\nIn particular, if we ever want to use our compiler in a multi-threaded\napplication, possibly with multiple compilers running in parallel, then using a\nglobal variable is a *bad* idea.\n\n</aside>\n\n^code current-compiler (1 before, 1 after)\n\nHere's a little function to initialize the compiler:\n\n^code init-compiler\n\nWhen we first start up the VM, we call it to get everything into a clean state.\n\n^code compiler (1 before, 1 after)\n\nOur compiler has the data it needs, but not the operations on that data. There's\nno way to create and destroy scopes, or add and resolve variables. We'll add\nthose as we need them. First, let's start building some language features.\n\n## Block Statements\n\nBefore we can have any local variables, we need some local scopes. 
These come\nfrom two things: function bodies and <span name=\"block\">blocks</span>. Functions\nare a big chunk of work that we'll tackle in [a later chapter][functions], so\nfor now we're only going to do blocks. As usual, we start with the syntax. The\nnew grammar we'll introduce is:\n\n```ebnf\nstatement      → exprStmt\n               | printStmt\n               | block ;\n\nblock          → \"{\" declaration* \"}\" ;\n```\n\n<aside name=\"block\">\n\nWhen you think about it, \"block\" is a weird name. Used metaphorically, \"block\"\nusually means a small indivisible unit, but for some reason, the Algol 60\ncommittee decided to use it to refer to a *compound* structure -- a series of\nstatements. It could be worse, I suppose. Algol 58 called `begin` and `end`\n\"statement parentheses\".\n\n<img src=\"image/local-variables/block.png\" alt=\"A cinder block.\" class=\"above\" />\n\n</aside>\n\nBlocks are a kind of statement, so the rule for them goes in the `statement`\nproduction. The corresponding code to compile one looks like this:\n\n^code parse-block (2 before, 1 after)\n\nAfter <span name=\"helper\">parsing</span> the initial curly brace, we use this\nhelper function to compile the rest of the block:\n\n<aside name=\"helper\">\n\nThis function will come in handy later for compiling function bodies.\n\n</aside>\n\n^code block\n\nIt keeps parsing declarations and statements until it hits the closing brace. As\nwe do with any loop in the parser, we also check for the end of the token\nstream. This way, if there's a malformed program with a missing closing curly,\nthe compiler doesn't get stuck in a loop.\n\nExecuting a block simply means executing the statements it contains, one after\nthe other, so there isn't much to compiling them. The semantically interesting\nthing blocks do is create scopes. 
Before we compile the body of a block, we call\nthis function to enter a new local scope:\n\n^code begin-scope\n\nIn order to \"create\" a scope, all we do is increment the current depth. This is\ncertainly much faster than jlox, which allocated an entire new HashMap for\neach one. Given `beginScope()`, you can probably guess what `endScope()` does.\n\n^code end-scope\n\nThat's it for blocks and scopes -- more or less -- so we're ready to stuff some\nvariables into them.\n\n## Declaring Local Variables\n\nUsually we start with parsing here, but our compiler already supports parsing\nand compiling variable declarations. We've got `var` statements, identifier\nexpressions and assignment in there now. It's just that the compiler assumes\nall variables are global. So we don't need any new parsing support, we just need\nto hook up the new scoping semantics to the existing code.\n\n<img src=\"image/local-variables/declaration.png\" alt=\"The code flow within varDeclaration().\" />\n\nVariable declaration parsing begins in `varDeclaration()` and relies on a couple\nof other functions. First, `parseVariable()` consumes the identifier token for\nthe variable name, adds its lexeme to the chunk's constant table as a string,\nand then returns the constant table index where it was added. Then, after\n`varDeclaration()` compiles the initializer, it calls `defineVariable()` to emit\nthe bytecode for storing the variable's value in the global variable hash table.\n\nBoth of those helpers need a few changes to support local variables. In\n`parseVariable()`, we add:\n\n^code parse-local (1 before, 1 after)\n\nFirst, we \"declare\" the variable. I'll get to what that means in a second. After\nthat, we exit the function if we're in a local scope. At runtime, locals aren't\nlooked up by name. 
There's no need to stuff the variable's name into the\nconstant table, so if the declaration is inside a local scope, we return a dummy\ntable index instead.\n\nOver in `defineVariable()`, we need to emit the code to store a local variable\nif we're in a local scope. It looks like this:\n\n^code define-variable (1 before, 1 after)\n\nWait, what? Yup. That's it. There is no code to create a local variable at\nruntime. Think about what state the VM is in. It has already executed the code\nfor the variable's initializer (or the implicit `nil` if the user omitted an\ninitializer), and that value is sitting right on top of the stack as the only\nremaining temporary. We also know that new locals are allocated at the top of\nthe stack... right where that value already is. Thus, there's nothing to do. The\ntemporary simply *becomes* the local variable. It doesn't get much more\nefficient than that.\n\n<span name=\"locals\"></span>\n\n<img src=\"image/local-variables/local-slots.png\" alt=\"Walking through the bytecode execution showing that each initializer's result ends up in the local's slot.\" />\n\n<aside name=\"locals\">\n\nThe code on the left compiles to the sequence of instructions on the right.\n\n</aside>\n\nOK, so what's \"declaring\" about? Here's what that does:\n\n^code declare-variable\n\nThis is the point where the compiler records the existence of the variable. We\nonly do this for locals, so if we're in the top-level global scope, we just bail\nout. Because global variables are late bound, the compiler doesn't keep track of\nwhich declarations for them it has seen.\n\nBut for local variables, the compiler does need to remember that the variable\nexists. That's what declaring it does -- it adds it to the compiler's list of\nvariables in the current scope. 
We implement that using another new function.\n\n^code add-local\n\nThis initializes the next available Local in the compiler's array of variables.\nIt stores the variable's <span name=\"lexeme\">name</span> and the depth of the\nscope that owns the variable.\n\n<aside name=\"lexeme\">\n\nWorried about the lifetime of the string for the variable's name? The Local\ndirectly stores a copy of the Token struct for the identifier. Tokens store a\npointer to the first character of their lexeme and the lexeme's length. That\npointer points into the original source string for the script or REPL entry\nbeing compiled.\n\nAs long as that string stays around during the entire compilation process --\nwhich it must since, you know, we're compiling it -- then all of the tokens\npointing into it are fine.\n\n</aside>\n\nOur implementation is fine for a correct Lox program, but what about invalid\ncode? Let's aim to be robust. The first error to handle is not really the user's\nfault, but more a limitation of the VM. The instructions to work with local\nvariables refer to them by slot index. That index is stored in a single-byte\noperand, which means the VM only supports up to 256 local variables in scope at\none time.\n\nIf we try to go over that, not only could we not refer to them at runtime, but\nthe compiler would overwrite its own locals array, too. Let's prevent that.\n\n^code too-many-locals (1 before, 1 after)\n\nThe next case is trickier. Consider:\n\n```lox\n{\n  var a = \"first\";\n  var a = \"second\";\n}\n```\n\nAt the top level, Lox allows redeclaring a variable with the same name as a\nprevious declaration because that's useful for the REPL. But inside a local\nscope, that's a pretty <span name=\"rust\">weird</span> thing to do. 
It's likely\nto be a mistake, and many languages, including our own Lox, enshrine that\nassumption by making this an error.\n\n<aside name=\"rust\">\n\nInterestingly, the Rust programming language *does* allow this, and idiomatic\ncode relies on it.\n\n</aside>\n\nNote that the above program is different from this one:\n\n```lox\n{\n  var a = \"outer\";\n  {\n    var a = \"inner\";\n  }\n}\n```\n\nIt's OK to have two variables with the same name in *different* scopes, even\nwhen the scopes overlap such that both are visible at the same time. That's\nshadowing, and Lox does allow that. It's only an error to have two variables\nwith the same name in the *same* local scope.\n\nWe detect that error like so:\n\n^code existing-in-scope (1 before, 2 after)\n\n<aside name=\"negative\">\n\nDon't worry about that odd `depth != -1` part yet. We'll get to what that's\nabout later.\n\n</aside>\n\nLocal variables are appended to the array when they're declared, which means the\ncurrent scope is always at the end of the array. When we declare a new variable,\nwe start at the end and work backward, looking for an existing variable with the\nsame name. If we find one in the current scope, we report the error. Otherwise,\nif we reach the beginning of the array or a variable owned by another scope,\nthen we know we've checked all of the existing variables in the scope.\n\nTo see if two identifiers are the same, we use this:\n\n^code identifiers-equal\n\nSince we know the lengths of both lexemes, we check that first. That will fail\nquickly for many non-equal strings. If the <span name=\"hash\">lengths</span> are\nthe same, we check the characters using `memcmp()`. 
To get to `memcmp()`, we\nneed an include.\n\n<aside name=\"hash\">\n\nIt would be a nice little optimization if we could check their hashes, but\ntokens aren't full LoxStrings, so we haven't calculated their hashes yet.\n\n</aside>\n\n^code compiler-include-string (1 before, 2 after)\n\nWith this, we're able to bring variables into being. But, like ghosts, they\nlinger on beyond the scope where they are declared. When a block ends, we need\nto put them to rest.\n\n^code pop-locals (1 before, 1 after)\n\nWhen we pop a scope, we walk backward through the local array looking for any\nvariables declared at the scope depth we just left. We discard them by simply\ndecrementing the length of the array.\n\nThere is a runtime component to this too. Local variables occupy slots on the\nstack. When a local variable goes out of scope, that slot is no longer needed\nand should be freed. So, for each variable that we discard, we also emit an\n`OP_POP` <span name=\"pop\">instruction</span> to pop it from the stack.\n\n<aside name=\"pop\">\n\nWhen multiple local variables go out of scope at once, you get a series of\n`OP_POP` instructions that get interpreted one at a time. A simple optimization\nyou could add to your Lox implementation is a specialized `OP_POPN` instruction\nthat takes an operand for the number of slots to pop and pops them all at once.\n\n</aside>\n\n## Using Locals\n\nWe can now compile and execute local variable declarations. At runtime, their\nvalues are sitting where they should be on the stack. Let's start using them.\nWe'll do both variable access and assignment at the same time since they touch\nthe same functions in the compiler.\n\nWe already have code for getting and setting global variables, and -- like good\nlittle software engineers -- we want to reuse as much of that existing code as\nwe can. 
Something like this:\n\n^code named-local (1 before, 2 after)\n\nInstead of hardcoding the bytecode instructions emitted for variable access and\nassignment, we use a couple of C variables. First, we try to find a local\nvariable with the given name. If we find one, we use the instructions for\nworking with locals. Otherwise, we assume it's a global variable and use the\nexisting bytecode instructions for globals.\n\nA little further down, we use those variables to emit the right instructions.\nFor assignment:\n\n^code emit-set (2 before, 1 after)\n\nAnd for access:\n\n^code emit-get (2 before, 1 after)\n\nThe real heart of this chapter, the part where we resolve a local variable, is\nhere:\n\n^code resolve-local\n\nFor all that, it's straightforward. We walk the list of locals that are\ncurrently in scope. If one has the same name as the identifier token, the\nidentifier must refer to that variable. We've found it! We walk the array\nbackward so that we find the *last* declared variable with the identifier. That\nensures that inner local variables correctly shadow locals with the same name in\nsurrounding scopes.\n\nAt runtime, we load and store locals using the stack slot index, so that's what\nthe compiler needs to calculate after it resolves the variable. Whenever a\nvariable is declared, we append it to the locals array in Compiler. That means\nthe first local variable is at index zero, the next one is at index one, and so\non. In other words, the locals array in the compiler has the *exact* same layout\nas the VM's stack will have at runtime. The variable's index in the locals array\nis the same as its stack slot. How convenient!\n\nIf we make it through the whole array without finding a variable with the given\nname, it must not be a local. 
In that case, we return `-1` to signal that it\nwasn't found and should be assumed to be a global variable instead.\n\n### Interpreting local variables\n\nOur compiler is emitting two new instructions, so let's get them working. First\nis loading a local variable:\n\n^code get-local-op (1 before, 1 after)\n\nAnd its implementation:\n\n^code interpret-get-local (1 before, 1 after)\n\nIt takes a single-byte operand for the stack slot where the local lives. It\nloads the value from that index and then pushes it on top of the stack where\nlater instructions can find it.\n\n<aside name=\"slot\">\n\nIt seems redundant to push the local's value onto the stack since it's already\non the stack lower down somewhere. The problem is that the other bytecode\ninstructions only look for data at the *top* of the stack. This is the core\naspect that makes our bytecode instruction set *stack*-based.\n[Register-based][reg] bytecode instruction sets avoid this stack juggling at the\ncost of having larger instructions with more operands.\n\n[reg]: a-virtual-machine.html#design-note\n\n</aside>\n\nNext is assignment:\n\n^code set-local-op (1 before, 1 after)\n\nYou can probably predict the implementation.\n\n^code interpret-set-local (1 before, 1 after)\n\nIt takes the assigned value from the top of the stack and stores it in the stack\nslot corresponding to the local variable. Note that it doesn't pop the value\nfrom the stack. Remember, assignment is an expression, and every expression\nproduces a value. The value of an assignment expression is the assigned value\nitself, so the VM just leaves the value on the stack.\n\nOur disassembler is incomplete without support for these two new instructions.\n\n^code disassemble-local (1 before, 1 after)\n\nThe compiler compiles local variables to direct slot access. The local\nvariable's name never leaves the compiler to make it into the chunk at all.\nThat's great for performance, but not so great for introspection. 
When we\ndisassemble these instructions, we can't show the variable's name like we could\nwith globals. Instead, we just show the slot number.\n\n<aside name=\"debug\">\n\nErasing local variable names in the compiler is a real issue if we ever want to\nimplement a debugger for our VM. When users step through code, they expect to\nsee the values of local variables organized by their names. To support that,\nwe'd need to output some additional information that tracks the name of each\nlocal variable at each stack slot.\n\n</aside>\n\n^code byte-instruction\n\n### Another scope edge case\n\nWe've already sunk some time into handling a couple of weird edge cases around\nscopes. We made sure shadowing works correctly. We report an error if two\nvariables in the same local scope have the same name. For reasons that aren't\nentirely clear to me, variable scoping seems to have a lot of these wrinkles.\nI've never seen a language where it feels completely <span\nname=\"elegant\">elegant</span>.\n\n<aside name=\"elegant\">\n\nNo, not even Scheme.\n\n</aside>\n\nWe've got one more edge case to deal with before we end this chapter. Recall\nthis strange beastie we first met in [jlox's implementation of variable\nresolution][shadow]:\n\n[shadow]: resolving-and-binding.html#resolving-variable-declarations\n\n```lox\n{\n  var a = \"outer\";\n  {\n    var a = a;\n  }\n}\n```\n\nWe slayed it then by splitting a variable's declaration into two phases, and\nwe'll do that again here:\n\n<img src=\"image/local-variables/phases.png\" alt=\"An example variable declaration marked 'declared uninitialized' before the variable name and 'ready for use' after the initializer.\" />\n\nAs soon as the variable declaration begins -- in other words, before its\ninitializer -- the name is declared in the current scope. The variable exists,\nbut in a special \"uninitialized\" state. Then we compile the initializer. 
If at\nany point in that expression we resolve an identifier that points back to this\nvariable, we'll see that it is not initialized yet and report an error. After we\nfinish compiling the initializer, we mark the variable as initialized and ready\nfor use.\n\nTo implement this, when we declare a local, we need to indicate the\n\"uninitialized\" state somehow. We could add a new field to Local, but let's be a\nlittle more parsimonious with memory. Instead, we'll set the variable's scope\ndepth to a special sentinel value, `-1`.\n\n^code declare-undefined (1 before, 1 after)\n\nLater, once the variable's initializer has been compiled, we mark it\ninitialized.\n\n^code define-local (1 before, 2 after)\n\nThat is implemented like so:\n\n^code mark-initialized\n\nSo this is *really* what \"declaring\" and \"defining\" a variable means in the\ncompiler. \"Declaring\" is when the variable is added to the scope, and \"defining\"\nis when it becomes available for use.\n\nWhen we resolve a reference to a local variable, we check the scope depth to see\nif it's fully defined.\n\n^code own-initializer-error (1 before, 1 after)\n\nIf the variable has the sentinel depth, it must be a reference to a variable in\nits own initializer, and we report that as an error.\n\nThat's it for this chapter! We added blocks, local variables, and real,\nhonest-to-God lexical scoping. Given that we introduced an entirely different\nruntime representation for variables, we didn't have to write a lot of code. The\nimplementation ended up being pretty clean and efficient.\n\nYou'll notice that almost all of the code we wrote is in the compiler. Over in\nthe runtime, it's just two little instructions. You'll see this as a continuing\n<span name=\"static\">trend</span> in clox compared to jlox. One of the biggest\nhammers in the optimizer's toolbox is pulling work forward into the compiler so\nthat you don't have to do it at runtime. 
In this chapter, that meant resolving\nexactly which stack slot every local variable occupies. That way, at runtime, no\nlookup or resolution needs to happen.\n\n<aside name=\"static\">\n\nYou can look at static types as an extreme example of this trend. A statically\ntyped language takes all of the type analysis and type error handling and sorts\nit all out during compilation. Then the runtime doesn't have to waste any time\nchecking that values have the proper type for their operation. In fact, in some\nstatically typed languages like C, you don't even *know* the type at runtime.\nThe compiler completely erases any representation of a value's type leaving just\nthe bare bits.\n\n</aside>\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  Our simple local array makes it easy to calculate the stack slot of each\n    local variable. But it means that when the compiler resolves a reference to\n    a variable, we have to do a linear scan through the array.\n\n    Come up with something more efficient. Do you think the additional\n    complexity is worth it?\n\n1.  How do other languages handle code like this:\n\n    ```lox\n    var a = a;\n    ```\n\n    What would you do if it were your language? Why?\n\n1.  Many languages make a distinction between variables that can be reassigned\n    and those that can't. In Java, the `final` modifier prevents you from\n    assigning to a variable. In JavaScript, a variable declared with `let` can\n    be assigned, but one declared using `const` can't. Swift treats `let` as\n    single-assignment and uses `var` for assignable variables. Scala and Kotlin\n    use `val` and `var`.\n\n    Pick a keyword for a single-assignment variable form to add to Lox. Justify\n    your choice, then implement it. An attempt to assign to a variable declared\n    using your new keyword should cause a compile error.\n\n1.  Extend clox to allow more than 256 local variables to be in scope at a time.\n\n</div>\n"
  },
  {
    "path": "book/methods-and-initializers.md",
    "content": "> When you are on the dancefloor, there is nothing to do but dance.\n>\n> <cite>Umberto Eco, <em>The Mysterious Flame of Queen Loana</em></cite>\n\nIt is time for our virtual machine to bring its nascent objects to life with\nbehavior. That means methods and method calls. And, since they are a special\nkind of method, initializers too.\n\nAll of this is familiar territory from our previous jlox interpreter. What's new\nin this second trip is an important optimization we'll implement to make method\ncalls over seven times faster than our baseline performance. But before we get\nto that fun, we gotta get the basic stuff working.\n\n## Method Declarations\n\nWe can't optimize method calls before we have method calls, and we can't call\nmethods without having methods to call, so we'll start with declarations.\n\n### Representing methods\n\nWe usually start in the compiler, but let's knock the object model out first\nthis time. The runtime representation for methods in clox is similar to that of\njlox. Each class stores a hash table of methods. Keys are method names, and each\nvalue is an ObjClosure for the body of the method.\n\n^code class-methods (3 before, 1 after)\n\nA brand new class begins with an empty method table.\n\n^code init-methods (1 before, 1 after)\n\nThe ObjClass struct owns the memory for this table, so when the memory manager\ndeallocates a class, the table should be freed too.\n\n^code free-methods (1 before, 1 after)\n\nSpeaking of memory managers, the GC needs to trace through classes into the\nmethod table. If a class is still reachable (likely through some instance),\nthen all of its methods certainly need to stick around too.\n\n^code mark-methods (1 before, 1 after)\n\nWe use the existing `markTable()` function, which traces through the key string\nand value in each table entry.\n\nStoring a class's methods is pretty familiar coming from jlox. The different\npart is how that table gets populated. 
Our previous interpreter had access to\nthe entire AST node for the class declaration and all of the methods it\ncontained. At runtime, the interpreter simply walked that list of declarations.\n\nNow every piece of information the compiler wants to shunt over to the runtime\nhas to squeeze through the interface of a flat series of bytecode instructions.\nHow do we take a class declaration, which can contain an arbitrarily large set\nof methods, and represent it as bytecode? Let's hop over to the compiler and\nfind out.\n\n### Compiling method declarations\n\nThe last chapter left us with a compiler that parses classes but allows only an\nempty body. Now we insert a little code to compile a series of method\ndeclarations between the braces.\n\n^code class-body (1 before, 1 after)\n\nLox doesn't have field declarations, so anything before the closing brace at the\nend of the class body must be a method. We stop compiling methods when we hit\nthat final curly or if we reach the end of the file. The latter check ensures\nour compiler doesn't get stuck in an infinite loop if the user accidentally\nforgets the closing brace.\n\nThe tricky part with compiling a class declaration is that a class may declare\nany number of methods. Somehow the runtime needs to look up and bind all of\nthem. That would be a lot to pack into a single `OP_CLASS` instruction. Instead,\nthe bytecode we generate for a class declaration will split the process into a\n<span name=\"series\">*series*</span> of instructions. The compiler already emits\nan `OP_CLASS` instruction that creates a new empty ObjClass object. Then it\nemits instructions to store the class in a variable with its name.\n\n<aside name=\"series\">\n\nWe did something similar for closures. The `OP_CLOSURE` instruction needs to\nknow the type and index for each captured upvalue. We encoded that using a\nseries of pseudo-instructions following the main `OP_CLOSURE` instruction --\nbasically a variable number of operands. 
The VM processes all of those extra\nbytes immediately when interpreting the `OP_CLOSURE` instruction.\n\nHere our approach is a little different because from the VM's perspective, each\ninstruction to define a method is a separate stand-alone operation. Either\napproach would work. A variable-sized pseudo-instruction is possibly marginally\nfaster, but class declarations are rarely in hot loops, so it doesn't matter\nmuch.\n\n</aside>\n\nNow, for each method declaration, we emit a new `OP_METHOD` instruction that\nadds a single method to that class. When all of the `OP_METHOD` instructions\nhave executed, we're left with a fully formed class. While the user sees a class\ndeclaration as a single atomic operation, the VM implements it as a series of\nmutations.\n\nTo define a new method, the VM needs three things:\n\n1.  The name of the method.\n\n1.  The closure for the method body.\n\n1.  The class to bind the method to.\n\nWe'll incrementally write the compiler code to see how those all get through to\nthe runtime, starting here:\n\n^code method\n\nLike `OP_GET_PROPERTY` and other instructions that need names at runtime, the\ncompiler adds the method name token's lexeme to the constant table, getting back\na table index. Then we emit an `OP_METHOD` instruction with that index as the\noperand. That's the name. Next is the method body:\n\n^code method-body (1 before, 1 after)\n\nWe use the same `function()` helper that we wrote for compiling function\ndeclarations. That utility function compiles the subsequent parameter list and\nfunction body. Then it emits the code to create an ObjClosure and leave it on\ntop of the stack. At runtime, the VM will find the closure there.\n\nLast is the class to bind the method to. Where can the VM find that?\nUnfortunately, by the time we reach the `OP_METHOD` instruction, we don't know\nwhere it is. It <span name=\"global\">could</span> be on the stack, if the user\ndeclared the class in a local scope. 
But a top-level class declaration ends up\nwith the ObjClass in the global variable table.\n\n<aside name=\"global\">\n\nIf Lox supported declaring classes only at the top level, the VM could assume\nthat any class could be found by looking it up directly from the global\nvariable table. Alas, because we support local classes, we need to handle that\ncase too.\n\n</aside>\n\nFear not. The compiler does know the *name* of the class. We can capture it\nright after we consume its token.\n\n^code class-name (1 before, 1 after)\n\nAnd we know that no other declaration with that name could possibly shadow the\nclass. So we do the easy fix. Before we start binding methods, we emit whatever\ncode is necessary to load the class back on top of the stack.\n\n^code load-class (2 before, 1 after)\n\nRight before compiling the class body, we <span name=\"load\">call</span>\n`namedVariable()`. That helper function generates code to load a variable with\nthe given name onto the stack. Then we compile the methods.\n\n<aside name=\"load\">\n\nThe preceding call to `defineVariable()` pops the class, so it seems silly to\ncall `namedVariable()` to load it right back onto the stack. Why not simply\nleave it on the stack in the first place? We could, but in the [next\nchapter][super] we will insert code between these two calls to support\ninheritance. At that point, it will be simpler if the class isn't sitting around\non the stack.\n\n[super]: superclasses.html\n\n</aside>\n\nThis means that when we execute each `OP_METHOD` instruction, the stack has the\nmethod's closure on top with the class right under it. 
Once we've reached the\nend of the methods, we no longer need the class and tell the VM to pop it off\nthe stack.\n\n^code pop-class (1 before, 1 after)\n\nPutting all of that together, here is an example class declaration to throw at\nthe compiler:\n\n```lox\nclass Brunch {\n  bacon() {}\n  eggs() {}\n}\n```\n\nGiven that, here is what the compiler generates and how those instructions\naffect the stack at runtime:\n\n<img src=\"image/methods-and-initializers/method-instructions.png\" alt=\"The series of bytecode instructions for a class declaration with two methods.\" />\n\nAll that remains for us is to implement the runtime for that new `OP_METHOD`\ninstruction.\n\n### Executing method declarations\n\nFirst we define the opcode.\n\n^code method-op (1 before, 1 after)\n\nWe disassemble it like other instructions that have string constant operands.\n\n^code disassemble-method (2 before, 1 after)\n\nAnd over in the interpreter, we add a new case too.\n\n^code interpret-method (1 before, 1 after)\n\nThere, we read the method name from the constant table and pass it here:\n\n^code define-method\n\nThe method closure is on top of the stack, above the class it will be bound to.\nWe read those two stack slots and store the closure in the class's method table.\nThen we pop the closure since we're done with it.\n\nNote that we don't do any runtime type checking on the closure or class object.\nThat `AS_CLASS()` call is safe because the compiler itself generated the code\nthat causes the class to be in that stack slot. The VM <span\nname=\"verify\">trusts</span> its own compiler.\n\n<aside name=\"verify\">\n\nThe VM trusts that the instructions it executes are valid because the *only* way\nto get code to the bytecode interpreter is by going through clox's own compiler.\nMany bytecode VMs, like the JVM and CPython, support executing bytecode that has\nbeen compiled separately. That leads to a different security story. 
Maliciously\ncrafted bytecode could crash the VM or worse.\n\nTo prevent that, the JVM does a bytecode verification pass before it executes\nany loaded code. CPython says it's up to the user to ensure any bytecode they\nrun is safe.\n\n</aside>\n\nAfter the series of `OP_METHOD` instructions is done and the `OP_POP` has popped\nthe class, we will have a class with a nicely populated method table, ready to\nstart doing things. The next step is pulling those methods back out and using\nthem.\n\n## Method References\n\nMost of the time, methods are accessed and immediately called, leading to this\nfamiliar syntax:\n\n```lox\ninstance.method(argument);\n```\n\nBut remember, in Lox and some other languages, those two steps are distinct and\ncan be separated.\n\n```lox\nvar closure = instance.method;\nclosure(argument);\n```\n\nSince users *can* separate the operations, we have to implement them separately.\nThe first step is using our existing dotted property syntax to access a method\ndefined on the instance's class. That should return some kind of object that the\nuser can then call like a function.\n\nThe obvious approach is to look up the method in the class's method table and\nreturn the ObjClosure associated with that name. But we also need to remember\nthat when you access a method, `this` gets bound to the instance the method was\naccessed from. Here's the example from [when we added methods to jlox][jlox]:\n\n[jlox]: classes.html#methods-on-classes\n\n```lox\nclass Person {\n  sayName() {\n    print this.name;\n  }\n}\n\nvar jane = Person();\njane.name = \"Jane\";\n\nvar method = jane.sayName;\nmethod(); // ?\n```\n\nThis should print \"Jane\", so the object returned by `.sayName` somehow needs to\nremember the instance it was accessed from when it later gets called. 
In jlox,\nwe implemented that \"memory\" using the interpreter's existing heap-allocated\nEnvironment class, which handled all variable storage.\n\nOur bytecode VM has a more complex architecture for storing state. [Local\nvariables and temporaries][locals] are on the stack, [globals][] are in a hash\ntable, and variables in closures use [upvalues][]. That necessitates a somewhat\nmore complex solution for tracking a method's receiver in clox, and a new\nruntime type.\n\n[locals]: local-variables.html#representing-local-variables\n[globals]: global-variables.html#variable-declarations\n[upvalues]: closures.html#upvalues\n\n### Bound methods\n\nWhen the user executes a method access, we'll find the closure for that method\nand wrap it in a new <span name=\"bound\">\"bound method\"</span> object that tracks\nthe instance that the method was accessed from. This bound object can be called\nlater like a function. When invoked, the VM will do some shenanigans to wire up\n`this` to point to the receiver inside the method's body.\n\n<aside name=\"bound\">\n\nI took the name \"bound method\" from CPython. Python behaves similarly to Lox here,\nand I used its implementation for inspiration.\n\n</aside>\n\nHere's the new object type:\n\n^code obj-bound-method (2 before, 1 after)\n\nIt wraps the receiver and the method closure together. The receiver's type is\nValue even though methods can be called only on ObjInstances. Since the VM\ndoesn't care what kind of receiver it has anyway, using Value means we don't\nhave to keep converting the pointer back to a Value when it gets passed to more\ngeneral functions.\n\nThe new struct implies the usual boilerplate you're used to by now. 
A new case\nin the object type enum:\n\n^code obj-type-bound-method (1 before, 1 after)\n\nA macro to check a value's type:\n\n^code is-bound-method (2 before, 1 after)\n\nAnother macro to cast the value to an ObjBoundMethod pointer:\n\n^code as-bound-method (2 before, 1 after)\n\nA function to create a new ObjBoundMethod:\n\n^code new-bound-method-h (2 before, 1 after)\n\nAnd an implementation of that function here:\n\n^code new-bound-method\n\nThe constructor-like function simply stores the given closure and receiver. When\nthe bound method is no longer needed, we free it.\n\n^code free-bound-method (1 before, 1 after)\n\nThe bound method has a couple of references, but it doesn't *own* them, so it\nfrees nothing but itself. However, those references do get traced by the garbage\ncollector.\n\n^code blacken-bound-method (1 before, 1 after)\n\nThis <span name=\"trace\">ensures</span> that a handle to a method keeps the\nreceiver around in memory so that `this` can still find the object when you\ninvoke the handle later. We also trace the method closure.\n\n<aside name=\"trace\">\n\nTracing the method closure isn't really necessary. The receiver is an\nObjInstance, which has a pointer to its ObjClass, which has a table for all of\nthe methods. But it feels dubious to me in some vague way to have ObjBoundMethod\nrely on that.\n\n</aside>\n\nThe last operation all objects support is printing.\n\n^code print-bound-method (1 before, 1 after)\n\nA bound method prints exactly the same way as a function. From the user's\nperspective, a bound method *is* a function. It's an object they can call. We\ndon't expose that the VM implements bound methods using a different object type.\n\n<aside name=\"party\">\n\n<img src=\"image/methods-and-initializers/party-hat.png\" alt=\"A party hat.\" />\n\n</aside>\n\nPut on your <span name=\"party\">party</span> hat because we just reached a little\nmilestone. ObjBoundMethod is the very last runtime type to add to clox. 
You've\nwritten your last `IS_` and `AS_` macros. We're only a few chapters from the end\nof the book, and we're getting close to a complete VM.\n\n### Accessing methods\n\nLet's get our new object type doing something. Methods are accessed using the\nsame \"dot\" property syntax we implemented in the last chapter. The compiler\nalready parses the right expressions and emits `OP_GET_PROPERTY` instructions\nfor them. The only changes we need to make are in the runtime.\n\nWhen a property access instruction executes, the instance is on top of the\nstack. The instruction's job is to find a field or method with the given name\nand replace the top of the stack with the accessed property.\n\nThe interpreter already handles fields, so we simply extend the\n`OP_GET_PROPERTY` case with another section.\n\n^code get-method (5 before, 1 after)\n\nWe insert this after the code to look up a field on the receiver instance.\nFields take priority over and shadow methods, so we look for a field first. If\nthe instance does not have a field with the given property name, then the name\nmay refer to a method.\n\nWe take the instance's class and pass it to a new `bindMethod()` helper. If that\nfunction finds a method, it places the method on the stack and returns `true`.\nOtherwise it returns `false` to indicate a method with that name couldn't be\nfound. Since the name also wasn't a field, that means we have a runtime error,\nwhich aborts the interpreter.\n\nHere is the good stuff:\n\n^code bind-method\n\nFirst we look for a method with the given name in the class's method table. If\nwe don't find one, we report a runtime error and bail out. Otherwise, we take\nthe method and wrap it in a new ObjBoundMethod. We grab the receiver from its\nhome on top of the stack. 
Finally, we pop the instance and replace the top of\nthe stack with the bound method.\n\nFor example:\n\n```lox\nclass Brunch {\n  eggs() {}\n}\n\nvar brunch = Brunch();\nvar eggs = brunch.eggs;\n```\n\nHere is what happens when the VM executes the `bindMethod()` call for the\n`brunch.eggs` expression:\n\n<img src=\"image/methods-and-initializers/bind-method.png\" alt=\"The stack changes caused by bindMethod().\" />\n\nThat's a lot of machinery under the hood, but from the user's perspective, they\nsimply get a function that they can call.\n\n### Calling methods\n\nUsers can declare methods on classes, access them on instances, and get bound\nmethods onto the stack. They just can't <span name=\"do\">*do*</span> anything\nuseful with those bound method objects. The operation we're missing is calling\nthem. Calls are implemented in `callValue()`, so we add a case there for the new\nobject type.\n\n<aside name=\"do\">\n\nA bound method *is* a first-class value, so they can store it in variables, pass\nit to functions, and otherwise do \"value\"-y stuff with it.\n\n</aside>\n\n^code call-bound-method (1 before, 1 after)\n\nWe pull the raw closure back out of the ObjBoundMethod and use the existing\n`call()` helper to begin an invocation of that closure by pushing a CallFrame\nfor it onto the call stack. That's all it takes to be able to run this Lox\nprogram:\n\n```lox\nclass Scone {\n  topping(first, second) {\n    print \"scone with \" + first + \" and \" + second;\n  }\n}\n\nvar scone = Scone();\nscone.topping(\"berries\", \"cream\");\n```\n\nThat's three big steps. We can declare, access, and invoke methods. But\nsomething is missing. We went to all that trouble to wrap the method closure in\nan object that binds the receiver, but when we invoke the method, we don't use\nthat receiver at all.\n\n## This\n\nThe reason bound methods need to keep hold of the receiver is so that it can be\naccessed inside the body of the method. 
Lox exposes a method's receiver through\n`this` expressions. It's time for some new syntax. The lexer already treats\n`this` as a special token type, so the first step is wiring that token up in the\nparse table.\n\n^code table-this (1 before, 1 after)\n\n<aside name=\"this\">\n\nThe underscore at the end of the name of the parser function is because `this`\nis a reserved word in C++ and we support compiling clox as C++.\n\n</aside>\n\nWhen the parser encounters a `this` in prefix position, it dispatches to a new\nparser function.\n\n^code this\n\nWe'll apply the same implementation technique for `this` in clox that we used in\njlox. We treat `this` as a lexically scoped local variable whose value gets\nmagically initialized. Compiling it like a local variable means we get a lot of\nbehavior for free. In particular, closures inside a method that reference `this`\nwill do the right thing and capture the receiver in an upvalue.\n\nWhen the parser function is called, the `this` token has just been consumed and\nis stored as the previous token. We call our existing `variable()` function\nwhich compiles identifier expressions as variable accesses. It takes a single\nBoolean parameter for whether the compiler should look for a following `=`\noperator and parse a setter. You can't assign to `this`, so we pass `false` to\ndisallow that.\n\nThe `variable()` function doesn't care that `this` has its own token type and\nisn't an identifier. It is happy to treat the lexeme \"this\" as if it were a\nvariable name and then look it up using the existing scope resolution machinery.\nRight now, that lookup will fail because we never declared a variable whose name\nis \"this\". It's time to think about where the receiver should live in memory.\n\nAt least until they get captured by closures, clox stores every local variable\non the VM's stack. The compiler keeps track of which slots in the function's\nstack window are owned by which local variables. 
If you recall, the compiler\nsets aside stack slot zero by declaring a local variable whose name is an empty\nstring.\n\nFor function calls, that slot ends up holding the function being called. Since\nthe slot has no name, the function body never accesses it. You can guess where\nthis is going. For *method* calls, we can repurpose that slot to store the\nreceiver. Slot zero will store the instance that `this` is bound to. In order to\ncompile `this` expressions, the compiler simply needs to give the correct name\nto that local variable.\n\n^code slot-zero (1 before, 1 after)\n\nWe want to do this only for methods. Function declarations don't have a `this`.\nAnd, in fact, they *must not* declare a variable named \"this\", so that if you\nwrite a `this` expression inside a function declaration which is itself inside a\nmethod, the `this` correctly resolves to the outer method's receiver.\n\n```lox\nclass Nested {\n  method() {\n    fun function() {\n      print this;\n    }\n\n    function();\n  }\n}\n\nNested().method();\n```\n\nThis program should print \"Nested instance\". To decide what name to give to\nlocal slot zero, the compiler needs to know whether it's compiling a function or\nmethod declaration, so we add a new case to our FunctionType enum to distinguish\nmethods.\n\n^code method-type-enum (1 before, 1 after)\n\nWhen we compile a method, we use that type.\n\n^code method-type (2 before, 1 after)\n\nNow we can correctly compile references to the special \"this\" variable, and the\ncompiler will emit the right `OP_GET_LOCAL` instructions to access it. Closures\ncan even capture `this` and store the receiver in upvalues. Pretty cool.\n\nExcept that at runtime, the receiver isn't actually *in* slot zero. The\ninterpreter isn't holding up its end of the bargain yet. 
Here is the fix:\n\n^code store-receiver (2 before, 2 after)\n\nWhen a method is called, the top of the stack contains all of the arguments, and\nthen just under those is the closure of the called method. That's where slot\nzero in the new CallFrame will be. This line of code inserts the receiver into\nthat slot. For example, given a method call like this:\n\n```lox\nscone.topping(\"berries\", \"cream\");\n```\n\nWe calculate the slot to store the receiver like so:\n\n<img src=\"image/methods-and-initializers/closure-slot.png\" alt=\"Skipping over the argument stack slots to find the slot containing the closure.\" />\n\nThe `-argCount` skips past the arguments and the `- 1` adjusts for the fact that\n`stackTop` points just *past* the last used stack slot.\n\n### Misusing this\n\nOur VM now supports users *correctly* using `this`, but we also need to make\nsure it properly handles users *mis*using `this`. Lox says it is a compile\nerror for a `this` expression to appear outside of the body of a method. These\ntwo wrong uses should be caught by the compiler:\n\n```lox\nprint this; // At top level.\n\nfun notMethod() {\n  print this; // In a function.\n}\n```\n\nSo how does the compiler know if it's inside a method? The obvious answer is to\nlook at the FunctionType of the current Compiler. We did just add an enum case\nthere to treat methods specially. However, that wouldn't correctly handle code\nlike the earlier example where you are inside a function which is, itself,\nnested inside a method.\n\nWe could try to resolve \"this\" and then report an error if it wasn't found in\nany of the surrounding lexical scopes. That would work, but would require us to\nshuffle around a bunch of code, since right now the code for resolving a\nvariable implicitly considers it a global access if no declaration is found.\n\nIn the next chapter, we will need information about the nearest enclosing class.\nIf we had that, we could use it here to determine if we are inside a method. 
So\nwe may as well make our future selves' lives a little easier and put that\nmachinery in place now.\n\n^code current-class (1 before, 2 after)\n\nThis module variable points to a struct representing the current, innermost\nclass being compiled. The new type looks like this:\n\n^code class-compiler-struct (1 before, 2 after)\n\nRight now we store only a pointer to the ClassCompiler for the enclosing class,\nif any. Nesting a class declaration inside a method in some other class is an\nuncommon thing to do, but Lox supports it. Just like the Compiler struct, this\nmeans ClassCompiler forms a linked list from the current innermost class being\ncompiled out through all of the enclosing classes.\n\nIf we aren't inside any class declaration at all, the module variable\n`currentClass` is `NULL`. When the compiler begins compiling a class, it pushes\na new ClassCompiler onto that implicit linked stack.\n\n^code create-class-compiler (2 before, 1 after)\n\nThe memory for the ClassCompiler struct lives right on the C stack, a handy\ncapability we get by writing our compiler using recursive descent. At the end of\nthe class body, we pop that compiler off the stack and restore the enclosing\none.\n\n^code pop-enclosing (1 before, 1 after)\n\nWhen an outermost class body ends, `enclosing` will be `NULL`, so this resets\n`currentClass` to `NULL`. Thus, to see if we are inside a class -- and therefore\ninside a method -- we simply check that module variable.\n\n^code this-outside-class (1 before, 1 after)\n\nWith that, `this` outside of a class is correctly forbidden. Now our methods\nreally feel like *methods* in the object-oriented sense. Accessing the receiver\nlets them affect the instance you called the method on. We're getting there!\n\n## Instance Initializers\n\nThe reason object-oriented languages tie state and behavior together -- one of\nthe core tenets of the paradigm -- is to ensure that objects are always in a\nvalid, meaningful state. 
When the only way to touch an object's state is <span\nname=\"through\">through</span> its methods, the methods can make sure nothing\ngoes awry. But that presumes the object is *already* in a proper state. What\nabout when it's first created?\n\n<aside name=\"through\">\n\nOf course, Lox does let outside code directly access and modify an instance's\nfields without going through its methods. This is unlike Ruby and Smalltalk,\nwhich completely encapsulate state inside objects. Our toy scripting language,\nalas, isn't so principled.\n\n</aside>\n\nObject-oriented languages ensure that brand new objects are properly set up\nthrough constructors, which both produce a new instance and initialize its\nstate. In Lox, the runtime allocates new raw instances, and a class may declare\nan initializer to set up any fields. Initializers work mostly like normal\nmethods, with a few tweaks:\n\n1.  The runtime automatically invokes the initializer method whenever an\n    instance of a class is created.\n\n2.  The caller that constructs an instance always gets the instance <span\n    name=\"return\">back</span> after the initializer finishes, regardless of what\n    the initializer function itself returns. The initializer method doesn't need\n    to explicitly return `this`.\n\n3.  In fact, an initializer is *prohibited* from returning any value at all\n    since the value would never be seen anyway.\n\n<aside name=\"return\">\n\nIt's as if the initializer is implicitly wrapped in a bundle of code like this:\n\n```lox\nfun create(klass) {\n  var obj = newInstance(klass);\n  obj.init();\n  return obj;\n}\n```\n\nNote how the value returned by `init()` is discarded.\n\n</aside>\n\nNow that we support methods, to add initializers, we merely need to implement\nthose three special rules. 
We'll go in order.\n\n### Invoking initializers\n\nFirst, automatically calling `init()` on new instances:\n\n^code call-init (1 before, 1 after)\n\nAfter the runtime allocates the new instance, we look for an `init()` method on\nthe class. If we find one, we initiate a call to it. This pushes a new CallFrame\nfor the initializer's closure. Say we run this program:\n\n```lox\nclass Brunch {\n  init(food, drink) {}\n}\n\nBrunch(\"eggs\", \"coffee\");\n```\n\nWhen the VM executes the call to `Brunch()`, it goes like this:\n\n<img src=\"image/methods-and-initializers/init-call-frame.png\" alt=\"The aligned stack windows for the Brunch() call and the corresponding init() method it forwards to.\" />\n\nAny arguments passed to the class when we called it are still sitting on the\nstack above the instance. The new CallFrame for the `init()` method shares that\nstack window, so those arguments implicitly get forwarded to the initializer.\n\nLox doesn't require a class to define an initializer. If omitted, the runtime\nsimply returns the new uninitialized instance. However, if there is no `init()`\nmethod, then it doesn't make any sense to pass arguments to the class when\ncreating the instance. We make that an error.\n\n^code no-init-arity-error (1 before, 1 after)\n\nWhen the class *does* provide an initializer, we also need to ensure that the\nnumber of arguments passed matches the initializer's arity. Fortunately, the\n`call()` helper does that for us already.\n\nTo call the initializer, the runtime looks up the `init()` method by name. We\nwant that to be fast since it happens every time an instance is constructed.\nThat means it would be good to take advantage of the string interning we've\nalready implemented. To do that, the VM creates an ObjString for \"init\" and\nreuses it. 
The string lives right in the VM struct.\n\n^code vm-init-string (1 before, 1 after)\n\nWe create and intern the string when the VM boots up.\n\n^code init-init-string (1 before, 2 after)\n\nWe want it to stick around, so the GC considers it a root.\n\n^code mark-init-string (1 before, 1 after)\n\nLook carefully. See any bug waiting to happen? No? It's a subtle one. The\ngarbage collector now reads `vm.initString`. That field is initialized from the\nresult of calling `copyString()`. But copying a string allocates memory, which\ncan trigger a GC. If the collector ran at just the wrong time, it would read\n`vm.initString` before it had been initialized. So, first we zero the field out.\n\n^code null-init-string (2 before, 2 after)\n\nWe clear the pointer when the VM shuts down since the next line will free it.\n\n^code clear-init-string (1 before, 1 after)\n\nOK, that lets us call initializers.\n\n### Initializer return values\n\nThe next step is ensuring that constructing an instance of a class with an\ninitializer always returns the new instance, and not `nil` or whatever the body\nof the initializer returns. Right now, if a class defines an initializer, then\nwhen an instance is constructed, the VM pushes a call to that initializer onto\nthe CallFrame stack. Then it just keeps on trucking.\n\nThe user's invocation on the class to create the instance will complete whenever\nthat initializer method returns, and will leave on the stack whatever value the\ninitializer puts there. That means that unless the user takes care to put\n`return this;` at the end of the initializer, no instance will come out. Not\nvery helpful.\n\nTo fix this, whenever the front end compiles an initializer method, it will emit\ndifferent bytecode at the end of the body to return `this` from the method\ninstead of the usual implicit `nil` most functions return. 
In order to do\n*that*, the compiler needs to actually know when it is compiling an initializer.\nWe detect that by checking to see if the name of the method we're compiling is\n\"init\".\n\n^code initializer-name (1 before, 1 after)\n\nWe define a new function type to distinguish initializers from other methods.\n\n^code initializer-type-enum (1 before, 1 after)\n\nWhenever the compiler emits the implicit return at the end of a body, we check\nthe type to decide whether to insert the initializer-specific behavior.\n\n^code return-this (1 before, 1 after)\n\nIn an initializer, instead of pushing `nil` onto the stack before returning,\nwe load slot zero, which contains the instance. This `emitReturn()` function is\nalso called when compiling a `return` statement without a value, so this also\ncorrectly handles cases where the user does an early return inside the\ninitializer.\n\n### Incorrect returns in initializers\n\nThe last step, the last item in our list of special features of initializers, is\nmaking it an error to try to return anything *else* from an initializer. Now\nthat the compiler tracks the method type, this is straightforward.\n\n^code return-from-init (3 before, 1 after)\n\nWe report an error if a `return` statement in an initializer has a value. 
We\nstill go ahead and compile the value afterwards so that the compiler doesn't get\nconfused by the trailing expression and report a bunch of cascaded errors.\n\nAside from inheritance, which we'll get to [soon][super], we now have a\nfairly full-featured class system working in clox.\n\n```lox\nclass CoffeeMaker {\n  init(coffee) {\n    this.coffee = coffee;\n  }\n\n  brew() {\n    print \"Enjoy your cup of \" + this.coffee;\n\n    // No reusing the grounds!\n    this.coffee = nil;\n  }\n}\n\nvar maker = CoffeeMaker(\"coffee and chicory\");\nmaker.brew();\n```\n\nPretty fancy for a C program that would fit on an old <span\nname=\"floppy\">floppy</span> disk.\n\n<aside name=\"floppy\">\n\nI acknowledge that \"floppy disk\" may no longer be a useful size reference for\ncurrent generations of programmers. Maybe I should have said \"a few tweets\" or\nsomething.\n\n</aside>\n\n## Optimized Invocations\n\nOur VM correctly implements the language's semantics for method calls and\ninitializers. We could stop here. But the main reason we are building an entire\nsecond implementation of Lox from scratch is to execute faster than our old Java\ninterpreter. Right now, method calls even in clox are slow.\n\nLox's semantics define a method invocation as two operations -- accessing the\nmethod and then calling the result. Our VM must support those as separate\noperations because the user *can* separate them. You can access a method without\ncalling it and then invoke the bound method later. Nothing we've implemented so\nfar is unnecessary.\n\nBut *always* executing those as separate operations has a significant cost.\nEvery single time a Lox program accesses and invokes a method, the runtime\nheap allocates a new ObjBoundMethod, initializes its fields, then pulls them\nright back out. 
Later, the GC has to spend time freeing all of those ephemeral\nbound methods.\n\nMost of the time, a Lox program accesses a method and then immediately calls it.\nThe bound method is created by one bytecode instruction and then consumed by the\nvery next one. In fact, it's so immediate that the compiler can even textually\n*see* that it's happening -- a dotted property access followed by an opening\nparenthesis is most likely a method call.\n\nSince we can recognize this pair of operations at compile time, we have the\nopportunity to emit a <span name=\"super\">new, special</span> instruction that\nperforms an optimized method call.\n\nWe start in the function that compiles dotted property expressions.\n\n<aside name=\"super\" class=\"bottom\">\n\nIf you spend enough time watching your bytecode VM run, you'll notice it often\nexecutes the same series of bytecode instructions one after the other. A classic\noptimization technique is to define a new single instruction called a\n**superinstruction** that fuses those into a single instruction with the same\nbehavior as the entire sequence.\n\nOne of the largest performance drains in a bytecode interpreter is the overhead\nof decoding and dispatching each instruction. Fusing several instructions into\none eliminates some of that.\n\nThe challenge is determining *which* instruction sequences are common enough to\nbenefit from this optimization. Every new superinstruction claims an opcode for\nits own use and there are only so many of those to go around. Add too many, and\nyou'll need a larger encoding for opcodes, which then increases code size and\nmakes decoding *all* instructions slower.\n\n</aside>\n\n^code parse-call (3 before, 1 after)\n\nAfter the compiler has parsed the property name, we look for a left parenthesis.\nIf we match one, we switch to a new code path. There, we compile the argument\nlist exactly like we do when compiling a call expression. Then we emit a single\nnew `OP_INVOKE` instruction. 
It takes two operands:\n\n1.  The index of the property name in the constant table.\n\n2.  The number of arguments passed to the method.\n\nIn other words, this single instruction combines the operands of the\n`OP_GET_PROPERTY` and `OP_CALL` instructions it replaces, in that order. It\nreally is a fusion of those two instructions. Let's define it.\n\n^code invoke-op (1 before, 1 after)\n\nAnd add it to the disassembler:\n\n^code disassemble-invoke (2 before, 1 after)\n\nThis is a new, special instruction format, so it needs a little custom\ndisassembly logic.\n\n^code invoke-instruction\n\nWe read the two operands and then print out both the method name and the\nargument count. Over in the interpreter's bytecode dispatch loop is where the\nreal action begins.\n\n^code interpret-invoke (1 before, 1 after)\n\nMost of the work happens in `invoke()`, which we'll get to. Here, we look up the\nmethod name from the first operand and then read the argument count operand.\nThen we hand off to `invoke()` to do the heavy lifting. That function returns\n`true` if the invocation succeeds. As usual, a `false` return means a runtime\nerror occurred. We check for that here and abort the interpreter if disaster has\nstruck.\n\nFinally, assuming the invocation succeeded, then there is a new CallFrame on the\nstack, so we refresh our cached copy of the current frame in `frame`.\n\nThe interesting work happens here:\n\n^code invoke\n\nFirst we grab the receiver off the stack. The arguments passed to the method are\nabove it on the stack, so we peek that many slots down. Then it's a simple\nmatter to cast the object to an instance and invoke the method on it.\n\nThat does assume the object *is* an instance. 
As with `OP_GET_PROPERTY`\ninstructions, we also need to handle the case where a user incorrectly tries to\ncall a method on a value of the wrong type.\n\n^code invoke-check-type (1 before, 1 after)\n\n<span name=\"helper\">That's</span> a runtime error, so we report that and bail\nout. Otherwise, we get the instance's class and jump over to this other new\nutility function:\n\n<aside name=\"helper\">\n\nAs you can guess by now, we split this code into a separate function because\nwe're going to reuse it later -- in this case for `super` calls.\n\n</aside>\n\n^code invoke-from-class\n\nThis function combines the logic of how the VM implements `OP_GET_PROPERTY` and\n`OP_CALL` instructions, in that order. First we look up the method by name in\nthe class's method table. If we don't find one, we report that runtime error and\nexit.\n\nOtherwise, we take the method's closure and push a call to it onto the CallFrame\nstack. We don't need to heap allocate and initialize an ObjBoundMethod. In fact,\nwe don't even need to <span name=\"juggle\">juggle</span> anything on the stack.\nThe receiver and method arguments are already right where they need to be.\n\n<aside name=\"juggle\">\n\nThis is a key reason *why* we use stack slot zero to store the receiver -- it's\nhow the caller already organizes the stack for a method call. An efficient\ncalling convention is an important part of a bytecode VM's performance story.\n\n</aside>\n\nIf you fire up the VM and run a little program that calls methods now, you\nshould see the exact same behavior as before. But, if we did our job right, the\n*performance* should be much improved. I wrote a little microbenchmark that\ndoes a batch of 10,000 method calls. Then it tests how many of these batches it\ncan execute in 10 seconds. On my computer, without the new `OP_INVOKE`\ninstruction, it got through 1,089 batches. With this new optimization, it\nfinished 8,324 batches in the same time. 
That's *7.6 times faster*, which is a\nhuge improvement when it comes to programming language optimization.\n\n<span name=\"pat\"></span>\n\n<aside name=\"pat\">\n\nWe shouldn't pat ourselves on the back *too* firmly. This performance\nimprovement is relative to our own unoptimized method call implementation which\nwas quite slow. Doing a heap allocation for every single method call isn't going\nto win any races.\n\n</aside>\n\n<img src=\"image/methods-and-initializers/benchmark.png\" alt=\"Bar chart comparing the two benchmark results.\" />\n\n### Invoking fields\n\nThe fundamental creed of optimization is: \"Thou shalt not break correctness.\"\n<span name=\"monte\">Users</span> like it when a language implementation gives\nthem an answer faster, but only if it's the *right* answer. Alas, our\nimplementation of faster method invocations fails to uphold that principle:\n\n```lox\nclass Oops {\n  init() {\n    fun f() {\n      print \"not a method\";\n    }\n\n    this.field = f;\n  }\n}\n\nvar oops = Oops();\noops.field();\n```\n\nThe last line looks like a method call. The compiler thinks that it is and\ndutifully emits an `OP_INVOKE` instruction for it. However, it's not. What is\nactually happening is a *field* access that returns a function which then gets\ncalled. Right now, instead of executing that correctly, our VM reports a runtime\nerror when it can't find a method named \"field\".\n\n<aside name=\"monte\">\n\nThere are cases where users may be satisfied when a program sometimes returns\nthe wrong answer in return for running significantly faster or with a better\nbound on the performance. These are the field of [**Monte Carlo\nalgorithms**][monte]. For some use cases, this is a good trade-off.\n\n[monte]: https://en.wikipedia.org/wiki/Monte_Carlo_algorithm\n\nThe important part, though, is that the user is *choosing* to apply one of these\nalgorithms. 
We language implementers can't unilaterally decide to sacrifice\ntheir program's correctness.\n\n</aside>\n\nEarlier, when we implemented `OP_GET_PROPERTY`, we handled both field and method\naccesses. To squash this new bug, we need to do the same thing for `OP_INVOKE`.\n\n^code invoke-field (1 before, 1 after)\n\nPretty simple fix. Before looking up a method on the instance's class, we look\nfor a field with the same name. If we find a field, then we store it on the\nstack in place of the receiver, *under* the argument list. This is how\n`OP_GET_PROPERTY` behaves since the latter instruction executes before a\nsubsequent parenthesized list of arguments has been evaluated.\n\nThen we try to call that field's value like the callable that it hopefully is.\nThe `callValue()` helper will check the value's type and call it as appropriate\nor report a runtime error if the field's value isn't a callable type like a\nclosure.\n\nThat's all it takes to make our optimization fully safe. We do sacrifice a\nlittle performance, unfortunately. But that's the price you have to pay\nsometimes. You occasionally get frustrated by optimizations you *could* do if\nonly the language wouldn't allow some annoying corner case. But, as language\n<span name=\"designer\">implementers</span>, we have to play the game we're given.\n\n<aside name=\"designer\">\n\nAs language *designers*, our role is very different. If we do control the\nlanguage itself, we may sometimes choose to restrict or change the language in\nways that enable optimizations. Users want expressive languages, but they also\nwant fast implementations. Sometimes it is good language design to sacrifice a\nlittle power if you can give them perf in return.\n\n</aside>\n\nThe code we wrote here follows a typical pattern in optimization:\n\n1.  Recognize a common operation or sequence of operations that is performance\n    critical. In this case, it is a method access followed by a call.\n\n2.  
Add an optimized implementation of that pattern. That's our `OP_INVOKE`\n    instruction.\n\n3.  Guard the optimized code with some conditional logic that validates that the\n    pattern actually applies. If it does, stay on the fast path. Otherwise, fall\n    back to a slower but more robust unoptimized behavior. Here, that means\n    checking that we are actually calling a method and not accessing a field.\n\nAs your language work moves from getting the implementation working *at all* to\ngetting it to work *faster*, you will find yourself spending more and more\ntime looking for patterns like this and adding guarded optimizations for them.\nFull-time VM engineers spend much of their careers in this loop.\n\nBut we can stop here for now. With this, clox now supports most of the features\nof an object-oriented programming language, and with respectable performance.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  The hash table lookup to find a class's `init()` method is constant time,\n    but still fairly slow. Implement something faster. Write a benchmark and\n    measure the performance difference.\n\n1.  In a dynamically typed language like Lox, a single callsite may invoke a\n    variety of methods on a number of classes throughout a program's execution.\n    Even so, in practice, most of the time a callsite ends up calling the exact\n    same method on the exact same class for the duration of the run. Most calls\n    are actually not polymorphic even if the language says they can be.\n\n    How do advanced language implementations optimize based on that observation?\n\n1.  When interpreting an `OP_INVOKE` instruction, the VM has to do two hash\n    table lookups. First, it looks for a field that could shadow a method, and\n    only if that fails does it look for a method. The former check is rarely\n    useful -- most fields do not contain functions. 
But it is *necessary*\n    because the language says fields and methods are accessed using the same\n    syntax, and fields shadow methods.\n\n    That is a language *choice* that affects the performance of our\n    implementation. Was it the right choice? If Lox were your language, what\n    would you do?\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: Novelty Budget\n\nI still remember the first time I wrote a tiny BASIC program on a TRS-80 and\nmade a computer do something it hadn't done before. It felt like a superpower.\nThe first time I cobbled together just enough of a parser and interpreter to let\nme write a tiny program in *my own language* that made a computer do a thing was\nlike some sort of higher-order meta-superpower. It was and remains a wonderful\nfeeling.\n\nI realized I could design a language that looked and behaved however I chose. It\nwas like I'd been going to a private school that required uniforms my whole life\nand then one day transferred to a public school where I could wear whatever I\nwanted. I don't need to use curly braces for blocks? I can use something other\nthan an equals sign for assignment? I can do objects without classes? Multiple\ninheritance *and* multimethods? A dynamic language that overloads statically, by\narity?\n\nNaturally, I took that freedom and ran with it. I made the weirdest, most\narbitrary language design decisions. Apostrophes for generics. No commas between\narguments. Overload resolution that can fail at runtime. I did things\ndifferently just for difference's sake.\n\nThis is a very fun experience that I highly recommend. We need more weird,\navant-garde programming languages. I want to see more art languages. I still\nmake oddball toy languages for fun sometimes.\n\n*However*, if your goal is success where \"success\" is defined as a large number\nof users, then your priorities must be different. 
In that case, your primary\ngoal is to have your language loaded into the brains of as many people as\npossible. That's *really hard*. It takes a lot of human effort to move a\nlanguage's syntax and semantics from a computer into trillions of neurons.\n\nProgrammers are naturally conservative with their time and cautious about what\nlanguages are worth uploading into their wetware. They don't want to waste their\ntime on a language that ends up not being useful to them. As a language\ndesigner, your goal is thus to give them as much language power as you can with\nas little required learning as possible.\n\nOne natural approach is *simplicity*. The fewer concepts and features your\nlanguage has, the less total volume of stuff there is to learn. This is one of\nthe reasons minimal <span name=\"dynamic\">scripting</span> languages often find\nsuccess even though they aren't as powerful as the big industrial languages --\nthey are easier to get started with, and once they are in someone's brain, the\nuser wants to keep using them.\n\n<aside name=\"dynamic\">\n\nIn particular, this is a big advantage of dynamically typed languages. A static\nlanguage requires you to learn *two* languages -- the runtime semantics and the\nstatic type system -- before you can get to the point where you are making the\ncomputer do stuff. Dynamic languages require you to learn only the former.\n\nEventually, programs get big enough that the value of static analysis pays for\nthe effort to learn that second static language, but the value proposition isn't\nas obvious at the outset.\n\n</aside>\n\nThe problem with simplicity is that simply cutting features often sacrifices\npower and expressiveness. There is an art to finding features that punch above\ntheir weight, but often minimal languages simply do less.\n\nThere is another path that avoids much of that problem. 
The trick is to realize\nthat a user doesn't have to load your entire language into their head, *just the\npart they don't already have in there*. As I mentioned in an [earlier design\nnote][note], learning is about transferring the *delta* between what they\nalready know and what they need to know.\n\n[note]: parsing-expressions.html#design-note\n\nMany potential users of your language already know some other programming\nlanguage. Any features your language shares with that language are essentially\n\"free\" when it comes to learning. They're already in their head; users just have to\nrecognize that your language does the same thing.\n\nIn other words, *familiarity* is another key tool to lower the adoption cost of\nyour language. Of course, if you fully maximize that attribute, the end result\nis a language that is completely identical to some existing one. That's not a\nrecipe for success, because at that point there's no incentive for users to\nswitch to your language at all.\n\nSo you do need to provide some compelling differences. Some things your language\ncan do that other languages can't, or at least can't do as well. I believe this\nis one of the fundamental balancing acts of language design: similarity to other\nlanguages lowers learning cost, while divergence raises the compelling\nadvantages.\n\nI think of this balancing act in terms of a <span name=\"idiosyncracy\">**novelty\nbudget**</span>, or as Steve Klabnik calls it, a \"[strangeness budget][]\". Users\nhave a low threshold for the total amount of new stuff they are willing to\naccept to learn a new language. Exceed that, and they won't show up.\n\n[strangeness budget]: https://words.steveklabnik.com/the-language-strangeness-budget\n\n<aside name=\"idiosyncracy\">\n\nA related concept in psychology is [**idiosyncrasy credit**][idiosyncracy], the\nidea that other people in society grant you a finite number of deviations from\nsocial norms. 
You earn credit by fitting in and doing in-group things, which you\ncan then spend on oddball activities that might otherwise raise eyebrows. In\nother words, demonstrating that you are \"one of the good ones\" gives you license\nto raise your freak flag, but only so far.\n\n[idiosyncracy]: https://en.wikipedia.org/wiki/Idiosyncrasy_credit\n\n</aside>\n\nAnytime you add something new to your language that other languages don't have,\nor anytime your language does something other languages do in a different way,\nyou spend some of that budget. That's OK -- you *need* to spend it to make your\nlanguage compelling. But your goal is to spend it *wisely*. For each feature or\ndifference, ask yourself how much compelling power it adds to your language and\nthen evaluate critically whether it pays its way. Is the change so valuable that\nit is worth blowing some of your novelty budget?\n\nIn practice, I find this means that you end up being pretty conservative with\nsyntax and more adventurous with semantics. As fun as it is to put on a new\nchange of clothes, swapping out curly braces with some other block delimiter is\nvery unlikely to add much real power to the language, but it does spend some\nnovelty. It's hard for syntax differences to carry their weight.\n\nOn the other hand, new semantics can significantly increase the power of the\nlanguage. Multimethods, mixins, traits, reflection, dependent types, runtime\nmetaprogramming, etc. can radically level up what a user can do with the\nlanguage.\n\nAlas, being conservative like this is not as fun as just changing everything.\nBut it's up to you to decide whether you want to chase mainstream success or not\nin the first place. We don't all need to be radio-friendly pop bands. If you\nwant your language to be like free jazz or drone metal and are happy with the\nproportionally smaller (but likely more devoted) audience size, go for it.\n\n</div>\n"
  },
  {
    "path": "book/optimization.md",
    "content": "> The evening's the best part of the day. You've done your day's work. Now you\n> can put your feet up and enjoy it.\n>\n> <cite>Kazuo Ishiguro, <em>The Remains of the Day</em></cite>\n\nIf I still lived in New Orleans, I'd call this chapter a *lagniappe*, a little\nsomething extra given for free to a customer. You've got a whole book and a\ncomplete virtual machine already, but I want you to have some more fun hacking\non clox. This time, we're going for pure performance. We'll apply two very\ndifferent optimizations to our virtual machine.  In the process, you'll get a\nfeel for measuring and improving the performance of a language implementation --\nor any program, really.\n\n## Measuring Performance\n\n**Optimization** means taking a working application and improving its\nperformance. An optimized program does the same thing, it just takes less\nresources to do so. The resource we usually think of when optimizing is runtime\nspeed, but it can also be important to reduce memory usage, startup time,\npersistent storage size, or network bandwidth. All physical resources have some\ncost -- even if the cost is mostly in wasted human time -- so optimization work\noften pays off.\n\nThere was a time in the early days of computing that a skilled programmer could\nhold the entire hardware architecture and compiler pipeline in their head and\nunderstand a program's performance just by thinking real hard. Those days are\nlong gone, separated from the present by microcode, cache lines, branch\nprediction, deep compiler pipelines, and mammoth instruction sets. We like to\npretend C is a \"low-level\" language, but the stack of technology between\n\n```c\nprintf(\"Hello, world!\");\n```\n\nand a greeting appearing on screen is now perilously tall.\n\nOptimization today is an empirical science. Our program is a border collie\nsprinting through the hardware's obstacle course. 
If we want her to reach the\nend faster, we can't just sit and ruminate on canine physiology until\nenlightenment strikes. Instead, we need to *observe* her performance, see where\nshe stumbles, and then find faster paths for her to take.\n\nMuch like agility training is particular to one dog and one obstacle course, we\ncan't assume that our virtual machine optimizations will make *all* Lox programs\nrun faster on *all* hardware. Different Lox programs stress different areas of\nthe VM, and different architectures have their own strengths and weaknesses.\n\n### Benchmarks\n\nWhen we add new functionality, we validate correctness by writing tests -- Lox\nprograms that use a feature and validate the VM's behavior. Tests pin down\nsemantics and ensure we don't break existing features when we add new ones. We\nhave similar needs when it comes to performance:\n\n1.  How do we validate that an optimization *does* improve performance, and by\n    how much?\n\n2.  How do we ensure that other unrelated changes don't *regress* performance?\n\nThe Lox programs we write to accomplish those goals are **benchmarks**. These\nare carefully crafted programs that stress some part of the language\nimplementation. They measure not *what* the program does, but how <span\nname=\"much\">*long*</span> it takes to do it.\n\n<aside name=\"much\">\n\nMost benchmarks measure running time. But, of course, you'll eventually find\nyourself needing to write benchmarks that measure memory allocation, how much\ntime is spent in the garbage collector, startup time, etc.\n\n</aside>\n\nBy measuring the performance of a benchmark before and after a change, you can\nsee what your change does. 
When you land an optimization, all of the tests\nshould behave exactly the same as they did before, but hopefully the benchmarks\nrun faster.\n\nOnce you have an entire <span name=\"js\">*suite*</span> of benchmarks, you can\nmeasure not just *that* an optimization changes performance, but on which\n*kinds* of code. Often you'll find that some benchmarks get faster while others\nget slower. Then you have to make hard decisions about what kinds of code your\nlanguage implementation optimizes for.\n\nThe suite of benchmarks you choose to write is a key part of that decision. In\nthe same way that your tests encode your choices around what correct behavior\nlooks like, your benchmarks are the embodiment of your priorities when it comes\nto performance. They will guide which optimizations you implement, so choose\nyour benchmarks carefully, and don't forget to periodically reflect on whether\nthey are helping you reach your larger goals.\n\n<aside name=\"js\">\n\nIn the early proliferation of JavaScript VMs, the first widely used benchmark\nsuite was SunSpider from WebKit. During the browser wars, marketing folks used\nSunSpider results to claim their browser was fastest. That highly incentivized\nVM hackers to optimize to those benchmarks.\n\nUnfortunately, SunSpider programs often didn't match real-world JavaScript. They\nwere mostly microbenchmarks -- tiny toy programs that completed quickly. Those\nbenchmarks penalize complex just-in-time compilers that start off slower but get\n*much* faster once the JIT has had enough time to optimize and re-compile hot\ncode paths. This put VM hackers in the unfortunate position of having to choose\nbetween making the SunSpider numbers get better, or actually optimizing the\nkinds of programs real users ran.\n\nGoogle's V8 team responded by sharing their Octane benchmark suite, which was\ncloser to real-world code at the time. Years later, as JavaScript use patterns\ncontinued to evolve, even Octane outlived its usefulness. 
Expect that your\nbenchmarks will evolve as your language's ecosystem does.\n\nRemember, the ultimate goal is to make *user programs* faster, and benchmarks\nare only a proxy for that.\n\n</aside>\n\nBenchmarking is a subtle art. As with tests, you need to avoid overfitting to\nyour implementation while ensuring that the benchmark does actually tickle the\ncode paths that you care about. When you measure performance, you need to\ncompensate for variance caused by CPU throttling, caching, and other weird\nhardware and operating system quirks. I won't give you a whole sermon here,\nbut treat benchmarking as its own skill that improves with practice.\n\n### Profiling\n\nOK, so you've got a few benchmarks now. You want to make them go faster. Now\nwhat? First of all, let's assume you've done all the obvious, easy work. You are\nusing the right algorithms and data structures -- or, at least, you aren't using\nones that are aggressively wrong. I don't consider using a hash table instead of\na linear search through a huge unsorted array \"optimization\" so much as \"good\nsoftware engineering\".\n\nSince the hardware is too complex for us to reason about our program's performance from\nfirst principles, we have to go out into the field. That means *profiling*. A\n**profiler**, if you've never used one, is a tool that runs your <span\nname=\"program\">program</span> and tracks hardware resource use as the code\nexecutes. Simple ones show you how much time was spent in each function in your\nprogram. Sophisticated ones log data cache misses, instruction cache misses,\nbranch mispredictions, memory allocations, and all sorts of other metrics.\n\n<aside name=\"program\">\n\n\"Your program\" here means the Lox VM itself running some *other* Lox program. We\nare trying to optimize clox, not the user's Lox script. 
Of course, the choice of\nwhich Lox program to load into our VM will strongly affect which parts of clox get\nstressed, which is why benchmarks are so important.\n\nA profiler *won't* show us how much time is spent in each *Lox* function in the\nscript being run. We'd have to write our own \"Lox profiler\" to do that, which is\nslightly out of scope for this book.\n\n</aside>\n\nThere are many profilers out there for various operating systems and languages.\nOn whatever platform you program, it's worth getting familiar with a decent\nprofiler. You don't need to be a master. I have learned things within minutes of\nthrowing a program at a profiler that would have taken me *days* to discover on\nmy own through trial and error. Profilers are wonderful, magical tools.\n\n## Faster Hash Table Probing\n\nEnough pontificating, let's get some performance charts going up and to the\nright. The first optimization we'll do, it turns out, is about the *tiniest*\npossible change we could make to our VM.\n\nWhen I first got the bytecode virtual machine that clox is descended from\nworking, I did what any self-respecting VM hacker would do. I cobbled together a\ncouple of benchmarks, fired up a profiler, and ran those scripts through my\ninterpreter. 
In a dynamically typed language like Lox, a large fraction of user\ncode is field accesses and method calls, so one of my benchmarks looked\nsomething like this:\n\n```lox\nclass Zoo {\n  init() {\n    this.aardvark = 1;\n    this.baboon   = 1;\n    this.cat      = 1;\n    this.donkey   = 1;\n    this.elephant = 1;\n    this.fox      = 1;\n  }\n  ant()    { return this.aardvark; }\n  banana() { return this.baboon; }\n  tuna()   { return this.cat; }\n  hay()    { return this.donkey; }\n  grass()  { return this.elephant; }\n  mouse()  { return this.fox; }\n}\n\nvar zoo = Zoo();\nvar sum = 0;\nvar start = clock();\nwhile (sum < 100000000) {\n  sum = sum + zoo.ant()\n            + zoo.banana()\n            + zoo.tuna()\n            + zoo.hay()\n            + zoo.grass()\n            + zoo.mouse();\n}\n\nprint clock() - start;\nprint sum;\n```\n\n<aside name=\"sum\" class=\"bottom\">\n\nAnother thing this benchmark is careful to do is *use* the result of the code it\nexecutes. By calculating a rolling sum and printing the result, we ensure the VM\n*must* execute all that Lox code. This is an important habit. Unlike our simple\nLox VM, many compilers do aggressive dead code elimination and are smart enough\nto discard a computation whose result is never used.\n\nMany a programming language hacker has been impressed by the blazing performance\nof a VM on some benchmark, only to realize that it's because the compiler\noptimized the entire benchmark program away to nothing.\n\n</aside>\n\nIf you've never seen a benchmark before, this might seem ludicrous. *What* is\ngoing on here? The program itself doesn't intend to <span name=\"sum\">do</span>\nanything useful. What it does do is call a bunch of methods and access a bunch\nof fields since those are the parts of the language we're interested in. Fields\nand methods live in hash tables, so it takes care to populate at least a <span\nname=\"more\">*few*</span> interesting keys in those tables. 
That is all wrapped\nin a big loop to ensure our profiler has enough execution time to dig in and see\nwhere the cycles are going.\n\n<aside name=\"more\">\n\nIf you really want to benchmark hash table performance, you should use many\ntables of different sizes. The six keys we add to each table here aren't even\nenough to get over our hash table's eight-element minimum threshold. But I\ndidn't want to throw an enormous benchmark script at you. Feel free to add more\ncritters and treats if you like.\n\n</aside>\n\nBefore I tell you what my profiler showed me, spend a minute taking a few\nguesses. Where in clox's codebase do you think the VM spent most of its time? Is\nthere any code we've written in previous chapters that you suspect is\nparticularly slow?\n\nHere's what I found: Naturally, the function with the greatest inclusive time is\n`run()`. (**Inclusive time** means the total time spent in some function and all\nother functions it calls -- the total time between when you enter the function\nand when it returns.) Since `run()` is the main bytecode execution loop, it\ndrives everything.\n\nInside `run()`, there are small chunks of time sprinkled in various cases in the\nbytecode switch for common instructions like `OP_POP`, `OP_RETURN`, and\n`OP_ADD`. The big heavy instructions are `OP_GET_GLOBAL` with 17% of the\nexecution time, `OP_GET_PROPERTY` at 12%, and `OP_INVOKE` which takes a whopping\n42% of the total running time.\n\nSo we've got three hotspots to optimize? Actually, no. Because it turns out\nthose three instructions spend almost all of their time inside calls to the same\nfunction: `tableGet()`. That function claims a whole 72% of the execution time\n(again, inclusive). Now, in a dynamically typed language, we expect to spend a\nfair bit of time looking stuff up in hash tables -- it's sort of the price of\ndynamism. 
But, still, *wow.*\n\n### Slow key wrapping\n\nIf you take a look at `tableGet()`, you'll see it's mostly a wrapper around a\ncall to `findEntry()` where the actual hash table lookup happens. To refresh\nyour memory, here it is in full:\n\n```c\nstatic Entry* findEntry(Entry* entries, int capacity,\n                        ObjString* key) {\n  uint32_t index = key->hash % capacity;\n  Entry* tombstone = NULL;\n\n  for (;;) {\n    Entry* entry = &entries[index];\n    if (entry->key == NULL) {\n      if (IS_NIL(entry->value)) {\n        // Empty entry.\n        return tombstone != NULL ? tombstone : entry;\n      } else {\n        // We found a tombstone.\n        if (tombstone == NULL) tombstone = entry;\n      }\n    } else if (entry->key == key) {\n      // We found the key.\n      return entry;\n    }\n\n    index = (index + 1) % capacity;\n  }\n}\n```\n\nWhen running that previous benchmark -- on my machine, at least -- the VM spends\n70% of the total execution time on *one line* in this function. Any guesses as\nto which one? No? It's this:\n\n```c\n  uint32_t index = key->hash % capacity;\n```\n\nThat pointer dereference isn't the problem. It's the little `%`. It turns out\nthe modulo operator is *really* slow. Much slower than other <span\nname=\"division\">arithmetic</span> operators. Can we do something better?\n\n<aside name=\"division\">\n\nPipelining makes it hard to talk about the performance of an individual CPU\ninstruction, but to give you a feel for things, division and modulo are about\n30-50 *times* slower than addition and subtraction on x86.\n\n</aside>\n\nIn the general case, it's really hard to re-implement a fundamental arithmetic\noperator in user code in a way that's faster than what the CPU itself can do.\nAfter all, our C code ultimately compiles down to the CPU's own arithmetic\noperations. 
If there were tricks we could use to go faster, the chip would\nalready be using them.\n\nHowever, we can take advantage of the fact that we know more about our problem\nthan the CPU does. We use modulo here to take a key string's hash code and\nwrap it to fit within the bounds of the table's entry array. That array starts\nout at eight elements and grows by a factor of two each time. We know -- and the\nCPU and C compiler do not -- that our table's size is always a power of two.\n\nBecause we're clever bit twiddlers, we know a faster way to calculate the\nremainder of a number modulo a power of two: **bit masking**. Let's say we want\nto calculate 229 modulo 64. The answer is 37, which is not particularly apparent\nin decimal, but is clearer when you view those numbers in binary:\n\n<img src=\"image/optimization/mask.png\" alt=\"The bit patterns resulting from 229 % 64 = 37 and 229 &amp; 63 = 37.\" />\n\nOn the left side of the illustration, notice how the result (37) is simply the\ndividend (229) with the highest two bits shaved off? Those two highest bits are\nthe bits at or to the left of the divisor's single 1 bit.\n\nOn the right side, we get the same result by taking 229 and bitwise <span\nclass=\"small-caps\">AND</span>-ing it with 63, which is one less than our\noriginal power of two divisor. Subtracting one from a power of two gives you a\nseries of 1 bits. That is exactly the mask we need in order to strip out those\ntwo leftmost bits.\n\nIn other words, you can calculate a number modulo any power of two by simply\n<span class=\"small-caps\">AND</span>-ing it with that power of two minus one. I'm\nnot enough of a mathematician to *prove* to you that this works, but if you\nthink it through, it should make sense. We can replace that slow modulo operator\nwith a very fast decrement and bitwise <span class=\"small-caps\">AND</span>. 
We\nsimply change the offending line of code to this:\n\n^code initial-index (2 before, 1 after)\n\nCPUs love bitwise operators, so it's hard to <span name=\"sub\">improve</span> on that.\n\n<aside name=\"sub\">\n\nAnother potential improvement is to eliminate the decrement by storing the bit\nmask directly instead of the capacity. In my tests, that didn't make a\ndifference. Instruction pipelining makes some operations essentially free if the\nCPU is bottlenecked elsewhere.\n\n</aside>\n\nOur linear probing search may need to wrap around the end of the array, so there\nis another modulo in `findEntry()` to update.\n\n^code next-index (4 before, 1 after)\n\nThis line didn't show up in the profiler since most searches don't wrap.\n\nThe `findEntry()` function has a sister function, `tableFindString()`, that does\na hash table lookup for interning strings. We may as well apply the same\noptimizations there too. This function is called only when interning strings,\nwhich wasn't heavily stressed by our benchmark. But a Lox program that created\nlots of strings might noticeably benefit from this change.\n\n^code find-string-index (2 before, 2 after)\n\nAnd also when the linear probing wraps around.\n\n^code find-string-next (3 before, 1 after)\n\nLet's see if our fixes were worth it. I tweaked that zoological benchmark to\ncount how many <span name=\"batch\">batches</span> of 10,000 calls it can run in\nten seconds. More batches equals faster performance. On my machine using the\nunoptimized code, the benchmark gets through 3,192 batches. After this\noptimization, that jumps to 6,249.\n\n<img src=\"image/optimization/hash-chart.png\" alt=\"Bar chart comparing the performance before and after the optimization.\" />\n\nThat's almost exactly twice as much work in the same amount of time. We made the\nVM twice as fast (usual caveat: on this benchmark). That is a massive win when\nit comes to optimization. 
Usually you feel good if you can claw a few percentage\npoints here or there. Since methods, fields, and global variables are so\nprevalent in Lox programs, this tiny optimization improves performance across\nthe board. Almost every Lox program benefits.\n\n<aside name=\"batch\">\n\nOur original benchmark fixed the amount of *work* and then measured the *time*.\nChanging the script to count how many batches of calls it can do in ten seconds\nfixes the time and measures the work. For performance comparisons, I like the\nlatter measure because the reported number represents *speed*. You can directly\ncompare the numbers before and after an optimization. When measuring execution\ntime, you have to do a little arithmetic to get to a good relative measure of\nperformance.\n\n</aside>\n\nNow, the point of this section is *not* that the modulo operator is profoundly\nevil and you should stamp it out of every program you ever write. Nor is it that\nmicro-optimization is a vital engineering skill. It's rare that a performance\nproblem has such a narrow, effective solution. We got lucky.\n\nThe point is that we didn't *know* that the modulo operator was a performance\ndrain until our profiler told us so. If we had wandered around our VM's codebase\nblindly guessing at hotspots, we likely wouldn't have noticed it. What I want\nyou to take away from this is how important it is to have a profiler in your\ntoolbox.\n\nTo reinforce that point, let's go ahead and run the original benchmark in our\nnow-optimized VM and see what the profiler shows us. On my machine, `tableGet()`\nis still a fairly large chunk of execution time. That's to be expected for a\ndynamically typed language. But it has dropped from 72% of the total execution\ntime down to 35%. That's much more in line with what we'd like to see and shows\nthat our optimization didn't just make the program faster, but made it faster\n*in the way we expected*. 
Profilers are as useful for verifying solutions as\nthey are for discovering problems.\n\n## NaN Boxing\n\nThis next optimization has a very different feel. Thankfully, despite the odd\nname, it does not involve punching your grandmother. It's different, but not,\nlike, *that* different. With our previous optimization, the profiler told us\nwhere the problem was, and we merely had to use some ingenuity to come up with a\nsolution.\n\nThis optimization is more subtle, and its performance effects more scattered\nacross the virtual machine. The profiler won't help us come up with this.\nInstead, it was invented by <span name=\"someone\">someone</span> thinking deeply\nabout the lowest levels of machine architecture.\n\n<aside name=\"someone\">\n\nI'm not sure who first came up with this trick. The earliest source I can find\nis David Gudeman's 1993 paper \"Representing Type Information in Dynamically\nTyped Languages\". Everyone else cites that. But Gudeman himself says the paper\nisn't novel work, but instead \"gathers together a body of folklore\".\n\nMaybe the inventor has been lost to the mists of time, or maybe it's been\nreinvented a number of times. Anyone who ruminates on IEEE 754 long enough\nprobably starts thinking about trying to stuff something useful into all those\nunused NaN bits.\n\n</aside>\n\nLike the heading says, this optimization is called **NaN boxing** or sometimes\n**NaN tagging**. Personally I like the latter name because \"boxing\" tends to imply\nsome kind of heap-allocated representation, but the former seems to be the more\nwidely used term. This technique changes how we represent values in the VM.\n\nOn a 64-bit machine, our Value type takes up 16 bytes. The struct has two\nfields, a type tag and a union for the payload. The largest fields in the union\nare an Obj pointer and a double, which are both 8 bytes. 
To keep the union field\naligned to an 8-byte boundary, the compiler adds padding after the tag too:\n\n<img src=\"image/optimization/union.png\" alt=\"Byte layout of the 16-byte tagged union Value.\" />\n\nThat's pretty big. If we could cut that down, then the VM could pack more values\ninto the same amount of memory. Most computers have plenty of RAM these days, so\nthe direct memory savings aren't a huge deal. But a smaller representation means\nmore Values fit in a cache line. That means fewer cache misses, which affects\n*speed*.\n\nIf Values need to be aligned to their largest payload size, and a Lox number or\nObj pointer needs a full 8 bytes, how can we get any smaller? In a dynamically\ntyped language like Lox, each value needs to carry not just its payload, but\nenough additional information to determine the value's type at runtime. If a Lox\nnumber is already using the full 8 bytes, where could we squirrel away a couple\nof extra bits to tell the runtime \"this is a number\"?\n\nThis is one of the perennial problems for dynamic language hackers. It\nparticularly bugs them because statically typed languages don't generally have\nthis problem. The type of each value is known at compile time, so no extra\nmemory is needed at runtime to track it. When your C compiler compiles a 32-bit\nint, the resulting variable gets *exactly* 32 bits of storage.\n\nDynamic language folks hate losing ground to the static camp, so they've come up\nwith a number of very clever ways to pack type information and a payload into a\nsmall number of bits. NaN boxing is one of those. It's a particularly good fit\nfor languages like JavaScript and Lua, where all numbers are double-precision\nfloating point. Lox is in that same boat.\n\n### What is (and is not) a number?\n\nBefore we start optimizing, we need to really understand how our friend the CPU\nrepresents floating-point numbers. 
Almost all machines today use the same\nscheme, encoded in the venerable scroll [IEEE 754][754], known to mortals as the\n\"IEEE Standard for Floating-Point Arithmetic\".\n\n[754]: https://en.wikipedia.org/wiki/IEEE_754\n\nIn the eyes of your computer, a <span name=\"hyphen\">64-bit</span>,\ndouble-precision, IEEE floating-point number looks like this:\n\n<aside name=\"hyphen\">\n\nThat's a lot of hyphens for one sentence.\n\n</aside>\n\n<img src=\"image/optimization/double.png\" alt=\"Bit representation of an IEEE 754 double.\" />\n\n*   Starting from the right, the first 52 bits are the **fraction**,\n    **mantissa**, or **significand** bits. They represent the significant digits\n    of the number, as a binary integer.\n\n*   Next to that are 11 **exponent** bits. These tell you how far the mantissa\n    is shifted away from the decimal (well, binary) point.\n\n*   The highest bit is the <span name=\"sign\">**sign bit**</span>, which\n    indicates whether the number is positive or negative.\n\nI know that's a little vague, but this chapter isn't a deep dive on\nfloating-point representation. If you want to know how the exponent and mantissa\nplay together, there are already better explanations out there than I could\nwrite.\n\n<aside name=\"sign\">\n\nSince the sign bit is always present, even if the number is zero, that implies\nthat \"positive zero\" and \"negative zero\" have different bit representations, and\nindeed, IEEE 754 does distinguish those.\n\n</aside>\n\nThe important part for our purposes is that the spec carves out a special case\nexponent. When all of the exponent bits are set, then instead of just\nrepresenting a really big number, the value has a different meaning. These\nvalues are the infinities and the \"Not a Number\" (hence, **NaN**) values. NaNs\nrepresent things like the result of dividing zero by zero.\n\n*Any* double whose exponent bits are all set is a NaN, as long as at least one\nmantissa bit is also set. 
That means there's lots and lots of *different* NaN bit patterns.\nIEEE 754 divides those into two categories. Values where the highest mantissa\nbit is 0 are called **signalling NaNs**, and the others are **quiet NaNs**.\nSignalling NaNs are intended to flag erroneous computations, like using a value\nthat was never initialized. A chip <span name=\"abort\">may</span> detect when one of these\nvalues is used and abort a program completely. They may self-destruct if you\ntry to read one.\n\n<aside name=\"abort\">\n\nI don't know if any CPUs actually *do* trap signalling NaNs and abort. The spec\njust says they *could*.\n\n</aside>\n\nQuiet NaNs are supposed to be safer to use. They don't represent useful numeric\nvalues, but they should at least not set your hand on fire if you touch them.\n\nEvery double with all of its exponent bits set and its highest mantissa bit set\nis a quiet NaN. That leaves 52 bits unaccounted for. We'll avoid one of those so\nthat we don't step on Intel's \"QNaN Floating-Point Indefinite\" value, leaving us\n51 bits. Those remaining bits can be anything. We're talking\n2,251,799,813,685,248 unique quiet NaN bit patterns.\n\n<img src=\"image/optimization/nan.png\" alt=\"The bits in a double that make it a quiet NaN.\" />\n\nThis means a 64-bit double has enough room to store all of the various different\nnumeric floating-point values and *also* has room for another 51 bits of data\nthat we can use however we want. That's plenty of room to set aside a couple of\nbit patterns to represent Lox's `nil`, `true`, and `false` values. But what\nabout Obj pointers? Don't pointers need a full 64 bits too?\n\nFortunately, we have another trick up our other sleeve. Yes, technically\npointers on a 64-bit architecture are 64 bits. But, no architecture I know of\nactually uses that entire address space. Instead, most widely used chips today\nonly ever use the low <span name=\"48\">48</span> bits. 
The remaining 16 bits are\neither unspecified or always zero.\n\n<aside name=\"48\">\n\n48 bits is enough to address 262,144 gigabytes of memory. Modern operating\nsystems also give each process its own address space, so that should be plenty.\n\n</aside>\n\nIf we've got 51 bits, we can stuff a 48-bit pointer in there with three bits to\nspare. Those three bits are just enough to store tiny type tags to distinguish\nbetween `nil`, Booleans, and Obj pointers.\n\nThat's NaN boxing. Within a single 64-bit double, you can store all of the\ndifferent floating-point numeric values, a pointer, or any of a couple of other\nspecial sentinel values. Half the memory usage of our current Value struct,\nwhile retaining all of the fidelity.\n\nWhat's particularly nice about this representation is that there is no need to\n*convert* a numeric double value into a \"boxed\" form. Lox numbers *are* just\nnormal, 64-bit doubles. We still need to *check* their type before we use them,\nsince Lox is dynamically typed, but we don't need to do any bit shifting or\npointer indirection to go from \"value\" to \"number\".\n\nFor the other value types, there is a conversion step, of course. But,\nfortunately, our VM hides all of the mechanism to go from values to raw types\nbehind a handful of macros. Rewrite those to implement NaN boxing, and the rest\nof the VM should just work.\n\n### Conditional support\n\nI know the details of this new representation aren't clear in your head yet.\nDon't worry, they will crystallize as we work through the implementation. Before\nwe get to that, we're going to put some compile-time scaffolding in place.\n\nFor our previous optimization, we rewrote the previous slow code and called it\ndone. This one is a little different. NaN boxing relies on some very low-level\ndetails of how a chip represents floating-point numbers and pointers. 
It\n*probably* works on most CPUs you're likely to encounter, but you can never be\ntotally sure.\n\nIt would suck if our VM completely lost support for an architecture just because\nof its value representation. To avoid that, we'll maintain support for *both*\nthe old tagged union implementation of Value and the new NaN-boxed form. We\nselect which representation we want at compile time using this flag:\n\n^code define-nan-boxing (2 before, 1 after)\n\nIf that's defined, the VM uses the new form. Otherwise, it reverts to the old\nstyle. The few pieces of code that care about the details of the value\nrepresentation -- mainly the handful of macros for wrapping and unwrapping\nValues -- vary based on whether this flag is set. The rest of the VM can\ncontinue along its merry way.\n\nMost of the work happens in the \"value\" module where we add a section for the\nnew type.\n\n^code nan-boxing (2 before, 1 after)\n\nWhen NaN boxing is enabled, the actual type of a Value is a flat, unsigned\n64-bit integer. We could use double instead, which would make the macros for\ndealing with Lox numbers a little simpler. But all of the other macros need to\ndo bitwise operations and uint64_t is a much friendlier type for that. Outside\nof this module, the rest of the VM doesn't really care one way or the other.\n\nBefore we start re-implementing those macros, we close the `#else` branch of the\n`#ifdef` at the end of the definitions for the old representation.\n\n^code end-if-nan-boxing (1 before, 2 after)\n\nOur remaining task is simply to fill in that first `#ifdef` section with new\nimplementations of all the stuff already in the `#else` side. We'll work through\nit one value type at a time, from easiest to hardest.\n\n### Numbers\n\nWe'll start with numbers since they have the most direct representation under\nNaN boxing. To \"convert\" a C double to a NaN-boxed clox Value, we don't need to\ntouch a single bit -- the representation is exactly the same. 
But we do need to\nconvince our C compiler of that fact, which we made harder by defining Value to\nbe uint64_t.\n\nWe need to get the compiler to take a set of bits that it thinks are a double\nand use those same bits as a uint64_t, or vice versa. This is called **type\npunning**. C and C++ programmers have been doing this since the days of bell\nbottoms and 8-tracks, but the language specifications have <span\nname=\"hesitate\">hesitated</span> to say which of the many ways to do this is\nofficially sanctioned.\n\n<aside name=\"hesitate\" class=\"bottom\">\n\nSpec authors don't like type punning because it makes optimization harder. A key\noptimization technique is reordering instructions to fill the CPU's execution\npipelines. A compiler can reorder code only when doing so doesn't have a\nuser-visible effect, obviously.\n\nPointers make that harder. If two pointers point to the same value, then a write\nthrough one and a read through the other cannot be reordered. But what about two\npointers of *different* types? If those could point to the same object, then\nbasically *any* two pointers could be aliases to the same value. That\ndrastically limits the amount of code the compiler is free to rearrange.\n\nTo avoid that, compilers want to assume **strict aliasing** -- pointers of\nincompatible types cannot point to the same value. Type punning, by nature,\nbreaks that assumption.\n\n</aside>\n\nI know one way to convert a `double` to `Value` and back that I believe is\nsupported by both the C and C++ specs. Unfortunately, it doesn't fit in a single\nexpression, so the conversion macros have to call out to helper functions.\nHere's the first macro:\n\n^code number-val (1 before, 2 after)\n\nThat macro passes the double here:\n\n^code num-to-value (1 before, 2 after)\n\nI know, weird, right? The way to treat a series of bytes as having a different\ntype without changing their value at all is `memcpy()`? This looks horrendously\nslow: Create a local variable. 
Pass its address to a C standard library\nfunction to copy a few bytes. Then return the result, which is the exact same\nbytes as the input. Thankfully, because this *is* the supported idiom for type\npunning, most compilers recognize the pattern and optimize away the `memcpy()`\nentirely.\n\n\"Unwrapping\" a Lox number is the mirror image.\n\n^code as-number (1 before, 2 after)\n\nThat macro calls this function:\n\n^code value-to-num (1 before, 2 after)\n\nIt works exactly the same except we swap the types. Again, the compiler will\neliminate all of it. Even though those calls to\n`memcpy()` will disappear, we still need to show the compiler *which* `memcpy()`\nwe're calling, so we also need an <span name=\"union\">include</span>.\n\n<aside name=\"union\" class=\"bottom\">\n\nIf you find yourself with a compiler that does not optimize the `memcpy()` away,\ntry this instead:\n\n```c\ndouble valueToNum(Value value) {\n  union {\n    uint64_t bits;\n    double num;\n  } data;\n  data.bits = value;\n  return data.num;\n}\n```\n\n</aside>\n\n^code include-string (1 before, 2 after)\n\nThat was a lot of code to ultimately do nothing but silence the C type checker.\nDoing a runtime type *test* on a Lox number is a little more interesting. If all\nwe have are exactly the bits for a double, how do we tell that it *is* a double?\nIt's time to get bit twiddling.\n\n^code is-number (1 before, 2 after)\n\nWe know that every Value that is *not* a number will use a special quiet NaN\nrepresentation. And we presume we have correctly avoided any of the meaningful\nNaN representations that may actually be produced by doing arithmetic on\nnumbers.\n\nIf the double has all of its NaN bits set, and the quiet NaN bit set, and one\nmore for good measure, we can be <span name=\"certain\">pretty certain</span> it\nis one of the bit patterns we ourselves have set aside for other types. To check\nthat, we mask out all of the bits except for our set of quiet NaN bits. 
If *all*\nof those bits are set, it must be a NaN-boxed value of some other Lox type.\nOtherwise, it is actually a number.\n\n<aside name=\"certain\">\n\nPretty certain, but not strictly guaranteed. As far as I know, there is nothing\npreventing a CPU from producing a NaN value as the result of some operation\nwhose bit representation collides with ones we have claimed. But in my tests\nacross a number of architectures, I haven't seen it happen.\n\n</aside>\n\nThe set of quiet NaN bits is declared like this:\n\n^code qnan (1 before, 2 after)\n\nIt would be nice if C supported binary literals. But if you do the conversion,\nyou'll see that value is the same as this:\n\n<img src=\"image/optimization/qnan.png\" alt=\"The quiet NaN bits.\" />\n\nThis is exactly all of the exponent bits, plus the quiet NaN bit, plus one extra\nto dodge that Intel value.\n\n### Nil, true, and false\n\nThe next type to handle is `nil`. That's pretty simple since there's only one\n`nil` value and thus we need only a single bit pattern to represent it. There\nare two other singleton values, the two Booleans, `true` and `false`. This calls\nfor three total unique bit patterns.\n\nTwo bits give us four different combinations, which is plenty. We claim the two\nlowest bits of our unused mantissa space as a \"type tag\" to determine which of\nthese three singleton values we're looking at. 
The three type tags are defined\nlike so:\n\n^code tags (1 before, 2 after)\n\nOur representation of `nil` is thus all of the bits required to define our\nquiet NaN representation along with the `nil` type tag bits:\n\n<img src=\"image/optimization/nil.png\" alt=\"The bit representation of the nil value.\" />\n\nIn code, we build those bits like so:\n\n^code nil-val (2 before, 1 after)\n\nWe simply bitwise <span class=\"small-caps\">OR</span> the quiet NaN bits and the\ntype tag, and then do a little cast dance to teach the C compiler what we want\nthose bits to mean.\n\nSince `nil` has only a single bit representation, we can use equality on\nuint64_t to see if a Value is `nil`.\n\n<span name=\"equal\"></span>\n\n^code is-nil (2 before, 1 after)\n\nYou can guess how we define the `true` and `false` values.\n\n^code false-true-vals (2 before, 1 after)\n\nThe bits look like this:\n\n<img src=\"image/optimization/bools.png\" alt=\"The bit representation of the true and false values.\" />\n\nTo convert a C bool into a Lox Boolean, we rely on these two singleton values\nand the good old conditional operator.\n\n^code bool-val (2 before, 1 after)\n\nThere's probably a cleverer bitwise way to do this, but my hunch is that the\ncompiler can figure one out faster than I can. Going the other direction is\nsimpler.\n\n^code as-bool (2 before, 1 after)\n\nSince we know there are exactly two Boolean bit representations in Lox -- unlike\nin C where any non-zero value can be considered \"true\" -- if it ain't `true`, it\nmust be `false`. This macro does assume you call it only on a Value that you\nknow *is* a Lox Boolean. To check that, there's one more macro.\n\n^code is-bool (2 before, 1 after)\n\nThat looks a little strange. A more obvious macro would look like this:\n\n```c\n#define IS_BOOL(v) ((v) == TRUE_VAL || (v) == FALSE_VAL)\n```\n\nUnfortunately, that's not safe. 
The expansion mentions `v` twice, which means if\nthat expression has any side effects, they will be executed twice. We could have\nthe macro call out to a separate function, but, ugh, what a chore.\n\nInstead, we bitwise <span class=\"small-caps\">OR</span> a 1 onto the value to\nmerge the only two valid Boolean bit patterns. That leaves three potential\nstates the value can be in:\n\n1. It was `FALSE_VAL` and has now been converted to `TRUE_VAL`.\n\n2. It was `TRUE_VAL` and the `| 1` did nothing and it's still `TRUE_VAL`.\n\n3. It's some other, non-Boolean value.\n\nAt that point, we can simply compare the result to `TRUE_VAL` to see if we're\nin the first two states or the third.\n\n### Objects\n\nThe last value type is the hardest. Unlike the singleton values, there are\nbillions of different pointer values we need to box inside a NaN. This means we\nneed both some kind of tag to indicate that these particular NaNs *are* Obj\npointers, and room for the addresses themselves.\n\nThe tag bits we used for the singleton values are in the region where I decided\nto store the pointer itself, so we can't easily use a different <span\nname=\"ptr\">bit</span> there to indicate that the value is an object reference.\nHowever, there is another bit we aren't using. Since all our NaN values are not\nnumbers -- it's right there in the name -- the sign bit isn't used for anything.\nWe'll go ahead and use that as the type tag for objects. If one of our quiet\nNaNs has its sign bit set, then it's an Obj pointer. Otherwise, it must be one\nof the previous singleton values.\n\n<aside name=\"ptr\">\n\nWe actually *could* use the lowest bits to store the type tag even when the\nvalue is an Obj pointer. That's because Obj pointers are always aligned to an\n8-byte boundary since Obj contains a 64-bit field. That, in turn, implies that\nthe three lowest bits of an Obj pointer will always be zero. 
We could store\nwhatever we wanted in there and just mask it off before dereferencing the\npointer.\n\nThis is another value representation optimization called **pointer tagging**.\n\n</aside>\n\nIf the sign bit is set, then the remaining low bits store the pointer to the\nObj:\n\n<img src=\"image/optimization/obj.png\" alt=\"Bit representation of an Obj* stored in a Value.\" />\n\nTo convert a raw Obj pointer to a Value, we take the pointer and set all of the\nquiet NaN bits and the sign bit.\n\n^code obj-val (1 before, 2 after)\n\nThe pointer itself is a full 64 bits, and in <span name=\"safe\">principle</span>,\nit could thus overlap with some of those quiet NaN and sign bits. But in\npractice, at least on the architectures I've tested, everything above the 48th\nbit in a pointer is always zero. There's a lot of casting going on here, which\nI've found is necessary to satisfy some of the pickiest C compilers, but the\nend result is just jamming some bits together.\n\n<aside name=\"safe\">\n\nI try to follow the letter of the law when it comes to the code in this book, so\nthis paragraph is dubious. There comes a point when optimizing where you push\nthe boundary of not just what the *spec says* you can do, but what a real\ncompiler and chip let you get away with.\n\nThere are risks when stepping outside of the spec, but there are rewards in that\nlawless territory too. It's up to you to decide if the gains are worth it.\n\n</aside>\n\nWe define the sign bit like so:\n\n^code sign-bit (2 before, 2 after)\n\nTo get the Obj pointer back out, we simply mask off all of those extra bits.\n\n^code as-obj (1 before, 2 after)\n\nThe tilde (`~`), if you haven't done enough bit manipulation to encounter it\nbefore, is bitwise <span class=\"small-caps\">NOT</span>. It toggles all ones and\nzeroes in its operand. 
By masking the value with the bitwise negation of the\nquiet NaN and sign bits, we *clear* those bits and let the pointer bits remain.\n\nOne last macro:\n\n^code is-obj (1 before, 2 after)\n\nA Value storing an Obj pointer has its sign bit set, but so does any negative\nnumber. To tell if a Value is an Obj pointer, we need to check that both the\nsign bit and all of the quiet NaN bits are set. This is similar to how we detect\nthe type of the singleton values, except this time we use the sign bit as the\ntag.\n\n### Value functions\n\nThe rest of the VM usually goes through the macros when working with Values, so\nwe are almost done. However, there are a couple of functions in the \"value\"\nmodule that peek inside the otherwise black box of Value and work with its\nencoding directly. We need to fix those too.\n\nThe first is `printValue()`. It has separate code for each value type. We no\nlonger have an explicit type enum we can switch on, so instead we use a series\nof type tests to handle each kind of value.\n\n^code print-value (1 before, 1 after)\n\nThis is technically a tiny bit slower than a switch, but compared to the\noverhead of actually writing to a stream, it's negligible.\n\nWe still support the original tagged union representation, so we keep the old\ncode and enclose it in the `#else` conditional section.\n\n^code end-print-value (1 before, 1 after)\n\nThe other operation is testing two values for equality.\n\n^code values-equal (1 before, 1 after)\n\nIt doesn't get much simpler than that! If the two bit representations are\nidentical, the values are equal. That does the right thing for the singleton\nvalues since each has a unique bit representation and they are only equal to\nthemselves. It also does the right thing for Obj pointers, since objects use\nidentity for equality -- two Obj references are equal only if they point to the\nexact same object.\n\nIt's *mostly* correct for numbers too. 
Most floating-point numbers with\ndifferent bit representations are distinct numeric values. Alas, IEEE 754\ncontains a pothole to trip us up. For reasons that aren't entirely clear to me,\nthe spec mandates that NaN values are *not* equal to *themselves*. This isn't a\nproblem for the special quiet NaNs that we are using for our own purposes. But\nit's possible to produce a \"real\" arithmetic NaN in Lox, and if we want to\ncorrectly implement IEEE 754 numbers, then the resulting value is not supposed\nto be equal to itself. More concretely:\n\n```lox\nvar nan = 0/0;\nprint nan == nan;\n```\n\nIEEE 754 says this program is supposed to print \"false\". It does the right thing\nwith our old tagged union representation because the `VAL_NUMBER` case applies\n`==` to two values that the C compiler knows are doubles. Thus the compiler\ngenerates the right CPU instruction to perform an IEEE floating-point equality.\n\nOur new representation breaks that by defining Value to be a uint64_t. If we\nwant to be *fully* compliant with IEEE 754, we need to handle this case.\n\n^code nan-equality (1 before, 1 after)\n\nI know, it's weird. And there is a performance cost to doing this type test\nevery time we check two Lox values for equality. If we are willing to sacrifice\na little <span name=\"java\">compatibility</span> -- who *really* cares if NaN is\nnot equal to itself? -- we could leave this off. I'll leave it up to you to\ndecide how pedantic you want to be.\n\n<aside name=\"java\">\n\nIn fact, jlox gets NaN equality wrong. Java does the right thing when you\ncompare primitive doubles using `==`, but not if you box those to Double or\nObject and compare them using `equals()`, which is how jlox implements equality.\n\n</aside>\n\nFinally, we close the conditional compilation section around the old\nimplementation.\n\n^code end-values-equal (1 before, 1 after)\n\nAnd that's it. 
This optimization is complete, as is our clox virtual machine.\nThat was the last line of new code in the book.\n\n### Evaluating performance\n\nThe code is done, but we still need to figure out if we actually made anything\nbetter with these changes. Evaluating an optimization like this is very\ndifferent from the previous one. There, we had a clear hotspot visible in the\nprofiler. We fixed that part of the code and could instantly see the hotspot\nget faster.\n\nThe effects of changing the value representation are more diffused. The macros\nare expanded in place wherever they are used, so the performance changes are\nspread across the codebase in a way that's hard for many profilers to track\nwell, especially in an <span name=\"opt\">optimized</span> build.\n\n<aside name=\"opt\">\n\nWhen doing profiling work, you almost always want to profile an optimized\n\"release\" build of your program since that reflects the performance story your\nend users experience. Compiler optimizations, like inlining, can dramatically\naffect which parts of the code are performance hotspots. Hand-optimizing a debug\nbuild risks sending you off \"fixing\" problems that the optimizing compiler will\nalready solve for you.\n\nMake sure you don't accidentally benchmark and optimize your debug build. I seem\nto make that mistake at least once a year.\n\n</aside>\n\nWe also can't easily *reason* about the effects of our change. We've made values\nsmaller, which reduces cache misses all across the VM. But the actual real-world\nperformance effect of that change is highly dependent on the memory use of the\nLox program being run. A tiny Lox microbenchmark may not have enough values\nscattered around in memory for the effect to be noticeable, and even things like\nthe addresses handed out to us by the C memory allocator can impact the results.\n\nIf we did our job right, basically everything gets a little faster, especially\non larger, more complex Lox programs. 
But it is possible that the extra bitwise\noperations we do when NaN-boxing values nullify the gains from the better\nmemory use. Doing performance work like this is unnerving because you can't\neasily *prove* that you've made the VM better. You can't point to a single\nsurgically targeted microbenchmark and say, \"There, see?\"\n\nInstead, what we really need is a *suite* of larger benchmarks. Ideally, they\nwould be distilled from real-world applications -- not that such a thing exists\nfor a toy language like Lox. Then we can measure the aggregate performance\nchanges across all of those. I did my best to cobble together a handful of\nlarger Lox programs. On my machine, the new value representation seems to make\neverything roughly 10% faster across the board.\n\nThat's not a huge improvement, especially compared to the profound effect of\nmaking hash table lookups faster. I added this optimization in large part\nbecause it's a good example of a certain *kind* of performance work you may\nexperience, and honestly, because I think it's technically really cool. It might\nnot be the first thing I would reach for if I were seriously trying to make clox\nfaster. There is probably other, lower-hanging fruit.\n\nBut, if you find yourself working on a program where all of the easy wins have\nbeen taken, then at some point you may want to think about tuning your value\nrepresentation. I hope this chapter has shined a light on some of the options\nyou have in that area.\n\n## Where to Next\n\nWe'll stop here with the Lox language and our two interpreters. We could tinker\non it forever, adding new language features and clever speed improvements. But,\nfor this book, I think we've reached a natural place to call our work complete.\nI won't rehash everything we've learned in the past many pages. You were there\nwith me and you remember. Instead, I'd like to take a minute to talk about where\nyou might go from here. 
What is the next step in your programming language\njourney?\n\nMost of you probably won't spend a significant part of your career working in\ncompilers or interpreters. It's a pretty small slice of the computer science\nacademia pie, and an even smaller segment of software engineering in industry.\nThat's OK. Even if you never work on a compiler again in your life, you will\ncertainly *use* one, and I hope this book has equipped you with a better\nunderstanding of how the programming languages you use are designed and\nimplemented.\n\nYou have also learned a handful of important, fundamental data structures and\ngotten some practice doing low-level profiling and optimization work. That kind\nof expertise is helpful no matter what domain you program in.\n\nI also hope I gave you a new way of <span name=\"domain\">looking</span> at and\nsolving problems. Even if you never work on a language again, you may be\nsurprised to discover how many programming problems can be seen as\nlanguage-*like*. Maybe that report generator you need to write can be modeled as\na series of stack-based \"instructions\" that the generator \"executes\". That user\ninterface you need to render looks an awful lot like traversing an AST.\n\n<aside name=\"domain\">\n\nThis goes for other domains too. I don't think there's a single topic I've\nlearned in programming -- or even outside of programming -- that I haven't ended\nup finding useful in other areas. One of my favorite aspects of software\nengineering is how much it rewards those with eclectic interests.\n\n</aside>\n\nIf you do want to go further down the programming language rabbit hole, here\nare some suggestions for which branches in the tunnel to explore:\n\n*   Our simple, single-pass bytecode compiler pushed us towards mostly runtime\n    optimization. In a mature language implementation, compile-time optimization\n    is generally more important, and the field of compiler optimizations is\n    incredibly rich. 
Grab a classic <span name=\"cooper\">compilers</span> book,\n    and rebuild the front end of clox or jlox to be a sophisticated compilation\n    pipeline with some interesting intermediate representations and optimization\n    passes.\n\n    Dynamic typing will place some restrictions on how far you can go, but there\n    is still a lot you can do. Or maybe you want to take a big leap and add\n    static types and a type checker to Lox. That will certainly give your front\n    end a lot more to chew on.\n\n    <aside name=\"cooper\">\n\n    I like Cooper and Torczon's *Engineering a Compiler* for this. Appel's\n    *Modern Compiler Implementation* books are also well regarded.\n\n    </aside>\n\n*   In this book, I aim to be correct, but not particularly rigorous. My goal is\n    mostly to give you an *intuition* and a feel for doing language work. If you\n    like more precision, then the whole world of programming language academia\n    is waiting for you. Languages and compilers have been studied formally since\n    before we even had computers, so there is no shortage of books and papers on\n    parser theory, type systems, semantics, and formal logic. Going down this\n    path will also teach you how to read CS papers, which is a valuable skill in\n    its own right.\n\n*   Or, if you just really enjoy hacking on and making languages, you can take\n    Lox and turn it into your own <span name=\"license\">plaything</span>. Change\n    the syntax to something that delights your eye. Add missing features or\n    remove ones you don't like. Jam new optimizations in there.\n\n    <aside name=\"license\">\n\n    The *text* of this book is copyrighted to me, but the *code* and the\n    implementations of jlox and clox use the very permissive [MIT license][].\n    You are more than welcome to [take either of those interpreters][source] and\n    do whatever you want with them. 
Go to town.\n\n    If you make significant changes to the language, it would be good to also\n    change the name, mostly to avoid confusing people about what the name \"Lox\"\n    represents.\n\n    </aside>\n\n    Eventually you may get to a point where you have something you think others\n    could use as well. That gets you into the very distinct world of programming\n    language *popularity*. Expect to spend a ton of time writing documentation,\n    example programs, tools, and useful libraries. The field is crowded with\n    languages vying for users. To thrive in that space you'll have to put on\n    your marketing hat and *sell*. Not everyone enjoys that kind of\n    public-facing work, but if you do, it can be incredibly gratifying to see\n    people use your language to express themselves.\n\nOr maybe this book has satisfied your craving and you'll stop here. Whichever\nway you go, or don't go, there is one lesson I hope to lodge in your heart. Like\nI was, you may have initially been intimidated by programming languages. But in\nthese chapters, you've seen that even really challenging material can be tackled\nby us mortals if we get our hands dirty and take it a step at a time. If you can\nhandle compilers and interpreters, you can do anything you put your mind to.\n\n[mit license]: https://en.wikipedia.org/wiki/MIT_License\n[source]: https://github.com/munificent/craftinginterpreters\n\n<div class=\"challenges\">\n\n## Challenges\n\nAssigning homework on the last day of school seems cruel but if you really want\nsomething to do during your summer vacation:\n\n1.  Fire up your profiler, run a couple of benchmarks, and look for other\n    hotspots in the VM. Do you see anything in the runtime that you can improve?\n\n2.  Many strings in real-world user programs are small, often only a character\n    or two. This is less of a concern in clox because we intern strings, but\n    most VMs don't. 
For those that don't, heap allocating a tiny character array\n    for each of those little strings and then representing the value as a\n    pointer to that array is wasteful. Often, the pointer is larger than the\n    string's characters. A classic trick is to have a separate value\n    representation for small strings that stores the characters inline in the\n    value.\n\n    Starting from clox's original tagged union representation, implement that\n    optimization. Write a couple of relevant benchmarks and see if it helps.\n\n3.  Reflect back on your experience with this book. What parts of it worked well\n    for you? What didn't? Was it easier for you to learn bottom-up or top-down?\n    Did the illustrations help or distract? Did the analogies clarify or\n    confuse?\n\n    The more you understand your personal learning style, the more effectively\n    you can upload knowledge into your head. You can specifically target\n    material that teaches you the way you learn best.\n\n</div>\n"
  },
  {
    "path": "book/parsing-expressions.md",
    "content": "> Grammar, which knows how to control even kings.\n> <cite>Molière</cite>\n\n<span name=\"parse\">This</span> chapter marks the first major milestone of the\nbook. Many of us have cobbled together a mishmash of regular expressions and\nsubstring operations to extract some sense out of a pile of text. The code was\nprobably riddled with bugs and a beast to maintain. Writing a *real* parser --\none with decent error handling, a coherent internal structure, and the ability\nto robustly chew through a sophisticated syntax -- is considered a rare,\nimpressive skill. In this chapter, you will <span name=\"attain\">attain</span>\nit.\n\n<aside name=\"parse\">\n\n\"Parse\" comes to English from the Old French \"pars\" for \"part of speech\". It\nmeans to take a text and map each word to the grammar of the language. We use it\nhere in the same sense, except that our language is a little more modern than\nOld French.\n\n</aside>\n\n<aside name=\"attain\">\n\nLike many rites of passage, you'll probably find it looks a little smaller, a\nlittle less daunting when it's behind you than when it loomed ahead.\n\n</aside>\n\nIt's easier than you think, partially because we front-loaded a lot of the hard\nwork in the [last chapter][]. You already know your way around a formal grammar.\nYou're familiar with syntax trees, and we have some Java classes to represent\nthem. The only remaining piece is parsing -- transmogrifying a sequence of\ntokens into one of those syntax trees.\n\n[last chapter]: representing-code.html\n\nSome CS textbooks make a big deal out of parsers. In the '60s, computer\nscientists -- understandably tired of programming in assembly language --\nstarted designing more sophisticated, <span name=\"human\">human</span>-friendly\nlanguages like Fortran and ALGOL. 
Alas, they weren't very *machine*-friendly\nfor the primitive computers of the time.\n\n<aside name=\"human\">\n\nImagine how harrowing assembly programming on those old machines must have been\nthat they considered *Fortran* to be an improvement.\n\n</aside>\n\nThese pioneers designed languages that they honestly weren't even sure how to\nwrite compilers for, and then did groundbreaking work inventing parsing and\ncompiling techniques that could handle these new, big languages on those old, tiny\nmachines.\n\nClassic compiler books read like fawning hagiographies of these heroes and their\ntools. The cover of *Compilers: Principles, Techniques, and Tools* literally has\na dragon labeled \"complexity of compiler design\" being slain by a knight bearing\na sword and shield branded \"LALR parser generator\" and \"syntax directed\ntranslation\". They laid it on thick.\n\nA little self-congratulation is well-deserved, but the truth is you don't need\nto know most of that stuff to bang out a high quality parser for a modern\nmachine. As always, I encourage you to broaden your education and take it in\nlater, but this book omits the trophy case.\n\n## Ambiguity and the Parsing Game\n\nIn the last chapter, I said you can \"play\" a context-free grammar like a game in\norder to *generate* strings. Parsers play that game in reverse. Given a string\n-- a series of tokens -- we map those tokens to terminals in the grammar to\nfigure out which rules could have generated that string.\n\nThe \"could have\" part is interesting. It's entirely possible to create a grammar\nthat is *ambiguous*, where different choices of productions can lead to the same\nstring. When you're using the grammar to *generate* strings, that doesn't matter\nmuch. Once you have the string, who cares how you got to it?\n\nWhen parsing, ambiguity means the parser may misunderstand the user's code. 
As\nwe parse, we aren't just determining if the string is valid Lox code, we're\nalso tracking which rules match which parts of it so that we know what part of\nthe language each token belongs to. Here's the Lox expression grammar we put\ntogether in the last chapter:\n\n```ebnf\nexpression     → literal\n               | unary\n               | binary\n               | grouping ;\n\nliteral        → NUMBER | STRING | \"true\" | \"false\" | \"nil\" ;\ngrouping       → \"(\" expression \")\" ;\nunary          → ( \"-\" | \"!\" ) expression ;\nbinary         → expression operator expression ;\noperator       → \"==\" | \"!=\" | \"<\" | \"<=\" | \">\" | \">=\"\n               | \"+\"  | \"-\"  | \"*\" | \"/\" ;\n```\n\nThis is a valid string in that grammar:\n\n<img src=\"image/parsing-expressions/tokens.png\" alt=\"6 / 3 - 1\" />\n\nBut there are two ways we could have generated it. One way is:\n\n1. Starting at `expression`, pick `binary`.\n2. For the left-hand `expression`, pick `NUMBER`, and use `6`.\n3. For the operator, pick `\"/\"`.\n4. For the right-hand `expression`, pick `binary` again.\n5. In that nested `binary` expression, pick `3 - 1`.\n\nAnother is:\n\n1. Starting at `expression`, pick `binary`.\n2. For the left-hand `expression`, pick `binary` again.\n3. In that nested `binary` expression, pick `6 / 3`.\n4. Back at the outer `binary`, for the operator, pick `\"-\"`.\n5. For the right-hand `expression`, pick `NUMBER`, and use `1`.\n\nThose produce the same *strings*, but not the same *syntax trees*:\n\n<img src=\"image/parsing-expressions/syntax-trees.png\" alt=\"Two valid syntax trees: (6 / 3) - 1 and 6 / (3 - 1)\" />\n\nIn other words, the grammar allows seeing the expression as `(6 / 3) - 1` or `6\n/ (3 - 1)`. The `binary` rule lets operands nest any which way you want. That in\nturn affects the result of evaluating the parsed tree. 
The way mathematicians\nhave addressed this ambiguity since blackboards were first invented is by\ndefining rules for precedence and associativity.\n\n*   <span name=\"nonassociative\">**Precedence**</span> determines which operator\n    is evaluated first in an expression containing a mixture of different\n    operators. Precedence rules tell us that we evaluate the `/` before the `-`\n    in the above example. Operators with higher precedence are evaluated\n    before operators with lower precedence. Equivalently, higher precedence\n    operators are said to \"bind tighter\".\n\n*   **Associativity** determines which operator is evaluated first in a series\n    of the *same* operator. When an operator is **left-associative** (think\n    \"left-to-right\"), operators on the left evaluate before those on the right.\n    Since `-` is left-associative, this expression:\n\n    ```lox\n    5 - 3 - 1\n    ```\n\n    is equivalent to:\n\n    ```lox\n    (5 - 3) - 1\n    ```\n\n    Assignment, on the other hand, is **right-associative**. This:\n\n    ```lox\n    a = b = c\n    ```\n\n    is equivalent to:\n\n    ```lox\n    a = (b = c)\n    ```\n\n<aside name=\"nonassociative\">\n\nWhile not common these days, some languages specify that certain pairs of\noperators have *no* relative precedence. That makes it a syntax error to mix\nthose operators in an expression without using explicit grouping.\n\nLikewise, some operators are **non-associative**. That means it's an error to\nuse that operator more than once in a sequence. For example, Perl's range\noperator isn't associative, so `a .. b` is OK, but `a .. b .. c` is an error.\n\n</aside>\n\nWithout well-defined precedence and associativity, an expression that uses\nmultiple operators is ambiguous -- it can be parsed into different syntax trees,\nwhich could in turn evaluate to different results. 
We'll fix that in Lox by\napplying the same precedence rules as C, going from lowest to highest.\n\n<table>\n<thead>\n<tr>\n  <td>Name</td>\n  <td>Operators</td>\n  <td>Associates</td>\n</tr>\n</thead>\n<tbody>\n<tr>\n  <td>Equality</td>\n  <td><code>==</code> <code>!=</code></td>\n  <td>Left</td>\n</tr>\n<tr>\n  <td>Comparison</td>\n  <td><code>&gt;</code> <code>&gt;=</code>\n      <code>&lt;</code> <code>&lt;=</code></td>\n  <td>Left</td>\n</tr>\n<tr>\n  <td>Term</td>\n  <td><code>-</code> <code>+</code></td>\n  <td>Left</td>\n</tr>\n<tr>\n  <td>Factor</td>\n  <td><code>/</code> <code>*</code></td>\n  <td>Left</td>\n</tr>\n<tr>\n  <td>Unary</td>\n  <td><code>!</code> <code>-</code></td>\n  <td>Right</td>\n</tr>\n</tbody>\n</table>\n\nRight now, the grammar stuffs all expression types into a single `expression`\nrule. That same rule is used as the non-terminal for operands, which lets the\ngrammar accept any kind of expression as a subexpression, regardless of whether\nthe precedence rules allow it.\n\nWe fix that by <span name=\"massage\">stratifying</span> the grammar. We define a\nseparate rule for each precedence level.\n\n```ebnf\nexpression     → ...\nequality       → ...\ncomparison     → ...\nterm           → ...\nfactor         → ...\nunary          → ...\nprimary        → ...\n```\n\n<aside name=\"massage\">\n\nInstead of baking precedence right into the grammar rules, some parser\ngenerators let you keep the same ambiguous-but-simple grammar and then add in a\nlittle explicit operator precedence metadata on the side in order to\ndisambiguate.\n\n</aside>\n\nEach rule here only matches expressions at its precedence level or higher. For\nexample, `unary` matches a unary expression like `!negated` or a primary\nexpression like `1234`. And `term` can match `1 + 2` but also `3 * 4 / 5`. The\nfinal `primary` rule covers the highest-precedence forms -- literals and\nparenthesized expressions.\n\nWe just need to fill in the productions for each of those rules. 
We'll do the\neasy ones first. The top `expression` rule matches any expression at any\nprecedence level. Since <span name=\"equality\">`equality`</span> has the lowest\nprecedence, if we match that, then it covers everything.\n\n<aside name=\"equality\">\n\nWe could eliminate `expression` and simply use `equality` in the other rules\nthat contain expressions, but using `expression` makes those other rules read a\nlittle better.\n\nAlso, in later chapters when we expand the grammar to include assignment and\nlogical operators, we'll only need to change the production for `expression`\ninstead of touching every rule that contains an expression.\n\n</aside>\n\n```ebnf\nexpression     → equality ;\n```\n\nOver at the other end of the precedence table, a primary expression contains\nall the literals and grouping expressions.\n\n```ebnf\nprimary        → NUMBER | STRING | \"true\" | \"false\" | \"nil\"\n               | \"(\" expression \")\" ;\n```\n\nA unary expression starts with a unary operator followed by the operand. Since\nunary operators can nest -- `!!true` is a valid if weird expression -- the\noperand can itself be a unary expression. A recursive rule handles that nicely.\n\n```ebnf\nunary          → ( \"!\" | \"-\" ) unary ;\n```\n\nBut this rule has a problem. It never terminates.\n\nRemember, each rule needs to match expressions at that precedence level *or\nhigher*, so we also need to let this match a primary expression.\n\n```ebnf\nunary          → ( \"!\" | \"-\" ) unary\n               | primary ;\n```\n\nThat works.\n\nThe remaining rules are all binary operators. We'll start with the rule for\nmultiplication and division. Here's a first try:\n\n```ebnf\nfactor         → factor ( \"/\" | \"*\" ) unary\n               | unary ;\n```\n\nThe rule recurses to match the left operand. That enables the rule to match a\nseries of multiplication and division expressions like `1 * 2 / 3`. 
Putting the\nrecursive production on the left side and `unary` on the right makes the rule\n<span name=\"mult\">left-associative</span> and unambiguous.\n\n<aside name=\"mult\">\n\nIn principle, it doesn't matter whether you treat multiplication as left- or\nright-associative -- you get the same result either way. Alas, in the real world\nwith limited precision, roundoff and overflow mean that associativity can affect\nthe result of a sequence of multiplications. Consider:\n\n```lox\nprint 0.1 * (0.2 * 0.3);\nprint (0.1 * 0.2) * 0.3;\n```\n\nIn languages like Lox that use [IEEE 754][754] double-precision floating-point\nnumbers, the first evaluates to `0.006`, while the second yields\n`0.006000000000000001`. Sometimes that tiny difference matters.\n[This][float] is a good place to learn more.\n\n[754]: https://en.wikipedia.org/wiki/Double-precision_floating-point_format\n[float]: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html\n\n</aside>\n\nAll of this is correct, but the fact that the first symbol in the body of the\nrule is the same as the head of the rule means this production is\n**left-recursive**. Some parsing techniques, including the one we're going to\nuse, have trouble with left recursion. (Recursion elsewhere, like the recursion in\n`unary` and the indirect recursion for grouping in `primary`, is not a problem.)\n\nThere are many grammars you can define that match the same language. The choice\nfor how to model a particular language is partially a matter of taste and\npartially a pragmatic one. This rule is correct, but not optimal for how we\nintend to parse it. Instead of a left-recursive rule, we'll use a different one.\n\n```ebnf\nfactor         → unary ( ( \"/\" | \"*\" ) unary )* ;\n```\n\nWe define a factor expression as a flat *sequence* of multiplications\nand divisions. This matches the same syntax as the previous rule, but better\nmirrors the code we'll write to parse Lox. 
We use the same structure for all of\nthe other binary operator precedence levels, giving us this complete expression\ngrammar:\n\n```ebnf\nexpression     → equality ;\nequality       → comparison ( ( \"!=\" | \"==\" ) comparison )* ;\ncomparison     → term ( ( \">\" | \">=\" | \"<\" | \"<=\" ) term )* ;\nterm           → factor ( ( \"-\" | \"+\" ) factor )* ;\nfactor         → unary ( ( \"/\" | \"*\" ) unary )* ;\nunary          → ( \"!\" | \"-\" ) unary\n               | primary ;\nprimary        → NUMBER | STRING | \"true\" | \"false\" | \"nil\"\n               | \"(\" expression \")\" ;\n```\n\nThis grammar is more complex than the one we had before, but in return we have\neliminated the previous one's ambiguity. It's just what we need to make a\nparser.\n\n## Recursive Descent Parsing\n\nThere is a whole pack of parsing techniques whose names are mostly combinations\nof \"L\" and \"R\" -- [LL(k)][], [LR(1)][lr], [LALR][] -- along with more exotic\nbeasts like [parser combinators][], [Earley parsers][], [the shunting yard\nalgorithm][yard], and [packrat parsing][]. For our first interpreter, one\ntechnique is more than sufficient: **recursive descent**.\n\n[ll(k)]: https://en.wikipedia.org/wiki/LL_parser\n[lr]: https://en.wikipedia.org/wiki/LR_parser\n[lalr]: https://en.wikipedia.org/wiki/LALR_parser\n[parser combinators]: https://en.wikipedia.org/wiki/Parser_combinator\n[earley parsers]: https://en.wikipedia.org/wiki/Earley_parser\n[yard]: https://en.wikipedia.org/wiki/Shunting-yard_algorithm\n[packrat parsing]: https://en.wikipedia.org/wiki/Parsing_expression_grammar\n\nRecursive descent is the simplest way to build a parser, and doesn't require\nusing complex parser generator tools like Yacc, Bison or ANTLR. All you need is\nstraightforward handwritten code. Don't be fooled by its simplicity, though.\nRecursive descent parsers are fast, robust, and can support sophisticated\nerror handling. 
In fact, GCC, V8 (the JavaScript VM in Chrome), Roslyn (the C#\ncompiler written in C#) and many other heavyweight production language\nimplementations use recursive descent. It rocks.\n\nRecursive descent is considered a **top-down parser** because it starts from the\ntop or outermost grammar rule (here `expression`) and works its way <span\nname=\"descent\">down</span> into the nested subexpressions before finally\nreaching the leaves of the syntax tree. This is in contrast with bottom-up\nparsers like LR that start with primary expressions and compose them into larger\nand larger chunks of syntax.\n\n<aside name=\"descent\">\n\nIt's called \"recursive *descent*\" because it walks *down* the grammar.\nConfusingly, we also use direction metaphorically when talking about \"high\" and\n\"low\" precedence, but the orientation is reversed. In a top-down parser, you\nreach the lowest-precedence expressions first because they may in turn contain\nsubexpressions of higher precedence.\n\n<img src=\"image/parsing-expressions/direction.png\" alt=\"Top-down grammar rules in order of increasing precedence.\" />\n\nCS people really need to get together and straighten out their metaphors. Don't\neven get me started on which direction a stack grows or why trees have their\nroots on top.\n\n</aside>\n\nA recursive descent parser is a literal translation of the grammar's rules\nstraight into imperative code. Each rule becomes a function. 
The body of the\nrule translates to code roughly like:\n\n<table>\n<thead>\n<tr>\n  <td>Grammar notation</td>\n  <td>Code representation</td>\n</tr>\n</thead>\n<tbody>\n  <tr><td>Terminal</td><td>Code to match and consume a token</td></tr>\n  <tr><td>Nonterminal</td><td>Call to that rule&rsquo;s function</td></tr>\n  <tr><td><code>|</code></td><td><code>if</code> or <code>switch</code> statement</td></tr>\n  <tr><td><code>*</code> or <code>+</code></td><td><code>while</code> or <code>for</code> loop</td></tr>\n  <tr><td><code>?</code></td><td><code>if</code> statement</td></tr>\n</tbody>\n</table>\n\nThe descent is described as \"recursive\" because when a grammar rule refers to\nitself -- directly or indirectly -- that translates to a recursive function\ncall.\n\n### The parser class\n\nEach grammar rule becomes a method inside this new class:\n\n^code parser\n\nLike the scanner, the parser consumes a flat input sequence, only now we're\nreading tokens instead of characters. We store the list of tokens and use\n`current` to point to the next token eagerly waiting to be parsed.\n\nWe're going to run straight through the expression grammar now and translate\neach rule to Java code. The first rule, `expression`, simply expands to the\n`equality` rule, so that's straightforward.\n\n^code expression\n\nEach method for parsing a grammar rule produces a syntax tree for that rule and\nreturns it to the caller. When the body of the rule contains a nonterminal -- a\nreference to another rule -- we <span name=\"left\">call</span> that other rule's\nmethod.\n\n<aside name=\"left\">\n\nThis is why left recursion is problematic for recursive descent. 
The function\nfor a left-recursive rule immediately calls itself, which calls itself again,\nand so on, until the parser hits a stack overflow and dies.\n\n</aside>\n\nThe rule for equality is a little more complex.\n\n```ebnf\nequality       → comparison ( ( \"!=\" | \"==\" ) comparison )* ;\n```\n\nIn Java, that becomes:\n\n^code equality\n\nLet's step through it. The first `comparison` nonterminal in the body translates\nto the first call to `comparison()` in the method. We take that result and store\nit in a local variable.\n\nThen, the `( ... )*` loop in the rule maps to a `while` loop. We need to know\nwhen to exit that loop. We can see that inside the rule, we must first find\neither a `!=` or `==` token. So, if we *don't* see one of those, we must be done\nwith the sequence of equality operators. We express that check using a handy\n`match()` method.\n\n^code match\n\nThis checks to see if the current token has any of the given types. If so, it\nconsumes the token and returns `true`. Otherwise, it returns `false` and leaves\nthe current token alone. The `match()` method is defined in terms of two more\nfundamental operations.\n\nThe `check()` method returns `true` if the current token is of the given type.\nUnlike `match()`, it never consumes the token, it only looks at it.\n\n^code check\n\nThe `advance()` method consumes the current token and returns it, similar to how\nour scanner's corresponding method crawled through characters.\n\n^code advance\n\nThese methods bottom out on the last handful of primitive operations.\n\n^code utils\n\n`isAtEnd()` checks if we've run out of tokens to parse. `peek()` returns the\ncurrent token we have yet to consume, and `previous()` returns the most recently\nconsumed token. The latter makes it easier to use `match()` and then access the\njust-matched token.\n\nThat's most of the parsing infrastructure we need. Where were we? 
Right, so if\nwe are inside the `while` loop in `equality()`, then we know we have found a\n`!=` or `==` operator and must be parsing an equality expression.\n\nWe grab the matched operator token so we can track which kind of equality\nexpression we have. Then we call `comparison()` again to parse the right-hand\noperand. We combine the operator and its two operands into a new `Expr.Binary`\nsyntax tree node, and then loop around. For each iteration, we store the\nresulting expression back in the same `expr` local variable. As we zip through a\nsequence of equality expressions, that creates a left-associative nested tree of\nbinary operator nodes.\n\n<span name=\"sequence\"></span>\n\n<img src=\"image/parsing-expressions/sequence.png\" alt=\"The syntax tree created by parsing 'a == b == c == d == e'\" />\n\n<aside name=\"sequence\">\n\nParsing `a == b == c == d == e`. For each iteration, we create a new binary\nexpression using the previous one as the left operand.\n\n</aside>\n\nThe parser falls out of the loop once it hits a token that's not an equality\noperator. Finally, it returns the expression. Note that if the parser never\nencounters an equality operator, then it never enters the loop. In that case,\nthe `equality()` method effectively calls and returns `comparison()`. In that\nway, this method matches an equality operator *or anything of higher\nprecedence*.\n\nMoving on to the next rule...\n\n```ebnf\ncomparison     → term ( ( \">\" | \">=\" | \"<\" | \"<=\" ) term )* ;\n```\n\nTranslated to Java:\n\n^code comparison\n\nThe grammar rule is virtually <span name=\"handle\">identical</span> to `equality`\nand so is the corresponding code. The only differences are the token types for\nthe operators we match, and the method we call for the operands -- now\n`term()` instead of `comparison()`. 
The remaining two binary operator rules\nfollow the same pattern.\n\nIn order of precedence, first addition and subtraction:\n\n<aside name=\"handle\">\n\nIf you wanted to do some clever Java 8, you could create a helper method for\nparsing a left-associative series of binary operators given a list of token\ntypes, and an operand method handle to simplify this redundant code.\n\n</aside>\n\n^code term\n\nAnd finally, multiplication and division:\n\n^code factor\n\nThat's all of the binary operators, parsed with the correct precedence and\nassociativity. We're crawling up the precedence hierarchy and now we've reached\nthe unary operators.\n\n```ebnf\nunary          → ( \"!\" | \"-\" ) unary\n               | primary ;\n```\n\nThe code for this is a little different.\n\n^code unary\n\nAgain, we look at the <span name=\"current\">current</span> token to see how to\nparse. If it's a `!` or `-`, we must have a unary expression. In that case, we\ngrab the token and then recursively call `unary()` again to parse the operand.\nWrap that all up in a unary expression syntax tree and we're done.\n\n<aside name=\"current\">\n\nThe fact that the parser looks ahead at upcoming tokens to decide how to parse\nputs recursive descent into the category of **predictive parsers**.\n\n</aside>\n\nOtherwise, we must have reached the highest level of precedence, primary\nexpressions.\n\n```ebnf\nprimary        → NUMBER | STRING | \"true\" | \"false\" | \"nil\"\n               | \"(\" expression \")\" ;\n```\n\nMost of the cases for the rule are single terminals, so parsing is\nstraightforward.\n\n^code primary\n\nThe interesting branch is the one for handling parentheses. After we match an\nopening `(` and parse the expression inside it, we *must* find a `)` token. If\nwe don't, that's an error.\n\n## Syntax Errors\n\nA parser really has two jobs:\n\n1.  Given a valid sequence of tokens, produce a corresponding syntax tree.\n\n2.  
Given an *invalid* sequence of tokens, detect any errors and tell the\n    user about their mistakes.\n\nDon't underestimate how important the second job is! In modern IDEs and editors,\nthe parser is constantly reparsing code -- often while the user is still editing\nit -- in order to syntax highlight and support things like auto-complete. That\nmeans it will encounter code in incomplete, half-wrong states *all the time.*\n\nWhen the user doesn't realize the syntax is wrong, it is up to the parser to\nhelp guide them back onto the right path. The way it reports errors is a large\npart of your language's user interface. Good syntax error handling is hard. By\ndefinition, the code isn't in a well-defined state, so there's no infallible way\nto know what the user *meant* to write. The parser can't read your <span\nname=\"telepathy\">mind</span>.\n\n<aside name=\"telepathy\">\n\nNot yet at least. With the way things are going in machine learning these days,\nwho knows what the future will bring?\n\n</aside>\n\nThere are a couple of hard requirements for when the parser runs into a syntax\nerror. A parser must:\n\n*   **Detect and report the error.** If it doesn't detect the <span\n    name=\"error\">error</span> and passes the resulting malformed syntax tree on\n    to the interpreter, all manner of horrors may be summoned.\n\n    <aside name=\"error\">\n\n    Philosophically speaking, if an error isn't detected and the interpreter\n    runs the code, is it *really* an error?\n\n    </aside>\n\n*   **Avoid crashing or hanging.** Syntax errors are a fact of life, and\n    language tools have to be robust in the face of them. Segfaulting or getting\n    stuck in an infinite loop isn't allowed. 
While the source may not be valid\n    *code*, it's still a valid *input to the parser* because users use the\n    parser to learn what syntax is allowed.\n\nThose are the table stakes if you want to get in the parser game at all, but you\nreally want to raise the ante beyond that. A decent parser should:\n\n*   **Be fast.** Computers are thousands of times faster than they were when\n    parser technology was first invented. The days of needing to optimize your\n    parser so that it could get through an entire source file during a coffee\n    break are over. But programmer expectations have risen as quickly, if not\n    faster. They expect their editors to reparse files in milliseconds after\n    every keystroke.\n\n*   **Report as many distinct errors as there are.** Aborting after the first\n    error is easy to implement, but it's annoying for users if every time they\n    fix what they think is the one error in a file, a new one appears. They\n    want to see them all.\n\n*   **Minimize *cascaded* errors.** Once a single error is found, the parser no\n    longer really knows what's going on. It tries to get itself back on track\n    and keep going, but if it gets confused, it may report a slew of ghost\n    errors that don't indicate other real problems in the code. When the first\n    error is fixed, those phantoms disappear, because they reflect only the\n    parser's own confusion. Cascaded errors are annoying because they can scare\n    the user into thinking their code is in a worse state than it is.\n\nThe last two points are in tension. We want to report as many separate errors as\nwe can, but we don't want to report ones that are merely side effects of an\nearlier one.\n\nThe way a parser responds to an error and keeps going to look for later errors\nis called **error recovery**. This was a hot research topic in the '60s. Back\nthen, you'd hand a stack of punch cards to the secretary and come back the next\nday to see if the compiler succeeded. 
With an iteration loop that slow, you\n*really* wanted to find every single error in your code in one pass.\n\nToday, when parsers complete before you've even finished typing, it's less of an\nissue. Simple, fast error recovery is fine.\n\n### Panic mode error recovery\n\n<aside name=\"panic\">\n\nYou know you want to push it.\n\n<img src=\"image/parsing-expressions/panic.png\" alt=\"A big shiny 'PANIC' button.\" />\n\n</aside>\n\nOf all the recovery techniques devised in yesteryear, the one that best stood\nthe test of time is called -- somewhat alarmingly -- <span name=\"panic\">**panic\nmode**</span>. As soon as the parser detects an error, it enters panic mode. It\nknows at least one token doesn't make sense given its current state in the\nmiddle of some stack of grammar productions.\n\nBefore it can get back to parsing, it needs to get its state and the sequence of\nforthcoming tokens aligned such that the next token does match the rule being\nparsed. This process is called **synchronization**.\n\nTo do that, we select some rule in the grammar that will mark the\nsynchronization point. The parser fixes its parsing state by jumping out of any\nnested productions until it gets back to that rule. Then it synchronizes the\ntoken stream by discarding tokens until it reaches one that can appear at that\npoint in the rule.\n\nAny additional real syntax errors hiding in those discarded tokens aren't\nreported, but it also means that any mistaken cascaded errors that are side\neffects of the initial error aren't *falsely* reported either, which is a decent\ntrade-off.\n\nThe traditional place in the grammar to synchronize is between statements. We\ndon't have those yet, so we won't actually synchronize in this chapter, but\nwe'll get the machinery in place for later.\n\n### Entering panic mode\n\nBack before we went on this side trip around error recovery, we were writing the\ncode to parse a parenthesized expression. 
After parsing the expression, the\nparser looks for the closing `)` by calling `consume()`. Here, finally, is that\nmethod:\n\n^code consume\n\nIt's similar to `match()` in that it checks to see if the next token is of the\nexpected type. If so, it consumes the token and everything is groovy. If some\nother token is there, then we've hit an error. We report it by calling this:\n\n^code error\n\nFirst, that shows the error to the user by calling:\n\n^code token-error\n\nThis reports an error at a given token. It shows the token's location and the\ntoken itself. This will come in handy later since we use tokens throughout the\ninterpreter to track locations in code.\n\nAfter we report the error, the user knows about their mistake, but what does the\n*parser* do next? Back in `error()`, we create and return a ParseError, an\ninstance of this new class:\n\n^code parse-error (1 before, 1 after)\n\nThis is a simple sentinel class we use to unwind the parser. The `error()`\nmethod *returns* the error instead of *throwing* it because we want to let the\ncalling method inside the parser decide whether to unwind or not. Some parse\nerrors occur in places where the parser isn't likely to get into a weird state\nand we don't need to <span name=\"production\">synchronize</span>. In those\nplaces, we simply report the error and keep on truckin'.\n\nFor example, Lox limits the number of arguments you can pass to a function. If\nyou pass too many, the parser needs to report that error, but it can and should\nsimply keep on parsing the extra arguments instead of freaking out and going\ninto panic mode.\n\n<aside name=\"production\">\n\nAnother way to handle common syntax errors is with **error productions**. You\naugment the grammar with a rule that *successfully* matches the *erroneous*\nsyntax. The parser safely parses it but then reports it as an error instead of\nproducing a syntax tree.\n\nFor example, some languages have a unary `+` operator, like `+123`, but Lox does\nnot. 
Instead of getting confused when the parser stumbles onto a `+` at the\nbeginning of an expression, we could extend the unary rule to allow it.\n\n```ebnf\nunary → ( \"!\" | \"-\" | \"+\" ) unary\n      | primary ;\n```\n\nThis lets the parser consume `+` without going into panic mode or leaving the\nparser in a weird state.\n\nError productions work well because you, the parser author, know *how* the code\nis wrong and what the user was likely trying to do. That means you can give a\nmore helpful message to get the user back on track, like, \"Unary '+' expressions\nare not supported.\" Mature parsers tend to accumulate error productions like\nbarnacles since they help users fix common mistakes.\n\n</aside>\n\nIn our case, though, the syntax error is nasty enough that we want to panic and\nsynchronize. Discarding tokens is pretty easy, but how do we synchronize the\nparser's own state?\n\n### Synchronizing a recursive descent parser\n\nWith recursive descent, the parser's state -- which rules it is in the middle of\nrecognizing -- is not stored explicitly in fields. Instead, we use Java's\nown call stack to track what the parser is doing. Each rule in the middle of\nbeing parsed is a call frame on the stack. In order to reset that state, we need\nto clear out those call frames.\n\nThe natural way to do that in Java is exceptions. When we want to synchronize,\nwe *throw* that ParseError object. Higher up in the method for the grammar rule\nwe are synchronizing to, we'll catch it. Since we synchronize on statement\nboundaries, we'll catch the exception there. After the exception is caught, the\nparser is in the right state. All that's left is to synchronize the tokens.\n\nWe want to discard tokens until we're right at the beginning of the next\nstatement. That boundary is pretty easy to spot -- it's one of the main reasons\nwe picked it. *After* a semicolon, we're <span name=\"semicolon\">probably</span>\nfinished with a statement. 
Most statements start with a keyword -- `for`, `if`,\n`return`, `var`, etc. When the *next* token is any of those, we're probably\nabout to start a statement.\n\n<aside name=\"semicolon\">\n\nI say \"probably\" because we could hit a semicolon separating clauses in a `for`\nloop. Our synchronization isn't perfect, but that's OK. We've already reported\nthe first error precisely, so everything after that is kind of \"best effort\".\n\n</aside>\n\nThis method encapsulates that logic:\n\n^code synchronize\n\nIt discards tokens until it thinks it has found a statement boundary. After\ncatching a ParseError, we'll call this and then we are hopefully back in sync.\nWhen it works well, we have discarded tokens that would have likely caused\ncascaded errors anyway, and now we can parse the rest of the file starting at\nthe next statement.\n\nAlas, we don't get to see this method in action, since we don't have statements\nyet. We'll get to that [in a couple of chapters][statements]. For now, if an\nerror occurs, we'll panic and unwind all the way to the top and stop parsing.\nSince we can parse only a single expression anyway, that's no big loss.\n\n[statements]: statements-and-state.html\n\n## Wiring up the Parser\n\nWe are mostly done parsing expressions now. There is one other place where we\nneed to add a little error handling. As the parser descends through the parsing\nmethods for each grammar rule, it eventually hits `primary()`. If none of the\ncases in there match, it means we are sitting on a token that can't start an\nexpression. We need to handle that error too.\n\n^code primary-error (5 before, 1 after)\n\nWith that, all that remains in the parser is to define an initial method to kick\nit off. That method is called, naturally enough, `parse()`.\n\n^code parse\n\nWe'll revisit this method later when we add statements to the language. For now,\nit parses a single expression and returns it. We also have some temporary code\nto exit out of panic mode. 
Syntax error recovery is the parser's job, so we\ndon't want the ParseError exception to escape into the rest of the interpreter.\n\nWhen a syntax error does occur, this method returns `null`. That's OK. The\nparser promises not to crash or hang on invalid syntax, but it doesn't promise\nto return a *usable syntax tree* if an error is found. As soon as the parser\nreports an error, `hadError` gets set, and subsequent phases are skipped.\n\nFinally, we can hook up our brand new parser to the main Lox class and try it\nout. We still don't have an interpreter, so for now, we'll parse to a syntax\ntree and then use the AstPrinter class from the [last chapter][ast-printer] to\ndisplay it.\n\n[ast-printer]: representing-code.html#a-not-very-pretty-printer\n\nDelete the old code to print the scanned tokens and replace it with this:\n\n^code print-ast (1 before, 1 after)\n\nCongratulations, you have crossed the <span name=\"harder\">threshold</span>! That\nreally is all there is to handwriting a parser. We'll extend the grammar in\nlater chapters with assignment, statements, and other stuff, but none of that is\nany more complex than the binary operators we tackled here.\n\n<aside name=\"harder\">\n\nIt is possible to define a more complex grammar than Lox's that's difficult to\nparse using recursive descent. Predictive parsing gets tricky when you may need\nto look ahead a large number of tokens to figure out what you're sitting on.\n\nIn practice, most languages are designed to avoid that. Even in cases where they\naren't, you can usually hack around it without too much pain. If you can parse\nC++ using recursive descent -- which many C++ compilers do -- you can parse\nanything.\n\n</aside>\n\nFire up the interpreter and type in some expressions. See how it handles\nprecedence and associativity correctly? Not bad for less than 200 lines of code.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  
In C, a block is a statement form that allows you to pack a series of\n    statements where a single one is expected. The [comma operator][] is an\n    analogous syntax for expressions. A comma-separated series of expressions\n    can be given where a single expression is expected (except inside a function\n    call's argument list). At runtime, the comma operator evaluates the left\n    operand and discards the result. Then it evaluates and returns the right\n    operand.\n\n    Add support for comma expressions. Give them the same precedence and\n    associativity as in C. Write the grammar, and then implement the necessary\n    parsing code.\n\n2.  Likewise, add support for the C-style conditional or \"ternary\" operator\n    `?:`. What precedence level is allowed between the `?` and `:`? Is the whole\n    operator left-associative or right-associative?\n\n3.  Add error productions to handle each binary operator appearing without a\n    left-hand operand. In other words, detect a binary operator appearing at the\n    beginning of an expression. Report that as an error, but also parse and\n    discard a right-hand operand with the appropriate precedence.\n\n[comma operator]: https://en.wikipedia.org/wiki/Comma_operator\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: Logic Versus History\n\nLet's say we decide to add bitwise `&` and `|` operators to Lox. Where should we\nput them in the precedence hierarchy? C -- and most languages that follow in C's\nfootsteps -- place them below `==`. This is widely considered a mistake because\nit means common operations like testing a flag require parentheses.\n\n```c\nif (flags & FLAG_MASK == SOME_FLAG) { ... } // Wrong.\nif ((flags & FLAG_MASK) == SOME_FLAG) { ... } // Right.\n```\n\nShould we fix this for Lox and put bitwise operators higher up the precedence\ntable than C does? 
There are two strategies we can take.\n\nYou almost never want to use the result of an `==` expression as the operand to\na bitwise operator. By making bitwise bind tighter, users don't need to\nparenthesize as often. So if we do that, and users assume the precedence is\nchosen logically to minimize parentheses, they're likely to infer it correctly.\n\nThis kind of internal consistency makes the language easier to learn because\nthere are fewer edge cases and exceptions users have to stumble into and then\ncorrect. That's good, because before users can use our language, they have to\nload all of that syntax and semantics into their heads. A simpler, more rational\nlanguage *makes sense*.\n\nBut, for many users there is an even faster shortcut to getting our language's\nideas into their wetware -- *use concepts they already know*. Many newcomers to\nour language will be coming from some other language or languages. If our\nlanguage uses some of the same syntax or semantics as those, there is much less\nfor the user to learn (and *unlearn*).\n\nThis is particularly helpful with syntax. You may not remember it well today,\nbut way back when you learned your very first programming language, code\nprobably looked alien and unapproachable. Only through painstaking effort did\nyou learn to read and accept it. If you design a novel syntax for your new\nlanguage, you force users to start that process all over again.\n\nTaking advantage of what users already know is one of the most powerful tools\nyou can use to ease adoption of your language. It's almost impossible to\noverestimate how valuable this is. But it faces you with a nasty problem: What\nhappens when the thing the users all know *kind of sucks*? C's bitwise operator\nprecedence is a mistake that doesn't make sense. But it's a *familiar* mistake\nthat millions have already gotten used to and learned to live with.\n\nDo you stay true to your language's own internal logic and ignore history? 
Do\nyou start from a blank slate and first principles? Or do you weave your language\ninto the rich tapestry of programming history and give your users a leg up by\nstarting from something they already know?\n\nThere is no perfect answer here, only trade-offs. You and I are obviously biased\ntowards liking novel languages, so our natural inclination is to burn the\nhistory books and start our own story.\n\nIn practice, it's often better to make the most of what users already know.\nGetting them to come to your language requires a big leap. The smaller you can\nmake that chasm, the more people will be willing to cross it. But you can't\n*always* stick to history, or your language won't have anything new and\ncompelling to give people a *reason* to jump over.\n\n</div>\n"
  },
  {
    "path": "book/representing-code.md",
    "content": "> To dwellers in a wood, almost every species of tree has its voice as well as\n> its feature.\n> <cite>Thomas Hardy, <em>Under the Greenwood Tree</em></cite>\n\nIn the [last chapter][scanning], we took the raw source code as a string and\ntransformed it into a slightly higher-level representation: a series of tokens.\nThe parser we'll write in the [next chapter][parsing] takes those tokens and\ntransforms them yet again, into an even richer, more complex representation.\n\n[scanning]: scanning.html\n[parsing]: parsing-expressions.html\n\nBefore we can produce that representation, we need to define it. That's the\nsubject of this chapter. Along the way, we'll <span name=\"boring\">cover</span>\nsome theory around formal grammars, feel the difference between functional and\nobject-oriented programming, go over a couple of design patterns, and do some\nmetaprogramming.\n\n<aside name=\"boring\">\n\nI was so worried about this being one of the most boring chapters in the book\nthat I kept stuffing more fun ideas into it until I ran out of room.\n\n</aside>\n\nBefore we do all that, let's focus on the main goal -- a representation for\ncode. It should be simple for the parser to produce and easy for the\ninterpreter to consume. If you haven't written a parser or interpreter yet,\nthose requirements aren't exactly illuminating. Maybe your intuition can help.\nWhat is your brain doing when you play the part of a *human* interpreter? How do\nyou mentally evaluate an arithmetic expression like this:\n\n```lox\n1 + 2 * 3 - 4\n```\n\nBecause you understand the order of operations -- the old \"[Please Excuse My\nDear Aunt Sally][sally]\" stuff -- you know that the multiplication is evaluated\nbefore the addition or subtraction. One way to visualize that precedence is\nusing a tree. 
Leaf nodes are numbers, and interior nodes are operators with\nbranches for each of their operands.\n\n[sally]: https://en.wikipedia.org/wiki/Order_of_operations#Mnemonics\n\nIn order to evaluate an arithmetic node, you need to know the numeric values of\nits subtrees, so you have to evaluate those first. That means working your way\nfrom the leaves up to the root -- a *post-order* traversal:\n\n<span name=\"tree-steps\"></span>\n\n<img src=\"image/representing-code/tree-evaluate.png\" alt=\"Evaluating the tree from the bottom up.\" />\n\n<aside name=\"tree-steps\">\n\nA. Starting with the full tree, evaluate the bottom-most operation, `2 * 3`.\n\nB. Now we can evaluate the `+`.\n\nC. Next, the `-`.\n\nD. The final answer.\n\n</aside>\n\nIf I gave you an arithmetic expression, you could draw one of these trees pretty\neasily. Given a tree, you can evaluate it without breaking a sweat. So it\nintuitively seems like a workable representation of our code is a <span\nname=\"only\">tree</span> that matches the grammatical structure -- the operator\nnesting -- of the language.\n\n<aside name=\"only\">\n\nThat's not to say a tree is the *only* possible representation of our code. In\n[Part III][], we'll generate bytecode, another representation that isn't as\nhuman friendly but is closer to the machine.\n\n[part iii]: a-bytecode-virtual-machine.html\n\n</aside>\n\nWe need to get more precise about what that grammar is then. Like lexical\ngrammars in the last chapter, there is a long ton of theory around syntactic\ngrammars. 
We're going into that theory a little more than we did when scanning\nbecause it turns out to be a useful tool throughout much of the interpreter.\nWe start by moving one level up the [Chomsky hierarchy][]...\n\n[chomsky hierarchy]: https://en.wikipedia.org/wiki/Chomsky_hierarchy\n\n## Context-Free Grammars\n\nIn the last chapter, the formalism we used for defining the lexical grammar --\nthe rules for how characters get grouped into tokens -- was called a *regular\nlanguage*. That was fine for our scanner, which emits a flat sequence of tokens.\nBut regular languages aren't powerful enough to handle expressions which can\nnest arbitrarily deeply.\n\nWe need a bigger hammer, and that hammer is a **context-free grammar**\n(**CFG**). It's the next heaviest tool in the toolbox of\n**[formal grammars][]**. A formal grammar takes a set of atomic pieces it calls\nits \"alphabet\". Then it defines a (usually infinite) set of \"strings\" that are\n\"in\" the grammar. Each string is a sequence of \"letters\" in the alphabet.\n\n[formal grammars]: https://en.wikipedia.org/wiki/Formal_grammar\n\nI'm using all those quotes because the terms get a little confusing as you move\nfrom lexical to syntactic grammars. In our scanner's grammar, the alphabet\nconsists of individual characters and the strings are the valid lexemes --\nroughly \"words\". In the syntactic grammar we're talking about now, we're at a\ndifferent level of granularity. Now each \"letter\" in the alphabet is an entire\ntoken and a \"string\" is a sequence of *tokens* -- an entire expression.\n\nOof. 
Maybe a table will help:\n\n<table>\n<thead>\n<tr>\n  <td>Terminology</td>\n  <td></td>\n  <td>Lexical grammar</td>\n  <td>Syntactic grammar</td>\n</tr>\n</thead>\n<tbody>\n<tr>\n  <td>The &ldquo;alphabet&rdquo; is<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.</span></td>\n  <td>&rarr;&ensp;</td>\n  <td>Characters</td>\n  <td>Tokens</td>\n</tr>\n<tr>\n  <td>A &ldquo;string&rdquo; is<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.</span></td>\n  <td>&rarr;&ensp;</td>\n  <td>Lexeme or token</td>\n  <td>Expression</td>\n</tr>\n<tr>\n  <td>It&rsquo;s implemented by the<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.</span></td>\n  <td>&rarr;&ensp;</td>\n  <td>Scanner</td>\n  <td>Parser</td>\n</tr>\n</tbody>\n</table>\n\nA formal grammar's job is to specify which strings are valid and which aren't.\nIf we were defining a grammar for English sentences, \"eggs are tasty for\nbreakfast\" would be in the grammar, but \"tasty breakfast for are eggs\" would\nprobably not.\n\n### Rules for grammars\n\nHow do we write down a grammar that contains an infinite number of valid\nstrings? We obviously can't list them all out. Instead, we create a finite set\nof rules. You can think of them as a game that you can \"play\" in one of two\ndirections.\n\nIf you start with the rules, you can use them to *generate* strings that are in\nthe grammar. Strings created this way are called **derivations** because each is\n*derived* from the rules of the grammar. In each step of the game, you pick a\nrule and follow what it tells you to do. Most of the lingo around formal\ngrammars comes from playing them in this direction. Rules are called\n**productions** because they *produce* strings in the grammar.\n\nEach production in a context-free grammar has a **head** -- its <span\nname=\"name\">name</span> -- and a **body**, which describes what it generates. In\nits pure form, the body is simply a list of symbols. 
Symbols come in two\ndelectable flavors:\n\n<aside name=\"name\">\n\nRestricting heads to a single symbol is a defining feature of context-free\ngrammars. More powerful formalisms like **[unrestricted grammars][]** allow a\nsequence of symbols in the head as well as in the body.\n\n[unrestricted grammars]: https://en.wikipedia.org/wiki/Unrestricted_grammar\n\n</aside>\n\n*   A **terminal** is a letter from the grammar's alphabet. You can think of it\n    like a literal value. In the syntactic grammar we're defining, the terminals\n    are individual lexemes -- tokens coming from the scanner like `if` or\n    `1234`.\n\n    These are called \"terminals\", in the sense of an \"end point\" because they\n    don't lead to any further \"moves\" in the game. You simply produce that one\n    symbol.\n\n*   A **nonterminal** is a named reference to another rule in the grammar. It\n    means \"play that rule and insert whatever it produces here\". In this way,\n    the grammar composes.\n\nThere is one last refinement: you may have multiple rules with the same name.\nWhen you reach a nonterminal with that name, you are allowed to pick any of the\nrules for it, whichever floats your boat.\n\nTo make this concrete, we need a <span name=\"turtles\">way</span> to write down\nthese production rules. People have been trying to crystallize grammar all the\nway back to Pāṇini's *Ashtadhyayi*, which codified Sanskrit grammar a mere\ncouple thousand years ago. Not much progress happened until John Backus and\ncompany needed a notation for specifying ALGOL 58 and came up with\n[**Backus-Naur form**][bnf] (**BNF**). Since then, nearly everyone uses some\nflavor of BNF, tweaked to their own tastes.\n\n[bnf]: https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form\n\nI tried to come up with something clean. Each rule is a name, followed by an\narrow (`→`), followed by a sequence of symbols, and finally ending with a\nsemicolon (`;`). 
Terminals are quoted strings, and nonterminals are lowercase\nwords.\n\n<aside name=\"turtles\">\n\nYes, we need to define a syntax to use for the rules that define our syntax.\nShould we specify that *metasyntax* too? What notation do we use for *it?* It's\nlanguages all the way down!\n\n</aside>\n\nUsing that, here's a grammar for <span name=\"breakfast\">breakfast</span> menus:\n\n<aside name=\"breakfast\">\n\nYes, I really am going to be using breakfast examples throughout this entire\nbook. Sorry.\n\n</aside>\n\n```ebnf\nbreakfast  → protein \"with\" breakfast \"on the side\" ;\nbreakfast  → protein ;\nbreakfast  → bread ;\n\nprotein    → crispiness \"crispy\" \"bacon\" ;\nprotein    → \"sausage\" ;\nprotein    → cooked \"eggs\" ;\n\ncrispiness → \"really\" ;\ncrispiness → \"really\" crispiness ;\n\ncooked     → \"scrambled\" ;\ncooked     → \"poached\" ;\ncooked     → \"fried\" ;\n\nbread      → \"toast\" ;\nbread      → \"biscuits\" ;\nbread      → \"English muffin\" ;\n```\n\nWe can use this grammar to generate random breakfasts. Let's play a round and\nsee how it works. By age-old convention, the game starts with the first rule in\nthe grammar, here `breakfast`. There are three productions for that, and we\nrandomly pick the first one. Our resulting string looks like:\n\n```text\nprotein \"with\" breakfast \"on the side\"\n```\n\nWe need to expand that first nonterminal, `protein`, so we pick a production for\nthat. Let's pick:\n\n```ebnf\nprotein → cooked \"eggs\" ;\n```\n\nNext, we need a production for `cooked`, and so we pick `\"poached\"`. That's a\nterminal, so we add that. Now our string looks like:\n\n```text\n\"poached\" \"eggs\" \"with\" breakfast \"on the side\"\n```\n\nThe next nonterminal is `breakfast` again. The first `breakfast` production we\nchose recursively refers back to the `breakfast` rule. Recursion in the grammar\nis a good sign that the language being defined is context-free instead of\nregular. 
In particular, recursion where the recursive nonterminal has\nproductions on <span name=\"nest\">both</span> sides implies that the language is\nnot regular.\n\n<aside name=\"nest\">\n\nImagine that we've recursively expanded the `breakfast` rule here several times,\nlike \"bacon with bacon with bacon with...\" In order to complete the string\ncorrectly, we need to add an *equal* number of \"on the side\" bits to the end.\nTracking the number of required trailing parts is beyond the capabilities of a\nregular grammar. Regular grammars can express *repetition*, but they can't *keep\ncount* of how many repetitions there are, which is necessary to ensure that the\nstring has the same number of `with` and `on the side` parts.\n\n</aside>\n\nWe could keep picking the first production for `breakfast` over and over again,\nyielding all manner of breakfasts like \"bacon with sausage with scrambled eggs\nwith bacon...\" We won't though. This time we'll pick `bread`. There are three\nrules for that, each of which contains only a terminal. We'll pick \"English\nmuffin\".\n\nWith that, every nonterminal in the string has been expanded until it finally\ncontains only terminals and we're left with:\n\n<img src=\"image/representing-code/breakfast.png\" alt='\"Playing\" the grammar to generate a string.' />\n\nThrow in some ham and Hollandaise, and you've got eggs Benedict.\n\nAny time we hit a rule that had multiple productions, we just picked one\narbitrarily. It is this flexibility that allows a small number of grammar rules\nto encode a combinatorially larger set of strings. The fact that a rule can\nrefer to itself -- directly or indirectly -- kicks it up even more, letting us\npack an infinite number of strings into a finite grammar.\n\n### Enhancing our notation\n\nStuffing an infinite set of strings in a handful of rules is pretty fantastic,\nbut let's take it further. Our notation works, but it's tedious. 
So, like any\ngood language designer, we'll sprinkle a little syntactic sugar on top -- some\nextra convenience notation. In addition to terminals and nonterminals, we'll\nallow a few other kinds of expressions in the body of a rule:\n\n*   Instead of repeating the rule name each time we want to add another\n    production for it, we'll allow a series of productions separated by a pipe\n    (`|`).\n\n    ```ebnf\n    bread → \"toast\" | \"biscuits\" | \"English muffin\" ;\n    ```\n\n*   Further, we'll allow parentheses for grouping and then allow `|` within that\n    to select one from a series of options within the middle of a production.\n\n    ```ebnf\n    protein → ( \"scrambled\" | \"poached\" | \"fried\" ) \"eggs\" ;\n    ```\n\n*   Using recursion to support repeated sequences of symbols has a certain\n    appealing <span name=\"purity\">purity</span>, but it's kind of a chore to\n    make a separate named sub-rule each time we want to loop. So, we also use a\n    postfix `*` to allow the previous symbol or group to be repeated zero or\n    more times.\n\n    ```ebnf\n    crispiness → \"really\" \"really\"* ;\n    ```\n\n<aside name=\"purity\">\n\nThis is how the Scheme programming language works. It has no built-in looping\nfunctionality at all. Instead, *all* repetition is expressed in terms of\nrecursion.\n\n</aside>\n\n*   A postfix `+` is similar, but requires the preceding production to appear\n    at least once.\n\n    ```ebnf\n    crispiness → \"really\"+ ;\n    ```\n\n*   A postfix `?` is for an optional production. The thing before it can appear\n    zero or one time, but not more.\n\n    ```ebnf\n    breakfast → protein ( \"with\" breakfast \"on the side\" )? 
;\n    ```\n\nWith all of those syntactic niceties, our breakfast grammar condenses down to:\n\n```ebnf\nbreakfast → protein ( \"with\" breakfast \"on the side\" )?\n          | bread ;\n\nprotein   → \"really\"+ \"crispy\" \"bacon\"\n          | \"sausage\"\n          | ( \"scrambled\" | \"poached\" | \"fried\" ) \"eggs\" ;\n\nbread     → \"toast\" | \"biscuits\" | \"English muffin\" ;\n```\n\nNot too bad, I hope. If you're used to grep or using [regular\nexpressions][regex] in your text editor, most of the punctuation should be\nfamiliar. The main difference is that symbols here represent entire tokens, not\nsingle characters.\n\n[regex]: https://en.wikipedia.org/wiki/Regular_expression#Standards\n\nWe'll use this notation throughout the rest of the book to precisely describe\nLox's grammar. As you work on programming languages, you'll find that\ncontext-free grammars (using this or [EBNF][] or some other notation) help you\ncrystallize your informal syntax design ideas. They are also a handy medium for\ncommunicating with other language hackers about syntax.\n\n[ebnf]: https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form\n\nThe rules and productions we define for Lox are also our guide to the tree data\nstructure we're going to implement to represent code in memory. Before we can do\nthat, we need an actual grammar for Lox, or at least enough of one for us to get\nstarted.\n\n### A Grammar for Lox expressions\n\nIn the previous chapter, we did Lox's entire lexical grammar in one fell swoop.\nEvery keyword and bit of punctuation is there. The syntactic grammar is larger,\nand it would be a real bore to grind through the entire thing before we actually\nget our interpreter up and running.\n\nInstead, we'll crank through a subset of the language in the next couple of\nchapters. Once we have that mini-language represented, parsed, and interpreted,\nthen later chapters will progressively add new features to it, including the new\nsyntax. 
For now, we are going to worry about only a handful of expressions:\n\n*   **Literals.** Numbers, strings, Booleans, and `nil`.\n\n*   **Unary expressions.** A prefix `!` to perform a logical not, and `-` to\n    negate a number.\n\n*   **Binary expressions.** The infix arithmetic (`+`, `-`, `*`, `/`) and logic\n    operators (`==`, `!=`, `<`, `<=`, `>`, `>=`) we know and love.\n\n*   **Parentheses.** A pair of `(` and `)` wrapped around an expression.\n\nThat gives us enough syntax for expressions like:\n\n```lox\n1 - (2 * 3) < 4 == false\n```\n\nUsing our handy dandy new notation, here's a grammar for those:\n\n```ebnf\nexpression     → literal\n               | unary\n               | binary\n               | grouping ;\n\nliteral        → NUMBER | STRING | \"true\" | \"false\" | \"nil\" ;\ngrouping       → \"(\" expression \")\" ;\nunary          → ( \"-\" | \"!\" ) expression ;\nbinary         → expression operator expression ;\noperator       → \"==\" | \"!=\" | \"<\" | \"<=\" | \">\" | \">=\"\n               | \"+\"  | \"-\"  | \"*\" | \"/\" ;\n```\n\nThere's one bit of extra <span name=\"play\">metasyntax</span> here. In addition\nto quoted strings for terminals that match exact lexemes, we `CAPITALIZE`\nterminals that are a single lexeme whose text representation may vary. `NUMBER`\nis any number literal, and `STRING` is any string literal. Later, we'll do the\nsame for `IDENTIFIER`.\n\nThis grammar is actually ambiguous, which we'll see when we get to parsing it.\nBut it's good enough for now.\n\n<aside name=\"play\">\n\nIf you're so inclined, try using this grammar to generate a few expressions like\nwe did with the breakfast grammar before. Do the resulting expressions look\nright to you? Can you make it generate anything wrong like `1 + / 3`?\n\n</aside>\n\n## Implementing Syntax Trees\n\nFinally, we get to write some code. That little expression grammar is our\nskeleton. 
Since the grammar is recursive -- note how `grouping`, `unary`, and\n`binary` all refer back to `expression` -- our data structure will form a tree.\nSince this structure represents the syntax of our language, it's called a <span\nname=\"ast\">**syntax tree**</span>.\n\n<aside name=\"ast\">\n\nIn particular, we're defining an **abstract syntax tree** (**AST**). In a\n**parse tree**, every single grammar production becomes a node in the tree. An\nAST elides productions that aren't needed by later phases.\n\n</aside>\n\nOur scanner used a single Token class to represent all kinds of lexemes. To\ndistinguish the different kinds -- think the number `123` versus the string\n`\"123\"` -- we included a simple TokenType enum. Syntax trees are not so <span\nname=\"token-data\">homogeneous</span>. Unary expressions have a single operand,\nbinary expressions have two, and literals have none.\n\nWe *could* mush that all together into a single Expression class with an\narbitrary list of children. Some compilers do. But I like getting the most out\nof Java's type system. So we'll define a base class for expressions. Then, for\neach kind of expression -- each production under `expression` -- we create a\nsubclass that has fields for the nonterminals specific to that rule. This way,\nwe get a compile error if we, say, try to access the second operand of a unary\nexpression.\n\n<aside name=\"token-data\">\n\nTokens aren't entirely homogeneous either. Tokens for literals store the value,\nbut other kinds of lexemes don't need that state. 
I have seen scanners that use\ndifferent classes for literals and other kinds of lexemes, but I figured I'd\nkeep things simpler.\n\n</aside>\n\nSomething like this:\n\n```java\npackage com.craftinginterpreters.lox;\n\nabstract class Expr { // [expr]\n  static class Binary extends Expr {\n    Binary(Expr left, Token operator, Expr right) {\n      this.left = left;\n      this.operator = operator;\n      this.right = right;\n    }\n\n    final Expr left;\n    final Token operator;\n    final Expr right;\n  }\n\n  // Other expressions...\n}\n```\n\n<aside name=\"expr\">\n\nI avoid abbreviations in my code because they trip up a reader who doesn't know\nwhat they stand for. But in compilers I've looked at, \"Expr\" and \"Stmt\" are so\nubiquitous that I may as well start getting you used to them now.\n\n</aside>\n\nExpr is the base class that all expression classes inherit from. As you can see\nfrom `Binary`, the subclasses are nested inside of it. There's no technical need\nfor this, but it lets us cram all of the classes into a single Java file.\n\n### Disoriented objects\n\nYou'll note that, much like the Token class, there aren't any methods here. It's\na dumb structure. Nicely typed, but merely a bag of data. This feels strange in\nan object-oriented language like Java. Shouldn't the class *do stuff*?\n\nThe problem is that these tree classes aren't owned by any single domain. Should\nthey have methods for parsing since that's where the trees are created? Or\ninterpreting since that's where they are consumed? Trees span the border between\nthose territories, which means they are really owned by *neither*.\n\nIn fact, these types exist to enable the parser and interpreter to\n*communicate*. That lends itself to types that are simply data with no\nassociated behavior. 
This style is very natural in functional languages like\nLisp and ML where *all* data is separate from behavior, but it feels odd in\nJava.\n\nFunctional programming aficionados right now are jumping up to exclaim \"See!\nObject-oriented languages are a bad fit for an interpreter!\" I won't go that\nfar. You'll recall that the scanner itself was admirably suited to\nobject-orientation. It had all of the mutable state to keep track of where it\nwas in the source code, a well-defined set of public methods, and a handful of\nprivate helpers.\n\nMy feeling is that each phase or part of the interpreter works fine in an\nobject-oriented style. It is the data structures that flow between them that are\nstripped of behavior.\n\n### Metaprogramming the trees\n\nJava can express behavior-less classes, but I wouldn't say that it's\nparticularly great at it. Eleven lines of code to stuff three fields in an\nobject is pretty tedious, and when we're all done, we're going to have 21 of\nthese classes.\n\nI don't want to waste your time or my ink writing all that down. Really, what is\nthe essence of each subclass? A name, and a list of typed fields. That's it.\nWe're smart language hackers, right? Let's <span\nname=\"automate\">automate</span>.\n\n<aside name=\"automate\">\n\nPicture me doing an awkward robot dance when you read that. \"AU-TO-MATE.\"\n\n</aside>\n\nInstead of tediously handwriting each class definition, field declaration,\nconstructor, and initializer, we'll hack together a <span\nname=\"python\">script</span> that does it for us. 
It has a description of each\ntree type -- its name and fields -- and it prints out the Java code needed to\ndefine a class with that name and state.\n\nThis script is a tiny Java command-line app that generates a file named\n\"Expr.java\":\n\n<aside name=\"python\">\n\nI got the idea of scripting the syntax tree classes from Jim Hugunin, creator of\nJython and IronPython.\n\nAn actual scripting language would be a better fit for this than Java, but I'm\ntrying not to throw too many languages at you.\n\n</aside>\n\n^code generate-ast\n\nNote that this file is in a different package, `.tool` instead of `.lox`. This\nscript isn't part of the interpreter itself. It's a tool *we*, the people\nhacking on the interpreter, run ourselves to generate the syntax tree classes.\nWhen it's done, we treat \"Expr.java\" like any other file in the implementation.\nWe are merely automating how that file gets authored.\n\nTo generate the classes, it needs to have some description of each type and its\nfields.\n\n^code call-define-ast (1 before, 1 after)\n\nFor brevity's sake, I jammed the descriptions of the expression types into\nstrings. Each is the name of the class followed by `:` and the list of fields,\nseparated by commas. Each field has a type and a name.\n\nThe first thing `defineAst()` needs to do is output the base Expr class.\n\n^code define-ast\n\nWhen we call this, `baseName` is \"Expr\", which is both the name of the class and\nthe name of the file it outputs. We pass this as an argument instead of\nhardcoding the name because we'll add a separate family of classes later for\nstatements.\n\nInside the base class, we define each subclass.\n\n^code nested-classes (2 before, 1 after)\n\n<aside name=\"robust\">\n\nThis isn't the world's most elegant string manipulation code, but that's fine.\nIt only runs on the exact set of class definitions we give it. Robustness ain't\na priority.\n\n</aside>\n\nThat code, in turn, calls:\n\n^code define-type\n\nThere we go. 
All of that glorious Java boilerplate is done. It declares each\nfield in the class body. It defines a constructor for the class with parameters\nfor each field and initializes them in the body.\n\nCompile and run this Java program now and it <span name=\"longer\">blasts</span>\nout a new &ldquo;.java\" file containing a few dozen lines of code. That file's\nabout to get even longer.\n\n<aside name=\"longer\">\n\n[Appendix II][] contains the code generated by this script once we've finished\nimplementing jlox and defined all of its syntax tree nodes.\n\n[appendix ii]: appendix-ii.html\n\n</aside>\n\n## Working with Trees\n\nPut on your imagination hat for a moment. Even though we aren't there yet,\nconsider what the interpreter will do with the syntax trees. Each kind of\nexpression in Lox behaves differently at runtime. That means the interpreter\nneeds to select a different chunk of code to handle each expression type. With\ntokens, we can simply switch on the TokenType. But we don't have a \"type\" enum\nfor the syntax trees, just a separate Java class for each one.\n\nWe could write a long chain of type tests:\n\n```java\nif (expr instanceof Expr.Binary) {\n  // ...\n} else if (expr instanceof Expr.Grouping) {\n  // ...\n} else // ...\n```\n\nBut all of those sequential type tests are slow. Expression types whose names\nare alphabetically later would take longer to execute because they'd fall\nthrough more `if` cases before finding the right type. That's not my idea of an\nelegant solution.\n\nWe have a family of classes and we need to associate a chunk of behavior with\neach one. The natural solution in an object-oriented language like Java is to\nput those behaviors into methods on the classes themselves. 
We could add an\nabstract <span name=\"interpreter-pattern\">`interpret()`</span> method on Expr\nwhich each subclass would then implement to interpret itself.\n\n<aside name=\"interpreter-pattern\">\n\nThis exact thing is literally called the [\"Interpreter pattern\"][interp] in\n*Design Patterns: Elements of Reusable Object-Oriented Software*, by Erich\nGamma, et al.\n\n[interp]: https://en.wikipedia.org/wiki/Interpreter_pattern\n\n</aside>\n\nThis works alright for tiny projects, but it scales poorly. Like I noted before,\nthese tree classes span a few domains. At the very least, both the parser and\ninterpreter will mess with them. As [you'll see later][resolution], we need to\ndo name resolution on them. If our language was statically typed, we'd have a\ntype checking pass.\n\n[resolution]: resolving-and-binding.html\n\nIf we added instance methods to the expression classes for every one of those\noperations, that would smush a bunch of different domains together. That\nviolates [separation of concerns][] and leads to hard-to-maintain code.\n\n[separation of concerns]: https://en.wikipedia.org/wiki/Separation_of_concerns\n\n### The expression problem\n\nThis problem is more fundamental than it may seem at first. We have a handful of\ntypes, and a handful of high-level operations like \"interpret\". For each pair of\ntype and operation, we need a specific implementation. Picture a table:\n\n<img src=\"image/representing-code/table.png\" alt=\"A table where rows are labeled with expression classes, and columns are function names.\" />\n\nRows are types, and columns are operations. Each cell represents the unique\npiece of code to implement that operation on that type.\n\nAn object-oriented language like Java assumes that all of the code in one row\nnaturally hangs together. 
It figures all the things you do with a type are\nlikely related to each other, and the language makes it easy to define them\ntogether as methods inside the same class.\n\n<img src=\"image/representing-code/rows.png\" alt=\"The table split into rows for each class.\" />\n\nThis makes it easy to extend the table by adding new rows. Simply define a new\nclass. No existing code has to be touched. But imagine if you want to add a new\n*operation* -- a new column. In Java, that means cracking open each of those\nexisting classes and adding a method to it.\n\nFunctional paradigm languages in the <span name=\"ml\">ML</span> family flip that\naround. There, you don't have classes with methods. Types and functions are\ntotally distinct. To implement an operation for a number of different types, you\ndefine a single function. In the body of that function, you use *pattern\nmatching* -- sort of a type-based switch on steroids -- to implement the\noperation for each type all in one place.\n\n<aside name=\"ml\">\n\nML, short for \"metalanguage\", was created by Robin Milner and friends and forms\none of the main branches in the great programming language family tree. Its\nchildren include SML, Caml, OCaml, Haskell, and F#. Even Scala, Rust, and Swift\nbear a strong resemblance.\n\nMuch like Lisp, it is one of those languages that is so full of good ideas that\nlanguage designers today are still rediscovering them over forty years later.\n\n</aside>\n\nThis makes it trivial to add new operations -- simply define another function\nthat pattern matches on all of the types.\n\n<img src=\"image/representing-code/columns.png\" alt=\"The table split into columns for each function.\" />\n\nBut, conversely, adding a new type is hard. You have to go back and add a new\ncase to all of the pattern matches in all of the existing functions.\n\nEach style has a certain \"grain\" to it. 
That's what the paradigm name literally\nsays -- an object-oriented language wants you to *orient* your code along the\nrows of types. A functional language instead encourages you to lump each\ncolumn's worth of code together into a *function*.\n\nA bunch of smart language nerds noticed that neither style made it easy to add\n*both* rows and columns to the <span name=\"multi\">table</span>. They called this\ndifficulty the \"expression problem\" because -- like we are now -- they first ran\ninto it when they were trying to figure out the best way to model expression\nsyntax tree nodes in a compiler.\n\n<aside name=\"multi\">\n\nLanguages with *multimethods*, like Common Lisp's CLOS, Dylan, and Julia do\nsupport adding both new types and operations easily. What they typically\nsacrifice is either static type checking, or separate compilation.\n\n</aside>\n\nPeople have thrown all sorts of language features, design patterns, and\nprogramming tricks to try to knock that problem down but no perfect language has\nfinished it off yet. In the meantime, the best we can do is try to pick a\nlanguage whose orientation matches the natural architectural seams in the\nprogram we're writing.\n\nObject-orientation works fine for many parts of our interpreter, but these tree\nclasses rub against the grain of Java. Fortunately, there's a design pattern we\ncan bring to bear on it.\n\n### The Visitor pattern\n\nThe **Visitor pattern** is the most widely misunderstood pattern in all of\n*Design Patterns*, which is really saying something when you look at the\nsoftware architecture excesses of the past couple of decades.\n\nThe trouble starts with terminology. The pattern isn't about \"visiting\", and the\n\"accept\" method in it doesn't conjure up any helpful imagery either. Many think\nthe pattern has to do with traversing trees, which isn't the case at all. We\n*are* going to use it on a set of classes that are tree-like, but that's a\ncoincidence. 
As you'll see, the pattern works as well on a single object.\n\nThe Visitor pattern is really about approximating the functional style within an\nOOP language. It lets us add new columns to that table easily. We can define all\nof the behavior for a new operation on a set of types in one place, without\nhaving to touch the types themselves. It does this the same way we solve almost\nevery problem in computer science: by adding a layer of indirection.\n\nBefore we apply it to our auto-generated Expr classes, let's walk through a\nsimpler example. Say we have two kinds of pastries: <span\nname=\"beignet\">beignets</span> and crullers.\n\n<aside name=\"beignet\">\n\nA beignet (pronounced \"ben-yay\", with equal emphasis on both syllables) is a\ndeep-fried pastry in the same family as doughnuts. When the French colonized\nNorth America in the 1700s, they brought beignets with them. Today, in the US,\nthey are most strongly associated with the cuisine of New Orleans.\n\nMy preferred way to consume them is fresh out of the fryer at Café du Monde,\npiled high in powdered sugar, and washed down with a cup of café au lait while I\nwatch tourists staggering around trying to shake off their hangover from the\nprevious night's revelry.\n\n</aside>\n\n^code pastries (no location)\n\nWe want to be able to define new pastry operations -- cooking them, eating them,\ndecorating them, etc. -- without having to add a new method to each class every\ntime. Here's how we do it. First, we define a separate interface.\n\n^code pastry-visitor (no location)\n\n<aside name=\"overload\">\n\nIn *Design Patterns*, both of these methods are confusingly named `visit()`, and\nthey rely on overloading to distinguish them. This leads some readers to think\nthat the correct visit method is chosen *at runtime* based on its parameter\ntype. That isn't the case. 
Unlike over*riding*, over*loading* is statically\ndispatched at compile time.\n\nUsing distinct names for each method makes the dispatch more obvious, and also\nshows you how to apply this pattern in languages that don't support overloading.\n\n</aside>\n\nEach operation that can be performed on pastries is a new class that implements\nthat interface. It has a concrete method for each type of pastry. That keeps the\ncode for the operation on both types all nestled snugly together in one class.\n\nGiven some pastry, how do we route it to the correct method on the visitor based\non its type? Polymorphism to the rescue! We add this method to Pastry:\n\n^code pastry-accept (1 before, 1 after, no location)\n\nEach subclass implements it.\n\n^code beignet-accept (1 before, 1 after, no location)\n\nAnd:\n\n^code cruller-accept (1 before, 1 after, no location)\n\nTo perform an operation on a pastry, we call its `accept()` method and pass in\nthe visitor for the operation we want to execute. The pastry -- the specific\nsubclass's overriding implementation of `accept()` -- turns around and calls the\nappropriate visit method on the visitor and passes *itself* to it.\n\nThat's the heart of the trick right there. It lets us use polymorphic dispatch\non the *pastry* classes to select the appropriate method on the *visitor* class.\nIn the table, each pastry class is a row, but if you look at all of the methods\nfor a single visitor, they form a *column*.\n\n<img src=\"image/representing-code/visitor.png\" alt=\"Now all of the cells for one operation are part of the same class, the visitor.\" />\n\nWe added one `accept()` method to each class, and we can use it for as many\nvisitors as we want without ever having to touch the pastry classes again. It's\na clever pattern.\n\n### Visitors for expressions\n\nOK, let's weave it into our expression classes. We'll also <span\nname=\"context\">refine</span> the pattern a little. 
In the pastry example, the\nvisit and `accept()` methods don't return anything. In practice, visitors often\nwant to define operations that produce values. But what return type should\n`accept()` have? We can't assume every visitor class wants to produce the same\ntype, so we'll use generics to let each implementation fill in a return type.\n\n<aside name=\"context\">\n\nAnother common refinement is an additional \"context\" parameter that is passed to\nthe visit methods and then sent back through as a parameter to `accept()`. That\nlets operations take an additional parameter. The visitors we'll define in the\nbook don't need that, so I omitted it.\n\n</aside>\n\nFirst, we define the visitor interface. Again, we nest it inside the base class\nso that we can keep everything in one file.\n\n^code call-define-visitor (2 before, 1 after)\n\nThat function generates the visitor interface.\n\n^code define-visitor\n\nHere, we iterate through all of the subclasses and declare a visit method for\neach one. When we define new expression types later, this will automatically\ninclude them.\n\nInside the base class, we define the abstract `accept()` method.\n\n^code base-accept-method (2 before, 1 after)\n\nFinally, each subclass implements that and calls the right visit method for its\nown type.\n\n^code accept-method (1 before, 2 after)\n\nThere we go. Now we can define operations on expressions without having to muck\nwith the classes or our generator script. Compile and run this generator script\nto output an updated \"Expr.java\" file. It contains a generated Visitor\ninterface and a set of expression node classes that support the Visitor pattern\nusing it.\n\nBefore we end this rambling chapter, let's implement that Visitor interface and\nsee the pattern in action.\n\n## A (Not Very) Pretty Printer\n\nWhen we debug our parser and interpreter, it's often useful to look at a parsed\nsyntax tree and make sure it has the structure we expect. 
We could inspect it in\nthe debugger, but that can be a chore.\n\nInstead, we'd like some code that, given a syntax tree, produces an unambiguous\nstring representation of it. Converting a tree to a string is sort of the\nopposite of a parser, and is often called \"pretty printing\" when the goal is to\nproduce a string of text that is valid syntax in the source language.\n\nThat's not our goal here. We want the string to very explicitly show the nesting\nstructure of the tree. A printer that returned `1 + 2 * 3` isn't super helpful\nif what we're trying to debug is whether operator precedence is handled\ncorrectly. We want to know if the `+` or `*` is at the top of the tree.\n\nTo that end, the string representation we produce isn't going to be Lox syntax.\nInstead, it will look a lot like, well, Lisp. Each expression is explicitly\nparenthesized, and all of its subexpressions and tokens are contained in that.\n\nGiven a syntax tree like:\n\n<img src=\"image/representing-code/expression.png\" alt=\"An example syntax tree.\" />\n\nIt produces:\n\n```text\n(* (- 123) (group 45.67))\n```\n\nNot exactly \"pretty\", but it does show the nesting and grouping explicitly. To\nimplement this, we define a new class.\n\n^code ast-printer\n\nAs you can see, it implements the visitor interface. That means we need visit\nmethods for each of the expression types we have so far.\n\n^code visit-methods (2 before, 1 after)\n\nLiteral expressions are easy -- they convert the value to a string with a little\ncheck to handle Java's `null` standing in for Lox's `nil`. The other expressions\nhave subexpressions, so they use this `parenthesize()` helper method:\n\n^code print-utilities\n\nIt takes a name and a list of subexpressions and wraps them all up in\nparentheses, yielding a string like:\n\n```text\n(+ 1 2)\n```\n\nNote that it calls `accept()` on each subexpression and passes in itself. 
This\nis the <span name=\"tree\">recursive</span> step that lets us print an entire\ntree.\n\n<aside name=\"tree\">\n\nThis recursion is also why people think the Visitor pattern itself has to do\nwith trees.\n\n</aside>\n\nWe don't have a parser yet, so it's hard to see this in action. For now, we'll\nhack together a little `main()` method that manually instantiates a tree and\nprints it.\n\n^code printer-main\n\nIf we did everything right, it prints:\n\n```text\n(* (- 123) (group 45.67))\n```\n\nYou can go ahead and delete this method. We won't need it. Also, as we add new\nsyntax tree types, I won't bother showing the necessary visit methods for them\nin AstPrinter. If you want to (and you want the Java compiler to not yell at\nyou), go ahead and add them yourself. It will come in handy in the next chapter\nwhen we start parsing Lox code into syntax trees. Or, if you don't care to\nmaintain AstPrinter, feel free to delete it. We won't need it again.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  Earlier, I said that the `|`, `*`, and `+` forms we added to our grammar\n    metasyntax were just syntactic sugar. Take this grammar:\n\n    ```ebnf\n    expr → expr ( \"(\" ( expr ( \",\" expr )* )? \")\" | \".\" IDENTIFIER )+\n         | IDENTIFIER\n         | NUMBER\n    ```\n\n    Produce a grammar that matches the same language but does not use any of\n    that notational sugar.\n\n    *Bonus:* What kind of expression does this bit of grammar encode?\n\n1.  The Visitor pattern lets you emulate the functional style in an\n    object-oriented language. Devise a complementary pattern for a functional\n    language. It should let you bundle all of the operations on one type\n    together and let you define new types easily.\n\n    (SML or Haskell would be ideal for this exercise, but Scheme or another Lisp\n    works as well.)\n\n1.  
In [reverse Polish notation][rpn] (RPN), the operands to an arithmetic\n    operator are both placed before the operator, so `1 + 2` becomes `1 2 +`.\n    Evaluation proceeds from left to right. Numbers are pushed onto an implicit\n    stack. An arithmetic operator pops the top two numbers, performs the\n    operation, and pushes the result. Thus, this:\n\n    ```lox\n    (1 + 2) * (4 - 3)\n    ```\n\n    in RPN becomes:\n\n    ```lox\n    1 2 + 4 3 - *\n    ```\n\n    Define a visitor class for our syntax tree classes that takes an expression,\n    converts it to RPN, and returns the resulting string.\n\n[rpn]: https://en.wikipedia.org/wiki/Reverse_Polish_notation\n\n</div>\n"
  },
  {
    "path": "book/resolving-and-binding.md",
    "content": "> Once in a while you find yourself in an odd situation. You get into it by\n> degrees and in the most natural way but, when you are right in the midst of\n> it, you are suddenly astonished and ask yourself how in the world it all came\n> about.\n>\n> <cite>Thor Heyerdahl, <em>Kon-Tiki</em></cite>\n\nOh, no! Our language implementation is taking on water! Way back when we [added\nvariables and blocks][statements], we had scoping nice and tight. But when we\n[later added closures][functions], a hole opened in our formerly waterproof\ninterpreter. Most real programs are unlikely to slip through this hole, but as\nlanguage implementers, we take a sacred vow to care about correctness even in\nthe deepest, dampest corners of the semantics.\n\n[statements]: statements-and-state.html\n[functions]: functions.html\n\nWe will spend this entire chapter exploring that leak, and then carefully\npatching it up. In the process, we will gain a more rigorous understanding of\nlexical scoping as used by Lox and other languages in the C tradition. We'll\nalso get a chance to learn about *semantic analysis* -- a powerful technique for\nextracting meaning from the user's source code without having to run it.\n\n## Static Scope\n\nA quick refresher: Lox, like most modern languages, uses *lexical* scoping. This\nmeans that you can figure out which declaration a variable name refers to just\nby reading the text of the program. For example:\n\n```lox\nvar a = \"outer\";\n{\n  var a = \"inner\";\n  print a;\n}\n```\n\nHere, we know that the `a` being printed is the variable declared on the\nprevious line, and not the global one. Running the program doesn't -- *can't* --\naffect this. 
The scope rules are part of the *static* semantics of the language,\nwhich is why they're also called *static scope*.\n\nI haven't spelled out those scope rules, but now is the time for <span\nname=\"precise\">precision</span>:\n\n<aside name=\"precise\">\n\nThis is still nowhere near as precise as a real language specification. Those\ndocs must be so explicit that even a Martian or an outright malicious programmer\nwould be forced to implement the correct semantics provided they followed the\nletter of the spec.\n\nThat exactitude is important when a language may be implemented by competing\ncompanies who want their product to be incompatible with the others to lock\ncustomers onto their platform. For this book, we can thankfully ignore those\nkinds of shady shenanigans.\n\n</aside>\n\n**A variable usage refers to the preceding declaration with the same name in the\ninnermost scope that encloses the expression where the variable is used.**\n\nThere's a lot to unpack in that:\n\n*   I say \"variable usage\" instead of \"variable expression\" to cover both\n    variable expressions and assignments. Likewise with \"expression where the\n    variable is used\".\n\n*   \"Preceding\" means appearing before *in the program text*.\n\n    ```lox\n    var a = \"outer\";\n    {\n      print a;\n      var a = \"inner\";\n    }\n    ```\n\n    Here, the `a` being printed is the outer one since it appears <span\n    name=\"hoisting\">before</span> the `print` statement that uses it. In most\n    cases, in straight-line code, the declaration preceding in *text* will also\n    precede the usage in *time*. But that's not always true. As we'll see,\n    functions may defer a chunk of code such that its *dynamic temporal*\n    execution no longer mirrors the *static textual* ordering.\n\n    <aside name=\"hoisting\">\n\n    In JavaScript, variables declared using `var` are implicitly \"hoisted\" to\n    the beginning of the enclosing function (or of the whole script, at the top\n    level), not merely to the beginning of the block. 
Any use of that name in the block will refer to\n    that variable, even if the use appears before the declaration. When you\n    write this in JavaScript:\n\n    ```js\n    {\n      console.log(a);\n      var a = \"value\";\n    }\n    ```\n\n    It behaves like:\n\n    ```js\n    {\n      var a; // Hoist.\n      console.log(a);\n      a = \"value\";\n    }\n    ```\n\n    That means that in some cases you can read a variable before its initializer\n    has run -- an annoying source of bugs. The alternate `let` syntax for\n    declaring variables was added later to address this problem.\n\n    </aside>\n\n*   \"Innermost\" is there because of our good friend shadowing. There may be more\n    than one variable with the given name in enclosing scopes, as in:\n\n    ```lox\n    var a = \"outer\";\n    {\n      var a = \"inner\";\n      print a;\n    }\n    ```\n\n    Our rule disambiguates this case by saying the innermost scope wins.\n\nSince this rule makes no mention of any runtime behavior, it implies that a\nvariable expression always refers to the same declaration through the entire\nexecution of the program. Our interpreter so far *mostly* implements the rule\ncorrectly. But when we added closures, an error snuck in.\n\n```lox\nvar a = \"global\";\n{\n  fun showA() {\n    print a;\n  }\n\n  showA();\n  var a = \"block\";\n  showA();\n}\n```\n\n<span name=\"tricky\">Before</span> you type this in and run it, decide what you\nthink it *should* print.\n\n<aside name=\"tricky\">\n\nI know, it's a totally pathological, contrived program. It's just *weird*. No\nreasonable person would ever write code like this. Alas, more of your life than\nyou'd expect will be spent dealing with bizarro snippets of code like this if\nyou stay in the programming language game for long.\n\n</aside>\n\nOK... got it? If you're familiar with closures in other languages, you'll expect\nit to print \"global\" twice. 
The first call to `showA()` should definitely print\n\"global\" since we haven't even reached the declaration of the inner `a` yet. And\nby our rule that a variable expression always resolves to the same variable,\nthat implies the second call to `showA()` should print the same thing.\n\nAlas, it prints:\n\n```text\nglobal\nblock\n```\n\nLet me stress that this program never reassigns any variable and contains only a\nsingle `print` statement. Yet, somehow, that `print` statement for a\nnever-assigned variable prints two different values at different points in time.\nWe definitely broke something somewhere.\n\n### Scopes and mutable environments\n\nIn our interpreter, environments are the dynamic manifestation of static scopes.\nThe two mostly stay in sync with each other -- we create a new environment when\nwe enter a new scope, and discard it when we leave the scope. There is one other\noperation we perform on environments: binding a variable in one. This is where\nour bug lies.\n\nLet's walk through that problematic example and see what the environments look\nlike at each step. First, we declare `a` in the global scope.\n\n<img src=\"image/resolving-and-binding/environment-1.png\" alt=\"The global environment with 'a' defined in it.\" />\n\nThat gives us a single environment with a single variable in it. Then we enter\nthe block and execute the declaration of `showA()`.\n\n<img src=\"image/resolving-and-binding/environment-2.png\" alt=\"A block environment linking to the global one.\" />\n\nWe get a new environment for the block. 
In that, we declare one name, `showA`,\nwhich is bound to the LoxFunction object we create to represent the function.\nThat object has a `closure` field that captures the environment where the\nfunction was declared, so it has a reference back to the environment for the\nblock.\n\nNow we call `showA()`.\n\n<img src=\"image/resolving-and-binding/environment-3.png\" alt=\"An empty environment for showA()'s body linking to the previous two. 'a' is resolved in the global environment.\" />\n\nThe interpreter dynamically creates a new environment for the function body of\n`showA()`. It's empty since that function doesn't declare any variables. The\nparent of that environment is the function's closure -- the outer block\nenvironment.\n\nInside the body of `showA()`, we print the value of `a`. The interpreter looks\nup this value by walking the chain of environments. It gets all the way\nto the global environment before finding it there and printing `\"global\"`.\nGreat.\n\nNext, we declare the second `a`, this time inside the block.\n\n<img src=\"image/resolving-and-binding/environment-4.png\" alt=\"The block environment has both 'a' and 'showA' now.\" />\n\nIt's in the same block -- the same scope -- as `showA()`, so it goes into the\nsame environment, which is also the same environment `showA()`'s closure refers\nto. This is where it gets interesting. We call `showA()` again.\n\n<img src=\"image/resolving-and-binding/environment-5.png\" alt=\"An empty environment for showA()'s body linking to the previous two. 'a' is resolved in the block environment.\" />\n\nWe create a new empty environment for the body of `showA()` again, wire it up to\nthat closure, and run the body. When the interpreter walks the chain of\nenvironments to find `a`, it now discovers the *new* `a` in the block\nenvironment. Boo.\n\nI chose to implement environments in a way that I hoped would agree with your\ninformal intuition around scopes. 
We tend to consider all of the code within a\nblock as being within the same scope, so our interpreter uses a single\nenvironment to represent that. Each environment is a mutable hash table. When a\nnew local variable is declared, it gets added to the existing environment for\nthat scope.\n\nThat intuition, like many in life, isn't quite right. A block is not necessarily\nall the same scope. Consider:\n\n```lox\n{\n  var a;\n  // 1.\n  var b;\n  // 2.\n}\n```\n\nAt the first marked line, only `a` is in scope. At the second line, both `a` and\n`b` are. If you define a \"scope\" to be a set of declarations, then those are\nclearly not the same scope -- they don't contain the same declarations. It's\nlike each `var` statement <span name=\"split\">splits</span> the block into two\nseparate scopes, the scope before the variable is declared and the one after,\nwhich includes the new variable.\n\n<aside name=\"split\">\n\nSome languages make this split explicit. In Scheme and ML, when you declare a\nlocal variable using `let`, you also delineate the subsequent code where the new\nvariable is in scope. There is no implicit \"rest of the block\".\n\n</aside>\n\nBut in our implementation, environments do act like the entire block is one\nscope, just a scope that changes over time. Closures do not like that. When a\nfunction is declared, it captures a reference to the current environment. The\nfunction *should* capture a frozen snapshot of the environment *as it existed at\nthe moment the function was declared*. But instead, in the Java code, it has a\nreference to the actual mutable environment object. When a variable is later\ndeclared in the scope that environment corresponds to, the closure sees the new\nvariable, even though the declaration does *not* precede the function.\n\n### Persistent environments\n\nThere is a style of programming that uses what are called **persistent data\nstructures**. 
Unlike the squishy data structures you're familiar with in\nimperative programming, a persistent data structure can never be directly\nmodified. Instead, any \"modification\" to an existing structure produces a <span\nname=\"copy\">brand</span> new object that contains all of the original data and\nthe new modification. The original is left unchanged.\n\n<aside name=\"copy\">\n\nThis sounds like it might waste tons of memory and time copying the structure\nfor each operation. In practice, persistent data structures share most of their\ndata between the different \"copies\".\n\n</aside>\n\nIf we were to apply that technique to Environment, then every time you declared\na variable it would return a *new* environment that contained all of the\npreviously declared variables along with the one new name. Declaring a variable\nwould do the implicit \"split\" where you have an environment before the variable\nis declared and one after:\n\n<img src=\"image/resolving-and-binding/split.png\" alt=\"Separate environments before and after the variable is declared.\" />\n\nA closure retains a reference to the Environment instance in play when the\nfunction was declared. Since any later declarations in that block would produce\nnew Environment objects, the closure wouldn't see the new variables and our bug\nwould be fixed.\n\nThis is a legit way to solve the problem, and it's the classic way to implement\nenvironments in Scheme interpreters. We could do that for Lox, but it would mean\ngoing back and changing a pile of existing code.\n\nI won't drag you through that. We'll keep the way we represent environments the\nsame. Instead of making the data more statically structured, we'll bake the\nstatic resolution into the access *operation* itself.\n\n## Semantic Analysis\n\nOur interpreter **resolves** a variable -- tracks down which declaration it\nrefers to -- each and every time the variable expression is evaluated. 
If that\nvariable is swaddled inside a loop that runs a thousand times, that variable\ngets re-resolved a thousand times.\n\nWe know static scope means that a variable usage always resolves to the same\ndeclaration, which can be determined just by looking at the text. Given that,\nwhy are we doing it dynamically every time? Doing so doesn't just open the hole\nthat leads to our annoying bug, it's also needlessly slow.\n\nA better solution is to resolve each variable use *once*. Write a chunk of code\nthat inspects the user's program, finds every variable mentioned, and figures\nout which declaration each refers to. This process is an example of a **semantic\nanalysis**. Where a parser tells only if a program is grammatically correct (a\n*syntactic* analysis), semantic analysis goes farther and starts to figure out\nwhat pieces of the program actually mean. In this case, our analysis will\nresolve variable bindings. We'll know not just that an expression *is* a\nvariable, but *which* variable it is.\n\nThere are a lot of ways we could store the binding between a variable and its\ndeclaration. When we get to the C interpreter for Lox, we'll have a *much* more\nefficient way of storing and accessing local variables. But for jlox, I want to\nminimize the collateral damage we inflict on our existing codebase. I'd hate to\nthrow out a bunch of mostly fine code.\n\nInstead, we'll store the resolution in a way that makes the most out of our\nexisting Environment class. Recall how the accesses of `a` are interpreted in\nthe problematic example.\n\n<img src=\"image/resolving-and-binding/environment-3.png\" alt=\"An empty environment for showA()'s body linking to the previous two. 'a' is resolved in the global environment.\" />\n\nIn the first (correct) evaluation, we look at three environments in the chain\nbefore finding the global declaration of `a`. 
Then, when the inner `a` is later\ndeclared in a block scope, it shadows the global one.\n\n<img src=\"image/resolving-and-binding/environment-5.png\" alt=\"An empty environment for showA()'s body linking to the previous two. 'a' is resolved in the block environment.\" />\n\nThe next lookup walks the chain, finds `a` in the *second* environment and\nstops there. Each environment corresponds to a single lexical scope where\nvariables are declared. If we could ensure a variable lookup always walked the\n*same* number of links in the environment chain, that would ensure that it\nfound the same variable in the same scope every time.\n\nTo \"resolve\" a variable usage, we only need to calculate how many \"hops\" away\nthe declared variable will be in the environment chain. The interesting question\nis *when* to do this calculation -- or, put differently, where in our\ninterpreter's implementation do we stuff the code for it?\n\nSince we're calculating a static property based on the structure of the source\ncode, the obvious answer is in the parser. That is the traditional home, and is\nwhere we'll put it later in clox. It would work here too, but I want an excuse to\nshow you another technique. We'll write our resolver as a separate pass.\n\n### A variable resolution pass\n\nAfter the parser produces the syntax tree, but before the interpreter starts\nexecuting it, we'll do a single walk over the tree to resolve all of the\nvariables it contains. Additional passes between parsing and execution are\ncommon. If Lox had static types, we could slide a type checker in there.\nOptimizations are often implemented in separate passes like this too. Basically,\nany work that doesn't rely on state that's only available at runtime can be done\nin this way.\n\nOur variable resolution pass works like a sort of mini-interpreter. 
It walks the\ntree, visiting each node, but a static analysis is different from a dynamic\nexecution:\n\n*   **There are no side effects.** When the static analysis visits a print\n    statement, it doesn't actually print anything. Calls to native functions or\n    other operations that reach out to the outside world are stubbed out and\n    have no effect.\n\n*   **There is no control flow.** Loops are visited only <span\n    name=\"fix\">once</span>. Both branches are visited in `if` statements. Logic\n    operators are not short-circuited.\n\n<aside name=\"fix\">\n\nVariable resolution touches each node once, so its performance is *O(n)* where\n*n* is the number of syntax tree nodes. More sophisticated analyses may have\ngreater complexity, but most are carefully designed to be linear or not far from\nit. It's an embarrassing faux pas if your compiler gets exponentially slower as\nthe user's program grows.\n\n</aside>\n\n## A Resolver Class\n\nLike everything in Java, our variable resolution pass is embodied in a class.\n\n^code resolver\n\nSince the resolver needs to visit every node in the syntax tree, it implements\nthe visitor abstraction we already have in place. Only a few kinds of nodes are\ninteresting when it comes to resolving variables:\n\n*   A block statement introduces a new scope for the statements it contains.\n\n*   A function declaration introduces a new scope for its body and binds its\n    parameters in that scope.\n\n*   A variable declaration adds a new variable to the current scope.\n\n*   Variable and assignment expressions need to have their variables resolved.\n\nThe rest of the nodes don't do anything special, but we still need to implement\nvisit methods for them that traverse into their subtrees. 
Even though a `+`\nexpression doesn't *itself* have any variables to resolve, either of its\noperands might.\n\n### Resolving blocks\n\nWe start with blocks since they create the local scopes where all the magic\nhappens.\n\n^code visit-block-stmt\n\nThis begins a new scope, traverses into the statements inside the block, and\nthen discards the scope. The fun stuff lives in those helper methods. We start\nwith the simple one.\n\n^code resolve-statements\n\nThis walks a list of statements and resolves each one. It in turn calls:\n\n^code resolve-stmt\n\nWhile we're at it, let's add another overload that we'll need later for\nresolving an expression.\n\n^code resolve-expr\n\nThese methods are similar to the `evaluate()` and `execute()` methods in\nInterpreter -- they turn around and apply the Visitor pattern to the given\nsyntax tree node.\n\nThe real interesting behavior is around scopes. A new block scope is created\nlike so:\n\n^code begin-scope\n\nLexical scopes nest in both the interpreter and the resolver. They behave like a\nstack. The interpreter implements that stack using a linked list -- the chain of\nEnvironment objects. In the resolver, we use an actual Java Stack.\n\n^code scopes-field (1 before, 2 after)\n\nThis field keeps track of the stack of scopes currently, uh, in scope. Each\nelement in the stack is a Map representing a single block scope. Keys, as in\nEnvironment, are variable names. The values are Booleans, for a reason I'll\nexplain soon.\n\nThe scope stack is only used for local block scopes. Variables declared at the\ntop level in the global scope are not tracked by the resolver since they are\nmore dynamic in Lox. When resolving a variable, if we can't find it in the stack\nof local scopes, we assume it must be global.\n\nSince scopes are stored in an explicit stack, exiting one is straightforward.\n\n^code end-scope\n\nNow we can push and pop a stack of empty scopes. 
Let's put some things in them.\n\n### Resolving variable declarations\n\nResolving a variable declaration adds a new entry to the current innermost\nscope's map. That seems simple, but there's a little dance we need to do.\n\n^code visit-var-stmt\n\nWe split binding into two steps, declaring then defining, in order to handle\nfunny edge cases like this:\n\n```lox\nvar a = \"outer\";\n{\n  var a = a;\n}\n```\n\nWhat happens when the initializer for a local variable refers to a variable with\nthe same name as the variable being declared? We have a few options:\n\n1.  **Run the initializer, then put the new variable in scope.** Here, the new\n    local `a` would be initialized with \"outer\", the value of the *global* one.\n    In other words, the previous declaration would desugar to:\n\n    ```lox\n    var temp = a; // Run the initializer.\n    var a;        // Declare the variable.\n    a = temp;     // Initialize it.\n    ```\n\n2.  **Put the new variable in scope, then run the initializer.** This means you\n    could observe a variable before it's initialized, so we would need to figure\n    out what value it would have then. Probably `nil`. That means the new local\n    `a` would be re-initialized to its own implicitly initialized value, `nil`.\n    Now the desugaring would look like:\n\n    ```lox\n    var a; // Define the variable.\n    a = a; // Run the initializer.\n    ```\n\n3.  **Make it an error to reference a variable in its initializer.** Have the\n    interpreter fail either at compile time or runtime if an initializer\n    mentions the variable being initialized.\n\nDoes either of those first two options look like something a user actually\n*wants*? Shadowing is rare and often an error, so initializing a shadowing\nvariable based on the value of the shadowed one seems unlikely to be deliberate.\n\nThe second option is even less useful. The new variable will *always* have the\nvalue `nil`. There is never any point in mentioning it by name. 
You could use an\nexplicit `nil` instead.\n\nSince the first two options are likely to mask user errors, we'll take the\nthird. Further, we'll make it a compile error instead of a runtime one. That\nway, the user is alerted to the problem before any code is run.\n\nIn order to do that, as we visit expressions, we need to know if we're inside\nthe initializer for some variable. We do that by splitting binding into two\nsteps. The first is **declaring** it.\n\n^code declare\n\nDeclaration adds the variable to the innermost scope so that it shadows any\nouter one and so that we know the variable exists. We mark it as \"not ready yet\"\nby binding its name to `false` in the scope map. The value associated with a key\nin the scope map represents whether or not we have finished resolving that\nvariable's initializer.\n\nAfter declaring the variable, we resolve its initializer expression in that same\nscope where the new variable now exists but is unavailable. Once the initializer\nexpression is done, the variable is ready for prime time. We do that by\n**defining** it.\n\n^code define\n\nWe set the variable's value in the scope map to `true` to mark it as fully\ninitialized and available for use. It's alive!\n\n### Resolving variable expressions\n\nVariable declarations -- and function declarations, which we'll get to -- write\nto the scope maps. Those maps are read when we resolve variable expressions.\n\n^code visit-variable-expr\n\nFirst, we check to see if the variable is being accessed inside its own\ninitializer. This is where the values in the scope map come into play. If the\nvariable exists in the current scope but its value is `false`, that means we\nhave declared it but not yet defined it. We report that error.\n\nAfter that check, we actually resolve the variable itself using this helper:\n\n^code resolve-local\n\nThis looks, for good reason, a lot like the code in Environment for evaluating a\nvariable. 
We start at the innermost scope and work outwards, looking in each map\nfor a matching name. If we find the variable, we resolve it, passing in the\nnumber of scopes between the current innermost scope and the scope where the\nvariable was found. So, if the variable was found in the current scope, we\npass in 0. If it's in the immediately enclosing scope, 1. You get the idea.\n\nIf we walk through all of the block scopes and never find the variable, we leave\nit unresolved and assume it's global. We'll get to the implementation of that\n`resolve()` method a little later. For now, let's keep on cranking through the\nother syntax nodes.\n\n### Resolving assignment expressions\n\nThe other expression that references a variable is assignment. Resolving one\nlooks like this:\n\n^code visit-assign-expr\n\nFirst, we resolve the expression for the assigned value in case it also contains\nreferences to other variables. Then we use our existing `resolveLocal()` method\nto resolve the variable that's being assigned to.\n\n### Resolving function declarations\n\nFinally, functions. Functions both bind names and introduce a scope. The name of\nthe function itself is bound in the surrounding scope where the function is\ndeclared. When we step into the function's body, we also bind its parameters\ninto that inner function scope.\n\n^code visit-function-stmt\n\nSimilar to `visitVarStmt()`, we declare and define the name of the function\nin the current scope. Unlike variables, though, we define the name eagerly,\nbefore resolving the function's body. This lets a function recursively refer to\nitself inside its own body.\n\nThen we resolve the function's body using this:\n\n^code resolve-function\n\nIt's a separate method since we will also use it for resolving Lox methods when\nwe add classes later. It creates a new scope for the body and then binds\nvariables for each of the function's parameters.\n\nOnce that's ready, it resolves the function body in that scope. 
This is\ndifferent from how the interpreter handles function declarations. At *runtime*,\ndeclaring a function doesn't do anything with the function's body. The body\ndoesn't get touched until later when the function is called. In a *static*\nanalysis, we immediately traverse into the body right then and there.\n\n### Resolving the other syntax tree nodes\n\nThat covers the interesting corners of the grammar. We handle every place where\na variable is declared, read, or written, and every place where a scope is\ncreated or destroyed. Even though they aren't affected by variable resolution,\nwe also need visit methods for all of the other syntax tree nodes in order to\nrecurse into their subtrees. <span name=\"boring\">Sorry</span> this bit is\nboring, but bear with me. We'll go kind of \"top down\" and start with statements.\n\n<aside name=\"boring\">\n\nI did say the book would have every single line of code for these interpreters.\nI didn't say they'd all be exciting.\n\n</aside>\n\nAn expression statement contains a single expression to traverse.\n\n^code visit-expression-stmt\n\nAn `if` statement has an expression for its condition and one or two statements\nfor the branches.\n\n^code visit-if-stmt\n\nHere, we see how resolution is different from interpretation. When we resolve an\n`if` statement, there is no control flow. We resolve the condition and *both*\nbranches. Where a dynamic execution steps only into the branch that *is* run, a\nstatic analysis is conservative -- it analyzes any branch that *could* be run.\nSince either one could be reached at runtime, we resolve both.\n\nLike expression statements, a `print` statement contains a single subexpression.\n\n^code visit-print-stmt\n\nSame deal for `return`.\n\n^code visit-return-stmt\n\nAs in `if` statements, with a `while` statement, we resolve its condition and\nresolve the body exactly once.\n\n^code visit-while-stmt\n\nThat covers all the statements. 
On to expressions...\n\nOur old friend the binary expression. We traverse into and resolve both\noperands.\n\n^code visit-binary-expr\n\nCalls are similar -- we walk the argument list and resolve them all. The thing\nbeing called is also an expression (usually a variable expression), so that gets\nresolved too.\n\n^code visit-call-expr\n\nParentheses are easy.\n\n^code visit-grouping-expr\n\nLiterals are easiest of all.\n\n^code visit-literal-expr\n\nA literal expression doesn't mention any variables and doesn't contain any\nsubexpressions so there is no work to do.\n\nSince a static analysis does no control flow or short-circuiting, logical\nexpressions are exactly the same as other binary operators.\n\n^code visit-logical-expr\n\nAnd, finally, the last node. We resolve its one operand.\n\n^code visit-unary-expr\n\nWith all of these visit methods, the Java compiler should be satisfied that\nResolver fully implements Stmt.Visitor and Expr.Visitor. Now is a good time to\ntake a break, have a snack, maybe a little nap.\n\n## Interpreting Resolved Variables\n\nLet's see what our resolver is good for. Each time it visits a variable, it\ntells the interpreter how many scopes there are between the current scope and\nthe scope where the variable is defined. At runtime, this corresponds exactly to\nthe number of *environments* between the current one and the enclosing one where\nthe interpreter can find the variable's value. The resolver hands that number to\nthe interpreter by calling this:\n\n^code resolve\n\nWe want to store the resolution information somewhere so we can use it when the\nvariable or assignment expression is later executed, but where? One obvious\nplace is right in the syntax tree node itself. That's a fine approach, and\nthat's where many compilers store the results of analyses like this.\n\nWe could do that, but it would require mucking around with our syntax tree\ngenerator. 
Instead, we'll take another common approach and store it off to the\n<span name=\"side\">side</span> in a map that associates each syntax tree node\nwith its resolved data.\n\n<aside name=\"side\">\n\nI *think* I've heard this map called a \"side table\" since it's a tabular data\nstructure that stores data separately from the objects it relates to. But\nwhenever I try to Google for that term, I get pages about furniture.\n\n</aside>\n\nInteractive tools like IDEs often incrementally reparse and re-resolve parts of\nthe user's program. It may be hard to find all of the bits of state that need\nrecalculating when they're hiding in the foliage of the syntax tree. A benefit\nof storing this data outside of the nodes is that it makes it easy to *discard*\nit -- simply clear the map.\n\n^code locals-field (1 before, 2 after)\n\nYou might think we'd need some sort of nested tree structure to avoid getting\nconfused when there are multiple expressions that reference the same variable,\nbut each expression node is its own Java object with its own unique identity. A\nsingle monolithic map doesn't have any trouble keeping them separated.\n\nAs usual, using a collection requires us to import a couple of names.\n\n^code import-hash-map (1 before, 1 after)\n\nAnd:\n\n^code import-map (1 before, 2 after)\n\n### Accessing a resolved variable\n\nOur interpreter now has access to each variable's resolved location. Finally, we\nget to make use of that. We replace the visit method for variable expressions\nwith this:\n\n^code call-look-up-variable (1 before, 1 after)\n\nThat delegates to:\n\n^code look-up-variable\n\nThere are a couple of things going on here. First, we look up the resolved\ndistance in the map. Remember that we resolved only *local* variables. Globals\nare treated specially and don't end up in the map (hence the name `locals`). So,\nif we don't find a distance in the map, it must be global. 
In that case, we\nlook it up, dynamically, directly in the global environment. That throws a\nruntime error if the variable isn't defined.\n\nIf we *do* get a distance, we have a local variable, and we get to take\nadvantage of the results of our static analysis. Instead of calling `get()`, we\ncall this new method on Environment:\n\n^code get-at\n\nThe old `get()` method dynamically walks the chain of enclosing environments,\nscouring each one to see if the variable might be hiding in there somewhere. But\nnow we know exactly which environment in the chain will have the variable. We\nreach it using this helper method:\n\n^code ancestor\n\nThis walks a fixed number of hops up the parent chain and returns the\nenvironment there. Once we have that, `getAt()` simply returns the value of the\nvariable in that environment's map. It doesn't even have to check to see if the\nvariable is there -- we know it will be because the resolver already found it\nbefore.\n\n<aside name=\"coupled\">\n\nThe way the interpreter assumes the variable is in that map feels like flying\nblind. The interpreter code trusts that the resolver did its job and resolved\nthe variable correctly. This implies a deep coupling between these two classes.\nIn the resolver, each line of code that touches a scope must have its exact\nmatch in the interpreter for modifying an environment.\n\nI felt that coupling firsthand because as I wrote the code for the book, I\nran into a couple of subtle bugs where the resolver and interpreter code were\nslightly out of sync. Tracking those down was difficult. One tool to make that\neasier is to have the interpreter explicitly assert -- using Java's assert\nstatements or some other validation tool -- the contract it expects the resolver\nto have already upheld.\n\n</aside>\n\n### Assigning to a resolved variable\n\nWe can also use a variable by assigning to it. 
The changes to visiting an\nassignment expression are similar.\n\n^code resolved-assign (2 before, 1 after)\n\nAgain, we look up the variable's scope distance. If not found, we assume it's\nglobal and handle it the same way as before. Otherwise, we call this new method:\n\n^code assign-at\n\nAs `getAt()` is to `get()`, `assignAt()` is to `assign()`. It walks a fixed\nnumber of environments, and then stuffs the new value in that map.\n\nThose are the only changes to Interpreter. This is why I chose a representation\nfor our resolved data that was minimally invasive. All of the rest of the nodes\ncontinue working as they did before. Even the code for modifying environments is\nunchanged.\n\n### Running the resolver\n\nWe do need to actually *run* the resolver, though. We insert the new pass after\nthe parser does its magic.\n\n^code create-resolver (3 before, 1 after)\n\nWe don't run the resolver if there are any parse errors. If the code has a\nsyntax error, it's never going to run, so there's little value in resolving it.\nIf the syntax is clean, we tell the resolver to do its thing. The resolver has a\nreference to the interpreter and pokes the resolution data directly into it as\nit walks over variables. When the interpreter runs next, it has everything it\nneeds.\n\nAt least, that's true if the resolver *succeeds*. But what about errors during\nresolution?\n\n## Resolution Errors\n\nSince we are doing a semantic analysis pass, we have an opportunity to make\nLox's semantics more precise, and to help users catch bugs early before running\ntheir code. Take a look at this bad boy:\n\n```lox\nfun bad() {\n  var a = \"first\";\n  var a = \"second\";\n}\n```\n\nWe do allow declaring multiple variables with the same name in the *global*\nscope, but doing so in a local scope is probably a mistake. 
If they knew the\nvariable already existed, they would have assigned to it instead of using `var`.\nAnd if they *didn't* know it existed, they probably didn't intend to overwrite\nthe previous one.\n\nWe can detect this mistake statically while resolving.\n\n^code duplicate-variable (1 before, 1 after)\n\nWhen we declare a variable in a local scope, we already know the names of every\nvariable previously declared in that same scope. If we see a collision, we\nreport an error.\n\n### Invalid return errors\n\nHere's another nasty little script:\n\n```lox\nreturn \"at top level\";\n```\n\nThis executes a `return` statement, but it's not even inside a function at all.\nIt's top-level code. I don't know what the user *thinks* is going to happen, but\nI don't think we want Lox to allow this.\n\nWe can extend the resolver to detect this statically. Much like we track scopes\nas we walk the tree, we can track whether or not the code we are currently\nvisiting is inside a function declaration.\n\n^code function-type-field (1 before, 2 after)\n\nInstead of a bare Boolean, we use this funny enum:\n\n^code function-type\n\nIt seems kind of dumb now, but we'll add a couple more cases to it later and\nthen it will make more sense. When we resolve a function declaration, we pass\nthat in.\n\n^code pass-function-type (2 before, 1 after)\n\nOver in `resolveFunction()`, we take that parameter and store it in the field\nbefore resolving the body.\n\n^code set-current-function (1 after)\n\nWe stash the previous value of the field in a local variable first. Remember,\nLox has local functions, so you can nest function declarations arbitrarily\ndeeply. We need to track not just that we're in a function, but *how many* we're\nin.\n\nWe could use an explicit stack of FunctionType values for that, but instead\nwe'll piggyback on the JVM. We store the previous value in a local on the Java\nstack. 
When we're done resolving the function body, we restore the field to that\nvalue.\n\n^code restore-current-function (1 before, 1 after)\n\nNow that we can always tell whether or not we're inside a function declaration,\nwe check that when resolving a `return` statement.\n\n^code return-from-top (1 before, 1 after)\n\nNeat, right?\n\nThere's one more piece. Back in the main Lox class that stitches everything\ntogether, we are careful to not run the interpreter if any parse errors are\nencountered. That check runs *before* the resolver so that we don't try to\nresolve syntactically invalid code.\n\nBut we also need to skip the interpreter if there are resolution errors, so we\nadd *another* check.\n\n^code resolution-error (1 before, 2 after)\n\nYou could imagine doing lots of other analysis in here. For example, if we added\n`break` statements to Lox, we would probably want to ensure they are only used\ninside loops.\n\nWe could go farther and report warnings for code that isn't necessarily *wrong*\nbut probably isn't useful. For example, many IDEs will warn if you have\nunreachable code after a `return` statement, or a local variable whose value is\nnever read. All of that would be pretty easy to add to our static visiting pass,\nor as <span name=\"separate\">separate</span> passes.\n\n<aside name=\"separate\">\n\nThe choice of how many different analyses to lump into a single pass is\ndifficult. Many small isolated passes, each with their own responsibility, are\nsimpler to implement and maintain. However, there is a real runtime cost to\ntraversing the syntax tree itself, so bundling multiple analyses into a single\npass is usually faster.\n\n</aside>\n\nBut, for now, we'll stick with that limited amount of analysis. The important\npart is that we fixed that one weird annoying edge case bug, though it might be\nsurprising that it took this much work to do it.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  
Why is it safe to eagerly define the variable bound to a function's name\n    when other variables must wait until after they are initialized before they\n    can be used?\n\n1.  How do other languages you know handle local variables that refer to the\n    same name in their initializer, like:\n\n    ```lox\n    var a = \"outer\";\n    {\n      var a = a;\n    }\n    ```\n\n    Is it a runtime error? Compile error? Allowed? Do they treat global\n    variables differently? Do you agree with their choices? Justify your answer.\n\n1.  Extend the resolver to report an error if a local variable is never used.\n\n1.  Our resolver calculates *which* environment the variable is found in, but\n    it's still looked up by name in that map. A more efficient environment\n    representation would store local variables in an array and look them up by\n    index.\n\n    Extend the resolver to associate a unique index with each local variable\n    declared in a scope. When resolving a variable access, look up both the\n    scope the variable is in and its index and store that. In the interpreter,\n    use that to quickly access a variable by its index instead of using a map.\n\n</div>\n"
  },
  {
    "path": "book/scanning-on-demand.md",
    "content": "> Literature is idiosyncratic arrangements in horizontal lines in only\n> twenty-six phonetic symbols, ten Arabic numbers, and about eight punctuation\n> marks.\n>\n> <cite>Kurt Vonnegut, <em>Like Shaking Hands With God: A Conversation about Writing</em></cite>\n\nOur second interpreter, clox, has three phases -- scanner, compiler, and virtual\nmachine. A data structure joins each pair of phases. Tokens flow from scanner to\ncompiler, and chunks of bytecode from compiler to VM. We began our\nimplementation near the end with [chunks][] and the [VM][]. Now, we're going to\nhop back to the beginning and build a scanner that makes tokens. In the\n[next chapter][], we'll tie the two ends together with our bytecode compiler.\n\n[chunks]: chunks-of-bytecode.html\n[vm]: a-virtual-machine.html\n[next chapter]: compiling-expressions.html\n\n<img src=\"image/scanning-on-demand/pipeline.png\" alt=\"Source code &rarr; scanner &rarr; tokens &rarr; compiler &rarr; bytecode chunk &rarr; VM.\" />\n\nI'll admit, this is not the most exciting chapter in the book. With two\nimplementations of the same language, there's bound to be some redundancy. I did\nsneak in a few interesting differences compared to jlox's scanner. Read on to\nsee what they are.\n\n## Spinning Up the Interpreter\n\nNow that we're building the front end, we can get clox running like a real\ninterpreter. No more hand-authored chunks of bytecode. It's time for a REPL and\nscript loading. Tear out most of the code in `main()` and replace it with:\n\n^code args (3 before, 2 after)\n\nIf you pass <span name=\"args\">no arguments</span> to the executable, you are\ndropped into the REPL. 
A single command line argument is understood to be the\npath to a script to run.\n\n<aside name=\"args\">\n\nThe code tests for one and two arguments, not zero and one, because the first\nargument in `argv` is always the name of the executable being run.\n\n</aside>\n\nWe'll need a few system headers, so let's get them all out of the way.\n\n^code main-includes (1 after)\n\nNext, we get the REPL up and REPL-ing.\n\n^code repl (1 before)\n\nA quality REPL handles input that spans multiple lines gracefully and doesn't\nhave a hardcoded line length limit. This REPL here is a little more, ahem,\naustere, but it's fine for our purposes.\n\nThe real work happens in `interpret()`. We'll get to that soon, but first let's\ntake care of loading scripts.\n\n^code run-file\n\nWe read the file and execute the resulting string of Lox source code. Then,\nbased on the result of that, we set the exit code appropriately because we're\nscrupulous tool builders and care about little details like that.\n\nWe also need to free the source code string because `readFile()` dynamically\nallocates it and passes ownership to its caller. That function looks like this:\n\n<aside name=\"owner\">\n\nC asks us not just to manage memory explicitly, but *mentally*. We programmers\nhave to remember the ownership rules and hand-implement them throughout the\nprogram. Java just does it for us. C++ gives us tools to encode the policy\ndirectly so that the compiler validates it for us.\n\nI like C's simplicity, but we pay a real price for it -- the language requires\nus to be more conscientious.\n\n</aside>\n\n^code read-file\n\nLike a lot of C code, it takes more effort than it seems like it should,\nespecially for a language expressly designed for operating systems. The\ndifficult part is that we want to allocate a big enough string to read the whole\nfile, but we don't know how big the file is until we've read it.\n\nThe code here is the classic trick to solve that. 
We open the file, but before\nreading it, we seek to the very end using `fseek()`. Then we call `ftell()`\nwhich tells us how many bytes we are from the start of the file. Since we seeked\n(sought?) to the end, that's the size. We rewind back to the beginning, allocate\na string of that <span name=\"one\">size</span>, and read the whole file in a\nsingle batch.\n\n<aside name=\"one\">\n\nWell, that size *plus one*. Always gotta remember to make room for the null\nbyte.\n\n</aside>\n\nSo we're done, right? Not quite. These function calls, like most calls in the C\nstandard library, can fail. If this were Java, the failures would be thrown as\nexceptions and automatically unwind the stack so we wouldn't *really* need to\nhandle them. In C, if we don't check for them, they silently get ignored.\n\nThis isn't really a book on good C programming practice, but I hate to encourage\nbad style, so let's go ahead and handle the errors. It's good for us, like\neating our vegetables or flossing.\n\nFortunately, we don't need to do anything particularly clever if a failure\noccurs. If we can't correctly read the user's script, all we can really do is\ntell the user and exit the interpreter gracefully. First of all, we might fail\nto open the file.\n\n^code no-file (1 before, 2 after)\n\nThis can happen if the file doesn't exist or the user doesn't have access to it.\nIt's pretty common -- people mistype paths all the time.\n\nThis failure is much rarer:\n\n^code no-buffer (1 before, 1 after)\n\nIf we can't even allocate enough memory to read the Lox script, the user's\nprobably got bigger problems to worry about, but we should do our best to at\nleast let them know.\n\nFinally, the read itself may fail.\n\n^code no-read (1 before, 1 after)\n\nThis is also unlikely. 
Actually, the <span name=\"printf\"> calls</span> to\n`fseek()`, `ftell()`, and `rewind()` could theoretically fail too, but let's not\ngo too far off in the weeds, shall we?\n\n<aside name=\"printf\">\n\nEven good old `printf()` can fail. Yup. How many times have you handled *that*\nerror?\n\n</aside>\n\n### Opening the compilation pipeline\n\nWe've got ourselves a string of Lox source code, so now we're ready to set up a\npipeline to scan, compile, and execute it. It's driven by `interpret()`. Right\nnow, that function runs our old hardcoded test chunk. Let's change it to\nsomething closer to its final incarnation.\n\n^code vm-interpret-h (1 before, 1 after)\n\nWhere before we passed in a Chunk, now we pass in the string of source code.\nHere's the new implementation:\n\n^code vm-interpret-c (1 after)\n\nWe won't build the actual *compiler* yet in this chapter, but we can start\nlaying out its structure. It lives in a new module.\n\n^code vm-include-compiler (1 before, 1 after)\n\nFor now, the one function in it is declared like so:\n\n^code compiler-h\n\nThat signature will change, but it gets us going.\n\nThe first phase of compilation is scanning -- the thing we're doing in this\nchapter -- so right now all the compiler does is set that up.\n\n^code compiler-c\n\nThis will also grow in later chapters, naturally.\n\n### The scanner scans\n\nThere are still a few more feet of scaffolding to stand up before we can start\nwriting useful code. First, a new header:\n\n^code scanner-h\n\nAnd its corresponding implementation:\n\n^code scanner-c\n\nAs our scanner chews through the user's source code, it tracks how far it's\ngone. Like we did with the VM, we wrap that state in a struct and then create a\nsingle top-level module variable of that type so we don't have to pass it around\nall of the various functions.\n\nThere are surprisingly few fields. 
The `start` pointer marks the beginning of\nthe current lexeme being scanned, and `current` points to the current character\nbeing looked at.\n\n<span name=\"fields\"></span>\n\n<img src=\"image/scanning-on-demand/fields.png\" alt=\"The start and current fields pointing at 'print bacon;'. Start points at 'b' and current points at 'o'.\" />\n\n<aside name=\"fields\">\n\nHere, we are in the middle of scanning the identifier `bacon`. The current\ncharacter is `o` and the character we most recently consumed is `c`.\n\n</aside>\n\nWe have a `line` field to track what line the current lexeme is on for error\nreporting. That's it! We don't even keep a pointer to the beginning of the\nsource code string. The scanner works its way through the code once and is done\nafter that.\n\nSince we have some state, we should initialize it.\n\n^code init-scanner\n\nWe start at the very first character on the very first line, like a runner\ncrouched at the starting line.\n\n## A Token at a Time\n\nIn jlox, when the starting gun went off, the scanner raced ahead and eagerly\nscanned the whole program, returning a list of tokens. This would be a challenge\nin clox. We'd need some sort of growable array or list to store the tokens in.\nWe'd need to manage allocating and freeing the tokens, and the collection\nitself. That's a lot of code, and a lot of memory churn.\n\nAt any point in time, the compiler needs only one or two tokens -- remember our\ngrammar requires only a single token of lookahead -- so we don't need to keep\nthem *all* around at the same time. Instead, the simplest solution is to not\nscan a token until the compiler needs one. When the scanner provides one, it\nreturns the token by value. It doesn't need to dynamically allocate anything --\nit can just pass tokens around on the C stack.\n\nUnfortunately, we don't have a compiler yet that can ask the scanner for tokens,\nso the scanner will just sit there doing nothing. 
To kick it into action, we'll\nwrite some temporary code to drive it.\n\n^code dump-tokens (1 before, 1 after)\n\n<aside name=\"format\">\n\nThat `%.*s` in the format string is a neat feature. Usually, you set the output\nprecision -- the number of characters to show -- by placing a number inside the\nformat string. Using `*` instead lets you pass the precision as an argument. So\nthat `printf()` call prints the first `token.length` characters of the string at\n`token.start`. We need to limit the length like that because the lexeme points\ninto the original source string and doesn't have a terminator at the end.\n\n</aside>\n\nThis loops indefinitely. Each turn through the loop, it scans one token and\nprints it. When it reaches a special \"end of file\" token or an error, it stops.\nFor example, if we run the interpreter on this program:\n\n```lox\nprint 1 + 2;\n```\n\nIt prints out:\n\n```text\n   1 31 'print'\n   | 21 '1'\n   |  7 '+'\n   | 21 '2'\n   |  8 ';'\n   2 39 ''\n```\n\nThe first column is the line number, the second is the numeric value of the\ntoken <span name=\"token\">type</span>, and then finally the lexeme. That last\nempty lexeme on line 2 is the EOF token.\n\n<aside name=\"token\">\n\nYeah, the raw index of the token type isn't exactly human readable, but it's all\nC gives us.\n\n</aside>\n\nThe goal for the rest of the chapter is to make that blob of code work by\nimplementing this key function:\n\n^code scan-token-h (1 before, 2 after)\n\nEach call scans and returns the next token in the source code. A token looks\nlike this:\n\n^code token-struct (1 before, 2 after)\n\nIt's pretty similar to jlox's Token class. We have an enum identifying what type\nof token it is -- number, identifier, `+` operator, etc. 
The enum is virtually\nidentical to the one in jlox, so let's just hammer out the whole thing.\n\n^code token-type (2 before, 2 after)\n\nAside from prefixing all the names with `TOKEN_` (since C tosses enum names in\nthe top-level namespace) the only difference is that extra `TOKEN_ERROR` type.\nWhat's that about?\n\nThere are only a couple of errors that get detected during scanning:\nunterminated strings and unrecognized characters. In jlox, the scanner reports\nthose itself. In clox, the scanner produces a synthetic \"error\" token for that\nerror and passes it over to the compiler. This way, the compiler knows an error\noccurred and can kick off error recovery before reporting it.\n\nThe novel part in clox's Token type is how it represents the lexeme. In jlox,\neach Token stored the lexeme as its own separate little Java string. If we did\nthat for clox, we'd have to figure out how to manage the memory for those\nstrings. That's especially hard since we pass tokens by value\n-- multiple tokens could point to the same lexeme string. Ownership gets weird.\n\nInstead, we use the original source string as our character store. We represent\na lexeme by a pointer to its first character and the number of characters it\ncontains. This means we don't need to worry about managing memory for lexemes at\nall and we can freely copy tokens around. As long as the main source code string\n<span name=\"outlive\">outlives</span> all of the tokens, everything works fine.\n\n<aside name=\"outlive\">\n\nI don't mean to sound flippant. We really do need to think about and ensure that\nthe source string, which is created far away over in the \"main\" module, has a\nlong enough lifetime. That's why `runFile()` doesn't free the string until\n`interpret()` finishes executing the code and returns.\n\n</aside>\n\n### Scanning tokens\n\nWe're ready to scan some tokens. 
We'll work our way up to the complete\nimplementation, starting with this:\n\n^code scan-token\n\nSince each call to this function scans a complete token, we know we are at the\nbeginning of a new token when we enter the function. Thus, we set\n`scanner.start` to point to the current character so we remember where the\nlexeme we're about to scan starts.\n\nThen we check to see if we've reached the end of the source code. If so, we\nreturn an EOF token and stop. This is a sentinel value that signals to the\ncompiler to stop asking for more tokens.\n\nIf we aren't at the end, we do some... stuff... to scan the next token. But we\nhaven't written that code yet. We'll get to that soon. If that code doesn't\nsuccessfully scan and return a token, then we reach the end of the function.\nThat must mean we're at a character that the scanner can't recognize, so we\nreturn an error token for that.\n\nThis function relies on a couple of helpers, most of which are familiar from\njlox. First up:\n\n^code is-at-end\n\nWe require the source string to be a good null-terminated C string. If the\ncurrent character is the null byte, then we've reached the end.\n\nTo create a token, we have this constructor-like function:\n\n^code make-token\n\nIt uses the scanner's `start` and `current` pointers to capture the token's\nlexeme. It sets a couple of other obvious fields then returns the token. It has\na sister function for returning error tokens.\n\n^code error-token\n\n<span name=\"axolotl\"></span>\n\n<aside name=\"axolotl\">\n\nThis part of the chapter is pretty dry, so here's a picture of an axolotl.\n\n<img src=\"image/scanning-on-demand/axolotl.png\" alt=\"A drawing of an axolotl.\" />\n\n</aside>\n\nThe only difference is that the \"lexeme\" points to the error message string\ninstead of pointing into the user's source code. Again, we need to ensure that\nthe error message sticks around long enough for the compiler to read it. 
In\npractice, we only ever call this function with C string literals. Those are\nconstant and eternal, so we're fine.\n\nWhat we have now is basically a working scanner for a language with an empty\nlexical grammar. Since the grammar has no productions, every character is an\nerror. That's not exactly a fun language to program in, so let's fill in the\nrules.\n\n## A Lexical Grammar for Lox\n\nThe simplest tokens are only a single character. We recognize those like so:\n\n^code scan-char (1 before, 2 after)\n\nWe read the next character from the source code, and then do a straightforward\nswitch to see if it matches any of Lox's one-character lexemes. To read the next\ncharacter, we use a new helper which consumes the current character and returns\nit.\n\n^code advance\n\nNext up are the two-character punctuation tokens like `!=` and `>=`. Each of\nthese also has a corresponding single-character token. That means that when we\nsee a character like `!`, we don't know if we're in a `!` token or a `!=` until\nwe look at the next character too. We handle those like so:\n\n^code two-char (1 before, 1 after)\n\nAfter consuming the first character, we look for an `=`. If found, we consume it\nand return the corresponding two-character token. Otherwise, we leave the\ncurrent character alone (so it can be part of the *next* token) and return the\nappropriate one-character token.\n\nThat logic for conditionally consuming the second character lives here:\n\n^code match\n\nIf the current character is the desired one, we advance and return `true`.\nOtherwise, we return `false` to indicate it wasn't matched.\n\nNow our scanner supports all of the punctuation-like tokens. Before we get to\nthe longer ones, let's take a little side trip to handle characters that aren't\npart of a token at all.\n\n### Whitespace\n\nOur scanner needs to handle spaces, tabs, and newlines, but those characters\ndon't become part of any token's lexeme. 
We could check for those inside the\nmain character switch in `scanToken()` but it gets a little tricky to ensure\nthat the function still correctly finds the next token *after* the whitespace\nwhen you call it. We'd have to wrap the whole body of the function in a loop or\nsomething.\n\nInstead, before starting the token, we shunt off to a separate function.\n\n^code call-skip-whitespace (1 before, 1 after)\n\nThis advances the scanner past any leading whitespace. After this call returns,\nwe know the very next character is a meaningful one (or we're at the end of the\nsource code).\n\n^code skip-whitespace\n\nIt's sort of a separate mini-scanner. It loops, consuming every whitespace\ncharacter it encounters. We need to be careful that it does *not* consume any\n*non*-whitespace characters. To support that, we use this:\n\n^code peek\n\nThis simply returns the current character, but doesn't consume it. The previous\ncode handles all the whitespace characters except for newlines.\n\n^code newline (1 before, 2 after)\n\nWhen we consume one of those, we also bump the current line number.\n\n### Comments\n\nComments aren't technically \"whitespace\", if you want to get all precise with\nyour terminology, but as far as Lox is concerned, they may as well be, so we\nskip those too.\n\n^code comment (1 before, 2 after)\n\nComments start with `//` in Lox, so as with `!=` and friends, we need a second\ncharacter of lookahead. However, with `!=`, we still wanted to consume the `!`\neven if the `=` wasn't found. Comments are different. If we don't find a second\n`/`, then `skipWhitespace()` needs to not consume the *first* slash either.\n\nTo handle that, we add:\n\n^code peek-next\n\nThis is like `peek()` but for one character past the current one. If the current\ncharacter and the next one are both `/`, we consume them and then any other\ncharacters until the next newline or the end of the source code.\n\nWe use `peek()` to check for the newline but not consume it. 
That way, the\nnewline will be the current character on the next turn of the outer loop in\n`skipWhitespace()` and we'll recognize it and increment `scanner.line`.\n\n### Literal tokens\n\nNumber and string tokens are special because they have a runtime value\nassociated with them. We'll start with strings because they are easy to\nrecognize -- they always begin with a double quote.\n\n^code scan-string (1 before, 1 after)\n\nThat calls a new function.\n\n^code string\n\nSimilar to jlox, we consume characters until we reach the closing quote. We also\ntrack newlines inside the string literal. (Lox supports multi-line strings.)\nAnd, as ever, we gracefully handle running out of source code before we find the\nend quote.\n\nThe main change here in clox is something that's *not* present. Again, it\nrelates to memory management. In jlox, the Token class had a field of type\nObject to store the runtime value converted from the literal token's lexeme.\n\nImplementing that in C would require a lot of work. We'd need some sort of union\nand type tag to tell whether the token contains a string or double value. If\nit's a string, we'd need to manage the memory for the string's character array\nsomehow.\n\nInstead of adding that complexity to the scanner, we defer <span\nname=\"convert\">converting</span> the literal lexeme to a runtime value until\nlater. In clox, tokens only store the lexeme -- the character sequence exactly\nas it appears in the user's source code. Later in the compiler, we'll convert\nthat lexeme to a runtime value right when we are ready to store it in the\nchunk's constant table.\n\n<aside name=\"convert\">\n\nDoing the lexeme-to-value conversion in the compiler does introduce some\nredundancy. The work to scan a number literal is awfully similar to the work\nrequired to convert a sequence of digit characters to a number value. 
But there\nisn't *that* much redundancy, it isn't in anything performance critical, and it\nkeeps our scanner simpler.\n\n</aside>\n\nNext up, numbers. Instead of adding a switch case for each of the ten digits\nthat can start a number, we handle them here:\n\n^code scan-number (1 before, 2 after)\n\nThat uses this obvious utility function:\n\n^code is-digit\n\nWe finish scanning the number using this:\n\n^code number\n\nIt's virtually identical to jlox's version except, again, we don't convert the\nlexeme to a double yet.\n\n## Identifiers and Keywords\n\nThe last batch of tokens are identifiers, both user-defined and reserved. This\nsection should be fun -- the way we recognize keywords in clox is quite\ndifferent from how we did it in jlox, and touches on some important data\nstructures.\n\nFirst, though, we have to scan the lexeme. Names start with a letter or\nunderscore.\n\n^code scan-identifier (1 before, 1 after)\n\nWe recognize those using this:\n\n^code is-alpha\n\nOnce we've found an identifier, we scan the rest of it here:\n\n^code identifier\n\nAfter the first letter, we allow digits too, and we keep consuming alphanumerics\nuntil we run out of them. Then we produce a token with the proper type.\nDetermining that \"proper\" type is the unique part of this chapter.\n\n^code identifier-type\n\nOkay, I guess that's not very exciting yet. That's what it looks like if we\nhave no reserved words at all. How should we go about recognizing keywords? In\njlox, we stuffed them all in a Java Map and looked them up by name. We don't\nhave any sort of hash table structure in clox, at least not yet.\n\nA hash table would be overkill anyway. 
To look up a string in a hash <span\nname=\"hash\">table</span>, we need to walk the string to calculate its hash code,\nfind the corresponding bucket in the hash table, and then do a\ncharacter-by-character equality comparison on any string it happens to find\nthere.\n\n<aside name=\"hash\">\n\nDon't worry if this is unfamiliar to you. When we get to [building our own hash\ntable from scratch][hash], we'll learn all about it in exquisite detail.\n\n[hash]: hash-tables.html\n\n</aside>\n\nLet's say we've scanned the identifier \"gorgonzola\". How much work *should* we\nneed to do to tell if that's a reserved word? Well, no Lox keyword starts with\n\"g\", so looking at the first character is enough to definitively answer no.\nThat's a lot simpler than a hash table lookup.\n\nWhat about \"cardigan\"? We do have a keyword in Lox that starts with \"c\":\n\"class\". But the second character in \"cardigan\", \"a\", rules that out. What about\n\"forest\"? Since \"for\" is a keyword, we have to go farther in the string before\nwe can establish that we don't have a reserved word. But, in most cases, only a\ncharacter or two is enough to tell we've got a user-defined name on our hands.\nWe should be able to recognize that and fail fast.\n\nHere's a visual representation of that branching character-inspection logic:\n\n<span name=\"down\"></span>\n\n<img src=\"image/scanning-on-demand/keywords.png\" alt=\"A trie that contains all of Lox's keywords.\" />\n\n<aside name=\"down\">\n\nRead down each chain of nodes and you'll see Lox's keywords emerge.\n\n</aside>\n\nWe start at the root node. If there is a child node whose letter matches the\nfirst character in the lexeme, we move to that node. Then repeat for the next\nletter in the lexeme and so on. If at any point the next letter in the lexeme\ndoesn't match a child node, then the identifier must not be a keyword and we\nstop. 
If we reach a double-lined box, and we're at the last character of the\nlexeme, then we found a keyword.\n\n### Tries and state machines\n\nThis tree diagram is an example of a thing called a <span\nname=\"trie\">[**trie**][trie]</span>. A trie stores a set of strings. Most other\ndata structures for storing strings contain the raw character arrays and then\nwrap them inside some larger construct that helps you search faster. A trie is\ndifferent. Nowhere in the trie will you find a whole string.\n\n[trie]: https://en.wikipedia.org/wiki/Trie\n\n<aside name=\"trie\">\n\n\"Trie\" is one of the most confusing names in CS. Edward Fredkin yanked it out of\nthe middle of the word \"retrieval\", which means it should be pronounced like\n\"tree\". But, uh, there is already a pretty important data structure pronounced\n\"tree\" *which tries are a special case of*, so unless you never speak of these\nthings out loud, no one can tell which one you're talking about. Thus, people\nthese days often pronounce it like \"try\" to avoid the headache.\n\n</aside>\n\nInstead, each string the trie \"contains\" is represented as a *path* through the\ntree of character nodes, as in our traversal above. Nodes that match the last\ncharacter in a string have a special marker -- the double lined boxes in the\nillustration. That way, if your trie contains, say, \"banquet\" and \"ban\", you are\nable to tell that it does *not* contain \"banque\" -- the \"e\" node won't have that\nmarker, while the \"n\" and \"t\" nodes will.\n\nTries are a special case of an even more fundamental data structure: a\n[**deterministic finite automaton**][dfa] (**DFA**). You might also know these\nby other names: **finite state machine**, or just **state machine**. State\nmachines are rad. 
They end up useful in everything from [game\nprogramming][state] to implementing networking protocols.\n\n[dfa]: https://en.wikipedia.org/wiki/Deterministic_finite_automaton\n[state]: http://gameprogrammingpatterns.com/state.html\n\nIn a DFA, you have a set of *states* with *transitions* between them, forming a\ngraph. At any point in time, the machine is \"in\" exactly one state. It gets to\nother states by following transitions. When you use a DFA for lexical analysis,\neach transition is a character that gets matched from the string. Each state\nrepresents a set of allowed characters.\n\nOur keyword tree is exactly a DFA that recognizes Lox keywords. But DFAs are\nmore powerful than simple trees because they can be arbitrary *graphs*.\nTransitions can form cycles between states. That lets you recognize arbitrarily\nlong strings. For example, here's a DFA that recognizes number literals:\n\n<span name=\"railroad\"></span>\n\n<img src=\"image/scanning-on-demand/numbers.png\" alt=\"A syntax diagram that recognizes integer and floating point literals.\" />\n\n<aside name=\"railroad\">\n\nThis style of diagram is called a [**syntax diagram**][syntax diagram] or the\nmore charming **railroad diagram**. The latter name is because it looks\nsomething like a switching yard for trains.\n\nBack before Backus-Naur Form was a thing, this was one of the predominant ways\nof documenting a language's grammar. These days, we mostly use text, but there's\nsomething delightful about the official specification for a *textual language*\nrelying on an *image*.\n\n[syntax diagram]: https://en.wikipedia.org/wiki/Syntax_diagram\n\n</aside>\n\nI've collapsed the nodes for the ten digits together to keep it more readable,\nbut the basic process works the same -- you work through the path, entering\nnodes whenever you consume a corresponding character in the lexeme. 
If we were\nso inclined, we could construct one big giant DFA that does *all* of the lexical\nanalysis for Lox, a single state machine that recognizes and spits out all of\nthe tokens we need.\n\nHowever, crafting that mega-DFA by <span name=\"regex\">hand</span> would be\nchallenging. That's why [Lex][] was created. You give it a simple textual\ndescription of your lexical grammar -- a bunch of regular expressions -- and it\nautomatically generates a DFA for you and produces a pile of C code that\nimplements it.\n\n[lex]: https://en.wikipedia.org/wiki/Lex_(software)\n\n<aside name=\"regex\">\n\nThis is also how most regular expression engines in programming languages and\ntext editors work under the hood. They take your regex string and convert it to\na DFA, which they then use to match strings.\n\nIf you want to learn the algorithm to convert a regular expression into a DFA,\n[the dragon book][dragon] has you covered.\n\n[dragon]: https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools\n\n</aside>\n\nWe won't go down that road. We already have a perfectly serviceable hand-rolled\nscanner. We just need a tiny trie for recognizing keywords. How should we map\nthat to code?\n\nThe absolute simplest <span name=\"v8\">solution</span> is to use a switch\nstatement for each node with cases for each branch. We'll start with the root\nnode and handle the easy keywords.\n\n<aside name=\"v8\">\n\nSimple doesn't mean dumb. The same approach is [essentially what V8 does][v8],\nand that's currently one of the world's most sophisticated, fastest language\nimplementations.\n\n[v8]: https://github.com/v8/v8/blob/e77eebfe3b747fb315bd3baad09bec0953e53e68/src/parsing/scanner.cc#L1643\n\n</aside>\n\n^code keywords (1 before, 1 after)\n\nThese are the initial letters that correspond to a single keyword. If we see an\n\"s\", the only keyword the identifier could possibly be is `super`. It might not\nbe, though, so we still need to check the rest of the letters too. 
In the tree\ndiagram, this is basically that straight path hanging off the \"s\".\n\nWe won't roll a switch for each of those nodes. Instead, we have a utility\nfunction that tests the rest of a potential keyword's lexeme.\n\n^code check-keyword\n\nWe use this for all of the unbranching paths in the tree. Once we've found a\nprefix that could only be one possible reserved word, we need to verify two\nthings. The lexeme must be exactly as long as the keyword. If the first letter\nis \"s\", the lexeme could still be \"sup\" or \"superb\". And the remaining\ncharacters must match exactly -- \"supar\" isn't good enough.\n\nIf we do have the right number of characters, and they're the ones we want, then\nit's a keyword, and we return the associated token type. Otherwise, it must be a\nnormal identifier.\n\nWe have a couple of keywords where the tree branches again after the first\nletter. If the lexeme starts with \"f\", it could be `false`, `for`, or `fun`. So\nwe add another switch for the branches coming off the \"f\" node.\n\n^code keyword-f (1 before, 1 after)\n\nBefore we switch, we need to check that there even *is* a second letter. \"f\" by\nitself is a valid identifier too, after all. The other letter that branches is\n\"t\".\n\n^code keyword-t (1 before, 1 after)\n\nThat's it. A couple of nested `switch` statements. Not only is this code <span\nname=\"short\">short</span>, but it's very, very fast. 
It does the minimum amount\nof work required to detect a keyword, and bails out as soon as it can tell the\nidentifier will not be a reserved one.\n\nAnd with that, our scanner is complete.\n\n<aside name=\"short\">\n\nWe sometimes fall into the trap of thinking that performance comes from\ncomplicated data structures, layers of caching, and other fancy optimizations.\nBut, many times, all that's required is to do less work, and I often find that\nwriting the simplest code I can is sufficient to accomplish that.\n\n</aside>\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  Many newer languages support [**string interpolation**][interp]. Inside a\n    string literal, you have some sort of special delimiters -- most commonly\n    `${` at the beginning and `}` at the end. Between those delimiters, any\n    expression can appear. When the string literal is executed, the inner\n    expression is evaluated, converted to a string, and then merged with the\n    surrounding string literal.\n\n    For example, if Lox supported string interpolation, then this...\n\n    ```lox\n    var drink = \"Tea\";\n    var steep = 4;\n    var cool = 2;\n    print \"${drink} will be ready in ${steep + cool} minutes.\";\n    ```\n\n    ...would print:\n\n    ```text\n    Tea will be ready in 6 minutes.\n    ```\n\n    What token types would you define to implement a scanner for string\n    interpolation? What sequence of tokens would you emit for the above string\n    literal?\n\n    What tokens would you emit for:\n\n    ```text\n    \"Nested ${\"interpolation?! Are you ${\"mad?!\"}\"}\"\n    ```\n\n    Consider looking at other language implementations that support\n    interpolation to see how they handle it.\n\n2.  Several languages use angle brackets for generics and also have a `>>` right\n    shift operator. 
This led to a classic problem in early versions of C++:\n\n    ```c++\n    vector<vector<string>> nestedVectors;\n    ```\n\n    This would produce a compile error because the `>>` was lexed to a single\n    right shift token, not two `>` tokens. Users were forced to avoid this by\n    putting a space between the closing angle brackets.\n\n    Later versions of C++ are smarter and can handle the above code. Java and C#\n    never had the problem. How do those languages specify and implement this?\n\n3.  Many languages, especially later in their evolution, define \"contextual\n    keywords\". These are identifiers that act like reserved words in some\n    contexts but can be normal user-defined identifiers in others.\n\n    For example, `await` is a keyword inside an `async` method in C#, but\n    in other methods, you can use `await` as your own identifier.\n\n    Name a few contextual keywords from other languages, and the context where\n    they are meaningful. What are the pros and cons of having contextual\n    keywords? How would you implement them in your language's front end if you\n    needed to?\n\n[interp]: https://en.wikipedia.org/wiki/String_interpolation\n\n</div>\n"
  },
  {
    "path": "book/scanning.md",
    "content": "> Take big bites. Anything worth doing is worth overdoing.\n>\n> <cite>Robert A. Heinlein, <em>Time Enough for Love</em></cite>\n\nThe first step in any compiler or interpreter is <span\nname=\"lexing\">scanning</span>. The scanner takes in raw source code as a series\nof characters and groups it into a series of chunks we call **tokens**. These\nare the meaningful \"words\" and \"punctuation\" that make up the language's\ngrammar.\n\n<aside name=\"lexing\">\n\nThis task has been variously called \"scanning\" and \"lexing\" (short for \"lexical\nanalysis\") over the years. Way back when computers were as big as Winnebagos but\nhad less memory than your watch, some people used \"scanner\" only to refer to the\npiece of code that dealt with reading raw source code characters from disk and\nbuffering them in memory. Then \"lexing\" was the subsequent phase that did useful\nstuff with the characters.\n\nThese days, reading a source file into memory is trivial, so it's rarely a\ndistinct phase in the compiler. Because of that, the two terms are basically\ninterchangeable.\n\n</aside>\n\nScanning is a good starting point for us too because the code isn't very hard --\npretty much a `switch` statement with delusions of grandeur. It will help us\nwarm up before we tackle some of the more interesting material later. By the end\nof this chapter, we'll have a full-featured, fast scanner that can take any\nstring of Lox source code and produce the tokens that we'll feed into the parser\nin the next chapter.\n\n## The Interpreter Framework\n\nSince this is our first real chapter, before we get to actually scanning some\ncode we need to sketch out the basic shape of our interpreter, jlox. Everything\nstarts with a class in Java.\n\n^code lox-class\n\n<aside name=\"64\">\n\nFor exit codes, I'm using the conventions defined in the UNIX\n[\"sysexits.h\"][sysexits] header. 
It's the closest thing to a standard I could\nfind.\n\n[sysexits]: https://www.freebsd.org/cgi/man.cgi?query=sysexits&amp;apropos=0&amp;sektion=0&amp;manpath=FreeBSD+4.3-RELEASE&amp;format=html\n\n</aside>\n\nStick that in a text file, and go get your IDE or Makefile or whatever set up.\nI'll be right here when you're ready. Good? OK!\n\nLox is a scripting language, which means it executes directly from source. Our\ninterpreter supports two ways of running code. If you start jlox from the\ncommand line and give it a path to a file, it reads the file and executes it.\n\n^code run-file\n\nIf you want a more intimate conversation with your interpreter, you can also run\nit interactively. Fire up jlox without any arguments, and it drops you into a\nprompt where you can enter and execute code one line at a time.\n\n<aside name=\"repl\">\n\nAn interactive prompt is also called a \"REPL\" (pronounced like \"rebel\" but with\na \"p\"). The name comes from Lisp where implementing one is as simple as\nwrapping a loop around a few built-in functions:\n\n```lisp\n(print (eval (read)))\n```\n\nWorking outwards from the most nested call, you **R**ead a line of input,\n**E**valuate it, **P**rint the result, then **L**oop and do it all over again.\n\n</aside>\n\n^code prompt\n\nThe `readLine()` function, as the name so helpfully implies, reads a line of\ninput from the user on the command line and returns the result. To kill an\ninteractive command-line app, you usually type Control-D. Doing so signals an\n\"end-of-file\" condition to the program. When that happens `readLine()` returns\n`null`, so we check for that to exit the loop.\n\nBoth the prompt and the file runner are thin wrappers around this core function:\n\n^code run\n\nIt's not super useful yet since we haven't written the interpreter, but baby\nsteps, you know? 
Right now, it prints out the tokens our forthcoming scanner\nwill emit so that we can see if we're making progress.\n\n### Error handling\n\nWhile we're setting things up, another key piece of infrastructure is *error\nhandling*. Textbooks sometimes gloss over this because it's more a practical\nmatter than a formal computer science-y problem. But if you care about making a\nlanguage that's actually *usable*, then handling errors gracefully is vital.\n\nThe tools our language provides for dealing with errors make up a large portion\nof its user interface. When the user's code is working, they aren't thinking\nabout our language at all -- their headspace is all about *their program*. It's\nusually only when things go wrong that they notice our implementation.\n\n<span name=\"errors\">When</span> that happens, it's up to us to give the user all\nthe information they need to understand what went wrong and guide them gently\nback to where they are trying to go. Doing that well means thinking about error\nhandling all through the implementation of our interpreter, starting now.\n\n<aside name=\"errors\">\n\nHaving said all that, for *this* interpreter, what we'll build is pretty bare\nbones. I'd love to talk about interactive debuggers, static analyzers, and other\nfun stuff, but there's only so much ink in the pen.\n\n</aside>\n\n^code lox-error\n\nThis `error()` function and its `report()` helper tell the user some syntax\nerror occurred on a given line. That is really the bare minimum to be able to\nclaim you even *have* error reporting. Imagine if you accidentally left a\ndangling comma in some function call and the interpreter printed out:\n\n```text\nError: Unexpected \",\" somewhere in your code. Good luck finding it!\n```\n\nThat's not very helpful. We need to at least point them to the right line. 
Even\nbetter would be the beginning and end column so they know *where* in the line.\nEven better than *that* is to *show* the user the offending line, like:\n\n```text\nError: Unexpected \",\" in argument list.\n\n    15 | function(first, second,);\n                               ^-- Here.\n```\n\nI'd love to implement something like that in this book but the honest truth is\nthat it's a lot of grungy string manipulation code. Very useful for users, but\nnot super fun to read in a book and not very technically interesting. So we'll\nstick with just a line number. In your own interpreters, please do as I say and\nnot as I do.\n\nThe primary reason we're sticking this error reporting function in the main Lox\nclass is because of that `hadError` field. It's defined here:\n\n^code had-error (1 before)\n\nWe'll use this to ensure we don't try to execute code that has a known error.\nAlso, it lets us exit with a non-zero exit code like a good command line citizen\nshould.\n\n^code exit-code (1 before, 1 after)\n\nWe need to reset this flag in the interactive loop. If the user makes a mistake,\nit shouldn't kill their entire session.\n\n^code reset-had-error (1 before, 1 after)\n\nThe other reason I pulled the error reporting out here instead of stuffing it\ninto the scanner and other phases where the error might occur is to remind you\nthat it's good engineering practice to separate the code that *generates* the\nerrors from the code that *reports* them.\n\nVarious phases of the front end will detect errors, but it's not really their\njob to know how to present that to a user. In a full-featured language\nimplementation, you will likely have multiple ways errors get displayed: on\nstderr, in an IDE's error window, logged to a file, etc. 
You don't want that\ncode smeared all over your scanner and parser.\n\nIdeally, we would have an actual abstraction, some kind of <span\nname=\"reporter\">\"ErrorReporter\"</span> interface that gets passed to the scanner\nand parser so that we can swap out different reporting strategies. For our\nsimple interpreter here, I didn't do that, but I did at least move the code for\nerror reporting into a different class.\n\n<aside name=\"reporter\">\n\nI had exactly that when I first implemented jlox. I ended up tearing it out\nbecause it felt over-engineered for the minimal interpreter in this book.\n\n</aside>\n\nWith some rudimentary error handling in place, our application shell is ready.\nOnce we have a Scanner class with a `scanTokens()` method, we can start running\nit. Before we get to that, let's get more precise about what tokens are.\n\n## Lexemes and Tokens\n\nHere's a line of Lox code:\n\n```lox\nvar language = \"lox\";\n```\n\nHere, `var` is the keyword for declaring a variable. That three-character\nsequence \"v-a-r\" means something. But if we yank three letters out of the\nmiddle of `language`, like \"g-u-a\", those don't mean anything on their own.\n\nThat's what lexical analysis is about. Our job is to scan through the list of\ncharacters and group them together into the smallest sequences that still\nrepresent something. Each of these blobs of characters is called a **lexeme**.\nIn that example line of code, the lexemes are:\n\n<img src=\"image/scanning/lexemes.png\" alt=\"'var', 'language', '=', 'lox', ';'\" />\n\nThe lexemes are only the raw substrings of the source code. However, in the\nprocess of grouping character sequences into lexemes, we also stumble upon some\nother useful information. When we take the lexeme and bundle it together with\nthat other data, the result is a token. 
It includes useful stuff like:\n\n### Token type\n\nKeywords are part of the shape of the language's grammar, so the parser often\nhas code like, \"If the next token is `while` then do...\" That means the parser\nwants to know not just that it has a lexeme for some identifier, but that it has\na *reserved* word, and *which* keyword it is.\n\nThe <span name=\"ugly\">parser</span> could categorize tokens from the raw lexeme\nby comparing the strings, but that's slow and kind of ugly. Instead, at the\npoint that we recognize a lexeme, we also remember which *kind* of lexeme it\nrepresents. We have a different type for each keyword, operator, bit of\npunctuation, and literal type.\n\n<aside name=\"ugly\">\n\nAfter all, string comparison ends up looking at individual characters, and isn't\nthat the scanner's job?\n\n</aside>\n\n^code token-type\n\n### Literal value\n\nThere are lexemes for literal values -- numbers and strings and the like. Since\nthe scanner has to walk each character in the literal to correctly identify it,\nit can also convert that textual representation of a value to the living runtime\nobject that will be used by the interpreter later.\n\n### Location information\n\nBack when I was preaching the gospel about error handling, we saw that we need\nto tell users *where* errors occurred. Tracking that starts here. In our simple\ninterpreter, we note only which line the token appears on, but more\nsophisticated implementations include the column and length too.\n\n<aside name=\"location\">\n\nSome token implementations store the location as two numbers: the offset from\nthe beginning of the source file to the beginning of the lexeme, and the length\nof the lexeme. The scanner needs to know these anyway, so there's no overhead to\ncalculate them.\n\nAn offset can be converted to line and column positions later by looking back at\nthe source file and counting the preceding newlines. That sounds slow, and it\nis. 
However, you need to do it *only when you need to actually display a line\nand column to the user*. Most tokens never appear in an error message. For\nthose, the less time you spend calculating position information ahead of time,\nthe better.\n\n</aside>\n\nWe take all of this data and wrap it in a class.\n\n^code token-class\n\nNow we have an object with enough structure to be useful for all of the later\nphases of the interpreter.\n\n## Regular Languages and Expressions\n\nNow that we know what we're trying to produce, let's, well, produce it. The core\nof the scanner is a loop. Starting at the first character of the source code,\nthe scanner figures out what lexeme the character belongs to, and consumes it\nand any following characters that are part of that lexeme. When it reaches the\nend of that lexeme, it emits a token.\n\nThen it loops back and does it again, starting from the very next character in\nthe source code. It keeps doing that, eating characters and occasionally, uh,\nexcreting tokens, until it reaches the end of the input.\n\n<span name=\"alligator\"></span>\n\n<img src=\"image/scanning/lexigator.png\" alt=\"An alligator eating characters and, well, you don't want to know.\" />\n\n<aside name=\"alligator\">\n\nLexical analygator.\n\n</aside>\n\nThe part of the loop where we look at a handful of characters to figure out\nwhich kind of lexeme it \"matches\" may sound familiar. If you know regular\nexpressions, you might consider defining a regex for each kind of lexeme and\nusing those to match characters. For example, Lox has the same rules as C for\nidentifiers (variable names and the like). This regex matches one:\n\n```text\n[a-zA-Z_][a-zA-Z_0-9]*\n```\n\nIf you did think of regular expressions, your intuition is a deep one. The rules\nthat determine how a particular language groups characters into lexemes are\ncalled its <span name=\"theory\">**lexical grammar**</span>. 
In Lox, as in most\nprogramming languages, the rules of that grammar are simple enough for the\nlanguage to be classified a **[regular language][]**. That's the same \"regular\"\nas in regular expressions.\n\n[regular language]: https://en.wikipedia.org/wiki/Regular_language\n\n<aside name=\"theory\">\n\nIt pains me to gloss over the theory so much, especially when it's as\ninteresting as I think the [Chomsky hierarchy][] and [finite-state machines][]\nare. But the honest truth is other books cover this better than I could.\n[*Compilers: Principles, Techniques, and Tools*][dragon] (universally known as\n\"the dragon book\") is the canonical reference.\n\n[chomsky hierarchy]: https://en.wikipedia.org/wiki/Chomsky_hierarchy\n[dragon]: https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools\n[finite-state machines]: https://en.wikipedia.org/wiki/Finite-state_machine\n\n</aside>\n\nYou very precisely *can* recognize all of the different lexemes for Lox using\nregexes if you want to, and there's a pile of interesting theory underlying why\nthat is and what it means. Tools like [Lex][] or\n[Flex][] are designed expressly to let you do this -- throw a handful of regexes\nat them, and they give you a complete scanner <span name=\"lex\">back</span>.\n\n<aside name=\"lex\">\n\nLex was created by Mike Lesk and Eric Schmidt. Yes, the same Eric Schmidt who\nwas executive chairman of Google. I'm not saying programming languages are a\nsurefire path to wealth and fame, but we *can* count at least one\nmega billionaire among us.\n\n</aside>\n\n[lex]: http://dinosaur.compilertools.net/lex/\n[flex]: https://github.com/westes/flex\n\nSince our goal is to understand how a scanner does what it does, we won't be\ndelegating that task. 
We're about handcrafted goods.\n\n## The Scanner Class\n\nWithout further ado, let's make ourselves a scanner.\n\n^code scanner-class\n\n<aside name=\"static-import\">\n\nI know static imports are considered bad style by some, but they save me from\nhaving to sprinkle `TokenType.` all over the scanner and parser. Forgive me, but\nevery character counts in a book.\n\n</aside>\n\nWe store the raw source code as a simple string, and we have a list ready to\nfill with tokens we're going to generate. The aforementioned loop that does that\nlooks like this:\n\n^code scan-tokens\n\nThe scanner works its way through the source code, adding tokens until it runs\nout of characters. Then it appends one final \"end of file\" token. That isn't\nstrictly needed, but it makes our parser a little cleaner.\n\nThis loop depends on a couple of fields to keep track of where the scanner is in\nthe source code.\n\n^code scan-state (1 before, 2 after)\n\nThe `start` and `current` fields are offsets that index into the string. The\n`start` field points to the first character in the lexeme being scanned, and\n`current` points at the character currently being considered. The `line` field\ntracks what source line `current` is on so we can produce tokens that know their\nlocation.\n\nThen we have one little helper function that tells us if we've consumed all the\ncharacters.\n\n^code is-at-end\n\n## Recognizing Lexemes\n\nIn each turn of the loop, we scan a single token. This is the real heart of the\nscanner. We'll start simple. Imagine if every lexeme were only a single character\nlong. All you would need to do is consume the next character and pick a token type for\nit. Several lexemes *are* only a single character in Lox, so let's start with\nthose.\n\n^code scan-token\n\n<aside name=\"slash\">\n\nWondering why `/` isn't in here? 
Don't worry, we'll get to it.\n\n</aside>\n\nAgain, we need a couple of helper methods.\n\n^code advance-and-add-token\n\nThe `advance()` method consumes the next character in the source file and\nreturns it. Where `advance()` is for input, `addToken()` is for output. It grabs\nthe text of the current lexeme and creates a new token for it. We'll use the\nother overload to handle tokens with literal values soon.\n\n### Lexical errors\n\nBefore we get too far in, let's take a moment to think about errors at the\nlexical level. What happens if a user throws a source file containing some\ncharacters Lox doesn't use, like `@#^`, at our interpreter? Right now, those\ncharacters get silently discarded. They aren't used by the Lox language, but\nthat doesn't mean the interpreter can pretend they aren't there. Instead, we\nreport an error.\n\n^code char-error (1 before, 1 after)\n\nNote that the erroneous character is still *consumed* by the earlier call to\n`advance()`. That's important so that we don't get stuck in an infinite loop.\n\nNote also that we <span name=\"shotgun\">*keep scanning*</span>. There may be\nother errors later in the program. It gives our users a better experience if we\ndetect as many of those as possible in one go. Otherwise, they see one tiny\nerror and fix it, only to have the next error appear, and so on. Syntax error\nWhac-A-Mole is no fun.\n\n(Don't worry. Since `hadError` gets set, we'll never try to *execute* any of the\ncode, even though we keep going and scan the rest of it.)\n\n<aside name=\"shotgun\">\n\nThe code reports each invalid character separately, so this shotguns the user\nwith a blast of errors if they accidentally paste a big blob of weird text.\nCoalescing a run of invalid characters into a single error would give a nicer\nuser experience.\n\n</aside>\n\n### Operators\n\nWe have single-character lexemes working, but that doesn't cover all of Lox's\noperators. What about `!`? It's a single character, right? 
Sometimes, yes, but\nif the very next character is an equals sign, then we should instead create a\n`!=` lexeme. Note that the `!` and `=` are *not* two independent operators. You\ncan't write `!   =` in Lox and have it behave like an inequality operator.\nThat's why we need to scan `!=` as a single lexeme. Likewise, `<`, `>`, and `=`\ncan all be followed by `=` to create the other equality and comparison\noperators.\n\nFor all of these, we need to look at the second character.\n\n^code two-char-tokens (1 before, 2 after)\n\nThose cases use this new method:\n\n^code match\n\nIt's like a conditional `advance()`. We only consume the current character if\nit's what we're looking for.\n\nUsing `match()`, we recognize these lexemes in two stages. When we reach, for\nexample, `!`, we jump to its switch case. That means we know the lexeme *starts*\nwith `!`. Then we look at the next character to determine if we're on a `!=` or\nmerely a `!`.\n\n## Longer Lexemes\n\nWe're still missing one operator: `/` for division. That character needs a\nlittle special handling because comments begin with a slash too.\n\n^code slash (1 before, 2 after)\n\nThis is similar to the other two-character operators, except that when we find a\nsecond `/`, we don't end the token yet. Instead, we keep consuming characters\nuntil we reach the end of the line.\n\nThis is our general strategy for handling longer lexemes. After we detect the\nbeginning of one, we shunt over to some lexeme-specific code that keeps eating\ncharacters until it sees the end.\n\nWe've got another helper:\n\n^code peek\n\nIt's sort of like `advance()`, but doesn't consume the character. This is called\n<span name=\"match\">**lookahead**</span>. Since it only looks at the current\nunconsumed character, we have *one character of lookahead*. The smaller this\nnumber is, generally, the faster the scanner runs. The rules of the lexical\ngrammar dictate how much lookahead we need. 
Fortunately, most languages in wide\nuse peek only one or two characters ahead.\n\n<aside name=\"match\">\n\nTechnically, `match()` is doing lookahead too. `advance()` and `peek()` are the\nfundamental operators and `match()` combines them.\n\n</aside>\n\nComments are lexemes, but they aren't meaningful, and the parser doesn't want\nto deal with them. So when we reach the end of the comment, we *don't* call\n`addToken()`. When we loop back around to start the next lexeme, `start` gets\nreset and the comment's lexeme disappears in a puff of smoke.\n\nWhile we're at it, now's a good time to skip over those other meaningless\ncharacters: newlines and whitespace.\n\n^code whitespace (1 before, 3 after)\n\nWhen encountering whitespace, we simply go back to the beginning of the scan\nloop. That starts a new lexeme *after* the whitespace character. For newlines,\nwe do the same thing, but we also increment the line counter. (This is why we\nused `peek()` to find the newline ending a comment instead of `match()`. We want\nthat newline to get us here so we can update `line`.)\n\nOur scanner is getting smarter. It can handle fairly free-form code like:\n\n```lox\n// this is a comment\n(( )){} // grouping stuff\n!*+-/=<> <= == // operators\n```\n\n### String literals\n\nNow that we're comfortable with longer lexemes, we're ready to tackle literals.\nWe'll do strings first, since they always begin with a specific character, `\"`.\n\n^code string-start (1 before, 2 after)\n\nThat calls:\n\n^code string\n\nLike with comments, we consume characters until we hit the `\"` that ends the\nstring. We also gracefully handle running out of input before the string is\nclosed and report an error for that.\n\nFor no particular reason, Lox supports multi-line strings. There are pros and\ncons to that, but prohibiting them was a little more complex than allowing them,\nso I left them in. 
That does mean we also need to update `line` when we hit a\nnewline inside a string.\n\nFinally, the last interesting bit is that when we create the token, we also\nproduce the actual string *value* that will be used later by the interpreter.\nHere, that conversion only requires a `substring()` to strip off the surrounding\nquotes. If Lox supported escape sequences like `\\n`, we'd unescape those here.\n\n### Number literals\n\nAll numbers in Lox are floating point at runtime, but both integer and decimal\nliterals are supported. A number literal is a series of <span\nname=\"minus\">digits</span> optionally followed by a `.` and one or more trailing\ndigits.\n\n<aside name=\"minus\">\n\nSince we look only for a digit to start a number, that means `-123` is not a\nnumber *literal*. Instead, `-123` is an *expression* that applies `-` to the\nnumber literal `123`. In practice, the result is the same, though it has one\ninteresting edge case if we were to add method calls on numbers. Consider:\n\n```lox\nprint -123.abs();\n```\n\nThis prints `-123` because negation has lower precedence than method calls. We\ncould fix that by making `-` part of the number literal. But then consider:\n\n```lox\nvar n = 123;\nprint -n.abs();\n```\n\nThis still produces `-123`, so now the language seems inconsistent. No matter\nwhat you do, some case ends up weird.\n\n</aside>\n\n```lox\n1234\n12.34\n```\n\nWe don't allow a leading or trailing decimal point, so these are both invalid:\n\n```lox\n.1234\n1234.\n```\n\nWe could easily support the former, but I left it out to keep things simple. The\nlatter gets weird if we ever want to allow methods on numbers like `123.sqrt()`.\n\nTo recognize the beginning of a number lexeme, we look for any digit. 
It's kind\nof tedious to add cases for every decimal digit, so we'll stuff it in the\ndefault case instead.\n\n^code digit-start (1 before, 1 after)\n\nThis relies on this little utility:\n\n^code is-digit\n\n<aside name=\"is-digit\">\n\nThe Java standard library provides [`Character.isDigit()`][is-digit], which seems\nlike a good fit. Alas, that method allows things like Devanagari digits,\nfull-width numbers, and other funny stuff we don't want.\n\n[is-digit]: http://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#isDigit(char)\n\n</aside>\n\nOnce we know we are in a number, we branch to a separate method to consume the\nrest of the literal, like we do with strings.\n\n^code number\n\nWe consume as many digits as we find for the integer part of the literal. Then\nwe look for a fractional part, which is a decimal point (`.`) followed by at\nleast one digit. If we do have a fractional part, again, we consume as many\ndigits as we can find.\n\nLooking past the decimal point requires a second character of lookahead since we\ndon't want to consume the `.` until we're sure there is a digit *after* it. So\nwe add:\n\n^code peek-next\n\n<aside name=\"peek-next\">\n\nI could have made `peek()` take a parameter for the number of characters ahead\nto look instead of defining two functions, but that would allow *arbitrarily*\nfar lookahead. Providing these two functions makes it clearer to a reader of the\ncode that our scanner looks ahead at most two characters.\n\n</aside>\n\n\nFinally, we convert the lexeme to its numeric value. Our interpreter uses Java's\n`Double` type to represent numbers, so we produce a value of that type. We're\nusing Java's own parsing method to convert the lexeme to a real Java double. 
We\ncould implement that ourselves, but, honestly, unless you're trying to cram for\nan upcoming programming interview, it's not worth your time.\n\nThe remaining literals are Booleans and `nil`, but we handle those as keywords,\nwhich gets us to...\n\n## Reserved Words and Identifiers\n\nOur scanner is almost done. The only remaining pieces of the lexical grammar to\nimplement are identifiers and their close cousins, the reserved words. You might\nthink we could match keywords like `or` in the same way we handle\nmultiple-character operators like `<=`.\n\n```java\ncase 'o':\n  if (match('r')) {\n    addToken(OR);\n  }\n  break;\n```\n\nConsider what would happen if a user named a variable `orchid`. The scanner\nwould see the first two letters, `or`, and immediately emit an `or` keyword\ntoken. This gets us to an important principle called <span\nname=\"maximal\">**maximal munch**</span>. When two lexical grammar rules can both\nmatch a chunk of code that the scanner is looking at, *whichever one matches the\nmost characters wins*.\n\nThat rule states that if we can match `orchid` as an identifier and `or` as a\nkeyword, then the former wins. This is also why we tacitly assumed, previously,\nthat `<=` should be scanned as a single `<=` token and not `<` followed by `=`.\n\n<aside name=\"maximal\">\n\nConsider this nasty bit of C code:\n\n```c\n---a;\n```\n\nIs it valid? That depends on how the scanner splits the lexemes. What if the scanner\nsees it like this:\n\n```c\n- --a;\n```\n\nThen it could be parsed. But that would require the scanner to know about the\ngrammatical structure of the surrounding code, which entangles things more than\nwe want. 
Instead, the maximal munch rule says that it is *always* scanned like:\n\n```c\n-- -a;\n```\n\nIt scans it that way even though doing so leads to a syntax error later in the\nparser.\n\n</aside>\n\nMaximal munch means we can't easily detect a reserved word until we've reached\nthe end of what might instead be an identifier. After all, a reserved word *is*\nan identifier, it's just one that has been claimed by the language for its own\nuse. That's where the term **reserved word** comes from.\n\nSo we begin by assuming any lexeme starting with a letter or underscore is an\nidentifier.\n\n^code identifier-start (3 before, 3 after)\n\nThe rest of the code lives over here:\n\n^code identifier\n\nWe define that in terms of these helpers:\n\n^code is-alpha\n\nThat gets identifiers working. To handle keywords, we see if the identifier's\nlexeme is one of the reserved words. If so, we use a token type specific to that\nkeyword. We define the set of reserved words in a map.\n\n^code keyword-map\n\nThen, after we scan an identifier, we check to see if it matches anything in the\nmap.\n\n^code keyword-type (2 before, 1 after)\n\nIf so, we use that keyword's token type. Otherwise, it's a regular user-defined\nidentifier.\n\nAnd with that, we now have a complete scanner for the entire Lox lexical\ngrammar. Fire up the REPL and type in some valid and invalid code. Does it\nproduce the tokens you expect? Try to come up with some interesting edge cases\nand see if it handles them as it should.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  The lexical grammars of Python and Haskell are not *regular*. What does that\n    mean, and why aren't they?\n\n1.  Aside from separating tokens -- distinguishing `print foo` from `printfoo`\n    -- spaces aren't used for much in most languages. However, in a couple of\n    dark corners, a space *does* affect how code is parsed in CoffeeScript,\n    Ruby, and the C preprocessor. 
Where and what effect does it have in each of\n    those languages?\n\n1.  Our scanner here, like most, discards comments and whitespace since those\n    aren't needed by the parser. Why might you want to write a scanner that does\n    *not* discard those? What would it be useful for?\n\n1.  Add support to Lox's scanner for C-style `/* ... */` block comments. Make\n    sure to handle newlines in them. Consider allowing them to nest. Is adding\n    support for nesting more work than you expected? Why?\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: Implicit Semicolons\n\nProgrammers today are spoiled for choice in languages and have gotten picky\nabout syntax. They want their language to look clean and modern. One bit of\nsyntactic lichen that almost every new language scrapes off (and some ancient\nones like BASIC never had) is `;` as an explicit statement terminator.\n\nInstead, they treat a newline as a statement terminator where it makes sense to\ndo so. The \"where it makes sense\" part is the challenging bit. While *most*\nstatements are on their own line, sometimes you need to spread a single\nstatement across a couple of lines. 
Those intermingled newlines should not be\ntreated as terminators.\n\nMost of the obvious cases where the newline should be ignored are easy to\ndetect, but there are a handful of nasty ones:\n\n* A return value on the next line:\n\n    ```js\n    if (condition) return\n    \"value\"\n    ```\n\n    Is \"value\" the value being returned, or do we have a `return` statement with\n    no value followed by an expression statement containing a string literal?\n\n* A parenthesized expression on the next line:\n\n    ```js\n    func\n    (parenthesized)\n    ```\n\n    Is this a call to `func(parenthesized)`, or two expression statements, one\n    for `func` and one for a parenthesized expression?\n\n* A `-` on the next line:\n\n    ```js\n    first\n    -second\n    ```\n\n    Is this `first - second` -- an infix subtraction -- or two expression\n    statements, one for `first` and one to negate `second`?\n\nIn all of these, either treating the newline as a separator or not would both\nproduce valid code, but possibly not the code the user wants. Across languages,\nthere is an unsettling variety of rules used to decide which newlines are\nseparators. Here are a couple:\n\n*   [Lua][] completely ignores newlines, but carefully controls its grammar such\n    that no separator between statements is needed at all in most cases. This is\n    perfectly legit:\n\n    ```lua\n    a = 1 b = 2\n    ```\n\n    Lua avoids the `return` problem by requiring a `return` statement to be the\n    very last statement in a block. If there is a value after `return` before\n    the keyword `end`, it *must* be for the `return`. For the other two cases,\n    they allow an explicit `;` and expect users to use that. In practice, that\n    almost never happens because there's no point in a parenthesized or unary\n    negation expression statement.\n\n*   [Go][] handles newlines in the scanner. 
If a newline appears following one\n    of a handful of token types that are known to potentially end a statement,\n    the newline is treated like a semicolon. Otherwise it is ignored. The Go\n    team provides a canonical code formatter, [gofmt][], and the ecosystem is\n    fervent about its use, which ensures that idiomatically styled code works well\n    with this simple rule.\n\n*   [Python][] treats all newlines as significant unless an explicit backslash\n    is used at the end of a line to continue it to the next line. However,\n    newlines anywhere inside a pair of brackets (`()`, `[]`, or `{}`) are\n    ignored. Idiomatic style strongly prefers the latter.\n\n    This rule works well for Python because it is a highly statement-oriented\n    language. In particular, Python's grammar ensures a statement never appears\n    inside an expression. C does the same, but many other languages which have a\n    \"lambda\" or function literal syntax do not.\n\n    An example in JavaScript:\n\n    ```js\n    console.log(function() {\n      statement();\n    });\n    ```\n\n    Here, the `console.log()` *expression* contains a function literal which\n    in turn contains the *statement* `statement();`.\n\n    Python would need a different set of rules for implicitly joining lines if\n    you could get back *into* a <span name=\"lambda\">statement</span> where\n    newlines should become meaningful while still nested inside brackets.\n\n<aside name=\"lambda\">\n\nAnd now you know why Python's `lambda` allows only a single expression body.\n\n</aside>\n\n*   JavaScript's \"[automatic semicolon insertion][asi]\" rule is the real odd\n    one. Where other languages assume most newlines *are* meaningful and only a\n    few should be ignored in multi-line statements, JS assumes the opposite. It\n    treats all of your newlines as meaningless whitespace *unless* it encounters\n    a parse error. 
If it does, it goes back and tries turning the previous\n    newline into a semicolon to get something grammatically valid.\n\n    This design note would turn into a design diatribe if I went into complete\n    detail about how that even *works*, much less all the various ways that\n    JavaScript's \"solution\" is a bad idea. It's a mess. JavaScript is the only\n    language I know where many style guides demand explicit semicolons after\n    every statement even though the language theoretically lets you elide them.\n\nIf you're designing a new language, you almost surely *should* avoid an explicit\nstatement terminator. Programmers are creatures of fashion like other humans, and\nsemicolons are as passé as ALL CAPS KEYWORDS. Just make sure you pick a set of\nrules that make sense for your language's particular grammar and idioms. And\ndon't do what JavaScript did.\n\n</div>\n\n[lua]: https://www.lua.org/pil/1.1.html\n[go]: https://golang.org/ref/spec#Semicolons\n[gofmt]: https://golang.org/cmd/gofmt/\n[python]: https://docs.python.org/3.5/reference/lexical_analysis.html#implicit-line-joining\n[asi]: https://www.ecma-international.org/ecma-262/5.1/#sec-7.9\n"
  },
  {
    "path": "book/statements-and-state.md",
    "content": "> All my life, my heart has yearned for a thing I cannot name.\n> <cite>Andr&eacute; Breton, <em>Mad Love</em></cite>\n\nThe interpreter we have so far feels less like programming a real language and\nmore like punching buttons on a calculator. \"Programming\" to me means building\nup a system out of smaller pieces. We can't do that yet because we have no way\nto bind a name to some data or function. We can't compose software without a way\nto refer to the pieces.\n\nTo support bindings, our interpreter needs internal state. When you define a\nvariable at the beginning of the program and use it at the end, the interpreter\nhas to hold on to the value of that variable in the meantime. So in this\nchapter, we will give our interpreter a brain that can not just process, but\n*remember*.\n\n<img src=\"image/statements-and-state/brain.png\" alt=\"A brain, presumably remembering stuff.\" />\n\nState and <span name=\"expr\">statements</span> go hand in hand. Since statements,\nby definition, don't evaluate to a value, they need to do something else to be\nuseful. That something is called a **side effect**. It could mean producing\nuser-visible output or modifying some state in the interpreter that can be\ndetected later. The latter makes them a great fit for defining variables or\nother named entities.\n\n<aside name=\"expr\">\n\nYou could make a language that treats variable declarations as expressions that\nboth create a binding and produce a value. The only language I know that does\nthat is Tcl. Scheme seems like a contender, but note that after a `let`\nexpression is evaluated, the variable it bound is forgotten. The `define` syntax\nis not an expression.\n\n</aside>\n\nIn this chapter, we'll do all of that. We'll define statements that produce\noutput (`print`) and create state (`var`). We'll add expressions to access and\nassign to variables. Finally, we'll add blocks and local scope. 
That's a lot to\nstuff into one chapter, but we'll chew through it all one bite at a time.\n\n## Statements\n\nWe start by extending Lox's grammar with statements. They aren't very different\nfrom expressions. We start with the two simplest kinds:\n\n1.  An **expression statement** lets you place an expression where a statement\n    is expected. They exist to evaluate expressions that have side effects. You\n    may not notice them, but you use them all the time in <span\n    name=\"expr-stmt\">C</span>, Java, and other languages. Any time you see a\n    function or method call followed by a `;`, you're looking at an expression\n    statement.\n\n    <aside name=\"expr-stmt\">\n\n    Pascal is an outlier. It distinguishes between *procedures* and *functions*.\n    Functions return values, but procedures cannot. There is a statement form\n    for calling a procedure, but functions can only be called where an\n    expression is expected. There are no expression statements in Pascal.\n\n    </aside>\n\n2.  A **`print` statement** evaluates an expression and displays the result to\n    the user. I admit it's weird to bake printing right into the language\n    instead of making it a library function. Doing so is a concession to the\n    fact that we're building this interpreter one chapter at a time and want to\n    be able to play with it before it's all done. To make print a library\n    function, we'd have to wait until we had all of the machinery for defining\n    and calling functions <span name=\"print\">before</span> we could witness any\n    side effects.\n\n    <aside name=\"print\">\n\n    I will note with only a modicum of defensiveness that BASIC and Python\n    have dedicated `print` statements and they are real languages. Granted,\n    Python did remove their `print` statement in 3.0...\n\n    </aside>\n\nNew syntax means new grammar rules. In this chapter, we finally gain the ability\nto parse an entire Lox script. 
Since Lox is an imperative, dynamically typed\nlanguage, the \"top level\" of a script is simply a list of statements. The new\nrules are:\n\n```ebnf\nprogram        → statement* EOF ;\n\nstatement      → exprStmt\n               | printStmt ;\n\nexprStmt       → expression \";\" ;\nprintStmt      → \"print\" expression \";\" ;\n```\n\nThe first rule is now `program`, which is the starting point for the grammar and\nrepresents a complete Lox script or REPL entry. A program is a list of\nstatements followed by the special \"end of file\" token. The mandatory end token\nensures the parser consumes the entire input and doesn't silently ignore\nerroneous unconsumed tokens at the end of a script.\n\nRight now, `statement` only has two cases for the two kinds of statements we've\ndescribed. We'll fill in more later in this chapter and in the following ones.\nThe next step is turning this grammar into something we can store in memory --\nsyntax trees.\n\n### Statement syntax trees\n\nThere is no place in the grammar where both an expression and a statement are\nallowed. The operands of, say, `+` are always expressions, never statements. The\nbody of a `while` loop is always a statement.\n\nSince the two syntaxes are disjoint, we don't need a single base class that they\nall inherit from. Splitting expressions and statements into separate class\nhierarchies enables the Java compiler to help us find dumb mistakes like passing\na statement to a Java method that expects an expression.\n\nThat means a new base class for statements. As our elders did before us, we will\nuse the cryptic name \"Stmt\". With great <span name=\"foresight\">foresight</span>,\nI have designed our little AST metaprogramming script in anticipation of this.\nThat's why we passed in \"Expr\" as a parameter to `defineAst()`. 
Now we add\nanother call to define Stmt and its <span name=\"stmt-ast\">subclasses</span>.\n\n<aside name=\"foresight\">\n\nNot really foresight: I wrote all the code for the book before I sliced it into\nchapters.\n\n</aside>\n\n^code stmt-ast (2 before, 1 after)\n\n<aside name=\"stmt-ast\">\n\nThe generated code for the new nodes is in [Appendix II][appendix-ii]: [Expression statement][], [Print statement][].\n\n[appendix-ii]: appendix-ii.html\n[expression statement]: appendix-ii.html#expression-statement\n[print statement]: appendix-ii.html#print-statement\n\n</aside>\n\nRun the AST generator script and behold the resulting \"Stmt.java\" file with the\nsyntax tree classes we need for expression and `print` statements. Don't forget\nto add the file to your IDE project or makefile or whatever.\n\n### Parsing statements\n\nThe parser's `parse()` method that parses and returns a single expression was a\ntemporary hack to get the last chapter up and running. Now that our grammar has\nthe correct starting rule, `program`, we can turn `parse()` into the real deal.\n\n^code parse\n\n<aside name=\"parse-error-handling\">\n\nWhat about the code we had in here for catching `ParseError` exceptions? We'll\nput better parse error handling in place soon when we add support for additional\nstatement types.\n\n</aside>\n\nThis parses a series of statements, as many as it can find until it hits the end\nof the input. This is a pretty direct translation of the `program` rule into\nrecursive descent style. We must also chant a minor prayer to the Java verbosity\ngods since we are using ArrayList now.\n\n^code parser-imports (2 before, 1 after)\n\nA program is a list of statements, and we parse one of those statements using\nthis method:\n\n^code parse-statement\n\nA little bare bones, but we'll fill it in with more statement types later. We\ndetermine which specific statement rule is matched by looking at the current\ntoken. 
A `print` token means it's obviously a `print` statement.\n\nIf the next token doesn't look like any known kind of statement, we assume it\nmust be an expression statement. That's the typical final fallthrough case when\nparsing a statement, since it's hard to proactively recognize an expression from\nits first token.\n\nEach statement kind gets its own method. First `print`:\n\n^code parse-print-statement\n\nSince we already matched and consumed the `print` token itself, we don't need to\ndo that here. We parse the subsequent expression, consume the terminating\nsemicolon, and emit the syntax tree.\n\nIf we didn't match a `print` statement, we must have one of these:\n\n^code parse-expression-statement\n\nSimilar to the previous method, we parse an expression followed by a semicolon.\nWe wrap that Expr in a Stmt of the right type and return it.\n\n### Executing statements\n\nWe're running through the previous couple of chapters in microcosm, working our\nway through the front end. Our parser can now produce statement syntax trees, so\nthe next and final step is to interpret them. As in expressions, we use the\nVisitor pattern, but we have a new visitor interface, Stmt.Visitor, to\nimplement since statements have their own base class.\n\nWe add that to the list of interfaces Interpreter implements.\n\n^code interpreter (1 after)\n\n<aside name=\"void\">\n\nJava doesn't let you use lowercase \"void\" as a generic type argument for obscure\nreasons having to do with type erasure and the stack. Instead, there is a\nseparate \"Void\" type specifically for this use. Sort of a \"boxed void\", like\n\"Integer\" is for \"int\".\n\n</aside>\n\nUnlike expressions, statements produce no values, so the return type of the\nvisit methods is Void, not Object. We have two statement types, and we need a\nvisit method for each. 
The easiest is expression statements.\n\n^code visit-expression-stmt\n\nWe evaluate the inner expression using our existing `evaluate()` method and\n<span name=\"discard\">discard</span> the value. Then we return `null`. Java\nrequires that to satisfy the special capitalized Void return type. Weird, but\nwhat can you do?\n\n<aside name=\"discard\">\n\nAppropriately enough, we discard the value returned by `evaluate()` by placing\nthat call inside a *Java* expression statement.\n\n</aside>\n\nThe `print` statement's visit method isn't much different.\n\n^code visit-print\n\nBefore discarding the expression's value, we convert it to a string using the\n`stringify()` method we introduced in the last chapter and then dump it to\nstdout.\n\nOur interpreter is able to visit statements now, but we have some work to do to\nfeed them to it. First, modify the old `interpret()` method in the Interpreter\nclass to accept a list of statements -- in other words, a program.\n\n^code interpret\n\nThis replaces the old code which took a single expression. The new code relies\non this tiny helper method:\n\n^code execute\n\nThat's the statement analogue to the `evaluate()` method we have for\nexpressions. Since we're working with lists now, we need to let Java know.\n\n^code import-list (2 before, 2 after)\n\nThe main Lox class is still trying to parse a single expression and pass it to\nthe interpreter. We fix the parsing line like so:\n\n^code parse-statements (1 before, 2 after)\n\nAnd then replace the call to the interpreter with this:\n\n^code interpret-statements (2 before, 1 after)\n\nBasically just plumbing the new syntax through. OK, fire up the interpreter and\ngive it a try. At this point, it's worth sketching out a little Lox program in a\ntext file to run as a script. Something like:\n\n```lox\nprint \"one\";\nprint true;\nprint 2 + 1;\n```\n\nIt almost looks like a real program! 
Note that the REPL, too, now requires you\nto enter a full statement instead of a simple expression. Don't forget your\nsemicolons.\n\n## Global Variables\n\nNow that we have statements, we can start working on state. Before we get into\nall of the complexity of lexical scoping, we'll start off with the easiest kind\nof variables -- <span name=\"globals\">globals</span>. We need two new constructs.\n\n1.  A **variable declaration** statement brings a new variable into the world.\n\n    ```lox\n    var beverage = \"espresso\";\n    ```\n\n    This creates a new binding that associates a name (here \"beverage\") with a\n    value (here, the string `\"espresso\"`).\n\n2.  Once that's done, a **variable expression** accesses that binding. When the\n    identifier \"beverage\" is used as an expression, it looks up the value bound\n    to that name and returns it.\n\n    ```lox\n    print beverage; // \"espresso\".\n    ```\n\nLater, we'll add assignment and block scope, but that's enough to get moving.\n\n<aside name=\"globals\">\n\nGlobal state gets a bad rap. Sure, lots of global state -- especially *mutable*\nstate -- makes it hard to maintain large programs. It's good software\nengineering to minimize how much you use.\n\nBut when you're slapping together a simple programming language or, heck, even\nlearning your first language, the flat simplicity of global variables helps. My\nfirst language was BASIC and, though I outgrew it eventually, it was nice that I\ndidn't have to wrap my head around scoping rules before I could make a computer\ndo fun stuff.\n\n</aside>\n\n### Variable syntax\n\nAs before, we'll work through the implementation from front to back, starting\nwith the syntax. Variable declarations are statements, but they are different\nfrom other statements, and we're going to split the statement grammar in two to\nhandle them. 
That's because the grammar restricts where some kinds of statements\nare allowed.\n\nThe clauses in control flow statements -- think the then and else branches of\nan `if` statement or the body of a `while` -- are each a single statement. But\nthat statement is not allowed to be one that declares a name. This is OK:\n\n```lox\nif (monday) print \"Ugh, already?\";\n```\n\nBut this is not:\n\n```lox\nif (monday) var beverage = \"espresso\";\n```\n\nWe *could* allow the latter, but it's confusing. What is the scope of that\n`beverage` variable? Does it persist after the `if` statement? If so, what is\nits value on days other than Monday? Does the variable exist at all on those\ndays?\n\nCode like this is weird, so C, Java, and friends all disallow it. It's as if\nthere are two levels of <span name=\"brace\">\"precedence\"</span> for statements.\nSome places where a statement is allowed -- like inside a block or at the top\nlevel -- allow any kind of statement, including declarations. Others allow only\nthe \"higher\" precedence statements that don't declare names.\n\n<aside name=\"brace\">\n\nIn this analogy, block statements work sort of like parentheses do for\nexpressions. A block is itself in the \"higher\" precedence level and can be used\nanywhere, like in the clauses of an `if` statement. But the statements it\n*contains* can be lower precedence. You're allowed to declare variables and\nother names inside the block. The curlies let you escape back into the full\nstatement grammar from a place where only some statements are allowed.\n\n</aside>\n\nTo accommodate the distinction, we add another rule for kinds of statements that\ndeclare names.\n\n```ebnf\nprogram        → declaration* EOF ;\n\ndeclaration    → varDecl\n               | statement ;\n\nstatement      → exprStmt\n               | printStmt ;\n```\n\nDeclaration statements go under the new `declaration` rule. Right now, it's only\nvariables, but later it will include functions and classes. 
Any place where a\ndeclaration is allowed also allows non-declaring statements, so the\n`declaration` rule falls through to `statement`. Obviously, you can declare\nstuff at the top level of a script, so `program` routes to the new rule.\n\nThe rule for declaring a variable looks like:\n\n```ebnf\nvarDecl        → \"var\" IDENTIFIER ( \"=\" expression )? \";\" ;\n```\n\nLike most statements, it starts with a leading keyword. In this case, `var`.\nThen an identifier token for the name of the variable being declared, followed\nby an optional initializer expression. Finally, we put a bow on it with the\nsemicolon.\n\nTo access a variable, we define a new kind of primary expression.\n\n```ebnf\nprimary        → \"true\" | \"false\" | \"nil\"\n               | NUMBER | STRING\n               | \"(\" expression \")\"\n               | IDENTIFIER ;\n```\n\nThat `IDENTIFIER` clause matches a single identifier token, which is understood\nto be the name of the variable being accessed.\n\nThese new grammar rules get their corresponding syntax trees. Over in the AST\ngenerator, we add a <span name=\"var-stmt-ast\">new statement</span> node for a\nvariable declaration.\n\n^code var-stmt-ast (1 before, 1 after)\n\n<aside name=\"var-stmt-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-var-stmt].\n\n[appendix-var-stmt]: appendix-ii.html#variable-statement\n\n</aside>\n\nIt stores the name token so we know what it's declaring, along with the\ninitializer expression. (If there isn't an initializer, that field is `null`.)\n\nThen we add an expression node for accessing a variable.\n\n^code var-expr (1 before, 1 after)\n\n<span name=\"var-expr-ast\">It's</span> simply a wrapper around the token for the\nvariable name. That's it. 
As always, don't forget to run the AST generator\nscript so that you get updated \"Expr.java\" and \"Stmt.java\" files.\n\n<aside name=\"var-expr-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-var-expr].\n\n[appendix-var-expr]: appendix-ii.html#variable-expression\n\n</aside>\n\n### Parsing variables\n\nBefore we parse variable statements, we need to shift around some code to make\nroom for the new `declaration` rule in the grammar. The top level of a program\nis now a list of declarations, so the entrypoint method to the parser changes.\n\n^code parse-declaration (3 before, 4 after)\n\nThat calls this new method:\n\n^code declaration\n\nHey, do you remember way back in that [earlier chapter][parsing] when we put the\ninfrastructure in place to do error recovery? We are finally ready to hook that\nup.\n\n[parsing]: parsing-expressions.html\n[error recovery]: parsing-expressions.html#panic-mode-error-recovery\n\nThis `declaration()` method is the method we call repeatedly when parsing a\nseries of statements in a block or a script, so it's the right place to\nsynchronize when the parser goes into panic mode. The whole body of this method\nis wrapped in a try block to catch the exception thrown when the parser begins\nerror recovery. This gets it back to trying to parse the beginning of the next\nstatement or declaration.\n\nThe real parsing happens inside the try block. First, it looks to see if we're\nat a variable declaration by looking for the leading `var` keyword. If not, it\nfalls through to the existing `statement()` method that parses `print` and\nexpression statements.\n\nRemember how `statement()` tries to parse an expression statement if no other\nstatement matches? And `expression()` reports a syntax error if it can't parse\nan expression at the current token? 
That chain of calls ensures we report an\nerror if a valid declaration or statement isn't parsed.\n\nWhen the parser matches a `var` token, it branches to:\n\n^code parse-var-declaration\n\nAs always, the recursive descent code follows the grammar rule. The parser has\nalready matched the `var` token, so next it requires and consumes an identifier\ntoken for the variable name.\n\nThen, if it sees an `=` token, it knows there is an initializer expression and\nparses it. Otherwise, it leaves the initializer `null`. Finally, it consumes the\nrequired semicolon at the end of the statement. All this gets wrapped in a\nStmt.Var syntax tree node and we're groovy.\n\nParsing a variable expression is even easier. In `primary()`, we look for an\nidentifier token.\n\n^code parse-identifier (2 before, 2 after)\n\nThat gives us a working front end for declaring and using variables. All that's\nleft is to feed it into the interpreter. Before we get to that, we need to talk\nabout where variables live in memory.\n\n## Environments\n\nThe bindings that associate variables to values need to be stored somewhere.\nEver since the Lisp folks invented parentheses, this data structure has been\ncalled an <span name=\"env\">**environment**</span>.\n\n<img src=\"image/statements-and-state/environment.png\" alt=\"An environment containing two bindings.\" />\n\n<aside name=\"env\">\n\nI like to imagine the environment literally, as a sylvan wonderland where\nvariables and values frolic.\n\n</aside>\n\nYou can think of it like a <span name=\"map\">map</span> where the keys are\nvariable names and the values are the variable's, uh, values. In fact, that's\nhow we'll implement it in Java. We could stuff that map and the code to manage\nit right into Interpreter, but since it forms a nicely delineated concept, we'll\npull it out into its own class.\n\nStart a new file and add:\n\n<aside name=\"map\">\n\nJava calls them **maps** or **hashmaps**. 
Other languages call them **hash\ntables**, **dictionaries** (Python and C#), **hashes** (Ruby and Perl),\n**tables** (Lua), or **associative arrays** (PHP). Way back when, they were\nknown as **scatter tables**.\n\n</aside>\n\n^code environment-class\n\nThere's a Java Map in there to store the bindings. It uses bare strings for the\nkeys, not tokens. A token represents a unit of code at a specific place in the\nsource text, but when it comes to looking up variables, all identifier tokens\nwith the same name should refer to the same variable (ignoring scope for now).\nUsing the raw string ensures all of those tokens refer to the same map key.\n\nThere are two operations we need to support. First, a variable definition binds\na new name to a value.\n\n^code environment-define\n\nNot exactly brain surgery, but we have made one interesting semantic choice.\nWhen we add the key to the map, we don't check to see if it's already present.\nThat means that this program works:\n\n```lox\nvar a = \"before\";\nprint a; // \"before\".\nvar a = \"after\";\nprint a; // \"after\".\n```\n\nA variable statement doesn't just define a *new* variable, it can also be used\nto *re*define an existing variable. We could <span name=\"scheme\">choose</span>\nto make this an error instead. The user may not intend to redefine an existing\nvariable. (If they did mean to, they probably would have used assignment, not\n`var`.) Making redefinition an error would help them find that bug.\n\nHowever, doing so interacts poorly with the REPL. In the middle of a REPL\nsession, it's nice to not have to mentally track which variables you've already\ndefined. 
We could allow redefinition in the REPL but not in scripts, but then\nusers would have to learn two sets of rules, and code copied and pasted from one\nform to the other might not work.\n\n<aside name=\"scheme\">\n\nMy rule about variables and scoping is, \"When in doubt, do what Scheme does\".\nThe Scheme folks have probably spent more time thinking about variable scope\nthan we ever will -- one of the main goals of Scheme was to introduce lexical\nscoping to the world -- so it's hard to go wrong if you follow in their\nfootsteps.\n\nScheme allows redefining variables at the top level.\n\n</aside>\n\nSo, to keep the two modes consistent, we'll allow it -- at least for global\nvariables. Once a variable exists, we need a way to look it up.\n\n^code environment-get (2 before, 1 after)\n\nThis is a little more semantically interesting. If the variable is found, it\nsimply returns the value bound to it. But what if it's not? Again, we have a\nchoice:\n\n* Make it a syntax error.\n\n* Make it a runtime error.\n\n* Allow it and return some default value like `nil`.\n\nLox is pretty lax, but the last option is a little *too* permissive to me.\nMaking it a syntax error -- a compile-time error -- seems like a smart choice.\nUsing an undefined variable is a bug, and the sooner you detect the mistake, the\nbetter.\n\nThe problem is that *using* a variable isn't the same as *referring* to it. You\ncan refer to a variable in a chunk of code without immediately evaluating it if\nthat chunk of code is wrapped inside a function. If we make it a static error to\n*mention* a variable before it's been declared, it becomes much harder to define\nrecursive functions.\n\nWe could accommodate single recursion -- a function that calls itself -- by\ndeclaring the function's own name before we examine its body. But that doesn't\nhelp with mutually recursive procedures that call each other. 
Consider:\n\n<span name=\"contrived\"></span>\n\n```lox\nfun isOdd(n) {\n  if (n == 0) return false;\n  return isEven(n - 1);\n}\n\nfun isEven(n) {\n  if (n == 0) return true;\n  return isOdd(n - 1);\n}\n```\n\n<aside name=\"contrived\">\n\nGranted, this is probably not the most efficient way to tell if a number is even\nor odd (not to mention the bad things that happen if you pass a non-integer or\nnegative number to them). Bear with me.\n\n</aside>\n\nThe `isEven()` function isn't defined by the <span name=\"declare\">time</span> we\nare looking at the body of `isOdd()` where it's called. If we swap the order of\nthe two functions, then `isOdd()` isn't defined when we're looking at\n`isEven()`'s body.\n\n<aside name=\"declare\">\n\nSome statically typed languages like Java and C# solve this by specifying that\nthe top level of a program isn't a sequence of imperative statements. Instead, a\nprogram is a set of declarations which all come into being simultaneously. The\nimplementation declares *all* of the names before looking at the bodies of *any*\nof the functions.\n\nOlder languages like C and Pascal don't work like this. Instead, they force you\nto add explicit *forward declarations* to declare a name before it's fully\ndefined. That was a concession to the limited computing power at the time. They\nwanted to be able to compile a source file in one single pass through the text,\nso those compilers couldn't gather up all of the declarations first before\nprocessing function bodies.\n\n</aside>\n\nSince making it a *static* error makes recursive declarations too difficult,\nwe'll defer the error to runtime. It's OK to refer to a variable before it's\ndefined as long as you don't *evaluate* the reference. That lets the program\nfor even and odd numbers work, but you'd get a runtime error in:\n\n```lox\nprint a;\nvar a = \"too late!\";\n```\n\nAs with type errors in the expression evaluation code, we report a runtime error\nby throwing an exception. 
The exception contains the variable's token so we can\ntell the user where in their code they messed up.\n\n### Interpreting global variables\n\nThe Interpreter class gets an instance of the new Environment class.\n\n^code environment-field (2 before, 1 after)\n\nWe store it as a field directly in Interpreter so that the variables stay in\nmemory as long as the interpreter is still running.\n\nWe have two new syntax trees, so that's two new visit methods. The first is for\ndeclaration statements.\n\n^code visit-var\n\nIf the variable has an initializer, we evaluate it. If not, we have another\nchoice to make. We could have made this a syntax error in the parser by\n*requiring* an initializer. Most languages don't, though, so it feels a little\nharsh to do so in Lox.\n\nWe could make it a runtime error. We'd let you define an uninitialized variable,\nbut if you accessed it before assigning to it, a runtime error would occur. It's\nnot a bad idea, but most dynamically typed languages don't do that. Instead,\nwe'll keep it simple and say that Lox sets a variable to `nil` if it isn't\nexplicitly initialized.\n\n```lox\nvar a;\nprint a; // \"nil\".\n```\n\nThus, if there isn't an initializer, we set the value to `null`, which is the\nJava representation of Lox's `nil` value. Then we tell the environment to bind\nthe variable to that value.\n\nNext, we evaluate a variable expression.\n\n^code visit-variable\n\nThis simply forwards to the environment which does the heavy lifting to make\nsure the variable is defined. With that, we've got rudimentary variables\nworking. Try this out:\n\n```lox\nvar a = 1;\nvar b = 2;\nprint a + b;\n```\n\nWe can't reuse *code* yet, but we can start to build up programs that reuse\n*data*.\n\n## Assignment\n\nIt's possible to create a language that has variables but does not let you\nreassign -- or **mutate** -- them. Haskell is one example. SML supports only\nmutable references and arrays -- variables cannot be reassigned. 
Rust steers you\naway from mutation by requiring a `mut` modifier to enable assignment.\n\nMutating a variable is a side effect and, as the name suggests, some language\nfolks think side effects are <span name=\"pure\">dirty</span> or inelegant. Code\nshould be pure math that produces values -- crystalline, unchanging ones -- like\nan act of divine creation. Not some grubby automaton that beats blobs of data\ninto shape, one imperative grunt at a time.\n\n<aside name=\"pure\">\n\nI find it delightful that the same group of people who pride themselves on\ndispassionate logic are also the ones who can't resist emotionally loaded terms\nfor their work: \"pure\", \"side effect\", \"lazy\", \"persistent\", \"first-class\",\n\"higher-order\".\n\n</aside>\n\nLox is not so austere. Lox is an imperative language, and mutation comes with\nthe territory. Adding support for assignment doesn't require much work. Global\nvariables already support redefinition, so most of the machinery is there now.\nMainly, we're missing an explicit assignment notation.\n\n### Assignment syntax\n\nThat little `=` syntax is more complex than it might seem. Like most C-derived\nlanguages, Lox makes assignment an <span name=\"assign\">expression</span> and not\na statement. As in C, it is the lowest precedence expression form. 
That means the\nrule slots between `expression` and `equality` (the next lowest precedence\nexpression).\n\n<aside name=\"assign\">\n\nIn some other languages, like Pascal, Python, and Go, assignment is a statement.\n\n</aside>\n\n```ebnf\nexpression     → assignment ;\nassignment     → IDENTIFIER \"=\" assignment\n               | equality ;\n```\n\nThis says an `assignment` is either an identifier followed by an `=` and an\nexpression for the value, or an `equality` (and thus any other) expression.\nLater, `assignment` will get more complex when we add property setters on\nobjects, like:\n\n```lox\ninstance.field = \"value\";\n```\n\nThe easy part is adding the <span name=\"assign-ast\">new syntax tree node</span>.\n\n^code assign-expr (1 before, 1 after)\n\n<aside name=\"assign-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-assign].\n\n[appendix-assign]: appendix-ii.html#assign-expression\n\n</aside>\n\nIt has a token for the variable being assigned to, and an expression for the new\nvalue. After you run the AST generator to get the new Expr.Assign class, swap out\nthe body of the parser's existing `expression()` method to match the updated\nrule.\n\n^code expression (1 before, 1 after)\n\nHere is where it gets tricky. A single token lookahead recursive descent parser\ncan't see far enough to tell that it's parsing an assignment until *after* it\nhas gone through the left-hand side and stumbled onto the `=`. You might wonder\nwhy it even needs to. After all, we don't know we're parsing a `+` expression\nuntil after we've finished parsing the left operand.\n\nThe difference is that the left-hand side of an assignment isn't an expression\nthat evaluates to a value. It's a sort of pseudo-expression that evaluates to a\n\"thing\" you can assign to. Consider:\n\n```lox\nvar a = \"before\";\na = \"value\";\n```\n\nOn the second line, we don't *evaluate* `a` (which would return the string\n\"before\"). 
We figure out what variable `a` refers to so we know where to store\nthe right-hand side expression's value. The [classic terms][l-value] for these\ntwo <span name=\"l-value\">constructs</span> are **l-value** and **r-value**. All\nof the expressions that we've seen so far that produce values are r-values. An\nl-value \"evaluates\" to a storage location that you can assign into.\n\n[l-value]: https://en.wikipedia.org/wiki/Value_(computer_science)#lrvalue\n\n<aside name=\"l-value\">\n\nIn fact, the names come from assignment expressions: *l*-values appear on the\n*left* side of the `=` in an assignment, and *r*-values on the *right*.\n\n</aside>\n\nWe want the syntax tree to reflect that an l-value isn't evaluated like a normal\nexpression. That's why the Expr.Assign node has a *Token* for the left-hand\nside, not an Expr. The problem is that the parser doesn't know it's parsing an\nl-value until it hits the `=`. In a complex l-value, that may occur <span\nname=\"many\">many</span> tokens later.\n\n```lox\nmakeList().head.next = node;\n```\n\n<aside name=\"many\">\n\nSince the receiver of a field assignment can be any expression, and expressions\ncan be as long as you want to make them, it may take an *unbounded* number of\ntokens of lookahead to find the `=`.\n\n</aside>\n\nWe have only a single token of lookahead, so what do we do? We use a little\ntrick, and it looks like this:\n\n^code parse-assignment\n\nMost of the code for parsing an assignment expression looks similar to that of\nthe other binary operators like `+`. We parse the left-hand side, which can be\nany expression of higher precedence. 
If we find an `=`, we parse the right-hand\nside and then wrap it all up in an assignment expression tree node.\n\n<aside name=\"no-throw\">\n\nWe *report* an error if the left-hand side isn't a valid assignment target, but\nwe don't *throw* it because the parser isn't in a confused state where we need\nto go into panic mode and synchronize.\n\n</aside>\n\nOne slight difference from binary operators is that we don't loop to build up a\nsequence of the same operator. Since assignment is right-associative, we instead\nrecursively call `assignment()` to parse the right-hand side.\n\nThe trick is that right before we create the assignment expression node, we look\nat the left-hand side expression and figure out what kind of assignment target\nit is. We convert the r-value expression node into an l-value representation.\n\nThis conversion works because it turns out that every valid assignment target\nhappens to also be <span name=\"converse\">valid syntax</span> as a normal\nexpression. Consider a complex field assignment like:\n\n<aside name=\"converse\">\n\nYou can still use this trick even if there are assignment targets that are not\nvalid expressions. Define a **cover grammar**, a looser grammar that accepts\nall of the valid expression *and* assignment target syntaxes. When you hit\nan `=`, report an error if the left-hand side isn't within the valid assignment\ntarget grammar. Conversely, if you *don't* hit an `=`, report an error if the\nleft-hand side isn't a valid *expression*.\n\n</aside>\n\n```lox\nnewPoint(x + 2, 0).y = 3;\n```\n\nThe left-hand side of that assignment could also work as a valid expression.\n\n```lox\nnewPoint(x + 2, 0).y;\n```\n\nThe first example sets the field, the second gets it.\n\nThis means we can parse the left-hand side *as if it were* an expression and\nthen after the fact produce a syntax tree that turns it into an assignment\ntarget. 
If the left-hand side expression isn't a <span name=\"paren\">valid</span>\nassignment target, we fail with a syntax error. That ensures we report an error\non code like this:\n\n```lox\na + b = c;\n```\n\n<aside name=\"paren\">\n\nWay back in the parsing chapter, I said we represent parenthesized expressions\nin the syntax tree because we'll need them later. This is why. We need to be\nable to distinguish these cases:\n\n```lox\na = 3;   // OK.\n(a) = 3; // Error.\n```\n\n</aside>\n\nRight now, the only valid target is a simple variable expression, but we'll add\nfields later. The end result of this trick is an assignment expression tree node\nthat knows what it is assigning to and has an expression subtree for the value\nbeing assigned. All with only a single token of lookahead and no backtracking.\n\n### Assignment semantics\n\nWe have a new syntax tree node, so our interpreter gets a new visit method.\n\n^code visit-assign\n\nFor obvious reasons, it's similar to variable declaration. It evaluates the\nright-hand side to get the value, then stores it in the named variable. Instead\nof using `define()` on Environment, it calls this new method:\n\n^code environment-assign\n\nThe key difference between assignment and definition is that assignment is not\n<span name=\"new\">allowed</span> to create a *new* variable. In terms of our\nimplementation, that means it's a runtime error if the key doesn't already exist\nin the environment's variable map.\n\n<aside name=\"new\">\n\nUnlike Python and Ruby, Lox doesn't do [implicit variable declaration][].\n\n[implicit variable declaration]: #design-note\n\n</aside>\n\nThe last thing the `visit()` method does is return the assigned value. That's\nbecause assignment is an expression that can be nested inside other expressions,\nlike so:\n\n```lox\nvar a = 1;\nprint a = 2; // \"2\".\n```\n\nOur interpreter can now create, read, and modify variables. It's about as\nsophisticated as early <span name=\"basic\">BASICs</span>. 
Global variables are\nsimple, but writing a large program when any two chunks of code can accidentally\nstep on each other's state is no fun. We want *local* variables, which means\nit's time for *scope*.\n\n<aside name=\"basic\">\n\nMaybe a little better than that. Unlike some old BASICs, Lox can handle variable\nnames longer than two characters.\n\n</aside>\n\n## Scope\n\nA **scope** defines a region where a name maps to a certain entity. Multiple\nscopes enable the same name to refer to different things in different contexts.\nIn my house, \"Bob\" usually refers to me. But maybe in your town you know a\ndifferent Bob. Same name, but different dudes based on where you say it.\n\n<span name=\"lexical\">**Lexical scope**</span> (or the less commonly heard\n**static scope**) is a specific style of scoping where the text of the program\nitself shows where a scope begins and ends. In Lox, as in most modern languages,\nvariables are lexically scoped. When you see an expression that uses some\nvariable, you can figure out which variable declaration it refers to just by\nstatically reading the code.\n\n<aside name=\"lexical\">\n\n\"Lexical\" comes from the Greek \"lexikos\" which means \"related to words\". When we\nuse it in programming languages, it usually means a thing you can figure out\nfrom source code itself without having to execute anything.\n\nLexical scope came onto the scene with ALGOL. Earlier languages were often\ndynamically scoped. Computer scientists back then believed dynamic scope was\nfaster to execute. Today, thanks to early Scheme hackers, we know that isn't\ntrue. If anything, it's the opposite.\n\nDynamic scope for variables lives on in some corners. Emacs Lisp defaults to\ndynamic scope for variables. The [`binding`][binding] macro in Clojure provides\nit. 
The widely disliked [`with` statement][with] in JavaScript turns properties\non an object into dynamically scoped variables.\n\n[binding]: http://clojuredocs.org/clojure.core/binding\n[with]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/with\n\n</aside>\n\nFor example:\n\n```lox\n{\n  var a = \"first\";\n  print a; // \"first\".\n}\n\n{\n  var a = \"second\";\n  print a; // \"second\".\n}\n```\n\nHere, we have two blocks with a variable `a` declared in each of them. You and\nI can tell just from looking at the code that the use of `a` in the first\n`print` statement refers to the first `a`, and the second one refers to the\nsecond.\n\n<img src=\"image/statements-and-state/blocks.png\" alt=\"An environment for each 'a'.\" />\n\nThis is in contrast to **dynamic scope** where you don't know what a name refers\nto until you execute the code. Lox doesn't have dynamically scoped *variables*,\nbut methods and fields on objects are dynamically scoped.\n\n```lox\nclass Saxophone {\n  play() {\n    print \"Careless Whisper\";\n  }\n}\n\nclass GolfClub {\n  play() {\n    print \"Fore!\";\n  }\n}\n\nfun playIt(thing) {\n  thing.play();\n}\n```\n\nWhen `playIt()` calls `thing.play()`, we don't know if we're about to hear\n\"Careless Whisper\" or \"Fore!\" It depends on whether you pass a Saxophone or a\nGolfClub to the function, and we don't know that until runtime.\n\nScope and environments are close cousins. The former is the theoretical concept,\nand the latter is the machinery that implements it. As our interpreter works its\nway through code, syntax tree nodes that affect scope will change the\nenvironment. In a C-ish syntax like Lox's, scope is controlled by curly-braced\nblocks. (That's why we call it **block scope**.)\n\n```lox\n{\n  var a = \"in block\";\n}\nprint a; // Error! No more \"a\".\n```\n\nThe beginning of a block introduces a new local scope, and that scope ends when\nexecution passes the closing `}`. 
Any variables declared inside the block\ndisappear.\n\n### Nesting and shadowing\n\nA first cut at implementing block scope might work like this:\n\n1.  As we visit each statement inside the block, keep track of any variables\n    declared.\n\n2.  After the last statement is executed, tell the environment to delete all of\n    those variables.\n\nThat would work for the previous example. But remember, one motivation for\nlocal scope is encapsulation -- a block of code in one corner of the program\nshouldn't interfere with some other block. Check this out:\n\n```lox\n// How loud?\nvar volume = 11;\n\n// Silence.\nvolume = 0;\n\n// Calculate size of 3x4x5 cuboid.\n{\n  var volume = 3 * 4 * 5;\n  print volume;\n}\n```\n\nLook at the block where we calculate the volume of the cuboid using a local\ndeclaration of `volume`. After the block exits, the interpreter will delete the\n*global* `volume` variable. That ain't right. When we exit the block, we should\nremove any variables declared inside the block, but if there is a variable with\nthe same name declared outside of the block, *that's a different variable*. It\nshouldn't get touched.\n\nWhen a local variable has the same name as a variable in an enclosing scope, it\n**shadows** the outer one. Code inside the block can't see it any more -- it is\nhidden in the \"shadow\" cast by the inner one -- but it's still there.\n\nWhen we enter a new block scope, we need to preserve variables defined in outer\nscopes so they are still around when we exit the inner block. We do that by\ndefining a fresh environment for each block containing only the variables\ndefined in that scope. 
When we exit the block, we discard its environment and\nrestore the previous one.\n\nWe also need to handle enclosing variables that are *not* shadowed.\n\n```lox\nvar global = \"outside\";\n{\n  var local = \"inside\";\n  print global + local;\n}\n```\n\nHere, `global` lives in the outer global environment and `local` is defined\ninside the block's environment. In that `print` statement, both of those\nvariables are in scope. In order to find them, the interpreter must search not\nonly the current innermost environment, but also any enclosing ones.\n\nWe implement this by <span name=\"cactus\">chaining</span> the environments\ntogether. Each environment has a reference to the environment of the immediately\nenclosing scope. When we look up a variable, we walk that chain from innermost\nout until we find the variable. Starting at the inner scope is how we make local\nvariables shadow outer ones.\n\n<img src=\"image/statements-and-state/chaining.png\" alt=\"Environments for each scope, linked together.\" />\n\n<aside name=\"cactus\">\n\nWhile the interpreter is running, the environments form a linear list of\nobjects, but consider the full set of environments created during the entire\nexecution. An outer scope may have multiple blocks nested within it, and each\nwill point to the outer one, giving a tree-like structure, though only one path\nthrough the tree exists at a time.\n\nThe boring name for this is a [**parent-pointer tree**][parent pointer], but I\nmuch prefer the evocative **cactus stack**.\n\n[parent pointer]: https://en.wikipedia.org/wiki/Parent_pointer_tree\n\n<img class=\"above\" src=\"image/statements-and-state/cactus.png\" alt=\"Each branch points to its parent. The root is global scope.\" />\n\n</aside>\n\nBefore we add block syntax to the grammar, we'll beef up our Environment class\nwith support for this nesting. 
First, we give each environment a reference to\nits enclosing one.\n\n^code enclosing-field (1 before, 1 after)\n\nThis field needs to be initialized, so we add a couple of constructors.\n\n^code environment-constructors\n\nThe no-argument constructor is for the global scope's environment, which ends\nthe chain. The other constructor creates a new local scope nested inside the\ngiven outer one.\n\nWe don't have to touch the `define()` method -- a new variable is always\ndeclared in the current innermost scope. But variable lookup and assignment work\nwith existing variables and they need to walk the chain to find them. First,\nlookup:\n\n^code environment-get-enclosing (2 before, 3 after)\n\nIf the variable isn't found in this environment, we simply try the enclosing\none. That in turn does the same thing <span name=\"recurse\">recursively</span>,\nso this will ultimately walk the entire chain. If we reach an environment with\nno enclosing one and still don't find the variable, then we give up and report\nan error as before.\n\nAssignment works the same way.\n\n<aside name=\"recurse\">\n\nIt's likely faster to iteratively walk the chain, but I think the recursive\nsolution is prettier. We'll do something *much* faster in clox.\n\n</aside>\n\n^code environment-assign-enclosing (4 before, 1 after)\n\nAgain, if the variable isn't in this environment, it checks the outer one,\nrecursively.\n\n### Block syntax and semantics\n\nNow that Environments nest, we're ready to add blocks to the language. Behold\nthe grammar:\n\n```ebnf\nstatement      → exprStmt\n               | printStmt\n               | block ;\n\nblock          → \"{\" declaration* \"}\" ;\n```\n\nA block is a (possibly empty) series of statements or declarations surrounded by\ncurly braces. A block is itself a statement and can appear anywhere a statement\nis allowed. 
The <span name=\"block-ast\">syntax tree</span> node looks like this:\n\n^code block-ast (1 before, 1 after)\n\n<aside name=\"block-ast\">\n\nThe generated code for the new node is in [Appendix II][appendix-block].\n\n[appendix-block]: appendix-ii.html#block-statement\n\n</aside>\n\n<span name=\"generate\">It</span> contains the list of statements that are inside\nthe block. Parsing is straightforward. Like other statements, we detect the\nbeginning of a block by its leading token -- in this case the `{`. In the\n`statement()` method, we add:\n\n<aside name=\"generate\">\n\nAs always, don't forget to run \"GenerateAst.java\".\n\n</aside>\n\n^code parse-block (1 before, 2 after)\n\nAll the real work happens here:\n\n^code block\n\nWe <span name=\"list\">create</span> an empty list and then parse statements and\nadd them to the list until we reach the end of the block, marked by the closing\n`}`. Note that the loop also has an explicit check for `isAtEnd()`. We have to\nbe careful to avoid infinite loops, even when parsing invalid code. If the user\nforgets a closing `}`, the parser needs to not get stuck.\n\n<aside name=\"list\">\n\nHaving `block()` return the raw list of statements and leaving it to\n`statement()` to wrap the list in a Stmt.Block looks a little odd. I did it that\nway because we'll reuse `block()` later for parsing function bodies and we don't\nwant that body wrapped in a Stmt.Block.\n\n</aside>\n\nThat's it for syntax. For semantics, we add another visit method to Interpreter.\n\n^code visit-block\n\nTo execute a block, we create a new environment for the block's scope and pass\nit off to this other method:\n\n^code execute-block\n\nThis new method executes a list of statements in the context of a given <span\nname=\"param\">environment</span>. Up until now, the `environment` field in\nInterpreter always pointed to the same environment -- the global one. Now, that\nfield represents the *current* environment. 
That's the environment that\ncorresponds to the innermost scope containing the code to be executed.\n\nTo execute code within a given scope, this method updates the interpreter's\n`environment` field, visits all of the statements, and then restores the\nprevious value. As is always good practice in Java, it restores the previous\nenvironment using a finally clause. That way it gets restored even if an\nexception is thrown.\n\n<aside name=\"param\">\n\nManually changing and restoring a mutable `environment` field feels inelegant.\nAnother classic approach is to explicitly pass the environment as a parameter to\neach visit method. To \"change\" the environment, you pass a different one as you\nrecurse down the tree. You don't have to restore the old one, since the new one\nlives on the Java stack and is implicitly discarded when the interpreter returns\nfrom the block's visit method.\n\nI considered that for jlox, but it's kind of tedious and verbose adding an\nenvironment parameter to every single visit method. To keep the book a little\nsimpler, I went with the mutable field.\n\n</aside>\n\nSurprisingly, that's all we need to do in order to fully support local\nvariables, nesting, and shadowing. Go ahead and try this out:\n\n```lox\nvar a = \"global a\";\nvar b = \"global b\";\nvar c = \"global c\";\n{\n  var a = \"outer a\";\n  var b = \"outer b\";\n  {\n    var a = \"inner a\";\n    print a;\n    print b;\n    print c;\n  }\n  print a;\n  print b;\n  print c;\n}\nprint a;\nprint b;\nprint c;\n```\n\nOur little interpreter can remember things now. We are inching closer to\nsomething resembling a full-featured programming language.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  The REPL no longer supports entering a single expression and automatically\n    printing its result value. That's a drag. Add support to the REPL to let\n    users type in both statements and expressions. If they enter a statement,\n    execute it. 
If they enter an expression, evaluate it and display the result\n    value.\n\n2.  Maybe you want Lox to be a little more explicit about variable\n    initialization. Instead of implicitly initializing variables to `nil`, make\n    it a runtime error to access a variable that has not been initialized or\n    assigned to, as in:\n\n    ```lox\n    // No initializers.\n    var a;\n    var b;\n\n    a = \"assigned\";\n    print a; // OK, was assigned first.\n\n    print b; // Error!\n    ```\n\n3.  What does the following program do?\n\n    ```lox\n    var a = 1;\n    {\n      var a = a + 2;\n      print a;\n    }\n    ```\n\n    What did you *expect* it to do? Is it what you think it should do? What\n    does analogous code in other languages you are familiar with do? What do\n    you think users will expect this to do?\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: Implicit Variable Declaration\n\nLox has distinct syntax for declaring a new variable and assigning to an\nexisting one. Some languages collapse those to only assignment syntax. Assigning\nto a non-existent variable automatically brings it into being. This is called\n**implicit variable declaration** and exists in Python, Ruby, and CoffeeScript,\namong others. JavaScript has an explicit syntax to declare variables, but can\nalso create new variables on assignment. Visual Basic has [an option to enable\nor disable implicit variables][vb].\n\n[vb]: https://msdn.microsoft.com/en-us/library/xe53dz5w(v=vs.100).aspx\n\nWhen the same syntax can assign or create a variable, each language must decide\nwhat happens when it isn't clear about which behavior the user intends. 
In\nparticular, each language must choose how implicit declaration interacts with\nshadowing, and which scope an implicitly declared variable goes into.\n\n*   In Python, assignment always creates a variable in the current function's\n    scope, even if there is a variable with the same name declared outside of\n    the function.\n\n*   Ruby avoids some ambiguity by having different naming rules for local and\n    global variables. However, blocks in Ruby (which are more like closures than\n    like \"blocks\" in C) have their own scope, so it still has the problem.\n    Assignment in Ruby assigns to an existing variable outside of the current\n    block if there is one with the same name. Otherwise, it creates a new\n    variable in the current block's scope.\n\n*   CoffeeScript, which takes after Ruby in many ways, is similar. It explicitly\n    disallows shadowing by saying that assignment always assigns to a variable\n    in an outer scope if there is one, all the way up to the outermost global\n    scope. Otherwise, it creates the variable in the current function scope.\n\n*   In JavaScript, assignment modifies an existing variable in any enclosing\n    scope, if found. If not, it implicitly creates a new variable in the\n    *global* scope.\n\nThe main advantage to implicit declaration is simplicity. There's less syntax\nand no \"declaration\" concept to learn. Users can just start assigning stuff and\nthe language figures it out.\n\nOlder, statically typed languages like C benefit from explicit declaration\nbecause they give the user a place to tell the compiler what type each variable\nhas and how much storage to allocate for it. In a dynamically typed,\ngarbage-collected language, that isn't really necessary, so you can get away\nwith making declarations implicit. It feels a little more \"scripty\", more \"you\nknow what I mean\".\n\nBut is that a good idea? 
Implicit declaration has some problems.\n\n*   A user may intend to assign to an existing variable, but may have misspelled\n    it. The interpreter doesn't know that, so it goes ahead and silently creates\n    some new variable and the variable the user wanted to assign to still has\n    its old value. This is particularly heinous in JavaScript where a typo will\n    create a *global* variable, which may in turn interfere with other code.\n\n*   JS, Ruby, and CoffeeScript use the presence of an existing variable with the\n    same name -- even in an outer scope -- to determine whether or not an\n    assignment creates a new variable or assigns to an existing one. That means\n    adding a new variable in a surrounding scope can change the meaning of\n    existing code. What was once a local variable may silently turn into an\n    assignment to that new outer variable.\n\n*   In Python, you may *want* to assign to some variable outside of the current\n    function instead of creating a new variable in the current one, but you\n    can't.\n\nOver time, the languages I know with implicit variable declaration ended up\nadding more features and complexity to deal with these problems.\n\n*   Implicit declaration of global variables in JavaScript is universally\n    considered a mistake today. \"Strict mode\" disables it and makes it a runtime\n    error.\n\n*   Python added a `global` statement to let you explicitly assign to a global\n    variable from within a function. Later, as functional programming and nested\n    functions became more popular, they added a similar `nonlocal` statement to\n    assign to variables in enclosing functions.\n\n*   Ruby extended its block syntax to allow declaring certain variables to be\n    explicitly local to the block even if the same name exists in an outer\n    scope.\n\nGiven those, I think the simplicity argument is mostly lost. 
There is an\nargument that implicit declaration is the right *default* but I personally find\nthat less compelling.\n\nMy opinion is that implicit declaration made sense in years past when most\nscripting languages were heavily imperative and code was pretty flat. As\nprogrammers have gotten more comfortable with deep nesting, functional\nprogramming, and closures, it's become much more common to want access to\nvariables in outer scopes. That makes it more likely that users will run into\nthe tricky cases where it's not clear whether they intend their assignment to\ncreate a new variable or reuse a surrounding one.\n\nSo I prefer explicitly declaring variables, which is why Lox requires it.\n\n</div>\n"
  },
  {
    "path": "book/strings.md",
    "content": "> \"Ah? A small aversion to menial labor?\" The doctor cocked an eyebrow.\n> \"Understandable, but misplaced. One should treasure those hum-drum\n> tasks that keep the body occupied but leave the mind and heart unfettered.\"\n>\n> <cite>Tad Williams, <em>The Dragonbone Chair</em></cite>\n\nOur little VM can represent three types of values right now: numbers, Booleans,\nand `nil`. Those types have two important things in common: they're immutable\nand they're small. Numbers are the largest, and they still fit into two 64-bit\nwords. That's a small enough price that we can afford to pay it for all values,\neven Booleans and nils which don't need that much space.\n\nStrings, unfortunately, are not so petite. There's no maximum length for a\nstring. Even if we were to artificially cap it at some contrived limit like\n<span name=\"pascal\">255</span> characters, that's still too much memory to spend\non every single value.\n\n<aside name=\"pascal\">\n\nUCSD Pascal, one of the first implementations of Pascal, had this exact limit.\nInstead of using a terminating null byte to indicate the end of the string like\nC, Pascal strings started with a length value. Since UCSD used only a single\nbyte to store the length, strings couldn't be any longer than 255 characters.\n\n<img src=\"image/strings/pstring.png\" alt=\"The Pascal string 'hello' with a length byte of 5 preceding it.\" />\n\n</aside>\n\nWe need a way to support values whose sizes vary, sometimes greatly. This is\nexactly what dynamic allocation on the heap is designed for. We can allocate as\nmany bytes as we need. We get back a pointer that we'll use to keep track of the\nvalue as it flows through the VM.\n\n## Values and Objects\n\nUsing the heap for larger, variable-sized values and the stack for smaller,\natomic ones leads to a two-level representation. Every Lox value that you can\nstore in a variable or return from an expression will be a Value. 
For small,\nfixed-size types like numbers, the payload is stored directly inside the Value\nstruct itself.\n\nIf the object is larger, its data lives on the heap. Then the Value's payload is\na *pointer* to that blob of memory. We'll eventually have a handful of\nheap-allocated types in clox: strings, instances, functions, you get the idea.\nEach type has its own unique data, but there is also state they all share that\n[our future garbage collector][gc] will use to manage their memory.\n\n<img src=\"image/strings/value.png\" class=\"wide\" alt=\"Field layout of number and obj values.\" />\n\n[gc]: garbage-collection.html\n\nWe'll call this common representation <span name=\"short\">\"Obj\"</span>. Each Lox\nvalue whose state lives on the heap is an Obj. We can thus use a single new\nValueType case to refer to all heap-allocated types.\n\n<aside name=\"short\">\n\n\"Obj\" is short for \"object\", natch.\n\n</aside>\n\n^code val-obj (1 before, 1 after)\n\nWhen a Value's type is `VAL_OBJ`, the payload is a pointer to the heap memory,\nso we add another case to the union for that.\n\n^code union-object (1 before, 1 after)\n\nAs we did with the other value types, we crank out a couple of helpful macros\nfor working with Obj values.\n\n^code is-obj (1 before, 2 after)\n\nThis evaluates to `true` if the given Value is an Obj. If so, we can use this:\n\n^code as-obj (2 before, 1 after)\n\nIt extracts the Obj pointer from the value. We can also go the other way.\n\n^code obj-val (1 before, 2 after)\n\nThis takes a bare Obj pointer and wraps it in a full Value.\n\n## Struct Inheritance\n\nEvery heap-allocated value is an Obj, but <span name=\"objs\">Objs</span> are\nnot all the same. For strings, we need the array of characters. When we get to\ninstances, they will need their data fields. A function object will need its\nchunk of bytecode. How do we handle different payloads and sizes? 
We can't use\nanother union like we did for Value since the sizes are all over the place.\n\n<aside name=\"objs\">\n\nNo, I don't know how to pronounce \"objs\" either. Feels like there should be a\nvowel in there somewhere.\n\n</aside>\n\nInstead, we'll use another technique. It's been around for ages, to the point\nthat the C specification carves out specific support for it, but I don't know\nthat it has a canonical name. It's an example of [*type punning*][pun], but that\nterm is too broad. In the absence of any better ideas, I'll call it **struct\ninheritance**, because it relies on structs and roughly follows how\nsingle-inheritance of state works in object-oriented languages.\n\n[pun]: https://en.wikipedia.org/wiki/Type_punning\n\nLike a tagged union, each Obj starts with a tag field that identifies what kind\nof object it is -- string, instance, etc. Following that are the payload fields.\nInstead of a union with cases for each type, each type is its own separate\nstruct. The tricky part is how to treat these structs uniformly since C has no\nconcept of inheritance or polymorphism. I'll explain that soon, but first let's\nget the preliminary stuff out of the way.\n\nThe name \"Obj\" itself refers to a struct that contains the state shared across\nall object types. It's sort of like the \"base class\" for objects. Because of\nsome cyclic dependencies between values and objects, we forward-declare it in\nthe \"value\" module.\n\n^code forward-declare-obj (2 before, 1 after)\n\nAnd the actual definition is in a new module.\n\n^code object-h\n\nRight now, it contains only the type tag. Shortly, we'll add some other\nbookkeeping information for memory management. The type enum is this:\n\n^code obj-type (1 before, 2 after)\n\nObviously, that will be more useful in later chapters after we add more\nheap-allocated types. 
Since we'll be accessing these tag types frequently, it's\nworth making a little macro that extracts the object type tag from a given\nValue.\n\n^code obj-type-macro (1 before, 2 after)\n\nThat's our foundation.\n\nNow, let's build strings on top of it. The payload for strings is defined in a\nseparate struct. Again, we need to forward-declare it.\n\n^code forward-declare-obj-string (1 before, 2 after)\n\nThe definition lives alongside Obj.\n\n^code obj-string (1 before, 2 after)\n\nA string object contains an array of characters. Those are stored in a separate,\nheap-allocated array so that we set aside only as much room as needed for each\nstring. We also store the number of bytes in the array. This isn't strictly\nnecessary but lets us tell how much memory is allocated for the string without\nwalking the character array to find the null terminator.\n\nBecause ObjString is an Obj, it also needs the state all Objs share. It\naccomplishes that by having its first field be an Obj. C specifies that struct\nfields are arranged in memory in the order that they are declared. Also, when\nyou nest structs, the inner struct's fields are expanded right in place. So the\nmemory for Obj and for ObjString looks like this:\n\n<img src=\"image/strings/obj.png\" alt=\"The memory layout for the fields in Obj and ObjString.\" />\n\nNote how the first bytes of ObjString exactly line up with Obj. This is not a\ncoincidence -- C <span name=\"spec\">mandates</span> it. This is designed to\nenable a clever pattern: You can take a pointer to a struct and safely convert\nit to a pointer to its first field and back.\n\n<aside name=\"spec\">\n\nThe key part of the spec is:\n\n> &sect; 6.7.2.1 13\n>\n> Within a structure object, the non-bit-field members and the units in which\n> bit-fields reside have addresses that increase in the order in which they\n> are declared. 
A pointer to a structure object, suitably converted, points to\n> its initial member (or if that member is a bit-field, then to the unit in\n> which it resides), and vice versa. There may be unnamed padding within a\n> structure object, but not at its beginning.\n\n</aside>\n\nGiven an `ObjString*`, you can safely cast it to `Obj*` and then access the\n`type` field from it. Every ObjString \"is\" an Obj in the OOP sense of \"is\". When\nwe later add other object types, each struct will have an Obj as its first\nfield. Any code that wants to work with all objects can treat them as base\n`Obj*` and ignore any other fields that may happen to follow.\n\nYou can go in the other direction too. Given an `Obj*`, you can \"downcast\" it to\nan `ObjString*`. Of course, you need to ensure that the `Obj*` pointer you have\ndoes point to the `obj` field of an actual ObjString. Otherwise, you are\nunsafely reinterpreting random bits of memory. To detect that such a cast is\nsafe, we add another macro.\n\n^code is-string (1 before, 2 after)\n\nIt takes a Value, not a raw `Obj*` because most code in the VM works with\nValues. It relies on this inline function:\n\n^code is-obj-type (2 before, 2 after)\n\nPop quiz: Why not just put the body of this function right in the macro? What's\ndifferent about this one compared to the others? Right, it's because the body\nuses `value` twice. A macro is expanded by inserting the argument *expression*\nevery place the parameter name appears in the body. If a macro uses a parameter\nmore than once, that expression gets evaluated multiple times.\n\nThat's bad if the expression has side effects. If we put the body of\n`isObjType()` into the macro definition and then you did, say,\n\n```c\nIS_STRING(POP())\n```\n\nthen it would pop two values off the stack! 
Using a function fixes that.\n\nAs long as we ensure that we set the type tag correctly whenever we create an\nObj of some type, this macro will tell us when it's safe to cast a value to a\nspecific object type. We can do that using these:\n\n^code as-string (1 before, 2 after)\n\nThese two macros take a Value that is expected to contain a pointer to a valid\nObjString on the heap. The first one returns the `ObjString*` pointer. The\nsecond one steps through that to return the character array itself, since that's\noften what we'll end up needing.\n\n## Strings\n\nOK, our VM can now represent string values. It's time to add strings to the\nlanguage itself. As usual, we begin in the front end. The lexer already\ntokenizes string literals, so it's the parser's turn.\n\n^code table-string (1 before, 1 after)\n\nWhen the parser hits a string token, it calls this parse function:\n\n^code parse-string\n\nThis takes the string's characters <span name=\"escape\">directly</span> from the\nlexeme. 
The `+ 1` and `- 2` parts trim the leading and trailing quotation marks.\nIt then creates a string object, wraps it in a Value, and stuffs it into the\nconstant table.\n\n<aside name=\"escape\">\n\nIf Lox supported string escape sequences like `\\n`, we'd translate those here.\nSince it doesn't, we can take the characters as they are.\n\n</aside>\n\nTo create the string, we use `copyString()`, which is declared in `object.h`.\n\n^code copy-string-h (2 before, 1 after)\n\nThe compiler module needs to include that.\n\n^code compiler-include-object (2 before, 1 after)\n\nOur \"object\" module gets an implementation file where we define the new\nfunction.\n\n^code object-c\n\nFirst, we allocate a new array on the heap, just big enough for the string's\ncharacters and the trailing <span name=\"terminator\">terminator</span>, using\nthis low-level macro that allocates an array with a given element type and\ncount:\n\n^code allocate (2 before, 1 after)\n\nOnce we have the array, we copy over the characters from the lexeme and\nterminate it.\n\n<aside name=\"terminator\" class=\"bottom\">\n\nWe need to terminate the string ourselves because the lexeme points at a range\nof characters inside the monolithic source string and isn't terminated.\n\nSince ObjString stores the length explicitly, we *could* leave the character\narray unterminated, but slapping a terminator on the end costs us only a byte\nand lets us pass the character array to C standard library functions that expect\na terminated string.\n\n</aside>\n\nYou might wonder why the ObjString can't just point back to the original\ncharacters in the source string. Some ObjStrings will be created dynamically at\nruntime as a result of string operations like concatenation. 
Those strings\nobviously need to dynamically allocate memory for the characters, which means\nthe string needs to *free* that memory when it's no longer needed.\n\nIf we had an ObjString for a string literal, and tried to free its character\narray that pointed into the original source code string, bad things would\nhappen. So, for literals, we preemptively copy the characters over to the heap.\nThis way, every ObjString reliably owns its character array and can free it.\n\nThe real work of creating a string object happens in this function:\n\n^code allocate-string (2 before)\n\nIt creates a new ObjString on the heap and then initializes its fields. It's\nsort of like a constructor in an OOP language. As such, it first calls the \"base\nclass\" constructor to initialize the Obj state, using a new macro.\n\n^code allocate-obj (1 before, 2 after)\n\n<span name=\"factored\">Like</span> the previous macro, this exists mainly to\navoid the need to redundantly cast a `void*` back to the desired type. The\nactual functionality is here:\n\n<aside name=\"factored\">\n\nI admit this chapter has a sea of helper functions and macros to wade through. I\ntry to keep the code nicely factored, but that leads to a scattering of tiny\nfunctions. They will pay off when we reuse them later.\n\n</aside>\n\n^code allocate-object (2 before, 2 after)\n\nIt allocates an object of the given size on the heap. Note that the size is\n*not* just the size of Obj itself. The caller passes in the number of bytes so\nthat there is room for the extra payload fields needed by the specific object\ntype being created.\n\nThen it initializes the Obj state -- right now, that's just the type tag. This\nfunction returns to `allocateString()`, which finishes initializing the ObjString\nfields. 
<span name=\"viola\">*Voilà*</span>, we can compile and execute string\nliterals.\n\n<aside name=\"viola\">\n\n<img src=\"image/strings/viola.png\" class=\"above\" alt=\"A viola.\" />\n\nDon't get \"voilà\" confused with \"viola\". One means \"there it is\" and the other\nis a string instrument, the middle child between a violin and a cello. Yes, I\ndid spend two hours drawing a viola just to mention that.\n\n</aside>\n\n## Operations on Strings\n\nOur fancy strings are there, but they don't do much of anything yet. A good\nfirst step is to make the existing print code not barf on the new value type.\n\n^code call-print-object (1 before, 1 after)\n\nIf the value is a heap-allocated object, it defers to a helper function over in\nthe \"object\" module.\n\n^code print-object-h (1 before, 2 after)\n\nThe implementation looks like this:\n\n^code print-object\n\nWe have only a single object type now, but this function will sprout additional\nswitch cases in later chapters. For string objects, it simply <span\nname=\"term-2\">prints</span> the character array as a C string.\n\n<aside name=\"term-2\">\n\nI told you terminating the string would come in handy.\n\n</aside>\n\nThe equality operators also need to gracefully handle strings. Consider:\n\n```lox\n\"string\" == \"string\"\n```\n\nThese are two separate string literals. The compiler will make two separate\ncalls to `copyString()`, create two distinct ObjString objects and store them as\ntwo constants in the chunk. They are different objects in the heap. But our\nusers (and thus we) expect strings to have value equality. The above expression\nshould evaluate to `true`. That requires a little special support.\n\n^code strings-equal (1 before, 1 after)\n\nIf the two values are both strings, then they are equal if their character\narrays contain the same characters, regardless of whether they are two separate\nobjects or the exact same one. 
This does mean that string equality is slower\nthan equality on other types since it has to walk the whole string. We'll revise\nthat [later][hash], but this gives us the right semantics for now.\n\n[hash]: hash-tables.html\n\nFinally, in order to use `memcmp()` and the new stuff in the \"object\" module, we\nneed a couple of includes. Here:\n\n^code value-include-string (1 before, 2 after)\n\nAnd here:\n\n^code value-include-object (2 before, 1 after)\n\n### Concatenation\n\nFull-grown languages provide lots of operations for working with strings --\naccess to individual characters, the string's length, changing case, splitting,\njoining, searching, etc. When you implement your language, you'll likely want\nall that. But for this book, we keep things *very* minimal.\n\nThe only interesting operation we support on strings is `+`. If you use that\noperator on two string objects, it produces a new string that's a concatenation\nof the two operands. Since Lox is dynamically typed, we can't tell which\nbehavior is needed at compile time because we don't know the types of the\noperands until runtime. Thus, the `OP_ADD` instruction dynamically inspects the\noperands and chooses the right operation.\n\n^code add-strings (1 before, 1 after)\n\nIf both operands are strings, it concatenates. If they're both numbers, it adds\nthem. Any other <span name=\"convert\">combination</span> of operand types is a\nruntime error.\n\n<aside name=\"convert\" class=\"bottom\">\n\nThis is more conservative than most languages. In other languages, if one\noperand is a string, the other can be any type and it will be implicitly\nconverted to a string before concatenating the two.\n\nI think that's a fine feature, but would require writing tedious \"convert to\nstring\" code for each type, so I left it out of Lox.\n\n</aside>\n\nTo concatenate strings, we define a new function.\n\n^code concatenate\n\nIt's pretty verbose, as C code that works with strings tends to be. 
First, we\ncalculate the length of the result string based on the lengths of the operands.\nWe allocate a character array for the result and then copy the two halves in. As\nalways, we carefully ensure the string is terminated.\n\nIn order to call `memcpy()`, the VM needs an include.\n\n^code vm-include-string (1 before, 2 after)\n\nFinally, we produce an ObjString to contain those characters. This time we use a\nnew function, `takeString()`.\n\n^code take-string-h (2 before, 1 after)\n\nThe implementation looks like this:\n\n^code take-string\n\nThe previous `copyString()` function assumes it *cannot* take ownership of the\ncharacters you pass in. Instead, it conservatively creates a copy of the\ncharacters on the heap that the ObjString can own. That's the right thing for\nstring literals where the passed-in characters are in the middle of the source\nstring.\n\nBut, for concatenation, we've already dynamically allocated a character array on\nthe heap. Making another copy of that would be redundant (and would mean\n`concatenate()` has to remember to free its copy). 
Instead, this function claims\nownership of the string you give it.\n\nAs usual, stitching this functionality together requires a couple of includes.\n\n^code vm-include-object-memory (1 before, 1 after)\n\n## Freeing Objects\n\nBehold this innocuous-seeming expression:\n\n```lox\n\"st\" + \"ri\" + \"ng\"\n```\n\nWhen the compiler chews through this, it allocates an ObjString for each of\nthose three string literals and stores them in the chunk's constant table and\ngenerates this <span name=\"stack\">bytecode</span>:\n\n<aside name=\"stack\">\n\nHere's what the stack looks like after each instruction:\n\n<img src=\"image/strings/stack.png\" alt=\"The state of the stack at each instruction.\" />\n\n</aside>\n\n```text\n0000    OP_CONSTANT         0 \"st\"\n0002    OP_CONSTANT         1 \"ri\"\n0004    OP_ADD\n0005    OP_CONSTANT         2 \"ng\"\n0007    OP_ADD\n0008    OP_RETURN\n```\n\nThe first two instructions push `\"st\"` and `\"ri\"` onto the stack. Then the\n`OP_ADD` pops those and concatenates them. That dynamically allocates a new\n`\"stri\"` string on the heap. The VM pushes that and then pushes the `\"ng\"`\nconstant. The last `OP_ADD` pops `\"stri\"` and `\"ng\"`, concatenates them, and\npushes the result: `\"string\"`. Great, that's what we expect.\n\nBut, wait. What happened to that `\"stri\"` string? We dynamically allocated it,\nthen the VM discarded it after concatenating it with `\"ng\"`. We popped it from\nthe stack and no longer have a reference to it, but we never freed its memory.\nWe've got ourselves a classic memory leak.\n\nOf course, it's perfectly fine for the *Lox program* to forget about\nintermediate strings and not worry about freeing them. Lox automatically manages\nmemory on the user's behalf. The responsibility to manage memory doesn't\n*disappear*. 
Instead, it falls on our shoulders as VM implementers.\n\nThe full <span name=\"borrowed\">solution</span> is a [garbage collector][gc] that\nreclaims unused memory while the program is running. We've got some other stuff\nto get in place before we're ready to tackle that project. Until then, we are\nliving on borrowed time. The longer we wait to add the collector, the harder it\nis to do.\n\n<aside name=\"borrowed\">\n\nI've seen a number of people implement large swathes of their language before\ntrying to start on the GC. For the kind of toy programs you typically run while\na language is being developed, you actually don't run out of memory before\nreaching the end of the program, so this gets you surprisingly far.\n\nBut that underestimates how *hard* it is to add a garbage collector later. The\ncollector *must* ensure it can find every bit of memory that *is* still being\nused so that it doesn't collect live data. There are hundreds of places a\nlanguage implementation can squirrel away a reference to some object. If you\ndon't find all of them, you get nightmarish bugs.\n\nI've seen language implementations die because it was too hard to get the GC in\nlater. If your language needs GC, get it working as soon as you can. It's a\ncrosscutting concern that touches the entire codebase.\n\n</aside>\n\nToday, we should at least do the bare minimum: avoid *leaking* memory by making\nsure the VM can still find every allocated object even if the Lox program itself\nno longer references them. There are many sophisticated techniques that advanced\nmemory managers use to allocate and track memory for objects. We're going to\ntake the simplest practical approach.\n\nWe'll create a linked list that stores every Obj. 
The VM can traverse that\nlist to find every single object that has been allocated on the heap, whether or\nnot the user's program or the VM's stack still has a reference to it.\n\nWe could define a separate linked list node struct but then we'd have to\nallocate those too. Instead, we'll use an **intrusive list** -- the Obj struct\nitself will be the linked list node. Each Obj gets a pointer to the next Obj in\nthe chain.\n\n^code next-field (2 before, 1 after)\n\nThe VM stores a pointer to the head of the list.\n\n^code objects-root (1 before, 1 after)\n\nWhen we first initialize the VM, there are no allocated objects.\n\n^code init-objects-root (1 before, 1 after)\n\nEvery time we allocate an Obj, we insert it in the list.\n\n^code add-to-list (1 before, 1 after)\n\nSince this is a singly linked list, the easiest place to insert it is as the\nhead. That way, we don't need to also store a pointer to the tail and keep it\nupdated.\n\nThe \"object\" module is directly using the global `vm` variable from the \"vm\"\nmodule, so we need to expose that externally.\n\n^code extern-vm (2 before, 1 after)\n\nEventually, the garbage collector will free memory while the VM is still\nrunning. But, even then, there will usually be unused objects still lingering in\nmemory when the user's program completes. The VM should free those too.\n\nThere's no sophisticated logic for that. Once the program is done, we can free\n*every* object. We can and should implement that now.\n\n^code call-free-objects (1 before, 1 after)\n\nThat empty function we defined [way back when][vm] finally does something! It\ncalls this:\n\n[vm]: a-virtual-machine.html#an-instruction-execution-machine\n\n^code free-objects-h (1 before, 2 after)\n\nHere's how we free the objects:\n\n^code free-objects\n\nThis is a CS 101 textbook implementation of walking a linked list and freeing\nits nodes. For each node, we call:\n\n^code free-object\n\nWe aren't only freeing the Obj itself. 
Since some object types also allocate\nother memory that they own, we also need a little type-specific code to handle\neach object type's special needs. Here, that means we free the character array\nand then free the ObjString. Those both use one last memory management macro.\n\n^code free (1 before, 2 after)\n\nIt's a tiny <span name=\"free\">wrapper</span> around `reallocate()` that\n\"resizes\" an allocation down to zero bytes.\n\n<aside name=\"free\">\n\nUsing `reallocate()` to free memory might seem pointless. Why not just call\n`free()`? Later, this will help the VM track how much memory is still being\nused. If all allocation and freeing goes through `reallocate()`, it's easy to\nkeep a running count of the number of bytes of allocated memory.\n\n</aside>\n\nAs usual, we need an include to wire everything together.\n\n^code memory-include-object (1 before, 2 after)\n\nThen in the implementation file:\n\n^code memory-include-vm (1 before, 2 after)\n\nWith this, our VM no longer leaks memory. Like a good C program, it cleans up\nits mess before exiting. But it doesn't free any objects while the VM is\nrunning. Later, when it's possible to write longer-running Lox programs, the VM\nwill eat more and more memory as it goes, not relinquishing a single byte until\nthe entire program is done.\n\nWe won't address that until we've added [a real garbage collector][gc], but this\nis a big step. We now have the infrastructure to support a variety of different\nkinds of dynamically allocated objects. And we've used that to add strings to\nclox, one of the most used types in most programming languages. Strings in turn\nenable us to build another fundamental data type, especially in dynamic\nlanguages: the venerable [hash table][]. But that's for the next chapter...\n\n[hash table]: hash-tables.html\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  Each string requires two separate dynamic allocations -- one for the\n    ObjString and a second for the character array. 
Accessing the characters\n    from a value requires two pointer indirections, which can be bad for\n    performance. A more efficient solution relies on a technique called\n    **[flexible array members][]**. Use that to store the ObjString and its\n    character array in a single contiguous allocation.\n\n2.  When we create the ObjString for each string literal, we copy the characters\n    onto the heap. That way, when the string is later freed, we know it is safe\n    to free the characters too.\n\n    This is a simpler approach but wastes some memory, which might be a problem\n    on very constrained devices. Instead, we could keep track of which\n    ObjStrings own their character array and which are \"constant strings\" that\n    just point back to the original source string or some other non-freeable\n    location. Add support for this.\n\n3.  If Lox was your language, what would you have it do when a user tries to use\n    `+` with one string operand and the other some other type? Justify your\n    choice. What do other languages do?\n\n[flexible array members]: https://en.wikipedia.org/wiki/Flexible_array_member\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: String Encoding\n\nIn this book, I try not to shy away from the gnarly problems you'll run into in\na real language implementation. We might not always use the most *sophisticated*\nsolution -- it's an intro book after all -- but I don't think it's honest to\npretend the problem doesn't exist at all. However, I did skirt around one really\nnasty conundrum: deciding how to represent strings.\n\nThere are two facets to a string encoding:\n\n*   **What is a single \"character\" in a string?** How many different values are\n    there and what do they represent? The first widely adopted standard answer\n    to this was [ASCII][]. It gave you 128 different character values and\n    specified what they were. It was great... if you only ever cared about\n    English. 
While it has weird, mostly forgotten characters like \"record\n    separator\" and \"synchronous idle\", it doesn't have a single umlaut, acute,\n    or grave. It can't represent \"jalapeño\", \"naïve\", <span\n    name=\"gruyere\">\"Gruyère\"</span>, or \"Mötley Crüe\".\n\n    <aside name=\"gruyere\">\n\n    It goes without saying that a language that does not let one discuss Gruyère\n    or Mötley Crüe is a language not worth using.\n\n    </aside>\n\n    Next came [Unicode][]. Initially, it supported 16,384 different characters\n    (**code points**), which fit nicely in 16 bits with a couple of bits to\n    spare. Later that grew and grew, and now there are well over 100,000\n    different code points including such vital instruments of human\n    communication as 💩 (Unicode Character 'PILE OF POO', `U+1F4A9`).\n\n    Even that long list of code points is not enough to represent each possible\n    visible glyph a language might support. To handle that, Unicode also has\n    **combining characters** that modify a preceding code point. For example,\n    \"a\" followed by the combining character \"¨\" gives you \"ä\". (To make things\n    more confusing Unicode *also* has a single code point that looks like \"ä\".)\n\n    If a user accesses the fourth \"character\" in \"naïve\", do they expect to get\n    back \"v\" or &ldquo;¨&rdquo;? The former means they are thinking of each code\n    point and its combining character as a single unit -- what Unicode calls an\n    **extended grapheme cluster** -- the latter means they are thinking in\n    individual code points. Which do your users expect?\n\n*   **How is a single unit represented in memory?** Most systems using ASCII\n    gave a single byte to each character and left the high bit unused. Unicode\n    has a handful of common encodings. UTF-16 packs most code points into 16\n    bits. That was great when every code point fit in that size. 
When that\n    overflowed, they added *surrogate pairs* that use multiple 16-bit code units\n    to represent a single code point. UTF-32 is the next evolution of\n    UTF-16 -- it gives a full 32 bits to each and every code point.\n\n    UTF-8 is more complex than either of those. It uses a variable number of\n    bytes to encode a code point. Lower-valued code points fit in fewer bytes.\n    Since each character may occupy a different number of bytes, you can't\n    directly index into the string to find a specific code point. If you want,\n    say, the 10th code point, you don't know how many bytes into the string that\n    is without walking and decoding all of the preceding ones.\n\n[ascii]: https://en.wikipedia.org/wiki/ASCII\n[unicode]: https://en.wikipedia.org/wiki/Unicode\n\nChoosing a character representation and encoding involves fundamental\ntrade-offs. Like many things in engineering, there's no <span\nname=\"python\">perfect</span> solution:\n\n<aside name=\"python\">\n\nAn example of how difficult this problem is comes from Python. The achingly long\ntransition from Python 2 to 3 is painful mostly because of its changes around\nstring encoding.\n\n</aside>\n\n*   ASCII is memory efficient and fast, but it kicks non-Latin languages to the\n    side.\n\n*   UTF-32 is fast and supports the whole Unicode range, but wastes a lot of\n    memory given that most code points do tend to be in the lower range of\n    values, where a full 32 bits aren't needed.\n\n*   UTF-8 is memory efficient and supports the whole Unicode range, but its\n    variable-length encoding makes it slow to access arbitrary code points.\n\n*   UTF-16 is worse than all of them -- an ugly consequence of Unicode\n    outgrowing its earlier 16-bit range. It's less memory efficient than UTF-8\n    but is still a variable-length encoding thanks to surrogate pairs. Avoid it\n    if you can. 
Alas, if your language needs to run on or interoperate with the\n    browser, the JVM, or the CLR, you might be stuck with it, since those all\n    use UTF-16 for their strings and you don't want to have to convert every\n    time you pass a string to the underlying system.\n\nOne option is to take the maximal approach and do the \"rightest\" thing. Support\nall the Unicode code points. Internally, select an encoding for each string\nbased on its contents -- use ASCII if every code point fits in a byte, UTF-16 if\nthere are no surrogate pairs, etc. Provide APIs to let users iterate over both\ncode points and extended grapheme clusters.\n\nThis covers all your bases but is really complex. It's a lot to implement,\ndebug, and optimize. When serializing strings or interoperating with other\nsystems, you have to deal with all of the encodings. Users need to understand\nthe two indexing APIs and know which to use when. This is the approach that\nnewer, big languages tend to take -- like Raku and Swift.\n\nA simpler compromise is to always encode using UTF-8 and only expose an API that\nworks with code points. For users that want to work with grapheme clusters, let\nthem use a third-party library for that. This is less Latin-centric than ASCII\nbut not much more complex. You lose fast direct indexing by code point, but you\ncan usually live without that or afford to make it *O(n)* instead of *O(1)*.\n\nIf I were designing a big workhorse language for people writing large\napplications, I'd probably go with the maximal approach. For my little embedded\nscripting language [Wren][], I went with UTF-8 and code points.\n\n[wren]: http://wren.io\n\n</div>\n"
  },
  {
    "path": "book/superclasses.md",
    "content": "> You can choose your friends but you sho' can't choose your family, an' they're\n> still kin to you no matter whether you acknowledge &rsquo;em or not, and it\n> makes you look right silly when you don't.\n>\n> <cite>Harper Lee, <em>To Kill a Mockingbird</em></cite>\n\nThis is the very last chapter where we add new functionality to our VM. We've\npacked almost the entire Lox language in there already. All that remains is\ninheriting methods and calling superclass methods. We have [another\nchapter][optimization] after this one, but it introduces no new behavior. It\n<span name=\"faster\">only</span> makes existing stuff faster. Make it to the end\nof this one, and you'll have a complete Lox implementation.\n\n<aside name=\"faster\">\n\nThat \"only\" should not imply that making stuff faster isn't important! After\nall, the whole purpose of our entire second virtual machine is better\nperformance over jlox. You could argue that *all* of the past fifteen chapters\nare \"optimization\".\n\n</aside>\n\n[optimization]: optimization.html\n\nSome of the material in this chapter will remind you of jlox. The way we resolve\nsuper calls is pretty much the same, though viewed through clox's more complex\nmechanism for storing state on the stack. But we have an entirely different,\nmuch faster, way of handling inherited method calls this time around.\n\n## Inheriting Methods\n\nWe'll kick things off with method inheritance since it's the simpler piece. To\nrefresh your memory, Lox inheritance syntax looks like this:\n\n```lox\nclass Doughnut {\n  cook() {\n    print \"Dunk in the fryer.\";\n  }\n}\n\nclass Cruller < Doughnut {\n  finish() {\n    print \"Glaze with icing.\";\n  }\n}\n```\n\nHere, the Cruller class inherits from Doughnut and thus, instances of Cruller\ninherit the `cook()` method. I don't know why I'm belaboring this. You know how\ninheritance works. 
Let's start compiling the new syntax.\n\n^code compile-superclass (2 before, 1 after)\n\nAfter we compile the class name, if the next token is a `<`, then we found a\nsuperclass clause. We consume the superclass's identifier token, then call\n`variable()`. That function takes the previously consumed token, treats it as a\nvariable reference, and emits code to load the variable's value. In other words,\nit looks up the superclass by name and pushes it onto the stack.\n\nAfter that, we call `namedVariable()` to load the subclass doing the inheriting\nonto the stack, followed by an `OP_INHERIT` instruction. That instruction\nwires up the superclass to the new subclass. In the last chapter, we defined an\n`OP_METHOD` instruction to mutate an existing class object by adding a method to\nits method table. This is similar -- the `OP_INHERIT` instruction takes an\nexisting class and applies the effect of inheritance to it.\n\nIn the previous example, when the compiler works through this bit of syntax:\n\n```lox\nclass Cruller < Doughnut {\n```\n\nThe result is this bytecode:\n\n<img src=\"image/superclasses/inherit-stack.png\" alt=\"The series of bytecode instructions for a Cruller class inheriting from Doughnut.\" />\n\nBefore we implement the new `OP_INHERIT` instruction, we have an edge case to\ndetect.\n\n^code inherit-self (1 before, 1 after)\n\n<span name=\"cycle\">A</span> class cannot be its own superclass. Unless you have\naccess to a deranged nuclear physicist and a very heavily modified DeLorean, you\ncannot inherit from yourself.\n\n<aside name=\"cycle\">\n\nInterestingly, with the way we implement method inheritance, I don't think\nallowing cycles would actually cause any problems in clox. It wouldn't do\nanything *useful*, but I don't think it would cause a crash or infinite loop.\n\n</aside>\n\n### Executing inheritance\n\nNow onto the new instruction.\n\n^code inherit-op (1 before, 1 after)\n\nThere are no operands to worry about. 
The two values we need -- superclass and\nsubclass -- are both found on the stack. That means disassembling is easy.\n\n^code disassemble-inherit (1 before, 1 after)\n\nThe interpreter is where the action happens.\n\n^code interpret-inherit (1 before, 1 after)\n\nFrom the top of the stack down, we have the subclass then the superclass. We\ngrab both of those and then do the inherit-y bit. This is where clox takes a\ndifferent path than jlox. In our first interpreter, each subclass stored a\nreference to its superclass. On method access, if we didn't find the method in\nthe subclass's method table, we recursed through the inheritance chain looking\nat each ancestor's method table until we found it.\n\nFor example, calling `cook()` on an instance of Cruller sends jlox on this\njourney:\n\n<img src=\"image/superclasses/jlox-resolve.png\" alt=\"Resolving a call to cook() in an instance of Cruller means walking the superclass chain.\" />\n\nThat's a lot of work to perform during method *invocation* time. It's slow, and\nworse, the farther an inherited method is up the ancestor chain, the slower it\ngets. Not a great performance story.\n\nThe new approach is much faster. When the subclass is declared, we copy all of\nthe inherited class's methods down into the subclass's own method table. Later,\nwhen *calling* a method, any method inherited from a superclass will be found\nright in the subclass's own method table. There is no extra runtime work needed\nfor inheritance at all. By the time the class is declared, the work is done.\nThis means inherited method calls are exactly as fast as normal method calls --\na <span name=\"two\">single</span> hash table lookup.\n\n<img src=\"image/superclasses/clox-resolve.png\" alt=\"Resolving a call to cook() in an instance of Cruller which has the method in its own method table.\" />\n\n<aside name=\"two\">\n\nWell, two hash table lookups, I guess. 
Because first we have to make sure a\nfield on the instance doesn't shadow the method.\n\n</aside>\n\nI've sometimes heard this technique called \"copy-down inheritance\". It's simple\nand fast, but, like most optimizations, you get to use it only under certain\nconstraints. It works in Lox because Lox classes are *closed*. Once a class\ndeclaration is finished executing, the set of methods for that class can never\nchange.\n\nIn languages like Ruby, Python, and JavaScript, it's possible to <span\nname=\"monkey\">crack</span> open an existing class and jam some new methods into\nit or even remove them. That would break our optimization because if those\nmodifications happened to a superclass *after* the subclass declaration\nexecuted, the subclass would not pick up those changes. That breaks a user's\nexpectation that inheritance always reflects the current state of the\nsuperclass.\n\n<aside name=\"monkey\">\n\nAs you can imagine, changing the set of methods a class defines imperatively at\nruntime can make it hard to reason about a program. It is a very powerful tool,\nbut also a dangerous tool.\n\nThose who find this tool maybe a little *too* dangerous gave it the unbecoming\nname \"monkey patching\", or the even less decorous \"duck punching\".\n\n<img src=\"image/superclasses/monkey.png\" alt=\"A monkey with an eyepatch, naturally.\" />\n\n</aside>\n\nFortunately for us (but not for users who like the feature, I guess), Lox\ndoesn't let you patch monkeys or punch ducks, so we can safely apply this\noptimization.\n\nWhat about method overrides? Won't copying the superclass's methods into the\nsubclass's method table clash with the subclass's own methods? Fortunately, no.\nWe emit the `OP_INHERIT` after the `OP_CLASS` instruction that creates the\nsubclass but before any method declarations and `OP_METHOD` instructions have\nbeen compiled. At the point that we copy the superclass's methods down, the\nsubclass's method table is empty. 
Any methods the subclass overrides will\noverwrite those inherited entries in the table.\n\n### Invalid superclasses\n\nOur implementation is simple and fast, which is just the way I like my VM code.\nBut it's not robust. Nothing prevents a user from inheriting from an object that\nisn't a class at all:\n\n```lox\nvar NotClass = \"So not a class\";\nclass OhNo < NotClass {}\n```\n\nObviously, no self-respecting programmer would write that, but we have to guard\nagainst potential Lox users who have no self respect. A simple runtime check\nfixes that.\n\n^code inherit-non-class (1 before, 1 after)\n\nIf the value we loaded from the identifier in the superclass clause isn't an\nObjClass, we report a runtime error to let the user know what we think of them\nand their code.\n\n## Storing Superclasses\n\nDid you notice that when we added method inheritance, we didn't actually add any\nreference from a subclass to its superclass? After we copy the inherited methods\nover, we forget the superclass entirely. We don't need to keep a handle on the\nsuperclass, so we don't.\n\nThat won't be sufficient to support super calls. Since a subclass <span\nname=\"may\">may</span> override the superclass method, we need to be able to get\nour hands on superclass method tables. Before we get to that mechanism, I want \nto refresh your memory on how super calls are statically resolved.\n\n<aside name=\"may\">\n\n\"May\" might not be a strong enough word. Presumably the method *has* been\noverridden. 
Otherwise, why are you bothering to use `super` instead of just\ncalling it directly?\n\n</aside>\n\nBack in the halcyon days of jlox, I showed you [this tricky example][example] to\nexplain the way super calls are dispatched:\n\n[example]: inheritance.html#semantics\n\n```lox\nclass A {\n  method() {\n    print \"A method\";\n  }\n}\n\nclass B < A {\n  method() {\n    print \"B method\";\n  }\n\n  test() {\n    super.method();\n  }\n}\n\nclass C < B {}\n\nC().test();\n```\n\nInside the body of the `test()` method, `this` is an instance of C. If super\ncalls were resolved relative to the superclass of the *receiver*, then we would\nlook in C's superclass, B. But super calls are resolved relative to the\nsuperclass of the *surrounding class where the super call occurs*. In this case,\nwe are in B's `test()` method, so the superclass is A, and the program should\nprint \"A method\".\n\nThis means that super calls are not resolved dynamically based on the runtime\ninstance. The superclass used to look up the method is a static -- practically\nlexical -- property of where the call occurs. When we added inheritance to jlox,\nwe took advantage of that static aspect by storing the superclass in the same\nEnvironment structure we used for all lexical scopes. Almost as if the\ninterpreter saw the above program like this:\n\n```lox\nclass A {\n  method() {\n    print \"A method\";\n  }\n}\n\nvar Bs_super = A;\nclass B < A {\n  method() {\n    print \"B method\";\n  }\n\n  test() {\n    runtimeSuperCall(Bs_super, \"method\");\n  }\n}\n\nvar Cs_super = B;\nclass C < B {}\n\nC().test();\n```\n\nEach subclass has a hidden variable storing a reference to its superclass.\nWhenever we need to perform a super call, we access the superclass from that\nvariable and tell the runtime to start looking for methods there.\n\nWe'll take the same path with clox. 
The difference is that instead of jlox's\nheap-allocated Environment class, we have the bytecode VM's value stack and\nupvalue system. The machinery is a little different, but the overall effect is\nthe same.\n\n### A superclass local variable\n\nOur compiler already emits code to load the superclass onto the stack. Instead\nof leaving that slot as a temporary, we create a new scope and make it a local\nvariable.\n\n^code superclass-variable (2 before, 2 after)\n\nCreating a new lexical scope ensures that if we declare two classes in the same\nscope, each has a different local slot to store its superclass. Since we always\nname this variable \"super\", if we didn't make a scope for each subclass, the\nvariables would collide.\n\nWe name the variable \"super\" for the same reason we use \"this\" as the name of\nthe hidden local variable that `this` expressions resolve to: \"super\" is a\nreserved word, which guarantees the compiler's hidden variable won't collide\nwith a user-defined one.\n\nThe difference is that when compiling `this` expressions, we conveniently have a\ntoken sitting around whose lexeme is \"this\". We aren't so lucky here. Instead,\nwe add a little helper function to create a synthetic token for the given <span\nname=\"constant\">constant</span> string.\n\n^code synthetic-token\n\n<aside name=\"constant\" class=\"bottom\">\n\nI say \"constant string\" because tokens don't do any memory management of their\nlexeme. If we tried to use a heap-allocated string for this, we'd end up leaking\nmemory because it never gets freed. But the memory for C string literals lives\nin the executable's constant data section and never needs to be freed, so we're\nfine.\n\n</aside>\n\nSince we opened a local scope for the superclass variable, we need to close it.\n\n^code end-superclass-scope (1 before, 2 after)\n\nWe pop the scope and discard the \"super\" variable after compiling the class body\nand its methods. 
That way, the variable is accessible in all of the methods of\nthe subclass. It's a somewhat pointless optimization, but we create the scope\nonly if there *is* a superclass clause. Thus we need to close the scope only if\nthere is one.\n\nTo track that, we could declare a little local variable in `classDeclaration()`.\nBut soon, other functions in the compiler will need to know whether the\nsurrounding class is a subclass or not. So we may as well give our future selves\na hand and store this fact as a field in the ClassCompiler now.\n\n^code has-superclass (2 before, 1 after)\n\nWhen we first initialize a ClassCompiler, we assume it is not a subclass.\n\n^code init-has-superclass (1 before, 1 after)\n\nThen, if we see a superclass clause, we know we are compiling a subclass.\n\n^code set-has-superclass (1 before, 1 after)\n\nThis machinery gives us a mechanism at runtime to access the superclass object\nof the surrounding subclass from within any of the subclass's methods -- simply\nemit code to load the variable named \"super\". That variable is a local outside\nof the method body, but our existing upvalue support enables the VM to capture\nthat local inside the body of the method or even in functions nested inside that\nmethod.\n\n## Super Calls\n\nWith that runtime support in place, we are ready to implement super calls. As\nusual, we go front to back, starting with the new syntax. A super call <span\nname=\"last\">begins</span>, naturally enough, with the `super` keyword.\n\n<aside name=\"last\">\n\nThis is it, friend. The very last entry you'll add to the parsing table.\n\n</aside>\n\n^code table-super (1 before, 1 after)\n\nWhen the expression parser lands on a `super` token, control jumps to a new\nparsing function which starts off like so:\n\n^code super\n\nThis is pretty different from how we compiled `this` expressions. 
Unlike `this`,\na `super` <span name=\"token\">token</span> is not a standalone expression.\nInstead, the dot and method name following it are inseparable parts of the\nsyntax. However, the parenthesized argument list is separate. As with normal\nmethod access, Lox supports getting a reference to a superclass method as a\nclosure without invoking it:\n\n<aside name=\"token\">\n\nHypothetical question: If a bare `super` token *was* an expression, what kind of\nobject would it evaluate to?\n\n</aside>\n\n```lox\nclass A {\n  method() {\n    print \"A\";\n  }\n}\n\nclass B < A {\n  method() {\n    var closure = super.method;\n    closure(); // Prints \"A\".\n  }\n}\n```\n\nIn other words, Lox doesn't really have super *call* expressions, it has super\n*access* expressions, which you can choose to immediately invoke if you want. So\nwhen the compiler hits a `super` token, we consume the subsequent `.` token and\nthen look for a method name. Methods are looked up dynamically, so we use\n`identifierConstant()` to take the lexeme of the method name token and store it\nin the constant table just like we do for property access expressions.\n\nHere is what the compiler does after consuming those tokens:\n\n^code super-get (1 before, 1 after)\n\nIn order to access a *superclass method* on *the current instance*, the runtime\nneeds both the receiver *and* the superclass of the surrounding method's class.\nThe first `namedVariable()` call generates code to look up the current receiver\nstored in the hidden variable \"this\" and push it onto the stack. The second\n`namedVariable()` call emits code to look up the superclass from its \"super\"\nvariable and push that on top.\n\nFinally, we emit a new `OP_GET_SUPER` instruction with an operand for the\nconstant table index of the method name. That's a lot to hold in your head. 
To\nmake it tangible, consider this example program:\n\n```lox\nclass Doughnut {\n  cook() {\n    print \"Dunk in the fryer.\";\n    this.finish(\"sprinkles\");\n  }\n\n  finish(ingredient) {\n    print \"Finish with \" + ingredient;\n  }\n}\n\nclass Cruller < Doughnut {\n  finish(ingredient) {\n    // No sprinkles, always icing.\n    super.finish(\"icing\");\n  }\n}\n```\n\nThe bytecode emitted for the `super.finish(\"icing\")` expression looks and works\nlike this:\n\n<img src=\"image/superclasses/super-instructions.png\" alt=\"The series of bytecode instructions for calling super.finish().\" />\n\nThe first three instructions give the runtime access to the three pieces of\ninformation it needs to perform the super access:\n\n1.  The first instruction loads **the instance** onto the stack.\n\n2.  The second instruction loads **the superclass where the method is\n    resolved**.\n\n3.  Then the new `OP_GET_SUPER` instruction encodes **the name of the method to\n    access** as an operand.\n\nThe remaining instructions are the normal bytecode for evaluating an argument\nlist and calling a function.\n\nWe're almost ready to implement the new `OP_GET_SUPER` instruction in the\ninterpreter. But before we do, the compiler has some errors it is responsible\nfor reporting.\n\n^code super-errors (1 before, 1 after)\n\nA super call is meaningful only inside the body of a method (or in a function\nnested inside a method), and only inside the method of a class that has a\nsuperclass. We detect both of these cases using the value of `currentClass`. If\nthat's `NULL` or points to a class with no superclass, we report those errors.\n\n### Executing super accesses\n\nAssuming the user didn't put a `super` expression where it's not allowed, their\ncode passes from the compiler over to the runtime. 
We've got ourselves a new\ninstruction.\n\n^code get-super-op (1 before, 1 after)\n\nWe disassemble it like other opcodes that take a constant table index operand.\n\n^code disassemble-get-super (1 before, 1 after)\n\nYou might anticipate something harder, but interpreting the new instruction is\nsimilar to executing a normal property access.\n\n^code interpret-get-super (1 before, 1 after)\n\nAs with properties, we read the method name from the\nconstant table. Then we pass that to `bindMethod()`, which looks up the method in\nthe given class's method table and creates an ObjBoundMethod to bundle the\nresulting closure to the current instance.\n\nThe key <span name=\"field\">difference</span> is *which* class we pass to\n`bindMethod()`. With a normal property access, we use the ObjInstance's own\nclass, which gives us the dynamic dispatch we want. For a super call, we don't\nuse the instance's class. Instead, we use the statically resolved superclass of\nthe containing class, which the compiler has conveniently ensured is sitting on\ntop of the stack waiting for us.\n\nWe pop that superclass and pass it to `bindMethod()`, which correctly skips over\nany overriding methods in any of the subclasses between that superclass and the\ninstance's own class. It also correctly includes any methods inherited by the\nsuperclass from any of *its* superclasses.\n\nThe rest of the behavior is the same. Popping the superclass leaves the instance\nat the top of the stack. When `bindMethod()` succeeds, it pops the instance and\npushes the new bound method. Otherwise, it reports a runtime error and returns\n`false`. In that case, we abort the interpreter.\n\n<aside name=\"field\">\n\nAnother difference compared to `OP_GET_PROPERTY` is that we don't try to look\nfor a shadowing field first. 
Fields are not inherited, so `super` expressions\nalways resolve to methods.\n\nIf Lox were a prototype-based language that used *delegation* instead of\n*inheritance*, then instead of one *class* inheriting from another *class*,\ninstances would inherit from (\"delegate to\") other instances. In that case,\nfields *could* be inherited, and we would need to check for them here.\n\n</aside>\n\n### Faster super calls\n\nWe have superclass method accesses working now. And since the returned object is\nan ObjBoundMethod that you can then invoke, we've got super *calls* working too.\nJust like last chapter, we've reached a point where our VM has the complete,\ncorrect semantics.\n\nBut, also like last chapter, it's pretty slow. Again, we're heap allocating an\nObjBoundMethod for each super call even though most of the time the very next\ninstruction is an `OP_CALL` that immediately unpacks that bound method, invokes\nit, and then discards it. In fact, this is even more likely to be true for\nsuper calls than for regular method calls. At least with method calls there is\na chance that the user is actually invoking a function stored in a field. With\nsuper calls, you're *always* looking up a method. The only question is whether\nyou invoke it immediately or not.\n\nThe compiler can certainly answer that question for itself if it sees a left\nparenthesis after the superclass method name, so we'll go ahead and perform the\nsame optimization we did for method calls. Take out the two lines of code that\nload the superclass and emit `OP_GET_SUPER`, and replace them with this:\n\n^code super-invoke (1 before, 1 after)\n\nNow before we emit anything, we look for a parenthesized argument list. If we\nfind one, we compile that. Then we load the superclass. After that, we emit a\nnew `OP_SUPER_INVOKE` instruction. 
This <span\nname=\"superinstruction\">superinstruction</span> combines the behavior of\n`OP_GET_SUPER` and `OP_CALL`, so it takes two operands: the constant table index\nof the method name to look up and the number of arguments to pass to it.\n\n<aside name=\"superinstruction\">\n\nThis is a particularly *super* superinstruction, if you get what I'm saying.\nI... I'm sorry for this terrible joke.\n\n</aside>\n\nOtherwise, if we don't find a `(`, we continue to compile the expression as a\nsuper access like we did before and emit an `OP_GET_SUPER`.\n\nDrifting down the compilation pipeline, our first stop is a new instruction.\n\n^code super-invoke-op (1 before, 1 after)\n\nAnd just past that, its disassembler support.\n\n^code disassemble-super-invoke (1 before, 1 after)\n\nA super invocation instruction has the same set of operands as `OP_INVOKE`, so\nwe reuse the same helper to disassemble it. Finally, the pipeline dumps us into\nthe interpreter.\n\n^code interpret-super-invoke (2 before, 1 after)\n\nThis handful of code is basically our implementation of `OP_INVOKE` mixed\ntogether with a dash of `OP_GET_SUPER`. There are some differences in how the\nstack is organized, though. With an unoptimized super call, the superclass is\npopped and replaced by the ObjBoundMethod for the resolved function *before* the\narguments to the call are executed. This ensures that by the time the `OP_CALL`\nis executed, the bound method is *under* the argument list, where the runtime\nexpects it to be for a closure call.\n\nWith our optimized instructions, things are shuffled a bit:\n\n<img src=\"image/superclasses/super-invoke.png\" class=\"wide\" alt=\"The series of bytecode instructions for calling super.finish() using OP_SUPER_INVOKE.\" />\n\nNow resolving the superclass method is part of the *invocation*, so the\narguments need to already be on the stack at the point that we look up the\nmethod. 
This means the superclass object is on top of the arguments.\n\nAside from that, the behavior is roughly the same as an `OP_GET_SUPER` followed\nby an `OP_CALL`. First, we pull out the method name and argument count operands.\nThen we pop the superclass off the top of the stack so that we can look up the\nmethod in its method table. This conveniently leaves the stack set up just right\nfor a method call.\n\nWe pass the superclass, method name, and argument count to our existing\n`invokeFromClass()` function. That function looks up the given method on the\ngiven class and attempts to create a call to it with the given arity. If a\nmethod could not be found, it returns `false`, and we bail out of the\ninterpreter. Otherwise, `invokeFromClass()` pushes a new CallFrame onto the call\nstack for the method's closure. That invalidates the interpreter's cached\nCallFrame pointer, so we refresh `frame`.\n\n## A Complete Virtual Machine\n\nTake a look back at what we've created. By my count, we wrote around 2,500 lines\nof fairly clean, straightforward C. That little program contains a complete\nimplementation of the -- quite high-level! -- Lox language, with a whole\nprecedence table full of expression types and a suite of control flow\nstatements. We implemented variables, functions, closures, classes, fields,\nmethods, and inheritance.\n\nEven more impressive, our implementation is portable to any platform with a C\ncompiler, and is fast enough for real-world production use. We have a\nsingle-pass bytecode compiler, a tight virtual machine interpreter for our\ninternal instruction set, compact object representations, a stack for storing\nvariables without heap allocation, and a precise garbage collector.\n\nIf you go out and start poking around in the implementations of Lua, Python, or\nRuby, you will be surprised by how much of it now looks familiar to you. 
You\nhave seriously leveled up your knowledge of how programming languages work,\nwhich in turn gives you a deeper understanding of programming itself. It's like\nyou used to be a race car driver, and now you can pop the hood and repair the\nengine too.\n\nYou can stop here if you like. The two implementations of Lox you have are\ncomplete and full featured. You built the car and can drive it wherever you want\nnow. But if you are looking to have more fun tuning and tweaking for even\ngreater performance out on the track, there is one more chapter. We don't add\nany new capabilities, but we roll in a couple of classic optimizations to\nsqueeze even more perf out. If that sounds fun, [keep reading][opt]...\n\n[opt]: optimization.html\n\n<div class=\"challenges\">\n\n## Challenges\n\n1.  A tenet of object-oriented programming is that a class should ensure new\n    objects are in a valid state. In Lox, that means defining an initializer\n    that populates the instance's fields. Inheritance complicates invariants\n    because the instance must be in a valid state according to all of the\n    classes in the object's inheritance chain.\n\n    The easy part is remembering to call `super.init()` in each subclass's\n    `init()` method. The harder part is fields. There is nothing preventing two\n    classes in the inheritance chain from accidentally claiming the same field\n    name. When this happens, they will step on each other's fields and possibly\n    leave you with an instance in a broken state.\n\n    If Lox was your language, how would you address this, if at all? If you\n    would change the language, implement your change.\n\n2.  Our copy-down inheritance optimization is valid only because Lox does not\n    permit you to modify a class's methods after its declaration. 
This means we\n    don't have to worry about the copied methods in the subclass getting out of\n    sync with later changes to the superclass.\n\n    Other languages, like Ruby, *do* allow classes to be modified after the\n    fact. How do implementations of languages like that support class\n    modification while keeping method resolution efficient?\n\n3.  In the [jlox chapter on inheritance][inheritance], we had a challenge to\n    implement the BETA language's approach to method overriding. Solve the\n    challenge again, but this time in clox. Here's the description of the\n    previous challenge:\n\n    In Lox, as in most other object-oriented languages, when looking up a\n    method, we start at the bottom of the class hierarchy and work our way up --\n    a subclass's method is preferred over a superclass's. In order to get to the\n    superclass method from within an overriding method, you use `super`.\n\n    The language [BETA][] takes the [opposite approach][inner]. When you call a\n    method, it starts at the *top* of the class hierarchy and works *down*. A\n    superclass method wins over a subclass method. In order to get to the\n    subclass method, the superclass method can call `inner`, which is sort of\n    like the inverse of `super`. It chains to the next method down the\n    hierarchy.\n\n    The superclass method controls when and where the subclass is allowed to\n    refine its behavior. If the superclass method doesn't call `inner` at all,\n    then the subclass has no way of overriding or modifying the superclass's\n    behavior.\n\n    Take out Lox's current overriding and `super` behavior, and replace it with\n    BETA's semantics. 
In short:\n\n    *   When calling a method on a class, the method *highest* on the\n        class's inheritance chain takes precedence.\n\n    *   Inside the body of a method, a call to `inner` looks for a method with\n        the same name in the nearest subclass along the inheritance chain\n        between the class containing the `inner` and the class of `this`. If\n        there is no matching method, the `inner` call does nothing.\n\n    For example:\n\n    ```lox\n    class Doughnut {\n      cook() {\n        print \"Fry until golden brown.\";\n        inner();\n        print \"Place in a nice box.\";\n      }\n    }\n\n    class BostonCream < Doughnut {\n      cook() {\n        print \"Pipe full of custard and coat with chocolate.\";\n      }\n    }\n\n    BostonCream().cook();\n    ```\n\n    This should print:\n\n    ```text\n    Fry until golden brown.\n    Pipe full of custard and coat with chocolate.\n    Place in a nice box.\n    ```\n\n    Since clox is about not just implementing Lox, but doing so with good\n    performance, this time around try to solve the challenge with an eye towards\n    efficiency.\n\n[inheritance]: inheritance.html\n[inner]: http://journal.stuffwithstuff.com/2012/12/19/the-impoliteness-of-overriding-methods/\n[beta]: https://beta.cs.au.dk/\n\n</div>\n"
  },
  {
    "path": "book/the-lox-language.md",
    "content": "> What nicer thing can you do for somebody than make them breakfast?\n>\n> <cite>Anthony Bourdain</cite>\n\nWe'll spend the rest of this book illuminating every dark and sundry corner of\nthe Lox language, but it seems cruel to have you immediately start grinding out\ncode for the interpreter without at least a glimpse of what we're going to end\nup with.\n\nAt the same time, I don't want to drag you through reams of language lawyering\nand specification-ese before you get to touch your text <span\nname=\"home\">editor</span>. So this will be a gentle, friendly introduction to\nLox. It will leave out a lot of details and edge cases. We've got plenty of time\nfor those later.\n\n<aside name=\"home\">\n\nA tutorial isn't very fun if you can't try the code out yourself. Alas, you\ndon't have a Lox interpreter yet, since you haven't built one!\n\nFear not. You can use [mine][repo].\n\n[repo]: https://github.com/munificent/craftinginterpreters\n\n</aside>\n\n## Hello, Lox\n\nHere's your very first taste of <span name=\"salmon\">Lox</span>:\n\n<aside name=\"salmon\">\n\nYour first taste of Lox, the language, that is. I don't know if you've ever had\nthe cured, cold-smoked salmon before. If not, give it a try too.\n\n</aside>\n\n```lox\n// Your first Lox program!\nprint \"Hello, world!\";\n```\n\nAs that `//` line comment and the trailing semicolon imply, Lox's syntax is a\nmember of the C family. (There are no parentheses around the string because\n`print` is a built-in statement, and not a library function.)\n\nNow, I won't claim that <span name=\"c\">C</span> has a *great* syntax. If we\nwanted something elegant, we'd probably mimic Pascal or Smalltalk. If we wanted\nto go full Scandinavian-furniture-minimalism, we'd do a Scheme. Those all have\ntheir virtues.\n\n<aside name=\"c\">\n\nI'm surely biased, but I think Lox's syntax is pretty clean. C's most egregious\ngrammar problems are around types. 
Dennis Ritchie had this idea called\n\"[declaration reflects use][use]\", where variable declarations mirror the\noperations you would have to perform on the variable to get to a value of the\nbase type. Clever idea, but I don't think it worked out great in practice.\n\n[use]: http://softwareengineering.stackexchange.com/questions/117024/why-was-the-c-syntax-for-arrays-pointers-and-functions-designed-this-way\n\nLox doesn't have static types, so we avoid that.\n\n</aside>\n\nWhat C-like syntax has instead is something you'll often find more valuable\nin a language: *familiarity*. I know you are already comfortable with that style\nbecause the two languages we'll be using to *implement* Lox -- Java and C --\nalso inherit it. Using a similar syntax for Lox gives you one less thing to\nlearn.\n\n## A High-Level Language\n\nWhile this book ended up bigger than I was hoping, it's still not big enough to\nfit a huge language like Java in it. In order to fit two complete\nimplementations of Lox in these pages, Lox itself has to be pretty compact.\n\nWhen I think of languages that are small but useful, what comes to mind are\nhigh-level \"scripting\" languages like <span name=\"js\">JavaScript</span>, Scheme,\nand Lua. Of those three, Lox looks most like JavaScript, mainly because most\nC-syntax languages do. As we'll learn later, Lox's approach to scoping hews\nclosely to Scheme. The C flavor of Lox we'll build in [Part III][] is heavily\nindebted to Lua's clean, efficient implementation.\n\n[part iii]: a-bytecode-virtual-machine.html\n\n<aside name=\"js\">\n\nNow that JavaScript has taken over the world and is used to build ginormous\napplications, it's hard to think of it as a \"little scripting language\". But\nBrendan Eich hacked the first JS interpreter into Netscape Navigator in *ten\ndays* to make buttons animate on web pages. 
JavaScript has grown up since then,\nbut it was once a cute little language.\n\nBecause Eich slapped JS together with roughly the same raw materials and time as\nan episode of MacGyver, it has some weird semantic corners where the duct tape\nand paper clips show through. Things like variable hoisting, dynamically bound\n`this`, holes in arrays, and implicit conversions.\n\nI had the luxury of taking my time on Lox, so it should be a little cleaner.\n\n</aside>\n\nLox shares two other aspects with those three languages:\n\n### Dynamic typing\n\nLox is dynamically typed. Variables can store values of any type, and a single\nvariable can even store values of different types at different times. If you try\nto perform an operation on values of the wrong type -- say, dividing a number by\na string -- then the error is detected and reported at runtime.\n\nThere are plenty of reasons to like <span name=\"static\">static</span> types, but\nthey don't outweigh the pragmatic reasons to pick dynamic types for Lox. A\nstatic type system is a ton of work to learn and implement. Skipping it gives\nyou a simpler language and a shorter book. We'll get our interpreter up and\nexecuting bits of code sooner if we defer our type checking to runtime.\n\n<aside name=\"static\">\n\nAfter all, the two languages we'll be using to *implement* Lox are both\nstatically typed.\n\n</aside>\n\n### Automatic memory management\n\nHigh-level languages exist to eliminate error-prone, low-level drudgery, and what\ncould be more tedious than manually managing the allocation and freeing of\nstorage? No one rises and greets the morning sun with, \"I can't wait to figure\nout the correct place to call `free()` for every byte of memory I allocate\ntoday!\"\n\nThere are two main <span name=\"gc\">techniques</span> for managing memory:\n**reference counting** and **tracing garbage collection** (usually just called\n**garbage collection** or **GC**). 
Ref counters are much simpler to implement --\nI think that's why Perl, PHP, and Python all started out using them. But, over\ntime, the limitations of ref counting become too troublesome. All of those\nlanguages eventually ended up adding a full tracing GC, or at least enough of\none to clean up object cycles.\n\n<aside name=\"gc\">\n\nIn practice, ref counting and tracing are more ends of a continuum than\nopposing sides. Most ref counting systems end up doing some tracing to handle\ncycles, and the write barriers of a generational collector look a bit like\nretain calls if you squint.\n\nFor lots more on this, see \"[A Unified Theory of Garbage Collection][gc]\" (PDF).\n\n[gc]: https://researcher.watson.ibm.com/researcher/files/us-bacon/Bacon04Unified.pdf\n\n</aside>\n\nTracing garbage collection has a fearsome reputation. It *is* a little harrowing\nworking at the level of raw memory. Debugging a GC can sometimes leave you\nseeing hex dumps in your dreams. But, remember, this book is about dispelling\nmagic and slaying those monsters, so we *are* going to write our own garbage\ncollector. I think you'll find the algorithm is quite simple and a lot of fun to\nimplement.\n\n## Data Types\n\nIn Lox's little universe, the atoms that make up all matter are the built-in\ndata types. There are only a few:\n\n*   **<span name=\"bool\">Booleans</span>.** You can't code without logic and you\n    can't logic without Boolean values. \"True\" and \"false\", the yin and yang of\n    software. Unlike some ancient languages that repurpose an existing type to\n    represent truth and falsehood, Lox has a dedicated Boolean type. We may\n    be roughing it on this expedition, but we aren't *savages*.\n\n    <aside name=\"bool\">\n\n    Boolean variables are the only data type in Lox named after a person, George\n    Boole, which is why \"Boolean\" is capitalized. He died in 1864, nearly a\n    century before digital computers turned his algebra into electricity. 
I\n    wonder what he'd think to see his name all over billions of lines of Java\n    code.\n\n    </aside>\n\n    There are two Boolean values, obviously, and a literal for each one.\n\n    ```lox\n    true;  // Not false.\n    false; // Not *not* false.\n    ```\n\n*   **Numbers.** Lox has only one kind of number: double-precision floating\n    point. Since floating-point numbers can also represent a wide range of\n    integers, that covers a lot of territory, while keeping things simple.\n\n    Full-featured languages have lots of syntax for numbers -- hexadecimal,\n    scientific notation, octal, all sorts of fun stuff. We'll settle for basic\n    integer and decimal literals.\n\n    ```lox\n    1234;  // An integer.\n    12.34; // A decimal number.\n    ```\n\n*   **Strings.** We've already seen one string literal in the first example.\n    Like most languages, they are enclosed in double quotes.\n\n    ```lox\n    \"I am a string\";\n    \"\";    // The empty string.\n    \"123\"; // This is a string, not a number.\n    ```\n\n    As we'll see when we get to implementing them, there is quite a lot of\n    complexity hiding in that innocuous sequence of <span\n    name=\"char\">characters</span>.\n\n    <aside name=\"char\">\n\n    Even that word \"character\" is a trickster. Is it ASCII? Unicode? A\n    code point or a \"grapheme cluster\"? How are characters encoded? Is each\n    character a fixed size, or can they vary?\n\n    </aside>\n\n*   **Nil.** There's one last built-in value who's never invited to the party\n    but always seems to show up. It represents \"no value\". It's called \"null\" in\n    many other languages. In Lox we spell it `nil`. (When we get to implementing\n    it, that will help distinguish when we're talking about Lox's `nil` versus\n    Java or C's `null`.)\n\n    There are good arguments for not having a null value in a language since\n    null pointer errors are the scourge of our industry. 
If we were doing a\n    statically typed language, it would be worth trying to ban it. In a\n    dynamically typed one, though, eliminating it is often more annoying\n    than having it.\n\n## Expressions\n\nIf built-in data types and their literals are atoms, then **expressions** must\nbe the molecules. Most of these will be familiar.\n\n### Arithmetic\n\nLox features the basic arithmetic operators you know and love from C and other\nlanguages:\n\n```lox\nadd + me;\nsubtract - me;\nmultiply * me;\ndivide / me;\n```\n\nThe subexpressions on either side of the operator are **operands**. Because\nthere are *two* of them, these are called **binary** operators. (It has nothing\nto do with the ones-and-zeroes use of \"binary\".) Because the operator is <span\nname=\"fixity\">fixed</span> *in* the middle of the operands, these are also\ncalled **infix** operators (as opposed to **prefix** operators where the\noperator comes before the operands, and **postfix** where it comes after).\n\n<aside name=\"fixity\">\n\nThere are some operators that have more than two operands and the operators are\ninterleaved between them. The only one in wide usage is the \"conditional\" or\n\"ternary\" operator of C and friends:\n\n```c\ncondition ? thenArm : elseArm;\n```\n\nSome call these **mixfix** operators. A few languages let you define your own\noperators and control how they are positioned -- their \"fixity\".\n\n</aside>\n\nOne arithmetic operator is actually *both* an infix and a prefix one. The `-`\noperator can also be used to negate a number.\n\n```lox\n-negateMe;\n```\n\nAll of these operators work on numbers, and it's an error to pass any other\ntypes to them. 
The exception is the `+` operator -- you can also pass it two\nstrings to concatenate them.\n\n### Comparison and equality\n\nMoving along, we have a few more operators that always return a Boolean result.\nWe can compare numbers (and only numbers), using Ye Olde Comparison Operators.\n\n```lox\nless < than;\nlessThan <= orEqual;\ngreater > than;\ngreaterThan >= orEqual;\n```\n\nWe can test two values of any kind for equality or inequality.\n\n```lox\n1 == 2;         // false.\n\"cat\" != \"dog\"; // true.\n```\n\nEven different types.\n\n```lox\n314 == \"pi\"; // false.\n```\n\nValues of different types are *never* equivalent.\n\n```lox\n123 == \"123\"; // false.\n```\n\nI'm generally against implicit conversions.\n\n### Logical operators\n\nThe not operator, a prefix `!`, returns `false` if its operand is true, and vice\nversa.\n\n```lox\n!true;  // false.\n!false; // true.\n```\n\nThe other two logical operators really are control flow constructs in the guise\nof expressions. An <span name=\"and\">`and`</span> expression determines if two\nvalues are *both* true. It returns the left operand if it's false, or the\nright operand otherwise.\n\n```lox\ntrue and false; // false.\ntrue and true;  // true.\n```\n\nAnd an `or` expression determines if *either* of two values (or both) are true.\nIt returns the left operand if it is true and the right operand otherwise.\n\n```lox\nfalse or false; // false.\ntrue or false;  // true.\n```\n\n<aside name=\"and\">\n\nI used `and` and `or` for these instead of `&&` and `||` because Lox doesn't use\n`&` and `|` for bitwise operators. It felt weird to introduce the\ndouble-character forms without the single-character ones.\n\nI also kind of like using words for these since they are really control flow\nstructures and not simple operators.\n\n</aside>\n\nThe reason `and` and `or` are like control flow structures is that they\n**short-circuit**. 
Not only does `and` return the left operand if it is false,\nit doesn't even *evaluate* the right one in that case. Conversely\n(contrapositively?), if the left operand of an `or` is true, the right is\nskipped.\n\n### Precedence and grouping\n\nAll of these operators have the same precedence and associativity that you'd\nexpect coming from C. (When we get to parsing, we'll get *way* more precise\nabout that.) In cases where the precedence isn't what you want, you can use `()`\nto group stuff.\n\n```lox\nvar average = (min + max) / 2;\n```\n\nSince they aren't very technically interesting, I've cut the remainder of the\ntypical operator menagerie out of our little language. No bitwise, shift,\nmodulo, or conditional operators. I'm not grading you, but you will get bonus\npoints in my heart if you augment your own implementation of Lox with them.\n\nThose are the expression forms (except for a couple related to specific features\nthat we'll get to later), so let's move up a level.\n\n## Statements\n\nNow we're at statements. Where an expression's main job is to produce a *value*,\na statement's job is to produce an *effect*. Since, by definition, statements\ndon't evaluate to a value, to be useful they have to otherwise change the world\nin some way -- usually modifying some state, reading input, or producing output.\n\nYou've seen a couple of kinds of statements already. The first one was:\n\n```lox\nprint \"Hello, world!\";\n```\n\nA <span name=\"print\">`print` statement</span> evaluates a single expression\nand displays the result to the user. You've also seen some statements like:\n\n<aside name=\"print\">\n\nBaking `print` into the language instead of just making it a core library\nfunction is a hack. 
But it's a *useful* hack for us: it means our in-progress\ninterpreter can start producing output before we've implemented all of the\nmachinery required to define functions, look them up by name, and call them.\n\n</aside>\n\n```lox\n\"some expression\";\n```\n\nAn expression followed by a semicolon (`;`) promotes the expression to\nstatement-hood. This is called (imaginatively enough), an **expression\nstatement**.\n\nIf you want to pack a series of statements where a single one is expected, you\ncan wrap them up in a **block**.\n\n```lox\n{\n  print \"One statement.\";\n  print \"Two statements.\";\n}\n```\n\nBlocks also affect scoping, which leads us to the next section...\n\n## Variables\n\nYou declare variables using `var` statements. If you <span\nname=\"omit\">omit</span> the initializer, the variable's value defaults to `nil`.\n\n<aside name=\"omit\">\n\nThis is one of those cases where not having `nil` and forcing every variable to\nbe initialized to some value would be more annoying than dealing with `nil`\nitself.\n\n</aside>\n\n```lox\nvar imAVariable = \"here is my value\";\nvar iAmNil;\n```\n\nOnce declared, you can, naturally, access and assign a variable using its name.\n\n<span name=\"breakfast\"></span>\n\n```lox\nvar breakfast = \"bagels\";\nprint breakfast; // \"bagels\".\nbreakfast = \"beignets\";\nprint breakfast; // \"beignets\".\n```\n\n<aside name=\"breakfast\">\n\nCan you tell that I tend to work on this book in the morning before I've had\nanything to eat?\n\n</aside>\n\nI won't get into the rules for variable scope here, because we're going to spend\na surprising amount of time in later chapters mapping every square inch of the\nrules. In most cases, it works like you would expect coming from C or Java.\n\n## Control Flow\n\nIt's hard to write <span name=\"flow\">useful</span> programs if you can't skip\nsome code or execute some more than once. That means control flow. 
In addition\nto the logical operators we already covered, Lox lifts three statements straight\nfrom C.\n\n<aside name=\"flow\">\n\nWe already have `and` and `or` for branching, and we *could* use recursion to\nrepeat code, so that's theoretically sufficient. It would be pretty awkward to\nprogram that way in an imperative-styled language, though.\n\nScheme, on the other hand, has no built-in looping constructs. It *does* rely on\nrecursion for repetition. Smalltalk has no built-in branching constructs, and\nrelies on dynamic dispatch for selectively executing code.\n\n</aside>\n\nAn `if` statement executes one of two statements based on some condition.\n\n```lox\nif (condition) {\n  print \"yes\";\n} else {\n  print \"no\";\n}\n```\n\nA `while` <span name=\"do\">loop</span> executes the body repeatedly as long as\nthe condition expression evaluates to true.\n\n```lox\nvar a = 1;\nwhile (a < 10) {\n  print a;\n  a = a + 1;\n}\n```\n\n<aside name=\"do\">\n\nI left `do while` loops out of Lox because they aren't that common and wouldn't\nteach you anything that you won't already learn from `while`. Go ahead and add\nit to your implementation if it makes you happy. It's your party.\n\n</aside>\n\nFinally, we have `for` loops.\n\n```lox\nfor (var a = 1; a < 10; a = a + 1) {\n  print a;\n}\n```\n\nThis loop does the same thing as the previous `while` loop. Most modern\nlanguages also have some sort of <span name=\"foreach\">`for-in`</span> or\n`foreach` loop for explicitly iterating over various sequence types. In a real\nlanguage, that's nicer than the crude C-style `for` loop we got here. Lox keeps\nit basic.\n\n<aside name=\"foreach\">\n\nThis is a concession I made because of how the implementation is split across\nchapters. A `for-in` loop needs some sort of dynamic dispatch in the iterator\nprotocol to handle different kinds of sequences, but we don't get that until\nafter we're done with control flow. 
We could circle back and add `for-in` loops\nlater, but I didn't think doing so would teach you anything super interesting.\n\n</aside>\n\n## Functions\n\nA function call expression looks the same as it does in C.\n\n```lox\nmakeBreakfast(bacon, eggs, toast);\n```\n\nYou can also call a function without passing anything to it.\n\n```lox\nmakeBreakfast();\n```\n\nUnlike in, say, Ruby, the parentheses are mandatory in this case. If you leave them\noff, the name doesn't *call* the function, it just refers to it.\n\nA language isn't very fun if you can't define your own functions. In Lox, you do\nthat with <span name=\"fun\">`fun`</span>.\n\n<aside name=\"fun\">\n\nI've seen languages that use `fn`, `fun`, `func`, and `function`. I'm still\nhoping to discover a `funct`, `functi`, or `functio` somewhere.\n\n</aside>\n\n```lox\nfun printSum(a, b) {\n  print a + b;\n}\n```\n\nNow's a good time to clarify some <span name=\"define\">terminology</span>. Some\npeople throw around \"parameter\" and \"argument\" like they are interchangeable\nand, to many, they are. We're going to spend a lot of time splitting the finest\nof downy hairs around semantics, so let's sharpen our words. From here on out:\n\n*   An **argument** is an actual value you pass to a function when you call it.\n    So a function *call* has an *argument* list. Sometimes you hear **actual\n    parameter** used for these.\n\n*   A **parameter** is a variable that holds the value of the argument inside\n    the body of the function. Thus, a function *declaration* has a *parameter*\n    list. Others call these **formal parameters** or simply **formals**.\n\n<aside name=\"define\">\n\nSpeaking of terminology, some statically typed languages like C make a\ndistinction between *declaring* a function and *defining* it. A declaration\nbinds the function's type to its name so that calls can be type-checked but does\nnot provide a body. 
A definition declares the function and also fills in the\nbody so that the function can be compiled.\n\nSince Lox is dynamically typed, this distinction isn't meaningful. A function\ndeclaration fully specifies the function including its body.\n\n</aside>\n\nThe body of a function is always a block. Inside it, you can return a value\nusing a `return` statement.\n\n```lox\nfun returnSum(a, b) {\n  return a + b;\n}\n```\n\nIf execution reaches the end of the block without hitting a `return`, it\n<span name=\"sneaky\">implicitly</span> returns `nil`.\n\n<aside name=\"sneaky\">\n\nSee, I told you `nil` would sneak in when we weren't looking.\n\n</aside>\n\n### Closures\n\nFunctions are *first class* in Lox, which just means they are real values that\nyou can get a reference to, store in variables, pass around, etc. This works:\n\n```lox\nfun addPair(a, b) {\n  return a + b;\n}\n\nfun identity(a) {\n  return a;\n}\n\nprint identity(addPair)(1, 2); // Prints \"3\".\n```\n\nSince function declarations are statements, you can declare local functions\ninside another function.\n\n```lox\nfun outerFunction() {\n  fun localFunction() {\n    print \"I'm local!\";\n  }\n\n  localFunction();\n}\n```\n\nIf you combine local functions, first-class functions, and block scope, you run\ninto this interesting situation:\n\n```lox\nfun returnFunction() {\n  var outside = \"outside\";\n\n  fun inner() {\n    print outside;\n  }\n\n  return inner;\n}\n\nvar fn = returnFunction();\nfn();\n```\n\nHere, `inner()` accesses a local variable declared outside of its body in the\nsurrounding function. Is this kosher? Now that lots of languages have borrowed\nthis feature from Lisp, you probably know the answer is yes.\n\nFor that to work, `inner()` has to \"hold on\" to references to any surrounding\nvariables that it uses so that they stay around even after the outer function\nhas returned. We call functions that do this <span\nname=\"closure\">**closures**</span>. 
These days, the term is often used for *any*\nfirst-class function, though it's sort of a misnomer if the function doesn't\nhappen to close over any variables.\n\n<aside name=\"closure\">\n\nPeter J. Landin coined the term \"closure\". Yes, he invented damn near half the\nterms in programming languages. Most of them came out of one incredible paper,\n\"[The Next 700 Programming Languages][svh]\".\n\n[svh]: https://homepages.inf.ed.ac.uk/wadler/papers/papers-we-love/landin-next-700.pdf\n\nIn order to implement these kinds of functions, you need to create a data\nstructure that bundles together the function's code and the surrounding\nvariables it needs. He called this a \"closure\" because it *closes over* and\nholds on to the variables it needs.\n\n</aside>\n\nAs you can imagine, implementing these adds some complexity because we can no\nlonger assume variable scope works strictly like a stack where local variables\nevaporate the moment the function returns. We're going to have a fun time\nlearning how to make these work correctly and efficiently.\n\n## Classes\n\nSince Lox has dynamic typing, lexical (roughly, \"block\") scope, and closures,\nit's about halfway to being a functional language. But as you'll see, it's\n*also* about halfway to being an object-oriented language. Both paradigms have a\nlot going for them, so I thought it was worth covering some of each.\n\nSince classes have come under fire for not living up to their hype, let me first\nexplain why I put them into Lox and this book. There are really two questions:\n\n### Why might any language want to be object oriented?\n\nNow that object-oriented languages like Java have sold out and only play arena\nshows, it's not cool to like them anymore. Why would anyone make a *new*\nlanguage with objects? 
Isn't that like releasing music on 8-track?\n\nIt is true that the \"all inheritance all the time\" binge of the '90s produced\nsome monstrous class hierarchies, but **object-oriented programming** (**OOP**)\nis still pretty rad. Billions of lines of successful code have been written in\nOOP languages, shipping millions of apps to happy users. Likely a majority of\nworking programmers today are using an object-oriented language. They can't all\nbe *that* wrong.\n\nIn particular, for a dynamically typed language, objects are pretty handy. We\nneed *some* way of defining compound data types to bundle blobs of stuff\ntogether.\n\nIf we can also hang methods off of those, then we avoid the need to prefix all\nof our functions with the name of the data type they operate on to avoid\ncolliding with similar functions for different types. In, say, Racket, you end\nup having to name your functions like `hash-copy` (to copy a hash table) and\n`vector-copy` (to copy a vector) so that they don't step on each other. Methods\nare scoped to the object, so that problem goes away.\n\n### Why is Lox object oriented?\n\nI could claim objects are groovy but still out of scope for the book. Most\nprogramming language books, especially ones that try to implement a whole\nlanguage, leave objects out. To me, that means the topic isn't well covered.\nWith such a widespread paradigm, that omission makes me sad.\n\nGiven how many of us spend all day *using* OOP languages, it seems like the\nworld could use a little documentation on how to *make* one. As you'll see, it\nturns out to be pretty interesting. Not as hard as you might fear, but not as\nsimple as you might presume, either.\n\n### Classes or prototypes\n\nWhen it comes to objects, there are actually two approaches to them, [classes][]\nand [prototypes][]. Classes came first, and are more common thanks to C++, Java,\nC#, and friends. 
Prototypes were a virtually forgotten offshoot until JavaScript\naccidentally took over the world.\n\n[classes]: https://en.wikipedia.org/wiki/Class-based_programming\n[prototypes]: https://en.wikipedia.org/wiki/Prototype-based_programming\n\nIn class-based languages, there are two core concepts: instances and classes.\nInstances store the state for each object and have a reference to the instance's\nclass. Classes contain the methods and inheritance chain. To call a method on an\ninstance, there is always a level of indirection. You <span\nname=\"dispatch\">look</span> up the instance's class and then you find the method\n*there*:\n\n<aside name=\"dispatch\">\n\nIn a statically typed language like C++, method lookup typically happens at\ncompile time based on the *static* type of the instance, giving you **static\ndispatch**. In contrast, **dynamic dispatch** looks up the class of the actual\ninstance object at runtime. This is how virtual methods in statically typed\nlanguages and all methods in a dynamically typed language like Lox work.\n\n</aside>\n\n<img src=\"image/the-lox-language/class-lookup.png\" alt=\"How fields and methods are looked up on classes and instances\" />\n\nPrototype-based languages <span name=\"blurry\">merge</span> these two concepts.\nThere are only objects -- no classes -- and each individual object may contain\nstate and methods. Objects can directly inherit from each other (or \"delegate\nto\" in prototypal lingo):\n\n<aside name=\"blurry\">\n\nIn practice the line between class-based and prototype-based languages blurs.\nJavaScript's \"constructor function\" notion [pushes you pretty hard][js new]\ntowards defining class-like objects. 
Meanwhile, class-based Ruby is perfectly\nhappy to let you attach methods to individual instances.\n\n[js new]: http://gameprogrammingpatterns.com/prototype.html#what-about-javascript\n\n</aside>\n\n<img src=\"image/the-lox-language/prototype-lookup.png\" alt=\"How fields and methods are looked up in a prototypal system\" />\n\nThis means that in some ways prototypal languages are more fundamental than\nclass-based ones. They are really neat to implement because they're *so* simple. Also,\nthey can express lots of unusual patterns that classes steer you away from.\n\nBut I've looked at a *lot* of code written in prototypal languages -- including\n[some of my own devising][finch]. Do you know what people generally do with all\nof the power and flexibility of prototypes? ...They use them to reinvent\nclasses.\n\n[finch]: http://finch.stuffwithstuff.com/\n\nI don't know *why* that is, but people naturally seem to prefer a class-based\n(Classic? Classy?) style. Prototypes *are* simpler in the language, but they\nseem to accomplish that only by <span name=\"waterbed\">pushing</span> the\ncomplexity onto the user. So, for Lox, we'll save our users the trouble and bake\nclasses right in.\n\n<aside name=\"waterbed\">\n\nLarry Wall, Perl's inventor/prophet, calls this the \"[waterbed theory][]\". Some\ncomplexity is essential and cannot be eliminated. If you push it down in one\nplace, it swells up in another.\n\n[waterbed theory]: http://wiki.c2.com/?WaterbedTheory\n\nPrototypal languages don't so much *eliminate* the complexity of classes as they\ndo make the *user* take on that complexity by building their own class-like\nmetaprogramming libraries.\n\n</aside>\n\n### Classes in Lox\n\nEnough rationale, let's see what we actually have. Classes encompass a\nconstellation of features in most languages. For Lox, I've selected what I think\nare the brightest stars. 
You declare a class and its methods like so:\n\n```lox\nclass Breakfast {\n  cook() {\n    print \"Eggs a-fryin'!\";\n  }\n\n  serve(who) {\n    print \"Enjoy your breakfast, \" + who + \".\";\n  }\n}\n```\n\nThe body of a class contains its methods. They look like function declarations\nbut without the `fun` <span name=\"method\">keyword</span>. When the class\ndeclaration is executed, Lox creates a class object and stores that in a\nvariable named after the class. Just like functions, classes are first class in\nLox.\n\n<aside name=\"method\">\n\nThey are still just as fun, though.\n\n</aside>\n\n```lox\n// Store it in variables.\nvar someVariable = Breakfast;\n\n// Pass it to functions.\nsomeFunction(Breakfast);\n```\n\nNext, we need a way to create instances. We could add some sort of `new`\nkeyword, but to keep things simple, in Lox the class itself is a factory\nfunction for instances. Call a class like a function, and it produces a new\ninstance of itself.\n\n```lox\nvar breakfast = Breakfast();\nprint breakfast; // \"Breakfast instance\".\n```\n\n### Instantiation and initialization\n\nClasses that only have behavior aren't super useful. The idea behind\nobject-oriented programming is encapsulating behavior *and state* together. To\ndo that, you need fields. Lox, like other dynamically typed languages, lets you\nfreely add properties onto objects.\n\n```lox\nbreakfast.meat = \"sausage\";\nbreakfast.bread = \"sourdough\";\n```\n\nAssigning to a field creates it if it doesn't already exist.\n\nIf you want to access a field or method on the current object from within a\nmethod, you use good old `this`.\n\n```lox\nclass Breakfast {\n  serve(who) {\n    print \"Enjoy your \" + this.meat + \" and \" +\n        this.bread + \", \" + who + \".\";\n  }\n\n  // ...\n}\n```\n\nPart of encapsulating data within an object is ensuring the object is in a valid\nstate when it's created. To do that, you can define an initializer. 
If your\nclass has a method named `init()`, it is called automatically when the object is\nconstructed. Any parameters passed to the class are forwarded to its\ninitializer.\n\n```lox\nclass Breakfast {\n  init(meat, bread) {\n    this.meat = meat;\n    this.bread = bread;\n  }\n\n  // ...\n}\n\nvar baconAndToast = Breakfast(\"bacon\", \"toast\");\nbaconAndToast.serve(\"Dear Reader\");\n// \"Enjoy your bacon and toast, Dear Reader.\"\n```\n\n### Inheritance\n\nEvery object-oriented language lets you not only define methods, but reuse them\nacross multiple classes or objects. For that, Lox supports single inheritance.\nWhen you declare a class, you can specify a class that it inherits from using a less-than\n<span name=\"less\">(`<`)</span> operator.\n\n```lox\nclass Brunch < Breakfast {\n  drink() {\n    print \"How about a Bloody Mary?\";\n  }\n}\n```\n\n<aside name=\"less\">\n\nWhy the `<` operator? I didn't feel like introducing a new keyword like\n`extends`. Lox doesn't use `:` for anything else so I didn't want to reserve\nthat either. Instead, I took a page from Ruby and used `<`.\n\nIf you know any type theory, you'll notice it's not a *totally* arbitrary\nchoice. Every instance of a subclass is an instance of its superclass too, but\nthere may be instances of the superclass that are not instances of the subclass.\nThat means, in the universe of objects, the set of subclass objects is smaller\nthan the superclass's set, though type nerds usually use `<:` for that relation.\n\n</aside>\n\nHere, Brunch is the **derived class** or **subclass**, and Breakfast is the\n**base class** or **superclass**.\n\nEvery method defined in the superclass is also available to its subclasses.\n\n```lox\nvar benedict = Brunch(\"ham\", \"English muffin\");\nbenedict.serve(\"Noble Reader\");\n```\n\nEven the `init()` method gets <span name=\"init\">inherited</span>. In practice,\nthe subclass usually wants to define its own `init()` method too. 
But the\noriginal one also needs to be called so that the superclass can maintain its\nstate. We need some way to call a method on our own *instance* without hitting\nour own *methods*.\n\n<aside name=\"init\">\n\nLox is different from C++, Java, and C#, which do not inherit constructors, but\nsimilar to Smalltalk and Ruby, which do.\n\n</aside>\n\nAs in Java, you use `super` for that.\n\n```lox\nclass Brunch < Breakfast {\n  init(meat, bread, drink) {\n    super.init(meat, bread);\n    this.drink = drink;\n  }\n}\n```\n\nThat's about it for object orientation. I tried to keep the feature set minimal.\nThe structure of the book did force one compromise. Lox is not a *pure*\nobject-oriented language. In a true OOP language every object is an instance of\na class, even primitive values like numbers and Booleans.\n\nBecause we don't implement classes until well after we start working with the\nbuilt-in types, that would have been hard. So values of primitive types aren't\nreal objects in the sense of being instances of classes. They don't have methods\nor properties. If I were trying to make Lox a real language for real users, I\nwould fix that.\n\n## The Standard Library\n\nWe're almost done. That's the whole language, so all that's left is the \"core\"\nor \"standard\" library -- the set of functionality that is implemented directly\nin the interpreter and that all user-defined behavior is built on top of.\n\nThis is the saddest part of Lox. Its standard library goes beyond minimalism and\nveers close to outright nihilism. For the sample code in the book, we only need\nto demonstrate that code is running and doing what it's supposed to do. For\nthat, we already have the built-in `print` statement.\n\nLater, when we start optimizing, we'll write some benchmarks and see how long it\ntakes to execute code. That means we need to track time, so we'll define one\nbuilt-in function, `clock()`, that returns the number of seconds since the\nprogram started.\n\nAnd... 
that's it. I know, right? It's embarrassing.\n\nIf you wanted to turn Lox into an actual useful language, the very first thing\nyou should do is flesh this out. String manipulation, trigonometric functions,\nfile I/O, networking, heck, even *reading input from the user* would help. But we\ndon't need any of that for this book, and adding it wouldn't teach you anything\ninteresting, so I've left it out.\n\nDon't worry, we'll have plenty of exciting stuff in the language itself to keep\nus busy.\n\n<div class=\"challenges\">\n\n## Challenges\n\n1. Write some sample Lox programs and run them (you can use the implementations\n   of Lox in [my repository][repo]). Try to come up with edge case behavior I\n   didn't specify here. Does it do what you expect? Why or why not?\n\n2. This informal introduction leaves a *lot* unspecified. List several open\n   questions you have about the language's syntax and semantics. What do you\n   think the answers should be?\n\n3. Lox is a pretty tiny language. What features do you think it is missing that\n   would make it annoying to use for real programs? (Aside from the standard\n   library, of course.)\n\n</div>\n\n<div class=\"design-note\">\n\n## Design Note: Expressions and Statements\n\nLox has both expressions and statements. Some languages omit the latter.\nInstead, they treat declarations and control flow constructs as expressions too.\nThese \"everything is an expression\" languages tend to have functional pedigrees\nand include most Lisps, SML, Haskell, Ruby, and CoffeeScript.\n\nTo do that, for each \"statement-like\" construct in the language, you need to\ndecide what value it evaluates to. 
Some of those are easy:\n\n*   An `if` expression evaluates to the result of whichever branch is chosen.\n    Likewise, a `switch` or other multi-way branch evaluates to whichever case\n    is picked.\n\n*   A variable declaration evaluates to the value of the variable.\n\n*   A block evaluates to the result of the last expression in the sequence.\n\nSome get a little stranger. What should a loop evaluate to? A `while` loop in\nCoffeeScript evaluates to an array containing each element that the body\nevaluated to. That can be handy, or a waste of memory if you don't need the\narray.\n\nYou also have to decide how these statement-like expressions compose with other\nexpressions -- you have to fit them into the grammar's precedence table. For\nexample, Ruby allows:\n\n```ruby\nputs 1 + if true then 2 else 3 end + 4\n```\n\nIs this what you'd expect? Is it what your *users* expect? How does this affect\nhow you design the syntax for your \"statements\"? Note that Ruby has an explicit\n`end` to tell when the `if` expression is complete. Without it, the `+ 4` would\nlikely be parsed as part of the `else` clause.\n\nTurning every statement into an expression forces you to answer a few hairy\nquestions like that. In return, you eliminate some redundancy. C has both blocks\nfor sequencing statements, and the comma operator for sequencing expressions. It\nhas both the `if` statement and the `?:` conditional operator. If everything was\nan expression in C, you could unify each of those.\n\nLanguages that do away with statements usually also feature **implicit returns**\n-- a function automatically returns whatever value its body evaluates to without\nneed for some explicit `return` syntax. For small functions and methods, this is\nreally handy. 
In fact, many languages that do have statements have added syntax\nlike `=>` to be able to define functions whose body is the result of evaluating\na single expression.\n\nBut making *all* functions work that way can be a little strange. If you aren't\ncareful, your function will leak a return value even if you only intend it to\nproduce a side effect. In practice, though, users of these languages don't find\nit to be a problem.\n\nFor Lox, I gave it statements for prosaic reasons. I picked a C-like syntax for\nfamiliarity's sake, and trying to take the existing C statement syntax and\ninterpret it like expressions gets weird pretty fast.\n\n</div>\n"
  },
  {
    "path": "book/types-of-values.md",
    "content": "> When you are a Bear of Very Little Brain, and you Think of Things, you find\n> sometimes that a Thing which seemed very Thingish inside you is quite\n> different when it gets out into the open and has other people looking at it.\n>\n> <cite>A. A. Milne, <em>Winnie-the-Pooh</em></cite>\n\nThe past few chapters were huge, packed full of complex techniques and pages of\ncode. In this chapter, there's only one new concept to learn and a scattering of\nstraightforward code. You've earned a respite.\n\nLox is <span name=\"unityped\">dynamically</span> typed. A single variable can\nhold a Boolean, number, or string at different points in time. At least, that's\nthe idea. Right now, in clox, all values are numbers. By the end of the chapter,\nit will also support Booleans and `nil`. While those aren't super interesting,\nthey force us to figure out how our value representation can dynamically handle\ndifferent types.\n\n<aside name=\"unityped\">\n\nThere is a third category next to statically typed and dynamically typed:\n**unityped**. In that paradigm, all variables have a single type, usually a\nmachine register integer. Unityped languages aren't common today, but some\nForths and BCPL, the language that inspired C, worked like this.\n\nAs of this moment, clox is unityped.\n\n</aside>\n\n## Tagged Unions\n\nThe nice thing about working in C is that we can build our data structures from\nthe raw bits up. The bad thing is that we *have* to do that. C doesn't give you\nmuch for free at compile time and even less at runtime. As far as C is\nconcerned, the universe is an undifferentiated array of bytes. It's up to us to\ndecide how many of those bytes to use and what they mean.\n\nIn order to choose a value representation, we need to answer two key questions:\n\n1.  **How do we represent the type of a value?** If you try to, say, multiply a\n    number by `true`, we need to detect that error at runtime and report it. 
In\n    order to do that, we need to be able to tell what a value's type is.\n\n2.  **How do we store the value itself?** We need to not only be able to tell\n    that three is a number, but that it's different from the number four. I\n    know, seems obvious, right? But we're operating at a level where it's good\n    to spell these things out.\n\nSince we're not just designing this language but building it ourselves, when\nanswering these two questions we also have to keep in mind the implementer's\neternal quest: to do it *efficiently*.\n\nLanguage hackers over the years have come up with a variety of clever ways to\npack the above information into as few bits as possible. For now, we'll start\nwith the simplest, classic solution: a **tagged union**. A value contains two\nparts: a type \"tag\", and a payload for the actual value. To store the value's\ntype, we define an enum for each kind of value the VM supports.\n\n^code value-type (2 before, 1 after)\n\n<aside name=\"user-types\">\n\nThe cases here cover each kind of value that has *built-in support in the VM*.\nWhen we get to adding classes to the language, each class the user defines\ndoesn't need its own entry in this enum. As far as the VM is concerned, every\ninstance of a class is the same type: \"instance\".\n\nIn other words, this is the VM's notion of \"type\", not the user's.\n\n</aside>\n\nFor now, we have only a couple of cases, but this will grow as we add strings,\nfunctions, and classes to clox. In addition to the type, we also need to store\nthe data for the value -- the `double` for a number, `true` or `false` for a\nBoolean. We could define a struct with fields for each possible type.\n\n<img src=\"image/types-of-values/struct.png\" alt=\"A struct with two fields laid next to each other in memory.\" />\n\nBut this is a waste of memory. A value can't simultaneously be both a number and\na Boolean. So at any point in time, only one of those fields will be used. 
C\nlets you optimize this by defining a <span name=\"sum\">union</span>. A union\nlooks like a struct except that all of its fields overlap in memory.\n\n<aside name=\"sum\">\n\nIf you're familiar with a language in the ML family, structs and unions in C\nroughly mirror the difference between product and sum types, between tuples\nand algebraic data types.\n\n</aside>\n\n<img src=\"image/types-of-values/union.png\" alt=\"A union with two fields overlapping in memory.\" />\n\nThe size of a union is the size of its largest field. Since the fields all reuse\nthe same bits, you have to be very careful when working with them. If you store\ndata using one field and then access it using <span\nname=\"reinterpret\">another</span>, you will reinterpret what the underlying bits\nmean.\n\n<aside name=\"reinterpret\">\n\nUsing a union to interpret bits as different types is the quintessence of C. It\nopens up a number of clever optimizations and lets you slice and dice each byte\nof memory in ways that memory-safe languages disallow. But it is also wildly\nunsafe and will happily saw your fingers off if you don't watch out.\n\n</aside>\n\nAs the name \"tagged union\" implies, our new value representation combines these\ntwo parts into a single struct.\n\n^code value (2 before, 2 after)\n\nThere's a field for the type tag, and then a second field containing the union\nof all of the underlying values. On a 64-bit machine with a typical C compiler,\nthe layout looks like this:\n\n<aside name=\"as\">\n\nA smart language hacker gave me the idea to use \"as\" for the name of the union\nfield because it reads nicely, almost like a cast, when you pull the various\nvalues out.\n\n</aside>\n\n<img src=\"image/types-of-values/value.png\" alt=\"The full value struct, with the type and as fields next to each other in memory.\" />\n\nThe four-byte type tag comes first, then the union. Most architectures prefer\nvalues be aligned to their size. 
Since the union field contains an eight-byte\ndouble, the compiler adds four bytes of <span name=\"pad\">padding</span> after\nthe type field to keep that double on the nearest eight-byte boundary. That\nmeans we're effectively spending eight bytes on the type tag, which only needs\nto represent a number between zero and three. We could stuff the enum in a\nsmaller size, but all that would do is increase the padding.\n\n<aside name=\"pad\">\n\nWe could move the tag field *after* the union, but that doesn't help much\neither. Whenever we create an array of Values -- which is where most of our\nmemory usage for Values will be -- the C compiler will insert that same padding\n*between* each Value to keep the doubles aligned.\n\n</aside>\n\nSo our Values are 16 bytes, which seems a little large. We'll improve it\n[later][optimization]. In the meantime, they're still small enough to store on\nthe C stack and pass around by value. Lox's semantics allow that because the\nonly types we support so far are **immutable**. If we pass a copy of a Value\ncontaining the number three to some function, we don't need to worry about the\ncaller seeing modifications to the value. You can't \"modify\" three. It's three\nforever.\n\n[optimization]: optimization.html\n\n## Lox Values and C Values\n\nThat's our new value representation, but we aren't done. Right now, the rest of\nclox assumes Value is an alias for `double`. We have code that does a straight C\ncast from one to the other. That code is all broken now. So sad.\n\nWith our new representation, a Value can *contain* a double, but it's not\n*equivalent* to it. There is a mandatory conversion step to get from one to the\nother. We need to go through the code and insert those conversions to get clox\nworking again.\n\nWe'll implement these conversions as a handful of macros, one for each type and\noperation. 
First, to promote a native C value to a clox Value:\n\n^code value-macros (1 before, 2 after)\n\nEach one of these takes a C value of the appropriate type and produces a Value\nthat has the correct type tag and contains the underlying value. This hoists\nstatically typed values up into clox's dynamically typed universe. In order to\n*do* anything with a Value, though, we need to unpack it and get the C value\nback out.\n\n^code as-macros (1 before, 2 after)\n\n<aside name=\"as-null\">\n\nThere's no `AS_NIL` macro because there is only one `nil` value, so a Value with\ntype `VAL_NIL` doesn't carry any extra data.\n\n</aside>\n\n<span name=\"as-null\">These</span> macros go in the opposite direction. Given a\nValue of the right type, they unwrap it and return the corresponding raw C\nvalue. The \"right type\" part is important! These macros directly access the\nunion fields. If we were to do something like:\n\n```c\nValue value = BOOL_VAL(true);\ndouble number = AS_NUMBER(value);\n```\n\nThen we may open a smoldering portal to the Shadow Realm. It's not safe to use\nany of the `AS_` macros unless we know the Value contains the appropriate type.\nTo that end, we define a last few macros to check a Value's type.\n\n^code is-macros (1 before, 2 after)\n\n<span name=\"universe\">These</span> macros return `true` if the Value has that\ntype. Any time we call one of the `AS_` macros, we need to guard it behind a\ncall to one of these first. With these eight macros, we can now safely shuttle\ndata between Lox's dynamic world and C's static one.\n\n<aside name=\"universe\">\n\n<img src=\"image/types-of-values/universe.png\" alt=\"The earthly C firmament with the Lox heavens above.\" />\n\nThe `_VAL` macros lift a C value into the heavens. The `AS_` macros bring it\nback down.\n\n</aside>\n\n## Dynamically Typed Numbers\n\nWe've got our value representation and the tools to convert to and from it. 
All\nthat's left to get clox running again is to grind through the code and fix every\nplace where data moves across that boundary. This is one of those sections of\nthe book that isn't exactly mind-blowing, but I promised I'd show you every\nsingle line of code, so here we are.\n\nThe first values we create are the constants generated when we compile number\nliterals. After we convert the lexeme to a C double, we simply wrap it in a\nValue before storing it in the constant table.\n\n^code const-number-val (1 before, 1 after)\n\nOver in the runtime, we have a function to print values.\n\n^code print-number-value (1 before, 1 after)\n\nRight before we send the Value to `printf()`, we unwrap it and extract the\ndouble value. We'll revisit this function shortly to add the other types, but\nlet's get our existing code working first.\n\n### Unary negation and runtime errors\n\nThe next simplest operation is unary negation. It pops a value off the stack,\nnegates it, and pushes the result. Now that we have other types of values, we\ncan't assume the operand is a number anymore. The user could just as well do:\n\n```lox\nprint -false; // Uh...\n```\n\nWe need to handle that gracefully, which means it's time for *runtime errors*.\nBefore performing an operation that requires a certain type, we need to make\nsure the Value *is* that type.\n\nFor unary negation, the check looks like this:\n\n^code op-negate (1 before, 1 after)\n\nFirst, we check to see if the Value on top of the stack is a number. If it's\nnot, we report the runtime error and <span name=\"halt\">stop</span> the\ninterpreter. Otherwise, we keep going. Only after this validation do we unwrap\nthe operand, negate it, wrap the result and push it.\n\n<aside name=\"halt\">\n\nLox's approach to error-handling is rather... *spare*. All errors are fatal and\nimmediately halt the interpreter. There's no way for user code to recover from\nan error. 
If Lox were a real language, this is one of the first things I would\nremedy.\n\n</aside>\n\nTo access the Value, we use a new little function.\n\n^code peek\n\nIt returns a Value from the stack but doesn't <span name=\"peek\">pop</span> it.\nThe `distance` argument is how far down from the top of the stack to look: zero\nis the top, one is one slot down, etc.\n\n<aside name=\"peek\">\n\nWhy not just pop the operand and then validate it? We could do that. In later\nchapters, it will be important to leave operands on the stack to ensure the\ngarbage collector can find them if a collection is triggered in the middle of\nthe operation. I do the same thing here mostly out of habit.\n\n</aside>\n\nWe report the runtime error using a new function that we'll get a lot of mileage\nout of over the remainder of the book.\n\n^code runtime-error\n\nYou've certainly *called* variadic functions -- ones that take a varying number\nof arguments -- in C before: `printf()` is one. But you may not have *defined*\nyour own. This book isn't a C <span name=\"tutorial\">tutorial</span>, so I'll\nskim over it here, but basically the `...` and `va_list` stuff let us pass an\narbitrary number of arguments to `runtimeError()`. It forwards those on to\n`vfprintf()`, which is the flavor of `printf()` that takes an explicit\n`va_list`.\n\n<aside name=\"tutorial\">\n\nIf you are looking for a C tutorial, I love *[The C Programming Language][kr]*,\nusually called \"K&R\" in honor of its authors. It's not entirely up to date, but\nthe quality of the writing more than makes up for it.\n\n[kr]: https://www.cs.princeton.edu/~bwk/cbook.html\n\n</aside>\n\nCallers can pass a format string to `runtimeError()` followed by a number of\narguments, just like they can when calling `printf()` directly. `runtimeError()`\nthen formats and prints those arguments. 
We won't take advantage of that in this\nchapter, but later chapters will produce formatted runtime error messages that\ncontain other data.\n\nAfter we show the hopefully helpful error message, we tell the user which <span\nname=\"stack\">line</span> of their code was being executed when the error\noccurred. Since we left the tokens behind in the compiler, we look up the line\nin the debug information compiled into the chunk. If our compiler did its job\nright, that corresponds to the line of source code that the bytecode was\ncompiled from.\n\nWe look into the chunk's debug line array using the current bytecode instruction\nindex *minus one*. That's because the interpreter advances past each instruction\nbefore executing it. So, at the point that we call `runtimeError()`, the failed\ninstruction is the previous one.\n\n<aside name=\"stack\">\n\nJust showing the immediate line where the error occurred doesn't provide much\ncontext. Better would be a full stack trace. But we don't even have functions to\ncall yet, so there is no call stack to trace.\n\n</aside>\n\nIn order to use `va_list` and the macros for working with it, we need to bring\nin a standard header.\n\n^code include-stdarg (1 after)\n\nWith this, our VM can not only do the right thing when we negate numbers (like\nit used to before we broke it), but it also gracefully handles erroneous\nattempts to negate other types (which we don't have yet, but still).\n\n### Binary arithmetic operators\n\nWe have our runtime error machinery in place now, so fixing the binary operators\nis easier even though they're more complex. We support four binary operators\ntoday: `+`, `-`, `*`, and `/`. The only difference between them is which\nunderlying C operator they use. 
To minimize redundant code between the four\noperators, we wrapped up the commonality in a big preprocessor macro that takes\nthe operator token as a parameter.\n\nThat macro seemed like overkill a [few chapters ago][], but we get the benefit\nfrom it today. It lets us add the necessary type checking and conversions in one\nplace.\n\n[few chapters ago]: a-virtual-machine.html#binary-operators\n\n^code binary-op (1 before, 2 after)\n\nYeah, I realize that's a monster of a macro. It's not what I'd normally consider\ngood C practice, but let's roll with it. The changes are similar to what we did\nfor unary negate. First, we check that the two operands are both numbers. If\neither isn't, we report a runtime error and yank the ejection seat lever.\n\nIf the operands are fine, we pop them both and unwrap them. Then we apply the\ngiven operator, wrap the result, and push it back on the stack. Note that we\ndon't wrap the result by directly using `NUMBER_VAL()`. Instead, the wrapper to\nuse is passed in as a macro <span name=\"macro\">parameter</span>. For our\nexisting arithmetic operators, the result is a number, so we pass in the\n`NUMBER_VAL` macro.\n\n<aside name=\"macro\">\n\nDid you know you can pass macros as parameters to macros? Now you do!\n\n</aside>\n\n^code op-arithmetic (1 before, 1 after)\n\nSoon, I'll show you why we made the wrapping macro an argument.\n\n## Two New Types\n\nAll of our existing clox code is back in working order. Finally, it's time to\nadd some new types. We've got a running numeric calculator that now does a\nnumber of pointless paranoid runtime type checks. We can represent other types\ninternally, but there's no way for a user's program to ever create a Value of\none of those types.\n\nNot until now, that is. We'll start by adding compiler support for the three new\nliterals: `true`, `false`, and `nil`. 
They're all pretty simple, so we'll do all\nthree in a single batch.\n\nWith number literals, we had to deal with the fact that there are billions of\npossible numeric values. We attended to that by storing the literal's value in\nthe chunk's constant table and emitting a bytecode instruction that simply\nloaded that constant. We could do the same thing for the new types. We'd store,\nsay, `true`, in the constant table, and use an `OP_CONSTANT` to read it out.\n\nBut given that there are literally (heh) only three possible values we need to\nworry about with these new types, it's gratuitous -- and <span\nname=\"small\">slow!</span> -- to waste a two-byte instruction and a constant\ntable entry on them. Instead, we'll define three dedicated instructions to push\neach of these literals on the stack.\n\n<aside name=\"small\" class=\"bottom\">\n\nI'm not kidding about dedicated operations for certain constant values being\nfaster. A bytecode VM spends much of its execution time reading and decoding\ninstructions. The fewer, simpler instructions you need for a given piece of\nbehavior, the faster it goes. Short instructions dedicated to common operations\nare a classic optimization.\n\nFor example, the Java bytecode instruction set has dedicated instructions for\nloading 0.0, 1.0, 2.0, and the integer values from -1 through 5. (This ends up\nbeing a vestigial optimization given that most mature JVMs now JIT-compile the\nbytecode to machine code before execution anyway.)\n\n</aside>\n\n^code literal-ops (1 before, 1 after)\n\nOur scanner already treats `true`, `false`, and `nil` as keywords, so we can\nskip right to the parser. With our table-based Pratt parser, we just need to\nslot parser functions into the rows associated with those keyword token types.\nWe'll use the same function in all three slots. 
Here:\n\n^code table-false (1 before, 1 after)\n\nHere:\n\n^code table-true (1 before, 1 after)\n\nAnd here:\n\n^code table-nil (1 before, 1 after)\n\nWhen the parser encounters `false`, `nil`, or `true`, in prefix position, it\ncalls this new parser function:\n\n^code parse-literal\n\nSince `parsePrecedence()` has already consumed the keyword token, all we need to\ndo is output the proper instruction. We <span name=\"switch\">figure</span> that\nout based on the type of token we parsed. Our front end can now compile Boolean\nand nil literals to bytecode. Moving down the execution pipeline, we reach the\ninterpreter.\n\n<aside name=\"switch\">\n\nWe could have used separate parser functions for each literal and saved\nourselves a switch but that felt needlessly verbose to me. I think it's mostly a\nmatter of taste.\n\n</aside>\n\n^code interpret-literals (5 before, 1 after)\n\nThis is pretty self-explanatory. Each instruction summons the appropriate value\nand pushes it onto the stack. We shouldn't forget our disassembler either.\n\n^code disassemble-literals (2 before, 1 after)\n\nWith this in place, we can run this Earth-shattering program:\n\n```lox\ntrue\n```\n\nExcept that when the interpreter tries to print the result, it blows up. We need\nto extend `printValue()` to handle the new types too:\n\n^code print-value (1 before, 1 after)\n\nThere we go! Now we have some new types. They just aren't very useful yet. Aside\nfrom the literals, you can't really *do* anything with them. It will be a while\nbefore `nil` comes into play, but we can start putting Booleans to work in the\nlogical operators.\n\n### Logical not and falsiness\n\nThe simplest logical operator is our old exclamatory friend unary not.\n\n```lox\nprint !true; // \"false\"\n```\n\nThis new operation gets a new instruction.\n\n^code not-op (1 before, 1 after)\n\nWe can reuse the `unary()` parser function we wrote for unary negation to\ncompile a not expression. 
We just need to slot it into the parsing table.\n\n^code table-not (1 before, 1 after)\n\nBecause I knew we were going to do this, the `unary()` function already has a\nswitch on the token type to figure out which bytecode instruction to output. We\nmerely add another case.\n\n^code compile-not (1 before, 3 after)\n\nThat's it for the front end. Let's head over to the VM and conjure this\ninstruction into life.\n\n^code op-not (1 before, 1 after)\n\nLike our previous unary operator, it pops the one operand, performs the\noperation, and pushes the result. And, as we did there, we have to worry about\ndynamic typing. Taking the logical not of `true` is easy, but there's nothing\npreventing an unruly programmer from writing something like this:\n\n```lox\nprint !nil;\n```\n\nFor unary minus, we made it an error to negate anything that isn't a <span\nname=\"negate\">number</span>. But Lox, like most scripting languages, is more\npermissive when it comes to `!` and other contexts where a Boolean is expected.\nThe rule for how other types are handled is called \"falsiness\", and we implement\nit here:\n\n<aside name=\"negate\">\n\nNow I can't help but try to figure out what it would mean to negate other types\nof values. `nil` is probably its own negation, sort of like a weird pseudo-zero.\nNegating a string could, uh, reverse it?\n\n</aside>\n\n^code is-falsey\n\nLox follows Ruby in that `nil` and `false` are falsey and every other value\nbehaves like `true`. We've got a new instruction we can generate, so we also\nneed to be able to *un*generate it in the disassembler.\n\n^code disassemble-not (2 before, 1 after)\n\n### Equality and comparison operators\n\nThat wasn't too bad. Let's keep the momentum going and knock out the equality\nand comparison operators too: `==`, `!=`, `<`, `>`, `<=`, and `>=`. That covers\nall of the operators that return Boolean results except the logical operators\n`and` and `or`. 
Since those need to short-circuit (basically do a little\ncontrol flow) we aren't ready for them yet.\n\nHere are the new instructions for those operators:\n\n^code comparison-ops (1 before, 1 after)\n\nWait, only three? What about `!=`, `<=`, and `>=`? We could create instructions\nfor those too. Honestly, the VM would execute faster if we did, so we *should*\ndo that if the goal is performance.\n\nBut my main goal is to teach you about bytecode compilers. I want you to start\ninternalizing the idea that the bytecode instructions don't need to closely\nfollow the user's source code. The VM has total freedom to use whatever\ninstruction set and code sequences it wants as long as they have the right\nuser-visible behavior.\n\nThe expression `a != b` has the same semantics as `!(a == b)`, so the compiler\nis free to compile the former as if it were the latter. Instead of a dedicated\n`OP_NOT_EQUAL` instruction, it can output an `OP_EQUAL` followed by an `OP_NOT`.\nLikewise, `a <= b` is the <span name=\"same\">same</span> as `!(a > b)` and `a >=\nb` is `!(a < b)`. Thus, we only need three new instructions.\n\n<aside name=\"same\" class=\"bottom\">\n\n*Is* `a <= b` always the same as `!(a > b)`? According to [IEEE 754][], all\ncomparison operators return false when an operand is NaN. That means `NaN <= 1`\nis false and `NaN > 1` is also false. But our desugaring assumes the latter is\nalways the negation of the former.\n\nFor the book, we won't get hung up on this, but these kinds of details will\nmatter in your real language implementations.\n\n[ieee 754]: https://en.wikipedia.org/wiki/IEEE_754\n\n</aside>\n\nOver in the parser, though, we do have six new operators to slot into the parse\ntable. We use the same `binary()` parser function from before. 
Here's the row\nfor `!=`:\n\n^code table-equal (1 before, 1 after)\n\nThe remaining five operators are a little farther down in the table.\n\n^code table-comparisons (1 before, 1 after)\n\nInside `binary()` we already have a switch to generate the right bytecode for\neach token type. We add cases for the six new operators.\n\n^code comparison-operators (1 before, 1 after)\n\nThe `==`, `<`, and `>` operators output a single instruction. The others output\na pair of instructions, one to evalute the inverse operation, and then an\n`OP_NOT` to flip the result. Six operators for the price of three instructions!\n\nThat means over in the VM, our job is simpler. Equality is the most general\noperation.\n\n^code interpret-equal (1 before, 1 after)\n\nYou can evaluate `==` on any pair of objects, even objects of different types.\nThere's enough complexity that it makes sense to shunt that logic over to a\nseparate function. That function always returns a C `bool`, so we can safely\nwrap the result in a `BOOL_VAL`. The function relates to Values, so it lives\nover in the \"value\" module.\n\n^code values-equal-h (2 before, 1 after)\n\nAnd here's the implementation:\n\n^code values-equal\n\nFirst, we check the types. If the Values have <span\nname=\"equal\">different</span> types, they are definitely not equal. Otherwise,\nwe unwrap the two Values and compare them directly.\n\n<aside name=\"equal\">\n\nSome languages have \"implicit conversions\" where values of different types may\nbe considered equal if one can be converted to the other's type. For example,\nthe number 0 is equivalent to the string \"0\" in JavaScript. 
This looseness was a\nlarge enough source of pain that JS added a separate \"strict equality\" operator,\n`===`.\n\nPHP considers the strings \"1\" and \"01\" to be equivalent because both can be\nconverted to equivalent numbers, though the ultimate reason is because PHP was\ndesigned by a Lovecraftian eldritch god to destroy the mind.\n\nMost dynamically typed languages that have separate integer and floating-point\nnumber types consider values of different number types equal if the numeric\nvalues are the same (so, say, 1.0 is equal to 1), though even that seemingly\ninnocuous convenience can bite the unwary.\n\n</aside>\n\nFor each value type, we have a separate case that handles comparing the value\nitself. Given how similar the cases are, you might wonder why we can't simply\n`memcmp()` the two Value structs and be done with it. The problem is that\nbecause of padding and different-sized union fields, a Value contains unused\nbits. C gives no guarantee about what is in those, so it's possible that two\nequal Values actually differ in memory that isn't used.\n\n<img src=\"image/types-of-values/memcmp.png\" alt=\"The memory representations of two equal values that differ in unused bytes.\" />\n\n(You wouldn't believe how much pain I went through before learning this fact.)\n\nAnyway, as we add more types to clox, this function will grow new cases. For\nnow, these three are sufficient. The other comparison operators are easier since\nthey work only on numbers.\n\n^code interpret-comparison (3 before, 1 after)\n\nWe already extended the `BINARY_OP` macro to handle operators that return\nnon-numeric types. Now we get to use that. We pass in `BOOL_VAL` since the\nresult value type is Boolean. 
Otherwise, it's no different from plus or minus.\n\nAs always, the coda to today's aria is disassembling the new instructions.\n\n^code disassemble-comparison (2 before, 1 after)\n\nWith that, our numeric calculator has become something closer to a general\nexpression evaluator. Fire up clox and type in:\n\n```lox\n!(5 - 4 > 3 * 2 == !nil)\n```\n\nOK, I'll admit that's maybe not the most *useful* expression, but we're making\nprogress. We have one missing built-in type with its own literal form: strings.\nThose are much more complex because strings can vary in size. That tiny\ndifference turns out to have implications so large that we give strings [their\nvery own chapter][strings].\n\n[strings]: strings.html\n\n<div class=\"challenges\">\n\n## Challenges\n\n1. We could reduce our binary operators even further than we did here. Which\n   other instructions can you eliminate, and how would the compiler cope with\n   their absence?\n\n2. Conversely, we can improve the speed of our bytecode VM by adding more\n   specific instructions that correspond to higher-level operations. What\n   instructions would you define to speed up the kind of user code we added\n   support for in this chapter?\n\n</div>\n"
  },
  {
    "path": "book/welcome.md",
    "content": "This may be the beginning of a grand adventure. Programming languages encompass\na huge space to explore and play in. Plenty of room for your own creations to\nshare with others or just enjoy yourself. Brilliant computer scientists and\nsoftware engineers have spent entire careers traversing this land without ever\nreaching the end. If this book is your first entry into the country, welcome.\n\nThe pages of this book give you a guided tour through some of the world of\nlanguages. But before we strap on our hiking boots and venture out, we should\nfamiliarize ourselves with the territory. The chapters in this part introduce\nyou to the basic concepts used by programming languages and how those concepts\nare organized.\n\nWe will also get acquainted with Lox, the language we'll spend the rest of the\nbook implementing (twice).\n"
  },
  {
    "path": "c/chunk.c",
    "content": "//> Chunks of Bytecode chunk-c\n#include <stdlib.h>\n\n#include \"chunk.h\"\n//> chunk-c-include-memory\n#include \"memory.h\"\n//< chunk-c-include-memory\n//> Garbage Collection chunk-include-vm\n#include \"vm.h\"\n//< Garbage Collection chunk-include-vm\n\nvoid initChunk(Chunk* chunk) {\n  chunk->count = 0;\n  chunk->capacity = 0;\n  chunk->code = NULL;\n//> chunk-null-lines\n  chunk->lines = NULL;\n//< chunk-null-lines\n//> chunk-init-constant-array\n  initValueArray(&chunk->constants);\n//< chunk-init-constant-array\n}\n//> free-chunk\nvoid freeChunk(Chunk* chunk) {\n  FREE_ARRAY(uint8_t, chunk->code, chunk->capacity);\n//> chunk-free-lines\n  FREE_ARRAY(int, chunk->lines, chunk->capacity);\n//< chunk-free-lines\n//> chunk-free-constants\n  freeValueArray(&chunk->constants);\n//< chunk-free-constants\n  initChunk(chunk);\n}\n//< free-chunk\n/* Chunks of Bytecode write-chunk < Chunks of Bytecode write-chunk-with-line\nvoid writeChunk(Chunk* chunk, uint8_t byte) {\n*/\n//> write-chunk\n//> write-chunk-with-line\nvoid writeChunk(Chunk* chunk, uint8_t byte, int line) {\n//< write-chunk-with-line\n  if (chunk->capacity < chunk->count + 1) {\n    int oldCapacity = chunk->capacity;\n    chunk->capacity = GROW_CAPACITY(oldCapacity);\n    chunk->code = GROW_ARRAY(uint8_t, chunk->code,\n        oldCapacity, chunk->capacity);\n//> write-chunk-line\n    chunk->lines = GROW_ARRAY(int, chunk->lines,\n        oldCapacity, chunk->capacity);\n//< write-chunk-line\n  }\n\n  chunk->code[chunk->count] = byte;\n//> chunk-write-line\n  chunk->lines[chunk->count] = line;\n//< chunk-write-line\n  chunk->count++;\n}\n//< write-chunk\n//> add-constant\nint addConstant(Chunk* chunk, Value value) {\n//> Garbage Collection add-constant-push\n  push(value);\n//< Garbage Collection add-constant-push\n  writeValueArray(&chunk->constants, value);\n//> Garbage Collection add-constant-pop\n  pop();\n//< Garbage Collection add-constant-pop\n  return chunk->constants.count - 
1;\n}\n//< add-constant\n"
  },
  {
    "path": "c/chunk.h",
    "content": "//> Chunks of Bytecode chunk-h\n#ifndef clox_chunk_h\n#define clox_chunk_h\n\n#include \"common.h\"\n//> chunk-h-include-value\n#include \"value.h\"\n//< chunk-h-include-value\n//> op-enum\n\ntypedef enum {\n//> op-constant\n  OP_CONSTANT,\n//< op-constant\n//> Types of Values literal-ops\n  OP_NIL,\n  OP_TRUE,\n  OP_FALSE,\n//< Types of Values literal-ops\n//> Global Variables pop-op\n  OP_POP,\n//< Global Variables pop-op\n//> Local Variables get-local-op\n  OP_GET_LOCAL,\n//< Local Variables get-local-op\n//> Local Variables set-local-op\n  OP_SET_LOCAL,\n//< Local Variables set-local-op\n//> Global Variables get-global-op\n  OP_GET_GLOBAL,\n//< Global Variables get-global-op\n//> Global Variables define-global-op\n  OP_DEFINE_GLOBAL,\n//< Global Variables define-global-op\n//> Global Variables set-global-op\n  OP_SET_GLOBAL,\n//< Global Variables set-global-op\n//> Closures upvalue-ops\n  OP_GET_UPVALUE,\n  OP_SET_UPVALUE,\n//< Closures upvalue-ops\n//> Classes and Instances property-ops\n  OP_GET_PROPERTY,\n  OP_SET_PROPERTY,\n//< Classes and Instances property-ops\n//> Superclasses get-super-op\n  OP_GET_SUPER,\n//< Superclasses get-super-op\n//> Types of Values comparison-ops\n  OP_EQUAL,\n  OP_GREATER,\n  OP_LESS,\n//< Types of Values comparison-ops\n//> A Virtual Machine binary-ops\n  OP_ADD,\n  OP_SUBTRACT,\n  OP_MULTIPLY,\n  OP_DIVIDE,\n//> Types of Values not-op\n  OP_NOT,\n//< Types of Values not-op\n//< A Virtual Machine binary-ops\n//> A Virtual Machine negate-op\n  OP_NEGATE,\n//< A Virtual Machine negate-op\n//> Global Variables op-print\n  OP_PRINT,\n//< Global Variables op-print\n//> Jumping Back and Forth jump-op\n  OP_JUMP,\n//< Jumping Back and Forth jump-op\n//> Jumping Back and Forth jump-if-false-op\n  OP_JUMP_IF_FALSE,\n//< Jumping Back and Forth jump-if-false-op\n//> Jumping Back and Forth loop-op\n  OP_LOOP,\n//< Jumping Back and Forth loop-op\n//> Calls and Functions op-call\n  OP_CALL,\n//< Calls and Functions 
op-call\n//> Methods and Initializers invoke-op\n  OP_INVOKE,\n//< Methods and Initializers invoke-op\n//> Superclasses super-invoke-op\n  OP_SUPER_INVOKE,\n//< Superclasses super-invoke-op\n//> Closures closure-op\n  OP_CLOSURE,\n//< Closures closure-op\n//> Closures close-upvalue-op\n  OP_CLOSE_UPVALUE,\n//< Closures close-upvalue-op\n  OP_RETURN,\n//> Classes and Instances class-op\n  OP_CLASS,\n//< Classes and Instances class-op\n//> Superclasses inherit-op\n  OP_INHERIT,\n//< Superclasses inherit-op\n//> Methods and Initializers method-op\n  OP_METHOD\n//< Methods and Initializers method-op\n} OpCode;\n//< op-enum\n//> chunk-struct\n\ntypedef struct {\n//> count-and-capacity\n  int count;\n  int capacity;\n//< count-and-capacity\n  uint8_t* code;\n//> chunk-lines\n  int* lines;\n//< chunk-lines\n//> chunk-constants\n  ValueArray constants;\n//< chunk-constants\n} Chunk;\n//< chunk-struct\n//> init-chunk-h\n\nvoid initChunk(Chunk* chunk);\n//< init-chunk-h\n//> free-chunk-h\nvoid freeChunk(Chunk* chunk);\n//< free-chunk-h\n/* Chunks of Bytecode write-chunk-h < Chunks of Bytecode write-chunk-with-line-h\nvoid writeChunk(Chunk* chunk, uint8_t byte);\n*/\n//> write-chunk-with-line-h\nvoid writeChunk(Chunk* chunk, uint8_t byte, int line);\n//< write-chunk-with-line-h\n//> add-constant-h\nint addConstant(Chunk* chunk, Value value);\n//< add-constant-h\n\n#endif\n"
  },
  {
    "path": "c/clox.xcodeproj/project.pbxproj",
    "content": "// !$*UTF8*$!\n{\n\tarchiveVersion = 1;\n\tclasses = {\n\t};\n\tobjectVersion = 46;\n\tobjects = {\n\n/* Begin PBXBuildFile section */\n\t\t2905EA1B1CAC1C3900E258E5 /* memory.c in Sources */ = {isa = PBXBuildFile; fileRef = 2905EA191CAC1C3900E258E5 /* memory.c */; };\n\t\t293173A51D03628E0028CBCC /* chunk.c in Sources */ = {isa = PBXBuildFile; fileRef = 293173A31D03628E0028CBCC /* chunk.c */; };\n\t\t293173A81D0378530028CBCC /* value.c in Sources */ = {isa = PBXBuildFile; fileRef = 293173A61D0378530028CBCC /* value.c */; };\n\t\t2940770F1C8368CF0067320B /* vm.c in Sources */ = {isa = PBXBuildFile; fileRef = 2940770D1C8368CF0067320B /* vm.c */; };\n\t\t294077121C8369BC0067320B /* compiler.c in Sources */ = {isa = PBXBuildFile; fileRef = 294077101C8369BC0067320B /* compiler.c */; };\n\t\t296041FF1C5DCCD0007310F9 /* scanner.c in Sources */ = {isa = PBXBuildFile; fileRef = 296041FE1C5DCCD0007310F9 /* scanner.c */; };\n\t\t29815E3F1C5DCC3A004A67D8 /* main.c in Sources */ = {isa = PBXBuildFile; fileRef = 29815E3E1C5DCC3A004A67D8 /* main.c */; };\n\t\t2984DBA21C83FD540075BAC3 /* object.c in Sources */ = {isa = PBXBuildFile; fileRef = 2984DBA01C83FD540075BAC3 /* object.c */; };\n\t\t29C6CA711C85EBE6009617A9 /* debug.c in Sources */ = {isa = PBXBuildFile; fileRef = 29C6CA6F1C85EBE6009617A9 /* debug.c */; };\n\t\t29CD6FB01CB6A3430005D92B /* table.c in Sources */ = {isa = PBXBuildFile; fileRef = 29CD6FAE1CB6A3430005D92B /* table.c */; };\n/* End PBXBuildFile section */\n\n/* Begin PBXCopyFilesBuildPhase section */\n\t\t292D23761E10F6590044C66E /* CopyFiles */ = {\n\t\t\tisa = PBXCopyFilesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tdstPath = /usr/share/man/man1/;\n\t\t\tdstSubfolderSpec = 0;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 1;\n\t\t};\n\t\t292D239F1E10F6C30044C66E /* CopyFiles */ = {\n\t\t\tisa = PBXCopyFilesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tdstPath = 
/usr/share/man/man1/;\n\t\t\tdstSubfolderSpec = 0;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 1;\n\t\t};\n\t\t292D23BC1E10F6E70044C66E /* CopyFiles */ = {\n\t\t\tisa = PBXCopyFilesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tdstPath = /usr/share/man/man1/;\n\t\t\tdstSubfolderSpec = 0;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 1;\n\t\t};\n\t\t29815E321C5DCBF7004A67D8 /* CopyFiles */ = {\n\t\t\tisa = PBXCopyFilesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tdstPath = /usr/share/man/man1/;\n\t\t\tdstSubfolderSpec = 0;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 1;\n\t\t};\n/* End PBXCopyFilesBuildPhase section */\n\n/* Begin PBXFileReference section */\n\t\t2905EA191CAC1C3900E258E5 /* memory.c */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.c; path = memory.c; sourceTree = \"<group>\"; };\n\t\t2905EA1A1CAC1C3900E258E5 /* memory.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = memory.h; sourceTree = \"<group>\"; };\n\t\t2905EA1C1CAC1DFB00E258E5 /* common.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = common.h; sourceTree = \"<group>\"; };\n\t\t292D23781E10F6590044C66E /* chap14_chunks */ = {isa = PBXFileReference; explicitFileType = \"compiled.mach-o.executable\"; includeInIndex = 0; path = chap14_chunks; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t292D23A11E10F6C30044C66E /* chap15_virtual */ = {isa = PBXFileReference; explicitFileType = \"compiled.mach-o.executable\"; includeInIndex = 0; path = chap15_virtual; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t292D23BE1E10F6E70044C66E /* chap16_scanning */ = {isa = PBXFileReference; explicitFileType = \"compiled.mach-o.executable\"; includeInIndex = 0; path = chap16_scanning; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t293173A31D03628E0028CBCC /* chunk.c */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = 
sourcecode.c.c; path = chunk.c; sourceTree = \"<group>\"; };\n\t\t293173A41D03628E0028CBCC /* chunk.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = chunk.h; sourceTree = \"<group>\"; };\n\t\t293173A61D0378530028CBCC /* value.c */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.c; path = value.c; sourceTree = \"<group>\"; };\n\t\t293173A71D0378530028CBCC /* value.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = value.h; sourceTree = \"<group>\"; };\n\t\t2940770D1C8368CF0067320B /* vm.c */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.c; path = vm.c; sourceTree = \"<group>\"; };\n\t\t2940770E1C8368CF0067320B /* vm.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = vm.h; sourceTree = \"<group>\"; };\n\t\t294077101C8369BC0067320B /* compiler.c */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.c; path = compiler.c; sourceTree = \"<group>\"; };\n\t\t294077111C8369BC0067320B /* compiler.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = compiler.h; sourceTree = \"<group>\"; };\n\t\t296041FE1C5DCCD0007310F9 /* scanner.c */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.c; path = scanner.c; sourceTree = \"<group>\"; };\n\t\t29815E341C5DCBF7004A67D8 /* clox */ = {isa = PBXFileReference; explicitFileType = \"compiled.mach-o.executable\"; includeInIndex = 0; path = clox; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t29815E3E1C5DCC3A004A67D8 /* main.c */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.c; path = main.c; sourceTree = \"<group>\"; };\n\t\t29815E401C5DCCAC004A67D8 /* scanner.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = scanner.h; sourceTree = \"<group>\"; };\n\t\t2984DBA01C83FD540075BAC3 /* object.c */ = {isa = 
PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.c; path = object.c; sourceTree = \"<group>\"; };\n\t\t2984DBA11C83FD540075BAC3 /* object.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = object.h; sourceTree = \"<group>\"; };\n\t\t29C6CA6F1C85EBE6009617A9 /* debug.c */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.c; path = debug.c; sourceTree = \"<group>\"; };\n\t\t29C6CA701C85EBE6009617A9 /* debug.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = debug.h; sourceTree = \"<group>\"; };\n\t\t29CD6FAE1CB6A3430005D92B /* table.c */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.c; path = table.c; sourceTree = \"<group>\"; };\n\t\t29CD6FAF1CB6A3430005D92B /* table.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = table.h; sourceTree = \"<group>\"; };\n/* End PBXFileReference section */\n\n/* Begin PBXFrameworksBuildPhase section */\n\t\t292D23751E10F6590044C66E /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t292D239E1E10F6C30044C66E /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t292D23BB1E10F6E70044C66E /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t29815E311C5DCBF7004A67D8 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXFrameworksBuildPhase section */\n\n/* Begin PBXGroup section */\n\t\t29815E2B1C5DCBF7004A67D8 = 
{\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t293173A41D03628E0028CBCC /* chunk.h */,\n\t\t\t\t293173A31D03628E0028CBCC /* chunk.c */,\n\t\t\t\t2905EA1C1CAC1DFB00E258E5 /* common.h */,\n\t\t\t\t294077111C8369BC0067320B /* compiler.h */,\n\t\t\t\t294077101C8369BC0067320B /* compiler.c */,\n\t\t\t\t29C6CA701C85EBE6009617A9 /* debug.h */,\n\t\t\t\t29C6CA6F1C85EBE6009617A9 /* debug.c */,\n\t\t\t\t29815E3E1C5DCC3A004A67D8 /* main.c */,\n\t\t\t\t2905EA1A1CAC1C3900E258E5 /* memory.h */,\n\t\t\t\t2905EA191CAC1C3900E258E5 /* memory.c */,\n\t\t\t\t2984DBA11C83FD540075BAC3 /* object.h */,\n\t\t\t\t2984DBA01C83FD540075BAC3 /* object.c */,\n\t\t\t\t29815E401C5DCCAC004A67D8 /* scanner.h */,\n\t\t\t\t296041FE1C5DCCD0007310F9 /* scanner.c */,\n\t\t\t\t29CD6FAF1CB6A3430005D92B /* table.h */,\n\t\t\t\t29CD6FAE1CB6A3430005D92B /* table.c */,\n\t\t\t\t293173A71D0378530028CBCC /* value.h */,\n\t\t\t\t293173A61D0378530028CBCC /* value.c */,\n\t\t\t\t2940770E1C8368CF0067320B /* vm.h */,\n\t\t\t\t2940770D1C8368CF0067320B /* vm.c */,\n\t\t\t\t29815E351C5DCBF7004A67D8 /* Products */,\n\t\t\t);\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t29815E351C5DCBF7004A67D8 /* Products */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t29815E341C5DCBF7004A67D8 /* clox */,\n\t\t\t\t292D23781E10F6590044C66E /* chap14_chunks */,\n\t\t\t\t292D23A11E10F6C30044C66E /* chap15_virtual */,\n\t\t\t\t292D23BE1E10F6E70044C66E /* chap16_scanning */,\n\t\t\t);\n\t\t\tname = Products;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXGroup section */\n\n/* Begin PBXNativeTarget section */\n\t\t292D23771E10F6590044C66E /* chap14_chunks */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 292D237C1E10F6590044C66E /* Build configuration list for PBXNativeTarget \"chap14_chunks\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t292D23741E10F6590044C66E /* Sources */,\n\t\t\t\t292D23751E10F6590044C66E /* Frameworks */,\n\t\t\t\t292D23761E10F6590044C66E /* CopyFiles */,\n\t\t\t);\n\t\t\tbuildRules = 
(\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = chap14_chunks;\n\t\t\tproductName = chap14_chunks;\n\t\t\tproductReference = 292D23781E10F6590044C66E /* chap14_chunks */;\n\t\t\tproductType = \"com.apple.product-type.tool\";\n\t\t};\n\t\t292D23A01E10F6C30044C66E /* chap15_virtual */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 292D23A51E10F6C30044C66E /* Build configuration list for PBXNativeTarget \"chap15_virtual\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t292D239D1E10F6C30044C66E /* Sources */,\n\t\t\t\t292D239E1E10F6C30044C66E /* Frameworks */,\n\t\t\t\t292D239F1E10F6C30044C66E /* CopyFiles */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = chap15_virtual;\n\t\t\tproductName = chap15_virtual;\n\t\t\tproductReference = 292D23A11E10F6C30044C66E /* chap15_virtual */;\n\t\t\tproductType = \"com.apple.product-type.tool\";\n\t\t};\n\t\t292D23BD1E10F6E70044C66E /* chap16_scanning */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 292D23C21E10F6E70044C66E /* Build configuration list for PBXNativeTarget \"chap16_scanning\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t292D23BA1E10F6E70044C66E /* Sources */,\n\t\t\t\t292D23BB1E10F6E70044C66E /* Frameworks */,\n\t\t\t\t292D23BC1E10F6E70044C66E /* CopyFiles */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = chap16_scanning;\n\t\t\tproductName = chap16_scanning;\n\t\t\tproductReference = 292D23BE1E10F6E70044C66E /* chap16_scanning */;\n\t\t\tproductType = \"com.apple.product-type.tool\";\n\t\t};\n\t\t29815E331C5DCBF7004A67D8 /* clox */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 29815E3B1C5DCBF7004A67D8 /* Build configuration list for PBXNativeTarget \"clox\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t29815E301C5DCBF7004A67D8 /* Sources */,\n\t\t\t\t29815E311C5DCBF7004A67D8 /* Frameworks */,\n\t\t\t\t29815E321C5DCBF7004A67D8 /* CopyFiles */,\n\t\t\t);\n\t\t\tbuildRules = 
(\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = clox;\n\t\t\tproductName = cvox;\n\t\t\tproductReference = 29815E341C5DCBF7004A67D8 /* clox */;\n\t\t\tproductType = \"com.apple.product-type.tool\";\n\t\t};\n/* End PBXNativeTarget section */\n\n/* Begin PBXProject section */\n\t\t29815E2C1C5DCBF7004A67D8 /* Project object */ = {\n\t\t\tisa = PBXProject;\n\t\t\tattributes = {\n\t\t\t\tLastUpgradeCheck = 0830;\n\t\t\t\tORGANIZATIONNAME = \"Robert Nystrom\";\n\t\t\t\tTargetAttributes = {\n\t\t\t\t\t292D23771E10F6590044C66E = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 8.1;\n\t\t\t\t\t\tProvisioningStyle = Automatic;\n\t\t\t\t\t};\n\t\t\t\t\t292D23A01E10F6C30044C66E = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 8.1;\n\t\t\t\t\t\tProvisioningStyle = Automatic;\n\t\t\t\t\t};\n\t\t\t\t\t292D23BD1E10F6E70044C66E = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 8.1;\n\t\t\t\t\t\tProvisioningStyle = Automatic;\n\t\t\t\t\t};\n\t\t\t\t\t29815E331C5DCBF7004A67D8 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 6.4;\n\t\t\t\t\t};\n\t\t\t\t};\n\t\t\t};\n\t\t\tbuildConfigurationList = 29815E2F1C5DCBF7004A67D8 /* Build configuration list for PBXProject \"clox\" */;\n\t\t\tcompatibilityVersion = \"Xcode 3.2\";\n\t\t\tdevelopmentRegion = English;\n\t\t\thasScannedForEncodings = 0;\n\t\t\tknownRegions = (\n\t\t\t\tEnglish,\n\t\t\t\ten,\n\t\t\t);\n\t\t\tmainGroup = 29815E2B1C5DCBF7004A67D8;\n\t\t\tproductRefGroup = 29815E351C5DCBF7004A67D8 /* Products */;\n\t\t\tprojectDirPath = \"\";\n\t\t\tprojectRoot = \"\";\n\t\t\ttargets = (\n\t\t\t\t29815E331C5DCBF7004A67D8 /* clox */,\n\t\t\t\t292D23771E10F6590044C66E /* chap14_chunks */,\n\t\t\t\t292D23A01E10F6C30044C66E /* chap15_virtual */,\n\t\t\t\t292D23BD1E10F6E70044C66E /* chap16_scanning */,\n\t\t\t);\n\t\t};\n/* End PBXProject section */\n\n/* Begin PBXSourcesBuildPhase section */\n\t\t292D23741E10F6590044C66E /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = 
(\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t292D239D1E10F6C30044C66E /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t292D23BA1E10F6E70044C66E /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t29815E301C5DCBF7004A67D8 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t293173A51D03628E0028CBCC /* chunk.c in Sources */,\n\t\t\t\t29CD6FB01CB6A3430005D92B /* table.c in Sources */,\n\t\t\t\t2905EA1B1CAC1C3900E258E5 /* memory.c in Sources */,\n\t\t\t\t2984DBA21C83FD540075BAC3 /* object.c in Sources */,\n\t\t\t\t296041FF1C5DCCD0007310F9 /* scanner.c in Sources */,\n\t\t\t\t293173A81D0378530028CBCC /* value.c in Sources */,\n\t\t\t\t2940770F1C8368CF0067320B /* vm.c in Sources */,\n\t\t\t\t29C6CA711C85EBE6009617A9 /* debug.c in Sources */,\n\t\t\t\t294077121C8369BC0067320B /* compiler.c in Sources */,\n\t\t\t\t29815E3F1C5DCC3A004A67D8 /* main.c in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXSourcesBuildPhase section */\n\n/* Begin XCBuildConfiguration section */\n\t\t292D237D1E10F6590044C66E /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVES = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.12;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t292D237E1E10F6590044C66E /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCLANG_ANALYZER_NONNULL = 
YES;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVES = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.12;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t292D23A61E10F6C30044C66E /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVES = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.12;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t292D23A71E10F6C30044C66E /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVES = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.12;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t292D23C31E10F6E70044C66E /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVES = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.12;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t292D23C41E10F6E70044C66E /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = 
YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVES = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.12;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t29815E391C5DCBF7004A67D8 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++0x\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = dwarf;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu99;\n\t\t\t\tGCC_DYNAMIC_NO_PIC = NO;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 0;\n\t\t\t\tGCC_PREPROCESSOR_DEFINITIONS = (\n\t\t\t\t\t\"DEBUG=1\",\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t);\n\t\t\t\tGCC_SYMBOLS_PRIVATE_EXTERN = NO;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.11;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = YES;\n\t\t\t\tONLY_ACTIVE_ARCH = YES;\n\t\t\t\tSDKROOT = macosx;\n\t\t\t\tWARNING_CFLAGS = \"-Wno-gnu-label-as-value\";\n\t\t\t};\n\t\t\tname = 
Debug;\n\t\t};\n\t\t29815E3A1C5DCBF7004A67D8 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++0x\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu99;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 3;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.11;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tSDKROOT = macosx;\n\t\t\t\tWARNING_CFLAGS = \"-Wno-gnu-label-as-value\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t29815E3C1C5DCBF7004A67D8 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCLANG_WARN_SUSPICIOUS_IMPLICIT_CONVERSION = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = c99;\n\t\t\t\tGCC_TREAT_IMPLICIT_FUNCTION_DECLARATIONS_AS_ERRORS = YES;\n\t\t\t\tGCC_TREAT_INCOMPATIBLE_POINTER_TYPE_WARNINGS_AS_ERRORS = YES;\n\t\t\t\tGCC_TREAT_WARNINGS_AS_ERRORS = 
YES;\n\t\t\t\tGCC_WARN_ABOUT_MISSING_FIELD_INITIALIZERS = YES;\n\t\t\t\tGCC_WARN_PEDANTIC = YES;\n\t\t\t\tGCC_WARN_SHADOW = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.14;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t29815E3D1C5DCBF7004A67D8 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCLANG_WARN_SUSPICIOUS_IMPLICIT_CONVERSION = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = c99;\n\t\t\t\tGCC_TREAT_IMPLICIT_FUNCTION_DECLARATIONS_AS_ERRORS = YES;\n\t\t\t\tGCC_TREAT_INCOMPATIBLE_POINTER_TYPE_WARNINGS_AS_ERRORS = YES;\n\t\t\t\tGCC_TREAT_WARNINGS_AS_ERRORS = YES;\n\t\t\t\tGCC_WARN_ABOUT_MISSING_FIELD_INITIALIZERS = YES;\n\t\t\t\tGCC_WARN_PEDANTIC = YES;\n\t\t\t\tGCC_WARN_SHADOW = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.14;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n/* End XCBuildConfiguration section */\n\n/* Begin XCConfigurationList section */\n\t\t292D237C1E10F6590044C66E /* Build configuration list for PBXNativeTarget \"chap14_chunks\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t292D237D1E10F6590044C66E /* Debug */,\n\t\t\t\t292D237E1E10F6590044C66E /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t292D23A51E10F6C30044C66E /* Build configuration list for PBXNativeTarget \"chap15_virtual\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t292D23A61E10F6C30044C66E /* Debug */,\n\t\t\t\t292D23A71E10F6C30044C66E /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t292D23C21E10F6E70044C66E /* Build configuration list for PBXNativeTarget \"chap16_scanning\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t292D23C31E10F6E70044C66E /* Debug */,\n\t\t\t\t292D23C41E10F6E70044C66E /* Release 
*/,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t29815E2F1C5DCBF7004A67D8 /* Build configuration list for PBXProject \"clox\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t29815E391C5DCBF7004A67D8 /* Debug */,\n\t\t\t\t29815E3A1C5DCBF7004A67D8 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t29815E3B1C5DCBF7004A67D8 /* Build configuration list for PBXNativeTarget \"clox\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t29815E3C1C5DCBF7004A67D8 /* Debug */,\n\t\t\t\t29815E3D1C5DCBF7004A67D8 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n/* End XCConfigurationList section */\n\t};\n\trootObject = 29815E2C1C5DCBF7004A67D8 /* Project object */;\n}\n"
  },
  {
    "path": "c/clox.xcodeproj/project.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"self:clox.xcodeproj\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "c/clox.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "c/clox.xcodeproj/project.xcworkspace/xcshareddata/WorkspaceSettings.xcsettings",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>PreviewsEnabled</key>\n\t<false/>\n</dict>\n</plist>\n"
  },
  {
    "path": "c/clox.xcodeproj/xcshareddata/xcschemes/clox.xcscheme",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Scheme\n   LastUpgradeVersion = \"1130\"\n   version = \"1.3\">\n   <BuildAction\n      parallelizeBuildables = \"YES\"\n      buildImplicitDependencies = \"YES\">\n      <BuildActionEntries>\n         <BuildActionEntry\n            buildForTesting = \"YES\"\n            buildForRunning = \"YES\"\n            buildForProfiling = \"YES\"\n            buildForArchiving = \"YES\"\n            buildForAnalyzing = \"YES\">\n            <BuildableReference\n               BuildableIdentifier = \"primary\"\n               BlueprintIdentifier = \"29815E331C5DCBF7004A67D8\"\n               BuildableName = \"clox\"\n               BlueprintName = \"clox\"\n               ReferencedContainer = \"container:clox.xcodeproj\">\n            </BuildableReference>\n         </BuildActionEntry>\n      </BuildActionEntries>\n   </BuildAction>\n   <TestAction\n      buildConfiguration = \"Debug\"\n      selectedDebuggerIdentifier = \"Xcode.DebuggerFoundation.Debugger.LLDB\"\n      selectedLauncherIdentifier = \"Xcode.DebuggerFoundation.Launcher.LLDB\"\n      shouldUseLaunchSchemeArgsEnv = \"YES\">\n      <Testables>\n      </Testables>\n   </TestAction>\n   <LaunchAction\n      buildConfiguration = \"Debug\"\n      selectedDebuggerIdentifier = \"Xcode.DebuggerFoundation.Debugger.LLDB\"\n      selectedLauncherIdentifier = \"Xcode.DebuggerFoundation.Launcher.LLDB\"\n      launchStyle = \"0\"\n      useCustomWorkingDirectory = \"YES\"\n      customWorkingDirectory = \"/Users/bob/Dropbox/Writing/Crafting Interpreters/interpreters\"\n      ignoresPersistentStateOnLaunch = \"NO\"\n      debugDocumentVersioning = \"YES\"\n      debugServiceExtension = \"internal\"\n      allowLocationSimulation = \"YES\">\n      <BuildableProductRunnable\n         runnableDebuggingMode = \"0\">\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"29815E331C5DCBF7004A67D8\"\n            
BuildableName = \"clox\"\n            BlueprintName = \"clox\"\n            ReferencedContainer = \"container:clox.xcodeproj\">\n         </BuildableReference>\n      </BuildableProductRunnable>\n      <CommandLineArguments>\n         <CommandLineArgument\n            argument = \"test/closure/reuse_closure_slot.lox\"\n            isEnabled = \"YES\">\n         </CommandLineArgument>\n      </CommandLineArguments>\n   </LaunchAction>\n   <ProfileAction\n      buildConfiguration = \"Release\"\n      shouldUseLaunchSchemeArgsEnv = \"YES\"\n      savedToolIdentifier = \"\"\n      useCustomWorkingDirectory = \"NO\"\n      debugDocumentVersioning = \"YES\">\n      <BuildableProductRunnable\n         runnableDebuggingMode = \"0\">\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"29815E331C5DCBF7004A67D8\"\n            BuildableName = \"clox\"\n            BlueprintName = \"clox\"\n            ReferencedContainer = \"container:clox.xcodeproj\">\n         </BuildableReference>\n      </BuildableProductRunnable>\n   </ProfileAction>\n   <AnalyzeAction\n      buildConfiguration = \"Debug\">\n   </AnalyzeAction>\n   <ArchiveAction\n      buildConfiguration = \"Release\"\n      revealArchiveInOrganizer = \"YES\">\n   </ArchiveAction>\n</Scheme>\n"
  },
  {
    "path": "c/common.h",
    "content": "//> Chunks of Bytecode common-h\n#ifndef clox_common_h\n#define clox_common_h\n\n#include <stdbool.h>\n#include <stddef.h>\n#include <stdint.h>\n//> A Virtual Machine define-debug-trace\n\n//> Optimization define-nan-boxing\n#define NAN_BOXING\n//< Optimization define-nan-boxing\n//> Compiling Expressions define-debug-print-code\n#define DEBUG_PRINT_CODE\n//< Compiling Expressions define-debug-print-code\n#define DEBUG_TRACE_EXECUTION\n//< A Virtual Machine define-debug-trace\n//> Garbage Collection define-stress-gc\n\n#define DEBUG_STRESS_GC\n//< Garbage Collection define-stress-gc\n//> Garbage Collection define-log-gc\n#define DEBUG_LOG_GC\n//< Garbage Collection define-log-gc\n//> Local Variables uint8-count\n\n#define UINT8_COUNT (UINT8_MAX + 1)\n//< Local Variables uint8-count\n\n#endif\n//> omit\n// In the book, we show them defined, but for working on them locally,\n// we don't want them to be.\n#undef DEBUG_PRINT_CODE\n#undef DEBUG_TRACE_EXECUTION\n#undef DEBUG_STRESS_GC\n#undef DEBUG_LOG_GC\n//< omit\n"
  },
  {
    "path": "c/compiler.c",
    "content": "//> Scanning on Demand compiler-c\n#include <stdio.h>\n//> Compiling Expressions compiler-include-stdlib\n#include <stdlib.h>\n//< Compiling Expressions compiler-include-stdlib\n//> Local Variables compiler-include-string\n#include <string.h>\n//< Local Variables compiler-include-string\n\n#include \"common.h\"\n#include \"compiler.h\"\n//> Garbage Collection compiler-include-memory\n#include \"memory.h\"\n//< Garbage Collection compiler-include-memory\n#include \"scanner.h\"\n//> Compiling Expressions include-debug\n\n#ifdef DEBUG_PRINT_CODE\n#include \"debug.h\"\n#endif\n//< Compiling Expressions include-debug\n//> Compiling Expressions parser\n\ntypedef struct {\n  Token current;\n  Token previous;\n//> had-error-field\n  bool hadError;\n//< had-error-field\n//> panic-mode-field\n  bool panicMode;\n//< panic-mode-field\n} Parser;\n//> precedence\n\ntypedef enum {\n  PREC_NONE,\n  PREC_ASSIGNMENT,  // =\n  PREC_OR,          // or\n  PREC_AND,         // and\n  PREC_EQUALITY,    // == !=\n  PREC_COMPARISON,  // < > <= >=\n  PREC_TERM,        // + -\n  PREC_FACTOR,      // * /\n  PREC_UNARY,       // ! -\n  PREC_CALL,        // . 
()\n  PREC_PRIMARY\n} Precedence;\n//< precedence\n//> parse-fn-type\n\n//< parse-fn-type\n/* Compiling Expressions parse-fn-type < Global Variables parse-fn-type\ntypedef void (*ParseFn)();\n*/\n//> Global Variables parse-fn-type\ntypedef void (*ParseFn)(bool canAssign);\n//< Global Variables parse-fn-type\n//> parse-rule\n\ntypedef struct {\n  ParseFn prefix;\n  ParseFn infix;\n  Precedence precedence;\n} ParseRule;\n//< parse-rule\n//> Local Variables local-struct\n\ntypedef struct {\n  Token name;\n  int depth;\n//> Closures is-captured-field\n  bool isCaptured;\n//< Closures is-captured-field\n} Local;\n//< Local Variables local-struct\n//> Closures upvalue-struct\ntypedef struct {\n  uint8_t index;\n  bool isLocal;\n} Upvalue;\n//< Closures upvalue-struct\n//> Calls and Functions function-type-enum\ntypedef enum {\n  TYPE_FUNCTION,\n//> Methods and Initializers initializer-type-enum\n  TYPE_INITIALIZER,\n//< Methods and Initializers initializer-type-enum\n//> Methods and Initializers method-type-enum\n  TYPE_METHOD,\n//< Methods and Initializers method-type-enum\n  TYPE_SCRIPT\n} FunctionType;\n//< Calls and Functions function-type-enum\n//> Local Variables compiler-struct\n\n/* Local Variables compiler-struct < Calls and Functions enclosing-field\ntypedef struct {\n*/\n//> Calls and Functions enclosing-field\ntypedef struct Compiler {\n  struct Compiler* enclosing;\n//< Calls and Functions enclosing-field\n//> Calls and Functions function-fields\n  ObjFunction* function;\n  FunctionType type;\n\n//< Calls and Functions function-fields\n  Local locals[UINT8_COUNT];\n  int localCount;\n//> Closures upvalues-array\n  Upvalue upvalues[UINT8_COUNT];\n//< Closures upvalues-array\n  int scopeDepth;\n} Compiler;\n//< Local Variables compiler-struct\n//> Methods and Initializers class-compiler-struct\n\ntypedef struct ClassCompiler {\n  struct ClassCompiler* enclosing;\n//> Superclasses has-superclass\n  bool hasSuperclass;\n//< Superclasses has-superclass\n} 
ClassCompiler;\n//< Methods and Initializers class-compiler-struct\n\nParser parser;\n//< Compiling Expressions parser\n//> Local Variables current-compiler\nCompiler* current = NULL;\n//< Local Variables current-compiler\n//> Methods and Initializers current-class\nClassCompiler* currentClass = NULL;\n//< Methods and Initializers current-class\n//> Compiling Expressions compiling-chunk\n/* Compiling Expressions compiling-chunk < Calls and Functions current-chunk\nChunk* compilingChunk;\n\nstatic Chunk* currentChunk() {\n  return compilingChunk;\n}\n*/\n//> Calls and Functions current-chunk\n\nstatic Chunk* currentChunk() {\n  return &current->function->chunk;\n}\n//< Calls and Functions current-chunk\n\n//< Compiling Expressions compiling-chunk\n//> Compiling Expressions error-at\nstatic void errorAt(Token* token, const char* message) {\n//> check-panic-mode\n  if (parser.panicMode) return;\n//< check-panic-mode\n//> set-panic-mode\n  parser.panicMode = true;\n//< set-panic-mode\n  fprintf(stderr, \"[line %d] Error\", token->line);\n\n  if (token->type == TOKEN_EOF) {\n    fprintf(stderr, \" at end\");\n  } else if (token->type == TOKEN_ERROR) {\n    // Nothing.\n  } else {\n    fprintf(stderr, \" at '%.*s'\", token->length, token->start);\n  }\n\n  fprintf(stderr, \": %s\\n\", message);\n  parser.hadError = true;\n}\n//< Compiling Expressions error-at\n//> Compiling Expressions error\nstatic void error(const char* message) {\n  errorAt(&parser.previous, message);\n}\n//< Compiling Expressions error\n//> Compiling Expressions error-at-current\nstatic void errorAtCurrent(const char* message) {\n  errorAt(&parser.current, message);\n}\n//< Compiling Expressions error-at-current\n//> Compiling Expressions advance\n\nstatic void advance() {\n  parser.previous = parser.current;\n\n  for (;;) {\n    parser.current = scanToken();\n    if (parser.current.type != TOKEN_ERROR) break;\n\n    errorAtCurrent(parser.current.start);\n  }\n}\n//< Compiling Expressions 
advance\n//> Compiling Expressions consume\nstatic void consume(TokenType type, const char* message) {\n  if (parser.current.type == type) {\n    advance();\n    return;\n  }\n\n  errorAtCurrent(message);\n}\n//< Compiling Expressions consume\n//> Global Variables check\nstatic bool check(TokenType type) {\n  return parser.current.type == type;\n}\n//< Global Variables check\n//> Global Variables match\nstatic bool match(TokenType type) {\n  if (!check(type)) return false;\n  advance();\n  return true;\n}\n//< Global Variables match\n//> Compiling Expressions emit-byte\nstatic void emitByte(uint8_t byte) {\n  writeChunk(currentChunk(), byte, parser.previous.line);\n}\n//< Compiling Expressions emit-byte\n//> Compiling Expressions emit-bytes\nstatic void emitBytes(uint8_t byte1, uint8_t byte2) {\n  emitByte(byte1);\n  emitByte(byte2);\n}\n//< Compiling Expressions emit-bytes\n//> Jumping Back and Forth emit-loop\nstatic void emitLoop(int loopStart) {\n  emitByte(OP_LOOP);\n\n  int offset = currentChunk()->count - loopStart + 2;\n  if (offset > UINT16_MAX) error(\"Loop body too large.\");\n\n  emitByte((offset >> 8) & 0xff);\n  emitByte(offset & 0xff);\n}\n//< Jumping Back and Forth emit-loop\n//> Jumping Back and Forth emit-jump\nstatic int emitJump(uint8_t instruction) {\n  emitByte(instruction);\n  emitByte(0xff);\n  emitByte(0xff);\n  return currentChunk()->count - 2;\n}\n//< Jumping Back and Forth emit-jump\n//> Compiling Expressions emit-return\nstatic void emitReturn() {\n/* Calls and Functions return-nil < Methods and Initializers return-this\n  emitByte(OP_NIL);\n*/\n//> Methods and Initializers return-this\n  if (current->type == TYPE_INITIALIZER) {\n    emitBytes(OP_GET_LOCAL, 0);\n  } else {\n    emitByte(OP_NIL);\n  }\n\n//< Methods and Initializers return-this\n  emitByte(OP_RETURN);\n}\n//< Compiling Expressions emit-return\n//> Compiling Expressions make-constant\nstatic uint8_t makeConstant(Value value) {\n  int constant = addConstant(currentChunk(), 
value);\n  if (constant > UINT8_MAX) {\n    error(\"Too many constants in one chunk.\");\n    return 0;\n  }\n\n  return (uint8_t)constant;\n}\n//< Compiling Expressions make-constant\n//> Compiling Expressions emit-constant\nstatic void emitConstant(Value value) {\n  emitBytes(OP_CONSTANT, makeConstant(value));\n}\n//< Compiling Expressions emit-constant\n//> Jumping Back and Forth patch-jump\nstatic void patchJump(int offset) {\n  // -2 to adjust for the bytecode for the jump offset itself.\n  int jump = currentChunk()->count - offset - 2;\n\n  if (jump > UINT16_MAX) {\n    error(\"Too much code to jump over.\");\n  }\n\n  currentChunk()->code[offset] = (jump >> 8) & 0xff;\n  currentChunk()->code[offset + 1] = jump & 0xff;\n}\n//< Jumping Back and Forth patch-jump\n//> Local Variables init-compiler\n/* Local Variables init-compiler < Calls and Functions init-compiler\nstatic void initCompiler(Compiler* compiler) {\n*/\n//> Calls and Functions init-compiler\nstatic void initCompiler(Compiler* compiler, FunctionType type) {\n//> store-enclosing\n  compiler->enclosing = current;\n//< store-enclosing\n  compiler->function = NULL;\n  compiler->type = type;\n//< Calls and Functions init-compiler\n  compiler->localCount = 0;\n  compiler->scopeDepth = 0;\n//> Calls and Functions init-function\n  compiler->function = newFunction();\n//< Calls and Functions init-function\n  current = compiler;\n//> Calls and Functions init-function-name\n  if (type != TYPE_SCRIPT) {\n    current->function->name = copyString(parser.previous.start,\n                                         parser.previous.length);\n  }\n//< Calls and Functions init-function-name\n//> Calls and Functions init-function-slot\n\n  Local* local = &current->locals[current->localCount++];\n  local->depth = 0;\n//> Closures init-zero-local-is-captured\n  local->isCaptured = false;\n//< Closures init-zero-local-is-captured\n/* Calls and Functions init-function-slot < Methods and Initializers slot-zero\n  
local->name.start = \"\";\n  local->name.length = 0;\n*/\n//> Methods and Initializers slot-zero\n  if (type != TYPE_FUNCTION) {\n    local->name.start = \"this\";\n    local->name.length = 4;\n  } else {\n    local->name.start = \"\";\n    local->name.length = 0;\n  }\n//< Methods and Initializers slot-zero\n//< Calls and Functions init-function-slot\n}\n//< Local Variables init-compiler\n//> Compiling Expressions end-compiler\n/* Compiling Expressions end-compiler < Calls and Functions end-compiler\nstatic void endCompiler() {\n*/\n//> Calls and Functions end-compiler\nstatic ObjFunction* endCompiler() {\n//< Calls and Functions end-compiler\n  emitReturn();\n//> Calls and Functions end-function\n  ObjFunction* function = current->function;\n\n//< Calls and Functions end-function\n//> dump-chunk\n#ifdef DEBUG_PRINT_CODE\n  if (!parser.hadError) {\n/* Compiling Expressions dump-chunk < Calls and Functions disassemble-end\n    disassembleChunk(currentChunk(), \"code\");\n*/\n//> Calls and Functions disassemble-end\n    disassembleChunk(currentChunk(), function->name != NULL\n        ? 
function->name->chars : \"<script>\");\n//< Calls and Functions disassemble-end\n  }\n#endif\n//< dump-chunk\n//> Calls and Functions return-function\n\n//> restore-enclosing\n  current = current->enclosing;\n//< restore-enclosing\n  return function;\n//< Calls and Functions return-function\n}\n//< Compiling Expressions end-compiler\n//> Local Variables begin-scope\nstatic void beginScope() {\n  current->scopeDepth++;\n}\n//< Local Variables begin-scope\n//> Local Variables end-scope\nstatic void endScope() {\n  current->scopeDepth--;\n//> pop-locals\n\n  while (current->localCount > 0 &&\n         current->locals[current->localCount - 1].depth >\n            current->scopeDepth) {\n/* Local Variables pop-locals < Closures end-scope\n    emitByte(OP_POP);\n*/\n//> Closures end-scope\n    if (current->locals[current->localCount - 1].isCaptured) {\n      emitByte(OP_CLOSE_UPVALUE);\n    } else {\n      emitByte(OP_POP);\n    }\n//< Closures end-scope\n    current->localCount--;\n  }\n//< pop-locals\n}\n//< Local Variables end-scope\n//> Compiling Expressions forward-declarations\n\nstatic void expression();\n//> Global Variables forward-declarations\nstatic void statement();\nstatic void declaration();\n//< Global Variables forward-declarations\nstatic ParseRule* getRule(TokenType type);\nstatic void parsePrecedence(Precedence precedence);\n\n//< Compiling Expressions forward-declarations\n//> Global Variables identifier-constant\nstatic uint8_t identifierConstant(Token* name) {\n  return makeConstant(OBJ_VAL(copyString(name->start,\n                                         name->length)));\n}\n//< Global Variables identifier-constant\n//> Local Variables identifiers-equal\nstatic bool identifiersEqual(Token* a, Token* b) {\n  if (a->length != b->length) return false;\n  return memcmp(a->start, b->start, a->length) == 0;\n}\n//< Local Variables identifiers-equal\n//> Local Variables resolve-local\nstatic int resolveLocal(Compiler* compiler, Token* name) {\n  for (int 
i = compiler->localCount - 1; i >= 0; i--) {\n    Local* local = &compiler->locals[i];\n    if (identifiersEqual(name, &local->name)) {\n//> own-initializer-error\n      if (local->depth == -1) {\n        error(\"Can't read local variable in its own initializer.\");\n      }\n//< own-initializer-error\n      return i;\n    }\n  }\n\n  return -1;\n}\n//< Local Variables resolve-local\n//> Closures add-upvalue\nstatic int addUpvalue(Compiler* compiler, uint8_t index,\n                      bool isLocal) {\n  int upvalueCount = compiler->function->upvalueCount;\n//> existing-upvalue\n\n  for (int i = 0; i < upvalueCount; i++) {\n    Upvalue* upvalue = &compiler->upvalues[i];\n    if (upvalue->index == index && upvalue->isLocal == isLocal) {\n      return i;\n    }\n  }\n\n//< existing-upvalue\n//> too-many-upvalues\n  if (upvalueCount == UINT8_COUNT) {\n    error(\"Too many closure variables in function.\");\n    return 0;\n  }\n\n//< too-many-upvalues\n  compiler->upvalues[upvalueCount].isLocal = isLocal;\n  compiler->upvalues[upvalueCount].index = index;\n  return compiler->function->upvalueCount++;\n}\n//< Closures add-upvalue\n//> Closures resolve-upvalue\nstatic int resolveUpvalue(Compiler* compiler, Token* name) {\n  if (compiler->enclosing == NULL) return -1;\n\n  int local = resolveLocal(compiler->enclosing, name);\n  if (local != -1) {\n//> mark-local-captured\n    compiler->enclosing->locals[local].isCaptured = true;\n//< mark-local-captured\n    return addUpvalue(compiler, (uint8_t)local, true);\n  }\n\n//> resolve-upvalue-recurse\n  int upvalue = resolveUpvalue(compiler->enclosing, name);\n  if (upvalue != -1) {\n    return addUpvalue(compiler, (uint8_t)upvalue, false);\n  }\n  \n//< resolve-upvalue-recurse\n  return -1;\n}\n//< Closures resolve-upvalue\n//> Local Variables add-local\nstatic void addLocal(Token name) {\n//> too-many-locals\n  if (current->localCount == UINT8_COUNT) {\n    error(\"Too many local variables in function.\");\n    return;\n  
}\n\n//< too-many-locals\n  Local* local = &current->locals[current->localCount++];\n  local->name = name;\n/* Local Variables add-local < Local Variables declare-undefined\n  local->depth = current->scopeDepth;\n*/\n//> declare-undefined\n  local->depth = -1;\n//< declare-undefined\n//> Closures init-is-captured\n  local->isCaptured = false;\n//< Closures init-is-captured\n}\n//< Local Variables add-local\n//> Local Variables declare-variable\nstatic void declareVariable() {\n  if (current->scopeDepth == 0) return;\n\n  Token* name = &parser.previous;\n//> existing-in-scope\n  for (int i = current->localCount - 1; i >= 0; i--) {\n    Local* local = &current->locals[i];\n    if (local->depth != -1 && local->depth < current->scopeDepth) {\n      break; // [negative]\n    }\n    \n    if (identifiersEqual(name, &local->name)) {\n      error(\"Already a variable with this name in this scope.\");\n    }\n  }\n\n//< existing-in-scope\n  addLocal(*name);\n}\n//< Local Variables declare-variable\n//> Global Variables parse-variable\nstatic uint8_t parseVariable(const char* errorMessage) {\n  consume(TOKEN_IDENTIFIER, errorMessage);\n//> Local Variables parse-local\n\n  declareVariable();\n  if (current->scopeDepth > 0) return 0;\n\n//< Local Variables parse-local\n  return identifierConstant(&parser.previous);\n}\n//< Global Variables parse-variable\n//> Local Variables mark-initialized\nstatic void markInitialized() {\n//> Calls and Functions check-depth\n  if (current->scopeDepth == 0) return;\n//< Calls and Functions check-depth\n  current->locals[current->localCount - 1].depth =\n      current->scopeDepth;\n}\n//< Local Variables mark-initialized\n//> Global Variables define-variable\nstatic void defineVariable(uint8_t global) {\n//> Local Variables define-variable\n  if (current->scopeDepth > 0) {\n//> define-local\n    markInitialized();\n//< define-local\n    return;\n  }\n\n//< Local Variables define-variable\n  emitBytes(OP_DEFINE_GLOBAL, global);\n}\n//< Global 
Variables define-variable\n//> Calls and Functions argument-list\nstatic uint8_t argumentList() {\n  uint8_t argCount = 0;\n  if (!check(TOKEN_RIGHT_PAREN)) {\n    do {\n      expression();\n//> arg-limit\n      if (argCount == 255) {\n        error(\"Can't have more than 255 arguments.\");\n      }\n//< arg-limit\n      argCount++;\n    } while (match(TOKEN_COMMA));\n  }\n  consume(TOKEN_RIGHT_PAREN, \"Expect ')' after arguments.\");\n  return argCount;\n}\n//< Calls and Functions argument-list\n//> Jumping Back and Forth and\nstatic void and_(bool canAssign) {\n  int endJump = emitJump(OP_JUMP_IF_FALSE);\n\n  emitByte(OP_POP);\n  parsePrecedence(PREC_AND);\n\n  patchJump(endJump);\n}\n//< Jumping Back and Forth and\n//> Compiling Expressions binary\n/* Compiling Expressions binary < Global Variables binary\nstatic void binary() {\n*/\n//> Global Variables binary\nstatic void binary(bool canAssign) {\n//< Global Variables binary\n  TokenType operatorType = parser.previous.type;\n  ParseRule* rule = getRule(operatorType);\n  parsePrecedence((Precedence)(rule->precedence + 1));\n\n  switch (operatorType) {\n//> Types of Values comparison-operators\n    case TOKEN_BANG_EQUAL:    emitBytes(OP_EQUAL, OP_NOT); break;\n    case TOKEN_EQUAL_EQUAL:   emitByte(OP_EQUAL); break;\n    case TOKEN_GREATER:       emitByte(OP_GREATER); break;\n    case TOKEN_GREATER_EQUAL: emitBytes(OP_LESS, OP_NOT); break;\n    case TOKEN_LESS:          emitByte(OP_LESS); break;\n    case TOKEN_LESS_EQUAL:    emitBytes(OP_GREATER, OP_NOT); break;\n//< Types of Values comparison-operators\n    case TOKEN_PLUS:          emitByte(OP_ADD); break;\n    case TOKEN_MINUS:         emitByte(OP_SUBTRACT); break;\n    case TOKEN_STAR:          emitByte(OP_MULTIPLY); break;\n    case TOKEN_SLASH:         emitByte(OP_DIVIDE); break;\n    default: return; // Unreachable.\n  }\n}\n//< Compiling Expressions binary\n//> Calls and Functions compile-call\nstatic void call(bool canAssign) {\n  uint8_t argCount = 
argumentList();\n  emitBytes(OP_CALL, argCount);\n}\n//< Calls and Functions compile-call\n//> Classes and Instances compile-dot\nstatic void dot(bool canAssign) {\n  consume(TOKEN_IDENTIFIER, \"Expect property name after '.'.\");\n  uint8_t name = identifierConstant(&parser.previous);\n\n  if (canAssign && match(TOKEN_EQUAL)) {\n    expression();\n    emitBytes(OP_SET_PROPERTY, name);\n//> Methods and Initializers parse-call\n  } else if (match(TOKEN_LEFT_PAREN)) {\n    uint8_t argCount = argumentList();\n    emitBytes(OP_INVOKE, name);\n    emitByte(argCount);\n//< Methods and Initializers parse-call\n  } else {\n    emitBytes(OP_GET_PROPERTY, name);\n  }\n}\n//< Classes and Instances compile-dot\n//> Types of Values parse-literal\n/* Types of Values parse-literal < Global Variables parse-literal\nstatic void literal() {\n*/\n//> Global Variables parse-literal\nstatic void literal(bool canAssign) {\n//< Global Variables parse-literal\n  switch (parser.previous.type) {\n    case TOKEN_FALSE: emitByte(OP_FALSE); break;\n    case TOKEN_NIL: emitByte(OP_NIL); break;\n    case TOKEN_TRUE: emitByte(OP_TRUE); break;\n    default: return; // Unreachable.\n  }\n}\n//< Types of Values parse-literal\n//> Compiling Expressions grouping\n/* Compiling Expressions grouping < Global Variables grouping\nstatic void grouping() {\n*/\n//> Global Variables grouping\nstatic void grouping(bool canAssign) {\n//< Global Variables grouping\n  expression();\n  consume(TOKEN_RIGHT_PAREN, \"Expect ')' after expression.\");\n}\n//< Compiling Expressions grouping\n/* Compiling Expressions number < Global Variables number\nstatic void number() {\n*/\n//> Compiling Expressions number\n//> Global Variables number\nstatic void number(bool canAssign) {\n//< Global Variables number\n  double value = strtod(parser.previous.start, NULL);\n/* Compiling Expressions number < Types of Values const-number-val\n  emitConstant(value);\n*/\n//> Types of Values const-number-val\n  
emitConstant(NUMBER_VAL(value));\n//< Types of Values const-number-val\n}\n//< Compiling Expressions number\n//> Jumping Back and Forth or\nstatic void or_(bool canAssign) {\n  int elseJump = emitJump(OP_JUMP_IF_FALSE);\n  int endJump = emitJump(OP_JUMP);\n\n  patchJump(elseJump);\n  emitByte(OP_POP);\n\n  parsePrecedence(PREC_OR);\n  patchJump(endJump);\n}\n//< Jumping Back and Forth or\n/* Strings parse-string < Global Variables string\nstatic void string() {\n*/\n//> Strings parse-string\n//> Global Variables string\nstatic void string(bool canAssign) {\n//< Global Variables string\n  emitConstant(OBJ_VAL(copyString(parser.previous.start + 1,\n                                  parser.previous.length - 2)));\n}\n//< Strings parse-string\n/* Global Variables read-named-variable < Global Variables named-variable-signature\nstatic void namedVariable(Token name) {\n*/\n//> Global Variables named-variable-signature\nstatic void namedVariable(Token name, bool canAssign) {\n//< Global Variables named-variable-signature\n/* Global Variables read-named-variable < Local Variables named-local\n  uint8_t arg = identifierConstant(&name);\n*/\n//> Global Variables read-named-variable\n//> Local Variables named-local\n  uint8_t getOp, setOp;\n  int arg = resolveLocal(current, &name);\n  if (arg != -1) {\n    getOp = OP_GET_LOCAL;\n    setOp = OP_SET_LOCAL;\n//> Closures named-variable-upvalue\n  } else if ((arg = resolveUpvalue(current, &name)) != -1) {\n    getOp = OP_GET_UPVALUE;\n    setOp = OP_SET_UPVALUE;\n//< Closures named-variable-upvalue\n  } else {\n    arg = identifierConstant(&name);\n    getOp = OP_GET_GLOBAL;\n    setOp = OP_SET_GLOBAL;\n  }\n//< Local Variables named-local\n/* Global Variables read-named-variable < Global Variables named-variable\n  emitBytes(OP_GET_GLOBAL, arg);\n*/\n//> named-variable\n\n/* Global Variables named-variable < Global Variables named-variable-can-assign\n  if (match(TOKEN_EQUAL)) {\n*/\n//> named-variable-can-assign\n  if 
(canAssign && match(TOKEN_EQUAL)) {\n//< named-variable-can-assign\n    expression();\n/* Global Variables named-variable < Local Variables emit-set\n    emitBytes(OP_SET_GLOBAL, arg);\n*/\n//> Local Variables emit-set\n    emitBytes(setOp, (uint8_t)arg);\n//< Local Variables emit-set\n  } else {\n/* Global Variables named-variable < Local Variables emit-get\n    emitBytes(OP_GET_GLOBAL, arg);\n*/\n//> Local Variables emit-get\n    emitBytes(getOp, (uint8_t)arg);\n//< Local Variables emit-get\n  }\n//< named-variable\n}\n//< Global Variables read-named-variable\n/* Global Variables variable-without-assign < Global Variables variable\nstatic void variable() {\n  namedVariable(parser.previous);\n}\n*/\n//> Global Variables variable\nstatic void variable(bool canAssign) {\n  namedVariable(parser.previous, canAssign);\n}\n//< Global Variables variable\n//> Superclasses synthetic-token\nstatic Token syntheticToken(const char* text) {\n  Token token;\n  token.start = text;\n  token.length = (int)strlen(text);\n  return token;\n}\n//< Superclasses synthetic-token\n//> Superclasses super\nstatic void super_(bool canAssign) {\n//> super-errors\n  if (currentClass == NULL) {\n    error(\"Can't use 'super' outside of a class.\");\n  } else if (!currentClass->hasSuperclass) {\n    error(\"Can't use 'super' in a class with no superclass.\");\n  }\n\n//< super-errors\n  consume(TOKEN_DOT, \"Expect '.' 
after 'super'.\");\n  consume(TOKEN_IDENTIFIER, \"Expect superclass method name.\");\n  uint8_t name = identifierConstant(&parser.previous);\n//> super-get\n  \n  namedVariable(syntheticToken(\"this\"), false);\n/* Superclasses super-get < Superclasses super-invoke\n  namedVariable(syntheticToken(\"super\"), false);\n  emitBytes(OP_GET_SUPER, name);\n*/\n//< super-get\n//> super-invoke\n  if (match(TOKEN_LEFT_PAREN)) {\n    uint8_t argCount = argumentList();\n    namedVariable(syntheticToken(\"super\"), false);\n    emitBytes(OP_SUPER_INVOKE, name);\n    emitByte(argCount);\n  } else {\n    namedVariable(syntheticToken(\"super\"), false);\n    emitBytes(OP_GET_SUPER, name);\n  }\n//< super-invoke\n}\n//< Superclasses super\n//> Methods and Initializers this\nstatic void this_(bool canAssign) {\n//> this-outside-class\n  if (currentClass == NULL) {\n    error(\"Can't use 'this' outside of a class.\");\n    return;\n  }\n  \n//< this-outside-class\n  variable(false);\n} // [this]\n//< Methods and Initializers this\n//> Compiling Expressions unary\n/* Compiling Expressions unary < Global Variables unary\nstatic void unary() {\n*/\n//> Global Variables unary\nstatic void unary(bool canAssign) {\n//< Global Variables unary\n  TokenType operatorType = parser.previous.type;\n\n  // Compile the operand.\n/* Compiling Expressions unary < Compiling Expressions unary-operand\n  expression();\n*/\n//> unary-operand\n  parsePrecedence(PREC_UNARY);\n//< unary-operand\n\n  // Emit the operator instruction.\n  switch (operatorType) {\n//> Types of Values compile-not\n    case TOKEN_BANG: emitByte(OP_NOT); break;\n//< Types of Values compile-not\n    case TOKEN_MINUS: emitByte(OP_NEGATE); break;\n    default: return; // Unreachable.\n  }\n}\n//< Compiling Expressions unary\n//> Compiling Expressions rules\nParseRule rules[] = {\n/* Compiling Expressions rules < Calls and Functions infix-left-paren\n  [TOKEN_LEFT_PAREN]    = {grouping, NULL,   PREC_NONE},\n*/\n//> Calls and 
Functions infix-left-paren\n  [TOKEN_LEFT_PAREN]    = {grouping, call,   PREC_CALL},\n//< Calls and Functions infix-left-paren\n  [TOKEN_RIGHT_PAREN]   = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_LEFT_BRACE]    = {NULL,     NULL,   PREC_NONE}, // [big]\n  [TOKEN_RIGHT_BRACE]   = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_COMMA]         = {NULL,     NULL,   PREC_NONE},\n/* Compiling Expressions rules < Classes and Instances table-dot\n  [TOKEN_DOT]           = {NULL,     NULL,   PREC_NONE},\n*/\n//> Classes and Instances table-dot\n  [TOKEN_DOT]           = {NULL,     dot,    PREC_CALL},\n//< Classes and Instances table-dot\n  [TOKEN_MINUS]         = {unary,    binary, PREC_TERM},\n  [TOKEN_PLUS]          = {NULL,     binary, PREC_TERM},\n  [TOKEN_SEMICOLON]     = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_SLASH]         = {NULL,     binary, PREC_FACTOR},\n  [TOKEN_STAR]          = {NULL,     binary, PREC_FACTOR},\n/* Compiling Expressions rules < Types of Values table-not\n  [TOKEN_BANG]          = {NULL,     NULL,   PREC_NONE},\n*/\n//> Types of Values table-not\n  [TOKEN_BANG]          = {unary,    NULL,   PREC_NONE},\n//< Types of Values table-not\n/* Compiling Expressions rules < Types of Values table-equal\n  [TOKEN_BANG_EQUAL]    = {NULL,     NULL,   PREC_NONE},\n*/\n//> Types of Values table-equal\n  [TOKEN_BANG_EQUAL]    = {NULL,     binary, PREC_EQUALITY},\n//< Types of Values table-equal\n  [TOKEN_EQUAL]         = {NULL,     NULL,   PREC_NONE},\n/* Compiling Expressions rules < Types of Values table-comparisons\n  [TOKEN_EQUAL_EQUAL]   = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_GREATER]       = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_GREATER_EQUAL] = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_LESS]          = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_LESS_EQUAL]    = {NULL,     NULL,   PREC_NONE},\n*/\n//> Types of Values table-comparisons\n  [TOKEN_EQUAL_EQUAL]   = {NULL,     binary, PREC_EQUALITY},\n  [TOKEN_GREATER]       = {NULL,     binary, 
PREC_COMPARISON},\n  [TOKEN_GREATER_EQUAL] = {NULL,     binary, PREC_COMPARISON},\n  [TOKEN_LESS]          = {NULL,     binary, PREC_COMPARISON},\n  [TOKEN_LESS_EQUAL]    = {NULL,     binary, PREC_COMPARISON},\n//< Types of Values table-comparisons\n/* Compiling Expressions rules < Global Variables table-identifier\n  [TOKEN_IDENTIFIER]    = {NULL,     NULL,   PREC_NONE},\n*/\n//> Global Variables table-identifier\n  [TOKEN_IDENTIFIER]    = {variable, NULL,   PREC_NONE},\n//< Global Variables table-identifier\n/* Compiling Expressions rules < Strings table-string\n  [TOKEN_STRING]        = {NULL,     NULL,   PREC_NONE},\n*/\n//> Strings table-string\n  [TOKEN_STRING]        = {string,   NULL,   PREC_NONE},\n//< Strings table-string\n  [TOKEN_NUMBER]        = {number,   NULL,   PREC_NONE},\n/* Compiling Expressions rules < Jumping Back and Forth table-and\n  [TOKEN_AND]           = {NULL,     NULL,   PREC_NONE},\n*/\n//> Jumping Back and Forth table-and\n  [TOKEN_AND]           = {NULL,     and_,   PREC_AND},\n//< Jumping Back and Forth table-and\n  [TOKEN_CLASS]         = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_ELSE]          = {NULL,     NULL,   PREC_NONE},\n/* Compiling Expressions rules < Types of Values table-false\n  [TOKEN_FALSE]         = {NULL,     NULL,   PREC_NONE},\n*/\n//> Types of Values table-false\n  [TOKEN_FALSE]         = {literal,  NULL,   PREC_NONE},\n//< Types of Values table-false\n  [TOKEN_FOR]           = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_FUN]           = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_IF]            = {NULL,     NULL,   PREC_NONE},\n/* Compiling Expressions rules < Types of Values table-nil\n  [TOKEN_NIL]           = {NULL,     NULL,   PREC_NONE},\n*/\n//> Types of Values table-nil\n  [TOKEN_NIL]           = {literal,  NULL,   PREC_NONE},\n//< Types of Values table-nil\n/* Compiling Expressions rules < Jumping Back and Forth table-or\n  [TOKEN_OR]            = {NULL,     NULL,   PREC_NONE},\n*/\n//> Jumping Back and Forth 
table-or\n  [TOKEN_OR]            = {NULL,     or_,    PREC_OR},\n//< Jumping Back and Forth table-or\n  [TOKEN_PRINT]         = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_RETURN]        = {NULL,     NULL,   PREC_NONE},\n/* Compiling Expressions rules < Superclasses table-super\n  [TOKEN_SUPER]         = {NULL,     NULL,   PREC_NONE},\n*/\n//> Superclasses table-super\n  [TOKEN_SUPER]         = {super_,   NULL,   PREC_NONE},\n//< Superclasses table-super\n/* Compiling Expressions rules < Methods and Initializers table-this\n  [TOKEN_THIS]          = {NULL,     NULL,   PREC_NONE},\n*/\n//> Methods and Initializers table-this\n  [TOKEN_THIS]          = {this_,    NULL,   PREC_NONE},\n//< Methods and Initializers table-this\n/* Compiling Expressions rules < Types of Values table-true\n  [TOKEN_TRUE]          = {NULL,     NULL,   PREC_NONE},\n*/\n//> Types of Values table-true\n  [TOKEN_TRUE]          = {literal,  NULL,   PREC_NONE},\n//< Types of Values table-true\n  [TOKEN_VAR]           = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_WHILE]         = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_ERROR]         = {NULL,     NULL,   PREC_NONE},\n  [TOKEN_EOF]           = {NULL,     NULL,   PREC_NONE},\n};\n//< Compiling Expressions rules\n//> Compiling Expressions parse-precedence\nstatic void parsePrecedence(Precedence precedence) {\n/* Compiling Expressions parse-precedence < Compiling Expressions precedence-body\n  // What goes here?\n*/\n//> precedence-body\n  advance();\n  ParseFn prefixRule = getRule(parser.previous.type)->prefix;\n  if (prefixRule == NULL) {\n    error(\"Expect expression.\");\n    return;\n  }\n\n/* Compiling Expressions precedence-body < Global Variables prefix-rule\n  prefixRule();\n*/\n//> Global Variables prefix-rule\n  bool canAssign = precedence <= PREC_ASSIGNMENT;\n  prefixRule(canAssign);\n//< Global Variables prefix-rule\n//> infix\n\n  while (precedence <= getRule(parser.current.type)->precedence) {\n    advance();\n    ParseFn infixRule = 
getRule(parser.previous.type)->infix;\n/* Compiling Expressions infix < Global Variables infix-rule\n    infixRule();\n*/\n//> Global Variables infix-rule\n    infixRule(canAssign);\n//< Global Variables infix-rule\n  }\n//> Global Variables invalid-assign\n\n  if (canAssign && match(TOKEN_EQUAL)) {\n    error(\"Invalid assignment target.\");\n  }\n//< Global Variables invalid-assign\n//< infix\n//< precedence-body\n}\n//< Compiling Expressions parse-precedence\n//> Compiling Expressions get-rule\nstatic ParseRule* getRule(TokenType type) {\n  return &rules[type];\n}\n//< Compiling Expressions get-rule\n//> Compiling Expressions expression\nstatic void expression() {\n/* Compiling Expressions expression < Compiling Expressions expression-body\n  // What goes here?\n*/\n//> expression-body\n  parsePrecedence(PREC_ASSIGNMENT);\n//< expression-body\n}\n//< Compiling Expressions expression\n//> Local Variables block\nstatic void block() {\n  while (!check(TOKEN_RIGHT_BRACE) && !check(TOKEN_EOF)) {\n    declaration();\n  }\n\n  consume(TOKEN_RIGHT_BRACE, \"Expect '}' after block.\");\n}\n//< Local Variables block\n//> Calls and Functions compile-function\nstatic void function(FunctionType type) {\n  Compiler compiler;\n  initCompiler(&compiler, type);\n  beginScope(); // [no-end-scope]\n\n  consume(TOKEN_LEFT_PAREN, \"Expect '(' after function name.\");\n//> parameters\n  if (!check(TOKEN_RIGHT_PAREN)) {\n    do {\n      current->function->arity++;\n      if (current->function->arity > 255) {\n        errorAtCurrent(\"Can't have more than 255 parameters.\");\n      }\n      uint8_t constant = parseVariable(\"Expect parameter name.\");\n      defineVariable(constant);\n    } while (match(TOKEN_COMMA));\n  }\n//< parameters\n  consume(TOKEN_RIGHT_PAREN, \"Expect ')' after parameters.\");\n  consume(TOKEN_LEFT_BRACE, \"Expect '{' before function body.\");\n  block();\n\n  ObjFunction* function = endCompiler();\n/* Calls and Functions compile-function < Closures 
emit-closure\n  emitBytes(OP_CONSTANT, makeConstant(OBJ_VAL(function)));\n*/\n//> Closures emit-closure\n  emitBytes(OP_CLOSURE, makeConstant(OBJ_VAL(function)));\n//< Closures emit-closure\n//> Closures capture-upvalues\n\n  for (int i = 0; i < function->upvalueCount; i++) {\n    emitByte(compiler.upvalues[i].isLocal ? 1 : 0);\n    emitByte(compiler.upvalues[i].index);\n  }\n//< Closures capture-upvalues\n}\n//< Calls and Functions compile-function\n//> Methods and Initializers method\nstatic void method() {\n  consume(TOKEN_IDENTIFIER, \"Expect method name.\");\n  uint8_t constant = identifierConstant(&parser.previous);\n//> method-body\n\n//< method-body\n/* Methods and Initializers method-body < Methods and Initializers method-type\n  FunctionType type = TYPE_FUNCTION;\n*/\n//> method-type\n  FunctionType type = TYPE_METHOD;\n//< method-type\n//> initializer-name\n  if (parser.previous.length == 4 &&\n      memcmp(parser.previous.start, \"init\", 4) == 0) {\n    type = TYPE_INITIALIZER;\n  }\n  \n//< initializer-name\n//> method-body\n  function(type);\n//< method-body\n  emitBytes(OP_METHOD, constant);\n}\n//< Methods and Initializers method\n//> Classes and Instances class-declaration\nstatic void classDeclaration() {\n  consume(TOKEN_IDENTIFIER, \"Expect class name.\");\n//> Methods and Initializers class-name\n  Token className = parser.previous;\n//< Methods and Initializers class-name\n  uint8_t nameConstant = identifierConstant(&parser.previous);\n  declareVariable();\n\n  emitBytes(OP_CLASS, nameConstant);\n  defineVariable(nameConstant);\n\n//> Methods and Initializers create-class-compiler\n  ClassCompiler classCompiler;\n//> Superclasses init-has-superclass\n  classCompiler.hasSuperclass = false;\n//< Superclasses init-has-superclass\n  classCompiler.enclosing = currentClass;\n  currentClass = &classCompiler;\n\n//< Methods and Initializers create-class-compiler\n//> Superclasses compile-superclass\n  if (match(TOKEN_LESS)) {\n    
consume(TOKEN_IDENTIFIER, \"Expect superclass name.\");\n    variable(false);\n//> inherit-self\n\n    if (identifiersEqual(&className, &parser.previous)) {\n      error(\"A class can't inherit from itself.\");\n    }\n\n//< inherit-self\n//> superclass-variable\n    beginScope();\n    addLocal(syntheticToken(\"super\"));\n    defineVariable(0);\n    \n//< superclass-variable\n    namedVariable(className, false);\n    emitByte(OP_INHERIT);\n//> set-has-superclass\n    classCompiler.hasSuperclass = true;\n//< set-has-superclass\n  }\n  \n//< Superclasses compile-superclass\n//> Methods and Initializers load-class\n  namedVariable(className, false);\n//< Methods and Initializers load-class\n  consume(TOKEN_LEFT_BRACE, \"Expect '{' before class body.\");\n//> Methods and Initializers class-body\n  while (!check(TOKEN_RIGHT_BRACE) && !check(TOKEN_EOF)) {\n    method();\n  }\n//< Methods and Initializers class-body\n  consume(TOKEN_RIGHT_BRACE, \"Expect '}' after class body.\");\n//> Methods and Initializers pop-class\n  emitByte(OP_POP);\n//< Methods and Initializers pop-class\n//> Superclasses end-superclass-scope\n\n  if (classCompiler.hasSuperclass) {\n    endScope();\n  }\n//< Superclasses end-superclass-scope\n//> Methods and Initializers pop-enclosing\n\n  currentClass = currentClass->enclosing;\n//< Methods and Initializers pop-enclosing\n}\n//< Classes and Instances class-declaration\n//> Calls and Functions fun-declaration\nstatic void funDeclaration() {\n  uint8_t global = parseVariable(\"Expect function name.\");\n  markInitialized();\n  function(TYPE_FUNCTION);\n  defineVariable(global);\n}\n//< Calls and Functions fun-declaration\n//> Global Variables var-declaration\nstatic void varDeclaration() {\n  uint8_t global = parseVariable(\"Expect variable name.\");\n\n  if (match(TOKEN_EQUAL)) {\n    expression();\n  } else {\n    emitByte(OP_NIL);\n  }\n  consume(TOKEN_SEMICOLON,\n          \"Expect ';' after variable declaration.\");\n\n  
defineVariable(global);\n}\n//< Global Variables var-declaration\n//> Global Variables expression-statement\nstatic void expressionStatement() {\n  expression();\n  consume(TOKEN_SEMICOLON, \"Expect ';' after expression.\");\n  emitByte(OP_POP);\n}\n//< Global Variables expression-statement\n//> Jumping Back and Forth for-statement\nstatic void forStatement() {\n//> for-begin-scope\n  beginScope();\n//< for-begin-scope\n  consume(TOKEN_LEFT_PAREN, \"Expect '(' after 'for'.\");\n/* Jumping Back and Forth for-statement < Jumping Back and Forth for-initializer\n  consume(TOKEN_SEMICOLON, \"Expect ';'.\");\n*/\n//> for-initializer\n  if (match(TOKEN_SEMICOLON)) {\n    // No initializer.\n  } else if (match(TOKEN_VAR)) {\n    varDeclaration();\n  } else {\n    expressionStatement();\n  }\n//< for-initializer\n\n  int loopStart = currentChunk()->count;\n/* Jumping Back and Forth for-statement < Jumping Back and Forth for-exit\n  consume(TOKEN_SEMICOLON, \"Expect ';'.\");\n*/\n//> for-exit\n  int exitJump = -1;\n  if (!match(TOKEN_SEMICOLON)) {\n    expression();\n    consume(TOKEN_SEMICOLON, \"Expect ';' after loop condition.\");\n\n    // Jump out of the loop if the condition is false.\n    exitJump = emitJump(OP_JUMP_IF_FALSE);\n    emitByte(OP_POP); // Condition.\n  }\n\n//< for-exit\n/* Jumping Back and Forth for-statement < Jumping Back and Forth for-increment\n  consume(TOKEN_RIGHT_PAREN, \"Expect ')' after for clauses.\");\n*/\n//> for-increment\n  if (!match(TOKEN_RIGHT_PAREN)) {\n    int bodyJump = emitJump(OP_JUMP);\n    int incrementStart = currentChunk()->count;\n    expression();\n    emitByte(OP_POP);\n    consume(TOKEN_RIGHT_PAREN, \"Expect ')' after for clauses.\");\n\n    emitLoop(loopStart);\n    loopStart = incrementStart;\n    patchJump(bodyJump);\n  }\n//< for-increment\n\n  statement();\n  emitLoop(loopStart);\n//> exit-jump\n\n  if (exitJump != -1) {\n    patchJump(exitJump);\n    emitByte(OP_POP); // Condition.\n  }\n\n//< exit-jump\n//> 
for-end-scope\n  endScope();\n//< for-end-scope\n}\n//< Jumping Back and Forth for-statement\n//> Jumping Back and Forth if-statement\nstatic void ifStatement() {\n  consume(TOKEN_LEFT_PAREN, \"Expect '(' after 'if'.\");\n  expression();\n  consume(TOKEN_RIGHT_PAREN, \"Expect ')' after condition.\"); // [paren]\n\n  int thenJump = emitJump(OP_JUMP_IF_FALSE);\n//> pop-then\n  emitByte(OP_POP);\n//< pop-then\n  statement();\n\n//> jump-over-else\n  int elseJump = emitJump(OP_JUMP);\n\n//< jump-over-else\n  patchJump(thenJump);\n//> pop-end\n  emitByte(OP_POP);\n//< pop-end\n//> compile-else\n\n  if (match(TOKEN_ELSE)) statement();\n//< compile-else\n//> patch-else\n  patchJump(elseJump);\n//< patch-else\n}\n//< Jumping Back and Forth if-statement\n//> Global Variables print-statement\nstatic void printStatement() {\n  expression();\n  consume(TOKEN_SEMICOLON, \"Expect ';' after value.\");\n  emitByte(OP_PRINT);\n}\n//< Global Variables print-statement\n//> Calls and Functions return-statement\nstatic void returnStatement() {\n//> return-from-script\n  if (current->type == TYPE_SCRIPT) {\n    error(\"Can't return from top-level code.\");\n  }\n\n//< return-from-script\n  if (match(TOKEN_SEMICOLON)) {\n    emitReturn();\n  } else {\n//> Methods and Initializers return-from-init\n    if (current->type == TYPE_INITIALIZER) {\n      error(\"Can't return a value from an initializer.\");\n    }\n\n//< Methods and Initializers return-from-init\n    expression();\n    consume(TOKEN_SEMICOLON, \"Expect ';' after return value.\");\n    emitByte(OP_RETURN);\n  }\n}\n//< Calls and Functions return-statement\n//> Jumping Back and Forth while-statement\nstatic void whileStatement() {\n//> loop-start\n  int loopStart = currentChunk()->count;\n//< loop-start\n  consume(TOKEN_LEFT_PAREN, \"Expect '(' after 'while'.\");\n  expression();\n  consume(TOKEN_RIGHT_PAREN, \"Expect ')' after condition.\");\n\n  int exitJump = emitJump(OP_JUMP_IF_FALSE);\n  emitByte(OP_POP);\n  
statement();\n//> loop\n  emitLoop(loopStart);\n//< loop\n\n  patchJump(exitJump);\n  emitByte(OP_POP);\n}\n//< Jumping Back and Forth while-statement\n//> Global Variables synchronize\nstatic void synchronize() {\n  parser.panicMode = false;\n\n  while (parser.current.type != TOKEN_EOF) {\n    if (parser.previous.type == TOKEN_SEMICOLON) return;\n    switch (parser.current.type) {\n      case TOKEN_CLASS:\n      case TOKEN_FUN:\n      case TOKEN_VAR:\n      case TOKEN_FOR:\n      case TOKEN_IF:\n      case TOKEN_WHILE:\n      case TOKEN_PRINT:\n      case TOKEN_RETURN:\n        return;\n\n      default:\n        ; // Do nothing.\n    }\n\n    advance();\n  }\n}\n//< Global Variables synchronize\n//> Global Variables declaration\nstatic void declaration() {\n//> Classes and Instances match-class\n  if (match(TOKEN_CLASS)) {\n    classDeclaration();\n/* Calls and Functions match-fun < Classes and Instances match-class\n  if (match(TOKEN_FUN)) {\n*/\n  } else if (match(TOKEN_FUN)) {\n//< Classes and Instances match-class\n//> Calls and Functions match-fun\n    funDeclaration();\n/* Global Variables match-var < Calls and Functions match-fun\n  if (match(TOKEN_VAR)) {\n*/\n  } else if (match(TOKEN_VAR)) {\n//< Calls and Functions match-fun\n//> match-var\n    varDeclaration();\n  } else {\n    statement();\n  }\n//< match-var\n/* Global Variables declaration < Global Variables match-var\n  statement();\n*/\n//> call-synchronize\n\n  if (parser.panicMode) synchronize();\n//< call-synchronize\n}\n//< Global Variables declaration\n//> Global Variables statement\nstatic void statement() {\n  if (match(TOKEN_PRINT)) {\n    printStatement();\n//> Jumping Back and Forth parse-for\n  } else if (match(TOKEN_FOR)) {\n    forStatement();\n//< Jumping Back and Forth parse-for\n//> Jumping Back and Forth parse-if\n  } else if (match(TOKEN_IF)) {\n    ifStatement();\n//< Jumping Back and Forth parse-if\n//> Calls and Functions match-return\n  } else if (match(TOKEN_RETURN)) {\n    
returnStatement();\n//< Calls and Functions match-return\n//> Jumping Back and Forth parse-while\n  } else if (match(TOKEN_WHILE)) {\n    whileStatement();\n//< Jumping Back and Forth parse-while\n//> Local Variables parse-block\n  } else if (match(TOKEN_LEFT_BRACE)) {\n    beginScope();\n    block();\n    endScope();\n//< Local Variables parse-block\n//> parse-expressions-statement\n  } else {\n    expressionStatement();\n//< parse-expressions-statement\n  }\n}\n//< Global Variables statement\n\n/* Scanning on Demand compiler-c < Compiling Expressions compile-signature\nvoid compile(const char* source) {\n*/\n/* Compiling Expressions compile-signature < Calls and Functions compile-signature\nbool compile(const char* source, Chunk* chunk) {\n*/\n//> Calls and Functions compile-signature\nObjFunction* compile(const char* source) {\n//< Calls and Functions compile-signature\n  initScanner(source);\n/* Scanning on Demand dump-tokens < Compiling Expressions compile-chunk\n  int line = -1;\n  for (;;) {\n    Token token = scanToken();\n    if (token.line != line) {\n      printf(\"%4d \", token.line);\n      line = token.line;\n    } else {\n      printf(\"   | \");\n    }\n    printf(\"%2d '%.*s'\\n\", token.type, token.length, token.start); // [format]\n\n    if (token.type == TOKEN_EOF) break;\n  }\n*/\n//> Local Variables compiler\n  Compiler compiler;\n//< Local Variables compiler\n/* Local Variables compiler < Calls and Functions call-init-compiler\n  initCompiler(&compiler);\n*/\n//> Calls and Functions call-init-compiler\n  initCompiler(&compiler, TYPE_SCRIPT);\n//< Calls and Functions call-init-compiler\n/* Compiling Expressions init-compile-chunk < Calls and Functions call-init-compiler\n  compilingChunk = chunk;\n*/\n//> Compiling Expressions compile-chunk\n//> init-parser-error\n\n  parser.hadError = false;\n  parser.panicMode = false;\n\n//< init-parser-error\n  advance();\n//< Compiling Expressions compile-chunk\n/* Compiling Expressions compile-chunk < 
Global Variables compile\n  expression();\n  consume(TOKEN_EOF, \"Expect end of expression.\");\n*/\n//> Global Variables compile\n\n  while (!match(TOKEN_EOF)) {\n    declaration();\n  }\n\n//< Global Variables compile\n/* Compiling Expressions finish-compile < Calls and Functions call-end-compiler\n  endCompiler();\n*/\n/* Compiling Expressions return-had-error < Calls and Functions call-end-compiler\n  return !parser.hadError;\n*/\n//> Calls and Functions call-end-compiler\n  ObjFunction* function = endCompiler();\n  return parser.hadError ? NULL : function;\n//< Calls and Functions call-end-compiler\n}\n//> Garbage Collection mark-compiler-roots\nvoid markCompilerRoots() {\n  Compiler* compiler = current;\n  while (compiler != NULL) {\n    markObject((Obj*)compiler->function);\n    compiler = compiler->enclosing;\n  }\n}\n//< Garbage Collection mark-compiler-roots\n"
  },
  {
    "path": "c/compiler.h",
    "content": "//> Scanning on Demand compiler-h\n#ifndef clox_compiler_h\n#define clox_compiler_h\n\n//> Strings compiler-include-object\n#include \"object.h\"\n//< Strings compiler-include-object\n//> Compiling Expressions compile-h\n#include \"vm.h\"\n\n//< Compiling Expressions compile-h\n/* Scanning on Demand compiler-h < Compiling Expressions compile-h\nvoid compile(const char* source);\n*/\n/* Compiling Expressions compile-h < Calls and Functions compile-h\nbool compile(const char* source, Chunk* chunk);\n*/\n//> Calls and Functions compile-h\nObjFunction* compile(const char* source);\n//< Calls and Functions compile-h\n//> Garbage Collection mark-compiler-roots-h\nvoid markCompilerRoots();\n//< Garbage Collection mark-compiler-roots-h\n\n#endif\n"
  },
  {
    "path": "c/debug.c",
    "content": "//> Chunks of Bytecode debug-c\n#include <stdio.h>\n\n#include \"debug.h\"\n//> Closures debug-include-object\n#include \"object.h\"\n//< Closures debug-include-object\n//> debug-include-value\n#include \"value.h\"\n//< debug-include-value\n\nvoid disassembleChunk(Chunk* chunk, const char* name) {\n  printf(\"== %s ==\\n\", name);\n  \n  for (int offset = 0; offset < chunk->count;) {\n    offset = disassembleInstruction(chunk, offset);\n  }\n}\n//> constant-instruction\nstatic int constantInstruction(const char* name, Chunk* chunk,\n                               int offset) {\n  uint8_t constant = chunk->code[offset + 1];\n  printf(\"%-16s %4d '\", name, constant);\n  printValue(chunk->constants.values[constant]);\n  printf(\"'\\n\");\n//> return-after-operand\n  return offset + 2;\n//< return-after-operand\n}\n//< constant-instruction\n//> Methods and Initializers invoke-instruction\nstatic int invokeInstruction(const char* name, Chunk* chunk,\n                                int offset) {\n  uint8_t constant = chunk->code[offset + 1];\n  uint8_t argCount = chunk->code[offset + 2];\n  printf(\"%-16s (%d args) %4d '\", name, argCount, constant);\n  printValue(chunk->constants.values[constant]);\n  printf(\"'\\n\");\n  return offset + 3;\n}\n//< Methods and Initializers invoke-instruction\n//> simple-instruction\nstatic int simpleInstruction(const char* name, int offset) {\n  printf(\"%s\\n\", name);\n  return offset + 1;\n}\n//< simple-instruction\n//> Local Variables byte-instruction\nstatic int byteInstruction(const char* name, Chunk* chunk,\n                           int offset) {\n  uint8_t slot = chunk->code[offset + 1];\n  printf(\"%-16s %4d\\n\", name, slot);\n  return offset + 2; // [debug]\n}\n//< Local Variables byte-instruction\n//> Jumping Back and Forth jump-instruction\nstatic int jumpInstruction(const char* name, int sign,\n                           Chunk* chunk, int offset) {\n  uint16_t jump = (uint16_t)(chunk->code[offset + 1] 
<< 8);\n  jump |= chunk->code[offset + 2];\n  printf(\"%-16s %4d -> %d\\n\", name, offset,\n         offset + 3 + sign * jump);\n  return offset + 3;\n}\n//< Jumping Back and Forth jump-instruction\n//> disassemble-instruction\nint disassembleInstruction(Chunk* chunk, int offset) {\n  printf(\"%04d \", offset);\n//> show-location\n  if (offset > 0 &&\n      chunk->lines[offset] == chunk->lines[offset - 1]) {\n    printf(\"   | \");\n  } else {\n    printf(\"%4d \", chunk->lines[offset]);\n  }\n//< show-location\n  \n  uint8_t instruction = chunk->code[offset];\n  switch (instruction) {\n//> disassemble-constant\n    case OP_CONSTANT:\n      return constantInstruction(\"OP_CONSTANT\", chunk, offset);\n//< disassemble-constant\n//> Types of Values disassemble-literals\n    case OP_NIL:\n      return simpleInstruction(\"OP_NIL\", offset);\n    case OP_TRUE:\n      return simpleInstruction(\"OP_TRUE\", offset);\n    case OP_FALSE:\n      return simpleInstruction(\"OP_FALSE\", offset);\n//< Types of Values disassemble-literals\n//> Global Variables disassemble-pop\n    case OP_POP:\n      return simpleInstruction(\"OP_POP\", offset);\n//< Global Variables disassemble-pop\n//> Local Variables disassemble-local\n    case OP_GET_LOCAL:\n      return byteInstruction(\"OP_GET_LOCAL\", chunk, offset);\n    case OP_SET_LOCAL:\n      return byteInstruction(\"OP_SET_LOCAL\", chunk, offset);\n//< Local Variables disassemble-local\n//> Global Variables disassemble-get-global\n    case OP_GET_GLOBAL:\n      return constantInstruction(\"OP_GET_GLOBAL\", chunk, offset);\n//< Global Variables disassemble-get-global\n//> Global Variables disassemble-define-global\n    case OP_DEFINE_GLOBAL:\n      return constantInstruction(\"OP_DEFINE_GLOBAL\", chunk,\n                                 offset);\n//< Global Variables disassemble-define-global\n//> Global Variables disassemble-set-global\n    case OP_SET_GLOBAL:\n      return constantInstruction(\"OP_SET_GLOBAL\", chunk, offset);\n//< 
Global Variables disassemble-set-global\n//> Closures disassemble-upvalue-ops\n    case OP_GET_UPVALUE:\n      return byteInstruction(\"OP_GET_UPVALUE\", chunk, offset);\n    case OP_SET_UPVALUE:\n      return byteInstruction(\"OP_SET_UPVALUE\", chunk, offset);\n//< Closures disassemble-upvalue-ops\n//> Classes and Instances disassemble-property-ops\n    case OP_GET_PROPERTY:\n      return constantInstruction(\"OP_GET_PROPERTY\", chunk, offset);\n    case OP_SET_PROPERTY:\n      return constantInstruction(\"OP_SET_PROPERTY\", chunk, offset);\n//< Classes and Instances disassemble-property-ops\n//> Superclasses disassemble-get-super\n    case OP_GET_SUPER:\n      return constantInstruction(\"OP_GET_SUPER\", chunk, offset);\n//< Superclasses disassemble-get-super\n//> Types of Values disassemble-comparison\n    case OP_EQUAL:\n      return simpleInstruction(\"OP_EQUAL\", offset);\n    case OP_GREATER:\n      return simpleInstruction(\"OP_GREATER\", offset);\n    case OP_LESS:\n      return simpleInstruction(\"OP_LESS\", offset);\n//< Types of Values disassemble-comparison\n//> A Virtual Machine disassemble-binary\n    case OP_ADD:\n      return simpleInstruction(\"OP_ADD\", offset);\n    case OP_SUBTRACT:\n      return simpleInstruction(\"OP_SUBTRACT\", offset);\n    case OP_MULTIPLY:\n      return simpleInstruction(\"OP_MULTIPLY\", offset);\n    case OP_DIVIDE:\n      return simpleInstruction(\"OP_DIVIDE\", offset);\n//> Types of Values disassemble-not\n    case OP_NOT:\n      return simpleInstruction(\"OP_NOT\", offset);\n//< Types of Values disassemble-not\n//< A Virtual Machine disassemble-binary\n//> A Virtual Machine disassemble-negate\n    case OP_NEGATE:\n      return simpleInstruction(\"OP_NEGATE\", offset);\n//< A Virtual Machine disassemble-negate\n//> Global Variables disassemble-print\n    case OP_PRINT:\n      return simpleInstruction(\"OP_PRINT\", offset);\n//< Global Variables disassemble-print\n//> Jumping Back and Forth disassemble-jump\n    case 
OP_JUMP:\n      return jumpInstruction(\"OP_JUMP\", 1, chunk, offset);\n    case OP_JUMP_IF_FALSE:\n      return jumpInstruction(\"OP_JUMP_IF_FALSE\", 1, chunk, offset);\n//< Jumping Back and Forth disassemble-jump\n//> Jumping Back and Forth disassemble-loop\n    case OP_LOOP:\n      return jumpInstruction(\"OP_LOOP\", -1, chunk, offset);\n//< Jumping Back and Forth disassemble-loop\n//> Calls and Functions disassemble-call\n    case OP_CALL:\n      return byteInstruction(\"OP_CALL\", chunk, offset);\n//< Calls and Functions disassemble-call\n//> Methods and Initializers disassemble-invoke\n    case OP_INVOKE:\n      return invokeInstruction(\"OP_INVOKE\", chunk, offset);\n//< Methods and Initializers disassemble-invoke\n//> Superclasses disassemble-super-invoke\n    case OP_SUPER_INVOKE:\n      return invokeInstruction(\"OP_SUPER_INVOKE\", chunk, offset);\n//< Superclasses disassemble-super-invoke\n//> Closures disassemble-closure\n    case OP_CLOSURE: {\n      offset++;\n      uint8_t constant = chunk->code[offset++];\n      printf(\"%-16s %4d \", \"OP_CLOSURE\", constant);\n      printValue(chunk->constants.values[constant]);\n      printf(\"\\n\");\n//> disassemble-upvalues\n\n      ObjFunction* function = AS_FUNCTION(\n          chunk->constants.values[constant]);\n      for (int j = 0; j < function->upvalueCount; j++) {\n        int isLocal = chunk->code[offset++];\n        int index = chunk->code[offset++];\n        printf(\"%04d      |                     %s %d\\n\",\n               offset - 2, isLocal ? 
\"local\" : \"upvalue\", index);\n      }\n      \n//< disassemble-upvalues\n      return offset;\n    }\n//< Closures disassemble-closure\n//> Closures disassemble-close-upvalue\n    case OP_CLOSE_UPVALUE:\n      return simpleInstruction(\"OP_CLOSE_UPVALUE\", offset);\n//< Closures disassemble-close-upvalue\n    case OP_RETURN:\n      return simpleInstruction(\"OP_RETURN\", offset);\n//> Classes and Instances disassemble-class\n    case OP_CLASS:\n      return constantInstruction(\"OP_CLASS\", chunk, offset);\n//< Classes and Instances disassemble-class\n//> Superclasses disassemble-inherit\n    case OP_INHERIT:\n      return simpleInstruction(\"OP_INHERIT\", offset);\n//< Superclasses disassemble-inherit\n//> Methods and Initializers disassemble-method\n    case OP_METHOD:\n      return constantInstruction(\"OP_METHOD\", chunk, offset);\n//< Methods and Initializers disassemble-method\n    default:\n      printf(\"Unknown opcode %d\\n\", instruction);\n      return offset + 1;\n  }\n}\n//< disassemble-instruction\n"
  },
  {
    "path": "c/debug.h",
    "content": "//> Chunks of Bytecode debug-h\n#ifndef clox_debug_h\n#define clox_debug_h\n\n#include \"chunk.h\"\n\nvoid disassembleChunk(Chunk* chunk, const char* name);\nint disassembleInstruction(Chunk* chunk, int offset);\n\n#endif\n"
  },
  {
    "path": "c/main.c",
    "content": "//> Chunks of Bytecode main-c\n//> Scanning on Demand main-includes\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n//< Scanning on Demand main-includes\n#include \"common.h\"\n//> main-include-chunk\n#include \"chunk.h\"\n//< main-include-chunk\n//> main-include-debug\n#include \"debug.h\"\n//< main-include-debug\n//> A Virtual Machine main-include-vm\n#include \"vm.h\"\n//< A Virtual Machine main-include-vm\n//> Scanning on Demand repl\n\nstatic void repl() {\n  char line[1024];\n  for (;;) {\n    printf(\"> \");\n\n    if (!fgets(line, sizeof(line), stdin)) {\n      printf(\"\\n\");\n      break;\n    }\n\n    interpret(line);\n  }\n}\n//< Scanning on Demand repl\n//> Scanning on Demand read-file\nstatic char* readFile(const char* path) {\n  FILE* file = fopen(path, \"rb\");\n//> no-file\n  if (file == NULL) {\n    fprintf(stderr, \"Could not open file \\\"%s\\\".\\n\", path);\n    exit(74);\n  }\n//< no-file\n\n  fseek(file, 0L, SEEK_END);\n  size_t fileSize = ftell(file);\n  rewind(file);\n\n  char* buffer = (char*)malloc(fileSize + 1);\n//> no-buffer\n  if (buffer == NULL) {\n    fprintf(stderr, \"Not enough memory to read \\\"%s\\\".\\n\", path);\n    exit(74);\n  }\n  \n//< no-buffer\n  size_t bytesRead = fread(buffer, sizeof(char), fileSize, file);\n//> no-read\n  if (bytesRead < fileSize) {\n    fprintf(stderr, \"Could not read file \\\"%s\\\".\\n\", path);\n    exit(74);\n  }\n  \n//< no-read\n  buffer[bytesRead] = '\\0';\n\n  fclose(file);\n  return buffer;\n}\n//< Scanning on Demand read-file\n//> Scanning on Demand run-file\nstatic void runFile(const char* path) {\n  char* source = readFile(path);\n  InterpretResult result = interpret(source);\n  free(source); // [owner]\n\n  if (result == INTERPRET_COMPILE_ERROR) exit(65);\n  if (result == INTERPRET_RUNTIME_ERROR) exit(70);\n}\n//< Scanning on Demand run-file\n\nint main(int argc, const char* argv[]) {\n//> A Virtual Machine main-init-vm\n  initVM();\n\n//< A Virtual 
Machine main-init-vm\n/* Chunks of Bytecode main-chunk < Scanning on Demand args\n  Chunk chunk;\n  initChunk(&chunk);\n*/\n/* Chunks of Bytecode main-constant < Scanning on Demand args\n\n  int constant = addConstant(&chunk, 1.2);\n*/\n/* Chunks of Bytecode main-constant < Chunks of Bytecode main-chunk-line\n  writeChunk(&chunk, OP_CONSTANT);\n  writeChunk(&chunk, constant);\n\n*/\n/* Chunks of Bytecode main-chunk-line < Scanning on Demand args\n  writeChunk(&chunk, OP_CONSTANT, 123);\n  writeChunk(&chunk, constant, 123);\n*/\n/* A Virtual Machine main-chunk < Scanning on Demand args\n\n  constant = addConstant(&chunk, 3.4);\n  writeChunk(&chunk, OP_CONSTANT, 123);\n  writeChunk(&chunk, constant, 123);\n\n  writeChunk(&chunk, OP_ADD, 123);\n\n  constant = addConstant(&chunk, 5.6);\n  writeChunk(&chunk, OP_CONSTANT, 123);\n  writeChunk(&chunk, constant, 123);\n\n  writeChunk(&chunk, OP_DIVIDE, 123);\n*/\n/* A Virtual Machine main-negate < Scanning on Demand args\n  writeChunk(&chunk, OP_NEGATE, 123);\n*/\n/* Chunks of Bytecode main-chunk < Chunks of Bytecode main-chunk-line\n  writeChunk(&chunk, OP_RETURN);\n*/\n/* Chunks of Bytecode main-chunk-line < Scanning on Demand args\n\n  writeChunk(&chunk, OP_RETURN, 123);\n*/\n/* Chunks of Bytecode main-disassemble-chunk < Scanning on Demand args\n\n  disassembleChunk(&chunk, \"test chunk\");\n*/\n/* A Virtual Machine main-interpret < Scanning on Demand args\n  interpret(&chunk);\n*/\n//> Scanning on Demand args\n  if (argc == 1) {\n    repl();\n  } else if (argc == 2) {\n    runFile(argv[1]);\n  } else {\n    fprintf(stderr, \"Usage: clox [path]\\n\");\n    exit(64);\n  }\n  \n  freeVM();\n//< Scanning on Demand args\n/* A Virtual Machine main-free-vm < Scanning on Demand args\n  freeVM();\n*/\n/* Chunks of Bytecode main-chunk < Scanning on Demand args\n  freeChunk(&chunk);\n*/\n  return 0;\n}\n"
  },
  {
    "path": "c/memory.c",
    "content": "//> Chunks of Bytecode memory-c\n#include <stdlib.h>\n\n//> Garbage Collection memory-include-compiler\n#include \"compiler.h\"\n//< Garbage Collection memory-include-compiler\n#include \"memory.h\"\n//> Strings memory-include-vm\n#include \"vm.h\"\n//< Strings memory-include-vm\n//> Garbage Collection debug-log-includes\n\n#ifdef DEBUG_LOG_GC\n#include <stdio.h>\n#include \"debug.h\"\n#endif\n//< Garbage Collection debug-log-includes\n//> Garbage Collection heap-grow-factor\n\n#define GC_HEAP_GROW_FACTOR 2\n//< Garbage Collection heap-grow-factor\n\nvoid* reallocate(void* pointer, size_t oldSize, size_t newSize) {\n//> Garbage Collection updated-bytes-allocated\n  vm.bytesAllocated += newSize - oldSize;\n//< Garbage Collection updated-bytes-allocated\n//> Garbage Collection call-collect\n  if (newSize > oldSize) {\n#ifdef DEBUG_STRESS_GC\n    collectGarbage();\n#endif\n//> collect-on-next\n\n    if (vm.bytesAllocated > vm.nextGC) {\n      collectGarbage();\n    }\n//< collect-on-next\n  }\n\n//< Garbage Collection call-collect\n  if (newSize == 0) {\n    free(pointer);\n    return NULL;\n  }\n\n  void* result = realloc(pointer, newSize);\n//> out-of-memory\n  if (result == NULL) exit(1);\n//< out-of-memory\n  return result;\n}\n//> Garbage Collection mark-object\nvoid markObject(Obj* object) {\n  if (object == NULL) return;\n//> check-is-marked\n  if (object->isMarked) return;\n\n//< check-is-marked\n//> log-mark-object\n#ifdef DEBUG_LOG_GC\n  printf(\"%p mark \", (void*)object);\n  printValue(OBJ_VAL(object));\n  printf(\"\\n\");\n#endif\n\n//< log-mark-object\n  object->isMarked = true;\n//> add-to-gray-stack\n\n  if (vm.grayCapacity < vm.grayCount + 1) {\n    vm.grayCapacity = GROW_CAPACITY(vm.grayCapacity);\n    vm.grayStack = (Obj**)realloc(vm.grayStack,\n                                  sizeof(Obj*) * vm.grayCapacity);\n//> exit-gray-stack\n\n    if (vm.grayStack == NULL) exit(1);\n//< exit-gray-stack\n  }\n\n  vm.grayStack[vm.grayCount++] = 
object;\n//< add-to-gray-stack\n}\n//< Garbage Collection mark-object\n//> Garbage Collection mark-value\nvoid markValue(Value value) {\n  if (IS_OBJ(value)) markObject(AS_OBJ(value));\n}\n//< Garbage Collection mark-value\n//> Garbage Collection mark-array\nstatic void markArray(ValueArray* array) {\n  for (int i = 0; i < array->count; i++) {\n    markValue(array->values[i]);\n  }\n}\n//< Garbage Collection mark-array\n//> Garbage Collection blacken-object\nstatic void blackenObject(Obj* object) {\n//> log-blacken-object\n#ifdef DEBUG_LOG_GC\n  printf(\"%p blacken \", (void*)object);\n  printValue(OBJ_VAL(object));\n  printf(\"\\n\");\n#endif\n\n//< log-blacken-object\n  switch (object->type) {\n//> Methods and Initializers blacken-bound-method\n    case OBJ_BOUND_METHOD: {\n      ObjBoundMethod* bound = (ObjBoundMethod*)object;\n      markValue(bound->receiver);\n      markObject((Obj*)bound->method);\n      break;\n    }\n//< Methods and Initializers blacken-bound-method\n//> Classes and Instances blacken-class\n    case OBJ_CLASS: {\n      ObjClass* klass = (ObjClass*)object;\n      markObject((Obj*)klass->name);\n//> Methods and Initializers mark-methods\n      markTable(&klass->methods);\n//< Methods and Initializers mark-methods\n      break;\n    }\n//< Classes and Instances blacken-class\n//> blacken-closure\n    case OBJ_CLOSURE: {\n      ObjClosure* closure = (ObjClosure*)object;\n      markObject((Obj*)closure->function);\n      for (int i = 0; i < closure->upvalueCount; i++) {\n        markObject((Obj*)closure->upvalues[i]);\n      }\n      break;\n    }\n//< blacken-closure\n//> blacken-function\n    case OBJ_FUNCTION: {\n      ObjFunction* function = (ObjFunction*)object;\n      markObject((Obj*)function->name);\n      markArray(&function->chunk.constants);\n      break;\n    }\n//< blacken-function\n//> Classes and Instances blacken-instance\n    case OBJ_INSTANCE: {\n      ObjInstance* instance = (ObjInstance*)object;\n      
markObject((Obj*)instance->klass);\n      markTable(&instance->fields);\n      break;\n    }\n//< Classes and Instances blacken-instance\n//> blacken-upvalue\n    case OBJ_UPVALUE:\n      markValue(((ObjUpvalue*)object)->closed);\n      break;\n//< blacken-upvalue\n    case OBJ_NATIVE:\n    case OBJ_STRING:\n      break;\n  }\n}\n//< Garbage Collection blacken-object\n//> Strings free-object\nstatic void freeObject(Obj* object) {\n//> Garbage Collection log-free-object\n#ifdef DEBUG_LOG_GC\n  printf(\"%p free type %d\\n\", (void*)object, object->type);\n#endif\n\n//< Garbage Collection log-free-object\n  switch (object->type) {\n//> Methods and Initializers free-bound-method\n    case OBJ_BOUND_METHOD:\n      FREE(ObjBoundMethod, object);\n      break;\n//< Methods and Initializers free-bound-method\n//> Classes and Instances free-class\n    case OBJ_CLASS: {\n//> Methods and Initializers free-methods\n      ObjClass* klass = (ObjClass*)object;\n      freeTable(&klass->methods);\n//< Methods and Initializers free-methods\n      FREE(ObjClass, object);\n      break;\n    } // [braces]\n//< Classes and Instances free-class\n//> Closures free-closure\n    case OBJ_CLOSURE: {\n//> free-upvalues\n      ObjClosure* closure = (ObjClosure*)object;\n      FREE_ARRAY(ObjUpvalue*, closure->upvalues,\n                 closure->upvalueCount);\n//< free-upvalues\n      FREE(ObjClosure, object);\n      break;\n    }\n//< Closures free-closure\n//> Calls and Functions free-function\n    case OBJ_FUNCTION: {\n      ObjFunction* function = (ObjFunction*)object;\n      freeChunk(&function->chunk);\n      FREE(ObjFunction, object);\n      break;\n    }\n//< Calls and Functions free-function\n//> Classes and Instances free-instance\n    case OBJ_INSTANCE: {\n      ObjInstance* instance = (ObjInstance*)object;\n      freeTable(&instance->fields);\n      FREE(ObjInstance, object);\n      break;\n    }\n//< Classes and Instances free-instance\n//> Calls and Functions free-native\n    case 
OBJ_NATIVE:\n      FREE(ObjNative, object);\n      break;\n//< Calls and Functions free-native\n    case OBJ_STRING: {\n      ObjString* string = (ObjString*)object;\n      FREE_ARRAY(char, string->chars, string->length + 1);\n      FREE(ObjString, object);\n      break;\n    }\n//> Closures free-upvalue\n    case OBJ_UPVALUE:\n      FREE(ObjUpvalue, object);\n      break;\n//< Closures free-upvalue\n  }\n}\n//< Strings free-object\n//> Garbage Collection mark-roots\nstatic void markRoots() {\n  for (Value* slot = vm.stack; slot < vm.stackTop; slot++) {\n    markValue(*slot);\n  }\n//> mark-closures\n\n  for (int i = 0; i < vm.frameCount; i++) {\n    markObject((Obj*)vm.frames[i].closure);\n  }\n//< mark-closures\n//> mark-open-upvalues\n\n  for (ObjUpvalue* upvalue = vm.openUpvalues;\n       upvalue != NULL;\n       upvalue = upvalue->next) {\n    markObject((Obj*)upvalue);\n  }\n//< mark-open-upvalues\n//> mark-globals\n\n  markTable(&vm.globals);\n//< mark-globals\n//> call-mark-compiler-roots\n  markCompilerRoots();\n//< call-mark-compiler-roots\n//> Methods and Initializers mark-init-string\n  markObject((Obj*)vm.initString);\n//< Methods and Initializers mark-init-string\n}\n//< Garbage Collection mark-roots\n//> Garbage Collection trace-references\nstatic void traceReferences() {\n  while (vm.grayCount > 0) {\n    Obj* object = vm.grayStack[--vm.grayCount];\n    blackenObject(object);\n  }\n}\n//< Garbage Collection trace-references\n//> Garbage Collection sweep\nstatic void sweep() {\n  Obj* previous = NULL;\n  Obj* object = vm.objects;\n  while (object != NULL) {\n    if (object->isMarked) {\n//> unmark\n      object->isMarked = false;\n//< unmark\n      previous = object;\n      object = object->next;\n    } else {\n      Obj* unreached = object;\n      object = object->next;\n      if (previous != NULL) {\n        previous->next = object;\n      } else {\n        vm.objects = object;\n      }\n\n      freeObject(unreached);\n    }\n  }\n}\n//< Garbage 
Collection sweep\n//> Garbage Collection collect-garbage\nvoid collectGarbage() {\n//> log-before-collect\n#ifdef DEBUG_LOG_GC\n  printf(\"-- gc begin\\n\");\n//> log-before-size\n  size_t before = vm.bytesAllocated;\n//< log-before-size\n#endif\n//< log-before-collect\n//> call-mark-roots\n\n  markRoots();\n//< call-mark-roots\n//> call-trace-references\n  traceReferences();\n//< call-trace-references\n//> sweep-strings\n  tableRemoveWhite(&vm.strings);\n//< sweep-strings\n//> call-sweep\n  sweep();\n//< call-sweep\n//> update-next-gc\n\n  vm.nextGC = vm.bytesAllocated * GC_HEAP_GROW_FACTOR;\n//< update-next-gc\n//> log-after-collect\n\n#ifdef DEBUG_LOG_GC\n  printf(\"-- gc end\\n\");\n//> log-collected-amount\n  printf(\"   collected %zu bytes (from %zu to %zu) next at %zu\\n\",\n         before - vm.bytesAllocated, before, vm.bytesAllocated,\n         vm.nextGC);\n//< log-collected-amount\n#endif\n//< log-after-collect\n}\n//< Garbage Collection collect-garbage\n//> Strings free-objects\nvoid freeObjects() {\n  Obj* object = vm.objects;\n  while (object != NULL) {\n    Obj* next = object->next;\n    freeObject(object);\n    object = next;\n  }\n//> Garbage Collection free-gray-stack\n\n  free(vm.grayStack);\n//< Garbage Collection free-gray-stack\n}\n//< Strings free-objects\n"
  },
  {
    "path": "c/memory.h",
    "content": "//> Chunks of Bytecode memory-h\n#ifndef clox_memory_h\n#define clox_memory_h\n\n#include \"common.h\"\n//> Strings memory-include-object\n#include \"object.h\"\n//< Strings memory-include-object\n\n//> Strings allocate\n#define ALLOCATE(type, count) \\\n    (type*)reallocate(NULL, 0, sizeof(type) * (count))\n//> free\n\n#define FREE(type, pointer) reallocate(pointer, sizeof(type), 0)\n//< free\n\n//< Strings allocate\n#define GROW_CAPACITY(capacity) \\\n    ((capacity) < 8 ? 8 : (capacity) * 2)\n//> grow-array\n\n#define GROW_ARRAY(type, pointer, oldCount, newCount) \\\n    (type*)reallocate(pointer, sizeof(type) * (oldCount), \\\n        sizeof(type) * (newCount))\n//> free-array\n\n#define FREE_ARRAY(type, pointer, oldCount) \\\n    reallocate(pointer, sizeof(type) * (oldCount), 0)\n//< free-array\n\nvoid* reallocate(void* pointer, size_t oldSize, size_t newSize);\n//< grow-array\n//> Garbage Collection mark-object-h\nvoid markObject(Obj* object);\n//< Garbage Collection mark-object-h\n//> Garbage Collection mark-value-h\nvoid markValue(Value value);\n//< Garbage Collection mark-value-h\n//> Garbage Collection collect-garbage-h\nvoid collectGarbage();\n//< Garbage Collection collect-garbage-h\n//> Strings free-objects-h\nvoid freeObjects();\n//< Strings free-objects-h\n\n#endif\n"
  },
  {
    "path": "c/object.c",
    "content": "//> Strings object-c\n#include <stdio.h>\n#include <string.h>\n\n#include \"memory.h\"\n#include \"object.h\"\n//> Hash Tables object-include-table\n#include \"table.h\"\n//< Hash Tables object-include-table\n#include \"value.h\"\n#include \"vm.h\"\n//> allocate-obj\n\n#define ALLOCATE_OBJ(type, objectType) \\\n    (type*)allocateObject(sizeof(type), objectType)\n//< allocate-obj\n//> allocate-object\n\nstatic Obj* allocateObject(size_t size, ObjType type) {\n  Obj* object = (Obj*)reallocate(NULL, 0, size);\n  object->type = type;\n//> Garbage Collection init-is-marked\n  object->isMarked = false;\n//< Garbage Collection init-is-marked\n//> add-to-list\n  \n  object->next = vm.objects;\n  vm.objects = object;\n//< add-to-list\n//> Garbage Collection debug-log-allocate\n\n#ifdef DEBUG_LOG_GC\n  printf(\"%p allocate %zu for %d\\n\", (void*)object, size, type);\n#endif\n\n//< Garbage Collection debug-log-allocate\n  return object;\n}\n//< allocate-object\n//> Methods and Initializers new-bound-method\nObjBoundMethod* newBoundMethod(Value receiver,\n                               ObjClosure* method) {\n  ObjBoundMethod* bound = ALLOCATE_OBJ(ObjBoundMethod,\n                                       OBJ_BOUND_METHOD);\n  bound->receiver = receiver;\n  bound->method = method;\n  return bound;\n}\n//< Methods and Initializers new-bound-method\n//> Classes and Instances new-class\nObjClass* newClass(ObjString* name) {\n  ObjClass* klass = ALLOCATE_OBJ(ObjClass, OBJ_CLASS);\n  klass->name = name; // [klass]\n//> Methods and Initializers init-methods\n  initTable(&klass->methods);\n//< Methods and Initializers init-methods\n  return klass;\n}\n//< Classes and Instances new-class\n//> Closures new-closure\nObjClosure* newClosure(ObjFunction* function) {\n//> allocate-upvalue-array\n  ObjUpvalue** upvalues = ALLOCATE(ObjUpvalue*,\n                                   function->upvalueCount);\n  for (int i = 0; i < function->upvalueCount; i++) {\n    upvalues[i] = 
NULL;\n  }\n\n//< allocate-upvalue-array\n  ObjClosure* closure = ALLOCATE_OBJ(ObjClosure, OBJ_CLOSURE);\n  closure->function = function;\n//> init-upvalue-fields\n  closure->upvalues = upvalues;\n  closure->upvalueCount = function->upvalueCount;\n//< init-upvalue-fields\n  return closure;\n}\n//< Closures new-closure\n//> Calls and Functions new-function\nObjFunction* newFunction() {\n  ObjFunction* function = ALLOCATE_OBJ(ObjFunction, OBJ_FUNCTION);\n  function->arity = 0;\n//> Closures init-upvalue-count\n  function->upvalueCount = 0;\n//< Closures init-upvalue-count\n  function->name = NULL;\n  initChunk(&function->chunk);\n  return function;\n}\n//< Calls and Functions new-function\n//> Classes and Instances new-instance\nObjInstance* newInstance(ObjClass* klass) {\n  ObjInstance* instance = ALLOCATE_OBJ(ObjInstance, OBJ_INSTANCE);\n  instance->klass = klass;\n  initTable(&instance->fields);\n  return instance;\n}\n//< Classes and Instances new-instance\n//> Calls and Functions new-native\nObjNative* newNative(NativeFn function) {\n  ObjNative* native = ALLOCATE_OBJ(ObjNative, OBJ_NATIVE);\n  native->function = function;\n  return native;\n}\n//< Calls and Functions new-native\n\n/* Strings allocate-string < Hash Tables allocate-string\nstatic ObjString* allocateString(char* chars, int length) {\n*/\n//> allocate-string\n//> Hash Tables allocate-string\nstatic ObjString* allocateString(char* chars, int length,\n                                 uint32_t hash) {\n//< Hash Tables allocate-string\n  ObjString* string = ALLOCATE_OBJ(ObjString, OBJ_STRING);\n  string->length = length;\n  string->chars = chars;\n//> Hash Tables allocate-store-hash\n  string->hash = hash;\n//< Hash Tables allocate-store-hash\n//> Hash Tables allocate-store-string\n//> Garbage Collection push-string\n\n  push(OBJ_VAL(string));\n//< Garbage Collection push-string\n  tableSet(&vm.strings, string, NIL_VAL);\n//> Garbage Collection pop-string\n  pop();\n\n//< Garbage Collection 
pop-string\n//< Hash Tables allocate-store-string\n  return string;\n}\n//< allocate-string\n//> Hash Tables hash-string\nstatic uint32_t hashString(const char* key, int length) {\n  uint32_t hash = 2166136261u;\n  for (int i = 0; i < length; i++) {\n    hash ^= (uint8_t)key[i];\n    hash *= 16777619;\n  }\n  return hash;\n}\n//< Hash Tables hash-string\n//> take-string\nObjString* takeString(char* chars, int length) {\n/* Strings take-string < Hash Tables take-string-hash\n  return allocateString(chars, length);\n*/\n//> Hash Tables take-string-hash\n  uint32_t hash = hashString(chars, length);\n//> take-string-intern\n  ObjString* interned = tableFindString(&vm.strings, chars, length,\n                                        hash);\n  if (interned != NULL) {\n    FREE_ARRAY(char, chars, length + 1);\n    return interned;\n  }\n\n//< take-string-intern\n  return allocateString(chars, length, hash);\n//< Hash Tables take-string-hash\n}\n//< take-string\nObjString* copyString(const char* chars, int length) {\n//> Hash Tables copy-string-hash\n  uint32_t hash = hashString(chars, length);\n//> copy-string-intern\n  ObjString* interned = tableFindString(&vm.strings, chars, length,\n                                        hash);\n  if (interned != NULL) return interned;\n\n//< copy-string-intern\n//< Hash Tables copy-string-hash\n  char* heapChars = ALLOCATE(char, length + 1);\n  memcpy(heapChars, chars, length);\n  heapChars[length] = '\\0';\n/* Strings object-c < Hash Tables copy-string-allocate\n  return allocateString(heapChars, length);\n*/\n//> Hash Tables copy-string-allocate\n  return allocateString(heapChars, length, hash);\n//< Hash Tables copy-string-allocate\n}\n//> Closures new-upvalue\nObjUpvalue* newUpvalue(Value* slot) {\n  ObjUpvalue* upvalue = ALLOCATE_OBJ(ObjUpvalue, OBJ_UPVALUE);\n//> init-closed\n  upvalue->closed = NIL_VAL;\n//< init-closed\n  upvalue->location = slot;\n//> init-next\n  upvalue->next = NULL;\n//< init-next\n  return 
upvalue;\n}\n//< Closures new-upvalue\n//> Calls and Functions print-function-helper\nstatic void printFunction(ObjFunction* function) {\n//> print-script\n  if (function->name == NULL) {\n    printf(\"<script>\");\n    return;\n  }\n//< print-script\n  printf(\"<fn %s>\", function->name->chars);\n}\n//< Calls and Functions print-function-helper\n//> print-object\nvoid printObject(Value value) {\n  switch (OBJ_TYPE(value)) {\n//> Methods and Initializers print-bound-method\n    case OBJ_BOUND_METHOD:\n      printFunction(AS_BOUND_METHOD(value)->method->function);\n      break;\n//< Methods and Initializers print-bound-method\n//> Classes and Instances print-class\n    case OBJ_CLASS:\n      printf(\"%s\", AS_CLASS(value)->name->chars);\n      break;\n//< Classes and Instances print-class\n//> Closures print-closure\n    case OBJ_CLOSURE:\n      printFunction(AS_CLOSURE(value)->function);\n      break;\n//< Closures print-closure\n//> Calls and Functions print-function\n    case OBJ_FUNCTION:\n      printFunction(AS_FUNCTION(value));\n      break;\n//< Calls and Functions print-function\n//> Classes and Instances print-instance\n    case OBJ_INSTANCE:\n      printf(\"%s instance\",\n             AS_INSTANCE(value)->klass->name->chars);\n      break;\n//< Classes and Instances print-instance\n//> Calls and Functions print-native\n    case OBJ_NATIVE:\n      printf(\"<native fn>\");\n      break;\n//< Calls and Functions print-native\n    case OBJ_STRING:\n      printf(\"%s\", AS_CSTRING(value));\n      break;\n//> Closures print-upvalue\n    case OBJ_UPVALUE:\n      printf(\"upvalue\");\n      break;\n//< Closures print-upvalue\n  }\n}\n//< print-object\n"
  },
  {
    "path": "c/object.h",
    "content": "//> Strings object-h\n#ifndef clox_object_h\n#define clox_object_h\n\n#include \"common.h\"\n//> Calls and Functions object-include-chunk\n#include \"chunk.h\"\n//< Calls and Functions object-include-chunk\n//> Classes and Instances object-include-table\n#include \"table.h\"\n//< Classes and Instances object-include-table\n#include \"value.h\"\n//> obj-type-macro\n\n#define OBJ_TYPE(value)        (AS_OBJ(value)->type)\n//< obj-type-macro\n//> is-string\n\n//> Methods and Initializers is-bound-method\n#define IS_BOUND_METHOD(value) isObjType(value, OBJ_BOUND_METHOD)\n//< Methods and Initializers is-bound-method\n//> Classes and Instances is-class\n#define IS_CLASS(value)        isObjType(value, OBJ_CLASS)\n//< Classes and Instances is-class\n//> Closures is-closure\n#define IS_CLOSURE(value)      isObjType(value, OBJ_CLOSURE)\n//< Closures is-closure\n//> Calls and Functions is-function\n#define IS_FUNCTION(value)     isObjType(value, OBJ_FUNCTION)\n//< Calls and Functions is-function\n//> Classes and Instances is-instance\n#define IS_INSTANCE(value)     isObjType(value, OBJ_INSTANCE)\n//< Classes and Instances is-instance\n//> Calls and Functions is-native\n#define IS_NATIVE(value)       isObjType(value, OBJ_NATIVE)\n//< Calls and Functions is-native\n#define IS_STRING(value)       isObjType(value, OBJ_STRING)\n//< is-string\n//> as-string\n\n//> Methods and Initializers as-bound-method\n#define AS_BOUND_METHOD(value) ((ObjBoundMethod*)AS_OBJ(value))\n//< Methods and Initializers as-bound-method\n//> Classes and Instances as-class\n#define AS_CLASS(value)        ((ObjClass*)AS_OBJ(value))\n//< Classes and Instances as-class\n//> Closures as-closure\n#define AS_CLOSURE(value)      ((ObjClosure*)AS_OBJ(value))\n//< Closures as-closure\n//> Calls and Functions as-function\n#define AS_FUNCTION(value)     ((ObjFunction*)AS_OBJ(value))\n//< Calls and Functions as-function\n//> Classes and Instances as-instance\n#define AS_INSTANCE(value)     
((ObjInstance*)AS_OBJ(value))\n//< Classes and Instances as-instance\n//> Calls and Functions as-native\n#define AS_NATIVE(value) \\\n    (((ObjNative*)AS_OBJ(value))->function)\n//< Calls and Functions as-native\n#define AS_STRING(value)       ((ObjString*)AS_OBJ(value))\n#define AS_CSTRING(value)      (((ObjString*)AS_OBJ(value))->chars)\n//< as-string\n//> obj-type\n\ntypedef enum {\n//> Methods and Initializers obj-type-bound-method\n  OBJ_BOUND_METHOD,\n//< Methods and Initializers obj-type-bound-method\n//> Classes and Instances obj-type-class\n  OBJ_CLASS,\n//< Classes and Instances obj-type-class\n//> Closures obj-type-closure\n  OBJ_CLOSURE,\n//< Closures obj-type-closure\n//> Calls and Functions obj-type-function\n  OBJ_FUNCTION,\n//< Calls and Functions obj-type-function\n//> Classes and Instances obj-type-instance\n  OBJ_INSTANCE,\n//< Classes and Instances obj-type-instance\n//> Calls and Functions obj-type-native\n  OBJ_NATIVE,\n//< Calls and Functions obj-type-native\n  OBJ_STRING,\n//> Closures obj-type-upvalue\n  OBJ_UPVALUE\n//< Closures obj-type-upvalue\n} ObjType;\n//< obj-type\n\nstruct Obj {\n  ObjType type;\n//> Garbage Collection is-marked-field\n  bool isMarked;\n//< Garbage Collection is-marked-field\n//> next-field\n  struct Obj* next;\n//< next-field\n};\n//> Calls and Functions obj-function\n\ntypedef struct {\n  Obj obj;\n  int arity;\n//> Closures upvalue-count\n  int upvalueCount;\n//< Closures upvalue-count\n  Chunk chunk;\n  ObjString* name;\n} ObjFunction;\n//< Calls and Functions obj-function\n//> Calls and Functions obj-native\n\ntypedef Value (*NativeFn)(int argCount, Value* args);\n\ntypedef struct {\n  Obj obj;\n  NativeFn function;\n} ObjNative;\n//< Calls and Functions obj-native\n//> obj-string\n\nstruct ObjString {\n  Obj obj;\n  int length;\n  char* chars;\n//> Hash Tables obj-string-hash\n  uint32_t hash;\n//< Hash Tables obj-string-hash\n};\n//< obj-string\n//> Closures obj-upvalue\ntypedef struct ObjUpvalue {\n  Obj 
obj;\n  Value* location;\n//> closed-field\n  Value closed;\n//< closed-field\n//> next-field\n  struct ObjUpvalue* next;\n//< next-field\n} ObjUpvalue;\n//< Closures obj-upvalue\n//> Closures obj-closure\ntypedef struct {\n  Obj obj;\n  ObjFunction* function;\n//> upvalue-fields\n  ObjUpvalue** upvalues;\n  int upvalueCount;\n//< upvalue-fields\n} ObjClosure;\n//< Closures obj-closure\n//> Classes and Instances obj-class\n\ntypedef struct {\n  Obj obj;\n  ObjString* name;\n//> Methods and Initializers class-methods\n  Table methods;\n//< Methods and Initializers class-methods\n} ObjClass;\n//< Classes and Instances obj-class\n//> Classes and Instances obj-instance\n\ntypedef struct {\n  Obj obj;\n  ObjClass* klass;\n  Table fields; // [fields]\n} ObjInstance;\n//< Classes and Instances obj-instance\n\n//> Methods and Initializers obj-bound-method\ntypedef struct {\n  Obj obj;\n  Value receiver;\n  ObjClosure* method;\n} ObjBoundMethod;\n\n//< Methods and Initializers obj-bound-method\n//> Methods and Initializers new-bound-method-h\nObjBoundMethod* newBoundMethod(Value receiver,\n                               ObjClosure* method);\n//< Methods and Initializers new-bound-method-h\n//> Classes and Instances new-class-h\nObjClass* newClass(ObjString* name);\n//< Classes and Instances new-class-h\n//> Closures new-closure-h\nObjClosure* newClosure(ObjFunction* function);\n//< Closures new-closure-h\n//> Calls and Functions new-function-h\nObjFunction* newFunction();\n//< Calls and Functions new-function-h\n//> Classes and Instances new-instance-h\nObjInstance* newInstance(ObjClass* klass);\n//< Classes and Instances new-instance-h\n//> Calls and Functions new-native-h\nObjNative* newNative(NativeFn function);\n//< Calls and Functions new-native-h\n//> take-string-h\nObjString* takeString(char* chars, int length);\n//< take-string-h\n//> copy-string-h\nObjString* copyString(const char* chars, int length);\n//> Closures new-upvalue-h\nObjUpvalue* newUpvalue(Value* 
slot);\n//< Closures new-upvalue-h\n//> print-object-h\nvoid printObject(Value value);\n//< print-object-h\n\n//< copy-string-h\n//> is-obj-type\nstatic inline bool isObjType(Value value, ObjType type) {\n  return IS_OBJ(value) && AS_OBJ(value)->type == type;\n}\n\n//< is-obj-type\n#endif\n"
  },
  {
    "path": "c/scanner.c",
    "content": "//> Scanning on Demand scanner-c\n#include <stdio.h>\n#include <string.h>\n\n#include \"common.h\"\n#include \"scanner.h\"\n\ntypedef struct {\n  const char* start;\n  const char* current;\n  int line;\n} Scanner;\n\nScanner scanner;\n//> init-scanner\nvoid initScanner(const char* source) {\n  scanner.start = source;\n  scanner.current = source;\n  scanner.line = 1;\n}\n//< init-scanner\n//> is-alpha\nstatic bool isAlpha(char c) {\n  return (c >= 'a' && c <= 'z') ||\n         (c >= 'A' && c <= 'Z') ||\n          c == '_';\n}\n//< is-alpha\n//> is-digit\nstatic bool isDigit(char c) {\n  return c >= '0' && c <= '9';\n}\n//< is-digit\n//> is-at-end\nstatic bool isAtEnd() {\n  return *scanner.current == '\\0';\n}\n//< is-at-end\n//> advance\nstatic char advance() {\n  scanner.current++;\n  return scanner.current[-1];\n}\n//< advance\n//> peek\nstatic char peek() {\n  return *scanner.current;\n}\n//< peek\n//> peek-next\nstatic char peekNext() {\n  if (isAtEnd()) return '\\0';\n  return scanner.current[1];\n}\n//< peek-next\n//> match\nstatic bool match(char expected) {\n  if (isAtEnd()) return false;\n  if (*scanner.current != expected) return false;\n  scanner.current++;\n  return true;\n}\n//< match\n//> make-token\nstatic Token makeToken(TokenType type) {\n  Token token;\n  token.type = type;\n  token.start = scanner.start;\n  token.length = (int)(scanner.current - scanner.start);\n  token.line = scanner.line;\n  return token;\n}\n//< make-token\n//> error-token\nstatic Token errorToken(const char* message) {\n  Token token;\n  token.type = TOKEN_ERROR;\n  token.start = message;\n  token.length = (int)strlen(message);\n  token.line = scanner.line;\n  return token;\n}\n//< error-token\n//> skip-whitespace\nstatic void skipWhitespace() {\n  for (;;) {\n    char c = peek();\n    switch (c) {\n      case ' ':\n      case '\\r':\n      case '\\t':\n        advance();\n        break;\n//> newline\n      case '\\n':\n        scanner.line++;\n        
advance();\n        break;\n//< newline\n//> comment\n      case '/':\n        if (peekNext() == '/') {\n          // A comment goes until the end of the line.\n          while (peek() != '\\n' && !isAtEnd()) advance();\n        } else {\n          return;\n        }\n        break;\n//< comment\n      default:\n        return;\n    }\n  }\n}\n//< skip-whitespace\n//> check-keyword\nstatic TokenType checkKeyword(int start, int length,\n    const char* rest, TokenType type) {\n  if (scanner.current - scanner.start == start + length &&\n      memcmp(scanner.start + start, rest, length) == 0) {\n    return type;\n  }\n\n  return TOKEN_IDENTIFIER;\n}\n//< check-keyword\n//> identifier-type\nstatic TokenType identifierType() {\n//> keywords\n  switch (scanner.start[0]) {\n    case 'a': return checkKeyword(1, 2, \"nd\", TOKEN_AND);\n    case 'c': return checkKeyword(1, 4, \"lass\", TOKEN_CLASS);\n    case 'e': return checkKeyword(1, 3, \"lse\", TOKEN_ELSE);\n//> keyword-f\n    case 'f':\n      if (scanner.current - scanner.start > 1) {\n        switch (scanner.start[1]) {\n          case 'a': return checkKeyword(2, 3, \"lse\", TOKEN_FALSE);\n          case 'o': return checkKeyword(2, 1, \"r\", TOKEN_FOR);\n          case 'u': return checkKeyword(2, 1, \"n\", TOKEN_FUN);\n        }\n      }\n      break;\n//< keyword-f\n    case 'i': return checkKeyword(1, 1, \"f\", TOKEN_IF);\n    case 'n': return checkKeyword(1, 2, \"il\", TOKEN_NIL);\n    case 'o': return checkKeyword(1, 1, \"r\", TOKEN_OR);\n    case 'p': return checkKeyword(1, 4, \"rint\", TOKEN_PRINT);\n    case 'r': return checkKeyword(1, 5, \"eturn\", TOKEN_RETURN);\n    case 's': return checkKeyword(1, 4, \"uper\", TOKEN_SUPER);\n//> keyword-t\n    case 't':\n      if (scanner.current - scanner.start > 1) {\n        switch (scanner.start[1]) {\n          case 'h': return checkKeyword(2, 2, \"is\", TOKEN_THIS);\n          case 'r': return checkKeyword(2, 2, \"ue\", TOKEN_TRUE);\n        }\n      }\n      
break;\n//< keyword-t\n    case 'v': return checkKeyword(1, 2, \"ar\", TOKEN_VAR);\n    case 'w': return checkKeyword(1, 4, \"hile\", TOKEN_WHILE);\n  }\n\n//< keywords\n  return TOKEN_IDENTIFIER;\n}\n//< identifier-type\n//> identifier\nstatic Token identifier() {\n  while (isAlpha(peek()) || isDigit(peek())) advance();\n  return makeToken(identifierType());\n}\n//< identifier\n//> number\nstatic Token number() {\n  while (isDigit(peek())) advance();\n\n  // Look for a fractional part.\n  if (peek() == '.' && isDigit(peekNext())) {\n    // Consume the \".\".\n    advance();\n\n    while (isDigit(peek())) advance();\n  }\n\n  return makeToken(TOKEN_NUMBER);\n}\n//< number\n//> string\nstatic Token string() {\n  while (peek() != '\"' && !isAtEnd()) {\n    if (peek() == '\\n') scanner.line++;\n    advance();\n  }\n\n  if (isAtEnd()) return errorToken(\"Unterminated string.\");\n\n  // The closing quote.\n  advance();\n  return makeToken(TOKEN_STRING);\n}\n//< string\n//> scan-token\nToken scanToken() {\n//> call-skip-whitespace\n  skipWhitespace();\n//< call-skip-whitespace\n  scanner.start = scanner.current;\n\n  if (isAtEnd()) return makeToken(TOKEN_EOF);\n//> scan-char\n  \n  char c = advance();\n//> scan-identifier\n  if (isAlpha(c)) return identifier();\n//< scan-identifier\n//> scan-number\n  if (isDigit(c)) return number();\n//< scan-number\n\n  switch (c) {\n    case '(': return makeToken(TOKEN_LEFT_PAREN);\n    case ')': return makeToken(TOKEN_RIGHT_PAREN);\n    case '{': return makeToken(TOKEN_LEFT_BRACE);\n    case '}': return makeToken(TOKEN_RIGHT_BRACE);\n    case ';': return makeToken(TOKEN_SEMICOLON);\n    case ',': return makeToken(TOKEN_COMMA);\n    case '.': return makeToken(TOKEN_DOT);\n    case '-': return makeToken(TOKEN_MINUS);\n    case '+': return makeToken(TOKEN_PLUS);\n    case '/': return makeToken(TOKEN_SLASH);\n    case '*': return makeToken(TOKEN_STAR);\n//> two-char\n    case '!':\n      return makeToken(\n          match('=') ? 
TOKEN_BANG_EQUAL : TOKEN_BANG);\n    case '=':\n      return makeToken(\n          match('=') ? TOKEN_EQUAL_EQUAL : TOKEN_EQUAL);\n    case '<':\n      return makeToken(\n          match('=') ? TOKEN_LESS_EQUAL : TOKEN_LESS);\n    case '>':\n      return makeToken(\n          match('=') ? TOKEN_GREATER_EQUAL : TOKEN_GREATER);\n//< two-char\n//> scan-string\n    case '\"': return string();\n//< scan-string\n  }\n//< scan-char\n\n  return errorToken(\"Unexpected character.\");\n}\n//< scan-token\n"
  },
  {
    "path": "c/scanner.h",
    "content": "//> Scanning on Demand scanner-h\n#ifndef clox_scanner_h\n#define clox_scanner_h\n//> token-type\n\ntypedef enum {\n  // Single-character tokens.\n  TOKEN_LEFT_PAREN, TOKEN_RIGHT_PAREN,\n  TOKEN_LEFT_BRACE, TOKEN_RIGHT_BRACE,\n  TOKEN_COMMA, TOKEN_DOT, TOKEN_MINUS, TOKEN_PLUS,\n  TOKEN_SEMICOLON, TOKEN_SLASH, TOKEN_STAR,\n  // One or two character tokens.\n  TOKEN_BANG, TOKEN_BANG_EQUAL,\n  TOKEN_EQUAL, TOKEN_EQUAL_EQUAL,\n  TOKEN_GREATER, TOKEN_GREATER_EQUAL,\n  TOKEN_LESS, TOKEN_LESS_EQUAL,\n  // Literals.\n  TOKEN_IDENTIFIER, TOKEN_STRING, TOKEN_NUMBER,\n  // Keywords.\n  TOKEN_AND, TOKEN_CLASS, TOKEN_ELSE, TOKEN_FALSE,\n  TOKEN_FOR, TOKEN_FUN, TOKEN_IF, TOKEN_NIL, TOKEN_OR,\n  TOKEN_PRINT, TOKEN_RETURN, TOKEN_SUPER, TOKEN_THIS,\n  TOKEN_TRUE, TOKEN_VAR, TOKEN_WHILE,\n\n  TOKEN_ERROR, TOKEN_EOF\n} TokenType;\n//< token-type\n//> token-struct\n\ntypedef struct {\n  TokenType type;\n  const char* start;\n  int length;\n  int line;\n} Token;\n//< token-struct\n\nvoid initScanner(const char* source);\n//> scan-token-h\nToken scanToken();\n//< scan-token-h\n\n#endif\n"
  },
  {
    "path": "c/table.c",
    "content": "//> Hash Tables table-c\n#include <stdlib.h>\n#include <string.h>\n\n#include \"memory.h\"\n#include \"object.h\"\n#include \"table.h\"\n#include \"value.h\"\n\n//> max-load\n#define TABLE_MAX_LOAD 0.75\n\n//< max-load\nvoid initTable(Table* table) {\n  table->count = 0;\n  table->capacity = 0;\n  table->entries = NULL;\n}\n//> free-table\nvoid freeTable(Table* table) {\n  FREE_ARRAY(Entry, table->entries, table->capacity);\n  initTable(table);\n}\n//< free-table\n//> find-entry\n//> omit\n// NOTE: The \"Optimization\" chapter has a manual copy of this function.\n// If you change it here, make sure to update that copy.\n//< omit\nstatic Entry* findEntry(Entry* entries, int capacity,\n                        ObjString* key) {\n/* Hash Tables find-entry < Optimization initial-index\n  uint32_t index = key->hash % capacity;\n*/\n//> Optimization initial-index\n  uint32_t index = key->hash & (capacity - 1);\n//< Optimization initial-index\n//> find-entry-tombstone\n  Entry* tombstone = NULL;\n  \n//< find-entry-tombstone\n  for (;;) {\n    Entry* entry = &entries[index];\n/* Hash Tables find-entry < Hash Tables find-tombstone\n    if (entry->key == key || entry->key == NULL) {\n      return entry;\n    }\n*/\n//> find-tombstone\n    if (entry->key == NULL) {\n      if (IS_NIL(entry->value)) {\n        // Empty entry.\n        return tombstone != NULL ? 
tombstone : entry;\n      } else {\n        // We found a tombstone.\n        if (tombstone == NULL) tombstone = entry;\n      }\n    } else if (entry->key == key) {\n      // We found the key.\n      return entry;\n    }\n//< find-tombstone\n\n/* Hash Tables find-entry < Optimization next-index\n    index = (index + 1) % capacity;\n*/\n//> Optimization next-index\n    index = (index + 1) & (capacity - 1);\n//< Optimization next-index\n  }\n}\n//< find-entry\n//> table-get\nbool tableGet(Table* table, ObjString* key, Value* value) {\n  if (table->count == 0) return false;\n\n  Entry* entry = findEntry(table->entries, table->capacity, key);\n  if (entry->key == NULL) return false;\n\n  *value = entry->value;\n  return true;\n}\n//< table-get\n//> table-adjust-capacity\nstatic void adjustCapacity(Table* table, int capacity) {\n  Entry* entries = ALLOCATE(Entry, capacity);\n  for (int i = 0; i < capacity; i++) {\n    entries[i].key = NULL;\n    entries[i].value = NIL_VAL;\n  }\n//> re-hash\n\n//> resize-init-count\n  table->count = 0;\n//< resize-init-count\n  for (int i = 0; i < table->capacity; i++) {\n    Entry* entry = &table->entries[i];\n    if (entry->key == NULL) continue;\n\n    Entry* dest = findEntry(entries, capacity, entry->key);\n    dest->key = entry->key;\n    dest->value = entry->value;\n//> resize-increment-count\n    table->count++;\n//< resize-increment-count\n  }\n//< re-hash\n\n//> Hash Tables free-old-array\n  FREE_ARRAY(Entry, table->entries, table->capacity);\n//< Hash Tables free-old-array\n  table->entries = entries;\n  table->capacity = capacity;\n}\n//< table-adjust-capacity\n//> table-set\nbool tableSet(Table* table, ObjString* key, Value value) {\n//> table-set-grow\n  if (table->count + 1 > table->capacity * TABLE_MAX_LOAD) {\n    int capacity = GROW_CAPACITY(table->capacity);\n    adjustCapacity(table, capacity);\n  }\n\n//< table-set-grow\n  Entry* entry = findEntry(table->entries, table->capacity, key);\n  bool isNewKey = entry->key 
== NULL;\n/* Hash Tables table-set < Hash Tables set-increment-count\n  if (isNewKey) table->count++;\n*/\n//> set-increment-count\n  if (isNewKey && IS_NIL(entry->value)) table->count++;\n//< set-increment-count\n\n  entry->key = key;\n  entry->value = value;\n  return isNewKey;\n}\n//< table-set\n//> table-delete\nbool tableDelete(Table* table, ObjString* key) {\n  if (table->count == 0) return false;\n\n  // Find the entry.\n  Entry* entry = findEntry(table->entries, table->capacity, key);\n  if (entry->key == NULL) return false;\n\n  // Place a tombstone in the entry.\n  entry->key = NULL;\n  entry->value = BOOL_VAL(true);\n  return true;\n}\n//< table-delete\n//> table-add-all\nvoid tableAddAll(Table* from, Table* to) {\n  for (int i = 0; i < from->capacity; i++) {\n    Entry* entry = &from->entries[i];\n    if (entry->key != NULL) {\n      tableSet(to, entry->key, entry->value);\n    }\n  }\n}\n//< table-add-all\n//> table-find-string\nObjString* tableFindString(Table* table, const char* chars,\n                           int length, uint32_t hash) {\n  if (table->count == 0) return NULL;\n\n/* Hash Tables table-find-string < Optimization find-string-index\n  uint32_t index = hash % table->capacity;\n*/\n//> Optimization find-string-index\n  uint32_t index = hash & (table->capacity - 1);\n//< Optimization find-string-index\n  for (;;) {\n    Entry* entry = &table->entries[index];\n    if (entry->key == NULL) {\n      // Stop if we find an empty non-tombstone entry.\n      if (IS_NIL(entry->value)) return NULL;\n    } else if (entry->key->length == length &&\n        entry->key->hash == hash &&\n        memcmp(entry->key->chars, chars, length) == 0) {\n      // We found it.\n      return entry->key;\n    }\n\n/* Hash Tables table-find-string < Optimization find-string-next\n    index = (index + 1) % table->capacity;\n*/\n//> Optimization find-string-next\n    index = (index + 1) & (table->capacity - 1);\n//< Optimization find-string-next\n  }\n}\n//< 
table-find-string\n//> Garbage Collection table-remove-white\nvoid tableRemoveWhite(Table* table) {\n  for (int i = 0; i < table->capacity; i++) {\n    Entry* entry = &table->entries[i];\n    if (entry->key != NULL && !entry->key->obj.isMarked) {\n      tableDelete(table, entry->key);\n    }\n  }\n}\n//< Garbage Collection table-remove-white\n//> Garbage Collection mark-table\nvoid markTable(Table* table) {\n  for (int i = 0; i < table->capacity; i++) {\n    Entry* entry = &table->entries[i];\n    markObject((Obj*)entry->key);\n    markValue(entry->value);\n  }\n}\n//< Garbage Collection mark-table\n"
  },
  {
    "path": "c/table.h",
    "content": "//> Hash Tables table-h\n#ifndef clox_table_h\n#define clox_table_h\n\n#include \"common.h\"\n#include \"value.h\"\n//> entry\n\ntypedef struct {\n  ObjString* key;\n  Value value;\n} Entry;\n//< entry\n\ntypedef struct {\n  int count;\n  int capacity;\n  Entry* entries;\n} Table;\n\n//> init-table-h\nvoid initTable(Table* table);\n//> free-table-h\nvoid freeTable(Table* table);\n//< free-table-h\n//> table-get-h\nbool tableGet(Table* table, ObjString* key, Value* value);\n//< table-get-h\n//> table-set-h\nbool tableSet(Table* table, ObjString* key, Value value);\n//< table-set-h\n//> table-delete-h\nbool tableDelete(Table* table, ObjString* key);\n//< table-delete-h\n//> table-add-all-h\nvoid tableAddAll(Table* from, Table* to);\n//< table-add-all-h\n//> table-find-string-h\nObjString* tableFindString(Table* table, const char* chars,\n                           int length, uint32_t hash);\n//< table-find-string-h\n//> Garbage Collection table-remove-white-h\n\nvoid tableRemoveWhite(Table* table);\n//< Garbage Collection table-remove-white-h\n//> Garbage Collection mark-table-h\nvoid markTable(Table* table);\n//< Garbage Collection mark-table-h\n\n//< init-table-h\n#endif\n"
  },
  {
    "path": "c/value.c",
    "content": "//> Chunks of Bytecode value-c\n#include <stdio.h>\n//> Strings value-include-string\n#include <string.h>\n//< Strings value-include-string\n\n//> Strings value-include-object\n#include \"object.h\"\n//< Strings value-include-object\n#include \"memory.h\"\n#include \"value.h\"\n\nvoid initValueArray(ValueArray* array) {\n  array->values = NULL;\n  array->capacity = 0;\n  array->count = 0;\n}\n//> write-value-array\nvoid writeValueArray(ValueArray* array, Value value) {\n  if (array->capacity < array->count + 1) {\n    int oldCapacity = array->capacity;\n    array->capacity = GROW_CAPACITY(oldCapacity);\n    array->values = GROW_ARRAY(Value, array->values,\n                               oldCapacity, array->capacity);\n  }\n  \n  array->values[array->count] = value;\n  array->count++;\n}\n//< write-value-array\n//> free-value-array\nvoid freeValueArray(ValueArray* array) {\n  FREE_ARRAY(Value, array->values, array->capacity);\n  initValueArray(array);\n}\n//< free-value-array\n//> print-value\nvoid printValue(Value value) {\n//> Optimization print-value\n#ifdef NAN_BOXING\n  if (IS_BOOL(value)) {\n    printf(AS_BOOL(value) ? \"true\" : \"false\");\n  } else if (IS_NIL(value)) {\n    printf(\"nil\");\n  } else if (IS_NUMBER(value)) {\n    printf(\"%g\", AS_NUMBER(value));\n  } else if (IS_OBJ(value)) {\n    printObject(value);\n  }\n#else\n//< Optimization print-value\n/* Chunks of Bytecode print-value < Types of Values print-number-value\n  printf(\"%g\", value);\n*/\n/* Types of Values print-number-value < Types of Values print-value\n printf(\"%g\", AS_NUMBER(value));\n */\n//> Types of Values print-value\n  switch (value.type) {\n    case VAL_BOOL:\n      printf(AS_BOOL(value) ? 
\"true\" : \"false\");\n      break;\n    case VAL_NIL: printf(\"nil\"); break;\n    case VAL_NUMBER: printf(\"%g\", AS_NUMBER(value)); break;\n//> Strings call-print-object\n    case VAL_OBJ: printObject(value); break;\n//< Strings call-print-object\n  }\n//< Types of Values print-value\n//> Optimization end-print-value\n#endif\n//< Optimization end-print-value\n}\n//< print-value\n//> Types of Values values-equal\nbool valuesEqual(Value a, Value b) {\n//> Optimization values-equal\n#ifdef NAN_BOXING\n//> nan-equality\n  if (IS_NUMBER(a) && IS_NUMBER(b)) {\n    return AS_NUMBER(a) == AS_NUMBER(b);\n  }\n//< nan-equality\n  return a == b;\n#else\n//< Optimization values-equal\n  if (a.type != b.type) return false;\n  switch (a.type) {\n    case VAL_BOOL:   return AS_BOOL(a) == AS_BOOL(b);\n    case VAL_NIL:    return true;\n    case VAL_NUMBER: return AS_NUMBER(a) == AS_NUMBER(b);\n/* Strings strings-equal < Hash Tables equal\n    case VAL_OBJ: {\n      ObjString* aString = AS_STRING(a);\n      ObjString* bString = AS_STRING(b);\n      return aString->length == bString->length &&\n          memcmp(aString->chars, bString->chars,\n                 aString->length) == 0;\n    }\n */\n//> Hash Tables equal\n    case VAL_OBJ:    return AS_OBJ(a) == AS_OBJ(b);\n//< Hash Tables equal\n    default:         return false; // Unreachable.\n  }\n//> Optimization end-values-equal\n#endif\n//< Optimization end-values-equal\n}\n//< Types of Values values-equal\n"
  },
  {
    "path": "c/value.h",
    "content": "//> Chunks of Bytecode value-h\n#ifndef clox_value_h\n#define clox_value_h\n//> Optimization include-string\n\n#include <string.h>\n//< Optimization include-string\n\n#include \"common.h\"\n\n//> Strings forward-declare-obj\ntypedef struct Obj Obj;\n//> forward-declare-obj-string\ntypedef struct ObjString ObjString;\n//< forward-declare-obj-string\n\n//< Strings forward-declare-obj\n//> Optimization nan-boxing\n#ifdef NAN_BOXING\n//> qnan\n\n//> sign-bit\n#define SIGN_BIT ((uint64_t)0x8000000000000000)\n//< sign-bit\n#define QNAN     ((uint64_t)0x7ffc000000000000)\n//< qnan\n//> tags\n\n#define TAG_NIL   1 // 01.\n#define TAG_FALSE 2 // 10.\n#define TAG_TRUE  3 // 11.\n//< tags\n\ntypedef uint64_t Value;\n//> is-number\n\n//> is-bool\n#define IS_BOOL(value)      (((value) | 1) == TRUE_VAL)\n//< is-bool\n//> is-nil\n#define IS_NIL(value)       ((value) == NIL_VAL)\n//< is-nil\n#define IS_NUMBER(value)    (((value) & QNAN) != QNAN)\n//< is-number\n//> is-obj\n#define IS_OBJ(value) \\\n    (((value) & (QNAN | SIGN_BIT)) == (QNAN | SIGN_BIT))\n//< is-obj\n//> as-number\n\n//> as-bool\n#define AS_BOOL(value)      ((value) == TRUE_VAL)\n//< as-bool\n#define AS_NUMBER(value)    valueToNum(value)\n//< as-number\n//> as-obj\n#define AS_OBJ(value) \\\n    ((Obj*)(uintptr_t)((value) & ~(SIGN_BIT | QNAN)))\n//< as-obj\n//> number-val\n\n//> bool-val\n#define BOOL_VAL(b)     ((b) ? 
TRUE_VAL : FALSE_VAL)\n//< bool-val\n//> false-true-vals\n#define FALSE_VAL       ((Value)(uint64_t)(QNAN | TAG_FALSE))\n#define TRUE_VAL        ((Value)(uint64_t)(QNAN | TAG_TRUE))\n//< false-true-vals\n//> nil-val\n#define NIL_VAL         ((Value)(uint64_t)(QNAN | TAG_NIL))\n//< nil-val\n#define NUMBER_VAL(num) numToValue(num)\n//< number-val\n//> obj-val\n#define OBJ_VAL(obj) \\\n    (Value)(SIGN_BIT | QNAN | (uint64_t)(uintptr_t)(obj))\n//< obj-val\n//> value-to-num\n\nstatic inline double valueToNum(Value value) {\n  double num;\n  memcpy(&num, &value, sizeof(Value));\n  return num;\n}\n//< value-to-num\n//> num-to-value\n\nstatic inline Value numToValue(double num) {\n  Value value;\n  memcpy(&value, &num, sizeof(double));\n  return value;\n}\n//< num-to-value\n\n#else\n\n//< Optimization nan-boxing\n//> Types of Values value-type\ntypedef enum {\n  VAL_BOOL,\n  VAL_NIL, // [user-types]\n  VAL_NUMBER,\n//> Strings val-obj\n  VAL_OBJ\n//< Strings val-obj\n} ValueType;\n\n//< Types of Values value-type\n/* Chunks of Bytecode value-h < Types of Values value\ntypedef double Value;\n*/\n//> Types of Values value\ntypedef struct {\n  ValueType type;\n  union {\n    bool boolean;\n    double number;\n//> Strings union-object\n    Obj* obj;\n//< Strings union-object\n  } as; // [as]\n} Value;\n//< Types of Values value\n//> Types of Values is-macros\n\n#define IS_BOOL(value)    ((value).type == VAL_BOOL)\n#define IS_NIL(value)     ((value).type == VAL_NIL)\n#define IS_NUMBER(value)  ((value).type == VAL_NUMBER)\n//> Strings is-obj\n#define IS_OBJ(value)     ((value).type == VAL_OBJ)\n//< Strings is-obj\n//< Types of Values is-macros\n//> Types of Values as-macros\n\n//> Strings as-obj\n#define AS_OBJ(value)     ((value).as.obj)\n//< Strings as-obj\n#define AS_BOOL(value)    ((value).as.boolean)\n#define AS_NUMBER(value)  ((value).as.number)\n//< Types of Values as-macros\n//> Types of Values value-macros\n\n#define BOOL_VAL(value)   ((Value){VAL_BOOL, {.boolean = 
value}})\n#define NIL_VAL           ((Value){VAL_NIL, {.number = 0}})\n#define NUMBER_VAL(value) ((Value){VAL_NUMBER, {.number = value}})\n//> Strings obj-val\n#define OBJ_VAL(object)   ((Value){VAL_OBJ, {.obj = (Obj*)object}})\n//< Strings obj-val\n//< Types of Values value-macros\n//> Optimization end-if-nan-boxing\n\n#endif\n//< Optimization end-if-nan-boxing\n//> value-array\n\ntypedef struct {\n  int capacity;\n  int count;\n  Value* values;\n} ValueArray;\n//< value-array\n//> array-fns-h\n\n//> Types of Values values-equal-h\nbool valuesEqual(Value a, Value b);\n//< Types of Values values-equal-h\nvoid initValueArray(ValueArray* array);\nvoid writeValueArray(ValueArray* array, Value value);\nvoid freeValueArray(ValueArray* array);\n//< array-fns-h\n//> print-value-h\nvoid printValue(Value value);\n//< print-value-h\n\n#endif\n"
  },
  {
    "path": "c/vm.c",
    "content": "//> A Virtual Machine vm-c\n//> Types of Values include-stdarg\n#include <stdarg.h>\n//< Types of Values include-stdarg\n//> vm-include-stdio\n#include <stdio.h>\n//> Strings vm-include-string\n#include <string.h>\n//< Strings vm-include-string\n//> Calls and Functions vm-include-time\n#include <time.h>\n//< Calls and Functions vm-include-time\n\n//< vm-include-stdio\n#include \"common.h\"\n//> Scanning on Demand vm-include-compiler\n#include \"compiler.h\"\n//< Scanning on Demand vm-include-compiler\n//> vm-include-debug\n#include \"debug.h\"\n//< vm-include-debug\n//> Strings vm-include-object-memory\n#include \"object.h\"\n#include \"memory.h\"\n//< Strings vm-include-object-memory\n#include \"vm.h\"\n\nVM vm; // [one]\n//> Calls and Functions clock-native\nstatic Value clockNative(int argCount, Value* args) {\n  return NUMBER_VAL((double)clock() / CLOCKS_PER_SEC);\n}\n//< Calls and Functions clock-native\n//> reset-stack\nstatic void resetStack() {\n  vm.stackTop = vm.stack;\n//> Calls and Functions reset-frame-count\n  vm.frameCount = 0;\n//< Calls and Functions reset-frame-count\n//> Closures init-open-upvalues\n  vm.openUpvalues = NULL;\n//< Closures init-open-upvalues\n}\n//< reset-stack\n//> Types of Values runtime-error\nstatic void runtimeError(const char* format, ...) 
{\n  va_list args;\n  va_start(args, format);\n  vfprintf(stderr, format, args);\n  va_end(args);\n  fputs(\"\\n\", stderr);\n\n/* Types of Values runtime-error < Calls and Functions runtime-error-temp\n  size_t instruction = vm.ip - vm.chunk->code - 1;\n  int line = vm.chunk->lines[instruction];\n*/\n/* Calls and Functions runtime-error-temp < Calls and Functions runtime-error-stack\n  CallFrame* frame = &vm.frames[vm.frameCount - 1];\n  size_t instruction = frame->ip - frame->function->chunk.code - 1;\n  int line = frame->function->chunk.lines[instruction];\n*/\n/* Types of Values runtime-error < Calls and Functions runtime-error-stack\n  fprintf(stderr, \"[line %d] in script\\n\", line);\n*/\n//> Calls and Functions runtime-error-stack\n  for (int i = vm.frameCount - 1; i >= 0; i--) {\n    CallFrame* frame = &vm.frames[i];\n/* Calls and Functions runtime-error-stack < Closures runtime-error-function\n    ObjFunction* function = frame->function;\n*/\n//> Closures runtime-error-function\n    ObjFunction* function = frame->closure->function;\n//< Closures runtime-error-function\n    size_t instruction = frame->ip - function->chunk.code - 1;\n    fprintf(stderr, \"[line %d] in \", // [minus]\n            function->chunk.lines[instruction]);\n    if (function->name == NULL) {\n      fprintf(stderr, \"script\\n\");\n    } else {\n      fprintf(stderr, \"%s()\\n\", function->name->chars);\n    }\n  }\n\n//< Calls and Functions runtime-error-stack\n  resetStack();\n}\n//< Types of Values runtime-error\n//> Calls and Functions define-native\nstatic void defineNative(const char* name, NativeFn function) {\n  push(OBJ_VAL(copyString(name, (int)strlen(name))));\n  push(OBJ_VAL(newNative(function)));\n  tableSet(&vm.globals, AS_STRING(vm.stack[0]), vm.stack[1]);\n  pop();\n  pop();\n}\n//< Calls and Functions define-native\n\nvoid initVM() {\n//> call-reset-stack\n  resetStack();\n//< call-reset-stack\n//> Strings init-objects-root\n  vm.objects = NULL;\n//< Strings 
init-objects-root\n//> Garbage Collection init-gc-fields\n  vm.bytesAllocated = 0;\n  vm.nextGC = 1024 * 1024;\n//< Garbage Collection init-gc-fields\n//> Garbage Collection init-gray-stack\n\n  vm.grayCount = 0;\n  vm.grayCapacity = 0;\n  vm.grayStack = NULL;\n//< Garbage Collection init-gray-stack\n//> Global Variables init-globals\n\n  initTable(&vm.globals);\n//< Global Variables init-globals\n//> Hash Tables init-strings\n  initTable(&vm.strings);\n//< Hash Tables init-strings\n//> Methods and Initializers init-init-string\n\n//> null-init-string\n  vm.initString = NULL;\n//< null-init-string\n  vm.initString = copyString(\"init\", 4);\n//< Methods and Initializers init-init-string\n//> Calls and Functions define-native-clock\n\n  defineNative(\"clock\", clockNative);\n//< Calls and Functions define-native-clock\n}\n\nvoid freeVM() {\n//> Global Variables free-globals\n  freeTable(&vm.globals);\n//< Global Variables free-globals\n//> Hash Tables free-strings\n  freeTable(&vm.strings);\n//< Hash Tables free-strings\n//> Methods and Initializers clear-init-string\n  vm.initString = NULL;\n//< Methods and Initializers clear-init-string\n//> Strings call-free-objects\n  freeObjects();\n//< Strings call-free-objects\n}\n//> push\nvoid push(Value value) {\n  *vm.stackTop = value;\n  vm.stackTop++;\n}\n//< push\n//> pop\nValue pop() {\n  vm.stackTop--;\n  return *vm.stackTop;\n}\n//< pop\n//> Types of Values peek\nstatic Value peek(int distance) {\n  return vm.stackTop[-1 - distance];\n}\n//< Types of Values peek\n/* Calls and Functions call < Closures call-signature\nstatic bool call(ObjFunction* function, int argCount) {\n*/\n//> Calls and Functions call\n//> Closures call-signature\nstatic bool call(ObjClosure* closure, int argCount) {\n//< Closures call-signature\n/* Calls and Functions check-arity < Closures check-arity\n  if (argCount != function->arity) {\n    runtimeError(\"Expected %d arguments but got %d.\",\n        function->arity, argCount);\n*/\n//> 
Closures check-arity\n  if (argCount != closure->function->arity) {\n    runtimeError(\"Expected %d arguments but got %d.\",\n        closure->function->arity, argCount);\n//< Closures check-arity\n//> check-arity\n    return false;\n  }\n\n//< check-arity\n//> check-overflow\n  if (vm.frameCount == FRAMES_MAX) {\n    runtimeError(\"Stack overflow.\");\n    return false;\n  }\n\n//< check-overflow\n  CallFrame* frame = &vm.frames[vm.frameCount++];\n/* Calls and Functions call < Closures call-init-closure\n  frame->function = function;\n  frame->ip = function->chunk.code;\n*/\n//> Closures call-init-closure\n  frame->closure = closure;\n  frame->ip = closure->function->chunk.code;\n//< Closures call-init-closure\n  frame->slots = vm.stackTop - argCount - 1;\n  return true;\n}\n//< Calls and Functions call\n//> Calls and Functions call-value\nstatic bool callValue(Value callee, int argCount) {\n  if (IS_OBJ(callee)) {\n    switch (OBJ_TYPE(callee)) {\n//> Methods and Initializers call-bound-method\n      case OBJ_BOUND_METHOD: {\n        ObjBoundMethod* bound = AS_BOUND_METHOD(callee);\n//> store-receiver\n        vm.stackTop[-argCount - 1] = bound->receiver;\n//< store-receiver\n        return call(bound->method, argCount);\n      }\n//< Methods and Initializers call-bound-method\n//> Classes and Instances call-class\n      case OBJ_CLASS: {\n        ObjClass* klass = AS_CLASS(callee);\n        vm.stackTop[-argCount - 1] = OBJ_VAL(newInstance(klass));\n//> Methods and Initializers call-init\n        Value initializer;\n        if (tableGet(&klass->methods, vm.initString,\n                     &initializer)) {\n          return call(AS_CLOSURE(initializer), argCount);\n//> no-init-arity-error\n        } else if (argCount != 0) {\n          runtimeError(\"Expected 0 arguments but got %d.\",\n                       argCount);\n          return false;\n//< no-init-arity-error\n        }\n//< Methods and Initializers call-init\n        return true;\n      }\n//< Classes 
and Instances call-class\n//> Closures call-value-closure\n      case OBJ_CLOSURE:\n        return call(AS_CLOSURE(callee), argCount);\n//< Closures call-value-closure\n/* Calls and Functions call-value < Closures call-value-closure\n      case OBJ_FUNCTION: // [switch]\n        return call(AS_FUNCTION(callee), argCount);\n*/\n//> call-native\n      case OBJ_NATIVE: {\n        NativeFn native = AS_NATIVE(callee);\n        Value result = native(argCount, vm.stackTop - argCount);\n        vm.stackTop -= argCount + 1;\n        push(result);\n        return true;\n      }\n//< call-native\n      default:\n        break; // Non-callable object type.\n    }\n  }\n  runtimeError(\"Can only call functions and classes.\");\n  return false;\n}\n//< Calls and Functions call-value\n//> Methods and Initializers invoke-from-class\nstatic bool invokeFromClass(ObjClass* klass, ObjString* name,\n                            int argCount) {\n  Value method;\n  if (!tableGet(&klass->methods, name, &method)) {\n    runtimeError(\"Undefined property '%s'.\", name->chars);\n    return false;\n  }\n  return call(AS_CLOSURE(method), argCount);\n}\n//< Methods and Initializers invoke-from-class\n//> Methods and Initializers invoke\nstatic bool invoke(ObjString* name, int argCount) {\n  Value receiver = peek(argCount);\n//> invoke-check-type\n\n  if (!IS_INSTANCE(receiver)) {\n    runtimeError(\"Only instances have methods.\");\n    return false;\n  }\n\n//< invoke-check-type\n  ObjInstance* instance = AS_INSTANCE(receiver);\n//> invoke-field\n\n  Value value;\n  if (tableGet(&instance->fields, name, &value)) {\n    vm.stackTop[-argCount - 1] = value;\n    return callValue(value, argCount);\n  }\n\n//< invoke-field\n  return invokeFromClass(instance->klass, name, argCount);\n}\n//< Methods and Initializers invoke\n//> Methods and Initializers bind-method\nstatic bool bindMethod(ObjClass* klass, ObjString* name) {\n  Value method;\n  if (!tableGet(&klass->methods, name, &method)) {\n    
runtimeError(\"Undefined property '%s'.\", name->chars);\n    return false;\n  }\n\n  ObjBoundMethod* bound = newBoundMethod(peek(0),\n                                         AS_CLOSURE(method));\n  pop();\n  push(OBJ_VAL(bound));\n  return true;\n}\n//< Methods and Initializers bind-method\n//> Closures capture-upvalue\nstatic ObjUpvalue* captureUpvalue(Value* local) {\n//> look-for-existing-upvalue\n  ObjUpvalue* prevUpvalue = NULL;\n  ObjUpvalue* upvalue = vm.openUpvalues;\n  while (upvalue != NULL && upvalue->location > local) {\n    prevUpvalue = upvalue;\n    upvalue = upvalue->next;\n  }\n\n  if (upvalue != NULL && upvalue->location == local) {\n    return upvalue;\n  }\n\n//< look-for-existing-upvalue\n  ObjUpvalue* createdUpvalue = newUpvalue(local);\n//> insert-upvalue-in-list\n  createdUpvalue->next = upvalue;\n\n  if (prevUpvalue == NULL) {\n    vm.openUpvalues = createdUpvalue;\n  } else {\n    prevUpvalue->next = createdUpvalue;\n  }\n\n//< insert-upvalue-in-list\n  return createdUpvalue;\n}\n//< Closures capture-upvalue\n//> Closures close-upvalues\nstatic void closeUpvalues(Value* last) {\n  while (vm.openUpvalues != NULL &&\n         vm.openUpvalues->location >= last) {\n    ObjUpvalue* upvalue = vm.openUpvalues;\n    upvalue->closed = *upvalue->location;\n    upvalue->location = &upvalue->closed;\n    vm.openUpvalues = upvalue->next;\n  }\n}\n//< Closures close-upvalues\n//> Methods and Initializers define-method\nstatic void defineMethod(ObjString* name) {\n  Value method = peek(0);\n  ObjClass* klass = AS_CLASS(peek(1));\n  tableSet(&klass->methods, name, method);\n  pop();\n}\n//< Methods and Initializers define-method\n//> Types of Values is-falsey\nstatic bool isFalsey(Value value) {\n  return IS_NIL(value) || (IS_BOOL(value) && !AS_BOOL(value));\n}\n//< Types of Values is-falsey\n//> Strings concatenate\nstatic void concatenate() {\n/* Strings concatenate < Garbage Collection concatenate-peek\n  ObjString* b = AS_STRING(pop());\n  
ObjString* a = AS_STRING(pop());\n*/\n//> Garbage Collection concatenate-peek\n  ObjString* b = AS_STRING(peek(0));\n  ObjString* a = AS_STRING(peek(1));\n//< Garbage Collection concatenate-peek\n\n  int length = a->length + b->length;\n  char* chars = ALLOCATE(char, length + 1);\n  memcpy(chars, a->chars, a->length);\n  memcpy(chars + a->length, b->chars, b->length);\n  chars[length] = '\\0';\n\n  ObjString* result = takeString(chars, length);\n//> Garbage Collection concatenate-pop\n  pop();\n  pop();\n//< Garbage Collection concatenate-pop\n  push(OBJ_VAL(result));\n}\n//< Strings concatenate\n//> run\nstatic InterpretResult run() {\n//> Calls and Functions run\n  CallFrame* frame = &vm.frames[vm.frameCount - 1];\n\n/* A Virtual Machine run < Calls and Functions run\n#define READ_BYTE() (*vm.ip++)\n*/\n#define READ_BYTE() (*frame->ip++)\n/* A Virtual Machine read-constant < Calls and Functions run\n#define READ_CONSTANT() (vm.chunk->constants.values[READ_BYTE()])\n*/\n\n/* Jumping Back and Forth read-short < Calls and Functions run\n#define READ_SHORT() \\\n    (vm.ip += 2, (uint16_t)((vm.ip[-2] << 8) | vm.ip[-1]))\n*/\n#define READ_SHORT() \\\n    (frame->ip += 2, \\\n    (uint16_t)((frame->ip[-2] << 8) | frame->ip[-1]))\n\n/* Calls and Functions run < Closures read-constant\n#define READ_CONSTANT() \\\n    (frame->function->chunk.constants.values[READ_BYTE()])\n*/\n//> Closures read-constant\n#define READ_CONSTANT() \\\n    (frame->closure->function->chunk.constants.values[READ_BYTE()])\n//< Closures read-constant\n\n//< Calls and Functions run\n//> Global Variables read-string\n#define READ_STRING() AS_STRING(READ_CONSTANT())\n//< Global Variables read-string\n/* A Virtual Machine binary-op < Types of Values binary-op\n#define BINARY_OP(op) \\\n    do { \\\n      double b = pop(); \\\n      double a = pop(); \\\n      push(a op b); \\\n    } while (false)\n*/\n//> Types of Values binary-op\n#define BINARY_OP(valueType, op) \\\n    do { \\\n      if 
(!IS_NUMBER(peek(0)) || !IS_NUMBER(peek(1))) { \\\n        runtimeError(\"Operands must be numbers.\"); \\\n        return INTERPRET_RUNTIME_ERROR; \\\n      } \\\n      double b = AS_NUMBER(pop()); \\\n      double a = AS_NUMBER(pop()); \\\n      push(valueType(a op b)); \\\n    } while (false)\n//< Types of Values binary-op\n\n  for (;;) {\n//> trace-execution\n#ifdef DEBUG_TRACE_EXECUTION\n//> trace-stack\n    printf(\"          \");\n    for (Value* slot = vm.stack; slot < vm.stackTop; slot++) {\n      printf(\"[ \");\n      printValue(*slot);\n      printf(\" ]\");\n    }\n    printf(\"\\n\");\n//< trace-stack\n/* A Virtual Machine trace-execution < Calls and Functions trace-execution\n    disassembleInstruction(vm.chunk,\n                           (int)(vm.ip - vm.chunk->code));\n*/\n/* Calls and Functions trace-execution < Closures disassemble-instruction\n    disassembleInstruction(&frame->function->chunk,\n        (int)(frame->ip - frame->function->chunk.code));\n*/\n//> Closures disassemble-instruction\n    disassembleInstruction(&frame->closure->function->chunk,\n        (int)(frame->ip - frame->closure->function->chunk.code));\n//< Closures disassemble-instruction\n#endif\n\n//< trace-execution\n    uint8_t instruction;\n    switch (instruction = READ_BYTE()) {\n//> op-constant\n      case OP_CONSTANT: {\n        Value constant = READ_CONSTANT();\n/* A Virtual Machine op-constant < A Virtual Machine push-constant\n        printValue(constant);\n        printf(\"\\n\");\n*/\n//> push-constant\n        push(constant);\n//< push-constant\n        break;\n      }\n//< op-constant\n//> Types of Values interpret-literals\n      case OP_NIL: push(NIL_VAL); break;\n      case OP_TRUE: push(BOOL_VAL(true)); break;\n      case OP_FALSE: push(BOOL_VAL(false)); break;\n//< Types of Values interpret-literals\n//> Global Variables interpret-pop\n      case OP_POP: pop(); break;\n//< Global Variables interpret-pop\n//> Local Variables interpret-get-local\n      case 
OP_GET_LOCAL: {\n        uint8_t slot = READ_BYTE();\n/* Local Variables interpret-get-local < Calls and Functions push-local\n        push(vm.stack[slot]); // [slot]\n*/\n//> Calls and Functions push-local\n        push(frame->slots[slot]);\n//< Calls and Functions push-local\n        break;\n      }\n//< Local Variables interpret-get-local\n//> Local Variables interpret-set-local\n      case OP_SET_LOCAL: {\n        uint8_t slot = READ_BYTE();\n/* Local Variables interpret-set-local < Calls and Functions set-local\n        vm.stack[slot] = peek(0);\n*/\n//> Calls and Functions set-local\n        frame->slots[slot] = peek(0);\n//< Calls and Functions set-local\n        break;\n      }\n//< Local Variables interpret-set-local\n//> Global Variables interpret-get-global\n      case OP_GET_GLOBAL: {\n        ObjString* name = READ_STRING();\n        Value value;\n        if (!tableGet(&vm.globals, name, &value)) {\n          runtimeError(\"Undefined variable '%s'.\", name->chars);\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        push(value);\n        break;\n      }\n//< Global Variables interpret-get-global\n//> Global Variables interpret-define-global\n      case OP_DEFINE_GLOBAL: {\n        ObjString* name = READ_STRING();\n        tableSet(&vm.globals, name, peek(0));\n        pop();\n        break;\n      }\n//< Global Variables interpret-define-global\n//> Global Variables interpret-set-global\n      case OP_SET_GLOBAL: {\n        ObjString* name = READ_STRING();\n        if (tableSet(&vm.globals, name, peek(0))) {\n          tableDelete(&vm.globals, name); // [delete]\n          runtimeError(\"Undefined variable '%s'.\", name->chars);\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        break;\n      }\n//< Global Variables interpret-set-global\n//> Closures interpret-get-upvalue\n      case OP_GET_UPVALUE: {\n        uint8_t slot = READ_BYTE();\n        push(*frame->closure->upvalues[slot]->location);\n        break;\n      }\n//< 
Closures interpret-get-upvalue\n//> Closures interpret-set-upvalue\n      case OP_SET_UPVALUE: {\n        uint8_t slot = READ_BYTE();\n        *frame->closure->upvalues[slot]->location = peek(0);\n        break;\n      }\n//< Closures interpret-set-upvalue\n//> Classes and Instances interpret-get-property\n      case OP_GET_PROPERTY: {\n//> get-not-instance\n        if (!IS_INSTANCE(peek(0))) {\n          runtimeError(\"Only instances have properties.\");\n          return INTERPRET_RUNTIME_ERROR;\n        }\n\n//< get-not-instance\n        ObjInstance* instance = AS_INSTANCE(peek(0));\n        ObjString* name = READ_STRING();\n        \n        Value value;\n        if (tableGet(&instance->fields, name, &value)) {\n          pop(); // Instance.\n          push(value);\n          break;\n        }\n//> get-undefined\n\n//< get-undefined\n/* Classes and Instances get-undefined < Methods and Initializers get-method\n        runtimeError(\"Undefined property '%s'.\", name->chars);\n        return INTERPRET_RUNTIME_ERROR;\n*/\n//> Methods and Initializers get-method\n        if (!bindMethod(instance->klass, name)) {\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        break;\n//< Methods and Initializers get-method\n      }\n//< Classes and Instances interpret-get-property\n//> Classes and Instances interpret-set-property\n      case OP_SET_PROPERTY: {\n//> set-not-instance\n        if (!IS_INSTANCE(peek(1))) {\n          runtimeError(\"Only instances have fields.\");\n          return INTERPRET_RUNTIME_ERROR;\n        }\n\n//< set-not-instance\n        ObjInstance* instance = AS_INSTANCE(peek(1));\n        tableSet(&instance->fields, READ_STRING(), peek(0));\n        Value value = pop();\n        pop();\n        push(value);\n        break;\n      }\n//< Classes and Instances interpret-set-property\n//> Superclasses interpret-get-super\n      case OP_GET_SUPER: {\n        ObjString* name = READ_STRING();\n        ObjClass* superclass = AS_CLASS(pop());\n     
   \n        if (!bindMethod(superclass, name)) {\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        break;\n      }\n//< Superclasses interpret-get-super\n//> Types of Values interpret-equal\n      case OP_EQUAL: {\n        Value b = pop();\n        Value a = pop();\n        push(BOOL_VAL(valuesEqual(a, b)));\n        break;\n      }\n//< Types of Values interpret-equal\n//> Types of Values interpret-comparison\n      case OP_GREATER:  BINARY_OP(BOOL_VAL, >); break;\n      case OP_LESS:     BINARY_OP(BOOL_VAL, <); break;\n//< Types of Values interpret-comparison\n/* A Virtual Machine op-binary < Types of Values op-arithmetic\n      case OP_ADD:      BINARY_OP(+); break;\n      case OP_SUBTRACT: BINARY_OP(-); break;\n      case OP_MULTIPLY: BINARY_OP(*); break;\n      case OP_DIVIDE:   BINARY_OP(/); break;\n*/\n/* A Virtual Machine op-negate < Types of Values op-negate\n      case OP_NEGATE:   push(-pop()); break;\n*/\n/* Types of Values op-arithmetic < Strings add-strings\n      case OP_ADD:      BINARY_OP(NUMBER_VAL, +); break;\n*/\n//> Strings add-strings\n      case OP_ADD: {\n        if (IS_STRING(peek(0)) && IS_STRING(peek(1))) {\n          concatenate();\n        } else if (IS_NUMBER(peek(0)) && IS_NUMBER(peek(1))) {\n          double b = AS_NUMBER(pop());\n          double a = AS_NUMBER(pop());\n          push(NUMBER_VAL(a + b));\n        } else {\n          runtimeError(\n              \"Operands must be two numbers or two strings.\");\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        break;\n      }\n//< Strings add-strings\n//> Types of Values op-arithmetic\n      case OP_SUBTRACT: BINARY_OP(NUMBER_VAL, -); break;\n      case OP_MULTIPLY: BINARY_OP(NUMBER_VAL, *); break;\n      case OP_DIVIDE:   BINARY_OP(NUMBER_VAL, /); break;\n//< Types of Values op-arithmetic\n//> Types of Values op-not\n      case OP_NOT:\n        push(BOOL_VAL(isFalsey(pop())));\n        break;\n//< Types of Values op-not\n//> Types of Values op-negate\n     
 case OP_NEGATE:\n        if (!IS_NUMBER(peek(0))) {\n          runtimeError(\"Operand must be a number.\");\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        push(NUMBER_VAL(-AS_NUMBER(pop())));\n        break;\n//< Types of Values op-negate\n//> Global Variables interpret-print\n      case OP_PRINT: {\n        printValue(pop());\n        printf(\"\\n\");\n        break;\n      }\n//< Global Variables interpret-print\n//> Jumping Back and Forth op-jump\n      case OP_JUMP: {\n        uint16_t offset = READ_SHORT();\n/* Jumping Back and Forth op-jump < Calls and Functions jump\n        vm.ip += offset;\n*/\n//> Calls and Functions jump\n        frame->ip += offset;\n//< Calls and Functions jump\n        break;\n      }\n//< Jumping Back and Forth op-jump\n//> Jumping Back and Forth op-jump-if-false\n      case OP_JUMP_IF_FALSE: {\n        uint16_t offset = READ_SHORT();\n/* Jumping Back and Forth op-jump-if-false < Calls and Functions jump-if-false\n        if (isFalsey(peek(0))) vm.ip += offset;\n*/\n//> Calls and Functions jump-if-false\n        if (isFalsey(peek(0))) frame->ip += offset;\n//< Calls and Functions jump-if-false\n        break;\n      }\n//< Jumping Back and Forth op-jump-if-false\n//> Jumping Back and Forth op-loop\n      case OP_LOOP: {\n        uint16_t offset = READ_SHORT();\n/* Jumping Back and Forth op-loop < Calls and Functions loop\n        vm.ip -= offset;\n*/\n//> Calls and Functions loop\n        frame->ip -= offset;\n//< Calls and Functions loop\n        break;\n      }\n//< Jumping Back and Forth op-loop\n//> Calls and Functions interpret-call\n      case OP_CALL: {\n        int argCount = READ_BYTE();\n        if (!callValue(peek(argCount), argCount)) {\n          return INTERPRET_RUNTIME_ERROR;\n        }\n//> update-frame-after-call\n        frame = &vm.frames[vm.frameCount - 1];\n//< update-frame-after-call\n        break;\n      }\n//< Calls and Functions interpret-call\n//> Methods and Initializers interpret-invoke\n 
     case OP_INVOKE: {\n        ObjString* method = READ_STRING();\n        int argCount = READ_BYTE();\n        if (!invoke(method, argCount)) {\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        frame = &vm.frames[vm.frameCount - 1];\n        break;\n      }\n//< Methods and Initializers interpret-invoke\n//> Superclasses interpret-super-invoke\n      case OP_SUPER_INVOKE: {\n        ObjString* method = READ_STRING();\n        int argCount = READ_BYTE();\n        ObjClass* superclass = AS_CLASS(pop());\n        if (!invokeFromClass(superclass, method, argCount)) {\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        frame = &vm.frames[vm.frameCount - 1];\n        break;\n      }\n//< Superclasses interpret-super-invoke\n//> Closures interpret-closure\n      case OP_CLOSURE: {\n        ObjFunction* function = AS_FUNCTION(READ_CONSTANT());\n        ObjClosure* closure = newClosure(function);\n        push(OBJ_VAL(closure));\n//> interpret-capture-upvalues\n        for (int i = 0; i < closure->upvalueCount; i++) {\n          uint8_t isLocal = READ_BYTE();\n          uint8_t index = READ_BYTE();\n          if (isLocal) {\n            closure->upvalues[i] =\n                captureUpvalue(frame->slots + index);\n          } else {\n            closure->upvalues[i] = frame->closure->upvalues[index];\n          }\n        }\n//< interpret-capture-upvalues\n        break;\n      }\n//< Closures interpret-closure\n//> Closures interpret-close-upvalue\n      case OP_CLOSE_UPVALUE:\n        closeUpvalues(vm.stackTop - 1);\n        pop();\n        break;\n//< Closures interpret-close-upvalue\n      case OP_RETURN: {\n/* A Virtual Machine print-return < Global Variables op-return\n        printValue(pop());\n        printf(\"\\n\");\n*/\n/* Global Variables op-return < Calls and Functions interpret-return\n        // Exit interpreter.\n*/\n/* A Virtual Machine run < Calls and Functions interpret-return\n        return INTERPRET_OK;\n*/\n//> Calls and 
Functions interpret-return\n        Value result = pop();\n//> Closures return-close-upvalues\n        closeUpvalues(frame->slots);\n//< Closures return-close-upvalues\n        vm.frameCount--;\n        if (vm.frameCount == 0) {\n          pop();\n          return INTERPRET_OK;\n        }\n\n        vm.stackTop = frame->slots;\n        push(result);\n        frame = &vm.frames[vm.frameCount - 1];\n        break;\n//< Calls and Functions interpret-return\n      }\n//> Classes and Instances interpret-class\n      case OP_CLASS:\n        push(OBJ_VAL(newClass(READ_STRING())));\n        break;\n//< Classes and Instances interpret-class\n//> Superclasses interpret-inherit\n      case OP_INHERIT: {\n        Value superclass = peek(1);\n//> inherit-non-class\n        if (!IS_CLASS(superclass)) {\n          runtimeError(\"Superclass must be a class.\");\n          return INTERPRET_RUNTIME_ERROR;\n        }\n\n//< inherit-non-class\n        ObjClass* subclass = AS_CLASS(peek(0));\n        tableAddAll(&AS_CLASS(superclass)->methods,\n                    &subclass->methods);\n        pop(); // Subclass.\n        break;\n      }\n//< Superclasses interpret-inherit\n//> Methods and Initializers interpret-method\n      case OP_METHOD:\n        defineMethod(READ_STRING());\n        break;\n//< Methods and Initializers interpret-method\n    }\n  }\n\n#undef READ_BYTE\n//> Jumping Back and Forth undef-read-short\n#undef READ_SHORT\n//< Jumping Back and Forth undef-read-short\n//> undef-read-constant\n#undef READ_CONSTANT\n//< undef-read-constant\n//> Global Variables undef-read-string\n#undef READ_STRING\n//< Global Variables undef-read-string\n//> undef-binary-op\n#undef BINARY_OP\n//< undef-binary-op\n}\n//< run\n//> omit\nvoid hack(bool b) {\n  // Hack to avoid unused function error. 
run() is not used in the\n  // scanning chapter.\n  run();\n  if (b) hack(false);\n}\n//< omit\n//> interpret\n/* A Virtual Machine interpret < Scanning on Demand vm-interpret-c\nInterpretResult interpret(Chunk* chunk) {\n  vm.chunk = chunk;\n  vm.ip = vm.chunk->code;\n  return run();\n*/\n//> Scanning on Demand vm-interpret-c\nInterpretResult interpret(const char* source) {\n/* Scanning on Demand vm-interpret-c < Compiling Expressions interpret-chunk\n  compile(source);\n  return INTERPRET_OK;\n*/\n/* Compiling Expressions interpret-chunk < Calls and Functions interpret-stub\n  Chunk chunk;\n  initChunk(&chunk);\n\n  if (!compile(source, &chunk)) {\n    freeChunk(&chunk);\n    return INTERPRET_COMPILE_ERROR;\n  }\n\n  vm.chunk = &chunk;\n  vm.ip = vm.chunk->code;\n*/\n//> Calls and Functions interpret-stub\n  ObjFunction* function = compile(source);\n  if (function == NULL) return INTERPRET_COMPILE_ERROR;\n\n  push(OBJ_VAL(function));\n//< Calls and Functions interpret-stub\n/* Calls and Functions interpret-stub < Calls and Functions interpret\n  CallFrame* frame = &vm.frames[vm.frameCount++];\n  frame->function = function;\n  frame->ip = function->chunk.code;\n  frame->slots = vm.stack;\n*/\n/* Calls and Functions interpret < Closures interpret\n  call(function, 0);\n*/\n//> Closures interpret\n  ObjClosure* closure = newClosure(function);\n  pop();\n  push(OBJ_VAL(closure));\n  call(closure, 0);\n//< Closures interpret\n//< Scanning on Demand vm-interpret-c\n//> Compiling Expressions interpret-chunk\n\n/* Compiling Expressions interpret-chunk < Calls and Functions end-interpret\n  InterpretResult result = run();\n\n  freeChunk(&chunk);\n  return result;\n*/\n//> Calls and Functions end-interpret\n  return run();\n//< Calls and Functions end-interpret\n//< Compiling Expressions interpret-chunk\n}\n//< interpret\n"
  },
  {
    "path": "c/vm.h",
    "content": "//> A Virtual Machine vm-h\n#ifndef clox_vm_h\n#define clox_vm_h\n\n/* A Virtual Machine vm-h < Calls and Functions vm-include-object\n#include \"chunk.h\"\n*/\n//> Calls and Functions vm-include-object\n#include \"object.h\"\n//< Calls and Functions vm-include-object\n//> Hash Tables vm-include-table\n#include \"table.h\"\n//< Hash Tables vm-include-table\n//> vm-include-value\n#include \"value.h\"\n//< vm-include-value\n//> stack-max\n\n//< stack-max\n/* A Virtual Machine stack-max < Calls and Functions frame-max\n#define STACK_MAX 256\n*/\n//> Calls and Functions frame-max\n#define FRAMES_MAX 64\n#define STACK_MAX (FRAMES_MAX * UINT8_COUNT)\n//< Calls and Functions frame-max\n//> Calls and Functions call-frame\n\ntypedef struct {\n/* Calls and Functions call-frame < Closures call-frame-closure\n  ObjFunction* function;\n*/\n//> Closures call-frame-closure\n  ObjClosure* closure;\n//< Closures call-frame-closure\n  uint8_t* ip;\n  Value* slots;\n} CallFrame;\n//< Calls and Functions call-frame\n\ntypedef struct {\n/* A Virtual Machine vm-h < Calls and Functions frame-array\n  Chunk* chunk;\n*/\n/* A Virtual Machine ip < Calls and Functions frame-array\n  uint8_t* ip;\n*/\n//> Calls and Functions frame-array\n  CallFrame frames[FRAMES_MAX];\n  int frameCount;\n  \n//< Calls and Functions frame-array\n//> vm-stack\n  Value stack[STACK_MAX];\n  Value* stackTop;\n//< vm-stack\n//> Global Variables vm-globals\n  Table globals;\n//< Global Variables vm-globals\n//> Hash Tables vm-strings\n  Table strings;\n//< Hash Tables vm-strings\n//> Methods and Initializers vm-init-string\n  ObjString* initString;\n//< Methods and Initializers vm-init-string\n//> Closures open-upvalues-field\n  ObjUpvalue* openUpvalues;\n//< Closures open-upvalues-field\n//> Garbage Collection vm-fields\n\n  size_t bytesAllocated;\n  size_t nextGC;\n//< Garbage Collection vm-fields\n//> Strings objects-root\n  Obj* objects;\n//< Strings objects-root\n//> Garbage Collection 
vm-gray-stack\n  int grayCount;\n  int grayCapacity;\n  Obj** grayStack;\n//< Garbage Collection vm-gray-stack\n} VM;\n\n//> interpret-result\ntypedef enum {\n  INTERPRET_OK,\n  INTERPRET_COMPILE_ERROR,\n  INTERPRET_RUNTIME_ERROR\n} InterpretResult;\n\n//< interpret-result\n//> Strings extern-vm\nextern VM vm;\n\n//< Strings extern-vm\nvoid initVM();\nvoid freeVM();\n/* A Virtual Machine interpret-h < Scanning on Demand vm-interpret-h\nInterpretResult interpret(Chunk* chunk);\n*/\n//> Scanning on Demand vm-interpret-h\nInterpretResult interpret(const char* source);\n//< Scanning on Demand vm-interpret-h\n//> push-pop\nvoid push(Value value);\nValue pop();\n//< push-pop\n\n#endif\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/AstPrinter.java",
    "content": "//> Representing Code ast-printer\npackage com.craftinginterpreters.lox;\n//> omit\n\nimport java.util.List;\n//< omit\n\n/* Representing Code ast-printer < Statements and State omit\nclass AstPrinter implements Expr.Visitor<String> {\n*/\n//> Statements and State omit\nclass AstPrinter implements Expr.Visitor<String>, Stmt.Visitor<String> {\n//< Statements and State omit\n  String print(Expr expr) {\n    return expr.accept(this);\n  }\n//> Statements and State omit\n\n  String print(Stmt stmt) {\n    return stmt.accept(this);\n  }\n//< Statements and State omit\n//> visit-methods\n//> Statements and State omit\n  @Override\n  public String visitBlockStmt(Stmt.Block stmt) {\n    StringBuilder builder = new StringBuilder();\n    builder.append(\"(block \");\n\n    for (Stmt statement : stmt.statements) {\n      builder.append(statement.accept(this));\n    }\n\n    builder.append(\")\");\n    return builder.toString();\n  }\n//< Statements and State omit\n//> Classes omit\n\n  @Override\n  public String visitClassStmt(Stmt.Class stmt) {\n    StringBuilder builder = new StringBuilder();\n    builder.append(\"(class \" + stmt.name.lexeme);\n//> Inheritance omit\n\n    if (stmt.superclass != null) {\n      builder.append(\" < \" + print(stmt.superclass));\n    }\n//< Inheritance omit\n\n    for (Stmt.Function method : stmt.methods) {\n      builder.append(\" \" + print(method));\n    }\n\n    builder.append(\")\");\n    return builder.toString();\n  }\n//< Classes omit\n//> Statements and State omit\n\n  @Override\n  public String visitExpressionStmt(Stmt.Expression stmt) {\n    return parenthesize(\";\", stmt.expression);\n  }\n//< Statements and State omit\n//> Functions omit\n\n  @Override\n  public String visitFunctionStmt(Stmt.Function stmt) {\n    StringBuilder builder = new StringBuilder();\n    builder.append(\"(fun \" + stmt.name.lexeme + \"(\");\n\n    for (Token param : stmt.params) {\n      if (param != stmt.params.get(0)) builder.append(\" 
\");\n      builder.append(param.lexeme);\n    }\n\n    builder.append(\") \");\n\n    for (Stmt body : stmt.body) {\n      builder.append(body.accept(this));\n    }\n\n    builder.append(\")\");\n    return builder.toString();\n  }\n//< Functions omit\n//> Control Flow omit\n\n  @Override\n  public String visitIfStmt(Stmt.If stmt) {\n    if (stmt.elseBranch == null) {\n      return parenthesize2(\"if\", stmt.condition, stmt.thenBranch);\n    }\n\n    return parenthesize2(\"if-else\", stmt.condition, stmt.thenBranch,\n        stmt.elseBranch);\n  }\n//< Control Flow omit\n//> Statements and State omit\n\n  @Override\n  public String visitPrintStmt(Stmt.Print stmt) {\n    return parenthesize(\"print\", stmt.expression);\n  }\n//< Statements and State omit\n//> Functions omit\n\n  @Override\n  public String visitReturnStmt(Stmt.Return stmt) {\n    if (stmt.value == null) return \"(return)\";\n    return parenthesize(\"return\", stmt.value);\n  }\n//< Functions omit\n//> Statements and State omit\n\n  @Override\n  public String visitVarStmt(Stmt.Var stmt) {\n    if (stmt.initializer == null) {\n      return parenthesize2(\"var\", stmt.name);\n    }\n\n    return parenthesize2(\"var\", stmt.name, \"=\", stmt.initializer);\n  }\n//< Statements and State omit\n//> Control Flow omit\n\n  @Override\n  public String visitWhileStmt(Stmt.While stmt) {\n    return parenthesize2(\"while\", stmt.condition, stmt.body);\n  }\n//< Control Flow omit\n//> Statements and State omit\n\n  @Override\n  public String visitAssignExpr(Expr.Assign expr) {\n    return parenthesize2(\"=\", expr.name.lexeme, expr.value);\n  }\n//< Statements and State omit\n\n  @Override\n  public String visitBinaryExpr(Expr.Binary expr) {\n    return parenthesize(expr.operator.lexeme,\n                        expr.left, expr.right);\n  }\n//> Functions omit\n\n  @Override\n  public String visitCallExpr(Expr.Call expr) {\n    return parenthesize2(\"call\", expr.callee, expr.arguments);\n  }\n//< Functions 
omit\n//> Classes omit\n\n  @Override\n  public String visitGetExpr(Expr.Get expr) {\n    return parenthesize2(\".\", expr.object, expr.name.lexeme);\n  }\n//< Classes omit\n\n  @Override\n  public String visitGroupingExpr(Expr.Grouping expr) {\n    return parenthesize(\"group\", expr.expression);\n  }\n\n  @Override\n  public String visitLiteralExpr(Expr.Literal expr) {\n    if (expr.value == null) return \"nil\";\n    return expr.value.toString();\n  }\n//> Control Flow omit\n\n  @Override\n  public String visitLogicalExpr(Expr.Logical expr) {\n    return parenthesize(expr.operator.lexeme, expr.left, expr.right);\n  }\n//< Control Flow omit\n//> Classes omit\n\n  @Override\n  public String visitSetExpr(Expr.Set expr) {\n    return parenthesize2(\"=\",\n        expr.object, expr.name.lexeme, expr.value);\n  }\n//< Classes omit\n//> Inheritance omit\n\n  @Override\n  public String visitSuperExpr(Expr.Super expr) {\n    return parenthesize2(\"super\", expr.method);\n  }\n//< Inheritance omit\n//> Classes omit\n\n  @Override\n  public String visitThisExpr(Expr.This expr) {\n    return \"this\";\n  }\n//< Classes omit\n\n  @Override\n  public String visitUnaryExpr(Expr.Unary expr) {\n    return parenthesize(expr.operator.lexeme, expr.right);\n  }\n//> Statements and State omit\n\n  @Override\n  public String visitVariableExpr(Expr.Variable expr) {\n    return expr.name.lexeme;\n  }\n//< Statements and State omit\n//< visit-methods\n//> print-utilities\n  private String parenthesize(String name, Expr... 
exprs) {\n    StringBuilder builder = new StringBuilder();\n\n    builder.append(\"(\").append(name);\n    for (Expr expr : exprs) {\n      builder.append(\" \");\n      builder.append(expr.accept(this));\n    }\n    builder.append(\")\");\n\n    return builder.toString();\n  }\n//< print-utilities\n//> omit\n  // Note: Printing other types of syntax trees is not shown in the\n  // book, but this is provided here as a reference for those reading\n  // the full code.\n  private String parenthesize2(String name, Object... parts) {\n    StringBuilder builder = new StringBuilder();\n\n    builder.append(\"(\").append(name);\n    transform(builder, parts);\n    builder.append(\")\");\n\n    return builder.toString();\n  }\n\n  private void transform(StringBuilder builder, Object... parts) {\n    for (Object part : parts) {\n      builder.append(\" \");\n      if (part instanceof Expr) {\n        builder.append(((Expr) part).accept(this));\n//> Statements and State omit\n      } else if (part instanceof Stmt) {\n        builder.append(((Stmt) part).accept(this));\n//< Statements and State omit\n      } else if (part instanceof Token) {\n        builder.append(((Token) part).lexeme);\n      } else if (part instanceof List) {\n        transform(builder, ((List) part).toArray());\n      } else {\n        builder.append(part);\n      }\n    }\n  }\n//< omit\n/* Representing Code printer-main < Representing Code omit\n  public static void main(String[] args) {\n    Expr expression = new Expr.Binary(\n        new Expr.Unary(\n            new Token(TokenType.MINUS, \"-\", null, 1),\n            new Expr.Literal(123)),\n        new Token(TokenType.STAR, \"*\", null, 1),\n        new Expr.Grouping(\n            new Expr.Literal(45.67)));\n\n    System.out.println(new AstPrinter().print(expression));\n  }\n*/\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/Environment.java",
    "content": "//> Statements and State environment-class\npackage com.craftinginterpreters.lox;\n\nimport java.util.HashMap;\nimport java.util.Map;\n\nclass Environment {\n//> enclosing-field\n  final Environment enclosing;\n//< enclosing-field\n  private final Map<String, Object> values = new HashMap<>();\n//> environment-constructors\n  Environment() {\n    enclosing = null;\n  }\n\n  Environment(Environment enclosing) {\n    this.enclosing = enclosing;\n  }\n//< environment-constructors\n//> environment-get\n\n  Object get(Token name) {\n    if (values.containsKey(name.lexeme)) {\n      return values.get(name.lexeme);\n    }\n//> environment-get-enclosing\n\n    if (enclosing != null) return enclosing.get(name);\n//< environment-get-enclosing\n\n    throw new RuntimeError(name,\n        \"Undefined variable '\" + name.lexeme + \"'.\");\n  }\n\n//< environment-get\n//> environment-assign\n  void assign(Token name, Object value) {\n    if (values.containsKey(name.lexeme)) {\n      values.put(name.lexeme, value);\n      return;\n    }\n\n//> environment-assign-enclosing\n    if (enclosing != null) {\n      enclosing.assign(name, value);\n      return;\n    }\n\n//< environment-assign-enclosing\n    throw new RuntimeError(name,\n        \"Undefined variable '\" + name.lexeme + \"'.\");\n  }\n//< environment-assign\n//> environment-define\n  void define(String name, Object value) {\n    values.put(name, value);\n  }\n//< environment-define\n//> Resolving and Binding ancestor\n  Environment ancestor(int distance) {\n    Environment environment = this;\n    for (int i = 0; i < distance; i++) {\n      environment = environment.enclosing; // [coupled]\n    }\n\n    return environment;\n  }\n//< Resolving and Binding ancestor\n//> Resolving and Binding get-at\n  Object getAt(int distance, String name) {\n    return ancestor(distance).values.get(name);\n  }\n//< Resolving and Binding get-at\n//> Resolving and Binding assign-at\n  void assignAt(int distance, Token name, 
Object value) {\n    ancestor(distance).values.put(name.lexeme, value);\n  }\n//< Resolving and Binding assign-at\n//> omit\n  @Override\n  public String toString() {\n    String result = values.toString();\n    if (enclosing != null) {\n      result += \" -> \" + enclosing.toString();\n    }\n\n    return result;\n  }\n//< omit\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/Expr.java",
    "content": "//> Appendix II expr\npackage com.craftinginterpreters.lox;\n\nimport java.util.List;\n\nabstract class Expr {\n  interface Visitor<R> {\n    R visitAssignExpr(Assign expr);\n    R visitBinaryExpr(Binary expr);\n    R visitCallExpr(Call expr);\n    R visitGetExpr(Get expr);\n    R visitGroupingExpr(Grouping expr);\n    R visitLiteralExpr(Literal expr);\n    R visitLogicalExpr(Logical expr);\n    R visitSetExpr(Set expr);\n    R visitSuperExpr(Super expr);\n    R visitThisExpr(This expr);\n    R visitUnaryExpr(Unary expr);\n    R visitVariableExpr(Variable expr);\n  }\n\n  // Nested Expr classes here...\n//> expr-assign\n  static class Assign extends Expr {\n    Assign(Token name, Expr value) {\n      this.name = name;\n      this.value = value;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitAssignExpr(this);\n    }\n\n    final Token name;\n    final Expr value;\n  }\n//< expr-assign\n//> expr-binary\n  static class Binary extends Expr {\n    Binary(Expr left, Token operator, Expr right) {\n      this.left = left;\n      this.operator = operator;\n      this.right = right;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitBinaryExpr(this);\n    }\n\n    final Expr left;\n    final Token operator;\n    final Expr right;\n  }\n//< expr-binary\n//> expr-call\n  static class Call extends Expr {\n    Call(Expr callee, Token paren, List<Expr> arguments) {\n      this.callee = callee;\n      this.paren = paren;\n      this.arguments = arguments;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitCallExpr(this);\n    }\n\n    final Expr callee;\n    final Token paren;\n    final List<Expr> arguments;\n  }\n//< expr-call\n//> expr-get\n  static class Get extends Expr {\n    Get(Expr object, Token name) {\n      this.object = object;\n      this.name = name;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return 
visitor.visitGetExpr(this);\n    }\n\n    final Expr object;\n    final Token name;\n  }\n//< expr-get\n//> expr-grouping\n  static class Grouping extends Expr {\n    Grouping(Expr expression) {\n      this.expression = expression;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitGroupingExpr(this);\n    }\n\n    final Expr expression;\n  }\n//< expr-grouping\n//> expr-literal\n  static class Literal extends Expr {\n    Literal(Object value) {\n      this.value = value;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitLiteralExpr(this);\n    }\n\n    final Object value;\n  }\n//< expr-literal\n//> expr-logical\n  static class Logical extends Expr {\n    Logical(Expr left, Token operator, Expr right) {\n      this.left = left;\n      this.operator = operator;\n      this.right = right;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitLogicalExpr(this);\n    }\n\n    final Expr left;\n    final Token operator;\n    final Expr right;\n  }\n//< expr-logical\n//> expr-set\n  static class Set extends Expr {\n    Set(Expr object, Token name, Expr value) {\n      this.object = object;\n      this.name = name;\n      this.value = value;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitSetExpr(this);\n    }\n\n    final Expr object;\n    final Token name;\n    final Expr value;\n  }\n//< expr-set\n//> expr-super\n  static class Super extends Expr {\n    Super(Token keyword, Token method) {\n      this.keyword = keyword;\n      this.method = method;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitSuperExpr(this);\n    }\n\n    final Token keyword;\n    final Token method;\n  }\n//< expr-super\n//> expr-this\n  static class This extends Expr {\n    This(Token keyword) {\n      this.keyword = keyword;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      
return visitor.visitThisExpr(this);\n    }\n\n    final Token keyword;\n  }\n//< expr-this\n//> expr-unary\n  static class Unary extends Expr {\n    Unary(Token operator, Expr right) {\n      this.operator = operator;\n      this.right = right;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitUnaryExpr(this);\n    }\n\n    final Token operator;\n    final Expr right;\n  }\n//< expr-unary\n//> expr-variable\n  static class Variable extends Expr {\n    Variable(Token name) {\n      this.name = name;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitVariableExpr(this);\n    }\n\n    final Token name;\n  }\n//< expr-variable\n\n  abstract <R> R accept(Visitor<R> visitor);\n}\n//< Appendix II expr\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/Interpreter.java",
    "content": "//> Evaluating Expressions interpreter-class\npackage com.craftinginterpreters.lox;\n//> Statements and State import-list\n\n//> Functions import-array-list\nimport java.util.ArrayList;\n//< Functions import-array-list\n//> Resolving and Binding import-hash-map\nimport java.util.HashMap;\n//< Resolving and Binding import-hash-map\nimport java.util.List;\n//< Statements and State import-list\n//> Resolving and Binding import-map\nimport java.util.Map;\n//< Resolving and Binding import-map\n\n/* Evaluating Expressions interpreter-class < Statements and State interpreter\nclass Interpreter implements Expr.Visitor<Object> {\n*/\n//> Statements and State interpreter\nclass Interpreter implements Expr.Visitor<Object>,\n                             Stmt.Visitor<Void> {\n//< Statements and State interpreter\n/* Statements and State environment-field < Functions global-environment\n  private Environment environment = new Environment();\n*/\n//> Functions global-environment\n  final Environment globals = new Environment();\n  private Environment environment = globals;\n//< Functions global-environment\n//> Resolving and Binding locals-field\n  private final Map<Expr, Integer> locals = new HashMap<>();\n//< Resolving and Binding locals-field\n//> Statements and State environment-field\n\n//< Statements and State environment-field\n//> Functions interpreter-constructor\n  Interpreter() {\n    globals.define(\"clock\", new LoxCallable() {\n      @Override\n      public int arity() { return 0; }\n\n      @Override\n      public Object call(Interpreter interpreter,\n                         List<Object> arguments) {\n        return (double)System.currentTimeMillis() / 1000.0;\n      }\n\n      @Override\n      public String toString() { return \"<native fn>\"; }\n    });\n  }\n  \n//< Functions interpreter-constructor\n/* Evaluating Expressions interpret < Statements and State interpret\n  void interpret(Expr expression) { // [void]\n    try {\n      Object value 
= evaluate(expression);\n      System.out.println(stringify(value));\n    } catch (RuntimeError error) {\n      Lox.runtimeError(error);\n    }\n  }\n*/\n//> Statements and State interpret\n  void interpret(List<Stmt> statements) {\n    try {\n      for (Stmt statement : statements) {\n        execute(statement);\n      }\n    } catch (RuntimeError error) {\n      Lox.runtimeError(error);\n    }\n  }\n//< Statements and State interpret\n//> evaluate\n  private Object evaluate(Expr expr) {\n    return expr.accept(this);\n  }\n//< evaluate\n//> Statements and State execute\n  private void execute(Stmt stmt) {\n    stmt.accept(this);\n  }\n//< Statements and State execute\n//> Resolving and Binding resolve\n  void resolve(Expr expr, int depth) {\n    locals.put(expr, depth);\n  }\n//< Resolving and Binding resolve\n//> Statements and State execute-block\n  void executeBlock(List<Stmt> statements,\n                    Environment environment) {\n    Environment previous = this.environment;\n    try {\n      this.environment = environment;\n\n      for (Stmt statement : statements) {\n        execute(statement);\n      }\n    } finally {\n      this.environment = previous;\n    }\n  }\n//< Statements and State execute-block\n//> Statements and State visit-block\n  @Override\n  public Void visitBlockStmt(Stmt.Block stmt) {\n    executeBlock(stmt.statements, new Environment(environment));\n    return null;\n  }\n//< Statements and State visit-block\n//> Classes interpreter-visit-class\n  @Override\n  public Void visitClassStmt(Stmt.Class stmt) {\n//> Inheritance interpret-superclass\n    Object superclass = null;\n    if (stmt.superclass != null) {\n      superclass = evaluate(stmt.superclass);\n      if (!(superclass instanceof LoxClass)) {\n        throw new RuntimeError(stmt.superclass.name,\n            \"Superclass must be a class.\");\n      }\n    }\n\n//< Inheritance interpret-superclass\n    environment.define(stmt.name.lexeme, null);\n//> Inheritance 
begin-superclass-environment\n\n    if (stmt.superclass != null) {\n      environment = new Environment(environment);\n      environment.define(\"super\", superclass);\n    }\n//< Inheritance begin-superclass-environment\n//> interpret-methods\n\n    Map<String, LoxFunction> methods = new HashMap<>();\n    for (Stmt.Function method : stmt.methods) {\n/* Classes interpret-methods < Classes interpreter-method-initializer\n      LoxFunction function = new LoxFunction(method, environment);\n*/\n//> interpreter-method-initializer\n      LoxFunction function = new LoxFunction(method, environment,\n          method.name.lexeme.equals(\"init\"));\n//< interpreter-method-initializer\n      methods.put(method.name.lexeme, function);\n    }\n\n/* Classes interpret-methods < Inheritance interpreter-construct-class\n    LoxClass klass = new LoxClass(stmt.name.lexeme, methods);\n*/\n//> Inheritance interpreter-construct-class\n    LoxClass klass = new LoxClass(stmt.name.lexeme,\n        (LoxClass)superclass, methods);\n//> end-superclass-environment\n\n    if (superclass != null) {\n      environment = environment.enclosing;\n    }\n//< end-superclass-environment\n\n//< Inheritance interpreter-construct-class\n//< interpret-methods\n/* Classes interpreter-visit-class < Classes interpret-methods\n    LoxClass klass = new LoxClass(stmt.name.lexeme);\n*/\n    environment.assign(stmt.name, klass);\n    return null;\n  }\n//< Classes interpreter-visit-class\n//> Statements and State visit-expression-stmt\n  @Override\n  public Void visitExpressionStmt(Stmt.Expression stmt) {\n    evaluate(stmt.expression);\n    return null;\n  }\n//< Statements and State visit-expression-stmt\n//> Functions visit-function\n  @Override\n  public Void visitFunctionStmt(Stmt.Function stmt) {\n/* Functions visit-function < Functions visit-closure\n    LoxFunction function = new LoxFunction(stmt);\n*/\n/* Functions visit-closure < Classes construct-function\n    LoxFunction function = new 
LoxFunction(stmt, environment);\n*/\n//> Classes construct-function\n    LoxFunction function = new LoxFunction(stmt, environment,\n                                           false);\n//< Classes construct-function\n    environment.define(stmt.name.lexeme, function);\n    return null;\n  }\n//< Functions visit-function\n//> Control Flow visit-if\n  @Override\n  public Void visitIfStmt(Stmt.If stmt) {\n    if (isTruthy(evaluate(stmt.condition))) {\n      execute(stmt.thenBranch);\n    } else if (stmt.elseBranch != null) {\n      execute(stmt.elseBranch);\n    }\n    return null;\n  }\n//< Control Flow visit-if\n//> Statements and State visit-print\n  @Override\n  public Void visitPrintStmt(Stmt.Print stmt) {\n    Object value = evaluate(stmt.expression);\n    System.out.println(stringify(value));\n    return null;\n  }\n//< Statements and State visit-print\n//> Functions visit-return\n  @Override\n  public Void visitReturnStmt(Stmt.Return stmt) {\n    Object value = null;\n    if (stmt.value != null) value = evaluate(stmt.value);\n\n    throw new Return(value);\n  }\n//< Functions visit-return\n//> Statements and State visit-var\n  @Override\n  public Void visitVarStmt(Stmt.Var stmt) {\n    Object value = null;\n    if (stmt.initializer != null) {\n      value = evaluate(stmt.initializer);\n    }\n\n    environment.define(stmt.name.lexeme, value);\n    return null;\n  }\n//< Statements and State visit-var\n//> Control Flow visit-while\n  @Override\n  public Void visitWhileStmt(Stmt.While stmt) {\n    while (isTruthy(evaluate(stmt.condition))) {\n      execute(stmt.body);\n    }\n    return null;\n  }\n//< Control Flow visit-while\n//> Statements and State visit-assign\n  @Override\n  public Object visitAssignExpr(Expr.Assign expr) {\n    Object value = evaluate(expr.value);\n/* Statements and State visit-assign < Resolving and Binding resolved-assign\n    environment.assign(expr.name, value);\n*/\n//> Resolving and Binding resolved-assign\n\n    Integer distance = 
locals.get(expr);\n    if (distance != null) {\n      environment.assignAt(distance, expr.name, value);\n    } else {\n      globals.assign(expr.name, value);\n    }\n\n//< Resolving and Binding resolved-assign\n    return value;\n  }\n//< Statements and State visit-assign\n//> visit-binary\n  @Override\n  public Object visitBinaryExpr(Expr.Binary expr) {\n    Object left = evaluate(expr.left);\n    Object right = evaluate(expr.right); // [left]\n\n    switch (expr.operator.type) {\n//> binary-equality\n      case BANG_EQUAL: return !isEqual(left, right);\n      case EQUAL_EQUAL: return isEqual(left, right);\n//< binary-equality\n//> binary-comparison\n      case GREATER:\n//> check-greater-operand\n        checkNumberOperands(expr.operator, left, right);\n//< check-greater-operand\n        return (double)left > (double)right;\n      case GREATER_EQUAL:\n//> check-greater-equal-operand\n        checkNumberOperands(expr.operator, left, right);\n//< check-greater-equal-operand\n        return (double)left >= (double)right;\n      case LESS:\n//> check-less-operand\n        checkNumberOperands(expr.operator, left, right);\n//< check-less-operand\n        return (double)left < (double)right;\n      case LESS_EQUAL:\n//> check-less-equal-operand\n        checkNumberOperands(expr.operator, left, right);\n//< check-less-equal-operand\n        return (double)left <= (double)right;\n//< binary-comparison\n      case MINUS:\n//> check-minus-operand\n        checkNumberOperands(expr.operator, left, right);\n//< check-minus-operand\n        return (double)left - (double)right;\n//> binary-plus\n      case PLUS:\n        if (left instanceof Double && right instanceof Double) {\n          return (double)left + (double)right;\n        } // [plus]\n\n        if (left instanceof String && right instanceof String) {\n          return (String)left + (String)right;\n        }\n\n/* Evaluating Expressions binary-plus < Evaluating Expressions string-wrong-type\n        break;\n*/\n//> 
string-wrong-type\n        throw new RuntimeError(expr.operator,\n            \"Operands must be two numbers or two strings.\");\n//< string-wrong-type\n//< binary-plus\n      case SLASH:\n//> check-slash-operand\n        checkNumberOperands(expr.operator, left, right);\n//< check-slash-operand\n        return (double)left / (double)right;\n      case STAR:\n//> check-star-operand\n        checkNumberOperands(expr.operator, left, right);\n//< check-star-operand\n        return (double)left * (double)right;\n    }\n\n    // Unreachable.\n    return null;\n  }\n//< visit-binary\n//> Functions visit-call\n  @Override\n  public Object visitCallExpr(Expr.Call expr) {\n    Object callee = evaluate(expr.callee);\n\n    List<Object> arguments = new ArrayList<>();\n    for (Expr argument : expr.arguments) { // [in-order]\n      arguments.add(evaluate(argument));\n    }\n\n//> check-is-callable\n    if (!(callee instanceof LoxCallable)) {\n      throw new RuntimeError(expr.paren,\n          \"Can only call functions and classes.\");\n    }\n\n//< check-is-callable\n    LoxCallable function = (LoxCallable)callee;\n//> check-arity\n    if (arguments.size() != function.arity()) {\n      throw new RuntimeError(expr.paren, \"Expected \" +\n          function.arity() + \" arguments but got \" +\n          arguments.size() + \".\");\n    }\n\n//< check-arity\n    return function.call(this, arguments);\n  }\n//< Functions visit-call\n//> Classes interpreter-visit-get\n  @Override\n  public Object visitGetExpr(Expr.Get expr) {\n    Object object = evaluate(expr.object);\n    if (object instanceof LoxInstance) {\n      return ((LoxInstance) object).get(expr.name);\n    }\n\n    throw new RuntimeError(expr.name,\n        \"Only instances have properties.\");\n  }\n//< Classes interpreter-visit-get\n//> visit-grouping\n  @Override\n  public Object visitGroupingExpr(Expr.Grouping expr) {\n    return evaluate(expr.expression);\n  }\n//< visit-grouping\n//> visit-literal\n  @Override\n  
public Object visitLiteralExpr(Expr.Literal expr) {\n    return expr.value;\n  }\n//< visit-literal\n//> Control Flow visit-logical\n  @Override\n  public Object visitLogicalExpr(Expr.Logical expr) {\n    Object left = evaluate(expr.left);\n\n    if (expr.operator.type == TokenType.OR) {\n      if (isTruthy(left)) return left;\n    } else {\n      if (!isTruthy(left)) return left;\n    }\n\n    return evaluate(expr.right);\n  }\n//< Control Flow visit-logical\n//> Classes interpreter-visit-set\n  @Override\n  public Object visitSetExpr(Expr.Set expr) {\n    Object object = evaluate(expr.object);\n\n    if (!(object instanceof LoxInstance)) { // [order]\n      throw new RuntimeError(expr.name,\n                             \"Only instances have fields.\");\n    }\n\n    Object value = evaluate(expr.value);\n    ((LoxInstance)object).set(expr.name, value);\n    return value;\n  }\n//< Classes interpreter-visit-set\n//> Inheritance interpreter-visit-super\n  @Override\n  public Object visitSuperExpr(Expr.Super expr) {\n    int distance = locals.get(expr);\n    LoxClass superclass = (LoxClass)environment.getAt(\n        distance, \"super\");\n//> super-find-this\n\n    LoxInstance object = (LoxInstance)environment.getAt(\n        distance - 1, \"this\");\n//< super-find-this\n//> super-find-method\n\n    LoxFunction method = superclass.findMethod(expr.method.lexeme);\n//> super-no-method\n\n    if (method == null) {\n      throw new RuntimeError(expr.method,\n          \"Undefined property '\" + expr.method.lexeme + \"'.\");\n    }\n\n//< super-no-method\n    return method.bind(object);\n//< super-find-method\n  }\n//< Inheritance interpreter-visit-super\n//> Classes interpreter-visit-this\n  @Override\n  public Object visitThisExpr(Expr.This expr) {\n    return lookUpVariable(expr.keyword, expr);\n  }\n//< Classes interpreter-visit-this\n//> visit-unary\n  @Override\n  public Object visitUnaryExpr(Expr.Unary expr) {\n    Object right = evaluate(expr.right);\n\n    
switch (expr.operator.type) {\n//> unary-bang\n      case BANG:\n        return !isTruthy(right);\n//< unary-bang\n      case MINUS:\n//> check-unary-operand\n        checkNumberOperand(expr.operator, right);\n//< check-unary-operand\n        return -(double)right;\n    }\n\n    // Unreachable.\n    return null;\n  }\n//< visit-unary\n//> Statements and State visit-variable\n  @Override\n  public Object visitVariableExpr(Expr.Variable expr) {\n/* Statements and State visit-variable < Resolving and Binding call-look-up-variable\n    return environment.get(expr.name);\n*/\n//> Resolving and Binding call-look-up-variable\n    return lookUpVariable(expr.name, expr);\n//< Resolving and Binding call-look-up-variable\n  }\n//> Resolving and Binding look-up-variable\n  private Object lookUpVariable(Token name, Expr expr) {\n    Integer distance = locals.get(expr);\n    if (distance != null) {\n      return environment.getAt(distance, name.lexeme);\n    } else {\n      return globals.get(name);\n    }\n  }\n//< Resolving and Binding look-up-variable\n//< Statements and State visit-variable\n//> check-operand\n  private void checkNumberOperand(Token operator, Object operand) {\n    if (operand instanceof Double) return;\n    throw new RuntimeError(operator, \"Operand must be a number.\");\n  }\n//< check-operand\n//> check-operands\n  private void checkNumberOperands(Token operator,\n                                   Object left, Object right) {\n    if (left instanceof Double && right instanceof Double) return;\n    // [operand]\n    throw new RuntimeError(operator, \"Operands must be numbers.\");\n  }\n//< check-operands\n//> is-truthy\n  private boolean isTruthy(Object object) {\n    if (object == null) return false;\n    if (object instanceof Boolean) return (boolean)object;\n    return true;\n  }\n//< is-truthy\n//> is-equal\n  private boolean isEqual(Object a, Object b) {\n    if (a == null && b == null) return true;\n    if (a == null) return false;\n\n    return 
a.equals(b);\n  }\n//< is-equal\n//> stringify\n  private String stringify(Object object) {\n    if (object == null) return \"nil\";\n\n    if (object instanceof Double) {\n      String text = object.toString();\n      if (text.endsWith(\".0\")) {\n        text = text.substring(0, text.length() - 2);\n      }\n      return text;\n    }\n\n    return object.toString();\n  }\n//< stringify\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/Lox.java",
    "content": "//> Scanning lox-class\npackage com.craftinginterpreters.lox;\n\nimport java.io.BufferedReader;\nimport java.io.IOException;\nimport java.io.InputStreamReader;\nimport java.nio.charset.Charset;\nimport java.nio.file.Files;\nimport java.nio.file.Paths;\nimport java.util.List;\n\npublic class Lox {\n//> Evaluating Expressions interpreter-instance\n  private static final Interpreter interpreter = new Interpreter();\n//< Evaluating Expressions interpreter-instance\n//> had-error\n  static boolean hadError = false;\n//< had-error\n//> Evaluating Expressions had-runtime-error-field\n  static boolean hadRuntimeError = false;\n\n//< Evaluating Expressions had-runtime-error-field\n  public static void main(String[] args) throws IOException {\n    if (args.length > 1) {\n      System.out.println(\"Usage: jlox [script]\");\n      System.exit(64); // [64]\n    } else if (args.length == 1) {\n      runFile(args[0]);\n    } else {\n      runPrompt();\n    }\n  }\n//> run-file\n  private static void runFile(String path) throws IOException {\n    byte[] bytes = Files.readAllBytes(Paths.get(path));\n    run(new String(bytes, Charset.defaultCharset()));\n//> exit-code\n\n    // Indicate an error in the exit code.\n    if (hadError) System.exit(65);\n//< exit-code\n//> Evaluating Expressions check-runtime-error\n    if (hadRuntimeError) System.exit(70);\n//< Evaluating Expressions check-runtime-error\n  }\n//< run-file\n//> prompt\n  private static void runPrompt() throws IOException {\n    InputStreamReader input = new InputStreamReader(System.in);\n    BufferedReader reader = new BufferedReader(input);\n\n    for (;;) { // [repl]\n      System.out.print(\"> \");\n      String line = reader.readLine();\n      if (line == null) break;\n      run(line);\n//> reset-had-error\n      hadError = false;\n//< reset-had-error\n    }\n  }\n//< prompt\n//> run\n  private static void run(String source) {\n    Scanner scanner = new Scanner(source);\n    List<Token> tokens = 
scanner.scanTokens();\n/* Scanning run < Parsing Expressions print-ast\n\n    // For now, just print the tokens.\n    for (Token token : tokens) {\n      System.out.println(token);\n    }\n*/\n//> Parsing Expressions print-ast\n    Parser parser = new Parser(tokens);\n/* Parsing Expressions print-ast < Statements and State parse-statements\n    Expr expression = parser.parse();\n*/\n//> Statements and State parse-statements\n    List<Stmt> statements = parser.parse();\n//< Statements and State parse-statements\n\n    // Stop if there was a syntax error.\n    if (hadError) return;\n\n//< Parsing Expressions print-ast\n//> Resolving and Binding create-resolver\n    Resolver resolver = new Resolver(interpreter);\n    resolver.resolve(statements);\n//> resolution-error\n\n    // Stop if there was a resolution error.\n    if (hadError) return;\n//< resolution-error\n\n//< Resolving and Binding create-resolver\n/* Parsing Expressions print-ast < Evaluating Expressions interpreter-interpret\n    System.out.println(new AstPrinter().print(expression));\n*/\n/* Evaluating Expressions interpreter-interpret < Statements and State interpret-statements\n    interpreter.interpret(expression);\n*/\n//> Statements and State interpret-statements\n    interpreter.interpret(statements);\n//< Statements and State interpret-statements\n  }\n//< run\n//> lox-error\n  static void error(int line, String message) {\n    report(line, \"\", message);\n  }\n\n  private static void report(int line, String where,\n                             String message) {\n    System.err.println(\n        \"[line \" + line + \"] Error\" + where + \": \" + message);\n    hadError = true;\n  }\n//< lox-error\n//> Parsing Expressions token-error\n  static void error(Token token, String message) {\n    if (token.type == TokenType.EOF) {\n      report(token.line, \" at end\", message);\n    } else {\n      report(token.line, \" at '\" + token.lexeme + \"'\", message);\n    }\n  }\n//< Parsing Expressions 
token-error\n//> Evaluating Expressions runtime-error-method\n  static void runtimeError(RuntimeError error) {\n    System.err.println(error.getMessage() +\n        \"\\n[line \" + error.token.line + \"]\");\n    hadRuntimeError = true;\n  }\n//< Evaluating Expressions runtime-error-method\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/LoxCallable.java",
    "content": "//> Functions callable\npackage com.craftinginterpreters.lox;\n\nimport java.util.List;\n\ninterface LoxCallable {\n//> callable-arity\n  int arity();\n//< callable-arity\n  Object call(Interpreter interpreter, List<Object> arguments);\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/LoxClass.java",
    "content": "//> Classes lox-class\npackage com.craftinginterpreters.lox;\n\nimport java.util.List;\nimport java.util.Map;\n\n/* Classes lox-class < Classes lox-class-callable\nclass LoxClass {\n*/\n//> lox-class-callable\nclass LoxClass implements LoxCallable {\n//< lox-class-callable\n  final String name;\n//> Inheritance lox-class-superclass-field\n  final LoxClass superclass;\n//< Inheritance lox-class-superclass-field\n/* Classes lox-class < Classes lox-class-methods\n\n  LoxClass(String name) {\n    this.name = name;\n  }\n*/\n//> lox-class-methods\n  private final Map<String, LoxFunction> methods;\n\n/* Classes lox-class-methods < Inheritance lox-class-constructor\n  LoxClass(String name, Map<String, LoxFunction> methods) {\n*/\n//> Inheritance lox-class-constructor\n  LoxClass(String name, LoxClass superclass,\n           Map<String, LoxFunction> methods) {\n    this.superclass = superclass;\n//< Inheritance lox-class-constructor\n    this.name = name;\n    this.methods = methods;\n  }\n//< lox-class-methods\n//> lox-class-find-method\n  LoxFunction findMethod(String name) {\n    if (methods.containsKey(name)) {\n      return methods.get(name);\n    }\n\n//> Inheritance find-method-recurse-superclass\n    if (superclass != null) {\n      return superclass.findMethod(name);\n    }\n\n//< Inheritance find-method-recurse-superclass\n    return null;\n  }\n//< lox-class-find-method\n\n  @Override\n  public String toString() {\n    return name;\n  }\n//> lox-class-call-arity\n  @Override\n  public Object call(Interpreter interpreter,\n                     List<Object> arguments) {\n    LoxInstance instance = new LoxInstance(this);\n//> lox-class-call-initializer\n    LoxFunction initializer = findMethod(\"init\");\n    if (initializer != null) {\n      initializer.bind(instance).call(interpreter, arguments);\n    }\n\n//< lox-class-call-initializer\n    return instance;\n  }\n\n  @Override\n  public int arity() {\n/* Classes lox-class-call-arity < Classes 
lox-initializer-arity\n    return 0;\n*/\n//> lox-initializer-arity\n    LoxFunction initializer = findMethod(\"init\");\n    if (initializer == null) return 0;\n    return initializer.arity();\n//< lox-initializer-arity\n  }\n//< lox-class-call-arity\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/LoxFunction.java",
    "content": "//> Functions lox-function\npackage com.craftinginterpreters.lox;\n\nimport java.util.List;\n\nclass LoxFunction implements LoxCallable {\n  private final Stmt.Function declaration;\n//> closure-field\n  private final Environment closure;\n  \n//< closure-field\n/* Functions lox-function < Functions closure-constructor\n  LoxFunction(Stmt.Function declaration) {\n*/\n/* Functions closure-constructor < Classes is-initializer-field\n  LoxFunction(Stmt.Function declaration, Environment closure) {\n*/\n//> Classes is-initializer-field\n  private final boolean isInitializer;\n\n  LoxFunction(Stmt.Function declaration, Environment closure,\n              boolean isInitializer) {\n    this.isInitializer = isInitializer;\n//< Classes is-initializer-field\n//> closure-constructor\n    this.closure = closure;\n//< closure-constructor\n    this.declaration = declaration;\n  }\n//> Classes bind-instance\n  LoxFunction bind(LoxInstance instance) {\n    Environment environment = new Environment(closure);\n    environment.define(\"this\", instance);\n/* Classes bind-instance < Classes lox-function-bind-with-initializer\n    return new LoxFunction(declaration, environment);\n*/\n//> lox-function-bind-with-initializer\n    return new LoxFunction(declaration, environment,\n                           isInitializer);\n//< lox-function-bind-with-initializer\n  }\n//< Classes bind-instance\n//> function-to-string\n  @Override\n  public String toString() {\n    return \"<fn \" + declaration.name.lexeme + \">\";\n  }\n//< function-to-string\n//> function-arity\n  @Override\n  public int arity() {\n    return declaration.params.size();\n  }\n//< function-arity\n//> function-call\n  @Override\n  public Object call(Interpreter interpreter,\n                     List<Object> arguments) {\n/* Functions function-call < Functions call-closure\n    Environment environment = new Environment(interpreter.globals);\n*/\n//> call-closure\n    Environment environment = new 
Environment(closure);\n//< call-closure\n    for (int i = 0; i < declaration.params.size(); i++) {\n      environment.define(declaration.params.get(i).lexeme,\n          arguments.get(i));\n    }\n\n/* Functions function-call < Functions catch-return\n    interpreter.executeBlock(declaration.body, environment);\n*/\n//> catch-return\n    try {\n      interpreter.executeBlock(declaration.body, environment);\n    } catch (Return returnValue) {\n//> Classes early-return-this\n      if (isInitializer) return closure.getAt(0, \"this\");\n\n//< Classes early-return-this\n      return returnValue.value;\n    }\n//< catch-return\n//> Classes return-this\n\n    if (isInitializer) return closure.getAt(0, \"this\");\n//< Classes return-this\n    return null;\n  }\n//< function-call\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/LoxInstance.java",
    "content": "//> Classes lox-instance\npackage com.craftinginterpreters.lox;\n\nimport java.util.HashMap;\nimport java.util.Map;\n\nclass LoxInstance {\n  private LoxClass klass;\n//> lox-instance-fields\n  private final Map<String, Object> fields = new HashMap<>();\n//< lox-instance-fields\n\n  LoxInstance(LoxClass klass) {\n    this.klass = klass;\n  }\n\n//> lox-instance-get-property\n  Object get(Token name) {\n    if (fields.containsKey(name.lexeme)) {\n      return fields.get(name.lexeme);\n    }\n\n//> lox-instance-get-method\n    LoxFunction method = klass.findMethod(name.lexeme);\n/* Classes lox-instance-get-method < Classes lox-instance-bind-method\n    if (method != null) return method;\n*/\n//> lox-instance-bind-method\n    if (method != null) return method.bind(this);\n//< lox-instance-bind-method\n\n//< lox-instance-get-method\n    throw new RuntimeError(name, // [hidden]\n        \"Undefined property '\" + name.lexeme + \"'.\");\n  }\n//< lox-instance-get-property\n//> lox-instance-set-property\n  void set(Token name, Object value) {\n    fields.put(name.lexeme, value);\n  }\n//< lox-instance-set-property\n  @Override\n  public String toString() {\n    return klass.name + \" instance\";\n  }\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/Parser.java",
    "content": "//> Parsing Expressions parser\npackage com.craftinginterpreters.lox;\n\n//> Statements and State parser-imports\nimport java.util.ArrayList;\n//< Statements and State parser-imports\n//> Control Flow import-arrays\nimport java.util.Arrays;\n//< Control Flow import-arrays\nimport java.util.List;\n\nimport static com.craftinginterpreters.lox.TokenType.*;\n\nclass Parser {\n//> parse-error\n  private static class ParseError extends RuntimeException {}\n\n//< parse-error\n  private final List<Token> tokens;\n  private int current = 0;\n\n  Parser(List<Token> tokens) {\n    this.tokens = tokens;\n  }\n/* Parsing Expressions parse < Statements and State parse\n  Expr parse() {\n    try {\n      return expression();\n    } catch (ParseError error) {\n      return null;\n    }\n  }\n*/\n//> Statements and State parse\n  List<Stmt> parse() {\n    List<Stmt> statements = new ArrayList<>();\n    while (!isAtEnd()) {\n/* Statements and State parse < Statements and State parse-declaration\n      statements.add(statement());\n*/\n//> parse-declaration\n      statements.add(declaration());\n//< parse-declaration\n    }\n\n    return statements; // [parse-error-handling]\n  }\n//< Statements and State parse\n//> expression\n  private Expr expression() {\n/* Parsing Expressions expression < Statements and State expression\n    return equality();\n*/\n//> Statements and State expression\n    return assignment();\n//< Statements and State expression\n  }\n//< expression\n//> Statements and State declaration\n  private Stmt declaration() {\n    try {\n//> Classes match-class\n      if (match(CLASS)) return classDeclaration();\n//< Classes match-class\n//> Functions match-fun\n      if (match(FUN)) return function(\"function\");\n//< Functions match-fun\n      if (match(VAR)) return varDeclaration();\n\n      return statement();\n    } catch (ParseError error) {\n      synchronize();\n      return null;\n    }\n  }\n//< Statements and State declaration\n//> Classes 
parse-class-declaration\n  private Stmt classDeclaration() {\n    Token name = consume(IDENTIFIER, \"Expect class name.\");\n//> Inheritance parse-superclass\n\n    Expr.Variable superclass = null;\n    if (match(LESS)) {\n      consume(IDENTIFIER, \"Expect superclass name.\");\n      superclass = new Expr.Variable(previous());\n    }\n\n//< Inheritance parse-superclass\n    consume(LEFT_BRACE, \"Expect '{' before class body.\");\n\n    List<Stmt.Function> methods = new ArrayList<>();\n    while (!check(RIGHT_BRACE) && !isAtEnd()) {\n      methods.add(function(\"method\"));\n    }\n\n    consume(RIGHT_BRACE, \"Expect '}' after class body.\");\n\n/* Classes parse-class-declaration < Inheritance construct-class-ast\n    return new Stmt.Class(name, methods);\n*/\n//> Inheritance construct-class-ast\n    return new Stmt.Class(name, superclass, methods);\n//< Inheritance construct-class-ast\n  }\n//< Classes parse-class-declaration\n//> Statements and State parse-statement\n  private Stmt statement() {\n//> Control Flow match-for\n    if (match(FOR)) return forStatement();\n//< Control Flow match-for\n//> Control Flow match-if\n    if (match(IF)) return ifStatement();\n//< Control Flow match-if\n    if (match(PRINT)) return printStatement();\n//> Functions match-return\n    if (match(RETURN)) return returnStatement();\n//< Functions match-return\n//> Control Flow match-while\n    if (match(WHILE)) return whileStatement();\n//< Control Flow match-while\n//> parse-block\n    if (match(LEFT_BRACE)) return new Stmt.Block(block());\n//< parse-block\n\n    return expressionStatement();\n  }\n//< Statements and State parse-statement\n//> Control Flow for-statement\n  private Stmt forStatement() {\n    consume(LEFT_PAREN, \"Expect '(' after 'for'.\");\n\n/* Control Flow for-statement < Control Flow for-initializer\n    // More here...\n*/\n//> for-initializer\n    Stmt initializer;\n    if (match(SEMICOLON)) {\n      initializer = null;\n    } else if (match(VAR)) {\n      
initializer = varDeclaration();\n    } else {\n      initializer = expressionStatement();\n    }\n//< for-initializer\n//> for-condition\n\n    Expr condition = null;\n    if (!check(SEMICOLON)) {\n      condition = expression();\n    }\n    consume(SEMICOLON, \"Expect ';' after loop condition.\");\n//< for-condition\n//> for-increment\n\n    Expr increment = null;\n    if (!check(RIGHT_PAREN)) {\n      increment = expression();\n    }\n    consume(RIGHT_PAREN, \"Expect ')' after for clauses.\");\n//< for-increment\n//> for-body\n    Stmt body = statement();\n\n//> for-desugar-increment\n    if (increment != null) {\n      body = new Stmt.Block(\n          Arrays.asList(\n              body,\n              new Stmt.Expression(increment)));\n    }\n\n//< for-desugar-increment\n//> for-desugar-condition\n    if (condition == null) condition = new Expr.Literal(true);\n    body = new Stmt.While(condition, body);\n\n//< for-desugar-condition\n//> for-desugar-initializer\n    if (initializer != null) {\n      body = new Stmt.Block(Arrays.asList(initializer, body));\n    }\n\n//< for-desugar-initializer\n    return body;\n//< for-body\n  }\n//< Control Flow for-statement\n//> Control Flow if-statement\n  private Stmt ifStatement() {\n    consume(LEFT_PAREN, \"Expect '(' after 'if'.\");\n    Expr condition = expression();\n    consume(RIGHT_PAREN, \"Expect ')' after if condition.\"); // [parens]\n\n    Stmt thenBranch = statement();\n    Stmt elseBranch = null;\n    if (match(ELSE)) {\n      elseBranch = statement();\n    }\n\n    return new Stmt.If(condition, thenBranch, elseBranch);\n  }\n//< Control Flow if-statement\n//> Statements and State parse-print-statement\n  private Stmt printStatement() {\n    Expr value = expression();\n    consume(SEMICOLON, \"Expect ';' after value.\");\n    return new Stmt.Print(value);\n  }\n//< Statements and State parse-print-statement\n//> Functions parse-return-statement\n  private Stmt returnStatement() {\n    Token keyword = 
previous();\n    Expr value = null;\n    if (!check(SEMICOLON)) {\n      value = expression();\n    }\n\n    consume(SEMICOLON, \"Expect ';' after return value.\");\n    return new Stmt.Return(keyword, value);\n  }\n//< Functions parse-return-statement\n//> Statements and State parse-var-declaration\n  private Stmt varDeclaration() {\n    Token name = consume(IDENTIFIER, \"Expect variable name.\");\n\n    Expr initializer = null;\n    if (match(EQUAL)) {\n      initializer = expression();\n    }\n\n    consume(SEMICOLON, \"Expect ';' after variable declaration.\");\n    return new Stmt.Var(name, initializer);\n  }\n//< Statements and State parse-var-declaration\n//> Control Flow while-statement\n  private Stmt whileStatement() {\n    consume(LEFT_PAREN, \"Expect '(' after 'while'.\");\n    Expr condition = expression();\n    consume(RIGHT_PAREN, \"Expect ')' after condition.\");\n    Stmt body = statement();\n\n    return new Stmt.While(condition, body);\n  }\n//< Control Flow while-statement\n//> Statements and State parse-expression-statement\n  private Stmt expressionStatement() {\n    Expr expr = expression();\n    consume(SEMICOLON, \"Expect ';' after expression.\");\n    return new Stmt.Expression(expr);\n  }\n//< Statements and State parse-expression-statement\n//> Functions parse-function\n  private Stmt.Function function(String kind) {\n    Token name = consume(IDENTIFIER, \"Expect \" + kind + \" name.\");\n//> parse-parameters\n    consume(LEFT_PAREN, \"Expect '(' after \" + kind + \" name.\");\n    List<Token> parameters = new ArrayList<>();\n    if (!check(RIGHT_PAREN)) {\n      do {\n        if (parameters.size() >= 255) {\n          error(peek(), \"Can't have more than 255 parameters.\");\n        }\n\n        parameters.add(\n            consume(IDENTIFIER, \"Expect parameter name.\"));\n      } while (match(COMMA));\n    }\n    consume(RIGHT_PAREN, \"Expect ')' after parameters.\");\n//< parse-parameters\n//> parse-body\n\n    consume(LEFT_BRACE, 
\"Expect '{' before \" + kind + \" body.\");\n    List<Stmt> body = block();\n    return new Stmt.Function(name, parameters, body);\n//< parse-body\n  }\n//< Functions parse-function\n//> Statements and State block\n  private List<Stmt> block() {\n    List<Stmt> statements = new ArrayList<>();\n\n    while (!check(RIGHT_BRACE) && !isAtEnd()) {\n      statements.add(declaration());\n    }\n\n    consume(RIGHT_BRACE, \"Expect '}' after block.\");\n    return statements;\n  }\n//< Statements and State block\n//> Statements and State parse-assignment\n  private Expr assignment() {\n/* Statements and State parse-assignment < Control Flow or-in-assignment\n    Expr expr = equality();\n*/\n//> Control Flow or-in-assignment\n    Expr expr = or();\n//< Control Flow or-in-assignment\n\n    if (match(EQUAL)) {\n      Token equals = previous();\n      Expr value = assignment();\n\n      if (expr instanceof Expr.Variable) {\n        Token name = ((Expr.Variable)expr).name;\n        return new Expr.Assign(name, value);\n//> Classes assign-set\n      } else if (expr instanceof Expr.Get) {\n        Expr.Get get = (Expr.Get)expr;\n        return new Expr.Set(get.object, get.name, value);\n//< Classes assign-set\n      }\n\n      error(equals, \"Invalid assignment target.\"); // [no-throw]\n    }\n\n    return expr;\n  }\n//< Statements and State parse-assignment\n//> Control Flow or\n  private Expr or() {\n    Expr expr = and();\n\n    while (match(OR)) {\n      Token operator = previous();\n      Expr right = and();\n      expr = new Expr.Logical(expr, operator, right);\n    }\n\n    return expr;\n  }\n//< Control Flow or\n//> Control Flow and\n  private Expr and() {\n    Expr expr = equality();\n\n    while (match(AND)) {\n      Token operator = previous();\n      Expr right = equality();\n      expr = new Expr.Logical(expr, operator, right);\n    }\n\n    return expr;\n  }\n//< Control Flow and\n//> equality\n  private Expr equality() {\n    Expr expr = comparison();\n\n    
while (match(BANG_EQUAL, EQUAL_EQUAL)) {\n      Token operator = previous();\n      Expr right = comparison();\n      expr = new Expr.Binary(expr, operator, right);\n    }\n\n    return expr;\n  }\n//< equality\n//> comparison\n  private Expr comparison() {\n    Expr expr = term();\n\n    while (match(GREATER, GREATER_EQUAL, LESS, LESS_EQUAL)) {\n      Token operator = previous();\n      Expr right = term();\n      expr = new Expr.Binary(expr, operator, right);\n    }\n\n    return expr;\n  }\n//< comparison\n//> term\n  private Expr term() {\n    Expr expr = factor();\n\n    while (match(MINUS, PLUS)) {\n      Token operator = previous();\n      Expr right = factor();\n      expr = new Expr.Binary(expr, operator, right);\n    }\n\n    return expr;\n  }\n//< term\n//> factor\n  private Expr factor() {\n    Expr expr = unary();\n\n    while (match(SLASH, STAR)) {\n      Token operator = previous();\n      Expr right = unary();\n      expr = new Expr.Binary(expr, operator, right);\n    }\n\n    return expr;\n  }\n//< factor\n//> unary\n  private Expr unary() {\n    if (match(BANG, MINUS)) {\n      Token operator = previous();\n      Expr right = unary();\n      return new Expr.Unary(operator, right);\n    }\n\n/* Parsing Expressions unary < Functions unary-call\n    return primary();\n*/\n//> Functions unary-call\n    return call();\n//< Functions unary-call\n  }\n//< unary\n//> Functions finish-call\n  private Expr finishCall(Expr callee) {\n    List<Expr> arguments = new ArrayList<>();\n    if (!check(RIGHT_PAREN)) {\n      do {\n//> check-max-arity\n        if (arguments.size() >= 255) {\n          error(peek(), \"Can't have more than 255 arguments.\");\n        }\n//< check-max-arity\n        arguments.add(expression());\n      } while (match(COMMA));\n    }\n\n    Token paren = consume(RIGHT_PAREN,\n                          \"Expect ')' after arguments.\");\n\n    return new Expr.Call(callee, paren, arguments);\n  }\n//< Functions finish-call\n//> Functions 
call\n  private Expr call() {\n    Expr expr = primary();\n\n    while (true) { // [while-true]\n      if (match(LEFT_PAREN)) {\n        expr = finishCall(expr);\n//> Classes parse-property\n      } else if (match(DOT)) {\n        Token name = consume(IDENTIFIER,\n            \"Expect property name after '.'.\");\n        expr = new Expr.Get(expr, name);\n//< Classes parse-property\n      } else {\n        break;\n      }\n    }\n\n    return expr;\n  }\n//< Functions call\n//> primary\n  private Expr primary() {\n    if (match(FALSE)) return new Expr.Literal(false);\n    if (match(TRUE)) return new Expr.Literal(true);\n    if (match(NIL)) return new Expr.Literal(null);\n\n    if (match(NUMBER, STRING)) {\n      return new Expr.Literal(previous().literal);\n    }\n//> Inheritance parse-super\n\n    if (match(SUPER)) {\n      Token keyword = previous();\n      consume(DOT, \"Expect '.' after 'super'.\");\n      Token method = consume(IDENTIFIER,\n          \"Expect superclass method name.\");\n      return new Expr.Super(keyword, method);\n    }\n//< Inheritance parse-super\n//> Classes parse-this\n\n    if (match(THIS)) return new Expr.This(previous());\n//< Classes parse-this\n//> Statements and State parse-identifier\n\n    if (match(IDENTIFIER)) {\n      return new Expr.Variable(previous());\n    }\n//< Statements and State parse-identifier\n\n    if (match(LEFT_PAREN)) {\n      Expr expr = expression();\n      consume(RIGHT_PAREN, \"Expect ')' after expression.\");\n      return new Expr.Grouping(expr);\n    }\n//> primary-error\n\n    throw error(peek(), \"Expect expression.\");\n//< primary-error\n  }\n//< primary\n//> match\n  private boolean match(TokenType... 
types) {\n    for (TokenType type : types) {\n      if (check(type)) {\n        advance();\n        return true;\n      }\n    }\n\n    return false;\n  }\n//< match\n//> consume\n  private Token consume(TokenType type, String message) {\n    if (check(type)) return advance();\n\n    throw error(peek(), message);\n  }\n//< consume\n//> check\n  private boolean check(TokenType type) {\n    if (isAtEnd()) return false;\n    return peek().type == type;\n  }\n//< check\n//> advance\n  private Token advance() {\n    if (!isAtEnd()) current++;\n    return previous();\n  }\n//< advance\n//> utils\n  private boolean isAtEnd() {\n    return peek().type == EOF;\n  }\n\n  private Token peek() {\n    return tokens.get(current);\n  }\n\n  private Token previous() {\n    return tokens.get(current - 1);\n  }\n//< utils\n//> error\n  private ParseError error(Token token, String message) {\n    Lox.error(token, message);\n    return new ParseError();\n  }\n//< error\n//> synchronize\n  private void synchronize() {\n    advance();\n\n    while (!isAtEnd()) {\n      if (previous().type == SEMICOLON) return;\n\n      switch (peek().type) {\n        case CLASS:\n        case FUN:\n        case VAR:\n        case FOR:\n        case IF:\n        case WHILE:\n        case PRINT:\n        case RETURN:\n          return;\n      }\n\n      advance();\n    }\n  }\n//< synchronize\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/Resolver.java",
    "content": "//> Resolving and Binding resolver\npackage com.craftinginterpreters.lox;\n\nimport java.util.HashMap;\nimport java.util.List;\nimport java.util.Map;\nimport java.util.Stack;\n\nclass Resolver implements Expr.Visitor<Void>, Stmt.Visitor<Void> {\n  private final Interpreter interpreter;\n//> scopes-field\n  private final Stack<Map<String, Boolean>> scopes = new Stack<>();\n//< scopes-field\n//> function-type-field\n  private FunctionType currentFunction = FunctionType.NONE;\n//< function-type-field\n\n  Resolver(Interpreter interpreter) {\n    this.interpreter = interpreter;\n  }\n//> function-type\n  private enum FunctionType {\n    NONE,\n/* Resolving and Binding function-type < Classes function-type-method\n    FUNCTION\n*/\n//> Classes function-type-method\n    FUNCTION,\n//> function-type-initializer\n    INITIALIZER,\n//< function-type-initializer\n    METHOD\n//< Classes function-type-method\n  }\n//< function-type\n//> Classes class-type\n\n  private enum ClassType {\n    NONE,\n/* Classes class-type < Inheritance class-type-subclass\n    CLASS\n */\n//> Inheritance class-type-subclass\n    CLASS,\n    SUBCLASS\n//< Inheritance class-type-subclass\n  }\n\n  private ClassType currentClass = ClassType.NONE;\n\n//< Classes class-type\n//> resolve-statements\n  void resolve(List<Stmt> statements) {\n    for (Stmt statement : statements) {\n      resolve(statement);\n    }\n  }\n//< resolve-statements\n//> visit-block-stmt\n  @Override\n  public Void visitBlockStmt(Stmt.Block stmt) {\n    beginScope();\n    resolve(stmt.statements);\n    endScope();\n    return null;\n  }\n//< visit-block-stmt\n//> Classes resolver-visit-class\n  @Override\n  public Void visitClassStmt(Stmt.Class stmt) {\n//> set-current-class\n    ClassType enclosingClass = currentClass;\n    currentClass = ClassType.CLASS;\n\n//< set-current-class\n    declare(stmt.name);\n    define(stmt.name);\n//> Inheritance resolve-superclass\n\n//> inherit-self\n    if (stmt.superclass != 
null &&\n        stmt.name.lexeme.equals(stmt.superclass.name.lexeme)) {\n      Lox.error(stmt.superclass.name,\n          \"A class can't inherit from itself.\");\n    }\n\n//< inherit-self\n    if (stmt.superclass != null) {\n//> set-current-subclass\n      currentClass = ClassType.SUBCLASS;\n//< set-current-subclass\n      resolve(stmt.superclass);\n    }\n//< Inheritance resolve-superclass\n//> Inheritance begin-super-scope\n\n    if (stmt.superclass != null) {\n      beginScope();\n      scopes.peek().put(\"super\", true);\n    }\n//< Inheritance begin-super-scope\n//> resolve-methods\n\n//> resolver-begin-this-scope\n    beginScope();\n    scopes.peek().put(\"this\", true);\n\n//< resolver-begin-this-scope\n    for (Stmt.Function method : stmt.methods) {\n      FunctionType declaration = FunctionType.METHOD;\n//> resolver-initializer-type\n      if (method.name.lexeme.equals(\"init\")) {\n        declaration = FunctionType.INITIALIZER;\n      }\n\n//< resolver-initializer-type\n      resolveFunction(method, declaration); // [local]\n    }\n\n//> resolver-end-this-scope\n    endScope();\n\n//< resolver-end-this-scope\n//< resolve-methods\n//> Inheritance end-super-scope\n    if (stmt.superclass != null) endScope();\n\n//< Inheritance end-super-scope\n//> restore-current-class\n    currentClass = enclosingClass;\n//< restore-current-class\n    return null;\n  }\n//< Classes resolver-visit-class\n//> visit-expression-stmt\n  @Override\n  public Void visitExpressionStmt(Stmt.Expression stmt) {\n    resolve(stmt.expression);\n    return null;\n  }\n//< visit-expression-stmt\n//> visit-function-stmt\n  @Override\n  public Void visitFunctionStmt(Stmt.Function stmt) {\n    declare(stmt.name);\n    define(stmt.name);\n\n/* Resolving and Binding visit-function-stmt < Resolving and Binding pass-function-type\n    resolveFunction(stmt);\n*/\n//> pass-function-type\n    resolveFunction(stmt, FunctionType.FUNCTION);\n//< pass-function-type\n    return null;\n  }\n//< 
visit-function-stmt\n//> visit-if-stmt\n  @Override\n  public Void visitIfStmt(Stmt.If stmt) {\n    resolve(stmt.condition);\n    resolve(stmt.thenBranch);\n    if (stmt.elseBranch != null) resolve(stmt.elseBranch);\n    return null;\n  }\n//< visit-if-stmt\n//> visit-print-stmt\n  @Override\n  public Void visitPrintStmt(Stmt.Print stmt) {\n    resolve(stmt.expression);\n    return null;\n  }\n//< visit-print-stmt\n//> visit-return-stmt\n  @Override\n  public Void visitReturnStmt(Stmt.Return stmt) {\n//> return-from-top\n    if (currentFunction == FunctionType.NONE) {\n      Lox.error(stmt.keyword, \"Can't return from top-level code.\");\n    }\n\n//< return-from-top\n    if (stmt.value != null) {\n//> Classes return-in-initializer\n      if (currentFunction == FunctionType.INITIALIZER) {\n        Lox.error(stmt.keyword,\n            \"Can't return a value from an initializer.\");\n      }\n\n//< Classes return-in-initializer\n      resolve(stmt.value);\n    }\n\n    return null;\n  }\n//< visit-return-stmt\n//> visit-var-stmt\n  @Override\n  public Void visitVarStmt(Stmt.Var stmt) {\n    declare(stmt.name);\n    if (stmt.initializer != null) {\n      resolve(stmt.initializer);\n    }\n    define(stmt.name);\n    return null;\n  }\n//< visit-var-stmt\n//> visit-while-stmt\n  @Override\n  public Void visitWhileStmt(Stmt.While stmt) {\n    resolve(stmt.condition);\n    resolve(stmt.body);\n    return null;\n  }\n//< visit-while-stmt\n//> visit-assign-expr\n  @Override\n  public Void visitAssignExpr(Expr.Assign expr) {\n    resolve(expr.value);\n    resolveLocal(expr, expr.name);\n    return null;\n  }\n//< visit-assign-expr\n//> visit-binary-expr\n  @Override\n  public Void visitBinaryExpr(Expr.Binary expr) {\n    resolve(expr.left);\n    resolve(expr.right);\n    return null;\n  }\n//< visit-binary-expr\n//> visit-call-expr\n  @Override\n  public Void visitCallExpr(Expr.Call expr) {\n    resolve(expr.callee);\n\n    for (Expr argument : expr.arguments) {\n      
resolve(argument);\n    }\n\n    return null;\n  }\n//< visit-call-expr\n//> Classes resolver-visit-get\n  @Override\n  public Void visitGetExpr(Expr.Get expr) {\n    resolve(expr.object);\n    return null;\n  }\n//< Classes resolver-visit-get\n//> visit-grouping-expr\n  @Override\n  public Void visitGroupingExpr(Expr.Grouping expr) {\n    resolve(expr.expression);\n    return null;\n  }\n//< visit-grouping-expr\n//> visit-literal-expr\n  @Override\n  public Void visitLiteralExpr(Expr.Literal expr) {\n    return null;\n  }\n//< visit-literal-expr\n//> visit-logical-expr\n  @Override\n  public Void visitLogicalExpr(Expr.Logical expr) {\n    resolve(expr.left);\n    resolve(expr.right);\n    return null;\n  }\n//< visit-logical-expr\n//> Classes resolver-visit-set\n  @Override\n  public Void visitSetExpr(Expr.Set expr) {\n    resolve(expr.value);\n    resolve(expr.object);\n    return null;\n  }\n//< Classes resolver-visit-set\n//> Inheritance resolve-super-expr\n  @Override\n  public Void visitSuperExpr(Expr.Super expr) {\n//> invalid-super\n    if (currentClass == ClassType.NONE) {\n      Lox.error(expr.keyword,\n          \"Can't use 'super' outside of a class.\");\n    } else if (currentClass != ClassType.SUBCLASS) {\n      Lox.error(expr.keyword,\n          \"Can't use 'super' in a class with no superclass.\");\n    }\n\n//< invalid-super\n    resolveLocal(expr, expr.keyword);\n    return null;\n  }\n//< Inheritance resolve-super-expr\n//> Classes resolver-visit-this\n  @Override\n  public Void visitThisExpr(Expr.This expr) {\n//> this-outside-of-class\n    if (currentClass == ClassType.NONE) {\n      Lox.error(expr.keyword,\n          \"Can't use 'this' outside of a class.\");\n      return null;\n    }\n\n//< this-outside-of-class\n    resolveLocal(expr, expr.keyword);\n    return null;\n  }\n\n//< Classes resolver-visit-this\n//> visit-unary-expr\n  @Override\n  public Void visitUnaryExpr(Expr.Unary expr) {\n    resolve(expr.right);\n    return null;\n  
}\n//< visit-unary-expr\n//> visit-variable-expr\n  @Override\n  public Void visitVariableExpr(Expr.Variable expr) {\n    if (!scopes.isEmpty() &&\n        scopes.peek().get(expr.name.lexeme) == Boolean.FALSE) {\n      Lox.error(expr.name,\n          \"Can't read local variable in its own initializer.\");\n    }\n\n    resolveLocal(expr, expr.name);\n    return null;\n  }\n//< visit-variable-expr\n//> resolve-stmt\n  private void resolve(Stmt stmt) {\n    stmt.accept(this);\n  }\n//< resolve-stmt\n//> resolve-expr\n  private void resolve(Expr expr) {\n    expr.accept(this);\n  }\n//< resolve-expr\n//> resolve-function\n/* Resolving and Binding resolve-function < Resolving and Binding set-current-function\n  private void resolveFunction(Stmt.Function function) {\n*/\n//> set-current-function\n  private void resolveFunction(\n      Stmt.Function function, FunctionType type) {\n    FunctionType enclosingFunction = currentFunction;\n    currentFunction = type;\n\n//< set-current-function\n    beginScope();\n    for (Token param : function.params) {\n      declare(param);\n      define(param);\n    }\n    resolve(function.body);\n    endScope();\n//> restore-current-function\n    currentFunction = enclosingFunction;\n//< restore-current-function\n  }\n//< resolve-function\n//> begin-scope\n  private void beginScope() {\n    scopes.push(new HashMap<String, Boolean>());\n  }\n//< begin-scope\n//> end-scope\n  private void endScope() {\n    scopes.pop();\n  }\n//< end-scope\n//> declare\n  private void declare(Token name) {\n    if (scopes.isEmpty()) return;\n\n    Map<String, Boolean> scope = scopes.peek();\n//> duplicate-variable\n    if (scope.containsKey(name.lexeme)) {\n      Lox.error(name,\n          \"Already a variable with this name in this scope.\");\n    }\n\n//< duplicate-variable\n    scope.put(name.lexeme, false);\n  }\n//< declare\n//> define\n  private void define(Token name) {\n    if (scopes.isEmpty()) return;\n    scopes.peek().put(name.lexeme, true);\n 
 }\n//< define\n//> resolve-local\n  private void resolveLocal(Expr expr, Token name) {\n    for (int i = scopes.size() - 1; i >= 0; i--) {\n      if (scopes.get(i).containsKey(name.lexeme)) {\n        interpreter.resolve(expr, scopes.size() - 1 - i);\n        return;\n      }\n    }\n  }\n//< resolve-local\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/Return.java",
    "content": "//> Functions return-exception\npackage com.craftinginterpreters.lox;\n\nclass Return extends RuntimeException {\n  final Object value;\n\n  Return(Object value) {\n    super(null, null, false, false);\n    this.value = value;\n  }\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/RuntimeError.java",
    "content": "//> Evaluating Expressions runtime-error-class\npackage com.craftinginterpreters.lox;\n\nclass RuntimeError extends RuntimeException {\n  final Token token;\n\n  RuntimeError(Token token, String message) {\n    super(message);\n    this.token = token;\n  }\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/Scanner.java",
    "content": "//> Scanning scanner-class\npackage com.craftinginterpreters.lox;\n\nimport java.util.ArrayList;\nimport java.util.HashMap;\nimport java.util.List;\nimport java.util.Map;\n\nimport static com.craftinginterpreters.lox.TokenType.*; // [static-import]\n\nclass Scanner {\n//> keyword-map\n  private static final Map<String, TokenType> keywords;\n\n  static {\n    keywords = new HashMap<>();\n    keywords.put(\"and\",    AND);\n    keywords.put(\"class\",  CLASS);\n    keywords.put(\"else\",   ELSE);\n    keywords.put(\"false\",  FALSE);\n    keywords.put(\"for\",    FOR);\n    keywords.put(\"fun\",    FUN);\n    keywords.put(\"if\",     IF);\n    keywords.put(\"nil\",    NIL);\n    keywords.put(\"or\",     OR);\n    keywords.put(\"print\",  PRINT);\n    keywords.put(\"return\", RETURN);\n    keywords.put(\"super\",  SUPER);\n    keywords.put(\"this\",   THIS);\n    keywords.put(\"true\",   TRUE);\n    keywords.put(\"var\",    VAR);\n    keywords.put(\"while\",  WHILE);\n  }\n//< keyword-map\n  private final String source;\n  private final List<Token> tokens = new ArrayList<>();\n//> scan-state\n  private int start = 0;\n  private int current = 0;\n  private int line = 1;\n//< scan-state\n\n  Scanner(String source) {\n    this.source = source;\n  }\n//> scan-tokens\n  List<Token> scanTokens() {\n    while (!isAtEnd()) {\n      // We are at the beginning of the next lexeme.\n      start = current;\n      scanToken();\n    }\n\n    tokens.add(new Token(EOF, \"\", null, line));\n    return tokens;\n  }\n//< scan-tokens\n//> scan-token\n  private void scanToken() {\n    char c = advance();\n    switch (c) {\n      case '(': addToken(LEFT_PAREN); break;\n      case ')': addToken(RIGHT_PAREN); break;\n      case '{': addToken(LEFT_BRACE); break;\n      case '}': addToken(RIGHT_BRACE); break;\n      case ',': addToken(COMMA); break;\n      case '.': addToken(DOT); break;\n      case '-': addToken(MINUS); break;\n      case '+': addToken(PLUS); break;\n      case 
';': addToken(SEMICOLON); break;\n      case '*': addToken(STAR); break; // [slash]\n//> two-char-tokens\n      case '!':\n        addToken(match('=') ? BANG_EQUAL : BANG);\n        break;\n      case '=':\n        addToken(match('=') ? EQUAL_EQUAL : EQUAL);\n        break;\n      case '<':\n        addToken(match('=') ? LESS_EQUAL : LESS);\n        break;\n      case '>':\n        addToken(match('=') ? GREATER_EQUAL : GREATER);\n        break;\n//< two-char-tokens\n//> slash\n      case '/':\n        if (match('/')) {\n          // A comment goes until the end of the line.\n          while (peek() != '\\n' && !isAtEnd()) advance();\n        } else {\n          addToken(SLASH);\n        }\n        break;\n//< slash\n//> whitespace\n\n      case ' ':\n      case '\\r':\n      case '\\t':\n        // Ignore whitespace.\n        break;\n\n      case '\\n':\n        line++;\n        break;\n//< whitespace\n//> string-start\n\n      case '\"': string(); break;\n//< string-start\n//> char-error\n\n      default:\n/* Scanning char-error < Scanning digit-start\n        Lox.error(line, \"Unexpected character.\");\n*/\n//> digit-start\n        if (isDigit(c)) {\n          number();\n//> identifier-start\n        } else if (isAlpha(c)) {\n          identifier();\n//< identifier-start\n        } else {\n          Lox.error(line, \"Unexpected character.\");\n        }\n//< digit-start\n        break;\n//< char-error\n    }\n  }\n//< scan-token\n//> identifier\n  private void identifier() {\n    while (isAlphaNumeric(peek())) advance();\n\n/* Scanning identifier < Scanning keyword-type\n    addToken(IDENTIFIER);\n*/\n//> keyword-type\n    String text = source.substring(start, current);\n    TokenType type = keywords.get(text);\n    if (type == null) type = IDENTIFIER;\n    addToken(type);\n//< keyword-type\n  }\n//< identifier\n//> number\n  private void number() {\n    while (isDigit(peek())) advance();\n\n    // Look for a fractional part.\n    if (peek() == '.' 
&& isDigit(peekNext())) {\n      // Consume the \".\"\n      advance();\n\n      while (isDigit(peek())) advance();\n    }\n\n    addToken(NUMBER,\n        Double.parseDouble(source.substring(start, current)));\n  }\n//< number\n//> string\n  private void string() {\n    while (peek() != '\"' && !isAtEnd()) {\n      if (peek() == '\\n') line++;\n      advance();\n    }\n\n    if (isAtEnd()) {\n      Lox.error(line, \"Unterminated string.\");\n      return;\n    }\n\n    // The closing \".\n    advance();\n\n    // Trim the surrounding quotes.\n    String value = source.substring(start + 1, current - 1);\n    addToken(STRING, value);\n  }\n//< string\n//> match\n  private boolean match(char expected) {\n    if (isAtEnd()) return false;\n    if (source.charAt(current) != expected) return false;\n\n    current++;\n    return true;\n  }\n//< match\n//> peek\n  private char peek() {\n    if (isAtEnd()) return '\\0';\n    return source.charAt(current);\n  }\n//< peek\n//> peek-next\n  private char peekNext() {\n    if (current + 1 >= source.length()) return '\\0';\n    return source.charAt(current + 1);\n  } // [peek-next]\n//< peek-next\n//> is-alpha\n  private boolean isAlpha(char c) {\n    return (c >= 'a' && c <= 'z') ||\n           (c >= 'A' && c <= 'Z') ||\n            c == '_';\n  }\n\n  private boolean isAlphaNumeric(char c) {\n    return isAlpha(c) || isDigit(c);\n  }\n//< is-alpha\n//> is-digit\n  private boolean isDigit(char c) {\n    return c >= '0' && c <= '9';\n  } // [is-digit]\n//< is-digit\n//> is-at-end\n  private boolean isAtEnd() {\n    return current >= source.length();\n  }\n//< is-at-end\n//> advance-and-add-token\n  private char advance() {\n    return source.charAt(current++);\n  }\n\n  private void addToken(TokenType type) {\n    addToken(type, null);\n  }\n\n  private void addToken(TokenType type, Object literal) {\n    String text = source.substring(start, current);\n    tokens.add(new Token(type, text, literal, line));\n  }\n//< 
advance-and-add-token\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/Stmt.java",
    "content": "//> Appendix II stmt\npackage com.craftinginterpreters.lox;\n\nimport java.util.List;\n\nabstract class Stmt {\n  interface Visitor<R> {\n    R visitBlockStmt(Block stmt);\n    R visitClassStmt(Class stmt);\n    R visitExpressionStmt(Expression stmt);\n    R visitFunctionStmt(Function stmt);\n    R visitIfStmt(If stmt);\n    R visitPrintStmt(Print stmt);\n    R visitReturnStmt(Return stmt);\n    R visitVarStmt(Var stmt);\n    R visitWhileStmt(While stmt);\n  }\n\n  // Nested Stmt classes here...\n//> stmt-block\n  static class Block extends Stmt {\n    Block(List<Stmt> statements) {\n      this.statements = statements;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitBlockStmt(this);\n    }\n\n    final List<Stmt> statements;\n  }\n//< stmt-block\n//> stmt-class\n  static class Class extends Stmt {\n    Class(Token name,\n          Expr.Variable superclass,\n          List<Stmt.Function> methods) {\n      this.name = name;\n      this.superclass = superclass;\n      this.methods = methods;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitClassStmt(this);\n    }\n\n    final Token name;\n    final Expr.Variable superclass;\n    final List<Stmt.Function> methods;\n  }\n//< stmt-class\n//> stmt-expression\n  static class Expression extends Stmt {\n    Expression(Expr expression) {\n      this.expression = expression;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitExpressionStmt(this);\n    }\n\n    final Expr expression;\n  }\n//< stmt-expression\n//> stmt-function\n  static class Function extends Stmt {\n    Function(Token name, List<Token> params, List<Stmt> body) {\n      this.name = name;\n      this.params = params;\n      this.body = body;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitFunctionStmt(this);\n    }\n\n    final Token name;\n    final List<Token> params;\n    final 
List<Stmt> body;\n  }\n//< stmt-function\n//> stmt-if\n  static class If extends Stmt {\n    If(Expr condition, Stmt thenBranch, Stmt elseBranch) {\n      this.condition = condition;\n      this.thenBranch = thenBranch;\n      this.elseBranch = elseBranch;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitIfStmt(this);\n    }\n\n    final Expr condition;\n    final Stmt thenBranch;\n    final Stmt elseBranch;\n  }\n//< stmt-if\n//> stmt-print\n  static class Print extends Stmt {\n    Print(Expr expression) {\n      this.expression = expression;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitPrintStmt(this);\n    }\n\n    final Expr expression;\n  }\n//< stmt-print\n//> stmt-return\n  static class Return extends Stmt {\n    Return(Token keyword, Expr value) {\n      this.keyword = keyword;\n      this.value = value;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitReturnStmt(this);\n    }\n\n    final Token keyword;\n    final Expr value;\n  }\n//< stmt-return\n//> stmt-var\n  static class Var extends Stmt {\n    Var(Token name, Expr initializer) {\n      this.name = name;\n      this.initializer = initializer;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitVarStmt(this);\n    }\n\n    final Token name;\n    final Expr initializer;\n  }\n//< stmt-var\n//> stmt-while\n  static class While extends Stmt {\n    While(Expr condition, Stmt body) {\n      this.condition = condition;\n      this.body = body;\n    }\n\n    @Override\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitWhileStmt(this);\n    }\n\n    final Expr condition;\n    final Stmt body;\n  }\n//< stmt-while\n\n  abstract <R> R accept(Visitor<R> visitor);\n}\n//< Appendix II stmt\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/Token.java",
    "content": "//> Scanning token-class\npackage com.craftinginterpreters.lox;\n\nclass Token {\n  final TokenType type;\n  final String lexeme;\n  final Object literal;\n  final int line; // [location]\n\n  Token(TokenType type, String lexeme, Object literal, int line) {\n    this.type = type;\n    this.lexeme = lexeme;\n    this.literal = literal;\n    this.line = line;\n  }\n\n  public String toString() {\n    return type + \" \" + lexeme + \" \" + literal;\n  }\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/lox/TokenType.java",
    "content": "//> Scanning token-type\npackage com.craftinginterpreters.lox;\n\nenum TokenType {\n  // Single-character tokens.\n  LEFT_PAREN, RIGHT_PAREN, LEFT_BRACE, RIGHT_BRACE,\n  COMMA, DOT, MINUS, PLUS, SEMICOLON, SLASH, STAR,\n\n  // One or two character tokens.\n  BANG, BANG_EQUAL,\n  EQUAL, EQUAL_EQUAL,\n  GREATER, GREATER_EQUAL,\n  LESS, LESS_EQUAL,\n\n  // Literals.\n  IDENTIFIER, STRING, NUMBER,\n\n  // Keywords.\n  AND, CLASS, ELSE, FALSE, FUN, FOR, IF, NIL, OR,\n  PRINT, RETURN, SUPER, THIS, TRUE, VAR, WHILE,\n\n  EOF\n}\n"
  },
  {
    "path": "java/com/craftinginterpreters/tool/GenerateAst.java",
    "content": "//> Representing Code generate-ast\npackage com.craftinginterpreters.tool;\n\nimport java.io.IOException;\nimport java.io.PrintWriter;\nimport java.util.Arrays;\nimport java.util.List;\n\npublic class GenerateAst {\n  public static void main(String[] args) throws IOException {\n    if (args.length != 1) {\n      System.err.println(\"Usage: generate_ast <output directory>\");\n      System.exit(64);\n    }\n    String outputDir = args[0];\n//> call-define-ast\n    defineAst(outputDir, \"Expr\", Arrays.asList(\n//> Statements and State assign-expr\n      \"Assign   : Token name, Expr value\",\n//< Statements and State assign-expr\n      \"Binary   : Expr left, Token operator, Expr right\",\n//> Functions call-expr\n      \"Call     : Expr callee, Token paren, List<Expr> arguments\",\n//< Functions call-expr\n//> Classes get-ast\n      \"Get      : Expr object, Token name\",\n//< Classes get-ast\n      \"Grouping : Expr expression\",\n      \"Literal  : Object value\",\n//> Control Flow logical-ast\n      \"Logical  : Expr left, Token operator, Expr right\",\n//< Control Flow logical-ast\n//> Classes set-ast\n      \"Set      : Expr object, Token name, Expr value\",\n//< Classes set-ast\n//> Inheritance super-expr\n      \"Super    : Token keyword, Token method\",\n//< Inheritance super-expr\n//> Classes this-ast\n      \"This     : Token keyword\",\n//< Classes this-ast\n/* Representing Code call-define-ast < Statements and State var-expr\n      \"Unary    : Token operator, Expr right\"\n*/\n//> Statements and State var-expr\n      \"Unary    : Token operator, Expr right\",\n      \"Variable : Token name\"\n//< Statements and State var-expr\n    ));\n//> Statements and State stmt-ast\n\n    defineAst(outputDir, \"Stmt\", Arrays.asList(\n//> block-ast\n      \"Block      : List<Stmt> statements\",\n//< block-ast\n/* Classes class-ast < Inheritance superclass-ast\n      \"Class      : Token name, List<Stmt.Function> methods\",\n*/\n//> Inheritance 
superclass-ast\n      \"Class      : Token name, Expr.Variable superclass,\" +\n                  \" List<Stmt.Function> methods\",\n//< Inheritance superclass-ast\n      \"Expression : Expr expression\",\n//> Functions function-ast\n      \"Function   : Token name, List<Token> params,\" +\n                  \" List<Stmt> body\",\n//< Functions function-ast\n//> Control Flow if-ast\n      \"If         : Expr condition, Stmt thenBranch,\" +\n                  \" Stmt elseBranch\",\n//< Control Flow if-ast\n/* Statements and State stmt-ast < Statements and State var-stmt-ast\n      \"Print      : Expr expression\"\n*/\n//> var-stmt-ast\n      \"Print      : Expr expression\",\n//< var-stmt-ast\n//> Functions return-ast\n      \"Return     : Token keyword, Expr value\",\n//< Functions return-ast\n/* Statements and State var-stmt-ast < Control Flow while-ast\n      \"Var        : Token name, Expr initializer\"\n*/\n//> Control Flow while-ast\n      \"Var        : Token name, Expr initializer\",\n      \"While      : Expr condition, Stmt body\"\n//< Control Flow while-ast\n    ));\n//< Statements and State stmt-ast\n//< call-define-ast\n  }\n//> define-ast\n  private static void defineAst(\n      String outputDir, String baseName, List<String> types)\n      throws IOException {\n    String path = outputDir + \"/\" + baseName + \".java\";\n    PrintWriter writer = new PrintWriter(path, \"UTF-8\");\n\n//> omit\n    writer.println(\"//> Appendix II \" + baseName.toLowerCase());\n//< omit\n    writer.println(\"package com.craftinginterpreters.lox;\");\n    writer.println();\n    writer.println(\"import java.util.List;\");\n    writer.println();\n    writer.println(\"abstract class \" + baseName + \" {\");\n\n//> call-define-visitor\n    defineVisitor(writer, baseName, types);\n\n//< call-define-visitor\n//> omit\n    writer.println();\n    writer.println(\"  // Nested \" + baseName + \" classes here...\");\n//< omit\n//> nested-classes\n    // The AST classes.\n    for 
(String type : types) {\n      String className = type.split(\":\")[0].trim();\n      String fields = type.split(\":\")[1].trim(); // [robust]\n      defineType(writer, baseName, className, fields);\n    }\n//< nested-classes\n//> base-accept-method\n\n    // The base accept() method.\n    writer.println();\n    writer.println(\"  abstract <R> R accept(Visitor<R> visitor);\");\n\n//< base-accept-method\n    writer.println(\"}\");\n//> omit\n    writer.println(\"//< Appendix II \" + baseName.toLowerCase());\n//< omit\n    writer.close();\n  }\n//< define-ast\n//> define-visitor\n  private static void defineVisitor(\n      PrintWriter writer, String baseName, List<String> types) {\n    writer.println(\"  interface Visitor<R> {\");\n\n    for (String type : types) {\n      String typeName = type.split(\":\")[0].trim();\n      writer.println(\"    R visit\" + typeName + baseName + \"(\" +\n          typeName + \" \" + baseName.toLowerCase() + \");\");\n    }\n\n    writer.println(\"  }\");\n  }\n//< define-visitor\n//> define-type\n  private static void defineType(\n      PrintWriter writer, String baseName,\n      String className, String fieldList) {\n//> omit\n    writer.println(\"//> \" +\n        baseName.toLowerCase() + \"-\" + className.toLowerCase());\n//< omit\n    writer.println(\"  static class \" + className + \" extends \" +\n        baseName + \" {\");\n\n//> omit\n    // Hack. Stmt.Class has such a long constructor that it overflows\n    // the line length on the Appendix II page. 
Wrap it.\n    if (fieldList.length() > 64) {\n      fieldList = fieldList.replace(\", \", \",\\n          \");\n    }\n\n//< omit\n    // Constructor.\n    writer.println(\"    \" + className + \"(\" + fieldList + \") {\");\n\n//> omit\n    fieldList = fieldList.replace(\",\\n          \", \", \");\n//< omit\n    // Store parameters in fields.\n    String[] fields = fieldList.split(\", \");\n    for (String field : fields) {\n      String name = field.split(\" \")[1];\n      writer.println(\"      this.\" + name + \" = \" + name + \";\");\n    }\n\n    writer.println(\"    }\");\n//> accept-method\n\n    // Visitor pattern.\n    writer.println();\n    writer.println(\"    @Override\");\n    writer.println(\"    <R> R accept(Visitor<R> visitor) {\");\n    writer.println(\"      return visitor.visit\" +\n        className + baseName + \"(this);\");\n    writer.println(\"    }\");\n//< accept-method\n\n    // Fields.\n    writer.println();\n    for (String field : fields) {\n      writer.println(\"    final \" + field + \";\");\n    }\n\n    writer.println(\"  }\");\n//> omit\n    writer.println(\"//< \" +\n        baseName.toLowerCase() + \"-\" + className.toLowerCase());\n//< omit\n  }\n//< define-type\n//> pastry-visitor\n  interface PastryVisitor {\n    void visitBeignet(Beignet beignet); // [overload]\n    void visitCruller(Cruller cruller);\n  }\n//< pastry-visitor\n//> pastries\n  abstract class Pastry {\n//> pastry-accept\n    abstract void accept(PastryVisitor visitor);\n//< pastry-accept\n  }\n\n  class Beignet extends Pastry {\n//> beignet-accept\n    @Override\n    void accept(PastryVisitor visitor) {\n      visitor.visitBeignet(this);\n    }\n//< beignet-accept\n  }\n\n  class Cruller extends Pastry {\n//> cruller-accept\n    @Override\n    void accept(PastryVisitor visitor) {\n      visitor.visitCruller(this);\n    }\n//< cruller-accept\n  }\n//< pastries\n}\n"
  },
  {
    "path": "jlox",
    "content": "#!/usr/bin/env bash\n\nscript_dir=$(dirname \"$0\")\njava -cp \"${script_dir}/build/java\" com.craftinginterpreters.lox.Lox \"$@\"\n"
  },
  {
    "path": "note/BISAC.txt",
    "content": "COMPUTERS / Programming / Compilers\nCOMPUTERS / Languages / General\nCOMPUTERS / Software Development & Engineering / Tools"
  },
  {
    "path": "note/answers/chapter01_introduction/1.md",
    "content": "Markdown, Jinja2, Makefile, SASS, CSS, HTML. There's also the homegrown little\ntags inserted in the code and Markdown to weave the two together.\n\nThe tests used to ensure the interpreters work correctly also have a\nmini-language embedded in comments to define expectations for how the test\nshould behave.\n\nThis doesn't count the Python scripts that glue this all together, since Python\nis a general-purpose language.\n"
  },
  {
    "path": "note/answers/chapter01_introduction/2/Hello.java",
    "content": "public class Hello {\n  public static void main(String[] args) {\n    System.out.println(\"Hello, world!\");\n  }\n}\n"
  },
  {
    "path": "note/answers/chapter01_introduction/2/Makefile",
    "content": "# Compile the Java file to a class file.\nHello.class: Hello.java\n\t@ javac Hello.java\n\n# Convenience target to build and run it.\nrun: Hello.class\n\t@ java Hello\n\n# Tell make that \"run\" is not the name of a file.\n.PHONY: run\n"
  },
  {
    "path": "note/answers/chapter01_introduction/3/Makefile",
    "content": "# Compile the .c file to an executable.\nlinked_list: linked_list.c\n\tgcc linked_list.c -o linked_list\n"
  },
  {
    "path": "note/answers/chapter01_introduction/3/linked_list.c",
    "content": "#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\ntypedef struct sNode {\n  struct sNode* prev;\n  struct sNode* next;\n  char* string;\n} Node;\n\n// Insert a new node containing [string] after [prev], or at the beginning of\n// the list if [prev] is NULL.\nvoid insert(Node** list, Node* prev, const char* string) {\n  // Create the new node and copy the string to the heap.\n  Node* node = malloc(sizeof(Node));\n  node->string = malloc(strlen(string) + 1);\n  strcpy(node->string, string);\n\n  if (prev == NULL) {\n    if (*list != NULL) (*list)->prev = node;\n    node->prev = NULL;\n    node->next = *list;\n    *list = node;\n  } else {\n    node->next = prev->next;\n    if (node->next != NULL) node->next->prev = node;\n    prev->next = node;\n    node->prev = prev;\n  }\n}\n\nNode* find(Node* list, const char* string) {\n  while (list != NULL) {\n    if (strcmp(string, list->string) == 0) {\n      return list;\n    }\n    \n    list = list->next;\n  }\n  \n  // Not found.\n  return NULL;\n}\n\nvoid delete(Node** list, Node* node) {\n  // Unlink it.\n  if (node->prev != NULL) node->prev->next = node->next;\n  if (node->next != NULL) node->next->prev = node->prev;\n  \n  // If we're deleting the head, update it.\n  if (*list == node) *list = node->next;\n  \n  free(node->string);\n  free(node);\n}\n\nvoid dump(Node* list) {\n  while (list != NULL) {\n    printf(\"%p [prev %p next %p] %s\\n\",\n           list, list->prev, list->next, list->string);\n    list = list->next;\n  }\n}\n\nint main(int argc, const char* argv[]) {\n  printf(\"Hello, World!\\n\");\n  \n  Node* list = NULL;\n  insert(&list, NULL, \"four\");\n  insert(&list, NULL, \"one\");\n  insert(&list, find(list, \"one\"), \"two\");\n  insert(&list, find(list, \"two\"), \"three\");\n  \n  dump(list);\n  printf(\"-- delete three --\\n\");\n  delete(&list, find(list, \"three\"));\n  dump(list);\n\n  printf(\"-- delete one --\\n\");\n  delete(&list, find(list, \"one\"));\n  
dump(list);\n\n  return 0;\n}\n"
  },
  {
    "path": "note/answers/chapter01_introduction/3/linked_list.xcodeproj/project.pbxproj",
    "content": "// !$*UTF8*$!\n{\n\tarchiveVersion = 1;\n\tclasses = {\n\t};\n\tobjectVersion = 46;\n\tobjects = {\n\n/* Begin PBXBuildFile section */\n\t\t29BE01271DBD3A9300EB6E51 /* linked_list.c in Sources */ = {isa = PBXBuildFile; fileRef = 29BE01261DBD3A9300EB6E51 /* linked_list.c */; };\n/* End PBXBuildFile section */\n\n/* Begin PBXCopyFilesBuildPhase section */\n\t\t2973DC0C1DBD3A69005047A2 /* CopyFiles */ = {\n\t\t\tisa = PBXCopyFilesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tdstPath = /usr/share/man/man1/;\n\t\t\tdstSubfolderSpec = 0;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 1;\n\t\t};\n/* End PBXCopyFilesBuildPhase section */\n\n/* Begin PBXFileReference section */\n\t\t2973DC0E1DBD3A69005047A2 /* linked_list */ = {isa = PBXFileReference; explicitFileType = \"compiled.mach-o.executable\"; includeInIndex = 0; path = linked_list; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t29BE01261DBD3A9300EB6E51 /* linked_list.c */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.c; path = linked_list.c; sourceTree = \"<group>\"; };\n/* End PBXFileReference section */\n\n/* Begin PBXFrameworksBuildPhase section */\n\t\t2973DC0B1DBD3A69005047A2 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXFrameworksBuildPhase section */\n\n/* Begin PBXGroup section */\n\t\t2973DC051DBD3A69005047A2 = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t29BE01261DBD3A9300EB6E51 /* linked_list.c */,\n\t\t\t\t2973DC0F1DBD3A69005047A2 /* Products */,\n\t\t\t);\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t2973DC0F1DBD3A69005047A2 /* Products */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t2973DC0E1DBD3A69005047A2 /* linked_list */,\n\t\t\t);\n\t\t\tname = Products;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXGroup section */\n\n/* Begin PBXNativeTarget section 
*/\n\t\t2973DC0D1DBD3A69005047A2 /* linked_list */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 2973DC151DBD3A69005047A2 /* Build configuration list for PBXNativeTarget \"linked_list\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t2973DC0A1DBD3A69005047A2 /* Sources */,\n\t\t\t\t2973DC0B1DBD3A69005047A2 /* Frameworks */,\n\t\t\t\t2973DC0C1DBD3A69005047A2 /* CopyFiles */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = linked_list;\n\t\t\tproductName = linked_list;\n\t\t\tproductReference = 2973DC0E1DBD3A69005047A2 /* linked_list */;\n\t\t\tproductType = \"com.apple.product-type.tool\";\n\t\t};\n/* End PBXNativeTarget section */\n\n/* Begin PBXProject section */\n\t\t2973DC061DBD3A69005047A2 /* Project object */ = {\n\t\t\tisa = PBXProject;\n\t\t\tattributes = {\n\t\t\t\tLastUpgradeCheck = 0640;\n\t\t\t\tORGANIZATIONNAME = \"Robert Nystrom\";\n\t\t\t\tTargetAttributes = {\n\t\t\t\t\t2973DC0D1DBD3A69005047A2 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 6.4;\n\t\t\t\t\t};\n\t\t\t\t};\n\t\t\t};\n\t\t\tbuildConfigurationList = 2973DC091DBD3A69005047A2 /* Build configuration list for PBXProject \"linked_list\" */;\n\t\t\tcompatibilityVersion = \"Xcode 3.2\";\n\t\t\tdevelopmentRegion = English;\n\t\t\thasScannedForEncodings = 0;\n\t\t\tknownRegions = (\n\t\t\t\ten,\n\t\t\t);\n\t\t\tmainGroup = 2973DC051DBD3A69005047A2;\n\t\t\tproductRefGroup = 2973DC0F1DBD3A69005047A2 /* Products */;\n\t\t\tprojectDirPath = \"\";\n\t\t\tprojectRoot = \"\";\n\t\t\ttargets = (\n\t\t\t\t2973DC0D1DBD3A69005047A2 /* linked_list */,\n\t\t\t);\n\t\t};\n/* End PBXProject section */\n\n/* Begin PBXSourcesBuildPhase section */\n\t\t2973DC0A1DBD3A69005047A2 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t29BE01271DBD3A9300EB6E51 /* linked_list.c in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXSourcesBuildPhase section */\n\n/* Begin 
XCBuildConfiguration section */\n\t\t2973DC131DBD3A69005047A2 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++0x\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = dwarf;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu99;\n\t\t\t\tGCC_DYNAMIC_NO_PIC = NO;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 0;\n\t\t\t\tGCC_PREPROCESSOR_DEFINITIONS = (\n\t\t\t\t\t\"DEBUG=1\",\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t);\n\t\t\t\tGCC_SYMBOLS_PRIVATE_EXTERN = NO;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.11;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = YES;\n\t\t\t\tONLY_ACTIVE_ARCH = YES;\n\t\t\t\tSDKROOT = macosx;\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t2973DC141DBD3A69005047A2 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++0x\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = 
YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu99;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.11;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tSDKROOT = macosx;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t2973DC161DBD3A69005047A2 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t2973DC171DBD3A69005047A2 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n/* End XCBuildConfiguration section */\n\n/* Begin XCConfigurationList section */\n\t\t2973DC091DBD3A69005047A2 /* Build configuration list for PBXProject \"linked_list\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t2973DC131DBD3A69005047A2 /* Debug */,\n\t\t\t\t2973DC141DBD3A69005047A2 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t2973DC151DBD3A69005047A2 /* Build configuration list for PBXNativeTarget \"linked_list\" */ = {\n\t\t\tisa = 
XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t2973DC161DBD3A69005047A2 /* Debug */,\n\t\t\t\t2973DC171DBD3A69005047A2 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n/* End XCConfigurationList section */\n\t};\n\trootObject = 2973DC061DBD3A69005047A2 /* Project object */;\n}\n"
  },
  {
    "path": "note/answers/chapter01_introduction/3/linked_list.xcodeproj/project.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"self:linked_list.xcodeproj\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "note/answers/chapter02_map.md",
    "content": "## 1. Find the various parts in an open source implementation.\n\nTODO\n\n## 2. Why not use a JIT?\n\n1. It's really complex to implement, debug, and maintain. Few people have the\n   skill to do it.\n2. Like a native code compiler (which it is), it ties you to a specific CPU\n   architecture.\n3. Bytecode is generally more compact than machine code (since it's closer to\n   the semantics of the language), so it takes up less memory. In platforms\n   like embedded devices where memory may matter more than speed, that can be\n   a worthwhile trade-off.\n4. Some platforms, like iOS and most game consoles, expressly disallow\n   executing code generated at runtime. The OS simply won't allow you to jump\n   into memory that can be written to.\n\n## 3. Why do Lisp compilers also contain an interpreter?\n\nMost Lisps support macros -- code that is executed at compile time, so the\nimplementation needs to be able to evaluate the macro itself while in the middle\nof compiling. You could do that by *compiling* the macro and then running that,\nbut that's a lot of overhead.\n"
  },
  {
    "path": "note/answers/chapter03_lox.md",
    "content": "1.  I've, uh, written plenty. Look in /test/. Here's another:\n    ~~~~\n    class List {\n      init(data, next) {\n        this.data = data;\n        this.next = next;\n      }\n\n      map(function) {\n        var data = function(this.data);\n        var next;\n        if (this.next != nil) next = this.next.map(function);\n        return List(data, next);\n      }\n\n      display() {\n        var list = this;\n        while (list != nil) {\n          print(list.data);\n          list = list.next;\n        }\n      }\n    }\n\n    var list = List(1, List(2, List(3, List(4, nil))));\n    list.display();\n\n    fun double(n) { return n * 2; }\n    list = list.map(double);\n    list.display();\n    ~~~~\n\n2.  Here's a few:\n\n    1.  What happens if you access a global variable in a function before it is\n        defined?\n\n    2.  What does it mean to say something is \"an error\"? Runtime error?\n        Compile time?\n\n    3.  What kind of expressions are allowed when a superclass is specified?\n\n    4.  What happens if you declare two classes or functions with the same name?\n\n    5.  Can a class inherit from something that isn't a class?\n\n    6.  Can you reassign to the name that is bound by a class or function\n        declaration?\n\n3.  The big ones are:\n\n    1.  Lists/arrays. You can build your own linked lists, but there's no way to\n        create a data structure that stores a contiguous series of values that\n        can be accessed in constant time in user code. That has to be baked\n        into the language or core library.\n\n    2.  Some mechanism for handling runtime errors along the lines of exception\n        handling.\n\n    Also:\n\n    3.  No `break` or `continue` for loops.\n\n    4.  No `switch`.\n"
  },
  {
    "path": "note/answers/chapter04_scanning.md",
    "content": "1.  Both of them have significant indentation. To handle that, the scanner\n    emits synthetic \"{\" and \"}\" tokens (or \"indent\" and \"dedent\" as Python\n    calls them), as if there were explicit delimiters for each level of\n    indentation.\n\n    In order to know when a new line begins or ends one or more levels of\n    indentation, the scanner has to track the *previous* indentation value.\n    That state has to be stored in the scanner, which means it has a little bit\n    of *memory*. That makes it no longer a regular language, which is defined\n    to only need to store a single finite number identifying which state it's\n    in.\n\n    You *could* make a regular language for significant indentation, by having\n    a hardcoded limit to the maximum amount of indentation, but that starts to\n    split semantic hairs around the Chomsky hierarchy.\n\n2.  In CoffeeScript, parentheses are optional in function calls. You can call a\n    function like:\n\n    ```coffeescript\n    doStuff withThis\n    ```\n\n    Also, there is a nice syntax for lambda functions:\n\n    ```coffeescript\n    () -> someLambda\n    -> anotherOne\n    ```\n\n    On the second line, you can see that you can omit the `()` if the lambda\n    takes no parameters. So what does this do:\n\n    ```coffeescript\n    someFunction () -> someLambda\n    ```\n\n    Does it call `someFunction` with zero parameters and then call the result of\n    *that* with one parameter, a lambda? Or does it call `someFunction` with\n    one parameter, the lambda? 
The answer depends on spaces:\n\n    ```coffeescript\n    someFunction() -> someLambda\n    # Means the same as:\n    someFunction()(() -> someLambda)\n\n    someFunction () -> someLambda\n    # Means the same as:\n    someFunction(() -> someLambda)\n    ```\n\n    Ruby has similar corner cases because it also allows omitting the parentheses\n    in method calls (which is where CoffeeScript gets it from).\n\n    The C preprocessor relies on spaces to distinguish function macros from\n    simple macros:\n\n    ```c\n    #define MACRO1 (p) (p)\n    #define MACRO2(p) (p)\n    ```\n\n    Here, `MACRO1` is a simple text macro that expands to `(p) (p)` when used.\n    `MACRO2(p)` is a function-like macro that takes a parameter and expands to\n    `(p)` with `p` replaced by the parameter.\n\n3.  Programmers often write \"doc comments\" above their functions and types. A\n    documentation generator or an IDE that shows help text for declarations\n    needs access to those comments, so a scanner for those should include them.\n\n    An automated code formatter obviously needs to preserve comments and may\n    want to be aware of the original whitespace if some of the author's\n    formatting should be preserved.\n\n4.  You can see where I've implemented them for a similar language here:\n\n    https://github.com/munificent/wren/blob/c6eb0be99014d34085e2d24c696aed449e2fb171/src/vm/wren_compiler.c#L663\n\n    The interesting part is the `nesting` variable. Like challenge #1, we\n    require some extra state to track the nesting, which makes this not quite\n    regular.\n\n    Note also that we need to handle an unterminated block comment.\n\n\n"
  },
  {
    "path": "note/answers/chapter05_representing.md",
    "content": "1.  There are a few ways to do it. Here is one:\n\n    ```text\n    expr → expr calls\n    expr → IDENTIFIER\n    expr → NUMBER\n\n    calls → calls call\n    calls → call\n\n    call → \"(\" \")\"\n    call → \"(\" arguments \")\"\n    call → \".\" IDENTIFIER\n\n    arguments → expr\n    arguments → arguments \",\" expr\n    ```\n\n    It's the syntax for a function invocation.\n\n2.  One way is to create a record or tuple containing a function pointer for\n    each operation. In order to allow defining new types and passing them to\n    existing code, these functions need to encapsulate the type entirely -- the\n    existing code isn't aware of it, so it can't type check. You can do that by\n    having the functions be closures that all close over the same shared object,\n    \"this\", basically.\n\n3.  Here you go:\n\n    ```java\n    class RpnPrinter implements Expr.Visitor<String> {\n      String print(Expr expr) {\n        return expr.accept(this);\n      }\n\n      @Override\n      public String visitBinaryExpr(Expr.Binary expr) {\n        return expr.left.accept(this) + \" \" +\n               expr.right.accept(this) + \" \" +\n               expr.operator.lexeme;\n      }\n\n      @Override\n      public String visitGroupingExpr(Expr.Grouping expr) {\n        return expr.expression.accept(this);\n      }\n\n      @Override\n      public String visitLiteralExpr(Expr.Literal expr) {\n        // A nil literal is stored as null, so don't call toString() on it.\n        if (expr.value == null) return \"nil\";\n        return expr.value.toString();\n      }\n\n      @Override\n      public String visitUnaryExpr(Expr.Unary expr) {\n        String operator = expr.operator.lexeme;\n        if (expr.operator.type == TokenType.MINUS) {\n          // Can't use same symbol for unary and binary.\n          operator = \"~\";\n        }\n\n        return expr.right.accept(this) + \" \" + operator;\n      }\n\n      public static void main(String[] args) {\n        Expr expression = new Expr.Binary(\n            new Expr.Unary(\n                new Token(TokenType.MINUS, \"-\", 
null, 1),\n                new Expr.Literal(123)),\n            new Token(TokenType.STAR, \"*\", null, 1),\n            new Expr.Grouping(\n                new Expr.Literal(\"str\")));\n\n        System.out.println(new RpnPrinter().print(expression));\n      }\n    }\n    ```\n\n    Note that we have to handle unary \"-\" specially. In RPN, we can't use the\n    same symbol for both unary and binary forms. When we encounter it, we\n    wouldn't know whether to pop one or two numbers off the stack. So, to\n    disambiguate, we pick a different symbol for negation.\n"
  },
  {
    "path": "note/answers/chapter06_parsing.md",
    "content": "1.  The comma operator has the lowest precedence, so it goes between expression\n    and equality:\n\n    ```ebnf\n    expression → comma ;\n    comma      → equality ( \",\" equality )* ;\n    equality   → comparison ( ( \"!=\" | \"==\" ) comparison )* ;\n    comparison → term ( ( \">\" | \">=\" | \"<\" | \"<=\" ) term )* ;\n    term       → factor ( ( \"-\" | \"+\" ) factor )* ;\n    factor     → unary ( ( \"/\" | \"*\" ) unary )* ;\n    unary      → ( \"!\" | \"-\" | \"--\" | \"++\" ) unary\n               | postfix ;\n    postfix    → primary ( \"--\" | \"++\" )* ;\n    primary    → NUMBER | STRING | \"true\" | \"false\" | \"nil\"\n               | \"(\" expression \")\" ;\n    ```\n\n    We could define a new syntax tree node by adding this to the `defineAst()`\n    call:\n\n    ```java\n    \"Comma    : Expr left, Expr right\",\n    ```\n\n    But a simpler choice is to treat it like any other binary operator and\n    reuse Expr.Binary.\n\n    Parsing is similar to other infix operators:\n\n    ```java\n    private Expr expression() {\n      return comma();\n    }\n\n    private Expr comma() {\n      Expr expr = equality();\n\n      while (match(COMMA)) {\n        Token operator = previous();\n        Expr right = equality();\n        expr = new Expr.Binary(expr, operator, right);\n      }\n\n      return expr;\n    }\n    ```\n\n    Keep in mind that commas are already used in the grammar to separate\n    arguments in function calls. With the above change, this:\n\n    ```lox\n    foo(1, 2)\n    ```\n\n    Now gets parsed as:\n\n    ```lox\n    foo((1, 2))\n    ```\n\n    In other words, pass a single argument to `foo`, the result of evaluating\n    `1, 2`. That's not what we want. 
To fix that, we simply change the way we\n    parse function arguments to require a higher precedence expression than the\n    comma operator:\n\n    ```java\n    if (!check(RIGHT_PAREN)) {\n      do {\n        if (arguments.size() >= 8) {\n          error(peek(), \"Can't have more than 8 arguments.\");\n        }\n        arguments.add(equality()); // <-- was expression().\n      } while (match(COMMA));\n    }\n    ```\n\n2.  We just need one new rule.\n\n    ```ebnf\n    expression  → conditional ;\n    conditional → equality ( \"?\" expression \":\" conditional )? ;\n    // Other rules...\n    ```\n\n    The precedence of the operands is pretty interesting. The left operand has\n    higher precedence than the others, and the middle operand has lower\n    precedence than the condition expression itself. That allows:\n\n        a ? b = c : d\n\n    Again, I won't bother showing the scanner and token changes since they're\n    pretty obvious.\n\n    ```java\n    private Expr expression() {\n      return conditional();\n    }\n\n    private Expr conditional() {\n      Expr expr = equality();\n\n      if (match(QUESTION)) {\n        Expr thenBranch = expression();\n        consume(COLON,\n            \"Expect ':' after then branch of conditional expression.\");\n        Expr elseBranch = conditional();\n        expr = new Expr.Conditional(expr, thenBranch, elseBranch);\n      }\n\n      return expr;\n    }\n    ```\n\n3.  Here's an updated grammar. The grammar itself doesn't \"know\" that some of\n    these productions are errors. 
The parser handles that.\n\n    ```ebnf\n    expression → equality ;\n    equality   → comparison ( ( \"!=\" | \"==\" ) comparison )* ;\n    comparison → term ( ( \">\" | \">=\" | \"<\" | \"<=\" ) term )* ;\n    term       → factor ( ( \"-\" | \"+\" ) factor )* ;\n    factor     → unary ( ( \"/\" | \"*\" ) unary )* ;\n    unary      → ( \"!\" | \"-\" | \"--\" | \"++\" ) unary\n               | postfix ;\n    postfix    → primary ( \"--\" | \"++\" )* ;\n    primary    → NUMBER | STRING | \"true\" | \"false\" | \"nil\"\n               | \"(\" expression \")\"\n               // Error productions...\n               | ( \"!=\" | \"==\" ) equality\n               | ( \">\" | \">=\" | \"<\" | \"<=\" ) comparison\n               | ( \"+\" ) term\n               | ( \"/\" | \"*\" ) factor ;\n    ```\n\n    Note that \"-\" isn't an error production because that *is* a valid prefix\n    expression.\n\n    With the normal infix productions, the operand non-terminals are one\n    precedence level higher than the operator's own precedence. In order to\n    handle a series of operators of the same precedence, the rules explicitly\n    allow repetition.\n\n    With the error productions, though, the right-hand operand rule is the same\n    precedence level. That will effectively strip off the erroneous leading\n    operator and then consume a series of infix uses of operators at the same\n    level by reusing the existing correct rule. 
For example:\n\n    ```lox\n    + a - b + c - d\n    ```\n\n    The error production for `+` will match the leading `+` and then use\n    `term` to also match the rest of the expression.\n\n    ```java\n    private Expr primary() {\n      if (match(FALSE)) return new Expr.Literal(false);\n      if (match(TRUE)) return new Expr.Literal(true);\n      if (match(NIL)) return new Expr.Literal(null);\n\n      if (match(NUMBER, STRING)) {\n        return new Expr.Literal(previous().literal);\n      }\n\n      if (match(LEFT_PAREN)) {\n        Expr expr = expression();\n        consume(RIGHT_PAREN, \"Expect ')' after expression.\");\n        return new Expr.Grouping(expr);\n      }\n\n      // Error productions.\n      if (match(BANG_EQUAL, EQUAL_EQUAL)) {\n        error(previous(), \"Missing left-hand operand.\");\n        equality();\n        return null;\n      }\n\n      if (match(GREATER, GREATER_EQUAL, LESS, LESS_EQUAL)) {\n        error(previous(), \"Missing left-hand operand.\");\n        comparison();\n        return null;\n      }\n\n      if (match(PLUS)) {\n        error(previous(), \"Missing left-hand operand.\");\n        term();\n        return null;\n      }\n\n      if (match(SLASH, STAR)) {\n        error(previous(), \"Missing left-hand operand.\");\n        factor();\n        return null;\n      }\n\n      throw error(peek(), \"Expect expression.\");\n    }\n    ```\n"
  },
  {
    "path": "note/answers/chapter07_evaluating.md",
    "content": "1.  Python 3 allows comparing all of the various number types with each other,\n    except for complex numbers. Booleans (True and False) are a subclass of\n    int and work like 1 and 0 for comparison.\n\n    Strings can be compared with each other and are ordered lexicographically.\n    Likewise other sequences.\n\n    Comparing sets is defined in terms of subsets and supersets, so that, for\n    example `{1, 2} < {1, 2, 3}`. This isn't a total order since many pairs of\n    sets are neither subsets nor supersets of each other.\n\n    I think it would be reasonable to extend Lox to support comparing strings\n    with each other. I wouldn't support comparing other built-in types, nor\n    mixing them. Allowing `\"1\" < 2` is a recipe for confusion.\n\n2.  Replace the TokenType.PLUS case with:\n\n    ```java\n    case PLUS:\n      if (left instanceof String || right instanceof String) {\n        return stringify(left) + stringify(right);\n      }\n\n      if (left instanceof Double && right instanceof Double) {\n        return (double)left + (double)right;\n      }\n\n      throw new RuntimeError(expr.operator,\n          \"Operands must be two numbers or two strings.\");\n    ```\n\n3.  It returns Infinity, -Infinity, or NaN based on the sign of the dividend. Given\n    that Lox is a high level scripting language, I think it would be better to\n    raise a runtime error to let the user know something got weird. That's what\n    Python and Ruby do.\n\n    On the other hand, given that Lox gives the user no way to catch and\n    handle runtime errors, not throwing one might be more flexible.\n"
  },
  {
    "path": "note/answers/chapter08_statements.md",
    "content": "1.  It can be hard to do this in a clean way since the expression grammar\n    overlaps the statement grammar so much (every expression is also the\n    beginning of an expression statement containing that same expression).\n\n    One trick some parsers use is to simply *try* to parse the syntax as a\n    statement. If that fails, hide any parse errors and then try to parse it\n    again as expression.\n\n    I took a slightly different approach. Instead, the parser tries to parse a\n    list of statements, but if it knows it's allowed to parse a single\n    expression, and it reaches the end of the source right after parsing the\n    expression part of an expression statement, then it stops early and returns\n    that expression.\n\n    All that's left is to see if the parsed value is an expression and, if so,\n    evaluate it and print it.\n\n    This isn't the cleanest implementation, but here goes. In Parser, add two\n    new fields:\n\n    ```java\n    private boolean allowExpression;\n    private boolean foundExpression = false;\n    ```\n\n    Then define this method:\n\n    ```java\n    Object parseRepl() {\n      allowExpression = true;\n      List<Stmt> statements = new ArrayList<>();\n      while (!isAtEnd()) {\n        statements.add(declaration());\n\n        if (foundExpression) {\n          Stmt last = statements.get(statements.size() - 1);\n          return ((Stmt.Expression) last).expression;\n        }\n\n        allowExpression = false;\n      }\n\n      return statements;\n    }\n    ```\n\n    And change expressionStatement() to:\n\n    ```java\n    private Stmt expressionStatement() {\n      Expr expr = expression();\n\n      if (allowExpression && isAtEnd()) {\n        foundExpression = true;\n      } else {\n        consume(SEMICOLON, \"Expect ';' after expression.\");\n      }\n      return new Stmt.Expression(expr);\n    }\n    ```\n\n    In Interpreter, add:\n\n    ```java\n    String interpret(Expr expression) {\n      
try {\n        Object value = evaluate(expression);\n        return stringify(value);\n      } catch (RuntimeError error) {\n        Lox.runtimeError(error);\n        return null;\n      }\n    }\n    ```\n\n    Finally, in Lox, change runPrompt() to:\n\n    ```java\n    private static void runPrompt() throws IOException {\n      InputStreamReader input = new InputStreamReader(System.in);\n      BufferedReader reader = new BufferedReader(input);\n\n      for (;;) {\n        hadError = false;\n\n        System.out.print(\"> \");\n        Scanner scanner = new Scanner(reader.readLine());\n        List<Token> tokens = scanner.scanTokens();\n\n        Parser parser = new Parser(tokens);\n        Object syntax = parser.parseRepl();\n\n        // Ignore it if there was a syntax error.\n        if (hadError) continue;\n\n        if (syntax instanceof List) {\n          interpreter.interpret((List<Stmt>)syntax);\n        } else if (syntax instanceof Expr) {\n          String result = interpreter.interpret((Expr)syntax);\n          if (result != null) {\n            System.out.println(\"= \" + result);\n          }\n        }\n      }\n    }\n    ```\n\n    That should about do it.\n\n2.  This is pretty simple. Instead of initializing variables with null if they\n    have no initializer, we use a special sentinel value to distinguish it from\n    Lox's nil. 
Then, we check for that when the variable is accessed.\n\n    In Interpreter, add:\n\n    ```java\n    private static Object uninitialized = new Object();\n    ```\n\n    Change the first line of visitVarStmt() to:\n\n    ```java\n    Object value = uninitialized;\n    ```\n\n    Finally, change visitVariableExpr() to:\n\n    ```java\n    public Object visitVariableExpr(Expr.Variable expr) {\n      Object value = environment.get(expr.name);\n      if (value == uninitialized) {\n        throw new RuntimeError(expr.name,\n            \"Variable must be initialized before use.\");\n      }\n      return value;\n    }\n    ```\n\n    The main downside is that checking for the uninitialized variable on every\n    single access significantly slows execution for what is a very common\n    operation. Not a big deal given that our Java interpreter isn't designed\n    for speed anyway.\n\n3.  > What does the following program do?\n\n    It prints 3. The shadowed variable doesn't come into scope until *after* its\n    initializer expression is evaluated, so `a + 2` is evaluated using the\n    outer `a`, whose value is 1. Then the result is stored in the new `a`.\n\n    > What did you expect it to do?\n\n    Well, I wrote this book, so it's no surprise to me.\n\n    > Is it what you think it should do?\n\n    Code like this is rare in practice, so I don't care too much. But the\n    current behavior is a little surprising. People read code left-to-right, so\n    they probably expect the new variable to be in scope as soon as they scan\n    over its name after `var`.\n\n    Ideally, I'd make this kind of code a static error. Put the variable in\n    scope as soon as its name is encountered but in a special \"unusable\" state.\n    Then, once its initializer is done, make it available. If the initializer\n    references it, make that a static error.\n\n    > What does analogous code in other languages you are familiar with do?\n\n    Java disallows shadowing local variables. 
C# allows shadowing, but doesn't\n    allow multiple mentions of the same name in the same block to resolve to\n    different variables.\n\n    > What do you think users will expect this to do?\n\n    I think they'd be surprised if the code was valid at all, and would\n    probably consider it bad code even if it did do something.\n"
  },
  {
    "path": "note/answers/chapter09_control.md",
    "content": "1.  The basic idea is that the control flow operations become methods that take\n    callbacks for the blocks to execute when true or false. You define two\n    classes with singleton instances, one for true and one for false. The\n    implementations of the control flow methods on the true class invoke the\n    then callbacks. The ones on the false class implement the else callbacks.\n\n    Like so:\n\n    ```lox\n    class True {\n      ifThen(thenBranch) {\n        return thenBranch();\n      }\n\n      ifThenElse(thenBranch, elseBranch) {\n        return thenBranch();\n      }\n    }\n\n    class False {\n      ifThen(thenBranch) {\n        return nil;\n      }\n\n      ifThenElse(thenBranch, elseBranch) {\n        return elseBranch();\n      }\n    }\n    ```\n\n    Then we make singleton instances of these classes:\n\n    ```lox\n    var t = True();\n    var f = False();\n    ```\n\n    You can try them out like so:\n\n    ```lox\n    fun test(condition) {\n      fun ifThenFn() {\n        print \"if then -> then\";\n      }\n\n      condition.ifThen(ifThenFn);\n\n      fun ifThenElseThenFn() {\n        print \"if then else -> then\";\n      }\n\n      fun ifThenElseElseFn() {\n        print \"if then else -> else\";\n      }\n\n      condition.ifThenElse(ifThenElseThenFn, ifThenElseElseFn);\n    }\n\n    test(t);\n    test(f);\n    ```\n\n    This is famously how Smalltalk implements its control flow.\n\n    It looks cumbersome because Lox doesn't have lambdas -- anonymous function\n    expressions -- but those would be easy to add to the language if\n    we wanted to go in this direction.\n\n    Even more powerful would a nice terse syntax for defining and passing a\n    closure to a method. The Grace language has a particularly nice notation\n    for passing multiple blocks to a method. 
If we adapted that to Lox, we'd\n    get something like:\n\n    ```text\n    fun test(condition) {\n      condition.ifThen {\n        print \"if then -> then\";\n      };\n\n      condition.ifThen {\n        print \"if then else -> then\";\n      } else {\n        print \"if then else -> else\";\n      };\n    }\n\n    test(t);\n    test(f);\n    ```\n\n    It starts to look like this control flow is built into the language even\n    though it's only method calls.\n\n2.  Scheme is the language that famously shows that all iteration can be\n    represented in terms of recursion and conditional execution. To execute a\n    chunk of code more than once, hoist it out into a function that calls itself\n    at the end of its body for the next iteration.\n\n    For example, we could represent this `for` loop:\n\n    ```lox\n    for (var i = 0; i < 100; i = i + 1) {\n      print i;\n    }\n    ```\n\n    Like so:\n\n    ```lox\n    fun forStep(i) {\n      print i;\n      if (i < 99) forStep(i + 1);\n    }\n\n    forStep(0);\n    ```\n\n    When you see heavy use of recursion like here, where there are almost a\n    hundred recursive calls, the concern is overflowing the stack. However, in\n    many cases, you don't need to preserve any information from the previous\n    call when beginning a recursive call. If the recursive call is in *tail\n    position* -- it's the last thing in the body of the function -- then you\n    can discard any stack space used by the previous call before beginning the\n    next one.\n\n    This **tail call optimization** lets you use recursion for an unbounded\n    number of iterations while consuming only a constant amount of stack space.\n    Scheme and some other functional languages require an implementation to\n    perform this optimization so that users can safely rely on recursion for\n    iteration.\n\n3.  
As usual, we start with the AST:\n\n    ```java\n    defineAst(outputDir, \"Stmt\", Arrays.asList(\n      \"Block      : List<Stmt> statements\",\n      \"Break      : \",  // <--\n      \"Expression : Expr expression\",\n      \"If         : Expr condition, Stmt thenBranch, Stmt elseBranch\",\n      \"Print      : Expr expression\",\n      \"Var        : Token name, Expr initializer\",\n      \"While      : Expr condition, Stmt body\"\n    ));\n    ```\n\n    Break doesn't have any fields, which actually breaks the little generator\n    script, so you also need to change defineType() to:\n\n    ```java\n    // Store parameters in fields.\n    String[] fields;\n    if (fieldList.isEmpty()) {\n      fields = new String[0];\n    } else {\n      fields = fieldList.split(\", \");\n    }\n    ```\n\n    Run that to get the new AST class. Now we need to push the syntax through the\n    front end, starting with the new keyword. In TokenType, add `BREAK`:\n\n    ```java\n    // Keywords.\n    AND, BREAK, CLASS, ELSE, FALSE, FUN, FOR, IF, NIL, OR,\n    ```\n\n    And then define it in the lexer:\n\n    ```java\n    keywords.put(\"break\",  BREAK);\n    ```\n\n    In the parser, we match the keyword in `statement()`:\n\n    ```java\n    if (match(BREAK)) return breakStatement();\n    ```\n\n    Which calls:\n\n    ```java\n    private Stmt breakStatement() {\n      consume(SEMICOLON, \"Expect ';' after 'break'.\");\n      return new Stmt.Break();\n    }\n    ```\n\n    We need some additional parser support. It should be a syntax error to use\n    `break` outside of a loop. 
We do that by adding a field in Parser to track\n    how many enclosing loops there currently are:\n\n    ```java\n    private int loopDepth = 0;\n    ```\n\n    In `forStatement()`, we update that when parsing the loop body:\n\n    ```java\n    try {\n      loopDepth++;\n      Stmt body = statement();\n\n      if (increment != null) {\n        body = new Stmt.Block(Arrays.asList(\n            body,\n            new Stmt.Expression(increment)));\n      }\n\n      if (condition == null) condition = new Expr.Literal(true);\n      body = new Stmt.While(condition, body);\n\n      if (initializer != null) {\n        body = new Stmt.Block(Arrays.asList(initializer, body));\n      }\n\n      return body;\n    } finally {\n      loopDepth--;\n    }\n    ```\n\n    Likewise `whileStatement()`:\n\n    ```java\n    try {\n      loopDepth++;\n      Stmt body = statement();\n\n      return new Stmt.While(condition, body);\n    } finally {\n      loopDepth--;\n    }\n    ```\n\n    Now we can check that when parsing the `break` statement:\n\n    ```java\n    private Stmt breakStatement() {\n      if (loopDepth == 0) {\n        error(previous(), \"Must be inside a loop to use 'break'.\");\n      }\n      consume(SEMICOLON, \"Expect ';' after 'break'.\");\n      return new Stmt.Break();\n    }\n    ```\n\n    To interpret this, we'll use exceptions to jump from the break out of the\n    loop. 
In Interpreter, define a class:\n\n    ```java\n    private static class BreakException extends RuntimeException {}\n    ```\n\n    Executing a `break` simply throws that:\n\n    ```java\n    @Override\n    public Void visitBreakStmt(Stmt.Break stmt) {\n      throw new BreakException();\n    }\n    ```\n\n    That gets caught by the `while` loop code and then proceeds from there.\n\n    ```java\n    @Override\n    public Void visitWhileStmt(Stmt.While stmt) {\n      try {\n        while (isTruthy(evaluate(stmt.condition))) {\n          execute(stmt.body);\n        }\n      } catch (BreakException ex) {\n        // Do nothing.\n      }\n      return null;\n    }\n    ```\n"
  },
  {
    "path": "note/answers/chapter10_functions.md",
    "content": "1.  Smalltalk has different call syntax for different arities. To define a\n    method that takes multiple arguments, you use **keyword selectors**. Each\n    argument has a piece of the method name preceding instead of using commas\n    as a separator. For example, a method like:\n\n    ```lox\n    list.insert(\"element\", 2)\n    ```\n\n    To insert \"element\" as index 2 would look like this in Smalltalk:\n\n    ```smalltalk\n    list insert: \"element\" at: 2\n    ```\n\n    Smalltalk doesn't use a dot to separate method name from receiver. More\n    interestingly, the \"insert:\" and \"at:\" parts both form a single method\n    call whose full name is \"insert:at:\". Since the selectors and the colons\n    that separate them form part of the method's name, there's no way to call\n    it with the wrong number of arguments. You can't pass too many or two few\n    arguments to \"insert:at:\" because there would be no way to write that call\n    while still actually naming that method.\n\n2.  This requires juggling some code around. In GenerateAst, we need a node\n    for function expressions. In the defineAst() call for Expr, add:\n\n    ```java\n    \"Function : List<Token> parameters, List<Stmt> body\",\n    ```\n\n    While we're at it, we can reuse that for function statements. A function\n    *statement* is now just a name and a function expression:\n\n    ```java\n    \"Function   : Token name, Expr.Function function\",\n    ```\n\n    Over in LoxFunction, it will store an Expr.Function instead of a statement\n    to handle both types. 
If the function does have a name, that is tracked\n    separately, since lambdas won't have one:\n\n    ```java\n    class LoxFunction implements LoxCallable {\n      private final String name;\n      private final Expr.Function declaration;\n      private final Environment closure;\n\n      LoxFunction(String name, Expr.Function declaration, Environment closure) {\n        this.name = name;\n        this.closure = closure;\n        this.declaration = declaration;\n      }\n\n      @Override\n      public String toString() {\n        if (name == null) return \"<fn>\";\n        return \"<fn \" + name + \">\";\n      }\n\n      // ...\n    }\n    ```\n\n    The parser changes are a little more complex. We move the logic to handle\n    anonymous functions into a new method. Then the method to handle named\n    functions becomes a wrapper around that one:\n\n    ```java\n    private Stmt.Function function(String kind) {\n      Token name = consume(IDENTIFIER, \"Expect \" + kind + \" name.\");\n      return new Stmt.Function(name, functionBody(kind));\n    }\n\n    private Expr.Function functionBody(String kind) {\n      consume(LEFT_PAREN, \"Expect '(' after \" + kind + \" name.\");\n      List<Token> parameters = new ArrayList<>();\n      if (!check(RIGHT_PAREN)) {\n        do {\n          if (parameters.size() >= 8) {\n            error(peek(), \"Can't have more than 8 parameters.\");\n          }\n\n          parameters.add(consume(IDENTIFIER, \"Expect parameter name.\"));\n        } while (match(COMMA));\n      }\n      consume(RIGHT_PAREN, \"Expect ')' after parameters.\");\n\n      consume(LEFT_BRACE, \"Expect '{' before \" + kind + \" body.\");\n      List<Stmt> body = block();\n      return new Expr.Function(parameters, body);\n    }\n    ```\n\n    Now we can use `functionBody()` to parse lambdas. In `primary()`, add\n    another clause:\n\n    ```java\n    if (match(FUN)) return functionBody(\"function\");\n    ```\n\n    We've got one nasty little problem. 
We want lambdas to be a valid primary\n    expression, and in theory any primary expression is allowed in an expression\n    statement. But if you try to do:\n\n    ```lox\n    fun () {};\n    ```\n\n    Then the `declaration()` parser will match that `fun` and try to parse it\n    as a named function declaration statement. It won't see a name and will\n    report a parse error. Even though the above code is pointless, we want it\n    to work to avoid a weird edge case in the grammar.\n\n    To handle that, we only want to parse a function declaration if the current\n    token is `fun` and the one past that is an identifier. That requires another\n    token of lookahead, so we add:\n\n    ```java\n    private boolean checkNext(TokenType tokenType) {\n      if (isAtEnd()) return false;\n      if (tokens.get(current + 1).type == EOF) return false;\n      return tokens.get(current + 1).type == tokenType;\n    }\n    ```\n\n    Then, in `declaration()`, change the `if (match(FUN)) ...` line to:\n\n    ```java\n    if (check(FUN) && checkNext(IDENTIFIER)) {\n      consume(FUN, null);\n      return function(\"function\");\n    }\n    ```\n\n    Now only a function with a name is parsed as such.\n\n    Then our interpreter needs to handle both cases:\n\n    ```java\n    @Override\n    public Void visitFunctionStmt(Stmt.Function stmt) {\n      String fnName = stmt.name.lexeme;\n      environment.define(fnName, new LoxFunction(fnName, stmt.function, environment));\n      return null;\n    }\n\n    @Override\n    public Object visitFunctionExpr(Expr.Function expr) {\n      return new LoxFunction(null, expr, environment);\n    }\n    ```\n\n    We could have reused `visitFunctionExpr()` for both cases, but that would\n    lose the function name if someone were to print it. Handling the statement\n    separately preserves it:\n\n    ```lox\n    fun whichFn(fn) {\n      print fn;\n    }\n\n    whichFn(fun (b) {\n      print b;\n    });\n\n    fun named(a) { print a; }\n    whichFn(named);\n    //\n    // <fn>\n    // 
<fn named>\n    ```\n\n3.  No, it isn't. Lox uses the same scope for the parameters and local variables\n    immediately inside the body. That's why Stmt.Function stores the body as a\n    list of statements, not a single Stmt.Block that would create its own\n    nested scope separate from the parameters.\n\n    In Java, it's an error because you aren't allowed to shadow local variables\n    inside a method or collide them.\n\n    It's an error in C because parameters and locals share the same scope.\n\n    It is allowed in Dart. There, parameters are in a separate scope surrounding\n    the function body.\n\n    I'm not a fan of Dart's choice. I think shadowing should be allowed in\n    general because it helps ensure changes to code are encapsulated and don't\n    affect parts of the program unrelated to the change. (See this design note\n    for more: http://craftinginterpreters.com/statements-and-state.html#design-note).\n\n    But shadowing still usually leads to more confusing code, so it should be\n    avoided when possible. The only thing putting parameters in an outer scope\n    allows is shadowing those parameters, but I think any code that did that\n    would be *very* hard to read. I would rather prohibit that outright.\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/AstPrinter.java",
    "content": "package com.craftinginterpreters.lox;\n\n// Creates an unambiguous, if ugly, string representation of AST nodes.\nclass AstPrinter implements Expr.Visitor<String>, Stmt.Visitor<String> {\n  String print(Expr expr) {\n    return expr.accept(this);\n  }\n\n  String print(Stmt stmt) {\n    return stmt.accept(this);\n  }\n  @Override\n  public String visitBlockStmt(Stmt.Block stmt) {\n    StringBuilder builder = new StringBuilder();\n    builder.append(\"(block \");\n\n    for (Stmt statement : stmt.statements) {\n      builder.append(statement.accept(this));\n    }\n\n    builder.append(\")\");\n    return builder.toString();\n  }\n\n  @Override\n  public String visitExpressionStmt(Stmt.Expression stmt) {\n    return parenthesize(\";\", stmt.expression);\n  }\n\n  @Override\n  public String visitFunctionStmt(Stmt.Function stmt) {\n    StringBuilder builder = new StringBuilder();\n    builder.append(\"(fun \" + stmt.name.lexeme + \"(\");\n\n    for (Token param : stmt.parameters) {\n      if (param != stmt.parameters.get(0)) builder.append(\" \");\n      builder.append(param.lexeme);\n    }\n\n    builder.append(\") \");\n\n    for (Stmt body : stmt.body) {\n      builder.append(body.accept(this));\n    }\n\n    builder.append(\")\");\n    return builder.toString();\n  }\n\n  @Override\n  public String visitIfStmt(Stmt.If stmt) {\n    if (stmt.elseBranch == null) {\n      return parenthesize2(\"if\", stmt.condition, stmt.thenBranch);\n    }\n\n    return parenthesize2(\"if-else\", stmt.condition, stmt.thenBranch,\n        stmt.elseBranch);\n  }\n\n  @Override\n  public String visitPrintStmt(Stmt.Print stmt) {\n    return parenthesize(\"print\", stmt.expression);\n  }\n\n  @Override\n  public String visitReturnStmt(Stmt.Return stmt) {\n    if (stmt.value == null) return \"(return)\";\n    return parenthesize(\"return\", stmt.value);\n  }\n\n  @Override\n  public String visitVarStmt(Stmt.Var stmt) {\n    if (stmt.initializer == null) {\n      return 
parenthesize2(\"var\", stmt.name);\n    }\n\n    return parenthesize2(\"var\", stmt.name, \"=\", stmt.initializer);\n  }\n\n  @Override\n  public String visitWhileStmt(Stmt.While stmt) {\n    return parenthesize2(\"while\", stmt.condition, stmt.body);\n  }\n\n  @Override\n  public String visitAssignExpr(Expr.Assign expr) {\n    return parenthesize2(\"=\", expr.name.lexeme, expr.value);\n  }\n\n  @Override\n  public String visitBinaryExpr(Expr.Binary expr) {\n    return parenthesize(expr.operator.lexeme, expr.left, expr.right);\n  }\n\n  @Override\n  public String visitCallExpr(Expr.Call expr) {\n    return parenthesize2(\"call\", expr.callee, expr.arguments);\n  }\n\n  @Override\n  public String visitGroupingExpr(Expr.Grouping expr) {\n    return parenthesize(\"group\", expr.expression);\n  }\n\n  @Override\n  public String visitLiteralExpr(Expr.Literal expr) {\n    if (expr.value == null) return \"nil\";\n    return expr.value.toString();\n  }\n\n  @Override\n  public String visitLogicalExpr(Expr.Logical expr) {\n    return parenthesize(expr.operator.lexeme, expr.left, expr.right);\n  }\n\n  @Override\n  public String visitUnaryExpr(Expr.Unary expr) {\n    return parenthesize(expr.operator.lexeme, expr.right);\n  }\n\n  @Override\n  public String visitVariableExpr(Expr.Variable expr) {\n    return expr.name.lexeme;\n  }\n  private String parenthesize(String name, Expr... exprs) {\n    StringBuilder builder = new StringBuilder();\n\n    builder.append(\"(\").append(name);\n    for (Expr expr : exprs) {\n      builder.append(\" \");\n      builder.append(expr.accept(this));\n    }\n    builder.append(\")\");\n\n    return builder.toString();\n  }\n  // Note: AstPrinting other types of syntax trees is not shown in the\n  // book, but this is provided here as a reference for those reading\n  // the full code.\n  private String parenthesize2(String name, Object... 
parts) {\n    StringBuilder builder = new StringBuilder();\n\n    builder.append(\"(\").append(name);\n\n    for (Object part : parts) {\n      builder.append(\" \");\n\n      if (part instanceof Expr) {\n        builder.append(((Expr)part).accept(this));\n      } else if (part instanceof Stmt) {\n        builder.append(((Stmt) part).accept(this));\n      } else if (part instanceof Token) {\n        builder.append(((Token) part).lexeme);\n      } else {\n        builder.append(part);\n      }\n    }\n    builder.append(\")\");\n\n    return builder.toString();\n  }\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/Environment.java",
    "content": "package com.craftinginterpreters.lox;\n\nimport java.util.ArrayList;\nimport java.util.List;\n\nclass Environment {\n  final Environment enclosing;\n  private final List<Object> values = new ArrayList<>();\n\n  Environment() {\n    enclosing = null;\n  }\n\n  Environment(Environment enclosing) {\n    this.enclosing = enclosing;\n  }\n\n  void define(Object value) {\n    values.add(value);\n  }\n\n  Object getAt(int distance, int slot) {\n    Environment environment = this;\n    for (int i = 0; i < distance; i++) {\n      environment = environment.enclosing;\n    }\n\n    return environment.values.get(slot);\n  }\n\n  void assignAt(int distance, int slot, Object value) {\n    Environment environment = this;\n    for (int i = 0; i < distance; i++) {\n      environment = environment.enclosing;\n    }\n\n    environment.values.set(slot, value);\n  }\n  @Override\n  public String toString() {\n    String result = values.toString();\n    if (enclosing != null) {\n      result += \" -> \" + enclosing.toString();\n    }\n\n    return result;\n  }\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/Expr.java",
    "content": "package com.craftinginterpreters.lox;\n\nimport java.util.List;\n\nabstract class Expr {\n  interface Visitor<R> {\n    R visitAssignExpr(Assign expr);\n    R visitBinaryExpr(Binary expr);\n    R visitCallExpr(Call expr);\n    R visitGroupingExpr(Grouping expr);\n    R visitLiteralExpr(Literal expr);\n    R visitLogicalExpr(Logical expr);\n    R visitUnaryExpr(Unary expr);\n    R visitVariableExpr(Variable expr);\n  }\n\n  static class Assign extends Expr {\n    Assign(Token name, Expr value) {\n      this.name = name;\n      this.value = value;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitAssignExpr(this);\n    }\n\n    final Token name;\n    final Expr value;\n  }\n\n  static class Binary extends Expr {\n    Binary(Expr left, Token operator, Expr right) {\n      this.left = left;\n      this.operator = operator;\n      this.right = right;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitBinaryExpr(this);\n    }\n\n    final Expr left;\n    final Token operator;\n    final Expr right;\n  }\n\n  static class Call extends Expr {\n    Call(Expr callee, Token paren, List<Expr> arguments) {\n      this.callee = callee;\n      this.paren = paren;\n      this.arguments = arguments;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitCallExpr(this);\n    }\n\n    final Expr callee;\n    final Token paren;\n    final List<Expr> arguments;\n  }\n\n  static class Grouping extends Expr {\n    Grouping(Expr expression) {\n      this.expression = expression;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitGroupingExpr(this);\n    }\n\n    final Expr expression;\n  }\n\n  static class Literal extends Expr {\n    Literal(Object value) {\n      this.value = value;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitLiteralExpr(this);\n    }\n\n    final Object value;\n  }\n\n  static class Logical extends Expr {\n    Logical(Expr 
left, Token operator, Expr right) {\n      this.left = left;\n      this.operator = operator;\n      this.right = right;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitLogicalExpr(this);\n    }\n\n    final Expr left;\n    final Token operator;\n    final Expr right;\n  }\n\n  static class Unary extends Expr {\n    Unary(Token operator, Expr right) {\n      this.operator = operator;\n      this.right = right;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitUnaryExpr(this);\n    }\n\n    final Token operator;\n    final Expr right;\n  }\n\n  static class Variable extends Expr {\n    Variable(Token name) {\n      this.name = name;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitVariableExpr(this);\n    }\n\n    final Token name;\n  }\n\n  abstract <R> R accept(Visitor<R> visitor);\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/Interpreter.java",
    "content": "package com.craftinginterpreters.lox;\n\nimport java.util.ArrayList;\nimport java.util.HashMap;\nimport java.util.List;\nimport java.util.Map;\n\nclass Interpreter implements Expr.Visitor<Object>, Stmt.Visitor<Void> {\n  final Map<String, Object> globals = new HashMap<>();\n  private Environment environment;\n  private final Map<Expr, Integer> locals = new HashMap<>();\n  private final Map<Expr, Integer> slots = new HashMap<>();\n\n  Interpreter() {\n    globals.put(\"clock\", new LoxCallable() {\n      @Override\n      public int arity() {\n        return 0;\n      }\n\n      @Override\n      public Object call(Interpreter interpreter,\n                         List<Object> arguments) {\n        return (double)System.currentTimeMillis() / 1000.0;\n      }\n    });\n  }\n  void interpret(List<Stmt> statements) {\n    try {\n      for (Stmt statement : statements) {\n        execute(statement);\n      }\n    } catch (RuntimeError error) {\n      Lox.runtimeError(error);\n    }\n  }\n  private Object evaluate(Expr expr) {\n    return expr.accept(this);\n  }\n  private void execute(Stmt stmt) {\n    stmt.accept(this);\n  }\n  void resolve(Expr expr, int depth, int slot) {\n    locals.put(expr, depth);\n    slots.put(expr, slot);\n  }\n  void executeBlock(List<Stmt> statements, Environment environment) {\n    Environment previous = this.environment;\n    try {\n      this.environment = environment;\n\n      for (Stmt statement : statements) {\n        execute(statement);\n      }\n    } finally {\n      this.environment = previous;\n    }\n  }\n  @Override\n  public Void visitBlockStmt(Stmt.Block stmt) {\n    executeBlock(stmt.statements, new Environment(environment));\n    return null;\n  }\n  @Override\n  public Void visitExpressionStmt(Stmt.Expression stmt) {\n    evaluate(stmt.expression);\n    return null; // [void]\n  }\n  @Override\n  public Void visitFunctionStmt(Stmt.Function stmt) {\n    LoxFunction function = new LoxFunction(stmt, 
environment);\n    define(stmt.name, function);\n    return null;\n  }\n  @Override\n  public Void visitIfStmt(Stmt.If stmt) {\n    if (isTruthy(evaluate(stmt.condition))) {\n      execute(stmt.thenBranch);\n    } else if (stmt.elseBranch != null) {\n      execute(stmt.elseBranch);\n    }\n    return null;\n  }\n  @Override\n  public Void visitPrintStmt(Stmt.Print stmt) {\n    Object value = evaluate(stmt.expression);\n    System.out.println(stringify(value));\n    return null;\n  }\n  @Override\n  public Void visitReturnStmt(Stmt.Return stmt) {\n    Object value = null;\n    if (stmt.value != null) value = evaluate(stmt.value);\n\n    throw new Return(value);\n  }\n  @Override\n  public Void visitVarStmt(Stmt.Var stmt) {\n    Object value = null;\n    if (stmt.initializer != null) {\n      value = evaluate(stmt.initializer);\n    }\n\n    define(stmt.name, value);\n    return null;\n  }\n  @Override\n  public Void visitWhileStmt(Stmt.While stmt) {\n    while (isTruthy(evaluate(stmt.condition))) {\n      execute(stmt.body);\n    }\n    return null;\n  }\n  @Override\n  public Object visitAssignExpr(Expr.Assign expr) {\n    Object value = evaluate(expr.value);\n\n    Integer distance = locals.get(expr);\n    if (distance != null) {\n      environment.assignAt(distance, slots.get(expr), value);\n    } else {\n      if (globals.containsKey(expr.name.lexeme)) {\n        globals.put(expr.name.lexeme, value);\n      } else {\n        throw new RuntimeError(expr.name,\n            \"Undefined variable '\" + expr.name.lexeme + \"'.\");\n      }\n    }\n\n    return value;\n  }\n  @Override\n  public Object visitBinaryExpr(Expr.Binary expr) {\n    Object left = evaluate(expr.left);\n    Object right = evaluate(expr.right); // [left]\n\n    switch (expr.operator.type) {\n      case BANG_EQUAL: return !isEqual(left, right);\n      case EQUAL_EQUAL: return isEqual(left, right);\n      case GREATER:\n        checkNumberOperands(expr.operator, left, right);\n        return 
(double)left > (double)right;\n      case GREATER_EQUAL:\n        checkNumberOperands(expr.operator, left, right);\n        return (double)left >= (double)right;\n      case LESS:\n        checkNumberOperands(expr.operator, left, right);\n        return (double)left < (double)right;\n      case LESS_EQUAL:\n        checkNumberOperands(expr.operator, left, right);\n        return (double)left <= (double)right;\n      case MINUS:\n        checkNumberOperands(expr.operator, left, right);\n        return (double)left - (double)right;\n      case PLUS:\n        if (left instanceof Double && right instanceof Double) {\n          return (double)left + (double)right;\n        } // [plus]\n\n        if (left instanceof String && right instanceof String) {\n          return (String)left + (String)right;\n        }\n\n        throw new RuntimeError(expr.operator,\n            \"Operands must be two numbers or two strings.\");\n      case SLASH:\n        checkNumberOperands(expr.operator, left, right);\n        return (double)left / (double)right;\n      case STAR:\n        checkNumberOperands(expr.operator, left, right);\n        return (double)left * (double)right;\n    }\n\n    // Unreachable.\n    return null;\n  }\n  @Override\n  public Object visitCallExpr(Expr.Call expr) {\n    Object callee = evaluate(expr.callee);\n\n    List<Object> arguments = new ArrayList<>();\n    for (Expr argument : expr.arguments) { // [in-order]\n      arguments.add(evaluate(argument));\n    }\n\n    if (!(callee instanceof LoxCallable)) {\n      throw new RuntimeError(expr.paren,\n          \"Can only call functions and classes.\");\n    }\n\n    LoxCallable function = (LoxCallable)callee;\n    if (arguments.size() != function.arity()) {\n      throw new RuntimeError(expr.paren, \"Expected \" +\n          function.arity() + \" arguments but got \" +\n          arguments.size() + \".\");\n    }\n\n    return function.call(this, arguments);\n  }\n  @Override\n  public Object 
visitGroupingExpr(Expr.Grouping expr) {\n    return evaluate(expr.expression);\n  }\n  @Override\n  public Object visitLiteralExpr(Expr.Literal expr) {\n    return expr.value;\n  }\n  @Override\n  public Object visitLogicalExpr(Expr.Logical expr) {\n    Object left = evaluate(expr.left);\n\n    if (expr.operator.type == TokenType.OR) {\n      if (isTruthy(left)) return left;\n    } else {\n      if (!isTruthy(left)) return left;\n    }\n\n    return evaluate(expr.right);\n  }\n  @Override\n  public Object visitUnaryExpr(Expr.Unary expr) {\n    Object right = evaluate(expr.right);\n\n    switch (expr.operator.type) {\n      case BANG:\n        return !isTruthy(right);\n      case MINUS:\n        checkNumberOperand(expr.operator, right);\n        return -(double)right;\n    }\n\n    // Unreachable.\n    return null;\n  }\n  @Override\n  public Object visitVariableExpr(Expr.Variable expr) {\n    return lookUpVariable(expr.name, expr);\n  }\n  private Object lookUpVariable(Token name, Expr expr) {\n    Integer distance = locals.get(expr);\n    if (distance != null) {\n      return environment.getAt(distance, slots.get(expr));\n    } else {\n      if (globals.containsKey(name.lexeme)) {\n        return globals.get(name.lexeme);\n      } else {\n        throw new RuntimeError(name,\n            \"Undefined variable '\" + name.lexeme + \"'.\");\n      }\n    }\n  }\n  private void checkNumberOperand(Token operator, Object operand) {\n    if (operand instanceof Double) return;\n    throw new RuntimeError(operator, \"Operand must be a number.\");\n  }\n  private void checkNumberOperands(Token operator,\n                                   Object left, Object right) {\n    if (left instanceof Double && right instanceof Double) return;\n    // [operand]\n    throw new RuntimeError(operator, \"Operands must be numbers.\");\n  }\n  private boolean isTruthy(Object object) {\n    if (object == null) return false;\n    if (object instanceof Boolean) return (boolean)object;\n    
return true;\n  }\n  private boolean isEqual(Object a, Object b) {\n    // nil is only equal to nil.\n    if (a == null && b == null) return true;\n    if (a == null) return false;\n\n    return a.equals(b);\n  }\n  private String stringify(Object object) {\n    if (object == null) return \"nil\";\n\n    // Hack. Work around Java adding \".0\" to integer-valued doubles.\n    if (object instanceof Double) {\n      String text = object.toString();\n      if (text.endsWith(\".0\")) {\n        text = text.substring(0, text.length() - 2);\n      }\n      return text;\n    }\n\n    return object.toString();\n  }\n  private void define(Token name, Object value) {\n    if (environment != null) {\n      environment.define(value);\n    } else {\n      globals.put(name.lexeme, value);\n    }\n  }\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/Lox.java",
    "content": "package com.craftinginterpreters.lox;\n\nimport java.io.BufferedReader;\nimport java.io.IOException;\nimport java.io.InputStreamReader;\nimport java.nio.charset.Charset;\nimport java.nio.file.Files;\nimport java.nio.file.Paths;\nimport java.util.List;\n\npublic class Lox {\n  private static final Interpreter interpreter = new Interpreter();\n  static boolean hadError = false;\n  static boolean hadRuntimeError = false;\n\n  public static void main(String[] args) throws IOException {\n    if (args.length > 1) {\n      System.out.println(\"Usage: jlox [script]\");\n    } else if (args.length == 1) {\n      runFile(args[0]);\n    } else {\n      runPrompt();\n    }\n  }\n  private static void runFile(String path) throws IOException {\n    byte[] bytes = Files.readAllBytes(Paths.get(path));\n    run(new String(bytes, Charset.defaultCharset()));\n\n    // Indicate an error in the exit code.\n    if (hadError) System.exit(65);\n    if (hadRuntimeError) System.exit(70);\n  }\n  private static void runPrompt() throws IOException {\n    InputStreamReader input = new InputStreamReader(System.in);\n    BufferedReader reader = new BufferedReader(input);\n\n    for (;;) { // [repl]\n      System.out.print(\"> \");\n      String line = reader.readLine();\n      // readLine() returns null at end of input; stop instead of scanning null.\n      if (line == null) break;\n      run(line);\n      hadError = false;\n    }\n  }\n  private static void run(String source) {\n    Scanner scanner = new Scanner(source);\n    List<Token> tokens = scanner.scanTokens();\n    Parser parser = new Parser(tokens);\n    List<Stmt> statements = parser.parse();\n\n    // Stop if there was a syntax error.\n    if (hadError) return;\n\n    Resolver resolver = new Resolver(interpreter);\n    resolver.resolve(statements);\n\n    // Stop if there was a resolution error.\n    if (hadError) return;\n\n    interpreter.interpret(statements);\n  }\n  static void error(int line, String message) {\n    report(line, \"\", message);\n  }\n\n  private static void report(int line, String where, String message) {\n    System.err.println(\n    
    \"[line \" + line + \"] Error\" + where + \": \" + message);\n    hadError = true;\n  }\n  static void error(Token token, String message) {\n    if (token.type == TokenType.EOF) {\n      report(token.line, \" at end\", message);\n    } else {\n      report(token.line, \" at '\" + token.lexeme + \"'\", message);\n    }\n  }\n  static void runtimeError(RuntimeError error) {\n    System.err.println(error.getMessage() +\n        \"\\n[line \" + error.token.line + \"]\");\n    hadRuntimeError = true;\n  }\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/LoxCallable.java",
    "content": "package com.craftinginterpreters.lox;\n\nimport java.util.List;\n\ninterface LoxCallable {\n  int arity();\n  Object call(Interpreter interpreter, List<Object> arguments);\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/LoxFunction.java",
    "content": "package com.craftinginterpreters.lox;\n\nimport java.util.List;\n\nclass LoxFunction implements LoxCallable {\n  private final Stmt.Function declaration;\n  private final Environment closure;\n\n  LoxFunction(Stmt.Function declaration, Environment closure) {\n    this.closure = closure;\n    this.declaration = declaration;\n  }\n  @Override\n  public String toString() {\n    return \"<fn \" + declaration.name.lexeme + \">\";\n  }\n  @Override\n  public int arity() {\n    return declaration.parameters.size();\n  }\n  @Override\n  public Object call(Interpreter interpreter, List<Object> arguments) {\n    Environment environment = new Environment(closure);\n    for (int i = 0; i < declaration.parameters.size(); i++) {\n      environment.define(arguments.get(i));\n    }\n\n    try {\n      interpreter.executeBlock(declaration.body, environment);\n    } catch (Return returnValue) {\n      return returnValue.value;\n    }\n    return null;\n  }\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/Parser.java",
    "content": "package com.craftinginterpreters.lox;\n\nimport java.util.ArrayList;\nimport java.util.Arrays;\nimport java.util.List;\n\nimport static com.craftinginterpreters.lox.TokenType.*;\n\nclass Parser {\n  private static class ParseError extends RuntimeException {}\n\n  private final List<Token> tokens;\n  private int current = 0;\n\n  Parser(List<Token> tokens) {\n    this.tokens = tokens;\n  }\n  List<Stmt> parse() {\n    List<Stmt> statements = new ArrayList<>();\n    while (!isAtEnd()) {\n      statements.add(declaration());\n    }\n\n    return statements;\n  }\n  private Expr expression() {\n    return assignment();\n  }\n  private Stmt declaration() {\n    try {\n      if (match(FUN)) return function(\"function\");\n      if (match(VAR)) return varDeclaration();\n\n      return statement();\n    } catch (ParseError error) {\n      synchronize();\n      return null;\n    }\n  }\n  private Stmt statement() {\n    if (match(FOR)) return forStatement();\n    if (match(IF)) return ifStatement();\n    if (match(PRINT)) return printStatement();\n    if (match(RETURN)) return returnStatement();\n    if (match(WHILE)) return whileStatement();\n    if (match(LEFT_BRACE)) return new Stmt.Block(block());\n\n    return expressionStatement();\n  }\n  private Stmt forStatement() {\n    consume(LEFT_PAREN, \"Expect '(' after 'for'.\");\n\n    Stmt initializer;\n    if (match(SEMICOLON)) {\n      initializer = null;\n    } else if (match(VAR)) {\n      initializer = varDeclaration();\n    } else {\n      initializer = expressionStatement();\n    }\n\n    Expr condition = null;\n    if (!check(SEMICOLON)) {\n      condition = expression();\n    }\n    consume(SEMICOLON, \"Expect ';' after loop condition.\");\n\n    Expr increment = null;\n    if (!check(RIGHT_PAREN)) {\n      increment = expression();\n    }\n    consume(RIGHT_PAREN, \"Expect ')' after for clauses.\");\n    Stmt body = statement();\n\n    if (increment != null) {\n      body = new 
Stmt.Block(Arrays.asList(\n          body,\n          new Stmt.Expression(increment)));\n    }\n\n    if (condition == null) condition = new Expr.Literal(true);\n    body = new Stmt.While(condition, body);\n\n    if (initializer != null) {\n      body = new Stmt.Block(Arrays.asList(initializer, body));\n    }\n\n    return body;\n  }\n  private Stmt ifStatement() {\n    consume(LEFT_PAREN, \"Expect '(' after 'if'.\");\n    Expr condition = expression();\n    consume(RIGHT_PAREN, \"Expect ')' after if condition.\"); // [parens]\n\n    Stmt thenBranch = statement();\n    Stmt elseBranch = null;\n    if (match(ELSE)) {\n      elseBranch = statement();\n    }\n\n    return new Stmt.If(condition, thenBranch, elseBranch);\n  }\n  private Stmt printStatement() {\n    Expr value = expression();\n    consume(SEMICOLON, \"Expect ';' after value.\");\n    return new Stmt.Print(value);\n  }\n  private Stmt returnStatement() {\n    Token keyword = previous();\n    Expr value = null;\n    if (!check(SEMICOLON)) {\n      value = expression();\n    }\n\n    consume(SEMICOLON, \"Expect ';' after return value.\");\n    return new Stmt.Return(keyword, value);\n  }\n  private Stmt varDeclaration() {\n    Token name = consume(IDENTIFIER, \"Expect variable name.\");\n\n    Expr initializer = null;\n    if (match(EQUAL)) {\n      initializer = expression();\n    }\n\n    consume(SEMICOLON, \"Expect ';' after variable declaration.\");\n    return new Stmt.Var(name, initializer);\n  }\n  private Stmt whileStatement() {\n    consume(LEFT_PAREN, \"Expect '(' after 'while'.\");\n    Expr condition = expression();\n    consume(RIGHT_PAREN, \"Expect ')' after condition.\");\n    Stmt body = statement();\n\n    return new Stmt.While(condition, body);\n  }\n  private Stmt expressionStatement() {\n    Expr expr = expression();\n    consume(SEMICOLON, \"Expect ';' after expression.\");\n    return new Stmt.Expression(expr);\n  }\n  private Stmt.Function function(String kind) {\n    Token name = 
consume(IDENTIFIER, \"Expect \" + kind + \" name.\");\n    consume(LEFT_PAREN, \"Expect '(' after \" + kind + \" name.\");\n    List<Token> parameters = new ArrayList<>();\n    if (!check(RIGHT_PAREN)) {\n      do {\n        if (parameters.size() >= 8) {\n          error(peek(), \"Can't have more than 8 parameters.\");\n        }\n\n        parameters.add(consume(IDENTIFIER, \"Expect parameter name.\"));\n      } while (match(COMMA));\n    }\n    consume(RIGHT_PAREN, \"Expect ')' after parameters.\");\n\n    consume(LEFT_BRACE, \"Expect '{' before \" + kind + \" body.\");\n    List<Stmt> body = block();\n    return new Stmt.Function(name, parameters, body);\n  }\n  private List<Stmt> block() {\n    List<Stmt> statements = new ArrayList<>();\n\n    while (!check(RIGHT_BRACE) && !isAtEnd()) {\n      statements.add(declaration());\n    }\n\n    consume(RIGHT_BRACE, \"Expect '}' after block.\");\n    return statements;\n  }\n  private Expr assignment() {\n    Expr expr = or();\n\n    if (match(EQUAL)) {\n      Token equals = previous();\n      Expr value = assignment();\n\n      if (expr instanceof Expr.Variable) {\n        Token name = ((Expr.Variable)expr).name;\n        return new Expr.Assign(name, value);\n      }\n\n      error(equals, \"Invalid assignment target.\");\n    }\n\n    return expr;\n  }\n  private Expr or() {\n    Expr expr = and();\n\n    while (match(OR)) {\n      Token operator = previous();\n      Expr right = and();\n      expr = new Expr.Logical(expr, operator, right);\n    }\n\n    return expr;\n  }\n  private Expr and() {\n    Expr expr = equality();\n\n    while (match(AND)) {\n      Token operator = previous();\n      Expr right = equality();\n      expr = new Expr.Logical(expr, operator, right);\n    }\n\n    return expr;\n  }\n  private Expr equality() {\n    Expr expr = comparison();\n\n    while (match(BANG_EQUAL, EQUAL_EQUAL)) {\n      Token operator = previous();\n      Expr right = comparison();\n      expr = new Expr.Binary(expr, 
operator, right);\n    }\n\n    return expr;\n  }\n  private Expr comparison() {\n    Expr expr = addition();\n\n    while (match(GREATER, GREATER_EQUAL, LESS, LESS_EQUAL)) {\n      Token operator = previous();\n      Expr right = addition();\n      expr = new Expr.Binary(expr, operator, right);\n    }\n\n    return expr;\n  }\n  private Expr addition() {\n    Expr expr = multiplication();\n\n    while (match(MINUS, PLUS)) {\n      Token operator = previous();\n      Expr right = multiplication();\n      expr = new Expr.Binary(expr, operator, right);\n    }\n\n    return expr;\n  }\n\n  private Expr multiplication() {\n    Expr expr = unary();\n\n    while (match(SLASH, STAR)) {\n      Token operator = previous();\n      Expr right = unary();\n      expr = new Expr.Binary(expr, operator, right);\n    }\n\n    return expr;\n  }\n  private Expr unary() {\n    if (match(BANG, MINUS)) {\n      Token operator = previous();\n      Expr right = unary();\n      return new Expr.Unary(operator, right);\n    }\n\n    return call();\n  }\n  private Expr finishCall(Expr callee) {\n    List<Expr> arguments = new ArrayList<>();\n    if (!check(RIGHT_PAREN)) {\n      do {\n        if (arguments.size() >= 8) {\n          error(peek(), \"Can't have more than 8 arguments.\");\n        }\n        arguments.add(expression());\n      } while (match(COMMA));\n    }\n\n    Token paren = consume(RIGHT_PAREN, \"Expect ')' after arguments.\");\n\n    return new Expr.Call(callee, paren, arguments);\n  }\n  private Expr call() {\n    Expr expr = primary();\n\n    while (true) {\n      if (match(LEFT_PAREN)) {\n        expr = finishCall(expr);\n      } else {\n        break;\n      }\n    }\n\n    return expr;\n  }\n\n  private Expr primary() {\n    if (match(FALSE)) return new Expr.Literal(false);\n    if (match(TRUE)) return new Expr.Literal(true);\n    if (match(NIL)) return new Expr.Literal(null);\n\n    if (match(NUMBER, STRING)) {\n      return new Expr.Literal(previous().literal);\n    
}\n\n    if (match(IDENTIFIER)) {\n      return new Expr.Variable(previous());\n    }\n\n    if (match(LEFT_PAREN)) {\n      Expr expr = expression();\n      consume(RIGHT_PAREN, \"Expect ')' after expression.\");\n      return new Expr.Grouping(expr);\n    }\n\n    throw error(peek(), \"Expect expression.\");\n  }\n  private boolean match(TokenType... types) {\n    for (TokenType type : types) {\n      if (check(type)) {\n        advance();\n        return true;\n      }\n    }\n\n    return false;\n  }\n  private Token consume(TokenType type, String message) {\n    if (check(type)) return advance();\n\n    throw error(peek(), message);\n  }\n  private boolean check(TokenType tokenType) {\n    if (isAtEnd()) return false;\n    return peek().type == tokenType;\n  }\n  private Token advance() {\n    if (!isAtEnd()) current++;\n    return previous();\n  }\n  private boolean isAtEnd() {\n    return peek().type == EOF;\n  }\n\n  private Token peek() {\n    return tokens.get(current);\n  }\n\n  private Token previous() {\n    return tokens.get(current - 1);\n  }\n  private ParseError error(Token token, String message) {\n    Lox.error(token, message);\n    return new ParseError();\n  }\n  private void synchronize() {\n    advance();\n\n    while (!isAtEnd()) {\n      if (previous().type == SEMICOLON) return;\n\n      switch (peek().type) {\n        case CLASS:\n        case FUN:\n        case VAR:\n        case FOR:\n        case IF:\n        case WHILE:\n        case PRINT:\n        case RETURN:\n          return;\n      }\n\n      advance();\n    }\n  }\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/Resolver.java",
    "content": "package com.craftinginterpreters.lox;\n\nimport java.util.HashMap;\nimport java.util.List;\nimport java.util.Map;\nimport java.util.Stack;\n\nclass Resolver implements Expr.Visitor<Void>, Stmt.Visitor<Void> {\n  private final Interpreter interpreter;\n  private final Stack<Map<String, Variable>> scopes = new Stack<>();\n  private FunctionType currentFunction = FunctionType.NONE;\n\n  Resolver(Interpreter interpreter) {\n    this.interpreter = interpreter;\n  }\n  private class Variable {\n    boolean isDefined = false;\n    final int slot;\n\n    private Variable(int slot) {\n      this.slot = slot;\n    }\n  }\n  private enum FunctionType {\n    NONE,\n    FUNCTION\n  }\n  void resolve(List<Stmt> statements) {\n    for (Stmt statement : statements) {\n      resolve(statement);\n    }\n  }\n  @Override\n  public Void visitBlockStmt(Stmt.Block stmt) {\n    beginScope();\n    resolve(stmt.statements);\n    endScope();\n    return null;\n  }\n  @Override\n  public Void visitExpressionStmt(Stmt.Expression stmt) {\n    resolve(stmt.expression);\n    return null;\n  }\n  @Override\n  public Void visitFunctionStmt(Stmt.Function stmt) {\n    declare(stmt.name);\n    define(stmt.name);\n\n    resolveFunction(stmt, FunctionType.FUNCTION);\n    return null;\n  }\n  @Override\n  public Void visitIfStmt(Stmt.If stmt) {\n    resolve(stmt.condition);\n    resolve(stmt.thenBranch);\n    if (stmt.elseBranch != null) resolve(stmt.elseBranch);\n    return null;\n  }\n  @Override\n  public Void visitPrintStmt(Stmt.Print stmt) {\n    resolve(stmt.expression);\n    return null;\n  }\n  @Override\n  public Void visitReturnStmt(Stmt.Return stmt) {\n    if (currentFunction == FunctionType.NONE) {\n      Lox.error(stmt.keyword, \"Can't return from top-level code.\");\n    }\n\n    if (stmt.value != null) {\n      resolve(stmt.value);\n    }\n\n    return null;\n  }\n  @Override\n  public Void visitVarStmt(Stmt.Var stmt) {\n    declare(stmt.name);\n    if (stmt.initializer != 
null) {\n      resolve(stmt.initializer);\n    }\n    define(stmt.name);\n    return null;\n  }\n  @Override\n  public Void visitWhileStmt(Stmt.While stmt) {\n    resolve(stmt.condition);\n    resolve(stmt.body);\n    return null;\n  }\n  @Override\n  public Void visitAssignExpr(Expr.Assign expr) {\n    resolve(expr.value);\n    resolveLocal(expr, expr.name);\n    return null;\n  }\n  @Override\n  public Void visitBinaryExpr(Expr.Binary expr) {\n    resolve(expr.left);\n    resolve(expr.right);\n    return null;\n  }\n  @Override\n  public Void visitCallExpr(Expr.Call expr) {\n    resolve(expr.callee);\n\n    for (Expr argument : expr.arguments) {\n      resolve(argument);\n    }\n\n    return null;\n  }\n  @Override\n  public Void visitGroupingExpr(Expr.Grouping expr) {\n    resolve(expr.expression);\n    return null;\n  }\n  @Override\n  public Void visitLiteralExpr(Expr.Literal expr) {\n    return null;\n  }\n  @Override\n  public Void visitLogicalExpr(Expr.Logical expr) {\n    resolve(expr.left);\n    resolve(expr.right);\n    return null;\n  }\n  @Override\n  public Void visitUnaryExpr(Expr.Unary expr) {\n    resolve(expr.right);\n    return null;\n  }\n  @Override\n  public Void visitVariableExpr(Expr.Variable expr) {\n    if (!scopes.isEmpty() &&\n        scopes.peek().containsKey(expr.name.lexeme) &&\n        !scopes.peek().get(expr.name.lexeme).isDefined) {\n      Lox.error(expr.name,\n          \"Can't read local variable in its own initializer.\");\n    }\n\n    resolveLocal(expr, expr.name);\n    return null;\n  }\n  private void resolve(Stmt stmt) {\n    stmt.accept(this);\n  }\n  private void resolve(Expr expr) {\n    expr.accept(this);\n  }\n  private void resolveFunction(Stmt.Function function, FunctionType type) {\n    FunctionType enclosingFunction = currentFunction;\n    currentFunction = type;\n\n    beginScope();\n    for (Token param : function.parameters) {\n      declare(param);\n      define(param);\n    }\n    resolve(function.body);\n    
endScope();\n    currentFunction = enclosingFunction;\n  }\n  private void beginScope() {\n    scopes.push(new HashMap<String, Variable>());\n  }\n  private void endScope() {\n    scopes.pop();\n  }\n  private void declare(Token name) {\n    if (scopes.isEmpty()) return;\n\n    Map<String, Variable> scope = scopes.peek();\n    if (scope.containsKey(name.lexeme)) {\n      Lox.error(name,\n          \"Already a variable with this name in this scope.\");\n    }\n\n    scope.put(name.lexeme, new Variable(scope.size()));\n  }\n  private void define(Token name) {\n    if (scopes.isEmpty()) return;\n    scopes.peek().get(name.lexeme).isDefined = true;\n  }\n  private void resolveLocal(Expr expr, Token name) {\n    for (int i = scopes.size() - 1; i >= 0; i--) {\n      Map<String, Variable> scope = scopes.get(i);\n      if (scope.containsKey(name.lexeme)) {\n        interpreter.resolve(expr, scopes.size() - 1 - i,\n            scope.get(name.lexeme).slot);\n        return;\n      }\n    }\n\n    // Not found. Assume it is global.\n  }\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/Return.java",
    "content": "package com.craftinginterpreters.lox;\n\nclass Return extends RuntimeException {\n  final Object value;\n\n  Return(Object value) {\n    super(null, null, false, false);\n    this.value = value;\n  }\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/RuntimeError.java",
    "content": "package com.craftinginterpreters.lox;\n\nclass RuntimeError extends RuntimeException {\n  final Token token;\n\n  RuntimeError(Token token, String message) {\n    super(message);\n    this.token = token;\n  }\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/Scanner.java",
    "content": "package com.craftinginterpreters.lox;\n\nimport java.util.ArrayList;\nimport java.util.HashMap;\nimport java.util.List;\nimport java.util.Map;\n\nimport static com.craftinginterpreters.lox.TokenType.*; // [static-import]\n\nclass Scanner {\n  private static final Map<String, TokenType> keywords;\n\n  static {\n    keywords = new HashMap<>();\n    keywords.put(\"and\",    AND);\n    keywords.put(\"class\",  CLASS);\n    keywords.put(\"else\",   ELSE);\n    keywords.put(\"false\",  FALSE);\n    keywords.put(\"for\",    FOR);\n    keywords.put(\"fun\",    FUN);\n    keywords.put(\"if\",     IF);\n    keywords.put(\"nil\",    NIL);\n    keywords.put(\"or\",     OR);\n    keywords.put(\"print\",  PRINT);\n    keywords.put(\"return\", RETURN);\n    keywords.put(\"super\",  SUPER);\n    keywords.put(\"this\",   THIS);\n    keywords.put(\"true\",   TRUE);\n    keywords.put(\"var\",    VAR);\n    keywords.put(\"while\",  WHILE);\n  }\n  private final String source;\n  private final List<Token> tokens = new ArrayList<>();\n  private int start = 0;\n  private int current = 0;\n  private int line = 1;\n\n  Scanner(String source) {\n    this.source = source;\n  }\n  List<Token> scanTokens() {\n    while (!isAtEnd()) {\n      // We are at the beginning of the next lexeme.\n      start = current;\n      scanToken();\n    }\n\n    tokens.add(new Token(EOF, \"\", null, line));\n    return tokens;\n  }\n  private void scanToken() {\n    char c = advance();\n    switch (c) {\n      case '(': addToken(LEFT_PAREN); break;\n      case ')': addToken(RIGHT_PAREN); break;\n      case '{': addToken(LEFT_BRACE); break;\n      case '}': addToken(RIGHT_BRACE); break;\n      case ',': addToken(COMMA); break;\n      case '.': addToken(DOT); break;\n      case '-': addToken(MINUS); break;\n      case '+': addToken(PLUS); break;\n      case ';': addToken(SEMICOLON); break;\n      case '*': addToken(STAR); break;\n      case '!': addToken(match('=') ? 
BANG_EQUAL : BANG); break;\n      case '=': addToken(match('=') ? EQUAL_EQUAL : EQUAL); break;\n      case '<': addToken(match('=') ? LESS_EQUAL : LESS); break;\n      case '>': addToken(match('=') ? GREATER_EQUAL : GREATER); break;\n      case '/':\n        if (match('/')) {\n          // A comment goes until the end of the line.\n          while (peek() != '\\n' && !isAtEnd()) advance();\n        } else {\n          addToken(SLASH);\n        }\n        break;\n\n      case ' ':\n      case '\\r':\n      case '\\t':\n        // Ignore whitespace.\n        break;\n\n      case '\\n':\n        line++;\n        break;\n\n      case '\"': string(); break;\n\n      default:\n        if (isDigit(c)) {\n          number();\n        } else if (isAlpha(c)) {\n          identifier();\n        } else {\n          Lox.error(line, \"Unexpected character.\");\n        }\n        break;\n    }\n  }\n  private void identifier() {\n    while (isAlphaNumeric(peek())) advance();\n\n    // See if the identifier is a reserved word.\n    String text = source.substring(start, current);\n\n    TokenType type = keywords.get(text);\n    if (type == null) type = IDENTIFIER;\n    addToken(type);\n  }\n  private void number() {\n    while (isDigit(peek())) advance();\n\n    // Look for a fractional part.\n    if (peek() == '.' 
&& isDigit(peekNext())) {\n      // Consume the \".\"\n      advance();\n\n      while (isDigit(peek())) advance();\n    }\n\n    addToken(NUMBER,\n        Double.parseDouble(source.substring(start, current)));\n  }\n  private void string() {\n    while (peek() != '\"' && !isAtEnd()) {\n      if (peek() == '\\n') line++;\n      advance();\n    }\n\n    // Unterminated string.\n    if (isAtEnd()) {\n      Lox.error(line, \"Unterminated string.\");\n      return;\n    }\n\n    // The closing \".\n    advance();\n\n    // Trim the surrounding quotes.\n    String value = source.substring(start + 1, current - 1);\n    addToken(STRING, value);\n  }\n  private boolean match(char expected) {\n    if (isAtEnd()) return false;\n    if (source.charAt(current) != expected) return false;\n\n    current++;\n    return true;\n  }\n  private char peek() {\n    if (isAtEnd()) return '\\0';\n    return source.charAt(current);\n  }\n  private char peekNext() {\n    if (current + 1 >= source.length()) return '\\0';\n    return source.charAt(current + 1);\n  } // [peek-next]\n  private boolean isAlpha(char c) {\n    return (c >= 'a' && c <= 'z') ||\n           (c >= 'A' && c <= 'Z') ||\n            c == '_';\n  }\n\n  private boolean isAlphaNumeric(char c) {\n    return isAlpha(c) || isDigit(c);\n  }\n  private boolean isDigit(char c) {\n    return c >= '0' && c <= '9';\n  } // [is-digit]\n  private boolean isAtEnd() {\n    return current >= source.length();\n  }\n  private char advance() {\n    current++;\n    return source.charAt(current - 1);\n  }\n\n  private void addToken(TokenType type) {\n    addToken(type, null);\n  }\n\n  private void addToken(TokenType type, Object literal) {\n    String text = source.substring(start, current);\n    tokens.add(new Token(type, text, literal, line));\n  }\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/Stmt.java",
    "content": "package com.craftinginterpreters.lox;\n\nimport java.util.List;\n\nabstract class Stmt {\n  interface Visitor<R> {\n    R visitBlockStmt(Block stmt);\n    R visitExpressionStmt(Expression stmt);\n    R visitFunctionStmt(Function stmt);\n    R visitIfStmt(If stmt);\n    R visitPrintStmt(Print stmt);\n    R visitReturnStmt(Return stmt);\n    R visitVarStmt(Var stmt);\n    R visitWhileStmt(While stmt);\n  }\n\n  static class Block extends Stmt {\n    Block(List<Stmt> statements) {\n      this.statements = statements;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitBlockStmt(this);\n    }\n\n    final List<Stmt> statements;\n  }\n\n  static class Expression extends Stmt {\n    Expression(Expr expression) {\n      this.expression = expression;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitExpressionStmt(this);\n    }\n\n    final Expr expression;\n  }\n\n  static class Function extends Stmt {\n    Function(Token name, List<Token> parameters, List<Stmt> body) {\n      this.name = name;\n      this.parameters = parameters;\n      this.body = body;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitFunctionStmt(this);\n    }\n\n    final Token name;\n    final List<Token> parameters;\n    final List<Stmt> body;\n  }\n\n  static class If extends Stmt {\n    If(Expr condition, Stmt thenBranch, Stmt elseBranch) {\n      this.condition = condition;\n      this.thenBranch = thenBranch;\n      this.elseBranch = elseBranch;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitIfStmt(this);\n    }\n\n    final Expr condition;\n    final Stmt thenBranch;\n    final Stmt elseBranch;\n  }\n\n  static class Print extends Stmt {\n    Print(Expr expression) {\n      this.expression = expression;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitPrintStmt(this);\n    }\n\n    final Expr expression;\n  }\n\n  static class Return extends Stmt 
{\n    Return(Token keyword, Expr value) {\n      this.keyword = keyword;\n      this.value = value;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitReturnStmt(this);\n    }\n\n    final Token keyword;\n    final Expr value;\n  }\n\n  static class Var extends Stmt {\n    Var(Token name, Expr initializer) {\n      this.name = name;\n      this.initializer = initializer;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitVarStmt(this);\n    }\n\n    final Token name;\n    final Expr initializer;\n  }\n\n  static class While extends Stmt {\n    While(Expr condition, Stmt body) {\n      this.condition = condition;\n      this.body = body;\n    }\n\n    <R> R accept(Visitor<R> visitor) {\n      return visitor.visitWhileStmt(this);\n    }\n\n    final Expr condition;\n    final Stmt body;\n  }\n\n  abstract <R> R accept(Visitor<R> visitor);\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/Token.java",
    "content": "package com.craftinginterpreters.lox;\n\nclass Token {\n  final TokenType type;\n  final String lexeme;\n  final Object literal;\n  final int line; // [location]\n\n  Token(TokenType type, String lexeme, Object literal, int line) {\n    this.type = type;\n    this.lexeme = lexeme;\n    this.literal = literal;\n    this.line = line;\n  }\n\n  public String toString() {\n    return type + \" \" + lexeme + \" \" + literal;\n  }\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/lox/TokenType.java",
    "content": "package com.craftinginterpreters.lox;\n\nenum TokenType {\n  // Single-character tokens.\n  LEFT_PAREN, RIGHT_PAREN, LEFT_BRACE, RIGHT_BRACE,\n  COMMA, DOT, MINUS, PLUS, SEMICOLON, SLASH, STAR,\n\n  // One or two character tokens.\n  BANG, BANG_EQUAL,\n  EQUAL, EQUAL_EQUAL,\n  GREATER, GREATER_EQUAL,\n  LESS, LESS_EQUAL,\n\n  // Literals.\n  IDENTIFIER, STRING, NUMBER,\n\n  // Keywords.\n  AND, CLASS, ELSE, FALSE, FUN, FOR, IF, NIL, OR,\n  PRINT, RETURN, SUPER, THIS, TRUE, VAR, WHILE,\n\n  EOF\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/4/com/craftinginterpreters/tool/GenerateAst.java",
    "content": "package com.craftinginterpreters.tool;\n\nimport java.io.IOException;\nimport java.io.PrintWriter;\nimport java.util.Arrays;\nimport java.util.List;\n\npublic class GenerateAst {\n  public static void main(String[] args) throws IOException {\n    if (args.length != 1) {\n      System.err.println(\"Usage: generate_ast <output directory>\");\n      System.exit(1);\n    }\n    String outputDir = args[0];\n    defineAst(outputDir, \"Expr\", Arrays.asList(\n      \"Assign   : Token name, Expr value\",\n      \"Binary   : Expr left, Token operator, Expr right\",\n      \"Call     : Expr callee, Token paren, List<Expr> arguments\",\n      \"Grouping : Expr expression\",\n      \"Literal  : Object value\",\n      \"Logical  : Expr left, Token operator, Expr right\",\n      \"Unary    : Token operator, Expr right\",\n      \"Variable : Token name\"\n    ));\n\n    defineAst(outputDir, \"Stmt\", Arrays.asList(\n      \"Block      : List<Stmt> statements\",\n      \"Expression : Expr expression\",\n      \"Function   : Token name, List<Token> parameters, List<Stmt> body\",\n      \"If         : Expr condition, Stmt thenBranch, Stmt elseBranch\",\n      \"Print      : Expr expression\",\n      \"Return     : Token keyword, Expr value\",\n      \"Var        : Token name, Expr initializer\",\n      \"While      : Expr condition, Stmt body\"\n    ));\n  }\n  private static void defineAst(\n      String outputDir, String baseName, List<String> types)\n      throws IOException {\n    String path = outputDir + \"/\" + baseName + \".java\";\n    PrintWriter writer = new PrintWriter(path, \"UTF-8\");\n\n    writer.println(\"package com.craftinginterpreters.lox;\");\n    writer.println(\"\");\n    writer.println(\"import java.util.List;\");\n    writer.println(\"\");\n    writer.println(\"abstract class \" + baseName + \" {\");\n\n    defineVisitor(writer, baseName, types);\n\n    // The AST classes.\n    for (String type : types) {\n      String className = 
type.split(\":\")[0].trim();\n      String fields = type.split(\":\")[1].trim(); // [robust]\n      defineType(writer, baseName, className, fields);\n    }\n\n    // The base accept() method.\n    writer.println(\"\");\n    writer.println(\"  abstract <R> R accept(Visitor<R> visitor);\");\n\n    writer.println(\"}\");\n    writer.close();\n  }\n  private static void defineVisitor(\n      PrintWriter writer, String baseName, List<String> types) {\n    writer.println(\"  interface Visitor<R> {\");\n\n    for (String type : types) {\n      String typeName = type.split(\":\")[0].trim();\n      writer.println(\"    R visit\" + typeName + baseName + \"(\" +\n          typeName + \" \" + baseName.toLowerCase() + \");\");\n    }\n\n    writer.println(\"  }\");\n  }\n  private static void defineType(\n      PrintWriter writer, String baseName,\n      String className, String fieldList) {\n    writer.println(\"\");\n    writer.println(\"  static class \" + className + \" extends \" +\n        baseName + \" {\");\n\n    // Constructor.\n    writer.println(\"    \" + className + \"(\" + fieldList + \") {\");\n\n    // Store parameters in fields.\n    String[] fields = fieldList.split(\", \");\n    for (String field : fields) {\n      String name = field.split(\" \")[1];\n      writer.println(\"      this.\" + name + \" = \" + name + \";\");\n    }\n\n    writer.println(\"    }\");\n\n    // Visitor pattern.\n    writer.println();\n    writer.println(\"    <R> R accept(Visitor<R> visitor) {\");\n    writer.println(\"      return visitor.visit\" +\n        className + baseName + \"(this);\");\n    writer.println(\"    }\");\n\n    // Fields.\n    writer.println();\n    for (String field : fields) {\n      writer.println(\"    final \" + field + \";\");\n    }\n\n    writer.println(\"  }\");\n  }\n  interface PastryVisitor {\n    void visitBeignet(Beignet beignet); // [overload]\n    void visitCruller(Cruller cruller);\n  }\n  abstract class Pastry {\n    abstract void 
accept(PastryVisitor visitor);\n  }\n\n  class Beignet extends Pastry {\n    @Override\n    void accept(PastryVisitor visitor) {\n      visitor.visitBeignet(this);\n    }\n  }\n\n  class Cruller extends Pastry {\n    @Override\n    void accept(PastryVisitor visitor) {\n      visitor.visitCruller(this);\n    }\n  }\n}\n"
  },
  {
    "path": "note/answers/chapter11_resolving/chapter11_resolving.md",
    "content": "1.  Consider:\n\n    ```lox\n    fun foo() {\n      if (itsTuesday) foo();\n    }\n    ```\n\n    The function does call itself inside it's definition. But it relies on some\n    initial outer call to kick off the recursion. Some outside code must refer\n    to \"foo\" by name first. That can't happen until the function declaration\n    statement itself has finished executing. By then, \"foo\" is fully defined\n    and is safe to use.\n\n2.  In C, the variable is put in scope before its initializer, which means that\n    the initializer refers to the variable being initialized. Since C does not\n    require any clearing of uninitialized memory, it means you could potentially\n    access garbage data.\n\n    Java does not allow one local variable to shadow another so it's an error\n    because of that if the outer variable is also local. The outer variable\n    could be a field on the surrounding class. In that case, like C, the local\n    variable is in scope in its own initializer. However, Java makes it an error\n    to refer to a variable that may not have been initialized, so this falls\n    under that case and is an error.\n\n    Obviously, C's approach is crazy talk. Java is fine and takes advantage of\n    definite assignment analysis, which is useful for other things (like\n    ensuring final fields are initialized before the constructor body\n    completes). I like when languages get a lot of mileage out of a single\n    concept.\n\n3.  The basic idea is that instead of storing just a boolean state for each\n    local variable as we resolve the code, we'll allow a variable to be in one\n    of three states:\n\n    1. It has been declared but not yet defined.\n    2. It has been defined but not yet read.\n    3. It has been read.\n\n    Any variable that goes out of scope when in the defined-but-not-yet-read\n    state is an error. 
The annoying part is that we can't detect the error until\n    the variable goes out of scope, but we want to report it on the line that\n    the variable was declared. So we also need to keep track of the token from\n    the variable declaration. We'll bundle that and the three-state enum into\n    a little class inside the Resolver class:\n\n    ```java\n      private static class Variable {\n        final Token name;\n        VariableState state;\n\n        private Variable(Token name, VariableState state) {\n          this.name = name;\n          this.state = state;\n        }\n      }\n\n      private enum VariableState {\n        DECLARED,\n        DEFINED,\n        READ\n      }\n    ```\n\n    Then we change the scope stack to use that instead of Boolean:\n\n    ```java\n      private final Stack<Map<String, Variable>> scopes = new Stack<>();\n    ```\n\n    When we resolve a local variable, we mark it used. However, we don't want\n    to consider assigning to a local variable to be a \"use\". Writing to a\n    variable that's never read is still pointless. So we change resolveLocal()\n    to:\n\n    ```java\n      private void resolveLocal(Expr expr, Token name, boolean isRead) {\n        for (int i = scopes.size() - 1; i >= 0; i--) {\n          if (scopes.get(i).containsKey(name.lexeme)) {\n            interpreter.resolve(expr, scopes.size() - 1 - i);\n\n            // Mark it used.\n            if (isRead) {\n              scopes.get(i).get(name.lexeme).state = VariableState.READ;\n            }\n            return;\n          }\n        }\n\n        // Not found. Assume it is global.\n      }\n    ```\n\n    Every call to resolveLocal() needs to pass in that flag. 
In\n    visitVariableExpr(), it's true:\n\n    ```java\n        resolveLocal(expr, expr.name, true);\n    ```\n\n    In visitAssignExpr(), it's false:\n\n    ```java\n        resolveLocal(expr, expr.name, false);\n    ```\n\n    Next, we update the existing code that touches scopes to use the new\n    Variable class:\n\n    ```java\n      public Void visitVariableExpr(Expr.Variable expr) {\n        if (!scopes.isEmpty() &&\n            scopes.peek().containsKey(expr.name.lexeme) &&\n            scopes.peek().get(expr.name.lexeme).state == VariableState.DECLARED) {\n          Lox.error(expr.name,\n              \"Can't read local variable in its own initializer.\");\n        }\n\n        resolveLocal(expr, expr.name, true);\n        return null;\n      }\n\n      private void beginScope() {\n        scopes.push(new HashMap<String, Variable>());\n      }\n\n      private void declare(Token name) {\n        if (scopes.isEmpty()) return;\n\n        Map<String, Variable> scope = scopes.peek();\n        if (scope.containsKey(name.lexeme)) {\n          Lox.error(name,\n              \"Already variable with this name in this scope.\");\n        }\n\n        scope.put(name.lexeme, new Variable(name, VariableState.DECLARED));\n      }\n\n      private void define(Token name) {\n        if (scopes.isEmpty()) return;\n        scopes.peek().get(name.lexeme).state = VariableState.DEFINED;\n      }\n    ```\n\n    Finally, when a scope is popped, we check its variables to see if any were\n    not read:\n\n    ```java\n      private void endScope() {\n        Map<String, Variable> scope = scopes.pop();\n\n        for (Map.Entry<String, Variable> entry : scope.entrySet()) {\n          if (entry.getValue().state == VariableState.DEFINED) {\n            Lox.error(entry.getValue().name, \"Local variable is not used.\");\n          }\n        }\n      }\n    ```\n\n4. 
This challenge is a real challenge and involves even more code changes.\n   I went ahead and made a copy of the interpreter with the relevant changes\n   in the \"4\" directory here.\n"
  },
  {
    "path": "note/answers/chapter12_classes.md",
    "content": "1.  Metaclasses are so cool, I almost wish the book itself discussed them\n    properly, but there are only so many pages. The idea is that a class object\n    is itself an instance, which means it must have its own class -- a\n    metaclass. That metaclass defines the methods that are available on the\n    class object -- what you'd think of as the \"static\" methods in a language\n    like Java.\n\n    Before we get to metaclasses, we need to push the new syntax through. In\n    AstGenerator, add a new field to Stmt.Class:\n\n    ```java\n    \"Class      : Token name, List<Stmt.Function> methods, List<Stmt.Function> classMethods\",\n    ```\n\n    When parsing a class, we separate out the class methods (prefixed with\n    \"class\") into a separate list:\n\n    ```java\n    private Stmt classDeclaration() {\n      Token name = consume(IDENTIFIER, \"Expect class name.\");\n\n      List<Stmt.Function> methods = new ArrayList<>();\n      List<Stmt.Function> classMethods = new ArrayList<>();\n      consume(LEFT_BRACE, \"Expect '{' before class body.\");\n\n      while (!check(RIGHT_BRACE) && !isAtEnd()) {\n        boolean isClassMethod = match(CLASS);\n        (isClassMethod ? classMethods : methods).add(function(\"method\"));\n      }\n\n      consume(RIGHT_BRACE, \"Expect '}' after class body.\");\n\n      return new Stmt.Class(name, methods, classMethods);\n    }\n    ```\n\n    In the resolver, we need to make sure to resolve the class methods too:\n\n    ```java\n    for (Stmt.Function method : stmt.classMethods) {\n      beginScope();\n      scopes.peek().put(\"this\", true);\n      resolveFunction(method, FunctionType.METHOD);\n      endScope();\n    }\n    ```\n\n    They are resolved mostly like methods. They even have a \"this\" variable,\n    which will be the class itself.\n\n    Now we're ready for metaclasses. 
Change the declaration of LoxClass to:\n\n    ```java\n    class LoxClass extends LoxInstance implements LoxCallable {\n      final String name;\n      private final Map<String, LoxFunction> methods;\n\n      LoxClass(LoxClass metaclass, String name,\n            Map<String, LoxFunction> methods) {\n        super(metaclass);\n        this.name = name;\n        this.methods = methods;\n      }\n\n      // ...\n    }\n    ```\n\n    LoxClass now extends LoxInstance. Every class object is also itself an\n    instance of a class, its metaclass. When we interpret a class declaration,\n    we create two LoxClasses:\n\n    ```java\n    @Override\n    public Void visitClassStmt(Stmt.Class stmt) {\n      environment.define(stmt.name.lexeme, null);\n      Map<String, LoxFunction> classMethods = new HashMap<>();\n      for (Stmt.Function method : stmt.classMethods) {\n        LoxFunction function = new LoxFunction(method, environment, false);\n        classMethods.put(method.name.lexeme, function);\n      }\n\n      LoxClass metaclass = new LoxClass(null,\n          stmt.name.lexeme + \" metaclass\", classMethods);\n\n      Map<String, LoxFunction> methods = new HashMap<>();\n      for (Stmt.Function method : stmt.methods) {\n        LoxFunction function = new LoxFunction(method, environment,\n            method.name.lexeme.equals(\"init\"));\n        methods.put(method.name.lexeme, function);\n      }\n\n      LoxClass klass = new LoxClass(metaclass, stmt.name.lexeme, methods);\n      environment.assign(stmt.name, klass);\n      return null;\n    }\n    ```\n\n    First, we create a metaclass containing all of the class methods. It has\n    null for its metametaclass to stop the infinite regress. Then we create the\n    main class like we did previously. The only difference is that we pass in\n    the metaclass as its class.\n\n    That's it. There are no other interpreter changes. 
Now that LoxClass is an\n    instance of LoxInstance, the existing code for property gets now applies to\n    class objects. On the last line of:\n\n    ```lox\n    class Math {\n      class square(n) {\n        return n * n;\n      }\n    }\n\n    print Math.square(3); // Prints \"9\".\n    ```\n\n    The `.square` expression looks at the object on the left. It's a\n    LoxInstance. We call `.get()` on that. That fails to find a field named\n    \"square\" so it looks for a method on the object's class with that name. The\n    object's class is the metaclass, and the method is found there. You can\n    even put fields on classes now:\n\n    ```lox\n    Math.pi = 3.141592653;\n    print Math.pi;\n    ```\n\n2.  The first implementation detail we have to figure out is how our AST\n    distinguishes a getter declaration from the declaration of a method that\n    takes no parameters. This is kind of cute, but we'll use a *null*\n    parameter list to indicate the former and an *empty* one for the latter. 
So,\n    when parsing a method (and only a method, there are no getter *functions*),\n    we allow the parameter list to be omitted:\n\n    ```java\n    private Stmt.Function function(String kind) {\n      Token name = consume(IDENTIFIER, \"Expect \" + kind + \" name.\");\n\n      List<Token> parameters = null;\n\n      // Allow omitting the parameter list entirely in method getters.\n      if (!kind.equals(\"method\") || check(LEFT_PAREN)) {\n        consume(LEFT_PAREN, \"Expect '(' after \" + kind + \" name.\");\n        parameters = new ArrayList<>();\n        if (!check(RIGHT_PAREN)) {\n          do {\n            if (parameters.size() >= 255) {\n              error(peek(), \"Can't have more than 255 parameters.\");\n            }\n\n            parameters.add(consume(IDENTIFIER, \"Expect parameter name.\"));\n          } while (match(COMMA));\n        }\n        consume(RIGHT_PAREN, \"Expect ')' after parameters.\");\n      }\n\n      consume(LEFT_BRACE, \"Expect '{' before \" + kind + \" body.\");\n      List<Stmt> body = block();\n      return new Stmt.Function(name, parameters, body);\n    }\n    ```\n\n    Now we need to make sure the rest of the interpreter doesn't choke on a\n    null parameter list. 
We check for it when resolving:\n\n    ```java\n    private void resolveFunction(Stmt.Function function, FunctionType type) {\n      FunctionType enclosingFunction = currentFunction;\n      currentFunction = type;\n\n      beginScope();\n      if (function.params != null) {\n        for (Token param : function.params) {\n          declare(param);\n          define(param);\n        }\n      }\n      resolve(function.body);\n      endScope();\n      currentFunction = enclosingFunction;\n    }\n    ```\n\n    And when calling a LoxFunction:\n\n    ```java\n    @Override\n    public Object call(Interpreter interpreter, List<Object> arguments) {\n      Environment environment = new Environment(closure);\n      if (declaration.params != null) {\n        for (int i = 0; i < declaration.params.size(); i++) {\n          environment.define(declaration.params.get(i).lexeme,\n              arguments.get(i));\n        }\n      }\n\n      // ...\n    }\n    ```\n\n    Now all that's left is to interpret getters. The only difference compared to\n    methods is that the getter body is executed eagerly as soon as the property\n    is accessed instead of waiting for a later call expression to invoke it.\n\n    This isn't maybe the most elegant implementation, but it gets it done:\n\n    ```java\n    @Override\n    public Object visitGetExpr(Expr.Get expr) {\n      Object object = evaluate(expr.object);\n      if (object instanceof LoxInstance) {\n        Object result = ((LoxInstance) object).get(expr.name);\n        if (result instanceof LoxFunction &&\n            ((LoxFunction) result).isGetter()) {\n          result = ((LoxFunction) result).call(this, null);\n        }\n\n        return result;\n      }\n\n      throw new RuntimeError(expr.name,\n          \"Only instances have properties.\");\n    }\n    ```\n\n    After looking up the property, we see if the resulting object is a getter.\n    If so, we invoke it right now and use the result of that. 
This relies on\n    one little helper in LoxFunction:\n\n    ```java\n    public boolean isGetter() {\n      return declaration.params == null;\n    }\n    ```\n\n    And that's it.\n\n3.  Python and JavaScript allow you to freely access the fields on an object\n    from outside of the methods on that object. Ruby and Smalltalk encapsulate\n    instance state. Only methods on the class can access the raw fields, and it\n    is up to the class to decide which state is exposed using getters and\n    setters. Most statically typed languages offer access control modifiers\n    like `private` and `public` to explicitly control on a per-member basis\n    which parts of a class are externally accessible.\n\n    What are the trade-offs between these approaches and why might a language\n    prefer one or the other?\n\n    The decision to encapsulate at all or not is the classic\n    trade-off between whether you want to make things easier for the class\n    *consumer* or the class *maintainer*. By making everything public and\n    freely externally visible and modifiable, a downstream user of a class has\n    more freedom to pop the hood open and muck around in the class's internals.\n\n    However, that access tends to increase coupling between the class and its\n    users. That increased coupling makes the class itself more brittle, similar\n    to the \"fragile base class problem\". If users are directly accessing\n    properties that the class author considered implementation details, the\n    author loses the freedom to tweak that implementation without breaking those\n    users. The class can end up harder to change. 
That's more painful for the\n    maintainer, but also has a knock-on effect on the consumer -- if the class\n    evolves more slowly, they get fewer new features for free from the\n    upstream maintainer.\n\n    On the other hand, free external access to class state is a simpler, easier\n    user experience when the class maintainer and consumer are the same person.\n    If you're banging out a small script, it's handy to be able to just push\n    stuff around without having to go through a lot of ceremony and boilerplate.\n    At small scales, most language features that build fences in the program are\n    more annoying than they are useful.\n\n    As the program scales up, though, those fences become increasingly important\n    since no one person is able to hold the entire program in their head.\n    Boundaries in the code let you make productive changes while only knowing a\n    single region of the program.\n\n    Assuming you do want some sort of access control over properties, the next\n    question is how fine-grained. Java has four different access control levels.\n    That's four concepts the user needs to understand. Every time you add a\n    member to a class, you need to pick one of the four, and need to have the\n    expertise and forethought to choose wisely. This adds to the cognitive load\n    of the language and adds some mental friction when programming.\n\n    However, at large scales, each of those access control levels (except maybe\n    package private) has proven to be useful. Having a few options gives class\n    maintainers precise control over what extension points the class user has\n    access to. While the class author has to do the mental work to pick a\n    modifier, the class *consumer* gets to benefit from that. The modifier\n    chosen for each member clearly communicates to the class user how the class\n    is intended to be used. 
If you're subclassing a class and looking at a sea\n    of methods, trying to figure out which one to override, the fact that one\n    is protected while the others are all private or public makes your choice\n    much easier -- it's a clear sign that that method is for the subclass's\n    use.\n"
  },
  {
    "path": "note/answers/chapter13_inheritance/1.md",
    "content": "I'm gonna pick traits, for no particular reason. \"Traits\" means slightly\ndifferent things in the various languages that implement them. For my purposes,\nI'll say:\n\n*   A trait is a set of reusable methods.\n\n*   A class can include as many traits as it wants. When it does, all of the\n    methods from the traits are copied into the class.\n\n*   A trait is *not* a class. That means you can't construct one. A trait\n    doesn't define a kind of object or any sort of identity.\n\n*   Traits can be composed. A trait can include the methods of other traits.\n\n*   Any method collision is an error. The fact that collisions are not silently\n    treated like overrides or shadows is one of the defining characteristics of\n    traits, compared to mixins or multiple inheritance. Sophisticated languages\n    give you ways of renaming or hiding in order to fix method collisions. I'll\n    just make it an error.\n\nThe syntax for defining a trait looks like a class but with a different keyword:\n\n```lox\ntrait SomeStuff {\n  method() {\n    print \"method\";\n  }\n\n  another() {\n    print \"another\";\n  }\n}\n```\n\nTo include the methods from one trait into a class or another trait, add a\n\"with\" clause followed by the list of traits after the declaration:\n\n```lox\nclass UsesTrait < Superclass with ATrait, AnotherTrait { ... }\n\ntrait ComposesTraits with SomeTrait, AnotherTrait { ... }\n```\n\nWe'll do the implementation front to back. 
First, a couple of new reserved words\nin TokenType:\n\n```java\n  TRAIT, WITH,\n```\n\nIn the scanner, we add the keywords for them:\n\n```java\nkeywords.put(\"trait\",  TRAIT);\nkeywords.put(\"with\",   WITH);\n```\n\nIn the AST generator, we add a new statement node for a trait declaration:\n\n```java\n\"Trait      : Token name, List<Expr> traits,\" +\n            \" List<Stmt.Function> methods\",\n```\n\nAnd we also need to extend the class declaration AST to store the list of traits\nit applies:\n\n```java\n\"Class      : Token name, Expr superclass,\" +\n            \" List<Expr> traits,\" +\n            \" List<Stmt.Function> methods\",\n```\n\nNow to parse. A trait declaration looks much like a class declaration. We start\nby recognizing its leading keyword in `declaration()`:\n\n```java\nif (match(TRAIT)) return traitDeclaration();\n```\n\nThat calls:\n\n```java\nprivate Stmt traitDeclaration() {\n  Token name = consume(IDENTIFIER, \"Expect trait name.\");\n\n  List<Expr> traits = withClause();\n\n  consume(LEFT_BRACE, \"Expect '{' before trait body.\");\n\n  List<Stmt.Function> methods = new ArrayList<>();\n  while (!check(RIGHT_BRACE) && !isAtEnd()) {\n    methods.add(function(\"method\"));\n  }\n\n  consume(RIGHT_BRACE, \"Expect '}' after trait body.\");\n\n  return new Stmt.Trait(name, traits, methods);\n}\n```\n\nI could probably refactor and reuse some code from `classDeclaration()`, but I'm\nnot gonna worry about that. 
We also need this helper for parsing the \"with\"\nclause:\n\n```java\nprivate List<Expr> withClause() {\n  List<Expr> traits = new ArrayList<>();\n  if (match(WITH)) {\n    do {\n      consume(IDENTIFIER, \"Expect trait name.\");\n      traits.add(new Expr.Variable(previous()));\n    } while (match(COMMA));\n  }\n\n  return traits;\n}\n```\n\nA class declaration can also apply traits, so we extend `classDeclaration()` by\nparsing a with clause before the class body and then passing that to the AST\nconstructor:\n\n```java\nprivate Stmt classDeclaration() {\n  Token name = consume(IDENTIFIER, \"Expect class name.\");\n\n  Expr superclass = null;\n  if (match(LESS)) {\n    consume(IDENTIFIER, \"Expect superclass name.\");\n    superclass = new Expr.Variable(previous());\n  }\n\n  List<Expr> traits = withClause(); // <-- Add this.\n\n  consume(LEFT_BRACE, \"Expect '{' before class body.\");\n\n  List<Stmt.Function> methods = new ArrayList<>();\n  while (!check(RIGHT_BRACE) && !isAtEnd()) {\n    methods.add(function(\"method\"));\n  }\n\n  consume(RIGHT_BRACE, \"Expect '}' after class body.\");\n\n  // Add this.                          --v\n  return new Stmt.Class(name, superclass, traits, methods);\n}\n```\n\nNext is the resolver. 
Traits are not like other classes (they can't contain\n`super` calls, in particular), so we add another ClassType case for them:\n\n```java\nprivate enum ClassType {\n  NONE,\n  CLASS,\n  SUBCLASS,\n  TRAIT // <-- Add this.\n}\n```\n\nAnd we need a visit method for trait declarations:\n\n```java\n@Override\npublic Void visitTraitStmt(Stmt.Trait stmt) {\n  declare(stmt.name);\n  define(stmt.name);\n  ClassType enclosingClass = currentClass;\n  currentClass = ClassType.TRAIT;\n\n  for (Expr trait : stmt.traits) {\n    resolve(trait);\n  }\n\n  beginScope();\n  scopes.peek().put(\"this\", true);\n\n  for (Stmt.Function method : stmt.methods) {\n    FunctionType declaration = FunctionType.METHOD;\n    resolveFunction(method, declaration);\n  }\n\n  endScope();\n\n  currentClass = enclosingClass;\n  return null;\n}\n```\n\nIt's pretty similar to resolving a class. The main difference is we don't treat\ninitializers specially. (We probably should. This means if you apply a trait\nthat defines a method named `init()`, it will act like an initializer but won't\nhave been resolved as one. Forgive me.)\n\nAlso, when resolving a class declaration, we resolve its with clause:\n\n```java\n// Add right before beginScope() call.\nfor (Expr trait : stmt.traits) {\n  resolve(trait);\n}\n```\n\nOne last resolution bit. 
We'll disallow super calls in trait methods since we\ndon't know if there will be a superclass when the trait is applied:\n\n```java\n@Override\npublic Void visitSuperExpr(Expr.Super expr) {\n  if (currentClass == ClassType.NONE) {\n    Lox.error(expr.keyword,\n        \"Can't use 'super' outside of a class.\");\n  } else if (currentClass == ClassType.TRAIT) { // <-- Add this.\n    Lox.error(expr.keyword,                     // <-- Add this.\n        \"Can't use 'super' in a trait.\");      // <-- Add this.\n  } else if (currentClass != ClassType.SUBCLASS) {\n    Lox.error(expr.keyword,\n        \"Can't use 'super' in a class with no superclass.\");\n  }\n\n  resolveLocal(expr, expr.keyword);\n  return null;\n}\n```\n\nWe're almost ready to interpret. First, we need a runtime representation for a\ntrait. I thought about reusing LoxClass, but that would let you construct\ntraits, which we don't want. Instead, let's define a new class:\n\n```java\npackage com.craftinginterpreters.lox;\n\nimport java.util.Map;\n\nclass LoxTrait {\n  final Token name;\n  final Map<String, LoxFunction> methods;\n\n  LoxTrait(Token name, Map<String, LoxFunction> methods) {\n    this.name = name;\n    this.methods = methods;\n  }\n\n  @Override\n  public String toString() {\n    return name.lexeme;\n  }\n}\n```\n\nSort of like a stripped down class. 
To interpret a trait declaration:\n\n```java\n@Override\npublic Void visitTraitStmt(Stmt.Trait stmt) {\n  environment.define(stmt.name.lexeme, null);\n\n  Map<String, LoxFunction> methods = applyTraits(stmt.traits);\n\n  for (Stmt.Function method : stmt.methods) {\n    if (methods.containsKey(method.name.lexeme)) {\n      throw new RuntimeError(method.name,\n          \"A previous trait declares a method named '\" +\n              method.name.lexeme + \"'.\");\n    }\n\n    LoxFunction function = new LoxFunction(\n        method, environment, false);\n    methods.put(method.name.lexeme, function);\n  }\n\n  LoxTrait trait = new LoxTrait(stmt.name, methods);\n\n  environment.assign(stmt.name, trait);\n  return null;\n}\n```\n\nPretty similar to a class. A cleaner implementation would refactor and reuse\nsome code. Since a trait can apply other traits, first we compose all of the\ntraits in its with clause together into a single method map. That's done by:\n\n```java\nprivate Map<String, LoxFunction> applyTraits(List<Expr> traits) {\n  Map<String, LoxFunction> methods = new HashMap<>();\n\n  for (Expr traitExpr : traits) {\n    Object traitObject = evaluate(traitExpr);\n    if (!(traitObject instanceof LoxTrait)) {\n      Token name = ((Expr.Variable)traitExpr).name;\n      throw new RuntimeError(name,\n          \"'\" + name.lexeme + \"' is not a trait.\");\n    }\n\n    LoxTrait trait = (LoxTrait) traitObject;\n    for (String name : trait.methods.keySet()) {\n      if (methods.containsKey(name)) {\n        throw new RuntimeError(trait.name,\n            \"A previous trait declares a method named '\" +\n                name + \"'.\");\n      }\n\n      methods.put(name, trait.methods.get(name));\n    }\n  }\n\n  return methods;\n}\n```\n\nIt walks the list of traits, adding the methods for each one into a big map.\nNote that unlike with subclassing and overriding, this explicitly checks for a\ncollision and makes it a runtime error. 
Assuming nothing collided, it returns\nthe new map. The trait declaration then adds its own methods into that, again\nchecking for collisions.\n\nThe end result is a single flattened set of methods, not a *chain* of inherited\nones. This is one of the key differences between traits and other forms of\nreuse.\n\nA class declaration can also apply traits, so we replace this line in\n`visitClassStmt()`:\n\n```java\nMap<String, LoxFunction> methods = new HashMap<>();\n```\n\nwith:\n\n```java\nMap<String, LoxFunction> methods = applyTraits(stmt.traits);\n```\n\nThis implementation is a little rough, especially around things like super, but\nit has the main features we want. Give it a try:\n\n```lox\ntrait A {\n  a() {\n    print \"a\";\n  }\n}\n\ntrait B1 {\n  b1() {\n    print \"b1\";\n  }\n}\n\ntrait B2 {\n  b2() {\n    print \"b2\";\n  }\n}\n\ntrait B with B1, B2 {\n  b() {\n    this.b1();\n    this.b2();\n  }\n}\n\nclass C with A, B {}\n\nvar c = C();\nc.a();\nc.b();\n```\n"
  },
  {
    "path": "note/answers/chapter13_inheritance/2.md",
    "content": "Ideally, we'd make \"inner\" a reserved word, but that means changing the scanner\nand adding a new AST node for it and stuff. Since this is just a challenge\nanswer, I'll skip that. That means users could technically shadow \"inner\", but\nthat's OK.\n\nThe implementation I have here is correct (I think) but not very fast. There are\nonly a couple of pieces. The most interesting one is the change to\nLoxClass.findMethod(). It now looks like:\n\n```java\nLoxFunction findMethod(LoxInstance instance, String name) {\n  LoxFunction method = null;\n  LoxFunction inner = null;\n  LoxClass klass = this;\n  while (klass != null) {\n    if (klass.methods.containsKey(name)) {\n      inner = method;\n      method = klass.methods.get(name);\n    }\n\n    klass = klass.superclass;\n  }\n\n  if (method != null) {\n    return method.bind(instance, inner);\n  }\n\n  return null;\n}\n```\n\nUnlike before, this does not shortcut walking the superclass chain when it finds\nthe method. Instead, it keeps going so that it can find the *first* (i.e.\nsuper-most) implementation of the method. As it does, it also keeps track of the\npreviously found method. That is the next one down the inheritance chain, and is\nthe one \"inner\" will invoke.\n\nOnce that loop is done, it now knows the top method to return, as well as the\nmethod that \"inner\" should call. (If there is no matching method in the\nsubclass, \"inner\" will be null.) It then passes the inner method into bind:\n\n```java\nLoxFunction bind(LoxInstance instance, LoxFunction inner) {\n  Environment environment = new Environment(closure);\n  environment.define(\"this\", instance);\n  environment.define(\"inner\", inner);\n  return new LoxFunction(declaration, environment, isInitializer);\n}\n```\n\nJust like \"this\", we store the function that should be called in the method's\nclosure environment, bound to \"inner\". 
Now a call to \"inner\" will call the next\nmethod down in the inheritance chain.\n\nIn order for uses of \"inner\" to work, it also needs to be in the resolver's\nstatic scope chains, so we add that there too. In visitClassStmt(), we define\n\"inner\" right after \"this\":\n\n```java\nbeginScope();\nscopes.peek().put(\"this\", true);\nscopes.peek().put(\"inner\", true); // <-- Add.\n```\n\nThe last piece of bookkeeping is in LoxClass's call() method:\n\n```java\npublic Object call(Interpreter interpreter, List<Object> arguments) {\n  LoxInstance instance = new LoxInstance(this);\n  LoxFunction initializer = findMethod(instance, \"init\");\n  if (initializer != null) {\n    initializer.call(interpreter, arguments);\n  }\n\n  return instance;\n}\n```\n\nNow that bind() takes two arguments, we also need to fix how initializers are\nlooked up. (This is also good because users may use \"inner\" in an initializer\ntoo.) So we change the body of call() to use the above findMethod() method to\ncorrectly find the initializer and bind it.\n\nThat's it!\n"
  },
  {
    "path": "note/answers/chapter13_inheritance/3.md",
    "content": "There's a bunch of small features I'd add to Lox to make it feel a little more\nuser-friendly. Things like getters, setters, and operator overloading would be\nnice. Perhaps a better syntax than having to do \"this.\" inside methods to refer\nto properties on the current object.\n\nBut, to me, the biggest real missing feature is some form of arrays. You can\nimplement linked lists and lots of other data structures yourself in Lox, but\narrays are special. In order to have true constant-time access to any element in\nthe array, you need to be able to create a truly contiguous array. Lox's only\ncurrent data abstraction is objects with fields, which don't enable that.\n\nSo I'd add arrays. To make them really nice, I'd ideally do something like\ngrowable lists, with literal syntax like `[1, 2, 3]` and a subscript operator\nlike `someArray[2]` to access and set elements. To keep this challenge simple,\nI'll ignore the syntactic niceties and just do the bare minimum to expose the\nsemantics.\n\nI'll add one new native function, \"Array()\". It creates a new array with the\ngiven number of elements, all initialized to null:\n\n```lox\nvar array = Array(3);\nprint array; // \"[null, null, null]\".\n```\n\nAn array object has its own runtime representation. 
It exposes a few properties\nand methods that are also implemented natively:\n\n```lox\nvar array = Array(3);\n\n// \"length\" returns the number of elements.\nprint array.length; // \"3\".\n\n// \"set\" sets the element at the given index to the given value.\narray.set(1, \"new\");\n\n// \"get\" returns the element at a given index.\nprint array.get(1); // \"new\".\n```\n\nThe implementation is pretty straightforward, though native \"methods\" look a\nlittle funny since our natives up to this point have been top-level functions.\nFirst, in the constructor for Interpreter, we add another native function:\n\n```java\nglobals.define(\"Array\", new LoxCallable() {\n  @Override\n  public int arity() {\n    return 1;\n  }\n\n  @Override\n  public Object call(Interpreter interpreter,\n                     List<Object> arguments) {\n    int size = (int)(double)arguments.get(0);\n    return new LoxArray(size);\n  }\n});\n```\n\nThat returns a new LoxArray object. It's defined like:\n\n```java\npackage com.craftinginterpreters.lox;\n\nimport java.util.List;\n\nclass LoxArray extends LoxInstance {\n  private final Object[] elements;\n\n  LoxArray(int size) {\n    super(null);\n    elements = new Object[size];\n  }\n\n  @Override\n  Object get(Token name) {\n    if (name.lexeme.equals(\"get\")) {\n      return new LoxCallable() {\n        @Override\n        public int arity() {\n          return 1;\n        }\n\n        @Override\n        public Object call(Interpreter interpreter,\n                           List<Object> arguments) {\n          int index = (int)(double)arguments.get(0);\n          return elements[index];\n        }\n      };\n    } else if (name.lexeme.equals(\"set\")) {\n      return new LoxCallable() {\n        @Override\n        public int arity() {\n          return 2;\n        }\n\n        @Override\n        public Object call(Interpreter interpreter,\n                           List<Object> arguments) {\n          int index = 
(int)(double)arguments.get(0);\n          Object value = arguments.get(1);\n          return elements[index] = value;\n        }\n      };\n    } else if (name.lexeme.equals(\"length\")) {\n      return (double) elements.length;\n    }\n\n    throw new RuntimeError(name, // [hidden]\n        \"Undefined property '\" + name.lexeme + \"'.\");\n  }\n\n  @Override\n  void set(Token name, Object value) {\n    throw new RuntimeError(name, \"Can't add properties to arrays.\");\n  }\n\n  @Override\n  public String toString() {\n    StringBuffer buffer = new StringBuffer();\n    buffer.append(\"[\");\n    for (int i = 0; i < elements.length; i++) {\n      if (i != 0) buffer.append(\", \");\n      buffer.append(elements[i]);\n    }\n    buffer.append(\"]\");\n    return buffer.toString();\n  }\n}\n```\n\nAnd that's it. Fixed-size arrays are the only other data structure primitive we\nreally need in order to implement all of the other fancy data structures we take\nfor granted like hash tables, trees, etc.\n"
  },
  {
    "path": "note/answers/chapter14_chunks/1.md",
    "content": "In order to run-length encode the line information, we need a slightly smarter\ndata structure than just a flat array of integers. Instead, we'll define a\nlittle struct:\n\n```c\n// chunk.h\ntypedef struct {\n  int offset;\n  int line;\n} LineStart;\n```\n\nEach of these marks the beginning of a new source line in the code, and the\ncorresponding byte offset of the first instruction on that line. Any bytes after\nthat first one are understood to be on that same line, until we hit the next\nLineStart.\n\nIn Chunk, we store an array of these:\n\n```c\n// chunk.h\ntypedef struct {\n  int count;\n  int capacity;\n  uint8_t* code;\n  ValueArray constants;\n  int lineCount;\n  int lineCapacity;\n  LineStart* lines;\n} Chunk;\n```\n\nNote also that we now need a separate lineCount and lineCapacity for this\ndynamic array since its size will be different than code's (it should be much\nshorter, that's the goal).\n\nWe've got to maintain that dynamic array now. When initializing:\n\n```c\n// chunk.c\nvoid initChunk(Chunk* chunk) {\n  chunk->count = 0;\n  chunk->capacity = 0;\n  chunk->code = NULL;\n  chunk->lineCount = 0;    // <--\n  chunk->lineCapacity = 0; // <--\n  chunk->lines = NULL;\n  initValueArray(&chunk->constants);\n}\n```\n\n...and freeing...\n\n```c\n// chunk.c\nvoid freeChunk(Chunk* chunk) {\n  // ...\n  FREE_ARRAY(LineStart, chunk->lines, chunk->lineCapacity);\n}\n```\n\nWhere it gets interesting is when writing a new byte:\n\n```c\n// chunk.c\nvoid writeChunk(Chunk* chunk, uint8_t byte, int line) {\n  if (chunk->capacity < chunk->count + 1) {\n    int oldCapacity = chunk->capacity;\n    chunk->capacity = GROW_CAPACITY(oldCapacity);\n    chunk->code = GROW_ARRAY(uint8_t, chunk->code,\n        oldCapacity, chunk->capacity);\n    // Don't grow line array here...\n  }\n\n  chunk->code[chunk->count] = byte;\n  chunk->count++;\n\n  // See if we're still on the same line.\n  if (chunk->lineCount > 0 &&\n      chunk->lines[chunk->lineCount - 
1].line == line) {\n    return;\n  }\n\n  // Append a new LineStart.\n  if (chunk->lineCapacity < chunk->lineCount + 1) {\n    int oldCapacity = chunk->lineCapacity;\n    chunk->lineCapacity = GROW_CAPACITY(oldCapacity);\n    chunk->lines = GROW_ARRAY(LineStart, chunk->lines,\n                              oldCapacity, chunk->lineCapacity);\n  }\n\n  LineStart* lineStart = &chunk->lines[chunk->lineCount++];\n  lineStart->offset = chunk->count - 1;\n  lineStart->line = line;\n}\n```\n\nThere are three changes here. First, we *don't* implicitly grow the line array\nwhen we grow the code array. Their sizes are decoupled now. Instead, we grow the\nline array when appending a new LineStart, if needed.\n\nThe second `if` statement is where we take advantage of adjacent instructions on\nthe same line. If the byte we're writing is on the same line as the current\nline start, we don't create a new one. This is the compression.\n\nOtherwise, if this is the first byte of code, or it appears on a different line,\nwe begin a new LineStart and grow the array if needed.\n\nThis gives us a compressed array of LineStarts, where each one begins a new\nline. 
Next, we have to use this data when showing line info.\n\nSince the lookup process is a little more complex, we define a helper function:\n\n```c\n// chunk.h\nint getLine(Chunk* chunk, int instruction);\n```\n\nIt looks like this:\n\n```c\n// chunk.c\nint getLine(Chunk* chunk, int instruction) {\n  int start = 0;\n  int end = chunk->lineCount - 1;\n\n  for (;;) {\n    int mid = (start + end) / 2;\n    LineStart* line = &chunk->lines[mid];\n    if (instruction < line->offset) {\n      end = mid - 1;\n    } else if (mid == chunk->lineCount - 1 ||\n        instruction < chunk->lines[mid + 1].offset) {\n      return line->line;\n    } else {\n      start = mid + 1;\n    }\n  }\n}\n```\n\nGiven a byte offset for an instruction, it binary searches through the\nLineStart array to find which LineStart -- and thus which line -- contains that\noffset. Using binary search is much faster than walking the whole array, but\nit does place a constraint on the compiler. It assumes line numbers for the\ninstructions always monotonically increase. Since we're going to have a\nsingle-pass compiler, that should be doable.\n\nNow we can use this function when we disassemble an instruction:\n\n```c\n// debug.c\nint disassembleInstruction(Chunk* chunk, int offset) {\n  printf(\"%04d \", offset);\n  int line = getLine(chunk, offset);\n  if (offset > 0 && line == getLine(chunk, offset - 1)) {\n    printf(\"   | \");\n  } else {\n    printf(\"%4d \", line);\n  }\n  // ...\n}\n```\n"
  },
  {
    "path": "note/answers/chapter14_chunks/2.md",
    "content": "There's not too much to this challenge. We add another opcode:\n\n```c\n// chunk.h\ntypedef enum {\n  OP_CONSTANT,\n  OP_CONSTANT_LONG, // <--\n  OP_RETURN,\n} OpCode;\n```\n\nDeclare the new function:\n\n```c\n// chunk.h\nvoid writeConstant(Chunk* chunk, Value value, int line);\n```\n\nAnd implement it:\n\n```c\n// chunk.c\nvoid writeConstant(Chunk* chunk, Value value, int line) {\n  int index = addConstant(chunk, value);\n  if (index < 256) {\n    writeChunk(chunk, OP_CONSTANT, line);\n    writeChunk(chunk, (uint8_t)index, line);\n  } else {\n    writeChunk(chunk, OP_CONSTANT_LONG, line);\n    writeChunk(chunk, (uint8_t)(index & 0xff), line);\n    writeChunk(chunk, (uint8_t)((index >> 8) & 0xff), line);\n    writeChunk(chunk, (uint8_t)((index >> 16) & 0xff), line);\n  }\n}\n```\n\nThis is pretty straightforward. We add the constant to the array and get the\nindex back. If the index fits in one byte, we use the short opcode and just\nwrite the single byte.\n\nOtherwise, we write the long opcode. Then we need to split the value into\nmultiple bytes. It's up to us to pick an endianness -- do we put the most\nsignificant byte first or last? 
For no particular reason, I went with\nlittle-endian, the same order x86 uses.\n\nWe want to be able to disassemble it too, so we add another case:\n\n```c\n// debug.c\n    case OP_CONSTANT_LONG:\n      return longConstantInstruction(\"OP_CONSTANT_LONG\", chunk, offset);\n```\n\nAnd that calls:\n\n```c\n// debug.c\nstatic int longConstantInstruction(const char* name, Chunk* chunk,\n                                   int offset) {\n  uint32_t constant = chunk->code[offset + 1] |\n                     (chunk->code[offset + 2] << 8) |\n                     (chunk->code[offset + 3] << 16);\n  printf(\"%-16s %4d '\", name, constant);\n  printValue(chunk->constants.values[constant]);\n  printf(\"'\\n\");\n  return offset + 4;\n}\n```\n\nAgain, we need to worry about endianness and we need to make sure we decode\nthe bytes the same way we encoded them. (If we were interpreting these, we'd\nneed to do it right there too.)\n\nThis isn't a bad approach. The main trade-off is that it adds to the number of\ninstructions we have. That has a couple of downsides:\n\n- It makes our interpreter more complex. This is pretty minor, though.\n\n- It uses up an opcode. If we want all opcodes to fit in a single byte, we can\n  only have 256 different ones. Our toy interpreter won't need anywhere near\n  that many, but a full-featured bytecode VM like the JVM or CPython can end up\n  using lots of them and we may not want to sacrifice another opcode for this.\n\n- It *might* slightly slow down the interpreter. Machine code has to be loaded\n  onto the CPU before it can be executed, so locality affects it too. 
The less\n  code you have in your core interpreter bytecode execution loop, the fewer\n  cache misses you'll have as it dispatches to different instructions.\n\n  Having multiple instructions, each with their own code, for handling constants\n  of different sizes increases the code size of the core interpreter loop and\n  might cause a few more cache misses.\n\nIn practice, though, none of these is fatal and having multiple instructions\nof different sizes isn't a terrible idea.\n"
  },
  {
    "path": "note/answers/chapter15_virtual/1.md",
    "content": "\nA helpful intermediate step is to explicitly parenthesize them so we can see\nthe operator precedence:\n\n    (1 * 2) + 3\n    1 + (2 * 3)\n    (3 - 2) - 1\n    (1 + (2 * 3)) - (4 / (-5))\n\nFrom there, it's straightforward to mentally do a post-order traversal of the\nsyntax trees:\n\n    // (1 * 2) + 3\n    CONST 1\n    CONST 2\n    MULTIPLY\n    CONST 3\n    ADD\n\n    // 1 + (2 * 3)\n    CONST 1\n    CONST 2\n    CONST 3\n    MULTIPLY\n    ADD\n\n    // (3 - 2) - 1\n    CONST 3\n    CONST 2\n    SUBTRACT\n    CONST 1\n    SUBTRACT\n\n    // (1 + (2 * 3)) - (4 / (-5))\n    CONST 1\n    CONST 2\n    CONST 3\n    MULTIPLY\n    ADD\n    CONST 4\n    CONST 5\n    NEGATE\n    DIVIDE\n    SUBTRACT\n"
  },
  {
    "path": "note/answers/chapter15_virtual/2.md",
    "content": "First, let's parenthesize:\n\n    4 - (3 * (- 2))\n\nThat gives:\n\n    CONST 4\n    CONST 3\n    CONST 2\n    NEGATE\n    MULTIPLY\n    SUBTRACT\n\nWithout negation, we need to subtract a number from zero to negate it, so the\ncode conceptually becomes:\n\n    4 - (3 * (0 - 2))\n\nWhich is:\n\n    CONST 4\n    CONST 3\n    CONST 0 // <--\n    CONST 2\n    SUBTRACT // <--\n    MULTIPLY\n    SUBTRACT\n\nWithout subtraction, we add the negation of the subtrahend:\n\n    4 + - (3 * (- 2))\n\nWhich is:\n\n    CONST 4\n    CONST 3\n    CONST 2\n    NEGATE\n    MULTIPLY\n    NEGATE // <--\n    ADD // <--\n\nI do think it makes sense to have both instructions. The overhead of dispatching\nis pretty high, so you want instructions as high level as possible, you want to\nfill your opcode space, and you want common operations to encode as a single\ninstruction when possible.\n\nGiven how common both negation and subtraction are, and given that we've got\nplenty of room in our opcode set, it makes perfect sense to have instructions\nfor both.\n\nI would also consider specialized instructions to load common number constants\nlike zero and one. It might be worth having instructions to increment and\ndecrement a number too.\n"
  },
  {
    "path": "note/answers/chapter15_virtual/3.md",
    "content": "There's nothing super algorithmically interesting about the change. We basically\nturn it into a dynamic array like we've seen before. A side effect of this\nchange is that `stackTop` becomes `stackCount`, an int. Using a raw pointer to\nthe top makes it a little harder to tell if we've run out of capacity:\n\n```c\ntypedef struct {\n  Chunk* chunk;\n  uint8_t* ip;\n  Value* stack;\n  int stackCount;\n  int stackCapacity;\n} VM;\n```\n\nWhen we first create the VM, we need to initialize the dynamic array fields:\n\n```c\nvoid initVM() {\n  vm.stack = NULL;\n  vm.stackCapacity = 0;\n  resetStack();\n}\n```\n\nResetting is still pretty simple:\n\n```c\nstatic void resetStack() {\n  vm.stackCount = 0;\n}\n```\n\nSo is `pop()`:\n\n```c\nValue pop() {\n  vm.stackCount--;\n  return vm.stack[vm.stackCount];\n}\n```\n\nWhere it gets interesting is `push()`:\n\n```c\nvoid push(Value value) {\n  if (vm.stackCapacity < vm.stackCount + 1) {\n    int oldCapacity = vm.stackCapacity;\n    vm.stackCapacity = GROW_CAPACITY(oldCapacity);\n    vm.stack = GROW_ARRAY(Value, vm.stack, \n                          oldCapacity, vm.stackCapacity);\n  }\n\n  vm.stack[vm.stackCount] = value;\n  vm.stackCount++;\n}\n```\nWe also have to change the way we debug the stack:\n```c\nfor (Value *slot = vm.stack; slot < vm.stack + vm.stackCount; slot++) {\n  printf(\"[ \");\n  printValue(*slot);\n  printf(\" ]\");\n}\n```\n\nThat `if` test needs to happen every single time we push a value. That happens\nall the time while the VM is running, so this is a significant performance\nproblem.\n\nWe wouldn't want to have to do that. Fortunately, it turns out we won't need\nto. 
If you're willing to limit the generated bytecode to fit within certain\nconstraints -- which happen to be implicitly true in a language with structured\ncontrol flow like Lox -- then you can *statically* determine the maximum amount\nof stack space a chunk of bytecode could ever use.\n\nDuring compilation, you always know how many stack slots are in use for locals\nand temporaries at any point in time. So you just keep a running tally of the\nhigh-water mark -- the greatest amount of stack space used at any point -- and\nthen store that with the resulting chunk.\n\nSo instead of checking on every single push, we check once before evaluating\nthe bytecode to see if the stack is big enough to cover the worst case.\n"
  },
  {
    "path": "note/answers/chapter16_scanning.md",
    "content": "## 1\n\nI've implemented this in another language, Wren. You can see the code here:\n\nhttps://github.com/munificent/wren/blob/8fae8e4f1e490888e2cc9b2ea6b8e0d0ff9dd60f/src/vm/wren_compiler.c#L118-L130\n\nPoke around in that file for \"interp\" to see everything. The basic idea is you\nhave two token types. TOKEN_STRING is for uninterpolated string literals, and\nthe last segment of an interpolated string. Every piece of a string literal that\nprecedes an interpolated expression uses a different TOKEN_INTERPOLATION type.\n\nThis:\n\n```lox\n\"Tea will be ready in ${steep + cool} minutes.\"\n```\n\nGets scanned like:\n\n```text\nTOKEN_INTERPOLATION \"Tea will be ready in \"\nTOKEN_IDENTIFIER    \"steep\"\nTOKEN_PLUS          \"+\"\nTOKEN_IDENTIFIER    \"cool\"\nTOKEN_STRING        \" minutes.\"\n```\n\n(The interpolation delimiters themselves are discarded.)\n\nAnd this:\n\n```lox\n\"Nested ${\"interpolation?! Are you ${\"mad?!\"}\"}\"\n```\n\nScans as:\n\n```text\nTOKEN_INTERPOLATION \"Nested \"\nTOKEN_INTERPOLATION \"interpolation?! Are you \"\nTOKEN_STRING        \"mad?!\"\nTOKEN_STRING        \"\"\nTOKEN_STRING        \"\"\n```\n\nThe two empty TOKEN_STRING tokens are because the interpolation appears at the\nvery end of the string. They tell the parser that it has reached the end of\nthe interpolated expression.\n\n## 2\n\nAs far as I can tell, Java and C# don't actually specify it correctly. Unless\nthe verbiage is hidden away somewhere in the specs, I believe that this:\n\n```java\nList<List<String>> nestedList;\n```\n\nShould technically be a syntax error in a fully spec-compliant implementation\nof Java or C#. 
However, practical implementations don't follow the letter\nof the spec and instead do what users want.\n\nC++, as of C++0x, does actually specify this:\n\nhttp://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1757.html\n\nIt states that if a `<` has been scanned and no closing `>` has been scanned\nyet, and there are no other intervening bracket characters, then a subsequent\n`>>` is scanned as two `>` tokens instead of a single shift.\n\nAs far as implementation goes, I think javac handles this by scanning the `>>` as a\nsingle shift token. When the parser is looking for a `>` to close the type\nargument, if it sees a shift token, it splits it into two `>` tokens right then,\nconsumes the first, and then keeps parsing.\n\nMicrosoft's C# parser takes the opposite approach. It always scans `>>` as two\nseparate `>` tokens. Then, when parsing an expression, if it sees two `>` tokens\nnext to each other with no whitespace between them, it parses them as a shift\noperator.\n\n## 3\n\nI don't generally like contextual keywords. It's fairly easy to write a real\nparser that can handle them gracefully, but:\n\n*   Users are often confused by them. Many programmers don't even realize that\n    contextual keywords exist. They assume all identifiers are either fully\n    reserved by the language or fully available for use.\n\n*   Once an identifier becomes a keyword in some context, it quickly takes on\n    that meaning to readers and becomes *very* confusing if you use it for your\n    own name outside of that context. Now that C# has async/await, you will\n    just anger your fellow C# users if you name a variable `await` in some\n    non-async method because they are so used to seeing `await` used for its\n    keyword meaning.\n\n    So even though it's *technically* usable elsewhere, it's effectively fully\n    reserved.\n\nThat being said, sometimes you have no other option. 
Once your language is in\nwide use, reserving a new keyword is a breaking change to any code that was\npreviously using that name. If you can only reserve it inside a new context that\ndidn't previously exist (for example, async functions in C#), or in a context\nwhere an identifier can't appear, then you can reserve it only in that context\nand be confident that you didn't break any previous code.\n\nSo they're sort of an inevitable compromise when evolving a language over time.\n\nImplementing them is pretty easy. The scanner scans them like regular\nidentifiers, since it doesn't generally know the surrounding context. In the\nparser, you recognize the keyword in that context by looking for an identifier\ntoken and checking to see if its lexeme is the right string.\n"
  },
  {
    "path": "note/answers/chapter17_compiling.md",
    "content": "## 1\n\nIt's:\n\n```\nexpression\n| parsePrecedence(PREC_ASSIGNMENT)\n| | grouping\n| | | expression\n| | | | parsePrecedence(PREC_ASSIGNMENT)\n| | | | | unary // for \"-\"\n| | | | | | parsePrecedence(PREC_UNARY)\n| | | | | | | number\n| | | | | binary // for \"+\"\n| | | | | | parsePrecedence(PREC_FACTOR) // PREC_TERM + 1\n| | | | | | | number\n| | binary // for \"*\"\n| | | parsePrecedence(PREC_UNARY) // PREC_FACTOR + 1\n| | | | number\n| | binary // for \"-\"\n| | | parsePrecedence(PREC_FACTOR) // PREC_TERM + 1\n| | | | unary // for \"-\"\n| | | | | parsePrecedence(PREC_UNARY)\n| | | | | | number\n```\n\n## 2\n\nLox only has one other: left parenthesis is used as a prefix expression for\ngrouping, and as an infix expression for invoking a function.\n\nSeveral languages allow `+` as a prefix unary operator as a parallel to `-` and\nthen also of course use infix `+` for addition.\n\nA number of languages use square brackets for list or array literals, which\nmakes `[` a prefix expression and then also use square brackets as a subscript\noperator to access elements from a list.\n\nC uses `*` as a prefix operator to dereference a pointer and as infix for\nmultiplication. Likewise, `&` is a prefix address-of operator and infix bitwise\nand.\n\n`*` and `&` aren't prefix *expressions* in Ruby, but they can appear in prefix\nposition before an argument in an argument list.\n\n## 3\n\nThe `?:` operator has lower precedence than almost anything, so we add a new `PREC_CONDITIONAL` level between `PREC_ASSIGNMENT` and `PREC_OR`. I'll skip adding the new TokenType enums for `?` and `:`. That part is pretty obvious. 
In the new row in the table for the `?` token type, we call:\n\n```c\nstatic void conditional()\n{\n  // Compile the then branch.\n  parsePrecedence(compiler, PREC_CONDITIONAL);\n\n  consume(compiler, TOKEN_COLON,\n          \"Expect ':' after then branch of conditional operator.\");\n\n  // Compile the else branch.\n  parsePrecedence(compiler, PREC_ASSIGNMENT);\n}\n```\n\nOf course a full implementation needs more code to actually do the conditional\nevaluation, but that should compile the operands with the right precedence. Note\nthat the precedence of the operands is a little unusual. The precedence of the\nlast operand is *lower* than the conditional expression itself.\n\nThat might be surprising, but it's how C rolls.\n"
  },
  {
    "path": "note/answers/chapter18_types.md",
    "content": "## 1\n\nHaving both `OP_NEGATE` and `OP_SUBTRACT` is redundant. We can replace\nsubtraction with negate-then-add:\n\n```c\n// Emit the operator instruction.\nswitch (operatorType) {\n  // ...\n  case TOKEN_PLUS:          emitByte(OP_ADD); break;\n  case TOKEN_MINUS:         emitBytes(OP_NEGATE, OP_ADD); break; // <--\n  case TOKEN_STAR:          emitByte(OP_MULTIPLY); break;\n  case TOKEN_SLASH:         emitByte(OP_DIVIDE); break;\n  default:\n    return; // Unreachable.\n}\n```\n\nOr we can replace negation with:\n\n1. Push zero.\n2. Compile the operand being negated.\n3. Subtract.\n\nIt's also possible to simplify the comparison and equality instructions using\nsome stack juggling and a bitwise operator. Fundamentally, you only need a\nsingle operation, an instruction that returns one of three values: \"less\",\n\"equal\", or \"greater\". This is similar to the `compareTo()` methods in many\nlanguages or the `<=>` in Ruby. Once you have that, the other operators can be\ndefined in terms of it.\n\n## 2\n\nMany other instruction sets define dedicated instructions for common small\ninteger constants. 0, 1, 2, and -1 are good candidates.\n\nA few arithmetic operations have common constant operands. For those cases, it\nmay be worth adding instructions for them: incrementing and decrementing by one\nare the main ones. But maybe even doubling comes up enough to warrant it.\n\nLikewise, comparisons to certain numbers are also common and can be encoded\ndirectly in a single instruction instead of needing to load the number from a\nconstant and then use the comparison instruction. Many CPU instruction sets can\ncompare a number with zero in a single instruction.\n\nThere's been some research into \"superinstructions\" -- automated or manual\ntechniques for defining instructions that represent a sequence of common simpler\ninstructions. There is a point of diminishing returns because eventually you run\nout of opcodes. 
You can use larger opcodes (16 bits, etc.), but then that slows\ndown dispatch overall because now your code is larger.\n"
  },
  {
    "path": "note/answers/chapter19_strings.md",
    "content": "## 1\n\nThis change is mostly mechanical and not too difficult. First, in the type\nitself, we change the last field to use the C99 flexible array member syntax:\n\n```c\nstruct sObjString {\n  Obj obj;\n  int length;\n  // Was:\n  // char* chars;\n  // Now:\n  char chars[];\n};\n```\n\nThis means that, by default, `chars` is treated as having zero size, but still\nof array type. It's up to us to allocate enough memory for the ObjString and\nas many trailing bytes as we need. This means the little memory macros don't\nwork, so we'll manually call `reallocate()`.\n\nFirst, replace `takeString()` and `copyString()` with:\n\n```c\nObjString* makeString(int length) {\n  ObjString* string = (ObjString*)allocateObject(\n      sizeof(ObjString) + length + 1, OBJ_STRING);\n  string->length = length;\n  return string;\n}\n\nObjString* copyString(const char* chars, int length) {\n  ObjString* string = makeString(length);\n\n  memcpy(string->chars, chars, length);\n  string->chars[length] = '\\0';\n\n  return string;\n}\n```\n\nNow that the character buffer is part of the same allocation as the ObjString,\nwe can't take ownership of an existing character array. Instead, we need to\ncreate the ObjString that the characters will be copied into.\n\nThe `makeString()` function allocates an ObjString with as many extra bytes at\nthe end as the string needs. It also sets the length, but doesn't initialize\nthe characters.\n\n`copyString()` uses that to make a new string and copy in the given characters.\nThat's what string literals do. 
For concatenation, we do:\n\n```c\nstatic void concatenate() {\n  ObjString* b = AS_STRING(pop());\n  ObjString* a = AS_STRING(pop());\n\n  int length = a->length + b->length;\n  ObjString* result = makeString(length);\n  memcpy(result->chars, a->chars, a->length);\n  memcpy(result->chars + a->length, b->chars, b->length);\n  result->chars[length] = '\\0';\n\n  push(OBJ_VAL(result));\n}\n```\n\nInstead of creating the character array then the string object, we create the\nstring object first and then write the concatenated string right into it.\n\nHere's how we free it:\n\n```c\n  switch (object->type) {\n    case OBJ_STRING: {\n      ObjString* string = (ObjString*)object;\n      // Was:\n      // FREE_ARRAY(char, string->chars, string->length + 1);\n      // FREE(ObjString, object);\n      // Now:\n      reallocate(object, sizeof(ObjString) + string->length + 1, 0);\n      break;\n    }\n  }\n```\n\nNote that we include the extra size, but also that now only a single\n`reallocate()` call is needed.\n\n## 2\n\nThis one's also not too bad. A more efficient solution would be to pack the\n\"is owned\" bit into the type tag or as a bitfield next to the length. 
Of course,\nsince this is an optimization, the right way to go about it is to profile some\nreal-world programs and see if this optimization is worth doing.\n\nBut the simple implementation looks like this:\n\nWe add a field to the struct to track whether it owns the character array:\n\n```c\nstruct sObjString {\n  Obj obj;\n  bool ownsChars; // <--\n  int length;\n  const char* chars; // <--\n};\n```\n\nWe replace `takeString()` and `copyString()` with:\n\n```c\nObjString* makeString(bool ownsChars, char* chars, int length) {\n  ObjString* string = ALLOCATE_OBJ(ObjString, OBJ_STRING);\n  string->ownsChars = ownsChars;\n  string->length = length;\n  string->chars = chars;\n\n  return string;\n}\n```\n\nWhen we create a string from a literal, we call `makeString()` and have it not\nown the characters:\n\n```c\nstatic void string() {\n  emitConstant(OBJ_VAL(makeString(false,\n      (char*)parser.previous.start + 1, parser.previous.length - 2)));\n}\n```\n\nAnd when we concatenate, it does:\n\n```c\nstatic void concatenate() {\n  ObjString* b = AS_STRING(pop());\n  ObjString* a = AS_STRING(pop());\n\n  int length = a->length + b->length;\n  char* chars = ALLOCATE(char, length + 1);\n  memcpy(chars, a->chars, a->length);\n  memcpy(chars + a->length, b->chars, b->length);\n  chars[length] = '\\0';\n\n  ObjString* result = makeString(true, chars, length); // <--\n  push(OBJ_VAL(result));\n}\n```\n\nWe also need to fix `printObject()` since we can't assume strings are terminated\nanymore:\n\n```c\nvoid printObject(Value value) {\n  switch (OBJ_TYPE(value)) {\n    case OBJ_STRING:\n      // Changed:\n      printf(\"%.*s\", AS_STRING(value)->length, AS_CSTRING(value));\n      break;\n  }\n}\n```\n\nFinally, when we free a string, we only free the character array if we own it:\n\n```c\nstatic void freeObject(Obj* object) {\n  switch (object->type) {\n    case OBJ_STRING: {\n      ObjString* string = (ObjString*)object;\n      if (string->ownsChars) { // <--\n        
FREE_ARRAY(char, (char*)string->chars, string->length + 1);\n      }\n      FREE(ObjString, object);\n      break;\n    }\n  }\n}\n```\n\n## 3\n\nMy preference depends on the semantics of dispatching the \"+\" operator. My\ngeneral goals are:\n\n* Do convert the other operand to a string and then concatenate when possible.\n* Try to maintain symmetry of the operator.\n\nIn some languages, these two goals are in conflict.\n\nIn C++, you can do it by defining `+` to take two strings. Then, any type that\nwants to allow itself to be a concatenated operand defines an implicit\nconversion to string. This works whether the operand is on the left or right.\n\nC# has similar behavior, but built in. If one operand of `+` is a string, the\nother is converted to a string by calling `ToString()` on it and the results are\nconcatenated. I think that works fine.\n\nIn languages like Smalltalk where `+` is a method dynamically dispatched on the\nleft-hand operand, it's harder to make the behavior symmetric. It's easy to\ndefine a `+` method on string that converts the right-hand operand to a string.\nBut it's harder to define a `+` on all types that converts the receiver to a\nstring if the right operand is a string.\n\nIn that case, I'm not as thrilled about overloading `+` to mean concatenation\nand might prefer a different operator. (In Smalltalk, that operator is `,`.)\n\nAt a higher level, while I like `+` for concatenation because it's familiar, I\ndon't think it's a great way to build strings out of parts. I *much* prefer\nhaving string interpolation built into the language.\n"
  },
  {
    "path": "note/answers/chapter20_hash/1.md",
    "content": "There's nothing mind-blowing about this exercise. It's mostly just replacing\n`ObjString*` with `Value` in the places where keys are passed around. In a\ncouple of places, you need to wrap a string in a value or unwrap it.\n\nThe full diff is below.\n\nThere are two interesting parts. First, we can no longer use a `NULL` key to\nrepresent an empty bucket. Keys are now Values, not pointers, so there is no\n`NULL`. We could use `nil`, but remember, `nil` is a valid key now too! Instead,\nI added a singleton value type, \"empty\":\n\n```c\ntypedef enum {\n   VAL_BOOL,\n   VAL_NIL,\n   VAL_NUMBER,\n   VAL_OBJ,\n   VAL_EMPTY // <--\n } ValueType;\n```\n\nUsers can never produce or see a value of this type. It's only used internally\nto identify empty buckets.\n\nSecond, we need to be able to generate a hash code for any kind of value, not\njust strings. Because the other value types are small and fixed-size, I don't\nthink it's worth caching the hash code. Instead, it's calculated on the fly\nas needed. The implementation looks like:\n\n```\nstatic uint32_t hashDouble(double value) {\n  union BitCast {\n    double value;\n    uint32_t ints[2];\n  };\n\n  union BitCast cast;\n  cast.value = (value) + 1.0;\n  return cast.ints[0] + cast.ints[1];\n}\n\nuint32_t hashValue(Value value) {\n  switch (value.type) {\n    case VAL_BOOL:   return AS_BOOL(value) ? 3 : 5;\n    case VAL_NIL:    return 7;\n    case VAL_NUMBER: return hashDouble(AS_NUMBER(value));\n    case VAL_OBJ:    return AS_STRING(value)->hash;\n    case VAL_EMPTY:  return 0;\n  }\n}\n```\n\nThere are some somewhat arbitrary choices here. I picked distinct constant\nhash codes for the singleton values `true`, `false`, and `nil`. As long as they\naren't all zero, I don't think the value matters too much.\n\nGenerating a hash code for a double is harder and exposes some subtle issues.\nShould two `NaN` values that have different underlying bit representations be\nconsidered the same or not? 
Should `0.0` and `-0.0` have the same hash code?\n\nI don't claim to be an expert on this, so I just borrowed the above\nimplementation from Lua. CPython has an interesting, very different approach.\n\nHere's the whole thing:\n\n```\ndiff --git a/c/object.c b/c/object.c\nindex 94f2bb5..c6f97f5 100644\n--- a/c/object.c\n+++ b/c/object.c\n@@ -26,7 +26,7 @@ static ObjString* allocateString(char* chars, int length,\n   string->chars = chars;\n   string->hash = hash;\n\n-  tableSet(&vm.strings, string, NIL_VAL);\n+  tableSet(&vm.strings, OBJ_VAL(string), NIL_VAL);\n\n   return string;\n }\ndiff --git a/c/table.c b/c/table.c\nindex 0082f46..78dd7ed 100644\n--- a/c/table.c\n+++ b/c/table.c\n@@ -18,14 +18,14 @@ void freeTable(Table* table) {\n   initTable(table);\n }\n static Entry* findEntry(Entry* entries, int capacity,\n-                        ObjString* key) {\n-  uint32_t index = key->hash % capacity;\n+                        Value key) {\n+  uint32_t index = hashValue(key) % capacity;\n   Entry* tombstone = NULL;\n\n   for (;;) {\n     Entry* entry = &entries[index];\n\n-    if (entry->key == NULL) {\n+    if (IS_EMPTY(entry->key)) {\n       if (IS_NIL(entry->value)) {\n         // Empty entry.\n         return tombstone != NULL ? 
tombstone : entry;\n@@ -33,7 +33,7 @@ static Entry* findEntry(Entry* entries, int capacity,\n         // We found a tombstone.\n         if (tombstone == NULL) tombstone = entry;\n       }\n-    } else if (entry->key == key) {\n+    } else if (valuesEqual(key, entry->key)) {\n       // We found the key.\n       return entry;\n     }\n@@ -41,11 +41,11 @@ static Entry* findEntry(Entry* entries, int capacity,\n     index = (index + 1) % capacity;\n   }\n }\n-bool tableGet(Table* table, ObjString* key, Value* value) {\n+bool tableGet(Table* table, Value key, Value* value) {\n   if (table->entries == NULL) return false;\n\n   Entry* entry = findEntry(table->entries, table->capacity, key);\n-  if (entry->key == NULL) return false;\n+  if (IS_EMPTY(entry->key)) return false;\n\n   *value = entry->value;\n   return true;\n@@ -53,14 +53,14 @@ bool tableGet(Table* table, ObjString* key, Value* value) {\n static void adjustCapacity(Table* table, int capacity) {\n   Entry* entries = ALLOCATE(Entry, capacity);\n   for (int i = 0; i < capacity; i++) {\n-    entries[i].key = NULL;\n+    entries[i].key = EMPTY_VAL;\n     entries[i].value = NIL_VAL;\n   }\n\n   table->count = 0;\n   for (int i = 0; i < table->capacity; i++) {\n     Entry* entry = &table->entries[i];\n-    if (entry->key == NULL) continue;\n+    if (IS_EMPTY(entry->key)) continue;\n\n     Entry* dest = findEntry(entries, capacity, entry->key);\n     dest->key = entry->key;\n@@ -72,29 +72,29 @@ static void adjustCapacity(Table* table, int capacity) {\n   table->entries = entries;\n   table->capacity = capacity;\n }\n-bool tableSet(Table* table, ObjString* key, Value value) {\n+bool tableSet(Table* table, Value key, Value value) {\n   if (table->count + 1 > table->capacity * TABLE_MAX_LOAD) {\n     int capacity = GROW_CAPACITY(table->capacity);\n     adjustCapacity(table, capacity);\n   }\n\n   Entry* entry = findEntry(table->entries, table->capacity, key);\n-  bool isNewKey = entry->key == NULL;\n+  bool isNewKey = 
IS_EMPTY(entry->key);\n   entry->key = key;\n   entry->value = value;\n\n   if (isNewKey) table->count++;\n   return isNewKey;\n }\n-bool tableDelete(Table* table, ObjString* key) {\n+bool tableDelete(Table* table, Value key) {\n   if (table->count == 0) return false;\n\n   // Find the entry.\n   Entry* entry = findEntry(table->entries, table->capacity, key);\n-  if (entry->key == NULL) return false;\n+  if (IS_EMPTY(entry->key)) return false;\n\n   // Place a tombstone in the entry.\n-  entry->key = NULL;\n+  entry->key = EMPTY_VAL;\n   entry->value = BOOL_VAL(true);\n\n   return true;\n@@ -102,7 +102,7 @@ bool tableDelete(Table* table, ObjString* key) {\n void tableAddAll(Table* from, Table* to) {\n   for (int i = 0; i < from->capacity; i++) {\n     Entry* entry = &from->entries[i];\n-    if (entry->key != NULL) {\n+    if (!IS_EMPTY(entry->key)) {\n       tableSet(to, entry->key, entry->value);\n     }\n   }\n@@ -119,11 +119,13 @@ ObjString* tableFindString(Table* table, const char* chars, int length,\n   for (;;) {\n     Entry* entry = &table->entries[index];\n\n-    if (entry->key == NULL) return NULL;\n-    if (entry->key->length == length &&\n-        memcmp(entry->key->chars, chars, length) == 0) {\n+    if (IS_EMPTY(entry->key)) return NULL;\n+\n+    ObjString* string = AS_STRING(entry->key);\n+    if (string->length == length &&\n+        memcmp(string->chars, chars, length) == 0) {\n       // We found it.\n-      return entry->key;\n+      return string;\n     }\n\n     // Try the next slot.\ndiff --git a/c/table.h b/c/table.h\nindex 4a51599..02c365d 100644\n--- a/c/table.h\n+++ b/c/table.h\n@@ -5,7 +5,7 @@\n #include \"value.h\"\n\n typedef struct {\n-  ObjString* key;\n+  Value key;\n   Value value;\n } Entry;\n\n@@ -17,9 +17,9 @@ typedef struct {\n\n void initTable(Table* table);\n void freeTable(Table* table);\n-bool tableGet(Table* table, ObjString* key, Value* value);\n-bool tableSet(Table* table, ObjString* key, Value value);\n-bool 
tableDelete(Table* table, ObjString* key);\n+bool tableGet(Table* table, Value key, Value* value);\n+bool tableSet(Table* table, Value key, Value value);\n+bool tableDelete(Table* table, Value key);\n void tableAddAll(Table* from, Table* to);\n ObjString* tableFindString(Table* table, const char* chars, int length,\n                            uint32_t hash);\ndiff --git a/c/value.c b/c/value.c\nindex bebcdb6..c139907 100644\n--- a/c/value.c\n+++ b/c/value.c\n@@ -30,6 +30,7 @@ void printValue(Value value) {\n     case VAL_NIL:    printf(\"nil\"); break;\n     case VAL_NUMBER: printf(\"%g\", AS_NUMBER(value)); break;\n     case VAL_OBJ:    printObject(value); break;\n+    case VAL_EMPTY:  printf(\"<empty>\"); break;\n   }\n }\n bool valuesEqual(Value a, Value b) {\n@@ -41,5 +42,27 @@ bool valuesEqual(Value a, Value b) {\n     case VAL_NUMBER: return AS_NUMBER(a) == AS_NUMBER(b);\n     case VAL_OBJ:\n       return AS_OBJ(a) == AS_OBJ(b);\n+    case VAL_EMPTY:  return true;\n+  }\n+}\n+\n+static uint32_t hashDouble(double value) {\n+  union BitCast {\n+    double value;\n+    uint32_t ints[2];\n+  };\n+\n+  union BitCast cast;\n+  cast.value = (value) + 1.0;\n+  return cast.ints[0] + cast.ints[1];\n+}\n+\n+uint32_t hashValue(Value value) {\n+  switch (value.type) {\n+    case VAL_BOOL:   return AS_BOOL(value) ? 
3 : 5;\n+    case VAL_NIL:    return 7;\n+    case VAL_NUMBER: return hashDouble(AS_NUMBER(value));\n+    case VAL_OBJ:    return AS_STRING(value)->hash;\n+    case VAL_EMPTY:  return 0;\n   }\n }\ndiff --git a/c/value.h b/c/value.h\nindex a24af84..2ed3370 100644\n--- a/c/value.h\n+++ b/c/value.h\n@@ -10,7 +10,8 @@ typedef enum {\n   VAL_BOOL,\n   VAL_NIL, // [user-types]\n   VAL_NUMBER,\n-  VAL_OBJ\n+  VAL_OBJ,\n+  VAL_EMPTY\n } ValueType;\n\n typedef struct {\n@@ -26,6 +27,7 @@ typedef struct {\n #define IS_NIL(value)     ((value).type == VAL_NIL)\n #define IS_NUMBER(value)  ((value).type == VAL_NUMBER)\n #define IS_OBJ(value)     ((value).type == VAL_OBJ)\n+#define IS_EMPTY(value)   ((value).type == VAL_EMPTY)\n\n #define AS_OBJ(value)     ((value).as.obj)\n #define AS_BOOL(value)    ((value).as.boolean)\n@@ -35,6 +37,7 @@ typedef struct {\n #define NIL_VAL           ((Value){ VAL_NIL, { .number = 0 } })\n #define NUMBER_VAL(value) ((Value){ VAL_NUMBER, { .number = value } })\n #define OBJ_VAL(object)   ((Value){ VAL_OBJ, { .obj = (Obj*)object } })\n+#define EMPTY_VAL         ((Value){ VAL_EMPTY, { .number = 0 } })\n\n typedef struct {\n   int capacity;\n@@ -47,5 +50,6 @@ void initValueArray(ValueArray* array);\n void writeValueArray(ValueArray* array, Value value);\n void freeValueArray(ValueArray* array);\n void printValue(Value value);\n+uint32_t hashValue(Value value);\n\n #endif\n```\n"
  },
  {
    "path": "note/answers/chapter21_global.md",
    "content": "## 1\n\nThe optimization is pretty straightforward. When adding a string constant, we\nlook in the constant table to see if that string is already in there. The\ninteresting question is how. The simplest implementation is a linear scan over\nthe existing constants.\n\nBut that means compilation time is quadratic in the number of unique identifiers\nin the chunk. While that's fine for relatively small programs, users have a\nhabit of writing larger programs than we ever anticipated. Virtually every\nalgorithm in the compiler that isn't linear is potentially a performance\nproblem.\n\nFortunately, we have a way of looking up strings in constant time -- a hash\ntable. So, in the compiler, we add a hash table that keeps track of the\nidentifier constants that have already been added. Each key is an identifier,\nand its value is the index of the identifier in the constant table.\n\nIn compiler.c, add a module variable:\n\n```c\nTable stringConstants;\n```\n\nIn `compile()`, we initialize and tear it down:\n\n```c\nbool compile(const char* source, Chunk* chunk) {\n  initScanner(source);\n\n  compilingChunk = chunk;\n  parser.hadError = false;\n  parser.panicMode = false;\n  initTable(&stringConstants); // <--\n\n  advance();\n\n  while (!match(TOKEN_EOF)) {\n    declaration();\n  }\n\n  endCompiler();\n  freeTable(&stringConstants); // <--\n  return !parser.hadError;\n}\n```\n\nWhen adding an identifier constant, we look for it in the hash table first:\n\n```c\nstatic uint8_t identifierConstant(Token* name) {\n  // See if we already have it.\n  ObjString* string = copyString(name->start, name->length);\n  Value indexValue;\n  if (tableGet(&stringConstants, string, &indexValue)) {\n    // We do.\n    return (uint8_t)AS_NUMBER(indexValue);\n  }\n\n  uint8_t index = makeConstant(OBJ_VAL(string));\n  tableSet(&stringConstants, string, NUMBER_VAL((double)index));\n  return index;\n}\n```\n\nThat's pretty simple. 
Compiling an identifier is still (amortized) constant\ntime, though with slightly worse constant factors. In return, we use up fewer\nconstant table slots. We don't actually save memory from redundant strings\nbecause clox already interns all strings. But the smaller table is nice.\n\n*Note that we leak memory for the identifier string in `identifierConstant()`\nif the name is already found. That's because we don't have a GC yet.*\n\n## 2\n\nThere are a few ways to solve this. I'll do one that introduces another layer\nof indirection, and a little information sharing between the compiler and VM.\n\nIn the VM, we remove the `globals` hash table and replace it with:\n\n```c\n  Table globalNames;\n  ValueArray globalValues;\n```\n\nThe value array is where the global variable values live. The hash table maps\nthe name of a global variable to its index in the value array. So, if the\nprogram is:\n\n```lox\nvar a = \"value\";\n```\n\nThen `globalNames` will contain a single entry, `\"a\" -> 0` and `globalValues`\nwill contain a single element, `\"value\"`. This association is all wired up at\ncompile time:\n\n```c\nstatic uint8_t identifierConstant(Token* name) {\n  Value index;\n  ObjString* identifier = copyString(name->start, name->length);\n  if (tableGet(&vm.globalNames, identifier, &index)) {\n    return (uint8_t)AS_NUMBER(index);\n  }\n\n  uint8_t newIndex = (uint8_t)vm.globalValues.count;\n  writeValueArray(&vm.globalValues, UNDEFINED_VAL);\n\n  tableSet(&vm.globalNames, identifier, NUMBER_VAL((double)newIndex));\n  return newIndex;\n}\n```\n\nWhen compiling a reference to a global variable, we see if we've ever\nencountered its name before. If so, we know what index the value will be in in\nthe `globalValues` array. Otherwise, we add a new empty undefined value in the\narray and then store a new hash table entry binding the name to that index.\n\nEven though these two fields live in the VM, the compiler creates them at\ncompile time. 
You can think of it sort of like statically allocating memory for\nthe globals. We actually store the values in the VM so that they persist across\nmultiple REPL entries. We need to store the name association there too so that\nwe can find existing global variables.\n\n`UNDEFINED_VAL` is a new, separate singleton value like `nil`. It's used to\nmark a global variable slot as not having been defined yet. We can't use `nil`\nbecause `nil` is a valid value to store in a variable.\n\nAt runtime, the instructions work like so:\n\n```c\n      case OP_GET_GLOBAL: {\n        Value value = vm.globalValues.values[READ_BYTE()];\n        if (IS_UNDEFINED(value)) {\n          runtimeError(\"Undefined variable.\");\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        push(value);\n        break;\n      }\n\n      case OP_DEFINE_GLOBAL: {\n        vm.globalValues.values[READ_BYTE()] = pop();\n        break;\n      }\n\n      case OP_SET_GLOBAL: {\n        uint8_t index = READ_BYTE();\n        if (IS_UNDEFINED(vm.globalValues.values[index])) {\n          runtimeError(\"Undefined variable.\");\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        vm.globalValues.values[index] = peek(0);\n        break;\n      }\n```\n\nThe operand for the instructions is now the direct index of the global variable\nin the `globalValues` array. We've looked up the slot at compile time and\nbound the result, so at runtime we don't need to worry about the name at all.\nThis is much faster. The only perf hit we take now is the necessary check at\nruntime to ensure the variable has been initialized.\n\n## 3\n\nThis question is more subtle than it may seem.\n\nThe seemingly safe answer is to say that using a variable that is never defined\nanywhere is clearly wrong code, so it should be an error. That's a reasonable\nchoice.\n\nBut when you're in the middle of refactoring a large program, you sometimes have\ncode in a known broken state. 
As long as the broken code isn't *called*, it\nmight be nice to let the user run the other parts of the program that are OK.\n\nYou could try to have your cake and eat it too by making a reference to an\nundeclared variable be a *warning*. That usually means the language reports it\nas an error but still allows the program to be run. That works too, but in\npractice, having shades of gray in your error reporting tends to cause user\nheadaches.\n\nSome teams will want things to be black and white by turning all warnings into\nerrors, which sacrifices the ability you were trying to provide. Meanwhile,\nother teams have the bad habit of committing code containing unfixed warnings,\nleading to gradually worsening code. You will likely end up in long arguments\nabout which diagnostics should be considered fatal errors and which mere\nwarnings. People have strangely strong opinions about this stuff.\n\nPersonally, I'm pretty error-prone and like tools and languages to help me catch\nmy mistakes, so I'd like it to tell me if there's a use of an undeclared\nvariable name. If I'm in the middle of refactoring a big codebase, I'm OK with\nhaving to comment out large regions of it to temporarily silence errors. But\nthat's just me.\n"
  },
  {
    "path": "note/answers/chapter23_jumping/1.md",
    "content": "\nAdd `TOKEN_CASE`, `TOKEN_COLON`, `TOKEN_DEFAULT`, and `TOKEN_SWITCH` to\nTokenType and then implement scanning `:`, `case`, `default`, and `switch` in\nthe scanner. Not shown here because it's not very interesting.\n\nMost of the work is in the compiler. In `statement()`, add:\n\n```c\n  } else if (match(TOKEN_SWITCH)) {\n    switchStatement();\n```\n\nThen here's the main thing:\n\n```c\n#define MAX_CASES 256\n\nstatic void switchStatement() {\n  consume(TOKEN_LEFT_PAREN, \"Expect '(' after 'switch'.\");\n  expression();\n  consume(TOKEN_RIGHT_PAREN, \"Expect ')' after value.\");\n  consume(TOKEN_LEFT_BRACE, \"Expect '{' before switch cases.\");\n\n  int state = 0; // 0: before all cases, 1: before default, 2: after default.\n  int caseEnds[MAX_CASES];\n  int caseCount = 0;\n  int previousCaseSkip = -1;\n\n  while (!match(TOKEN_RIGHT_BRACE) && !check(TOKEN_EOF)) {\n    if (match(TOKEN_CASE) || match(TOKEN_DEFAULT)) {\n      TokenType caseType = parser.previous.type;\n\n      if (state == 2) {\n        error(\"Can't have another case or default after the default case.\");\n      }\n\n      if (state == 1) {\n        // At the end of the previous case, jump over the others.\n        caseEnds[caseCount++] = emitJump(OP_JUMP);\n\n        // Patch its condition to jump to the next case (this one).\n        patchJump(previousCaseSkip);\n        emitByte(OP_POP);\n      }\n\n      if (caseType == TOKEN_CASE) {\n        state = 1;\n\n        // See if the case is equal to the value.\n        emitByte(OP_DUP);\n        expression();\n\n        consume(TOKEN_COLON, \"Expect ':' after case value.\");\n\n        emitByte(OP_EQUAL);\n        previousCaseSkip = emitJump(OP_JUMP_IF_FALSE);\n\n        // Pop the comparison result.\n        emitByte(OP_POP);\n      } else {\n        state = 2;\n        consume(TOKEN_COLON, \"Expect ':' after default.\");\n        previousCaseSkip = -1;\n      }\n    } else {\n      // Otherwise, it's a statement inside the 
current case.\n      if (state == 0) {\n        error(\"Can't have statements before any case.\");\n      }\n      statement();\n    }\n  }\n\n  // If we ended without a default case, patch its condition jump.\n  if (state == 1) {\n    patchJump(previousCaseSkip);\n    emitByte(OP_POP);\n  }\n\n  // Patch all the case jumps to the end.\n  for (int i = 0; i < caseCount; i++) {\n    patchJump(caseEnds[i]);\n  }\n\n  emitByte(OP_POP); // The switch value.\n}\n```\n\nThe `==` operator pops its operands. In order to repeatedly compare the switch\nvalue to each case, we need to keep it around, so before we compare a case, we\npush a copy of the switch value using a new `OP_DUP` (for \"duplicate\")\ninstruction.\n\nAdd `OP_DUP` to OpCode. In the VM, its implementation is simply:\n\n```c\ncase OP_DUP: push(peek(0)); break;\n```\n\nGiven all that, if you compile:\n\n```lox\nswitch (2) {\ncase 1:\n  print(\"one\");\ncase 2:\n  print(\"two\");\ncase 3:\n  print(\"three\");\ndefault:\n  print(\"default\");\n}\nprint(\"after\");\n```\n\nThen it generates:\n\n```\n    0000    1 OP_CONSTANT         0 '2'\n    0002    2 OP_DUP\n    0003    | OP_CONSTANT         1 '1'\n    0005    | OP_EQUAL\n.-- 0006    | OP_JUMP_IF_FALSE    6 -> 16\n|   0009    | OP_POP\n|   0010    3 OP_CONSTANT         2 'one'\n|   0012    | OP_PRINT\n|   0013    4 OP_JUMP            13 -> 50 ------.\n'-> 0016    | OP_POP                            |\n    0017    | OP_DUP                            |\n    0018    | OP_CONSTANT         3 '2'         |\n    0020    | OP_EQUAL                          |\n.-- 0021    | OP_JUMP_IF_FALSE   21 -> 31       |\n|   0024    | OP_POP                            |\n|   0025    5 OP_CONSTANT         4 'two'       |\n|   0027    | OP_PRINT                          |\n|   0028    6 OP_JUMP            28 -> 50 ------|\n'-> 0031    | OP_POP                            |\n    0032    | OP_DUP                            |\n    0033    | OP_CONSTANT         5 '3'         |\n    0035    | 
OP_EQUAL                          |\n.-- 0036    | OP_JUMP_IF_FALSE   36 -> 46       |\n|   0039    | OP_POP                            |\n|   0040    7 OP_CONSTANT         6 'three'     |\n|   0042    | OP_PRINT                          |\n|   0043    8 OP_JUMP            43 -> 50 ------|\n'-> 0046    | OP_POP                            |\n    0047    9 OP_CONSTANT         7 'default'   |\n    0049    | OP_PRINT                          |\n.-----------------------------------------------'\n'-> 0050   10 OP_POP\n    0051   11 OP_CONSTANT         8 'after'\n    0053    | OP_PRINT\n    0054   13 OP_RETURN\n```\n\nThere are a few interesting design questions to think about:\n\n*   Can you have declarations inside a case? If so, what is their scope? I said\n    no. You can introduce a block if you want them.\n\n*   Can you have a switch with no cases? I allow this.\n\n*   Can you have a switch with only a default? I allow this too.\n\nFor all of these, I just picked the simplest-to-implement choice. In a real\nimplementation, I probably would allow variables, scoped to the current case. I\nwould forbid empty or default-only switches because they clearly aren't useful.\n"
  },
  {
    "path": "note/answers/chapter23_jumping/2.md",
    "content": "Add `TOKEN_CONTINUE` to TokenType and then implement scanning the `continue`\nkeyword. Not shown here because it's not very interesting.\n\nMost of the work is in the compiler. First, we need two mode global variables:\n\n```c\nint innermostLoopStart = -1;\nint innermostLoopScopeDepth = 0;\n```\n\nThese keep track of the point that a `continue` statement should jump to, and\nthe scope of the variables declared inside the loop.\n\nWe change `forStatement()` to keep track of those (and restore their previous\nvalues in the case of a nested loop:\n\n```c\nstatic void forStatement() {\n  beginScope();\n\n  consume(TOKEN_LEFT_PAREN, \"Expect '(' after 'for'.\");\n  if (match(TOKEN_VAR)) {\n    varDeclaration();\n  } else if (match(TOKEN_SEMICOLON)) {\n    // No initializer.\n  } else {\n    expressionStatement();\n  }\n\n  int surroundingLoopStart = innermostLoopStart; // <--\n  int surroundingLoopScopeDepth = innermostLoopScopeDepth; // <--\n  innermostLoopStart = currentChunk()->count; // <--\n  innermostLoopScopeDepth = current->scopeDepth; // <--\n\n  int exitJump = -1;\n  if (!match(TOKEN_SEMICOLON)) {\n    expression();\n    consume(TOKEN_SEMICOLON, \"Expect ';' after loop condition.\");\n\n    // Jump out of the loop if the condition is false.\n    exitJump = emitJump(OP_JUMP_IF_FALSE);\n    emitByte(OP_POP); // Condition.\n  }\n\n  if (!match(TOKEN_RIGHT_PAREN)) {\n    int bodyJump = emitJump(OP_JUMP);\n\n    int incrementStart = currentChunk()->count;\n    expression();\n    emitByte(OP_POP);\n    consume(TOKEN_RIGHT_PAREN, \"Expect ')' after for clauses.\");\n\n    emitLoop(innermostLoopStart); // <--\n    innermostLoopStart = incrementStart; // <--\n    patchJump(bodyJump);\n  }\n\n  statement();\n\n  emitLoop(innermostLoopStart); // <--\n\n  if (exitJump != -1) {\n    patchJump(exitJump);\n    emitByte(OP_POP); // Condition.\n  }\n\n  innermostLoopStart = surroundingLoopStart; // <--\n  innermostLoopScopeDepth = surroundingLoopScopeDepth; // 
<--\n\n  endScope();\n}\n```\n\nNow we're ready to implement `continue`. In `statement()`, add:\n\n```c\n  } else if (match(TOKEN_CONTINUE)) {\n    continueStatement();\n```\n\nThat calls:\n\n```c\nstatic void continueStatement() {\n  if (innermostLoopStart == -1) {\n    error(\"Can't use 'continue' outside of a loop.\");\n  }\n\n  consume(TOKEN_SEMICOLON, \"Expect ';' after 'continue'.\");\n\n  // Discard any locals created inside the loop.\n  for (int i = current->localCount - 1;\n       i >= 0 && current->locals[i].depth > innermostLoopScopeDepth;\n       i--) {\n    emitByte(OP_POP);\n  }\n\n  // Jump to top of current innermost loop.\n  emitLoop(innermostLoopStart);\n}\n```\n"
  },
  {
    "path": "note/answers/chapter23_jumping/3.md",
    "content": "Reusing [an old StackOverflow answer of mine][answer]:\n\n[answer]: https://stackoverflow.com/a/4296080/9457\n\nMost languages have built-in functions to cover the common cases, but\n\"fencepost\" loops are always a chore: loops where you want to do something on\neach iteration and also do something else between iterations. For example,\njoining strings with a separator:\n\n```\nString result = \"\";\nfor (int i = 0; i < items.Count; i++) {\n  result += items[i];\n  if (i < items.Count - 1) result += \", \"; // This is gross.\n  // What if I can't access items by index?\n  // I have off-by-one errors *every* time I do this.\n}\n```\n\nI know folds can cover this case, but sometimes you want something imperative.\nIt would be cool if you could do:\n\n```\nString result = \"\";\nfor (var item in items) {\n  result += item;\n} between {\n  result += \", \";\n}\n```\n"
  },
  {
    "path": "note/answers/chapter24_calls/1.md",
    "content": "Since our interpreter is so small, the change is pretty straightforward. First,\nwe declare a local variable for the `ip` of the current CallFrame:\n\n```c\nstatic InterpretResult run() {\n  CallFrame* frame = &vm.frames[vm.frameCount - 1];\n  register uint8_t* ip = frame->ip; // <-- Add.\n```\n\nWe replace the macros to read from that:\n\n```c\n#define READ_BYTE() (*ip++)\n#define READ_SHORT() \\\n    (ip += 2, (uint16_t)((ip[-2] << 8) | ip[-1]))\n```\n\nThen the jump instructions write to it:\n\n```c\n      case OP_JUMP: {\n        uint16_t offset = READ_SHORT();\n        ip += offset;\n        break;\n      }\n\n      case OP_JUMP_IF_FALSE: {\n        uint16_t offset = READ_SHORT();\n        if (isFalsey(peek(0))) ip += offset;\n        break;\n      }\n\n      case OP_LOOP: {\n        uint16_t offset = READ_SHORT();\n        ip -= offset;\n        break;\n      }\n```\n\nCache invalidation is the harder part. Before a call, we store the `ip` back\ninto the frame in case the call pushes a new frame. Then we load the `ip` of\nthe new frame once the call has pushed it:\n\n```c\n      case OP_CALL: {\n        int argCount = READ_BYTE();\n        frame->ip = ip; // <-- Add.\n        if (!callValue(peek(argCount), argCount)) {\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        frame = &vm.frames[vm.frameCount - 1];\n        ip = frame->ip; // <-- Add.\n        break;\n      }\n```\n\nLikewise, on a return, we need to reload the `ip` of the CallFrame we're\nreturning to:\n\n```c\n        frame = &vm.frames[vm.frameCount - 1];\n        ip = frame->ip; // <-- Add.\n        break;\n```\n\nThe last place that `ip` is used is in `runtimeError()`. 
We need to ensure\nevery code path that calls `runtimeError()` from `run()` stores the `ip` first.\nThe runtime errors that are the result of bad calls are handled already, so it's\njust the other instructions:\n\n```c\n#define BINARY_OP(valueType, op) \\\n    do { \\\n      if (!IS_NUMBER(peek(0)) || !IS_NUMBER(peek(1))) { \\\n        frame->ip = ip; /* <-- Add. */ \\\n        runtimeError(\"Operands must be numbers.\"); \\\n        return INTERPRET_RUNTIME_ERROR; \\\n      } \\\n      \\\n      double b = AS_NUMBER(pop()); \\\n      double a = AS_NUMBER(pop()); \\\n      push(valueType(a op b)); \\\n    } while (false)\n\n// ...\n\n      case OP_GET_GLOBAL: {\n        ObjString* name = READ_STRING();\n        Value value;\n        if (!tableGet(&vm.globals, name, &value)) {\n          frame->ip = ip; // <-- Add.\n          runtimeError(\"Undefined variable '%s'.\", name->chars);\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        push(value);\n        break;\n      }\n\n// ...\n\n      case OP_SET_GLOBAL: {\n        ObjString* name = READ_STRING();\n        if (tableSet(&vm.globals, name, peek(0))) {\n          tableDelete(&vm.globals, name);\n          frame->ip = ip; // <-- Add.\n          runtimeError(\"Undefined variable '%s'.\", name->chars);\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        break;\n      }\n\n// ...\n\n      case OP_ADD: {\n        if (IS_STRING(peek(0)) && IS_STRING(peek(1))) {\n          concatenate();\n        } else if (IS_NUMBER(peek(0)) && IS_NUMBER(peek(1))) {\n          double b = AS_NUMBER(pop());\n          double a = AS_NUMBER(pop());\n          push(NUMBER_VAL(a + b));\n        } else {\n          frame->ip = ip; // <-- Add.\n          runtimeError(\"Operands must be two numbers or two strings.\");\n          return INTERPRET_RUNTIME_ERROR;\n        }\n        break;\n      }\n\n// ...\n\n      case OP_NEGATE:\n        if (!IS_NUMBER(peek(0))) {\n          frame->ip = ip; // <-- Add.\n          
runtimeError(\"Operand must be a number.\");\n          return INTERPRET_RUNTIME_ERROR;\n        }\n\n        push(NUMBER_VAL(-AS_NUMBER(pop())));\n        break;\n```\n\nNote that in all of these cases, the code to store the `ip` is only executed\n*after* we're sure a runtime error will occur. That avoids wasting cycles\nstoring it when not necessary.\n\nOn my machine, this reduces the execution time of a simple Fibonacci benchmark\nby about 8.5%. That doesn't sound like a huge amount, but many language\nimplementers would be thrilled to find an optimization that juicy. If you run\nthe VM in a profiler, you'll see a good chunk of the execution time is spent\nlooking up `fib` in the global variable hash table, so speeding up calls is only\ngoing to buy us so much.\n\nI definitely think this is worth it.\n"
  },
  {
    "path": "note/answers/chapter24_calls/2.md",
    "content": "There are a few ways you can do this. The interesting part is that the native\nC function needs to have sort of two signal paths to get data back to the VM:\nit needs to be able to return a Value when successful, and it needs a separate\nway to indicate a runtime error.\n\nI think a clean way is to use the `args` array as both an input and output to\nthe native function. The function will read arguments from that and write the\nresult value to it when successful. Right now, `args` points to the first\nargument. After a call completes, the return value is expected to be at the\nslot just before that, which currently contains the function itself. So we'll\nsay that a native function is expected to store the return value in `args[-1]`.\n\nThen the return value of the C function itself can be used to indicate success\nor failure:\n\n```c\ntypedef bool (*NativeFn)(int argCount, Value* args);\n```\n\nSo the `clock()` native function becomes this:\n\n```c\nstatic bool clockNative(int argCount, Value* args) {\n  args[-1] = NUMBER_VAL((double)clock() / CLOCKS_PER_SEC);\n  return true;\n}\n```\n\nIf a native function does fail, it would be nice to print a runtime error, so\nwe'll let it store a string in `args[-1]` for an error message to print. Here's\none that always fails:\n\n```c\nstatic bool errNative(int argCount, Value* args) {\n  args[-1] = OBJ_VAL(copyString(\"Error!\", 6));\n  return false;\n}\n```\n\nThe VM needs to handle this new calling convention. In `callValue()`, the new\ncode looks like this:\n\n```c\n      case OBJ_NATIVE: {\n        NativeFn native = AS_NATIVE(callee);\n        if (native(argCount, vm.stackTop - argCount)) {\n          vm.stackTop -= argCount;\n          return true;\n        } else {\n          runtimeError(AS_STRING(vm.stackTop[-argCount - 1])->chars);\n          return false;\n        }\n      }\n```\n\nIn some ways, the code is simpler. 
Instead of getting the return value from the\nC function and pushing it onto the stack, this simply discards all but one of\nthe stack slots. Since the return value is already there at slot zero, that\nleaves it right on top with no extra work.\n\nBut the `if` statement to see if the call succeeded is expensive. Inserting some\ncontrol flow on a critical path like this is always a performance hit. On my\nlaptop, this change makes the Fibonacci benchmark about 25% slower, even though\nno actual runtime errors ever occur.\n\nThat's the price you pay for a robust VM, I guess.\n"
  },
  {
    "path": "note/answers/chapter25_closures/1.md",
    "content": "One could spend a lot of time tweaking this and optimizing. Here's a simple\nimplementation. First, in the compiler we need to not emit `OP_CLOSURE` and the\nsubsequent operands if there are no upvalues. Instead, we just emit an\n`OP_CONSTANT` to load the function like we did before we had closures.\n\n```c\n  // Create the function object.\n  ObjFunction* function = endCompiler();\n  // Remove 7 lines and add:\n  uint8_t functionConstant = makeConstant(OBJ_VAL(function));\n  if (function->upvalueCount > 0) {\n    emitBytes(OP_CLOSURE, functionConstant);\n\n    // Emit arguments for each upvalue to know whether to capture a local\n    // or an upvalue.\n    for (int i = 0; i < function->upvalueCount; i++) {\n      emitByte(compiler.upvalues[i].isLocal ? 1 : 0);\n      emitByte(compiler.upvalues[i].index);\n    }\n  } else {\n    // No need to create a closure.\n    emitBytes(OP_CONSTANT, functionConstant);\n  }\n  // End.\n}\n```\n\nIn the VM, we first need to change CallFrame. We can't rely on the current\nfunction always being an ObjClosure:\n\n```c\ntypedef struct {\n  // Remove 1 line and add:\n  Obj* function;\n  // End.\n  uint8_t* ip;\n  Value* slots;\n} CallFrame;\n```\n\nWe store it as an `Obj*` since it may be either an ObjClosure or ObjFunction.\nSince Obj contains the type type, we can use that at runtime to see which kind\nof function we have.\n\nOver in the implementation, add:\n\n```c\nstatic inline ObjFunction* getFrameFunction(CallFrame* frame) {\n  if (frame->function->type == OBJ_FUNCTION) {\n    return (ObjFunction*)frame->function;\n  } else {\n    return ((ObjClosure*)frame->function)->function;\n  }\n}\n```\n\nAccessing the underlying ObjFunction for a given CallFrame requires some\nconditional logic. 
We need to do this in a couple of places, including macros,\nso I wrapped it in a function that the compiler will hopefully inline for us.\n\nIn `runtimeError()`, replace:\n\n```c\n    ObjFunction* function = frame->closure->function;\n```\n\nWith:\n\n```c\n    ObjFunction* function = getFrameFunction(frame);\n```\n\nIn `callValue()`, we need to handle both kinds of callable objects. There are\na few ways to do this, but I split `call()` into two functions:\n\n```c\n      case OBJ_CLOSURE:\n        return callClosure(AS_CLOSURE(callee), argCount);\n      case OBJ_FUNCTION:\n        return callFunction(AS_FUNCTION(callee), argCount);\n```\n\nDelete the old `call()` and replace it with:\n\n```c\nstatic bool call(Obj* callee, ObjFunction* function, int argCount) {\n  if (argCount != function->arity) {\n    runtimeError(\"Expected %d arguments but got %d.\",\n        function->arity, argCount);\n    return false;\n  }\n\n  if (vm.frameCount == FRAMES_MAX) {\n    runtimeError(\"Stack overflow.\");\n    return false;\n  }\n\n  CallFrame* frame = &vm.frames[vm.frameCount++];\n  frame->function = (Obj*)callee;\n  frame->ip = function->chunk.code;\n\n  frame->slots = vm.stackTop - argCount - 1;\n  return true;\n}\n\nstatic bool callClosure(ObjClosure* closure, int argCount) {\n  return call((Obj*)closure, closure->function, argCount);\n}\n\nstatic bool callFunction(ObjFunction* function, int argCount) {\n  return call((Obj*)function, function, argCount);\n}\n```\n\nMost of the code is the same, but we have to jump through a few hoops to handle\nthe level of indirection in ObjClosure.\n\nI did a little benchmarking. On our old fib program that doesn't use any\nclosures, this change makes it a few percent slower. Unsurprising because\nthere's a little more conditional logic when accessing the function from a\nCallFrame. 
I was actually surprised there wasn't a bigger performance cost.\n\nThen I made a little synthetic benchmark to stress closure creation:\n\n```\nfor (var i = 0; i < 10; i = i + 1) {\n  var start = clock();\n  var sum = 0;\n  for (var j = 0; j < 1000000; j = j + 1) {\n    fun outer(a, b, c) {\n      fun inner() {\n        return a + b + c;\n      }\n      return inner;\n    }\n\n    var closure = outer(j, j, j);\n    sum = sum + closure();\n  }\n\n  print sum;\n  print clock() - start;\n}\n```\n\nThis program is obviously pathological. Real code rarely creates so many\nfunctions and closures. But on this program, there was a significant improvement\nwith the new code. About 24% faster. I think most of this is because we don't\nhave to create a closure for each declaration of `outer()`.\n\nOverall, I'm not sure if this optimization is worth it. I'd want to try it on\nreal-world code that uses closures in an idiomatic way.\n"
  },
  {
    "path": "note/answers/chapter25_closures/2.md",
    "content": "This took me quite a while to get working, even though the end result is pretty\nsimple. I wandered down a few dead ends before I picked the right path.\n\nThe basic idea is pretty simple:\n\n1.  Right before compile the body of the loop, create a new scope with a local\n    variable that shadows the loop variable. Initialize that variable with the\n    loop variable's current value.\n\n2.  Compile the loop body. This way, if a closure happens to reference the loop\n    variable, it will resolve to that inner shadowed one.\n\n3.  Store the current value of that inner variable back in the outer one it\n    shadows. This is important so that any explicit modifications to the loop\n    variable inside the body correctly affect the loop condition and increment\n    clauses. Otherwise, this loop will never exit:\n\n    ```lox\n    for (var i = 0; i < 10; ) {\n      i = i + 1;\n    }\n    ```\n\n4.  After the body, end the scope where the inner variable is declared. If it\n    got captured by the closure, this will close its upvalue and capture the\n    current value of it.\n\nHere's the entire resulting function, with comments marking the changes,\nnumbered by which point about the correspond to:\n\n```c\nstatic void forStatement() {\n  beginScope();\n\n  // 1: Grab the name and slot of the loop variable so we can refer to it later.\n  int loopVariable = -1;\n  Token loopVariableName;\n  loopVariableName.start = NULL;\n  // end.\n\n  consume(TOKEN_LEFT_PAREN, \"Expect '(' after 'for'.\");\n  if (match(TOKEN_VAR)) {\n    // 1: Grab the name of the loop variable.\n    loopVariableName = parser.current;\n    // end.\n    varDeclaration();\n    // 1: And get its slot.\n    loopVariable = current->localCount - 1;\n    // end.\n  } else if (match(TOKEN_SEMICOLON)) {\n    // No initializer.\n  } else {\n    expressionStatement();\n  }\n\n  int loopStart = currentChunk()->count;\n\n  int exitJump = -1;\n  if (!match(TOKEN_SEMICOLON)) {\n    expression();\n    
consume(TOKEN_SEMICOLON, \"Expect ';' after loop condition.\");\n\n    // Jump out of the loop if the condition is false.\n    exitJump = emitJump(OP_JUMP_IF_FALSE);\n    emitByte(OP_POP); // Condition.\n  }\n\n  if (!match(TOKEN_RIGHT_PAREN)) {\n    int bodyJump = emitJump(OP_JUMP);\n\n    int incrementStart = currentChunk()->count;\n    expression();\n    emitByte(OP_POP);\n    consume(TOKEN_RIGHT_PAREN, \"Expect ')' after for clauses.\");\n\n    emitLoop(loopStart);\n    loopStart = incrementStart;\n    patchJump(bodyJump);\n  }\n\n  // 1: If the loop declares a variable...\n  int innerVariable = -1;\n  if (loopVariable != -1) {\n    // 1: Create a scope for the copy...\n    beginScope();\n    // 1: Define a new variable initialized with the current value of the loop\n    //    variable.\n    emitBytes(OP_GET_LOCAL, (uint8_t)loopVariable);\n    addLocal(loopVariableName);\n    markInitialized();\n    // 1: Keep track of its slot.\n    innerVariable = current->localCount - 1;\n  }\n  // end.\n\n  statement();\n\n  // 3: If the loop declares a variable...\n  if (loopVariable != -1) {\n    // 3: Store the inner variable back in the loop variable.\n    emitBytes(OP_GET_LOCAL, (uint8_t)innerVariable);\n    emitBytes(OP_SET_LOCAL, (uint8_t)loopVariable);\n    emitByte(OP_POP);\n\n    // 4: Close the temporary scope for the copy of the loop variable.\n    endScope();\n  }\n\n  emitLoop(loopStart);\n\n  if (exitJump != -1) {\n    patchJump(exitJump);\n    emitByte(OP_POP); // Condition.\n  }\n\n  endScope();\n}\n```"
  },
  {
    "path": "note/answers/chapter25_closures/3.lox",
    "content": "// Here is the classic message-based pattern:\nfun vector(x, y) {\n  fun object(message) {\n    fun add(other) {\n      return vector(x + other(\"x\"), y + other(\"y\"));\n    }\n\n    if (message == \"x\") return x;\n    if (message == \"y\") return y;\n    if (message == \"add\") return add;\n    print \"unknown message\";\n  }\n\n  return object;\n}\n\nvar a = vector(1, 2);\nvar b = vector(3, 4);\nvar c = a(\"add\")(b);\nprint c(\"x\");\nprint c(\"y\");\n\n// The constructor, \"vector()\" returns a closure that closes over the object's\n// fields. In this case, it's the \"x\" and \"y\" parameters. The closure accepts a\n// single argument which is the string name of the \"method\" to invoke on the\n// object. It supports three methods:\n//\n// \"x\" returns the vector's X coordinate. Likewise \"y\". \"add\" returns a second\n// function, which is the add method. That function in turn accepts an argument\n// for the other vector to add to it.\n"
  },
  {
    "path": "note/answers/chapter26_garbage/1.md",
    "content": "On my 64-bit Mac laptop, it takes 16 bytes or 128 bits. That's quite a lot for a\npointer, a Boolean, and an enum with only eight cases (once we add the couple of\nremaining ones for classes and instances).\n\nIn principle all we need is 64 bits for the pointer, 1 bit for the mark, and\n3 bits for the type. And, in fact, most 64-bit operating systems don't give an\napplication a full 64 bits of address space. On x64 and ARM, a pointer will only\never use 48 of those bits.\n\nAlso, the pointer in our Obj header points to another Obj whose first field is\nalso a pointer. When allocating memory for objects, the OS will align them to a\n8-byte boundary. That implies the low three bits of the pointer will always be\nzero and there's really only 45 meaningful bits of pointer data.\n\nThus, the minimum we really need is 49 bits: 45 for the pointer, 1 for the mark\nbit, and 3 for the type enum. Because of alignment reasons, we won't be able to\nget that all the way down, so we'll round it to 64 bits. If we leave the\npointer bits where they normal are in there, that leaves two empty bytes at the\ntop and a few empty bits at the bottom.\n\nWe'll store the type enum in the highest byte, the mark bit in the next byte,\nand the next pointer in the remaining bits, like this:\n\n```\n00000000 00000000 01111111 11010110 01001111 01010000 00000000 01100000\nBit position:\n66665555 55555544 44444444 33333333 33222222 22221111 11111100 00000000\n32109876 54321098 76543210 98765432 10987654 32109876 54321098 76543210\n\nBits needed for pointer:\n........ ........ 
|------- -------- -------- ------- --------- ----|...\n\nPacking everything in:\n.....TTT .......M NNNNNNNN NNNNNNNN NNNNNNNN NNNNNNNN NNNNNNNN NNNNNNNN\n\nT = type enum, M = mark bit, N = next pointer.\n```\n\nTo implement this, we'll replace the old fields in Obj with a single 64-bit int:\n\n```c\nstruct sObj {\n  uint64_t header;\n};\n```\n\nBecause the values are all bit-packed together, simple field access no longer\nworks. Instead, we'll write accessor functions to pull the right bits out and\nconvert them back to their desired representation:\n\n```c\nstatic inline ObjType objType(Obj* object) {\n  return (ObjType)((object->header >> 56) & 0xff);\n}\n\nstatic inline bool isMarked(Obj* object) {\n  return (bool)((object->header >> 48) & 0x01);\n}\n\nstatic inline Obj* objNext(Obj* object) {\n  return (Obj*)(object->header & 0x0000ffffffffffff);\n}\n```\n\nThey're fairly straightforward. Each uses a bitwise and with a constant to mask\noff and clear the bits for the *other* fields, then shifts the remaining bits\ndown to where they belong for the desired type. For the next pointer, we don't\nneed to shift anything.\n\nSetting the fields a little more complex:\n\n```c\nstatic inline void setIsMarked(Obj* object, bool isMarked) {\n  object->header = (object->header & 0xff00ffffffffffff) |\n      ((uint64_t)isMarked << 48);\n}\n\nstatic inline void setObjNext(Obj* object, Obj* next) {\n  object->header = (object->header & 0xffff000000000000) |\n      (uint64_t)next;\n}\n```\n\nWe need to clear out the old value of the field and store the updated bits. But\nwe also need to preserve the bits for the *other* fields. 
So this time we mask\nand clear only the bits are updating and preserve the rest.\n\nWhen an object is first created, the mark bit is clear, and we have a type and\nnext pointer, so we initialized it like:\n\n```c\nobject->header = (unsigned long)vm.objects | (unsigned long)type << 56;\n```\n\nAll that remains is to replace every use of the old fields in the VM with calls\nto the above utility functions. That's mechanical so I won't write them all out\nhere. The end result is that we've cut the size of the object header in half.\n\nThere is some runtime expense when accessing fields now because of the masking\nand shifting. The next pointer and mark bits are only used during GC, so that's\nlikely not a large impact. Accessing the object's type is potentially more of an\nissue since that happens frequently during runtime. One option we could take is\nto store the type bits down in the least significant bits and shift the next\npointer up. That would let us access the type just by bitmasking without needing\na shift.\n"
  },
  {
    "path": "note/answers/chapter26_garbage/2.md",
    "content": "The basic idea is that instead of clearing the mark bit of every live object,\nwe simply redefine their current value to mean \"not marked\". In other words,\ninstead of \"true\" always meaning \"marked\", after each cycle, we toggle which\nBoolean value represents the marked state. Since every live object will have\nthe previous version's mark value, toggling the definition of marked instantly\nsets them all to unmarked.\n\nThe implementation is fairly straightforward. In the VM struct, we add a new\nfield to store the Boolean value that currently means \"marked\":\n\n```c\n  bool markValue;\n```\n\nIn `initVM()`, we initialize that to some value (it doesn't matter which):\n\n```c\n  vm.markValue = true;\n```\n\nOver in `sObj`, we rename the mark field from `isMarked` to `mark` to make it\nclearer that `true` doesn't necessarily mean it's marked:\n\n```\n  bool mark;\n```\n\nThen we go through all of the code that uses `isMarked` and update it to the\nnew semantics:\n\n\n```diff\n static Obj* allocateObject(size_t size, ObjType type) {\n   Obj* object = (Obj*)reallocate(NULL, 0, size);\n   object->type = type;\n-  object->isMarked = false;\n+  object->mark = !vm.markValue;\n\n   object->next = vm.objects;\n   vm.objects = object;\n```\n\nA new object starts off unmarked, so we initialize `mark` to the opposite of\nthe value that means \"marked\".\n\n```diff\n void markObject(Obj* object) {\n   if (object == NULL) return;\n-  if (object->isMarked) return;\n+  if (object->mark == vm.markValue) return;\n```\n\nTo see if an object is marked, we compare its mark value to the VM's. 
If they\nare the same, the object is marked.\n\nOtherwise, we mark it like so:\n\n```diff\n-  object->isMarked = true;\n+  object->mark = vm.markValue;\n```\n\nWhen removing the weak references from the string table, we also check the mark\nbit:\n\n```diff\n void tableRemoveWhite(Table* table) {\n   for (int i = 0; i < table->capacity; i++) {\n     Entry* entry = &table->entries[i];\n-    if (entry->key != NULL && !entry->key->obj.isMarked) {\n+    if (entry->key != NULL && entry->key->obj.mark != vm.markValue) {\n       tableDelete(table, entry->key);\n     }\n   }\n```\n\nOver in `sweep()`, we compare against the VM's mark value to check each object's\nmark state:\n\n```diff\n   Obj* previous = NULL;\n   Obj* object = vm.objects;\n   while (object != NULL) {\n-    if (object->isMarked) {\n-      object->isMarked = false;\n+    if (object->mark == vm.markValue) {\n       previous = object;\n       object = object->next;\n     } else {\n```\n\nThe whole point of this change is that other removed line. We no longer need to\nclear the mark bit on each live object.\n\nFinally, when `collectGarbage()` completes, we flip which value means \"marked\":\n\n```diff\n   sweep();\n\n   vm.nextGC = vm.bytesAllocated * GC_HEAP_GROW_FACTOR;\n-\n+  vm.markValue = !vm.markValue;\n```\n\nThis way, every object's current mark value now means \"unmarked\". OK, so what's\nthe performance gain here? On my laptop, with one little microbenchmark...\nalmost none. It was slightly faster, but small enough to be within the noise.\nDoes that mean this is a bad technique? It's hard to say. It might make a bigger\ndifference on other benchmarks or other machines.\n"
  },
  {
    "path": "note/answers/chapter27_classes/1.md",
    "content": "In Ruby, if you access an instance variable that you never defined, you silently\nget `nil` in return. It's as if the object has all fields and they are\nimplicitly initialized to `nil` for you.\n\nIf you want to explicitly check to see if an instance variable is defined, you\ncan call a special `instance_variable_defined?()` method on the object, passing\nin the name of the instance variable as a string or symbol:\n\n```ruby\nsome_object.instance_variable_defined?(\"field_name\")\n```\n\nJavaScript works somewhat like Ruby. If you access a property on an object that\nwas never set, you get an implicit sentinel value back. To make things more\nconfusing, JavaScript has *two* special \"absent\" values: `null` and `undefined`.\nWhen you access an undefined field, you get `undefined` back. You can think of\n`null` as the \"application-level\" absent value that users can define to mean\nwhat they want in their program. `undefined` is more like a \"system-level\"\nabsent value that gets returned from some built-in language semantics like\naccessing an undefined field.\n\nTo tell if a property is present on the object, you can call `hasOwnProperty()`\non it, passing in the name of the property as a string.\n\nPython takes a stricter approach. Accessing a non-existent object attribute\nthrows an exception. You can catch this if you want to handle the absent field\ndirectly. To determine whether a field exists *before* an exception gets thrown,\nyou can a special top-level function `hasattr()`, passing in the object in\nquestion and the name of the attribute.\n\nIn statically-typed languages, of course, it is a compile-time error to access\na field on defined for an object.\n\nIn other words, there are basically two dynamic approaches to handling accessing\nundefined fields:\n\n1. Return a special sentinel value like `nil`.\n2. Produce a runtime error.\n\nFor Lox, the former feels too loose to me. 
Lox is generally stricter around\nthings like missing function arguments, and I think it should be strict here\ntoo. At the same time, Lox lacks exceptions or a way for user to handle runtime\nerrors so we need to take that into account.\n\nIf users have a way to *detect* an absent field before trying to access it,\nthen it's fine for the language to abort on undefined field access -- users can\navoid that by checking beforehand. So I think that's the approach I'd take for\nLox.\n\nWe'll add a global `hasField()` native function that takes an instance and a\nfield name and returns `true` if the field is defined on the instance. Here is\nan implementation:\n\n```c\nstatic Value hasFieldNative(int argCount, Value* args) {\n  if (argCount != 2) return FALSE_VAL;\n  if (!IS_INSTANCE(args[0])) return FALSE_VAL;\n  if (!IS_STRING(args[1])) return FALSE_VAL;\n\n  ObjInstance* instance = AS_INSTANCE(args[0]);\n  Value dummy;\n  return BOOL_VAL(tableGet(&instance->fields, AS_STRING(args[1]), &dummy));\n}\n```\n\nThe error-checking at the top is lame. Right now, the VM doesn't support\nnative functions producing runtime errors, so it just returns `false` if you\npass invalid arguments. Ideally, those would be runtime errors.\n\nWe define it when the VM starts up by adding this to `initVM()`:\n\n```c\n  defineNative(\"hasField\", hasFieldNative);\n```\n"
  },
  {
    "path": "note/answers/chapter27_classes/2.md",
    "content": "I am actually iffy on whether a language should allow this, or at least whether\nit should make accessing fields using imperatively-built strings should be\n*easy*.\n\nThat's really something like a metaprogramming feature. Users are writing code\nthat builds almost a tiny piece of \"code\" -- a field name -- and then executing\nthat. Metaprogramming is useful, but I think it should be clear to users when\nthey are doing it.\n\nJavaScript tried to merge instances and data structures into a single \"object\"\nconcept and the result was a mess. People would try to use normal JavaScript\nobjects as hash tables, which JS encourages by putting a `[]` operator right on\nobjects that let you pass in string for field names. Then they would get very\nsurprised when their \"hash table\" happened to contain \"keys\" like `toString`.\n\nI think it's better to keep objects and data structures stratified, and likewise\nto keep regular programming and metaprogramming clearly distinguished. That\nsaid, I do think it's useful to offer metaprogramming.\n\nA simple way to offer the functionality but make users go out of their way to\nget it is by using a top-level function instead of hanging some kind of operator\nsyntax right off the instance. (An even more explicit approach is to put those\nfunctions in a separate \"reflection\" module users have to import, but Lox\ndoesn't have any modularity story.)\n\nSo let's add two new functions `getField()` and `setField()`. The first takes\nan instance and a field name string. 
The second takes those plus a value to\nstore.\n\nThey are implemented like so:\n\n```c\nstatic Value getFieldNative(int argCount, Value* args) {\n  if (argCount != 2) return FALSE_VAL;\n  if (!IS_INSTANCE(args[0])) return FALSE_VAL;\n  if (!IS_STRING(args[1])) return FALSE_VAL;\n\n  ObjInstance* instance = AS_INSTANCE(args[0]);\n  Value value;\n  tableGet(&instance->fields, AS_STRING(args[1]), &value);\n  return value;\n}\n\nstatic Value setFieldNative(int argCount, Value* args) {\n  if (argCount != 3) return FALSE_VAL;\n  if (!IS_INSTANCE(args[0])) return FALSE_VAL;\n  if (!IS_STRING(args[1])) return FALSE_VAL;\n\n  ObjInstance* instance = AS_INSTANCE(args[0]);\n  tableSet(&instance->fields, AS_STRING(args[1]), args[2]);\n  return args[2];\n}\n```\n\nLike I said in answer #1, the error-handling in these is lame. Ideally, they\nwould abort with a runtime error if the arguments were incorrect.\n\nLikewise, calling `getField()` when the instance doesn't have that field should\nbe a runtime error, but here is just returns `nil`.\n\nThese get declared as top level functions by adding this to `initVM()`:\n\n```c\n  defineNative(\"getField\", getFieldNative);\n  defineNative(\"setField\", setFieldNative);\n```\n"
  },
  {
    "path": "note/answers/chapter27_classes/3.md",
    "content": "Ruby provides a private method, `remove_instance_variable` that an object can\ncall on itself passing in the name of the instance vairable to delete. Ruby is\ninteresting in that it has the model that accessing an undefined instance\nvariable returns `nil`. But it it still makes a distinction between a deleted\ninstance variable and an instance variable whose value has been set to `nil`.\nIf you use `defined?` to tell if the instance variable exists, one whose value\nis `nil` does exist, while a deleted one does not.\n\nLua has, I think, a more consistent model. Accessing a non-existent table key\n-- Lua's rough analogue to fields -- returns `nil`. And there is no special way\nto delete a table key. You just set its value to `nil`.\n\nPython does not treat absent attributes as equivalent to `None`. Accessing an\nattribute that does not exist throws an exception. To remove an attribute, you\ncan use the built `del` statement:\n\n```python\ndel obj.some_attribute\n```\n\nIn my answer for #1, I felt Lox should go with a stricter approach like Python.\nThat suggests we shouldn't use setting a field to `nil` to delete it. Instead,\nfollowing the previous two answers, we'll add another top level native function:\n\n```c\nstatic Value deleteFieldNative(int argCount, Value* args) {\n  if (argCount != 2) return NIL_VAL;\n  if (!IS_INSTANCE(args[0])) return NIL_VAL;\n  if (!IS_STRING(args[1])) return NIL_VAL;\n\n  ObjInstance* instance = AS_INSTANCE(args[0]);\n  tableDelete(&instance->fields, AS_STRING(args[1]));\n  return NIL_VAL;\n}\n```\n\nAnd wire it up in `initVM()`:\n\n```c\n  defineNative(\"deleteField\", deleteFieldNative);\n```\n\nHonestly, I don't think this is a great user experience. Lox makes it very\neasy and natural to add a field, so it's weird to have to call a native function\nand pass in the field as a string in order to remove one.\n\nIf I were making a full language, I would consider some built-in syntax for\nremoving a field. 
On the other hand, removing a field is a pretty strange, rare\noperation. In most object-oriented programs, the set of fields an object has is\nessentially fixed, even in dynamically-typed ones.\n"
  },
  {
    "path": "note/answers/chapter27_classes/4.md",
    "content": "I'll just point you to a resource. Look for the paper \"An Efficient\nImplementation of Self, a Dynamically-Typed Object-Oriented Language Based on\nPrototypes\".\n"
  },
  {
    "path": "note/answers/chapter28_methods/1.md",
    "content": "An easy optimization is to cache the initializer directly in the ObjClass to\navoid the hash table lookup:\n\n```c\ntypedef struct ObjClass {\n  Obj obj;\n  ObjString* name;\n  Value initializer; // <--\n  Table methods;\n} ObjClass;\n```\n\nIt starts out nil:\n\n```c\nObjClass* newClass(ObjString* name) {\n  ObjClass* klass = ALLOCATE_OBJ(ObjClass, OBJ_CLASS);\n  klass->name = name;\n  klass->initializer = NIL_VAL; // <--\n  initTable(&klass->methods);\n  return klass;\n}\n```\n\nWhen a method is defined, if it's the initializer, then we also store it in\nthat field:\n\n```c\nstatic void defineMethod(ObjString* name) {\n  Value method = peek(0);\n  ObjClass* klass = AS_CLASS(peek(1));\n  tableSet(&klass->methods, name, method);\n  if (name == vm.initString) klass->initializer = method; // <--\n  pop();\n}\n```\n\nThen in `callValue()` we use that instead of looking for the initializer in the\nmethod table:\n\n```c\n      case OBJ_CLASS: {\n        ObjClass* klass = AS_CLASS(callee);\n        vm.stackTop[-argCount - 1] = OBJ_VAL(newInstance(klass));\n        if (!IS_NIL(klass->initializer)) {                       // <--\n          return call(AS_CLOSURE(klass->initializer), argCount); // <--\n        } else if (argCount != 0) {\n          runtimeError(\"Expected 0 arguments but got %d.\", argCount);\n          return false;\n        }\n```\n\nIt's a reasonable little optimization. On my machine, it doesn't really affect\nperf in a noticeable way. Even in a benchmark that stresses creating instances,\nit's only a marginal improvement. That's because the heap allocation and GC of\nthe instances dominates the runtime.\n\nHowever if we had a more sophisticated implementation with its own faster\nmemory allocator, then that might go down. At that point, looking up the\ninitializer could be a larger piece of the time to instantiate and object and\nmight be more important to speed up.\n"
  },
  {
    "path": "note/answers/chapter28_methods/2.md",
    "content": "The answer here is \"inline caching\". At each callsite, the VM inserts a little\nspace to store a cached reference to a class and a method. When the callsite is\nfirst reached, the VM looks up the class of the receiver and then looks up the\nmethod on that class. It stores that class and method in the cache next to that\ncallsite and then invokes the method as normal.\n\nThe next time that callsite executes, the VM checks to see if the receiver has\nthe same class as the cached one. If so, it knows the same method will be\nresolved so it uses the cached method directly instead of looking it up again.\n"
  },
  {
    "path": "note/answers/chapter28_methods/3.md",
    "content": "I'm actually *not* a fan of this choice, though it is certainly a common one.\nI like how Ruby uses a leading `@` to distinguish instance fields from methods\nand getters on the object. In my own language Wren, I use a leading underscore\nto similar effect.\n\nThis means that methods and fields never shadow one another since they are\ntextually distinct. With my language Wren, it also means that we can tell the\nset of fields a class uses just by parsing the class body. We can thus avoid\nthe need for a hash table to store the instance's state. Instead, an instance\nhas a single inline array of fields. Field access is a simple array lookup with\nan index determined at compile time. It is *much* faster than Lox.\n\nBut, for the book, I felt it made sense to stick with a more traditional\nlanguage choice. JavaScript, Python, Lua, and many other dynamically typed\nlanguages all treat objects as hash tables under the hood, so I felt it was\nworth showing how those languages work.\n"
  },
  {
    "path": "note/answers/chapter29_superclasses/1.md",
    "content": "I created a hobby language named Wren. The clox VM was actually based on Wren's\nimplementation, and the design of Lox borrows a lot from Wren. Lox is sort of\nsimplified slightly-less-weird Wren.\n\nIn Wren, fields have a leading underscore in their name. This solves the problem\nin the previous chapter of fields shadowing methods, and it also helps address\nthis problem. Because the compiler can syntactically identify a field access,\nand it knows the surrounding class, it effectively \"renames\" each field based\non the surrounding class.\n\nSo in a program like this (using more Lox-like syntax):\n\n```\nclass A {\n  init() {\n    _field = \"a field\";\n  }\n}\n\nclass B < A {\n  init() {\n    super.init();\n    _field = \"b field\";\n  }\n}\n```\n\nThere is no collision here because the compiler treats `_field` inside methods\nof class A as having a distinct name from `_field` inside class B. The main\ndownside is that fields become \"private\" instead of \"protected\". There's no way\nfor a subclass to directly access a field defined by a superclass, even on the\nsame instance.\n\nI think that's a worthwhile trade-off.\n"
  },
  {
    "path": "note/answers/chapter29_superclasses/2.md",
    "content": "I can think of a few approaches:\n\n## 1. Eagerly rebuild the subclass method tables\n\nWe could keep doing copy-down inheritance like we do here. But also give each\nsuperclass references to the set of subclasses that inherit from it. When a\nsuperclass's method table is modified, it walks the subclasses and also updates\nor rebuilds their now-invalidated method tables. That sounds slow, and it would\nbe. However, meta-programming like this usually happens only a couple of times\nnear the beginning of the program's execution and then stops. It's unusual for a\nclass's set of methods to change frequently during a program's run or inside a\nhot loop. So this likely doesn't need to be fast.\n\n## 2. Lazily rebuild the subclass method tables\n\nThe downside of 1 is that the superclass needs to maintain a list of every\nsubclass. Every single time a method is touched, the entire tree of subclasses\nmust be updated. If it's common to change a number of methods in succession,\nthat's a lot of work. Also, maintaining the list of references from superclass\nto subclass makes GC harder (they'll need to be weak references if you want to\nbe able ever collect subclasses) and makes classes heavier-weight.\n\nAnother option is to have the subclass lazily rebuild its method table when it\nsees the superclass has changed. In each class, we add a \"version\" integer\nfield. It starts out at zero and increments any time the class's set of methods\nis modified. (In principle, this could overflow, but that's pretty unlikely.)\n\nWe also add an integer field to each class to track the version of its\n*superclass*. 
This stores the version that the superclass was at when this\nsubclass inherited its methods.\n\nWhen a subclass is declared, it copies the methods from its superclass, and\nalso records the superclass's current version number in its superclass version\nfield.\n\nWhenever a class's method set changes after the declaration executes, we also\nincrement its version. If a subclass's own superclass version field is ever out\nof sync with the version field on its actual superclass, then we know the\nsuperclass has changed since the last time its methods were copied down.\n\nWhen do we check that? The only real natural point in time is right before a\nmethod call. Adding overhead to each method call is a drag, but it's a fairly\nsimple check between two numbers. If the two versions are out of sync, we\nrebuild the subclass's method table and then re-sync the version numbers.\n\n## 3. Lean on inline caching\n\nThis is probably the best approach (though I wouldn't put money on it). If the\nVM already does some form of inline caching, then method lookup doesn't need to\nbe that fast. For a given callsite, you'll only do the lookup once and then rely on\nthe fast inline cache for most calls.\n\nSo in this case, we could keep something like jlox's slow approach where methods\nare resolved by dynamically walking the inheritance chain. Then once we find the\nmethod, we store it in the inline cache, and after that it's as fast as we could\nwant.\n\nThe only missing piece is handling the fact that the cache can now become\ninvalidated. If a class's method set cannot change, then the only way an inline\ncache can become stale is if the class of the receiver changes. Now an inline\ncache on the same receiver can become stale if a method changes and a lookup\nwould now produce a different method.\n\nInline caches usually track the receiver's class by having some kind of numeric\nID for each class. 
Each class stores its ID and in the inline cache, we store\nthe ID of the receiver's class that the method was called on. If those match,\nthe cache is valid.\n\nWe might be able to extend that by having a method change to a class change its\nID. It is as if metaprogramming a class produces a new class with a different\nID. Since the inline cache only stores the ID of the leaf-most class of the\nreceiver, we also have to ensure that metaprogramming a *superclass* also\naffects the ID of the subclasses. So we'd want to do something like the approach\nin 1 where changing a superclass means we traverse the tree of subclasses and\nupdate their IDs too.\n\nThere are probably better solutions, but these are the first few that came to\nmind.\n"
  },
  {
    "path": "note/answers/chapter29_superclasses/3.diff",
    "content": "diff --git a/c/chunk.h b/c/chunk.h\nindex 3fe9250..b035513 100644\n--- a/c/chunk.h\n+++ b/c/chunk.h\n@@ -19,7 +19,6 @@ typedef enum {\n   OP_SET_UPVALUE,\n   OP_GET_PROPERTY,\n   OP_SET_PROPERTY,\n-  OP_GET_SUPER,\n   OP_EQUAL,\n   OP_GREATER,\n   OP_LESS,\n@@ -35,7 +34,7 @@ typedef enum {\n   OP_LOOP,\n   OP_CALL,\n   OP_INVOKE,\n-  OP_SUPER_INVOKE,\n+  OP_INNER,\n   OP_CLOSURE,\n   OP_CLOSE_UPVALUE,\n   OP_RETURN,\ndiff --git a/c/compiler.c b/c/compiler.c\nindex 78ce52d..125bf8c 100644\n--- a/c/compiler.c\n+++ b/c/compiler.c\n@@ -69,7 +69,9 @@ typedef struct Compiler {\n \n typedef struct ClassCompiler {\n   struct ClassCompiler* enclosing;\n+  uint16_t id;\n   Token name;\n+  Token methodName;\n   bool hasSuperclass;\n } ClassCompiler;\n \n@@ -484,27 +486,27 @@ static Token syntheticToken(const char* text) {\n   token.length = (int)strlen(text);\n   return token;\n }\n-static void super_(bool canAssign) {\n+static void inner(bool canAssign) {\n   if (currentClass == NULL) {\n-    error(\"Cannot use 'super' outside of a class.\");\n-  } else if (!currentClass->hasSuperclass) {\n-    error(\"Cannot use 'super' in a class with no superclass.\");\n+    error(\"Cannot use 'inner' outside of a class.\");\n   }\n \n-  consume(TOKEN_DOT, \"Expect '.' 
after 'super'.\");\n-  consume(TOKEN_IDENTIFIER, \"Expect superclass method name.\");\n-  uint8_t name = identifierConstant(&parser.previous);\n-\n   namedVariable(syntheticToken(\"this\"), false);\n-  if (match(TOKEN_LEFT_PAREN)) {\n-    uint8_t argCount = argumentList();\n-    namedVariable(syntheticToken(\"super\"), false);\n-    emitBytes(OP_SUPER_INVOKE, name);\n-    emitByte(argCount);\n-  } else {\n-    namedVariable(syntheticToken(\"super\"), false);\n-    emitBytes(OP_GET_SUPER, name);\n+  consume(TOKEN_LEFT_PAREN, \"Expect argument list after 'inner'.\");\n+  uint8_t argCount = argumentList();\n+  \n+  uint8_t constant = 0;\n+  if (currentClass != NULL) {\n+    char name[256];\n+    sprintf(name, \"%.*s@%x\",\n+        currentClass->methodName.length,\n+        currentClass->methodName.start,\n+        currentClass->id);\n+    constant = makeConstant(OBJ_VAL(copyString(name, (int)strlen(name))));\n   }\n+\n+  emitBytes(OP_INNER, constant);\n+  emitByte(argCount);\n }\n static void this_(bool canAssign) {\n   if (currentClass == NULL) {\n@@ -561,7 +563,7 @@ ParseRule rules[] = {\n   { NULL,     or_,     PREC_OR },         // TOKEN_OR\n   { NULL,     NULL,    PREC_NONE },       // TOKEN_PRINT\n   { NULL,     NULL,    PREC_NONE },       // TOKEN_RETURN\n-  { super_,   NULL,    PREC_NONE },       // TOKEN_SUPER\n+  { inner,    NULL,    PREC_NONE },       // TOKEN_INNER\n   { this_,    NULL,    PREC_NONE },       // TOKEN_THIS\n   { literal,  NULL,    PREC_NONE },       // TOKEN_TRUE\n   { NULL,     NULL,    PREC_NONE },       // TOKEN_VAR\n@@ -638,6 +640,7 @@ static void function(FunctionType type) {\n }\n static void method() {\n   consume(TOKEN_IDENTIFIER, \"Expect method name.\");\n+  currentClass->methodName = parser.previous;\n   uint8_t constant = identifierConstant(&parser.previous);\n \n   FunctionType type = TYPE_METHOD;\n@@ -656,12 +659,17 @@ static void classDeclaration() {\n   declareVariable();\n \n   emitBytes(OP_CLASS, nameConstant);\n+  
uint16_t id = vm.nextClassID++;\n+  emitByte((id >> 8) & 0xff);\n+  emitByte(id & 0xff);\n+  \n   defineVariable(nameConstant);\n \n   ClassCompiler classCompiler;\n   classCompiler.name = parser.previous;\n   classCompiler.hasSuperclass = false;\n   classCompiler.enclosing = currentClass;\n+  classCompiler.id = id;\n   currentClass = &classCompiler;\n \n   if (match(TOKEN_LESS)) {\n@@ -672,10 +680,6 @@ static void classDeclaration() {\n       error(\"A class cannot inherit from itself.\");\n     }\n \n-    beginScope();\n-    addLocal(syntheticToken(\"super\"));\n-    defineVariable(0);\n-\n     namedVariable(className, false);\n     emitByte(OP_INHERIT);\n     classCompiler.hasSuperclass = true;\n@@ -689,10 +693,6 @@ static void classDeclaration() {\n   consume(TOKEN_RIGHT_BRACE, \"Expect '}' after class body.\");\n   emitByte(OP_POP);\n \n-  if (classCompiler.hasSuperclass) {\n-    endScope();\n-  }\n-\n   currentClass = currentClass->enclosing;\n }\n static void funDeclaration() {\ndiff --git a/c/debug.c b/c/debug.c\nindex ce23cbc..321a133 100644\n--- a/c/debug.c\n+++ b/c/debug.c\n@@ -82,8 +82,6 @@ int disassembleInstruction(Chunk* chunk, int offset) {\n       return constantInstruction(\"OP_GET_PROPERTY\", chunk, offset);\n     case OP_SET_PROPERTY:\n       return constantInstruction(\"OP_SET_PROPERTY\", chunk, offset);\n-    case OP_GET_SUPER:\n-      return constantInstruction(\"OP_GET_SUPER\", chunk, offset);\n     case OP_EQUAL:\n       return simpleInstruction(\"OP_EQUAL\", offset);\n     case OP_GREATER:\n@@ -114,8 +112,8 @@ int disassembleInstruction(Chunk* chunk, int offset) {\n       return byteInstruction(\"OP_CALL\", chunk, offset);\n     case OP_INVOKE:\n       return invokeInstruction(\"OP_INVOKE\", chunk, offset);\n-    case OP_SUPER_INVOKE:\n-      return invokeInstruction(\"OP_SUPER_INVOKE\", chunk, offset);\n+    case OP_INNER:\n+      return invokeInstruction(\"OP_INNER\", chunk, offset);\n     case OP_CLOSURE: {\n       offset++;\n       
uint8_t constant = chunk->code[offset++];\ndiff --git a/c/object.c b/c/object.c\nindex 4ba65f0..976fb4a 100644\n--- a/c/object.c\n+++ b/c/object.c\n@@ -31,9 +31,10 @@ ObjBoundMethod* newBoundMethod(Value receiver, ObjClosure* method) {\n   bound->method = method;\n   return bound;\n }\n-ObjClass* newClass(ObjString* name) {\n+ObjClass* newClass(ObjString* name, uint16_t id) {\n   ObjClass* klass = ALLOCATE_OBJ(ObjClass, OBJ_CLASS);\n   klass->name = name; // [klass]\n+  klass->id = id;\n   initTable(&klass->methods);\n   return klass;\n }\n@@ -47,6 +48,7 @@ ObjClosure* newClosure(ObjFunction* function) {\n   closure->function = function;\n   closure->upvalues = upvalues;\n   closure->upvalueCount = function->upvalueCount;\n+  closure->classID = 0xffff;\n   return closure;\n }\n ObjFunction* newFunction() {\ndiff --git a/c/object.h b/c/object.h\nindex dddcfe1..c560468 100644\n--- a/c/object.h\n+++ b/c/object.h\n@@ -74,11 +74,14 @@ typedef struct {\n   ObjFunction* function;\n   ObjUpvalue** upvalues;\n   int upvalueCount;\n+  // If this closure is a method, the ID of the class that declares it.\n+  uint16_t classID;\n } ObjClosure;\n \n typedef struct sObjClass {\n   Obj obj;\n   ObjString* name;\n+  uint16_t id;\n   Table methods;\n } ObjClass;\n \n@@ -95,7 +98,7 @@ typedef struct {\n } ObjBoundMethod;\n \n ObjBoundMethod* newBoundMethod(Value receiver, ObjClosure* method);\n-ObjClass* newClass(ObjString* name);\n+ObjClass* newClass(ObjString* name, uint16_t id);\n ObjClosure* newClosure(ObjFunction* function);\n ObjFunction* newFunction();\n ObjInstance* newInstance(ObjClass* klass);\ndiff --git a/c/scanner.c b/c/scanner.c\nindex a577951..69b14ee 100644\n--- a/c/scanner.c\n+++ b/c/scanner.c\n@@ -116,12 +116,18 @@ static TokenType identifierType()\n         }\n       }\n       break;\n-    case 'i': return checkKeyword(1, 1, \"f\", TOKEN_IF);\n+    case 'i':\n+      if (scanner.current - scanner.start > 1) {\n+        switch (scanner.start[1]) {\n+          case 
'f': return TOKEN_IF;\n+          case 'n': return checkKeyword(2, 3, \"ner\", TOKEN_INNER);\n+        }\n+      }\n+      break;\n     case 'n': return checkKeyword(1, 2, \"il\", TOKEN_NIL);\n     case 'o': return checkKeyword(1, 1, \"r\", TOKEN_OR);\n     case 'p': return checkKeyword(1, 4, \"rint\", TOKEN_PRINT);\n     case 'r': return checkKeyword(1, 5, \"eturn\", TOKEN_RETURN);\n-    case 's': return checkKeyword(1, 4, \"uper\", TOKEN_SUPER);\n     case 't':\n       if (scanner.current - scanner.start > 1) {\n         switch (scanner.start[1]) {\ndiff --git a/c/scanner.h b/c/scanner.h\nindex 089c7c3..a8f82ec 100644\n--- a/c/scanner.h\n+++ b/c/scanner.h\n@@ -20,7 +20,7 @@ typedef enum {\n   // Keywords.\n   TOKEN_AND, TOKEN_CLASS, TOKEN_ELSE, TOKEN_FALSE,\n   TOKEN_FOR, TOKEN_FUN, TOKEN_IF, TOKEN_NIL, TOKEN_OR,\n-  TOKEN_PRINT, TOKEN_RETURN, TOKEN_SUPER, TOKEN_THIS,\n+  TOKEN_PRINT, TOKEN_RETURN, TOKEN_INNER, TOKEN_THIS,\n   TOKEN_TRUE, TOKEN_VAR, TOKEN_WHILE,\n \n   TOKEN_ERROR,\ndiff --git a/c/vm.c b/c/vm.c\nindex 626f8c2..e2431c6 100644\n--- a/c/vm.c\n+++ b/c/vm.c\n@@ -60,6 +60,8 @@ void initVM() {\n   vm.grayCount = 0;\n   vm.grayCapacity = 0;\n   vm.grayStack = NULL;\n+  \n+  vm.nextClassID = 0;\n \n   initTable(&vm.globals);\n   initTable(&vm.strings);\n@@ -175,6 +177,20 @@ static bool invoke(ObjString* name, int argCount) {\n \n   return invokeFromClass(instance->klass, name, argCount);\n }\n+static bool invokeInner(ObjString* name, int argCount) {\n+  Value receiver = peek(argCount);\n+  ObjInstance* instance = AS_INSTANCE(receiver);\n+\n+  Value method;\n+  if (!tableGet(&instance->klass->methods, name, &method)) {\n+    // No inner method, so discard args and return nil.\n+    vm.stackTop -= argCount + 1;\n+    push(NIL_VAL);\n+    return true;\n+  }\n+\n+  return call(AS_CLOSURE(method), argCount);\n+}\n static bool bindMethod(ObjClass* klass, ObjString* name) {\n   Value method;\n   if (!tableGet(&klass->methods, name, &method)) {\n@@ -221,6 +237,18 
@@ static void closeUpvalues(Value* last) {\n static void defineMethod(ObjString* name) {\n   Value method = peek(0);\n   ObjClass* klass = AS_CLASS(peek(1));\n+  \n+  AS_CLOSURE(method)->classID = klass->id;\n+  \n+  ObjString* originalName = name;\n+  Value existing;\n+  while (tableGet(&klass->methods, name, &existing)) {\n+    ObjClosure* existingClosure = AS_CLOSURE(existing);\n+    char newNameChars[256];\n+    sprintf(newNameChars, \"%s@%x\", originalName->chars, existingClosure->classID);\n+    name = copyString(newNameChars, (int)strlen(newNameChars));\n+  }\n+  \n   tableSet(&klass->methods, name, method);\n   pop();\n }\n@@ -378,15 +406,6 @@ static InterpretResult run() {\n         break;\n       }\n \n-      case OP_GET_SUPER: {\n-        ObjString* name = READ_STRING();\n-        ObjClass* superclass = AS_CLASS(pop());\n-        if (!bindMethod(superclass, name)) {\n-          return INTERPRET_RUNTIME_ERROR;\n-        }\n-        break;\n-      }\n-\n       case OP_EQUAL: {\n         Value b = pop();\n         Value a = pop();\n@@ -466,12 +485,11 @@ static InterpretResult run() {\n         frame = &vm.frames[vm.frameCount - 1];\n         break;\n       }\n-\n-      case OP_SUPER_INVOKE: {\n+      \n+      case OP_INNER: {\n         ObjString* method = READ_STRING();\n         int argCount = READ_BYTE();\n-        ObjClass* superclass = AS_CLASS(pop());\n-        if (!invokeFromClass(superclass, method, argCount)) {\n+        if (!invokeInner(method, argCount)) {\n           return INTERPRET_RUNTIME_ERROR;\n         }\n         frame = &vm.frames[vm.frameCount - 1];\n@@ -517,9 +535,12 @@ static InterpretResult run() {\n         break;\n       }\n \n-      case OP_CLASS:\n-        push(OBJ_VAL(newClass(READ_STRING())));\n+      case OP_CLASS: {\n+        ObjString* name = READ_STRING();\n+        uint16_t id = READ_SHORT();\n+        push(OBJ_VAL(newClass(name, id)));\n         break;\n+      }\n \n       case OP_INHERIT: {\n         Value superclass = 
peek(1);\ndiff --git a/c/vm.h b/c/vm.h\nindex 9ce5805..56c48a4 100644\n--- a/c/vm.h\n+++ b/c/vm.h\n@@ -32,6 +32,8 @@ typedef struct {\n   int grayCount;\n   int grayCapacity;\n   Obj** grayStack;\n+  \n+  uint16_t nextClassID;\n } VM;\n \n typedef enum {\ndiff --git a/test/inner/arguments.lox b/test/inner/arguments.lox\nnew file mode 100644\nindex 0000000..b8032b4\n--- /dev/null\n+++ b/test/inner/arguments.lox\n@@ -0,0 +1,16 @@\n+class A {\n+  method(a, b) {\n+    print \"A.method \" + a + \" \" + b;\n+    inner(b, a);\n+  }\n+}\n+\n+class B < A {\n+  method(a, b) {\n+    print \"B.method \" + a + \" \" + b;\n+  }\n+}\n+\n+B().method(\"first\", \"second\");\n+// expect: A.method first second\n+// expect: B.method second first\ndiff --git a/test/inner/inner_at_top_level.lox b/test/inner/inner_at_top_level.lox\nnew file mode 100644\nindex 0000000..9d3756f\n--- /dev/null\n+++ b/test/inner/inner_at_top_level.lox\n@@ -0,0 +1 @@\n+inner(\"bar\"); // Error at 'inner': Cannot use 'inner' outside of a class.\ndiff --git a/test/inner/inner_in_top_level_function.lox b/test/inner/inner_in_top_level_function.lox\nnew file mode 100644\nindex 0000000..5201e64\n--- /dev/null\n+++ b/test/inner/inner_in_top_level_function.lox\n@@ -0,0 +1,3 @@\n+fun foo() {\n+  inner(\"arg\"); // Error at 'inner': Cannot use 'inner' outside of a class.\n+}\ndiff --git a/test/inner/missing_argument_list.lox b/test/inner/missing_argument_list.lox\nnew file mode 100644\nindex 0000000..78e1b9a\n--- /dev/null\n+++ b/test/inner/missing_argument_list.lox\n@@ -0,0 +1,5 @@\n+class A {\n+  method() {\n+    inner; // Error at ';': Expect argument list after 'inner'.\n+  }\n+}\ndiff --git a/test/inner/no_inner.lox b/test/inner/no_inner.lox\nnew file mode 100644\nindex 0000000..7483e21\n--- /dev/null\n+++ b/test/inner/no_inner.lox\n@@ -0,0 +1,14 @@\n+class A {\n+  method(a) {\n+    print inner();\n+    print inner(1, 2, 3);\n+    print a;\n+  }\n+}\n+\n+class B < A {}\n+\n+B().method(\"arg\");\n+// expect: 
nil\n+// expect: nil\n+// expect: arg\ndiff --git a/test/inner/simple.lox b/test/inner/simple.lox\nnew file mode 100644\nindex 0000000..6801a61\n--- /dev/null\n+++ b/test/inner/simple.lox\n@@ -0,0 +1,18 @@\n+class A {\n+  method() {\n+    print \"A.method() before\";\n+    inner();\n+    print \"A.method() after\";\n+  }\n+}\n+\n+class B < A {\n+  method() {\n+    print \"B.method()\";\n+  }\n+}\n+\n+B().method();\n+// expect: A.method() before\n+// expect: B.method()\n+// expect: A.method() after\n"
  },
  {
    "path": "note/answers/chapter29_superclasses/3.md",
    "content": "I have a solution that implements the right semantics and makes inner calls as\nfast as any other method call. I won't walk through it in detail, but there is\na diff in this directory that you can apply to the result of chapter 29's code\nto see the full thing.\n\nThe basic idea is that each `inner()` call gets compiled to a call to a method\nwhose name is a combination of the surrounding method name and a unique ID for\nthe containing class. So in:\n\n```lox\nclass A {\n  foo() {\n    inner();\n  }\n}\n```\n\nThe compiler desugars it to something like:\n\n```lox\nclass A {\n  foo() {\n    this.foo@0();\n  }\n}\n```\n\nHere, \"foo@0\" is all part of the method name. \"0\" is the ID of the class A, and\nwe use \"@\" as a separator to ensure the generated name can't collide with a real\nmethod name. At runtime, when a subclass inherits from a superclass, we copy\ndown all of the superclass's methods as before. That doesn't change. But when\nthe subclass then defines its *own* methods, we do some extra work.\n\nBefore storing the method in the subclass's method table, we look for an\nexisting method with that name. If we find one, it means an \"outer\" method with\nthat name already exists on some superclass. In that case, this subclass method\ndefinition must become an inner method and thus we need to change its name. But\nto what?\n\nWe know we need to append a class ID, but it's not clear which one. We extend\nObjClass to store its ID. We also extend ObjClosure to store the class ID of the\nclass where the method is declared. (We could make a separate ObjMethod type for\nthis, but I was lazy and put it in ObjClosure even though its only used for\nclosures that are method bodies.)\n\nWhen defining a new method in a subclass, if we see a method with that name in\nthe table already, then that method is the outermost method that the subclass's\nmethod is an inner method for. 
So we look at the class ID of the method already\nin the table, and then generate a new name for the new method that includes\nthat class ID. So in:\n\n```lox\nclass A { // ID 0.\n  foo() {\n    inner();\n  }\n}\n\nclass B < A { // ID 1.\n  foo() {}\n}\n```\n\nWhen we execute the `OP_METHOD` for `foo()` in B, we have already copied the\ndefinition of `foo()` from A into B's method table. We see that collision. So we\nlook up the class ID stored in that closure and find 0. So then we change the\nname of the method we're defining to `foo@0` instead.\n\nWe can't stop there. There may be multiple levels of `inner()` methods in\nsuperclasses, so we look up `foo@0` in the method table too. If we find *that*\nas well, then we look for *its* class ID. We keep looping like that walking\ndown the inheritance hierarchy until we eventually find an open slot that\ncorresponds to `inner()` on the lowest class in the hierarchy and slot our new\nmethod there.\n\nThat's basically it. Since we already compile `inner()` calls to be method calls\non `this` with a correctly synthesized name, they will route to the right\nmethod definition and behave as they should using the existing runtime code we\nhave for method dispatch.\n\nThe only missing piece is what happens when you call `inner()` in a class where\nthere is no subclass that refines it. We don't want that to be a runtime error since\nthere's no way for a superclass to detect that. Instead, we treat it as if there\nis an empty method that returns `nil`.\n\nTo implement that, I just made a new `OP_INNER` instruction to use instead of\n`OP_INVOKE` for `inner()` calls. It behaves almost exactly like `OP_INVOKE`\nexcept that in the case where no method could be found, instead of aborting, it\ndiscards any argument stack slots and then pushes `nil`. 
Another option would be\nto actually compile default empty methods into the class, but then we'd have to\ntake care not to incorrectly inherit those and have them get in the way of real\ncalls.\n\nFor all the details, apply the diff to the code and see how it looks.\n"
  },
  {
    "path": "note/blurb.txt",
    "content": "Software engineers use programming languages every day, but few of us understand how those languages are designed and implemented. Crafting Interpreters gives you that insight by implementing two complete interpreters from scratch. In the process, you'll learn parsing, compilation, garbage collection, and other fundamental computer science concepts. But don't be intimidated! Crafting Interpreters walks you through all of this one step at a time with an emphasis on having fun and getting your hands dirty.\n\n---\n\nDespite using them every day, most software engineers know little about how programming languages are designed and implemented. For many, their only experience with that corner of computer science was a terrifying \"compilers\" class that they suffered through in undergrad and tried to blot from their memory as soon as they had scribbled their last NFA to DFA conversion on the final exam.\n\nThat fearsome reputation belies a field that is rich with useful techniques and not so difficult as some of its practitioners might have you believe. A better understanding of how programming languages are built will make you a stronger software engineer and teach you concepts and data structures you'll use the rest of your coding days. You might even have fun.\n\nThis book teaches you everything you need to know to implement a full-featured, efficient scripting language. You'll learn both high-level concepts around parsing and semantics and gritty details like bytecode representation and garbage collection. Your brain will light up with new ideas, and your hands will get dirty and calloused.\n\nStarting from main(), you will build a language that features rich syntax, dynamic typing, garbage collection, lexical scope, first-class functions, closures, classes, and inheritance. 
All packed into a few thousand lines of clean, fast code that you thoroughly understand because you wrote each one yourself.\n\n---\n\nBob Nystrom is a senior software engineer at Google working on the Dart programming language. Before discovering a love of programming languages, he developed games at Electronic Arts. He is the author of the best-selling book \"Game Programming Patterns\".\n\n---\n\nWe use programming languages every day, but few of us know how they are designed and implemented. Crafting Interpreters teaches you that by building two complete interpreters from scratch. You'll learn parsing, compilation, garbage collection, and other fundamental CS concepts. But don't be intimidated! You learn it one step at a time with an emphasis on having fun and getting your hands dirty."
  },
  {
    "path": "note/contents.txt",
    "content": "high-level goal: a *small* book that builds a complete, efficient interpreter.\ninstead of a wide text about programming language*s*, it is a single path\nthrough the language space. aim for 60k words.\n\npossible mission: cover most of the topics needed to understand how mri, cpython, and lua work.\n\nstuff *not* included:\n- type systems\n- ahead of time compilation\n- machine code\n- bottom up parsing\n- parser generators\n- ir\n- context-sensitive analysis\n- most compile-time optimizations\n\nstuff to maybe include:\n- your first language - simple stack-based language\n- lexing\n- recursive descent parsing\n- scopes as dictionaries\n- stack-based vm\n- name binding of locals\n- objects as dictionaries\n- objects\n- classes\n- prototypes\n- control flow\n- functions\n- first-class functions\n- closures\n- arithmetic\n- primitive methods/functions\n- external functions\n- compiling to bytecode\n- tree-walk interpreting\n- mark-sweep collection\n- copy collection\n- lisp2 algorithm?\n- bump-pointer allocation\n- stack traces and line information\n- lexer errors\n- compile time errors\n- runtime errors\n- nan tagging\n- object representation\n- variables and assignment\n- scope\n- jitting\n- internal representations\n- roots\n- fibers and coroutines\n- passing arguments\n- expression parsing\n- aesthetics and usability of syntax design\n- backjumping and infinite lookahead or context-sensitive grammars\n- symbol tables and hash tables\n- strings\n- tail call optimization\n- virtual machine\n- stack frames\n- stack based bytecode\n- register based bytecode\n- strings\n- arrays\n- hash tables (for internal use and as object in language)\n- dynamic dispatch\n- testing\n\n- kinds of asides\n  - historical context and people\n  - further things to learn\n  - omitted alternatives\n"
  },
  {
    "path": "note/design breaks.md",
    "content": "- The \"novelty budget\" and choosing which things to keep familiar and which to\n  keep new.\n\n- Learnability versus consistency. Being internally consistent leads to a\n  simpler, more elegant language, but doesn't leverage what the user already\n  knows.\n\n- Building an entire ecosystem: implementation, spec, core libraries, docs, etc.\n\n- Deciding how many things can be done at the library level versus special\n  permissions only the language has.\n\n- Choosing reserved words. Abbreviations to avoid name collisions.\n\n- When to write a language spec and what it's for. Useful to help think\n  precisely about semantics. Doesn't help improve usability of language. Need to\n  actually play with an implementation for that. Very time consuming. Users\n  don't need it. They need more user-friendly docs. Important when you have\n  multiple competing implementations.\n\n- When to introduce new operators. Very hard to read. If users don't have\n  intuition about precedence, they can't even visually parse it until they know\n  what it is. Tempting, but resist.\n\n  Some language designers, when presented with a problem, think \"I know, I'll\n  add a new operator.\" Now you have %*$! problems.\n\n  Note that overloading existing operators is different from defining new\n  ones. With the former, readers can still parse the code.\n\n- Syntactic novelty. First time design language and write lexer and parser,\n  excited to be able to do things different from other languages because you\n  can and because novelty is fun. Do that. Try all sorts of new things. Get it\n  out of your system. In practice, novelty has high cost.\n\n- Put a code sample on the front page of your site.\n\n- Which language features can have user-defined behavior. For example, for\n  loop in some languages works with user-defined sequence types. Some\n  languages allow operators to be overloaded, even assignment. 
Some allow\n  user-defined truthiness.\n\n  Trade off is power versus concrete readability.\n\n- Something about evolving a language after it's released.\n\n---\n\nKinds of language docs.\n\n- way you describe lang varies based on who description is for\n- common kinds:\n  - tutorial for beginners with language\n    - example heavy\n    - people learn by example\n    - linear narrative\n    - deliberately not comprehensive\n  - guide\n    - still informal\n    - not as linear -- search friendly\n    - still omit edge cases\n  - reference\n    - comprehensive\n    - document all the things\n    - final word for user\n    - may be enough for implementer, assuming good faith\n    - no narrative\n  - specification\n    - final word for implementer\n    - legalese\n    - if implementer follows all rules in spec, resulting code correctly\n      implements lang\n    - even if implementer was trying not to!\n    - competing org\n    - may be formal -- machine checkable\n\n- if doc your own language, priority is in that order\n- many langs never even get to spec\n\n- when designing a language, have to capture and communicate design to\n  implementers and users\n  - need to have a language (mechanism) to do that\n  - users and implementers have different needs!\n- when implementing a language, need to consume design from designer\n- even if designing and implementing your own language, still need to teach it\n  to users\n- also need to teach it to yourself\n- writing it down clearly helps you organize design in your mind\n"
  },
  {
    "path": "note/images.md",
    "content": "Unscaled source images are scanned at 1200 DPI and stored at around that\nresolution. (We'll want the extra pixels when it comes time to do the print\nedition.)\n\nRaw scans are saved grayscale unless scanned from graph paper drawings on blue\nlines. Those are color, so we can extract the blue channel.\n\nImages are drawn to scale so that text sizes and line thickness is consistent\nacross illustrations. Each two squares of quarter-ruled graph paper is one\ncolumn of the page.\n\nImages are in one of three sizes:\n\n- Normal images are the width of the text column.\n\n- Aside images are the width of the sidebar.\n  To avoid being too tall, aside images usually don't fill the entire width of\n  the image. Instead, they are scaled to a smaller number of columns, then\n  the canvas is grown to pad with pixels.\n\n- Wide images cover the text column and sidebar.\n\n## Web\n\nWeb images are scaled to two pixels per CSS pixel (to account for high DPI\nscreens). They are saved as 64-color PNGs.\n\nThe web site is organized into 48 pixel columns. At 2x for high DPI, that's\n96 image pixels per column.\n\nFinal images are in one of three sizes:\n\n- Normal:              12 columns = 1152 pixels = 24 graph paper squares\n- Aside:                6 columns =  576 pixels\n- Wide:   12 + 6 + 1 = 19 columns = 1824 pixels\n\n## Print\n\nPrint images are 1200 DPI bitmaps saved as TIFF files. Sizes:\n\n- Normal: 324 pts = 4.5  inches = 5400 pixels\n- Aside:  126 pts = 1.75 inches = 2100 pixels\n- Wide:   468 pts = 6.5  inches = 7800 pixels\n"
  },
  {
    "path": "note/indexing.md",
    "content": "## Acronyms\n\nThe expanded form gets most of the locators. The acronym gets a \"See\" cross-ref\nto the expanded form:\n\n    National Basketball Association 3, 5, 7, 9\n    NBA See National Basketball Association\n\nIf the prose explains the acronym, the acronym also gets a locator.\n\n## Subheadings\n\nSubheadings are for attributes of the main heading, not a taxonomic grouping.\nI wouldn't do:\n\n    animals\n      mammals\n        bats\n\nNor:\n\n    literal\n      number\n        decimal\n        integer\n\nRemember, indexes are mostly for looking things up by name, not defining a\nhierarchy.\n\nIn cases where there are subtopics related to a main topic, prefer flattening\nthe subtopics and putting the locators there and then add \"See also\" cross-refs\nto the main topic:\n\n    double-precision 4, 5, 6\n    floating point 1, 2, 3\n    number 6, 7\n      See also double-precision\n      See also floating point\n\n## Double-posting\n\nDon't. Not because it's not useful but because it's too much of a headache.\nInstead, use cross-links.\n\nIf the synonym is defined in the book, add an entry for the page and a separate\n\"See [also]\" link for the main term. If the synonym is not defined in the book,\nuse a \"See\" link.\n\n## Languages\n\nInclude references to programming languages. Prefer giving them subtopics since\nthere are so many.\n\nDon't link to C and Java just because they are the implementation languages for\nclox and jlox. Only mention them when there is something interesting about those\nlanguages in particular.\n\n## clox and jlox\n\nDon't bother distinguishing entries by jlox and clox, like \"error handling, in\njlox\". The reader can tell pretty easily from the page number which interpreter\nan entry is for.\n\n## Other stuff\n\n*   When indexing a design pattern, do \"<Name> design pattern\". Note the pattern\n    name is capitalized.\n\n*   Add entries for jlox classes named \"<name> class\". 
(Probably link the\n    generated AST classes to the appendix.)\n\n*   Add entries for jlox interfaces named \"<name> interface\".\n\n*   Add entries for jlox and clox enum types named \"<name> enum\".\n\n*   Add entries for clox structs named \"<name> struct\".\n\n*   Add entries for each clox opcode. Link to the place where the opcode itself\n    is defined in the enum.\n\n*   Don't add entries for methods and functions.\n\n## TODO at end\n\n*   Go through and make sure I caught all the classes, structs, and enums that\n    should have entries.\n\n*   Make sure all opcodes have entries.\n\n*   For topics with a lot of page numbers (like most language names), go through\n    and see which ones can have subtopics. Or just remove some of them if they\n    don't add value.\n\n*   Look for topics that should be collapsed like \"dynamic typing\" and \"dynamic\n    types\".\n"
  },
  {
    "path": "note/log.txt",
    "content": "2021/07/29 - *** launch! ***\n2021/07/28 - work on blog post\n2021/07/27 - work on blog post\n2021/07/26 - responsive index page\n2021/07/25 - index page photos\n2021/07/24 - work on index page\n2021/07/23 - work on index page, change cover colors\n2021/07/22 - work on index page\n2021/07/21 - work on index page\n2021/07/20 - work on kindle version\n2021/07/19 - work on kindle version\n2021/07/18 - ebook covers, proof in calibre, kindle export\n2021/07/17 - ebook styling\n2021/07/16 - email\n2021/07/15 - apply remaining proofreading fixes, justify challenges and notes\n2021/07/14 - apply proofreading fixes up through 10\n2021/07/13 - finish proofreading, sync styles and masters\n2021/07/12 - proofread 26 through 28\n2021/07/11 - proofread 19 through 25\n2021/07/10 - proofread 14 through 18\n2021/07/09 - proofread 7 through 13\n2021/07/08 - finish proofreading 6, start 7\n2021/07/07 - start proofreading 6\n2021/07/06 - proofread 5\n2021/07/05 - proofread 3 and 4\n2021/07/04 - weekend\n2021/07/03 - weekend\n2021/07/02 - proofread 1 and 2\n2021/07/01 - epub export\n2021/06/30 - epub export\n2021/06/29 - clean up code for ebook stuff\n2021/06/28 - cover and ebook css\n2021/06/27 - fix epub validation errors\n2021/06/26 - epub export\n2021/06/25 - github issues\n2021/06/24 - work on ebook export\n2021/06/23 - upload and order proof\n2021/06/22 - cover design\n2021/06/21 - cover lettering\n2021/06/20 - cover illustration\n2021/06/19 - cover illustration\n2021/06/18 - cover illustration\n2021/06/17 - cover illustration\n2021/06/16 - cover illustration\n2021/06/15 - cover design\n2021/06/14 - copyright, isbn, etc.\n2021/06/13 - acknowledgements and dedication\n2021/06/12 - weekend\n2021/06/11 - work on covers\n2021/06/10 - update gpp, finish other todos\n2021/06/09 - finish index\n2021/06/08 - index jumping back and forth and calls and functions\n2021/06/07 - index hash tables, global variables, and local variables\n2021/06/06 - index compiling expressions, 
types of values, and strings\n2021/06/05 - weekend\n2021/06/04 - index scanning on demand\n2021/06/03 - index classes, inheritance, chunks of bytecode, and a virtual machine\n2021/06/02 - index functions and resolving and binding\n2021/06/01 - index statements and state and control flow\n2021/05/30 - weekend\n2021/05/29 - weekend\n2021/05/28 - indexing parsing expressions and evaluating expressions\n2021/05/27 - indexing the lox language, scanning, representing code\n2021/05/26 - indexing the lox language\n2021/05/25 - indexing introduction and a map of the territory\n2021/05/24 - indexing introduction and a map of the territory\n2021/05/23 - design index\n2021/05/22 - toc\n2021/05/21 - fix running headers\n2021/05/20 - typeset optimization and appendices\n2021/05/19 - typeset superclasses\n2021/05/18 - typeset methods and initializers\n2021/05/17 - typeset methods and initializers\n2021/05/16 - typeset classes and instances\n2021/05/15 - typeset garbage collection\n2021/05/14 - typeset closures\n2021/05/13 - typeset closures\n2021/05/12 - typeset calls and functions\n2021/05/11 - typeset calls and functions\n2021/05/10 - typeset calls and functions\n2021/05/09 - weekend\n2021/05/08 - typeset calls and functions\n2021/05/07 - typeset jumping back and forth\n2021/05/06 - typeset local variables\n2021/05/05 - typeset global variables\n2021/05/04 - typeset hash tables\n2021/05/03 - typeset hash tables\n2021/05/02 - typeset strings\n2021/05/01 - typeset strings\n2021/04/30 - issues and prs\n2021/04/29 - typeset types of values\n2021/04/28 - vaccine\n2021/04/27 - typeset types of values\n2021/04/26 - typeset compiling expressions\n2021/04/25 - weekend\n2021/04/24 - weekend\n2021/04/23 - trim unneeded blank lines\n2021/04/22 - start typesetting compiling expressions, trim lines\n2021/04/21 - typeset scanning on demand\n2021/04/20 - typeset a virtual machine\n2021/04/19 - typeset chunks of bytecode\n2021/04/12 to 2021/04/18 - spring break\n2021/04/11 - typeset chunks of 
bytecode\n2021/04/10 - typeset inheritance\n2021/04/09 - typeset classes\n2021/04/08 - typeset classes\n2021/04/07 - typeset resolving and binding\n2021/04/06 - typeset resolving and binding\n2021/04/05 - typeset functions\n2021/04/04 - weekend\n2021/04/03 - typeset control flow\n2021/04/02 - typeset statements and state\n2021/04/01 - typeset evaluating expressions\n2021/03/31 - finish typesetting parsing expressions\n2021/03/30 - typeset parsing expressions\n2021/03/29 - finish typesetting representing code, start parsing expressions\n2021/03/28 - weekend\n2021/03/27 - weekend\n2021/03/26 - typeset representing code\n2021/03/25 - table styles, typeset representing code\n2021/03/24 - soft breaks for code output\n2021/03/23 - typeset scanning\n2021/03/22 - typeset scanning\n2021/03/21 - typeset first three chapters\n2021/03/20 - more xml tweaks, bug fixes\n2021/03/19 - get aside js script working again\n2021/03/18 - chapter start page\n2021/03/17 - header design\n2021/03/16 - rebuild styles\n2021/03/15 - rebuild styles\n2021/03/14 - xml output\n2021/03/13 - xml output\n2021/03/12 - xml output\n2021/03/11 - xml output\n2021/03/10 - xml output\n2021/03/09 - page layout\n2021/03/08 - page layout\n2021/03/07 - weekend\n2021/03/06 - weekend\n2021/03/05 - page layout\n2021/03/04 - page layout\n2021/03/03 - page layout\n2021/03/02 - page layout\n2021/03/01 - page layout\n2021/02/28 - weekend\n2021/02/27 - weekend\n2021/02/26 - work on layout\n2021/02/25 - work on layout\n2021/02/24 - work on layout\n2021/02/23 - 5 issues\n2021/02/22 - remove caching bit mask in optimization chapter\n2021/02/21 - weekend\n2021/02/20 - weekend\n2021/02/19 - 8 issues and prs\n2021/02/18 - copy edits for optimization\n2021/02/17 - copy edits for methods and initializers and superclasses\n2021/02/16 - copy edits for classes and instances\n2021/02/15 - copy edits for garbage collection\n2021/02/14 - weekend\n2021/02/13 - weekend\n2021/02/12 - copy edits for closures\n2021/02/11 - copy edits for 
calls and functions\n2021/02/10 - copy edits for local variables and jumping back and forth\n2021/02/09 - copy edits for strings, hash tables, and global variables\n2021/02/08 - copy edits for compiling expressions and types of values\n2021/02/07 - weekend\n2021/02/06 - weekend\n2021/02/05 - copy edits for chunks of bytecode, a virtual machine, and scanning\n2021/02/04 - copy edits for classes and inheritance\n2021/02/03 - copy edits for resolving and binding\n2021/02/02 - copy edits for control flow and functions\n2021/02/01 - copy edits for evaluating expressions and statements and state\n2021/01/31 - weekend\n2021/01/30 - weekend\n2021/01/29 - copy edits for parsing expressions\n2021/01/28 - copy edits for representing code\n2021/01/27 - copy edits for the lox language\n2021/01/26 - change ellipse formatting, copy edits for a map of the territory\n2021/01/25 - copy edits for introduction, appendices, and section pages\n2021/01/24 - weekend\n2021/01/23 - weekend\n2021/01/22 - make 8x10 layout\n2021/01/21 - email\n2021/01/20 - email\n2021/01/19 - email\n2021/01/18 - mlk\n2021/01/17 - weekend\n2021/01/16 - weekend\n2021/01/15 - 4 issues and prs\n2021/01/14 - fix illustrations, 6 issues\n2021/01/13 - 3 issues\n2021/01/12 - finish custom interpreters in test runner\n2021/01/11 - 1 issue, work on custom interpreters in the test runner\n2021/01/10 - weekend\n2021/01/09 - weekend\n2021/01/08 - 1 issue\n2021/01/07 - 8 issues and prs\n2021/01/06 - font research\n2021/01/05 - sync changes with copy editor\n2020/12/24 - 2021/01/04 holiday break\n2020/12/23 - 9585 words, edit optimization and appendices\n2020/12/22 - 4602 words, edit superclasses\n2020/12/21 - 8427 words, edit methods and initializers\n2020/12/20 - weekend\n2020/12/19 - weekend\n2020/12/18 - 1559 words, finish classes and instances\n2020/12/17 - 1025 words, classes and instances\n2020/12/16 - 1254 words, classes and instances\n2020/12/15 - 9286 words, edit garbage collection\n2020/12/14 - 5833 words, finish 
closures\n2020/12/13 - weekend\n2020/12/12 - weekend\n2020/12/11 - 4735 words, finish calls and functions\n2020/12/10 - 5192 words, finish calls and functions\n2020/12/09 - 4301 words, calls and functions\n2020/12/08 - email\n2020/12/07 - 2604 words, finish jumping back and forth\n2020/12/06 - weekend\n2020/12/05 - weekend\n2020/12/04 - 1431 words, jumping back and forth\n2020/12/03 - 2218 words, jumping back and forth\n2020/12/02 - 4669 words, edit local variables\n2020/12/01 - editing\n2020/11/30 - 4756 words, edit global variables\n2020/11/29 - weekend\n2020/11/28 - weekend\n2020/11/27 - 4523 words, finish hash tables\n2020/11/26 - 3992 words, hash tables\n2020/11/25 - email, 3 issues\n2020/11/24 - email\n2020/11/23 - email\n2020/11/22 - weekend\n2020/11/21 - weekend\n2020/11/20 - email\n2020/11/19 - 8 issues\n2020/11/18 - update style guide, better word count\n2020/11/17 - 5275 words, edit strings\n2020/11/16 - email\n2020/11/15 - weekend\n2020/11/14 - weekend\n2020/11/13 - 1949 words, finish types of values\n2020/11/12 - 2691 words, types of values\n2020/11/11 - 4178 words, finish compiling expressions\n2020/11/10 - 446 words, compiling expressions\n2020/11/09 - 1882 words, compiling expressions\n2020/11/08 - weekend\n2020/11/07 - weekend\n2020/11/06 - 5374 words, edit scanning on demand\n2020/11/05 - 1990 words, finish a virtual machine\n2020/11/04 - 3302 words, a virtual machine\n2020/11/03 - 759 words, a virtual machine\n2020/11/02 - 5813 words, finish chunks of bytecode\n2020/11/01 - weekend\n2020/10/31 - weekend\n2020/10/30 - 1883 words, chunks of bytecode\n2020/10/29 - 160 words, a bytecode vm\n2020/10/28 - 11 issues and prs\n2020/10/27 - 2112 words, finish inheritance\n2020/10/26 - 1966 words, inheritance\n2020/10/25 - weekend\n2020/10/24 - weekend\n2020/10/23 - 2088 words, finish classes\n2020/10/22 - 3677 words, classes\n2020/10/21 - 742 words, classes\n2020/10/20 - 1067 words, classes\n2020/10/19 - 3731 words, finish resolving and binding\n2020/10/18 
- weekend\n2020/10/17 - weekend\n2020/10/16 - 2624 words, resolving and binding\n2020/10/15 - 2878 words, finish functions\n2020/10/14 - 2598 words, functions\n2020/10/13 - 967 words, functions\n2020/10/12 - 1887 words, finish control flow\n2020/10/11 - weekend\n2020/10/10 - weekend\n2020/10/09 - 2545 words, control flow\n2020/10/08 - 2771 words, finish statements and state\n2020/10/07 - 3934 words, statements and state\n2020/10/06 - 1749 words, statements and state\n2020/10/05 - 1 issue\n2020/10/04 - weekend\n2020/10/03 - weekend\n2020/10/02 - 4 issues\n2020/10/01 - 5 issues\n2020/09/30 - 2301 words, finish evaluating expressions\n2020/09/29 - 1549 words, evaluating expressions\n2020/09/28 - 613 words, evaluating expressions\n2020/09/27 - weekend\n2020/09/26 - weekend\n2020/09/25 - finish long lines\n2020/09/24 - work on long lines\n2020/09/23 - 2926 words, finish parsing expressions\n2020/09/22 - 1308 words, parsing expressions\n2020/09/21 - 1820 words, parsing expressions\n2020/09/20 - weekend\n2020/09/19 - weekend\n2020/09/18 - fix 8 issues\n2020/09/17 - 3422 words, finish representing code\n2020/09/16 - 2805 words, representing code\n2020/09/15 - 2604 words, finish scanning\n2020/09/14 - 596 words, scanning\n2020/09/13 - weekend\n2020/09/12 - weekend\n2020/09/11 - 2273 words, scanning\n2020/09/10 - ~5900 words, the lox language and a tree-walk interpreter\n2020/09/09 - 1030 words, finish a map of the territory\n2020/09/08 - 3795 words, a map of the territory\n2020/09/07 - labor day\n2020/09/06 - weekend\n2020/09/05 - weekend\n2020/09/04 - 1221 words, finish introduction\n2020/09/03 - 2533 words, welcome and introduction\n2020/09/02 - couple of issues\n2020/09/01 - 12 issues and prs\n2020/08/31 - weekend\n2020/08/30 - weekend\n2020/08/29 - design note styles, xml export\n2020/08/28 - more xml export\n2020/08/27 - more xml export\n2020/08/26 - more xml export\n2020/08/25 - work on lists\n2020/08/24 - challenge styles\n2020/08/23 - work on indesign styles, set up 
printer, test prints\n2020/08/22 - weekend\n2020/08/21 - weekend\n2020/08/20 - work on indesign styles\n2020/08/19 - work on indesign scripting\n2020/08/18 - work on indesign scripting\n2020/08/17 - work on indesign scripting\n2020/08/17 - work on indesign scripting\n2020/08/16 - weekend\n2020/08/15 - weekend\n2020/08/14 - work on indesign scripting\n2020/08/13 - work on indesign scripting\n2020/08/12 - work on indesign scripting\n2020/08/11 - work on indesign scripting\n2020/08/10 - more work on print design\n2020/08/09 - more work on print design\n2020/08/08 - weekend\n2020/08/07 - more work on print design\n2020/08/06 - more work on print design\n2020/08/05 - work on code line length for print\n2020/08/04 - more work on xml export and indesign styles\n2020/08/03 - more work on xml export and indesign styles\n2020/08/02 - weekend\n2020/08/01 - weekend\n2020/07/31 - vacation\n2020/07/30 - vacation\n2020/07/29 - vacation\n2020/07/28 - vacation\n2020/07/27 - vacation\n2020/07/26 - more work on xml export and indesign styles\n2020/07/25 - scan references, work on xml export\n2020/07/24 - typos and email\n2020/07/23 - work on page layout\n2020/07/22 - fix #680\n2020/07/21 - email\n2020/07/20 - fix #683\n2020/07/19 - weekend\n2020/07/18 - weekend\n2020/07/17 - fix #693\n2020/07/16 - 3 issues\n2020/07/15 - get rest of chapter snippets compiling\n2020/07/14 - 1 issue\n2020/07/13 - 6 issues\n2020/07/12 - finish type punning, one other issue\n2020/07/11 - weekend\n2020/07/10 - work on type punning\n2020/07/09 - email\n2020/07/08 - 5 issues\n2020/07/07 - 1 issue\n2020/07/06 - 2 issues\n2020/07/05 - weekend\n2020/07/04 - weekend\n2020/07/03 - fix #635\n2020/07/02 - use code font for statement types\n2020/07/01 - email and issues\n2020/06/30 - email\n2020/06/29 - three issues and prs\n2020/06/28 - weekend\n2020/06/27 - weekend\n2020/06/26 - 404 page\n2020/06/25 - 2 issues\n2020/06/24 - bugs and email\n2020/06/23 - bugs\n2020/06/22 - fix a few issues, fix aside markers in 
context lines\n2020/06/21 - weekend\n2020/06/20 - weekend\n2020/06/19 - email and issues\n2020/06/18 - close a few issues\n2020/06/17 - email\n2020/06/16 - 3 issues\n2020/06/15 - 9 prs and issues\n2020/06/12 - email\n2020/06/11 - email\n2020/06/10 - fix trailing whitespace css issue\n2020/06/09 - work on trailing whitespace css issue\n2020/06/08 - work on trailing whitespace css issue\n2020/06/07 - email and bug fixes\n2020/06/06 - weekend\n2020/06/05 - fix bugs\n2020/06/04 - email\n2020/06/03 - fix bugs\n2020/06/02 - fix bugs\n2020/06/01 - port benchmark.py to dart\n2020/05/31 - more output clean up, ebnf grammar\n2020/05/30 - bunch of clean up in html output\n2020/05/29 - switch over to dart tools\n2020/05/28 - port test.py to dart\n2020/05/27 - port split_chapters and compile_snippets to dart\n2020/05/26 - port build system to dart\n2020/05/25 - port build system to dart\n2020/05/24 - port build system to dart, fix missing context lines\n2020/05/23 - port build system to dart\n2020/05/22 - port build system to dart\n2020/05/21 - port build system to dart\n2020/05/20 - port build system to dart\n2020/05/19 - port build system to dart\n2020/05/18 - port build system to dart\n2020/05/17 - port build system to dart\n2020/05/16 - weekend\n2020/05/15 - work on page layout\n2020/05/14 - start work on page layout\n2020/05/13 - email\n2020/05/12 - email\n2020/05/11 - email\n2020/05/10 - weekend\n2020/05/09 - weekend\n2020/05/08 - email and issues\n2020/05/07 - email and issues\n2020/05/06 - email and issues\n2020/05/05 - 17 prs and issues\n2020/05/04 - email\n2020/05/03 - weekend\n2020/05/02 - weekend\n2020/05/01 - email\n2020/04/30 - talk to editor\n2020/04/29 - email\n2020/04/28 - file washington taxes, email\n2020/04/27 - start catching up on email\n*** nice long break ***\n2020/04/05 - edit blog post, publish chapter, DONE!\n2020/04/04 - write blog post\n2020/04/03 - 9494 words, third draft \"optimization\"\n2020/04/02 - 6670 words, finish second draft 
\"optimization\"\n2020/04/01 - 3743 words, second draft \"optimization\"\n2020/03/31 - one illustration\n2020/03/30 - one illustration\n2020/03/29 - ink and photoshop two illustrations\n2020/03/28 - 1552 words, finish first draft \"optimization\"\n             (bank 27) pencil 5 illustrations\n             (bank 28) ink and photoshop 5 illustrations\n2020/03/27 - 1157 words, first draft \"optimization\"\n2020/03/26 - 1619 words, first draft \"optimization\"\n2020/03/25 - 738 words, first draft \"optimization\"\n             (bank 26) 1383 words, first draft \"optimization\"\n2020/03/24 - 401 words, first draft \"optimization\"\n2020/03/23 - 1370 words, first draft \"optimization\"\n2020/03/22 - 1343 words, first draft \"optimization\"\n2020/03/21 - 1518 words, finish outline \"optimization\"\n2020/03/20 - 702 words, outline \"optimization\"\n             (bank 25) 480 words, outline \"optimization\"\n2020/03/19 - finish organizing snippets for \"optimization\"\n             (bank 24) 1420 words, outline \"optimization\"\n2020/03/18 - publish chapter\n2020/03/17 - 5 issues\n2020/03/16 - 14 issues\n2020/03/15 - third draft of \"superclasses\"\n             (bank 23) split snippets for \"optimization\"\n2020/03/14 - one illustration\n2020/03/13 - reword some prose around an illustration\n2020/03/12 - photoshop and tweak illustration\n2020/03/11 - pencil and ink illustration\n2020/03/10 - one illustration\n2020/03/09 - one illustration\n2020/03/08 - taxes and email\n2020/03/07 - one illustration\n2020/03/06 - one illustration\n2020/03/05 - 3138 words, finish second draft \"superclasses\"\n2020/03/04 - 1156 words, second draft \"superclasses\"\n2020/03/03 - finish writing up third challenge, fix bug, file gc issues\n2020/03/02 - more work on third challenge\n2020/03/01 - work on third challenge\n2020/02/29 - first two challenge answers\n2020/02/28 - 1429 words, finish first draft \"superclasses\"\n2020/02/27 - 822 words, first draft \"superclasses\"\n2020/02/26 - 802 
words, first draft \"superclasses\"\n2020/02/25 - 149 words, first draft \"superclasses\"\n2020/02/24 - 686 words, first draft \"superclasses\"\n2020/02/23 - 512 words, first draft \"superclasses\"\n2020/02/22 - 1725 words, finish outline \"superclasses\"\n2020/02/21 - 492 words, outline \"superclasses\"\n2020/02/20 - split up and order snippets for \"superclasses\"\n2020/02/19 - publish \"methods and initializers\"\n2020/02/18 - file seattle taxes\n2020/02/17 - use bank 21\n2020/02/16 - email\n2020/02/15 - 12 issues and prs\n             (bank 22) email\n2020/02/14 - 1372 words, finish third draft \"methods and initializers\"\n2020/02/13 - 3159 words, third draft \"methods and initializers\"\n2020/02/12 - 1336 words, third draft \"methods and initializers\"\n2020/02/11 - fix python install\n2020/02/10 - 1 illustration\n2020/02/09 - 2503 words, third draft \"methods and initializers\"\n2020/02/08 - photoshop illustration\n2020/02/07 - ink two illustrations, photoshop one\n2020/02/06 - pencil two illustrations\n2020/02/05 - add aside\n2020/02/04 - 1 illustration\n2020/02/03 - 1 illustration\n2020/02/02 - 1 illustration\n2020/02/01 - 1 illustration\n2020/01/31 - other challenge answers\n2020/01/30 - 1 challenge answer\n2020/01/29 - 1308 words, finish second draft \"methods and initializers\"\n2020/01/28 - 1112 words, second draft \"methods and initializers\"\n2020/01/27 - 1035 words, second draft \"methods and initializers\"\n2020/01/26 - 610 words, second draft \"methods and initializers\"\n2020/01/25 - 1461 words, second draft \"methods and initializers\"\n2020/01/24 - 788 words, second draft \"methods and initializers\"\n2020/01/23 - 328 words, second draft \"methods and initializers\"\n2020/01/22 - 284 words, second draft \"methods and initializers\"\n2020/01/21 - 1314 words, second draft \"methods and initializers\"\n2020/01/20 - 1090 words, finish first draft \"methods and initializers\"\n2020/01/19 - 781 words first draft \"methods and 
initializers\"\n2020/01/18 - 391 words first draft \"methods and initializers\"\n2020/01/17 - 1123 words first draft \"methods and initializers\"\n2020/01/16 - 635 words first draft \"methods and initializers\"\n2020/01/15 - 896 words first draft \"methods and initializers\"\n2020/01/14 - 793 words first draft \"methods and initializers\"\n2020/01/13 - 509 words first draft \"methods and initializers\"\n2020/01/12 - 227 words first draft \"methods and initializers\"\n2020/01/11 - 1063 words first draft \"methods and initializers\"\n2020/01/10 - 1008 words first draft \"methods and initializers\"\n2020/01/09 - 320 words first draft \"methods and initializers\"\n2020/01/08 - 803 words finish outline \"methods and initializers\"\n2020/01/07 - 891 words outline \"methods and initializers\"\n2020/01/06 - 472 words outline \"methods and initializers\"\n2020/01/05 - 520 words outline \"methods and initializers\"\n2020/01/04 - 732 words outline \"methods and initializers\"\n2020/01/03 - 415 words outline \"methods and initializers\"\n2020/01/02 - reread part i classes chapter\n2020/01/01 - couple of minor tweaks\n2019/12/31 - publish chapter\n2019/12/30 - more issues\n2019/12/29 - issues and prs\n2019/12/28 - email\n2019/12/27 - order snippets for \"methods and initializers\"\n2019/12/26 - figure out quote\n2019/12/25 - split up snippets for \"methods and initializers\"\n2019/12/24 - last two answers for \"classes and instances\"\n2019/12/23 - use bank 20 (painting)\n2019/12/22 - first two answers for \"classes and instances\"\n2019/12/21 - third draft \"classes and instances\"\n2019/12/20 - photoshop illustration\n2019/12/19 - pencil and ink illustration\n2019/12/18 - email\n2019/12/17 - photoshop illustration\n2019/12/16 - finish inking one illustration\n2019/12/15 - pencil and half-ink one illustration\n2019/12/14 - one illustration\n2019/12/13 - 1451 words, finish second draft \"classes and instances\"\n2019/12/12 - 2159 words, second draft \"classes and 
instances\"\n2019/12/11 - 578 words, finish first draft \"classes and instances\"\n2019/12/10 - 671 words, first draft \"classes and instances\"\n2019/12/09 - 568 words, first draft \"classes and instances\"\n2019/12/08 - 557 words, first draft \"classes and instances\"\n2019/12/07 - 503 words, first draft \"classes and instances\"\n2019/12/06 - quote\n2019/12/05 - 508 words, first draft \"classes and instances\"\n2019/12/04 - 667 words, finish outline \"classes and instances\"\n2019/12/03 - 531 words, outline \"classes and instances\"\n             (bank 21) 500 words, outline \"classes and instances\"\n2019/12/02 - 5 issues, 1 pr\n2019/12/01 - split up and order snippets for \"classes and instances\"\n2019/11/30 - 2 prs, publish chapter\n2019/11/29 - fix field name in illustrations\n2019/11/28 - work on https://github.com/munificent/craftinginterpreters/pull/552\n2019/11/27 - 3 issues\n2019/11/26 - 5 issues\n2019/11/25 - prs and issues, work on 531\n2019/11/24 - 6 issues\n2019/11/23 - 5 prs\n2019/11/22 - 2495 words, finish third draft \"garbage collection\"\n2019/11/21 - work on third draft \"garbage collection\"\n2019/11/20 - 2161 words, third draft \"garbage collection\"\n2019/11/19 - 2294 words, third draft \"garbage collection\"\n2019/11/18 - 2108 words, third draft \"garbage collection\"\n2019/11/17 - reorganize subheaders\n2019/11/16 - baguette illustration\n2019/11/15 - redo lines in latency illustration\n2019/11/14 - one illustration\n2019/11/13 - photoshop illustration, make bullet images\n2019/11/12 - pencil and ink one illustration\n2019/11/11 - use bank 19\n2019/11/10 - use bank 18\n2019/11/09 - one illustration\n2019/11/08 - fix previous and pencil one large illustration\n             (bank 20) ink and photoshop illustration\n2019/11/07 - one illustration\n             (bank 19) another illustration\n2019/11/06 - one illustration\n2019/11/02 - fix crash bug and other issues\n2019/11/01 - look into crash bug\n2019/10/31 - write first answer for 
\"garbage collection\"\n2019/10/30 - code for first answer for \"garbage collection\"\n2019/10/29 - 2166 words, finish second draft \"garbage collection\"\n2019/10/28 - 767 words, second draft \"garbage collection\"\n2019/10/27 - 2205 words, second draft \"garbage collection\"\n2019/10/26 - 1049 words, second draft \"garbage collection\"\n2019/10/25 - 1621 words, second draft \"garbage collection\"\n2019/10/24 - 1142 words, second draft \"garbage collection\"\n2019/10/23 - 587 words, finish first draft \"garbage collection\"\n2019/10/22 - 2047 words, first draft \"garbage collection\"\n2019/10/21 - 531 words, first draft \"garbage collection\"\n2019/10/20 - 884 words, first draft \"garbage collection\"\n2019/10/19 - 624 words, first draft \"garbage collection\"\n2019/10/18 - 1502 words, first draft \"garbage collection\"\n2019/10/17 - 779 words, first draft \"garbage collection\"\n2019/10/16 - 1405 words, first draft \"garbage collection\"\n2019/10/15 - 656 words, first draft \"garbage collection\"\n2019/10/14 - 673 words, first draft \"garbage collection\"\n2019/10/13 - 7 bug fixes, publish site\n2019/10/12 - 8 bug fixes\n2019/10/11 - 4 bug fixes\n2019/10/10 - 6 prs, fix location for overloads\n2019/10/09 - 1615 words, finish outline \"garbage collection\"\n2019/10/08 - outline \"garbage collection\"\n2019/10/07 - 675 words, outline \"garbage collection\"\n2019/10/06 - 844 words, outline \"garbage collection\"\n2019/10/05 - 800 words, outline \"garbage collection\"\n2019/10/04 - 101 words, outline \"garbage collection\"\n2019/10/03 - 223 words, outline \"garbage collection\"\n2019/10/02 - finish organizing snippets\n2019/10/01 - more splitting and organizing snippets\n2019/09/30 - more splitting and organizing snippets\n2019/09/29 - split up and order snippets\n2019/09/28 - issues and pull requests\n2019/09/27 - publish \"closures\"\n2019/09/26 - split out snippets for \"garbage collection\"\n2019/09/25 - 2142 words, finish third draft \"closures\"\n2019/09/24 - 
1173 words, third draft \"closures\"\n2019/09/23 - 2484 words, third draft \"closures\"\n2019/09/22 - 2633 words, third draft \"closures\"\n2019/09/21 - 2084 words, third draft \"closures\"\n2019/09/20 - tease apart and commit changes\n2019/09/19 - 1 illustration\n2019/09/18 - 1 illustration, fix positions of asides in chrome\n2019/09/17 - 1 illustration\n2019/09/16 - 1 illustration\n2019/09/15 - 1 illustration\n2019/09/14 - 1 illustration\n2019/09/13 - 1 illustration\n2019/09/12 - 1 illustration\n2019/09/11 - 2176 words, second draft \"closures\"\n2019/09/10 - 759 words, second draft \"closures\"\n2019/09/09 - 1788 words, second draft \"closures\"\n2019/09/08 - 1210 words, second draft \"closures\"\n2019/09/07 - 2603 words, second draft \"closures\"\n2019/09/06 - 1395 words, second draft \"closures\"\n2019/09/05 - 699 words, second draft \"closures\"\n2019/09/04 - fix an issue and publish site\n2019/09/03 - email and bug fixes\n2019/09/02 - six issues\n2019/09/01 - three issues\n2019/08/31 - use bank 17\n2019/08/30 - answer 3\n2019/08/29 - finish answer 2\n2019/08/28 - work on answer 2\n2019/08/27 - work on answer 2\n2019/08/26 - write up text for answer 1\n             (bank 18) work on answer 2\n2019/08/25 - code for answer 1\n2019/08/24 - 17 emails\n             (bank 17) 950 words, finish first draft \"closures\"\n2019/08/23 - first draft \"closures\"\n2019/08/22 - 749 words, first draft \"closures\"\n2019/08/21 - 1017 words, first draft \"closures\"\n2019/08/20 - 1327 words, first draft \"closures\"\n2019/08/19 - 1207 words, first draft \"closures\"\n2019/08/18 - 1016 words, first draft \"closures\"\n2019/08/17 - 1401 words, first draft \"closures\"\n2019/08/16 - 419 words, first draft \"closures\"\n2019/08/15 - email\n2019/08/14 - 401 words, first draft \"closures\"\n2019/08/13 - 1271 words, first draft \"closures\"\n2019/08/12 - outline design note for \"closures\"\n2019/08/11 - code samples for \"closures\" design note\n2019/08/10 - more outline 
\"closures\"\n2019/08/09 - more outline \"closures\"\n2019/08/08 - more outline \"closures\"\n2019/08/07 - 408 words outline \"closures\"\n2019/08/06 - 447 words outline \"closures\"\n2019/08/05 - 515 words outline \"closures\"\n2019/08/04 - more outline \"closures\"\n2019/08/03 - research, 283 words outline \"closures\"\n2019/08/02 - 4 prs and issues\n2019/08/01 - survey about (void), 6 prs and issues\n2019/07/31 - 7 prs and issues\n2019/07/30 - get \"calls and functions\" compiling partway through\n2019/07/29 - finish snippet test script\n2019/07/28 - set up test snippets for more chapters\n2019/07/27 - set up test snippets for more chapters\n2019/07/26 - set up test snippets for more chapters\n2019/07/25 - start building system to test snippets in middle of chapters\n2019/07/24 - close 2 prs and 6 issues\n2019/07/23 - answers for \"calls and functions\"\n2019/07/22 - 1812 words, finish third draft \"calls and functions\"\n2019/07/21 - 2145 words, third draft \"calls and functions\"\n2019/07/20 - 1658 words, third draft \"calls and functions\"\n2019/07/19 - 3530 words, third draft \"calls and functions\"\n2019/07/18 - photoshop illustration\n2019/07/17 - ink illustration\n2019/07/16 - 178 words outline closures\n2019/07/15 - re-read jlox chapters around closures\n2019/07/14 - merge calls branch into closures\n2019/07/13 - pencil illustration\n2019/07/12 - photoshop four illustrations\n2019/07/11 - redraw three illustrations\n2019/07/10 - fix more bugs\n2019/07/09 - track down bugs, fix stack handling of script\n2019/07/08 - finish ordering snippets for \"closures\"\n2019/07/07 - ordering snippets for \"closures\"\n2019/07/06 - ordering snippets for \"closures\"\n2019/07/05 - outline \"closures\"\n2019/07/04 - split snippets for \"closures\"\n2019/07/03 - email and bug fixes\n2019/07/02 - another illustration for \"calls\"\n2019/07/01 - ink and photoshop 3 illustrations for \"calls\"\n2019/06/30 - pencil 3 illustrations for \"calls\"\n2019/06/29 - another 
illustration for \"calls\"\n2019/06/28 - ink and photoshop illustration for \"calls\"\n2019/06/27 - pencil illustration for \"calls\"\n2019/06/26 - 2169 words finish second draft \"calls\"\n2019/06/25 - 1676 words second draft \"calls\"\n2019/06/24 - 1065 words second draft \"calls\"\n2019/06/23 - 2167 words second draft \"calls\"\n2019/06/22 - 1793 words second draft \"calls\"\n2019/06/21 - 1124 words finish first draft \"calls\"\n2019/06/20 - 1943 words first draft \"calls\"\n2019/06/19 - 758 words first draft \"calls\"\n2019/06/18 - 1050 words first draft \"calls\"\n2019/06/17 - 744 words first draft \"calls\"\n2019/06/16 - 75 words first draft \"calls\" (raccoon)\n2019/06/15 - 490 words first draft \"calls\"\n2019/06/14 - 865 words first draft \"calls\"\n2019/06/13 - 472 words first draft \"calls\"\n2019/06/12 - 550 words first draft \"calls\"\n2019/06/11 - 840 words first draft \"calls\"\n2019/06/10 - challenges and finish outline \"calls\"\n2019/06/09 - 160 words outline \"calls\"\n2019/06/08 - 1126 words outline \"calls\"\n2019/06/07 - 585 words outline \"calls\"\n2019/06/06 - 168 words outline \"calls\", change max arg count\n2019/06/05 - simplify call, invoke, and super instructions\n2019/06/04 - 544 words outline \"calls\"\n2019/06/03 - 331 words outline \"calls\", quote, fix build script\n2019/06/02 - 853 words outline \"calls\"\n2019/06/01 - 571 words outline \"calls\"\n2019/05/31 - 267 words outline \"calls\"\n2019/05/30 - sign polish translation contract, 90 words outline \"calls\"\n2019/05/29 - fix illustration bugs\n2019/05/28 - email\n2019/05/27 - email\n2019/05/26 - email\n2019/05/25 - bugs\n2019/05/24 - email\n2019/05/23 - merge pr\n2019/05/22 - 10 issues, 3 pull requests\n2019/05/21 - tweak some code\n2019/05/20 - fix bugs\n2019/05/19 - publish chapter\n2019/05/18 - finish organizing snippets for \"calls and functions\"\n2019/05/17 - still more work on \"calls and functions\" snippets\n2019/05/16 - more work on \"calls and functions\" 
snippets\n2019/05/15 - a little work on \"calls and functions\" snippets\n2019/05/14 - a little work on \"calls and functions\" snippets\n2019/05/13 - fix issues, merge branches, work on \"calls and functions\" snippets\n2019/05/12 - fix/close 10 issues\n2019/05/11 - 3414 words, finish third draft \"jumping\"\n2019/05/10 - set up venv for python stuff, update markdown\n2019/05/09 - work on snippets for \"calls and functions\"\n2019/05/08 - work on snippets for \"calls and functions\"\n2019/05/07 - work on snippets for \"calls and functions\"\n2019/05/06 - split up snippets for \"calls and functions\"\n2019/05/05 - 1734 words, third draft \"jumping\"\n2019/05/04 - 1046 words, third draft \"jumping\"\n2019/05/03 - another illustration\n2019/05/02 - more illustrations\n2019/05/01 - one illustration\n2019/04/30 - email\n2019/04/29 - photoshop two illustrations\n2019/04/28 - photoshop three illustrations\n2019/04/27 - ink two illustrations\n2019/04/26 - ink two illustrations\n2019/04/25 - ink two illustrations\n2019/04/24 - draw and photoshop one illustration\n2019/04/23 - pencil illustrations\n2019/04/22 - pencil illustrations\n2019/04/21 - pencil illustrations\n2019/04/20 - answers 2 and 3 for \"jumping\"\n2019/04/19 - answer 1 for \"jumping\"\n2019/04/18 - fix rest of grammar examples in \"representing code\"\n2019/04/17 - 1528 words, finish first draft \"jumping\"\n2019/04/16 - 318 words, first draft \"jumping\"\n2019/04/15 - 1049 words, first draft \"jumping\"\n2019/04/14 - 559 words, first draft \"jumping\"\n2019/04/13 - 661 words, first draft \"jumping\"\n2019/04/12 - fix overlapping chapter number, quotes\n2019/04/11 - 878 words, first draft \"jumping\"\n2019/04/10 - 828 words, first draft \"jumping\"\n2019/04/09 - use bank 16\n2019/04/08 - fix five issues\n             (bank 16) 463 words, first draft \"jumping\"\n2019/04/07 - redo illustration for #378\n2019/04/06 - work on #378\n2019/04/05 - fix two issues\n2019/04/04 - 863 words, design note\n2019/04/03 - 
illustrations\n2019/04/02 - outline design note\n2019/04/01 - research goto considered harmful\n2019/03/31 - 155 words outline \"jumping back and forth\"\n2019/03/30 - 829 words outline \"jumping back and forth\"\n2019/03/29 - 409 words outline \"jumping back and forth\"\n2019/03/28 - 457 words outline \"jumping back and forth\"\n2019/03/27 - 278 words outline \"jumping back and forth\"\n2019/03/26 - rename chapter, 309 words outline \"jumping back and forth\"\n2019/03/25 - split and order snippets for \"jumping forward and back\"\n2019/03/24 - publish chapter\n2019/03/23 - email\n2019/03/22 - email\n2019/03/21 - email\n2019/03/20 - email\n2019/03/19 - email\n2019/03/18 - 1921 words, finish third draft \"local variables\"\n2019/03/17 - 3029 words, third draft \"local variables\"\n2019/03/16 - tweak illustration and add caption\n2019/03/15 - another image\n2019/03/14 - photoshop illustration\n2019/03/13 - draw illustration\n2019/03/12 - ink and photoshop illustration\n2019/03/11 - pencil illustration\n2019/03/10 - another illustration\n2019/03/09 - ink and photoshop illustration\n2019/03/08 - pencil and start inking illustration\n2019/03/07 - 2131 words, finish second draft \"local variables\"\n2019/03/06 - 944 words, second draft \"local variables\"\n2019/03/05 - finish off #394\n2019/03/04 - more work on #394\n2019/03/03 - work on #394\n2019/03/02 - work on #394\n2019/03/01 - work on #394\n2019/02/28 - work on #394\n2019/02/27 - 59 words, second draft \"local variables\" (ginny :( )\n2019/02/26 - 554 words, second draft \"local variables\"\n2019/02/25 - 649 words, second draft \"local variables\"\n2019/02/24 - 659 words, second draft \"local variables\"\n2019/02/23 - 3 issues and prs\n2019/02/22 - email\n2019/02/21 - fix #389\n2019/02/20 - 5 bugs\n2019/02/19 - 8 bugs\n2019/02/18 - 385 words, finish first draft \"local variables\"\n2019/02/17 - 877 words, first draft \"local variables\"\n2019/02/16 - 698 words, first draft \"local variables\"\n2019/02/15 - more 
work on first draft \"local variables\"\n2019/02/14 - 1224 words, first draft \"local variables\"\n2019/02/13 - 169 words, first draft \"local variables\"\n2019/02/12 - 634 words, first draft \"local variables\"\n2019/02/11 - 395 words, first draft \"local variables\"\n2019/02/10 - 217 words, first draft \"local variables\"\n2019/02/09 - 835 words, finish outline \"local variables\"\n2019/02/08 - 394 words, outline \"local variables\"\n2019/02/07 - 472 words, outline \"local variables\"\n2019/02/06 - email\n2019/02/05 - 1 more issue\n2019/02/04 - 10 issues\n2019/02/03 - 67 words, outline \"local variables\"\n2019/02/02 - 446 words, outline \"local variables\"\n2019/02/01 - fix broken repo, look into broken payhip\n2019/01/31 - simplify how \"in its own initializer\" error is reported, finish ordering snippets\n2019/01/30 - work on ordering snippets\n2019/01/29 - put chapter online, work on snippets for \"local variables\"\n2019/01/28 - fix 5 issues\n2019/01/27 - fix 17 issues\n2019/01/26 - fix 5 issues, work on one more\n2019/01/25 - fix a few issues, work on #327\n2019/01/24 - fix issues\n2019/01/23 - 1880 words, finish third draft \"global variables\", last answer\n2019/01/22 - 2733 words, third draft \"global variables\"\n2019/01/21 - photoshop illustration\n2019/01/20 - one illustration\n2019/01/19 - two illustrations\n2019/01/18 - photoshop illustration\n2019/01/17 - draw illustration\n2019/01/16 - 6 emails\n2019/01/15 - edit and tweak \"global variables\"\n2019/01/14 - two answers for challenges in \"global variables\"\n2019/01/13 - finish second draft of \"global variables\"\n2019/01/12 - redo intro to \"global variables\"\n2019/01/11 - 1030 words, second draft \"global variables\"\n2019/01/10 - 800 words, second draft \"global variables\"\n2019/01/09 - 17 emails\n2019/01/08 - 10 emails\n2019/01/07 - 376 words, second draft \"global variables\"\n2019/01/06 - taxes\n2019/01/05 - 361 words, second draft \"global variables\"\n2019/01/04 - 181 words, second 
draft \"global variables\"\n2019/01/03 - 1811 words, finish first draft \"global variables\"\n2019/01/02 - 1070 words first draft \"global variables\"\n2019/01/01 - 657 words first draft \"global variables\"\n2018/12/31 - 740 words first draft \"global variables\"\n2018/12/30 - challenges\n2018/12/29 - titles for quotes\n2018/12/28 - 526 words outline \"global variables\"\n2018/12/27 - 230 words outline \"global variables\"\n2018/12/26 - 563 words outline \"global variables\"\n2018/12/25 - 561 words outline \"global variables\"\n2018/12/24 - outlining and notes for \"global variables\"\n2018/12/23 - order snippets for \"global variables\"\n2018/12/22 - start outlining and notes for \"global variables\"\n2018/12/21 - publish chapter, split up snippets for \"global variables\"\n2018/12/20 - get email ready\n2018/12/19 - 8 issues\n2018/12/18 - 6 issues\n2018/12/17 - 2 prs, 4 issues\n2018/12/16 - pay taxes\n2018/12/15 - write up answer 1, upgrade markdown package, issues, prs\n2018/12/14 - work on answer 1\n2018/12/13 - 402 words, finish third draft \"hash tables\"\n2018/12/12 - 3135 words, third draft \"hash tables\"\n2018/12/11 - 860 words, third draft \"hash tables\"\n2018/12/10 - 2338 words, third draft \"hash tables\"\n2018/12/09 - 1627 words, third draft \"hash tables\"\n2018/12/08 - fix tombstone illustration text\n2018/12/07 - rework prose for delete illustrations\n2018/12/06 - three more illustrations\n2018/12/05 - photoshop illustration, then do tombstone illustration\n2018/12/04 - draw and ink illustration\n2018/12/03 - photoshop illustration\n2018/12/02 - draw and ink illustration\n2018/12/01 - prose for insert sequence\n2018/11/30 - photoshop insert illustrations\n2018/11/29 - ink and scan insert illustrations\n2018/11/28 - pencil illustrations\n2018/11/27 - photoshop pigeons\n2018/11/26 - pigeon illustration\n2018/11/25 - draw pigeons, pencil one illustration\n2018/11/24 - draw one illustration\n2018/11/23 - couple more prs and bugs\n2018/11/22 - merge a 
few prs\n2018/11/21 - 965 words, second draft of \"hash tables\"\n2018/11/20 - 549 words, second draft of \"hash tables\"\n2018/11/19 - 1840 words, second draft of \"hash tables\"\n2018/11/18 - 179 words, second draft of \"hash tables\"\n2018/11/17 - 866 words, second draft of \"hash tables\"\n2018/11/16 - 1074 words, second draft of \"hash tables\"\n2018/11/15 - 257 words, second draft of \"hash tables\"\n2018/11/14 - 796 words, second draft of \"hash tables\"\n2018/11/13 - 1506 words, second draft of \"hash tables\"\n2018/11/12 - 415 words, first draft of \"hash tables\"\n2018/11/11 - 590 words, first draft of \"hash tables\"\n2018/11/10 - 76 words, first draft of \"hash tables\"\n2018/11/09 - 423 words, first draft of \"hash tables\"\n2018/11/08 - 481 words, first draft of \"hash tables\"\n2018/11/07 - 727 words, first draft of \"hash tables\"\n2018/11/06 - 455 words, first draft of \"hash tables\"\n2018/11/05 - 439 words, first draft of \"hash tables\"\n2018/11/04 - 459 words, first draft of \"hash tables\"\n2018/11/03 - 343 words, first draft of \"hash tables\"\n2018/11/02 - 84 words, first draft of \"hash tables\" (sick)\n2018/11/01 - 331 words, first draft of \"hash tables\"\n2018/10/31 - 208 words, first draft of \"hash tables\"\n2018/10/30 - 323 words, first draft of \"hash tables\"\n2018/10/29 - 810 words, first draft of \"hash tables\"\n2018/10/28 - 617 words, first draft of \"hash tables\"\n2018/10/27 - first draft of \"hash tables\"\n2018/10/26 - 642 words, first draft of \"hash tables\"\n2018/10/25 - 733 words, new first draft of \"hash tables\"\n2018/10/24 - outline more on \"hash tables\"\n2018/10/23 - try to figure out what order to introduce concepts\n2018/10/22 - rewrite some of \"hash tables\"\n2018/10/21 - 344 words, first draft \"hash tables\"\n2018/10/20 - 186 words, first draft \"hash tables\"\n2018/10/19 - 401 words, first draft \"hash tables\"\n2018/10/18 - 384 words, first draft \"hash tables\"\n2018/10/17 - 221 words, finish outline 
\"hash tables\"\n2018/10/16 - 351 words, outline \"hash tables\"\n2018/10/15 - finish outlining deletion\n2018/10/14 - token amount of work on deletion :(\n2018/10/13 - work on deletion a bit\n2018/10/12 - use tombstones in hash table\n2018/10/11 - benchmark hash table deletion\n2018/10/10 - research deleting from hash tables\n2018/10/09 - more outlining \"hash tables\"\n2018/10/08 - 327 words, outline \"hash tables\"\n2018/10/07 - 390 words, outline \"hash tables\"\n2018/10/06 - little more outlining \"hash tables\"\n2018/10/05 - more outline \"hash tables\"\n2018/10/04 - 470 words, outline \"hash tables\"\n2018/10/03 - more outlining \"hash tables\"\n2018/10/02 - 603 words, outline \"hash tables\"\n2018/10/01 - 246 words, outline \"hash tables\"\n2018/09/30 - 165 words, outline \"hash tables\"\n2018/09/29 - 148 words, outline \"hash tables\"\n2018/09/28 - finish ordering snippets\n2018/09/27 - more ordering snippets\n2018/09/26 - start ordering snippets\n2018/09/25 - outlining on \"hash tables\"\n2018/09/24 - put strings online\n2018/09/23 - slice up snippets for \"hash tables\"\n2018/09/22 - 4288 words, finish third draft \"strings\"\n2018/09/21 - ink and photoshop illustration\n2018/09/20 - one illustration, pencil another\n2018/09/19 - redo illustration to fix 265\n2018/09/18 - fix 282\n2018/09/17 - fix two illustrations\n2018/09/16 - close two issues\n2018/09/15 - fix 269\n2018/09/14 - merge 2 prs\n2018/09/13 - finish illustration\n2018/09/12 - start illustration\n2018/09/11 - photoshop one illustration\n2018/09/10 - draw and ink one illustration\n2018/09/09 - 922 words, third draft \"strings\"\n2018/09/08 - taxes\n2018/09/07 - 2695 words, finish second draft \"strings\"\n2018/09/06 - 1972 words, second draft \"strings\"\n2018/09/05 - 516 words, second draft \"strings\"\n2018/09/04 - answers for \"strings\" challenges\n2018/09/03 - 330 words, finish first draft \"strings\"\n2018/09/02 - 754 words, first draft \"strings\"\n2018/09/01 - 1187 words, first draft 
\"strings\"\n2018/08/31 - 1363 words, first draft \"strings\"\n2018/08/30 - 1101 words, first draft \"strings\"\n2018/08/29 - 505 words, first draft \"strings\"\n2018/08/28 - 485 words, outline \"strings\"\n2018/08/27 - 638 words, outline \"strings\"\n2018/08/26 - 506 words, outline \"strings\"\n2018/08/25 - 680 words, outline \"strings\"\n2018/08/24 - 195 words, outline \"strings\"\n2018/08/23 - 344 words, outline \"strings\"\n2018/08/22 - start outlining\n2018/08/21 - finish ordering snippets\n2018/08/20 - more ordering snippets\n2018/08/19 - 4 prs, 1 issue\n2018/08/18 - more ordering snippets, fix aside comments in snippets\n2018/08/17 - start ordering snippets\n2018/08/16 - finish splitting up snippets, start outlining \"strings\"\n2018/08/15 - 10 emails\n2018/08/14 - start slicing up snippets for \"strings\"\n2018/08/13 - put chapter online\n2018/08/12 - fix three issues\n2018/08/11 - answers for \"types of values\"\n2018/08/10 - 1257 words, finish third draft \"types of values\"\n2018/08/09 - 3203 words, third draft \"types of values\"\n2018/08/08 - photoshop one illustration\n2018/08/07 - draw one illustration\n2018/08/06 - one illustration\n2018/08/05 - redo prose around value size\n2018/08/04 - use bank 15\n2018/08/03 - three illustrations\n2018/08/02 - redo location code in build script\n2018/08/01 - handle trailing commas in snippets better\n2018/07/31 - 5 issues\n2018/07/30 - finish fixing horizontal code scrolling and long lines\n2018/07/29 - work on fixing horizontal code scrolling\n2018/07/28 - 1 pull request, 2 issues, try to fix another\n2018/07/27 - use bank 14\n2018/07/26 - 1766 words, finish second first draft \"types of values\"\n2018/07/25 - 933 words, second first draft \"types of values\"\n2018/07/24 - 1000 words, second draft \"types of values\"\n             (bank 15) 1399 words, second draft \"types of values\"\n2018/07/23 - 940 words, finish first draft \"types of values\"\n2018/07/22 - 324 words, first draft \"types of 
values\"\n2018/07/21 - use bank 13 (hauberk hackathon)\n2018/07/20 - 516 words, first draft design note for \"compiling expressions\"\n2018/07/19 - 892 words, first draft \"types of values\"\n2018/07/18 - 824 words, first draft \"types of values\"\n2018/07/17 - 100 words, first draft \"types of values\"\n             (bank 14) 439 words, first draft \"types of values\"\n2018/07/16 - 14 emails\n2018/07/15 - 203 words, first draft \"types of values\"\n2018/07/14 - 810 words, outline \"types of values\"\n2018/07/13 - 793 words, outline \"types of values\"\n2018/07/12 - 608 words, outline \"types of values\"\n2018/07/11 - order snippets, start rough outline\n2018/07/10 - split out snippets and start organizing \"types of values\"\n2018/07/09 - put chapter online\n2018/07/08 - write email\n2018/07/07 - email, typos, and bug reports\n2018/07/06 - 3388 words, finish third draft \"compiling expressions\"\n2018/07/05 - 446 words, third draft \"compiling expressions\"\n2018/07/04 - 1955 words, third draft \"compiling expressions\"\n2018/07/03 - 7 emails\n2018/07/02 - 9 emails\n2018/07/01 - one illustration\n2018/06/30 - one illustration\n2018/06/29 - one illustration (that didn't work out)\n2018/06/28 - ink and photoshop illustration\n2018/06/27 - pencil illustration\n2018/06/26 - research and close one issue\n2018/06/25 - one question and answer for \"compiling expressions\", 5 issues\n             (bank 13) one illustration\n2018/06/24 - 772 words, finish second draft \"compiling expressions\", two answers\n2018/06/23 - 1499 words, second draft \"compiling expressions\"\n2018/06/22 - one illustration\n2018/06/21 - quotes\n2018/06/20 - 1543 words, second draft \"compiling expressions\"\n2018/06/19 - 835 words, second draft \"compiling expressions\"\n2018/06/18 - 1242 words, second draft \"compiling expressions\"\n2018/06/17 - fix two more bugs\n2018/06/16 - fix #238, other tweaks\n2018/06/15 - 14 emails\n2018/06/14 - work on #238\n2018/06/13 - fix todos, 1 pr, 1 
issue\n2018/06/12 - 883 words, first draft \"compiling expressions\"\n2018/06/11 - 177 words, first draft \"compiling expressions\"\n2018/06/10 - 636 words, first draft \"compiling expressions\"\n2018/06/09 - 750 words, first draft \"compiling expressions\"\n2018/06/08 - 710 words, first draft \"compiling expressions\"\n2018/06/07 - 827 words, first draft \"compiling expressions\"\n2018/06/06 - 410 words, first draft \"compiling expressions\"\n2018/06/05 - 488 words, first draft \"compiling expressions\"\n2018/06/04 - 931 words, first draft \"compiling expressions\"\n2018/06/03 - 367 words, first draft \"compiling expressions\"\n2018/06/02 - finish outline \"compiling expressions\"\n2018/06/01 - 379 words outline \"compiling expressions\"\n2018/05/31 - 342 words outline \"compiling expressions\"\n2018/05/30 - more outlining \"compiling expressions\"\n2018/05/29 - more outlining \"compiling expressions\"\n2018/05/28 - taxes, fix footer css on toc\n2018/05/27 - more outlining \"compiling expressions\"\n2018/05/26 - 284 words, outline \"compiling expressions\"\n2018/05/25 - finish slicing and ordering snippets for \"compiling expressions\"\n2018/05/24 - fix early return from init() in jlox\n2018/05/23 - 2 pull requests, 4 issues\n2018/05/22 - more slicing snippets for \"compiling expressions\"\n2018/05/21 - more slicing snippets for \"compiling expressions\"\n2018/05/20 - start slicing snippets for \"compiling expressions\"\n2018/05/19 - use bank 12\n2018/05/18 - post chapter\n2018/05/17 - email\n2018/05/16 - email and a couple of issues\n2018/05/15 - email\n2018/05/14 - 2208 words, finish third draft \"scanning on demand\"\n2018/05/13 - 1748 words, third draft \"scanning on demand\"\n2018/05/12 - 1330 words, third draft \"scanning on demand\"\n2018/05/11 - 1 pr, better snippet locations inside c typedefs\n2018/05/10 - better snippet locations inside c typedefs\n2018/05/09 - 5 bugs\n2018/05/08 - 3 prs, 3 bugs\n2018/05/07 - 1606 words, finish second draft \"scanning on 
demand\"\n2018/05/06 - 943 words, second draft \"scanning on demand\"\n2018/05/05 - 1174 words, second draft \"scanning on demand\"\n             (bank 12) 1022 words, second draft \"scanning on demand\"\n2018/05/04 - 920 words, second draft \"scanning on demand\"\n2018/05/03 - axolotl illustration\n2018/05/02 - draw and photoshop one illustration\n2018/05/01 - draw and photoshop one illustration\n2018/04/30 - photoshop two illustrations\n2018/04/29 - draw another illustration\n2018/04/28 - draw one illustration\n2018/04/27 - challenges and answers for \"scanning on demand\"\n2018/04/26 - 889 words, first draft \"scanning on demand\"\n2018/04/25 - 649 words, first draft \"scanning on demand\"\n2018/04/24 - tweak em dashes\n2018/04/23 - 994 words, first draft \"scanning on demand\"\n2018/04/22 - use bank 11\n2018/04/21 - 651 words, first draft \"scanning on demand\"\n2018/04/20 - 725 words, first draft \"scanning on demand\"\n2018/04/19 - 213 words, first draft \"scanning on demand\"\n2018/04/18 - 857 words, first draft \"scanning on demand\"\n2018/04/17 - 406 words, first draft \"scanning on demand\"\n2018/04/16 - 879 words, finish outline \"scanning on demand\"\n2018/04/15 - 1038 words outline \"scanning on demand\"\n2018/04/14 - 540 words outline \"scanning on demand\"\n2018/04/13 - use bank 10\n2018/04/12 - switch-based keyword recognizer\n2018/04/11 - finish ordering snippets\n2018/04/10 - outlining and ordering snippets\n2018/04/09 - use bank 9\n2018/04/08 - finish slicing up snippets\n2018/04/07 - fix typos\n             (bank 11) start slicing up \"scanning on demand\"\n2018/04/06 - publish \"a virtual machine\", fix typos\n2018/04/05 - redo reallocate()\n2018/04/04 - 5 pull requests, 10 bugs\n2018/04/03 - 2000 words, finish third draft \"a virtual machine\"\n             (bank 10) 2034 words, third draft \"a virtual machine\"\n2018/04/02 - 1937 words, third draft \"a virtual machine\"\n2018/04/01 - ink and photoshop two illustrations\n2018/03/30 - pencil 
two illustrations\n2018/03/29 - pancakes\n2018/03/28 - photoshop three illustrations\n2018/03/27 - ink more illustrations\n2018/03/26 - one more illustration\n2018/03/25 - record screencast, edit video\n2018/03/24 - shoot video, start editing\n             (bank 9) edit\n2018/03/23 - sketch another illustration\n2018/03/22 - one illustration\n2018/03/21 - two illustrations\n2018/03/20 - illustratin'\n2018/03/19 - practice illustration\n2018/03/18 - work on script and set\n2018/03/17 - answer challenge three\n2018/03/16 - answer two challenges\n2018/03/15 - test video\n2018/03/14 - 1763 words, second draft \"a virtual machine\"\n2018/03/13 - 923 words, second draft \"a virtual machine\"\n2018/03/12 - 1762 words, second draft \"a virtual machine\"\n2018/03/11 - 727 words, second draft \"a virtual machine\"\n2018/03/10 - 624 words, second draft \"a virtual machine\"\n2018/03/09 - first draft of note for \"a virtual machine\"\n2018/03/08 - outline note for \"a virtual machine\"\n2018/03/07 - 402 words, challenges and finish first draft \"a virtual machine\"\n2018/03/06 - 844 words, first draft \"a virtual machine\"\n2018/03/05 - 506 words, first draft \"a virtual machine\"\n2018/03/04 - 615 words, first draft \"a virtual machine\"\n2018/03/03 - 922 words, first draft \"a virtual machine\"\n2018/03/02 - 691 words, first draft \"a virtual machine\"\n2018/03/01 - 362 words, first draft \"a virtual machine\"\n2018/02/28 - 319 words, first draft \"a virtual machine\"\n2018/02/27 - 585 words, first draft \"a virtual machine\"\n2018/02/26 - 535 words, finish main outline \"a virtual machine\"\n2018/02/25 - 824 words outline \"a virtual machine\"\n2018/02/24 - 932 words outline \"a virtual machine\"\n2018/02/23 - finish ordering snippets\n2018/02/22 - order snippets\n2018/02/21 - finish slicing \"a virtual machine\", start outlining\n2018/02/20 - mostly finish slicing \"a virtual machine\" snippets\n2018/02/19 - post chapter online\n2018/02/18 - print style pr, 3 bugs, other 
stuff\n2018/02/17 - 7 pull requests\n2018/02/16 - second answer for \"chunks of bytecode\"\n2018/02/15 - one answer for \"chunks of bytecode\"\n2018/02/14 - 1921 words, finish third draft of \"chunks of bytecode\"\n2018/02/13 - 2110 words, third draft of \"chunks of bytecode\"\n2018/02/12 - 923 words, third draft of \"chunks of bytecode\"\n2018/02/11 - 246 words, third draft of \"chunks of bytecode\" (service :( )\n2018/02/10 - 514 words, third draft of \"chunks of bytecode\"\n2018/02/09 - 1887 words, third draft of \"chunks of bytecode\"\n2018/02/08 - more slicing \"a virtual machine\" snippets\n2018/02/07 - start slicing \"a virtual machine\" snippets\n2018/02/06 - more quotes, 7 emails\n2018/02/05 - 2 emails, quote research\n2018/02/04 - one more illustration\n2018/02/03 - table for realloc()\n2018/02/02 - photoshop 2 illustrations\n2018/02/01 - ink 1 1/2 illustrations\n2018/01/31 - pencil 2 illustrations for \"chunks of bytecode\", ink 1/2\n2018/01/30 - illustration for \"chunks of bytecode\"\n2018/01/29 - taxes\n2018/01/28 - 1941 words, finish second draft \"chunks of bytecode\"\n2018/01/27 - 2192 words, second draft \"chunks of bytecode\"\n2018/01/26 - 1386 words, second draft \"chunks of bytecode\"\n2018/01/25 - 499 words, second draft \"chunks of bytecode\" (flying)\n2018/01/24 - 768 words, second draft \"chunks of bytecode\"\n2018/01/23 - use bank 8\n2018/01/22 - ink and photoshop ast illustration\n             (bank 8) 1056 words, second draft \"chunks of bytecode\"\n2018/01/21 - pencil ast illustration\n2018/01/20 - figure out how to illustrate chunks\n2018/01/19 - design note for \"chunks of bytecode\"\n2018/01/18 - outline design note for \"chunks of bytecode\"\n2018/01/17 - challenges for \"chunks of bytecode\"\n2018/01/16 - 919 words, first draft \"chunks of bytecode\"\n2018/01/15 - implement run-length encoding of line info\n2018/01/14 - 775 words, first draft \"chunks of bytecode\"\n2018/01/13 - 737 words, first draft \"chunks of 
bytecode\"\n2018/01/12 - 186 words, first draft \"chunks of bytecode\"\n2018/01/11 - 250 words, first draft \"chunks of bytecode\"\n2018/01/10 - 254 words, first draft \"chunks of bytecode\" (4 :( )\n2018/01/09 - 130 words, first draft \"chunks of bytecode\" (traveling)\n2018/01/08 - 434 words, first draft \"chunks of bytecode\"\n2018/01/07 - 653 words, first draft \"chunks of bytecode\"\n2018/01/06 - 201 words, first draft \"chunks of bytecode\"\n2018/01/05 - 216 words, first draft \"chunks of bytecode\"\n2018/01/04 - 190 words, first draft \"chunks of bytecode\"\n2018/01/03 - 147 words, first draft \"chunks of bytecode\", try storing ip on stack\n2018/01/02 - 852 words, first draft \"chunks of bytecode\"\n2018/01/01 - 623 words, first draft \"chunks of bytecode\"\n2017/12/31 - work on optimizing clox\n2017/12/30 - 537 words, first draft \"chunks of bytecode\"\n2017/12/29 - finish outlining \"chunks of bytecode\"\n2017/12/28 - more outlining \"chunks of bytecode\"\n2017/12/27 - more outlining \"chunks of bytecode\"\n2017/12/26 - 233 words outline \"chunks of bytecode\"\n2017/12/25 - more work outlining \"chunks of bytecode\"\n2017/12/24 - more work organizing \"chunks of bytecode\"\n2017/12/23 - 149 words outline \"chunks of bytecode\"\n2017/12/22 - 164 words outline \"chunks of bytecode\"\n2017/12/21 - finish splitting snippets for \"chunks of bytecode\"\n2017/12/20 - split up and organize snippets for \"chunks of bytecode\"\n2017/12/19 - more outline \"chunks of bytecode\"\n2017/12/18 - 186 words, outline \"chunks of bytecode\"\n2017/12/17 - add generated ast appendix and link to chapters\n2017/12/16 - add appendices and grammar appendix\n2017/12/15 - 9 emails\n2017/12/14 - 10 emails\n2017/12/13 - 3 emails\n2017/12/12 - 11 emails\n2017/12/11 - put \"inheritance\" online\n2017/12/10 - one more illustration for \"inheritance\"\n2017/12/09 - merge 6 prs and close 7 bugs\n2017/12/08 - 2723 words, finish third draft of \"inheritance\"\n2017/12/07 - 1401 words, third 
draft of \"inheritance\"\n2017/12/06 - incorporate illustration into text\n2017/12/05 - process illustration\n2017/12/04 - draw one illustration\n2017/12/05 - draw and process one illustration\n2017/12/03 - draw and process two illustrations\n2017/12/02 - more quote digging\n2017/12/01 - look for quotes\n2017/11/30 - more splitting up code for \"bytecode\", outlining\n2017/11/29 - start splitting up code for \"bytecode\"\n2017/11/28 - prose for challenge 1 answer in inheritance\n2017/11/27 - code for challenge 1 answer in inheritance\n2017/11/26 - research c3 linearization\n2017/11/25 - answer challenge 2 in inheritance\n2017/11/24 - prose for challenge 3 answer in inheritance\n2017/11/23 - code for challenge 3 answer in inheritance\n2017/11/22 - 1105 words, finish second draft \"inheritance\"\n2017/11/21 - 485 words, second draft \"inheritance\"\n2017/11/20 - 580 words, second draft \"inheritance\"\n2017/11/19 - 212 words, second draft \"inheritance\"\n2017/11/18 - 751 words, second draft \"inheritance\"\n2017/11/17 - 759 words, second draft \"inheritance\" (still sick)\n2017/11/16 - 203 words, second draft \"inheritance\" (still sick)\n2017/11/15 - tweaks and asides\n2017/11/14 - 1526 words, finish first draft \"inheritance\"\n2017/11/13 - broke the chain, sick and forgot, made up on 11/14\n2017/11/12 - 303 words, first draft \"inheritance\" (sick)\n2017/11/11 - 659 words, first draft \"inheritance\"\n2017/11/10 - 453 words, first draft \"inheritance\"\n2017/11/09 - 594 words, first draft \"inheritance\"\n2017/11/08 - long aside on \"sub-\"\n2017/11/07 - 535 words, first draft \"inheritance\"\n2017/11/06 - finish challenges and outline for \"inheritance\"\n2017/11/05 - outline conclusion\n2017/11/04 - mostly done with outline, challenges\n2017/11/03 - work on outline\n2017/11/02 - research oop history\n2017/11/01 - start outline and taking notes for \"inheritance\"\n2017/10/31 - email, split up snippets for \"inheritance\"\n2017/10/30 - post new 
chapter\n2017/10/29 - write email\n2017/10/28 - 1139 words, finish third draft of \"classes\"\n2017/10/27 - fix #156\n2017/10/26 - one more illustration for \"classes\"\n2017/10/25 - 859 words, third draft of \"classes\"\n2017/10/24 - 1579 words, third draft of \"classes\"\n2017/10/23 - 1865 words, third draft of \"classes\"\n2017/10/22 - 1147 words, third draft of \"classes\"\n2017/10/21 - 769 words, third draft of \"classes\"\n2017/10/20 - fix #147, #153, #131\n2017/10/19 - 7 pull requests and a few bugs\n2017/10/18 - redo section around this to take illustrations into account\n2017/10/17 - photoshop two illustrations and work into text\n2017/10/16 - draw two more illustrations for \"classes\"\n2017/10/15 - finish fourth illustration for \"classes\"\n2017/10/14 - start working on illustration four for \"classes\"\n2017/10/13 - third illustration for \"classes\"\n2017/10/12 - second illustration for \"classes\"\n2017/10/11 - first illustration for \"classes\"\n2017/10/10 - 968 words, finish second draft of \"classes\"\n2017/10/09 - 1460 words, second draft of \"classes\"\n2017/10/08 - 1318 words, second draft of \"classes\"\n2017/10/07 - 1481 words, second draft of \"classes\"\n2017/10/06 - 1138 words, second draft of \"classes\"\n2017/10/05 - 902 words, second draft of \"classes\"\n2017/10/04 - finish answers for \"classes\"\n2017/10/03 - work on answers for \"classes\"\n2017/10/02 - 897 words, design note for \"classes\", finish first draft\n2017/10/01 - work on challenges for \"classes\"\n2017/09/30 - 179 words, outline design note for \"classes\"\n2017/09/29 - 2155 words across two sessions, first draft \"classes\"\n2017/09/28 - 828 words, first draft \"classes\"\n2017/09/27 - 835 words, first draft \"classes\"\n2017/09/26 - broke the chain, busy in aarhus and forgot :(\n             made up on 09/29\n2017/09/25 - 417 words, first draft \"classes\"\n2017/09/24 - 712 words, first draft \"classes\"\n2017/09/23 - 891 words, first draft \"classes\"\n2017/09/22 - 
344 words, first draft \"classes\"\n2017/09/21 - 327 words, first draft \"classes\"\n2017/09/20 - finish rough outline \"classes\"\n2017/09/19 - 773 words outline \"classes\"\n2017/09/18 - 278 words outline \"classes\"\n2017/09/17 - finish ordering and slicing snippets\n2017/09/16 - more ordering and slicing up snippets\n2017/09/15 - more ordering and slicing up snippets\n2017/09/14 - last answers for chapter 11\n2017/09/13 - first three answers for chapter 11\n2017/09/11 - split up snippets and start outlining chapter 12\n2017/09/11 - publish chapter\n2017/09/10 - fix 1 bug, prep email\n2017/09/09 - 4 prs and 5 bugs\n2017/09/08 - 1853 words, finish third draft chapter 11\n2017/09/07 - 1428 words, third draft chapter 11\n2017/09/06 - 1101 words, third draft chapter 11\n2017/09/05 - ~1870 words, third draft chapter 11\n2017/09/04 - 1 more illustration\n2017/09/03 - work illustrations into chapter\n2017/09/02 - ~517 words, third draft chapter 11\n2017/09/01 - ~500 words, third draft chapter 11\n2017/08/31 - photoshop 4 illustrations\n2017/08/30 - ink 4 illustrations\n2017/08/29 - 1724 words, third draft chapter 11\n2017/08/28 - 409 words, third draft chapter 11\n2017/08/27 - sketch illustrations\n2017/08/26 - explain semantic analysis\n2017/08/25 - 1450 words, finish second draft chapter 11, delete ~50\n2017/08/24 - 945 words, second draft chapter 11\n2017/08/23 - 1322 words, second draft chapter 11, delete ~120\n2017/08/22 - 849 words, second draft chapter 11, delete ~130\n2017/08/21 - use bank 7\n2017/08/20 - 760 words, second draft chapter 11, delete ~120\n2017/08/19 - 828 words, second draft chapter 11, delete 140\n2017/08/18 - 1000 words, finish first draft chapter 11\n             (bank 7) 955 words first draft chapter 11\n2017/08/17 - 874 words, first draft chapter 11\n2017/08/16 - 958 words, first draft chapter 11\n2017/08/15 - 592 words, first draft chapter 11\n2017/08/14 - 546 words, first draft chapter 11\n2017/08/13 - 829 words, first draft chapter 
11\n2017/08/12 - 315 words, first draft chapter 11\n2017/08/11 - 490 words, first draft chapter 11\n2017/08/10 - 142 words, first draft chapter 11 (camping, sick :( )\n2017/08/09 - finish merging old and new outline\n2017/08/08 - redo 321 words outline chapter 11\n2017/08/07 - finish outline and code snippet splitting\n2017/08/06 - work on reorganizing outline\n2017/08/05 - 806 words outline chapter 11\n2017/08/04 - 400 words outline chapter 11\n2017/08/03 - 539 words outline chapter 11\n2017/08/02 - 876 words outline chapter 11\n2017/08/01 - label code snippets for chapter 11\n2017/07/31 - publish chapter\n2017/07/30 - prep email, fix #122\n2017/07/29 - email\n2017/07/28 - redo direction illustration, fix bugs\n2017/07/27 - address 3 prs and work on 1 bug\n2017/07/26 - 2092 words, finish third draft \"functions\"\n2017/07/25 - 2917 words, third draft \"functions\"\n2017/07/24 - fix #123\n2017/07/23 - 1521 words, third draft \"functions\"\n2017/07/22 - integrate illustrations into text\n2017/07/21 - photoshop five illustrations\n2017/07/20 - draw three illustrations\n2017/07/19 - draw two illustrations\n2017/07/18 - photoshop two illustrations, other chapter tweaks\n2017/07/17 - draw two illustrations\n2017/07/16 - 1120 words, finish second draft chapter 10, write answers\n2017/07/15 - 953 words, second draft chapter 10\n2017/07/14 - 1230 words, second draft chapter 10\n2017/07/13 - 3007 words, second draft chapter 10\n2017/07/12 - 1785 words, finish draft chapter 10\n2017/07/11 - 885 words, first draft chapter 10\n2017/07/10 - 564 words, first draft chapter 10\n2017/07/09 - 1152 words, first draft chapter 10\n2017/07/08 - 1073 words, first draft chapter 10\n2017/07/07 - 631 words, first draft chapter 10\n2017/07/06 - 547 words, finish outline chapter 10, first draft challenges\n2017/07/05 - 169 words, outline chapter 10 :(\n2017/07/04 - 948 words, outline chapter 10\n2017/07/03 - 796 words, outline chapter 10\n2017/07/02 - 526 words, outline chapter 10\n2017/07/01 
- finish organizing code snippets for chapter 10\n2017/06/30 - more slicing and organizing code snippets for chapter 10\n2017/06/29 - code snippets for chapter 10\n2017/06/28 - email and bug fixes\n2017/06/27 - address 3 prs and 4 bugs\n2017/06/26 - publish \"control flow\"\n2017/06/25 - address 2 prs and 4 bugs\n2017/06/24 - answers for chapter 9\n2017/06/23 - third draft of chapter 9, all 5837 words\n2017/06/22 - illustrate dangling else\n2017/06/21 - add support for aside images above the text\n2017/06/20 - illustrate turing machine\n2017/06/19 - 2499 words, finish second draft chapter 9 (cut ~80)\n2017/06/18 - 1818 words, second draft chapter 9 (cut ~250)\n2017/06/17 - answers for chapter 8\n2017/06/16 - 688 words, finish first draft chapter 9\n2017/06/15 - redo turing machine part\n2017/06/14 - 1666 words first draft chapter 9\n2017/06/13 - ~1440 words first draft chapter 9\n2017/06/12 - 854 words first draft chapter 9\n2017/06/11 - hack script to estimate completion date\n2017/06/10 - 897 words outline chapter 9\n2017/06/09 - 700 words outline chapter 9\n2017/06/08 - 648 words outline chapter 9\n2017/06/07 - split up snippets for control flow, start outlining\n2017/06/06 - fix snippet labeling (#97) and 3 other bugs\n2017/06/05 - work on fixing snippet labeling (#97)\n2017/06/04 - paperwork, 1 pr, 5 bugs\n2017/06/03 - 6 prs, 4 bugs\n2017/06/02 - 12 emails\n2017/06/01 - put chapter 8 online\n2017/05/31 - one pr, two bugs\n2017/05/30 - one pr, two bugs\n2017/05/29 - show date range in copyright, one pr\n2017/05/28 - resolve 3 bugs\n2017/05/27 - merge 2 prs, fix 6 bugs\n2017/05/26 - 1484 words, finish third draft chapter 8\n2017/05/25 - 1412 words, third draft chapter 8\n2017/05/24 - 2392 words, third draft chapter 8 (cut ~80)\n2017/05/23 - 1219 words, third draft chapter 8\n2017/05/22 - 1198 words, third draft chapter 8 (cut ~60)\n2017/05/21 - 579 words, third draft chapter 8 (cut ~50)\n2017/05/20 - scan and process illustrations\n2017/05/19 - cactus 
illustration\n2017/05/18 - environment illustrations\n2017/05/17 - letter and scan brain illustration\n2017/05/16 - brain illustration 2\n2017/05/15 - brain illustration\n2017/05/14 - 1663 words, finish second draft chapter 8 (cut ~100)\n2017/05/13 - 1051 words, second draft chapter 8 (cut ~50)\n2017/05/12 - 1348 words, second draft chapter 8 (cut ~40)\n2017/05/11 - 1342 words, second draft chapter 8 (cut ~100)\n2017/05/10 - 1236 words, second draft chapter 8\n2017/05/09 - 1090 words, second draft chapter 8 (cut ~100)\n2017/05/08 - 786 words, second draft chapter 8\n2017/05/07 - tinker with splitting chapter 8 in two, replace quote\n2017/05/06 - 1621 words, finish first draft chapter 8\n2017/05/05 - 907 words, first draft chapter 8\n2017/05/04 - 648 words, first draft chapter 8\n2017/05/03 - 708 words, first draft chapter 8\n2017/05/02 - 736 words, first draft chapter 8\n2017/05/01 - 1147 words, first draft chapter 8\n2017/04/30 - 1027 words, first draft chapter 8\n2017/04/29 - 245 words, first draft chapter 8\n2017/04/28 - 558 words, first draft chapter 8\n2017/04/27 - 377 words and quote, first draft chapter 8\n2017/04/26 - 803 words, first draft chapter 8\n2017/04/25 - 362 words, first draft chapter 8\n2017/04/24 - finish design note outline, full outline for chapter 8\n2017/04/23 - 371 words, sketch outline for chapter 8 design note\n2017/04/22 - 1248 words, outline chapter 8\n2017/04/21 - 958 words, outline chapter 8\n2017/04/20 - 917 words, outline chapter 8\n2017/04/19 - 904 words, outline chapter 8\n2017/04/18 - ~200 words, outline chapter 8\n2017/04/17 - use bank 6\n2017/04/16 - use bank 5\n2017/04/15 - 413 words, outline chapter 8\n2017/04/14 - use bank 4\n2017/04/13 - use bank 3\n2017/04/12 - use bank 2\n2017/04/11 - use bank 1\n2017/04/10 - finish splitting snippets for chapter 8, rough outline\n2017/04/09 - start splitting up snippets for chapter 8\n2017/04/08 - email\n2017/04/07 - put chapter 7 online\n2017/04/06 - prep email, resolve 2 
bugs\n2017/04/05 - 2000 words, third draft chapter 7\n           - (bank 6) 2209 words, finish third draft chapter 7\n2017/04/04 - skeleton illustration\n2017/04/03 - muffin illustration, sketch skeleton\n2017/04/02 - lightning illustration for chapter 7\n2017/04/01 - 5 pull requests, start working on glossary\n2017/03/31 - write design note, 1200 words second draft chapter 7\n             (bank 5) 1511 words, finish second draft chapter 7\n2017/03/30 - 1405 words, second draft chapter 7\n2017/03/29 - 10 emails\n2017/03/28 - answers to chapter 7 questions\n2017/03/27 - 354 words, finish first draft chapter 7\n2017/03/26 - 600 words, first draft chapter 7\n           - (bank 4) 609 words, first draft chapter 7\n2017/03/25 - 1000 words, first draft chapter 7\n           - (bank 3) 690 words, first draft chapter 7\n2017/03/24 - 623 words, first draft chapter 7\n2017/03/23 - tweak outline, challenges chapter 7\n2017/03/22 - finish outline chapter 7\n2017/03/21 - ~400 words outline chapter 7\n2017/03/20 - put chapter 6 online\n2017/03/19 - order snippets\n2017/03/18 - write email, slice chapter 7 code into snippets\n2017/03/17 - fix bugs and merge prs\n2017/03/16 - 2000 words, third draft chapter 6\n           - (bank 2) 2136 words, finish third draft chapter 6\n2017/03/15 - 1609 words, third draft chapter 6\n2017/03/14 - 1 more illustration for chapter 6\n2017/03/13 - 1 illustration for chapter 6\n2017/03/12 - 2 illustrations for chapter 6\n           - (bank 1) 1 1/2 illustrations for chapter 6\n2017/03/11 - 1031 words, finish second draft chapter 6\n2017/03/10 - 780 words, second draft chapter 6\n2017/03/09 - 569 first draft design note for chapter 6\n2017/03/08 - outline design note for chapter 6\n2017/03/07 - 313 words, second draft chapter 6\n2017/03/06 - 3514 words, second draft chapter 6\n2017/03/05 - 488 words, finish first draft of chapter 6, answers for challenges\n2017/03/04 - 1017 words, first draft chapter 6\n2017/03/03 - 1014 words, first draft chapter 
6\n2017/03/02 - 708 words, first draft chapter 6\n2017/03/01 - rework part of first draft chapter 6\n2017/02/28 - 590 words, first draft chapter 6\n2017/02/27 - precedence table and css for tables\n2017/02/26 - 688 words, first draft chapter 6\n2017/02/25 - 576 words, first draft chapter 6\n2017/02/24 - finish outlining chapter 6 (except design note)\n2017/02/23 - 447 words outline chapter 6\n2017/02/22 - finish redoing panic mode recovery\n2017/02/21 - more panic mode hacking\n2017/02/20 - revisit panic mode synchronization\n2017/02/19 - more outlining, look into error recovery\n2017/02/18 - 188 words outline parsing expressions, split up code\n2017/02/17 - email and bug fixes, start working on chapter 6\n2017/02/16 - remove topics from toc, publish chapter 5\n2017/02/15 - 2499 words, finish third draft representing code\n2017/02/14 - 1819 words, third draft representing code\n2017/02/13 - 1592 words, third draft representing code\n2017/02/12 - 2533 words, finish second draft representing code\n2017/02/11 - 2737 words, second draft representing code\n2017/02/10 - 793 words, second draft representing code\n2017/02/09 - last two illustrations\n2017/02/08 - rows and columns illustrations\n2017/02/07 - table illustration\n2017/02/06 - play grammar illustration\n2017/02/05 - evaluate tree illustration\n2017/02/04 - 519 words, finish first draft representing code\n2017/02/03 - 319 words, allow hiding snippet location in build script\n2017/02/02 - 1354 words, first draft representing code\n2017/02/01 - 352 words, first draft representing code (sick :( )\n2017/01/31 - close 6 issues and 2 pull requests\n2017/01/30 - 1250 words, first draft representing code\n2017/01/29 - 315 words, first draft representing code (sick :( )\n2017/01/28 - 800 words, first draft representing code\n2017/01/27 - 1076 words, first draft representing code\n2017/01/26 - 348 words, first draft representing code\n2017/01/25 - first draft and answers for challenges representing code\n2017/01/24 - 
finish outline representing code\n2017/01/23 - >1000 words, outline representing code\n2017/01/22 - 738 words, outline representing code\n2017/01/21 - 579 words, outline representing code\n2017/01/20 - 736 words, outline representing code\n2017/01/19 - more work on README, fix #24, start on chapter 5\n2017/01/18 - fix bug accessing \"this\" in super calls (#20), README\n2017/01/17 - fix 11 issues, lots more email\n2017/01/16 - email and bug fixes\n2017/01/15 - go live!\n2017/01/14 - fourth draft of scanning, tweak styles, copyright image\n2017/01/13 - link to next chapter in footer, tweak code styles\n2017/01/12 - 3004 words, finish third draft, write answers to challenges\n2017/01/11 - 2244 words, third draft scanning\n2017/01/10 - lexigator illustration\n2017/01/09 - lexeme illustration\n2017/01/08 - 2449 words, finish second draft scanning\n2017/01/07 - 1445 words, second draft scanning\n2017/01/06 - 1556 words, second draft scanning\n2017/01/05 - redo scanning headers, third draft part intro, explain snippet in intro\n2017/01/04 - second draft part intro, up nav links, rename parts\n2017/01/03 - first draft part ii intro, hunt down quotes\n2017/01/02 - figure out a license\n2017/01/01 - 772 words design note for scanner, aside markers in code\n2016/12/31 - 1081 words first draft scanner, mostly done\n2016/12/30 - 1085 words first draft scanner\n2016/12/29 - 722 words first draft scanner\n2016/12/28 - 561 words first draft scanner\n2016/12/27 - 1127 words first draft scanner\n2016/12/26 - finish outlining and splitting, reallow multiline strings\n2016/12/25 - fix some bugs in chapter splitting, make multiline strings and error\n2016/12/24 - slice up more scanning code into snippets\n2016/12/23 - allow named snippets\n2016/12/22 - handle surrounding context in code snippets\n2016/12/21 - simplify error reporting\n2016/12/20 - 1063 words, outline scanning\n2016/12/19 - fix \"lox language\" to make print not a function\n2016/12/18 - optimize refreshing in build 
server\n2016/12/17 - better validation of transclusion\n2016/12/16 - finish transclusion code\n2016/12/15 - work on code to transclude code chunks\n2016/12/14 - prototype lookup illustration\n2016/12/13 - finish class lookup illustration\n2016/12/12 - work on class lookup illustration\n2016/12/11 - one more draft, read out loud, of lox chapter\n2016/12/10 - 1445, finish third draft lox chapter\n2016/12/09 - 2041 words, third draft lox chapter\n2016/12/08 - ~2300 words, third draft lox chapter\n2016/12/07 - second draft, entire lox chapter\n2016/12/06 - finish first draft lox, design note, style design note\n2016/12/05 - 1092 words first draft of lox\n2016/12/04 - 1325 words first draft of lox\n2016/12/03 - 916 words first draft of lox\n2016/12/02 - 996 words first draft of lox\n2016/12/01 - 881 words first draft of lox\n2016/11/30 - delete pancake\n2016/11/29 - 325 words first draft lox (sick gretch :( )\n2016/11/28 - 385 words first draft lox, syntax highlighter\n2016/11/27 - finish outlining lox chapter, toy with adding lambdas to lox\n2016/11/26 - outline lox chapter\n2016/11/25 - research and notes for lox chapter\n2016/11/24 - research and notes for lox chapter\n2016/11/23 - ~3000 words, finish third draft map of territory\n2016/11/22 - start outlining lox chapter\n2016/11/21 - ~2500 words, third draft map of territory\n2016/11/20 - finish reorganizing map of territory, add numbers to sections\n2016/11/19 - ~1000 words third draft of map of territory, reorganize a bunch\n2016/11/18 - start third draft of map of territory\n2016/11/17 - third draft of index, welcome, and introduction\n2016/11/16 - ink and scan venn diagram\n2016/11/15 - sketch languages venn diagram\n2016/11/14 - finish and photoshop plant\n2016/11/13 - more work on plant drawing\n2016/11/12 - draw plants\n2016/11/11 - process tokens, draw and scan ast, sketch plants\n2016/11/10 - draw tokens\n2016/11/09 - finish redoing characters illustration\n2016/11/08 - work on redoing characters 
illustration\n2016/11/07 - characters illustration, toy with illustration background color\n2016/11/06 - little languages illustration\n2016/11/05 - scan yak, illustrate bootstrapping\n2016/11/04 - illustrate yak and elephant in tree\n2016/11/03 - revise 1949 words, finish second draft map of territory\n2016/11/02 - revise 601 words map of territory, add kildall aside\n2016/11/01 - revise ~200 words map of territory :(\n2016/10/31 - revise 917 words map of territory\n2016/10/30 - revise 961 words map of territory\n2016/10/29 - revise ~1000 words, and rewrite some of the beginning of map\n2016/10/28 - 438 words revise map of territory\n2016/10/27 - revise 1041 words introduction (banked on 10/24)\n2016/10/26 - revise 948 words intro\n2016/10/25 - revise 465 words intro\n2016/10/24 - revise 1030 words, welcome and introduction\n2016/10/23 - work on exercises, remove glossary\n2016/10/22 - finish first draft intro, split into two chapters\n2016/10/21 - 1062 words intro\n2016/10/20 - 430 words transpiler\n2016/10/19 - 235 words runtime, fix code highlighting, other edits\n2016/10/18 - 599 words code generation\n2016/10/17 - rewrite optimization section ~540 words\n2016/10/16 - ~1000 words intro, tweak styles\n2016/10/15 - ~1000 words of intro\n2016/10/14 - reorganize and weave together existing intro prose\n2016/10/13 - figure out new outline for introduction\n2016/10/12 - find intro quote, start reorganizing\n2016/10/11 - 814 words first draft introduction\n2016/10/10 - 972 words first draft introduction\n2016/10/09 - 165 words, draft welcome, 776 words first draft introduction\n2016/10/08 - finish lettering, draw mountain\n2016/10/07 - illustrating (mostly lettering)\n2016/10/06 - start inking mountain\n2016/10/05 - start sketching full size mountain illustration\n2016/10/04 - test out illustration size on pages\n2016/10/03 - practice illustrations\n2016/10/02 - revise intro outline, outline welcome part\n2016/10/01 - 1340 words, finish outlining introduction, rename 
parts\n2016/09/30 - 1457 words outline introduction\n2016/09/29 - split notes into chapters, 701 words outline introduction\n2016/09/28 - finish topics, merge function call and user-defined functions chapters\n2016/09/27 - tweak mobile arrow styles, more chapter topics\n2016/09/26 - favicon, fill in more chapter topics\n2016/09/25 - photoshop new index background\n2016/09/24 - fix some todos, take photos for index background\n2016/09/23 - set up mailchimp stuff\n2016/09/22 - put sign-up form on pages\n2016/09/21 - fix collapsing nav\n2016/09/20 - responsive table of contents\n2016/09/19 - redo font sizes and spacing for mobile\n2016/09/18 - start setting up mailing list, more work on index\n2016/09/17 - work on index page\n2016/09/16 - get rid of \"reaching the summit\"\n2016/09/15 - finish getting rid of chapter 4\n2016/09/14 - combine chapters 4 and 5\n2016/09/13 - finish putting new logotype into design\n2016/09/12 - start working on putting new logotype into design\n2016/09/11 - hand letter second logotype\n2016/09/10 - hand letter logotype\n2016/09/09 - table/mobile styles for toc, work on build.py\n2016/09/08 - more work contents design\n2016/09/07 - work on templates, toc, build script\n2016/09/06 - stop using tombstones to delete from hash table\n2016/09/05 - more poking around hash table benchmarks\n2016/09/04 - more poking around hash table benchmarks\n2016/09/03 - benchmark hash table implementation\n2016/09/02 - work on table of contents template\n2016/09/01 - work on styles, headers, table of contents, etc.\n2016/08/31 - work on restyling navigation\n2016/08/30 - unify \"statements\" and \"global variables\" chapters, rename \"inheritance\" chapter\n2016/08/29 - move native functions into function call chapter\n2016/08/28 - try storing ip in register, added benchmark runner\n2016/08/27 - add optimization chapter -- nan tagging, hash masking\n2016/08/26 - hack on adding stack traces to jlox\n2016/08/25 - vox -> lox everywhere\n2016/08/24 - more name 
noodling, clean up repo\n2016/08/23 - noodle on language names\n2016/08/22 - start writing pancake language, update book chapters\n2016/08/21 - put bytecode tracing into chapters, \"native functions\" chapter\n2016/08/20 - get \"inheritance\" split out\n2016/08/19 - get \"methods and initializers\" split out\n2016/08/18 - track down uninitialized memory bug in split chapters\n2016/08/17 - get \"classes and instances\" split out\n2016/08/16 - get \"garbage collection\" split out, running, and tested\n2016/08/15 - get \"closures\" split out, running, and tested\n2016/08/14 - get \"functions\" split out, running, and tested\n2016/08/13 - get \"jumping forward and back\" split out, running, and tested\n2016/08/12 - get \"local variables\" running and tested\n2016/08/11 - option to run all interpreters in test runner\n2016/08/10 - \"global variables\" chapter, with tests\n2016/08/09 - finish \"statements\"\n2016/08/08 - start working on \"statements\" chapter for cvox\n2016/08/07 - split up and land changes, generate diffs for c chapters\n2016/08/06 - add diff generation to makefile, reorganize natives in jvox, toy with hasField() native\n2016/08/05 - toy with adding \"delete\" statement\n2016/08/04 - update hash table code to handle delete and tombstones correctly\n2016/08/03 - work on hash table deletion, write hash table test code\n2016/08/02 - work on \"hash tables\" and string interning\n2016/08/01 - get \"strings\" working and start on \"hash tables\"\n2016/07/31 - get \"types of values\" working\n2016/07/30 - get compiling expressions working\n2016/07/29 - get scanning on demand chapter working, start work on compiling expressions\n2016/07/28 - get virtual machine chapter working following chunks, update chapters\n2016/07/27 - get chunks chapter working\n2016/07/26 - noodle on whether to introduce chunks\n2016/07/25 - get virtual machine chapter working, plan chapters\n2016/07/24 - work on virtual machine chapter\n2016/07/23 - start working on splitting up c 
chapters\n2016/07/22 - reorganize stuff now that we have a print statement\n2016/07/21 - optimize empty for clauses in cvox\n2016/07/20 - add a dedicated print statement, implement for in cvox\n2016/07/19 - renumber and reorder chapters\n2016/07/18 - move block scope to statements chapter and closures to functions\n2016/07/17 - implement c-style for loop in jvox, tests\n2016/07/16 - rework resolver, start working on for\n2016/07/15 - get inheritance chapter working\n2016/07/14 - get chapter 10 (classes) working\n2016/07/13 - tests for chapters 6, 7, and 8\n2016/07/12 - tests for chapters 2, 4, and 5\n2016/07/11 - get test.py set up to run and track chapter versions\n2016/07/10 - work on function and block chapters\n2016/07/09 - rename chapters and get control flow chapter working\n2016/07/08 - more sorting out sections around variables and block scope\n2016/07/07 - start separating out block scope from other scopes\n2016/07/06 - hack on resolution and variable lookup\n2016/07/05 - work on chapter 6, reorganize ast generation stuff\n2016/07/04 - finish removing context, get chapter 5 running\n2016/07/03 - get split chapter 4 working, start on 5, get rid of context in visitors\n2016/07/02 - clean up ast printer and generator, more chapter splitting\n2016/07/01 - clean up framework code for jvox\n2016/06/30 - work on getting intellij set up for split chapters\n*** every day after this ***\n2016/06/24 - finish adding chapter markers, start writing script to split\n2016/06/23 - start interleaving chapter notes into real jvox code\n2016/06/22 - use stack of maps for local scopes in jvox\n2016/06/21 - more work organizing java code into chapters\n2016/06/17 - start organizing java code into chapters\n2016/06/16 - fit java code in 72 columns\n2016/06/13 - tweak css to fit 72 columns of code\n2016/06/10 - lots of clean up\n2016/06/09 - make the gc strategy more realistic\n2016/06/07 - try to clean up some stuff in cvox\n2016/06/04 - fix some bugs, work on outline, split out 
Chunk and value.h\n2016/05/31 - clean up after unboxing work\n2016/05/30 - more work on unboxing values\n2016/05/29 - start working on unboxed values\n2016/05/28 - more mobile tweaks, handle limits in cvox\n2016/05/24 - start working on mobile layout\n2016/05/19 - show and style file name by code samples\n2016/05/18 - null -> nil in jvox\n2016/05/17 - null -> nil in cvox\n2016/05/14 - handle rebound superclass\n2016/05/12 - cache hashes in strings\n2016/05/11 - include function name in stack traces, other todos\n2016/05/10 - take advantage of string interning\n2016/05/09 - constructors -> init()\n2016/05/08 - more work on super\n2016/05/07 - finish cleaning up errors\n2016/05/06 - super calls in cvox\n2016/05/05 - better cvox error reporting\n2016/05/04 - start improving cvox error reporting\n2016/05/03 - revamp jvox scanner/parser\n2016/05/02 - more error improvements\n2016/05/01 - work on better syntax error reporting\n2016/04/30 - hash table\n2016/04/29 - minor code clean up\n2016/04/27 - string interning\n2016/04/26 - hash strings\n2016/04/24 - constructors in cvox\n2016/04/23 - copy-down inheritance, closurizing methods\n2016/04/12 - inheritance\n2016/04/08 - property -> field, start working on inheritance\n2016/04/07 - make tables not objects, other clean up\n2016/04/06 - method calls, this, port more wren tests\n2016/04/05 - finish resolving in jvox\n2016/04/04 - add resolving step to jvox\n2016/04/03 - more scope corner cases, handle wrong types in operators\n2016/04/02 - get jvox working mostly like cvox\n2016/04/01 - start syncing up jvox to latest semantics\n2016/03/31 - fix a bunch of corner cases around scope\n2016/03/30 - miscellaneous clean up\n2016/03/29 - more work on classes, mainly fields\n2016/03/28 - more work on classes\n2016/03/27 - start working on classes\n2016/03/26 - finish closures\n2016/03/25 - start working on closures\n2016/03/24 - simplify compiling constants\n2016/03/23 - pointers instead of array indices to refer to stack, bug 
fixes\n2016/03/22 - return statement, go back to single stack and upvalues\n2016/03/21 - function calls, arguments, parameters\n2016/03/20 - more work on functions\n2016/03/19 - start working on functions, heap allocate frames\n2016/03/18 - runtime error locations, call frames, fix bugs around variables\n2016/03/14 - local variables, calls, native functions, run scripts, makefile, etc.\n2016/03/13 - scheme-like semantics for top level variables, branching\n2016/03/06 - start implementing global variables\n2016/03/04 - bools, comparison, strings, grouping, unary\n2016/03/03 - switch to mark/sweep\n2016/03/02 - create ObjFunction at beginning of compile\n2016/03/01 - start writing compiler, compile infix operators\n2016/02/29 - sketch out more object types\n2016/02/28 - start slapping together gc and vm for cvox\n2016/02/06 - lots of parser/scanner clean up, start collecting quotes\n2016/02/05 - split out java interpreter into per-package versions\n2016/02/04 - work on resolver\n2016/02/03 - work on resolver\n2016/02/01 - cleaner implementation of locals\n2016/01/30 - runtime error reporting\n2016/01/29 - get rest of java interpreter working, tests, etc.\n2016/01/28 - port parser and half of interpreter to java, repl\n2016/01/27 - start porting lexer to java\n2016/01/19 - null literal\n2016/01/18 - return statement\n2016/01/16 - comments\n2016/01/15 - run from files, properties, classes, lots of other stuff\n2016/01/14 - start writing interpreter, error reporting, flow control, etc.\n2016/01/13 - parse functions and classes, repl\n2016/01/09 - properties and tests for calls and properties\n2016/01/08 - assignment\n2016/01/07 - parse expression and block statements\n2016/01/06 - logical expressions\n2016/01/05 - parser tests\n2016/01/04 - use metaprogramming for ast types\n2016/01/02 - get js working in browser again, unary expressions\n2015/11/19 - more work on styles and toc in build script\n2015/11/18 - css for narrow desktop, build script\n\nlots of untracked 
stuff before this..."
  },
  {
    "path": "note/names.txt",
    "content": "music:\n-----\nfuzz\njive\njam\nmojo\nhaze\nquid\nmoxy\npick\nhowl\nfunk\nvox\n\nrocks and minerals:\n------------------\nagate\nberyl\nflint\ngalena\njet\njade\njasper\nmarl\nmica\nnickel\nonyx\nperidot\nquartz\nsard\nshale\ntufa\n\nmountain-related:\n----------------\ncrag\ntor\n\nclimbing:\n--------\nbolt - A point of protection permanently installed in a hole drilled into the rock, to which a metal hanger is attached, having a hole for a carabiner or ring.\ncairn - A distinctive pile of stones placed to designate a summit or mark a trail, often above the treeline.\nCam - A spring-loaded device used as protection.\nChock - A mechanical device, or a wedge, used as anchors in cracks. A naturally occurring stone wedged in a crack.\nCol - A small pass or \"saddle\" between two peaks. Excellent for navigation as when standing on one it's always down in two, opposite, directions and up in the two directions in between those.\nCrag - A small area with climbing routes, often just a small cliff face or a few boulders.\nCrux - The most difficult portion of a climb.\nDeck - The ground. To hit the ground, usually the outcome of a fall.\nDyno - A dynamic move to grab a hold that would otherwise be out of reach. Generally both feet will leave the rock face and return again once the target hold is caught. Non-climbers would call it a jump or a leap.\nJib - A particularly small foothold, usually only large enough for the big toe, sometimes relying heavily on friction to support weight.\nJug - A shortened term for Jug Hold, both noun and verb.\nJug hold - A large, easily held hold. Also known simply as a jug.\nNub - A little hold that only a few fingers can grip, or the tips of the toes.\nPeg - A piton.\nSerac - A large ice tower.\nSprag - A type of hand position where the fingers and thumb are opposed.\n\nfood:\n----\npancake\ncrepe\nroux\nlox\n\npnw:\n---\nmoss\nelk\npika\nmarmot\nfir\ncedar\nhemlock\n"
  },
  {
    "path": "note/objects.txt",
    "content": "A couple of issues related to working with objects:\n\n* How do we distinguish field access from \"getters\"?\n* How do we distinguish method calls from invoking a function stored in a field?\n* Can methods be torn off?\n* How do we handle properties/methods on built-in types?\n* How do we handle operators? Are they methods or special?\n* What object represents a class?\n* How are objects constructed?\n\nWithin a method, there are a few namespaces in play:\n- The lexical namespace of local variables and then the surrounding global oens.\n- The namespace of fields on the instance.\n- The namespace of methods on the class (and then any inherited methods).\n\nThe simplest way to handle these is to have distinct syntaxes for each. We\ncould do:\n\n- Bare names for variables.\n- \"@\" or some other sigil for fields.\n- Explicit property access (including on \"this\") for methods.\n\nThat still leaves an ambiguity between a nullary method that returns a function\nversus a method call with parameters. The simplest solution is to do what Java\ndoes and not have getters, though that's gross for stuff like list.length(). If\nwe don't want to have tear-offs, that might be the best solution.\n\n---\n\nFor the class object, let's take a page from JS and make it the constructor.\nYou then invoke it just like a function to create an instance. (I.e. no \"new\"\nkeyword.)\n\nThis is the bare minimum needed by the \"class object\" -- to be a generator for\ninstances. Since this is just a teaching language, we can mention but not\ndeal with metaclasses, static methods, etc.\n\nclass Vector {\n  Vector(x, y) {\n    @x = x; @y = y;\n  }\n\n  length() {\n\n  }\n}\n\nThis won't be *just* a function. It needs some additional stuff: in particular\nthe method set that instances use and a superclass reference. But it's\na superset of what a function can do.\n\n---\n\nIt is nice to use based \".\" style for properties, both on this and other\nobjects. 
It's familiar to users coming from C, JS, etc:\n\n    foo.bar = \"value\";\n\nIt does cause a couple of ambiguities:\n\n    foo.bar;\n\nIs that:\n\nA. Accessing a field \"bar\" on foo?\nB. Closurizing a reference to the method \"bar\" on foo?\n\nLikewise:\n\n    foo.bar();\n\nIs that:\n\nC. Invoking the method \"bar\" on foo with zero arguments?\nD. Accessing the field \"bar\" on foo, which returns a function, then calling\n   that with zero arguments?\n\nWe obviously need to support A and C. B is really difficult given how we\ncurrently compile \"this\". We'd need to compile it to an upvalue for every\nmethod just in case it gets closed over. But then for normal invocations, we\nwould then have to allocate a closure every time. We don't want to do that.\n\nA couple of options:\n\n1. Come up with a separate OP_THIS that handles both the closurized and regular\n   call cases. But how?\n2. Always store this in the first slot, even when it's closed over. When\n   calling a function (not a method), copy the caller's zero slot into the\n   callee's. In other words, instead of storing the function in slot zero for\n   function calls, store the surrounding \"this\".\n\n   Hmm, wait. That's wrong. A function should use lexical scope to find \"this\",\n   not dynamic.\n3. Not allow closurization.\n\nCurrently leaning towards 3.\n\nD is a little annoying, but I think users would expect it to work. The only\ndownside is that it means we have to do two lookups on every method call, first\nto look for a field (which shadows the method) and then to look for a method.\n\nAnother option is to have different syntax for fields (like @foo) and no\ngetters. That keeps all cases distinct, at the expense of making the very handy\nfoo.bar not even be valid syntax.\n\nFor now, let's try sticking with the current syntax and just not allow B. I'll\nsee what kind of perf hit D causes and go from there.\n"
  },
  {
    "path": "note/outline.md",
    "content": "**TODO: when can we introduce a print statement/function?**\n\nneeds to happen before statements and flow control otherwise those aren't\nvisible to user.\n\n- Warming Up\n    - Introduction\n        - who book is for\n        - who am i\n            - doodling languages in notebook\n            - always fascinated\n            - seemed like magic\n            - iStudio\n            - paternity\n            - Dart\n        - why learn languages?\n            - in full programming career, will end up doing something related to\n              language\n            - good way to learn lots of techniques: recursion, trees, graphs,\n              state machines, memory management, optimization\n            - hard, training with weights on\n            - fun\n            - dispell magic\n        - structure of book\n        - languages used in impl\n        - what's in book\n        - what's not in book\n        - end goal is high quality, efficient interpreter suitable for real use\n        - to get there, narrow path through space, not broad survey\n        - will point to alternatives to explore on own\n        - learn enough to carry conversation with professional lang person\n    - The Vox Language\n        - intro to full language we'll be implementing\n        - ebnf\n    - The Pancake Language\n        - basic phases and terminology of interpreter\n        - simple stack-based language\n- Practice (Java)\n    - Framework\n        - repl\n        - interpreters run from source\n        - test framework\n    - Scanning\n        - tokens\n        - whitespace\n        - regex\n        - comments\n        - numbers\n            - leading zeroes\n            - floating point\n            - leading and trailing \".\"\n            - range\n            - negative\n        - token value\n        - strings\n        - token type\n        - escaping\n        - errors\n        - maximal munch\n        - fortran parsing identifiers without whitespace\n        - 
significant indentation\n        - state machine for identifiers\n        - ex: self-assignment and increment\n        - ex: scientific and hex\n        - ex: significant indentation and newlines\n        - ex: escapes\n        - eagerly scan to list of tokens\n    - Parsing Expressions\n        - ast\n        - metaprogramming the ast types\n        - recursive descent\n        - lookahead\n        - ex: \"needs more input\" for multi-line repl\n    - Tree Walk Interpreting\n        - evaluating operands\n        - recursion\n        - arithmetic\n        - visitor pattern\n        - aside: interpreter pattern is putting interpret methods on nodes\n          - makes it possible to add new node types\n        - values versus ast nodes for literals\n        - dynamic typing and conversions\n        - errors\n    - Variables\n        - statements versus expressions\n        - declaration\n        - assignment\n        - variable references\n        - scope\n        - undefined names\n        - block scope\n    - Control Flow\n        - if\n        - and and or\n        - while\n        - for\n    - Functions\n        - parsing calls\n        - '(' as infix operator\n        - built in fns\n        - user-defined fns\n        - parameters and arguments\n        - call stack\n        - closures\n        - ffi?\n        - tail call optimization\n        - arity mismatch\n    - Resolution\n        - compile errors\n        - recursion and mutual recursion\n        - decorating an ast\n        - symbol tables\n        - name binding\n        - early versus late binding\n    - Classes\n        - classes\n        - prototypes?\n        - this\n        - properties\n        - methods\n        - dynamic dispatch\n        - constructors\n    - Inheritance\n        - inheritance\n        - super calls\n    - Lists and Loops\n        - list type\n        - subscript operator\n        - subscript setter\n        - for syntax\n        - iterator protocol\n        - desugaring\n    
    - ex: make string implement protocol\n        TODO: Cut this?\n\nTODO: Still needs a lot of work:\n\n- Performance (C)\n    - Framework\n    - A Virtual Machine\n        - stack\n        - for now, Value is just a double and OP_CONSTANT uses the argument as\n          an immediate int value so we don't need a constant table\n        - bytecode\n        - hand-author and run some bytecode\n    - Scanning\n        - pull based lazy scanning\n        - zero-alloc tokens\n        - talk about state machine for keywords?\n    - Compiling Expressions\n        - top-down operator precedence parsing\n        - single-pass compiling\n        ^ can now compile and run arithmetic exprs\n    - Representing Objects\n        - numbers, strings, bools, null\n        - dynamic typing\n        ^ now can handle \"str\" + \"another\"\n    - Garbage Collection\n        ^ since previous chapter needs to heap alloc stuff, need to manage it\n        - roots\n        ??? we don't have any objects that store references to other objects\n            yet, so there is no traversal happening\n    - String Interning and Symbols\n        - string interning\n        - fast equality\n        - hashing?\n        - separate symbol types\n        - intern all or some strings\n        - gcing interned strings\n    - Variables\n        - statements versus expressions\n    - Control Flow\n        - branching instructions\n\n    - Functions\n        - upvalues\n\n    TODO: other stuff...\n    - constant pools\n    - functions\n    - symbol tables\n    - nan tagging\n    - copy down inheritance\n\n\nprinciples\n\n- each top-level section builds one interpreter starting from scratch\n- since the book will be \"published\" online serially, the chapters should be\n  ordered such that they are useful even while the book is incomplete. 
that\n  probably means doing all of parsing for the whole grammar isn't a good idea:\n  it's boring until later chapters do something with it.\n\n- kinds of content in a chapter\n  - main narrative with prose and code\n  - historical context and people\n  - further things to learn\n  - omitted alternatives\n  - review questions: ask things chapter did explain\n  - challenges: add new features or compare other languages\n  - quotation at beginning of each chapter\n  - engineering considerations: error handling, maintainability, etc.\n  - design and psychology: usability, aesthetics, popularity, learnability, etc.\n\nstuff to maybe include:\n\n- error-handling\n    - stack traces and line information\n    - runtime errors\n- variables\n    - scopes as dictionaries\n    - name binding of locals\n    - variables and assignment\n    - scope\n- object model\n    - objects as dictionaries\n    - objects\n    - classes\n    - prototypes\n    - nan tagging\n    - object representation\n    - symbol tables and hash tables\n    - strings\n    - arrays\n    - hash tables (for internal use and as object in language)\n    - dynamic dispatch\n- syntax\n    - aesthetics and usability of syntax design\n    - backjumping and infinite lookahead or context-sensitive grammars\n"
  },
  {
    "path": "note/research.txt",
    "content": "http://en.wikipedia.org/wiki/PL/0\n\n\"Structure and Interpretation of Efficient Interpreters\" (in dropbox)\n\nhttp://blog.analogmachine.org/2011/09/20/lets-build-a-compiler/\n\n\"Compiler Construction\" by Wirth"
  },
  {
    "path": "note/scope.txt",
    "content": "Mostly following Scheme (R5RS):\n\n- Accessing an undefined name is a runtime error. It is not a compile time\n  error.\n\n  (define (eval-foo) foo) ; OK, though foo is undefined.\n  (eval-foo)              ; Runtime error.\n  (define foo \"ok\")       ; Now foo is defined.\n  (eval-foo)              ; Now it works.\n\n  - Allows mutual recursion at the top level.\n  - Does so in a way that's friendly to the REPL and incremental evaluation.\n\n- Assigning to an undefined name is a runtime error. It is not a compile time\n  error.\n\n  (define (setbar) (set! bar \"wat\"))  ; OK, though bar is undefined.\n  (setbar)                            ; Runtime error.\n  (define bar \"ok\")                   ; Now bar is defined.\n  (setbar)                            ; Now it works.\n\n  - Avoids the mushiness of treating a typo in an assignment as \"let's just\n    create a new global variable\", which is probably not what the user wants.\n\n- Top level variables can be defined multiple times.\n\n  (define foo \"1\")\n  foo               ; 1\n  (define foo \"2\")  ; OK.\n  foo               ; 2\n\n  - REPL friendly.\n\n- A variable is not in scope in its own initializer. It is declared after its\n  initializer is run.\n\n  (define foo foo) ; Error: foo is not defined.\n\n  (define bar \"outer\")\n  (let ((bar bar)) bar) ; \"outer\"\n\n  (define baz 1)\n  (define baz (+ baz 1))  ; Refer to previous definition of baz.\n  baz                     ; 2\n\n  - This is different from classes and functions, but that's probably OK.\n\n  - If we want it to be a compile error for a local to be in scope in its own\n    initializer, we'd need some sort of resolving step in jvox.\n\n  - If we want it to be a runtime error, that error would always occur, so it\n    feels weird to defer that. And it would make cvox slower, or require us to\n    emit some special case \"always throw runtime error here\" code.\n"
  },
  {
    "path": "note/struct sizes.txt",
    "content": "#include <stddef.h>  // offsetof\n#include <stdio.h>   // printf\n\n// ObjType, Value, and ValueType are assumed to be declared elsewhere (in clox,\n// by \"object.h\" and \"value.h\").\n\ntypedef struct Obj2 {\n  ObjType type;\n} Obj2;\n\ntypedef struct ObjString2 {\n  Obj2 obj;\n  int length;\n  char* chars;\n} ObjString2;\n\nint main(int argc, const char* argv[]) {\n  // sizeof and offsetof yield size_t, so print them with %zu.\n  printf(\"sizeof(Obj) %zu\\n\", sizeof(Obj2));\n  printf(\"sizeof(ObjType) %zu\\n\", sizeof(ObjType));\n  printf(\"offset(Obj, type) %zu\\n\", offsetof(Obj2, type));\n  printf(\"sizeof(ObjString) %zu\\n\", sizeof(ObjString2));\n  printf(\"offset(ObjString, obj) %zu\\n\", offsetof(ObjString2, obj));\n  printf(\"offset(ObjString, length) %zu\\n\", offsetof(ObjString2, length));\n  printf(\"offset(ObjString, chars) %zu\\n\", offsetof(ObjString2, chars));\n  printf(\"sizeof(Value) %zu\\n\", sizeof(Value));\n  printf(\"sizeof(ValueType) %zu\\n\", sizeof(ValueType));\n  printf(\"offset(Value, ValueType) %zu\\n\", offsetof(Value, type));\n  printf(\"offset(Value, as) %zu\\n\", offsetof(Value, as));\n  printf(\"sizeof(Obj*) %zu\\n\", sizeof(Obj*));\n  return 0;\n}\n\nsizeof(Obj) 4\nsizeof(ObjType) 4\noffset(Obj, type) 0\nsizeof(ObjString) 16\noffset(ObjString, obj) 0\noffset(ObjString, length) 4\noffset(ObjString, chars) 8\nsizeof(Value) 16\nsizeof(ValueType) 4\noffset(Value, ValueType) 0\noffset(Value, as) 8\nsizeof(Obj*) 8"
  },
  {
    "path": "note/style guide.md",
    "content": "## Person\n\nFiguring out when to use \"we\" versus \"it\" when talking about the code is hard.\nIt's important to be clear because the prose talks about what the reader needs\nto do \"define this method\", \"replace this line\", etc. and what the code needs\nto do while it's running \"match this token\", etc.\n\nBut it gets really awkward to always use \"it\" for describing what the code does.\nSo the rough rules are:\n\n1.  When walking through a hypothetical execution of the code, use \"we\". Most\n    prose explaining the code is like this.\n\n2.  When describing how the code must be changed, what the reader must\n    mechanically do, use \"we\" (and not \"you\").\n\n3.  When describing how a piece of code works in general, or if it otherwise\n    reads better, use \"it\".\n\n## Formatting\n\n*   Class names are not in code font: \"The PrettyPrinter class\". Type names in C\n    are also formatted normally: Value, Obj, etc. Even built-in types like\n    double and uint16_t.\n\n*   File names and extensions are quoted:\n\n    > The file \"Expr.java\" has extension \".java\".\n\n*   C module names are quoted:\n\n    > The \"debug\" module.\n\nTODO: How do we style keywords used in headers and subheaders?\n\n### Bold and italics\n\n*   The first time a technical term is defined, make it bold. Don't quote it,\n    even when referring to the term directly.\n\n*   Consider using italics for a new technical term that isn't explicitly\n    defined in order to highlight that it is jargon.\n\n*   In a bullet list, if the bold part is a sentence or part of a sentence,\n    emphasize it like normal prose. If the bullet item starts with a standalone\n    term, separate it from the subsequent prose with an en dash.\n\n*   Big-O notation: \"*O(n)*\".\n\n### Code font\n\n*   References to statements like \"`if` statement\" and \"`switch`\". 
Use \"`else`\n    clause\" to refer to that part of an `if` statement's *syntax*, but \"then\n    branch\" and \"else branch\" to refer to those *concepts*.\n\n*   Use \"`return` statement\", but \"early return\". In almost all other cases,\n    \"return\" uses normal type (\"return value\", \"return from\", etc.), except when\n    \"the `return`\" refers to a return statement.\n\n*   \"Class declaration\", but \"`class` statement\".\n\n*   When referring to the Boolean values true and false, put them in code font,\n    as in \"returns `true`\". Use normal text when referring to truth or falsehood\n    in general.\n\n*   Opcodes: \"`OP_RETURN`\".\n\n*   `nil`, `null` (Java), and `NULL` (C). Simply \"null\" when used as a verb as\n    in \"null out the field\".\n\n## Punctuation\n\n*   Prose before a Java or C code snippet ends in `:` if the last sentence is\n    not a complete sentence or directly refers to the subsequent code. End in\n    `.` if it is a reasonable-sounding sentence on its own. 
This is mainly so\n    that we don't use a gratuitous amount of `:` at the end of nearly every\n    paragraph.\n\n*   On the other hand, prose before illustrations, Lox examples, and grammar\n    snippets can use `:` even when a complete sentence, if the sentence refers\n    to the subsequent code or picture.\n\n### Hyphenation\n\n*   If part of a word is emphasized, like \"*re*-define\", hyphenate at the point\n    where the italics change.\n\n*   Hyphenate \"left-hand side\" and \"right-hand side\".\n\n*   Always hyphenate:\n\n    *   left-associative\n    *   right-associative\n    *   non-associative\n    *   left-recursive\n    *   l-value\n    *   r-value\n    *   finite-state machine\n\n*   Never hyphenate:\n\n    *    left recursion\n    *    call stack\n    *    call frame (but \"CallFrame\" when referring to the struct)\n\n*   Hyphenate when preceding a noun, but not otherwise (\"A first-class function\n    is first class.\"):\n\n    *    first class\n    *    lowest precedence\n    *    start up (\"start up the interpreter\" versus \"startup time\")\n\n## Usage\n\n*   Numbers in prose are usually spelled as words when there is a single word\n    for them: one, eleven, etc. However, numbers that refer to binary digits are\n    always 0 or 1.\n\n### Capitalization\n\n*   Follow common usage to determine which acronyms and abbreviations are all\n    caps or not. \"COBOL\", \"Fortran\", etc.\n\n*   Design pattern names are capitalized when referring to the pattern itself,\n    but not code that implements the pattern (unless the code is the name of the\n    actual class). As in: \"ExpressionVisitor is a visitor class that implements\n    the Visitor pattern.\"\n\n### Word list\n\n*     opcode\n*     Boolean\n*     lookup\n*     I/O\n*     \"null\" when referring to the null byte at the end of a string\n"
  },
  {
    "path": "note/todo.txt",
    "content": "Print:\n\n- Order proof from IngramSpark.\n\neBook:\n\n- Fix all TODOs in asset/ebook/*.\n- Create wider cover image for non-Kindle.\n- Style TOC page.\n- Assign ISBN numbers:\n  https://www.myidentifiers.com/title_registration?isbn=978-0-9905829-3-9&icon_type=New\n- Get example snippet in chapter one looking right.\n- Make eBook exporter handle Kindle.\n- Generate PDF version.\n  - Include cover.\n  - Get table of contents links working.\n- Style tables (see \"Chunks of Bytecode\").\n- Check out on a few different readers.\n- Go through whole book and see how it looks.\n- Make sure inline images look OK.\n\nWeb:\n\n- Link to stores.\n- Export sample chapter PDF.\n- Remove \"not done\" script, templates and styles.\n- Replace \"work in progress\" header with something about print edition.\n- Add snippets for remaining chapters to compile_snippets to pin down which\n  snippets reach a working point.\n"
  },
  {
    "path": "site/.htaccess",
    "content": "ErrorDocument 404 /404.html\n\nRedirect /beta http://journal.stuffwithstuff.com/2012/12/19/the-impoliteness-of-overriding-methods/\nRedirect /budget https://steveklabnik.com/writing/the-language-strangeness-budget\nRedirect /dragon https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools\nRedirect /finch http://finch.stuffwithstuff.com/\nRedirect /koan http://wiki.c2.com/?ClosuresAndObjectsAreEquivalent\nRedirect /locality http://gameprogrammingpatterns.com/data-locality.html\nRedirect /lua5 https://www.lua.org/doc/jucs05.pdf\nRedirect /ports https://github.com/munificent/craftinginterpreters/wiki/Lox-implementations\nRedirect /pratt http://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-expression-parsing-made-easy/\nRedirect /prototypes http://gameprogrammingpatterns.com/prototype.html\nRedirect /repo https://github.com/munificent/craftinginterpreters\nRedirect /singleton http://gameprogrammingpatterns.com/singleton.html\nRedirect /state http://gameprogrammingpatterns.com/state.html\nRedirect /tests https://github.com/munificent/craftinginterpreters/tree/master/test\nRedirect /wizard https://mitpress.mit.edu/sites/default/files/sicp/index.html\nRedirect /wren https://wren.io/\n"
  },
  {
    "path": "site/404.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>404 Page Not Found &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n\n</head>\n<body>\n\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype-small.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n    <h2><a href=\"#top\"><small>&nbsp;</small> Table of Contents</a></h2>\n    <ul>\n      <li><a href=\"#welcome\"><small>I</small>Welcome</a></li>\n      <li><a href=\"#a-tree-walk-interpreter\"><small>II</small>A Tree-Walk Interpreter</a></li>\n      <li><a href=\"#a-bytecode-virtual-machine\"><small>III</small>A Bytecode Virtual Machine</a></li>\n      <li><a href=\"#backmatter\"><small>&#10087;</small>Backmatter</a></li>\n    </ul>\n        <div class=\"prev-next\">\n        <a href=\"index.html\" title=\"Crafting Interpreters\" class=\"left\">&larr;&nbsp;Previous</a>\n        <a href=\"index.html\" title=\"Crafting Interpreters\">&uarr;&nbsp;Up</a>\n        <a href=\"welcome.html\" title=\"Welcome\" class=\"right\">Next&nbsp;&rarr;</a>\n    </div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype-small.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"index.html\" title=\"Crafting Interpreters\" class=\"prev\">←</a>\n<a href=\"welcome.html\" title=\"Welcome\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav 
class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype-small.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n    <h2><a href=\"#top\"><small>&nbsp;</small> Table of Contents</a></h2>\n    <ul>\n      <li><a href=\"#welcome\"><small>I</small>Welcome</a></li>\n      <li><a href=\"#a-tree-walk-interpreter\"><small>II</small>A Tree-Walk Interpreter</a></li>\n      <li><a href=\"#a-bytecode-virtual-machine\"><small>III</small>A Bytecode Virtual Machine</a></li>\n      <li><a href=\"#backmatter\"><small>&#10087;</small>Backmatter</a></li>\n    </ul>\n        <div class=\"prev-next\">\n        <a href=\"index.html\" title=\"Crafting Interpreters\" class=\"left\">&larr;&nbsp;Previous</a>\n        <a href=\"index.html\" title=\"Crafting Interpreters\">&uarr;&nbsp;Up</a>\n        <a href=\"welcome.html\" title=\"Welcome\" class=\"right\">Next&nbsp;&rarr;</a>\n    </div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"contents\">\n\n<h1>404 &ndash; Page Not Found</h1>\n\n<p>You seem to have reached a dead end. Did you get lost? Did I mislead you?\nEither way, you probably want to <a href=\"/\">go back to the start.</a></p>\n\n<footer>\nHand-crafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2020</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/a-bytecode-virtual-machine.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>A Bytecode Virtual Machine &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h2><small>III</small>A Bytecode Virtual Machine</h2>\n\n<ul>\n    <li><a href=\"chunks-of-bytecode.html\"><small>14</small>Chunks of Bytecode</a></li>\n    <li><a href=\"a-virtual-machine.html\"><small>15</small>A Virtual Machine</a></li>\n    <li><a href=\"scanning-on-demand.html\"><small>16</small>Scanning on Demand</a></li>\n    <li><a href=\"compiling-expressions.html\"><small>17</small>Compiling Expressions</a></li>\n    <li><a href=\"types-of-values.html\"><small>18</small>Types of Values</a></li>\n    <li><a 
href=\"strings.html\"><small>19</small>Strings</a></li>\n    <li><a href=\"hash-tables.html\"><small>20</small>Hash Tables</a></li>\n    <li><a href=\"global-variables.html\"><small>21</small>Global Variables</a></li>\n    <li><a href=\"local-variables.html\"><small>22</small>Local Variables</a></li>\n    <li><a href=\"jumping-back-and-forth.html\"><small>23</small>Jumping Back and Forth</a></li>\n    <li><a href=\"calls-and-functions.html\"><small>24</small>Calls and Functions</a></li>\n    <li><a href=\"closures.html\"><small>25</small>Closures</a></li>\n    <li><a href=\"garbage-collection.html\"><small>26</small>Garbage Collection</a></li>\n    <li><a href=\"classes-and-instances.html\"><small>27</small>Classes and Instances</a></li>\n    <li><a href=\"methods-and-initializers.html\"><small>28</small>Methods and Initializers</a></li>\n    <li><a href=\"superclasses.html\"><small>29</small>Superclasses</a></li>\n    <li><a href=\"optimization.html\"><small>30</small>Optimization</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"inheritance.html\" title=\"Inheritance\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"contents.html\" title=\"Table of Contents\">&uarr;&nbsp;Up</a>\n    <a href=\"chunks-of-bytecode.html\" title=\"Chunks of Bytecode\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"inheritance.html\" title=\"Inheritance\" class=\"prev\">←</a>\n<a href=\"chunks-of-bytecode.html\" title=\"Chunks of Bytecode\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h2><small>III</small>A Bytecode Virtual Machine</h2>\n\n<ul>\n    <li><a href=\"chunks-of-bytecode.html\"><small>14</small>Chunks of Bytecode</a></li>\n    <li><a 
href=\"a-virtual-machine.html\"><small>15</small>A Virtual Machine</a></li>\n    <li><a href=\"scanning-on-demand.html\"><small>16</small>Scanning on Demand</a></li>\n    <li><a href=\"compiling-expressions.html\"><small>17</small>Compiling Expressions</a></li>\n    <li><a href=\"types-of-values.html\"><small>18</small>Types of Values</a></li>\n    <li><a href=\"strings.html\"><small>19</small>Strings</a></li>\n    <li><a href=\"hash-tables.html\"><small>20</small>Hash Tables</a></li>\n    <li><a href=\"global-variables.html\"><small>21</small>Global Variables</a></li>\n    <li><a href=\"local-variables.html\"><small>22</small>Local Variables</a></li>\n    <li><a href=\"jumping-back-and-forth.html\"><small>23</small>Jumping Back and Forth</a></li>\n    <li><a href=\"calls-and-functions.html\"><small>24</small>Calls and Functions</a></li>\n    <li><a href=\"closures.html\"><small>25</small>Closures</a></li>\n    <li><a href=\"garbage-collection.html\"><small>26</small>Garbage Collection</a></li>\n    <li><a href=\"classes-and-instances.html\"><small>27</small>Classes and Instances</a></li>\n    <li><a href=\"methods-and-initializers.html\"><small>28</small>Methods and Initializers</a></li>\n    <li><a href=\"superclasses.html\"><small>29</small>Superclasses</a></li>\n    <li><a href=\"optimization.html\"><small>30</small>Optimization</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"inheritance.html\" title=\"Inheritance\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"contents.html\" title=\"Table of Contents\">&uarr;&nbsp;Up</a>\n    <a href=\"chunks-of-bytecode.html\" title=\"Chunks of Bytecode\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">III</div>\n  <h1 class=\"part\">A Bytecode Virtual Machine</h1>\n\n<p>Our Java interpreter, jlox, taught us many of the fundamentals of programming\nlanguages, but we still have much to learn. 
First, if you run any interesting\nLox programs in jlox, you&rsquo;ll discover it&rsquo;s achingly slow. The style of\ninterpretation it uses<span class=\"em\">&mdash;</span>walking the AST directly<span class=\"em\">&mdash;</span>is good enough for <em>some</em>\nreal-world uses, but leaves a lot to be desired for a general-purpose scripting\nlanguage.</p>\n<p>Also, we implicitly rely on runtime features of the JVM itself. We take for\ngranted that things like <code>instanceof</code> in Java work <em>somehow</em>. And we never for a\nsecond worry about memory management because the JVM&rsquo;s garbage collector takes\ncare of it for us.</p>\n<p>When we were focused on high-level concepts, it was fine to gloss over those.\nBut now that we know our way around an interpreter, it&rsquo;s time to dig down to\nthose lower layers and build our own virtual machine from scratch using nothing\nmore than the C standard library<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n\n<footer>\n<a href=\"chunks-of-bytecode.html\" class=\"next\">\n  Next Chapter: &ldquo;Chunks of Bytecode&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/a-map-of-the-territory.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>A Map of the Territory &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">A Map of the Territory<small>2</small></a></h3>\n\n<ul>\n    <li><a href=\"#the-parts-of-a-language\"><small>2.1</small> The Parts of a Language</a></li>\n    <li><a href=\"#shortcuts-and-alternate-routes\"><small>2.2</small> Shortcuts and Alternate Routes</a></li>\n    <li><a href=\"#compilers-and-interpreters\"><small>2.3</small> Compilers and Interpreters</a></li>\n    <li><a href=\"#our-journey\"><small>2.4</small> Our Journey</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a 
href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"introduction.html\" title=\"Introduction\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"welcome.html\" title=\"Welcome\">&uarr;&nbsp;Up</a>\n    <a href=\"the-lox-language.html\" title=\"The Lox Language\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"introduction.html\" title=\"Introduction\" class=\"prev\">←</a>\n<a href=\"the-lox-language.html\" title=\"The Lox Language\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">A Map of the Territory<small>2</small></a></h3>\n\n<ul>\n    <li><a href=\"#the-parts-of-a-language\"><small>2.1</small> The Parts of a Language</a></li>\n    <li><a href=\"#shortcuts-and-alternate-routes\"><small>2.2</small> Shortcuts and Alternate Routes</a></li>\n    <li><a href=\"#compilers-and-interpreters\"><small>2.3</small> Compilers and Interpreters</a></li>\n    <li><a href=\"#our-journey\"><small>2.4</small> Our Journey</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"introduction.html\" title=\"Introduction\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"welcome.html\" title=\"Welcome\">&uarr;&nbsp;Up</a>\n    <a href=\"the-lox-language.html\" title=\"The Lox Language\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">2</div>\n  <h1>A Map of the Territory</h1>\n\n<blockquote>\n<p>You must have a map, no matter how rough. Otherwise you wander all over the\nplace. 
In <em>The Lord of the Rings</em> I never made anyone go farther than he could\non a given day.</p>\n<p><cite>J. R. R. Tolkien</cite></p>\n</blockquote>\n<p>We don&rsquo;t want to wander all over the place, so before we set off, let&rsquo;s scan\nthe territory charted by previous language implementers. It will help us\nunderstand where we are going and the alternate routes others have taken.</p>\n<p>First, let me establish a shorthand. Much of this book is about a language&rsquo;s\n<em>implementation</em>, which is distinct from the <em>language itself</em> in some sort of\nPlatonic ideal form. Things like &ldquo;stack&rdquo;, &ldquo;bytecode&rdquo;, and &ldquo;recursive descent&rdquo;,\nare nuts and bolts one particular implementation might use. From the user&rsquo;s\nperspective, as long as the resulting contraption faithfully follows the\nlanguage&rsquo;s specification, it&rsquo;s all implementation detail.</p>\n<p>We&rsquo;re going to spend a lot of time on those details, so if I have to write\n&ldquo;language <em>implementation</em>&rdquo; every single time I mention them, I&rsquo;ll wear my\nfingers off. Instead, I&rsquo;ll use &ldquo;language&rdquo; to refer to either a language or an\nimplementation of it, or both, unless the distinction matters.</p>\n<h2><a href=\"#the-parts-of-a-language\" id=\"the-parts-of-a-language\"><small>2&#8202;.&#8202;1</small>The Parts of a Language</a></h2>\n<p>Engineers have been building programming languages since the Dark Ages of\ncomputing. As soon as we could talk to computers, we discovered doing so was too\nhard, and we enlisted their help. I find it fascinating that even though today&rsquo;s\nmachines are literally a million times faster and have orders of magnitude more\nstorage, the way we build programming languages is virtually unchanged.</p>\n<p>Though the area explored by language designers is vast, the trails they&rsquo;ve\ncarved through it are <span name=\"dead\">few</span>. 
Not every language takes the\nexact same path<span class=\"em\">&mdash;</span>some take a shortcut or two<span class=\"em\">&mdash;</span>but otherwise they are\nreassuringly similar, from Rear Admiral Grace Hopper&rsquo;s first COBOL compiler all\nthe way to some hot, new, transpile-to-JavaScript language whose &ldquo;documentation&rdquo;\nconsists entirely of a single, poorly edited README in a Git repository\nsomewhere.</p>\n<aside name=\"dead\">\n<p>There are certainly dead ends, sad little cul-de-sacs of CS papers with zero\ncitations and now-forgotten optimizations that only made sense when memory was\nmeasured in individual bytes.</p>\n</aside>\n<p>I visualize the network of paths an implementation may choose as climbing a\nmountain. You start off at the bottom with the program as raw source text,\nliterally just a string of characters. Each phase analyzes the program and\ntransforms it to some higher-level representation where the semantics<span class=\"em\">&mdash;</span>what\nthe author wants the computer to do<span class=\"em\">&mdash;</span>become more apparent.</p>\n<p>Eventually we reach the peak. We have a bird&rsquo;s-eye view of the user&rsquo;s program\nand can see what their code <em>means</em>. We begin our descent down the other side of\nthe mountain. We transform this highest-level representation down to\nsuccessively lower-level forms to get closer and closer to something we know how\nto make the CPU actually execute.</p><img src=\"image/a-map-of-the-territory/mountain.png\" alt=\"The branching paths a language may take over the mountain.\" class=\"wide\" />\n<p>Let&rsquo;s trace through each of those trails and points of interest. 
Our journey\nbegins on the left with the bare text of the user&rsquo;s source code:</p><img src=\"image/a-map-of-the-territory/string.png\" alt=\"var average = (min + max) / 2;\" />\n<h3><a href=\"#scanning\" id=\"scanning\"><small>2&#8202;.&#8202;1&#8202;.&#8202;1</small>Scanning</a></h3>\n<p>The first step is <strong>scanning</strong>, also known as <strong>lexing</strong>, or (if you&rsquo;re trying\nto impress someone) <strong>lexical analysis</strong>. They all mean pretty much the same\nthing. I like &ldquo;lexing&rdquo; because it sounds like something an evil supervillain\nwould do, but I&rsquo;ll use &ldquo;scanning&rdquo; because it seems to be marginally more\ncommonplace.</p>\n<p>A <strong>scanner</strong> (or <strong>lexer</strong>) takes in the linear stream of characters and chunks\nthem together into a series of something more akin to <span\nname=\"word\">&ldquo;words&rdquo;</span>. In programming languages, each of these words is\ncalled a <strong>token</strong>. Some tokens are single characters, like <code>(</code> and <code>,</code>. Others\nmay be several characters long, like numbers (<code>123</code>), string literals (<code>\"hi!\"</code>),\nand identifiers (<code>min</code>).</p>\n<aside name=\"word\">\n<p>&ldquo;Lexical&rdquo; comes from the Greek root &ldquo;lex&rdquo;, meaning &ldquo;word&rdquo;.</p>\n</aside>\n<p>Some characters in a source file don&rsquo;t actually mean anything. Whitespace is\noften insignificant, and comments, by definition, are ignored by the language.\nThe scanner usually discards these, leaving a clean sequence of meaningful\ntokens.</p><img src=\"image/a-map-of-the-territory/tokens.png\" alt=\"[var] [average] [=] [(] [min] [+] [max] [)] [/] [2] [;]\" />\n<h3><a href=\"#parsing\" id=\"parsing\"><small>2&#8202;.&#8202;1&#8202;.&#8202;2</small>Parsing</a></h3>\n<p>The next step is <strong>parsing</strong>. 
This is where our syntax gets a <strong>grammar</strong><span class=\"em\">&mdash;</span>the\nability to compose larger expressions and statements out of smaller parts. Did\nyou ever diagram sentences in English class? If so, you&rsquo;ve done what a parser\ndoes, except that English has thousands and thousands of &ldquo;keywords&rdquo; and an\noverflowing cornucopia of ambiguity. Programming languages are much simpler.</p>\n<p>A <strong>parser</strong> takes the flat sequence of tokens and builds a tree structure that\nmirrors the nested nature of the grammar. These trees have a couple of different\nnames<span class=\"em\">&mdash;</span><strong>parse tree</strong> or <strong>abstract syntax tree</strong><span class=\"em\">&mdash;</span>depending on how\nclose to the bare syntactic structure of the source language they are. In\npractice, language hackers usually call them <strong>syntax trees</strong>, <strong>ASTs</strong>, or\noften just <strong>trees</strong>.</p><img src=\"image/a-map-of-the-territory/ast.png\" alt=\"An abstract syntax tree.\" />\n<p>Parsing has a long, rich history in computer science that is closely tied to the\nartificial intelligence community. Many of the techniques used today to parse\nprogramming languages were originally conceived to parse <em>human</em> languages by AI\nresearchers who were trying to get computers to talk to us.</p>\n<p>It turns out human languages were too messy for the rigid grammars those parsers\ncould handle, but they were a perfect fit for the simpler artificial grammars of\nprogramming languages. Alas, we flawed humans still manage to use those simple\ngrammars incorrectly, so the parser&rsquo;s job also includes letting us know when we\ndo by reporting <strong>syntax errors</strong>.</p>\n<h3><a href=\"#static-analysis\" id=\"static-analysis\"><small>2&#8202;.&#8202;1&#8202;.&#8202;3</small>Static analysis</a></h3>\n<p>The first two stages are pretty similar across all implementations. 
Now, the\nindividual characteristics of each language start coming into play. At this\npoint, we know the syntactic structure of the code<span class=\"em\">&mdash;</span>things like which\nexpressions are nested in which<span class=\"em\">&mdash;</span>but we don&rsquo;t know much more than that.</p>\n<p>In an expression like <code>a + b</code>, we know we are adding <code>a</code> and <code>b</code>, but we don&rsquo;t\nknow what those names refer to. Are they local variables? Global? Where are they\ndefined?</p>\n<p>The first bit of analysis that most languages do is called <strong>binding</strong> or\n<strong>resolution</strong>. For each <strong>identifier</strong>, we find out where that name is defined\nand wire the two together. This is where <strong>scope</strong> comes into play<span class=\"em\">&mdash;</span>the region\nof source code where a certain name can be used to refer to a certain\ndeclaration.</p>\n<p>If the language is <span name=\"type\">statically typed</span>, this is when we\ntype check. Once we know where <code>a</code> and <code>b</code> are declared, we can also figure out\ntheir types. Then if those types don&rsquo;t support being added to each other, we\nreport a <strong>type error</strong>.</p>\n<aside name=\"type\">\n<p>The language we&rsquo;ll build in this book is dynamically typed, so it will do its\ntype checking later, at runtime.</p>\n</aside>\n<p>Take a deep breath. We have attained the summit of the mountain and a sweeping\nview of the user&rsquo;s program. All this semantic insight that is visible to us from\nanalysis needs to be stored somewhere. There are a few places we can squirrel it\naway:</p>\n<ul>\n<li>\n<p>Often, it gets stored right back as <strong>attributes</strong> on the syntax tree\nitself<span class=\"em\">&mdash;</span>extra fields in the nodes that aren&rsquo;t initialized during parsing\nbut get filled in later.</p>\n</li>\n<li>\n<p>Other times, we may store data in a lookup table off to the side. 
Typically,\nthe keys to this table are identifiers<span class=\"em\">&mdash;</span>names of variables and declarations.\nIn that case, we call it a <strong>symbol table</strong> and the values it associates with\neach key tell us what that identifier refers to.</p>\n</li>\n<li>\n<p>The most powerful bookkeeping tool is to transform the tree into an entirely\nnew data structure that more directly expresses the semantics of the code.\nThat&rsquo;s the next section.</p>\n</li>\n</ul>\n<p>Everything up to this point is considered the <strong>front end</strong> of the\nimplementation. You might guess everything after this is the <strong>back end</strong>, but\nno. Back in the days of yore when &ldquo;front end&rdquo; and &ldquo;back end&rdquo; were coined,\ncompilers were much simpler. Later researchers invented new phases to stuff\nbetween the two halves. Rather than discard the old terms, William Wulf and\ncompany lumped those new phases into the charming but spatially paradoxical name\n<strong>middle end</strong>.</p>\n<h3><a href=\"#intermediate-representations\" id=\"intermediate-representations\"><small>2&#8202;.&#8202;1&#8202;.&#8202;4</small>Intermediate representations</a></h3>\n<p>You can think of the compiler as a pipeline where each stage&rsquo;s job is to\norganize the data representing the user&rsquo;s code in a way that makes the next\nstage simpler to implement. The front end of the pipeline is specific to the\nsource language the program is written in. The back end is concerned with the\nfinal architecture where the program will run.</p>\n<p>In the middle, the code may be stored in some <span name=\"ir\"><strong>intermediate\nrepresentation</strong></span> (<strong>IR</strong>) that isn&rsquo;t tightly tied to either the source or\ndestination forms (hence &ldquo;intermediate&rdquo;). Instead, the IR acts as an interface\nbetween these two languages.</p>\n<aside name=\"ir\">\n<p>There are a few well-established styles of IRs out there. 
Hit your search engine\nof choice and look for &ldquo;control flow graph&rdquo;, &ldquo;static single-assignment&rdquo;,\n&ldquo;continuation-passing style&rdquo;, and &ldquo;three-address code&rdquo;.</p>\n</aside>\n<p>This lets you support multiple source languages and target platforms with less\neffort. Say you want to implement Pascal, C, and Fortran compilers, and you want\nto target x86, ARM, and, I dunno, SPARC. Normally, that means you&rsquo;re signing up\nto write <em>nine</em> full compilers: Pascal&rarr;x86, C&rarr;ARM, and every other\ncombination.</p>\n<p>A <span name=\"gcc\">shared</span> intermediate representation reduces that\ndramatically. You write <em>one</em> front end for each source language that produces\nthe IR. Then <em>one</em> back end for each target architecture. Now you can mix and\nmatch those to get every combination.</p>\n<aside name=\"gcc\">\n<p>If you&rsquo;ve ever wondered how <a href=\"https://en.wikipedia.org/wiki/GNU_Compiler_Collection\">GCC</a> supports so many crazy languages and\narchitectures, like Modula-3 on Motorola 68k, now you know. Language front ends\ntarget one of a handful of IRs, mainly <a href=\"https://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html\">GIMPLE</a> and <a href=\"https://gcc.gnu.org/onlinedocs/gccint/RTL.html\">RTL</a>. 
Target back ends\nlike the one for 68k then take those IRs and produce native code.</p>\n</aside>\n<p>There&rsquo;s another big reason we might want to transform the code into a form that\nmakes the semantics more apparent<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<h3><a href=\"#optimization\" id=\"optimization\"><small>2&#8202;.&#8202;1&#8202;.&#8202;5</small>Optimization</a></h3>\n<p>Once we understand what the user&rsquo;s program means, we are free to swap it out\nwith a different program that has the <em>same semantics</em> but implements them more\nefficiently<span class=\"em\">&mdash;</span>we can <strong>optimize</strong> it.</p>\n<p>A simple example is <strong>constant folding</strong>: if some expression always evaluates to\nthe exact same value, we can do the evaluation at compile time and replace the\ncode for the expression with its result. If the user typed in this:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">pennyArea</span> = <span class=\"n\">3.14159</span> * (<span class=\"n\">0.75</span> / <span class=\"n\">2</span>) * (<span class=\"n\">0.75</span> / <span class=\"n\">2</span>);\n</pre></div>\n<p>we could do all of that arithmetic in the compiler and change the code to:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">pennyArea</span> = <span class=\"n\">0.4417860938</span>;\n</pre></div>\n<p>Optimization is a huge part of the programming language business. Many language\nhackers spend their entire careers here, squeezing every drop of performance\nthey can out of their compilers to get their benchmarks a fraction of a percent\nfaster. It can become a sort of obsession.</p>\n<p>We&rsquo;re mostly going to <span name=\"rathole\">hop over that rathole</span> in this\nbook. Many successful languages have surprisingly few compile-time\noptimizations. 
For example, Lua and CPython generate relatively unoptimized\ncode, and focus most of their performance effort on the runtime.</p>\n<aside name=\"rathole\">\n<p>If you can&rsquo;t resist poking your foot into that hole, some keywords to get you\nstarted are &ldquo;constant propagation&rdquo;, &ldquo;common subexpression elimination&rdquo;, &ldquo;loop\ninvariant code motion&rdquo;, &ldquo;global value numbering&rdquo;, &ldquo;strength reduction&rdquo;, &ldquo;scalar\nreplacement of aggregates&rdquo;, &ldquo;dead code elimination&rdquo;, and &ldquo;loop unrolling&rdquo;.</p>\n</aside>\n<h3><a href=\"#code-generation\" id=\"code-generation\"><small>2&#8202;.&#8202;1&#8202;.&#8202;6</small>Code generation</a></h3>\n<p>We have applied all of the optimizations we can think of to the user&rsquo;s program.\nThe last step is converting it to a form the machine can actually run. In other\nwords, <strong>generating code</strong> (or <strong>code gen</strong>), where &ldquo;code&rdquo; here usually refers to\nthe kind of primitive assembly-like instructions a CPU runs and not the kind of\n&ldquo;source code&rdquo; a human might want to read.</p>\n<p>Finally, we are in the <strong>back end</strong>, descending the other side of the mountain.\nFrom here on out, our representation of the code becomes more and more\nprimitive, like evolution run in reverse, as we get closer to something our\nsimple-minded machine can understand.</p>\n<p>We have a decision to make. Do we generate instructions for a real CPU or a\nvirtual one? If we generate real machine code, we get an executable that the OS\ncan load directly onto the chip. Native code is lightning fast, but generating\nit is a lot of work. Today&rsquo;s architectures have piles of instructions, complex\npipelines, and enough <span name=\"aad\">historical baggage</span> to fill a 747&rsquo;s\nluggage bay.</p>\n<p>Speaking the chip&rsquo;s language also means your compiler is tied to a specific\narchitecture. 
If your compiler targets <a href=\"https://en.wikipedia.org/wiki/X86\">x86</a> machine code, it&rsquo;s not going to\nrun on an <a href=\"https://en.wikipedia.org/wiki/ARM_architecture\">ARM</a> device. All the way back in the &rsquo;60s, during the\nCambrian explosion of computer architectures, that lack of portability was a\nreal obstacle.</p>\n<aside name=\"aad\">\n<p>For example, the <a href=\"http://www.felixcloutier.com/x86/AAD.html\">AAD</a> (&ldquo;ASCII Adjust AX Before Division&rdquo;) instruction lets\nyou perform division, which sounds useful. Except that instruction takes, as\noperands, two binary-coded decimal digits packed into a single 16-bit register.\nWhen was the last time <em>you</em> needed BCD on a 16-bit machine?</p>\n</aside>\n<p>To get around that, hackers like Martin Richards and Niklaus Wirth, of BCPL and\nPascal fame, respectively, made their compilers produce <em>virtual</em> machine code.\nInstead of instructions for some real chip, they produced code for a\nhypothetical, idealized machine. Wirth called this <strong>p-code</strong> for <em>portable</em>,\nbut today, we generally call it <strong>bytecode</strong> because each instruction is often a\nsingle byte long.</p>\n<p>These synthetic instructions are designed to map a little more closely to the\nlanguage&rsquo;s semantics, and not be so tied to the peculiarities of any one\ncomputer architecture and its accumulated historical cruft. You can think of it\nlike a dense, binary encoding of the language&rsquo;s low-level operations.</p>\n<h3><a href=\"#virtual-machine\" id=\"virtual-machine\"><small>2&#8202;.&#8202;1&#8202;.&#8202;7</small>Virtual machine</a></h3>\n<p>If your compiler produces bytecode, your work isn&rsquo;t over once that&rsquo;s done. Since\nthere is no chip that speaks that bytecode, it&rsquo;s your job to translate. Again,\nyou have two options. 
You can write a little mini-compiler for each target\narchitecture that converts the bytecode to native code for that machine. You\nstill have to do work for <span name=\"shared\">each</span> chip you support, but\nthis last stage is pretty simple and you get to reuse the rest of the compiler\npipeline across all of the machines you support. You&rsquo;re basically using your\nbytecode as an intermediate representation.</p>\n<aside name=\"shared\" class=\"bottom\">\n<p>The basic principle here is that the farther down the pipeline you push the\narchitecture-specific work, the more of the earlier phases you can share across\narchitectures.</p>\n<p>There is a tension, though. Many optimizations, like register allocation and\ninstruction selection, work best when they know the strengths and capabilities\nof a specific chip. Figuring out which parts of your compiler can be shared and\nwhich should be target-specific is an art.</p>\n</aside>\n<p>Or you can write a <span name=\"vm\"><strong>virtual machine</strong></span> (<strong>VM</strong>), a\nprogram that emulates a hypothetical chip supporting your virtual architecture\nat runtime. Running bytecode in a VM is slower than translating it to native\ncode ahead of time because every instruction must be simulated at runtime each\ntime it executes. In return, you get simplicity and portability. Implement your\nVM in, say, C, and you can run your language on any platform that has a C\ncompiler. This is how the second interpreter we build in this book works.</p>\n<aside name=\"vm\">\n<p>The term &ldquo;virtual machine&rdquo; also refers to a different kind of abstraction. A\n<strong>system virtual machine</strong> emulates an entire hardware platform and operating\nsystem in software. 
This is how you can play Windows games on your Linux\nmachine, and how cloud providers give customers the user experience of\ncontrolling their own &ldquo;server&rdquo; without needing to physically allocate separate\ncomputers for each user.</p>\n<p>The kind of VMs we&rsquo;ll talk about in this book are <strong>language virtual machines</strong>\nor <strong>process virtual machines</strong> if you want to be unambiguous.</p>\n</aside>\n<h3><a href=\"#runtime\" id=\"runtime\"><small>2&#8202;.&#8202;1&#8202;.&#8202;8</small>Runtime</a></h3>\n<p>We have finally hammered the user&rsquo;s program into a form that we can execute. The\nlast step is running it. If we compiled it to machine code, we simply tell the\noperating system to load the executable and off it goes. If we compiled it to\nbytecode, we need to start up the VM and load the program into that.</p>\n<p>In both cases, for all but the basest of low-level languages, we usually need\nsome services that our language provides while the program is running. For\nexample, if the language automatically manages memory, we need a garbage\ncollector going in order to reclaim unused bits. If our language supports\n&ldquo;instance of&rdquo; tests so you can see what kind of object you have, then we need\nsome representation to keep track of the type of each object during execution.</p>\n<p>All of this stuff is going at runtime, so it&rsquo;s called, appropriately, the\n<strong>runtime</strong>. In a fully compiled language, the code implementing the runtime\ngets inserted directly into the resulting executable. In, say, <a href=\"https://golang.org/\">Go</a>, each\ncompiled application has its own copy of Go&rsquo;s runtime directly embedded in it.\nIf the language is run inside an interpreter or VM, then the runtime lives\nthere. 
This is how most implementations of languages like Java, Python, and\nJavaScript work.</p>\n<h2><a href=\"#shortcuts-and-alternate-routes\" id=\"shortcuts-and-alternate-routes\"><small>2&#8202;.&#8202;2</small>Shortcuts and Alternate Routes</a></h2>\n<p>That&rsquo;s the long path covering every possible phase you might implement. Many\nlanguages do walk the entire route, but there are a few shortcuts and alternate\npaths.</p>\n<h3><a href=\"#single-pass-compilers\" id=\"single-pass-compilers\"><small>2&#8202;.&#8202;2&#8202;.&#8202;1</small>Single-pass compilers</a></h3>\n<p>Some simple compilers interleave parsing, analysis, and code generation so that\nthey produce output code directly in the parser, without ever allocating any\nsyntax trees or other IRs. These <span name=\"sdt\"><strong>single-pass\ncompilers</strong></span> restrict the design of the language. You have no intermediate\ndata structures to store global information about the program, and you don&rsquo;t\nrevisit any previously parsed part of the code. That means as soon as you see\nsome expression, you need to know enough to correctly compile it.</p>\n<aside name=\"sdt\">\n<p><a href=\"https://en.wikipedia.org/wiki/Syntax-directed_translation\"><strong>Syntax-directed translation</strong></a> is a structured technique for building\nthese all-at-once compilers. You associate an <em>action</em> with each piece of the\ngrammar, usually one that generates output code. Then, whenever the parser\nmatches that chunk of syntax, it executes the action, building up the target\ncode one rule at a time.</p>\n</aside>\n<p>Pascal and C were designed around this limitation. At the time, memory was so\nprecious that a compiler might not even be able to hold an entire <em>source file</em>\nin memory, much less the whole program. This is why Pascal&rsquo;s grammar requires\ntype declarations to appear first in a block. 
It&rsquo;s why in C you can&rsquo;t call a\nfunction above the code that defines it unless you have an explicit forward\ndeclaration that tells the compiler what it needs to know to generate code for a\ncall to the later function.</p>\n<h3><a href=\"#tree-walk-interpreters\" id=\"tree-walk-interpreters\"><small>2&#8202;.&#8202;2&#8202;.&#8202;2</small>Tree-walk interpreters</a></h3>\n<p>Some programming languages begin executing code right after parsing it to an AST\n(with maybe a bit of static analysis applied). To run the program, the\ninterpreter traverses the syntax tree one branch and leaf at a time, evaluating\neach node as it goes.</p>\n<p>This implementation style is common for student projects and little languages,\nbut is not widely used for <span name=\"ruby\">general-purpose</span> languages\nsince it tends to be slow. Some people use &ldquo;interpreter&rdquo; to mean only these\nkinds of implementations, but others define that word more generally, so I&rsquo;ll\nuse the inarguably explicit <strong>tree-walk interpreter</strong> to refer to these. Our\nfirst interpreter rolls this way.</p>\n<aside name=\"ruby\">\n<p>A notable exception is early versions of Ruby, which were tree walkers. At 1.9,\nthe canonical implementation of Ruby switched from the original MRI (Matz&rsquo;s Ruby\nInterpreter) to Koichi Sasada&rsquo;s YARV (Yet Another Ruby VM). YARV is a\nbytecode virtual machine.</p>\n</aside>\n<h3><a href=\"#transpilers\" id=\"transpilers\"><small>2&#8202;.&#8202;2&#8202;.&#8202;3</small>Transpilers</a></h3>\n<p><span name=\"gary\">Writing</span> a complete back end for a language can be a lot\nof work. If you have some existing generic IR to target, you could bolt your\nfront end onto that. Otherwise, it seems like you&rsquo;re stuck. But what if you\ntreated some other <em>source language</em> as if it were an intermediate\nrepresentation?</p>\n<p>You write a front end for your language. 
Then, in the back end, instead of doing\nall the work to <em>lower</em> the semantics to some primitive target language, you\nproduce a string of valid source code for some other language that&rsquo;s about as\nhigh level as yours. Then, you use the existing compilation tools for <em>that</em>\nlanguage as your escape route off the mountain and down to something you can\nexecute.</p>\n<p>They used to call this a <strong>source-to-source compiler</strong> or a <strong>transcompiler</strong>.\nAfter the rise of languages that compile to JavaScript in order to run in the\nbrowser, they&rsquo;ve affected the hipster sobriquet <strong>transpiler</strong>.</p>\n<aside name=\"gary\">\n<p>The first transcompiler, XLT86, translated 8080 assembly into 8086 assembly.\nThat might seem straightforward, but keep in mind the 8080 was an 8-bit chip and\nthe 8086 a 16-bit chip that could use each register as a pair of 8-bit ones.\nXLT86 did data flow analysis to track register usage in the source program and\nthen efficiently map it to the register set of the 8086.</p>\n<p>It was written by Gary Kildall, a tragic hero of computer science if there\never was one. One of the first people to recognize the promise of\nmicrocomputers, he created PL/M and CP/M, the first high-level language and OS\nfor them.</p>\n<p>He was a sea captain, business owner, licensed pilot, and motorcyclist. A TV\nhost with the Kris Kristofferson-esque look sported by dashing bearded dudes in\nthe &rsquo;80s. He took on Bill Gates and, like many, lost, before meeting his end in\na biker bar under mysterious circumstances. He died too young, but sure as hell\nlived before he did.</p>\n</aside>\n<p>While the first transcompiler translated one assembly language to another,\ntoday, most transpilers work on higher-level languages. After the viral spread\nof UNIX to machines various and sundry, there began a long tradition of\ncompilers that produced C as their output language. 
C compilers were available\neverywhere UNIX was and produced efficient code, so targeting C was a good way\nto get your language running on a lot of architectures.</p>\n<p>Web browsers are the &ldquo;machines&rdquo; of today, and their &ldquo;machine code&rdquo; is\nJavaScript, so these days it seems <a href=\"https://github.com/jashkenas/coffeescript/wiki/list-of-languages-that-compile-to-js\">almost every language out there</a> has a\ncompiler that targets JS since that&rsquo;s the <span name=\"js\">main</span> way to get\nyour code running in a browser.</p>\n<aside name=\"js\">\n<p>JS used to be the <em>only</em> way to execute code in a browser. Thanks to\n<a href=\"https://github.com/webassembly/\">WebAssembly</a>, compilers now have a second, lower-level language they can\ntarget that runs on the web.</p>\n</aside>\n<p>The front end<span class=\"em\">&mdash;</span>scanner and parser<span class=\"em\">&mdash;</span>of a transpiler looks like other\ncompilers. Then, if the source language is only a simple syntactic skin over the\ntarget language, it may skip analysis entirely and go straight to outputting the\nanalogous syntax in the destination language.</p>\n<p>If the two languages are more semantically different, you&rsquo;ll see more of the\ntypical phases of a full compiler including analysis and possibly even\noptimization. Then, when it comes to code generation, instead of outputting some\nbinary language like machine code, you produce a string of grammatically correct\nsource (well, destination) code in the target language.</p>\n<p>Either way, you then run that resulting code through the output language&rsquo;s\nexisting compilation pipeline, and you&rsquo;re good to go.</p>\n<h3><a href=\"#just-in-time-compilation\" id=\"just-in-time-compilation\"><small>2&#8202;.&#8202;2&#8202;.&#8202;4</small>Just-in-time compilation</a></h3>\n<p>This last one is less a shortcut and more a dangerous alpine scramble best\nreserved for experts. 
The fastest way to execute code is by compiling it to\nmachine code, but you might not know what architecture your end user&rsquo;s machine\nsupports. What to do?</p>\n<p>You can do the same thing that the HotSpot Java Virtual Machine (JVM),\nMicrosoft&rsquo;s Common Language Runtime (CLR), and most JavaScript interpreters do.\nOn the end user&rsquo;s machine, when the program is loaded<span class=\"em\">&mdash;</span>either from source in\nthe case of JS, or platform-independent bytecode for the JVM and CLR<span class=\"em\">&mdash;</span>you\ncompile it to native code for the architecture their computer supports.\nNaturally enough, this is called <strong>just-in-time compilation</strong>. Most hackers just\nsay &ldquo;JIT&rdquo;, pronounced like it rhymes with &ldquo;fit&rdquo;.</p>\n<p>The most sophisticated JITs insert profiling hooks into the generated code to\nsee which regions are most performance critical and what kind of data is flowing\nthrough them. Then, over time, they will automatically recompile those <span\nname=\"hot\">hot spots</span> with more advanced optimizations.</p>\n<aside name=\"hot\">\n<p>This is, of course, exactly where the HotSpot JVM gets its name.</p>\n</aside>\n<h2><a href=\"#compilers-and-interpreters\" id=\"compilers-and-interpreters\"><small>2&#8202;.&#8202;3</small>Compilers and Interpreters</a></h2>\n<p>Now that I&rsquo;ve stuffed your head with a dictionary&rsquo;s worth of programming\nlanguage jargon, we can finally address a question that&rsquo;s plagued coders since\ntime immemorial: What&rsquo;s the difference between a compiler and an interpreter?</p>\n<p>It turns out this is like asking the difference between a fruit and a vegetable.\nThat seems like a binary either-or choice, but actually &ldquo;fruit&rdquo; is a <em>botanical</em>\nterm and &ldquo;vegetable&rdquo; is <em>culinary</em>. One does not strictly imply the negation of\nthe other. 
There are fruits that aren&rsquo;t vegetables (apples) and vegetables that\naren&rsquo;t fruits (carrots), but also edible plants that are both fruits <em>and</em>\nvegetables, like tomatoes.</p>\n<p><span name=\"veg\"></span></p><img src=\"image/a-map-of-the-territory/plants.png\" alt=\"A Venn diagram of edible plants\" />\n<aside name=\"veg\">\n<p>Peanuts (which are not even nuts) and cereals like wheat are actually fruit, but\nI got this drawing wrong. What can I say, I&rsquo;m a software engineer, not a\nbotanist. I should probably erase the little peanut guy, but he&rsquo;s so cute that I\ncan&rsquo;t bear to.</p>\n<p>Now <em>pine nuts</em>, on the other hand, are plant-based foods that are neither\nfruits nor vegetables. At least as far as I can tell.</p>\n</aside>\n<p>So, back to languages:</p>\n<ul>\n<li>\n<p><strong>Compiling</strong> is an <em>implementation technique</em> that involves translating a\nsource language to some other<span class=\"em\">&mdash;</span>usually lower-level<span class=\"em\">&mdash;</span>form. When you\ngenerate bytecode or machine code, you are compiling. When you transpile to\nanother high-level language, you are compiling too.</p>\n</li>\n<li>\n<p>When we say a language implementation &ldquo;is a <strong>compiler</strong>&rdquo;, we mean it\ntranslates source code to some other form but doesn&rsquo;t execute it. The user has\nto take the resulting output and run it themselves.</p>\n</li>\n<li>\n<p>Conversely, when we say an implementation &ldquo;is an <strong>interpreter</strong>&rdquo;, we mean it\ntakes in source code and executes it immediately. It runs programs &ldquo;from\nsource&rdquo;.</p>\n</li>\n</ul>\n<p>Like apples and oranges, some implementations are clearly compilers and <em>not</em>\ninterpreters. GCC and Clang take your C code and compile it to machine code. An\nend user runs that executable directly and may never even know which tool was\nused to compile it. 
So those are <em>compilers</em> for C.</p>\n<p>In older versions of Matz&rsquo;s canonical implementation of Ruby, the user ran Ruby\nfrom source. The implementation parsed it and executed it directly by traversing\nthe syntax tree. No other translation occurred, either internally or in any\nuser-visible form. So this was definitely an <em>interpreter</em> for Ruby.</p>\n<p>But what of CPython? When you run your Python program using it, the code is\nparsed and converted to an internal bytecode format, which is then executed\ninside the VM. From the user&rsquo;s perspective, this is clearly an interpreter<span class=\"em\">&mdash;</span>they run their program from source. But if you look under CPython&rsquo;s scaly skin,\nyou&rsquo;ll see that there is definitely some compiling going on.</p>\n<p>The answer is that it is <span name=\"go\">both</span>. CPython <em>is</em> an\ninterpreter, and it <em>has</em> a compiler. In practice, most scripting languages work\nthis way, as you can see:</p>\n<aside name=\"go\">\n<p>The <a href=\"https://golang.org/\">Go tool</a> is even more of a horticultural curiosity. If you run <code>go build</code>, it compiles your Go source code to machine code and stops. If you type\n<code>go run</code>, it does that, then immediately executes the generated executable.</p>\n<p>So <code>go</code> <em>is</em> a compiler (you can use it as a tool to compile code without\nrunning it), <em>is</em> an interpreter (you can invoke it to immediately run a program\nfrom source), and also <em>has</em> a compiler (when you use it as an interpreter, it\nis still compiling internally).</p>\n</aside><img src=\"image/a-map-of-the-territory/venn.png\" alt=\"A Venn diagram of compilers and interpreters\" />\n<p>That overlapping region in the center is where our second interpreter lives too,\nsince it internally compiles to bytecode. 
So while this book is nominally about\ninterpreters, we&rsquo;ll cover some compilation too.</p>\n<h2><a href=\"#our-journey\" id=\"our-journey\"><small>2&#8202;.&#8202;4</small>Our Journey</a></h2>\n<p>That&rsquo;s a lot to take in all at once. Don&rsquo;t worry. This isn&rsquo;t the chapter where\nyou&rsquo;re expected to <em>understand</em> all of these pieces and parts. I just want you\nto know that they are out there and roughly how they fit together.</p>\n<p>This map should serve you well as you explore the territory beyond the guided\npath we take in this book. I want to leave you yearning to strike out on your\nown and wander all over that mountain.</p>\n<p>But, for now, it&rsquo;s time for our own journey to begin. Tighten your bootlaces,\ncinch up your pack, and come along. From <span name=\"here\">here</span> on out,\nall you need to focus on is the path in front of you.</p>\n<aside name=\"here\">\n<p>Henceforth, I promise to tone down the whole mountain metaphor thing.</p>\n</aside>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Pick an open source implementation of a language you like. Download the\nsource code and poke around in it. Try to find the code that implements the\nscanner and parser. Are they handwritten, or generated using tools like\nLex and Yacc? (<code>.l</code> or <code>.y</code> files usually imply the latter.)</p>\n</li>\n<li>\n<p>Just-in-time compilation tends to be the fastest way to implement dynamically\ntyped languages, but not all of them use it. What reasons are there to <em>not</em>\nJIT?</p>\n</li>\n<li>\n<p>Most Lisp implementations that compile to C also contain an interpreter that\nlets them execute Lisp code on the fly as well. 
Why?</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"the-lox-language.html\" class=\"next\">\n  Next Chapter: &ldquo;The Lox Language&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/a-tree-walk-interpreter.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>A Tree-Walk Interpreter &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h2><small>II</small>A Tree-Walk Interpreter</h2>\n\n<ul>\n    <li><a href=\"scanning.html\"><small>4</small>Scanning</a></li>\n    <li><a href=\"representing-code.html\"><small>5</small>Representing Code</a></li>\n    <li><a href=\"parsing-expressions.html\"><small>6</small>Parsing Expressions</a></li>\n    <li><a href=\"evaluating-expressions.html\"><small>7</small>Evaluating Expressions</a></li>\n    <li><a href=\"statements-and-state.html\"><small>8</small>Statements and State</a></li>\n    <li><a 
href=\"control-flow.html\"><small>9</small>Control Flow</a></li>\n    <li><a href=\"functions.html\"><small>10</small>Functions</a></li>\n    <li><a href=\"resolving-and-binding.html\"><small>11</small>Resolving and Binding</a></li>\n    <li><a href=\"classes.html\"><small>12</small>Classes</a></li>\n    <li><a href=\"inheritance.html\"><small>13</small>Inheritance</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"the-lox-language.html\" title=\"The Lox Language\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"contents.html\" title=\"Table of Contents\">&uarr;&nbsp;Up</a>\n    <a href=\"scanning.html\" title=\"Scanning\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"the-lox-language.html\" title=\"The Lox Language\" class=\"prev\">←</a>\n<a href=\"scanning.html\" title=\"Scanning\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h2><small>II</small>A Tree-Walk Interpreter</h2>\n\n<ul>\n    <li><a href=\"scanning.html\"><small>4</small>Scanning</a></li>\n    <li><a href=\"representing-code.html\"><small>5</small>Representing Code</a></li>\n    <li><a href=\"parsing-expressions.html\"><small>6</small>Parsing Expressions</a></li>\n    <li><a href=\"evaluating-expressions.html\"><small>7</small>Evaluating Expressions</a></li>\n    <li><a href=\"statements-and-state.html\"><small>8</small>Statements and State</a></li>\n    <li><a href=\"control-flow.html\"><small>9</small>Control Flow</a></li>\n    <li><a href=\"functions.html\"><small>10</small>Functions</a></li>\n    <li><a href=\"resolving-and-binding.html\"><small>11</small>Resolving and Binding</a></li>\n    <li><a href=\"classes.html\"><small>12</small>Classes</a></li>\n    <li><a 
href=\"inheritance.html\"><small>13</small>Inheritance</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"the-lox-language.html\" title=\"The Lox Language\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"contents.html\" title=\"Table of Contents\">&uarr;&nbsp;Up</a>\n    <a href=\"scanning.html\" title=\"Scanning\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">II</div>\n  <h1 class=\"part\">A Tree-Walk Interpreter</h1>\n\n<p>With this part, we begin jlox, the first of our two interpreters. Programming\nlanguages are a huge topic with piles of concepts and terminology to cram into\nyour brain all at once. Programming language theory requires a level of mental\nrigor that you probably haven&rsquo;t had to summon since your last calculus final.\n(Fortunately there isn&rsquo;t too much theory in this book.)</p>\n<p>Implementing an interpreter uses a few architectural tricks and design\npatterns uncommon in other kinds of applications, so we&rsquo;ll be getting used to\nthe engineering side of things too. Given all of that, we&rsquo;ll keep the code we\nhave to write as simple and plain as possible.</p>\n<p>In less than two thousand lines of clean Java code, we&rsquo;ll build a complete\ninterpreter for Lox that implements every single feature of the language,\nexactly as we&rsquo;ve specified. The first few chapters work front-to-back through\nthe phases of the interpreter<span class=\"em\">&mdash;</span><a href=\"scanning.html\">scanning</a>, <a href=\"parsing-expressions.html\">parsing</a>, and\n<a href=\"evaluating-expressions.html\">evaluating code</a>. 
After that, we add language features one at a time,\ngrowing a simple calculator into a full-fledged scripting language.</p>\n\n<footer>\n<a href=\"scanning.html\" class=\"next\">\n  Next Chapter: &ldquo;Scanning&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/a-virtual-machine.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>A Virtual Machine &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">A Virtual Machine<small>15</small></a></h3>\n\n<ul>\n    <li><a href=\"#an-instruction-execution-machine\"><small>15.1</small> An Instruction Execution Machine</a></li>\n    <li><a href=\"#a-value-stack-manipulator\"><small>15.2</small> A Value Stack Manipulator</a></li>\n    <li><a href=\"#an-arithmetic-calculator\"><small>15.3</small> An Arithmetic Calculator</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a 
href=\"#design-note\"><small>note</small>Register-Based Bytecode</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"chunks-of-bytecode.html\" title=\"Chunks of Bytecode\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"scanning-on-demand.html\" title=\"Scanning on Demand\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"chunks-of-bytecode.html\" title=\"Chunks of Bytecode\" class=\"prev\">←</a>\n<a href=\"scanning-on-demand.html\" title=\"Scanning on Demand\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">A Virtual Machine<small>15</small></a></h3>\n\n<ul>\n    <li><a href=\"#an-instruction-execution-machine\"><small>15.1</small> An Instruction Execution Machine</a></li>\n    <li><a href=\"#a-value-stack-manipulator\"><small>15.2</small> A Value Stack Manipulator</a></li>\n    <li><a href=\"#an-arithmetic-calculator\"><small>15.3</small> An Arithmetic Calculator</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Register-Based Bytecode</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"chunks-of-bytecode.html\" title=\"Chunks of Bytecode\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"scanning-on-demand.html\" title=\"Scanning on Demand\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article 
class=\"chapter\">\n\n  <div class=\"number\">15</div>\n  <h1>A Virtual Machine</h1>\n\n<blockquote>\n<p>Magicians protect their secrets not because the secrets are large and\nimportant, but because they are so small and trivial. The wonderful effects\ncreated on stage are often the result of a secret so absurd that the magician\nwould be embarrassed to admit that that was how it was done.</p>\n<p><cite>Christopher Priest, <em>The Prestige</em></cite></p>\n</blockquote>\n<p>We&rsquo;ve spent a lot of time talking about how to represent a program as a sequence\nof bytecode instructions, but it feels like learning biology using only stuffed,\ndead animals. We know what instructions are in theory, but we&rsquo;ve never seen them\nin action, so it&rsquo;s hard to really understand what they <em>do</em>. It would be hard to\nwrite a compiler that outputs bytecode when we don&rsquo;t have a good understanding\nof how that bytecode behaves.</p>\n<p>So, before we go and build the front end of our new interpreter, we will begin\nwith the back end<span class=\"em\">&mdash;</span>the virtual machine that executes instructions. It breathes\nlife into the bytecode. Watching the instructions prance around gives us a\nclearer picture of how a compiler might translate the user&rsquo;s source code into a\nseries of them.</p>\n<h2><a href=\"#an-instruction-execution-machine\" id=\"an-instruction-execution-machine\"><small>15&#8202;.&#8202;1</small>An Instruction Execution Machine</a></h2>\n<p>The virtual machine is one part of our interpreter&rsquo;s internal architecture. You\nhand it a chunk of code<span class=\"em\">&mdash;</span>literally a Chunk<span class=\"em\">&mdash;</span>and it runs it. 
The code and\ndata structures for the VM reside in a new module.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.h</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#ifndef clox_vm_h</span>\n<span class=\"a\">#define clox_vm_h</span>\n\n<span class=\"a\">#include &quot;chunk.h&quot;</span>\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>;\n} <span class=\"a\">VM</span>;\n\n<span class=\"t\">void</span> <span class=\"i\">initVM</span>();\n<span class=\"t\">void</span> <span class=\"i\">freeVM</span>();\n\n<span class=\"a\">#endif</span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, create new file</div>\n\n<p>As usual, we start simple. The VM will gradually acquire a whole pile of state\nit needs to keep track of, so we define a struct now to stuff that all in.\nCurrently, all we store is the chunk that it executes.</p>\n<p>Like we do with most of the data structures we create, we also define functions\nto create and tear down a VM. Here&rsquo;s the implementation:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#include &quot;common.h&quot;</span>\n<span class=\"a\">#include &quot;vm.h&quot;</span>\n\n<span class=\"a\">VM</span> <span class=\"i\">vm</span>;<span name=\"one\"> </span>\n\n<span class=\"t\">void</span> <span class=\"i\">initVM</span>() {\n}\n\n<span class=\"t\">void</span> <span class=\"i\">freeVM</span>() {\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, create new file</div>\n\n<p>OK, calling those functions &ldquo;implementations&rdquo; is a stretch. We don&rsquo;t have any\ninteresting state to initialize or free yet, so the functions are empty. Trust\nme, we&rsquo;ll get there.</p>\n<p>The slightly more interesting line here is that declaration of <code>vm</code>. 
This module\nis eventually going to have a slew of functions and it would be a chore to pass\naround a pointer to the VM to all of them. Instead, we declare a single global\nVM object. We need only one anyway, and this keeps the code in the book a little\nlighter on the page.</p>\n<aside name=\"one\">\n<p>The choice to have a static VM instance is a concession for the book, but not\nnecessarily a sound engineering choice for a real language implementation. If\nyou&rsquo;re building a VM that&rsquo;s designed to be embedded in other host applications,\nit gives the host more flexibility if you <em>do</em> explicitly take a VM pointer\nand pass it around.</p>\n<p>That way, the host app can control when and where memory for the VM is\nallocated, run multiple VMs in parallel, etc.</p>\n<p>What I&rsquo;m doing here is a global variable, and <a href=\"http://gameprogrammingpatterns.com/singleton.html\">everything bad you&rsquo;ve heard about\nglobal variables</a> is still true when programming in the large. But when\nkeeping things small for a book<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n</aside>\n<p>Before we start pumping fun code into our VM, let&rsquo;s go ahead and wire it up to\nthe interpreter&rsquo;s main entrypoint.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">int main(int argc, const char* argv[]) {\n</pre><div class=\"source-file\"><em>main.c</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">initVM</span>();\n\n</pre><pre class=\"insert-after\">  Chunk chunk;\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, in <em>main</em>()</div>\n\n<p>We spin up the VM when the interpreter first starts. 
Then when we&rsquo;re about to\nexit, we wind it down.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  disassembleChunk(&amp;chunk, &quot;test chunk&quot;);\n</pre><div class=\"source-file\"><em>main.c</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">freeVM</span>();\n</pre><pre class=\"insert-after\">  freeChunk(&amp;chunk);\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, in <em>main</em>()</div>\n\n<p>One last ceremonial obligation:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;debug.h&quot;\n</pre><div class=\"source-file\"><em>main.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;vm.h&quot;</span>\n</pre><pre class=\"insert-after\">\n\nint main(int argc, const char* argv[]) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em></div>\n\n<p>Now when you run clox, it starts up the VM before it creates that hand-authored\nchunk from the <a href=\"chunks-of-bytecode.html#disassembling-chunks\">last chapter</a>. The VM is ready and waiting, so let&rsquo;s teach it\nto do something.</p>\n<h3><a href=\"#executing-instructions\" id=\"executing-instructions\"><small>15&#8202;.&#8202;1&#8202;.&#8202;1</small>Executing instructions</a></h3>\n<p>The VM springs into action when we command it to interpret a chunk of bytecode.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  disassembleChunk(&amp;chunk, &quot;test chunk&quot;);\n</pre><div class=\"source-file\"><em>main.c</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">interpret</span>(&amp;<span class=\"i\">chunk</span>);\n</pre><pre class=\"insert-after\">  freeVM();\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, in <em>main</em>()</div>\n\n<p>This function is the main entrypoint into the VM. 
It&rsquo;s declared like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void freeVM();\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nadd after <em>freeVM</em>()</div>\n<pre class=\"insert\"><span class=\"t\">InterpretResult</span> <span class=\"i\">interpret</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, add after <em>freeVM</em>()</div>\n\n<p>The VM runs the chunk and then responds with a value from this enum:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} VM;\n\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nadd after struct <em>VM</em></div>\n<pre class=\"insert\"><span class=\"k\">typedef</span> <span class=\"k\">enum</span> {\n  <span class=\"a\">INTERPRET_OK</span>,\n  <span class=\"a\">INTERPRET_COMPILE_ERROR</span>,\n  <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>\n} <span class=\"t\">InterpretResult</span>;\n\n</pre><pre class=\"insert-after\">void initVM();\nvoid freeVM();\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, add after struct <em>VM</em></div>\n\n<p>We aren&rsquo;t using the result yet, but when we have a compiler that reports static\nerrors and a VM that detects runtime errors, the interpreter will use this to\nknow how to set the exit code of the process.</p>\n<p>We&rsquo;re inching towards some actual implementation.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>freeVM</em>()</div>\n<pre><span class=\"t\">InterpretResult</span> <span class=\"i\">interpret</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>) {\n  <span class=\"i\">vm</span>.<span class=\"i\">chunk</span> = <span class=\"i\">chunk</span>;\n  <span class=\"i\">vm</span>.<span class=\"i\">ip</span> = <span class=\"i\">vm</span>.<span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>;\n  <span 
class=\"k\">return</span> <span class=\"i\">run</span>();\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>freeVM</em>()</div>\n\n<p>First, we store the chunk being executed in the VM. Then we call <code>run()</code>, an\ninternal helper function that actually runs the bytecode instructions. Between\nthose two parts is an intriguing line. What is this <code>ip</code> business?</p>\n<p>As the VM works its way through the bytecode, it keeps track of where it is<span class=\"em\">&mdash;</span>the location of the instruction currently being executed. We don&rsquo;t use a <span\nname=\"local\">local</span> variable inside <code>run()</code> for this because eventually\nother functions will need to access it. Instead, we store it as a field in VM.</p>\n<aside name=\"local\">\n<p>If we were trying to squeeze every ounce of speed out of our bytecode\ninterpreter, we would store <code>ip</code> in a local variable. It gets modified so often\nduring execution that we want the C compiler to keep it in a register.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef struct {\n  Chunk* chunk;\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nin struct <em>VM</em></div>\n<pre class=\"insert\">  <span class=\"t\">uint8_t</span>* <span class=\"i\">ip</span>;\n</pre><pre class=\"insert-after\">} VM;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, in struct <em>VM</em></div>\n\n<p>Its type is a byte pointer. 
We use an actual real C pointer pointing right into\nthe middle of the bytecode array instead of something like an integer index\nbecause it&rsquo;s faster to dereference a pointer than look up an element in an array\nby index.</p>\n<p>The name &ldquo;IP&rdquo; is traditional, and<span class=\"em\">&mdash;</span>unlike many traditional names in CS<span class=\"em\">&mdash;</span>actually makes sense: it&rsquo;s an <strong><a href=\"https://en.wikipedia.org/wiki/Program_counter\">instruction pointer</a></strong>. Almost every\ninstruction set in the <span name=\"ip\">world</span>, real and virtual, has a\nregister or variable like this.</p>\n<aside name=\"ip\">\n<p>x86, x64, and the CLR call it &ldquo;IP&rdquo;. 68k, PowerPC, ARM, p-code, and the JVM call\nit &ldquo;PC&rdquo;, for <strong>program counter</strong>.</p>\n</aside>\n<p>We initialize <code>ip</code> by pointing it at the first byte of code in the chunk. We\nhaven&rsquo;t executed that instruction yet, so <code>ip</code> points to the instruction <em>about\nto be executed</em>. 
This will be true during the entire time the VM is running: the\nIP always points to the next instruction, not the one currently being handled.</p>\n<p>The real fun happens in <code>run()</code>.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>freeVM</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">InterpretResult</span> <span class=\"i\">run</span>() {\n<span class=\"a\">#define READ_BYTE() (*vm.ip++)</span>\n\n  <span class=\"k\">for</span> (;;) {\n    <span class=\"t\">uint8_t</span> <span class=\"i\">instruction</span>;\n    <span class=\"k\">switch</span> (<span class=\"i\">instruction</span> = <span class=\"a\">READ_BYTE</span>()) {\n      <span class=\"k\">case</span> <span class=\"a\">OP_RETURN</span>: {\n        <span class=\"k\">return</span> <span class=\"a\">INTERPRET_OK</span>;\n      }\n    }\n  }\n\n<span class=\"a\">#undef READ_BYTE</span>\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>freeVM</em>()</div>\n\n<p>This is the single most <span name=\"important\">important</span> function in all\nof clox, by far. When the interpreter executes a user&rsquo;s program, it will spend\nsomething like 90% of its time inside <code>run()</code>. It is the beating heart of the\nVM.</p>\n<aside name=\"important\">\n<p>Or, at least, it <em>will</em> be in a few chapters when it has enough content to be\nuseful. Right now, it&rsquo;s not exactly a wonder of software wizardry.</p>\n</aside>\n<p>Despite that dramatic intro, it&rsquo;s conceptually pretty simple. We have an outer\nloop that goes and goes. Each turn through that loop, we read and execute a\nsingle bytecode instruction.</p>\n<p>To process an instruction, we first figure out what kind of instruction we&rsquo;re\ndealing with. The <code>READ_BYTE</code> macro reads the byte currently pointed at by <code>ip</code>\nand then <span name=\"next\">advances</span> the instruction pointer. 
The first\nbyte of any instruction is the opcode. Given a numeric opcode, we need to get to\nthe right C code that implements that instruction&rsquo;s semantics. This process is\ncalled <strong>decoding</strong> or <strong>dispatching</strong> the instruction.</p>\n<aside name=\"next\">\n<p>Note that <code>ip</code> advances as soon as we read the opcode, before we&rsquo;ve actually\nstarted executing the instruction. So, again, <code>ip</code> points to the <em>next</em>\nbyte of code to be used.</p>\n</aside>\n<p>We do that process for every single instruction, every single time one is\nexecuted, so this is the most performance critical part of the entire virtual\nmachine. Programming language lore is filled with <span\nname=\"dispatch\">clever</span> techniques to do bytecode dispatch efficiently,\ngoing all the way back to the early days of computers.</p>\n<aside name=\"dispatch\">\n<p>If you want to learn some of these techniques, look up &ldquo;direct threaded code&rdquo;,\n&ldquo;jump table&rdquo;, and &ldquo;computed goto&rdquo;.</p>\n</aside>\n<p>Alas, the fastest solutions require either non-standard extensions to C, or\nhandwritten assembly code. For clox, we&rsquo;ll keep it simple. Just like our\ndisassembler, we have a single giant <code>switch</code> statement with a case for each\nopcode. The body of each case implements that opcode&rsquo;s behavior.</p>\n<p>So far, we handle only a single instruction, <code>OP_RETURN</code>, and the only thing it\ndoes is exit the loop entirely. 
Eventually, that instruction will be used to\nreturn from the current Lox function, but we don&rsquo;t have functions yet, so we&rsquo;ll\nrepurpose it temporarily to end the execution.</p>\n<p>Let&rsquo;s go ahead and support our one other instruction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    switch (instruction = READ_BYTE()) {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_CONSTANT</span>: {\n        <span class=\"t\">Value</span> <span class=\"i\">constant</span> = <span class=\"a\">READ_CONSTANT</span>();\n        <span class=\"i\">printValue</span>(<span class=\"i\">constant</span>);\n        <span class=\"i\">printf</span>(<span class=\"s\">&quot;</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_RETURN: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>We don&rsquo;t have enough machinery in place yet to do anything useful with a\nconstant. For now, we&rsquo;ll just print it out so we interpreter hackers can see\nwhat&rsquo;s going on inside our VM. 
That call to <code>printf()</code> necessitates an include.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd to top of file</div>\n<pre class=\"insert\"><span class=\"a\">#include &lt;stdio.h&gt;</span>\n\n</pre><pre class=\"insert-after\">#include &quot;common.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add to top of file</div>\n\n<p>We also have a new macro to define.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define READ_BYTE() (*vm.ip++)\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#define READ_CONSTANT() (vm.chunk-&gt;constants.values[READ_BYTE()])</span>\n</pre><pre class=\"insert-after\">\n\n  for (;;) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p><code>READ_CONSTANT()</code> reads the next byte from the bytecode, treats the resulting\nnumber as an index, and looks up the corresponding Value in the chunk&rsquo;s constant\ntable. In later chapters, we&rsquo;ll add a few more instructions with operands that\nrefer to constants, so we&rsquo;re setting up this helper macro now.</p>\n<p>Like the previous <code>READ_BYTE</code> macro, <code>READ_CONSTANT</code> is only used inside\n<code>run()</code>. To make that scoping more explicit, the macro definitions themselves\nare confined to that function. 
We <span name=\"macro\">define</span> them at the\nbeginning and<span class=\"em\">&mdash;</span>because we care<span class=\"em\">&mdash;</span>undefine them at the end.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#undef READ_BYTE\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#undef READ_CONSTANT</span>\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<aside name=\"macro\">\n<p>Undefining these macros explicitly might seem needlessly fastidious, but C tends\nto punish sloppy users, and the C preprocessor doubly so.</p>\n</aside>\n<h3><a href=\"#execution-tracing\" id=\"execution-tracing\"><small>15&#8202;.&#8202;1&#8202;.&#8202;2</small>Execution tracing</a></h3>\n<p>If you run clox now, it executes the chunk we hand-authored in the last chapter\nand spits out <code>1.2</code> to your terminal. We can see that it&rsquo;s working, but that&rsquo;s\nonly because our implementation of <code>OP_CONSTANT</code> has temporary code to log the\nvalue. Once that instruction is doing what it&rsquo;s supposed to do and plumbing that\nconstant along to other operations that want to consume it, the VM will become a\nblack box. That makes our lives as VM implementers harder.</p>\n<p>To help ourselves out, now is a good time to add some diagnostic logging to the\nVM like we did with chunks themselves. 
In fact, we&rsquo;ll even reuse the same code.\nWe don&rsquo;t want this logging enabled all the time<span class=\"em\">&mdash;</span>it&rsquo;s just for us VM hackers,\nnot Lox users<span class=\"em\">&mdash;</span>so first we create a flag to hide it behind.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &lt;stdint.h&gt;\n</pre><div class=\"source-file\"><em>common.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define DEBUG_TRACE_EXECUTION</span>\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>common.h</em></div>\n\n<p>When this flag is defined, the VM disassembles and prints each instruction right\nbefore executing it. Where our previous disassembler walked an entire chunk\nonce, statically, this disassembles instructions dynamically, on the fly.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  for (;;) {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#ifdef DEBUG_TRACE_EXECUTION</span>\n    <span class=\"i\">disassembleInstruction</span>(<span class=\"i\">vm</span>.<span class=\"i\">chunk</span>,\n                           (<span class=\"t\">int</span>)(<span class=\"i\">vm</span>.<span class=\"i\">ip</span> - <span class=\"i\">vm</span>.<span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>));\n<span class=\"a\">#endif</span>\n\n</pre><pre class=\"insert-after\">    uint8_t instruction;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>Since <code>disassembleInstruction()</code> takes an integer byte <em>offset</em> and we store the\ncurrent instruction reference as a direct pointer, we first do a little pointer\nmath to convert <code>ip</code> back to a relative offset from the beginning of the\nbytecode. 
Then we disassemble the instruction that begins at that byte.</p>\n<p>As ever, we need to bring in the declaration of the function before we can call\nit.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;common.h&quot;\n</pre><div class=\"source-file\"><em>vm.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;debug.h&quot;</span>\n</pre><pre class=\"insert-after\">#include &quot;vm.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em></div>\n\n<p>I know this code isn&rsquo;t super impressive so far. It&rsquo;s literally a switch\nstatement wrapped in a <code>for</code> loop. But, believe it or not, this is one of the two\nmajor components of our VM. With this, we can imperatively execute instructions.\nIts simplicity is a virtue<span class=\"em\">&mdash;</span>the less work it does, the faster it can do it.\nContrast this with all of the complexity and overhead we had in jlox with the\nVisitor pattern for walking the AST.</p>\n<h2><a href=\"#a-value-stack-manipulator\" id=\"a-value-stack-manipulator\"><small>15&#8202;.&#8202;2</small>A Value Stack Manipulator</a></h2>\n<p>In addition to imperative side effects, Lox has expressions that produce,\nmodify, and consume values. Thus, our compiled bytecode needs a way to shuttle\nvalues around between the different instructions that need them. For example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> <span class=\"n\">3</span> - <span class=\"n\">2</span>;\n</pre></div>\n<p>We obviously need instructions for the constants 3 and 2, the <code>print</code> statement,\nand the subtraction. But how does the subtraction instruction know that 3 is\nthe <span name=\"word\">minuend</span> and 2 is the subtrahend? How does the print\ninstruction know to print the result of that?</p>\n<aside name=\"word\">\n<p>Yes, I did have to look up &ldquo;subtrahend&rdquo; and &ldquo;minuend&rdquo; in a dictionary. 
But\naren&rsquo;t they delightful words? &ldquo;Minuend&rdquo; sounds like a kind of Elizabethan dance\nand &ldquo;subtrahend&rdquo; might be some sort of underground Paleolithic monument.</p>\n</aside>\n<p>To put a finer point on it, look at this thing right here:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">echo</span>(<span class=\"i\">n</span>) {\n  <span class=\"k\">print</span> <span class=\"i\">n</span>;\n  <span class=\"k\">return</span> <span class=\"i\">n</span>;\n}\n\n<span class=\"k\">print</span> <span class=\"i\">echo</span>(<span class=\"i\">echo</span>(<span class=\"n\">1</span>) + <span class=\"i\">echo</span>(<span class=\"n\">2</span>)) + <span class=\"i\">echo</span>(<span class=\"i\">echo</span>(<span class=\"n\">4</span>) + <span class=\"i\">echo</span>(<span class=\"n\">5</span>));\n</pre></div>\n<p>I wrapped each subexpression in a call to <code>echo()</code> that prints and returns its\nargument. That side effect means we can see the exact order of operations.</p>\n<p>Don&rsquo;t worry about the VM for a minute. Think about just the semantics of Lox\nitself. The operands to an arithmetic operator obviously need to be evaluated\nbefore we can perform the operation itself. (It&rsquo;s pretty hard to add <code>a + b</code> if\nyou don&rsquo;t know what <code>a</code> and <code>b</code> are.) Also, when we implemented expressions in\njlox, we <span name=\"undefined\">decided</span> that the left operand must be\nevaluated before the right.</p>\n<aside name=\"undefined\">\n<p>We could have left evaluation order unspecified and let each implementation\ndecide. That leaves the door open for optimizing compilers to reorder arithmetic\nexpressions for efficiency, even in cases where the operands have visible side\neffects. C and Scheme leave evaluation order unspecified. Java specifies\nleft-to-right evaluation like we do for Lox.</p>\n<p>I think nailing down stuff like this is generally better for users. 
When\nexpressions are not evaluated in the order users intuit<span class=\"em\">&mdash;</span>possibly in different\norders across different implementations!<span class=\"em\">&mdash;</span>it can be a burning hellscape of\npain to figure out what&rsquo;s going on.</p>\n</aside>\n<p>Here is the syntax tree for the <code>print</code> statement:</p>\n<p><img src=\"image/a-virtual-machine/ast.png\" alt=\"The AST for the example\nstatement, with numbers marking the order that the nodes are evaluated.\" /></p>\n<p>Given left-to-right evaluation, and the way the expressions are nested, any\ncorrect Lox implementation <em>must</em> print these numbers in this order:</p>\n<div class=\"codehilite\"><pre>1  // from echo(1)\n2  // from echo(2)\n3  // from echo(1 + 2)\n4  // from echo(4)\n5  // from echo(5)\n9  // from echo(4 + 5)\n12 // from print 3 + 9\n</pre></div>\n<p>Our old jlox interpreter accomplishes this by recursively traversing the AST. It\ndoes a postorder traversal. First it recurses down the left operand branch,\nthen the right operand, then finally it evaluates the node itself.</p>\n<p>After evaluating the left operand, jlox needs to store that result somewhere\ntemporarily while it&rsquo;s busy traversing down through the right operand tree. We\nuse a local variable in Java for that. Our recursive tree-walk interpreter\ncreates a unique Java call frame for each node being evaluated, so we could have\nas many of these local variables as we needed.</p>\n<p>In clox, our <code>run()</code> function is not recursive<span class=\"em\">&mdash;</span>the nested expression tree is\nflattened out into a linear series of instructions. We don&rsquo;t have the luxury of\nusing C local variables, so how and where should we store these temporary\nvalues? 
You can probably <span name=\"guess\">guess</span> already, but I want to\nreally drill into this because it&rsquo;s an aspect of programming that we take for\ngranted, but we rarely learn <em>why</em> computers are architected this way.</p>\n<aside name=\"guess\">\n<p>Hint: it&rsquo;s in the name of this section, and it&rsquo;s how Java and C manage recursive\ncalls to functions.</p>\n</aside>\n<p>Let&rsquo;s do a weird exercise. We&rsquo;ll walk through the execution of the above program\na step at a time:</p>\n<p><img src=\"image/a-virtual-machine/bars.png\" alt=\"The series of instructions with\nbars showing which numbers need to be preserved across which instructions.\" /></p>\n<p>On the left are the steps of code. On the right are the values we&rsquo;re tracking.\nEach bar represents a number. It starts when the value is first produced<span class=\"em\">&mdash;</span>either a constant or the result of an addition. The length of the bar tracks\nwhen a previously produced value needs to be kept around, and it ends when that\nvalue finally gets consumed by an operation.</p>\n<p>As you step through, you see values appear and then later get eaten. The\nlongest-lived ones are the values produced from the left-hand side of an\naddition. Those stick around while we work through the right-hand operand\nexpression.</p>\n<p>In the above diagram, I gave each unique number its own visual column. Let&rsquo;s be\na little more parsimonious. Once a number is consumed, we allow its column to be\nreused for another later value. In other words, we take all of those gaps\nup there and fill them in, pushing in numbers from the right:</p>\n<p><img src=\"image/a-virtual-machine/bars-stacked.png\" alt=\"Like the previous\ndiagram, but with number bars pushed to the left, forming a stack.\" /></p>\n<p>There&rsquo;s some interesting stuff going on here. When we shift everything over,\neach number still manages to stay in a single column for its entire life. 
Also,\nthere are no gaps left. In other words, whenever a number appears earlier than\nanother, then it will live at least as long as that second one. The first number\nto appear is the last to be consumed. Hmm<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>last-in, first-out<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>why, that&rsquo;s a\n<span name=\"pancakes\">stack</span>!</p>\n<aside name=\"pancakes\">\n<p>This is also a stack:</p><img src=\"image/a-virtual-machine/pancakes.png\" alt=\"A stack... of pancakes.\" />\n</aside>\n<p>In the second diagram, each time we introduce a number, we push it onto the\nstack from the right. When numbers are consumed, they are always popped off from\nrightmost to left.</p>\n<p>Since the temporary values we need to track naturally have stack-like behavior,\nour VM will use a stack to manage them. When an instruction &ldquo;produces&rdquo; a value,\nit pushes it onto the stack. When it needs to consume one or more values, it\ngets them by popping them off the stack.</p>\n<h3><a href=\"#the-vms-stack\" id=\"the-vms-stack\"><small>15&#8202;.&#8202;2&#8202;.&#8202;1</small>The VM&rsquo;s Stack</a></h3>\n<p>Maybe this doesn&rsquo;t seem like a revelation, but I <em>love</em> stack-based VMs. When\nyou first see a magic trick, it feels like something actually magical. But then\nyou learn how it works<span class=\"em\">&mdash;</span>usually some mechanical gimmick or misdirection<span class=\"em\">&mdash;</span>and\nthe sense of wonder evaporates. There are a <span name=\"wonder\">couple</span> of\nideas in computer science where even after I pulled them apart and learned all\nthe ins and outs, some of the initial sparkle remained. 
Stack-based VMs are one\nof those.</p>\n<aside name=\"wonder\">\n<p>Heaps<span class=\"em\">&mdash;</span><a href=\"https://en.wikipedia.org/wiki/Heap_(data_structure)\">the data structure</a>, not <a href=\"https://en.wikipedia.org/wiki/Memory_management#HEAP\">the memory management thing</a><span class=\"em\">&mdash;</span>are another. And Vaughan Pratt&rsquo;s top-down operator precedence parsing scheme,\nwhich we&rsquo;ll learn about <a href=\"compiling-expressions.html\">in due time</a>.</p>\n</aside>\n<p>As you&rsquo;ll see in this chapter, executing instructions in a stack-based VM is\ndead <span name=\"cheat\">simple</span>. In later chapters, you&rsquo;ll also discover\nthat compiling a source language to a stack-based instruction set is a piece of\ncake. And yet, this architecture is fast enough to be used by production\nlanguage implementations. It almost feels like cheating at the programming\nlanguage game.</p>\n<aside name=\"cheat\">\n<p>To take a bit of the sheen off: stack-based interpreters aren&rsquo;t a silver bullet.\nThey&rsquo;re often <em>adequate</em>, but modern implementations of the JVM, the CLR, and\nJavaScript all use sophisticated <a href=\"https://en.wikipedia.org/wiki/Just-in-time_compilation\">just-in-time compilation</a> pipelines to\ngenerate <em>much</em> faster native code on the fly.</p>\n</aside>\n<p>Alrighty, it&rsquo;s codin&rsquo; time! 
Here&rsquo;s the stack:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef struct {\n  Chunk* chunk;\n  uint8_t* ip;\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nin struct <em>VM</em></div>\n<pre class=\"insert\">  <span class=\"t\">Value</span> <span class=\"i\">stack</span>[<span class=\"a\">STACK_MAX</span>];\n  <span class=\"t\">Value</span>* <span class=\"i\">stackTop</span>;\n</pre><pre class=\"insert-after\">} VM;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, in struct <em>VM</em></div>\n\n<p>We implement the stack semantics ourselves on top of a raw C array. The bottom\nof the stack<span class=\"em\">&mdash;</span>the first value pushed and the last to be popped<span class=\"em\">&mdash;</span>is at\nelement zero in the array, and later pushed values follow it. If we push the\nletters of &ldquo;crepe&rdquo;<span class=\"em\">&mdash;</span>my favorite stackable breakfast item<span class=\"em\">&mdash;</span>onto the stack, in\norder, the resulting C array looks like this:</p>\n<p><img src=\"image/a-virtual-machine/array.png\" alt=\"An array containing the\nletters in 'crepe' in order starting at element 0.\" /></p>\n<p>Since the stack grows and shrinks as values are pushed and popped, we need to\ntrack where the top of the stack is in the array. As with <code>ip</code>, we use a direct\npointer instead of an integer index since it&rsquo;s faster to dereference the pointer\nthan calculate the offset from the index each time we need it.</p>\n<p>The pointer points at the array element just <em>past</em> the element containing the\ntop value on the stack. That seems a little odd, but almost every implementation\ndoes this. 
It means we can indicate that the stack is empty by pointing at\nelement zero in the array.</p>\n<p><img src=\"image/a-virtual-machine/stack-empty.png\" alt=\"An empty array with\nstackTop pointing at the first element.\" /></p>\n<p>If we pointed to the top element, then for an empty stack we&rsquo;d need to point at\nelement -1. That&rsquo;s <span name=\"defined\">undefined</span> in C. As we push values\nonto the stack<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<aside name=\"defined\">\n<p>What about when the stack is <em>full</em>, you ask, Clever Reader? The C standard is\none step ahead of you. It <em>is</em> allowed and well-specified to have an array\npointer that points just past the end of an array.</p>\n</aside>\n<p><img src=\"image/a-virtual-machine/stack-c.png\" alt=\"An array with 'c' at element\nzero.\" /></p>\n<p><span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span><code>stackTop</code> always points just past the last item.</p>\n<p><img src=\"image/a-virtual-machine/stack-crepe.png\" alt=\"An array with 'c', 'r',\n'e', 'p', and 'e' in the first five elements.\" /></p>\n<p>I remember it like this: <code>stackTop</code> points to where the next value to be pushed\nwill go. The maximum number of values we can store on the stack (for now, at\nleast) is:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;chunk.h&quot;\n</pre><div class=\"source-file\"><em>vm.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define STACK_MAX 256</span>\n</pre><pre class=\"insert-after\">\n\ntypedef struct {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em></div>\n\n<p>Giving our VM a fixed stack size means it&rsquo;s possible for some sequence of\ninstructions to push too many values and run out of stack space<span class=\"em\">&mdash;</span>the classic\n&ldquo;stack overflow&rdquo;. We could grow the stack dynamically as needed, but for now\nwe&rsquo;ll keep it simple. 
Since VM uses Value, we need to include its declaration.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;chunk.h&quot;\n</pre><div class=\"source-file\"><em>vm.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;value.h&quot;</span>\n</pre><pre class=\"insert-after\">\n\n#define STACK_MAX 256\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em></div>\n\n<p>Now that VM has some interesting state, we get to initialize it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void initVM() {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>initVM</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">resetStack</span>();\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>initVM</em>()</div>\n\n<p>That uses this helper function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after variable <em>vm</em></div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">resetStack</span>() {\n  <span class=\"i\">vm</span>.<span class=\"i\">stackTop</span> = <span class=\"i\">vm</span>.<span class=\"i\">stack</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after variable <em>vm</em></div>\n\n<p>Since the stack array is declared directly inline in the VM struct, we don&rsquo;t\nneed to allocate it. We don&rsquo;t even need to clear the unused cells in the\narray<span class=\"em\">&mdash;</span>we simply won&rsquo;t access them until after values have been stored in\nthem. 
The only initialization we need is to set <code>stackTop</code> to point to the\nbeginning of the array to indicate that the stack is empty.</p>\n<p>The stack protocol supports two operations:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">InterpretResult interpret(Chunk* chunk);\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nadd after <em>interpret</em>()</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">push</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>);\n<span class=\"t\">Value</span> <span class=\"i\">pop</span>();\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, add after <em>interpret</em>()</div>\n\n<p>You can push a new value onto the top of the stack, and you can pop the most\nrecently pushed value back off. Here&rsquo;s the first function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>freeVM</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">push</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>) {\n  *<span class=\"i\">vm</span>.<span class=\"i\">stackTop</span> = <span class=\"i\">value</span>;\n  <span class=\"i\">vm</span>.<span class=\"i\">stackTop</span>++;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>freeVM</em>()</div>\n\n<p>If you&rsquo;re rusty on your C pointer syntax and operations, this is a good warm-up.\nThe first line stores <code>value</code> in the array element at the top of the stack.\nRemember, <code>stackTop</code> points just <em>past</em> the last used element, at the next\navailable one. This stores the value in that slot. 
Then we increment the pointer\nitself to point to the next unused slot in the array now that the previous slot\nis occupied.</p>\n<p>Popping is the mirror image.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>push</em>()</div>\n<pre><span class=\"t\">Value</span> <span class=\"i\">pop</span>() {\n  <span class=\"i\">vm</span>.<span class=\"i\">stackTop</span>--;\n  <span class=\"k\">return</span> *<span class=\"i\">vm</span>.<span class=\"i\">stackTop</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>push</em>()</div>\n\n<p>First, we move the stack pointer <em>back</em> to get to the most recently used slot in\nthe array. Then we look up the value in that slot and return it. We don&rsquo;t need\nto explicitly &ldquo;remove&rdquo; it from the array<span class=\"em\">&mdash;</span>moving <code>stackTop</code> down is enough to\nmark that slot as no longer in use.</p>\n<h3><a href=\"#stack-tracing\" id=\"stack-tracing\"><small>15&#8202;.&#8202;2&#8202;.&#8202;2</small>Stack tracing</a></h3>\n<p>We have a working stack, but it&rsquo;s hard to <em>see</em> that it&rsquo;s working. When we start\nimplementing more complex instructions and compiling and running larger pieces\nof code, we&rsquo;ll end up with a lot of values crammed into that array. 
It would\nmake our lives as VM hackers easier if we had some visibility into the stack.</p>\n<p>To that end, whenever we&rsquo;re tracing execution, we&rsquo;ll also show the current\ncontents of the stack before we interpret each instruction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#ifdef DEBUG_TRACE_EXECUTION\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">printf</span>(<span class=\"s\">&quot;          &quot;</span>);\n    <span class=\"k\">for</span> (<span class=\"t\">Value</span>* <span class=\"i\">slot</span> = <span class=\"i\">vm</span>.<span class=\"i\">stack</span>; <span class=\"i\">slot</span> &lt; <span class=\"i\">vm</span>.<span class=\"i\">stackTop</span>; <span class=\"i\">slot</span>++) {\n      <span class=\"i\">printf</span>(<span class=\"s\">&quot;[ &quot;</span>);\n      <span class=\"i\">printValue</span>(*<span class=\"i\">slot</span>);\n      <span class=\"i\">printf</span>(<span class=\"s\">&quot; ]&quot;</span>);\n    }\n    <span class=\"i\">printf</span>(<span class=\"s\">&quot;</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n</pre><pre class=\"insert-after\">    disassembleInstruction(vm.chunk,\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>We loop, printing each value in the array, starting at the first (bottom of the\nstack) and ending when we reach the top. This lets us observe the effect of each\ninstruction on the stack. The output is pretty verbose, but it&rsquo;s useful when\nwe&rsquo;re surgically extracting a nasty bug from the bowels of the interpreter.</p>\n<p>Stack in hand, let&rsquo;s revisit our two instructions. 
First up:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_CONSTANT: {\n        Value constant = READ_CONSTANT();\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">        <span class=\"i\">push</span>(<span class=\"i\">constant</span>);\n</pre><pre class=\"insert-after\">        break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 2 lines</div>\n\n<p>In the last chapter, I was hand-wavy about how the <code>OP_CONSTANT</code> instruction\n&ldquo;loads&rdquo; a constant. Now that we have a stack, you know what it means to actually\nproduce a value: it gets pushed onto the stack.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_RETURN: {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">        <span class=\"i\">printValue</span>(<span class=\"i\">pop</span>());\n        <span class=\"i\">printf</span>(<span class=\"s\">&quot;</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n</pre><pre class=\"insert-after\">        return INTERPRET_OK;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>Then we make <code>OP_RETURN</code> pop the stack and print the top value before exiting.\nWhen we add support for real functions to clox, we&rsquo;ll change this code. But, for\nnow, it gives us a way to get the VM executing simple instruction sequences and\ndisplaying the result.</p>\n<h2><a href=\"#an-arithmetic-calculator\" id=\"an-arithmetic-calculator\"><small>15&#8202;.&#8202;3</small>An Arithmetic Calculator</a></h2>\n<p>The heart and soul of our VM are in place now. The bytecode loop dispatches and\nexecutes instructions. 
The stack grows and shrinks as values flow through it.\nThe two halves work, but it&rsquo;s hard to get a feel for how cleverly they interact\nwith only the two rudimentary instructions we have so far. So let&rsquo;s teach our\ninterpreter to do arithmetic.</p>\n<p>We&rsquo;ll start with the simplest arithmetic operation, unary negation.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">1.2</span>;\n<span class=\"k\">print</span> -<span class=\"i\">a</span>; <span class=\"c\">// -1.2.</span>\n</pre></div>\n<p>The prefix <code>-</code> operator takes one operand, the value to negate. It produces a\nsingle result. We aren&rsquo;t fussing with a parser yet, but we can add the\nbytecode instruction that the above syntax will compile to.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_CONSTANT,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_NEGATE</span>,\n</pre><pre class=\"insert-after\">  OP_RETURN,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>We execute it like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_NEGATE</span>:   <span class=\"i\">push</span>(-<span class=\"i\">pop</span>()); <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      case OP_RETURN: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>The instruction needs a value to operate on, which it gets by popping from the\nstack. It negates that, then pushes the result back on for later instructions to\nuse. Doesn&rsquo;t get much easier than that. 
We can disassemble it too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case OP_CONSTANT:\n      return constantInstruction(&quot;OP_CONSTANT&quot;, chunk, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_NEGATE</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_NEGATE&quot;</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_RETURN:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>And we can try it out in our test chunk.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  writeChunk(&amp;chunk, constant, 123);\n</pre><div class=\"source-file\"><em>main.c</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">writeChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"a\">OP_NEGATE</span>, <span class=\"n\">123</span>);\n</pre><pre class=\"insert-after\">\n\n  writeChunk(&amp;chunk, OP_RETURN, 123);\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, in <em>main</em>()</div>\n\n<p>After loading the constant, but before returning, we execute the negate\ninstruction. That replaces the constant on the stack with its negation. Then the\nreturn instruction prints that out:</p>\n<div class=\"codehilite\"><pre>-1.2\n</pre></div>\n<p>Magical!</p>\n<h3><a href=\"#binary-operators\" id=\"binary-operators\"><small>15&#8202;.&#8202;3&#8202;.&#8202;1</small>Binary operators</a></h3>\n<p>OK, unary operators aren&rsquo;t <em>that</em> impressive. We still only ever have a single\nvalue on the stack. To really see some depth, we need binary operators. Lox has\nfour binary <span name=\"ops\">arithmetic</span> operators: addition, subtraction,\nmultiplication, and division. 
We&rsquo;ll go ahead and implement them all at the same\ntime.</p>\n<aside name=\"ops\">\n<p>Lox has some other binary operators<span class=\"em\">&mdash;</span>comparison and equality<span class=\"em\">&mdash;</span>but those\ndon&rsquo;t produce numbers as a result, so we aren&rsquo;t ready for them yet.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_CONSTANT,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_ADD</span>,\n  <span class=\"a\">OP_SUBTRACT</span>,\n  <span class=\"a\">OP_MULTIPLY</span>,\n  <span class=\"a\">OP_DIVIDE</span>,\n</pre><pre class=\"insert-after\">  OP_NEGATE,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>Back in the bytecode loop, they are executed like this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_ADD</span>:      <span class=\"a\">BINARY_OP</span>(+); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"a\">OP_SUBTRACT</span>: <span class=\"a\">BINARY_OP</span>(-); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"a\">OP_MULTIPLY</span>: <span class=\"a\">BINARY_OP</span>(*); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"a\">OP_DIVIDE</span>:   <span class=\"a\">BINARY_OP</span>(/); <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      case OP_NEGATE:   push(-pop()); break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>The only difference between these four instructions is which underlying C\noperator they ultimately use to combine the two operands. 
Surrounding that core\narithmetic expression is some boilerplate code to pull values off the stack and\npush the result. When we later add dynamic typing, that boilerplate will grow.\nTo avoid repeating that code four times, I wrapped it up in a macro.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define READ_CONSTANT() (vm.chunk-&gt;constants.values[READ_BYTE()])\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#define BINARY_OP(op) \\</span>\n<span class=\"a\">    do { \\</span>\n<span class=\"a\">      double b = pop(); \\</span>\n<span class=\"a\">      double a = pop(); \\</span>\n<span class=\"a\">      push(a op b); \\</span>\n<span class=\"a\">    } while (false)</span>\n</pre><pre class=\"insert-after\">\n\n  for (;;) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>I admit this is a fairly <span name=\"operator\">adventurous</span> use of the C\npreprocessor. I hesitated to do this, but you&rsquo;ll be glad in later chapters when\nwe need to add the type checking for each operand and stuff. It would be a chore\nto walk you through the same code four times.</p>\n<aside name=\"operator\">\n<p>Did you even know you can pass an <em>operator</em> as an argument to a macro? Now you\ndo. The preprocessor doesn&rsquo;t care that operators aren&rsquo;t first class in C. As far\nas it&rsquo;s concerned, it&rsquo;s all just text tokens.</p>\n<p>I know, you can just <em>feel</em> the temptation to abuse this, can&rsquo;t you?</p>\n</aside>\n<p>If you aren&rsquo;t familiar with the trick already, that outer <code>do while</code> loop\nprobably looks really weird. This macro needs to expand to a series of\nstatements. To be careful macro authors, we want to ensure those statements all\nend up in the same scope when the macro is expanded. 
Imagine if you defined:</p>\n<div class=\"codehilite\"><pre><span class=\"a\">#define WAKE_UP() makeCoffee(); drinkCoffee();</span>\n</pre></div>\n<p>And then used it like:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">if</span> (<span class=\"i\">morning</span>) <span class=\"a\">WAKE_UP</span>();\n</pre></div>\n<p>The intent is to execute both statements of the macro body only if <code>morning</code> is\ntrue. But it expands to:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">if</span> (<span class=\"i\">morning</span>) <span class=\"i\">makeCoffee</span>(); <span class=\"i\">drinkCoffee</span>();;\n</pre></div>\n<p>Oops. The <code>if</code> attaches only to the <em>first</em> statement. You might think you could\nfix this using a block.</p>\n<div class=\"codehilite\"><pre><span class=\"a\">#define WAKE_UP() { makeCoffee(); drinkCoffee(); }</span>\n</pre></div>\n<p>That&rsquo;s better, but you still risk:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">if</span> (<span class=\"i\">morning</span>)\n  <span class=\"a\">WAKE_UP</span>();\n<span class=\"k\">else</span>\n  <span class=\"i\">sleepIn</span>();\n</pre></div>\n<p>Now you get a compile error on the <code>else</code> because of that trailing <code>;</code> after the\nmacro&rsquo;s block. Using a <code>do while</code> loop in the macro looks funny, but it gives\nyou a way to contain multiple statements inside a block that <em>also</em> permits a\nsemicolon at the end.</p>\n<p>Where were we? Right, so what the body of that macro does is straightforward. A\nbinary operator takes two operands, so it pops twice. It performs the operation\non those two values and then pushes the result.</p>\n<p>Pay close attention to the <em>order</em> of the two pops. Note that we assign the\nfirst popped operand to <code>b</code>, not <code>a</code>. It looks backwards. When the operands\nthemselves are calculated, the left is evaluated first, then the right. 
That\nmeans the left operand gets pushed before the right operand. So the right\noperand will be on top of the stack. Thus, the first value we pop is <code>b</code>.</p>\n<p>For example, if we compile <code>3 - 1</code>, the data flow between the instructions looks\nlike so:</p>\n<p><img src=\"image/a-virtual-machine/reverse.png\" alt=\"A sequence of instructions\nwith the stack for each showing how pushing and then popping values reverses\ntheir order.\" /></p>\n<p>As we did with the other macros inside <code>run()</code>, we clean up after ourselves at\nthe end of the function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#undef READ_CONSTANT\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#undef BINARY_OP</span>\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>Last is disassembler support.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case OP_CONSTANT:\n      return constantInstruction(&quot;OP_CONSTANT&quot;, chunk, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_ADD</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_ADD&quot;</span>, <span class=\"i\">offset</span>);\n    <span class=\"k\">case</span> <span class=\"a\">OP_SUBTRACT</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_SUBTRACT&quot;</span>, <span class=\"i\">offset</span>);\n    <span class=\"k\">case</span> <span class=\"a\">OP_MULTIPLY</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_MULTIPLY&quot;</span>, <span class=\"i\">offset</span>);\n    <span class=\"k\">case</span> <span 
class=\"a\">OP_DIVIDE</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_DIVIDE&quot;</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_NEGATE:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>The arithmetic instruction formats are simple, like <code>OP_RETURN</code>. Even though the\narithmetic <em>operators</em> take operands<span class=\"em\">&mdash;</span>which are found on the stack<span class=\"em\">&mdash;</span>the\narithmetic <em>bytecode instructions</em> do not.</p>\n<p>Let&rsquo;s put some of our new instructions through their paces by evaluating a\nlarger expression:</p>\n<p><img src=\"image/a-virtual-machine/chunk.png\" alt=\"The expression being\nevaluated: -((1.2 + 3.4) / 5.6)\" /></p>\n<p>Building on our existing example chunk, here&rsquo;s the additional instructions we\nneed to hand-compile that AST to bytecode.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  int constant = addConstant(&amp;chunk, 1.2);\n  writeChunk(&amp;chunk, OP_CONSTANT, 123);\n  writeChunk(&amp;chunk, constant, 123);\n</pre><div class=\"source-file\"><em>main.c</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">constant</span> = <span class=\"i\">addConstant</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"n\">3.4</span>);\n  <span class=\"i\">writeChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"a\">OP_CONSTANT</span>, <span class=\"n\">123</span>);\n  <span class=\"i\">writeChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"i\">constant</span>, <span class=\"n\">123</span>);\n\n  <span class=\"i\">writeChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"a\">OP_ADD</span>, <span class=\"n\">123</span>);\n\n  <span class=\"i\">constant</span> = <span class=\"i\">addConstant</span>(&amp;<span 
class=\"i\">chunk</span>, <span class=\"n\">5.6</span>);\n  <span class=\"i\">writeChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"a\">OP_CONSTANT</span>, <span class=\"n\">123</span>);\n  <span class=\"i\">writeChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"i\">constant</span>, <span class=\"n\">123</span>);\n\n  <span class=\"i\">writeChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"a\">OP_DIVIDE</span>, <span class=\"n\">123</span>);\n</pre><pre class=\"insert-after\">  writeChunk(&amp;chunk, OP_NEGATE, 123);\n\n  writeChunk(&amp;chunk, OP_RETURN, 123);\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, in <em>main</em>()</div>\n\n<p>The addition goes first. The instruction for the left constant, 1.2, is already\nthere, so we add another for 3.4. Then we add those two using <code>OP_ADD</code>, leaving\nit on the stack. That covers the left side of the division. Next we push the\n5.6, and divide the result of the addition by it. Finally, we negate the result\nof that.</p>\n<p>Note how the output of the <code>OP_ADD</code> implicitly flows into being an operand of\n<code>OP_DIVIDE</code> without either instruction being directly coupled to each other.\nThat&rsquo;s the magic of the stack. It lets us freely compose instructions without\nthem needing any complexity or awareness of the data flow. The stack acts like a\nshared workspace that they all read from and write to.</p>\n<p>In this tiny example chunk, the stack still only gets two values tall, but when\nwe start compiling Lox source to bytecode, we&rsquo;ll have chunks that use much more\nof the stack. In the meantime, try playing around with this hand-authored chunk\nto calculate different nested arithmetic expressions and see how values flow\nthrough the instructions and stack.</p>\n<p>You may as well get it out of your system now. This is the last chunk we&rsquo;ll\nbuild by hand. 
When we next revisit bytecode, we will be writing a compiler to\ngenerate it for us.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>What bytecode instruction sequences would you generate for the following\nexpressions:</p>\n<div class=\"codehilite\"><pre><span class=\"n\">1</span> * <span class=\"n\">2</span> + <span class=\"n\">3</span>\n<span class=\"n\">1</span> + <span class=\"n\">2</span> * <span class=\"n\">3</span>\n<span class=\"n\">3</span> - <span class=\"n\">2</span> - <span class=\"n\">1</span>\n<span class=\"n\">1</span> + <span class=\"n\">2</span> * <span class=\"n\">3</span> - <span class=\"n\">4</span> / -<span class=\"n\">5</span>\n</pre></div>\n<p>(Remember that Lox does not have a syntax for negative number literals, so\nthe <code>-5</code> is negating the number 5.)</p>\n</li>\n<li>\n<p>If we really wanted a minimal instruction set, we could eliminate either\n<code>OP_NEGATE</code> or <code>OP_SUBTRACT</code>. Show the bytecode instruction sequence you\nwould generate for:</p>\n<div class=\"codehilite\"><pre><span class=\"n\">4</span> - <span class=\"n\">3</span> * -<span class=\"n\">2</span>\n</pre></div>\n<p>First, without using <code>OP_NEGATE</code>. Then, without using <code>OP_SUBTRACT</code>.</p>\n<p>Given the above, do you think it makes sense to have both instructions? Why\nor why not? Are there any other redundant instructions you would consider\nincluding?</p>\n</li>\n<li>\n<p>Our VM&rsquo;s stack has a fixed size, and we don&rsquo;t check if pushing a value\noverflows it. This means the wrong series of instructions could cause our\ninterpreter to crash or go into undefined behavior. Avoid that by\ndynamically growing the stack as needed.</p>\n<p>What are the costs and benefits of doing so?</p>\n</li>\n<li>\n<p>To interpret <code>OP_NEGATE</code>, we pop the operand, negate the value, and then\npush the result. 
That&rsquo;s a simple implementation, but it increments and\ndecrements <code>stackTop</code> unnecessarily, since the stack ends up the same height\nin the end. It might be faster to simply negate the value in place on the\nstack and leave <code>stackTop</code> alone. Try that and see if you can measure a\nperformance difference.</p>\n<p>Are there other instructions where you can do a similar optimization?</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: Register-Based Bytecode</a></h2>\n<p>For the remainder of this book, we&rsquo;ll meticulously implement an interpreter\naround a stack-based bytecode instruction set. There&rsquo;s another family of\nbytecode architectures out there<span class=\"em\">&mdash;</span><em>register-based</em>. Despite the name, these\nbytecode instructions aren&rsquo;t quite as difficult to work with as the registers in\nan actual chip like <span name=\"x64\">x64</span>. With real hardware registers,\nyou usually have only a handful for the entire program, so you spend a lot of\neffort <a href=\"https://en.wikipedia.org/wiki/Register_allocation\">trying to use them efficiently and shuttling stuff in and out of\nthem</a>.</p>\n<aside name=\"x64\">\n<p>Register-based bytecode is a little closer to the <a href=\"https://en.wikipedia.org/wiki/Register_window\"><em>register windows</em></a>\nsupported by SPARC chips.</p>\n</aside>\n<p>In a register-based VM, you still have a stack. Temporary values still get\npushed onto it and popped when no longer needed. 
The main difference is that\ninstructions can read their inputs from anywhere in the stack and can store\ntheir outputs into specific stack slots.</p>\n<p>Take this little Lox script:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">1</span>;\n<span class=\"k\">var</span> <span class=\"i\">b</span> = <span class=\"n\">2</span>;\n<span class=\"k\">var</span> <span class=\"i\">c</span> = <span class=\"i\">a</span> + <span class=\"i\">b</span>;\n</pre></div>\n<p>In our stack-based VM, the last statement will get compiled to something like:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">load</span> &lt;<span class=\"i\">a</span>&gt;  <span class=\"c\">// Read local variable a and push onto stack.</span>\n<span class=\"i\">load</span> &lt;<span class=\"i\">b</span>&gt;  <span class=\"c\">// Read local variable b and push onto stack.</span>\n<span class=\"i\">add</span>       <span class=\"c\">// Pop two values, add, push result.</span>\n<span class=\"i\">store</span> &lt;<span class=\"i\">c</span>&gt; <span class=\"c\">// Pop value and store in local variable c.</span>\n</pre></div>\n<p>(Don&rsquo;t worry if you don&rsquo;t fully understand the load and store instructions yet.\nWe&rsquo;ll go over them in much greater detail <a href=\"global-variables.html\">when we implement\nvariables</a>.) We have four separate instructions. That means four\ntimes through the bytecode interpret loop, four instructions to decode and\ndispatch. It&rsquo;s at least seven bytes of code<span class=\"em\">&mdash;</span>four for the opcodes and another\nthree for the operands identifying which locals to load and store. Three pushes\nand three pops. A lot of work!</p>\n<p>In a register-based instruction set, instructions can read from and store\ndirectly into local variables. 
The bytecode for the last statement above looks\nlike:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">add</span> &lt;<span class=\"i\">a</span>&gt; &lt;<span class=\"i\">b</span>&gt; &lt;<span class=\"i\">c</span>&gt; <span class=\"c\">// Read values from a and b, add, store in c.</span>\n</pre></div>\n<p>The add instruction is bigger<span class=\"em\">&mdash;</span>it has three instruction operands that define\nwhere in the stack it reads its inputs from and writes the result to. But since\nlocal variables live on the stack, it can read directly from <code>a</code> and <code>b</code> and\nthen store the result right into <code>c</code>.</p>\n<p>There&rsquo;s only a single instruction to decode and dispatch, and the whole thing\nfits in four bytes. Decoding is more complex because of the additional operands,\nbut it&rsquo;s still a net win. There&rsquo;s no pushing and popping or other stack\nmanipulation.</p>\n<p>The main implementation of Lua used to be stack-based. For <span name=\"lua\">Lua\n5.0</span>, the implementers switched to a register instruction set and noted a\nspeed improvement. The amount of improvement, naturally, depends heavily on the\ndetails of the language semantics, specific instruction set, and compiler\nsophistication, but that should get your attention.</p>\n<aside name=\"lua\">\n<p>The Lua dev team<span class=\"em\">&mdash;</span>Roberto Ierusalimschy, Waldemar Celes, and Luiz Henrique de\nFigueiredo<span class=\"em\">&mdash;</span>wrote a <em>fantastic</em> paper on this, one of my all time favorite\ncomputer science papers, &ldquo;<a href=\"https://www.lua.org/doc/jucs05.pdf\">The Implementation of Lua 5.0</a>&rdquo; (PDF).</p>\n</aside>\n<p>That raises the obvious question of why I&rsquo;m going to spend the rest of the book\ndoing a stack-based bytecode. Register VMs are neat, but they are quite a bit\nharder to write a compiler for. 
For what is likely to be your very first\ncompiler, I wanted to stick with an instruction set that&rsquo;s easy to generate and\neasy to execute. Stack-based bytecode is marvelously simple.</p>\n<p>It&rsquo;s also <em>much</em> better known in the literature and the community. Even though\nyou may eventually move to something more advanced, it&rsquo;s a good common ground to\nshare with the rest of your language hacker peers.</p>\n</div>\n\n<footer>\n<a href=\"scanning-on-demand.html\" class=\"next\">\n  Next Chapter: &ldquo;Scanning on Demand&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/acknowledgements.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Acknowledgements &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h2><small></small>Acknowledgements</h2>\n<hr>\n\n<div class=\"prev-next\">\n    <a href=\"dedication.html\" title=\"Dedication\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"contents.html\" title=\"Table of Contents\">&uarr;&nbsp;Up</a>\n    <a href=\"contents.html\" title=\"Table of Contents\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"dedication.html\" title=\"Dedication\" 
class=\"prev\">←</a>\n<a href=\"contents.html\" title=\"Table of Contents\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h2><small></small>Acknowledgements</h2>\n<hr>\n\n<div class=\"prev-next\">\n    <a href=\"dedication.html\" title=\"Dedication\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"contents.html\" title=\"Table of Contents\">&uarr;&nbsp;Up</a>\n    <a href=\"contents.html\" title=\"Table of Contents\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <h1 class=\"part\">Acknowledgements</h1>\n\n<p>When the first copy of &ldquo;<a href=\"https://gameprogrammingpatterns.com/\">Game Programming Patterns</a>&rdquo; sold, I guess I had\nthe right to call myself an author. But it took time to feel comfortable with\nthat label. Thank you to everyone who bought copies of my first book, and to the\npublishers and translators who brought it to other languages. You gave me the\nconfidence to believe I could tackle a project of this scope. Well, that, and\nmassively underestimating what I was getting myself into, but that&rsquo;s on me.</p>\n<p>A fear particular to technical writing is <em>getting stuff wrong</em>. Tests and\nstatic analysis only get you so far. Once the code and prose is in ink on paper,\nthere&rsquo;s no fixing it. I am deeply grateful to the many people who filed issues\nand pull requests on the <a href=\"https://github.com/munificent/craftinginterpreters\">open source repo</a> for the book. Special thanks\ngo to cm1776, who filed 145 tactfully worded issues pointing out hundreds of\ncode errors, typos, and unclear sentences. 
The book is more accurate and\nreadable because of you all.</p>\n<p>I&rsquo;m grateful to my copy editor Kari Somerton who braved a heap of computer\nscience jargon and an unfamiliar workflow in order to fix my many grammar errors\nand stylistic inconsistencies.</p>\n<p>When the pandemic turned everyone&rsquo;s life upside down, a number of people reached\nout to tell me that my book provided a helpful distraction. This book that I\nspent six years writing forms a chapter in my own life&rsquo;s story and I&rsquo;m grateful\nto the readers who contacted me and made that chapter more meaningful.</p>\n<p>Finally, the deepest thanks go to my wife Megan and my daughters Lily and\nGretchen. You patiently endured the time I had to sink into the book, and my\nstress while writing it. There&rsquo;s no one I&rsquo;d rather be stuck at home with.</p>\n\n<footer>\n<a href=\"contents.html\" class=\"next\">\n  Next Part: &ldquo;Table of Contents&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/appendix-i.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Appendix I &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Appendix I<small>A1</small></a></h3>\n\n<ul>\n    <li><a href=\"#syntax-grammar\"><small>A1.1</small> Syntax Grammar</a></li>\n    <li><a href=\"#lexical-grammar\"><small>A1.2</small> Lexical Grammar</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"backmatter.html\" title=\"Backmatter\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"backmatter.html\" title=\"Backmatter\">&uarr;&nbsp;Up</a>\n    <a href=\"appendix-ii.html\" title=\"Appendix II\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  
</div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"backmatter.html\" title=\"Backmatter\" class=\"prev\">←</a>\n<a href=\"appendix-ii.html\" title=\"Appendix II\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Appendix I<small>A1</small></a></h3>\n\n<ul>\n    <li><a href=\"#syntax-grammar\"><small>A1.1</small> Syntax Grammar</a></li>\n    <li><a href=\"#lexical-grammar\"><small>A1.2</small> Lexical Grammar</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"backmatter.html\" title=\"Backmatter\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"backmatter.html\" title=\"Backmatter\">&uarr;&nbsp;Up</a>\n    <a href=\"appendix-ii.html\" title=\"Appendix II\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">A1</div>\n  <h1>Appendix I</h1>\n\n<p>Here is a complete grammar for Lox. The chapters that introduce each part of the\nlanguage include the grammar rules there, but this collects them all into one\nplace.</p>\n<h2><a href=\"#syntax-grammar\" id=\"syntax-grammar\"><small>A1&#8202;.&#8202;1</small>Syntax Grammar</a></h2>\n<p>The syntactic grammar is used to parse the linear sequence of tokens into the\nnested syntax tree structure. 
It starts with the first rule that matches an\nentire Lox program (or a single REPL entry).</p>\n<div class=\"codehilite\"><pre><span class=\"i\">program</span>        → <span class=\"i\">declaration</span>* <span class=\"t\">EOF</span> ;\n</pre></div>\n<h3><a href=\"#declarations\" id=\"declarations\"><small>A1&#8202;.&#8202;1&#8202;.&#8202;1</small>Declarations</a></h3>\n<p>A program is a series of declarations, which are the statements that bind new\nidentifiers or any of the other statement types.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">declaration</span>    → <span class=\"i\">classDecl</span>\n               | <span class=\"i\">funDecl</span>\n               | <span class=\"i\">varDecl</span>\n               | <span class=\"i\">statement</span> ;\n\n<span class=\"i\">classDecl</span>      → <span class=\"s\">&quot;class&quot;</span> <span class=\"t\">IDENTIFIER</span> ( <span class=\"s\">&quot;&lt;&quot;</span> <span class=\"t\">IDENTIFIER</span> )?\n                 <span class=\"s\">&quot;{&quot;</span> <span class=\"i\">function</span>* <span class=\"s\">&quot;}&quot;</span> ;\n<span class=\"i\">funDecl</span>        → <span class=\"s\">&quot;fun&quot;</span> <span class=\"i\">function</span> ;\n<span class=\"i\">varDecl</span>        → <span class=\"s\">&quot;var&quot;</span> <span class=\"t\">IDENTIFIER</span> ( <span class=\"s\">&quot;=&quot;</span> <span class=\"i\">expression</span> )? 
<span class=\"s\">&quot;;&quot;</span> ;\n</pre></div>\n<h3><a href=\"#statements\" id=\"statements\"><small>A1&#8202;.&#8202;1&#8202;.&#8202;2</small>Statements</a></h3>\n<p>The remaining statement rules produce side effects, but do not introduce\nbindings.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">statement</span>      → <span class=\"i\">exprStmt</span>\n               | <span class=\"i\">forStmt</span>\n               | <span class=\"i\">ifStmt</span>\n               | <span class=\"i\">printStmt</span>\n               | <span class=\"i\">returnStmt</span>\n               | <span class=\"i\">whileStmt</span>\n               | <span class=\"i\">block</span> ;\n\n<span class=\"i\">exprStmt</span>       → <span class=\"i\">expression</span> <span class=\"s\">&quot;;&quot;</span> ;\n<span class=\"i\">forStmt</span>        → <span class=\"s\">&quot;for&quot;</span> <span class=\"s\">&quot;(&quot;</span> ( <span class=\"i\">varDecl</span> | <span class=\"i\">exprStmt</span> | <span class=\"s\">&quot;;&quot;</span> )\n                           <span class=\"i\">expression</span>? <span class=\"s\">&quot;;&quot;</span>\n                           <span class=\"i\">expression</span>? <span class=\"s\">&quot;)&quot;</span> <span class=\"i\">statement</span> ;\n<span class=\"i\">ifStmt</span>         → <span class=\"s\">&quot;if&quot;</span> <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;)&quot;</span> <span class=\"i\">statement</span>\n                 ( <span class=\"s\">&quot;else&quot;</span> <span class=\"i\">statement</span> )? ;\n<span class=\"i\">printStmt</span>      → <span class=\"s\">&quot;print&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;;&quot;</span> ;\n<span class=\"i\">returnStmt</span>     → <span class=\"s\">&quot;return&quot;</span> <span class=\"i\">expression</span>? 
<span class=\"s\">&quot;;&quot;</span> ;\n<span class=\"i\">whileStmt</span>      → <span class=\"s\">&quot;while&quot;</span> <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;)&quot;</span> <span class=\"i\">statement</span> ;\n<span class=\"i\">block</span>          → <span class=\"s\">&quot;{&quot;</span> <span class=\"i\">declaration</span>* <span class=\"s\">&quot;}&quot;</span> ;\n</pre></div>\n<p>Note that <code>block</code> is a statement rule, but is also used as a nonterminal in a\ncouple of other rules for things like function bodies.</p>\n<h3><a href=\"#expressions\" id=\"expressions\"><small>A1&#8202;.&#8202;1&#8202;.&#8202;3</small>Expressions</a></h3>\n<p>Expressions produce values. Lox has a number of unary and binary operators with\ndifferent levels of precedence. Some grammars for languages do not directly\nencode the precedence relationships and specify that elsewhere. Here, we use a\nseparate rule for each precedence level to make it explicit.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">expression</span>     → <span class=\"i\">assignment</span> ;\n\n<span class=\"i\">assignment</span>     → ( <span class=\"i\">call</span> <span class=\"s\">&quot;.&quot;</span> )? 
<span class=\"t\">IDENTIFIER</span> <span class=\"s\">&quot;=&quot;</span> <span class=\"i\">assignment</span>\n               | <span class=\"i\">logic_or</span> ;\n\n<span class=\"i\">logic_or</span>       → <span class=\"i\">logic_and</span> ( <span class=\"s\">&quot;or&quot;</span> <span class=\"i\">logic_and</span> )* ;\n<span class=\"i\">logic_and</span>      → <span class=\"i\">equality</span> ( <span class=\"s\">&quot;and&quot;</span> <span class=\"i\">equality</span> )* ;\n<span class=\"i\">equality</span>       → <span class=\"i\">comparison</span> ( ( <span class=\"s\">&quot;!=&quot;</span> | <span class=\"s\">&quot;==&quot;</span> ) <span class=\"i\">comparison</span> )* ;\n<span class=\"i\">comparison</span>     → <span class=\"i\">term</span> ( ( <span class=\"s\">&quot;&gt;&quot;</span> | <span class=\"s\">&quot;&gt;=&quot;</span> | <span class=\"s\">&quot;&lt;&quot;</span> | <span class=\"s\">&quot;&lt;=&quot;</span> ) <span class=\"i\">term</span> )* ;\n<span class=\"i\">term</span>           → <span class=\"i\">factor</span> ( ( <span class=\"s\">&quot;-&quot;</span> | <span class=\"s\">&quot;+&quot;</span> ) <span class=\"i\">factor</span> )* ;\n<span class=\"i\">factor</span>         → <span class=\"i\">unary</span> ( ( <span class=\"s\">&quot;/&quot;</span> | <span class=\"s\">&quot;*&quot;</span> ) <span class=\"i\">unary</span> )* ;\n\n<span class=\"i\">unary</span>          → ( <span class=\"s\">&quot;!&quot;</span> | <span class=\"s\">&quot;-&quot;</span> ) <span class=\"i\">unary</span> | <span class=\"i\">call</span> ;\n<span class=\"i\">call</span>           → <span class=\"i\">primary</span> ( <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">arguments</span>? 
<span class=\"s\">&quot;)&quot;</span> | <span class=\"s\">&quot;.&quot;</span> <span class=\"t\">IDENTIFIER</span> )* ;\n<span class=\"i\">primary</span>        → <span class=\"s\">&quot;true&quot;</span> | <span class=\"s\">&quot;false&quot;</span> | <span class=\"s\">&quot;nil&quot;</span> | <span class=\"s\">&quot;this&quot;</span>\n               | <span class=\"t\">NUMBER</span> | <span class=\"t\">STRING</span> | <span class=\"t\">IDENTIFIER</span> | <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;)&quot;</span>\n               | <span class=\"s\">&quot;super&quot;</span> <span class=\"s\">&quot;.&quot;</span> <span class=\"t\">IDENTIFIER</span> ;\n</pre></div>\n<h3><a href=\"#utility-rules\" id=\"utility-rules\"><small>A1&#8202;.&#8202;1&#8202;.&#8202;4</small>Utility rules</a></h3>\n<p>In order to keep the above rules a little cleaner, some of the grammar is\nsplit out into a few reused helper rules.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">function</span>       → <span class=\"t\">IDENTIFIER</span> <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">parameters</span>? 
<span class=\"s\">&quot;)&quot;</span> <span class=\"i\">block</span> ;\n<span class=\"i\">parameters</span>     → <span class=\"t\">IDENTIFIER</span> ( <span class=\"s\">&quot;,&quot;</span> <span class=\"t\">IDENTIFIER</span> )* ;\n<span class=\"i\">arguments</span>      → <span class=\"i\">expression</span> ( <span class=\"s\">&quot;,&quot;</span> <span class=\"i\">expression</span> )* ;\n</pre></div>\n<h2><a href=\"#lexical-grammar\" id=\"lexical-grammar\"><small>A1&#8202;.&#8202;2</small>Lexical Grammar</a></h2>\n<p>The lexical grammar is used by the scanner to group characters into tokens.\nWhere the syntax is <a href=\"https://en.wikipedia.org/wiki/Context-free_grammar\">context free</a>, the lexical grammar is <a href=\"https://en.wikipedia.org/wiki/Regular_grammar\">regular</a><span class=\"em\">&mdash;</span>note\nthat there are no recursive rules.</p>\n<div class=\"codehilite\"><pre><span class=\"t\">NUMBER</span>         → <span class=\"t\">DIGIT</span>+ ( <span class=\"s\">&quot;.&quot;</span> <span class=\"t\">DIGIT</span>+ )? ;\n<span class=\"t\">STRING</span>         → <span class=\"s\">&quot;</span><span class=\"e\">\\&quot;</span><span class=\"s\">&quot;</span> &lt;<span class=\"i\">any</span> <span class=\"i\">char</span> <span class=\"i\">except</span> <span class=\"s\">&quot;</span><span class=\"e\">\\&quot;</span><span class=\"s\">&quot;</span>&gt;* <span class=\"s\">&quot;</span><span class=\"e\">\\&quot;</span><span class=\"s\">&quot;</span> ;\n<span class=\"t\">IDENTIFIER</span>     → <span class=\"t\">ALPHA</span> ( <span class=\"t\">ALPHA</span> | <span class=\"t\">DIGIT</span> )* ;\n<span class=\"t\">ALPHA</span>          → <span class=\"s\">&quot;a&quot;</span> ... <span class=\"s\">&quot;z&quot;</span> | <span class=\"s\">&quot;A&quot;</span> ... <span class=\"s\">&quot;Z&quot;</span> | <span class=\"s\">&quot;_&quot;</span> ;\n<span class=\"t\">DIGIT</span>          → <span class=\"s\">&quot;0&quot;</span> ... 
<span class=\"s\">&quot;9&quot;</span> ;\n</pre></div>\n\n<footer>\n<a href=\"appendix-ii.html\" class=\"next\">\n  Next Chapter: &ldquo;Appendix II&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/appendix-ii.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Appendix II &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Appendix II<small>A2</small></a></h3>\n\n<ul>\n    <li><a href=\"#expressions\"><small>A2.1</small> Expressions</a></li>\n    <li><a href=\"#statements\"><small>A2.2</small> Statements</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"appendix-i.html\" title=\"Appendix I\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"backmatter.html\" title=\"Backmatter\">&uarr;&nbsp;Up</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting 
Interpreters\"></a>\n<a href=\"appendix-i.html\" title=\"Appendix I\" class=\"prev\">←</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Appendix II<small>A2</small></a></h3>\n\n<ul>\n    <li><a href=\"#expressions\"><small>A2.1</small> Expressions</a></li>\n    <li><a href=\"#statements\"><small>A2.2</small> Statements</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"appendix-i.html\" title=\"Appendix I\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"backmatter.html\" title=\"Backmatter\">&uarr;&nbsp;Up</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">A2</div>\n  <h1>Appendix II</h1>\n\n<p>For your edification, here is the code produced by <a href=\"representing-code.html#metaprogramming-the-trees\">the little script\nwe built</a> to automate generating the syntax tree classes for jlox.</p>\n<h2><a href=\"#expressions\" id=\"expressions\"><small>A2&#8202;.&#8202;1</small>Expressions</a></h2>\n<p>Expressions are the first syntax tree nodes we see, introduced in &ldquo;<a href=\"representing-code.html\">Representing\nCode</a>&rdquo;. 
The main Expr class defines the visitor\ninterface used to dispatch against the specific expression types, and contains\nthe other expression subclasses as nested classes.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Expr.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">import</span> <span class=\"i\">java.util.List</span>;\n\n<span class=\"k\">abstract</span> <span class=\"k\">class</span> <span class=\"t\">Expr</span> {\n  <span class=\"k\">interface</span> <span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; {\n    <span class=\"t\">R</span> <span class=\"i\">visitAssignExpr</span>(<span class=\"t\">Assign</span> <span class=\"i\">expr</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitBinaryExpr</span>(<span class=\"t\">Binary</span> <span class=\"i\">expr</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitCallExpr</span>(<span class=\"t\">Call</span> <span class=\"i\">expr</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitGetExpr</span>(<span class=\"t\">Get</span> <span class=\"i\">expr</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitGroupingExpr</span>(<span class=\"t\">Grouping</span> <span class=\"i\">expr</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitLiteralExpr</span>(<span class=\"t\">Literal</span> <span class=\"i\">expr</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitLogicalExpr</span>(<span class=\"t\">Logical</span> <span class=\"i\">expr</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitSetExpr</span>(<span class=\"t\">Set</span> <span class=\"i\">expr</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitSuperExpr</span>(<span class=\"t\">Super</span> <span class=\"i\">expr</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitThisExpr</span>(<span class=\"t\">This</span> <span 
class=\"i\">expr</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitUnaryExpr</span>(<span class=\"t\">Unary</span> <span class=\"i\">expr</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitVariableExpr</span>(<span class=\"t\">Variable</span> <span class=\"i\">expr</span>);\n  }\n\n  <span class=\"c\">// Nested Expr classes here...</span>\n\n  <span class=\"k\">abstract</span> &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Expr.java</em>, create new file</div>\n\n<h3><a href=\"#assign-expression\" id=\"assign-expression\"><small>A2&#8202;.&#8202;1&#8202;.&#8202;1</small>Assign expression</a></h3>\n<p>Variable assignment is introduced in &ldquo;<a href=\"statements-and-state.html#assignment\">Statements and\nState</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Expr.java</em><br>\nnest inside class <em>Expr</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Assign</span> <span class=\"k\">extends</span> <span class=\"t\">Expr</span> {\n    <span class=\"t\">Assign</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>, <span class=\"t\">Expr</span> <span class=\"i\">value</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">name</span> = <span class=\"i\">name</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">value</span> = <span class=\"i\">value</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitAssignExpr</span>(<span 
class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">name</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">value</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Expr.java</em>, nest inside class <em>Expr</em></div>\n\n<h3><a href=\"#binary-expression\" id=\"binary-expression\"><small>A2&#8202;.&#8202;1&#8202;.&#8202;2</small>Binary expression</a></h3>\n<p>Binary operators are introduced in &ldquo;<a href=\"representing-code.html\">Representing\nCode</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Expr.java</em><br>\nnest inside class <em>Expr</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Binary</span> <span class=\"k\">extends</span> <span class=\"t\">Expr</span> {\n    <span class=\"t\">Binary</span>(<span class=\"t\">Expr</span> <span class=\"i\">left</span>, <span class=\"t\">Token</span> <span class=\"i\">operator</span>, <span class=\"t\">Expr</span> <span class=\"i\">right</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">left</span> = <span class=\"i\">left</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">operator</span> = <span class=\"i\">operator</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">right</span> = <span class=\"i\">right</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitBinaryExpr</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">left</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> 
<span class=\"i\">operator</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">right</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Expr.java</em>, nest inside class <em>Expr</em></div>\n\n<h3><a href=\"#call-expression\" id=\"call-expression\"><small>A2&#8202;.&#8202;1&#8202;.&#8202;3</small>Call expression</a></h3>\n<p>Function call expressions are introduced in\n&ldquo;<a href=\"functions.html#function-calls\">Functions</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Expr.java</em><br>\nnest inside class <em>Expr</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Call</span> <span class=\"k\">extends</span> <span class=\"t\">Expr</span> {\n    <span class=\"t\">Call</span>(<span class=\"t\">Expr</span> <span class=\"i\">callee</span>, <span class=\"t\">Token</span> <span class=\"i\">paren</span>, <span class=\"t\">List</span>&lt;<span class=\"t\">Expr</span>&gt; <span class=\"i\">arguments</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">callee</span> = <span class=\"i\">callee</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">paren</span> = <span class=\"i\">paren</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">arguments</span> = <span class=\"i\">arguments</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitCallExpr</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">callee</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">paren</span>;\n    <span 
class=\"k\">final</span> <span class=\"t\">List</span>&lt;<span class=\"t\">Expr</span>&gt; <span class=\"i\">arguments</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Expr.java</em>, nest inside class <em>Expr</em></div>\n\n<h3><a href=\"#get-expression\" id=\"get-expression\"><small>A2&#8202;.&#8202;1&#8202;.&#8202;4</small>Get expression</a></h3>\n<p>Property access, or &ldquo;get&rdquo; expressions, are introduced in\n&ldquo;<a href=\"classes.html#properties-on-instances\">Classes</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Expr.java</em><br>\nnest inside class <em>Expr</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Get</span> <span class=\"k\">extends</span> <span class=\"t\">Expr</span> {\n    <span class=\"t\">Get</span>(<span class=\"t\">Expr</span> <span class=\"i\">object</span>, <span class=\"t\">Token</span> <span class=\"i\">name</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">object</span> = <span class=\"i\">object</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">name</span> = <span class=\"i\">name</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitGetExpr</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">object</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">name</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Expr.java</em>, nest inside class <em>Expr</em></div>\n\n<h3><a href=\"#grouping-expression\" 
id=\"grouping-expression\"><small>A2&#8202;.&#8202;1&#8202;.&#8202;5</small>Grouping expression</a></h3>\n<p>Using parentheses to group expressions is introduced in &ldquo;<a href=\"representing-code.html\">Representing\nCode</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Expr.java</em><br>\nnest inside class <em>Expr</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Grouping</span> <span class=\"k\">extends</span> <span class=\"t\">Expr</span> {\n    <span class=\"t\">Grouping</span>(<span class=\"t\">Expr</span> <span class=\"i\">expression</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">expression</span> = <span class=\"i\">expression</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitGroupingExpr</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">expression</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Expr.java</em>, nest inside class <em>Expr</em></div>\n\n<h3><a href=\"#literal-expression\" id=\"literal-expression\"><small>A2&#8202;.&#8202;1&#8202;.&#8202;6</small>Literal expression</a></h3>\n<p>Literal value expressions are introduced in &ldquo;<a href=\"representing-code.html\">Representing\nCode</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Expr.java</em><br>\nnest inside class <em>Expr</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Literal</span> <span class=\"k\">extends</span> <span class=\"t\">Expr</span> {\n    <span class=\"t\">Literal</span>(<span 
class=\"t\">Object</span> <span class=\"i\">value</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">value</span> = <span class=\"i\">value</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitLiteralExpr</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Object</span> <span class=\"i\">value</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Expr.java</em>, nest inside class <em>Expr</em></div>\n\n<h3><a href=\"#logical-expression\" id=\"logical-expression\"><small>A2&#8202;.&#8202;1&#8202;.&#8202;7</small>Logical expression</a></h3>\n<p>The logical <code>and</code> and <code>or</code> operators are introduced in &ldquo;<a href=\"control-flow.html#logical-operators\">Control\nFlow</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Expr.java</em><br>\nnest inside class <em>Expr</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Logical</span> <span class=\"k\">extends</span> <span class=\"t\">Expr</span> {\n    <span class=\"t\">Logical</span>(<span class=\"t\">Expr</span> <span class=\"i\">left</span>, <span class=\"t\">Token</span> <span class=\"i\">operator</span>, <span class=\"t\">Expr</span> <span class=\"i\">right</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">left</span> = <span class=\"i\">left</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">operator</span> = <span class=\"i\">operator</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">right</span> = <span class=\"i\">right</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span 
class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitLogicalExpr</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">left</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">operator</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">right</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Expr.java</em>, nest inside class <em>Expr</em></div>\n\n<h3><a href=\"#set-expression\" id=\"set-expression\"><small>A2&#8202;.&#8202;1&#8202;.&#8202;8</small>Set expression</a></h3>\n<p>Property assignment, or &ldquo;set&rdquo; expressions, are introduced in\n&ldquo;<a href=\"classes.html#properties-on-instances\">Classes</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Expr.java</em><br>\nnest inside class <em>Expr</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Set</span> <span class=\"k\">extends</span> <span class=\"t\">Expr</span> {\n    <span class=\"t\">Set</span>(<span class=\"t\">Expr</span> <span class=\"i\">object</span>, <span class=\"t\">Token</span> <span class=\"i\">name</span>, <span class=\"t\">Expr</span> <span class=\"i\">value</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">object</span> = <span class=\"i\">object</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">name</span> = <span class=\"i\">name</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">value</span> = <span class=\"i\">value</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span 
class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitSetExpr</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">object</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">name</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">value</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Expr.java</em>, nest inside class <em>Expr</em></div>\n\n<h3><a href=\"#super-expression\" id=\"super-expression\"><small>A2&#8202;.&#8202;1&#8202;.&#8202;9</small>Super expression</a></h3>\n<p>The <code>super</code> expression is introduced in\n&ldquo;<a href=\"inheritance.html#calling-superclass-methods\">Inheritance</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Expr.java</em><br>\nnest inside class <em>Expr</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Super</span> <span class=\"k\">extends</span> <span class=\"t\">Expr</span> {\n    <span class=\"t\">Super</span>(<span class=\"t\">Token</span> <span class=\"i\">keyword</span>, <span class=\"t\">Token</span> <span class=\"i\">method</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">keyword</span> = <span class=\"i\">keyword</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">method</span> = <span class=\"i\">method</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span 
class=\"i\">visitSuperExpr</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">keyword</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">method</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Expr.java</em>, nest inside class <em>Expr</em></div>\n\n<h3><a href=\"#this-expression\" id=\"this-expression\"><small>A2&#8202;.&#8202;1&#8202;.&#8202;10</small>This expression</a></h3>\n<p>The <code>this</code> expression is introduced in &ldquo;<a href=\"classes.html#this\">Classes</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Expr.java</em><br>\nnest inside class <em>Expr</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">This</span> <span class=\"k\">extends</span> <span class=\"t\">Expr</span> {\n    <span class=\"t\">This</span>(<span class=\"t\">Token</span> <span class=\"i\">keyword</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">keyword</span> = <span class=\"i\">keyword</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitThisExpr</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">keyword</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Expr.java</em>, nest inside class <em>Expr</em></div>\n\n<h3><a href=\"#unary-expression\" id=\"unary-expression\"><small>A2&#8202;.&#8202;1&#8202;.&#8202;11</small>Unary expression</a></h3>\n<p>Unary operators are introduced in &ldquo;<a href=\"representing-code.html\">Representing 
Code</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Expr.java</em><br>\nnest inside class <em>Expr</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Unary</span> <span class=\"k\">extends</span> <span class=\"t\">Expr</span> {\n    <span class=\"t\">Unary</span>(<span class=\"t\">Token</span> <span class=\"i\">operator</span>, <span class=\"t\">Expr</span> <span class=\"i\">right</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">operator</span> = <span class=\"i\">operator</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">right</span> = <span class=\"i\">right</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitUnaryExpr</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">operator</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">right</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Expr.java</em>, nest inside class <em>Expr</em></div>\n\n<h3><a href=\"#variable-expression\" id=\"variable-expression\"><small>A2&#8202;.&#8202;1&#8202;.&#8202;12</small>Variable expression</a></h3>\n<p>Variable access expressions are introduced in &ldquo;<a href=\"statements-and-state.html#variable-syntax\">Statements and\nState</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Expr.java</em><br>\nnest inside class <em>Expr</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Variable</span> <span class=\"k\">extends</span> <span class=\"t\">Expr</span> {\n    <span 
class=\"t\">Variable</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">name</span> = <span class=\"i\">name</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitVariableExpr</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">name</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Expr.java</em>, nest inside class <em>Expr</em></div>\n\n<h2><a href=\"#statements\" id=\"statements\"><small>A2&#8202;.&#8202;2</small>Statements</a></h2>\n<p>Statements form a second hierarchy of syntax tree nodes independent of\nexpressions. 
We add the first couple of them in &ldquo;<a href=\"statements-and-state.html\">Statements and\nState</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Stmt.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">import</span> <span class=\"i\">java.util.List</span>;\n\n<span class=\"k\">abstract</span> <span class=\"k\">class</span> <span class=\"t\">Stmt</span> {\n  <span class=\"k\">interface</span> <span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; {\n    <span class=\"t\">R</span> <span class=\"i\">visitBlockStmt</span>(<span class=\"t\">Block</span> <span class=\"i\">stmt</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitClassStmt</span>(<span class=\"t\">Class</span> <span class=\"i\">stmt</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitExpressionStmt</span>(<span class=\"t\">Expression</span> <span class=\"i\">stmt</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitFunctionStmt</span>(<span class=\"t\">Function</span> <span class=\"i\">stmt</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitIfStmt</span>(<span class=\"t\">If</span> <span class=\"i\">stmt</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitPrintStmt</span>(<span class=\"t\">Print</span> <span class=\"i\">stmt</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitReturnStmt</span>(<span class=\"t\">Return</span> <span class=\"i\">stmt</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitVarStmt</span>(<span class=\"t\">Var</span> <span class=\"i\">stmt</span>);\n    <span class=\"t\">R</span> <span class=\"i\">visitWhileStmt</span>(<span class=\"t\">While</span> <span class=\"i\">stmt</span>);\n  }\n\n  <span class=\"c\">// Nested Stmt classes here...</span>\n\n  <span class=\"k\">abstract</span> &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span 
class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Stmt.java</em>, create new file</div>\n\n<h3><a href=\"#block-statement\" id=\"block-statement\"><small>A2&#8202;.&#8202;2&#8202;.&#8202;1</small>Block statement</a></h3>\n<p>The curly-braced block statement that defines a local scope is introduced in\n&ldquo;<a href=\"statements-and-state.html#block-syntax-and-semantics\">Statements and State</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Stmt.java</em><br>\nnest inside class <em>Stmt</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Block</span> <span class=\"k\">extends</span> <span class=\"t\">Stmt</span> {\n    <span class=\"t\">Block</span>(<span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>&gt; <span class=\"i\">statements</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">statements</span> = <span class=\"i\">statements</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitBlockStmt</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>&gt; <span class=\"i\">statements</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Stmt.java</em>, nest inside class <em>Stmt</em></div>\n\n<h3><a href=\"#class-statement\" id=\"class-statement\"><small>A2&#8202;.&#8202;2&#8202;.&#8202;2</small>Class statement</a></h3>\n<p>Class declarations are introduced in, unsurprisingly,\n&ldquo;<a 
href=\"classes.html#class-declarations\">Classes</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Stmt.java</em><br>\nnest inside class <em>Stmt</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Class</span> <span class=\"k\">extends</span> <span class=\"t\">Stmt</span> {\n    <span class=\"t\">Class</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>,\n          <span class=\"t\">Expr</span>.<span class=\"t\">Variable</span> <span class=\"i\">superclass</span>,\n          <span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>.<span class=\"t\">Function</span>&gt; <span class=\"i\">methods</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">name</span> = <span class=\"i\">name</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">superclass</span> = <span class=\"i\">superclass</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">methods</span> = <span class=\"i\">methods</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitClassStmt</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">name</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span>.<span class=\"t\">Variable</span> <span class=\"i\">superclass</span>;\n    <span class=\"k\">final</span> <span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>.<span class=\"t\">Function</span>&gt; <span class=\"i\">methods</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Stmt.java</em>, nest inside class <em>Stmt</em></div>\n\n<h3><a href=\"#expression-statement\" 
id=\"expression-statement\"><small>A2&#8202;.&#8202;2&#8202;.&#8202;3</small>Expression statement</a></h3>\n<p>The expression statement is introduced in &ldquo;<a href=\"statements-and-state.html#statements\">Statements and\nState</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Stmt.java</em><br>\nnest inside class <em>Stmt</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Expression</span> <span class=\"k\">extends</span> <span class=\"t\">Stmt</span> {\n    <span class=\"t\">Expression</span>(<span class=\"t\">Expr</span> <span class=\"i\">expression</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">expression</span> = <span class=\"i\">expression</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitExpressionStmt</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">expression</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Stmt.java</em>, nest inside class <em>Stmt</em></div>\n\n<h3><a href=\"#function-statement\" id=\"function-statement\"><small>A2&#8202;.&#8202;2&#8202;.&#8202;4</small>Function statement</a></h3>\n<p>Function declarations are introduced in, you guessed it,\n&ldquo;<a href=\"functions.html#function-declarations\">Functions</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Stmt.java</em><br>\nnest inside class <em>Stmt</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Function</span> <span class=\"k\">extends</span> <span class=\"t\">Stmt</span> {\n    <span 
class=\"t\">Function</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>, <span class=\"t\">List</span>&lt;<span class=\"t\">Token</span>&gt; <span class=\"i\">params</span>, <span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>&gt; <span class=\"i\">body</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">name</span> = <span class=\"i\">name</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">params</span> = <span class=\"i\">params</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">body</span> = <span class=\"i\">body</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitFunctionStmt</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">name</span>;\n    <span class=\"k\">final</span> <span class=\"t\">List</span>&lt;<span class=\"t\">Token</span>&gt; <span class=\"i\">params</span>;\n    <span class=\"k\">final</span> <span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>&gt; <span class=\"i\">body</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Stmt.java</em>, nest inside class <em>Stmt</em></div>\n\n<h3><a href=\"#if-statement\" id=\"if-statement\"><small>A2&#8202;.&#8202;2&#8202;.&#8202;5</small>If statement</a></h3>\n<p>The <code>if</code> statement is introduced in &ldquo;<a href=\"control-flow.html#conditional-execution\">Control\nFlow</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Stmt.java</em><br>\nnest inside class <em>Stmt</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">If</span> <span 
class=\"k\">extends</span> <span class=\"t\">Stmt</span> {\n    <span class=\"t\">If</span>(<span class=\"t\">Expr</span> <span class=\"i\">condition</span>, <span class=\"t\">Stmt</span> <span class=\"i\">thenBranch</span>, <span class=\"t\">Stmt</span> <span class=\"i\">elseBranch</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">condition</span> = <span class=\"i\">condition</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">thenBranch</span> = <span class=\"i\">thenBranch</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">elseBranch</span> = <span class=\"i\">elseBranch</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitIfStmt</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">condition</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Stmt</span> <span class=\"i\">thenBranch</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Stmt</span> <span class=\"i\">elseBranch</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Stmt.java</em>, nest inside class <em>Stmt</em></div>\n\n<h3><a href=\"#print-statement\" id=\"print-statement\"><small>A2&#8202;.&#8202;2&#8202;.&#8202;6</small>Print statement</a></h3>\n<p>The <code>print</code> statement is introduced in &ldquo;<a href=\"statements-and-state.html#statements\">Statements and\nState</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Stmt.java</em><br>\nnest inside class <em>Stmt</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Print</span> <span class=\"k\">extends</span> <span 
class=\"t\">Stmt</span> {\n    <span class=\"t\">Print</span>(<span class=\"t\">Expr</span> <span class=\"i\">expression</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">expression</span> = <span class=\"i\">expression</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitPrintStmt</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">expression</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Stmt.java</em>, nest inside class <em>Stmt</em></div>\n\n<h3><a href=\"#return-statement\" id=\"return-statement\"><small>A2&#8202;.&#8202;2&#8202;.&#8202;7</small>Return statement</a></h3>\n<p>You need a function to return from, so <code>return</code> statements are introduced in\n&ldquo;<a href=\"functions.html#return-statements\">Functions</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Stmt.java</em><br>\nnest inside class <em>Stmt</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Return</span> <span class=\"k\">extends</span> <span class=\"t\">Stmt</span> {\n    <span class=\"t\">Return</span>(<span class=\"t\">Token</span> <span class=\"i\">keyword</span>, <span class=\"t\">Expr</span> <span class=\"i\">value</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">keyword</span> = <span class=\"i\">keyword</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">value</span> = <span class=\"i\">value</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span 
class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitReturnStmt</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">keyword</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">value</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Stmt.java</em>, nest inside class <em>Stmt</em></div>\n\n<h3><a href=\"#variable-statement\" id=\"variable-statement\"><small>A2&#8202;.&#8202;2&#8202;.&#8202;8</small>Variable statement</a></h3>\n<p>Variable declarations are introduced in &ldquo;<a href=\"statements-and-state.html#variable-syntax\">Statements and\nState</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Stmt.java</em><br>\nnest inside class <em>Stmt</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Var</span> <span class=\"k\">extends</span> <span class=\"t\">Stmt</span> {\n    <span class=\"t\">Var</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>, <span class=\"t\">Expr</span> <span class=\"i\">initializer</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">name</span> = <span class=\"i\">name</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">initializer</span> = <span class=\"i\">initializer</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitVarStmt</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span 
class=\"i\">name</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">initializer</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Stmt.java</em>, nest inside class <em>Stmt</em></div>\n\n<h3><a href=\"#while-statement\" id=\"while-statement\"><small>A2&#8202;.&#8202;2&#8202;.&#8202;9</small>While statement</a></h3>\n<p>The <code>while</code> statement is introduced in &ldquo;<a href=\"control-flow.html#while-loops\">Control\nFlow</a>&rdquo;.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Stmt.java</em><br>\nnest inside class <em>Stmt</em></div>\n<pre>  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">While</span> <span class=\"k\">extends</span> <span class=\"t\">Stmt</span> {\n    <span class=\"t\">While</span>(<span class=\"t\">Expr</span> <span class=\"i\">condition</span>, <span class=\"t\">Stmt</span> <span class=\"i\">body</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">condition</span> = <span class=\"i\">condition</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">body</span> = <span class=\"i\">body</span>;\n    }\n\n    <span class=\"a\">@Override</span>\n    &lt;<span class=\"t\">R</span>&gt; <span class=\"t\">R</span> <span class=\"i\">accept</span>(<span class=\"t\">Visitor</span>&lt;<span class=\"t\">R</span>&gt; <span class=\"i\">visitor</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">visitor</span>.<span class=\"i\">visitWhileStmt</span>(<span class=\"k\">this</span>);\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">condition</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Stmt</span> <span class=\"i\">body</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Stmt.java</em>, nest inside class <em>Stmt</em></div>\n\n\n<footer>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a 
href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/backmatter.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Backmatter &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h2><small></small>Backmatter</h2>\n\n<ul>\n    <li><a href=\"appendix-i.html\"><small>A1</small>Appendix I</a></li>\n    <li><a href=\"appendix-ii.html\"><small>A2</small>Appendix II</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"optimization.html\" title=\"Optimization\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"contents.html\" title=\"Table of Contents\">&uarr;&nbsp;Up</a>\n    <a href=\"appendix-i.html\" title=\"Appendix I\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav 
class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"optimization.html\" title=\"Optimization\" class=\"prev\">←</a>\n<a href=\"appendix-i.html\" title=\"Appendix I\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h2><small></small>Backmatter</h2>\n\n<ul>\n    <li><a href=\"appendix-i.html\"><small>A1</small>Appendix I</a></li>\n    <li><a href=\"appendix-ii.html\"><small>A2</small>Appendix II</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"optimization.html\" title=\"Optimization\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"contents.html\" title=\"Table of Contents\">&uarr;&nbsp;Up</a>\n    <a href=\"appendix-i.html\" title=\"Appendix I\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <h1 class=\"part\">Backmatter</h1>\n\n<p>You&rsquo;ve reached the end of the book! There are two pieces of supplementary\nmaterial you may find helpful:</p>\n<ul>\n<li>\n<p><strong><a href=\"appendix-i.html\">Appendix I</a></strong> contains a complete grammar for Lox, all in one place.</p>\n</li>\n<li>\n<p><strong><a href=\"appendix-ii.html\">Appendix II</a></strong> shows the Java classes produced by <a href=\"representing-code.html#metaprogramming-the-trees\">the AST generator</a>\nwe use for jlox.</p>\n</li>\n</ul>\n\n<footer>\n<a href=\"appendix-i.html\" class=\"next\">\n  Next Chapter: &ldquo;Appendix I&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/calls-and-functions.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Calls and Functions &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Calls and Functions<small>24</small></a></h3>\n\n<ul>\n    <li><a href=\"#function-objects\"><small>24.1</small> Function Objects</a></li>\n    <li><a href=\"#compiling-to-function-objects\"><small>24.2</small> Compiling to Function Objects</a></li>\n    <li><a href=\"#call-frames\"><small>24.3</small> Call Frames</a></li>\n    <li><a href=\"#function-declarations\"><small>24.4</small> Function Declarations</a></li>\n    <li><a href=\"#function-calls\"><small>24.5</small> Function Calls</a></li>\n    <li><a 
href=\"#return-statements\"><small>24.6</small> Return Statements</a></li>\n    <li><a href=\"#native-functions\"><small>24.7</small> Native Functions</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"jumping-back-and-forth.html\" title=\"Jumping Back and Forth\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"closures.html\" title=\"Closures\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"jumping-back-and-forth.html\" title=\"Jumping Back and Forth\" class=\"prev\">←</a>\n<a href=\"closures.html\" title=\"Closures\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Calls and Functions<small>24</small></a></h3>\n\n<ul>\n    <li><a href=\"#function-objects\"><small>24.1</small> Function Objects</a></li>\n    <li><a href=\"#compiling-to-function-objects\"><small>24.2</small> Compiling to Function Objects</a></li>\n    <li><a href=\"#call-frames\"><small>24.3</small> Call Frames</a></li>\n    <li><a href=\"#function-declarations\"><small>24.4</small> Function Declarations</a></li>\n    <li><a href=\"#function-calls\"><small>24.5</small> Function Calls</a></li>\n    <li><a href=\"#return-statements\"><small>24.6</small> Return Statements</a></li>\n    <li><a href=\"#native-functions\"><small>24.7</small> Native Functions</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"jumping-back-and-forth.html\" 
title=\"Jumping Back and Forth\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"closures.html\" title=\"Closures\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">24</div>\n  <h1>Calls and Functions</h1>\n\n<blockquote>\n<p>Any problem in computer science can be solved with another level of\nindirection. Except for the problem of too many layers of indirection.</p>\n<p><cite>David Wheeler</cite></p>\n</blockquote>\n<p>This chapter is a beast. I try to break features into bite-sized pieces, but\nsometimes you gotta swallow the whole <span name=\"eat\">meal</span>. Our next\ntask is functions. We could start with only function declarations, but that&rsquo;s\nnot very useful when you can&rsquo;t call them. We could do calls, but there&rsquo;s nothing\nto call. And all of the runtime support needed in the VM to support both of\nthose isn&rsquo;t very rewarding if it isn&rsquo;t hooked up to anything you can see. So\nwe&rsquo;re going to do it all. It&rsquo;s a lot, but we&rsquo;ll feel good when we&rsquo;re done.</p>\n<aside name=\"eat\">\n<p>Eating<span class=\"em\">&mdash;</span>consumption<span class=\"em\">&mdash;</span>is a weird metaphor for a creative act. But most of the\nbiological processes that produce &ldquo;output&rdquo; are a little less, ahem, decorous.</p>\n</aside>\n<h2><a href=\"#function-objects\" id=\"function-objects\"><small>24&#8202;.&#8202;1</small>Function Objects</a></h2>\n<p>The most interesting structural change in the VM is around the stack. We already\n<em>have</em> a stack for local variables and temporaries, so we&rsquo;re partway there. But\nwe have no notion of a <em>call</em> stack. Before we can make much progress, we&rsquo;ll\nhave to fix that. But first, let&rsquo;s write some code. 
I always feel better once I\nstart moving. We can&rsquo;t do much without having some kind of representation for\nfunctions, so we&rsquo;ll start there. From the VM&rsquo;s perspective, what is a function?</p>\n<p>A function has a body that can be executed, so that means some bytecode. We\ncould compile the entire program and all of its function declarations into one\nbig monolithic Chunk. Each function would have a pointer to the first\ninstruction of its code inside the Chunk.</p>\n<p>This is roughly how compilation to native code works where you end up with one\nsolid blob of machine code. But for our bytecode VM, we can do something a\nlittle higher level. I think a cleaner model is to give each function its own\nChunk. We&rsquo;ll want some other metadata too, so let&rsquo;s go ahead and stuff it all in\na struct now.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  struct Obj* next;\n};\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>Obj</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">Obj</span> <span class=\"i\">obj</span>;\n  <span class=\"t\">int</span> <span class=\"i\">arity</span>;\n  <span class=\"t\">Chunk</span> <span class=\"i\">chunk</span>;\n  <span class=\"t\">ObjString</span>* <span class=\"i\">name</span>;\n} <span class=\"t\">ObjFunction</span>;\n</pre><pre class=\"insert-after\">\n\nstruct ObjString {\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>Obj</em></div>\n\n<p>Functions are first class in Lox, so they need to be actual Lox objects. Thus\nObjFunction has the same Obj header that all object types share. The <code>arity</code>\nfield stores the number of parameters the function expects. Then, in addition to\nthe chunk, we store the function&rsquo;s <span name=\"name\">name</span>. 
That will be\nhandy for reporting readable runtime errors.</p>\n<aside name=\"name\">\n<p>Humans don&rsquo;t seem to find numeric bytecode offsets particularly illuminating in\ncrash dumps.</p>\n</aside>\n<p>This is the first time the &ldquo;object&rdquo; module has needed to reference Chunk, so we\nget an include.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;common.h&quot;\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;chunk.h&quot;</span>\n</pre><pre class=\"insert-after\">#include &quot;value.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>Like we did with strings, we define some accessories to make Lox functions\neasier to work with in C. Sort of a poor man&rsquo;s object orientation. First, we&rsquo;ll\ndeclare a C function to create a new Lox function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  uint32_t hash;\n};\n\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>ObjString</em></div>\n<pre class=\"insert\"><span class=\"t\">ObjFunction</span>* <span class=\"i\">newFunction</span>();\n</pre><pre class=\"insert-after\">ObjString* takeString(char* chars, int length);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>ObjString</em></div>\n\n<p>The implementation is over here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.c</em><br>\nadd after <em>allocateObject</em>()</div>\n<pre><span class=\"t\">ObjFunction</span>* <span class=\"i\">newFunction</span>() {\n  <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span> = <span class=\"a\">ALLOCATE_OBJ</span>(<span class=\"t\">ObjFunction</span>, <span class=\"a\">OBJ_FUNCTION</span>);\n  <span class=\"i\">function</span>-&gt;<span class=\"i\">arity</span> = <span class=\"n\">0</span>;\n  <span class=\"i\">function</span>-&gt;<span class=\"i\">name</span> = 
<span class=\"a\">NULL</span>;\n  <span class=\"i\">initChunk</span>(&amp;<span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>);\n  <span class=\"k\">return</span> <span class=\"i\">function</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, add after <em>allocateObject</em>()</div>\n\n<p>We use our friend <code>ALLOCATE_OBJ()</code> to allocate memory and initialize the\nobject&rsquo;s header so that the VM knows what type of object it is. Instead of\npassing in arguments to initialize the function like we did with ObjString, we\nset the function up in a sort of blank state<span class=\"em\">&mdash;</span>zero arity, no name, and no\ncode. That will get filled in later after the function is created.</p>\n<p>Since we have a new kind of object, we need a new object type in the enum.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef enum {\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin enum <em>ObjType</em></div>\n<pre class=\"insert\">  <span class=\"a\">OBJ_FUNCTION</span>,\n</pre><pre class=\"insert-after\">  OBJ_STRING,\n} ObjType;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in enum <em>ObjType</em></div>\n\n<p>When we&rsquo;re done with a function object, we must return the bits it borrowed back\nto the operating system.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (object-&gt;type) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>freeObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_FUNCTION</span>: {\n      <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span> = (<span class=\"t\">ObjFunction</span>*)<span class=\"i\">object</span>;\n      <span class=\"i\">freeChunk</span>(&amp;<span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>);\n      <span class=\"a\">FREE</span>(<span class=\"t\">ObjFunction</span>, <span class=\"i\">object</span>);\n 
     <span class=\"k\">break</span>;\n    }\n</pre><pre class=\"insert-after\">    case OBJ_STRING: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>freeObject</em>()</div>\n\n<p>This switch case is <span name=\"free-name\">responsible</span> for freeing the\nObjFunction itself as well as any other memory it owns. Functions own their\nchunk, so we call Chunk&rsquo;s destructor-like function.</p>\n<aside name=\"free-name\">\n<p>We don&rsquo;t need to explicitly free the function&rsquo;s name because it&rsquo;s an ObjString.\nThat means we can let the garbage collector manage its lifetime for us. Or, at\nleast, we&rsquo;ll be able to once we <a href=\"garbage-collection.html\">implement a garbage collector</a>.</p>\n</aside>\n<p>Lox lets you print any object, and functions are first-class objects, so we\nneed to handle them too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (OBJ_TYPE(value)) {\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>printObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_FUNCTION</span>:\n      <span class=\"i\">printFunction</span>(<span class=\"a\">AS_FUNCTION</span>(<span class=\"i\">value</span>));\n      <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    case OBJ_STRING:\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>printObject</em>()</div>\n\n<p>This calls out to:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.c</em><br>\nadd after <em>copyString</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">printFunction</span>(<span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span>) {\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;&lt;fn %s&gt;&quot;</span>, <span class=\"i\">function</span>-&gt;<span class=\"i\">name</span>-&gt;<span class=\"i\">chars</span>);\n}\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>object.c</em>, add after <em>copyString</em>()</div>\n\n<p>Since a function knows its name, it may as well say it.</p>\n<p>Finally, we have a couple of macros for converting values to functions. First,\nmake sure your value actually <em>is</em> a function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define OBJ_TYPE(value)        (AS_OBJ(value)-&gt;type)\n\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define IS_FUNCTION(value)     isObjType(value, OBJ_FUNCTION)</span>\n</pre><pre class=\"insert-after\">#define IS_STRING(value)       isObjType(value, OBJ_STRING)\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>Assuming that evaluates to true, you can then safely cast the Value to an\nObjFunction pointer using this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define IS_STRING(value)       isObjType(value, OBJ_STRING)\n\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define AS_FUNCTION(value)     ((ObjFunction*)AS_OBJ(value))</span>\n</pre><pre class=\"insert-after\">#define AS_STRING(value)       ((ObjString*)AS_OBJ(value))\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>With that, our object model knows how to represent functions. I&rsquo;m feeling warmed\nup now. You ready for something a little harder?</p>\n<h2><a href=\"#compiling-to-function-objects\" id=\"compiling-to-function-objects\"><small>24&#8202;.&#8202;2</small>Compiling to Function Objects</a></h2>\n<p>Right now, our compiler assumes it is always compiling to one single chunk. With\neach function&rsquo;s code living in separate chunks, that gets more complex. When the\ncompiler reaches a function declaration, it needs to emit code into the\nfunction&rsquo;s chunk when compiling its body. 
At the end of the function body, the\ncompiler needs to return to the previous chunk it was working with.</p>\n<p>That&rsquo;s fine for code inside function bodies, but what about code that isn&rsquo;t? The\n&ldquo;top level&rdquo; of a Lox program is also imperative code and we need a chunk to\ncompile that into. We can simplify the compiler and VM by placing that top-level\ncode inside an automatically defined function too. That way, the compiler is\nalways within some kind of function body, and the VM always runs code by\ninvoking a function. It&rsquo;s as if the entire program is <span\nname=\"wrap\">wrapped</span> inside an implicit <code>main()</code> function.</p>\n<aside name=\"wrap\">\n<p>One semantic corner where that analogy breaks down is global variables. They\nhave special scoping rules different from local variables, so in that way, the\ntop level of a script isn&rsquo;t like a function body.</p>\n</aside>\n<p>Before we get to user-defined functions, then, let&rsquo;s do the reorganization to\nsupport that implicit top-level function. It starts with the Compiler struct.\nInstead of pointing directly to a Chunk that the compiler writes to, it instead\nhas a reference to the function object being built.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef struct {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin struct <em>Compiler</em></div>\n<pre class=\"insert\">  <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span>;\n  <span class=\"t\">FunctionType</span> <span class=\"i\">type</span>;\n\n</pre><pre class=\"insert-after\">  Local locals[UINT8_COUNT];\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in struct <em>Compiler</em></div>\n\n<p>We also have a little FunctionType enum. This lets the compiler tell when it&rsquo;s\ncompiling top-level code versus the body of a function. 
Most of the compiler\ndoesn&rsquo;t care about this<span class=\"em\">&mdash;</span>that&rsquo;s why it&rsquo;s a useful abstraction<span class=\"em\">&mdash;</span>but in one or\ntwo places the distinction is meaningful. We&rsquo;ll get to one later.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after struct <em>Local</em></div>\n<pre><span class=\"k\">typedef</span> <span class=\"k\">enum</span> {\n  <span class=\"a\">TYPE_FUNCTION</span>,\n  <span class=\"a\">TYPE_SCRIPT</span>\n} <span class=\"t\">FunctionType</span>;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after struct <em>Local</em></div>\n\n<p>Every place in the compiler that was writing to the Chunk now needs to go\nthrough that <code>function</code> pointer. Fortunately, many <span\nname=\"current\">chapters</span> ago, we encapsulated access to the chunk in the\n<code>currentChunk()</code> function. We only need to fix that and the rest of the compiler\nis happy.</p>\n<aside name=\"current\">\n<p>It&rsquo;s almost like I had a crystal ball that could see into the future and knew\nwe&rsquo;d need to change the code later. 
But, really, it&rsquo;s because I wrote all the\ncode for the book before any of the text.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">Compiler* current = NULL;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after variable <em>current</em><br>\nreplace 5 lines</div>\n<pre class=\"insert\">\n\n<span class=\"k\">static</span> <span class=\"t\">Chunk</span>* <span class=\"i\">currentChunk</span>() {\n  <span class=\"k\">return</span> &amp;<span class=\"i\">current</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>;\n}\n</pre><pre class=\"insert-after\">\n\nstatic void errorAt(Token* token, const char* message) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after variable <em>current</em>, replace 5 lines</div>\n\n<p>The current chunk is always the chunk owned by the function we&rsquo;re in the middle\nof compiling. Next, we need to actually create that function. Previously, the VM\npassed a Chunk to the compiler which filled it with code. 
Instead, the compiler\nwill create and return a function that contains the compiled top-level code<span class=\"em\">&mdash;</span>which is all we support right now<span class=\"em\">&mdash;</span>of the user&rsquo;s program.</p>\n<h3><a href=\"#creating-functions-at-compile-time\" id=\"creating-functions-at-compile-time\"><small>24&#8202;.&#8202;2&#8202;.&#8202;1</small>Creating functions at compile time</a></h3>\n<p>We start threading this through in <code>compile()</code>, which is the main entry point\ninto the compiler.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  Compiler compiler;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>compile</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">  <span class=\"i\">initCompiler</span>(&amp;<span class=\"i\">compiler</span>, <span class=\"a\">TYPE_SCRIPT</span>);\n</pre><pre class=\"insert-after\">\n\n  parser.hadError = false;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>compile</em>(), replace 2 lines</div>\n\n<p>There are a bunch of changes in how the compiler is initialized. 
First, we\ninitialize the new Compiler fields.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nfunction <em>initCompiler</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">initCompiler</span>(<span class=\"t\">Compiler</span>* <span class=\"i\">compiler</span>, <span class=\"t\">FunctionType</span> <span class=\"i\">type</span>) {\n  <span class=\"i\">compiler</span>-&gt;<span class=\"i\">function</span> = <span class=\"a\">NULL</span>;\n  <span class=\"i\">compiler</span>-&gt;<span class=\"i\">type</span> = <span class=\"i\">type</span>;\n</pre><pre class=\"insert-after\">  compiler-&gt;localCount = 0;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, function <em>initCompiler</em>(), replace 1 line</div>\n\n<p>Then we allocate a new function object to compile into.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  compiler-&gt;scopeDepth = 0;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>initCompiler</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">compiler</span>-&gt;<span class=\"i\">function</span> = <span class=\"i\">newFunction</span>();\n</pre><pre class=\"insert-after\">  current = compiler;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>initCompiler</em>()</div>\n\n<p><span name=\"null\"></span></p>\n<aside name=\"null\">\n<p>I know, it looks dumb to null the <code>function</code> field only to immediately assign it\na value a few lines later. More garbage collection-related paranoia.</p>\n</aside>\n<p>Creating an ObjFunction in the compiler might seem a little strange. A function\nobject is the <em>runtime</em> representation of a function, but here we are creating\nit at compile time. The way to think of it is that a function is similar to a\nstring or number literal. It forms a bridge between the compile time and runtime\nworlds. 
When we get to function <em>declarations</em>, those really <em>are</em> literals<span class=\"em\">&mdash;</span>they are a notation that produces values of a built-in type. So the <span\nname=\"closure\">compiler</span> creates function objects during compilation.\nThen, at runtime, they are simply invoked.</p>\n<aside name=\"closure\">\n<p>We can create functions at compile time because they contain only data available\nat compile time. The function&rsquo;s code, name, and arity are all fixed. When we add\nclosures in the <a href=\"closures.html\">next chapter</a>, which capture variables at runtime,\nthe story gets more complex.</p>\n</aside>\n<p>Here is another strange piece of code:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  current = compiler;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>initCompiler</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"t\">Local</span>* <span class=\"i\">local</span> = &amp;<span class=\"i\">current</span>-&gt;<span class=\"i\">locals</span>[<span class=\"i\">current</span>-&gt;<span class=\"i\">localCount</span>++];\n  <span class=\"i\">local</span>-&gt;<span class=\"i\">depth</span> = <span class=\"n\">0</span>;\n  <span class=\"i\">local</span>-&gt;<span class=\"i\">name</span>.<span class=\"i\">start</span> = <span class=\"s\">&quot;&quot;</span>;\n  <span class=\"i\">local</span>-&gt;<span class=\"i\">name</span>.<span class=\"i\">length</span> = <span class=\"n\">0</span>;\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>initCompiler</em>()</div>\n\n<p>Remember that the compiler&rsquo;s <code>locals</code> array keeps track of which stack slots are\nassociated with which local variables or temporaries. From now on, the compiler\nimplicitly claims stack slot zero for the VM&rsquo;s own internal use. We give it an\nempty name so that the user can&rsquo;t write an identifier that refers to it. 
I&rsquo;ll\nexplain what this is about when it becomes useful.</p>\n<p>That&rsquo;s the initialization side. We also need a couple of changes on the other\nend when we finish compiling some code.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nfunction <em>endCompiler</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">ObjFunction</span>* <span class=\"i\">endCompiler</span>() {\n</pre><pre class=\"insert-after\">  emitReturn();\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, function <em>endCompiler</em>(), replace 1 line</div>\n\n<p>Previously, when <code>interpret()</code> called into the compiler, it passed in a Chunk to\nbe written to. Now that the compiler creates the function object itself, we\nreturn that function. We grab it from the current compiler here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  emitReturn();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>endCompiler</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span> = <span class=\"i\">current</span>-&gt;<span class=\"i\">function</span>;\n\n</pre><pre class=\"insert-after\">#ifdef DEBUG_PRINT_CODE\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>endCompiler</em>()</div>\n\n<p>And then return it to <code>compile()</code> like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#endif\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>endCompiler</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">return</span> <span class=\"i\">function</span>;\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>endCompiler</em>()</div>\n\n<p>Now is a good time to make another tweak in this function. 
Earlier, we added\nsome diagnostic code to have the VM dump the disassembled bytecode so we could\ndebug the compiler. We should fix that to keep working now that the generated\nchunk is wrapped in a function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#ifdef DEBUG_PRINT_CODE\n  if (!parser.hadError) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>endCompiler</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"i\">disassembleChunk</span>(<span class=\"i\">currentChunk</span>(), <span class=\"i\">function</span>-&gt;<span class=\"i\">name</span> != <span class=\"a\">NULL</span>\n        ? <span class=\"i\">function</span>-&gt;<span class=\"i\">name</span>-&gt;<span class=\"i\">chars</span> : <span class=\"s\">&quot;&lt;script&gt;&quot;</span>);\n</pre><pre class=\"insert-after\">  }\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>endCompiler</em>(), replace 1 line</div>\n\n<p>Notice the check in here to see if the function&rsquo;s name is <code>NULL</code>? 
User-defined\nfunctions have names, but the implicit function we create for the top-level code\ndoes not, and we need to handle that gracefully even in our own diagnostic code.\nSpeaking of which:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void printFunction(ObjFunction* function) {\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>printFunction</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">function</span>-&gt;<span class=\"i\">name</span> == <span class=\"a\">NULL</span>) {\n    <span class=\"i\">printf</span>(<span class=\"s\">&quot;&lt;script&gt;&quot;</span>);\n    <span class=\"k\">return</span>;\n  }\n</pre><pre class=\"insert-after\">  printf(&quot;&lt;fn %s&gt;&quot;, function-&gt;name-&gt;chars);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>printFunction</em>()</div>\n\n<p>There&rsquo;s no way for a <em>user</em> to get a reference to the top-level function and try\nto print it, but our <code>DEBUG_TRACE_EXECUTION</code> <span\nname=\"debug\">diagnostic</span> code that prints the entire stack can and does.</p>\n<aside name=\"debug\">\n<p>It is no fun if the diagnostic code we use to find bugs itself causes the VM to\nsegfault!</p>\n</aside>\n<p>Bumping up a level to <code>compile()</code>, we adjust its signature.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;vm.h&quot;\n\n</pre><div class=\"source-file\"><em>compiler.h</em><br>\nfunction <em>compile</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"t\">ObjFunction</span>* <span class=\"i\">compile</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">source</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.h</em>, function <em>compile</em>(), replace 1 line</div>\n\n<p>Instead of taking a chunk, now it returns a function. 
Over in the\nimplementation:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nfunction <em>compile</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"t\">ObjFunction</span>* <span class=\"i\">compile</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">source</span>) {\n</pre><pre class=\"insert-after\">  initScanner(source);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, function <em>compile</em>(), replace 1 line</div>\n\n<p>Finally we get to some actual code. We change the very end of the function to\nthis:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  while (!match(TOKEN_EOF)) {\n    declaration();\n  }\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>compile</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">  <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span> = <span class=\"i\">endCompiler</span>();\n  <span class=\"k\">return</span> <span class=\"i\">parser</span>.<span class=\"i\">hadError</span> ? <span class=\"a\">NULL</span> : <span class=\"i\">function</span>;\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>compile</em>(), replace 2 lines</div>\n\n<p>We get the function object from the compiler. If there were no compile errors,\nwe return it. Otherwise, we signal an error by returning <code>NULL</code>. This way, the\nVM doesn&rsquo;t try to execute a function that may contain invalid bytecode.</p>\n<p>Eventually, we will update <code>interpret()</code> to handle the new declaration of\n<code>compile()</code>, but first we have some other changes to make.</p>\n<h2><a href=\"#call-frames\" id=\"call-frames\"><small>24&#8202;.&#8202;3</small>Call Frames</a></h2>\n<p>It&rsquo;s time for a big conceptual leap. Before we can implement function\ndeclarations and calls, we need to get the VM ready to handle them. 
There are\ntwo main problems we need to worry about:</p>\n<h3><a href=\"#allocating-local-variables\" id=\"allocating-local-variables\"><small>24&#8202;.&#8202;3&#8202;.&#8202;1</small>Allocating local variables</a></h3>\n<p>The compiler allocates stack slots for local variables. How should that work\nwhen the set of local variables in a program is distributed across multiple\nfunctions?</p>\n<p>One option would be to keep them totally separate. Each function would get its\nown dedicated set of slots in the VM stack that it would own <span\nname=\"static\">forever</span>, even when the function isn&rsquo;t being called. Each\nlocal variable in the entire program would have a bit of memory in the VM that\nit keeps to itself.</p>\n<aside name=\"static\">\n<p>It&rsquo;s basically what you&rsquo;d get if you declared every local variable in a C\nprogram using <code>static</code>.</p>\n</aside>\n<p>Believe it or not, early programming language implementations worked this way.\nThe first Fortran compilers statically allocated memory for each variable. The\nobvious problem is that it&rsquo;s really inefficient. Most functions are not in the\nmiddle of being called at any point in time, so sitting on unused memory for\nthem is wasteful.</p>\n<p>The more fundamental problem, though, is recursion. With recursion, you can be\n&ldquo;in&rdquo; multiple calls to the same function at the same time. Each needs its <span\nname=\"fortran\">own</span> memory for its local variables. In jlox, we solved\nthis by dynamically allocating memory for an environment each time a function\nwas called or a block entered. In clox, we don&rsquo;t want that kind of performance\ncost on every function call.</p>\n<aside name=\"fortran\">\n<p>Fortran avoided this problem by disallowing recursion entirely. 
Recursion was\nconsidered an advanced, esoteric feature at the time.</p>\n</aside>\n<p>Instead, our solution lies somewhere between Fortran&rsquo;s static allocation and\njlox&rsquo;s dynamic approach. The value stack in the VM works on the observation that\nlocal variables and temporaries behave in a last-in first-out fashion.\nFortunately for us, that&rsquo;s still true even when you add function calls into the\nmix. Here&rsquo;s an example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">first</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">1</span>;\n  <span class=\"i\">second</span>();\n  <span class=\"k\">var</span> <span class=\"i\">b</span> = <span class=\"n\">2</span>;\n}\n\n<span class=\"k\">fun</span> <span class=\"i\">second</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">c</span> = <span class=\"n\">3</span>;\n  <span class=\"k\">var</span> <span class=\"i\">d</span> = <span class=\"n\">4</span>;\n}\n\n<span class=\"i\">first</span>();\n</pre></div>\n<p>Step through the program and look at which variables are in memory at each point\nin time:</p><img src=\"image/calls-and-functions/calls.png\" alt=\"Tracing through the execution of the previous program, showing the stack of variables at each step.\" />\n<p>As execution flows through the two calls, every local variable obeys the\nprinciple that any variable declared after it will be discarded before the first\nvariable needs to be. This is true even across calls. We know we&rsquo;ll be done with\n<code>c</code> and <code>d</code> before we are done with <code>a</code>. It seems we should be able to allocate\nlocal variables on the VM&rsquo;s value stack.</p>\n<p>Ideally, we still determine <em>where</em> on the stack each variable will go at\ncompile time. That keeps the bytecode instructions for working with variables\nsimple and fast. 
In the above example, we could <span\nname=\"imagine\">imagine</span> doing so in a straightforward way, but that\ndoesn&rsquo;t always work out. Consider:</p>\n<aside name=\"imagine\">\n<p>I say &ldquo;imagine&rdquo; because the compiler can&rsquo;t actually figure this out. Because\nfunctions are first class in Lox, we can&rsquo;t determine which functions call which\nothers at compile time.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">first</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">1</span>;\n  <span class=\"i\">second</span>();\n  <span class=\"k\">var</span> <span class=\"i\">b</span> = <span class=\"n\">2</span>;\n  <span class=\"i\">second</span>();\n}\n\n<span class=\"k\">fun</span> <span class=\"i\">second</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">c</span> = <span class=\"n\">3</span>;\n  <span class=\"k\">var</span> <span class=\"i\">d</span> = <span class=\"n\">4</span>;\n}\n\n<span class=\"i\">first</span>();\n</pre></div>\n<p>In the first call to <code>second()</code>, <code>c</code> and <code>d</code> would go into slots 1 and 2. But in\nthe second call, we need to have made room for <code>b</code>, so <code>c</code> and <code>d</code> need to be in\nslots 2 and 3. Thus the compiler can&rsquo;t pin down an exact slot for each local\nvariable across function calls. But <em>within</em> a given function, the <em>relative</em>\nlocations of each local variable are fixed. Variable <code>d</code> is always in the slot\nright after <code>c</code>. This is the key insight.</p>\n<p>When a function is called, we don&rsquo;t know where the top of the stack will be\nbecause it can be called from different contexts. But, wherever that top happens\nto be, we do know where all of the function&rsquo;s local variables will be relative\nto that starting point. 
So, like many problems, we solve our allocation problem\nwith a level of indirection.</p>\n<p>At the beginning of each function call, the VM records the location of the first\nslot where that function&rsquo;s own locals begin. The instructions for working with\nlocal variables access them by a slot index relative to that, instead of\nrelative to the bottom of the stack like they do today. At compile time, we\ncalculate those relative slots. At runtime, we convert that relative slot to an\nabsolute stack index by adding the function call&rsquo;s starting slot.</p>\n<p>It&rsquo;s as if the function gets a &ldquo;window&rdquo; or &ldquo;frame&rdquo; within the larger stack where\nit can store its locals. The position of the <strong>call frame</strong> is determined at\nruntime, but within and relative to that region, we know where to find things.</p><img src=\"image/calls-and-functions/window.png\" alt=\"The stack at the two points when second() is called, with a window hovering over each one showing the pair of stack slots used by the function.\" />\n<p>The historical name for this recorded location where the function&rsquo;s locals start\nis a <strong>frame pointer</strong> because it points to the beginning of the function&rsquo;s call\nframe. Sometimes you hear <strong>base pointer</strong>, because it points to the base stack\nslot on top of which all of the function&rsquo;s variables live.</p>\n<p>That&rsquo;s the first piece of data we need to track. Every time we call a function,\nthe VM determines the first stack slot where that function&rsquo;s variables begin.</p>\n<h3><a href=\"#return-addresses\" id=\"return-addresses\"><small>24&#8202;.&#8202;3&#8202;.&#8202;2</small>Return addresses</a></h3>\n<p>Right now, the VM works its way through the instruction stream by incrementing\nthe <code>ip</code> field. The only interesting behavior is around control flow\ninstructions which offset the <code>ip</code> by larger amounts. 
<em>Calling</em> a function is\npretty straightforward<span class=\"em\">&mdash;</span>simply set <code>ip</code> to point to the first instruction in\nthat function&rsquo;s chunk. But what about when the function is done?</p>\n<p>The VM needs to <span name=\"return\">return</span> to the chunk where the\nfunction was called from and resume execution at the instruction immediately\nafter the call. Thus, for each function call, we need to track where we jump\nback to when the call completes. This is called a <strong>return address</strong> because\nit&rsquo;s the address of the instruction that the VM returns to after the call.</p>\n<p>Again, thanks to recursion, there may be multiple return addresses for a single\nfunction, so this is a property of each <em>invocation</em> and not the function\nitself.</p>\n<aside name=\"return\">\n<p>The authors of early Fortran compilers had a clever trick for implementing\nreturn addresses. Since they <em>didn&rsquo;t</em> support recursion, any given function\nneeded only a single return address at any point in time. So when a function was\ncalled at runtime, the program would <em>modify its own code</em> to change a jump\ninstruction at the end of the function to jump back to its caller. Sometimes the\nline between genius and madness is hair thin.</p>\n</aside>\n<h3><a href=\"#the-call-stack\" id=\"the-call-stack\"><small>24&#8202;.&#8202;3&#8202;.&#8202;3</small>The call stack</a></h3>\n<p>So for each live function invocation<span class=\"em\">&mdash;</span>each call that hasn&rsquo;t returned yet<span class=\"em\">&mdash;</span>we\nneed to track where on the stack that function&rsquo;s locals begin, and where the\ncaller should resume. 
We&rsquo;ll put this, along with some other stuff, in a new\nstruct.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define STACK_MAX 256\n</pre><div class=\"source-file\"><em>vm.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span>;\n  <span class=\"t\">uint8_t</span>* <span class=\"i\">ip</span>;\n  <span class=\"t\">Value</span>* <span class=\"i\">slots</span>;\n} <span class=\"t\">CallFrame</span>;\n</pre><pre class=\"insert-after\">\n\ntypedef struct {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em></div>\n\n<p>A CallFrame represents a single ongoing function call. The <code>slots</code> field points\ninto the VM&rsquo;s value stack at the first slot that this function can use. I gave\nit a plural name because<span class=\"em\">&mdash;</span>thanks to C&rsquo;s weird &ldquo;pointers are sort of arrays&rdquo;\nthing<span class=\"em\">&mdash;</span>we&rsquo;ll treat it like an array.</p>\n<p>The implementation of return addresses is a little different from what I\ndescribed above. Instead of storing the return address in the callee&rsquo;s frame,\nthe caller stores its own <code>ip</code>. When we return from a function, the VM will jump\nto the <code>ip</code> of the caller&rsquo;s CallFrame and resume from there.</p>\n<p>I also stuffed a pointer to the function being called in here. We&rsquo;ll use that to\nlook up constants and for a few other things.</p>\n<p>Each time a function is called, we create one of these structs. We could <span\nname=\"heap\">dynamically</span> allocate them on the heap, but that&rsquo;s slow.\nFunction calls are a core operation, so they need to be as fast as possible.\nFortunately, we can make the same observation we made for variables: function\ncalls have stack semantics. 
If <code>first()</code> calls <code>second()</code>, the call to\n<code>second()</code> will complete before <code>first()</code> does.</p>\n<aside name=\"heap\">\n<p>Many Lisp implementations dynamically allocate stack frames because it\nsimplifies implementing <a href=\"https://en.wikipedia.org/wiki/Continuation\">continuations</a>. If your language supports\ncontinuations, then function calls do <em>not</em> always have stack semantics.</p>\n</aside>\n<p>So over in the VM, we create an array of these CallFrame structs up front and\ntreat it as a stack, like we do with the value array.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef struct {\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nin struct <em>VM</em><br>\nreplace 2 lines</div>\n<pre class=\"insert\">  <span class=\"t\">CallFrame</span> <span class=\"i\">frames</span>[<span class=\"a\">FRAMES_MAX</span>];\n  <span class=\"t\">int</span> <span class=\"i\">frameCount</span>;\n\n</pre><pre class=\"insert-after\">  Value stack[STACK_MAX];\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, in struct <em>VM</em>, replace 2 lines</div>\n\n<p>This array replaces the <code>chunk</code> and <code>ip</code> fields we used to have directly in the\nVM. Now each CallFrame has its own <code>ip</code> and its own pointer to the ObjFunction\nthat it&rsquo;s executing. From there, we can get to the function&rsquo;s chunk.</p>\n<p>The new <code>frameCount</code> field in the VM stores the current height of the CallFrame\nstack<span class=\"em\">&mdash;</span>the number of ongoing function calls. To keep clox simple, the array&rsquo;s\ncapacity is fixed. This means, as in many language implementations, there is a\nmaximum call depth we can handle. 
For clox, it&rsquo;s defined here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;value.h&quot;\n\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"a\">#define FRAMES_MAX 64</span>\n<span class=\"a\">#define STACK_MAX (FRAMES_MAX * UINT8_COUNT)</span>\n</pre><pre class=\"insert-after\">\n\ntypedef struct {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, replace 1 line</div>\n\n<p>We also redefine the value stack&rsquo;s <span name=\"plenty\">size</span> in terms of\nthat to make sure we have plenty of stack slots even in very deep call trees.\nWhen the VM starts up, the CallFrame stack is empty.</p>\n<aside name=\"plenty\">\n<p>It is still possible to overflow the stack if enough function calls use enough\ntemporaries in addition to locals. A robust implementation would guard against\nthis, but I&rsquo;m trying to keep things simple.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">  vm.stackTop = vm.stack;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>resetStack</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">vm</span>.<span class=\"i\">frameCount</span> = <span class=\"n\">0</span>;\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>resetStack</em>()</div>\n\n<p>The &ldquo;vm.h&rdquo; header needs access to ObjFunction, so we add an include.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define clox_vm_h\n\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;object.h&quot;</span>\n</pre><pre class=\"insert-after\">#include &quot;table.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, replace 1 line</div>\n\n<p>Now we&rsquo;re ready to move over to the VM&rsquo;s implementation file. We&rsquo;ve got some\ngrunt work ahead of us. 
We&rsquo;ve moved <code>ip</code> out of the VM struct and into\nCallFrame. We need to fix every line of code in the VM that touches <code>ip</code> to\nhandle that. Also, the instructions that access local variables by stack slot\nneed to be updated to do so relative to the current CallFrame&rsquo;s <code>slots</code> field.</p>\n<p>We&rsquo;ll start at the top and plow through it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static InterpretResult run() {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 4 lines</div>\n<pre class=\"insert\">  <span class=\"t\">CallFrame</span>* <span class=\"i\">frame</span> = &amp;<span class=\"i\">vm</span>.<span class=\"i\">frames</span>[<span class=\"i\">vm</span>.<span class=\"i\">frameCount</span> - <span class=\"n\">1</span>];\n\n<span class=\"a\">#define READ_BYTE() (*frame-&gt;ip++)</span>\n\n<span class=\"a\">#define READ_SHORT() \\</span>\n<span class=\"a\">    (frame-&gt;ip += 2, \\</span>\n<span class=\"a\">    (uint16_t)((frame-&gt;ip[-2] &lt;&lt; 8) | frame-&gt;ip[-1]))</span>\n\n<span class=\"a\">#define READ_CONSTANT() \\</span>\n<span class=\"a\">    (frame-&gt;function-&gt;chunk.constants.values[READ_BYTE()])</span>\n\n</pre><pre class=\"insert-after\">#define READ_STRING() AS_STRING(READ_CONSTANT())\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 4 lines</div>\n\n<p>First, we store the current topmost CallFrame in a <span\nname=\"local\">local</span> variable inside the main bytecode execution function.\nThen we replace the bytecode access macros with versions that access <code>ip</code>\nthrough that variable.</p>\n<aside name=\"local\">\n<p>We could access the current frame by going through the CallFrame array every\ntime, but that&rsquo;s verbose. More importantly, storing the frame in a local\nvariable encourages the C compiler to keep that pointer in a register. 
That\nspeeds up access to the frame&rsquo;s <code>ip</code>. There&rsquo;s no <em>guarantee</em> that the compiler\nwill do this, but there&rsquo;s a good chance it will.</p>\n</aside>\n<p>Now onto each instruction that needs a little tender loving care.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_GET_LOCAL: {\n        uint8_t slot = READ_BYTE();\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">        <span class=\"i\">push</span>(<span class=\"i\">frame</span>-&gt;<span class=\"i\">slots</span>[<span class=\"i\">slot</span>]);\n</pre><pre class=\"insert-after\">        break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 1 line</div>\n\n<p>Previously, <code>OP_GET_LOCAL</code> read the given local slot directly from the VM&rsquo;s\nstack array, which meant it indexed the slot starting from the bottom of the\nstack. Now, it accesses the current frame&rsquo;s <code>slots</code> array, which means it\naccesses the given numbered slot relative to the beginning of that frame.</p>\n<p>Setting a local variable works the same way.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_SET_LOCAL: {\n        uint8_t slot = READ_BYTE();\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">        <span class=\"i\">frame</span>-&gt;<span class=\"i\">slots</span>[<span class=\"i\">slot</span>] = <span class=\"i\">peek</span>(<span class=\"n\">0</span>);\n</pre><pre class=\"insert-after\">        break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 1 line</div>\n\n<p>The jump instructions used to modify the VM&rsquo;s <code>ip</code> field. 
Now, they do the same\nfor the current frame&rsquo;s <code>ip</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_JUMP: {\n        uint16_t offset = READ_SHORT();\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">        <span class=\"i\">frame</span>-&gt;<span class=\"i\">ip</span> += <span class=\"i\">offset</span>;\n</pre><pre class=\"insert-after\">        break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 1 line</div>\n\n<p>Same with the conditional jump:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_JUMP_IF_FALSE: {\n        uint16_t offset = READ_SHORT();\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">        <span class=\"k\">if</span> (<span class=\"i\">isFalsey</span>(<span class=\"i\">peek</span>(<span class=\"n\">0</span>))) <span class=\"i\">frame</span>-&gt;<span class=\"i\">ip</span> += <span class=\"i\">offset</span>;\n</pre><pre class=\"insert-after\">        break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 1 line</div>\n\n<p>And our backward-jumping loop instruction:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_LOOP: {\n        uint16_t offset = READ_SHORT();\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">        <span class=\"i\">frame</span>-&gt;<span class=\"i\">ip</span> -= <span class=\"i\">offset</span>;\n</pre><pre class=\"insert-after\">        break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 1 line</div>\n\n<p>We have some diagnostic code that prints each instruction as it executes to help\nus debug our VM. 
That needs to work with the new structure too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    printf(&quot;\\n&quot;);\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">    <span class=\"i\">disassembleInstruction</span>(&amp;<span class=\"i\">frame</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>,\n        (<span class=\"t\">int</span>)(<span class=\"i\">frame</span>-&gt;<span class=\"i\">ip</span> - <span class=\"i\">frame</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>.<span class=\"i\">code</span>));\n</pre><pre class=\"insert-after\">#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 2 lines</div>\n\n<p>Instead of passing in the VM&rsquo;s <code>chunk</code> and <code>ip</code> fields, now we read from the\ncurrent CallFrame.</p>\n<p>You know, that wasn&rsquo;t too bad, actually. Most instructions just use the macros\nso didn&rsquo;t need to be touched. 
Next, we jump up a level to the code that calls\n<code>run()</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">InterpretResult interpret(const char* source) {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>interpret</em>()<br>\nreplace 10 lines</div>\n<pre class=\"insert\">  <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span> = <span class=\"i\">compile</span>(<span class=\"i\">source</span>);\n  <span class=\"k\">if</span> (<span class=\"i\">function</span> == <span class=\"a\">NULL</span>) <span class=\"k\">return</span> <span class=\"a\">INTERPRET_COMPILE_ERROR</span>;\n\n  <span class=\"i\">push</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">function</span>));\n  <span class=\"t\">CallFrame</span>* <span class=\"i\">frame</span> = &amp;<span class=\"i\">vm</span>.<span class=\"i\">frames</span>[<span class=\"i\">vm</span>.<span class=\"i\">frameCount</span>++];\n  <span class=\"i\">frame</span>-&gt;<span class=\"i\">function</span> = <span class=\"i\">function</span>;\n  <span class=\"i\">frame</span>-&gt;<span class=\"i\">ip</span> = <span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>.<span class=\"i\">code</span>;\n  <span class=\"i\">frame</span>-&gt;<span class=\"i\">slots</span> = <span class=\"i\">vm</span>.<span class=\"i\">stack</span>;\n</pre><pre class=\"insert-after\">\n\n  InterpretResult result = run();\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>interpret</em>(), replace 10 lines</div>\n\n<p>We finally get to wire up our earlier compiler changes to the back-end changes\nwe just made. First, we pass the source code to the compiler. It returns us a\nnew ObjFunction containing the compiled top-level code. If we get <code>NULL</code> back,\nit means there was some compile-time error which the compiler has already\nreported. 
In that case, we bail out since we can&rsquo;t run anything.</p>\n<p>Otherwise, we store the function on the stack and prepare an initial CallFrame\nto execute its code. Now you can see why the compiler sets aside stack slot zero<span class=\"em\">&mdash;</span>that stores the function being called. In the new CallFrame, we point to the\nfunction, initialize its <code>ip</code> to point to the beginning of the function&rsquo;s\nbytecode, and set up its stack window to start at the very bottom of the VM&rsquo;s\nvalue stack.</p>\n<p>This gets the interpreter ready to start executing code. After finishing, the VM\nused to free the hardcoded chunk. Now that the ObjFunction owns that code, we\ndon&rsquo;t need to do that anymore, so the end of <code>interpret()</code> is simply this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  frame-&gt;slots = vm.stack;\n\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>interpret</em>()<br>\nreplace 4 lines</div>\n<pre class=\"insert\">  <span class=\"k\">return</span> <span class=\"i\">run</span>();\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>interpret</em>(), replace 4 lines</div>\n\n<p>The last piece of code referring to the old VM fields is <code>runtimeError()</code>. 
We&rsquo;ll\nrevisit that later in the chapter, but for now let&rsquo;s change it to this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  fputs(&quot;\\n&quot;, stderr);\n\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>runtimeError</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">  <span class=\"t\">CallFrame</span>* <span class=\"i\">frame</span> = &amp;<span class=\"i\">vm</span>.<span class=\"i\">frames</span>[<span class=\"i\">vm</span>.<span class=\"i\">frameCount</span> - <span class=\"n\">1</span>];\n  <span class=\"t\">size_t</span> <span class=\"i\">instruction</span> = <span class=\"i\">frame</span>-&gt;<span class=\"i\">ip</span> - <span class=\"i\">frame</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>.<span class=\"i\">code</span> - <span class=\"n\">1</span>;\n  <span class=\"t\">int</span> <span class=\"i\">line</span> = <span class=\"i\">frame</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>.<span class=\"i\">lines</span>[<span class=\"i\">instruction</span>];\n</pre><pre class=\"insert-after\">  fprintf(stderr, &quot;[line %d] in script\\n&quot;, line);\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>runtimeError</em>(), replace 2 lines</div>\n\n<p>Instead of reading the chunk and <code>ip</code> directly from the VM, it pulls those from\nthe topmost CallFrame on the stack. That should get the function working again\nand behaving as it did before.</p>\n<p>Assuming we did all of that correctly, we got clox back to a runnable\nstate. Fire it up and it does<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>exactly what it did before. We haven&rsquo;t added\nany new features yet, so this is kind of a letdown. But all of the\ninfrastructure is there and ready for us now. 
Let&rsquo;s take advantage of it.</p>\n<h2><a href=\"#function-declarations\" id=\"function-declarations\"><small>24&#8202;.&#8202;4</small>Function Declarations</a></h2>\n<p>Before we can do call expressions, we need something to call, so we&rsquo;ll do\nfunction declarations first. The <span name=\"fun\">fun</span> starts with a\nkeyword.</p>\n<aside name=\"fun\">\n<p>Yes, I am going to make a dumb joke about the <code>fun</code> keyword every time it\ncomes up.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void declaration() {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>declaration</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_FUN</span>)) {\n    <span class=\"i\">funDeclaration</span>();\n  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_VAR</span>)) {\n</pre><pre class=\"insert-after\">    varDeclaration();\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>declaration</em>(), replace 1 line</div>\n\n<p>That passes control to here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>block</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">funDeclaration</span>() {\n  <span class=\"t\">uint8_t</span> <span class=\"i\">global</span> = <span class=\"i\">parseVariable</span>(<span class=\"s\">&quot;Expect function name.&quot;</span>);\n  <span class=\"i\">markInitialized</span>();\n  <span class=\"i\">function</span>(<span class=\"a\">TYPE_FUNCTION</span>);\n  <span class=\"i\">defineVariable</span>(<span class=\"i\">global</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>block</em>()</div>\n\n<p>Functions are first-class values, and a function declaration simply creates and\nstores one 
in a newly declared variable. So we parse the name just like any\nother variable declaration. A function declaration at the top level will bind\nthe function to a global variable. Inside a block or other function, a function\ndeclaration creates a local variable.</p>\n<p>In an earlier chapter, I explained how variables <a href=\"local-variables.html#another-scope-edge-case\">get defined in two\nstages</a>. This ensures you can&rsquo;t access a variable&rsquo;s value inside the\nvariable&rsquo;s own initializer. That would be bad because the variable doesn&rsquo;t\n<em>have</em> a value yet.</p>\n<p>Functions don&rsquo;t suffer from this problem. It&rsquo;s safe for a function to refer to\nits own name inside its body. You can&rsquo;t <em>call</em> the function and execute the body\nuntil after it&rsquo;s fully defined, so you&rsquo;ll never see the variable in an\nuninitialized state. Practically speaking, it&rsquo;s useful to allow this in order to\nsupport recursive local functions.</p>\n<p>To make that work, we mark the function declaration&rsquo;s variable &ldquo;initialized&rdquo; as\nsoon as we compile the name, before we compile the body. That way the name can\nbe referenced inside the body without generating an error.</p>\n<p>We do need one check, though.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void markInitialized() {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>markInitialized</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">current</span>-&gt;<span class=\"i\">scopeDepth</span> == <span class=\"n\">0</span>) <span class=\"k\">return</span>;\n</pre><pre class=\"insert-after\">  current-&gt;locals[current-&gt;localCount - 1].depth =\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>markInitialized</em>()</div>\n\n<p>Before, we called <code>markInitialized()</code> only when we already knew we were in a\nlocal scope. 
Now, a top-level function declaration will also call this function.\nWhen that happens, there is no local variable to mark initialized<span class=\"em\">&mdash;</span>the\nfunction is bound to a global variable.</p>\n<p>Next, we compile the function itself<span class=\"em\">&mdash;</span>its parameter list and block body. For\nthat, we use a separate helper function. That helper generates code that\nleaves the resulting function object on top of the stack. After that, we call\n<code>defineVariable()</code> to store that function back into the variable we declared for\nit.</p>\n<p>I split out the code to compile the parameters and body because we&rsquo;ll reuse it\nlater for parsing method declarations inside classes. Let&rsquo;s build it\nincrementally, starting with this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>block</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">function</span>(<span class=\"t\">FunctionType</span> <span class=\"i\">type</span>) {\n  <span class=\"t\">Compiler</span> <span class=\"i\">compiler</span>;\n  <span class=\"i\">initCompiler</span>(&amp;<span class=\"i\">compiler</span>, <span class=\"i\">type</span>);\n  <span class=\"i\">beginScope</span>();<span name=\"no-end-scope\"> </span>\n\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_LEFT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;(&#39; after function name.&quot;</span>);\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_RIGHT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;)&#39; after parameters.&quot;</span>);\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_LEFT_BRACE</span>, <span class=\"s\">&quot;Expect &#39;{&#39; before function body.&quot;</span>);\n  <span class=\"i\">block</span>();\n\n  <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span> = <span class=\"i\">endCompiler</span>();\n  <span 
class=\"i\">emitBytes</span>(<span class=\"a\">OP_CONSTANT</span>, <span class=\"i\">makeConstant</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">function</span>)));\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>block</em>()</div>\n\n<aside name=\"no-end-scope\">\n<p>This <code>beginScope()</code> doesn&rsquo;t have a corresponding <code>endScope()</code> call. Because we\nend the Compiler completely when we reach the end of the function body, there&rsquo;s no\nneed to close the lingering outermost scope.</p>\n</aside>\n<p>For now, we won&rsquo;t worry about parameters. We parse an empty pair of parentheses\nfollowed by the body. The body starts with a left curly brace, which we parse\nhere. Then we call our existing <code>block()</code> function, which knows how to compile\nthe rest of a block including the closing brace.</p>\n<h3><a href=\"#a-stack-of-compilers\" id=\"a-stack-of-compilers\"><small>24&#8202;.&#8202;4&#8202;.&#8202;1</small>A stack of compilers</a></h3>\n<p>The interesting parts are the compiler stuff at the top and bottom. The Compiler\nstruct stores data like which slots are owned by which local variables, how many\nblocks of nesting we&rsquo;re currently in, etc. All of that is specific to a single\nfunction. But now the front end needs to handle compiling multiple functions\n<span name=\"nested\">nested</span> within each other.</p>\n<aside name=\"nested\">\n<p>Remember that the compiler treats top-level code as the body of an implicit\nfunction, so as soon as we add <em>any</em> function declarations, we&rsquo;re in a world of\nnested functions.</p>\n</aside>\n<p>The trick for managing that is to create a separate Compiler for each function\nbeing compiled. When we start compiling a function declaration, we create a new\nCompiler on the C stack and initialize it. <code>initCompiler()</code> sets that Compiler\nto be the current one. 
Then, as we compile the body, all of the functions that\nemit bytecode write to the chunk owned by the new Compiler&rsquo;s function.</p>\n<p>After we reach the end of the function&rsquo;s block body, we call <code>endCompiler()</code>.\nThat yields the newly compiled function object, which we store as a constant in\nthe <em>surrounding</em> function&rsquo;s constant table. But, wait, how do we get back to\nthe surrounding function? We lost it when <code>initCompiler()</code> overwrote the current\ncompiler pointer.</p>\n<p>We fix that by treating the series of nested Compiler structs as a stack. Unlike\nthe Value and CallFrame stacks in the VM, we won&rsquo;t use an array. Instead, we use\na linked list. Each Compiler points back to the Compiler for the function that\nencloses it, all the way back to the root Compiler for the top-level code.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} FunctionType;\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after enum <em>FunctionType</em><br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">typedef</span> <span class=\"k\">struct</span> <span class=\"t\">Compiler</span> {\n  <span class=\"k\">struct</span> <span class=\"t\">Compiler</span>* <span class=\"i\">enclosing</span>;\n</pre><pre class=\"insert-after\">  ObjFunction* function;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after enum <em>FunctionType</em>, replace 1 line</div>\n\n<p>Inside the Compiler struct, we can&rsquo;t reference the Compiler <em>typedef</em> since that\ndeclaration hasn&rsquo;t finished yet. Instead, we give a name to the struct itself\nand use that for the field&rsquo;s type. 
C is weird.</p>\n<p>When initializing a new Compiler, we capture the about-to-no-longer-be-current\none in that pointer.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void initCompiler(Compiler* compiler, FunctionType type) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>initCompiler</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">compiler</span>-&gt;<span class=\"i\">enclosing</span> = <span class=\"i\">current</span>;\n</pre><pre class=\"insert-after\">  compiler-&gt;function = NULL;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>initCompiler</em>()</div>\n\n<p>Then when a Compiler finishes, it pops itself off the stack by restoring the\nprevious compiler to be the new current one.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#endif\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>endCompiler</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">current</span> = <span class=\"i\">current</span>-&gt;<span class=\"i\">enclosing</span>;\n</pre><pre class=\"insert-after\">  return function;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>endCompiler</em>()</div>\n\n<p>Note that we don&rsquo;t even need to <span name=\"compiler\">dynamically</span>\nallocate the Compiler structs. Each is stored as a local variable on the C stack<span class=\"em\">&mdash;</span>either in <code>compile()</code> or <code>function()</code>. The linked list of Compilers threads\nthrough the C stack. We can get an unbounded number of them\nbecause our compiler uses recursive descent, so <code>function()</code> ends up calling\nitself recursively when you have nested function declarations.</p>\n<aside name=\"compiler\">\n<p>Using the native stack for Compiler structs does mean our compiler has a\npractical limit on how deeply nested function declarations can be. Go too far\nand you could overflow the C stack. 
If we want the compiler to be more robust\nagainst pathological or even malicious code<span class=\"em\">&mdash;</span>a real concern for tools like\nJavaScript VMs<span class=\"em\">&mdash;</span>it would be good to have our compiler artificially limit the\namount of function nesting it permits.</p>\n</aside>\n<h3><a href=\"#function-parameters\" id=\"function-parameters\"><small>24&#8202;.&#8202;4&#8202;.&#8202;2</small>Function parameters</a></h3>\n<p>Functions aren&rsquo;t very useful if you can&rsquo;t pass arguments to them, so let&rsquo;s do\nparameters next.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  consume(TOKEN_LEFT_PAREN, &quot;Expect '(' after function name.&quot;);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>function</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (!<span class=\"i\">check</span>(<span class=\"a\">TOKEN_RIGHT_PAREN</span>)) {\n    <span class=\"k\">do</span> {\n      <span class=\"i\">current</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">arity</span>++;\n      <span class=\"k\">if</span> (<span class=\"i\">current</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">arity</span> &gt; <span class=\"n\">255</span>) {\n        <span class=\"i\">errorAtCurrent</span>(<span class=\"s\">&quot;Can&#39;t have more than 255 parameters.&quot;</span>);\n      }\n      <span class=\"t\">uint8_t</span> <span class=\"i\">constant</span> = <span class=\"i\">parseVariable</span>(<span class=\"s\">&quot;Expect parameter name.&quot;</span>);\n      <span class=\"i\">defineVariable</span>(<span class=\"i\">constant</span>);\n    } <span class=\"k\">while</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_COMMA</span>));\n  }\n</pre><pre class=\"insert-after\">  consume(TOKEN_RIGHT_PAREN, &quot;Expect ')' after parameters.&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>function</em>()</div>\n\n<p>Semantically, a 
parameter is simply a local variable declared in the outermost\nlexical scope of the function body. We get to use the existing compiler support\nfor declaring named local variables to parse and compile parameters. Unlike\nlocal variables, which have initializers, there&rsquo;s no code here to initialize the\nparameter&rsquo;s value. We&rsquo;ll see how they are initialized later when we do argument\npassing in function calls.</p>\n<p>While we&rsquo;re at it, we note the function&rsquo;s arity by counting how many parameters\nwe parse. The other piece of metadata we store with a function is its name. When\ncompiling a function declaration, we call <code>initCompiler()</code> right after we parse\nthe function&rsquo;s name. That means we can grab the name right then from the\nprevious token.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  current = compiler;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>initCompiler</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">type</span> != <span class=\"a\">TYPE_SCRIPT</span>) {\n    <span class=\"i\">current</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">name</span> = <span class=\"i\">copyString</span>(<span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">start</span>,\n                                         <span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">length</span>);\n  }\n</pre><pre class=\"insert-after\">\n\n  Local* local = &amp;current-&gt;locals[current-&gt;localCount++];\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>initCompiler</em>()</div>\n\n<p>Note that we&rsquo;re careful to create a copy of the name string. Remember, the\nlexeme points directly into the original source code string. That string may get\nfreed once the code is finished compiling. 
The function object we create in the\ncompiler outlives the compiler and persists until runtime. So it needs its own\nheap-allocated name string that it can keep around.</p>\n<p>Rad. Now we can compile function declarations, like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">areWeHavingItYet</span>() {\n  <span class=\"k\">print</span> <span class=\"s\">&quot;Yes we are!&quot;</span>;\n}\n\n<span class=\"k\">print</span> <span class=\"i\">areWeHavingItYet</span>;\n</pre></div>\n<p>We just can&rsquo;t do anything <span name=\"useful\">useful</span> with them.</p>\n<aside name=\"useful\">\n<p>We can print them! I guess that&rsquo;s not very useful, though.</p>\n</aside>\n<h2><a href=\"#function-calls\" id=\"function-calls\"><small>24&#8202;.&#8202;5</small>Function Calls</a></h2>\n<p>By the end of this section, we&rsquo;ll start to see some interesting behavior. The\nnext step is calling functions. We don&rsquo;t usually think of it this way, but a\nfunction call expression is kind of an infix <code>(</code> operator. You have a\nhigh-precedence expression on the left for the thing being called<span class=\"em\">&mdash;</span>usually\njust a single identifier. 
Then the <code>(</code> in the middle, followed by the argument\nexpressions separated by commas, and a final <code>)</code> to wrap it up at the end.</p>\n<p>That odd grammatical perspective explains how to hook the syntax into our\nparsing table.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">ParseRule rules[] = {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>unary</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_LEFT_PAREN</span>]    = {<span class=\"i\">grouping</span>, <span class=\"i\">call</span>,   <span class=\"a\">PREC_CALL</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_RIGHT_PAREN]   = {NULL,     NULL,   PREC_NONE},\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>unary</em>(), replace 1 line</div>\n\n<p>When the parser encounters a left parenthesis following an expression, it\ndispatches to a new parser function.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>binary</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">call</span>(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n  <span class=\"t\">uint8_t</span> <span class=\"i\">argCount</span> = <span class=\"i\">argumentList</span>();\n  <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_CALL</span>, <span class=\"i\">argCount</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>binary</em>()</div>\n\n<p>We&rsquo;ve already consumed the <code>(</code> token, so next we compile the arguments using a\nseparate <code>argumentList()</code> helper. That function returns the number of arguments\nit compiled. Each argument expression generates code that leaves its value on\nthe stack in preparation for the call. 
After that, we emit a new <code>OP_CALL</code>\ninstruction to invoke the function, using the argument count as an operand.</p>\n<p>We compile the arguments using this friend:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>defineVariable</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">uint8_t</span> <span class=\"i\">argumentList</span>() {\n  <span class=\"t\">uint8_t</span> <span class=\"i\">argCount</span> = <span class=\"n\">0</span>;\n  <span class=\"k\">if</span> (!<span class=\"i\">check</span>(<span class=\"a\">TOKEN_RIGHT_PAREN</span>)) {\n    <span class=\"k\">do</span> {\n      <span class=\"i\">expression</span>();\n      <span class=\"i\">argCount</span>++;\n    } <span class=\"k\">while</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_COMMA</span>));\n  }\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_RIGHT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;)&#39; after arguments.&quot;</span>);\n  <span class=\"k\">return</span> <span class=\"i\">argCount</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>defineVariable</em>()</div>\n\n<p>That code should look familiar from jlox. We chew through arguments as long as\nwe find commas after each expression. Once we run out, we consume the final\nclosing parenthesis and we&rsquo;re done.</p>\n<p>Well, almost. Back in jlox, we added a compile-time check that you don&rsquo;t pass\nmore than 255 arguments to a call. At the time, I said that was because clox\nwould need a similar limit. Now you can see why<span class=\"em\">&mdash;</span>since we stuff the argument\ncount into the bytecode as a single-byte operand, we can only go up to 255. 
We\nneed to verify that in this compiler too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      expression();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>argumentList</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">if</span> (<span class=\"i\">argCount</span> == <span class=\"n\">255</span>) {\n        <span class=\"i\">error</span>(<span class=\"s\">&quot;Can&#39;t have more than 255 arguments.&quot;</span>);\n      }\n</pre><pre class=\"insert-after\">      argCount++;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>argumentList</em>()</div>\n\n<p>That&rsquo;s the front end. Let&rsquo;s skip over to the back end, with a quick stop in the\nmiddle to declare the new instruction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_LOOP,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_CALL</span>,\n</pre><pre class=\"insert-after\">  OP_RETURN,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<h3><a href=\"#binding-arguments-to-parameters\" id=\"binding-arguments-to-parameters\"><small>24&#8202;.&#8202;5&#8202;.&#8202;1</small>Binding arguments to parameters</a></h3>\n<p>Before we get to the implementation, we should think about what the stack looks\nlike at the point of a call and what we need to do from there. When we reach the\ncall instruction, we have already executed the expression for the function being\ncalled, followed by its arguments. 
Say our program looks like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">sum</span>(<span class=\"i\">a</span>, <span class=\"i\">b</span>, <span class=\"i\">c</span>) {\n  <span class=\"k\">return</span> <span class=\"i\">a</span> + <span class=\"i\">b</span> + <span class=\"i\">c</span>;\n}\n\n<span class=\"k\">print</span> <span class=\"n\">4</span> + <span class=\"i\">sum</span>(<span class=\"n\">5</span>, <span class=\"n\">6</span>, <span class=\"n\">7</span>);\n</pre></div>\n<p>If we pause the VM right on the <code>OP_CALL</code> instruction for that call to <code>sum()</code>,\nthe stack looks like this:</p><img src=\"image/calls-and-functions/argument-stack.png\" alt=\"Stack: 4, fn sum, 5, 6, 7.\" />\n<p>Picture this from the perspective of <code>sum()</code> itself. When the compiler compiled\n<code>sum()</code>, it automatically allocated slot zero. Then, after that, it allocated\nlocal slots for the parameters <code>a</code>, <code>b</code>, and <code>c</code>, in order. To perform a call to\n<code>sum()</code>, we need a CallFrame initialized with the function being called and a\nregion of stack slots that it can use. Then we need to collect the arguments\npassed to the function and get them into the corresponding slots for the\nparameters.</p>\n<p>When the VM starts executing the body of <code>sum()</code>, we want its stack window to\nlook like this:</p><img src=\"image/calls-and-functions/parameter-window.png\" alt=\"The same stack with the sum() function's call frame window surrounding fn sum, 5, 6, and 7.\" />\n<p>Do you notice how the argument slots that the caller sets up and the parameter\nslots the callee needs are both in exactly the right order? How convenient! This\nis no coincidence. When I talked about each CallFrame having its own window into\nthe stack, I never said those windows must be <em>disjoint</em>. 
There&rsquo;s nothing\npreventing us from overlapping them, like this:</p><img src=\"image/calls-and-functions/overlapping-windows.png\" alt=\"The same stack with the top-level call frame covering the entire stack and the sum() function's call frame window surrounding fn sum, 5, 6, and 7.\" />\n<p><span name=\"lua\">The</span> top of the caller&rsquo;s stack contains the function\nbeing called followed by the arguments in order. We know the caller doesn&rsquo;t have\nany other slots above those in use because any temporaries needed when\nevaluating argument expressions have been discarded by now. The bottom of the\ncallee&rsquo;s stack overlaps so that the parameter slots exactly line up with where\nthe argument values already live.</p>\n<aside name=\"lua\">\n<p>Different bytecode VMs and real CPU architectures have different <em>calling\nconventions</em>, which is the specific mechanism they use to pass arguments, store\nthe return address, etc. The mechanism I use here is based on Lua&rsquo;s clean, fast\nvirtual machine.</p>\n</aside>\n<p>This means that we don&rsquo;t need to do <em>any</em> work to &ldquo;bind an argument to a\nparameter&rdquo;. There&rsquo;s no copying values between slots or across environments. The\narguments are already exactly where they need to be. 
It&rsquo;s hard to beat that for\nperformance.</p>\n<p>Time to implement the call instruction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_CALL</span>: {\n        <span class=\"t\">int</span> <span class=\"i\">argCount</span> = <span class=\"a\">READ_BYTE</span>();\n        <span class=\"k\">if</span> (!<span class=\"i\">callValue</span>(<span class=\"i\">peek</span>(<span class=\"i\">argCount</span>), <span class=\"i\">argCount</span>)) {\n          <span class=\"k\">return</span> <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>;\n        }\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_RETURN: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>We need to know the function being called and the number of arguments passed to\nit. We get the latter from the instruction&rsquo;s operand. That also tells us where\nto find the function on the stack by counting past the argument slots from the\ntop of the stack. We hand that data off to a separate <code>callValue()</code> function. If\nthat returns <code>false</code>, it means the call caused some sort of runtime error. When\nthat happens, we abort the interpreter.</p>\n<p>If <code>callValue()</code> is successful, there will be a new frame on the CallFrame stack\nfor the called function. 
The <code>run()</code> function has its own cached pointer to the\ncurrent frame, so we need to update that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">          return INTERPRET_RUNTIME_ERROR;\n        }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">        <span class=\"i\">frame</span> = &amp;<span class=\"i\">vm</span>.<span class=\"i\">frames</span>[<span class=\"i\">vm</span>.<span class=\"i\">frameCount</span> - <span class=\"n\">1</span>];\n</pre><pre class=\"insert-after\">        break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>Since the bytecode dispatch loop reads from that <code>frame</code> variable, when the VM\ngoes to execute the next instruction, it will read the <code>ip</code> from the newly\ncalled function&rsquo;s CallFrame and jump to its code. The work for executing that\ncall begins here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>peek</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">callValue</span>(<span class=\"t\">Value</span> <span class=\"i\">callee</span>, <span class=\"t\">int</span> <span class=\"i\">argCount</span>) {\n  <span class=\"k\">if</span> (<span class=\"a\">IS_OBJ</span>(<span class=\"i\">callee</span>)) {\n    <span class=\"k\">switch</span> (<span class=\"a\">OBJ_TYPE</span>(<span class=\"i\">callee</span>)) {\n      <span class=\"k\">case</span> <span class=\"a\">OBJ_FUNCTION</span>:<span name=\"switch\"> </span>\n        <span class=\"k\">return</span> <span class=\"i\">call</span>(<span class=\"a\">AS_FUNCTION</span>(<span class=\"i\">callee</span>), <span class=\"i\">argCount</span>);\n      <span class=\"k\">default</span>:\n        <span class=\"k\">break</span>; <span class=\"c\">// Non-callable object type.</span>\n    }\n  }\n  <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Can 
only call functions and classes.&quot;</span>);\n  <span class=\"k\">return</span> <span class=\"k\">false</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>peek</em>()</div>\n\n<aside name=\"switch\">\n<p>Using a <code>switch</code> statement to check a single type is overkill now, but will make\nsense when we add cases to handle other callable types.</p>\n</aside>\n<p>There&rsquo;s more going on here than just initializing a new CallFrame. Because Lox\nis dynamically typed, there&rsquo;s nothing to prevent a user from writing bad code\nlike:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">notAFunction</span> = <span class=\"n\">123</span>;\n<span class=\"i\">notAFunction</span>();\n</pre></div>\n<p>If that happens, the runtime needs to safely report an error and halt. So the\nfirst thing we do is check the type of the value that we&rsquo;re trying to call. If\nit&rsquo;s not a function, we error out. Otherwise, the actual call happens here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>peek</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">call</span>(<span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span>, <span class=\"t\">int</span> <span class=\"i\">argCount</span>) {\n  <span class=\"t\">CallFrame</span>* <span class=\"i\">frame</span> = &amp;<span class=\"i\">vm</span>.<span class=\"i\">frames</span>[<span class=\"i\">vm</span>.<span class=\"i\">frameCount</span>++];\n  <span class=\"i\">frame</span>-&gt;<span class=\"i\">function</span> = <span class=\"i\">function</span>;\n  <span class=\"i\">frame</span>-&gt;<span class=\"i\">ip</span> = <span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>.<span class=\"i\">code</span>;\n  <span class=\"i\">frame</span>-&gt;<span class=\"i\">slots</span> = <span class=\"i\">vm</span>.<span class=\"i\">stackTop</span> - <span 
class=\"i\">argCount</span> - <span class=\"n\">1</span>;\n  <span class=\"k\">return</span> <span class=\"k\">true</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>peek</em>()</div>\n\n<p>This simply initializes the next CallFrame on the stack. It stores a pointer to\nthe function being called and points the frame&rsquo;s <code>ip</code> to the beginning of the\nfunction&rsquo;s bytecode. Finally, it sets up the <code>slots</code> pointer to give the frame\nits window into the stack. The arithmetic there ensures that the arguments\nalready on the stack line up with the function&rsquo;s parameters:</p><img src=\"image/calls-and-functions/arithmetic.png\" alt=\"The arithmetic to calculate frame-&gt;slots from stackTop and argCount.\" />\n<p>The funny little <code>- 1</code> is to account for stack slot zero which the compiler set\naside for when we add methods later. The parameters start at slot one so we\nmake the window start one slot earlier to align them with the arguments.</p>\n<p>Before we move on, let&rsquo;s add the new instruction to our disassembler.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return jumpInstruction(&quot;OP_LOOP&quot;, -1, chunk, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_CALL</span>:\n      <span class=\"k\">return</span> <span class=\"i\">byteInstruction</span>(<span class=\"s\">&quot;OP_CALL&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_RETURN:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>And one more quick side trip. 
Now that we have a handy function for initiating a\nCallFrame, we may as well use it to set up the first frame for executing the\ntop-level code.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  push(OBJ_VAL(function));\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>interpret</em>()<br>\nreplace 4 lines</div>\n<pre class=\"insert\">  <span class=\"i\">call</span>(<span class=\"i\">function</span>, <span class=\"n\">0</span>);\n</pre><pre class=\"insert-after\">\n\n  return run();\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>interpret</em>(), replace 4 lines</div>\n\n<p>OK, now back to calls<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<h3><a href=\"#runtime-error-checking\" id=\"runtime-error-checking\"><small>24&#8202;.&#8202;5&#8202;.&#8202;2</small>Runtime error checking</a></h3>\n<p>The overlapping stack windows work based on the assumption that a call passes\nexactly one argument for each of the function&rsquo;s parameters. But, again, because\nLox ain&rsquo;t statically typed, a foolish user could pass too many or too few\narguments. 
In Lox, we&rsquo;ve defined that to be a runtime error, which we report\nlike so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static bool call(ObjFunction* function, int argCount) {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>call</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">argCount</span> != <span class=\"i\">function</span>-&gt;<span class=\"i\">arity</span>) {\n    <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Expected %d arguments but got %d.&quot;</span>,\n        <span class=\"i\">function</span>-&gt;<span class=\"i\">arity</span>, <span class=\"i\">argCount</span>);\n    <span class=\"k\">return</span> <span class=\"k\">false</span>;\n  }\n\n</pre><pre class=\"insert-after\">  CallFrame* frame = &amp;vm.frames[vm.frameCount++];\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>call</em>()</div>\n\n<p>Pretty straightforward. This is why we store the arity of each function inside\nthe ObjFunction for it.</p>\n<p>There&rsquo;s another error we need to report that&rsquo;s less to do with the user&rsquo;s\nfoolishness than our own. 
Because the CallFrame array has a fixed size, we need\nto ensure a deep call chain doesn&rsquo;t overflow it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  }\n\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>call</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">vm</span>.<span class=\"i\">frameCount</span> == <span class=\"a\">FRAMES_MAX</span>) {\n    <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Stack overflow.&quot;</span>);\n    <span class=\"k\">return</span> <span class=\"k\">false</span>;\n  }\n\n</pre><pre class=\"insert-after\">  CallFrame* frame = &amp;vm.frames[vm.frameCount++];\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>call</em>()</div>\n\n<p>In practice, if a program gets anywhere close to this limit, there&rsquo;s most likely\na bug in some runaway recursive code.</p>\n<h3><a href=\"#printing-stack-traces\" id=\"printing-stack-traces\"><small>24&#8202;.&#8202;5&#8202;.&#8202;3</small>Printing stack traces</a></h3>\n<p>While we&rsquo;re on the subject of runtime errors, let&rsquo;s spend a little time making\nthem more useful. Stopping on a runtime error is important to prevent the VM\nfrom crashing and burning in some ill-defined way. But simply aborting doesn&rsquo;t\nhelp the user fix their code that <em>caused</em> that error.</p>\n<p>The classic tool to aid debugging runtime failures is a <strong>stack trace</strong><span class=\"em\">&mdash;</span>a\nprint out of each function that was still executing when the program died, and\nwhere the execution was at the point that it died. Now that we have a call stack\nand we&rsquo;ve conveniently stored each function&rsquo;s name, we can show that entire\nstack when a runtime error disrupts the harmony of the user&rsquo;s existence. 
It\nlooks like this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  fputs(&quot;\\n&quot;, stderr);\n\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>runtimeError</em>()<br>\nreplace 4 lines</div>\n<pre class=\"insert\">  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"i\">vm</span>.<span class=\"i\">frameCount</span> - <span class=\"n\">1</span>; <span class=\"i\">i</span> &gt;= <span class=\"n\">0</span>; <span class=\"i\">i</span>--) {\n    <span class=\"t\">CallFrame</span>* <span class=\"i\">frame</span> = &amp;<span class=\"i\">vm</span>.<span class=\"i\">frames</span>[<span class=\"i\">i</span>];\n    <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span> = <span class=\"i\">frame</span>-&gt;<span class=\"i\">function</span>;\n    <span class=\"t\">size_t</span> <span class=\"i\">instruction</span> = <span class=\"i\">frame</span>-&gt;<span class=\"i\">ip</span> - <span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>.<span class=\"i\">code</span> - <span class=\"n\">1</span>;\n    <span class=\"i\">fprintf</span>(<span class=\"i\">stderr</span>, <span class=\"s\">&quot;[line %d] in &quot;</span>,<span name=\"minus\"> </span>\n            <span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>.<span class=\"i\">lines</span>[<span class=\"i\">instruction</span>]);\n    <span class=\"k\">if</span> (<span class=\"i\">function</span>-&gt;<span class=\"i\">name</span> == <span class=\"a\">NULL</span>) {\n      <span class=\"i\">fprintf</span>(<span class=\"i\">stderr</span>, <span class=\"s\">&quot;script</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n    } <span class=\"k\">else</span> {\n      <span class=\"i\">fprintf</span>(<span class=\"i\">stderr</span>, <span class=\"s\">&quot;%s()</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, <span class=\"i\">function</span>-&gt;<span 
class=\"i\">name</span>-&gt;<span class=\"i\">chars</span>);\n    }\n  }\n\n</pre><pre class=\"insert-after\">  resetStack();\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>runtimeError</em>(), replace 4 lines</div>\n\n<aside name=\"minus\">\n<p>The <code>- 1</code> is because the IP is already sitting on the next instruction to be\nexecuted but we want the stack trace to point to the previous failed\ninstruction.</p>\n</aside>\n<p>After printing the error message itself, we walk the call stack from <span\nname=\"top\">top</span> (the most recently called function) to bottom (the\ntop-level code). For each frame, we find the line number that corresponds to the\ncurrent <code>ip</code> inside that frame&rsquo;s function. Then we print that line number along\nwith the function name.</p>\n<aside name=\"top\">\n<p>There is some disagreement on which order stack frames should be shown in a\ntrace. Most put the innermost function as the first line and work their way\ntowards the bottom of the stack. Python prints them out in the opposite order.\nSo reading from top to bottom tells you how your program got to where it is, and\nthe last line is where the error actually occurred.</p>\n<p>There&rsquo;s a logic to that style. It ensures you can always see the innermost\nfunction even if the stack trace is too long to fit on one screen. On the other\nhand, the &ldquo;<a href=\"https://en.wikipedia.org/wiki/Inverted_pyramid_(journalism)\">inverted pyramid</a>&rdquo; from journalism tells us we should put the most\nimportant information <em>first</em> in a block of text. In a stack trace, that&rsquo;s the\nfunction where the error actually occurred. 
Most other language implementations\ndo that.</p>\n</aside>\n<p>For example, if you run this broken program:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">a</span>() { <span class=\"i\">b</span>(); }\n<span class=\"k\">fun</span> <span class=\"i\">b</span>() { <span class=\"i\">c</span>(); }\n<span class=\"k\">fun</span> <span class=\"i\">c</span>() {\n  <span class=\"i\">c</span>(<span class=\"s\">&quot;too&quot;</span>, <span class=\"s\">&quot;many&quot;</span>);\n}\n\n<span class=\"i\">a</span>();\n</pre></div>\n<p>It prints out:</p>\n<div class=\"codehilite\"><pre>Expected 0 arguments but got 2.\n[line 4] in c()\n[line 2] in b()\n[line 1] in a()\n[line 7] in script\n</pre></div>\n<p>That doesn&rsquo;t look too bad, does it?</p>\n<h3><a href=\"#returning-from-functions\" id=\"returning-from-functions\"><small>24&#8202;.&#8202;5&#8202;.&#8202;4</small>Returning from functions</a></h3>\n<p>We&rsquo;re getting close. We can call functions, and the VM will execute them. But we\ncan&rsquo;t <em>return</em> from them yet. We&rsquo;ve had an <code>OP_RETURN</code> instruction for quite\nsome time, but it&rsquo;s always had some kind of temporary code hanging out in it\njust to get us out of the bytecode loop. 
The time has arrived for a real\nimplementation.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_RETURN: {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">        <span class=\"t\">Value</span> <span class=\"i\">result</span> = <span class=\"i\">pop</span>();\n        <span class=\"i\">vm</span>.<span class=\"i\">frameCount</span>--;\n        <span class=\"k\">if</span> (<span class=\"i\">vm</span>.<span class=\"i\">frameCount</span> == <span class=\"n\">0</span>) {\n          <span class=\"i\">pop</span>();\n          <span class=\"k\">return</span> <span class=\"a\">INTERPRET_OK</span>;\n        }\n\n        <span class=\"i\">vm</span>.<span class=\"i\">stackTop</span> = <span class=\"i\">frame</span>-&gt;<span class=\"i\">slots</span>;\n        <span class=\"i\">push</span>(<span class=\"i\">result</span>);\n        <span class=\"i\">frame</span> = &amp;<span class=\"i\">vm</span>.<span class=\"i\">frames</span>[<span class=\"i\">vm</span>.<span class=\"i\">frameCount</span> - <span class=\"n\">1</span>];\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      }\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 2 lines</div>\n\n<p>When a function returns a value, that value will be on top of the stack. We&rsquo;re\nabout to discard the called function&rsquo;s entire stack window, so we pop that\nreturn value off and hang on to it. Then we discard the CallFrame for the\nreturning function. If that was the very last CallFrame, it means we&rsquo;ve finished\nexecuting the top-level code. The entire program is done, so we pop the main\nscript function from the stack and then exit the interpreter.</p>\n<p>Otherwise, we discard all of the slots the callee was using for its parameters\nand local variables. That includes the same slots the caller used to pass the\narguments. 
Now that the call is done, the caller doesn&rsquo;t need them anymore. This\nmeans the top of the stack ends up right at the beginning of the returning\nfunction&rsquo;s stack window.</p>\n<p>We push the return value back onto the stack at that new, lower location. Then\nwe update the <code>run()</code> function&rsquo;s cached pointer to the current frame. Just like\nwhen we began a call, on the next iteration of the bytecode dispatch loop, the\nVM will read <code>ip</code> from that frame, and execution will jump back to the caller,\nright where it left off, immediately after the <code>OP_CALL</code> instruction.</p><img src=\"image/calls-and-functions/return.png\" alt=\"Each step of the return process: popping the return value, discarding the call frame, pushing the return value.\" />\n<p>Note that we assume here that the function <em>did</em> actually return a value, but\na function can implicitly return by reaching the end of its body:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">noReturn</span>() {\n  <span class=\"k\">print</span> <span class=\"s\">&quot;Do stuff&quot;</span>;\n  <span class=\"c\">// No return here.</span>\n}\n\n<span class=\"k\">print</span> <span class=\"i\">noReturn</span>(); <span class=\"c\">// ???</span>\n</pre></div>\n<p>We need to handle that correctly too. The language is specified to implicitly\nreturn <code>nil</code> in that case. 
To make that happen, we add this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void emitReturn() {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>emitReturn</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">emitByte</span>(<span class=\"a\">OP_NIL</span>);\n</pre><pre class=\"insert-after\">  emitByte(OP_RETURN);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>emitReturn</em>()</div>\n\n<p>The compiler calls <code>emitReturn()</code> to write the <code>OP_RETURN</code> instruction at the\nend of a function body. Now, before that, it emits an instruction to push <code>nil</code>\nonto the stack. And with that, we have working function calls! They can even\ntake parameters! It almost looks like we know what we&rsquo;re doing here.</p>\n<h2><a href=\"#return-statements\" id=\"return-statements\"><small>24&#8202;.&#8202;6</small>Return Statements</a></h2>\n<p>If you want a function that returns something other than the implicit <code>nil</code>, you\nneed a <code>return</code> statement. 
Let&rsquo;s get that working.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    ifStatement();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>statement</em>()</div>\n<pre class=\"insert\">  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_RETURN</span>)) {\n    <span class=\"i\">returnStatement</span>();\n</pre><pre class=\"insert-after\">  } else if (match(TOKEN_WHILE)) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>statement</em>()</div>\n\n<p>When the compiler sees a <code>return</code> keyword, it goes here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>printStatement</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">returnStatement</span>() {\n  <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_SEMICOLON</span>)) {\n    <span class=\"i\">emitReturn</span>();\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">expression</span>();\n    <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_SEMICOLON</span>, <span class=\"s\">&quot;Expect &#39;;&#39; after return value.&quot;</span>);\n    <span class=\"i\">emitByte</span>(<span class=\"a\">OP_RETURN</span>);\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>printStatement</em>()</div>\n\n<p>The return value expression is optional, so the parser looks for a semicolon\ntoken to tell if a value was provided. If there is no return value, the\nstatement implicitly returns <code>nil</code>. We implement that by calling <code>emitReturn()</code>,\nwhich emits an <code>OP_NIL</code> instruction followed by <code>OP_RETURN</code>. 
Otherwise, we compile the return value\nexpression and return it with an <code>OP_RETURN</code> instruction.</p>\n<p>This is the same <code>OP_RETURN</code> instruction we&rsquo;ve already implemented<span class=\"em\">&mdash;</span>we don&rsquo;t\nneed any new runtime code. This is quite a difference from jlox. There, we had\nto use exceptions to unwind the stack when a <code>return</code> statement was executed.\nThat was because you could return from deep inside some nested blocks. Since\njlox recursively walks the AST, that meant there were a bunch of Java method\ncalls we needed to escape out of.</p>\n<p>Our bytecode compiler flattens that all out. We do recursive descent during\nparsing, but at runtime, the VM&rsquo;s bytecode dispatch loop is completely flat.\nThere is no recursion going on at the C level at all. So returning, even from\nwithin some nested blocks, is as straightforward as returning from the end of\nthe function&rsquo;s body.</p>\n<p>We&rsquo;re not totally done, though. The new <code>return</code> statement gives us a new\ncompile error to worry about. Returns are useful for returning from functions,\nbut the top level of a Lox program is imperative code too. You shouldn&rsquo;t be able\nto <span name=\"worst\">return</span> from there.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">return</span> <span class=\"s\">&quot;What?!&quot;</span>;\n</pre></div>\n<aside name=\"worst\">\n<p>Allowing <code>return</code> at the top level isn&rsquo;t the worst idea in the world. It would\ngive you a natural way to terminate a script early. 
You could maybe even use a\nreturned number to indicate the process&rsquo;s exit code.</p>\n</aside>\n<p>We&rsquo;ve specified that it&rsquo;s a compile error to have a <code>return</code> statement outside\nof any function, which we implement like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void returnStatement() {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>returnStatement</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">current</span>-&gt;<span class=\"i\">type</span> == <span class=\"a\">TYPE_SCRIPT</span>) {\n    <span class=\"i\">error</span>(<span class=\"s\">&quot;Can&#39;t return from top-level code.&quot;</span>);\n  }\n\n</pre><pre class=\"insert-after\">  if (match(TOKEN_SEMICOLON)) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>returnStatement</em>()</div>\n\n<p>This is one of the reasons we added that FunctionType enum to the compiler.</p>\n<h2><a href=\"#native-functions\" id=\"native-functions\"><small>24&#8202;.&#8202;7</small>Native Functions</a></h2>\n<p>Our VM is getting more powerful. We&rsquo;ve got functions, calls, parameters,\nreturns. You can define lots of different functions that can call each other in\ninteresting ways. But, ultimately, they can&rsquo;t really <em>do</em> anything. The only\nuser-visible thing a Lox program can do, regardless of its complexity, is print.\nTo add more capabilities, we need to expose them to the user.</p>\n<p>A programming language implementation reaches out and touches the material world\nthrough <strong>native functions</strong>. 
If you want to be able to write programs that\ncheck the time, read user input, or access the file system, we need to add\nnative functions<span class=\"em\">&mdash;</span>callable from Lox but implemented in C<span class=\"em\">&mdash;</span>that expose those\ncapabilities.</p>\n<p>At the language level, Lox is fairly complete<span class=\"em\">&mdash;</span>it&rsquo;s got closures, classes,\ninheritance, and other fun stuff. One reason it feels like a toy language is\nthat it has almost no native capabilities. We could turn it into a real\nlanguage by adding a long list of them.</p>\n<p>However, grinding through a pile of OS operations isn&rsquo;t actually very\neducational. Once you&rsquo;ve seen how to bind one piece of C code to Lox, you get\nthe idea. But you do need to see <em>one</em>, and even a single native function\nrequires us to build out all the machinery for interfacing Lox with C. So we&rsquo;ll\ngo through that and do all the hard work. Then, when that&rsquo;s done, we&rsquo;ll add one\ntiny native function just to prove that it works.</p>\n<p>We need new machinery because, from the implementation&rsquo;s\nperspective, native functions are different from Lox functions. When they are\ncalled, they don&rsquo;t push a CallFrame, because there&rsquo;s no bytecode for that\nframe to point to. They have no bytecode chunk. 
Instead, they somehow reference\na piece of native C code.</p>\n<p>We handle this in clox by defining native functions as an entirely different\nobject type.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} ObjFunction;\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>ObjFunction</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"t\">Value</span> (*<span class=\"t\">NativeFn</span>)(<span class=\"t\">int</span> <span class=\"i\">argCount</span>, <span class=\"t\">Value</span>* <span class=\"i\">args</span>);\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">Obj</span> <span class=\"i\">obj</span>;\n  <span class=\"t\">NativeFn</span> <span class=\"i\">function</span>;\n} <span class=\"t\">ObjNative</span>;\n</pre><pre class=\"insert-after\">\n\nstruct ObjString {\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>ObjFunction</em></div>\n\n<p>The representation is simpler than ObjFunction<span class=\"em\">&mdash;</span>merely an Obj header and a\npointer to the C function that implements the native behavior. The native\nfunction takes the argument count and a pointer to the first argument on the\nstack. It accesses the arguments through that pointer. Once it&rsquo;s done, it\nreturns the result value.</p>\n<p>As always, a new object type carries some accoutrements with it. 
To create an\nObjNative, we declare a constructor-like function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">ObjFunction* newFunction();\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after <em>newFunction</em>()</div>\n<pre class=\"insert\"><span class=\"t\">ObjNative</span>* <span class=\"i\">newNative</span>(<span class=\"t\">NativeFn</span> <span class=\"i\">function</span>);\n</pre><pre class=\"insert-after\">ObjString* takeString(char* chars, int length);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after <em>newFunction</em>()</div>\n\n<p>We implement that like so:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.c</em><br>\nadd after <em>newFunction</em>()</div>\n<pre><span class=\"t\">ObjNative</span>* <span class=\"i\">newNative</span>(<span class=\"t\">NativeFn</span> <span class=\"i\">function</span>) {\n  <span class=\"t\">ObjNative</span>* <span class=\"i\">native</span> = <span class=\"a\">ALLOCATE_OBJ</span>(<span class=\"t\">ObjNative</span>, <span class=\"a\">OBJ_NATIVE</span>);\n  <span class=\"i\">native</span>-&gt;<span class=\"i\">function</span> = <span class=\"i\">function</span>;\n  <span class=\"k\">return</span> <span class=\"i\">native</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, add after <em>newFunction</em>()</div>\n\n<p>The constructor takes a C function pointer to wrap in an ObjNative. It sets up\nthe object header and stores the function. 
For the header, we need a new object\ntype.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef enum {\n  OBJ_FUNCTION,\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin enum <em>ObjType</em></div>\n<pre class=\"insert\">  <span class=\"a\">OBJ_NATIVE</span>,\n</pre><pre class=\"insert-after\">  OBJ_STRING,\n} ObjType;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in enum <em>ObjType</em></div>\n\n<p>The VM also needs to know how to deallocate a native function object.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    }\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>freeObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_NATIVE</span>:\n      <span class=\"a\">FREE</span>(<span class=\"t\">ObjNative</span>, <span class=\"i\">object</span>);\n      <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    case OBJ_STRING: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>freeObject</em>()</div>\n\n<p>There isn&rsquo;t much here since ObjNative doesn&rsquo;t own any extra memory. 
The other\ncapability all Lox objects support is being printed.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      break;\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>printObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_NATIVE</span>:\n      <span class=\"i\">printf</span>(<span class=\"s\">&quot;&lt;native fn&gt;&quot;</span>);\n      <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    case OBJ_STRING:\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>printObject</em>()</div>\n\n<p>In order to support dynamic typing, we have a macro to see if a value is a\nnative function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define IS_FUNCTION(value)     isObjType(value, OBJ_FUNCTION)\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define IS_NATIVE(value)       isObjType(value, OBJ_NATIVE)</span>\n</pre><pre class=\"insert-after\">#define IS_STRING(value)       isObjType(value, OBJ_STRING)\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>Assuming that returns true, this macro extracts the C function pointer from a\nValue representing a native function:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define AS_FUNCTION(value)     ((ObjFunction*)AS_OBJ(value))\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define AS_NATIVE(value) \\</span>\n<span class=\"a\">    (((ObjNative*)AS_OBJ(value))-&gt;function)</span>\n</pre><pre class=\"insert-after\">#define AS_STRING(value)       ((ObjString*)AS_OBJ(value))\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>All of this baggage lets the VM treat native functions like any other object.\nYou can store them in variables, pass them around, throw them birthday parties,\netc. 
Of course, the operation we actually care about is <em>calling</em> them<span class=\"em\">&mdash;</span>using\none as the left-hand operand in a call expression.</p>\n<p>Over in <code>callValue()</code> we add another type case.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OBJ_FUNCTION:<span name=\"switch\"> </span>\n        return call(AS_FUNCTION(callee), argCount);\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>callValue</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OBJ_NATIVE</span>: {\n        <span class=\"t\">NativeFn</span> <span class=\"i\">native</span> = <span class=\"a\">AS_NATIVE</span>(<span class=\"i\">callee</span>);\n        <span class=\"t\">Value</span> <span class=\"i\">result</span> = <span class=\"i\">native</span>(<span class=\"i\">argCount</span>, <span class=\"i\">vm</span>.<span class=\"i\">stackTop</span> - <span class=\"i\">argCount</span>);\n        <span class=\"i\">vm</span>.<span class=\"i\">stackTop</span> -= <span class=\"i\">argCount</span> + <span class=\"n\">1</span>;\n        <span class=\"i\">push</span>(<span class=\"i\">result</span>);\n        <span class=\"k\">return</span> <span class=\"k\">true</span>;\n      }\n</pre><pre class=\"insert-after\">      default:\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>callValue</em>()</div>\n\n<p>If the object being called is a native function, we invoke the C function right\nthen and there. There&rsquo;s no need to muck with CallFrames or anything. We just\nhand off to C, get the result, and stuff it back in the stack. This makes native\nfunctions as fast as we can get.</p>\n<p>With this, users should be able to call native functions, but there aren&rsquo;t any\nto call. Without something like a foreign function interface, users can&rsquo;t define\ntheir own native functions. That&rsquo;s our job as VM implementers. 
We&rsquo;ll start with\na helper to define a new native function exposed to Lox programs.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>runtimeError</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">defineNative</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">name</span>, <span class=\"t\">NativeFn</span> <span class=\"i\">function</span>) {\n  <span class=\"i\">push</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">copyString</span>(<span class=\"i\">name</span>, (<span class=\"t\">int</span>)<span class=\"i\">strlen</span>(<span class=\"i\">name</span>))));\n  <span class=\"i\">push</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">newNative</span>(<span class=\"i\">function</span>)));\n  <span class=\"i\">tableSet</span>(&amp;<span class=\"i\">vm</span>.<span class=\"i\">globals</span>, <span class=\"a\">AS_STRING</span>(<span class=\"i\">vm</span>.<span class=\"i\">stack</span>[<span class=\"n\">0</span>]), <span class=\"i\">vm</span>.<span class=\"i\">stack</span>[<span class=\"n\">1</span>]);\n  <span class=\"i\">pop</span>();\n  <span class=\"i\">pop</span>();\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>runtimeError</em>()</div>\n\n<p>It takes a pointer to a C function and the name it will be known as in Lox.\nWe wrap the function in an ObjNative and then store that in a global variable\nwith the given name.</p>\n<p>You&rsquo;re probably wondering why we push and pop the name and function on the\nstack. That looks weird, right? This is the kind of stuff you have to worry\nabout when <span name=\"worry\">garbage</span> collection gets involved. Both\n<code>copyString()</code> and <code>newNative()</code> dynamically allocate memory. That means once we\nhave a GC, they can potentially trigger a collection. 
If that happens, we need\nto ensure the collector knows we&rsquo;re not done with the name and ObjNative so\nthat it doesn&rsquo;t free them out from under us. Storing them on the value stack\naccomplishes that.</p>\n<aside name=\"worry\">\n<p>Don&rsquo;t worry if you didn&rsquo;t follow all that. It will make a lot more sense once we\nget around to <a href=\"garbage-collection.html\">implementing the GC</a>.</p>\n</aside>\n<p>It feels silly, but after all of that work, we&rsquo;re going to add only one\nlittle native function.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after variable <em>vm</em></div>\n<pre><span class=\"k\">static</span> <span class=\"t\">Value</span> <span class=\"i\">clockNative</span>(<span class=\"t\">int</span> <span class=\"i\">argCount</span>, <span class=\"t\">Value</span>* <span class=\"i\">args</span>) {\n  <span class=\"k\">return</span> <span class=\"a\">NUMBER_VAL</span>((<span class=\"t\">double</span>)<span class=\"i\">clock</span>() / <span class=\"a\">CLOCKS_PER_SEC</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after variable <em>vm</em></div>\n\n<p>This returns the elapsed time since the program started running, in seconds. It&rsquo;s\nhandy for benchmarking Lox programs. 
In Lox, we&rsquo;ll name it <code>clock()</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  initTable(&amp;vm.strings);\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>initVM</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">defineNative</span>(<span class=\"s\">&quot;clock&quot;</span>, <span class=\"i\">clockNative</span>);\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>initVM</em>()</div>\n\n<p>To get to the C standard library <code>clock()</code> function, the &ldquo;vm&rdquo; module needs an\ninclude.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &lt;string.h&gt;\n</pre><div class=\"source-file\"><em>vm.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &lt;time.h&gt;</span>\n</pre><pre class=\"insert-after\">\n\n#include &quot;common.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em></div>\n\n<p>That was a lot of material to work through, but we did it! Type this in and try\nit out:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">fib</span>(<span class=\"i\">n</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">n</span> &lt; <span class=\"n\">2</span>) <span class=\"k\">return</span> <span class=\"i\">n</span>;\n  <span class=\"k\">return</span> <span class=\"i\">fib</span>(<span class=\"i\">n</span> - <span class=\"n\">2</span>) + <span class=\"i\">fib</span>(<span class=\"i\">n</span> - <span class=\"n\">1</span>);\n}\n\n<span class=\"k\">var</span> <span class=\"i\">start</span> = <span class=\"i\">clock</span>();\n<span class=\"k\">print</span> <span class=\"i\">fib</span>(<span class=\"n\">35</span>);\n<span class=\"k\">print</span> <span class=\"i\">clock</span>() - <span class=\"i\">start</span>;\n</pre></div>\n<p>We can write a really inefficient recursive Fibonacci function. 
Even better, we\ncan measure just <span name=\"faster\"><em>how</em></span> inefficient it is. This is, of\ncourse, not the smartest way to calculate a Fibonacci number. But it is a good\nway to stress test a language implementation&rsquo;s support for function calls. On my\nmachine, running this in clox is about five times faster than in jlox. That&rsquo;s\nquite an improvement.</p>\n<aside name=\"faster\">\n<p>It&rsquo;s a little slower than a comparable Ruby program run in Ruby 2.4.3p205, and\nabout 3x faster than one run in Python 3.7.3. And we still have a lot of simple\noptimizations we can do in our VM.</p>\n</aside>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Reading and writing the <code>ip</code> field is one of the most frequent operations\ninside the bytecode loop. Right now, we access it through a pointer to the\ncurrent CallFrame. That requires a pointer indirection which may force the\nCPU to bypass the cache and hit main memory. That can be a real performance\nsink.</p>\n<p>Ideally, we&rsquo;d keep the <code>ip</code> in a native CPU register. C doesn&rsquo;t let us\n<em>require</em> that without dropping into inline assembly, but we can structure\nthe code to encourage the compiler to make that optimization. If we store\nthe <code>ip</code> directly in a C local variable and mark it <code>register</code>, there&rsquo;s a\ngood chance the C compiler will accede to our polite request.</p>\n<p>This does mean we need to be careful to load and store the local <code>ip</code> back\ninto the correct CallFrame when starting and ending function calls.\nImplement this optimization. Write a couple of benchmarks and see how it\naffects the performance. Do you think the extra code complexity is worth it?</p>\n</li>\n<li>\n<p>Native function calls are fast in part because we don&rsquo;t validate that the\ncall passes as many arguments as the function expects. 
We really should, or\nan incorrect call to a native function without enough arguments could cause\nthe function to read uninitialized memory. Add arity checking.</p>\n</li>\n<li>\n<p>Right now, there&rsquo;s no way for a native function to signal a runtime error.\nIn a real implementation, this is something we&rsquo;d need to support because\nnative functions live in the statically typed world of C but are called\nfrom dynamically typed Lox land. If a user, say, tries to pass a string to\n<code>sqrt()</code>, that native function needs to report a runtime error.</p>\n<p>Extend the native function system to support that. How does this capability\naffect the performance of native calls?</p>\n</li>\n<li>\n<p>Add some more native functions to do things you find useful. Write some\nprograms using those. What did you add? How do they affect the feel of the\nlanguage and how practical it is?</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"closures.html\" class=\"next\">\n  Next Chapter: &ldquo;Closures&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/chunks-of-bytecode.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Chunks of Bytecode &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Chunks of Bytecode<small>14</small></a></h3>\n\n<ul>\n    <li><a href=\"#bytecode\"><small>14.1</small> Bytecode?</a></li>\n    <li><a href=\"#getting-started\"><small>14.2</small> Getting Started</a></li>\n    <li><a href=\"#chunks-of-instructions\"><small>14.3</small> Chunks of Instructions</a></li>\n    <li><a href=\"#disassembling-chunks\"><small>14.4</small> Disassembling Chunks</a></li>\n    <li><a href=\"#constants\"><small>14.5</small> Constants</a></li>\n    <li><a 
href=\"#line-information\"><small>14.6</small> Line Information</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Test Your Language</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"a-virtual-machine.html\" title=\"A Virtual Machine\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\" class=\"prev\">←</a>\n<a href=\"a-virtual-machine.html\" title=\"A Virtual Machine\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Chunks of Bytecode<small>14</small></a></h3>\n\n<ul>\n    <li><a href=\"#bytecode\"><small>14.1</small> Bytecode?</a></li>\n    <li><a href=\"#getting-started\"><small>14.2</small> Getting Started</a></li>\n    <li><a href=\"#chunks-of-instructions\"><small>14.3</small> Chunks of Instructions</a></li>\n    <li><a href=\"#disassembling-chunks\"><small>14.4</small> Disassembling Chunks</a></li>\n    <li><a href=\"#constants\"><small>14.5</small> Constants</a></li>\n    <li><a href=\"#line-information\"><small>14.6</small> Line Information</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Test Your Language</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    
<a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"a-virtual-machine.html\" title=\"A Virtual Machine\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">14</div>\n  <h1>Chunks of Bytecode</h1>\n\n<blockquote>\n<p>If you find that you&rsquo;re spending almost all your time on theory, start turning\nsome attention to practical things; it will improve your theories. If you find\nthat you&rsquo;re spending almost all your time on practice, start turning some\nattention to theoretical things; it will improve your practice.</p>\n<p><cite>Donald Knuth</cite></p>\n</blockquote>\n<p>We already have ourselves a complete implementation of Lox with jlox, so why\nisn&rsquo;t the book over yet? Part of this is because jlox relies on the <span\nname=\"metal\">JVM</span> to do lots of things for us. If we want to understand\nhow an interpreter works all the way down to the metal, we need to build those\nbits and pieces ourselves.</p>\n<aside name=\"metal\">\n<p>Of course, our second interpreter relies on the C standard library for basics\nlike memory allocation, and the C compiler frees us from details of the\nunderlying machine code we&rsquo;re running it on. Heck, that machine code is probably\nimplemented in terms of microcode on the chip. And the C runtime relies on the\noperating system to hand out pages of memory. But we have to stop <em>somewhere</em> if\nthis book is going to fit on your bookshelf.</p>\n</aside>\n<p>An even more fundamental reason that jlox isn&rsquo;t sufficient is that it&rsquo;s too damn\nslow. A tree-walk interpreter is fine for some kinds of high-level, declarative\nlanguages. 
But for a general-purpose, imperative language<span class=\"em\">&mdash;</span>even a &ldquo;scripting&rdquo;\nlanguage like Lox<span class=\"em\">&mdash;</span>it won&rsquo;t fly. Take this little script:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">fib</span>(<span class=\"i\">n</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">n</span> &lt; <span class=\"n\">2</span>) <span class=\"k\">return</span> <span class=\"i\">n</span>;\n  <span class=\"k\">return</span> <span class=\"i\">fib</span>(<span class=\"i\">n</span> - <span class=\"n\">1</span>) + <span class=\"i\">fib</span>(<span class=\"i\">n</span> - <span class=\"n\">2</span>);<span name=\"fib\"> </span>\n}\n\n<span class=\"k\">var</span> <span class=\"i\">before</span> = <span class=\"i\">clock</span>();\n<span class=\"k\">print</span> <span class=\"i\">fib</span>(<span class=\"n\">40</span>);\n<span class=\"k\">var</span> <span class=\"i\">after</span> = <span class=\"i\">clock</span>();\n<span class=\"k\">print</span> <span class=\"i\">after</span> - <span class=\"i\">before</span>;\n</pre></div>\n<aside name=\"fib\">\n<p>This is a comically inefficient way to actually calculate Fibonacci numbers.\nOur goal is to see how fast the <em>interpreter</em> runs, not to see how fast of a\nprogram we can write. A slow program that does a lot of work<span class=\"em\">&mdash;</span>pointless or not<span class=\"em\">&mdash;</span>is a good test case for that.</p>\n</aside>\n<p>On my laptop, that takes jlox about 72 seconds to execute. An equivalent C\nprogram finishes in half a second. Our dynamically typed scripting language is\nnever going to be as fast as a statically typed language with manual memory\nmanagement, but we don&rsquo;t need to settle for more than <em>two orders of magnitude</em>\nslower.</p>\n<p>We could take jlox and run it in a profiler and start tuning and tweaking\nhotspots, but that will only get us so far. 
The execution model<span class=\"em\">&mdash;</span>walking the\nAST<span class=\"em\">&mdash;</span>is fundamentally the wrong design. We can&rsquo;t micro-optimize that to the\nperformance we want any more than you can polish an AMC Gremlin into an SR-71\nBlackbird.</p>\n<p>We need to rethink the core model. This chapter introduces that model, bytecode,\nand begins our new interpreter, clox.</p>\n<h2><a href=\"#bytecode\" id=\"bytecode\"><small>14&#8202;.&#8202;1</small>Bytecode?</a></h2>\n<p>In engineering, few choices are without trade-offs. To best understand why we&rsquo;re\ngoing with bytecode, let&rsquo;s stack it up against a couple of alternatives.</p>\n<h3><a href=\"#why-not-walk-the-ast\" id=\"why-not-walk-the-ast\"><small>14&#8202;.&#8202;1&#8202;.&#8202;1</small>Why not walk the AST?</a></h3>\n<p>Our existing interpreter has a couple of things going for it:</p>\n<ul>\n<li>\n<p>Well, first, we already wrote it. It&rsquo;s done. And the main reason it&rsquo;s done\nis because this style of interpreter is <em>really simple to implement</em>. The\nruntime representation of the code directly maps to the syntax. It&rsquo;s\nvirtually effortless to get from the parser to the data structures we need\nat runtime.</p>\n</li>\n<li>\n<p>It&rsquo;s <em>portable</em>. Our current interpreter is written in Java and runs on any\nplatform Java supports. We could write a new implementation in C using the\nsame approach and compile and run our language on basically every platform\nunder the sun.</p>\n</li>\n</ul>\n<p>Those are real advantages. But, on the other hand, it&rsquo;s <em>not memory-efficient</em>.\nEach piece of syntax becomes an AST node. 
A tiny Lox expression like <code>1 + 2</code>\nturns into a slew of objects with lots of pointers between them, something like:</p>\n<p><span name=\"header\"></span></p>\n<aside name=\"header\">\n<p>The &ldquo;(header)&rdquo; parts are the bookkeeping information the Java virtual machine\nuses to support memory management and store the object&rsquo;s type. Those take up\nspace too!</p>\n</aside><img src=\"image/chunks-of-bytecode/ast.png\" alt=\"The tree of Java objects created to represent '1 + 2'.\" />\n<p>Each of those pointers adds an extra 32 or 64 bits of overhead to the object.\nWorse, sprinkling our data across the heap in a loosely connected web of objects\ndoes bad things for <span name=\"locality\"><em>spatial locality</em></span>.</p>\n<aside name=\"locality\">\n<p>I wrote <a href=\"http://gameprogrammingpatterns.com/data-locality.html\">an entire chapter</a> about this exact problem in my first\nbook, <em>Game Programming Patterns</em>, if you want to really dig in.</p>\n</aside>\n<p>Modern CPUs process data way faster than they can pull it from RAM. To\ncompensate for that, chips have multiple layers of caching. If a piece of memory\nit needs is already in the cache, it can be loaded more quickly. We&rsquo;re talking\nupwards of 100 <em>times</em> faster.</p>\n<p>How does data get into that cache? The machine speculatively stuffs things in\nthere for you. Its heuristic is pretty simple. Whenever the CPU reads a bit of\ndata from RAM, it pulls in a whole little bundle of adjacent bytes and stuffs\nthem in the cache.</p>\n<p>If our program next requests some data close enough to be inside that cache\nline, our CPU runs like a well-oiled conveyor belt in a factory. We <em>really</em>\nwant to take advantage of this. To use the cache effectively, the way we\nrepresent code in memory should be dense and ordered like it&rsquo;s read.</p>\n<p>Now look up at that tree. Those sub-objects could be <span\nname=\"anywhere\"><em>anywhere</em></span>. 
Every step the tree-walker takes where it\nfollows a reference to a child node may step outside the bounds of the cache and\nforce the CPU to stall until a new lump of data can be slurped in from RAM. Just\nthe <em>overhead</em> of those tree nodes with all of their pointer fields and object\nheaders tends to push objects away from each other and out of the cache.</p>\n<aside name=\"anywhere\">\n<p>Even if the objects happened to be allocated in sequential memory when the\nparser first produced them, after a couple of rounds of garbage collection<span class=\"em\">&mdash;</span>which may move objects around in memory<span class=\"em\">&mdash;</span>there&rsquo;s no telling where they&rsquo;ll be.</p>\n</aside>\n<p>Our AST walker has other overhead too around interface dispatch and the Visitor\npattern, but the locality issues alone are enough to justify a better code\nrepresentation.</p>\n<h3><a href=\"#why-not-compile-to-native-code\" id=\"why-not-compile-to-native-code\"><small>14&#8202;.&#8202;1&#8202;.&#8202;2</small>Why not compile to native code?</a></h3>\n<p>If you want to go <em>real</em> fast, you want to get all of those layers of\nindirection out of the way. Right down to the metal. Machine code. It even\n<em>sounds</em> fast. <em>Machine code.</em></p>\n<p>Compiling directly to the native instruction set the chip supports is what the\nfastest languages do. Targeting native code has been the most efficient option\nsince way back in the early days when engineers actually <span\nname=\"hand\">handwrote</span> programs in machine code.</p>\n<aside name=\"hand\">\n<p>Yes, they actually wrote machine code by hand. On punched cards. Which,\npresumably, they punched <em>with their fists</em>.</p>\n</aside>\n<p>If you&rsquo;ve never written any machine code, or its slightly more human-palatable\ncousin assembly code before, I&rsquo;ll give you the gentlest of introductions. Native\ncode is a dense series of operations, encoded directly in binary. 
Each\ninstruction is between one and a few bytes long, and is almost mind-numbingly\nlow level. &ldquo;Move a value from this address to this register.&rdquo; &ldquo;Add the integers\nin these two registers.&rdquo; Stuff like that.</p>\n<p>The CPU cranks through the instructions, decoding and executing each one in\norder. There is no tree structure like our AST, and control flow is handled by\njumping from one point in the code directly to another. No indirection, no\noverhead, no unnecessary skipping around or chasing pointers.</p>\n<p>Lightning fast, but that performance comes at a cost. First of all, compiling to\nnative code ain&rsquo;t easy. Most chips in wide use today have sprawling Byzantine\narchitectures with heaps of instructions that accreted over decades. They\nrequire sophisticated register allocation, pipelining, and instruction\nscheduling.</p>\n<p>And, of course, you&rsquo;ve thrown <span name=\"back\">portability</span> out. Spend a\nfew years mastering some architecture and that still only gets you onto <em>one</em> of\nthe several popular instruction sets out there. To get your language on all of\nthem, you need to learn all of their instruction sets and write a separate back\nend for each one.</p>\n<aside name=\"back\">\n<p>The situation isn&rsquo;t entirely dire. A well-architected compiler lets you\nshare the front end and most of the middle layer optimization passes across the\ndifferent architectures you support. It&rsquo;s mainly the code generation and some of\nthe details around instruction selection that you&rsquo;ll need to write afresh each\ntime.</p>\n<p>The <a href=\"https://llvm.org/\">LLVM</a> project gives you some of this out of the box. 
If your compiler\noutputs LLVM&rsquo;s own special intermediate language, LLVM in turn compiles that to\nnative code for a plethora of architectures.</p>\n</aside>\n<h3><a href=\"#what-is-bytecode\" id=\"what-is-bytecode\"><small>14&#8202;.&#8202;1&#8202;.&#8202;3</small>What is bytecode?</a></h3>\n<p>Fix those two points in your mind. On one end, a tree-walk interpreter is\nsimple, portable, and slow. On the other, native code is complex and\nplatform-specific but fast. Bytecode sits in the middle. It retains the\nportability of a tree-walker<span class=\"em\">&mdash;</span>we won&rsquo;t be getting our hands dirty with\nassembly code in this book. It sacrifices <em>some</em> simplicity to get a performance\nboost in return, though not as fast as going fully native.</p>\n<p>Structurally, bytecode resembles machine code. It&rsquo;s a dense, linear sequence of\nbinary instructions. That keeps overhead low and plays nice with the cache.\nHowever, it&rsquo;s a much simpler, higher-level instruction set than any real chip\nout there. (In many bytecode formats, each instruction is only a single byte\nlong, hence &ldquo;bytecode&rdquo;.)</p>\n<p>Imagine you&rsquo;re writing a native compiler from some source language and you&rsquo;re\ngiven carte blanche to define the easiest possible architecture to target.\nBytecode is kind of like that. It&rsquo;s an idealized fantasy instruction set that\nmakes your life as the compiler writer easier.</p>\n<p>The problem with a fantasy architecture, of course, is that it doesn&rsquo;t exist. We\nsolve that by writing an <em>emulator</em><span class=\"em\">&mdash;</span>a simulated chip written in software that\ninterprets the bytecode one instruction at a time. A <em>virtual machine (VM)</em>, if\nyou will.</p>\n<p>That emulation layer adds <span name=\"p-code\">overhead</span>, which is a key\nreason bytecode is slower than native code. But in return, it gives us\nportability. 
Write our VM in a language like C that is already supported on all\nthe machines we care about, and we can run our emulator on top of any hardware\nwe like.</p>\n<aside name=\"p-code\">\n<p>One of the first bytecode formats was <a href=\"https://en.wikipedia.org/wiki/P-code_machine\">p-code</a>, developed for Niklaus Wirth&rsquo;s\nPascal language. You might think a PDP-11 running at 15MHz couldn&rsquo;t afford the\noverhead of emulating a virtual machine. But back then, computers were in their\nCambrian explosion and new architectures appeared every day. Keeping up with the\nlatest chips was worth more than squeezing the maximum performance from each\none. That&rsquo;s why the &ldquo;p&rdquo; in p-code doesn&rsquo;t stand for &ldquo;Pascal&rdquo;, but &ldquo;portable&rdquo;.</p>\n</aside>\n<p>This is the path we&rsquo;ll take with our new interpreter, clox. We&rsquo;ll follow in the\nfootsteps of the main implementations of Python, Ruby, Lua, OCaml, Erlang, and\nothers. In many ways, our VM&rsquo;s design will parallel the structure of our\nprevious interpreter:</p>\n<p><img src=\"image/chunks-of-bytecode/phases.png\" alt=\"Phases of the two\nimplementations. jlox is Parser to Syntax Trees to Interpreter. clox is Compiler\nto Bytecode to Virtual Machine.\" /></p>\n<p>Of course, we won&rsquo;t implement the phases strictly in order. Like our previous\ninterpreter, we&rsquo;ll bounce around, building up the implementation one language\nfeature at a time. In this chapter, we&rsquo;ll get the skeleton of the application in\nplace and create the data structures needed to store and represent a chunk of\nbytecode.</p>\n<h2><a href=\"#getting-started\" id=\"getting-started\"><small>14&#8202;.&#8202;2</small>Getting Started</a></h2>\n<p>Where else to begin, but at <code>main()</code>? <span name=\"ready\">Fire</span> up your\ntrusty text editor and start typing.</p>\n<aside name=\"ready\">\n<p>Now is a good time to stretch, maybe crack your knuckles. 
A little montage music\nwouldn&rsquo;t hurt either.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>main.c</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#include &quot;common.h&quot;</span>\n\n<span class=\"t\">int</span> <span class=\"i\">main</span>(<span class=\"t\">int</span> <span class=\"i\">argc</span>, <span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">argv</span>[]) {\n  <span class=\"k\">return</span> <span class=\"n\">0</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, create new file</div>\n\n<p>From this tiny seed, we will grow our entire VM. Since C provides us with so\nlittle, we first need to spend some time amending the soil. Some of that goes\ninto this header:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>common.h</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#ifndef clox_common_h</span>\n<span class=\"a\">#define clox_common_h</span>\n\n<span class=\"a\">#include &lt;stdbool.h&gt;</span>\n<span class=\"a\">#include &lt;stddef.h&gt;</span>\n<span class=\"a\">#include &lt;stdint.h&gt;</span>\n\n<span class=\"a\">#endif</span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>common.h</em>, create new file</div>\n\n<p>There are a handful of types and constants we&rsquo;ll use throughout the interpreter,\nand this is a convenient place to put them. For now, it&rsquo;s the venerable <code>NULL</code>,\n<code>size_t</code>, the nice C99 Boolean <code>bool</code>, and explicit-sized integer types<span class=\"em\">&mdash;</span><code>uint8_t</code> and friends.</p>\n<h2><a href=\"#chunks-of-instructions\" id=\"chunks-of-instructions\"><small>14&#8202;.&#8202;3</small>Chunks of Instructions</a></h2>\n<p>Next, we need a module to define our code representation. 
I&rsquo;ve been using\n&ldquo;chunk&rdquo; to refer to sequences of bytecode, so let&rsquo;s make that the official name\nfor that module.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>chunk.h</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#ifndef clox_chunk_h</span>\n<span class=\"a\">#define clox_chunk_h</span>\n\n<span class=\"a\">#include &quot;common.h&quot;</span>\n\n<span class=\"a\">#endif</span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, create new file</div>\n\n<p>In our bytecode format, each instruction has a one-byte <strong>operation code</strong>\n(universally shortened to <strong>opcode</strong>). That number controls what kind of\ninstruction we&rsquo;re dealing with<span class=\"em\">&mdash;</span>add, subtract, look up variable, etc. We\ndefine those here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;common.h&quot;\n</pre><div class=\"source-file\"><em>chunk.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">enum</span> {\n  <span class=\"a\">OP_RETURN</span>,\n} <span class=\"t\">OpCode</span>;\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em></div>\n\n<p>For now, we start with a single instruction, <code>OP_RETURN</code>. When we have a\nfull-featured VM, this instruction will mean &ldquo;return from the current function&rdquo;.\nI admit this isn&rsquo;t exactly useful yet, but we have to start somewhere, and this\nis a particularly simple instruction, for reasons we&rsquo;ll get to later.</p>\n<h3><a href=\"#a-dynamic-array-of-instructions\" id=\"a-dynamic-array-of-instructions\"><small>14&#8202;.&#8202;3&#8202;.&#8202;1</small>A dynamic array of instructions</a></h3>\n<p>Bytecode is a series of instructions. 
Eventually, we&rsquo;ll store some other data\nalong with the instructions, so let&rsquo;s go ahead and create a struct to hold it\nall.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} OpCode;\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nadd after enum <em>OpCode</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">uint8_t</span>* <span class=\"i\">code</span>;\n} <span class=\"t\">Chunk</span>;\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, add after enum <em>OpCode</em></div>\n\n<p>At the moment, this is simply a wrapper around an array of bytes. Since we don&rsquo;t\nknow how big the array needs to be before we start compiling a chunk, it must be\ndynamic. Dynamic arrays are one of my favorite data structures. That sounds like\nclaiming vanilla is my favorite ice cream <span name=\"flavor\">flavor</span>, but\nhear me out. Dynamic arrays provide:</p>\n<aside name=\"flavor\">\n<p>Butter pecan is actually my favorite.</p>\n</aside>\n<ul>\n<li>\n<p>Cache-friendly, dense storage</p>\n</li>\n<li>\n<p>Constant-time indexed element lookup</p>\n</li>\n<li>\n<p>Constant-time appending to the end of the array</p>\n</li>\n</ul>\n<p>Those features are exactly why we used dynamic arrays all the time in jlox under\nthe guise of Java&rsquo;s ArrayList class. Now that we&rsquo;re in C, we get to roll our\nown. If you&rsquo;re rusty on dynamic arrays, the idea is pretty simple. 
In addition\nto the array itself, we keep two numbers: the number of elements in the array we\nhave allocated (&ldquo;capacity&rdquo;) and how many of those allocated entries are actually\nin use (&ldquo;count&rdquo;).</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef struct {\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin struct <em>Chunk</em></div>\n<pre class=\"insert\">  <span class=\"t\">int</span> <span class=\"i\">count</span>;\n  <span class=\"t\">int</span> <span class=\"i\">capacity</span>;\n</pre><pre class=\"insert-after\">  uint8_t* code;\n} Chunk;\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in struct <em>Chunk</em></div>\n\n<p>When we add an element, if the count is less than the capacity, then there is\nalready available space in the array. We store the new element right in there\nand bump the count.</p>\n<p><img src=\"image/chunks-of-bytecode/insert.png\" alt=\"Storing an element in an\narray that has enough capacity.\" /></p>\n<p>If we have no spare capacity, then the process is a little more involved.</p>\n<p><img src=\"image/chunks-of-bytecode/grow.png\" alt=\"Growing the dynamic array\nbefore storing an element.\" class=\"wide\" /></p>\n<ol>\n<li><span name=\"amortized\">Allocate</span> a new array with more capacity.</li>\n<li>Copy the existing elements from the old array to the new one.</li>\n<li>Store the new <code>capacity</code>.</li>\n<li>Delete the old array.</li>\n<li>Update <code>code</code> to point to the new array.</li>\n<li>Store the element in the new array now that there is room.</li>\n<li>Update the <code>count</code>.</li>\n</ol>\n<aside name=\"amortized\">\n<p>Copying the existing elements when you grow the array makes it seem like\nappending an element is <em>O(n)</em>, not <em>O(1)</em> like I said above. However, you need\nto do this copy step only on <em>some</em> of the appends. 
Most of the time, there is\nalready extra capacity, so you don&rsquo;t need to copy.</p>\n<p>To understand how this works, we need <a href=\"https://en.wikipedia.org/wiki/Amortized_analysis\"><strong>amortized\nanalysis</strong></a>. That shows us\nthat as long as we grow the array by a multiple of its current size, when we\naverage out the cost of a <em>sequence</em> of appends, each append is <em>O(1)</em>.</p>\n</aside>\n<p>We have our struct ready, so let&rsquo;s implement the functions to work with it. C\ndoesn&rsquo;t have constructors, so we declare a function to initialize a new chunk.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} Chunk;\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nadd after struct <em>Chunk</em></div>\n<pre class=\"insert\">\n\n<span class=\"t\">void</span> <span class=\"i\">initChunk</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, add after struct <em>Chunk</em></div>\n\n<p>And implement it thusly:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>chunk.c</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#include &lt;stdlib.h&gt;</span>\n\n<span class=\"a\">#include &quot;chunk.h&quot;</span>\n\n<span class=\"t\">void</span> <span class=\"i\">initChunk</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>) {\n  <span class=\"i\">chunk</span>-&gt;<span class=\"i\">count</span> = <span class=\"n\">0</span>;\n  <span class=\"i\">chunk</span>-&gt;<span class=\"i\">capacity</span> = <span class=\"n\">0</span>;\n  <span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span> = <span class=\"a\">NULL</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em>, create new file</div>\n\n<p>The dynamic array starts off completely empty. We don&rsquo;t even allocate a raw\narray yet. 
To append a byte to the end of the chunk, we use a new function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void initChunk(Chunk* chunk);\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nadd after <em>initChunk</em>()</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">writeChunk</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>, <span class=\"t\">uint8_t</span> <span class=\"i\">byte</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, add after <em>initChunk</em>()</div>\n\n<p>This is where the interesting work happens.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>chunk.c</em><br>\nadd after <em>initChunk</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">writeChunk</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>, <span class=\"t\">uint8_t</span> <span class=\"i\">byte</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">chunk</span>-&gt;<span class=\"i\">capacity</span> &lt; <span class=\"i\">chunk</span>-&gt;<span class=\"i\">count</span> + <span class=\"n\">1</span>) {\n    <span class=\"t\">int</span> <span class=\"i\">oldCapacity</span> = <span class=\"i\">chunk</span>-&gt;<span class=\"i\">capacity</span>;\n    <span class=\"i\">chunk</span>-&gt;<span class=\"i\">capacity</span> = <span class=\"a\">GROW_CAPACITY</span>(<span class=\"i\">oldCapacity</span>);\n    <span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span> = <span class=\"a\">GROW_ARRAY</span>(<span class=\"t\">uint8_t</span>, <span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>,\n        <span class=\"i\">oldCapacity</span>, <span class=\"i\">chunk</span>-&gt;<span class=\"i\">capacity</span>);\n  }\n\n  <span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>[<span class=\"i\">chunk</span>-&gt;<span class=\"i\">count</span>] = <span class=\"i\">byte</span>;\n  <span 
class=\"i\">chunk</span>-&gt;<span class=\"i\">count</span>++;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em>, add after <em>initChunk</em>()</div>\n\n<p>The first thing we need to do is see if the current array already has capacity\nfor the new byte. If it doesn&rsquo;t, then we first need to grow the array to make\nroom. (We also hit this case on the very first write when the array is <code>NULL</code>\nand <code>capacity</code> is 0.)</p>\n<p>To grow the array, first we figure out the new capacity and grow the array to\nthat size. Both of those lower-level memory operations are defined in a new\nmodule.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;chunk.h&quot;\n</pre><div class=\"source-file\"><em>chunk.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;memory.h&quot;</span>\n</pre><pre class=\"insert-after\">\n\nvoid initChunk(Chunk* chunk) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em></div>\n\n<p>This is enough to get us started.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>memory.h</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#ifndef clox_memory_h</span>\n<span class=\"a\">#define clox_memory_h</span>\n\n<span class=\"a\">#include &quot;common.h&quot;</span>\n\n<span class=\"a\">#define GROW_CAPACITY(capacity) \\</span>\n<span class=\"a\">    ((capacity) &lt; 8 ? 8 : (capacity) * 2)</span>\n\n<span class=\"a\">#endif</span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.h</em>, create new file</div>\n\n<p>This macro calculates a new capacity based on a given current capacity. In order\nto get the performance we want, the important part is that it <em>scales</em> based on\nthe old size. We grow by a factor of two, which is pretty typical. 1.5&times; is\nanother common choice.</p>\n<p>We also handle when the current capacity is zero. In that case, we jump straight\nto eight elements instead of starting at one. 
That <span\nname=\"profile\">avoids</span> a little extra memory churn when the array is very\nsmall, at the expense of wasting a few bytes on very small chunks.</p>\n<aside name=\"profile\">\n<p>I picked the number eight somewhat arbitrarily for the book. Most dynamic array\nimplementations have a minimum threshold like this. The right way to pick a\nvalue for this is to profile against real-world usage and see which constant\nmakes the best performance trade-off between extra grows versus wasted space.</p>\n</aside>\n<p>Once we know the desired capacity, we create or grow the array to that size\nusing <code>GROW_ARRAY()</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define GROW_CAPACITY(capacity) \\\n    ((capacity) &lt; 8 ? 8 : (capacity) * 2)\n</pre><div class=\"source-file\"><em>memory.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define GROW_ARRAY(type, pointer, oldCount, newCount) \\</span>\n<span class=\"a\">    (type*)reallocate(pointer, sizeof(type) * (oldCount), \\</span>\n<span class=\"a\">        sizeof(type) * (newCount))</span>\n\n<span class=\"t\">void</span>* <span class=\"i\">reallocate</span>(<span class=\"t\">void</span>* <span class=\"i\">pointer</span>, <span class=\"t\">size_t</span> <span class=\"i\">oldSize</span>, <span class=\"t\">size_t</span> <span class=\"i\">newSize</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.h</em></div>\n\n<p>This macro pretties up a function call to <code>reallocate()</code> where the real work\nhappens. The macro itself takes care of getting the size of the array&rsquo;s element\ntype and casting the resulting <code>void*</code> back to a pointer of the right type.</p>\n<p>This <code>reallocate()</code> function is the single function we&rsquo;ll use for all dynamic\nmemory management in clox<span class=\"em\">&mdash;</span>allocating memory, freeing it, and changing the\nsize of an existing allocation. 
Routing all of those operations through a single\nfunction will be important later when we add a garbage collector that needs to\nkeep track of how much memory is in use.</p>\n<p>The two size arguments passed to <code>reallocate()</code> control which operation to\nperform:</p><table>\n  <thead>\n    <tr>\n      <td>oldSize</td>\n      <td>newSize</td>\n      <td>Operation</td>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <td>0</td>\n      <td>Non&#8209;zero</td>\n      <td>Allocate new block.</td>\n    </tr>\n    <tr>\n      <td>Non&#8209;zero</td>\n      <td>0</td>\n      <td>Free allocation.</td>\n    </tr>\n    <tr>\n      <td>Non&#8209;zero</td>\n      <td>Smaller&nbsp;than&nbsp;<code>oldSize</code></td>\n      <td>Shrink existing allocation.</td>\n    </tr>\n    <tr>\n      <td>Non&#8209;zero</td>\n      <td>Larger&nbsp;than&nbsp;<code>oldSize</code></td>\n      <td>Grow existing allocation.</td>\n    </tr>\n  </tbody>\n</table>\n<p>That sounds like a lot of cases to handle, but here&rsquo;s the implementation:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>memory.c</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#include &lt;stdlib.h&gt;</span>\n\n<span class=\"a\">#include &quot;memory.h&quot;</span>\n\n<span class=\"t\">void</span>* <span class=\"i\">reallocate</span>(<span class=\"t\">void</span>* <span class=\"i\">pointer</span>, <span class=\"t\">size_t</span> <span class=\"i\">oldSize</span>, <span class=\"t\">size_t</span> <span class=\"i\">newSize</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">newSize</span> == <span class=\"n\">0</span>) {\n    <span class=\"i\">free</span>(<span class=\"i\">pointer</span>);\n    <span class=\"k\">return</span> <span class=\"a\">NULL</span>;\n  }\n\n  <span class=\"t\">void</span>* <span class=\"i\">result</span> = <span class=\"i\">realloc</span>(<span class=\"i\">pointer</span>, <span class=\"i\">newSize</span>);\n  <span class=\"k\">return</span> <span 
class=\"i\">result</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, create new file</div>\n\n<p>When <code>newSize</code> is zero, we handle the deallocation case ourselves by calling\n<code>free()</code>. Otherwise, we rely on the C standard library&rsquo;s <code>realloc()</code> function.\nThat function conveniently supports the other three aspects of our policy. When\n<code>oldSize</code> is zero, the pointer we pass is <code>NULL</code>, and calling <code>realloc()</code> with a\n<code>NULL</code> pointer is equivalent to calling <code>malloc()</code>.</p>\n<p>The interesting cases are when both <code>oldSize</code> and <code>newSize</code> are not zero. Those\ntell <code>realloc()</code> to resize the previously allocated block. If the new size is\nsmaller than the existing block of memory, it typically just <span\nname=\"shrink\">updates</span> the size of the block and returns the same pointer\nyou gave it. If the new size is larger, it attempts to grow the existing block\nof memory.</p>\n<p>It can do that only if the memory after that block isn&rsquo;t already in use. If\nthere isn&rsquo;t room to grow the block, <code>realloc()</code> instead allocates a <em>new</em> block\nof memory of the desired size, copies over the old bytes, frees the old block,\nand then returns a pointer to the new block. Remember, that&rsquo;s exactly the\nbehavior we want for our dynamic array.</p>\n<p>Because computers are finite lumps of matter and not the perfect mathematical\nabstractions computer science theory would have us believe, allocation can fail\nif there isn&rsquo;t enough memory and <code>realloc()</code> will return <code>NULL</code>. 
We should\nhandle that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  void* result = realloc(pointer, newSize);\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>reallocate</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">result</span> == <span class=\"a\">NULL</span>) <span class=\"i\">exit</span>(<span class=\"n\">1</span>);\n</pre><pre class=\"insert-after\">  return result;\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>reallocate</em>()</div>\n\n<p>There&rsquo;s not really anything <em>useful</em> that our VM can do if it can&rsquo;t get the\nmemory it needs, but we at least detect that and abort the process immediately\ninstead of returning a <code>NULL</code> pointer and letting it go off the rails later.</p>\n<aside name=\"shrink\">\n<p>Since all we passed in was a bare pointer to the first byte of memory, what does\nit mean to &ldquo;update&rdquo; the block&rsquo;s size? Under the hood, the memory allocator\nmaintains additional bookkeeping information for each block of heap-allocated\nmemory, including its size.</p>\n<p>Given a pointer to some previously allocated memory, it can find this\nbookkeeping information, which is necessary to be able to cleanly free it. It&rsquo;s\nthis size metadata that <code>realloc()</code> updates.</p>\n<p>Many implementations of <code>malloc()</code> store the allocated size in memory right\n<em>before</em> the returned address.</p>\n</aside>\n<p>OK, we can create new chunks and write instructions to them. Are we done? 
Nope!\nWe&rsquo;re in C now, remember, we have to manage memory ourselves, like in Ye Olden\nTimes, and that means <em>freeing</em> it too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void initChunk(Chunk* chunk);\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nadd after <em>initChunk</em>()</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">freeChunk</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>);\n</pre><pre class=\"insert-after\">void writeChunk(Chunk* chunk, uint8_t byte);\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, add after <em>initChunk</em>()</div>\n\n<p>The implementation is:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>chunk.c</em><br>\nadd after <em>initChunk</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">freeChunk</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>) {\n  <span class=\"a\">FREE_ARRAY</span>(<span class=\"t\">uint8_t</span>, <span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>, <span class=\"i\">chunk</span>-&gt;<span class=\"i\">capacity</span>);\n  <span class=\"i\">initChunk</span>(<span class=\"i\">chunk</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em>, add after <em>initChunk</em>()</div>\n\n<p>We deallocate all of the memory and then call <code>initChunk()</code> to zero out the\nfields leaving the chunk in a well-defined empty state. 
To free the memory, we\nadd one more macro.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define GROW_ARRAY(type, pointer, oldCount, newCount) \\\n    (type*)reallocate(pointer, sizeof(type) * (oldCount), \\\n        sizeof(type) * (newCount))\n</pre><div class=\"source-file\"><em>memory.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define FREE_ARRAY(type, pointer, oldCount) \\</span>\n<span class=\"a\">    reallocate(pointer, sizeof(type) * (oldCount), 0)</span>\n</pre><pre class=\"insert-after\">\n\nvoid* reallocate(void* pointer, size_t oldSize, size_t newSize);\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.h</em></div>\n\n<p>Like <code>GROW_ARRAY()</code>, this is a wrapper around a call to <code>reallocate()</code>. This one\nfrees the memory by passing in zero for the new size. I know, this is a lot of\nboring low-level stuff. Don&rsquo;t worry, we&rsquo;ll get a lot of use out of these in\nlater chapters and will get to program at a higher level. Before we can do that,\nthough, we gotta lay our own foundation.</p>\n<h2><a href=\"#disassembling-chunks\" id=\"disassembling-chunks\"><small>14&#8202;.&#8202;4</small>Disassembling Chunks</a></h2>\n<p>Now we have a little module for creating chunks of bytecode. 
Let&rsquo;s try it out by\nhand-building a sample chunk.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">int main(int argc, const char* argv[]) {\n</pre><div class=\"source-file\"><em>main.c</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">Chunk</span> <span class=\"i\">chunk</span>;\n  <span class=\"i\">initChunk</span>(&amp;<span class=\"i\">chunk</span>);\n  <span class=\"i\">writeChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"a\">OP_RETURN</span>);\n  <span class=\"i\">freeChunk</span>(&amp;<span class=\"i\">chunk</span>);\n</pre><pre class=\"insert-after\">  return 0;\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, in <em>main</em>()</div>\n\n<p>Don&rsquo;t forget the include.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;common.h&quot;\n</pre><div class=\"source-file\"><em>main.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;chunk.h&quot;</span>\n</pre><pre class=\"insert-after\">\n\nint main(int argc, const char* argv[]) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em></div>\n\n<p>Run that and give it a try. Did it work? Uh<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>who knows? All we&rsquo;ve done is push\nsome bytes around in memory. We have no human-friendly way to see what&rsquo;s\nactually inside that chunk we made.</p>\n<p>To fix this, we&rsquo;re going to create a <strong>disassembler</strong>. An <strong>assembler</strong> is an\nold-school program that takes a file containing human-readable mnemonic names\nfor CPU instructions like &ldquo;ADD&rdquo; and &ldquo;MULT&rdquo; and translates them to their binary\nmachine code equivalent. A <em>dis</em>assembler goes in the other direction<span class=\"em\">&mdash;</span>given a\nblob of machine code, it spits out a textual listing of the instructions.</p>\n<p>We&rsquo;ll implement something <span name=\"printer\">similar</span>. 
Given a chunk, it\nwill print out all of the instructions in it. A Lox <em>user</em> won&rsquo;t use this, but\nwe Lox <em>maintainers</em> will certainly benefit since it gives us a window into the\ninterpreter&rsquo;s internal representation of code.</p>\n<aside name=\"printer\">\n<p>In jlox, our analogous tool was the <a href=\"representing-code.html#a-not-very-pretty-printer\">AstPrinter class</a>.</p>\n</aside>\n<p>In <code>main()</code>, after we create the chunk, we pass it to the disassembler.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  initChunk(&amp;chunk);\n  writeChunk(&amp;chunk, OP_RETURN);\n</pre><div class=\"source-file\"><em>main.c</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">disassembleChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"s\">&quot;test chunk&quot;</span>);\n</pre><pre class=\"insert-after\">  freeChunk(&amp;chunk);\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, in <em>main</em>()</div>\n\n<p>Again, we whip up <span name=\"module\">yet another</span> module.</p>\n<aside name=\"module\">\n<p>I promise you we won&rsquo;t be creating this many new files in later chapters.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;chunk.h&quot;\n</pre><div class=\"source-file\"><em>main.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;debug.h&quot;</span>\n</pre><pre class=\"insert-after\">\n\nint main(int argc, const char* argv[]) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em></div>\n\n<p>Here&rsquo;s that header:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>debug.h</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#ifndef clox_debug_h</span>\n<span class=\"a\">#define clox_debug_h</span>\n\n<span class=\"a\">#include &quot;chunk.h&quot;</span>\n\n<span class=\"t\">void</span> <span class=\"i\">disassembleChunk</span>(<span class=\"t\">Chunk</span>* <span 
class=\"i\">chunk</span>, <span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">name</span>);\n<span class=\"t\">int</span> <span class=\"i\">disassembleInstruction</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>, <span class=\"t\">int</span> <span class=\"i\">offset</span>);\n\n<span class=\"a\">#endif</span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.h</em>, create new file</div>\n\n<p>In <code>main()</code>, we call <code>disassembleChunk()</code> to disassemble all of the instructions\nin the entire chunk. That&rsquo;s implemented in terms of the other function, which\njust disassembles a single instruction. It shows up here in the header because\nwe&rsquo;ll call it from the VM in later chapters.</p>\n<p>Here&rsquo;s a start at the implementation file:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>debug.c</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#include &lt;stdio.h&gt;</span>\n\n<span class=\"a\">#include &quot;debug.h&quot;</span>\n\n<span class=\"t\">void</span> <span class=\"i\">disassembleChunk</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>, <span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">name</span>) {\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;== %s ==</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, <span class=\"i\">name</span>);\n\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">offset</span> = <span class=\"n\">0</span>; <span class=\"i\">offset</span> &lt; <span class=\"i\">chunk</span>-&gt;<span class=\"i\">count</span>;) {\n    <span class=\"i\">offset</span> = <span class=\"i\">disassembleInstruction</span>(<span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, create new file</div>\n\n<p>To disassemble a chunk, we print a little header (so we can 
tell <em>which</em> chunk\nwe&rsquo;re looking at) and then crank through the bytecode, disassembling each\ninstruction. The way we iterate through the code is a little odd. Instead of\nincrementing <code>offset</code> in the loop, we let <code>disassembleInstruction()</code> do it for\nus. When we call that function, after disassembling the instruction at the given\noffset, it returns the offset of the <em>next</em> instruction. This is because, as\nwe&rsquo;ll see later, instructions can have different sizes.</p>\n<p>The core of the &ldquo;debug&rdquo; module is this function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>debug.c</em><br>\nadd after <em>disassembleChunk</em>()</div>\n<pre><span class=\"t\">int</span> <span class=\"i\">disassembleInstruction</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>, <span class=\"t\">int</span> <span class=\"i\">offset</span>) {\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;%04d &quot;</span>, <span class=\"i\">offset</span>);\n\n  <span class=\"t\">uint8_t</span> <span class=\"i\">instruction</span> = <span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>[<span class=\"i\">offset</span>];\n  <span class=\"k\">switch</span> (<span class=\"i\">instruction</span>) {\n    <span class=\"k\">case</span> <span class=\"a\">OP_RETURN</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_RETURN&quot;</span>, <span class=\"i\">offset</span>);\n    <span class=\"k\">default</span>:\n      <span class=\"i\">printf</span>(<span class=\"s\">&quot;Unknown opcode %d</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, <span class=\"i\">instruction</span>);\n      <span class=\"k\">return</span> <span class=\"i\">offset</span> + <span class=\"n\">1</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, add after <em>disassembleChunk</em>()</div>\n\n<p>First, it prints the byte 
offset of the given instruction<span class=\"em\">&mdash;</span>that tells us where\nin the chunk this instruction is. This will be a helpful signpost when we start\ndoing control flow and jumping around in the bytecode.</p>\n<p>Next, it reads a single byte from the bytecode at the given offset. That&rsquo;s our\nopcode. We <span name=\"switch\">switch</span> on that. For each kind of\ninstruction, we dispatch to a little utility function for displaying it. On the\noff chance that the given byte doesn&rsquo;t look like an instruction at all<span class=\"em\">&mdash;</span>a bug\nin our compiler<span class=\"em\">&mdash;</span>we print that too. For the one instruction we do have,\n<code>OP_RETURN</code>, the display function is:</p>\n<aside name=\"switch\">\n<p>We have only one instruction right now, but this switch will grow throughout the\nrest of the book.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>debug.c</em><br>\nadd after <em>disassembleChunk</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">int</span> <span class=\"i\">simpleInstruction</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">name</span>, <span class=\"t\">int</span> <span class=\"i\">offset</span>) {\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;%s</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, <span class=\"i\">name</span>);\n  <span class=\"k\">return</span> <span class=\"i\">offset</span> + <span class=\"n\">1</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, add after <em>disassembleChunk</em>()</div>\n\n<p>There isn&rsquo;t much to a return instruction, so all it does is print the name of\nthe opcode, then return the next byte offset past this instruction. 
Other\ninstructions will have more going on.</p>\n<p>If we run our nascent interpreter now, it actually prints something:</p>\n<div class=\"codehilite\"><pre>== test chunk ==\n0000 OP_RETURN\n</pre></div>\n<p>It worked! This is sort of the &ldquo;Hello, world!&rdquo; of our code representation. We\ncan create a chunk, write an instruction to it, and then extract that\ninstruction back out. Our encoding and decoding of the binary bytecode is\nworking.</p>\n<h2><a href=\"#constants\" id=\"constants\"><small>14&#8202;.&#8202;5</small>Constants</a></h2>\n<p>Now that we have a rudimentary chunk structure working, let&rsquo;s start making it\nmore useful. We can store <em>code</em> in chunks, but what about <em>data</em>? Many values\nthe interpreter works with are created at runtime as the result of operations.</p>\n<div class=\"codehilite\"><pre><span class=\"n\">1</span> + <span class=\"n\">2</span>;\n</pre></div>\n<p>The value 3 appears nowhere in the code here. However, the literals <code>1</code> and <code>2</code>\ndo. To compile that statement to bytecode, we need some sort of instruction that\nmeans &ldquo;produce a constant&rdquo; and those literal values need to get stored in the\nchunk somewhere. In jlox, the Expr.Literal AST node held the value. We need a\ndifferent solution now that we don&rsquo;t have a syntax tree.</p>\n<h3><a href=\"#representing-values\" id=\"representing-values\"><small>14&#8202;.&#8202;5&#8202;.&#8202;1</small>Representing values</a></h3>\n<p>We won&rsquo;t be <em>running</em> any code in this chapter, but since constants have a foot\nin both the static and dynamic worlds of our interpreter, they force us to start\nthinking at least a little bit about how our VM should represent values.</p>\n<p>For now, we&rsquo;re going to start as simple as possible<span class=\"em\">&mdash;</span>we&rsquo;ll support only\ndouble-precision, floating-point numbers. 
This will obviously expand over time,\nso we&rsquo;ll set up a new module to give ourselves room to grow.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>value.h</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#ifndef clox_value_h</span>\n<span class=\"a\">#define clox_value_h</span>\n\n<span class=\"a\">#include &quot;common.h&quot;</span>\n\n<span class=\"k\">typedef</span> <span class=\"t\">double</span> <span class=\"t\">Value</span>;\n\n<span class=\"a\">#endif</span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em>, create new file</div>\n\n<p>This typedef abstracts how Lox values are concretely represented in C. That way,\nwe can change that representation without needing to go back and fix existing\ncode that passes around values.</p>\n<p>Back to the question of where to store constants in a chunk. For small\nfixed-size values like integers, many instruction sets store the value directly\nin the code stream right after the opcode. These are called <strong>immediate\ninstructions</strong> because the bits for the value are immediately after the opcode.</p>\n<p>That doesn&rsquo;t work well for large or variable-sized constants like strings. In a\nnative compiler to machine code, those bigger constants get stored in a separate\n&ldquo;constant data&rdquo; region in the binary executable. Then, the instruction to load a\nconstant has an address or offset pointing to where the value is stored in that\nsection.</p>\n<p>Most virtual machines do something similar. For example, the Java Virtual\nMachine <a href=\"https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html#jvms-4.4\">associates a <strong>constant pool</strong></a> with each compiled class.\nThat sounds good enough for clox to me. Each chunk will carry with it a list of\nthe values that appear as literals in the program. 
To keep things <span\nname=\"immediate\">simpler</span>, we&rsquo;ll put <em>all</em> constants in there, even simple\nintegers.</p>\n<aside name=\"immediate\">\n<p>In addition to needing two kinds of constant instructions<span class=\"em\">&mdash;</span>one for immediate\nvalues and one for constants in the constant table<span class=\"em\">&mdash;</span>immediates also force us\nto worry about alignment, padding, and endianness. Some architectures aren&rsquo;t\nhappy if you try to, say, stuff a 4-byte integer at an odd address.</p>\n</aside>\n<h3><a href=\"#value-arrays\" id=\"value-arrays\"><small>14&#8202;.&#8202;5&#8202;.&#8202;2</small>Value arrays</a></h3>\n<p>The constant pool is an array of values. The instruction to load a constant\nlooks up the value by index in that array. As with our <span\nname=\"generic\">bytecode</span> array, the compiler doesn&rsquo;t know how big the\narray needs to be ahead of time. So, again, we need a dynamic one. Since C\ndoesn&rsquo;t have generic data structures, we&rsquo;ll write another dynamic array data\nstructure, this time for Value.</p>\n<aside name=\"generic\">\n<p>Defining a new struct and manipulation functions each time we need a dynamic\narray of a different type is a chore. We could cobble together some preprocessor\nmacros to fake generics, but that&rsquo;s overkill for clox. 
We won&rsquo;t need many more\nof these.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef double Value;\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">int</span> <span class=\"i\">capacity</span>;\n  <span class=\"t\">int</span> <span class=\"i\">count</span>;\n  <span class=\"t\">Value</span>* <span class=\"i\">values</span>;\n} <span class=\"t\">ValueArray</span>;\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>As with the bytecode array in Chunk, this struct wraps a pointer to an array\nalong with its allocated capacity and the number of elements in use. We also\nneed the same three functions to work with value arrays.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} ValueArray;\n</pre><div class=\"source-file\"><em>value.h</em><br>\nadd after struct <em>ValueArray</em></div>\n<pre class=\"insert\">\n\n<span class=\"t\">void</span> <span class=\"i\">initValueArray</span>(<span class=\"t\">ValueArray</span>* <span class=\"i\">array</span>);\n<span class=\"t\">void</span> <span class=\"i\">writeValueArray</span>(<span class=\"t\">ValueArray</span>* <span class=\"i\">array</span>, <span class=\"t\">Value</span> <span class=\"i\">value</span>);\n<span class=\"t\">void</span> <span class=\"i\">freeValueArray</span>(<span class=\"t\">ValueArray</span>* <span class=\"i\">array</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em>, add after struct <em>ValueArray</em></div>\n\n<p>The implementations will probably give you déjà vu. 
First, to create a new one:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>value.c</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#include &lt;stdio.h&gt;</span>\n\n<span class=\"a\">#include &quot;memory.h&quot;</span>\n<span class=\"a\">#include &quot;value.h&quot;</span>\n\n<span class=\"t\">void</span> <span class=\"i\">initValueArray</span>(<span class=\"t\">ValueArray</span>* <span class=\"i\">array</span>) {\n  <span class=\"i\">array</span>-&gt;<span class=\"i\">values</span> = <span class=\"a\">NULL</span>;\n  <span class=\"i\">array</span>-&gt;<span class=\"i\">capacity</span> = <span class=\"n\">0</span>;\n  <span class=\"i\">array</span>-&gt;<span class=\"i\">count</span> = <span class=\"n\">0</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, create new file</div>\n\n<p>Once we have an initialized array, we can start <span name=\"add\">adding</span>\nvalues to it.</p>\n<aside name=\"add\">\n<p>Fortunately, we don&rsquo;t need other operations like insertion and removal.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>value.c</em><br>\nadd after <em>initValueArray</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">writeValueArray</span>(<span class=\"t\">ValueArray</span>* <span class=\"i\">array</span>, <span class=\"t\">Value</span> <span class=\"i\">value</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">array</span>-&gt;<span class=\"i\">capacity</span> &lt; <span class=\"i\">array</span>-&gt;<span class=\"i\">count</span> + <span class=\"n\">1</span>) {\n    <span class=\"t\">int</span> <span class=\"i\">oldCapacity</span> = <span class=\"i\">array</span>-&gt;<span class=\"i\">capacity</span>;\n    <span class=\"i\">array</span>-&gt;<span class=\"i\">capacity</span> = <span class=\"a\">GROW_CAPACITY</span>(<span class=\"i\">oldCapacity</span>);\n    <span class=\"i\">array</span>-&gt;<span class=\"i\">values</span> = <span 
class=\"a\">GROW_ARRAY</span>(<span class=\"t\">Value</span>, <span class=\"i\">array</span>-&gt;<span class=\"i\">values</span>,\n                               <span class=\"i\">oldCapacity</span>, <span class=\"i\">array</span>-&gt;<span class=\"i\">capacity</span>);\n  }\n\n  <span class=\"i\">array</span>-&gt;<span class=\"i\">values</span>[<span class=\"i\">array</span>-&gt;<span class=\"i\">count</span>] = <span class=\"i\">value</span>;\n  <span class=\"i\">array</span>-&gt;<span class=\"i\">count</span>++;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, add after <em>initValueArray</em>()</div>\n\n<p>The memory-management macros we wrote earlier do let us reuse some of the logic\nfrom the code array, so this isn&rsquo;t too bad. Finally, to release all memory used\nby the array:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>value.c</em><br>\nadd after <em>writeValueArray</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">freeValueArray</span>(<span class=\"t\">ValueArray</span>* <span class=\"i\">array</span>) {\n  <span class=\"a\">FREE_ARRAY</span>(<span class=\"t\">Value</span>, <span class=\"i\">array</span>-&gt;<span class=\"i\">values</span>, <span class=\"i\">array</span>-&gt;<span class=\"i\">capacity</span>);\n  <span class=\"i\">initValueArray</span>(<span class=\"i\">array</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, add after <em>writeValueArray</em>()</div>\n\n<p>Now that we have growable arrays of values, we can add one to Chunk to store the\nchunk&rsquo;s constants.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  uint8_t* code;\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin struct <em>Chunk</em></div>\n<pre class=\"insert\">  <span class=\"t\">ValueArray</span> <span class=\"i\">constants</span>;\n</pre><pre class=\"insert-after\">} Chunk;\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in struct 
<em>Chunk</em></div>\n\n<p>Don&rsquo;t forget the include.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;common.h&quot;\n</pre><div class=\"source-file\"><em>chunk.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;value.h&quot;</span>\n</pre><pre class=\"insert-after\">\n\ntypedef enum {\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em></div>\n\n<p>Ah, C, and its Stone Age modularity story. Where were we? Right. When we\ninitialize a new chunk, we initialize its constant list too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  chunk-&gt;code = NULL;\n</pre><div class=\"source-file\"><em>chunk.c</em><br>\nin <em>initChunk</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">initValueArray</span>(&amp;<span class=\"i\">chunk</span>-&gt;<span class=\"i\">constants</span>);\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em>, in <em>initChunk</em>()</div>\n\n<p>Likewise, we free the constants when we free the chunk.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  FREE_ARRAY(uint8_t, chunk-&gt;code, chunk-&gt;capacity);\n</pre><div class=\"source-file\"><em>chunk.c</em><br>\nin <em>freeChunk</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">freeValueArray</span>(&amp;<span class=\"i\">chunk</span>-&gt;<span class=\"i\">constants</span>);\n</pre><pre class=\"insert-after\">  initChunk(chunk);\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em>, in <em>freeChunk</em>()</div>\n\n<p>Next, we define a convenience method to add a new constant to the chunk. 
Our\nyet-to-be-written compiler could write to the constant array inside Chunk\ndirectly<span class=\"em\">&mdash;</span>it&rsquo;s not like C has private fields or anything<span class=\"em\">&mdash;</span>but it&rsquo;s a little\nnicer to add an explicit function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void writeChunk(Chunk* chunk, uint8_t byte);\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nadd after <em>writeChunk</em>()</div>\n<pre class=\"insert\"><span class=\"t\">int</span> <span class=\"i\">addConstant</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>, <span class=\"t\">Value</span> <span class=\"i\">value</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, add after <em>writeChunk</em>()</div>\n\n<p>Then we implement it.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>chunk.c</em><br>\nadd after <em>writeChunk</em>()</div>\n<pre><span class=\"t\">int</span> <span class=\"i\">addConstant</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>, <span class=\"t\">Value</span> <span class=\"i\">value</span>) {\n  <span class=\"i\">writeValueArray</span>(&amp;<span class=\"i\">chunk</span>-&gt;<span class=\"i\">constants</span>, <span class=\"i\">value</span>);\n  <span class=\"k\">return</span> <span class=\"i\">chunk</span>-&gt;<span class=\"i\">constants</span>.<span class=\"i\">count</span> - <span class=\"n\">1</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em>, add after <em>writeChunk</em>()</div>\n\n<p>After we add the constant, we return the index where the constant was appended\nso that we can locate that same constant later.</p>\n<h3><a href=\"#constant-instructions\" id=\"constant-instructions\"><small>14&#8202;.&#8202;5&#8202;.&#8202;3</small>Constant instructions</a></h3>\n<p>We can <em>store</em> constants in chunks, but we also need to <em>execute</em> them. 
In a\npiece of code like:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> <span class=\"n\">1</span>;\n<span class=\"k\">print</span> <span class=\"n\">2</span>;\n</pre></div>\n<p>The compiled chunk needs to not only contain the values 1 and 2, but know <em>when</em>\nto produce them so that they are printed in the right order. Thus, we need an\ninstruction that produces a particular constant.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef enum {\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_CONSTANT</span>,\n</pre><pre class=\"insert-after\">  OP_RETURN,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>When the VM executes a constant instruction, it <span name=\"load\">&ldquo;loads&rdquo;</span>\nthe constant for use. This new instruction is a little more complex than\n<code>OP_RETURN</code>. In the above example, we load two different constants. A single\nbare opcode isn&rsquo;t enough to know <em>which</em> constant to load.</p>\n<aside name=\"load\">\n<p>I&rsquo;m being vague about what it means to &ldquo;load&rdquo; or &ldquo;produce&rdquo; a constant because we\nhaven&rsquo;t learned how the virtual machine actually executes code at runtime yet.\nFor that, you&rsquo;ll have to wait until you get to (or skip ahead to, I suppose) the\n<a href=\"a-virtual-machine.html\">next chapter</a>.</p>\n</aside>\n<p>To handle cases like this, our bytecode<span class=\"em\">&mdash;</span>like most others<span class=\"em\">&mdash;</span>allows\ninstructions to have <span name=\"operand\"><strong>operands</strong></span>. 
These are stored\nas binary data immediately after the opcode in the instruction stream and let us\nparameterize what the instruction does.</p>\n<p><img src=\"image/chunks-of-bytecode/format.png\" alt=\"OP_CONSTANT is a byte for\nthe opcode followed by a byte for the constant index.\" /></p>\n<p>Each opcode determines how many operand bytes it has and what they mean. For\nexample, a simple operation like &ldquo;return&rdquo; may have no operands, where an\ninstruction for &ldquo;load local variable&rdquo; needs an operand to identify which\nvariable to load. Each time we add a new opcode to clox, we specify what its\noperands look like<span class=\"em\">&mdash;</span>its <strong>instruction format</strong>.</p>\n<aside name=\"operand\">\n<p>Bytecode instruction operands are <em>not</em> the same as the operands passed to an\narithmetic operator. You&rsquo;ll see when we get to expressions that arithmetic\noperand values are tracked separately. Instruction operands are a lower-level\nnotion that modifies how the bytecode instruction itself behaves.</p>\n</aside>\n<p>In this case, <code>OP_CONSTANT</code> takes a single-byte operand that specifies which\nconstant to load from the chunk&rsquo;s constant array. 
Since we don&rsquo;t have a compiler\nyet, we &ldquo;hand-compile&rdquo; an instruction in our test chunk.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  initChunk(&amp;chunk);\n</pre><div class=\"source-file\"><em>main.c</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"t\">int</span> <span class=\"i\">constant</span> = <span class=\"i\">addConstant</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"n\">1.2</span>);\n  <span class=\"i\">writeChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"a\">OP_CONSTANT</span>);\n  <span class=\"i\">writeChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"i\">constant</span>);\n\n</pre><pre class=\"insert-after\">  writeChunk(&amp;chunk, OP_RETURN);\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, in <em>main</em>()</div>\n\n<p>We add the constant value itself to the chunk&rsquo;s constant pool. That returns the\nindex of the constant in the array. Then we write the constant instruction,\nstarting with its opcode. After that, we write the one-byte constant index\noperand. Note that <code>writeChunk()</code> can write opcodes or operands. It&rsquo;s all raw\nbytes as far as that function is concerned.</p>\n<p>If we try to run this now, the disassembler is going to yell at us because it\ndoesn&rsquo;t know how to decode the new instruction. 
Let&rsquo;s fix that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (instruction) {\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_CONSTANT</span>:\n      <span class=\"k\">return</span> <span class=\"i\">constantInstruction</span>(<span class=\"s\">&quot;OP_CONSTANT&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_RETURN:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>This instruction has a different instruction format, so we write a new helper\nfunction to disassemble it.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>debug.c</em><br>\nadd after <em>disassembleChunk</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">int</span> <span class=\"i\">constantInstruction</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">name</span>, <span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>,\n                               <span class=\"t\">int</span> <span class=\"i\">offset</span>) {\n  <span class=\"t\">uint8_t</span> <span class=\"i\">constant</span> = <span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>[<span class=\"i\">offset</span> + <span class=\"n\">1</span>];\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;%-16s %4d &#39;&quot;</span>, <span class=\"i\">name</span>, <span class=\"i\">constant</span>);\n  <span class=\"i\">printValue</span>(<span class=\"i\">chunk</span>-&gt;<span class=\"i\">constants</span>.<span class=\"i\">values</span>[<span class=\"i\">constant</span>]);\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;&#39;</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n}\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>debug.c</em>, add after <em>disassembleChunk</em>()</div>\n\n<p>There&rsquo;s more going on here. As with <code>OP_RETURN</code>, we print out the name of the\nopcode. Then we pull out the constant index from the subsequent byte in the\nchunk. We print that index, but that isn&rsquo;t super useful to us human readers. So\nwe also look up the actual constant value<span class=\"em\">&mdash;</span>since constants <em>are</em> known at\ncompile time after all<span class=\"em\">&mdash;</span>and display the value itself too.</p>\n<p>This requires some way to print a clox Value. That function will live in the\n&ldquo;value&rdquo; module, so we include that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;debug.h&quot;\n</pre><div class=\"source-file\"><em>debug.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;value.h&quot;</span>\n</pre><pre class=\"insert-after\">\n\nvoid disassembleChunk(Chunk* chunk, const char* name) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em></div>\n\n<p>Over in that header, we declare:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void freeValueArray(ValueArray* array);\n</pre><div class=\"source-file\"><em>value.h</em><br>\nadd after <em>freeValueArray</em>()</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">printValue</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em>, add after <em>freeValueArray</em>()</div>\n\n<p>And here&rsquo;s an implementation:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>value.c</em><br>\nadd after <em>freeValueArray</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">printValue</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>) {\n  <span class=\"i\">printf</span>(<span 
class=\"s\">&quot;%g&quot;</span>, <span class=\"i\">value</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, add after <em>freeValueArray</em>()</div>\n\n<p>Magnificent, right? As you can imagine, this is going to get more complex once\nwe add dynamic typing to Lox and have values of different types.</p>\n<p>Back in <code>constantInstruction()</code>, the only remaining piece is the return value.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  printf(&quot;'\\n&quot;);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>constantInstruction</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">return</span> <span class=\"i\">offset</span> + <span class=\"n\">2</span>;\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>constantInstruction</em>()</div>\n\n<p>Remember that <code>disassembleInstruction()</code> also returns a number to tell the\ncaller the offset of the beginning of the <em>next</em> instruction. Where <code>OP_RETURN</code>\nwas only a single byte, <code>OP_CONSTANT</code> is two<span class=\"em\">&mdash;</span>one for the opcode and one for\nthe operand.</p>\n<h2><a href=\"#line-information\" id=\"line-information\"><small>14&#8202;.&#8202;6</small>Line Information</a></h2>\n<p>Chunks contain almost all of the information that the runtime needs from the\nuser&rsquo;s source code. It&rsquo;s kind of crazy to think that we can reduce all of the\ndifferent AST classes that we created in jlox down to an array of bytes and an\narray of constants. There&rsquo;s only one piece of data we&rsquo;re missing. We need it,\neven though the user hopes to never see it.</p>\n<p>When a runtime error occurs, we show the user the line number of the offending\nsource code. In jlox, those numbers live in tokens, which we in turn store in\nthe AST nodes. We need a different solution for clox now that we&rsquo;ve ditched\nsyntax trees in favor of bytecode. 
Given any bytecode instruction, we need to be\nable to determine the line of the user&rsquo;s source program that it was compiled\nfrom.</p>\n<p>There are a lot of clever ways we could encode this. I took the absolute <span\nname=\"side\">simplest</span> approach I could come up with, even though it&rsquo;s\nembarrassingly inefficient with memory. In the chunk, we store a separate array\nof integers that parallels the bytecode. Each number in the array is the line\nnumber for the corresponding byte in the bytecode. When a runtime error occurs,\nwe look up the line number at the same index as the current instruction&rsquo;s offset\nin the code array.</p>\n<aside name=\"side\">\n<p>This braindead encoding does do one thing right: it keeps the line information\nin a <em>separate</em> array instead of interleaving it in the bytecode itself. Since\nline information is only used when a runtime error occurs, we don&rsquo;t want it\nbetween the instructions, taking up precious space in the CPU cache and causing\nmore cache misses as the interpreter skips past it to get to the opcodes and\noperands it cares about.</p>\n</aside>\n<p>To implement this, we add another array to Chunk.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  uint8_t* code;\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin struct <em>Chunk</em></div>\n<pre class=\"insert\">  <span class=\"t\">int</span>* <span class=\"i\">lines</span>;\n</pre><pre class=\"insert-after\">  ValueArray constants;\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in struct <em>Chunk</em></div>\n\n<p>Since it exactly parallels the bytecode array, we don&rsquo;t need a separate count or\ncapacity. 
Every time we touch the code array, we make a corresponding change to\nthe line number array, starting with initialization.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  chunk-&gt;code = NULL;\n</pre><div class=\"source-file\"><em>chunk.c</em><br>\nin <em>initChunk</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">chunk</span>-&gt;<span class=\"i\">lines</span> = <span class=\"a\">NULL</span>;\n</pre><pre class=\"insert-after\">  initValueArray(&amp;chunk-&gt;constants);\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em>, in <em>initChunk</em>()</div>\n\n<p>And likewise deallocation:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  FREE_ARRAY(uint8_t, chunk-&gt;code, chunk-&gt;capacity);\n</pre><div class=\"source-file\"><em>chunk.c</em><br>\nin <em>freeChunk</em>()</div>\n<pre class=\"insert\">  <span class=\"a\">FREE_ARRAY</span>(<span class=\"t\">int</span>, <span class=\"i\">chunk</span>-&gt;<span class=\"i\">lines</span>, <span class=\"i\">chunk</span>-&gt;<span class=\"i\">capacity</span>);\n</pre><pre class=\"insert-after\">  freeValueArray(&amp;chunk-&gt;constants);\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em>, in <em>freeChunk</em>()</div>\n\n<p>When we write a byte of code to the chunk, we need to know what source line it\ncame from, so we add an extra parameter in the declaration of <code>writeChunk()</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void freeChunk(Chunk* chunk);\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nfunction <em>writeChunk</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">writeChunk</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>, <span class=\"t\">uint8_t</span> <span class=\"i\">byte</span>, <span class=\"t\">int</span> <span class=\"i\">line</span>);\n</pre><pre class=\"insert-after\">int addConstant(Chunk* chunk, Value value);\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>chunk.h</em>, function <em>writeChunk</em>(), replace 1 line</div>\n\n<p>And in the implementation:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>chunk.c</em><br>\nfunction <em>writeChunk</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">writeChunk</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>, <span class=\"t\">uint8_t</span> <span class=\"i\">byte</span>, <span class=\"t\">int</span> <span class=\"i\">line</span>) {\n</pre><pre class=\"insert-after\">  if (chunk-&gt;capacity &lt; chunk-&gt;count + 1) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em>, function <em>writeChunk</em>(), replace 1 line</div>\n\n<p>When we allocate or grow the code array, we do the same for the line info too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    chunk-&gt;code = GROW_ARRAY(uint8_t, chunk-&gt;code,\n        oldCapacity, chunk-&gt;capacity);\n</pre><div class=\"source-file\"><em>chunk.c</em><br>\nin <em>writeChunk</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">chunk</span>-&gt;<span class=\"i\">lines</span> = <span class=\"a\">GROW_ARRAY</span>(<span class=\"t\">int</span>, <span class=\"i\">chunk</span>-&gt;<span class=\"i\">lines</span>,\n        <span class=\"i\">oldCapacity</span>, <span class=\"i\">chunk</span>-&gt;<span class=\"i\">capacity</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em>, in <em>writeChunk</em>()</div>\n\n<p>Finally, we store the line number in the array.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  chunk-&gt;code[chunk-&gt;count] = byte;\n</pre><div class=\"source-file\"><em>chunk.c</em><br>\nin <em>writeChunk</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">chunk</span>-&gt;<span class=\"i\">lines</span>[<span class=\"i\">chunk</span>-&gt;<span class=\"i\">count</span>] = <span 
class=\"i\">line</span>;\n</pre><pre class=\"insert-after\">  chunk-&gt;count++;\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em>, in <em>writeChunk</em>()</div>\n\n<h3><a href=\"#disassembling-line-information\" id=\"disassembling-line-information\"><small>14&#8202;.&#8202;6&#8202;.&#8202;1</small>Disassembling line information</a></h3>\n<p>Alright, let&rsquo;s try this out with our little, uh, artisanal chunk. First, since\nwe added a new parameter to <code>writeChunk()</code>, we need to fix those calls to pass\nin some<span class=\"em\">&mdash;</span>arbitrary at this point<span class=\"em\">&mdash;</span>line number.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  int constant = addConstant(&amp;chunk, 1.2);\n</pre><div class=\"source-file\"><em>main.c</em><br>\nin <em>main</em>()<br>\nreplace 4 lines</div>\n<pre class=\"insert\">  <span class=\"i\">writeChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"a\">OP_CONSTANT</span>, <span class=\"n\">123</span>);\n  <span class=\"i\">writeChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"i\">constant</span>, <span class=\"n\">123</span>);\n\n  <span class=\"i\">writeChunk</span>(&amp;<span class=\"i\">chunk</span>, <span class=\"a\">OP_RETURN</span>, <span class=\"n\">123</span>);\n</pre><pre class=\"insert-after\">\n\n  disassembleChunk(&amp;chunk, &quot;test chunk&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, in <em>main</em>(), replace 4 lines</div>\n\n<p>Once we have a real front end, of course, the compiler will track the current\nline as it parses and pass that in.</p>\n<p>Now that we have line information for every instruction, let&rsquo;s put it to good\nuse. In our disassembler, it&rsquo;s helpful to show which source line each\ninstruction was compiled from. That gives us a way to map back to the original\ncode when we&rsquo;re trying to figure out what some blob of bytecode is supposed to\ndo. 
After printing the offset of the instruction<span class=\"em\">&mdash;</span>the number of bytes from the\nbeginning of the chunk<span class=\"em\">&mdash;</span>we show its source line.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">int disassembleInstruction(Chunk* chunk, int offset) {\n  printf(&quot;%04d &quot;, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">offset</span> &gt; <span class=\"n\">0</span> &amp;&amp;\n      <span class=\"i\">chunk</span>-&gt;<span class=\"i\">lines</span>[<span class=\"i\">offset</span>] == <span class=\"i\">chunk</span>-&gt;<span class=\"i\">lines</span>[<span class=\"i\">offset</span> - <span class=\"n\">1</span>]) {\n    <span class=\"i\">printf</span>(<span class=\"s\">&quot;   | &quot;</span>);\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">printf</span>(<span class=\"s\">&quot;%4d &quot;</span>, <span class=\"i\">chunk</span>-&gt;<span class=\"i\">lines</span>[<span class=\"i\">offset</span>]);\n  }\n</pre><pre class=\"insert-after\">\n\n  uint8_t instruction = chunk-&gt;code[offset];\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>Bytecode instructions tend to be pretty fine-grained. A single line of source\ncode often compiles to a whole sequence of instructions. To make that more\nvisually clear, we show a <code>|</code> for any instruction that comes from the same\nsource line as the preceding one. The resulting output for our handwritten\nchunk looks like:</p>\n<div class=\"codehilite\"><pre>== test chunk ==\n0000  123 OP_CONSTANT         0 '1.2'\n0002    | OP_RETURN\n</pre></div>\n<p>We have a three-byte chunk. The first two bytes are a constant instruction that\nloads 1.2 from the chunk&rsquo;s constant pool. 
The first byte is the <code>OP_CONSTANT</code>\nopcode and the second is the index in the constant pool. The third byte (at\noffset 2) is a single-byte return instruction.</p>\n<p>In the remaining chapters, we will flesh this out with lots more kinds of\ninstructions. But the basic structure is here, and we have everything we need\nnow to completely represent an executable piece of code at runtime in our\nvirtual machine. Remember that whole family of AST classes we defined in jlox?\nIn clox, we&rsquo;ve reduced that down to three arrays: bytes of code, constant\nvalues, and line information for debugging.</p>\n<p>This reduction is a key reason why our new interpreter will be faster than jlox.\nYou can think of bytecode as a sort of compact serialization of the AST, highly\noptimized for how the interpreter will deserialize it in the order it needs as\nit executes. In the <a href=\"a-virtual-machine.html\">next chapter</a>, we will see how the virtual machine does\nexactly that.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Our encoding of line information is hilariously wasteful of memory. Given\nthat a series of instructions often corresponds to the same source line, a\nnatural solution is something akin to <a href=\"https://en.wikipedia.org/wiki/Run-length_encoding\">run-length encoding</a> of the line\nnumbers.</p>\n<p>Devise an encoding that compresses the line information for a\nseries of instructions on the same line. 
Change <code>writeChunk()</code> to write this\ncompressed form, and implement a <code>getLine()</code> function that, given the index\nof an instruction, determines the line where the instruction occurs.</p>\n<p><em>Hint: It&rsquo;s not necessary for <code>getLine()</code> to be particularly efficient.\nSince it is called only when a runtime error occurs, it is well off the\ncritical path where performance matters.</em></p>\n</li>\n<li>\n<p>Because <code>OP_CONSTANT</code> uses only a single byte for its operand, a chunk may\nonly contain up to 256 different constants. That&rsquo;s small enough that people\nwriting real-world code will hit that limit. We could use two or more bytes\nto store the operand, but that makes <em>every</em> constant instruction take up\nmore space. Most chunks won&rsquo;t need that many unique constants, so that\nwastes space and sacrifices some locality in the common case to support the\nrare case.</p>\n<p>To balance those two competing aims, many instruction sets feature multiple\ninstructions that perform the same operation but with operands of different\nsizes. Leave our existing one-byte <code>OP_CONSTANT</code> instruction alone, and\ndefine a second <code>OP_CONSTANT_LONG</code> instruction. It stores the operand as a\n24-bit number, which should be plenty.</p>\n<p>Implement this function:</p>\n<div class=\"codehilite\"><pre><span class=\"t\">void</span> <span class=\"i\">writeConstant</span>(<span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>, <span class=\"t\">Value</span> <span class=\"i\">value</span>, <span class=\"t\">int</span> <span class=\"i\">line</span>) {\n  <span class=\"c\">// Implement me...</span>\n}\n</pre></div>\n<p>It adds <code>value</code> to <code>chunk</code>&rsquo;s constant array and then writes an appropriate\ninstruction to load the constant. 
Also add support to the disassembler for\n<code>OP_CONSTANT_LONG</code> instructions.</p>\n<p>Defining two instructions seems to be the best of both worlds. What\nsacrifices, if any, does it force on us?</p>\n</li>\n<li>\n<p>Our <code>reallocate()</code> function relies on the C standard library for dynamic\nmemory allocation and freeing. <code>malloc()</code> and <code>free()</code> aren&rsquo;t magic. Find\na couple of open source implementations of them and explain how they work.\nHow do they keep track of which bytes are allocated and which are free?\nWhat is required to allocate a block of memory? Free it? How do they make\nthat efficient? What do they do about fragmentation?</p>\n<p><em>Hardcore mode:</em> Implement <code>reallocate()</code> without calling <code>realloc()</code>,\n<code>malloc()</code>, or <code>free()</code>. You are allowed to call <code>malloc()</code> <em>once</em>, at the\nbeginning of the interpreter&rsquo;s execution, to allocate a single big block of\nmemory, which your <code>reallocate()</code> function has access to. It parcels out\nblobs of memory from that single region, your own personal heap. It&rsquo;s your\njob to define how it does that.</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: Test Your Language</a></h2>\n<p>We&rsquo;re almost halfway through the book and one thing we haven&rsquo;t talked about is\n<em>testing</em> your language implementation. That&rsquo;s not because testing isn&rsquo;t\nimportant. I can&rsquo;t possibly stress enough how vital it is to have a good,\ncomprehensive test suite for your language.</p>\n<p>I wrote a <a href=\"https://github.com/munificent/craftinginterpreters/tree/master/test\">test suite for Lox</a> (which you are welcome to use on your own\nLox implementation) before I wrote a single word of this book. 
Those tests found\ncountless bugs in my implementations.</p>\n<p>Tests are important in all software, but they&rsquo;re even more important for a\nprogramming language for at least a couple of reasons:</p>\n<ul>\n<li>\n<p><strong>Users expect their programming languages to be rock solid.</strong> We are so\nused to mature, stable compilers and interpreters that &ldquo;It&rsquo;s your code, not\nthe compiler&rdquo; is <a href=\"https://blog.codinghorror.com/the-first-rule-of-programming-its-always-your-fault/\">an ingrained part of software culture</a>. If there\nare bugs in your language implementation, users will go through the full\nfive stages of grief before they can figure out what&rsquo;s going on, and you\ndon&rsquo;t want to put them through all that.</p>\n</li>\n<li>\n<p><strong>A language implementation is a deeply interconnected piece of software.</strong>\nSome codebases are broad and shallow. If the file loading code is broken in\nyour text editor, it<span class=\"em\">&mdash;</span>hopefully!<span class=\"em\">&mdash;</span>won&rsquo;t cause failures in the text\nrendering on screen. Language implementations are narrower and deeper,\nespecially the core of the interpreter that handles the language&rsquo;s actual\nsemantics. That makes it easy for subtle bugs to creep in caused by weird\ninteractions between various parts of the system. It takes good tests to\nflush those out.</p>\n</li>\n<li>\n<p><strong>The input to a language implementation is, by design, combinatorial.</strong>\nThere are an infinite number of possible programs a user could write, and\nyour implementation needs to run them all correctly. 
You obviously can&rsquo;t\ntest that exhaustively, but you need to work hard to cover as much of the\ninput space as you can.</p>\n</li>\n<li>\n<p><strong>Language implementations are often complex, constantly changing, and full\nof optimizations.</strong> That leads to gnarly code with lots of dark corners\nwhere bugs can hide.</p>\n</li>\n</ul>\n<p>All of that means you&rsquo;re gonna want a lot of tests. But <em>what</em> tests? Projects\nI&rsquo;ve seen focus mostly on end-to-end &ldquo;language tests&rdquo;. Each test is a program\nwritten in the language along with the output or errors it is expected to\nproduce. Then you have a test runner that pushes the test program through your\nlanguage implementation and validates that it does what it&rsquo;s supposed to.\nWriting your tests in the language itself has a few nice advantages:</p>\n<ul>\n<li>\n<p>The tests aren&rsquo;t coupled to any particular API or internal architecture\ndecisions of the implementation. This frees you to reorganize or rewrite\nparts of your interpreter or compiler without needing to update a slew of\ntests.</p>\n</li>\n<li>\n<p>You can use the same tests for multiple implementations of the language.</p>\n</li>\n<li>\n<p>Tests can often be terse and easy to read and maintain since they are\nsimply scripts in your language.</p>\n</li>\n</ul>\n<p>It&rsquo;s not all rosy, though:</p>\n<ul>\n<li>\n<p>End-to-end tests help you determine <em>if</em> there is a bug, but not <em>where</em> the\nbug is. It can be harder to figure out where the erroneous code in the\nimplementation is because all the test tells you is that the right output\ndidn&rsquo;t appear.</p>\n</li>\n<li>\n<p>It can be a chore to craft a valid program that tickles some obscure corner\nof the implementation. 
This is particularly true for highly optimized\ncompilers where you may need to write convoluted code to ensure that you\nend up on just the right optimization path where a bug may be hiding.</p>\n</li>\n<li>\n<p>The overhead can be high to fire up the interpreter, parse, compile, and\nrun each test script. With a big suite of tests<span class=\"em\">&mdash;</span>which you <em>do</em> want,\nremember<span class=\"em\">&mdash;</span>that can mean a lot of time spent waiting for the tests to\nfinish running.</p>\n</li>\n</ul>\n<p>I could go on, but I don&rsquo;t want this to turn into a sermon. Also, I don&rsquo;t\npretend to be an expert on <em>how</em> to test languages. I just want you to\ninternalize how important it is <em>that</em> you test yours. Seriously. Test your\nlanguage. You&rsquo;ll thank me for it.</p>\n</div>\n\n<footer>\n<a href=\"a-virtual-machine.html\" class=\"next\">\n  Next Chapter: &ldquo;A Virtual Machine&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/classes-and-instances.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Classes and Instances &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Classes and Instances<small>27</small></a></h3>\n\n<ul>\n    <li><a href=\"#class-objects\"><small>27.1</small> Class Objects</a></li>\n    <li><a href=\"#class-declarations\"><small>27.2</small> Class Declarations</a></li>\n    <li><a href=\"#instances-of-classes\"><small>27.3</small> Instances of Classes</a></li>\n    <li><a href=\"#get-and-set-expressions\"><small>27.4</small> Get and Set Expressions</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a 
href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"garbage-collection.html\" title=\"Garbage Collection\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"methods-and-initializers.html\" title=\"Methods and Initializers\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"garbage-collection.html\" title=\"Garbage Collection\" class=\"prev\">←</a>\n<a href=\"methods-and-initializers.html\" title=\"Methods and Initializers\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Classes and Instances<small>27</small></a></h3>\n\n<ul>\n    <li><a href=\"#class-objects\"><small>27.1</small> Class Objects</a></li>\n    <li><a href=\"#class-declarations\"><small>27.2</small> Class Declarations</a></li>\n    <li><a href=\"#instances-of-classes\"><small>27.3</small> Instances of Classes</a></li>\n    <li><a href=\"#get-and-set-expressions\"><small>27.4</small> Get and Set Expressions</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"garbage-collection.html\" title=\"Garbage Collection\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"methods-and-initializers.html\" title=\"Methods and Initializers\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">27</div>\n  <h1>Classes and 
Instances</h1>\n\n<blockquote>\n<p>Caring too much for objects can destroy you. Only<span class=\"em\">&mdash;</span>if you care for a thing\nenough, it takes on a life of its own, doesn&rsquo;t it? And isn&rsquo;t the whole point\nof things<span class=\"em\">&mdash;</span>beautiful things<span class=\"em\">&mdash;</span>that they connect you to some larger beauty?</p>\n<p><cite>Donna Tartt, <em>The Goldfinch</em></cite></p>\n</blockquote>\n<p>The last area left to implement in clox is object-oriented programming. <span\nname=\"oop\">OOP</span> is a bundle of intertwined features: classes, instances,\nfields, methods, initializers, and inheritance. Using relatively high-level\nJava, we packed all that into two chapters. Now that we&rsquo;re coding in C, which\nfeels like building a model of the Eiffel Tower out of toothpicks, we&rsquo;ll devote\nthree chapters to covering the same territory. This makes for a leisurely stroll\nthrough the implementation. After strenuous chapters like <a href=\"closures.html\">closures</a> and the\n<a href=\"garbage-collection.html\">garbage collector</a>, you have earned a rest. In fact, the book should be easy\nfrom here on out.</p>\n<aside name=\"oop\">\n<p>People who have strong opinions about object-oriented programming<span class=\"em\">&mdash;</span>read\n&ldquo;everyone&rdquo;<span class=\"em\">&mdash;</span>tend to assume OOP means some very specific list of language\nfeatures, but really there&rsquo;s a whole space to explore, and each language has its\nown ingredients and recipes.</p>\n<p>Self has objects but no classes. CLOS has methods but doesn&rsquo;t attach them to\nspecific classes. C++ initially had no runtime polymorphism<span class=\"em\">&mdash;</span>no virtual\nmethods. Python has multiple inheritance, but Java does not. Ruby attaches\nmethods to classes, but you can also define methods on a single object.</p>\n</aside>\n<p>In this chapter, we cover the first three features: classes, instances, and\nfields. 
This is the stateful side of object orientation. Then in the next two\nchapters, we will hang behavior and code reuse off of those objects.</p>\n<h2><a href=\"#class-objects\" id=\"class-objects\"><small>27&#8202;.&#8202;1</small>Class Objects</a></h2>\n<p>In a class-based object-oriented language, everything begins with classes. They\ndefine what sorts of objects exist in the program and are the factories used to\nproduce new instances. Going bottom-up, we&rsquo;ll start with their runtime\nrepresentation and then hook that into the language.</p>\n<p>By this point, we&rsquo;re well-acquainted with the process of adding a new object\ntype to the VM. We start with a struct.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} ObjClosure;\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>ObjClosure</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">Obj</span> <span class=\"i\">obj</span>;\n  <span class=\"t\">ObjString</span>* <span class=\"i\">name</span>;\n} <span class=\"t\">ObjClass</span>;\n</pre><pre class=\"insert-after\">\n\nObjClosure* newClosure(ObjFunction* function);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>ObjClosure</em></div>\n\n<p>After the Obj header, we store the class&rsquo;s name. 
This isn&rsquo;t strictly needed for\nthe user&rsquo;s program, but it lets us show the name at runtime for things like\nstack traces.</p>\n<p>The new type needs a corresponding case in the ObjType enum.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef enum {\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin enum <em>ObjType</em></div>\n<pre class=\"insert\">  <span class=\"a\">OBJ_CLASS</span>,\n</pre><pre class=\"insert-after\">  OBJ_CLOSURE,\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in enum <em>ObjType</em></div>\n\n<p>And that type gets a corresponding pair of macros. First, for testing an\nobject&rsquo;s type:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define OBJ_TYPE(value)        (AS_OBJ(value)-&gt;type)\n\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define IS_CLASS(value)        isObjType(value, OBJ_CLASS)</span>\n</pre><pre class=\"insert-after\">#define IS_CLOSURE(value)      isObjType(value, OBJ_CLOSURE)\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>And then for casting a Value to an ObjClass pointer:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define IS_STRING(value)       isObjType(value, OBJ_STRING)\n\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define AS_CLASS(value)        ((ObjClass*)AS_OBJ(value))</span>\n</pre><pre class=\"insert-after\">#define AS_CLOSURE(value)      ((ObjClosure*)AS_OBJ(value))\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>The VM creates new class objects using this function:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} ObjClass;\n\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>ObjClass</em></div>\n<pre class=\"insert\"><span class=\"t\">ObjClass</span>* <span class=\"i\">newClass</span>(<span 
class=\"t\">ObjString</span>* <span class=\"i\">name</span>);\n</pre><pre class=\"insert-after\">ObjClosure* newClosure(ObjFunction* function);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>ObjClass</em></div>\n\n<p>The implementation lives over here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.c</em><br>\nadd after <em>allocateObject</em>()</div>\n<pre><span class=\"t\">ObjClass</span>* <span class=\"i\">newClass</span>(<span class=\"t\">ObjString</span>* <span class=\"i\">name</span>) {\n  <span class=\"t\">ObjClass</span>* <span class=\"i\">klass</span> = <span class=\"a\">ALLOCATE_OBJ</span>(<span class=\"t\">ObjClass</span>, <span class=\"a\">OBJ_CLASS</span>);\n  <span class=\"i\">klass</span>-&gt;<span class=\"i\">name</span> = <span class=\"i\">name</span>;<span name=\"klass\"> </span>\n  <span class=\"k\">return</span> <span class=\"i\">klass</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, add after <em>allocateObject</em>()</div>\n\n<p>Pretty much all boilerplate. It takes in the class&rsquo;s name as a string and stores\nit. Every time the user declares a new class, the VM will create a new one of\nthese ObjClass structs to represent it.</p>\n<aside name=\"klass\"><img src=\"image/classes-and-instances/klass.png\" alt=\"'Klass' in a zany kidz font.\"/>\n<p>I named the variable &ldquo;klass&rdquo; not just to give the VM a zany preschool &ldquo;Kidz\nKorner&rdquo; feel. 
It makes it easier to get clox compiling as C++ where &ldquo;class&rdquo; is\na reserved word.</p>\n</aside>\n<p>When the VM no longer needs a class, it frees it like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (object-&gt;type) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>freeObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_CLASS</span>: {\n      <span class=\"a\">FREE</span>(<span class=\"t\">ObjClass</span>, <span class=\"i\">object</span>);\n      <span class=\"k\">break</span>;\n    }<span name=\"braces\"> </span>\n</pre><pre class=\"insert-after\">    case OBJ_CLOSURE: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>freeObject</em>()</div>\n\n<aside name=\"braces\">\n<p>The braces here are pointless now, but will be useful in the next chapter when\nwe add some more code to the switch case.</p>\n</aside>\n<p>We have a memory manager now, so we also need to support tracing through class\nobjects.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (object-&gt;type) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>blackenObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_CLASS</span>: {\n      <span class=\"t\">ObjClass</span>* <span class=\"i\">klass</span> = (<span class=\"t\">ObjClass</span>*)<span class=\"i\">object</span>;\n      <span class=\"i\">markObject</span>((<span class=\"t\">Obj</span>*)<span class=\"i\">klass</span>-&gt;<span class=\"i\">name</span>);\n      <span class=\"k\">break</span>;\n    }\n</pre><pre class=\"insert-after\">    case OBJ_CLOSURE: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>blackenObject</em>()</div>\n\n<p>When the GC reaches a class object, it marks the class&rsquo;s name to keep that\nstring alive too.</p>\n<p>The last operation the VM can perform on a class is printing it.</p>\n<div 
class=\"codehilite\"><pre class=\"insert-before\">  switch (OBJ_TYPE(value)) {\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>printObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_CLASS</span>:\n      <span class=\"i\">printf</span>(<span class=\"s\">&quot;%s&quot;</span>, <span class=\"a\">AS_CLASS</span>(<span class=\"i\">value</span>)-&gt;<span class=\"i\">name</span>-&gt;<span class=\"i\">chars</span>);\n      <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    case OBJ_CLOSURE:\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>printObject</em>()</div>\n\n<p>A class simply says its own name.</p>\n<h2><a href=\"#class-declarations\" id=\"class-declarations\"><small>27&#8202;.&#8202;2</small>Class Declarations</a></h2>\n<p>Runtime representation in hand, we are ready to add support for classes to the\nlanguage. Next, we move into the parser.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void declaration() {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>declaration</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_CLASS</span>)) {\n    <span class=\"i\">classDeclaration</span>();\n  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_FUN</span>)) {\n</pre><pre class=\"insert-after\">    funDeclaration();\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>declaration</em>(), replace 1 line</div>\n\n<p>Class declarations are statements, and the parser recognizes one by the leading\n<code>class</code> keyword. 
The rest of the compilation happens over here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>function</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">classDeclaration</span>() {\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_IDENTIFIER</span>, <span class=\"s\">&quot;Expect class name.&quot;</span>);\n  <span class=\"t\">uint8_t</span> <span class=\"i\">nameConstant</span> = <span class=\"i\">identifierConstant</span>(&amp;<span class=\"i\">parser</span>.<span class=\"i\">previous</span>);\n  <span class=\"i\">declareVariable</span>();\n\n  <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_CLASS</span>, <span class=\"i\">nameConstant</span>);\n  <span class=\"i\">defineVariable</span>(<span class=\"i\">nameConstant</span>);\n\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_LEFT_BRACE</span>, <span class=\"s\">&quot;Expect &#39;{&#39; before class body.&quot;</span>);\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_RIGHT_BRACE</span>, <span class=\"s\">&quot;Expect &#39;}&#39; after class body.&quot;</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>function</em>()</div>\n\n<p>Immediately after the <code>class</code> keyword is the class&rsquo;s name. We take that\nidentifier and add it to the surrounding function&rsquo;s constant table as a string.\nAs you just saw, printing a class shows its name, so the compiler needs to stuff\nthe name string somewhere that the runtime can find. The constant table is the\nway to do that.</p>\n<p>The class&rsquo;s <span name=\"variable\">name</span> is also used to bind the class\nobject to a variable of the same name. 
So we declare a variable with that\nidentifier right after consuming its token.</p>\n<aside name=\"variable\">\n<p>We could have made class declarations be <em>expressions</em> instead of statements<span class=\"em\">&mdash;</span>they are essentially a literal that produces a value after all. Then users would\nhave to explicitly bind the class to a variable themselves like:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"t\">Pie</span> = <span class=\"k\">class</span> {}\n</pre></div>\n<p>Sort of like lambda functions but for classes. But since we generally want\nclasses to be named anyway, it makes sense to treat them as declarations.</p>\n</aside>\n<p>Next, we emit a new instruction to actually create the class object at runtime.\nThat instruction takes the constant table index of the class&rsquo;s name as an\noperand.</p>\n<p>After that, but before compiling the body of the class, we define the variable\nfor the class&rsquo;s name. <em>Declaring</em> the variable adds it to the scope, but recall\nfrom <a href=\"local-variables.html#another-scope-edge-case\">a previous chapter</a> that we can&rsquo;t <em>use</em> the variable until it&rsquo;s\n<em>defined</em>. For classes, we define the variable before the body. That way, users\ncan refer to the containing class inside the bodies of its own methods. That&rsquo;s\nuseful for things like factory methods that produce new instances of the class.</p>\n<p>Finally, we compile the body. We don&rsquo;t have methods yet, so right now it&rsquo;s\nsimply an empty pair of braces. 
Lox doesn&rsquo;t require fields to be declared in the\nclass, so we&rsquo;re done with the body<span class=\"em\">&mdash;</span>and the parser<span class=\"em\">&mdash;</span>for now.</p>\n<p>The compiler is emitting a new instruction, so let&rsquo;s define that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_RETURN,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_CLASS</span>,\n</pre><pre class=\"insert-after\">} OpCode;\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>And add it to the disassembler:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case OP_RETURN:\n      return simpleInstruction(&quot;OP_RETURN&quot;, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_CLASS</span>:\n      <span class=\"k\">return</span> <span class=\"i\">constantInstruction</span>(<span class=\"s\">&quot;OP_CLASS&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    default:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>For such a large-seeming feature, the interpreter support is minimal.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        break;\n      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_CLASS</span>:\n        <span class=\"i\">push</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">newClass</span>(<span class=\"a\">READ_STRING</span>())));\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    }\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in 
<em>run</em>()</div>\n\n<p>We load the string for the class&rsquo;s name from the constant table and pass that to\n<code>newClass()</code>. That creates a new class object with the given name. We push that\nonto the stack and we&rsquo;re good. If the class is bound to a global variable, then\nthe compiler&rsquo;s call to <code>defineVariable()</code> will emit code to store that object\nfrom the stack into the global variable table. Otherwise, it&rsquo;s right where it\nneeds to be on the stack for a new <span name=\"local\">local</span> variable.</p>\n<aside name=\"local\">\n<p>&ldquo;Local&rdquo; classes, classes declared inside the body of a function or block, are\nan unusual concept. Many languages don&rsquo;t allow them at all. But since Lox is a\ndynamically typed scripting language, it treats the top level of a program and\nthe bodies of functions and blocks uniformly. Classes are just another kind of\ndeclaration, and since you can declare variables and functions inside blocks,\nyou can declare classes in there too.</p>\n</aside>\n<p>There you have it, our VM supports classes now. 
You can run this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Brioche</span> {}\n<span class=\"k\">print</span> <span class=\"t\">Brioche</span>;\n</pre></div>\n<p>Unfortunately, printing is about <em>all</em> you can do with classes, so next is\nmaking them more useful.</p>\n<h2><a href=\"#instances-of-classes\" id=\"instances-of-classes\"><small>27&#8202;.&#8202;3</small>Instances of Classes</a></h2>\n<p>Classes serve two main purposes in a language:</p>\n<ul>\n<li>\n<p><strong>They are how you create new instances.</strong> Sometimes this involves a <code>new</code>\nkeyword, other times it&rsquo;s a method call on the class object, but you usually\nmention the class by name <em>somehow</em> to get a new instance.</p>\n</li>\n<li>\n<p><strong>They contain methods.</strong> These define how all instances of the class\nbehave.</p>\n</li>\n</ul>\n<p>We won&rsquo;t get to methods until the next chapter, so for now we will only worry\nabout the first part. Before classes can create instances, we need a\nrepresentation for them.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} ObjClass;\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>ObjClass</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">Obj</span> <span class=\"i\">obj</span>;\n  <span class=\"t\">ObjClass</span>* <span class=\"i\">klass</span>;\n  <span class=\"t\">Table</span> <span class=\"i\">fields</span>;<span name=\"fields\"> </span>\n} <span class=\"t\">ObjInstance</span>;\n</pre><pre class=\"insert-after\">\n\nObjClass* newClass(ObjString* name);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>ObjClass</em></div>\n\n<p>Instances know their class<span class=\"em\">&mdash;</span>each instance has a pointer to the class that it\nis an instance of.  
We won&rsquo;t use this much in this chapter, but it will become\ncritical when we add methods.</p>\n<p>More important to this chapter is how instances store their state. Lox lets\nusers freely add fields to an instance at runtime. This means we need a storage\nmechanism that can grow. We could use a dynamic array, but we also want to look\nup fields by name as quickly as possible. There&rsquo;s a data structure that&rsquo;s just\nperfect for quickly accessing a set of values by name and<span class=\"em\">&mdash;</span>even more conveniently<span class=\"em\">&mdash;</span>we&rsquo;ve already implemented it. Each instance stores\nits fields using a hash table.</p>\n<aside name=\"fields\">\n<p>Being able to freely add fields to an object at runtime is a big practical\ndifference between most dynamic and static languages. Statically typed languages\nusually require fields to be explicitly declared. This way, the compiler knows\nexactly what fields each instance has. It can use that to determine the precise\namount of memory needed for each instance and the offsets in that memory where\neach field can be found.</p>\n<p>In Lox and other dynamic languages, accessing a field is usually a hash table\nlookup. Constant time, but still pretty heavyweight. 
In a language like C++,\naccessing a field is as fast as offsetting a pointer by an integer constant.</p>\n</aside>\n<p>We only need to add an include, and we&rsquo;ve got it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;chunk.h&quot;\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;table.h&quot;</span>\n</pre><pre class=\"insert-after\">#include &quot;value.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>This new struct gets a new object type.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OBJ_FUNCTION,\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin enum <em>ObjType</em></div>\n<pre class=\"insert\">  <span class=\"a\">OBJ_INSTANCE</span>,\n</pre><pre class=\"insert-after\">  OBJ_NATIVE,\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in enum <em>ObjType</em></div>\n\n<p>I want to slow down a bit here because the Lox <em>language&rsquo;s</em> notion of &ldquo;type&rdquo; and\nthe VM <em>implementation&rsquo;s</em> notion of &ldquo;type&rdquo; brush against each other in ways that\ncan be confusing. Inside the C code that makes clox, there are a number of\ndifferent types of Obj<span class=\"em\">&mdash;</span>ObjString, ObjClosure, etc. Each has its own internal\nrepresentation and semantics.</p>\n<p>In the Lox <em>language</em>, users can define their own classes<span class=\"em\">&mdash;</span>say Cake and Pie<span class=\"em\">&mdash;</span>and then create instances of those classes. From the user&rsquo;s perspective, an\ninstance of Cake is a different type of object than an instance of Pie. But,\nfrom the VM&rsquo;s perspective, every class the user defines is simply another value\nof type ObjClass. Likewise, each instance in the user&rsquo;s program, no matter what\nclass it is an instance of, is an ObjInstance. That one VM object type covers\ninstances of all classes. 
The two worlds map to each other something like this:</p><img src=\"image/classes-and-instances/lox-clox.png\" alt=\"A set of class declarations and instances, and the runtime representations each maps to.\"/>\n<p>Got it? OK, back to the implementation. We also get our usual macros.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define IS_FUNCTION(value)     isObjType(value, OBJ_FUNCTION)\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define IS_INSTANCE(value)     isObjType(value, OBJ_INSTANCE)</span>\n</pre><pre class=\"insert-after\">#define IS_NATIVE(value)       isObjType(value, OBJ_NATIVE)\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>And:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define AS_FUNCTION(value)     ((ObjFunction*)AS_OBJ(value))\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define AS_INSTANCE(value)     ((ObjInstance*)AS_OBJ(value))</span>\n</pre><pre class=\"insert-after\">#define AS_NATIVE(value) \\\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>Since fields are added after the instance is created, the &ldquo;constructor&rdquo; function\nonly needs to know the class.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">ObjFunction* newFunction();\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after <em>newFunction</em>()</div>\n<pre class=\"insert\"><span class=\"t\">ObjInstance</span>* <span class=\"i\">newInstance</span>(<span class=\"t\">ObjClass</span>* <span class=\"i\">klass</span>);\n</pre><pre class=\"insert-after\">ObjNative* newNative(NativeFn function);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after <em>newFunction</em>()</div>\n\n<p>We implement that function here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.c</em><br>\nadd after 
<em>newFunction</em>()</div>\n<pre><span class=\"t\">ObjInstance</span>* <span class=\"i\">newInstance</span>(<span class=\"t\">ObjClass</span>* <span class=\"i\">klass</span>) {\n  <span class=\"t\">ObjInstance</span>* <span class=\"i\">instance</span> = <span class=\"a\">ALLOCATE_OBJ</span>(<span class=\"t\">ObjInstance</span>, <span class=\"a\">OBJ_INSTANCE</span>);\n  <span class=\"i\">instance</span>-&gt;<span class=\"i\">klass</span> = <span class=\"i\">klass</span>;\n  <span class=\"i\">initTable</span>(&amp;<span class=\"i\">instance</span>-&gt;<span class=\"i\">fields</span>);\n  <span class=\"k\">return</span> <span class=\"i\">instance</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, add after <em>newFunction</em>()</div>\n\n<p>We store a reference to the instance&rsquo;s class. Then we initialize the field\ntable to an empty hash table. A new baby object is born!</p>\n<p>At the sadder end of the instance&rsquo;s lifespan, it gets freed.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      FREE(ObjFunction, object);\n      break;\n    }\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>freeObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_INSTANCE</span>: {\n      <span class=\"t\">ObjInstance</span>* <span class=\"i\">instance</span> = (<span class=\"t\">ObjInstance</span>*)<span class=\"i\">object</span>;\n      <span class=\"i\">freeTable</span>(&amp;<span class=\"i\">instance</span>-&gt;<span class=\"i\">fields</span>);\n      <span class=\"a\">FREE</span>(<span class=\"t\">ObjInstance</span>, <span class=\"i\">object</span>);\n      <span class=\"k\">break</span>;\n    }\n</pre><pre class=\"insert-after\">    case OBJ_NATIVE:\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>freeObject</em>()</div>\n\n<p>The instance owns its field table so when freeing the instance, we also free the\ntable. 
We don&rsquo;t explicitly free the entries <em>in</em> the table, because there may\nbe other references to those objects. The garbage collector will take care of\nthose for us. Here we free only the entry array of the table itself.</p>\n<p>Speaking of the garbage collector, it needs support for tracing through\ninstances.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      markArray(&amp;function-&gt;chunk.constants);\n      break;\n    }\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>blackenObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_INSTANCE</span>: {\n      <span class=\"t\">ObjInstance</span>* <span class=\"i\">instance</span> = (<span class=\"t\">ObjInstance</span>*)<span class=\"i\">object</span>;\n      <span class=\"i\">markObject</span>((<span class=\"t\">Obj</span>*)<span class=\"i\">instance</span>-&gt;<span class=\"i\">klass</span>);\n      <span class=\"i\">markTable</span>(&amp;<span class=\"i\">instance</span>-&gt;<span class=\"i\">fields</span>);\n      <span class=\"k\">break</span>;\n    }\n</pre><pre class=\"insert-after\">    case OBJ_UPVALUE:\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>blackenObject</em>()</div>\n\n<p>If the instance is alive, we need to keep its class around. Also, we need to\nkeep every object referenced by the instance&rsquo;s fields. Most live objects that\nare not roots are reachable because some instance refers to the object in a\nfield. 
Fortunately, we already have a nice <code>markTable()</code> function to make\ntracing them easy.</p>\n<p>Less critical but still important is printing.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      break;\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>printObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_INSTANCE</span>:\n      <span class=\"i\">printf</span>(<span class=\"s\">&quot;%s instance&quot;</span>,\n             <span class=\"a\">AS_INSTANCE</span>(<span class=\"i\">value</span>)-&gt;<span class=\"i\">klass</span>-&gt;<span class=\"i\">name</span>-&gt;<span class=\"i\">chars</span>);\n      <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    case OBJ_NATIVE:\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>printObject</em>()</div>\n\n<p><span name=\"print\">An</span> instance prints its class&rsquo;s name followed by &ldquo;instance&rdquo;.\n(The &ldquo;instance&rdquo; part is mainly so that classes and instances don&rsquo;t print the\nsame.)</p>\n<aside name=\"print\">\n<p>Most object-oriented languages let a class define some sort of <code>toString()</code>\nmethod that lets the class specify how its instances are converted to a string\nand printed. If Lox were less of a toy language, I would want to support that\ntoo.</p>\n</aside>\n<p>The real fun happens over in the interpreter. Lox has no special <code>new</code> keyword.\nThe way to create an instance of a class is to invoke the class itself as if it\nwere a function. 
The runtime already supports function calls, and it checks the\ntype of object being called to make sure the user doesn&rsquo;t try to invoke a number\nor other invalid type.</p>\n<p>We extend that runtime checking with a new case.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    switch (OBJ_TYPE(callee)) {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>callValue</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OBJ_CLASS</span>: {\n        <span class=\"t\">ObjClass</span>* <span class=\"i\">klass</span> = <span class=\"a\">AS_CLASS</span>(<span class=\"i\">callee</span>);\n        <span class=\"i\">vm</span>.<span class=\"i\">stackTop</span>[-<span class=\"i\">argCount</span> - <span class=\"n\">1</span>] = <span class=\"a\">OBJ_VAL</span>(<span class=\"i\">newInstance</span>(<span class=\"i\">klass</span>));\n        <span class=\"k\">return</span> <span class=\"k\">true</span>;\n      }\n</pre><pre class=\"insert-after\">      case OBJ_CLOSURE:\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>callValue</em>()</div>\n\n<p>If the value being called<span class=\"em\">&mdash;</span>the object that results when evaluating the\nexpression to the left of the opening parenthesis<span class=\"em\">&mdash;</span>is a class, then we treat\nit as a constructor call. We <span name=\"args\">create</span> a new instance of\nthe called class and store the result on the stack.</p>\n<aside name=\"args\">\n<p>We ignore any arguments passed to the call for now. We&rsquo;ll revisit this code in\nthe <a href=\"methods-and-initializers.html\">next chapter</a> when we add support for initializers.</p>\n</aside>\n<p>We&rsquo;re one step farther. 
Now we can define classes and create instances of them.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Brioche</span> {}\n<span class=\"k\">print</span> <span class=\"t\">Brioche</span>();\n</pre></div>\n<p>Note the parentheses after <code>Brioche</code> on the second line now. This prints\n&ldquo;Brioche instance&rdquo;.</p>\n<h2><a href=\"#get-and-set-expressions\" id=\"get-and-set-expressions\"><small>27&#8202;.&#8202;4</small>Get and Set Expressions</a></h2>\n<p>Our object representation for instances can already store state, so all that\nremains is exposing that functionality to the user. Fields are accessed and\nmodified using get and set expressions. Not one to break with tradition, Lox\nuses the classic &ldquo;dot&rdquo; syntax:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">eclair</span>.<span class=\"i\">filling</span> = <span class=\"s\">&quot;pastry creme&quot;</span>;\n<span class=\"k\">print</span> <span class=\"i\">eclair</span>.<span class=\"i\">filling</span>;\n</pre></div>\n<p>The period<span class=\"em\">&mdash;</span>full stop for my English friends<span class=\"em\">&mdash;</span>works <span\nname=\"sort\">sort</span> of like an infix operator. There is an expression to the\nleft that is evaluated first and produces an instance. After that is the <code>.</code>\nfollowed by a field name. Since there is a preceding operand, we hook this into\nthe parse table as an infix expression.</p>\n<aside name=\"sort\">\n<p>I say &ldquo;sort of&rdquo; because the right-hand side after the <code>.</code> is not an expression,\nbut a single identifier whose semantics are handled by the get or set expression\nitself. 
It&rsquo;s really closer to a postfix expression.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">  [TOKEN_COMMA]         = {NULL,     NULL,   PREC_NONE},\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_DOT</span>]           = {<span class=\"a\">NULL</span>,     <span class=\"i\">dot</span>,    <span class=\"a\">PREC_CALL</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_MINUS]         = {unary,    binary, PREC_TERM},\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, replace 1 line</div>\n\n<p>As in other languages, the <code>.</code> operator binds tightly, with precedence as high\nas the parentheses in a function call. After the parser consumes the dot token,\nit dispatches to a new parse function.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>call</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">dot</span>(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_IDENTIFIER</span>, <span class=\"s\">&quot;Expect property name after &#39;.&#39;.&quot;</span>);\n  <span class=\"t\">uint8_t</span> <span class=\"i\">name</span> = <span class=\"i\">identifierConstant</span>(&amp;<span class=\"i\">parser</span>.<span class=\"i\">previous</span>);\n\n  <span class=\"k\">if</span> (<span class=\"i\">canAssign</span> &amp;&amp; <span class=\"i\">match</span>(<span class=\"a\">TOKEN_EQUAL</span>)) {\n    <span class=\"i\">expression</span>();\n    <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_SET_PROPERTY</span>, <span class=\"i\">name</span>);\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_GET_PROPERTY</span>, <span class=\"i\">name</span>);\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, 
add after <em>call</em>()</div>\n\n<p>The parser expects to find a <span name=\"prop\">property</span> name immediately\nafter the dot. We load that token&rsquo;s lexeme into the constant table as a string\nso that the name is available at runtime.</p>\n<aside name=\"prop\">\n<p>The compiler uses &ldquo;property&rdquo; instead of &ldquo;field&rdquo; here because, remember, Lox also\nlets you use dot syntax to access a method without calling it. &ldquo;Property&rdquo; is the\ngeneral term we use to refer to any named entity you can access on an instance.\nFields are the subset of properties that are backed by the instance&rsquo;s state.</p>\n</aside>\n<p>We have two new expression forms<span class=\"em\">&mdash;</span>getters and setters<span class=\"em\">&mdash;</span>that this one\nfunction handles. If we see an equals sign after the field name, it must be a\nset expression that is assigning to a field. But we don&rsquo;t <em>always</em> allow an\nequals sign after the field to be compiled. Consider:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">a</span> + <span class=\"i\">b</span>.<span class=\"i\">c</span> = <span class=\"n\">3</span>\n</pre></div>\n<p>This is syntactically invalid according to Lox&rsquo;s grammar, which means our Lox\nimplementation is obligated to detect and report the error. If <code>dot()</code> silently\nparsed the <code>= 3</code> part, we would incorrectly interpret the code as if the user\nhad written:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">a</span> + (<span class=\"i\">b</span>.<span class=\"i\">c</span> = <span class=\"n\">3</span>)\n</pre></div>\n<p>The problem is that the <code>=</code> side of a set expression has much lower precedence\nthan the <code>.</code> part. The parser may call <code>dot()</code> in a context that is too high\nprecedence to permit a setter to appear. To avoid incorrectly allowing that, we\nparse and compile the equals part only when <code>canAssign</code> is true. 
If an equals\ntoken appears when <code>canAssign</code> is false, <code>dot()</code> leaves it alone and returns. In\nthat case, the compiler will eventually unwind up to <code>parsePrecedence()</code>, which\nstops at the unexpected <code>=</code> still sitting as the next token and reports an\nerror.</p>\n<p>If we find an <code>=</code> in a context where it <em>is</em> allowed, then we compile the\nexpression that follows. After that, we emit a new <span\nname=\"set\"><code>OP_SET_PROPERTY</code></span> instruction. That takes a single operand for\nthe index of the property name in the constant table. If we didn&rsquo;t compile a set\nexpression, we assume it&rsquo;s a getter and emit an <code>OP_GET_PROPERTY</code> instruction,\nwhich also takes an operand for the property name.</p>\n<aside name=\"set\">\n<p>You can&rsquo;t <em>set</em> a non-field property, so I suppose that instruction could have\nbeen <code>OP_SET_FIELD</code>, but I thought it looked nicer to be consistent with the get\ninstruction.</p>\n</aside>\n<p>Now is a good time to define these two new instructions.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_SET_UPVALUE,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_GET_PROPERTY</span>,\n  <span class=\"a\">OP_SET_PROPERTY</span>,\n</pre><pre class=\"insert-after\">  OP_EQUAL,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>And add support for disassembling them:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return byteInstruction(&quot;OP_SET_UPVALUE&quot;, chunk, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_GET_PROPERTY</span>:\n      <span class=\"k\">return</span> <span class=\"i\">constantInstruction</span>(<span 
class=\"s\">&quot;OP_GET_PROPERTY&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n    <span class=\"k\">case</span> <span class=\"a\">OP_SET_PROPERTY</span>:\n      <span class=\"k\">return</span> <span class=\"i\">constantInstruction</span>(<span class=\"s\">&quot;OP_SET_PROPERTY&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_EQUAL:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<h3><a href=\"#interpreting-getter-and-setter-expressions\" id=\"interpreting-getter-and-setter-expressions\"><small>27&#8202;.&#8202;4&#8202;.&#8202;1</small>Interpreting getter and setter expressions</a></h3>\n<p>Sliding over to the runtime, we&rsquo;ll start with get expressions since those are a\nlittle simpler.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_GET_PROPERTY</span>: {\n        <span class=\"t\">ObjInstance</span>* <span class=\"i\">instance</span> = <span class=\"a\">AS_INSTANCE</span>(<span class=\"i\">peek</span>(<span class=\"n\">0</span>));\n        <span class=\"t\">ObjString</span>* <span class=\"i\">name</span> = <span class=\"a\">READ_STRING</span>();\n\n        <span class=\"t\">Value</span> <span class=\"i\">value</span>;\n        <span class=\"k\">if</span> (<span class=\"i\">tableGet</span>(&amp;<span class=\"i\">instance</span>-&gt;<span class=\"i\">fields</span>, <span class=\"i\">name</span>, &amp;<span class=\"i\">value</span>)) {\n          <span class=\"i\">pop</span>(); <span class=\"c\">// Instance.</span>\n          <span class=\"i\">push</span>(<span class=\"i\">value</span>);\n          <span class=\"k\">break</span>;\n        }\n      }\n</pre><pre class=\"insert-after\">      case OP_EQUAL: 
{\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>When the interpreter reaches this instruction, the expression to the left of the\ndot has already been executed and the resulting instance is on top of the stack.\nWe read the field name from the constant pool and look it up in the instance&rsquo;s\nfield table. If the hash table contains an entry with that name, we pop the\ninstance and push the entry&rsquo;s value as the result.</p>\n<p>Of course, the field might not exist. In Lox, we&rsquo;ve defined that to be a runtime\nerror. So we add a check for that and abort if it happens.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">          push(value);\n          break;\n        }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">\n\n        <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Undefined property &#39;%s&#39;.&quot;</span>, <span class=\"i\">name</span>-&gt;<span class=\"i\">chars</span>);\n        <span class=\"k\">return</span> <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>;\n</pre><pre class=\"insert-after\">      }\n      case OP_EQUAL: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p><span name=\"field\">There</span> is another failure mode to handle which you&rsquo;ve\nprobably noticed. The above code assumes the expression to the left of the dot\ndid evaluate to an ObjInstance. 
But there&rsquo;s nothing preventing a user from\nwriting this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">obj</span> = <span class=\"s\">&quot;not an instance&quot;</span>;\n<span class=\"k\">print</span> <span class=\"i\">obj</span>.<span class=\"i\">field</span>;\n</pre></div>\n<p>The user&rsquo;s program is wrong, but the VM still has to handle it with some grace.\nRight now, it will misinterpret the bits of the ObjString as an ObjInstance and,\nI don&rsquo;t know, catch on fire or something definitely not graceful.</p>\n<p>In Lox, only instances are allowed to have fields. You can&rsquo;t stuff a field onto\na string or number. So we need to check that the value is an instance before\naccessing any fields on it.</p>\n<aside name=\"field\">\n<p>Lox <em>could</em> support adding fields to values of other types. It&rsquo;s our language\nand we can do what we want. But it&rsquo;s likely a bad idea. It significantly\ncomplicates the implementation in ways that hurt performance<span class=\"em\">&mdash;</span>for example,\nstring interning gets a lot harder.</p>\n<p>Also, it raises gnarly semantic questions around the equality and identity of\nvalues. If I attach a field to the number <code>3</code>, does the result of <code>1 + 2</code> have\nthat field as well? If so, how does the implementation track that? 
If not, are\nthose two resulting &ldquo;threes&rdquo; still considered equal?</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_GET_PROPERTY: {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">        <span class=\"k\">if</span> (!<span class=\"a\">IS_INSTANCE</span>(<span class=\"i\">peek</span>(<span class=\"n\">0</span>))) {\n          <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Only instances have properties.&quot;</span>);\n          <span class=\"k\">return</span> <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>;\n        }\n\n</pre><pre class=\"insert-after\">        ObjInstance* instance = AS_INSTANCE(peek(0));\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>If the value on the stack isn&rsquo;t an instance, we report a runtime error and\nsafely exit.</p>\n<p>Of course, get expressions are not very useful when no instances have any\nfields. 
For that we need setters.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        return INTERPRET_RUNTIME_ERROR;\n      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_SET_PROPERTY</span>: {\n        <span class=\"t\">ObjInstance</span>* <span class=\"i\">instance</span> = <span class=\"a\">AS_INSTANCE</span>(<span class=\"i\">peek</span>(<span class=\"n\">1</span>));\n        <span class=\"i\">tableSet</span>(&amp;<span class=\"i\">instance</span>-&gt;<span class=\"i\">fields</span>, <span class=\"a\">READ_STRING</span>(), <span class=\"i\">peek</span>(<span class=\"n\">0</span>));\n        <span class=\"t\">Value</span> <span class=\"i\">value</span> = <span class=\"i\">pop</span>();\n        <span class=\"i\">pop</span>();\n        <span class=\"i\">push</span>(<span class=\"i\">value</span>);\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_EQUAL: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>This is a little more complex than <code>OP_GET_PROPERTY</code>. When this executes, the\nvalue to be stored is on top of the stack, and just below it is the instance\nwhose field is being set. Like before, we read the instruction&rsquo;s operand and find the\nfield name string. Using that, we store the value on top of the stack into the\ninstance&rsquo;s field table.</p>\n<p>After that is a little <span name=\"stack\">stack</span> juggling. We pop the\nstored value off, then pop the instance, and finally push the value back on. In\nother words, we remove the <em>second</em> element from the stack while leaving the top\nalone. A setter is itself an expression whose result is the assigned value, so\nwe need to leave that value on the stack. 
Here&rsquo;s what I mean:</p>\n<aside name=\"stack\">\n<p>The stack operations go like this:</p><img src=\"image/classes-and-instances/stack.png\" alt=\"Popping two values and then pushing the first value back on the stack.\"/>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Toast</span> {}\n<span class=\"k\">var</span> <span class=\"i\">toast</span> = <span class=\"t\">Toast</span>();\n<span class=\"k\">print</span> <span class=\"i\">toast</span>.<span class=\"i\">jam</span> = <span class=\"s\">&quot;grape&quot;</span>; <span class=\"c\">// Prints &quot;grape&quot;.</span>\n</pre></div>\n<p>Unlike when reading a field, we don&rsquo;t need to worry about the hash table not\ncontaining the field. A setter implicitly creates the field if needed. We do\nneed to handle the user incorrectly trying to store a field on a value that\nisn&rsquo;t an instance.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_SET_PROPERTY: {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">        <span class=\"k\">if</span> (!<span class=\"a\">IS_INSTANCE</span>(<span class=\"i\">peek</span>(<span class=\"n\">1</span>))) {\n          <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Only instances have fields.&quot;</span>);\n          <span class=\"k\">return</span> <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>;\n        }\n\n</pre><pre class=\"insert-after\">        ObjInstance* instance = AS_INSTANCE(peek(1));\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>Exactly like with get expressions, we check the value&rsquo;s type and report a\nruntime error if it&rsquo;s invalid. And, with that, the stateful side of Lox&rsquo;s\nsupport for object-oriented programming is in place. 
Give it a try:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Pair</span> {}\n\n<span class=\"k\">var</span> <span class=\"i\">pair</span> = <span class=\"t\">Pair</span>();\n<span class=\"i\">pair</span>.<span class=\"i\">first</span> = <span class=\"n\">1</span>;\n<span class=\"i\">pair</span>.<span class=\"i\">second</span> = <span class=\"n\">2</span>;\n<span class=\"k\">print</span> <span class=\"i\">pair</span>.<span class=\"i\">first</span> + <span class=\"i\">pair</span>.<span class=\"i\">second</span>; <span class=\"c\">// 3.</span>\n</pre></div>\n<p>This doesn&rsquo;t really feel very <em>object</em>-oriented. It&rsquo;s more like a strange,\ndynamically typed variant of C where objects are loose struct-like bags of data.\nSort of a dynamic procedural language. But this is a big step in expressiveness.\nOur Lox implementation now lets users freely aggregate data into bigger units.\nIn the next chapter, we will breathe life into those inert blobs.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Trying to access a non-existent field on an object immediately aborts the\nentire VM. The user has no way to recover from this runtime error, nor is\nthere any way to see if a field exists <em>before</em> trying to access it. It&rsquo;s up\nto the user to ensure on their own that only valid fields are read.</p>\n<p>How do other dynamically typed languages handle missing fields? What do you\nthink Lox should do? Implement your solution.</p>\n</li>\n<li>\n<p>Fields are accessed at runtime by their <em>string</em> name. But that name must\nalways appear directly in the source code as an <em>identifier token</em>. A user\nprogram cannot imperatively build a string value and then use that as the\nname of a field. Do you think they should be able to? 
Devise a language\nfeature that enables that and implement it.</p>\n</li>\n<li>\n<p>Conversely, Lox offers no way to <em>remove</em> a field from an instance. You can\nset a field&rsquo;s value to <code>nil</code>, but the entry in the hash table is still\nthere. How do other languages handle this? Choose and implement a strategy\nfor Lox.</p>\n</li>\n<li>\n<p>Because fields are accessed by name at runtime, working with instance state\nis slow. It&rsquo;s technically a constant-time operation<span class=\"em\">&mdash;</span>thanks, hash tables<span class=\"em\">&mdash;</span>but the constant factors are relatively large. This is a major component\nof why dynamic languages are slower than statically typed ones.</p>\n<p>How do sophisticated implementations of dynamically typed languages cope\nwith and optimize this?</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"methods-and-initializers.html\" class=\"next\">\n  Next Chapter: &ldquo;Methods and Initializers&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/classes.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Classes &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Classes<small>12</small></a></h3>\n\n<ul>\n    <li><a href=\"#oop-and-classes\"><small>12.1</small> OOP and Classes</a></li>\n    <li><a href=\"#class-declarations\"><small>12.2</small> Class Declarations</a></li>\n    <li><a href=\"#creating-instances\"><small>12.3</small> Creating Instances</a></li>\n    <li><a href=\"#properties-on-instances\"><small>12.4</small> Properties on Instances</a></li>\n    <li><a href=\"#methods-on-classes\"><small>12.5</small> Methods on Classes</a></li>\n    <li><a 
href=\"#this\"><small>12.6</small> This</a></li>\n    <li><a href=\"#constructors-and-initializers\"><small>12.7</small> Constructors and Initializers</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Prototypes and Power</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"resolving-and-binding.html\" title=\"Resolving and Binding\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"inheritance.html\" title=\"Inheritance\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"resolving-and-binding.html\" title=\"Resolving and Binding\" class=\"prev\">←</a>\n<a href=\"inheritance.html\" title=\"Inheritance\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Classes<small>12</small></a></h3>\n\n<ul>\n    <li><a href=\"#oop-and-classes\"><small>12.1</small> OOP and Classes</a></li>\n    <li><a href=\"#class-declarations\"><small>12.2</small> Class Declarations</a></li>\n    <li><a href=\"#creating-instances\"><small>12.3</small> Creating Instances</a></li>\n    <li><a href=\"#properties-on-instances\"><small>12.4</small> Properties on Instances</a></li>\n    <li><a href=\"#methods-on-classes\"><small>12.5</small> Methods on Classes</a></li>\n    <li><a href=\"#this\"><small>12.6</small> This</a></li>\n    <li><a href=\"#constructors-and-initializers\"><small>12.7</small> Constructors and Initializers</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a 
href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Prototypes and Power</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"resolving-and-binding.html\" title=\"Resolving and Binding\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"inheritance.html\" title=\"Inheritance\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">12</div>\n  <h1>Classes</h1>\n\n<blockquote>\n<p>One has no right to love or hate anything if one has not acquired a thorough\nknowledge of its nature. Great love springs from great knowledge of the\nbeloved object, and if you know it but little you will be able to love it only\na little or not at all.</p>\n<p><cite>Leonardo da Vinci</cite></p>\n</blockquote>\n<p>We&rsquo;re eleven chapters in, and the interpreter sitting on your machine is nearly\na complete scripting language. It could use a couple of built-in data structures\nlike lists and maps, and it certainly needs a core library for file I/O, user\ninput, etc. But the language itself is sufficient. We&rsquo;ve got a little procedural\nlanguage in the same vein as BASIC, Tcl, Scheme (minus macros), and early\nversions of Python and Lua.</p>\n<p>If this were the &rsquo;80s, we&rsquo;d stop here. But today, many popular languages support\n&ldquo;object-oriented programming&rdquo;. Adding that to Lox will give users a familiar set\nof tools for writing larger programs. Even if you personally don&rsquo;t <span\nname=\"hate\">like</span> OOP, this chapter and <a href=\"inheritance.html\">the next</a> will help\nyou understand how others design and build object systems.</p>\n<aside name=\"hate\">\n<p>If you <em>really</em> hate classes, though, you can skip these two chapters. 
They are\nfairly isolated from the rest of the book. Personally, I find it&rsquo;s good to learn\nmore about the things I dislike. Things look simple at a distance, but as I get\ncloser, details emerge and I gain a more nuanced perspective.</p>\n</aside>\n<h2><a href=\"#oop-and-classes\" id=\"oop-and-classes\"><small>12&#8202;.&#8202;1</small>OOP and Classes</a></h2>\n<p>There are three broad paths to object-oriented programming: classes,\n<a href=\"http://gameprogrammingpatterns.com/prototype.html\">prototypes</a>, and <span name=\"multimethods\"><a href=\"https://en.wikipedia.org/wiki/Multiple_dispatch\">multimethods</a></span>. Classes\ncame first and are the most popular style. With the rise of JavaScript (and to a\nlesser extent <a href=\"https://www.lua.org/pil/13.4.1.html\">Lua</a>), prototypes are more widely known than they used to be.\nI&rsquo;ll talk more about those <a href=\"#design-note\">later</a>. For Lox, we&rsquo;re taking the, ahem, classic\napproach.</p>\n<aside name=\"multimethods\">\n<p>Multimethods are the approach you&rsquo;re least likely to be familiar with. I&rsquo;d love\nto talk more about them<span class=\"em\">&mdash;</span>I designed <a href=\"http://magpie-lang.org/\">a hobby language</a> around them\nonce and they are <em>super rad</em><span class=\"em\">&mdash;</span>but there are only so many pages I can fit in.\nIf you&rsquo;d like to learn more, take a look at <a href=\"https://en.wikipedia.org/wiki/Common_Lisp_Object_System\">CLOS</a> (the object system in\nCommon Lisp), <a href=\"https://opendylan.org/\">Dylan</a>, <a href=\"https://julialang.org/\">Julia</a>, or <a href=\"https://docs.raku.org/language/functions#Multi-dispatch\">Raku</a>.</p>\n</aside>\n<p>Since you&rsquo;ve written about a thousand lines of Java code with me already, I&rsquo;m\nassuming you don&rsquo;t need a detailed introduction to object orientation. The main\ngoal is to bundle data with the code that acts on it. 
Users do that by declaring\na <em>class</em> that:</p>\n<p><span name=\"circle\"></span></p>\n<ol>\n<li>\n<p>Exposes a <em>constructor</em> to create and initialize new <em>instances</em> of the\nclass</p>\n</li>\n<li>\n<p>Provides a way to store and access <em>fields</em> on instances</p>\n</li>\n<li>\n<p>Defines a set of <em>methods</em> shared by all instances of the class that\noperate on each instance&rsquo;s state.</p>\n</li>\n</ol>\n<p>That&rsquo;s about as minimal as it gets. Most object-oriented languages, all the way\nback to Simula, also do inheritance to reuse behavior across classes. We&rsquo;ll add\nthat in the <a href=\"inheritance.html\">next chapter</a>. Even kicking that out, we still have a\nlot to get through. This is a big chapter and everything doesn&rsquo;t quite come\ntogether until we have all of the above pieces, so gather your stamina.</p>\n<aside name=\"circle\"><img src=\"image/classes/circle.png\" alt=\"The relationships between classes, methods, instances, constructors, and fields.\" />\n<p>It&rsquo;s like the circle of life, <em>sans</em> Sir Elton John.</p>\n</aside>\n<h2><a href=\"#class-declarations\" id=\"class-declarations\"><small>12&#8202;.&#8202;2</small>Class Declarations</a></h2>\n<p>Like we do, we&rsquo;re gonna start with syntax. 
A <code>class</code> statement introduces a new\nname, so it lives in the <code>declaration</code> grammar rule.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">declaration</span>    → <span class=\"i\">classDecl</span>\n               | <span class=\"i\">funDecl</span>\n               | <span class=\"i\">varDecl</span>\n               | <span class=\"i\">statement</span> ;\n\n<span class=\"i\">classDecl</span>      → <span class=\"s\">&quot;class&quot;</span> <span class=\"t\">IDENTIFIER</span> <span class=\"s\">&quot;{&quot;</span> <span class=\"i\">function</span>* <span class=\"s\">&quot;}&quot;</span> ;\n</pre></div>\n<p>The new <code>classDecl</code> rule relies on the <code>function</code> rule we defined\n<a href=\"functions.html#function-declarations\">earlier</a>. To refresh your memory:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">function</span>       → <span class=\"t\">IDENTIFIER</span> <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">parameters</span>? <span class=\"s\">&quot;)&quot;</span> <span class=\"i\">block</span> ;\n<span class=\"i\">parameters</span>     → <span class=\"t\">IDENTIFIER</span> ( <span class=\"s\">&quot;,&quot;</span> <span class=\"t\">IDENTIFIER</span> )* ;\n</pre></div>\n<p>In plain English, a class declaration is the <code>class</code> keyword, followed by the\nclass&rsquo;s name, then a curly-braced body. Inside that body is a list of method\ndeclarations. Unlike function declarations, methods don&rsquo;t have a leading <span\nname=\"fun\"><code>fun</code></span> keyword. Each method is a name, parameter list, and\nbody. 
Here&rsquo;s an example:</p>\n<aside name=\"fun\">\n<p>Not that I&rsquo;m trying to say methods aren&rsquo;t fun or anything.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Breakfast</span> {\n  <span class=\"i\">cook</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Eggs a-fryin&#39;!&quot;</span>;\n  }\n\n  <span class=\"i\">serve</span>(<span class=\"i\">who</span>) {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Enjoy your breakfast, &quot;</span> + <span class=\"i\">who</span> + <span class=\"s\">&quot;.&quot;</span>;\n  }\n}\n</pre></div>\n<p>As in most dynamically typed languages, fields are not explicitly listed in the\nclass declaration. Instances are loose bags of data and you can freely add\nfields to them as you see fit using normal imperative code.</p>\n<p>Over in our AST generator, the <code>classDecl</code> grammar rule gets its own statement\n<span name=\"class-ast\">node</span>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Block      : List&lt;Stmt&gt; statements&quot;,\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;Class      : Token name, List&lt;Stmt.Function&gt; methods&quot;</span>,\n</pre><pre class=\"insert-after\">      &quot;Expression : Expr expression&quot;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<aside name=\"class-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#class-statement\">Appendix II</a>.</p>\n</aside>\n<p>It stores the class&rsquo;s name and the methods inside its body. Methods are\nrepresented by the existing Stmt.Function class that we use for function\ndeclaration AST nodes. 
That gives us all the bits of state that we need for a\nmethod: name, parameter list, and body.</p>\n<p>A class can appear anywhere a named declaration is allowed, triggered by the\nleading <code>class</code> keyword.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    try {\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>declaration</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">CLASS</span>)) <span class=\"k\">return</span> <span class=\"i\">classDeclaration</span>();\n</pre><pre class=\"insert-after\">      if (match(FUN)) return function(&quot;function&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>declaration</em>()</div>\n\n<p>That calls out to:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>declaration</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Stmt</span> <span class=\"i\">classDeclaration</span>() {\n    <span class=\"t\">Token</span> <span class=\"i\">name</span> = <span class=\"i\">consume</span>(<span class=\"i\">IDENTIFIER</span>, <span class=\"s\">&quot;Expect class name.&quot;</span>);\n    <span class=\"i\">consume</span>(<span class=\"i\">LEFT_BRACE</span>, <span class=\"s\">&quot;Expect &#39;{&#39; before class body.&quot;</span>);\n\n    <span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>.<span class=\"t\">Function</span>&gt; <span class=\"i\">methods</span> = <span class=\"k\">new</span> <span class=\"t\">ArrayList</span>&lt;&gt;();\n    <span class=\"k\">while</span> (!<span class=\"i\">check</span>(<span class=\"i\">RIGHT_BRACE</span>) &amp;&amp; !<span class=\"i\">isAtEnd</span>()) {\n      <span class=\"i\">methods</span>.<span class=\"i\">add</span>(<span class=\"i\">function</span>(<span class=\"s\">&quot;method&quot;</span>));\n    }\n\n    <span class=\"i\">consume</span>(<span 
class=\"i\">RIGHT_BRACE</span>, <span class=\"s\">&quot;Expect &#39;}&#39; after class body.&quot;</span>);\n\n    <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">Class</span>(<span class=\"i\">name</span>, <span class=\"i\">methods</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>declaration</em>()</div>\n\n<p>There&rsquo;s more meat to this than most of the other parsing methods, but it roughly\nfollows the grammar. We&rsquo;ve already consumed the <code>class</code> keyword, so we look for\nthe expected class name next, followed by the opening curly brace. Once inside\nthe body, we keep parsing method declarations until we hit the closing brace.\nEach method declaration is parsed by a call to <code>function()</code>, which we defined\nback in the <a href=\"functions.html\">chapter where functions were introduced</a>.</p>\n<p>Like we do in any open-ended loop in the parser, we also check for hitting the\nend of the file. 
That won&rsquo;t happen in correct code since a class should have a\nclosing brace at the end, but it ensures the parser doesn&rsquo;t get stuck in an\ninfinite loop if the user has a syntax error and forgets to correctly end the\nclass body.</p>\n<p>We wrap the name and list of methods into a Stmt.Class node and we&rsquo;re done.\nPreviously, we would jump straight into the interpreter, but now we need to\nplumb the node through the resolver first.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitBlockStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitClassStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Class</span> <span class=\"i\">stmt</span>) {\n    <span class=\"i\">declare</span>(<span class=\"i\">stmt</span>.<span class=\"i\">name</span>);\n    <span class=\"i\">define</span>(<span class=\"i\">stmt</span>.<span class=\"i\">name</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitBlockStmt</em>()</div>\n\n<p>We aren&rsquo;t going to worry about resolving the methods themselves yet, so for now\nall we need to do is declare the class using its name. 
It&rsquo;s not common to\ndeclare a class as a local variable, but Lox permits it, so we need to handle it\ncorrectly.</p>\n<p>Now we interpret the class declaration.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitBlockStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitClassStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Class</span> <span class=\"i\">stmt</span>) {\n    <span class=\"i\">environment</span>.<span class=\"i\">define</span>(<span class=\"i\">stmt</span>.<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>, <span class=\"k\">null</span>);\n    <span class=\"t\">LoxClass</span> <span class=\"i\">klass</span> = <span class=\"k\">new</span> <span class=\"t\">LoxClass</span>(<span class=\"i\">stmt</span>.<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>);\n    <span class=\"i\">environment</span>.<span class=\"i\">assign</span>(<span class=\"i\">stmt</span>.<span class=\"i\">name</span>, <span class=\"i\">klass</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitBlockStmt</em>()</div>\n\n<p>This looks similar to how we execute function declarations. We declare the\nclass&rsquo;s name in the current environment. Then we turn the class <em>syntax node</em>\ninto a LoxClass, the <em>runtime</em> representation of a class. We circle back and\nstore the class object in the variable we previously declared. 
That two-stage\nvariable binding process allows references to the class inside its own methods.</p>\n<p>We will refine it throughout the chapter, but the first draft of LoxClass looks\nlike this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxClass.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">import</span> <span class=\"i\">java.util.List</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.util.Map</span>;\n\n<span class=\"k\">class</span> <span class=\"t\">LoxClass</span> {\n  <span class=\"k\">final</span> <span class=\"t\">String</span> <span class=\"i\">name</span>;\n\n  <span class=\"t\">LoxClass</span>(<span class=\"t\">String</span> <span class=\"i\">name</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">name</span> = <span class=\"i\">name</span>;\n  }\n\n  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">String</span> <span class=\"i\">toString</span>() {\n    <span class=\"k\">return</span> <span class=\"i\">name</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxClass.java</em>, create new file</div>\n\n<p>Literally a wrapper around a name. We don&rsquo;t even store the methods yet. 
Not\nsuper useful, but it does have a <code>toString()</code> method so we can write a trivial\nscript and test that class objects are actually being parsed and executed.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">DevonshireCream</span> {\n  <span class=\"i\">serveOn</span>() {\n    <span class=\"k\">return</span> <span class=\"s\">&quot;Scones&quot;</span>;\n  }\n}\n\n<span class=\"k\">print</span> <span class=\"t\">DevonshireCream</span>; <span class=\"c\">// Prints &quot;DevonshireCream&quot;.</span>\n</pre></div>\n<h2><a href=\"#creating-instances\" id=\"creating-instances\"><small>12&#8202;.&#8202;3</small>Creating Instances</a></h2>\n<p>We have classes, but they don&rsquo;t do anything yet. Lox doesn&rsquo;t have &ldquo;static&rdquo;\nmethods that you can call right on the class itself, so without actual\ninstances, classes are useless. Thus instances are the next step.</p>\n<p>While some syntax and semantics are fairly standard across OOP languages, the\nway you create new instances isn&rsquo;t. Ruby, following Smalltalk, creates instances\nby calling a method on the class object itself, a <span\nname=\"turtles\">recursively</span> graceful approach. Some, like C++ and Java,\nhave a <code>new</code> keyword dedicated to birthing a new object. Python has you &ldquo;call&rdquo;\nthe class itself like a function. (JavaScript, ever weird, sort of does both.)</p>\n<aside name=\"turtles\">\n<p>In Smalltalk, even <em>classes</em> are created by calling methods on an existing\nobject, usually the desired superclass. It&rsquo;s sort of a turtles-all-the-way-down\nthing. It ultimately bottoms out on a few magical classes like Object and\nMetaclass that the runtime conjures into being <em>ex nihilo</em>.</p>\n</aside>\n<p>I took a minimal approach with Lox. We already have class objects, and we\nalready have function calls, so we&rsquo;ll use call expressions on class objects to\ncreate new instances. 
It&rsquo;s as if a class is a factory function that generates\ninstances of itself. This feels elegant to me, and also spares us the need to\nintroduce syntax like <code>new</code>. Therefore, we can skip past the front end straight\ninto the runtime.</p>\n<p>Right now, if you try this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Bagel</span> {}\n<span class=\"t\">Bagel</span>();\n</pre></div>\n<p>You get a runtime error. <code>visitCallExpr()</code> checks to see if the called object\nimplements <code>LoxCallable</code> and reports an error since LoxClass doesn&rsquo;t. Not <em>yet</em>,\nthat is.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">import java.util.Map;\n\n</pre><div class=\"source-file\"><em>lox/LoxClass.java</em><br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">class</span> <span class=\"t\">LoxClass</span> <span class=\"k\">implements</span> <span class=\"t\">LoxCallable</span> {\n</pre><pre class=\"insert-after\">  final String name;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxClass.java</em>, replace 1 line</div>\n\n<p>Implementing that interface requires two methods.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxClass.java</em><br>\nadd after <em>toString</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">call</span>(<span class=\"t\">Interpreter</span> <span class=\"i\">interpreter</span>,\n                     <span class=\"t\">List</span>&lt;<span class=\"t\">Object</span>&gt; <span class=\"i\">arguments</span>) {\n    <span class=\"t\">LoxInstance</span> <span class=\"i\">instance</span> = <span class=\"k\">new</span> <span class=\"t\">LoxInstance</span>(<span class=\"k\">this</span>);\n    <span class=\"k\">return</span> <span class=\"i\">instance</span>;\n  }\n\n  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span 
class=\"t\">int</span> <span class=\"i\">arity</span>() {\n    <span class=\"k\">return</span> <span class=\"n\">0</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxClass.java</em>, add after <em>toString</em>()</div>\n\n<p>The interesting one is <code>call()</code>. When you &ldquo;call&rdquo; a class, it instantiates a new\nLoxInstance for the called class and returns it. The <code>arity()</code> method is how the\ninterpreter validates that you passed the right number of arguments to a\ncallable. For now, we&rsquo;ll say you can&rsquo;t pass any. When we get to user-defined\nconstructors, we&rsquo;ll revisit this.</p>\n<p>That leads us to LoxInstance, the runtime representation of an instance of a Lox\nclass. Again, our first implementation starts small.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxInstance.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">import</span> <span class=\"i\">java.util.HashMap</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.util.Map</span>;\n\n<span class=\"k\">class</span> <span class=\"t\">LoxInstance</span> {\n  <span class=\"k\">private</span> <span class=\"t\">LoxClass</span> <span class=\"i\">klass</span>;\n\n  <span class=\"t\">LoxInstance</span>(<span class=\"t\">LoxClass</span> <span class=\"i\">klass</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">klass</span> = <span class=\"i\">klass</span>;\n  }\n\n  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">String</span> <span class=\"i\">toString</span>() {\n    <span class=\"k\">return</span> <span class=\"i\">klass</span>.<span class=\"i\">name</span> + <span class=\"s\">&quot; instance&quot;</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxInstance.java</em>, create new file</div>\n\n<p>Like LoxClass, it&rsquo;s pretty bare bones, but 
we&rsquo;re only getting started. If you\nwant to give it a try, here&rsquo;s a script to run:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Bagel</span> {}\n<span class=\"k\">var</span> <span class=\"i\">bagel</span> = <span class=\"t\">Bagel</span>();\n<span class=\"k\">print</span> <span class=\"i\">bagel</span>; <span class=\"c\">// Prints &quot;Bagel instance&quot;.</span>\n</pre></div>\n<p>This program doesn&rsquo;t do much, but it&rsquo;s starting to do <em>something</em>.</p>\n<h2><a href=\"#properties-on-instances\" id=\"properties-on-instances\"><small>12&#8202;.&#8202;4</small>Properties on Instances</a></h2>\n<p>We have instances, so we should make them useful. We&rsquo;re at a fork in the road.\nWe could add behavior first<span class=\"em\">&mdash;</span>methods<span class=\"em\">&mdash;</span>or we could start with state<span class=\"em\">&mdash;</span>properties. We&rsquo;re going to take the latter because, as we&rsquo;ll see, the two get\nentangled in an interesting way and it will be easier to make sense of them if\nwe get properties working first.</p>\n<p>Lox follows JavaScript and Python in how it handles state. Every instance is an\nopen collection of named values. Methods on the instance&rsquo;s class can access and\nmodify properties, but so can <span name=\"outside\">outside</span> code.\nProperties are accessed using a <code>.</code> syntax.</p>\n<aside name=\"outside\">\n<p>Allowing code outside of the class to directly modify an object&rsquo;s fields goes\nagainst the object-oriented credo that a class <em>encapsulates</em> state. Some\nlanguages take a more principled stance. In Smalltalk, fields are accessed using\nsimple identifiers<span class=\"em\">&mdash;</span>essentially, variables that are only in scope inside a\nclass&rsquo;s methods. Ruby uses <code>@</code> followed by a name to access a field in an\nobject. 
That syntax is only meaningful inside a method and always accesses state\non the current object.</p>\n<p>Lox, for better or worse, isn&rsquo;t quite so pious about its OOP faith.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"i\">someObject</span>.<span class=\"i\">someProperty</span>\n</pre></div>\n<p>An expression followed by <code>.</code> and an identifier reads the property with that\nname from the object the expression evaluates to. That dot has the same\nprecedence as the parentheses in a function call expression, so we slot it into\nthe grammar by replacing the existing <code>call</code> rule with:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">call</span>           → <span class=\"i\">primary</span> ( <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">arguments</span>? <span class=\"s\">&quot;)&quot;</span> | <span class=\"s\">&quot;.&quot;</span> <span class=\"t\">IDENTIFIER</span> )* ;\n</pre></div>\n<p>After a primary expression, we allow a series of any mixture of parenthesized\ncalls and dotted property accesses. 
&ldquo;Property access&rdquo; is a mouthful, so from\nhere on out, we&rsquo;ll call these &ldquo;get expressions&rdquo;.</p>\n<h3><a href=\"#get-expressions\" id=\"get-expressions\"><small>12&#8202;.&#8202;4&#8202;.&#8202;1</small>Get expressions</a></h3>\n<p>The <span name=\"get-ast\">syntax tree node</span> is:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Call     : Expr callee, Token paren, List&lt;Expr&gt; arguments&quot;,\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;Get      : Expr object, Token name&quot;</span>,\n</pre><pre class=\"insert-after\">      &quot;Grouping : Expr expression&quot;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<aside name=\"get-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#get-expression\">Appendix II</a>.</p>\n</aside>\n<p>Following the grammar, the new parsing code goes in our existing <code>call()</code>\nmethod.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    while (true) {<span name=\"while-true\"> </span>\n      if (match(LEFT_PAREN)) {\n        expr = finishCall(expr);\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>call</em>()</div>\n<pre class=\"insert\">      } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">DOT</span>)) {\n        <span class=\"t\">Token</span> <span class=\"i\">name</span> = <span class=\"i\">consume</span>(<span class=\"i\">IDENTIFIER</span>,\n            <span class=\"s\">&quot;Expect property name after &#39;.&#39;.&quot;</span>);\n        <span class=\"i\">expr</span> = <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Get</span>(<span class=\"i\">expr</span>, <span class=\"i\">name</span>);\n</pre><pre class=\"insert-after\">      } else {\n        
break;\n      }\n    }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>call</em>()</div>\n\n<p>The outer <code>while</code> loop there corresponds to the <code>*</code> in the grammar rule. We zip\nalong the tokens building up a chain of calls and gets as we find parentheses\nand dots, like so:</p><img src=\"image/classes/zip.png\" alt=\"Parsing a series of '.' and '()' expressions to an AST.\" />\n<p>Instances of the new Expr.Get node feed into the resolver.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitCallExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitGetExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Get</span> <span class=\"i\">expr</span>) {\n    <span class=\"i\">resolve</span>(<span class=\"i\">expr</span>.<span class=\"i\">object</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitCallExpr</em>()</div>\n\n<p>OK, not much to that. Since properties are looked up <span\nname=\"dispatch\">dynamically</span>, they don&rsquo;t get resolved. During resolution,\nwe recurse only into the expression to the left of the dot. 
The actual property\naccess happens in the interpreter.</p>\n<aside name=\"dispatch\">\n<p>You can literally see that property dispatch in Lox is dynamic since we don&rsquo;t\nprocess the property name during the static resolution pass.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitCallExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">visitGetExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Get</span> <span class=\"i\">expr</span>) {\n    <span class=\"t\">Object</span> <span class=\"i\">object</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">expr</span>.<span class=\"i\">object</span>);\n    <span class=\"k\">if</span> (<span class=\"i\">object</span> <span class=\"k\">instanceof</span> <span class=\"t\">LoxInstance</span>) {\n      <span class=\"k\">return</span> ((<span class=\"t\">LoxInstance</span>) <span class=\"i\">object</span>).<span class=\"i\">get</span>(<span class=\"i\">expr</span>.<span class=\"i\">name</span>);\n    }\n\n    <span class=\"k\">throw</span> <span class=\"k\">new</span> <span class=\"t\">RuntimeError</span>(<span class=\"i\">expr</span>.<span class=\"i\">name</span>,\n        <span class=\"s\">&quot;Only instances have properties.&quot;</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitCallExpr</em>()</div>\n\n<p>First, we evaluate the expression whose property is being accessed. In Lox, only\ninstances of classes have properties. If the object is some other type like a\nnumber, invoking a getter on it is a runtime error.</p>\n<p>If the object is a LoxInstance, then we ask it to look up the property. It must\nbe time to give LoxInstance some actual state. 
A map will do fine.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  private LoxClass klass;\n</pre><div class=\"source-file\"><em>lox/LoxInstance.java</em><br>\nin class <em>LoxInstance</em></div>\n<pre class=\"insert\">  <span class=\"k\">private</span> <span class=\"k\">final</span> <span class=\"t\">Map</span>&lt;<span class=\"t\">String</span>, <span class=\"t\">Object</span>&gt; <span class=\"i\">fields</span> = <span class=\"k\">new</span> <span class=\"t\">HashMap</span>&lt;&gt;();\n</pre><pre class=\"insert-after\">\n\n  LoxInstance(LoxClass klass) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxInstance.java</em>, in class <em>LoxInstance</em></div>\n\n<p>Each key in the map is a property name and the corresponding value is the\nproperty&rsquo;s value. To look up a property on an instance:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxInstance.java</em><br>\nadd after <em>LoxInstance</em>()</div>\n<pre>  <span class=\"t\">Object</span> <span class=\"i\">get</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">fields</span>.<span class=\"i\">containsKey</span>(<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>)) {\n      <span class=\"k\">return</span> <span class=\"i\">fields</span>.<span class=\"i\">get</span>(<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>);\n    }\n\n    <span class=\"k\">throw</span> <span class=\"k\">new</span> <span class=\"t\">RuntimeError</span>(<span class=\"i\">name</span>,<span name=\"hidden\"> </span>\n        <span class=\"s\">&quot;Undefined property &#39;&quot;</span> + <span class=\"i\">name</span>.<span class=\"i\">lexeme</span> + <span class=\"s\">&quot;&#39;.&quot;</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxInstance.java</em>, add after <em>LoxInstance</em>()</div>\n\n<aside name=\"hidden\">\n<p>Doing a hash table lookup for every field access is 
fast enough for many\nlanguage implementations, but not ideal. High performance VMs for languages like\nJavaScript use sophisticated optimizations like &ldquo;<a href=\"http://richardartoul.github.io/jekyll/update/2015/04/26/hidden-classes.html\">hidden classes</a>&rdquo; to avoid\nthat overhead.</p>\n<p>Paradoxically, many of the optimizations invented to make dynamic languages fast\nrest on the observation that<span class=\"em\">&mdash;</span>even in those languages<span class=\"em\">&mdash;</span>most code is fairly\nstatic in terms of the types of objects it works with and their fields.</p>\n</aside>\n<p>An interesting edge case we need to handle is what happens if the instance\ndoesn&rsquo;t <em>have</em> a property with the given name. We could silently return some\ndummy value like <code>nil</code>, but my experience with languages like JavaScript is that\nthis behavior masks bugs more often than it does anything useful. Instead, we&rsquo;ll\nmake it a runtime error.</p>\n<p>So the first thing we do is see if the instance actually has a field with the\ngiven name. Only then do we return it. Otherwise, we raise an error.</p>\n<p>Note how I switched from talking about &ldquo;properties&rdquo; to &ldquo;fields&rdquo;. There is a\nsubtle difference between the two. Fields are named bits of state stored\ndirectly in an instance. Properties are the named, uh, <em>things</em>, that a get\nexpression may return. Every field is a property, but as we&rsquo;ll see <span\nname=\"foreshadowing\">later</span>, not every property is a field.</p>\n<aside name=\"foreshadowing\">\n<p>Ooh, foreshadowing. Spooky!</p>\n</aside>\n<p>In theory, we can now read properties on objects. But since there&rsquo;s no way to\nactually stuff any state into an instance, there are no fields to access. 
Before\nwe can test out reading, we must support writing.</p>\n<h3><a href=\"#set-expressions\" id=\"set-expressions\"><small>12&#8202;.&#8202;4&#8202;.&#8202;2</small>Set expressions</a></h3>\n<p>Setters use the same syntax as getters, except they appear on the left side of\nan assignment.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">someObject</span>.<span class=\"i\">someProperty</span> = <span class=\"i\">value</span>;\n</pre></div>\n<p>In grammar land, we extend the rule for assignment to allow dotted identifiers\non the left-hand side.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">assignment</span>     → ( <span class=\"i\">call</span> <span class=\"s\">&quot;.&quot;</span> )? <span class=\"t\">IDENTIFIER</span> <span class=\"s\">&quot;=&quot;</span> <span class=\"i\">assignment</span>\n               | <span class=\"i\">logic_or</span> ;\n</pre></div>\n<p>Unlike getters, setters don&rsquo;t chain. However, the reference to <code>call</code> allows any\nhigh-precedence expression before the last dot, including any number of\n<em>getters</em>, as in:</p><img src=\"image/classes/setter.png\" alt=\"breakfast.omelette.filling.meat = ham\" />\n<p>Note here that only the <em>last</em> part, the <code>.meat</code>, is the <em>setter</em>. 
The\n<code>.omelette</code> and <code>.filling</code> parts are both <em>get</em> expressions.</p>\n<p>Just as we have two separate AST nodes for variable access and variable\nassignment, we need a <span name=\"set-ast\">second setter node</span> to\ncomplement our getter node.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Logical  : Expr left, Token operator, Expr right&quot;,\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;Set      : Expr object, Token name, Expr value&quot;</span>,\n</pre><pre class=\"insert-after\">      &quot;Unary    : Token operator, Expr right&quot;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<aside name=\"set-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#set-expression\">Appendix II</a>.</p>\n</aside>\n<p>In case you don&rsquo;t remember, the way we handle assignment in the parser is a\nlittle funny. We can&rsquo;t easily tell that a series of tokens is the left-hand side\nof an assignment until we reach the <code>=</code>. 
Now that our assignment grammar rule\nhas <code>call</code> on the left side, which can expand to arbitrarily large expressions,\nthat final <code>=</code> may be many tokens away from the point where we need to know\nwe&rsquo;re parsing an assignment.</p>\n<p>Instead, the trick we do is parse the left-hand side as a normal expression.\nThen, when we stumble onto the equal sign after it, we take the expression we\nalready parsed and transform it into the correct syntax tree node for the\nassignment.</p>\n<p>We add another clause to that transformation to handle turning an Expr.Get\nexpression on the left into the corresponding Expr.Set.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        return new Expr.Assign(name, value);\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>assignment</em>()</div>\n<pre class=\"insert\">      } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">expr</span> <span class=\"k\">instanceof</span> <span class=\"t\">Expr</span>.<span class=\"t\">Get</span>) {\n        <span class=\"t\">Expr</span>.<span class=\"t\">Get</span> <span class=\"i\">get</span> = (<span class=\"t\">Expr</span>.<span class=\"t\">Get</span>)<span class=\"i\">expr</span>;\n        <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Set</span>(<span class=\"i\">get</span>.<span class=\"i\">object</span>, <span class=\"i\">get</span>.<span class=\"i\">name</span>, <span class=\"i\">value</span>);\n</pre><pre class=\"insert-after\">      }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>assignment</em>()</div>\n\n<p>That&rsquo;s parsing our syntax. 
We push that node through into the resolver.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitLogicalExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitSetExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Set</span> <span class=\"i\">expr</span>) {\n    <span class=\"i\">resolve</span>(<span class=\"i\">expr</span>.<span class=\"i\">value</span>);\n    <span class=\"i\">resolve</span>(<span class=\"i\">expr</span>.<span class=\"i\">object</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitLogicalExpr</em>()</div>\n\n<p>Again, like Expr.Get, the property itself is dynamically evaluated, so there&rsquo;s\nnothing to resolve there. All we need to do is recurse into the two\nsubexpressions of Expr.Set, the object whose property is being set, and the\nvalue it&rsquo;s being set to.</p>\n<p>That leads us to the interpreter.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitLogicalExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">visitSetExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Set</span> <span class=\"i\">expr</span>) {\n    <span class=\"t\">Object</span> <span class=\"i\">object</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">expr</span>.<span class=\"i\">object</span>);\n\n    <span class=\"k\">if</span> (!(<span class=\"i\">object</span> <span class=\"k\">instanceof</span> <span class=\"t\">LoxInstance</span>)) {<span name=\"order\"> </span>\n      <span class=\"k\">throw</span> <span class=\"k\">new</span> <span class=\"t\">RuntimeError</span>(<span class=\"i\">expr</span>.<span 
class=\"i\">name</span>,\n                             <span class=\"s\">&quot;Only instances have fields.&quot;</span>);\n    }\n\n    <span class=\"t\">Object</span> <span class=\"i\">value</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">expr</span>.<span class=\"i\">value</span>);\n    ((<span class=\"t\">LoxInstance</span>)<span class=\"i\">object</span>).<span class=\"i\">set</span>(<span class=\"i\">expr</span>.<span class=\"i\">name</span>, <span class=\"i\">value</span>);\n    <span class=\"k\">return</span> <span class=\"i\">value</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitLogicalExpr</em>()</div>\n\n<p>We evaluate the object whose property is being set and check to see if it&rsquo;s a\nLoxInstance. If not, that&rsquo;s a runtime error. Otherwise, we evaluate the value\nbeing set and store it on the instance. That relies on a new method in\nLoxInstance.</p>\n<aside name=\"order\">\n<p>This is another semantic edge case. 
There are three distinct operations:</p>\n<ol>\n<li>\n<p>Evaluate the object.</p>\n</li>\n<li>\n<p>Raise a runtime error if it&rsquo;s not an instance of a class.</p>\n</li>\n<li>\n<p>Evaluate the value.</p>\n</li>\n</ol>\n<p>The order that those are performed in could be user visible, which means we need\nto carefully specify it and ensure our implementations do these in the same\norder.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxInstance.java</em><br>\nadd after <em>get</em>()</div>\n<pre>  <span class=\"t\">void</span> <span class=\"i\">set</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>, <span class=\"t\">Object</span> <span class=\"i\">value</span>) {\n    <span class=\"i\">fields</span>.<span class=\"i\">put</span>(<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>, <span class=\"i\">value</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxInstance.java</em>, add after <em>get</em>()</div>\n\n<p>No real magic here. We stuff the values straight into the Java map where fields\nlive. Since Lox allows freely creating new fields on instances, there&rsquo;s no need\nto see if the key is already present.</p>\n<h2><a href=\"#methods-on-classes\" id=\"methods-on-classes\"><small>12&#8202;.&#8202;5</small>Methods on Classes</a></h2>\n<p>You can create instances of classes and stuff data into them, but the class\nitself doesn&rsquo;t really <em>do</em> anything. Instances are just maps and all instances\nare more or less the same. To make them feel like instances <em>of classes</em>, we\nneed behavior<span class=\"em\">&mdash;</span>methods.</p>\n<p>Our helpful parser already parses method declarations, so we&rsquo;re good there. We\nalso don&rsquo;t need to add any new parser support for method <em>calls</em>. We already\nhave <code>.</code> (getters) and <code>()</code> (function calls). 
A &ldquo;method call&rdquo; simply chains\nthose together.</p><img src=\"image/classes/method.png\" alt=\"The syntax tree for 'object.method(argument)'\" />\n<p>That raises an interesting question. What happens when those two expressions are\npulled apart? Assuming that <code>method</code> in this example is a method on the class of\n<code>object</code> and not a field on the instance, what should the following piece of\ncode do?</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">m</span> = <span class=\"i\">object</span>.<span class=\"i\">method</span>;\n<span class=\"i\">m</span>(<span class=\"i\">argument</span>);\n</pre></div>\n<p>This program &ldquo;looks up&rdquo; the method and stores the result<span class=\"em\">&mdash;</span>whatever that is<span class=\"em\">&mdash;</span>in a variable and then calls that object later. Is this allowed? Can you treat a\nmethod like it&rsquo;s a function on the instance?</p>\n<p>What about the other direction?</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Box</span> {}\n\n<span class=\"k\">fun</span> <span class=\"i\">notMethod</span>(<span class=\"i\">argument</span>) {\n  <span class=\"k\">print</span> <span class=\"s\">&quot;called function with &quot;</span> + <span class=\"i\">argument</span>;\n}\n\n<span class=\"k\">var</span> <span class=\"i\">box</span> = <span class=\"t\">Box</span>();\n<span class=\"i\">box</span>.<span class=\"i\">function</span> = <span class=\"i\">notMethod</span>;\n<span class=\"i\">box</span>.<span class=\"i\">function</span>(<span class=\"s\">&quot;argument&quot;</span>);\n</pre></div>\n<p>This program creates an instance and then stores a function in a field on it.\nThen it calls that function using the same syntax as a method call. Does that\nwork?</p>\n<p>Different languages have different answers to these questions. One could write a\ntreatise on it. 
For Lox, we&rsquo;ll say the answer to both of these is yes, it does\nwork. We have a couple of reasons to justify that. For the second example<span class=\"em\">&mdash;</span>calling a function stored in a field<span class=\"em\">&mdash;</span>we want to support that because\nfirst-class functions are useful and storing them in fields is a perfectly\nnormal thing to do.</p>\n<p>The first example is more obscure. One motivation is that users generally expect\nto be able to hoist a subexpression out into a local variable without changing\nthe meaning of the program. You can take this:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">breakfast</span>(<span class=\"i\">omelette</span>.<span class=\"i\">filledWith</span>(<span class=\"i\">cheese</span>), <span class=\"i\">sausage</span>);\n</pre></div>\n<p>And turn it into this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">eggs</span> = <span class=\"i\">omelette</span>.<span class=\"i\">filledWith</span>(<span class=\"i\">cheese</span>);\n<span class=\"i\">breakfast</span>(<span class=\"i\">eggs</span>, <span class=\"i\">sausage</span>);\n</pre></div>\n<p>And it does the same thing. Likewise, since the <code>.</code> and the <code>()</code> in a method\ncall <em>are</em> two separate expressions, it seems you should be able to hoist the\n<em>lookup</em> part into a variable and then call it <span\nname=\"callback\">later</span>. We need to think carefully about what the <em>thing</em>\nyou get when you look up a method is, and how it behaves, even in weird cases\nlike:</p>\n<aside name=\"callback\">\n<p>A motivating use for this is callbacks. Often, you want to pass a callback whose\nbody simply invokes a method on some object. Being able to look up the method and\npass it directly saves you the chore of manually declaring a function to wrap\nit. 
Compare this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">callback</span>(<span class=\"i\">a</span>, <span class=\"i\">b</span>, <span class=\"i\">c</span>) {\n  <span class=\"i\">object</span>.<span class=\"i\">method</span>(<span class=\"i\">a</span>, <span class=\"i\">b</span>, <span class=\"i\">c</span>);\n}\n\n<span class=\"i\">takeCallback</span>(<span class=\"i\">callback</span>);\n</pre></div>\n<p>With this:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">takeCallback</span>(<span class=\"i\">object</span>.<span class=\"i\">method</span>);\n</pre></div>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Person</span> {\n  <span class=\"i\">sayName</span>() {\n    <span class=\"k\">print</span> <span class=\"k\">this</span>.<span class=\"i\">name</span>;\n  }\n}\n\n<span class=\"k\">var</span> <span class=\"i\">jane</span> = <span class=\"t\">Person</span>();\n<span class=\"i\">jane</span>.<span class=\"i\">name</span> = <span class=\"s\">&quot;Jane&quot;</span>;\n\n<span class=\"k\">var</span> <span class=\"i\">method</span> = <span class=\"i\">jane</span>.<span class=\"i\">sayName</span>;\n<span class=\"i\">method</span>(); <span class=\"c\">// ?</span>\n</pre></div>\n<p>If you grab a handle to a method on some instance and call it later, does it\n&ldquo;remember&rdquo; the instance it was pulled off from? 
Does <code>this</code> inside the method\nstill refer to that original object?</p>\n<p>Here&rsquo;s a more pathological example to bend your brain:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Person</span> {\n  <span class=\"i\">sayName</span>() {\n    <span class=\"k\">print</span> <span class=\"k\">this</span>.<span class=\"i\">name</span>;\n  }\n}\n\n<span class=\"k\">var</span> <span class=\"i\">jane</span> = <span class=\"t\">Person</span>();\n<span class=\"i\">jane</span>.<span class=\"i\">name</span> = <span class=\"s\">&quot;Jane&quot;</span>;\n\n<span class=\"k\">var</span> <span class=\"i\">bill</span> = <span class=\"t\">Person</span>();\n<span class=\"i\">bill</span>.<span class=\"i\">name</span> = <span class=\"s\">&quot;Bill&quot;</span>;\n\n<span class=\"i\">bill</span>.<span class=\"i\">sayName</span> = <span class=\"i\">jane</span>.<span class=\"i\">sayName</span>;\n<span class=\"i\">bill</span>.<span class=\"i\">sayName</span>(); <span class=\"c\">// ?</span>\n</pre></div>\n<p>Does that last line print &ldquo;Bill&rdquo; because that&rsquo;s the instance that we <em>called</em>\nthe method through, or &ldquo;Jane&rdquo; because it&rsquo;s the instance where we first grabbed\nthe method?</p>\n<p>Equivalent code in Lua and JavaScript would print &ldquo;Bill&rdquo;. Those languages don&rsquo;t\nreally have a notion of &ldquo;methods&rdquo;. Everything is sort of functions-in-fields, so\nit&rsquo;s not clear that <code>jane</code> &ldquo;owns&rdquo; <code>sayName</code> any more than <code>bill</code> does.</p>\n<p>Lox, though, has real class syntax so we do know which callable things are\nmethods and which are functions. 
Thus, like Python, C#, and others, we will have\nmethods &ldquo;bind&rdquo; <code>this</code> to the original instance when the method is first grabbed.\nPython calls <span name=\"bound\">these</span> <strong>bound methods</strong>.</p>\n<aside name=\"bound\">\n<p>I know, imaginative name, right?</p>\n</aside>\n<p>In practice, that&rsquo;s usually what you want. If you take a reference to a method\non some object so you can use it as a callback later, you want to remember the\ninstance it belonged to, even if that callback happens to be stored in a field\non some other object.</p>\n<p>OK, that&rsquo;s a lot of semantics to load into your head. Forget about the edge\ncases for a bit. We&rsquo;ll get back to those. For now, let&rsquo;s get basic method calls\nworking. We&rsquo;re already parsing the method declarations inside the class body, so\nthe next step is to resolve them.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    define(stmt.name);\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">for</span> (<span class=\"t\">Stmt</span>.<span class=\"t\">Function</span> <span class=\"i\">method</span> : <span class=\"i\">stmt</span>.<span class=\"i\">methods</span>) {\n      <span class=\"t\">FunctionType</span> <span class=\"i\">declaration</span> = <span class=\"t\">FunctionType</span>.<span class=\"i\">METHOD</span>;\n      <span class=\"i\">resolveFunction</span>(<span class=\"i\">method</span>, <span class=\"i\">declaration</span>);<span name=\"local\"> </span>\n    }\n\n</pre><pre class=\"insert-after\">    return null;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitClassStmt</em>()</div>\n\n<aside name=\"local\">\n<p>Storing the function type in a local variable is pointless right now, but we&rsquo;ll\nexpand this code before too long and it will make more sense.</p>\n</aside>\n<p>We iterate through the 
methods in the class body and call the\n<code>resolveFunction()</code> method we wrote for handling function declarations already.\nThe only difference is that we pass in a new FunctionType enum value.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    NONE,\n</pre><pre class=\"insert-before\">    <span class=\"i\">FUNCTION</span><span class=\"insert-comma\">,</span>\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin enum <em>FunctionType</em><br>\nadd <em>&ldquo;,&rdquo;</em> to previous line</div>\n<pre class=\"insert\">    <span class=\"i\">METHOD</span>\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in enum <em>FunctionType</em>, add <em>&ldquo;,&rdquo;</em> to previous line</div>\n\n<p>That&rsquo;s going to be important when we resolve <code>this</code> expressions. For now, don&rsquo;t\nworry about it. The interesting stuff is in the interpreter.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    environment.define(stmt.name.lexeme, null);\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitClassStmt</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">\n\n    <span class=\"t\">Map</span>&lt;<span class=\"t\">String</span>, <span class=\"t\">LoxFunction</span>&gt; <span class=\"i\">methods</span> = <span class=\"k\">new</span> <span class=\"t\">HashMap</span>&lt;&gt;();\n    <span class=\"k\">for</span> (<span class=\"t\">Stmt</span>.<span class=\"t\">Function</span> <span class=\"i\">method</span> : <span class=\"i\">stmt</span>.<span class=\"i\">methods</span>) {\n      <span class=\"t\">LoxFunction</span> <span class=\"i\">function</span> = <span class=\"k\">new</span> <span class=\"t\">LoxFunction</span>(<span class=\"i\">method</span>, <span class=\"i\">environment</span>);\n      <span class=\"i\">methods</span>.<span class=\"i\">put</span>(<span class=\"i\">method</span>.<span class=\"i\">name</span>.<span 
class=\"i\">lexeme</span>, <span class=\"i\">function</span>);\n    }\n\n    <span class=\"t\">LoxClass</span> <span class=\"i\">klass</span> = <span class=\"k\">new</span> <span class=\"t\">LoxClass</span>(<span class=\"i\">stmt</span>.<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>, <span class=\"i\">methods</span>);\n</pre><pre class=\"insert-after\">    environment.assign(stmt.name, klass);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitClassStmt</em>(), replace 1 line</div>\n\n<p>When we interpret a class declaration statement, we turn the syntactic\nrepresentation of the class<span class=\"em\">&mdash;</span>its AST node<span class=\"em\">&mdash;</span>into its runtime representation.\nNow, we need to do that for the methods contained in the class as well. Each\nmethod declaration blossoms into a LoxFunction object.</p>\n<p>We take all of those and wrap them up into a map, keyed by the method names.\nThat gets stored in LoxClass.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  final String name;\n</pre><div class=\"source-file\"><em>lox/LoxClass.java</em><br>\nin class <em>LoxClass</em><br>\nreplace 4 lines</div>\n<pre class=\"insert\">  <span class=\"k\">private</span> <span class=\"k\">final</span> <span class=\"t\">Map</span>&lt;<span class=\"t\">String</span>, <span class=\"t\">LoxFunction</span>&gt; <span class=\"i\">methods</span>;\n\n  <span class=\"t\">LoxClass</span>(<span class=\"t\">String</span> <span class=\"i\">name</span>, <span class=\"t\">Map</span>&lt;<span class=\"t\">String</span>, <span class=\"t\">LoxFunction</span>&gt; <span class=\"i\">methods</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">name</span> = <span class=\"i\">name</span>;\n    <span class=\"k\">this</span>.<span class=\"i\">methods</span> = <span class=\"i\">methods</span>;\n  }\n</pre><pre class=\"insert-after\">\n\n  @Override\n  public String toString() {\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>lox/LoxClass.java</em>, in class <em>LoxClass</em>, replace 4 lines</div>\n\n<p>Where an instance stores state, the class stores behavior. LoxInstance has its\nmap of fields, and LoxClass gets a map of methods. Even though methods are\nowned by the class, they are still accessed through instances of that class.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  Object get(Token name) {\n    if (fields.containsKey(name.lexeme)) {\n      return fields.get(name.lexeme);\n    }\n\n</pre><div class=\"source-file\"><em>lox/LoxInstance.java</em><br>\nin <em>get</em>()</div>\n<pre class=\"insert\">    <span class=\"t\">LoxFunction</span> <span class=\"i\">method</span> = <span class=\"i\">klass</span>.<span class=\"i\">findMethod</span>(<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>);\n    <span class=\"k\">if</span> (<span class=\"i\">method</span> != <span class=\"k\">null</span>) <span class=\"k\">return</span> <span class=\"i\">method</span>;\n\n</pre><pre class=\"insert-after\">    throw new RuntimeError(name,<span name=\"hidden\"> </span>\n        &quot;Undefined property '&quot; + name.lexeme + &quot;'.&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxInstance.java</em>, in <em>get</em>()</div>\n\n<p>When looking up a property on an instance, if we don&rsquo;t <span\nname=\"shadow\">find</span> a matching field, we look for a method with that name\non the instance&rsquo;s class. If found, we return that. This is where the distinction\nbetween &ldquo;field&rdquo; and &ldquo;property&rdquo; becomes meaningful. 
When accessing a property,\nyou might get a field<span class=\"em\">&mdash;</span>a bit of state stored on the instance<span class=\"em\">&mdash;</span>or you could\nhit a method defined on the instance&rsquo;s class.</p>\n<p>The method is looked up using this:</p>\n<aside name=\"shadow\">\n<p>Looking for a field first implies that fields shadow methods, a subtle but\nimportant semantic point.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxClass.java</em><br>\nadd after <em>LoxClass</em>()</div>\n<pre>  <span class=\"t\">LoxFunction</span> <span class=\"i\">findMethod</span>(<span class=\"t\">String</span> <span class=\"i\">name</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">methods</span>.<span class=\"i\">containsKey</span>(<span class=\"i\">name</span>)) {\n      <span class=\"k\">return</span> <span class=\"i\">methods</span>.<span class=\"i\">get</span>(<span class=\"i\">name</span>);\n    }\n\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxClass.java</em>, add after <em>LoxClass</em>()</div>\n\n<p>You can probably guess this method is going to get more interesting later. For\nnow, a simple map lookup on the class&rsquo;s method table is enough to get us\nstarted. Give it a try:</p>\n<p><span name=\"crunch\"></span></p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Bacon</span> {\n  <span class=\"i\">eat</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Crunch crunch crunch!&quot;</span>;\n  }\n}\n\n<span class=\"t\">Bacon</span>().<span class=\"i\">eat</span>(); <span class=\"c\">// Prints &quot;Crunch crunch crunch!&quot;.</span>\n</pre></div>\n<aside name=\"crunch\">\n<p>Apologies if you prefer chewy bacon over crunchy. 
Feel free to adjust the script\nto your taste.</p>\n</aside>\n<h2><a href=\"#this\" id=\"this\"><small>12&#8202;.&#8202;6</small>This</a></h2>\n<p>We can define both behavior and state on objects, but they aren&rsquo;t tied together\nyet. Inside a method, we have no way to access the fields of the &ldquo;current&rdquo;\nobject<span class=\"em\">&mdash;</span>the instance that the method was called on<span class=\"em\">&mdash;</span>nor can we call other\nmethods on that same object.</p>\n<p>To get at that instance, it needs a <span name=\"i\">name</span>. Smalltalk,\nRuby, and Swift use &ldquo;self&rdquo;. Simula, C++, Java, and others use &ldquo;this&rdquo;. Python\nuses &ldquo;self&rdquo; by convention, but you can technically call it whatever you like.</p>\n<aside name=\"i\">\n<p>&ldquo;I&rdquo; would have been a great choice, but using &ldquo;i&rdquo; for loop variables predates\nOOP and goes all the way back to Fortran. We are victims of the incidental\nchoices of our forebears.</p>\n</aside>\n<p>For Lox, since we generally hew to Java-ish style, we&rsquo;ll go with &ldquo;this&rdquo;. Inside\na method body, a <code>this</code> expression evaluates to the instance that the method was\ncalled on. Or, more specifically, since methods are accessed and then invoked as\ntwo steps, it will refer to the object that the method was <em>accessed</em> from.</p>\n<p>That makes our job harder. Peep at:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Egotist</span> {\n  <span class=\"i\">speak</span>() {\n    <span class=\"k\">print</span> <span class=\"k\">this</span>;\n  }\n}\n\n<span class=\"k\">var</span> <span class=\"i\">method</span> = <span class=\"t\">Egotist</span>().<span class=\"i\">speak</span>;\n<span class=\"i\">method</span>();\n</pre></div>\n<p>On the second-to-last line, we grab a reference to the <code>speak()</code> method off an\ninstance of the class. 
That returns a function, and that function needs to\nremember the instance it was pulled off of so that <em>later</em>, on the last line, it\ncan still find it when the function is called.</p>\n<p>We need to take <code>this</code> at the point that the method is accessed and attach it to\nthe function somehow so that it stays around as long as we need it to. Hmm<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>a\nway to store some extra data that hangs around a function, eh? That sounds an\nawful lot like a <em>closure</em>, doesn&rsquo;t it?</p>\n<p>If we defined <code>this</code> as a sort of hidden variable in an environment that\nsurrounds the function returned when looking up a method, then uses of <code>this</code> in\nthe body would be able to find it later. LoxFunction already has the ability to\nhold on to a surrounding environment, so we have the machinery we need.</p>\n<p>Let&rsquo;s walk through an example to see how it works:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Cake</span> {\n  <span class=\"i\">taste</span>() {\n    <span class=\"k\">var</span> <span class=\"i\">adjective</span> = <span class=\"s\">&quot;delicious&quot;</span>;\n    <span class=\"k\">print</span> <span class=\"s\">&quot;The &quot;</span> + <span class=\"k\">this</span>.<span class=\"i\">flavor</span> + <span class=\"s\">&quot; cake is &quot;</span> + <span class=\"i\">adjective</span> + <span class=\"s\">&quot;!&quot;</span>;\n  }\n}\n\n<span class=\"k\">var</span> <span class=\"i\">cake</span> = <span class=\"t\">Cake</span>();\n<span class=\"i\">cake</span>.<span class=\"i\">flavor</span> = <span class=\"s\">&quot;German chocolate&quot;</span>;\n<span class=\"i\">cake</span>.<span class=\"i\">taste</span>(); <span class=\"c\">// Prints &quot;The German chocolate cake is delicious!&quot;.</span>\n</pre></div>\n<p>When we first evaluate the class definition, we create a LoxFunction for\n<code>taste()</code>. 
Its closure is the environment surrounding the class, in this case\nthe global one. So the LoxFunction we store in the class&rsquo;s method map looks\nlike so:</p><img src=\"image/classes/closure.png\" alt=\"The initial closure for the method.\" />\n<p>When we evaluate the <code>cake.taste</code> get expression, we create a new environment\nthat binds <code>this</code> to the object the method is accessed from (here, <code>cake</code>). Then\nwe make a <em>new</em> LoxFunction with the same code as the original one but using\nthat new environment as its closure.</p><img src=\"image/classes/bound-method.png\" alt=\"The new closure that binds 'this'.\" />\n<p>This is the LoxFunction that gets returned when evaluating the get expression\nfor the method name. When that function is later called by a <code>()</code> expression,\nwe create an environment for the method body as usual.</p><img src=\"image/classes/call.png\" alt=\"Calling the bound method and creating a new environment for the method body.\" />\n<p>The parent of the body environment is the environment we created earlier to bind\n<code>this</code> to the current object. 
Thus any use of <code>this</code> inside the body\nsuccessfully resolves to that instance.</p>\n<p>Reusing our environment code for implementing <code>this</code> also takes care of\ninteresting cases where methods and functions interact, like:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Thing</span> {\n  <span class=\"i\">getCallback</span>() {\n    <span class=\"k\">fun</span> <span class=\"i\">localFunction</span>() {\n      <span class=\"k\">print</span> <span class=\"k\">this</span>;\n    }\n\n    <span class=\"k\">return</span> <span class=\"i\">localFunction</span>;\n  }\n}\n\n<span class=\"k\">var</span> <span class=\"i\">callback</span> = <span class=\"t\">Thing</span>().<span class=\"i\">getCallback</span>();\n<span class=\"i\">callback</span>();\n</pre></div>\n<p>In, say, JavaScript, it&rsquo;s common to return a callback from inside a method. That\ncallback may want to hang on to and retain access to the original object<span class=\"em\">&mdash;</span>the\n<code>this</code> value<span class=\"em\">&mdash;</span>that the method was associated with. Our existing support for\nclosures and environment chains should do all this correctly.</p>\n<p>Let&rsquo;s code it up. 
The first step is adding <span name=\"this-ast\">new\nsyntax</span> for <code>this</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Set      : Expr object, Token name, Expr value&quot;,\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;This     : Token keyword&quot;</span>,\n</pre><pre class=\"insert-after\">      &quot;Unary    : Token operator, Expr right&quot;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<aside name=\"this-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#this-expression\">Appendix II</a>.</p>\n</aside>\n<p>Parsing is simple since it&rsquo;s a single token which our lexer already\nrecognizes as a reserved word.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return new Expr.Literal(previous().literal);\n    }\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>primary</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">THIS</span>)) <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">This</span>(<span class=\"i\">previous</span>());\n</pre><pre class=\"insert-after\">\n\n    if (match(IDENTIFIER)) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>primary</em>()</div>\n\n<p>You can start to see how <code>this</code> works like a variable when we get to the\nresolver.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitSetExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitThisExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">This</span> <span class=\"i\">expr</span>) {\n  
  <span class=\"i\">resolveLocal</span>(<span class=\"i\">expr</span>, <span class=\"i\">expr</span>.<span class=\"i\">keyword</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitSetExpr</em>()</div>\n\n<p>We resolve it exactly like any other local variable using &ldquo;this&rdquo; as the name for\nthe &ldquo;variable&rdquo;. Of course, that&rsquo;s not going to work right now, because &ldquo;this&rdquo;\n<em>isn&rsquo;t</em> declared in any scope. Let&rsquo;s fix that over in <code>visitClassStmt()</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    define(stmt.name);\n\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">beginScope</span>();\n    <span class=\"i\">scopes</span>.<span class=\"i\">peek</span>().<span class=\"i\">put</span>(<span class=\"s\">&quot;this&quot;</span>, <span class=\"k\">true</span>);\n\n</pre><pre class=\"insert-after\">    for (Stmt.Function method : stmt.methods) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitClassStmt</em>()</div>\n\n<p>Before we step in and start resolving the method bodies, we push a new scope and\ndefine &ldquo;this&rdquo; in it as if it were a variable. 
Then, when we&rsquo;re done, we discard\nthat surrounding scope.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    }\n\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">endScope</span>();\n\n</pre><pre class=\"insert-after\">    return null;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitClassStmt</em>()</div>\n\n<p>Now, whenever a <code>this</code> expression is encountered (at least inside a method) it\nwill resolve to a &ldquo;local variable&rdquo; defined in an implicit scope just outside of\nthe block for the method body.</p>\n<p>The resolver has a new <em>scope</em> for <code>this</code>, so the interpreter needs to create a\ncorresponding <em>environment</em> for it. Remember, we always have to keep the\nresolver&rsquo;s scope chains and the interpreter&rsquo;s linked environments in sync with\neach other. At runtime, we create the environment after we find the method on\nthe instance. We replace the previous line of code that simply returned the\nmethod&rsquo;s LoxFunction with this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    LoxFunction method = klass.findMethod(name.lexeme);\n</pre><div class=\"source-file\"><em>lox/LoxInstance.java</em><br>\nin <em>get</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">method</span> != <span class=\"k\">null</span>) <span class=\"k\">return</span> <span class=\"i\">method</span>.<span class=\"i\">bind</span>(<span class=\"k\">this</span>);\n</pre><pre class=\"insert-after\">\n\n    throw new RuntimeError(name,<span name=\"hidden\"> </span>\n        &quot;Undefined property '&quot; + name.lexeme + &quot;'.&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxInstance.java</em>, in <em>get</em>(), replace 1 line</div>\n\n<p>Note the new call to <code>bind()</code>. 
That looks like so:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxFunction.java</em><br>\nadd after <em>LoxFunction</em>()</div>\n<pre>  <span class=\"t\">LoxFunction</span> <span class=\"i\">bind</span>(<span class=\"t\">LoxInstance</span> <span class=\"i\">instance</span>) {\n    <span class=\"t\">Environment</span> <span class=\"i\">environment</span> = <span class=\"k\">new</span> <span class=\"t\">Environment</span>(<span class=\"i\">closure</span>);\n    <span class=\"i\">environment</span>.<span class=\"i\">define</span>(<span class=\"s\">&quot;this&quot;</span>, <span class=\"i\">instance</span>);\n    <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">LoxFunction</span>(<span class=\"i\">declaration</span>, <span class=\"i\">environment</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxFunction.java</em>, add after <em>LoxFunction</em>()</div>\n\n<p>There isn&rsquo;t much to it. We create a new environment nestled inside the method&rsquo;s\noriginal closure. Sort of a closure-within-a-closure. When the method is called,\nthat will become the parent of the method body&rsquo;s environment.</p>\n<p>We declare &ldquo;this&rdquo; as a variable in that environment and bind it to the given\ninstance, the instance that the method is being accessed from. <em>Et voilà</em>, the\nreturned LoxFunction now carries around its own little persistent world where\n&ldquo;this&rdquo; is bound to the object.</p>\n<p>The remaining task is interpreting those <code>this</code> expressions. 
As in the\nresolver, it is the same as interpreting a variable expression.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitSetExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">visitThisExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">This</span> <span class=\"i\">expr</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">lookUpVariable</span>(<span class=\"i\">expr</span>.<span class=\"i\">keyword</span>, <span class=\"i\">expr</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitSetExpr</em>()</div>\n\n<p>Go ahead and give it a try using that cake example from earlier. With fewer than\ntwenty lines of code, our interpreter handles <code>this</code> inside methods even in all\nof the weird ways it can interact with nested classes, functions inside methods,\nhandles to methods, etc.</p>\n<h3><a href=\"#invalid-uses-of-this\" id=\"invalid-uses-of-this\"><small>12&#8202;.&#8202;6&#8202;.&#8202;1</small>Invalid uses of this</a></h3>\n<p>Wait a minute. What happens if you try to use <code>this</code> <em>outside</em> of a method? What\nabout:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> <span class=\"k\">this</span>;\n</pre></div>\n<p>Or:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">notAMethod</span>() {\n  <span class=\"k\">print</span> <span class=\"k\">this</span>;\n}\n</pre></div>\n<p>There is no instance for <code>this</code> to point to if you&rsquo;re not in a method. We could\ngive it some default value like <code>nil</code> or make it a runtime error, but the user\nhas clearly made a mistake. The sooner they find and fix that mistake, the\nhappier they&rsquo;ll be.</p>\n<p>Our resolution pass is a fine place to detect this error statically. 
It already\ndetects <code>return</code> statements outside of functions. We&rsquo;ll do something similar for\n<code>this</code>. In the vein of our existing FunctionType enum, we define a new ClassType\none.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  }\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after enum <em>FunctionType</em></div>\n<pre class=\"insert\">\n\n  <span class=\"k\">private</span> <span class=\"k\">enum</span> <span class=\"t\">ClassType</span> {\n    <span class=\"i\">NONE</span>,\n    <span class=\"i\">CLASS</span>\n  }\n\n  <span class=\"k\">private</span> <span class=\"t\">ClassType</span> <span class=\"i\">currentClass</span> = <span class=\"t\">ClassType</span>.<span class=\"i\">NONE</span>;\n\n</pre><pre class=\"insert-after\">  void resolve(List&lt;Stmt&gt; statements) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after enum <em>FunctionType</em></div>\n\n<p>Yes, it could be a Boolean. When we get to inheritance, it will get a third\nvalue, hence the enum right now. We also add a corresponding field,\n<code>currentClass</code>. Its value tells us if we are currently inside a class\ndeclaration while traversing the syntax tree. 
It starts out <code>NONE</code> which means\nwe aren&rsquo;t in one.</p>\n<p>When we begin to resolve a class declaration, we change that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  public Void visitClassStmt(Stmt.Class stmt) {\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">    <span class=\"t\">ClassType</span> <span class=\"i\">enclosingClass</span> = <span class=\"i\">currentClass</span>;\n    <span class=\"i\">currentClass</span> = <span class=\"t\">ClassType</span>.<span class=\"i\">CLASS</span>;\n\n</pre><pre class=\"insert-after\">    declare(stmt.name);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitClassStmt</em>()</div>\n\n<p>As with <code>currentFunction</code>, we store the previous value of the field in a local\nvariable. This lets us piggyback onto the JVM to keep a stack of <code>currentClass</code>\nvalues. That way we don&rsquo;t lose track of the previous value if one class nests\ninside another.</p>\n<p>Once the methods have been resolved, we &ldquo;pop&rdquo; that stack by restoring the old\nvalue.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    endScope();\n\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">currentClass</span> = <span class=\"i\">enclosingClass</span>;\n</pre><pre class=\"insert-after\">    return null;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitClassStmt</em>()</div>\n\n<p>When we resolve a <code>this</code> expression, the <code>currentClass</code> field gives us the bit\nof data we need to report an error if the expression doesn&rsquo;t occur nestled\ninside a method body.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  public Void visitThisExpr(Expr.This expr) {\n</pre><div 
class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitThisExpr</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">currentClass</span> == <span class=\"t\">ClassType</span>.<span class=\"i\">NONE</span>) {\n      <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">expr</span>.<span class=\"i\">keyword</span>,\n          <span class=\"s\">&quot;Can&#39;t use &#39;this&#39; outside of a class.&quot;</span>);\n      <span class=\"k\">return</span> <span class=\"k\">null</span>;\n    }\n\n</pre><pre class=\"insert-after\">    resolveLocal(expr, expr.keyword);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitThisExpr</em>()</div>\n\n<p>That should help users use <code>this</code> correctly, and it saves us from having to\nhandle misuse at runtime in the interpreter.</p>\n<h2><a href=\"#constructors-and-initializers\" id=\"constructors-and-initializers\"><small>12&#8202;.&#8202;7</small>Constructors and Initializers</a></h2>\n<p>We can do almost everything with classes now, and as we near the end of the\nchapter we find ourselves strangely focused on a beginning. Methods and fields\nlet us encapsulate state and behavior together so that an object always <em>stays</em>\nin a valid configuration. But how do we ensure a brand new object <em>starts</em> in a\ngood state?</p>\n<p>For that, we need constructors. I find them one of the trickiest parts of a\nlanguage to design, and if you peer closely at most other languages, you&rsquo;ll see\n<span name=\"cracks\">cracks</span> around object construction where the seams of\nthe design don&rsquo;t quite fit together perfectly. Maybe there&rsquo;s something\nintrinsically messy about the moment of birth.</p>\n<aside name=\"cracks\">\n<p>A few examples: In Java, even though final fields must be initialized, it is\nstill possible to read one <em>before</em> it has been. 
Exceptions<span class=\"em\">&mdash;</span>a huge, complex\nfeature<span class=\"em\">&mdash;</span>were added to C++ mainly as a way to emit errors from constructors.</p>\n</aside>\n<p>&ldquo;Constructing&rdquo; an object is actually a pair of operations:</p>\n<ol>\n<li>\n<p>The runtime <span name=\"allocate\"><em>allocates</em></span> the memory required for\na fresh instance. In most languages, this operation is at a fundamental\nlevel beneath what user code is able to access.</p>\n<aside name=\"allocate\">\n<p>C++&rsquo;s &ldquo;<a href=\"https://en.wikipedia.org/wiki/Placement_syntax\">placement new</a>&rdquo; is a rare example where the bowels of allocation\nare laid bare for the programmer to prod.</p>\n</aside></li>\n<li>\n<p>Then, a user-provided chunk of code is called which <em>initializes</em> the\nunformed object.</p>\n</li>\n</ol>\n<p>The latter is what we tend to think of when we hear &ldquo;constructor&rdquo;, but the\nlanguage itself has usually done some groundwork for us before we get to that\npoint. In fact, our Lox interpreter already has that covered when it creates a\nnew LoxInstance object.</p>\n<p>We&rsquo;ll do the remaining part<span class=\"em\">&mdash;</span>user-defined initialization<span class=\"em\">&mdash;</span>now. Languages\nhave a variety of notations for the chunk of code that sets up a new object for\na class. C++, Java, and C# use a method whose name matches the class name. Ruby\nand Python use a fixed name, <code>initialize</code> and <code>__init__()</code> respectively, or roughly <code>init()</code>. 
The latter is nice and short, so we&rsquo;ll do that.</p>\n<p>In LoxClass&rsquo;s implementation of LoxCallable, we add a few more lines.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">                     List&lt;Object&gt; arguments) {\n    LoxInstance instance = new LoxInstance(this);\n</pre><div class=\"source-file\"><em>lox/LoxClass.java</em><br>\nin <em>call</em>()</div>\n<pre class=\"insert\">    <span class=\"t\">LoxFunction</span> <span class=\"i\">initializer</span> = <span class=\"i\">findMethod</span>(<span class=\"s\">&quot;init&quot;</span>);\n    <span class=\"k\">if</span> (<span class=\"i\">initializer</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">initializer</span>.<span class=\"i\">bind</span>(<span class=\"i\">instance</span>).<span class=\"i\">call</span>(<span class=\"i\">interpreter</span>, <span class=\"i\">arguments</span>);\n    }\n\n</pre><pre class=\"insert-after\">    return instance;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxClass.java</em>, in <em>call</em>()</div>\n\n<p>When a class is called, after the LoxInstance is created, we look for an &ldquo;init&rdquo;\nmethod. If we find one, we immediately bind and invoke it just like a normal\nmethod call. 
The argument list is forwarded along.</p>\n<p>That argument list means we also need to tweak how a class declares its arity.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  public int arity() {\n</pre><div class=\"source-file\"><em>lox/LoxClass.java</em><br>\nin <em>arity</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"t\">LoxFunction</span> <span class=\"i\">initializer</span> = <span class=\"i\">findMethod</span>(<span class=\"s\">&quot;init&quot;</span>);\n    <span class=\"k\">if</span> (<span class=\"i\">initializer</span> == <span class=\"k\">null</span>) <span class=\"k\">return</span> <span class=\"n\">0</span>;\n    <span class=\"k\">return</span> <span class=\"i\">initializer</span>.<span class=\"i\">arity</span>();\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxClass.java</em>, in <em>arity</em>(), replace 1 line</div>\n\n<p>If there is an initializer, that method&rsquo;s arity determines how many arguments\nyou must pass when you call the class itself. We don&rsquo;t <em>require</em> a class to\ndefine an initializer, though, as a convenience. If you don&rsquo;t have an\ninitializer, the arity is still zero.</p>\n<p>That&rsquo;s basically it. Since we bind the <code>init()</code> method before we call it, it has\naccess to <code>this</code> inside its body. That, along with the arguments passed to the\nclass, is all you need to be able to set up the new instance however you\ndesire.</p>\n<h3><a href=\"#invoking-init-directly\" id=\"invoking-init-directly\"><small>12&#8202;.&#8202;7&#8202;.&#8202;1</small>Invoking init() directly</a></h3>\n<p>As usual, exploring this new semantic territory rustles up a few weird\ncreatures. 
Consider:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Foo</span> {\n  <span class=\"i\">init</span>() {\n    <span class=\"k\">print</span> <span class=\"k\">this</span>;\n  }\n}\n\n<span class=\"k\">var</span> <span class=\"i\">foo</span> = <span class=\"t\">Foo</span>();\n<span class=\"k\">print</span> <span class=\"i\">foo</span>.<span class=\"i\">init</span>();\n</pre></div>\n<p>Can you &ldquo;re-initialize&rdquo; an object by directly calling its <code>init()</code> method? If\nyou do, what does it return? A <span name=\"compromise\">reasonable</span> answer\nwould be <code>nil</code> since that&rsquo;s what it appears the body returns.</p>\n<p>However<span class=\"em\">&mdash;</span>and I generally dislike compromising to satisfy the\nimplementation<span class=\"em\">&mdash;</span>it will make clox&rsquo;s implementation of constructors much\neasier if we say that <code>init()</code> methods always return <code>this</code>, even when\ndirectly called. In order to keep jlox compatible with that, we add a little\nspecial case code in LoxFunction.</p>\n<aside name=\"compromise\">\n<p>Maybe &ldquo;dislike&rdquo; is too strong a claim. It&rsquo;s reasonable to have the constraints\nand resources of your implementation affect the design of the language. There\nare only so many hours in the day, and if a cut corner here or there lets you get\nmore features to users in less time, it may very well be a net win for their\nhappiness and productivity. 
The trick is figuring out <em>which</em> corners to cut\nthat won&rsquo;t cause your users and future self to curse your shortsightedness.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return returnValue.value;\n    }\n</pre><div class=\"source-file\"><em>lox/LoxFunction.java</em><br>\nin <em>call</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">if</span> (<span class=\"i\">isInitializer</span>) <span class=\"k\">return</span> <span class=\"i\">closure</span>.<span class=\"i\">getAt</span>(<span class=\"n\">0</span>, <span class=\"s\">&quot;this&quot;</span>);\n</pre><pre class=\"insert-after\">    return null;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxFunction.java</em>, in <em>call</em>()</div>\n\n<p>If the function is an initializer, we override the actual return value and\nforcibly return <code>this</code>. That relies on a new <code>isInitializer</code> field.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  private final Environment closure;\n\n</pre><div class=\"source-file\"><em>lox/LoxFunction.java</em><br>\nin class <em>LoxFunction</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"k\">private</span> <span class=\"k\">final</span> <span class=\"t\">boolean</span> <span class=\"i\">isInitializer</span>;\n\n  <span class=\"t\">LoxFunction</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Function</span> <span class=\"i\">declaration</span>, <span class=\"t\">Environment</span> <span class=\"i\">closure</span>,\n              <span class=\"t\">boolean</span> <span class=\"i\">isInitializer</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">isInitializer</span> = <span class=\"i\">isInitializer</span>;\n</pre><pre class=\"insert-after\">    this.closure = closure;\n    this.declaration = declaration;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxFunction.java</em>, in class <em>LoxFunction</em>, replace 1 line</div>\n\n<p>We can&rsquo;t 
simply see if the name of the LoxFunction is &ldquo;init&rdquo; because the user\ncould have defined a <em>function</em> with that name. In that case, there <em>is</em> no\n<code>this</code> to return. To avoid <em>that</em> weird edge case, we&rsquo;ll directly store whether\nthe LoxFunction represents an initializer method. That means we need to go back\nand fix the few places where we create LoxFunctions.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  public Void visitFunctionStmt(Stmt.Function stmt) {\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitFunctionStmt</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"t\">LoxFunction</span> <span class=\"i\">function</span> = <span class=\"k\">new</span> <span class=\"t\">LoxFunction</span>(<span class=\"i\">stmt</span>, <span class=\"i\">environment</span>,\n                                           <span class=\"k\">false</span>);\n</pre><pre class=\"insert-after\">    environment.define(stmt.name.lexeme, function);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitFunctionStmt</em>(), replace 1 line</div>\n\n<p>For actual function declarations, <code>isInitializer</code> is always false. 
For methods,\nwe check the name.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    for (Stmt.Function method : stmt.methods) {\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitClassStmt</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">      <span class=\"t\">LoxFunction</span> <span class=\"i\">function</span> = <span class=\"k\">new</span> <span class=\"t\">LoxFunction</span>(<span class=\"i\">method</span>, <span class=\"i\">environment</span>,\n          <span class=\"i\">method</span>.<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>.<span class=\"i\">equals</span>(<span class=\"s\">&quot;init&quot;</span>));\n</pre><pre class=\"insert-after\">      methods.put(method.name.lexeme, function);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitClassStmt</em>(), replace 1 line</div>\n\n<p>And then in <code>bind()</code> where we create the closure that binds <code>this</code> to a method,\nwe pass along the original method&rsquo;s value.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    environment.define(&quot;this&quot;, instance);\n</pre><div class=\"source-file\"><em>lox/LoxFunction.java</em><br>\nin <em>bind</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">LoxFunction</span>(<span class=\"i\">declaration</span>, <span class=\"i\">environment</span>,\n                           <span class=\"i\">isInitializer</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxFunction.java</em>, in <em>bind</em>(), replace 1 line</div>\n\n<h3><a href=\"#returning-from-init\" id=\"returning-from-init\"><small>12&#8202;.&#8202;7&#8202;.&#8202;2</small>Returning from init()</a></h3>\n<p>We aren&rsquo;t out of the woods yet. 
We&rsquo;ve been assuming that a user-written\ninitializer doesn&rsquo;t explicitly return a value because most constructors don&rsquo;t.\nWhat should happen if a user tries:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Foo</span> {\n  <span class=\"i\">init</span>() {\n    <span class=\"k\">return</span> <span class=\"s\">&quot;something else&quot;</span>;\n  }\n}\n</pre></div>\n<p>It&rsquo;s definitely not going to do what they want, so we may as well make it a\nstatic error. Back in the resolver, we add another case to FunctionType.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    FUNCTION,\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin enum <em>FunctionType</em></div>\n<pre class=\"insert\">    <span class=\"i\">INITIALIZER</span>,\n</pre><pre class=\"insert-after\">    METHOD\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in enum <em>FunctionType</em></div>\n\n<p>We use the visited method&rsquo;s name to determine if we&rsquo;re resolving an initializer\nor not.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      FunctionType declaration = FunctionType.METHOD;\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">if</span> (<span class=\"i\">method</span>.<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>.<span class=\"i\">equals</span>(<span class=\"s\">&quot;init&quot;</span>)) {\n        <span class=\"i\">declaration</span> = <span class=\"t\">FunctionType</span>.<span class=\"i\">INITIALIZER</span>;\n      }\n\n</pre><pre class=\"insert-after\">      resolveFunction(method, declaration);<span name=\"local\"> </span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitClassStmt</em>()</div>\n\n<p>When we later traverse into a <code>return</code> statement, we check that field and make\nit an error 
to return a value from inside an <code>init()</code> method.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    if (stmt.value != null) {\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitReturnStmt</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">if</span> (<span class=\"i\">currentFunction</span> == <span class=\"t\">FunctionType</span>.<span class=\"i\">INITIALIZER</span>) {\n        <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">stmt</span>.<span class=\"i\">keyword</span>,\n            <span class=\"s\">&quot;Can&#39;t return a value from an initializer.&quot;</span>);\n      }\n\n</pre><pre class=\"insert-after\">      resolve(stmt.value);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitReturnStmt</em>()</div>\n\n<p>We&rsquo;re <em>still</em> not done. We statically disallow returning a <em>value</em> from an\ninitializer, but you can still use an empty early <code>return</code>.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Foo</span> {\n  <span class=\"i\">init</span>() {\n    <span class=\"k\">return</span>;\n  }\n}\n</pre></div>\n<p>That is actually kind of useful sometimes, so we don&rsquo;t want to disallow it\nentirely. Rather, it should return <code>this</code> instead of <code>nil</code>. 
That&rsquo;s an easy fix\nover in LoxFunction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    } catch (Return returnValue) {\n</pre><div class=\"source-file\"><em>lox/LoxFunction.java</em><br>\nin <em>call</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">if</span> (<span class=\"i\">isInitializer</span>) <span class=\"k\">return</span> <span class=\"i\">closure</span>.<span class=\"i\">getAt</span>(<span class=\"n\">0</span>, <span class=\"s\">&quot;this&quot;</span>);\n\n</pre><pre class=\"insert-after\">      return returnValue.value;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxFunction.java</em>, in <em>call</em>()</div>\n\n<p>If we&rsquo;re in an initializer and execute a <code>return</code> statement, instead of\nreturning the value (which will always be <code>nil</code>), we again return <code>this</code>.</p>\n<p>Phew! That was a whole list of tasks but our reward is that our little\ninterpreter has grown an entire programming paradigm. Classes, methods, fields,\n<code>this</code>, and constructors. Our baby language is looking awfully grown-up.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>We have methods on instances, but there is no way to define &ldquo;static&rdquo; methods\nthat can be called directly on the class object itself. Add support for\nthem. 
Use a <code>class</code> keyword preceding the method to indicate a static method\nthat hangs off the class object.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Math</span> {\n  <span class=\"k\">class</span> <span class=\"i\">square</span>(<span class=\"i\">n</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">n</span> * <span class=\"i\">n</span>;\n  }\n}\n\n<span class=\"k\">print</span> <span class=\"t\">Math</span>.<span class=\"i\">square</span>(<span class=\"n\">3</span>); <span class=\"c\">// Prints &quot;9&quot;.</span>\n</pre></div>\n<p>You can solve this however you like, but the &ldquo;<a href=\"https://en.wikipedia.org/wiki/Metaclass\">metaclasses</a>&rdquo; used by\nSmalltalk and Ruby are a particularly elegant approach. <em>Hint: Make LoxClass\nextend LoxInstance and go from there.</em></p>\n</li>\n<li>\n<p>Most modern languages support &ldquo;getters&rdquo; and &ldquo;setters&rdquo;<span class=\"em\">&mdash;</span>members on a class\nthat look like field reads and writes but that actually execute user-defined\ncode. Extend Lox to support getter methods. These are declared without a\nparameter list. 
The body of the getter is executed when a property with that\nname is accessed.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Circle</span> {\n  <span class=\"i\">init</span>(<span class=\"i\">radius</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">radius</span> = <span class=\"i\">radius</span>;\n  }\n\n  <span class=\"i\">area</span> {\n    <span class=\"k\">return</span> <span class=\"n\">3.141592653</span> * <span class=\"k\">this</span>.<span class=\"i\">radius</span> * <span class=\"k\">this</span>.<span class=\"i\">radius</span>;\n  }\n}\n\n<span class=\"k\">var</span> <span class=\"i\">circle</span> = <span class=\"t\">Circle</span>(<span class=\"n\">4</span>);\n<span class=\"k\">print</span> <span class=\"i\">circle</span>.<span class=\"i\">area</span>; <span class=\"c\">// Prints roughly &quot;50.2655&quot;.</span>\n</pre></div>\n</li>\n<li>\n<p>Python and JavaScript allow you to freely access an object&rsquo;s fields from\noutside of its own methods. Ruby and Smalltalk encapsulate instance state.\nOnly methods on the class can access the raw fields, and it is up to the\nclass to decide which state is exposed. Most statically typed languages\noffer modifiers like <code>private</code> and <code>public</code> to control which parts of a\nclass are externally accessible on a per-member basis.</p>\n<p>What are the trade-offs between these approaches and why might a language\nprefer one or the other?</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: Prototypes and Power</a></h2>\n<p>In this chapter, we introduced two new runtime entities, LoxClass and\nLoxInstance. The former is where behavior for objects lives, and the latter is\nfor state. What if you could define methods right on a single object, inside\nLoxInstance? In that case, we wouldn&rsquo;t need LoxClass at all. 
LoxInstance would\nbe a complete package for defining the behavior and state of an object.</p>\n<p>We&rsquo;d still want some way, without classes, to reuse behavior across multiple\ninstances. We could let a LoxInstance <a href=\"https://en.wikipedia.org/wiki/Prototype-based_programming#Delegation\"><em>delegate</em></a> directly to another\nLoxInstance to reuse its fields and methods, sort of like inheritance.</p>\n<p>Users would model their program as a constellation of objects, some of which\ndelegate to each other to reflect commonality. Objects used as delegates\nrepresent &ldquo;canonical&rdquo; or &ldquo;prototypical&rdquo; objects that others refine. The result\nis a simpler runtime with only a single internal construct, LoxInstance.</p>\n<p>That&rsquo;s where the name <strong><a href=\"https://en.wikipedia.org/wiki/Prototype-based_programming\">prototypes</a></strong> comes from for this paradigm. It\nwas invented by David Ungar and Randall Smith in a language called <a href=\"http://www.selflanguage.org/\">Self</a>.\nThey came up with it by starting with Smalltalk and following the above mental\nexercise to see how much they could pare it down.</p>\n<p>Prototypes were an academic curiosity for a long time, a fascinating one that\ngenerated interesting research but didn&rsquo;t make a dent in the larger world of\nprogramming. That is, until Brendan Eich crammed prototypes into JavaScript,\nwhich then promptly took over the world. Many (many) <span\nname=\"words\">words</span> have been written about prototypes in JavaScript.\nWhether that shows that prototypes are brilliant or confusing<span class=\"em\">&mdash;</span>or both!<span class=\"em\">&mdash;</span>is\nan open question.</p>\n<aside name=\"words\">\n<p>Including <a href=\"http://gameprogrammingpatterns.com/prototype.html\">more than a handful</a> by yours truly.</p>\n</aside>\n<p>I won&rsquo;t get into whether or not I think prototypes are a good idea for a\nlanguage. 
I&rsquo;ve made languages that are <a href=\"http://finch.stuffwithstuff.com/\">prototypal</a> and\n<a href=\"http://wren.io/\">class-based</a>, and my opinions of both are complex. What I want to discuss\nis the role of <em>simplicity</em> in a language.</p>\n<p>Prototypes are simpler than classes<span class=\"em\">&mdash;</span>less code for the language implementer to\nwrite, and fewer concepts for the user to learn and understand. Does that make\nthem better? We language nerds have a tendency to fetishize minimalism.\nPersonally, I think simplicity is only part of the equation. What we really want\nto give the user is <em>power</em>, which I define as:</p>\n<div class=\"codehilite\"><pre>power = breadth × ease ÷ complexity\n</pre></div>\n<p>None of these are precise numeric measures. I&rsquo;m using math as an analogy here, not\nactual quantification.</p>\n<ul>\n<li>\n<p><strong>Breadth</strong> is the range of different things the language lets you express.\nC has a lot of breadth<span class=\"em\">&mdash;</span>it&rsquo;s been used for everything from operating\nsystems to user applications to games. Domain-specific languages like\nAppleScript and Matlab have less breadth.</p>\n</li>\n<li>\n<p><strong>Ease</strong> is how little effort it takes to make the language do what you\nwant. &ldquo;Usability&rdquo; might be another term, though it carries more baggage than\nI want to bring in. &ldquo;Higher-level&rdquo; languages tend to have more ease than\n&ldquo;lower-level&rdquo; ones. Most languages have a &ldquo;grain&rdquo; to them where some things\nfeel easier to express than others.</p>\n</li>\n<li>\n<p><strong>Complexity</strong> is how big the language (including its runtime, core libraries,\ntools, ecosystem, etc.) is. People talk about how many pages are in a\nlanguage&rsquo;s spec, or how many keywords it has. It&rsquo;s how much the user has to\nload into their wetware before they can be productive in the system. 
It is\nthe antonym of simplicity.</p>\n</li>\n</ul>\n<p>Reducing complexity <em>does</em> increase power. The smaller the denominator, the\nlarger the resulting value, so our intuition that simplicity is good is valid.\nHowever, when reducing complexity, we must take care not to sacrifice breadth or\nease in the process, or the total power may go down. Java would be a strictly\n<em>simpler</em> language if it removed strings, but it probably wouldn&rsquo;t handle text\nmanipulation tasks well, nor would it be as easy to get things done.</p>\n<p>The art, then, is finding <em>accidental</em> complexity that can be omitted<span class=\"em\">&mdash;</span>language features and interactions that don&rsquo;t carry their weight by increasing\nthe breadth or ease of using the language.</p>\n<p>If users want to express their program in terms of categories of objects, then\nbaking classes into the language increases the ease of doing that, hopefully by\na large enough margin to pay for the added complexity. But if that isn&rsquo;t how\nusers are using your language, then by all means leave classes out.</p>\n</div>\n\n<footer>\n<a href=\"inheritance.html\" class=\"next\">\n  Next Chapter: &ldquo;Inheritance&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/closures.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Closures &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Closures<small>25</small></a></h3>\n\n<ul>\n    <li><a href=\"#closure-objects\"><small>25.1</small> Closure Objects</a></li>\n    <li><a href=\"#upvalues\"><small>25.2</small> Upvalues</a></li>\n    <li><a href=\"#upvalue-objects\"><small>25.3</small> Upvalue Objects</a></li>\n    <li><a href=\"#closed-upvalues\"><small>25.4</small> Closed Upvalues</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a 
href=\"#design-note\"><small>note</small>Closing Over the Loop Variable</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"calls-and-functions.html\" title=\"Calls and Functions\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"garbage-collection.html\" title=\"Garbage Collection\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"calls-and-functions.html\" title=\"Calls and Functions\" class=\"prev\">←</a>\n<a href=\"garbage-collection.html\" title=\"Garbage Collection\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Closures<small>25</small></a></h3>\n\n<ul>\n    <li><a href=\"#closure-objects\"><small>25.1</small> Closure Objects</a></li>\n    <li><a href=\"#upvalues\"><small>25.2</small> Upvalues</a></li>\n    <li><a href=\"#upvalue-objects\"><small>25.3</small> Upvalue Objects</a></li>\n    <li><a href=\"#closed-upvalues\"><small>25.4</small> Closed Upvalues</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Closing Over the Loop Variable</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"calls-and-functions.html\" title=\"Calls and Functions\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"garbage-collection.html\" title=\"Garbage Collection\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article 
class=\"chapter\">\n\n  <div class=\"number\">25</div>\n  <h1>Closures</h1>\n\n<blockquote>\n<p>As the man said, for every complex problem there&rsquo;s a simple solution, and it&rsquo;s\nwrong.</p>\n<p><cite>Umberto Eco, <em>Foucault&rsquo;s Pendulum</em></cite></p>\n</blockquote>\n<p>Thanks to our diligent labor in <a href=\"calls-and-functions.html\">the last chapter</a>, we have a virtual\nmachine with working functions. What it lacks is closures. Aside from global\nvariables, which are their own breed of animal, a function has no way to\nreference a variable declared outside of its own body.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">x</span> = <span class=\"s\">&quot;global&quot;</span>;\n<span class=\"k\">fun</span> <span class=\"i\">outer</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">x</span> = <span class=\"s\">&quot;outer&quot;</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">inner</span>() {\n    <span class=\"k\">print</span> <span class=\"i\">x</span>;\n  }\n  <span class=\"i\">inner</span>();\n}\n<span class=\"i\">outer</span>();\n</pre></div>\n<p>Run this example now and it prints &ldquo;global&rdquo;. It&rsquo;s supposed to print &ldquo;outer&rdquo;. To\nfix this, we need to include the entire lexical scope of all surrounding\nfunctions when resolving a variable.</p>\n<p>This problem is harder in clox than it was in jlox because our bytecode VM\nstores locals on a stack. 
We used a stack because I claimed locals have stack\nsemantics<span class=\"em\">&mdash;</span>variables are discarded in the reverse order in which they are created.\nBut with closures, that&rsquo;s only <em>mostly</em> true.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">makeClosure</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">local</span> = <span class=\"s\">&quot;local&quot;</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">closure</span>() {\n    <span class=\"k\">print</span> <span class=\"i\">local</span>;\n  }\n  <span class=\"k\">return</span> <span class=\"i\">closure</span>;\n}\n\n<span class=\"k\">var</span> <span class=\"i\">closure</span> = <span class=\"i\">makeClosure</span>();\n<span class=\"i\">closure</span>();\n</pre></div>\n<p>The outer function <code>makeClosure()</code> declares a variable, <code>local</code>. It also creates\nan inner function, <code>closure()</code>, that captures that variable. Then <code>makeClosure()</code>\nreturns a reference to that function. Since the closure <span\nname=\"flying\">escapes</span> while holding on to the local variable, <code>local</code> must\noutlive the function call where it was created.</p>\n<aside name=\"flying\"><img src=\"image/closures/flying.png\" class=\"above\" alt=\"A local variable flying away from the stack.\"/>\n<p>Oh no, it&rsquo;s escaping!</p>\n</aside>\n<p>We could solve this problem by dynamically allocating memory for all local\nvariables. That&rsquo;s what jlox does by putting everything in those Environment\nobjects that float around in Java&rsquo;s heap. But we don&rsquo;t want to. Using a <span\nname=\"stack\">stack</span> is <em>really</em> fast. Most local variables are <em>not</em>\ncaptured by closures and do have stack semantics. 
It would suck to make all of\nthose slower for the benefit of the rare local that is captured.</p>\n<aside name=\"stack\">\n<p>There is a reason that C and Java use the stack for their local variables, after\nall.</p>\n</aside>\n<p>This means a more complex approach than we used in our Java interpreter. Because\nsome locals have very different lifetimes, we will have two implementation\nstrategies. For locals that aren&rsquo;t used in closures, we&rsquo;ll keep them just as\nthey are on the stack. When a local is captured by a closure, we&rsquo;ll adopt\nanother solution that lifts it onto the heap where it can live as long as\nneeded.</p>\n<p>Closures have been around since the early Lisp days when bytes of memory and CPU\ncycles were more precious than emeralds. Over the intervening decades, hackers\ndevised all <span name=\"lambda\">manner</span> of ways to compile closures to\noptimized runtime representations. Some are more efficient but require a more\ncomplex compilation process than we could easily retrofit into clox.</p>\n<aside name=\"lambda\">\n<p>Search for &ldquo;closure conversion&rdquo; or &ldquo;lambda lifting&rdquo; to start exploring.</p>\n</aside>\n<p>The technique I explain here comes from the design of the Lua VM. It is fast,\nparsimonious with memory, and implemented with relatively little code. Even more\nimpressive, it fits naturally into the single-pass compilers clox and Lua both\nuse. It is somewhat intricate, though. It might take a while before all the\npieces click together in your mind. We&rsquo;ll build them one step at a time, and\nI&rsquo;ll try to introduce the concepts in stages.</p>\n<h2><a href=\"#closure-objects\" id=\"closure-objects\"><small>25&#8202;.&#8202;1</small>Closure Objects</a></h2>\n<p>Our VM represents functions at runtime using ObjFunction. These objects are\ncreated by the front end during compilation. At runtime, all the VM does is load\nthe function object from a constant table and bind it to a name. 
There is no\noperation to &ldquo;create&rdquo; a function at runtime. Much like string and number <span\nname=\"literal\">literals</span>, they are constants instantiated purely at\ncompile time.</p>\n<aside name=\"literal\">\n<p>In other words, a function declaration in Lox <em>is</em> a kind of literal<span class=\"em\">&mdash;</span>a piece\nof syntax that defines a constant value of a built-in type.</p>\n</aside>\n<p>That made sense because all of the data that composes a function is known at\ncompile time: the chunk of bytecode compiled from the function&rsquo;s body, and the\nconstants used in the body. Once we introduce closures, though, that\nrepresentation is no longer sufficient. Take a gander at:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">makeClosure</span>(<span class=\"i\">value</span>) {\n  <span class=\"k\">fun</span> <span class=\"i\">closure</span>() {\n    <span class=\"k\">print</span> <span class=\"i\">value</span>;\n  }\n  <span class=\"k\">return</span> <span class=\"i\">closure</span>;\n}\n\n<span class=\"k\">var</span> <span class=\"i\">doughnut</span> = <span class=\"i\">makeClosure</span>(<span class=\"s\">&quot;doughnut&quot;</span>);\n<span class=\"k\">var</span> <span class=\"i\">bagel</span> = <span class=\"i\">makeClosure</span>(<span class=\"s\">&quot;bagel&quot;</span>);\n<span class=\"i\">doughnut</span>();\n<span class=\"i\">bagel</span>();\n</pre></div>\n<p>The <code>makeClosure()</code> function defines and returns a function. We call it twice\nand get two closures back. They are created by the same nested function\ndeclaration, <code>closure</code>, but close over different values. When we call the two\nclosures, each prints a different string. 
That implies we need some runtime\nrepresentation for a closure that captures the local variables surrounding the\nfunction as they exist when the function declaration is <em>executed</em>, not just\nwhen it is compiled.</p>\n<p>We&rsquo;ll work our way up to capturing variables, but a good first step is defining\nthat object representation. Our existing ObjFunction type represents the <span\nname=\"raw\">&ldquo;raw&rdquo;</span> compile-time state of a function declaration, since all\nclosures created from a single declaration share the same code and constants. At\nruntime, when we execute a function declaration, we wrap the ObjFunction in a\nnew ObjClosure structure. The latter has a reference to the underlying bare\nfunction along with runtime state for the variables the function closes over.</p>\n<aside name=\"raw\">\n<p>The Lua implementation refers to the raw function object containing the bytecode\nas a &ldquo;prototype&rdquo;, which is a great word to describe this, except that word also\ngets overloaded to refer to <a href=\"https://en.wikipedia.org/wiki/Prototype-based_programming\">prototypal inheritance</a>.</p>\n</aside><img src=\"image/closures/obj-closure.png\" alt=\"An ObjClosure with a reference to an ObjFunction.\"/>\n<p>We&rsquo;ll wrap every function in an ObjClosure, even if the function doesn&rsquo;t\nactually close over and capture any surrounding local variables. This is a\nlittle wasteful, but it simplifies the VM because we can always assume that the\nfunction we&rsquo;re calling is an ObjClosure. 
That new struct starts out like this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>ObjString</em></div>\n<pre><span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">Obj</span> <span class=\"i\">obj</span>;\n  <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span>;\n} <span class=\"t\">ObjClosure</span>;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>ObjString</em></div>\n\n<p>Right now, it simply points to an ObjFunction and adds the necessary object\nheader stuff. Grinding through the usual ceremony for adding a new object type\nto clox, we declare a C function to create a new closure.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} ObjClosure;\n\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>ObjClosure</em></div>\n<pre class=\"insert\"><span class=\"t\">ObjClosure</span>* <span class=\"i\">newClosure</span>(<span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span>);\n</pre><pre class=\"insert-after\">ObjFunction* newFunction();\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>ObjClosure</em></div>\n\n<p>Then we implement it here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.c</em><br>\nadd after <em>allocateObject</em>()</div>\n<pre><span class=\"t\">ObjClosure</span>* <span class=\"i\">newClosure</span>(<span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span>) {\n  <span class=\"t\">ObjClosure</span>* <span class=\"i\">closure</span> = <span class=\"a\">ALLOCATE_OBJ</span>(<span class=\"t\">ObjClosure</span>, <span class=\"a\">OBJ_CLOSURE</span>);\n  <span class=\"i\">closure</span>-&gt;<span class=\"i\">function</span> = <span class=\"i\">function</span>;\n  <span class=\"k\">return</span> <span class=\"i\">closure</span>;\n}\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>object.c</em>, add after <em>allocateObject</em>()</div>\n\n<p>It takes a pointer to the ObjFunction it wraps. It also initializes the type\nfield to a new type.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef enum {\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin enum <em>ObjType</em></div>\n<pre class=\"insert\">  <span class=\"a\">OBJ_CLOSURE</span>,\n</pre><pre class=\"insert-after\">  OBJ_FUNCTION,\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in enum <em>ObjType</em></div>\n\n<p>And when we&rsquo;re done with a closure, we release its memory.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (object-&gt;type) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>freeObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_CLOSURE</span>: {\n      <span class=\"a\">FREE</span>(<span class=\"t\">ObjClosure</span>, <span class=\"i\">object</span>);\n      <span class=\"k\">break</span>;\n    }\n</pre><pre class=\"insert-after\">    case OBJ_FUNCTION: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>freeObject</em>()</div>\n\n<p>We free only the ObjClosure itself, not the ObjFunction. That&rsquo;s because the\nclosure doesn&rsquo;t <em>own</em> the function. There may be multiple closures that all\nreference the same function, and none of them claims any special privilege over\nit. We can&rsquo;t free the ObjFunction until <em>all</em> objects referencing it are gone<span class=\"em\">&mdash;</span>including even the surrounding function whose constant table contains it.\nTracking that sounds tricky, and it is! 
That&rsquo;s why we&rsquo;ll write a garbage\ncollector soon to manage it for us.</p>\n<p>We also have the usual <span name=\"macro\">macros</span> for checking a value&rsquo;s\ntype.</p>\n<aside name=\"macro\">\n<p>Perhaps I should have defined a macro to make it easier to generate these\nmacros. Maybe that would be a little too meta.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define OBJ_TYPE(value)        (AS_OBJ(value)-&gt;type)\n\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define IS_CLOSURE(value)      isObjType(value, OBJ_CLOSURE)</span>\n</pre><pre class=\"insert-after\">#define IS_FUNCTION(value)     isObjType(value, OBJ_FUNCTION)\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>And to cast a value:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define IS_STRING(value)       isObjType(value, OBJ_STRING)\n\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define AS_CLOSURE(value)      ((ObjClosure*)AS_OBJ(value))</span>\n</pre><pre class=\"insert-after\">#define AS_FUNCTION(value)     ((ObjFunction*)AS_OBJ(value))\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>Closures are first-class objects, so you can print them.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (OBJ_TYPE(value)) {\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>printObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_CLOSURE</span>:\n      <span class=\"i\">printFunction</span>(<span class=\"a\">AS_CLOSURE</span>(<span class=\"i\">value</span>)-&gt;<span class=\"i\">function</span>);\n      <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    case OBJ_FUNCTION:\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>printObject</em>()</div>\n\n<p>They display 
exactly as ObjFunction does. From the user&rsquo;s perspective, the\ndifference between ObjFunction and ObjClosure is purely a hidden implementation\ndetail. With that out of the way, we have a working but empty representation for\nclosures.</p>\n<h3><a href=\"#compiling-to-closure-objects\" id=\"compiling-to-closure-objects\"><small>25&#8202;.&#8202;1&#8202;.&#8202;1</small>Compiling to closure objects</a></h3>\n<p>We have closure objects, but our VM never creates them. The next step is getting\nthe compiler to emit instructions to tell the runtime when to create a new\nObjClosure to wrap a given ObjFunction. This happens right at the end of a\nfunction declaration.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  ObjFunction* function = endCompiler();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>function</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_CLOSURE</span>, <span class=\"i\">makeConstant</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">function</span>)));\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>function</em>(), replace 1 line</div>\n\n<p>Before, the final bytecode for a function declaration was a single <code>OP_CONSTANT</code>\ninstruction to load the compiled function from the surrounding function&rsquo;s\nconstant table and push it onto the stack. Now we have a new instruction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_CALL,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_CLOSURE</span>,\n</pre><pre class=\"insert-after\">  OP_RETURN,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>Like <code>OP_CONSTANT</code>, it takes a single operand that represents a constant table\nindex for the function. 
But when we get over to the runtime implementation, we\ndo something more interesting.</p>\n<p>First, let&rsquo;s be diligent VM hackers and slot in disassembler support for the\ninstruction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case OP_CALL:\n      return byteInstruction(&quot;OP_CALL&quot;, chunk, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_CLOSURE</span>: {\n      <span class=\"i\">offset</span>++;\n      <span class=\"t\">uint8_t</span> <span class=\"i\">constant</span> = <span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>[<span class=\"i\">offset</span>++];\n      <span class=\"i\">printf</span>(<span class=\"s\">&quot;%-16s %4d &quot;</span>, <span class=\"s\">&quot;OP_CLOSURE&quot;</span>, <span class=\"i\">constant</span>);\n      <span class=\"i\">printValue</span>(<span class=\"i\">chunk</span>-&gt;<span class=\"i\">constants</span>.<span class=\"i\">values</span>[<span class=\"i\">constant</span>]);\n      <span class=\"i\">printf</span>(<span class=\"s\">&quot;</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n      <span class=\"k\">return</span> <span class=\"i\">offset</span>;\n    }\n</pre><pre class=\"insert-after\">    case OP_RETURN:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>There&rsquo;s more going on here than we usually have in the disassembler. By the end\nof the chapter, you&rsquo;ll discover that <code>OP_CLOSURE</code> is quite an unusual\ninstruction. It&rsquo;s straightforward right now<span class=\"em\">&mdash;</span>just a single byte operand<span class=\"em\">&mdash;</span>but\nwe&rsquo;ll be adding to it. 
This code here anticipates that future.</p>\n<h3><a href=\"#interpreting-function-declarations\" id=\"interpreting-function-declarations\"><small>25&#8202;.&#8202;1&#8202;.&#8202;2</small>Interpreting function declarations</a></h3>\n<p>Most of the work we need to do is in the runtime. We have to handle the new\ninstruction, naturally. But we also need to touch every piece of code in the VM\nthat works with ObjFunction and change it to use ObjClosure instead<span class=\"em\">&mdash;</span>function\ncalls, call frames, etc. We&rsquo;ll start with the instruction, though.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_CLOSURE</span>: {\n        <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span> = <span class=\"a\">AS_FUNCTION</span>(<span class=\"a\">READ_CONSTANT</span>());\n        <span class=\"t\">ObjClosure</span>* <span class=\"i\">closure</span> = <span class=\"i\">newClosure</span>(<span class=\"i\">function</span>);\n        <span class=\"i\">push</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">closure</span>));\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_RETURN: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>Like the <code>OP_CONSTANT</code> instruction we used before, first we load the compiled\nfunction from the constant table. 
The difference now is that we wrap that\nfunction in a new ObjClosure and push the result onto the stack.</p>\n<p>Once you have a closure, you&rsquo;ll eventually want to call it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    switch (OBJ_TYPE(callee)) {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>callValue</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OBJ_CLOSURE</span>:\n        <span class=\"k\">return</span> <span class=\"i\">call</span>(<span class=\"a\">AS_CLOSURE</span>(<span class=\"i\">callee</span>), <span class=\"i\">argCount</span>);\n</pre><pre class=\"insert-after\">      case OBJ_NATIVE: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>callValue</em>(), replace 2 lines</div>\n\n<p>We remove the code for calling objects whose type is <code>OBJ_FUNCTION</code>. Since we\nwrap all functions in ObjClosures, the runtime will never try to invoke a bare\nObjFunction anymore. Those objects live only in constant tables and get\nimmediately <span name=\"naked\">wrapped</span> in closures before anything else\nsees them.</p>\n<aside name=\"naked\">\n<p>We don&rsquo;t want any naked functions wandering around the VM! What would the\nneighbors say?</p>\n</aside>\n<p>We replace the old code with very similar code for calling a closure instead.\nThe only difference is the type of object we pass to <code>call()</code>. The real changes\nare over in that function. 
First, we update its signature.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nfunction <em>call</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">call</span>(<span class=\"t\">ObjClosure</span>* <span class=\"i\">closure</span>, <span class=\"t\">int</span> <span class=\"i\">argCount</span>) {\n</pre><pre class=\"insert-after\">  if (argCount != function-&gt;arity) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, function <em>call</em>(), replace 1 line</div>\n\n<p>Then, in the body, we need to fix everything that referenced the function to\nhandle the fact that we&rsquo;ve introduced a layer of indirection. We start with the\narity checking:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static bool call(ObjClosure* closure, int argCount) {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>call</em>()<br>\nreplace 3 lines</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">argCount</span> != <span class=\"i\">closure</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">arity</span>) {\n    <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Expected %d arguments but got %d.&quot;</span>,\n        <span class=\"i\">closure</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">arity</span>, <span class=\"i\">argCount</span>);\n</pre><pre class=\"insert-after\">    return false;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>call</em>(), replace 3 lines</div>\n\n<p>The only change is that we unwrap the closure to get to the underlying function.\nThe next thing <code>call()</code> does is create a new CallFrame. 
We change that code to\nstore the closure in the CallFrame and get the bytecode pointer from the\nclosure&rsquo;s function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  CallFrame* frame = &amp;vm.frames[vm.frameCount++];\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>call</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">  <span class=\"i\">frame</span>-&gt;<span class=\"i\">closure</span> = <span class=\"i\">closure</span>;\n  <span class=\"i\">frame</span>-&gt;<span class=\"i\">ip</span> = <span class=\"i\">closure</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>.<span class=\"i\">code</span>;\n</pre><pre class=\"insert-after\">  frame-&gt;slots = vm.stackTop - argCount - 1;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>call</em>(), replace 2 lines</div>\n\n<p>This necessitates changing the declaration of CallFrame too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef struct {\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nin struct <em>CallFrame</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"t\">ObjClosure</span>* <span class=\"i\">closure</span>;\n</pre><pre class=\"insert-after\">  uint8_t* ip;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, in struct <em>CallFrame</em>, replace 1 line</div>\n\n<p>That change triggers a few other cascading changes. Every place in the VM that\naccessed CallFrame&rsquo;s function needs to use a closure instead. 
First, the macro\nfor reading a constant from the current function&rsquo;s constant table:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    (uint16_t)((frame-&gt;ip[-2] &lt;&lt; 8) | frame-&gt;ip[-1]))\n\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\"><span class=\"a\">#define READ_CONSTANT() \\</span>\n<span class=\"a\">    (frame-&gt;closure-&gt;function-&gt;chunk.constants.values[READ_BYTE()])</span>\n</pre><pre class=\"insert-after\">\n\n#define READ_STRING() AS_STRING(READ_CONSTANT())\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 2 lines</div>\n\n<p>When <code>DEBUG_TRACE_EXECUTION</code> is enabled, it needs to get to the chunk from the\nclosure.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    printf(&quot;\\n&quot;);\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">    <span class=\"i\">disassembleInstruction</span>(&amp;<span class=\"i\">frame</span>-&gt;<span class=\"i\">closure</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>,\n        (<span class=\"t\">int</span>)(<span class=\"i\">frame</span>-&gt;<span class=\"i\">ip</span> - <span class=\"i\">frame</span>-&gt;<span class=\"i\">closure</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>.<span class=\"i\">code</span>));\n</pre><pre class=\"insert-after\">#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 2 lines</div>\n\n<p>Likewise when reporting a runtime error:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    CallFrame* frame = &amp;vm.frames[i];\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>runtimeError</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span> = <span 
class=\"i\">frame</span>-&gt;<span class=\"i\">closure</span>-&gt;<span class=\"i\">function</span>;\n</pre><pre class=\"insert-after\">    size_t instruction = frame-&gt;ip - function-&gt;chunk.code - 1;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>runtimeError</em>(), replace 1 line</div>\n\n<p>Almost there. The last piece is the blob of code that sets up the very first\nCallFrame to begin executing the top-level code for a Lox script.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  push(OBJ_VAL(function));\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>interpret</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"t\">ObjClosure</span>* <span class=\"i\">closure</span> = <span class=\"i\">newClosure</span>(<span class=\"i\">function</span>);\n  <span class=\"i\">pop</span>();\n  <span class=\"i\">push</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">closure</span>));\n  <span class=\"i\">call</span>(<span class=\"i\">closure</span>, <span class=\"n\">0</span>);\n</pre><pre class=\"insert-after\">\n\n  return run();\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>interpret</em>(), replace 1 line</div>\n\n<p><span name=\"pop\">The</span> compiler still returns a raw ObjFunction when\ncompiling a script. That&rsquo;s fine, but it means we need to wrap it in an\nObjClosure here, before the VM can execute it.</p>\n<aside name=\"pop\">\n<p>The code looks a little silly because we still push the original ObjFunction\nonto the stack. Then we pop it after creating the closure, only to then push the\nclosure. Why put the ObjFunction on there at all? As usual, when you see weird\nstack stuff going on, it&rsquo;s to keep the <a href=\"garbage-collection.html\">forthcoming garbage collector</a> aware\nof some heap-allocated objects.</p>\n</aside>\n<p>We are back to a working interpreter. 
The <em>user</em> can&rsquo;t tell any difference, but\nthe compiler now generates code telling the VM to create a closure for each\nfunction declaration. Every time the VM executes a function declaration, it\nwraps the ObjFunction in a new ObjClosure. The rest of the VM now handles those\nObjClosures floating around. That&rsquo;s the boring stuff out of the way. Now we&rsquo;re\nready to make these closures actually <em>do</em> something.</p>\n<h2><a href=\"#upvalues\" id=\"upvalues\"><small>25&#8202;.&#8202;2</small>Upvalues</a></h2>\n<p>Our existing instructions for reading and writing local variables are limited to\na single function&rsquo;s stack window. Locals from a surrounding function are outside\nof the inner function&rsquo;s window. We&rsquo;re going to need some new instructions.</p>\n<p>The easiest approach might be an instruction that takes a relative stack slot\noffset that can reach <em>before</em> the current function&rsquo;s window. That would work if\nclosed-over variables were always on the stack. But as we saw earlier, these\nvariables sometimes outlive the function where they are declared. That means\nthey won&rsquo;t always be on the stack.</p>\n<p>The next easiest approach, then, would be to take any local variable that gets\nclosed over and have it always live on the heap. When the local variable\ndeclaration in the surrounding function is executed, the VM would allocate\nmemory for it dynamically. That way it could live as long as needed.</p>\n<p>This would be a fine approach if clox didn&rsquo;t have a single-pass compiler. But\nthat restriction we chose in our implementation makes things harder. 
Take a look\nat this example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">outer</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">x</span> = <span class=\"n\">1</span>;    <span class=\"c\">// (1)</span>\n  <span class=\"i\">x</span> = <span class=\"n\">2</span>;        <span class=\"c\">// (2)</span>\n  <span class=\"k\">fun</span> <span class=\"i\">inner</span>() { <span class=\"c\">// (3)</span>\n    <span class=\"k\">print</span> <span class=\"i\">x</span>;\n  }\n  <span class=\"i\">inner</span>();\n}\n</pre></div>\n<p>Here, the compiler compiles the declaration of <code>x</code> at <code>(1)</code> and emits code for\nthe assignment at <code>(2)</code>. It does that before reaching the declaration of\n<code>inner()</code> at <code>(3)</code> and discovering that <code>x</code> is in fact closed over. We don&rsquo;t\nhave an easy way to go back and fix that already-emitted code to treat <code>x</code>\nspecially. Instead, we want a solution that allows a closed-over variable to\nlive on the stack exactly like a normal local variable <em>until the point that it\nis closed over</em>.</p>\n<p>Fortunately, thanks to the Lua dev team, we have a solution. We use a level of\nindirection that they call an <strong>upvalue</strong>. An upvalue refers to a local variable\nin an enclosing function. Every closure maintains an array of upvalues, one for\neach surrounding local variable that the closure uses.</p>\n<p>The upvalue points back into the stack to where the variable it captured lives.\nWhen the closure needs to access a closed-over variable, it goes through the\ncorresponding upvalue to reach it. 
When a function declaration is first executed\nand we create a closure for it, the VM creates the array of upvalues and wires\nthem up to &ldquo;capture&rdquo; the surrounding local variables that the closure needs.</p>\n<p>For example, if we throw this program at clox,</p>\n<div class=\"codehilite\"><pre>{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">3</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">f</span>() {\n    <span class=\"k\">print</span> <span class=\"i\">a</span>;\n  }\n}\n</pre></div>\n<p>the compiler and runtime will conspire together to build up a set of objects in\nmemory like this:</p><img src=\"image/closures/open-upvalue.png\" alt=\"The object graph of the stack, ObjClosure, ObjFunction, and upvalue array.\"/>\n<p>That might look overwhelming, but fear not. We&rsquo;ll work our way through it. The\nimportant part is that upvalues serve as the layer of indirection needed to\ncontinue to find a captured local variable even after it moves off the stack.\nBut before we get to all that, let&rsquo;s focus on compiling captured variables.</p>\n<h3><a href=\"#compiling-upvalues\" id=\"compiling-upvalues\"><small>25&#8202;.&#8202;2&#8202;.&#8202;1</small>Compiling upvalues</a></h3>\n<p>As usual, we want to do as much work as possible during compilation to keep\nexecution simple and fast. Since local variables are lexically scoped in Lox, we\nhave enough knowledge at compile time to resolve which surrounding local\nvariables a function accesses and where those locals are declared. That, in\nturn, means we know <em>how many</em> upvalues a closure needs, <em>which</em> variables they\ncapture, and <em>which stack slots</em> contain those variables in the declaring\nfunction&rsquo;s stack window.</p>\n<p>Currently, when the compiler resolves an identifier, it walks the block scopes\nfor the current function from innermost to outermost. 
If we don&rsquo;t find the\nvariable in that function, we assume the variable must be a global. We don&rsquo;t\nconsider the local scopes of enclosing functions<span class=\"em\">&mdash;</span>they get skipped right over.\nThe first change, then, is inserting a resolution step for those outer local\nscopes.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  if (arg != -1) {\n    getOp = OP_GET_LOCAL;\n    setOp = OP_SET_LOCAL;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>namedVariable</em>()</div>\n<pre class=\"insert\">  } <span class=\"k\">else</span> <span class=\"k\">if</span> ((<span class=\"i\">arg</span> = <span class=\"i\">resolveUpvalue</span>(<span class=\"i\">current</span>, &amp;<span class=\"i\">name</span>)) != -<span class=\"n\">1</span>) {\n    <span class=\"i\">getOp</span> = <span class=\"a\">OP_GET_UPVALUE</span>;\n    <span class=\"i\">setOp</span> = <span class=\"a\">OP_SET_UPVALUE</span>;\n</pre><pre class=\"insert-after\">  } else {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>namedVariable</em>()</div>\n\n<p>This new <code>resolveUpvalue()</code> function looks for a local variable declared in any\nof the surrounding functions. If it finds one, it returns an &ldquo;upvalue index&rdquo; for\nthat variable. (We&rsquo;ll get into what that means later.) Otherwise, it returns -1\nto indicate the variable wasn&rsquo;t found. 
If it was found, we use these two new\ninstructions for reading or writing to the variable through its upvalue:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_SET_GLOBAL,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_GET_UPVALUE</span>,\n  <span class=\"a\">OP_SET_UPVALUE</span>,\n</pre><pre class=\"insert-after\">  OP_EQUAL,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>We&rsquo;re implementing this sort of top-down, so I&rsquo;ll show you how these work at\nruntime soon. The part to focus on now is how the compiler actually resolves the\nidentifier.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>resolveLocal</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">int</span> <span class=\"i\">resolveUpvalue</span>(<span class=\"t\">Compiler</span>* <span class=\"i\">compiler</span>, <span class=\"t\">Token</span>* <span class=\"i\">name</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">compiler</span>-&gt;<span class=\"i\">enclosing</span> == <span class=\"a\">NULL</span>) <span class=\"k\">return</span> -<span class=\"n\">1</span>;\n\n  <span class=\"t\">int</span> <span class=\"i\">local</span> = <span class=\"i\">resolveLocal</span>(<span class=\"i\">compiler</span>-&gt;<span class=\"i\">enclosing</span>, <span class=\"i\">name</span>);\n  <span class=\"k\">if</span> (<span class=\"i\">local</span> != -<span class=\"n\">1</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">addUpvalue</span>(<span class=\"i\">compiler</span>, (<span class=\"t\">uint8_t</span>)<span class=\"i\">local</span>, <span class=\"k\">true</span>);\n  }\n\n  <span class=\"k\">return</span> -<span class=\"n\">1</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>resolveLocal</em>()</div>\n\n<p>We call 
this after failing to resolve a local variable in the current function&rsquo;s\nscope, so we know the variable isn&rsquo;t in the current compiler. Recall that\nCompiler stores a pointer to the Compiler for the enclosing function, and these\npointers form a linked chain that goes all the way to the root Compiler for the\ntop-level code. Thus, if the enclosing Compiler is <code>NULL</code>, we know we&rsquo;ve reached\nthe outermost function without finding a local variable. The variable must be\n<span name=\"undefined\">global</span>, so we return -1.</p>\n<aside name=\"undefined\">\n<p>It might end up being an entirely undefined variable and not even global. But in\nLox, we don&rsquo;t detect that error until runtime, so from the compiler&rsquo;s\nperspective, it&rsquo;s &ldquo;hopefully global&rdquo;.</p>\n</aside>\n<p>Otherwise, we try to resolve the identifier as a <em>local</em> variable in the\n<em>enclosing</em> compiler. In other words, we look for it right outside the current\nfunction. For example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">outer</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">x</span> = <span class=\"n\">1</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">inner</span>() {\n    <span class=\"k\">print</span> <span class=\"i\">x</span>; <span class=\"c\">// (1)</span>\n  }\n  <span class=\"i\">inner</span>();\n}\n</pre></div>\n<p>When compiling the identifier expression at <code>(1)</code>, <code>resolveUpvalue()</code> looks for\na local variable <code>x</code> declared in <code>outer()</code>. If found<span class=\"em\">&mdash;</span>like it is in this\nexample<span class=\"em\">&mdash;</span>then we&rsquo;ve successfully resolved the variable. We create an upvalue\nso that the inner function can access the variable through that. 
The upvalue is\ncreated here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>resolveLocal</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">int</span> <span class=\"i\">addUpvalue</span>(<span class=\"t\">Compiler</span>* <span class=\"i\">compiler</span>, <span class=\"t\">uint8_t</span> <span class=\"i\">index</span>,\n                      <span class=\"t\">bool</span> <span class=\"i\">isLocal</span>) {\n  <span class=\"t\">int</span> <span class=\"i\">upvalueCount</span> = <span class=\"i\">compiler</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">upvalueCount</span>;\n  <span class=\"i\">compiler</span>-&gt;<span class=\"i\">upvalues</span>[<span class=\"i\">upvalueCount</span>].<span class=\"i\">isLocal</span> = <span class=\"i\">isLocal</span>;\n  <span class=\"i\">compiler</span>-&gt;<span class=\"i\">upvalues</span>[<span class=\"i\">upvalueCount</span>].<span class=\"i\">index</span> = <span class=\"i\">index</span>;\n  <span class=\"k\">return</span> <span class=\"i\">compiler</span>-&gt;<span class=\"i\">function</span>-&gt;<span class=\"i\">upvalueCount</span>++;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>resolveLocal</em>()</div>\n\n<p>The compiler keeps an array of upvalue structures to track the closed-over\nidentifiers that it has resolved in the body of each function. Remember how the\ncompiler&rsquo;s Local array mirrors the stack slot indexes where locals live at\nruntime? This new upvalue array works the same way. The indexes in the\ncompiler&rsquo;s array match the indexes where upvalues will live in the ObjClosure at\nruntime.</p>\n<p>This function adds a new upvalue to that array. It also keeps track of the\nnumber of upvalues the function uses. 
It stores that count directly in the\nObjFunction itself because we&rsquo;ll also <span name=\"bridge\">need</span> that\nnumber for use at runtime.</p>\n<aside name=\"bridge\">\n<p>Like constants and function arity, the upvalue count is another one of those\nlittle pieces of data that form the bridge between the compiler and runtime.</p>\n</aside>\n<p>The <code>index</code> field tracks the closed-over local variable&rsquo;s slot index. That way\nthe compiler knows <em>which</em> variable in the enclosing function needs to be\ncaptured. We&rsquo;ll circle back to what that <code>isLocal</code> field is for before too long.\nFinally, <code>addUpvalue()</code> returns the index of the created upvalue in the\nfunction&rsquo;s upvalue list. That index becomes the operand to the <code>OP_GET_UPVALUE</code>\nand <code>OP_SET_UPVALUE</code> instructions.</p>\n<p>That&rsquo;s the basic idea for resolving upvalues, but the function isn&rsquo;t fully\nbaked. A closure may reference the same variable in a surrounding function\nmultiple times. In that case, we don&rsquo;t want to waste time and memory creating a\nseparate upvalue for each identifier expression. 
To fix that, before we add a\nnew upvalue, we first check to see if the function already has an upvalue that\ncloses over that variable.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  int upvalueCount = compiler-&gt;function-&gt;upvalueCount;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>addUpvalue</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">upvalueCount</span>; <span class=\"i\">i</span>++) {\n    <span class=\"t\">Upvalue</span>* <span class=\"i\">upvalue</span> = &amp;<span class=\"i\">compiler</span>-&gt;<span class=\"i\">upvalues</span>[<span class=\"i\">i</span>];\n    <span class=\"k\">if</span> (<span class=\"i\">upvalue</span>-&gt;<span class=\"i\">index</span> == <span class=\"i\">index</span> &amp;&amp; <span class=\"i\">upvalue</span>-&gt;<span class=\"i\">isLocal</span> == <span class=\"i\">isLocal</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">i</span>;\n    }\n  }\n\n</pre><pre class=\"insert-after\">  compiler-&gt;upvalues[upvalueCount].isLocal = isLocal;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>addUpvalue</em>()</div>\n\n<p>If we find an upvalue in the array whose slot index matches the one we&rsquo;re\nadding, we just return that <em>upvalue</em> index and reuse it. Otherwise, we fall\nthrough and add the new upvalue.</p>\n<p>These two functions access and modify a bunch of new state, so let&rsquo;s define\nthat. 
First, we add the upvalue count to ObjFunction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  int arity;\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin struct <em>ObjFunction</em></div>\n<pre class=\"insert\">  <span class=\"t\">int</span> <span class=\"i\">upvalueCount</span>;\n</pre><pre class=\"insert-after\">  Chunk chunk;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in struct <em>ObjFunction</em></div>\n\n<p>We&rsquo;re conscientious C programmers, so we zero-initialize that when an\nObjFunction is first allocated.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  function-&gt;arity = 0;\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>newFunction</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">function</span>-&gt;<span class=\"i\">upvalueCount</span> = <span class=\"n\">0</span>;\n</pre><pre class=\"insert-after\">  function-&gt;name = NULL;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>newFunction</em>()</div>\n\n<p>In the compiler, we add a field for the upvalue array.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  int localCount;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin struct <em>Compiler</em></div>\n<pre class=\"insert\">  <span class=\"t\">Upvalue</span> <span class=\"i\">upvalues</span>[<span class=\"a\">UINT8_COUNT</span>];\n</pre><pre class=\"insert-after\">  int scopeDepth;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in struct <em>Compiler</em></div>\n\n<p>For simplicity, I gave it a fixed size. The <code>OP_GET_UPVALUE</code> and\n<code>OP_SET_UPVALUE</code> instructions encode an upvalue index using a single byte\noperand, so there&rsquo;s a restriction on how many upvalues a function can have<span class=\"em\">&mdash;</span>how many unique variables it can close over. Given that, we can afford a static\narray that large. 
We also need to make sure the compiler doesn&rsquo;t overflow that\nlimit.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    if (upvalue-&gt;index == index &amp;&amp; upvalue-&gt;isLocal == isLocal) {\n      return i;\n    }\n  }\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>addUpvalue</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">upvalueCount</span> == <span class=\"a\">UINT8_COUNT</span>) {\n    <span class=\"i\">error</span>(<span class=\"s\">&quot;Too many closure variables in function.&quot;</span>);\n    <span class=\"k\">return</span> <span class=\"n\">0</span>;\n  }\n\n</pre><pre class=\"insert-after\">  compiler-&gt;upvalues[upvalueCount].isLocal = isLocal;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>addUpvalue</em>()</div>\n\n<p>Finally, the Upvalue struct type itself.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after struct <em>Local</em></div>\n<pre><span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">uint8_t</span> <span class=\"i\">index</span>;\n  <span class=\"t\">bool</span> <span class=\"i\">isLocal</span>;\n} <span class=\"t\">Upvalue</span>;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after struct <em>Local</em></div>\n\n<p>The <code>index</code> field stores which local slot the upvalue is capturing. The\n<code>isLocal</code> field deserves its own section, which we&rsquo;ll get to next.</p>\n<h3><a href=\"#flattening-upvalues\" id=\"flattening-upvalues\"><small>25&#8202;.&#8202;2&#8202;.&#8202;2</small>Flattening upvalues</a></h3>\n<p>In the example I showed before, the closure is accessing a variable declared in\nthe immediately enclosing function. 
Lox also supports accessing local variables\ndeclared in <em>any</em> enclosing scope, as in:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">outer</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">x</span> = <span class=\"n\">1</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">middle</span>() {\n    <span class=\"k\">fun</span> <span class=\"i\">inner</span>() {\n      <span class=\"k\">print</span> <span class=\"i\">x</span>;\n    }\n  }\n}\n</pre></div>\n<p>Here, we&rsquo;re accessing <code>x</code> in <code>inner()</code>. That variable is defined not in\n<code>middle()</code>, but all the way out in <code>outer()</code>. We need to handle cases like this\ntoo. You <em>might</em> think that this isn&rsquo;t much harder since the variable will\nsimply be somewhere farther down on the stack. But consider this <span\nname=\"devious\">devious</span> example:</p>\n<aside name=\"devious\">\n<p>If you work on programming languages long enough, you will develop a\nfinely honed skill at creating bizarre programs like this that are technically\nvalid but likely to trip up an implementation written by someone with a less\nperverse imagination than you.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">outer</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">x</span> = <span class=\"s\">&quot;value&quot;</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">middle</span>() {\n    <span class=\"k\">fun</span> <span class=\"i\">inner</span>() {\n      <span class=\"k\">print</span> <span class=\"i\">x</span>;\n    }\n\n    <span class=\"k\">print</span> <span class=\"s\">&quot;create inner closure&quot;</span>;\n    <span class=\"k\">return</span> <span class=\"i\">inner</span>;\n  }\n\n  <span class=\"k\">print</span> <span class=\"s\">&quot;return from outer&quot;</span>;\n  <span class=\"k\">return</span> <span class=\"i\">middle</span>;\n}\n\n<span 
class=\"k\">var</span> <span class=\"i\">mid</span> = <span class=\"i\">outer</span>();\n<span class=\"k\">var</span> <span class=\"i\">in</span> = <span class=\"i\">mid</span>();\n<span class=\"i\">in</span>();\n</pre></div>\n<p>When you run this, it should print:</p>\n<div class=\"codehilite\"><pre>return from outer\ncreate inner closure\nvalue\n</pre></div>\n<p>I know, it&rsquo;s convoluted. The important part is that <code>outer()</code><span class=\"em\">&mdash;</span>where <code>x</code> is\ndeclared<span class=\"em\">&mdash;</span>returns and pops all of its variables off the stack before the\n<em>declaration</em> of <code>inner()</code> executes. So, at the point in time that we create the\nclosure for <code>inner()</code>, <code>x</code> is already off the stack.</p>\n<p>Here, I traced out the execution flow for you:</p><img src=\"image/closures/execution-flow.png\" alt=\"Tracing through the previous example program.\"/>\n<p>See how <code>x</code> is popped &#9312; before it is captured &#9313; and then later\naccessed &#9314;? We really have two problems:</p>\n<ol>\n<li>\n<p>We need to resolve local variables that are declared in surrounding\nfunctions beyond the immediately enclosing one.</p>\n</li>\n<li>\n<p>We need to be able to capture variables that have already left the stack.</p>\n</li>\n</ol>\n<p>Fortunately, we&rsquo;re in the middle of adding upvalues to the VM, and upvalues are\nexplicitly designed for tracking variables that have escaped the stack. So, in a\nclever bit of self-reference, we can use upvalues to allow upvalues to capture\nvariables declared outside of the immediately surrounding function.</p>\n<p>The solution is to allow a closure to capture either a local variable or <em>an\nexisting upvalue</em> in the immediately enclosing function. 
If a deeply nested\nfunction references a local variable declared several hops away, we&rsquo;ll thread it\nthrough all of the intermediate functions by having each function capture an\nupvalue for the next function to grab.</p><img src=\"image/closures/linked-upvalues.png\" alt=\"An upvalue in inner() points to an upvalue in middle(), which points to a local variable in outer().\"/>\n<p>In the above example, <code>middle()</code> captures the local variable <code>x</code> in the\nimmediately enclosing function <code>outer()</code> and stores it in its own upvalue. It\ndoes this even though <code>middle()</code> itself doesn&rsquo;t reference <code>x</code>. Then, when the\ndeclaration of <code>inner()</code> executes, its closure grabs the <em>upvalue</em> from the\nObjClosure for <code>middle()</code> that captured <code>x</code>. A function captures<span class=\"em\">&mdash;</span>either a\nlocal or upvalue<span class=\"em\">&mdash;</span><em>only</em> from the immediately surrounding function, which is\nguaranteed to still be around at the point that the inner function declaration\nexecutes.</p>\n<p>In order to implement this, <code>resolveUpvalue()</code> becomes recursive.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  if (local != -1) {\n    return addUpvalue(compiler, (uint8_t)local, true);\n  }\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>resolveUpvalue</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">int</span> <span class=\"i\">upvalue</span> = <span class=\"i\">resolveUpvalue</span>(<span class=\"i\">compiler</span>-&gt;<span class=\"i\">enclosing</span>, <span class=\"i\">name</span>);\n  <span class=\"k\">if</span> (<span class=\"i\">upvalue</span> != -<span class=\"n\">1</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">addUpvalue</span>(<span class=\"i\">compiler</span>, (<span class=\"t\">uint8_t</span>)<span class=\"i\">upvalue</span>, <span class=\"k\">false</span>);\n  }\n\n</pre><pre 
class=\"insert-after\">  return -1;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>resolveUpvalue</em>()</div>\n\n<p>It&rsquo;s only another three lines of code, but I found this function really\nchallenging to get right the first time. This in spite of the fact that I wasn&rsquo;t\ninventing anything new, just porting the concept over from Lua. Most recursive\nfunctions either do all their work before the recursive call (a <strong>pre-order\ntraversal</strong>, or &ldquo;on the way down&rdquo;), or they do all the work after the recursive\ncall (a <strong>post-order traversal</strong>, or &ldquo;on the way back up&rdquo;). This function does\nboth. The recursive call is right in the middle.</p>\n<p>We&rsquo;ll walk through it slowly. First, we look for a matching local variable in\nthe enclosing function. If we find one, we capture that local and return. That&rsquo;s\nthe <span name=\"base\">base</span> case.</p>\n<aside name=\"base\">\n<p>The other base case, of course, is if there is no enclosing function. In that\ncase, the variable can&rsquo;t be resolved lexically and is treated as global.</p>\n</aside>\n<p>Otherwise, we look for a local variable beyond the immediately enclosing\nfunction. We do that by recursively calling <code>resolveUpvalue()</code> on the\n<em>enclosing</em> compiler, not the current one. This series of <code>resolveUpvalue()</code>\ncalls works its way along the chain of nested compilers until it hits one of\nthe base cases<span class=\"em\">&mdash;</span>either it finds an actual local variable to capture or it\nruns out of compilers.</p>\n<p>When a local variable is found, the most deeply <span name=\"outer\">nested</span>\ncall to <code>resolveUpvalue()</code> captures it and returns the upvalue index. That\nreturns to the next call for the inner function declaration. That call captures\nthe <em>upvalue</em> from the surrounding function, and so on. 
As each nested call to\n<code>resolveUpvalue()</code> returns, we drill back down into the innermost function\ndeclaration where the identifier we are resolving appears. At each step along\nthe way, we add an upvalue to the intervening function and pass the resulting\nupvalue index down to the next call.</p>\n<aside name=\"outer\">\n<p>Each recursive call to <code>resolveUpvalue()</code> walks <em>out</em> one level of function\nnesting. So an inner <em>recursive call</em> refers to an <em>outer</em> nested declaration.\nThe innermost recursive call to <code>resolveUpvalue()</code> that finds the local variable\nwill be for the <em>outermost</em> function, just inside the enclosing function where\nthat variable is actually declared.</p>\n</aside>\n<p>It might help to walk through the original example when resolving <code>x</code>:</p><img src=\"image/closures/recursion.png\" alt=\"Tracing through a recursive call to resolveUpvalue().\"/>\n<p>Note that the new call to <code>addUpvalue()</code> passes <code>false</code> for the <code>isLocal</code>\nparameter. Now you see that that flag controls whether the closure captures a\nlocal variable or an upvalue from the surrounding function.</p>\n<p>By the time the compiler reaches the end of a function declaration, every\nvariable reference has been resolved as either a local, an upvalue, or a global.\nEach upvalue may in turn capture a local variable from the surrounding function,\nor an upvalue in the case of transitive closures. 
We finally have enough data to\nemit bytecode which creates a closure at runtime that captures all of the\ncorrect variables.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  emitBytes(OP_CLOSURE, makeConstant(OBJ_VAL(function)));\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>function</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">function</span>-&gt;<span class=\"i\">upvalueCount</span>; <span class=\"i\">i</span>++) {\n    <span class=\"i\">emitByte</span>(<span class=\"i\">compiler</span>.<span class=\"i\">upvalues</span>[<span class=\"i\">i</span>].<span class=\"i\">isLocal</span> ? <span class=\"n\">1</span> : <span class=\"n\">0</span>);\n    <span class=\"i\">emitByte</span>(<span class=\"i\">compiler</span>.<span class=\"i\">upvalues</span>[<span class=\"i\">i</span>].<span class=\"i\">index</span>);\n  }\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>function</em>()</div>\n\n<p>The <code>OP_CLOSURE</code> instruction is unique in that it has a variably sized encoding.\nFor each upvalue the closure captures, there are two single-byte operands. Each\npair of operands specifies what that upvalue captures. If the first byte is one,\nit captures a local variable in the enclosing function. If zero, it captures one\nof the function&rsquo;s upvalues. 
The next byte is the local slot or upvalue index to\ncapture.</p>\n<p>This odd encoding means we need some bespoke support in the disassembly code\nfor <code>OP_CLOSURE</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      printf(&quot;\\n&quot;);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">\n\n      <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span> = <span class=\"a\">AS_FUNCTION</span>(\n          <span class=\"i\">chunk</span>-&gt;<span class=\"i\">constants</span>.<span class=\"i\">values</span>[<span class=\"i\">constant</span>]);\n      <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">j</span> = <span class=\"n\">0</span>; <span class=\"i\">j</span> &lt; <span class=\"i\">function</span>-&gt;<span class=\"i\">upvalueCount</span>; <span class=\"i\">j</span>++) {\n        <span class=\"t\">int</span> <span class=\"i\">isLocal</span> = <span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>[<span class=\"i\">offset</span>++];\n        <span class=\"t\">int</span> <span class=\"i\">index</span> = <span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>[<span class=\"i\">offset</span>++];\n        <span class=\"i\">printf</span>(<span class=\"s\">&quot;%04d      |                     %s %d</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>,\n               <span class=\"i\">offset</span> - <span class=\"n\">2</span>, <span class=\"i\">isLocal</span> ? 
<span class=\"s\">&quot;local&quot;</span> : <span class=\"s\">&quot;upvalue&quot;</span>, <span class=\"i\">index</span>);\n      }\n\n</pre><pre class=\"insert-after\">      return offset;\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>For example, take this script:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">outer</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">1</span>;\n  <span class=\"k\">var</span> <span class=\"i\">b</span> = <span class=\"n\">2</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">middle</span>() {\n    <span class=\"k\">var</span> <span class=\"i\">c</span> = <span class=\"n\">3</span>;\n    <span class=\"k\">var</span> <span class=\"i\">d</span> = <span class=\"n\">4</span>;\n    <span class=\"k\">fun</span> <span class=\"i\">inner</span>() {\n      <span class=\"k\">print</span> <span class=\"i\">a</span> + <span class=\"i\">c</span> + <span class=\"i\">b</span> + <span class=\"i\">d</span>;\n    }\n  }\n}\n</pre></div>\n<p>If we disassemble the instruction that creates the closure for <code>inner()</code>, it\nprints this:</p>\n<div class=\"codehilite\"><pre>0004    9 OP_CLOSURE          2 &lt;fn inner&gt;\n0006      |                     upvalue 0\n0008      |                     local 1\n0010      |                     upvalue 1\n0012      |                     local 2\n</pre></div>\n<p>We have two other, simpler instructions to add disassembler support for.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case OP_SET_GLOBAL:\n      return constantInstruction(&quot;OP_SET_GLOBAL&quot;, chunk, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_GET_UPVALUE</span>:\n      <span class=\"k\">return</span> <span 
class=\"i\">byteInstruction</span>(<span class=\"s\">&quot;OP_GET_UPVALUE&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n    <span class=\"k\">case</span> <span class=\"a\">OP_SET_UPVALUE</span>:\n      <span class=\"k\">return</span> <span class=\"i\">byteInstruction</span>(<span class=\"s\">&quot;OP_SET_UPVALUE&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_EQUAL:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>These both have a single-byte operand, so there&rsquo;s nothing exciting going on. We\ndo need to add an include so the debug module can get to <code>AS_FUNCTION()</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;debug.h&quot;\n</pre><div class=\"source-file\"><em>debug.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;object.h&quot;</span>\n</pre><pre class=\"insert-after\">#include &quot;value.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em></div>\n\n<p>With that, our compiler is where we want it. For each function declaration, it\noutputs an <code>OP_CLOSURE</code> instruction followed by a series of operand byte pairs\nfor each upvalue it needs to capture at runtime. It&rsquo;s time to hop over to that\nside of the VM and get things running.</p>\n<h2><a href=\"#upvalue-objects\" id=\"upvalue-objects\"><small>25&#8202;.&#8202;3</small>Upvalue Objects</a></h2>\n<p>Each <code>OP_CLOSURE</code> instruction is now followed by the series of bytes that\nspecify the upvalues the ObjClosure should own. 
Before we process those\noperands, we need a runtime representation for upvalues.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>ObjString</em></div>\n<pre><span class=\"k\">typedef</span> <span class=\"k\">struct</span> <span class=\"t\">ObjUpvalue</span> {\n  <span class=\"t\">Obj</span> <span class=\"i\">obj</span>;\n  <span class=\"t\">Value</span>* <span class=\"i\">location</span>;\n} <span class=\"t\">ObjUpvalue</span>;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>ObjString</em></div>\n\n<p>We know upvalues must manage closed-over variables that no longer live on the\nstack, which implies some amount of dynamic allocation. The easiest way to do\nthat in our VM is by building on the object system we already have. That way,\nwhen we implement a garbage collector in <a href=\"garbage-collection.html\">the next chapter</a>, the GC can\nmanage memory for upvalues too.</p>\n<p>Thus, our runtime upvalue structure is an ObjUpvalue with the typical Obj header\nfield. Following that is a <code>location</code> field that points to the closed-over\nvariable. Note that this is a <em>pointer</em> to a Value, not a Value itself. It&rsquo;s a\nreference to a <em>variable</em>, not a <em>value</em>. This is important because it means\nthat when we assign to the variable the upvalue captures, we&rsquo;re assigning to the\nactual variable, not a copy. 
For example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">outer</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">x</span> = <span class=\"s\">&quot;before&quot;</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">inner</span>() {\n    <span class=\"i\">x</span> = <span class=\"s\">&quot;assigned&quot;</span>;\n  }\n  <span class=\"i\">inner</span>();\n  <span class=\"k\">print</span> <span class=\"i\">x</span>;\n}\n<span class=\"i\">outer</span>();\n</pre></div>\n<p>This program should print &ldquo;assigned&rdquo; even though the closure assigns to <code>x</code> and\nthe surrounding function accesses it.</p>\n<p>Because upvalues are objects, we&rsquo;ve got all the usual object machinery, starting\nwith a constructor-like function:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">ObjString* copyString(const char* chars, int length);\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after <em>copyString</em>()</div>\n<pre class=\"insert\"><span class=\"t\">ObjUpvalue</span>* <span class=\"i\">newUpvalue</span>(<span class=\"t\">Value</span>* <span class=\"i\">slot</span>);\n</pre><pre class=\"insert-after\">void printObject(Value value);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after <em>copyString</em>()</div>\n\n<p>It takes the address of the slot where the closed-over variable lives. 
Here is\nthe implementation:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.c</em><br>\nadd after <em>copyString</em>()</div>\n<pre><span class=\"t\">ObjUpvalue</span>* <span class=\"i\">newUpvalue</span>(<span class=\"t\">Value</span>* <span class=\"i\">slot</span>) {\n  <span class=\"t\">ObjUpvalue</span>* <span class=\"i\">upvalue</span> = <span class=\"a\">ALLOCATE_OBJ</span>(<span class=\"t\">ObjUpvalue</span>, <span class=\"a\">OBJ_UPVALUE</span>);\n  <span class=\"i\">upvalue</span>-&gt;<span class=\"i\">location</span> = <span class=\"i\">slot</span>;\n  <span class=\"k\">return</span> <span class=\"i\">upvalue</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, add after <em>copyString</em>()</div>\n\n<p>We simply initialize the object and store the pointer. That requires a new\nobject type.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OBJ_STRING,\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin enum <em>ObjType</em></div>\n<pre class=\"insert\">  <span class=\"a\">OBJ_UPVALUE</span>\n</pre><pre class=\"insert-after\">} ObjType;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in enum <em>ObjType</em></div>\n\n<p>And on the back side, a destructor-like function:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      FREE(ObjString, object);\n      break;\n    }\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>freeObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_UPVALUE</span>:\n      <span class=\"a\">FREE</span>(<span class=\"t\">ObjUpvalue</span>, <span class=\"i\">object</span>);\n      <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>freeObject</em>()</div>\n\n<p>Multiple closures can close over the same variable, so ObjUpvalue does not own\nthe variable it references. 
Thus, the only thing to free is the ObjUpvalue\nitself.</p>\n<p>And, finally, to print:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case OBJ_STRING:\n      printf(&quot;%s&quot;, AS_CSTRING(value));\n      break;\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>printObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_UPVALUE</span>:\n      <span class=\"i\">printf</span>(<span class=\"s\">&quot;upvalue&quot;</span>);\n      <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>printObject</em>()</div>\n\n<p>Printing isn&rsquo;t useful to end users. Upvalues are objects only so that we can\ntake advantage of the VM&rsquo;s memory management. They aren&rsquo;t first-class values\nthat a Lox user can directly access in a program. So this code will never\nactually execute<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>but it keeps the compiler from yelling at us about an\nunhandled switch case, so here we are.</p>\n<h3><a href=\"#upvalues-in-closures\" id=\"upvalues-in-closures\"><small>25&#8202;.&#8202;3&#8202;.&#8202;1</small>Upvalues in closures</a></h3>\n<p>When I first introduced upvalues, I said each closure has an array of them.\nWe&rsquo;ve finally worked our way back to implementing that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  ObjFunction* function;\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin struct <em>ObjClosure</em></div>\n<pre class=\"insert\">  <span class=\"t\">ObjUpvalue</span>** <span class=\"i\">upvalues</span>;\n  <span class=\"t\">int</span> <span class=\"i\">upvalueCount</span>;\n</pre><pre class=\"insert-after\">} ObjClosure;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in struct <em>ObjClosure</em></div>\n\n<p><span name=\"count\">Different</span> closures may have different numbers of\nupvalues, so 
we need a dynamic array. The upvalues themselves are dynamically\nallocated too, so we end up with a double pointer<span class=\"em\">&mdash;</span>a pointer to a dynamically\nallocated array of pointers to upvalues. We also store the number of elements in\nthe array.</p>\n<aside name=\"count\">\n<p>Storing the upvalue count in the closure is redundant because the ObjFunction\nthat the ObjClosure references also keeps that count. As usual, this weird code\nis to appease the GC. The collector may need to know an ObjClosure&rsquo;s upvalue\narray size after the closure&rsquo;s corresponding ObjFunction has already been freed.</p>\n</aside>\n<p>When we create an ObjClosure, we allocate an upvalue array of the proper size,\nwhich we determined at compile time and stored in the ObjFunction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">ObjClosure* newClosure(ObjFunction* function) {\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>newClosure</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">ObjUpvalue</span>** <span class=\"i\">upvalues</span> = <span class=\"a\">ALLOCATE</span>(<span class=\"t\">ObjUpvalue</span>*,\n                                   <span class=\"i\">function</span>-&gt;<span class=\"i\">upvalueCount</span>);\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">function</span>-&gt;<span class=\"i\">upvalueCount</span>; <span class=\"i\">i</span>++) {\n    <span class=\"i\">upvalues</span>[<span class=\"i\">i</span>] = <span class=\"a\">NULL</span>;\n  }\n\n</pre><pre class=\"insert-after\">  ObjClosure* closure = ALLOCATE_OBJ(ObjClosure, OBJ_CLOSURE);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>newClosure</em>()</div>\n\n<p>Before creating the closure object itself, we allocate the array of upvalues and\ninitialize them all to <code>NULL</code>. 
This weird ceremony around memory is a careful\ndance to please the (forthcoming) garbage collection deities. It ensures the\nmemory manager never sees uninitialized memory.</p>\n<p>Then we store the array in the new closure, as well as copy the count over from\nthe ObjFunction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  closure-&gt;function = function;\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>newClosure</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">closure</span>-&gt;<span class=\"i\">upvalues</span> = <span class=\"i\">upvalues</span>;\n  <span class=\"i\">closure</span>-&gt;<span class=\"i\">upvalueCount</span> = <span class=\"i\">function</span>-&gt;<span class=\"i\">upvalueCount</span>;\n</pre><pre class=\"insert-after\">  return closure;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>newClosure</em>()</div>\n\n<p>When we free an ObjClosure, we also free the upvalue array.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case OBJ_CLOSURE: {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>freeObject</em>()</div>\n<pre class=\"insert\">      <span class=\"t\">ObjClosure</span>* <span class=\"i\">closure</span> = (<span class=\"t\">ObjClosure</span>*)<span class=\"i\">object</span>;\n      <span class=\"a\">FREE_ARRAY</span>(<span class=\"t\">ObjUpvalue</span>*, <span class=\"i\">closure</span>-&gt;<span class=\"i\">upvalues</span>,\n                 <span class=\"i\">closure</span>-&gt;<span class=\"i\">upvalueCount</span>);\n</pre><pre class=\"insert-after\">      FREE(ObjClosure, object);\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>freeObject</em>()</div>\n\n<p>ObjClosure does not own the ObjUpvalue objects themselves, but it does own <em>the\narray</em> containing pointers to those upvalues.</p>\n<p>We fill the upvalue array over in the interpreter when it creates a closure.\nThis is where we walk through all of the 
operands after <code>OP_CLOSURE</code> to see what\nkind of upvalue each slot captures.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        push(OBJ_VAL(closure));\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">        <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">closure</span>-&gt;<span class=\"i\">upvalueCount</span>; <span class=\"i\">i</span>++) {\n          <span class=\"t\">uint8_t</span> <span class=\"i\">isLocal</span> = <span class=\"a\">READ_BYTE</span>();\n          <span class=\"t\">uint8_t</span> <span class=\"i\">index</span> = <span class=\"a\">READ_BYTE</span>();\n          <span class=\"k\">if</span> (<span class=\"i\">isLocal</span>) {\n            <span class=\"i\">closure</span>-&gt;<span class=\"i\">upvalues</span>[<span class=\"i\">i</span>] =\n                <span class=\"i\">captureUpvalue</span>(<span class=\"i\">frame</span>-&gt;<span class=\"i\">slots</span> + <span class=\"i\">index</span>);\n          } <span class=\"k\">else</span> {\n            <span class=\"i\">closure</span>-&gt;<span class=\"i\">upvalues</span>[<span class=\"i\">i</span>] = <span class=\"i\">frame</span>-&gt;<span class=\"i\">closure</span>-&gt;<span class=\"i\">upvalues</span>[<span class=\"i\">index</span>];\n          }\n        }\n</pre><pre class=\"insert-after\">        break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>This code is the magic moment when a closure comes to life. We iterate over each\nupvalue the closure expects. For each one, we read a pair of operand bytes. If\nthe upvalue closes over a local variable in the enclosing function, we let\n<code>captureUpvalue()</code> do the work.</p>\n<p>Otherwise, we capture an upvalue from the surrounding function. 
An <code>OP_CLOSURE</code>\ninstruction is emitted at the end of a function declaration. At the moment that\nwe are executing that declaration, the <em>current</em> function is the surrounding\none. That means the current function&rsquo;s closure is stored in the CallFrame at the\ntop of the call stack. So, to grab an upvalue from the enclosing function, we can\nread it right from the <code>frame</code> local variable, which caches a reference to that\nCallFrame.</p>\n<p>Closing over a local variable is more interesting. Most of the work happens in a\nseparate function, but first we calculate the argument to pass to it. We need to\ngrab a pointer to the captured local&rsquo;s slot in the surrounding function&rsquo;s stack\nwindow. That window begins at <code>frame-&gt;slots</code>, which points to slot zero. Adding\n<code>index</code> offsets that to the local slot we want to capture. We pass that pointer\nhere:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>callValue</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">ObjUpvalue</span>* <span class=\"i\">captureUpvalue</span>(<span class=\"t\">Value</span>* <span class=\"i\">local</span>) {\n  <span class=\"t\">ObjUpvalue</span>* <span class=\"i\">createdUpvalue</span> = <span class=\"i\">newUpvalue</span>(<span class=\"i\">local</span>);\n  <span class=\"k\">return</span> <span class=\"i\">createdUpvalue</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>callValue</em>()</div>\n\n<p>This seems a little silly. All it does is create a new ObjUpvalue that captures\nthe given stack slot and returns it. Did we need a separate function for this?\nWell, no, not <em>yet</em>. But you know we are going to end up sticking more code in\nhere.</p>\n<p>First, let&rsquo;s wrap up what we&rsquo;re working on. 
Back in the interpreter code for\nhandling <code>OP_CLOSURE</code>, we eventually finish iterating through the upvalue\narray and initialize each one. When that completes, we have a new closure with\nan array full of upvalues pointing to variables.</p>\n<p>With that in hand, we can implement the instructions that work with those\nupvalues.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_GET_UPVALUE</span>: {\n        <span class=\"t\">uint8_t</span> <span class=\"i\">slot</span> = <span class=\"a\">READ_BYTE</span>();\n        <span class=\"i\">push</span>(*<span class=\"i\">frame</span>-&gt;<span class=\"i\">closure</span>-&gt;<span class=\"i\">upvalues</span>[<span class=\"i\">slot</span>]-&gt;<span class=\"i\">location</span>);\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_EQUAL: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>The operand is the index into the current function&rsquo;s upvalue array. So we simply\nlook up the corresponding upvalue and dereference its location pointer to read\nthe value in that slot. 
Setting a variable is similar.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_SET_UPVALUE</span>: {\n        <span class=\"t\">uint8_t</span> <span class=\"i\">slot</span> = <span class=\"a\">READ_BYTE</span>();\n        *<span class=\"i\">frame</span>-&gt;<span class=\"i\">closure</span>-&gt;<span class=\"i\">upvalues</span>[<span class=\"i\">slot</span>]-&gt;<span class=\"i\">location</span> = <span class=\"i\">peek</span>(<span class=\"n\">0</span>);\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_EQUAL: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>We <span name=\"assign\">take</span> the value on top of the stack and store it\ninto the slot pointed to by the chosen upvalue. Just as with the instructions\nfor local variables, it&rsquo;s important that these instructions are fast. User\nprograms are constantly reading and writing variables, so if that&rsquo;s slow,\neverything is slow. And, as usual, the way we make them fast is by keeping them\nsimple. These two new instructions are pretty good: no control flow, no complex\narithmetic, just a couple of pointer indirections and a <code>push()</code>.</p>\n<aside name=\"assign\">\n<p>The set instruction doesn&rsquo;t <em>pop</em> the value from the stack because, remember,\nassignment is an expression in Lox. So the result of the assignment<span class=\"em\">&mdash;</span>the\nassigned value<span class=\"em\">&mdash;</span>needs to remain on the stack for the surrounding expression.</p>\n</aside>\n<p>This is a milestone. As long as all of the variables remain on the stack, we\nhave working closures. 
Try this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">outer</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">x</span> = <span class=\"s\">&quot;outside&quot;</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">inner</span>() {\n    <span class=\"k\">print</span> <span class=\"i\">x</span>;\n  }\n  <span class=\"i\">inner</span>();\n}\n<span class=\"i\">outer</span>();\n</pre></div>\n<p>Run this, and it correctly prints &ldquo;outside&rdquo;.</p>\n<h2><a href=\"#closed-upvalues\" id=\"closed-upvalues\"><small>25&#8202;.&#8202;4</small>Closed Upvalues</a></h2>\n<p>Of course, a key feature of closures is that they hold on to the variable as\nlong as needed, even after the function that declares the variable has returned.\nHere&rsquo;s another example that <em>should</em> work:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">outer</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">x</span> = <span class=\"s\">&quot;outside&quot;</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">inner</span>() {\n    <span class=\"k\">print</span> <span class=\"i\">x</span>;\n  }\n\n  <span class=\"k\">return</span> <span class=\"i\">inner</span>;\n}\n\n<span class=\"k\">var</span> <span class=\"i\">closure</span> = <span class=\"i\">outer</span>();\n<span class=\"i\">closure</span>();\n</pre></div>\n<p>But if you run it right now<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>who knows what it does? At runtime, it will end\nup reading from a stack slot that no longer contains the closed-over variable.\nLike I&rsquo;ve mentioned a few times, the crux of the issue is that variables in\nclosures don&rsquo;t have stack semantics. That means we&rsquo;ve got to hoist them off the\nstack when the function where they were declared returns. 
This final section of\nthe chapter does that.</p>\n<h3><a href=\"#values-and-variables\" id=\"values-and-variables\"><small>25&#8202;.&#8202;4&#8202;.&#8202;1</small>Values and variables</a></h3>\n<p>Before we get to writing code, I want to dig into an important semantic point.\nDoes a closure close over a <em>value</em> or a <em>variable?</em> This isn&rsquo;t purely an <span\nname=\"academic\">academic</span> question. I&rsquo;m not just splitting hairs.\nConsider:</p>\n<aside name=\"academic\">\n<p>If Lox didn&rsquo;t allow assignment, it <em>would</em> be an academic question.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">globalSet</span>;\n<span class=\"k\">var</span> <span class=\"i\">globalGet</span>;\n\n<span class=\"k\">fun</span> <span class=\"i\">main</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;initial&quot;</span>;\n\n  <span class=\"k\">fun</span> <span class=\"i\">set</span>() { <span class=\"i\">a</span> = <span class=\"s\">&quot;updated&quot;</span>; }\n  <span class=\"k\">fun</span> <span class=\"i\">get</span>() { <span class=\"k\">print</span> <span class=\"i\">a</span>; }\n\n  <span class=\"i\">globalSet</span> = <span class=\"i\">set</span>;\n  <span class=\"i\">globalGet</span> = <span class=\"i\">get</span>;\n}\n\n<span class=\"i\">main</span>();\n<span class=\"i\">globalSet</span>();\n<span class=\"i\">globalGet</span>();\n</pre></div>\n<p>The outer <code>main()</code> function creates two closures and stores them in <span\nname=\"global\">global</span> variables so that they outlive the execution of\n<code>main()</code> itself. Both of those closures capture the same variable. The first\nclosure assigns a new value to it and the second closure reads the variable.</p>\n<aside name=\"global\">\n<p>The fact that I&rsquo;m using a couple of global variables isn&rsquo;t significant. 
I needed\nsome way to return two values from a function, and without any kind of\ncollection type in Lox, my options were limited.</p>\n</aside>\n<p>What does the call to <code>globalGet()</code> print? If closures capture <em>values</em> then\neach closure gets its own copy of <code>a</code> with the value that <code>a</code> had at the point\nin time that the closure&rsquo;s function declaration executed. The call to\n<code>globalSet()</code> will modify <code>set()</code>&rsquo;s copy of <code>a</code>, but <code>get()</code>&rsquo;s copy will be\nunaffected. Thus, the call to <code>globalGet()</code> will print &ldquo;initial&rdquo;.</p>\n<p>If closures close over variables, then <code>get()</code> and <code>set()</code> will both capture<span class=\"em\">&mdash;</span>reference<span class=\"em\">&mdash;</span>the <em>same mutable variable</em>. When <code>set()</code> changes <code>a</code>, it changes\nthe same <code>a</code> that <code>get()</code> reads from. There is only one <code>a</code>. That, in turn,\nimplies the call to <code>globalGet()</code> will print &ldquo;updated&rdquo;.</p>\n<p>Which is it? The answer for Lox and most other languages I know with closures is\nthe latter. Closures capture variables. You can think of them as capturing <em>the\nplace the value lives</em>. This is important to keep in mind as we deal with\nclosed-over variables that are no longer on the stack. When a variable moves to\nthe heap, we need to ensure that all closures capturing that variable retain a\nreference to its <em>one</em> new location. That way, when the variable is mutated, all\nclosures see the change.</p>\n<h3><a href=\"#closing-upvalues\" id=\"closing-upvalues\"><small>25&#8202;.&#8202;4&#8202;.&#8202;2</small>Closing upvalues</a></h3>\n<p>We know that local variables always start out on the stack. This is faster, and\nlets our single-pass compiler emit code before it discovers the variable has\nbeen captured. 
We also know that closed-over variables need to move to the heap\nif the closure outlives the function where the captured variable is declared.</p>\n<p>Following Lua, we&rsquo;ll use <strong>open upvalue</strong> to refer to an upvalue that points to\na local variable still on the stack. When a variable moves to the heap, we are\n<em>closing</em> the upvalue and the result is, naturally, a <strong>closed upvalue</strong>. The\ntwo questions we need to answer are:</p>\n<ol>\n<li>\n<p>Where on the heap does the closed-over variable go?</p>\n</li>\n<li>\n<p>When do we close the upvalue?</p>\n</li>\n</ol>\n<p>The answer to the first question is easy. We already have a convenient object on\nthe heap that represents a reference to a variable<span class=\"em\">&mdash;</span>ObjUpvalue itself. The\nclosed-over variable will move into a new field right inside the ObjUpvalue\nstruct. That way we don&rsquo;t need to do any additional heap allocation to close an\nupvalue.</p>\n<p>The second question is straightforward too. As long as the variable is on the\nstack, there may be code that refers to it there, and that code must work\ncorrectly. So the logical time to hoist the variable to the heap is as late as\npossible. If we move the local variable right when it goes out of scope, we are\ncertain that no code after that point will try to access it from the stack.\n<span name=\"after\">After</span> the variable is out of scope, the compiler will\nhave reported an error if any code tried to use it.</p>\n<aside name=\"after\">\n<p>By &ldquo;after&rdquo; here, I mean in the lexical or textual sense<span class=\"em\">&mdash;</span>code past the <code>}</code>\nfor the block containing the declaration of the closed-over variable.</p>\n</aside>\n<p>The compiler already emits an <code>OP_POP</code> instruction when a local variable goes\nout of scope. 
If a variable is captured by a closure, we will instead emit a\ndifferent instruction to hoist that variable out of the stack and into its\ncorresponding upvalue. To do that, the compiler needs to know which <span\nname=\"param\">locals</span> are closed over.</p>\n<aside name=\"param\">\n<p>The compiler doesn&rsquo;t pop parameters and locals declared immediately inside the\nbody of a function. We&rsquo;ll handle those too, in the runtime.</p>\n</aside>\n<p>The compiler already maintains an array of Upvalue structs for each local\nvariable in the function to track exactly that state. That array is good for\nanswering &ldquo;Which variables does this closure use?&rdquo; But it&rsquo;s poorly suited for\nanswering, &ldquo;Does <em>any</em> function capture this local variable?&rdquo; In particular,\nonce the Compiler for some closure has finished, the Compiler for the enclosing\nfunction whose variable has been captured no longer has access to any of the\nupvalue state.</p>\n<p>In other words, the compiler maintains pointers from upvalues to the locals they\ncapture, but not in the other direction. So we first need to add some extra\ntracking inside the existing Local struct so that we can tell if a given local\nis captured by a closure.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  int depth;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin struct <em>Local</em></div>\n<pre class=\"insert\">  <span class=\"t\">bool</span> <span class=\"i\">isCaptured</span>;\n</pre><pre class=\"insert-after\">} Local;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in struct <em>Local</em></div>\n\n<p>This field is <code>true</code> if the local is captured by any later nested function\ndeclaration. 
Initially, all locals are not captured.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  local-&gt;depth = -1;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>addLocal</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">local</span>-&gt;<span class=\"i\">isCaptured</span> = <span class=\"k\">false</span>;\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>addLocal</em>()</div>\n\n<p><span name=\"zero\">Likewise</span>, the special &ldquo;slot zero local&rdquo; that the\ncompiler implicitly declares is not captured.</p>\n<aside name=\"zero\">\n<p>Later in the book, it <em>will</em> become possible for a user to capture this\nvariable. Just building some anticipation here.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">  local-&gt;depth = 0;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>initCompiler</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">local</span>-&gt;<span class=\"i\">isCaptured</span> = <span class=\"k\">false</span>;\n</pre><pre class=\"insert-after\">  local-&gt;name.start = &quot;&quot;;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>initCompiler</em>()</div>\n\n<p>When resolving an identifier, if we end up creating an upvalue for a local\nvariable, we mark it as captured.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  if (local != -1) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>resolveUpvalue</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">compiler</span>-&gt;<span class=\"i\">enclosing</span>-&gt;<span class=\"i\">locals</span>[<span class=\"i\">local</span>].<span class=\"i\">isCaptured</span> = <span class=\"k\">true</span>;\n</pre><pre class=\"insert-after\">    return addUpvalue(compiler, (uint8_t)local, true);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in 
<em>resolveUpvalue</em>()</div>\n\n<p>Now, at the end of a block scope when the compiler emits code to free the stack\nslots for the locals, we can tell which ones need to get hoisted onto the heap.\nWe&rsquo;ll use a new instruction for that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  while (current-&gt;localCount &gt; 0 &amp;&amp;\n         current-&gt;locals[current-&gt;localCount - 1].depth &gt;\n            current-&gt;scopeDepth) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>endScope</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">current</span>-&gt;<span class=\"i\">locals</span>[<span class=\"i\">current</span>-&gt;<span class=\"i\">localCount</span> - <span class=\"n\">1</span>].<span class=\"i\">isCaptured</span>) {\n      <span class=\"i\">emitByte</span>(<span class=\"a\">OP_CLOSE_UPVALUE</span>);\n    } <span class=\"k\">else</span> {\n      <span class=\"i\">emitByte</span>(<span class=\"a\">OP_POP</span>);\n    }\n</pre><pre class=\"insert-after\">    current-&gt;localCount--;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>endScope</em>(), replace 1 line</div>\n\n<p>The instruction requires no operand. We know that the variable will always be\nright on top of the stack at the point that this instruction executes. 
We\ndeclare the instruction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_CLOSURE,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_CLOSE_UPVALUE</span>,\n</pre><pre class=\"insert-after\">  OP_RETURN,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>And add trivial disassembler support for it:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    }\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_CLOSE_UPVALUE</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_CLOSE_UPVALUE&quot;</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_RETURN:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>Excellent. Now the generated bytecode tells the runtime exactly when each\ncaptured local variable must move to the heap. Better, it does so only for the\nlocals that <em>are</em> used by a closure and need this special treatment. This aligns\nwith our general performance goal that we want users to pay only for\nfunctionality that they use. Variables that aren&rsquo;t used by closures live and die\nentirely on the stack just as they did before.</p>\n<h3><a href=\"#tracking-open-upvalues\" id=\"tracking-open-upvalues\"><small>25&#8202;.&#8202;4&#8202;.&#8202;3</small>Tracking open upvalues</a></h3>\n<p>Let&rsquo;s move over to the runtime side. Before we can interpret <code>OP_CLOSE_UPVALUE</code>\ninstructions, we have an issue to resolve. 
Earlier, when I talked about whether\nclosures capture variables or values, I said it was important that if multiple\nclosures access the same variable that they end up with a reference to the\nexact same storage location in memory. That way if one closure writes to the\nvariable, the other closure sees the change.</p>\n<p>Right now, if two closures capture the same <span name=\"indirect\">local</span>\nvariable, the VM creates a separate Upvalue for each one. The necessary sharing\nis missing. When we move the variable off the stack, if we move it into only one\nof the upvalues, the other upvalue will have an orphaned value.</p>\n<aside name=\"indirect\">\n<p>The VM <em>does</em> share upvalues if one closure captures an <em>upvalue</em> from a\nsurrounding function. The nested case works correctly. But if two <em>sibling</em>\nclosures capture the same local variable, they each create a separate\nObjUpvalue.</p>\n</aside>\n<p>To fix that, whenever the VM needs an upvalue that captures a particular local\nvariable slot, we will first search for an existing upvalue pointing to that\nslot. If found, we reuse that. The challenge is that all of the previously\ncreated upvalues are squirreled away inside the upvalue arrays of the various\nclosures. Those closures could be anywhere in the VM&rsquo;s memory.</p>\n<p>The first step is to give the VM its own list of all open upvalues that point to\nvariables still on the stack. Searching a list each time the VM needs an upvalue\nsounds like it might be slow, but in practice, it&rsquo;s not bad. The number of\nvariables on the stack that actually get closed over tends to be small. And\nfunction declarations that <span name=\"create\">create</span> closures are rarely\non performance critical execution paths in the user&rsquo;s program.</p>\n<aside name=\"create\">\n<p>Closures are frequently <em>invoked</em> inside hot loops. 
Think about the closures\npassed to typical higher-order functions on collections like <a href=\"https://en.wikipedia.org/wiki/Map_(higher-order_function)\"><code>map()</code></a> and\n<a href=\"https://en.wikipedia.org/wiki/Filter_(higher-order_function)\"><code>filter()</code></a>. That should be fast. But the function declaration that\n<em>creates</em> the closure happens only once and is usually outside of the loop.</p>\n</aside>\n<p>Even better, we can order the list of open upvalues by the stack slot index they\npoint to. The common case is that a slot has <em>not</em> already been captured<span class=\"em\">&mdash;</span>sharing variables between closures is uncommon<span class=\"em\">&mdash;</span>and closures tend to capture\nlocals near the top of the stack. If we store the open upvalue list in stack\nslot order, as soon as we step past the slot where the local we&rsquo;re capturing\nlives, we know it won&rsquo;t be found. When that local is near the top of the stack,\nwe can exit the loop pretty early.</p>\n<p>Maintaining a sorted list requires inserting elements in the middle efficiently.\nThat suggests using a linked list instead of a dynamic array. 
Since we defined\nthe ObjUpvalue struct ourselves, the easiest implementation is an intrusive list\nthat puts the next pointer right inside the ObjUpvalue struct itself.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  Value* location;\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin struct <em>ObjUpvalue</em></div>\n<pre class=\"insert\">  <span class=\"k\">struct</span> <span class=\"t\">ObjUpvalue</span>* <span class=\"i\">next</span>;\n</pre><pre class=\"insert-after\">} ObjUpvalue;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in struct <em>ObjUpvalue</em></div>\n\n<p>When we allocate an upvalue, it is not attached to any list yet so the link is\n<code>NULL</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  upvalue-&gt;location = slot;\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>newUpvalue</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">upvalue</span>-&gt;<span class=\"i\">next</span> = <span class=\"a\">NULL</span>;\n</pre><pre class=\"insert-after\">  return upvalue;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>newUpvalue</em>()</div>\n\n<p>The VM owns the list, so the head pointer goes right inside the main VM struct.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  Table strings;\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nin struct <em>VM</em></div>\n<pre class=\"insert\">  <span class=\"t\">ObjUpvalue</span>* <span class=\"i\">openUpvalues</span>;\n</pre><pre class=\"insert-after\">  Obj* objects;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, in struct <em>VM</em></div>\n\n<p>The list starts out empty.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  vm.frameCount = 0;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>resetStack</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">vm</span>.<span class=\"i\">openUpvalues</span> = <span 
class=\"a\">NULL</span>;\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>resetStack</em>()</div>\n\n<p>Starting with the first upvalue pointed to by the VM, each open upvalue points\nto the next open upvalue that references a local variable farther down the\nstack. This script, for example,</p>\n<div class=\"codehilite\"><pre>{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">1</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">f</span>() {\n    <span class=\"k\">print</span> <span class=\"i\">a</span>;\n  }\n  <span class=\"k\">var</span> <span class=\"i\">b</span> = <span class=\"n\">2</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">g</span>() {\n    <span class=\"k\">print</span> <span class=\"i\">b</span>;\n  }\n  <span class=\"k\">var</span> <span class=\"i\">c</span> = <span class=\"n\">3</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">h</span>() {\n    <span class=\"k\">print</span> <span class=\"i\">c</span>;\n  }\n}\n</pre></div>\n<p>should produce a series of linked upvalues like so:</p><img src=\"image/closures/linked-list.png\" alt=\"Three upvalues in a linked list.\"/>\n<p>Whenever we close over a local variable, before creating a new upvalue, we look\nfor an existing one in the list.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static ObjUpvalue* captureUpvalue(Value* local) {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>captureUpvalue</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">ObjUpvalue</span>* <span class=\"i\">prevUpvalue</span> = <span class=\"a\">NULL</span>;\n  <span class=\"t\">ObjUpvalue</span>* <span class=\"i\">upvalue</span> = <span class=\"i\">vm</span>.<span class=\"i\">openUpvalues</span>;\n  <span class=\"k\">while</span> (<span class=\"i\">upvalue</span> != <span class=\"a\">NULL</span> &amp;&amp; <span class=\"i\">upvalue</span>-&gt;<span class=\"i\">location</span> &gt; <span 
class=\"i\">local</span>) {\n    <span class=\"i\">prevUpvalue</span> = <span class=\"i\">upvalue</span>;\n    <span class=\"i\">upvalue</span> = <span class=\"i\">upvalue</span>-&gt;<span class=\"i\">next</span>;\n  }\n\n  <span class=\"k\">if</span> (<span class=\"i\">upvalue</span> != <span class=\"a\">NULL</span> &amp;&amp; <span class=\"i\">upvalue</span>-&gt;<span class=\"i\">location</span> == <span class=\"i\">local</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">upvalue</span>;\n  }\n\n</pre><pre class=\"insert-after\">  ObjUpvalue* createdUpvalue = newUpvalue(local);\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>captureUpvalue</em>()</div>\n\n<p>We start at the <span name=\"head\">head</span> of the list, which is the upvalue\nclosest to the top of the stack. We walk through the list, using a little\npointer comparison to iterate past every upvalue pointing to slots above the one\nwe&rsquo;re looking for. While we do that, we keep track of the preceding upvalue on\nthe list. We&rsquo;ll need to update that node&rsquo;s <code>next</code> pointer if we end up inserting\na node after it.</p>\n<aside name=\"head\">\n<p>It&rsquo;s a singly linked list. It&rsquo;s not like we have any other choice than to start\nat the head and go forward from there.</p>\n</aside>\n<p>There are three reasons we can exit the loop:</p>\n<ol>\n<li>\n<p><strong>The local slot we stopped at <em>is</em> the slot we&rsquo;re looking for.</strong> We found\nan existing upvalue capturing the variable, so we reuse that upvalue.</p>\n</li>\n<li>\n<p><strong>We ran out of upvalues to search.</strong> When <code>upvalue</code> is <code>NULL</code>, it means\nevery open upvalue in the list points to locals above the slot we&rsquo;re looking\nfor, or (more likely) the upvalue list is empty. 
Either way, we didn&rsquo;t find\nan upvalue for our slot.</p>\n</li>\n<li>\n<p><strong>We found an upvalue whose local slot is <em>below</em> the one we&rsquo;re looking\nfor.</strong> Since the list is sorted, that means we&rsquo;ve gone past the slot we are\nclosing over, and thus there must not be an existing upvalue for it.</p>\n</li>\n</ol>\n<p>In the first case, we&rsquo;re done and we&rsquo;ve returned. Otherwise, we create a new\nupvalue for our local slot and insert it into the list at the right location.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  ObjUpvalue* createdUpvalue = newUpvalue(local);\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>captureUpvalue</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">createdUpvalue</span>-&gt;<span class=\"i\">next</span> = <span class=\"i\">upvalue</span>;\n\n  <span class=\"k\">if</span> (<span class=\"i\">prevUpvalue</span> == <span class=\"a\">NULL</span>) {\n    <span class=\"i\">vm</span>.<span class=\"i\">openUpvalues</span> = <span class=\"i\">createdUpvalue</span>;\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">prevUpvalue</span>-&gt;<span class=\"i\">next</span> = <span class=\"i\">createdUpvalue</span>;\n  }\n\n</pre><pre class=\"insert-after\">  return createdUpvalue;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>captureUpvalue</em>()</div>\n\n<p>The current incarnation of this function already creates the upvalue, so we only\nneed to add code to insert the upvalue into the list. We exited the list\ntraversal by either going past the end of the list, or by stopping on the first\nupvalue whose stack slot is below the one we&rsquo;re looking for. 
In either case,\nthat means we need to insert the new upvalue <em>before</em> the object pointed at by\n<code>upvalue</code> (which may be <code>NULL</code> if we hit the end of the list).</p>\n<p>As you may have learned in Data Structures 101, to insert a node into a linked\nlist, you point the new node&rsquo;s <code>next</code> at the node that follows it and set the\n<code>next</code> pointer of the previous node to point to your new one.\nWe have been conveniently keeping track of that preceding node as we walked the\nlist. We also need to handle the <span name=\"double\">special</span> case where\nwe are inserting a new upvalue at the head of the list, in which case the &ldquo;next&rdquo;\npointer is the VM&rsquo;s head pointer.</p>\n<aside name=\"double\">\n<p>There is a shorter implementation that handles updating either the head pointer\nor the previous upvalue&rsquo;s <code>next</code> pointer uniformly by using a pointer to a\npointer, but that kind of code confuses almost everyone who hasn&rsquo;t reached some\nZen master level of pointer expertise. I went with the basic <code>if</code> statement\napproach.</p>\n</aside>\n<p>With this updated function, the VM now ensures that there is only ever a single\nObjUpvalue for any given local slot. If two closures capture the same variable,\nthey will get the same upvalue. We&rsquo;re ready to move the captured variables off the\nstack now.</p>\n<h3><a href=\"#closing-upvalues-at-runtime\" id=\"closing-upvalues-at-runtime\"><small>25&#8202;.&#8202;4&#8202;.&#8202;4</small>Closing upvalues at runtime</a></h3>\n<p>The compiler helpfully emits an <code>OP_CLOSE_UPVALUE</code> instruction to tell the VM\nexactly when a local variable should be hoisted onto the heap. 
Executing that\ninstruction is the interpreter&rsquo;s responsibility.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_CLOSE_UPVALUE</span>:\n        <span class=\"i\">closeUpvalues</span>(<span class=\"i\">vm</span>.<span class=\"i\">stackTop</span> - <span class=\"n\">1</span>);\n        <span class=\"i\">pop</span>();\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      case OP_RETURN: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>When we reach the instruction, the variable we are hoisting is right on top of\nthe stack. We call a helper function, passing the address of that stack slot.\nThat function is responsible for closing the upvalue and moving the local from\nthe stack to the heap. After that, the VM is free to discard the stack slot,\nwhich it does by calling <code>pop()</code>.</p>\n<p>The fun stuff happens here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>captureUpvalue</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">closeUpvalues</span>(<span class=\"t\">Value</span>* <span class=\"i\">last</span>) {\n  <span class=\"k\">while</span> (<span class=\"i\">vm</span>.<span class=\"i\">openUpvalues</span> != <span class=\"a\">NULL</span> &amp;&amp;\n         <span class=\"i\">vm</span>.<span class=\"i\">openUpvalues</span>-&gt;<span class=\"i\">location</span> &gt;= <span class=\"i\">last</span>) {\n    <span class=\"t\">ObjUpvalue</span>* <span class=\"i\">upvalue</span> = <span class=\"i\">vm</span>.<span class=\"i\">openUpvalues</span>;\n    <span class=\"i\">upvalue</span>-&gt;<span class=\"i\">closed</span> = *<span class=\"i\">upvalue</span>-&gt;<span class=\"i\">location</span>;\n    <span 
class=\"i\">upvalue</span>-&gt;<span class=\"i\">location</span> = &amp;<span class=\"i\">upvalue</span>-&gt;<span class=\"i\">closed</span>;\n    <span class=\"i\">vm</span>.<span class=\"i\">openUpvalues</span> = <span class=\"i\">upvalue</span>-&gt;<span class=\"i\">next</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>captureUpvalue</em>()</div>\n\n<p>This function takes a pointer to a stack slot. It closes every open upvalue it\ncan find that points to that slot or any slot above it on the stack. Right now,\nwe pass a pointer only to the top slot on the stack, so the &ldquo;or above it&rdquo; part\ndoesn&rsquo;t come into play, but it will soon.</p>\n<p>To do this, we walk the VM&rsquo;s list of open upvalues, again from top to bottom. If\nan upvalue&rsquo;s location points into the range of slots we&rsquo;re closing, we close the\nupvalue. Otherwise, once we reach an upvalue outside of the range, we know the\nrest will be too, so we stop iterating.</p>\n<p>The way an upvalue gets closed is pretty <span name=\"cool\">cool</span>. First,\nwe copy the variable&rsquo;s value into the <code>closed</code> field in the ObjUpvalue. That&rsquo;s\nwhere closed-over variables live on the heap. The <code>OP_GET_UPVALUE</code> and\n<code>OP_SET_UPVALUE</code> instructions need to look for the variable there after it&rsquo;s\nbeen moved. We could add some conditional logic in the interpreter code for\nthose instructions to check some flag for whether the upvalue is open or closed.</p>\n<p>But there is already a level of indirection in play<span class=\"em\">&mdash;</span>those instructions\ndereference the <code>location</code> pointer to get to the variable&rsquo;s value. 
When the\nvariable moves from the stack to the <code>closed</code> field, we simply update that\n<code>location</code> to the address of the ObjUpvalue&rsquo;s <em>own</em> <code>closed</code> field.</p>\n<aside name=\"cool\">\n<p>I&rsquo;m not praising myself here. This is all the Lua dev team&rsquo;s innovation.</p>\n</aside><img src=\"image/closures/closing.png\" alt=\"Moving a value from the stack to the upvalue's 'closed' field and then pointing the 'location' field to it.\"/>\n<p>We don&rsquo;t need to change how <code>OP_GET_UPVALUE</code> and <code>OP_SET_UPVALUE</code> are\ninterpreted at all. That keeps them simple, which in turn keeps them fast. We do\nneed to add the new field to ObjUpvalue, though.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  Value* location;\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin struct <em>ObjUpvalue</em></div>\n<pre class=\"insert\">  <span class=\"t\">Value</span> <span class=\"i\">closed</span>;\n</pre><pre class=\"insert-after\">  struct ObjUpvalue* next;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in struct <em>ObjUpvalue</em></div>\n\n<p>And we should initialize it to <code>nil</code> when we create an ObjUpvalue so there&rsquo;s no\nuninitialized memory floating around.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  ObjUpvalue* upvalue = ALLOCATE_OBJ(ObjUpvalue, OBJ_UPVALUE);\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>newUpvalue</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">upvalue</span>-&gt;<span class=\"i\">closed</span> = <span class=\"a\">NIL_VAL</span>;\n</pre><pre class=\"insert-after\">  upvalue-&gt;location = slot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>newUpvalue</em>()</div>\n\n<p>Whenever the compiler reaches the end of a block, it discards all local\nvariables in that block and emits an <code>OP_CLOSE_UPVALUE</code> for each local variable\nthat was closed over. 
The compiler <span name=\"close\">does</span> <em>not</em> emit any\ninstructions at the end of the outermost block scope that defines a function\nbody. That scope contains the function&rsquo;s parameters and any locals declared\nimmediately inside the function. Those need to get closed too.</p>\n<aside name=\"close\">\n<p>There&rsquo;s nothing <em>preventing</em> us from closing the outermost function scope in the\ncompiler and emitting <code>OP_POP</code> and <code>OP_CLOSE_UPVALUE</code> instructions. Doing so is\njust unnecessary because the runtime discards all of the stack slots used by the\nfunction implicitly when it pops the call frame.</p>\n</aside>\n<p>This is the reason <code>closeUpvalues()</code> accepts a pointer to a stack slot. When a\nfunction returns, we call that same helper and pass in the first stack slot\nowned by the function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        Value result = pop();\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">        <span class=\"i\">closeUpvalues</span>(<span class=\"i\">frame</span>-&gt;<span class=\"i\">slots</span>);\n</pre><pre class=\"insert-after\">        vm.frameCount--;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>By passing the first slot in the function&rsquo;s stack window, we close every\nremaining open upvalue owned by the returning function. And with that, we now\nhave a fully functioning closure implementation. Closed-over variables live as\nlong as they are needed by the functions that capture them.</p>\n<p>This was a lot of work! In jlox, closures fell out naturally from our\nenvironment representation. In clox, we had to add a lot of code<span class=\"em\">&mdash;</span>new bytecode\ninstructions, more data structures in the compiler, and new runtime objects. 
The\nVM very much treats variables in closures as different from other variables.</p>\n<p>There is a rationale for that. In terms of implementation complexity, jlox gave\nus closures &ldquo;for free&rdquo;. But in terms of <em>performance</em>, jlox&rsquo;s closures are\nanything but. By allocating <em>all</em> environments on the heap, jlox pays a\nsignificant performance price for <em>all</em> local variables, even the majority that\nare never captured by closures.</p>\n<p>With clox, we have a more complex system, but that allows us to tailor the\nimplementation to fit the two use patterns we observe for local variables. Most\nvariables do have stack semantics, so we allocate them entirely on the\nstack, which is simple and fast. Then, for the few local variables where that\ndoesn&rsquo;t work, we have a second, slower path we can opt in to as needed.</p>\n<p>Fortunately, users don&rsquo;t perceive the complexity. From their perspective, local\nvariables in Lox are simple and uniform. The <em>language itself</em> is as simple as\njlox&rsquo;s implementation. But under the hood, clox is watching what the user does\nand optimizing for their specific uses. As your language implementations grow in\nsophistication, you&rsquo;ll find yourself doing this more. A large fraction of\n&ldquo;optimization&rdquo; is about adding special-case code that detects certain uses and\nprovides a custom-built, faster path for code that fits that pattern.</p>\n<p>We have lexical scoping fully working in clox now, which is a major milestone.\nAnd, now that we have functions and variables with complex lifetimes, we also\nhave a <em>lot</em> of objects floating around in clox&rsquo;s heap, with a web of pointers\nstringing them together. 
The <a href=\"garbage-collection.html\">next step</a> is figuring out how to manage that\nmemory so that we can free some of those objects when they&rsquo;re no longer needed.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Wrapping every ObjFunction in an ObjClosure introduces a level of\nindirection that has a performance cost. That cost isn&rsquo;t necessary for\nfunctions that do not close over any variables, but it does let the runtime\ntreat all calls uniformly.</p>\n<p>Change clox to wrap only the functions that need upvalues in ObjClosures. How\ndo the code complexity and performance compare to always wrapping\nfunctions? Take care to benchmark programs that do and do not use closures.\nHow should you weight the importance of each benchmark? If one gets slower\nand one faster, how do you decide what trade-off to make to choose an\nimplementation strategy?</p>\n</li>\n<li>\n<p>Read the design note below. I&rsquo;ll wait. Now, how do you think Lox <em>should</em>\nbehave? Change the implementation to create a new variable for each loop\niteration.</p>\n</li>\n<li>\n<p>A <a href=\"http://wiki.c2.com/?ClosuresAndObjectsAreEquivalent\">famous koan</a> teaches us that &ldquo;objects are a poor man&rsquo;s closure&rdquo;\n(and vice versa). Our VM doesn&rsquo;t support objects yet, but now that we have\nclosures we can approximate them. Using closures, write a Lox program that\nmodels two-dimensional vector &ldquo;objects&rdquo;. 
It should:</p>\n<ul>\n<li>\n<p>Define a &ldquo;constructor&rdquo; function to create a new vector with the given\n<em>x</em> and <em>y</em> coordinates.</p>\n</li>\n<li>\n<p>Provide &ldquo;methods&rdquo; to access the <em>x</em> and <em>y</em> coordinates of values\nreturned from that constructor.</p>\n</li>\n<li>\n<p>Define an addition &ldquo;method&rdquo; that adds two vectors and produces a third.</p>\n</li>\n</ul>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: Closing Over the Loop Variable</a></h2>\n<p>Closures capture variables. When two closures capture the same variable, they\nshare a reference to the same underlying storage location. This fact is visible\nwhen new values are assigned to the variable. Obviously, if two closures capture\n<em>different</em> variables, there is no sharing.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">globalOne</span>;\n<span class=\"k\">var</span> <span class=\"i\">globalTwo</span>;\n\n<span class=\"k\">fun</span> <span class=\"i\">main</span>() {\n  {\n    <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;one&quot;</span>;\n    <span class=\"k\">fun</span> <span class=\"i\">one</span>() {\n      <span class=\"k\">print</span> <span class=\"i\">a</span>;\n    }\n    <span class=\"i\">globalOne</span> = <span class=\"i\">one</span>;\n  }\n\n  {\n    <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;two&quot;</span>;\n    <span class=\"k\">fun</span> <span class=\"i\">two</span>() {\n      <span class=\"k\">print</span> <span class=\"i\">a</span>;\n    }\n    <span class=\"i\">globalTwo</span> = <span class=\"i\">two</span>;\n  }\n}\n\n<span class=\"i\">main</span>();\n<span class=\"i\">globalOne</span>();\n<span class=\"i\">globalTwo</span>();\n</pre></div>\n<p>This prints &ldquo;one&rdquo; then &ldquo;two&rdquo;. 
In this example, it&rsquo;s pretty clear that the two\n<code>a</code> variables are different. But it&rsquo;s not always so obvious. Consider:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">globalOne</span>;\n<span class=\"k\">var</span> <span class=\"i\">globalTwo</span>;\n\n<span class=\"k\">fun</span> <span class=\"i\">main</span>() {\n  <span class=\"k\">for</span> (<span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">1</span>; <span class=\"i\">a</span> &lt;= <span class=\"n\">2</span>; <span class=\"i\">a</span> = <span class=\"i\">a</span> + <span class=\"n\">1</span>) {\n    <span class=\"k\">fun</span> <span class=\"i\">closure</span>() {\n      <span class=\"k\">print</span> <span class=\"i\">a</span>;\n    }\n    <span class=\"k\">if</span> (<span class=\"i\">globalOne</span> == <span class=\"k\">nil</span>) {\n      <span class=\"i\">globalOne</span> = <span class=\"i\">closure</span>;\n    } <span class=\"k\">else</span> {\n      <span class=\"i\">globalTwo</span> = <span class=\"i\">closure</span>;\n    }\n  }\n}\n\n<span class=\"i\">main</span>();\n<span class=\"i\">globalOne</span>();\n<span class=\"i\">globalTwo</span>();\n</pre></div>\n<p>The code is convoluted because Lox has no collection types. The important part\nis that the <code>main()</code> function does two iterations of a <code>for</code> loop. Each time\nthrough the loop, it creates a closure that captures the loop variable. It\nstores the first closure in <code>globalOne</code> and the second in <code>globalTwo</code>.</p>\n<p>There are definitely two different closures. Do they close over two different\nvariables? Is there only one <code>a</code> for the entire duration of the loop, or does\neach iteration get its own distinct <code>a</code> variable?</p>\n<p>The script here is strange and contrived, but this does show up in real code\nin languages that aren&rsquo;t as minimal as clox. 
Here&rsquo;s a JavaScript example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">closures</span> = [];\n<span class=\"k\">for</span> (<span class=\"k\">var</span> <span class=\"i\">i</span> = <span class=\"n\">1</span>; <span class=\"i\">i</span> &lt;= <span class=\"n\">2</span>; <span class=\"i\">i</span>++) {\n  <span class=\"i\">closures</span>.<span class=\"i\">push</span>(<span class=\"k\">function</span> () { <span class=\"i\">console</span>.<span class=\"i\">log</span>(<span class=\"i\">i</span>); });\n}\n\n<span class=\"i\">closures</span>[<span class=\"n\">0</span>]();\n<span class=\"i\">closures</span>[<span class=\"n\">1</span>]();\n</pre></div>\n<p>Does this print &ldquo;1&rdquo; then &ldquo;2&rdquo;, or does it print <span name=\"three\">&ldquo;3&rdquo;</span>\ntwice? You may be surprised to hear that it prints &ldquo;3&rdquo; twice. In this JavaScript\nprogram, there is only a single <code>i</code> variable whose lifetime includes all\niterations of the loop, including the final exit.</p>\n<aside name=\"three\">\n<p>You&rsquo;re wondering how <em>three</em> enters the picture? After the second iteration,\n<code>i++</code> is executed, which increments <code>i</code> to three. That&rsquo;s what causes <code>i &lt;= 2</code> to\nevaluate to false and end the loop. If <code>i</code> never reached three, the loop would\nrun forever.</p>\n</aside>\n<p>If you&rsquo;re familiar with JavaScript, you probably know that variables declared\nusing <code>var</code> are implicitly <em>hoisted</em> to the surrounding function or top-level\nscope. 
It&rsquo;s as if you really wrote this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">closures</span> = [];\n<span class=\"k\">var</span> <span class=\"i\">i</span>;\n<span class=\"k\">for</span> (<span class=\"i\">i</span> = <span class=\"n\">1</span>; <span class=\"i\">i</span> &lt;= <span class=\"n\">2</span>; <span class=\"i\">i</span>++) {\n  <span class=\"i\">closures</span>.<span class=\"i\">push</span>(<span class=\"k\">function</span> () { <span class=\"i\">console</span>.<span class=\"i\">log</span>(<span class=\"i\">i</span>); });\n}\n\n<span class=\"i\">closures</span>[<span class=\"n\">0</span>]();\n<span class=\"i\">closures</span>[<span class=\"n\">1</span>]();\n</pre></div>\n<p>At that point, it&rsquo;s clearer that there is only a single <code>i</code>. Now consider if\nyou change the program to use the newer <code>let</code> keyword:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">closures</span> = [];\n<span class=\"k\">for</span> (<span class=\"k\">let</span> <span class=\"i\">i</span> = <span class=\"n\">1</span>; <span class=\"i\">i</span> &lt;= <span class=\"n\">2</span>; <span class=\"i\">i</span>++) {\n  <span class=\"i\">closures</span>.<span class=\"i\">push</span>(<span class=\"k\">function</span> () { <span class=\"i\">console</span>.<span class=\"i\">log</span>(<span class=\"i\">i</span>); });\n}\n\n<span class=\"i\">closures</span>[<span class=\"n\">0</span>]();\n<span class=\"i\">closures</span>[<span class=\"n\">1</span>]();\n</pre></div>\n<p>Does this new program behave the same? Nope. In this case, it prints &ldquo;1&rdquo; then\n&ldquo;2&rdquo;. Each closure gets its own <code>i</code>. That&rsquo;s sort of strange when you think about\nit. The increment clause is <code>i++</code>. That looks very much like it is assigning to\nand mutating an existing variable, not creating a new one.</p>\n<p>Let&rsquo;s try some other languages. 
Here&rsquo;s Python:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">closures</span> = []\n<span class=\"k\">for</span> <span class=\"i\">i</span> <span class=\"k\">in</span> <span class=\"k\">range</span>(<span class=\"n\">1</span>, <span class=\"n\">3</span>):\n  <span class=\"i\">closures</span>.<span class=\"i\">append</span>(<span class=\"k\">lambda</span>: <span class=\"k\">print</span>(<span class=\"i\">i</span>))\n\n<span class=\"i\">closures</span>[<span class=\"n\">0</span>]()\n<span class=\"i\">closures</span>[<span class=\"n\">1</span>]()\n</pre></div>\n<p>Python doesn&rsquo;t really have block scope. Variables are implicitly declared and\nare automatically scoped to the surrounding function. Kind of like hoisting in\nJS, now that I think about it. So both closures capture the same variable.\nUnlike C, though, we don&rsquo;t exit the loop by incrementing <code>i</code> <em>past</em> the last\nvalue, so this prints &ldquo;2&rdquo; twice.</p>\n<p>What about Ruby? Ruby has two typical ways to iterate numerically. Here&rsquo;s the\nclassic imperative style:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">closures</span> = []\n<span class=\"k\">for</span> <span class=\"i\">i</span> <span class=\"k\">in</span> <span class=\"n\">1</span>..<span class=\"n\">2</span> <span class=\"k\">do</span>\n  <span class=\"i\">closures</span> &lt;&lt; <span class=\"k\">lambda</span> { <span class=\"i\">puts</span> <span class=\"i\">i</span> }\n<span class=\"k\">end</span>\n\n<span class=\"i\">closures</span>[<span class=\"n\">0</span>].<span class=\"i\">call</span>\n<span class=\"i\">closures</span>[<span class=\"n\">1</span>].<span class=\"i\">call</span>\n</pre></div>\n<p>This, like Python, prints &ldquo;2&rdquo; twice. 
But the more idiomatic Ruby style is using\na higher-order <code>each()</code> method on range objects:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">closures</span> = []\n(<span class=\"n\">1</span>..<span class=\"n\">2</span>).<span class=\"i\">each</span> <span class=\"k\">do</span> |<span class=\"i\">i</span>|\n  <span class=\"i\">closures</span> &lt;&lt; <span class=\"k\">lambda</span> { <span class=\"i\">puts</span> <span class=\"i\">i</span> }\n<span class=\"k\">end</span>\n\n<span class=\"i\">closures</span>[<span class=\"n\">0</span>].<span class=\"i\">call</span>\n<span class=\"i\">closures</span>[<span class=\"n\">1</span>].<span class=\"i\">call</span>\n</pre></div>\n<p>If you&rsquo;re not familiar with Ruby, the <code>do |i| ... end</code> part is basically a\nclosure that gets created and passed to the <code>each()</code> method. The <code>|i|</code> is the\nparameter signature for the closure. The <code>each()</code> method invokes that closure\ntwice, passing in 1 for <code>i</code> the first time and 2 the second time.</p>\n<p>In this case, the &ldquo;loop variable&rdquo; is really a function parameter. And, since\neach iteration of the loop is a separate invocation of the function, those are\ndefinitely separate variables for each call. So this prints &ldquo;1&rdquo; then &ldquo;2&rdquo;.</p>\n<p>If a language has a higher-level iterator-based looping structure like <code>foreach</code>\nin C#, Java&rsquo;s &ldquo;enhanced for&rdquo;, <code>for-of</code> in JavaScript, <code>for-in</code> in Dart, etc.,\nthen I think it&rsquo;s natural to the reader to have each iteration create a new\nvariable. The code <em>looks</em> like a new variable because the loop header looks\nlike a variable declaration. 
And there&rsquo;s no increment expression that looks like\nit&rsquo;s mutating that variable to advance to the next step.</p>\n<p>If you dig around Stack Overflow and other places, you find evidence that this is\nwhat users expect, because they are very surprised when they <em>don&rsquo;t</em> get it. In\nparticular, C# originally did <em>not</em> create a new loop variable for each\niteration of a <code>foreach</code> loop. This was such a frequent source of user confusion\nthat the C# team took the very rare step of shipping a breaking change to the language.\nIn C# 5, each iteration creates a fresh variable.</p>\n<p>Old C-style <code>for</code> loops are harder. The increment clause really does look like\nmutation. That implies there is a single variable that&rsquo;s getting updated each\nstep. But it&rsquo;s almost never <em>useful</em> for each iteration to share a loop\nvariable. The only time you can even detect this is when closures capture it.\nAnd it&rsquo;s rarely helpful to have a closure that references a variable whose value\nis whatever value caused you to exit the loop.</p>\n<p>The pragmatically useful answer is probably to do what JavaScript does with\n<code>let</code> in <code>for</code> loops. Make it look like mutation but actually create a new\nvariable each time, because that&rsquo;s what users want. It is kind of weird when you\nthink about it, though.</p>\n</div>\n\n<footer>\n<a href=\"garbage-collection.html\" class=\"next\">\n  Next Chapter: &ldquo;Garbage Collection&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/compiling-expressions.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Compiling Expressions &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Compiling Expressions<small>17</small></a></h3>\n\n<ul>\n    <li><a href=\"#single-pass-compilation\"><small>17.1</small> Single-Pass Compilation</a></li>\n    <li><a href=\"#parsing-tokens\"><small>17.2</small> Parsing Tokens</a></li>\n    <li><a href=\"#emitting-bytecode\"><small>17.3</small> Emitting Bytecode</a></li>\n    <li><a href=\"#parsing-prefix-expressions\"><small>17.4</small> Parsing Prefix Expressions</a></li>\n    <li><a href=\"#parsing-infix-expressions\"><small>17.5</small> Parsing Infix 
Expressions</a></li>\n    <li><a href=\"#a-pratt-parser\"><small>17.6</small> A Pratt Parser</a></li>\n    <li><a href=\"#dumping-chunks\"><small>17.7</small> Dumping Chunks</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>It&#x27;s Just Parsing</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"scanning-on-demand.html\" title=\"Scanning on Demand\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"types-of-values.html\" title=\"Types of Values\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"scanning-on-demand.html\" title=\"Scanning on Demand\" class=\"prev\">←</a>\n<a href=\"types-of-values.html\" title=\"Types of Values\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Compiling Expressions<small>17</small></a></h3>\n\n<ul>\n    <li><a href=\"#single-pass-compilation\"><small>17.1</small> Single-Pass Compilation</a></li>\n    <li><a href=\"#parsing-tokens\"><small>17.2</small> Parsing Tokens</a></li>\n    <li><a href=\"#emitting-bytecode\"><small>17.3</small> Emitting Bytecode</a></li>\n    <li><a href=\"#parsing-prefix-expressions\"><small>17.4</small> Parsing Prefix Expressions</a></li>\n    <li><a href=\"#parsing-infix-expressions\"><small>17.5</small> Parsing Infix Expressions</a></li>\n    <li><a href=\"#a-pratt-parser\"><small>17.6</small> A Pratt Parser</a></li>\n    <li><a href=\"#dumping-chunks\"><small>17.7</small> Dumping Chunks</a></li>\n    <li 
class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>It&#x27;s Just Parsing</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"scanning-on-demand.html\" title=\"Scanning on Demand\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"types-of-values.html\" title=\"Types of Values\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">17</div>\n  <h1>Compiling Expressions</h1>\n\n<blockquote>\n<p>In the middle of the journey of our life I found myself within a dark woods\nwhere the straight way was lost.</p>\n<p><cite>Dante Alighieri, <em>Inferno</em></cite></p>\n</blockquote>\n<p>This chapter is exciting for not one, not two, but <em>three</em> reasons. First, it\nprovides the final segment of our VM&rsquo;s execution pipeline. Once in place, we can\nplumb the user&rsquo;s source code from scanning all the way through to executing it.</p><img src=\"image/compiling-expressions/pipeline.png\" alt=\"Lowering the 'compiler' section of pipe between 'scanner' and 'VM'.\" />\n<p>Second, we get to write an actual, honest-to-God <em>compiler</em>. It parses source\ncode and outputs a low-level series of binary instructions. Sure, it&rsquo;s <span\nname=\"wirth\">bytecode</span> and not some chip&rsquo;s native instruction set, but\nit&rsquo;s way closer to the metal than jlox was. We&rsquo;re about to be real language\nhackers.</p>\n<aside name=\"wirth\">\n<p>Bytecode was good enough for Niklaus Wirth, and no one questions his street\ncred.</p>\n</aside>\n<p><span name=\"pratt\">Third</span> and finally, I get to show you one of my\nabsolute favorite algorithms: Vaughan Pratt&rsquo;s &ldquo;top-down operator precedence\nparsing&rdquo;. 
It&rsquo;s the most elegant way I know to parse expressions. It gracefully\nhandles prefix operators, postfix, infix, <em>mixfix</em>, any kind of <em>-fix</em> you got.\nIt deals with precedence and associativity without breaking a sweat. I love it.</p>\n<aside name=\"pratt\">\n<p>Pratt parsers are a sort of oral tradition in industry. No compiler or language\nbook I&rsquo;ve read teaches them. Academia is very focused on generated parsers, and\nPratt&rsquo;s technique is for handwritten ones, so it gets overlooked.</p>\n<p>But in production compilers, where hand-rolled parsers are common, you&rsquo;d be\nsurprised how many people know it. Ask where they learned it, and it&rsquo;s always,\n&ldquo;Oh, I worked on this compiler years ago and my coworker said they took it from\nthis old front end<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>&rdquo;</p>\n</aside>\n<p>As usual, before we get to the fun stuff, we&rsquo;ve got some preliminaries to work\nthrough. You have to eat your vegetables before you get dessert. 
First, let&rsquo;s\nditch that temporary scaffolding we wrote for testing the scanner and replace it\nwith something more useful.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">InterpretResult interpret(const char* source) {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>interpret</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">  <span class=\"t\">Chunk</span> <span class=\"i\">chunk</span>;\n  <span class=\"i\">initChunk</span>(&amp;<span class=\"i\">chunk</span>);\n\n  <span class=\"k\">if</span> (!<span class=\"i\">compile</span>(<span class=\"i\">source</span>, &amp;<span class=\"i\">chunk</span>)) {\n    <span class=\"i\">freeChunk</span>(&amp;<span class=\"i\">chunk</span>);\n    <span class=\"k\">return</span> <span class=\"a\">INTERPRET_COMPILE_ERROR</span>;\n  }\n\n  <span class=\"i\">vm</span>.<span class=\"i\">chunk</span> = &amp;<span class=\"i\">chunk</span>;\n  <span class=\"i\">vm</span>.<span class=\"i\">ip</span> = <span class=\"i\">vm</span>.<span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>;\n\n  <span class=\"t\">InterpretResult</span> <span class=\"i\">result</span> = <span class=\"i\">run</span>();\n\n  <span class=\"i\">freeChunk</span>(&amp;<span class=\"i\">chunk</span>);\n  <span class=\"k\">return</span> <span class=\"i\">result</span>;\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>interpret</em>(), replace 2 lines</div>\n\n<p>We create a new empty chunk and pass it over to the compiler. The compiler will\ntake the user&rsquo;s program and fill up the chunk with bytecode. At least, that&rsquo;s\nwhat it will do if the program doesn&rsquo;t have any compile errors. If it does\nencounter an error, <code>compile()</code> returns <code>false</code> and we discard the unusable\nchunk.</p>\n<p>Otherwise, we send the completed chunk over to the VM to be executed. When the\nVM finishes, we free the chunk and we&rsquo;re done. 
As you can see, the signature to\n<code>compile()</code> is different now.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define clox_compiler_h\n\n</pre><div class=\"source-file\"><em>compiler.h</em><br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;vm.h&quot;</span>\n\n<span class=\"t\">bool</span> <span class=\"i\">compile</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">source</span>, <span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.h</em>, replace 1 line</div>\n\n<p>We pass in the chunk where the compiler will write the code, and then\n<code>compile()</code> returns whether or not compilation succeeded. We make the same\nchange to the signature in the implementation.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;scanner.h&quot;\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nfunction <em>compile</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"t\">bool</span> <span class=\"i\">compile</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">source</span>, <span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>) {\n</pre><pre class=\"insert-after\">  initScanner(source);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, function <em>compile</em>(), replace 1 line</div>\n\n<p>That call to <code>initScanner()</code> is the only line that survives this chapter. 
Rip\nout the temporary code we wrote to test the scanner and replace it with these\nthree lines:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  initScanner(source);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>compile</em>()<br>\nreplace 13 lines</div>\n<pre class=\"insert\">  <span class=\"i\">advance</span>();\n  <span class=\"i\">expression</span>();\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_EOF</span>, <span class=\"s\">&quot;Expect end of expression.&quot;</span>);\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>compile</em>(), replace 13 lines</div>\n\n<p>The call to <code>advance()</code> &ldquo;primes the pump&rdquo; on the scanner. We&rsquo;ll see what it does\nsoon. Then we parse a single expression. We aren&rsquo;t going to do statements yet,\nso that&rsquo;s the only subset of the grammar we support. We&rsquo;ll revisit this when we\n<a href=\"global-variables.html\">add statements in a few chapters</a>. After we compile the expression, we\nshould be at the end of the source code, so we check for the sentinel EOF token.</p>\n<p>We&rsquo;re going to spend the rest of the chapter making this function work,\nespecially that little <code>expression()</code> call. Normally, we&rsquo;d dive right into that\nfunction definition and work our way through the implementation from top to\nbottom.</p>\n<p>This chapter is <span name=\"blog\">different</span>. Pratt&rsquo;s parsing technique is\nremarkably simple once you have it all loaded in your head, but it&rsquo;s a little\ntricky to break into bite-sized pieces. It&rsquo;s recursive, of course, which is part\nof the problem. But it also relies on a big table of data. 
As we build up the\nalgorithm, that table grows additional columns.</p>\n<aside name=\"blog\">\n<p>If this chapter isn&rsquo;t clicking with you and you&rsquo;d like another take on the\nconcepts, I wrote an article that teaches the same algorithm but using Java and\nan object-oriented style: <a href=\"http://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-expression-parsing-made-easy/\">&ldquo;Pratt Parsing: Expression Parsing Made Easy&rdquo;</a>.</p>\n</aside>\n<p>I don&rsquo;t want to revisit 40-something lines of code each time we extend the\ntable. So we&rsquo;re going to work our way into the core of the parser from the\noutside and cover all of the surrounding bits before we get to the juicy center.\nThis will require a little more patience and mental scratch space than most\nchapters, but it&rsquo;s the best I could do.</p>\n<h2><a href=\"#single-pass-compilation\" id=\"single-pass-compilation\"><small>17&#8202;.&#8202;1</small>Single-Pass Compilation</a></h2>\n<p>A compiler has roughly two jobs. It parses the user&rsquo;s source code to understand\nwhat it means. Then it takes that knowledge and outputs low-level instructions\nthat produce the same semantics. Many languages split those two roles into two\nseparate <span name=\"passes\">passes</span> in the implementation. A parser\nproduces an AST<span class=\"em\">&mdash;</span>just like jlox does<span class=\"em\">&mdash;</span>and then a code generator traverses\nthe AST and outputs target code.</p>\n<aside name=\"passes\">\n<p>In fact, most sophisticated optimizing compilers have a heck of a lot more than\ntwo passes. 
Determining not just <em>what</em> optimization passes to have, but how to\norder them to squeeze the most performance out of the compiler<span class=\"em\">&mdash;</span>since the\noptimizations often interact in complex ways<span class=\"em\">&mdash;</span>is somewhere between an &ldquo;open\narea of research&rdquo; and a &ldquo;dark art&rdquo;.</p>\n</aside>\n<p>In clox, we&rsquo;re taking an old-school approach and merging these two passes into\none. Back in the day, language hackers did this because computers literally\ndidn&rsquo;t have enough memory to store an entire source file&rsquo;s AST. We&rsquo;re doing it\nbecause it keeps our compiler simpler, which is a real asset when programming in\nC.</p>\n<p>Single-pass compilers like the one we&rsquo;re going to build don&rsquo;t work well for all\nlanguages. Since the compiler has only a peephole view into the user&rsquo;s program\nwhile generating code, the language must be designed such that you don&rsquo;t need\nmuch surrounding context to understand a piece of syntax. Fortunately, tiny,\ndynamically typed Lox is <span name=\"lox\">well-suited</span> to that.</p>\n<aside name=\"lox\">\n<p>Not that this should come as much of a surprise. I did design the language\nspecifically for this book, after all.</p><img src=\"image/compiling-expressions/keyhole.png\" alt=\"Peering through a keyhole at 'var x;'\" />\n</aside>\n<p>What this means in practical terms is that our &ldquo;compiler&rdquo; C module has\nfunctionality you&rsquo;ll recognize from jlox for parsing<span class=\"em\">&mdash;</span>consuming tokens,\nmatching expected token types, etc. And it also has functions for code gen<span class=\"em\">&mdash;</span>emitting bytecode and adding constants to the destination chunk. (And it means\nI&rsquo;ll use &ldquo;parsing&rdquo; and &ldquo;compiling&rdquo; interchangeably throughout this and later\nchapters.)</p>\n<p>We&rsquo;ll build the parsing and code generation halves first. 
Then we&rsquo;ll stitch them\ntogether with the code in the middle that uses Pratt&rsquo;s technique to parse Lox&rsquo;s\nparticular grammar and output the right bytecode.</p>\n<h2><a href=\"#parsing-tokens\" id=\"parsing-tokens\"><small>17&#8202;.&#8202;2</small>Parsing Tokens</a></h2>\n<p>First up, the front half of the compiler. This function&rsquo;s name should sound\nfamiliar.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;scanner.h&quot;\n</pre><div class=\"source-file\"><em>compiler.c</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">advance</span>() {\n  <span class=\"i\">parser</span>.<span class=\"i\">previous</span> = <span class=\"i\">parser</span>.<span class=\"i\">current</span>;\n\n  <span class=\"k\">for</span> (;;) {\n    <span class=\"i\">parser</span>.<span class=\"i\">current</span> = <span class=\"i\">scanToken</span>();\n    <span class=\"k\">if</span> (<span class=\"i\">parser</span>.<span class=\"i\">current</span>.<span class=\"i\">type</span> != <span class=\"a\">TOKEN_ERROR</span>) <span class=\"k\">break</span>;\n\n    <span class=\"i\">errorAtCurrent</span>(<span class=\"i\">parser</span>.<span class=\"i\">current</span>.<span class=\"i\">start</span>);\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em></div>\n\n<p>Just like in jlox, it steps forward through the token stream. It asks the\nscanner for the next token and stores it for later use. Before doing that, it\ntakes the old <code>current</code> token and stashes that in a <code>previous</code> field. That will\ncome in handy later so that we can get at the lexeme after we match a token.</p>\n<p>The code to read the next token is wrapped in a loop. Remember, clox&rsquo;s scanner\ndoesn&rsquo;t report lexical errors. Instead, it creates special <em>error tokens</em> and\nleaves it up to the parser to report them. 
We do that here.</p>\n<p>We keep looping, reading tokens and reporting the errors, until we hit a\nnon-error one or reach the end. That way, the rest of the parser sees only valid\ntokens. The current and previous token are stored in this struct:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;scanner.h&quot;\n</pre><div class=\"source-file\"><em>compiler.c</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">Token</span> <span class=\"i\">current</span>;\n  <span class=\"t\">Token</span> <span class=\"i\">previous</span>;\n} <span class=\"t\">Parser</span>;\n\n<span class=\"t\">Parser</span> <span class=\"i\">parser</span>;\n</pre><pre class=\"insert-after\">\n\nstatic void advance() {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em></div>\n\n<p>Like we did in other modules, we have a single global variable of this struct\ntype so we don&rsquo;t need to pass the state around from function to function in the\ncompiler.</p>\n<h3><a href=\"#handling-syntax-errors\" id=\"handling-syntax-errors\"><small>17&#8202;.&#8202;2&#8202;.&#8202;1</small>Handling syntax errors</a></h3>\n<p>If the scanner hands us an error token, we need to actually tell the user. 
That\nhappens using this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after variable <em>parser</em></div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">errorAtCurrent</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">message</span>) {\n  <span class=\"i\">errorAt</span>(&amp;<span class=\"i\">parser</span>.<span class=\"i\">current</span>, <span class=\"i\">message</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after variable <em>parser</em></div>\n\n<p>We pull the location out of the current token in order to tell the user where\nthe error occurred and forward it to <code>errorAt()</code>. More often, we&rsquo;ll report an\nerror at the location of the token we just consumed, so we give the shorter name\nto this other function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after variable <em>parser</em></div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">error</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">message</span>) {\n  <span class=\"i\">errorAt</span>(&amp;<span class=\"i\">parser</span>.<span class=\"i\">previous</span>, <span class=\"i\">message</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after variable <em>parser</em></div>\n\n<p>The actual work happens here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after variable <em>parser</em></div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">errorAt</span>(<span class=\"t\">Token</span>* <span class=\"i\">token</span>, <span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">message</span>) {\n  <span class=\"i\">fprintf</span>(<span class=\"i\">stderr</span>, <span class=\"s\">&quot;[line 
%d] Error&quot;</span>, <span class=\"i\">token</span>-&gt;<span class=\"i\">line</span>);\n\n  <span class=\"k\">if</span> (<span class=\"i\">token</span>-&gt;<span class=\"i\">type</span> == <span class=\"a\">TOKEN_EOF</span>) {\n    <span class=\"i\">fprintf</span>(<span class=\"i\">stderr</span>, <span class=\"s\">&quot; at end&quot;</span>);\n  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">token</span>-&gt;<span class=\"i\">type</span> == <span class=\"a\">TOKEN_ERROR</span>) {\n    <span class=\"c\">// Nothing.</span>\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">fprintf</span>(<span class=\"i\">stderr</span>, <span class=\"s\">&quot; at &#39;%.*s&#39;&quot;</span>, <span class=\"i\">token</span>-&gt;<span class=\"i\">length</span>, <span class=\"i\">token</span>-&gt;<span class=\"i\">start</span>);\n  }\n\n  <span class=\"i\">fprintf</span>(<span class=\"i\">stderr</span>, <span class=\"s\">&quot;: %s</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, <span class=\"i\">message</span>);\n  <span class=\"i\">parser</span>.<span class=\"i\">hadError</span> = <span class=\"k\">true</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after variable <em>parser</em></div>\n\n<p>First, we print where the error occurred. We try to show the lexeme if it&rsquo;s\nhuman-readable. Then we print the error message itself. After that, we set this\n<code>hadError</code> flag. 
That records whether any errors occurred during compilation.\nThis field also lives in the parser struct.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  Token previous;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin struct <em>Parser</em></div>\n<pre class=\"insert\">  <span class=\"t\">bool</span> <span class=\"i\">hadError</span>;\n</pre><pre class=\"insert-after\">} Parser;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in struct <em>Parser</em></div>\n\n<p>Earlier I said that <code>compile()</code> should return <code>false</code> if an error occurred. Now\nwe can make it do that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  consume(TOKEN_EOF, &quot;Expect end of expression.&quot;);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>compile</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">return</span> !<span class=\"i\">parser</span>.<span class=\"i\">hadError</span>;\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>compile</em>()</div>\n\n<p>I&rsquo;ve got another flag to introduce for error handling. We want to avoid error\ncascades. If the user has a mistake in their code and the parser gets confused\nabout where it is in the grammar, we don&rsquo;t want it to spew out a whole pile of\nmeaningless knock-on errors after the first one.</p>\n<p>We fixed that in jlox using panic mode error recovery. In the Java interpreter,\nwe threw an exception to unwind out of all of the parser code to a point where\nwe could skip tokens and resynchronize. We don&rsquo;t have <span\nname=\"setjmp\">exceptions</span> in C. Instead, we&rsquo;ll do a little smoke and\nmirrors. We add a flag to track whether we&rsquo;re currently in panic mode.</p>\n<aside name=\"setjmp\">\n<p>There is <code>setjmp()</code> and <code>longjmp()</code>, but I&rsquo;d rather not go there. 
Those make it\ntoo easy to leak memory, forget to maintain invariants, or otherwise have a Very\nBad Day.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">  bool hadError;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin struct <em>Parser</em></div>\n<pre class=\"insert\">  <span class=\"t\">bool</span> <span class=\"i\">panicMode</span>;\n</pre><pre class=\"insert-after\">} Parser;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in struct <em>Parser</em></div>\n\n<p>When an error occurs, we set it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void errorAt(Token* token, const char* message) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>errorAt</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">parser</span>.<span class=\"i\">panicMode</span> = <span class=\"k\">true</span>;\n</pre><pre class=\"insert-after\">  fprintf(stderr, &quot;[line %d] Error&quot;, token-&gt;line);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>errorAt</em>()</div>\n\n<p>After that, we go ahead and keep compiling as normal as if the error never\noccurred. The bytecode will never get executed, so it&rsquo;s harmless to keep on\ntrucking. 
The trick is that while the panic mode flag is set, we simply suppress\nany other errors that get detected.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void errorAt(Token* token, const char* message) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>errorAt</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">parser</span>.<span class=\"i\">panicMode</span>) <span class=\"k\">return</span>;\n</pre><pre class=\"insert-after\">  parser.panicMode = true;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>errorAt</em>()</div>\n\n<p>There&rsquo;s a good chance the parser will go off in the weeds, but the user won&rsquo;t\nknow because the errors all get swallowed. Panic mode ends when the parser\nreaches a synchronization point. For Lox, we chose statement boundaries, so when\nwe later add those to our compiler, we&rsquo;ll clear the flag there.</p>\n<p>These new fields need to be initialized.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  initScanner(source);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>compile</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">parser</span>.<span class=\"i\">hadError</span> = <span class=\"k\">false</span>;\n  <span class=\"i\">parser</span>.<span class=\"i\">panicMode</span> = <span class=\"k\">false</span>;\n\n</pre><pre class=\"insert-after\">  advance();\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>compile</em>()</div>\n\n<p>And we need another standard header, which we&rsquo;ll use shortly to convert number\nliterals.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &lt;stdio.h&gt;\n</pre><div class=\"source-file\"><em>compiler.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &lt;stdlib.h&gt;</span>\n</pre><pre class=\"insert-after\">\n\n#include &quot;common.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em></div>\n\n<p>There&rsquo;s one 
last parsing function, another old friend from jlox.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>advance</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">consume</span>(<span class=\"t\">TokenType</span> <span class=\"i\">type</span>, <span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">message</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">parser</span>.<span class=\"i\">current</span>.<span class=\"i\">type</span> == <span class=\"i\">type</span>) {\n    <span class=\"i\">advance</span>();\n    <span class=\"k\">return</span>;\n  }\n\n  <span class=\"i\">errorAtCurrent</span>(<span class=\"i\">message</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>advance</em>()</div>\n\n<p>It&rsquo;s similar to <code>advance()</code> in that it reads the next token. But it also\nvalidates that the token has an expected type. If not, it reports an error. This\nfunction is the foundation of most syntax errors in the compiler.</p>\n<p>OK, that&rsquo;s enough on the front end for now.</p>\n<h2><a href=\"#emitting-bytecode\" id=\"emitting-bytecode\"><small>17&#8202;.&#8202;3</small>Emitting Bytecode</a></h2>\n<p>After we parse and understand a piece of the user&rsquo;s program, the next step is to\ntranslate that to a series of bytecode instructions. 
It starts with the easiest\npossible step: appending a single byte to the chunk.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>consume</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">emitByte</span>(<span class=\"t\">uint8_t</span> <span class=\"i\">byte</span>) {\n  <span class=\"i\">writeChunk</span>(<span class=\"i\">currentChunk</span>(), <span class=\"i\">byte</span>, <span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">line</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>consume</em>()</div>\n\n<p>It&rsquo;s hard to believe great things will flow through such a simple function. It\nwrites the given byte, which may be an opcode or an operand to an instruction.\nIt sends in the previous token&rsquo;s line information so that runtime errors are\nassociated with that line.</p>\n<p>The chunk that we&rsquo;re writing gets passed into <code>compile()</code>, but it needs to make\nits way to <code>emitByte()</code>. To do that, we rely on this intermediary function:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">Parser parser;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after variable <em>parser</em></div>\n<pre class=\"insert\"><span class=\"t\">Chunk</span>* <span class=\"i\">compilingChunk</span>;\n\n<span class=\"k\">static</span> <span class=\"t\">Chunk</span>* <span class=\"i\">currentChunk</span>() {\n  <span class=\"k\">return</span> <span class=\"i\">compilingChunk</span>;\n}\n\n</pre><pre class=\"insert-after\">static void errorAt(Token* token, const char* message) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after variable <em>parser</em></div>\n\n<p>Right now, the chunk pointer is stored in a module-level variable like we store\nother global state. 
Later, when we start compiling user-defined functions, the\nnotion of &ldquo;current chunk&rdquo; gets more complicated. To avoid having to go back and\nchange a lot of code, I encapsulate that logic in the <code>currentChunk()</code> function.</p>\n<p>We initialize this new module variable before we write any bytecode:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">bool compile(const char* source, Chunk* chunk) {\n  initScanner(source);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>compile</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">compilingChunk</span> = <span class=\"i\">chunk</span>;\n</pre><pre class=\"insert-after\">\n\n  parser.hadError = false;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>compile</em>()</div>\n\n<p>Then, at the very end, when we&rsquo;re done compiling the chunk, we wrap things up.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  consume(TOKEN_EOF, &quot;Expect end of expression.&quot;);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>compile</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">endCompiler</span>();\n</pre><pre class=\"insert-after\">  return !parser.hadError;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>compile</em>()</div>\n\n<p>That calls this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>emitByte</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">endCompiler</span>() {\n  <span class=\"i\">emitReturn</span>();\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>emitByte</em>()</div>\n\n<p>In this chapter, our VM deals only with expressions. When you run clox, it will\nparse, compile, and execute a single expression, then print the result. To print\nthat value, we are temporarily using the <code>OP_RETURN</code> instruction. 
So we have the\ncompiler add one of those to the end of the chunk.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>emitByte</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">emitReturn</span>() {\n  <span class=\"i\">emitByte</span>(<span class=\"a\">OP_RETURN</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>emitByte</em>()</div>\n\n<p>While we&rsquo;re here in the back end we may as well make our lives easier.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>emitByte</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">emitBytes</span>(<span class=\"t\">uint8_t</span> <span class=\"i\">byte1</span>, <span class=\"t\">uint8_t</span> <span class=\"i\">byte2</span>) {\n  <span class=\"i\">emitByte</span>(<span class=\"i\">byte1</span>);\n  <span class=\"i\">emitByte</span>(<span class=\"i\">byte2</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>emitByte</em>()</div>\n\n<p>Over time, we&rsquo;ll have enough cases where we need to write an opcode followed by\na one-byte operand that it&rsquo;s worth defining this convenience function.</p>\n<h2><a href=\"#parsing-prefix-expressions\" id=\"parsing-prefix-expressions\"><small>17&#8202;.&#8202;4</small>Parsing Prefix Expressions</a></h2>\n<p>We&rsquo;ve assembled our parsing and code generation utility functions. The missing\npiece is the code in the middle that connects those together.</p><img src=\"image/compiling-expressions/mystery.png\" alt=\"Parsing functions on the left, bytecode emitting functions on the right. 
What goes in the middle?\" />\n<p>The only step in <code>compile()</code> that we have left to implement is this function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>endCompiler</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">expression</span>() {\n  <span class=\"c\">// What goes here?</span>\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>endCompiler</em>()</div>\n\n<p>We aren&rsquo;t ready to implement every kind of expression in Lox yet. Heck, we don&rsquo;t\neven have Booleans. For this chapter, we&rsquo;re only going to worry about four:</p>\n<ul>\n<li>Number literals: <code>123</code></li>\n<li>Parentheses for grouping: <code>(123)</code></li>\n<li>Unary negation: <code>-123</code></li>\n<li>The Four Horsemen of the Arithmetic: <code>+</code>, <code>-</code>, <code>*</code>, <code>/</code></li>\n</ul>\n<p>As we work through the functions to compile each of those kinds of expressions,\nwe&rsquo;ll also assemble the requirements for the table-driven parser that calls\nthem.</p>\n<h3><a href=\"#parsers-for-tokens\" id=\"parsers-for-tokens\"><small>17&#8202;.&#8202;4&#8202;.&#8202;1</small>Parsers for tokens</a></h3>\n<p>For now, let&rsquo;s focus on the Lox expressions that are each only a single token.\nIn this chapter, that&rsquo;s just number literals, but there will be more later. Here&rsquo;s\nhow we can compile them:</p>\n<p>We map each token type to a different kind of expression. We define a function\nfor each expression that outputs the appropriate bytecode. Then we build an\narray of function pointers. 
The indexes in the array correspond to the\n<code>TokenType</code> enum values, and the function at each index is the code to compile\nan expression of that token type.</p>\n<p>To compile number literals, we store a pointer to the following function at the\n<code>TOKEN_NUMBER</code> index in the array.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>endCompiler</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">number</span>() {\n  <span class=\"t\">double</span> <span class=\"i\">value</span> = <span class=\"i\">strtod</span>(<span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">start</span>, <span class=\"a\">NULL</span>);\n  <span class=\"i\">emitConstant</span>(<span class=\"i\">value</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>endCompiler</em>()</div>\n\n<p>We assume the token for the number literal has already been consumed and is\nstored in <code>previous</code>. We take that lexeme and use the C standard library to\nconvert it to a double value. Then we generate the code to load that value using\nthis function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>emitReturn</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">emitConstant</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>) {\n  <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_CONSTANT</span>, <span class=\"i\">makeConstant</span>(<span class=\"i\">value</span>));\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>emitReturn</em>()</div>\n\n<p>First, we add the value to the constant table, then we emit an <code>OP_CONSTANT</code>\ninstruction that pushes it onto the stack at runtime. 
To insert an entry in the\nconstant table, we rely on:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>emitReturn</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">uint8_t</span> <span class=\"i\">makeConstant</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>) {\n  <span class=\"t\">int</span> <span class=\"i\">constant</span> = <span class=\"i\">addConstant</span>(<span class=\"i\">currentChunk</span>(), <span class=\"i\">value</span>);\n  <span class=\"k\">if</span> (<span class=\"i\">constant</span> &gt; <span class=\"a\">UINT8_MAX</span>) {\n    <span class=\"i\">error</span>(<span class=\"s\">&quot;Too many constants in one chunk.&quot;</span>);\n    <span class=\"k\">return</span> <span class=\"n\">0</span>;\n  }\n\n  <span class=\"k\">return</span> (<span class=\"t\">uint8_t</span>)<span class=\"i\">constant</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>emitReturn</em>()</div>\n\n<p>Most of the work happens in <code>addConstant()</code>, which we defined back in an\n<a href=\"chunks-of-bytecode.html\">earlier chapter</a>. That adds the given value to the end of the chunk&rsquo;s\nconstant table and returns its index. The new function&rsquo;s job is mostly to make\nsure we don&rsquo;t have too many constants. Since the <code>OP_CONSTANT</code> instruction uses\na single byte for the index operand, we can store and load only up to <span\nname=\"256\">256</span> constants in a chunk.</p>\n<aside name=\"256\">\n<p>Yes, that limit is pretty low. 
If this were a full-sized language\nimplementation, we&rsquo;d want to add another instruction like <code>OP_CONSTANT_16</code> that\nstores the index as a two-byte operand so we could handle more constants when\nneeded.</p>\n<p>The code to support that isn&rsquo;t particularly illuminating, so I omitted it from\nclox, but you&rsquo;ll want your VMs to scale to larger programs.</p>\n</aside>\n<p>That&rsquo;s basically all it takes. Provided there is some suitable code that\nconsumes a <code>TOKEN_NUMBER</code> token, looks up <code>number()</code> in the function pointer\narray, and then calls it, we can now compile number literals to bytecode.</p>\n<h3><a href=\"#parentheses-for-grouping\" id=\"parentheses-for-grouping\"><small>17&#8202;.&#8202;4&#8202;.&#8202;2</small>Parentheses for grouping</a></h3>\n<p>Our as-yet-imaginary array of parsing function pointers would be great if every\nexpression was only a single token long. Alas, most are longer. However, many\nexpressions <em>start</em> with a particular token. We call these <em>prefix</em> expressions.\nFor example, when we&rsquo;re parsing an expression and the current token is <code>(</code>, we\nknow we must be looking at a parenthesized grouping expression.</p>\n<p>It turns out our function pointer array handles those too. The parsing function\nfor an expression type can consume any additional tokens that it wants to, just\nlike in a regular recursive descent parser. 
Here&rsquo;s how parentheses work:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>endCompiler</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">grouping</span>() {\n  <span class=\"i\">expression</span>();\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_RIGHT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;)&#39; after expression.&quot;</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>endCompiler</em>()</div>\n\n<p>Again, we assume the initial <code>(</code> has already been consumed. We <span\nname=\"recursive\">recursively</span> call back into <code>expression()</code> to compile the\nexpression between the parentheses, then parse the closing <code>)</code> at the end.</p>\n<aside name=\"recursive\">\n<p>A Pratt parser isn&rsquo;t a recursive <em>descent</em> parser, but it&rsquo;s still recursive.\nThat&rsquo;s to be expected since the grammar itself is recursive.</p>\n</aside>\n<p>As far as the back end is concerned, there&rsquo;s literally nothing to a grouping\nexpression. Its sole function is syntactic<span class=\"em\">&mdash;</span>it lets you insert a\nlower-precedence expression where a higher precedence is expected. Thus, it has\nno runtime semantics on its own and therefore doesn&rsquo;t emit any bytecode. 
The\ninner call to <code>expression()</code> takes care of generating bytecode for the\nexpression inside the parentheses.</p>\n<h3><a href=\"#unary-negation\" id=\"unary-negation\"><small>17&#8202;.&#8202;4&#8202;.&#8202;3</small>Unary negation</a></h3>\n<p>Unary minus is also a prefix expression, so it works with our model too.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>number</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">unary</span>() {\n  <span class=\"t\">TokenType</span> <span class=\"i\">operatorType</span> = <span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">type</span>;\n\n  <span class=\"c\">// Compile the operand.</span>\n  <span class=\"i\">expression</span>();\n\n  <span class=\"c\">// Emit the operator instruction.</span>\n  <span class=\"k\">switch</span> (<span class=\"i\">operatorType</span>) {\n    <span class=\"k\">case</span> <span class=\"a\">TOKEN_MINUS</span>: <span class=\"i\">emitByte</span>(<span class=\"a\">OP_NEGATE</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">default</span>: <span class=\"k\">return</span>; <span class=\"c\">// Unreachable.</span>\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>number</em>()</div>\n\n<p>The leading <code>-</code> token has been consumed and is sitting in <code>parser.previous</code>. We\ngrab the token type from that to note which unary operator we&rsquo;re dealing with.\nIt&rsquo;s unnecessary right now, but this will make more sense when we use this same\nfunction to compile the <code>!</code> operator in <a href=\"types-of-values.html\">the next chapter</a>.</p>\n<p>As in <code>grouping()</code>, we recursively call <code>expression()</code> to compile the operand.\nAfter that, we emit the bytecode to perform the negation. 
It might seem a little\nweird to write the negate instruction <em>after</em> its operand&rsquo;s bytecode since the\n<code>-</code> appears on the left, but think about it in terms of order of execution:</p>\n<ol>\n<li>\n<p>We evaluate the operand first, which leaves its value on the stack.</p>\n</li>\n<li>\n<p>Then we pop that value, negate it, and push the result.</p>\n</li>\n</ol>\n<p>So the <code>OP_NEGATE</code> instruction should be emitted <span name=\"line\">last</span>.\nThis is part of the compiler&rsquo;s job<span class=\"em\">&mdash;</span>parsing the program in the order it\nappears in the source code and rearranging it into the order that execution\nhappens.</p>\n<aside name=\"line\">\n<p>Emitting the <code>OP_NEGATE</code> instruction after the operand does mean that the\ncurrent token when the bytecode is written is <em>not</em> the <code>-</code> token. That mostly\ndoesn&rsquo;t matter, except that we use that token for the line number to associate\nwith that instruction.</p>\n<p>This means if you have a multi-line negation expression, like:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> -\n  <span class=\"k\">true</span>;\n</pre></div>\n<p>Then the runtime error will be reported on the wrong line. Here, it would show\nthe error on line 2, even though the <code>-</code> is on line 1. A more robust approach\nwould be to store the token&rsquo;s line before compiling the operand and then pass\nthat into <code>emitByte()</code>, but I wanted to keep things simple for the book.</p>\n</aside>\n<p>There is one problem with this code, though. The <code>expression()</code> function it\ncalls will parse any expression for the operand, regardless of precedence. 
Once\nwe add binary operators and other syntax, that will do the wrong thing.\nConsider:</p>\n<div class=\"codehilite\"><pre>-<span class=\"i\">a</span>.<span class=\"i\">b</span> + <span class=\"i\">c</span>;\n</pre></div>\n<p>Here, the operand to <code>-</code> should be just the <code>a.b</code> expression, not the entire\n<code>a.b + c</code>. But if <code>unary()</code> calls <code>expression()</code>, the latter will happily chew\nthrough all of the remaining code including the <code>+</code>. It will erroneously treat\nthe <code>-</code> as lower precedence than the <code>+</code>.</p>\n<p>When parsing the operand to unary <code>-</code>, we need to compile only expressions at a\ncertain precedence level or higher. In jlox&rsquo;s recursive descent parser we\naccomplished that by calling into the parsing method for the lowest-precedence\nexpression we wanted to allow (in this case, <code>call()</code>). Each method for parsing\na specific expression also parsed any expressions of higher precedence too, so\nthat included the rest of the precedence table.</p>\n<p>The parsing functions like <code>number()</code> and <code>unary()</code> here in clox are different.\nEach only parses exactly one type of expression. They don&rsquo;t cascade to include\nhigher-precedence expression types too. 
We need a different solution, and it\nlooks like this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>unary</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">parsePrecedence</span>(<span class=\"t\">Precedence</span> <span class=\"i\">precedence</span>) {\n  <span class=\"c\">// What goes here?</span>\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>unary</em>()</div>\n\n<p>This function<span class=\"em\">&mdash;</span>once we implement it<span class=\"em\">&mdash;</span>starts at the current token and parses\nany expression at the given precedence level or higher. We have some other setup\nto get through before we can write the body of this function, but you can\nprobably guess that it will use that table of parsing function pointers I&rsquo;ve\nbeen talking about. For now, don&rsquo;t worry too much about how it works. In order\nto take the &ldquo;precedence&rdquo; as a parameter, we define it numerically.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} Parser;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after struct <em>Parser</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">enum</span> {\n  <span class=\"a\">PREC_NONE</span>,\n  <span class=\"a\">PREC_ASSIGNMENT</span>,  <span class=\"c\">// =</span>\n  <span class=\"a\">PREC_OR</span>,          <span class=\"c\">// or</span>\n  <span class=\"a\">PREC_AND</span>,         <span class=\"c\">// and</span>\n  <span class=\"a\">PREC_EQUALITY</span>,    <span class=\"c\">// == !=</span>\n  <span class=\"a\">PREC_COMPARISON</span>,  <span class=\"c\">// &lt; &gt; &lt;= &gt;=</span>\n  <span class=\"a\">PREC_TERM</span>,        <span class=\"c\">// + -</span>\n  <span class=\"a\">PREC_FACTOR</span>,      <span class=\"c\">// * /</span>\n  <span class=\"a\">PREC_UNARY</span>,       <span class=\"c\">// ! 
-</span>\n  <span class=\"a\">PREC_CALL</span>,        <span class=\"c\">// . ()</span>\n  <span class=\"a\">PREC_PRIMARY</span>\n} <span class=\"t\">Precedence</span>;\n</pre><pre class=\"insert-after\">\n\nParser parser;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after struct <em>Parser</em></div>\n\n<p>These are all of Lox&rsquo;s precedence levels in order from lowest to highest. Since\nC implicitly gives successively larger numbers for enums, this means that\n<code>PREC_CALL</code> is numerically larger than <code>PREC_UNARY</code>. For example, say the\ncompiler is sitting on a chunk of code like:</p>\n<div class=\"codehilite\"><pre>-<span class=\"i\">a</span>.<span class=\"i\">b</span> + <span class=\"i\">c</span>\n</pre></div>\n<p>If we call <code>parsePrecedence(PREC_ASSIGNMENT)</code>, then it will parse the entire\nexpression because <code>+</code> has higher precedence than assignment. If instead we\ncall <code>parsePrecedence(PREC_UNARY)</code>, it will compile the <code>-a.b</code> and stop there.\nIt doesn&rsquo;t keep going through the <code>+</code> because the addition has lower precedence\nthan unary operators.</p>\n<p>With this function in hand, it&rsquo;s a snap to fill in the missing body for\n<code>expression()</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void expression() {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>expression</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"i\">parsePrecedence</span>(<span class=\"a\">PREC_ASSIGNMENT</span>);\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>expression</em>(), replace 1 line</div>\n\n<p>We simply parse the lowest precedence level, which subsumes all of the\nhigher-precedence expressions too. 
Now, to compile the operand for a unary\nexpression, we call this new function and limit it to the appropriate level:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  // Compile the operand.\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>unary</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"i\">parsePrecedence</span>(<span class=\"a\">PREC_UNARY</span>);\n</pre><pre class=\"insert-after\">\n\n  // Emit the operator instruction.\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>unary</em>(), replace 1 line</div>\n\n<p>We use the unary operator&rsquo;s own <code>PREC_UNARY</code> precedence to permit <span\nname=\"useful\">nested</span> unary expressions like <code>!!doubleNegative</code>. Since\nunary operators have pretty high precedence, that correctly excludes things like\nbinary operators. Speaking of which<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<aside name=\"useful\">\n<p>Not that nesting unary expressions is particularly useful in Lox. But other\nlanguages let you do it, so we do too.</p>\n</aside>\n<h2><a href=\"#parsing-infix-expressions\" id=\"parsing-infix-expressions\"><small>17&#8202;.&#8202;5</small>Parsing Infix Expressions</a></h2>\n<p>Binary operators are different from the previous expressions because they are\n<em>infix</em>. With the other expressions, we know what we are parsing from the very\nfirst token. With infix expressions, we don&rsquo;t know we&rsquo;re in the middle of a\nbinary operator until <em>after</em> we&rsquo;ve parsed its left operand and then stumbled\nonto the operator token in the middle.</p>\n<p>Here&rsquo;s an example:</p>\n<div class=\"codehilite\"><pre><span class=\"n\">1</span> + <span class=\"n\">2</span>\n</pre></div>\n<p>Let&rsquo;s walk through trying to compile it with what we know so far:</p>\n<ol>\n<li>\n<p>We call <code>expression()</code>. 
That in turn calls\n<code>parsePrecedence(PREC_ASSIGNMENT)</code>.</p>\n</li>\n<li>\n<p>That function (once we implement it) sees the leading number token and\nrecognizes it is parsing a number literal. It hands off control to\n<code>number()</code>.</p>\n</li>\n<li>\n<p><code>number()</code> creates a constant, emits an <code>OP_CONSTANT</code>, and returns back to\n<code>parsePrecedence()</code>.</p>\n</li>\n</ol>\n<p>Now what? The call to <code>parsePrecedence()</code> should consume the entire addition\nexpression, so it needs to keep going somehow. Fortunately, the parser is right\nwhere we need it to be. Now that we&rsquo;ve compiled the leading number expression,\nthe next token is <code>+</code>. That&rsquo;s the exact token that <code>parsePrecedence()</code> needs to\ndetect that we&rsquo;re in the middle of an infix expression and to realize that the\nexpression we already compiled is actually an operand to that.</p>\n<p>So this hypothetical array of function pointers doesn&rsquo;t just list functions to\nparse expressions that start with a given token. Instead, it&rsquo;s a <em>table</em> of\nfunction pointers. One column associates prefix parser functions with token\ntypes. 
The second column associates infix parser functions with token types.</p>\n<p>The function we will use as the infix parser for <code>TOKEN_PLUS</code>, <code>TOKEN_MINUS</code>,\n<code>TOKEN_STAR</code>, and <code>TOKEN_SLASH</code> is this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>endCompiler</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">binary</span>() {\n  <span class=\"t\">TokenType</span> <span class=\"i\">operatorType</span> = <span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">type</span>;\n  <span class=\"t\">ParseRule</span>* <span class=\"i\">rule</span> = <span class=\"i\">getRule</span>(<span class=\"i\">operatorType</span>);\n  <span class=\"i\">parsePrecedence</span>((<span class=\"t\">Precedence</span>)(<span class=\"i\">rule</span>-&gt;<span class=\"i\">precedence</span> + <span class=\"n\">1</span>));\n\n  <span class=\"k\">switch</span> (<span class=\"i\">operatorType</span>) {\n    <span class=\"k\">case</span> <span class=\"a\">TOKEN_PLUS</span>:          <span class=\"i\">emitByte</span>(<span class=\"a\">OP_ADD</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">case</span> <span class=\"a\">TOKEN_MINUS</span>:         <span class=\"i\">emitByte</span>(<span class=\"a\">OP_SUBTRACT</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">case</span> <span class=\"a\">TOKEN_STAR</span>:          <span class=\"i\">emitByte</span>(<span class=\"a\">OP_MULTIPLY</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">case</span> <span class=\"a\">TOKEN_SLASH</span>:         <span class=\"i\">emitByte</span>(<span class=\"a\">OP_DIVIDE</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">default</span>: <span class=\"k\">return</span>; <span class=\"c\">// Unreachable.</span>\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after 
<em>endCompiler</em>()</div>\n\n<p>When a prefix parser function is called, the leading token has already been\nconsumed. An infix parser function is even more <em>in medias res</em><span class=\"em\">&mdash;</span>the entire\nleft-hand operand expression has already been compiled and the subsequent infix\noperator consumed.</p>\n<p>The fact that the left operand gets compiled first works out fine. It means at\nruntime, that code gets executed first. When it runs, the value it produces will\nend up on the stack. That&rsquo;s right where the infix operator is going to need it.</p>\n<p>Then we come here to <code>binary()</code> to handle the rest of the arithmetic operators.\nThis function compiles the right operand, much like how <code>unary()</code> compiles its\nown trailing operand. Finally, it emits the bytecode instruction that performs\nthe binary operation.</p>\n<p>When run, the VM will execute the left and right operand code, in that order,\nleaving their values on the stack. Then it executes the instruction for the\noperator. That pops the two values, computes the operation, and pushes the\nresult.</p>\n<p>The code that probably caught your eye here is that <code>getRule()</code> line. When we\nparse the right-hand operand, we again need to worry about precedence. Take an\nexpression like:</p>\n<div class=\"codehilite\"><pre><span class=\"n\">2</span> * <span class=\"n\">3</span> + <span class=\"n\">4</span>\n</pre></div>\n<p>When we parse the right operand of the <code>*</code> expression, we need to just capture\n<code>3</code>, and not <code>3 + 4</code>, because <code>+</code> is lower precedence than <code>*</code>. We could define\na separate function for each binary operator. Each would call\n<code>parsePrecedence()</code> and pass in the correct precedence level for its operand.</p>\n<p>But that&rsquo;s kind of tedious. Each binary operator&rsquo;s right-hand operand precedence\nis one level <span name=\"higher\">higher</span> than its own. 
We can look that up\ndynamically with this <code>getRule()</code> thing we&rsquo;ll get to soon. Using that, we call\n<code>parsePrecedence()</code> with one level higher than this operator&rsquo;s level.</p>\n<aside name=\"higher\">\n<p>We use one <em>higher</em> level of precedence for the right operand because the binary\noperators are left-associative. Given a series of the <em>same</em> operator, like:</p>\n<div class=\"codehilite\"><pre><span class=\"n\">1</span> + <span class=\"n\">2</span> + <span class=\"n\">3</span> + <span class=\"n\">4</span>\n</pre></div>\n<p>We want to parse it like:</p>\n<div class=\"codehilite\"><pre>((<span class=\"n\">1</span> + <span class=\"n\">2</span>) + <span class=\"n\">3</span>) + <span class=\"n\">4</span>\n</pre></div>\n<p>Thus, when parsing the right-hand operand to the first <code>+</code>, we want to consume\nthe <code>2</code>, but not the rest, so we use one level above <code>+</code>&rsquo;s precedence. But if\nour operator was <em>right</em>-associative, this would be wrong. Given:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">a</span> = <span class=\"i\">b</span> = <span class=\"i\">c</span> = <span class=\"i\">d</span>\n</pre></div>\n<p>Since assignment is right-associative, we want to parse it as:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">a</span> = (<span class=\"i\">b</span> = (<span class=\"i\">c</span> = <span class=\"i\">d</span>))\n</pre></div>\n<p>To enable that, we would call <code>parsePrecedence()</code> with the <em>same</em> precedence as\nthe current operator.</p>\n</aside>\n<p>This way, we can use a single <code>binary()</code> function for all binary operators even\nthough they have different precedences.</p>\n<h2><a href=\"#a-pratt-parser\" id=\"a-pratt-parser\"><small>17&#8202;.&#8202;6</small>A Pratt Parser</a></h2>\n<p>We now have all of the pieces and parts of the compiler laid out. 
We have a\nfunction for each grammar production: <code>number()</code>, <code>grouping()</code>, <code>unary()</code>, and\n<code>binary()</code>. We still need to implement <code>parsePrecedence()</code> and <code>getRule()</code>. We\nalso know we need a table that, given a token type, lets us find</p>\n<ul>\n<li>\n<p>the function to compile a prefix expression starting with a token of that\ntype,</p>\n</li>\n<li>\n<p>the function to compile an infix expression whose left operand is followed\nby a token of that type, and</p>\n</li>\n<li>\n<p>the precedence of an <span name=\"prefix\">infix</span> expression that uses\nthat token as an operator.</p>\n</li>\n</ul>\n<aside name=\"prefix\">\n<p>We don&rsquo;t need to track the precedence of the <em>prefix</em> expression starting with a\ngiven token because all prefix operators in Lox have the same precedence.</p>\n</aside>\n<p>We wrap these three properties in a little struct which represents a single row\nin the parser table.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} Precedence;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after enum <em>Precedence</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">ParseFn</span> <span class=\"i\">prefix</span>;\n  <span class=\"t\">ParseFn</span> <span class=\"i\">infix</span>;\n  <span class=\"t\">Precedence</span> <span class=\"i\">precedence</span>;\n} <span class=\"t\">ParseRule</span>;\n</pre><pre class=\"insert-after\">\n\nParser parser;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after enum <em>Precedence</em></div>\n\n<p>That ParseFn type is a simple <span name=\"typedef\">typedef</span> for a function\ntype that takes no arguments and returns nothing.</p>\n<aside name=\"typedef\" class=\"bottom\">\n<p>C&rsquo;s syntax for function pointer types is so bad that I always hide it behind a\ntypedef. 
I understand the intent behind the syntax<span class=\"em\">&mdash;</span>the whole &ldquo;declaration\nreflects use&rdquo; thing<span class=\"em\">&mdash;</span>but I think it was a failed syntactic experiment.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">} Precedence;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after enum <em>Precedence</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"t\">void</span> (*<span class=\"t\">ParseFn</span>)();\n</pre><pre class=\"insert-after\">\n\ntypedef struct {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after enum <em>Precedence</em></div>\n\n<p>The table that drives our whole parser is an array of ParseRules. We&rsquo;ve been\ntalking about it forever, and finally you get to see it.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>unary</em>()</div>\n<pre><span class=\"t\">ParseRule</span> <span class=\"i\">rules</span>[] = {\n  [<span class=\"a\">TOKEN_LEFT_PAREN</span>]    = {<span class=\"i\">grouping</span>, <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_RIGHT_PAREN</span>]   = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_LEFT_BRACE</span>]    = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},<span name=\"big\"> </span>\n  [<span class=\"a\">TOKEN_RIGHT_BRACE</span>]   = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_COMMA</span>]         = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_DOT</span>]           = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span 
class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_MINUS</span>]         = {<span class=\"i\">unary</span>,    <span class=\"i\">binary</span>, <span class=\"a\">PREC_TERM</span>},\n  [<span class=\"a\">TOKEN_PLUS</span>]          = {<span class=\"a\">NULL</span>,     <span class=\"i\">binary</span>, <span class=\"a\">PREC_TERM</span>},\n  [<span class=\"a\">TOKEN_SEMICOLON</span>]     = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_SLASH</span>]         = {<span class=\"a\">NULL</span>,     <span class=\"i\">binary</span>, <span class=\"a\">PREC_FACTOR</span>},\n  [<span class=\"a\">TOKEN_STAR</span>]          = {<span class=\"a\">NULL</span>,     <span class=\"i\">binary</span>, <span class=\"a\">PREC_FACTOR</span>},\n  [<span class=\"a\">TOKEN_BANG</span>]          = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_BANG_EQUAL</span>]    = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_EQUAL</span>]         = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_EQUAL_EQUAL</span>]   = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_GREATER</span>]       = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_GREATER_EQUAL</span>] = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_LESS</span>]          = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_LESS_EQUAL</span>]    = {<span 
class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_IDENTIFIER</span>]    = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_STRING</span>]        = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_NUMBER</span>]        = {<span class=\"i\">number</span>,   <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_AND</span>]           = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_CLASS</span>]         = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_ELSE</span>]          = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_FALSE</span>]         = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_FOR</span>]           = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_FUN</span>]           = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_IF</span>]            = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_NIL</span>]           = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_OR</span>]            = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  
[<span class=\"a\">TOKEN_PRINT</span>]         = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_RETURN</span>]        = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_SUPER</span>]         = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_THIS</span>]          = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_TRUE</span>]          = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_VAR</span>]           = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_WHILE</span>]         = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_ERROR</span>]         = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n  [<span class=\"a\">TOKEN_EOF</span>]           = {<span class=\"a\">NULL</span>,     <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n};\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>unary</em>()</div>\n\n<aside name=\"big\">\n<p>See what I mean about not wanting to revisit the table each time we needed a new\ncolumn? It&rsquo;s a beast.</p>\n<p>If you haven&rsquo;t seen the <code>[TOKEN_DOT] =</code> syntax in a C array literal, that is\nC99&rsquo;s designated initializer syntax. 
It&rsquo;s clearer than having to count array\nindexes by hand.</p>\n</aside>\n<p>You can see how <code>grouping</code> and <code>unary</code> are slotted into the prefix parser column\nfor their respective token types. In the next column, <code>binary</code> is wired up to\nthe four arithmetic infix operators. Those infix operators also have their\nprecedences set in the last column.</p>\n<p>Aside from those, the rest of the table is full of <code>NULL</code> and <code>PREC_NONE</code>. Most\nof those empty cells are because there is no expression associated with those\ntokens. You can&rsquo;t start an expression with, say, <code>else</code>, and <code>}</code> would make for\na pretty confusing infix operator.</p>\n<p>But, also, we haven&rsquo;t filled in the entire grammar yet. In later chapters, as we\nadd new expression types, some of these slots will get functions in them. One of\nthe things I like about this approach to parsing is that it makes it very easy\nto see which tokens are in use by the grammar and which are available.</p>\n<p>Now that we have the table, we are finally ready to write the code that uses it.\nThis is where our Pratt parser comes to life. The easiest function to define is\n<code>getRule()</code>.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>parsePrecedence</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">ParseRule</span>* <span class=\"i\">getRule</span>(<span class=\"t\">TokenType</span> <span class=\"i\">type</span>) {\n  <span class=\"k\">return</span> &amp;<span class=\"i\">rules</span>[<span class=\"i\">type</span>];\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>parsePrecedence</em>()</div>\n\n<p>It simply returns the rule at the given index. It&rsquo;s called by <code>binary()</code> to look\nup the precedence of the current operator. This function exists solely to handle\na declaration cycle in the C code. 
<code>binary()</code> is defined <em>before</em> the rules\ntable so that the table can store a pointer to it. That means the body of\n<code>binary()</code> cannot access the table directly.</p>\n<p>Instead, we wrap the lookup in a function. That lets us forward declare\n<code>getRule()</code> before the definition of <code>binary()</code>, and <span\nname=\"forward\">then</span> <em>define</em> <code>getRule()</code> after the table. We&rsquo;ll need a\ncouple of other forward declarations to handle the fact that our grammar is\nrecursive, so let&rsquo;s get them all out of the way.</p>\n<aside name=\"forward\">\n<p>This is what happens when you write your VM in a language that was designed to\nbe compiled on a PDP-11.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">  emitReturn();\n}\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>endCompiler</em>()</div>\n<pre class=\"insert\">\n\n<span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">expression</span>();\n<span class=\"k\">static</span> <span class=\"t\">ParseRule</span>* <span class=\"i\">getRule</span>(<span class=\"t\">TokenType</span> <span class=\"i\">type</span>);\n<span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">parsePrecedence</span>(<span class=\"t\">Precedence</span> <span class=\"i\">precedence</span>);\n\n</pre><pre class=\"insert-after\">static void binary() {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>endCompiler</em>()</div>\n\n<p>If you&rsquo;re following along and implementing clox yourself, pay close attention to\nthe little annotations that tell you where to put these code snippets. 
Don&rsquo;t\nworry, though, if you get it wrong, the C compiler will be happy to tell you.</p>\n<h3><a href=\"#parsing-with-precedence\" id=\"parsing-with-precedence\"><small>17&#8202;.&#8202;6&#8202;.&#8202;1</small>Parsing with precedence</a></h3>\n<p>Now we&rsquo;re getting to the fun stuff. The maestro that orchestrates all of the\nparsing functions we&rsquo;ve defined is <code>parsePrecedence()</code>. Let&rsquo;s start with parsing\nprefix expressions.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void parsePrecedence(Precedence precedence) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>parsePrecedence</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"i\">advance</span>();\n  <span class=\"t\">ParseFn</span> <span class=\"i\">prefixRule</span> = <span class=\"i\">getRule</span>(<span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">type</span>)-&gt;<span class=\"i\">prefix</span>;\n  <span class=\"k\">if</span> (<span class=\"i\">prefixRule</span> == <span class=\"a\">NULL</span>) {\n    <span class=\"i\">error</span>(<span class=\"s\">&quot;Expect expression.&quot;</span>);\n    <span class=\"k\">return</span>;\n  }\n\n  <span class=\"i\">prefixRule</span>();\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>parsePrecedence</em>(), replace 1 line</div>\n\n<p>We read the next token and look up the corresponding ParseRule. If there is no\nprefix parser, then the token must be a syntax error. We report that and return\nto the caller.</p>\n<p>Otherwise, we call that prefix parse function and let it do its thing. That\nprefix parser compiles the rest of the prefix expression, consuming any other\ntokens it needs, and returns back here. Infix expressions are where it gets\ninteresting since precedence comes into play. 
The implementation is remarkably\nsimple.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  prefixRule();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>parsePrecedence</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">while</span> (<span class=\"i\">precedence</span> &lt;= <span class=\"i\">getRule</span>(<span class=\"i\">parser</span>.<span class=\"i\">current</span>.<span class=\"i\">type</span>)-&gt;<span class=\"i\">precedence</span>) {\n    <span class=\"i\">advance</span>();\n    <span class=\"t\">ParseFn</span> <span class=\"i\">infixRule</span> = <span class=\"i\">getRule</span>(<span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">type</span>)-&gt;<span class=\"i\">infix</span>;\n    <span class=\"i\">infixRule</span>();\n  }\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>parsePrecedence</em>()</div>\n\n<p>That&rsquo;s the whole thing. Really. Here&rsquo;s how the entire function works: At the\nbeginning of <code>parsePrecedence()</code>, we look up a prefix parser for the current\ntoken. The first token is <em>always</em> going to belong to some kind of prefix\nexpression, by definition. It may turn out to be nested as an operand inside one\nor more infix expressions, but as you read the code from left to right, the\nfirst token you hit always belongs to a prefix expression.</p>\n<p>After parsing that, which may consume more tokens, the prefix expression is\ndone. Now we look for an infix parser for the next token. If we find one, it\nmeans the prefix expression we already compiled might be an operand for it. But\nonly if the call to <code>parsePrecedence()</code> has a <code>precedence</code> that is low enough to\npermit that infix operator.</p>\n<p>If the next token is too low precedence, or isn&rsquo;t an infix operator at all,\nwe&rsquo;re done. We&rsquo;ve parsed as much expression as we can. 
Otherwise, we consume the\noperator and hand off control to the infix parser we found. It consumes whatever\nother tokens it needs (usually the right operand) and returns back to\n<code>parsePrecedence()</code>. Then we loop back around and see if the <em>next</em> token is\nalso a valid infix operator that can take the entire preceding expression as its\noperand. We keep looping like that, crunching through infix operators and their\noperands until we hit a token that isn&rsquo;t an infix operator or is too low\nprecedence and stop.</p>\n<p>That&rsquo;s a lot of prose, but if you really want to mind meld with Vaughan Pratt\nand fully understand the algorithm, step through the parser in your debugger as\nit works through some expressions. Maybe a picture will help. There&rsquo;s only a\nhandful of functions, but they are marvelously intertwined:</p>\n<p><span name=\"connections\"></span></p>\n<p><img src=\"image/compiling-expressions/connections.png\" alt=\"The various parsing\nfunctions and how they call each other.\" /></p>\n<aside name=\"connections\">\n<p>The <img src=\"image/compiling-expressions/calls.png\" alt=\"A solid arrow.\"\nclass=\"arrow\" /> arrow connects a function to another function it directly\ncalls. The <img src=\"image/compiling-expressions/points-to.png\" alt=\"An open\narrow.\" class=\"arrow\" /> arrow shows the table&rsquo;s pointers to the parsing\nfunctions.</p>\n</aside>\n<p>Later, we&rsquo;ll need to tweak the code in this chapter to handle assignment. But,\notherwise, what we wrote covers all of our expression compiling needs for the\nrest of the book. We&rsquo;ll plug additional parsing functions into the table when we\nadd new kinds of expressions, but <code>parsePrecedence()</code> is complete.</p>\n<h2><a href=\"#dumping-chunks\" id=\"dumping-chunks\"><small>17&#8202;.&#8202;7</small>Dumping Chunks</a></h2>\n<p>While we&rsquo;re here in the core of our compiler, we should put in some\ninstrumentation. 
To help debug the generated bytecode, we&rsquo;ll add support for\ndumping the chunk once the compiler finishes. We had some temporary logging\nearlier when we hand-authored the chunk. Now we&rsquo;ll put in some real code so that\nwe can enable it whenever we want.</p>\n<p>Since this isn&rsquo;t for end users, we hide it behind a flag.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &lt;stdint.h&gt;\n\n</pre><div class=\"source-file\"><em>common.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define DEBUG_PRINT_CODE</span>\n</pre><pre class=\"insert-after\">#define DEBUG_TRACE_EXECUTION\n</pre></div>\n<div class=\"source-file-narrow\"><em>common.h</em></div>\n\n<p>When that flag is defined, we use our existing &ldquo;debug&rdquo; module to print out the\nchunk&rsquo;s bytecode.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  emitReturn();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>endCompiler</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#ifdef DEBUG_PRINT_CODE</span>\n  <span class=\"k\">if</span> (!<span class=\"i\">parser</span>.<span class=\"i\">hadError</span>) {\n    <span class=\"i\">disassembleChunk</span>(<span class=\"i\">currentChunk</span>(), <span class=\"s\">&quot;code&quot;</span>);\n  }\n<span class=\"a\">#endif</span>\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>endCompiler</em>()</div>\n\n<p>We do this only if the code was free of errors. After a syntax error, the\ncompiler keeps on going but it&rsquo;s in kind of a weird state and might produce\nbroken code. 
That&rsquo;s harmless because it won&rsquo;t get executed, but we&rsquo;ll just\nconfuse ourselves if we try to read it.</p>\n<p>Finally, to access <code>disassembleChunk()</code>, we need to include its header.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;scanner.h&quot;\n</pre><div class=\"source-file\"><em>compiler.c</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#ifdef DEBUG_PRINT_CODE</span>\n<span class=\"a\">#include &quot;debug.h&quot;</span>\n<span class=\"a\">#endif</span>\n</pre><pre class=\"insert-after\">\n\ntypedef struct {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em></div>\n\n<p>We made it! This was the last major section to install in our VM&rsquo;s compilation\nand execution pipeline. Our interpreter doesn&rsquo;t <em>look</em> like much, but inside it\nis scanning, parsing, compiling to bytecode, and executing.</p>\n<p>Fire up the VM and type in an expression. If we did everything right, it should\ncalculate and print the result. We now have a very over-engineered arithmetic\ncalculator. We have a lot of language features to add in the coming chapters,\nbut the foundation is in place.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>To really understand the parser, you need to see how execution threads\nthrough the interesting parsing functions<span class=\"em\">&mdash;</span><code>parsePrecedence()</code> and the\nparser functions stored in the table. Take this (strange) expression:</p>\n<div class=\"codehilite\"><pre>(-<span class=\"n\">1</span> + <span class=\"n\">2</span>) * <span class=\"n\">3</span> - -<span class=\"n\">4</span>\n</pre></div>\n<p>Write a trace of how those functions are called. Show the order they are\ncalled, which calls which, and the arguments passed to them.</p>\n</li>\n<li>\n<p>The ParseRule row for <code>TOKEN_MINUS</code> has both prefix and infix function\npointers. 
That&rsquo;s because <code>-</code> is both a prefix operator (unary negation) and\nan infix one (subtraction).</p>\n<p>In the full Lox language, what other tokens can be used in both prefix and\ninfix positions? What about in C or in another language of your choice?</p>\n</li>\n<li>\n<p>You might be wondering about complex &ldquo;mixfix&rdquo; expressions that have more\nthan two operands separated by tokens. C&rsquo;s conditional or &ldquo;ternary&rdquo;\noperator, <code>?:</code>, is a widely known one.</p>\n<p>Add support for that operator to the compiler. You don&rsquo;t have to generate\nany bytecode, just show how you would hook it up to the parser and handle\nthe operands.</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: It&rsquo;s Just Parsing</a></h2>\n<p>I&rsquo;m going to make a claim here that will be unpopular with some compiler and\nlanguage people. It&rsquo;s OK if you don&rsquo;t agree. Personally, I learn more from\nstrongly stated opinions that I disagree with than I do from several pages of\nqualifiers and equivocation. My claim is that <em>parsing doesn&rsquo;t matter</em>.</p>\n<p>Over the years, many programming language people, especially in academia, have\ngotten <em>really</em> into parsers and taken them very seriously. Initially, it was\nthe compiler folks who got into <span name=\"yacc\">compiler-compilers</span>,\nLALR, and other stuff like that. The first half of the dragon book is a long\nlove letter to the wonders of parser generators.</p>\n<aside name=\"yacc\">\n<p>All of us suffer from the vice of &ldquo;when all you have is a hammer, everything\nlooks like a nail&rdquo;, but perhaps none so visibly as compiler people. 
You wouldn&rsquo;t\nbelieve the breadth of software problems that miraculously seem to require a new\nlittle language in their solution as soon as you ask a compiler hacker for help.</p>\n<p>Yacc and other compiler-compilers are the most delightfully recursive example.\n&ldquo;Wow, writing compilers is a chore. I know, let&rsquo;s write a compiler to write our\ncompiler for us.&rdquo;</p>\n<p>For the record, I don&rsquo;t claim immunity to this affliction.</p>\n</aside>\n<p>Later, the functional programming folks got into parser combinators, packrat\nparsers, and other sorts of things. Because, obviously, if you give a functional\nprogrammer a problem, the first thing they&rsquo;ll do is whip out a pocketful of\nhigher-order functions.</p>\n<p>Over in math and algorithm analysis land, there is a long legacy of research\ninto proving time and memory usage for various parsing techniques, transforming\nparsing problems into other problems and back, and assigning complexity classes\nto different grammars.</p>\n<p>At one level, this stuff is important. If you&rsquo;re implementing a language, you\nwant some assurance that your parser won&rsquo;t go exponential and take 7,000 years\nto parse a weird edge case in the grammar. Parser theory gives you that bound.\nAs an intellectual exercise, learning about parsing techniques is also fun and\nrewarding.</p>\n<p>But if your goal is just to implement a language and get it in front of users,\nalmost all of that stuff doesn&rsquo;t matter. It&rsquo;s really easy to get worked up by\nthe enthusiasm of the people who <em>are</em> into it and think that your front end\n<em>needs</em> some whiz-bang generated combinator-parser-factory thing. I&rsquo;ve seen\npeople burn tons of time writing and rewriting their parser using whatever\ntoday&rsquo;s hot library or technique is.</p>\n<p>That&rsquo;s time that doesn&rsquo;t add any value to your user&rsquo;s life. 
If you&rsquo;re just\ntrying to get your parser done, pick one of the bog-standard techniques, use it,\nand move on. Recursive descent, Pratt parsing, and the popular parser generators\nlike ANTLR or Bison are all fine.</p>\n<p>Take the extra time you saved not rewriting your parsing code and spend it\nimproving the compile error messages your compiler shows users. Good error\nhandling and reporting is more valuable to users than almost anything else you\ncan put time into in the front end.</p>\n</div>\n\n<footer>\n<a href=\"types-of-values.html\" class=\"next\">\n  Next Chapter: &ldquo;Types of Values&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/contents.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Table of Contents &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n    <h2><a href=\"#top\"><small>&nbsp;</small> Table of Contents</a></h2>\n    <ul>\n      <li><a href=\"#welcome\"><small>I</small>Welcome</a></li>\n      <li><a href=\"#a-tree-walk-interpreter\"><small>II</small>A Tree-Walk Interpreter</a></li>\n      <li><a href=\"#a-bytecode-virtual-machine\"><small>III</small>A Bytecode Virtual Machine</a></li>\n      <li><a href=\"#backmatter\"><small>&#10087;</small>Backmatter</a></li>\n    </ul>\n        <div class=\"prev-next\">\n        <a href=\"acknowledgements.html\" 
title=\"Acknowledgements\" class=\"left\">&larr;&nbsp;Previous</a>\n        <a href=\"index.html\" title=\"Crafting Interpreters\">&uarr;&nbsp;Up</a>\n        <a href=\"welcome.html\" title=\"Welcome\" class=\"right\">Next&nbsp;&rarr;</a>\n    </div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"acknowledgements.html\" title=\"Acknowledgements\" class=\"prev\">←</a>\n<a href=\"welcome.html\" title=\"Welcome\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n    <h2><a href=\"#top\"><small>&nbsp;</small> Table of Contents</a></h2>\n    <ul>\n      <li><a href=\"#welcome\"><small>I</small>Welcome</a></li>\n      <li><a href=\"#a-tree-walk-interpreter\"><small>II</small>A Tree-Walk Interpreter</a></li>\n      <li><a href=\"#a-bytecode-virtual-machine\"><small>III</small>A Bytecode Virtual Machine</a></li>\n      <li><a href=\"#backmatter\"><small>&#10087;</small>Backmatter</a></li>\n    </ul>\n        <div class=\"prev-next\">\n        <a href=\"acknowledgements.html\" title=\"Acknowledgements\" class=\"left\">&larr;&nbsp;Previous</a>\n        <a href=\"index.html\" title=\"Crafting Interpreters\">&uarr;&nbsp;Up</a>\n        <a href=\"welcome.html\" title=\"Welcome\" class=\"right\">Next&nbsp;&rarr;</a>\n    </div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"contents\">\n\n<h1 class=\"part\">Table of Contents</h1>\n\n<div class=\"chapters\">\n  <div class=\"row\">\n    <div class=\"first\">\n    <h2><span class=\"num\">&#10087;</span>Frontmatter</h2>\n    <ul>\n      <li><span class=\"num\">&nbsp;</span><a href=\"dedication.html\">Dedication</a></li>\n      <li><span class=\"num\">&nbsp;</span><a href=\"acknowledgements.html\">Acknowledgements</a></li>\n    </ul>\n\n      
<h2><span class=\"num\">I.</span><a href=\"welcome.html\" name=\"welcome\">Welcome</a></h2>\n      <ul>\n        <li><span class=\"num\">1.</span><a href=\"introduction.html\">Introduction</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"introduction.html#design-note\">Design Note: What&rsquo;s in a Name?</a>\n        </li>\n        <li><span class=\"num\">2.</span><a href=\"a-map-of-the-territory.html\">A Map of the Territory</a>\n        </li>\n        <li><span class=\"num\">3.</span><a href=\"the-lox-language.html\">The Lox Language</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"the-lox-language.html#design-note\">Design Note: Expressions and Statements</a>\n        </li>\n      </ul>      <h2><span class=\"num\">II.</span><a href=\"a-tree-walk-interpreter.html\" name=\"a-tree-walk-interpreter\">A Tree-Walk Interpreter</a></h2>\n      <ul>\n        <li><span class=\"num\">4.</span><a href=\"scanning.html\">Scanning</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"scanning.html#design-note\">Design Note: Implicit Semicolons</a>\n        </li>\n        <li><span class=\"num\">5.</span><a href=\"representing-code.html\">Representing Code</a>\n        </li>\n        <li><span class=\"num\">6.</span><a href=\"parsing-expressions.html\">Parsing Expressions</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"parsing-expressions.html#design-note\">Design Note: Logic Versus History</a>\n        </li>\n        <li><span class=\"num\">7.</span><a href=\"evaluating-expressions.html\">Evaluating Expressions</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"evaluating-expressions.html#design-note\">Design Note: Static and Dynamic Typing</a>\n        </li>\n        <li><span 
class=\"num\">8.</span><a href=\"statements-and-state.html\">Statements and State</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"statements-and-state.html#design-note\">Design Note: Implicit Variable Declaration</a>\n        </li>\n        <li><span class=\"num\">9.</span><a href=\"control-flow.html\">Control Flow</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"control-flow.html#design-note\">Design Note: Spoonfuls of Syntactic Sugar</a>\n        </li>\n        <li><span class=\"num\">10.</span><a href=\"functions.html\">Functions</a>\n        </li>\n        <li><span class=\"num\">11.</span><a href=\"resolving-and-binding.html\">Resolving and Binding</a>\n        </li>\n        <li><span class=\"num\">12.</span><a href=\"classes.html\">Classes</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"classes.html#design-note\">Design Note: Prototypes and Power</a>\n        </li>\n        <li><span class=\"num\">13.</span><a href=\"inheritance.html\">Inheritance</a>\n        </li>\n      </ul>    </div>\n    <div class=\"second\">\n      <h2><span class=\"num\">III.</span><a href=\"a-bytecode-virtual-machine.html\" name=\"a-bytecode-virtual-machine\">A Bytecode Virtual Machine</a></h2>\n      <ul>\n        <li><span class=\"num\">14.</span><a href=\"chunks-of-bytecode.html\">Chunks of Bytecode</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"chunks-of-bytecode.html#design-note\">Design Note: Test Your Language</a>\n        </li>\n        <li><span class=\"num\">15.</span><a href=\"a-virtual-machine.html\">A Virtual Machine</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"a-virtual-machine.html#design-note\">Design Note: Register-Based Bytecode</a>\n        </li>\n        <li><span 
class=\"num\">16.</span><a href=\"scanning-on-demand.html\">Scanning on Demand</a>\n        </li>\n        <li><span class=\"num\">17.</span><a href=\"compiling-expressions.html\">Compiling Expressions</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"compiling-expressions.html#design-note\">Design Note: It&rsquo;s Just Parsing</a>\n        </li>\n        <li><span class=\"num\">18.</span><a href=\"types-of-values.html\">Types of Values</a>\n        </li>\n        <li><span class=\"num\">19.</span><a href=\"strings.html\">Strings</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"strings.html#design-note\">Design Note: String Encoding</a>\n        </li>\n        <li><span class=\"num\">20.</span><a href=\"hash-tables.html\">Hash Tables</a>\n        </li>\n        <li><span class=\"num\">21.</span><a href=\"global-variables.html\">Global Variables</a>\n        </li>\n        <li><span class=\"num\">22.</span><a href=\"local-variables.html\">Local Variables</a>\n        </li>\n        <li><span class=\"num\">23.</span><a href=\"jumping-back-and-forth.html\">Jumping Back and Forth</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"jumping-back-and-forth.html#design-note\">Design Note: Considering Goto Harmful</a>\n        </li>\n        <li><span class=\"num\">24.</span><a href=\"calls-and-functions.html\">Calls and Functions</a>\n        </li>\n        <li><span class=\"num\">25.</span><a href=\"closures.html\">Closures</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"closures.html#design-note\">Design Note: Closing Over the Loop Variable</a>\n        </li>\n        <li><span class=\"num\">26.</span><a href=\"garbage-collection.html\">Garbage Collection</a>\n        </li>\n        <li class=\"design-note\">\n        <span 
class=\"num\">&nbsp;</span><a href=\"garbage-collection.html#design-note\">Design Note: Generational Collectors</a>\n        </li>\n        <li><span class=\"num\">27.</span><a href=\"classes-and-instances.html\">Classes and Instances</a>\n        </li>\n        <li><span class=\"num\">28.</span><a href=\"methods-and-initializers.html\">Methods and Initializers</a>\n        </li>\n        <li class=\"design-note\">\n        <span class=\"num\">&nbsp;</span><a href=\"methods-and-initializers.html#design-note\">Design Note: Novelty Budget</a>\n        </li>\n        <li><span class=\"num\">29.</span><a href=\"superclasses.html\">Superclasses</a>\n        </li>\n        <li><span class=\"num\">30.</span><a href=\"optimization.html\">Optimization</a>\n        </li>\n      </ul>\n    <h2><span class=\"num\">&#10087;</span><a href=\"backmatter.html\" name=\"backmatter\">Backmatter</a></h2>\n    <ul>\n      <li><span class=\"num\">A1.</span><a href=\"appendix-i.html\">Appendix I: Lox Grammar</a></li>\n      <li><span class=\"num\">A2.</span><a href=\"appendix-ii.html\">Appendix II: Generated Syntax Tree Classes</a></li>\n    </ul>\n    </div>\n  </div>\n</div>\n\n<footer>\n  <a href=\"welcome.html\" class=\"next\">\n    First Part: &ldquo;Welcome&rdquo; &rarr;\n  </a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2020</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/control-flow.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Control Flow &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Control Flow<small>9</small></a></h3>\n\n<ul>\n    <li><a href=\"#turing-machines-briefly\"><small>9.1</small> Turing Machines (Briefly)</a></li>\n    <li><a href=\"#conditional-execution\"><small>9.2</small> Conditional Execution</a></li>\n    <li><a href=\"#logical-operators\"><small>9.3</small> Logical Operators</a></li>\n    <li><a href=\"#while-loops\"><small>9.4</small> While Loops</a></li>\n    <li><a href=\"#for-loops\"><small>9.5</small> For Loops</a></li>\n    <li class=\"divider\"></li>\n    <li 
class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Spoonfuls of Syntactic Sugar</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"statements-and-state.html\" title=\"Statements and State\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"functions.html\" title=\"Functions\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"statements-and-state.html\" title=\"Statements and State\" class=\"prev\">←</a>\n<a href=\"functions.html\" title=\"Functions\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Control Flow<small>9</small></a></h3>\n\n<ul>\n    <li><a href=\"#turing-machines-briefly\"><small>9.1</small> Turing Machines (Briefly)</a></li>\n    <li><a href=\"#conditional-execution\"><small>9.2</small> Conditional Execution</a></li>\n    <li><a href=\"#logical-operators\"><small>9.3</small> Logical Operators</a></li>\n    <li><a href=\"#while-loops\"><small>9.4</small> While Loops</a></li>\n    <li><a href=\"#for-loops\"><small>9.5</small> For Loops</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Spoonfuls of Syntactic Sugar</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"statements-and-state.html\" title=\"Statements and State\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"functions.html\" 
title=\"Functions\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">9</div>\n  <h1>Control Flow</h1>\n\n<blockquote>\n<p>Logic, like whiskey, loses its beneficial effect when taken in too large\nquantities.</p>\n<p><cite>Edward John Moreton Drax Plunkett, Lord Dunsany</cite></p>\n</blockquote>\n<p>Compared to <a href=\"statements-and-state.html\">last chapter&rsquo;s</a> grueling marathon, today is a\nlighthearted frolic through a daisy meadow. But while the work is easy, the\nreward is surprisingly large.</p>\n<p>Right now, our interpreter is little more than a calculator. A Lox program can\nonly do a fixed amount of work before completing. To make it run twice as long\nyou have to make the source code twice as lengthy. We&rsquo;re about to fix that. In\nthis chapter, our interpreter takes a big step towards the programming\nlanguage major leagues: <em>Turing-completeness</em>.</p>\n<h2><a href=\"#turing-machines-briefly\" id=\"turing-machines-briefly\"><small>9&#8202;.&#8202;1</small>Turing Machines (Briefly)</a></h2>\n<p>In the early part of last century, mathematicians stumbled into a series of\nconfusing <span name=\"paradox\">paradoxes</span> that led them to doubt the\nstability of the foundation they had built their work upon. To address that\n<a href=\"https://en.wikipedia.org/wiki/Foundations_of_mathematics#Foundational_crisis\">crisis</a>, they went back to square one. Starting from a handful of axioms,\nlogic, and set theory, they hoped to rebuild mathematics on top of an\nimpervious foundation.</p>\n<aside name=\"paradox\">\n<p>The most famous is <a href=\"https://en.wikipedia.org/wiki/Russell%27s_paradox\"><strong>Russell&rsquo;s paradox</strong></a>. Initially, set theory\nallowed you to define any sort of set. If you could describe it in English, it\nwas valid. 
Naturally, given mathematicians&rsquo; predilection for self-reference,\nsets can contain other sets. So Russell, rascal that he was, came up with:</p>\n<p><em>R is the set of all sets that do not contain themselves.</em></p>\n<p>Does R contain itself? If it doesn&rsquo;t, then according to the second half of the\ndefinition it should. But if it does, then it no longer meets the definition.\nCue mind exploding.</p>\n</aside>\n<p>They wanted to rigorously answer questions like, &ldquo;Can all true statements be\nproven?&rdquo;, &ldquo;Can we <a href=\"https://en.wikipedia.org/wiki/Computable_function\">compute</a> all functions that we can define?&rdquo;, or even the\nmore general question, &ldquo;What do we mean when we claim a function is\n&lsquo;computable&rsquo;?&rdquo;</p>\n<p>They presumed the answer to the first two questions would be &ldquo;yes&rdquo;. All that\nremained was to prove it. It turns out that the answer to both is &ldquo;no&rdquo;, and\nastonishingly, the two questions are deeply intertwined. This is a fascinating\ncorner of mathematics that touches fundamental questions about what brains are\nable to do and how the universe works. I can&rsquo;t do it justice here.</p>\n<p>What I do want to note is that in the process of proving that the answer to the\nfirst two questions is &ldquo;no&rdquo;, Alan Turing and Alonzo Church devised a precise\nanswer to the last question<span class=\"em\">&mdash;</span>a definition of what kinds of functions are <span\nname=\"uncomputable\">computable</span>. They each crafted a tiny system with a\nminimum set of machinery that is still powerful enough to compute any of a\n(very) large class of functions.</p>\n<aside name=\"uncomputable\">\n<p>They proved the answer to the first question is &ldquo;no&rdquo; by showing that the\nfunction that returns the truth value of a given statement is <em>not</em> a computable\none.</p>\n</aside>\n<p>These are now considered the &ldquo;computable functions&rdquo;. 
Turing&rsquo;s system is called a\n<span name=\"turing\"><strong>Turing machine</strong></span>. Church&rsquo;s is the <strong>lambda\ncalculus</strong>. Both are still widely used as the basis for models of computation\nand, in fact, many modern functional programming languages use the lambda\ncalculus at their core.</p>\n<aside name=\"turing\">\n<p>Turing called his inventions &ldquo;a-machines&rdquo; for &ldquo;automatic&rdquo;. He wasn&rsquo;t so\nself-aggrandizing as to put his <em>own</em> name on them. Later mathematicians did\nthat for him. That&rsquo;s how you get famous while still retaining some modesty.</p>\n</aside><img src=\"image/control-flow/turing-machine.png\" alt=\"A Turing machine.\" />\n<p>Turing machines have better name recognition<span class=\"em\">&mdash;</span>there&rsquo;s no Hollywood film about\nAlonzo Church yet<span class=\"em\">&mdash;</span>but the two formalisms are <a href=\"https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis\">equivalent in power</a>.\nIn fact, any programming language with some minimal level of expressiveness is\npowerful enough to compute <em>any</em> computable function.</p>\n<p>You can prove that by writing a simulator for a Turing machine in your language.\nSince Turing proved his machine can compute any computable function, by\nextension, that means your language can too. All you need to do is translate the\nfunction into a Turing machine, and then run that on your simulator.</p>\n<p>If your language is expressive enough to do that, it&rsquo;s considered\n<strong>Turing-complete</strong>. Turing machines are pretty dang simple, so it doesn&rsquo;t take\nmuch power to do this. You basically need arithmetic, a little control flow,\nand the ability to allocate and use (theoretically) arbitrary amounts of memory.\nWe&rsquo;ve got the first. By the end of this chapter, we&rsquo;ll have the <span\nname=\"memory\">second</span>.</p>\n<aside name=\"memory\">\n<p>We <em>almost</em> have the third too. 
You can create and concatenate strings of\narbitrary size, so you can <em>store</em> unbounded memory. But we don&rsquo;t have any way\nto access parts of a string.</p>\n</aside>\n<h2><a href=\"#conditional-execution\" id=\"conditional-execution\"><small>9&#8202;.&#8202;2</small>Conditional Execution</a></h2>\n<p>Enough history, let&rsquo;s jazz up our language. We can divide control flow roughly\ninto two kinds:</p>\n<ul>\n<li>\n<p><strong>Conditional</strong> or <strong>branching control flow</strong> is used to <em>not</em> execute\nsome piece of code. Imperatively, you can think of it as jumping <em>ahead</em>\nover a region of code.</p>\n</li>\n<li>\n<p><strong>Looping control flow</strong> executes a chunk of code more than once. It jumps\n<em>back</em> so that you can do something again. Since you don&rsquo;t usually want\n<em>infinite</em> loops, it typically has some conditional logic to know when to\nstop looping as well.</p>\n</li>\n</ul>\n<p>Branching is simpler, so we&rsquo;ll start there. C-derived languages have two main\nconditional execution features, the <code>if</code> statement and the perspicaciously named\n&ldquo;conditional&rdquo; <span name=\"ternary\">operator</span> (<code>?:</code>). An <code>if</code> statement\nlets you conditionally execute statements and the conditional operator lets you\nconditionally execute expressions.</p>\n<aside name=\"ternary\">\n<p>The conditional operator is also called the &ldquo;ternary&rdquo; operator because it&rsquo;s the\nonly operator in C that takes three operands.</p>\n</aside>\n<p>For simplicity&rsquo;s sake, Lox doesn&rsquo;t have a conditional operator, so let&rsquo;s get our\n<code>if</code> statement on. 
Our statement grammar gets a new production.</p>\n<p><span name=\"semicolon\"></span></p>\n<div class=\"codehilite\"><pre><span class=\"i\">statement</span>      → <span class=\"i\">exprStmt</span>\n               | <span class=\"i\">ifStmt</span>\n               | <span class=\"i\">printStmt</span>\n               | <span class=\"i\">block</span> ;\n\n<span class=\"i\">ifStmt</span>         → <span class=\"s\">&quot;if&quot;</span> <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;)&quot;</span> <span class=\"i\">statement</span>\n               ( <span class=\"s\">&quot;else&quot;</span> <span class=\"i\">statement</span> )? ;\n</pre></div>\n<aside name=\"semicolon\">\n<p>The semicolons in the rules aren&rsquo;t quoted, which means they are part of the\ngrammar metasyntax, not Lox&rsquo;s syntax. A block does not have a <code>;</code> at the end and\nan <code>if</code> statement doesn&rsquo;t either, unless the then or else statement happens to\nbe one that ends in a semicolon.</p>\n</aside>\n<p>An <code>if</code> statement has an expression for the condition, then a statement to execute\nif the condition is truthy. Optionally, it may also have an <code>else</code> keyword and a\nstatement to execute if the condition is falsey. 
The <span name=\"if-ast\">syntax\ntree node</span> has fields for each of those three pieces.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Expression : Expr expression&quot;,\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;If         : Expr condition, Stmt thenBranch,&quot;</span> +\n                  <span class=\"s\">&quot; Stmt elseBranch&quot;</span>,\n</pre><pre class=\"insert-after\">      &quot;Print      : Expr expression&quot;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<aside name=\"if-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#if-statement\">Appendix II</a>.</p>\n</aside>\n<p>Like other statements, the parser recognizes an <code>if</code> statement by the leading\n<code>if</code> keyword.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  private Stmt statement() {\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>statement</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">IF</span>)) <span class=\"k\">return</span> <span class=\"i\">ifStatement</span>();\n</pre><pre class=\"insert-after\">    if (match(PRINT)) return printStatement();\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>statement</em>()</div>\n\n<p>When it finds one, it calls this new method to parse the rest:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>statement</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Stmt</span> <span class=\"i\">ifStatement</span>() {\n    <span class=\"i\">consume</span>(<span class=\"i\">LEFT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;(&#39; after &#39;if&#39;.&quot;</span>);\n    <span class=\"t\">Expr</span> <span 
class=\"i\">condition</span> = <span class=\"i\">expression</span>();\n    <span class=\"i\">consume</span>(<span class=\"i\">RIGHT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;)&#39; after if condition.&quot;</span>);<span name=\"parens\"> </span>\n\n    <span class=\"t\">Stmt</span> <span class=\"i\">thenBranch</span> = <span class=\"i\">statement</span>();\n    <span class=\"t\">Stmt</span> <span class=\"i\">elseBranch</span> = <span class=\"k\">null</span>;\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">ELSE</span>)) {\n      <span class=\"i\">elseBranch</span> = <span class=\"i\">statement</span>();\n    }\n\n    <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">If</span>(<span class=\"i\">condition</span>, <span class=\"i\">thenBranch</span>, <span class=\"i\">elseBranch</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>statement</em>()</div>\n\n<aside name=\"parens\">\n<p>The parentheses around the condition are only half useful. You need some kind of\ndelimiter <em>between</em> the condition and the then statement, otherwise the parser\ncan&rsquo;t tell when it has reached the end of the condition expression. But the\n<em>opening</em> parenthesis after <code>if</code> doesn&rsquo;t do anything useful. Dennis Ritchie put\nit there so he could use <code>)</code> as the ending delimiter without having unbalanced\nparentheses.</p>\n<p>Other languages like Lua and some BASICs use a keyword like <code>then</code> as the ending\ndelimiter and don&rsquo;t have anything before the condition. Go and Swift instead\nrequire the statement to be a braced block. That lets them use the <code>{</code> at the\nbeginning of the statement to tell when the condition is done.</p>\n</aside>\n<p>As usual, the parsing code hews closely to the grammar. 
It detects an else\nclause by looking for the preceding <code>else</code> keyword. If there isn&rsquo;t one, the\n<code>elseBranch</code> field in the syntax tree is <code>null</code>.</p>\n<p>That seemingly innocuous optional else has, in fact, opened up an ambiguity in\nour grammar. Consider:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">if</span> (<span class=\"i\">first</span>) <span class=\"k\">if</span> (<span class=\"i\">second</span>) <span class=\"i\">whenTrue</span>(); <span class=\"k\">else</span> <span class=\"i\">whenFalse</span>();\n</pre></div>\n<p>Here&rsquo;s the riddle: Which <code>if</code> statement does that else clause belong to? This\nisn&rsquo;t just a theoretical question about how we notate our grammar. It actually\naffects how the code executes:</p>\n<ul>\n<li>\n<p>If we attach the else to the first <code>if</code> statement, then <code>whenFalse()</code> is\ncalled if <code>first</code> is falsey, regardless of what value <code>second</code> has.</p>\n</li>\n<li>\n<p>If we attach it to the second <code>if</code> statement, then <code>whenFalse()</code> is only\ncalled if <code>first</code> is truthy and <code>second</code> is falsey.</p>\n</li>\n</ul>\n<p>Since else clauses are optional, and there is no explicit delimiter marking the\nend of the <code>if</code> statement, the grammar is ambiguous when you nest <code>if</code>s in this\nway. This classic pitfall of syntax is called the <strong><a href=\"https://en.wikipedia.org/wiki/Dangling_else\">dangling else</a></strong> problem.</p>\n<p><span name=\"else\"></span></p><img class=\"above\" src=\"image/control-flow/dangling-else.png\" alt=\"Two ways the else can be interpreted.\" />\n<aside name=\"else\">\n<p>Here, formatting highlights the two ways the else could be parsed. 
But note that\nsince whitespace characters are ignored by the parser, this is only a guide to\nthe human reader.</p>\n</aside>\n<p>It <em>is</em> possible to define a context-free grammar that avoids the ambiguity\ndirectly, but it requires splitting most of the statement rules into pairs, one\nthat allows an <code>if</code> with an <code>else</code> and one that doesn&rsquo;t. It&rsquo;s annoying.</p>\n<p>Instead, most languages and parsers avoid the problem in an ad hoc way. No\nmatter what hack they use to get themselves out of the trouble, they always\nchoose the same interpretation<span class=\"em\">&mdash;</span>the <code>else</code> is bound to the nearest <code>if</code> that\nprecedes it.</p>\n<p>Our parser conveniently does that already. Since <code>ifStatement()</code> eagerly looks\nfor an <code>else</code> before returning, the innermost call to a nested series will claim\nthe else clause for itself before returning to the outer <code>if</code> statements.</p>\n<p>Syntax in hand, we are ready to interpret.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitExpressionStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitIfStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">If</span> <span class=\"i\">stmt</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">isTruthy</span>(<span class=\"i\">evaluate</span>(<span class=\"i\">stmt</span>.<span class=\"i\">condition</span>))) {\n      <span class=\"i\">execute</span>(<span class=\"i\">stmt</span>.<span class=\"i\">thenBranch</span>);\n    } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">stmt</span>.<span class=\"i\">elseBranch</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">execute</span>(<span class=\"i\">stmt</span>.<span class=\"i\">elseBranch</span>);\n    }\n    <span 
class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitExpressionStmt</em>()</div>\n\n<p>The interpreter implementation is a thin wrapper around the self-same Java code.\nIt evaluates the condition. If truthy, it executes the then branch. Otherwise,\nif there is an else branch, it executes that.</p>\n<p>If you compare this code to how the interpreter handles other syntax we&rsquo;ve\nimplemented, the part that makes control flow special is that Java <code>if</code>\nstatement. Most other syntax trees always evaluate their subtrees. Here, we may\nnot evaluate the then or else statement. If either of those has a side effect,\nthe choice not to evaluate it becomes user visible.</p>\n<h2><a href=\"#logical-operators\" id=\"logical-operators\"><small>9&#8202;.&#8202;3</small>Logical Operators</a></h2>\n<p>Since we don&rsquo;t have the conditional operator, you might think we&rsquo;re done with\nbranching, but no. Even without the ternary operator, there are two other\noperators that are technically control flow constructs<span class=\"em\">&mdash;</span>the logical operators\n<code>and</code> and <code>or</code>.</p>\n<p>These aren&rsquo;t like other binary operators because they <strong>short-circuit</strong>. If,\nafter evaluating the left operand, we know what the result of the logical\nexpression must be, we don&rsquo;t evaluate the right operand. For example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">false</span> <span class=\"k\">and</span> <span class=\"i\">sideEffect</span>();\n</pre></div>\n<p>For an <code>and</code> expression to evaluate to something truthy, both operands must be\ntruthy. 
We can see as soon as we evaluate the left <code>false</code> operand that that\nisn&rsquo;t going to be the case, so there&rsquo;s no need to evaluate <code>sideEffect()</code> and it\ngets skipped.</p>\n<p>This is why we didn&rsquo;t implement the logical operators with the other binary\noperators. Now we&rsquo;re ready. The two new operators are low in the precedence\ntable. Similar to <code>||</code> and <code>&amp;&amp;</code> in C, they each have their <span\nname=\"logical\">own</span> precedence with <code>or</code> lower than <code>and</code>. We slot them\nright between <code>assignment</code> and <code>equality</code>.</p>\n<aside name=\"logical\">\n<p>I&rsquo;ve always wondered why they don&rsquo;t have the same precedence, like the various\ncomparison or equality operators do.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"i\">expression</span>     → <span class=\"i\">assignment</span> ;\n<span class=\"i\">assignment</span>     → <span class=\"t\">IDENTIFIER</span> <span class=\"s\">&quot;=&quot;</span> <span class=\"i\">assignment</span>\n               | <span class=\"i\">logic_or</span> ;\n<span class=\"i\">logic_or</span>       → <span class=\"i\">logic_and</span> ( <span class=\"s\">&quot;or&quot;</span> <span class=\"i\">logic_and</span> )* ;\n<span class=\"i\">logic_and</span>      → <span class=\"i\">equality</span> ( <span class=\"s\">&quot;and&quot;</span> <span class=\"i\">equality</span> )* ;\n</pre></div>\n<p>Instead of falling back to <code>equality</code>, <code>assignment</code> now cascades to <code>logic_or</code>.\nThe two new rules, <code>logic_or</code> and <code>logic_and</code>, are <span\nname=\"same\">similar</span> to other binary operators. Then <code>logic_and</code> calls\nout to <code>equality</code> for its operands, and we chain back to the rest of the\nexpression rules.</p>\n<aside name=\"same\">\n<p>The <em>syntax</em> doesn&rsquo;t care that they short-circuit. 
That&rsquo;s a semantic concern.</p>\n</aside>\n<p>We could reuse the existing Expr.Binary class for these two new expressions\nsince they have the same fields. But then <code>visitBinaryExpr()</code> would have to\ncheck to see if the operator is one of the logical operators and use a different\ncode path to handle the short circuiting. I think it&rsquo;s cleaner to define a <span\nname=\"logical-ast\">new class</span> for these operators so that they get their\nown visit method.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Literal  : Object value&quot;,\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;Logical  : Expr left, Token operator, Expr right&quot;</span>,\n</pre><pre class=\"insert-after\">      &quot;Unary    : Token operator, Expr right&quot;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<aside name=\"logical-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#logical-expression\">Appendix II</a>.</p>\n</aside>\n<p>To weave the new expressions into the parser, we first change the parsing code\nfor assignment to call <code>or()</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  private Expr assignment() {\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>assignment</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"t\">Expr</span> <span class=\"i\">expr</span> = <span class=\"i\">or</span>();\n</pre><pre class=\"insert-after\">\n\n    if (match(EQUAL)) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>assignment</em>(), replace 1 line</div>\n\n<p>The code to parse a series of <code>or</code> expressions mirrors other binary operators.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after 
<em>assignment</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Expr</span> <span class=\"i\">or</span>() {\n    <span class=\"t\">Expr</span> <span class=\"i\">expr</span> = <span class=\"i\">and</span>();\n\n    <span class=\"k\">while</span> (<span class=\"i\">match</span>(<span class=\"i\">OR</span>)) {\n      <span class=\"t\">Token</span> <span class=\"i\">operator</span> = <span class=\"i\">previous</span>();\n      <span class=\"t\">Expr</span> <span class=\"i\">right</span> = <span class=\"i\">and</span>();\n      <span class=\"i\">expr</span> = <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Logical</span>(<span class=\"i\">expr</span>, <span class=\"i\">operator</span>, <span class=\"i\">right</span>);\n    }\n\n    <span class=\"k\">return</span> <span class=\"i\">expr</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>assignment</em>()</div>\n\n<p>Its operands are the next higher level of precedence, the new <code>and</code> expression.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>or</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Expr</span> <span class=\"i\">and</span>() {\n    <span class=\"t\">Expr</span> <span class=\"i\">expr</span> = <span class=\"i\">equality</span>();\n\n    <span class=\"k\">while</span> (<span class=\"i\">match</span>(<span class=\"i\">AND</span>)) {\n      <span class=\"t\">Token</span> <span class=\"i\">operator</span> = <span class=\"i\">previous</span>();\n      <span class=\"t\">Expr</span> <span class=\"i\">right</span> = <span class=\"i\">equality</span>();\n      <span class=\"i\">expr</span> = <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Logical</span>(<span class=\"i\">expr</span>, <span class=\"i\">operator</span>, <span class=\"i\">right</span>);\n    }\n\n    <span class=\"k\">return</span> <span 
class=\"i\">expr</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>or</em>()</div>\n\n<p>That calls <code>equality()</code> for its operands, and with that, the expression parser\nis all tied back together again. We&rsquo;re ready to interpret.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitLiteralExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">visitLogicalExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Logical</span> <span class=\"i\">expr</span>) {\n    <span class=\"t\">Object</span> <span class=\"i\">left</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">expr</span>.<span class=\"i\">left</span>);\n\n    <span class=\"k\">if</span> (<span class=\"i\">expr</span>.<span class=\"i\">operator</span>.<span class=\"i\">type</span> == <span class=\"t\">TokenType</span>.<span class=\"i\">OR</span>) {\n      <span class=\"k\">if</span> (<span class=\"i\">isTruthy</span>(<span class=\"i\">left</span>)) <span class=\"k\">return</span> <span class=\"i\">left</span>;\n    } <span class=\"k\">else</span> {\n      <span class=\"k\">if</span> (!<span class=\"i\">isTruthy</span>(<span class=\"i\">left</span>)) <span class=\"k\">return</span> <span class=\"i\">left</span>;\n    }\n\n    <span class=\"k\">return</span> <span class=\"i\">evaluate</span>(<span class=\"i\">expr</span>.<span class=\"i\">right</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitLiteralExpr</em>()</div>\n\n<p>If you compare this to the <a href=\"evaluating-expressions.html\">earlier chapter&rsquo;s</a> <code>visitBinaryExpr()</code>\nmethod, you can see the difference. Here, we evaluate the left operand first. We\nlook at its value to see if we can short-circuit. 
If not, and only then, do we\nevaluate the right operand.</p>\n<p>The other interesting piece here is deciding what actual value to return. Since\nLox is dynamically typed, we allow operands of any type and use truthiness to\ndetermine what each operand represents. We apply similar reasoning to the\nresult. Instead of promising to literally return <code>true</code> or <code>false</code>, a logic\noperator merely guarantees it will return a value with appropriate truthiness.</p>\n<p>Fortunately, we have values with proper truthiness right at hand<span class=\"em\">&mdash;</span>the results\nof the operands themselves. So we use those. For example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> <span class=\"s\">&quot;hi&quot;</span> <span class=\"k\">or</span> <span class=\"n\">2</span>; <span class=\"c\">// &quot;hi&quot;.</span>\n<span class=\"k\">print</span> <span class=\"k\">nil</span> <span class=\"k\">or</span> <span class=\"s\">&quot;yes&quot;</span>; <span class=\"c\">// &quot;yes&quot;.</span>\n</pre></div>\n<p>On the first line, <code>\"hi\"</code> is truthy, so the <code>or</code> short-circuits and returns\nthat. On the second line, <code>nil</code> is falsey, so it evaluates and returns the\nsecond operand, <code>\"yes\"</code>.</p>\n<p>That covers all of the branching primitives in Lox. We&rsquo;re ready to jump ahead to\nloops. You see what I did there? <em>Jump. Ahead.</em> Get it? See, it&rsquo;s like a\nreference to<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>oh, forget it.</p>\n<h2><a href=\"#while-loops\" id=\"while-loops\"><small>9&#8202;.&#8202;4</small>While Loops</a></h2>\n<p>Lox features two looping control flow statements, <code>while</code> and <code>for</code>. The <code>while</code>\nloop is the simpler one, so we&rsquo;ll start there. 
Its grammar is the same as in C.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">statement</span>      → <span class=\"i\">exprStmt</span>\n               | <span class=\"i\">ifStmt</span>\n               | <span class=\"i\">printStmt</span>\n               | <span class=\"i\">whileStmt</span>\n               | <span class=\"i\">block</span> ;\n\n<span class=\"i\">whileStmt</span>      → <span class=\"s\">&quot;while&quot;</span> <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;)&quot;</span> <span class=\"i\">statement</span> ;\n</pre></div>\n<p>We add another clause to the statement rule that points to the new rule for\nwhile. It takes a <code>while</code> keyword, followed by a parenthesized condition\nexpression, then a statement for the body. That new grammar rule gets a <span\nname=\"while-ast\">syntax tree node</span>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Print      : Expr expression&quot;,\n</pre><pre class=\"insert-before\">      <span class=\"s\">&quot;Var        : Token name, Expr initializer&quot;</span><span class=\"insert-comma\">,</span>\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()<br>\nadd <em>&ldquo;,&rdquo;</em> to previous line</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;While      : Expr condition, Stmt body&quot;</span>\n</pre><pre class=\"insert-after\">    ));\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>(), add <em>&ldquo;,&rdquo;</em> to previous line</div>\n\n<aside name=\"while-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#while-statement\">Appendix II</a>.</p>\n</aside>\n<p>The node stores the condition and body. Here you can see why it&rsquo;s nice to have\nseparate base classes for expressions and statements. 
The field declarations\nmake it clear that the condition is an expression and the body is a statement.</p>\n<p>Over in the parser, we follow the same process we used for <code>if</code> statements.\nFirst, we add another case in <code>statement()</code> to detect and match the leading\nkeyword.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    if (match(PRINT)) return printStatement();\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>statement</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">WHILE</span>)) <span class=\"k\">return</span> <span class=\"i\">whileStatement</span>();\n</pre><pre class=\"insert-after\">    if (match(LEFT_BRACE)) return new Stmt.Block(block());\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>statement</em>()</div>\n\n<p>That delegates the real work to this method:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>varDeclaration</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Stmt</span> <span class=\"i\">whileStatement</span>() {\n    <span class=\"i\">consume</span>(<span class=\"i\">LEFT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;(&#39; after &#39;while&#39;.&quot;</span>);\n    <span class=\"t\">Expr</span> <span class=\"i\">condition</span> = <span class=\"i\">expression</span>();\n    <span class=\"i\">consume</span>(<span class=\"i\">RIGHT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;)&#39; after condition.&quot;</span>);\n    <span class=\"t\">Stmt</span> <span class=\"i\">body</span> = <span class=\"i\">statement</span>();\n\n    <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">While</span>(<span class=\"i\">condition</span>, <span class=\"i\">body</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after 
<em>varDeclaration</em>()</div>\n\n<p>The grammar is dead simple and this is a straight translation of it to Java.\nSpeaking of translating straight to Java, here&rsquo;s how we execute the new syntax:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitVarStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitWhileStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">While</span> <span class=\"i\">stmt</span>) {\n    <span class=\"k\">while</span> (<span class=\"i\">isTruthy</span>(<span class=\"i\">evaluate</span>(<span class=\"i\">stmt</span>.<span class=\"i\">condition</span>))) {\n      <span class=\"i\">execute</span>(<span class=\"i\">stmt</span>.<span class=\"i\">body</span>);\n    }\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitVarStmt</em>()</div>\n\n<p>Like the visit method for <code>if</code>, this visitor uses the corresponding Java\nfeature. This method isn&rsquo;t complex, but it makes Lox much more powerful. We can\nfinally write a program whose running time isn&rsquo;t strictly bound by the length of\nthe source code.</p>\n<h2><a href=\"#for-loops\" id=\"for-loops\"><small>9&#8202;.&#8202;5</small>For Loops</a></h2>\n<p>We&rsquo;re down to the last control flow construct, <span name=\"for\">Ye Olde</span>\nC-style <code>for</code> loop. 
I probably don&rsquo;t need to remind you, but it looks like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">for</span> (<span class=\"k\">var</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"n\">10</span>; <span class=\"i\">i</span> = <span class=\"i\">i</span> + <span class=\"n\">1</span>) <span class=\"k\">print</span> <span class=\"i\">i</span>;\n</pre></div>\n<p>In grammarese, that&rsquo;s:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">statement</span>      → <span class=\"i\">exprStmt</span>\n               | <span class=\"i\">forStmt</span>\n               | <span class=\"i\">ifStmt</span>\n               | <span class=\"i\">printStmt</span>\n               | <span class=\"i\">whileStmt</span>\n               | <span class=\"i\">block</span> ;\n\n<span class=\"i\">forStmt</span>        → <span class=\"s\">&quot;for&quot;</span> <span class=\"s\">&quot;(&quot;</span> ( <span class=\"i\">varDecl</span> | <span class=\"i\">exprStmt</span> | <span class=\"s\">&quot;;&quot;</span> )\n                 <span class=\"i\">expression</span>? <span class=\"s\">&quot;;&quot;</span>\n                 <span class=\"i\">expression</span>? <span class=\"s\">&quot;)&quot;</span> <span class=\"i\">statement</span> ;\n</pre></div>\n<aside name=\"for\">\n<p>Most modern languages have a higher-level looping statement for iterating over\narbitrary user-defined sequences. C# has <code>foreach</code>, Java has &ldquo;enhanced for&rdquo;,\neven C++ has range-based <code>for</code> statements now. Those offer cleaner syntax than\nC&rsquo;s <code>for</code> statement by implicitly calling into an iteration protocol that the\nobject being looped over supports.</p>\n<p>I love those. For Lox, though, we&rsquo;re limited by building up the interpreter a\nchapter at a time. 
We don&rsquo;t have objects and methods yet, so we have no way of\ndefining an iteration protocol that the <code>for</code> loop could use. So we&rsquo;ll stick\nwith the old school C <code>for</code> loop. Think of it as &ldquo;vintage&rdquo;. The fixie of control\nflow statements.</p>\n</aside>\n<p>Inside the parentheses, you have three clauses separated by semicolons:</p>\n<ol>\n<li>\n<p>The first clause is the <em>initializer</em>. It is executed exactly once, before\nanything else. It&rsquo;s usually an expression, but for convenience, we also\nallow a variable declaration. In that case, the variable is scoped to the\nrest of the <code>for</code> loop<span class=\"em\">&mdash;</span>the other two clauses and the body.</p>\n</li>\n<li>\n<p>Next is the <em>condition</em>. As in a <code>while</code> loop, this expression controls when\nto exit the loop. It&rsquo;s evaluated once at the beginning of each iteration,\nincluding the first. If the result is truthy, it executes the loop body.\nOtherwise, it bails.</p>\n</li>\n<li>\n<p>The last clause is the <em>increment</em>. It&rsquo;s an arbitrary expression that does\nsome work at the end of each loop iteration. The result of the expression is\ndiscarded, so it must have a side effect to be useful. In practice, it\nusually increments a variable.</p>\n</li>\n</ol>\n<p>Any of these clauses can be omitted. Following the closing parenthesis is a\nstatement for the body, which is typically a block.</p>\n<h3><a href=\"#desugaring\" id=\"desugaring\"><small>9&#8202;.&#8202;5&#8202;.&#8202;1</small>Desugaring</a></h3>\n<p>That&rsquo;s a lot of machinery, but note that none of it does anything you couldn&rsquo;t\ndo with the statements we already have. If <code>for</code> loops didn&rsquo;t support\ninitializer clauses, you could just put the initializer expression before the\n<code>for</code> statement. 
Without an increment clause, you could simply put the increment\nexpression at the end of the body yourself.</p>\n<p>In other words, Lox doesn&rsquo;t <em>need</em> <code>for</code> loops, they just make some common code\npatterns more pleasant to write. These kinds of features are called <span\nname=\"sugar\"><strong>syntactic sugar</strong></span>. For example, the previous <code>for</code> loop\ncould be rewritten like so:</p>\n<aside name=\"sugar\">\n<p>This delightful turn of phrase was coined by Peter J. Landin in 1964 to describe\nhow some of the nice expression forms supported by languages like ALGOL were a\nsweetener sprinkled over the more fundamental<span class=\"em\">&mdash;</span>but presumably less palatable<span class=\"em\">&mdash;</span>lambda calculus underneath.</p><img class=\"above\" src=\"image/control-flow/sugar.png\" alt=\"Slightly more than a spoonful of sugar.\" />\n</aside>\n<div class=\"codehilite\"><pre>{\n  <span class=\"k\">var</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>;\n  <span class=\"k\">while</span> (<span class=\"i\">i</span> &lt; <span class=\"n\">10</span>) {\n    <span class=\"k\">print</span> <span class=\"i\">i</span>;\n    <span class=\"i\">i</span> = <span class=\"i\">i</span> + <span class=\"n\">1</span>;\n  }\n}\n</pre></div>\n<p>This script has the exact same semantics as the previous one, though it&rsquo;s not as\neasy on the eyes. Syntactic sugar features like Lox&rsquo;s <code>for</code> loop make a language\nmore pleasant and productive to work in. But, especially in sophisticated\nlanguage implementations, every language feature that requires back-end support\nand optimization is expensive.</p>\n<p>We can have our cake and eat it too by <span\nname=\"caramel\"><strong>desugaring</strong></span>. 
That funny word describes a process where\nthe front end takes code using syntax sugar and translates it to a more\nprimitive form that the back end already knows how to execute.</p>\n<aside name=\"caramel\">\n<p>Oh, how I wish the accepted term for this was &ldquo;caramelization&rdquo;. Why introduce a\nmetaphor if you aren&rsquo;t going to stick with it?</p>\n</aside>\n<p>We&rsquo;re going to desugar <code>for</code> loops to the <code>while</code> loops and other statements the\ninterpreter already handles. In our simple interpreter, desugaring really\ndoesn&rsquo;t save us much work, but it does give me an excuse to introduce you to the\ntechnique. So, unlike the previous statements, we <em>won&rsquo;t</em> add a new syntax tree\nnode. Instead, we go straight to parsing. First, add an import we&rsquo;ll need soon.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">import java.util.ArrayList;\n</pre><div class=\"source-file\"><em>lox/Parser.java</em></div>\n<pre class=\"insert\"><span class=\"k\">import</span> <span class=\"i\">java.util.Arrays</span>;\n</pre><pre class=\"insert-after\">import java.util.List;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em></div>\n\n<p>Like every statement, we start parsing a <code>for</code> loop by matching its keyword.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  private Stmt statement() {\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>statement</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">FOR</span>)) <span class=\"k\">return</span> <span class=\"i\">forStatement</span>();\n</pre><pre class=\"insert-after\">    if (match(IF)) return ifStatement();\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>statement</em>()</div>\n\n<p>Here is where it gets interesting. 
The desugaring is going to happen here, so\nwe&rsquo;ll build this method a piece at a time, starting with the opening parenthesis\nbefore the clauses.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>statement</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Stmt</span> <span class=\"i\">forStatement</span>() {\n    <span class=\"i\">consume</span>(<span class=\"i\">LEFT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;(&#39; after &#39;for&#39;.&quot;</span>);\n\n    <span class=\"c\">// More here...</span>\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>statement</em>()</div>\n\n<p>The first clause following that is the initializer.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    consume(LEFT_PAREN, &quot;Expect '(' after 'for'.&quot;);\n\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>forStatement</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"t\">Stmt</span> <span class=\"i\">initializer</span>;\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">SEMICOLON</span>)) {\n      <span class=\"i\">initializer</span> = <span class=\"k\">null</span>;\n    } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">VAR</span>)) {\n      <span class=\"i\">initializer</span> = <span class=\"i\">varDeclaration</span>();\n    } <span class=\"k\">else</span> {\n      <span class=\"i\">initializer</span> = <span class=\"i\">expressionStatement</span>();\n    }\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>forStatement</em>(), replace 1 line</div>\n\n<p>If the token following the <code>(</code> is a semicolon then the initializer has been\nomitted. 
Otherwise, we check for a <code>var</code> keyword to see if it&rsquo;s a <span\nname=\"variable\">variable</span> declaration. If neither of those matched, it\nmust be an expression. We parse that and wrap it in an expression statement so\nthat the initializer is always of type Stmt.</p>\n<aside name=\"variable\">\n<p>In a previous chapter, I said we can split expression and statement syntax trees\ninto two separate class hierarchies because there&rsquo;s no single place in the\ngrammar that allows both an expression and a statement. That wasn&rsquo;t <em>entirely</em>\ntrue, I guess.</p>\n</aside>\n<p>Next up is the condition.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      initializer = expressionStatement();\n    }\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>forStatement</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"t\">Expr</span> <span class=\"i\">condition</span> = <span class=\"k\">null</span>;\n    <span class=\"k\">if</span> (!<span class=\"i\">check</span>(<span class=\"i\">SEMICOLON</span>)) {\n      <span class=\"i\">condition</span> = <span class=\"i\">expression</span>();\n    }\n    <span class=\"i\">consume</span>(<span class=\"i\">SEMICOLON</span>, <span class=\"s\">&quot;Expect &#39;;&#39; after loop condition.&quot;</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>forStatement</em>()</div>\n\n<p>Again, we look for a semicolon to see if the clause has been omitted. 
The last\nclause is the increment.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    consume(SEMICOLON, &quot;Expect ';' after loop condition.&quot;);\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>forStatement</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"t\">Expr</span> <span class=\"i\">increment</span> = <span class=\"k\">null</span>;\n    <span class=\"k\">if</span> (!<span class=\"i\">check</span>(<span class=\"i\">RIGHT_PAREN</span>)) {\n      <span class=\"i\">increment</span> = <span class=\"i\">expression</span>();\n    }\n    <span class=\"i\">consume</span>(<span class=\"i\">RIGHT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;)&#39; after for clauses.&quot;</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>forStatement</em>()</div>\n\n<p>It&rsquo;s similar to the condition clause except this one is terminated by the\nclosing parenthesis. All that remains is the <span name=\"body\">body</span>.</p>\n<aside name=\"body\">\n<p>Is it just me or does that sound morbid? &ldquo;All that remained<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>was the <em>body</em>&rdquo;.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">    consume(RIGHT_PAREN, &quot;Expect ')' after for clauses.&quot;);\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>forStatement</em>()</div>\n<pre class=\"insert\">    <span class=\"t\">Stmt</span> <span class=\"i\">body</span> = <span class=\"i\">statement</span>();\n\n    <span class=\"k\">return</span> <span class=\"i\">body</span>;\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>forStatement</em>()</div>\n\n<p>We&rsquo;ve parsed all of the various pieces of the <code>for</code> loop and the resulting AST\nnodes are sitting in a handful of Java local variables. 
This is where the\ndesugaring comes in. We take those and use them to synthesize syntax tree nodes\nthat express the semantics of the <code>for</code> loop, like the hand-desugared example I\nshowed you earlier.</p>\n<p>The code is a little simpler if we work backward, so we start with the increment\nclause.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    Stmt body = statement();\n\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>forStatement</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">increment</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">body</span> = <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">Block</span>(\n          <span class=\"t\">Arrays</span>.<span class=\"i\">asList</span>(\n              <span class=\"i\">body</span>,\n              <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">Expression</span>(<span class=\"i\">increment</span>)));\n    }\n\n</pre><pre class=\"insert-after\">    return body;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>forStatement</em>()</div>\n\n<p>The increment, if there is one, executes after the body in each iteration of the\nloop. 
We do that by replacing the body with a little block that contains the\noriginal body followed by an expression statement that evaluates the increment.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    }\n\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>forStatement</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">condition</span> == <span class=\"k\">null</span>) <span class=\"i\">condition</span> = <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Literal</span>(<span class=\"k\">true</span>);\n    <span class=\"i\">body</span> = <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">While</span>(<span class=\"i\">condition</span>, <span class=\"i\">body</span>);\n\n</pre><pre class=\"insert-after\">    return body;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>forStatement</em>()</div>\n\n<p>Next, we take the condition and the body and build the loop using a primitive\n<code>while</code> loop. If the condition is omitted, we jam in <code>true</code> to make an infinite\nloop.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    body = new Stmt.While(condition, body);\n\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>forStatement</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">initializer</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">body</span> = <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">Block</span>(<span class=\"t\">Arrays</span>.<span class=\"i\">asList</span>(<span class=\"i\">initializer</span>, <span class=\"i\">body</span>));\n    }\n\n</pre><pre class=\"insert-after\">    return body;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>forStatement</em>()</div>\n\n<p>Finally, if there is an initializer, it runs once before the entire loop. 
We do\nthat by, again, replacing the whole statement with a block that runs the\ninitializer and then executes the loop.</p>\n<p>That&rsquo;s it. Our interpreter now supports C-style <code>for</code> loops and we didn&rsquo;t have\nto touch the Interpreter class at all. Since we desugared to nodes the\ninterpreter already knows how to visit, there is no more work to do.</p>\n<p>Finally, Lox is powerful enough to entertain us, at least for a few minutes.\nHere&rsquo;s a tiny program to print the first 21 elements in the Fibonacci\nsequence:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">0</span>;\n<span class=\"k\">var</span> <span class=\"i\">temp</span>;\n\n<span class=\"k\">for</span> (<span class=\"k\">var</span> <span class=\"i\">b</span> = <span class=\"n\">1</span>; <span class=\"i\">a</span> &lt; <span class=\"n\">10000</span>; <span class=\"i\">b</span> = <span class=\"i\">temp</span> + <span class=\"i\">b</span>) {\n  <span class=\"k\">print</span> <span class=\"i\">a</span>;\n  <span class=\"i\">temp</span> = <span class=\"i\">a</span>;\n  <span class=\"i\">a</span> = <span class=\"i\">b</span>;\n}\n</pre></div>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>A few chapters from now, when Lox supports first-class functions and dynamic\ndispatch, we technically won&rsquo;t <em>need</em> branching statements built into the\nlanguage. Show how conditional execution can be implemented in terms of\nthose. Name a language that uses this technique for its control flow.</p>\n</li>\n<li>\n<p>Likewise, looping can be implemented using those same tools, provided our\ninterpreter supports an important optimization. What is it, and why is it\nnecessary? 
Name a language that uses this technique for iteration.</p>\n</li>\n<li>\n<p>Unlike Lox, most other C-style languages also support <code>break</code> and <code>continue</code>\nstatements inside loops. Add support for <code>break</code> statements.</p>\n<p>The syntax is a <code>break</code> keyword followed by a semicolon. It should be a\nsyntax error to have a <code>break</code> statement appear outside of any enclosing\nloop. At runtime, a <code>break</code> statement causes execution to jump to the end of\nthe nearest enclosing loop and proceed from there. Note that the <code>break</code>\nmay be nested inside other blocks and <code>if</code> statements that also need to be\nexited.</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: Spoonfuls of Syntactic Sugar</a></h2>\n<p>When you design your own language, you choose how much syntactic sugar to pour\ninto the grammar. Do you make an unsweetened health food where each semantic\noperation maps to a single syntactic unit, or some decadent dessert where every\nbit of behavior can be expressed ten different ways? Successful languages\ninhabit all points along this continuum.</p>\n<p>On the extreme acrid end are those with ruthlessly minimal syntax like Lisp,\nForth, and Smalltalk. Lispers famously claim their language &ldquo;has no syntax&rdquo;,\nwhile Smalltalkers proudly show that you can fit the entire grammar on an index\ncard. This tribe has the philosophy that the <em>language</em> doesn&rsquo;t need syntactic\nsugar. Instead, the minimal syntax and semantics it provides are powerful enough\nto let library code be as expressive as if it were part of the language itself.</p>\n<p>Near these are languages like C, Lua, and Go. They aim for simplicity and\nclarity over minimalism. Some, like Go, deliberately eschew both syntactic sugar\nand the kind of syntactic extensibility of the previous category. 
They want the\nsyntax to get out of the way of the semantics, so they focus on keeping both the\ngrammar and libraries simple. Code should be obvious more than beautiful.</p>\n<p>Somewhere in the middle you have languages like Java, C#, and Python. Eventually\nyou reach Ruby, C++, Perl, and D<span class=\"em\">&mdash;</span>languages which have stuffed so much syntax\ninto their grammar, they are running out of punctuation characters on the\nkeyboard.</p>\n<p>To some degree, location on the spectrum correlates with age. It&rsquo;s relatively\neasy to add bits of syntactic sugar in later releases. New syntax is a crowd\npleaser, and it&rsquo;s less likely to break existing programs than mucking with the\nsemantics. Once added, you can never take it away, so languages tend to sweeten\nwith time. One of the main benefits of creating a new language from scratch is\nit gives you an opportunity to scrape off those accumulated layers of frosting\nand start over.</p>\n<p>Syntactic sugar has a bad rap among the PL intelligentsia. There&rsquo;s a real fetish\nfor minimalism in that crowd. There is some justification for that. Poorly\ndesigned, unneeded syntax raises the cognitive load without adding enough\nexpressiveness to carry its weight. Since there is always pressure to cram new\nfeatures into the language, it takes discipline and a focus on simplicity to\navoid bloat. Once you add some syntax, you&rsquo;re stuck with it, so it&rsquo;s smart to be\nparsimonious.</p>\n<p>At the same time, most successful languages do have fairly complex grammars, at\nleast by the time they are widely used. 
Programmers spend a ton of time in their\nlanguage of choice, and a few niceties here and there really can improve the\ncomfort and efficiency of their work.</p>\n<p>Striking the right balance<span class=\"em\">&mdash;</span>choosing the right level of sweetness for your\nlanguage<span class=\"em\">&mdash;</span>relies on your own sense of taste.</p>\n</div>\n\n<footer>\n<a href=\"functions.html\" class=\"next\">\n  Next Chapter: &ldquo;Functions&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/dedication.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Dedication &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h2><small></small>Dedication</h2>\n<hr>\n\n<div class=\"prev-next\">\n    <a href=\"index.html\" title=\"Crafting Interpreters\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"contents.html\" title=\"Table of Contents\">&uarr;&nbsp;Up</a>\n    <a href=\"acknowledgements.html\" title=\"Acknowledgements\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"index.html\" title=\"Crafting Interpreters\" 
class=\"prev\">←</a>\n<a href=\"acknowledgements.html\" title=\"Acknowledgements\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h2><small></small>Dedication</h2>\n<hr>\n\n<div class=\"prev-next\">\n    <a href=\"index.html\" title=\"Crafting Interpreters\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"contents.html\" title=\"Table of Contents\">&uarr;&nbsp;Up</a>\n    <a href=\"acknowledgements.html\" title=\"Acknowledgements\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <h1 class=\"part\">Dedication</h1>\n\n\n<div class=\"dedication\"><img src=\"image/ginny.png\" alt=\"My beloved dog and her stupid face.\" />\n<p>To Ginny, I miss your stupid face.</p>\n</div>\n\n<footer>\n<a href=\"acknowledgements.html\" class=\"next\">\n  Next Part: &ldquo;Acknowledgements&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/evaluating-expressions.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Evaluating Expressions &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Evaluating Expressions<small>7</small></a></h3>\n\n<ul>\n    <li><a href=\"#representing-values\"><small>7.1</small> Representing Values</a></li>\n    <li><a href=\"#evaluating-expressions\"><small>7.2</small> Evaluating Expressions</a></li>\n    <li><a href=\"#runtime-errors\"><small>7.3</small> Runtime Errors</a></li>\n    <li><a href=\"#hooking-up-the-interpreter\"><small>7.4</small> Hooking Up the Interpreter</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a 
href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Static and Dynamic Typing</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"parsing-expressions.html\" title=\"Parsing Expressions\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"statements-and-state.html\" title=\"Statements and State\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"parsing-expressions.html\" title=\"Parsing Expressions\" class=\"prev\">←</a>\n<a href=\"statements-and-state.html\" title=\"Statements and State\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Evaluating Expressions<small>7</small></a></h3>\n\n<ul>\n    <li><a href=\"#representing-values\"><small>7.1</small> Representing Values</a></li>\n    <li><a href=\"#evaluating-expressions\"><small>7.2</small> Evaluating Expressions</a></li>\n    <li><a href=\"#runtime-errors\"><small>7.3</small> Runtime Errors</a></li>\n    <li><a href=\"#hooking-up-the-interpreter\"><small>7.4</small> Hooking Up the Interpreter</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Static and Dynamic Typing</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"parsing-expressions.html\" title=\"Parsing Expressions\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"statements-and-state.html\" title=\"Statements and 
State\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">7</div>\n  <h1>Evaluating Expressions</h1>\n\n<blockquote>\n<p>You are my creator, but I am your master; Obey!</p>\n<p><cite>Mary Shelley, <em>Frankenstein</em></cite></p>\n</blockquote>\n<p>If you want to properly set the mood for this chapter, try to conjure up a\nthunderstorm, one of those swirling tempests that likes to yank open shutters at\nthe climax of the story. Maybe toss in a few bolts of lightning. In this\nchapter, our interpreter will take breath, open its eyes, and execute some code.</p>\n<p><span name=\"spooky\"></span></p><img src=\"image/evaluating-expressions/lightning.png\" alt=\"A bolt of lightning strikes a Victorian mansion. Spooky!\" />\n<aside name=\"spooky\">\n<p>A decrepit Victorian mansion is optional, but adds to the ambiance.</p>\n</aside>\n<p>There are all manner of ways that language implementations make a computer do\nwhat the user&rsquo;s source code commands. They can compile it to machine code,\ntranslate it to another high-level language, or reduce it to some bytecode\nformat for a virtual machine to run. For our first interpreter, though, we are\ngoing to take the simplest, shortest path and execute the syntax tree itself.</p>\n<p>Right now, our parser only supports expressions. So, to &ldquo;execute&rdquo; code, we will\nevaluate an expression and produce a value. For each kind of expression syntax\nwe can parse<span class=\"em\">&mdash;</span>literal, operator, etc.<span class=\"em\">&mdash;</span>we need a corresponding chunk of code\nthat knows how to evaluate that tree and produce a result. 
That raises two\nquestions:</p>\n<ol>\n<li>\n<p>What kinds of values do we produce?</p>\n</li>\n<li>\n<p>How do we organize those chunks of code?</p>\n</li>\n</ol>\n<p>Taking them on one at a time<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<h2><a href=\"#representing-values\" id=\"representing-values\"><small>7&#8202;.&#8202;1</small>Representing Values</a></h2>\n<p>In Lox, <span name=\"value\">values</span> are created by literals, computed by\nexpressions, and stored in variables. The user sees these as <em>Lox</em> objects, but\nthey are implemented in the underlying language our interpreter is written in.\nThat means bridging the lands of Lox&rsquo;s dynamic typing and Java&rsquo;s static types. A\nvariable in Lox can store a value of any (Lox) type, and can even store values\nof different types at different points in time. What Java type might we use to\nrepresent that?</p>\n<aside name=\"value\">\n<p>Here, I&rsquo;m using &ldquo;value&rdquo; and &ldquo;object&rdquo; pretty much interchangeably.</p>\n<p>Later in the C interpreter we&rsquo;ll make a slight distinction between them, but\nthat&rsquo;s mostly to have unique terms for two different corners of the\nimplementation<span class=\"em\">&mdash;</span>in-place versus heap-allocated data. From the user&rsquo;s\nperspective, the terms are synonymous.</p>\n</aside>\n<p>Given a Java variable with that static type, we must also be able to determine\nwhich kind of value it holds at runtime. When the interpreter executes a <code>+</code>\noperator, it needs to tell if it is adding two numbers or concatenating two\nstrings. Is there a Java type that can hold numbers, strings, Booleans, and\nmore? Is there one that can tell us what its runtime type is? There is! Good old\njava.lang.Object.</p>\n<p>In places in the interpreter where we need to store a Lox value, we can use\nObject as the type. 
Java has boxed versions of its primitive types that all\nsubclass Object, so we can use those for Lox&rsquo;s built-in types:</p><table>\n<thead>\n<tr>\n  <td>Lox type</td>\n  <td>Java representation</td>\n</tr>\n</thead>\n<tbody>\n<tr>\n  <td>Any Lox value</td>\n  <td>Object</td>\n</tr>\n<tr>\n  <td><code>nil</code></td>\n  <td><code>null</code></td>\n</tr>\n<tr>\n  <td>Boolean</td>\n  <td>Boolean</td>\n</tr>\n<tr>\n  <td>number</td>\n  <td>Double</td>\n</tr>\n<tr>\n  <td>string</td>\n  <td>String</td>\n</tr>\n</tbody>\n</table>\n<p>Given a value of static type Object, we can determine if the runtime value is a\nnumber or a string or whatever using Java&rsquo;s built-in <code>instanceof</code> operator. In\nother words, the <span name=\"jvm\">JVM</span>&rsquo;s own object representation\nconveniently gives us everything we need to implement Lox&rsquo;s built-in types.\nWe&rsquo;ll have to do a little more work later when we add Lox&rsquo;s notions of\nfunctions, classes, and instances, but Object and the boxed primitive classes\nare sufficient for the types we need right now.</p>\n<aside name=\"jvm\">\n<p>Another thing we need to do with values is manage their memory, and Java does\nthat too. A handy object representation and a really nice garbage collector are\nthe main reasons we&rsquo;re writing our first interpreter in Java.</p>\n</aside>\n<h2><a href=\"#evaluating-expressions\" id=\"evaluating-expressions\"><small>7&#8202;.&#8202;2</small>Evaluating Expressions</a></h2>\n<p>Next, we need blobs of code to implement the evaluation logic for each kind of\nexpression we can parse. We could stuff that code into the syntax tree classes\nin something like an <code>interpret()</code> method. In effect, we could tell each syntax\ntree node, &ldquo;Interpret thyself&rdquo;. This is the Gang of Four&rsquo;s\n<a href=\"https://en.wikipedia.org/wiki/Interpreter_pattern\">Interpreter design pattern</a>. 
It&rsquo;s a neat pattern, but like I mentioned\nearlier, it gets messy if we jam all sorts of logic into the tree classes.</p>\n<p>Instead, we&rsquo;re going to reuse our groovy <a href=\"representing-code.html#the-visitor-pattern\">Visitor pattern</a>. In the previous\nchapter, we created an AstPrinter class. It took in a syntax tree and\nrecursively traversed it, building up a string which it ultimately returned.\nThat&rsquo;s almost exactly what a real interpreter does, except instead of\nconcatenating strings, it computes values.</p>\n<p>We start with a new class.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">class</span> <span class=\"t\">Interpreter</span> <span class=\"k\">implements</span> <span class=\"t\">Expr</span>.<span class=\"t\">Visitor</span>&lt;<span class=\"t\">Object</span>&gt; {\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, create new file</div>\n\n<p>The class declares that it&rsquo;s a visitor. The return type of the visit methods\nwill be Object, the root class that we use to refer to a Lox value in our Java\ncode. To satisfy the Visitor interface, we need to define visit methods for each\nof the four expression tree classes our parser produces. We&rsquo;ll start with the\nsimplest<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<h3><a href=\"#evaluating-literals\" id=\"evaluating-literals\"><small>7&#8202;.&#8202;2&#8202;.&#8202;1</small>Evaluating literals</a></h3>\n<p>The leaves of an expression tree<span class=\"em\">&mdash;</span>the atomic bits of syntax that all other\nexpressions are composed of<span class=\"em\">&mdash;</span>are <span name=\"leaf\">literals</span>. Literals\nare almost values already, but the distinction is important. 
A literal is a <em>bit\nof syntax</em> that produces a value. A literal always appears somewhere in the\nuser&rsquo;s source code. Lots of values are produced by computation and don&rsquo;t exist\nanywhere in the code itself. Those aren&rsquo;t literals. A literal comes from the\nparser&rsquo;s domain. Values are an interpreter concept, part of the runtime&rsquo;s world.</p>\n<aside name=\"leaf\">\n<p>In the <a href=\"statements-and-state.html\">next chapter</a>, when we implement variables, we&rsquo;ll add identifier\nexpressions, which are also leaf nodes.</p>\n</aside>\n<p>So, much like we converted a literal <em>token</em> into a literal <em>syntax tree node</em>\nin the parser, now we convert the literal tree node into a runtime value. That\nturns out to be trivial.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin class <em>Interpreter</em></div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">visitLiteralExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Literal</span> <span class=\"i\">expr</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">expr</span>.<span class=\"i\">value</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in class <em>Interpreter</em></div>\n\n<p>We eagerly produced the runtime value way back during scanning and stuffed it in\nthe token. 
The parser took that value and stuck it in the literal tree node,\nso to evaluate a literal, we simply pull it back out.</p>\n<h3><a href=\"#evaluating-parentheses\" id=\"evaluating-parentheses\"><small>7&#8202;.&#8202;2&#8202;.&#8202;2</small>Evaluating parentheses</a></h3>\n<p>The next simplest node to evaluate is grouping<span class=\"em\">&mdash;</span>the node you get as a result\nof using explicit parentheses in an expression.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin class <em>Interpreter</em></div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">visitGroupingExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Grouping</span> <span class=\"i\">expr</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">evaluate</span>(<span class=\"i\">expr</span>.<span class=\"i\">expression</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in class <em>Interpreter</em></div>\n\n<p>A <span name=\"grouping\">grouping</span> node has a reference to an inner node\nfor the expression contained inside the parentheses. To evaluate the grouping\nexpression itself, we recursively evaluate that subexpression and return the\nresulting value.</p>\n<p>We rely on this helper method which simply sends the expression back into the\ninterpreter&rsquo;s visitor implementation:</p>\n<aside name=\"grouping\">\n<p>Some parsers don&rsquo;t define tree nodes for parentheses. 
Instead, when parsing a\nparenthesized expression, they simply return the node for the inner expression.\nWe do create a node for parentheses in Lox because we&rsquo;ll need it later to\ncorrectly handle the left-hand sides of assignment expressions.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin class <em>Interpreter</em></div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Object</span> <span class=\"i\">evaluate</span>(<span class=\"t\">Expr</span> <span class=\"i\">expr</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">expr</span>.<span class=\"i\">accept</span>(<span class=\"k\">this</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in class <em>Interpreter</em></div>\n\n<h3><a href=\"#evaluating-unary-expressions\" id=\"evaluating-unary-expressions\"><small>7&#8202;.&#8202;2&#8202;.&#8202;3</small>Evaluating unary expressions</a></h3>\n<p>Like grouping, unary expressions have a single subexpression that we must\nevaluate first. 
The difference is that the unary expression itself does a little\nwork afterwards.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitLiteralExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">visitUnaryExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Unary</span> <span class=\"i\">expr</span>) {\n    <span class=\"t\">Object</span> <span class=\"i\">right</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">expr</span>.<span class=\"i\">right</span>);\n\n    <span class=\"k\">switch</span> (<span class=\"i\">expr</span>.<span class=\"i\">operator</span>.<span class=\"i\">type</span>) {\n      <span class=\"k\">case</span> <span class=\"i\">MINUS</span>:\n        <span class=\"k\">return</span> -(<span class=\"t\">double</span>)<span class=\"i\">right</span>;\n    }\n\n    <span class=\"c\">// Unreachable.</span>\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitLiteralExpr</em>()</div>\n\n<p>First, we evaluate the operand expression. Then we apply the unary operator\nitself to the result of that. There are two different unary expressions,\nidentified by the type of the operator token.</p>\n<p>Shown here is <code>-</code>, which negates the result of the subexpression. The\nsubexpression must be a number. Since we don&rsquo;t <em>statically</em> know that in Java,\nwe <span name=\"cast\">cast</span> it before performing the operation. This type\ncast happens at runtime when the <code>-</code> is evaluated. That&rsquo;s the core of what makes\na language dynamically typed right there.</p>\n<aside name=\"cast\">\n<p>You&rsquo;re probably wondering what happens if the cast fails. 
Fear not, we&rsquo;ll get\ninto that soon.</p>\n</aside>\n<p>You can start to see how evaluation recursively traverses the tree. We can&rsquo;t\nevaluate the unary operator itself until after we evaluate its operand\nsubexpression. That means our interpreter is doing a <strong>post-order traversal</strong><span class=\"em\">&mdash;</span>each node evaluates its children before doing its own work.</p>\n<p>The other unary operator is logical not.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    switch (expr.operator.type) {\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitUnaryExpr</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"i\">BANG</span>:\n        <span class=\"k\">return</span> !<span class=\"i\">isTruthy</span>(<span class=\"i\">right</span>);\n</pre><pre class=\"insert-after\">      case MINUS:\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitUnaryExpr</em>()</div>\n\n<p>The implementation is simple, but what is this &ldquo;truthy&rdquo; thing about? We need to\nmake a little side trip to one of the great questions of Western philosophy:\n<em>What is truth?</em></p>\n<h3><a href=\"#truthiness-and-falsiness\" id=\"truthiness-and-falsiness\"><small>7&#8202;.&#8202;2&#8202;.&#8202;4</small>Truthiness and falsiness</a></h3>\n<p>OK, maybe we&rsquo;re not going to really get into the universal question, but at\nleast inside the world of Lox, we need to decide what happens when you use\nsomething other than <code>true</code> or <code>false</code> in a logic operation like <code>!</code> or any\nother place where a Boolean is expected.</p>\n<p>We <em>could</em> just say it&rsquo;s an error because we don&rsquo;t roll with implicit\nconversions, but most dynamically typed languages aren&rsquo;t that ascetic. 
Instead,\nthey take the universe of values of all types and partition them into two sets,\none of which they define to be &ldquo;true&rdquo;, or &ldquo;truthful&rdquo;, or (my favorite) &ldquo;truthy&rdquo;,\nand the rest which are &ldquo;false&rdquo; or &ldquo;falsey&rdquo;. This partitioning is somewhat\narbitrary and gets <span name=\"weird\">weird</span> in a few languages.</p>\n<aside name=\"weird\" class=\"bottom\">\n<p>In JavaScript, strings are truthy, but empty strings are not. Arrays are truthy\nbut empty arrays are<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>also truthy. The number <code>0</code> is falsey, but the <em>string</em>\n<code>\"0\"</code> is truthy.</p>\n<p>In Python, empty strings are falsey like in JS, but other empty sequences are\nfalsey too.</p>\n<p>In PHP, both the number <code>0</code> and the string <code>\"0\"</code> are falsey. Most other\nnon-empty strings are truthy.</p>\n<p>Get all that?</p>\n</aside>\n<p>Lox follows Ruby&rsquo;s simple rule: <code>false</code> and <code>nil</code> are falsey, and everything else\nis truthy. 
We implement that like so:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitUnaryExpr</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">boolean</span> <span class=\"i\">isTruthy</span>(<span class=\"t\">Object</span> <span class=\"i\">object</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">object</span> == <span class=\"k\">null</span>) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n    <span class=\"k\">if</span> (<span class=\"i\">object</span> <span class=\"k\">instanceof</span> <span class=\"t\">Boolean</span>) <span class=\"k\">return</span> (<span class=\"t\">boolean</span>)<span class=\"i\">object</span>;\n    <span class=\"k\">return</span> <span class=\"k\">true</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitUnaryExpr</em>()</div>\n\n<h3><a href=\"#evaluating-binary-operators\" id=\"evaluating-binary-operators\"><small>7&#8202;.&#8202;2&#8202;.&#8202;5</small>Evaluating binary operators</a></h3>\n<p>On to the last expression tree class, binary operators. 
There&rsquo;s a handful of\nthem, and we&rsquo;ll start with the arithmetic ones.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>evaluate</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">visitBinaryExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Binary</span> <span class=\"i\">expr</span>) {\n    <span class=\"t\">Object</span> <span class=\"i\">left</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">expr</span>.<span class=\"i\">left</span>);\n    <span class=\"t\">Object</span> <span class=\"i\">right</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">expr</span>.<span class=\"i\">right</span>);<span name=\"left\"> </span>\n\n    <span class=\"k\">switch</span> (<span class=\"i\">expr</span>.<span class=\"i\">operator</span>.<span class=\"i\">type</span>) {\n      <span class=\"k\">case</span> <span class=\"i\">MINUS</span>:\n        <span class=\"k\">return</span> (<span class=\"t\">double</span>)<span class=\"i\">left</span> - (<span class=\"t\">double</span>)<span class=\"i\">right</span>;\n      <span class=\"k\">case</span> <span class=\"i\">SLASH</span>:\n        <span class=\"k\">return</span> (<span class=\"t\">double</span>)<span class=\"i\">left</span> / (<span class=\"t\">double</span>)<span class=\"i\">right</span>;\n      <span class=\"k\">case</span> <span class=\"i\">STAR</span>:\n        <span class=\"k\">return</span> (<span class=\"t\">double</span>)<span class=\"i\">left</span> * (<span class=\"t\">double</span>)<span class=\"i\">right</span>;\n    }\n\n    <span class=\"c\">// Unreachable.</span>\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>evaluate</em>()</div>\n\n<aside name=\"left\">\n<p>Did you notice we pinned down a subtle corner of the 
language semantics here?\nIn a binary expression, we evaluate the operands in left-to-right order. If\nthose operands have side effects, that choice is user visible, so this isn&rsquo;t\nsimply an implementation detail.</p>\n<p>If we want our two interpreters to be consistent (hint: we do), we&rsquo;ll need to\nmake sure clox does the same thing.</p>\n</aside>\n<p>I think you can figure out what&rsquo;s going on here. The main difference from the\nunary negation operator is that we have two operands to evaluate.</p>\n<p>I left out one arithmetic operator because it&rsquo;s a little special.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    switch (expr.operator.type) {\n      case MINUS:\n        return (double)left - (double)right;\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitBinaryExpr</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"i\">PLUS</span>:\n        <span class=\"k\">if</span> (<span class=\"i\">left</span> <span class=\"k\">instanceof</span> <span class=\"t\">Double</span> &amp;&amp; <span class=\"i\">right</span> <span class=\"k\">instanceof</span> <span class=\"t\">Double</span>) {\n          <span class=\"k\">return</span> (<span class=\"t\">double</span>)<span class=\"i\">left</span> + (<span class=\"t\">double</span>)<span class=\"i\">right</span>;\n        }<span name=\"plus\"> </span>\n\n        <span class=\"k\">if</span> (<span class=\"i\">left</span> <span class=\"k\">instanceof</span> <span class=\"t\">String</span> &amp;&amp; <span class=\"i\">right</span> <span class=\"k\">instanceof</span> <span class=\"t\">String</span>) {\n          <span class=\"k\">return</span> (<span class=\"t\">String</span>)<span class=\"i\">left</span> + (<span class=\"t\">String</span>)<span class=\"i\">right</span>;\n        }\n\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      case SLASH:\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitBinaryExpr</em>()</div>\n\n<p>The <code>+</code> operator can also be used to concatenate two strings. To handle that, we\ndon&rsquo;t just assume the operands are a certain type and <em>cast</em> them, we\ndynamically <em>check</em> the type and choose the appropriate operation. This is why\nwe need our object representation to support <code>instanceof</code>.</p>\n<aside name=\"plus\">\n<p>We could have defined an operator specifically for string concatenation. That&rsquo;s\nwhat Perl (<code>.</code>), Lua (<code>..</code>), Smalltalk (<code>,</code>), Haskell (<code>++</code>), and others do.</p>\n<p>I thought it would make Lox a little more approachable to use the same syntax as\nJava, JavaScript, Python, and others. This means that the <code>+</code> operator is\n<strong>overloaded</strong> to support both adding numbers and concatenating strings. Even in\nlanguages that don&rsquo;t use <code>+</code> for strings, they still often overload it for\nadding both integers and floating-point numbers.</p>\n</aside>\n<p>Next up are the comparison operators.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    switch (expr.operator.type) {\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitBinaryExpr</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"i\">GREATER</span>:\n        <span class=\"k\">return</span> (<span class=\"t\">double</span>)<span class=\"i\">left</span> &gt; (<span class=\"t\">double</span>)<span class=\"i\">right</span>;\n      <span class=\"k\">case</span> <span class=\"i\">GREATER_EQUAL</span>:\n        <span class=\"k\">return</span> (<span class=\"t\">double</span>)<span class=\"i\">left</span> &gt;= (<span class=\"t\">double</span>)<span class=\"i\">right</span>;\n      <span class=\"k\">case</span> <span class=\"i\">LESS</span>:\n        <span class=\"k\">return</span> (<span 
class=\"t\">double</span>)<span class=\"i\">left</span> &lt; (<span class=\"t\">double</span>)<span class=\"i\">right</span>;\n      <span class=\"k\">case</span> <span class=\"i\">LESS_EQUAL</span>:\n        <span class=\"k\">return</span> (<span class=\"t\">double</span>)<span class=\"i\">left</span> &lt;= (<span class=\"t\">double</span>)<span class=\"i\">right</span>;\n</pre><pre class=\"insert-after\">      case MINUS:\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitBinaryExpr</em>()</div>\n\n<p>They are basically the same as arithmetic. The only difference is that where the\narithmetic operators produce a value whose type is the same as the operands\n(numbers or strings), the comparison operators always produce a Boolean.</p>\n<p>The last pair of operators are equality.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitBinaryExpr</em>()</div>\n<pre>      <span class=\"k\">case</span> <span class=\"i\">BANG_EQUAL</span>: <span class=\"k\">return</span> !<span class=\"i\">isEqual</span>(<span class=\"i\">left</span>, <span class=\"i\">right</span>);\n      <span class=\"k\">case</span> <span class=\"i\">EQUAL_EQUAL</span>: <span class=\"k\">return</span> <span class=\"i\">isEqual</span>(<span class=\"i\">left</span>, <span class=\"i\">right</span>);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitBinaryExpr</em>()</div>\n\n<p>Unlike the comparison operators which require numbers, the equality operators\nsupport operands of any type, even mixed ones. 
You can&rsquo;t ask Lox if 3 is <em>less</em>\nthan <code>\"three\"</code>, but you can ask if it&rsquo;s <span name=\"equal\"><em>equal</em></span> to\nit.</p>\n<aside name=\"equal\">\n<p>Spoiler alert: it&rsquo;s not.</p>\n</aside>\n<p>Like truthiness, the equality logic is hoisted out into a separate method.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>isTruthy</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">boolean</span> <span class=\"i\">isEqual</span>(<span class=\"t\">Object</span> <span class=\"i\">a</span>, <span class=\"t\">Object</span> <span class=\"i\">b</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">a</span> == <span class=\"k\">null</span> &amp;&amp; <span class=\"i\">b</span> == <span class=\"k\">null</span>) <span class=\"k\">return</span> <span class=\"k\">true</span>;\n    <span class=\"k\">if</span> (<span class=\"i\">a</span> == <span class=\"k\">null</span>) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n\n    <span class=\"k\">return</span> <span class=\"i\">a</span>.<span class=\"i\">equals</span>(<span class=\"i\">b</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>isTruthy</em>()</div>\n\n<p>This is one of those corners where the details of how we represent Lox objects\nin terms of Java matter. We need to correctly implement <em>Lox&rsquo;s</em> notion of\nequality, which may be different from Java&rsquo;s.</p>\n<p>Fortunately, the two are pretty similar. Lox doesn&rsquo;t do implicit conversions in\nequality and Java does not either. We do have to handle <code>nil</code>/<code>null</code> specially\nso that we don&rsquo;t throw a NullPointerException if we try to call <code>equals()</code> on\n<code>null</code>. Otherwise, we&rsquo;re fine. 
Java&rsquo;s <span name=\"nan\"><code>equals()</code></span> methods\non Boolean, Double, and String have the behavior we want for Lox.</p>\n<aside name=\"nan\">\n<p>What do you expect this to evaluate to:</p>\n<div class=\"codehilite\"><pre>(<span class=\"n\">0</span> / <span class=\"n\">0</span>) == (<span class=\"n\">0</span> / <span class=\"n\">0</span>)\n</pre></div>\n<p>According to <a href=\"https://en.wikipedia.org/wiki/IEEE_754\">IEEE 754</a>, which specifies the behavior of double-precision\nnumbers, dividing zero by zero gives you the special <strong>NaN</strong> (&ldquo;not a number&rdquo;)\nvalue. Strangely enough, NaN is <em>not</em> equal to itself.</p>\n<p>In Java, the <code>==</code> operator on primitive doubles preserves that behavior, but the\n<code>equals()</code> method on the Double class does not. Lox uses the latter, so it doesn&rsquo;t\nfollow IEEE 754. These kinds of subtle incompatibilities occupy a dismaying fraction\nof language implementers&rsquo; lives.</p>\n</aside>\n<p>And that&rsquo;s it! That&rsquo;s all the code we need to correctly interpret a valid Lox\nexpression. But what about an <em>invalid</em> one? In particular, what happens when a\nsubexpression evaluates to an object of the wrong type for the operation being\nperformed?</p>\n<h2><a href=\"#runtime-errors\" id=\"runtime-errors\"><small>7&#8202;.&#8202;3</small>Runtime Errors</a></h2>\n<p>I was cavalier about jamming casts in whenever a subexpression produces an\nObject and the operator requires it to be a number or a string. Those casts can\nfail. Even though the user&rsquo;s code is erroneous, if we want to make a <span\nname=\"fail\">usable</span> language, we are responsible for handling that error\ngracefully.</p>\n<aside name=\"fail\">\n<p>We could simply not detect or report a type error at all. This is what C does if\nyou cast a pointer to some type that doesn&rsquo;t match the data that is actually\nbeing pointed to. 
C gains flexibility and speed by allowing that, but is\nalso famously dangerous. Once you misinterpret bits in memory, all bets are off.</p>\n<p>Few modern languages accept unsafe operations like that. Instead, most are\n<strong>memory safe</strong> and ensure<span class=\"em\">&mdash;</span>through a combination of static and runtime checks<span class=\"em\">&mdash;</span>that a program can never incorrectly interpret the value stored in a piece of\nmemory.</p>\n</aside>\n<p>It&rsquo;s time for us to talk about <strong>runtime errors</strong>. I spilled a lot of ink in the\nprevious chapters talking about error handling, but those were all <em>syntax</em> or\n<em>static</em> errors. Those are detected and reported before <em>any</em> code is executed.\nRuntime errors are failures that the language semantics demand we detect and\nreport while the program is running (hence the name).</p>\n<p>Right now, if an operand is the wrong type for the operation being performed,\nthe Java cast will fail and the JVM will throw a ClassCastException. That\nunwinds the whole stack and exits the application, vomiting a Java stack trace\nonto the user. That&rsquo;s probably not what we want. The fact that Lox is\nimplemented in Java should be a detail hidden from the user. Instead, we want\nthem to understand that a <em>Lox</em> runtime error occurred, and give them an error\nmessage relevant to our language and their program.</p>\n<p>The Java behavior does have one thing going for it, though. It correctly stops\nexecuting any code when the error occurs. Let&rsquo;s say the user enters some\nexpression like:</p>\n<div class=\"codehilite\"><pre><span class=\"n\">2</span> * (<span class=\"n\">3</span> / -<span class=\"s\">&quot;muffin&quot;</span>)\n</pre></div>\n<p>You can&rsquo;t negate a <span name=\"muffin\">muffin</span>, so we need to report a\nruntime error at that inner <code>-</code> expression. 
That in turn means we can&rsquo;t evaluate\nthe <code>/</code> expression since it has no meaningful right operand. Likewise for the\n<code>*</code>. So when a runtime error occurs deep in some expression, we need to escape\nall the way out.</p>\n<aside name=\"muffin\">\n<p>I don&rsquo;t know, man, <em>can</em> you negate a muffin?</p><img src=\"image/evaluating-expressions/muffin.png\" alt=\"A muffin, negated.\" />\n</aside>\n<p>We could print a runtime error and then abort the process and exit the\napplication entirely. That has a certain melodramatic flair. Sort of the\nprogramming language interpreter equivalent of a mic drop.</p>\n<p>Tempting as that is, we should probably do something a little less cataclysmic.\nWhile a runtime error needs to stop evaluating the <em>expression</em>, it shouldn&rsquo;t\nkill the <em>interpreter</em>. If a user is running the REPL and has a typo in a line\nof code, they should still be able to keep the session going and enter more code\nafter that.</p>\n<h3><a href=\"#detecting-runtime-errors\" id=\"detecting-runtime-errors\"><small>7&#8202;.&#8202;3&#8202;.&#8202;1</small>Detecting runtime errors</a></h3>\n<p>Our tree-walk interpreter evaluates nested expressions using recursive method\ncalls, and we need to unwind out of all of those. Throwing an exception in Java\nis a fine way to accomplish that. However, instead of using Java&rsquo;s own cast\nfailure, we&rsquo;ll define a Lox-specific one so that we can handle it how we want.</p>\n<p>Before we do the cast, we check the object&rsquo;s type ourselves. 
So, for unary <code>-</code>,\nwe add:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case MINUS:\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitUnaryExpr</em>()</div>\n<pre class=\"insert\">        <span class=\"i\">checkNumberOperand</span>(<span class=\"i\">expr</span>.<span class=\"i\">operator</span>, <span class=\"i\">right</span>);\n</pre><pre class=\"insert-after\">        return -(double)right;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitUnaryExpr</em>()</div>\n\n<p>The code to check the operand is:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitUnaryExpr</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">checkNumberOperand</span>(<span class=\"t\">Token</span> <span class=\"i\">operator</span>, <span class=\"t\">Object</span> <span class=\"i\">operand</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">operand</span> <span class=\"k\">instanceof</span> <span class=\"t\">Double</span>) <span class=\"k\">return</span>;\n    <span class=\"k\">throw</span> <span class=\"k\">new</span> <span class=\"t\">RuntimeError</span>(<span class=\"i\">operator</span>, <span class=\"s\">&quot;Operand must be a number.&quot;</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitUnaryExpr</em>()</div>\n\n<p>When the check fails, it throws one of these:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/RuntimeError.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">class</span> <span class=\"t\">RuntimeError</span> <span class=\"k\">extends</span> <span class=\"t\">RuntimeException</span> {\n  <span class=\"k\">final</span> <span class=\"t\">Token</span> <span 
class=\"i\">token</span>;\n\n  <span class=\"t\">RuntimeError</span>(<span class=\"t\">Token</span> <span class=\"i\">token</span>, <span class=\"t\">String</span> <span class=\"i\">message</span>) {\n    <span class=\"k\">super</span>(<span class=\"i\">message</span>);\n    <span class=\"k\">this</span>.<span class=\"i\">token</span> = <span class=\"i\">token</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/RuntimeError.java</em>, create new file</div>\n\n<p>Unlike the Java cast exception, our <span name=\"class\">class</span> tracks the\ntoken that identifies where in the user&rsquo;s code the runtime error came from. As\nwith static errors, this helps the user know where to fix their code.</p>\n<aside name=\"class\">\n<p>I admit the name &ldquo;RuntimeError&rdquo; is confusing since Java defines a\nRuntimeException class. An annoying thing about building interpreters is your\nnames often collide with ones already taken by the implementation language. Just\nwait until we support Lox classes.</p>\n</aside>\n<p>We need similar checking for the binary operators. 
Since I promised you every\nsingle line of code needed to implement the interpreters, I&rsquo;ll run through them\nall.</p>\n<p>Greater than:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case GREATER:\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitBinaryExpr</em>()</div>\n<pre class=\"insert\">        <span class=\"i\">checkNumberOperands</span>(<span class=\"i\">expr</span>.<span class=\"i\">operator</span>, <span class=\"i\">left</span>, <span class=\"i\">right</span>);\n</pre><pre class=\"insert-after\">        return (double)left &gt; (double)right;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitBinaryExpr</em>()</div>\n\n<p>Greater than or equal to:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case GREATER_EQUAL:\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitBinaryExpr</em>()</div>\n<pre class=\"insert\">        <span class=\"i\">checkNumberOperands</span>(<span class=\"i\">expr</span>.<span class=\"i\">operator</span>, <span class=\"i\">left</span>, <span class=\"i\">right</span>);\n</pre><pre class=\"insert-after\">        return (double)left &gt;= (double)right;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitBinaryExpr</em>()</div>\n\n<p>Less than:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case LESS:\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitBinaryExpr</em>()</div>\n<pre class=\"insert\">        <span class=\"i\">checkNumberOperands</span>(<span class=\"i\">expr</span>.<span class=\"i\">operator</span>, <span class=\"i\">left</span>, <span class=\"i\">right</span>);\n</pre><pre class=\"insert-after\">        return (double)left &lt; (double)right;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitBinaryExpr</em>()</div>\n\n<p>Less than or equal 
to:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case LESS_EQUAL:\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitBinaryExpr</em>()</div>\n<pre class=\"insert\">        <span class=\"i\">checkNumberOperands</span>(<span class=\"i\">expr</span>.<span class=\"i\">operator</span>, <span class=\"i\">left</span>, <span class=\"i\">right</span>);\n</pre><pre class=\"insert-after\">        return (double)left &lt;= (double)right;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitBinaryExpr</em>()</div>\n\n<p>Subtraction:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case MINUS:\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitBinaryExpr</em>()</div>\n<pre class=\"insert\">        <span class=\"i\">checkNumberOperands</span>(<span class=\"i\">expr</span>.<span class=\"i\">operator</span>, <span class=\"i\">left</span>, <span class=\"i\">right</span>);\n</pre><pre class=\"insert-after\">        return (double)left - (double)right;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitBinaryExpr</em>()</div>\n\n<p>Division:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case SLASH:\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitBinaryExpr</em>()</div>\n<pre class=\"insert\">        <span class=\"i\">checkNumberOperands</span>(<span class=\"i\">expr</span>.<span class=\"i\">operator</span>, <span class=\"i\">left</span>, <span class=\"i\">right</span>);\n</pre><pre class=\"insert-after\">        return (double)left / (double)right;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitBinaryExpr</em>()</div>\n\n<p>Multiplication:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case STAR:\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin 
<em>visitBinaryExpr</em>()</div>\n<pre class=\"insert\">        <span class=\"i\">checkNumberOperands</span>(<span class=\"i\">expr</span>.<span class=\"i\">operator</span>, <span class=\"i\">left</span>, <span class=\"i\">right</span>);\n</pre><pre class=\"insert-after\">        return (double)left * (double)right;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitBinaryExpr</em>()</div>\n\n<p>All of those rely on this validator, which is virtually the same as the unary\none:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>checkNumberOperand</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">checkNumberOperands</span>(<span class=\"t\">Token</span> <span class=\"i\">operator</span>,\n                                   <span class=\"t\">Object</span> <span class=\"i\">left</span>, <span class=\"t\">Object</span> <span class=\"i\">right</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">left</span> <span class=\"k\">instanceof</span> <span class=\"t\">Double</span> &amp;&amp; <span class=\"i\">right</span> <span class=\"k\">instanceof</span> <span class=\"t\">Double</span>) <span class=\"k\">return</span>;\n   <span name=\"operand\"> </span>\n    <span class=\"k\">throw</span> <span class=\"k\">new</span> <span class=\"t\">RuntimeError</span>(<span class=\"i\">operator</span>, <span class=\"s\">&quot;Operands must be numbers.&quot;</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>checkNumberOperand</em>()</div>\n\n<aside name=\"operand\">\n<p>Another subtle semantic choice: We evaluate <em>both</em> operands before checking the\ntype of <em>either</em>. Imagine we have a function <code>say()</code> that prints its argument\nthen returns it. 
Using that, we write:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">say</span>(<span class=\"s\">&quot;left&quot;</span>) - <span class=\"i\">say</span>(<span class=\"s\">&quot;right&quot;</span>);\n</pre></div>\n<p>Our interpreter prints &ldquo;left&rdquo; and &ldquo;right&rdquo; before reporting the runtime error. We\ncould have instead specified that the left operand is checked before even\nevaluating the right.</p>\n</aside>\n<p>The last remaining operator, again the odd one out, is addition. Since <code>+</code> is\noverloaded for numbers and strings, it already has code to check the types. All\nwe need to do is fail if neither of the two success cases match.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">          return (String)left + (String)right;\n        }\n\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitBinaryExpr</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">        <span class=\"k\">throw</span> <span class=\"k\">new</span> <span class=\"t\">RuntimeError</span>(<span class=\"i\">expr</span>.<span class=\"i\">operator</span>,\n            <span class=\"s\">&quot;Operands must be two numbers or two strings.&quot;</span>);\n</pre><pre class=\"insert-after\">      case SLASH:\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitBinaryExpr</em>(), replace 1 line</div>\n\n<p>That gets us detecting runtime errors deep in the innards of the evaluator. The\nerrors are getting thrown. The next step is to write the code that catches them.\nFor that, we need to wire up the Interpreter class into the main Lox class that\ndrives it.</p>\n<h2><a href=\"#hooking-up-the-interpreter\" id=\"hooking-up-the-interpreter\"><small>7&#8202;.&#8202;4</small>Hooking Up the Interpreter</a></h2>\n<p>The visit methods are sort of the guts of the Interpreter class, where the real\nwork happens. We need to wrap a skin around them to interface with the rest of\nthe program. 
The Interpreter&rsquo;s public API is simply one method.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin class <em>Interpreter</em></div>\n<pre>  <span class=\"t\">void</span> <span class=\"i\">interpret</span>(<span class=\"t\">Expr</span> <span class=\"i\">expression</span>) {<span name=\"void\"> </span>\n    <span class=\"k\">try</span> {\n      <span class=\"t\">Object</span> <span class=\"i\">value</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">expression</span>);\n      <span class=\"t\">System</span>.<span class=\"i\">out</span>.<span class=\"i\">println</span>(<span class=\"i\">stringify</span>(<span class=\"i\">value</span>));\n    } <span class=\"k\">catch</span> (<span class=\"t\">RuntimeError</span> <span class=\"i\">error</span>) {\n      <span class=\"t\">Lox</span>.<span class=\"i\">runtimeError</span>(<span class=\"i\">error</span>);\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in class <em>Interpreter</em></div>\n\n<p>This takes in a syntax tree for an expression and evaluates it. If that\nsucceeds, <code>evaluate()</code> returns an object for the result value. <code>interpret()</code>\nconverts that to a string and shows it to the user. 
To convert a Lox value to a\nstring, we rely on:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>isEqual</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">String</span> <span class=\"i\">stringify</span>(<span class=\"t\">Object</span> <span class=\"i\">object</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">object</span> == <span class=\"k\">null</span>) <span class=\"k\">return</span> <span class=\"s\">&quot;nil&quot;</span>;\n\n    <span class=\"k\">if</span> (<span class=\"i\">object</span> <span class=\"k\">instanceof</span> <span class=\"t\">Double</span>) {\n      <span class=\"t\">String</span> <span class=\"i\">text</span> = <span class=\"i\">object</span>.<span class=\"i\">toString</span>();\n      <span class=\"k\">if</span> (<span class=\"i\">text</span>.<span class=\"i\">endsWith</span>(<span class=\"s\">&quot;.0&quot;</span>)) {\n        <span class=\"i\">text</span> = <span class=\"i\">text</span>.<span class=\"i\">substring</span>(<span class=\"n\">0</span>, <span class=\"i\">text</span>.<span class=\"i\">length</span>() - <span class=\"n\">2</span>);\n      }\n      <span class=\"k\">return</span> <span class=\"i\">text</span>;\n    }\n\n    <span class=\"k\">return</span> <span class=\"i\">object</span>.<span class=\"i\">toString</span>();\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>isEqual</em>()</div>\n\n<p>This is another of those pieces of code like <code>isTruthy()</code> that crosses the\nmembrane between the user&rsquo;s view of Lox objects and their internal\nrepresentation in Java.</p>\n<p>It&rsquo;s pretty straightforward. Since Lox was designed to be familiar to someone\ncoming from Java, things like Booleans look the same in both languages. 
The two\nedge cases are <code>nil</code>, which we represent using Java&rsquo;s <code>null</code>, and numbers.</p>\n<p>Lox uses double-precision numbers even for integer values. In that case, they\nshould print without a decimal point. Since Java has both floating-point and\ninteger types, it wants you to know which one you&rsquo;re using. It tells you by\nadding an explicit <code>.0</code> to integer-valued doubles. We don&rsquo;t care about that, so\nwe <span name=\"number\">hack</span> it off the end.</p>\n<aside name=\"number\">\n<p>Yet again, we take care of this edge case with numbers to ensure that jlox and\nclox work the same. Handling weird corners of the language like this will drive\nyou crazy but is an important part of the job.</p>\n<p>Users rely on these details<span class=\"em\">&mdash;</span>either deliberately or inadvertently<span class=\"em\">&mdash;</span>and if\nthe implementations aren&rsquo;t consistent, their program will break when they run it\non different interpreters.</p>\n</aside>\n<h3><a href=\"#reporting-runtime-errors\" id=\"reporting-runtime-errors\"><small>7&#8202;.&#8202;4&#8202;.&#8202;1</small>Reporting runtime errors</a></h3>\n<p>If a runtime error is thrown while evaluating the expression, <code>interpret()</code>\ncatches it. This lets us report the error to the user and then gracefully\ncontinue. 
All of our existing error reporting code lives in the Lox class, so we\nput this method there too:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Lox.java</em><br>\nadd after <em>error</em>()</div>\n<pre>  <span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">runtimeError</span>(<span class=\"t\">RuntimeError</span> <span class=\"i\">error</span>) {\n    <span class=\"t\">System</span>.<span class=\"i\">err</span>.<span class=\"i\">println</span>(<span class=\"i\">error</span>.<span class=\"i\">getMessage</span>() +\n        <span class=\"s\">&quot;</span><span class=\"e\">\\n</span><span class=\"s\">[line &quot;</span> + <span class=\"i\">error</span>.<span class=\"i\">token</span>.<span class=\"i\">line</span> + <span class=\"s\">&quot;]&quot;</span>);\n    <span class=\"i\">hadRuntimeError</span> = <span class=\"k\">true</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, add after <em>error</em>()</div>\n\n<p>We use the token associated with the RuntimeError to tell the user what line of\ncode was executing when the error occurred. Even better would be to give the\nuser an entire call stack to show how they <em>got</em> to be executing that code. 
But\nwe don&rsquo;t have function calls yet, so I guess we don&rsquo;t have to worry about it.</p>\n<p>After showing the error, <code>runtimeError()</code> sets this field:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  static boolean hadError = false;\n</pre><div class=\"source-file\"><em>lox/Lox.java</em><br>\nin class <em>Lox</em></div>\n<pre class=\"insert\">  <span class=\"k\">static</span> <span class=\"t\">boolean</span> <span class=\"i\">hadRuntimeError</span> = <span class=\"k\">false</span>;\n\n</pre><pre class=\"insert-after\">  public static void main(String[] args) throws IOException {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, in class <em>Lox</em></div>\n\n<p>That field plays a small but important role.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    run(new String(bytes, Charset.defaultCharset()));\n\n    // Indicate an error in the exit code.\n    if (hadError) System.exit(65);\n</pre><div class=\"source-file\"><em>lox/Lox.java</em><br>\nin <em>runFile</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">hadRuntimeError</span>) <span class=\"t\">System</span>.<span class=\"i\">exit</span>(<span class=\"n\">70</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, in <em>runFile</em>()</div>\n\n<p>If the user is running a Lox <span name=\"repl\">script from a file</span> and a\nruntime error occurs, we set an exit code when the process quits to let the\ncalling process know. 
Not everyone cares about shell etiquette, but we do.</p>\n<aside name=\"repl\">\n<p>If the user is running the REPL, we don&rsquo;t care about tracking runtime errors.\nAfter they are reported, we simply loop around and let them input new code and\nkeep going.</p>\n</aside>\n<h3><a href=\"#running-the-interpreter\" id=\"running-the-interpreter\"><small>7&#8202;.&#8202;4&#8202;.&#8202;2</small>Running the interpreter</a></h3>\n<p>Now that we have an interpreter, the Lox class can start using it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">public class Lox {\n</pre><div class=\"source-file\"><em>lox/Lox.java</em><br>\nin class <em>Lox</em></div>\n<pre class=\"insert\">  <span class=\"k\">private</span> <span class=\"k\">static</span> <span class=\"k\">final</span> <span class=\"t\">Interpreter</span> <span class=\"i\">interpreter</span> = <span class=\"k\">new</span> <span class=\"t\">Interpreter</span>();\n</pre><pre class=\"insert-after\">  static boolean hadError = false;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, in class <em>Lox</em></div>\n\n<p>We make the field static so that successive calls to <code>run()</code> inside a REPL\nsession reuse the same interpreter. That doesn&rsquo;t make a difference now, but it\nwill later when the interpreter stores global variables. 
Those variables should\npersist throughout the REPL session.</p>\n<p>Finally, we remove the line of temporary code from the <a href=\"parsing-expressions.html\">last chapter</a> for\nprinting the syntax tree and replace it with this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    // Stop if there was a syntax error.\n    if (hadError) return;\n\n</pre><div class=\"source-file\"><em>lox/Lox.java</em><br>\nin <em>run</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"i\">interpreter</span>.<span class=\"i\">interpret</span>(<span class=\"i\">expression</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, in <em>run</em>(), replace 1 line</div>\n\n<p>We have an entire language pipeline now: scanning, parsing, and\nexecution. Congratulations, you now have your very own arithmetic calculator.</p>\n<p>As you can see, the interpreter is pretty bare bones. But the Interpreter class\nand the Visitor pattern we&rsquo;ve set up today form the skeleton that later chapters\nwill stuff full of interesting guts<span class=\"em\">&mdash;</span>variables, functions, etc. Right now, the\ninterpreter doesn&rsquo;t do very much, but it&rsquo;s alive!</p><img src=\"image/evaluating-expressions/skeleton.png\" alt=\"A skeleton waving hello.\" />\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Allowing comparisons on types other than numbers could be useful. The\noperators might have a reasonable interpretation for strings. Even\ncomparisons among mixed types, like <code>3 &lt; \"pancake\"</code>, could be handy to enable\nthings like ordered collections of heterogeneous types. Or it could simply\nlead to bugs and confusion.</p>\n<p>Would you extend Lox to support comparing other types? If so, which pairs of\ntypes do you allow and how do you define their ordering? 
Justify your\nchoices and compare them to other languages.</p>\n</li>\n<li>\n<p>Many languages define <code>+</code> such that if <em>either</em> operand is a string, the\nother is converted to a string and the results are then concatenated. For\nexample, <code>\"scone\" + 4</code> would yield <code>scone4</code>. Extend the code in\n<code>visitBinaryExpr()</code> to support that.</p>\n</li>\n<li>\n<p>What happens right now if you divide a number by zero? What do you think\nshould happen? Justify your choice. How do other languages you know handle\ndivision by zero, and why do they make the choices they do?</p>\n<p>Change the implementation in <code>visitBinaryExpr()</code> to detect and report a\nruntime error for this case.</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: Static and Dynamic Typing</a></h2>\n<p>Some languages, like Java, are statically typed, which means type errors are\ndetected and reported at compile time before any code is run. Others, like Lox,\nare dynamically typed and defer checking for type errors until runtime, right\nbefore an operation is attempted. We tend to consider this a black-and-white\nchoice, but there is actually a continuum between them.</p>\n<p>It turns out even most statically typed languages do <em>some</em> type checks at\nruntime. The type system checks most type rules statically, but inserts runtime\nchecks in the generated code for other operations.</p>\n<p>For example, in Java, the <em>static</em> type system assumes a cast expression will\nalways safely succeed. After you cast some value, you can statically treat it as\nthe destination type and not get any compile errors. But downcasts can fail,\nobviously. 
The only reason the static checker can presume that casts always\nsucceed without violating the language&rsquo;s soundness guarantees is that the\ncast is checked <em>at runtime</em> and throws an exception on failure.</p>\n<p>A more subtle example is <a href=\"https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)#Covariant_arrays_in_Java_and_C.23\">covariant arrays</a> in Java and C#. The static\nsubtyping rules for arrays allow operations that are not sound. Consider:</p>\n<div class=\"codehilite\"><pre><span class=\"t\">Object</span>[] <span class=\"i\">stuff</span> = <span class=\"k\">new</span> <span class=\"t\">Integer</span>[<span class=\"n\">1</span>];\n<span class=\"i\">stuff</span>[<span class=\"n\">0</span>] = <span class=\"s\">&quot;not an int!&quot;</span>;\n</pre></div>\n<p>This code compiles without any errors. The first line upcasts the Integer array\nand stores it in a variable of type Object array. The second line stores a\nstring in one of its cells. The Object array type statically allows that<span class=\"em\">&mdash;</span>strings <em>are</em> Objects<span class=\"em\">&mdash;</span>but the actual Integer array that <code>stuff</code> refers to\nat runtime should never have a string in it! To avoid that catastrophe, when you\nstore a value in an array, the JVM does a <em>runtime</em> check to make sure it&rsquo;s an\nallowed type. If not, it throws an ArrayStoreException.</p>\n<p>Java could have avoided the need to check this at runtime by disallowing the\ncast on the first line. It could make arrays <em>invariant</em> such that an array of\nIntegers is <em>not</em> an array of Objects. That&rsquo;s statically sound, but it prohibits\ncommon and safe patterns of code that only read from arrays. Covariance is safe\nif you never <em>write</em> to the array. Those patterns were particularly important\nfor usability in Java 1.0 before it supported generics. 
James Gosling and the\nother Java designers traded off a little static safety and performance<span class=\"em\">&mdash;</span>those\narray store checks take time<span class=\"em\">&mdash;</span>in return for some flexibility.</p>\n<p>There are few modern statically typed languages that don&rsquo;t make that trade-off\n<em>somewhere</em>. Even Haskell will let you run code with non-exhaustive matches. If\nyou find yourself designing a statically typed language, keep in mind that you\ncan sometimes give users more flexibility without sacrificing <em>too</em> many of the\nbenefits of static safety by deferring some type checks until runtime.</p>\n<p>On the other hand, a key reason users choose statically typed languages is\nthe confidence the language gives them that certain kinds of errors\ncan <em>never</em> occur when their program is run. Defer too many type checks until\nruntime, and you erode that confidence.</p>\n</div>\n\n<footer>\n<a href=\"statements-and-state.html\" class=\"next\">\n  Next Chapter: &ldquo;Statements and State&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/functions.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Functions &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Functions<small>10</small></a></h3>\n\n<ul>\n    <li><a href=\"#function-calls\"><small>10.1</small> Function Calls</a></li>\n    <li><a href=\"#native-functions\"><small>10.2</small> Native Functions</a></li>\n    <li><a href=\"#function-declarations\"><small>10.3</small> Function Declarations</a></li>\n    <li><a href=\"#function-objects\"><small>10.4</small> Function Objects</a></li>\n    <li><a href=\"#return-statements\"><small>10.5</small> Return Statements</a></li>\n    <li><a 
href=\"#local-functions-and-closures\"><small>10.6</small> Local Functions and Closures</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"control-flow.html\" title=\"Control Flow\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"resolving-and-binding.html\" title=\"Resolving and Binding\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"control-flow.html\" title=\"Control Flow\" class=\"prev\">←</a>\n<a href=\"resolving-and-binding.html\" title=\"Resolving and Binding\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Functions<small>10</small></a></h3>\n\n<ul>\n    <li><a href=\"#function-calls\"><small>10.1</small> Function Calls</a></li>\n    <li><a href=\"#native-functions\"><small>10.2</small> Native Functions</a></li>\n    <li><a href=\"#function-declarations\"><small>10.3</small> Function Declarations</a></li>\n    <li><a href=\"#function-objects\"><small>10.4</small> Function Objects</a></li>\n    <li><a href=\"#return-statements\"><small>10.5</small> Return Statements</a></li>\n    <li><a href=\"#local-functions-and-closures\"><small>10.6</small> Local Functions and Closures</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"control-flow.html\" title=\"Control Flow\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk 
Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"resolving-and-binding.html\" title=\"Resolving and Binding\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">10</div>\n  <h1>Functions</h1>\n\n<blockquote>\n<p>And that is also the way the human mind works<span class=\"em\">&mdash;</span>by the compounding of old\nideas into new structures that become new ideas that can themselves be used in\ncompounds, and round and round endlessly, growing ever more remote from the\nbasic earthbound imagery that is each language&rsquo;s soil.</p>\n<p><cite>Douglas R. Hofstadter, <em>I Am a Strange Loop</em></cite></p>\n</blockquote>\n<p>This chapter marks the culmination of a lot of hard work. The previous chapters\nadd useful functionality in their own right, but each also supplies a piece of a\n<span name=\"lambda\">puzzle</span>. We&rsquo;ll take those pieces<span class=\"em\">&mdash;</span>expressions,\nstatements, variables, control flow, and lexical scope<span class=\"em\">&mdash;</span>add a couple more, and\nassemble them all into support for real user-defined functions and function\ncalls.</p>\n<aside name=\"lambda\"><img src=\"image/functions/lambda.png\" alt=\"A lambda puzzle.\" />\n</aside>\n<h2><a href=\"#function-calls\" id=\"function-calls\"><small>10&#8202;.&#8202;1</small>Function Calls</a></h2>\n<p>You&rsquo;re certainly familiar with C-style function call syntax, but the grammar is\nmore subtle than you may realize. Calls are typically to named functions like:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">average</span>(<span class=\"n\">1</span>, <span class=\"n\">2</span>);\n</pre></div>\n<p>But the <span name=\"pascal\">name</span> of the function being called isn&rsquo;t\nactually part of the call syntax. 
The thing being called<span class=\"em\">&mdash;</span>the <strong>callee</strong><span class=\"em\">&mdash;</span>can be any expression that evaluates to a function. (Well, it does have to be a\npretty <em>high precedence</em> expression, but parentheses take care of that.) For\nexample:</p>\n<aside name=\"pascal\">\n<p>The name <em>is</em> part of the call syntax in Pascal. You can call only named\nfunctions or functions stored directly in variables.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"i\">getCallback</span>()();\n</pre></div>\n<p>There are two call expressions here. The first pair of parentheses has\n<code>getCallback</code> as its callee. But the second call has the entire <code>getCallback()</code>\nexpression as its callee. It is the parentheses following an expression that\nindicate a function call. You can think of a call as sort of like a postfix\noperator that starts with <code>(</code>.</p>\n<p>This &ldquo;operator&rdquo; has higher precedence than any other operator, even the unary\nones. So we slot it into the grammar by having the <code>unary</code> rule bubble up to a\nnew <code>call</code> rule.</p>\n<p><span name=\"curry\"></span></p>\n<div class=\"codehilite\"><pre><span class=\"i\">unary</span>          → ( <span class=\"s\">&quot;!&quot;</span> | <span class=\"s\">&quot;-&quot;</span> ) <span class=\"i\">unary</span> | <span class=\"i\">call</span> ;\n<span class=\"i\">call</span>           → <span class=\"i\">primary</span> ( <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">arguments</span>? <span class=\"s\">&quot;)&quot;</span> )* ;\n</pre></div>\n<p>This rule matches a primary expression followed by zero or more function calls.\nIf there are no parentheses, this parses a bare primary expression. Otherwise,\neach call is recognized by a pair of parentheses with an optional list of\narguments inside. 
The argument list grammar is:</p>\n<aside name=\"curry\">\n<p>The rule uses <code>*</code> to allow matching a series of calls like <code>fn(1)(2)(3)</code>. Code\nlike that isn&rsquo;t common in C-style languages, but it is in the family of\nlanguages derived from ML. There, the normal way of defining a function that\ntakes multiple arguments is as a series of nested functions. Each function takes\none argument and returns a new function. That function consumes the next\nargument, returns yet another function, and so on. Eventually, once all of the\narguments are consumed, the last function completes the operation.</p>\n<p>This style, called <strong>currying</strong>, after Haskell Curry (the same guy whose first\nname graces that <em>other</em> well-known functional language), is baked directly into\nthe language syntax so it&rsquo;s not as weird looking as it would be here.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"i\">arguments</span>      → <span class=\"i\">expression</span> ( <span class=\"s\">&quot;,&quot;</span> <span class=\"i\">expression</span> )* ;\n</pre></div>\n<p>This rule requires at least one argument expression, followed by zero or more\nother expressions, each preceded by a comma. To handle zero-argument calls, the\n<code>call</code> rule itself considers the entire <code>arguments</code> production to be optional.</p>\n<p>I admit, this seems more grammatically awkward than you&rsquo;d expect for the\nincredibly common &ldquo;zero or more comma-separated things&rdquo; pattern. 
There are some\nsophisticated metasyntaxes that handle this better, but in our BNF and in many\nlanguage specs I&rsquo;ve seen, it is this cumbersome.</p>\n<p>Over in our syntax tree generator, we add a <span name=\"call-ast\">new\nnode</span>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Binary   : Expr left, Token operator, Expr right&quot;,\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;Call     : Expr callee, Token paren, List&lt;Expr&gt; arguments&quot;</span>,\n</pre><pre class=\"insert-after\">      &quot;Grouping : Expr expression&quot;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<aside name=\"call-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#call-expression\">Appendix II</a>.</p>\n</aside>\n<p>It stores the callee expression and a list of expressions for the arguments. It\nalso stores the token for the closing parenthesis. We&rsquo;ll use that token&rsquo;s\nlocation when we report a runtime error caused by a function call.</p>\n<p>Crack open the parser. 
Where <code>unary()</code> used to jump straight to <code>primary()</code>,\nchange it to call, well, <code>call()</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return new Expr.Unary(operator, right);\n    }\n\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>unary</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"k\">return</span> <span class=\"i\">call</span>();\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>unary</em>(), replace 1 line</div>\n\n<p>Its definition is:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>unary</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Expr</span> <span class=\"i\">call</span>() {\n    <span class=\"t\">Expr</span> <span class=\"i\">expr</span> = <span class=\"i\">primary</span>();\n\n    <span class=\"k\">while</span> (<span class=\"k\">true</span>) {<span name=\"while-true\"> </span>\n      <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">LEFT_PAREN</span>)) {\n        <span class=\"i\">expr</span> = <span class=\"i\">finishCall</span>(<span class=\"i\">expr</span>);\n      } <span class=\"k\">else</span> {\n        <span class=\"k\">break</span>;\n      }\n    }\n\n    <span class=\"k\">return</span> <span class=\"i\">expr</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>unary</em>()</div>\n\n<p>The code here doesn&rsquo;t quite line up with the grammar rules. I moved a few things\naround to make the code cleaner<span class=\"em\">&mdash;</span>one of the luxuries we have with a\nhandwritten parser. But it&rsquo;s roughly similar to how we parse infix operators.\nFirst, we parse a primary expression, the &ldquo;left operand&rdquo; to the call. 
Then, each\ntime we see a <code>(</code>, we call <code>finishCall()</code> to parse the call expression using the\npreviously parsed expression as the callee. The returned expression becomes the\nnew <code>expr</code> and we loop to see if the result is itself called.</p>\n<aside name=\"while-true\">\n<p>This code would be simpler as <code>while (match(LEFT_PAREN))</code> instead of the silly\n<code>while (true)</code> and <code>break</code>. Don&rsquo;t worry, it will make sense when we expand the\nparser later to handle properties on objects.</p>\n</aside>\n<p>The code to parse the argument list is in this helper:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>unary</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Expr</span> <span class=\"i\">finishCall</span>(<span class=\"t\">Expr</span> <span class=\"i\">callee</span>) {\n    <span class=\"t\">List</span>&lt;<span class=\"t\">Expr</span>&gt; <span class=\"i\">arguments</span> = <span class=\"k\">new</span> <span class=\"t\">ArrayList</span>&lt;&gt;();\n    <span class=\"k\">if</span> (!<span class=\"i\">check</span>(<span class=\"i\">RIGHT_PAREN</span>)) {\n      <span class=\"k\">do</span> {\n        <span class=\"i\">arguments</span>.<span class=\"i\">add</span>(<span class=\"i\">expression</span>());\n      } <span class=\"k\">while</span> (<span class=\"i\">match</span>(<span class=\"i\">COMMA</span>));\n    }\n\n    <span class=\"t\">Token</span> <span class=\"i\">paren</span> = <span class=\"i\">consume</span>(<span class=\"i\">RIGHT_PAREN</span>,\n                          <span class=\"s\">&quot;Expect &#39;)&#39; after arguments.&quot;</span>);\n\n    <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Call</span>(<span class=\"i\">callee</span>, <span class=\"i\">paren</span>, <span class=\"i\">arguments</span>);\n  }\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>unary</em>()</div>\n\n<p>This is more or less the <code>arguments</code> grammar rule translated to code, except\nthat we also handle the zero-argument case. We check for that case first by\nseeing if the next token is <code>)</code>. If it is, we don&rsquo;t try to parse any arguments.</p>\n<p>Otherwise, we parse an expression, then look for a comma indicating that there\nis another argument after that. We keep doing that as long as we find commas\nafter each expression. When we don&rsquo;t find a comma, then the argument list must\nbe done and we consume the expected closing parenthesis. Finally, we wrap the\ncallee and those arguments up into a call AST node.</p>\n<h3><a href=\"#maximum-argument-counts\" id=\"maximum-argument-counts\"><small>10&#8202;.&#8202;1&#8202;.&#8202;1</small>Maximum argument counts</a></h3>\n<p>Right now, the loop where we parse arguments has no bound. If you want to call a\nfunction and pass a million arguments to it, the parser would have no problem\nwith it. Do we want to limit that?</p>\n<p>Other languages have various approaches. The C standard says a conforming\nimplementation has to support <em>at least</em> 127 arguments to a function, but\ndoesn&rsquo;t say there&rsquo;s any upper limit. The Java specification says a method can\naccept <em>no more than</em> <span name=\"254\">255</span> arguments.</p>\n<aside name=\"254\">\n<p>The limit is 25<em>4</em> arguments if the method is an instance method. That&rsquo;s because\n<code>this</code><span class=\"em\">&mdash;</span>the receiver of the method<span class=\"em\">&mdash;</span>works like an argument that is\nimplicitly passed to the method, so it claims one of the slots.</p>\n</aside>\n<p>Our Java interpreter for Lox doesn&rsquo;t really need a limit, but having a maximum\nnumber of arguments will simplify our bytecode interpreter in <a href=\"a-bytecode-virtual-machine.html\">Part III</a>. 
We\nwant our two interpreters to be compatible with each other, even in weird corner\ncases like this, so we&rsquo;ll add the same limit to jlox.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      do {\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>finishCall</em>()</div>\n<pre class=\"insert\">        <span class=\"k\">if</span> (<span class=\"i\">arguments</span>.<span class=\"i\">size</span>() &gt;= <span class=\"n\">255</span>) {\n          <span class=\"i\">error</span>(<span class=\"i\">peek</span>(), <span class=\"s\">&quot;Can&#39;t have more than 255 arguments.&quot;</span>);\n        }\n</pre><pre class=\"insert-after\">        arguments.add(expression());\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>finishCall</em>()</div>\n\n<p>Note that the code here <em>reports</em> an error if it encounters too many arguments,\nbut it doesn&rsquo;t <em>throw</em> the error. Throwing is how we kick into panic mode, which\nis what we want if the parser is in a confused state and doesn&rsquo;t know where it\nis in the grammar anymore. But here, the parser is still in a perfectly valid\nstate<span class=\"em\">&mdash;</span>it just found too many arguments. So it reports the error and keeps on\nkeepin&rsquo; on.</p>\n<h3><a href=\"#interpreting-function-calls\" id=\"interpreting-function-calls\"><small>10&#8202;.&#8202;1&#8202;.&#8202;2</small>Interpreting function calls</a></h3>\n<p>We don&rsquo;t have any functions we can call, so it seems weird to start implementing\ncalls first, but we&rsquo;ll worry about that when we get there. 
First, our\ninterpreter needs a new import.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em></div>\n<pre class=\"insert\"><span class=\"k\">import</span> <span class=\"i\">java.util.ArrayList</span>;\n</pre><pre class=\"insert-after\">import java.util.List;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em></div>\n\n<p>As always, interpretation starts with a new visit method for our new call\nexpression node.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitBinaryExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">visitCallExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Call</span> <span class=\"i\">expr</span>) {\n    <span class=\"t\">Object</span> <span class=\"i\">callee</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">expr</span>.<span class=\"i\">callee</span>);\n\n    <span class=\"t\">List</span>&lt;<span class=\"t\">Object</span>&gt; <span class=\"i\">arguments</span> = <span class=\"k\">new</span> <span class=\"t\">ArrayList</span>&lt;&gt;();\n    <span class=\"k\">for</span> (<span class=\"t\">Expr</span> <span class=\"i\">argument</span> : <span class=\"i\">expr</span>.<span class=\"i\">arguments</span>) {<span name=\"in-order\"> </span>\n      <span class=\"i\">arguments</span>.<span class=\"i\">add</span>(<span class=\"i\">evaluate</span>(<span class=\"i\">argument</span>));\n    }\n\n    <span class=\"t\">LoxCallable</span> <span class=\"i\">function</span> = (<span class=\"t\">LoxCallable</span>)<span class=\"i\">callee</span>;\n    <span class=\"k\">return</span> <span class=\"i\">function</span>.<span class=\"i\">call</span>(<span class=\"k\">this</span>, <span class=\"i\">arguments</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after 
<em>visitBinaryExpr</em>()</div>\n\n<p>First, we evaluate the expression for the callee. Typically, this expression is\njust an identifier that looks up the function by its name, but it could be\nanything. Then we evaluate each of the argument expressions in order and store\nthe resulting values in a list.</p>\n<aside name=\"in-order\">\n<p>This is another one of those subtle semantic choices. Since argument expressions\nmay have side effects, the order they are evaluated could be user visible. Even\nso, some languages like Scheme and C don&rsquo;t specify an order. This gives\ncompilers freedom to reorder them for efficiency, but means users may be\nunpleasantly surprised if arguments aren&rsquo;t evaluated in the order they expect.</p>\n</aside>\n<p>Once we&rsquo;ve got the callee and the arguments ready, all that remains is to\nperform the call. We do that by casting the callee to a <span\nname=\"callable\">LoxCallable</span> and then invoking a <code>call()</code> method on it.\nThe Java representation of any Lox object that can be called like a function\nwill implement this interface. That includes user-defined functions, naturally,\nbut also class objects since classes are &ldquo;called&rdquo; to construct new instances.\nWe&rsquo;ll also use it for one more purpose shortly.</p>\n<aside name=\"callable\">\n<p>I stuck &ldquo;Lox&rdquo; before the name to distinguish it from the Java standard library&rsquo;s\nown Callable interface. 
Alas, all the good simple names are already taken.</p>\n</aside>\n<p>There isn&rsquo;t too much to this new interface.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxCallable.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">import</span> <span class=\"i\">java.util.List</span>;\n\n<span class=\"k\">interface</span> <span class=\"t\">LoxCallable</span> {\n  <span class=\"t\">Object</span> <span class=\"i\">call</span>(<span class=\"t\">Interpreter</span> <span class=\"i\">interpreter</span>, <span class=\"t\">List</span>&lt;<span class=\"t\">Object</span>&gt; <span class=\"i\">arguments</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxCallable.java</em>, create new file</div>\n\n<p>We pass in the interpreter in case the class implementing <code>call()</code> needs it. We\nalso give it the list of evaluated argument values. The implementer&rsquo;s job is\nthen to return the value that the call expression produces.</p>\n<h3><a href=\"#call-type-errors\" id=\"call-type-errors\"><small>10&#8202;.&#8202;1&#8202;.&#8202;3</small>Call type errors</a></h3>\n<p>Before we get to implementing LoxCallable, we need to make the visit method a\nlittle more robust. It currently ignores a couple of failure modes that we can&rsquo;t\npretend won&rsquo;t occur. First, what happens if the callee isn&rsquo;t actually something\nyou can call? What if you try to do this:</p>\n<div class=\"codehilite\"><pre><span class=\"s\">&quot;totally not a function&quot;</span>();\n</pre></div>\n<p>Strings aren&rsquo;t callable in Lox. The runtime representation of a Lox string is a\nJava string, so when we cast that to LoxCallable, the JVM will throw a\nClassCastException. We don&rsquo;t want our interpreter to vomit out some nasty Java\nstack trace and die. 
Instead, we need to check the type ourselves first.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    }\n\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitCallExpr</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (!(<span class=\"i\">callee</span> <span class=\"k\">instanceof</span> <span class=\"t\">LoxCallable</span>)) {\n      <span class=\"k\">throw</span> <span class=\"k\">new</span> <span class=\"t\">RuntimeError</span>(<span class=\"i\">expr</span>.<span class=\"i\">paren</span>,\n          <span class=\"s\">&quot;Can only call functions and classes.&quot;</span>);\n    }\n\n</pre><pre class=\"insert-after\">    LoxCallable function = (LoxCallable)callee;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitCallExpr</em>()</div>\n\n<p>We still throw an exception, but now we&rsquo;re throwing our own exception type, one\nthat the interpreter knows to catch and report gracefully.</p>\n<h3><a href=\"#checking-arity\" id=\"checking-arity\"><small>10&#8202;.&#8202;1&#8202;.&#8202;4</small>Checking arity</a></h3>\n<p>The other problem relates to the function&rsquo;s <strong>arity</strong>. Arity is the fancy term\nfor the number of arguments a function or operation expects. Unary operators\nhave arity one, binary operators two, etc. With functions, the arity is\ndetermined by the number of parameters it declares.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">add</span>(<span class=\"i\">a</span>, <span class=\"i\">b</span>, <span class=\"i\">c</span>) {\n  <span class=\"k\">print</span> <span class=\"i\">a</span> + <span class=\"i\">b</span> + <span class=\"i\">c</span>;\n}\n</pre></div>\n<p>This function defines three parameters, <code>a</code>, <code>b</code>, and <code>c</code>, so its arity is\nthree and it expects three arguments. 
So what if you try to call it like this:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">add</span>(<span class=\"n\">1</span>, <span class=\"n\">2</span>, <span class=\"n\">3</span>, <span class=\"n\">4</span>); <span class=\"c\">// Too many.</span>\n<span class=\"i\">add</span>(<span class=\"n\">1</span>, <span class=\"n\">2</span>);       <span class=\"c\">// Too few.</span>\n</pre></div>\n<p>Different languages take different approaches to this problem. Of course, most\nstatically typed languages check this at compile time and refuse to compile the\ncode if the argument count doesn&rsquo;t match the function&rsquo;s arity. JavaScript\ndiscards any extra arguments you pass. If you don&rsquo;t pass enough, it fills in the\nmissing parameters with the magic sort-of-like-null-but-not-really value\n<code>undefined</code>. Python is stricter. It raises a runtime error if the argument list\nis too short or too long.</p>\n<p>I think the latter is a better approach. Passing the wrong number of arguments\nis almost always a bug, and it&rsquo;s a mistake I do make in practice. Given that,\nthe sooner the implementation draws my attention to it, the better. So for Lox,\nwe&rsquo;ll take Python&rsquo;s approach. 
Before invoking the callable, we check to see if\nthe argument list&rsquo;s length matches the callable&rsquo;s arity.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    LoxCallable function = (LoxCallable)callee;\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitCallExpr</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">arguments</span>.<span class=\"i\">size</span>() != <span class=\"i\">function</span>.<span class=\"i\">arity</span>()) {\n      <span class=\"k\">throw</span> <span class=\"k\">new</span> <span class=\"t\">RuntimeError</span>(<span class=\"i\">expr</span>.<span class=\"i\">paren</span>, <span class=\"s\">&quot;Expected &quot;</span> +\n          <span class=\"i\">function</span>.<span class=\"i\">arity</span>() + <span class=\"s\">&quot; arguments but got &quot;</span> +\n          <span class=\"i\">arguments</span>.<span class=\"i\">size</span>() + <span class=\"s\">&quot;.&quot;</span>);\n    }\n\n</pre><pre class=\"insert-after\">    return function.call(this, arguments);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitCallExpr</em>()</div>\n\n<p>That requires a new method on the LoxCallable interface to ask it its arity.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">interface LoxCallable {\n</pre><div class=\"source-file\"><em>lox/LoxCallable.java</em><br>\nin interface <em>LoxCallable</em></div>\n<pre class=\"insert\">  <span class=\"t\">int</span> <span class=\"i\">arity</span>();\n</pre><pre class=\"insert-after\">  Object call(Interpreter interpreter, List&lt;Object&gt; arguments);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxCallable.java</em>, in interface <em>LoxCallable</em></div>\n\n<p>We <em>could</em> push the arity checking into the concrete implementation of <code>call()</code>.\nBut, since we&rsquo;ll have multiple classes implementing LoxCallable, that would end\nup with 
redundant validation spread across a few classes. Hoisting it up into\nthe visit method lets us do it in one place.</p>\n<h2><a href=\"#native-functions\" id=\"native-functions\"><small>10&#8202;.&#8202;2</small>Native Functions</a></h2>\n<p>We can theoretically call functions, but we have no functions to call yet.\nBefore we get to user-defined functions, now is a good time to introduce a vital\nbut often overlooked facet of language implementations<span class=\"em\">&mdash;</span><span\nname=\"native\"><strong>native functions</strong></span>. These are functions that the\ninterpreter exposes to user code but that are implemented in the host language\n(in our case Java), not the language being implemented (Lox).</p>\n<p>Sometimes these are called <strong>primitives</strong>, <strong>external functions</strong>, or <strong>foreign\nfunctions</strong>. Since these functions can be called while the user&rsquo;s program is\nrunning, they form part of the implementation&rsquo;s runtime. A lot of programming\nlanguage books gloss over these because they aren&rsquo;t conceptually interesting.\nThey&rsquo;re mostly grunt work.</p>\n<aside name=\"native\">\n<p>Curiously, two names for these functions<span class=\"em\">&mdash;</span>&ldquo;native&rdquo; and &ldquo;foreign&rdquo;<span class=\"em\">&mdash;</span>are\nantonyms. Maybe it depends on the perspective of the person choosing the term.\nIf you think of yourself as &ldquo;living&rdquo; within the runtime&rsquo;s implementation (in our\ncase, Java) then functions written in that are &ldquo;native&rdquo;. But if you have the\nmindset of a <em>user</em> of your language, then the runtime is implemented in some\nother &ldquo;foreign&rdquo; language.</p>\n<p>Or it may be that &ldquo;native&rdquo; refers to the machine code language of the underlying\nhardware. 
In Java, &ldquo;native&rdquo; methods are ones implemented in C or C++ and\ncompiled to native machine code.</p><img src=\"image/functions/foreign.png\" class=\"above\" alt=\"All a matter of perspective.\" />\n</aside>\n<p>But when it comes to making your language actually good at doing useful stuff,\nthe native functions your implementation provides are key. They provide access\nto the fundamental services that all programs are defined in terms of. If you\ndon&rsquo;t provide native functions to access the file system, a user&rsquo;s going to have\na hell of a time writing a program that reads and <span\nname=\"print\">displays</span> a file.</p>\n<aside name=\"print\">\n<p>A classic native function almost every language provides is one to print text to\nstdout. In Lox, I made <code>print</code> a built-in statement so that we could get stuff\non screen in the chapters before this one.</p>\n<p>Once we have functions, we could simplify the language by tearing out the old\nprint syntax and replacing it with a native function. But that would mean that\nexamples early in the book wouldn&rsquo;t run on the interpreter from later chapters\nand vice versa. So, for the book, I&rsquo;ll leave it alone.</p>\n<p>If you&rsquo;re building an interpreter for your <em>own</em> language, though, you may want\nto consider it.</p>\n</aside>\n<p>Many languages also allow users to provide their own native functions. The\nmechanism for doing so is called a <strong>foreign function interface</strong> (<strong>FFI</strong>),\n<strong>native extension</strong>, <strong>native interface</strong>, or something along those lines.\nThese are nice because they free the language implementer from providing access\nto every single capability the underlying platform supports. 
We won&rsquo;t define an\nFFI for jlox, but we will add one native function to give you an idea of what it\nlooks like.</p>\n<h3><a href=\"#telling-time\" id=\"telling-time\"><small>10&#8202;.&#8202;2&#8202;.&#8202;1</small>Telling time</a></h3>\n<p>When we get to <a href=\"a-bytecode-virtual-machine.html\">Part III</a> and start working on a much more efficient\nimplementation of Lox, we&rsquo;re going to care deeply about performance. Performance\nwork requires measurement, and that in turn means <strong>benchmarks</strong>. These are\nprograms that measure the time it takes to exercise some corner of the\ninterpreter.</p>\n<p>We could measure the time it takes to start up the interpreter, run the\nbenchmark, and exit, but that adds a lot of overhead<span class=\"em\">&mdash;</span>JVM startup time, OS\nshenanigans, etc. That stuff does matter, of course, but if you&rsquo;re just trying\nto validate an optimization to some piece of the interpreter, you don&rsquo;t want\nthat overhead obscuring your results.</p>\n<p>A nicer solution is to have the benchmark script itself measure the time elapsed\nbetween two points in the code. To do that, a Lox program needs to be able to\ntell time. There&rsquo;s no way to do that now<span class=\"em\">&mdash;</span>you can&rsquo;t implement a useful clock\n&ldquo;from scratch&rdquo; without access to the underlying clock on the computer.</p>\n<p>So we&rsquo;ll add <code>clock()</code>, a native function that returns the number of seconds\nthat have passed since some fixed point in time. 
The difference between two\nsuccessive invocations tells you how much time elapsed between the two calls.\nThis function is defined in the global scope, so let&rsquo;s ensure the interpreter\nhas access to that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">class Interpreter implements Expr.Visitor&lt;Object&gt;,\n                             Stmt.Visitor&lt;Void&gt; {\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin class <em>Interpreter</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"k\">final</span> <span class=\"t\">Environment</span> <span class=\"i\">globals</span> = <span class=\"k\">new</span> <span class=\"t\">Environment</span>();\n  <span class=\"k\">private</span> <span class=\"t\">Environment</span> <span class=\"i\">environment</span> = <span class=\"i\">globals</span>;\n</pre><pre class=\"insert-after\">\n\n  void interpret(List&lt;Stmt&gt; statements) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in class <em>Interpreter</em>, replace 1 line</div>\n\n<p>The <code>environment</code> field in the interpreter changes as we enter and exit local\nscopes. It tracks the <em>current</em> environment. 
This new <code>globals</code> field holds a\nfixed reference to the outermost global environment.</p>\n<p>When we instantiate an Interpreter, we stuff the native function in that global\nscope.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  private Environment environment = globals;\n\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin class <em>Interpreter</em></div>\n<pre class=\"insert\">  <span class=\"t\">Interpreter</span>() {\n    <span class=\"i\">globals</span>.<span class=\"i\">define</span>(<span class=\"s\">&quot;clock&quot;</span>, <span class=\"k\">new</span> <span class=\"t\">LoxCallable</span>() {\n      <span class=\"a\">@Override</span>\n      <span class=\"k\">public</span> <span class=\"t\">int</span> <span class=\"i\">arity</span>() { <span class=\"k\">return</span> <span class=\"n\">0</span>; }\n\n      <span class=\"a\">@Override</span>\n      <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">call</span>(<span class=\"t\">Interpreter</span> <span class=\"i\">interpreter</span>,\n                         <span class=\"t\">List</span>&lt;<span class=\"t\">Object</span>&gt; <span class=\"i\">arguments</span>) {\n        <span class=\"k\">return</span> (<span class=\"t\">double</span>)<span class=\"t\">System</span>.<span class=\"i\">currentTimeMillis</span>() / <span class=\"n\">1000.0</span>;\n      }\n\n      <span class=\"a\">@Override</span>\n      <span class=\"k\">public</span> <span class=\"t\">String</span> <span class=\"i\">toString</span>() { <span class=\"k\">return</span> <span class=\"s\">&quot;&lt;native fn&gt;&quot;</span>; }\n    });\n  }\n\n</pre><pre class=\"insert-after\">  void interpret(List&lt;Stmt&gt; statements) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in class <em>Interpreter</em></div>\n\n<p>This defines a <span name=\"lisp-1\">variable</span> named &ldquo;clock&rdquo;. 
Its value is a\nJava anonymous class that implements LoxCallable. The <code>clock()</code> function takes\nno arguments, so its arity is zero. The implementation of <code>call()</code> calls the\ncorresponding Java function and converts the result to a double value in\nseconds.</p>\n<aside name=\"lisp-1\">\n<p>In Lox, functions and variables occupy the same namespace. In Common Lisp, the\ntwo live in their own worlds. A function and variable with the same name don&rsquo;t\ncollide. If you call the name, it looks up the function. If you refer to it, it\nlooks up the variable. This does require jumping through some hoops when you do\nwant to refer to a function as a first-class value.</p>\n<p>Richard P. Gabriel and Kent Pitman coined the terms &ldquo;Lisp-1&rdquo; to refer to\nlanguages like Scheme that put functions and variables in the same namespace,\nand &ldquo;Lisp-2&rdquo; for languages like Common Lisp that partition them. Despite being\ntotally opaque, those names have since stuck. Lox is a Lisp-1.</p>\n</aside>\n<p>If we wanted to add other native functions<span class=\"em\">&mdash;</span>reading input from the user,\nworking with files, etc.<span class=\"em\">&mdash;</span>we could add them each as their own anonymous class\nthat implements LoxCallable. But for the book, this one is really all we need.</p>\n<p>Let&rsquo;s get ourselves out of the function-defining business and let our users\ntake over<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<h2><a href=\"#function-declarations\" id=\"function-declarations\"><small>10&#8202;.&#8202;3</small>Function Declarations</a></h2>\n<p>We finally get to add a new production to the <code>declaration</code> rule we introduced\nback when we added variables. Function declarations, like variables, bind a new\n<span name=\"name\">name</span>. 
That means they are allowed only in places where\na declaration is permitted.</p>\n<aside name=\"name\">\n<p>A named function declaration isn&rsquo;t really a single primitive operation. It&rsquo;s\nsyntactic sugar for two distinct steps: (1) creating a new function object, and\n(2) binding a new variable to it. If Lox had syntax for anonymous functions, we\nwouldn&rsquo;t need function declaration statements. You could just do:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">add</span> = <span class=\"k\">fun</span> (<span class=\"i\">a</span>, <span class=\"i\">b</span>) {\n  <span class=\"k\">print</span> <span class=\"i\">a</span> + <span class=\"i\">b</span>;\n};\n</pre></div>\n<p>However, since named functions are the common case, I went ahead and gave Lox\nnice syntax for them.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"i\">declaration</span>    → <span class=\"i\">funDecl</span>\n               | <span class=\"i\">varDecl</span>\n               | <span class=\"i\">statement</span> ;\n</pre></div>\n<p>The updated <code>declaration</code> rule references this new rule:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">funDecl</span>        → <span class=\"s\">&quot;fun&quot;</span> <span class=\"i\">function</span> ;\n<span class=\"i\">function</span>       → <span class=\"t\">IDENTIFIER</span> <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">parameters</span>? <span class=\"s\">&quot;)&quot;</span> <span class=\"i\">block</span> ;\n</pre></div>\n<p>The main <code>funDecl</code> rule uses a separate helper rule <code>function</code>. A function\n<em>declaration statement</em> is the <code>fun</code> keyword followed by the actual function-y\nstuff. When we get to classes, we&rsquo;ll reuse that <code>function</code> rule for declaring\nmethods. 
Those look similar to function declarations, but aren&rsquo;t preceded by\n<span name=\"fun\"><code>fun</code></span>.</p>\n<aside name=\"fun\">\n<p>Methods are too classy to have fun.</p>\n</aside>\n<p>The function itself is a name followed by the parenthesized parameter list and\nthe body. The body is always a braced block, using the same grammar rule that\nblock statements use. The parameter list uses this rule:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">parameters</span>     → <span class=\"t\">IDENTIFIER</span> ( <span class=\"s\">&quot;,&quot;</span> <span class=\"t\">IDENTIFIER</span> )* ;\n</pre></div>\n<p>It&rsquo;s like the earlier <code>arguments</code> rule, except that each parameter is an\nidentifier, not an expression. That&rsquo;s a lot of new syntax for the parser to chew\nthrough, but the resulting AST <span name=\"fun-ast\">node</span> isn&rsquo;t too bad.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Expression : Expr expression&quot;,\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;Function   : Token name, List&lt;Token&gt; params,&quot;</span> +\n                  <span class=\"s\">&quot; List&lt;Stmt&gt; body&quot;</span>,\n</pre><pre class=\"insert-after\">      &quot;If         : Expr condition, Stmt thenBranch,&quot; +\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<aside name=\"fun-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#function-statement\">Appendix II</a>.</p>\n</aside>\n<p>A function node has a name, a list of parameters (their names), and then the\nbody. 
We store the body as the list of statements contained inside the curly\nbraces.</p>\n<p>Over in the parser, we weave in the new declaration.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    try {\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>declaration</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">FUN</span>)) <span class=\"k\">return</span> <span class=\"i\">function</span>(<span class=\"s\">&quot;function&quot;</span>);\n</pre><pre class=\"insert-after\">      if (match(VAR)) return varDeclaration();\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>declaration</em>()</div>\n\n<p>Like other statements, a function is recognized by the leading keyword. When we\nencounter <code>fun</code>, we call <code>function</code>. That corresponds to the <code>function</code> grammar\nrule since we already matched and consumed the <code>fun</code> keyword. We&rsquo;ll build the\nmethod up a piece at a time, starting with this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>expressionStatement</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Stmt</span>.<span class=\"t\">Function</span> <span class=\"i\">function</span>(<span class=\"t\">String</span> <span class=\"i\">kind</span>) {\n    <span class=\"t\">Token</span> <span class=\"i\">name</span> = <span class=\"i\">consume</span>(<span class=\"i\">IDENTIFIER</span>, <span class=\"s\">&quot;Expect &quot;</span> + <span class=\"i\">kind</span> + <span class=\"s\">&quot; name.&quot;</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>expressionStatement</em>()</div>\n\n<p>Right now, it only consumes the identifier token for the function&rsquo;s name. You\nmight be wondering about that funny little <code>kind</code> parameter. 
Just like we reuse\nthe grammar rule, we&rsquo;ll reuse the <code>function()</code> method later to parse methods\ninside classes. When we do that, we&rsquo;ll pass in &ldquo;method&rdquo; for <code>kind</code> so that the\nerror messages are specific to the kind of declaration being parsed.</p>\n<p>Next, we parse the parameter list and the pair of parentheses wrapped around it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    Token name = consume(IDENTIFIER, &quot;Expect &quot; + kind + &quot; name.&quot;);\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>function</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">consume</span>(<span class=\"i\">LEFT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;(&#39; after &quot;</span> + <span class=\"i\">kind</span> + <span class=\"s\">&quot; name.&quot;</span>);\n    <span class=\"t\">List</span>&lt;<span class=\"t\">Token</span>&gt; <span class=\"i\">parameters</span> = <span class=\"k\">new</span> <span class=\"t\">ArrayList</span>&lt;&gt;();\n    <span class=\"k\">if</span> (!<span class=\"i\">check</span>(<span class=\"i\">RIGHT_PAREN</span>)) {\n      <span class=\"k\">do</span> {\n        <span class=\"k\">if</span> (<span class=\"i\">parameters</span>.<span class=\"i\">size</span>() &gt;= <span class=\"n\">255</span>) {\n          <span class=\"i\">error</span>(<span class=\"i\">peek</span>(), <span class=\"s\">&quot;Can&#39;t have more than 255 parameters.&quot;</span>);\n        }\n\n        <span class=\"i\">parameters</span>.<span class=\"i\">add</span>(\n            <span class=\"i\">consume</span>(<span class=\"i\">IDENTIFIER</span>, <span class=\"s\">&quot;Expect parameter name.&quot;</span>));\n      } <span class=\"k\">while</span> (<span class=\"i\">match</span>(<span class=\"i\">COMMA</span>));\n    }\n    <span class=\"i\">consume</span>(<span class=\"i\">RIGHT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;)&#39; after 
parameters.&quot;</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>function</em>()</div>\n\n<p>This is like the code for handling arguments in a call, except not split out\ninto a helper method. The outer <code>if</code> statement handles the zero parameter case,\nand the inner <code>while</code> loop parses parameters as long as we find commas to\nseparate them. The result is the list of tokens for each parameter&rsquo;s name.</p>\n<p>Just like we do with arguments to function calls, we validate at parse time\nthat you don&rsquo;t exceed the maximum number of parameters a function is allowed to\nhave.</p>\n<p>Finally, we parse the body and wrap it all up in a function node.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    consume(RIGHT_PAREN, &quot;Expect ')' after parameters.&quot;);\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>function</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"i\">consume</span>(<span class=\"i\">LEFT_BRACE</span>, <span class=\"s\">&quot;Expect &#39;{&#39; before &quot;</span> + <span class=\"i\">kind</span> + <span class=\"s\">&quot; body.&quot;</span>);\n    <span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>&gt; <span class=\"i\">body</span> = <span class=\"i\">block</span>();\n    <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">Function</span>(<span class=\"i\">name</span>, <span class=\"i\">parameters</span>, <span class=\"i\">body</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>function</em>()</div>\n\n<p>Note that we consume the <code>{</code> at the beginning of the body here before calling\n<code>block()</code>. That&rsquo;s because <code>block()</code> assumes the brace token has already been\nmatched. 
Consuming it here lets us report a more precise error message if the\n<code>{</code> isn&rsquo;t found since we know it&rsquo;s in the context of a function declaration.</p>\n<h2><a href=\"#function-objects\" id=\"function-objects\"><small>10&#8202;.&#8202;4</small>Function Objects</a></h2>\n<p>We&rsquo;ve got some syntax parsed so usually we&rsquo;re ready to interpret, but first we\nneed to think about how to represent a Lox function in Java. We need to keep\ntrack of the parameters so that we can bind them to argument values when the\nfunction is called. And, of course, we need to keep the code for the body of the\nfunction so that we can execute it.</p>\n<p>That&rsquo;s basically what the Stmt.Function class is. Could we just use that?\nAlmost, but not quite. We also need a class that implements LoxCallable so that\nwe can call it. We don&rsquo;t want the runtime phase of the interpreter to bleed into\nthe front end&rsquo;s syntax classes so we don&rsquo;t want Stmt.Function itself to\nimplement that. 
Instead, we wrap it in a new class.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxFunction.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">import</span> <span class=\"i\">java.util.List</span>;\n\n<span class=\"k\">class</span> <span class=\"t\">LoxFunction</span> <span class=\"k\">implements</span> <span class=\"t\">LoxCallable</span> {\n  <span class=\"k\">private</span> <span class=\"k\">final</span> <span class=\"t\">Stmt</span>.<span class=\"t\">Function</span> <span class=\"i\">declaration</span>;\n  <span class=\"t\">LoxFunction</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Function</span> <span class=\"i\">declaration</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">declaration</span> = <span class=\"i\">declaration</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxFunction.java</em>, create new file</div>\n\n<p>We implement the <code>call()</code> of LoxCallable like so:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxFunction.java</em><br>\nadd after <em>LoxFunction</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">call</span>(<span class=\"t\">Interpreter</span> <span class=\"i\">interpreter</span>,\n                     <span class=\"t\">List</span>&lt;<span class=\"t\">Object</span>&gt; <span class=\"i\">arguments</span>) {\n    <span class=\"t\">Environment</span> <span class=\"i\">environment</span> = <span class=\"k\">new</span> <span class=\"t\">Environment</span>(<span class=\"i\">interpreter</span>.<span class=\"i\">globals</span>);\n    <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">declaration</span>.<span class=\"i\">params</span>.<span 
class=\"i\">size</span>(); <span class=\"i\">i</span>++) {\n      <span class=\"i\">environment</span>.<span class=\"i\">define</span>(<span class=\"i\">declaration</span>.<span class=\"i\">params</span>.<span class=\"i\">get</span>(<span class=\"i\">i</span>).<span class=\"i\">lexeme</span>,\n          <span class=\"i\">arguments</span>.<span class=\"i\">get</span>(<span class=\"i\">i</span>));\n    }\n\n    <span class=\"i\">interpreter</span>.<span class=\"i\">executeBlock</span>(<span class=\"i\">declaration</span>.<span class=\"i\">body</span>, <span class=\"i\">environment</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxFunction.java</em>, add after <em>LoxFunction</em>()</div>\n\n<p>This handful of lines of code is one of the most fundamental, powerful pieces of\nour interpreter. As we saw in <a href=\"statements-and-state.html\">the chapter on statements and <span\nname=\"env\">state</span></a>, managing name environments is a core part\nof a language implementation. Functions are deeply tied to that.</p>\n<aside name=\"env\">\n<p>We&rsquo;ll dig even deeper into environments in the <a href=\"resolving-and-binding.html\">next chapter</a>.</p>\n</aside>\n<p>Parameters are core to functions, especially the fact that a function\n<em>encapsulates</em> its parameters<span class=\"em\">&mdash;</span>no other code outside of the function can see\nthem. This means each function gets its own environment where it stores those\nvariables.</p>\n<p>Further, this environment must be created dynamically. Each function <em>call</em> gets\nits own environment. Otherwise, recursion would break. 
If there are multiple\ncalls to the same function in play at the same time, each needs its <em>own</em>\nenvironment, even though they are all calls to the same function.</p>\n<p>For example, here&rsquo;s a convoluted way to count to three:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">count</span>(<span class=\"i\">n</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">n</span> &gt; <span class=\"n\">1</span>) <span class=\"i\">count</span>(<span class=\"i\">n</span> - <span class=\"n\">1</span>);\n  <span class=\"k\">print</span> <span class=\"i\">n</span>;\n}\n\n<span class=\"i\">count</span>(<span class=\"n\">3</span>);\n</pre></div>\n<p>Imagine we pause the interpreter right at the point where it&rsquo;s about to print 1\nin the innermost nested call. The outer calls to print 2 and 3 haven&rsquo;t printed\ntheir values yet, so there must be environments somewhere in memory that still\nstore the fact that <code>n</code> is bound to 3 in one context, 2 in another, and 1 in the\ninnermost, like:</p><img src=\"image/functions/recursion.png\" alt=\"A separate environment for each recursive call.\" />\n<p>That&rsquo;s why we create a new environment at each <em>call</em>, not at the function\n<em>declaration</em>. The <code>call()</code> method we saw earlier does that. At the beginning of\nthe call, it creates a new environment. Then it walks the parameter and argument\nlists in lockstep. 
For each pair, it creates a new variable with the parameter&rsquo;s\nname and binds it to the argument&rsquo;s value.</p>\n<p>So, for a program like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">add</span>(<span class=\"i\">a</span>, <span class=\"i\">b</span>, <span class=\"i\">c</span>) {\n  <span class=\"k\">print</span> <span class=\"i\">a</span> + <span class=\"i\">b</span> + <span class=\"i\">c</span>;\n}\n\n<span class=\"i\">add</span>(<span class=\"n\">1</span>, <span class=\"n\">2</span>, <span class=\"n\">3</span>);\n</pre></div>\n<p>At the point of the call to <code>add()</code>, the interpreter creates something like\nthis:</p><img src=\"image/functions/binding.png\" alt=\"Binding arguments to their parameters.\" />\n<p>Then <code>call()</code> tells the interpreter to execute the body of the function in this\nnew function-local environment. Up until now, the current environment was the\nenvironment where the function was being called. Now, we teleport from there\ninside the new parameter space we&rsquo;ve created for the function.</p>\n<p>This is all that&rsquo;s required to pass data into the function. By using different\nenvironments when we execute the body, calls to the same function with the\nsame code can produce different results.</p>\n<p>Once the body of the function has finished executing, <code>executeBlock()</code> discards\nthat function-local environment and restores the previous one that was active\nback at the callsite. Finally, <code>call()</code> returns <code>null</code>, which returns <code>nil</code> to\nthe caller. (We&rsquo;ll add return values later.)</p>\n<p>Mechanically, the code is pretty simple. Walk a couple of lists. Bind some new\nvariables. Call a method. But this is where the crystalline <em>code</em> of the\nfunction declaration becomes a living, breathing <em>invocation</em>. This is one of my\nfavorite snippets in this entire book. 
Feel free to take a moment to meditate on\nit if you&rsquo;re so inclined.</p>\n<p>Done? OK. Note when we bind the parameters, we assume the parameter and argument\nlists have the same length. This is safe because <code>visitCallExpr()</code> checks the\narity before calling <code>call()</code>. It relies on the function reporting its arity to\ndo that.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxFunction.java</em><br>\nadd after <em>LoxFunction</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">int</span> <span class=\"i\">arity</span>() {\n    <span class=\"k\">return</span> <span class=\"i\">declaration</span>.<span class=\"i\">params</span>.<span class=\"i\">size</span>();\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxFunction.java</em>, add after <em>LoxFunction</em>()</div>\n\n<p>That&rsquo;s most of our object representation. While we&rsquo;re in here, we may as well\nimplement <code>toString()</code>.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxFunction.java</em><br>\nadd after <em>LoxFunction</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">String</span> <span class=\"i\">toString</span>() {\n    <span class=\"k\">return</span> <span class=\"s\">&quot;&lt;fn &quot;</span> + <span class=\"i\">declaration</span>.<span class=\"i\">name</span>.<span class=\"i\">lexeme</span> + <span class=\"s\">&quot;&gt;&quot;</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxFunction.java</em>, add after <em>LoxFunction</em>()</div>\n\n<p>This gives nicer output if a user decides to print a function value.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">add</span>(<span class=\"i\">a</span>, <span class=\"i\">b</span>) {\n  <span class=\"k\">print</span> <span class=\"i\">a</span> + <span class=\"i\">b</span>;\n}\n\n<span 
class=\"k\">print</span> <span class=\"i\">add</span>; <span class=\"c\">// &quot;&lt;fn add&gt;&quot;.</span>\n</pre></div>\n<h3><a href=\"#interpreting-function-declarations\" id=\"interpreting-function-declarations\"><small>10&#8202;.&#8202;4&#8202;.&#8202;1</small>Interpreting function declarations</a></h3>\n<p>We&rsquo;ll come back and refine LoxFunction soon, but that&rsquo;s enough to get started.\nNow we can visit a function declaration.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitExpressionStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitFunctionStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Function</span> <span class=\"i\">stmt</span>) {\n    <span class=\"t\">LoxFunction</span> <span class=\"i\">function</span> = <span class=\"k\">new</span> <span class=\"t\">LoxFunction</span>(<span class=\"i\">stmt</span>);\n    <span class=\"i\">environment</span>.<span class=\"i\">define</span>(<span class=\"i\">stmt</span>.<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>, <span class=\"i\">function</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitExpressionStmt</em>()</div>\n\n<p>This is similar to how we interpret other literal expressions. We take a\nfunction <em>syntax node</em><span class=\"em\">&mdash;</span>a compile-time representation of the function<span class=\"em\">&mdash;</span>and\nconvert it to its runtime representation. Here, that&rsquo;s a LoxFunction that wraps\nthe syntax node.</p>\n<p>Function declarations are different from other literal nodes in that the\ndeclaration <em>also</em> binds the resulting object to a new variable. 
So, after\ncreating the LoxFunction, we create a new binding in the current environment and\nstore a reference to it there.</p>\n<p>With that, we can define and call our own functions all within Lox. Give it a\ntry:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">sayHi</span>(<span class=\"i\">first</span>, <span class=\"i\">last</span>) {\n  <span class=\"k\">print</span> <span class=\"s\">&quot;Hi, &quot;</span> + <span class=\"i\">first</span> + <span class=\"s\">&quot; &quot;</span> + <span class=\"i\">last</span> + <span class=\"s\">&quot;!&quot;</span>;\n}\n\n<span class=\"i\">sayHi</span>(<span class=\"s\">&quot;Dear&quot;</span>, <span class=\"s\">&quot;Reader&quot;</span>);\n</pre></div>\n<p>I don&rsquo;t know about you, but that looks like an honest-to-God programming\nlanguage to me.</p>\n<h2><a href=\"#return-statements\" id=\"return-statements\"><small>10&#8202;.&#8202;5</small>Return Statements</a></h2>\n<p>We can get data into functions by passing parameters, but we&rsquo;ve got no way to\nget results back <span name=\"hotel\"><em>out</em></span>. If Lox were an\nexpression-oriented language like Ruby or Scheme, the body would be an\nexpression whose value is implicitly the function&rsquo;s result. But in Lox, the body\nof a function is a list of statements which don&rsquo;t produce values, so we need\ndedicated syntax for emitting a result. In other words, <code>return</code> statements. 
I&rsquo;m\nsure you can guess the grammar already.</p>\n<aside name=\"hotel\">\n<p>The Hotel California of data.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"i\">statement</span>      → <span class=\"i\">exprStmt</span>\n               | <span class=\"i\">forStmt</span>\n               | <span class=\"i\">ifStmt</span>\n               | <span class=\"i\">printStmt</span>\n               | <span class=\"i\">returnStmt</span>\n               | <span class=\"i\">whileStmt</span>\n               | <span class=\"i\">block</span> ;\n\n<span class=\"i\">returnStmt</span>     → <span class=\"s\">&quot;return&quot;</span> <span class=\"i\">expression</span>? <span class=\"s\">&quot;;&quot;</span> ;\n</pre></div>\n<p>We&rsquo;ve got one more<span class=\"em\">&mdash;</span>the final, in fact<span class=\"em\">&mdash;</span>production under the venerable\n<code>statement</code> rule. A <code>return</code> statement is the <code>return</code> keyword followed by an\noptional expression and terminated with a semicolon.</p>\n<p>The return value is optional to support exiting early from a function that\ndoesn&rsquo;t return a useful value. In statically typed languages, &ldquo;void&rdquo; functions\ndon&rsquo;t return a value and non-void ones do. Since Lox is dynamically typed, there\nare no true void functions. 
The compiler has no way of preventing you from\ntaking the result value of a call to a function that doesn&rsquo;t contain a <code>return</code>\nstatement.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">procedure</span>() {\n  <span class=\"k\">print</span> <span class=\"s\">&quot;don&#39;t return anything&quot;</span>;\n}\n\n<span class=\"k\">var</span> <span class=\"i\">result</span> = <span class=\"i\">procedure</span>();\n<span class=\"k\">print</span> <span class=\"i\">result</span>; <span class=\"c\">// ?</span>\n</pre></div>\n<p>This means every Lox function must return <em>something</em>, even if it contains no\n<code>return</code> statements at all. We use <code>nil</code> for this, which is why LoxFunction&rsquo;s\nimplementation of <code>call()</code> returns <code>null</code> at the end. In that same vein, if you\nomit the value in a <code>return</code> statement, we simply treat it as equivalent to:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">return</span> <span class=\"k\">nil</span>;\n</pre></div>\n<p>Over in our AST generator, we add a <span name=\"return-ast\">new node</span>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Print      : Expr expression&quot;,\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;Return     : Token keyword, Expr value&quot;</span>,\n</pre><pre class=\"insert-after\">      &quot;Var        : Token name, Expr initializer&quot;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<aside name=\"return-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#return-statement\">Appendix II</a>.</p>\n</aside>\n<p>It keeps the <code>return</code> keyword token so we can use its location for error\nreporting, and the value being returned, if any. 
We parse it like other\nstatements, first by recognizing the initial keyword.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    if (match(PRINT)) return printStatement();\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>statement</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">RETURN</span>)) <span class=\"k\">return</span> <span class=\"i\">returnStatement</span>();\n</pre><pre class=\"insert-after\">    if (match(WHILE)) return whileStatement();\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>statement</em>()</div>\n\n<p>That branches out to:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>printStatement</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Stmt</span> <span class=\"i\">returnStatement</span>() {\n    <span class=\"t\">Token</span> <span class=\"i\">keyword</span> = <span class=\"i\">previous</span>();\n    <span class=\"t\">Expr</span> <span class=\"i\">value</span> = <span class=\"k\">null</span>;\n    <span class=\"k\">if</span> (!<span class=\"i\">check</span>(<span class=\"i\">SEMICOLON</span>)) {\n      <span class=\"i\">value</span> = <span class=\"i\">expression</span>();\n    }\n\n    <span class=\"i\">consume</span>(<span class=\"i\">SEMICOLON</span>, <span class=\"s\">&quot;Expect &#39;;&#39; after return value.&quot;</span>);\n    <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">Return</span>(<span class=\"i\">keyword</span>, <span class=\"i\">value</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>printStatement</em>()</div>\n\n<p>After snagging the previously consumed <code>return</code> keyword, we look for a value\nexpression. 
Since many different tokens can potentially start an expression,\nit&rsquo;s hard to tell if a return value is <em>present</em>. Instead, we check if it&rsquo;s\n<em>absent</em>. Since a semicolon can&rsquo;t begin an expression, if the next token is\nthat, we know there must not be a value.</p>\n<h3><a href=\"#returning-from-calls\" id=\"returning-from-calls\"><small>10&#8202;.&#8202;5&#8202;.&#8202;1</small>Returning from calls</a></h3>\n<p>Interpreting a <code>return</code> statement is tricky. You can return from anywhere within\nthe body of a function, even deeply nested inside other statements. When the\nreturn is executed, the interpreter needs to jump all the way out of whatever\ncontext it&rsquo;s currently in and cause the function call to complete, like some\nkind of jacked up control flow construct.</p>\n<p>For example, say we&rsquo;re running this program and we&rsquo;re about to execute the\n<code>return</code> statement:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">count</span>(<span class=\"i\">n</span>) {\n  <span class=\"k\">while</span> (<span class=\"i\">n</span> &lt; <span class=\"n\">100</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">n</span> == <span class=\"n\">3</span>) <span class=\"k\">return</span> <span class=\"i\">n</span>; <span class=\"c\">// &lt;--</span>\n    <span class=\"k\">print</span> <span class=\"i\">n</span>;\n    <span class=\"i\">n</span> = <span class=\"i\">n</span> + <span class=\"n\">1</span>;\n  }\n}\n\n<span class=\"i\">count</span>(<span class=\"n\">1</span>);\n</pre></div>\n<p>The Java call stack currently looks roughly like this:</p>\n<div class=\"codehilite\"><pre>Interpreter.visitReturnStmt()\nInterpreter.visitIfStmt()\nInterpreter.executeBlock()\nInterpreter.visitBlockStmt()\nInterpreter.visitWhileStmt()\nInterpreter.executeBlock()\nLoxFunction.call()\nInterpreter.visitCallExpr()\n</pre></div>\n<p>We need to get from the top of the stack all the way back to 
<code>call()</code>. I don&rsquo;t\nknow about you, but to me that sounds like exceptions. When we execute a\n<code>return</code> statement, we&rsquo;ll use an exception to unwind the interpreter past the\nvisit methods of all of the containing statements back to the code that began\nexecuting the body.</p>\n<p>The visit method for our new AST node looks like this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitPrintStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitReturnStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Return</span> <span class=\"i\">stmt</span>) {\n    <span class=\"t\">Object</span> <span class=\"i\">value</span> = <span class=\"k\">null</span>;\n    <span class=\"k\">if</span> (<span class=\"i\">stmt</span>.<span class=\"i\">value</span> != <span class=\"k\">null</span>) <span class=\"i\">value</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">stmt</span>.<span class=\"i\">value</span>);\n\n    <span class=\"k\">throw</span> <span class=\"k\">new</span> <span class=\"t\">Return</span>(<span class=\"i\">value</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitPrintStmt</em>()</div>\n\n<p>If we have a return value, we evaluate it; otherwise, we use <code>nil</code>. 
Then we take\nthat value and wrap it in a custom exception class and throw it.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Return.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">class</span> <span class=\"t\">Return</span> <span class=\"k\">extends</span> <span class=\"t\">RuntimeException</span> {\n  <span class=\"k\">final</span> <span class=\"t\">Object</span> <span class=\"i\">value</span>;\n\n  <span class=\"t\">Return</span>(<span class=\"t\">Object</span> <span class=\"i\">value</span>) {\n    <span class=\"k\">super</span>(<span class=\"k\">null</span>, <span class=\"k\">null</span>, <span class=\"k\">false</span>, <span class=\"k\">false</span>);\n    <span class=\"k\">this</span>.<span class=\"i\">value</span> = <span class=\"i\">value</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Return.java</em>, create new file</div>\n\n<p>This class wraps the return value with the accoutrements Java requires for a\nruntime exception class. The weird super constructor call with those <code>null</code> and\n<code>false</code> arguments disables some JVM machinery that we don&rsquo;t need. Since we&rsquo;re\nusing our exception class for <span name=\"exception\">control flow</span> and not\nactual error handling, we don&rsquo;t need overhead like stack traces.</p>\n<aside name=\"exception\">\n<p>For the record, I&rsquo;m not generally a fan of using exceptions for control flow.\nBut inside a heavily recursive tree-walk interpreter, it&rsquo;s the way to go. 
Since\nour own syntax tree evaluation is so heavily tied to the Java call stack, we&rsquo;re\npressed to do some heavyweight call stack manipulation occasionally, and\nexceptions are a handy tool for that.</p>\n</aside>\n<p>We want this to unwind all the way to where the function call began, the\n<code>call()</code> method in LoxFunction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">          arguments.get(i));\n    }\n\n</pre><div class=\"source-file\"><em>lox/LoxFunction.java</em><br>\nin <em>call</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"k\">try</span> {\n      <span class=\"i\">interpreter</span>.<span class=\"i\">executeBlock</span>(<span class=\"i\">declaration</span>.<span class=\"i\">body</span>, <span class=\"i\">environment</span>);\n    } <span class=\"k\">catch</span> (<span class=\"t\">Return</span> <span class=\"i\">returnValue</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">returnValue</span>.<span class=\"i\">value</span>;\n    }\n</pre><pre class=\"insert-after\">    return null;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxFunction.java</em>, in <em>call</em>(), replace 1 line</div>\n\n<p>We wrap the call to <code>executeBlock()</code> in a try-catch block. When it catches a\nreturn exception, it pulls out the value and makes that the return value from\n<code>call()</code>. If it never catches one of these exceptions, it means the function\nreached the end of its body without hitting a <code>return</code> statement. In that case,\nit implicitly returns <code>nil</code>.</p>\n<p>Let&rsquo;s try it out. 
We finally have enough power to support this classic\nexample<span class=\"em\">&mdash;</span>a recursive function to calculate Fibonacci numbers:</p>\n<p><span name=\"slow\"></span></p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">fib</span>(<span class=\"i\">n</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">n</span> &lt;= <span class=\"n\">1</span>) <span class=\"k\">return</span> <span class=\"i\">n</span>;\n  <span class=\"k\">return</span> <span class=\"i\">fib</span>(<span class=\"i\">n</span> - <span class=\"n\">2</span>) + <span class=\"i\">fib</span>(<span class=\"i\">n</span> - <span class=\"n\">1</span>);\n}\n\n<span class=\"k\">for</span> (<span class=\"k\">var</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"n\">20</span>; <span class=\"i\">i</span> = <span class=\"i\">i</span> + <span class=\"n\">1</span>) {\n  <span class=\"k\">print</span> <span class=\"i\">fib</span>(<span class=\"i\">i</span>);\n}\n</pre></div>\n<p>This tiny program exercises almost every language feature we have spent the past\nseveral chapters implementing<span class=\"em\">&mdash;</span>expressions, arithmetic, branching, looping,\nvariables, functions, function calls, parameter binding, and returns.</p>\n<aside name=\"slow\">\n<p>You might notice this is pretty slow. Obviously, recursion isn&rsquo;t the most\nefficient way to calculate Fibonacci numbers, but as a microbenchmark, it does\na good job of stress testing how fast our interpreter implements function calls.</p>\n<p>As you can see, the answer is &ldquo;not very fast&rdquo;. That&rsquo;s OK. Our C interpreter will\nbe faster.</p>\n</aside>\n<h2><a href=\"#local-functions-and-closures\" id=\"local-functions-and-closures\"><small>10&#8202;.&#8202;6</small>Local Functions and Closures</a></h2>\n<p>Our functions are pretty full featured, but there is one hole to patch. 
In fact,\nit&rsquo;s a big enough gap that we&rsquo;ll spend most of the <a href=\"resolving-and-binding.html\">next chapter</a> sealing it\nup, but we can get started here.</p>\n<p>LoxFunction&rsquo;s implementation of <code>call()</code> creates a new environment where it\nbinds the function&rsquo;s parameters. When I showed you that code, I glossed over one\nimportant point: What is the <em>parent</em> of that environment?</p>\n<p>Right now, it is always <code>globals</code>, the top-level global environment. That way,\nif an identifier isn&rsquo;t defined inside the function body itself, the interpreter\ncan look outside the function in the global scope to find it. In the Fibonacci\nexample, that&rsquo;s how the interpreter is able to look up the recursive call to\n<code>fib</code> inside the function&rsquo;s own body<span class=\"em\">&mdash;</span><code>fib</code> is a global variable.</p>\n<p>But recall that in Lox, function declarations are allowed <em>anywhere</em> a name can\nbe bound. That includes the top level of a Lox script, but also the inside of\nblocks or other functions. 
Lox supports <strong>local functions</strong> that are defined\ninside another function, or nested inside a block.</p>\n<p>Consider this classic example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">makeCounter</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>;\n  <span class=\"k\">fun</span> <span class=\"i\">count</span>() {\n    <span class=\"i\">i</span> = <span class=\"i\">i</span> + <span class=\"n\">1</span>;\n    <span class=\"k\">print</span> <span class=\"i\">i</span>;\n  }\n\n  <span class=\"k\">return</span> <span class=\"i\">count</span>;\n}\n\n<span class=\"k\">var</span> <span class=\"i\">counter</span> = <span class=\"i\">makeCounter</span>();\n<span class=\"i\">counter</span>(); <span class=\"c\">// &quot;1&quot;.</span>\n<span class=\"i\">counter</span>(); <span class=\"c\">// &quot;2&quot;.</span>\n</pre></div>\n<p>Here, <code>count()</code> uses <code>i</code>, which is declared outside of itself in the containing\nfunction <code>makeCounter()</code>. <code>makeCounter()</code> returns a reference to the <code>count()</code>\nfunction and then its own body finishes executing completely.</p>\n<p>Meanwhile, the top-level code invokes the returned <code>count()</code> function. That\nexecutes the body of <code>count()</code>, which assigns to and reads <code>i</code>, even though the\nfunction where <code>i</code> was defined has already exited.</p>\n<p>If you&rsquo;ve never encountered a language with nested functions before, this might\nseem crazy, but users do expect it to work. Alas, if you run it now, you get an\nundefined variable error in the call to <code>counter()</code> when the body of <code>count()</code>\ntries to look up <code>i</code>. 
That&rsquo;s because the environment chain in effect looks like\nthis:</p><img src=\"image/functions/global.png\" alt=\"The environment chain from count()'s body to the global scope.\" />\n<p>When we call <code>count()</code> (through the reference to it stored in <code>counter</code>), we\ncreate a new empty environment for the function body. The parent of that is the\nglobal environment. We lost the environment for <code>makeCounter()</code> where <code>i</code> is\nbound.</p>\n<p>Let&rsquo;s go back in time a bit. Here&rsquo;s what the environment chain looked like right\nwhen we declared <code>count()</code> inside the body of <code>makeCounter()</code>:</p><img src=\"image/functions/body.png\" alt=\"The environment chain inside the body of makeCounter().\" />\n<p>So at the point where the function is declared, we can see <code>i</code>. But when we\nreturn from <code>makeCounter()</code> and exit its body, the interpreter discards that\nenvironment. Since the interpreter doesn&rsquo;t keep the environment surrounding\n<code>count()</code> around, it&rsquo;s up to the function object itself to hang on to it.</p>\n<p>This data structure is called a <span name=\"closure\"><strong>closure</strong></span> because\nit &ldquo;closes over&rdquo; and holds on to the surrounding variables where the function is\ndeclared. Closures have been around since the early Lisp days, and language\nhackers have come up with all manner of ways to implement them. For jlox, we&rsquo;ll\ndo the simplest thing that works. In LoxFunction, we add a field to store an\nenvironment.</p>\n<aside name=\"closure\">\n<p>&ldquo;Closure&rdquo; is yet another term coined by Peter J. Landin. 
I assume before he came\nalong that computer scientists communicated with each other using only primitive\ngrunts and pawing hand gestures.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">  private final Stmt.Function declaration;\n</pre><div class=\"source-file\"><em>lox/LoxFunction.java</em><br>\nin class <em>LoxFunction</em></div>\n<pre class=\"insert\">  <span class=\"k\">private</span> <span class=\"k\">final</span> <span class=\"t\">Environment</span> <span class=\"i\">closure</span>;\n\n</pre><pre class=\"insert-after\">  LoxFunction(Stmt.Function declaration) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxFunction.java</em>, in class <em>LoxFunction</em></div>\n\n<p>We initialize that in the constructor.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxFunction.java</em><br>\nconstructor <em>LoxFunction</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"t\">LoxFunction</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Function</span> <span class=\"i\">declaration</span>, <span class=\"t\">Environment</span> <span class=\"i\">closure</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">closure</span> = <span class=\"i\">closure</span>;\n</pre><pre class=\"insert-after\">    this.declaration = declaration;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxFunction.java</em>, constructor <em>LoxFunction</em>(), replace 1 line</div>\n\n<p>When we create a LoxFunction, we capture the current environment.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  public Void visitFunctionStmt(Stmt.Function stmt) {\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitFunctionStmt</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"t\">LoxFunction</span> <span class=\"i\">function</span> = <span class=\"k\">new</span> <span class=\"t\">LoxFunction</span>(<span class=\"i\">stmt</span>, <span 
class=\"i\">environment</span>);\n</pre><pre class=\"insert-after\">    environment.define(stmt.name.lexeme, function);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitFunctionStmt</em>(), replace 1 line</div>\n\n<p>This is the environment that is active when the function is <em>declared</em>, not when\nit&rsquo;s <em>called</em>, which is what we want. It represents the lexical scope\nsurrounding the function declaration. Finally, when we call the function, we use\nthat environment as the call&rsquo;s parent instead of going straight to <code>globals</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">                     List&lt;Object&gt; arguments) {\n</pre><div class=\"source-file\"><em>lox/LoxFunction.java</em><br>\nin <em>call</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"t\">Environment</span> <span class=\"i\">environment</span> = <span class=\"k\">new</span> <span class=\"t\">Environment</span>(<span class=\"i\">closure</span>);\n</pre><pre class=\"insert-after\">    for (int i = 0; i &lt; declaration.params.size(); i++) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxFunction.java</em>, in <em>call</em>(), replace 1 line</div>\n\n<p>This creates an environment chain that goes from the function&rsquo;s body out through\nthe environments where the function is declared, all the way out to the global\nscope. The runtime environment chain matches the textual nesting of the source\ncode, just as we want. The end result when we call that function looks like this:</p><img src=\"image/functions/closure.png\" alt=\"The environment chain with the closure.\" />\n<p>Now, as you can see, the interpreter can still find <code>i</code> when it needs to because\nit&rsquo;s in the middle of the environment chain. Try running that <code>makeCounter()</code>\nexample now. It works!</p>\n<p>Functions let us abstract over, reuse, and compose code. 
Lox is much more\npowerful than the rudimentary arithmetic calculator it used to be. Alas, in our\nrush to cram closures in, we have let a tiny bit of dynamic scoping leak into\nthe interpreter. In the <a href=\"resolving-and-binding.html\">next chapter</a>, we will dig deeper into lexical\nscope and close that hole.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Our interpreter carefully checks that the number of arguments passed to a\nfunction matches the number of parameters it expects. Since this check is\ndone at runtime on every call, it has a performance cost. Smalltalk\nimplementations don&rsquo;t have that problem. Why not?</p>\n</li>\n<li>\n<p>Lox&rsquo;s function declaration syntax performs two independent operations. It\ncreates a function and also binds it to a name. This improves usability for\nthe common case where you do want to associate a name with the function.\nBut in functional-style code, you often want to create a function to\nimmediately pass it to some other function or return it. In that case, it\ndoesn&rsquo;t need a name.</p>\n<p>Languages that encourage a functional style usually support <strong>anonymous\nfunctions</strong> or <strong>lambdas</strong><span class=\"em\">&mdash;</span>an expression syntax that creates a function\nwithout binding it to a name. 
Add anonymous function syntax to Lox so that\nthis works:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">thrice</span>(<span class=\"i\">fn</span>) {\n  <span class=\"k\">for</span> (<span class=\"k\">var</span> <span class=\"i\">i</span> = <span class=\"n\">1</span>; <span class=\"i\">i</span> &lt;= <span class=\"n\">3</span>; <span class=\"i\">i</span> = <span class=\"i\">i</span> + <span class=\"n\">1</span>) {\n    <span class=\"i\">fn</span>(<span class=\"i\">i</span>);\n  }\n}\n\n<span class=\"i\">thrice</span>(<span class=\"k\">fun</span> (<span class=\"i\">a</span>) {\n  <span class=\"k\">print</span> <span class=\"i\">a</span>;\n});\n<span class=\"c\">// &quot;1&quot;.</span>\n<span class=\"c\">// &quot;2&quot;.</span>\n<span class=\"c\">// &quot;3&quot;.</span>\n</pre></div>\n<p>How do you handle the tricky case of an anonymous function expression\noccurring in an expression statement:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> () {};\n</pre></div>\n</li>\n<li>\n<p>Is this program valid?</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">scope</span>(<span class=\"i\">a</span>) {\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;local&quot;</span>;\n}\n</pre></div>\n<p>In other words, are a function&rsquo;s parameters in the <em>same</em> scope as its local\nvariables, or in an outer scope? What does Lox do? What about other\nlanguages you are familiar with? What do you think a language <em>should</em> do?</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"resolving-and-binding.html\" class=\"next\">\n  Next Chapter: &ldquo;Resolving and Binding&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/garbage-collection.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Garbage Collection &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Garbage Collection<small>26</small></a></h3>\n\n<ul>\n    <li><a href=\"#reachability\"><small>26.1</small> Reachability</a></li>\n    <li><a href=\"#mark-sweep-garbage-collection\"><small>26.2</small> Mark-Sweep Garbage Collection</a></li>\n    <li><a href=\"#marking-the-roots\"><small>26.3</small> Marking the Roots</a></li>\n    <li><a href=\"#tracing-object-references\"><small>26.4</small> Tracing Object References</a></li>\n    <li><a href=\"#sweeping-unused-objects\"><small>26.5</small> Sweeping Unused 
Objects</a></li>\n    <li><a href=\"#when-to-collect\"><small>26.6</small> When to Collect</a></li>\n    <li><a href=\"#garbage-collection-bugs\"><small>26.7</small> Garbage Collection Bugs</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Generational Collectors</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"closures.html\" title=\"Closures\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"classes-and-instances.html\" title=\"Classes and Instances\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"closures.html\" title=\"Closures\" class=\"prev\">←</a>\n<a href=\"classes-and-instances.html\" title=\"Classes and Instances\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Garbage Collection<small>26</small></a></h3>\n\n<ul>\n    <li><a href=\"#reachability\"><small>26.1</small> Reachability</a></li>\n    <li><a href=\"#mark-sweep-garbage-collection\"><small>26.2</small> Mark-Sweep Garbage Collection</a></li>\n    <li><a href=\"#marking-the-roots\"><small>26.3</small> Marking the Roots</a></li>\n    <li><a href=\"#tracing-object-references\"><small>26.4</small> Tracing Object References</a></li>\n    <li><a href=\"#sweeping-unused-objects\"><small>26.5</small> Sweeping Unused Objects</a></li>\n    <li><a href=\"#when-to-collect\"><small>26.6</small> When to Collect</a></li>\n    <li><a href=\"#garbage-collection-bugs\"><small>26.7</small> Garbage Collection Bugs</a></li>\n    <li 
class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Generational Collectors</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"closures.html\" title=\"Closures\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"classes-and-instances.html\" title=\"Classes and Instances\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">26</div>\n  <h1>Garbage Collection</h1>\n\n<blockquote>\n<p>I wanna, I wanna,<br />\nI wanna, I wanna,<br />\nI wanna be trash.<br /></p>\n<p><cite>The Whip, &ldquo;Trash&rdquo;</cite></p>\n</blockquote>\n<p>We say Lox is a &ldquo;high-level&rdquo; language because it frees programmers from worrying\nabout details irrelevant to the problem they&rsquo;re solving. The user becomes an\nexecutive, giving the machine abstract goals and letting the lowly computer\nfigure out how to get there.</p>\n<p>Dynamic memory allocation is a perfect candidate for automation. It&rsquo;s necessary\nfor a working program, tedious to do by hand, and yet still error-prone. The\ninevitable mistakes can be catastrophic, leading to crashes, memory corruption,\nor security violations. It&rsquo;s the kind of risky-yet-boring work that machines\nexcel at over humans.</p>\n<p>This is why Lox is a <strong>managed language</strong>, which means that the language\nimplementation manages memory allocation and freeing on the user&rsquo;s behalf. When\na user performs an operation that requires some dynamic memory, the VM\nautomatically allocates it. The programmer never worries about deallocating\nanything. 
The machine ensures any memory the program is using sticks around as\nlong as needed.</p>\n<p>Lox provides the illusion that the computer has an infinite amount of memory.\nUsers can allocate and allocate and allocate and never once think about where\nall these bytes are coming from. Of course, computers do not yet <em>have</em> infinite\nmemory. So the way managed languages maintain this illusion is by going behind\nthe programmer&rsquo;s back and reclaiming memory that the program no longer needs.\nThe component that does this is called a <strong>garbage <span\nname=\"recycle\">collector</span></strong>.</p>\n<aside name=\"recycle\">\n<p>Recycling would really be a better metaphor for this. The GC doesn&rsquo;t <em>throw\naway</em> the memory, it reclaims it to be reused for new data. But managed\nlanguages are older than Earth Day, so the inventors went with the analogy they\nknew.</p><img src=\"image/garbage-collection/recycle.png\" class=\"above\" alt=\"A recycle bin full of bits.\" />\n</aside>\n<h2><a href=\"#reachability\" id=\"reachability\"><small>26&#8202;.&#8202;1</small>Reachability</a></h2>\n<p>This raises a surprisingly difficult question: how does a VM tell what memory is\n<em>not</em> needed? Memory is only needed if it is read in the future, but short of\nhaving a time machine, how can an implementation tell what code the program\n<em>will</em> execute and which data it <em>will</em> use? Spoiler alert: VMs cannot travel\ninto the future. Instead, the language makes a <span\nname=\"conservative\">conservative</span> approximation: it considers a piece of\nmemory to still be in use if it <em>could possibly</em> be read in the future.</p>\n<aside name=\"conservative\">\n<p>I&rsquo;m using &ldquo;conservative&rdquo; in the general sense. There is such a thing as a\n&ldquo;conservative garbage collector&rdquo; which means something more specific. 
All\ngarbage collectors are &ldquo;conservative&rdquo; in that they keep memory alive if it\n<em>could</em> be accessed, instead of having a Magic 8-Ball that lets them more\nprecisely know what data <em>will</em> be accessed.</p>\n<p>A <strong>conservative GC</strong> is a special kind of collector that considers any piece of\nmemory to be a pointer if the value in there looks like it could be an address.\nThis is in contrast to a <strong>precise GC</strong><span class=\"em\">&mdash;</span>which is what we&rsquo;ll implement<span class=\"em\">&mdash;</span>that\nknows exactly which words in memory are pointers and which store other kinds of\nvalues like numbers or strings.</p>\n</aside>\n<p>That sounds <em>too</em> conservative. Couldn&rsquo;t <em>any</em> bit of memory potentially be\nread? Actually, no, at least not in a memory-safe language like Lox. Here&rsquo;s an\nexample:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;first value&quot;</span>;\n<span class=\"i\">a</span> = <span class=\"s\">&quot;updated&quot;</span>;\n<span class=\"c\">// GC here.</span>\n<span class=\"k\">print</span> <span class=\"i\">a</span>;\n</pre></div>\n<p>Say we run the GC after the assignment has completed on the second line. The\nstring &ldquo;first value&rdquo; is still sitting in memory, but there is no way for the\nuser&rsquo;s program to ever get to it. Once <code>a</code> got reassigned, the program lost any\nreference to that string. We can safely free it. A value is <strong>reachable</strong> if\nthere is some way for a user program to reference it. Otherwise, like the string\n&ldquo;first value&rdquo; here, it is <strong>unreachable</strong>.</p>\n<p>Many values can be directly accessed by the VM. 
Take a look at:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">global</span> = <span class=\"s\">&quot;string&quot;</span>;\n{\n  <span class=\"k\">var</span> <span class=\"i\">local</span> = <span class=\"s\">&quot;another&quot;</span>;\n  <span class=\"k\">print</span> <span class=\"i\">global</span> + <span class=\"i\">local</span>;\n}\n</pre></div>\n<p>Pause the program right after the two strings have been concatenated but before\nthe <code>print</code> statement has executed. The VM can reach <code>\"string\"</code> by looking\nthrough the global variable table and finding the entry for <code>global</code>. It can\nfind <code>\"another\"</code> by walking the value stack and hitting the slot for the local\nvariable <code>local</code>. It can even find the concatenated string <code>\"stringanother\"</code>\nsince that temporary value is also sitting on the VM&rsquo;s stack at the point when\nwe paused our program.</p>\n<p>All of these values are called <strong>roots</strong>. A root is any object that the VM can\nreach directly without going through a reference in some other object. Most\nroots are global variables or on the stack, but as we&rsquo;ll see, there are a couple\nof other places the VM stores references to objects that it can find.</p>\n<p>Other values can be found by going through a reference inside another value.\n<span name=\"class\">Fields</span> on instances of classes are the most obvious\ncase, but we don&rsquo;t have those yet. Even without those, our VM still has indirect\nreferences. 
Consider:</p>\n<aside name=\"class\">\n<p>We&rsquo;ll get there <a href=\"classes-and-instances.html\">soon</a>, though!</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">makeClosure</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;data&quot;</span>;\n\n  <span class=\"k\">fun</span> <span class=\"i\">f</span>() { <span class=\"k\">print</span> <span class=\"i\">a</span>; }\n  <span class=\"k\">return</span> <span class=\"i\">f</span>;\n}\n\n{\n  <span class=\"k\">var</span> <span class=\"i\">closure</span> = <span class=\"i\">makeClosure</span>();\n  <span class=\"c\">// GC here.</span>\n  <span class=\"i\">closure</span>();\n}\n</pre></div>\n<p>Say we pause the program on the marked line and run the garbage collector. When\nthe collector is done and the program resumes, it will call the closure, which\nwill in turn print <code>\"data\"</code>. So the collector needs to <em>not</em> free that string.\nBut here&rsquo;s what the stack looks like when we pause the program:</p><img src=\"image/garbage-collection/stack.png\" alt=\"The stack, containing only the script and closure.\" />\n<p>The <code>\"data\"</code> string is nowhere on it. It has already been hoisted off the stack\nand moved into the closed upvalue that the closure uses. The closure itself is\non the stack. But to get to the string, we need to trace through the closure and\nits upvalue array. 
Since it <em>is</em> possible for the user&rsquo;s program to do that, all\nof these indirectly accessible objects are also considered reachable.</p><img src=\"image/garbage-collection/reachable.png\" class=\"wide\" alt=\"All of the referenced objects from the closure, and the path to the 'data' string from the stack.\" />\n<p>This gives us an inductive definition of reachability:</p>\n<ul>\n<li>\n<p>All roots are reachable.</p>\n</li>\n<li>\n<p>Any object referred to from a reachable object is itself reachable.</p>\n</li>\n</ul>\n<p>These are the values that are still &ldquo;live&rdquo; and need to stay in memory. Any value\nthat <em>doesn&rsquo;t</em> meet this definition is fair game for the collector to reap.\nThat recursive pair of rules hints at a recursive algorithm we can use to free\nup unneeded memory:</p>\n<ol>\n<li>\n<p>Starting with the roots, traverse through object references to find the\nfull set of reachable objects.</p>\n</li>\n<li>\n<p>Free all objects <em>not</em> in that set.</p>\n</li>\n</ol>\n<p>Many <span name=\"handbook\">different</span> garbage collection algorithms are in\nuse today, but they all roughly follow that same structure. Some may interleave\nthe steps or mix them, but the two fundamental operations are there. They mostly\ndiffer in <em>how</em> they perform each step.</p>\n<aside name=\"handbook\">\n<p>If you want to explore other GC algorithms,\n<a href=\"http://gchandbook.org/\"><em>The Garbage Collection Handbook</em></a> (Jones, et al.) is the canonical\nreference. For a large book on such a deep, narrow topic, it is quite enjoyable\nto read. Or perhaps I have a strange idea of fun.</p>\n</aside>\n<h2><a href=\"#mark-sweep-garbage-collection\" id=\"mark-sweep-garbage-collection\"><small>26&#8202;.&#8202;2</small>Mark-Sweep Garbage Collection</a></h2>\n<p>The first managed language was Lisp, the second &ldquo;high-level&rdquo; language to be\ninvented, right after Fortran. 
John McCarthy considered using manual memory\nmanagement or reference counting, but <span\nname=\"procrastination\">eventually</span> settled on (and coined) garbage\ncollection<span class=\"em\">&mdash;</span>once the program was out of memory, it would go back and find\nunused storage it could reclaim.</p>\n<aside name=\"procrastination\">\n<p>In John McCarthy&rsquo;s &ldquo;History of Lisp&rdquo;, he notes: &ldquo;Once we decided on garbage\ncollection, its actual implementation could be postponed, because only toy\nexamples were being done.&rdquo; Our choice to procrastinate adding the GC to clox\nfollows in the footsteps of giants.</p>\n</aside>\n<p>He designed the very first, simplest garbage collection algorithm, called\n<strong>mark-and-sweep</strong> or just <strong>mark-sweep</strong>. Its description fits in three short\nparagraphs in the initial paper on Lisp. Despite its age and simplicity, the\nsame fundamental algorithm underlies many modern memory managers. Some corners\nof CS seem to be timeless.</p>\n<p>As the name implies, mark-sweep works in two phases:</p>\n<ul>\n<li>\n<p><strong>Marking:</strong> We start with the roots and traverse or <span\nname=\"trace\"><em>trace</em></span> through all of the objects those roots refer to.\nThis is a classic graph traversal of all of the reachable objects. Each time\nwe visit an object, we <em>mark</em> it in some way. (Implementations differ in how\nthey record the mark.)</p>\n</li>\n<li>\n<p><strong>Sweeping:</strong> Once the mark phase completes, every reachable object\nin the heap has been marked. That means any unmarked object is unreachable and\nripe for reclamation. 
We go through all the unmarked objects and free each\none.</p>\n</li>\n</ul>\n<p>It looks something like this:</p><img src=\"image/garbage-collection/mark-sweep.png\" class=\"wide\" alt=\"Starting from a graph of objects, first the reachable ones are marked, the remaining are swept, and then only the reachable remain.\" />\n<aside name=\"trace\">\n<p>A <strong>tracing garbage collector</strong> is any algorithm that traces through the graph\nof object references. This is in contrast with reference counting, which has a\ndifferent strategy for tracking the reachable objects.</p>\n</aside>\n<p>That&rsquo;s what we&rsquo;re gonna implement. Whenever we decide it&rsquo;s time to reclaim some\nbytes, we&rsquo;ll trace everything and mark all the reachable objects, free what\ndidn&rsquo;t get marked, and then resume the user&rsquo;s program.</p>\n<h3><a href=\"#collecting-garbage\" id=\"collecting-garbage\"><small>26&#8202;.&#8202;2&#8202;.&#8202;1</small>Collecting garbage</a></h3>\n<p>This entire chapter is about implementing this one <span\nname=\"one\">function</span>:</p>\n<aside name=\"one\">\n<p>Of course, we&rsquo;ll end up adding a bunch of helper functions too.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">void* reallocate(void* pointer, size_t oldSize, size_t newSize);\n</pre><div class=\"source-file\"><em>memory.h</em><br>\nadd after <em>reallocate</em>()</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">collectGarbage</span>();\n</pre><pre class=\"insert-after\">void freeObjects();\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.h</em>, add after <em>reallocate</em>()</div>\n\n<p>We&rsquo;ll work our way up to a full implementation starting with this empty shell:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>memory.c</em><br>\nadd after <em>freeObject</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">collectGarbage</span>() {\n}\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>memory.c</em>, add after <em>freeObject</em>()</div>\n\n<p>The first question you might ask is, When does this function get called? It\nturns out that&rsquo;s a subtle question that we&rsquo;ll spend some time on later in the\nchapter. For now we&rsquo;ll sidestep the issue and build ourselves a handy diagnostic\ntool in the process.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define DEBUG_TRACE_EXECUTION\n</pre><div class=\"source-file\"><em>common.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define DEBUG_STRESS_GC</span>\n</pre><pre class=\"insert-after\">\n\n#define UINT8_COUNT (UINT8_MAX + 1)\n</pre></div>\n<div class=\"source-file-narrow\"><em>common.h</em></div>\n\n<p>We&rsquo;ll add an optional &ldquo;stress test&rdquo; mode for the garbage collector. When this\nflag is defined, the GC runs as often as it possibly can. This is, obviously,\nhorrendous for performance. But it&rsquo;s great for flushing out memory management\nbugs that occur only when a GC is triggered at just the right moment. If <em>every</em>\nmoment triggers a GC, you&rsquo;re likely to find those bugs.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void* reallocate(void* pointer, size_t oldSize, size_t newSize) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>reallocate</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">newSize</span> &gt; <span class=\"i\">oldSize</span>) {\n<span class=\"a\">#ifdef DEBUG_STRESS_GC</span>\n    <span class=\"i\">collectGarbage</span>();\n<span class=\"a\">#endif</span>\n  }\n\n</pre><pre class=\"insert-after\">  if (newSize == 0) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>reallocate</em>()</div>\n\n<p>Whenever we call <code>reallocate()</code> to acquire more memory, we force a collection to\nrun. The if check is because <code>reallocate()</code> is also called to free or shrink an\nallocation. 
We don&rsquo;t want to trigger a GC for that<span class=\"em\">&mdash;</span>in particular because the\nGC itself will call <code>reallocate()</code> to free memory.</p>\n<p>Collecting right before <span name=\"demand\">allocation</span> is the classic way\nto wire a GC into a VM. You&rsquo;re already calling into the memory manager, so it&rsquo;s\nan easy place to hook in the code. Also, allocation is the only time when you\nreally <em>need</em> some freed up memory so that you can reuse it. If you <em>don&rsquo;t</em> use\nallocation to trigger a GC, you have to make sure every possible place in code\nwhere you can loop and allocate memory also has a way to trigger the collector.\nOtherwise, the VM can get into a starved state where it needs more memory but\nnever collects any.</p>\n<aside name=\"demand\">\n<p>More sophisticated collectors might run on a separate thread or be interleaved\nperiodically during program execution<span class=\"em\">&mdash;</span>often at function call boundaries or\nwhen a backward jump occurs.</p>\n</aside>\n<h3><a href=\"#debug-logging\" id=\"debug-logging\"><small>26&#8202;.&#8202;2&#8202;.&#8202;2</small>Debug logging</a></h3>\n<p>While we&rsquo;re on the subject of diagnostics, let&rsquo;s put some more in. A real\nchallenge I&rsquo;ve found with garbage collectors is that they are opaque. We&rsquo;ve been\nrunning lots of Lox programs just fine without any GC <em>at all</em> so far. Once we\nadd one, how do we tell if it&rsquo;s doing anything useful? Can we tell only if we\nwrite programs that plow through acres of memory? 
How do we debug that?</p>\n<p>An easy way to shine a light into the GC&rsquo;s inner workings is with some logging.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define DEBUG_STRESS_GC\n</pre><div class=\"source-file\"><em>common.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define DEBUG_LOG_GC</span>\n</pre><pre class=\"insert-after\">\n\n#define UINT8_COUNT (UINT8_MAX + 1)\n</pre></div>\n<div class=\"source-file-narrow\"><em>common.h</em></div>\n\n<p>When this is enabled, clox prints information to the console when it does\nsomething with dynamic memory.</p>\n<p>We need a couple of includes.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;vm.h&quot;\n</pre><div class=\"source-file\"><em>memory.c</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#ifdef DEBUG_LOG_GC</span>\n<span class=\"a\">#include &lt;stdio.h&gt;</span>\n<span class=\"a\">#include &quot;debug.h&quot;</span>\n<span class=\"a\">#endif</span>\n</pre><pre class=\"insert-after\">\n\nvoid* reallocate(void* pointer, size_t oldSize, size_t newSize) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em></div>\n\n<p>We don&rsquo;t have a collector yet, but we can start putting in some of the logging\nnow. 
We&rsquo;ll want to know when a collection run starts.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void collectGarbage() {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>collectGarbage</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#ifdef DEBUG_LOG_GC</span>\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;-- gc begin</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n<span class=\"a\">#endif</span>\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>collectGarbage</em>()</div>\n\n<p>Eventually we will log some other operations during the collection, so we&rsquo;ll\nalso want to know when the show&rsquo;s over.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  printf(&quot;-- gc begin\\n&quot;);\n#endif\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>collectGarbage</em>()</div>\n<pre class=\"insert\">\n\n<span class=\"a\">#ifdef DEBUG_LOG_GC</span>\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;-- gc end</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n<span class=\"a\">#endif</span>\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>collectGarbage</em>()</div>\n\n<p>We don&rsquo;t have any code for the collector yet, but we do have functions for\nallocating and freeing, so we can instrument those now.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  vm.objects = object;\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>allocateObject</em>()</div>\n<pre class=\"insert\">\n\n<span class=\"a\">#ifdef DEBUG_LOG_GC</span>\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;%p allocate %zu for %d</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, (<span class=\"t\">void</span>*)<span class=\"i\">object</span>, <span class=\"i\">size</span>, <span 
class=\"i\">type</span>);\n<span class=\"a\">#endif</span>\n\n</pre><pre class=\"insert-after\">  return object;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>allocateObject</em>()</div>\n\n<p>And at the end of an object&rsquo;s lifespan:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void freeObject(Obj* object) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>freeObject</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#ifdef DEBUG_LOG_GC</span>\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;%p free type %d</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, (<span class=\"t\">void</span>*)<span class=\"i\">object</span>, <span class=\"i\">object</span>-&gt;<span class=\"i\">type</span>);\n<span class=\"a\">#endif</span>\n\n</pre><pre class=\"insert-after\">  switch (object-&gt;type) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>freeObject</em>()</div>\n\n<p>With these two flags, we should be able to see that we&rsquo;re making progress as we\nwork through the rest of the chapter.</p>\n<h2><a href=\"#marking-the-roots\" id=\"marking-the-roots\"><small>26&#8202;.&#8202;3</small>Marking the Roots</a></h2>\n<p>Objects are scattered across the heap like stars in the inky night sky. A\nreference from one object to another forms a connection, and these\nconstellations are the graph that the mark phase traverses. 
Marking begins at\nthe roots.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#ifdef DEBUG_LOG_GC\n  printf(&quot;-- gc begin\\n&quot;);\n#endif\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>collectGarbage</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">markRoots</span>();\n</pre><pre class=\"insert-after\">\n\n#ifdef DEBUG_LOG_GC\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>collectGarbage</em>()</div>\n\n<p>Most roots are local variables or temporaries sitting right in the VM&rsquo;s stack,\nso we start by walking that.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>memory.c</em><br>\nadd after <em>freeObject</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">markRoots</span>() {\n  <span class=\"k\">for</span> (<span class=\"t\">Value</span>* <span class=\"i\">slot</span> = <span class=\"i\">vm</span>.<span class=\"i\">stack</span>; <span class=\"i\">slot</span> &lt; <span class=\"i\">vm</span>.<span class=\"i\">stackTop</span>; <span class=\"i\">slot</span>++) {\n    <span class=\"i\">markValue</span>(*<span class=\"i\">slot</span>);\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, add after <em>freeObject</em>()</div>\n\n<p>To mark a Lox value, we use this new function:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void* reallocate(void* pointer, size_t oldSize, size_t newSize);\n</pre><div class=\"source-file\"><em>memory.h</em><br>\nadd after <em>reallocate</em>()</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">markValue</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>);\n</pre><pre class=\"insert-after\">void collectGarbage();\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.h</em>, add after <em>reallocate</em>()</div>\n\n<p>Its implementation is here:</p>\n<div class=\"codehilite\"><div 
class=\"source-file\"><em>memory.c</em><br>\nadd after <em>reallocate</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">markValue</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>) {\n  <span class=\"k\">if</span> (<span class=\"a\">IS_OBJ</span>(<span class=\"i\">value</span>)) <span class=\"i\">markObject</span>(<span class=\"a\">AS_OBJ</span>(<span class=\"i\">value</span>));\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, add after <em>reallocate</em>()</div>\n\n<p>Some Lox values<span class=\"em\">&mdash;</span>numbers, Booleans, and <code>nil</code><span class=\"em\">&mdash;</span>are stored directly inline in\nValue and require no heap allocation. The garbage collector doesn&rsquo;t need to\nworry about them at all, so the first thing we do is ensure that the value is an\nactual heap object. If so, the real work happens in this function:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void* reallocate(void* pointer, size_t oldSize, size_t newSize);\n</pre><div class=\"source-file\"><em>memory.h</em><br>\nadd after <em>reallocate</em>()</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">markObject</span>(<span class=\"t\">Obj</span>* <span class=\"i\">object</span>);\n</pre><pre class=\"insert-after\">void markValue(Value value);\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.h</em>, add after <em>reallocate</em>()</div>\n\n<p>Which is defined here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>memory.c</em><br>\nadd after <em>reallocate</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">markObject</span>(<span class=\"t\">Obj</span>* <span class=\"i\">object</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">object</span> == <span class=\"a\">NULL</span>) <span class=\"k\">return</span>;\n  <span class=\"i\">object</span>-&gt;<span class=\"i\">isMarked</span> = <span 
class=\"k\">true</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, add after <em>reallocate</em>()</div>\n\n<p>The <code>NULL</code> check is unnecessary when called from <code>markValue()</code>. A Lox Value that\nis some kind of Obj type will always have a valid pointer. But later we will\ncall this function directly from other code, and in some of those places, the\nobject being pointed to is optional.</p>\n<p>Assuming we do have a valid object, we mark it by setting a flag. That new field\nlives in the Obj header struct all objects share.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  ObjType type;\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin struct <em>Obj</em></div>\n<pre class=\"insert\">  <span class=\"t\">bool</span> <span class=\"i\">isMarked</span>;\n</pre><pre class=\"insert-after\">  struct Obj* next;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in struct <em>Obj</em></div>\n\n<p>Every new object begins life unmarked because we haven&rsquo;t yet determined if it is\nreachable or not.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  object-&gt;type = type;\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>allocateObject</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">object</span>-&gt;<span class=\"i\">isMarked</span> = <span class=\"k\">false</span>;\n</pre><pre class=\"insert-after\">\n\n  object-&gt;next = vm.objects;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>allocateObject</em>()</div>\n\n<p>Before we go any farther, let&rsquo;s add some logging to <code>markObject()</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void markObject(Obj* object) {\n  if (object == NULL) return;\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>markObject</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#ifdef DEBUG_LOG_GC</span>\n  <span class=\"i\">printf</span>(<span 
class=\"s\">&quot;%p mark &quot;</span>, (<span class=\"t\">void</span>*)<span class=\"i\">object</span>);\n  <span class=\"i\">printValue</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">object</span>));\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n<span class=\"a\">#endif</span>\n\n</pre><pre class=\"insert-after\">  object-&gt;isMarked = true;\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>markObject</em>()</div>\n\n<p>This way we can see what the mark phase is doing. Marking the stack takes care\nof local variables and temporaries. The other main source of roots is the\nglobal variables.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    markValue(*slot);\n  }\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>markRoots</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">markTable</span>(&amp;<span class=\"i\">vm</span>.<span class=\"i\">globals</span>);\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>markRoots</em>()</div>\n\n<p>Those live in a hash table owned by the VM, so we&rsquo;ll declare another helper\nfunction for marking all of the objects in a table.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">ObjString* tableFindString(Table* table, const char* chars,\n                           int length, uint32_t hash);\n</pre><div class=\"source-file\"><em>table.h</em><br>\nadd after <em>tableFindString</em>()</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">markTable</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.h</em>, add after <em>tableFindString</em>()</div>\n\n<p>We implement that in the &ldquo;table&rdquo; module here:</p>\n<div class=\"codehilite\"><div 
class=\"source-file\"><em>table.c</em><br>\nadd after <em>tableFindString</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">markTable</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>) {\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span>; <span class=\"i\">i</span>++) {\n    <span class=\"t\">Entry</span>* <span class=\"i\">entry</span> = &amp;<span class=\"i\">table</span>-&gt;<span class=\"i\">entries</span>[<span class=\"i\">i</span>];\n    <span class=\"i\">markObject</span>((<span class=\"t\">Obj</span>*)<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span>);\n    <span class=\"i\">markValue</span>(<span class=\"i\">entry</span>-&gt;<span class=\"i\">value</span>);\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, add after <em>tableFindString</em>()</div>\n\n<p>Pretty straightforward. We walk the entry array. For each one, we mark its\nvalue. We also mark the key strings for each entry since the GC manages those\nstrings too.</p>\n<h3><a href=\"#less-obvious-roots\" id=\"less-obvious-roots\"><small>26&#8202;.&#8202;3&#8202;.&#8202;1</small>Less obvious roots</a></h3>\n<p>Those cover the roots that we typically think of<span class=\"em\">&mdash;</span>the values that are\nobviously reachable because they&rsquo;re stored in variables the user&rsquo;s program can\nsee. But the VM has a few of its own hidey-holes where it squirrels away\nreferences to values that it directly accesses.</p>\n<p>Most function call state lives in the value stack, but the VM maintains a\nseparate stack of CallFrames. Each CallFrame contains a pointer to the closure\nbeing called. 
The VM uses those pointers to access constants and upvalues, so\nthose closures need to be kept around too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  }\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>markRoots</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">vm</span>.<span class=\"i\">frameCount</span>; <span class=\"i\">i</span>++) {\n    <span class=\"i\">markObject</span>((<span class=\"t\">Obj</span>*)<span class=\"i\">vm</span>.<span class=\"i\">frames</span>[<span class=\"i\">i</span>].<span class=\"i\">closure</span>);\n  }\n</pre><pre class=\"insert-after\">\n\n  markTable(&amp;vm.globals);\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>markRoots</em>()</div>\n\n<p>Speaking of upvalues, the open upvalue list is another set of values that the\nVM can directly reach.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  for (int i = 0; i &lt; vm.frameCount; i++) {\n    markObject((Obj*)vm.frames[i].closure);\n  }\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>markRoots</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">for</span> (<span class=\"t\">ObjUpvalue</span>* <span class=\"i\">upvalue</span> = <span class=\"i\">vm</span>.<span class=\"i\">openUpvalues</span>;\n       <span class=\"i\">upvalue</span> != <span class=\"a\">NULL</span>;\n       <span class=\"i\">upvalue</span> = <span class=\"i\">upvalue</span>-&gt;<span class=\"i\">next</span>) {\n    <span class=\"i\">markObject</span>((<span class=\"t\">Obj</span>*)<span class=\"i\">upvalue</span>);\n  }\n</pre><pre class=\"insert-after\">\n\n  markTable(&amp;vm.globals);\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>markRoots</em>()</div>\n\n<p>Remember also that a collection can begin during <em>any</em> allocation. 
Those\nallocations don&rsquo;t just happen while the user&rsquo;s program is running. The compiler\nitself periodically grabs memory from the heap for literals and the constant\ntable. If the GC runs while we&rsquo;re in the middle of compiling, then any values\nthe compiler directly accesses need to be treated as roots too.</p>\n<p>To keep the compiler module cleanly separated from the rest of the VM, we&rsquo;ll do\nthat in a separate function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  markTable(&amp;vm.globals);\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>markRoots</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">markCompilerRoots</span>();\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>markRoots</em>()</div>\n\n<p>It&rsquo;s declared here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">ObjFunction* compile(const char* source);\n</pre><div class=\"source-file\"><em>compiler.h</em><br>\nadd after <em>compile</em>()</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">markCompilerRoots</span>();\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.h</em>, add after <em>compile</em>()</div>\n\n<p>Which means the &ldquo;memory&rdquo; module needs an include.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &lt;stdlib.h&gt;\n\n</pre><div class=\"source-file\"><em>memory.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;compiler.h&quot;</span>\n</pre><pre class=\"insert-after\">#include &quot;memory.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em></div>\n\n<p>And the definition is over in the &ldquo;compiler&rdquo; module.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>compile</em>()</div>\n<pre><span class=\"t\">void</span> <span 
class=\"i\">markCompilerRoots</span>() {\n  <span class=\"t\">Compiler</span>* <span class=\"i\">compiler</span> = <span class=\"i\">current</span>;\n  <span class=\"k\">while</span> (<span class=\"i\">compiler</span> != <span class=\"a\">NULL</span>) {\n    <span class=\"i\">markObject</span>((<span class=\"t\">Obj</span>*)<span class=\"i\">compiler</span>-&gt;<span class=\"i\">function</span>);\n    <span class=\"i\">compiler</span> = <span class=\"i\">compiler</span>-&gt;<span class=\"i\">enclosing</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>compile</em>()</div>\n\n<p>Fortunately, the compiler doesn&rsquo;t have too many values that it hangs on to. The\nonly object it uses is the ObjFunction it is compiling into. Since function\ndeclarations can nest, the compiler has a linked list of those and we walk the\nwhole list.</p>\n<p>Since the &ldquo;compiler&rdquo; module is calling <code>markObject()</code>, it also needs an include.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;compiler.h&quot;\n</pre><div class=\"source-file\"><em>compiler.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;memory.h&quot;</span>\n</pre><pre class=\"insert-after\">#include &quot;scanner.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em></div>\n\n<p>Those are all the roots. After running this, every object that the VM<span class=\"em\">&mdash;</span>runtime\nand compiler<span class=\"em\">&mdash;</span>can get to <em>without</em> going through some other object has its\nmark bit set.</p>\n<h2><a href=\"#tracing-object-references\" id=\"tracing-object-references\"><small>26&#8202;.&#8202;4</small>Tracing Object References</a></h2>\n<p>The next step in the marking process is tracing through the graph of references\nbetween objects to find the indirectly reachable values. 
We don&rsquo;t have instances\nwith fields yet, so there aren&rsquo;t many objects that contain references, but we do\nhave <span name=\"some\">some</span>. In particular, ObjClosure has the list of\nObjUpvalues it closes over as well as a reference to the raw ObjFunction that it\nwraps. ObjFunction, in turn, has a constant table containing references to all\nof the literals created in the function&rsquo;s body. This is enough to build a fairly\ncomplex web of objects for our collector to crawl through.</p>\n<aside name=\"some\">\n<p>I slotted this chapter into the book right here specifically <em>because</em> we now\nhave closures, which give us interesting objects for the garbage collector to\nprocess.</p>\n</aside>\n<p>Now it&rsquo;s time to implement that traversal. We can go breadth-first, depth-first,\nor in some other order. Since we just need to find the <em>set</em> of all reachable\nobjects, the order we visit them <span name=\"dfs\">mostly</span> doesn&rsquo;t matter.</p>\n<aside name=\"dfs\">\n<p>I say &ldquo;mostly&rdquo; because some garbage collectors move objects in the order that\nthey are visited, so traversal order determines which objects end up adjacent in\nmemory. That impacts performance because the CPU uses locality to determine\nwhich memory to preload into the caches.</p>\n<p>Even when traversal order does matter, it&rsquo;s not clear which order is <em>best</em>.\nIt&rsquo;s very difficult to predict the order in which objects will be used in the\nfuture, so it&rsquo;s hard for the GC to know which order will help performance.</p>\n</aside>\n<h3><a href=\"#the-tricolor-abstraction\" id=\"the-tricolor-abstraction\"><small>26&#8202;.&#8202;4&#8202;.&#8202;1</small>The tricolor abstraction</a></h3>\n<p>As the collector wanders through the graph of objects, we need to make sure it\ndoesn&rsquo;t lose track of where it is or get stuck going in circles. 
This is\nparticularly a concern for advanced implementations like incremental GCs that\ninterleave marking with running pieces of the user&rsquo;s program. The collector\nneeds to be able to pause and then pick up where it left off later.</p>\n<p>To help us soft-brained humans reason about this complex process, VM hackers\ncame up with a metaphor called the <span name=\"color\"></span><strong>tricolor\nabstraction</strong>. Each object has a conceptual &ldquo;color&rdquo; that tracks what state the\nobject is in, and what work is left to do.</p>\n<aside name=\"color\">\n<p>Advanced garbage collection algorithms often add other colors to the\nabstraction. I&rsquo;ve seen multiple shades of gray, and even purple in some designs.\nMy puce-chartreuse-fuchsia-malachite collector paper was, alas, not accepted for\npublication.</p>\n</aside>\n<ul>\n<li>\n<p><strong><img src=\"image/garbage-collection/white.png\" alt=\"A white circle.\"\nclass=\"dot\" /> White:</strong> At the beginning of a garbage collection, every\nobject is white. This color means we have not reached or processed the\nobject at all.</p>\n</li>\n<li>\n<p><strong><img src=\"image/garbage-collection/gray.png\" alt=\"A gray circle.\"\nclass=\"dot\" /> Gray:</strong> During marking, when we first reach an object, we\ndarken it gray. This color means we know the object itself is reachable and\nshould not be collected. But we have not yet traced <em>through</em> it to see what\n<em>other</em> objects it references. In graph algorithm terms, this is the\n<em>worklist</em><span class=\"em\">&mdash;</span>the set of objects we know about but haven&rsquo;t processed yet.</p>\n</li>\n<li>\n<p><strong><img src=\"image/garbage-collection/black.png\" alt=\"A black circle.\"\nclass=\"dot\" /> Black:</strong> When\nwe take a gray object and mark all of the objects it references, we then\nturn the gray object black. 
This color means the mark phase is done\nprocessing that object.</p>\n</li>\n</ul>\n<p>In terms of that abstraction, the marking process now looks like this:</p>\n<ol>\n<li>\n<p>Start off with all objects white.</p>\n</li>\n<li>\n<p>Find all the roots and mark them gray.</p>\n</li>\n<li>\n<p>Repeat as long as there are still gray objects:</p>\n<ol>\n<li>\n<p>Pick a gray object. Turn any white objects that the object references\ngray.</p>\n</li>\n<li>\n<p>Mark the original gray object black.</p>\n</li>\n</ol>\n</li>\n</ol>\n<p>I find it helps to visualize this. You have a web of objects with references\nbetween them. Initially, they are all little white dots. Off to the side are\nsome incoming edges from the VM that point to the roots. Those roots turn gray.\nThen the white objects each gray object references turn gray while the object itself turns black.\nThe full effect is a gray wavefront that passes through the graph, leaving a\nfield of reachable black objects behind it. Unreachable objects are not touched\nby the wavefront and stay white.</p><img src=\"image/garbage-collection/tricolor-trace.png\" class=\"wide\" alt=\"A gray wavefront working through a graph of nodes.\" />\n<p>At the <span name=\"invariant\">end</span>, you&rsquo;re left with a sea of reached,\nblack objects sprinkled with islands of white objects that can be swept up and\nfreed. Once the unreachable objects are freed, the remaining objects<span class=\"em\">&mdash;</span>all\nblack<span class=\"em\">&mdash;</span>are reset to white for the next garbage collection cycle.</p>\n<aside name=\"invariant\">\n<p>Note that at every step of this process no black node ever points to a white\nnode. This property is called the <strong>tricolor invariant</strong>. 
The traversal process\nmaintains this invariant to ensure that no reachable object is ever collected.</p>\n</aside>\n<h3><a href=\"#a-worklist-for-gray-objects\" id=\"a-worklist-for-gray-objects\"><small>26&#8202;.&#8202;4&#8202;.&#8202;2</small>A worklist for gray objects</a></h3>\n<p>In our implementation we have already marked the roots. They&rsquo;re all gray. The\nnext step is to start picking them and traversing their references. But we don&rsquo;t\nhave any easy way to find them. We set a field on the object, but that&rsquo;s it. We\ndon&rsquo;t want to have to traverse the entire object list looking for objects with\nthat field set.</p>\n<p>Instead, we&rsquo;ll create a separate worklist to keep track of all of the gray\nobjects. When an object turns gray, in addition to setting the mark field we&rsquo;ll\nalso add it to the worklist.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  object-&gt;isMarked = true;\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>markObject</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">if</span> (<span class=\"i\">vm</span>.<span class=\"i\">grayCapacity</span> &lt; <span class=\"i\">vm</span>.<span class=\"i\">grayCount</span> + <span class=\"n\">1</span>) {\n    <span class=\"i\">vm</span>.<span class=\"i\">grayCapacity</span> = <span class=\"a\">GROW_CAPACITY</span>(<span class=\"i\">vm</span>.<span class=\"i\">grayCapacity</span>);\n    <span class=\"i\">vm</span>.<span class=\"i\">grayStack</span> = (<span class=\"t\">Obj</span>**)<span class=\"i\">realloc</span>(<span class=\"i\">vm</span>.<span class=\"i\">grayStack</span>,\n                                  <span class=\"k\">sizeof</span>(<span class=\"t\">Obj</span>*) * <span class=\"i\">vm</span>.<span class=\"i\">grayCapacity</span>);\n  }\n\n  <span class=\"i\">vm</span>.<span class=\"i\">grayStack</span>[<span class=\"i\">vm</span>.<span class=\"i\">grayCount</span>++] = <span class=\"i\">object</span>;\n</pre><pre 
class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>markObject</em>()</div>\n\n<p>We could use any kind of data structure that lets us put items in and take them\nout easily. I picked a stack because that&rsquo;s the simplest to implement with a\ndynamic array in C. It works mostly like other dynamic arrays we&rsquo;ve built in\nLox, <em>except</em>, note that it calls the <em>system</em> <code>realloc()</code> function and not our\nown <code>reallocate()</code> wrapper. The memory for the gray stack itself is <em>not</em>\nmanaged by the garbage collector. We don&rsquo;t want growing the gray stack during a\nGC to cause the GC to recursively start a new GC. That could tear a hole in the\nspace-time continuum.</p>\n<p>We&rsquo;ll manage its memory ourselves, explicitly. The VM owns the gray stack.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  Obj* objects;\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nin struct <em>VM</em></div>\n<pre class=\"insert\">  <span class=\"t\">int</span> <span class=\"i\">grayCount</span>;\n  <span class=\"t\">int</span> <span class=\"i\">grayCapacity</span>;\n  <span class=\"t\">Obj</span>** <span class=\"i\">grayStack</span>;\n</pre><pre class=\"insert-after\">} VM;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, in struct <em>VM</em></div>\n\n<p>It starts out empty.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  vm.objects = NULL;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>initVM</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">vm</span>.<span class=\"i\">grayCount</span> = <span class=\"n\">0</span>;\n  <span class=\"i\">vm</span>.<span class=\"i\">grayCapacity</span> = <span class=\"n\">0</span>;\n  <span class=\"i\">vm</span>.<span class=\"i\">grayStack</span> = <span class=\"a\">NULL</span>;\n</pre><pre class=\"insert-after\">\n\n  initTable(&amp;vm.globals);\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>vm.c</em>, in <em>initVM</em>()</div>\n\n<p>And we need to free it when the VM shuts down.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    object = next;\n  }\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>freeObjects</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">free</span>(<span class=\"i\">vm</span>.<span class=\"i\">grayStack</span>);\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>freeObjects</em>()</div>\n\n<p><span name=\"robust\">We</span> take full responsibility for this array. That\nincludes allocation failure. If we can&rsquo;t create or grow the gray stack, then we\ncan&rsquo;t finish the garbage collection. This is bad news for the VM, but\nfortunately rare since the gray stack tends to be pretty small. It would be nice\nto do something more graceful, but to keep the code in this book simple, we just\nabort.</p>\n<aside name=\"robust\">\n<p>To be more robust, we can allocate a &ldquo;rainy day fund&rdquo; block of memory when we\nstart the VM. If the gray stack allocation fails, we free the rainy day block\nand try again. 
That may give us enough wiggle room on the heap to create the\ngray stack, finish the GC, and free up more memory.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">    vm.grayStack = (Obj**)realloc(vm.grayStack,\n                                  sizeof(Obj*) * vm.grayCapacity);\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>markObject</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">if</span> (<span class=\"i\">vm</span>.<span class=\"i\">grayStack</span> == <span class=\"a\">NULL</span>) <span class=\"i\">exit</span>(<span class=\"n\">1</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>markObject</em>()</div>\n\n<h3><a href=\"#processing-gray-objects\" id=\"processing-gray-objects\"><small>26&#8202;.&#8202;4&#8202;.&#8202;3</small>Processing gray objects</a></h3>\n<p>OK, now when we&rsquo;re done marking the roots, we have both set a bunch of fields\nand filled our work list with objects to chew through. 
It&rsquo;s time for the next\nphase.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  markRoots();\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>collectGarbage</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">traceReferences</span>();\n</pre><pre class=\"insert-after\">\n\n#ifdef DEBUG_LOG_GC\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>collectGarbage</em>()</div>\n\n<p>Here&rsquo;s the implementation:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>memory.c</em><br>\nadd after <em>markRoots</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">traceReferences</span>() {\n  <span class=\"k\">while</span> (<span class=\"i\">vm</span>.<span class=\"i\">grayCount</span> &gt; <span class=\"n\">0</span>) {\n    <span class=\"t\">Obj</span>* <span class=\"i\">object</span> = <span class=\"i\">vm</span>.<span class=\"i\">grayStack</span>[--<span class=\"i\">vm</span>.<span class=\"i\">grayCount</span>];\n    <span class=\"i\">blackenObject</span>(<span class=\"i\">object</span>);\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, add after <em>markRoots</em>()</div>\n\n<p>It&rsquo;s as close to that textual algorithm as you can get. Until the stack empties,\nwe keep pulling out gray objects, traversing their references, and then marking\nthem black. Traversing an object&rsquo;s references may turn up new white objects that\nget marked gray and added to the stack. 
So this function swings back and forth\nbetween turning white objects gray and gray objects black, gradually advancing\nthe entire wavefront forward.</p>\n<p>Here&rsquo;s where we traverse a single object&rsquo;s references:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>memory.c</em><br>\nadd after <em>markValue</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">blackenObject</span>(<span class=\"t\">Obj</span>* <span class=\"i\">object</span>) {\n  <span class=\"k\">switch</span> (<span class=\"i\">object</span>-&gt;<span class=\"i\">type</span>) {\n    <span class=\"k\">case</span> <span class=\"a\">OBJ_NATIVE</span>:\n    <span class=\"k\">case</span> <span class=\"a\">OBJ_STRING</span>:\n      <span class=\"k\">break</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, add after <em>markValue</em>()</div>\n\n<p>Each object <span name=\"leaf\">kind</span> has different fields that might\nreference other objects, so we need a specific blob of code for each type. We\nstart with the easy ones<span class=\"em\">&mdash;</span>strings and native function objects contain no\noutgoing references so there is nothing to traverse.</p>\n<aside name=\"leaf\">\n<p>An easy optimization we could do in <code>markObject()</code> is to skip adding strings and\nnative functions to the gray stack at all since we know they don&rsquo;t need to be\nprocessed. Instead, they could darken from white straight to black.</p>\n</aside>\n<p>Note that we don&rsquo;t set any state in the traversed object itself. There is no\ndirect encoding of &ldquo;black&rdquo; in the object&rsquo;s state. A black object is any object\nwhose <code>isMarked</code> field is <span name=\"field\">set</span> and that is no longer in\nthe gray stack.</p>\n<aside name=\"field\">\n<p>You may rightly wonder why we have the <code>isMarked</code> field at all. 
All in good\ntime, friend.</p>\n</aside>\n<p>Now let&rsquo;s start adding in the other object types. The simplest is upvalues.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void blackenObject(Obj* object) {\n  switch (object-&gt;type) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>blackenObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_UPVALUE</span>:\n      <span class=\"i\">markValue</span>(((<span class=\"t\">ObjUpvalue</span>*)<span class=\"i\">object</span>)-&gt;<span class=\"i\">closed</span>);\n      <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    case OBJ_NATIVE:\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>blackenObject</em>()</div>\n\n<p>When an upvalue is closed, it contains a reference to the closed-over value.\nSince the value is no longer on the stack, we need to make sure we trace the\nreference to it from the upvalue.</p>\n<p>Next are functions.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (object-&gt;type) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>blackenObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_FUNCTION</span>: {\n      <span class=\"t\">ObjFunction</span>* <span class=\"i\">function</span> = (<span class=\"t\">ObjFunction</span>*)<span class=\"i\">object</span>;\n      <span class=\"i\">markObject</span>((<span class=\"t\">Obj</span>*)<span class=\"i\">function</span>-&gt;<span class=\"i\">name</span>);\n      <span class=\"i\">markArray</span>(&amp;<span class=\"i\">function</span>-&gt;<span class=\"i\">chunk</span>.<span class=\"i\">constants</span>);\n      <span class=\"k\">break</span>;\n    }\n</pre><pre class=\"insert-after\">    case OBJ_UPVALUE:\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>blackenObject</em>()</div>\n\n<p>Each function has a reference to an 
ObjString containing the function&rsquo;s name.\nMore importantly, the function has a constant table packed full of references to\nother objects. We trace all of those using this helper:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>memory.c</em><br>\nadd after <em>markValue</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">markArray</span>(<span class=\"t\">ValueArray</span>* <span class=\"i\">array</span>) {\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">array</span>-&gt;<span class=\"i\">count</span>; <span class=\"i\">i</span>++) {\n    <span class=\"i\">markValue</span>(<span class=\"i\">array</span>-&gt;<span class=\"i\">values</span>[<span class=\"i\">i</span>]);\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, add after <em>markValue</em>()</div>\n\n<p>The last object type we have now<span class=\"em\">&mdash;</span>we&rsquo;ll add more in later chapters<span class=\"em\">&mdash;</span>is\nclosures.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (object-&gt;type) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>blackenObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_CLOSURE</span>: {\n      <span class=\"t\">ObjClosure</span>* <span class=\"i\">closure</span> = (<span class=\"t\">ObjClosure</span>*)<span class=\"i\">object</span>;\n      <span class=\"i\">markObject</span>((<span class=\"t\">Obj</span>*)<span class=\"i\">closure</span>-&gt;<span class=\"i\">function</span>);\n      <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">closure</span>-&gt;<span class=\"i\">upvalueCount</span>; <span class=\"i\">i</span>++) {\n        <span 
class=\"i\">markObject</span>((<span class=\"t\">Obj</span>*)<span class=\"i\">closure</span>-&gt;<span class=\"i\">upvalues</span>[<span class=\"i\">i</span>]);\n      }\n      <span class=\"k\">break</span>;\n    }\n</pre><pre class=\"insert-after\">    case OBJ_FUNCTION: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>blackenObject</em>()</div>\n\n<p>Each closure has a reference to the bare function it wraps, as well as an array\nof pointers to the upvalues it captures. We trace all of those.</p>\n<p>That&rsquo;s the basic mechanism for processing a gray object, but there are two loose\nends to tie up. First, some logging.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void blackenObject(Obj* object) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>blackenObject</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#ifdef DEBUG_LOG_GC</span>\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;%p blacken &quot;</span>, (<span class=\"t\">void</span>*)<span class=\"i\">object</span>);\n  <span class=\"i\">printValue</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">object</span>));\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n<span class=\"a\">#endif</span>\n\n</pre><pre class=\"insert-after\">  switch (object-&gt;type) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>blackenObject</em>()</div>\n\n<p>This way, we can watch the tracing percolate through the object graph. Speaking\nof which, note that I said <em>graph</em>. References between objects are directed, but\nthat doesn&rsquo;t mean they&rsquo;re <em>acyclic!</em> It&rsquo;s entirely possible to have cycles of\nobjects. 
When that happens, we need to ensure our collector doesn&rsquo;t get stuck in\nan infinite loop as it continually re-adds the same series of objects to the\ngray stack.</p>\n<p>The fix is easy.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  if (object == NULL) return;\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>markObject</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">object</span>-&gt;<span class=\"i\">isMarked</span>) <span class=\"k\">return</span>;\n\n</pre><pre class=\"insert-after\">#ifdef DEBUG_LOG_GC\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>markObject</em>()</div>\n\n<p>If the object is already marked, we don&rsquo;t mark it again and thus don&rsquo;t add it to\nthe gray stack. This ensures that an already-gray object is not redundantly\nadded and that a black object is not inadvertently turned back to gray. In other\nwords, it keeps the wavefront moving forward through only the white objects.</p>\n<h2><a href=\"#sweeping-unused-objects\" id=\"sweeping-unused-objects\"><small>26&#8202;.&#8202;5</small>Sweeping Unused Objects</a></h2>\n<p>When the loop in <code>traceReferences()</code> exits, we have processed all the objects we\ncould get our hands on. The gray stack is empty, and every object in the heap is\neither black or white. The black objects are reachable, and we want to hang on to\nthem. 
Anything still white never got touched by the trace and is thus garbage.\nAll that&rsquo;s left is to reclaim them.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  traceReferences();\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>collectGarbage</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">sweep</span>();\n</pre><pre class=\"insert-after\">\n\n#ifdef DEBUG_LOG_GC\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>collectGarbage</em>()</div>\n\n<p>All of the logic lives in one function.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>memory.c</em><br>\nadd after <em>traceReferences</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">sweep</span>() {\n  <span class=\"t\">Obj</span>* <span class=\"i\">previous</span> = <span class=\"a\">NULL</span>;\n  <span class=\"t\">Obj</span>* <span class=\"i\">object</span> = <span class=\"i\">vm</span>.<span class=\"i\">objects</span>;\n  <span class=\"k\">while</span> (<span class=\"i\">object</span> != <span class=\"a\">NULL</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">object</span>-&gt;<span class=\"i\">isMarked</span>) {\n      <span class=\"i\">previous</span> = <span class=\"i\">object</span>;\n      <span class=\"i\">object</span> = <span class=\"i\">object</span>-&gt;<span class=\"i\">next</span>;\n    } <span class=\"k\">else</span> {\n      <span class=\"t\">Obj</span>* <span class=\"i\">unreached</span> = <span class=\"i\">object</span>;\n      <span class=\"i\">object</span> = <span class=\"i\">object</span>-&gt;<span class=\"i\">next</span>;\n      <span class=\"k\">if</span> (<span class=\"i\">previous</span> != <span class=\"a\">NULL</span>) {\n        <span class=\"i\">previous</span>-&gt;<span class=\"i\">next</span> = <span class=\"i\">object</span>;\n      } <span class=\"k\">else</span> {\n        <span class=\"i\">vm</span>.<span class=\"i\">objects</span> = <span 
class=\"i\">object</span>;\n      }\n\n      <span class=\"i\">freeObject</span>(<span class=\"i\">unreached</span>);\n    }\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, add after <em>traceReferences</em>()</div>\n\n<p>I know that&rsquo;s kind of a lot of code and pointer shenanigans, but there isn&rsquo;t\nmuch to it once you work through it. The outer <code>while</code> loop walks the linked\nlist of every object in the heap, checking their mark bits. If an object is\nmarked (black), we leave it alone and continue past it. If it is unmarked\n(white), we unlink it from the list and free it using the <code>freeObject()</code>\nfunction we already wrote.</p><img src=\"image/garbage-collection/unlink.png\" alt=\"A recycle bin full of bits.\" />\n<p>Most of the other code in here deals with the fact that removing a node from a\nsingly linked list is cumbersome. We have to continuously remember the previous\nnode so we can unlink its next pointer, and we have to handle the edge case\nwhere we are freeing the first node. But, otherwise, it&rsquo;s pretty simple<span class=\"em\">&mdash;</span>delete every node in a linked list that doesn&rsquo;t have a bit set in it.</p>\n<p>There&rsquo;s one little addition:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    if (object-&gt;isMarked) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>sweep</em>()</div>\n<pre class=\"insert\">      <span class=\"i\">object</span>-&gt;<span class=\"i\">isMarked</span> = <span class=\"k\">false</span>;\n</pre><pre class=\"insert-after\">      previous = object;\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>sweep</em>()</div>\n\n<p>After <code>sweep()</code> completes, the only remaining objects are the live black ones\nwith their mark bits set. That&rsquo;s correct, but when the <em>next</em> collection cycle\nstarts, we need every object to be white. 
So whenever we reach a black object,\nwe go ahead and clear the bit now in anticipation of the next run.</p>\n<h3><a href=\"#weak-references-and-the-string-pool\" id=\"weak-references-and-the-string-pool\"><small>26&#8202;.&#8202;5&#8202;.&#8202;1</small>Weak references and the string pool</a></h3>\n<p>We are almost done collecting. There is one remaining corner of the VM that has\nsome unusual requirements around memory. Recall that when we added strings to\nclox we made the VM intern them all. That means the VM has a hash table\ncontaining a pointer to every single string in the heap. The VM uses this to\nde-duplicate strings.</p>\n<p>During the mark phase, we deliberately did <em>not</em> treat the VM&rsquo;s string table as\na source of roots. If we had, no <span name=\"intern\">string</span> would <em>ever</em>\nbe collected. The string table would grow and grow and never yield a single byte\nof memory back to the operating system. That would be bad.</p>\n<aside name=\"intern\">\n<p>This can be a real problem. Java does not intern <em>all</em> strings, but it does\nintern string <em>literals</em>. It also provides an API to add strings to the string\ntable. For many years, the capacity of that table was fixed, and strings added\nto it could never be removed. If users weren&rsquo;t careful about their use of\n<code>String.intern()</code>, they could run out of memory and crash.</p>\n<p>Ruby had a similar problem for years where symbols<span class=\"em\">&mdash;</span>interned string-like\nvalues<span class=\"em\">&mdash;</span>were not garbage collected. Both eventually enabled the GC to collect\nthese strings.</p>\n</aside>\n<p>At the same time, if we <em>do</em> let the GC free strings, then the VM&rsquo;s string table\nwill be left with dangling pointers to freed memory. That would be even worse.</p>\n<p>The string table is special and we need special support for it. In particular,\nit needs a special kind of reference. 
The table should be able to refer to a\nstring, but that link should not be considered a root when determining\nreachability. That implies that the referenced object can be freed. When that\nhappens, the dangling reference must be fixed too, sort of like a magic,\nself-clearing pointer. This particular set of semantics comes up frequently\nenough that it has a name: a <a href=\"https://en.wikipedia.org/wiki/Weak_reference\"><strong>weak reference</strong></a>.</p>\n<p>We have already implicitly implemented half of the string table&rsquo;s unique\nbehavior by virtue of the fact that we <em>don&rsquo;t</em> traverse it during marking. That\nmeans it doesn&rsquo;t force strings to be reachable. The remaining piece is clearing\nout any dangling pointers for strings that are freed.</p>\n<p>To remove references to unreachable strings, we need to know which strings <em>are</em>\nunreachable. We don&rsquo;t know that until after the mark phase has completed. But we\ncan&rsquo;t wait until after the sweep phase is done because by then the objects<span class=\"em\">&mdash;</span>and their mark bits<span class=\"em\">&mdash;</span>are no longer around to check. 
So the right time is\nexactly between the marking and sweeping phases.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  traceReferences();\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>collectGarbage</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">tableRemoveWhite</span>(&amp;<span class=\"i\">vm</span>.<span class=\"i\">strings</span>);\n</pre><pre class=\"insert-after\">  sweep();\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>collectGarbage</em>()</div>\n\n<p>The logic for removing the about-to-be-deleted strings exists in a new function\nin the &ldquo;table&rdquo; module.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">ObjString* tableFindString(Table* table, const char* chars,\n                           int length, uint32_t hash);\n</pre><div class=\"source-file\"><em>table.h</em><br>\nadd after <em>tableFindString</em>()</div>\n<pre class=\"insert\">\n\n<span class=\"t\">void</span> <span class=\"i\">tableRemoveWhite</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>);\n</pre><pre class=\"insert-after\">void markTable(Table* table);\n\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.h</em>, add after <em>tableFindString</em>()</div>\n\n<p>The implementation is here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>table.c</em><br>\nadd after <em>tableFindString</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">tableRemoveWhite</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>) {\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span>; <span class=\"i\">i</span>++) {\n    <span class=\"t\">Entry</span>* <span class=\"i\">entry</span> = &amp;<span class=\"i\">table</span>-&gt;<span class=\"i\">entries</span>[<span class=\"i\">i</span>];\n    
<span class=\"k\">if</span> (<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> != <span class=\"a\">NULL</span> &amp;&amp; !<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span>-&gt;<span class=\"i\">obj</span>.<span class=\"i\">isMarked</span>) {\n      <span class=\"i\">tableDelete</span>(<span class=\"i\">table</span>, <span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span>);\n    }\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, add after <em>tableFindString</em>()</div>\n\n<p>We walk every entry in the table. The string intern table uses only the key of\neach entry<span class=\"em\">&mdash;</span>it&rsquo;s basically a hash <em>set</em> not a hash <em>map</em>. If the key string\nobject&rsquo;s mark bit is not set, then it is a white object that is moments from\nbeing swept away. We delete it from the hash table first and thus ensure we\nwon&rsquo;t see any dangling pointers.</p>\n<h2><a href=\"#when-to-collect\" id=\"when-to-collect\"><small>26&#8202;.&#8202;6</small>When to Collect</a></h2>\n<p>We have a fully functioning mark-sweep garbage collector now. When the stress\ntesting flag is enabled, it gets called all the time, and with the logging\nenabled too, we can watch it do its thing and see that it is indeed reclaiming\nmemory. But, when the stress testing flag is off, it never runs at all. It&rsquo;s\ntime to decide when the collector should be invoked during normal program\nexecution.</p>\n<p>As far as I can tell, this question is poorly answered by the literature. When\ngarbage collectors were first invented, computers had a tiny, fixed amount of\nmemory. Many of the early GC papers assumed that you set aside a few thousand\nwords of memory<span class=\"em\">&mdash;</span>in other words, most of it<span class=\"em\">&mdash;</span>and invoked the collector\nwhenever you ran out. 
Simple.</p>\n<p>Modern machines have gigs of physical RAM, hidden behind the operating system&rsquo;s\neven larger virtual memory abstraction, which is shared among a slew of other\nprograms all fighting for their chunk of memory. The operating system will let\nyour program request as much as it wants and then page in and out from the disc\nwhen physical memory gets full. You never really &ldquo;run out&rdquo; of memory, you just\nget slower and slower.</p>\n<h3><a href=\"#latency-and-throughput\" id=\"latency-and-throughput\"><small>26&#8202;.&#8202;6&#8202;.&#8202;1</small>Latency and throughput</a></h3>\n<p>It no longer makes sense to wait until you &ldquo;have to&rdquo; run the GC, so we need\na more subtle timing strategy. To reason about this more precisely, it&rsquo;s time to\nintroduce two fundamental numbers used when measuring a memory manager&rsquo;s\nperformance: <em>throughput</em> and <em>latency</em>.</p>\n<p>Every managed language pays a performance price compared to explicit,\nuser-authored deallocation. The time spent actually freeing memory is the same,\nbut the GC spends cycles figuring out <em>which</em> memory to free. That is time <em>not</em>\nspent running the user&rsquo;s code and doing useful work. In our implementation,\nthat&rsquo;s the entirety of the mark phase. The goal of a sophisticated garbage\ncollector is to minimize that overhead.</p>\n<p>There are two key metrics we can use to understand that cost better:</p>\n<ul>\n<li>\n<p><strong>Throughput</strong> is the total fraction of time spent running user code versus\ndoing garbage collection work. Say you run a clox program for ten seconds\nand it spends a second of that inside <code>collectGarbage()</code>. That means the\nthroughput is 90%<span class=\"em\">&mdash;</span>it spent 90% of the time running the program and 10%\non GC overhead.</p>\n<p>Throughput is the most fundamental measure because it tracks the total cost\nof collection overhead. 
All else being equal, you want to maximize\nthroughput. Up until this chapter, clox had no GC at all and thus <span\nname=\"hundred\">100%</span> throughput. That&rsquo;s pretty hard to beat. Of\ncourse, it came at the slight expense of potentially running out of memory\nand crashing if the user&rsquo;s program ran long enough. You can look at the goal\nof a GC as fixing that &ldquo;glitch&rdquo; while sacrificing as little throughput as\npossible.</p>\n</li>\n</ul>\n<aside name=\"hundred\">\n<p>Well, not <em>exactly</em> 100%. It did still put the allocated objects into a linked\nlist, so there was some tiny overhead for setting those pointers.</p>\n</aside>\n<ul>\n<li>\n<p><strong>Latency</strong> is the longest <em>continuous</em> chunk of time where the user&rsquo;s\nprogram is completely paused while garbage collection happens. It&rsquo;s a\nmeasure of how &ldquo;chunky&rdquo; the collector is. Latency is an entirely different\nmetric than throughput.</p>\n<p>Consider two runs of a clox program that both take ten seconds. In the first\nrun, the GC kicks in once and spends a solid second in <code>collectGarbage()</code> in\none massive collection. In the second run, the GC gets invoked five times,\neach for a fifth of a second. The <em>total</em> amount of time spent collecting is\nstill a second, so the throughput is 90% in both cases. But in the second\nrun, the latency is only 1/5th of a second, five times less than in the\nfirst.</p>\n</li>\n</ul>\n<p><span name=\"latency\"></span></p><img src=\"image/garbage-collection/latency-throughput.png\" alt=\"A bar representing execution time with slices for running user code and running the GC. The largest GC slice is latency. The size of all of the user code slices is throughput.\" />\n<aside name=\"latency\">\n<p>The bar represents the execution of a program, divided into time spent running\nuser code and time spent in the GC. The size of the largest single slice of time\nrunning the GC is the latency. 
The size of all of the user code slices added up\nis the throughput.</p>\n</aside>\n<p>If you like analogies, imagine your program is a bakery selling fresh-baked\nbread to customers. Throughput is the total number of warm, crusty baguettes you\ncan serve to customers in a single day. Latency is how long the unluckiest\ncustomer has to wait in line before they get served.</p>\n<p><span name=\"dishwasher\">Running</span> the garbage collector is like shutting\ndown the bakery temporarily to go through all of the dishes, sort out the dirty\nfrom the clean, and then wash the used ones. In our analogy, we don&rsquo;t have\ndedicated dishwashers, so while this is going on, no baking is happening. The\nbaker is washing up.</p>\n<aside name=\"dishwasher\">\n<p>If each person represents a thread, then an obvious optimization is to have\nseparate threads running garbage collection, giving you a <strong>concurrent garbage\ncollector</strong>. In other words, hire some dishwashers to clean while others bake.\nThis is how very sophisticated GCs work because it does let the bakers<span class=\"em\">&mdash;</span>the worker threads<span class=\"em\">&mdash;</span>keep running user code with little interruption.</p>\n<p>However, coordination is required. You don&rsquo;t want a dishwasher grabbing a bowl\nout of a baker&rsquo;s hands! This coordination adds overhead and a lot of complexity.\nConcurrent collectors are fast, but challenging to implement correctly.</p><img src=\"image/garbage-collection/baguette.png\" class=\"above\" alt=\"Un baguette.\" />\n</aside>\n<p>Selling fewer loaves of bread a day is bad, and making any particular customer\nsit and wait while you clean all the dishes is too. The goal is to maximize\nthroughput and minimize latency, but there is no free lunch, even inside a\nbakery. 
Garbage collectors make different trade-offs between how much throughput\nthey sacrifice and latency they tolerate.</p>\n<p>Being able to make these trade-offs is useful because different user programs\nhave different needs. An overnight batch job that is generating a report from a\nterabyte of data just needs to get as much work done as fast as possible.\nThroughput is queen. Meanwhile, an app running on a user&rsquo;s smartphone needs to\nalways respond immediately to user input so that dragging on the screen feels\n<span name=\"butter\">buttery</span> smooth. The app can&rsquo;t freeze for a few\nseconds while the GC mucks around in the heap.</p>\n<aside name=\"butter\">\n<p>Clearly the baking analogy is going to my head.</p>\n</aside>\n<p>As a garbage collector author, you control some of the trade-off between\nthroughput and latency by your choice of collection algorithm. But even within a\nsingle algorithm, we have a lot of control over <em>how frequently</em> the collector\nruns.</p>\n<p>Our collector is a <span name=\"incremental\"><strong>stop-the-world GC</strong></span> which\nmeans the user&rsquo;s program is paused until the entire garbage collection process\nhas completed. If we wait a long time before we run the collector, then a large\nnumber of dead objects will accumulate. That leads to a very long pause while\nthe collector runs, and thus high latency. So, clearly, we want to run the\ncollector really frequently.</p>\n<aside name=\"incremental\">\n<p>In contrast, an <strong>incremental garbage collector</strong> can do a little collection,\nthen run some user code, then collect a little more, and so on.</p>\n</aside>\n<p>But every time the collector runs, it spends some time visiting live objects.\nThat doesn&rsquo;t really <em>do</em> anything useful (aside from ensuring that they don&rsquo;t\nincorrectly get deleted). Time visiting live objects is time not freeing memory\nand also time not running user code. 
If you run the GC <em>really</em> frequently, then\nthe user&rsquo;s program doesn&rsquo;t have enough time to even generate new garbage for the\nVM to collect. The VM will spend all of its time obsessively revisiting the same\nset of live objects over and over, and throughput will suffer. So, clearly, we\nwant to run the collector really <em>in</em>frequently.</p>\n<p>In fact, we want something in the middle, and the frequency of when the\ncollector runs is one of our main knobs for tuning the trade-off between latency\nand throughput.</p>\n<h3><a href=\"#self-adjusting-heap\" id=\"self-adjusting-heap\"><small>26&#8202;.&#8202;6&#8202;.&#8202;2</small>Self-adjusting heap</a></h3>\n<p>We want our GC to run frequently enough to minimize latency but infrequently\nenough to maintain decent throughput. But how do we find the balance between\nthese when we have no idea how much memory the user&rsquo;s program needs and how\noften it allocates? We could pawn the problem onto the user and force them to\npick by exposing GC tuning parameters. Many VMs do this. But if we, the GC\nauthors, don&rsquo;t know how to tune it well, odds are good most users won&rsquo;t either.\nThey deserve a reasonable default behavior.</p>\n<p>I&rsquo;ll be honest with you, this is not my area of expertise. I&rsquo;ve talked to a\nnumber of professional GC hackers<span class=\"em\">&mdash;</span>this is something you can build an entire\ncareer on<span class=\"em\">&mdash;</span>and read a lot of the literature, and all of the answers I got\nwere<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>vague. The strategy I ended up picking is common, pretty simple, and (I\nhope!) good enough for most uses.</p>\n<p>The idea is that the collector frequency automatically adjusts based on the live\nsize of the heap. We track the total number of bytes of managed memory that the\nVM has allocated. When it goes above some threshold, we trigger a GC. 
After\nthat, we note how many bytes of memory remain<span class=\"em\">&mdash;</span>how many were <em>not</em> freed. Then\nwe adjust the threshold to some value larger than that.</p>\n<p>The result is that as the amount of live memory increases, we collect less\nfrequently in order to avoid sacrificing throughput by re-traversing the growing\npile of live objects. As the amount of live memory goes down, we collect more\nfrequently so that we don&rsquo;t lose too much latency by waiting too long.</p>\n<p>The implementation requires two new bookkeeping fields in the VM.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  ObjUpvalue* openUpvalues;\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nin struct <em>VM</em></div>\n<pre class=\"insert\">\n\n  <span class=\"t\">size_t</span> <span class=\"i\">bytesAllocated</span>;\n  <span class=\"t\">size_t</span> <span class=\"i\">nextGC</span>;\n</pre><pre class=\"insert-after\">  Obj* objects;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, in struct <em>VM</em></div>\n\n<p>The first is a running total of the number of bytes of managed memory the VM has\nallocated. The second is the threshold that triggers the next collection. We\ninitialize them when the VM starts up.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  vm.objects = NULL;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>initVM</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">vm</span>.<span class=\"i\">bytesAllocated</span> = <span class=\"n\">0</span>;\n  <span class=\"i\">vm</span>.<span class=\"i\">nextGC</span> = <span class=\"n\">1024</span> * <span class=\"n\">1024</span>;\n</pre><pre class=\"insert-after\">\n\n  vm.grayCount = 0;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>initVM</em>()</div>\n\n<p>The starting threshold here is <span name=\"lab\">arbitrary</span>. It&rsquo;s similar\nto the initial capacity we picked for our various dynamic arrays. 
The goal is to\nnot trigger the first few GCs <em>too</em> quickly but also to not wait too long. If we\nhad some real-world Lox programs, we could profile those to tune this. But since\nall we have are toy programs, I just picked a number.</p>\n<aside name=\"lab\">\n<p>A challenge with learning garbage collectors is that it&rsquo;s <em>very</em> hard to\ndiscover the best practices in an isolated lab environment. You don&rsquo;t see how a\ncollector actually performs unless you run it on the kind of large, messy\nreal-world programs it is actually intended for. It&rsquo;s like tuning a rally car<span class=\"em\">&mdash;</span>you need to take it out on the course.</p>\n</aside>\n<p>Every time we allocate or free some memory, we adjust the counter by that delta.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void* reallocate(void* pointer, size_t oldSize, size_t newSize) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>reallocate</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">vm</span>.<span class=\"i\">bytesAllocated</span> += <span class=\"i\">newSize</span> - <span class=\"i\">oldSize</span>;\n</pre><pre class=\"insert-after\">  if (newSize &gt; oldSize) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>reallocate</em>()</div>\n\n<p>When the total crosses the limit, we run the collector.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    collectGarbage();\n#endif\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>reallocate</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">if</span> (<span class=\"i\">vm</span>.<span class=\"i\">bytesAllocated</span> &gt; <span class=\"i\">vm</span>.<span class=\"i\">nextGC</span>) {\n      <span class=\"i\">collectGarbage</span>();\n    }\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>reallocate</em>()</div>\n\n<p>Now, finally, our garbage collector actually 
does something when the user runs a\nprogram without our hidden diagnostic flag enabled. The sweep phase frees\nobjects by calling <code>reallocate()</code>, which lowers the value of <code>bytesAllocated</code>,\nso after the collection completes, we know how many live bytes remain. We adjust\nthe threshold of the next GC based on that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  sweep();\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>collectGarbage</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">vm</span>.<span class=\"i\">nextGC</span> = <span class=\"i\">vm</span>.<span class=\"i\">bytesAllocated</span> * <span class=\"a\">GC_HEAP_GROW_FACTOR</span>;\n</pre><pre class=\"insert-after\">\n\n#ifdef DEBUG_LOG_GC\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>collectGarbage</em>()</div>\n\n<p>The threshold is a multiple of the heap size. This way, as the amount of memory\nthe program uses grows, the threshold moves farther out to limit the total time\nspent re-traversing the larger live set. Like other numbers in this chapter, the\nscaling factor is basically arbitrary.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#endif\n</pre><div class=\"source-file\"><em>memory.c</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define GC_HEAP_GROW_FACTOR 2</span>\n</pre><pre class=\"insert-after\">\n\nvoid* reallocate(void* pointer, size_t oldSize, size_t newSize) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em></div>\n\n<p>You&rsquo;d want to tune this in your implementation once you had some real programs\nto benchmark it on. Right now, we can at least log some of the statistics that\nwe have. 
We capture the heap size before the collection.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  printf(&quot;-- gc begin\\n&quot;);\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>collectGarbage</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">size_t</span> <span class=\"i\">before</span> = <span class=\"i\">vm</span>.<span class=\"i\">bytesAllocated</span>;\n</pre><pre class=\"insert-after\">#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>collectGarbage</em>()</div>\n\n<p>And then print the results at the end.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  printf(&quot;-- gc end\\n&quot;);\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>collectGarbage</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">printf</span>(<span class=\"s\">&quot;   collected %zu bytes (from %zu to %zu) next at %zu</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>,\n         <span class=\"i\">before</span> - <span class=\"i\">vm</span>.<span class=\"i\">bytesAllocated</span>, <span class=\"i\">before</span>, <span class=\"i\">vm</span>.<span class=\"i\">bytesAllocated</span>,\n         <span class=\"i\">vm</span>.<span class=\"i\">nextGC</span>);\n</pre><pre class=\"insert-after\">#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>collectGarbage</em>()</div>\n\n<p>This way we can see how much the garbage collector accomplished while it ran.</p>\n<h2><a href=\"#garbage-collection-bugs\" id=\"garbage-collection-bugs\"><small>26&#8202;.&#8202;7</small>Garbage Collection Bugs</a></h2>\n<p>In theory, we are all done now. We have a GC. It kicks in periodically, collects\nwhat it can, and leaves the rest. 
If this were a typical textbook, we would wipe\nthe dust from our hands and bask in the soft glow of the flawless marble edifice\nwe have created.</p>\n<p>But I aim to teach you not just the theory of programming languages but the\nsometimes painful reality. I am going to roll over a rotten log and show you the\nnasty bugs that live under it, and garbage collector bugs really are some of the\ngrossest invertebrates out there.</p>\n<p>The collector&rsquo;s job is to free dead objects and preserve live ones. Mistakes are\neasy to make in both directions. If the VM fails to free objects that aren&rsquo;t\nneeded, it slowly leaks memory. If it frees an object that is in use, the user&rsquo;s\nprogram can access invalid memory. These failures often don&rsquo;t immediately cause\na crash, which makes it hard for us to trace backward in time to find the bug.</p>\n<p>This is made harder by the fact that we don&rsquo;t know when the collector will run.\nAny call that eventually allocates some memory is a place in the VM where a\ncollection could happen. It&rsquo;s like musical chairs. At any point, the GC might\nstop the music. Every single heap-allocated object that we want to keep needs to\nfind a chair quickly<span class=\"em\">&mdash;</span>get marked as a root or stored as a reference in some\nother object<span class=\"em\">&mdash;</span>before the sweep phase comes to kick it out of the game.</p>\n<p>How is it possible for the VM to use an object later<span class=\"em\">&mdash;</span>one that the GC itself\ndoesn&rsquo;t see? How can the VM find it? The most common answer is through a pointer\nstored in some local variable on the C stack. The GC walks the <em>VM&rsquo;s</em> value and\nCallFrame stacks, but the C stack is <span name=\"c\">hidden</span> to it.</p>\n<aside name=\"c\">\n<p>Our GC can&rsquo;t find addresses in the C stack, but many can. Conservative garbage\ncollectors look all through memory, including the native stack. 
The most\nwell-known of this variety is the <a href=\"https://en.wikipedia.org/wiki/Boehm_garbage_collector\"><strong>Boehm–Demers–Weiser garbage\ncollector</strong></a>, usually just called the &ldquo;Boehm collector&rdquo;. (The shortest\npath to fame in CS is a last name that&rsquo;s alphabetically early so that it shows\nup first in sorted lists of names.)</p>\n<p>Many precise GCs walk the C stack too. Even those have to be careful about\npointers to live objects that exist only in <em>CPU registers</em>.</p>\n</aside>\n<p>In previous chapters, we wrote seemingly pointless code that pushed an object\nonto the VM&rsquo;s value stack, did a little work, and then popped it right back off.\nMost times, I said this was for the GC&rsquo;s benefit. Now you see why. The code\nbetween pushing and popping potentially allocates memory and thus can trigger a\nGC. We had to make sure the object was on the value stack so that the\ncollector&rsquo;s mark phase would find it and keep it alive.</p>\n<p>I wrote the entire clox implementation before splitting it into chapters and\nwriting the prose, so I had plenty of time to find all of these corners and\nflush out most of these bugs. The stress testing code we put in at the beginning\nof this chapter and a pretty good test suite were very helpful.</p>\n<p>But I fixed only <em>most</em> of them. I left a couple in because I want to give you a\nhint of what it&rsquo;s like to encounter these bugs in the wild. If you enable the\nstress test flag and run some toy Lox programs, you can probably stumble onto a\nfew. Give it a try and <em>see if you can fix any yourself</em>.</p>\n<h3><a href=\"#adding-to-the-constant-table\" id=\"adding-to-the-constant-table\"><small>26&#8202;.&#8202;7&#8202;.&#8202;1</small>Adding to the constant table</a></h3>\n<p>You are very likely to hit the first bug. The constant table each chunk owns is\na dynamic array. 
When the compiler adds a new constant to the current function&rsquo;s\ntable, that array may need to grow. The constant itself may also be some\nheap-allocated object like a string or a nested function.</p>\n<p>The new object being added to the constant table is passed to <code>addConstant()</code>.\nAt that moment, the object can be found only in the parameter to that function\non the C stack. That function appends the object to the constant table. If the\ntable doesn&rsquo;t have enough capacity and needs to grow, it calls <code>reallocate()</code>.\nThat in turn triggers a GC, which fails to mark the new constant object and\nthus sweeps it right before we have a chance to add it to the table. Crash.</p>\n<p>The fix, as you&rsquo;ve seen in other places, is to push the constant onto the stack\ntemporarily.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">int addConstant(Chunk* chunk, Value value) {\n</pre><div class=\"source-file\"><em>chunk.c</em><br>\nin <em>addConstant</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">push</span>(<span class=\"i\">value</span>);\n</pre><pre class=\"insert-after\">  writeValueArray(&amp;chunk-&gt;constants, value);\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em>, in <em>addConstant</em>()</div>\n\n<p>Once the constant table contains the object, we pop it off the stack.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  writeValueArray(&amp;chunk-&gt;constants, value);\n</pre><div class=\"source-file\"><em>chunk.c</em><br>\nin <em>addConstant</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">pop</span>();\n</pre><pre class=\"insert-after\">  return chunk-&gt;constants.count - 1;\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em>, in <em>addConstant</em>()</div>\n\n<p>When the GC is marking roots, it walks the chain of compilers and marks each of\ntheir functions, so the new constant is reachable now. 
We do need an include\nto call into the VM from the &ldquo;chunk&rdquo; module.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;memory.h&quot;\n</pre><div class=\"source-file\"><em>chunk.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;vm.h&quot;</span>\n</pre><pre class=\"insert-after\">\n\nvoid initChunk(Chunk* chunk) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.c</em></div>\n\n<h3><a href=\"#interning-strings\" id=\"interning-strings\"><small>26&#8202;.&#8202;7&#8202;.&#8202;2</small>Interning strings</a></h3>\n<p>Here&rsquo;s another similar one. All strings are interned in clox, so whenever we\ncreate a new string, we also add it to the intern table. You can see where this\nis going. Since the string is brand new, it isn&rsquo;t reachable anywhere. And\nresizing the string pool can trigger a collection. Again, we go ahead and stash\nthe string on the stack first.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  string-&gt;chars = chars;\n  string-&gt;hash = hash;\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>allocateString</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">push</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">string</span>));\n</pre><pre class=\"insert-after\">  tableSet(&amp;vm.strings, string, NIL_VAL);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>allocateString</em>()</div>\n\n<p>And then pop it back off once it&rsquo;s safely nestled in the table.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  tableSet(&amp;vm.strings, string, NIL_VAL);\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>allocateString</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">pop</span>();\n\n</pre><pre class=\"insert-after\">  return string;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>allocateString</em>()</div>\n\n<p>This ensures the string is safe while 
the table is being resized. Once it\nsurvives that, <code>allocateString()</code> will return it to some caller which can then\ntake responsibility for ensuring the string is still reachable before the next\nheap allocation occurs.</p>\n<h3><a href=\"#concatenating-strings\" id=\"concatenating-strings\"><small>26&#8202;.&#8202;7&#8202;.&#8202;3</small>Concatenating strings</a></h3>\n<p>One last example: Over in the interpreter, the <code>OP_ADD</code> instruction can be used\nto concatenate two strings. As it does with numbers, it pops the two operands\nfrom the stack, computes the result, and pushes that new value back onto the\nstack. For numbers that&rsquo;s perfectly safe.</p>\n<p>But concatenating two strings requires allocating a new character array on the\nheap, which can in turn trigger a GC. Since we&rsquo;ve already popped the operand\nstrings by that point, they can potentially be missed by the mark phase and get\nswept away. Instead of popping them off the stack eagerly, we peek them.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void concatenate() {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>concatenate</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">  <span class=\"t\">ObjString</span>* <span class=\"i\">b</span> = <span class=\"a\">AS_STRING</span>(<span class=\"i\">peek</span>(<span class=\"n\">0</span>));\n  <span class=\"t\">ObjString</span>* <span class=\"i\">a</span> = <span class=\"a\">AS_STRING</span>(<span class=\"i\">peek</span>(<span class=\"n\">1</span>));\n</pre><pre class=\"insert-after\">\n\n  int length = a-&gt;length + b-&gt;length;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>concatenate</em>(), replace 2 lines</div>\n\n<p>That way, they are still hanging out on the stack when we create the result\nstring. 
Once that&rsquo;s done, we can safely pop them off and replace them with the\nresult.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  ObjString* result = takeString(chars, length);\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>concatenate</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">pop</span>();\n  <span class=\"i\">pop</span>();\n</pre><pre class=\"insert-after\">  push(OBJ_VAL(result));\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>concatenate</em>()</div>\n\n<p>Those were all pretty easy, especially because I <em>showed</em> you where the fix was.\nIn practice, <em>finding</em> them is the hard part. All you see is an object that\n<em>should</em> be there but isn&rsquo;t. It&rsquo;s not like other bugs where you&rsquo;re looking for\nthe code that <em>causes</em> some problem. You&rsquo;re looking for the <em>absence</em> of code\nwhich fails to <em>prevent</em> a problem, and that&rsquo;s a much harder search.</p>\n<p>But, for now at least, you can rest easy. As far as I know, we&rsquo;ve found all of\nthe collection bugs in clox, and now we have a working, robust, self-tuning,\nmark-sweep garbage collector.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>The Obj header struct at the top of each object now has three fields:\n<code>type</code>, <code>isMarked</code>, and <code>next</code>. How much memory do those take up (on your\nmachine)? Can you come up with something more compact? Is there a runtime\ncost to doing so?</p>\n</li>\n<li>\n<p>When the sweep phase traverses a live object, it clears the <code>isMarked</code>\nfield to prepare it for the next collection cycle. Can you come up with a\nmore efficient approach?</p>\n</li>\n<li>\n<p>Mark-sweep is only one of a variety of garbage collection algorithms out\nthere. Explore those by replacing or augmenting the current collector with\nanother one. 
Good candidates to consider are reference counting, Cheney&rsquo;s\nalgorithm, or the Lisp 2 mark-compact algorithm.</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: Generational Collectors</a></h2>\n<p>A collector loses throughput if it spends a long time re-visiting objects that\nare still alive. But it can increase latency if it avoids collecting and\naccumulates a large pile of garbage to wade through. If only there were some way\nto tell which objects were likely to be long-lived and which weren&rsquo;t. Then the\nGC could avoid revisiting the long-lived ones as often and clean up the\nephemeral ones more frequently.</p>\n<p>It turns out there kind of is. Many years ago, GC researchers gathered metrics\non the lifetime of objects in real-world running programs. They tracked every\nobject when it was allocated, and eventually when it was no longer needed, and\nthen graphed out how long objects tended to live.</p>\n<p>They discovered something they called the <strong>generational hypothesis</strong>, or the\nmuch less tactful term <strong>infant mortality</strong>. Their observation was that most\nobjects are very short-lived but once they survive beyond a certain age, they\ntend to stick around quite a long time. The longer an object <em>has</em> lived, the\nlonger it likely will <em>continue</em> to live. This observation is powerful because\nit gave them a handle on how to partition objects into groups that benefit from\nfrequent collections and those that don&rsquo;t.</p>\n<p>They designed a technique called <strong>generational garbage collection</strong>. It works\nlike this: Every time a new object is allocated, it goes into a special,\nrelatively small region of the heap called the &ldquo;nursery&rdquo;. 
Since objects tend to\ndie young, the garbage collector is invoked <span\nname=\"nursery\">frequently</span> over the objects just in this region.</p>\n<aside name=\"nursery\">\n<p>Nurseries are also usually managed using a copying collector which is faster at\nallocating and freeing objects than a mark-sweep collector.</p>\n</aside>\n<p>Each time the GC runs over the nursery is called a &ldquo;generation&rdquo;. Any objects\nthat are no longer needed get freed. Those that survive are now considered one\ngeneration older, and the GC tracks this for each object. If an object survives\na certain number of generations<span class=\"em\">&mdash;</span>often just a single collection<span class=\"em\">&mdash;</span>it gets\n<em>tenured</em>. At this point, it is copied out of the nursery into a much larger\nheap region for long-lived objects. The garbage collector runs over that region\ntoo, but much less frequently since odds are good that most of those objects\nwill still be alive.</p>\n<p>Generational collectors are a beautiful marriage of empirical data<span class=\"em\">&mdash;</span>the\nobservation that object lifetimes are <em>not</em> evenly distributed<span class=\"em\">&mdash;</span>and clever\nalgorithm design that takes advantage of that fact. They&rsquo;re also conceptually\nquite simple. You can think of one as just two separately tuned GCs and a pretty\nsimple policy for moving objects from one to the other.</p>\n</div>\n\n<footer>\n<a href=\"classes-and-instances.html\" class=\"next\">\n  Next Chapter: &ldquo;Classes and Instances&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/global-variables.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Global Variables &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Global Variables<small>21</small></a></h3>\n\n<ul>\n    <li><a href=\"#statements\"><small>21.1</small> Statements</a></li>\n    <li><a href=\"#variable-declarations\"><small>21.2</small> Variable Declarations</a></li>\n    <li><a href=\"#reading-variables\"><small>21.3</small> Reading Variables</a></li>\n    <li><a href=\"#assignment\"><small>21.4</small> Assignment</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div 
class=\"prev-next\">\n    <a href=\"hash-tables.html\" title=\"Hash Tables\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"local-variables.html\" title=\"Local Variables\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"hash-tables.html\" title=\"Hash Tables\" class=\"prev\">←</a>\n<a href=\"local-variables.html\" title=\"Local Variables\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Global Variables<small>21</small></a></h3>\n\n<ul>\n    <li><a href=\"#statements\"><small>21.1</small> Statements</a></li>\n    <li><a href=\"#variable-declarations\"><small>21.2</small> Variable Declarations</a></li>\n    <li><a href=\"#reading-variables\"><small>21.3</small> Reading Variables</a></li>\n    <li><a href=\"#assignment\"><small>21.4</small> Assignment</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"hash-tables.html\" title=\"Hash Tables\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"local-variables.html\" title=\"Local Variables\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">21</div>\n  <h1>Global Variables</h1>\n\n<blockquote>\n<p>If only there could be an invention that bottled up a memory, like scent. And\nit never faded, and it never got stale. 
And then, when one wanted it, the\nbottle could be uncorked, and it would be like living the moment all over\nagain.</p>\n<p><cite>Daphne du Maurier, <em>Rebecca</em></cite></p>\n</blockquote>\n<p>The <a href=\"hash-tables.html\">previous chapter</a> was a long exploration of one big, deep,\nfundamental computer science data structure. Heavy on theory and concept. There\nmay have been some discussion of big-O notation and algorithms. This chapter has\nfewer intellectual pretensions. There are no large ideas to learn. Instead, it&rsquo;s\na handful of straightforward engineering tasks. Once we&rsquo;ve completed them, our\nvirtual machine will support variables.</p>\n<p>Actually, it will support only <em>global</em> variables. Locals are coming in the\n<a href=\"local-variables.html\">next chapter</a>. In jlox, we managed to cram them both into a single chapter\nbecause we used the same implementation technique for all variables. We built a\nchain of environments, one for each scope, all the way up to the top. That was a\nsimple, clean way to learn how to manage state.</p>\n<p>But it&rsquo;s also <em>slow</em>. Allocating a new hash table each time you enter a block or\ncall a function is not the road to a fast VM. Given how much code is concerned\nwith using variables, if variables go slow, everything goes slow. For clox,\nwe&rsquo;ll improve that by using a much more efficient strategy for <span\nname=\"different\">local</span> variables, but globals aren&rsquo;t as easily optimized.</p>\n<aside name=\"different\">\n<p>This is a common meta-strategy in sophisticated language implementations. Often,\nthe same language feature will have multiple implementation techniques, each\ntuned for different use patterns. For example, JavaScript VMs often have a\nfaster representation for objects that are used more like instances of classes\ncompared to other objects whose set of properties is more freely modified. 
C and\nC++ compilers usually have a variety of ways to compile <code>switch</code> statements\nbased on the number of cases and how densely packed the case values are.</p>\n</aside>\n<p>A quick refresher on Lox semantics: Global variables in Lox are &ldquo;late bound&rdquo;, or\nresolved dynamically. This means you can compile a chunk of code that refers to\na global variable before it&rsquo;s defined. As long as the code doesn&rsquo;t <em>execute</em>\nbefore the definition happens, everything is fine. In practice, that means you\ncan refer to later variables inside the body of functions.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">showVariable</span>() {\n  <span class=\"k\">print</span> <span class=\"i\">global</span>;\n}\n\n<span class=\"k\">var</span> <span class=\"i\">global</span> = <span class=\"s\">&quot;after&quot;</span>;\n<span class=\"i\">showVariable</span>();\n</pre></div>\n<p>Code like this might seem odd, but it&rsquo;s handy for defining mutually recursive\nfunctions. It also plays nicer with the REPL. You can write a little function in\none line, then define the variable it uses in the next.</p>\n<p>Local variables work differently. Since a local variable&rsquo;s declaration <em>always</em>\noccurs before it is used, the VM can resolve them at compile time, even in a\nsimple single-pass compiler. That will let us use a smarter representation for\nlocals. But that&rsquo;s for the next chapter. Right now, let&rsquo;s just worry about\nglobals.</p>\n<h2><a href=\"#statements\" id=\"statements\"><small>21&#8202;.&#8202;1</small>Statements</a></h2>\n<p>Variables come into being using variable declarations, which means now is also\nthe time to add support for statements to our compiler. If you recall, Lox\nsplits statements into two categories. &ldquo;Declarations&rdquo; are those statements that\nbind a new name to a value. 
The other kinds of statements<span class=\"em\">&mdash;</span>control flow,\nprint, etc.<span class=\"em\">&mdash;</span>are just called &ldquo;statements&rdquo;. We disallow declarations directly\ninside control flow statements, like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">if</span> (<span class=\"i\">monday</span>) <span class=\"k\">var</span> <span class=\"i\">croissant</span> = <span class=\"s\">&quot;yes&quot;</span>; <span class=\"c\">// Error.</span>\n</pre></div>\n<p>Allowing it would raise confusing questions around the scope of the variable.\nSo, like other languages, we prohibit it syntactically by having a separate\ngrammar rule for the subset of statements that <em>are</em> allowed inside a control\nflow body.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">statement</span>      → <span class=\"i\">exprStmt</span>\n               | <span class=\"i\">forStmt</span>\n               | <span class=\"i\">ifStmt</span>\n               | <span class=\"i\">printStmt</span>\n               | <span class=\"i\">returnStmt</span>\n               | <span class=\"i\">whileStmt</span>\n               | <span class=\"i\">block</span> ;\n</pre></div>\n<p>Then we use a separate rule for the top level of a script and inside a block.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">declaration</span>    → <span class=\"i\">classDecl</span>\n               | <span class=\"i\">funDecl</span>\n               | <span class=\"i\">varDecl</span>\n               | <span class=\"i\">statement</span> ;\n</pre></div>\n<p>The <code>declaration</code> rule contains the statements that declare names, and also\nincludes <code>statement</code> so that all statement types are allowed. Since <code>block</code>\nitself is in <code>statement</code>, you can put declarations <span\nname=\"parens\">inside</span> a control flow construct by nesting them inside a\nblock.</p>\n<aside name=\"parens\">\n<p>Blocks work sort of like parentheses do for expressions. 
A block lets you put\nthe &ldquo;lower-precedence&rdquo; declaration statements in places where only a\n&ldquo;higher-precedence&rdquo; non-declaring statement is allowed.</p>\n</aside>\n<p>In this chapter, we&rsquo;ll cover only a couple of statements and one\ndeclaration.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">statement</span>      → <span class=\"i\">exprStmt</span>\n               | <span class=\"i\">printStmt</span> ;\n\n<span class=\"i\">declaration</span>    → <span class=\"i\">varDecl</span>\n               | <span class=\"i\">statement</span> ;\n</pre></div>\n<p>Up to now, our VM considered a &ldquo;program&rdquo; to be a single expression since that&rsquo;s\nall we could parse and compile. In a full Lox implementation, a program is a\nsequence of declarations. We&rsquo;re ready to support that now.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  advance();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>compile</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">while</span> (!<span class=\"i\">match</span>(<span class=\"a\">TOKEN_EOF</span>)) {\n    <span class=\"i\">declaration</span>();\n  }\n\n</pre><pre class=\"insert-after\">  endCompiler();\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>compile</em>(), replace 2 lines</div>\n\n<p>We keep compiling declarations until we hit the end of the source file. 
We\ncompile a single declaration using this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>expression</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">declaration</span>() {\n  <span class=\"i\">statement</span>();\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>expression</em>()</div>\n\n<p>We&rsquo;ll get to variable declarations later in the chapter, so for now, we simply\nforward to <code>statement()</code>.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>declaration</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">statement</span>() {\n  <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_PRINT</span>)) {\n    <span class=\"i\">printStatement</span>();\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>declaration</em>()</div>\n\n<p>Blocks can contain declarations, and control flow statements can contain other\nstatements. That means these two functions will eventually be recursive. 
We may\nas well write out the forward declarations now.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void expression();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>expression</em>()</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">statement</span>();\n<span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">declaration</span>();\n</pre><pre class=\"insert-after\">static ParseRule* getRule(TokenType type);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>expression</em>()</div>\n\n<h3><a href=\"#print-statements\" id=\"print-statements\"><small>21&#8202;.&#8202;1&#8202;.&#8202;1</small>Print statements</a></h3>\n<p>We have two statement types to support in this chapter. Let&rsquo;s start with <code>print</code>\nstatements, which begin, naturally enough, with a <code>print</code> token. We detect that\nusing this helper function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>consume</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">match</span>(<span class=\"t\">TokenType</span> <span class=\"i\">type</span>) {\n  <span class=\"k\">if</span> (!<span class=\"i\">check</span>(<span class=\"i\">type</span>)) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n  <span class=\"i\">advance</span>();\n  <span class=\"k\">return</span> <span class=\"k\">true</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>consume</em>()</div>\n\n<p>You may recognize it from jlox. If the current token has the given type, we\nconsume the token and return <code>true</code>. Otherwise we leave the token alone and\nreturn <code>false</code>. 
This <span name=\"turtles\">helper</span> function is implemented\nin terms of this other helper:</p>\n<aside name=\"turtles\">\n<p>It&rsquo;s helpers all the way down!</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>consume</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">check</span>(<span class=\"t\">TokenType</span> <span class=\"i\">type</span>) {\n  <span class=\"k\">return</span> <span class=\"i\">parser</span>.<span class=\"i\">current</span>.<span class=\"i\">type</span> == <span class=\"i\">type</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>consume</em>()</div>\n\n<p>The <code>check()</code> function returns <code>true</code> if the current token has the given type.\nIt seems a little <span name=\"read\">silly</span> to wrap this in a function, but\nwe&rsquo;ll use it more later, and I think short verb-named functions like this make\nthe parser easier to read.</p>\n<aside name=\"read\">\n<p>This sounds trivial, but handwritten parsers for non-toy languages get pretty\nbig. 
When you have thousands of lines of code, a utility function that turns two\nlines into one and makes the result a little more readable easily earns its\nkeep.</p>\n</aside>\n<p>If we did match the <code>print</code> token, then we compile the rest of the statement\nhere:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>expression</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">printStatement</span>() {\n  <span class=\"i\">expression</span>();\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_SEMICOLON</span>, <span class=\"s\">&quot;Expect &#39;;&#39; after value.&quot;</span>);\n  <span class=\"i\">emitByte</span>(<span class=\"a\">OP_PRINT</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>expression</em>()</div>\n\n<p>A <code>print</code> statement evaluates an expression and prints the result, so we first\nparse and compile that expression. The grammar expects a semicolon after that,\nso we consume it. 
Finally, we emit a new instruction to print the result.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_NEGATE,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_PRINT</span>,\n</pre><pre class=\"insert-after\">  OP_RETURN,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>At runtime, we execute this instruction like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        break;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_PRINT</span>: {\n        <span class=\"i\">printValue</span>(<span class=\"i\">pop</span>());\n        <span class=\"i\">printf</span>(<span class=\"s\">&quot;</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_RETURN: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>When the interpreter reaches this instruction, it has already executed the code\nfor the expression, leaving the result value on top of the stack. Now we simply\npop and print it.</p>\n<p>Note that we don&rsquo;t push anything else after that. This is a key difference\nbetween expressions and statements in the VM. Every bytecode instruction has a\n<span name=\"effect\"><strong>stack effect</strong></span> that describes how the instruction\nmodifies the stack. 
For example, <code>OP_ADD</code> pops two values and pushes one,\nleaving the stack one element smaller than before.</p>\n<aside name=\"effect\">\n<p>The stack is one element shorter after an <code>OP_ADD</code>, so its effect is -1:</p><img src=\"image/global-variables/stack-effect.png\" alt=\"The stack effect of an OP_ADD instruction.\" />\n</aside>\n<p>You can sum the stack effects of a series of instructions to get their total\neffect. When you add the stack effects of the series of instructions compiled\nfrom any complete expression, it will total one. Each expression leaves one\nresult value on the stack.</p>\n<p>The bytecode for an entire statement has a total stack effect of zero. Since a\nstatement produces no values, it ultimately leaves the stack unchanged, though\nit of course uses the stack while it&rsquo;s doing its thing. This is important\nbecause when we get to control flow and looping, a program might execute a long\nseries of statements. If each statement grew or shrank the stack, it might\neventually overflow or underflow.</p>\n<p>While we&rsquo;re in the interpreter loop, we should delete a bit of code.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_RETURN: {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">        <span class=\"c\">// Exit interpreter.</span>\n</pre><pre class=\"insert-after\">        return INTERPRET_OK;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 2 lines</div>\n\n<p>When the VM only compiled and evaluated a single expression, we had some\ntemporary code in <code>OP_RETURN</code> to output the value. Now that we have statements\nand <code>print</code>, we don&rsquo;t need that anymore. We&rsquo;re one <span\nname=\"return\">step</span> closer to the complete implementation of clox.</p>\n<aside name=\"return\">\n<p>We&rsquo;re only one step closer, though. 
We will revisit <code>OP_RETURN</code> again when we\nadd functions. Right now, it exits the entire interpreter loop.</p>\n</aside>\n<p>As usual, a new instruction needs support in the disassembler.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return simpleInstruction(&quot;OP_NEGATE&quot;, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_PRINT</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_PRINT&quot;</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_RETURN:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>That&rsquo;s our <code>print</code> statement. If you want, give it a whirl:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> <span class=\"n\">1</span> + <span class=\"n\">2</span>;\n<span class=\"k\">print</span> <span class=\"n\">3</span> * <span class=\"n\">4</span>;\n</pre></div>\n<p>Exciting! OK, maybe not thrilling, but we can build scripts that contain as many\nstatements as we want now, which feels like progress.</p>\n<h3><a href=\"#expression-statements\" id=\"expression-statements\"><small>21&#8202;.&#8202;1&#8202;.&#8202;2</small>Expression statements</a></h3>\n<p>Wait until you see the next statement. 
If we <em>don&rsquo;t</em> see a <code>print</code> keyword, then\nwe must be looking at an expression statement.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    printStatement();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>statement</em>()</div>\n<pre class=\"insert\">  } <span class=\"k\">else</span> {\n    <span class=\"i\">expressionStatement</span>();\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>statement</em>()</div>\n\n<p>It&rsquo;s parsed like so:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>expression</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">expressionStatement</span>() {\n  <span class=\"i\">expression</span>();\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_SEMICOLON</span>, <span class=\"s\">&quot;Expect &#39;;&#39; after expression.&quot;</span>);\n  <span class=\"i\">emitByte</span>(<span class=\"a\">OP_POP</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>expression</em>()</div>\n\n<p>An &ldquo;expression statement&rdquo; is simply an expression followed by a semicolon.\nThey&rsquo;re how you write an expression in a context where a statement is expected.\nUsually, it&rsquo;s so that you can call a function or evaluate an assignment for its\nside effect, like this:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">brunch</span> = <span class=\"s\">&quot;quiche&quot;</span>;\n<span class=\"i\">eat</span>(<span class=\"i\">brunch</span>);\n</pre></div>\n<p>Semantically, an expression statement evaluates the expression and discards the\nresult. The compiler directly encodes that behavior. 
It compiles the expression,\nand then emits an <code>OP_POP</code> instruction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_FALSE,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_POP</span>,\n</pre><pre class=\"insert-after\">  OP_EQUAL,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>As the name implies, that instruction pops the top value off the stack and\nforgets it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_FALSE: push(BOOL_VAL(false)); break;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_POP</span>: <span class=\"i\">pop</span>(); <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      case OP_EQUAL: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>We can disassemble it too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return simpleInstruction(&quot;OP_FALSE&quot;, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_POP</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_POP&quot;</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_EQUAL:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>Expression statements aren&rsquo;t very useful yet since we can&rsquo;t create any\nexpressions that have side effects, but they&rsquo;ll be essential when we\n<a href=\"calls-and-functions.html\">add functions later</a>. 
The <span name=\"majority\">majority</span> of\nstatements in real-world code in languages like C are expression statements.</p>\n<aside name=\"majority\">\n<p>By my count, 80 of the 149 statements in the version of &ldquo;compiler.c&rdquo; that we\nhave at the end of this chapter are expression statements.</p>\n</aside>\n<h3><a href=\"#error-synchronization\" id=\"error-synchronization\"><small>21&#8202;.&#8202;1&#8202;.&#8202;3</small>Error synchronization</a></h3>\n<p>While we&rsquo;re getting this initial work done in the compiler, we can tie off a\nloose end we left <a href=\"compiling-expressions.html#handling-syntax-errors\">several chapters back</a>. Like jlox, clox uses panic\nmode error recovery to minimize the number of cascaded compile errors that it\nreports. The compiler exits panic mode when it reaches a synchronization point.\nFor Lox, we chose statement boundaries as that point. Now that we have\nstatements, we can implement synchronization.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  statement();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>declaration</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">if</span> (<span class=\"i\">parser</span>.<span class=\"i\">panicMode</span>) <span class=\"i\">synchronize</span>();\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>declaration</em>()</div>\n\n<p>If we hit a compile error while parsing the previous statement, we enter panic\nmode. 
When that happens, after the statement we start synchronizing.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>printStatement</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">synchronize</span>() {\n  <span class=\"i\">parser</span>.<span class=\"i\">panicMode</span> = <span class=\"k\">false</span>;\n\n  <span class=\"k\">while</span> (<span class=\"i\">parser</span>.<span class=\"i\">current</span>.<span class=\"i\">type</span> != <span class=\"a\">TOKEN_EOF</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">type</span> == <span class=\"a\">TOKEN_SEMICOLON</span>) <span class=\"k\">return</span>;\n    <span class=\"k\">switch</span> (<span class=\"i\">parser</span>.<span class=\"i\">current</span>.<span class=\"i\">type</span>) {\n      <span class=\"k\">case</span> <span class=\"a\">TOKEN_CLASS</span>:\n      <span class=\"k\">case</span> <span class=\"a\">TOKEN_FUN</span>:\n      <span class=\"k\">case</span> <span class=\"a\">TOKEN_VAR</span>:\n      <span class=\"k\">case</span> <span class=\"a\">TOKEN_FOR</span>:\n      <span class=\"k\">case</span> <span class=\"a\">TOKEN_IF</span>:\n      <span class=\"k\">case</span> <span class=\"a\">TOKEN_WHILE</span>:\n      <span class=\"k\">case</span> <span class=\"a\">TOKEN_PRINT</span>:\n      <span class=\"k\">case</span> <span class=\"a\">TOKEN_RETURN</span>:\n        <span class=\"k\">return</span>;\n\n      <span class=\"k\">default</span>:\n        ; <span class=\"c\">// Do nothing.</span>\n    }\n\n    <span class=\"i\">advance</span>();\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>printStatement</em>()</div>\n\n<p>We skip tokens indiscriminately until we reach something that looks like a\nstatement boundary. 
We recognize the boundary by looking for a preceding token\nthat can end a statement, like a semicolon. Or we&rsquo;ll look for a subsequent token\nthat begins a statement, usually one of the control flow or declaration\nkeywords.</p>\n<h2><a href=\"#variable-declarations\" id=\"variable-declarations\"><small>21&#8202;.&#8202;2</small>Variable Declarations</a></h2>\n<p>Merely being able to <em>print</em> doesn&rsquo;t win your language any prizes at the\nprogramming language <span name=\"fair\">fair</span>, so let&rsquo;s move on to\nsomething a little more ambitious and get variables going. There are three\noperations we need to support:</p>\n<aside name=\"fair\">\n<p>I can&rsquo;t help but imagine a &ldquo;language fair&rdquo; like some country 4H thing. Rows of\nstraw-lined stalls full of baby languages <em>moo</em>ing and <em>baa</em>ing at each other.</p>\n</aside>\n<ul>\n<li>Declaring a new variable using a <code>var</code> statement.</li>\n<li>Accessing the value of a variable using an identifier expression.</li>\n<li>Storing a new value in an existing variable using an assignment expression.</li>\n</ul>\n<p>We can&rsquo;t do either of the last two until we have some variables, so we start\nwith declarations.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void declaration() {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>declaration</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_VAR</span>)) {\n    <span class=\"i\">varDeclaration</span>();\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">statement</span>();\n  }\n</pre><pre class=\"insert-after\">\n\n  if (parser.panicMode) synchronize();\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>declaration</em>(), replace 1 line</div>\n\n<p>The placeholder parsing function we sketched out for the declaration grammar\nrule has an actual 
production now. If we match a <code>var</code> token, we jump here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>expression</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">varDeclaration</span>() {\n  <span class=\"t\">uint8_t</span> <span class=\"i\">global</span> = <span class=\"i\">parseVariable</span>(<span class=\"s\">&quot;Expect variable name.&quot;</span>);\n\n  <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_EQUAL</span>)) {\n    <span class=\"i\">expression</span>();\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">emitByte</span>(<span class=\"a\">OP_NIL</span>);\n  }\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_SEMICOLON</span>,\n          <span class=\"s\">&quot;Expect &#39;;&#39; after variable declaration.&quot;</span>);\n\n  <span class=\"i\">defineVariable</span>(<span class=\"i\">global</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>expression</em>()</div>\n\n<p>The keyword is followed by the variable name. That&rsquo;s compiled by\n<code>parseVariable()</code>, which we&rsquo;ll get to in a second. Then we look for an <code>=</code>\nfollowed by an initializer expression. If the user doesn&rsquo;t initialize the\nvariable, the compiler implicitly initializes it to <span\nname=\"nil\"><code>nil</code></span> by emitting an <code>OP_NIL</code> instruction. 
Either way, we\nexpect the statement to be terminated with a semicolon.</p>\n<aside name=\"nil\" class=\"bottom\">\n<p>Essentially, the compiler desugars a variable declaration like:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span>;\n</pre></div>\n<p>into:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"k\">nil</span>;\n</pre></div>\n<p>The code it generates for the former is identical to what it produces for the\nlatter.</p>\n</aside>\n<p>There are two new functions here for working with variables and identifiers.\nHere is the first:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void parsePrecedence(Precedence precedence);\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>parsePrecedence</em>()</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">uint8_t</span> <span class=\"i\">parseVariable</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">errorMessage</span>) {\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_IDENTIFIER</span>, <span class=\"i\">errorMessage</span>);\n  <span class=\"k\">return</span> <span class=\"i\">identifierConstant</span>(&amp;<span class=\"i\">parser</span>.<span class=\"i\">previous</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>parsePrecedence</em>()</div>\n\n<p>It requires the next token to be an identifier, which it consumes and sends\nhere:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void parsePrecedence(Precedence precedence);\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>parsePrecedence</em>()</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">uint8_t</span> <span class=\"i\">identifierConstant</span>(<span class=\"t\">Token</span>* <span class=\"i\">name</span>) {\n  <span 
class=\"k\">return</span> <span class=\"i\">makeConstant</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">copyString</span>(<span class=\"i\">name</span>-&gt;<span class=\"i\">start</span>,\n                                         <span class=\"i\">name</span>-&gt;<span class=\"i\">length</span>)));\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>parsePrecedence</em>()</div>\n\n<p>This function takes the given token and adds its lexeme to the chunk&rsquo;s constant\ntable as a string. It then returns the index of that constant in the constant\ntable.</p>\n<p>Global variables are looked up <em>by name</em> at runtime. That means the VM<span class=\"em\">&mdash;</span>the\nbytecode interpreter loop<span class=\"em\">&mdash;</span>needs access to the name. A whole string is too big\nto stuff into the bytecode stream as an operand. Instead, we store the string in\nthe constant table and the instruction then refers to the name by its index in\nthe table.</p>\n<p>This function returns that index all the way to <code>varDeclaration()</code> which later\nhands it over to here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>parseVariable</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">defineVariable</span>(<span class=\"t\">uint8_t</span> <span class=\"i\">global</span>) {\n  <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_DEFINE_GLOBAL</span>, <span class=\"i\">global</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>parseVariable</em>()</div>\n\n<p><span name=\"helper\">This</span> outputs the bytecode instruction that defines\nthe new variable and stores its initial value. The index of the variable&rsquo;s name\nin the constant table is the instruction&rsquo;s operand. As usual in a stack-based\nVM, we emit this instruction last. 
At runtime, we execute the code for the\nvariable&rsquo;s initializer first. That leaves the value on the stack. Then this\ninstruction takes that value and stores it away for later.</p>\n<aside name=\"helper\">\n<p>I know some of these functions seem pretty pointless right now. But we&rsquo;ll get\nmore mileage out of them as we add more language features for working with\nnames. Function and class declarations both declare new variables, and variable\nand assignment expressions access them.</p>\n</aside>\n<p>Over in the runtime, we begin with this new instruction:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_POP,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_DEFINE_GLOBAL</span>,\n</pre><pre class=\"insert-after\">  OP_EQUAL,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>Thanks to our handy-dandy hash table, the implementation isn&rsquo;t too hard.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_POP: pop(); break;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_DEFINE_GLOBAL</span>: {\n        <span class=\"t\">ObjString</span>* <span class=\"i\">name</span> = <span class=\"a\">READ_STRING</span>();\n        <span class=\"i\">tableSet</span>(&amp;<span class=\"i\">vm</span>.<span class=\"i\">globals</span>, <span class=\"i\">name</span>, <span class=\"i\">peek</span>(<span class=\"n\">0</span>));\n        <span class=\"i\">pop</span>();\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_EQUAL: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>We get the name of the variable from the constant table. 
Then we <span\nname=\"pop\">take</span> the value from the top of the stack and store it in a\nhash table with that name as the key.</p>\n<aside name=\"pop\">\n<p>Note that we don&rsquo;t <em>pop</em> the value until <em>after</em> we add it to the hash table.\nThat ensures the VM can still find the value if a garbage collection is\ntriggered right in the middle of adding it to the hash table. That&rsquo;s a distinct\npossibility since the hash table requires dynamic allocation when it resizes.</p>\n</aside>\n<p>This code doesn&rsquo;t check to see if the key is already in the table. Lox is pretty\nlax with global variables and lets you redefine them without error. That&rsquo;s\nuseful in a REPL session, so the VM supports that by simply overwriting the\nvalue if the key happens to already be in the hash table.</p>\n<p>There&rsquo;s another little helper macro:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define READ_CONSTANT() (vm.chunk-&gt;constants.values[READ_BYTE()])\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#define READ_STRING() AS_STRING(READ_CONSTANT())</span>\n</pre><pre class=\"insert-after\">#define BINARY_OP(valueType, op) \\\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>It reads a one-byte operand from the bytecode chunk. It treats that as an index\ninto the chunk&rsquo;s constant table and returns the string at that index. It doesn&rsquo;t\ncheck that the value <em>is</em> a string<span class=\"em\">&mdash;</span>it just indiscriminately casts it. 
That&rsquo;s\nsafe because the compiler never emits an instruction that refers to a non-string\nconstant.</p>\n<p>Because we care about lexical hygiene, we also undefine this macro at the end of\nthe <code>run()</code> function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#undef READ_CONSTANT\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#undef READ_STRING</span>\n</pre><pre class=\"insert-after\">#undef BINARY_OP\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>I keep saying &ldquo;the hash table&rdquo;, but we don&rsquo;t actually have one yet. We need a\nplace to store these globals. Since we want them to persist as long as clox is\nrunning, we store them right in the VM.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  Value* stackTop;\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nin struct <em>VM</em></div>\n<pre class=\"insert\">  <span class=\"t\">Table</span> <span class=\"i\">globals</span>;\n</pre><pre class=\"insert-after\">  Table strings;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, in struct <em>VM</em></div>\n\n<p>As we did with the string table, we need to initialize the hash table to a valid\nstate when the VM boots up.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  vm.objects = NULL;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>initVM</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">initTable</span>(&amp;<span class=\"i\">vm</span>.<span class=\"i\">globals</span>);\n</pre><pre class=\"insert-after\">  initTable(&amp;vm.strings);\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>initVM</em>()</div>\n\n<p>And we <span name=\"tear\">tear</span> it down when we exit.</p>\n<aside name=\"tear\">\n<p>The process will free everything on exit, but it feels undignified to require\nthe operating system to clean up our 
mess.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">void freeVM() {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>freeVM</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">freeTable</span>(&amp;<span class=\"i\">vm</span>.<span class=\"i\">globals</span>);\n</pre><pre class=\"insert-after\">  freeTable(&amp;vm.strings);\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>freeVM</em>()</div>\n\n<p>As usual, we want to be able to disassemble the new instruction too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return simpleInstruction(&quot;OP_POP&quot;, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_DEFINE_GLOBAL</span>:\n      <span class=\"k\">return</span> <span class=\"i\">constantInstruction</span>(<span class=\"s\">&quot;OP_DEFINE_GLOBAL&quot;</span>, <span class=\"i\">chunk</span>,\n                                 <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_EQUAL:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>And with that, we can define global variables. Not that users can <em>tell</em> that\nthey&rsquo;ve done so, because they can&rsquo;t actually <em>use</em> them. So let&rsquo;s fix that next.</p>\n<h2><a href=\"#reading-variables\" id=\"reading-variables\"><small>21&#8202;.&#8202;3</small>Reading Variables</a></h2>\n<p>As in every programming language ever, we access a variable&rsquo;s value using its\nname. 
We hook up identifier tokens to the expression parser here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  [TOKEN_LESS_EQUAL]    = {NULL,     binary, PREC_COMPARISON},\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_IDENTIFIER</span>]    = {<span class=\"i\">variable</span>, <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_STRING]        = {string,   NULL,   PREC_NONE},\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, replace 1 line</div>\n\n<p>That calls this new parser function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>string</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">variable</span>() {\n  <span class=\"i\">namedVariable</span>(<span class=\"i\">parser</span>.<span class=\"i\">previous</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>string</em>()</div>\n\n<p>Like with declarations, there are a couple of tiny helper functions that seem\npointless now but will become more useful in later chapters. 
I promise.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>string</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">namedVariable</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>) {\n  <span class=\"t\">uint8_t</span> <span class=\"i\">arg</span> = <span class=\"i\">identifierConstant</span>(&amp;<span class=\"i\">name</span>);\n  <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_GET_GLOBAL</span>, <span class=\"i\">arg</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>string</em>()</div>\n\n<p>This calls the same <code>identifierConstant()</code> function from before to take the\ngiven identifier token and add its lexeme to the chunk&rsquo;s constant table as a\nstring. All that remains is to emit an instruction that loads the global\nvariable with that name. Here&rsquo;s the instruction:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_POP,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_GET_GLOBAL</span>,\n</pre><pre class=\"insert-after\">  OP_DEFINE_GLOBAL,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>Over in the interpreter, the implementation mirrors <code>OP_DEFINE_GLOBAL</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_POP: pop(); break;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_GET_GLOBAL</span>: {\n        <span class=\"t\">ObjString</span>* <span class=\"i\">name</span> = <span class=\"a\">READ_STRING</span>();\n        <span class=\"t\">Value</span> <span class=\"i\">value</span>;\n        <span class=\"k\">if</span> (!<span class=\"i\">tableGet</span>(&amp;<span 
class=\"i\">vm</span>.<span class=\"i\">globals</span>, <span class=\"i\">name</span>, &amp;<span class=\"i\">value</span>)) {\n          <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Undefined variable &#39;%s&#39;.&quot;</span>, <span class=\"i\">name</span>-&gt;<span class=\"i\">chars</span>);\n          <span class=\"k\">return</span> <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>;\n        }\n        <span class=\"i\">push</span>(<span class=\"i\">value</span>);\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_DEFINE_GLOBAL: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>We pull the constant table index from the instruction&rsquo;s operand and get the\nvariable name. Then we use that as a key to look up the variable&rsquo;s value in the\nglobals hash table.</p>\n<p>If the key isn&rsquo;t present in the hash table, it means that global variable has\nnever been defined. That&rsquo;s a runtime error in Lox, so we report it and exit the\ninterpreter loop if that happens. Otherwise, we take the value and push it\nonto the stack.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return simpleInstruction(&quot;OP_POP&quot;, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_GET_GLOBAL</span>:\n      <span class=\"k\">return</span> <span class=\"i\">constantInstruction</span>(<span class=\"s\">&quot;OP_GET_GLOBAL&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_DEFINE_GLOBAL:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>A little bit of disassembling, and we&rsquo;re done. 
Our interpreter is now able to\nrun code like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">beverage</span> = <span class=\"s\">&quot;cafe au lait&quot;</span>;\n<span class=\"k\">var</span> <span class=\"i\">breakfast</span> = <span class=\"s\">&quot;beignets with &quot;</span> + <span class=\"i\">beverage</span>;\n<span class=\"k\">print</span> <span class=\"i\">breakfast</span>;\n</pre></div>\n<p>There&rsquo;s only one operation left.</p>\n<h2><a href=\"#assignment\" id=\"assignment\"><small>21&#8202;.&#8202;4</small>Assignment</a></h2>\n<p>Throughout this book, I&rsquo;ve tried to keep you on a fairly safe and easy path. I\ndon&rsquo;t avoid hard <em>problems</em>, but I try to not make the <em>solutions</em> more complex\nthan they need to be. Alas, other design choices in our <span\nname=\"jlox\">bytecode</span> compiler make assignment annoying to implement.</p>\n<aside name=\"jlox\">\n<p>If you recall, assignment was pretty easy in jlox.</p>\n</aside>\n<p>Our bytecode VM uses a single-pass compiler. It parses and generates bytecode\non the fly without any intermediate AST. As soon as it recognizes a piece of\nsyntax, it emits code for it. Assignment doesn&rsquo;t naturally fit that. Consider:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">menu</span>.<span class=\"i\">brunch</span>(<span class=\"i\">sunday</span>).<span class=\"i\">beverage</span> = <span class=\"s\">&quot;mimosa&quot;</span>;\n</pre></div>\n<p>In this code, the parser doesn&rsquo;t realize <code>menu.brunch(sunday).beverage</code> is the\ntarget of an assignment and not a normal expression until it reaches <code>=</code>, many\ntokens after the first <code>menu</code>. By then, the compiler has already emitted\nbytecode for the whole thing.</p>\n<p>The problem is not as dire as it might seem, though. 
Look at how the parser sees that example:</p><img src=\"image/global-variables/setter.png\" alt=\"The 'menu.brunch(sunday).beverage = &quot;mimosa&quot;' statement, showing that 'menu.brunch(sunday)' is an expression.\" />\n<p>Even though the <code>.beverage</code> part must not be compiled as a get expression,\neverything to the left of the <code>.</code> is an expression, with the normal expression\nsemantics. The <code>menu.brunch(sunday)</code> part can be compiled and executed as usual.</p>\n<p>Fortunately for us, the only semantic differences on the left side of an\nassignment appear at the very right-most end of the tokens, immediately\npreceding the <code>=</code>. Even though the receiver of a setter may be an arbitrarily\nlong expression, the part whose behavior differs from a get expression is only\nthe trailing identifier, which is right before the <code>=</code>. We don&rsquo;t need much\nlookahead to realize <code>beverage</code> should be compiled as a set expression and not a\ngetter.</p>\n<p>Variables are even easier since they are just a single bare identifier before an\n<code>=</code>. The idea then is that right <em>before</em> compiling an expression that can also\nbe used as an assignment target, we look for a subsequent <code>=</code> token. 
If we see\none, we compile it as an assignment or setter instead of a variable access or\ngetter.</p>\n<p>We don&rsquo;t have setters to worry about yet, so all we need to handle are variables.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  uint8_t arg = identifierConstant(&amp;name);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>namedVariable</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_EQUAL</span>)) {\n    <span class=\"i\">expression</span>();\n    <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_SET_GLOBAL</span>, <span class=\"i\">arg</span>);\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_GET_GLOBAL</span>, <span class=\"i\">arg</span>);\n  }\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>namedVariable</em>(), replace 1 line</div>\n\n<p>In the parse function for identifier expressions, we look for an equals sign\nafter the identifier. 
If we find one, instead of emitting code for a variable\naccess, we compile the assigned value and then emit an assignment instruction.</p>\n<p>That&rsquo;s the last instruction we need to add in this chapter.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_DEFINE_GLOBAL,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_SET_GLOBAL</span>,\n</pre><pre class=\"insert-after\">  OP_EQUAL,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>As you&rsquo;d expect, its runtime behavior is similar to defining a new variable.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_SET_GLOBAL</span>: {\n        <span class=\"t\">ObjString</span>* <span class=\"i\">name</span> = <span class=\"a\">READ_STRING</span>();\n        <span class=\"k\">if</span> (<span class=\"i\">tableSet</span>(&amp;<span class=\"i\">vm</span>.<span class=\"i\">globals</span>, <span class=\"i\">name</span>, <span class=\"i\">peek</span>(<span class=\"n\">0</span>))) {\n          <span class=\"i\">tableDelete</span>(&amp;<span class=\"i\">vm</span>.<span class=\"i\">globals</span>, <span class=\"i\">name</span>);<span name=\"delete\"> </span>\n          <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Undefined variable &#39;%s&#39;.&quot;</span>, <span class=\"i\">name</span>-&gt;<span class=\"i\">chars</span>);\n          <span class=\"k\">return</span> <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>;\n        }\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_EQUAL: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>The main difference is what happens when the key 
doesn&rsquo;t already exist in the\nglobals hash table. If the variable hasn&rsquo;t been defined yet, it&rsquo;s a runtime\nerror to try to assign to it. Lox <a href=\"statements-and-state.html#design-note\">doesn&rsquo;t do implicit variable\ndeclaration</a>.</p>\n<aside name=\"delete\">\n<p>The call to <code>tableSet()</code> stores the value in the global variable table even if\nthe variable wasn&rsquo;t previously defined. That fact is visible in a REPL session,\nsince it keeps running even after the runtime error is reported. So we also take\ncare to delete that zombie value from the table.</p>\n</aside>\n<p>The other difference is that setting a variable doesn&rsquo;t pop the value off the\nstack. Remember, assignment is an expression, so it needs to leave that value\nthere in case the assignment is nested inside some larger expression.</p>\n<p>Add a dash of disassembly:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return constantInstruction(&quot;OP_DEFINE_GLOBAL&quot;, chunk,\n                                 offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_SET_GLOBAL</span>:\n      <span class=\"k\">return</span> <span class=\"i\">constantInstruction</span>(<span class=\"s\">&quot;OP_SET_GLOBAL&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_EQUAL:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>So we&rsquo;re done, right? Well<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>not quite. We&rsquo;ve made a mistake! 
Take a gander at:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">a</span> * <span class=\"i\">b</span> = <span class=\"i\">c</span> + <span class=\"i\">d</span>;\n</pre></div>\n<p>According to Lox&rsquo;s grammar, <code>=</code> has the lowest precedence, so this should be\nparsed roughly like:</p><img src=\"image/global-variables/ast-good.png\" alt=\"The expected parse, like '(a * b) = (c + d)'.\" />\n<p>Obviously, <code>a * b</code> isn&rsquo;t a <span name=\"do\">valid</span> assignment target, so\nthis should be a syntax error. But here&rsquo;s what our parser does:</p>\n<aside name=\"do\">\n<p>Wouldn&rsquo;t it be wild if <code>a * b</code> <em>was</em> a valid assignment target, though? You\ncould imagine some algebra-like language that tried to divide the assigned value\nup in some reasonable way and distribute it to <code>a</code> and <code>b</code><span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>that&rsquo;s probably\na terrible idea.</p>\n</aside>\n<ol>\n<li>First, <code>parsePrecedence()</code> parses <code>a</code> using the <code>variable()</code> prefix parser.</li>\n<li>After that, it enters the infix parsing loop.</li>\n<li>It reaches the <code>*</code> and calls <code>binary()</code>.</li>\n<li>That recursively calls <code>parsePrecedence()</code> to parse the right-hand operand.</li>\n<li>That calls <code>variable()</code> again for parsing <code>b</code>.</li>\n<li>Inside that call to <code>variable()</code>, it looks for a trailing <code>=</code>. 
It sees one\nand thus parses the rest of the line as an assignment.</li>\n</ol>\n<p>In other words, the parser sees the above code like:</p><img src=\"image/global-variables/ast-bad.png\" alt=\"The actual parse, like 'a * (b = c + d)'.\" />\n<p>We&rsquo;ve messed up the precedence handling because <code>variable()</code> doesn&rsquo;t take into\naccount the precedence of the surrounding expression that contains the variable.\nIf the variable happens to be the right-hand side of an infix operator, or the\noperand of a unary operator, then that containing expression is too high\nprecedence to permit the <code>=</code>.</p>\n<p>To fix this, <code>variable()</code> should look for and consume the <code>=</code> only if it&rsquo;s in\nthe context of a low-precedence expression. The code that knows the current\nprecedence is, logically enough, <code>parsePrecedence()</code>. The <code>variable()</code> function\ndoesn&rsquo;t need to know the actual level. It just cares that the precedence is low\nenough to allow assignment, so we pass that fact in as a Boolean.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    error(&quot;Expect expression.&quot;);\n    return;\n  }\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>parsePrecedence</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"t\">bool</span> <span class=\"i\">canAssign</span> = <span class=\"i\">precedence</span> &lt;= <span class=\"a\">PREC_ASSIGNMENT</span>;\n  <span class=\"i\">prefixRule</span>(<span class=\"i\">canAssign</span>);\n</pre><pre class=\"insert-after\">\n\n  while (precedence &lt;= getRule(parser.current.type)-&gt;precedence) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>parsePrecedence</em>(), replace 1 line</div>\n\n<p>Since assignment is the lowest-precedence expression, the only time we allow an\nassignment is when parsing an assignment expression or top-level expression like\nin an expression statement. 
That flag makes its way to the parser function here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nfunction <em>variable</em>()<br>\nreplace 3 lines</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">variable</span>(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n  <span class=\"i\">namedVariable</span>(<span class=\"i\">parser</span>.<span class=\"i\">previous</span>, <span class=\"i\">canAssign</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, function <em>variable</em>(), replace 3 lines</div>\n\n<p>Which passes it through a new parameter:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nfunction <em>namedVariable</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">namedVariable</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>, <span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n</pre><pre class=\"insert-after\">  uint8_t arg = identifierConstant(&amp;name);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, function <em>namedVariable</em>(), replace 1 line</div>\n\n<p>And then finally uses it here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  uint8_t arg = identifierConstant(&amp;name);\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>namedVariable</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">canAssign</span> &amp;&amp; <span class=\"i\">match</span>(<span class=\"a\">TOKEN_EQUAL</span>)) {\n</pre><pre class=\"insert-after\">    expression();\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>namedVariable</em>(), replace 1 line</div>\n\n<p>That&rsquo;s a lot of plumbing to get literally one bit of data to the right place in\nthe compiler, but 
arrived it has. If the variable is nested inside some\nexpression with higher precedence, <code>canAssign</code> will be <code>false</code> and this will\nignore the <code>=</code> even if there is one there. Then <code>namedVariable()</code> returns, and\nexecution eventually makes its way back to <code>parsePrecedence()</code>.</p>\n<p>Then what? What does the compiler do with our broken example from before? Right\nnow, <code>variable()</code> won&rsquo;t consume the <code>=</code>, so that will be the current token. The\ncompiler returns back to <code>parsePrecedence()</code> from the <code>variable()</code> prefix parser\nand then tries to enter the infix parsing loop. There is no parsing function\nassociated with <code>=</code>, so it skips that loop.</p>\n<p>Then <code>parsePrecedence()</code> silently returns back to the caller. That also isn&rsquo;t\nright. If the <code>=</code> doesn&rsquo;t get consumed as part of the expression, nothing else\nis going to consume it. It&rsquo;s an error and we should report it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    infixRule();\n  }\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>parsePrecedence</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">if</span> (<span class=\"i\">canAssign</span> &amp;&amp; <span class=\"i\">match</span>(<span class=\"a\">TOKEN_EQUAL</span>)) {\n    <span class=\"i\">error</span>(<span class=\"s\">&quot;Invalid assignment target.&quot;</span>);\n  }\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>parsePrecedence</em>()</div>\n\n<p>With that, the previous bad program correctly gets an error at compile time. OK,\n<em>now</em> are we done? Still not quite. See, we&rsquo;re passing an argument to one of the\nparse functions. But those functions are stored in a table of function pointers,\nso all of the parse functions need to have the same type. 
Even though most parse\nfunctions don&rsquo;t support being used as an assignment target<span class=\"em\">&mdash;</span>setters are the\n<span name=\"index\">only</span> other one<span class=\"em\">&mdash;</span>our friendly C compiler requires\nthem <em>all</em> to accept the parameter.</p>\n<aside name=\"index\">\n<p>If Lox had arrays and subscript operators like <code>array[index]</code> then an infix <code>[</code>\nwould also allow assignment to support <code>array[index] = value</code>.</p>\n</aside>\n<p>So we&rsquo;re going to finish off this chapter with some grunt work. First, let&rsquo;s go\nahead and pass the flag to the infix parse functions.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    ParseFn infixRule = getRule(parser.previous.type)-&gt;infix;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>parsePrecedence</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"i\">infixRule</span>(<span class=\"i\">canAssign</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>parsePrecedence</em>(), replace 1 line</div>\n\n<p>We&rsquo;ll need that for setters eventually. Then we&rsquo;ll fix the typedef for the\nfunction type.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} Precedence;\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after enum <em>Precedence</em><br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">typedef</span> <span class=\"t\">void</span> (*<span class=\"t\">ParseFn</span>)(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>);\n</pre><pre class=\"insert-after\">\n\ntypedef struct {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after enum <em>Precedence</em>, replace 1 line</div>\n\n<p>And some completely tedious code to accept this parameter in all of our existing\nparse functions. 
Here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nfunction <em>binary</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">binary</span>(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n</pre><pre class=\"insert-after\">  TokenType operatorType = parser.previous.type;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, function <em>binary</em>(), replace 1 line</div>\n\n<p>And here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nfunction <em>literal</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">literal</span>(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n</pre><pre class=\"insert-after\">  switch (parser.previous.type) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, function <em>literal</em>(), replace 1 line</div>\n\n<p>And here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nfunction <em>grouping</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">grouping</span>(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n</pre><pre class=\"insert-after\">  expression();\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, function <em>grouping</em>(), replace 1 line</div>\n\n<p>And here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nfunction <em>number</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">number</span>(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n</pre><pre class=\"insert-after\">  double value = strtod(parser.previous.start, 
NULL);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, function <em>number</em>(), replace 1 line</div>\n\n<p>And here too:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nfunction <em>string</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">string</span>(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n</pre><pre class=\"insert-after\">  emitConstant(OBJ_VAL(copyString(parser.previous.start + 1,\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, function <em>string</em>(), replace 1 line</div>\n\n<p>And, finally:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nfunction <em>unary</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">unary</span>(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n</pre><pre class=\"insert-after\">  TokenType operatorType = parser.previous.type;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, function <em>unary</em>(), replace 1 line</div>\n\n<p>Phew! We&rsquo;re back to a C program we can compile. 
Fire it up and now you can run\nthis:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">breakfast</span> = <span class=\"s\">&quot;beignets&quot;</span>;\n<span class=\"k\">var</span> <span class=\"i\">beverage</span> = <span class=\"s\">&quot;cafe au lait&quot;</span>;\n<span class=\"i\">breakfast</span> = <span class=\"s\">&quot;beignets with &quot;</span> + <span class=\"i\">beverage</span>;\n\n<span class=\"k\">print</span> <span class=\"i\">breakfast</span>;\n</pre></div>\n<p>It&rsquo;s starting to look like real code for an actual language!</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>The compiler adds a global variable&rsquo;s name to the constant table as a string\nevery time an identifier is encountered. It creates a new constant each\ntime, even if that variable name is already in a previous slot in the\nconstant table. That&rsquo;s wasteful in cases where the same variable is\nreferenced multiple times by the same function. That, in turn, increases the\nodds of filling up the constant table and running out of slots since we\nallow only 256 constants in a single chunk.</p>\n<p>Optimize this. How does your optimization affect the performance of the\ncompiler compared to the runtime? Is this the right trade-off?</p>\n</li>\n<li>\n<p>Looking up a global variable by name in a hash table each time it is used\nis pretty slow, even with a good hash table. Can you come up with a more\nefficient way to store and access global variables without changing the\nsemantics?</p>\n</li>\n<li>\n<p>When running in the REPL, a user might write a function that references an\nunknown global variable. 
Then, in the next line, they declare the variable.\nLox should handle this gracefully by not reporting an &ldquo;unknown variable&rdquo;\ncompile error when the function is first defined.</p>\n<p>But when a user runs a Lox <em>script</em>, the compiler has access to the full\ntext of the entire program before any code is run. Consider this program:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">useVar</span>() {\n  <span class=\"k\">print</span> <span class=\"i\">oops</span>;\n}\n\n<span class=\"k\">var</span> <span class=\"i\">ooops</span> = <span class=\"s\">&quot;too many o&#39;s!&quot;</span>;\n</pre></div>\n<p>Here, we can tell statically that <code>oops</code> will not be defined because there\nis <em>no</em> declaration of that global anywhere in the program. Note that\n<code>useVar()</code> is never called either, so even though the variable isn&rsquo;t\ndefined, no runtime error will occur because it&rsquo;s never used either.</p>\n<p>We could report mistakes like this as compile errors, at least when running\nfrom a script. Do you think we should? Justify your answer. What do other\nscripting languages you know do?</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"local-variables.html\" class=\"next\">\n  Next Chapter: &ldquo;Local Variables&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/hash-tables.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Hash Tables &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Hash Tables<small>20</small></a></h3>\n\n<ul>\n    <li><a href=\"#an-array-of-buckets\"><small>20.1</small> An Array of Buckets</a></li>\n    <li><a href=\"#collision-resolution\"><small>20.2</small> Collision Resolution</a></li>\n    <li><a href=\"#hash-functions\"><small>20.3</small> Hash Functions</a></li>\n    <li><a href=\"#building-a-hash-table\"><small>20.4</small> Building a Hash Table</a></li>\n    <li><a href=\"#string-interning\"><small>20.5</small> String Interning</a></li>\n    <li 
class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"strings.html\" title=\"Strings\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"global-variables.html\" title=\"Global Variables\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"strings.html\" title=\"Strings\" class=\"prev\">←</a>\n<a href=\"global-variables.html\" title=\"Global Variables\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Hash Tables<small>20</small></a></h3>\n\n<ul>\n    <li><a href=\"#an-array-of-buckets\"><small>20.1</small> An Array of Buckets</a></li>\n    <li><a href=\"#collision-resolution\"><small>20.2</small> Collision Resolution</a></li>\n    <li><a href=\"#hash-functions\"><small>20.3</small> Hash Functions</a></li>\n    <li><a href=\"#building-a-hash-table\"><small>20.4</small> Building a Hash Table</a></li>\n    <li><a href=\"#string-interning\"><small>20.5</small> String Interning</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"strings.html\" title=\"Strings\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"global-variables.html\" title=\"Global Variables\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">20</div>\n  
<h1>Hash Tables</h1>\n\n<blockquote>\n<p>Hash, x. There is no definition for this word<span class=\"em\">&mdash;</span>nobody knows what hash is.</p>\n<p><cite>Ambrose Bierce, <em>The Unabridged Devil&rsquo;s Dictionary</em></cite></p>\n</blockquote>\n<p>Before we can add variables to our burgeoning virtual machine, we need some way\nto look up a value given a variable&rsquo;s name. Later, when we add classes, we&rsquo;ll\nalso need a way to store fields on instances. The perfect data structure for\nthese problems and others is a hash table.</p>\n<p>You probably already know what a hash table is, even if you don&rsquo;t know it by\nthat name. If you&rsquo;re a Java programmer, you call them &ldquo;HashMaps&rdquo;. C# and Python\nusers call them &ldquo;dictionaries&rdquo;. In C++, it&rsquo;s an &ldquo;unordered map&rdquo;. &ldquo;Objects&rdquo; in\nJavaScript and &ldquo;tables&rdquo; in Lua are hash tables under the hood, which is what\ngives them their flexibility.</p>\n<p>A hash table, whatever your language calls it, associates a set of <strong>keys</strong> with\na set of <strong>values</strong>. Each key/value pair is an <strong>entry</strong> in the table. Given a\nkey, you can look up its corresponding value. You can add new key/value pairs\nand remove entries by key. If you add a new value for an existing key, it\nreplaces the previous entry.</p>\n<p>Hash tables appear in so many languages because they are incredibly powerful.\nMuch of this power comes from one metric: given a key, a hash table returns the\ncorresponding value in <span name=\"constant\">constant time</span>, <em>regardless\nof how many keys are in the hash table</em>.</p>\n<aside name=\"constant\">\n<p>More specifically, the <em>average-case</em> lookup time is constant. Worst-case\nperformance can be, well, worse. In practice, it&rsquo;s easy to avoid degenerate\nbehavior and stay on the happy path.</p>\n</aside>\n<p>That&rsquo;s pretty remarkable when you think about it. 
Imagine you&rsquo;ve got a big stack\nof business cards and I ask you to find a certain person. The bigger the pile\nis, the longer it will take. Even if the pile is nicely sorted and you&rsquo;ve got\nthe manual dexterity to do a binary search by hand, you&rsquo;re still talking\n<em>O(log n)</em>. But with a <span name=\"rolodex\">hash table</span>, it takes the\nsame time to find that business card when the stack has ten cards as when it has\na million.</p>\n<aside name=\"rolodex\">\n<p>Stuff all those cards in a Rolodex<span class=\"em\">&mdash;</span>does anyone even remember those things\nanymore?<span class=\"em\">&mdash;</span>with dividers for each letter, and you improve your speed\ndramatically. As we&rsquo;ll see, that&rsquo;s not too far from the trick a hash table uses.</p>\n</aside>\n<h2><a href=\"#an-array-of-buckets\" id=\"an-array-of-buckets\"><small>20&#8202;.&#8202;1</small>An Array of Buckets</a></h2>\n<p>A complete, fast hash table has a couple of moving parts. I&rsquo;ll introduce them\none at a time by working through a couple of toy problems and their solutions.\nEventually, we&rsquo;ll build up to a data structure that can associate any set of\nnames with their values.</p>\n<p>For now, imagine if Lox was a <em>lot</em> more restricted in variable names. What if a\nvariable&rsquo;s name could only be a <span name=\"basic\">single</span> lowercase\nletter? How could we very efficiently represent a set of variable names and\ntheir values?</p>\n<aside name=\"basic\">\n<p>This limitation isn&rsquo;t <em>too</em> far-fetched. The initial versions of BASIC out of\nDartmouth allowed variable names to be only a single letter followed by one\noptional digit.</p>\n</aside>\n<p>With only 26 possible variables (27 if you consider underscore a &ldquo;letter&rdquo;, I\nguess), the answer is easy. Declare a fixed-size array with 26 elements. We&rsquo;ll\nfollow tradition and call each element a <strong>bucket</strong>. 
Each represents a variable\nwith <code>a</code> starting at index zero. If there&rsquo;s a value in the array at some\nletter&rsquo;s index, then that key is present with that value. Otherwise, the bucket\nis empty and that key/value pair isn&rsquo;t in the data structure.</p>\n<aside name=\"bucket\">\n<p><img src=\"image/hash-tables/bucket-array.png\" alt=\"A row of buckets, each\nlabeled with a letter of the alphabet.\" /></p>\n</aside>\n<p>Memory usage is great<span class=\"em\">&mdash;</span>just a single, reasonably sized <span\nname=\"bucket\">array</span>. There&rsquo;s some waste from the empty buckets, but it&rsquo;s\nnot huge. There&rsquo;s no overhead for node pointers, padding, or other stuff you&rsquo;d\nget with something like a linked list or tree.</p>\n<p>Performance is even better. Given a variable name<span class=\"em\">&mdash;</span>its character<span class=\"em\">&mdash;</span>you can\nsubtract the ASCII value of <code>a</code> and use the result to index directly into the\narray. Then you can either look up the existing value or store a new value\ndirectly into that slot. It doesn&rsquo;t get much faster than that.</p>\n<p>This is sort of our Platonic ideal data structure. Lightning fast, dead simple,\nand compact in memory. As we add support for more complex keys, we&rsquo;ll have to\nmake some concessions, but this is what we&rsquo;re aiming for. Even once you add in\nhash functions, dynamic resizing, and collision resolution, this is still the\ncore of every hash table out there<span class=\"em\">&mdash;</span>a contiguous array of buckets that you\nindex directly into.</p>\n<h3><a href=\"#load-factor-and-wrapped-keys\" id=\"load-factor-and-wrapped-keys\"><small>20&#8202;.&#8202;1&#8202;.&#8202;1</small>Load factor and wrapped keys</a></h3>\n<p>Confining Lox to single-letter variables would make our job as implementers\neasier, but it&rsquo;s probably no fun programming in a language that gives you only\n26 storage locations. 
What if we loosened it a little and allowed variables up\nto <span name=\"six\">eight</span> characters long?</p>\n<aside name=\"six\">\n<p>Again, this restriction isn&rsquo;t so crazy. Early linkers for C treated only the\nfirst six characters of external identifiers as meaningful. Everything after\nthat was ignored. If you&rsquo;ve ever wondered why the C standard library is so\nenamored of abbreviation<span class=\"em\">&mdash;</span>looking at you, <code>strncmp()</code><span class=\"em\">&mdash;</span>it turns out it\nwasn&rsquo;t entirely because of the small screens (or teletypes!) of the day.</p>\n</aside>\n<p>That&rsquo;s small enough that we can pack all eight characters into a 64-bit integer\nand easily turn the string into a number. We can then use it as an array index.\nOr, at least, we could if we could somehow allocate a 295,148 <em>petabyte</em> array.\nMemory&rsquo;s gotten cheaper over time, but not quite <em>that</em> cheap. Even if we could\nmake an array that big, it would be heinously wasteful. Almost every bucket\nwould be empty unless users started writing way bigger Lox programs than we&rsquo;ve\nanticipated.</p>\n<p>Even though our variable keys cover the full 64-bit numeric range, we clearly\ndon&rsquo;t need an array that large. Instead, we allocate an array with more than\nenough capacity for the entries we need, but not unreasonably large. We map the\nfull 64-bit keys down to that smaller range by taking the value modulo the size\nof the array. Doing that essentially folds the larger numeric range onto itself\nuntil it fits the smaller range of array elements.</p>\n<p>For example, say we want to store &ldquo;bagel&rdquo;. We allocate an array with eight\nelements, plenty enough to store it and more later. We treat the key string as a\n64-bit integer. On a little-endian machine like Intel, packing those characters\ninto a 64-bit word puts the first letter, &ldquo;b&rdquo; (ASCII value 98), in the\nleast-significant byte. 
We take that integer modulo the array size (<span\nname=\"power-of-two\">8</span>) to fit it in the bounds and get a bucket index, 2.\nThen we store the value there as usual.</p>\n<aside name=\"power-of-two\">\n<p>I&rsquo;m using powers of two for the array sizes here, but they don&rsquo;t need to be.\nSome styles of hash tables work best with powers of two, including the one we&rsquo;ll\nbuild in this book. Others prefer prime number array sizes or have other rules.</p>\n</aside>\n<p>Using the array size as a modulus lets us map the key&rsquo;s numeric range down to\nfit an array of any size. We can thus control the number of buckets\nindependently of the key range. That solves our waste problem, but introduces a\nnew one. Any two variables whose key number has the same remainder when divided\nby the array size will end up in the same bucket. Keys can <strong>collide</strong>. For\nexample, if we try to add &ldquo;jam&rdquo;, it also ends up in bucket 2.</p><img src=\"image/hash-tables/collision.png\" alt=\"'Bagel' and 'jam' both end up in bucket index 2.\" />\n<p>We have some control over this by tuning the array size. The bigger the array,\nthe fewer the indexes that get mapped to the same bucket and the fewer the\ncollisions that are likely to occur. Hash table implementers track this\ncollision likelihood by measuring the table&rsquo;s <strong>load factor</strong>. It&rsquo;s defined as\nthe number of entries divided by the number of buckets. So a hash table with\nfive entries and an array of 16 elements has a load factor of 0.3125. The higher\nthe load factor, the greater the chance of collisions.</p>\n<p>One way we mitigate collisions is by resizing the array. Just like the dynamic\narrays we implemented earlier, we reallocate and grow the hash table&rsquo;s array as\nit fills up. Unlike a regular dynamic array, though, we won&rsquo;t wait until the\narray is <em>full</em>. 
Instead, we pick a desired load factor and grow the array when\nit goes over that.</p>\n<h2><a href=\"#collision-resolution\" id=\"collision-resolution\"><small>20&#8202;.&#8202;2</small>Collision Resolution</a></h2>\n<p>Even with a very low load factor, collisions can still occur. The <a href=\"https://en.wikipedia.org/wiki/Birthday_problem\"><em>birthday\nparadox</em></a> tells us that as the number of entries in the hash table\nincreases, the chance of collision increases very quickly. We can pick a large\narray size to reduce that, but it&rsquo;s a losing game. Say we wanted to store a\nhundred items in a hash table. To keep the chance of collision below a\nstill-pretty-high 10%, we need an array with at least 47,015 elements. To get\nthe chance below 1% requires an array with 492,555 elements, over 4,000 empty\nbuckets for each one in use.</p>\n<p>A low load factor can make collisions <span name=\"pigeon\">rarer</span>, but the\n<a href=\"https://en.wikipedia.org/wiki/Pigeonhole_principle\"><em>pigeonhole principle</em></a> tells us we can never eliminate them entirely.\nIf you&rsquo;ve got five pet pigeons and four holes to put them in, at least one hole\nis going to end up with more than one pigeon. With 18,446,744,073,709,551,616\ndifferent variable names, any reasonably sized array can potentially end up with\nmultiple keys in the same bucket.</p>\n<p>Thus we still have to handle collisions gracefully when they occur. Users don&rsquo;t\nlike it when their programming language can look up variables correctly only\n<em>most</em> of the time.</p>\n<aside name=\"pigeon\">\n<p>Put these two funny-named mathematical rules together and you get this\nobservation: Take a birdhouse containing 365 pigeonholes, and use each pigeon&rsquo;s\nbirthday to assign it to a pigeonhole. 
You&rsquo;ll need only 23 randomly chosen\npigeons before you get a greater than 50% chance of two pigeons in the same box.</p><img src=\"image/hash-tables/pigeons.png\" alt=\"Two pigeons in the same hole.\" />\n</aside>\n<h3><a href=\"#separate-chaining\" id=\"separate-chaining\"><small>20&#8202;.&#8202;2&#8202;.&#8202;1</small>Separate chaining</a></h3>\n<p>Techniques for resolving collisions fall into two broad categories. The first is\n<strong>separate chaining</strong>. Instead of each bucket containing a single entry, we let\nit contain a collection of them. In the classic implementation, each bucket\npoints to a linked list of entries. To look up an entry, you find its bucket and\nthen walk the list until you find an entry with the matching key.</p><img src=\"image/hash-tables/chaining.png\" alt=\"An array with eight buckets. Bucket 2 links to a chain of two nodes. Bucket 5 links to a single node.\" />\n<p>In catastrophically bad cases where every entry collides in the same bucket, the\ndata structure degrades into a single unsorted linked list with <em>O(n)</em> lookup.\nIn practice, it&rsquo;s easy to avoid that by controlling the load factor and how\nentries get scattered across buckets. In typical separate-chained hash tables,\nit&rsquo;s rare for a bucket to have more than one or two entries.</p>\n<p>Separate chaining is conceptually simple<span class=\"em\">&mdash;</span>it&rsquo;s literally an array of linked\nlists. Most operations are straightforward to implement, even deletion which, as\nwe&rsquo;ll see, can be a pain. But it&rsquo;s not a great fit for modern CPUs. It has a lot\nof overhead from pointers and tends to scatter little linked list <span\nname=\"node\">nodes</span> around in memory which isn&rsquo;t great for cache usage.</p>\n<aside name=\"node\">\n<p>There are a few tricks to optimize this. 
Many implementations store the first\nentry right in the bucket so that in the common case where there&rsquo;s only one, no\nextra pointer indirection is needed. You can also make each linked list node\nstore a few entries to reduce the pointer overhead.</p>\n</aside>\n<h3><a href=\"#open-addressing\" id=\"open-addressing\"><small>20&#8202;.&#8202;2&#8202;.&#8202;2</small>Open addressing</a></h3>\n<p>The other technique is <span name=\"open\">called</span> <strong>open addressing</strong> or\n(confusingly) <strong>closed hashing</strong>. With this technique, all entries live directly\nin the bucket array, with one entry per bucket. If two entries collide in the\nsame bucket, we find a different empty bucket to use instead.</p>\n<aside name=\"open\">\n<p>It&rsquo;s called &ldquo;open&rdquo; addressing because the entry may end up at an address\n(bucket) outside of its preferred one. It&rsquo;s called &ldquo;closed&rdquo; hashing because all\nof the entries stay inside the array of buckets.</p>\n</aside>\n<p>Storing all entries in a single, big, contiguous array is great for keeping the\nmemory representation simple and fast. But it makes all of the operations on the\nhash table more complex. When inserting an entry, its bucket may be full,\nsending us to look at another bucket. That bucket itself may be occupied and so\non. This process of finding an available bucket is called <strong>probing</strong>, and the\norder that you examine buckets is a <strong>probe sequence</strong>.</p>\n<p>There are a <span name=\"probe\">number</span> of algorithms for determining\nwhich buckets to probe and how to decide which entry goes in which bucket.\nThere&rsquo;s been a ton of research here because even slight tweaks can have a large\nperformance impact. 
And, on a data structure as heavily used as hash tables,\nthat performance impact touches a very large number of real-world programs\nacross a range of hardware capabilities.</p>\n<aside name=\"probe\">\n<p>If you&rsquo;d like to learn more (and you should, because some of these are really\ncool), look into &ldquo;double hashing&rdquo;, &ldquo;cuckoo hashing&rdquo;, &ldquo;Robin Hood hashing&rdquo;, and\nanything those lead you to.</p>\n</aside>\n<p>As usual in this book, we&rsquo;ll pick the simplest one that gets the job done\nefficiently. That&rsquo;s good old <strong>linear probing</strong>. When looking for an entry, we\nlook in the first bucket its key maps to. If it&rsquo;s not in there, we look in the\nvery next element in the array, and so on. If we reach the end, we wrap back\naround to the beginning.</p>\n<p>The good thing about linear probing is that it&rsquo;s cache friendly. Since you walk\nthe array directly in memory order, it keeps the CPU&rsquo;s cache lines full and\nhappy. The bad thing is that it&rsquo;s prone to <strong>clustering</strong>. If you have a lot of\nentries with numerically similar key values, you can end up with a lot of\ncolliding, overflowing buckets right next to each other.</p>\n<p>Compared to separate chaining, open addressing can be harder to wrap your head\naround. I think of open addressing as similar to separate chaining except that\nthe &ldquo;list&rdquo; of nodes is threaded through the bucket array itself. Instead of\nstoring the links between them in pointers, the connections are calculated\nimplicitly by the order that you look through the buckets.</p>\n<p>The tricky part is that more than one of these implicit lists may be interleaved\ntogether. Let&rsquo;s walk through an example that covers all the interesting cases.\nWe&rsquo;ll ignore values for now and just worry about a set of keys. 
We start with an\nempty array of 8 buckets.</p><img src=\"image/hash-tables/insert-1.png\" alt=\"An array with eight empty buckets.\" class=\"wide\" />\n<p>We decide to insert &ldquo;bagel&rdquo;. The first letter, &ldquo;b&rdquo; (ASCII value 98), modulo the\narray size (8) puts it in bucket 2.</p><img src=\"image/hash-tables/insert-2.png\" alt=\"Bagel goes into bucket 2.\" class=\"wide\" />\n<p>Next, we insert &ldquo;jam&rdquo;. That also wants to go in bucket 2 (106 mod 8 = 2), but\nthat bucket&rsquo;s taken. We keep probing to the next bucket. It&rsquo;s empty, so we put\nit there.</p><img src=\"image/hash-tables/insert-3.png\" alt=\"Jam goes into bucket 3, since 2 is full.\" class=\"wide\" />\n<p>We insert &ldquo;fruit&rdquo;, which happily lands in bucket 6.</p><img src=\"image/hash-tables/insert-4.png\" alt=\"Fruit goes into bucket 6.\" class=\"wide\" />\n<p>Likewise, &ldquo;migas&rdquo; can go in its preferred bucket 5.</p><img src=\"image/hash-tables/insert-5.png\" alt=\"Migas goes into bucket 5.\" class=\"wide\" />\n<p>When we try to insert &ldquo;eggs&rdquo;, it also wants to be in bucket 5. That&rsquo;s full, so we\nskip to 6. Bucket 6 is also full. Note that the entry in there is <em>not</em> part of\nthe same probe sequence. &ldquo;Fruit&rdquo; is in its preferred bucket, 6. So the 5 and 6\nsequences have collided and are interleaved. We skip over that and finally put\n&ldquo;eggs&rdquo; in bucket 7.</p><img src=\"image/hash-tables/insert-6.png\" alt=\"Eggs goes into bucket 7 because 5 and 6 are full.\" class=\"wide\" />\n<p>We run into a similar problem with &ldquo;nuts&rdquo;. It can&rsquo;t land in 6 like it wants to.\nNor can it go into 7. So we keep going. But we&rsquo;ve reached the end of the array,\nso we wrap back around to 0 and put it there.</p><img src=\"image/hash-tables/insert-7.png\" alt=\"Nuts wraps around to bucket 0 because 6 and 7 are full.\" class=\"wide\" />\n<p>In practice, the interleaving turns out to not be much of a problem. 
Even in\nseparate chaining, we need to walk the list to check each entry&rsquo;s key because\nmultiple keys can reduce to the same bucket. With open addressing, we need to do\nthat same check, and that also covers the case where you are stepping over\nentries that &ldquo;belong&rdquo; to a different original bucket.</p>\n<h2><a href=\"#hash-functions\" id=\"hash-functions\"><small>20&#8202;.&#8202;3</small>Hash Functions</a></h2>\n<p>We can now build ourselves a reasonably efficient table for storing variable\nnames up to eight characters long, but that limitation is still annoying. In\norder to relax the last constraint, we need a way to take a string of any length\nand convert it to a fixed-size integer.</p>\n<p>Finally, we get to the &ldquo;hash&rdquo; part of &ldquo;hash table&rdquo;. A <strong>hash function</strong> takes\nsome larger blob of data and &ldquo;hashes&rdquo; it to produce a fixed-size integer <strong>hash\ncode</strong> whose value depends on all of the bits of the original data. A <span\nname=\"crypto\">good</span> hash function has three main goals:</p>\n<aside name=\"crypto\">\n<p>Hash functions are also used for cryptography. In that domain, &ldquo;good&rdquo; has a\n<em>much</em> more stringent definition to avoid exposing details about the data being\nhashed. We, thankfully, don&rsquo;t need to worry about those concerns for this book.</p>\n</aside>\n<ul>\n<li>\n<p><strong>It must be <em>deterministic</em>.</strong> The same input must always hash to the same\nnumber. If the same variable ends up in different buckets at different\npoints in time, it&rsquo;s gonna get really hard to find it.</p>\n</li>\n<li>\n<p><strong>It must be <em>uniform</em>.</strong> Given a typical set of inputs, it should produce a\nwide and evenly distributed range of output numbers, with as few clumps or\npatterns as possible. 
We want it to <span name=\"scatter\">scatter</span>\nvalues across the whole numeric range to minimize collisions and clustering.</p>\n</li>\n<li>\n<p><strong>It must be <em>fast</em>.</strong> Every operation on the hash table requires us to hash\nthe key first. If hashing is slow, it can potentially cancel out the speed\nof the underlying array storage.</p>\n</li>\n</ul>\n<aside name=\"scatter\">\n<p>One of the original names for a hash table was &ldquo;scatter table&rdquo; because it takes\nthe entries and scatters them throughout the array. The word &ldquo;hash&rdquo; came from\nthe idea that a hash function takes the input data, chops it up, and tosses it\nall together into a pile to come up with a single number from all of those bits.</p>\n</aside>\n<p>There is a veritable pile of hash functions out there. Some are old and\noptimized for architectures no one uses anymore. Some are designed to be fast,\nothers cryptographically secure. Some take advantage of vector instructions and\ncache sizes for specific chips, others aim to maximize portability.</p>\n<p>There are people out there for whom designing and evaluating hash functions is,\nlike, their <em>jam</em>. I admire them, but I&rsquo;m not mathematically astute enough to\n<em>be</em> one. So for clox, I picked a simple, well-worn hash function called\n<a href=\"http://www.isthe.com/chongo/tech/comp/fnv/\">FNV-1a</a> that&rsquo;s served me fine over the years. Consider <span\nname=\"thing\">trying</span> out different ones in your code and see if they make\na difference.</p>\n<aside name=\"thing\">\n<p>Who knows, maybe hash functions could turn out to be your thing too?</p>\n</aside>\n<p>OK, that&rsquo;s a quick run through of buckets, load factors, open addressing,\ncollision resolution, and hash functions. That&rsquo;s an awful lot of text and not a\nlot of real code. Don&rsquo;t worry if it still seems vague. 
Once we&rsquo;re done coding it\nup, it will all click into place.</p>\n<h2><a href=\"#building-a-hash-table\" id=\"building-a-hash-table\"><small>20&#8202;.&#8202;4</small>Building a Hash Table</a></h2>\n<p>The great thing about hash tables compared to other classic techniques like\nbalanced search trees is that the actual data structure is so simple. Ours goes\ninto a new module.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>table.h</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#ifndef clox_table_h</span>\n<span class=\"a\">#define clox_table_h</span>\n\n<span class=\"a\">#include &quot;common.h&quot;</span>\n<span class=\"a\">#include &quot;value.h&quot;</span>\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">int</span> <span class=\"i\">count</span>;\n  <span class=\"t\">int</span> <span class=\"i\">capacity</span>;\n  <span class=\"t\">Entry</span>* <span class=\"i\">entries</span>;\n} <span class=\"t\">Table</span>;\n\n<span class=\"a\">#endif</span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.h</em>, create new file</div>\n\n<p>A hash table is an array of entries. As in our dynamic array earlier, we keep\ntrack of both the allocated size of the array (<code>capacity</code>) and the number of\nkey/value pairs currently stored in it (<code>count</code>). 
The ratio of count to capacity\nis exactly the load factor of the hash table.</p>\n<p>Each entry is one of these:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;value.h&quot;\n</pre><div class=\"source-file\"><em>table.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">ObjString</span>* <span class=\"i\">key</span>;\n  <span class=\"t\">Value</span> <span class=\"i\">value</span>;\n} <span class=\"t\">Entry</span>;\n</pre><pre class=\"insert-after\">\n\ntypedef struct {\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.h</em></div>\n\n<p>It&rsquo;s a simple key/value pair. Since the key is always a <span\nname=\"string\">string</span>, we store the ObjString pointer directly instead of\nwrapping it in a Value. It&rsquo;s a little faster and smaller this way.</p>\n<aside name=\"string\">\n<p>In clox, we only need to support keys that are strings. Handling other types of\nkeys doesn&rsquo;t add much complexity. As long as you can compare two objects for\nequality and reduce them to sequences of bits, it&rsquo;s easy to use them as hash\nkeys.</p>\n</aside>\n<p>To create a new, empty hash table, we declare a constructor-like function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} Table;\n\n</pre><div class=\"source-file\"><em>table.h</em><br>\nadd after struct <em>Table</em></div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">initTable</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>);\n\n</pre><pre class=\"insert-after\">#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.h</em>, add after struct <em>Table</em></div>\n\n<p>We need a new implementation file to define that. 
While we&rsquo;re at it, let&rsquo;s get\nall of the pesky includes out of the way.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>table.c</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#include &lt;stdlib.h&gt;</span>\n<span class=\"a\">#include &lt;string.h&gt;</span>\n\n<span class=\"a\">#include &quot;memory.h&quot;</span>\n<span class=\"a\">#include &quot;object.h&quot;</span>\n<span class=\"a\">#include &quot;table.h&quot;</span>\n<span class=\"a\">#include &quot;value.h&quot;</span>\n\n<span class=\"t\">void</span> <span class=\"i\">initTable</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>) {\n  <span class=\"i\">table</span>-&gt;<span class=\"i\">count</span> = <span class=\"n\">0</span>;\n  <span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span> = <span class=\"n\">0</span>;\n  <span class=\"i\">table</span>-&gt;<span class=\"i\">entries</span> = <span class=\"a\">NULL</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, create new file</div>\n\n<p>As in our dynamic value array type, a hash table initially starts with zero\ncapacity and a <code>NULL</code> array. We don&rsquo;t allocate anything until needed. 
Assuming\nwe do eventually allocate something, we need to be able to free it too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void initTable(Table* table);\n</pre><div class=\"source-file\"><em>table.h</em><br>\nadd after <em>initTable</em>()</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">freeTable</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.h</em>, add after <em>initTable</em>()</div>\n\n<p>And its glorious implementation:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>table.c</em><br>\nadd after <em>initTable</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">freeTable</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>) {\n  <span class=\"a\">FREE_ARRAY</span>(<span class=\"t\">Entry</span>, <span class=\"i\">table</span>-&gt;<span class=\"i\">entries</span>, <span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span>);\n  <span class=\"i\">initTable</span>(<span class=\"i\">table</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, add after <em>initTable</em>()</div>\n\n<p>Again, it looks just like a dynamic array. In fact, you can think of a hash\ntable as basically a dynamic array with a really strange policy for inserting\nitems. We don&rsquo;t need to check for <code>NULL</code> here since <code>FREE_ARRAY()</code> already\nhandles that gracefully.</p>\n<h3><a href=\"#hashing-strings\" id=\"hashing-strings\"><small>20&#8202;.&#8202;4&#8202;.&#8202;1</small>Hashing strings</a></h3>\n<p>Before we can start putting entries in the table, we need to, well, hash them.\nTo ensure that the entries get distributed uniformly throughout the array, we\nwant a good hash function that looks at all of the bits of the key string. 
If it\nlooked at, say, only the first few characters, then a series of strings that all\nshared the same prefix would end up colliding in the same bucket.</p>\n<p>On the other hand, walking the entire string to calculate the hash is kind of\nslow. We&rsquo;d lose some of the performance benefit of the hash table if we had to\nwalk the string every time we looked for a key in the table. So we&rsquo;ll do the\nobvious thing: cache it.</p>\n<p>Over in the &ldquo;object&rdquo; module in ObjString, we add:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  char* chars;\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin struct <em>ObjString</em></div>\n<pre class=\"insert\">  <span class=\"t\">uint32_t</span> <span class=\"i\">hash</span>;\n</pre><pre class=\"insert-after\">};\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in struct <em>ObjString</em></div>\n\n<p>Each ObjString stores the hash code for its string. Since strings are immutable\nin Lox, we can calculate the hash code once up front and be certain that it will\nnever get invalidated. 
Caching it eagerly makes a kind of sense: allocating the\nstring and copying its characters over is already an <em>O(n)</em> operation, so it&rsquo;s a\ngood time to also do the <em>O(n)</em> calculation of the string&rsquo;s hash.</p>\n<p>Whenever we call the internal function to allocate a string, we pass in its\nhash code.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.c</em><br>\nfunction <em>allocateString</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">ObjString</span>* <span class=\"i\">allocateString</span>(<span class=\"t\">char</span>* <span class=\"i\">chars</span>, <span class=\"t\">int</span> <span class=\"i\">length</span>,\n                                 <span class=\"t\">uint32_t</span> <span class=\"i\">hash</span>) {\n</pre><pre class=\"insert-after\">  ObjString* string = ALLOCATE_OBJ(ObjString, OBJ_STRING);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, function <em>allocateString</em>(), replace 1 line</div>\n\n<p>That function simply stores the hash in the struct.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  string-&gt;chars = chars;\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>allocateString</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">string</span>-&gt;<span class=\"i\">hash</span> = <span class=\"i\">hash</span>;\n</pre><pre class=\"insert-after\">  return string;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>allocateString</em>()</div>\n\n<p>The fun happens over at the callers. <code>allocateString()</code> is called from two\nplaces: the function that copies a string and the one that takes ownership of an\nexisting dynamically allocated string. 
We&rsquo;ll start with the first.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">ObjString* copyString(const char* chars, int length) {\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>copyString</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">uint32_t</span> <span class=\"i\">hash</span> = <span class=\"i\">hashString</span>(<span class=\"i\">chars</span>, <span class=\"i\">length</span>);\n</pre><pre class=\"insert-after\">  char* heapChars = ALLOCATE(char, length + 1);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>copyString</em>()</div>\n\n<p>No magic here. We calculate the hash code and then pass it along.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  memcpy(heapChars, chars, length);\n  heapChars[length] = '\\0';\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>copyString</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"k\">return</span> <span class=\"i\">allocateString</span>(<span class=\"i\">heapChars</span>, <span class=\"i\">length</span>, <span class=\"i\">hash</span>);\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>copyString</em>(), replace 1 line</div>\n\n<p>The other string function is similar.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">ObjString* takeString(char* chars, int length) {\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>takeString</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"t\">uint32_t</span> <span class=\"i\">hash</span> = <span class=\"i\">hashString</span>(<span class=\"i\">chars</span>, <span class=\"i\">length</span>);\n  <span class=\"k\">return</span> <span class=\"i\">allocateString</span>(<span class=\"i\">chars</span>, <span class=\"i\">length</span>, <span class=\"i\">hash</span>);\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, 
in <em>takeString</em>(), replace 1 line</div>\n\n<p>The interesting code is over here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.c</em><br>\nadd after <em>allocateString</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">uint32_t</span> <span class=\"i\">hashString</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">key</span>, <span class=\"t\">int</span> <span class=\"i\">length</span>) {\n  <span class=\"t\">uint32_t</span> <span class=\"i\">hash</span> = <span class=\"n\">2166136261u</span>;\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">length</span>; <span class=\"i\">i</span>++) {\n    <span class=\"i\">hash</span> ^= (<span class=\"t\">uint8_t</span>)<span class=\"i\">key</span>[<span class=\"i\">i</span>];\n    <span class=\"i\">hash</span> *= <span class=\"n\">16777619</span>;\n  }\n  <span class=\"k\">return</span> <span class=\"i\">hash</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, add after <em>allocateString</em>()</div>\n\n<p>This is the actual bona fide &ldquo;hash function&rdquo; in clox. The algorithm is called\n&ldquo;FNV-1a&rdquo;, and is the shortest decent hash function I know. Brevity is certainly\na virtue in a book that aims to show you every line of code.</p>\n<p>The basic idea is pretty simple, and many hash functions follow the same\npattern. You start with some initial hash value, usually a constant with certain\ncarefully chosen mathematical properties. Then you walk the data to be hashed.\nFor each byte (or sometimes word), you mix the bits into the hash value somehow,\nand then scramble the resulting bits around some.</p>\n<p>What it means to &ldquo;mix&rdquo; and &ldquo;scramble&rdquo; can get pretty sophisticated. 
Ultimately,\nthough, the basic goal is <em>uniformity</em><span class=\"em\">&mdash;</span>we want the resulting hash values to\nbe as widely scattered around the numeric range as possible to avoid collisions\nand clustering.</p>\n<h3><a href=\"#inserting-entries\" id=\"inserting-entries\"><small>20&#8202;.&#8202;4&#8202;.&#8202;2</small>Inserting entries</a></h3>\n<p>Now that string objects know their hash code, we can start putting them into\nhash tables.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void freeTable(Table* table);\n</pre><div class=\"source-file\"><em>table.h</em><br>\nadd after <em>freeTable</em>()</div>\n<pre class=\"insert\"><span class=\"t\">bool</span> <span class=\"i\">tableSet</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>, <span class=\"t\">ObjString</span>* <span class=\"i\">key</span>, <span class=\"t\">Value</span> <span class=\"i\">value</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.h</em>, add after <em>freeTable</em>()</div>\n\n<p>This function adds the given key/value pair to the given hash table. If an entry\nfor that key is already present, the new value overwrites the old value. The\nfunction returns <code>true</code> if a new entry was added. 
Here&rsquo;s the implementation:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>table.c</em><br>\nadd after <em>freeTable</em>()</div>\n<pre><span class=\"t\">bool</span> <span class=\"i\">tableSet</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>, <span class=\"t\">ObjString</span>* <span class=\"i\">key</span>, <span class=\"t\">Value</span> <span class=\"i\">value</span>) {\n  <span class=\"t\">Entry</span>* <span class=\"i\">entry</span> = <span class=\"i\">findEntry</span>(<span class=\"i\">table</span>-&gt;<span class=\"i\">entries</span>, <span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span>, <span class=\"i\">key</span>);\n  <span class=\"t\">bool</span> <span class=\"i\">isNewKey</span> = <span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> == <span class=\"a\">NULL</span>;\n  <span class=\"k\">if</span> (<span class=\"i\">isNewKey</span>) <span class=\"i\">table</span>-&gt;<span class=\"i\">count</span>++;\n\n  <span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> = <span class=\"i\">key</span>;\n  <span class=\"i\">entry</span>-&gt;<span class=\"i\">value</span> = <span class=\"i\">value</span>;\n  <span class=\"k\">return</span> <span class=\"i\">isNewKey</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, add after <em>freeTable</em>()</div>\n\n<p>Most of the interesting logic is in <code>findEntry()</code> which we&rsquo;ll get to soon. That\nfunction&rsquo;s job is to take a key and figure out which bucket in the array it\nshould go in. It returns a pointer to that bucket<span class=\"em\">&mdash;</span>the address of the Entry in\nthe array.</p>\n<p>Once we have a bucket, inserting is straightforward. We update the hash table&rsquo;s\nsize, taking care to not increase the count if we overwrote the value for an\nalready-present key. 
Then we copy the key and value into the corresponding\nfields in the Entry.</p>\n<p>We&rsquo;re missing a little something here, though. We haven&rsquo;t actually allocated the\nEntry array yet. Oops! Before we can insert anything, we need to make sure we\nhave an array, and that it&rsquo;s big enough.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">bool tableSet(Table* table, ObjString* key, Value value) {\n</pre><div class=\"source-file\"><em>table.c</em><br>\nin <em>tableSet</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">table</span>-&gt;<span class=\"i\">count</span> + <span class=\"n\">1</span> &gt; <span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span> * <span class=\"a\">TABLE_MAX_LOAD</span>) {\n    <span class=\"t\">int</span> <span class=\"i\">capacity</span> = <span class=\"a\">GROW_CAPACITY</span>(<span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span>);\n    <span class=\"i\">adjustCapacity</span>(<span class=\"i\">table</span>, <span class=\"i\">capacity</span>);\n  }\n\n</pre><pre class=\"insert-after\">  Entry* entry = findEntry(table-&gt;entries, table-&gt;capacity, key);\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, in <em>tableSet</em>()</div>\n\n<p>This is similar to the code we wrote a while back for growing a dynamic array.\nIf we don&rsquo;t have enough capacity to insert an item, we reallocate and grow the\narray. 
The <code>GROW_CAPACITY()</code> macro takes an existing capacity and grows it by\na multiple to ensure that we get amortized constant performance over a series\nof inserts.</p>\n<p>The interesting difference here is that <code>TABLE_MAX_LOAD</code> constant.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;value.h&quot;\n\n</pre><div class=\"source-file\"><em>table.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#define TABLE_MAX_LOAD 0.75</span>\n\n</pre><pre class=\"insert-after\">void initTable(Table* table) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em></div>\n\n<p>This is how we manage the table&rsquo;s <span name=\"75\">load</span> factor. We don&rsquo;t\ngrow when the capacity is completely full. Instead, we grow the array before\nthen, when the array becomes at least 75% full.</p>\n<aside name=\"75\">\n<p>Ideal max load factor varies based on the hash function, collision-handling\nstrategy, and typical keysets you&rsquo;ll see. Since a toy language like Lox doesn&rsquo;t\nhave &ldquo;real world&rdquo; data sets, it&rsquo;s hard to optimize this, and I picked 75%\nsomewhat arbitrarily. When you build your own hash tables, benchmark and tune\nthis.</p>\n</aside>\n<p>We&rsquo;ll get to the implementation of <code>adjustCapacity()</code> soon. 
First, let&rsquo;s look\nat that <code>findEntry()</code> function you&rsquo;ve been wondering about.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>table.c</em><br>\nadd after <em>freeTable</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">Entry</span>* <span class=\"i\">findEntry</span>(<span class=\"t\">Entry</span>* <span class=\"i\">entries</span>, <span class=\"t\">int</span> <span class=\"i\">capacity</span>,\n                        <span class=\"t\">ObjString</span>* <span class=\"i\">key</span>) {\n  <span class=\"t\">uint32_t</span> <span class=\"i\">index</span> = <span class=\"i\">key</span>-&gt;<span class=\"i\">hash</span> % <span class=\"i\">capacity</span>;\n  <span class=\"k\">for</span> (;;) {\n    <span class=\"t\">Entry</span>* <span class=\"i\">entry</span> = &amp;<span class=\"i\">entries</span>[<span class=\"i\">index</span>];\n    <span class=\"k\">if</span> (<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> == <span class=\"i\">key</span> || <span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> == <span class=\"a\">NULL</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">entry</span>;\n    }\n\n    <span class=\"i\">index</span> = (<span class=\"i\">index</span> + <span class=\"n\">1</span>) % <span class=\"i\">capacity</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, add after <em>freeTable</em>()</div>\n\n<p>This function is the real core of the hash table. It&rsquo;s responsible for taking a\nkey and an array of buckets, and figuring out which bucket the entry belongs in.\nThis function is also where linear probing and collision handling come into\nplay. We&rsquo;ll use <code>findEntry()</code> both to look up existing entries in the hash\ntable and to decide where to insert new ones.</p>\n<p>For all that, there isn&rsquo;t much to it. 
First, we use modulo to map the key&rsquo;s hash\ncode to an index within the array&rsquo;s bounds. That gives us a bucket index where,\nideally, we&rsquo;ll be able to find or place the entry.</p>\n<p>There are a few cases to check for:</p>\n<ul>\n<li>\n<p>If the key for the Entry at that array index is <code>NULL</code>, then the bucket is\nempty. If we&rsquo;re using <code>findEntry()</code> to look up something in the hash table,\nthis means it isn&rsquo;t there. If we&rsquo;re using it to insert, it means we&rsquo;ve found\na place to add the new entry.</p>\n</li>\n<li>\n<p>If the key in the bucket is <span name=\"equal\">equal</span> to the key we&rsquo;re\nlooking for, then that key is already present in the table. If we&rsquo;re doing a\nlookup, that&rsquo;s good<span class=\"em\">&mdash;</span>we&rsquo;ve found the key we seek. If we&rsquo;re doing an insert,\nthis means we&rsquo;ll be replacing the value for that key instead of adding a new\nentry.</p>\n</li>\n</ul>\n<aside name=\"equal\">\n<p>It looks like we&rsquo;re using <code>==</code> to see if two strings are equal. That doesn&rsquo;t\nwork, does it? There could be two copies of the same string at different places\nin memory. Fear not, astute reader. We&rsquo;ll solve this further on. And, strangely\nenough, it&rsquo;s a hash table that provides the tool we need.</p>\n</aside>\n<ul>\n<li>Otherwise, the bucket has an entry in it, but with a different key. This is\na collision. In that case, we start probing. That&rsquo;s what that <code>for</code> loop\ndoes. We start at the bucket where the entry would ideally go. If that\nbucket is empty or has the same key, we&rsquo;re done. Otherwise, we advance to\nthe next element<span class=\"em\">&mdash;</span>this is the <em>linear</em> part of &ldquo;linear probing&rdquo;<span class=\"em\">&mdash;</span>and\ncheck there. 
If we go past the end of the array, that second modulo operator\nwraps us back around to the beginning.</li>\n</ul>\n<p>We exit the loop when we find either an empty bucket or a bucket with the same\nkey as the one we&rsquo;re looking for. You might be wondering about an infinite loop.\nWhat if we collide with <em>every</em> bucket? Fortunately, that can&rsquo;t happen thanks to\nour load factor. Because we grow the array as soon as it gets close to being\nfull, we know there will always be empty buckets.</p>\n<p>We return directly from within the loop, yielding a pointer to the found Entry\nso the caller can either insert something into it or read from it. Way back in\n<code>tableSet()</code>, the function that first kicked this off, we store the new entry in\nthat returned bucket and we&rsquo;re done.</p>\n<h3><a href=\"#allocating-and-resizing\" id=\"allocating-and-resizing\"><small>20&#8202;.&#8202;4&#8202;.&#8202;3</small>Allocating and resizing</a></h3>\n<p>Before we can put entries in the hash table, we do need a place to actually\nstore them. We need to allocate an array of buckets. 
That happens in this\nfunction:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>table.c</em><br>\nadd after <em>findEntry</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">adjustCapacity</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>, <span class=\"t\">int</span> <span class=\"i\">capacity</span>) {\n  <span class=\"t\">Entry</span>* <span class=\"i\">entries</span> = <span class=\"a\">ALLOCATE</span>(<span class=\"t\">Entry</span>, <span class=\"i\">capacity</span>);\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">capacity</span>; <span class=\"i\">i</span>++) {\n    <span class=\"i\">entries</span>[<span class=\"i\">i</span>].<span class=\"i\">key</span> = <span class=\"a\">NULL</span>;\n    <span class=\"i\">entries</span>[<span class=\"i\">i</span>].<span class=\"i\">value</span> = <span class=\"a\">NIL_VAL</span>;\n  }\n\n  <span class=\"i\">table</span>-&gt;<span class=\"i\">entries</span> = <span class=\"i\">entries</span>;\n  <span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span> = <span class=\"i\">capacity</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, add after <em>findEntry</em>()</div>\n\n<p>We create a bucket array with <code>capacity</code> entries. After we allocate the array,\nwe initialize every element to be an empty bucket and then store the array (and\nits capacity) in the hash table&rsquo;s main struct. This code is fine for when we\ninsert the very first entry into the table, and we require the first allocation\nof the array. But what about when we already have one and we need to grow it?</p>\n<p>Back when we were doing a dynamic array, we could just use <code>realloc()</code> and let\nthe C standard library copy everything over. 
That doesn&rsquo;t work for a hash table.\nRemember that to choose the bucket for each entry, we take its hash code <em>modulo\nthe array size</em>. That means that when the array size changes, entries may end up\nin different buckets.</p>\n<p>Those new buckets may have new collisions that we need to deal with. So the\nsimplest way to get every entry where it belongs is to rebuild the table from\nscratch by re-inserting every entry into the new empty array.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    entries[i].value = NIL_VAL;\n  }\n</pre><div class=\"source-file\"><em>table.c</em><br>\nin <em>adjustCapacity</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span>; <span class=\"i\">i</span>++) {\n    <span class=\"t\">Entry</span>* <span class=\"i\">entry</span> = &amp;<span class=\"i\">table</span>-&gt;<span class=\"i\">entries</span>[<span class=\"i\">i</span>];\n    <span class=\"k\">if</span> (<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> == <span class=\"a\">NULL</span>) <span class=\"k\">continue</span>;\n\n    <span class=\"t\">Entry</span>* <span class=\"i\">dest</span> = <span class=\"i\">findEntry</span>(<span class=\"i\">entries</span>, <span class=\"i\">capacity</span>, <span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span>);\n    <span class=\"i\">dest</span>-&gt;<span class=\"i\">key</span> = <span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span>;\n    <span class=\"i\">dest</span>-&gt;<span class=\"i\">value</span> = <span class=\"i\">entry</span>-&gt;<span class=\"i\">value</span>;\n  }\n</pre><pre class=\"insert-after\">\n\n  table-&gt;entries = entries;\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, in <em>adjustCapacity</em>()</div>\n\n<p>We walk through the old array 
front to back. Any time we find a non-empty\nbucket, we insert that entry into the new array. We use <code>findEntry()</code>, passing\nin the <em>new</em> array instead of the one currently stored in the Table. (This is\nwhy <code>findEntry()</code> takes a pointer directly to an Entry array and not the whole\n<code>Table</code> struct. That way, we can pass the new array and capacity before we&rsquo;ve\nstored those in the struct.)</p>\n<p>After that&rsquo;s done, we can release the memory for the old array.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    dest-&gt;value = entry-&gt;value;\n  }\n\n</pre><div class=\"source-file\"><em>table.c</em><br>\nin <em>adjustCapacity</em>()</div>\n<pre class=\"insert\">  <span class=\"a\">FREE_ARRAY</span>(<span class=\"t\">Entry</span>, <span class=\"i\">table</span>-&gt;<span class=\"i\">entries</span>, <span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span>);\n</pre><pre class=\"insert-after\">  table-&gt;entries = entries;\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, in <em>adjustCapacity</em>()</div>\n\n<p>With that, we have a hash table that we can stuff as many entries into as we\nlike. 
It handles overwriting existing keys and growing itself as needed to\nmaintain the desired load factor.</p>\n<p>While we&rsquo;re at it, let&rsquo;s also define a helper function for copying all of the\nentries of one hash table into another.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">bool tableSet(Table* table, ObjString* key, Value value);\n</pre><div class=\"source-file\"><em>table.h</em><br>\nadd after <em>tableSet</em>()</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">tableAddAll</span>(<span class=\"t\">Table</span>* <span class=\"i\">from</span>, <span class=\"t\">Table</span>* <span class=\"i\">to</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.h</em>, add after <em>tableSet</em>()</div>\n\n<p>We won&rsquo;t need this until much later when we support method inheritance, but we\nmay as well implement it now while we&rsquo;ve got all the hash table stuff fresh in\nour minds.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>table.c</em><br>\nadd after <em>tableSet</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">tableAddAll</span>(<span class=\"t\">Table</span>* <span class=\"i\">from</span>, <span class=\"t\">Table</span>* <span class=\"i\">to</span>) {\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">from</span>-&gt;<span class=\"i\">capacity</span>; <span class=\"i\">i</span>++) {\n    <span class=\"t\">Entry</span>* <span class=\"i\">entry</span> = &amp;<span class=\"i\">from</span>-&gt;<span class=\"i\">entries</span>[<span class=\"i\">i</span>];\n    <span class=\"k\">if</span> (<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> != <span class=\"a\">NULL</span>) {\n      <span class=\"i\">tableSet</span>(<span class=\"i\">to</span>, <span class=\"i\">entry</span>-&gt;<span 
class=\"i\">key</span>, <span class=\"i\">entry</span>-&gt;<span class=\"i\">value</span>);\n    }\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, add after <em>tableSet</em>()</div>\n\n<p>There&rsquo;s not much to say about this. It walks the bucket array of the source hash\ntable. Whenever it finds a non-empty bucket, it adds the entry to the\ndestination hash table using the <code>tableSet()</code> function we recently defined.</p>\n<h3><a href=\"#retrieving-values\" id=\"retrieving-values\"><small>20&#8202;.&#8202;4&#8202;.&#8202;4</small>Retrieving values</a></h3>\n<p>Now that our hash table contains some stuff, let&rsquo;s start pulling things back\nout. Given a key, we can look up the corresponding value, if there is one, with\nthis function:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void freeTable(Table* table);\n</pre><div class=\"source-file\"><em>table.h</em><br>\nadd after <em>freeTable</em>()</div>\n<pre class=\"insert\"><span class=\"t\">bool</span> <span class=\"i\">tableGet</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>, <span class=\"t\">ObjString</span>* <span class=\"i\">key</span>, <span class=\"t\">Value</span>* <span class=\"i\">value</span>);\n</pre><pre class=\"insert-after\">bool tableSet(Table* table, ObjString* key, Value value);\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.h</em>, add after <em>freeTable</em>()</div>\n\n<p>You pass in a table and a key. If it finds an entry with that key, it returns\n<code>true</code>, otherwise it returns <code>false</code>. 
If the entry exists, the <code>value</code> output\nparameter points to the resulting value.</p>\n<p>Since <code>findEntry()</code> already does the hard work, the implementation isn&rsquo;t bad.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>table.c</em><br>\nadd after <em>findEntry</em>()</div>\n<pre><span class=\"t\">bool</span> <span class=\"i\">tableGet</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>, <span class=\"t\">ObjString</span>* <span class=\"i\">key</span>, <span class=\"t\">Value</span>* <span class=\"i\">value</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">table</span>-&gt;<span class=\"i\">count</span> == <span class=\"n\">0</span>) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n\n  <span class=\"t\">Entry</span>* <span class=\"i\">entry</span> = <span class=\"i\">findEntry</span>(<span class=\"i\">table</span>-&gt;<span class=\"i\">entries</span>, <span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span>, <span class=\"i\">key</span>);\n  <span class=\"k\">if</span> (<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> == <span class=\"a\">NULL</span>) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n\n  *<span class=\"i\">value</span> = <span class=\"i\">entry</span>-&gt;<span class=\"i\">value</span>;\n  <span class=\"k\">return</span> <span class=\"k\">true</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, add after <em>findEntry</em>()</div>\n\n<p>If the table is completely empty, we definitely won&rsquo;t find the entry, so we\ncheck for that first. This isn&rsquo;t just an optimization<span class=\"em\">&mdash;</span>it also ensures that we\ndon&rsquo;t try to access the bucket array when the array is <code>NULL</code>. Otherwise, we let\n<code>findEntry()</code> work its magic. That returns a pointer to a bucket. 
If the bucket\nis empty, which we detect by seeing if the key is <code>NULL</code>, then we didn&rsquo;t find an\nEntry with our key. If <code>findEntry()</code> does return a non-empty Entry, then that&rsquo;s\nour match. We take the Entry&rsquo;s value and copy it to the output parameter so the\ncaller can get it. Piece of cake.</p>\n<h3><a href=\"#deleting-entries\" id=\"deleting-entries\"><small>20&#8202;.&#8202;4&#8202;.&#8202;5</small>Deleting entries</a></h3>\n<p>There is one more fundamental operation a full-featured hash table needs to\nsupport: removing an entry. This seems pretty obvious, if you can add things,\nyou should be able to <em>un</em>-add them, right? But you&rsquo;d be surprised how many\ntutorials on hash tables omit this.</p>\n<p>I could have taken that route too. In fact, we use deletion in clox only in a\ntiny edge case in the VM. But if you want to actually understand how to\ncompletely implement a hash table, this feels important. I can sympathize with\ntheir desire to overlook it. As we&rsquo;ll see, deleting from a hash table that uses\n<span name=\"delete\">open</span> addressing is tricky.</p>\n<aside name=\"delete\">\n<p>With separate chaining, deleting is as easy as removing a node from a linked\nlist.</p>\n</aside>\n<p>At least the declaration is simple.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">bool tableSet(Table* table, ObjString* key, Value value);\n</pre><div class=\"source-file\"><em>table.h</em><br>\nadd after <em>tableSet</em>()</div>\n<pre class=\"insert\"><span class=\"t\">bool</span> <span class=\"i\">tableDelete</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>, <span class=\"t\">ObjString</span>* <span class=\"i\">key</span>);\n</pre><pre class=\"insert-after\">void tableAddAll(Table* from, Table* to);\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.h</em>, add after <em>tableSet</em>()</div>\n\n<p>The obvious approach is to mirror insertion. 
Use <code>findEntry()</code> to look up the\nentry&rsquo;s bucket. Then clear out the bucket. Done!</p>\n<p>In cases where there are no collisions, that works fine. But if a collision has\noccurred, then the bucket where the entry lives may be part of one or more\nimplicit probe sequences. For example, here&rsquo;s a hash table containing three keys\nall with the same preferred bucket, 2:</p><img src=\"image/hash-tables/delete-1.png\" alt=\"A hash table containing 'bagel' in bucket 2, 'biscuit' in bucket 3, and 'jam' in bucket 4.\" />\n<p>Remember that when we&rsquo;re walking a probe sequence to find an entry, we know\nwe&rsquo;ve reached the end of a sequence and that the entry isn&rsquo;t present when we hit\nan empty bucket. It&rsquo;s like the probe sequence is a list of entries and an empty\nentry terminates that list.</p>\n<p>If we delete &ldquo;biscuit&rdquo; by simply clearing the Entry, then we break that probe\nsequence in the middle, leaving the trailing entries orphaned and unreachable.\nSort of like removing a node from a linked list without relinking the pointer\nfrom the previous node to the next one.</p>\n<p>If we later try to look for &ldquo;jam&rdquo;, we&rsquo;d start at &ldquo;bagel&rdquo;, stop at the next\nempty Entry, and never find it.</p><img src=\"image/hash-tables/delete-2.png\" alt=\"The 'biscuit' entry has been deleted from the hash table, breaking the chain.\" />\n<p>To solve this, most implementations use a trick called <span\nname=\"tombstone\"><strong>tombstones</strong></span>. Instead of clearing the entry on\ndeletion, we replace it with a special sentinel entry called a &ldquo;tombstone&rdquo;. When\nwe are following a probe sequence during a lookup, and we hit a tombstone, we\n<em>don&rsquo;t</em> treat it like an empty slot and stop iterating. 
Instead, we keep going\nso that deleting an entry doesn&rsquo;t break any implicit collision chains and we can\nstill find entries after it.</p><img src=\"image/hash-tables/delete-3.png\" alt=\"Instead of deleting 'biscuit', it's replaced with a tombstone.\" />\n<p>The code looks like this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>table.c</em><br>\nadd after <em>tableSet</em>()</div>\n<pre><span class=\"t\">bool</span> <span class=\"i\">tableDelete</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>, <span class=\"t\">ObjString</span>* <span class=\"i\">key</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">table</span>-&gt;<span class=\"i\">count</span> == <span class=\"n\">0</span>) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n\n  <span class=\"c\">// Find the entry.</span>\n  <span class=\"t\">Entry</span>* <span class=\"i\">entry</span> = <span class=\"i\">findEntry</span>(<span class=\"i\">table</span>-&gt;<span class=\"i\">entries</span>, <span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span>, <span class=\"i\">key</span>);\n  <span class=\"k\">if</span> (<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> == <span class=\"a\">NULL</span>) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n\n  <span class=\"c\">// Place a tombstone in the entry.</span>\n  <span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> = <span class=\"a\">NULL</span>;\n  <span class=\"i\">entry</span>-&gt;<span class=\"i\">value</span> = <span class=\"a\">BOOL_VAL</span>(<span class=\"k\">true</span>);\n  <span class=\"k\">return</span> <span class=\"k\">true</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, add after <em>tableSet</em>()</div>\n\n<p>First, we find the bucket containing the entry we want to delete. (If we don&rsquo;t\nfind it, there&rsquo;s nothing to delete, so we bail out.) We replace the entry with a\ntombstone. 
In clox, we use a <code>NULL</code> key and a <code>true</code> value to represent that,\nbut any representation that can&rsquo;t be confused with an empty bucket or a valid\nentry works.</p>\n<aside name=\"tombstone\"><img src=\"image/hash-tables/tombstone.png\" alt=\"A tombstone inscribed 'Here lies entry biscuit &rarr; 3.75, gone but not deleted'.\" />\n</aside>\n<p>That&rsquo;s all we need to do to delete an entry. Simple and fast. But all of the\nother operations need to correctly handle tombstones too. A tombstone is a sort\nof &ldquo;half&rdquo; entry. It has some of the characteristics of a present entry, and some\nof the characteristics of an empty one.</p>\n<p>When we are following a probe sequence during a lookup, and we hit a tombstone,\nwe note it and keep going.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  for (;;) {\n    Entry* entry = &amp;entries[index];\n</pre><div class=\"source-file\"><em>table.c</em><br>\nin <em>findEntry</em>()<br>\nreplace 3 lines</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> == <span class=\"a\">NULL</span>) {\n      <span class=\"k\">if</span> (<span class=\"a\">IS_NIL</span>(<span class=\"i\">entry</span>-&gt;<span class=\"i\">value</span>)) {\n        <span class=\"c\">// Empty entry.</span>\n        <span class=\"k\">return</span> <span class=\"i\">tombstone</span> != <span class=\"a\">NULL</span> ? 
<span class=\"i\">tombstone</span> : <span class=\"i\">entry</span>;\n      } <span class=\"k\">else</span> {\n        <span class=\"c\">// We found a tombstone.</span>\n        <span class=\"k\">if</span> (<span class=\"i\">tombstone</span> == <span class=\"a\">NULL</span>) <span class=\"i\">tombstone</span> = <span class=\"i\">entry</span>;\n      }\n    } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> == <span class=\"i\">key</span>) {\n      <span class=\"c\">// We found the key.</span>\n      <span class=\"k\">return</span> <span class=\"i\">entry</span>;\n    }\n</pre><pre class=\"insert-after\">\n\n    index = (index + 1) % capacity;\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, in <em>findEntry</em>(), replace 3 lines</div>\n\n<p>The first time we pass a tombstone, we store it in this local variable:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  uint32_t index = key-&gt;hash % capacity;\n</pre><div class=\"source-file\"><em>table.c</em><br>\nin <em>findEntry</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">Entry</span>* <span class=\"i\">tombstone</span> = <span class=\"a\">NULL</span>;\n\n</pre><pre class=\"insert-after\">  for (;;) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, in <em>findEntry</em>()</div>\n\n<p>If we reach a truly empty entry, then the key isn&rsquo;t present. In that case, if we\nhave passed a tombstone, we return its bucket instead of the later empty one. If\nwe&rsquo;re calling <code>findEntry()</code> in order to insert a node, that lets us treat the\ntombstone bucket as empty and reuse it for the new entry.</p>\n<p>Reusing tombstone slots automatically like this helps reduce the number of\ntombstones wasting space in the bucket array. 
In typical use cases where there\nis a mixture of insertions and deletions, the number of tombstones grows for a\nwhile and then tends to stabilize.</p>\n<p>Even so, there&rsquo;s no guarantee that a large number of deletes won&rsquo;t cause the\narray to be full of tombstones. In the very worst case, we could end up with\n<em>no</em> empty buckets. That would be bad because, remember, the only thing\npreventing an infinite loop in <code>findEntry()</code> is the assumption that we&rsquo;ll\neventually hit an empty bucket.</p>\n<p>So we need to be thoughtful about how tombstones interact with the table&rsquo;s load\nfactor and resizing. The key question is, when calculating the load factor,\nshould we treat tombstones like full buckets or empty ones?</p>\n<h3><a href=\"#counting-tombstones\" id=\"counting-tombstones\"><small>20&#8202;.&#8202;4&#8202;.&#8202;6</small>Counting tombstones</a></h3>\n<p>If we treat tombstones like full buckets, then we may end up with a bigger array\nthan we probably need because it artificially inflates the load factor. There\nare tombstones we could reuse, but they aren&rsquo;t treated as unused so we end up\ngrowing the array prematurely.</p>\n<p>But if we treat tombstones like empty buckets and <em>don&rsquo;t</em> include them in the\nload factor, then we run the risk of ending up with <em>no</em> actual empty buckets to\nterminate a lookup. An infinite loop is a much worse problem than a few extra\narray slots, so for load factor, we consider tombstones to be full buckets.</p>\n<p>That&rsquo;s why we don&rsquo;t reduce the count when deleting an entry in the previous\ncode. The count is no longer the number of entries in the hash table, it&rsquo;s the\nnumber of entries plus tombstones. 
That implies that we increment the count\nduring insertion only if the new entry goes into an entirely empty bucket.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  bool isNewKey = entry-&gt;key == NULL;\n</pre><div class=\"source-file\"><em>table.c</em><br>\nin <em>tableSet</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">isNewKey</span> &amp;&amp; <span class=\"a\">IS_NIL</span>(<span class=\"i\">entry</span>-&gt;<span class=\"i\">value</span>)) <span class=\"i\">table</span>-&gt;<span class=\"i\">count</span>++;\n</pre><pre class=\"insert-after\">\n\n  entry-&gt;key = key;\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, in <em>tableSet</em>(), replace 1 line</div>\n\n<p>If we are replacing a tombstone with a new entry, the bucket has already been\naccounted for and the count doesn&rsquo;t change.</p>\n<p>When we resize the array, we allocate a new array and re-insert all of the\nexisting entries into it. During that process, we <em>don&rsquo;t</em> copy the tombstones\nover. They don&rsquo;t add any value since we&rsquo;re rebuilding the probe sequences\nanyway, and would just slow down lookups. That means we need to recalculate the\ncount since it may change during a resize. 
So we clear it out:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  }\n\n</pre><div class=\"source-file\"><em>table.c</em><br>\nin <em>adjustCapacity</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">table</span>-&gt;<span class=\"i\">count</span> = <span class=\"n\">0</span>;\n</pre><pre class=\"insert-after\">  for (int i = 0; i &lt; table-&gt;capacity; i++) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, in <em>adjustCapacity</em>()</div>\n\n<p>Then each time we find a non-tombstone entry, we increment it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    dest-&gt;value = entry-&gt;value;\n</pre><div class=\"source-file\"><em>table.c</em><br>\nin <em>adjustCapacity</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">table</span>-&gt;<span class=\"i\">count</span>++;\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, in <em>adjustCapacity</em>()</div>\n\n<p>This means that when we grow the capacity, we may end up with <em>fewer</em> entries in\nthe resulting larger array because all of the tombstones get discarded. That&rsquo;s a\nlittle wasteful, but not a huge practical problem.</p>\n<p>I find it interesting that much of the work to support deleting entries is in\n<code>findEntry()</code> and <code>adjustCapacity()</code>. The actual delete logic is quite simple\nand fast. In practice, deletions tend to be rare, so you&rsquo;d expect a hash table\nto do as much work as it can in the delete function and leave the other\nfunctions alone to keep them faster. With our tombstone approach, deletes are\nfast, but lookups get penalized.</p>\n<p>I did a little benchmarking to test this out in a few different deletion\nscenarios. 
I was surprised to discover that tombstones did end up being faster\noverall compared to doing all the work during deletion to reinsert the affected\nentries.</p>\n<p>But if you think about it, it&rsquo;s not that the tombstone approach pushes the work\nof fully deleting an entry to other operations, it&rsquo;s more that it makes deleting\n<em>lazy</em>. At first, it does the minimal work to turn the entry into a tombstone.\nThat can cause a penalty when later lookups have to skip over it. But it also\nallows that tombstone bucket to be reused by a later insert too. That reuse is a\nvery efficient way to avoid the cost of rearranging all of the following\naffected entries. You basically recycle a node in the chain of probed entries.\nIt&rsquo;s a neat trick.</p>\n<h2><a href=\"#string-interning\" id=\"string-interning\"><small>20&#8202;.&#8202;5</small>String Interning</a></h2>\n<p>We&rsquo;ve got ourselves a hash table that mostly works, though it has a critical\nflaw in its center. Also, we aren&rsquo;t using it for anything yet. It&rsquo;s time to\naddress both of those and, in the process, learn a classic technique used by\ninterpreters.</p>\n<p>The reason the hash table doesn&rsquo;t totally work is that when <code>findEntry()</code> checks\nto see if an existing key matches the one it&rsquo;s looking for, it uses <code>==</code> to\ncompare two strings for equality. That only returns true if the two keys are the\nexact same string in memory. Two separate strings with the same characters\nshould be considered equal, but aren&rsquo;t.</p>\n<p>Remember, back when we added strings in the last chapter, we added <a href=\"strings.html#operations-on-strings\">explicit\nsupport to compare the strings character-by-character</a> in order to get\ntrue value equality. 
We could do that in <code>findEntry()</code>, but that&rsquo;s <span\nname=\"hash-collision\">slow</span>.</p>\n<aside name=\"hash-collision\">\n<p>In practice, we would first compare the hash codes of the two strings. That\nquickly detects almost all different strings<span class=\"em\">&mdash;</span>it wouldn&rsquo;t be a very good hash\nfunction if it didn&rsquo;t. But when the two hashes are the same, we still have to\ncompare characters to make sure we didn&rsquo;t have a hash collision on different\nstrings.</p>\n</aside>\n<p>Instead, we&rsquo;ll use a technique called <strong>string interning</strong>. The core problem is\nthat it&rsquo;s possible to have different strings in memory with the same characters.\nThose need to behave like equivalent values even though they are distinct\nobjects. They&rsquo;re essentially duplicates, and we have to compare all of their\nbytes to detect that.</p>\n<p><span name=\"intern\">String interning</span> is a process of deduplication. We\ncreate a collection of &ldquo;interned&rdquo; strings. Any string in that collection is\nguaranteed to be textually distinct from all others. When you intern a string,\nyou look for a matching string in the collection. If found, you use that\noriginal one. Otherwise, the string you have is unique, so you add it to the\ncollection.</p>\n<aside name=\"intern\">\n<p>I&rsquo;m guessing &ldquo;intern&rdquo; is short for &ldquo;internal&rdquo;. I think the idea is that the\nlanguage&rsquo;s runtime keeps its own &ldquo;internal&rdquo; collection of these strings, whereas\nother strings could be user created and floating around in memory. When you\nintern a string, you ask the runtime to add the string to that internal\ncollection and return a pointer to it.</p>\n<p>Languages vary in how much string interning they do and how it&rsquo;s exposed to the\nuser. Lua interns <em>all</em> strings, which is what clox will do too. 
Lisp, Scheme,\nSmalltalk, Ruby and others have a separate string-like type called &ldquo;symbol&rdquo; that\nis implicitly interned. (This is why they say symbols are &ldquo;faster&rdquo; in Ruby.)\nJava interns constant strings by default, and provides an API to let you\nexplicitly intern any string you give it.</p>\n</aside>\n<p>In this way, you know that each sequence of characters is represented by only\none string in memory. This makes value equality trivial. If two strings point\nto the same address in memory, they are obviously the same string and must be\nequal. And, because we know strings are unique, if two strings point to\ndifferent addresses, they must be distinct strings.</p>\n<p>Thus, pointer equality exactly matches value equality. Which in turn means that\nour existing <code>==</code> in <code>findEntry()</code> does the right thing. Or, at least, it will\nonce we intern all the strings. In order to reliably deduplicate all strings,\nthe VM needs to be able to find every string that&rsquo;s created. 
We do that by\ngiving it a hash table to store them all.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  Value* stackTop;\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nin struct <em>VM</em></div>\n<pre class=\"insert\">  <span class=\"t\">Table</span> <span class=\"i\">strings</span>;\n</pre><pre class=\"insert-after\">  Obj* objects;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, in struct <em>VM</em></div>\n\n<p>As usual, we need an include.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;chunk.h&quot;\n</pre><div class=\"source-file\"><em>vm.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;table.h&quot;</span>\n</pre><pre class=\"insert-after\">#include &quot;value.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em></div>\n\n<p>When we spin up a new VM, the string table is empty.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  vm.objects = NULL;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>initVM</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">initTable</span>(&amp;<span class=\"i\">vm</span>.<span class=\"i\">strings</span>);\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>initVM</em>()</div>\n\n<p>And when we shut down the VM, we clean up any resources used by the table.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void freeVM() {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>freeVM</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">freeTable</span>(&amp;<span class=\"i\">vm</span>.<span class=\"i\">strings</span>);\n</pre><pre class=\"insert-after\">  freeObjects();\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>freeVM</em>()</div>\n\n<p>Some languages have a separate type or an explicit step to intern a string. For\nclox, we&rsquo;ll automatically intern every one. 
That means whenever we create a new\nunique string, we add it to the table.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  string-&gt;hash = hash;\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>allocateString</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">tableSet</span>(&amp;<span class=\"i\">vm</span>.<span class=\"i\">strings</span>, <span class=\"i\">string</span>, <span class=\"a\">NIL_VAL</span>);\n</pre><pre class=\"insert-after\">  return string;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>allocateString</em>()</div>\n\n<p>We&rsquo;re using the table more like a hash <em>set</em> than a hash <em>table</em>. The keys are\nthe strings and those are all we care about, so we just use <code>nil</code> for the\nvalues.</p>\n<p>This gets a string into the table assuming that it&rsquo;s unique, but we need to\nactually check for duplication before we get here. We do that in the two\nhigher-level functions that call <code>allocateString()</code>. 
Here&rsquo;s one:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  uint32_t hash = hashString(chars, length);\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>copyString</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">ObjString</span>* <span class=\"i\">interned</span> = <span class=\"i\">tableFindString</span>(&amp;<span class=\"i\">vm</span>.<span class=\"i\">strings</span>, <span class=\"i\">chars</span>, <span class=\"i\">length</span>,\n                                        <span class=\"i\">hash</span>);\n  <span class=\"k\">if</span> (<span class=\"i\">interned</span> != <span class=\"a\">NULL</span>) <span class=\"k\">return</span> <span class=\"i\">interned</span>;\n\n</pre><pre class=\"insert-after\">  char* heapChars = ALLOCATE(char, length + 1);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>copyString</em>()</div>\n\n<p>When copying a string into a new ObjString, we look it up in the string table\nfirst. If we find it, instead of &ldquo;copying&rdquo;, we just return a reference to that\nstring. 
Otherwise, we fall through, allocate a new string, and store it in the\nstring table.</p>\n<p>Taking ownership of a string is a little different.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  uint32_t hash = hashString(chars, length);\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>takeString</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">ObjString</span>* <span class=\"i\">interned</span> = <span class=\"i\">tableFindString</span>(&amp;<span class=\"i\">vm</span>.<span class=\"i\">strings</span>, <span class=\"i\">chars</span>, <span class=\"i\">length</span>,\n                                        <span class=\"i\">hash</span>);\n  <span class=\"k\">if</span> (<span class=\"i\">interned</span> != <span class=\"a\">NULL</span>) {\n    <span class=\"a\">FREE_ARRAY</span>(<span class=\"t\">char</span>, <span class=\"i\">chars</span>, <span class=\"i\">length</span> + <span class=\"n\">1</span>);\n    <span class=\"k\">return</span> <span class=\"i\">interned</span>;\n  }\n\n</pre><pre class=\"insert-after\">  return allocateString(chars, length, hash);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>takeString</em>()</div>\n\n<p>Again, we look up the string in the string table first. If we find it, before we\nreturn it, we free the memory for the string that was passed in. 
Since ownership\nis being passed to this function and we no longer need the duplicate string,\nit&rsquo;s up to us to free it.</p>\n<p>Before we get to the new function we need to write, there&rsquo;s one more include.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;object.h&quot;\n</pre><div class=\"source-file\"><em>object.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;table.h&quot;</span>\n</pre><pre class=\"insert-after\">#include &quot;value.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em></div>\n\n<p>To look for a string in the table, we can&rsquo;t use the normal <code>tableGet()</code> function\nbecause that calls <code>findEntry()</code>, which has the exact problem with duplicate\nstrings that we&rsquo;re trying to fix right now. Instead, we use this new function:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void tableAddAll(Table* from, Table* to);\n</pre><div class=\"source-file\"><em>table.h</em><br>\nadd after <em>tableAddAll</em>()</div>\n<pre class=\"insert\"><span class=\"t\">ObjString</span>* <span class=\"i\">tableFindString</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>, <span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">chars</span>,\n                           <span class=\"t\">int</span> <span class=\"i\">length</span>, <span class=\"t\">uint32_t</span> <span class=\"i\">hash</span>);\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.h</em>, add after <em>tableAddAll</em>()</div>\n\n<p>The implementation looks like so:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>table.c</em><br>\nadd after <em>tableAddAll</em>()</div>\n<pre><span class=\"t\">ObjString</span>* <span class=\"i\">tableFindString</span>(<span class=\"t\">Table</span>* <span class=\"i\">table</span>, <span class=\"k\">const</span> <span class=\"t\">char</span>* <span 
class=\"i\">chars</span>,\n                           <span class=\"t\">int</span> <span class=\"i\">length</span>, <span class=\"t\">uint32_t</span> <span class=\"i\">hash</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">table</span>-&gt;<span class=\"i\">count</span> == <span class=\"n\">0</span>) <span class=\"k\">return</span> <span class=\"a\">NULL</span>;\n\n  <span class=\"t\">uint32_t</span> <span class=\"i\">index</span> = <span class=\"i\">hash</span> % <span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span>;\n  <span class=\"k\">for</span> (;;) {\n    <span class=\"t\">Entry</span>* <span class=\"i\">entry</span> = &amp;<span class=\"i\">table</span>-&gt;<span class=\"i\">entries</span>[<span class=\"i\">index</span>];\n    <span class=\"k\">if</span> (<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> == <span class=\"a\">NULL</span>) {\n      <span class=\"c\">// Stop if we find an empty non-tombstone entry.</span>\n      <span class=\"k\">if</span> (<span class=\"a\">IS_NIL</span>(<span class=\"i\">entry</span>-&gt;<span class=\"i\">value</span>)) <span class=\"k\">return</span> <span class=\"a\">NULL</span>;\n    } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span>-&gt;<span class=\"i\">length</span> == <span class=\"i\">length</span> &amp;&amp;\n        <span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span>-&gt;<span class=\"i\">hash</span> == <span class=\"i\">hash</span> &amp;&amp;\n        <span class=\"i\">memcmp</span>(<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span>-&gt;<span class=\"i\">chars</span>, <span class=\"i\">chars</span>, <span class=\"i\">length</span>) == <span class=\"n\">0</span>) {\n      <span class=\"c\">// We found it.</span>\n      <span class=\"k\">return</span> <span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span>;\n    }\n\n    <span class=\"i\">index</span> = (<span 
class=\"i\">index</span> + <span class=\"n\">1</span>) % <span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, add after <em>tableAddAll</em>()</div>\n\n<p>It appears we have copy-pasted <code>findEntry()</code>. There is a lot of redundancy, but\nalso a couple of key differences. First, we pass in the raw character array of\nthe key we&rsquo;re looking for instead of an ObjString. At the point that we call\nthis, we haven&rsquo;t created an ObjString yet.</p>\n<p>Second, when checking to see if we found the key, we look at the actual strings.\nWe first see if they have matching lengths and hashes. Those are quick to check\nand if they aren&rsquo;t equal, the strings definitely aren&rsquo;t the same.</p>\n<p>If there is a hash collision, we do an actual character-by-character string\ncomparison. This is the one place in the VM where we actually test strings for\ntextual equality. We do it here to deduplicate strings and then the rest of the\nVM can take for granted that any two strings at different addresses in memory\nmust have different contents.</p>\n<p>In fact, now that we&rsquo;ve interned all the strings, we can take advantage of it in\nthe bytecode interpreter. 
When a user does <code>==</code> on two objects that happen to be\nstrings, we don&rsquo;t need to test the characters any more.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case VAL_NUMBER: return AS_NUMBER(a) == AS_NUMBER(b);\n</pre><div class=\"source-file\"><em>value.c</em><br>\nin <em>valuesEqual</em>()<br>\nreplace 7 lines</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">VAL_OBJ</span>:    <span class=\"k\">return</span> <span class=\"a\">AS_OBJ</span>(<span class=\"i\">a</span>) == <span class=\"a\">AS_OBJ</span>(<span class=\"i\">b</span>);\n</pre><pre class=\"insert-after\">    default:         return false; // Unreachable.\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, in <em>valuesEqual</em>(), replace 7 lines</div>\n\n<p>We&rsquo;ve added a little overhead when creating strings to intern them. But in\nreturn, at runtime, the equality operator on strings is much faster. With that,\nwe have a full-featured hash table ready for us to use for tracking variables,\ninstances, or any other key-value pairs that might show up.</p>\n<p>We also sped up testing strings for equality. This is nice for when the user\ndoes <code>==</code> on strings. But it&rsquo;s even more critical in a dynamically typed\nlanguage like Lox where method calls and instance fields are looked up by name\nat runtime. If testing a string for equality is slow, then that means looking up\na method by name is slow. And if <em>that&rsquo;s</em> slow in your object-oriented language,\nthen <em>everything</em> is slow.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>In clox, we happen to only need keys that are strings, so the hash table we\nbuilt is hardcoded for that key type. 
If we exposed hash tables to Lox users\nas a first-class collection, it would be useful to support different kinds\nof keys.</p>\n<p>Add support for keys of the other primitive types: numbers, Booleans, and\n<code>nil</code>. Later, clox will support user-defined classes. If we want to support\nkeys that are instances of those classes, what kind of complexity does that\nadd?</p>\n</li>\n<li>\n<p>Hash tables have a lot of knobs you can tweak that affect their performance.\nYou decide whether to use separate chaining or open addressing. Depending on\nwhich fork in that road you take, you can tune how many entries are stored\nin each node, or the probing strategy you use. You control the hash\nfunction, load factor, and growth rate.</p>\n<p>All of this variety wasn&rsquo;t created just to give CS doctoral candidates\nsomething to <span name=\"publish\">publish</span> theses on: each has its\nuses in the many varied domains and hardware scenarios where hashing comes\ninto play. Look up a few hash table implementations in different open source\nsystems, research the choices they made, and try to figure out why they did\nthings that way.</p>\n<aside name=\"publish\">\n<p>Well, at least that wasn&rsquo;t the <em>only</em> reason they were created. Whether that\nwas the <em>main</em> reason is up for debate.</p>\n</aside></li>\n<li>\n<p>Benchmarking a hash table is notoriously difficult. A hash table\nimplementation may perform well with some keysets and poorly with others. It\nmay work well at small sizes but degrade as it grows, or vice versa. It may\nchoke when deletions are common, but fly when they aren&rsquo;t. Creating\nbenchmarks that accurately represent how your users will use the hash table\nis a challenge.</p>\n<p>Write a handful of different benchmark programs to validate our hash table\nimplementation. How does the performance vary between them? 
Why did you\nchoose the specific test cases you chose?</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"global-variables.html\" class=\"next\">\n  Next Chapter: &ldquo;Global Variables&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/index.css",
    "content": "@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-roman.woff\") format(\"woff\");\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-italic.woff\") format(\"woff\");\n  font-style: italic;\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-semibold.woff\") format(\"woff\");\n  font-weight: 600;\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-semibolditalic.woff\") format(\"woff\");\n  font-style: italic;\n  font-weight: 600;\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-bold.woff\") format(\"woff\");\n  font-weight: bold;\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-bolditalic.woff\") format(\"woff\");\n  font-style: italic;\n  font-weight: bold;\n}\nbody, h1, h2, h3, h4, p, blockquote, code, ul, ol, dl, dd, img {\n  margin: 0;\n}\n\nimg {\n  outline: none;\n}\n\nimg.arrow {\n  width: auto;\n  height: 11px;\n}\n\nimg.dot {\n  width: auto;\n  height: 18px;\n  vertical-align: text-bottom;\n}\n\nbody {\n  color: #222;\n  font: normal 16px/24px \"Crimson\", Georgia, serif;\n}\n\n.sign-up {\n  padding: 12px;\n  margin: 24px 0 24px 0;\n  background: #fcf6e8;\n  color: #bf9540;\n  border-radius: 3px;\n}\n.sign-up form {\n  display: flex;\n}\n.sign-up input {\n  padding: 4px;\n  font: 16px \"Source Sans Pro\", sans-serif;\n  outline: none;\n  border-radius: 3px;\n  border: solid 2px #ffd580;\n  color: #825e17;\n  height: 32px;\n}\n.sign-up input.email {\n  display: block;\n  box-sizing: border-box;\n  width: 100%;\n}\n.sign-up input.button {\n  margin-left: 8px;\n  padding: 4px 8px;\n  font: 600 13px \"Source Sans Pro\", sans-serif;\n  text-transform: uppercase;\n  letter-spacing: 1px;\n  background: #ffbb33;\n  border: none;\n  transition: background-color 0.2s ease;\n}\n.sign-up input.button:hover {\n  background: #ffd580;\n}\n.sign-up input:focus {\n  border-color: #ffaa00;\n}\n\nbody, h1, h2, h3, h4, p, 
blockquote, code, ul, ol, dl, dd, img {\n  margin: 0;\n}\n\nbody {\n  background: #29313d url(\"image/background.png\") top center/100% auto no-repeat;\n  color: #222;\n  font: normal 16px/24px \"Crimson\", Georgia, serif;\n}\n\na {\n  color: #1481b8;\n  text-decoration: none;\n  border-bottom: solid 1px rgba(222, 233, 237, 0);\n  transition: color 0.2s ease, border-color 0.4s ease;\n}\n\na:hover {\n  color: #1481b8;\n  border-bottom: solid 1px #dee9ed;\n}\n\narticle {\n  margin: 0 auto;\n  padding: 0 0 12px 0;\n  max-width: 960px;\n  background: #fff;\n}\n\nheader {\n  margin: 0 0 48px 0;\n  color: #595959;\n  background: #f5f3f0;\n  border-bottom: solid 1px #dad8d6;\n}\n\nmain {\n  margin: 0 48px;\n}\n\nimg.header {\n  display: block;\n  width: 100%;\n}\n\nimg.small {\n  display: none;\n}\n\ndiv.intro {\n  display: flex;\n}\ndiv.intro blockquote {\n  flex-basis: 40%;\n  margin: 0 48px 0 0;\n  font: italic 28px/42px \"Crimson\", Georgia, serif;\n}\ndiv.intro div.text {\n  flex-basis: 60%;\n  margin: 8px 0 24px 0;\n}\n\np + p {\n  margin-top: 24px;\n}\n\n.format {\n  margin: 0 -12px 24px -12px;\n  padding: 12px 12px 8px 12px;\n  height: 244px;\n  box-sizing: border-box;\n  background: #eef4f7;\n  background-size: cover;\n  background-position: left;\n  color: #444;\n  border-radius: 3px;\n  font: normal 16px/24px \"Source Sans Pro\", sans-serif;\n}\n.format h3 {\n  margin: 0;\n  padding: 0 0 4px 0;\n  font: 600 16px/24px \"Source Sans Pro\", sans-serif;\n  text-transform: uppercase;\n  letter-spacing: 1px;\n}\n.format p {\n  margin-bottom: 8px;\n}\n\n.format.print, .format.pdf {\n  background-position: right;\n  text-align: right;\n}\n\n.format-info {\n  display: inline-block;\n  width: 384px;\n  text-align: left;\n}\n.format-info table {\n  width: 100%;\n  border-collapse: collapse;\n}\n.format-info table td + td {\n  padding-left: 5px;\n}\n\n.format.print {\n  background-image: url(\"image/format-print.jpg\");\n}\n\n.format.ebook {\n  background-image: 
url(\"image/format-ebook.jpg\");\n}\n\n.format.pdf {\n  background-image: url(\"image/format-pdf.jpg\");\n}\n\n.format.web {\n  background-image: url(\"image/format-web.jpg\");\n}\n\na.action {\n  display: block;\n  margin: 0 0 4px 0;\n  padding: 4px 0;\n  text-align: center;\n  border-radius: 3px;\n  background: #1481b8;\n  transition: background-color 0.2s ease, color 0.2s ease;\n  font: 400 17px/24px \"Source Sans Pro\", sans-serif;\n  color: white;\n}\na.action small {\n  font-size: 14px;\n  padding: 4px;\n  color: rgba(255, 255, 255, 0.7);\n  transition: color 0.2s ease;\n}\n\na.action:hover {\n  background-color: #2badee;\n}\na.action:hover small {\n  color: white;\n}\n\nh3 {\n  font: italic 24px/24px \"Crimson\", Georgia, serif;\n  margin: 12px 0;\n}\n\nimg.author {\n  float: left;\n  width: 240px;\n  margin: 0 12px 0 -12px;\n  padding: 12px;\n  background: #f5f3f0;\n  border-radius: 3px;\n}\n\ndiv.author {\n  vertical-align: top;\n  margin: 36px 0 0 288px;\n}\n\nfooter {\n  position: relative;\n  border-top: solid 1px #dee9ed;\n  color: #7aa0b8;\n  font: 400 15px \"Source Sans Pro\", sans-serif;\n  text-align: center;\n  margin: 12px 0 36px 0;\n  padding-top: 48px;\n}\nfooter a, footer a:hover {\n  border: none;\n}\n\n@media only screen and (max-width: 700px) {\n  main {\n    margin: 0 24px;\n  }\n\n  header {\n    margin-bottom: 24px;\n  }\n\n  img.big {\n    display: none;\n  }\n\n  img.small {\n    display: block;\n  }\n\n  div.intro {\n    display: block;\n  }\n  div.intro blockquote {\n    display: block;\n    font: italic 24px/36px \"Crimson\", Georgia, serif;\n  }\n  div.intro div.text {\n    display: block;\n    margin: 24px 0 24px 0;\n  }\n\n  .format {\n    margin-bottom: 12px;\n    height: auto;\n    background-blend-mode: lighten;\n  }\n\n  .format-info {\n    display: block;\n    width: 100%;\n  }\n\n  .format.print {\n    background-color: #a6a29f;\n  }\n\n  .format.ebook {\n    background-color: #97a2aa;\n  }\n\n  .format.pdf {\n    
background-color: #cfccca;\n  }\n\n  .format.web {\n    background-color: #d6dbd3;\n  }\n\n  img.author {\n    float: none;\n  }\n\n  div.author {\n    margin: 0 0 0 0;\n  }\n}"
  },
  {
    "path": "site/index.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Crafting Interpreters</title>\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"index.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body>\n\n<article>\n\n<header>\n  <a href=\"dedication.html\"><img class=\"header big\" src=\"image/header.png\" alt=\"Crafting Interpreters by Robert Nystrom\" /><img class=\"header small\" src=\"image/header-small.png\" alt=\"Crafting Interpreters by Robert Nystrom\" /></a>\n</header>\n\n<main>\n\n<div class=\"intro\">\n\n<blockquote><p>Ever wanted to make your own programming language or wondered how\nthey are designed and built?</p><p>If so, this book is for you.</p></blockquote>\n\n<div class=\"text\">\n\n<p><em>Crafting Interpreters</em> contains everything you need to implement a\nfull-featured, efficient scripting language. You&#8217;ll learn both high-level\nconcepts around parsing and semantics and gritty details like bytecode\nrepresentation and garbage collection. 
Your brain will light up with new ideas,\nand your hands will get dirty and calloused. It&#8217;s a blast.</p>\n\n<p>Starting from <code>main()</code>, you build a language that features rich\nsyntax, dynamic typing, garbage collection, lexical scope, first-class\nfunctions, closures, classes, and inheritance. All packed into a few thousand\nlines of clean, fast code that you thoroughly understand because you write each\none yourself.</p>\n\n<p>The book is available in four delectable formats:</p>\n\n</div>\n\n</div>\n\n<div class=\"format print\">\n  <div class=\"format-info\">\n    <h3>Print</h3>\n    <p>640 pages of beautiful typography and high-resolution hand-drawn\n    illustrations. Each page lovingly typeset by the author. The premier reading\n    experience.</p>\n    <table>\n    <tr>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.com/dp/0990582930\" target=\"_blank\">Amazon<small>.com</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.ca/dp/0990582930\" target=\"_blank\"><small>.ca</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.co.uk/dp/0990582930\" target=\"_blank\"><small>.uk</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.com.au/dp/0990582930\" target=\"_blank\"><small>.au</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.de/dp/0990582930\" target=\"_blank\"><small>.de</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.fr/dp/0990582930\" target=\"_blank\"><small>.fr</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.es/dp/0990582930\" target=\"_blank\"><small>.es</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.it/dp/0990582930\" target=\"_blank\"><small>.it</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.co.jp/dp/0990582930\" 
target=\"_blank\"><small>.jp</small></a>\n    </td>\n    </tr>\n    </table>\n    <table>\n    <tr>\n    <td>\n      <a class=\"action\" href=\"https://www.barnesandnoble.com/w/crafting-interpreters-robert-nystrom/1139915245?ean=9780990582939\" target=\"_blank\">Barnes and Noble</a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.bookdepository.com/Crafting-Interpreters-Robert-Nystrom/9780990582939\" target=\"_blank\">Book Depository</a>\n    </td>\n    </tr>\n    </table>\n    <a class=\"action\" href=\"/sample.pdf\" target=\"_blank\">Download Sample <small>PDF</small></a>\n  </div>\n</div>\n<div class=\"format ebook\">\n  <div class=\"format-info\">\n    <h3>eBook</h3>\n    <p>Carefully tuned CSS fits itself to your ebook reader and screen size.\n    Full-color syntax highlighting and live hyperlinks. Like Alan Kay's Dynabook\n    but real.</p>\n    <table>\n    <tr>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.com/dp/B09BCCVLCL\" target=\"_blank\">Kindle <small class=\"hide-small\"><span class=\"hide-medium\">Amazon</span>.com</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.co.uk/dp/B09BCCVLCL\" target=\"_blank\"><small>.uk</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.ca/dp/B09BCCVLCL\" target=\"_blank\"><small>.ca</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.com.au/dp/B09BCCVLCL\" target=\"_blank\"><small>.au</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.de/dp/B09BCCVLCL\" target=\"_blank\"><small>.de</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.in/dp/B09BCCVLCL\" target=\"_blank\"><small>.in</small></a>\n    </td>\n    </tr>\n    </table>\n    <table>\n    <tr>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.fr/dp/B09BCCVLCL\" target=\"_blank\"><small>.fr</small></a>\n    </td>\n    <td>\n      <a 
class=\"action\" href=\"https://www.amazon.es/dp/B09BCCVLCL\" target=\"_blank\"><small>.es</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.it/dp/B09BCCVLCL\" target=\"_blank\"><small>.it</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.co.jp/dp/B09BCCVLCL\" target=\"_blank\"><small>.jp</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.com.br/dp/B09BCCVLCL\" target=\"_blank\"><small>.br</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.amazon.com.mx/dp/B09BCCVLCL\" target=\"_blank\"><small>.mx</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://books.apple.com/us/book/crafting-interpreters/id1578795812\" target=\"_blank\">Apple Books</a>\n    </td>\n    </tr>\n    </table>\n    <table>\n    <tr>\n    <td>\n      <a class=\"action\" href=\"https://play.google.com/store/books/details?id=q0c6EAAAQBAJ\" target=\"_blank\">Play Books <small class=\"hide-small\">Google</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.barnesandnoble.com/w/crafting-interpreters-robert-nystrom/1139915245?ean=2940164977092\" target=\"_blank\">Nook <small class=\"hide-small\">B&amp;N</small></a>\n    </td>\n    <td>\n      <a class=\"action\" href=\"https://www.smashwords.com/books/view/1096463\" target=\"_blank\">EPUB <small class=\"hide-small\">Smashwords</small></a>\n    </td>\n    </tr>\n    </table>\n  </div>\n</div>\n<div class=\"format pdf\">\n  <div class=\"format-info\">\n    <h3>PDF</h3>\n    <p>Perfectly mirrors the hand-crafted typesetting and sharp illustrations of\n    the print book, but much easier to carry around.</p>\n    <a class=\"action\" href=\"https://payhip.com/b/F0zkr\" target=\"_blank\">Buy from Payhip</a>\n    <a class=\"action\" href=\"/sample.pdf\" target=\"_blank\">Download Free Sample</a>\n  </div>\n</div>\n<div class=\"format web\">\n  <div class=\"format-info\">\n    
<h3>Web</h3>\n    <p>Meticulous responsive design looks great from your desktop down to your\n    phone. Every chapter, aside, and illustration is there. Read the whole book\n    for free. Really.</p>\n    <a class=\"action\" href=\"contents.html\">Read Now</a>\n  </div>\n</div>\n\n<img src=\"image/dogshot.jpg\" class=\"author\" />\n\n<div class=\"author\">\n<h3>About Robert Nystrom</h3>\n\n<p>I got bitten by the language bug years ago while on paternity leave between\nmidnight feedings. I cobbled together a <a href=\"http://wren.io/\"\ntarget=\"_blank\">number</a> <a href=\"http://magpie-lang.org/\"\ntarget=\"_blank\">of</a> <a href=\"http://finch.stuffwithstuff.com/\"\ntarget=\"_blank\">hobby</a> <a href=\"https://github.com/munificent/vigil\"\ntarget=\"_blank\">languages</a> before worming my way into an honest-to-God,\nfull-time programming language job. Today, I work at Google on the <a\nhref=\"http://dart.dev/\" target=\"_blank\">Dart language</a>.</p>\n\n<p>Before I fell in love with languages, I developed games at Electronic Arts\nfor eight years. I wrote the best-selling book <em><a\nhref=\"http://gameprogrammingpatterns.com/\" target=\"_blank\">Game Programming\nPatterns</a></em> based on what I learned there. 
You can read that book for free\ntoo.</p>\n\n<p>If you want more, you can find me on Twitter (<a\nhref=\"https://twitter.com/intent/user?screen_name=munificentbob\"\ntarget=\"_blank\"><code>@munificentbob</code></a>), email me at <code>bob</code>\nat this site's domain (though I am slow to respond), read <a\nhref=\"http://journal.stuffwithstuff.com/\" target=\"_blank\">my blog</a>, or join\nmy low-frequency mailing list:</p>\n\n<div class=\"sign-up\">\n  <!-- Begin MailChimp Signup Form -->\n  <div id=\"mc_embed_signup\">\n  <form action=\"//gameprogrammingpatterns.us7.list-manage.com/subscribe/post?u=0952ca43ed2536d6717766b88&amp;id=6e96334109\" method=\"post\" id=\"mc-embedded-subscribe-form\" name=\"mc-embedded-subscribe-form\" class=\"validate\" target=\"_blank\" novalidate>\n    <input type=\"email\" value=\"\" name=\"EMAIL\" class=\"email\" id=\"mce-EMAIL\" placeholder=\"Your email address\" required>\n    <!-- real people should not fill this in and expect good things - do not remove this or risk form bot signups -->\n    <div style=\"position: absolute; left: -5000px;\" aria-hidden=\"true\"><input type=\"text\" name=\"b_0952ca43ed2536d6717766b88_6e96334109\" tabindex=\"-1\" value=\"\"></div>\n    <input type=\"submit\" value=\"Sign me up!\" name=\"subscribe\" id=\"mc-embedded-subscribe\" class=\"button\">\n  </form>\n  </div>\n  <!--End mc_embed_signup -->\n</div>\n\n</div>\n\n<footer>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</main>\n</article>\n</body>\n</html>\n"
  },
  {
    "path": "site/inheritance.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Inheritance &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Inheritance<small>13</small></a></h3>\n\n<ul>\n    <li><a href=\"#superclasses-and-subclasses\"><small>13.1</small> Superclasses and Subclasses</a></li>\n    <li><a href=\"#inheriting-methods\"><small>13.2</small> Inheriting Methods</a></li>\n    <li><a href=\"#calling-superclass-methods\"><small>13.3</small> Calling Superclass Methods</a></li>\n    <li><a href=\"#conclusion\"><small>13.4</small> Conclusion</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a 
href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"classes.html\" title=\"Classes\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"classes.html\" title=\"Classes\" class=\"prev\">←</a>\n<a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Inheritance<small>13</small></a></h3>\n\n<ul>\n    <li><a href=\"#superclasses-and-subclasses\"><small>13.1</small> Superclasses and Subclasses</a></li>\n    <li><a href=\"#inheriting-methods\"><small>13.2</small> Inheriting Methods</a></li>\n    <li><a href=\"#calling-superclass-methods\"><small>13.3</small> Calling Superclass Methods</a></li>\n    <li><a href=\"#conclusion\"><small>13.4</small> Conclusion</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"classes.html\" title=\"Classes\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">13</div>\n  <h1>Inheritance</h1>\n\n<blockquote>\n<p>Once we were blobs in the sea, and 
then fishes, and then lizards and rats and\nthen monkeys, and hundreds of things in between. This hand was once a fin,\nthis hand once had claws! In my human mouth I have the pointy teeth of a wolf\nand the chisel teeth of a rabbit and the grinding teeth of a cow! Our blood is\nas salty as the sea we used to live in! When we&rsquo;re frightened, the hair on our\nskin stands up, just like it did when we had fur. We are history! Everything\nwe&rsquo;ve ever been on the way to becoming us, we still are.</p>\n<p><cite>Terry Pratchett, <em>A Hat Full of Sky</em></cite></p>\n</blockquote>\n<p>Can you believe it? We&rsquo;ve reached the last chapter of <a href=\"a-tree-walk-interpreter.html\">Part II</a>. We&rsquo;re almost\ndone with our first Lox interpreter. The <a href=\"classes.html\">previous chapter</a> was a big ball of\nintertwined object-orientation features. I couldn&rsquo;t separate those from each\nother, but I did manage to untangle one piece. In this chapter, we&rsquo;ll finish\noff Lox&rsquo;s class support by adding inheritance.</p>\n<p>Inheritance appears in object-oriented languages all the way back to the <span\nname=\"inherited\">first</span> one, <a href=\"https://en.wikipedia.org/wiki/Simula\">Simula</a>. Early on, Kristen Nygaard and\nOle-Johan Dahl noticed commonalities across classes in the simulation programs\nthey wrote. Inheritance gave them a way to reuse the code for those similar\nparts.</p>\n<aside name=\"inherited\">\n<p>You could say all those other languages <em>inherited</em> it from Simula. Hey-ooo!\nI&rsquo;ll, uh, see myself out.</p>\n</aside>\n<h2><a href=\"#superclasses-and-subclasses\" id=\"superclasses-and-subclasses\"><small>13&#8202;.&#8202;1</small>Superclasses and Subclasses</a></h2>\n<p>Given that the concept is &ldquo;inheritance&rdquo;, you would hope they would pick a\nconsistent metaphor and call them &ldquo;parent&rdquo; and &ldquo;child&rdquo; classes, but that would\nbe too easy. Way back when, C. A. R. 
Hoare coined the term &ldquo;<span\nname=\"subclass\">subclass</span>&rdquo; to refer to a record type that refines another\ntype. Simula borrowed that term to refer to a <em>class</em> that inherits from\nanother. I don&rsquo;t think it was until Smalltalk came along that someone flipped\nthe Latin prefix to get &ldquo;superclass&rdquo; to refer to the other side of the\nrelationship. From C++, you also hear &ldquo;base&rdquo; and &ldquo;derived&rdquo; classes. I&rsquo;ll mostly\nstick with &ldquo;superclass&rdquo; and &ldquo;subclass&rdquo;.</p>\n<aside name=\"subclass\">\n<p>&ldquo;Super-&rdquo; and &ldquo;sub-&rdquo; mean &ldquo;above&rdquo; and &ldquo;below&rdquo; in Latin, respectively. Picture an\ninheritance tree like a family tree with the root at the top<span class=\"em\">&mdash;</span>subclasses are\nbelow their superclasses on the diagram. More generally, &ldquo;sub-&rdquo; refers to things\nthat refine or are contained by some more general concept. In zoology, a\nsubclass is a finer categorization of a larger class of living things.</p>\n<p>In set theory, a subset is contained by a larger superset which has all of the\nelements of the subset and possibly more. Set theory and programming languages\nmeet each other in type theory. There, you have &ldquo;supertypes&rdquo; and &ldquo;subtypes&rdquo;.</p>\n<p>In statically typed object-oriented languages, a subclass is also often a\nsubtype of its superclass. Say we have a Doughnut superclass and a BostonCream\nsubclass. Every BostonCream is also an instance of Doughnut, but there may be\ndoughnut objects that are not BostonCreams (like Crullers).</p>\n<p>Think of a type as the set of all values of that type. The set of all Doughnut\ninstances contains the set of all BostonCream instances since every BostonCream\nis also a Doughnut. So BostonCream is a subclass, and a subtype, and its\ninstances are a subset. 
It all lines up.</p><img src=\"image/inheritance/doughnuts.png\" alt=\"Boston cream &lt;: doughnut.\" />\n</aside>\n<p>Our first step towards supporting inheritance in Lox is a way to specify a\nsuperclass when declaring a class. There&rsquo;s a lot of variety in syntax for this.\nC++ and C# place a <code>:</code> after the subclass&rsquo;s name, followed by the superclass\nname. Java uses <code>extends</code> instead of the colon. Python puts the superclass(es)\nin parentheses after the class name. Simula puts the superclass&rsquo;s name <em>before</em>\nthe <code>class</code> keyword.</p>\n<p>This late in the game, I&rsquo;d rather not add a new reserved word or token to the\nlexer. We don&rsquo;t have <code>extends</code> or even <code>:</code>, so we&rsquo;ll follow Ruby and use a\nless-than sign (<code>&lt;</code>).</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Doughnut</span> {\n  <span class=\"c\">// General doughnut stuff...</span>\n}\n\n<span class=\"k\">class</span> <span class=\"t\">BostonCream</span> &lt; <span class=\"t\">Doughnut</span> {\n  <span class=\"c\">// Boston Cream-specific stuff...</span>\n}\n</pre></div>\n<p>To work this into the grammar, we add a new optional clause in our existing\n<code>classDecl</code> rule.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">classDecl</span>      → <span class=\"s\">&quot;class&quot;</span> <span class=\"t\">IDENTIFIER</span> ( <span class=\"s\">&quot;&lt;&quot;</span> <span class=\"t\">IDENTIFIER</span> )?\n                 <span class=\"s\">&quot;{&quot;</span> <span class=\"i\">function</span>* <span class=\"s\">&quot;}&quot;</span> ;\n</pre></div>\n<p>After the class name, you can have a <code>&lt;</code> followed by the superclass&rsquo;s name. 
The\nsuperclass clause is optional because you don&rsquo;t <em>have</em> to have a superclass.\nUnlike some other object-oriented languages like Java, Lox has no root &ldquo;Object&rdquo;\nclass that everything inherits from, so when you omit the superclass clause, the\nclass has <em>no</em> superclass, not even an implicit one.</p>\n<p>We want to capture this new syntax in the class declaration&rsquo;s AST node.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Block      : List&lt;Stmt&gt; statements&quot;,\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;Class      : Token name, Expr.Variable superclass,&quot;</span> +\n                  <span class=\"s\">&quot; List&lt;Stmt.Function&gt; methods&quot;</span>,\n</pre><pre class=\"insert-after\">      &quot;Expression : Expr expression&quot;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>(), replace 1 line</div>\n\n<p>You might be surprised that we store the superclass name as an Expr.Variable,\nnot a Token. The grammar restricts the superclass clause to a single identifier,\nbut at runtime, that identifier is evaluated as a variable access. 
Wrapping the\nname in an Expr.Variable early on in the parser gives us an object that the\nresolver can hang the resolution information off of.</p>\n<p>The new parser code follows the grammar directly.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    Token name = consume(IDENTIFIER, &quot;Expect class name.&quot;);\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>classDeclaration</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"t\">Expr</span>.<span class=\"t\">Variable</span> <span class=\"i\">superclass</span> = <span class=\"k\">null</span>;\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">LESS</span>)) {\n      <span class=\"i\">consume</span>(<span class=\"i\">IDENTIFIER</span>, <span class=\"s\">&quot;Expect superclass name.&quot;</span>);\n      <span class=\"i\">superclass</span> = <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Variable</span>(<span class=\"i\">previous</span>());\n    }\n\n</pre><pre class=\"insert-after\">    consume(LEFT_BRACE, &quot;Expect '{' before class body.&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>classDeclaration</em>()</div>\n\n<p>Once we&rsquo;ve (possibly) parsed a superclass declaration, we store it in the AST.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    consume(RIGHT_BRACE, &quot;Expect '}' after class body.&quot;);\n\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>classDeclaration</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">Class</span>(<span class=\"i\">name</span>, <span class=\"i\">superclass</span>, <span class=\"i\">methods</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>classDeclaration</em>(), replace 1 
line</div>\n\n<p>If we didn&rsquo;t parse a superclass clause, the superclass expression will be\n<code>null</code>. We&rsquo;ll have to make sure the later passes check for that. The first of\nthose is the resolver.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    define(stmt.name);\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">if</span> (<span class=\"i\">stmt</span>.<span class=\"i\">superclass</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">resolve</span>(<span class=\"i\">stmt</span>.<span class=\"i\">superclass</span>);\n    }\n</pre><pre class=\"insert-after\">\n\n    beginScope();\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitClassStmt</em>()</div>\n\n<p>The class declaration AST node has a new subexpression, so we traverse into and\nresolve that. Since classes are usually declared at the top level, the\nsuperclass name will most likely be a global variable, so this doesn&rsquo;t usually\ndo anything useful. However, Lox allows class declarations even inside blocks,\nso it&rsquo;s possible the superclass name refers to a local variable. In that case,\nwe need to make sure it&rsquo;s resolved.</p>\n<p>Because even well-intentioned programmers sometimes write weird code, there&rsquo;s a\nsilly edge case we need to worry about while we&rsquo;re in here. Take a look at this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Oops</span> &lt; <span class=\"t\">Oops</span> {}\n</pre></div>\n<p>There&rsquo;s no way this will do anything useful, and if we let the runtime try to\nrun this, it will break the expectation the interpreter has about there not\nbeing cycles in the inheritance chain. 
The safest thing is to detect this case\nstatically and report it as an error.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    define(stmt.name);\n\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">stmt</span>.<span class=\"i\">superclass</span> != <span class=\"k\">null</span> &amp;&amp;\n        <span class=\"i\">stmt</span>.<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>.<span class=\"i\">equals</span>(<span class=\"i\">stmt</span>.<span class=\"i\">superclass</span>.<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>)) {\n      <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">stmt</span>.<span class=\"i\">superclass</span>.<span class=\"i\">name</span>,\n          <span class=\"s\">&quot;A class can&#39;t inherit from itself.&quot;</span>);\n    }\n\n</pre><pre class=\"insert-after\">    if (stmt.superclass != null) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitClassStmt</em>()</div>\n\n<p>Assuming the code resolves without error, the AST travels to the interpreter.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  public Void visitClassStmt(Stmt.Class stmt) {\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">    <span class=\"t\">Object</span> <span class=\"i\">superclass</span> = <span class=\"k\">null</span>;\n    <span class=\"k\">if</span> (<span class=\"i\">stmt</span>.<span class=\"i\">superclass</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">superclass</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">stmt</span>.<span class=\"i\">superclass</span>);\n      <span class=\"k\">if</span> (!(<span class=\"i\">superclass</span> <span class=\"k\">instanceof</span> <span class=\"t\">LoxClass</span>)) {\n    
    <span class=\"k\">throw</span> <span class=\"k\">new</span> <span class=\"t\">RuntimeError</span>(<span class=\"i\">stmt</span>.<span class=\"i\">superclass</span>.<span class=\"i\">name</span>,\n            <span class=\"s\">&quot;Superclass must be a class.&quot;</span>);\n      }\n    }\n\n</pre><pre class=\"insert-after\">    environment.define(stmt.name.lexeme, null);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitClassStmt</em>()</div>\n\n<p>If the class has a superclass expression, we evaluate it. Since that could\npotentially evaluate to some other kind of object, we have to check at runtime\nthat the thing we want to be the superclass is actually a class. Bad things\nwould happen if we allowed code like:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"t\">NotAClass</span> = <span class=\"s\">&quot;I am totally not a class&quot;</span>;\n\n<span class=\"k\">class</span> <span class=\"t\">Subclass</span> &lt; <span class=\"t\">NotAClass</span> {} <span class=\"c\">// ?!</span>\n</pre></div>\n<p>Assuming that check passes, we continue on. Executing a class declaration turns\nthe syntactic representation of a class<span class=\"em\">&mdash;</span>its AST node<span class=\"em\">&mdash;</span>into its runtime\nrepresentation, a LoxClass object. We need to plumb the superclass through to\nthat too. 
We pass the superclass to the constructor.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      methods.put(method.name.lexeme, function);\n    }\n\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitClassStmt</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"t\">LoxClass</span> <span class=\"i\">klass</span> = <span class=\"k\">new</span> <span class=\"t\">LoxClass</span>(<span class=\"i\">stmt</span>.<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>,\n        (<span class=\"t\">LoxClass</span>)<span class=\"i\">superclass</span>, <span class=\"i\">methods</span>);\n\n</pre><pre class=\"insert-after\">    environment.assign(stmt.name, klass);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitClassStmt</em>(), replace 1 line</div>\n\n<p>The constructor stores it in a field.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/LoxClass.java</em><br>\nconstructor <em>LoxClass</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"t\">LoxClass</span>(<span class=\"t\">String</span> <span class=\"i\">name</span>, <span class=\"t\">LoxClass</span> <span class=\"i\">superclass</span>,\n           <span class=\"t\">Map</span>&lt;<span class=\"t\">String</span>, <span class=\"t\">LoxFunction</span>&gt; <span class=\"i\">methods</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">superclass</span> = <span class=\"i\">superclass</span>;\n</pre><pre class=\"insert-after\">    this.name = name;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxClass.java</em>, constructor <em>LoxClass</em>(), replace 1 line</div>\n\n<p>Which we declare here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  final String name;\n</pre><div class=\"source-file\"><em>lox/LoxClass.java</em><br>\nin class <em>LoxClass</em></div>\n<pre class=\"insert\">  <span class=\"k\">final</span> <span class=\"t\">LoxClass</span> 
<span class=\"i\">superclass</span>;\n</pre><pre class=\"insert-after\">  private final Map&lt;String, LoxFunction&gt; methods;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxClass.java</em>, in class <em>LoxClass</em></div>\n\n<p>With that, we can define classes that are subclasses of other classes. Now, what\ndoes having a superclass actually <em>do?</em></p>\n<h2><a href=\"#inheriting-methods\" id=\"inheriting-methods\"><small>13&#8202;.&#8202;2</small>Inheriting Methods</a></h2>\n<p>Inheriting from another class means that everything that&rsquo;s <span\nname=\"liskov\">true</span> of the superclass should be true, more or less, of the\nsubclass. In statically typed languages, that carries a lot of implications. The\nsub<em>class</em> must also be a sub<em>type</em>, and the memory layout is controlled so that\nyou can pass an instance of a subclass to a function expecting a superclass and\nit can still access the inherited fields correctly.</p>\n<aside name=\"liskov\">\n<p>A fancier name for this hand-wavey guideline is the <a href=\"https://en.wikipedia.org/wiki/Liskov_substitution_principle\"><em>Liskov substitution\nprinciple</em></a>. Barbara Liskov introduced it in a keynote during the\nformative period of object-oriented programming.</p>\n</aside>\n<p>Lox is a dynamically typed language, so our requirements are much simpler.\nBasically, it means that if you can call some method on an instance of the\nsuperclass, you should be able to call that method when given an instance of the\nsubclass. In other words, methods are inherited from the superclass.</p>\n<p>This lines up with one of the goals of inheritance<span class=\"em\">&mdash;</span>to give users a way to\nreuse code across classes. 
Implementing this in our interpreter is\nastonishingly easy.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return methods.get(name);\n    }\n\n</pre><div class=\"source-file\"><em>lox/LoxClass.java</em><br>\nin <em>findMethod</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">superclass</span> != <span class=\"k\">null</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">superclass</span>.<span class=\"i\">findMethod</span>(<span class=\"i\">name</span>);\n    }\n\n</pre><pre class=\"insert-after\">    return null;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/LoxClass.java</em>, in <em>findMethod</em>()</div>\n\n<p>That&rsquo;s literally all there is to it. When we are looking up a method on an\ninstance, if we don&rsquo;t find it on the instance&rsquo;s class, we recurse up through the\nsuperclass chain and look there. Give it a try:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Doughnut</span> {\n  <span class=\"i\">cook</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Fry until golden brown.&quot;</span>;\n  }\n}\n\n<span class=\"k\">class</span> <span class=\"t\">BostonCream</span> &lt; <span class=\"t\">Doughnut</span> {}\n\n<span class=\"t\">BostonCream</span>().<span class=\"i\">cook</span>();\n</pre></div>\n<p>There we go, half of our inheritance features are complete with only three lines\nof Java code.</p>\n<h2><a href=\"#calling-superclass-methods\" id=\"calling-superclass-methods\"><small>13&#8202;.&#8202;3</small>Calling Superclass Methods</a></h2>\n<p>In <code>findMethod()</code> we look for a method on the current class <em>before</em> walking up\nthe superclass chain. If a method with the same name exists in both the subclass\nand the superclass, the subclass one takes precedence or <strong>overrides</strong> the\nsuperclass method. 
Sort of like how variables in inner scopes shadow outer ones.</p>\n<p>That&rsquo;s great if the subclass wants to <em>replace</em> some superclass behavior\ncompletely. But, in practice, subclasses often want to <em>refine</em> the superclass&rsquo;s\nbehavior. They want to do a little work specific to the subclass, but also\nexecute the original superclass behavior too.</p>\n<p>However, since the subclass has overridden the method, there&rsquo;s no way to refer\nto the original one. If the subclass method tries to call it by name, it will\njust recursively hit its own override. We need a way to say &ldquo;Call this method,\nbut look for it directly on my superclass and ignore my override&rdquo;. Java uses\n<code>super</code> for this, and we&rsquo;ll use that same syntax in Lox. Here is an example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Doughnut</span> {\n  <span class=\"i\">cook</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Fry until golden brown.&quot;</span>;\n  }\n}\n\n<span class=\"k\">class</span> <span class=\"t\">BostonCream</span> &lt; <span class=\"t\">Doughnut</span> {\n  <span class=\"i\">cook</span>() {\n    <span class=\"k\">super</span>.<span class=\"i\">cook</span>();\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Pipe full of custard and coat with chocolate.&quot;</span>;\n  }\n}\n\n<span class=\"t\">BostonCream</span>().<span class=\"i\">cook</span>();\n</pre></div>\n<p>If you run this, it should print:</p>\n<div class=\"codehilite\"><pre>Fry until golden brown.\nPipe full of custard and coat with chocolate.\n</pre></div>\n<p>We have a new expression form. The <code>super</code> keyword, followed by a dot and an\nidentifier, looks for a method with that name. 
Unlike calls on <code>this</code>, the search\nstarts at the superclass.</p>\n<h3><a href=\"#syntax\" id=\"syntax\"><small>13&#8202;.&#8202;3&#8202;.&#8202;1</small>Syntax</a></h3>\n<p>With <code>this</code>, the keyword works sort of like a magic variable, and the expression\nis that one lone token. But with <code>super</code>, the subsequent <code>.</code> and property name\nare inseparable parts of the <code>super</code> expression. You can&rsquo;t have a bare <code>super</code>\ntoken all by itself.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> <span class=\"k\">super</span>; <span class=\"c\">// Syntax error.</span>\n</pre></div>\n<p>So the new clause we add to the <code>primary</code> rule in our grammar includes the\nproperty access as well.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">primary</span>        → <span class=\"s\">&quot;true&quot;</span> | <span class=\"s\">&quot;false&quot;</span> | <span class=\"s\">&quot;nil&quot;</span> | <span class=\"s\">&quot;this&quot;</span>\n               | <span class=\"t\">NUMBER</span> | <span class=\"t\">STRING</span> | <span class=\"t\">IDENTIFIER</span> | <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;)&quot;</span>\n               | <span class=\"s\">&quot;super&quot;</span> <span class=\"s\">&quot;.&quot;</span> <span class=\"t\">IDENTIFIER</span> ;\n</pre></div>\n<p>Typically, a <code>super</code> expression is used for a method call, but, as with regular\nmethods, the argument list is <em>not</em> part of the expression. Instead, a super\n<em>call</em> is a super <em>access</em> followed by a function call. 
Like other method calls,\nyou can get a handle to a superclass method and invoke it separately.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">method</span> = <span class=\"k\">super</span>.<span class=\"i\">cook</span>;\n<span class=\"i\">method</span>();\n</pre></div>\n<p>So the <code>super</code> expression itself contains only the token for the <code>super</code> keyword\nand the name of the method being looked up. The corresponding <span\nname=\"super-ast\">syntax tree node</span> is thus:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Set      : Expr object, Token name, Expr value&quot;,\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;Super    : Token keyword, Token method&quot;</span>,\n</pre><pre class=\"insert-after\">      &quot;This     : Token keyword&quot;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<aside name=\"super-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#super-expression\">Appendix II</a>.</p>\n</aside>\n<p>Following the grammar, the new parsing code goes inside our existing <code>primary()</code>\nmethod.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return new Expr.Literal(previous().literal);\n    }\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>primary</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">SUPER</span>)) {\n      <span class=\"t\">Token</span> <span class=\"i\">keyword</span> = <span class=\"i\">previous</span>();\n      <span class=\"i\">consume</span>(<span class=\"i\">DOT</span>, <span class=\"s\">&quot;Expect &#39;.&#39; after &#39;super&#39;.&quot;</span>);\n      <span class=\"t\">Token</span> <span class=\"i\">method</span> = <span 
class=\"i\">consume</span>(<span class=\"i\">IDENTIFIER</span>,\n          <span class=\"s\">&quot;Expect superclass method name.&quot;</span>);\n      <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Super</span>(<span class=\"i\">keyword</span>, <span class=\"i\">method</span>);\n    }\n</pre><pre class=\"insert-after\">\n\n    if (match(THIS)) return new Expr.This(previous());\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>primary</em>()</div>\n\n<p>A leading <code>super</code> keyword tells us we&rsquo;ve hit a <code>super</code> expression. After that we\nconsume the expected <code>.</code> and method name.</p>\n<h3><a href=\"#semantics\" id=\"semantics\"><small>13&#8202;.&#8202;3&#8202;.&#8202;2</small>Semantics</a></h3>\n<p>Earlier, I said a <code>super</code> expression starts the method lookup from &ldquo;the\nsuperclass&rdquo;, but <em>which</em> superclass? The naïve answer is the superclass of\n<code>this</code>, the object the surrounding method was called on. 
That coincidentally\nproduces the right behavior in a lot of cases, but that&rsquo;s not actually correct.\nGaze upon:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">A</span> {\n  <span class=\"i\">method</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;A method&quot;</span>;\n  }\n}\n\n<span class=\"k\">class</span> <span class=\"t\">B</span> &lt; <span class=\"t\">A</span> {\n  <span class=\"i\">method</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;B method&quot;</span>;\n  }\n\n  <span class=\"i\">test</span>() {\n    <span class=\"k\">super</span>.<span class=\"i\">method</span>();\n  }\n}\n\n<span class=\"k\">class</span> <span class=\"t\">C</span> &lt; <span class=\"t\">B</span> {}\n\n<span class=\"t\">C</span>().<span class=\"i\">test</span>();\n</pre></div>\n<p>Translate this program to Java, C#, or C++ and it will print &ldquo;A method&rdquo;, which\nis what we want Lox to do too. When this program runs, inside the body of\n<code>test()</code>, <code>this</code> is an instance of C. The superclass of C is B, but that is\n<em>not</em> where the lookup should start. If it did, we would hit B&rsquo;s <code>method()</code>.</p>\n<p>Instead, lookup should start on the superclass of <em>the class containing the\n<code>super</code> expression</em>. In this case, since <code>test()</code> is defined inside B, the\n<code>super</code> expression inside it should start the lookup on <em>B</em>&rsquo;s superclass<span class=\"em\">&mdash;</span>A.</p>\n<p><span name=\"flow\"></span></p><img src=\"image/inheritance/classes.png\" alt=\"The call chain flowing through the classes.\" />\n<aside name=\"flow\">\n<p>The execution flow looks something like this:</p>\n<ol>\n<li>\n<p>We call <code>test()</code> on an instance of C.</p>\n</li>\n<li>\n<p>That enters the <code>test()</code> method inherited from B. 
That calls\n<code>super.method()</code>.</p>\n</li>\n<li>\n<p>The superclass of B is A, so that chains to <code>method()</code> on A, and the program\nprints &ldquo;A method&rdquo;.</p>\n</li>\n</ol>\n</aside>\n<p>Thus, in order to evaluate a <code>super</code> expression, we need access to the\nsuperclass of the class definition surrounding the call. Alack and alas, at the\npoint in the interpreter where we are executing a <code>super</code> expression, we don&rsquo;t\nhave that easily available.</p>\n<p>We <em>could</em> add a field to LoxFunction to store a reference to the LoxClass that\nowns that method. The interpreter would keep a reference to the\ncurrently executing LoxFunction so that we could look it up later when we hit a\n<code>super</code> expression. From there, we&rsquo;d get the LoxClass of the method, then its\nsuperclass.</p>\n<p>That&rsquo;s a lot of plumbing. In the <a href=\"classes.html\">last chapter</a>, we had a similar problem when\nwe needed to add support for <code>this</code>. In that case, we used our existing\nenvironment and closure mechanism to store a reference to the current object.\nCould we do something similar for storing the superclass<span\nname=\"rhetorical\">?</span> Well, I probably wouldn&rsquo;t be talking about it if the\nanswer was no, so<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>yes.</p>\n<aside name=\"rhetorical\">\n<p>Does anyone even like rhetorical questions?</p>\n</aside>\n<p>One important difference is that we bound <code>this</code> when the method was <em>accessed</em>.\nThe same method can be called on different instances and each needs its own\n<code>this</code>. With <code>super</code> expressions, the superclass is a fixed property of the\n<em>class declaration itself</em>. Every time you evaluate some <code>super</code> expression, the\nsuperclass is always the same.</p>\n<p>That means we can create the environment for the superclass once, when the class\ndefinition is executed. 
Immediately before we define the methods, we make a new\nenvironment to bind the class&rsquo;s superclass to the name <code>super</code>.</p><img src=\"image/inheritance/superclass.png\" alt=\"The superclass environment.\" />\n<p>When we create the LoxFunction runtime representation for each method, that is\nthe environment they will capture in their closure. Later, when a method is\ninvoked and <code>this</code> is bound, the superclass environment becomes the parent for\nthe method&rsquo;s environment, like so:</p><img src=\"image/inheritance/environments.png\" alt=\"The environment chain including the superclass environment.\" />\n<p>That&rsquo;s a lot of machinery, but we&rsquo;ll get through it a step at a time. Before we\ncan get to creating the environment at runtime, we need to handle the\ncorresponding scope chain in the resolver.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      resolve(stmt.superclass);\n    }\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">if</span> (<span class=\"i\">stmt</span>.<span class=\"i\">superclass</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">beginScope</span>();\n      <span class=\"i\">scopes</span>.<span class=\"i\">peek</span>().<span class=\"i\">put</span>(<span class=\"s\">&quot;super&quot;</span>, <span class=\"k\">true</span>);\n    }\n</pre><pre class=\"insert-after\">\n\n    beginScope();\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitClassStmt</em>()</div>\n\n<p>If the class declaration has a superclass, then we create a new scope\nsurrounding all of its methods. In that scope, we define the name &ldquo;super&rdquo;. 
Once\nwe&rsquo;re done resolving the class&rsquo;s methods, we discard that scope.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    endScope();\n\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">stmt</span>.<span class=\"i\">superclass</span> != <span class=\"k\">null</span>) <span class=\"i\">endScope</span>();\n\n</pre><pre class=\"insert-after\">    currentClass = enclosingClass;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitClassStmt</em>()</div>\n\n<p>It&rsquo;s a minor optimization, but we only create the superclass environment if the\nclass actually <em>has</em> a superclass. There&rsquo;s no point creating it when there isn&rsquo;t\na superclass since there&rsquo;d be no superclass to store in it anyway.</p>\n<p>With &ldquo;super&rdquo; defined in a scope chain, we are able to resolve the <code>super</code>\nexpression itself.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitSetExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitSuperExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Super</span> <span class=\"i\">expr</span>) {\n    <span class=\"i\">resolveLocal</span>(<span class=\"i\">expr</span>, <span class=\"i\">expr</span>.<span class=\"i\">keyword</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitSetExpr</em>()</div>\n\n<p>We resolve the <code>super</code> token exactly as if it were a variable. 
The resolution\nstores the number of hops along the environment chain that the interpreter needs\nto walk to find the environment where the superclass is stored.</p>\n<p>This code is mirrored in the interpreter. When we evaluate a subclass\ndefinition, we create a new environment.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        throw new RuntimeError(stmt.superclass.name,\n            &quot;Superclass must be a class.&quot;);\n      }\n    }\n\n    environment.define(stmt.name.lexeme, null);\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">if</span> (<span class=\"i\">stmt</span>.<span class=\"i\">superclass</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">environment</span> = <span class=\"k\">new</span> <span class=\"t\">Environment</span>(<span class=\"i\">environment</span>);\n      <span class=\"i\">environment</span>.<span class=\"i\">define</span>(<span class=\"s\">&quot;super&quot;</span>, <span class=\"i\">superclass</span>);\n    }\n</pre><pre class=\"insert-after\">\n\n    Map&lt;String, LoxFunction&gt; methods = new HashMap&lt;&gt;();\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitClassStmt</em>()</div>\n\n<p>Inside that environment, we store a reference to the superclass<span class=\"em\">&mdash;</span>the actual\nLoxClass object for the superclass which we have now that we are in the runtime.\nThen we create the LoxFunctions for each method. Those will capture the current\nenvironment<span class=\"em\">&mdash;</span>the one where we just bound &ldquo;super&rdquo;<span class=\"em\">&mdash;</span>as their closure, holding\non to the superclass like we need. 
Once that&rsquo;s done, we pop the environment.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    LoxClass klass = new LoxClass(stmt.name.lexeme,\n        (LoxClass)superclass, methods);\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">if</span> (<span class=\"i\">superclass</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">environment</span> = <span class=\"i\">environment</span>.<span class=\"i\">enclosing</span>;\n    }\n</pre><pre class=\"insert-after\">\n\n    environment.assign(stmt.name, klass);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitClassStmt</em>()</div>\n\n<p>We&rsquo;re ready to interpret <code>super</code> expressions themselves. There are a few moving\nparts, so we&rsquo;ll build this method up in pieces.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitSetExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">visitSuperExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Super</span> <span class=\"i\">expr</span>) {\n    <span class=\"t\">int</span> <span class=\"i\">distance</span> = <span class=\"i\">locals</span>.<span class=\"i\">get</span>(<span class=\"i\">expr</span>);\n    <span class=\"t\">LoxClass</span> <span class=\"i\">superclass</span> = (<span class=\"t\">LoxClass</span>)<span class=\"i\">environment</span>.<span class=\"i\">getAt</span>(\n        <span class=\"i\">distance</span>, <span class=\"s\">&quot;super&quot;</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitSetExpr</em>()</div>\n\n<p>First, the work we&rsquo;ve been leading up to. 
We look up the surrounding class&rsquo;s\nsuperclass by looking up &ldquo;super&rdquo; in the proper environment.</p>\n<p>When we access a method, we also need to bind <code>this</code> to the object the method is\naccessed from. In an expression like <code>doughnut.cook</code>, the object is whatever we\nget from evaluating <code>doughnut</code>. In a <code>super</code> expression like <code>super.cook</code>, the\ncurrent object is implicitly the <em>same</em> current object that we&rsquo;re using. In\nother words, <code>this</code>. Even though we are looking up the <em>method</em> on the\nsuperclass, the <em>instance</em> is still <code>this</code>.</p>\n<p>Unfortunately, inside the <code>super</code> expression, we don&rsquo;t have a convenient node\nfor the resolver to hang the number of hops to <code>this</code> on. Fortunately, we do\ncontrol the layout of the environment chains. The environment where &ldquo;this&rdquo; is\nbound is always right inside the environment where we store &ldquo;super&rdquo;.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    LoxClass superclass = (LoxClass)environment.getAt(\n        distance, &quot;super&quot;);\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitSuperExpr</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"t\">LoxInstance</span> <span class=\"i\">object</span> = (<span class=\"t\">LoxInstance</span>)<span class=\"i\">environment</span>.<span class=\"i\">getAt</span>(\n        <span class=\"i\">distance</span> - <span class=\"n\">1</span>, <span class=\"s\">&quot;this&quot;</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitSuperExpr</em>()</div>\n\n<p>Offsetting the distance by one looks up &ldquo;this&rdquo; in that inner environment. 
I\nadmit this isn&rsquo;t the most <span name=\"elegant\">elegant</span> code, but it\nworks.</p>\n<aside name=\"elegant\">\n<p>Writing a book that includes every single line of code for a program means I\ncan&rsquo;t hide the hacks by leaving them as an &ldquo;exercise for the reader&rdquo;.</p>\n</aside>\n<p>Now we&rsquo;re ready to look up and bind the method, starting at the superclass.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    LoxInstance object = (LoxInstance)environment.getAt(\n        distance - 1, &quot;this&quot;);\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitSuperExpr</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"t\">LoxFunction</span> <span class=\"i\">method</span> = <span class=\"i\">superclass</span>.<span class=\"i\">findMethod</span>(<span class=\"i\">expr</span>.<span class=\"i\">method</span>.<span class=\"i\">lexeme</span>);\n    <span class=\"k\">return</span> <span class=\"i\">method</span>.<span class=\"i\">bind</span>(<span class=\"i\">object</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitSuperExpr</em>()</div>\n\n<p>This is almost exactly like the code for looking up a method of a get\nexpression, except that we call <code>findMethod()</code> on the superclass instead of on\nthe class of the current object.</p>\n<p>That&rsquo;s basically it. 
Except, of course, that we might <em>fail</em> to find the method.\nSo we check for that too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">\n\n    LoxFunction method = superclass.findMethod(expr.method.lexeme);\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitSuperExpr</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">if</span> (<span class=\"i\">method</span> == <span class=\"k\">null</span>) {\n      <span class=\"k\">throw</span> <span class=\"k\">new</span> <span class=\"t\">RuntimeError</span>(<span class=\"i\">expr</span>.<span class=\"i\">method</span>,\n          <span class=\"s\">&quot;Undefined property &#39;&quot;</span> + <span class=\"i\">expr</span>.<span class=\"i\">method</span>.<span class=\"i\">lexeme</span> + <span class=\"s\">&quot;&#39;.&quot;</span>);\n    }\n\n</pre><pre class=\"insert-after\">    return method.bind(object);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitSuperExpr</em>()</div>\n\n<p>There you have it! Take that BostonCream example earlier and give it a try.\nAssuming you and I did everything right, it should fry it first, then stuff it\nwith cream.</p>\n<h3><a href=\"#invalid-uses-of-super\" id=\"invalid-uses-of-super\"><small>13&#8202;.&#8202;3&#8202;.&#8202;3</small>Invalid uses of super</a></h3>\n<p>As with previous language features, our implementation does the right thing when\nthe user writes correct code, but we haven&rsquo;t bulletproofed the interpreter\nagainst bad code. In particular, consider:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Eclair</span> {\n  <span class=\"i\">cook</span>() {\n    <span class=\"k\">super</span>.<span class=\"i\">cook</span>();\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Pipe full of crème pâtissière.&quot;</span>;\n  }\n}\n</pre></div>\n<p>This class has a <code>super</code> expression, but no superclass. 
At runtime, the code for\nevaluating <code>super</code> expressions assumes that &ldquo;super&rdquo; was successfully resolved\nand will be found in the environment. That&rsquo;s going to fail here because there is\nno surrounding environment for the superclass since there is no superclass. The\nJVM will throw an exception and bring our interpreter to its knees.</p>\n<p>Heck, there are even simpler broken uses of super:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">super</span>.<span class=\"i\">notEvenInAClass</span>();\n</pre></div>\n<p>We could handle errors like these at runtime by checking to see if the lookup\nof &ldquo;super&rdquo; succeeded. But we can tell statically<span class=\"em\">&mdash;</span>just by looking at the\nsource code<span class=\"em\">&mdash;</span>that Eclair has no superclass and thus no <code>super</code> expression will\nwork inside it. Likewise, in the second example, we know that the <code>super</code>\nexpression is not even inside a method body.</p>\n<p>Even though Lox is dynamically typed, that doesn&rsquo;t mean we want to defer\n<em>everything</em> to runtime. If the user made a mistake, we&rsquo;d like to help them find\nit sooner rather than later. 
So we&rsquo;ll report these errors statically, in the\nresolver.</p>\n<p>First, we add a new case to the enum we use to keep track of what kind of class\nis surrounding the current code being visited.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    NONE,\n</pre><pre class=\"insert-before\">    <span class=\"i\">CLASS</span><span class=\"insert-comma\">,</span>\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin enum <em>ClassType</em><br>\nadd <em>&ldquo;,&rdquo;</em> to previous line</div>\n<pre class=\"insert\">    <span class=\"i\">SUBCLASS</span>\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in enum <em>ClassType</em>, add <em>&ldquo;,&rdquo;</em> to previous line</div>\n\n<p>We&rsquo;ll use that to distinguish when we&rsquo;re inside a class that has a superclass\nversus one that doesn&rsquo;t. When we resolve a class declaration, we set that if the\nclass is a subclass.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    if (stmt.superclass != null) {\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitClassStmt</em>()</div>\n<pre class=\"insert\">      <span class=\"i\">currentClass</span> = <span class=\"t\">ClassType</span>.<span class=\"i\">SUBCLASS</span>;\n</pre><pre class=\"insert-after\">      resolve(stmt.superclass);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitClassStmt</em>()</div>\n\n<p>Then, when we resolve a <code>super</code> expression, we check to see that we are\ncurrently inside a scope where that&rsquo;s allowed.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  public Void visitSuperExpr(Expr.Super expr) {\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitSuperExpr</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">currentClass</span> == <span class=\"t\">ClassType</span>.<span 
class=\"i\">NONE</span>) {\n      <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">expr</span>.<span class=\"i\">keyword</span>,\n          <span class=\"s\">&quot;Can&#39;t use &#39;super&#39; outside of a class.&quot;</span>);\n    } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">currentClass</span> != <span class=\"t\">ClassType</span>.<span class=\"i\">SUBCLASS</span>) {\n      <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">expr</span>.<span class=\"i\">keyword</span>,\n          <span class=\"s\">&quot;Can&#39;t use &#39;super&#39; in a class with no superclass.&quot;</span>);\n    }\n\n</pre><pre class=\"insert-after\">    resolveLocal(expr, expr.keyword);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitSuperExpr</em>()</div>\n\n<p>If not<span class=\"em\">&mdash;</span>oopsie!<span class=\"em\">&mdash;</span>the user made a mistake.</p>\n<h2><a href=\"#conclusion\" id=\"conclusion\"><small>13&#8202;.&#8202;4</small>Conclusion</a></h2>\n<p>We made it! That final bit of error handling is the last chunk of code needed to\ncomplete our Java implementation of Lox. This is a real <span\nname=\"superhero\">accomplishment</span> and one you should be proud of. 
In the\npast dozen chapters and a thousand or so lines of code, we have learned and\nimplemented<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<ul>\n<li><a href=\"scanning.html\">tokens and lexing</a>,</li>\n<li><a href=\"representing-code.html\">abstract syntax trees</a>,</li>\n<li><a href=\"parsing-expressions.html\">recursive descent parsing</a>,</li>\n<li>prefix and infix expressions,</li>\n<li>runtime representation of objects,</li>\n<li><a href=\"evaluating-expressions.html\">interpreting code using the Visitor pattern</a>,</li>\n<li><a href=\"statements-and-state.html\">lexical scope</a>,</li>\n<li>environment chains for storing variables,</li>\n<li><a href=\"control-flow.html\">control flow</a>,</li>\n<li><a href=\"functions.html\">functions with parameters</a>,</li>\n<li>closures,</li>\n<li><a href=\"resolving-and-binding.html\">static variable resolution and error detection</a>,</li>\n<li><a href=\"classes.html\">classes</a>,</li>\n<li>constructors,</li>\n<li>fields,</li>\n<li>methods, and finally,</li>\n<li>inheritance.</li>\n</ul>\n<aside name=\"superhero\"><img src=\"image/inheritance/superhero.png\" alt=\"You, being your bad self.\" />\n</aside>\n<p>We did all of that from scratch, with no external dependencies or magic tools.\nJust you and I, our respective text editors, a couple of collection classes in\nthe Java standard library, and the JVM runtime.</p>\n<p>This marks the end of Part II, but not the end of the book. Take a break. Maybe\nwrite a few fun Lox programs and run them in your interpreter. (You may want to\nadd a few more native methods for things like reading user input.) 
When you&rsquo;re\nrefreshed and ready, we&rsquo;ll embark on our <a href=\"a-bytecode-virtual-machine.html\">next adventure</a>.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Lox supports only <em>single inheritance</em><span class=\"em\">&mdash;</span>a class may have a single\nsuperclass and that&rsquo;s the only way to reuse methods across classes. Other\nlanguages have explored a variety of ways to more freely reuse and share\ncapabilities across classes: mixins, traits, multiple inheritance, virtual\ninheritance, extension methods, etc.</p>\n<p>If you were to add some feature along these lines to Lox, which would you\npick and why? If you&rsquo;re feeling courageous (and you should be at this\npoint), go ahead and add it.</p>\n</li>\n<li>\n<p>In Lox, as in most other object-oriented languages, when looking up a\nmethod, we start at the bottom of the class hierarchy and work our way up<span class=\"em\">&mdash;</span>a subclass&rsquo;s method is preferred over a superclass&rsquo;s. In order to get to the\nsuperclass method from within an overriding method, you use <code>super</code>.</p>\n<p>The language <a href=\"https://beta.cs.au.dk/\">BETA</a> takes the <a href=\"http://journal.stuffwithstuff.com/2012/12/19/the-impoliteness-of-overriding-methods/\">opposite approach</a>. When you call a\nmethod, it starts at the <em>top</em> of the class hierarchy and works <em>down</em>. A\nsuperclass method wins over a subclass method. In order to get to the\nsubclass method, the superclass method can call <code>inner</code>, which is sort of\nlike the inverse of <code>super</code>. It chains to the next method down the\nhierarchy.</p>\n<p>The superclass method controls when and where the subclass is allowed to\nrefine its behavior. 
If the superclass method doesn&rsquo;t call <code>inner</code> at all,\nthen the subclass has no way of overriding or modifying the superclass&rsquo;s\nbehavior.</p>\n<p>Take out Lox&rsquo;s current overriding and <code>super</code> behavior and replace it with\nBETA&rsquo;s semantics. In short:</p>\n<ul>\n<li>\n<p>When calling a method on a class, prefer the method <em>highest</em> on the\nclass&rsquo;s inheritance chain.</p>\n</li>\n<li>\n<p>Inside the body of a method, a call to <code>inner</code> looks for a method with\nthe same name in the nearest subclass along the inheritance chain\nbetween the class containing the <code>inner</code> and the class of <code>this</code>. If\nthere is no matching method, the <code>inner</code> call does nothing.</p>\n</li>\n</ul>\n<p>For example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Doughnut</span> {\n  <span class=\"i\">cook</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Fry until golden brown.&quot;</span>;\n    <span class=\"i\">inner</span>();\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Place in a nice box.&quot;</span>;\n  }\n}\n\n<span class=\"k\">class</span> <span class=\"t\">BostonCream</span> &lt; <span class=\"t\">Doughnut</span> {\n  <span class=\"i\">cook</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Pipe full of custard and coat with chocolate.&quot;</span>;\n  }\n}\n\n<span class=\"t\">BostonCream</span>().<span class=\"i\">cook</span>();\n</pre></div>\n<p>This should print:</p>\n<div class=\"codehilite\"><pre>Fry until golden brown.\nPipe full of custard and coat with chocolate.\nPlace in a nice box.\n</pre></div>\n</li>\n<li>\n<p>In the chapter where I introduced Lox, <a href=\"the-lox-language.html#challenges\">I challenged you</a> to\ncome up with a couple of features you think the language is missing. 
Now\nthat you know how to build an interpreter, implement one of those features.</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"a-bytecode-virtual-machine.html\" class=\"next\">\n  Next Part: &ldquo;A Bytecode Virtual Machine&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/introduction.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Introduction &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Introduction<small>1</small></a></h3>\n\n<ul>\n    <li><a href=\"#why-learn-this-stuff\"><small>1.1</small> Why Learn This Stuff?</a></li>\n    <li><a href=\"#how-the-book-is-organized\"><small>1.2</small> How the Book Is Organized</a></li>\n    <li><a href=\"#the-first-interpreter\"><small>1.3</small> The First Interpreter</a></li>\n    <li><a href=\"#the-second-interpreter\"><small>1.4</small> The Second Interpreter</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a 
href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>What&#x27;s in a Name?</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"welcome.html\" title=\"Welcome\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"welcome.html\" title=\"Welcome\">&uarr;&nbsp;Up</a>\n    <a href=\"a-map-of-the-territory.html\" title=\"A Map of the Territory\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"welcome.html\" title=\"Welcome\" class=\"prev\">←</a>\n<a href=\"a-map-of-the-territory.html\" title=\"A Map of the Territory\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Introduction<small>1</small></a></h3>\n\n<ul>\n    <li><a href=\"#why-learn-this-stuff\"><small>1.1</small> Why Learn This Stuff?</a></li>\n    <li><a href=\"#how-the-book-is-organized\"><small>1.2</small> How the Book Is Organized</a></li>\n    <li><a href=\"#the-first-interpreter\"><small>1.3</small> The First Interpreter</a></li>\n    <li><a href=\"#the-second-interpreter\"><small>1.4</small> The Second Interpreter</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>What&#x27;s in a Name?</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"welcome.html\" title=\"Welcome\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"welcome.html\" title=\"Welcome\">&uarr;&nbsp;Up</a>\n    <a href=\"a-map-of-the-territory.html\" title=\"A Map of the Territory\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article 
class=\"chapter\">\n\n  <div class=\"number\">1</div>\n  <h1>Introduction</h1>\n\n<blockquote>\n<p>Fairy tales are more than true: not because they tell us that dragons exist,\nbut because they tell us that dragons can be beaten.</p>\n<p><cite>G.K. Chesterton by way of Neil Gaiman, <em>Coraline</em></cite></p>\n</blockquote>\n<p>I&rsquo;m really excited we&rsquo;re going on this journey together. This is a book on\nimplementing interpreters for programming languages. It&rsquo;s also a book on how to\ndesign a language worth implementing. It&rsquo;s the book I wish I&rsquo;d had when I first\nstarted getting into languages, and it&rsquo;s the book I&rsquo;ve been writing in my <span\nname=\"head\">head</span> for nearly a decade.</p>\n<aside name=\"head\">\n<p>To my friends and family, sorry I&rsquo;ve been so absentminded!</p>\n</aside>\n<p>In these pages, we will walk step-by-step through two complete interpreters for\na full-featured language. I assume this is your first foray into languages, so\nI&rsquo;ll cover each concept and line of code you need to build a complete, usable,\nfast language implementation.</p>\n<p>In order to cram two full implementations inside one book without it turning\ninto a doorstop, this text is lighter on theory than others. As we build each\npiece of the system, I will introduce the history and concepts behind it. I&rsquo;ll\ntry to get you familiar with the lingo so that if you ever find yourself at a\n<span name=\"party\">cocktail party</span> full of PL (programming language)\nresearchers, you&rsquo;ll fit in.</p>\n<aside name=\"party\">\n<p>Strangely enough, a situation I have found myself in multiple times. You\nwouldn&rsquo;t believe how much some of them can drink.</p>\n</aside>\n<p>But we&rsquo;re mostly going to spend our brain juice getting the language up and\nrunning. This is not to say theory isn&rsquo;t important. 
Being able to reason\nprecisely and <span name=\"formal\">formally</span> about syntax and semantics is\na vital skill when working on a language. But, personally, I learn best by\ndoing. It&rsquo;s hard for me to wade through paragraphs full of abstract concepts and\nreally absorb them. But if I&rsquo;ve coded something, run it, and debugged it, then I\n<em>get</em> it.</p>\n<aside name=\"formal\">\n<p>Static type systems in particular require rigorous formal reasoning. Hacking on\na type system has the same feel as proving a theorem in mathematics.</p>\n<p>It turns out this is no coincidence. In the early half of last century, Haskell\nCurry and William Alvin Howard showed that they are two sides of the same coin:\n<a href=\"https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence\">the Curry-Howard isomorphism</a>.</p>\n</aside>\n<p>That&rsquo;s my goal for you. I want you to come away with a solid intuition of how a\nreal language lives and breathes. My hope is that when you read other, more\ntheoretical books later, the concepts there will firmly stick in your mind,\nadhered to this tangible substrate.</p>\n<h2><a href=\"#why-learn-this-stuff\" id=\"why-learn-this-stuff\"><small>1&#8202;.&#8202;1</small>Why Learn This Stuff?</a></h2>\n<p>Every introduction to every compiler book seems to have this section. I don&rsquo;t\nknow what it is about programming languages that causes such existential doubt.\nI don&rsquo;t think ornithology books worry about justifying their existence. They\nassume the reader loves birds and start teaching.</p>\n<p>But programming languages are a little different. I suppose it is true that the\nodds of any of us creating a broadly successful, general-purpose programming\nlanguage are slim. The designers of the world&rsquo;s widely used languages could fit\nin a Volkswagen bus, even without putting the pop-top camper up. If joining that\nelite group was the <em>only</em> reason to learn languages, it would be hard to\njustify. 
Fortunately, it isn&rsquo;t.</p>\n<h3><a href=\"#little-languages-are-everywhere\" id=\"little-languages-are-everywhere\"><small>1&#8202;.&#8202;1&#8202;.&#8202;1</small>Little languages are everywhere</a></h3>\n<p>For every successful general-purpose language, there are a thousand successful\nniche ones. We used to call them &ldquo;little languages&rdquo;, but inflation in the jargon\neconomy led to the name &ldquo;domain-specific languages&rdquo;. These are pidgins\ntailor-built to a specific task. Think application scripting languages, template\nengines, markup formats, and configuration files.</p>\n<p><span name=\"little\"></span><img src=\"image/introduction/little-languages.png\" alt=\"A random selection of little languages.\" /></p>\n<aside name=\"little\">\n<p>A random selection of some little languages you might run into.</p>\n</aside>\n<p>Almost every large software project needs a handful of these. When you can, it&rsquo;s\ngood to reuse an existing one instead of rolling your own. Once you factor in\ndocumentation, debuggers, editor support, syntax highlighting, and all of the\nother trappings, doing it yourself becomes a tall order.</p>\n<p>But there&rsquo;s still a good chance you&rsquo;ll find yourself needing to whip up a parser\nor other tool when there isn&rsquo;t an existing library that fits your needs. Even\nwhen you are reusing some existing implementation, you&rsquo;ll inevitably end up\nneeding to debug and maintain it and poke around in its guts.</p>\n<h3><a href=\"#languages-are-great-exercise\" id=\"languages-are-great-exercise\"><small>1&#8202;.&#8202;1&#8202;.&#8202;2</small>Languages are great exercise</a></h3>\n<p>Long distance runners sometimes train with weights strapped to their ankles or\nat high altitudes where the atmosphere is thin. 
When they later unburden\nthemselves, the new relative ease of light limbs and oxygen-rich air enables\nthem to run farther and faster.</p>\n<p>Implementing a language is a real test of programming skill. The code is complex\nand performance critical. You must master recursion, dynamic arrays, trees,\ngraphs, and hash tables. You probably use hash tables at least in your\nday-to-day programming, but do you <em>really</em> understand them? Well, after we&rsquo;ve\ncrafted our own from scratch, I guarantee you will.</p>\n<p>While I intend to show you that an interpreter isn&rsquo;t as daunting as you might\nbelieve, implementing one well is still a challenge. Rise to it, and you&rsquo;ll come\naway a stronger programmer, and smarter about how you use data structures and\nalgorithms in your day job.</p>\n<h3><a href=\"#one-more-reason\" id=\"one-more-reason\"><small>1&#8202;.&#8202;1&#8202;.&#8202;3</small>One more reason</a></h3>\n<p>This last reason is hard for me to admit, because it&rsquo;s so close to my heart.\nEver since I learned to program as a kid, I felt there was something magical\nabout languages. When I first tapped out BASIC programs one key at a time I\ncouldn&rsquo;t conceive how BASIC <em>itself</em> was made.</p>\n<p>Later, the mixture of awe and terror on my college friends&rsquo; faces when talking\nabout their compilers class was enough to convince me language hackers were a\ndifferent breed of human<span class=\"em\">&mdash;</span>some sort of wizards granted privileged access to\narcane arts.</p>\n<p>It&rsquo;s a charming <span name=\"image\">image</span>, but it has a darker side. <em>I</em>\ndidn&rsquo;t feel like a wizard, so I was left thinking I lacked some inborn quality\nnecessary to join the cabal. Though I&rsquo;ve been fascinated by languages ever since\nI doodled made-up keywords in my school notebook, it took me decades to muster\nthe courage to try to really learn them. 
That &ldquo;magical&rdquo; quality, that sense of\nexclusivity, excluded <em>me</em>.</p>\n<aside name=\"image\">\n<p>And its practitioners don&rsquo;t hesitate to play up this image. Two of the seminal\ntexts on programming languages feature a <a href=\"https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools\">dragon</a> and a <a href=\"https://mitpress.mit.edu/sites/default/files/sicp/index.html\">wizard</a> on their\ncovers.</p>\n</aside>\n<p>When I did finally start cobbling together my own little interpreters, I quickly\nlearned that, of course, there is no magic at all. It&rsquo;s just code, and the\npeople who hack on languages are just people.</p>\n<p>There <em>are</em> a few techniques you don&rsquo;t often encounter outside of languages, and\nsome parts are a little difficult. But not more difficult than other obstacles\nyou&rsquo;ve overcome. My hope is that if you&rsquo;ve felt intimidated by languages and\nthis book helps you overcome that fear, maybe I&rsquo;ll leave you just a tiny bit\nbraver than you were before.</p>\n<p>And, who knows, maybe you <em>will</em> make the next great language. Someone has to.</p>\n<h2><a href=\"#how-the-book-is-organized\" id=\"how-the-book-is-organized\"><small>1&#8202;.&#8202;2</small>How the Book Is Organized</a></h2>\n<p>This book is broken into three parts. You&rsquo;re reading the first one now. It&rsquo;s a\ncouple of chapters to get you oriented, teach you some of the lingo that\nlanguage hackers use, and introduce you to Lox, the language we&rsquo;ll be\nimplementing.</p>\n<p>Each of the other two parts builds one complete Lox interpreter. Within those\nparts, each chapter is structured the same way. 
The chapter takes a single\nlanguage feature, teaches you the concepts behind it, and walks you through an\nimplementation.</p>\n<p>It took a good bit of trial and error on my part, but I managed to carve up the\ntwo interpreters into chapter-sized chunks that build on the previous chapters\nbut require nothing from later ones. From the very first chapter, you&rsquo;ll have a\nworking program you can run and play with. With each passing chapter, it grows\nincreasingly full-featured until you eventually have a complete language.</p>\n<p>Aside from copious, scintillating English prose, chapters have a few other\ndelightful facets:</p>\n<h3><a href=\"#the-code\" id=\"the-code\"><small>1&#8202;.&#8202;2&#8202;.&#8202;1</small>The code</a></h3>\n<p>We&rsquo;re about <em>crafting</em> interpreters, so this book contains real code. Every\nsingle line of code needed is included, and each snippet tells you where to\ninsert it in your ever-growing implementation.</p>\n<p>Many other language books and language implementations use tools like <a href=\"https://en.wikipedia.org/wiki/Lex_(software)\">Lex</a>\nand <span name=\"yacc\"><a href=\"https://en.wikipedia.org/wiki/Yacc\">Yacc</a></span>, so-called <strong>compiler-compilers</strong>, that\nautomatically generate some of the source files for an implementation from some\nhigher-level description. There are pros and cons to tools like those, and\nstrong opinions<span class=\"em\">&mdash;</span>some might say religious convictions<span class=\"em\">&mdash;</span>on both sides.</p>\n<aside name=\"yacc\">\n<p>Yacc is a tool that takes in a grammar file and produces a source file for a\ncompiler, so it&rsquo;s sort of like a &ldquo;compiler&rdquo; that outputs a compiler, which is\nwhere we get the term &ldquo;compiler-compiler&rdquo;.</p>\n<p>Yacc wasn&rsquo;t the first of its ilk, which is why it&rsquo;s named &ldquo;Yacc&rdquo;<span class=\"em\">&mdash;</span><em>Yet\nAnother</em> Compiler-Compiler. 
A later similar tool is <a href=\"https://en.wikipedia.org/wiki/GNU_bison\">Bison</a>, named as a pun on\nthe pronunciation of Yacc like &ldquo;yak&rdquo;.</p><img src=\"image/introduction/yak.png\" alt=\"A yak.\" />\n<p>If you find all of these little self-references and puns charming and fun,\nyou&rsquo;ll fit right in here. If not, well, maybe the language nerd sense of humor\nis an acquired taste.</p>\n</aside>\n<p>We will abstain from using them here. I want to ensure there are no dark corners\nwhere magic and confusion can hide, so we&rsquo;ll write everything by hand. As you&rsquo;ll\nsee, it&rsquo;s not as bad as it sounds, and it means you really will understand each\nline of code and how both interpreters work.</p>\n<p>A book has different constraints from the &ldquo;real world&rdquo; and so the coding style\nhere might not always reflect the best way to write maintainable production\nsoftware. If I seem a little cavalier about, say, omitting <code>private</code> or\ndeclaring a global variable, understand I do so to keep the code easier on your\neyes. The pages here aren&rsquo;t as wide as your IDE and every character counts.</p>\n<p>Also, the code doesn&rsquo;t have many comments. That&rsquo;s because each handful of lines\nis surrounded by several paragraphs of honest-to-God prose explaining it. When\nyou write a book to accompany your program, you are welcome to omit comments\ntoo. Otherwise, you should probably use <code>//</code> a little more than I do.</p>\n<p>While the book contains every line of code and teaches what each means, it does\nnot describe the machinery needed to compile and run the interpreter. I assume\nyou can slap together a makefile or a project in your IDE of choice in order to\nget the code to run. 
Those kinds of instructions get out of date quickly, and\nI want this book to age like XO brandy, not backyard hooch.</p>\n<h3><a href=\"#snippets\" id=\"snippets\"><small>1&#8202;.&#8202;2&#8202;.&#8202;2</small>Snippets</a></h3>\n<p>Since the book contains literally every line of code needed for the\nimplementations, the snippets are quite precise. Also, because I try to keep the\nprogram in a runnable state even when major features are missing, sometimes we\nadd temporary code that gets replaced in later snippets.</p>\n<p>A snippet with all the bells and whistles looks like this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">\n      default:\n</pre><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nin <em>scanToken</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">\n        <span class=\"k\">if</span> (<span class=\"i\">isDigit</span>(<span class=\"i\">c</span>)) {\n          <span class=\"i\">number</span>();\n        } <span class=\"k\">else</span> {\n          <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">line</span>, <span class=\"s\">&quot;Unexpected character.&quot;</span>);\n        }\n</pre><pre class=\"insert-after\">\n        break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, in <em>scanToken</em>(), replace 1 line</div>\n<p>In the center, you have the new code to add. It may have a few faded out lines\nabove or below to show where it goes in the existing surrounding code. There is\nalso a little blurb telling you in which file and where to place the snippet. 
If\nthat blurb says &ldquo;replace _ lines&rdquo;, there is some existing code between the faded\nlines that you need to remove and replace with the new snippet.</p>\n<h3><a href=\"#asides\" id=\"asides\"><small>1&#8202;.&#8202;2&#8202;.&#8202;3</small>Asides</a></h3>\n<p><span name=\"joke\">Asides</span> contain biographical sketches, historical\nbackground, references to related topics, and suggestions of other areas to\nexplore. There&rsquo;s nothing that you <em>need</em> to know in them to understand later\nparts of the book, so you can skip them if you want. I won&rsquo;t judge you, but I\nmight be a little sad.</p>\n<aside name=\"joke\">\n<p>Well, some asides do, at least. Most of them are just dumb jokes and amateurish\ndrawings.</p>\n</aside>\n<h3><a href=\"#challenges_\" id=\"challenges_\"><small>1&#8202;.&#8202;2&#8202;.&#8202;4</small>Challenges</a></h3>\n<p>Each chapter ends with a few exercises. Unlike textbook problem sets, which tend\nto review material you already covered, these are to help you learn <em>more</em> than\nwhat&rsquo;s in the chapter. They force you to step off the guided path and explore on\nyour own. They will make you research other languages, figure out how to\nimplement features, or otherwise get you out of your comfort zone.</p>\n<p><span name=\"warning\">Vanquish</span> the challenges and you&rsquo;ll come away with a\nbroader understanding and possibly a few bumps and scrapes. Or skip them if you\nwant to stay inside the comfy confines of the tour bus. It&rsquo;s your book.</p>\n<aside name=\"warning\">\n<p>A word of warning: the challenges often ask you to make changes to the\ninterpreter you&rsquo;re building. You&rsquo;ll want to implement those in a copy of your\ncode. The later chapters assume your interpreter is in a pristine\n(&ldquo;unchallenged&rdquo;?) 
state.</p>\n</aside>\n<h3><a href=\"#design-notes\" id=\"design-notes\"><small>1&#8202;.&#8202;2&#8202;.&#8202;5</small>Design notes</a></h3>\n<p>Most &ldquo;programming language&rdquo; books are strictly programming language\n<em>implementation</em> books. They rarely discuss how one might happen to <em>design</em> the\nlanguage being implemented. Implementation is fun because it is so <span\nname=\"benchmark\">precisely defined</span>. We programmers seem to have an\naffinity for things that are black and white, ones and zeroes.</p>\n<aside name=\"benchmark\">\n<p>I know a lot of language hackers whose careers are based on this. You slide a\nlanguage spec under their door, wait a few months, and code and benchmark\nresults come out.</p>\n</aside>\n<p>Personally, I think the world needs only so many implementations of <span\nname=\"fortran\">FORTRAN 77</span>. At some point, you find yourself designing a\n<em>new</em> language. Once you start playing <em>that</em> game, then the softer, human side\nof the equation becomes paramount. Things like which features are easy to learn,\nhow to balance innovation and familiarity, what syntax is more readable and to\nwhom.</p>\n<aside name=\"fortran\">\n<p>Hopefully your new language doesn&rsquo;t hardcode assumptions about the width of a\npunched card into its grammar.</p>\n</aside>\n<p>All of that stuff profoundly affects the success of your new language. I want\nyour language to succeed, so in some chapters I end with a &ldquo;design note&rdquo;, a\nlittle essay on some corner of the human aspect of programming languages. I&rsquo;m no\nexpert on this<span class=\"em\">&mdash;</span>I don&rsquo;t know if anyone really is<span class=\"em\">&mdash;</span>so take these with a large\npinch of salt. 
That should make them tastier food for thought, which is my main\naim.</p>\n<h2><a href=\"#the-first-interpreter\" id=\"the-first-interpreter\"><small>1&#8202;.&#8202;3</small>The First Interpreter</a></h2>\n<p>We&rsquo;ll write our first interpreter, jlox, in <span name=\"lang\">Java</span>. The\nfocus is on <em>concepts</em>. We&rsquo;ll write the simplest, cleanest code we can to\ncorrectly implement the semantics of the language. This will get us comfortable\nwith the basic techniques and also hone our understanding of exactly how the\nlanguage is supposed to behave.</p>\n<aside name=\"lang\">\n<p>The book uses Java and C, but readers have ported the code to <a href=\"https://github.com/munificent/craftinginterpreters/wiki/Lox-implementations\">many other\nlanguages</a>. If the languages I picked aren&rsquo;t your bag, take a look at\nthose.</p>\n</aside>\n<p>Java is a great language for this. It&rsquo;s high level enough that we don&rsquo;t get\noverwhelmed by fiddly implementation details, but it&rsquo;s still pretty explicit.\nUnlike in scripting languages, there tends to be less complex machinery hiding\nunder the hood, and you&rsquo;ve got static types to see what data structures you&rsquo;re\nworking with.</p>\n<p>I also chose Java specifically because it is an object-oriented language. That\nparadigm swept the programming world in the &rsquo;90s and is now the dominant way of\nthinking for millions of programmers. Odds are good you&rsquo;re already used to\norganizing code into classes and methods, so we&rsquo;ll keep you in that comfort\nzone.</p>\n<p>While academic language folks sometimes look down on object-oriented languages,\nthe reality is that they are widely used even for language work. GCC and LLVM\nare written in C++, as are most JavaScript virtual machines. 
Object-oriented\nlanguages are ubiquitous, and the tools and compilers <em>for</em> a language are often\nwritten <em>in</em> the <span name=\"host\">same language</span>.</p>\n<aside name=\"host\">\n<p>A compiler reads files in one language, translates them, and outputs files in\nanother language. You can implement a compiler in any language, including the\nsame language it compiles, a process called <strong>self-hosting</strong>.</p>\n<p>You can&rsquo;t compile your compiler using itself yet, but if you have another\ncompiler for your language written in some other language, you use <em>that</em> one to\ncompile your compiler once. Now you can use the compiled version of your own\ncompiler to compile future versions of itself, and you can discard the original\none compiled from the other compiler. This is called <strong>bootstrapping</strong>, from\nthe image of pulling yourself up by your own bootstraps.</p><img src=\"image/introduction/bootstrap.png\" alt=\"Fact: This is the primary mode of transportation of the American cowboy.\" />\n</aside>\n<p>And, finally, Java is hugely popular. That means there&rsquo;s a good chance you\nalready know it, so there&rsquo;s less for you to learn to get going in the book. If\nyou aren&rsquo;t that familiar with Java, don&rsquo;t freak out. I try to stick to a fairly\nminimal subset of it. I use the diamond operator from Java 7 to make things a\nlittle more terse, but that&rsquo;s about it as far as &ldquo;advanced&rdquo; features go. If you\nknow another object-oriented language, like C# or C++, you can muddle through.</p>\n<p>By the end of part II, we&rsquo;ll have a simple, readable implementation. It&rsquo;s not\nvery fast, but it&rsquo;s correct. However, we are only able to accomplish that by\nbuilding on the Java virtual machine&rsquo;s own runtime facilities. 
We want to learn\nhow Java <em>itself</em> implements those things.</p>\n<h2><a href=\"#the-second-interpreter\" id=\"the-second-interpreter\"><small>1&#8202;.&#8202;4</small>The Second Interpreter</a></h2>\n<p>So in the next part, we start all over again, but this time in C. C is the\nperfect language for understanding how an implementation <em>really</em> works, all the\nway down to the bytes in memory and the code flowing through the CPU.</p>\n<p>A big reason that we&rsquo;re using C is so I can show you things C is particularly\ngood at, but that <em>does</em> mean you&rsquo;ll need to be pretty comfortable with it. You\ndon&rsquo;t have to be the reincarnation of Dennis Ritchie, but you shouldn&rsquo;t be\nspooked by pointers either.</p>\n<p>If you aren&rsquo;t there yet, pick up an introductory book on C and chew through it,\nthen come back here when you&rsquo;re done. In return, you&rsquo;ll come away from this book\nan even stronger C programmer. That&rsquo;s useful given how many language\nimplementations are written in C: Lua, CPython, and Ruby&rsquo;s MRI, to name a few.</p>\n<p>In our C interpreter, <span name=\"clox\">clox</span>, we are forced to implement\nfor ourselves all the things Java gave us for free. We&rsquo;ll write our own dynamic\narray and hash table. We&rsquo;ll decide how objects are represented in memory, and\nbuild a garbage collector to reclaim them.</p>\n<aside name=\"clox\">\n<p>I pronounce the name like &ldquo;sea-locks&rdquo;, but you can say it &ldquo;clocks&rdquo; or even\n&ldquo;cloch&rdquo;, where you pronounce the &ldquo;x&rdquo; like the Greeks do if it makes you happy.</p>\n</aside>\n<p>Our Java implementation was focused on being correct. Now that we have that\ndown, we&rsquo;ll turn to also being <em>fast</em>. 
Our C interpreter will contain a <span\nname=\"compiler\">compiler</span> that translates Lox to an efficient bytecode\nrepresentation (don&rsquo;t worry, I&rsquo;ll get into what that means soon), which it then\nexecutes. This is the same technique used by implementations of Lua, Python,\nRuby, PHP, and many other successful languages.</p>\n<aside name=\"compiler\">\n<p>Did you think this was just an interpreter book? It&rsquo;s a compiler book as well.\nTwo for the price of one!</p>\n</aside>\n<p>We&rsquo;ll even try our hand at benchmarking and optimization. By the end, we&rsquo;ll have\na robust, accurate, fast interpreter for our language, able to keep up with\nother professional caliber implementations out there. Not bad for one book and a\nfew thousand lines of code.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>There are at least six domain-specific languages used in the <a href=\"https://github.com/munificent/craftinginterpreters\">little system\nI cobbled together</a> to write and publish this book. What are they?</p>\n</li>\n<li>\n<p>Get a &ldquo;Hello, world!&rdquo; program written and running in Java. Set up whatever\nmakefiles or IDE projects you need to get it working. If you have a\ndebugger, get comfortable with it and step through your program as it runs.</p>\n</li>\n<li>\n<p>Do the same thing for C. To get some practice with pointers, define a\n<a href=\"https://en.wikipedia.org/wiki/Doubly_linked_list\">doubly linked list</a> of heap-allocated strings. Write functions to insert,\nfind, and delete items from it. Test them.</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: What&rsquo;s in a Name?</a></h2>\n<p>One of the hardest challenges in writing this book was coming up with a name for\nthe language it implements. I went through <em>pages</em> of candidates before I found\none that worked. 
As you&rsquo;ll discover on the first day you start building your own\nlanguage, naming is deviously hard. A good name satisfies a few criteria:</p>\n<ol>\n<li>\n<p><strong>It isn&rsquo;t in use.</strong> You can run into all sorts of trouble, legal and\nsocial, if you inadvertently step on someone else&rsquo;s name.</p>\n</li>\n<li>\n<p><strong>It&rsquo;s easy to pronounce.</strong> If things go well, hordes of people will be\nsaying and writing your language&rsquo;s name. Anything longer than a couple of\nsyllables or a handful of letters will annoy them to no end.</p>\n</li>\n<li>\n<p><strong>It&rsquo;s distinct enough to search for.</strong> People will Google your language&rsquo;s\nname to learn about it, so you want a word that&rsquo;s rare enough that most\nresults point to your docs. Though, with the amount of AI search engines are\npacking today, that&rsquo;s less of an issue. Still, you won&rsquo;t be doing your users\nany favors if you name your language &ldquo;for&rdquo;.</p>\n</li>\n<li>\n<p><strong>It doesn&rsquo;t have negative connotations across a number of cultures.</strong> This\nis hard to be on guard for, but it&rsquo;s worth considering. The designer of\nNimrod ended up renaming his language to &ldquo;Nim&rdquo; because too many people\nremember that Bugs Bunny used &ldquo;Nimrod&rdquo; as an insult. (Bugs was using it\nironically.)</p>\n</li>\n</ol>\n<p>If your potential name makes it through that gauntlet, keep it. Don&rsquo;t get hung\nup on trying to find an appellation that captures the quintessence of your\nlanguage. If the names of the world&rsquo;s other successful languages teach us\nanything, it&rsquo;s that the name doesn&rsquo;t matter much. 
All you need is a reasonably\nunique token.</p>\n</div>\n\n<footer>\n<a href=\"a-map-of-the-territory.html\" class=\"next\">\n  Next Chapter: &ldquo;A Map of the Territory&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/jumping-back-and-forth.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Jumping Back and Forth &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Jumping Back and Forth<small>23</small></a></h3>\n\n<ul>\n    <li><a href=\"#if-statements\"><small>23.1</small> If Statements</a></li>\n    <li><a href=\"#logical-operators\"><small>23.2</small> Logical Operators</a></li>\n    <li><a href=\"#while-statements\"><small>23.3</small> While Statements</a></li>\n    <li><a href=\"#for-statements\"><small>23.4</small> For Statements</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li 
class=\"end-part\"><a href=\"#design-note\"><small>note</small>Considering Goto Harmful</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"local-variables.html\" title=\"Local Variables\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"calls-and-functions.html\" title=\"Calls and Functions\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"local-variables.html\" title=\"Local Variables\" class=\"prev\">←</a>\n<a href=\"calls-and-functions.html\" title=\"Calls and Functions\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Jumping Back and Forth<small>23</small></a></h3>\n\n<ul>\n    <li><a href=\"#if-statements\"><small>23.1</small> If Statements</a></li>\n    <li><a href=\"#logical-operators\"><small>23.2</small> Logical Operators</a></li>\n    <li><a href=\"#while-statements\"><small>23.3</small> While Statements</a></li>\n    <li><a href=\"#for-statements\"><small>23.4</small> For Statements</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Considering Goto Harmful</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"local-variables.html\" title=\"Local Variables\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"calls-and-functions.html\" title=\"Calls and Functions\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a 
id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">23</div>\n  <h1>Jumping Back and Forth</h1>\n\n<blockquote>\n<p>The order that our mind imagines is like a net, or like a ladder, built to\nattain something. But afterward you must throw the ladder away, because you\ndiscover that, even if it was useful, it was meaningless.</p>\n<p><cite>Umberto Eco, <em>The Name of the Rose</em></cite></p>\n</blockquote>\n<p>It&rsquo;s taken a while to get here, but we&rsquo;re finally ready to add control flow to\nour virtual machine. In the tree-walk interpreter we built for jlox, we\nimplemented Lox&rsquo;s control flow in terms of Java&rsquo;s. To execute a Lox <code>if</code>\nstatement, we used a Java <code>if</code> statement to run the chosen branch. That works,\nbut isn&rsquo;t entirely satisfying. By what magic does the <em>JVM itself</em> or a native\nCPU implement <code>if</code> statements? Now that we have our own bytecode VM to hack on,\nwe can answer that.</p>\n<p>When we talk about &ldquo;control flow&rdquo;, what are we referring to? By &ldquo;flow&rdquo; we mean\nthe way execution moves through the text of the program. Almost like there is a\nlittle robot inside the computer wandering through our code, executing bits and\npieces here and there. Flow is the path that robot takes, and by <em>controlling</em>\nthe robot, we drive which pieces of code it executes.</p>\n<p>In jlox, the robot&rsquo;s locus of attention<span class=\"em\">&mdash;</span>the <em>current</em> bit of code<span class=\"em\">&mdash;</span>was\nimplicit based on which AST nodes were stored in various Java variables and what\nJava code we were in the middle of running. In clox, it is much more explicit.\nThe VM&rsquo;s <code>ip</code> field stores the address of the current bytecode instruction. The\nvalue of that field is exactly &ldquo;where we are&rdquo; in the program.</p>\n<p>Execution proceeds normally by incrementing the <code>ip</code>. 
But we can mutate that\nvariable however we want to. In order to implement control flow, all that&rsquo;s\nnecessary is to change the <code>ip</code> in more interesting ways. The simplest control\nflow construct is an <code>if</code> statement with no <code>else</code> clause:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">if</span> (<span class=\"i\">condition</span>) <span class=\"k\">print</span>(<span class=\"s\">&quot;condition was truthy&quot;</span>);\n</pre></div>\n<p>The VM evaluates the bytecode for the condition expression. If the result is\ntruthy, then it continues along and executes the <code>print</code> statement in the body.\nThe interesting case is when the condition is falsey. When that happens,\nexecution skips over the then branch and proceeds to the next statement.</p>\n<p>To skip over a chunk of code, we simply set the <code>ip</code> field to the address of the\nbytecode instruction following that code. To <em>conditionally</em> skip over some\ncode, we need an instruction that looks at the value on top of the stack. If\nit&rsquo;s falsey, it adds a given offset to the <code>ip</code> to jump over a range of\ninstructions. Otherwise, it does nothing and lets execution proceed to the next\ninstruction as usual.</p>\n<p>When we compile to bytecode, the explicit nested block structure of the code\nevaporates, leaving only a flat series of instructions behind. Lox is a\n<a href=\"https://en.wikipedia.org/wiki/Structured_programming\">structured programming</a> language, but clox bytecode isn&rsquo;t. The right<span class=\"em\">&mdash;</span>or\nwrong, depending on how you look at it<span class=\"em\">&mdash;</span>set of bytecode instructions could\njump into the middle of a block, or from one scope into another.</p>\n<p>The VM will happily execute that, even if the result leaves the stack in an\nunknown, inconsistent state. 
So even though the bytecode is unstructured, we&rsquo;ll\ntake care to ensure that our compiler only generates clean code that maintains\nthe same structure and nesting that Lox itself does.</p>\n<p>This is exactly how real CPUs behave. Even though we might program them using\nhigher-level languages that mandate structured control flow, the compiler lowers\nthat down to raw jumps. At the bottom, it turns out goto is the only real\ncontrol flow.</p>\n<p>Anyway, I didn&rsquo;t mean to get all philosophical. The important bit is that if we\nhave that one conditional jump instruction, that&rsquo;s enough to implement Lox&rsquo;s\n<code>if</code> statement, as long as it doesn&rsquo;t have an <code>else</code> clause. So let&rsquo;s go ahead\nand get started with that.</p>\n<h2><a href=\"#if-statements\" id=\"if-statements\"><small>23&#8202;.&#8202;1</small>If Statements</a></h2>\n<p>This many chapters in, you know the drill. Any new feature starts in the front\nend and works its way through the pipeline. 
An <code>if</code> statement is, well, a\nstatement, so that&rsquo;s where we hook it into the parser.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  if (match(TOKEN_PRINT)) {\n    printStatement();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>statement</em>()</div>\n<pre class=\"insert\">  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_IF</span>)) {\n    <span class=\"i\">ifStatement</span>();\n</pre><pre class=\"insert-after\">  } else if (match(TOKEN_LEFT_BRACE)) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>statement</em>()</div>\n\n<p>When we see an <code>if</code> keyword, we hand off compilation to this function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>expressionStatement</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">ifStatement</span>() {\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_LEFT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;(&#39; after &#39;if&#39;.&quot;</span>);\n  <span class=\"i\">expression</span>();\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_RIGHT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;)&#39; after condition.&quot;</span>);<span name=\"paren\"> </span>\n\n  <span class=\"t\">int</span> <span class=\"i\">thenJump</span> = <span class=\"i\">emitJump</span>(<span class=\"a\">OP_JUMP_IF_FALSE</span>);\n  <span class=\"i\">statement</span>();\n\n  <span class=\"i\">patchJump</span>(<span class=\"i\">thenJump</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>expressionStatement</em>()</div>\n\n<aside name=\"paren\">\n<p>Have you ever noticed that the <code>(</code> after the <code>if</code> keyword doesn&rsquo;t actually do\nanything useful? 
The language would be just as unambiguous and easy to parse\nwithout it, like:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">if</span> <span class=\"i\">condition</span>) <span class=\"k\">print</span>(<span class=\"s\">&quot;looks weird&quot;</span>);\n</pre></div>\n<p>The closing <code>)</code> is useful because it separates the condition expression from the\nbody. Some languages use a <code>then</code> keyword instead. But the opening <code>(</code> doesn&rsquo;t\ndo anything. It&rsquo;s just there because unmatched parentheses look bad to us\nhumans.</p>\n</aside>\n<p>First we compile the condition expression, bracketed by parentheses. At runtime,\nthat will leave the condition value on top of the stack. We&rsquo;ll use that to\ndetermine whether to execute the then branch or skip it.</p>\n<p>Then we emit a new <code>OP_JUMP_IF_FALSE</code> instruction. It has an operand for how\nmuch to offset the <code>ip</code><span class=\"em\">&mdash;</span>how many bytes of code to skip. If the condition is\nfalsey, it adjusts the <code>ip</code> by that amount. Something like this:</p>\n<aside name=\"legend\">\n<p>The boxes with the torn edges here represent the blob of bytecode generated by\ncompiling some sub-clause of a control flow construct. So the &ldquo;condition\nexpression&rdquo; box is all of the instructions emitted when we compiled that\nexpression.</p>\n</aside>\n<p><span name=\"legend\"></span></p><img src=\"image/jumping-back-and-forth/if-without-else.png\" alt=\"Flowchart of the compiled bytecode of an if statement.\" />\n<p>But we have a problem. When we&rsquo;re writing the <code>OP_JUMP_IF_FALSE</code> instruction&rsquo;s\noperand, how do we know how far to jump? We haven&rsquo;t compiled the then branch\nyet, so we don&rsquo;t know how much bytecode it contains.</p>\n<p>To fix that, we use a classic trick called <strong>backpatching</strong>. We emit the jump\ninstruction first with a placeholder offset operand. 
We keep track of where that\nhalf-finished instruction is. Next, we compile the then body. Once that&rsquo;s done,\nwe know how far to jump. So we go back and replace that placeholder offset with\nthe real one now that we can calculate it. Sort of like sewing a patch onto the\nexisting fabric of the compiled code.</p><img src=\"image/jumping-back-and-forth/patch.png\" alt=\"A patch containing a number being sewn onto a sheet of bytecode.\" />\n<p>We encode this trick into two helper functions.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>emitBytes</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">int</span> <span class=\"i\">emitJump</span>(<span class=\"t\">uint8_t</span> <span class=\"i\">instruction</span>) {\n  <span class=\"i\">emitByte</span>(<span class=\"i\">instruction</span>);\n  <span class=\"i\">emitByte</span>(<span class=\"n\">0xff</span>);\n  <span class=\"i\">emitByte</span>(<span class=\"n\">0xff</span>);\n  <span class=\"k\">return</span> <span class=\"i\">currentChunk</span>()-&gt;<span class=\"i\">count</span> - <span class=\"n\">2</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>emitBytes</em>()</div>\n\n<p>The first emits a bytecode instruction and writes a placeholder operand for the\njump offset. We pass in the opcode as an argument because later we&rsquo;ll have two\ndifferent instructions that use this helper. We use two bytes for the jump\noffset operand. A 16-bit <span name=\"offset\">offset</span> lets us jump over up\nto 65,535 bytes of code, which should be plenty for our needs.</p>\n<aside name=\"offset\">\n<p>Some instruction sets have separate &ldquo;long&rdquo; jump instructions that take larger\noperands for when you need to jump a greater distance.</p>\n</aside>\n<p>The function returns the offset of the emitted instruction&rsquo;s placeholder operand in the chunk. 
After\ncompiling the then branch, we take that offset and pass it to this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>emitConstant</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">patchJump</span>(<span class=\"t\">int</span> <span class=\"i\">offset</span>) {\n  <span class=\"c\">// -2 to adjust for the bytecode for the jump offset itself.</span>\n  <span class=\"t\">int</span> <span class=\"i\">jump</span> = <span class=\"i\">currentChunk</span>()-&gt;<span class=\"i\">count</span> - <span class=\"i\">offset</span> - <span class=\"n\">2</span>;\n\n  <span class=\"k\">if</span> (<span class=\"i\">jump</span> &gt; <span class=\"a\">UINT16_MAX</span>) {\n    <span class=\"i\">error</span>(<span class=\"s\">&quot;Too much code to jump over.&quot;</span>);\n  }\n\n  <span class=\"i\">currentChunk</span>()-&gt;<span class=\"i\">code</span>[<span class=\"i\">offset</span>] = (<span class=\"i\">jump</span> &gt;&gt; <span class=\"n\">8</span>) &amp; <span class=\"n\">0xff</span>;\n  <span class=\"i\">currentChunk</span>()-&gt;<span class=\"i\">code</span>[<span class=\"i\">offset</span> + <span class=\"n\">1</span>] = <span class=\"i\">jump</span> &amp; <span class=\"n\">0xff</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>emitConstant</em>()</div>\n\n<p>This goes back into the bytecode and replaces the operand at the given location\nwith the calculated jump offset. We call <code>patchJump()</code> right before we emit the\nnext instruction that we want the jump to land on, so it uses the current\nbytecode count to determine how far to jump. In the case of an <code>if</code> statement,\nthat means right after we compile the then branch and before we compile the next\nstatement.</p>\n<p>That&rsquo;s all we need at compile time. 
Let&rsquo;s define the new instruction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_PRINT,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_JUMP_IF_FALSE</span>,\n</pre><pre class=\"insert-after\">  OP_RETURN,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>Over in the VM, we get it working like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        break;\n      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_JUMP_IF_FALSE</span>: {\n        <span class=\"t\">uint16_t</span> <span class=\"i\">offset</span> = <span class=\"a\">READ_SHORT</span>();\n        <span class=\"k\">if</span> (<span class=\"i\">isFalsey</span>(<span class=\"i\">peek</span>(<span class=\"n\">0</span>))) <span class=\"i\">vm</span>.<span class=\"i\">ip</span> += <span class=\"i\">offset</span>;\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_RETURN: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>This is the first instruction we&rsquo;ve added that takes a 16-bit operand. 
To read\nthat from the chunk, we use a new macro.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define READ_CONSTANT() (vm.chunk-&gt;constants.values[READ_BYTE()])\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#define READ_SHORT() \\</span>\n<span class=\"a\">    (vm.ip += 2, (uint16_t)((vm.ip[-2] &lt;&lt; 8) | vm.ip[-1]))</span>\n</pre><pre class=\"insert-after\">#define READ_STRING() AS_STRING(READ_CONSTANT())\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>It yanks the next two bytes from the chunk and builds a 16-bit unsigned integer\nout of them. As usual, we clean up our macro when we&rsquo;re done with it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#undef READ_BYTE\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#undef READ_SHORT</span>\n</pre><pre class=\"insert-after\">#undef READ_CONSTANT\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>After reading the offset, we check the condition value on top of the stack.\n<span name=\"if\">If</span> it&rsquo;s falsey, we apply this jump offset to the <code>ip</code>.\nOtherwise, we leave the <code>ip</code> alone and execution will automatically proceed to\nthe next instruction following the jump instruction.</p>\n<p>In the case where the condition is falsey, we don&rsquo;t need to do any other work.\nWe&rsquo;ve offset the <code>ip</code>, so when the outer instruction dispatch loop turns again,\nit will pick up execution at that new instruction, past all of the code in the\nthen branch.</p>\n<aside name=\"if\">\n<p>I said we wouldn&rsquo;t use C&rsquo;s <code>if</code> statement to implement Lox&rsquo;s control flow, but\nwe do use one here to determine whether or not to offset the instruction\npointer. 
But we aren&rsquo;t really using C for <em>control flow</em>. If we wanted to, we\ncould do the same thing purely arithmetically. Let&rsquo;s assume we have a function\n<code>falsey()</code> that takes a Lox Value and returns 1 if it&rsquo;s falsey or 0 otherwise.\nThen we could implement the jump instruction like:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">case</span> <span class=\"a\">OP_JUMP_IF_FALSE</span>: {\n  <span class=\"t\">uint16_t</span> <span class=\"i\">offset</span> = <span class=\"a\">READ_SHORT</span>();\n  <span class=\"i\">vm</span>.<span class=\"i\">ip</span> += <span class=\"i\">falsey</span>(<span class=\"i\">peek</span>(<span class=\"n\">0</span>)) * <span class=\"i\">offset</span>;\n  <span class=\"k\">break</span>;\n}\n</pre></div>\n<p>The <code>falsey()</code> function would probably use some control flow to handle the\ndifferent value types, but that&rsquo;s an implementation detail of that function and\ndoesn&rsquo;t affect how our VM does its own control flow.</p>\n</aside>\n<p>Note that the jump instruction doesn&rsquo;t pop the condition value off the stack. So\nwe aren&rsquo;t totally done here, since this leaves an extra value floating around on\nthe stack. We&rsquo;ll clean that up soon. Ignoring that for the moment, we do have a\nworking <code>if</code> statement in Lox now, with only one little instruction required to\nsupport it at runtime in the VM.</p>\n<h3><a href=\"#else-clauses\" id=\"else-clauses\"><small>23&#8202;.&#8202;1&#8202;.&#8202;1</small>Else clauses</a></h3>\n<p>An <code>if</code> statement without support for <code>else</code> clauses is like Morticia Addams\nwithout Gomez. So, after we compile the then branch, we look for an <code>else</code>\nkeyword. 
If we find one, we compile the else branch.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  patchJump(thenJump);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>ifStatement</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_ELSE</span>)) <span class=\"i\">statement</span>();\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>ifStatement</em>()</div>\n\n<p>When the condition is falsey, we&rsquo;ll jump over the then branch. If there&rsquo;s an\nelse branch, the <code>ip</code> will land right at the beginning of its code. But that&rsquo;s\nnot enough. Here&rsquo;s the flow that leads to:</p><img src=\"image/jumping-back-and-forth/bad-else.png\" alt=\"Flowchart of the compiled bytecode with the then branch incorrectly falling through to the else branch.\" />\n<p>If the condition is truthy, we execute the then branch like we want. But after\nthat, execution rolls right on through into the else branch. Oops! When the\ncondition is truthy, after we run the then branch, we need to jump over the else\nbranch. 
That way, in either case, we only execute a single branch, like this:</p><img src=\"image/jumping-back-and-forth/if-else.png\" alt=\"Flowchart of the compiled bytecode for an if with an else clause.\" />\n<p>To implement that, we need another jump from the end of the then branch.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  statement();\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>ifStatement</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">int</span> <span class=\"i\">elseJump</span> = <span class=\"i\">emitJump</span>(<span class=\"a\">OP_JUMP</span>);\n\n</pre><pre class=\"insert-after\">  patchJump(thenJump);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>ifStatement</em>()</div>\n\n<p>We patch that offset after the end of the else body.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  if (match(TOKEN_ELSE)) statement();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>ifStatement</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">patchJump</span>(<span class=\"i\">elseJump</span>);\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>ifStatement</em>()</div>\n\n<p>After executing the then branch, this jumps to the next statement after the else\nbranch. Unlike the other jump, this jump is unconditional. 
We always take it, so\nwe need another instruction that expresses that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_PRINT,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_JUMP</span>,\n</pre><pre class=\"insert-after\">  OP_JUMP_IF_FALSE,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>We interpret it like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        break;\n      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_JUMP</span>: {\n        <span class=\"t\">uint16_t</span> <span class=\"i\">offset</span> = <span class=\"a\">READ_SHORT</span>();\n        <span class=\"i\">vm</span>.<span class=\"i\">ip</span> += <span class=\"i\">offset</span>;\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_JUMP_IF_FALSE: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>Nothing too surprising here<span class=\"em\">&mdash;</span>the only difference is that it doesn&rsquo;t check a\ncondition and always applies the offset.</p>\n<p>We have then and else branches working now, so we&rsquo;re close. The last bit is to\nclean up that condition value we left on the stack. Remember, each statement is\nrequired to have zero stack effect<span class=\"em\">&mdash;</span>after the statement is finished executing,\nthe stack should be as tall as it was before.</p>\n<p>We could have the <code>OP_JUMP_IF_FALSE</code> instruction pop the condition itself, but\nsoon we&rsquo;ll use that same instruction for the logical operators where we don&rsquo;t\nwant the condition popped. 
Instead, we&rsquo;ll have the compiler emit a couple of\nexplicit <code>OP_POP</code> instructions when compiling an <code>if</code> statement. We need to take\ncare that every execution path through the generated code pops the condition.</p>\n<p>When the condition is truthy, we pop it right before the code inside the then\nbranch.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  int thenJump = emitJump(OP_JUMP_IF_FALSE);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>ifStatement</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">emitByte</span>(<span class=\"a\">OP_POP</span>);\n</pre><pre class=\"insert-after\">  statement();\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>ifStatement</em>()</div>\n\n<p>Otherwise, we pop it at the beginning of the else branch.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  patchJump(thenJump);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>ifStatement</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">emitByte</span>(<span class=\"a\">OP_POP</span>);\n</pre><pre class=\"insert-after\">\n\n  if (match(TOKEN_ELSE)) statement();\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>ifStatement</em>()</div>\n\n<p>This little instruction here also means that every <code>if</code> statement has an\nimplicit else branch even if the user didn&rsquo;t write an <code>else</code> clause. In the case\nwhere they left it off, all the branch does is discard the condition value.</p>\n<p>The full correct flow looks like this:</p><img src=\"image/jumping-back-and-forth/full-if-else.png\" alt=\"Flowchart of the compiled bytecode including necessary pop instructions.\" />\n<p>If you trace through, you can see that it always executes a single branch and\nensures the condition is popped first. 
All that remains is a little disassembler\nsupport.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return simpleInstruction(&quot;OP_PRINT&quot;, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_JUMP</span>:\n      <span class=\"k\">return</span> <span class=\"i\">jumpInstruction</span>(<span class=\"s\">&quot;OP_JUMP&quot;</span>, <span class=\"n\">1</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n    <span class=\"k\">case</span> <span class=\"a\">OP_JUMP_IF_FALSE</span>:\n      <span class=\"k\">return</span> <span class=\"i\">jumpInstruction</span>(<span class=\"s\">&quot;OP_JUMP_IF_FALSE&quot;</span>, <span class=\"n\">1</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_RETURN:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>These two instructions have a new format with a 16-bit operand, so we add a new\nutility function to disassemble them.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>debug.c</em><br>\nadd after <em>byteInstruction</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">int</span> <span class=\"i\">jumpInstruction</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">name</span>, <span class=\"t\">int</span> <span class=\"i\">sign</span>,\n                           <span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>, <span class=\"t\">int</span> <span class=\"i\">offset</span>) {\n  <span class=\"t\">uint16_t</span> <span class=\"i\">jump</span> = (<span class=\"t\">uint16_t</span>)(<span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>[<span class=\"i\">offset</span> + <span class=\"n\">1</span>] &lt;&lt; <span class=\"n\">8</span>);\n  <span 
class=\"i\">jump</span> |= <span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>[<span class=\"i\">offset</span> + <span class=\"n\">2</span>];\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;%-16s %4d -&gt; %d</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, <span class=\"i\">name</span>, <span class=\"i\">offset</span>,\n         <span class=\"i\">offset</span> + <span class=\"n\">3</span> + <span class=\"i\">sign</span> * <span class=\"i\">jump</span>);\n  <span class=\"k\">return</span> <span class=\"i\">offset</span> + <span class=\"n\">3</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, add after <em>byteInstruction</em>()</div>\n\n<p>There we go, that&rsquo;s one complete control flow construct. If this were an &rsquo;80s\nmovie, the montage music would kick in and the rest of the control flow syntax\nwould take care of itself. Alas, the <span name=\"80s\">&rsquo;80s</span> are long over,\nso we&rsquo;ll have to grind it out ourselves.</p>\n<aside name=\"80s\">\n<p>My enduring love of Depeche Mode notwithstanding.</p>\n</aside>\n<h2><a href=\"#logical-operators\" id=\"logical-operators\"><small>23&#8202;.&#8202;2</small>Logical Operators</a></h2>\n<p>You probably remember this from jlox, but the logical operators <code>and</code> and <code>or</code>\naren&rsquo;t just another pair of binary operators like <code>+</code> and <code>-</code>. Because they\nshort-circuit and may not evaluate their right operand depending on the value of\nthe left one, they work more like control flow expressions.</p>\n<p>They&rsquo;re basically a little variation on an <code>if</code> statement with an <code>else</code> clause.\nThe easiest way to explain them is to just show you the compiler code and the\ncontrol flow it produces in the resulting bytecode. 
Starting with <code>and</code>, we hook\nit into the expression parsing table here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  [TOKEN_NUMBER]        = {number,   NULL,   PREC_NONE},\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_AND</span>]           = {<span class=\"a\">NULL</span>,     <span class=\"i\">and_</span>,   <span class=\"a\">PREC_AND</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_CLASS]         = {NULL,     NULL,   PREC_NONE},\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, replace 1 line</div>\n\n<p>That hands off to a new parser function.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>defineVariable</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">and_</span>(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n  <span class=\"t\">int</span> <span class=\"i\">endJump</span> = <span class=\"i\">emitJump</span>(<span class=\"a\">OP_JUMP_IF_FALSE</span>);\n\n  <span class=\"i\">emitByte</span>(<span class=\"a\">OP_POP</span>);\n  <span class=\"i\">parsePrecedence</span>(<span class=\"a\">PREC_AND</span>);\n\n  <span class=\"i\">patchJump</span>(<span class=\"i\">endJump</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>defineVariable</em>()</div>\n\n<p>At the point this is called, the left-hand side expression has already been\ncompiled. That means at runtime, its value will be on top of the stack. If that\nvalue is falsey, then we know the entire <code>and</code> must be false, so we skip the\nright operand and leave the left-hand side value as the result of the entire\nexpression. 
Otherwise, we discard the left-hand value and evaluate the right\noperand which becomes the result of the whole <code>and</code> expression.</p>\n<p>Those four lines of code right there produce exactly that. The flow looks like\nthis:</p><img src=\"image/jumping-back-and-forth/and.png\" alt=\"Flowchart of the compiled bytecode of an 'and' expression.\" />\n<p>Now you can see why <code>OP_JUMP_IF_FALSE</code> <span name=\"instr\">leaves</span> the\nvalue on top of the stack. When the left-hand side of the <code>and</code> is falsey, that\nvalue sticks around to become the result of the entire expression.</p>\n<aside name=\"instr\">\n<p>We&rsquo;ve got plenty of space left in our opcode range, so we could have separate\ninstructions for conditional jumps that implicitly pop and those that don&rsquo;t, I\nsuppose. But I&rsquo;m trying to keep things minimal for the book. In your bytecode\nVM, it&rsquo;s worth exploring adding more specialized instructions and seeing how\nthey affect performance.</p>\n</aside>\n<h3><a href=\"#logical-or-operator\" id=\"logical-or-operator\"><small>23&#8202;.&#8202;2&#8202;.&#8202;1</small>Logical or operator</a></h3>\n<p>The <code>or</code> operator is a little more complex. 
First we add it to the parse table.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  [TOKEN_NIL]           = {literal,  NULL,   PREC_NONE},\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_OR</span>]            = {<span class=\"a\">NULL</span>,     <span class=\"i\">or_</span>,    <span class=\"a\">PREC_OR</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_PRINT]         = {NULL,     NULL,   PREC_NONE},\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, replace 1 line</div>\n\n<p>When that parser consumes an infix <code>or</code> token, it calls this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>number</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">or_</span>(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n  <span class=\"t\">int</span> <span class=\"i\">elseJump</span> = <span class=\"i\">emitJump</span>(<span class=\"a\">OP_JUMP_IF_FALSE</span>);\n  <span class=\"t\">int</span> <span class=\"i\">endJump</span> = <span class=\"i\">emitJump</span>(<span class=\"a\">OP_JUMP</span>);\n\n  <span class=\"i\">patchJump</span>(<span class=\"i\">elseJump</span>);\n  <span class=\"i\">emitByte</span>(<span class=\"a\">OP_POP</span>);\n\n  <span class=\"i\">parsePrecedence</span>(<span class=\"a\">PREC_OR</span>);\n  <span class=\"i\">patchJump</span>(<span class=\"i\">endJump</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>number</em>()</div>\n\n<p>In an <code>or</code> expression, if the left-hand side is <em>truthy</em>, then we skip over the\nright operand. Thus we need to jump when a value is truthy. 
We could add a\nseparate instruction, but just to show how our compiler is free to map the\nlanguage&rsquo;s semantics to whatever instruction sequence it wants, I implemented it\nin terms of the jump instructions we already have.</p>\n<p>When the left-hand side is falsey, it does a tiny jump over the next instruction.\nThat instruction is an unconditional jump over the code for the right operand.\nThis little dance effectively does a jump when the value is truthy. The flow\nlooks like this:</p><img src=\"image/jumping-back-and-forth/or.png\" alt=\"Flowchart of the compiled bytecode of a logical or expression.\" />\n<p>If I&rsquo;m honest with you, this isn&rsquo;t the best way to do this. There are more\ninstructions to dispatch and more overhead. There&rsquo;s no good reason why <code>or</code>\nshould be slower than <code>and</code>. But it is kind of fun to see that it&rsquo;s possible to\nimplement both operators without adding any new instructions. Forgive me my\nindulgences.</p>\n<p>OK, those are the three <em>branching</em> constructs in Lox. By that, I mean these\nare the control flow features that only jump <em>forward</em> over code. Other\nlanguages often have some kind of multi-way branching statement like <code>switch</code>\nand maybe a conditional expression like <code>?:</code>, but Lox keeps it simple.</p>\n<h2><a href=\"#while-statements\" id=\"while-statements\"><small>23&#8202;.&#8202;3</small>While Statements</a></h2>\n<p>That takes us to the <em>looping</em> statements, which jump <em>backward</em> so that code\ncan be executed more than once. Lox only has two loop constructs, <code>while</code> and\n<code>for</code>. 
A <code>while</code> loop is (much) simpler, so we start the party there.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    ifStatement();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>statement</em>()</div>\n<pre class=\"insert\">  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_WHILE</span>)) {\n    <span class=\"i\">whileStatement</span>();\n</pre><pre class=\"insert-after\">  } else if (match(TOKEN_LEFT_BRACE)) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>statement</em>()</div>\n\n<p>When we reach a <code>while</code> token, we call:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>printStatement</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">whileStatement</span>() {\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_LEFT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;(&#39; after &#39;while&#39;.&quot;</span>);\n  <span class=\"i\">expression</span>();\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_RIGHT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;)&#39; after condition.&quot;</span>);\n\n  <span class=\"t\">int</span> <span class=\"i\">exitJump</span> = <span class=\"i\">emitJump</span>(<span class=\"a\">OP_JUMP_IF_FALSE</span>);\n  <span class=\"i\">emitByte</span>(<span class=\"a\">OP_POP</span>);\n  <span class=\"i\">statement</span>();\n\n  <span class=\"i\">patchJump</span>(<span class=\"i\">exitJump</span>);\n  <span class=\"i\">emitByte</span>(<span class=\"a\">OP_POP</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>printStatement</em>()</div>\n\n<p>Most of this mirrors <code>if</code> statements<span class=\"em\">&mdash;</span>we compile the condition expression,\nsurrounded by mandatory parentheses. 
That&rsquo;s followed by a jump instruction that\nskips over the subsequent body statement if the condition is falsey.</p>\n<p>We patch the jump after compiling the body and take care to <span\nname=\"pop\">pop</span> the condition value from the stack on either path. The\nonly difference from an <code>if</code> statement is the loop. That looks like this:</p>\n<aside name=\"pop\">\n<p>Really starting to second-guess my decision to use the same jump instructions\nfor the logical operators.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">  statement();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>whileStatement</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">emitLoop</span>(<span class=\"i\">loopStart</span>);\n</pre><pre class=\"insert-after\">\n\n  patchJump(exitJump);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>whileStatement</em>()</div>\n\n<p>After the body, we call this function to emit a &ldquo;loop&rdquo; instruction. That\ninstruction needs to know how far back to jump. When jumping forward, we had to\nemit the instruction in two stages since we didn&rsquo;t know how far we were going to\njump until after we emitted the jump instruction. We don&rsquo;t have that problem\nnow. 
We&rsquo;ve already compiled the point in code that we want to jump back to<span class=\"em\">&mdash;</span>it&rsquo;s right before the condition expression.</p>\n<p>All we need to do is capture that location as we compile it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void whileStatement() {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>whileStatement</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">int</span> <span class=\"i\">loopStart</span> = <span class=\"i\">currentChunk</span>()-&gt;<span class=\"i\">count</span>;\n</pre><pre class=\"insert-after\">  consume(TOKEN_LEFT_PAREN, &quot;Expect '(' after 'while'.&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>whileStatement</em>()</div>\n\n<p>After executing the body of a <code>while</code> loop, we jump all the way back to before\nthe condition. That way, we re-evaluate the condition expression on each\niteration. We store the chunk&rsquo;s current byte count in <code>loopStart</code> to\nrecord the offset in the bytecode right before the condition expression we&rsquo;re\nabout to compile. 
Then we pass that into this helper function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>emitBytes</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">emitLoop</span>(<span class=\"t\">int</span> <span class=\"i\">loopStart</span>) {\n  <span class=\"i\">emitByte</span>(<span class=\"a\">OP_LOOP</span>);\n\n  <span class=\"t\">int</span> <span class=\"i\">offset</span> = <span class=\"i\">currentChunk</span>()-&gt;<span class=\"i\">count</span> - <span class=\"i\">loopStart</span> + <span class=\"n\">2</span>;\n  <span class=\"k\">if</span> (<span class=\"i\">offset</span> &gt; <span class=\"a\">UINT16_MAX</span>) <span class=\"i\">error</span>(<span class=\"s\">&quot;Loop body too large.&quot;</span>);\n\n  <span class=\"i\">emitByte</span>((<span class=\"i\">offset</span> &gt;&gt; <span class=\"n\">8</span>) &amp; <span class=\"n\">0xff</span>);\n  <span class=\"i\">emitByte</span>(<span class=\"i\">offset</span> &amp; <span class=\"n\">0xff</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>emitBytes</em>()</div>\n\n<p>It&rsquo;s a bit like <code>emitJump()</code> and <code>patchJump()</code> combined. It emits a new loop\ninstruction, which unconditionally jumps <em>backwards</em> by a given offset. Like the\njump instructions, after that we have a 16-bit operand. We calculate the offset\nfrom the instruction we&rsquo;re currently at to the <code>loopStart</code> point that we want to\njump back to. The <code>+ 2</code> is to take into account the size of the <code>OP_LOOP</code>\ninstruction&rsquo;s own operands which we also need to jump over.</p>\n<p>From the VM&rsquo;s perspective, there really is no semantic difference between\n<code>OP_LOOP</code> and <code>OP_JUMP</code>. Both just add an offset to the <code>ip</code>. We could have used\na single instruction for both and given it a signed offset operand. 
But I\nfigured it was a little easier to sidestep the annoying bit twiddling required\nto manually pack a signed 16-bit integer into two bytes, and we&rsquo;ve got the\nopcode space available, so why not use it?</p>\n<p>The new instruction is here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_JUMP_IF_FALSE,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_LOOP</span>,\n</pre><pre class=\"insert-after\">  OP_RETURN,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>And in the VM, we implement it thusly:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_LOOP</span>: {\n        <span class=\"t\">uint16_t</span> <span class=\"i\">offset</span> = <span class=\"a\">READ_SHORT</span>();\n        <span class=\"i\">vm</span>.<span class=\"i\">ip</span> -= <span class=\"i\">offset</span>;\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_RETURN: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>The only difference from <code>OP_JUMP</code> is a subtraction instead of an addition.\nDisassembly is similar too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return jumpInstruction(&quot;OP_JUMP_IF_FALSE&quot;, 1, chunk, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_LOOP</span>:\n      <span class=\"k\">return</span> <span class=\"i\">jumpInstruction</span>(<span class=\"s\">&quot;OP_LOOP&quot;</span>, -<span class=\"n\">1</span>, <span class=\"i\">chunk</span>, <span 
class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_RETURN:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>That&rsquo;s our <code>while</code> statement. It contains two jumps<span class=\"em\">&mdash;</span>a conditional forward one\nto escape the loop when the condition is not met, and an unconditional loop\nbackward after we have executed the body. The flow looks like this:</p><img src=\"image/jumping-back-and-forth/while.png\" alt=\"Flowchart of the compiled bytecode of a while statement.\" />\n<h2><a href=\"#for-statements\" id=\"for-statements\"><small>23&#8202;.&#8202;4</small>For Statements</a></h2>\n<p>The other looping statement in Lox is the venerable <code>for</code> loop, inherited from\nC. It&rsquo;s got a lot more going on with it compared to a <code>while</code> loop. It has three\nclauses, all of which are optional:</p>\n<p><span name=\"detail\"></span></p>\n<ul>\n<li>\n<p>The initializer can be a variable declaration or an expression. It runs once\nat the beginning of the statement.</p>\n</li>\n<li>\n<p>The condition clause is an expression. Like in a <code>while</code> loop, we exit the\nloop when it evaluates to something falsey.</p>\n</li>\n<li>\n<p>The increment expression runs once at the end of each loop iteration.</p>\n</li>\n</ul>\n<aside name=\"detail\">\n<p>If you want a refresher, the corresponding chapter in part II goes through the\nsemantics <a href=\"control-flow.html#for-loops\">in more detail</a>.</p>\n</aside>\n<p>In jlox, the parser desugared a <code>for</code> loop to a synthesized AST for a <code>while</code>\nloop with some extra stuff before it and at the end of the body. We&rsquo;ll do\nsomething similar, though we won&rsquo;t go through anything like an AST. 
Instead,\nour bytecode compiler will use the jump and loop instructions we already have.</p>\n<p>We&rsquo;ll work our way through the implementation a piece at a time, starting with\nthe <code>for</code> keyword.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    printStatement();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>statement</em>()</div>\n<pre class=\"insert\">  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_FOR</span>)) {\n    <span class=\"i\">forStatement</span>();\n</pre><pre class=\"insert-after\">  } else if (match(TOKEN_IF)) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>statement</em>()</div>\n\n<p>It calls a helper function. If we only supported <code>for</code> loops with empty clauses\nlike <code>for (;;)</code>, then we could implement it like this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>expressionStatement</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">forStatement</span>() {\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_LEFT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;(&#39; after &#39;for&#39;.&quot;</span>);\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_SEMICOLON</span>, <span class=\"s\">&quot;Expect &#39;;&#39;.&quot;</span>);\n\n  <span class=\"t\">int</span> <span class=\"i\">loopStart</span> = <span class=\"i\">currentChunk</span>()-&gt;<span class=\"i\">count</span>;\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_SEMICOLON</span>, <span class=\"s\">&quot;Expect &#39;;&#39;.&quot;</span>);\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_RIGHT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;)&#39; after for clauses.&quot;</span>);\n\n  <span class=\"i\">statement</span>();\n  <span class=\"i\">emitLoop</span>(<span 
class=\"i\">loopStart</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>expressionStatement</em>()</div>\n\n<p>There&rsquo;s a bunch of mandatory punctuation at the top. Then we compile the body.\nLike we did for <code>while</code> loops, we record the bytecode offset at the top of the\nbody and emit a loop to jump back to that point after it. We&rsquo;ve got a working\nimplementation of <span name=\"infinite\">infinite</span> loops now.</p>\n<aside name=\"infinite\">\n<p>Alas, without <code>return</code> statements, there isn&rsquo;t any way to terminate it short of\na runtime error.</p>\n</aside>\n<h3><a href=\"#initializer-clause\" id=\"initializer-clause\"><small>23&#8202;.&#8202;4&#8202;.&#8202;1</small>Initializer clause</a></h3>\n<p>Now we&rsquo;ll add the first clause, the initializer. It executes only once, before\nthe body, so compiling is straightforward.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  consume(TOKEN_LEFT_PAREN, &quot;Expect '(' after 'for'.&quot;);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>forStatement</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_SEMICOLON</span>)) {\n    <span class=\"c\">// No initializer.</span>\n  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_VAR</span>)) {\n    <span class=\"i\">varDeclaration</span>();\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">expressionStatement</span>();\n  }\n</pre><pre class=\"insert-after\">\n\n  int loopStart = currentChunk()-&gt;count;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>forStatement</em>(), replace 1 line</div>\n\n<p>The syntax is a little complex since we allow either a variable declaration or\nan expression. 
We use the presence of the <code>var</code> keyword to tell which we have.\nFor the expression case, we call <code>expressionStatement()</code> instead of\n<code>expression()</code>. That looks for a semicolon, which we need here too, and also\nemits an <code>OP_POP</code> instruction to discard the value. We don&rsquo;t want the\ninitializer to leave anything on the stack.</p>\n<p>If a <code>for</code> statement declares a variable, that variable should be scoped to the\nloop body. We ensure that by wrapping the whole statement in a scope.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void forStatement() {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>forStatement</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">beginScope</span>();\n</pre><pre class=\"insert-after\">  consume(TOKEN_LEFT_PAREN, &quot;Expect '(' after 'for'.&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>forStatement</em>()</div>\n\n<p>Then we close it at the end.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  emitLoop(loopStart);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>forStatement</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">endScope</span>();\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>forStatement</em>()</div>\n\n<h3><a href=\"#condition-clause\" id=\"condition-clause\"><small>23&#8202;.&#8202;4&#8202;.&#8202;2</small>Condition clause</a></h3>\n<p>Next is the condition expression, which can be used to exit the loop.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  int loopStart = currentChunk()-&gt;count;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>forStatement</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"t\">int</span> <span class=\"i\">exitJump</span> = -<span class=\"n\">1</span>;\n  <span class=\"k\">if</span> (!<span 
class=\"i\">match</span>(<span class=\"a\">TOKEN_SEMICOLON</span>)) {\n    <span class=\"i\">expression</span>();\n    <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_SEMICOLON</span>, <span class=\"s\">&quot;Expect &#39;;&#39; after loop condition.&quot;</span>);\n\n    <span class=\"c\">// Jump out of the loop if the condition is false.</span>\n    <span class=\"i\">exitJump</span> = <span class=\"i\">emitJump</span>(<span class=\"a\">OP_JUMP_IF_FALSE</span>);\n    <span class=\"i\">emitByte</span>(<span class=\"a\">OP_POP</span>); <span class=\"c\">// Condition.</span>\n  }\n\n</pre><pre class=\"insert-after\">  consume(TOKEN_RIGHT_PAREN, &quot;Expect ')' after for clauses.&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>forStatement</em>(), replace 1 line</div>\n\n<p>Since the clause is optional, we need to see if it&rsquo;s actually present. If the\nclause is omitted, the next token must be a semicolon, so we look for that to\ntell. If there isn&rsquo;t a semicolon, there must be a condition expression.</p>\n<p>In that case, we compile it. Then, just like with <code>while</code>, we emit a conditional\njump that exits the loop if the condition is falsey. Since the jump leaves the\nvalue on the stack, we pop it before executing the body. 
That ensures we discard\nthe value when the condition is true.</p>\n<p>After the loop body, we need to patch that jump.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  emitLoop(loopStart);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>forStatement</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">if</span> (<span class=\"i\">exitJump</span> != -<span class=\"n\">1</span>) {\n    <span class=\"i\">patchJump</span>(<span class=\"i\">exitJump</span>);\n    <span class=\"i\">emitByte</span>(<span class=\"a\">OP_POP</span>); <span class=\"c\">// Condition.</span>\n  }\n\n</pre><pre class=\"insert-after\">  endScope();\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>forStatement</em>()</div>\n\n<p>We do this only when there is a condition clause. If there isn&rsquo;t, there&rsquo;s no\njump to patch and no condition value on the stack to pop.</p>\n<h3><a href=\"#increment-clause\" id=\"increment-clause\"><small>23&#8202;.&#8202;4&#8202;.&#8202;3</small>Increment clause</a></h3>\n<p>I&rsquo;ve saved the best for last, the increment clause. It&rsquo;s pretty convoluted. It\nappears textually before the body, but executes <em>after</em> it. If we parsed to an\nAST and generated code in a separate pass, we could simply traverse into and\ncompile the <code>for</code> statement AST&rsquo;s body field before its increment clause.</p>\n<p>Unfortunately, we can&rsquo;t compile the increment clause later, since our compiler\nonly makes a single pass over the code. Instead, we&rsquo;ll <em>jump over</em> the\nincrement, run the body, jump <em>back</em> up to the increment, run it, and then go to\nthe next iteration.</p>\n<p>I know, a little weird, but hey, it beats manually managing ASTs in memory in C,\nright? 
Here&rsquo;s the code:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  }\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>forStatement</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (!<span class=\"i\">match</span>(<span class=\"a\">TOKEN_RIGHT_PAREN</span>)) {\n    <span class=\"t\">int</span> <span class=\"i\">bodyJump</span> = <span class=\"i\">emitJump</span>(<span class=\"a\">OP_JUMP</span>);\n    <span class=\"t\">int</span> <span class=\"i\">incrementStart</span> = <span class=\"i\">currentChunk</span>()-&gt;<span class=\"i\">count</span>;\n    <span class=\"i\">expression</span>();\n    <span class=\"i\">emitByte</span>(<span class=\"a\">OP_POP</span>);\n    <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_RIGHT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;)&#39; after for clauses.&quot;</span>);\n\n    <span class=\"i\">emitLoop</span>(<span class=\"i\">loopStart</span>);\n    <span class=\"i\">loopStart</span> = <span class=\"i\">incrementStart</span>;\n    <span class=\"i\">patchJump</span>(<span class=\"i\">bodyJump</span>);\n  }\n</pre><pre class=\"insert-after\">\n\n  statement();\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>forStatement</em>(), replace 1 line</div>\n\n<p>Again, it&rsquo;s optional. Since this is the last clause, when omitted, the next\ntoken will be the closing parenthesis. When an increment is present, we need to\ncompile it now, but it shouldn&rsquo;t execute yet. So, first, we emit an\nunconditional jump that hops over the increment clause&rsquo;s code to the body of the\nloop.</p>\n<p>Next, we compile the increment expression itself. This is usually an assignment.\nWhatever it is, we only execute it for its side effect, so we also emit a pop to\ndiscard its value.</p>\n<p>The last part is a little tricky. First, we emit a loop instruction. 
This is the\nmain loop that takes us back to the top of the <code>for</code> loop<span class=\"em\">&mdash;</span>right before the\ncondition expression if there is one. That loop happens right after the\nincrement, since the increment executes at the end of each loop iteration.</p>\n<p>Then we change <code>loopStart</code> to point to the offset where the increment expression\nbegins. Later, when we emit the loop instruction after the body statement, this\nwill cause it to jump up to the <em>increment</em> expression instead of the top of the\nloop like it does when there is no increment. This is how we weave the\nincrement in to run after the body.</p>\n<p>It&rsquo;s convoluted, but it all works out. A complete loop with all the clauses\ncompiles to a flow like this:</p><img src=\"image/jumping-back-and-forth/for.png\" alt=\"Flowchart of the compiled bytecode of a for statement.\" />\n<p>As with implementing <code>for</code> loops in jlox, we didn&rsquo;t need to touch the runtime.\nIt all gets compiled down to primitive control flow operations the VM already\nsupports. In this chapter, we&rsquo;ve taken a big <span name=\"leap\">leap</span>\nforward<span class=\"em\">&mdash;</span>clox is now Turing complete. We&rsquo;ve also covered quite a bit of new\nsyntax: three statements and two expression forms. Even so, it only took three\nnew simple instructions. That&rsquo;s a pretty good effort-to-reward ratio for the\narchitecture of our VM.</p>\n<aside name=\"leap\">\n<p>I couldn&rsquo;t resist the pun. I regret nothing.</p>\n</aside>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>In addition to <code>if</code> statements, most C-family languages have a multi-way\n<code>switch</code> statement. Add one to clox. 
The grammar is:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">switchStmt</span>     → <span class=\"s\">&quot;switch&quot;</span> <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;)&quot;</span>\n                 <span class=\"s\">&quot;{&quot;</span> <span class=\"i\">switchCase</span>* <span class=\"i\">defaultCase</span>? <span class=\"s\">&quot;}&quot;</span> ;\n<span class=\"i\">switchCase</span>     → <span class=\"s\">&quot;case&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;:&quot;</span> <span class=\"i\">statement</span>* ;\n<span class=\"i\">defaultCase</span>    → <span class=\"s\">&quot;default&quot;</span> <span class=\"s\">&quot;:&quot;</span> <span class=\"i\">statement</span>* ;\n</pre></div>\n<p>To execute a <code>switch</code> statement, first evaluate the parenthesized switch\nvalue expression. Then walk the cases. For each case, evaluate its value\nexpression. If the case value is equal to the switch value, execute the\nstatements under the case and then exit the <code>switch</code> statement. Otherwise,\ntry the next case. If no case matches and there is a <code>default</code> clause,\nexecute its statements.</p>\n<p>To keep things simpler, we&rsquo;re omitting fallthrough and <code>break</code> statements.\nEach case automatically jumps to the end of the switch statement after its\nstatements are done.</p>\n</li>\n<li>\n<p>In jlox, we had a challenge to add support for <code>break</code> statements. This\ntime, let&rsquo;s do <code>continue</code>:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">continueStmt</span>   → <span class=\"s\">&quot;continue&quot;</span> <span class=\"s\">&quot;;&quot;</span> ;\n</pre></div>\n<p>A <code>continue</code> statement jumps directly to the top of the nearest enclosing\nloop, skipping the rest of the loop body. Inside a <code>for</code> loop, a <code>continue</code>\njumps to the increment clause, if there is one. 
It&rsquo;s a compile-time error to\nhave a <code>continue</code> statement not enclosed in a loop.</p>\n<p>Make sure to think about scope. What should happen to local variables\ndeclared inside the body of the loop or in blocks nested inside the loop\nwhen a <code>continue</code> is executed?</p>\n</li>\n<li>\n<p>Control flow constructs have been mostly unchanged since Algol 68. Language\nevolution since then has focused on making code more declarative and high\nlevel, so imperative control flow hasn&rsquo;t gotten much attention.</p>\n<p>For fun, try to invent a useful novel control flow feature for Lox. It can\nbe a refinement of an existing form or something entirely new. In practice,\nit&rsquo;s hard to come up with something useful enough at this low expressiveness\nlevel to outweigh the cost of forcing a user to learn an unfamiliar notation\nand behavior, but it&rsquo;s a good chance to practice your design skills.</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: Considering Goto Harmful</a></h2>\n<p>Discovering that all of our beautiful structured control flow in Lox is actually\ncompiled to raw unstructured jumps is like the moment in Scooby Doo when the\nmonster rips the mask off their face. It was goto all along! Except in this\ncase, the monster is <em>under</em> the mask. We all know goto is evil. But<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>why?</p>\n<p>It is true that you can write outrageously unmaintainable code using goto. But I\ndon&rsquo;t think most programmers around today have seen that first hand. It&rsquo;s been a\nlong time since that style was common. 
These days, it&rsquo;s a boogeyman we invoke\nin scary stories around the campfire.</p>\n<p>The reason we rarely confront that monster in person is that Edsger Dijkstra\nslayed it with his famous letter &ldquo;Go To Statement Considered Harmful&rdquo;, published\nin <em>Communications of the ACM</em> (March, 1968). Debate around structured\nprogramming had been fierce for some time with adherents on both sides, but I\nthink Dijkstra deserves the most credit for effectively ending it. Most new\nlanguages today have no unstructured jump statements.</p>\n<p>A one-and-a-half page letter that almost single-handedly destroyed a language\nfeature must be pretty impressive stuff. If you haven&rsquo;t read it, I encourage you\nto do so. It&rsquo;s a seminal piece of computer science lore, one of our tribe&rsquo;s\nancestral songs. Also, it&rsquo;s a nice, short bit of practice for reading academic\nCS <span name=\"style\">writing</span>, which is a useful skill to develop.</p>\n<aside name=\"style\">\n<p>That is, if you can get past Dijkstra&rsquo;s insufferable faux-modest\nself-aggrandizing writing style:</p>\n<blockquote>\n<p>More recently I discovered why the use of the go to statement has such\ndisastrous effects. <span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>At that time I did not attach too much importance to\nthis discovery; I now submit my considerations for publication because in very\nrecent discussions in which the subject turned up, I have been urged to do so.</p>\n</blockquote>\n<p>Ah, yet another one of my many discoveries. I couldn&rsquo;t even be bothered to write\nit up until the clamoring masses begged me to.</p>\n</aside>\n<p>I&rsquo;ve read it through a number of times, along with a few critiques, responses,\nand commentaries. I ended up with mixed feelings, at best. At a very high level,\nI&rsquo;m with him. 
His general argument is something like this:</p>\n<ol>\n<li>\n<p>As programmers, we write programs<span class=\"em\">&mdash;</span>static text<span class=\"em\">&mdash;</span>but what we care about\nis the actual running program<span class=\"em\">&mdash;</span>its dynamic behavior.</p>\n</li>\n<li>\n<p>We&rsquo;re better at reasoning about static things than dynamic things. (He\ndoesn&rsquo;t provide any evidence to support this claim, but I accept it.)</p>\n</li>\n<li>\n<p>Thus, the more we can make the dynamic execution of the program reflect its\ntextual structure, the better.</p>\n</li>\n</ol>\n<p>This is a good start. Drawing our attention to the separation between the code\nwe write and the code as it runs inside the machine is an interesting insight.\nThen he tries to define a &ldquo;correspondence&rdquo; between program text and execution.\nFor someone who spent literally his entire career advocating greater rigor in\nprogramming, his definition is pretty hand-wavey. He says:</p>\n<blockquote>\n<p>Let us now consider how we can characterize the progress of a process. (You\nmay think about this question in a very concrete manner: suppose that a\nprocess, considered as a time succession of actions, is stopped after an\narbitrary action, what data do we have to fix in order that we can redo the\nprocess until the very same point?)</p>\n</blockquote>\n<p>Imagine it like this. You have two computers with the same program running on\nthe exact same inputs<span class=\"em\">&mdash;</span>so totally deterministic. You pause one of them at an\narbitrary point in its execution. What data would you need to send to the other\ncomputer to be able to stop it exactly as far along as the first one was?</p>\n<p>If your program allows only simple statements like assignment, it&rsquo;s easy. You\njust need to know the point after the last statement you executed. Basically a\nbreakpoint, the <code>ip</code> in our VM, or the line number in an error message. 
Adding\nbranching control flow like <code>if</code> and <code>switch</code> doesn&rsquo;t add any more to this. Even\nif the marker points inside a branch, we can still tell where we are.</p>\n<p>Once you add function calls, you need something more. You could have paused the\nfirst computer in the middle of a function, but that function may be called from\nmultiple places. To pause the second machine at exactly the same point in <em>the\nentire program&rsquo;s</em> execution, you need to pause it on the <em>right</em> call to that\nfunction.</p>\n<p>So you need to know not just the current statement, but, for function calls that\nhaven&rsquo;t returned yet, you need to know the locations of the callsites. In other\nwords, a call stack, though I don&rsquo;t think that term existed when Dijkstra wrote\nthis. Groovy.</p>\n<p>He notes that loops make things harder. If you pause in the middle of a loop\nbody, you don&rsquo;t know how many iterations have run. So he says you also need to\nkeep an iteration count. And, since loops can nest, you need a stack of those\n(presumably interleaved with the call stack pointers since you can be in loops\nin outer calls too).</p>\n<p>This is where it gets weird. So we&rsquo;re really building to something now, and you\nexpect him to explain how goto breaks all of this. Instead, he just says:</p>\n<blockquote>\n<p>The unbridled use of the go to statement has an immediate consequence that it\nbecomes terribly hard to find a meaningful set of coordinates in which to\ndescribe the process progress.</p>\n</blockquote>\n<p>He doesn&rsquo;t prove that this is hard, or say why. He just says it. He does say\nthat one approach is unsatisfactory:</p>\n<blockquote>\n<p>With the go to statement one can, of course, still describe the progress\nuniquely by a counter counting the number of actions performed since program\nstart (viz. a kind of normalized clock). 
The difficulty is that such a\ncoordinate, although unique, is utterly unhelpful.</p>\n</blockquote>\n<p>But<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>that&rsquo;s effectively what loop counters do, and he was fine with those.\nIt&rsquo;s not like every loop is a simple &ldquo;for every integer from 0 to 10&rdquo;\nincrementing count. Many are <code>while</code> loops with complex conditionals.</p>\n<p>Taking an example close to home, consider the core bytecode execution loop at\nthe heart of clox. Dijkstra argues that that loop is tractable because we can\nsimply count how many times the loop has run to reason about its progress. But\nthat loop runs once for each executed instruction in some user&rsquo;s compiled Lox\nprogram. Does knowing that it executed 6,201 bytecode instructions really tell\nus VM maintainers <em>anything</em> edifying about the state of the interpreter?</p>\n<p>In fact, this particular example points to a deeper truth. Böhm and Jacopini\n<a href=\"https://en.wikipedia.org/wiki/Structured_program_theorem\">proved</a> that <em>any</em> control flow using goto can be transformed into one using\njust sequencing, loops, and branches. Our bytecode interpreter loop is a living\nexample of that proof: it implements the unstructured control flow of the clox\nbytecode instruction set without using any gotos itself.</p>\n<p>That seems to offer a counter-argument to Dijkstra&rsquo;s claim: you <em>can</em> define a\ncorrespondence for a program using gotos by transforming it to one that doesn&rsquo;t\nand then use the correspondence from that program, which<span class=\"em\">&mdash;</span>according to him<span class=\"em\">&mdash;</span>is acceptable because it uses only branches and loops.</p>\n<p>But, honestly, my argument here is also weak. I think both of us are basically\ndoing pretend math and using fake logic to make what should be an empirical,\nhuman-centered argument. 
Dijkstra is right that some code using goto is really\nbad. Much of that could and should be turned into clearer code by using\nstructured control flow.</p>\n<p>By eliminating goto completely from languages, you&rsquo;re definitely prevented from\nwriting bad code using gotos. It may be that forcing users to use structured\ncontrol flow and making it an uphill battle to write goto-like code using those\nconstructs is a net win for all of our productivity.</p>\n<p>But I do wonder sometimes if we threw out the baby with the bathwater. In the\nabsence of goto, we often resort to more complex structured patterns. The\n&ldquo;switch inside a loop&rdquo; is a classic one. Another is using a guard variable to\nexit out of a series of nested loops:</p><span name=\"break\">\n</span>\n<div class=\"codehilite\"><pre><span class=\"c\">// See if the matrix contains a zero.</span>\n<span class=\"t\">bool</span> <span class=\"i\">found</span> = <span class=\"k\">false</span>;\n<span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">x</span> = <span class=\"n\">0</span>; <span class=\"i\">x</span> &lt; <span class=\"i\">xSize</span>; <span class=\"i\">x</span>++) {\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">y</span> = <span class=\"n\">0</span>; <span class=\"i\">y</span> &lt; <span class=\"i\">ySize</span>; <span class=\"i\">y</span>++) {\n    <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">z</span> = <span class=\"n\">0</span>; <span class=\"i\">z</span> &lt; <span class=\"i\">zSize</span>; <span class=\"i\">z</span>++) {\n      <span class=\"k\">if</span> (<span class=\"i\">matrix</span>[<span class=\"i\">x</span>][<span class=\"i\">y</span>][<span class=\"i\">z</span>] == <span class=\"n\">0</span>) {\n        <span class=\"i\">printf</span>(<span class=\"s\">&quot;found&quot;</span>);\n        <span class=\"i\">found</span> = <span class=\"k\">true</span>;\n        <span 
class=\"k\">break</span>;\n      }\n    }\n    <span class=\"k\">if</span> (<span class=\"i\">found</span>) <span class=\"k\">break</span>;\n  }\n  <span class=\"k\">if</span> (<span class=\"i\">found</span>) <span class=\"k\">break</span>;\n}\n</pre></div>\n<p>Is that really better than:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">x</span> = <span class=\"n\">0</span>; <span class=\"i\">x</span> &lt; <span class=\"i\">xSize</span>; <span class=\"i\">x</span>++) {\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">y</span> = <span class=\"n\">0</span>; <span class=\"i\">y</span> &lt; <span class=\"i\">ySize</span>; <span class=\"i\">y</span>++) {\n    <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">z</span> = <span class=\"n\">0</span>; <span class=\"i\">z</span> &lt; <span class=\"i\">zSize</span>; <span class=\"i\">z</span>++) {\n      <span class=\"k\">if</span> (<span class=\"i\">matrix</span>[<span class=\"i\">x</span>][<span class=\"i\">y</span>][<span class=\"i\">z</span>] == <span class=\"n\">0</span>) {\n        <span class=\"i\">printf</span>(<span class=\"s\">&quot;found&quot;</span>);\n        <span class=\"k\">goto</span> <span class=\"i\">done</span>;\n      }\n    }\n  }\n}\n<span class=\"i\">done</span>:\n</pre></div>\n<aside name=\"break\">\n<p>You could do this without <code>break</code> statements<span class=\"em\">&mdash;</span>themselves a limited goto-ish\nconstruct<span class=\"em\">&mdash;</span>by inserting <code>!found &amp;&amp;</code> at the beginning of the condition clause\nof each loop.</p>\n</aside>\n<p>I guess what I really don&rsquo;t like is that we&rsquo;re making language design and\nengineering decisions today based on fear. Few people today have any subtle\nunderstanding of the problems and benefits of goto. Instead, we just think it&rsquo;s\n&ldquo;considered harmful&rdquo;. 
Personally, I&rsquo;ve never found dogma a good starting place\nfor quality creative work.</p>\n</div>\n\n<footer>\n<a href=\"calls-and-functions.html\" class=\"next\">\n  Next Chapter: &ldquo;Calls and Functions&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/local-variables.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Local Variables &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Local Variables<small>22</small></a></h3>\n\n<ul>\n    <li><a href=\"#representing-local-variables\"><small>22.1</small> Representing Local Variables</a></li>\n    <li><a href=\"#block-statements\"><small>22.2</small> Block Statements</a></li>\n    <li><a href=\"#declaring-local-variables\"><small>22.3</small> Declaring Local Variables</a></li>\n    <li><a href=\"#using-locals\"><small>22.4</small> Using Locals</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a 
href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"global-variables.html\" title=\"Global Variables\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"jumping-back-and-forth.html\" title=\"Jumping Back and Forth\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"global-variables.html\" title=\"Global Variables\" class=\"prev\">←</a>\n<a href=\"jumping-back-and-forth.html\" title=\"Jumping Back and Forth\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Local Variables<small>22</small></a></h3>\n\n<ul>\n    <li><a href=\"#representing-local-variables\"><small>22.1</small> Representing Local Variables</a></li>\n    <li><a href=\"#block-statements\"><small>22.2</small> Block Statements</a></li>\n    <li><a href=\"#declaring-local-variables\"><small>22.3</small> Declaring Local Variables</a></li>\n    <li><a href=\"#using-locals\"><small>22.4</small> Using Locals</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"global-variables.html\" title=\"Global Variables\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"jumping-back-and-forth.html\" title=\"Jumping Back and Forth\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">22</div>\n  <h1>Local 
Variables</h1>\n\n<blockquote>\n<p>And as imagination bodies forth<br />\nThe forms of things unknown, the poet&rsquo;s pen<br />\nTurns them to shapes and gives to airy nothing<br />\nA local habitation and a name.</p>\n<p><cite>William Shakespeare, <em>A Midsummer Night&rsquo;s Dream</em></cite></p>\n</blockquote>\n<p>The <a href=\"global-variables.html\">last chapter</a> introduced variables to clox, but only of the <span\nname=\"global\">global</span> variety. In this chapter, we&rsquo;ll extend that to\nsupport blocks, block scope, and local variables. In jlox, we managed to pack\nall of that and globals into one chapter. For clox, that&rsquo;s two chapters&rsquo; worth of\nwork, partially because, frankly, everything takes more effort in C.</p>\n<aside name=\"global\">\n<p>There&rsquo;s probably some dumb &ldquo;think globally, act locally&rdquo; joke here, but I&rsquo;m\nstruggling to find it.</p>\n</aside>\n<p>But an even more important reason is that our approach to local variables will\nbe quite different from how we implemented globals. Global variables are late\nbound in Lox. &ldquo;Late&rdquo; in this context means &ldquo;resolved after compile time&rdquo;. That&rsquo;s\ngood for keeping the compiler simple, but not great for performance. Local\nvariables are one of the most-used <span name=\"params\">parts</span> of a\nlanguage. If locals are slow, <em>everything</em> is slow. So we want a strategy for\nlocal variables that&rsquo;s as efficient as possible.</p>\n<aside name=\"params\">\n<p>Function parameters are also heavily used. They work like local variables too,\nso we&rsquo;ll use the same implementation technique for them.</p>\n</aside>\n<p>Fortunately, lexical scoping is here to help us. As the name implies, lexical\nscope means we can resolve a local variable just by looking at the text of the\nprogram<span class=\"em\">&mdash;</span>locals are <em>not</em> late bound. 
Any processing work we do in the\ncompiler is work we <em>don&rsquo;t</em> have to do at runtime, so our implementation of\nlocal variables will lean heavily on the compiler.</p>\n<h2><a href=\"#representing-local-variables\" id=\"representing-local-variables\"><small>22&#8202;.&#8202;1</small>Representing Local Variables</a></h2>\n<p>The nice thing about hacking on a programming language in modern times is\nthere&rsquo;s a long lineage of other languages to learn from. So how do C and Java\nmanage their local variables? Why, on the stack, of course! They typically use\nthe native stack mechanisms supported by the chip and OS. That&rsquo;s a little too\nlow level for us, but inside the virtual world of clox, we have our own stack we\ncan use.</p>\n<p>Right now, we only use it for holding on to <strong>temporaries</strong><span class=\"em\">&mdash;</span>short-lived blobs\nof data that we need to remember while computing an expression. As long as we\ndon&rsquo;t get in the way of those, we can stuff our local variables onto the stack\ntoo. This is great for performance. Allocating space for a new local requires\nonly incrementing the <code>stackTop</code> pointer, and freeing is likewise a decrement.\nAccessing a variable from a known stack slot is an indexed array lookup.</p>\n<p>We do need to be careful, though. The VM expects the stack to behave like, well,\na stack. We have to be OK with allocating new locals only on the top of the\nstack, and we have to accept that we can discard a local only when nothing is\nabove it on the stack. Also, we need to make sure temporaries don&rsquo;t interfere.</p>\n<p>Conveniently, the design of Lox is in <span name=\"harmony\">harmony</span> with\nthese constraints. New locals are always created by declaration statements.\nStatements don&rsquo;t nest inside expressions, so there are never any temporaries on\nthe stack when a statement begins executing. Blocks are strictly nested. 
When a\nblock ends, it always takes the innermost, most recently declared locals with\nit. Since those are also the locals that came into scope last, they should be on\ntop of the stack where we need them.</p>\n<aside name=\"harmony\">\n<p>This alignment obviously isn&rsquo;t coincidental. I designed Lox to be amenable to\nsingle-pass compilation to stack-based bytecode. But I didn&rsquo;t have to tweak the\nlanguage too much to fit in those restrictions. Most of its design should feel\npretty natural.</p>\n<p>This is in large part because the history of languages is deeply tied to\nsingle-pass compilation and<span class=\"em\">&mdash;</span>to a lesser degree<span class=\"em\">&mdash;</span>stack-based architectures.\nLox&rsquo;s block scoping follows a tradition stretching back to BCPL. As programmers,\nour intuition of what&rsquo;s &ldquo;normal&rdquo; in a language is informed even today by the\nhardware limitations of yesteryear.</p>\n</aside>\n<p>Step through this example program and watch how the local variables come in and\ngo out of scope:</p><img src=\"image/local-variables/scopes.png\" alt=\"A series of local variables come into and out of scope in a stack-like fashion.\" />\n<p>See how they fit a stack perfectly? It seems that the stack will work for\nstoring locals at runtime. But we can go further than that. Not only do we know\n<em>that</em> they will be on the stack, but we can even pin down precisely <em>where</em>\nthey will be on the stack. Since the compiler knows exactly which local\nvariables are in scope at any point in time, it can effectively simulate the\nstack during compilation and note <span name=\"fn\">where</span> in the stack each\nvariable lives.</p>\n<p>We&rsquo;ll take advantage of this by using these stack offsets as operands for the\nbytecode instructions that read and store local variables. 
This makes working\nwith locals deliciously fast<span class=\"em\">&mdash;</span>as simple as indexing into an array.</p>\n<aside name=\"fn\">\n<p>In this chapter, locals start at the bottom of the VM&rsquo;s stack array and are\nindexed from there. When we add <a href=\"calls-and-functions.html\">functions</a>, that scheme gets a little more\ncomplex. Each function needs its own region of the stack for its parameters and\nlocal variables. But, as we&rsquo;ll see, that doesn&rsquo;t add as much complexity as you\nmight expect.</p>\n</aside>\n<p>There&rsquo;s a lot of state we need to track in the compiler to make this whole thing\ngo, so let&rsquo;s get started there. In jlox, we used a linked chain of &ldquo;environment&rdquo;\nHashMaps to track which local variables were currently in scope. That&rsquo;s sort of\nthe classic, schoolbook way of representing lexical scope. For clox, as usual,\nwe&rsquo;re going a little closer to the metal. All of the state lives in a new\nstruct.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} ParseRule;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after struct <em>ParseRule</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">Local</span> <span class=\"i\">locals</span>[<span class=\"a\">UINT8_COUNT</span>];\n  <span class=\"t\">int</span> <span class=\"i\">localCount</span>;\n  <span class=\"t\">int</span> <span class=\"i\">scopeDepth</span>;\n} <span class=\"t\">Compiler</span>;\n</pre><pre class=\"insert-after\">\n\nParser parser;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after struct <em>ParseRule</em></div>\n\n<p>We have a simple, flat array of all locals that are in scope during each point in\nthe compilation process. They are <span name=\"order\">ordered</span> in the array\nin the order that their declarations appear in the code. 
Since the instruction\noperand we&rsquo;ll use to encode a local is a single byte, our VM has a hard limit on\nthe number of locals that can be in scope at once. That means we can also give\nthe locals array a fixed size.</p>\n<aside name=\"order\">\n<p>We&rsquo;re writing a single-pass compiler, so it&rsquo;s not like we have <em>too</em> many other\noptions for how to order them in the array.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define DEBUG_TRACE_EXECUTION\n</pre><div class=\"source-file\"><em>common.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define UINT8_COUNT (UINT8_MAX + 1)</span>\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>common.h</em></div>\n\n<p>Back in the Compiler struct, the <code>localCount</code> field tracks how many locals are\nin scope<span class=\"em\">&mdash;</span>how many of those array slots are in use. We also track the &ldquo;scope\ndepth&rdquo;. This is the number of blocks surrounding the current bit of code we&rsquo;re\ncompiling.</p>\n<p>Our Java interpreter used a chain of maps to keep each block&rsquo;s variables\nseparate from other blocks&rsquo;. This time, we&rsquo;ll simply number variables with the\nlevel of nesting where they appear. Zero is the global scope, one is the first\ntop-level block, two is inside that, you get the idea. 
We use this to track\nwhich block each local belongs to so that we know which locals to discard when a\nblock ends.</p>\n<p>Each local in the array is one of these:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} ParseRule;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after struct <em>ParseRule</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">Token</span> <span class=\"i\">name</span>;\n  <span class=\"t\">int</span> <span class=\"i\">depth</span>;\n} <span class=\"t\">Local</span>;\n</pre><pre class=\"insert-after\">\n\ntypedef struct {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after struct <em>ParseRule</em></div>\n\n<p>We store the name of the variable. When we&rsquo;re resolving an identifier, we\ncompare the identifier&rsquo;s lexeme with each local&rsquo;s name to find a match. It&rsquo;s\npretty hard to resolve a variable if you don&rsquo;t know its name. The <code>depth</code> field\nrecords the scope depth of the block where the local variable was declared.\nThat&rsquo;s all the state we need for now.</p>\n<p>This is a very different representation from what we had in jlox, but it still\nlets us answer all of the same questions our compiler needs to ask of the\nlexical environment. The next step is figuring out how the compiler <em>gets</em> at\nthis state. If we were <span name=\"thread\">principled</span> engineers, we&rsquo;d\ngive each function in the front end a parameter that accepts a pointer to a\nCompiler. 
We&rsquo;d create a Compiler at the beginning and carefully thread it\nthrough each function call<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>but that would mean a lot of boring changes to\nthe code we already wrote, so here&rsquo;s a global variable instead:</p>\n<aside name=\"thread\">\n<p>In particular, if we ever want to use our compiler in a multi-threaded\napplication, possibly with multiple compilers running in parallel, then using a\nglobal variable is a <em>bad</em> idea.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">Parser parser;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after variable <em>parser</em></div>\n<pre class=\"insert\"><span class=\"t\">Compiler</span>* <span class=\"i\">current</span> = <span class=\"a\">NULL</span>;\n</pre><pre class=\"insert-after\">Chunk* compilingChunk;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after variable <em>parser</em></div>\n\n<p>Here&rsquo;s a little function to initialize the compiler:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>emitConstant</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">initCompiler</span>(<span class=\"t\">Compiler</span>* <span class=\"i\">compiler</span>) {\n  <span class=\"i\">compiler</span>-&gt;<span class=\"i\">localCount</span> = <span class=\"n\">0</span>;\n  <span class=\"i\">compiler</span>-&gt;<span class=\"i\">scopeDepth</span> = <span class=\"n\">0</span>;\n  <span class=\"i\">current</span> = <span class=\"i\">compiler</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>emitConstant</em>()</div>\n\n<p>When we first start up the VM, we call it to get everything into a clean state.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  initScanner(source);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin 
<em>compile</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">Compiler</span> <span class=\"i\">compiler</span>;\n  <span class=\"i\">initCompiler</span>(&amp;<span class=\"i\">compiler</span>);\n</pre><pre class=\"insert-after\">  compilingChunk = chunk;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>compile</em>()</div>\n\n<p>Our compiler has the data it needs, but not the operations on that data. There&rsquo;s\nno way to create and destroy scopes, or add and resolve variables. We&rsquo;ll add\nthose as we need them. First, let&rsquo;s start building some language features.</p>\n<h2><a href=\"#block-statements\" id=\"block-statements\"><small>22&#8202;.&#8202;2</small>Block Statements</a></h2>\n<p>Before we can have any local variables, we need some local scopes. These come\nfrom two things: function bodies and <span name=\"block\">blocks</span>. Functions\nare a big chunk of work that we&rsquo;ll tackle in <a href=\"calls-and-functions.html\">a later chapter</a>, so\nfor now we&rsquo;re only going to do blocks. As usual, we start with the syntax. The\nnew grammar we&rsquo;ll introduce is:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">statement</span>      → <span class=\"i\">exprStmt</span>\n               | <span class=\"i\">printStmt</span>\n               | <span class=\"i\">block</span> ;\n\n<span class=\"i\">block</span>          → <span class=\"s\">&quot;{&quot;</span> <span class=\"i\">declaration</span>* <span class=\"s\">&quot;}&quot;</span> ;\n</pre></div>\n<aside name=\"block\">\n<p>When you think about it, &ldquo;block&rdquo; is a weird name. Used metaphorically, &ldquo;block&rdquo;\nusually means a small indivisible unit, but for some reason, the Algol 60\ncommittee decided to use it to refer to a <em>compound</em> structure<span class=\"em\">&mdash;</span>a series of\nstatements. It could be worse, I suppose. 
Algol 58 called <code>begin</code> and <code>end</code>\n&ldquo;statement parentheses&rdquo;.</p><img src=\"image/local-variables/block.png\" alt=\"A cinder block.\" class=\"above\" />\n</aside>\n<p>Blocks are a kind of statement, so the rule for them goes in the <code>statement</code>\nproduction. The corresponding code to compile one looks like this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  if (match(TOKEN_PRINT)) {\n    printStatement();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>statement</em>()</div>\n<pre class=\"insert\">  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_LEFT_BRACE</span>)) {\n    <span class=\"i\">beginScope</span>();\n    <span class=\"i\">block</span>();\n    <span class=\"i\">endScope</span>();\n</pre><pre class=\"insert-after\">  } else {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>statement</em>()</div>\n\n<p>After <span name=\"helper\">parsing</span> the initial curly brace, we use this\nhelper function to compile the rest of the block:</p>\n<aside name=\"helper\">\n<p>This function will come in handy later for compiling function bodies.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>expression</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">block</span>() {\n  <span class=\"k\">while</span> (!<span class=\"i\">check</span>(<span class=\"a\">TOKEN_RIGHT_BRACE</span>) &amp;&amp; !<span class=\"i\">check</span>(<span class=\"a\">TOKEN_EOF</span>)) {\n    <span class=\"i\">declaration</span>();\n  }\n\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_RIGHT_BRACE</span>, <span class=\"s\">&quot;Expect &#39;}&#39; after block.&quot;</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>expression</em>()</div>\n\n<p>It keeps parsing 
declarations and statements until it hits the closing brace. As\nwe do with any loop in the parser, we also check for the end of the token\nstream. This way, if there&rsquo;s a malformed program with a missing closing curly,\nthe compiler doesn&rsquo;t get stuck in a loop.</p>\n<p>Executing a block simply means executing the statements it contains, one after\nthe other, so there isn&rsquo;t much to compiling them. The semantically interesting\nthing blocks do is create scopes. Before we compile the body of a block, we call\nthis function to enter a new local scope:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>endCompiler</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">beginScope</span>() {\n  <span class=\"i\">current</span>-&gt;<span class=\"i\">scopeDepth</span>++;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>endCompiler</em>()</div>\n\n<p>In order to &ldquo;create&rdquo; a scope, all we do is increment the current depth. This is\ncertainly much faster than jlox, which allocated an entire new HashMap for\neach one. 
Given <code>beginScope()</code>, you can probably guess what <code>endScope()</code> does.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>beginScope</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">endScope</span>() {\n  <span class=\"i\">current</span>-&gt;<span class=\"i\">scopeDepth</span>--;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>beginScope</em>()</div>\n\n<p>That&rsquo;s it for blocks and scopes<span class=\"em\">&mdash;</span>more or less<span class=\"em\">&mdash;</span>so we&rsquo;re ready to stuff some\nvariables into them.</p>\n<h2><a href=\"#declaring-local-variables\" id=\"declaring-local-variables\"><small>22&#8202;.&#8202;3</small>Declaring Local Variables</a></h2>\n<p>Usually we start with parsing here, but our compiler already supports parsing\nand compiling variable declarations. We&rsquo;ve got <code>var</code> statements, identifier\nexpressions and assignment in there now. It&rsquo;s just that the compiler assumes\nall variables are global. So we don&rsquo;t need any new parsing support, we just need\nto hook up the new scoping semantics to the existing code.</p><img src=\"image/local-variables/declaration.png\" alt=\"The code flow within varDeclaration().\" />\n<p>Variable declaration parsing begins in <code>varDeclaration()</code> and relies on a couple\nof other functions. First, <code>parseVariable()</code> consumes the identifier token for\nthe variable name, adds its lexeme to the chunk&rsquo;s constant table as a string,\nand then returns the constant table index where it was added. Then, after\n<code>varDeclaration()</code> compiles the initializer, it calls <code>defineVariable()</code> to emit\nthe bytecode for storing the variable&rsquo;s value in the global variable hash table.</p>\n<p>Both of those helpers need a few changes to support local variables. 
In\n<code>parseVariable()</code>, we add:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  consume(TOKEN_IDENTIFIER, errorMessage);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>parseVariable</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">declareVariable</span>();\n  <span class=\"k\">if</span> (<span class=\"i\">current</span>-&gt;<span class=\"i\">scopeDepth</span> &gt; <span class=\"n\">0</span>) <span class=\"k\">return</span> <span class=\"n\">0</span>;\n\n</pre><pre class=\"insert-after\">  return identifierConstant(&amp;parser.previous);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>parseVariable</em>()</div>\n\n<p>First, we &ldquo;declare&rdquo; the variable. I&rsquo;ll get to what that means in a second. After\nthat, we exit the function if we&rsquo;re in a local scope. At runtime, locals aren&rsquo;t\nlooked up by name. There&rsquo;s no need to stuff the variable&rsquo;s name into the\nconstant table, so if the declaration is inside a local scope, we return a dummy\ntable index instead.</p>\n<p>Over in <code>defineVariable()</code>, we need to emit the code to store a local variable\nif we&rsquo;re in a local scope. It looks like this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void defineVariable(uint8_t global) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>defineVariable</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">current</span>-&gt;<span class=\"i\">scopeDepth</span> &gt; <span class=\"n\">0</span>) {\n    <span class=\"k\">return</span>;\n  }\n\n</pre><pre class=\"insert-after\">  emitBytes(OP_DEFINE_GLOBAL, global);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>defineVariable</em>()</div>\n\n<p>Wait, what? Yup. That&rsquo;s it. There is no code to create a local variable at\nruntime. Think about what state the VM is in. 
It has already executed the code\nfor the variable&rsquo;s initializer (or the implicit <code>nil</code> if the user omitted an\ninitializer), and that value is sitting right on top of the stack as the only\nremaining temporary. We also know that new locals are allocated at the top of\nthe stack<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>right where that value already is. Thus, there&rsquo;s nothing to do. The\ntemporary simply <em>becomes</em> the local variable. It doesn&rsquo;t get much more\nefficient than that.</p>\n<p><span name=\"locals\"></span></p><img src=\"image/local-variables/local-slots.png\" alt=\"Walking through the bytecode execution showing that each initializer's result ends up in the local's slot.\" />\n<aside name=\"locals\">\n<p>The code on the left compiles to the sequence of instructions on the right.</p>\n</aside>\n<p>OK, so what&rsquo;s &ldquo;declaring&rdquo; about? Here&rsquo;s what that does:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>identifierConstant</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">declareVariable</span>() {\n  <span class=\"k\">if</span> (<span class=\"i\">current</span>-&gt;<span class=\"i\">scopeDepth</span> == <span class=\"n\">0</span>) <span class=\"k\">return</span>;\n\n  <span class=\"t\">Token</span>* <span class=\"i\">name</span> = &amp;<span class=\"i\">parser</span>.<span class=\"i\">previous</span>;\n  <span class=\"i\">addLocal</span>(*<span class=\"i\">name</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>identifierConstant</em>()</div>\n\n<p>This is the point where the compiler records the existence of the variable. We\nonly do this for locals, so if we&rsquo;re in the top-level global scope, we just bail\nout. 
Because global variables are late bound, the compiler doesn&rsquo;t keep track of\nwhich declarations for them it has seen.</p>\n<p>But for local variables, the compiler does need to remember that the variable\nexists. That&rsquo;s what declaring it does<span class=\"em\">&mdash;</span>it adds it to the compiler&rsquo;s list of\nvariables in the current scope. We implement that using another new function.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>identifierConstant</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">addLocal</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>) {\n  <span class=\"t\">Local</span>* <span class=\"i\">local</span> = &amp;<span class=\"i\">current</span>-&gt;<span class=\"i\">locals</span>[<span class=\"i\">current</span>-&gt;<span class=\"i\">localCount</span>++];\n  <span class=\"i\">local</span>-&gt;<span class=\"i\">name</span> = <span class=\"i\">name</span>;\n  <span class=\"i\">local</span>-&gt;<span class=\"i\">depth</span> = <span class=\"i\">current</span>-&gt;<span class=\"i\">scopeDepth</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>identifierConstant</em>()</div>\n\n<p>This initializes the next available Local in the compiler&rsquo;s array of variables.\nIt stores the variable&rsquo;s <span name=\"lexeme\">name</span> and the depth of the\nscope that owns the variable.</p>\n<aside name=\"lexeme\">\n<p>Worried about the lifetime of the string for the variable&rsquo;s name? The Local\ndirectly stores a copy of the Token struct for the identifier. Tokens store a\npointer to the first character of their lexeme and the lexeme&rsquo;s length. 
That\npointer points into the original source string for the script or REPL entry\nbeing compiled.</p>\n<p>As long as that string stays around during the entire compilation process<span class=\"em\">&mdash;</span>which it must since, you know, we&rsquo;re compiling it<span class=\"em\">&mdash;</span>then all of the tokens\npointing into it are fine.</p>\n</aside>\n<p>Our implementation is fine for a correct Lox program, but what about invalid\ncode? Let&rsquo;s aim to be robust. The first error to handle is not really the user&rsquo;s\nfault, but more a limitation of the VM. The instructions to work with local\nvariables refer to them by slot index. That index is stored in a single-byte\noperand, which means the VM only supports up to 256 local variables in scope at\none time.</p>\n<p>If we try to go over that, not only could we not refer to them at runtime, but\nthe compiler would overwrite its own locals array, too. Let&rsquo;s prevent that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void addLocal(Token name) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>addLocal</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">current</span>-&gt;<span class=\"i\">localCount</span> == <span class=\"a\">UINT8_COUNT</span>) {\n    <span class=\"i\">error</span>(<span class=\"s\">&quot;Too many local variables in function.&quot;</span>);\n    <span class=\"k\">return</span>;\n  }\n\n</pre><pre class=\"insert-after\">  Local* local = &amp;current-&gt;locals[current-&gt;localCount++];\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>addLocal</em>()</div>\n\n<p>The next case is trickier. 
Consider:</p>\n<div class=\"codehilite\"><pre>{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;first&quot;</span>;\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;second&quot;</span>;\n}\n</pre></div>\n<p>At the top level, Lox allows redeclaring a variable with the same name as a\nprevious declaration because that&rsquo;s useful for the REPL. But inside a local\nscope, that&rsquo;s a pretty <span name=\"rust\">weird</span> thing to do. It&rsquo;s likely\nto be a mistake, and many languages, including our own Lox, enshrine that\nassumption by making this an error.</p>\n<aside name=\"rust\">\n<p>Interestingly, the Rust programming language <em>does</em> allow this, and idiomatic\ncode relies on it.</p>\n</aside>\n<p>Note that the above program is different from this one:</p>\n<div class=\"codehilite\"><pre>{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;outer&quot;</span>;\n  {\n    <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;inner&quot;</span>;\n  }\n}\n</pre></div>\n<p>It&rsquo;s OK to have two variables with the same name in <em>different</em> scopes, even\nwhen the scopes overlap such that both are visible at the same time. That&rsquo;s\nshadowing, and Lox does allow that. 
It&rsquo;s only an error to have two variables\nwith the same name in the <em>same</em> local scope.</p>\n<p>We detect that error like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  Token* name = &amp;parser.previous;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>declareVariable</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"i\">current</span>-&gt;<span class=\"i\">localCount</span> - <span class=\"n\">1</span>; <span class=\"i\">i</span> &gt;= <span class=\"n\">0</span>; <span class=\"i\">i</span>--) {\n    <span class=\"t\">Local</span>* <span class=\"i\">local</span> = &amp;<span class=\"i\">current</span>-&gt;<span class=\"i\">locals</span>[<span class=\"i\">i</span>];\n    <span class=\"k\">if</span> (<span class=\"i\">local</span>-&gt;<span class=\"i\">depth</span> != -<span class=\"n\">1</span> &amp;&amp; <span class=\"i\">local</span>-&gt;<span class=\"i\">depth</span> &lt; <span class=\"i\">current</span>-&gt;<span class=\"i\">scopeDepth</span>) {\n      <span class=\"k\">break</span>;<span name=\"negative\"> </span>\n    }\n\n    <span class=\"k\">if</span> (<span class=\"i\">identifiersEqual</span>(<span class=\"i\">name</span>, &amp;<span class=\"i\">local</span>-&gt;<span class=\"i\">name</span>)) {\n      <span class=\"i\">error</span>(<span class=\"s\">&quot;Already a variable with this name in this scope.&quot;</span>);\n    }\n  }\n\n</pre><pre class=\"insert-after\">  addLocal(*name);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>declareVariable</em>()</div>\n\n<aside name=\"negative\">\n<p>Don&rsquo;t worry about that odd <code>depth != -1</code> part yet. We&rsquo;ll get to what that&rsquo;s\nabout later.</p>\n</aside>\n<p>Local variables are appended to the array when they&rsquo;re declared, which means the\ncurrent scope is always at the end of the array. 
When we declare a new variable,\nwe start at the end and work backward, looking for an existing variable with the\nsame name. If we find one in the current scope, we report the error. Otherwise,\nif we reach the beginning of the array or a variable owned by another scope,\nthen we know we&rsquo;ve checked all of the existing variables in the scope.</p>\n<p>To see if two identifiers are the same, we use this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>identifierConstant</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">identifiersEqual</span>(<span class=\"t\">Token</span>* <span class=\"i\">a</span>, <span class=\"t\">Token</span>* <span class=\"i\">b</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">a</span>-&gt;<span class=\"i\">length</span> != <span class=\"i\">b</span>-&gt;<span class=\"i\">length</span>) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n  <span class=\"k\">return</span> <span class=\"i\">memcmp</span>(<span class=\"i\">a</span>-&gt;<span class=\"i\">start</span>, <span class=\"i\">b</span>-&gt;<span class=\"i\">start</span>, <span class=\"i\">a</span>-&gt;<span class=\"i\">length</span>) == <span class=\"n\">0</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>identifierConstant</em>()</div>\n\n<p>Since we know the lengths of both lexemes, we check that first. That will fail\nquickly for many non-equal strings. If the <span name=\"hash\">lengths</span> are\nthe same, we check the characters using <code>memcmp()</code>. 
To get to <code>memcmp()</code>, we\nneed an include.</p>\n<aside name=\"hash\">\n<p>It would be a nice little optimization if we could check their hashes, but\ntokens aren&rsquo;t full LoxStrings, so we haven&rsquo;t calculated their hashes yet.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &lt;stdlib.h&gt;\n</pre><div class=\"source-file\"><em>compiler.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &lt;string.h&gt;</span>\n</pre><pre class=\"insert-after\">\n\n#include &quot;common.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em></div>\n\n<p>With this, we&rsquo;re able to bring variables into being. But, like ghosts, they\nlinger on beyond the scope where they are declared. When a block ends, we need\nto put them to rest.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  current-&gt;scopeDepth--;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>endScope</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">while</span> (<span class=\"i\">current</span>-&gt;<span class=\"i\">localCount</span> &gt; <span class=\"n\">0</span> &amp;&amp;\n         <span class=\"i\">current</span>-&gt;<span class=\"i\">locals</span>[<span class=\"i\">current</span>-&gt;<span class=\"i\">localCount</span> - <span class=\"n\">1</span>].<span class=\"i\">depth</span> &gt;\n            <span class=\"i\">current</span>-&gt;<span class=\"i\">scopeDepth</span>) {\n    <span class=\"i\">emitByte</span>(<span class=\"a\">OP_POP</span>);\n    <span class=\"i\">current</span>-&gt;<span class=\"i\">localCount</span>--;\n  }\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>endScope</em>()</div>\n\n<p>When we pop a scope, we walk backward through the local array looking for any\nvariables declared at the scope depth we just left. 
We discard them by simply\ndecrementing the length of the array.</p>\n<p>There is a runtime component to this too. Local variables occupy slots on the\nstack. When a local variable goes out of scope, that slot is no longer needed\nand should be freed. So, for each variable that we discard, we also emit an\n<code>OP_POP</code> <span name=\"pop\">instruction</span> to pop it from the stack.</p>\n<aside name=\"pop\">\n<p>When multiple local variables go out of scope at once, you get a series of\n<code>OP_POP</code> instructions that get interpreted one at a time. A simple optimization\nyou could add to your Lox implementation is a specialized <code>OP_POPN</code> instruction\nthat takes an operand for the number of slots to pop and pops them all at once.</p>\n</aside>\n<h2><a href=\"#using-locals\" id=\"using-locals\"><small>22&#8202;.&#8202;4</small>Using Locals</a></h2>\n<p>We can now compile and execute local variable declarations. At runtime, their\nvalues are sitting where they should be on the stack. Let&rsquo;s start using them.\nWe&rsquo;ll do both variable access and assignment at the same time since they touch\nthe same functions in the compiler.</p>\n<p>We already have code for getting and setting global variables, and<span class=\"em\">&mdash;</span>like good\nlittle software engineers<span class=\"em\">&mdash;</span>we want to reuse as much of that existing code as\nwe can. 
Something like this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void namedVariable(Token name, bool canAssign) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>namedVariable</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"t\">uint8_t</span> <span class=\"i\">getOp</span>, <span class=\"i\">setOp</span>;\n  <span class=\"t\">int</span> <span class=\"i\">arg</span> = <span class=\"i\">resolveLocal</span>(<span class=\"i\">current</span>, &amp;<span class=\"i\">name</span>);\n  <span class=\"k\">if</span> (<span class=\"i\">arg</span> != -<span class=\"n\">1</span>) {\n    <span class=\"i\">getOp</span> = <span class=\"a\">OP_GET_LOCAL</span>;\n    <span class=\"i\">setOp</span> = <span class=\"a\">OP_SET_LOCAL</span>;\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">arg</span> = <span class=\"i\">identifierConstant</span>(&amp;<span class=\"i\">name</span>);\n    <span class=\"i\">getOp</span> = <span class=\"a\">OP_GET_GLOBAL</span>;\n    <span class=\"i\">setOp</span> = <span class=\"a\">OP_SET_GLOBAL</span>;\n  }\n</pre><pre class=\"insert-after\">\n\n  if (canAssign &amp;&amp; match(TOKEN_EQUAL)) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>namedVariable</em>(), replace 1 line</div>\n\n<p>Instead of hardcoding the bytecode instructions emitted for variable access and\nassignment, we use a couple of C variables. First, we try to find a local\nvariable with the given name. If we find one, we use the instructions for\nworking with locals. 
Otherwise, we assume it&rsquo;s a global variable and use the\nexisting bytecode instructions for globals.</p>\n<p>A little further down, we use those variables to emit the right instructions.\nFor assignment:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  if (canAssign &amp;&amp; match(TOKEN_EQUAL)) {\n    expression();\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>namedVariable</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"i\">emitBytes</span>(<span class=\"i\">setOp</span>, (<span class=\"t\">uint8_t</span>)<span class=\"i\">arg</span>);\n</pre><pre class=\"insert-after\">  } else {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>namedVariable</em>(), replace 1 line</div>\n\n<p>And for access:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    emitBytes(setOp, (uint8_t)arg);\n  } else {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>namedVariable</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"i\">emitBytes</span>(<span class=\"i\">getOp</span>, (<span class=\"t\">uint8_t</span>)<span class=\"i\">arg</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>namedVariable</em>(), replace 1 line</div>\n\n<p>The real heart of this chapter, the part where we resolve a local variable, is\nhere:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>identifiersEqual</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">int</span> <span class=\"i\">resolveLocal</span>(<span class=\"t\">Compiler</span>* <span class=\"i\">compiler</span>, <span class=\"t\">Token</span>* <span class=\"i\">name</span>) {\n  <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"i\">compiler</span>-&gt;<span class=\"i\">localCount</span> - <span class=\"n\">1</span>; <span 
class=\"i\">i</span> &gt;= <span class=\"n\">0</span>; <span class=\"i\">i</span>--) {\n    <span class=\"t\">Local</span>* <span class=\"i\">local</span> = &amp;<span class=\"i\">compiler</span>-&gt;<span class=\"i\">locals</span>[<span class=\"i\">i</span>];\n    <span class=\"k\">if</span> (<span class=\"i\">identifiersEqual</span>(<span class=\"i\">name</span>, &amp;<span class=\"i\">local</span>-&gt;<span class=\"i\">name</span>)) {\n      <span class=\"k\">return</span> <span class=\"i\">i</span>;\n    }\n  }\n\n  <span class=\"k\">return</span> -<span class=\"n\">1</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>identifiersEqual</em>()</div>\n\n<p>For all that, it&rsquo;s straightforward. We walk the list of locals that are\ncurrently in scope. If one has the same name as the identifier token, the\nidentifier must refer to that variable. We&rsquo;ve found it! We walk the array\nbackward so that we find the <em>last</em> declared variable with the identifier. That\nensures that inner local variables correctly shadow locals with the same name in\nsurrounding scopes.</p>\n<p>At runtime, we load and store locals using the stack slot index, so that&rsquo;s what\nthe compiler needs to calculate after it resolves the variable. Whenever a\nvariable is declared, we append it to the locals array in Compiler. That means\nthe first local variable is at index zero, the next one is at index one, and so\non. In other words, the locals array in the compiler has the <em>exact</em> same layout\nas the VM&rsquo;s stack will have at runtime. The variable&rsquo;s index in the locals array\nis the same as its stack slot. How convenient!</p>\n<p>If we make it through the whole array without finding a variable with the given\nname, it must not be a local. 
In that case, we return <code>-1</code> to signal that it\nwasn&rsquo;t found and should be assumed to be a global variable instead.</p>\n<h3><a href=\"#interpreting-local-variables\" id=\"interpreting-local-variables\"><small>22&#8202;.&#8202;4&#8202;.&#8202;1</small>Interpreting local variables</a></h3>\n<p>Our compiler is emitting two new instructions, so let&rsquo;s get them working. First\nis loading a local variable:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_POP,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_GET_LOCAL</span>,\n</pre><pre class=\"insert-after\">  OP_GET_GLOBAL,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>And its implementation:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_POP: pop(); break;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_GET_LOCAL</span>: {\n        <span class=\"t\">uint8_t</span> <span class=\"i\">slot</span> = <span class=\"a\">READ_BYTE</span>();\n        <span class=\"i\">push</span>(<span class=\"i\">vm</span>.<span class=\"i\">stack</span>[<span class=\"i\">slot</span>]);<span name=\"slot\"> </span>\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_GET_GLOBAL: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>It takes a single-byte operand for the stack slot where the local lives. It\nloads the value from that index and then pushes it on top of the stack where\nlater instructions can find it.</p>\n<aside name=\"slot\">\n<p>It seems redundant to push the local&rsquo;s value onto the stack since it&rsquo;s already\non the stack lower down somewhere. 
The problem is that the other bytecode\ninstructions only look for data at the <em>top</em> of the stack. This is the core\naspect that makes our bytecode instruction set <em>stack</em>-based.\n<a href=\"a-virtual-machine.html#design-note\">Register-based</a> bytecode instruction sets avoid this stack juggling at the\ncost of having larger instructions with more operands.</p>\n</aside>\n<p>Next is assignment:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_GET_LOCAL,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_SET_LOCAL</span>,\n</pre><pre class=\"insert-after\">  OP_GET_GLOBAL,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>You can probably predict the implementation.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_SET_LOCAL</span>: {\n        <span class=\"t\">uint8_t</span> <span class=\"i\">slot</span> = <span class=\"a\">READ_BYTE</span>();\n        <span class=\"i\">vm</span>.<span class=\"i\">stack</span>[<span class=\"i\">slot</span>] = <span class=\"i\">peek</span>(<span class=\"n\">0</span>);\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_GET_GLOBAL: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>It takes the assigned value from the top of the stack and stores it in the stack\nslot corresponding to the local variable. Note that it doesn&rsquo;t pop the value\nfrom the stack. Remember, assignment is an expression, and every expression\nproduces a value. 
The value of an assignment expression is the assigned value\nitself, so the VM just leaves the value on the stack.</p>\n<p>Our disassembler is incomplete without support for these two new instructions.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return simpleInstruction(&quot;OP_POP&quot;, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_GET_LOCAL</span>:\n      <span class=\"k\">return</span> <span class=\"i\">byteInstruction</span>(<span class=\"s\">&quot;OP_GET_LOCAL&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n    <span class=\"k\">case</span> <span class=\"a\">OP_SET_LOCAL</span>:\n      <span class=\"k\">return</span> <span class=\"i\">byteInstruction</span>(<span class=\"s\">&quot;OP_SET_LOCAL&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_GET_GLOBAL:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>The compiler compiles local variables to direct slot access. The local\nvariable&rsquo;s name never leaves the compiler to make it into the chunk at all.\nThat&rsquo;s great for performance, but not so great for introspection. When we\ndisassemble these instructions, we can&rsquo;t show the variable&rsquo;s name like we could\nwith globals. Instead, we just show the slot number.</p>\n<aside name=\"debug\">\n<p>Erasing local variable names in the compiler is a real issue if we ever want to\nimplement a debugger for our VM. When users step through code, they expect to\nsee the values of local variables organized by their names. 
To support that,\nwe&rsquo;d need to output some additional information that tracks the name of each\nlocal variable at each stack slot.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>debug.c</em><br>\nadd after <em>simpleInstruction</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">int</span> <span class=\"i\">byteInstruction</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">name</span>, <span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>,\n                           <span class=\"t\">int</span> <span class=\"i\">offset</span>) {\n  <span class=\"t\">uint8_t</span> <span class=\"i\">slot</span> = <span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>[<span class=\"i\">offset</span> + <span class=\"n\">1</span>];\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;%-16s %4d</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, <span class=\"i\">name</span>, <span class=\"i\">slot</span>);\n  <span class=\"k\">return</span> <span class=\"i\">offset</span> + <span class=\"n\">2</span>;<span name=\"debug\"> </span>\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, add after <em>simpleInstruction</em>()</div>\n\n<h3><a href=\"#another-scope-edge-case\" id=\"another-scope-edge-case\"><small>22&#8202;.&#8202;4&#8202;.&#8202;2</small>Another scope edge case</a></h3>\n<p>We already sunk some time into handling a couple of weird edge cases around\nscopes. We made sure shadowing works correctly. We report an error if two\nvariables in the same local scope have the same name. For reasons that aren&rsquo;t\nentirely clear to me, variable scoping seems to have a lot of these wrinkles.\nI&rsquo;ve never seen a language where it feels completely <span\nname=\"elegant\">elegant</span>.</p>\n<aside name=\"elegant\">\n<p>No, not even Scheme.</p>\n</aside>\n<p>We&rsquo;ve got one more edge case to deal with before we end this chapter. 
Recall this strange beastie we first met in <a href=\"resolving-and-binding.html#resolving-variable-declarations\">jlox&rsquo;s implementation of variable resolution</a>:</p>\n<div class=\"codehilite\"><pre>{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;outer&quot;</span>;\n  {\n    <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"i\">a</span>;\n  }\n}\n</pre></div>\n<p>We slayed it then by splitting a variable&rsquo;s declaration into two phases, and\nwe&rsquo;ll do that again here:</p><img src=\"image/local-variables/phases.png\" alt=\"An example variable declaration marked 'declared uninitialized' before the variable name and 'ready for use' after the initializer.\" />\n<p>As soon as the variable declaration begins<span class=\"em\">&mdash;</span>in other words, before its\ninitializer<span class=\"em\">&mdash;</span>the name is declared in the current scope. The variable exists,\nbut in a special &ldquo;uninitialized&rdquo; state. Then we compile the initializer. If at\nany point in that expression we resolve an identifier that points back to this\nvariable, we&rsquo;ll see that it is not initialized yet and report an error. After we\nfinish compiling the initializer, we mark the variable as initialized and ready\nfor use.</p>\n<p>To implement this, when we declare a local, we need to indicate the\n&ldquo;uninitialized&rdquo; state somehow. We could add a new field to Local, but let&rsquo;s be a\nlittle more parsimonious with memory. 
Instead, we&rsquo;ll set the variable&rsquo;s scope\ndepth to a special sentinel value, <code>-1</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  local-&gt;name = name;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>addLocal</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"i\">local</span>-&gt;<span class=\"i\">depth</span> = -<span class=\"n\">1</span>;\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>addLocal</em>(), replace 1 line</div>\n\n<p>Later, once the variable&rsquo;s initializer has been compiled, we mark it\ninitialized.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  if (current-&gt;scopeDepth &gt; 0) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>defineVariable</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">markInitialized</span>();\n</pre><pre class=\"insert-after\">    return;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>defineVariable</em>()</div>\n\n<p>That is implemented like so:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>parseVariable</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">markInitialized</span>() {\n  <span class=\"i\">current</span>-&gt;<span class=\"i\">locals</span>[<span class=\"i\">current</span>-&gt;<span class=\"i\">localCount</span> - <span class=\"n\">1</span>].<span class=\"i\">depth</span> =\n      <span class=\"i\">current</span>-&gt;<span class=\"i\">scopeDepth</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>parseVariable</em>()</div>\n\n<p>So this is <em>really</em> what &ldquo;declaring&rdquo; and &ldquo;defining&rdquo; a variable means in the\ncompiler. 
&ldquo;Declaring&rdquo; is when the variable is added to the scope, and &ldquo;defining&rdquo;\nis when it becomes available for use.</p>\n<p>When we resolve a reference to a local variable, we check the scope depth to see\nif it&rsquo;s fully defined.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    if (identifiersEqual(name, &amp;local-&gt;name)) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>resolveLocal</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">if</span> (<span class=\"i\">local</span>-&gt;<span class=\"i\">depth</span> == -<span class=\"n\">1</span>) {\n        <span class=\"i\">error</span>(<span class=\"s\">&quot;Can&#39;t read local variable in its own initializer.&quot;</span>);\n      }\n</pre><pre class=\"insert-after\">      return i;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>resolveLocal</em>()</div>\n\n<p>If the variable has the sentinel depth, it must be a reference to a variable in\nits own initializer, and we report that as an error.</p>\n<p>That&rsquo;s it for this chapter! We added blocks, local variables, and real,\nhonest-to-God lexical scoping. Given that we introduced an entirely different\nruntime representation for variables, we didn&rsquo;t have to write a lot of code. The\nimplementation ended up being pretty clean and efficient.</p>\n<p>You&rsquo;ll notice that almost all of the code we wrote is in the compiler. Over in\nthe runtime, it&rsquo;s just two little instructions. You&rsquo;ll see this as a continuing\n<span name=\"static\">trend</span> in clox compared to jlox. One of the biggest\nhammers in the optimizer&rsquo;s toolbox is pulling work forward into the compiler so\nthat you don&rsquo;t have to do it at runtime. In this chapter, that meant resolving\nexactly which stack slot every local variable occupies. 
That way, at runtime, no\nlookup or resolution needs to happen.</p>\n<aside name=\"static\">\n<p>You can look at static types as an extreme example of this trend. A statically\ntyped language takes all of the type analysis and type error handling and sorts\nit all out during compilation. Then the runtime doesn&rsquo;t have to waste any time\nchecking that values have the proper type for their operation. In fact, in some\nstatically typed languages like C, you don&rsquo;t even <em>know</em> the type at runtime.\nThe compiler completely erases any representation of a value&rsquo;s type leaving just\nthe bare bits.</p>\n</aside>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Our simple local array makes it easy to calculate the stack slot of each\nlocal variable. But it means that when the compiler resolves a reference to\na variable, we have to do a linear scan through the array.</p>\n<p>Come up with something more efficient. Do you think the additional\ncomplexity is worth it?</p>\n</li>\n<li>\n<p>How do other languages handle code like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"i\">a</span>;\n</pre></div>\n<p>What would you do if it was your language? Why?</p>\n</li>\n<li>\n<p>Many languages make a distinction between variables that can be reassigned\nand those that can&rsquo;t. In Java, the <code>final</code> modifier prevents you from\nassigning to a variable. In JavaScript, a variable declared with <code>let</code> can\nbe assigned, but one declared using <code>const</code> can&rsquo;t. Swift treats <code>let</code> as\nsingle-assignment and uses <code>var</code> for assignable variables. Scala and Kotlin\nuse <code>val</code> and <code>var</code>.</p>\n<p>Pick a keyword for a single-assignment variable form to add to Lox. Justify\nyour choice, then implement it. 
An attempt to assign to a variable declared\nusing your new keyword should cause a compile error.</p>\n</li>\n<li>\n<p>Extend clox to allow more than 256 local variables to be in scope at a time.</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"jumping-back-and-forth.html\" class=\"next\">\n  Next Chapter: &ldquo;Jumping Back and Forth&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/methods-and-initializers.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Methods and Initializers &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Methods and Initializers<small>28</small></a></h3>\n\n<ul>\n    <li><a href=\"#method-declarations\"><small>28.1</small> Method Declarations</a></li>\n    <li><a href=\"#method-references\"><small>28.2</small> Method References</a></li>\n    <li><a href=\"#this\"><small>28.3</small> This</a></li>\n    <li><a href=\"#instance-initializers\"><small>28.4</small> Instance Initializers</a></li>\n    <li><a href=\"#optimized-invocations\"><small>28.5</small> Optimized Invocations</a></li>\n    <li 
class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Novelty Budget</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"classes-and-instances.html\" title=\"Classes and Instances\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"superclasses.html\" title=\"Superclasses\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"classes-and-instances.html\" title=\"Classes and Instances\" class=\"prev\">←</a>\n<a href=\"superclasses.html\" title=\"Superclasses\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Methods and Initializers<small>28</small></a></h3>\n\n<ul>\n    <li><a href=\"#method-declarations\"><small>28.1</small> Method Declarations</a></li>\n    <li><a href=\"#method-references\"><small>28.2</small> Method References</a></li>\n    <li><a href=\"#this\"><small>28.3</small> This</a></li>\n    <li><a href=\"#instance-initializers\"><small>28.4</small> Instance Initializers</a></li>\n    <li><a href=\"#optimized-invocations\"><small>28.5</small> Optimized Invocations</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Novelty Budget</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"classes-and-instances.html\" title=\"Classes and Instances\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual 
Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"superclasses.html\" title=\"Superclasses\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">28</div>\n  <h1>Methods and Initializers</h1>\n\n<blockquote>\n<p>When you are on the dancefloor, there is nothing to do but dance.</p>\n<p><cite>Umberto Eco, <em>The Mysterious Flame of Queen Loana</em></cite></p>\n</blockquote>\n<p>It is time for our virtual machine to bring its nascent objects to life with\nbehavior. That means methods and method calls. And, since they are a special\nkind of method, initializers too.</p>\n<p>All of this is familiar territory from our previous jlox interpreter. What&rsquo;s new\nin this second trip is an important optimization we&rsquo;ll implement to make method\ncalls over seven times faster than our baseline performance. But before we get\nto that fun, we gotta get the basic stuff working.</p>\n<h2><a href=\"#method-declarations\" id=\"method-declarations\"><small>28&#8202;.&#8202;1</small>Method Declarations</a></h2>\n<p>We can&rsquo;t optimize method calls before we have method calls, and we can&rsquo;t call\nmethods without having methods to call, so we&rsquo;ll start with declarations.</p>\n<h3><a href=\"#representing-methods\" id=\"representing-methods\"><small>28&#8202;.&#8202;1&#8202;.&#8202;1</small>Representing methods</a></h3>\n<p>We usually start in the compiler, but let&rsquo;s knock the object model out first\nthis time. The runtime representation for methods in clox is similar to that of\njlox. Each class stores a hash table of methods. 
Keys are method names, and each\nvalue is an ObjClosure for the body of the method.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef struct {\n  Obj obj;\n  ObjString* name;\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin struct <em>ObjClass</em></div>\n<pre class=\"insert\">  <span class=\"t\">Table</span> <span class=\"i\">methods</span>;\n</pre><pre class=\"insert-after\">} ObjClass;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in struct <em>ObjClass</em></div>\n\n<p>A brand new class begins with an empty method table.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  klass-&gt;name = name;<span name=\"klass\"> </span>\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>newClass</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">initTable</span>(&amp;<span class=\"i\">klass</span>-&gt;<span class=\"i\">methods</span>);\n</pre><pre class=\"insert-after\">  return klass;\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>newClass</em>()</div>\n\n<p>The ObjClass struct owns the memory for this table, so when the memory manager\ndeallocates a class, the table should be freed too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case OBJ_CLASS: {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>freeObject</em>()</div>\n<pre class=\"insert\">      <span class=\"t\">ObjClass</span>* <span class=\"i\">klass</span> = (<span class=\"t\">ObjClass</span>*)<span class=\"i\">object</span>;\n      <span class=\"i\">freeTable</span>(&amp;<span class=\"i\">klass</span>-&gt;<span class=\"i\">methods</span>);\n</pre><pre class=\"insert-after\">      FREE(ObjClass, object);\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>freeObject</em>()</div>\n\n<p>Speaking of memory managers, the GC needs to trace through classes into the\nmethod table. 
If a class is still reachable (likely through some instance),\nthen all of its methods certainly need to stick around too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      markObject((Obj*)klass-&gt;name);\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>blackenObject</em>()</div>\n<pre class=\"insert\">      <span class=\"i\">markTable</span>(&amp;<span class=\"i\">klass</span>-&gt;<span class=\"i\">methods</span>);\n</pre><pre class=\"insert-after\">      break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>blackenObject</em>()</div>\n\n<p>We use the existing <code>markTable()</code> function, which traces through the key string\nand value in each table entry.</p>\n<p>Storing a class&rsquo;s methods is pretty familiar coming from jlox. The different\npart is how that table gets populated. Our previous interpreter had access to\nthe entire AST node for the class declaration and all of the methods it\ncontained. At runtime, the interpreter simply walked that list of declarations.</p>\n<p>Now every piece of information the compiler wants to shunt over to the runtime\nhas to squeeze through the interface of a flat series of bytecode instructions.\nHow do we take a class declaration, which can contain an arbitrarily large set\nof methods, and represent it as bytecode? Let&rsquo;s hop over to the compiler and\nfind out.</p>\n<h3><a href=\"#compiling-method-declarations\" id=\"compiling-method-declarations\"><small>28&#8202;.&#8202;1&#8202;.&#8202;2</small>Compiling method declarations</a></h3>\n<p>The last chapter left us with a compiler that parses classes but allows only an\nempty body. 
Now we insert a little code to compile a series of method\ndeclarations between the braces.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  consume(TOKEN_LEFT_BRACE, &quot;Expect '{' before class body.&quot;);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>classDeclaration</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">while</span> (!<span class=\"i\">check</span>(<span class=\"a\">TOKEN_RIGHT_BRACE</span>) &amp;&amp; !<span class=\"i\">check</span>(<span class=\"a\">TOKEN_EOF</span>)) {\n    <span class=\"i\">method</span>();\n  }\n</pre><pre class=\"insert-after\">  consume(TOKEN_RIGHT_BRACE, &quot;Expect '}' after class body.&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>classDeclaration</em>()</div>\n\n<p>Lox doesn&rsquo;t have field declarations, so anything before the closing brace at the\nend of the class body must be a method. We stop compiling methods when we hit\nthat final curly or if we reach the end of the file. The latter check ensures\nour compiler doesn&rsquo;t get stuck in an infinite loop if the user accidentally\nforgets the closing brace.</p>\n<p>The tricky part with compiling a class declaration is that a class may declare\nany number of methods. Somehow the runtime needs to look up and bind all of\nthem. That would be a lot to pack into a single <code>OP_CLASS</code> instruction. Instead,\nthe bytecode we generate for a class declaration will split the process into a\n<span name=\"series\"><em>series</em></span> of instructions. The compiler already emits\nan <code>OP_CLASS</code> instruction that creates a new empty ObjClass object. Then it\nemits instructions to store the class in a variable with its name.</p>\n<aside name=\"series\">\n<p>We did something similar for closures. The <code>OP_CLOSURE</code> instruction needs to\nknow the type and index for each captured upvalue. 
We encoded that using a\nseries of pseudo-instructions following the main <code>OP_CLOSURE</code> instruction<span class=\"em\">&mdash;</span>basically a variable number of operands. The VM processes all of those extra\nbytes immediately when interpreting the <code>OP_CLOSURE</code> instruction.</p>\n<p>Here our approach is a little different because from the VM&rsquo;s perspective, each\ninstruction to define a method is a separate stand-alone operation. Either\napproach would work. A variable-sized pseudo-instruction is possibly marginally\nfaster, but class declarations are rarely in hot loops, so it doesn&rsquo;t matter\nmuch.</p>\n</aside>\n<p>Now, for each method declaration, we emit a new <code>OP_METHOD</code> instruction that\nadds a single method to that class. When all of the <code>OP_METHOD</code> instructions\nhave executed, we&rsquo;re left with a fully formed class. While the user sees a class\ndeclaration as a single atomic operation, the VM implements it as a series of\nmutations.</p>\n<p>To define a new method, the VM needs three things:</p>\n<ol>\n<li>\n<p>The name of the method.</p>\n</li>\n<li>\n<p>The closure for the method body.</p>\n</li>\n<li>\n<p>The class to bind the method to.</p>\n</li>\n</ol>\n<p>We&rsquo;ll incrementally write the compiler code to see how those all get through to\nthe runtime, starting here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>function</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">method</span>() {\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_IDENTIFIER</span>, <span class=\"s\">&quot;Expect method name.&quot;</span>);\n  <span class=\"t\">uint8_t</span> <span class=\"i\">constant</span> = <span class=\"i\">identifierConstant</span>(&amp;<span class=\"i\">parser</span>.<span class=\"i\">previous</span>);\n  <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_METHOD</span>, <span 
class=\"i\">constant</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>function</em>()</div>\n\n<p>Like <code>OP_GET_PROPERTY</code> and other instructions that need names at runtime, the\ncompiler adds the method name token&rsquo;s lexeme to the constant table, getting back\na table index. Then we emit an <code>OP_METHOD</code> instruction with that index as the\noperand. That&rsquo;s the name. Next is the method body:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  uint8_t constant = identifierConstant(&amp;parser.previous);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>method</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"t\">FunctionType</span> <span class=\"i\">type</span> = <span class=\"a\">TYPE_FUNCTION</span>;\n  <span class=\"i\">function</span>(<span class=\"i\">type</span>);\n</pre><pre class=\"insert-after\">  emitBytes(OP_METHOD, constant);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>method</em>()</div>\n\n<p>We use the same <code>function()</code> helper that we wrote for compiling function\ndeclarations. That utility function compiles the subsequent parameter list and\nfunction body. Then it emits the code to create an ObjClosure and leave it on\ntop of the stack. At runtime, the VM will find the closure there.</p>\n<p>Last is the class to bind the method to. Where can the VM find that?\nUnfortunately, by the time we reach the <code>OP_METHOD</code> instruction, we don&rsquo;t know\nwhere it is. It <span name=\"global\">could</span> be on the stack, if the user\ndeclared the class in a local scope. But a top-level class declaration ends up\nwith the ObjClass in the global variable table.</p>\n<aside name=\"global\">\n<p>If Lox supported declaring classes only at the top level, the VM could assume\nthat any class could be found by looking it up directly from the global\nvariable table. 
Alas, because we support local classes, we need to handle that\ncase too.</p>\n</aside>\n<p>Fear not. The compiler does know the <em>name</em> of the class. We can capture it\nright after we consume its token.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  consume(TOKEN_IDENTIFIER, &quot;Expect class name.&quot;);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>classDeclaration</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">Token</span> <span class=\"i\">className</span> = <span class=\"i\">parser</span>.<span class=\"i\">previous</span>;\n</pre><pre class=\"insert-after\">  uint8_t nameConstant = identifierConstant(&amp;parser.previous);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>classDeclaration</em>()</div>\n\n<p>And we know that no other declaration with that name could possibly shadow the\nclass. So we do the easy fix. Before we start binding methods, we emit whatever\ncode is necessary to load the class back on top of the stack.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  defineVariable(nameConstant);\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>classDeclaration</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">namedVariable</span>(<span class=\"i\">className</span>, <span class=\"k\">false</span>);\n</pre><pre class=\"insert-after\">  consume(TOKEN_LEFT_BRACE, &quot;Expect '{' before class body.&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>classDeclaration</em>()</div>\n\n<p>Right before compiling the class body, we <span name=\"load\">call</span>\n<code>namedVariable()</code>. That helper function generates code to load a variable with\nthe given name onto the stack. Then we compile the methods.</p>\n<aside name=\"load\">\n<p>The preceding call to <code>defineVariable()</code> pops the class, so it seems silly to\ncall <code>namedVariable()</code> to load it right back onto the stack. 
Why not simply\nleave it on the stack in the first place? We could, but in the <a href=\"superclasses.html\">next\nchapter</a> we will insert code between these two calls to support\ninheritance. At that point, it will be simpler if the class isn&rsquo;t sitting around\non the stack.</p>\n</aside>\n<p>This means that when we execute each <code>OP_METHOD</code> instruction, the stack has the\nmethod&rsquo;s closure on top with the class right under it. Once we&rsquo;ve reached the\nend of the methods, we no longer need the class and tell the VM to pop it off\nthe stack.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  consume(TOKEN_RIGHT_BRACE, &quot;Expect '}' after class body.&quot;);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>classDeclaration</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">emitByte</span>(<span class=\"a\">OP_POP</span>);\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>classDeclaration</em>()</div>\n\n<p>Putting all of that together, here is an example class declaration to throw at\nthe compiler:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Brunch</span> {\n  <span class=\"i\">bacon</span>() {}\n  <span class=\"i\">eggs</span>() {}\n}\n</pre></div>\n<p>Given that, here is what the compiler generates and how those instructions\naffect the stack at runtime:</p><img src=\"image/methods-and-initializers/method-instructions.png\" alt=\"The series of bytecode instructions for a class declaration with two methods.\" />\n<p>All that remains for us is to implement the runtime for that new <code>OP_METHOD</code>\ninstruction.</p>\n<h3><a href=\"#executing-method-declarations\" id=\"executing-method-declarations\"><small>28&#8202;.&#8202;1&#8202;.&#8202;3</small>Executing method declarations</a></h3>\n<p>First we define the opcode.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  
OP_CLASS,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_METHOD</span>\n</pre><pre class=\"insert-after\">} OpCode;\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>We disassemble it like other instructions that have string constant operands.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case OP_CLASS:\n      return constantInstruction(&quot;OP_CLASS&quot;, chunk, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_METHOD</span>:\n      <span class=\"k\">return</span> <span class=\"i\">constantInstruction</span>(<span class=\"s\">&quot;OP_METHOD&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    default:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>And over in the interpreter, we add a new case too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        break;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_METHOD</span>:\n        <span class=\"i\">defineMethod</span>(<span class=\"a\">READ_STRING</span>());\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    }\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>There, we read the method name from the constant table and pass it here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>closeUpvalues</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">defineMethod</span>(<span class=\"t\">ObjString</span>* <span 
class=\"i\">name</span>) {\n  <span class=\"t\">Value</span> <span class=\"i\">method</span> = <span class=\"i\">peek</span>(<span class=\"n\">0</span>);\n  <span class=\"t\">ObjClass</span>* <span class=\"i\">klass</span> = <span class=\"a\">AS_CLASS</span>(<span class=\"i\">peek</span>(<span class=\"n\">1</span>));\n  <span class=\"i\">tableSet</span>(&amp;<span class=\"i\">klass</span>-&gt;<span class=\"i\">methods</span>, <span class=\"i\">name</span>, <span class=\"i\">method</span>);\n  <span class=\"i\">pop</span>();\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>closeUpvalues</em>()</div>\n\n<p>The method closure is on top of the stack, above the class it will be bound to.\nWe read those two stack slots and store the closure in the class&rsquo;s method table.\nThen we pop the closure since we&rsquo;re done with it.</p>\n<p>Note that we don&rsquo;t do any runtime type checking on the closure or class object.\nThat <code>AS_CLASS()</code> call is safe because the compiler itself generated the code\nthat causes the class to be in that stack slot. The VM <span\nname=\"verify\">trusts</span> its own compiler.</p>\n<aside name=\"verify\">\n<p>The VM trusts that the instructions it executes are valid because the <em>only</em> way\nto get code to the bytecode interpreter is by going through clox&rsquo;s own compiler.\nMany bytecode VMs, like the JVM and CPython, support executing bytecode that has\nbeen compiled separately. That leads to a different security story. Maliciously\ncrafted bytecode could crash the VM or worse.</p>\n<p>To prevent that, the JVM does a bytecode verification pass before it executes\nany loaded code. CPython says it&rsquo;s up to the user to ensure any bytecode they\nrun is safe.</p>\n</aside>\n<p>After the series of <code>OP_METHOD</code> instructions is done and the <code>OP_POP</code> has popped\nthe class, we will have a class with a nicely populated method table, ready to\nstart doing things. 
The next step is pulling those methods back out and using\nthem.</p>\n<h2><a href=\"#method-references\" id=\"method-references\"><small>28&#8202;.&#8202;2</small>Method References</a></h2>\n<p>Most of the time, methods are accessed and immediately called, leading to this\nfamiliar syntax:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">instance</span>.<span class=\"i\">method</span>(<span class=\"i\">argument</span>);\n</pre></div>\n<p>But remember, in Lox and some other languages, those two steps are distinct and\ncan be separated.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">closure</span> = <span class=\"i\">instance</span>.<span class=\"i\">method</span>;\n<span class=\"i\">closure</span>(<span class=\"i\">argument</span>);\n</pre></div>\n<p>Since users <em>can</em> separate the operations, we have to implement them separately.\nThe first step is using our existing dotted property syntax to access a method\ndefined on the instance&rsquo;s class. That should return some kind of object that the\nuser can then call like a function.</p>\n<p>The obvious approach is to look up the method in the class&rsquo;s method table and\nreturn the ObjClosure associated with that name. But we also need to remember\nthat when you access a method, <code>this</code> gets bound to the instance the method was\naccessed from. 
Here&rsquo;s the example from <a href=\"classes.html#methods-on-classes\">when we added methods to jlox</a>:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Person</span> {\n  <span class=\"i\">sayName</span>() {\n    <span class=\"k\">print</span> <span class=\"k\">this</span>.<span class=\"i\">name</span>;\n  }\n}\n\n<span class=\"k\">var</span> <span class=\"i\">jane</span> = <span class=\"t\">Person</span>();\n<span class=\"i\">jane</span>.<span class=\"i\">name</span> = <span class=\"s\">&quot;Jane&quot;</span>;\n\n<span class=\"k\">var</span> <span class=\"i\">method</span> = <span class=\"i\">jane</span>.<span class=\"i\">sayName</span>;\n<span class=\"i\">method</span>(); <span class=\"c\">// ?</span>\n</pre></div>\n<p>This should print &ldquo;Jane&rdquo;, so the object returned by <code>.sayName</code> somehow needs to\nremember the instance it was accessed from when it later gets called. In jlox,\nwe implemented that &ldquo;memory&rdquo; using the interpreter&rsquo;s existing heap-allocated\nEnvironment class, which handled all variable storage.</p>\n<p>Our bytecode VM has a more complex architecture for storing state. <a href=\"local-variables.html#representing-local-variables\">Local\nvariables and temporaries</a> are on the stack, <a href=\"global-variables.html#variable-declarations\">globals</a> are in a hash\ntable, and variables in closures use <a href=\"closures.html#upvalues\">upvalues</a>. That necessitates a somewhat\nmore complex solution for tracking a method&rsquo;s receiver in clox, and a new\nruntime type.</p>\n<h3><a href=\"#bound-methods\" id=\"bound-methods\"><small>28&#8202;.&#8202;2&#8202;.&#8202;1</small>Bound methods</a></h3>\n<p>When the user executes a method access, we&rsquo;ll find the closure for that method\nand wrap it in a new <span name=\"bound\">&ldquo;bound method&rdquo;</span> object that tracks\nthe instance that the method was accessed from. 
This bound object can be called\nlater like a function. When it is invoked, the VM will do some shenanigans to wire up\n<code>this</code> to point to the receiver inside the method&rsquo;s body.</p>\n<aside name=\"bound\">\n<p>I took the name &ldquo;bound method&rdquo; from CPython. Python behaves similarly to Lox here,\nand I used its implementation for inspiration.</p>\n</aside>\n<p>Here&rsquo;s the new object type:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} ObjInstance;\n\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>ObjInstance</em></div>\n<pre class=\"insert\"><span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">Obj</span> <span class=\"i\">obj</span>;\n  <span class=\"t\">Value</span> <span class=\"i\">receiver</span>;\n  <span class=\"t\">ObjClosure</span>* <span class=\"i\">method</span>;\n} <span class=\"t\">ObjBoundMethod</span>;\n\n</pre><pre class=\"insert-after\">ObjClass* newClass(ObjString* name);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>ObjInstance</em></div>\n\n<p>It wraps the receiver and the method closure together. The receiver&rsquo;s type is\nValue even though methods can be called only on ObjInstances. Since the VM\ndoesn&rsquo;t care what kind of receiver it has anyway, using Value means we don&rsquo;t\nhave to keep converting the pointer back to a Value when it gets passed to more\ngeneral functions.</p>\n<p>The new struct implies the usual boilerplate you&rsquo;re used to by now. 
A new case\nin the object type enum:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef enum {\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin enum <em>ObjType</em></div>\n<pre class=\"insert\">  <span class=\"a\">OBJ_BOUND_METHOD</span>,\n</pre><pre class=\"insert-after\">  OBJ_CLASS,\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in enum <em>ObjType</em></div>\n\n<p>A macro to check a value&rsquo;s type:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define OBJ_TYPE(value)        (AS_OBJ(value)-&gt;type)\n\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define IS_BOUND_METHOD(value) isObjType(value, OBJ_BOUND_METHOD)</span>\n</pre><pre class=\"insert-after\">#define IS_CLASS(value)        isObjType(value, OBJ_CLASS)\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>Another macro to cast the value to an ObjBoundMethod pointer:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define IS_STRING(value)       isObjType(value, OBJ_STRING)\n\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define AS_BOUND_METHOD(value) ((ObjBoundMethod*)AS_OBJ(value))</span>\n</pre><pre class=\"insert-after\">#define AS_CLASS(value)        ((ObjClass*)AS_OBJ(value))\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>A function to create a new ObjBoundMethod:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} ObjBoundMethod;\n\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>ObjBoundMethod</em></div>\n<pre class=\"insert\"><span class=\"t\">ObjBoundMethod</span>* <span class=\"i\">newBoundMethod</span>(<span class=\"t\">Value</span> <span class=\"i\">receiver</span>,\n                               <span class=\"t\">ObjClosure</span>* <span class=\"i\">method</span>);\n</pre><pre class=\"insert-after\">ObjClass* 
newClass(ObjString* name);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>ObjBoundMethod</em></div>\n\n<p>And an implementation of that function here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.c</em><br>\nadd after <em>allocateObject</em>()</div>\n<pre><span class=\"t\">ObjBoundMethod</span>* <span class=\"i\">newBoundMethod</span>(<span class=\"t\">Value</span> <span class=\"i\">receiver</span>,\n                               <span class=\"t\">ObjClosure</span>* <span class=\"i\">method</span>) {\n  <span class=\"t\">ObjBoundMethod</span>* <span class=\"i\">bound</span> = <span class=\"a\">ALLOCATE_OBJ</span>(<span class=\"t\">ObjBoundMethod</span>,\n                                       <span class=\"a\">OBJ_BOUND_METHOD</span>);\n  <span class=\"i\">bound</span>-&gt;<span class=\"i\">receiver</span> = <span class=\"i\">receiver</span>;\n  <span class=\"i\">bound</span>-&gt;<span class=\"i\">method</span> = <span class=\"i\">method</span>;\n  <span class=\"k\">return</span> <span class=\"i\">bound</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, add after <em>allocateObject</em>()</div>\n\n<p>The constructor-like function simply stores the given closure and receiver. 
When\nthe bound method is no longer needed, we free it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (object-&gt;type) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>freeObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_BOUND_METHOD</span>:\n      <span class=\"a\">FREE</span>(<span class=\"t\">ObjBoundMethod</span>, <span class=\"i\">object</span>);\n      <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    case OBJ_CLASS: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>freeObject</em>()</div>\n\n<p>The bound method has a couple of references, but it doesn&rsquo;t <em>own</em> them, so it\nfrees nothing but itself. However, those references do get traced by the garbage\ncollector.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (object-&gt;type) {\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>blackenObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_BOUND_METHOD</span>: {\n      <span class=\"t\">ObjBoundMethod</span>* <span class=\"i\">bound</span> = (<span class=\"t\">ObjBoundMethod</span>*)<span class=\"i\">object</span>;\n      <span class=\"i\">markValue</span>(<span class=\"i\">bound</span>-&gt;<span class=\"i\">receiver</span>);\n      <span class=\"i\">markObject</span>((<span class=\"t\">Obj</span>*)<span class=\"i\">bound</span>-&gt;<span class=\"i\">method</span>);\n      <span class=\"k\">break</span>;\n    }\n</pre><pre class=\"insert-after\">    case OBJ_CLASS: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>blackenObject</em>()</div>\n\n<p>This <span name=\"trace\">ensures</span> that a handle to a method keeps the\nreceiver around in memory so that <code>this</code> can still find the object when you\ninvoke the handle later. 
We also trace the method closure.</p>\n<aside name=\"trace\">\n<p>Tracing the method closure isn&rsquo;t really necessary. The receiver is an\nObjInstance, which has a pointer to its ObjClass, which has a table for all of\nthe methods. But it feels dubious to me in some vague way to have ObjBoundMethod\nrely on that.</p>\n</aside>\n<p>The last operation all objects support is printing.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (OBJ_TYPE(value)) {\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>printObject</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OBJ_BOUND_METHOD</span>:\n      <span class=\"i\">printFunction</span>(<span class=\"a\">AS_BOUND_METHOD</span>(<span class=\"i\">value</span>)-&gt;<span class=\"i\">method</span>-&gt;<span class=\"i\">function</span>);\n      <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    case OBJ_CLASS:\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, in <em>printObject</em>()</div>\n\n<p>A bound method prints exactly the same way as a function. From the user&rsquo;s\nperspective, a bound method <em>is</em> a function. It&rsquo;s an object they can call. We\ndon&rsquo;t expose that the VM implements bound methods using a different object type.</p>\n<aside name=\"party\"><img src=\"image/methods-and-initializers/party-hat.png\" alt=\"A party hat.\" />\n</aside>\n<p>Put on your <span name=\"party\">party</span> hat because we just reached a little\nmilestone. ObjBoundMethod is the very last runtime type to add to clox. You&rsquo;ve\nwritten your last <code>IS_</code> and <code>AS_</code> macros. We&rsquo;re only a few chapters from the end\nof the book, and we&rsquo;re getting close to a complete VM.</p>\n<h3><a href=\"#accessing-methods\" id=\"accessing-methods\"><small>28&#8202;.&#8202;2&#8202;.&#8202;2</small>Accessing methods</a></h3>\n<p>Let&rsquo;s get our new object type doing something. 
Methods are accessed using the\nsame &ldquo;dot&rdquo; property syntax we implemented in the last chapter. The compiler\nalready parses the right expressions and emits <code>OP_GET_PROPERTY</code> instructions\nfor them. The only changes we need to make are in the runtime.</p>\n<p>When a property access instruction executes, the instance is on top of the\nstack. The instruction&rsquo;s job is to find a field or method with the given name\nand replace the top of the stack with the accessed property.</p>\n<p>The interpreter already handles fields, so we simply extend the\n<code>OP_GET_PROPERTY</code> case with another section.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">          pop(); // Instance.\n          push(value);\n          break;\n        }\n\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">        <span class=\"k\">if</span> (!<span class=\"i\">bindMethod</span>(<span class=\"i\">instance</span>-&gt;<span class=\"i\">klass</span>, <span class=\"i\">name</span>)) {\n          <span class=\"k\">return</span> <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>;\n        }\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      }\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 2 lines</div>\n\n<p>We insert this after the code to look up a field on the receiver instance.\nFields take priority over and shadow methods, so we look for a field first. If\nthe instance does not have a field with the given property name, then the name\nmay refer to a method.</p>\n<p>We take the instance&rsquo;s class and pass it to a new <code>bindMethod()</code> helper. If that\nfunction finds a method, it places the method on the stack and returns <code>true</code>.\nOtherwise it returns <code>false</code> to indicate a method with that name couldn&rsquo;t be\nfound. 
Since the name also wasn&rsquo;t a field, that means we have a runtime error,\nwhich aborts the interpreter.</p>\n<p>Here is the good stuff:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>callValue</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">bindMethod</span>(<span class=\"t\">ObjClass</span>* <span class=\"i\">klass</span>, <span class=\"t\">ObjString</span>* <span class=\"i\">name</span>) {\n  <span class=\"t\">Value</span> <span class=\"i\">method</span>;\n  <span class=\"k\">if</span> (!<span class=\"i\">tableGet</span>(&amp;<span class=\"i\">klass</span>-&gt;<span class=\"i\">methods</span>, <span class=\"i\">name</span>, &amp;<span class=\"i\">method</span>)) {\n    <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Undefined property &#39;%s&#39;.&quot;</span>, <span class=\"i\">name</span>-&gt;<span class=\"i\">chars</span>);\n    <span class=\"k\">return</span> <span class=\"k\">false</span>;\n  }\n\n  <span class=\"t\">ObjBoundMethod</span>* <span class=\"i\">bound</span> = <span class=\"i\">newBoundMethod</span>(<span class=\"i\">peek</span>(<span class=\"n\">0</span>),\n                                         <span class=\"a\">AS_CLOSURE</span>(<span class=\"i\">method</span>));\n  <span class=\"i\">pop</span>();\n  <span class=\"i\">push</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">bound</span>));\n  <span class=\"k\">return</span> <span class=\"k\">true</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>callValue</em>()</div>\n\n<p>First we look for a method with the given name in the class&rsquo;s method table. If\nwe don&rsquo;t find one, we report a runtime error and bail out. Otherwise, we take\nthe method and wrap it in a new ObjBoundMethod. We grab the receiver from its\nhome on top of the stack. 
Finally, we pop the instance and replace the top of\nthe stack with the bound method.</p>\n<p>For example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Brunch</span> {\n  <span class=\"i\">eggs</span>() {}\n}\n\n<span class=\"k\">var</span> <span class=\"i\">brunch</span> = <span class=\"t\">Brunch</span>();\n<span class=\"k\">var</span> <span class=\"i\">eggs</span> = <span class=\"i\">brunch</span>.<span class=\"i\">eggs</span>;\n</pre></div>\n<p>Here is what happens when the VM executes the <code>bindMethod()</code> call for the\n<code>brunch.eggs</code> expression:</p><img src=\"image/methods-and-initializers/bind-method.png\" alt=\"The stack changes caused by bindMethod().\" />\n<p>That&rsquo;s a lot of machinery under the hood, but from the user&rsquo;s perspective, they\nsimply get a function that they can call.</p>\n<h3><a href=\"#calling-methods\" id=\"calling-methods\"><small>28&#8202;.&#8202;2&#8202;.&#8202;3</small>Calling methods</a></h3>\n<p>Users can declare methods on classes, access them on instances, and get bound\nmethods onto the stack. They just can&rsquo;t <span name=\"do\"><em>do</em></span> anything\nuseful with those bound method objects. The operation we&rsquo;re missing is calling\nthem. 
Calls are implemented in <code>callValue()</code>, so we add a case there for the new\nobject type.</p>\n<aside name=\"do\">\n<p>A bound method <em>is</em> a first-class value, so they can store it in variables, pass\nit to functions, and otherwise do &ldquo;value&rdquo;-y stuff with it.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">    switch (OBJ_TYPE(callee)) {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>callValue</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OBJ_BOUND_METHOD</span>: {\n        <span class=\"t\">ObjBoundMethod</span>* <span class=\"i\">bound</span> = <span class=\"a\">AS_BOUND_METHOD</span>(<span class=\"i\">callee</span>);\n        <span class=\"k\">return</span> <span class=\"i\">call</span>(<span class=\"i\">bound</span>-&gt;<span class=\"i\">method</span>, <span class=\"i\">argCount</span>);\n      }\n</pre><pre class=\"insert-after\">      case OBJ_CLASS: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>callValue</em>()</div>\n\n<p>We pull the raw closure back out of the ObjBoundMethod and use the existing\n<code>call()</code> helper to begin an invocation of that closure by pushing a CallFrame\nfor it onto the call stack. 
That&rsquo;s all it takes to be able to run this Lox\nprogram:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Scone</span> {\n  <span class=\"i\">topping</span>(<span class=\"i\">first</span>, <span class=\"i\">second</span>) {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;scone with &quot;</span> + <span class=\"i\">first</span> + <span class=\"s\">&quot; and &quot;</span> + <span class=\"i\">second</span>;\n  }\n}\n\n<span class=\"k\">var</span> <span class=\"i\">scone</span> = <span class=\"t\">Scone</span>();\n<span class=\"i\">scone</span>.<span class=\"i\">topping</span>(<span class=\"s\">&quot;berries&quot;</span>, <span class=\"s\">&quot;cream&quot;</span>);\n</pre></div>\n<p>That&rsquo;s three big steps. We can declare, access, and invoke methods. But\nsomething is missing. We went to all that trouble to wrap the method closure in\nan object that binds the receiver, but when we invoke the method, we don&rsquo;t use\nthat receiver at all.</p>\n<h2><a href=\"#this\" id=\"this\"><small>28&#8202;.&#8202;3</small>This</a></h2>\n<p>The reason bound methods need to keep hold of the receiver is so that it can be\naccessed inside the body of the method. Lox exposes a method&rsquo;s receiver through\n<code>this</code> expressions. It&rsquo;s time for some new syntax. 
The lexer already treats\n<code>this</code> as a special token type, so the first step is wiring that token up in the\nparse table.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  [TOKEN_SUPER]         = {NULL,     NULL,   PREC_NONE},\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_THIS</span>]          = {<span class=\"i\">this_</span>,    <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_TRUE]          = {literal,  NULL,   PREC_NONE},\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, replace 1 line</div>\n\n<aside name=\"this\">\n<p>The underscore at the end of the name of the parser function is because <code>this</code>\nis a reserved word in C++ and we support compiling clox as C++.</p>\n</aside>\n<p>When the parser encounters a <code>this</code> in prefix position, it dispatches to a new\nparser function.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>variable</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">this_</span>(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n  <span class=\"i\">variable</span>(<span class=\"k\">false</span>);\n}<span name=\"this\"> </span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>variable</em>()</div>\n\n<p>We&rsquo;ll apply the same implementation technique for <code>this</code> in clox that we used in\njlox. We treat <code>this</code> as a lexically scoped local variable whose value gets\nmagically initialized. Compiling it like a local variable means we get a lot of\nbehavior for free. 
In particular, closures inside a method that reference <code>this</code>\nwill do the right thing and capture the receiver in an upvalue.</p>\n<p>When the parser function is called, the <code>this</code> token has just been consumed and\nis stored as the previous token. We call our existing <code>variable()</code> function\nwhich compiles identifier expressions as variable accesses. It takes a single\nBoolean parameter for whether the compiler should look for a following <code>=</code>\noperator and parse a setter. You can&rsquo;t assign to <code>this</code>, so we pass <code>false</code> to\ndisallow that.</p>\n<p>The <code>variable()</code> function doesn&rsquo;t care that <code>this</code> has its own token type and\nisn&rsquo;t an identifier. It is happy to treat the lexeme &ldquo;this&rdquo; as if it were a\nvariable name and then look it up using the existing scope resolution machinery.\nRight now, that lookup will fail because we never declared a variable whose name\nis &ldquo;this&rdquo;. It&rsquo;s time to think about where the receiver should live in memory.</p>\n<p>At least until they get captured by closures, clox stores every local variable\non the VM&rsquo;s stack. The compiler keeps track of which slots in the function&rsquo;s\nstack window are owned by which local variables. If you recall, the compiler\nsets aside stack slot zero by declaring a local variable whose name is an empty\nstring.</p>\n<p>For function calls, that slot ends up holding the function being called. Since\nthe slot has no name, the function body never accesses it. You can guess where\nthis is going. For <em>method</em> calls, we can repurpose that slot to store the\nreceiver. Slot zero will store the instance that <code>this</code> is bound to. 
In order to\ncompile <code>this</code> expressions, the compiler simply needs to give the correct name\nto that local variable.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  local-&gt;isCaptured = false;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>initCompiler</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">type</span> != <span class=\"a\">TYPE_FUNCTION</span>) {\n    <span class=\"i\">local</span>-&gt;<span class=\"i\">name</span>.<span class=\"i\">start</span> = <span class=\"s\">&quot;this&quot;</span>;\n    <span class=\"i\">local</span>-&gt;<span class=\"i\">name</span>.<span class=\"i\">length</span> = <span class=\"n\">4</span>;\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">local</span>-&gt;<span class=\"i\">name</span>.<span class=\"i\">start</span> = <span class=\"s\">&quot;&quot;</span>;\n    <span class=\"i\">local</span>-&gt;<span class=\"i\">name</span>.<span class=\"i\">length</span> = <span class=\"n\">0</span>;\n  }\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>initCompiler</em>(), replace 2 lines</div>\n\n<p>We want to do this only for methods. 
Function declarations don&rsquo;t have a <code>this</code>.\nAnd, in fact, they <em>must not</em> declare a variable named &ldquo;this&rdquo;, so that if you\nwrite a <code>this</code> expression inside a function declaration which is itself inside a\nmethod, the <code>this</code> correctly resolves to the outer method&rsquo;s receiver.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Nested</span> {\n  <span class=\"i\">method</span>() {\n    <span class=\"k\">fun</span> <span class=\"i\">function</span>() {\n      <span class=\"k\">print</span> <span class=\"k\">this</span>;\n    }\n\n    <span class=\"i\">function</span>();\n  }\n}\n\n<span class=\"t\">Nested</span>().<span class=\"i\">method</span>();\n</pre></div>\n<p>This program should print &ldquo;Nested instance&rdquo;. To decide what name to give to\nlocal slot zero, the compiler needs to know whether it&rsquo;s compiling a function or\nmethod declaration, so we add a new case to our FunctionType enum to distinguish\nmethods.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  TYPE_FUNCTION,\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin enum <em>FunctionType</em></div>\n<pre class=\"insert\">  <span class=\"a\">TYPE_METHOD</span>,\n</pre><pre class=\"insert-after\">  TYPE_SCRIPT\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in enum <em>FunctionType</em></div>\n\n<p>When we compile a method, we use that type.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  uint8_t constant = identifierConstant(&amp;parser.previous);\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>method</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"t\">FunctionType</span> <span class=\"i\">type</span> = <span class=\"a\">TYPE_METHOD</span>;\n</pre><pre class=\"insert-after\">  function(type);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>method</em>(), replace 1 
line</div>\n\n<p>Now we can correctly compile references to the special &ldquo;this&rdquo; variable, and the\ncompiler will emit the right <code>OP_GET_LOCAL</code> instructions to access it. Closures\ncan even capture <code>this</code> and store the receiver in upvalues. Pretty cool.</p>\n<p>Except that at runtime, the receiver isn&rsquo;t actually <em>in</em> slot zero. The\ninterpreter isn&rsquo;t holding up its end of the bargain yet. Here is the fix:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OBJ_BOUND_METHOD: {\n        ObjBoundMethod* bound = AS_BOUND_METHOD(callee);\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>callValue</em>()</div>\n<pre class=\"insert\">        <span class=\"i\">vm</span>.<span class=\"i\">stackTop</span>[-<span class=\"i\">argCount</span> - <span class=\"n\">1</span>] = <span class=\"i\">bound</span>-&gt;<span class=\"i\">receiver</span>;\n</pre><pre class=\"insert-after\">        return call(bound-&gt;method, argCount);\n      }\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>callValue</em>()</div>\n\n<p>When a method is called, the top of the stack contains all of the arguments, and\nthen just under those is the closure of the called method. That&rsquo;s where slot\nzero in the new CallFrame will be. This line of code inserts the receiver into\nthat slot. 
For example, given a method call like this:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">scone</span>.<span class=\"i\">topping</span>(<span class=\"s\">&quot;berries&quot;</span>, <span class=\"s\">&quot;cream&quot;</span>);\n</pre></div>\n<p>We calculate the slot to store the receiver like so:</p><img src=\"image/methods-and-initializers/closure-slot.png\" alt=\"Skipping over the argument stack slots to find the slot containing the closure.\" />\n<p>The <code>-argCount</code> skips past the arguments and the <code>- 1</code> adjusts for the fact that\n<code>stackTop</code> points just <em>past</em> the last used stack slot.</p>\n<h3><a href=\"#misusing-this\" id=\"misusing-this\"><small>28&#8202;.&#8202;3&#8202;.&#8202;1</small>Misusing this</a></h3>\n<p>Our VM now supports users <em>correctly</em> using <code>this</code>, but we also need to make\nsure it properly handles users <em>mis</em>using <code>this</code>. Lox says it is a compile\nerror for a <code>this</code> expression to appear outside of the body of a method. These\ntwo wrong uses should be caught by the compiler:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> <span class=\"k\">this</span>; <span class=\"c\">// At top level.</span>\n\n<span class=\"k\">fun</span> <span class=\"i\">notMethod</span>() {\n  <span class=\"k\">print</span> <span class=\"k\">this</span>; <span class=\"c\">// In a function.</span>\n}\n</pre></div>\n<p>So how does the compiler know if it&rsquo;s inside a method? The obvious answer is to\nlook at the FunctionType of the current Compiler. We did just add an enum case\nthere to treat methods specially. However, that wouldn&rsquo;t correctly handle code\nlike the earlier example where you are inside a function which is, itself,\nnested inside a method.</p>\n<p>We could try to resolve &ldquo;this&rdquo; and then report an error if it wasn&rsquo;t found in\nany of the surrounding lexical scopes. 
That would work, but would require us to\nshuffle around a bunch of code, since right now the code for resolving a\nvariable implicitly considers it a global access if no declaration is found.</p>\n<p>In the next chapter, we will need information about the nearest enclosing class.\nIf we had that, we could use it here to determine if we are inside a method. So\nwe may as well make our future selves&rsquo; lives a little easier and put that\nmachinery in place now.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">Compiler* current = NULL;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after variable <em>current</em></div>\n<pre class=\"insert\"><span class=\"t\">ClassCompiler</span>* <span class=\"i\">currentClass</span> = <span class=\"a\">NULL</span>;\n</pre><pre class=\"insert-after\">\n\nstatic Chunk* currentChunk() {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after variable <em>current</em></div>\n\n<p>This module variable points to a struct representing the current, innermost\nclass being compiled. The new type looks like this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} Compiler;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nadd after struct <em>Compiler</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> <span class=\"t\">ClassCompiler</span> {\n  <span class=\"k\">struct</span> <span class=\"t\">ClassCompiler</span>* <span class=\"i\">enclosing</span>;\n} <span class=\"t\">ClassCompiler</span>;\n</pre><pre class=\"insert-after\">\n\nParser parser;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after struct <em>Compiler</em></div>\n\n<p>Right now we store only a pointer to the ClassCompiler for the enclosing class,\nif any. Nesting a class declaration inside a method in some other class is an\nuncommon thing to do, but Lox supports it. 
Just like the Compiler struct, this\nmeans ClassCompiler forms a linked list from the current innermost class being\ncompiled out through all of the enclosing classes.</p>\n<p>If we aren&rsquo;t inside any class declaration at all, the module variable\n<code>currentClass</code> is <code>NULL</code>. When the compiler begins compiling a class, it pushes\na new ClassCompiler onto that implicit linked stack.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  defineVariable(nameConstant);\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>classDeclaration</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">ClassCompiler</span> <span class=\"i\">classCompiler</span>;\n  <span class=\"i\">classCompiler</span>.<span class=\"i\">enclosing</span> = <span class=\"i\">currentClass</span>;\n  <span class=\"i\">currentClass</span> = &amp;<span class=\"i\">classCompiler</span>;\n\n</pre><pre class=\"insert-after\">  namedVariable(className, false);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>classDeclaration</em>()</div>\n\n<p>The memory for the ClassCompiler struct lives right on the C stack, a handy\ncapability we get by writing our compiler using recursive descent. At the end of\nthe class body, we pop that compiler off the stack and restore the enclosing\none.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  emitByte(OP_POP);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>classDeclaration</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">currentClass</span> = <span class=\"i\">currentClass</span>-&gt;<span class=\"i\">enclosing</span>;\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>classDeclaration</em>()</div>\n\n<p>When an outermost class body ends, <code>enclosing</code> will be <code>NULL</code>, so this resets\n<code>currentClass</code> to <code>NULL</code>. 
Thus, to see if we are inside a class<span class=\"em\">&mdash;</span>and therefore\ninside a method<span class=\"em\">&mdash;</span>we simply check that module variable.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void this_(bool canAssign) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>this_</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">currentClass</span> == <span class=\"a\">NULL</span>) {\n    <span class=\"i\">error</span>(<span class=\"s\">&quot;Can&#39;t use &#39;this&#39; outside of a class.&quot;</span>);\n    <span class=\"k\">return</span>;\n  }\n\n</pre><pre class=\"insert-after\">  variable(false);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>this_</em>()</div>\n\n<p>With that, <code>this</code> outside of a class is correctly forbidden. Now our methods\nreally feel like <em>methods</em> in the object-oriented sense. Accessing the receiver\nlets them affect the instance you called the method on. We&rsquo;re getting there!</p>\n<h2><a href=\"#instance-initializers\" id=\"instance-initializers\"><small>28&#8202;.&#8202;4</small>Instance Initializers</a></h2>\n<p>The reason object-oriented languages tie state and behavior together<span class=\"em\">&mdash;</span>one of\nthe core tenets of the paradigm<span class=\"em\">&mdash;</span>is to ensure that objects are always in a\nvalid, meaningful state. When the only way to touch an object&rsquo;s state is <span\nname=\"through\">through</span> its methods, the methods can make sure nothing\ngoes awry. But that presumes the object is <em>already</em> in a proper state. What\nabout when it&rsquo;s first created?</p>\n<aside name=\"through\">\n<p>Of course, Lox does let outside code directly access and modify an instance&rsquo;s\nfields without going through its methods. This is unlike Ruby and Smalltalk,\nwhich completely encapsulate state inside objects. 
Our toy scripting language,\nalas, isn&rsquo;t so principled.</p>\n</aside>\n<p>Object-oriented languages ensure that brand new objects are properly set up\nthrough constructors, which both produce a new instance and initialize its\nstate. In Lox, the runtime allocates new raw instances, and a class may declare\nan initializer to set up any fields. Initializers work mostly like normal\nmethods, with a few tweaks:</p>\n<ol>\n<li>\n<p>The runtime automatically invokes the initializer method whenever an\ninstance of a class is created.</p>\n</li>\n<li>\n<p>The caller that constructs an instance always gets the instance <span\nname=\"return\">back</span> after the initializer finishes, regardless of what\nthe initializer function itself returns. The initializer method doesn&rsquo;t need\nto explicitly return <code>this</code>.</p>\n</li>\n<li>\n<p>In fact, an initializer is <em>prohibited</em> from returning any value at all\nsince the value would never be seen anyway.</p>\n</li>\n</ol>\n<aside name=\"return\">\n<p>It&rsquo;s as if the initializer is implicitly wrapped in a bundle of code like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">create</span>(<span class=\"i\">klass</span>) {\n  <span class=\"k\">var</span> <span class=\"i\">obj</span> = <span class=\"i\">newInstance</span>(<span class=\"i\">klass</span>);\n  <span class=\"i\">obj</span>.<span class=\"i\">init</span>();\n  <span class=\"k\">return</span> <span class=\"i\">obj</span>;\n}\n</pre></div>\n<p>Note how the value returned by <code>init()</code> is discarded.</p>\n</aside>\n<p>Now that we support methods, to add initializers, we merely need to implement\nthose three special rules. 
We&rsquo;ll go in order.</p>\n<h3><a href=\"#invoking-initializers\" id=\"invoking-initializers\"><small>28&#8202;.&#8202;4&#8202;.&#8202;1</small>Invoking initializers</a></h3>\n<p>First, automatically calling <code>init()</code> on new instances:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        vm.stackTop[-argCount - 1] = OBJ_VAL(newInstance(klass));\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>callValue</em>()</div>\n<pre class=\"insert\">        <span class=\"t\">Value</span> <span class=\"i\">initializer</span>;\n        <span class=\"k\">if</span> (<span class=\"i\">tableGet</span>(&amp;<span class=\"i\">klass</span>-&gt;<span class=\"i\">methods</span>, <span class=\"i\">vm</span>.<span class=\"i\">initString</span>,\n                     &amp;<span class=\"i\">initializer</span>)) {\n          <span class=\"k\">return</span> <span class=\"i\">call</span>(<span class=\"a\">AS_CLOSURE</span>(<span class=\"i\">initializer</span>), <span class=\"i\">argCount</span>);\n        }\n</pre><pre class=\"insert-after\">        return true;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>callValue</em>()</div>\n\n<p>After the runtime allocates the new instance, we look for an <code>init()</code> method on\nthe class. If we find one, we initiate a call to it. This pushes a new CallFrame\nfor the initializer&rsquo;s closure. 
Say we run this program:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Brunch</span> {\n  <span class=\"i\">init</span>(<span class=\"i\">food</span>, <span class=\"i\">drink</span>) {}\n}\n\n<span class=\"t\">Brunch</span>(<span class=\"s\">&quot;eggs&quot;</span>, <span class=\"s\">&quot;coffee&quot;</span>);\n</pre></div>\n<p>When the VM executes the call to <code>Brunch()</code>, it goes like this:</p><img src=\"image/methods-and-initializers/init-call-frame.png\" alt=\"The aligned stack windows for the Brunch() call and the corresponding init() method it forwards to.\" />\n<p>Any arguments passed to the class when we called it are still sitting on the\nstack above the instance. The new CallFrame for the <code>init()</code> method shares that\nstack window, so those arguments implicitly get forwarded to the initializer.</p>\n<p>Lox doesn&rsquo;t require a class to define an initializer. If omitted, the runtime\nsimply returns the new uninitialized instance. However, if there is no <code>init()</code>\nmethod, then it doesn&rsquo;t make any sense to pass arguments to the class when\ncreating the instance. 
We make that an error.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">          return call(AS_CLOSURE(initializer), argCount);\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>callValue</em>()</div>\n<pre class=\"insert\">        } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">argCount</span> != <span class=\"n\">0</span>) {\n          <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Expected 0 arguments but got %d.&quot;</span>,\n                       <span class=\"i\">argCount</span>);\n          <span class=\"k\">return</span> <span class=\"k\">false</span>;\n</pre><pre class=\"insert-after\">        }\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>callValue</em>()</div>\n\n<p>When the class <em>does</em> provide an initializer, we also need to ensure that the\nnumber of arguments passed matches the initializer&rsquo;s arity. Fortunately, the\n<code>call()</code> helper does that for us already.</p>\n<p>To call the initializer, the runtime looks up the <code>init()</code> method by name. We\nwant that to be fast since it happens every time an instance is constructed.\nThat means it would be good to take advantage of the string interning we&rsquo;ve\nalready implemented. To do that, the VM creates an ObjString for &ldquo;init&rdquo; and\nreuses it. 
The string lives right in the VM struct.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  Table strings;\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nin struct <em>VM</em></div>\n<pre class=\"insert\">  <span class=\"t\">ObjString</span>* <span class=\"i\">initString</span>;\n</pre><pre class=\"insert-after\">  ObjUpvalue* openUpvalues;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, in struct <em>VM</em></div>\n\n<p>We create and intern the string when the VM boots up.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  initTable(&amp;vm.strings);\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>initVM</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">vm</span>.<span class=\"i\">initString</span> = <span class=\"i\">copyString</span>(<span class=\"s\">&quot;init&quot;</span>, <span class=\"n\">4</span>);\n</pre><pre class=\"insert-after\">\n\n  defineNative(&quot;clock&quot;, clockNative);\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>initVM</em>()</div>\n\n<p>We want it to stick around, so the GC considers it a root.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  markCompilerRoots();\n</pre><div class=\"source-file\"><em>memory.c</em><br>\nin <em>markRoots</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">markObject</span>((<span class=\"t\">Obj</span>*)<span class=\"i\">vm</span>.<span class=\"i\">initString</span>);\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, in <em>markRoots</em>()</div>\n\n<p>Look carefully. See any bug waiting to happen? No? It&rsquo;s a subtle one. The\ngarbage collector now reads <code>vm.initString</code>. That field is initialized from the\nresult of calling <code>copyString()</code>. But copying a string allocates memory, which\ncan trigger a GC. If the collector ran at just the wrong time, it would read\n<code>vm.initString</code> before it had been initialized. 
So, first we zero the field out.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  initTable(&amp;vm.strings);\n\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>initVM</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">vm</span>.<span class=\"i\">initString</span> = <span class=\"a\">NULL</span>;\n</pre><pre class=\"insert-after\">  vm.initString = copyString(&quot;init&quot;, 4);\n\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>initVM</em>()</div>\n\n<p>We clear the pointer when the VM shuts down since the next line will free it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  freeTable(&amp;vm.strings);\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>freeVM</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">vm</span>.<span class=\"i\">initString</span> = <span class=\"a\">NULL</span>;\n</pre><pre class=\"insert-after\">  freeObjects();\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>freeVM</em>()</div>\n\n<p>OK, that lets us call initializers.</p>\n<h3><a href=\"#initializer-return-values\" id=\"initializer-return-values\"><small>28&#8202;.&#8202;4&#8202;.&#8202;2</small>Initializer return values</a></h3>\n<p>The next step is ensuring that constructing an instance of a class with an\ninitializer always returns the new instance, and not <code>nil</code> or whatever the body\nof the initializer returns. Right now, if a class defines an initializer, then\nwhen an instance is constructed, the VM pushes a call to that initializer onto\nthe CallFrame stack. Then it just keeps on trucking.</p>\n<p>The user&rsquo;s invocation on the class to create the instance will complete whenever\nthat initializer method returns, and will leave on the stack whatever value the\ninitializer puts there. That means that unless the user takes care to put\n<code>return this;</code> at the end of the initializer, no instance will come out. 
Not\nvery helpful.</p>\n<p>To fix this, whenever the front end compiles an initializer method, it will emit\ndifferent bytecode at the end of the body to return <code>this</code> from the method\ninstead of the usual implicit <code>nil</code> most functions return. In order to do\n<em>that</em>, the compiler needs to actually know when it is compiling an initializer.\nWe detect that by checking to see if the name of the method we&rsquo;re compiling is\n&ldquo;init&rdquo;.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  FunctionType type = TYPE_METHOD;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>method</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">length</span> == <span class=\"n\">4</span> &amp;&amp;\n      <span class=\"i\">memcmp</span>(<span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">start</span>, <span class=\"s\">&quot;init&quot;</span>, <span class=\"n\">4</span>) == <span class=\"n\">0</span>) {\n    <span class=\"i\">type</span> = <span class=\"a\">TYPE_INITIALIZER</span>;\n  }\n\n</pre><pre class=\"insert-after\">  function(type);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>method</em>()</div>\n\n<p>We define a new function type to distinguish initializers from other methods.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  TYPE_FUNCTION,\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin enum <em>FunctionType</em></div>\n<pre class=\"insert\">  <span class=\"a\">TYPE_INITIALIZER</span>,\n</pre><pre class=\"insert-after\">  TYPE_METHOD,\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in enum <em>FunctionType</em></div>\n\n<p>Whenever the compiler emits the implicit return at the end of a body, we check\nthe type to decide whether to insert the initializer-specific behavior.</p>\n<div 
class=\"codehilite\"><pre class=\"insert-before\">static void emitReturn() {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>emitReturn</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">current</span>-&gt;<span class=\"i\">type</span> == <span class=\"a\">TYPE_INITIALIZER</span>) {\n    <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_GET_LOCAL</span>, <span class=\"n\">0</span>);\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">emitByte</span>(<span class=\"a\">OP_NIL</span>);\n  }\n\n</pre><pre class=\"insert-after\">  emitByte(OP_RETURN);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>emitReturn</em>(), replace 1 line</div>\n\n<p>In an initializer, instead of pushing <code>nil</code> onto the stack before returning,\nwe load slot zero, which contains the instance. This <code>emitReturn()</code> function is\nalso called when compiling a <code>return</code> statement without a value, so this also\ncorrectly handles cases where the user does an early return inside the\ninitializer.</p>\n<h3><a href=\"#incorrect-returns-in-initializers\" id=\"incorrect-returns-in-initializers\"><small>28&#8202;.&#8202;4&#8202;.&#8202;3</small>Incorrect returns in initializers</a></h3>\n<p>The last step, the last item in our list of special features of initializers, is\nmaking it an error to try to return anything <em>else</em> from an initializer. 
Now\nthat the compiler tracks the method type, this is straightforward.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  if (match(TOKEN_SEMICOLON)) {\n    emitReturn();\n  } else {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>returnStatement</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">current</span>-&gt;<span class=\"i\">type</span> == <span class=\"a\">TYPE_INITIALIZER</span>) {\n      <span class=\"i\">error</span>(<span class=\"s\">&quot;Can&#39;t return a value from an initializer.&quot;</span>);\n    }\n\n</pre><pre class=\"insert-after\">    expression();\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>returnStatement</em>()</div>\n\n<p>We report an error if a <code>return</code> statement in an initializer has a value. We\nstill go ahead and compile the value afterwards so that the compiler doesn&rsquo;t get\nconfused by the trailing expression and report a bunch of cascaded errors.</p>\n<p>Aside from inheritance, which we&rsquo;ll get to <a href=\"superclasses.html\">soon</a>, we now have a\nfairly full-featured class system working in clox.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">CoffeeMaker</span> {\n  <span class=\"i\">init</span>(<span class=\"i\">coffee</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">coffee</span> = <span class=\"i\">coffee</span>;\n  }\n\n  <span class=\"i\">brew</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Enjoy your cup of &quot;</span> + <span class=\"k\">this</span>.<span class=\"i\">coffee</span>;\n\n    <span class=\"c\">// No reusing the grounds!</span>\n    <span class=\"k\">this</span>.<span class=\"i\">coffee</span> = <span class=\"k\">nil</span>;\n  }\n}\n\n<span class=\"k\">var</span> <span class=\"i\">maker</span> = <span class=\"t\">CoffeeMaker</span>(<span class=\"s\">&quot;coffee and chicory&quot;</span>);\n<span 
class=\"i\">maker</span>.<span class=\"i\">brew</span>();\n</pre></div>\n<p>Pretty fancy for a C program that would fit on an old <span\nname=\"floppy\">floppy</span> disk.</p>\n<aside name=\"floppy\">\n<p>I acknowledge that &ldquo;floppy disk&rdquo; may no longer be a useful size reference for\ncurrent generations of programmers. Maybe I should have said &ldquo;a few tweets&rdquo; or\nsomething.</p>\n</aside>\n<h2><a href=\"#optimized-invocations\" id=\"optimized-invocations\"><small>28&#8202;.&#8202;5</small>Optimized Invocations</a></h2>\n<p>Our VM correctly implements the language&rsquo;s semantics for method calls and\ninitializers. We could stop here. But the main reason we are building an entire\nsecond implementation of Lox from scratch is to execute faster than our old Java\ninterpreter. Right now, method calls even in clox are slow.</p>\n<p>Lox&rsquo;s semantics define a method invocation as two operations<span class=\"em\">&mdash;</span>accessing the\nmethod and then calling the result. Our VM must support those as separate\noperations because the user <em>can</em> separate them. You can access a method without\ncalling it and then invoke the bound method later. Nothing we&rsquo;ve implemented so\nfar is unnecessary.</p>\n<p>But <em>always</em> executing those as separate operations has a significant cost.\nEvery single time a Lox program accesses and invokes a method, the runtime\nheap allocates a new ObjBoundMethod, initializes its fields, then pulls them\nright back out. Later, the GC has to spend time freeing all of those ephemeral\nbound methods.</p>\n<p>Most of the time, a Lox program accesses a method and then immediately calls it.\nThe bound method is created by one bytecode instruction and then consumed by the\nvery next one. 
In fact, it&rsquo;s so immediate that the compiler can even textually\n<em>see</em> that it&rsquo;s happening<span class=\"em\">&mdash;</span>a dotted property access followed by an opening\nparenthesis is most likely a method call.</p>\n<p>Since we can recognize this pair of operations at compile time, we have the\nopportunity to emit a <span name=\"super\">new, special</span> instruction that\nperforms an optimized method call.</p>\n<p>We start in the function that compiles dotted property expressions.</p>\n<aside name=\"super\" class=\"bottom\">\n<p>If you spend enough time watching your bytecode VM run, you&rsquo;ll notice it often\nexecutes the same series of bytecode instructions one after the other. A classic\noptimization technique is to define a new instruction called a\n<strong>superinstruction</strong> that fuses the whole sequence into a single instruction with the\nsame behavior.</p>\n<p>One of the largest performance drains in a bytecode interpreter is the overhead\nof decoding and dispatching each instruction. Fusing several instructions into\none eliminates some of that.</p>\n<p>The challenge is determining <em>which</em> instruction sequences are common enough to\nbenefit from this optimization. Every new superinstruction claims an opcode for\nits own use and there are only so many of those to go around. 
Add too many, and\nyou&rsquo;ll need a larger encoding for opcodes, which then increases code size and\nmakes decoding <em>all</em> instructions slower.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">  if (canAssign &amp;&amp; match(TOKEN_EQUAL)) {\n    expression();\n    emitBytes(OP_SET_PROPERTY, name);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>dot</em>()</div>\n<pre class=\"insert\">  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_LEFT_PAREN</span>)) {\n    <span class=\"t\">uint8_t</span> <span class=\"i\">argCount</span> = <span class=\"i\">argumentList</span>();\n    <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_INVOKE</span>, <span class=\"i\">name</span>);\n    <span class=\"i\">emitByte</span>(<span class=\"i\">argCount</span>);\n</pre><pre class=\"insert-after\">  } else {\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>dot</em>()</div>\n\n<p>After the compiler has parsed the property name, we look for a left parenthesis.\nIf we match one, we switch to a new code path. There, we compile the argument\nlist exactly like we do when compiling a call expression. Then we emit a single\nnew <code>OP_INVOKE</code> instruction. It takes two operands:</p>\n<ol>\n<li>\n<p>The index of the property name in the constant table.</p>\n</li>\n<li>\n<p>The number of arguments passed to the method.</p>\n</li>\n</ol>\n<p>In other words, this single instruction combines the operands of the\n<code>OP_GET_PROPERTY</code> and <code>OP_CALL</code> instructions it replaces, in that order. It\nreally is a fusion of those two instructions. 
Let&rsquo;s define it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_CALL,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_INVOKE</span>,\n</pre><pre class=\"insert-after\">  OP_CLOSURE,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>And add it to the disassembler:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case OP_CALL:\n      return byteInstruction(&quot;OP_CALL&quot;, chunk, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_INVOKE</span>:\n      <span class=\"k\">return</span> <span class=\"i\">invokeInstruction</span>(<span class=\"s\">&quot;OP_INVOKE&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_CLOSURE: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>This is a new, special instruction format, so it needs a little custom\ndisassembly logic.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>debug.c</em><br>\nadd after <em>constantInstruction</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">int</span> <span class=\"i\">invokeInstruction</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">name</span>, <span class=\"t\">Chunk</span>* <span class=\"i\">chunk</span>,\n                                <span class=\"t\">int</span> <span class=\"i\">offset</span>) {\n  <span class=\"t\">uint8_t</span> <span class=\"i\">constant</span> = <span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>[<span class=\"i\">offset</span> + <span class=\"n\">1</span>];\n  <span class=\"t\">uint8_t</span> <span class=\"i\">argCount</span> = <span 
class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span>[<span class=\"i\">offset</span> + <span class=\"n\">2</span>];\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;%-16s (%d args) %4d &#39;&quot;</span>, <span class=\"i\">name</span>, <span class=\"i\">argCount</span>, <span class=\"i\">constant</span>);\n  <span class=\"i\">printValue</span>(<span class=\"i\">chunk</span>-&gt;<span class=\"i\">constants</span>.<span class=\"i\">values</span>[<span class=\"i\">constant</span>]);\n  <span class=\"i\">printf</span>(<span class=\"s\">&quot;&#39;</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n  <span class=\"k\">return</span> <span class=\"i\">offset</span> + <span class=\"n\">3</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, add after <em>constantInstruction</em>()</div>\n\n<p>We read the two operands and then print out both the method name and the\nargument count. Over in the interpreter&rsquo;s bytecode dispatch loop is where the\nreal action begins.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_INVOKE</span>: {\n        <span class=\"t\">ObjString</span>* <span class=\"i\">method</span> = <span class=\"a\">READ_STRING</span>();\n        <span class=\"t\">int</span> <span class=\"i\">argCount</span> = <span class=\"a\">READ_BYTE</span>();\n        <span class=\"k\">if</span> (!<span class=\"i\">invoke</span>(<span class=\"i\">method</span>, <span class=\"i\">argCount</span>)) {\n          <span class=\"k\">return</span> <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>;\n        }\n        <span class=\"i\">frame</span> = &amp;<span class=\"i\">vm</span>.<span class=\"i\">frames</span>[<span class=\"i\">vm</span>.<span class=\"i\">frameCount</span> - <span class=\"n\">1</span>];\n        <span class=\"k\">break</span>;\n    
  }\n</pre><pre class=\"insert-after\">      case OP_CLOSURE: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>Most of the work happens in <code>invoke()</code>, which we&rsquo;ll get to. Here, we look up the\nmethod name from the first operand and then read the argument count operand.\nThen we hand off to <code>invoke()</code> to do the heavy lifting. That function returns\n<code>true</code> if the invocation succeeds. As usual, a <code>false</code> return means a runtime\nerror occurred. We check for that here and abort the interpreter if disaster has\nstruck.</p>\n<p>Finally, assuming the invocation succeeded, then there is a new CallFrame on the\nstack, so we refresh our cached copy of the current frame in <code>frame</code>.</p>\n<p>The interesting work happens here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>callValue</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">invoke</span>(<span class=\"t\">ObjString</span>* <span class=\"i\">name</span>, <span class=\"t\">int</span> <span class=\"i\">argCount</span>) {\n  <span class=\"t\">Value</span> <span class=\"i\">receiver</span> = <span class=\"i\">peek</span>(<span class=\"i\">argCount</span>);\n  <span class=\"t\">ObjInstance</span>* <span class=\"i\">instance</span> = <span class=\"a\">AS_INSTANCE</span>(<span class=\"i\">receiver</span>);\n  <span class=\"k\">return</span> <span class=\"i\">invokeFromClass</span>(<span class=\"i\">instance</span>-&gt;<span class=\"i\">klass</span>, <span class=\"i\">name</span>, <span class=\"i\">argCount</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>callValue</em>()</div>\n\n<p>First we grab the receiver off the stack. The arguments passed to the method are\nabove it on the stack, so we peek that many slots down. 
Then it&rsquo;s a simple\nmatter to cast the object to an instance and invoke the method on it.</p>\n<p>That does assume the object <em>is</em> an instance. As with <code>OP_GET_PROPERTY</code>\ninstructions, we also need to handle the case where a user incorrectly tries to\ncall a method on a value of the wrong type.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  Value receiver = peek(argCount);\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>invoke</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">if</span> (!<span class=\"a\">IS_INSTANCE</span>(<span class=\"i\">receiver</span>)) {\n    <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Only instances have methods.&quot;</span>);\n    <span class=\"k\">return</span> <span class=\"k\">false</span>;\n  }\n\n</pre><pre class=\"insert-after\">  ObjInstance* instance = AS_INSTANCE(receiver);\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>invoke</em>()</div>\n\n<p><span name=\"helper\">That&rsquo;s</span> a runtime error, so we report that and bail\nout. 
Otherwise, we get the instance&rsquo;s class and jump over to this other new\nutility function:</p>\n<aside name=\"helper\">\n<p>As you can guess by now, we split this code into a separate function because\nwe&rsquo;re going to reuse it later<span class=\"em\">&mdash;</span>in this case for <code>super</code> calls.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>callValue</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">invokeFromClass</span>(<span class=\"t\">ObjClass</span>* <span class=\"i\">klass</span>, <span class=\"t\">ObjString</span>* <span class=\"i\">name</span>,\n                            <span class=\"t\">int</span> <span class=\"i\">argCount</span>) {\n  <span class=\"t\">Value</span> <span class=\"i\">method</span>;\n  <span class=\"k\">if</span> (!<span class=\"i\">tableGet</span>(&amp;<span class=\"i\">klass</span>-&gt;<span class=\"i\">methods</span>, <span class=\"i\">name</span>, &amp;<span class=\"i\">method</span>)) {\n    <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Undefined property &#39;%s&#39;.&quot;</span>, <span class=\"i\">name</span>-&gt;<span class=\"i\">chars</span>);\n    <span class=\"k\">return</span> <span class=\"k\">false</span>;\n  }\n  <span class=\"k\">return</span> <span class=\"i\">call</span>(<span class=\"a\">AS_CLOSURE</span>(<span class=\"i\">method</span>), <span class=\"i\">argCount</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>callValue</em>()</div>\n\n<p>This function combines the logic of how the VM implements <code>OP_GET_PROPERTY</code> and\n<code>OP_CALL</code> instructions, in that order. First we look up the method by name in\nthe class&rsquo;s method table. If we don&rsquo;t find one, we report that runtime error and\nexit.</p>\n<p>Otherwise, we take the method&rsquo;s closure and push a call to it onto the CallFrame\nstack. 
We don&rsquo;t need to heap allocate and initialize an ObjBoundMethod. In fact,\nwe don&rsquo;t even need to <span name=\"juggle\">juggle</span> anything on the stack.\nThe receiver and method arguments are already right where they need to be.</p>\n<aside name=\"juggle\">\n<p>This is a key reason <em>why</em> we use stack slot zero to store the receiver<span class=\"em\">&mdash;</span>it&rsquo;s\nhow the caller already organizes the stack for a method call. An efficient\ncalling convention is an important part of a bytecode VM&rsquo;s performance story.</p>\n</aside>\n<p>If you fire up the VM and run a little program that calls methods now, you\nshould see the exact same behavior as before. But, if we did our job right, the\n<em>performance</em> should be much improved. I wrote a little microbenchmark that\ndoes a batch of 10,000 method calls. Then it tests how many of these batches it\ncan execute in 10 seconds. On my computer, without the new <code>OP_INVOKE</code>\ninstruction, it got through 1,089 batches. With this new optimization, it\nfinished 8,324 batches in the same time. That&rsquo;s <em>7.6 times faster</em>, which is a\nhuge improvement when it comes to programming language optimization.</p>\n<p><span name=\"pat\"></span></p>\n<aside name=\"pat\">\n<p>We shouldn&rsquo;t pat ourselves on the back <em>too</em> firmly. This performance\nimprovement is relative to our own unoptimized method call implementation which\nwas quite slow. 
Doing a heap allocation for every single method call isn&rsquo;t going\nto win any races.</p>\n</aside><img src=\"image/methods-and-initializers/benchmark.png\" alt=\"Bar chart comparing the two benchmark results.\" />\n<h3><a href=\"#invoking-fields\" id=\"invoking-fields\"><small>28&#8202;.&#8202;5&#8202;.&#8202;1</small>Invoking fields</a></h3>\n<p>The fundamental creed of optimization is: &ldquo;Thou shalt not break correctness.&rdquo;\n<span name=\"monte\">Users</span> like it when a language implementation gives\nthem an answer faster, but only if it&rsquo;s the <em>right</em> answer. Alas, our\nimplementation of faster method invocations fails to uphold that principle:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Oops</span> {\n  <span class=\"i\">init</span>() {\n    <span class=\"k\">fun</span> <span class=\"i\">f</span>() {\n      <span class=\"k\">print</span> <span class=\"s\">&quot;not a method&quot;</span>;\n    }\n\n    <span class=\"k\">this</span>.<span class=\"i\">field</span> = <span class=\"i\">f</span>;\n  }\n}\n\n<span class=\"k\">var</span> <span class=\"i\">oops</span> = <span class=\"t\">Oops</span>();\n<span class=\"i\">oops</span>.<span class=\"i\">field</span>();\n</pre></div>\n<p>The last line looks like a method call. The compiler thinks that it is and\ndutifully emits an <code>OP_INVOKE</code> instruction for it. However, it&rsquo;s not. What is\nactually happening is a <em>field</em> access that returns a function which then gets\ncalled. Right now, instead of executing that correctly, our VM reports a runtime\nerror when it can&rsquo;t find a method named &ldquo;field&rdquo;.</p>\n<aside name=\"monte\">\n<p>There are cases where users may be satisfied when a program sometimes returns\nthe wrong answer in return for running significantly faster or with a better\nbound on the performance. 
This is the domain of <a href=\"https://en.wikipedia.org/wiki/Monte_Carlo_algorithm\"><strong>Monte Carlo\nalgorithms</strong></a>. For some use cases, this is a good trade-off.</p>\n<p>The important part, though, is that the user is <em>choosing</em> to apply one of these\nalgorithms. We language implementers can&rsquo;t unilaterally decide to sacrifice\ntheir program&rsquo;s correctness.</p>\n</aside>\n<p>Earlier, when we implemented <code>OP_GET_PROPERTY</code>, we handled both field and method\naccesses. To squash this new bug, we need to do the same thing for <code>OP_INVOKE</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  ObjInstance* instance = AS_INSTANCE(receiver);\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>invoke</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"t\">Value</span> <span class=\"i\">value</span>;\n  <span class=\"k\">if</span> (<span class=\"i\">tableGet</span>(&amp;<span class=\"i\">instance</span>-&gt;<span class=\"i\">fields</span>, <span class=\"i\">name</span>, &amp;<span class=\"i\">value</span>)) {\n    <span class=\"i\">vm</span>.<span class=\"i\">stackTop</span>[-<span class=\"i\">argCount</span> - <span class=\"n\">1</span>] = <span class=\"i\">value</span>;\n    <span class=\"k\">return</span> <span class=\"i\">callValue</span>(<span class=\"i\">value</span>, <span class=\"i\">argCount</span>);\n  }\n\n</pre><pre class=\"insert-after\">  return invokeFromClass(instance-&gt;klass, name, argCount);\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>invoke</em>()</div>\n\n<p>Pretty simple fix. Before looking up a method on the instance&rsquo;s class, we look\nfor a field with the same name. If we find a field, then we store it on the\nstack in place of the receiver, <em>under</em> the argument list. 
This is how\n<code>OP_GET_PROPERTY</code> behaves since the latter instruction executes before a\nsubsequent parenthesized list of arguments has been evaluated.</p>\n<p>Then we try to call that field&rsquo;s value like the callable that it hopefully is.\nThe <code>callValue()</code> helper will check the value&rsquo;s type and call it as appropriate\nor report a runtime error if the field&rsquo;s value isn&rsquo;t a callable type like a\nclosure.</p>\n<p>That&rsquo;s all it takes to make our optimization fully safe. We do sacrifice a\nlittle performance, unfortunately. But that&rsquo;s the price you have to pay\nsometimes. You occasionally get frustrated by optimizations you <em>could</em> do if\nonly the language wouldn&rsquo;t allow some annoying corner case. But, as language\n<span name=\"designer\">implementers</span>, we have to play the game we&rsquo;re given.</p>\n<aside name=\"designer\">\n<p>As language <em>designers</em>, our role is very different. If we do control the\nlanguage itself, we may sometimes choose to restrict or change the language in\nways that enable optimizations. Users want expressive languages, but they also\nwant fast implementations. Sometimes it is good language design to sacrifice a\nlittle power if you can give them perf in return.</p>\n</aside>\n<p>The code we wrote here follows a typical pattern in optimization:</p>\n<ol>\n<li>\n<p>Recognize a common operation or sequence of operations that is performance\ncritical. In this case, it is a method access followed by a call.</p>\n</li>\n<li>\n<p>Add an optimized implementation of that pattern. That&rsquo;s our <code>OP_INVOKE</code>\ninstruction.</p>\n</li>\n<li>\n<p>Guard the optimized code with some conditional logic that validates that the\npattern actually applies. If it does, stay on the fast path. Otherwise, fall\nback to a slower but more robust unoptimized behavior. 
Here, that means\nchecking that we are actually calling a method and not accessing a field.</p>\n</li>\n</ol>\n<p>As your language work moves from getting the implementation working <em>at all</em> to\ngetting it to work <em>faster</em>, you will find yourself spending more and more\ntime looking for patterns like this and adding guarded optimizations for them.\nFull-time VM engineers spend much of their careers in this loop.</p>\n<p>But we can stop here for now. With this, clox now supports most of the features\nof an object-oriented programming language, and with respectable performance.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>The hash table lookup to find a class&rsquo;s <code>init()</code> method is constant time,\nbut still fairly slow. Implement something faster. Write a benchmark and\nmeasure the performance difference.</p>\n</li>\n<li>\n<p>In a dynamically typed language like Lox, a single callsite may invoke a\nvariety of methods on a number of classes throughout a program&rsquo;s execution.\nEven so, in practice, most of the time a callsite ends up calling the exact\nsame method on the exact same class for the duration of the run. Most calls\nare actually not polymorphic even if the language says they can be.</p>\n<p>How do advanced language implementations optimize based on that observation?</p>\n</li>\n<li>\n<p>When interpreting an <code>OP_INVOKE</code> instruction, the VM has to do two hash\ntable lookups. First, it looks for a field that could shadow a method, and\nonly if that fails does it look for a method. The former check is rarely\nuseful<span class=\"em\">&mdash;</span>most fields do not contain functions. But it is <em>necessary</em>\nbecause the language says fields and methods are accessed using the same\nsyntax, and fields shadow methods.</p>\n<p>That is a language <em>choice</em> that affects the performance of our\nimplementation. Was it the right choice? 
If Lox were your language, what\nwould you do?</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: Novelty Budget</a></h2>\n<p>I still remember the first time I wrote a tiny BASIC program on a TRS-80 and\nmade a computer do something it hadn&rsquo;t done before. It felt like a superpower.\nThe first time I cobbled together just enough of a parser and interpreter to let\nme write a tiny program in <em>my own language</em> that made a computer do a thing was\nlike some sort of higher-order meta-superpower. It was and remains a wonderful\nfeeling.</p>\n<p>I realized I could design a language that looked and behaved however I chose. It\nwas like I&rsquo;d been going to a private school that required uniforms my whole life\nand then one day transferred to a public school where I could wear whatever I\nwanted. I don&rsquo;t need to use curly braces for blocks? I can use something other\nthan an equals sign for assignment? I can do objects without classes? Multiple\ninheritance <em>and</em> multimethods? A dynamic language that overloads statically, by\narity?</p>\n<p>Naturally, I took that freedom and ran with it. I made the weirdest, most\narbitrary language design decisions. Apostrophes for generics. No commas between\narguments. Overload resolution that can fail at runtime. I did things\ndifferently just for difference&rsquo;s sake.</p>\n<p>This is a very fun experience that I highly recommend. We need more weird,\navant-garde programming languages. I want to see more art languages. I still\nmake oddball toy languages for fun sometimes.</p>\n<p><em>However</em>, if your goal is success where &ldquo;success&rdquo; is defined as a large number\nof users, then your priorities must be different. In that case, your primary\ngoal is to have your language loaded into the brains of as many people as\npossible. That&rsquo;s <em>really hard</em>. 
It takes a lot of human effort to move a\nlanguage&rsquo;s syntax and semantics from a computer into trillions of neurons.</p>\n<p>Programmers are naturally conservative with their time and cautious about what\nlanguages are worth uploading into their wetware. They don&rsquo;t want to waste their\ntime on a language that ends up not being useful to them. As a language\ndesigner, your goal is thus to give them as much language power as you can with\nas little required learning as possible.</p>\n<p>One natural approach is <em>simplicity</em>. The fewer concepts and features your\nlanguage has, the less total volume of stuff there is to learn. This is one of\nthe reasons minimal <span name=\"dynamic\">scripting</span> languages often find\nsuccess even though they aren&rsquo;t as powerful as the big industrial languages<span class=\"em\">&mdash;</span>they are easier to get started with, and once they are in someone&rsquo;s brain, the\nuser wants to keep using them.</p>\n<aside name=\"dynamic\">\n<p>In particular, this is a big advantage of dynamically typed languages. A static\nlanguage requires you to learn <em>two</em> languages<span class=\"em\">&mdash;</span>the runtime semantics and the\nstatic type system<span class=\"em\">&mdash;</span>before you can get to the point where you are making the\ncomputer do stuff. Dynamic languages require you to learn only the former.</p>\n<p>Eventually, programs get big enough that the value of static analysis pays for\nthe effort to learn that second static language, but the value proposition isn&rsquo;t\nas obvious at the outset.</p>\n</aside>\n<p>The problem with simplicity is that simply cutting features often sacrifices\npower and expressiveness. There is an art to finding features that punch above\ntheir weight, but often minimal languages simply do less.</p>\n<p>There is another path that avoids much of that problem. 
The trick is to realize\nthat a user doesn&rsquo;t have to load your entire language into their head, <em>just the\npart they don&rsquo;t already have in there</em>. As I mentioned in an <a href=\"parsing-expressions.html#design-note\">earlier design\nnote</a>, learning is about transferring the <em>delta</em> between what they\nalready know and what they need to know.</p>\n<p>Many potential users of your language already know some other programming\nlanguage. Any features your language shares with that language are essentially\n&ldquo;free&rdquo; when it comes to learning. It&rsquo;s already in their head, they just have to\nrecognize that your language does the same thing.</p>\n<p>In other words, <em>familiarity</em> is another key tool to lower the adoption cost of\nyour language. Of course, if you fully maximize that attribute, the end result\nis a language that is completely identical to some existing one. That&rsquo;s not a\nrecipe for success, because at that point there&rsquo;s no incentive for users to\nswitch to your language at all.</p>\n<p>So you do need to provide some compelling differences. Some things your language\ncan do that other languages can&rsquo;t, or at least can&rsquo;t do as well. I believe this\nis one of the fundamental balancing acts of language design: similarity to other\nlanguages lowers learning cost, while divergence raises the compelling\nadvantages.</p>\n<p>I think of this balancing act in terms of a <span name=\"idiosyncracy\"><strong>novelty\nbudget</strong></span>, or as Steve Klabnik calls it, a &ldquo;<a href=\"https://words.steveklabnik.com/the-language-strangeness-budget\">strangeness budget</a>&rdquo;. Users\nhave a low threshold for the total amount of new stuff they are willing to\naccept to learn a new language. 
Exceed that, and they won&rsquo;t show up.</p>\n<aside name=\"idiosyncracy\">\n<p>A related concept in psychology is <a href=\"https://en.wikipedia.org/wiki/Idiosyncrasy_credit\"><strong>idiosyncrasy credit</strong></a>, the\nidea that other people in society grant you a finite number of deviations from\nsocial norms. You earn credit by fitting in and doing in-group things, which you\ncan then spend on oddball activities that might otherwise raise eyebrows. In\nother words, demonstrating that you are &ldquo;one of the good ones&rdquo; gives you license\nto raise your freak flag, but only so far.</p>\n</aside>\n<p>Anytime you add something new to your language that other languages don&rsquo;t have,\nor anytime your language does something other languages do in a different way,\nyou spend some of that budget. That&rsquo;s OK<span class=\"em\">&mdash;</span>you <em>need</em> to spend it to make your\nlanguage compelling. But your goal is to spend it <em>wisely</em>. For each feature or\ndifference, ask yourself how much compelling power it adds to your language and\nthen evaluate critically whether it pays its way. Is the change so valuable that\nit is worth blowing some of your novelty budget?</p>\n<p>In practice, I find this means that you end up being pretty conservative with\nsyntax and more adventurous with semantics. As fun as it is to put on a new\nchange of clothes, swapping out curly braces for some other block delimiter is\nvery unlikely to add much real power to the language, but it does spend some\nnovelty. It&rsquo;s hard for syntax differences to carry their weight.</p>\n<p>On the other hand, new semantics can significantly increase the power of the\nlanguage. Multimethods, mixins, traits, reflection, dependent types, runtime\nmetaprogramming, etc. 
can radically level up what a user can do with the\nlanguage.</p>\n<p>Alas, being conservative like this is not as fun as just changing everything.\nBut it&rsquo;s up to you to decide whether you want to chase mainstream success or not\nin the first place. We don&rsquo;t all need to be radio-friendly pop bands. If you\nwant your language to be like free jazz or drone metal and are happy with the\nproportionally smaller (but likely more devoted) audience size, go for it.</p>\n</div>\n\n<footer>\n<a href=\"superclasses.html\" class=\"next\">\n  Next Chapter: &ldquo;Superclasses&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/optimization.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Optimization &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Optimization<small>30</small></a></h3>\n\n<ul>\n    <li><a href=\"#measuring-performance\"><small>30.1</small> Measuring Performance</a></li>\n    <li><a href=\"#faster-hash-table-probing\"><small>30.2</small> Faster Hash Table Probing</a></li>\n    <li><a href=\"#nan-boxing\"><small>30.3</small> NaN Boxing</a></li>\n    <li><a href=\"#where-to-next\"><small>30.4</small> Where to Next</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div 
class=\"prev-next\">\n    <a href=\"superclasses.html\" title=\"Superclasses\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"backmatter.html\" title=\"Backmatter\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"superclasses.html\" title=\"Superclasses\" class=\"prev\">←</a>\n<a href=\"backmatter.html\" title=\"Backmatter\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Optimization<small>30</small></a></h3>\n\n<ul>\n    <li><a href=\"#measuring-performance\"><small>30.1</small> Measuring Performance</a></li>\n    <li><a href=\"#faster-hash-table-probing\"><small>30.2</small> Faster Hash Table Probing</a></li>\n    <li><a href=\"#nan-boxing\"><small>30.3</small> NaN Boxing</a></li>\n    <li><a href=\"#where-to-next\"><small>30.4</small> Where to Next</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"superclasses.html\" title=\"Superclasses\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"backmatter.html\" title=\"Backmatter\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">30</div>\n  <h1>Optimization</h1>\n\n<blockquote>\n<p>The evening&rsquo;s the best part of the day. You&rsquo;ve done your day&rsquo;s work. 
Now you\ncan put your feet up and enjoy it.</p>\n<p><cite>Kazuo Ishiguro, <em>The Remains of the Day</em></cite></p>\n</blockquote>\n<p>If I still lived in New Orleans, I&rsquo;d call this chapter a <em>lagniappe</em>, a little\nsomething extra given for free to a customer. You&rsquo;ve got a whole book and a\ncomplete virtual machine already, but I want you to have some more fun hacking\non clox. This time, we&rsquo;re going for pure performance. We&rsquo;ll apply two very\ndifferent optimizations to our virtual machine. In the process, you&rsquo;ll get a\nfeel for measuring and improving the performance of a language implementation<span class=\"em\">&mdash;</span>or any program, really.</p>\n<h2><a href=\"#measuring-performance\" id=\"measuring-performance\"><small>30&#8202;.&#8202;1</small>Measuring Performance</a></h2>\n<p><strong>Optimization</strong> means taking a working application and improving its\nperformance. An optimized program does the same thing, it just takes fewer\nresources to do so. The resource we usually think of when optimizing is runtime\nspeed, but it can also be important to reduce memory usage, startup time,\npersistent storage size, or network bandwidth. All physical resources have some\ncost<span class=\"em\">&mdash;</span>even if the cost is mostly in wasted human time<span class=\"em\">&mdash;</span>so optimization work\noften pays off.</p>\n<p>There was a time in the early days of computing when a skilled programmer could\nhold the entire hardware architecture and compiler pipeline in their head and\nunderstand a program&rsquo;s performance just by thinking real hard. Those days are\nlong gone, separated from the present by microcode, cache lines, branch\nprediction, deep compiler pipelines, and mammoth instruction sets. 
We like to\npretend C is a &ldquo;low-level&rdquo; language, but the stack of technology between</p>\n<div class=\"codehilite\"><pre><span class=\"i\">printf</span>(<span class=\"s\">&quot;Hello, world!&quot;</span>);\n</pre></div>\n<p>and a greeting appearing on screen is now perilously tall.</p>\n<p>Optimization today is an empirical science. Our program is a border collie\nsprinting through the hardware&rsquo;s obstacle course. If we want her to reach the\nend faster, we can&rsquo;t just sit and ruminate on canine physiology until\nenlightenment strikes. Instead, we need to <em>observe</em> her performance, see where\nshe stumbles, and then find faster paths for her to take.</p>\n<p>Much like agility training is particular to one dog and one obstacle course, we\ncan&rsquo;t assume that our virtual machine optimizations will make <em>all</em> Lox programs\nrun faster on <em>all</em> hardware. Different Lox programs stress different areas of\nthe VM, and different architectures have their own strengths and weaknesses.</p>\n<h3><a href=\"#benchmarks\" id=\"benchmarks\"><small>30&#8202;.&#8202;1&#8202;.&#8202;1</small>Benchmarks</a></h3>\n<p>When we add new functionality, we validate correctness by writing tests<span class=\"em\">&mdash;</span>Lox\nprograms that use a feature and validate the VM&rsquo;s behavior. Tests pin down\nsemantics and ensure we don&rsquo;t break existing features when we add new ones. We\nhave similar needs when it comes to performance:</p>\n<ol>\n<li>\n<p>How do we validate that an optimization <em>does</em> improve performance, and by\nhow much?</p>\n</li>\n<li>\n<p>How do we ensure that other unrelated changes don&rsquo;t <em>regress</em> performance?</p>\n</li>\n</ol>\n<p>The Lox programs we write to accomplish those goals are <strong>benchmarks</strong>. These\nare carefully crafted programs that stress some part of the language\nimplementation. 
They measure not <em>what</em> the program does, but how <span\nname=\"much\"><em>long</em></span> it takes to do it.</p>\n<aside name=\"much\">\n<p>Most benchmarks measure running time. But, of course, you&rsquo;ll eventually find\nyourself needing to write benchmarks that measure memory allocation, how much\ntime is spent in the garbage collector, startup time, etc.</p>\n</aside>\n<p>By measuring the performance of a benchmark before and after a change, you can\nsee what your change does. When you land an optimization, all of the tests\nshould behave exactly the same as they did before, but hopefully the benchmarks\nrun faster.</p>\n<p>Once you have an entire <span name=\"js\"><em>suite</em></span> of benchmarks, you can\nmeasure not just <em>that</em> an optimization changes performance, but on which\n<em>kinds</em> of code. Often you&rsquo;ll find that some benchmarks get faster while others\nget slower. Then you have to make hard decisions about what kinds of code your\nlanguage implementation optimizes for.</p>\n<p>The suite of benchmarks you choose to write is a key part of that decision. In\nthe same way that your tests encode your choices around what correct behavior\nlooks like, your benchmarks are the embodiment of your priorities when it comes\nto performance. They will guide which optimizations you implement, so choose\nyour benchmarks carefully, and don&rsquo;t forget to periodically reflect on whether\nthey are helping you reach your larger goals.</p>\n<aside name=\"js\">\n<p>In the early proliferation of JavaScript VMs, the first widely used benchmark\nsuite was SunSpider from WebKit. During the browser wars, marketing folks used\nSunSpider results to claim their browser was fastest. That highly incentivized\nVM hackers to optimize to those benchmarks.</p>\n<p>Unfortunately, SunSpider programs often didn&rsquo;t match real-world JavaScript. They\nwere mostly microbenchmarks<span class=\"em\">&mdash;</span>tiny toy programs that completed quickly. 
Those\nbenchmarks penalize complex just-in-time compilers that start off slower but get\n<em>much</em> faster once the JIT has had enough time to optimize and re-compile hot\ncode paths. This put VM hackers in the unfortunate position of having to choose\nbetween making the SunSpider numbers better and actually optimizing the\nkinds of programs real users ran.</p>\n<p>Google&rsquo;s V8 team responded by sharing their Octane benchmark suite, which was\ncloser to real-world code at the time. Years later, as JavaScript use patterns\ncontinued to evolve, even Octane outlived its usefulness. Expect that your\nbenchmarks will evolve as your language&rsquo;s ecosystem does.</p>\n<p>Remember, the ultimate goal is to make <em>user programs</em> faster, and benchmarks\nare only a proxy for that.</p>\n</aside>\n<p>Benchmarking is a subtle art. As with tests, you need to avoid overfitting to\nyour implementation while still ensuring that the benchmark actually tickles the\ncode paths that you care about. When you measure performance, you need to\ncompensate for variance caused by CPU throttling, caching, and other weird\nhardware and operating system quirks. I won&rsquo;t give you a whole sermon here,\nbut treat benchmarking as its own skill that improves with practice.</p>\n<h3><a href=\"#profiling\" id=\"profiling\"><small>30&#8202;.&#8202;1&#8202;.&#8202;2</small>Profiling</a></h3>\n<p>OK, so you&rsquo;ve got a few benchmarks now. You want to make them go faster. Now\nwhat? First of all, let&rsquo;s assume you&rsquo;ve done all the obvious, easy work. You are\nusing the right algorithms and data structures<span class=\"em\">&mdash;</span>or, at least, you aren&rsquo;t using\nones that are aggressively wrong. 
I don&rsquo;t consider using a hash table instead of\na linear search through a huge unsorted array &ldquo;optimization&rdquo; so much as &ldquo;good\nsoftware engineering&rdquo;.</p>\n<p>Since the hardware is too complex to reason about our program&rsquo;s performance from\nfirst principles, we have to go out into the field. That means <em>profiling</em>. A\n<strong>profiler</strong>, if you&rsquo;ve never used one, is a tool that runs your <span\nname=\"program\">program</span> and tracks hardware resource use as the code\nexecutes. Simple ones show you how much time was spent in each function in your\nprogram. Sophisticated ones log data cache misses, instruction cache misses,\nbranch mispredictions, memory allocations, and all sorts of other metrics.</p>\n<aside name=\"program\">\n<p>&ldquo;Your program&rdquo; here means the Lox VM itself running some <em>other</em> Lox program. We\nare trying to optimize clox, not the user&rsquo;s Lox script. Of course, the choice of\nwhich Lox program to load into our VM will highly affect which parts of clox get\nstressed, which is why benchmarks are so important.</p>\n<p>A profiler <em>won&rsquo;t</em> show us how much time is spent in each <em>Lox</em> function in the\nscript being run. We&rsquo;d have to write our own &ldquo;Lox profiler&rdquo; to do that, which is\nslightly out of scope for this book.</p>\n</aside>\n<p>There are many profilers out there for various operating systems and languages.\nOn whatever platform you program, it&rsquo;s worth getting familiar with a decent\nprofiler. You don&rsquo;t need to be a master. I have learned things within minutes of\nthrowing a program at a profiler that would have taken me <em>days</em> to discover on\nmy own through trial and error. 
Profilers are wonderful, magical tools.</p>\n<h2><a href=\"#faster-hash-table-probing\" id=\"faster-hash-table-probing\"><small>30&#8202;.&#8202;2</small>Faster Hash Table Probing</a></h2>\n<p>Enough pontificating, let&rsquo;s get some performance charts going up and to the\nright. The first optimization we&rsquo;ll do, it turns out, is about the <em>tiniest</em>\npossible change we could make to our VM.</p>\n<p>When I first got the bytecode virtual machine that clox is descended from\nworking, I did what any self-respecting VM hacker would do. I cobbled together a\ncouple of benchmarks, fired up a profiler, and ran those scripts through my\ninterpreter. In a dynamically typed language like Lox, a large fraction of user\ncode is field accesses and method calls, so one of my benchmarks looked\nsomething like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Zoo</span> {\n  <span class=\"i\">init</span>() {\n    <span class=\"k\">this</span>.<span class=\"i\">aardvark</span> = <span class=\"n\">1</span>;\n    <span class=\"k\">this</span>.<span class=\"i\">baboon</span>   = <span class=\"n\">1</span>;\n    <span class=\"k\">this</span>.<span class=\"i\">cat</span>      = <span class=\"n\">1</span>;\n    <span class=\"k\">this</span>.<span class=\"i\">donkey</span>   = <span class=\"n\">1</span>;\n    <span class=\"k\">this</span>.<span class=\"i\">elephant</span> = <span class=\"n\">1</span>;\n    <span class=\"k\">this</span>.<span class=\"i\">fox</span>      = <span class=\"n\">1</span>;\n  }\n  <span class=\"i\">ant</span>()    { <span class=\"k\">return</span> <span class=\"k\">this</span>.<span class=\"i\">aardvark</span>; }\n  <span class=\"i\">banana</span>() { <span class=\"k\">return</span> <span class=\"k\">this</span>.<span class=\"i\">baboon</span>; }\n  <span class=\"i\">tuna</span>()   { <span class=\"k\">return</span> <span class=\"k\">this</span>.<span class=\"i\">cat</span>; }\n  <span class=\"i\">hay</span>() 
   { <span class=\"k\">return</span> <span class=\"k\">this</span>.<span class=\"i\">donkey</span>; }\n  <span class=\"i\">grass</span>()  { <span class=\"k\">return</span> <span class=\"k\">this</span>.<span class=\"i\">elephant</span>; }\n  <span class=\"i\">mouse</span>()  { <span class=\"k\">return</span> <span class=\"k\">this</span>.<span class=\"i\">fox</span>; }\n}\n\n<span class=\"k\">var</span> <span class=\"i\">zoo</span> = <span class=\"t\">Zoo</span>();\n<span class=\"k\">var</span> <span class=\"i\">sum</span> = <span class=\"n\">0</span>;\n<span class=\"k\">var</span> <span class=\"i\">start</span> = <span class=\"i\">clock</span>();\n<span class=\"k\">while</span> (<span class=\"i\">sum</span> &lt; <span class=\"n\">100000000</span>) {\n  <span class=\"i\">sum</span> = <span class=\"i\">sum</span> + <span class=\"i\">zoo</span>.<span class=\"i\">ant</span>()\n            + <span class=\"i\">zoo</span>.<span class=\"i\">banana</span>()\n            + <span class=\"i\">zoo</span>.<span class=\"i\">tuna</span>()\n            + <span class=\"i\">zoo</span>.<span class=\"i\">hay</span>()\n            + <span class=\"i\">zoo</span>.<span class=\"i\">grass</span>()\n            + <span class=\"i\">zoo</span>.<span class=\"i\">mouse</span>();\n}\n\n<span class=\"k\">print</span> <span class=\"i\">clock</span>() - <span class=\"i\">start</span>;\n<span class=\"k\">print</span> <span class=\"i\">sum</span>;\n</pre></div>\n<aside name=\"sum\" class=\"bottom\">\n<p>Another thing this benchmark is careful to do is <em>use</em> the result of the code it\nexecutes. By calculating a rolling sum and printing the result, we ensure the VM\n<em>must</em> execute all that Lox code. This is an important habit. 
Unlike our simple\nLox VM, many compilers do aggressive dead code elimination and are smart enough\nto discard a computation whose result is never used.</p>\n<p>Many a programming language hacker has been impressed by the blazing performance\nof a VM on some benchmark, only to realize that it&rsquo;s because the compiler\noptimized the entire benchmark program away to nothing.</p>\n</aside>\n<p>If you&rsquo;ve never seen a benchmark before, this might seem ludicrous. <em>What</em> is\ngoing on here? The program itself doesn&rsquo;t intend to <span name=\"sum\">do</span>\nanything useful. What it does do is call a bunch of methods and access a bunch\nof fields since those are the parts of the language we&rsquo;re interested in. Fields\nand methods live in hash tables, so it takes care to populate at least a <span\nname=\"more\"><em>few</em></span> interesting keys in those tables. That is all wrapped\nin a big loop to ensure our profiler has enough execution time to dig in and see\nwhere the cycles are going.</p>\n<aside name=\"more\">\n<p>If you really want to benchmark hash table performance, you should use many\ntables of different sizes. The six keys we add to each table here aren&rsquo;t even\nenough to get over our hash table&rsquo;s eight-element minimum threshold. But I\ndidn&rsquo;t want to throw an enormous benchmark script at you. Feel free to add more\ncritters and treats if you like.</p>\n</aside>\n<p>Before I tell you what my profiler showed me, spend a minute taking a few\nguesses. Where in clox&rsquo;s codebase do you think the VM spent most of its time? Is\nthere any code we&rsquo;ve written in previous chapters that you suspect is\nparticularly slow?</p>\n<p>Here&rsquo;s what I found: Naturally, the function with the greatest inclusive time is\n<code>run()</code>. 
(<strong>Inclusive time</strong> means the total time spent in some function and all\nother functions it calls<span class=\"em\">&mdash;</span>the total time between when you enter the function\nand when it returns.) Since <code>run()</code> is the main bytecode execution loop, it\ndrives everything.</p>\n<p>Inside <code>run()</code>, there are small chunks of time sprinkled in various cases in the\nbytecode switch for common instructions like <code>OP_POP</code>, <code>OP_RETURN</code>, and\n<code>OP_ADD</code>. The big heavy instructions are <code>OP_GET_GLOBAL</code> with 17% of the\nexecution time, <code>OP_GET_PROPERTY</code> at 12%, and <code>OP_INVOKE</code> which takes a whopping\n42% of the total running time.</p>\n<p>So we&rsquo;ve got three hotspots to optimize? Actually, no. Because it turns out\nthose three instructions spend almost all of their time inside calls to the same\nfunction: <code>tableGet()</code>. That function claims a whole 72% of the execution time\n(again, inclusive). Now, in a dynamically typed language, we expect to spend a\nfair bit of time looking stuff up in hash tables<span class=\"em\">&mdash;</span>it&rsquo;s sort of the price of\ndynamism. But, still, <em>wow.</em></p>\n<h3><a href=\"#slow-key-wrapping\" id=\"slow-key-wrapping\"><small>30&#8202;.&#8202;2&#8202;.&#8202;1</small>Slow key wrapping</a></h3>\n<p>If you take a look at <code>tableGet()</code>, you&rsquo;ll see it&rsquo;s mostly a wrapper around a\ncall to <code>findEntry()</code> where the actual hash table lookup happens. 
To refresh\nyour memory, here it is in full:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">static</span> <span class=\"t\">Entry</span>* <span class=\"i\">findEntry</span>(<span class=\"t\">Entry</span>* <span class=\"i\">entries</span>, <span class=\"t\">int</span> <span class=\"i\">capacity</span>,\n                        <span class=\"t\">ObjString</span>* <span class=\"i\">key</span>) {\n  <span class=\"t\">uint32_t</span> <span class=\"i\">index</span> = <span class=\"i\">key</span>-&gt;<span class=\"i\">hash</span> % <span class=\"i\">capacity</span>;\n  <span class=\"t\">Entry</span>* <span class=\"i\">tombstone</span> = <span class=\"a\">NULL</span>;\n\n  <span class=\"k\">for</span> (;;) {\n    <span class=\"t\">Entry</span>* <span class=\"i\">entry</span> = &amp;<span class=\"i\">entries</span>[<span class=\"i\">index</span>];\n    <span class=\"k\">if</span> (<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> == <span class=\"a\">NULL</span>) {\n      <span class=\"k\">if</span> (<span class=\"a\">IS_NIL</span>(<span class=\"i\">entry</span>-&gt;<span class=\"i\">value</span>)) {\n        <span class=\"c\">// Empty entry.</span>\n        <span class=\"k\">return</span> <span class=\"i\">tombstone</span> != <span class=\"a\">NULL</span> ? 
<span class=\"i\">tombstone</span> : <span class=\"i\">entry</span>;\n      } <span class=\"k\">else</span> {\n        <span class=\"c\">// We found a tombstone.</span>\n        <span class=\"k\">if</span> (<span class=\"i\">tombstone</span> == <span class=\"a\">NULL</span>) <span class=\"i\">tombstone</span> = <span class=\"i\">entry</span>;\n      }\n    } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">entry</span>-&gt;<span class=\"i\">key</span> == <span class=\"i\">key</span>) {\n      <span class=\"c\">// We found the key.</span>\n      <span class=\"k\">return</span> <span class=\"i\">entry</span>;\n    }\n\n    <span class=\"i\">index</span> = (<span class=\"i\">index</span> + <span class=\"n\">1</span>) % <span class=\"i\">capacity</span>;\n  }\n}\n</pre></div>\n<p>When running that previous benchmark<span class=\"em\">&mdash;</span>on my machine, at least<span class=\"em\">&mdash;</span>the VM spends\n70% of the total execution time on <em>one line</em> in this function. Any guesses as\nto which one? No? It&rsquo;s this:</p>\n<div class=\"codehilite\"><pre>  <span class=\"t\">uint32_t</span> <span class=\"i\">index</span> = <span class=\"i\">key</span>-&gt;<span class=\"i\">hash</span> % <span class=\"i\">capacity</span>;\n</pre></div>\n<p>That pointer dereference isn&rsquo;t the problem. It&rsquo;s the little <code>%</code>. It turns out\nthe modulo operator is <em>really</em> slow. Much slower than other <span\nname=\"division\">arithmetic</span> operators. 
Can we do something better?</p>\n<aside name=\"division\">\n<p>Pipelining makes it hard to talk about the performance of an individual CPU\ninstruction, but to give you a feel for things, division and modulo are about\n30-50 <em>times</em> slower than addition and subtraction on x86.</p>\n</aside>\n<p>In the general case, it&rsquo;s really hard to re-implement a fundamental arithmetic\noperator in user code in a way that&rsquo;s faster than what the CPU itself can do.\nAfter all, our C code ultimately compiles down to the CPU&rsquo;s own arithmetic\noperations. If there were tricks we could use to go faster, the chip would\nalready be using them.</p>\n<p>However, we can take advantage of the fact that we know more about our problem\nthan the CPU does. We use modulo here to take a key string&rsquo;s hash code and\nwrap it to fit within the bounds of the table&rsquo;s entry array. That array starts\nout at eight elements and grows by a factor of two each time. We know<span class=\"em\">&mdash;</span>and the\nCPU and C compiler do not<span class=\"em\">&mdash;</span>that our table&rsquo;s size is always a power of two.</p>\n<p>Because we&rsquo;re clever bit twiddlers, we know a faster way to calculate the\nremainder of a number modulo a power of two: <strong>bit masking</strong>. Let&rsquo;s say we want\nto calculate 229 modulo 64. The answer is 37, which is not particularly apparent\nin decimal, but is clearer when you view those numbers in binary:</p><img src=\"image/optimization/mask.png\" alt=\"The bit patterns resulting from 229 % 64 = 37 and 229 &amp; 63 = 37.\" />\n<p>On the left side of the illustration, notice how the result (37) is simply the\ndividend (229) with the highest two bits shaved off? 
Those two highest bits are\nthe bits at or to the left of the divisor&rsquo;s single 1 bit.</p>\n<p>On the right side, we get the same result by taking 229 and bitwise <span\nclass=\"small-caps\">AND</span>-ing it with 63, which is one less than our\noriginal power of two divisor. Subtracting one from a power of two gives you a\nseries of 1 bits. That is exactly the mask we need in order to strip out those\ntwo leftmost bits.</p>\n<p>In other words, you can calculate a number modulo any power of two by simply\n<span class=\"small-caps\">AND</span>-ing it with that power of two minus one. I&rsquo;m\nnot enough of a mathematician to <em>prove</em> to you that this works, but if you\nthink it through, it should make sense. We can replace that slow modulo operator\nwith a very fast decrement and bitwise <span class=\"small-caps\">AND</span>. We\nsimply change the offending line of code to this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static Entry* findEntry(Entry* entries, int capacity,\n                        ObjString* key) {\n</pre><div class=\"source-file\"><em>table.c</em><br>\nin <em>findEntry</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"t\">uint32_t</span> <span class=\"i\">index</span> = <span class=\"i\">key</span>-&gt;<span class=\"i\">hash</span> &amp; (<span class=\"i\">capacity</span> - <span class=\"n\">1</span>);\n</pre><pre class=\"insert-after\">  Entry* tombstone = NULL;\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, in <em>findEntry</em>(), replace 1 line</div>\n\n<p>CPUs love bitwise operators, so it&rsquo;s hard to <span name=\"sub\">improve</span> on that. </p>\n<aside name=\"sub\">\n<p>Another potential improvement is to eliminate the decrement by storing the bit\nmask directly instead of the capacity. In my tests, that didn&rsquo;t make a\ndifference. 
Instruction pipelining makes some operations essentially free if the\nCPU is bottlenecked elsewhere.</p>\n</aside>\n<p>Our linear probing search may need to wrap around the end of the array, so there\nis another modulo in <code>findEntry()</code> to update.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      // We found the key.\n      return entry;\n    }\n\n</pre><div class=\"source-file\"><em>table.c</em><br>\nin <em>findEntry</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"i\">index</span> = (<span class=\"i\">index</span> + <span class=\"n\">1</span>) &amp; (<span class=\"i\">capacity</span> - <span class=\"n\">1</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, in <em>findEntry</em>(), replace 1 line</div>\n\n<p>This line didn&rsquo;t show up in the profiler since most searches don&rsquo;t wrap.</p>\n<p>The <code>findEntry()</code> function has a sister function, <code>tableFindString()</code> that does\na hash table lookup for interning strings. We may as well apply the same\noptimizations there too. This function is called only when interning strings,\nwhich wasn&rsquo;t heavily stressed by our benchmark. 
But a Lox program that created\nlots of strings might noticeably benefit from this change.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  if (table-&gt;count == 0) return NULL;\n\n</pre><div class=\"source-file\"><em>table.c</em><br>\nin <em>tableFindString</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"t\">uint32_t</span> <span class=\"i\">index</span> = <span class=\"i\">hash</span> &amp; (<span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span> - <span class=\"n\">1</span>);\n</pre><pre class=\"insert-after\">  for (;;) {\n    Entry* entry = &amp;table-&gt;entries[index];\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, in <em>tableFindString</em>(), replace 1 line</div>\n\n<p>And also when the linear probing wraps around.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return entry-&gt;key;\n    }\n\n</pre><div class=\"source-file\"><em>table.c</em><br>\nin <em>tableFindString</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"i\">index</span> = (<span class=\"i\">index</span> + <span class=\"n\">1</span>) &amp; (<span class=\"i\">table</span>-&gt;<span class=\"i\">capacity</span> - <span class=\"n\">1</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>table.c</em>, in <em>tableFindString</em>(), replace 1 line</div>\n\n<p>Let&rsquo;s see if our fixes were worth it. I tweaked that zoological benchmark to\ncount how many <span name=\"batch\">batches</span> of 10,000 calls it can run in\nten seconds. More batches equals faster performance. On my machine using the\nunoptimized code, the benchmark gets through 3,192 batches. After this\noptimization, that jumps to 6,249.</p><img src=\"image/optimization/hash-chart.png\" alt=\"Bar chart comparing the performance before and after the optimization.\" />\n<p>That&rsquo;s almost exactly twice as much work in the same amount of time. 
We made the\nVM twice as fast (usual caveat: on this benchmark). That is a massive win when\nit comes to optimization. Usually you feel good if you can claw a few percentage\npoints here or there. Since methods, fields, and global variables are so\nprevalent in Lox programs, this tiny optimization improves performance across\nthe board. Almost every Lox program benefits.</p>\n<aside name=\"batch\">\n<p>Our original benchmark fixed the amount of <em>work</em> and then measured the <em>time</em>.\nChanging the script to count how many batches of calls it can do in ten seconds\nfixes the time and measures the work. For performance comparisons, I like the\nlatter measure because the reported number represents <em>speed</em>. You can directly\ncompare the numbers before and after an optimization. When measuring execution\ntime, you have to do a little arithmetic to get to a good relative measure of\nperformance.</p>\n</aside>\n<p>Now, the point of this section is <em>not</em> that the modulo operator is profoundly\nevil and you should stamp it out of every program you ever write. Nor is it that\nmicro-optimization is a vital engineering skill. It&rsquo;s rare that a performance\nproblem has such a narrow, effective solution. We got lucky.</p>\n<p>The point is that we didn&rsquo;t <em>know</em> that the modulo operator was a performance\ndrain until our profiler told us so. If we had wandered around our VM&rsquo;s codebase\nblindly guessing at hotspots, we likely wouldn&rsquo;t have noticed it. What I want\nyou to take away from this is how important it is to have a profiler in your\ntoolbox.</p>\n<p>To reinforce that point, let&rsquo;s go ahead and run the original benchmark in our\nnow-optimized VM and see what the profiler shows us. On my machine, <code>tableGet()</code>\nis still a fairly large chunk of execution time. That&rsquo;s to be expected for a\ndynamically typed language. But it has dropped from 72% of the total execution\ntime down to 35%. 
That&rsquo;s much more in line with what we&rsquo;d like to see and shows\nthat our optimization didn&rsquo;t just make the program faster, but made it faster\n<em>in the way we expected</em>. Profilers are as useful for verifying solutions as\nthey are for discovering problems.</p>\n<h2><a href=\"#nan-boxing\" id=\"nan-boxing\"><small>30&#8202;.&#8202;3</small>NaN Boxing</a></h2>\n<p>This next optimization has a very different feel. Thankfully, despite the odd\nname, it does not involve punching your grandmother. It&rsquo;s different, but not,\nlike, <em>that</em> different. With our previous optimization, the profiler told us\nwhere the problem was, and we merely had to use some ingenuity to come up with a\nsolution.</p>\n<p>This optimization is more subtle, and its performance effects more scattered\nacross the virtual machine. The profiler won&rsquo;t help us come up with this.\nInstead, it was invented by <span name=\"someone\">someone</span> thinking deeply\nabout the lowest levels of machine architecture.</p>\n<aside name=\"someone\">\n<p>I&rsquo;m not sure who first came up with this trick. The earliest source I can find\nis David Gudeman&rsquo;s 1993 paper &ldquo;Representing Type Information in Dynamically\nTyped Languages&rdquo;. Everyone else cites that. But Gudeman himself says the paper\nisn&rsquo;t novel work, but instead &ldquo;gathers together a body of folklore&rdquo;.</p>\n<p>Maybe the inventor has been lost to the mists of time, or maybe it&rsquo;s been\nreinvented a number of times. Anyone who ruminates on IEEE 754 long enough\nprobably starts thinking about trying to stuff something useful into all those\nunused NaN bits.</p>\n</aside>\n<p>Like the heading says, this optimization is called <strong>NaN boxing</strong> or sometimes\n<strong>NaN tagging</strong>. Personally I like the latter name because &ldquo;boxing&rdquo; tends to imply\nsome kind of heap-allocated representation, but the former seems to be the more\nwidely used term. 
This technique changes how we represent values in the VM.</p>\n<p>On a 64-bit machine, our Value type takes up 16 bytes. The struct has two\nfields, a type tag and a union for the payload. The largest fields in the union\nare an Obj pointer and a double, which are both 8 bytes. To keep the union field\naligned to an 8-byte boundary, the compiler adds padding after the tag too:</p><img src=\"image/optimization/union.png\" alt=\"Byte layout of the 16-byte tagged union Value.\" />\n<p>That&rsquo;s pretty big. If we could cut that down, then the VM could pack more values\ninto the same amount of memory. Most computers have plenty of RAM these days, so\nthe direct memory savings aren&rsquo;t a huge deal. But a smaller representation means\nmore Values fit in a cache line. That means fewer cache misses, which affects\n<em>speed</em>.</p>\n<p>If Values need to be aligned to their largest payload size, and a Lox number or\nObj pointer needs a full 8 bytes, how can we get any smaller? In a dynamically\ntyped language like Lox, each value needs to carry not just its payload, but\nenough additional information to determine the value&rsquo;s type at runtime. If a Lox\nnumber is already using the full 8 bytes, where could we squirrel away a couple\nof extra bits to tell the runtime &ldquo;this is a number&rdquo;?</p>\n<p>This is one of the perennial problems for dynamic language hackers. It\nparticularly bugs them because statically typed languages don&rsquo;t generally have\nthis problem. The type of each value is known at compile time, so no extra\nmemory is needed at runtime to track it. When your C compiler compiles a 32-bit\nint, the resulting variable gets <em>exactly</em> 32 bits of storage.</p>\n<p>Dynamic language folks hate losing ground to the static camp, so they&rsquo;ve come up\nwith a number of very clever ways to pack type information and a payload into a\nsmall number of bits. NaN boxing is one of those. 
It&rsquo;s a particularly good fit\nfor languages like JavaScript and Lua, where all numbers are double-precision\nfloating point. Lox is in that same boat.</p>\n<h3><a href=\"#what-is-and-is-not-a-number\" id=\"what-is-and-is-not-a-number\"><small>30&#8202;.&#8202;3&#8202;.&#8202;1</small>What is (and is not) a number?</a></h3>\n<p>Before we start optimizing, we need to really understand how our friend the CPU\nrepresents floating-point numbers. Almost all machines today use the same\nscheme, encoded in the venerable scroll <a href=\"https://en.wikipedia.org/wiki/IEEE_754\">IEEE 754</a>, known to mortals as the\n&ldquo;IEEE Standard for Floating-Point Arithmetic&rdquo;.</p>\n<p>In the eyes of your computer, a <span name=\"hyphen\">64-bit</span>,\ndouble-precision, IEEE floating-point number looks like this:</p>\n<aside name=\"hyphen\">\n<p>That&rsquo;s a lot of hyphens for one sentence.</p>\n</aside><img src=\"image/optimization/double.png\" alt=\"Bit representation of an IEEE 754 double.\" />\n<ul>\n<li>\n<p>Starting from the right, the first 52 bits are the <strong>fraction</strong>,\n<strong>mantissa</strong>, or <strong>significand</strong> bits. They represent the significant digits\nof the number, as a binary integer.</p>\n</li>\n<li>\n<p>Next to that are 11 <strong>exponent</strong> bits. These tell you how far the mantissa\nis shifted away from the decimal (well, binary) point.</p>\n</li>\n<li>\n<p>The highest bit is the <span name=\"sign\"><strong>sign bit</strong></span>, which\nindicates whether the number is positive or negative.</p>\n</li>\n</ul>\n<p>I know that&rsquo;s a little vague, but this chapter isn&rsquo;t a deep dive on\nfloating point representation. 
If you want to know how the exponent and mantissa\nplay together, there are already better explanations out there than I could\nwrite.</p>\n<aside name=\"sign\">\n<p>Since the sign bit is always present, even if the number is zero, that implies\nthat &ldquo;positive zero&rdquo; and &ldquo;negative zero&rdquo; have different bit representations, and\nindeed, IEEE 754 does distinguish those.</p>\n</aside>\n<p>The important part for our purposes is that the spec carves out a special case\nexponent. When all of the exponent bits are set, then instead of just\nrepresenting a really big number, the value has a different meaning. With one\nexception, these values are &ldquo;Not a Number&rdquo; (hence, <strong>NaN</strong>) values. They represent\nthe result of undefined operations, like dividing zero by zero. The exception is\nan all-zero mantissa, which encodes infinity.</p>\n<p><em>Any</em> double whose exponent bits are all set and whose mantissa bits are not\nall zero is a NaN. That means there&rsquo;s lots and lots of <em>different</em> NaN bit patterns.\nIEEE 754 divides those into two categories. Values where the highest mantissa\nbit is 0 are called <strong>signalling NaNs</strong>, and the others are <strong>quiet NaNs</strong>.\nSignalling NaNs are intended to flag erroneous data, like a value that was never\ninitialized. A chip <span name=\"abort\">may</span> detect when one of these\nvalues is used in a computation and abort a program completely. They may self-destruct if you\ntry to read one.</p>\n<aside name=\"abort\">\n<p>I don&rsquo;t know if any CPUs actually <em>do</em> trap signalling NaNs and abort. The spec\njust says they <em>could</em>.</p>\n</aside>\n<p>Quiet NaNs are supposed to be safer to use. They don&rsquo;t represent useful numeric\nvalues, but they should at least not set your hand on fire if you touch them.</p>\n<p>Every double with all of its exponent bits set and its highest mantissa bit set\nis a quiet NaN. That leaves 52 bits unaccounted for. 
We&rsquo;ll avoid one of those so\nthat we don&rsquo;t step on Intel&rsquo;s &ldquo;QNaN Floating-Point Indefinite&rdquo; value, leaving us\n51 bits. Those remaining bits can be anything. We&rsquo;re talking\n2,251,799,813,685,248 unique quiet NaN bit patterns.</p><img src=\"image/optimization/nan.png\" alt=\"The bits in a double that make it a quiet NaN.\" />\n<p>This means a 64-bit double has enough room to store all of the various different\nnumeric floating-point values and <em>also</em> has room for another 51 bits of data\nthat we can use however we want. That&rsquo;s plenty of room to set aside a couple of\nbit patterns to represent Lox&rsquo;s <code>nil</code>, <code>true</code>, and <code>false</code> values. But what\nabout Obj pointers? Don&rsquo;t pointers need a full 64 bits too?</p>\n<p>Fortunately, we have another trick up our other sleeve. Yes, technically\npointers on a 64-bit architecture are 64 bits. But, no architecture I know of\nactually uses that entire address space. Instead, most widely used chips today\nonly ever use the low <span name=\"48\">48</span> bits. The remaining 16 bits are\neither unspecified or always zero.</p>\n<aside name=\"48\">\n<p>48 bits is enough to address 262,144 gigabytes of memory. Modern operating\nsystems also give each process its own address space, so that should be plenty.</p>\n</aside>\n<p>If we&rsquo;ve got 51 bits, we can stuff a 48-bit pointer in there with three bits to\nspare. Those three bits are just enough to store tiny type tags to distinguish\nbetween <code>nil</code>, Booleans, and Obj pointers.</p>\n<p>That&rsquo;s NaN boxing. Within a single 64-bit double, you can store all of the\ndifferent floating-point numeric values, a pointer, or any of a couple of other\nspecial sentinel values. 
Half the memory usage of our current Value struct,\nwhile retaining all of the fidelity.</p>\n<p>What&rsquo;s particularly nice about this representation is that there is no need to\n<em>convert</em> a numeric double value into a &ldquo;boxed&rdquo; form. Lox numbers <em>are</em> just\nnormal, 64-bit doubles. We still need to <em>check</em> their type before we use them,\nsince Lox is dynamically typed, but we don&rsquo;t need to do any bit shifting or\npointer indirection to go from &ldquo;value&rdquo; to &ldquo;number&rdquo;.</p>\n<p>For the other value types, there is a conversion step, of course. But,\nfortunately, our VM hides all of the mechanism to go from values to raw types\nbehind a handful of macros. Rewrite those to implement NaN boxing, and the rest\nof the VM should just work.</p>\n<h3><a href=\"#conditional-support\" id=\"conditional-support\"><small>30&#8202;.&#8202;3&#8202;.&#8202;2</small>Conditional support</a></h3>\n<p>I know the details of this new representation aren&rsquo;t clear in your head yet.\nDon&rsquo;t worry, they will crystallize as we work through the implementation. Before\nwe get to that, we&rsquo;re going to put some compile-time scaffolding in place.</p>\n<p>For our previous optimization, we rewrote the previous slow code and called it\ndone. This one is a little different. NaN boxing relies on some very low-level\ndetails of how a chip represents floating-point numbers and pointers. It\n<em>probably</em> works on most CPUs you&rsquo;re likely to encounter, but you can never be\ntotally sure.</p>\n<p>It would suck if our VM completely lost support for an architecture just because\nof its value representation. To avoid that, we&rsquo;ll maintain support for <em>both</em>\nthe old tagged union implementation of Value and the new NaN-boxed form. 
We\nselect which representation we want at compile time using this flag:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &lt;stdint.h&gt;\n\n</pre><div class=\"source-file\"><em>common.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define NAN_BOXING</span>\n</pre><pre class=\"insert-after\">#define DEBUG_PRINT_CODE\n</pre></div>\n<div class=\"source-file-narrow\"><em>common.h</em></div>\n\n<p>If that&rsquo;s defined, the VM uses the new form. Otherwise, it reverts to the old\nstyle. The few pieces of code that care about the details of the value\nrepresentation<span class=\"em\">&mdash;</span>mainly the handful of macros for wrapping and unwrapping\nValues<span class=\"em\">&mdash;</span>vary based on whether this flag is set. The rest of the VM can\ncontinue along its merry way.</p>\n<p>Most of the work happens in the &ldquo;value&rdquo; module where we add a section for the\nnew type.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef struct ObjString ObjString;\n\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#ifdef NAN_BOXING</span>\n\n<span class=\"k\">typedef</span> <span class=\"t\">uint64_t</span> <span class=\"t\">Value</span>;\n\n<span class=\"a\">#else</span>\n\n</pre><pre class=\"insert-after\">typedef enum {\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>When NaN boxing is enabled, the actual type of a Value is a flat, unsigned\n64-bit integer. We could use double instead, which would make the macros for\ndealing with Lox numbers a little simpler. But all of the other macros need to\ndo bitwise operations and uint64_t is a much friendlier type for that. 
Outside\nof this module, the rest of the VM doesn&rsquo;t really care one way or the other.</p>\n<p>Before we start re-implementing those macros, we close the <code>#else</code> branch of the\n<code>#ifdef</code> at the end of the definitions for the old representation.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define OBJ_VAL(object)   ((Value){VAL_OBJ, {.obj = (Obj*)object}})\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#endif</span>\n</pre><pre class=\"insert-after\">\n\ntypedef struct {\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>Our remaining task is simply to fill in that first <code>#ifdef</code> section with new\nimplementations of all the stuff already in the <code>#else</code> side. We&rsquo;ll work through\nit one value type at a time, from easiest to hardest.</p>\n<h3><a href=\"#numbers\" id=\"numbers\"><small>30&#8202;.&#8202;3&#8202;.&#8202;3</small>Numbers</a></h3>\n<p>We&rsquo;ll start with numbers since they have the most direct representation under\nNaN boxing. To &ldquo;convert&rdquo; a C double to a NaN-boxed clox Value, we don&rsquo;t need to\ntouch a single bit<span class=\"em\">&mdash;</span>the representation is exactly the same. But we do need to\nconvince our C compiler of that fact, which we made harder by defining Value to\nbe uint64_t.</p>\n<p>We need to get the compiler to take a set of bits that it thinks are a double\nand use those same bits as a uint64_t, or vice versa. This is called <strong>type\npunning</strong>. C and C++ programmers have been doing this since the days of bell\nbottoms and 8-tracks, but the language specifications have <span\nname=\"hesitate\">hesitated</span> to say which of the many ways to do this is\nofficially sanctioned.</p>\n<aside name=\"hesitate\" class=\"bottom\">\n<p>Spec authors don&rsquo;t like type punning because it makes optimization harder. 
A key\noptimization technique is reordering instructions to fill the CPU&rsquo;s execution\npipelines. A compiler can reorder code only when doing so doesn&rsquo;t have a\nuser-visible effect, obviously.</p>\n<p>Pointers make that harder. If two pointers point to the same value, then a write\nthrough one and a read through the other cannot be reordered. But what about two\npointers of <em>different</em> types? If those could point to the same object, then\nbasically <em>any</em> two pointers could be aliases to the same value. That\ndrastically limits the amount of code the compiler is free to rearrange.</p>\n<p>To avoid that, compilers want to assume <strong>strict aliasing</strong><span class=\"em\">&mdash;</span>pointers of\nincompatible types cannot point to the same value. Type punning, by nature,\nbreaks that assumption.</p>\n</aside>\n<p>I know one way to convert a <code>double</code> to <code>Value</code> and back that I believe is\nsupported by both the C and C++ specs. Unfortunately, it doesn&rsquo;t fit in a single\nexpression, so the conversion macros have to call out to helper functions.\nHere&rsquo;s the first macro:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef uint64_t Value;\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define NUMBER_VAL(num) numToValue(num)</span>\n</pre><pre class=\"insert-after\">\n\n#else\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>That macro passes the double here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define NUMBER_VAL(num) numToValue(num)\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">static</span> <span class=\"k\">inline</span> <span class=\"t\">Value</span> <span class=\"i\">numToValue</span>(<span class=\"t\">double</span> <span class=\"i\">num</span>) {\n  <span class=\"t\">Value</span> <span class=\"i\">value</span>;\n  <span 
class=\"i\">memcpy</span>(&amp;<span class=\"i\">value</span>, &amp;<span class=\"i\">num</span>, <span class=\"k\">sizeof</span>(<span class=\"t\">double</span>));\n  <span class=\"k\">return</span> <span class=\"i\">value</span>;\n}\n</pre><pre class=\"insert-after\">\n\n#else\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>I know, weird, right? The way to treat a series of bytes as having a different\ntype without changing their value at all is <code>memcpy()</code>? This looks horrendously\nslow: Create a local variable. Pass its address to a library function that\ncopies a few bytes. Then return the result, which is the exact same\nbytes as the input. Thankfully, because this <em>is</em> the supported idiom for type\npunning, most compilers recognize the pattern and optimize away the <code>memcpy()</code>\nentirely.</p>\n<p>&ldquo;Unwrapping&rdquo; a Lox number is the mirror image.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef uint64_t Value;\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define AS_NUMBER(value)    valueToNum(value)</span>\n</pre><pre class=\"insert-after\">\n\n#define NUMBER_VAL(num) numToValue(num)\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>That macro calls this function:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define NUMBER_VAL(num) numToValue(num)\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">static</span> <span class=\"k\">inline</span> <span class=\"t\">double</span> <span class=\"i\">valueToNum</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>) {\n  <span class=\"t\">double</span> <span class=\"i\">num</span>;\n  <span class=\"i\">memcpy</span>(&amp;<span class=\"i\">num</span>, &amp;<span class=\"i\">value</span>, <span class=\"k\">sizeof</span>(<span class=\"t\">Value</span>));\n  
<span class=\"k\">return</span> <span class=\"i\">num</span>;\n}\n</pre><pre class=\"insert-after\">\n\nstatic inline Value numToValue(double num) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>It works exactly the same except we swap the types. Again, the compiler will\neliminate all of it. Even though those calls to\n<code>memcpy()</code> will disappear, we still need to show the compiler <em>which</em> <code>memcpy()</code>\nwe&rsquo;re calling so we also need an <span name=\"union\">include</span>.</p>\n<aside name=\"union\" class=\"bottom\">\n<p>If you find yourself with a compiler that does not optimize the <code>memcpy()</code> away,\ntry this instead:</p>\n<div class=\"codehilite\"><pre><span class=\"t\">double</span> <span class=\"i\">valueToNum</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>) {\n  <span class=\"k\">union</span> {\n    <span class=\"t\">uint64_t</span> <span class=\"i\">bits</span>;\n    <span class=\"t\">double</span> <span class=\"i\">num</span>;\n  } <span class=\"i\">data</span>;\n  <span class=\"i\">data</span>.<span class=\"i\">bits</span> = <span class=\"i\">value</span>;\n  <span class=\"k\">return</span> <span class=\"i\">data</span>.<span class=\"i\">num</span>;\n}\n</pre></div>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define clox_value_h\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#include &lt;string.h&gt;</span>\n</pre><pre class=\"insert-after\">\n\n#include &quot;common.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>That was a lot of code to ultimately do nothing but silence the C type checker.\nDoing a runtime type <em>test</em> on a Lox number is a little more interesting. 
If all\nwe have are exactly the bits for a double, how do we tell that it <em>is</em> a double?\nIt&rsquo;s time to get bit twiddling.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef uint64_t Value;\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define IS_NUMBER(value)    (((value) &amp; QNAN) != QNAN)</span>\n</pre><pre class=\"insert-after\">\n\n#define AS_NUMBER(value)    valueToNum(value)\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>We know that every Value that is <em>not</em> a number will use a special quiet NaN\nrepresentation. And we presume we have correctly avoided any of the meaningful\nNaN representations that may actually be produced by doing arithmetic on\nnumbers.</p>\n<p>If the double has all of its NaN bits set, and the quiet NaN bit set, and one\nmore for good measure, we can be <span name=\"certain\">pretty certain</span> it\nis one of the bit patterns we ourselves have set aside for other types. To check\nthat, we mask out all of the bits except for our set of quiet NaN bits. If <em>all</em>\nof those bits are set, it must be a NaN-boxed value of some other Lox type.\nOtherwise, it is actually a number.</p>\n<aside name=\"certain\">\n<p>Pretty certain, but not strictly guaranteed. As far as I know, there is nothing\npreventing a CPU from producing a NaN value as the result of some operation\nwhose bit representation collides with ones we have claimed. 
But in my tests\nacross a number of architectures, I haven&rsquo;t seen it happen.</p>\n</aside>\n<p>The set of quiet NaN bits is declared like this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#ifdef NAN_BOXING\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define QNAN     ((uint64_t)0x7ffc000000000000)</span>\n</pre><pre class=\"insert-after\">\n\ntypedef uint64_t Value;\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>It would be nice if C supported binary literals. But if you do the conversion,\nyou&rsquo;ll see that value is the same as this:</p><img src=\"image/optimization/qnan.png\" alt=\"The quiet NaN bits.\" />\n<p>This is exactly all of the exponent bits, plus the quiet NaN bit, plus one extra\nto dodge that Intel value.</p>\n<h3><a href=\"#nil-true-and-false\" id=\"nil-true-and-false\"><small>30&#8202;.&#8202;3&#8202;.&#8202;4</small>Nil, true, and false</a></h3>\n<p>The next type to handle is <code>nil</code>. That&rsquo;s pretty simple since there&rsquo;s only one\n<code>nil</code> value and thus we need only a single bit pattern to represent it. There\nare two other singleton values, the two Booleans, <code>true</code> and <code>false</code>. This calls\nfor three total unique bit patterns.</p>\n<p>Two bits give us four different combinations, which is plenty. We claim the two\nlowest bits of our unused mantissa space as a &ldquo;type tag&rdquo; to determine which of\nthese three singleton values we&rsquo;re looking at. 
The three type tags are defined\nlike so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define QNAN     ((uint64_t)0x7ffc000000000000)\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define TAG_NIL   1 </span><span class=\"c\">// 01.</span>\n<span class=\"a\">#define TAG_FALSE 2 </span><span class=\"c\">// 10.</span>\n<span class=\"a\">#define TAG_TRUE  3 </span><span class=\"c\">// 11.</span>\n</pre><pre class=\"insert-after\">\n\ntypedef uint64_t Value;\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>Our representation of <code>nil</code> is thus all of the bits required to define our\nquiet NaN representation along with the <code>nil</code> type tag bits:</p><img src=\"image/optimization/nil.png\" alt=\"The bit representation of the nil value.\" />\n<p>In code, we build that value like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define AS_NUMBER(value)    valueToNum(value)\n\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define NIL_VAL         ((Value)(uint64_t)(QNAN | TAG_NIL))</span>\n</pre><pre class=\"insert-after\">#define NUMBER_VAL(num) numToValue(num)\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>We simply bitwise <span class=\"small-caps\">OR</span> the quiet NaN bits and the\ntype tag, and then do a little cast dance to teach the C compiler what we want\nthose bits to mean.</p>\n<p>Since <code>nil</code> has only a single bit representation, we can use equality on\nuint64_t to see if a Value is <code>nil</code>.</p>\n<p><span name=\"equal\"></span></p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef uint64_t Value;\n\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define IS_NIL(value)       ((value) == NIL_VAL)</span>\n</pre><pre class=\"insert-after\">#define IS_NUMBER(value)    
(((value) &amp; QNAN) != QNAN)\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>You can guess how we define the <code>true</code> and <code>false</code> values.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define AS_NUMBER(value)    valueToNum(value)\n\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define FALSE_VAL       ((Value)(uint64_t)(QNAN | TAG_FALSE))</span>\n<span class=\"a\">#define TRUE_VAL        ((Value)(uint64_t)(QNAN | TAG_TRUE))</span>\n</pre><pre class=\"insert-after\">#define NIL_VAL         ((Value)(uint64_t)(QNAN | TAG_NIL))\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>The bits look like this:</p><img src=\"image/optimization/bools.png\" alt=\"The bit representation of the true and false values.\" />\n<p>To convert a C bool into a Lox Boolean, we rely on these two singleton values\nand the good old conditional operator.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define AS_NUMBER(value)    valueToNum(value)\n\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define BOOL_VAL(b)     ((b) ? TRUE_VAL : FALSE_VAL)</span>\n</pre><pre class=\"insert-after\">#define FALSE_VAL       ((Value)(uint64_t)(QNAN | TAG_FALSE))\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>There&rsquo;s probably a cleverer bitwise way to do this, but my hunch is that the\ncompiler can figure one out faster than I can. 
Going the other direction is\nsimpler.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define IS_NUMBER(value)    (((value) &amp; QNAN) != QNAN)\n\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define AS_BOOL(value)      ((value) == TRUE_VAL)</span>\n</pre><pre class=\"insert-after\">#define AS_NUMBER(value)    valueToNum(value)\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>Since we know there are exactly two Boolean bit representations in Lox<span class=\"em\">&mdash;</span>unlike\nin C where any non-zero value can be considered &ldquo;true&rdquo;<span class=\"em\">&mdash;</span>if it ain&rsquo;t <code>true</code>, it\nmust be <code>false</code>. This macro does assume you call it only on a Value that you\nknow <em>is</em> a Lox Boolean. To check that, there&rsquo;s one more macro.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef uint64_t Value;\n\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define IS_BOOL(value)      (((value) | 1) == TRUE_VAL)</span>\n</pre><pre class=\"insert-after\">#define IS_NIL(value)       ((value) == NIL_VAL)\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>That looks a little strange. A more obvious macro would look like this:</p>\n<div class=\"codehilite\"><pre><span class=\"a\">#define IS_BOOL(v) ((v) == TRUE_VAL || (v) == FALSE_VAL)</span>\n</pre></div>\n<p>Unfortunately, that&rsquo;s not safe. The expansion mentions <code>v</code> twice, which means if\nthat expression has any side effects, they will be executed twice. We could have\nthe macro call out to a separate function, but, ugh, what a chore.</p>\n<p>Instead, we bitwise <span class=\"small-caps\">OR</span> a 1 onto the value to\nmerge the only two valid Boolean bit patterns. 
That leaves three potential\nstates the value can be in:</p>\n<ol>\n<li>\n<p>It was <code>FALSE_VAL</code> and has now been converted to <code>TRUE_VAL</code>.</p>\n</li>\n<li>\n<p>It was <code>TRUE_VAL</code> and the <code>| 1</code> did nothing and it&rsquo;s still <code>TRUE_VAL</code>.</p>\n</li>\n<li>\n<p>It&rsquo;s some other, non-Boolean value.</p>\n</li>\n</ol>\n<p>At that point, we can simply compare the result to <code>TRUE_VAL</code> to see if we&rsquo;re\nin the first two states or the third.</p>\n<h3><a href=\"#objects\" id=\"objects\"><small>30&#8202;.&#8202;3&#8202;.&#8202;5</small>Objects</a></h3>\n<p>The last value type is the hardest. Unlike the singleton values, there are\nbillions of different pointer values we need to box inside a NaN. This means we\nneed both some kind of tag to indicate that these particular NaNs <em>are</em> Obj\npointers, and room for the addresses themselves.</p>\n<p>The tag bits we used for the singleton values are in the region where I decided\nto store the pointer itself, so we can&rsquo;t easily use a different <span\nname=\"ptr\">bit</span> there to indicate that the value is an object reference.\nHowever, there is another bit we aren&rsquo;t using. Since all our NaN values are not\nnumbers<span class=\"em\">&mdash;</span>it&rsquo;s right there in the name<span class=\"em\">&mdash;</span>the sign bit isn&rsquo;t used for anything.\nWe&rsquo;ll go ahead and use that as the type tag for objects. If one of our quiet\nNaNs has its sign bit set, then it&rsquo;s an Obj pointer. Otherwise, it must be one\nof the previous singleton values.</p>\n<aside name=\"ptr\">\n<p>We actually <em>could</em> use the lowest bits to store the type tag even when the\nvalue is an Obj pointer. That&rsquo;s because Obj pointers are always aligned to an\n8-byte boundary since Obj contains a 64-bit field. That, in turn, implies that\nthe three lowest bits of an Obj pointer will always be zero. 
We could store\nwhatever we wanted in there and just mask it off before dereferencing the\npointer.</p>\n<p>This is another value representation optimization called <strong>pointer tagging</strong>.</p>\n</aside>\n<p>If the sign bit is set, then the remaining low bits store the pointer to the\nObj:</p><img src=\"image/optimization/obj.png\" alt=\"Bit representation of an Obj* stored in a Value.\" />\n<p>To convert a raw Obj pointer to a Value, we take the pointer and set all of the\nquiet NaN bits and the sign bit.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define NUMBER_VAL(num) numToValue(num)\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define OBJ_VAL(obj) \\</span>\n<span class=\"a\">    (Value)(SIGN_BIT | QNAN | (uint64_t)(uintptr_t)(obj))</span>\n</pre><pre class=\"insert-after\">\n\nstatic inline double valueToNum(Value value) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>The pointer itself is a full 64 bits, and in <span name=\"safe\">principle</span>,\nit could thus overlap with some of those quiet NaN and sign bits. But in\npractice, at least on the architectures I&rsquo;ve tested, everything above the 48th\nbit in a pointer is always zero. There&rsquo;s a lot of casting going on here, which\nI&rsquo;ve found is necessary to satisfy some of the pickiest C compilers, but the\nend result is just jamming some bits together.</p>\n<aside name=\"safe\">\n<p>I try to follow the letter of the law when it comes to the code in this book, so\nthis paragraph is dubious. There comes a point when optimizing where you push\nthe boundary of not just what the <em>spec says</em> you can do, but what a real\ncompiler and chip let you get away with.</p>\n<p>There are risks when stepping outside of the spec, but there are rewards in that\nlawless territory too. 
It&rsquo;s up to you to decide if the gains are worth it.</p>\n</aside>\n<p>We define the sign bit like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#ifdef NAN_BOXING\n\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define SIGN_BIT ((uint64_t)0x8000000000000000)</span>\n</pre><pre class=\"insert-after\">#define QNAN     ((uint64_t)0x7ffc000000000000)\n\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>To get the Obj pointer back out, we simply mask off all of those extra bits.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define AS_NUMBER(value)    valueToNum(value)\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define AS_OBJ(value) \\</span>\n<span class=\"a\">    ((Obj*)(uintptr_t)((value) &amp; ~(SIGN_BIT | QNAN)))</span>\n</pre><pre class=\"insert-after\">\n\n#define BOOL_VAL(b)     ((b) ? TRUE_VAL : FALSE_VAL)\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>The tilde (<code>~</code>), if you haven&rsquo;t done enough bit manipulation to encounter it\nbefore, is bitwise <span class=\"small-caps\">NOT</span>. It toggles all ones and\nzeroes in its operand. 
By masking the value with the bitwise negation of the\nquiet NaN and sign bits, we <em>clear</em> those bits and let the pointer bits remain.</p>\n<p>One last macro:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define IS_NUMBER(value)    (((value) &amp; QNAN) != QNAN)\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define IS_OBJ(value) \\</span>\n<span class=\"a\">    (((value) &amp; (QNAN | SIGN_BIT)) == (QNAN | SIGN_BIT))</span>\n</pre><pre class=\"insert-after\">\n\n#define AS_BOOL(value)      ((value) == TRUE_VAL)\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>A Value storing an Obj pointer has its sign bit set, but so does any negative\nnumber. To tell if a Value is an Obj pointer, we need to check that both the\nsign bit and all of the quiet NaN bits are set. This is similar to how we detect\nthe type of the singleton values, except this time we use the sign bit as the\ntag.</p>\n<h3><a href=\"#value-functions\" id=\"value-functions\"><small>30&#8202;.&#8202;3&#8202;.&#8202;6</small>Value functions</a></h3>\n<p>The rest of the VM usually goes through the macros when working with Values, so\nwe are almost done. However, there are a couple of functions in the &ldquo;value&rdquo;\nmodule that peek inside the otherwise black box of Value and work with its\nencoding directly. We need to fix those too.</p>\n<p>The first is <code>printValue()</code>. It has separate code for each value type. 
We no\nlonger have an explicit type enum we can switch on, so instead we use a series\nof type tests to handle each kind of value.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void printValue(Value value) {\n</pre><div class=\"source-file\"><em>value.c</em><br>\nin <em>printValue</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#ifdef NAN_BOXING</span>\n  <span class=\"k\">if</span> (<span class=\"a\">IS_BOOL</span>(<span class=\"i\">value</span>)) {\n    <span class=\"i\">printf</span>(<span class=\"a\">AS_BOOL</span>(<span class=\"i\">value</span>) ? <span class=\"s\">&quot;true&quot;</span> : <span class=\"s\">&quot;false&quot;</span>);\n  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"a\">IS_NIL</span>(<span class=\"i\">value</span>)) {\n    <span class=\"i\">printf</span>(<span class=\"s\">&quot;nil&quot;</span>);\n  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"a\">IS_NUMBER</span>(<span class=\"i\">value</span>)) {\n    <span class=\"i\">printf</span>(<span class=\"s\">&quot;%g&quot;</span>, <span class=\"a\">AS_NUMBER</span>(<span class=\"i\">value</span>));\n  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"a\">IS_OBJ</span>(<span class=\"i\">value</span>)) {\n    <span class=\"i\">printObject</span>(<span class=\"i\">value</span>);\n  }\n<span class=\"a\">#else</span>\n</pre><pre class=\"insert-after\">  switch (value.type) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, in <em>printValue</em>()</div>\n\n<p>This is technically a tiny bit slower than a switch, but compared to the\noverhead of actually writing to a stream, it&rsquo;s negligible.</p>\n<p>We still support the original tagged union representation, so we keep the old\ncode and enclose it in the <code>#else</code> conditional section.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  }\n</pre><div class=\"source-file\"><em>value.c</em><br>\nin 
<em>printValue</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#endif</span>\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, in <em>printValue</em>()</div>\n\n<p>The other operation is testing two values for equality.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">bool valuesEqual(Value a, Value b) {\n</pre><div class=\"source-file\"><em>value.c</em><br>\nin <em>valuesEqual</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#ifdef NAN_BOXING</span>\n  <span class=\"k\">return</span> <span class=\"i\">a</span> == <span class=\"i\">b</span>;\n<span class=\"a\">#else</span>\n</pre><pre class=\"insert-after\">  if (a.type != b.type) return false;\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, in <em>valuesEqual</em>()</div>\n\n<p>It doesn&rsquo;t get much simpler than that! If the two bit representations are\nidentical, the values are equal. That does the right thing for the singleton\nvalues since each has a unique bit representation and they are only equal to\nthemselves. It also does the right thing for Obj pointers, since objects use\nidentity for equality<span class=\"em\">&mdash;</span>two Obj references are equal only if they point to the\nexact same object.</p>\n<p>It&rsquo;s <em>mostly</em> correct for numbers too. Most floating-point numbers with\ndifferent bit representations are distinct numeric values. Alas, IEEE 754\ncontains a pothole to trip us up. For reasons that aren&rsquo;t entirely clear to me,\nthe spec mandates that NaN values are <em>not</em> equal to <em>themselves</em>. This isn&rsquo;t a\nproblem for the special quiet NaNs that we are using for our own purposes. But\nit&rsquo;s possible to produce a &ldquo;real&rdquo; arithmetic NaN in Lox, and if we want to\ncorrectly implement IEEE 754 numbers, then the resulting value is not supposed\nto be equal to itself. 
More concretely:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">nan</span> = <span class=\"n\">0</span>/<span class=\"n\">0</span>;\n<span class=\"k\">print</span> <span class=\"i\">nan</span> == <span class=\"i\">nan</span>;\n</pre></div>\n<p>IEEE 754 says this program is supposed to print &ldquo;false&rdquo;. It does the right thing\nwith our old tagged union representation because the <code>VAL_NUMBER</code> case applies\n<code>==</code> to two values that the C compiler knows are doubles. Thus the compiler\ngenerates the right CPU instruction to perform an IEEE floating-point equality.</p>\n<p>Our new representation breaks that by defining Value to be a uint64_t. If we\nwant to be <em>fully</em> compliant with IEEE 754, we need to handle this case.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#ifdef NAN_BOXING\n</pre><div class=\"source-file\"><em>value.c</em><br>\nin <em>valuesEqual</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"a\">IS_NUMBER</span>(<span class=\"i\">a</span>) &amp;&amp; <span class=\"a\">IS_NUMBER</span>(<span class=\"i\">b</span>)) {\n    <span class=\"k\">return</span> <span class=\"a\">AS_NUMBER</span>(<span class=\"i\">a</span>) == <span class=\"a\">AS_NUMBER</span>(<span class=\"i\">b</span>);\n  }\n</pre><pre class=\"insert-after\">  return a == b;\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, in <em>valuesEqual</em>()</div>\n\n<p>I know, it&rsquo;s weird. And there is a performance cost to doing this type test\nevery time we check two Lox values for equality. If we are willing to sacrifice\na little <span name=\"java\">compatibility</span><span class=\"em\">&mdash;</span>who <em>really</em> cares if NaN is\nnot equal to itself?<span class=\"em\">&mdash;</span>we could leave this off. I&rsquo;ll leave it up to you to\ndecide how pedantic you want to be.</p>\n<aside name=\"java\">\n<p>In fact, jlox gets NaN equality wrong. 
Java does the right thing when you\ncompare primitive doubles using <code>==</code>, but not if you box those to Double or\nObject and compare them using <code>equals()</code>, which is how jlox implements equality.</p>\n</aside>\n<p>Finally, we close the conditional compilation section around the old\nimplementation.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  }\n</pre><div class=\"source-file\"><em>value.c</em><br>\nin <em>valuesEqual</em>()</div>\n<pre class=\"insert\"><span class=\"a\">#endif</span>\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, in <em>valuesEqual</em>()</div>\n\n<p>And that&rsquo;s it. This optimization is complete, as is our clox virtual machine.\nThat was the last line of new code in the book.</p>\n<h3><a href=\"#evaluating-performance\" id=\"evaluating-performance\"><small>30&#8202;.&#8202;3&#8202;.&#8202;7</small>Evaluating performance</a></h3>\n<p>The code is done, but we still need to figure out if we actually made anything\nbetter with these changes. Evaluating an optimization like this is very\ndifferent from the previous one. There, we had a clear hotspot visible in the\nprofiler. We fixed that part of the code and could instantly see the hotspot\nget faster.</p>\n<p>The effects of changing the value representation are more diffused. The macros\nare expanded in place wherever they are used, so the performance changes are\nspread across the codebase in a way that&rsquo;s hard for many profilers to track\nwell, especially in an <span name=\"opt\">optimized</span> build.</p>\n<aside name=\"opt\">\n<p>When doing profiling work, you almost always want to profile an optimized\n&ldquo;release&rdquo; build of your program since that reflects the performance story your\nend users experience. Compiler optimizations, like inlining, can dramatically\naffect which parts of the code are performance hotspots. 
Hand-optimizing a debug\nbuild risks sending you off &ldquo;fixing&rdquo; problems that the optimizing compiler will\nalready solve for you.</p>\n<p>Make sure you don&rsquo;t accidentally benchmark and optimize your debug build. I seem\nto make that mistake at least once a year.</p>\n</aside>\n<p>We also can&rsquo;t easily <em>reason</em> about the effects of our change. We&rsquo;ve made values\nsmaller, which reduces cache misses all across the VM. But the actual real-world\nperformance effect of that change is highly dependent on the memory use of the\nLox program being run. A tiny Lox microbenchmark may not have enough values\nscattered around in memory for the effect to be noticeable, and even things like\nthe addresses handed out to us by the C memory allocator can impact the results.</p>\n<p>If we did our job right, basically everything gets a little faster, especially\non larger, more complex Lox programs. But it is possible that the extra bitwise\noperations we do when NaN-boxing values nullify the gains from the better\nmemory use. Doing performance work like this is unnerving because you can&rsquo;t\neasily <em>prove</em> that you&rsquo;ve made the VM better. You can&rsquo;t point to a single\nsurgically targeted microbenchmark and say, &ldquo;There, see?&rdquo;</p>\n<p>Instead, what we really need is a <em>suite</em> of larger benchmarks. Ideally, they\nwould be distilled from real-world applications<span class=\"em\">&mdash;</span>not that such a thing exists\nfor a toy language like Lox. Then we can measure the aggregate performance\nchanges across all of those. I did my best to cobble together a handful of\nlarger Lox programs. On my machine, the new value representation seems to make\neverything roughly 10% faster across the board.</p>\n<p>That&rsquo;s not a huge improvement, especially compared to the profound effect of\nmaking hash table lookups faster. 
I added this optimization in large part\nbecause it&rsquo;s a good example of a certain <em>kind</em> of performance work you may\nexperience, and honestly, because I think it&rsquo;s technically really cool. It might\nnot be the first thing I would reach for if I were seriously trying to make clox\nfaster. There is probably other, lower-hanging fruit.</p>\n<p>But, if you find yourself working on a program where all of the easy wins have\nbeen taken, then at some point you may want to think about tuning your value\nrepresentation. I hope this chapter has shined a light on some of the options\nyou have in that area.</p>\n<h2><a href=\"#where-to-next\" id=\"where-to-next\"><small>30&#8202;.&#8202;4</small>Where to Next</a></h2>\n<p>We&rsquo;ll stop here with the Lox language and our two interpreters. We could tinker\non it forever, adding new language features and clever speed improvements. But,\nfor this book, I think we&rsquo;ve reached a natural place to call our work complete.\nI won&rsquo;t rehash everything we&rsquo;ve learned in the past many pages. You were there\nwith me and you remember. Instead, I&rsquo;d like to take a minute to talk about where\nyou might go from here. What is the next step in your programming language\njourney?</p>\n<p>Most of you probably won&rsquo;t spend a significant part of your career working in\ncompilers or interpreters. It&rsquo;s a pretty small slice of the computer science\nacademia pie, and an even smaller segment of software engineering in industry.\nThat&rsquo;s OK. Even if you never work on a compiler again in your life, you will\ncertainly <em>use</em> one, and I hope this book has equipped you with a better\nunderstanding of how the programming languages you use are designed and\nimplemented.</p>\n<p>You have also learned a handful of important, fundamental data structures and\ngotten some practice doing low-level profiling and optimization work. 
That kind\nof expertise is helpful no matter what domain you program in.</p>\n<p>I also hope I gave you a new way of <span name=\"domain\">looking</span> at and\nsolving problems. Even if you never work on a language again, you may be\nsurprised to discover how many programming problems can be seen as\nlanguage-<em>like</em>. Maybe that report generator you need to write can be modeled as\na series of stack-based &ldquo;instructions&rdquo; that the generator &ldquo;executes&rdquo;. That user\ninterface you need to render looks an awful lot like traversing an AST.</p>\n<aside name=\"domain\">\n<p>This goes for other domains too. I don&rsquo;t think there&rsquo;s a single topic I&rsquo;ve\nlearned in programming<span class=\"em\">&mdash;</span>or even outside of programming<span class=\"em\">&mdash;</span>that I haven&rsquo;t ended\nup finding useful in other areas. One of my favorite aspects of software\nengineering is how much it rewards those with eclectic interests.</p>\n</aside>\n<p>If you do want to go further down the programming language rabbit hole, here\nare some suggestions for which branches in the tunnel to explore:</p>\n<ul>\n<li>\n<p>Our simple, single-pass bytecode compiler pushed us towards mostly runtime\noptimization. In a mature language implementation, compile-time optimization\nis generally more important, and the field of compiler optimizations is\nincredibly rich. Grab a classic <span name=\"cooper\">compilers</span> book,\nand rebuild the front end of clox or jlox to be a sophisticated compilation\npipeline with some interesting intermediate representations and optimization\npasses.</p>\n<p>Dynamic typing will place some restrictions on how far you can go, but there\nis still a lot you can do. Or maybe you want to take a big leap and add\nstatic types and a type checker to Lox. 
That will certainly give your front\nend a lot more to chew on.</p>\n<aside name=\"cooper\">\n<p>I like Cooper and Torczon&rsquo;s <em>Engineering a Compiler</em> for this. Appel&rsquo;s\n<em>Modern Compiler Implementation</em> books are also well regarded.</p>\n</aside></li>\n<li>\n<p>In this book, I aim to be correct, but not particularly rigorous. My goal is\nmostly to give you an <em>intuition</em> and a feel for doing language work. If you\nlike more precision, then the whole world of programming language academia\nis waiting for you. Languages and compilers have been studied formally since\nbefore we even had computers, so there is no shortage of books and papers on\nparser theory, type systems, semantics, and formal logic. Going down this\npath will also teach you how to read CS papers, which is a valuable skill in\nits own right.</p>\n</li>\n<li>\n<p>Or, if you just really enjoy hacking on and making languages, you can take\nLox and turn it into your own <span name=\"license\">plaything</span>. Change\nthe syntax to something that delights your eye. Add missing features or\nremove ones you don&rsquo;t like. Jam new optimizations in there.</p>\n<aside name=\"license\">\n<p>The <em>text</em> of this book is copyrighted to me, but the <em>code</em> and the\nimplementations of jlox and clox use the very permissive <a href=\"https://en.wikipedia.org/wiki/MIT_License\">MIT license</a>.\nYou are more than welcome to <a href=\"https://github.com/munificent/craftinginterpreters\">take either of those interpreters</a> and\ndo whatever you want with them. Go to town.</p>\n<p>If you make significant changes to the language, it would be good to also\nchange the name, mostly to avoid confusing people about what the name &ldquo;Lox&rdquo;\nrepresents.</p>\n</aside>\n<p>Eventually you may get to a point where you have something you think others\ncould use as well. That gets you into the very distinct world of programming\nlanguage <em>popularity</em>. 
Expect to spend a ton of time writing documentation,\nexample programs, tools, and useful libraries. The field is crowded with\nlanguages vying for users. To thrive in that space you&rsquo;ll have to put on\nyour marketing hat and <em>sell</em>. Not everyone enjoys that kind of\npublic-facing work, but if you do, it can be incredibly gratifying to see\npeople use your language to express themselves.</p>\n</li>\n</ul>\n<p>Or maybe this book has satisfied your craving and you&rsquo;ll stop here. Whichever\nway you go, or don&rsquo;t go, there is one lesson I hope to lodge in your heart. Like\nI was, you may have initially been intimidated by programming languages. But in\nthese chapters, you&rsquo;ve seen that even really challenging material can be tackled\nby us mortals if we get our hands dirty and take it a step at a time. If you can\nhandle compilers and interpreters, you can do anything you put your mind to.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<p>Assigning homework on the last day of school seems cruel but if you really want\nsomething to do during your summer vacation:</p>\n<ol>\n<li>\n<p>Fire up your profiler, run a couple of benchmarks, and look for other\nhotspots in the VM. Do you see anything in the runtime that you can improve?</p>\n</li>\n<li>\n<p>Many strings in real-world user programs are small, often only a character\nor two. This is less of a concern in clox because we intern strings, but\nmost VMs don&rsquo;t. For those that don&rsquo;t, heap allocating a tiny character array\nfor each of those little strings and then representing the value as a\npointer to that array is wasteful. Often, the pointer is larger than the\nstring&rsquo;s characters. A classic trick is to have a separate value\nrepresentation for small strings that stores the characters inline in the\nvalue.</p>\n<p>Starting from clox&rsquo;s original tagged union representation, implement that\noptimization. 
Write a couple of relevant benchmarks and see if it helps.</p>\n</li>\n<li>\n<p>Reflect back on your experience with this book. What parts of it worked well\nfor you? What didn&rsquo;t? Was it easier for you to learn bottom-up or top-down?\nDid the illustrations help or distract? Did the analogies clarify or\nconfuse?</p>\n<p>The more you understand your personal learning style, the more effectively\nyou can upload knowledge into your head. You can specifically target\nmaterial that teaches you the way you learn best.</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"backmatter.html\" class=\"next\">\n  Next Part: &ldquo;Backmatter&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/parsing-expressions.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Parsing Expressions &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Parsing Expressions<small>6</small></a></h3>\n\n<ul>\n    <li><a href=\"#ambiguity-and-the-parsing-game\"><small>6.1</small> Ambiguity and the Parsing Game</a></li>\n    <li><a href=\"#recursive-descent-parsing\"><small>6.2</small> Recursive Descent Parsing</a></li>\n    <li><a href=\"#syntax-errors\"><small>6.3</small> Syntax Errors</a></li>\n    <li><a href=\"#wiring-up-the-parser\"><small>6.4</small> Wiring up the Parser</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a 
href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Logic Versus History</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"representing-code.html\" title=\"Representing Code\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"evaluating-expressions.html\" title=\"Evaluating Expressions\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"representing-code.html\" title=\"Representing Code\" class=\"prev\">←</a>\n<a href=\"evaluating-expressions.html\" title=\"Evaluating Expressions\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Parsing Expressions<small>6</small></a></h3>\n\n<ul>\n    <li><a href=\"#ambiguity-and-the-parsing-game\"><small>6.1</small> Ambiguity and the Parsing Game</a></li>\n    <li><a href=\"#recursive-descent-parsing\"><small>6.2</small> Recursive Descent Parsing</a></li>\n    <li><a href=\"#syntax-errors\"><small>6.3</small> Syntax Errors</a></li>\n    <li><a href=\"#wiring-up-the-parser\"><small>6.4</small> Wiring up the Parser</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Logic Versus History</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"representing-code.html\" title=\"Representing Code\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"evaluating-expressions.html\" title=\"Evaluating 
Expressions\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">6</div>\n  <h1>Parsing Expressions</h1>\n\n<blockquote>\n<p>Grammar, which knows how to control even kings.\n<cite>Molière</cite></p>\n</blockquote>\n<p><span name=\"parse\">This</span> chapter marks the first major milestone of the\nbook. Many of us have cobbled together a mishmash of regular expressions and\nsubstring operations to extract some sense out of a pile of text. The code was\nprobably riddled with bugs and a beast to maintain. Writing a <em>real</em> parser<span class=\"em\">&mdash;</span>one with decent error handling, a coherent internal structure, and the ability\nto robustly chew through a sophisticated syntax<span class=\"em\">&mdash;</span>is considered a rare,\nimpressive skill. In this chapter, you will <span name=\"attain\">attain</span>\nit.</p>\n<aside name=\"parse\">\n<p>&ldquo;Parse&rdquo; comes to English from the Old French &ldquo;pars&rdquo; for &ldquo;part of speech&rdquo;. It\nmeans to take a text and map each word to the grammar of the language. We use it\nhere in the same sense, except that our language is a little more modern than\nOld French.</p>\n</aside>\n<aside name=\"attain\">\n<p>Like many rites of passage, you&rsquo;ll probably find it looks a little smaller, a\nlittle less daunting when it&rsquo;s behind you than when it loomed ahead.</p>\n</aside>\n<p>It&rsquo;s easier than you think, partially because we front-loaded a lot of the hard\nwork in the <a href=\"representing-code.html\">last chapter</a>. You already know your way around a formal grammar.\nYou&rsquo;re familiar with syntax trees, and we have some Java classes to represent\nthem. The only remaining piece is parsing<span class=\"em\">&mdash;</span>transmogrifying a sequence of\ntokens into one of those syntax trees.</p>\n<p>Some CS textbooks make a big deal out of parsers. 
In the &rsquo;60s, computer\nscientists<span class=\"em\">&mdash;</span>understandably tired of programming in assembly language<span class=\"em\">&mdash;</span>started designing more sophisticated, <span name=\"human\">human</span>-friendly\nlanguages like Fortran and ALGOL. Alas, they weren&rsquo;t very <em>machine</em>-friendly\nfor the primitive computers of the time.</p>\n<aside name=\"human\">\n<p>Imagine how harrowing assembly programming on those old machines must have been\nthat they considered <em>Fortran</em> to be an improvement.</p>\n</aside>\n<p>These pioneers designed languages that they honestly weren&rsquo;t even sure how to\nwrite compilers for, and then did groundbreaking work inventing parsing and\ncompiling techniques that could handle these new, big languages on those old, tiny\nmachines.</p>\n<p>Classic compiler books read like fawning hagiographies of these heroes and their\ntools. The cover of <em>Compilers: Principles, Techniques, and Tools</em> literally has\na dragon labeled &ldquo;complexity of compiler design&rdquo; being slain by a knight bearing\na sword and shield branded &ldquo;LALR parser generator&rdquo; and &ldquo;syntax directed\ntranslation&rdquo;. They laid it on thick.</p>\n<p>A little self-congratulation is well-deserved, but the truth is you don&rsquo;t need\nto know most of that stuff to bang out a high quality parser for a modern\nmachine. As always, I encourage you to broaden your education and take it in\nlater, but this book omits the trophy case.</p>\n<h2><a href=\"#ambiguity-and-the-parsing-game\" id=\"ambiguity-and-the-parsing-game\"><small>6&#8202;.&#8202;1</small>Ambiguity and the Parsing Game</a></h2>\n<p>In the last chapter, I said you can &ldquo;play&rdquo; a context-free grammar like a game in\norder to <em>generate</em> strings. Parsers play that game in reverse. 
Given a string<span class=\"em\">&mdash;</span>a series of tokens<span class=\"em\">&mdash;</span>we map those tokens to terminals in the grammar to\nfigure out which rules could have generated that string.</p>\n<p>The &ldquo;could have&rdquo; part is interesting. It&rsquo;s entirely possible to create a grammar\nthat is <em>ambiguous</em>, where different choices of productions can lead to the same\nstring. When you&rsquo;re using the grammar to <em>generate</em> strings, that doesn&rsquo;t matter\nmuch. Once you have the string, who cares how you got to it?</p>\n<p>When parsing, ambiguity means the parser may misunderstand the user&rsquo;s code. As\nwe parse, we aren&rsquo;t just determining if the string is valid Lox code, we&rsquo;re\nalso tracking which rules match which parts of it so that we know what part of\nthe language each token belongs to. Here&rsquo;s the Lox expression grammar we put\ntogether in the last chapter:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">expression</span>     → <span class=\"i\">literal</span>\n               | <span class=\"i\">unary</span>\n               | <span class=\"i\">binary</span>\n               | <span class=\"i\">grouping</span> ;\n\n<span class=\"i\">literal</span>        → <span class=\"t\">NUMBER</span> | <span class=\"t\">STRING</span> | <span class=\"s\">&quot;true&quot;</span> | <span class=\"s\">&quot;false&quot;</span> | <span class=\"s\">&quot;nil&quot;</span> ;\n<span class=\"i\">grouping</span>       → <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;)&quot;</span> ;\n<span class=\"i\">unary</span>          → ( <span class=\"s\">&quot;-&quot;</span> | <span class=\"s\">&quot;!&quot;</span> ) <span class=\"i\">expression</span> ;\n<span class=\"i\">binary</span>         → <span class=\"i\">expression</span> <span class=\"i\">operator</span> <span class=\"i\">expression</span> ;\n<span class=\"i\">operator</span>       → <span 
class=\"s\">&quot;==&quot;</span> | <span class=\"s\">&quot;!=&quot;</span> | <span class=\"s\">&quot;&lt;&quot;</span> | <span class=\"s\">&quot;&lt;=&quot;</span> | <span class=\"s\">&quot;&gt;&quot;</span> | <span class=\"s\">&quot;&gt;=&quot;</span>\n               | <span class=\"s\">&quot;+&quot;</span>  | <span class=\"s\">&quot;-&quot;</span>  | <span class=\"s\">&quot;*&quot;</span> | <span class=\"s\">&quot;/&quot;</span> ;\n</pre></div>\n<p>This is a valid string in that grammar:</p><img src=\"image/parsing-expressions/tokens.png\" alt=\"6 / 3 - 1\" />\n<p>But there are two ways we could have generated it. One way is:</p>\n<ol>\n<li>Starting at <code>expression</code>, pick <code>binary</code>.</li>\n<li>For the left-hand <code>expression</code>, pick <code>NUMBER</code>, and use <code>6</code>.</li>\n<li>For the operator, pick <code>\"/\"</code>.</li>\n<li>For the right-hand <code>expression</code>, pick <code>binary</code> again.</li>\n<li>In that nested <code>binary</code> expression, pick <code>3 - 1</code>.</li>\n</ol>\n<p>Another is:</p>\n<ol>\n<li>Starting at <code>expression</code>, pick <code>binary</code>.</li>\n<li>For the left-hand <code>expression</code>, pick <code>binary</code> again.</li>\n<li>In that nested <code>binary</code> expression, pick <code>6 / 3</code>.</li>\n<li>Back at the outer <code>binary</code>, for the operator, pick <code>\"-\"</code>.</li>\n<li>For the right-hand <code>expression</code>, pick <code>NUMBER</code>, and use <code>1</code>.</li>\n</ol>\n<p>Those produce the same <em>strings</em>, but not the same <em>syntax trees</em>:</p><img src=\"image/parsing-expressions/syntax-trees.png\" alt=\"Two valid syntax trees: (6 / 3) - 1 and 6 / (3 - 1)\" />\n<p>In other words, the grammar allows seeing the expression as <code>(6 / 3) - 1</code> or <code>6 / (3 - 1)</code>. The <code>binary</code> rule lets operands nest any which way you want. That in\nturn affects the result of evaluating the parsed tree. 
The way mathematicians\nhave addressed this ambiguity since blackboards were first invented is by\ndefining rules for precedence and associativity.</p>\n<ul>\n<li>\n<p><span name=\"nonassociative\"><strong>Precedence</strong></span> determines which operator\nis evaluated first in an expression containing a mixture of different\noperators. Precedence rules tell us that we evaluate the <code>/</code> before the <code>-</code>\nin the above example. Operators with higher precedence are evaluated\nbefore operators with lower precedence. Equivalently, higher precedence\noperators are said to &ldquo;bind tighter&rdquo;.</p>\n</li>\n<li>\n<p><strong>Associativity</strong> determines which operator is evaluated first in a series\nof the <em>same</em> operator. When an operator is <strong>left-associative</strong> (think\n&ldquo;left-to-right&rdquo;), operators on the left evaluate before those on the right.\nSince <code>-</code> is left-associative, this expression:</p>\n<div class=\"codehilite\"><pre><span class=\"n\">5</span> - <span class=\"n\">3</span> - <span class=\"n\">1</span>\n</pre></div>\n<p>is equivalent to:</p>\n<div class=\"codehilite\"><pre>(<span class=\"n\">5</span> - <span class=\"n\">3</span>) - <span class=\"n\">1</span>\n</pre></div>\n<p>Assignment, on the other hand, is <strong>right-associative</strong>. This:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">a</span> = <span class=\"i\">b</span> = <span class=\"i\">c</span>\n</pre></div>\n<p>is equivalent to:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">a</span> = (<span class=\"i\">b</span> = <span class=\"i\">c</span>)\n</pre></div>\n</li>\n</ul>\n<aside name=\"nonassociative\">\n<p>While not common these days, some languages specify that certain pairs of\noperators have <em>no</em> relative precedence. That makes it a syntax error to mix\nthose operators in an expression without using explicit grouping.</p>\n<p>Likewise, some operators are <strong>non-associative</strong>. 
That means it&rsquo;s an error to\nuse that operator more than once in a sequence. For example, Perl&rsquo;s range\noperator isn&rsquo;t associative, so <code>a .. b</code> is OK, but <code>a .. b .. c</code> is an error.</p>\n</aside>\n<p>Without well-defined precedence and associativity, an expression that uses\nmultiple operators is ambiguous<span class=\"em\">&mdash;</span>it can be parsed into different syntax trees,\nwhich could in turn evaluate to different results. We&rsquo;ll fix that in Lox by\napplying the same precedence rules as C, going from lowest to highest.</p><table>\n<thead>\n<tr>\n  <td>Name</td>\n  <td>Operators</td>\n  <td>Associates</td>\n</tr>\n</thead>\n<tbody>\n<tr>\n  <td>Equality</td>\n  <td><code>==</code> <code>!=</code></td>\n  <td>Left</td>\n</tr>\n<tr>\n  <td>Comparison</td>\n  <td><code>&gt;</code> <code>&gt;=</code>\n      <code>&lt;</code> <code>&lt;=</code></td>\n  <td>Left</td>\n</tr>\n<tr>\n  <td>Term</td>\n  <td><code>-</code> <code>+</code></td>\n  <td>Left</td>\n</tr>\n<tr>\n  <td>Factor</td>\n  <td><code>/</code> <code>*</code></td>\n  <td>Left</td>\n</tr>\n<tr>\n  <td>Unary</td>\n  <td><code>!</code> <code>-</code></td>\n  <td>Right</td>\n</tr>\n</tbody>\n</table>\n<p>Right now, the grammar stuffs all expression types into a single <code>expression</code>\nrule. That same rule is used as the non-terminal for operands, which lets the\ngrammar accept any kind of expression as a subexpression, regardless of whether\nthe precedence rules allow it.</p>\n<p>We fix that by <span name=\"massage\">stratifying</span> the grammar. 
We define a\nseparate rule for each precedence level.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">expression</span>     → ...\n<span class=\"i\">equality</span>       → ...\n<span class=\"i\">comparison</span>     → ...\n<span class=\"i\">term</span>           → ...\n<span class=\"i\">factor</span>         → ...\n<span class=\"i\">unary</span>          → ...\n<span class=\"i\">primary</span>        → ...\n</pre></div>\n<aside name=\"massage\">\n<p>Instead of baking precedence right into the grammar rules, some parser\ngenerators let you keep the same ambiguous-but-simple grammar and then add in a\nlittle explicit operator precedence metadata on the side in order to\ndisambiguate.</p>\n</aside>\n<p>Each rule here only matches expressions at its precedence level or higher. For\nexample, <code>unary</code> matches a unary expression like <code>!negated</code> or a primary\nexpression like <code>1234</code>. And <code>term</code> can match <code>1 + 2</code> but also <code>3 * 4 / 5</code>. The\nfinal <code>primary</code> rule covers the highest-precedence forms<span class=\"em\">&mdash;</span>literals and\nparenthesized expressions.</p>\n<p>We just need to fill in the productions for each of those rules. We&rsquo;ll do the\neasy ones first. The top <code>expression</code> rule matches any expression at any\nprecedence level. 
Since <span name=\"equality\"><code>equality</code></span> has the lowest\nprecedence, if we match that, then it covers everything.</p>\n<aside name=\"equality\">\n<p>We could eliminate <code>expression</code> and simply use <code>equality</code> in the other rules\nthat contain expressions, but using <code>expression</code> makes those other rules read a\nlittle better.</p>\n<p>Also, in later chapters when we expand the grammar to include assignment and\nlogical operators, we&rsquo;ll only need to change the production for <code>expression</code>\ninstead of touching every rule that contains an expression.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"i\">expression</span>     → <span class=\"i\">equality</span>\n</pre></div>\n<p>Over at the other end of the precedence table, a primary expression contains\nall the literals and grouping expressions.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">primary</span>        → <span class=\"t\">NUMBER</span> | <span class=\"t\">STRING</span> | <span class=\"s\">&quot;true&quot;</span> | <span class=\"s\">&quot;false&quot;</span> | <span class=\"s\">&quot;nil&quot;</span>\n               | <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;)&quot;</span> ;\n</pre></div>\n<p>A unary expression starts with a unary operator followed by the operand. Since\nunary operators can nest<span class=\"em\">&mdash;</span><code>!!true</code> is a valid if weird expression<span class=\"em\">&mdash;</span>the\noperand can itself be a unary operator. A recursive rule handles that nicely.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">unary</span>          → ( <span class=\"s\">&quot;!&quot;</span> | <span class=\"s\">&quot;-&quot;</span> ) <span class=\"i\">unary</span> ;\n</pre></div>\n<p>But this rule has a problem. 
It never terminates.</p>\n<p>Remember, each rule needs to match expressions at that precedence level <em>or\nhigher</em>, so we also need to let this match a primary expression.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">unary</span>          → ( <span class=\"s\">&quot;!&quot;</span> | <span class=\"s\">&quot;-&quot;</span> ) <span class=\"i\">unary</span>\n               | <span class=\"i\">primary</span> ;\n</pre></div>\n<p>That works.</p>\n<p>The remaining rules are all binary operators. We&rsquo;ll start with the rule for\nmultiplication and division. Here&rsquo;s a first try:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">factor</span>         → <span class=\"i\">factor</span> ( <span class=\"s\">&quot;/&quot;</span> | <span class=\"s\">&quot;*&quot;</span> ) <span class=\"i\">unary</span>\n               | <span class=\"i\">unary</span> ;\n</pre></div>\n<p>The rule recurses to match the left operand. That enables the rule to match a\nseries of multiplication and division expressions like <code>1 * 2 / 3</code>. Putting the\nrecursive production on the left side and <code>unary</code> on the right makes the rule\n<span name=\"mult\">left-associative</span> and unambiguous.</p>\n<aside name=\"mult\">\n<p>In principle, it doesn&rsquo;t matter whether you treat multiplication as left- or\nright-associative<span class=\"em\">&mdash;</span>you get the same result either way. Alas, in the real world\nwith limited precision, roundoff and overflow mean that associativity can affect\nthe result of a sequence of multiplications. 
Consider:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> <span class=\"n\">0.1</span> * (<span class=\"n\">0.2</span> * <span class=\"n\">0.3</span>);\n<span class=\"k\">print</span> (<span class=\"n\">0.1</span> * <span class=\"n\">0.2</span>) * <span class=\"n\">0.3</span>;\n</pre></div>\n<p>In languages like Lox that use <a href=\"https://en.wikipedia.org/wiki/Double-precision_floating-point_format\">IEEE 754</a> double-precision floating-point\nnumbers, the first evaluates to <code>0.006</code>, while the second yields\n<code>0.006000000000000001</code>. Sometimes that tiny difference matters.\n<a href=\"https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html\">This</a> is a good place to learn more.</p>\n</aside>\n<p>All of this is correct, but the fact that the first symbol in the body of the\nrule is the same as the head of the rule means this production is\n<strong>left-recursive</strong>. Some parsing techniques, including the one we&rsquo;re going to\nuse, have trouble with left recursion. (Recursion elsewhere, like we have in\n<code>unary</code> and the indirect recursion for grouping in <code>primary</code>, is not a problem.)</p>\n<p>There are many grammars you can define that match the same language. The choice\nof how to model a particular language is partially a matter of taste and\npartially a pragmatic one. This rule is correct, but not optimal for how we\nintend to parse it. Instead of a left-recursive rule, we&rsquo;ll use a different one.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">factor</span>         → <span class=\"i\">unary</span> ( ( <span class=\"s\">&quot;/&quot;</span> | <span class=\"s\">&quot;*&quot;</span> ) <span class=\"i\">unary</span> )* ;\n</pre></div>\n<p>We define a factor expression as a flat <em>sequence</em> of multiplications\nand divisions. This matches the same syntax as the previous rule, but better\nmirrors the code we&rsquo;ll write to parse Lox. 
We use the same structure for all of\nthe other binary operator precedence levels, giving us this complete expression\ngrammar:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">expression</span>     → <span class=\"i\">equality</span> ;\n<span class=\"i\">equality</span>       → <span class=\"i\">comparison</span> ( ( <span class=\"s\">&quot;!=&quot;</span> | <span class=\"s\">&quot;==&quot;</span> ) <span class=\"i\">comparison</span> )* ;\n<span class=\"i\">comparison</span>     → <span class=\"i\">term</span> ( ( <span class=\"s\">&quot;&gt;&quot;</span> | <span class=\"s\">&quot;&gt;=&quot;</span> | <span class=\"s\">&quot;&lt;&quot;</span> | <span class=\"s\">&quot;&lt;=&quot;</span> ) <span class=\"i\">term</span> )* ;\n<span class=\"i\">term</span>           → <span class=\"i\">factor</span> ( ( <span class=\"s\">&quot;-&quot;</span> | <span class=\"s\">&quot;+&quot;</span> ) <span class=\"i\">factor</span> )* ;\n<span class=\"i\">factor</span>         → <span class=\"i\">unary</span> ( ( <span class=\"s\">&quot;/&quot;</span> | <span class=\"s\">&quot;*&quot;</span> ) <span class=\"i\">unary</span> )* ;\n<span class=\"i\">unary</span>          → ( <span class=\"s\">&quot;!&quot;</span> | <span class=\"s\">&quot;-&quot;</span> ) <span class=\"i\">unary</span>\n               | <span class=\"i\">primary</span> ;\n<span class=\"i\">primary</span>        → <span class=\"t\">NUMBER</span> | <span class=\"t\">STRING</span> | <span class=\"s\">&quot;true&quot;</span> | <span class=\"s\">&quot;false&quot;</span> | <span class=\"s\">&quot;nil&quot;</span>\n               | <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;)&quot;</span> ;\n</pre></div>\n<p>This grammar is more complex than the one we had before, but in return we have\neliminated the previous one&rsquo;s ambiguity. 
It&rsquo;s just what we need to make a\nparser.</p>\n<h2><a href=\"#recursive-descent-parsing\" id=\"recursive-descent-parsing\"><small>6&#8202;.&#8202;2</small>Recursive Descent Parsing</a></h2>\n<p>There is a whole pack of parsing techniques whose names are mostly combinations\nof &ldquo;L&rdquo; and &ldquo;R&rdquo;<span class=\"em\">&mdash;</span><a href=\"https://en.wikipedia.org/wiki/LL_parser\">LL(k)</a>, <a href=\"https://en.wikipedia.org/wiki/LR_parser\">LR(1)</a>, <a href=\"https://en.wikipedia.org/wiki/LALR_parser\">LALR</a><span class=\"em\">&mdash;</span>along with more exotic\nbeasts like <a href=\"https://en.wikipedia.org/wiki/Parser_combinator\">parser combinators</a>, <a href=\"https://en.wikipedia.org/wiki/Earley_parser\">Earley parsers</a>, <a href=\"https://en.wikipedia.org/wiki/Shunting-yard_algorithm\">the shunting yard\nalgorithm</a>, and <a href=\"https://en.wikipedia.org/wiki/Parsing_expression_grammar\">packrat parsing</a>. For our first interpreter, one\ntechnique is more than sufficient: <strong>recursive descent</strong>.</p>\n<p>Recursive descent is the simplest way to build a parser, and doesn&rsquo;t require\nusing complex parser generator tools like Yacc, Bison or ANTLR. All you need is\nstraightforward handwritten code. Don&rsquo;t be fooled by its simplicity, though.\nRecursive descent parsers are fast, robust, and can support sophisticated\nerror handling. In fact, GCC, V8 (the JavaScript VM in Chrome), Roslyn (the C#\ncompiler written in C#) and many other heavyweight production language\nimplementations use recursive descent. It rocks.</p>\n<p>Recursive descent is considered a <strong>top-down parser</strong> because it starts from the\ntop or outermost grammar rule (here <code>expression</code>) and works its way <span\nname=\"descent\">down</span> into the nested subexpressions before finally\nreaching the leaves of the syntax tree. 
This is in contrast with bottom-up\nparsers like LR that start with primary expressions and compose them into larger\nand larger chunks of syntax.</p>\n<aside name=\"descent\">\n<p>It&rsquo;s called &ldquo;recursive <em>descent</em>&rdquo; because it walks <em>down</em> the grammar.\nConfusingly, we also use direction metaphorically when talking about &ldquo;high&rdquo; and\n&ldquo;low&rdquo; precedence, but the orientation is reversed. In a top-down parser, you\nreach the lowest-precedence expressions first because they may in turn contain\nsubexpressions of higher precedence.</p><img src=\"image/parsing-expressions/direction.png\" alt=\"Top-down grammar rules in order of increasing precedence.\" />\n<p>CS people really need to get together and straighten out their metaphors. Don&rsquo;t\neven get me started on which direction a stack grows or why trees have their\nroots on top.</p>\n</aside>\n<p>A recursive descent parser is a literal translation of the grammar&rsquo;s rules\nstraight into imperative code. Each rule becomes a function. 
The body of the\nrule translates to code roughly like:</p><table>\n<thead>\n<tr>\n  <td>Grammar notation</td>\n  <td>Code representation</td>\n</tr>\n</thead>\n<tbody>\n  <tr><td>Terminal</td><td>Code to match and consume a token</td></tr>\n  <tr><td>Nonterminal</td><td>Call to that rule&rsquo;s function</td></tr>\n  <tr><td><code>|</code></td><td><code>if</code> or <code>switch</code> statement</td></tr>\n  <tr><td><code>*</code> or <code>+</code></td><td><code>while</code> or <code>for</code> loop</td></tr>\n  <tr><td><code>?</code></td><td><code>if</code> statement</td></tr>\n</tbody>\n</table>\n<p>The descent is described as &ldquo;recursive&rdquo; because when a grammar rule refers to\nitself<span class=\"em\">&mdash;</span>directly or indirectly<span class=\"em\">&mdash;</span>that translates to a recursive function\ncall.</p>\n<h3><a href=\"#the-parser-class\" id=\"the-parser-class\"><small>6&#8202;.&#8202;2&#8202;.&#8202;1</small>The parser class</a></h3>\n<p>Each grammar rule becomes a method inside this new class:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">import</span> <span class=\"i\">java.util.List</span>;\n\n<span class=\"k\">import static</span> <span class=\"i\">com.craftinginterpreters.lox.TokenType.*</span>;\n\n<span class=\"k\">class</span> <span class=\"t\">Parser</span> {\n  <span class=\"k\">private</span> <span class=\"k\">final</span> <span class=\"t\">List</span>&lt;<span class=\"t\">Token</span>&gt; <span class=\"i\">tokens</span>;\n  <span class=\"k\">private</span> <span class=\"t\">int</span> <span class=\"i\">current</span> = <span class=\"n\">0</span>;\n\n  <span class=\"t\">Parser</span>(<span class=\"t\">List</span>&lt;<span class=\"t\">Token</span>&gt; <span class=\"i\">tokens</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">tokens</span> 
= <span class=\"i\">tokens</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, create new file</div>\n\n<p>Like the scanner, the parser consumes a flat input sequence, only now we&rsquo;re\nreading tokens instead of characters. We store the list of tokens and use\n<code>current</code> to point to the next token eagerly waiting to be parsed.</p>\n<p>We&rsquo;re going to run straight through the expression grammar now and translate\neach rule to Java code. The first rule, <code>expression</code>, simply expands to the\n<code>equality</code> rule, so that&rsquo;s straightforward.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>Parser</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Expr</span> <span class=\"i\">expression</span>() {\n    <span class=\"k\">return</span> <span class=\"i\">equality</span>();\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>Parser</em>()</div>\n\n<p>Each method for parsing a grammar rule produces a syntax tree for that rule and\nreturns it to the caller. When the body of the rule contains a nonterminal<span class=\"em\">&mdash;</span>a\nreference to another rule<span class=\"em\">&mdash;</span>we <span name=\"left\">call</span> that other rule&rsquo;s\nmethod.</p>\n<aside name=\"left\">\n<p>This is why left recursion is problematic for recursive descent. 
The function\nfor a left-recursive rule immediately calls itself, which calls itself again,\nand so on, until the parser hits a stack overflow and dies.</p>\n</aside>\n<p>The rule for equality is a little more complex.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">equality</span>       → <span class=\"i\">comparison</span> ( ( <span class=\"s\">&quot;!=&quot;</span> | <span class=\"s\">&quot;==&quot;</span> ) <span class=\"i\">comparison</span> )* ;\n</pre></div>\n<p>In Java, that becomes:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>expression</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Expr</span> <span class=\"i\">equality</span>() {\n    <span class=\"t\">Expr</span> <span class=\"i\">expr</span> = <span class=\"i\">comparison</span>();\n\n    <span class=\"k\">while</span> (<span class=\"i\">match</span>(<span class=\"i\">BANG_EQUAL</span>, <span class=\"i\">EQUAL_EQUAL</span>)) {\n      <span class=\"t\">Token</span> <span class=\"i\">operator</span> = <span class=\"i\">previous</span>();\n      <span class=\"t\">Expr</span> <span class=\"i\">right</span> = <span class=\"i\">comparison</span>();\n      <span class=\"i\">expr</span> = <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Binary</span>(<span class=\"i\">expr</span>, <span class=\"i\">operator</span>, <span class=\"i\">right</span>);\n    }\n\n    <span class=\"k\">return</span> <span class=\"i\">expr</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>expression</em>()</div>\n\n<p>Let&rsquo;s step through it. The first <code>comparison</code> nonterminal in the body translates\nto the first call to <code>comparison()</code> in the method. We take that result and store\nit in a local variable.</p>\n<p>Then, the <code>( ... )*</code> loop in the rule maps to a <code>while</code> loop. We need to know\nwhen to exit that loop. 
We can see that inside the rule, we must first find\neither a <code>!=</code> or <code>==</code> token. So, if we <em>don&rsquo;t</em> see one of those, we must be done\nwith the sequence of equality operators. We express that check using a handy\n<code>match()</code> method.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>equality</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">boolean</span> <span class=\"i\">match</span>(<span class=\"t\">TokenType</span>... <span class=\"i\">types</span>) {\n    <span class=\"k\">for</span> (<span class=\"t\">TokenType</span> <span class=\"i\">type</span> : <span class=\"i\">types</span>) {\n      <span class=\"k\">if</span> (<span class=\"i\">check</span>(<span class=\"i\">type</span>)) {\n        <span class=\"i\">advance</span>();\n        <span class=\"k\">return</span> <span class=\"k\">true</span>;\n      }\n    }\n\n    <span class=\"k\">return</span> <span class=\"k\">false</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>equality</em>()</div>\n\n<p>This checks to see if the current token has any of the given types. If so, it\nconsumes the token and returns <code>true</code>. Otherwise, it returns <code>false</code> and leaves\nthe current token alone. 
The <code>match()</code> method is defined in terms of two more\nfundamental operations.</p>\n<p>The <code>check()</code> method returns <code>true</code> if the current token is of the given type.\nUnlike <code>match()</code>, it never consumes the token, it only looks at it.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>match</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">boolean</span> <span class=\"i\">check</span>(<span class=\"t\">TokenType</span> <span class=\"i\">type</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">isAtEnd</span>()) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n    <span class=\"k\">return</span> <span class=\"i\">peek</span>().<span class=\"i\">type</span> == <span class=\"i\">type</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>match</em>()</div>\n\n<p>The <code>advance()</code> method consumes the current token and returns it, similar to how\nour scanner&rsquo;s corresponding method crawled through characters.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>check</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Token</span> <span class=\"i\">advance</span>() {\n    <span class=\"k\">if</span> (!<span class=\"i\">isAtEnd</span>()) <span class=\"i\">current</span>++;\n    <span class=\"k\">return</span> <span class=\"i\">previous</span>();\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>check</em>()</div>\n\n<p>These methods bottom out on the last handful of primitive operations.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>advance</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">boolean</span> <span class=\"i\">isAtEnd</span>() {\n    <span class=\"k\">return</span> <span 
class=\"i\">peek</span>().<span class=\"i\">type</span> == <span class=\"i\">EOF</span>;\n  }\n\n  <span class=\"k\">private</span> <span class=\"t\">Token</span> <span class=\"i\">peek</span>() {\n    <span class=\"k\">return</span> <span class=\"i\">tokens</span>.<span class=\"i\">get</span>(<span class=\"i\">current</span>);\n  }\n\n  <span class=\"k\">private</span> <span class=\"t\">Token</span> <span class=\"i\">previous</span>() {\n    <span class=\"k\">return</span> <span class=\"i\">tokens</span>.<span class=\"i\">get</span>(<span class=\"i\">current</span> - <span class=\"n\">1</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>advance</em>()</div>\n\n<p><code>isAtEnd()</code> checks if we&rsquo;ve run out of tokens to parse. <code>peek()</code> returns the\ncurrent token we have yet to consume, and <code>previous()</code> returns the most recently\nconsumed token. The latter makes it easier to use <code>match()</code> and then access the\njust-matched token.</p>\n<p>That&rsquo;s most of the parsing infrastructure we need. Where were we? Right, so if\nwe are inside the <code>while</code> loop in <code>equality()</code>, then we know we have found a\n<code>!=</code> or <code>==</code> operator and must be parsing an equality expression.</p>\n<p>We grab the matched operator token so we can track which kind of equality\nexpression we have. Then we call <code>comparison()</code> again to parse the right-hand\noperand. We combine the operator and its two operands into a new <code>Expr.Binary</code>\nsyntax tree node, and then loop around. For each iteration, we store the\nresulting expression back in the same <code>expr</code> local variable. 
As we zip through a\nsequence of equality expressions, that creates a left-associative nested tree of\nbinary operator nodes.</p>\n<p><span name=\"sequence\"></span></p><img src=\"image/parsing-expressions/sequence.png\" alt=\"The syntax tree created by parsing 'a == b == c == d == e'\" />\n<aside name=\"sequence\">\n<p>Parsing <code>a == b == c == d == e</code>. For each iteration, we create a new binary\nexpression using the previous one as the left operand.</p>\n</aside>\n<p>The parser falls out of the loop once it hits a token that&rsquo;s not an equality\noperator. Finally, it returns the expression. Note that if the parser never\nencounters an equality operator, then it never enters the loop. In that case,\nthe <code>equality()</code> method effectively calls and returns <code>comparison()</code>. In that\nway, this method matches an equality operator <em>or anything of higher\nprecedence</em>.</p>\n<p>Moving on to the next rule<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<div class=\"codehilite\"><pre><span class=\"i\">comparison</span>     → <span class=\"i\">term</span> ( ( <span class=\"s\">&quot;&gt;&quot;</span> | <span class=\"s\">&quot;&gt;=&quot;</span> | <span class=\"s\">&quot;&lt;&quot;</span> | <span class=\"s\">&quot;&lt;=&quot;</span> ) <span class=\"i\">term</span> )* ;\n</pre></div>\n<p>Translated to Java:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>equality</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Expr</span> <span class=\"i\">comparison</span>() {\n    <span class=\"t\">Expr</span> <span class=\"i\">expr</span> = <span class=\"i\">term</span>();\n\n    <span class=\"k\">while</span> (<span class=\"i\">match</span>(<span class=\"i\">GREATER</span>, <span class=\"i\">GREATER_EQUAL</span>, <span class=\"i\">LESS</span>, <span class=\"i\">LESS_EQUAL</span>)) {\n      <span class=\"t\">Token</span> <span class=\"i\">operator</span> = 
<span class=\"i\">previous</span>();\n      <span class=\"t\">Expr</span> <span class=\"i\">right</span> = <span class=\"i\">term</span>();\n      <span class=\"i\">expr</span> = <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Binary</span>(<span class=\"i\">expr</span>, <span class=\"i\">operator</span>, <span class=\"i\">right</span>);\n    }\n\n    <span class=\"k\">return</span> <span class=\"i\">expr</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>equality</em>()</div>\n\n<p>The grammar rule is virtually <span name=\"handle\">identical</span> to <code>equality</code>\nand so is the corresponding code. The only differences are the token types for\nthe operators we match, and the method we call for the operands<span class=\"em\">&mdash;</span>now\n<code>term()</code> instead of <code>comparison()</code>. The remaining two binary operator rules\nfollow the same pattern.</p>\n<p>In order of precedence, first addition and subtraction:</p>\n<aside name=\"handle\">\n<p>If you wanted to do some clever Java 8, you could create a helper method for\nparsing a left-associative series of binary operators given a list of token\ntypes, and an operand method handle to simplify this redundant code.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>comparison</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Expr</span> <span class=\"i\">term</span>() {\n    <span class=\"t\">Expr</span> <span class=\"i\">expr</span> = <span class=\"i\">factor</span>();\n\n    <span class=\"k\">while</span> (<span class=\"i\">match</span>(<span class=\"i\">MINUS</span>, <span class=\"i\">PLUS</span>)) {\n      <span class=\"t\">Token</span> <span class=\"i\">operator</span> = <span class=\"i\">previous</span>();\n      <span class=\"t\">Expr</span> <span class=\"i\">right</span> = <span class=\"i\">factor</span>();\n      <span 
class=\"i\">expr</span> = <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Binary</span>(<span class=\"i\">expr</span>, <span class=\"i\">operator</span>, <span class=\"i\">right</span>);\n    }\n\n    <span class=\"k\">return</span> <span class=\"i\">expr</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>comparison</em>()</div>\n\n<p>And finally, multiplication and division:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>term</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Expr</span> <span class=\"i\">factor</span>() {\n    <span class=\"t\">Expr</span> <span class=\"i\">expr</span> = <span class=\"i\">unary</span>();\n\n    <span class=\"k\">while</span> (<span class=\"i\">match</span>(<span class=\"i\">SLASH</span>, <span class=\"i\">STAR</span>)) {\n      <span class=\"t\">Token</span> <span class=\"i\">operator</span> = <span class=\"i\">previous</span>();\n      <span class=\"t\">Expr</span> <span class=\"i\">right</span> = <span class=\"i\">unary</span>();\n      <span class=\"i\">expr</span> = <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Binary</span>(<span class=\"i\">expr</span>, <span class=\"i\">operator</span>, <span class=\"i\">right</span>);\n    }\n\n    <span class=\"k\">return</span> <span class=\"i\">expr</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>term</em>()</div>\n\n<p>That&rsquo;s all of the binary operators, parsed with the correct precedence and\nassociativity. 
We&rsquo;re crawling up the precedence hierarchy and now we&rsquo;ve reached\nthe unary operators.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">unary</span>          → ( <span class=\"s\">&quot;!&quot;</span> | <span class=\"s\">&quot;-&quot;</span> ) <span class=\"i\">unary</span>\n               | <span class=\"i\">primary</span> ;\n</pre></div>\n<p>The code for this is a little different.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>factor</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Expr</span> <span class=\"i\">unary</span>() {\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">BANG</span>, <span class=\"i\">MINUS</span>)) {\n      <span class=\"t\">Token</span> <span class=\"i\">operator</span> = <span class=\"i\">previous</span>();\n      <span class=\"t\">Expr</span> <span class=\"i\">right</span> = <span class=\"i\">unary</span>();\n      <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Unary</span>(<span class=\"i\">operator</span>, <span class=\"i\">right</span>);\n    }\n\n    <span class=\"k\">return</span> <span class=\"i\">primary</span>();\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>factor</em>()</div>\n\n<p>Again, we look at the <span name=\"current\">current</span> token to see how to\nparse. If it&rsquo;s a <code>!</code> or <code>-</code>, we must have a unary expression. 
In that case, we\ngrab the token and then recursively call <code>unary()</code> again to parse the operand.\nWrap that all up in a unary expression syntax tree and we&rsquo;re done.</p>\n<aside name=\"current\">\n<p>The fact that the parser looks ahead at upcoming tokens to decide how to parse\nputs recursive descent into the category of <strong>predictive parsers</strong>.</p>\n</aside>\n<p>Otherwise, we must have reached the highest level of precedence, primary\nexpressions.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">primary</span>        → <span class=\"t\">NUMBER</span> | <span class=\"t\">STRING</span> | <span class=\"s\">&quot;true&quot;</span> | <span class=\"s\">&quot;false&quot;</span> | <span class=\"s\">&quot;nil&quot;</span>\n               | <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;)&quot;</span> ;\n</pre></div>\n<p>Most of the cases for the rule are single terminals, so parsing is\nstraightforward.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>unary</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Expr</span> <span class=\"i\">primary</span>() {\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">FALSE</span>)) <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Literal</span>(<span class=\"k\">false</span>);\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">TRUE</span>)) <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Literal</span>(<span class=\"k\">true</span>);\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">NIL</span>)) <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Literal</span>(<span class=\"k\">null</span>);\n\n    <span 
class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">NUMBER</span>, <span class=\"i\">STRING</span>)) {\n      <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Literal</span>(<span class=\"i\">previous</span>().<span class=\"i\">literal</span>);\n    }\n\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">LEFT_PAREN</span>)) {\n      <span class=\"t\">Expr</span> <span class=\"i\">expr</span> = <span class=\"i\">expression</span>();\n      <span class=\"i\">consume</span>(<span class=\"i\">RIGHT_PAREN</span>, <span class=\"s\">&quot;Expect &#39;)&#39; after expression.&quot;</span>);\n      <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Grouping</span>(<span class=\"i\">expr</span>);\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>unary</em>()</div>\n\n<p>The interesting branch is the one for handling parentheses. After we match an\nopening <code>(</code> and parse the expression inside it, we <em>must</em> find a <code>)</code> token. If\nwe don&rsquo;t, that&rsquo;s an error.</p>\n<h2><a href=\"#syntax-errors\" id=\"syntax-errors\"><small>6&#8202;.&#8202;3</small>Syntax Errors</a></h2>\n<p>A parser really has two jobs:</p>\n<ol>\n<li>\n<p>Given a valid sequence of tokens, produce a corresponding syntax tree.</p>\n</li>\n<li>\n<p>Given an <em>invalid</em> sequence of tokens, detect any errors and tell the\nuser about their mistakes.</p>\n</li>\n</ol>\n<p>Don&rsquo;t underestimate how important the second job is! In modern IDEs and editors,\nthe parser is constantly reparsing code<span class=\"em\">&mdash;</span>often while the user is still editing\nit<span class=\"em\">&mdash;</span>in order to syntax highlight and support things like auto-complete. 
That\nmeans it will encounter code in incomplete, half-wrong states <em>all the time.</em></p>\n<p>When the user doesn&rsquo;t realize the syntax is wrong, it is up to the parser to\nhelp guide them back onto the right path. The way it reports errors is a large\npart of your language&rsquo;s user interface. Good syntax error handling is hard. By\ndefinition, the code isn&rsquo;t in a well-defined state, so there&rsquo;s no infallible way\nto know what the user <em>meant</em> to write. The parser can&rsquo;t read your <span\nname=\"telepathy\">mind</span>.</p>\n<aside name=\"telepathy\">\n<p>Not yet at least. With the way things are going in machine learning these days,\nwho knows what the future will bring?</p>\n</aside>\n<p>There are a couple of hard requirements for when the parser runs into a syntax\nerror. A parser must:</p>\n<ul>\n<li>\n<p><strong>Detect and report the error.</strong> If it doesn&rsquo;t detect the <span\nname=\"error\">error</span> and passes the resulting malformed syntax tree on\nto the interpreter, all manner of horrors may be summoned.</p>\n<aside name=\"error\">\n<p>Philosophically speaking, if an error isn&rsquo;t detected and the interpreter\nruns the code, is it <em>really</em> an error?</p>\n</aside></li>\n<li>\n<p><strong>Avoid crashing or hanging.</strong> Syntax errors are a fact of life, and\nlanguage tools have to be robust in the face of them. Segfaulting or getting\nstuck in an infinite loop isn&rsquo;t allowed. While the source may not be valid\n<em>code</em>, it&rsquo;s still a valid <em>input to the parser</em> because users use the\nparser to learn what syntax is allowed.</p>\n</li>\n</ul>\n<p>Those are the table stakes if you want to get in the parser game at all, but you\nreally want to raise the ante beyond that. A decent parser should:</p>\n<ul>\n<li>\n<p><strong>Be fast.</strong> Computers are thousands of times faster than they were when\nparser technology was first invented. 
The days of needing to optimize your\nparser so that it could get through an entire source file during a coffee\nbreak are over. But programmer expectations have risen as quickly, if not\nfaster. They expect their editors to reparse files in milliseconds after\nevery keystroke.</p>\n</li>\n<li>\n<p><strong>Report as many distinct errors as there are.</strong> Aborting after the first\nerror is easy to implement, but it&rsquo;s annoying for users if every time they\nfix what they think is the one error in a file, a new one appears. They\nwant to see them all.</p>\n</li>\n<li>\n<p><strong>Minimize <em>cascaded</em> errors.</strong> Once a single error is found, the parser no\nlonger really knows what&rsquo;s going on. It tries to get itself back on track\nand keep going, but if it gets confused, it may report a slew of ghost\nerrors that don&rsquo;t indicate other real problems in the code. When the first\nerror is fixed, those phantoms disappear, because they reflect only the\nparser&rsquo;s own confusion. Cascaded errors are annoying because they can scare\nthe user into thinking their code is in a worse state than it is.</p>\n</li>\n</ul>\n<p>The last two points are in tension. We want to report as many separate errors as\nwe can, but we don&rsquo;t want to report ones that are merely side effects of an\nearlier one.</p>\n<p>The way a parser responds to an error and keeps going to look for later errors\nis called <strong>error recovery</strong>. This was a hot research topic in the &rsquo;60s. Back\nthen, you&rsquo;d hand a stack of punch cards to the secretary and come back the next\nday to see if the compiler succeeded. With an iteration loop that slow, you\n<em>really</em> wanted to find every single error in your code in one pass.</p>\n<p>Today, when parsers complete before you&rsquo;ve even finished typing, it&rsquo;s less of an\nissue. 
Simple, fast error recovery is fine.</p>\n<h3><a href=\"#panic-mode-error-recovery\" id=\"panic-mode-error-recovery\"><small>6&#8202;.&#8202;3&#8202;.&#8202;1</small>Panic mode error recovery</a></h3>\n<aside name=\"panic\">\n<p>You know you want to push it.</p><img src=\"image/parsing-expressions/panic.png\" alt=\"A big shiny 'PANIC' button.\" />\n</aside>\n<p>Of all the recovery techniques devised in yesteryear, the one that best stood\nthe test of time is called<span class=\"em\">&mdash;</span>somewhat alarmingly<span class=\"em\">&mdash;</span><span name=\"panic\"><strong>panic\nmode</strong></span>. As soon as the parser detects an error, it enters panic mode. It\nknows at least one token doesn&rsquo;t make sense given its current state in the\nmiddle of some stack of grammar productions.</p>\n<p>Before it can get back to parsing, it needs to get its state and the sequence of\nforthcoming tokens aligned such that the next token does match the rule being\nparsed. This process is called <strong>synchronization</strong>.</p>\n<p>To do that, we select some rule in the grammar that will mark the\nsynchronization point. The parser fixes its parsing state by jumping out of any\nnested productions until it gets back to that rule. Then it synchronizes the\ntoken stream by discarding tokens until it reaches one that can appear at that\npoint in the rule.</p>\n<p>Any additional real syntax errors hiding in those discarded tokens aren&rsquo;t\nreported, but it also means that any mistaken cascaded errors that are side\neffects of the initial error aren&rsquo;t <em>falsely</em> reported either, which is a decent\ntrade-off.</p>\n<p>The traditional place in the grammar to synchronize is between statements. 
We\ndon&rsquo;t have those yet, so we won&rsquo;t actually synchronize in this chapter, but\nwe&rsquo;ll get the machinery in place for later.</p>\n<h3><a href=\"#entering-panic-mode\" id=\"entering-panic-mode\"><small>6&#8202;.&#8202;3&#8202;.&#8202;2</small>Entering panic mode</a></h3>\n<p>Back before we went on this side trip around error recovery, we were writing the\ncode to parse a parenthesized expression. After parsing the expression, the\nparser looks for the closing <code>)</code> by calling <code>consume()</code>. Here, finally, is that\nmethod:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>match</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Token</span> <span class=\"i\">consume</span>(<span class=\"t\">TokenType</span> <span class=\"i\">type</span>, <span class=\"t\">String</span> <span class=\"i\">message</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">check</span>(<span class=\"i\">type</span>)) <span class=\"k\">return</span> <span class=\"i\">advance</span>();\n\n    <span class=\"k\">throw</span> <span class=\"i\">error</span>(<span class=\"i\">peek</span>(), <span class=\"i\">message</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>match</em>()</div>\n\n<p>It&rsquo;s similar to <code>match()</code> in that it checks to see if the next token is of the\nexpected type. If so, it consumes the token and everything is groovy. If some\nother token is there, then we&rsquo;ve hit an error. 
We report it by calling this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>previous</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">ParseError</span> <span class=\"i\">error</span>(<span class=\"t\">Token</span> <span class=\"i\">token</span>, <span class=\"t\">String</span> <span class=\"i\">message</span>) {\n    <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">token</span>, <span class=\"i\">message</span>);\n    <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">ParseError</span>();\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>previous</em>()</div>\n\n<p>First, that shows the error to the user by calling:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Lox.java</em><br>\nadd after <em>report</em>()</div>\n<pre>  <span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">error</span>(<span class=\"t\">Token</span> <span class=\"i\">token</span>, <span class=\"t\">String</span> <span class=\"i\">message</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">token</span>.<span class=\"i\">type</span> == <span class=\"t\">TokenType</span>.<span class=\"i\">EOF</span>) {\n      <span class=\"i\">report</span>(<span class=\"i\">token</span>.<span class=\"i\">line</span>, <span class=\"s\">&quot; at end&quot;</span>, <span class=\"i\">message</span>);\n    } <span class=\"k\">else</span> {\n      <span class=\"i\">report</span>(<span class=\"i\">token</span>.<span class=\"i\">line</span>, <span class=\"s\">&quot; at &#39;&quot;</span> + <span class=\"i\">token</span>.<span class=\"i\">lexeme</span> + <span class=\"s\">&quot;&#39;&quot;</span>, <span class=\"i\">message</span>);\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, add after <em>report</em>()</div>\n\n<p>This reports an error at a given token. 
It shows the token&rsquo;s location and the\ntoken itself. This will come in handy later since we use tokens throughout the\ninterpreter to track locations in code.</p>\n<p>After we report the error, the user knows about their mistake, but what does the\n<em>parser</em> do next? Back in <code>error()</code>, we create and return a ParseError, an\ninstance of this new class:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">class Parser {\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nnest inside class <em>Parser</em></div>\n<pre class=\"insert\">  <span class=\"k\">private</span> <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">ParseError</span> <span class=\"k\">extends</span> <span class=\"t\">RuntimeException</span> {}\n\n</pre><pre class=\"insert-after\">  private final List&lt;Token&gt; tokens;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, nest inside class <em>Parser</em></div>\n\n<p>This is a simple sentinel class we use to unwind the parser. The <code>error()</code>\nmethod <em>returns</em> the error instead of <em>throwing</em> it because we want to let the\ncalling method inside the parser decide whether to unwind or not. Some parse\nerrors occur in places where the parser isn&rsquo;t likely to get into a weird state\nand we don&rsquo;t need to <span name=\"production\">synchronize</span>. In those\nplaces, we simply report the error and keep on truckin&rsquo;.</p>\n<p>For example, Lox limits the number of arguments you can pass to a function. If\nyou pass too many, the parser needs to report that error, but it can and should\nsimply keep on parsing the extra arguments instead of freaking out and going\ninto panic mode.</p>\n<aside name=\"production\">\n<p>Another way to handle common syntax errors is with <strong>error productions</strong>. You\naugment the grammar with a rule that <em>successfully</em> matches the <em>erroneous</em>\nsyntax. 
The parser safely parses it but then reports it as an error instead of\nproducing a syntax tree.</p>\n<p>For example, some languages have a unary <code>+</code> operator, like <code>+123</code>, but Lox does\nnot. Instead of getting confused when the parser stumbles onto a <code>+</code> at the\nbeginning of an expression, we could extend the unary rule to allow it.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">unary</span> → ( <span class=\"s\">&quot;!&quot;</span> | <span class=\"s\">&quot;-&quot;</span> | <span class=\"s\">&quot;+&quot;</span> ) <span class=\"i\">unary</span>\n      | <span class=\"i\">primary</span> ;\n</pre></div>\n<p>This lets the parser consume <code>+</code> without going into panic mode or leaving the\nparser in a weird state.</p>\n<p>Error productions work well because you, the parser author, know <em>how</em> the code\nis wrong and what the user was likely trying to do. That means you can give a\nmore helpful message to get the user back on track, like, &ldquo;Unary &lsquo;+&rsquo; expressions\nare not supported.&rdquo; Mature parsers tend to accumulate error productions like\nbarnacles since they help users fix common mistakes.</p>\n</aside>\n<p>In our case, though, the syntax error is nasty enough that we want to panic and\nsynchronize. Discarding tokens is pretty easy, but how do we synchronize the\nparser&rsquo;s own state?</p>\n<h3><a href=\"#synchronizing-a-recursive-descent-parser\" id=\"synchronizing-a-recursive-descent-parser\"><small>6&#8202;.&#8202;3&#8202;.&#8202;3</small>Synchronizing a recursive descent parser</a></h3>\n<p>With recursive descent, the parser&rsquo;s state<span class=\"em\">&mdash;</span>which rules it is in the middle of\nrecognizing<span class=\"em\">&mdash;</span>is not stored explicitly in fields. Instead, we use Java&rsquo;s\nown call stack to track what the parser is doing. Each rule in the middle of\nbeing parsed is a call frame on the stack. 
In order to reset that state, we need\nto clear out those call frames.</p>\n<p>The natural way to do that in Java is exceptions. When we want to synchronize,\nwe <em>throw</em> that ParseError object. Higher up in the method for the grammar rule\nwe are synchronizing to, we&rsquo;ll catch it. Since we synchronize on statement\nboundaries, we&rsquo;ll catch the exception there. After the exception is caught, the\nparser is in the right state. All that&rsquo;s left is to synchronize the tokens.</p>\n<p>We want to discard tokens until we&rsquo;re right at the beginning of the next\nstatement. That boundary is pretty easy to spot<span class=\"em\">&mdash;</span>it&rsquo;s one of the main reasons\nwe picked it. <em>After</em> a semicolon, we&rsquo;re <span name=\"semicolon\">probably</span>\nfinished with a statement. Most statements start with a keyword<span class=\"em\">&mdash;</span><code>for</code>, <code>if</code>,\n<code>return</code>, <code>var</code>, etc. When the <em>next</em> token is any of those, we&rsquo;re probably\nabout to start a statement.</p>\n<aside name=\"semicolon\">\n<p>I say &ldquo;probably&rdquo; because we could hit a semicolon separating clauses in a <code>for</code>\nloop. Our synchronization isn&rsquo;t perfect, but that&rsquo;s OK. 
We&rsquo;ve already reported\nthe first error precisely, so everything after that is kind of &ldquo;best effort&rdquo;.</p>\n</aside>\n<p>This method encapsulates that logic:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>error</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">synchronize</span>() {\n    <span class=\"i\">advance</span>();\n\n    <span class=\"k\">while</span> (!<span class=\"i\">isAtEnd</span>()) {\n      <span class=\"k\">if</span> (<span class=\"i\">previous</span>().<span class=\"i\">type</span> == <span class=\"i\">SEMICOLON</span>) <span class=\"k\">return</span>;\n\n      <span class=\"k\">switch</span> (<span class=\"i\">peek</span>().<span class=\"i\">type</span>) {\n        <span class=\"k\">case</span> <span class=\"i\">CLASS</span>:\n        <span class=\"k\">case</span> <span class=\"i\">FUN</span>:\n        <span class=\"k\">case</span> <span class=\"i\">VAR</span>:\n        <span class=\"k\">case</span> <span class=\"i\">FOR</span>:\n        <span class=\"k\">case</span> <span class=\"i\">IF</span>:\n        <span class=\"k\">case</span> <span class=\"i\">WHILE</span>:\n        <span class=\"k\">case</span> <span class=\"i\">PRINT</span>:\n        <span class=\"k\">case</span> <span class=\"i\">RETURN</span>:\n          <span class=\"k\">return</span>;\n      }\n\n      <span class=\"i\">advance</span>();\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>error</em>()</div>\n\n<p>It discards tokens until it thinks it has found a statement boundary. 
After\ncatching a ParseError, we&rsquo;ll call this and then we are hopefully back in sync.\nWhen it works well, we have discarded tokens that would have likely caused\ncascaded errors anyway, and now we can parse the rest of the file starting at\nthe next statement.</p>\n<p>Alas, we don&rsquo;t get to see this method in action, since we don&rsquo;t have statements\nyet. We&rsquo;ll get to that <a href=\"statements-and-state.html\">in a couple of chapters</a>. For now, if an\nerror occurs, we&rsquo;ll panic and unwind all the way to the top and stop parsing.\nSince we can parse only a single expression anyway, that&rsquo;s no big loss.</p>\n<h2><a href=\"#wiring-up-the-parser\" id=\"wiring-up-the-parser\"><small>6&#8202;.&#8202;4</small>Wiring up the Parser</a></h2>\n<p>We are mostly done parsing expressions now. There is one other place where we\nneed to add a little error handling. As the parser descends through the parsing\nmethods for each grammar rule, it eventually hits <code>primary()</code>. If none of the\ncases in there match, it means we are sitting on a token that can&rsquo;t start an\nexpression. We need to handle that error too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    if (match(LEFT_PAREN)) {\n      Expr expr = expression();\n      consume(RIGHT_PAREN, &quot;Expect ')' after expression.&quot;);\n      return new Expr.Grouping(expr);\n    }\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>primary</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">throw</span> <span class=\"i\">error</span>(<span class=\"i\">peek</span>(), <span class=\"s\">&quot;Expect expression.&quot;</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>primary</em>()</div>\n\n<p>With that, all that remains in the parser is to define an initial method to kick\nit off. 
That method is called, naturally enough, <code>parse()</code>.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>Parser</em>()</div>\n<pre>  <span class=\"t\">Expr</span> <span class=\"i\">parse</span>() {\n    <span class=\"k\">try</span> {\n      <span class=\"k\">return</span> <span class=\"i\">expression</span>();\n    } <span class=\"k\">catch</span> (<span class=\"t\">ParseError</span> <span class=\"i\">error</span>) {\n      <span class=\"k\">return</span> <span class=\"k\">null</span>;\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>Parser</em>()</div>\n\n<p>We&rsquo;ll revisit this method later when we add statements to the language. For now,\nit parses a single expression and returns it. We also have some temporary code\nto exit out of panic mode. Syntax error recovery is the parser&rsquo;s job, so we\ndon&rsquo;t want the ParseError exception to escape into the rest of the interpreter.</p>\n<p>When a syntax error does occur, this method returns <code>null</code>. That&rsquo;s OK. The\nparser promises not to crash or hang on invalid syntax, but it doesn&rsquo;t promise\nto return a <em>usable syntax tree</em> if an error is found. As soon as the parser\nreports an error, <code>hadError</code> gets set, and subsequent phases are skipped.</p>\n<p>Finally, we can hook up our brand new parser to the main Lox class and try it\nout. 
We still don&rsquo;t have an interpreter, so for now, we&rsquo;ll parse to a syntax\ntree and then use the AstPrinter class from the <a href=\"representing-code.html#a-not-very-pretty-printer\">last chapter</a> to\ndisplay it.</p>\n<p>Delete the old code to print the scanned tokens and replace it with this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    List&lt;Token&gt; tokens = scanner.scanTokens();\n</pre><div class=\"source-file\"><em>lox/Lox.java</em><br>\nin <em>run</em>()<br>\nreplace 5 lines</div>\n<pre class=\"insert\">    <span class=\"t\">Parser</span> <span class=\"i\">parser</span> = <span class=\"k\">new</span> <span class=\"t\">Parser</span>(<span class=\"i\">tokens</span>);\n    <span class=\"t\">Expr</span> <span class=\"i\">expression</span> = <span class=\"i\">parser</span>.<span class=\"i\">parse</span>();\n\n    <span class=\"c\">// Stop if there was a syntax error.</span>\n    <span class=\"k\">if</span> (<span class=\"i\">hadError</span>) <span class=\"k\">return</span>;\n\n    <span class=\"t\">System</span>.<span class=\"i\">out</span>.<span class=\"i\">println</span>(<span class=\"k\">new</span> <span class=\"t\">AstPrinter</span>().<span class=\"i\">print</span>(<span class=\"i\">expression</span>));\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, in <em>run</em>(), replace 5 lines</div>\n\n<p>Congratulations, you have crossed the <span name=\"harder\">threshold</span>! That\nreally is all there is to handwriting a parser. We&rsquo;ll extend the grammar in\nlater chapters with assignment, statements, and other stuff, but none of that is\nany more complex than the binary operators we tackled here.</p>\n<aside name=\"harder\">\n<p>It is possible to define a more complex grammar than Lox&rsquo;s that&rsquo;s difficult to\nparse using recursive descent. 
Predictive parsing gets tricky when you may need\nto look ahead a large number of tokens to figure out what you&rsquo;re sitting on.</p>\n<p>In practice, most languages are designed to avoid that. Even in cases where they\naren&rsquo;t, you can usually hack around it without too much pain. If you can parse\nC++ using recursive descent<span class=\"em\">&mdash;</span>which many C++ compilers do<span class=\"em\">&mdash;</span>you can parse\nanything.</p>\n</aside>\n<p>Fire up the interpreter and type in some expressions. See how it handles\nprecedence and associativity correctly? Not bad for less than 200 lines of code.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>In C, a block is a statement form that allows you to pack a series of\nstatements where a single one is expected. The <a href=\"https://en.wikipedia.org/wiki/Comma_operator\">comma operator</a> is an\nanalogous syntax for expressions. A comma-separated series of expressions\ncan be given where a single expression is expected (except inside a function\ncall&rsquo;s argument list). At runtime, the comma operator evaluates the left\noperand and discards the result. Then it evaluates and returns the right\noperand.</p>\n<p>Add support for comma expressions. Give them the same precedence and\nassociativity as in C. Write the grammar, and then implement the necessary\nparsing code.</p>\n</li>\n<li>\n<p>Likewise, add support for the C-style conditional or &ldquo;ternary&rdquo; operator\n<code>?:</code>. What precedence level is allowed between the <code>?</code> and <code>:</code>? Is the whole\noperator left-associative or right-associative?</p>\n</li>\n<li>\n<p>Add error productions to handle each binary operator appearing without a\nleft-hand operand. In other words, detect a binary operator appearing at the\nbeginning of an expression. 
Report that as an error, but also parse and\ndiscard a right-hand operand with the appropriate precedence.</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: Logic Versus History</a></h2>\n<p>Let&rsquo;s say we decide to add bitwise <code>&amp;</code> and <code>|</code> operators to Lox. Where should we\nput them in the precedence hierarchy? C<span class=\"em\">&mdash;</span>and most languages that follow in C&rsquo;s\nfootsteps<span class=\"em\">&mdash;</span>place them below <code>==</code>. This is widely considered a mistake because\nit means common operations like testing a flag require parentheses.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">if</span> (<span class=\"i\">flags</span> &amp; <span class=\"a\">FLAG_MASK</span> == <span class=\"a\">SOME_FLAG</span>) { ... } <span class=\"c\">// Wrong.</span>\n<span class=\"k\">if</span> ((<span class=\"i\">flags</span> &amp; <span class=\"a\">FLAG_MASK</span>) == <span class=\"a\">SOME_FLAG</span>) { ... } <span class=\"c\">// Right.</span>\n</pre></div>\n<p>Should we fix this for Lox and put bitwise operators higher up the precedence\ntable than C does? There are two strategies we can take.</p>\n<p>You almost never want to use the result of an <code>==</code> expression as the operand to\na bitwise operator. By making bitwise bind tighter, users don&rsquo;t need to\nparenthesize as often. So if we do that, and users assume the precedence is\nchosen logically to minimize parentheses, they&rsquo;re likely to infer it correctly.</p>\n<p>This kind of internal consistency makes the language easier to learn because\nthere are fewer edge cases and exceptions users have to stumble into and then\ncorrect. That&rsquo;s good, because before users can use our language, they have to\nload all of that syntax and semantics into their heads. 
A simpler, more rational\nlanguage <em>makes sense</em>.</p>\n<p>But for many users, there is an even faster shortcut to getting our language&rsquo;s\nideas into their wetware<span class=\"em\">&mdash;</span><em>use concepts they already know</em>. Many newcomers to\nour language will be coming from some other language or languages. If our\nlanguage uses some of the same syntax or semantics as those, there is much less\nfor the user to learn (and <em>unlearn</em>).</p>\n<p>This is particularly helpful with syntax. You may not remember it well today,\nbut way back when you learned your very first programming language, code\nprobably looked alien and unapproachable. Only through painstaking effort did\nyou learn to read and accept it. If you design a novel syntax for your new\nlanguage, you force users to start that process all over again.</p>\n<p>Taking advantage of what users already know is one of the most powerful tools\nyou can use to ease adoption of your language. It&rsquo;s almost impossible to\noverestimate how valuable this is. But it faces you with a nasty problem: What\nhappens when the thing the users all know <em>kind of sucks</em>? C&rsquo;s bitwise operator\nprecedence is a mistake that doesn&rsquo;t make sense. But it&rsquo;s a <em>familiar</em> mistake\nthat millions have already gotten used to and learned to live with.</p>\n<p>Do you stay true to your language&rsquo;s own internal logic and ignore history? Do\nyou start from a blank slate and first principles? Or do you weave your language\ninto the rich tapestry of programming history and give your users a leg up by\nstarting from something they already know?</p>\n<p>There is no perfect answer here, only trade-offs. 
You and I are obviously biased\ntowards liking novel languages, so our natural inclination is to burn the\nhistory books and start our own story.</p>\n<p>In practice, it&rsquo;s often better to make the most of what users already know.\nGetting them to come to your language requires a big leap. The smaller you can\nmake that chasm, the more people will be willing to cross it. But you can&rsquo;t\n<em>always</em> stick to history, or your language won&rsquo;t have anything new and\ncompelling to give people a <em>reason</em> to jump over.</p>\n</div>\n\n<footer>\n<a href=\"evaluating-expressions.html\" class=\"next\">\n  Next Chapter: &ldquo;Evaluating Expressions&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/representing-code.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Representing Code &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Representing Code<small>5</small></a></h3>\n\n<ul>\n    <li><a href=\"#context-free-grammars\"><small>5.1</small> Context-Free Grammars</a></li>\n    <li><a href=\"#implementing-syntax-trees\"><small>5.2</small> Implementing Syntax Trees</a></li>\n    <li><a href=\"#working-with-trees\"><small>5.3</small> Working with Trees</a></li>\n    <li><a href=\"#a-not-very-pretty-printer\"><small>5.4</small> A (Not Very) Pretty Printer</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a 
href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"scanning.html\" title=\"Scanning\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"parsing-expressions.html\" title=\"Parsing Expressions\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"scanning.html\" title=\"Scanning\" class=\"prev\">←</a>\n<a href=\"parsing-expressions.html\" title=\"Parsing Expressions\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Representing Code<small>5</small></a></h3>\n\n<ul>\n    <li><a href=\"#context-free-grammars\"><small>5.1</small> Context-Free Grammars</a></li>\n    <li><a href=\"#implementing-syntax-trees\"><small>5.2</small> Implementing Syntax Trees</a></li>\n    <li><a href=\"#working-with-trees\"><small>5.3</small> Working with Trees</a></li>\n    <li><a href=\"#a-not-very-pretty-printer\"><small>5.4</small> A (Not Very) Pretty Printer</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"scanning.html\" title=\"Scanning\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"parsing-expressions.html\" title=\"Parsing Expressions\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">5</div>\n  <h1>Representing Code</h1>\n\n<blockquote>\n<p>To dwellers in a wood, almost every species of 
tree has its voice as well as\nits feature.\n<cite>Thomas Hardy, <em>Under the Greenwood Tree</em></cite></p>\n</blockquote>\n<p>In the <a href=\"scanning.html\">last chapter</a>, we took the raw source code as a string and\ntransformed it into a slightly higher-level representation: a series of tokens.\nThe parser we&rsquo;ll write in the <a href=\"parsing-expressions.html\">next chapter</a> takes those tokens and\ntransforms them yet again, into an even richer, more complex representation.</p>\n<p>Before we can produce that representation, we need to define it. That&rsquo;s the\nsubject of this chapter. Along the way, we&rsquo;ll <span name=\"boring\">cover</span>\nsome theory around formal grammars, feel the difference between functional and\nobject-oriented programming, go over a couple of design patterns, and do some\nmetaprogramming.</p>\n<aside name=\"boring\">\n<p>I was so worried about this being one of the most boring chapters in the book\nthat I kept stuffing more fun ideas into it until I ran out of room.</p>\n</aside>\n<p>Before we do all that, let&rsquo;s focus on the main goal<span class=\"em\">&mdash;</span>a representation for\ncode. It should be simple for the parser to produce and easy for the\ninterpreter to consume. If you haven&rsquo;t written a parser or interpreter yet,\nthose requirements aren&rsquo;t exactly illuminating. Maybe your intuition can help.\nWhat is your brain doing when you play the part of a <em>human</em> interpreter? 
How do\nyou mentally evaluate an arithmetic expression like this:</p>\n<div class=\"codehilite\"><pre><span class=\"n\">1</span> + <span class=\"n\">2</span> * <span class=\"n\">3</span> - <span class=\"n\">4</span>\n</pre></div>\n<p>Because you understand the order of operations<span class=\"em\">&mdash;</span>the old &ldquo;<a href=\"https://en.wikipedia.org/wiki/Order_of_operations#Mnemonics\">Please Excuse My\nDear Aunt Sally</a>&rdquo; stuff<span class=\"em\">&mdash;</span>you know that the multiplication is evaluated\nbefore the addition or subtraction. One way to visualize that precedence is\nusing a tree. Leaf nodes are numbers, and interior nodes are operators with\nbranches for each of their operands.</p>\n<p>In order to evaluate an arithmetic node, you need to know the numeric values of\nits subtrees, so you have to evaluate those first. That means working your way\nfrom the leaves up to the root<span class=\"em\">&mdash;</span>a <em>post-order</em> traversal:</p>\n<p><span name=\"tree-steps\"></span></p><img src=\"image/representing-code/tree-evaluate.png\" alt=\"Evaluating the tree from the bottom up.\" />\n<aside name=\"tree-steps\">\n<p>A. Starting with the full tree, evaluate the bottom-most operation, <code>2 * 3</code>.</p>\n<p>B. Now we can evaluate the <code>+</code>.</p>\n<p>C. Next, the <code>-</code>.</p>\n<p>D. The final answer.</p>\n</aside>\n<p>If I gave you an arithmetic expression, you could draw one of these trees pretty\neasily. Given a tree, you can evaluate it without breaking a sweat. So it\nintuitively seems like a workable representation of our code is a <span\nname=\"only\">tree</span> that matches the grammatical structure<span class=\"em\">&mdash;</span>the operator\nnesting<span class=\"em\">&mdash;</span>of the language.</p>\n<aside name=\"only\">\n<p>That&rsquo;s not to say a tree is the <em>only</em> possible representation of our code. 
In\n<a href=\"a-bytecode-virtual-machine.html\">Part III</a>, we&rsquo;ll generate bytecode, another representation that isn&rsquo;t as\nhuman friendly but is closer to the machine.</p>\n</aside>\n<p>We need to get more precise about what that grammar is then. Like lexical\ngrammars in the last chapter, there is a long ton of theory around syntactic\ngrammars. We&rsquo;re going into that theory a little more than we did when scanning\nbecause it turns out to be a useful tool throughout much of the interpreter.\nWe start by moving one level up the <a href=\"https://en.wikipedia.org/wiki/Chomsky_hierarchy\">Chomsky hierarchy</a><span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<h2><a href=\"#context-free-grammars\" id=\"context-free-grammars\"><small>5&#8202;.&#8202;1</small>Context-Free Grammars</a></h2>\n<p>In the last chapter, the formalism we used for defining the lexical grammar<span class=\"em\">&mdash;</span>the rules for how characters get grouped into tokens<span class=\"em\">&mdash;</span>was called a <em>regular\nlanguage</em>. That was fine for our scanner, which emits a flat sequence of tokens.\nBut regular languages aren&rsquo;t powerful enough to handle expressions which can\nnest arbitrarily deeply.</p>\n<p>We need a bigger hammer, and that hammer is a <strong>context-free grammar</strong>\n(<strong>CFG</strong>). It&rsquo;s the next heaviest tool in the toolbox of\n<strong><a href=\"https://en.wikipedia.org/wiki/Formal_grammar\">formal grammars</a></strong>. A formal grammar takes a set of atomic pieces it calls\nits &ldquo;alphabet&rdquo;. Then it defines a (usually infinite) set of &ldquo;strings&rdquo; that are\n&ldquo;in&rdquo; the grammar. Each string is a sequence of &ldquo;letters&rdquo; in the alphabet.</p>\n<p>I&rsquo;m using all those quotes because the terms get a little confusing as you move\nfrom lexical to syntactic grammars. 
In our scanner&rsquo;s grammar, the alphabet\nconsists of individual characters and the strings are the valid lexemes<span class=\"em\">&mdash;</span>roughly &ldquo;words&rdquo;. In the syntactic grammar we&rsquo;re talking about now, we&rsquo;re at a\ndifferent level of granularity. Now each &ldquo;letter&rdquo; in the alphabet is an entire\ntoken and a &ldquo;string&rdquo; is a sequence of <em>tokens</em><span class=\"em\">&mdash;</span>an entire expression.</p>\n<p>Oof. Maybe a table will help:</p><table>\n<thead>\n<tr>\n  <td>Terminology</td>\n  <td></td>\n  <td>Lexical grammar</td>\n  <td>Syntactic grammar</td>\n</tr>\n</thead>\n<tbody>\n<tr>\n  <td>The &ldquo;alphabet&rdquo; is<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.</span></td>\n  <td>&rarr;&ensp;</td>\n  <td>Characters</td>\n  <td>Tokens</td>\n</tr>\n<tr>\n  <td>A &ldquo;string&rdquo; is<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.</span></td>\n  <td>&rarr;&ensp;</td>\n  <td>Lexeme or token</td>\n  <td>Expression</td>\n</tr>\n<tr>\n  <td>It&rsquo;s implemented by the<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.</span></td>\n  <td>&rarr;&ensp;</td>\n  <td>Scanner</td>\n  <td>Parser</td>\n</tr>\n</tbody>\n</table>\n<p>A formal grammar&rsquo;s job is to specify which strings are valid and which aren&rsquo;t.\nIf we were defining a grammar for English sentences, &ldquo;eggs are tasty for\nbreakfast&rdquo; would be in the grammar, but &ldquo;tasty breakfast for are eggs&rdquo; would\nprobably not.</p>\n<h3><a href=\"#rules-for-grammars\" id=\"rules-for-grammars\"><small>5&#8202;.&#8202;1&#8202;.&#8202;1</small>Rules for grammars</a></h3>\n<p>How do we write down a grammar that contains an infinite number of valid\nstrings? We obviously can&rsquo;t list them all out. Instead, we create a finite set\nof rules. 
You can think of them as a game that you can &ldquo;play&rdquo; in one of two\ndirections.</p>\n<p>If you start with the rules, you can use them to <em>generate</em> strings that are in\nthe grammar. Strings created this way are called <strong>derivations</strong> because each is\n<em>derived</em> from the rules of the grammar. In each step of the game, you pick a\nrule and follow what it tells you to do. Most of the lingo around formal\ngrammars comes from playing them in this direction. Rules are called\n<strong>productions</strong> because they <em>produce</em> strings in the grammar.</p>\n<p>Each production in a context-free grammar has a <strong>head</strong><span class=\"em\">&mdash;</span>its <span\nname=\"name\">name</span><span class=\"em\">&mdash;</span>and a <strong>body</strong>, which describes what it generates. In\nits pure form, the body is simply a list of symbols. Symbols come in two\ndelectable flavors:</p>\n<aside name=\"name\">\n<p>Restricting heads to a single symbol is a defining feature of context-free\ngrammars. More powerful formalisms like <strong><a href=\"https://en.wikipedia.org/wiki/Unrestricted_grammar\">unrestricted grammars</a></strong> allow a\nsequence of symbols in the head as well as in the body.</p>\n</aside>\n<ul>\n<li>\n<p>A <strong>terminal</strong> is a letter from the grammar&rsquo;s alphabet. You can think of it\nlike a literal value. In the syntactic grammar we&rsquo;re defining, the terminals\nare individual lexemes<span class=\"em\">&mdash;</span>tokens coming from the scanner like <code>if</code> or\n<code>1234</code>.</p>\n<p>These are called &ldquo;terminals&rdquo;, in the sense of an &ldquo;end point&rdquo; because they\ndon&rsquo;t lead to any further &ldquo;moves&rdquo; in the game. You simply produce that one\nsymbol.</p>\n</li>\n<li>\n<p>A <strong>nonterminal</strong> is a named reference to another rule in the grammar. It\nmeans &ldquo;play that rule and insert whatever it produces here&rdquo;. 
In this way,\nthe grammar composes.</p>\n</li>\n</ul>\n<p>There is one last refinement: you may have multiple rules with the same name.\nWhen you reach a nonterminal with that name, you are allowed to pick any of the\nrules for it, whichever floats your boat.</p>\n<p>To make this concrete, we need a <span name=\"turtles\">way</span> to write down\nthese production rules. People have been trying to crystallize grammar all the\nway back to Pāṇini&rsquo;s <em>Ashtadhyayi</em>, which codified Sanskrit grammar a mere\ncouple thousand years ago. Not much progress happened until John Backus and\ncompany needed a notation for specifying ALGOL 58 and came up with\n<a href=\"https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form\"><strong>Backus-Naur form</strong></a> (<strong>BNF</strong>). Since then, nearly everyone uses some\nflavor of BNF, tweaked to their own tastes.</p>\n<p>I tried to come up with something clean. Each rule is a name, followed by an\narrow (<code>→</code>), followed by a sequence of symbols, and finally ending with a\nsemicolon (<code>;</code>). Terminals are quoted strings, and nonterminals are lowercase\nwords.</p>\n<aside name=\"turtles\">\n<p>Yes, we need to define a syntax to use for the rules that define our syntax.\nShould we specify that <em>metasyntax</em> too? What notation do we use for <em>it?</em> It&rsquo;s\nlanguages all the way down!</p>\n</aside>\n<p>Using that, here&rsquo;s a grammar for <span name=\"breakfast\">breakfast</span> menus:</p>\n<aside name=\"breakfast\">\n<p>Yes, I really am going to be using breakfast examples throughout this entire\nbook. 
Sorry.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"i\">breakfast</span>  → <span class=\"i\">protein</span> <span class=\"s\">&quot;with&quot;</span> <span class=\"i\">breakfast</span> <span class=\"s\">&quot;on the side&quot;</span> ;\n<span class=\"i\">breakfast</span>  → <span class=\"i\">protein</span> ;\n<span class=\"i\">breakfast</span>  → <span class=\"i\">bread</span> ;\n\n<span class=\"i\">protein</span>    → <span class=\"i\">crispiness</span> <span class=\"s\">&quot;crispy&quot;</span> <span class=\"s\">&quot;bacon&quot;</span> ;\n<span class=\"i\">protein</span>    → <span class=\"s\">&quot;sausage&quot;</span> ;\n<span class=\"i\">protein</span>    → <span class=\"i\">cooked</span> <span class=\"s\">&quot;eggs&quot;</span> ;\n\n<span class=\"i\">crispiness</span> → <span class=\"s\">&quot;really&quot;</span> ;\n<span class=\"i\">crispiness</span> → <span class=\"s\">&quot;really&quot;</span> <span class=\"i\">crispiness</span> ;\n\n<span class=\"i\">cooked</span>     → <span class=\"s\">&quot;scrambled&quot;</span> ;\n<span class=\"i\">cooked</span>     → <span class=\"s\">&quot;poached&quot;</span> ;\n<span class=\"i\">cooked</span>     → <span class=\"s\">&quot;fried&quot;</span> ;\n\n<span class=\"i\">bread</span>      → <span class=\"s\">&quot;toast&quot;</span> ;\n<span class=\"i\">bread</span>      → <span class=\"s\">&quot;biscuits&quot;</span> ;\n<span class=\"i\">bread</span>      → <span class=\"s\">&quot;English muffin&quot;</span> ;\n</pre></div>\n<p>We can use this grammar to generate random breakfasts. Let&rsquo;s play a round and\nsee how it works. By age-old convention, the game starts with the first rule in\nthe grammar, here <code>breakfast</code>. There are three productions for that, and we\nrandomly pick the first one. 
Our resulting string looks like:</p>\n<div class=\"codehilite\"><pre>protein &quot;with&quot; breakfast &quot;on the side&quot;\n</pre></div>\n<p>We need to expand that first nonterminal, <code>protein</code>, so we pick a production for\nthat. Let&rsquo;s pick:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">protein</span> → <span class=\"i\">cooked</span> <span class=\"s\">&quot;eggs&quot;</span> ;\n</pre></div>\n<p>Next, we need a production for <code>cooked</code>, and so we pick <code>\"poached\"</code>. That&rsquo;s a\nterminal, so we add that. Now our string looks like:</p>\n<div class=\"codehilite\"><pre>&quot;poached&quot; &quot;eggs&quot; &quot;with&quot; breakfast &quot;on the side&quot;\n</pre></div>\n<p>The next nonterminal is <code>breakfast</code> again. The first <code>breakfast</code> production we\nchose recursively refers back to the <code>breakfast</code> rule. Recursion in the grammar\nis a good sign that the language being defined is context-free instead of\nregular. In particular, recursion where the recursive nonterminal has\nproductions on <span name=\"nest\">both</span> sides typically means that the language is\nnot regular.</p>\n<aside name=\"nest\">\n<p>Imagine that we&rsquo;ve recursively expanded the <code>breakfast</code> rule here several times,\nlike &ldquo;bacon with bacon with bacon with<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>&rdquo; In order to complete the string\ncorrectly, we need to add an <em>equal</em> number of &ldquo;on the side&rdquo; bits to the end.\nTracking the number of required trailing parts is beyond the capabilities of a\nregular grammar. 
Regular grammars can express <em>repetition</em>, but they can&rsquo;t <em>keep\ncount</em> of how many repetitions there are, which is necessary to ensure that the\nstring has the same number of <code>with</code> and <code>on the side</code> parts.</p>\n</aside>\n<p>We could keep picking the first production for <code>breakfast</code> over and over again\nyielding all manner of breakfasts like &ldquo;bacon with sausage with scrambled eggs\nwith bacon<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>&rdquo; We won&rsquo;t though. This time we&rsquo;ll pick <code>bread</code>. There are three\nrules for that, each of which contains only a terminal. We&rsquo;ll pick &ldquo;English\nmuffin&rdquo;.</p>\n<p>With that, every nonterminal in the string has been expanded until it finally\ncontains only terminals and we&rsquo;re left with:</p><img src=\"image/representing-code/breakfast.png\" alt='\"Playing\" the grammar to generate a string.' />\n<p>Throw in some ham and Hollandaise, and you&rsquo;ve got eggs Benedict.</p>\n<p>Any time we hit a rule that had multiple productions, we just picked one\narbitrarily. It is this flexibility that allows a small number of grammar rules\nto encode a combinatorially larger set of strings. The fact that a rule can\nrefer to itself<span class=\"em\">&mdash;</span>directly or indirectly<span class=\"em\">&mdash;</span>kicks it up even more, letting us\npack an infinite number of strings into a finite grammar.</p>\n<h3><a href=\"#enhancing-our-notation\" id=\"enhancing-our-notation\"><small>5&#8202;.&#8202;1&#8202;.&#8202;2</small>Enhancing our notation</a></h3>\n<p>Stuffing an infinite set of strings in a handful of rules is pretty fantastic,\nbut let&rsquo;s take it further. Our notation works, but it&rsquo;s tedious. So, like any\ngood language designer, we&rsquo;ll sprinkle a little syntactic sugar on top<span class=\"em\">&mdash;</span>some\nextra convenience notation. 
In addition to terminals and nonterminals, we&rsquo;ll\nallow a few other kinds of expressions in the body of a rule:</p>\n<ul>\n<li>\n<p>Instead of repeating the rule name each time we want to add another\nproduction for it, we&rsquo;ll allow a series of productions separated by a pipe\n(<code>|</code>).</p>\n<div class=\"codehilite\"><pre><span class=\"i\">bread</span> → <span class=\"s\">&quot;toast&quot;</span> | <span class=\"s\">&quot;biscuits&quot;</span> | <span class=\"s\">&quot;English muffin&quot;</span> ;\n</pre></div>\n</li>\n<li>\n<p>Further, we&rsquo;ll allow parentheses for grouping and then allow <code>|</code> within that\nto select one from a series of options within the middle of a production.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">protein</span> → ( <span class=\"s\">&quot;scrambled&quot;</span> | <span class=\"s\">&quot;poached&quot;</span> | <span class=\"s\">&quot;fried&quot;</span> ) <span class=\"s\">&quot;eggs&quot;</span> ;\n</pre></div>\n</li>\n<li>\n<p>Using recursion to support repeated sequences of symbols has a certain\nappealing <span name=\"purity\">purity</span>, but it&rsquo;s kind of a chore to\nmake a separate named sub-rule each time we want to loop. So, we also use a\npostfix <code>*</code> to allow the previous symbol or group to be repeated zero or\nmore times.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">crispiness</span> → <span class=\"s\">&quot;really&quot;</span> <span class=\"s\">&quot;really&quot;</span>* ;\n</pre></div>\n</li>\n</ul>\n<aside name=\"purity\">\n<p>This is how the Scheme programming language works. It has no built-in looping\nfunctionality at all. 
Instead, <em>all</em> repetition is expressed in terms of\nrecursion.</p>\n</aside>\n<ul>\n<li>\n<p>A postfix <code>+</code> is similar, but requires the preceding production to appear\nat least once.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">crispiness</span> → <span class=\"s\">&quot;really&quot;</span>+ ;\n</pre></div>\n</li>\n<li>\n<p>A postfix <code>?</code> is for an optional production. The thing before it can appear\nzero or one time, but not more.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">breakfast</span> → <span class=\"i\">protein</span> ( <span class=\"s\">&quot;with&quot;</span> <span class=\"i\">breakfast</span> <span class=\"s\">&quot;on the side&quot;</span> )? ;\n</pre></div>\n</li>\n</ul>\n<p>With all of those syntactic niceties, our breakfast grammar condenses down to:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">breakfast</span> → <span class=\"i\">protein</span> ( <span class=\"s\">&quot;with&quot;</span> <span class=\"i\">breakfast</span> <span class=\"s\">&quot;on the side&quot;</span> )?\n          | <span class=\"i\">bread</span> ;\n\n<span class=\"i\">protein</span>   → <span class=\"s\">&quot;really&quot;</span>+ <span class=\"s\">&quot;crispy&quot;</span> <span class=\"s\">&quot;bacon&quot;</span>\n          | <span class=\"s\">&quot;sausage&quot;</span>\n          | ( <span class=\"s\">&quot;scrambled&quot;</span> | <span class=\"s\">&quot;poached&quot;</span> | <span class=\"s\">&quot;fried&quot;</span> ) <span class=\"s\">&quot;eggs&quot;</span> ;\n\n<span class=\"i\">bread</span>     → <span class=\"s\">&quot;toast&quot;</span> | <span class=\"s\">&quot;biscuits&quot;</span> | <span class=\"s\">&quot;English muffin&quot;</span> ;\n</pre></div>\n<p>Not too bad, I hope. If you&rsquo;re used to grep or using <a href=\"https://en.wikipedia.org/wiki/Regular_expression#Standards\">regular\nexpressions</a> in your text editor, most of the punctuation should be\nfamiliar. 
The main difference is that symbols here represent entire tokens, not\nsingle characters.</p>\n<p>We&rsquo;ll use this notation throughout the rest of the book to precisely describe\nLox&rsquo;s grammar. As you work on programming languages, you&rsquo;ll find that\ncontext-free grammars (using this or <a href=\"https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form\">EBNF</a> or some other notation) help you\ncrystallize your informal syntax design ideas. They are also a handy medium for\ncommunicating with other language hackers about syntax.</p>\n<p>The rules and productions we define for Lox are also our guide to the tree data\nstructure we&rsquo;re going to implement to represent code in memory. Before we can do\nthat, we need an actual grammar for Lox, or at least enough of one for us to get\nstarted.</p>\n<h3><a href=\"#a-grammar-for-lox-expressions\" id=\"a-grammar-for-lox-expressions\"><small>5&#8202;.&#8202;1&#8202;.&#8202;3</small>A Grammar for Lox expressions</a></h3>\n<p>In the previous chapter, we did Lox&rsquo;s entire lexical grammar in one fell swoop.\nEvery keyword and bit of punctuation is there. The syntactic grammar is larger,\nand it would be a real bore to grind through the entire thing before we actually\nget our interpreter up and running.</p>\n<p>Instead, we&rsquo;ll crank through a subset of the language in the next couple of\nchapters. Once we have that mini-language represented, parsed, and interpreted,\nthen later chapters will progressively add new features to it, including the new\nsyntax. 
For now, we are going to worry about only a handful of expressions:</p>\n<ul>\n<li>\n<p><strong>Literals.</strong> Numbers, strings, Booleans, and <code>nil</code>.</p>\n</li>\n<li>\n<p><strong>Unary expressions.</strong> A prefix <code>!</code> to perform a logical not, and <code>-</code> to\nnegate a number.</p>\n</li>\n<li>\n<p><strong>Binary expressions.</strong> The infix arithmetic (<code>+</code>, <code>-</code>, <code>*</code>, <code>/</code>) and logic\noperators (<code>==</code>, <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>, <code>&gt;</code>, <code>&gt;=</code>) we know and love.</p>\n</li>\n<li>\n<p><strong>Parentheses.</strong> A pair of <code>(</code> and <code>)</code> wrapped around an expression.</p>\n</li>\n</ul>\n<p>That gives us enough syntax for expressions like:</p>\n<div class=\"codehilite\"><pre><span class=\"n\">1</span> - (<span class=\"n\">2</span> * <span class=\"n\">3</span>) &lt; <span class=\"n\">4</span> == <span class=\"k\">false</span>\n</pre></div>\n<p>Using our handy dandy new notation, here&rsquo;s a grammar for those:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">expression</span>     → <span class=\"i\">literal</span>\n               | <span class=\"i\">unary</span>\n               | <span class=\"i\">binary</span>\n               | <span class=\"i\">grouping</span> ;\n\n<span class=\"i\">literal</span>        → <span class=\"t\">NUMBER</span> | <span class=\"t\">STRING</span> | <span class=\"s\">&quot;true&quot;</span> | <span class=\"s\">&quot;false&quot;</span> | <span class=\"s\">&quot;nil&quot;</span> ;\n<span class=\"i\">grouping</span>       → <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;)&quot;</span> ;\n<span class=\"i\">unary</span>          → ( <span class=\"s\">&quot;-&quot;</span> | <span class=\"s\">&quot;!&quot;</span> ) <span class=\"i\">expression</span> ;\n<span class=\"i\">binary</span>         → <span class=\"i\">expression</span> <span 
class=\"i\">operator</span> <span class=\"i\">expression</span> ;\n<span class=\"i\">operator</span>       → <span class=\"s\">&quot;==&quot;</span> | <span class=\"s\">&quot;!=&quot;</span> | <span class=\"s\">&quot;&lt;&quot;</span> | <span class=\"s\">&quot;&lt;=&quot;</span> | <span class=\"s\">&quot;&gt;&quot;</span> | <span class=\"s\">&quot;&gt;=&quot;</span>\n               | <span class=\"s\">&quot;+&quot;</span>  | <span class=\"s\">&quot;-&quot;</span>  | <span class=\"s\">&quot;*&quot;</span> | <span class=\"s\">&quot;/&quot;</span> ;\n</pre></div>\n<p>There&rsquo;s one bit of extra <span name=\"play\">metasyntax</span> here. In addition\nto quoted strings for terminals that match exact lexemes, we <code>CAPITALIZE</code>\nterminals that are a single lexeme whose text representation may vary. <code>NUMBER</code>\nis any number literal, and <code>STRING</code> is any string literal. Later, we&rsquo;ll do the\nsame for <code>IDENTIFIER</code>.</p>\n<p>This grammar is actually ambiguous, which we&rsquo;ll see when we get to parsing it.\nBut it&rsquo;s good enough for now.</p>\n<aside name=\"play\">\n<p>If you&rsquo;re so inclined, try using this grammar to generate a few expressions like\nwe did with the breakfast grammar before. Do the resulting expressions look\nright to you? Can you make it generate anything wrong like <code>1 + / 3</code>?</p>\n</aside>\n<h2><a href=\"#implementing-syntax-trees\" id=\"implementing-syntax-trees\"><small>5&#8202;.&#8202;2</small>Implementing Syntax Trees</a></h2>\n<p>Finally, we get to write some code. That little expression grammar is our\nskeleton. 
Since the grammar is recursive<span class=\"em\">&mdash;</span>note how <code>grouping</code>, <code>unary</code>, and\n<code>binary</code> all refer back to <code>expression</code><span class=\"em\">&mdash;</span>our data structure will form a tree.\nSince this structure represents the syntax of our language, it&rsquo;s called a <span\nname=\"ast\"><strong>syntax tree</strong></span>.</p>\n<aside name=\"ast\">\n<p>In particular, we&rsquo;re defining an <strong>abstract syntax tree</strong> (<strong>AST</strong>). In a\n<strong>parse tree</strong>, every single grammar production becomes a node in the tree. An\nAST elides productions that aren&rsquo;t needed by later phases.</p>\n</aside>\n<p>Our scanner used a single Token class to represent all kinds of lexemes. To\ndistinguish the different kinds<span class=\"em\">&mdash;</span>think the number <code>123</code> versus the string\n<code>\"123\"</code><span class=\"em\">&mdash;</span>we included a simple TokenType enum. Syntax trees are not so <span\nname=\"token-data\">homogeneous</span>. Unary expressions have a single operand,\nbinary expressions have two, and literals have none.</p>\n<p>We <em>could</em> mush that all together into a single Expression class with an\narbitrary list of children. Some compilers do. But I like getting the most out\nof Java&rsquo;s type system. So we&rsquo;ll define a base class for expressions. Then, for\neach kind of expression<span class=\"em\">&mdash;</span>each production under <code>expression</code><span class=\"em\">&mdash;</span>we create a\nsubclass that has fields for the nonterminals specific to that rule. This way,\nwe get a compile error if we, say, try to access the second operand of a unary\nexpression.</p>\n<aside name=\"token-data\">\n<p>Tokens aren&rsquo;t entirely homogeneous either. Tokens for literals store the value,\nbut other kinds of lexemes don&rsquo;t need that state. 
I have seen scanners that use\ndifferent classes for literals and other kinds of lexemes, but I figured I&rsquo;d\nkeep things simpler.</p>\n</aside>\n<p>Something like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">abstract</span> <span class=\"k\">class</span> <span class=\"t\">Expr</span> {<span name=\"expr\"> </span>\n  <span class=\"k\">static</span> <span class=\"k\">class</span> <span class=\"t\">Binary</span> <span class=\"k\">extends</span> <span class=\"t\">Expr</span> {\n    <span class=\"t\">Binary</span>(<span class=\"t\">Expr</span> <span class=\"i\">left</span>, <span class=\"t\">Token</span> <span class=\"i\">operator</span>, <span class=\"t\">Expr</span> <span class=\"i\">right</span>) {\n      <span class=\"k\">this</span>.<span class=\"i\">left</span> = <span class=\"i\">left</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">operator</span> = <span class=\"i\">operator</span>;\n      <span class=\"k\">this</span>.<span class=\"i\">right</span> = <span class=\"i\">right</span>;\n    }\n\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">left</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Token</span> <span class=\"i\">operator</span>;\n    <span class=\"k\">final</span> <span class=\"t\">Expr</span> <span class=\"i\">right</span>;\n  }\n\n  <span class=\"c\">// Other expressions...</span>\n}\n</pre></div>\n<aside name=\"expr\">\n<p>I avoid abbreviations in my code because they trip up a reader who doesn&rsquo;t know\nwhat they stand for. But in compilers I&rsquo;ve looked at, &ldquo;Expr&rdquo; and &ldquo;Stmt&rdquo; are so\nubiquitous that I may as well start getting you used to them now.</p>\n</aside>\n<p>Expr is the base class that all expression classes inherit from. As you can see\nfrom <code>Binary</code>, the subclasses are nested inside of it. 
There&rsquo;s no technical need\nfor this, but it lets us cram all of the classes into a single Java file.</p>\n<h3><a href=\"#disoriented-objects\" id=\"disoriented-objects\"><small>5&#8202;.&#8202;2&#8202;.&#8202;1</small>Disoriented objects</a></h3>\n<p>You&rsquo;ll note that, much like the Token class, there aren&rsquo;t any methods here. It&rsquo;s\na dumb structure. Nicely typed, but merely a bag of data. This feels strange in\nan object-oriented language like Java. Shouldn&rsquo;t the class <em>do stuff</em>?</p>\n<p>The problem is that these tree classes aren&rsquo;t owned by any single domain. Should\nthey have methods for parsing since that&rsquo;s where the trees are created? Or\ninterpreting since that&rsquo;s where they are consumed? Trees span the border between\nthose territories, which means they are really owned by <em>neither</em>.</p>\n<p>In fact, these types exist to enable the parser and interpreter to\n<em>communicate</em>. That lends itself to types that are simply data with no\nassociated behavior. This style is very natural in functional languages like\nLisp and ML where <em>all</em> data is separate from behavior, but it feels odd in\nJava.</p>\n<p>Functional programming aficionados right now are jumping up to exclaim &ldquo;See!\nObject-oriented languages are a bad fit for an interpreter!&rdquo; I won&rsquo;t go that\nfar. You&rsquo;ll recall that the scanner itself was admirably suited to\nobject-orientation. It had all of the mutable state to keep track of where it\nwas in the source code, a well-defined set of public methods, and a handful of\nprivate helpers.</p>\n<p>My feeling is that each phase or part of the interpreter works fine in an\nobject-oriented style. 
It is the data structures that flow between them that are\nstripped of behavior.</p>\n<h3><a href=\"#metaprogramming-the-trees\" id=\"metaprogramming-the-trees\"><small>5&#8202;.&#8202;2&#8202;.&#8202;2</small>Metaprogramming the trees</a></h3>\n<p>Java can express behavior-less classes, but I wouldn&rsquo;t say that it&rsquo;s\nparticularly great at it. Eleven lines of code to stuff three fields in an\nobject is pretty tedious, and when we&rsquo;re all done, we&rsquo;re going to have 21 of\nthese classes.</p>\n<p>I don&rsquo;t want to waste your time or my ink writing all that down. Really, what is\nthe essence of each subclass? A name, and a list of typed fields. That&rsquo;s it.\nWe&rsquo;re smart language hackers, right? Let&rsquo;s <span\nname=\"automate\">automate</span>.</p>\n<aside name=\"automate\">\n<p>Picture me doing an awkward robot dance when you read that. &ldquo;AU-TO-MATE.&rdquo;</p>\n</aside>\n<p>Instead of tediously handwriting each class definition, field declaration,\nconstructor, and initializer, we&rsquo;ll hack together a <span\nname=\"python\">script</span> that does it for us. 
It has a description of each\ntree type<span class=\"em\">&mdash;</span>its name and fields<span class=\"em\">&mdash;</span>and it prints out the Java code needed to\ndefine a class with that name and state.</p>\n<p>This script is a tiny Java command-line app that generates a file named\n&ldquo;Expr.java&rdquo;:</p>\n<aside name=\"python\">\n<p>I got the idea of scripting the syntax tree classes from Jim Hugunin, creator of\nJython and IronPython.</p>\n<p>An actual scripting language would be a better fit for this than Java, but I&rsquo;m\ntrying not to throw too many languages at you.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.tool</span>;\n\n<span class=\"k\">import</span> <span class=\"i\">java.io.IOException</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.io.PrintWriter</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.util.Arrays</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.util.List</span>;\n\n<span class=\"k\">public</span> <span class=\"k\">class</span> <span class=\"t\">GenerateAst</span> {\n  <span class=\"k\">public</span> <span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">main</span>(<span class=\"t\">String</span>[] <span class=\"i\">args</span>) <span class=\"k\">throws</span> <span class=\"t\">IOException</span> {\n    <span class=\"k\">if</span> (<span class=\"i\">args</span>.<span class=\"i\">length</span> != <span class=\"n\">1</span>) {\n      <span class=\"t\">System</span>.<span class=\"i\">err</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;Usage: generate_ast &lt;output directory&gt;&quot;</span>);\n      <span class=\"t\">System</span>.<span class=\"i\">exit</span>(<span class=\"n\">64</span>);\n    }\n    <span class=\"t\">String</span> <span class=\"i\">outputDir</span> = <span 
class=\"i\">args</span>[<span class=\"n\">0</span>];\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, create new file</div>\n\n<p>Note that this file is in a different package, <code>.tool</code> instead of <code>.lox</code>. This\nscript isn&rsquo;t part of the interpreter itself. It&rsquo;s a tool <em>we</em>, the people\nhacking on the interpreter, run ourselves to generate the syntax tree classes.\nWhen it&rsquo;s done, we treat &ldquo;Expr.java&rdquo; like any other file in the implementation.\nWe are merely automating how that file gets authored.</p>\n<p>To generate the classes, it needs to have some description of each type and its\nfields.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    String outputDir = args[0];\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">defineAst</span>(<span class=\"i\">outputDir</span>, <span class=\"s\">&quot;Expr&quot;</span>, <span class=\"t\">Arrays</span>.<span class=\"i\">asList</span>(\n      <span class=\"s\">&quot;Binary   : Expr left, Token operator, Expr right&quot;</span>,\n      <span class=\"s\">&quot;Grouping : Expr expression&quot;</span>,\n      <span class=\"s\">&quot;Literal  : Object value&quot;</span>,\n      <span class=\"s\">&quot;Unary    : Token operator, Expr right&quot;</span>\n    ));\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<p>For brevity&rsquo;s sake, I jammed the descriptions of the expression types into\nstrings. Each is the name of the class followed by <code>:</code> and the list of fields,\nseparated by commas. 
Each field has a type and a name.</p>\n<p>The first thing <code>defineAst()</code> needs to do is output the base Expr class.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nadd after <em>main</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">defineAst</span>(\n      <span class=\"t\">String</span> <span class=\"i\">outputDir</span>, <span class=\"t\">String</span> <span class=\"i\">baseName</span>, <span class=\"t\">List</span>&lt;<span class=\"t\">String</span>&gt; <span class=\"i\">types</span>)\n      <span class=\"k\">throws</span> <span class=\"t\">IOException</span> {\n    <span class=\"t\">String</span> <span class=\"i\">path</span> = <span class=\"i\">outputDir</span> + <span class=\"s\">&quot;/&quot;</span> + <span class=\"i\">baseName</span> + <span class=\"s\">&quot;.java&quot;</span>;\n    <span class=\"t\">PrintWriter</span> <span class=\"i\">writer</span> = <span class=\"k\">new</span> <span class=\"t\">PrintWriter</span>(<span class=\"i\">path</span>, <span class=\"s\">&quot;UTF-8&quot;</span>);\n\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;package com.craftinginterpreters.lox;&quot;</span>);\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>();\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;import java.util.List;&quot;</span>);\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>();\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;abstract class &quot;</span> + <span class=\"i\">baseName</span> + <span class=\"s\">&quot; {&quot;</span>);\n\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;}&quot;</span>);\n    <span class=\"i\">writer</span>.<span class=\"i\">close</span>();\n  }\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, add after <em>main</em>()</div>\n\n<p>When we call this, <code>baseName</code> is &ldquo;Expr&rdquo;, which is both the name of the class and\nthe name of the file it outputs. We pass this as an argument instead of\nhardcoding the name because we&rsquo;ll add a separate family of classes later for\nstatements.</p>\n<p>Inside the base class, we define each subclass.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    writer.println(&quot;abstract class &quot; + baseName + &quot; {&quot;);\n\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>defineAst</em>()</div>\n<pre class=\"insert\">    <span class=\"c\">// The AST classes.</span>\n    <span class=\"k\">for</span> (<span class=\"t\">String</span> <span class=\"i\">type</span> : <span class=\"i\">types</span>) {\n      <span class=\"t\">String</span> <span class=\"i\">className</span> = <span class=\"i\">type</span>.<span class=\"i\">split</span>(<span class=\"s\">&quot;:&quot;</span>)[<span class=\"n\">0</span>].<span class=\"i\">trim</span>();\n      <span class=\"t\">String</span> <span class=\"i\">fields</span> = <span class=\"i\">type</span>.<span class=\"i\">split</span>(<span class=\"s\">&quot;:&quot;</span>)[<span class=\"n\">1</span>].<span class=\"i\">trim</span>();<span name=\"robust\"> </span>\n      <span class=\"i\">defineType</span>(<span class=\"i\">writer</span>, <span class=\"i\">baseName</span>, <span class=\"i\">className</span>, <span class=\"i\">fields</span>);\n    }\n</pre><pre class=\"insert-after\">    writer.println(&quot;}&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>defineAst</em>()</div>\n\n<aside name=\"robust\">\n<p>This isn&rsquo;t the world&rsquo;s most elegant string manipulation code, but that&rsquo;s fine.\nIt only runs on the exact set of class definitions we give it. 
Robustness ain&rsquo;t\na priority.</p>\n</aside>\n<p>That code, in turn, calls:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nadd after <em>defineAst</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">defineType</span>(\n      <span class=\"t\">PrintWriter</span> <span class=\"i\">writer</span>, <span class=\"t\">String</span> <span class=\"i\">baseName</span>,\n      <span class=\"t\">String</span> <span class=\"i\">className</span>, <span class=\"t\">String</span> <span class=\"i\">fieldList</span>) {\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;  static class &quot;</span> + <span class=\"i\">className</span> + <span class=\"s\">&quot; extends &quot;</span> +\n        <span class=\"i\">baseName</span> + <span class=\"s\">&quot; {&quot;</span>);\n\n    <span class=\"c\">// Constructor.</span>\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;    &quot;</span> + <span class=\"i\">className</span> + <span class=\"s\">&quot;(&quot;</span> + <span class=\"i\">fieldList</span> + <span class=\"s\">&quot;) {&quot;</span>);\n\n    <span class=\"c\">// Store parameters in fields.</span>\n    <span class=\"t\">String</span>[] <span class=\"i\">fields</span> = <span class=\"i\">fieldList</span>.<span class=\"i\">split</span>(<span class=\"s\">&quot;, &quot;</span>);\n    <span class=\"k\">for</span> (<span class=\"t\">String</span> <span class=\"i\">field</span> : <span class=\"i\">fields</span>) {\n      <span class=\"t\">String</span> <span class=\"i\">name</span> = <span class=\"i\">field</span>.<span class=\"i\">split</span>(<span class=\"s\">&quot; &quot;</span>)[<span class=\"n\">1</span>];\n      <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;      this.&quot;</span> + <span class=\"i\">name</span> + 
<span class=\"s\">&quot; = &quot;</span> + <span class=\"i\">name</span> + <span class=\"s\">&quot;;&quot;</span>);\n    }\n\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;    }&quot;</span>);\n\n    <span class=\"c\">// Fields.</span>\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>();\n    <span class=\"k\">for</span> (<span class=\"t\">String</span> <span class=\"i\">field</span> : <span class=\"i\">fields</span>) {\n      <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;    final &quot;</span> + <span class=\"i\">field</span> + <span class=\"s\">&quot;;&quot;</span>);\n    }\n\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;  }&quot;</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, add after <em>defineAst</em>()</div>\n\n<p>There we go. All of that glorious Java boilerplate is done. It declares each\nfield in the class body. It defines a constructor for the class with parameters\nfor each field and initializes them in the body.</p>\n<p>Compile and run this Java program now and it <span name=\"longer\">blasts</span>\nout a new &ldquo;.java&rdquo; file containing a few dozen lines of code. That file&rsquo;s\nabout to get even longer.</p>\n<aside name=\"longer\">\n<p><a href=\"appendix-ii.html\">Appendix II</a> contains the code generated by this script once we&rsquo;ve finished\nimplementing jlox and defined all of its syntax tree nodes.</p>\n</aside>\n<h2><a href=\"#working-with-trees\" id=\"working-with-trees\"><small>5&#8202;.&#8202;3</small>Working with Trees</a></h2>\n<p>Put on your imagination hat for a moment. Even though we aren&rsquo;t there yet,\nconsider what the interpreter will do with the syntax trees. Each kind of\nexpression in Lox behaves differently at runtime. 
That means the interpreter\nneeds to select a different chunk of code to handle each expression type. With\ntokens, we can simply switch on the TokenType. But we don&rsquo;t have a &ldquo;type&rdquo; enum\nfor the syntax trees, just a separate Java class for each one.</p>\n<p>We could write a long chain of type tests:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">if</span> (<span class=\"i\">expr</span> <span class=\"k\">instanceof</span> <span class=\"t\">Expr</span>.<span class=\"t\">Binary</span>) {\n  <span class=\"c\">// ...</span>\n} <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">expr</span> <span class=\"k\">instanceof</span> <span class=\"t\">Expr</span>.<span class=\"t\">Grouping</span>) {\n  <span class=\"c\">// ...</span>\n} <span class=\"k\">else</span> <span class=\"c\">// ...</span>\n</pre></div>\n<p>But all of those sequential type tests are slow. Expression types whose names\nare alphabetically later would take longer to execute because they&rsquo;d fall\nthrough more <code>if</code> cases before finding the right type. That&rsquo;s not my idea of an\nelegant solution.</p>\n<p>We have a family of classes and we need to associate a chunk of behavior with\neach one. The natural solution in an object-oriented language like Java is to\nput those behaviors into methods on the classes themselves. We could add an\nabstract <span name=\"interpreter-pattern\"><code>interpret()</code></span> method on Expr\nwhich each subclass would then implement to interpret itself.</p>\n<aside name=\"interpreter-pattern\">\n<p>This exact thing is literally called the <a href=\"https://en.wikipedia.org/wiki/Interpreter_pattern\">&ldquo;Interpreter pattern&rdquo;</a> in\n<em>Design Patterns: Elements of Reusable Object-Oriented Software</em>, by Erich\nGamma, et al.</p>\n</aside>\n<p>This works alright for tiny projects, but it scales poorly. Like I noted before,\nthese tree classes span a few domains. 
At the very least, both the parser and\ninterpreter will mess with them. As <a href=\"resolving-and-binding.html\">you&rsquo;ll see later</a>, we need to\ndo name resolution on them. If our language were statically typed, we&rsquo;d have a\ntype checking pass.</p>\n<p>If we added instance methods to the expression classes for every one of those\noperations, that would smush a bunch of different domains together. That\nviolates <a href=\"https://en.wikipedia.org/wiki/Separation_of_concerns\">separation of concerns</a> and leads to hard-to-maintain code.</p>\n<h3><a href=\"#the-expression-problem\" id=\"the-expression-problem\"><small>5&#8202;.&#8202;3&#8202;.&#8202;1</small>The expression problem</a></h3>\n<p>This problem is more fundamental than it may seem at first. We have a handful of\ntypes, and a handful of high-level operations like &ldquo;interpret&rdquo;. For each pair of\ntype and operation, we need a specific implementation. Picture a table:</p><img src=\"image/representing-code/table.png\" alt=\"A table where rows are labeled with expression classes, and columns are function names.\" />\n<p>Rows are types, and columns are operations. Each cell represents the unique\npiece of code to implement that operation on that type.</p>\n<p>An object-oriented language like Java assumes that all of the code in one row\nnaturally hangs together. It figures all the things you do with a type are\nlikely related to each other, and the language makes it easy to define them\ntogether as methods inside the same class.</p><img src=\"image/representing-code/rows.png\" alt=\"The table split into rows for each class.\" />\n<p>This makes it easy to extend the table by adding new rows. Simply define a new\nclass. No existing code has to be touched. But imagine if you want to add a new\n<em>operation</em><span class=\"em\">&mdash;</span>a new column. 
In Java, that means cracking open each of those\nexisting classes and adding a method to it.</p>\n<p>Functional paradigm languages in the <span name=\"ml\">ML</span> family flip that\naround. There, you don&rsquo;t have classes with methods. Types and functions are\ntotally distinct. To implement an operation for a number of different types, you\ndefine a single function. In the body of that function, you use <em>pattern\nmatching</em><span class=\"em\">&mdash;</span>sort of a type-based switch on steroids<span class=\"em\">&mdash;</span>to implement the\noperation for each type all in one place.</p>\n<aside name=\"ml\">\n<p>ML, short for &ldquo;metalanguage&rdquo; was created by Robin Milner and friends and forms\none of the main branches in the great programming language family tree. Its\nchildren include SML, Caml, OCaml, Haskell, and F#. Even Scala, Rust, and Swift\nbear a strong resemblance.</p>\n<p>Much like Lisp, it is one of those languages that is so full of good ideas that\nlanguage designers today are still rediscovering them over forty years later.</p>\n</aside>\n<p>This makes it trivial to add new operations<span class=\"em\">&mdash;</span>simply define another function\nthat pattern matches on all of the types.</p><img src=\"image/representing-code/columns.png\" alt=\"The table split into columns for each function.\" />\n<p>But, conversely, adding a new type is hard. You have to go back and add a new\ncase to all of the pattern matches in all of the existing functions.</p>\n<p>Each style has a certain &ldquo;grain&rdquo; to it. That&rsquo;s what the paradigm name literally\nsays<span class=\"em\">&mdash;</span>an object-oriented language wants you to <em>orient</em> your code along the\nrows of types. 
A functional language instead encourages you to lump each\ncolumn&rsquo;s worth of code together into a <em>function</em>.</p>\n<p>A bunch of smart language nerds noticed that neither style made it easy to add\n<em>both</em> rows and columns to the <span name=\"multi\">table</span>. They called this\ndifficulty the &ldquo;expression problem&rdquo; because<span class=\"em\">&mdash;</span>like we are now<span class=\"em\">&mdash;</span>they first ran\ninto it when they were trying to figure out the best way to model expression\nsyntax tree nodes in a compiler.</p>\n<aside name=\"multi\">\n<p>Languages with <em>multimethods</em>, like Common Lisp&rsquo;s CLOS, Dylan, and Julia, do\nsupport adding both new types and operations easily. What they typically\nsacrifice is either static type checking or separate compilation.</p>\n</aside>\n<p>People have thrown all sorts of language features, design patterns, and\nprogramming tricks at that problem to try to knock it down, but no perfect\nlanguage has finished it off yet. In the meantime, the best we can do is try to pick a\nlanguage whose orientation matches the natural architectural seams in the\nprogram we&rsquo;re writing.</p>\n<p>Object-orientation works fine for many parts of our interpreter, but these tree\nclasses rub against the grain of Java. Fortunately, there&rsquo;s a design pattern we\ncan bring to bear on it.</p>\n<h3><a href=\"#the-visitor-pattern\" id=\"the-visitor-pattern\"><small>5&#8202;.&#8202;3&#8202;.&#8202;2</small>The Visitor pattern</a></h3>\n<p>The <strong>Visitor pattern</strong> is the most widely misunderstood pattern in all of\n<em>Design Patterns</em>, which is really saying something when you look at the\nsoftware architecture excesses of the past couple of decades.</p>\n<p>The trouble starts with terminology. The pattern isn&rsquo;t about &ldquo;visiting&rdquo;, and the\n&ldquo;accept&rdquo; method in it doesn&rsquo;t conjure up any helpful imagery either. 
Many think\nthe pattern has to do with traversing trees, which isn&rsquo;t the case at all. We\n<em>are</em> going to use it on a set of classes that are tree-like, but that&rsquo;s a\ncoincidence. As you&rsquo;ll see, the pattern works as well on a single object.</p>\n<p>The Visitor pattern is really about approximating the functional style within an\nOOP language. It lets us add new columns to that table easily. We can define all\nof the behavior for a new operation on a set of types in one place, without\nhaving to touch the types themselves. It does this the same way we solve almost\nevery problem in computer science: by adding a layer of indirection.</p>\n<p>Before we apply it to our auto-generated Expr classes, let&rsquo;s walk through a\nsimpler example. Say we have two kinds of pastries: <span\nname=\"beignet\">beignets</span> and crullers.</p>\n<aside name=\"beignet\">\n<p>A beignet (pronounced &ldquo;ben-yay&rdquo;, with equal emphasis on both syllables) is a\ndeep-fried pastry in the same family as doughnuts. When the French colonized\nNorth America in the 1700s, they brought beignets with them. 
Today, in the US,\nthey are most strongly associated with the cuisine of New Orleans.</p>\n<p>My preferred way to consume them is fresh out of the fryer at Café du Monde,\npiled high in powdered sugar, and washed down with a cup of café au lait while I\nwatch tourists staggering around trying to shake off their hangover from the\nprevious night&rsquo;s revelry.</p>\n</aside>\n<div class=\"codehilite\"><pre>  <span class=\"k\">abstract</span> <span class=\"k\">class</span> <span class=\"t\">Pastry</span> {\n  }\n\n  <span class=\"k\">class</span> <span class=\"t\">Beignet</span> <span class=\"k\">extends</span> <span class=\"t\">Pastry</span> {\n  }\n\n  <span class=\"k\">class</span> <span class=\"t\">Cruller</span> <span class=\"k\">extends</span> <span class=\"t\">Pastry</span> {\n  }\n</pre></div>\n\n<p>We want to be able to define new pastry operations<span class=\"em\">&mdash;</span>cooking them, eating them,\ndecorating them, etc.<span class=\"em\">&mdash;</span>without having to add a new method to each class every\ntime. Here&rsquo;s how we do it. First, we define a separate interface.</p>\n<div class=\"codehilite\"><pre>  <span class=\"k\">interface</span> <span class=\"t\">PastryVisitor</span> {\n    <span class=\"t\">void</span> <span class=\"i\">visitBeignet</span>(<span class=\"t\">Beignet</span> <span class=\"i\">beignet</span>);<span name=\"overload\"> </span>\n    <span class=\"t\">void</span> <span class=\"i\">visitCruller</span>(<span class=\"t\">Cruller</span> <span class=\"i\">cruller</span>);\n  }\n</pre></div>\n\n<aside name=\"overload\">\n<p>In <em>Design Patterns</em>, both of these methods are confusingly named <code>visit()</code>, and\nthey rely on overloading to distinguish them. This leads some readers to think\nthat the correct visit method is chosen <em>at runtime</em> based on its parameter\ntype. That isn&rsquo;t the case. 
Unlike over<em>riding</em>, over<em>loading</em> is statically\ndispatched at compile time.</p>\n<p>Using distinct names for each method makes the dispatch more obvious, and also\nshows you how to apply this pattern in languages that don&rsquo;t support overloading.</p>\n</aside>\n<p>Each operation that can be performed on pastries is a new class that implements\nthat interface. It has a concrete method for each type of pastry. That keeps the\ncode for the operation on both types all nestled snugly together in one class.</p>\n<p>Given some pastry, how do we route it to the correct method on the visitor based\non its type? Polymorphism to the rescue! We add this method to Pastry:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  abstract class Pastry {\n</pre><pre class=\"insert\">    <span class=\"k\">abstract</span> <span class=\"t\">void</span> <span class=\"i\">accept</span>(<span class=\"t\">PastryVisitor</span> <span class=\"i\">visitor</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n\n<p>Each subclass implements it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  class Beignet extends Pastry {\n</pre><pre class=\"insert\">    <span class=\"a\">@Override</span>\n    <span class=\"t\">void</span> <span class=\"i\">accept</span>(<span class=\"t\">PastryVisitor</span> <span class=\"i\">visitor</span>) {\n      <span class=\"i\">visitor</span>.<span class=\"i\">visitBeignet</span>(<span class=\"k\">this</span>);\n    }\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n\n<p>And:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  class Cruller extends Pastry {\n</pre><pre class=\"insert\">    <span class=\"a\">@Override</span>\n    <span class=\"t\">void</span> <span class=\"i\">accept</span>(<span class=\"t\">PastryVisitor</span> <span class=\"i\">visitor</span>) {\n      <span class=\"i\">visitor</span>.<span class=\"i\">visitCruller</span>(<span class=\"k\">this</span>);\n    }\n</pre><pre 
class=\"insert-after\">  }\n</pre></div>\n\n<p>To perform an operation on a pastry, we call its <code>accept()</code> method and pass in\nthe visitor for the operation we want to execute. The pastry<span class=\"em\">&mdash;</span>the specific\nsubclass&rsquo;s overriding implementation of <code>accept()</code><span class=\"em\">&mdash;</span>turns around and calls the\nappropriate visit method on the visitor and passes <em>itself</em> to it.</p>\n<p>That&rsquo;s the heart of the trick right there. It lets us use polymorphic dispatch\non the <em>pastry</em> classes to select the appropriate method on the <em>visitor</em> class.\nIn the table, each pastry class is a row, but if you look at all of the methods\nfor a single visitor, they form a <em>column</em>.</p><img src=\"image/representing-code/visitor.png\" alt=\"Now all of the cells for one operation are part of the same class, the visitor.\" />\n<p>We added one <code>accept()</code> method to each class, and we can use it for as many\nvisitors as we want without ever having to touch the pastry classes again. It&rsquo;s\na clever pattern.</p>\n<h3><a href=\"#visitors-for-expressions\" id=\"visitors-for-expressions\"><small>5&#8202;.&#8202;3&#8202;.&#8202;3</small>Visitors for expressions</a></h3>\n<p>OK, let&rsquo;s weave it into our expression classes. We&rsquo;ll also <span\nname=\"context\">refine</span> the pattern a little. In the pastry example, the\nvisit and <code>accept()</code> methods don&rsquo;t return anything. In practice, visitors often\nwant to define operations that produce values. But what return type should\n<code>accept()</code> have? 
We can&rsquo;t assume every visitor class wants to produce the same\ntype, so we&rsquo;ll use generics to let each implementation fill in a return type.</p>\n<aside name=\"context\">\n<p>Another common refinement is an additional &ldquo;context&rdquo; parameter that is passed to\nthe visit methods and then sent back through as a parameter to <code>accept()</code>. That\nlets operations take an additional parameter. The visitors we&rsquo;ll define in the\nbook don&rsquo;t need that, so I omitted it.</p>\n</aside>\n<p>First, we define the visitor interface. Again, we nest it inside the base class\nso that we can keep everything in one file.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    writer.println(&quot;abstract class &quot; + baseName + &quot; {&quot;);\n\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>defineAst</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">defineVisitor</span>(<span class=\"i\">writer</span>, <span class=\"i\">baseName</span>, <span class=\"i\">types</span>);\n\n</pre><pre class=\"insert-after\">    // The AST classes.\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>defineAst</em>()</div>\n\n<p>That function generates the visitor interface.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nadd after <em>defineAst</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">defineVisitor</span>(\n      <span class=\"t\">PrintWriter</span> <span class=\"i\">writer</span>, <span class=\"t\">String</span> <span class=\"i\">baseName</span>, <span class=\"t\">List</span>&lt;<span class=\"t\">String</span>&gt; <span class=\"i\">types</span>) {\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;  interface Visitor&lt;R&gt; {&quot;</span>);\n\n    <span class=\"k\">for</span> (<span 
class=\"t\">String</span> <span class=\"i\">type</span> : <span class=\"i\">types</span>) {\n      <span class=\"t\">String</span> <span class=\"i\">typeName</span> = <span class=\"i\">type</span>.<span class=\"i\">split</span>(<span class=\"s\">&quot;:&quot;</span>)[<span class=\"n\">0</span>].<span class=\"i\">trim</span>();\n      <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;    R visit&quot;</span> + <span class=\"i\">typeName</span> + <span class=\"i\">baseName</span> + <span class=\"s\">&quot;(&quot;</span> +\n          <span class=\"i\">typeName</span> + <span class=\"s\">&quot; &quot;</span> + <span class=\"i\">baseName</span>.<span class=\"i\">toLowerCase</span>() + <span class=\"s\">&quot;);&quot;</span>);\n    }\n\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;  }&quot;</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, add after <em>defineAst</em>()</div>\n\n<p>Here, we iterate through all of the subclasses and declare a visit method for\neach one. 
When we define new expression types later, this will automatically\ninclude them.</p>\n<p>Inside the base class, we define the abstract <code>accept()</code> method.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      defineType(writer, baseName, className, fields);\n    }\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>defineAst</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"c\">// The base accept() method.</span>\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>();\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;  abstract &lt;R&gt; R accept(Visitor&lt;R&gt; visitor);&quot;</span>);\n\n</pre><pre class=\"insert-after\">    writer.println(&quot;}&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>defineAst</em>()</div>\n\n<p>Finally, each subclass implements that and calls the right visit method for its\nown type.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    writer.println(&quot;    }&quot;);\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>defineType</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"c\">// Visitor pattern.</span>\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>();\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;    @Override&quot;</span>);\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;    &lt;R&gt; R accept(Visitor&lt;R&gt; visitor) {&quot;</span>);\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;      return visitor.visit&quot;</span> +\n        <span class=\"i\">className</span> + <span class=\"i\">baseName</span> + <span class=\"s\">&quot;(this);&quot;</span>);\n    <span class=\"i\">writer</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;    
}&quot;</span>);\n</pre><pre class=\"insert-after\">\n\n    // Fields.\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>defineType</em>()</div>\n\n<p>There we go. Now we can define operations on expressions without having to muck\nwith the classes or our generator script. Compile and run this generator script\nto output an updated &ldquo;Expr.java&rdquo; file. It contains a generated Visitor\ninterface and a set of expression node classes that support the Visitor pattern\nusing it.</p>\n<p>Before we end this rambling chapter, let&rsquo;s implement that Visitor interface and\nsee the pattern in action.</p>\n<h2><a href=\"#a-not-very-pretty-printer\" id=\"a-not-very-pretty-printer\"><small>5&#8202;.&#8202;4</small>A (Not Very) Pretty Printer</a></h2>\n<p>When we debug our parser and interpreter, it&rsquo;s often useful to look at a parsed\nsyntax tree and make sure it has the structure we expect. We could inspect it in\nthe debugger, but that can be a chore.</p>\n<p>Instead, we&rsquo;d like some code that, given a syntax tree, produces an unambiguous\nstring representation of it. Converting a tree to a string is sort of the\nopposite of a parser, and is often called &ldquo;pretty printing&rdquo; when the goal is to\nproduce a string of text that is valid syntax in the source language.</p>\n<p>That&rsquo;s not our goal here. We want the string to very explicitly show the nesting\nstructure of the tree. A printer that returned <code>1 + 2 * 3</code> isn&rsquo;t super helpful\nif what we&rsquo;re trying to debug is whether operator precedence is handled\ncorrectly. We want to know if the <code>+</code> or <code>*</code> is at the top of the tree.</p>\n<p>To that end, the string representation we produce isn&rsquo;t going to be Lox syntax.\nInstead, it will look a lot like, well, Lisp. 
Each expression is explicitly\nparenthesized, and all of its subexpressions and tokens are contained in that.</p>\n<p>Given a syntax tree like:</p><img src=\"image/representing-code/expression.png\" alt=\"An example syntax tree.\" />\n<p>It produces:</p>\n<div class=\"codehilite\"><pre>(* (- 123) (group 45.67))\n</pre></div>\n<p>Not exactly &ldquo;pretty&rdquo;, but it does show the nesting and grouping explicitly. To\nimplement this, we define a new class.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/AstPrinter.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">class</span> <span class=\"t\">AstPrinter</span> <span class=\"k\">implements</span> <span class=\"t\">Expr</span>.<span class=\"t\">Visitor</span>&lt;<span class=\"t\">String</span>&gt; {\n  <span class=\"t\">String</span> <span class=\"i\">print</span>(<span class=\"t\">Expr</span> <span class=\"i\">expr</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">expr</span>.<span class=\"i\">accept</span>(<span class=\"k\">this</span>);\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/AstPrinter.java</em>, create new file</div>\n\n<p>As you can see, it implements the visitor interface. 
That means we need visit\nmethods for each of the expression types we have so far.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    return expr.accept(this);\n  }\n</pre><div class=\"source-file\"><em>lox/AstPrinter.java</em><br>\nadd after <em>print</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">String</span> <span class=\"i\">visitBinaryExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Binary</span> <span class=\"i\">expr</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">parenthesize</span>(<span class=\"i\">expr</span>.<span class=\"i\">operator</span>.<span class=\"i\">lexeme</span>,\n                        <span class=\"i\">expr</span>.<span class=\"i\">left</span>, <span class=\"i\">expr</span>.<span class=\"i\">right</span>);\n  }\n\n  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">String</span> <span class=\"i\">visitGroupingExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Grouping</span> <span class=\"i\">expr</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">parenthesize</span>(<span class=\"s\">&quot;group&quot;</span>, <span class=\"i\">expr</span>.<span class=\"i\">expression</span>);\n  }\n\n  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">String</span> <span class=\"i\">visitLiteralExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Literal</span> <span class=\"i\">expr</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">expr</span>.<span class=\"i\">value</span> == <span class=\"k\">null</span>) <span class=\"k\">return</span> <span class=\"s\">&quot;nil&quot;</span>;\n    <span class=\"k\">return</span> <span class=\"i\">expr</span>.<span class=\"i\">value</span>.<span class=\"i\">toString</span>();\n  }\n\n  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span 
class=\"t\">String</span> <span class=\"i\">visitUnaryExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Unary</span> <span class=\"i\">expr</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">parenthesize</span>(<span class=\"i\">expr</span>.<span class=\"i\">operator</span>.<span class=\"i\">lexeme</span>, <span class=\"i\">expr</span>.<span class=\"i\">right</span>);\n  }\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/AstPrinter.java</em>, add after <em>print</em>()</div>\n\n<p>Literal expressions are easy<span class=\"em\">&mdash;</span>they convert the value to a string with a little\ncheck to handle Java&rsquo;s <code>null</code> standing in for Lox&rsquo;s <code>nil</code>. The other expressions\nhave subexpressions, so they use this <code>parenthesize()</code> helper method:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/AstPrinter.java</em><br>\nadd after <em>visitUnaryExpr</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">String</span> <span class=\"i\">parenthesize</span>(<span class=\"t\">String</span> <span class=\"i\">name</span>, <span class=\"t\">Expr</span>... 
<span class=\"i\">exprs</span>) {\n    <span class=\"t\">StringBuilder</span> <span class=\"i\">builder</span> = <span class=\"k\">new</span> <span class=\"t\">StringBuilder</span>();\n\n    <span class=\"i\">builder</span>.<span class=\"i\">append</span>(<span class=\"s\">&quot;(&quot;</span>).<span class=\"i\">append</span>(<span class=\"i\">name</span>);\n    <span class=\"k\">for</span> (<span class=\"t\">Expr</span> <span class=\"i\">expr</span> : <span class=\"i\">exprs</span>) {\n      <span class=\"i\">builder</span>.<span class=\"i\">append</span>(<span class=\"s\">&quot; &quot;</span>);\n      <span class=\"i\">builder</span>.<span class=\"i\">append</span>(<span class=\"i\">expr</span>.<span class=\"i\">accept</span>(<span class=\"k\">this</span>));\n    }\n    <span class=\"i\">builder</span>.<span class=\"i\">append</span>(<span class=\"s\">&quot;)&quot;</span>);\n\n    <span class=\"k\">return</span> <span class=\"i\">builder</span>.<span class=\"i\">toString</span>();\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/AstPrinter.java</em>, add after <em>visitUnaryExpr</em>()</div>\n\n<p>It takes a name and a list of subexpressions and wraps them all up in\nparentheses, yielding a string like:</p>\n<div class=\"codehilite\"><pre>(+ 1 2)\n</pre></div>\n<p>Note that it calls <code>accept()</code> on each subexpression and passes in itself. This\nis the <span name=\"tree\">recursive</span> step that lets us print an entire\ntree.</p>\n<aside name=\"tree\">\n<p>This recursion is also why people think the Visitor pattern itself has to do\nwith trees.</p>\n</aside>\n<p>We don&rsquo;t have a parser yet, so it&rsquo;s hard to see this in action. 
For now, we&rsquo;ll\nhack together a little <code>main()</code> method that manually instantiates a tree and\nprints it.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/AstPrinter.java</em><br>\nadd after <em>parenthesize</em>()</div>\n<pre>  <span class=\"k\">public</span> <span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">main</span>(<span class=\"t\">String</span>[] <span class=\"i\">args</span>) {\n    <span class=\"t\">Expr</span> <span class=\"i\">expression</span> = <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Binary</span>(\n        <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Unary</span>(\n            <span class=\"k\">new</span> <span class=\"t\">Token</span>(<span class=\"t\">TokenType</span>.<span class=\"i\">MINUS</span>, <span class=\"s\">&quot;-&quot;</span>, <span class=\"k\">null</span>, <span class=\"n\">1</span>),\n            <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Literal</span>(<span class=\"n\">123</span>)),\n        <span class=\"k\">new</span> <span class=\"t\">Token</span>(<span class=\"t\">TokenType</span>.<span class=\"i\">STAR</span>, <span class=\"s\">&quot;*&quot;</span>, <span class=\"k\">null</span>, <span class=\"n\">1</span>),\n        <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Grouping</span>(\n            <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Literal</span>(<span class=\"n\">45.67</span>)));\n\n    <span class=\"t\">System</span>.<span class=\"i\">out</span>.<span class=\"i\">println</span>(<span class=\"k\">new</span> <span class=\"t\">AstPrinter</span>().<span class=\"i\">print</span>(<span class=\"i\">expression</span>));\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/AstPrinter.java</em>, add after <em>parenthesize</em>()</div>\n\n<p>If we did everything right, it prints:</p>\n<div 
class=\"codehilite\"><pre>(* (- 123) (group 45.67))\n</pre></div>\n<p>You can go ahead and delete this method. We won&rsquo;t need it. Also, as we add new\nsyntax tree types, I won&rsquo;t bother showing the necessary visit methods for them\nin AstPrinter. If you want to (and you want the Java compiler to not yell at\nyou), go ahead and add them yourself. It will come in handy in the next chapter\nwhen we start parsing Lox code into syntax trees. Or, if you don&rsquo;t care to\nmaintain AstPrinter, feel free to delete it. We won&rsquo;t need it again.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Earlier, I said that the <code>|</code>, <code>*</code>, and <code>+</code> forms we added to our grammar\nmetasyntax were just syntactic sugar. Take this grammar:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">expr</span> → <span class=\"i\">expr</span> ( <span class=\"s\">&quot;(&quot;</span> ( <span class=\"i\">expr</span> ( <span class=\"s\">&quot;,&quot;</span> <span class=\"i\">expr</span> )* )? <span class=\"s\">&quot;)&quot;</span> | <span class=\"s\">&quot;.&quot;</span> <span class=\"t\">IDENTIFIER</span> )+\n     | <span class=\"t\">IDENTIFIER</span>\n     | <span class=\"t\">NUMBER</span>\n</pre></div>\n<p>Produce a grammar that matches the same language but does not use any of\nthat notational sugar.</p>\n<p><em>Bonus:</em> What kind of expression does this bit of grammar encode?</p>\n</li>\n<li>\n<p>The Visitor pattern lets you emulate the functional style in an\nobject-oriented language. Devise a complementary pattern for a functional\nlanguage. 
It should let you bundle all of the operations on one type\ntogether and let you define new types easily.</p>\n<p>(SML or Haskell would be ideal for this exercise, but Scheme or another Lisp\nworks as well.)</p>\n</li>\n<li>\n<p>In <a href=\"https://en.wikipedia.org/wiki/Reverse_Polish_notation\">reverse Polish notation</a> (RPN), the operands to an arithmetic\noperator are both placed before the operator, so <code>1 + 2</code> becomes <code>1 2 +</code>.\nEvaluation proceeds from left to right. Numbers are pushed onto an implicit\nstack. An arithmetic operator pops the top two numbers, performs the\noperation, and pushes the result. Thus, this:</p>\n<div class=\"codehilite\"><pre>(<span class=\"n\">1</span> + <span class=\"n\">2</span>) * (<span class=\"n\">4</span> - <span class=\"n\">3</span>)\n</pre></div>\n<p>in RPN becomes:</p>\n<div class=\"codehilite\"><pre><span class=\"n\">1</span> <span class=\"n\">2</span> + <span class=\"n\">4</span> <span class=\"n\">3</span> - *\n</pre></div>\n<p>Define a visitor class for our syntax tree classes that takes an expression,\nconverts it to RPN, and returns the resulting string.</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"parsing-expressions.html\" class=\"next\">\n  Next Chapter: &ldquo;Parsing Expressions&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/resolving-and-binding.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Resolving and Binding &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Resolving and Binding<small>11</small></a></h3>\n\n<ul>\n    <li><a href=\"#static-scope\"><small>11.1</small> Static Scope</a></li>\n    <li><a href=\"#semantic-analysis\"><small>11.2</small> Semantic Analysis</a></li>\n    <li><a href=\"#a-resolver-class\"><small>11.3</small> A Resolver Class</a></li>\n    <li><a href=\"#interpreting-resolved-variables\"><small>11.4</small> Interpreting Resolved Variables</a></li>\n    <li><a href=\"#resolution-errors\"><small>11.5</small> Resolution Errors</a></li>\n    
<li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"functions.html\" title=\"Functions\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"classes.html\" title=\"Classes\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"functions.html\" title=\"Functions\" class=\"prev\">←</a>\n<a href=\"classes.html\" title=\"Classes\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Resolving and Binding<small>11</small></a></h3>\n\n<ul>\n    <li><a href=\"#static-scope\"><small>11.1</small> Static Scope</a></li>\n    <li><a href=\"#semantic-analysis\"><small>11.2</small> Semantic Analysis</a></li>\n    <li><a href=\"#a-resolver-class\"><small>11.3</small> A Resolver Class</a></li>\n    <li><a href=\"#interpreting-resolved-variables\"><small>11.4</small> Interpreting Resolved Variables</a></li>\n    <li><a href=\"#resolution-errors\"><small>11.5</small> Resolution Errors</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"functions.html\" title=\"Functions\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"classes.html\" title=\"Classes\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">11</div>\n  <h1>Resolving and 
Binding</h1>\n\n<blockquote>\n<p>Once in a while you find yourself in an odd situation. You get into it by\ndegrees and in the most natural way but, when you are right in the midst of\nit, you are suddenly astonished and ask yourself how in the world it all came\nabout.</p>\n<p><cite>Thor Heyerdahl, <em>Kon-Tiki</em></cite></p>\n</blockquote>\n<p>Oh, no! Our language implementation is taking on water! Way back when we <a href=\"statements-and-state.html\">added\nvariables and blocks</a>, we had scoping nice and tight. But when we\n<a href=\"functions.html\">later added closures</a>, a hole opened in our formerly waterproof\ninterpreter. Most real programs are unlikely to slip through this hole, but as\nlanguage implementers, we take a sacred vow to care about correctness even in\nthe deepest, dampest corners of the semantics.</p>\n<p>We will spend this entire chapter exploring that leak, and then carefully\npatching it up. In the process, we will gain a more rigorous understanding of\nlexical scoping as used by Lox and other languages in the C tradition. We&rsquo;ll\nalso get a chance to learn about <em>semantic analysis</em><span class=\"em\">&mdash;</span>a powerful technique for\nextracting meaning from the user&rsquo;s source code without having to run it.</p>\n<h2><a href=\"#static-scope\" id=\"static-scope\"><small>11&#8202;.&#8202;1</small>Static Scope</a></h2>\n<p>A quick refresher: Lox, like most modern languages, uses <em>lexical</em> scoping. This\nmeans that you can figure out which declaration a variable name refers to just\nby reading the text of the program. 
For example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;outer&quot;</span>;\n{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;inner&quot;</span>;\n  <span class=\"k\">print</span> <span class=\"i\">a</span>;\n}\n</pre></div>\n<p>Here, we know that the <code>a</code> being printed is the variable declared on the\nprevious line, and not the global one. Running the program doesn&rsquo;t<span class=\"em\">&mdash;</span><em>can&rsquo;t</em><span class=\"em\">&mdash;</span>affect this. The scope rules are part of the <em>static</em> semantics of the language,\nwhich is why they&rsquo;re also called <em>static scope</em>.</p>\n<p>I haven&rsquo;t spelled out those scope rules, but now is the time for <span\nname=\"precise\">precision</span>:</p>\n<aside name=\"precise\">\n<p>This is still nowhere near as precise as a real language specification. Those\ndocs must be so explicit that even a Martian or an outright malicious programmer\nwould be forced to implement the correct semantics provided they followed the\nletter of the spec.</p>\n<p>That exactitude is important when a language may be implemented by competing\ncompanies who want their product to be incompatible with the others to lock\ncustomers onto their platform. For this book, we can thankfully ignore those\nkinds of shady shenanigans.</p>\n</aside>\n<p><strong>A variable usage refers to the preceding declaration with the same name in the\ninnermost scope that encloses the expression where the variable is used.</strong></p>\n<p>There&rsquo;s a lot to unpack in that:</p>\n<ul>\n<li>\n<p>I say &ldquo;variable usage&rdquo; instead of &ldquo;variable expression&rdquo; to cover both\nvariable expressions and assignments. 
Likewise with &ldquo;expression where the\nvariable is used&rdquo;.</p>\n</li>\n<li>\n<p>&ldquo;Preceding&rdquo; means appearing before <em>in the program text</em>.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;outer&quot;</span>;\n{\n  <span class=\"k\">print</span> <span class=\"i\">a</span>;\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;inner&quot;</span>;\n}\n</pre></div>\n<p>Here, the <code>a</code> being printed is the outer one since it appears <span\nname=\"hoisting\">before</span> the <code>print</code> statement that uses it. In most\ncases, in straight-line code, the declaration preceding in <em>text</em> will also\nprecede the usage in <em>time</em>. But that&rsquo;s not always true. As we&rsquo;ll see,\nfunctions may defer a chunk of code such that its <em>dynamic temporal</em>\nexecution no longer mirrors the <em>static textual</em> ordering.</p>\n<aside name=\"hoisting\">\n<p>In JavaScript, variables declared using <code>var</code> are implicitly &ldquo;hoisted&rdquo; to\nthe beginning of the enclosing function (or script), not just the enclosing\nblock. Any use of that name in that function will refer to\nthat variable, even if the use appears before the declaration. When you
When you\nwrite this in JavaScript:</p>\n<div class=\"codehilite\"><pre>{\n  <span class=\"i\">console</span>.<span class=\"i\">log</span>(<span class=\"i\">a</span>);\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;value&quot;</span>;\n}\n</pre></div>\n<p>It behaves like:</p>\n<div class=\"codehilite\"><pre>{\n  <span class=\"k\">var</span> <span class=\"i\">a</span>; <span class=\"c\">// Hoist.</span>\n  <span class=\"i\">console</span>.<span class=\"i\">log</span>(<span class=\"i\">a</span>);\n  <span class=\"i\">a</span> = <span class=\"s\">&quot;value&quot;</span>;\n}\n</pre></div>\n<p>That means that in some cases you can read a variable before its initializer\nhas run<span class=\"em\">&mdash;</span>an annoying source of bugs. The alternate <code>let</code> syntax for\ndeclaring variables was added later to address this problem.</p>\n</aside></li>\n<li>\n<p>&ldquo;Innermost&rdquo; is there because of our good friend shadowing. There may be more\nthan one variable with the given name in enclosing scopes, as in:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;outer&quot;</span>;\n{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;inner&quot;</span>;\n  <span class=\"k\">print</span> <span class=\"i\">a</span>;\n}\n</pre></div>\n<p>Our rule disambiguates this case by saying the innermost scope wins.</p>\n</li>\n</ul>\n<p>Since this rule makes no mention of any runtime behavior, it implies that a\nvariable expression always refers to the same declaration through the entire\nexecution of the program. Our interpreter so far <em>mostly</em> implements the rule\ncorrectly. 
But when we added closures, an error snuck in.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;global&quot;</span>;\n{\n  <span class=\"k\">fun</span> <span class=\"i\">showA</span>() {\n    <span class=\"k\">print</span> <span class=\"i\">a</span>;\n  }\n\n  <span class=\"i\">showA</span>();\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;block&quot;</span>;\n  <span class=\"i\">showA</span>();\n}\n</pre></div>\n<p><span name=\"tricky\">Before</span> you type this in and run it, decide what you\nthink it <em>should</em> print.</p>\n<aside name=\"tricky\">\n<p>I know, it&rsquo;s a totally pathological, contrived program. It&rsquo;s just <em>weird</em>. No\nreasonable person would ever write code like this. Alas, more of your life than\nyou&rsquo;d expect will be spent dealing with bizarro snippets of code like this if\nyou stay in the programming language game for long.</p>\n</aside>\n<p>OK<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>got it? If you&rsquo;re familiar with closures in other languages, you&rsquo;ll expect\nit to print &ldquo;global&rdquo; twice. The first call to <code>showA()</code> should definitely print\n&ldquo;global&rdquo; since we haven&rsquo;t even reached the declaration of the inner <code>a</code> yet. And\nby our rule that a variable expression always resolves to the same variable,\nthat implies the second call to <code>showA()</code> should print the same thing.</p>\n<p>Alas, it prints:</p>\n<div class=\"codehilite\"><pre>global\nblock\n</pre></div>\n<p>Let me stress that this program never reassigns any variable and contains only a\nsingle <code>print</code> statement. 
Yet, somehow, that <code>print</code> statement for a\nnever-assigned variable prints two different values at different points in time.\nWe definitely broke something somewhere.</p>\n<h3><a href=\"#scopes-and-mutable-environments\" id=\"scopes-and-mutable-environments\"><small>11&#8202;.&#8202;1&#8202;.&#8202;1</small>Scopes and mutable environments</a></h3>\n<p>In our interpreter, environments are the dynamic manifestation of static scopes.\nThe two mostly stay in sync with each other<span class=\"em\">&mdash;</span>we create a new environment when\nwe enter a new scope, and discard it when we leave the scope. There is one other\noperation we perform on environments: binding a variable in one. This is where\nour bug lies.</p>\n<p>Let&rsquo;s walk through that problematic example and see what the environments look\nlike at each step. First, we declare <code>a</code> in the global scope.</p><img src=\"image/resolving-and-binding/environment-1.png\" alt=\"The global environment with 'a' defined in it.\" />\n<p>That gives us a single environment with a single variable in it. Then we enter\nthe block and execute the declaration of <code>showA()</code>.</p><img src=\"image/resolving-and-binding/environment-2.png\" alt=\"A block environment linking to the global one.\" />\n<p>We get a new environment for the block. In that, we declare one name, <code>showA</code>,\nwhich is bound to the LoxFunction object we create to represent the function.\nThat object has a <code>closure</code> field that captures the environment where the\nfunction was declared, so it has a reference back to the environment for the\nblock.</p>\n<p>Now we call <code>showA()</code>.</p><img src=\"image/resolving-and-binding/environment-3.png\" alt=\"An empty environment for showA()'s body linking to the previous two. 'a' is resolved in the global environment.\" />\n<p>The interpreter dynamically creates a new environment for the function body of\n<code>showA()</code>. 
It&rsquo;s empty since that function doesn&rsquo;t declare any variables. The\nparent of that environment is the function&rsquo;s closure<span class=\"em\">&mdash;</span>the outer block\nenvironment.</p>\n<p>Inside the body of <code>showA()</code>, we print the value of <code>a</code>. The interpreter looks\nup this value by walking the chain of environments. It gets all the way\nto the global environment before finding it there and printing <code>\"global\"</code>.\nGreat.</p>\n<p>Next, we declare the second <code>a</code>, this time inside the block.</p><img src=\"image/resolving-and-binding/environment-4.png\" alt=\"The block environment has both 'a' and 'showA' now.\" />\n<p>It&rsquo;s in the same block<span class=\"em\">&mdash;</span>the same scope<span class=\"em\">&mdash;</span>as <code>showA()</code>, so it goes into the\nsame environment, which is also the same environment <code>showA()</code>&rsquo;s closure refers\nto. This is where it gets interesting. We call <code>showA()</code> again.</p><img src=\"image/resolving-and-binding/environment-5.png\" alt=\"An empty environment for showA()'s body linking to the previous two. 'a' is resolved in the block environment.\" />\n<p>We create a new empty environment for the body of <code>showA()</code> again, wire it up to\nthat closure, and run the body. When the interpreter walks the chain of\nenvironments to find <code>a</code>, it now discovers the <em>new</em> <code>a</code> in the block\nenvironment. Boo.</p>\n<p>I chose to implement environments in a way that I hoped would agree with your\ninformal intuition around scopes. We tend to consider all of the code within a\nblock as being within the same scope, so our interpreter uses a single\nenvironment to represent that. Each environment is a mutable hash table. When a\nnew local variable is declared, it gets added to the existing environment for\nthat scope.</p>\n<p>That intuition, like many in life, isn&rsquo;t quite right. 
A block is not necessarily\nall the same scope. Consider:</p>\n<div class=\"codehilite\"><pre>{\n  <span class=\"k\">var</span> <span class=\"i\">a</span>;\n  <span class=\"c\">// 1.</span>\n  <span class=\"k\">var</span> <span class=\"i\">b</span>;\n  <span class=\"c\">// 2.</span>\n}\n</pre></div>\n<p>At the first marked line, only <code>a</code> is in scope. At the second line, both <code>a</code> and\n<code>b</code> are. If you define a &ldquo;scope&rdquo; to be a set of declarations, then those are\nclearly not the same scope<span class=\"em\">&mdash;</span>they don&rsquo;t contain the same declarations. It&rsquo;s\nlike each <code>var</code> statement <span name=\"split\">splits</span> the block into two\nseparate scopes, the scope before the variable is declared and the one after,\nwhich includes the new variable.</p>\n<aside name=\"split\">\n<p>Some languages make this split explicit. In Scheme and ML, when you declare a\nlocal variable using <code>let</code>, you also delineate the subsequent code where the new\nvariable is in scope. There is no implicit &ldquo;rest of the block&rdquo;.</p>\n</aside>\n<p>But in our implementation, environments do act like the entire block is one\nscope, just a scope that changes over time. Closures do not like that. When a\nfunction is declared, it captures a reference to the current environment. The\nfunction <em>should</em> capture a frozen snapshot of the environment <em>as it existed at\nthe moment the function was declared</em>. But instead, in the Java code, it has a\nreference to the actual mutable environment object. 
When a variable is later\ndeclared in the scope that environment corresponds to, the closure sees the new\nvariable, even though the declaration does <em>not</em> precede the function.</p>\n<h3><a href=\"#persistent-environments\" id=\"persistent-environments\"><small>11&#8202;.&#8202;1&#8202;.&#8202;2</small>Persistent environments</a></h3>\n<p>There is a style of programming that uses what are called <strong>persistent data\nstructures</strong>. Unlike the squishy data structures you&rsquo;re familiar with in\nimperative programming, a persistent data structure can never be directly\nmodified. Instead, any &ldquo;modification&rdquo; to an existing structure produces a <span\nname=\"copy\">brand</span> new object that contains all of the original data and\nthe new modification. The original is left unchanged.</p>\n<aside name=\"copy\">\n<p>This sounds like it might waste tons of memory and time copying the structure\nfor each operation. In practice, persistent data structures share most of their\ndata between the different &ldquo;copies&rdquo;.</p>\n</aside>\n<p>If we were to apply that technique to Environment, then every time you declared\na variable it would return a <em>new</em> environment that contained all of the\npreviously declared variables along with the one new name. Declaring a variable\nwould do the implicit &ldquo;split&rdquo; where you have an environment before the variable\nis declared and one after:</p><img src=\"image/resolving-and-binding/split.png\" alt=\"Separate environments before and after the variable is declared.\" />\n<p>A closure retains a reference to the Environment instance in play when the\nfunction was declared. Since any later declarations in that block would produce\nnew Environment objects, the closure wouldn&rsquo;t see the new variables and our bug\nwould be fixed.</p>\n<p>This is a legit way to solve the problem, and it&rsquo;s the classic way to implement\nenvironments in Scheme interpreters. 
We could do that for Lox, but it would mean\ngoing back and changing a pile of existing code.</p>\n<p>I won&rsquo;t drag you through that. We&rsquo;ll keep the way we represent environments the\nsame. Instead of making the data more statically structured, we&rsquo;ll bake the\nstatic resolution into the access <em>operation</em> itself.</p>\n<h2><a href=\"#semantic-analysis\" id=\"semantic-analysis\"><small>11&#8202;.&#8202;2</small>Semantic Analysis</a></h2>\n<p>Our interpreter <strong>resolves</strong> a variable<span class=\"em\">&mdash;</span>tracks down which declaration it\nrefers to<span class=\"em\">&mdash;</span>each and every time the variable expression is evaluated. If that\nvariable is swaddled inside a loop that runs a thousand times, that variable\ngets re-resolved a thousand times.</p>\n<p>We know static scope means that a variable usage always resolves to the same\ndeclaration, which can be determined just by looking at the text. Given that,\nwhy are we doing it dynamically every time? Doing so doesn&rsquo;t just open the hole\nthat leads to our annoying bug, it&rsquo;s also needlessly slow.</p>\n<p>A better solution is to resolve each variable use <em>once</em>. Write a chunk of code\nthat inspects the user&rsquo;s program, finds every variable mentioned, and figures\nout which declaration each refers to. This process is an example of a <strong>semantic\nanalysis</strong>. Where a parser tells only if a program is grammatically correct (a\n<em>syntactic</em> analysis), semantic analysis goes farther and starts to figure out\nwhat pieces of the program actually mean. In this case, our analysis will\nresolve variable bindings. We&rsquo;ll know not just that an expression <em>is</em> a\nvariable, but <em>which</em> variable it is.</p>\n<p>There are a lot of ways we could store the binding between a variable and its\ndeclaration. 
When we get to the C interpreter for Lox, we&rsquo;ll have a <em>much</em> more\nefficient way of storing and accessing local variables. But for jlox, I want to\nminimize the collateral damage we inflict on our existing codebase. I&rsquo;d hate to\nthrow out a bunch of mostly fine code.</p>\n<p>Instead, we&rsquo;ll store the resolution in a way that makes the most out of our\nexisting Environment class. Recall how the accesses of <code>a</code> are interpreted in\nthe problematic example.</p><img src=\"image/resolving-and-binding/environment-3.png\" alt=\"An empty environment for showA()'s body linking to the previous two. 'a' is resolved in the global environment.\" />\n<p>In the first (correct) evaluation, we look at three environments in the chain\nbefore finding the global declaration of <code>a</code>. Then, when the inner <code>a</code> is later\ndeclared in a block scope, it shadows the global one.</p><img src=\"image/resolving-and-binding/environment-5.png\" alt=\"An empty environment for showA()'s body linking to the previous two. 'a' is resolved in the block environment.\" />\n<p>The next lookup walks the chain, finds <code>a</code> in the <em>second</em> environment and\nstops there. Each environment corresponds to a single lexical scope where\nvariables are declared. If we could ensure a variable lookup always walked the\n<em>same</em> number of links in the environment chain, that would ensure that it\nfound the same variable in the same scope every time.</p>\n<p>To &ldquo;resolve&rdquo; a variable usage, we only need to calculate how many &ldquo;hops&rdquo; away\nthe declared variable will be in the environment chain. The interesting question\nis <em>when</em> to do this calculation<span class=\"em\">&mdash;</span>or, put differently, where in our\ninterpreter&rsquo;s implementation do we stuff the code for it?</p>\n<p>Since we&rsquo;re calculating a static property based on the structure of the source\ncode, the obvious answer is in the parser. 
That is the traditional home, and is\nwhere we&rsquo;ll put it later in clox. It would work here too, but I want an excuse to\nshow you another technique. We&rsquo;ll write our resolver as a separate pass.</p>\n<h3><a href=\"#a-variable-resolution-pass\" id=\"a-variable-resolution-pass\"><small>11&#8202;.&#8202;2&#8202;.&#8202;1</small>A variable resolution pass</a></h3>\n<p>After the parser produces the syntax tree, but before the interpreter starts\nexecuting it, we&rsquo;ll do a single walk over the tree to resolve all of the\nvariables it contains. Additional passes between parsing and execution are\ncommon. If Lox had static types, we could slide a type checker in there.\nOptimizations are often implemented in separate passes like this too. Basically,\nany work that doesn&rsquo;t rely on state that&rsquo;s only available at runtime can be done\nin this way.</p>\n<p>Our variable resolution pass works like a sort of mini-interpreter. It walks the\ntree, visiting each node, but a static analysis is different from a dynamic\nexecution:</p>\n<ul>\n<li>\n<p><strong>There are no side effects.</strong> When the static analysis visits a print\nstatement, it doesn&rsquo;t actually print anything. Calls to native functions or\nother operations that reach out to the outside world are stubbed out and\nhave no effect.</p>\n</li>\n<li>\n<p><strong>There is no control flow.</strong> Loops are visited only <span\nname=\"fix\">once</span>. Both branches are visited in <code>if</code> statements. Logic\noperators are not short-circuited.</p>\n</li>\n</ul>\n<aside name=\"fix\">\n<p>Variable resolution touches each node once, so its performance is <em>O(n)</em> where\n<em>n</em> is the number of syntax tree nodes. More sophisticated analyses may have\ngreater complexity, but most are carefully designed to be linear or not far from\nit. 
It&rsquo;s an embarrassing faux pas if your compiler gets exponentially slower as\nthe user&rsquo;s program grows.</p>\n</aside>\n<h2><a href=\"#a-resolver-class\" id=\"a-resolver-class\"><small>11&#8202;.&#8202;3</small>A Resolver Class</a></h2>\n<p>Like everything in Java, our variable resolution pass is embodied in a class.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">import</span> <span class=\"i\">java.util.HashMap</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.util.List</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.util.Map</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.util.Stack</span>;\n\n<span class=\"k\">class</span> <span class=\"t\">Resolver</span> <span class=\"k\">implements</span> <span class=\"t\">Expr</span>.<span class=\"t\">Visitor</span>&lt;<span class=\"t\">Void</span>&gt;, <span class=\"t\">Stmt</span>.<span class=\"t\">Visitor</span>&lt;<span class=\"t\">Void</span>&gt; {\n  <span class=\"k\">private</span> <span class=\"k\">final</span> <span class=\"t\">Interpreter</span> <span class=\"i\">interpreter</span>;\n\n  <span class=\"t\">Resolver</span>(<span class=\"t\">Interpreter</span> <span class=\"i\">interpreter</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">interpreter</span> = <span class=\"i\">interpreter</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, create new file</div>\n\n<p>Since the resolver needs to visit every node in the syntax tree, it implements\nthe visitor abstraction we already have in place. 
Only a few kinds of nodes are\ninteresting when it comes to resolving variables:</p>\n<ul>\n<li>\n<p>A block statement introduces a new scope for the statements it contains.</p>\n</li>\n<li>\n<p>A function declaration introduces a new scope for its body and binds its\nparameters in that scope.</p>\n</li>\n<li>\n<p>A variable declaration adds a new variable to the current scope.</p>\n</li>\n<li>\n<p>Variable and assignment expressions need to have their variables resolved.</p>\n</li>\n</ul>\n<p>The rest of the nodes don&rsquo;t do anything special, but we still need to implement\nvisit methods for them that traverse into their subtrees. Even though a <code>+</code>\nexpression doesn&rsquo;t <em>itself</em> have any variables to resolve, either of its\noperands might.</p>\n<h3><a href=\"#resolving-blocks\" id=\"resolving-blocks\"><small>11&#8202;.&#8202;3&#8202;.&#8202;1</small>Resolving blocks</a></h3>\n<p>We start with blocks since they create the local scopes where all the magic\nhappens.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>Resolver</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitBlockStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Block</span> <span class=\"i\">stmt</span>) {\n    <span class=\"i\">beginScope</span>();\n    <span class=\"i\">resolve</span>(<span class=\"i\">stmt</span>.<span class=\"i\">statements</span>);\n    <span class=\"i\">endScope</span>();\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>Resolver</em>()</div>\n\n<p>This begins a new scope, traverses into the statements inside the block, and\nthen discards the scope. The fun stuff lives in those helper methods. 
We start\nwith the simple one.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>Resolver</em>()</div>\n<pre>  <span class=\"t\">void</span> <span class=\"i\">resolve</span>(<span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>&gt; <span class=\"i\">statements</span>) {\n    <span class=\"k\">for</span> (<span class=\"t\">Stmt</span> <span class=\"i\">statement</span> : <span class=\"i\">statements</span>) {\n      <span class=\"i\">resolve</span>(<span class=\"i\">statement</span>);\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>Resolver</em>()</div>\n\n<p>This walks a list of statements and resolves each one. It in turn calls:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitBlockStmt</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">resolve</span>(<span class=\"t\">Stmt</span> <span class=\"i\">stmt</span>) {\n    <span class=\"i\">stmt</span>.<span class=\"i\">accept</span>(<span class=\"k\">this</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitBlockStmt</em>()</div>\n\n<p>While we&rsquo;re at it, let&rsquo;s add another overload that we&rsquo;ll need later for\nresolving an expression.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>resolve</em>(Stmt stmt)</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">resolve</span>(<span class=\"t\">Expr</span> <span class=\"i\">expr</span>) {\n    <span class=\"i\">expr</span>.<span class=\"i\">accept</span>(<span class=\"k\">this</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>resolve</em>(Stmt stmt)</div>\n\n<p>These methods are similar to the <code>evaluate()</code> and 
<code>execute()</code> methods in\nInterpreter<span class=\"em\">&mdash;</span>they turn around and apply the Visitor pattern to the given\nsyntax tree node.</p>\n<p>The real interesting behavior is around scopes. A new block scope is created\nlike so:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>resolve</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">beginScope</span>() {\n    <span class=\"i\">scopes</span>.<span class=\"i\">push</span>(<span class=\"k\">new</span> <span class=\"t\">HashMap</span>&lt;<span class=\"t\">String</span>, <span class=\"t\">Boolean</span>&gt;());\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>resolve</em>()</div>\n\n<p>Lexical scopes nest in both the interpreter and the resolver. They behave like a\nstack. The interpreter implements that stack using a linked list<span class=\"em\">&mdash;</span>the chain of\nEnvironment objects. In the resolver, we use an actual Java Stack.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  private final Interpreter interpreter;\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin class <em>Resolver</em></div>\n<pre class=\"insert\">  <span class=\"k\">private</span> <span class=\"k\">final</span> <span class=\"t\">Stack</span>&lt;<span class=\"t\">Map</span>&lt;<span class=\"t\">String</span>, <span class=\"t\">Boolean</span>&gt;&gt; <span class=\"i\">scopes</span> = <span class=\"k\">new</span> <span class=\"t\">Stack</span>&lt;&gt;();\n</pre><pre class=\"insert-after\">\n\n  Resolver(Interpreter interpreter) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in class <em>Resolver</em></div>\n\n<p>This field keeps track of the stack of scopes currently, uh, in scope. Each\nelement in the stack is a Map representing a single block scope. Keys, as in\nEnvironment, are variable names. 
The values are Booleans, for a reason I&rsquo;ll\nexplain soon.</p>\n<p>The scope stack is only used for local block scopes. Variables declared at the\ntop level in the global scope are not tracked by the resolver since they are\nmore dynamic in Lox. When resolving a variable, if we can&rsquo;t find it in the stack\nof local scopes, we assume it must be global.</p>\n<p>Since scopes are stored in an explicit stack, exiting one is straightforward.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>beginScope</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">endScope</span>() {\n    <span class=\"i\">scopes</span>.<span class=\"i\">pop</span>();\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>beginScope</em>()</div>\n\n<p>Now we can push and pop a stack of empty scopes. Let&rsquo;s put some things in them.</p>\n<h3><a href=\"#resolving-variable-declarations\" id=\"resolving-variable-declarations\"><small>11&#8202;.&#8202;3&#8202;.&#8202;2</small>Resolving variable declarations</a></h3>\n<p>Resolving a variable declaration adds a new entry to the current innermost\nscope&rsquo;s map. 
That seems simple, but there&rsquo;s a little dance we need to do.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitBlockStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitVarStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Var</span> <span class=\"i\">stmt</span>) {\n    <span class=\"i\">declare</span>(<span class=\"i\">stmt</span>.<span class=\"i\">name</span>);\n    <span class=\"k\">if</span> (<span class=\"i\">stmt</span>.<span class=\"i\">initializer</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">resolve</span>(<span class=\"i\">stmt</span>.<span class=\"i\">initializer</span>);\n    }\n    <span class=\"i\">define</span>(<span class=\"i\">stmt</span>.<span class=\"i\">name</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitBlockStmt</em>()</div>\n\n<p>We split binding into two steps, declaring then defining, in order to handle\nfunny edge cases like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;outer&quot;</span>;\n{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"i\">a</span>;\n}\n</pre></div>\n<p>What happens when the initializer for a local variable refers to a variable with\nthe same name as the variable being declared? 
We have a few options:</p>\n<ol>\n<li>\n<p><strong>Run the initializer, then put the new variable in scope.</strong> Here, the new\nlocal <code>a</code> would be initialized with &ldquo;outer&rdquo;, the value of the <em>global</em> one.\nIn other words, the previous declaration would desugar to:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">temp</span> = <span class=\"i\">a</span>; <span class=\"c\">// Run the initializer.</span>\n<span class=\"k\">var</span> <span class=\"i\">a</span>;        <span class=\"c\">// Declare the variable.</span>\n<span class=\"i\">a</span> = <span class=\"i\">temp</span>;     <span class=\"c\">// Initialize it.</span>\n</pre></div>\n</li>\n<li>\n<p><strong>Put the new variable in scope, then run the initializer.</strong> This means you\ncould observe a variable before it&rsquo;s initialized, so we would need to figure\nout what value it would have then. Probably <code>nil</code>. That means the new local\n<code>a</code> would be re-initialized to its own implicitly initialized value, <code>nil</code>.\nNow the desugaring would look like:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span>; <span class=\"c\">// Define the variable.</span>\n<span class=\"i\">a</span> = <span class=\"i\">a</span>; <span class=\"c\">// Run the initializer.</span>\n</pre></div>\n</li>\n<li>\n<p><strong>Make it an error to reference a variable in its initializer.</strong> Have the\ninterpreter fail either at compile time or runtime if an initializer\nmentions the variable being initialized.</p>\n</li>\n</ol>\n<p>Do either of those first two options look like something a user actually\n<em>wants</em>? Shadowing is rare and often an error, so initializing a shadowing\nvariable based on the value of the shadowed one seems unlikely to be deliberate.</p>\n<p>The second option is even less useful. The new variable will <em>always</em> have the\nvalue <code>nil</code>. 
There is never any point in mentioning it by name. You could use an\nexplicit <code>nil</code> instead.</p>\n<p>Since the first two options are likely to mask user errors, we&rsquo;ll take the\nthird. Further, we&rsquo;ll make it a compile error instead of a runtime one. That\nway, the user is alerted to the problem before any code is run.</p>\n<p>In order to do that, as we visit expressions, we need to know if we&rsquo;re inside\nthe initializer for some variable. We do that by splitting binding into two\nsteps. The first is <strong>declaring</strong> it.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>endScope</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">declare</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">scopes</span>.<span class=\"i\">isEmpty</span>()) <span class=\"k\">return</span>;\n\n    <span class=\"t\">Map</span>&lt;<span class=\"t\">String</span>, <span class=\"t\">Boolean</span>&gt; <span class=\"i\">scope</span> = <span class=\"i\">scopes</span>.<span class=\"i\">peek</span>();\n    <span class=\"i\">scope</span>.<span class=\"i\">put</span>(<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>, <span class=\"k\">false</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>endScope</em>()</div>\n\n<p>Declaration adds the variable to the innermost scope so that it shadows any\nouter one and so that we know the variable exists. We mark it as &ldquo;not ready yet&rdquo;\nby binding its name to <code>false</code> in the scope map. 
The value associated with a key\nin the scope map represents whether or not we have finished resolving that\nvariable&rsquo;s initializer.</p>\n<p>After declaring the variable, we resolve its initializer expression in that same\nscope where the new variable now exists but is unavailable. Once the initializer\nexpression is done, the variable is ready for prime time. We do that by\n<strong>defining</strong> it.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>declare</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">define</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">scopes</span>.<span class=\"i\">isEmpty</span>()) <span class=\"k\">return</span>;\n    <span class=\"i\">scopes</span>.<span class=\"i\">peek</span>().<span class=\"i\">put</span>(<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>, <span class=\"k\">true</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>declare</em>()</div>\n\n<p>We set the variable&rsquo;s value in the scope map to <code>true</code> to mark it as fully\ninitialized and available for use. It&rsquo;s alive! </p>\n<h3><a href=\"#resolving-variable-expressions\" id=\"resolving-variable-expressions\"><small>11&#8202;.&#8202;3&#8202;.&#8202;3</small>Resolving variable expressions</a></h3>\n<p>Variable declarations<span class=\"em\">&mdash;</span>and function declarations, which we&rsquo;ll get to<span class=\"em\">&mdash;</span>write\nto the scope maps. 
Those maps are read when we resolve variable expressions.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitVarStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitVariableExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Variable</span> <span class=\"i\">expr</span>) {\n    <span class=\"k\">if</span> (!<span class=\"i\">scopes</span>.<span class=\"i\">isEmpty</span>() &amp;&amp;\n        <span class=\"i\">scopes</span>.<span class=\"i\">peek</span>().<span class=\"i\">get</span>(<span class=\"i\">expr</span>.<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>) == <span class=\"t\">Boolean</span>.<span class=\"i\">FALSE</span>) {\n      <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">expr</span>.<span class=\"i\">name</span>,\n          <span class=\"s\">&quot;Can&#39;t read local variable in its own initializer.&quot;</span>);\n    }\n\n    <span class=\"i\">resolveLocal</span>(<span class=\"i\">expr</span>, <span class=\"i\">expr</span>.<span class=\"i\">name</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitVarStmt</em>()</div>\n\n<p>First, we check to see if the variable is being accessed inside its own\ninitializer. This is where the values in the scope map come into play. If the\nvariable exists in the current scope but its value is <code>false</code>, that means we\nhave declared it but not yet defined it. 
We report that error.</p>\n<p>After that check, we actually resolve the variable itself using this helper:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>define</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">resolveLocal</span>(<span class=\"t\">Expr</span> <span class=\"i\">expr</span>, <span class=\"t\">Token</span> <span class=\"i\">name</span>) {\n    <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"i\">scopes</span>.<span class=\"i\">size</span>() - <span class=\"n\">1</span>; <span class=\"i\">i</span> &gt;= <span class=\"n\">0</span>; <span class=\"i\">i</span>--) {\n      <span class=\"k\">if</span> (<span class=\"i\">scopes</span>.<span class=\"i\">get</span>(<span class=\"i\">i</span>).<span class=\"i\">containsKey</span>(<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>)) {\n        <span class=\"i\">interpreter</span>.<span class=\"i\">resolve</span>(<span class=\"i\">expr</span>, <span class=\"i\">scopes</span>.<span class=\"i\">size</span>() - <span class=\"n\">1</span> - <span class=\"i\">i</span>);\n        <span class=\"k\">return</span>;\n      }\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>define</em>()</div>\n\n<p>This looks, for good reason, a lot like the code in Environment for evaluating a\nvariable. We start at the innermost scope and work outwards, looking in each map\nfor a matching name. If we find the variable, we resolve it, passing in the\nnumber of scopes between the current innermost scope and the scope where the\nvariable was found. So, if the variable was found in the current scope, we\npass in 0. If it&rsquo;s in the immediately enclosing scope, 1. You get the idea.</p>\n<p>If we walk through all of the block scopes and never find the variable, we leave\nit unresolved and assume it&rsquo;s global. 
We&rsquo;ll get to the implementation of that\n<code>resolve()</code> method a little later. For now, let&rsquo;s keep on cranking through the\nother syntax nodes.</p>\n<h3><a href=\"#resolving-assignment-expressions\" id=\"resolving-assignment-expressions\"><small>11&#8202;.&#8202;3&#8202;.&#8202;4</small>Resolving assignment expressions</a></h3>\n<p>The other expression that references a variable is assignment. Resolving one\nlooks like this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitVarStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitAssignExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Assign</span> <span class=\"i\">expr</span>) {\n    <span class=\"i\">resolve</span>(<span class=\"i\">expr</span>.<span class=\"i\">value</span>);\n    <span class=\"i\">resolveLocal</span>(<span class=\"i\">expr</span>, <span class=\"i\">expr</span>.<span class=\"i\">name</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitVarStmt</em>()</div>\n\n<p>First, we resolve the expression for the assigned value in case it also contains\nreferences to other variables. Then we use our existing <code>resolveLocal()</code> method\nto resolve the variable that&rsquo;s being assigned to.</p>\n<h3><a href=\"#resolving-function-declarations\" id=\"resolving-function-declarations\"><small>11&#8202;.&#8202;3&#8202;.&#8202;5</small>Resolving function declarations</a></h3>\n<p>Finally, functions. Functions both bind names and introduce a scope. The name of\nthe function itself is bound in the surrounding scope where the function is\ndeclared. 
When we step into the function&rsquo;s body, we also bind its parameters\ninto that inner function scope.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitBlockStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitFunctionStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Function</span> <span class=\"i\">stmt</span>) {\n    <span class=\"i\">declare</span>(<span class=\"i\">stmt</span>.<span class=\"i\">name</span>);\n    <span class=\"i\">define</span>(<span class=\"i\">stmt</span>.<span class=\"i\">name</span>);\n\n    <span class=\"i\">resolveFunction</span>(<span class=\"i\">stmt</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitBlockStmt</em>()</div>\n\n<p>Similar to <code>visitVarStmt()</code>, we declare and define the name of the function\nin the current scope. Unlike variables, though, we define the name eagerly,\nbefore resolving the function&rsquo;s body. 
This lets a function recursively refer to\nitself inside its own body.</p>\n<p>Then we resolve the function&rsquo;s body using this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>resolve</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">resolveFunction</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Function</span> <span class=\"i\">function</span>) {\n    <span class=\"i\">beginScope</span>();\n    <span class=\"k\">for</span> (<span class=\"t\">Token</span> <span class=\"i\">param</span> : <span class=\"i\">function</span>.<span class=\"i\">params</span>) {\n      <span class=\"i\">declare</span>(<span class=\"i\">param</span>);\n      <span class=\"i\">define</span>(<span class=\"i\">param</span>);\n    }\n    <span class=\"i\">resolve</span>(<span class=\"i\">function</span>.<span class=\"i\">body</span>);\n    <span class=\"i\">endScope</span>();\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>resolve</em>()</div>\n\n<p>It&rsquo;s a separate method since we will also use it for resolving Lox methods when\nwe add classes later. It creates a new scope for the body and then binds\nvariables for each of the function&rsquo;s parameters.</p>\n<p>Once that&rsquo;s ready, it resolves the function body in that scope. This is\ndifferent from how the interpreter handles function declarations. At <em>runtime</em>,\ndeclaring a function doesn&rsquo;t do anything with the function&rsquo;s body. The body\ndoesn&rsquo;t get touched until later when the function is called. 
In a <em>static</em>\nanalysis, we immediately traverse into the body right then and there.</p>\n<h3><a href=\"#resolving-the-other-syntax-tree-nodes\" id=\"resolving-the-other-syntax-tree-nodes\"><small>11&#8202;.&#8202;3&#8202;.&#8202;6</small>Resolving the other syntax tree nodes</a></h3>\n<p>That covers the interesting corners of the grammars. We handle every place where\na variable is declared, read, or written, and every place where a scope is\ncreated or destroyed. Even though they aren&rsquo;t affected by variable resolution,\nwe also need visit methods for all of the other syntax tree nodes in order to\nrecurse into their subtrees. <span name=\"boring\">Sorry</span> this bit is\nboring, but bear with me. We&rsquo;ll go kind of &ldquo;top down&rdquo; and start with statements.</p>\n<aside name=\"boring\">\n<p>I did say the book would have every single line of code for these interpreters.\nI didn&rsquo;t say they&rsquo;d all be exciting.</p>\n</aside>\n<p>An expression statement contains a single expression to traverse.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitBlockStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitExpressionStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Expression</span> <span class=\"i\">stmt</span>) {\n    <span class=\"i\">resolve</span>(<span class=\"i\">stmt</span>.<span class=\"i\">expression</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitBlockStmt</em>()</div>\n\n<p>An if statement has an expression for its condition and one or two statements\nfor the branches.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitFunctionStmt</em>()</div>\n<pre>  <span 
class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitIfStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">If</span> <span class=\"i\">stmt</span>) {\n    <span class=\"i\">resolve</span>(<span class=\"i\">stmt</span>.<span class=\"i\">condition</span>);\n    <span class=\"i\">resolve</span>(<span class=\"i\">stmt</span>.<span class=\"i\">thenBranch</span>);\n    <span class=\"k\">if</span> (<span class=\"i\">stmt</span>.<span class=\"i\">elseBranch</span> != <span class=\"k\">null</span>) <span class=\"i\">resolve</span>(<span class=\"i\">stmt</span>.<span class=\"i\">elseBranch</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitFunctionStmt</em>()</div>\n\n<p>Here, we see how resolution is different from interpretation. When we resolve an\n<code>if</code> statement, there is no control flow. We resolve the condition and <em>both</em>\nbranches. 
Where a dynamic execution steps only into the branch that <em>is</em> run, a\nstatic analysis is conservative<span class=\"em\">&mdash;</span>it analyzes any branch that <em>could</em> be run.\nSince either one could be reached at runtime, we resolve both.</p>\n<p>Like expression statements, a <code>print</code> statement contains a single subexpression.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitIfStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitPrintStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Print</span> <span class=\"i\">stmt</span>) {\n    <span class=\"i\">resolve</span>(<span class=\"i\">stmt</span>.<span class=\"i\">expression</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitIfStmt</em>()</div>\n\n<p>Same deal for return.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitPrintStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitReturnStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Return</span> <span class=\"i\">stmt</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">stmt</span>.<span class=\"i\">value</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">resolve</span>(<span class=\"i\">stmt</span>.<span class=\"i\">value</span>);\n    }\n\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitPrintStmt</em>()</div>\n\n<p>As in <code>if</code> statements, with a <code>while</code> statement, we resolve its condition and\nresolve the body exactly 
once.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitVarStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitWhileStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">While</span> <span class=\"i\">stmt</span>) {\n    <span class=\"i\">resolve</span>(<span class=\"i\">stmt</span>.<span class=\"i\">condition</span>);\n    <span class=\"i\">resolve</span>(<span class=\"i\">stmt</span>.<span class=\"i\">body</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitVarStmt</em>()</div>\n\n<p>That covers all the statements. On to expressions<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<p>Our old friend the binary expression. We traverse into and resolve both\noperands.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitAssignExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitBinaryExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Binary</span> <span class=\"i\">expr</span>) {\n    <span class=\"i\">resolve</span>(<span class=\"i\">expr</span>.<span class=\"i\">left</span>);\n    <span class=\"i\">resolve</span>(<span class=\"i\">expr</span>.<span class=\"i\">right</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitAssignExpr</em>()</div>\n\n<p>Calls are similar<span class=\"em\">&mdash;</span>we walk the argument list and resolve them all. 
The thing\nbeing called is also an expression (usually a variable expression), so that gets\nresolved too.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitBinaryExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitCallExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Call</span> <span class=\"i\">expr</span>) {\n    <span class=\"i\">resolve</span>(<span class=\"i\">expr</span>.<span class=\"i\">callee</span>);\n\n    <span class=\"k\">for</span> (<span class=\"t\">Expr</span> <span class=\"i\">argument</span> : <span class=\"i\">expr</span>.<span class=\"i\">arguments</span>) {\n      <span class=\"i\">resolve</span>(<span class=\"i\">argument</span>);\n    }\n\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitBinaryExpr</em>()</div>\n\n<p>Parentheses are easy.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitCallExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitGroupingExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Grouping</span> <span class=\"i\">expr</span>) {\n    <span class=\"i\">resolve</span>(<span class=\"i\">expr</span>.<span class=\"i\">expression</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitCallExpr</em>()</div>\n\n<p>Literals are easiest of all.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitGroupingExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span 
class=\"t\">Void</span> <span class=\"i\">visitLiteralExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Literal</span> <span class=\"i\">expr</span>) {\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitGroupingExpr</em>()</div>\n\n<p>A literal expression doesn&rsquo;t mention any variables and doesn&rsquo;t contain any\nsubexpressions so there is no work to do.</p>\n<p>Since a static analysis does no control flow or short-circuiting, logical\nexpressions are exactly the same as other binary operators.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitLiteralExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitLogicalExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Logical</span> <span class=\"i\">expr</span>) {\n    <span class=\"i\">resolve</span>(<span class=\"i\">expr</span>.<span class=\"i\">left</span>);\n    <span class=\"i\">resolve</span>(<span class=\"i\">expr</span>.<span class=\"i\">right</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitLiteralExpr</em>()</div>\n\n<p>And, finally, the last node. 
We resolve its one operand.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>visitLogicalExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitUnaryExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Unary</span> <span class=\"i\">expr</span>) {\n    <span class=\"i\">resolve</span>(<span class=\"i\">expr</span>.<span class=\"i\">right</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>visitLogicalExpr</em>()</div>\n\n<p>With all of these visit methods, the Java compiler should be satisfied that\nResolver fully implements Stmt.Visitor and Expr.Visitor. Now is a good time to\ntake a break, have a snack, maybe a little nap.</p>\n<h2><a href=\"#interpreting-resolved-variables\" id=\"interpreting-resolved-variables\"><small>11&#8202;.&#8202;4</small>Interpreting Resolved Variables</a></h2>\n<p>Let&rsquo;s see what our resolver is good for. Each time it visits a variable, it\ntells the interpreter how many scopes there are between the current scope and\nthe scope where the variable is defined. At runtime, this corresponds exactly to\nthe number of <em>environments</em> between the current one and the enclosing one where\nthe interpreter can find the variable&rsquo;s value. 
The resolver hands that number to\nthe interpreter by calling this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>execute</em>()</div>\n<pre>  <span class=\"t\">void</span> <span class=\"i\">resolve</span>(<span class=\"t\">Expr</span> <span class=\"i\">expr</span>, <span class=\"t\">int</span> <span class=\"i\">depth</span>) {\n    <span class=\"i\">locals</span>.<span class=\"i\">put</span>(<span class=\"i\">expr</span>, <span class=\"i\">depth</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>execute</em>()</div>\n\n<p>We want to store the resolution information somewhere so we can use it when the\nvariable or assignment expression is later executed, but where? One obvious\nplace is right in the syntax tree node itself. That&rsquo;s a fine approach, and\nthat&rsquo;s where many compilers store the results of analyses like this.</p>\n<p>We could do that, but it would require mucking around with our syntax tree\ngenerator. Instead, we&rsquo;ll take another common approach and store it off to the\n<span name=\"side\">side</span> in a map that associates each syntax tree node\nwith its resolved data.</p>\n<aside name=\"side\">\n<p>I <em>think</em> I&rsquo;ve heard this map called a &ldquo;side table&rdquo; since it&rsquo;s a tabular data\nstructure that stores data separately from the objects it relates to. But\nwhenever I try to Google for that term, I get pages about furniture.</p>\n</aside>\n<p>Interactive tools like IDEs often incrementally reparse and re-resolve parts of\nthe user&rsquo;s program. It may be hard to find all of the bits of state that need\nrecalculating when they&rsquo;re hiding in the foliage of the syntax tree. 
A benefit\nof storing this data outside of the nodes is that it makes it easy to <em>discard</em>\nit<span class=\"em\">&mdash;</span>simply clear the map.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  private Environment environment = globals;\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin class <em>Interpreter</em></div>\n<pre class=\"insert\">  <span class=\"k\">private</span> <span class=\"k\">final</span> <span class=\"t\">Map</span>&lt;<span class=\"t\">Expr</span>, <span class=\"t\">Integer</span>&gt; <span class=\"i\">locals</span> = <span class=\"k\">new</span> <span class=\"t\">HashMap</span>&lt;&gt;();\n</pre><pre class=\"insert-after\">\n\n  Interpreter() {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in class <em>Interpreter</em></div>\n\n<p>You might think we&rsquo;d need some sort of nested tree structure to avoid getting\nconfused when there are multiple expressions that reference the same variable,\nbut each expression node is its own Java object with its own unique identity. 
A\nsingle monolithic map doesn&rsquo;t have any trouble keeping them separated.</p>\n<p>As usual, using a collection requires us to import a couple of names.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">import java.util.ArrayList;\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em></div>\n<pre class=\"insert\"><span class=\"k\">import</span> <span class=\"i\">java.util.HashMap</span>;\n</pre><pre class=\"insert-after\">import java.util.List;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em></div>\n\n<p>And:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">import java.util.List;\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em></div>\n<pre class=\"insert\"><span class=\"k\">import</span> <span class=\"i\">java.util.Map</span>;\n</pre><pre class=\"insert-after\">\n\nclass Interpreter implements Expr.Visitor&lt;Object&gt;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em></div>\n\n<h3><a href=\"#accessing-a-resolved-variable\" id=\"accessing-a-resolved-variable\"><small>11&#8202;.&#8202;4&#8202;.&#8202;1</small>Accessing a resolved variable</a></h3>\n<p>Our interpreter now has access to each variable&rsquo;s resolved location. Finally, we\nget to make use of that. 
We replace the visit method for variable expressions\nwith this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  public Object visitVariableExpr(Expr.Variable expr) {\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitVariableExpr</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"k\">return</span> <span class=\"i\">lookUpVariable</span>(<span class=\"i\">expr</span>.<span class=\"i\">name</span>, <span class=\"i\">expr</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitVariableExpr</em>(), replace 1 line</div>\n\n<p>That delegates to:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitVariableExpr</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Object</span> <span class=\"i\">lookUpVariable</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>, <span class=\"t\">Expr</span> <span class=\"i\">expr</span>) {\n    <span class=\"t\">Integer</span> <span class=\"i\">distance</span> = <span class=\"i\">locals</span>.<span class=\"i\">get</span>(<span class=\"i\">expr</span>);\n    <span class=\"k\">if</span> (<span class=\"i\">distance</span> != <span class=\"k\">null</span>) {\n      <span class=\"k\">return</span> <span class=\"i\">environment</span>.<span class=\"i\">getAt</span>(<span class=\"i\">distance</span>, <span class=\"i\">name</span>.<span class=\"i\">lexeme</span>);\n    } <span class=\"k\">else</span> {\n      <span class=\"k\">return</span> <span class=\"i\">globals</span>.<span class=\"i\">get</span>(<span class=\"i\">name</span>);\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitVariableExpr</em>()</div>\n\n<p>There are a couple of things going on here. First, we look up the resolved\ndistance in the map. 
Remember that we resolved only <em>local</em> variables. Globals\nare treated specially and don&rsquo;t end up in the map (hence the name <code>locals</code>). So,\nif we don&rsquo;t find a distance in the map, it must be global. In that case, we\nlook it up, dynamically, directly in the global environment. That throws a\nruntime error if the variable isn&rsquo;t defined.</p>\n<p>If we <em>do</em> get a distance, we have a local variable, and we get to take\nadvantage of the results of our static analysis. Instead of calling <code>get()</code>, we\ncall this new method on Environment:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Environment.java</em><br>\nadd after <em>define</em>()</div>\n<pre>  <span class=\"t\">Object</span> <span class=\"i\">getAt</span>(<span class=\"t\">int</span> <span class=\"i\">distance</span>, <span class=\"t\">String</span> <span class=\"i\">name</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">ancestor</span>(<span class=\"i\">distance</span>).<span class=\"i\">values</span>.<span class=\"i\">get</span>(<span class=\"i\">name</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Environment.java</em>, add after <em>define</em>()</div>\n\n<p>The old <code>get()</code> method dynamically walks the chain of enclosing environments,\nscouring each one to see if the variable might be hiding in there somewhere. But\nnow we know exactly which environment in the chain will have the variable. 
We\nreach it using this helper method:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Environment.java</em><br>\nadd after <em>define</em>()</div>\n<pre>  <span class=\"t\">Environment</span> <span class=\"i\">ancestor</span>(<span class=\"t\">int</span> <span class=\"i\">distance</span>) {\n    <span class=\"t\">Environment</span> <span class=\"i\">environment</span> = <span class=\"k\">this</span>;\n    <span class=\"k\">for</span> (<span class=\"t\">int</span> <span class=\"i\">i</span> = <span class=\"n\">0</span>; <span class=\"i\">i</span> &lt; <span class=\"i\">distance</span>; <span class=\"i\">i</span>++) {\n      <span class=\"i\">environment</span> = <span class=\"i\">environment</span>.<span class=\"i\">enclosing</span>;<span name=\"coupled\"> </span>\n    }\n\n    <span class=\"k\">return</span> <span class=\"i\">environment</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Environment.java</em>, add after <em>define</em>()</div>\n\n<p>This walks a fixed number of hops up the parent chain and returns the\nenvironment there. Once we have that, <code>getAt()</code> simply returns the value of the\nvariable in that environment&rsquo;s map. It doesn&rsquo;t even have to check to see if the\nvariable is there<span class=\"em\">&mdash;</span>we know it will be because the resolver already found it\nbefore.</p>\n<aside name=\"coupled\">\n<p>The way the interpreter assumes the variable is in that map feels like flying\nblind. The interpreter code trusts that the resolver did its job and resolved\nthe variable correctly. This implies a deep coupling between these two classes.\nIn the resolver, each line of code that touches a scope must have its exact\nmatch in the interpreter for modifying an environment.</p>\n<p>I felt that coupling firsthand because as I wrote the code for the book, I\nran into a couple of subtle bugs where the resolver and interpreter code were\nslightly out of sync. Tracking those down was difficult. 
One tool to make that\neasier is to have the interpreter explicitly assert<span class=\"em\">&mdash;</span>using Java&rsquo;s assert\nstatements or some other validation tool<span class=\"em\">&mdash;</span>the contract it expects the resolver\nto have already upheld.</p>\n</aside>\n<h3><a href=\"#assigning-to-a-resolved-variable\" id=\"assigning-to-a-resolved-variable\"><small>11&#8202;.&#8202;4&#8202;.&#8202;2</small>Assigning to a resolved variable</a></h3>\n<p>We can also use a variable by assigning to it. The changes to visiting an\nassignment expression are similar.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  public Object visitAssignExpr(Expr.Assign expr) {\n    Object value = evaluate(expr.value);\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin <em>visitAssignExpr</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">\n\n    <span class=\"t\">Integer</span> <span class=\"i\">distance</span> = <span class=\"i\">locals</span>.<span class=\"i\">get</span>(<span class=\"i\">expr</span>);\n    <span class=\"k\">if</span> (<span class=\"i\">distance</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">environment</span>.<span class=\"i\">assignAt</span>(<span class=\"i\">distance</span>, <span class=\"i\">expr</span>.<span class=\"i\">name</span>, <span class=\"i\">value</span>);\n    } <span class=\"k\">else</span> {\n      <span class=\"i\">globals</span>.<span class=\"i\">assign</span>(<span class=\"i\">expr</span>.<span class=\"i\">name</span>, <span class=\"i\">value</span>);\n    }\n\n</pre><pre class=\"insert-after\">    return value;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in <em>visitAssignExpr</em>(), replace 1 line</div>\n\n<p>Again, we look up the variable&rsquo;s scope distance. If not found, we assume it&rsquo;s\nglobal and handle it the same way as before. 
Otherwise, we call this new method:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Environment.java</em><br>\nadd after <em>getAt</em>()</div>\n<pre>  <span class=\"t\">void</span> <span class=\"i\">assignAt</span>(<span class=\"t\">int</span> <span class=\"i\">distance</span>, <span class=\"t\">Token</span> <span class=\"i\">name</span>, <span class=\"t\">Object</span> <span class=\"i\">value</span>) {\n    <span class=\"i\">ancestor</span>(<span class=\"i\">distance</span>).<span class=\"i\">values</span>.<span class=\"i\">put</span>(<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>, <span class=\"i\">value</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Environment.java</em>, add after <em>getAt</em>()</div>\n\n<p>As <code>getAt()</code> is to <code>get()</code>, <code>assignAt()</code> is to <code>assign()</code>. It walks a fixed\nnumber of environments, and then stuffs the new value in that map.</p>\n<p>Those are the only changes to Interpreter. This is why I chose a representation\nfor our resolved data that was minimally invasive. All of the rest of the nodes\ncontinue working as they did before. Even the code for modifying environments is\nunchanged.</p>\n<h3><a href=\"#running-the-resolver\" id=\"running-the-resolver\"><small>11&#8202;.&#8202;4&#8202;.&#8202;3</small>Running the resolver</a></h3>\n<p>We do need to actually <em>run</em> the resolver, though. 
We insert the new pass after\nthe parser does its magic.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    // Stop if there was a syntax error.\n    if (hadError) return;\n\n</pre><div class=\"source-file\"><em>lox/Lox.java</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">    <span class=\"t\">Resolver</span> <span class=\"i\">resolver</span> = <span class=\"k\">new</span> <span class=\"t\">Resolver</span>(<span class=\"i\">interpreter</span>);\n    <span class=\"i\">resolver</span>.<span class=\"i\">resolve</span>(<span class=\"i\">statements</span>);\n\n</pre><pre class=\"insert-after\">    interpreter.interpret(statements);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, in <em>run</em>()</div>\n\n<p>We don&rsquo;t run the resolver if there are any parse errors. If the code has a\nsyntax error, it&rsquo;s never going to run, so there&rsquo;s little value in resolving it.\nIf the syntax is clean, we tell the resolver to do its thing. The resolver has a\nreference to the interpreter and pokes the resolution data directly into it as\nit walks over variables. When the interpreter runs next, it has everything it\nneeds.</p>\n<p>At least, that&rsquo;s true if the resolver <em>succeeds</em>. But what about errors during\nresolution?</p>\n<h2><a href=\"#resolution-errors\" id=\"resolution-errors\"><small>11&#8202;.&#8202;5</small>Resolution Errors</a></h2>\n<p>Since we are doing a semantic analysis pass, we have an opportunity to make\nLox&rsquo;s semantics more precise, and to help users catch bugs early before running\ntheir code. 
Take a look at this bad boy:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">bad</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;first&quot;</span>;\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;second&quot;</span>;\n}\n</pre></div>\n<p>We do allow declaring multiple variables with the same name in the <em>global</em>\nscope, but doing so in a local scope is probably a mistake. If they knew the\nvariable already existed, they would have assigned to it instead of using <code>var</code>.\nAnd if they <em>didn&rsquo;t</em> know it existed, they probably didn&rsquo;t intend to overwrite\nthe previous one.</p>\n<p>We can detect this mistake statically while resolving.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    Map&lt;String, Boolean&gt; scope = scopes.peek();\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>declare</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">scope</span>.<span class=\"i\">containsKey</span>(<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>)) {\n      <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">name</span>,\n          <span class=\"s\">&quot;Already a variable with this name in this scope.&quot;</span>);\n    }\n\n</pre><pre class=\"insert-after\">    scope.put(name.lexeme, false);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>declare</em>()</div>\n\n<p>When we declare a variable in a local scope, we already know the names of every\nvariable previously declared in that same scope. 
If we see a collision, we\nreport an error.</p>\n<h3><a href=\"#invalid-return-errors\" id=\"invalid-return-errors\"><small>11&#8202;.&#8202;5&#8202;.&#8202;1</small>Invalid return errors</a></h3>\n<p>Here&rsquo;s another nasty little script:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">return</span> <span class=\"s\">&quot;at top level&quot;</span>;\n</pre></div>\n<p>This executes a <code>return</code> statement, but it&rsquo;s not even inside a function at all.\nIt&rsquo;s top-level code. I don&rsquo;t know what the user <em>thinks</em> is going to happen, but\nI don&rsquo;t think we want Lox to allow this.</p>\n<p>We can extend the resolver to detect this statically. Much like we track scopes\nas we walk the tree, we can track whether or not the code we are currently\nvisiting is inside a function declaration.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  private final Stack&lt;Map&lt;String, Boolean&gt;&gt; scopes = new Stack&lt;&gt;();\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin class <em>Resolver</em></div>\n<pre class=\"insert\">  <span class=\"k\">private</span> <span class=\"t\">FunctionType</span> <span class=\"i\">currentFunction</span> = <span class=\"t\">FunctionType</span>.<span class=\"i\">NONE</span>;\n</pre><pre class=\"insert-after\">\n\n  Resolver(Interpreter interpreter) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in class <em>Resolver</em></div>\n\n<p>Instead of a bare Boolean, we use this funny enum:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nadd after <em>Resolver</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"k\">enum</span> <span class=\"t\">FunctionType</span> {\n    <span class=\"i\">NONE</span>,\n    <span class=\"i\">FUNCTION</span>\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, add after <em>Resolver</em>()</div>\n\n<p>It seems kind of dumb now, but 
we&rsquo;ll add a couple more cases to it later and\nthen it will make more sense. When we resolve a function declaration, we pass\nthat in.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    define(stmt.name);\n\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitFunctionStmt</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"i\">resolveFunction</span>(<span class=\"i\">stmt</span>, <span class=\"t\">FunctionType</span>.<span class=\"i\">FUNCTION</span>);\n</pre><pre class=\"insert-after\">    return null;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitFunctionStmt</em>(), replace 1 line</div>\n\n<p>Over in <code>resolveFunction()</code>, we take that parameter and store it in the field\nbefore resolving the body.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nmethod <em>resolveFunction</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">resolveFunction</span>(\n      <span class=\"t\">Stmt</span>.<span class=\"t\">Function</span> <span class=\"i\">function</span>, <span class=\"t\">FunctionType</span> <span class=\"i\">type</span>) {\n    <span class=\"t\">FunctionType</span> <span class=\"i\">enclosingFunction</span> = <span class=\"i\">currentFunction</span>;\n    <span class=\"i\">currentFunction</span> = <span class=\"i\">type</span>;\n\n</pre><pre class=\"insert-after\">    beginScope();\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, method <em>resolveFunction</em>(), replace 1 line</div>\n\n<p>We stash the previous value of the field in a local variable first. Remember,\nLox has local functions, so you can nest function declarations arbitrarily\ndeeply. 
We need to track not just that we&rsquo;re in a function, but <em>how many</em> we&rsquo;re\nin.</p>\n<p>We could use an explicit stack of FunctionType values for that, but instead\nwe&rsquo;ll piggyback on the JVM. We store the previous value in a local on the Java\nstack. When we&rsquo;re done resolving the function body, we restore the field to that\nvalue.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    endScope();\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>resolveFunction</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">currentFunction</span> = <span class=\"i\">enclosingFunction</span>;\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>resolveFunction</em>()</div>\n\n<p>Now that we can always tell whether or not we&rsquo;re inside a function declaration,\nwe check that when resolving a <code>return</code> statement.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  public Void visitReturnStmt(Stmt.Return stmt) {\n</pre><div class=\"source-file\"><em>lox/Resolver.java</em><br>\nin <em>visitReturnStmt</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">currentFunction</span> == <span class=\"t\">FunctionType</span>.<span class=\"i\">NONE</span>) {\n      <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">stmt</span>.<span class=\"i\">keyword</span>, <span class=\"s\">&quot;Can&#39;t return from top-level code.&quot;</span>);\n    }\n\n</pre><pre class=\"insert-after\">    if (stmt.value != null) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Resolver.java</em>, in <em>visitReturnStmt</em>()</div>\n\n<p>Neat, right?</p>\n<p>There&rsquo;s one more piece. Back in the main Lox class that stitches everything\ntogether, we are careful to not run the interpreter if any parse errors are\nencountered. 
That check runs <em>before</em> the resolver so that we don&rsquo;t try to\nresolve syntactically invalid code.</p>\n<p>But we also need to skip the interpreter if there are resolution errors, so we\nadd <em>another</em> check.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    resolver.resolve(statements);\n</pre><div class=\"source-file\"><em>lox/Lox.java</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"c\">// Stop if there was a resolution error.</span>\n    <span class=\"k\">if</span> (<span class=\"i\">hadError</span>) <span class=\"k\">return</span>;\n</pre><pre class=\"insert-after\">\n\n    interpreter.interpret(statements);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, in <em>run</em>()</div>\n\n<p>You could imagine doing lots of other analysis in here. For example, if we added\n<code>break</code> statements to Lox, we would probably want to ensure they are only used\ninside loops.</p>\n<p>We could go farther and report warnings for code that isn&rsquo;t necessarily <em>wrong</em>\nbut probably isn&rsquo;t useful. For example, many IDEs will warn if you have\nunreachable code after a <code>return</code> statement, or a local variable whose value is\nnever read. All of that would be pretty easy to add to our static visiting pass,\nor as <span name=\"separate\">separate</span> passes.</p>\n<aside name=\"separate\">\n<p>The choice of how many different analyses to lump into a single pass is\ndifficult. Many small isolated passes, each with their own responsibility, are\nsimpler to implement and maintain. However, there is a real runtime cost to\ntraversing the syntax tree itself, so bundling multiple analyses into a single\npass is usually faster.</p>\n</aside>\n<p>But, for now, we&rsquo;ll stick with that limited amount of analysis. 
The important\npart is that we fixed that one weird annoying edge case bug, though it might be\nsurprising that it took this much work to do it.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Why is it safe to eagerly define the variable bound to a function&rsquo;s name\nwhen other variables must wait until after they are initialized before they\ncan be used?</p>\n</li>\n<li>\n<p>How do other languages you know handle local variables that refer to the\nsame name in their initializer, like:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;outer&quot;</span>;\n{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"i\">a</span>;\n}\n</pre></div>\n<p>Is it a runtime error? Compile error? Allowed? Do they treat global\nvariables differently? Do you agree with their choices? Justify your answer.</p>\n</li>\n<li>\n<p>Extend the resolver to report an error if a local variable is never used.</p>\n</li>\n<li>\n<p>Our resolver calculates <em>which</em> environment the variable is found in, but\nit&rsquo;s still looked up by name in that map. A more efficient environment\nrepresentation would store local variables in an array and look them up by\nindex.</p>\n<p>Extend the resolver to associate a unique index for each local variable\ndeclared in a scope. When resolving a variable access, look up both the\nscope the variable is in and its index and store that. 
In the interpreter,\nuse that to quickly access a variable by its index instead of using a map.</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"classes.html\" class=\"next\">\n  Next Chapter: &ldquo;Classes&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/scanning-on-demand.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Scanning on Demand &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Scanning on Demand<small>16</small></a></h3>\n\n<ul>\n    <li><a href=\"#spinning-up-the-interpreter\"><small>16.1</small> Spinning Up the Interpreter</a></li>\n    <li><a href=\"#a-token-at-a-time\"><small>16.2</small> A Token at a Time</a></li>\n    <li><a href=\"#a-lexical-grammar-for-lox\"><small>16.3</small> A Lexical Grammar for Lox</a></li>\n    <li><a href=\"#identifiers-and-keywords\"><small>16.4</small> Identifiers and Keywords</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a 
href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"a-virtual-machine.html\" title=\"A Virtual Machine\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"compiling-expressions.html\" title=\"Compiling Expressions\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"a-virtual-machine.html\" title=\"A Virtual Machine\" class=\"prev\">←</a>\n<a href=\"compiling-expressions.html\" title=\"Compiling Expressions\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Scanning on Demand<small>16</small></a></h3>\n\n<ul>\n    <li><a href=\"#spinning-up-the-interpreter\"><small>16.1</small> Spinning Up the Interpreter</a></li>\n    <li><a href=\"#a-token-at-a-time\"><small>16.2</small> A Token at a Time</a></li>\n    <li><a href=\"#a-lexical-grammar-for-lox\"><small>16.3</small> A Lexical Grammar for Lox</a></li>\n    <li><a href=\"#identifiers-and-keywords\"><small>16.4</small> Identifiers and Keywords</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"a-virtual-machine.html\" title=\"A Virtual Machine\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"compiling-expressions.html\" title=\"Compiling Expressions\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">16</div>\n  
<h1>Scanning on Demand</h1>\n\n<blockquote>\n<p>Literature is idiosyncratic arrangements in horizontal lines in only\ntwenty-six phonetic symbols, ten Arabic numbers, and about eight punctuation\nmarks.</p>\n<p><cite>Kurt Vonnegut, <em>Like Shaking Hands With God: A Conversation about Writing</em></cite></p>\n</blockquote>\n<p>Our second interpreter, clox, has three phases<span class=\"em\">&mdash;</span>scanner, compiler, and virtual\nmachine. A data structure joins each pair of phases. Tokens flow from scanner to\ncompiler, and chunks of bytecode from compiler to VM. We began our\nimplementation near the end with <a href=\"chunks-of-bytecode.html\">chunks</a> and the <a href=\"a-virtual-machine.html\">VM</a>. Now, we&rsquo;re going to\nhop back to the beginning and build a scanner that makes tokens. In the\n<a href=\"compiling-expressions.html\">next chapter</a>, we&rsquo;ll tie the two ends together with our bytecode compiler.</p><img src=\"image/scanning-on-demand/pipeline.png\" alt=\"Source code &rarr; scanner &rarr; tokens &rarr; compiler &rarr; bytecode chunk &rarr; VM.\" />\n<p>I&rsquo;ll admit, this is not the most exciting chapter in the book. With two\nimplementations of the same language, there&rsquo;s bound to be some redundancy. I did\nsneak in a few interesting differences compared to jlox&rsquo;s scanner. Read on to\nsee what they are.</p>\n<h2><a href=\"#spinning-up-the-interpreter\" id=\"spinning-up-the-interpreter\"><small>16&#8202;.&#8202;1</small>Spinning Up the Interpreter</a></h2>\n<p>Now that we&rsquo;re building the front end, we can get clox running like a real\ninterpreter. No more hand-authored chunks of bytecode. It&rsquo;s time for a REPL and\nscript loading. 
Tear out most of the code in <code>main()</code> and replace it with:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">int main(int argc, const char* argv[]) {\n  initVM();\n\n</pre><div class=\"source-file\"><em>main.c</em><br>\nin <em>main</em>()<br>\nreplace 26 lines</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">argc</span> == <span class=\"n\">1</span>) {\n    <span class=\"i\">repl</span>();\n  } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">argc</span> == <span class=\"n\">2</span>) {\n    <span class=\"i\">runFile</span>(<span class=\"i\">argv</span>[<span class=\"n\">1</span>]);\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">fprintf</span>(<span class=\"i\">stderr</span>, <span class=\"s\">&quot;Usage: clox [path]</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n    <span class=\"i\">exit</span>(<span class=\"n\">64</span>);\n  }\n\n  <span class=\"i\">freeVM</span>();\n</pre><pre class=\"insert-after\">  return 0;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, in <em>main</em>(), replace 26 lines</div>\n\n<p>If you pass <span name=\"args\">no arguments</span> to the executable, you are\ndropped into the REPL. 
A single command line argument is understood to be the\npath to a script to run.</p>\n<aside name=\"args\">\n<p>The code tests for one and two arguments, not zero and one, because the first\nargument in <code>argv</code> is always the name of the executable being run.</p>\n</aside>\n<p>We&rsquo;ll need a few system headers, so let&rsquo;s get them all out of the way.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>main.c</em><br>\nadd to top of file</div>\n<pre class=\"insert\"><span class=\"a\">#include &lt;stdio.h&gt;</span>\n<span class=\"a\">#include &lt;stdlib.h&gt;</span>\n<span class=\"a\">#include &lt;string.h&gt;</span>\n\n</pre><pre class=\"insert-after\">#include &quot;common.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, add to top of file</div>\n\n<p>Next, we get the REPL up and REPL-ing.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;vm.h&quot;\n</pre><div class=\"source-file\"><em>main.c</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">repl</span>() {\n  <span class=\"t\">char</span> <span class=\"i\">line</span>[<span class=\"n\">1024</span>];\n  <span class=\"k\">for</span> (;;) {\n    <span class=\"i\">printf</span>(<span class=\"s\">&quot;&gt; &quot;</span>);\n\n    <span class=\"k\">if</span> (!<span class=\"i\">fgets</span>(<span class=\"i\">line</span>, <span class=\"k\">sizeof</span>(<span class=\"i\">line</span>), <span class=\"i\">stdin</span>)) {\n      <span class=\"i\">printf</span>(<span class=\"s\">&quot;</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>);\n      <span class=\"k\">break</span>;\n    }\n\n    <span class=\"i\">interpret</span>(<span class=\"i\">line</span>);\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em></div>\n\n<p>A quality REPL handles input that spans multiple lines gracefully and doesn&rsquo;t\nhave a hardcoded line length limit. 
This REPL here is a little more, ahem,\naustere, but it&rsquo;s fine for our purposes.</p>\n<p>The real work happens in <code>interpret()</code>. We&rsquo;ll get to that soon, but first let&rsquo;s\ntake care of loading scripts.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>main.c</em><br>\nadd after <em>repl</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">runFile</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">path</span>) {\n  <span class=\"t\">char</span>* <span class=\"i\">source</span> = <span class=\"i\">readFile</span>(<span class=\"i\">path</span>);\n  <span class=\"t\">InterpretResult</span> <span class=\"i\">result</span> = <span class=\"i\">interpret</span>(<span class=\"i\">source</span>);\n  <span class=\"i\">free</span>(<span class=\"i\">source</span>);<span name=\"owner\"> </span>\n\n  <span class=\"k\">if</span> (<span class=\"i\">result</span> == <span class=\"a\">INTERPRET_COMPILE_ERROR</span>) <span class=\"i\">exit</span>(<span class=\"n\">65</span>);\n  <span class=\"k\">if</span> (<span class=\"i\">result</span> == <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>) <span class=\"i\">exit</span>(<span class=\"n\">70</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, add after <em>repl</em>()</div>\n\n<p>We read the file and execute the resulting string of Lox source code. Then,\nbased on the result of that, we set the exit code appropriately because we&rsquo;re\nscrupulous tool builders and care about little details like that.</p>\n<p>We also need to free the source code string because <code>readFile()</code> dynamically\nallocates it and passes ownership to its caller. That function looks like this:</p>\n<aside name=\"owner\">\n<p>C asks us not just to manage memory explicitly, but <em>mentally</em>. We programmers\nhave to remember the ownership rules and hand-implement them throughout the\nprogram. 
Java just does it for us. C++ gives us tools to encode the policy\ndirectly so that the compiler validates it for us.</p>\n<p>I like C&rsquo;s simplicity, but we pay a real price for it<span class=\"em\">&mdash;</span>the language requires\nus to be more conscientious.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>main.c</em><br>\nadd after <em>repl</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">char</span>* <span class=\"i\">readFile</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">path</span>) {\n  <span class=\"a\">FILE</span>* <span class=\"i\">file</span> = <span class=\"i\">fopen</span>(<span class=\"i\">path</span>, <span class=\"s\">&quot;rb&quot;</span>);\n\n  <span class=\"i\">fseek</span>(<span class=\"i\">file</span>, <span class=\"n\">0L</span>, <span class=\"a\">SEEK_END</span>);\n  <span class=\"t\">size_t</span> <span class=\"i\">fileSize</span> = <span class=\"i\">ftell</span>(<span class=\"i\">file</span>);\n  <span class=\"i\">rewind</span>(<span class=\"i\">file</span>);\n\n  <span class=\"t\">char</span>* <span class=\"i\">buffer</span> = (<span class=\"t\">char</span>*)<span class=\"i\">malloc</span>(<span class=\"i\">fileSize</span> + <span class=\"n\">1</span>);\n  <span class=\"t\">size_t</span> <span class=\"i\">bytesRead</span> = <span class=\"i\">fread</span>(<span class=\"i\">buffer</span>, <span class=\"k\">sizeof</span>(<span class=\"t\">char</span>), <span class=\"i\">fileSize</span>, <span class=\"i\">file</span>);\n  <span class=\"i\">buffer</span>[<span class=\"i\">bytesRead</span>] = <span class=\"s\">&#39;\\0&#39;</span>;\n\n  <span class=\"i\">fclose</span>(<span class=\"i\">file</span>);\n  <span class=\"k\">return</span> <span class=\"i\">buffer</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, add after <em>repl</em>()</div>\n\n<p>Like a lot of C code, it takes more effort than it seems like it should,\nespecially 
for a language expressly designed for operating systems. The\ndifficult part is that we want to allocate a big enough string to read the whole\nfile, but we don&rsquo;t know how big the file is until we&rsquo;ve read it.</p>\n<p>The code here is the classic trick to solve that. We open the file, but before\nreading it, we seek to the very end using <code>fseek()</code>. Then we call <code>ftell()</code>\nwhich tells us how many bytes we are from the start of the file. Since we seeked\n(sought?) to the end, that&rsquo;s the size. We rewind back to the beginning, allocate\na string of that <span name=\"one\">size</span>, and read the whole file in a\nsingle batch.</p>\n<aside name=\"one\">\n<p>Well, that size <em>plus one</em>. Always gotta remember to make room for the null\nbyte.</p>\n</aside>\n<p>So we&rsquo;re done, right? Not quite. These function calls, like most calls in the C\nstandard library, can fail. If this were Java, the failures would be thrown as\nexceptions and automatically unwind the stack so we wouldn&rsquo;t <em>really</em> need to\nhandle them. In C, if we don&rsquo;t check for them, they silently get ignored.</p>\n<p>This isn&rsquo;t really a book on good C programming practice, but I hate to encourage\nbad style, so let&rsquo;s go ahead and handle the errors. It&rsquo;s good for us, like\neating our vegetables or flossing.</p>\n<p>Fortunately, we don&rsquo;t need to do anything particularly clever if a failure\noccurs. If we can&rsquo;t correctly read the user&rsquo;s script, all we can really do is\ntell the user and exit the interpreter gracefully. 
First of all, we might fail\nto open the file.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  FILE* file = fopen(path, &quot;rb&quot;);\n</pre><div class=\"source-file\"><em>main.c</em><br>\nin <em>readFile</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">file</span> == <span class=\"a\">NULL</span>) {\n    <span class=\"i\">fprintf</span>(<span class=\"i\">stderr</span>, <span class=\"s\">&quot;Could not open file </span><span class=\"e\">\\&quot;</span><span class=\"s\">%s</span><span class=\"e\">\\&quot;</span><span class=\"s\">.</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, <span class=\"i\">path</span>);\n    <span class=\"i\">exit</span>(<span class=\"n\">74</span>);\n  }\n</pre><pre class=\"insert-after\">\n\n  fseek(file, 0L, SEEK_END);\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, in <em>readFile</em>()</div>\n\n<p>This can happen if the file doesn&rsquo;t exist or the user doesn&rsquo;t have access to it.\nIt&rsquo;s pretty common<span class=\"em\">&mdash;</span>people mistype paths all the time.</p>\n<p>This failure is much rarer:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  char* buffer = (char*)malloc(fileSize + 1);\n</pre><div class=\"source-file\"><em>main.c</em><br>\nin <em>readFile</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">buffer</span> == <span class=\"a\">NULL</span>) {\n    <span class=\"i\">fprintf</span>(<span class=\"i\">stderr</span>, <span class=\"s\">&quot;Not enough memory to read </span><span class=\"e\">\\&quot;</span><span class=\"s\">%s</span><span class=\"e\">\\&quot;</span><span class=\"s\">.</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, <span class=\"i\">path</span>);\n    <span class=\"i\">exit</span>(<span class=\"n\">74</span>);\n  }\n\n</pre><pre class=\"insert-after\">  size_t bytesRead = fread(buffer, sizeof(char), fileSize, file);\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>main.c</em>, in <em>readFile</em>()</div>\n\n<p>If we can&rsquo;t even allocate enough memory to read the Lox script, the user&rsquo;s\nprobably got bigger problems to worry about, but we should do our best to at\nleast let them know.</p>\n<p>Finally, the read itself may fail.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  size_t bytesRead = fread(buffer, sizeof(char), fileSize, file);\n</pre><div class=\"source-file\"><em>main.c</em><br>\nin <em>readFile</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">bytesRead</span> &lt; <span class=\"i\">fileSize</span>) {\n    <span class=\"i\">fprintf</span>(<span class=\"i\">stderr</span>, <span class=\"s\">&quot;Could not read file </span><span class=\"e\">\\&quot;</span><span class=\"s\">%s</span><span class=\"e\">\\&quot;</span><span class=\"s\">.</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, <span class=\"i\">path</span>);\n    <span class=\"i\">exit</span>(<span class=\"n\">74</span>);\n  }\n\n</pre><pre class=\"insert-after\">  buffer[bytesRead] = '\\0';\n</pre></div>\n<div class=\"source-file-narrow\"><em>main.c</em>, in <em>readFile</em>()</div>\n\n<p>This is also unlikely. Actually, the <span name=\"printf\"> calls</span> to\n<code>fseek()</code>, <code>ftell()</code>, and <code>rewind()</code> could theoretically fail too, but let&rsquo;s not\ngo too far off in the weeds, shall we?</p>\n<aside name=\"printf\">\n<p>Even good old <code>printf()</code> can fail. Yup. How many times have you handled <em>that</em>\nerror?</p>\n</aside>\n<h3><a href=\"#opening-the-compilation-pipeline\" id=\"opening-the-compilation-pipeline\"><small>16&#8202;.&#8202;1&#8202;.&#8202;1</small>Opening the compilation pipeline</a></h3>\n<p>We&rsquo;ve got ourselves a string of Lox source code, so now we&rsquo;re ready to set up a\npipeline to scan, compile, and execute it. It&rsquo;s driven by <code>interpret()</code>. 
Right\nnow, that function runs our old hardcoded test chunk. Let&rsquo;s change it to\nsomething closer to its final incarnation.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void freeVM();\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nfunction <em>interpret</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"t\">InterpretResult</span> <span class=\"i\">interpret</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">source</span>);\n</pre><pre class=\"insert-after\">void push(Value value);\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, function <em>interpret</em>(), replace 1 line</div>\n\n<p>Where before we passed in a Chunk, now we pass in the string of source code.\nHere&rsquo;s the new implementation:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nfunction <em>interpret</em>()<br>\nreplace 4 lines</div>\n<pre class=\"insert\"><span class=\"t\">InterpretResult</span> <span class=\"i\">interpret</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">source</span>) {\n  <span class=\"i\">compile</span>(<span class=\"i\">source</span>);\n  <span class=\"k\">return</span> <span class=\"a\">INTERPRET_OK</span>;\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, function <em>interpret</em>(), replace 4 lines</div>\n\n<p>We won&rsquo;t build the actual <em>compiler</em> yet in this chapter, but we can start\nlaying out its structure. 
It lives in a new module.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;common.h&quot;\n</pre><div class=\"source-file\"><em>vm.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;compiler.h&quot;</span>\n</pre><pre class=\"insert-after\">#include &quot;debug.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em></div>\n\n<p>For now, the one function in it is declared like so:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.h</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#ifndef clox_compiler_h</span>\n<span class=\"a\">#define clox_compiler_h</span>\n\n<span class=\"t\">void</span> <span class=\"i\">compile</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">source</span>);\n\n<span class=\"a\">#endif</span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.h</em>, create new file</div>\n\n<p>That signature will change, but it gets us going.</p>\n<p>The first phase of compilation is scanning<span class=\"em\">&mdash;</span>the thing we&rsquo;re doing in this\nchapter<span class=\"em\">&mdash;</span>so right now all the compiler does is set that up.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#include &lt;stdio.h&gt;</span>\n\n<span class=\"a\">#include &quot;common.h&quot;</span>\n<span class=\"a\">#include &quot;compiler.h&quot;</span>\n<span class=\"a\">#include &quot;scanner.h&quot;</span>\n\n<span class=\"t\">void</span> <span class=\"i\">compile</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">source</span>) {\n  <span class=\"i\">initScanner</span>(<span class=\"i\">source</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, create new file</div>\n\n<p>This will also grow in later chapters, naturally.</p>\n<h3><a href=\"#the-scanner-scans\" 
id=\"the-scanner-scans\"><small>16&#8202;.&#8202;1&#8202;.&#8202;2</small>The scanner scans</a></h3>\n<p>There are still a few more feet of scaffolding to stand up before we can start\nwriting useful code. First, a new header:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.h</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#ifndef clox_scanner_h</span>\n<span class=\"a\">#define clox_scanner_h</span>\n\n<span class=\"t\">void</span> <span class=\"i\">initScanner</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">source</span>);\n\n<span class=\"a\">#endif</span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.h</em>, create new file</div>\n\n<p>And its corresponding implementation:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#include &lt;stdio.h&gt;</span>\n<span class=\"a\">#include &lt;string.h&gt;</span>\n\n<span class=\"a\">#include &quot;common.h&quot;</span>\n<span class=\"a\">#include &quot;scanner.h&quot;</span>\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">start</span>;\n  <span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">current</span>;\n  <span class=\"t\">int</span> <span class=\"i\">line</span>;\n} <span class=\"t\">Scanner</span>;\n\n<span class=\"t\">Scanner</span> <span class=\"i\">scanner</span>;\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, create new file</div>\n\n<p>As our scanner chews through the user&rsquo;s source code, it tracks how far it&rsquo;s\ngone. Like we did with the VM, we wrap that state in a struct and then create a\nsingle top-level module variable of that type so we don&rsquo;t have to pass it around\nall of the various functions.</p>\n<p>There are surprisingly few fields. 
The <code>start</code> pointer marks the beginning of\nthe current lexeme being scanned, and <code>current</code> points to the current character\nbeing looked at.</p>\n<p><span name=\"fields\"></span></p><img src=\"image/scanning-on-demand/fields.png\" alt=\"The start and current fields pointing at 'print bacon;'. Start points at 'b' and current points at 'o'.\" />\n<aside name=\"fields\">\n<p>Here, we are in the middle of scanning the identifier <code>bacon</code>. The current\ncharacter is <code>o</code> and the character we most recently consumed is <code>c</code>.</p>\n</aside>\n<p>We have a <code>line</code> field to track what line the current lexeme is on for error\nreporting. That&rsquo;s it! We don&rsquo;t even keep a pointer to the beginning of the\nsource code string. The scanner works its way through the code once and is done\nafter that.</p>\n<p>Since we have some state, we should initialize it.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after variable <em>scanner</em></div>\n<pre><span class=\"t\">void</span> <span class=\"i\">initScanner</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">source</span>) {\n  <span class=\"i\">scanner</span>.<span class=\"i\">start</span> = <span class=\"i\">source</span>;\n  <span class=\"i\">scanner</span>.<span class=\"i\">current</span> = <span class=\"i\">source</span>;\n  <span class=\"i\">scanner</span>.<span class=\"i\">line</span> = <span class=\"n\">1</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after variable <em>scanner</em></div>\n\n<p>We start at the very first character on the very first line, like a runner\ncrouched at the starting line.</p>\n<h2><a href=\"#a-token-at-a-time\" id=\"a-token-at-a-time\"><small>16&#8202;.&#8202;2</small>A Token at a Time</a></h2>\n<p>In jlox, when the starting gun went off, the scanner raced ahead and eagerly\nscanned the whole program, returning a list of 
tokens. This would be a challenge\nin clox. We&rsquo;d need some sort of growable array or list to store the tokens in.\nWe&rsquo;d need to manage allocating and freeing the tokens, and the collection\nitself. That&rsquo;s a lot of code, and a lot of memory churn.</p>\n<p>At any point in time, the compiler needs only one or two tokens<span class=\"em\">&mdash;</span>remember our\ngrammar requires only a single token of lookahead<span class=\"em\">&mdash;</span>so we don&rsquo;t need to keep\nthem <em>all</em> around at the same time. Instead, the simplest solution is to not\nscan a token until the compiler needs one. When the scanner provides one, it\nreturns the token by value. It doesn&rsquo;t need to dynamically allocate anything<span class=\"em\">&mdash;</span>it can just pass tokens around on the C stack.</p>\n<p>Unfortunately, we don&rsquo;t have a compiler yet that can ask the scanner for tokens,\nso the scanner will just sit there doing nothing. To kick it into action, we&rsquo;ll\nwrite some temporary code to drive it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  initScanner(source);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>compile</em>()</div>\n<pre class=\"insert\">  <span class=\"t\">int</span> <span class=\"i\">line</span> = -<span class=\"n\">1</span>;\n  <span class=\"k\">for</span> (;;) {\n    <span class=\"t\">Token</span> <span class=\"i\">token</span> = <span class=\"i\">scanToken</span>();\n    <span class=\"k\">if</span> (<span class=\"i\">token</span>.<span class=\"i\">line</span> != <span class=\"i\">line</span>) {\n      <span class=\"i\">printf</span>(<span class=\"s\">&quot;%4d &quot;</span>, <span class=\"i\">token</span>.<span class=\"i\">line</span>);\n      <span class=\"i\">line</span> = <span class=\"i\">token</span>.<span class=\"i\">line</span>;\n    } <span class=\"k\">else</span> {\n      <span class=\"i\">printf</span>(<span class=\"s\">&quot;   | &quot;</span>);\n    }\n    <span 
class=\"i\">printf</span>(<span class=\"s\">&quot;%2d &#39;%.*s&#39;</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, <span class=\"i\">token</span>.<span class=\"i\">type</span>, <span class=\"i\">token</span>.<span class=\"i\">length</span>, <span class=\"i\">token</span>.<span class=\"i\">start</span>);<span name=\"format\"> </span>\n\n    <span class=\"k\">if</span> (<span class=\"i\">token</span>.<span class=\"i\">type</span> == <span class=\"a\">TOKEN_EOF</span>) <span class=\"k\">break</span>;\n  }\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>compile</em>()</div>\n\n<aside name=\"format\">\n<p>That <code>%.*s</code> in the format string is a neat feature. Usually, you set the output\nprecision<span class=\"em\">&mdash;</span>the number of characters to show<span class=\"em\">&mdash;</span>by placing a number inside the\nformat string. Using <code>*</code> instead lets you pass the precision as an argument. So\nthat <code>printf()</code> call prints the first <code>token.length</code> characters of the string at\n<code>token.start</code>. We need to limit the length like that because the lexeme points\ninto the original source string and doesn&rsquo;t have a terminator at the end.</p>\n</aside>\n<p>This loops indefinitely. Each turn through the loop, it scans one token and\nprints it. When it reaches the special &ldquo;end of file&rdquo; token, it stops.\nFor example, if we run the interpreter on this program:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> <span class=\"n\">1</span> + <span class=\"n\">2</span>;\n</pre></div>\n<p>It prints out:</p>\n<div class=\"codehilite\"><pre>   1 31 'print'\n   | 21 '1'\n   |  7 '+'\n   | 21 '2'\n   |  8 ';'\n   2 39 ''\n</pre></div>\n<p>The first column is the line number, the second is the numeric value of the\ntoken <span name=\"token\">type</span>, and then finally the lexeme. 
That last\nempty lexeme on line 2 is the EOF token.</p>\n<aside name=\"token\">\n<p>Yeah, the raw index of the token type isn&rsquo;t exactly human readable, but it&rsquo;s all\nC gives us.</p>\n</aside>\n<p>The goal for the rest of the chapter is to make that blob of code work by\nimplementing this key function:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void initScanner(const char* source);\n</pre><div class=\"source-file\"><em>scanner.h</em><br>\nadd after <em>initScanner</em>()</div>\n<pre class=\"insert\"><span class=\"t\">Token</span> <span class=\"i\">scanToken</span>();\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.h</em>, add after <em>initScanner</em>()</div>\n\n<p>Each call scans and returns the next token in the source code. A token looks\nlike this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define clox_scanner_h\n</pre><div class=\"source-file\"><em>scanner.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">TokenType</span> <span class=\"i\">type</span>;\n  <span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">start</span>;\n  <span class=\"t\">int</span> <span class=\"i\">length</span>;\n  <span class=\"t\">int</span> <span class=\"i\">line</span>;\n} <span class=\"t\">Token</span>;\n</pre><pre class=\"insert-after\">\n\nvoid initScanner(const char* source);\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.h</em></div>\n\n<p>It&rsquo;s pretty similar to jlox&rsquo;s Token class. We have an enum identifying what type\nof token it is<span class=\"em\">&mdash;</span>number, identifier, <code>+</code> operator, etc. 
The enum is virtually\nidentical to the one in jlox, so let&rsquo;s just hammer out the whole thing.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#ifndef clox_scanner_h\n#define clox_scanner_h\n</pre><div class=\"source-file\"><em>scanner.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">enum</span> {\n  <span class=\"c\">// Single-character tokens.</span>\n  <span class=\"a\">TOKEN_LEFT_PAREN</span>, <span class=\"a\">TOKEN_RIGHT_PAREN</span>,\n  <span class=\"a\">TOKEN_LEFT_BRACE</span>, <span class=\"a\">TOKEN_RIGHT_BRACE</span>,\n  <span class=\"a\">TOKEN_COMMA</span>, <span class=\"a\">TOKEN_DOT</span>, <span class=\"a\">TOKEN_MINUS</span>, <span class=\"a\">TOKEN_PLUS</span>,\n  <span class=\"a\">TOKEN_SEMICOLON</span>, <span class=\"a\">TOKEN_SLASH</span>, <span class=\"a\">TOKEN_STAR</span>,\n  <span class=\"c\">// One or two character tokens.</span>\n  <span class=\"a\">TOKEN_BANG</span>, <span class=\"a\">TOKEN_BANG_EQUAL</span>,\n  <span class=\"a\">TOKEN_EQUAL</span>, <span class=\"a\">TOKEN_EQUAL_EQUAL</span>,\n  <span class=\"a\">TOKEN_GREATER</span>, <span class=\"a\">TOKEN_GREATER_EQUAL</span>,\n  <span class=\"a\">TOKEN_LESS</span>, <span class=\"a\">TOKEN_LESS_EQUAL</span>,\n  <span class=\"c\">// Literals.</span>\n  <span class=\"a\">TOKEN_IDENTIFIER</span>, <span class=\"a\">TOKEN_STRING</span>, <span class=\"a\">TOKEN_NUMBER</span>,\n  <span class=\"c\">// Keywords.</span>\n  <span class=\"a\">TOKEN_AND</span>, <span class=\"a\">TOKEN_CLASS</span>, <span class=\"a\">TOKEN_ELSE</span>, <span class=\"a\">TOKEN_FALSE</span>,\n  <span class=\"a\">TOKEN_FOR</span>, <span class=\"a\">TOKEN_FUN</span>, <span class=\"a\">TOKEN_IF</span>, <span class=\"a\">TOKEN_NIL</span>, <span class=\"a\">TOKEN_OR</span>,\n  <span class=\"a\">TOKEN_PRINT</span>, <span class=\"a\">TOKEN_RETURN</span>, <span class=\"a\">TOKEN_SUPER</span>, <span class=\"a\">TOKEN_THIS</span>,\n  <span 
class=\"a\">TOKEN_TRUE</span>, <span class=\"a\">TOKEN_VAR</span>, <span class=\"a\">TOKEN_WHILE</span>,\n\n  <span class=\"a\">TOKEN_ERROR</span>, <span class=\"a\">TOKEN_EOF</span>\n} <span class=\"t\">TokenType</span>;\n</pre><pre class=\"insert-after\">\n\ntypedef struct {\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.h</em></div>\n\n<p>Aside from prefixing all the names with <code>TOKEN_</code> (since C tosses enum names in\nthe top-level namespace) the only difference is that extra <code>TOKEN_ERROR</code> type.\nWhat&rsquo;s that about?</p>\n<p>There are only a couple of errors that get detected during scanning:\nunterminated strings and unrecognized characters. In jlox, the scanner reports\nthose itself. In clox, the scanner produces a synthetic &ldquo;error&rdquo; token for that\nerror and passes it over to the compiler. This way, the compiler knows an error\noccurred and can kick off error recovery before reporting it.</p>\n<p>The novel part in clox&rsquo;s Token type is how it represents the lexeme. In jlox,\neach Token stored the lexeme as its own separate little Java string. If we did\nthat for clox, we&rsquo;d have to figure out how to manage the memory for those\nstrings. That&rsquo;s especially hard since we pass tokens by value<span class=\"em\">&mdash;</span>multiple tokens could point to the same lexeme string. Ownership gets weird.</p>\n<p>Instead, we use the original source string as our character store. We represent\na lexeme by a pointer to its first character and the number of characters it\ncontains. This means we don&rsquo;t need to worry about managing memory for lexemes at\nall and we can freely copy tokens around. As long as the main source code string\n<span name=\"outlive\">outlives</span> all of the tokens, everything works fine.</p>\n<aside name=\"outlive\">\n<p>I don&rsquo;t mean to sound flippant. 
We really do need to think about and ensure that\nthe source string, which is created far away over in the &ldquo;main&rdquo; module, has a\nlong enough lifetime. That&rsquo;s why <code>runFile()</code> doesn&rsquo;t free the string until\n<code>interpret()</code> finishes executing the code and returns.</p>\n</aside>\n<h3><a href=\"#scanning-tokens\" id=\"scanning-tokens\"><small>16&#8202;.&#8202;2&#8202;.&#8202;1</small>Scanning tokens</a></h3>\n<p>We&rsquo;re ready to scan some tokens. We&rsquo;ll work our way up to the complete\nimplementation, starting with this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>initScanner</em>()</div>\n<pre><span class=\"t\">Token</span> <span class=\"i\">scanToken</span>() {\n  <span class=\"i\">scanner</span>.<span class=\"i\">start</span> = <span class=\"i\">scanner</span>.<span class=\"i\">current</span>;\n\n  <span class=\"k\">if</span> (<span class=\"i\">isAtEnd</span>()) <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_EOF</span>);\n\n  <span class=\"k\">return</span> <span class=\"i\">errorToken</span>(<span class=\"s\">&quot;Unexpected character.&quot;</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>initScanner</em>()</div>\n\n<p>Since each call to this function scans a complete token, we know we are at the\nbeginning of a new token when we enter the function. Thus, we set\n<code>scanner.start</code> to point to the current character so we remember where the\nlexeme we&rsquo;re about to scan starts.</p>\n<p>Then we check to see if we&rsquo;ve reached the end of the source code. If so, we\nreturn an EOF token and stop. 
This is a sentinel value that signals to the\ncompiler to stop asking for more tokens.</p>\n<p>If we aren&rsquo;t at the end, we do some<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>stuff<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>to scan the next token. But we\nhaven&rsquo;t written that code yet. We&rsquo;ll get to that soon. If that code doesn&rsquo;t\nsuccessfully scan and return a token, then we reach the end of the function.\nThat must mean we&rsquo;re at a character that the scanner can&rsquo;t recognize, so we\nreturn an error token for that.</p>\n<p>This function relies on a couple of helpers, most of which are familiar from\njlox. First up:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>initScanner</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">isAtEnd</span>() {\n  <span class=\"k\">return</span> *<span class=\"i\">scanner</span>.<span class=\"i\">current</span> == <span class=\"s\">&#39;\\0&#39;</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>initScanner</em>()</div>\n\n<p>We require the source string to be a good null-terminated C string. 
If the\ncurrent character is the null byte, then we&rsquo;ve reached the end.</p>\n<p>To create a token, we have this constructor-like function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>isAtEnd</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">Token</span> <span class=\"i\">makeToken</span>(<span class=\"t\">TokenType</span> <span class=\"i\">type</span>) {\n  <span class=\"t\">Token</span> <span class=\"i\">token</span>;\n  <span class=\"i\">token</span>.<span class=\"i\">type</span> = <span class=\"i\">type</span>;\n  <span class=\"i\">token</span>.<span class=\"i\">start</span> = <span class=\"i\">scanner</span>.<span class=\"i\">start</span>;\n  <span class=\"i\">token</span>.<span class=\"i\">length</span> = (<span class=\"t\">int</span>)(<span class=\"i\">scanner</span>.<span class=\"i\">current</span> - <span class=\"i\">scanner</span>.<span class=\"i\">start</span>);\n  <span class=\"i\">token</span>.<span class=\"i\">line</span> = <span class=\"i\">scanner</span>.<span class=\"i\">line</span>;\n  <span class=\"k\">return</span> <span class=\"i\">token</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>isAtEnd</em>()</div>\n\n<p>It uses the scanner&rsquo;s <code>start</code> and <code>current</code> pointers to capture the token&rsquo;s\nlexeme. It sets a couple of other obvious fields then returns the token. 
It has\na sister function for returning error tokens.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>makeToken</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">Token</span> <span class=\"i\">errorToken</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">message</span>) {\n  <span class=\"t\">Token</span> <span class=\"i\">token</span>;\n  <span class=\"i\">token</span>.<span class=\"i\">type</span> = <span class=\"a\">TOKEN_ERROR</span>;\n  <span class=\"i\">token</span>.<span class=\"i\">start</span> = <span class=\"i\">message</span>;\n  <span class=\"i\">token</span>.<span class=\"i\">length</span> = (<span class=\"t\">int</span>)<span class=\"i\">strlen</span>(<span class=\"i\">message</span>);\n  <span class=\"i\">token</span>.<span class=\"i\">line</span> = <span class=\"i\">scanner</span>.<span class=\"i\">line</span>;\n  <span class=\"k\">return</span> <span class=\"i\">token</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>makeToken</em>()</div>\n\n<p><span name=\"axolotl\"></span></p>\n<aside name=\"axolotl\">\n<p>This part of the chapter is pretty dry, so here&rsquo;s a picture of an axolotl.</p><img src=\"image/scanning-on-demand/axolotl.png\" alt=\"A drawing of an axolotl.\" />\n</aside>\n<p>The only difference is that the &ldquo;lexeme&rdquo; points to the error message string\ninstead of pointing into the user&rsquo;s source code. Again, we need to ensure that\nthe error message sticks around long enough for the compiler to read it. In\npractice, we only ever call this function with C string literals. Those are\nconstant and eternal, so we&rsquo;re fine.</p>\n<p>What we have now is basically a working scanner for a language with an empty\nlexical grammar. Since the grammar has no productions, every character is an\nerror. 
That&rsquo;s not exactly a fun language to program in, so let&rsquo;s fill in the\nrules.</p>\n<h2><a href=\"#a-lexical-grammar-for-lox\" id=\"a-lexical-grammar-for-lox\"><small>16&#8202;.&#8202;3</small>A Lexical Grammar for Lox</a></h2>\n<p>The simplest tokens are only a single character. We recognize those like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  if (isAtEnd()) return makeToken(TOKEN_EOF);\n</pre><div class=\"source-file\"><em>scanner.c</em><br>\nin <em>scanToken</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"t\">char</span> <span class=\"i\">c</span> = <span class=\"i\">advance</span>();\n\n  <span class=\"k\">switch</span> (<span class=\"i\">c</span>) {\n    <span class=\"k\">case</span> <span class=\"s\">&#39;(&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_LEFT_PAREN</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;)&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_RIGHT_PAREN</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;{&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_LEFT_BRACE</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;}&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_RIGHT_BRACE</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;;&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_SEMICOLON</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;,&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_COMMA</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;.&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_DOT</span>);\n    <span 
class=\"k\">case</span> <span class=\"s\">&#39;-&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_MINUS</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;+&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_PLUS</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;/&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_SLASH</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;*&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_STAR</span>);\n  }\n</pre><pre class=\"insert-after\">\n\n  return errorToken(&quot;Unexpected character.&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, in <em>scanToken</em>()</div>\n\n<p>We read the next character from the source code, and then do a straightforward\nswitch to see if it matches any of Lox&rsquo;s one-character lexemes. To read the next\ncharacter, we use a new helper which consumes the current character and returns\nit.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>isAtEnd</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">char</span> <span class=\"i\">advance</span>() {\n  <span class=\"i\">scanner</span>.<span class=\"i\">current</span>++;\n  <span class=\"k\">return</span> <span class=\"i\">scanner</span>.<span class=\"i\">current</span>[-<span class=\"n\">1</span>];\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>isAtEnd</em>()</div>\n\n<p>Next up are the two-character punctuation tokens like <code>!=</code> and <code>&gt;=</code>. Each of\nthese also has a corresponding single-character token. 
That means that when we\nsee a character like <code>!</code>, we don&rsquo;t know if we&rsquo;re in a <code>!</code> token or a <code>!=</code> until\nwe look at the next character too. We handle those like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case '*': return makeToken(TOKEN_STAR);\n</pre><div class=\"source-file\"><em>scanner.c</em><br>\nin <em>scanToken</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"s\">&#39;!&#39;</span>:\n      <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(\n          <span class=\"i\">match</span>(<span class=\"s\">&#39;=&#39;</span>) ? <span class=\"a\">TOKEN_BANG_EQUAL</span> : <span class=\"a\">TOKEN_BANG</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;=&#39;</span>:\n      <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(\n          <span class=\"i\">match</span>(<span class=\"s\">&#39;=&#39;</span>) ? <span class=\"a\">TOKEN_EQUAL_EQUAL</span> : <span class=\"a\">TOKEN_EQUAL</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;&lt;&#39;</span>:\n      <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(\n          <span class=\"i\">match</span>(<span class=\"s\">&#39;=&#39;</span>) ? <span class=\"a\">TOKEN_LESS_EQUAL</span> : <span class=\"a\">TOKEN_LESS</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;&gt;&#39;</span>:\n      <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(\n          <span class=\"i\">match</span>(<span class=\"s\">&#39;=&#39;</span>) ? <span class=\"a\">TOKEN_GREATER_EQUAL</span> : <span class=\"a\">TOKEN_GREATER</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, in <em>scanToken</em>()</div>\n\n<p>After consuming the first character, we look for an <code>=</code>. If found, we consume it\nand return the corresponding two-character token. 
Otherwise, we leave the\ncurrent character alone (so it can be part of the <em>next</em> token) and return the\nappropriate one-character token.</p>\n<p>That logic for conditionally consuming the second character lives here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>advance</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">match</span>(<span class=\"t\">char</span> <span class=\"i\">expected</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">isAtEnd</span>()) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n  <span class=\"k\">if</span> (*<span class=\"i\">scanner</span>.<span class=\"i\">current</span> != <span class=\"i\">expected</span>) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n  <span class=\"i\">scanner</span>.<span class=\"i\">current</span>++;\n  <span class=\"k\">return</span> <span class=\"k\">true</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>advance</em>()</div>\n\n<p>If the current character is the desired one, we advance and return <code>true</code>.\nOtherwise, we return <code>false</code> to indicate it wasn&rsquo;t matched.</p>\n<p>Now our scanner supports all of the punctuation-like tokens. Before we get to\nthe longer ones, let&rsquo;s take a little side trip to handle characters that aren&rsquo;t\npart of a token at all.</p>\n<h3><a href=\"#whitespace\" id=\"whitespace\"><small>16&#8202;.&#8202;3&#8202;.&#8202;1</small>Whitespace</a></h3>\n<p>Our scanner needs to handle spaces, tabs, and newlines, but those characters\ndon&rsquo;t become part of any token&rsquo;s lexeme. We could check for those inside the\nmain character switch in <code>scanToken()</code> but it gets a little tricky to ensure\nthat the function still correctly finds the next token <em>after</em> the whitespace\nwhen you call it. 
We&rsquo;d have to wrap the whole body of the function in a loop or\nsomething.</p>\n<p>Instead, before starting the token, we shunt off to a separate function.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">Token scanToken() {\n</pre><div class=\"source-file\"><em>scanner.c</em><br>\nin <em>scanToken</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">skipWhitespace</span>();\n</pre><pre class=\"insert-after\">  scanner.start = scanner.current;\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, in <em>scanToken</em>()</div>\n\n<p>This advances the scanner past any leading whitespace. After this call returns,\nwe know the very next character is a meaningful one (or we&rsquo;re at the end of the\nsource code).</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>errorToken</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">skipWhitespace</span>() {\n  <span class=\"k\">for</span> (;;) {\n    <span class=\"t\">char</span> <span class=\"i\">c</span> = <span class=\"i\">peek</span>();\n    <span class=\"k\">switch</span> (<span class=\"i\">c</span>) {\n      <span class=\"k\">case</span> <span class=\"s\">&#39; &#39;</span>:\n      <span class=\"k\">case</span> <span class=\"s\">&#39;\\r&#39;</span>:\n      <span class=\"k\">case</span> <span class=\"s\">&#39;\\t&#39;</span>:\n        <span class=\"i\">advance</span>();\n        <span class=\"k\">break</span>;\n      <span class=\"k\">default</span>:\n        <span class=\"k\">return</span>;\n    }\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>errorToken</em>()</div>\n\n<p>It&rsquo;s sort of a separate mini-scanner. It loops, consuming every whitespace\ncharacter it encounters. We need to be careful that it does <em>not</em> consume any\n<em>non</em>-whitespace characters. 
To support that, we use this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>advance</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">char</span> <span class=\"i\">peek</span>() {\n  <span class=\"k\">return</span> *<span class=\"i\">scanner</span>.<span class=\"i\">current</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>advance</em>()</div>\n\n<p>This simply returns the current character, but doesn&rsquo;t consume it. The previous\ncode handles all the whitespace characters except for newlines.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        break;\n</pre><div class=\"source-file\"><em>scanner.c</em><br>\nin <em>skipWhitespace</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"s\">&#39;\\n&#39;</span>:\n        <span class=\"i\">scanner</span>.<span class=\"i\">line</span>++;\n        <span class=\"i\">advance</span>();\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      default:\n        return;\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, in <em>skipWhitespace</em>()</div>\n\n<p>When we consume one of those, we also bump the current line number.</p>\n<h3><a href=\"#comments\" id=\"comments\"><small>16&#8202;.&#8202;3&#8202;.&#8202;2</small>Comments</a></h3>\n<p>Comments aren&rsquo;t technically &ldquo;whitespace&rdquo;, if you want to get all precise with\nyour terminology, but as far as Lox is concerned, they may as well be, so we\nskip those too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        break;\n</pre><div class=\"source-file\"><em>scanner.c</em><br>\nin <em>skipWhitespace</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"s\">&#39;/&#39;</span>:\n        <span class=\"k\">if</span> (<span class=\"i\">peekNext</span>() == <span class=\"s\">&#39;/&#39;</span>) {\n          <span 
class=\"c\">// A comment goes until the end of the line.</span>\n          <span class=\"k\">while</span> (<span class=\"i\">peek</span>() != <span class=\"s\">&#39;\\n&#39;</span> &amp;&amp; !<span class=\"i\">isAtEnd</span>()) <span class=\"i\">advance</span>();\n        } <span class=\"k\">else</span> {\n          <span class=\"k\">return</span>;\n        }\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      default:\n        return;\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, in <em>skipWhitespace</em>()</div>\n\n<p>Comments start with <code>//</code> in Lox, so as with <code>!=</code> and friends, we need a second\ncharacter of lookahead. However, with <code>!=</code>, we still wanted to consume the <code>!</code>\neven if the <code>=</code> wasn&rsquo;t found. Comments are different. If we don&rsquo;t find a second\n<code>/</code>, then <code>skipWhitespace()</code> needs to not consume the <em>first</em> slash either.</p>\n<p>To handle that, we add:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>peek</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">char</span> <span class=\"i\">peekNext</span>() {\n  <span class=\"k\">if</span> (<span class=\"i\">isAtEnd</span>()) <span class=\"k\">return</span> <span class=\"s\">&#39;\\0&#39;</span>;\n  <span class=\"k\">return</span> <span class=\"i\">scanner</span>.<span class=\"i\">current</span>[<span class=\"n\">1</span>];\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>peek</em>()</div>\n\n<p>This is like <code>peek()</code> but for one character past the current one. If the current\ncharacter and the next one are both <code>/</code>, we consume them and then any other\ncharacters until the next newline or the end of the source code.</p>\n<p>We use <code>peek()</code> to check for the newline but not consume it. 
That way, the\nnewline will be the current character on the next turn of the outer loop in\n<code>skipWhitespace()</code> and we&rsquo;ll recognize it and increment <code>scanner.line</code>.</p>\n<h3><a href=\"#literal-tokens\" id=\"literal-tokens\"><small>16&#8202;.&#8202;3&#8202;.&#8202;3</small>Literal tokens</a></h3>\n<p>Number and string tokens are special because they have a runtime value\nassociated with them. We&rsquo;ll start with strings because they are easy to\nrecognize<span class=\"em\">&mdash;</span>they always begin with a double quote.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">          match('=') ? TOKEN_GREATER_EQUAL : TOKEN_GREATER);\n</pre><div class=\"source-file\"><em>scanner.c</em><br>\nin <em>scanToken</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"s\">&#39;&quot;&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">string</span>();\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, in <em>scanToken</em>()</div>\n\n<p>That calls a new function.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>skipWhitespace</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">Token</span> <span class=\"i\">string</span>() {\n  <span class=\"k\">while</span> (<span class=\"i\">peek</span>() != <span class=\"s\">&#39;&quot;&#39;</span> &amp;&amp; !<span class=\"i\">isAtEnd</span>()) {\n    <span class=\"k\">if</span> (<span class=\"i\">peek</span>() == <span class=\"s\">&#39;\\n&#39;</span>) <span class=\"i\">scanner</span>.<span class=\"i\">line</span>++;\n    <span class=\"i\">advance</span>();\n  }\n\n  <span class=\"k\">if</span> (<span class=\"i\">isAtEnd</span>()) <span class=\"k\">return</span> <span class=\"i\">errorToken</span>(<span class=\"s\">&quot;Unterminated string.&quot;</span>);\n\n  <span class=\"c\">// The closing quote.</span>\n  <span 
class=\"i\">advance</span>();\n  <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_STRING</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>skipWhitespace</em>()</div>\n\n<p>Similar to jlox, we consume characters until we reach the closing quote. We also\ntrack newlines inside the string literal. (Lox supports multi-line strings.)\nAnd, as ever, we gracefully handle running out of source code before we find the\nend quote.</p>\n<p>The main change here in clox is something that&rsquo;s <em>not</em> present. Again, it\nrelates to memory management. In jlox, the Token class had a field of type\nObject to store the runtime value converted from the literal token&rsquo;s lexeme.</p>\n<p>Implementing that in C would require a lot of work. We&rsquo;d need some sort of union\nand type tag to tell whether the token contains a string or double value. If\nit&rsquo;s a string, we&rsquo;d need to manage the memory for the string&rsquo;s character array\nsomehow.</p>\n<p>Instead of adding that complexity to the scanner, we defer <span\nname=\"convert\">converting</span> the literal lexeme to a runtime value until\nlater. In clox, tokens only store the lexeme<span class=\"em\">&mdash;</span>the character sequence exactly\nas it appears in the user&rsquo;s source code. Later in the compiler, we&rsquo;ll convert\nthat lexeme to a runtime value right when we are ready to store it in the\nchunk&rsquo;s constant table.</p>\n<aside name=\"convert\">\n<p>Doing the lexeme-to-value conversion in the compiler does introduce some\nredundancy. The work to scan a number literal is awfully similar to the work\nrequired to convert a sequence of digit characters to a number value. But there\nisn&rsquo;t <em>that</em> much redundancy, it isn&rsquo;t in anything performance critical, and it\nkeeps our scanner simpler.</p>\n</aside>\n<p>Next up, numbers. 
Instead of adding a switch case for each of the ten digits\nthat can start a number, we handle them here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  char c = advance();\n</pre><div class=\"source-file\"><em>scanner.c</em><br>\nin <em>scanToken</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">isDigit</span>(<span class=\"i\">c</span>)) <span class=\"k\">return</span> <span class=\"i\">number</span>();\n</pre><pre class=\"insert-after\">\n\n  switch (c) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, in <em>scanToken</em>()</div>\n\n<p>That uses this obvious utility function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>initScanner</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">isDigit</span>(<span class=\"t\">char</span> <span class=\"i\">c</span>) {\n  <span class=\"k\">return</span> <span class=\"i\">c</span> &gt;= <span class=\"s\">&#39;0&#39;</span> &amp;&amp; <span class=\"i\">c</span> &lt;= <span class=\"s\">&#39;9&#39;</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>initScanner</em>()</div>\n\n<p>We finish scanning the number using this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>skipWhitespace</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">Token</span> <span class=\"i\">number</span>() {\n  <span class=\"k\">while</span> (<span class=\"i\">isDigit</span>(<span class=\"i\">peek</span>())) <span class=\"i\">advance</span>();\n\n  <span class=\"c\">// Look for a fractional part.</span>\n  <span class=\"k\">if</span> (<span class=\"i\">peek</span>() == <span class=\"s\">&#39;.&#39;</span> &amp;&amp; <span class=\"i\">isDigit</span>(<span class=\"i\">peekNext</span>())) {\n    <span class=\"c\">// Consume the &quot;.&quot;.</span>\n    <span 
class=\"i\">advance</span>();\n\n    <span class=\"k\">while</span> (<span class=\"i\">isDigit</span>(<span class=\"i\">peek</span>())) <span class=\"i\">advance</span>();\n  }\n\n  <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"a\">TOKEN_NUMBER</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>skipWhitespace</em>()</div>\n\n<p>It&rsquo;s virtually identical to jlox&rsquo;s version except, again, we don&rsquo;t convert the\nlexeme to a double yet.</p>\n<h2><a href=\"#identifiers-and-keywords\" id=\"identifiers-and-keywords\"><small>16&#8202;.&#8202;4</small>Identifiers and Keywords</a></h2>\n<p>The last batch of tokens are identifiers, both user-defined and reserved. This\nsection should be fun<span class=\"em\">&mdash;</span>the way we recognize keywords in clox is quite\ndifferent from how we did it in jlox, and touches on some important data\nstructures.</p>\n<p>First, though, we have to scan the lexeme. Names start with a letter or\nunderscore.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  char c = advance();\n</pre><div class=\"source-file\"><em>scanner.c</em><br>\nin <em>scanToken</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">isAlpha</span>(<span class=\"i\">c</span>)) <span class=\"k\">return</span> <span class=\"i\">identifier</span>();\n</pre><pre class=\"insert-after\">  if (isDigit(c)) return number();\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, in <em>scanToken</em>()</div>\n\n<p>We recognize those using this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>initScanner</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">isAlpha</span>(<span class=\"t\">char</span> <span class=\"i\">c</span>) {\n  <span class=\"k\">return</span> (<span class=\"i\">c</span> &gt;= <span class=\"s\">&#39;a&#39;</span> 
&amp;&amp; <span class=\"i\">c</span> &lt;= <span class=\"s\">&#39;z&#39;</span>) ||\n         (<span class=\"i\">c</span> &gt;= <span class=\"s\">&#39;A&#39;</span> &amp;&amp; <span class=\"i\">c</span> &lt;= <span class=\"s\">&#39;Z&#39;</span>) ||\n          <span class=\"i\">c</span> == <span class=\"s\">&#39;_&#39;</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>initScanner</em>()</div>\n\n<p>Once we&rsquo;ve found an identifier, we scan the rest of it here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>skipWhitespace</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">Token</span> <span class=\"i\">identifier</span>() {\n  <span class=\"k\">while</span> (<span class=\"i\">isAlpha</span>(<span class=\"i\">peek</span>()) || <span class=\"i\">isDigit</span>(<span class=\"i\">peek</span>())) <span class=\"i\">advance</span>();\n  <span class=\"k\">return</span> <span class=\"i\">makeToken</span>(<span class=\"i\">identifierType</span>());\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>skipWhitespace</em>()</div>\n\n<p>After the first letter, we allow digits too, and we keep consuming alphanumerics\nuntil we run out of them. Then we produce a token with the proper type.\nDetermining that &ldquo;proper&rdquo; type is the unique part of this chapter.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>skipWhitespace</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">TokenType</span> <span class=\"i\">identifierType</span>() {\n  <span class=\"k\">return</span> <span class=\"a\">TOKEN_IDENTIFIER</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>skipWhitespace</em>()</div>\n\n<p>Okay, I guess that&rsquo;s not very exciting yet. That&rsquo;s what it looks like if we\nhave no reserved words at all. 
How should we go about recognizing keywords? In\njlox, we stuffed them all in a Java Map and looked them up by name. We don&rsquo;t\nhave any sort of hash table structure in clox, at least not yet.</p>\n<p>A hash table would be overkill anyway. To look up a string in a hash <span\nname=\"hash\">table</span>, we need to walk the string to calculate its hash code,\nfind the corresponding bucket in the hash table, and then do a\ncharacter-by-character equality comparison on any string it happens to find\nthere.</p>\n<aside name=\"hash\">\n<p>Don&rsquo;t worry if this is unfamiliar to you. When we get to <a href=\"hash-tables.html\">building our own hash\ntable from scratch</a>, we&rsquo;ll learn all about it in exquisite detail.</p>\n</aside>\n<p>Let&rsquo;s say we&rsquo;ve scanned the identifier &ldquo;gorgonzola&rdquo;. How much work <em>should</em> we\nneed to do to tell if that&rsquo;s a reserved word? Well, no Lox keyword starts with\n&ldquo;g&rdquo;, so looking at the first character is enough to definitively answer no.\nThat&rsquo;s a lot simpler than a hash table lookup.</p>\n<p>What about &ldquo;cardigan&rdquo;? We do have a keyword in Lox that starts with &ldquo;c&rdquo;:\n&ldquo;class&rdquo;. But the second character in &ldquo;cardigan&rdquo;, &ldquo;a&rdquo;, rules that out. What about\n&ldquo;forest&rdquo;? Since &ldquo;for&rdquo; is a keyword, we have to go farther in the string before\nwe can establish that we don&rsquo;t have a reserved word. 
But, in most cases, only a\ncharacter or two is enough to tell we&rsquo;ve got a user-defined name on our hands.\nWe should be able to recognize that and fail fast.</p>\n<p>Here&rsquo;s a visual representation of that branching character-inspection logic:</p>\n<p><span name=\"down\"></span></p><img src=\"image/scanning-on-demand/keywords.png\" alt=\"A trie that contains all of Lox's keywords.\" />\n<aside name=\"down\">\n<p>Read down each chain of nodes and you&rsquo;ll see Lox&rsquo;s keywords emerge.</p>\n</aside>\n<p>We start at the root node. If there is a child node whose letter matches the\nfirst character in the lexeme, we move to that node. Then repeat for the next\nletter in the lexeme and so on. If at any point the next letter in the lexeme\ndoesn&rsquo;t match a child node, then the identifier must not be a keyword and we\nstop. If we reach a double-lined box, and we&rsquo;re at the last character of the\nlexeme, then we found a keyword.</p>\n<h3><a href=\"#tries-and-state-machines\" id=\"tries-and-state-machines\"><small>16&#8202;.&#8202;4&#8202;.&#8202;1</small>Tries and state machines</a></h3>\n<p>This tree diagram is an example of a thing called a <span\nname=\"trie\"><a href=\"https://en.wikipedia.org/wiki/Trie\"><strong>trie</strong></a></span>. A trie stores a set of strings. Most other\ndata structures for storing strings contain the raw character arrays and then\nwrap them inside some larger construct that helps you search faster. A trie is\ndifferent. Nowhere in the trie will you find a whole string.</p>\n<aside name=\"trie\">\n<p>&ldquo;Trie&rdquo; is one of the most confusing names in CS. Edward Fredkin yanked it out of\nthe middle of the word &ldquo;retrieval&rdquo;, which means it should be pronounced like\n&ldquo;tree&rdquo;. 
But, uh, there is already a pretty important data structure pronounced\n&ldquo;tree&rdquo; <em>which tries are a special case of</em>, so unless you never speak of these\nthings out loud, no one can tell which one you&rsquo;re talking about. Thus, people\nthese days often pronounce it like &ldquo;try&rdquo; to avoid the headache.</p>\n</aside>\n<p>Instead, each string the trie &ldquo;contains&rdquo; is represented as a <em>path</em> through the\ntree of character nodes, as in our traversal above. Nodes that match the last\ncharacter in a string have a special marker<span class=\"em\">&mdash;</span>the double-lined boxes in the\nillustration. That way, if your trie contains, say, &ldquo;banquet&rdquo; and &ldquo;ban&rdquo;, you are\nable to tell that it does <em>not</em> contain &ldquo;banque&rdquo;<span class=\"em\">&mdash;</span>the &ldquo;e&rdquo; node won&rsquo;t have that\nmarker, while the &ldquo;n&rdquo; and &ldquo;t&rdquo; nodes will.</p>\n<p>Tries are a special case of an even more fundamental data structure: a\n<a href=\"https://en.wikipedia.org/wiki/Deterministic_finite_automaton\"><strong>deterministic finite automaton</strong></a> (<strong>DFA</strong>). You might also know these\nby other names: <strong>finite state machine</strong>, or just <strong>state machine</strong>. State\nmachines are rad. They end up useful in everything from <a href=\"http://gameprogrammingpatterns.com/state.html\">game\nprogramming</a> to implementing networking protocols.</p>\n<p>In a DFA, you have a set of <em>states</em> with <em>transitions</em> between them, forming a\ngraph. At any point in time, the machine is &ldquo;in&rdquo; exactly one state. It gets to\nother states by following transitions. When you use a DFA for lexical analysis,\neach transition is a character that gets matched from the string. Each state\nrepresents a set of allowed characters.</p>\n<p>Our keyword tree is exactly a DFA that recognizes Lox keywords. 
But DFAs are\nmore powerful than simple trees because they can be arbitrary <em>graphs</em>.\nTransitions can form cycles between states. That lets you recognize arbitrarily\nlong strings. For example, here&rsquo;s a DFA that recognizes number literals:</p>\n<p><span name=\"railroad\"></span></p><img src=\"image/scanning-on-demand/numbers.png\" alt=\"A syntax diagram that recognizes integer and floating point literals.\" />\n<aside name=\"railroad\">\n<p>This style of diagram is called a <a href=\"https://en.wikipedia.org/wiki/Syntax_diagram\"><strong>syntax diagram</strong></a> or the\nmore charming <strong>railroad diagram</strong>. The latter name is because it looks\nsomething like a switching yard for trains.</p>\n<p>Back before Backus-Naur Form was a thing, this was one of the predominant ways\nof documenting a language&rsquo;s grammar. These days, we mostly use text, but there&rsquo;s\nsomething delightful about the official specification for a <em>textual language</em>\nrelying on an <em>image</em>.</p>\n</aside>\n<p>I&rsquo;ve collapsed the nodes for the ten digits together to keep it more readable,\nbut the basic process works the same<span class=\"em\">&mdash;</span>you work through the path, entering\nnodes whenever you consume a corresponding character in the lexeme. If we were\nso inclined, we could construct one big giant DFA that does <em>all</em> of the lexical\nanalysis for Lox, a single state machine that recognizes and spits out all of\nthe tokens we need.</p>\n<p>However, crafting that mega-DFA by <span name=\"regex\">hand</span> would be\nchallenging. That&rsquo;s why <a href=\"https://en.wikipedia.org/wiki/Lex_(software)\">Lex</a> was created. 
You give it a simple textual\ndescription of your lexical grammar<span class=\"em\">&mdash;</span>a bunch of regular expressions<span class=\"em\">&mdash;</span>and it\nautomatically generates a DFA for you and produces a pile of C code that\nimplements it.</p>\n<aside name=\"regex\">\n<p>This is also how most regular expression engines in programming languages and\ntext editors work under the hood. They take your regex string and convert it to\na DFA, which they then use to match strings.</p>\n<p>If you want to learn the algorithm to convert a regular expression into a DFA,\n<a href=\"https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools\">the dragon book</a> has you covered.</p>\n</aside>\n<p>We won&rsquo;t go down that road. We already have a perfectly serviceable hand-rolled\nscanner. We just need a tiny trie for recognizing keywords. How should we map\nthat to code?</p>\n<p>The absolute simplest <span name=\"v8\">solution</span> is to use a switch\nstatement for each node with cases for each branch. We&rsquo;ll start with the root\nnode and handle the easy keywords.</p>\n<aside name=\"v8\">\n<p>Simple doesn&rsquo;t mean dumb. 
The same approach is <a href=\"https://github.com/v8/v8/blob/e77eebfe3b747fb315bd3baad09bec0953e53e68/src/parsing/scanner.cc#L1643\">essentially what V8 does</a>,\nand that&rsquo;s currently one of the world&rsquo;s most sophisticated, fastest language\nimplementations.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">static TokenType identifierType() {\n</pre><div class=\"source-file\"><em>scanner.c</em><br>\nin <em>identifierType</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">switch</span> (<span class=\"i\">scanner</span>.<span class=\"i\">start</span>[<span class=\"n\">0</span>]) {\n    <span class=\"k\">case</span> <span class=\"s\">&#39;a&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">1</span>, <span class=\"n\">2</span>, <span class=\"s\">&quot;nd&quot;</span>, <span class=\"a\">TOKEN_AND</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;c&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">1</span>, <span class=\"n\">4</span>, <span class=\"s\">&quot;lass&quot;</span>, <span class=\"a\">TOKEN_CLASS</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;e&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">1</span>, <span class=\"n\">3</span>, <span class=\"s\">&quot;lse&quot;</span>, <span class=\"a\">TOKEN_ELSE</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;i&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">1</span>, <span class=\"n\">1</span>, <span class=\"s\">&quot;f&quot;</span>, <span class=\"a\">TOKEN_IF</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;n&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">1</span>, <span class=\"n\">2</span>, <span class=\"s\">&quot;il&quot;</span>, <span 
class=\"a\">TOKEN_NIL</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;o&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">1</span>, <span class=\"n\">1</span>, <span class=\"s\">&quot;r&quot;</span>, <span class=\"a\">TOKEN_OR</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;p&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">1</span>, <span class=\"n\">4</span>, <span class=\"s\">&quot;rint&quot;</span>, <span class=\"a\">TOKEN_PRINT</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;r&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">1</span>, <span class=\"n\">5</span>, <span class=\"s\">&quot;eturn&quot;</span>, <span class=\"a\">TOKEN_RETURN</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;s&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">1</span>, <span class=\"n\">4</span>, <span class=\"s\">&quot;uper&quot;</span>, <span class=\"a\">TOKEN_SUPER</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;v&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">1</span>, <span class=\"n\">2</span>, <span class=\"s\">&quot;ar&quot;</span>, <span class=\"a\">TOKEN_VAR</span>);\n    <span class=\"k\">case</span> <span class=\"s\">&#39;w&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">1</span>, <span class=\"n\">4</span>, <span class=\"s\">&quot;hile&quot;</span>, <span class=\"a\">TOKEN_WHILE</span>);\n  }\n\n</pre><pre class=\"insert-after\">  return TOKEN_IDENTIFIER;\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, in <em>identifierType</em>()</div>\n\n<p>These are the initial letters that correspond to a single keyword. 
If we see an\n&ldquo;s&rdquo;, the only keyword the identifier could possibly be is <code>super</code>. It might not\nbe, though, so we still need to check the rest of the letters too. In the tree\ndiagram, this is basically that straight path hanging off the &ldquo;s&rdquo;.</p>\n<p>We won&rsquo;t roll a switch for each of those nodes. Instead, we have a utility\nfunction that tests the rest of a potential keyword&rsquo;s lexeme.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>scanner.c</em><br>\nadd after <em>skipWhitespace</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">TokenType</span> <span class=\"i\">checkKeyword</span>(<span class=\"t\">int</span> <span class=\"i\">start</span>, <span class=\"t\">int</span> <span class=\"i\">length</span>,\n    <span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">rest</span>, <span class=\"t\">TokenType</span> <span class=\"i\">type</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">scanner</span>.<span class=\"i\">current</span> - <span class=\"i\">scanner</span>.<span class=\"i\">start</span> == <span class=\"i\">start</span> + <span class=\"i\">length</span> &amp;&amp;\n      <span class=\"i\">memcmp</span>(<span class=\"i\">scanner</span>.<span class=\"i\">start</span> + <span class=\"i\">start</span>, <span class=\"i\">rest</span>, <span class=\"i\">length</span>) == <span class=\"n\">0</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">type</span>;\n  }\n\n  <span class=\"k\">return</span> <span class=\"a\">TOKEN_IDENTIFIER</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, add after <em>skipWhitespace</em>()</div>\n\n<p>We use this for all of the unbranching paths in the tree. Once we&rsquo;ve found a\nprefix that could only be one possible reserved word, we need to verify two\nthings. The lexeme must be exactly as long as the keyword. 
If the first letter\nis &ldquo;s&rdquo;, the lexeme could still be &ldquo;sup&rdquo; or &ldquo;superb&rdquo;. And the remaining\ncharacters must match exactly<span class=\"em\">&mdash;</span>&ldquo;supar&rdquo; isn&rsquo;t good enough.</p>\n<p>If we do have the right number of characters, and they&rsquo;re the ones we want, then\nit&rsquo;s a keyword, and we return the associated token type. Otherwise, it must be a\nnormal identifier.</p>\n<p>We have a couple of keywords where the tree branches again after the first\nletter. If the lexeme starts with &ldquo;f&rdquo;, it could be <code>false</code>, <code>for</code>, or <code>fun</code>. So\nwe add another switch for the branches coming off the &ldquo;f&rdquo; node.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case 'e': return checkKeyword(1, 3, &quot;lse&quot;, TOKEN_ELSE);\n</pre><div class=\"source-file\"><em>scanner.c</em><br>\nin <em>identifierType</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"s\">&#39;f&#39;</span>:\n      <span class=\"k\">if</span> (<span class=\"i\">scanner</span>.<span class=\"i\">current</span> - <span class=\"i\">scanner</span>.<span class=\"i\">start</span> &gt; <span class=\"n\">1</span>) {\n        <span class=\"k\">switch</span> (<span class=\"i\">scanner</span>.<span class=\"i\">start</span>[<span class=\"n\">1</span>]) {\n          <span class=\"k\">case</span> <span class=\"s\">&#39;a&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">2</span>, <span class=\"n\">3</span>, <span class=\"s\">&quot;lse&quot;</span>, <span class=\"a\">TOKEN_FALSE</span>);\n          <span class=\"k\">case</span> <span class=\"s\">&#39;o&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">2</span>, <span class=\"n\">1</span>, <span class=\"s\">&quot;r&quot;</span>, <span class=\"a\">TOKEN_FOR</span>);\n          <span class=\"k\">case</span> <span 
class=\"s\">&#39;u&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">2</span>, <span class=\"n\">1</span>, <span class=\"s\">&quot;n&quot;</span>, <span class=\"a\">TOKEN_FUN</span>);\n        }\n      }\n      <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    case 'i': return checkKeyword(1, 1, &quot;f&quot;, TOKEN_IF);\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, in <em>identifierType</em>()</div>\n\n<p>Before we switch, we need to check that there even <em>is</em> a second letter. &ldquo;f&rdquo; by\nitself is a valid identifier too, after all. The other letter that branches is\n&ldquo;t&rdquo;.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case 's': return checkKeyword(1, 4, &quot;uper&quot;, TOKEN_SUPER);\n</pre><div class=\"source-file\"><em>scanner.c</em><br>\nin <em>identifierType</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"s\">&#39;t&#39;</span>:\n      <span class=\"k\">if</span> (<span class=\"i\">scanner</span>.<span class=\"i\">current</span> - <span class=\"i\">scanner</span>.<span class=\"i\">start</span> &gt; <span class=\"n\">1</span>) {\n        <span class=\"k\">switch</span> (<span class=\"i\">scanner</span>.<span class=\"i\">start</span>[<span class=\"n\">1</span>]) {\n          <span class=\"k\">case</span> <span class=\"s\">&#39;h&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">2</span>, <span class=\"n\">2</span>, <span class=\"s\">&quot;is&quot;</span>, <span class=\"a\">TOKEN_THIS</span>);\n          <span class=\"k\">case</span> <span class=\"s\">&#39;r&#39;</span>: <span class=\"k\">return</span> <span class=\"i\">checkKeyword</span>(<span class=\"n\">2</span>, <span class=\"n\">2</span>, <span class=\"s\">&quot;ue&quot;</span>, <span class=\"a\">TOKEN_TRUE</span>);\n        }\n      }\n      <span 
class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    case 'v': return checkKeyword(1, 2, &quot;ar&quot;, TOKEN_VAR);\n</pre></div>\n<div class=\"source-file-narrow\"><em>scanner.c</em>, in <em>identifierType</em>()</div>\n\n<p>That&rsquo;s it. A couple of nested <code>switch</code> statements. Not only is this code <span\nname=\"short\">short</span>, but it&rsquo;s very, very fast. It does the minimum amount\nof work required to detect a keyword, and bails out as soon as it can tell the\nidentifier will not be a reserved one.</p>\n<p>And with that, our scanner is complete.</p>\n<aside name=\"short\">\n<p>We sometimes fall into the trap of thinking that performance comes from\ncomplicated data structures, layers of caching, and other fancy optimizations.\nBut, many times, all that&rsquo;s required is to do less work, and I often find that\nwriting the simplest code I can is sufficient to accomplish that.</p>\n</aside>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Many newer languages support <a href=\"https://en.wikipedia.org/wiki/String_interpolation\"><strong>string interpolation</strong></a>. Inside a\nstring literal, you have some sort of special delimiters<span class=\"em\">&mdash;</span>most commonly\n<code>${</code> at the beginning and <code>}</code> at the end. Between those delimiters, any\nexpression can appear. 
When the string literal is executed, the inner\nexpression is evaluated, converted to a string, and then merged with the\nsurrounding string literal.</p>\n<p>For example, if Lox supported string interpolation, then this<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">drink</span> = <span class=\"s\">&quot;Tea&quot;</span>;\n<span class=\"k\">var</span> <span class=\"i\">steep</span> = <span class=\"n\">4</span>;\n<span class=\"k\">var</span> <span class=\"i\">cool</span> = <span class=\"n\">2</span>;\n<span class=\"k\">print</span> <span class=\"s\">&quot;${drink} will be ready in ${steep + cool} minutes.&quot;</span>;\n</pre></div>\n<p><span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>would print:</p>\n<div class=\"codehilite\"><pre>Tea will be ready in 6 minutes.\n</pre></div>\n<p>What token types would you define to implement a scanner for string\ninterpolation? What sequence of tokens would you emit for the above string\nliteral?</p>\n<p>What tokens would you emit for:</p>\n<div class=\"codehilite\"><pre>&quot;Nested ${&quot;interpolation?! Are you ${&quot;mad?!&quot;}&quot;}&quot;\n</pre></div>\n<p>Consider looking at other language implementations that support\ninterpolation to see how they handle it.</p>\n</li>\n<li>\n<p>Several languages use angle brackets for generics and also have a <code>&gt;&gt;</code> right\nshift operator. This led to a classic problem in early versions of C++:</p>\n<div class=\"codehilite\"><pre><span class=\"t\">vector</span>&lt;<span class=\"t\">vector</span>&lt;<span class=\"t\">string</span>&gt;&gt; <span class=\"i\">nestedVectors</span>;\n</pre></div>\n<p>This would produce a compile error because the <code>&gt;&gt;</code> was lexed to a single\nright shift token, not two <code>&gt;</code> tokens. 
Users were forced to avoid this by\nputting a space between the closing angle brackets.</p>\n<p>Later versions of C++ are smarter and can handle the above code. Java and C#\nnever had the problem. How do those languages specify and implement this?</p>\n</li>\n<li>\n<p>Many languages, especially later in their evolution, define &ldquo;contextual\nkeywords&rdquo;. These are identifiers that act like reserved words in some\ncontexts but can be normal user-defined identifiers in others.</p>\n<p>For example, <code>await</code> is a keyword inside an <code>async</code> method in C#, but\nin other methods, you can use <code>await</code> as your own identifier.</p>\n<p>Name a few contextual keywords from other languages, and the context where\nthey are meaningful. What are the pros and cons of having contextual\nkeywords? How would you implement them in your language&rsquo;s front end if you\nneeded to?</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"compiling-expressions.html\" class=\"next\">\n  Next Chapter: &ldquo;Compiling Expressions&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/scanning.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Scanning &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Scanning<small>4</small></a></h3>\n\n<ul>\n    <li><a href=\"#the-interpreter-framework\"><small>4.1</small> The Interpreter Framework</a></li>\n    <li><a href=\"#lexemes-and-tokens\"><small>4.2</small> Lexemes and Tokens</a></li>\n    <li><a href=\"#regular-languages-and-expressions\"><small>4.3</small> Regular Languages and Expressions</a></li>\n    <li><a href=\"#the-scanner-class\"><small>4.4</small> The Scanner Class</a></li>\n    <li><a href=\"#recognizing-lexemes\"><small>4.5</small> Recognizing 
Lexemes</a></li>\n    <li><a href=\"#longer-lexemes\"><small>4.6</small> Longer Lexemes</a></li>\n    <li><a href=\"#reserved-words-and-identifiers\"><small>4.7</small> Reserved Words and Identifiers</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Implicit Semicolons</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"representing-code.html\" title=\"Representing Code\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\" class=\"prev\">←</a>\n<a href=\"representing-code.html\" title=\"Representing Code\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Scanning<small>4</small></a></h3>\n\n<ul>\n    <li><a href=\"#the-interpreter-framework\"><small>4.1</small> The Interpreter Framework</a></li>\n    <li><a href=\"#lexemes-and-tokens\"><small>4.2</small> Lexemes and Tokens</a></li>\n    <li><a href=\"#regular-languages-and-expressions\"><small>4.3</small> Regular Languages and Expressions</a></li>\n    <li><a href=\"#the-scanner-class\"><small>4.4</small> The Scanner Class</a></li>\n    <li><a href=\"#recognizing-lexemes\"><small>4.5</small> Recognizing Lexemes</a></li>\n    <li><a href=\"#longer-lexemes\"><small>4.6</small> Longer Lexemes</a></li>\n    <li><a href=\"#reserved-words-and-identifiers\"><small>4.7</small> 
Reserved Words and Identifiers</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Implicit Semicolons</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"representing-code.html\" title=\"Representing Code\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">4</div>\n  <h1>Scanning</h1>\n\n<blockquote>\n<p>Take big bites. Anything worth doing is worth overdoing.</p>\n<p><cite>Robert A. Heinlein, <em>Time Enough for Love</em></cite></p>\n</blockquote>\n<p>The first step in any compiler or interpreter is <span\nname=\"lexing\">scanning</span>. The scanner takes in raw source code as a series\nof characters and groups it into a series of chunks we call <strong>tokens</strong>. These\nare the meaningful &ldquo;words&rdquo; and &ldquo;punctuation&rdquo; that make up the language&rsquo;s\ngrammar.</p>\n<aside name=\"lexing\">\n<p>This task has been variously called &ldquo;scanning&rdquo; and &ldquo;lexing&rdquo; (short for &ldquo;lexical\nanalysis&rdquo;) over the years. Way back when computers were as big as Winnebagos but\nhad less memory than your watch, some people used &ldquo;scanner&rdquo; only to refer to the\npiece of code that dealt with reading raw source code characters from disk and\nbuffering them in memory. Then &ldquo;lexing&rdquo; was the subsequent phase that did useful\nstuff with the characters.</p>\n<p>These days, reading a source file into memory is trivial, so it&rsquo;s rarely a\ndistinct phase in the compiler. 
Because of that, the two terms are basically\ninterchangeable.</p>\n</aside>\n<p>Scanning is a good starting point for us too because the code isn&rsquo;t very hard<span class=\"em\">&mdash;</span>pretty much a <code>switch</code> statement with delusions of grandeur. It will help us\nwarm up before we tackle some of the more interesting material later. By the end\nof this chapter, we&rsquo;ll have a full-featured, fast scanner that can take any\nstring of Lox source code and produce the tokens that we&rsquo;ll feed into the parser\nin the next chapter.</p>\n<h2><a href=\"#the-interpreter-framework\" id=\"the-interpreter-framework\"><small>4&#8202;.&#8202;1</small>The Interpreter Framework</a></h2>\n<p>Since this is our first real chapter, before we get to actually scanning some\ncode we need to sketch out the basic shape of our interpreter, jlox. Everything\nstarts with a class in Java.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Lox.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">import</span> <span class=\"i\">java.io.BufferedReader</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.io.IOException</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.io.InputStreamReader</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.nio.charset.Charset</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.nio.file.Files</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.nio.file.Paths</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.util.List</span>;\n\n<span class=\"k\">public</span> <span class=\"k\">class</span> <span class=\"t\">Lox</span> {\n  <span class=\"k\">public</span> <span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">main</span>(<span class=\"t\">String</span>[] <span class=\"i\">args</span>) <span class=\"k\">throws</span> <span 
class=\"t\">IOException</span> {\n    <span class=\"k\">if</span> (<span class=\"i\">args</span>.<span class=\"i\">length</span> &gt; <span class=\"n\">1</span>) {\n      <span class=\"t\">System</span>.<span class=\"i\">out</span>.<span class=\"i\">println</span>(<span class=\"s\">&quot;Usage: jlox [script]&quot;</span>);\n      <span class=\"t\">System</span>.<span class=\"i\">exit</span>(<span class=\"n\">64</span>);<span name=\"64\"> </span>\n    } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">args</span>.<span class=\"i\">length</span> == <span class=\"n\">1</span>) {\n      <span class=\"i\">runFile</span>(<span class=\"i\">args</span>[<span class=\"n\">0</span>]);\n    } <span class=\"k\">else</span> {\n      <span class=\"i\">runPrompt</span>();\n    }\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, create new file</div>\n\n<aside name=\"64\">\n<p>For exit codes, I&rsquo;m using the conventions defined in the UNIX\n<a href=\"https://www.freebsd.org/cgi/man.cgi?query=sysexits&amp;apropos=0&amp;sektion=0&amp;manpath=FreeBSD+4.3-RELEASE&amp;format=html\">&ldquo;sysexits.h&rdquo;</a> header. It&rsquo;s the closest thing to a standard I could\nfind.</p>\n</aside>\n<p>Stick that in a text file, and go get your IDE or Makefile or whatever set up.\nI&rsquo;ll be right here when you&rsquo;re ready. Good? OK!</p>\n<p>Lox is a scripting language, which means it executes directly from source. Our\ninterpreter supports two ways of running code. 
If you start jlox from the\ncommand line and give it a path to a file, it reads the file and executes it.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Lox.java</em><br>\nadd after <em>main</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">runFile</span>(<span class=\"t\">String</span> <span class=\"i\">path</span>) <span class=\"k\">throws</span> <span class=\"t\">IOException</span> {\n    <span class=\"t\">byte</span>[] <span class=\"i\">bytes</span> = <span class=\"t\">Files</span>.<span class=\"i\">readAllBytes</span>(<span class=\"t\">Paths</span>.<span class=\"i\">get</span>(<span class=\"i\">path</span>));\n    <span class=\"i\">run</span>(<span class=\"k\">new</span> <span class=\"t\">String</span>(<span class=\"i\">bytes</span>, <span class=\"t\">Charset</span>.<span class=\"i\">defaultCharset</span>()));\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, add after <em>main</em>()</div>\n\n<p>If you want a more intimate conversation with your interpreter, you can also run\nit interactively. Fire up jlox without any arguments, and it drops you into a\nprompt where you can enter and execute code one line at a time.</p>\n<aside name=\"repl\">\n<p>An interactive prompt is also called a &ldquo;REPL&rdquo; (pronounced like &ldquo;rebel&rdquo; but with\na &ldquo;p&rdquo;). 
The name comes from Lisp where implementing one is as simple as\nwrapping a loop around a few built-in functions:</p>\n<div class=\"codehilite\"><pre>(<span class=\"i\">print</span> (<span class=\"i\">eval</span> (<span class=\"i\">read</span>)))\n</pre></div>\n<p>Working outwards from the most nested call, you <strong>R</strong>ead a line of input,\n<strong>E</strong>valuate it, <strong>P</strong>rint the result, then <strong>L</strong>oop and do it all over again.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Lox.java</em><br>\nadd after <em>runFile</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">runPrompt</span>() <span class=\"k\">throws</span> <span class=\"t\">IOException</span> {\n    <span class=\"t\">InputStreamReader</span> <span class=\"i\">input</span> = <span class=\"k\">new</span> <span class=\"t\">InputStreamReader</span>(<span class=\"t\">System</span>.<span class=\"i\">in</span>);\n    <span class=\"t\">BufferedReader</span> <span class=\"i\">reader</span> = <span class=\"k\">new</span> <span class=\"t\">BufferedReader</span>(<span class=\"i\">input</span>);\n\n    <span class=\"k\">for</span> (;;) {<span name=\"repl\"> </span>\n      <span class=\"t\">System</span>.<span class=\"i\">out</span>.<span class=\"i\">print</span>(<span class=\"s\">&quot;&gt; &quot;</span>);\n      <span class=\"t\">String</span> <span class=\"i\">line</span> = <span class=\"i\">reader</span>.<span class=\"i\">readLine</span>();\n      <span class=\"k\">if</span> (<span class=\"i\">line</span> == <span class=\"k\">null</span>) <span class=\"k\">break</span>;\n      <span class=\"i\">run</span>(<span class=\"i\">line</span>);\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, add after <em>runFile</em>()</div>\n\n<p>The <code>readLine()</code> function, as the name so helpfully implies, reads a line of\ninput from the user on 
the command line and returns the result. To kill an\ninteractive command-line app, you usually type Control-D. Doing so signals an\n&ldquo;end-of-file&rdquo; condition to the program. When that happens <code>readLine()</code> returns\n<code>null</code>, so we check for that to exit the loop.</p>\n<p>Both the prompt and the file runner are thin wrappers around this core function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Lox.java</em><br>\nadd after <em>runPrompt</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">run</span>(<span class=\"t\">String</span> <span class=\"i\">source</span>) {\n    <span class=\"t\">Scanner</span> <span class=\"i\">scanner</span> = <span class=\"k\">new</span> <span class=\"t\">Scanner</span>(<span class=\"i\">source</span>);\n    <span class=\"t\">List</span>&lt;<span class=\"t\">Token</span>&gt; <span class=\"i\">tokens</span> = <span class=\"i\">scanner</span>.<span class=\"i\">scanTokens</span>();\n\n    <span class=\"c\">// For now, just print the tokens.</span>\n    <span class=\"k\">for</span> (<span class=\"t\">Token</span> <span class=\"i\">token</span> : <span class=\"i\">tokens</span>) {\n      <span class=\"t\">System</span>.<span class=\"i\">out</span>.<span class=\"i\">println</span>(<span class=\"i\">token</span>);\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, add after <em>runPrompt</em>()</div>\n\n<p>It&rsquo;s not super useful yet since we haven&rsquo;t written the interpreter, but baby\nsteps, you know? Right now, it prints out the tokens our forthcoming scanner\nwill emit so that we can see if we&rsquo;re making progress.</p>\n<h3><a href=\"#error-handling\" id=\"error-handling\"><small>4&#8202;.&#8202;1&#8202;.&#8202;1</small>Error handling</a></h3>\n<p>While we&rsquo;re setting things up, another key piece of infrastructure is <em>error\nhandling</em>. 
Textbooks sometimes gloss over this because it&rsquo;s more a practical\nmatter than a formal computer science-y problem. But if you care about making a\nlanguage that&rsquo;s actually <em>usable</em>, then handling errors gracefully is vital.</p>\n<p>The tools our language provides for dealing with errors make up a large portion\nof its user interface. When the user&rsquo;s code is working, they aren&rsquo;t thinking\nabout our language at all<span class=\"em\">&mdash;</span>their headspace is all about <em>their program</em>. It&rsquo;s\nusually only when things go wrong that they notice our implementation.</p>\n<p><span name=\"errors\">When</span> that happens, it&rsquo;s up to us to give the user all\nthe information they need to understand what went wrong and guide them gently\nback to where they are trying to go. Doing that well means thinking about error\nhandling all through the implementation of our interpreter, starting now.</p>\n<aside name=\"errors\">\n<p>Having said all that, for <em>this</em> interpreter, what we&rsquo;ll build is pretty bare\nbones. 
I&rsquo;d love to talk about interactive debuggers, static analyzers, and other\nfun stuff, but there&rsquo;s only so much ink in the pen.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Lox.java</em><br>\nadd after <em>run</em>()</div>\n<pre>  <span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">error</span>(<span class=\"t\">int</span> <span class=\"i\">line</span>, <span class=\"t\">String</span> <span class=\"i\">message</span>) {\n    <span class=\"i\">report</span>(<span class=\"i\">line</span>, <span class=\"s\">&quot;&quot;</span>, <span class=\"i\">message</span>);\n  }\n\n  <span class=\"k\">private</span> <span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">report</span>(<span class=\"t\">int</span> <span class=\"i\">line</span>, <span class=\"t\">String</span> <span class=\"i\">where</span>,\n                             <span class=\"t\">String</span> <span class=\"i\">message</span>) {\n    <span class=\"t\">System</span>.<span class=\"i\">err</span>.<span class=\"i\">println</span>(\n        <span class=\"s\">&quot;[line &quot;</span> + <span class=\"i\">line</span> + <span class=\"s\">&quot;] Error&quot;</span> + <span class=\"i\">where</span> + <span class=\"s\">&quot;: &quot;</span> + <span class=\"i\">message</span>);\n    <span class=\"i\">hadError</span> = <span class=\"k\">true</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, add after <em>run</em>()</div>\n\n<p>This <code>error()</code> function and its <code>report()</code> helper tell the user some syntax\nerror occurred on a given line. That is really the bare minimum to be able to\nclaim you even <em>have</em> error reporting. Imagine if you accidentally left a\ndangling comma in some function call and the interpreter printed out:</p>\n<div class=\"codehilite\"><pre>Error: Unexpected &quot;,&quot; somewhere in your code. 
Good luck finding it!\n</pre></div>\n<p>That&rsquo;s not very helpful. We need to at least point them to the right line. Even\nbetter would be the beginning and end column so they know <em>where</em> in the line.\nEven better than <em>that</em> is to <em>show</em> the user the offending line, like:</p>\n<div class=\"codehilite\"><pre>Error: Unexpected &quot;,&quot; in argument list.\n\n    15 | function(first, second,);\n                               ^-- Here.\n</pre></div>\n<p>I&rsquo;d love to implement something like that in this book but the honest truth is\nthat it&rsquo;s a lot of grungy string manipulation code. Very useful for users, but\nnot super fun to read in a book and not very technically interesting. So we&rsquo;ll\nstick with just a line number. In your own interpreters, please do as I say and\nnot as I do.</p>\n<p>The primary reason we&rsquo;re sticking this error reporting function in the main Lox\nclass is because of that <code>hadError</code> field. It&rsquo;s defined here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">public class Lox {\n</pre><div class=\"source-file\"><em>lox/Lox.java</em><br>\nin class <em>Lox</em></div>\n<pre class=\"insert\">  <span class=\"k\">static</span> <span class=\"t\">boolean</span> <span class=\"i\">hadError</span> = <span class=\"k\">false</span>;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, in class <em>Lox</em></div>\n\n<p>We&rsquo;ll use this to ensure we don&rsquo;t try to execute code that has a known error.\nAlso, it lets us exit with a non-zero exit code like a good command line citizen\nshould.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    run(new String(bytes, Charset.defaultCharset()));\n</pre><div class=\"source-file\"><em>lox/Lox.java</em><br>\nin <em>runFile</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"c\">// Indicate an error in the exit code.</span>\n    <span class=\"k\">if</span> (<span class=\"i\">hadError</span>) <span 
class=\"t\">System</span>.<span class=\"i\">exit</span>(<span class=\"n\">65</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, in <em>runFile</em>()</div>\n\n<p>We need to reset this flag in the interactive loop. If the user makes a mistake,\nit shouldn&rsquo;t kill their entire session.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      run(line);\n</pre><div class=\"source-file\"><em>lox/Lox.java</em><br>\nin <em>runPrompt</em>()</div>\n<pre class=\"insert\">      <span class=\"i\">hadError</span> = <span class=\"k\">false</span>;\n</pre><pre class=\"insert-after\">    }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, in <em>runPrompt</em>()</div>\n\n<p>The other reason I pulled the error reporting out here instead of stuffing it\ninto the scanner and other phases where the error might occur is to remind you\nthat it&rsquo;s good engineering practice to separate the code that <em>generates</em> the\nerrors from the code that <em>reports</em> them.</p>\n<p>Various phases of the front end will detect errors, but it&rsquo;s not really their\njob to know how to present that to a user. In a full-featured language\nimplementation, you will likely have multiple ways errors get displayed: on\nstderr, in an IDE&rsquo;s error window, logged to a file, etc. You don&rsquo;t want that\ncode smeared all over your scanner and parser.</p>\n<p>Ideally, we would have an actual abstraction, some kind of <span\nname=\"reporter\">&ldquo;ErrorReporter&rdquo;</span> interface that gets passed to the scanner\nand parser so that we can swap out different reporting strategies. For our\nsimple interpreter here, I didn&rsquo;t do that, but I did at least move the code for\nerror reporting into a different class.</p>\n<aside name=\"reporter\">\n<p>I had exactly that when I first implemented jlox. 
I ended up tearing it out\nbecause it felt over-engineered for the minimal interpreter in this book.</p>\n</aside>\n<p>With some rudimentary error handling in place, our application shell is ready.\nOnce we have a Scanner class with a <code>scanTokens()</code> method, we can start running\nit. Before we get to that, let&rsquo;s get more precise about what tokens are.</p>\n<h2><a href=\"#lexemes-and-tokens\" id=\"lexemes-and-tokens\"><small>4&#8202;.&#8202;2</small>Lexemes and Tokens</a></h2>\n<p>Here&rsquo;s a line of Lox code:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">language</span> = <span class=\"s\">&quot;lox&quot;</span>;\n</pre></div>\n<p>Here, <code>var</code> is the keyword for declaring a variable. That three-character\nsequence &ldquo;v-a-r&rdquo; means something. But if we yank three letters out of the\nmiddle of <code>language</code>, like &ldquo;g-u-a&rdquo;, those don&rsquo;t mean anything on their own.</p>\n<p>That&rsquo;s what lexical analysis is about. Our job is to scan through the list of\ncharacters and group them together into the smallest sequences that still\nrepresent something. Each of these blobs of characters is called a <strong>lexeme</strong>.\nIn that example line of code, the lexemes are:</p><img src=\"image/scanning/lexemes.png\" alt=\"'var', 'language', '=', 'lox', ';'\" />\n<p>The lexemes are only the raw substrings of the source code. However, in the\nprocess of grouping character sequences into lexemes, we also stumble upon some\nother useful information. When we take the lexeme and bundle it together with\nthat other data, the result is a token. 
It includes useful stuff like:</p>\n<h3><a href=\"#token-type\" id=\"token-type\"><small>4&#8202;.&#8202;2&#8202;.&#8202;1</small>Token type</a></h3>\n<p>Keywords are part of the shape of the language&rsquo;s grammar, so the parser often\nhas code like, &ldquo;If the next token is <code>while</code> then do<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>&rdquo; That means the parser\nwants to know not just that it has a lexeme for some identifier, but that it has\na <em>reserved</em> word, and <em>which</em> keyword it is.</p>\n<p>The <span name=\"ugly\">parser</span> could categorize tokens from the raw lexeme\nby comparing the strings, but that&rsquo;s slow and kind of ugly. Instead, at the\npoint that we recognize a lexeme, we also remember which <em>kind</em> of lexeme it\nrepresents. We have a different type for each keyword, operator, bit of\npunctuation, and literal type.</p>\n<aside name=\"ugly\">\n<p>After all, string comparison ends up looking at individual characters, and isn&rsquo;t\nthat the scanner&rsquo;s job?</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/TokenType.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">enum</span> <span class=\"t\">TokenType</span> {\n  <span class=\"c\">// Single-character tokens.</span>\n  <span class=\"i\">LEFT_PAREN</span>, <span class=\"i\">RIGHT_PAREN</span>, <span class=\"i\">LEFT_BRACE</span>, <span class=\"i\">RIGHT_BRACE</span>,\n  <span class=\"i\">COMMA</span>, <span class=\"i\">DOT</span>, <span class=\"i\">MINUS</span>, <span class=\"i\">PLUS</span>, <span class=\"i\">SEMICOLON</span>, <span class=\"i\">SLASH</span>, <span class=\"i\">STAR</span>,\n\n  <span class=\"c\">// One or two character tokens.</span>\n  <span class=\"i\">BANG</span>, <span class=\"i\">BANG_EQUAL</span>,\n  <span class=\"i\">EQUAL</span>, <span class=\"i\">EQUAL_EQUAL</span>,\n  <span 
class=\"i\">GREATER</span>, <span class=\"i\">GREATER_EQUAL</span>,\n  <span class=\"i\">LESS</span>, <span class=\"i\">LESS_EQUAL</span>,\n\n  <span class=\"c\">// Literals.</span>\n  <span class=\"i\">IDENTIFIER</span>, <span class=\"i\">STRING</span>, <span class=\"i\">NUMBER</span>,\n\n  <span class=\"c\">// Keywords.</span>\n  <span class=\"i\">AND</span>, <span class=\"i\">CLASS</span>, <span class=\"i\">ELSE</span>, <span class=\"i\">FALSE</span>, <span class=\"i\">FUN</span>, <span class=\"i\">FOR</span>, <span class=\"i\">IF</span>, <span class=\"i\">NIL</span>, <span class=\"i\">OR</span>,\n  <span class=\"i\">PRINT</span>, <span class=\"i\">RETURN</span>, <span class=\"i\">SUPER</span>, <span class=\"i\">THIS</span>, <span class=\"i\">TRUE</span>, <span class=\"i\">VAR</span>, <span class=\"i\">WHILE</span>,\n\n  <span class=\"i\">EOF</span>\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/TokenType.java</em>, create new file</div>\n\n<h3><a href=\"#literal-value\" id=\"literal-value\"><small>4&#8202;.&#8202;2&#8202;.&#8202;2</small>Literal value</a></h3>\n<p>There are lexemes for literal values<span class=\"em\">&mdash;</span>numbers and strings and the like. Since\nthe scanner has to walk each character in the literal to correctly identify it,\nit can also convert that textual representation of a value to the living runtime\nobject that will be used by the interpreter later.</p>\n<h3><a href=\"#location-information\" id=\"location-information\"><small>4&#8202;.&#8202;2&#8202;.&#8202;3</small>Location information</a></h3>\n<p>Back when I was preaching the gospel about error handling, we saw that we need\nto tell users <em>where</em> errors occurred. Tracking that starts here. 
In our simple\ninterpreter, we note only which line the token appears on, but more\nsophisticated implementations include the column and length too.</p>\n<aside name=\"location\">\n<p>Some token implementations store the location as two numbers: the offset from\nthe beginning of the source file to the beginning of the lexeme, and the length\nof the lexeme. The scanner needs to know these anyway, so there&rsquo;s no overhead to\ncalculate them.</p>\n<p>An offset can be converted to line and column positions later by looking back at\nthe source file and counting the preceding newlines. That sounds slow, and it\nis. However, you need to do it <em>only when you need to actually display a line\nand column to the user</em>. Most tokens never appear in an error message. For\nthose, the less time you spend calculating position information ahead of time,\nthe better.</p>\n</aside>\n<p>We take all of this data and wrap it in a class.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Token.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">class</span> <span class=\"t\">Token</span> {\n  <span class=\"k\">final</span> <span class=\"t\">TokenType</span> <span class=\"i\">type</span>;\n  <span class=\"k\">final</span> <span class=\"t\">String</span> <span class=\"i\">lexeme</span>;\n  <span class=\"k\">final</span> <span class=\"t\">Object</span> <span class=\"i\">literal</span>;\n  <span class=\"k\">final</span> <span class=\"t\">int</span> <span class=\"i\">line</span>;<span name=\"location\"> </span>\n\n  <span class=\"t\">Token</span>(<span class=\"t\">TokenType</span> <span class=\"i\">type</span>, <span class=\"t\">String</span> <span class=\"i\">lexeme</span>, <span class=\"t\">Object</span> <span class=\"i\">literal</span>, <span class=\"t\">int</span> <span class=\"i\">line</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">type</span> = 
<span class=\"i\">type</span>;\n    <span class=\"k\">this</span>.<span class=\"i\">lexeme</span> = <span class=\"i\">lexeme</span>;\n    <span class=\"k\">this</span>.<span class=\"i\">literal</span> = <span class=\"i\">literal</span>;\n    <span class=\"k\">this</span>.<span class=\"i\">line</span> = <span class=\"i\">line</span>;\n  }\n\n  <span class=\"k\">public</span> <span class=\"t\">String</span> <span class=\"i\">toString</span>() {\n    <span class=\"k\">return</span> <span class=\"i\">type</span> + <span class=\"s\">&quot; &quot;</span> + <span class=\"i\">lexeme</span> + <span class=\"s\">&quot; &quot;</span> + <span class=\"i\">literal</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Token.java</em>, create new file</div>\n\n<p>Now we have an object with enough structure to be useful for all of the later\nphases of the interpreter.</p>\n<h2><a href=\"#regular-languages-and-expressions\" id=\"regular-languages-and-expressions\"><small>4&#8202;.&#8202;3</small>Regular Languages and Expressions</a></h2>\n<p>Now that we know what we&rsquo;re trying to produce, let&rsquo;s, well, produce it. The core\nof the scanner is a loop. Starting at the first character of the source code,\nthe scanner figures out what lexeme the character belongs to, and consumes it\nand any following characters that are part of that lexeme. When it reaches the\nend of that lexeme, it emits a token.</p>\n<p>Then it loops back and does it again, starting from the very next character in\nthe source code. 
It keeps doing that, eating characters and occasionally, uh,\nexcreting tokens, until it reaches the end of the input.</p>\n<p><span name=\"alligator\"></span></p><img src=\"image/scanning/lexigator.png\" alt=\"An alligator eating characters and, well, you don't want to know.\" />\n<aside name=\"alligator\">\n<p>Lexical analygator.</p>\n</aside>\n<p>The part of the loop where we look at a handful of characters to figure out\nwhich kind of lexeme it &ldquo;matches&rdquo; may sound familiar. If you know regular\nexpressions, you might consider defining a regex for each kind of lexeme and\nusing those to match characters. For example, Lox has the same rules as C for\nidentifiers (variable names and the like). This regex matches one:</p>\n<div class=\"codehilite\"><pre>[a-zA-Z_][a-zA-Z_0-9]*\n</pre></div>\n<p>If you did think of regular expressions, your intuition is a deep one. The rules\nthat determine how a particular language groups characters into lexemes are\ncalled its <span name=\"theory\"><strong>lexical grammar</strong></span>. In Lox, as in most\nprogramming languages, the rules of that grammar are simple enough for the\nlanguage to be classified as a <strong><a href=\"https://en.wikipedia.org/wiki/Regular_language\">regular language</a></strong>. That&rsquo;s the same &ldquo;regular&rdquo;\nas in regular expressions.</p>\n<aside name=\"theory\">\n<p>It pains me to gloss over the theory so much, especially when it&rsquo;s as\ninteresting as I think the <a href=\"https://en.wikipedia.org/wiki/Chomsky_hierarchy\">Chomsky hierarchy</a> and <a href=\"https://en.wikipedia.org/wiki/Finite-state_machine\">finite-state machines</a>\nare. 
But the honest truth is other books cover this better than I could.\n<a href=\"https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools\"><em>Compilers: Principles, Techniques, and Tools</em></a> (universally known as\n&ldquo;the dragon book&rdquo;) is the canonical reference.</p>\n</aside>\n<p>You very precisely <em>can</em> recognize all of the different lexemes for Lox using\nregexes if you want to, and there&rsquo;s a pile of interesting theory underlying why\nthat is and what it means. Tools like <a href=\"http://dinosaur.compilertools.net/lex/\">Lex</a> or\n<a href=\"https://github.com/westes/flex\">Flex</a> are designed expressly to let you do this<span class=\"em\">&mdash;</span>throw a handful of regexes\nat them, and they give you a complete scanner <span name=\"lex\">back</span>.</p>\n<aside name=\"lex\">\n<p>Lex was created by Mike Lesk and Eric Schmidt. Yes, the same Eric Schmidt who\nwas executive chairman of Google. I&rsquo;m not saying programming languages are a\nsurefire path to wealth and fame, but we <em>can</em> count at least one\nmega billionaire among us.</p>\n</aside>\n<p>Since our goal is to understand how a scanner does what it does, we won&rsquo;t be\ndelegating that task. 
We&rsquo;re about handcrafted goods.</p>\n<h2><a href=\"#the-scanner-class\" id=\"the-scanner-class\"><small>4&#8202;.&#8202;4</small>The Scanner Class</a></h2>\n<p>Without further ado, let&rsquo;s make ourselves a scanner.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">import</span> <span class=\"i\">java.util.ArrayList</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.util.HashMap</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.util.List</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.util.Map</span>;\n\n<span class=\"k\">import static</span> <span class=\"i\">com.craftinginterpreters.lox.TokenType.*</span>;<span name=\"static-import\"> </span>\n\n<span class=\"k\">class</span> <span class=\"t\">Scanner</span> {\n  <span class=\"k\">private</span> <span class=\"k\">final</span> <span class=\"t\">String</span> <span class=\"i\">source</span>;\n  <span class=\"k\">private</span> <span class=\"k\">final</span> <span class=\"t\">List</span>&lt;<span class=\"t\">Token</span>&gt; <span class=\"i\">tokens</span> = <span class=\"k\">new</span> <span class=\"t\">ArrayList</span>&lt;&gt;();\n\n  <span class=\"t\">Scanner</span>(<span class=\"t\">String</span> <span class=\"i\">source</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">source</span> = <span class=\"i\">source</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, create new file</div>\n\n<aside name=\"static-import\">\n<p>I know static imports are considered bad style by some, but they save me from\nhaving to sprinkle <code>TokenType.</code> all over the scanner and parser. 
Forgive me, but\nevery character counts in a book.</p>\n</aside>\n<p>We store the raw source code as a simple string, and we have a list ready to\nfill with tokens we&rsquo;re going to generate. The aforementioned loop that does that\nlooks like this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nadd after <em>Scanner</em>()</div>\n<pre>  <span class=\"t\">List</span>&lt;<span class=\"t\">Token</span>&gt; <span class=\"i\">scanTokens</span>() {\n    <span class=\"k\">while</span> (!<span class=\"i\">isAtEnd</span>()) {\n      <span class=\"c\">// We are at the beginning of the next lexeme.</span>\n      <span class=\"i\">start</span> = <span class=\"i\">current</span>;\n      <span class=\"i\">scanToken</span>();\n    }\n\n    <span class=\"i\">tokens</span>.<span class=\"i\">add</span>(<span class=\"k\">new</span> <span class=\"t\">Token</span>(<span class=\"i\">EOF</span>, <span class=\"s\">&quot;&quot;</span>, <span class=\"k\">null</span>, <span class=\"i\">line</span>));\n    <span class=\"k\">return</span> <span class=\"i\">tokens</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, add after <em>Scanner</em>()</div>\n\n<p>The scanner works its way through the source code, adding tokens until it runs\nout of characters. Then it appends one final &ldquo;end of file&rdquo; token. 
That isn&rsquo;t\nstrictly needed, but it makes our parser a little cleaner.</p>\n<p>This loop depends on a couple of fields to keep track of where the scanner is in\nthe source code.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  private final List&lt;Token&gt; tokens = new ArrayList&lt;&gt;();\n</pre><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nin class <em>Scanner</em></div>\n<pre class=\"insert\">  <span class=\"k\">private</span> <span class=\"t\">int</span> <span class=\"i\">start</span> = <span class=\"n\">0</span>;\n  <span class=\"k\">private</span> <span class=\"t\">int</span> <span class=\"i\">current</span> = <span class=\"n\">0</span>;\n  <span class=\"k\">private</span> <span class=\"t\">int</span> <span class=\"i\">line</span> = <span class=\"n\">1</span>;\n</pre><pre class=\"insert-after\">\n\n  Scanner(String source) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, in class <em>Scanner</em></div>\n\n<p>The <code>start</code> and <code>current</code> fields are offsets that index into the string. The\n<code>start</code> field points to the first character in the lexeme being scanned, and\n<code>current</code> points at the character currently being considered. 
The <code>line</code> field\ntracks what source line <code>current</code> is on so we can produce tokens that know their\nlocation.</p>\n<p>Then we have one little helper function that tells us if we&rsquo;ve consumed all the\ncharacters.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nadd after <em>scanTokens</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">boolean</span> <span class=\"i\">isAtEnd</span>() {\n    <span class=\"k\">return</span> <span class=\"i\">current</span> &gt;= <span class=\"i\">source</span>.<span class=\"i\">length</span>();\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, add after <em>scanTokens</em>()</div>\n\n<h2><a href=\"#recognizing-lexemes\" id=\"recognizing-lexemes\"><small>4&#8202;.&#8202;5</small>Recognizing Lexemes</a></h2>\n<p>In each turn of the loop, we scan a single token. This is the real heart of the\nscanner. We&rsquo;ll start simple. Imagine if every lexeme were only a single character\nlong. All you would need to do is consume the next character and pick a token type for\nit. 
Several lexemes <em>are</em> only a single character in Lox, so let&rsquo;s start with\nthose.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nadd after <em>scanTokens</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">scanToken</span>() {\n    <span class=\"t\">char</span> <span class=\"i\">c</span> = <span class=\"i\">advance</span>();\n    <span class=\"k\">switch</span> (<span class=\"i\">c</span>) {\n      <span class=\"k\">case</span> <span class=\"s\">&#39;(&#39;</span>: <span class=\"i\">addToken</span>(<span class=\"i\">LEFT_PAREN</span>); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"s\">&#39;)&#39;</span>: <span class=\"i\">addToken</span>(<span class=\"i\">RIGHT_PAREN</span>); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"s\">&#39;{&#39;</span>: <span class=\"i\">addToken</span>(<span class=\"i\">LEFT_BRACE</span>); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"s\">&#39;}&#39;</span>: <span class=\"i\">addToken</span>(<span class=\"i\">RIGHT_BRACE</span>); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"s\">&#39;,&#39;</span>: <span class=\"i\">addToken</span>(<span class=\"i\">COMMA</span>); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"s\">&#39;.&#39;</span>: <span class=\"i\">addToken</span>(<span class=\"i\">DOT</span>); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"s\">&#39;-&#39;</span>: <span class=\"i\">addToken</span>(<span class=\"i\">MINUS</span>); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"s\">&#39;+&#39;</span>: <span class=\"i\">addToken</span>(<span class=\"i\">PLUS</span>); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"s\">&#39;;&#39;</span>: <span 
class=\"i\">addToken</span>(<span class=\"i\">SEMICOLON</span>); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"s\">&#39;*&#39;</span>: <span class=\"i\">addToken</span>(<span class=\"i\">STAR</span>); <span class=\"k\">break</span>;<span name=\"slash\"> </span>\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, add after <em>scanTokens</em>()</div>\n\n<aside name=\"slash\">\n<p>Wondering why <code>/</code> isn&rsquo;t in here? Don&rsquo;t worry, we&rsquo;ll get to it.</p>\n</aside>\n<p>Again, we need a couple of helper methods.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nadd after <em>isAtEnd</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">char</span> <span class=\"i\">advance</span>() {\n    <span class=\"k\">return</span> <span class=\"i\">source</span>.<span class=\"i\">charAt</span>(<span class=\"i\">current</span>++);\n  }\n\n  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">addToken</span>(<span class=\"t\">TokenType</span> <span class=\"i\">type</span>) {\n    <span class=\"i\">addToken</span>(<span class=\"i\">type</span>, <span class=\"k\">null</span>);\n  }\n\n  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">addToken</span>(<span class=\"t\">TokenType</span> <span class=\"i\">type</span>, <span class=\"t\">Object</span> <span class=\"i\">literal</span>) {\n    <span class=\"t\">String</span> <span class=\"i\">text</span> = <span class=\"i\">source</span>.<span class=\"i\">substring</span>(<span class=\"i\">start</span>, <span class=\"i\">current</span>);\n    <span class=\"i\">tokens</span>.<span class=\"i\">add</span>(<span class=\"k\">new</span> <span class=\"t\">Token</span>(<span class=\"i\">type</span>, <span class=\"i\">text</span>, <span class=\"i\">literal</span>, <span class=\"i\">line</span>));\n  }\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>lox/Scanner.java</em>, add after <em>isAtEnd</em>()</div>\n\n<p>The <code>advance()</code> method consumes the next character in the source file and\nreturns it. Where <code>advance()</code> is for input, <code>addToken()</code> is for output. It grabs\nthe text of the current lexeme and creates a new token for it. We&rsquo;ll use the\nother overload to handle tokens with literal values soon.</p>\n<h3><a href=\"#lexical-errors\" id=\"lexical-errors\"><small>4&#8202;.&#8202;5&#8202;.&#8202;1</small>Lexical errors</a></h3>\n<p>Before we get too far in, let&rsquo;s take a moment to think about errors at the\nlexical level. What happens if a user throws a source file containing some\ncharacters Lox doesn&rsquo;t use, like <code>@#^</code>, at our interpreter? Right now, those\ncharacters get silently discarded. They aren&rsquo;t used by the Lox language, but\nthat doesn&rsquo;t mean the interpreter can pretend they aren&rsquo;t there. Instead, we\nreport an error.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case '*': addToken(STAR); break;<span name=\"slash\"> </span>\n</pre><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nin <em>scanToken</em>()</div>\n<pre class=\"insert\">\n\n      <span class=\"k\">default</span>:\n        <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">line</span>, <span class=\"s\">&quot;Unexpected character.&quot;</span>);\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, in <em>scanToken</em>()</div>\n\n<p>Note that the erroneous character is still <em>consumed</em> by the earlier call to\n<code>advance()</code>. That&rsquo;s important so that we don&rsquo;t get stuck in an infinite loop.</p>\n<p>Note also that we <span name=\"shotgun\"><em>keep scanning</em></span>. There may be\nother errors later in the program. 
It gives our users a better experience if we\ndetect as many of those as possible in one go. Otherwise, they see one tiny\nerror and fix it, only to have the next error appear, and so on. Syntax error\nWhac-A-Mole is no fun.</p>\n<p>(Don&rsquo;t worry. Since <code>hadError</code> gets set, we&rsquo;ll never try to <em>execute</em> any of the\ncode, even though we keep going and scan the rest of it.)</p>\n<aside name=\"shotgun\">\n<p>The code reports each invalid character separately, so this shotguns the user\nwith a blast of errors if they accidentally paste a big blob of weird text.\nCoalescing a run of invalid characters into a single error would give a nicer\nuser experience.</p>\n</aside>\n<h3><a href=\"#operators\" id=\"operators\"><small>4&#8202;.&#8202;5&#8202;.&#8202;2</small>Operators</a></h3>\n<p>We have single-character lexemes working, but that doesn&rsquo;t cover all of Lox&rsquo;s\noperators. What about <code>!</code>? It&rsquo;s a single character, right? Sometimes, yes, but\nif the very next character is an equals sign, then we should instead create a\n<code>!=</code> lexeme. Note that the <code>!</code> and <code>=</code> are <em>not</em> two independent operators. You\ncan&rsquo;t write <code>!   =</code> in Lox and have it behave like an inequality operator.\nThat&rsquo;s why we need to scan <code>!=</code> as a single lexeme. 
Likewise, <code>&lt;</code>, <code>&gt;</code>, and <code>=</code>\ncan all be followed by <code>=</code> to create the other equality and comparison\noperators.</p>\n<p>For all of these, we need to look at the second character.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case '*': addToken(STAR); break;<span name=\"slash\"> </span>\n</pre><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nin <em>scanToken</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"s\">&#39;!&#39;</span>:\n        <span class=\"i\">addToken</span>(<span class=\"i\">match</span>(<span class=\"s\">&#39;=&#39;</span>) ? <span class=\"i\">BANG_EQUAL</span> : <span class=\"i\">BANG</span>);\n        <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"s\">&#39;=&#39;</span>:\n        <span class=\"i\">addToken</span>(<span class=\"i\">match</span>(<span class=\"s\">&#39;=&#39;</span>) ? <span class=\"i\">EQUAL_EQUAL</span> : <span class=\"i\">EQUAL</span>);\n        <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"s\">&#39;&lt;&#39;</span>:\n        <span class=\"i\">addToken</span>(<span class=\"i\">match</span>(<span class=\"s\">&#39;=&#39;</span>) ? <span class=\"i\">LESS_EQUAL</span> : <span class=\"i\">LESS</span>);\n        <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"s\">&#39;&gt;&#39;</span>:\n        <span class=\"i\">addToken</span>(<span class=\"i\">match</span>(<span class=\"s\">&#39;=&#39;</span>) ? 
<span class=\"i\">GREATER_EQUAL</span> : <span class=\"i\">GREATER</span>);\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">\n\n      default:\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, in <em>scanToken</em>()</div>\n\n<p>Those cases use this new method:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nadd after <em>scanToken</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">boolean</span> <span class=\"i\">match</span>(<span class=\"t\">char</span> <span class=\"i\">expected</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">isAtEnd</span>()) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n    <span class=\"k\">if</span> (<span class=\"i\">source</span>.<span class=\"i\">charAt</span>(<span class=\"i\">current</span>) != <span class=\"i\">expected</span>) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n\n    <span class=\"i\">current</span>++;\n    <span class=\"k\">return</span> <span class=\"k\">true</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, add after <em>scanToken</em>()</div>\n\n<p>It&rsquo;s like a conditional <code>advance()</code>. We only consume the current character if\nit&rsquo;s what we&rsquo;re looking for.</p>\n<p>Using <code>match()</code>, we recognize these lexemes in two stages. When we reach, for\nexample, <code>!</code>, we jump to its switch case. That means we know the lexeme <em>starts</em>\nwith <code>!</code>. Then we look at the next character to determine if we&rsquo;re on a <code>!=</code> or\nmerely a <code>!</code>.</p>\n<h2><a href=\"#longer-lexemes\" id=\"longer-lexemes\"><small>4&#8202;.&#8202;6</small>Longer Lexemes</a></h2>\n<p>We&rsquo;re still missing one operator: <code>/</code> for division. 
That character needs a\nlittle special handling because comments begin with a slash too.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        break;\n</pre><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nin <em>scanToken</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"s\">&#39;/&#39;</span>:\n        <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"s\">&#39;/&#39;</span>)) {\n          <span class=\"c\">// A comment goes until the end of the line.</span>\n          <span class=\"k\">while</span> (<span class=\"i\">peek</span>() != <span class=\"s\">&#39;\\n&#39;</span> &amp;&amp; !<span class=\"i\">isAtEnd</span>()) <span class=\"i\">advance</span>();\n        } <span class=\"k\">else</span> {\n          <span class=\"i\">addToken</span>(<span class=\"i\">SLASH</span>);\n        }\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">\n\n      default:\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, in <em>scanToken</em>()</div>\n\n<p>This is similar to the other two-character operators, except that when we find a\nsecond <code>/</code>, we don&rsquo;t end the token yet. Instead, we keep consuming characters\nuntil we reach the end of the line.</p>\n<p>This is our general strategy for handling longer lexemes. 
After we detect the\nbeginning of one, we shunt over to some lexeme-specific code that keeps eating\ncharacters until it sees the end.</p>\n<p>We&rsquo;ve got another helper:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nadd after <em>match</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">char</span> <span class=\"i\">peek</span>() {\n    <span class=\"k\">if</span> (<span class=\"i\">isAtEnd</span>()) <span class=\"k\">return</span> <span class=\"s\">&#39;\\0&#39;</span>;\n    <span class=\"k\">return</span> <span class=\"i\">source</span>.<span class=\"i\">charAt</span>(<span class=\"i\">current</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, add after <em>match</em>()</div>\n\n<p>It&rsquo;s sort of like <code>advance()</code>, but doesn&rsquo;t consume the character. This is called\n<span name=\"match\"><strong>lookahead</strong></span>. Since it only looks at the current\nunconsumed character, we have <em>one character of lookahead</em>. The smaller this\nnumber is, generally, the faster the scanner runs. The rules of the lexical\ngrammar dictate how much lookahead we need. Fortunately, most languages in wide\nuse peek only one or two characters ahead.</p>\n<aside name=\"match\">\n<p>Technically, <code>match()</code> is doing lookahead too. <code>advance()</code> and <code>peek()</code> are the\nfundamental operators and <code>match()</code> combines them.</p>\n</aside>\n<p>Comments are lexemes, but they aren&rsquo;t meaningful, and the parser doesn&rsquo;t want\nto deal with them. So when we reach the end of the comment, we <em>don&rsquo;t</em> call\n<code>addToken()</code>. 
When we loop back around to start the next lexeme, <code>start</code> gets\nreset and the comment&rsquo;s lexeme disappears in a puff of smoke.</p>\n<p>While we&rsquo;re at it, now&rsquo;s a good time to skip over those other meaningless\ncharacters: newlines and whitespace.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        break;\n</pre><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nin <em>scanToken</em>()</div>\n<pre class=\"insert\">\n\n      <span class=\"k\">case</span> <span class=\"s\">&#39; &#39;</span>:\n      <span class=\"k\">case</span> <span class=\"s\">&#39;\\r&#39;</span>:\n      <span class=\"k\">case</span> <span class=\"s\">&#39;\\t&#39;</span>:\n        <span class=\"c\">// Ignore whitespace.</span>\n        <span class=\"k\">break</span>;\n\n      <span class=\"k\">case</span> <span class=\"s\">&#39;\\n&#39;</span>:\n        <span class=\"i\">line</span>++;\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">\n\n      default:\n        Lox.error(line, &quot;Unexpected character.&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, in <em>scanToken</em>()</div>\n\n<p>When encountering whitespace, we simply go back to the beginning of the scan\nloop. That starts a new lexeme <em>after</em> the whitespace character. For newlines,\nwe do the same thing, but we also increment the line counter. (This is why we\nused <code>peek()</code> to find the newline ending a comment instead of <code>match()</code>. We want\nthat newline to get us here so we can update <code>line</code>.)</p>\n<p>Our scanner is getting smarter. 
It can handle fairly free-form code like:</p>\n<div class=\"codehilite\"><pre><span class=\"c\">// this is a comment</span>\n(( )){} <span class=\"c\">// grouping stuff</span>\n!*+-/=&lt;&gt; &lt;= == <span class=\"c\">// operators</span>\n</pre></div>\n<h3><a href=\"#string-literals\" id=\"string-literals\"><small>4&#8202;.&#8202;6&#8202;.&#8202;1</small>String literals</a></h3>\n<p>Now that we&rsquo;re comfortable with longer lexemes, we&rsquo;re ready to tackle literals.\nWe&rsquo;ll do strings first, since they always begin with a specific character, <code>\"</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        break;\n</pre><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nin <em>scanToken</em>()</div>\n<pre class=\"insert\">\n\n      <span class=\"k\">case</span> <span class=\"s\">&#39;&quot;&#39;</span>: <span class=\"i\">string</span>(); <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">\n\n      default:\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, in <em>scanToken</em>()</div>\n\n<p>That calls:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nadd after <em>scanToken</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">string</span>() {\n    <span class=\"k\">while</span> (<span class=\"i\">peek</span>() != <span class=\"s\">&#39;&quot;&#39;</span> &amp;&amp; !<span class=\"i\">isAtEnd</span>()) {\n      <span class=\"k\">if</span> (<span class=\"i\">peek</span>() == <span class=\"s\">&#39;\\n&#39;</span>) <span class=\"i\">line</span>++;\n      <span class=\"i\">advance</span>();\n    }\n\n    <span class=\"k\">if</span> (<span class=\"i\">isAtEnd</span>()) {\n      <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">line</span>, <span class=\"s\">&quot;Unterminated string.&quot;</span>);\n      <span class=\"k\">return</span>;\n    }\n\n    <span class=\"c\">// 
The closing &quot;.</span>\n    <span class=\"i\">advance</span>();\n\n    <span class=\"c\">// Trim the surrounding quotes.</span>\n    <span class=\"t\">String</span> <span class=\"i\">value</span> = <span class=\"i\">source</span>.<span class=\"i\">substring</span>(<span class=\"i\">start</span> + <span class=\"n\">1</span>, <span class=\"i\">current</span> - <span class=\"n\">1</span>);\n    <span class=\"i\">addToken</span>(<span class=\"i\">STRING</span>, <span class=\"i\">value</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, add after <em>scanToken</em>()</div>\n\n<p>Like with comments, we consume characters until we hit the <code>\"</code> that ends the\nstring. We also gracefully handle running out of input before the string is\nclosed and report an error for that.</p>\n<p>For no particular reason, Lox supports multi-line strings. There are pros and\ncons to that, but prohibiting them was a little more complex than allowing them,\nso I left them in. That does mean we also need to update <code>line</code> when we hit a\nnewline inside a string.</p>\n<p>Finally, the last interesting bit is that when we create the token, we also\nproduce the actual string <em>value</em> that will be used later by the interpreter.\nHere, that conversion only requires a <code>substring()</code> to strip off the surrounding\nquotes. If Lox supported escape sequences like <code>\\n</code>, we&rsquo;d unescape those here.</p>\n<h3><a href=\"#number-literals\" id=\"number-literals\"><small>4&#8202;.&#8202;6&#8202;.&#8202;2</small>Number literals</a></h3>\n<p>All numbers in Lox are floating point at runtime, but both integer and decimal\nliterals are supported. 
A number literal is a series of <span\nname=\"minus\">digits</span> optionally followed by a <code>.</code> and one or more trailing\ndigits.</p>\n<aside name=\"minus\">\n<p>Since we look only for a digit to start a number, that means <code>-123</code> is not a\nnumber <em>literal</em>. Instead, <code>-123</code> is an <em>expression</em> that applies <code>-</code> to the\nnumber literal <code>123</code>. In practice, the result is the same, though it has one\ninteresting edge case if we were to add method calls on numbers. Consider:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> -<span class=\"n\">123</span>.<span class=\"i\">abs</span>();\n</pre></div>\n<p>This prints <code>-123</code> because negation has lower precedence than method calls. We\ncould fix that by making <code>-</code> part of the number literal. But then consider:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">n</span> = <span class=\"n\">123</span>;\n<span class=\"k\">print</span> -<span class=\"i\">n</span>.<span class=\"i\">abs</span>();\n</pre></div>\n<p>This still produces <code>-123</code>, so now the language seems inconsistent. No matter\nwhat you do, some case ends up weird.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"n\">1234</span>\n<span class=\"n\">12.34</span>\n</pre></div>\n<p>We don&rsquo;t allow a leading or trailing decimal point, so these are both invalid:</p>\n<div class=\"codehilite\"><pre>.<span class=\"n\">1234</span>\n<span class=\"n\">1234</span>.\n</pre></div>\n<p>We could easily support the former, but I left it out to keep things simple. The\nlatter gets weird if we ever want to allow methods on numbers like <code>123.sqrt()</code>.</p>\n<p>To recognize the beginning of a number lexeme, we look for any digit. 
It&rsquo;s kind\nof tedious to add cases for every decimal digit, so we&rsquo;ll stuff it in the\ndefault case instead.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      default:\n</pre><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nin <em>scanToken</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">        <span class=\"k\">if</span> (<span class=\"i\">isDigit</span>(<span class=\"i\">c</span>)) {\n          <span class=\"i\">number</span>();\n        } <span class=\"k\">else</span> {\n          <span class=\"t\">Lox</span>.<span class=\"i\">error</span>(<span class=\"i\">line</span>, <span class=\"s\">&quot;Unexpected character.&quot;</span>);\n        }\n</pre><pre class=\"insert-after\">        break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, in <em>scanToken</em>(), replace 1 line</div>\n\n<p>This relies on this little utility:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nadd after <em>peek</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">boolean</span> <span class=\"i\">isDigit</span>(<span class=\"t\">char</span> <span class=\"i\">c</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">c</span> &gt;= <span class=\"s\">&#39;0&#39;</span> &amp;&amp; <span class=\"i\">c</span> &lt;= <span class=\"s\">&#39;9&#39;</span>;\n  }<span name=\"is-digit\"> </span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, add after <em>peek</em>()</div>\n\n<aside name=\"is-digit\">\n<p>The Java standard library provides <a href=\"http://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#isDigit(char)\"><code>Character.isDigit()</code></a>, which seems\nlike a good fit. 
Alas, that method allows things like Devanagari digits,\nfull-width numbers, and other funny stuff we don&rsquo;t want.</p>\n</aside>\n<p>Once we know we are in a number, we branch to a separate method to consume the\nrest of the literal, like we do with strings.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nadd after <em>scanToken</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">number</span>() {\n    <span class=\"k\">while</span> (<span class=\"i\">isDigit</span>(<span class=\"i\">peek</span>())) <span class=\"i\">advance</span>();\n\n    <span class=\"c\">// Look for a fractional part.</span>\n    <span class=\"k\">if</span> (<span class=\"i\">peek</span>() == <span class=\"s\">&#39;.&#39;</span> &amp;&amp; <span class=\"i\">isDigit</span>(<span class=\"i\">peekNext</span>())) {\n      <span class=\"c\">// Consume the &quot;.&quot;</span>\n      <span class=\"i\">advance</span>();\n\n      <span class=\"k\">while</span> (<span class=\"i\">isDigit</span>(<span class=\"i\">peek</span>())) <span class=\"i\">advance</span>();\n    }\n\n    <span class=\"i\">addToken</span>(<span class=\"i\">NUMBER</span>,\n        <span class=\"t\">Double</span>.<span class=\"i\">parseDouble</span>(<span class=\"i\">source</span>.<span class=\"i\">substring</span>(<span class=\"i\">start</span>, <span class=\"i\">current</span>)));\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, add after <em>scanToken</em>()</div>\n\n<p>We consume as many digits as we find for the integer part of the literal. Then\nwe look for a fractional part, which is a decimal point (<code>.</code>) followed by at\nleast one digit. 
If we do have a fractional part, again, we consume as many\ndigits as we can find.</p>\n<p>Looking past the decimal point requires a second character of lookahead since we\ndon&rsquo;t want to consume the <code>.</code> until we&rsquo;re sure there is a digit <em>after</em> it. So\nwe add:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nadd after <em>peek</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">char</span> <span class=\"i\">peekNext</span>() {\n    <span class=\"k\">if</span> (<span class=\"i\">current</span> + <span class=\"n\">1</span> &gt;= <span class=\"i\">source</span>.<span class=\"i\">length</span>()) <span class=\"k\">return</span> <span class=\"s\">&#39;\\0&#39;</span>;\n    <span class=\"k\">return</span> <span class=\"i\">source</span>.<span class=\"i\">charAt</span>(<span class=\"i\">current</span> + <span class=\"n\">1</span>);\n  }<span name=\"peek-next\"> </span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, add after <em>peek</em>()</div>\n\n<aside name=\"peek-next\">\n<p>I could have made <code>peek()</code> take a parameter for the number of characters ahead\nto look instead of defining two functions, but that would allow <em>arbitrarily</em>\nfar lookahead. Providing these two functions makes it clearer to a reader of the\ncode that our scanner looks ahead at most two characters.</p>\n</aside>\n<p>Finally, we convert the lexeme to its numeric value. Our interpreter uses Java&rsquo;s\n<code>Double</code> type to represent numbers, so we produce a value of that type. We&rsquo;re\nusing Java&rsquo;s own parsing method to convert the lexeme to a real Java double. 
We\ncould implement that ourselves, but, honestly, unless you&rsquo;re trying to cram for\nan upcoming programming interview, it&rsquo;s not worth your time.</p>\n<p>The remaining literals are Booleans and <code>nil</code>, but we handle those as keywords,\nwhich gets us to<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<h2><a href=\"#reserved-words-and-identifiers\" id=\"reserved-words-and-identifiers\"><small>4&#8202;.&#8202;7</small>Reserved Words and Identifiers</a></h2>\n<p>Our scanner is almost done. The only remaining pieces of the lexical grammar to\nimplement are identifiers and their close cousins, the reserved words. You might\nthink we could match keywords like <code>or</code> in the same way we handle\nmultiple-character operators like <code>&lt;=</code>.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">case</span> <span class=\"s\">&#39;o&#39;</span>:\n  <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"s\">&#39;r&#39;</span>)) {\n    <span class=\"i\">addToken</span>(<span class=\"i\">OR</span>);\n  }\n  <span class=\"k\">break</span>;\n</pre></div>\n<p>Consider what would happen if a user named a variable <code>orchid</code>. The scanner\nwould see the first two letters, <code>or</code>, and immediately emit an <code>or</code> keyword\ntoken. This gets us to an important principle called <span\nname=\"maximal\"><strong>maximal munch</strong></span>. When two lexical grammar rules can both\nmatch a chunk of code that the scanner is looking at, <em>whichever one matches the\nmost characters wins</em>.</p>\n<p>That rule states that if we can match <code>orchid</code> as an identifier and <code>or</code> as a\nkeyword, then the former wins. 
This is also why we tacitly assumed, previously,\nthat <code>&lt;=</code> should be scanned as a single <code>&lt;=</code> token and not <code>&lt;</code> followed by <code>=</code>.</p>\n<aside name=\"maximal\">\n<p>Consider this nasty bit of C code:</p>\n<div class=\"codehilite\"><pre>---<span class=\"i\">a</span>;\n</pre></div>\n<p>Is it valid? That depends on how the scanner splits the lexemes. What if the scanner\nsees it like this:</p>\n<div class=\"codehilite\"><pre>- --<span class=\"i\">a</span>;\n</pre></div>\n<p>Then it could be parsed. But that would require the scanner to know about the\ngrammatical structure of the surrounding code, which entangles things more than\nwe want. Instead, the maximal munch rule says that it is <em>always</em> scanned like:</p>\n<div class=\"codehilite\"><pre>-- -<span class=\"i\">a</span>;\n</pre></div>\n<p>It scans it that way even though doing so leads to a syntax error later in the\nparser.</p>\n</aside>\n<p>Maximal munch means we can&rsquo;t easily detect a reserved word until we&rsquo;ve reached\nthe end of what might instead be an identifier. After all, a reserved word <em>is</em>\nan identifier, it&rsquo;s just one that has been claimed by the language for its own\nuse. 
That&rsquo;s where the term <strong>reserved word</strong> comes from.</p>\n<p>So we begin by assuming any lexeme starting with a letter or underscore is an\nidentifier.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      default:\n        if (isDigit(c)) {\n          number();\n</pre><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nin <em>scanToken</em>()</div>\n<pre class=\"insert\">        } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"i\">isAlpha</span>(<span class=\"i\">c</span>)) {\n          <span class=\"i\">identifier</span>();\n</pre><pre class=\"insert-after\">        } else {\n          Lox.error(line, &quot;Unexpected character.&quot;);\n        }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, in <em>scanToken</em>()</div>\n\n<p>The rest of the code lives over here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nadd after <em>scanToken</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">identifier</span>() {\n    <span class=\"k\">while</span> (<span class=\"i\">isAlphaNumeric</span>(<span class=\"i\">peek</span>())) <span class=\"i\">advance</span>();\n\n    <span class=\"i\">addToken</span>(<span class=\"i\">IDENTIFIER</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, add after <em>scanToken</em>()</div>\n\n<p>We define that in terms of these helpers:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nadd after <em>peekNext</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">boolean</span> <span class=\"i\">isAlpha</span>(<span class=\"t\">char</span> <span class=\"i\">c</span>) {\n    <span class=\"k\">return</span> (<span class=\"i\">c</span> &gt;= <span class=\"s\">&#39;a&#39;</span> &amp;&amp; <span class=\"i\">c</span> &lt;= <span class=\"s\">&#39;z&#39;</span>) ||\n           
(<span class=\"i\">c</span> &gt;= <span class=\"s\">&#39;A&#39;</span> &amp;&amp; <span class=\"i\">c</span> &lt;= <span class=\"s\">&#39;Z&#39;</span>) ||\n            <span class=\"i\">c</span> == <span class=\"s\">&#39;_&#39;</span>;\n  }\n\n  <span class=\"k\">private</span> <span class=\"t\">boolean</span> <span class=\"i\">isAlphaNumeric</span>(<span class=\"t\">char</span> <span class=\"i\">c</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">isAlpha</span>(<span class=\"i\">c</span>) || <span class=\"i\">isDigit</span>(<span class=\"i\">c</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, add after <em>peekNext</em>()</div>\n\n<p>That gets identifiers working. To handle keywords, we see if the identifier&rsquo;s\nlexeme is one of the reserved words. If so, we use a token type specific to that\nkeyword. We define the set of reserved words in a map.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nin class <em>Scanner</em></div>\n<pre>  <span class=\"k\">private</span> <span class=\"k\">static</span> <span class=\"k\">final</span> <span class=\"t\">Map</span>&lt;<span class=\"t\">String</span>, <span class=\"t\">TokenType</span>&gt; <span class=\"i\">keywords</span>;\n\n  <span class=\"k\">static</span> {\n    <span class=\"i\">keywords</span> = <span class=\"k\">new</span> <span class=\"t\">HashMap</span>&lt;&gt;();\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;and&quot;</span>,    <span class=\"i\">AND</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;class&quot;</span>,  <span class=\"i\">CLASS</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;else&quot;</span>,   <span class=\"i\">ELSE</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;false&quot;</span>,  <span 
class=\"i\">FALSE</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;for&quot;</span>,    <span class=\"i\">FOR</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;fun&quot;</span>,    <span class=\"i\">FUN</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;if&quot;</span>,     <span class=\"i\">IF</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;nil&quot;</span>,    <span class=\"i\">NIL</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;or&quot;</span>,     <span class=\"i\">OR</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;print&quot;</span>,  <span class=\"i\">PRINT</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;return&quot;</span>, <span class=\"i\">RETURN</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;super&quot;</span>,  <span class=\"i\">SUPER</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;this&quot;</span>,   <span class=\"i\">THIS</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;true&quot;</span>,   <span class=\"i\">TRUE</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;var&quot;</span>,    <span class=\"i\">VAR</span>);\n    <span class=\"i\">keywords</span>.<span class=\"i\">put</span>(<span class=\"s\">&quot;while&quot;</span>,  <span class=\"i\">WHILE</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, in class <em>Scanner</em></div>\n\n<p>Then, after we scan an identifier, we check to see if it matches anything in the\nmap.</p>\n<div 
class=\"codehilite\"><pre class=\"insert-before\">    while (isAlphaNumeric(peek())) advance();\n\n</pre><div class=\"source-file\"><em>lox/Scanner.java</em><br>\nin <em>identifier</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"t\">String</span> <span class=\"i\">text</span> = <span class=\"i\">source</span>.<span class=\"i\">substring</span>(<span class=\"i\">start</span>, <span class=\"i\">current</span>);\n    <span class=\"t\">TokenType</span> <span class=\"i\">type</span> = <span class=\"i\">keywords</span>.<span class=\"i\">get</span>(<span class=\"i\">text</span>);\n    <span class=\"k\">if</span> (<span class=\"i\">type</span> == <span class=\"k\">null</span>) <span class=\"i\">type</span> = <span class=\"i\">IDENTIFIER</span>;\n    <span class=\"i\">addToken</span>(<span class=\"i\">type</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Scanner.java</em>, in <em>identifier</em>(), replace 1 line</div>\n\n<p>If so, we use that keyword&rsquo;s token type. Otherwise, it&rsquo;s a regular user-defined\nidentifier.</p>\n<p>And with that, we now have a complete scanner for the entire Lox lexical\ngrammar. Fire up the REPL and type in some valid and invalid code. Does it\nproduce the tokens you expect? Try to come up with some interesting edge cases\nand see if it handles them as it should.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>The lexical grammars of Python and Haskell are not <em>regular</em>. What does that\nmean, and why aren&rsquo;t they?</p>\n</li>\n<li>\n<p>Aside from separating tokens<span class=\"em\">&mdash;</span>distinguishing <code>print foo</code> from <code>printfoo</code><span class=\"em\">&mdash;</span>spaces aren&rsquo;t used for much in most languages. However, in a couple of\ndark corners, a space <em>does</em> affect how code is parsed in CoffeeScript,\nRuby, and the C preprocessor. 
Where and what effect does it have in each of\nthose languages?</p>\n</li>\n<li>\n<p>Our scanner here, like most, discards comments and whitespace since those\naren&rsquo;t needed by the parser. Why might you want to write a scanner that does\n<em>not</em> discard those? What would it be useful for?</p>\n</li>\n<li>\n<p>Add support to Lox&rsquo;s scanner for C-style <code>/* ... */</code> block comments. Make\nsure to handle newlines in them. Consider allowing them to nest. Is adding\nsupport for nesting more work than you expected? Why?</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: Implicit Semicolons</a></h2>\n<p>Programmers today are spoiled for choice in languages and have gotten picky\nabout syntax. They want their language to look clean and modern. One bit of\nsyntactic lichen that almost every new language scrapes off (and some ancient\nones like BASIC never had) is <code>;</code> as an explicit statement terminator.</p>\n<p>Instead, they treat a newline as a statement terminator where it makes sense to\ndo so. The &ldquo;where it makes sense&rdquo; part is the challenging bit. While <em>most</em>\nstatements are on their own line, sometimes you need to spread a single\nstatement across a couple of lines. 
Those intermingled newlines should not be\ntreated as terminators.</p>\n<p>Most of the obvious cases where the newline should be ignored are easy to\ndetect, but there are a handful of nasty ones:</p>\n<ul>\n<li>\n<p>A return value on the next line:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">if</span> (<span class=\"i\">condition</span>) <span class=\"k\">return</span>\n<span class=\"s\">&quot;value&quot;</span>\n</pre></div>\n<p>Is &ldquo;value&rdquo; the value being returned, or do we have a <code>return</code> statement with\n  no value followed by an expression statement containing a string literal?</p>\n</li>\n<li>\n<p>A parenthesized expression on the next line:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">func</span>\n(<span class=\"i\">parenthesized</span>)\n</pre></div>\n<p>Is this a call to <code>func(parenthesized)</code>, or two expression statements, one\n  for <code>func</code> and one for a parenthesized expression?</p>\n</li>\n<li>\n<p>A <code>-</code> on the next line:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">first</span>\n-<span class=\"i\">second</span>\n</pre></div>\n<p>Is this <code>first - second</code><span class=\"em\">&mdash;</span>an infix subtraction<span class=\"em\">&mdash;</span>or two expression\n  statements, one for <code>first</code> and one to negate <code>second</code>?</p>\n</li>\n</ul>\n<p>In all of these, either treating the newline as a separator or not would both\nproduce valid code, but possibly not the code the user wants. Across languages,\nthere is an unsettling variety of rules used to decide which newlines are\nseparators. Here are a couple:</p>\n<ul>\n<li>\n<p><a href=\"https://www.lua.org/pil/1.1.html\">Lua</a> completely ignores newlines, but carefully controls its grammar such\nthat no separator between statements is needed at all in most cases. 
This is\nperfectly legit:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">a</span> = <span class=\"n\">1</span> <span class=\"i\">b</span> = <span class=\"n\">2</span>\n</pre></div>\n<p>Lua avoids the <code>return</code> problem by requiring a <code>return</code> statement to be the\nvery last statement in a block. If there is a value after <code>return</code> before\nthe keyword <code>end</code>, it <em>must</em> be for the <code>return</code>. For the other two cases,\nLua allows an explicit <code>;</code> and expects users to use that. In practice, that\nalmost never happens because there&rsquo;s no point in a parenthesized or unary\nnegation expression statement.</p>\n</li>\n<li>\n<p><a href=\"https://golang.org/ref/spec#Semicolons\">Go</a> handles newlines in the scanner. If a newline appears following one\nof a handful of token types that are known to potentially end a statement,\nthe newline is treated like a semicolon. Otherwise it is ignored. The Go\nteam provides a canonical code formatter, <a href=\"https://golang.org/cmd/gofmt/\">gofmt</a>, and the ecosystem is\nfervent about its use, which ensures that idiomatically styled code works well\nwith this simple rule.</p>\n</li>\n<li>\n<p><a href=\"https://docs.python.org/3.5/reference/lexical_analysis.html#implicit-line-joining\">Python</a> treats all newlines as significant unless an explicit backslash\nis used at the end of a line to continue it to the next line. However,\nnewlines anywhere inside a pair of brackets (<code>()</code>, <code>[]</code>, or <code>{}</code>) are\nignored. Idiomatic style strongly prefers the latter.</p>\n<p>This rule works well for Python because it is a highly statement-oriented\nlanguage. In particular, Python&rsquo;s grammar ensures a statement never appears\ninside an expression. 
C does the same, but many other languages which have a\n&ldquo;lambda&rdquo; or function literal syntax do not.</p>\n<p>An example in JavaScript:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">console</span>.<span class=\"i\">log</span>(<span class=\"k\">function</span>() {\n  <span class=\"i\">statement</span>();\n});\n</pre></div>\n<p>Here, the <code>console.log()</code> <em>expression</em> contains a function literal which\nin turn contains the <em>statement</em> <code>statement();</code>.</p>\n<p>Python would need a different set of rules for implicitly joining lines if\nyou could get back <em>into</em> a <span name=\"lambda\">statement</span> where\nnewlines should become meaningful while still nested inside brackets.</p>\n</li>\n</ul>\n<aside name=\"lambda\">\n<p>And now you know why Python&rsquo;s <code>lambda</code> allows only a single expression body.</p>\n</aside>\n<ul>\n<li>\n<p>JavaScript&rsquo;s &ldquo;<a href=\"https://www.ecma-international.org/ecma-262/5.1/#sec-7.9\">automatic semicolon insertion</a>&rdquo; rule is the real odd\none. Where other languages assume most newlines <em>are</em> meaningful and only a\nfew should be ignored in multi-line statements, JS assumes the opposite. It\ntreats all of your newlines as meaningless whitespace <em>unless</em> it encounters\na parse error. If it does, it goes back and tries turning the previous\nnewline into a semicolon to get something grammatically valid.</p>\n<p>This design note would turn into a design diatribe if I went into complete\ndetail about how that even <em>works</em>, much less all the various ways that\nJavaScript&rsquo;s &ldquo;solution&rdquo; is a bad idea. It&rsquo;s a mess. 
JavaScript is the only\nlanguage I know where many style guides demand explicit semicolons after\nevery statement even though the language theoretically lets you elide them.</p>\n</li>\n</ul>\n<p>If you&rsquo;re designing a new language, you almost surely <em>should</em> avoid an explicit\nstatement terminator. Programmers are creatures of fashion like other humans, and\nsemicolons are as passé as ALL CAPS KEYWORDS. Just make sure you pick a set of\nrules that make sense for your language&rsquo;s particular grammar and idioms. And\ndon&rsquo;t do what JavaScript did.</p>\n</div>\n\n<footer>\n<a href=\"representing-code.html\" class=\"next\">\n  Next Chapter: &ldquo;Representing Code&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/script.js",
    "content": "$(function() {\n  $(\"#expand-nav\").click(function() {\n    $(\".expandable\").toggleClass(\"shown\");\n  });\n\n  $(window).scroll(function() {\n    var nav = $(\"nav.floating\");\n    if ($(window).scrollTop() > 84) {\n      nav.addClass(\"pinned\");\n    } else {\n      nav.removeClass(\"pinned\");\n    }\n  });\n\n  $(window).resize(refreshAsides);\n\n  // Since we may not have the height correct for the images, adjust the asides\n  // too when an image is loaded.\n  $(\"img\").on(\"load\", function() {\n    refreshAsides();\n  });\n\n  // On the off chance the browser supports the new font loader API, use it.\n  if (document.fontloader) {\n    document.fontloader.notifyWhenFontsReady(function() {\n      refreshAsides();\n    });\n  }\n\n  // Lame. Just do another refresh after a second when the font is *probably*\n  // loaded to hack around the fact that the metrics changed a bit.\n  window.setTimeout(refreshAsides, 200);\n\n  refreshAsides();\n});\n\nfunction refreshAsides() {\n  $(\"aside\").each(function() {\n    var aside = $(this);\n\n    // If the asides are inline, clear their position.\n    if ($(document).width() <= 48 * 20) {\n      aside.css('top', 'auto');\n      return;\n    }\n\n    // Find the span the aside should be anchored next to.\n    var name = aside.attr(\"name\");\n    if (name == null) {\n      window.console.log(\"No name for aside:\");\n      window.console.log(aside.context);\n      return;\n    }\n\n    var span = $(\"span[name='\" + name + \"']\");\n    if (span == null) {\n      window.console.log(\"Could not find span for '\" + name + \"'\");\n      return;\n    }\n\n    // Vertically position the aside next to the span it annotates.\n    var pos = span.position();\n    if (pos == null) {\n      window.console.log(\"Could not find position for '\" + name + \"'\");\n      console.log(span);\n      return;\n    }\n\n    if (aside.hasClass(\"bottom\")) {\n      aside.offset({top: pos.top + 23 - aside.height()});\n  
  } else {\n      aside.offset({top: pos.top - 6});\n    }\n  });\n}"
  },
  {
    "path": "site/statements-and-state.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Statements and State &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Statements and State<small>8</small></a></h3>\n\n<ul>\n    <li><a href=\"#statements\"><small>8.1</small> Statements</a></li>\n    <li><a href=\"#global-variables\"><small>8.2</small> Global Variables</a></li>\n    <li><a href=\"#environments\"><small>8.3</small> Environments</a></li>\n    <li><a href=\"#assignment\"><small>8.4</small> Assignment</a></li>\n    <li><a href=\"#scope\"><small>8.5</small> Scope</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a 
href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Implicit Variable Declaration</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"evaluating-expressions.html\" title=\"Evaluating Expressions\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"control-flow.html\" title=\"Control Flow\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"evaluating-expressions.html\" title=\"Evaluating Expressions\" class=\"prev\">←</a>\n<a href=\"control-flow.html\" title=\"Control Flow\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Statements and State<small>8</small></a></h3>\n\n<ul>\n    <li><a href=\"#statements\"><small>8.1</small> Statements</a></li>\n    <li><a href=\"#global-variables\"><small>8.2</small> Global Variables</a></li>\n    <li><a href=\"#environments\"><small>8.3</small> Environments</a></li>\n    <li><a href=\"#assignment\"><small>8.4</small> Assignment</a></li>\n    <li><a href=\"#scope\"><small>8.5</small> Scope</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Implicit Variable Declaration</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"evaluating-expressions.html\" title=\"Evaluating Expressions\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\">&uarr;&nbsp;Up</a>\n    <a href=\"control-flow.html\" title=\"Control Flow\" 
class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">8</div>\n  <h1>Statements and State</h1>\n\n<blockquote>\n<p>All my life, my heart has yearned for a thing I cannot name.\n<cite>Andr&eacute; Breton, <em>Mad Love</em></cite></p>\n</blockquote>\n<p>The interpreter we have so far feels less like programming a real language and\nmore like punching buttons on a calculator. &ldquo;Programming&rdquo; to me means building\nup a system out of smaller pieces. We can&rsquo;t do that yet because we have no way\nto bind a name to some data or function. We can&rsquo;t compose software without a way\nto refer to the pieces.</p>\n<p>To support bindings, our interpreter needs internal state. When you define a\nvariable at the beginning of the program and use it at the end, the interpreter\nhas to hold on to the value of that variable in the meantime. So in this\nchapter, we will give our interpreter a brain that can not just process, but\n<em>remember</em>.</p><img src=\"image/statements-and-state/brain.png\" alt=\"A brain, presumably remembering stuff.\" />\n<p>State and <span name=\"expr\">statements</span> go hand in hand. Since statements,\nby definition, don&rsquo;t evaluate to a value, they need to do something else to be\nuseful. That something is called a <strong>side effect</strong>. It could mean producing\nuser-visible output or modifying some state in the interpreter that can be\ndetected later. The latter makes them a great fit for defining variables or\nother named entities.</p>\n<aside name=\"expr\">\n<p>You could make a language that treats variable declarations as expressions that\nboth create a binding and produce a value. The only language I know that does\nthat is Tcl. Scheme seems like a contender, but note that after a <code>let</code>\nexpression is evaluated, the variable it bound is forgotten. 
The <code>define</code> syntax\nis not an expression.</p>\n</aside>\n<p>In this chapter, we&rsquo;ll do all of that. We&rsquo;ll define statements that produce\noutput (<code>print</code>) and create state (<code>var</code>). We&rsquo;ll add expressions to access and\nassign to variables. Finally, we&rsquo;ll add blocks and local scope. That&rsquo;s a lot to\nstuff into one chapter, but we&rsquo;ll chew through it all one bite at a time.</p>\n<h2><a href=\"#statements\" id=\"statements\"><small>8&#8202;.&#8202;1</small>Statements</a></h2>\n<p>We start by extending Lox&rsquo;s grammar with statements. They aren&rsquo;t very different\nfrom expressions. We start with the two simplest kinds:</p>\n<ol>\n<li>\n<p>An <strong>expression statement</strong> lets you place an expression where a statement\nis expected. They exist to evaluate expressions that have side effects. You\nmay not notice them, but you use them all the time in <span\nname=\"expr-stmt\">C</span>, Java, and other languages. Any time you see a\nfunction or method call followed by a <code>;</code>, you&rsquo;re looking at an expression\nstatement.</p>\n<aside name=\"expr-stmt\">\n<p>Pascal is an outlier. It distinguishes between <em>procedures</em> and <em>functions</em>.\nFunctions return values, but procedures cannot. There is a statement form\nfor calling a procedure, but functions can only be called where an\nexpression is expected. There are no expression statements in Pascal.</p>\n</aside></li>\n<li>\n<p>A <strong><code>print</code> statement</strong> evaluates an expression and displays the result to\nthe user. I admit it&rsquo;s weird to bake printing right into the language\ninstead of making it a library function. Doing so is a concession to the\nfact that we&rsquo;re building this interpreter one chapter at a time and want to\nbe able to play with it before it&rsquo;s all done. 
To make print a library\nfunction, we&rsquo;d have to wait until we had all of the machinery for defining\nand calling functions <span name=\"print\">before</span> we could witness any\nside effects.</p>\n<aside name=\"print\">\n<p>I will note with only a modicum of defensiveness that BASIC and Python\nhave dedicated <code>print</code> statements and they are real languages. Granted,\nPython did remove their <code>print</code> statement in 3.0<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n</aside></li>\n</ol>\n<p>New syntax means new grammar rules. In this chapter, we finally gain the ability\nto parse an entire Lox script. Since Lox is an imperative, dynamically typed\nlanguage, the &ldquo;top level&rdquo; of a script is simply a list of statements. The new\nrules are:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">program</span>        → <span class=\"i\">statement</span>* <span class=\"t\">EOF</span> ;\n\n<span class=\"i\">statement</span>      → <span class=\"i\">exprStmt</span>\n               | <span class=\"i\">printStmt</span> ;\n\n<span class=\"i\">exprStmt</span>       → <span class=\"i\">expression</span> <span class=\"s\">&quot;;&quot;</span> ;\n<span class=\"i\">printStmt</span>      → <span class=\"s\">&quot;print&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;;&quot;</span> ;\n</pre></div>\n<p>The first rule is now <code>program</code>, which is the starting point for the grammar and\nrepresents a complete Lox script or REPL entry. A program is a list of\nstatements followed by the special &ldquo;end of file&rdquo; token. The mandatory end token\nensures the parser consumes the entire input and doesn&rsquo;t silently ignore\nerroneous unconsumed tokens at the end of a script.</p>\n<p>Right now, <code>statement</code> only has two cases for the two kinds of statements we&rsquo;ve\ndescribed. 
We&rsquo;ll fill in more later in this chapter and in the following ones.\nThe next step is turning this grammar into something we can store in memory<span class=\"em\">&mdash;</span>syntax trees.</p>\n<h3><a href=\"#statement-syntax-trees\" id=\"statement-syntax-trees\"><small>8&#8202;.&#8202;1&#8202;.&#8202;1</small>Statement syntax trees</a></h3>\n<p>There is no place in the grammar where both an expression and a statement are\nallowed. The operands of, say, <code>+</code> are always expressions, never statements. The\nbody of a <code>while</code> loop is always a statement.</p>\n<p>Since the two syntaxes are disjoint, we don&rsquo;t need a single base class that they\nall inherit from. Splitting expressions and statements into separate class\nhierarchies enables the Java compiler to help us find dumb mistakes like passing\na statement to a Java method that expects an expression.</p>\n<p>That means a new base class for statements. As our elders did before us, we will\nuse the cryptic name &ldquo;Stmt&rdquo;. With great <span name=\"foresight\">foresight</span>,\nI have designed our little AST metaprogramming script in anticipation of this.\nThat&rsquo;s why we passed in &ldquo;Expr&rdquo; as a parameter to <code>defineAst()</code>. 
Now we add\nanother call to define Stmt and its <span name=\"stmt-ast\">subclasses</span>.</p>\n<aside name=\"foresight\">\n<p>Not really foresight: I wrote all the code for the book before I sliced it into\nchapters.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Unary    : Token operator, Expr right&quot;\n    ));\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"i\">defineAst</span>(<span class=\"i\">outputDir</span>, <span class=\"s\">&quot;Stmt&quot;</span>, <span class=\"t\">Arrays</span>.<span class=\"i\">asList</span>(\n      <span class=\"s\">&quot;Expression : Expr expression&quot;</span>,\n      <span class=\"s\">&quot;Print      : Expr expression&quot;</span>\n    ));\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<aside name=\"stmt-ast\">\n<p>The generated code for the new nodes is in <a href=\"appendix-ii.html\">Appendix II</a>: <a href=\"appendix-ii.html#expression-statement\">Expression statement</a>, <a href=\"appendix-ii.html#print-statement\">Print statement</a>.</p>\n</aside>\n<p>Run the AST generator script and behold the resulting &ldquo;Stmt.java&rdquo; file with the\nsyntax tree classes we need for expression and <code>print</code> statements. Don&rsquo;t forget\nto add the file to your IDE project or makefile or whatever.</p>\n<h3><a href=\"#parsing-statements\" id=\"parsing-statements\"><small>8&#8202;.&#8202;1&#8202;.&#8202;2</small>Parsing statements</a></h3>\n<p>The parser&rsquo;s <code>parse()</code> method that parses and returns a single expression was a\ntemporary hack to get the last chapter up and running. 
Now that our grammar has\nthe correct starting rule, <code>program</code>, we can turn <code>parse()</code> into the real deal.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nmethod <em>parse</em>()<br>\nreplace 7 lines</div>\n<pre>  <span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>&gt; <span class=\"i\">parse</span>() {\n    <span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>&gt; <span class=\"i\">statements</span> = <span class=\"k\">new</span> <span class=\"t\">ArrayList</span>&lt;&gt;();\n    <span class=\"k\">while</span> (!<span class=\"i\">isAtEnd</span>()) {\n      <span class=\"i\">statements</span>.<span class=\"i\">add</span>(<span class=\"i\">statement</span>());\n    }\n\n    <span class=\"k\">return</span> <span class=\"i\">statements</span>;<span name=\"parse-error-handling\"> </span>\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, method <em>parse</em>(), replace 7 lines</div>\n\n<aside name=\"parse-error-handling\">\n<p>What about the code we had in here for catching <code>ParseError</code> exceptions? We&rsquo;ll\nput better parse error handling in place soon when we add support for additional\nstatement types.</p>\n</aside>\n<p>This parses a series of statements, as many as it can find until it hits the end\nof the input. This is a pretty direct translation of the <code>program</code> rule into\nrecursive descent style. 
We must also chant a minor prayer to the Java verbosity\ngods since we are using ArrayList now.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">package com.craftinginterpreters.lox;\n\n</pre><div class=\"source-file\"><em>lox/Parser.java</em></div>\n<pre class=\"insert\"><span class=\"k\">import</span> <span class=\"i\">java.util.ArrayList</span>;\n</pre><pre class=\"insert-after\">import java.util.List;\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em></div>\n\n<p>A program is a list of statements, and we parse one of those statements using\nthis method:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>expression</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Stmt</span> <span class=\"i\">statement</span>() {\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">PRINT</span>)) <span class=\"k\">return</span> <span class=\"i\">printStatement</span>();\n\n    <span class=\"k\">return</span> <span class=\"i\">expressionStatement</span>();\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>expression</em>()</div>\n\n<p>A little bare bones, but we&rsquo;ll fill it in with more statement types later. We\ndetermine which specific statement rule is matched by looking at the current\ntoken. A <code>print</code> token means it&rsquo;s obviously a <code>print</code> statement.</p>\n<p>If the next token doesn&rsquo;t look like any known kind of statement, we assume it\nmust be an expression statement. That&rsquo;s the typical final fallthrough case when\nparsing a statement, since it&rsquo;s hard to proactively recognize an expression from\nits first token.</p>\n<p>Each statement kind gets its own method. 
First <code>print</code>:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>statement</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Stmt</span> <span class=\"i\">printStatement</span>() {\n    <span class=\"t\">Expr</span> <span class=\"i\">value</span> = <span class=\"i\">expression</span>();\n    <span class=\"i\">consume</span>(<span class=\"i\">SEMICOLON</span>, <span class=\"s\">&quot;Expect &#39;;&#39; after value.&quot;</span>);\n    <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">Print</span>(<span class=\"i\">value</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>statement</em>()</div>\n\n<p>Since we already matched and consumed the <code>print</code> token itself, we don&rsquo;t need to\ndo that here. We parse the subsequent expression, consume the terminating\nsemicolon, and emit the syntax tree.</p>\n<p>If we didn&rsquo;t match a <code>print</code> statement, we must have one of these:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>printStatement</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Stmt</span> <span class=\"i\">expressionStatement</span>() {\n    <span class=\"t\">Expr</span> <span class=\"i\">expr</span> = <span class=\"i\">expression</span>();\n    <span class=\"i\">consume</span>(<span class=\"i\">SEMICOLON</span>, <span class=\"s\">&quot;Expect &#39;;&#39; after expression.&quot;</span>);\n    <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">Expression</span>(<span class=\"i\">expr</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>printStatement</em>()</div>\n\n<p>Similar to the previous method, we parse an expression followed by a semicolon.\nWe wrap that 
Expr in a Stmt of the right type and return it.</p>\n<h3><a href=\"#executing-statements\" id=\"executing-statements\"><small>8&#8202;.&#8202;1&#8202;.&#8202;3</small>Executing statements</a></h3>\n<p>We&rsquo;re running through the previous couple of chapters in microcosm, working our\nway through the front end. Our parser can now produce statement syntax trees, so\nthe next and final step is to interpret them. As in expressions, we use the\nVisitor pattern, but we have a new visitor interface, Stmt.Visitor, to\nimplement since statements have their own base class.</p>\n<p>We add that to the list of interfaces Interpreter implements.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">class</span> <span class=\"t\">Interpreter</span> <span class=\"k\">implements</span> <span class=\"t\">Expr</span>.<span class=\"t\">Visitor</span>&lt;<span class=\"t\">Object</span>&gt;,\n                             <span class=\"t\">Stmt</span>.<span class=\"t\">Visitor</span>&lt;<span class=\"t\">Void</span>&gt; {\n</pre><pre class=\"insert-after\">  void interpret(Expr expression) {<span name=\"void\"> </span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, replace 1 line</div>\n\n<aside name=\"void\">\n<p>Java doesn&rsquo;t let you use lowercase &ldquo;void&rdquo; as a generic type argument for obscure\nreasons having to do with type erasure and the stack. Instead, there is a\nseparate &ldquo;Void&rdquo; type specifically for this use. Sort of a &ldquo;boxed void&rdquo;, like\n&ldquo;Integer&rdquo; is for &ldquo;int&rdquo;.</p>\n</aside>\n<p>Unlike expressions, statements produce no values, so the return type of the\nvisit methods is Void, not Object. We have two statement types, and we need a\nvisit method for each. 
The easiest is expression statements.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>evaluate</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitExpressionStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Expression</span> <span class=\"i\">stmt</span>) {\n    <span class=\"i\">evaluate</span>(<span class=\"i\">stmt</span>.<span class=\"i\">expression</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>evaluate</em>()</div>\n\n<p>We evaluate the inner expression using our existing <code>evaluate()</code> method and\n<span name=\"discard\">discard</span> the value. Then we return <code>null</code>. Java\nrequires that to satisfy the special capitalized Void return type. Weird, but\nwhat can you do?</p>\n<aside name=\"discard\">\n<p>Appropriately enough, we discard the value returned by <code>evaluate()</code> by placing\nthat call inside a <em>Java</em> expression statement.</p>\n</aside>\n<p>The <code>print</code> statement&rsquo;s visit method isn&rsquo;t much different.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitExpressionStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitPrintStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Print</span> <span class=\"i\">stmt</span>) {\n    <span class=\"t\">Object</span> <span class=\"i\">value</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">stmt</span>.<span class=\"i\">expression</span>);\n    <span class=\"t\">System</span>.<span class=\"i\">out</span>.<span class=\"i\">println</span>(<span class=\"i\">stringify</span>(<span class=\"i\">value</span>));\n   
 <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitExpressionStmt</em>()</div>\n\n<p>Before discarding the expression&rsquo;s value, we convert it to a string using the\n<code>stringify()</code> method we introduced in the last chapter and then dump it to\nstdout.</p>\n<p>Our interpreter is able to visit statements now, but we have some work to do to\nfeed them to it. First, modify the old <code>interpret()</code> method in the Interpreter\nclass to accept a list of statements<span class=\"em\">&mdash;</span>in other words, a program.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nmethod <em>interpret</em>()<br>\nreplace 8 lines</div>\n<pre>  <span class=\"t\">void</span> <span class=\"i\">interpret</span>(<span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>&gt; <span class=\"i\">statements</span>) {\n    <span class=\"k\">try</span> {\n      <span class=\"k\">for</span> (<span class=\"t\">Stmt</span> <span class=\"i\">statement</span> : <span class=\"i\">statements</span>) {\n        <span class=\"i\">execute</span>(<span class=\"i\">statement</span>);\n      }\n    } <span class=\"k\">catch</span> (<span class=\"t\">RuntimeError</span> <span class=\"i\">error</span>) {\n      <span class=\"t\">Lox</span>.<span class=\"i\">runtimeError</span>(<span class=\"i\">error</span>);\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, method <em>interpret</em>(), replace 8 lines</div>\n\n<p>This replaces the old code which took a single expression. 
The new code relies\non this tiny helper method:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>evaluate</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">void</span> <span class=\"i\">execute</span>(<span class=\"t\">Stmt</span> <span class=\"i\">stmt</span>) {\n    <span class=\"i\">stmt</span>.<span class=\"i\">accept</span>(<span class=\"k\">this</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>evaluate</em>()</div>\n\n<p>That&rsquo;s the statement analogue to the <code>evaluate()</code> method we have for\nexpressions. Since we&rsquo;re working with lists now, we need to let Java know.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">package com.craftinginterpreters.lox;\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">import</span> <span class=\"i\">java.util.List</span>;\n</pre><pre class=\"insert-after\">\n\nclass Interpreter implements Expr.Visitor&lt;Object&gt;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em></div>\n\n<p>The main Lox class is still trying to parse a single expression and pass it to\nthe interpreter. 
We fix the parsing line like so:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    Parser parser = new Parser(tokens);\n</pre><div class=\"source-file\"><em>lox/Lox.java</em><br>\nin <em>run</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>&gt; <span class=\"i\">statements</span> = <span class=\"i\">parser</span>.<span class=\"i\">parse</span>();\n</pre><pre class=\"insert-after\">\n\n    // Stop if there was a syntax error.\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, in <em>run</em>(), replace 1 line</div>\n\n<p>And then replace the call to the interpreter with this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    if (hadError) return;\n\n</pre><div class=\"source-file\"><em>lox/Lox.java</em><br>\nin <em>run</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"i\">interpreter</span>.<span class=\"i\">interpret</span>(<span class=\"i\">statements</span>);\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Lox.java</em>, in <em>run</em>(), replace 1 line</div>\n\n<p>Basically just plumbing the new syntax through. OK, fire up the interpreter and\ngive it a try. At this point, it&rsquo;s worth sketching out a little Lox program in a\ntext file to run as a script. Something like:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> <span class=\"s\">&quot;one&quot;</span>;\n<span class=\"k\">print</span> <span class=\"k\">true</span>;\n<span class=\"k\">print</span> <span class=\"n\">2</span> + <span class=\"n\">1</span>;\n</pre></div>\n<p>It almost looks like a real program! Note that the REPL, too, now requires you\nto enter a full statement instead of a simple expression. 
Don&rsquo;t forget your\nsemicolons.</p>\n<h2><a href=\"#global-variables\" id=\"global-variables\"><small>8&#8202;.&#8202;2</small>Global Variables</a></h2>\n<p>Now that we have statements, we can start working on state. Before we get into\nall of the complexity of lexical scoping, we&rsquo;ll start off with the easiest kind\nof variables<span class=\"em\">&mdash;</span><span name=\"globals\">globals</span>. We need two new constructs.</p>\n<ol>\n<li>\n<p>A <strong>variable declaration</strong> statement brings a new variable into the world.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">beverage</span> = <span class=\"s\">&quot;espresso&quot;</span>;\n</pre></div>\n<p>This creates a new binding that associates a name (here &ldquo;beverage&rdquo;) with a\nvalue (here, the string <code>\"espresso\"</code>).</p>\n</li>\n<li>\n<p>Once that&rsquo;s done, a <strong>variable expression</strong> accesses that binding. When the\nidentifier &ldquo;beverage&rdquo; is used as an expression, it looks up the value bound\nto that name and returns it.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> <span class=\"i\">beverage</span>; <span class=\"c\">// &quot;espresso&quot;.</span>\n</pre></div>\n</li>\n</ol>\n<p>Later, we&rsquo;ll add assignment and block scope, but that&rsquo;s enough to get moving.</p>\n<aside name=\"globals\">\n<p>Global state gets a bad rap. Sure, lots of global state<span class=\"em\">&mdash;</span>especially <em>mutable</em>\nstate<span class=\"em\">&mdash;</span>makes it hard to maintain large programs. It&rsquo;s good software\nengineering to minimize how much you use.</p>\n<p>But when you&rsquo;re slapping together a simple programming language or, heck, even\nlearning your first language, the flat simplicity of global variables helps. 
My\nfirst language was BASIC and, though I outgrew it eventually, it was nice that I\ndidn&rsquo;t have to wrap my head around scoping rules before I could make a computer\ndo fun stuff.</p>\n</aside>\n<h3><a href=\"#variable-syntax\" id=\"variable-syntax\"><small>8&#8202;.&#8202;2&#8202;.&#8202;1</small>Variable syntax</a></h3>\n<p>As before, we&rsquo;ll work through the implementation from front to back, starting\nwith the syntax. Variable declarations are statements, but they are different\nfrom other statements, and we&rsquo;re going to split the statement grammar in two to\nhandle them. That&rsquo;s because the grammar restricts where some kinds of statements\nare allowed.</p>\n<p>The clauses in control flow statements<span class=\"em\">&mdash;</span>think the then and else branches of\nan <code>if</code> statement or the body of a <code>while</code><span class=\"em\">&mdash;</span>are each a single statement. But\nthat statement is not allowed to be one that declares a name. This is OK:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">if</span> (<span class=\"i\">monday</span>) <span class=\"k\">print</span> <span class=\"s\">&quot;Ugh, already?&quot;</span>;\n</pre></div>\n<p>But this is not:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">if</span> (<span class=\"i\">monday</span>) <span class=\"k\">var</span> <span class=\"i\">beverage</span> = <span class=\"s\">&quot;espresso&quot;</span>;\n</pre></div>\n<p>We <em>could</em> allow the latter, but it&rsquo;s confusing. What is the scope of that\n<code>beverage</code> variable? Does it persist after the <code>if</code> statement? If so, what is\nits value on days other than Monday? Does the variable exist at all on those\ndays?</p>\n<p>Code like this is weird, so C, Java, and friends all disallow it. 
It&rsquo;s as if\nthere are two levels of <span name=\"brace\">&ldquo;precedence&rdquo;</span> for statements.\nSome places where a statement is allowed<span class=\"em\">&mdash;</span>like inside a block or at the top\nlevel<span class=\"em\">&mdash;</span>allow any kind of statement, including declarations. Others allow only\nthe &ldquo;higher&rdquo; precedence statements that don&rsquo;t declare names.</p>\n<aside name=\"brace\">\n<p>In this analogy, block statements work sort of like parentheses do for\nexpressions. A block is itself in the &ldquo;higher&rdquo; precedence level and can be used\nanywhere, like in the clauses of an <code>if</code> statement. But the statements it\n<em>contains</em> can be lower precedence. You&rsquo;re allowed to declare variables and\nother names inside the block. The curlies let you escape back into the full\nstatement grammar from a place where only some statements are allowed.</p>\n</aside>\n<p>To accommodate the distinction, we add another rule for kinds of statements that\ndeclare names.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">program</span>        → <span class=\"i\">declaration</span>* <span class=\"t\">EOF</span> ;\n\n<span class=\"i\">declaration</span>    → <span class=\"i\">varDecl</span>\n               | <span class=\"i\">statement</span> ;\n\n<span class=\"i\">statement</span>      → <span class=\"i\">exprStmt</span>\n               | <span class=\"i\">printStmt</span> ;\n</pre></div>\n<p>Declaration statements go under the new <code>declaration</code> rule. Right now, it&rsquo;s only\nvariables, but later it will include functions and classes. Any place where a\ndeclaration is allowed also allows non-declaring statements, so the\n<code>declaration</code> rule falls through to <code>statement</code>. 
Obviously, you can declare\nstuff at the top level of a script, so <code>program</code> routes to the new rule.</p>\n<p>The rule for declaring a variable looks like:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">varDecl</span>        → <span class=\"s\">&quot;var&quot;</span> <span class=\"t\">IDENTIFIER</span> ( <span class=\"s\">&quot;=&quot;</span> <span class=\"i\">expression</span> )? <span class=\"s\">&quot;;&quot;</span> ;\n</pre></div>\n<p>Like most statements, it starts with a leading keyword. In this case, <code>var</code>.\nThen an identifier token for the name of the variable being declared, followed\nby an optional initializer expression. Finally, we put a bow on it with the\nsemicolon.</p>\n<p>To access a variable, we define a new kind of primary expression.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">primary</span>        → <span class=\"s\">&quot;true&quot;</span> | <span class=\"s\">&quot;false&quot;</span> | <span class=\"s\">&quot;nil&quot;</span>\n               | <span class=\"t\">NUMBER</span> | <span class=\"t\">STRING</span>\n               | <span class=\"s\">&quot;(&quot;</span> <span class=\"i\">expression</span> <span class=\"s\">&quot;)&quot;</span>\n               | <span class=\"t\">IDENTIFIER</span> ;\n</pre></div>\n<p>That <code>IDENTIFIER</code> clause matches a single identifier token, which is understood\nto be the name of the variable being accessed.</p>\n<p>These new grammar rules get their corresponding syntax trees. 
Over in the AST\ngenerator, we add a <span name=\"var-stmt-ast\">new statement</span> node for a\nvariable declaration.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Expression : Expr expression&quot;,\n</pre><pre class=\"insert-before\">      <span class=\"s\">&quot;Print      : Expr expression&quot;</span><span class=\"insert-comma\">,</span>\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()<br>\nadd <em>&ldquo;,&rdquo;</em> to previous line</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;Var        : Token name, Expr initializer&quot;</span>\n</pre><pre class=\"insert-after\">    ));\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>(), add <em>&ldquo;,&rdquo;</em> to previous line</div>\n\n<aside name=\"var-stmt-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#variable-statement\">Appendix II</a>.</p>\n</aside>\n<p>It stores the name token so we know what it&rsquo;s declaring, along with the\ninitializer expression. 
(If there isn&rsquo;t an initializer, that field is <code>null</code>.)</p>\n<p>Then we add an expression node for accessing a variable.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      &quot;Literal  : Object value&quot;,\n</pre><pre class=\"insert-before\">      <span class=\"s\">&quot;Unary    : Token operator, Expr right&quot;</span><span class=\"insert-comma\">,</span>\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()<br>\nadd <em>&ldquo;,&rdquo;</em> to previous line</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;Variable : Token name&quot;</span>\n</pre><pre class=\"insert-after\">    ));\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>(), add <em>&ldquo;,&rdquo;</em> to previous line</div>\n\n<p><span name=\"var-expr-ast\">It&rsquo;s</span> simply a wrapper around the token for the\nvariable name. That&rsquo;s it. As always, don&rsquo;t forget to run the AST generator\nscript so that you get updated &ldquo;Expr.java&rdquo; and &ldquo;Stmt.java&rdquo; files.</p>\n<aside name=\"var-expr-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#variable-expression\">Appendix II</a>.</p>\n</aside>\n<h3><a href=\"#parsing-variables\" id=\"parsing-variables\"><small>8&#8202;.&#8202;2&#8202;.&#8202;2</small>Parsing variables</a></h3>\n<p>Before we parse variable statements, we need to shift around some code to make\nroom for the new <code>declaration</code> rule in the grammar. 
The top level of a program\nis now a list of declarations, so the entrypoint method to the parser changes.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  List&lt;Stmt&gt; parse() {\n    List&lt;Stmt&gt; statements = new ArrayList&lt;&gt;();\n    while (!isAtEnd()) {\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>parse</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">      <span class=\"i\">statements</span>.<span class=\"i\">add</span>(<span class=\"i\">declaration</span>());\n</pre><pre class=\"insert-after\">    }\n\n    return statements;<span name=\"parse-error-handling\"> </span>\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>parse</em>(), replace 1 line</div>\n\n<p>That calls this new method:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>expression</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Stmt</span> <span class=\"i\">declaration</span>() {\n    <span class=\"k\">try</span> {\n      <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">VAR</span>)) <span class=\"k\">return</span> <span class=\"i\">varDeclaration</span>();\n\n      <span class=\"k\">return</span> <span class=\"i\">statement</span>();\n    } <span class=\"k\">catch</span> (<span class=\"t\">ParseError</span> <span class=\"i\">error</span>) {\n      <span class=\"i\">synchronize</span>();\n      <span class=\"k\">return</span> <span class=\"k\">null</span>;\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>expression</em>()</div>\n\n<p>Hey, do you remember way back in that <a href=\"parsing-expressions.html\">earlier chapter</a> when we put the\ninfrastructure in place to do error recovery? 
We are finally ready to hook that\nup.</p>\n<p>This <code>declaration()</code> method is the method we call repeatedly when parsing a\nseries of statements in a block or a script, so it&rsquo;s the right place to\nsynchronize when the parser goes into panic mode. The whole body of this method\nis wrapped in a try block to catch the exception thrown when the parser begins\nerror recovery. This gets it back to trying to parse the beginning of the next\nstatement or declaration.</p>\n<p>The real parsing happens inside the try block. First, it looks to see if we&rsquo;re\nat a variable declaration by looking for the leading <code>var</code> keyword. If not, it\nfalls through to the existing <code>statement()</code> method that parses <code>print</code> and\nexpression statements.</p>\n<p>Remember how <code>statement()</code> tries to parse an expression statement if no other\nstatement matches? And <code>expression()</code> reports a syntax error if it can&rsquo;t parse\nan expression at the current token? 
That chain of calls ensures we report an\nerror if a valid declaration or statement isn&rsquo;t parsed.</p>\n<p>When the parser matches a <code>var</code> token, it branches to:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>printStatement</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Stmt</span> <span class=\"i\">varDeclaration</span>() {\n    <span class=\"t\">Token</span> <span class=\"i\">name</span> = <span class=\"i\">consume</span>(<span class=\"i\">IDENTIFIER</span>, <span class=\"s\">&quot;Expect variable name.&quot;</span>);\n\n    <span class=\"t\">Expr</span> <span class=\"i\">initializer</span> = <span class=\"k\">null</span>;\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">EQUAL</span>)) {\n      <span class=\"i\">initializer</span> = <span class=\"i\">expression</span>();\n    }\n\n    <span class=\"i\">consume</span>(<span class=\"i\">SEMICOLON</span>, <span class=\"s\">&quot;Expect &#39;;&#39; after variable declaration.&quot;</span>);\n    <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">Var</span>(<span class=\"i\">name</span>, <span class=\"i\">initializer</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>printStatement</em>()</div>\n\n<p>As always, the recursive descent code follows the grammar rule. The parser has\nalready matched the <code>var</code> token, so next it requires and consumes an identifier\ntoken for the variable name.</p>\n<p>Then, if it sees an <code>=</code> token, it knows there is an initializer expression and\nparses it. Otherwise, it leaves the initializer <code>null</code>. Finally, it consumes the\nrequired semicolon at the end of the statement. All this gets wrapped in a\nStmt.Var syntax tree node and we&rsquo;re groovy.</p>\n<p>Parsing a variable expression is even easier. 
In <code>primary()</code>, we look for an\nidentifier token.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return new Expr.Literal(previous().literal);\n    }\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>primary</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">IDENTIFIER</span>)) {\n      <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Variable</span>(<span class=\"i\">previous</span>());\n    }\n</pre><pre class=\"insert-after\">\n\n    if (match(LEFT_PAREN)) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>primary</em>()</div>\n\n<p>That gives us a working front end for declaring and using variables. All that&rsquo;s\nleft is to feed it into the interpreter. Before we get to that, we need to talk\nabout where variables live in memory.</p>\n<h2><a href=\"#environments\" id=\"environments\"><small>8&#8202;.&#8202;3</small>Environments</a></h2>\n<p>The bindings that associate variables to values need to be stored somewhere.\nEver since the Lisp folks invented parentheses, this data structure has been\ncalled an <span name=\"env\"><strong>environment</strong></span>.</p><img src=\"image/statements-and-state/environment.png\" alt=\"An environment containing two bindings.\" />\n<aside name=\"env\">\n<p>I like to imagine the environment literally, as a sylvan wonderland where\nvariables and values frolic.</p>\n</aside>\n<p>You can think of it like a <span name=\"map\">map</span> where the keys are\nvariable names and the values are the variable&rsquo;s, uh, values. In fact, that&rsquo;s\nhow we&rsquo;ll implement it in Java. 
We could stuff that map and the code to manage\nit right into Interpreter, but since it forms a nicely delineated concept, we&rsquo;ll\npull it out into its own class.</p>\n<p>Start a new file and add:</p>\n<aside name=\"map\">\n<p>Java calls them <strong>maps</strong> or <strong>hashmaps</strong>. Other languages call them <strong>hash\ntables</strong>, <strong>dictionaries</strong> (Python and C#), <strong>hashes</strong> (Ruby and Perl),\n<strong>tables</strong> (Lua), or <strong>associative arrays</strong> (PHP). Way back when, they were\nknown as <strong>scatter tables</strong>.</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Environment.java</em><br>\ncreate new file</div>\n<pre><span class=\"k\">package</span> <span class=\"i\">com.craftinginterpreters.lox</span>;\n\n<span class=\"k\">import</span> <span class=\"i\">java.util.HashMap</span>;\n<span class=\"k\">import</span> <span class=\"i\">java.util.Map</span>;\n\n<span class=\"k\">class</span> <span class=\"t\">Environment</span> {\n  <span class=\"k\">private</span> <span class=\"k\">final</span> <span class=\"t\">Map</span>&lt;<span class=\"t\">String</span>, <span class=\"t\">Object</span>&gt; <span class=\"i\">values</span> = <span class=\"k\">new</span> <span class=\"t\">HashMap</span>&lt;&gt;();\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Environment.java</em>, create new file</div>\n\n<p>There&rsquo;s a Java Map in there to store the bindings. It uses bare strings for the\nkeys, not tokens. A token represents a unit of code at a specific place in the\nsource text, but when it comes to looking up variables, all identifier tokens\nwith the same name should refer to the same variable (ignoring scope for now).\nUsing the raw string ensures all of those tokens refer to the same map key.</p>\n<p>There are two operations we need to support. 
First, a variable definition binds\na new name to a value.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Environment.java</em><br>\nin class <em>Environment</em></div>\n<pre>  <span class=\"t\">void</span> <span class=\"i\">define</span>(<span class=\"t\">String</span> <span class=\"i\">name</span>, <span class=\"t\">Object</span> <span class=\"i\">value</span>) {\n    <span class=\"i\">values</span>.<span class=\"i\">put</span>(<span class=\"i\">name</span>, <span class=\"i\">value</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Environment.java</em>, in class <em>Environment</em></div>\n\n<p>Not exactly brain surgery, but we have made one interesting semantic choice.\nWhen we add the key to the map, we don&rsquo;t check to see if it&rsquo;s already present.\nThat means that this program works:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;before&quot;</span>;\n<span class=\"k\">print</span> <span class=\"i\">a</span>; <span class=\"c\">// &quot;before&quot;.</span>\n<span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;after&quot;</span>;\n<span class=\"k\">print</span> <span class=\"i\">a</span>; <span class=\"c\">// &quot;after&quot;.</span>\n</pre></div>\n<p>A variable statement doesn&rsquo;t just define a <em>new</em> variable, it can also be used\nto <em>re</em>define an existing variable. We could <span name=\"scheme\">choose</span>\nto make this an error instead. The user may not intend to redefine an existing\nvariable. (If they did mean to, they probably would have used assignment, not\n<code>var</code>.) Making redefinition an error would help them find that bug.</p>\n<p>However, doing so interacts poorly with the REPL. In the middle of a REPL\nsession, it&rsquo;s nice to not have to mentally track which variables you&rsquo;ve already\ndefined. 
We could allow redefinition in the REPL but not in scripts, but then\nusers would have to learn two sets of rules, and code copied and pasted from one\nform to the other might not work.</p>\n<aside name=\"scheme\">\n<p>My rule about variables and scoping is, &ldquo;When in doubt, do what Scheme does&rdquo;.\nThe Scheme folks have probably spent more time thinking about variable scope\nthan we ever will<span class=\"em\">&mdash;</span>one of the main goals of Scheme was to introduce lexical\nscoping to the world<span class=\"em\">&mdash;</span>so it&rsquo;s hard to go wrong if you follow in their\nfootsteps.</p>\n<p>Scheme allows redefining variables at the top level.</p>\n</aside>\n<p>So, to keep the two modes consistent, we&rsquo;ll allow it<span class=\"em\">&mdash;</span>at least for global\nvariables. Once a variable exists, we need a way to look it up.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">class Environment {\n  private final Map&lt;String, Object&gt; values = new HashMap&lt;&gt;();\n</pre><div class=\"source-file\"><em>lox/Environment.java</em><br>\nin class <em>Environment</em></div>\n<pre class=\"insert\">\n\n  <span class=\"t\">Object</span> <span class=\"i\">get</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">values</span>.<span class=\"i\">containsKey</span>(<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>)) {\n      <span class=\"k\">return</span> <span class=\"i\">values</span>.<span class=\"i\">get</span>(<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>);\n    }\n\n    <span class=\"k\">throw</span> <span class=\"k\">new</span> <span class=\"t\">RuntimeError</span>(<span class=\"i\">name</span>,\n        <span class=\"s\">&quot;Undefined variable &#39;&quot;</span> + <span class=\"i\">name</span>.<span class=\"i\">lexeme</span> + <span class=\"s\">&quot;&#39;.&quot;</span>);\n  }\n\n</pre><pre class=\"insert-after\">  void 
define(String name, Object value) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Environment.java</em>, in class <em>Environment</em></div>\n\n<p>This is a little more semantically interesting. If the variable is found, it\nsimply returns the value bound to it. But what if it&rsquo;s not? Again, we have a\nchoice:</p>\n<ul>\n<li>\n<p>Make it a syntax error.</p>\n</li>\n<li>\n<p>Make it a runtime error.</p>\n</li>\n<li>\n<p>Allow it and return some default value like <code>nil</code>.</p>\n</li>\n</ul>\n<p>Lox is pretty lax, but the last option is a little <em>too</em> permissive to me.\nMaking it a syntax error<span class=\"em\">&mdash;</span>a compile-time error<span class=\"em\">&mdash;</span>seems like a smart choice.\nUsing an undefined variable is a bug, and the sooner you detect the mistake, the\nbetter.</p>\n<p>The problem is that <em>using</em> a variable isn&rsquo;t the same as <em>referring</em> to it. You\ncan refer to a variable in a chunk of code without immediately evaluating it if\nthat chunk of code is wrapped inside a function. If we make it a static error to\n<em>mention</em> a variable before it&rsquo;s been declared, it becomes much harder to define\nrecursive functions.</p>\n<p>We could accommodate single recursion<span class=\"em\">&mdash;</span>a function that calls itself<span class=\"em\">&mdash;</span>by\ndeclaring the function&rsquo;s own name before we examine its body. But that doesn&rsquo;t\nhelp with mutually recursive procedures that call each other. 
Consider:</p>\n<p><span name=\"contrived\"></span></p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">isOdd</span>(<span class=\"i\">n</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">n</span> == <span class=\"n\">0</span>) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n  <span class=\"k\">return</span> <span class=\"i\">isEven</span>(<span class=\"i\">n</span> - <span class=\"n\">1</span>);\n}\n\n<span class=\"k\">fun</span> <span class=\"i\">isEven</span>(<span class=\"i\">n</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">n</span> == <span class=\"n\">0</span>) <span class=\"k\">return</span> <span class=\"k\">true</span>;\n  <span class=\"k\">return</span> <span class=\"i\">isOdd</span>(<span class=\"i\">n</span> - <span class=\"n\">1</span>);\n}\n</pre></div>\n<aside name=\"contrived\">\n<p>Granted, this is probably not the most efficient way to tell if a number is even\nor odd (not to mention the bad things that happen if you pass a non-integer or\nnegative number to them). Bear with me.</p>\n</aside>\n<p>The <code>isEven()</code> function isn&rsquo;t defined by the <span name=\"declare\">time</span> we\nare looking at the body of <code>isOdd()</code> where it&rsquo;s called. If we swap the order of\nthe two functions, then <code>isOdd()</code> isn&rsquo;t defined when we&rsquo;re looking at\n<code>isEven()</code>&rsquo;s body.</p>\n<aside name=\"declare\">\n<p>Some statically typed languages like Java and C# solve this by specifying that\nthe top level of a program isn&rsquo;t a sequence of imperative statements. Instead, a\nprogram is a set of declarations which all come into being simultaneously. The\nimplementation declares <em>all</em> of the names before looking at the bodies of <em>any</em>\nof the functions.</p>\n<p>Older languages like C and Pascal don&rsquo;t work like this. 
Instead, they force you\nto add explicit <em>forward declarations</em> to declare a name before it&rsquo;s fully\ndefined. That was a concession to the limited computing power at the time. They\nwanted to be able to compile a source file in one single pass through the text,\nso those compilers couldn&rsquo;t gather up all of the declarations first before\nprocessing function bodies.</p>\n</aside>\n<p>Since making it a <em>static</em> error makes recursive declarations too difficult,\nwe&rsquo;ll defer the error to runtime. It&rsquo;s OK to refer to a variable before it&rsquo;s\ndefined as long as you don&rsquo;t <em>evaluate</em> the reference. That lets the program\nfor even and odd numbers work, but you&rsquo;d get a runtime error in:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> <span class=\"i\">a</span>;\n<span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;too late!&quot;</span>;\n</pre></div>\n<p>As with type errors in the expression evaluation code, we report a runtime error\nby throwing an exception. 
The exception contains the variable&rsquo;s token so we can\ntell the user where in their code they messed up.</p>\n<h3><a href=\"#interpreting-global-variables\" id=\"interpreting-global-variables\"><small>8&#8202;.&#8202;3&#8202;.&#8202;1</small>Interpreting global variables</a></h3>\n<p>The Interpreter class gets an instance of the new Environment class.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">class Interpreter implements Expr.Visitor&lt;Object&gt;,\n                             Stmt.Visitor&lt;Void&gt; {\n</pre><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nin class <em>Interpreter</em></div>\n<pre class=\"insert\">  <span class=\"k\">private</span> <span class=\"t\">Environment</span> <span class=\"i\">environment</span> = <span class=\"k\">new</span> <span class=\"t\">Environment</span>();\n\n</pre><pre class=\"insert-after\">  void interpret(List&lt;Stmt&gt; statements) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, in class <em>Interpreter</em></div>\n\n<p>We store it as a field directly in Interpreter so that the variables stay in\nmemory as long as the interpreter is still running.</p>\n<p>We have two new syntax trees, so that&rsquo;s two new visit methods. 
The first is for\ndeclaration statements.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitPrintStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitVarStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Var</span> <span class=\"i\">stmt</span>) {\n    <span class=\"t\">Object</span> <span class=\"i\">value</span> = <span class=\"k\">null</span>;\n    <span class=\"k\">if</span> (<span class=\"i\">stmt</span>.<span class=\"i\">initializer</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">value</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">stmt</span>.<span class=\"i\">initializer</span>);\n    }\n\n    <span class=\"i\">environment</span>.<span class=\"i\">define</span>(<span class=\"i\">stmt</span>.<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>, <span class=\"i\">value</span>);\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitPrintStmt</em>()</div>\n\n<p>If the variable has an initializer, we evaluate it. If not, we have another\nchoice to make. We could have made this a syntax error in the parser by\n<em>requiring</em> an initializer. Most languages don&rsquo;t, though, so it feels a little\nharsh to do so in Lox.</p>\n<p>We could make it a runtime error. We&rsquo;d let you define an uninitialized variable,\nbut if you accessed it before assigning to it, a runtime error would occur. It&rsquo;s\nnot a bad idea, but most dynamically typed languages don&rsquo;t do that. 
Instead,\nwe&rsquo;ll keep it simple and say that Lox sets a variable to <code>nil</code> if it isn&rsquo;t\nexplicitly initialized.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span>;\n<span class=\"k\">print</span> <span class=\"i\">a</span>; <span class=\"c\">// &quot;nil&quot;.</span>\n</pre></div>\n<p>Thus, if there isn&rsquo;t an initializer, we set the value to <code>null</code>, which is the\nJava representation of Lox&rsquo;s <code>nil</code> value. Then we tell the environment to bind\nthe variable to that value.</p>\n<p>Next, we evaluate a variable expression.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitUnaryExpr</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">visitVariableExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Variable</span> <span class=\"i\">expr</span>) {\n    <span class=\"k\">return</span> <span class=\"i\">environment</span>.<span class=\"i\">get</span>(<span class=\"i\">expr</span>.<span class=\"i\">name</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitUnaryExpr</em>()</div>\n\n<p>This simply forwards to the environment which does the heavy lifting to make\nsure the variable is defined. With that, we&rsquo;ve got rudimentary variables\nworking. 
Try this out:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">1</span>;\n<span class=\"k\">var</span> <span class=\"i\">b</span> = <span class=\"n\">2</span>;\n<span class=\"k\">print</span> <span class=\"i\">a</span> + <span class=\"i\">b</span>;\n</pre></div>\n<p>We can&rsquo;t reuse <em>code</em> yet, but we can start to build up programs that reuse\n<em>data</em>.</p>\n<h2><a href=\"#assignment\" id=\"assignment\"><small>8&#8202;.&#8202;4</small>Assignment</a></h2>\n<p>It&rsquo;s possible to create a language that has variables but does not let you\nreassign<span class=\"em\">&mdash;</span>or <strong>mutate</strong><span class=\"em\">&mdash;</span>them. Haskell is one example. SML supports only\nmutable references and arrays<span class=\"em\">&mdash;</span>variables cannot be reassigned. Rust steers you\naway from mutation by requiring a <code>mut</code> modifier to enable assignment.</p>\n<p>Mutating a variable is a side effect and, as the name suggests, some language\nfolks think side effects are <span name=\"pure\">dirty</span> or inelegant. Code\nshould be pure math that produces values<span class=\"em\">&mdash;</span>crystalline, unchanging ones<span class=\"em\">&mdash;</span>like\nan act of divine creation. Not some grubby automaton that beats blobs of data\ninto shape, one imperative grunt at a time.</p>\n<aside name=\"pure\">\n<p>I find it delightful that the same group of people who pride themselves on\ndispassionate logic are also the ones who can&rsquo;t resist emotionally loaded terms\nfor their work: &ldquo;pure&rdquo;, &ldquo;side effect&rdquo;, &ldquo;lazy&rdquo;, &ldquo;persistent&rdquo;, &ldquo;first-class&rdquo;,\n&ldquo;higher-order&rdquo;.</p>\n</aside>\n<p>Lox is not so austere. Lox is an imperative language, and mutation comes with\nthe territory. Adding support for assignment doesn&rsquo;t require much work. 
Global\nvariables already support redefinition, so most of the machinery is there now.\nMainly, we&rsquo;re missing an explicit assignment notation.</p>\n<h3><a href=\"#assignment-syntax\" id=\"assignment-syntax\"><small>8&#8202;.&#8202;4&#8202;.&#8202;1</small>Assignment syntax</a></h3>\n<p>That little <code>=</code> syntax is more complex than it might seem. Like most C-derived\nlanguages, assignment is an <span name=\"assign\">expression</span> and not a\nstatement. As in C, it is the lowest precedence expression form. That means the\nrule slots between <code>expression</code> and <code>equality</code> (the next lowest precedence\nexpression).</p>\n<aside name=\"assign\">\n<p>In some other languages, like Pascal, Python, and Go, assignment is a statement.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"i\">expression</span>     → <span class=\"i\">assignment</span> ;\n<span class=\"i\">assignment</span>     → <span class=\"t\">IDENTIFIER</span> <span class=\"s\">&quot;=&quot;</span> <span class=\"i\">assignment</span>\n               | <span class=\"i\">equality</span> ;\n</pre></div>\n<p>This says an <code>assignment</code> is either an identifier followed by an <code>=</code> and an\nexpression for the value, or an <code>equality</code> (and thus any other) expression.\nLater, <code>assignment</code> will get more complex when we add property setters on\nobjects, like:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">instance</span>.<span class=\"i\">field</span> = <span class=\"s\">&quot;value&quot;</span>;\n</pre></div>\n<p>The easy part is adding the <span name=\"assign-ast\">new syntax tree node</span>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    defineAst(outputDir, &quot;Expr&quot;, Arrays.asList(\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;Assign   : Token name, Expr value&quot;</span>,\n</pre><pre 
class=\"insert-after\">      &quot;Binary   : Expr left, Token operator, Expr right&quot;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<aside name=\"assign-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#assign-expression\">Appendix II</a>.</p>\n</aside>\n<p>It has a token for the variable being assigned to, and an expression for the new\nvalue. After you run GenerateAst to get the new Expr.Assign class, swap out\nthe body of the parser&rsquo;s existing <code>expression()</code> method to match the updated\nrule.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  private Expr expression() {\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>expression</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">    <span class=\"k\">return</span> <span class=\"i\">assignment</span>();\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>expression</em>(), replace 1 line</div>\n\n<p>Here is where it gets tricky. A single token lookahead recursive descent parser\ncan&rsquo;t see far enough to tell that it&rsquo;s parsing an assignment until <em>after</em> it\nhas gone through the left-hand side and stumbled onto the <code>=</code>. You might wonder\nwhy it even needs to. After all, we don&rsquo;t know we&rsquo;re parsing a <code>+</code> expression\nuntil after we&rsquo;ve finished parsing the left operand.</p>\n<p>The difference is that the left-hand side of an assignment isn&rsquo;t an expression\nthat evaluates to a value. It&rsquo;s a sort of pseudo-expression that evaluates to a\n&ldquo;thing&rdquo; you can assign to. 
Consider:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;before&quot;</span>;\n<span class=\"i\">a</span> = <span class=\"s\">&quot;value&quot;</span>;\n</pre></div>\n<p>On the second line, we don&rsquo;t <em>evaluate</em> <code>a</code> (which would return the string\n&ldquo;before&rdquo;). We figure out what variable <code>a</code> refers to so we know where to store\nthe right-hand side expression&rsquo;s value. The <a href=\"https://en.wikipedia.org/wiki/Value_(computer_science)#lrvalue\">classic terms</a> for these\ntwo <span name=\"l-value\">constructs</span> are <strong>l-value</strong> and <strong>r-value</strong>. All\nof the expressions that we&rsquo;ve seen so far that produce values are r-values. An\nl-value &ldquo;evaluates&rdquo; to a storage location that you can assign into.</p>\n<aside name=\"l-value\">\n<p>In fact, the names come from assignment expressions: <em>l</em>-values appear on the\n<em>left</em> side of the <code>=</code> in an assignment, and <em>r</em>-values on the <em>right</em>.</p>\n</aside>\n<p>We want the syntax tree to reflect that an l-value isn&rsquo;t evaluated like a normal\nexpression. That&rsquo;s why the Expr.Assign node has a <em>Token</em> for the left-hand\nside, not an Expr. The problem is that the parser doesn&rsquo;t know it&rsquo;s parsing an\nl-value until it hits the <code>=</code>. 
In a complex l-value, that may occur <span\nname=\"many\">many</span> tokens later.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">makeList</span>().<span class=\"i\">head</span>.<span class=\"i\">next</span> = <span class=\"i\">node</span>;\n</pre></div>\n<aside name=\"many\">\n<p>Since the receiver of a field assignment can be any expression, and expressions\ncan be as long as you want to make them, it may take an <em>unbounded</em> number of\ntokens of lookahead to find the <code>=</code>.</p>\n</aside>\n<p>We have only a single token of lookahead, so what do we do? We use a little\ntrick, and it looks like this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>expressionStatement</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">Expr</span> <span class=\"i\">assignment</span>() {\n    <span class=\"t\">Expr</span> <span class=\"i\">expr</span> = <span class=\"i\">equality</span>();\n\n    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">EQUAL</span>)) {\n      <span class=\"t\">Token</span> <span class=\"i\">equals</span> = <span class=\"i\">previous</span>();\n      <span class=\"t\">Expr</span> <span class=\"i\">value</span> = <span class=\"i\">assignment</span>();\n\n      <span class=\"k\">if</span> (<span class=\"i\">expr</span> <span class=\"k\">instanceof</span> <span class=\"t\">Expr</span>.<span class=\"t\">Variable</span>) {\n        <span class=\"t\">Token</span> <span class=\"i\">name</span> = ((<span class=\"t\">Expr</span>.<span class=\"t\">Variable</span>)<span class=\"i\">expr</span>).<span class=\"i\">name</span>;\n        <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Expr</span>.<span class=\"t\">Assign</span>(<span class=\"i\">name</span>, <span class=\"i\">value</span>);\n      }\n\n      <span class=\"i\">error</span>(<span class=\"i\">equals</span>, <span class=\"s\">&quot;Invalid assignment 
target.&quot;</span>);<span name=\"no-throw\"> </span>\n    }\n\n    <span class=\"k\">return</span> <span class=\"i\">expr</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>expressionStatement</em>()</div>\n\n<p>Most of the code for parsing an assignment expression looks similar to that of\nthe other binary operators like <code>+</code>. We parse the left-hand side, which can be\nany expression of higher precedence. If we find an <code>=</code>, we parse the right-hand\nside and then wrap it all up in an assignment expression tree node.</p>\n<aside name=\"no-throw\">\n<p>We <em>report</em> an error if the left-hand side isn&rsquo;t a valid assignment target, but\nwe don&rsquo;t <em>throw</em> it because the parser isn&rsquo;t in a confused state where we need\nto go into panic mode and synchronize.</p>\n</aside>\n<p>One slight difference from binary operators is that we don&rsquo;t loop to build up a\nsequence of the same operator. Since assignment is right-associative, we instead\nrecursively call <code>assignment()</code> to parse the right-hand side.</p>\n<p>The trick is that right before we create the assignment expression node, we look\nat the left-hand side expression and figure out what kind of assignment target\nit is. We convert the r-value expression node into an l-value representation.</p>\n<p>This conversion works because it turns out that every valid assignment target\nhappens to also be <span name=\"converse\">valid syntax</span> as a normal\nexpression. Consider a complex field assignment like:</p>\n<aside name=\"converse\">\n<p>You can still use this trick even if there are assignment targets that are not\nvalid expressions. Define a <strong>cover grammar</strong>, a looser grammar that accepts\nall of the valid expression <em>and</em> assignment target syntaxes. When you hit\nan <code>=</code>, report an error if the left-hand side isn&rsquo;t within the valid assignment\ntarget grammar. 
Conversely, if you <em>don&rsquo;t</em> hit an <code>=</code>, report an error if the\nleft-hand side isn&rsquo;t a valid <em>expression</em>.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"i\">newPoint</span>(<span class=\"i\">x</span> + <span class=\"n\">2</span>, <span class=\"n\">0</span>).<span class=\"i\">y</span> = <span class=\"n\">3</span>;\n</pre></div>\n<p>The left-hand side of that assignment could also work as a valid expression.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">newPoint</span>(<span class=\"i\">x</span> + <span class=\"n\">2</span>, <span class=\"n\">0</span>).<span class=\"i\">y</span>;\n</pre></div>\n<p>The first example sets the field, the second gets it.</p>\n<p>This means we can parse the left-hand side <em>as if it were</em> an expression and\nthen after the fact produce a syntax tree that turns it into an assignment\ntarget. If the left-hand side expression isn&rsquo;t a <span name=\"paren\">valid</span>\nassignment target, we fail with a syntax error. That ensures we report an error\non code like this:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">a</span> + <span class=\"i\">b</span> = <span class=\"i\">c</span>;\n</pre></div>\n<aside name=\"paren\">\n<p>Way back in the parsing chapter, I said we represent parenthesized expressions\nin the syntax tree because we&rsquo;ll need them later. This is why. We need to be\nable to distinguish these cases:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">a</span> = <span class=\"n\">3</span>;   <span class=\"c\">// OK.</span>\n(<span class=\"i\">a</span>) = <span class=\"n\">3</span>; <span class=\"c\">// Error.</span>\n</pre></div>\n</aside>\n<p>Right now, the only valid target is a simple variable expression, but we&rsquo;ll add\nfields later. The end result of this trick is an assignment expression tree node\nthat knows what it is assigning to and has an expression subtree for the value\nbeing assigned. 
All with only a single token of lookahead and no backtracking.</p>\n<h3><a href=\"#assignment-semantics\" id=\"assignment-semantics\"><small>8&#8202;.&#8202;4&#8202;.&#8202;2</small>Assignment semantics</a></h3>\n<p>We have a new syntax tree node, so our interpreter gets a new visit method.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>visitVarStmt</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Object</span> <span class=\"i\">visitAssignExpr</span>(<span class=\"t\">Expr</span>.<span class=\"t\">Assign</span> <span class=\"i\">expr</span>) {\n    <span class=\"t\">Object</span> <span class=\"i\">value</span> = <span class=\"i\">evaluate</span>(<span class=\"i\">expr</span>.<span class=\"i\">value</span>);\n    <span class=\"i\">environment</span>.<span class=\"i\">assign</span>(<span class=\"i\">expr</span>.<span class=\"i\">name</span>, <span class=\"i\">value</span>);\n    <span class=\"k\">return</span> <span class=\"i\">value</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>visitVarStmt</em>()</div>\n\n<p>For obvious reasons, it&rsquo;s similar to variable declaration. It evaluates the\nright-hand side to get the value, then stores it in the named variable. 
Instead\nof using <code>define()</code> on Environment, it calls this new method:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Environment.java</em><br>\nadd after <em>get</em>()</div>\n<pre>  <span class=\"t\">void</span> <span class=\"i\">assign</span>(<span class=\"t\">Token</span> <span class=\"i\">name</span>, <span class=\"t\">Object</span> <span class=\"i\">value</span>) {\n    <span class=\"k\">if</span> (<span class=\"i\">values</span>.<span class=\"i\">containsKey</span>(<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>)) {\n      <span class=\"i\">values</span>.<span class=\"i\">put</span>(<span class=\"i\">name</span>.<span class=\"i\">lexeme</span>, <span class=\"i\">value</span>);\n      <span class=\"k\">return</span>;\n    }\n\n    <span class=\"k\">throw</span> <span class=\"k\">new</span> <span class=\"t\">RuntimeError</span>(<span class=\"i\">name</span>,\n        <span class=\"s\">&quot;Undefined variable &#39;&quot;</span> + <span class=\"i\">name</span>.<span class=\"i\">lexeme</span> + <span class=\"s\">&quot;&#39;.&quot;</span>);\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Environment.java</em>, add after <em>get</em>()</div>\n\n<p>The key difference between assignment and definition is that assignment is not\n<span name=\"new\">allowed</span> to create a <em>new</em> variable. In terms of our\nimplementation, that means it&rsquo;s a runtime error if the key doesn&rsquo;t already exist\nin the environment&rsquo;s variable map.</p>\n<aside name=\"new\">\n<p>Unlike Python and Ruby, Lox doesn&rsquo;t do <a href=\"#design-note\">implicit variable declaration</a>.</p>\n</aside>\n<p>The last thing the <code>visit()</code> method does is return the assigned value. 
That&rsquo;s\nbecause assignment is an expression that can be nested inside other expressions,\nlike so:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">1</span>;\n<span class=\"k\">print</span> <span class=\"i\">a</span> = <span class=\"n\">2</span>; <span class=\"c\">// &quot;2&quot;.</span>\n</pre></div>\n<p>Our interpreter can now create, read, and modify variables. It&rsquo;s about as\nsophisticated as early <span name=\"basic\">BASICs</span>. Global variables are\nsimple, but writing a large program when any two chunks of code can accidentally\nstep on each other&rsquo;s state is no fun. We want <em>local</em> variables, which means\nit&rsquo;s time for <em>scope</em>.</p>\n<aside name=\"basic\">\n<p>Maybe a little better than that. Unlike some old BASICs, Lox can handle variable\nnames longer than two characters.</p>\n</aside>\n<h2><a href=\"#scope\" id=\"scope\"><small>8&#8202;.&#8202;5</small>Scope</a></h2>\n<p>A <strong>scope</strong> defines a region where a name maps to a certain entity. Multiple\nscopes enable the same name to refer to different things in different contexts.\nIn my house, &ldquo;Bob&rdquo; usually refers to me. But maybe in your town you know a\ndifferent Bob. Same name, but different dudes based on where you say it.</p>\n<p><span name=\"lexical\"><strong>Lexical scope</strong></span> (or the less commonly heard\n<strong>static scope</strong>) is a specific style of scoping where the text of the program\nitself shows where a scope begins and ends. In Lox, as in most modern languages,\nvariables are lexically scoped. When you see an expression that uses some\nvariable, you can figure out which variable declaration it refers to just by\nstatically reading the code.</p>\n<aside name=\"lexical\">\n<p>&ldquo;Lexical&rdquo; comes from the Greek &ldquo;lexikos&rdquo; which means &ldquo;related to words&rdquo;. 
When we\nuse it in programming languages, it usually means a thing you can figure out\nfrom source code itself without having to execute anything.</p>\n<p>Lexical scope came onto the scene with ALGOL. Earlier languages were often\ndynamically scoped. Computer scientists back then believed dynamic scope was\nfaster to execute. Today, thanks to early Scheme hackers, we know that isn&rsquo;t\ntrue. If anything, it&rsquo;s the opposite.</p>\n<p>Dynamic scope for variables lives on in some corners. Emacs Lisp defaults to\ndynamic scope for variables. The <a href=\"http://clojuredocs.org/clojure.core/binding\"><code>binding</code></a> macro in Clojure provides\nit. The widely disliked <a href=\"https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/with\"><code>with</code> statement</a> in JavaScript turns properties\non an object into dynamically scoped variables.</p>\n</aside>\n<p>For example:</p>\n<div class=\"codehilite\"><pre>{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;first&quot;</span>;\n  <span class=\"k\">print</span> <span class=\"i\">a</span>; <span class=\"c\">// &quot;first&quot;.</span>\n}\n\n{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;second&quot;</span>;\n  <span class=\"k\">print</span> <span class=\"i\">a</span>; <span class=\"c\">// &quot;second&quot;.</span>\n}\n</pre></div>\n<p>Here, we have two blocks with a variable <code>a</code> declared in each of them. You and\nI can tell just from looking at the code that the use of <code>a</code> in the first\n<code>print</code> statement refers to the first <code>a</code>, and the second one refers to the\nsecond.</p><img src=\"image/statements-and-state/blocks.png\" alt=\"An environment for each 'a'.\" />\n<p>This is in contrast to <strong>dynamic scope</strong> where you don&rsquo;t know what a name refers\nto until you execute the code. 
Lox doesn&rsquo;t have dynamically scoped <em>variables</em>,\nbut methods and fields on objects are dynamically scoped.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Saxophone</span> {\n  <span class=\"i\">play</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Careless Whisper&quot;</span>;\n  }\n}\n\n<span class=\"k\">class</span> <span class=\"t\">GolfClub</span> {\n  <span class=\"i\">play</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Fore!&quot;</span>;\n  }\n}\n\n<span class=\"k\">fun</span> <span class=\"i\">playIt</span>(<span class=\"i\">thing</span>) {\n  <span class=\"i\">thing</span>.<span class=\"i\">play</span>();\n}\n</pre></div>\n<p>When <code>playIt()</code> calls <code>thing.play()</code>, we don&rsquo;t know if we&rsquo;re about to hear\n&ldquo;Careless Whisper&rdquo; or &ldquo;Fore!&rdquo; It depends on whether you pass a Saxophone or a\nGolfClub to the function, and we don&rsquo;t know that until runtime.</p>\n<p>Scope and environments are close cousins. The former is the theoretical concept,\nand the latter is the machinery that implements it. As our interpreter works its\nway through code, syntax tree nodes that affect scope will change the\nenvironment. In a C-ish syntax like Lox&rsquo;s, scope is controlled by curly-braced\nblocks. (That&rsquo;s why we call it <strong>block scope</strong>.)</p>\n<div class=\"codehilite\"><pre>{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;in block&quot;</span>;\n}\n<span class=\"k\">print</span> <span class=\"i\">a</span>; <span class=\"c\">// Error! No more &quot;a&quot;.</span>\n</pre></div>\n<p>The beginning of a block introduces a new local scope, and that scope ends when\nexecution passes the closing <code>}</code>. 
Any variables declared inside the block\ndisappear.</p>\n<h3><a href=\"#nesting-and-shadowing\" id=\"nesting-and-shadowing\"><small>8&#8202;.&#8202;5&#8202;.&#8202;1</small>Nesting and shadowing</a></h3>\n<p>A first cut at implementing block scope might work like this:</p>\n<ol>\n<li>\n<p>As we visit each statement inside the block, keep track of any variables\ndeclared.</p>\n</li>\n<li>\n<p>After the last statement is executed, tell the environment to delete all of\nthose variables.</p>\n</li>\n</ol>\n<p>That would work for the previous example. But remember, one motivation for\nlocal scope is encapsulation<span class=\"em\">&mdash;</span>a block of code in one corner of the program\nshouldn&rsquo;t interfere with some other block. Check this out:</p>\n<div class=\"codehilite\"><pre><span class=\"c\">// How loud?</span>\n<span class=\"k\">var</span> <span class=\"i\">volume</span> = <span class=\"n\">11</span>;\n\n<span class=\"c\">// Silence.</span>\n<span class=\"i\">volume</span> = <span class=\"n\">0</span>;\n\n<span class=\"c\">// Calculate size of 3x4x5 cuboid.</span>\n{\n  <span class=\"k\">var</span> <span class=\"i\">volume</span> = <span class=\"n\">3</span> * <span class=\"n\">4</span> * <span class=\"n\">5</span>;\n  <span class=\"k\">print</span> <span class=\"i\">volume</span>;\n}\n</pre></div>\n<p>Look at the block where we calculate the volume of the cuboid using a local\ndeclaration of <code>volume</code>. After the block exits, the interpreter will delete the\n<em>global</em> <code>volume</code> variable. That ain&rsquo;t right. When we exit the block, we should\nremove any variables declared inside the block, but if there is a variable with\nthe same name declared outside of the block, <em>that&rsquo;s a different variable</em>. It\nshouldn&rsquo;t get touched.</p>\n<p>When a local variable has the same name as a variable in an enclosing scope, it\n<strong>shadows</strong> the outer one. 
Code inside the block can&rsquo;t see it any more<span class=\"em\">&mdash;</span>it is\nhidden in the &ldquo;shadow&rdquo; cast by the inner one<span class=\"em\">&mdash;</span>but it&rsquo;s still there.</p>\n<p>When we enter a new block scope, we need to preserve variables defined in outer\nscopes so they are still around when we exit the inner block. We do that by\ndefining a fresh environment for each block containing only the variables\ndefined in that scope. When we exit the block, we discard its environment and\nrestore the previous one.</p>\n<p>We also need to handle enclosing variables that are <em>not</em> shadowed.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">global</span> = <span class=\"s\">&quot;outside&quot;</span>;\n{\n  <span class=\"k\">var</span> <span class=\"i\">local</span> = <span class=\"s\">&quot;inside&quot;</span>;\n  <span class=\"k\">print</span> <span class=\"i\">global</span> + <span class=\"i\">local</span>;\n}\n</pre></div>\n<p>Here, <code>global</code> lives in the outer global environment and <code>local</code> is defined\ninside the block&rsquo;s environment. In that <code>print</code> statement, both of those\nvariables are in scope. In order to find them, the interpreter must search not\nonly the current innermost environment, but also any enclosing ones.</p>\n<p>We implement this by <span name=\"cactus\">chaining</span> the environments\ntogether. Each environment has a reference to the environment of the immediately\nenclosing scope. When we look up a variable, we walk that chain from innermost\nout until we find the variable. 
Starting at the inner scope is how we make local\nvariables shadow outer ones.</p><img src=\"image/statements-and-state/chaining.png\" alt=\"Environments for each scope, linked together.\" />\n<aside name=\"cactus\">\n<p>While the interpreter is running, the environments form a linear list of\nobjects, but consider the full set of environments created during the entire\nexecution. An outer scope may have multiple blocks nested within it, and each\nwill point to the outer one, giving a tree-like structure, though only one path\nthrough the tree exists at a time.</p>\n<p>The boring name for this is a <a href=\"https://en.wikipedia.org/wiki/Parent_pointer_tree\"><strong>parent-pointer tree</strong></a>, but I\nmuch prefer the evocative <strong>cactus stack</strong>.</p><img class=\"above\" src=\"image/statements-and-state/cactus.png\" alt=\"Each branch points to its parent. The root is global scope.\" />\n</aside>\n<p>Before we add block syntax to the grammar, we&rsquo;ll beef up our Environment class\nwith support for this nesting. 
First, we give each environment a reference to\nits enclosing one.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">class Environment {\n</pre><div class=\"source-file\"><em>lox/Environment.java</em><br>\nin class <em>Environment</em></div>\n<pre class=\"insert\">  <span class=\"k\">final</span> <span class=\"t\">Environment</span> <span class=\"i\">enclosing</span>;\n</pre><pre class=\"insert-after\">  private final Map&lt;String, Object&gt; values = new HashMap&lt;&gt;();\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Environment.java</em>, in class <em>Environment</em></div>\n\n<p>This field needs to be initialized, so we add a couple of constructors.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Environment.java</em><br>\nin class <em>Environment</em></div>\n<pre>  <span class=\"t\">Environment</span>() {\n    <span class=\"i\">enclosing</span> = <span class=\"k\">null</span>;\n  }\n\n  <span class=\"t\">Environment</span>(<span class=\"t\">Environment</span> <span class=\"i\">enclosing</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">enclosing</span> = <span class=\"i\">enclosing</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Environment.java</em>, in class <em>Environment</em></div>\n\n<p>The no-argument constructor is for the global scope&rsquo;s environment, which ends\nthe chain. The other constructor creates a new local scope nested inside the\ngiven outer one.</p>\n<p>We don&rsquo;t have to touch the <code>define()</code> method<span class=\"em\">&mdash;</span>a new variable is always\ndeclared in the current innermost scope. But variable lookup and assignment work\nwith existing variables and they need to walk the chain to find them. 
First,\nlookup:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return values.get(name.lexeme);\n    }\n</pre><div class=\"source-file\"><em>lox/Environment.java</em><br>\nin <em>get</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">if</span> (<span class=\"i\">enclosing</span> != <span class=\"k\">null</span>) <span class=\"k\">return</span> <span class=\"i\">enclosing</span>.<span class=\"i\">get</span>(<span class=\"i\">name</span>);\n</pre><pre class=\"insert-after\">\n\n    throw new RuntimeError(name,\n        &quot;Undefined variable '&quot; + name.lexeme + &quot;'.&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Environment.java</em>, in <em>get</em>()</div>\n\n<p>If the variable isn&rsquo;t found in this environment, we simply try the enclosing\none. That in turn does the same thing <span name=\"recurse\">recursively</span>,\nso this will ultimately walk the entire chain. If we reach an environment with\nno enclosing one and still don&rsquo;t find the variable, then we give up and report\nan error as before.</p>\n<p>Assignment works the same way.</p>\n<aside name=\"recurse\">\n<p>It&rsquo;s likely faster to iteratively walk the chain, but I think the recursive\nsolution is prettier. 
We&rsquo;ll do something <em>much</em> faster in clox.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">      values.put(name.lexeme, value);\n      return;\n    }\n\n</pre><div class=\"source-file\"><em>lox/Environment.java</em><br>\nin <em>assign</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">enclosing</span> != <span class=\"k\">null</span>) {\n      <span class=\"i\">enclosing</span>.<span class=\"i\">assign</span>(<span class=\"i\">name</span>, <span class=\"i\">value</span>);\n      <span class=\"k\">return</span>;\n    }\n\n</pre><pre class=\"insert-after\">    throw new RuntimeError(name,\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Environment.java</em>, in <em>assign</em>()</div>\n\n<p>Again, if the variable isn&rsquo;t in this environment, it checks the outer one,\nrecursively.</p>\n<h3><a href=\"#block-syntax-and-semantics\" id=\"block-syntax-and-semantics\"><small>8&#8202;.&#8202;5&#8202;.&#8202;2</small>Block syntax and semantics</a></h3>\n<p>Now that Environments nest, we&rsquo;re ready to add blocks to the language. Behold\nthe grammar:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">statement</span>      → <span class=\"i\">exprStmt</span>\n               | <span class=\"i\">printStmt</span>\n               | <span class=\"i\">block</span> ;\n\n<span class=\"i\">block</span>          → <span class=\"s\">&quot;{&quot;</span> <span class=\"i\">declaration</span>* <span class=\"s\">&quot;}&quot;</span> ;\n</pre></div>\n<p>A block is a (possibly empty) series of statements or declarations surrounded by\ncurly braces. A block is itself a statement and can appear anywhere a statement\nis allowed. 
The <span name=\"block-ast\">syntax tree</span> node looks like this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    defineAst(outputDir, &quot;Stmt&quot;, Arrays.asList(\n</pre><div class=\"source-file\"><em>tool/GenerateAst.java</em><br>\nin <em>main</em>()</div>\n<pre class=\"insert\">      <span class=\"s\">&quot;Block      : List&lt;Stmt&gt; statements&quot;</span>,\n</pre><pre class=\"insert-after\">      &quot;Expression : Expr expression&quot;,\n</pre></div>\n<div class=\"source-file-narrow\"><em>tool/GenerateAst.java</em>, in <em>main</em>()</div>\n\n<aside name=\"block-ast\">\n<p>The generated code for the new node is in <a href=\"appendix-ii.html#block-statement\">Appendix II</a>.</p>\n</aside>\n<p><span name=\"generate\">It</span> contains the list of statements that are inside\nthe block. Parsing is straightforward. Like other statements, we detect the\nbeginning of a block by its leading token<span class=\"em\">&mdash;</span>in this case the <code>{</code>. 
In the\n<code>statement()</code> method, we add:</p>\n<aside name=\"generate\">\n<p>As always, don&rsquo;t forget to run &ldquo;GenerateAst.java&rdquo;.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">    if (match(PRINT)) return printStatement();\n</pre><div class=\"source-file\"><em>lox/Parser.java</em><br>\nin <em>statement</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"i\">LEFT_BRACE</span>)) <span class=\"k\">return</span> <span class=\"k\">new</span> <span class=\"t\">Stmt</span>.<span class=\"t\">Block</span>(<span class=\"i\">block</span>());\n</pre><pre class=\"insert-after\">\n\n    return expressionStatement();\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, in <em>statement</em>()</div>\n\n<p>All the real work happens here:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Parser.java</em><br>\nadd after <em>expressionStatement</em>()</div>\n<pre>  <span class=\"k\">private</span> <span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>&gt; <span class=\"i\">block</span>() {\n    <span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>&gt; <span class=\"i\">statements</span> = <span class=\"k\">new</span> <span class=\"t\">ArrayList</span>&lt;&gt;();\n\n    <span class=\"k\">while</span> (!<span class=\"i\">check</span>(<span class=\"i\">RIGHT_BRACE</span>) &amp;&amp; !<span class=\"i\">isAtEnd</span>()) {\n      <span class=\"i\">statements</span>.<span class=\"i\">add</span>(<span class=\"i\">declaration</span>());\n    }\n\n    <span class=\"i\">consume</span>(<span class=\"i\">RIGHT_BRACE</span>, <span class=\"s\">&quot;Expect &#39;}&#39; after block.&quot;</span>);\n    <span class=\"k\">return</span> <span class=\"i\">statements</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Parser.java</em>, add after <em>expressionStatement</em>()</div>\n\n<p>We <span name=\"list\">create</span> an 
empty list and then parse statements and\nadd them to the list until we reach the end of the block, marked by the closing\n<code>}</code>. Note that the loop also has an explicit check for <code>isAtEnd()</code>. We have to\nbe careful to avoid infinite loops, even when parsing invalid code. If the user\nforgets a closing <code>}</code>, the parser needs to not get stuck.</p>\n<aside name=\"list\">\n<p>Having <code>block()</code> return the raw list of statements and leaving it to\n<code>statement()</code> to wrap the list in a Stmt.Block looks a little odd. I did it that\nway because we&rsquo;ll reuse <code>block()</code> later for parsing function bodies and we don&rsquo;t\nwant that body wrapped in a Stmt.Block.</p>\n</aside>\n<p>That&rsquo;s it for syntax. For semantics, we add another visit method to Interpreter.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>execute</em>()</div>\n<pre>  <span class=\"a\">@Override</span>\n  <span class=\"k\">public</span> <span class=\"t\">Void</span> <span class=\"i\">visitBlockStmt</span>(<span class=\"t\">Stmt</span>.<span class=\"t\">Block</span> <span class=\"i\">stmt</span>) {\n    <span class=\"i\">executeBlock</span>(<span class=\"i\">stmt</span>.<span class=\"i\">statements</span>, <span class=\"k\">new</span> <span class=\"t\">Environment</span>(<span class=\"i\">environment</span>));\n    <span class=\"k\">return</span> <span class=\"k\">null</span>;\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>execute</em>()</div>\n\n<p>To execute a block, we create a new environment for the block&rsquo;s scope and pass\nit off to this other method:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>lox/Interpreter.java</em><br>\nadd after <em>execute</em>()</div>\n<pre>  <span class=\"t\">void</span> <span class=\"i\">executeBlock</span>(<span class=\"t\">List</span>&lt;<span class=\"t\">Stmt</span>&gt; 
<span class=\"i\">statements</span>,\n                    <span class=\"t\">Environment</span> <span class=\"i\">environment</span>) {\n    <span class=\"t\">Environment</span> <span class=\"i\">previous</span> = <span class=\"k\">this</span>.<span class=\"i\">environment</span>;\n    <span class=\"k\">try</span> {\n      <span class=\"k\">this</span>.<span class=\"i\">environment</span> = <span class=\"i\">environment</span>;\n\n      <span class=\"k\">for</span> (<span class=\"t\">Stmt</span> <span class=\"i\">statement</span> : <span class=\"i\">statements</span>) {\n        <span class=\"i\">execute</span>(<span class=\"i\">statement</span>);\n      }\n    } <span class=\"k\">finally</span> {\n      <span class=\"k\">this</span>.<span class=\"i\">environment</span> = <span class=\"i\">previous</span>;\n    }\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>lox/Interpreter.java</em>, add after <em>execute</em>()</div>\n\n<p>This new method executes a list of statements in the context of a given <span\nname=\"param\">environment</span>. Up until now, the <code>environment</code> field in\nInterpreter always pointed to the same environment<span class=\"em\">&mdash;</span>the global one. Now, that\nfield represents the <em>current</em> environment. That&rsquo;s the environment that\ncorresponds to the innermost scope containing the code to be executed.</p>\n<p>To execute code within a given scope, this method updates the interpreter&rsquo;s\n<code>environment</code> field, visits all of the statements, and then restores the\nprevious value. As is always good practice in Java, it restores the previous\nenvironment using a finally clause. That way it gets restored even if an\nexception is thrown.</p>\n<aside name=\"param\">\n<p>Manually changing and restoring a mutable <code>environment</code> field feels inelegant.\nAnother classic approach is to explicitly pass the environment as a parameter to\neach visit method. 
To &ldquo;change&rdquo; the environment, you pass a different one as you\nrecurse down the tree. You don&rsquo;t have to restore the old one, since the new one\nlives on the Java stack and is implicitly discarded when the interpreter returns\nfrom the block&rsquo;s visit method.</p>\n<p>I considered that for jlox, but it&rsquo;s kind of tedious and verbose adding an\nenvironment parameter to every single visit method. To keep the book a little\nsimpler, I went with the mutable field.</p>\n</aside>\n<p>Surprisingly, that&rsquo;s all we need to do in order to fully support local\nvariables, nesting, and shadowing. Go ahead and try this out:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;global a&quot;</span>;\n<span class=\"k\">var</span> <span class=\"i\">b</span> = <span class=\"s\">&quot;global b&quot;</span>;\n<span class=\"k\">var</span> <span class=\"i\">c</span> = <span class=\"s\">&quot;global c&quot;</span>;\n{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;outer a&quot;</span>;\n  <span class=\"k\">var</span> <span class=\"i\">b</span> = <span class=\"s\">&quot;outer b&quot;</span>;\n  {\n    <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"s\">&quot;inner a&quot;</span>;\n    <span class=\"k\">print</span> <span class=\"i\">a</span>;\n    <span class=\"k\">print</span> <span class=\"i\">b</span>;\n    <span class=\"k\">print</span> <span class=\"i\">c</span>;\n  }\n  <span class=\"k\">print</span> <span class=\"i\">a</span>;\n  <span class=\"k\">print</span> <span class=\"i\">b</span>;\n  <span class=\"k\">print</span> <span class=\"i\">c</span>;\n}\n<span class=\"k\">print</span> <span class=\"i\">a</span>;\n<span class=\"k\">print</span> <span class=\"i\">b</span>;\n<span class=\"k\">print</span> <span class=\"i\">c</span>;\n</pre></div>\n<p>Our little interpreter can remember things now. 
We are inching closer to\nsomething resembling a full-featured programming language.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>The REPL no longer supports entering a single expression and automatically\nprinting its result value. That&rsquo;s a drag. Add support to the REPL to let\nusers type in both statements and expressions. If they enter a statement,\nexecute it. If they enter an expression, evaluate it and display the result\nvalue.</p>\n</li>\n<li>\n<p>Maybe you want Lox to be a little more explicit about variable\ninitialization. Instead of implicitly initializing variables to <code>nil</code>, make\nit a runtime error to access a variable that has not been initialized or\nassigned to, as in:</p>\n<div class=\"codehilite\"><pre><span class=\"c\">// No initializers.</span>\n<span class=\"k\">var</span> <span class=\"i\">a</span>;\n<span class=\"k\">var</span> <span class=\"i\">b</span>;\n\n<span class=\"i\">a</span> = <span class=\"s\">&quot;assigned&quot;</span>;\n<span class=\"k\">print</span> <span class=\"i\">a</span>; <span class=\"c\">// OK, was assigned first.</span>\n\n<span class=\"k\">print</span> <span class=\"i\">b</span>; <span class=\"c\">// Error!</span>\n</pre></div>\n</li>\n<li>\n<p>What does the following program do?</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">1</span>;\n{\n  <span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"i\">a</span> + <span class=\"n\">2</span>;\n  <span class=\"k\">print</span> <span class=\"i\">a</span>;\n}\n</pre></div>\n<p>What did you <em>expect</em> it to do? Is it what you think it should do? What\ndoes analogous code in other languages you are familiar with do? 
What do\nyou think users will expect this to do?</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: Implicit Variable Declaration</a></h2>\n<p>Lox has distinct syntax for declaring a new variable and assigning to an\nexisting one. Some languages collapse those to only assignment syntax. Assigning\nto a non-existent variable automatically brings it into being. This is called\n<strong>implicit variable declaration</strong> and exists in Python, Ruby, and CoffeeScript,\namong others. JavaScript has an explicit syntax to declare variables, but can\nalso create new variables on assignment. Visual Basic has <a href=\"https://msdn.microsoft.com/en-us/library/xe53dz5w(v=vs.100).aspx\">an option to enable\nor disable implicit variables</a>.</p>\n<p>When the same syntax can assign or create a variable, each language must decide\nwhat happens when it isn&rsquo;t clear about which behavior the user intends. In\nparticular, each language must choose how implicit declaration interacts with\nshadowing, and which scope an implicitly declared variable goes into.</p>\n<ul>\n<li>\n<p>In Python, assignment always creates a variable in the current function&rsquo;s\nscope, even if there is a variable with the same name declared outside of\nthe function.</p>\n</li>\n<li>\n<p>Ruby avoids some ambiguity by having different naming rules for local and\nglobal variables. However, blocks in Ruby (which are more like closures than\nlike &ldquo;blocks&rdquo; in C) have their own scope, so it still has the problem.\nAssignment in Ruby assigns to an existing variable outside of the current\nblock if there is one with the same name. Otherwise, it creates a new\nvariable in the current block&rsquo;s scope.</p>\n</li>\n<li>\n<p>CoffeeScript, which takes after Ruby in many ways, is similar. 
It explicitly\ndisallows shadowing by saying that assignment always assigns to a variable\nin an outer scope if there is one, all the way up to the outermost global\nscope. Otherwise, it creates the variable in the current function scope.</p>\n</li>\n<li>\n<p>In JavaScript, assignment modifies an existing variable in any enclosing\nscope, if found. If not, it implicitly creates a new variable in the\n<em>global</em> scope.</p>\n</li>\n</ul>\n<p>The main advantage to implicit declaration is simplicity. There&rsquo;s less syntax\nand no &ldquo;declaration&rdquo; concept to learn. Users can just start assigning stuff and\nthe language figures it out.</p>\n<p>Older, statically typed languages like C benefit from explicit declaration\nbecause they give the user a place to tell the compiler what type each variable\nhas and how much storage to allocate for it. In a dynamically typed,\ngarbage-collected language, that isn&rsquo;t really necessary, so you can get away\nwith making declarations implicit. It feels a little more &ldquo;scripty&rdquo;, more &ldquo;you\nknow what I mean&rdquo;.</p>\n<p>But is that a good idea? Implicit declaration has some problems.</p>\n<ul>\n<li>\n<p>A user may intend to assign to an existing variable, but may have misspelled\nit. The interpreter doesn&rsquo;t know that, so it goes ahead and silently creates\nsome new variable and the variable the user wanted to assign to still has\nits old value. This is particularly heinous in JavaScript where a typo will\ncreate a <em>global</em> variable, which may in turn interfere with other code.</p>\n</li>\n<li>\n<p>JS, Ruby, and CoffeeScript use the presence of an existing variable with the\nsame name<span class=\"em\">&mdash;</span>even in an outer scope<span class=\"em\">&mdash;</span>to determine whether or not an\nassignment creates a new variable or assigns to an existing one. That means\nadding a new variable in a surrounding scope can change the meaning of\nexisting code. 
What was once a local variable may silently turn into an\nassignment to that new outer variable.</p>\n</li>\n<li>\n<p>In Python, you may <em>want</em> to assign to some variable outside of the current\nfunction instead of creating a new variable in the current one, but you\ncan&rsquo;t.</p>\n</li>\n</ul>\n<p>Over time, the languages I know with implicit variable declaration ended up\nadding more features and complexity to deal with these problems.</p>\n<ul>\n<li>\n<p>Implicit declaration of global variables in JavaScript is universally\nconsidered a mistake today. &ldquo;Strict mode&rdquo; disables it and makes it a runtime\nerror.</p>\n</li>\n<li>\n<p>Python added a <code>global</code> statement to let you explicitly assign to a global\nvariable from within a function. Later, as functional programming and nested\nfunctions became more popular, they added a similar <code>nonlocal</code> statement to\nassign to variables in enclosing functions.</p>\n</li>\n<li>\n<p>Ruby extended its block syntax to allow declaring certain variables to be\nexplicitly local to the block even if the same name exists in an outer\nscope.</p>\n</li>\n</ul>\n<p>Given those, I think the simplicity argument is mostly lost. There is an\nargument that implicit declaration is the right <em>default</em> but I personally find\nthat less compelling.</p>\n<p>My opinion is that implicit declaration made sense in years past when most\nscripting languages were heavily imperative and code was pretty flat. As\nprogrammers have gotten more comfortable with deep nesting, functional\nprogramming, and closures, it&rsquo;s become much more common to want access to\nvariables in outer scopes. 
That makes it more likely that users will run into\nthe tricky cases where it&rsquo;s not clear whether they intend their assignment to\ncreate a new variable or reuse a surrounding one.</p>\n<p>So I prefer explicitly declaring variables, which is why Lox requires it.</p>\n</div>\n\n<footer>\n<a href=\"control-flow.html\" class=\"next\">\n  Next Chapter: &ldquo;Control Flow&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/strings.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Strings &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Strings<small>19</small></a></h3>\n\n<ul>\n    <li><a href=\"#values-and-objects\"><small>19.1</small> Values and Objects</a></li>\n    <li><a href=\"#struct-inheritance\"><small>19.2</small> Struct Inheritance</a></li>\n    <li><a href=\"#strings\"><small>19.3</small> Strings</a></li>\n    <li><a href=\"#operations-on-strings\"><small>19.4</small> Operations on Strings</a></li>\n    <li><a href=\"#freeing-objects\"><small>19.5</small> Freeing Objects</a></li>\n    <li class=\"divider\"></li>\n    <li 
class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>String Encoding</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"types-of-values.html\" title=\"Types of Values\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"hash-tables.html\" title=\"Hash Tables\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"types-of-values.html\" title=\"Types of Values\" class=\"prev\">←</a>\n<a href=\"hash-tables.html\" title=\"Hash Tables\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Strings<small>19</small></a></h3>\n\n<ul>\n    <li><a href=\"#values-and-objects\"><small>19.1</small> Values and Objects</a></li>\n    <li><a href=\"#struct-inheritance\"><small>19.2</small> Struct Inheritance</a></li>\n    <li><a href=\"#strings\"><small>19.3</small> Strings</a></li>\n    <li><a href=\"#operations-on-strings\"><small>19.4</small> Operations on Strings</a></li>\n    <li><a href=\"#freeing-objects\"><small>19.5</small> Freeing Objects</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>String Encoding</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"types-of-values.html\" title=\"Types of Values\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"hash-tables.html\" title=\"Hash Tables\" 
class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">19</div>\n  <h1>Strings</h1>\n\n<blockquote>\n<p>&ldquo;Ah? A small aversion to menial labor?&rdquo; The doctor cocked an eyebrow.\n&ldquo;Understandable, but misplaced. One should treasure those hum-drum\ntasks that keep the body occupied but leave the mind and heart unfettered.&rdquo;</p>\n<p><cite>Tad Williams, <em>The Dragonbone Chair</em></cite></p>\n</blockquote>\n<p>Our little VM can represent three types of values right now: numbers, Booleans,\nand <code>nil</code>. Those types have two important things in common: they&rsquo;re immutable\nand they&rsquo;re small. Numbers are the largest, and they still fit into two 64-bit\nwords. That&rsquo;s a small enough price that we can afford to pay it for all values,\neven Booleans and nils which don&rsquo;t need that much space.</p>\n<p>Strings, unfortunately, are not so petite. There&rsquo;s no maximum length for a\nstring. Even if we were to artificially cap it at some contrived limit like\n<span name=\"pascal\">255</span> characters, that&rsquo;s still too much memory to spend\non every single value.</p>\n<aside name=\"pascal\">\n<p>UCSD Pascal, one of the first implementations of Pascal, had this exact limit.\nInstead of using a terminating null byte to indicate the end of the string like\nC, Pascal strings started with a length value. Since UCSD used only a single\nbyte to store the length, strings couldn&rsquo;t be any longer than 255 characters.</p><img src=\"image/strings/pstring.png\" alt=\"The Pascal string 'hello' with a length byte of 5 preceding it.\" />\n</aside>\n<p>We need a way to support values whose sizes vary, sometimes greatly. This is\nexactly what dynamic allocation on the heap is designed for. We can allocate as\nmany bytes as we need. 
We get back a pointer that we&rsquo;ll use to keep track of the\nvalue as it flows through the VM.</p>\n<h2><a href=\"#values-and-objects\" id=\"values-and-objects\"><small>19&#8202;.&#8202;1</small>Values and Objects</a></h2>\n<p>Using the heap for larger, variable-sized values and the stack for smaller,\natomic ones leads to a two-level representation. Every Lox value that you can\nstore in a variable or return from an expression will be a Value. For small,\nfixed-size types like numbers, the payload is stored directly inside the Value\nstruct itself.</p>\n<p>If the object is larger, its data lives on the heap. Then the Value&rsquo;s payload is\na <em>pointer</em> to that blob of memory. We&rsquo;ll eventually have a handful of\nheap-allocated types in clox: strings, instances, functions, you get the idea.\nEach type has its own unique data, but there is also state they all share that\n<a href=\"garbage-collection.html\">our future garbage collector</a> will use to manage their memory.</p><img src=\"image/strings/value.png\" class=\"wide\" alt=\"Field layout of number and obj values.\" />\n<p>We&rsquo;ll call this common representation <span name=\"short\">&ldquo;Obj&rdquo;</span>. Each Lox\nvalue whose state lives on the heap is an Obj. 
We can thus use a single new\nValueType case to refer to all heap-allocated types.</p>\n<aside name=\"short\">\n<p>&ldquo;Obj&rdquo; is short for &ldquo;object&rdquo;, natch.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">  VAL_NUMBER,\n</pre><div class=\"source-file\"><em>value.h</em><br>\nin enum <em>ValueType</em></div>\n<pre class=\"insert\">  <span class=\"a\">VAL_OBJ</span>\n</pre><pre class=\"insert-after\">} ValueType;\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em>, in enum <em>ValueType</em></div>\n\n<p>When a Value&rsquo;s type is <code>VAL_OBJ</code>, the payload is a pointer to the heap memory,\nso we add another case to the union for that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    double number;\n</pre><div class=\"source-file\"><em>value.h</em><br>\nin struct <em>Value</em></div>\n<pre class=\"insert\">    <span class=\"t\">Obj</span>* <span class=\"i\">obj</span>;\n</pre><pre class=\"insert-after\">  } as;<span name=\"as\"> </span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em>, in struct <em>Value</em></div>\n\n<p>As we did with the other value types, we crank out a couple of helpful macros\nfor working with Obj values.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define IS_NUMBER(value)  ((value).type == VAL_NUMBER)\n</pre><div class=\"source-file\"><em>value.h</em><br>\nadd after struct <em>Value</em></div>\n<pre class=\"insert\"><span class=\"a\">#define IS_OBJ(value)     ((value).type == VAL_OBJ)</span>\n</pre><pre class=\"insert-after\">\n\n#define AS_BOOL(value)    ((value).as.boolean)\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em>, add after struct <em>Value</em></div>\n\n<p>This evaluates to <code>true</code> if the given Value is an Obj. 
If so, we can use this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define IS_OBJ(value)     ((value).type == VAL_OBJ)\n\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define AS_OBJ(value)     ((value).as.obj)</span>\n</pre><pre class=\"insert-after\">#define AS_BOOL(value)    ((value).as.boolean)\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>It extracts the Obj pointer from the value. We can also go the other way.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define NUMBER_VAL(value) ((Value){VAL_NUMBER, {.number = value}})\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define OBJ_VAL(object)   ((Value){VAL_OBJ, {.obj = (Obj*)object}})</span>\n</pre><pre class=\"insert-after\">\n\ntypedef struct {\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>This takes a bare Obj pointer and wraps it in a full Value.</p>\n<h2><a href=\"#struct-inheritance\" id=\"struct-inheritance\"><small>19&#8202;.&#8202;2</small>Struct Inheritance</a></h2>\n<p>Every heap-allocated value is an Obj, but <span name=\"objs\">Objs</span> are\nnot all the same. For strings, we need the array of characters. When we get to\ninstances, they will need their data fields. A function object will need its\nchunk of bytecode. How do we handle different payloads and sizes? We can&rsquo;t use\nanother union like we did for Value since the sizes are all over the place.</p>\n<aside name=\"objs\">\n<p>No, I don&rsquo;t know how to pronounce &ldquo;objs&rdquo; either. Feels like there should be a\nvowel in there somewhere.</p>\n</aside>\n<p>Instead, we&rsquo;ll use another technique. It&rsquo;s been around for ages, to the point\nthat the C specification carves out specific support for it, but I don&rsquo;t know\nthat it has a canonical name. 
It&rsquo;s an example of <a href=\"https://en.wikipedia.org/wiki/Type_punning\"><em>type punning</em></a>, but that\nterm is too broad. In the absence of any better ideas, I&rsquo;ll call it <strong>struct\ninheritance</strong>, because it relies on structs and roughly follows how\nsingle inheritance of state works in object-oriented languages.</p>\n<p>Like a tagged union, each Obj starts with a tag field that identifies what kind\nof object it is<span class=\"em\">&mdash;</span>string, instance, etc. Following that are the payload fields.\nInstead of a union with cases for each type, each type is its own separate\nstruct. The tricky part is how to treat these structs uniformly since C has no\nconcept of inheritance or polymorphism. I&rsquo;ll explain that soon, but first let&rsquo;s\nget the preliminary stuff out of the way.</p>\n<p>The name &ldquo;Obj&rdquo; itself refers to a struct that contains the state shared across\nall object types. It&rsquo;s sort of like the &ldquo;base class&rdquo; for objects. 
Because of\nsome cyclic dependencies between values and objects, we forward-declare it in\nthe &ldquo;value&rdquo; module.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;common.h&quot;\n\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"k\">typedef</span> <span class=\"k\">struct</span> <span class=\"t\">Obj</span> <span class=\"t\">Obj</span>;\n\n</pre><pre class=\"insert-after\">typedef enum {\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>And the actual definition is in a new module.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.h</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#ifndef clox_object_h</span>\n<span class=\"a\">#define clox_object_h</span>\n\n<span class=\"a\">#include &quot;common.h&quot;</span>\n<span class=\"a\">#include &quot;value.h&quot;</span>\n\n<span class=\"k\">struct</span> <span class=\"t\">Obj</span> {\n  <span class=\"t\">ObjType</span> <span class=\"i\">type</span>;\n};\n\n<span class=\"a\">#endif</span>\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, create new file</div>\n\n<p>Right now, it contains only the type tag. Shortly, we&rsquo;ll add some other\nbookkeeping information for memory management. The type enum is this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;value.h&quot;\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">typedef</span> <span class=\"k\">enum</span> {\n  <span class=\"a\">OBJ_STRING</span>,\n} <span class=\"t\">ObjType</span>;\n</pre><pre class=\"insert-after\">\n\nstruct Obj {\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>Obviously, that will be more useful in later chapters after we add more\nheap-allocated types. 
Since we&rsquo;ll be accessing these tag types frequently, it&rsquo;s\nworth making a little macro that extracts the object type tag from a given\nValue.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;value.h&quot;\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define OBJ_TYPE(value)        (AS_OBJ(value)-&gt;type)</span>\n</pre><pre class=\"insert-after\">\n\ntypedef enum {\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>That&rsquo;s our foundation.</p>\n<p>Now, let&rsquo;s build strings on top of it. The payload for strings is defined in a\nseparate struct. Again, we need to forward-declare it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef struct Obj Obj;\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"k\">typedef</span> <span class=\"k\">struct</span> <span class=\"t\">ObjString</span> <span class=\"t\">ObjString</span>;\n</pre><pre class=\"insert-after\">\n\ntypedef enum {\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<p>The definition lives alongside Obj.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">};\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>Obj</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">struct</span> <span class=\"t\">ObjString</span> {\n  <span class=\"t\">Obj</span> <span class=\"i\">obj</span>;\n  <span class=\"t\">int</span> <span class=\"i\">length</span>;\n  <span class=\"t\">char</span>* <span class=\"i\">chars</span>;\n};\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>Obj</em></div>\n\n<p>A string object contains an array of characters. Those are stored in a separate,\nheap-allocated array so that we set aside only as much room as needed for each\nstring. 
We also store the number of bytes in the array. This isn&rsquo;t strictly\nnecessary but lets us tell how much memory is allocated for the string without\nwalking the character array to find the null terminator.</p>\n<p>Because ObjString is an Obj, it also needs the state all Objs share. It\naccomplishes that by having its first field be an Obj. C specifies that struct\nfields are arranged in memory in the order that they are declared. Also, when\nyou nest structs, the inner struct&rsquo;s fields are expanded right in place. So the\nmemory for Obj and for ObjString looks like this:</p><img src=\"image/strings/obj.png\" alt=\"The memory layout for the fields in Obj and ObjString.\" />\n<p>Note how the first bytes of ObjString exactly line up with Obj. This is not a\ncoincidence<span class=\"em\">&mdash;</span>C <span name=\"spec\">mandates</span> it. This is designed to\nenable a clever pattern: You can take a pointer to a struct and safely convert\nit to a pointer to its first field and back.</p>\n<aside name=\"spec\">\n<p>The key part of the spec is:</p>\n<blockquote>\n<p>&sect; 6.7.2.1 13</p>\n<p>Within a structure object, the non-bit-field members and the units in which\nbit-fields reside have addresses that increase in the order in which they\nare declared. A pointer to a structure object, suitably converted, points to\nits initial member (or if that member is a bit-field, then to the unit in\nwhich it resides), and vice versa. There may be unnamed padding within a\nstructure object, but not at its beginning.</p>\n</blockquote>\n</aside>\n<p>Given an <code>ObjString*</code>, you can safely cast it to <code>Obj*</code> and then access the\n<code>type</code> field from it. Every ObjString &ldquo;is&rdquo; an Obj in the OOP sense of &ldquo;is&rdquo;. When\nwe later add other object types, each struct will have an Obj as its first\nfield. 
Any code that wants to work with all objects can treat them as base\n<code>Obj*</code> and ignore any other fields that may happen to follow.</p>\n<p>You can go in the other direction too. Given an <code>Obj*</code>, you can &ldquo;downcast&rdquo; it to\nan <code>ObjString*</code>. Of course, you need to ensure that the <code>Obj*</code> pointer you have\ndoes point to the <code>obj</code> field of an actual ObjString. Otherwise, you are\nunsafely reinterpreting random bits of memory. To detect that such a cast is\nsafe, we add another macro.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define OBJ_TYPE(value)        (AS_OBJ(value)-&gt;type)\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define IS_STRING(value)       isObjType(value, OBJ_STRING)</span>\n</pre><pre class=\"insert-after\">\n\ntypedef enum {\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>It takes a Value, not a raw <code>Obj*</code>, because most code in the VM works with\nValues. 
It relies on this inline function:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">};\n\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>ObjString</em></div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"k\">inline</span> <span class=\"t\">bool</span> <span class=\"i\">isObjType</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>, <span class=\"t\">ObjType</span> <span class=\"i\">type</span>) {\n  <span class=\"k\">return</span> <span class=\"a\">IS_OBJ</span>(<span class=\"i\">value</span>) &amp;&amp; <span class=\"a\">AS_OBJ</span>(<span class=\"i\">value</span>)-&gt;<span class=\"i\">type</span> == <span class=\"i\">type</span>;\n}\n\n</pre><pre class=\"insert-after\">#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>ObjString</em></div>\n\n<p>Pop quiz: Why not just put the body of this function right in the macro? What&rsquo;s\ndifferent about this one compared to the others? Right, it&rsquo;s because the body\nuses <code>value</code> twice. A macro is expanded by inserting the argument <em>expression</em>\nevery place the parameter name appears in the body. If a macro uses a parameter\nmore than once, that expression gets evaluated multiple times.</p>\n<p>That&rsquo;s bad if the expression has side effects. If we put the body of\n<code>isObjType()</code> into the macro definition and then you did, say,</p>\n<div class=\"codehilite\"><pre><span class=\"a\">IS_STRING</span>(<span class=\"a\">POP</span>())\n</pre></div>\n<p>then it would pop two values off the stack! Using a function fixes that.</p>\n<p>As long as we ensure that we set the type tag correctly whenever we create an\nObj of some type, this macro will tell us when it&rsquo;s safe to cast a value to a\nspecific object type. 
We can do that using these:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define IS_STRING(value)       isObjType(value, OBJ_STRING)\n</pre><div class=\"source-file\"><em>object.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define AS_STRING(value)       ((ObjString*)AS_OBJ(value))</span>\n<span class=\"a\">#define AS_CSTRING(value)      (((ObjString*)AS_OBJ(value))-&gt;chars)</span>\n</pre><pre class=\"insert-after\">\n\ntypedef enum {\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em></div>\n\n<p>These two macros take a Value that is expected to contain a pointer to a valid\nObjString on the heap. The first one returns the <code>ObjString*</code> pointer. The\nsecond one steps through that to return the character array itself, since that&rsquo;s\noften what we&rsquo;ll end up needing.</p>\n<h2><a href=\"#strings\" id=\"strings\"><small>19&#8202;.&#8202;3</small>Strings</a></h2>\n<p>OK, our VM can now represent string values. It&rsquo;s time to add strings to the\nlanguage itself. As usual, we begin in the front end. 
The lexer already\ntokenizes string literals, so it&rsquo;s the parser&rsquo;s turn.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  [TOKEN_IDENTIFIER]    = {NULL,     NULL,   PREC_NONE},\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_STRING</span>]        = {<span class=\"i\">string</span>,   <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_NUMBER]        = {number,   NULL,   PREC_NONE},\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, replace 1 line</div>\n\n<p>When the parser hits a string token, it calls this parse function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>number</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">string</span>() {\n  <span class=\"i\">emitConstant</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">copyString</span>(<span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">start</span> + <span class=\"n\">1</span>,\n                                  <span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">length</span> - <span class=\"n\">2</span>)));\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>number</em>()</div>\n\n<p>This takes the string&rsquo;s characters <span name=\"escape\">directly</span> from the\nlexeme. 
The <code>+ 1</code> and <code>- 2</code> parts trim the leading and trailing quotation marks.\nIt then creates a string object, wraps it in a Value, and stuffs it into the\nconstant table.</p>\n<aside name=\"escape\">\n<p>If Lox supported string escape sequences like <code>\\n</code>, we&rsquo;d translate those here.\nSince it doesn&rsquo;t, we can take the characters as they are.</p>\n</aside>\n<p>To create the string, we use <code>copyString()</code>, which is declared in <code>object.h</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">};\n\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>ObjString</em></div>\n<pre class=\"insert\"><span class=\"t\">ObjString</span>* <span class=\"i\">copyString</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">chars</span>, <span class=\"t\">int</span> <span class=\"i\">length</span>);\n\n</pre><pre class=\"insert-after\">static inline bool isObjType(Value value, ObjType type) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>ObjString</em></div>\n\n<p>The compiler module needs to include that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define clox_compiler_h\n\n</pre><div class=\"source-file\"><em>compiler.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;object.h&quot;</span>\n</pre><pre class=\"insert-after\">#include &quot;vm.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.h</em></div>\n\n<p>Our &ldquo;object&rdquo; module gets an implementation file where we define the new\nfunction.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.c</em><br>\ncreate new file</div>\n<pre><span class=\"a\">#include &lt;stdio.h&gt;</span>\n<span class=\"a\">#include &lt;string.h&gt;</span>\n\n<span class=\"a\">#include &quot;memory.h&quot;</span>\n<span class=\"a\">#include &quot;object.h&quot;</span>\n<span class=\"a\">#include 
&quot;value.h&quot;</span>\n<span class=\"a\">#include &quot;vm.h&quot;</span>\n\n<span class=\"t\">ObjString</span>* <span class=\"i\">copyString</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">chars</span>, <span class=\"t\">int</span> <span class=\"i\">length</span>) {\n  <span class=\"t\">char</span>* <span class=\"i\">heapChars</span> = <span class=\"a\">ALLOCATE</span>(<span class=\"t\">char</span>, <span class=\"i\">length</span> + <span class=\"n\">1</span>);\n  <span class=\"i\">memcpy</span>(<span class=\"i\">heapChars</span>, <span class=\"i\">chars</span>, <span class=\"i\">length</span>);\n  <span class=\"i\">heapChars</span>[<span class=\"i\">length</span>] = <span class=\"s\">&#39;\\0&#39;</span>;\n  <span class=\"k\">return</span> <span class=\"i\">allocateString</span>(<span class=\"i\">heapChars</span>, <span class=\"i\">length</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, create new file</div>\n\n<p>First, we allocate a new array on the heap, just big enough for the string&rsquo;s\ncharacters and the trailing <span name=\"terminator\">terminator</span>, using\nthis low-level macro that allocates an array with a given element type and\ncount:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;common.h&quot;\n\n</pre><div class=\"source-file\"><em>memory.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#define ALLOCATE(type, count) \\</span>\n<span class=\"a\">    (type*)reallocate(NULL, 0, sizeof(type) * (count))</span>\n\n</pre><pre class=\"insert-after\">#define GROW_CAPACITY(capacity) \\\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.h</em></div>\n\n<p>Once we have the array, we copy over the characters from the lexeme and\nterminate it.</p>\n<aside name=\"terminator\" class=\"bottom\">\n<p>We need to terminate the string ourselves because the lexeme points at a range\nof characters inside the monolithic source string and 
isn&rsquo;t terminated.</p>\n<p>Since ObjString stores the length explicitly, we <em>could</em> leave the character\narray unterminated, but slapping a terminator on the end costs us only a byte\nand lets us pass the character array to C standard library functions that expect\na terminated string.</p>\n</aside>\n<p>You might wonder why the ObjString can&rsquo;t just point back to the original\ncharacters in the source string. Some ObjStrings will be created dynamically at\nruntime as a result of string operations like concatenation. Those strings\nobviously need to dynamically allocate memory for the characters, which means\nthe string needs to <em>free</em> that memory when it&rsquo;s no longer needed.</p>\n<p>If we had an ObjString for a string literal, and tried to free its character\narray that pointed into the original source code string, bad things would\nhappen. So, for literals, we preemptively copy the characters over to the heap.\nThis way, every ObjString reliably owns its character array and can free it.</p>\n<p>The real work of creating a string object happens in this function:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;vm.h&quot;\n\n</pre><div class=\"source-file\"><em>object.c</em></div>\n<pre class=\"insert\"><span class=\"k\">static</span> <span class=\"t\">ObjString</span>* <span class=\"i\">allocateString</span>(<span class=\"t\">char</span>* <span class=\"i\">chars</span>, <span class=\"t\">int</span> <span class=\"i\">length</span>) {\n  <span class=\"t\">ObjString</span>* <span class=\"i\">string</span> = <span class=\"a\">ALLOCATE_OBJ</span>(<span class=\"t\">ObjString</span>, <span class=\"a\">OBJ_STRING</span>);\n  <span class=\"i\">string</span>-&gt;<span class=\"i\">length</span> = <span class=\"i\">length</span>;\n  <span class=\"i\">string</span>-&gt;<span class=\"i\">chars</span> = <span class=\"i\">chars</span>;\n  <span class=\"k\">return</span> <span class=\"i\">string</span>;\n}\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>object.c</em></div>\n\n<p>It creates a new ObjString on the heap and then initializes its fields. It&rsquo;s\nsort of like a constructor in an OOP language. As such, it first calls the &ldquo;base\nclass&rdquo; constructor to initialize the Obj state, using a new macro.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;vm.h&quot;\n</pre><div class=\"source-file\"><em>object.c</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define ALLOCATE_OBJ(type, objectType) \\</span>\n<span class=\"a\">    (type*)allocateObject(sizeof(type), objectType)</span>\n</pre><pre class=\"insert-after\">\n\nstatic ObjString* allocateString(char* chars, int length) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em></div>\n\n<p><span name=\"factored\">Like</span> the previous macro, this exists mainly to\navoid the need to redundantly cast a <code>void*</code> back to the desired type. The\nactual functionality is here:</p>\n<aside name=\"factored\">\n<p>I admit this chapter has a sea of helper functions and macros to wade through. I\ntry to keep the code nicely factored, but that leads to a scattering of tiny\nfunctions. 
They will pay off when we reuse them later.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define ALLOCATE_OBJ(type, objectType) \\\n    (type*)allocateObject(sizeof(type), objectType)\n</pre><div class=\"source-file\"><em>object.c</em></div>\n<pre class=\"insert\">\n\n<span class=\"k\">static</span> <span class=\"t\">Obj</span>* <span class=\"i\">allocateObject</span>(<span class=\"t\">size_t</span> <span class=\"i\">size</span>, <span class=\"t\">ObjType</span> <span class=\"i\">type</span>) {\n  <span class=\"t\">Obj</span>* <span class=\"i\">object</span> = (<span class=\"t\">Obj</span>*)<span class=\"i\">reallocate</span>(<span class=\"a\">NULL</span>, <span class=\"n\">0</span>, <span class=\"i\">size</span>);\n  <span class=\"i\">object</span>-&gt;<span class=\"i\">type</span> = <span class=\"i\">type</span>;\n  <span class=\"k\">return</span> <span class=\"i\">object</span>;\n}\n</pre><pre class=\"insert-after\">\n\nstatic ObjString* allocateString(char* chars, int length) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em></div>\n\n<p>It allocates an object of the given size on the heap. Note that the size is\n<em>not</em> just the size of Obj itself. The caller passes in the number of bytes so\nthat there is room for the extra payload fields needed by the specific object\ntype being created.</p>\n<p>Then it initializes the Obj state<span class=\"em\">&mdash;</span>right now, that&rsquo;s just the type tag. This\nfunction returns to <code>allocateString()</code>, which finishes initializing the ObjString\nfields. <span name=\"viola\"><em>Voilà</em></span>, we can compile and execute string\nliterals.</p>\n<aside name=\"viola\"><img src=\"image/strings/viola.png\" class=\"above\" alt=\"A viola.\" />\n<p>Don&rsquo;t get &ldquo;voilà&rdquo; confused with &ldquo;viola&rdquo;. One means &ldquo;there it is&rdquo; and the other\nis a string instrument, the middle child between a violin and a cello. 
Yes, I\ndid spend two hours drawing a viola just to mention that.</p>\n</aside>\n<h2><a href=\"#operations-on-strings\" id=\"operations-on-strings\"><small>19&#8202;.&#8202;4</small>Operations on Strings</a></h2>\n<p>Our fancy strings are there, but they don&rsquo;t do much of anything yet. A good\nfirst step is to make the existing print code not barf on the new value type.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case VAL_NUMBER: printf(&quot;%g&quot;, AS_NUMBER(value)); break;\n</pre><div class=\"source-file\"><em>value.c</em><br>\nin <em>printValue</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">VAL_OBJ</span>: <span class=\"i\">printObject</span>(<span class=\"i\">value</span>); <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, in <em>printValue</em>()</div>\n\n<p>If the value is a heap-allocated object, it defers to a helper function over in\nthe &ldquo;object&rdquo; module.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">ObjString* copyString(const char* chars, int length);\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after <em>copyString</em>()</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">printObject</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>);\n</pre><pre class=\"insert-after\">\n\nstatic inline bool isObjType(Value value, ObjType type) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after <em>copyString</em>()</div>\n\n<p>The implementation looks like this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.c</em><br>\nadd after <em>copyString</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">printObject</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>) {\n  <span class=\"k\">switch</span> (<span class=\"a\">OBJ_TYPE</span>(<span 
class=\"i\">value</span>)) {\n    <span class=\"k\">case</span> <span class=\"a\">OBJ_STRING</span>:\n      <span class=\"i\">printf</span>(<span class=\"s\">&quot;%s&quot;</span>, <span class=\"a\">AS_CSTRING</span>(<span class=\"i\">value</span>));\n      <span class=\"k\">break</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, add after <em>copyString</em>()</div>\n\n<p>We have only a single object type now, but this function will sprout additional\nswitch cases in later chapters. For string objects, it simply <span\nname=\"term-2\">prints</span> the character array as a C string.</p>\n<aside name=\"term-2\">\n<p>I told you terminating the string would come in handy.</p>\n</aside>\n<p>The equality operators also need to gracefully handle strings. Consider:</p>\n<div class=\"codehilite\"><pre><span class=\"s\">&quot;string&quot;</span> == <span class=\"s\">&quot;string&quot;</span>\n</pre></div>\n<p>These are two separate string literals. The compiler will make two separate\ncalls to <code>copyString()</code>, create two distinct ObjString objects and store them as\ntwo constants in the chunk. They are different objects in the heap. But our\nusers (and thus we) expect strings to have value equality. The above expression\nshould evaluate to <code>true</code>. 
That requires a little special support.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case VAL_NUMBER: return AS_NUMBER(a) == AS_NUMBER(b);\n</pre><div class=\"source-file\"><em>value.c</em><br>\nin <em>valuesEqual</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">VAL_OBJ</span>: {\n      <span class=\"t\">ObjString</span>* <span class=\"i\">aString</span> = <span class=\"a\">AS_STRING</span>(<span class=\"i\">a</span>);\n      <span class=\"t\">ObjString</span>* <span class=\"i\">bString</span> = <span class=\"a\">AS_STRING</span>(<span class=\"i\">b</span>);\n      <span class=\"k\">return</span> <span class=\"i\">aString</span>-&gt;<span class=\"i\">length</span> == <span class=\"i\">bString</span>-&gt;<span class=\"i\">length</span> &amp;&amp;\n          <span class=\"i\">memcmp</span>(<span class=\"i\">aString</span>-&gt;<span class=\"i\">chars</span>, <span class=\"i\">bString</span>-&gt;<span class=\"i\">chars</span>,\n                 <span class=\"i\">aString</span>-&gt;<span class=\"i\">length</span>) == <span class=\"n\">0</span>;\n    }\n</pre><pre class=\"insert-after\">    default:         return false; // Unreachable.\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, in <em>valuesEqual</em>()</div>\n\n<p>If the two values are both strings, then they are equal if their character\narrays contain the same characters, regardless of whether they are two separate\nobjects or the exact same one. This does mean that string equality is slower\nthan equality on other types since it has to walk the whole string. We&rsquo;ll revise\nthat <a href=\"hash-tables.html\">later</a>, but this gives us the right semantics for now.</p>\n<p>Finally, in order to use <code>memcmp()</code> and the new stuff in the &ldquo;object&rdquo; module, we\nneed a couple of includes. 
Here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &lt;stdio.h&gt;\n</pre><div class=\"source-file\"><em>value.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &lt;string.h&gt;</span>\n</pre><pre class=\"insert-after\">\n\n#include &quot;memory.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em></div>\n\n<p>And here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &lt;string.h&gt;\n\n</pre><div class=\"source-file\"><em>value.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;object.h&quot;</span>\n</pre><pre class=\"insert-after\">#include &quot;memory.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em></div>\n\n<h3><a href=\"#concatenation\" id=\"concatenation\"><small>19&#8202;.&#8202;4&#8202;.&#8202;1</small>Concatenation</a></h3>\n<p>Full-grown languages provide lots of operations for working with strings<span class=\"em\">&mdash;</span>access to individual characters, the string&rsquo;s length, changing case, splitting,\njoining, searching, etc. When you implement your language, you&rsquo;ll likely want\nall that. But for this book, we keep things <em>very</em> minimal.</p>\n<p>The only interesting operation we support on strings is <code>+</code>. If you use that\noperator on two string objects, it produces a new string that&rsquo;s a concatenation\nof the two operands. Since Lox is dynamically typed, we can&rsquo;t tell which\nbehavior is needed at compile time because we don&rsquo;t know the types of the\noperands until runtime. 
Thus, the <code>OP_ADD</code> instruction dynamically inspects the\noperands and chooses the right operation.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_LESS:     BINARY_OP(BOOL_VAL, &lt;); break;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_ADD</span>: {\n        <span class=\"k\">if</span> (<span class=\"a\">IS_STRING</span>(<span class=\"i\">peek</span>(<span class=\"n\">0</span>)) &amp;&amp; <span class=\"a\">IS_STRING</span>(<span class=\"i\">peek</span>(<span class=\"n\">1</span>))) {\n          <span class=\"i\">concatenate</span>();\n        } <span class=\"k\">else</span> <span class=\"k\">if</span> (<span class=\"a\">IS_NUMBER</span>(<span class=\"i\">peek</span>(<span class=\"n\">0</span>)) &amp;&amp; <span class=\"a\">IS_NUMBER</span>(<span class=\"i\">peek</span>(<span class=\"n\">1</span>))) {\n          <span class=\"t\">double</span> <span class=\"i\">b</span> = <span class=\"a\">AS_NUMBER</span>(<span class=\"i\">pop</span>());\n          <span class=\"t\">double</span> <span class=\"i\">a</span> = <span class=\"a\">AS_NUMBER</span>(<span class=\"i\">pop</span>());\n          <span class=\"i\">push</span>(<span class=\"a\">NUMBER_VAL</span>(<span class=\"i\">a</span> + <span class=\"i\">b</span>));\n        } <span class=\"k\">else</span> {\n          <span class=\"i\">runtimeError</span>(\n              <span class=\"s\">&quot;Operands must be two numbers or two strings.&quot;</span>);\n          <span class=\"k\">return</span> <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>;\n        }\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_SUBTRACT: BINARY_OP(NUMBER_VAL, -); break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 1 line</div>\n\n<p>If both operands are strings, it concatenates. 
If they&rsquo;re both numbers, it adds\nthem. Any other <span name=\"convert\">combination</span> of operand types is a\nruntime error.</p>\n<aside name=\"convert\" class=\"bottom\">\n<p>This is more conservative than most languages. In other languages, if one\noperand is a string, the other can be any type and it will be implicitly\nconverted to a string before concatenating the two.</p>\n<p>I think that&rsquo;s a fine feature, but would require writing tedious &ldquo;convert to\nstring&rdquo; code for each type, so I left it out of Lox.</p>\n</aside>\n<p>To concatenate strings, we define a new function.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>isFalsey</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">concatenate</span>() {\n  <span class=\"t\">ObjString</span>* <span class=\"i\">b</span> = <span class=\"a\">AS_STRING</span>(<span class=\"i\">pop</span>());\n  <span class=\"t\">ObjString</span>* <span class=\"i\">a</span> = <span class=\"a\">AS_STRING</span>(<span class=\"i\">pop</span>());\n\n  <span class=\"t\">int</span> <span class=\"i\">length</span> = <span class=\"i\">a</span>-&gt;<span class=\"i\">length</span> + <span class=\"i\">b</span>-&gt;<span class=\"i\">length</span>;\n  <span class=\"t\">char</span>* <span class=\"i\">chars</span> = <span class=\"a\">ALLOCATE</span>(<span class=\"t\">char</span>, <span class=\"i\">length</span> + <span class=\"n\">1</span>);\n  <span class=\"i\">memcpy</span>(<span class=\"i\">chars</span>, <span class=\"i\">a</span>-&gt;<span class=\"i\">chars</span>, <span class=\"i\">a</span>-&gt;<span class=\"i\">length</span>);\n  <span class=\"i\">memcpy</span>(<span class=\"i\">chars</span> + <span class=\"i\">a</span>-&gt;<span class=\"i\">length</span>, <span class=\"i\">b</span>-&gt;<span class=\"i\">chars</span>, <span class=\"i\">b</span>-&gt;<span class=\"i\">length</span>);\n  <span class=\"i\">chars</span>[<span 
class=\"i\">length</span>] = <span class=\"s\">&#39;\\0&#39;</span>;\n\n  <span class=\"t\">ObjString</span>* <span class=\"i\">result</span> = <span class=\"i\">takeString</span>(<span class=\"i\">chars</span>, <span class=\"i\">length</span>);\n  <span class=\"i\">push</span>(<span class=\"a\">OBJ_VAL</span>(<span class=\"i\">result</span>));\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>isFalsey</em>()</div>\n\n<p>It&rsquo;s pretty verbose, as C code that works with strings tends to be. First, we\ncalculate the length of the result string based on the lengths of the operands.\nWe allocate a character array for the result and then copy the two halves in. As\nalways, we carefully ensure the string is terminated.</p>\n<p>In order to call <code>memcpy()</code>, the VM needs an include.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &lt;stdio.h&gt;\n</pre><div class=\"source-file\"><em>vm.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &lt;string.h&gt;</span>\n</pre><pre class=\"insert-after\">\n\n#include &quot;common.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em></div>\n\n<p>Finally, we produce an ObjString to contain those characters. 
This time we use a\nnew function, <code>takeString()</code>.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">};\n\n</pre><div class=\"source-file\"><em>object.h</em><br>\nadd after struct <em>ObjString</em></div>\n<pre class=\"insert\"><span class=\"t\">ObjString</span>* <span class=\"i\">takeString</span>(<span class=\"t\">char</span>* <span class=\"i\">chars</span>, <span class=\"t\">int</span> <span class=\"i\">length</span>);\n</pre><pre class=\"insert-after\">ObjString* copyString(const char* chars, int length);\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, add after struct <em>ObjString</em></div>\n\n<p>The implementation looks like this:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>object.c</em><br>\nadd after <em>allocateString</em>()</div>\n<pre><span class=\"t\">ObjString</span>* <span class=\"i\">takeString</span>(<span class=\"t\">char</span>* <span class=\"i\">chars</span>, <span class=\"t\">int</span> <span class=\"i\">length</span>) {\n  <span class=\"k\">return</span> <span class=\"i\">allocateString</span>(<span class=\"i\">chars</span>, <span class=\"i\">length</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.c</em>, add after <em>allocateString</em>()</div>\n\n<p>The previous <code>copyString()</code> function assumes it <em>cannot</em> take ownership of the\ncharacters you pass in. Instead, it conservatively creates a copy of the\ncharacters on the heap that the ObjString can own. That&rsquo;s the right thing for\nstring literals where the passed-in characters are in the middle of the source\nstring.</p>\n<p>But, for concatenation, we&rsquo;ve already dynamically allocated a character array on\nthe heap. Making another copy of that would be redundant (and would mean\n<code>concatenate()</code> has to remember to free its copy). 
Instead, this function claims\nownership of the string you give it.</p>\n<p>As usual, stitching this functionality together requires a couple of includes.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;debug.h&quot;\n</pre><div class=\"source-file\"><em>vm.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;object.h&quot;</span>\n<span class=\"a\">#include &quot;memory.h&quot;</span>\n</pre><pre class=\"insert-after\">#include &quot;vm.h&quot;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em></div>\n\n<h2><a href=\"#freeing-objects\" id=\"freeing-objects\"><small>19&#8202;.&#8202;5</small>Freeing Objects</a></h2>\n<p>Behold this innocuous-seeming expression:</p>\n<div class=\"codehilite\"><pre><span class=\"s\">&quot;st&quot;</span> + <span class=\"s\">&quot;ri&quot;</span> + <span class=\"s\">&quot;ng&quot;</span>\n</pre></div>\n<p>When the compiler chews through this, it allocates an ObjString for each of\nthose three string literals and stores them in the chunk&rsquo;s constant table and\ngenerates this <span name=\"stack\">bytecode</span>:</p>\n<aside name=\"stack\">\n<p>Here&rsquo;s what the stack looks like after each instruction:</p><img src=\"image/strings/stack.png\" alt=\"The state of the stack at each instruction.\" />\n</aside>\n<div class=\"codehilite\"><pre>0000    OP_CONSTANT         0 &quot;st&quot;\n0002    OP_CONSTANT         1 &quot;ri&quot;\n0004    OP_ADD\n0005    OP_CONSTANT         2 &quot;ng&quot;\n0007    OP_ADD\n0008    OP_RETURN\n</pre></div>\n<p>The first two instructions push <code>\"st\"</code> and <code>\"ri\"</code> onto the stack. Then the\n<code>OP_ADD</code> pops those and concatenates them. That dynamically allocates a new\n<code>\"stri\"</code> string on the heap. The VM pushes that and then pushes the <code>\"ng\"</code>\nconstant. 
The last <code>OP_ADD</code> pops <code>\"stri\"</code> and <code>\"ng\"</code>, concatenates them, and\npushes the result: <code>\"string\"</code>. Great, that&rsquo;s what we expect.</p>\n<p>But, wait. What happened to that <code>\"stri\"</code> string? We dynamically allocated it,\nthen the VM discarded it after concatenating it with <code>\"ng\"</code>. We popped it from\nthe stack and no longer have a reference to it, but we never freed its memory.\nWe&rsquo;ve got ourselves a classic memory leak.</p>\n<p>Of course, it&rsquo;s perfectly fine for the <em>Lox program</em> to forget about\nintermediate strings and not worry about freeing them. Lox automatically manages\nmemory on the user&rsquo;s behalf. The responsibility to manage memory doesn&rsquo;t\n<em>disappear</em>. Instead, it falls on our shoulders as VM implementers.</p>\n<p>The full <span name=\"borrowed\">solution</span> is a <a href=\"garbage-collection.html\">garbage collector</a> that\nreclaims unused memory while the program is running. We&rsquo;ve got some other stuff\nto get in place before we&rsquo;re ready to tackle that project. Until then, we are\nliving on borrowed time. The longer we wait to add the collector, the harder it\nis to do.</p>\n<aside name=\"borrowed\">\n<p>I&rsquo;ve seen a number of people implement large swathes of their language before\ntrying to start on the GC. For the kind of toy programs you typically run while\na language is being developed, you actually don&rsquo;t run out of memory before\nreaching the end of the program, so this gets you surprisingly far.</p>\n<p>But that underestimates how <em>hard</em> it is to add a garbage collector later. The\ncollector <em>must</em> ensure it can find every bit of memory that <em>is</em> still being\nused so that it doesn&rsquo;t collect live data. There are hundreds of places a\nlanguage implementation can squirrel away a reference to some object. 
If you\ndon&rsquo;t find all of them, you get nightmarish bugs.</p>\n<p>I&rsquo;ve seen language implementations die because it was too hard to get the GC in\nlater. If your language needs GC, get it working as soon as you can. It&rsquo;s a\ncrosscutting concern that touches the entire codebase.</p>\n</aside>\n<p>Today, we should at least do the bare minimum: avoid <em>leaking</em> memory by making\nsure the VM can still find every allocated object even if the Lox program itself\nno longer references them. There are many sophisticated techniques that advanced\nmemory managers use to allocate and track memory for objects. We&rsquo;re going to\ntake the simplest practical approach.</p>\n<p>We&rsquo;ll create a linked list that stores every Obj. The VM can traverse that\nlist to find every single object that has been allocated on the heap, whether or\nnot the user&rsquo;s program or the VM&rsquo;s stack still has a reference to it.</p>\n<p>We could define a separate linked list node struct but then we&rsquo;d have to\nallocate those too. Instead, we&rsquo;ll use an <strong>intrusive list</strong><span class=\"em\">&mdash;</span>the Obj struct\nitself will be the linked list node. 
Each Obj gets a pointer to the next Obj in\nthe chain.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">struct Obj {\n  ObjType type;\n</pre><div class=\"source-file\"><em>object.h</em><br>\nin struct <em>Obj</em></div>\n<pre class=\"insert\">  <span class=\"k\">struct</span> <span class=\"t\">Obj</span>* <span class=\"i\">next</span>;\n</pre><pre class=\"insert-after\">};\n</pre></div>\n<div class=\"source-file-narrow\"><em>object.h</em>, in struct <em>Obj</em></div>\n\n<p>The VM stores a pointer to the head of the list.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  Value* stackTop;\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nin struct <em>VM</em></div>\n<pre class=\"insert\">  <span class=\"t\">Obj</span>* <span class=\"i\">objects</span>;\n</pre><pre class=\"insert-after\">} VM;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, in struct <em>VM</em></div>\n\n<p>When we first initialize the VM, there are no allocated objects.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  resetStack();\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>initVM</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">vm</span>.<span class=\"i\">objects</span> = <span class=\"a\">NULL</span>;\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>initVM</em>()</div>\n\n<p>Every time we allocate an Obj, we insert it in the list.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  object-&gt;type = type;\n</pre><div class=\"source-file\"><em>object.c</em><br>\nin <em>allocateObject</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">object</span>-&gt;<span class=\"i\">next</span> = <span class=\"i\">vm</span>.<span class=\"i\">objects</span>;\n  <span class=\"i\">vm</span>.<span class=\"i\">objects</span> = <span class=\"i\">object</span>;\n</pre><pre class=\"insert-after\">  return object;\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>object.c</em>, in <em>allocateObject</em>()</div>\n\n<p>Since this is a singly linked list, the easiest place to insert it is as the\nhead. That way, we don&rsquo;t need to also store a pointer to the tail and keep it\nupdated.</p>\n<p>The &ldquo;object&rdquo; module is directly using the global <code>vm</code> variable from the &ldquo;vm&rdquo;\nmodule, so we need to expose that externally.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} InterpretResult;\n\n</pre><div class=\"source-file\"><em>vm.h</em><br>\nadd after enum <em>InterpretResult</em></div>\n<pre class=\"insert\"><span class=\"k\">extern</span> <span class=\"a\">VM</span> <span class=\"i\">vm</span>;\n\n</pre><pre class=\"insert-after\">void initVM();\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.h</em>, add after enum <em>InterpretResult</em></div>\n\n<p>Eventually, the garbage collector will free memory while the VM is still\nrunning. But, even then, there will usually be unused objects still lingering in\nmemory when the user&rsquo;s program completes. The VM should free those too.</p>\n<p>There&rsquo;s no sophisticated logic for that. Once the program is done, we can free\n<em>every</em> object. We can and should implement that now.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void freeVM() {\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>freeVM</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">freeObjects</span>();\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>freeVM</em>()</div>\n\n<p>That empty function we defined <a href=\"a-virtual-machine.html#an-instruction-execution-machine\">way back when</a> finally does something! 
It\ncalls this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void* reallocate(void* pointer, size_t oldSize, size_t newSize);\n</pre><div class=\"source-file\"><em>memory.h</em><br>\nadd after <em>reallocate</em>()</div>\n<pre class=\"insert\"><span class=\"t\">void</span> <span class=\"i\">freeObjects</span>();\n</pre><pre class=\"insert-after\">\n\n#endif\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.h</em>, add after <em>reallocate</em>()</div>\n\n<p>Here&rsquo;s how we free the objects:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>memory.c</em><br>\nadd after <em>reallocate</em>()</div>\n<pre><span class=\"t\">void</span> <span class=\"i\">freeObjects</span>() {\n  <span class=\"t\">Obj</span>* <span class=\"i\">object</span> = <span class=\"i\">vm</span>.<span class=\"i\">objects</span>;\n  <span class=\"k\">while</span> (<span class=\"i\">object</span> != <span class=\"a\">NULL</span>) {\n    <span class=\"t\">Obj</span>* <span class=\"i\">next</span> = <span class=\"i\">object</span>-&gt;<span class=\"i\">next</span>;\n    <span class=\"i\">freeObject</span>(<span class=\"i\">object</span>);\n    <span class=\"i\">object</span> = <span class=\"i\">next</span>;\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, add after <em>reallocate</em>()</div>\n\n<p>This is a CS 101 textbook implementation of walking a linked list and freeing\nits nodes. 
For each node, we call:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>memory.c</em><br>\nadd after <em>reallocate</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">freeObject</span>(<span class=\"t\">Obj</span>* <span class=\"i\">object</span>) {\n  <span class=\"k\">switch</span> (<span class=\"i\">object</span>-&gt;<span class=\"i\">type</span>) {\n    <span class=\"k\">case</span> <span class=\"a\">OBJ_STRING</span>: {\n      <span class=\"t\">ObjString</span>* <span class=\"i\">string</span> = (<span class=\"t\">ObjString</span>*)<span class=\"i\">object</span>;\n      <span class=\"a\">FREE_ARRAY</span>(<span class=\"t\">char</span>, <span class=\"i\">string</span>-&gt;<span class=\"i\">chars</span>, <span class=\"i\">string</span>-&gt;<span class=\"i\">length</span> + <span class=\"n\">1</span>);\n      <span class=\"a\">FREE</span>(<span class=\"t\">ObjString</span>, <span class=\"i\">object</span>);\n      <span class=\"k\">break</span>;\n    }\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em>, add after <em>reallocate</em>()</div>\n\n<p>We aren&rsquo;t only freeing the Obj itself. Since some object types also allocate\nother memory that they own, we also need a little type-specific code to handle\neach object type&rsquo;s special needs. Here, that means we free the character array\nand then free the ObjString. 
Those both use one last memory management macro.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    (type*)reallocate(NULL, 0, sizeof(type) * (count))\n</pre><div class=\"source-file\"><em>memory.h</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define FREE(type, pointer) reallocate(pointer, sizeof(type), 0)</span>\n</pre><pre class=\"insert-after\">\n\n#define GROW_CAPACITY(capacity) \\\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.h</em></div>\n\n<p>It&rsquo;s a tiny <span name=\"free\">wrapper</span> around <code>reallocate()</code> that\n&ldquo;resizes&rdquo; an allocation down to zero bytes.</p>\n<aside name=\"free\">\n<p>Using <code>reallocate()</code> to free memory might seem pointless. Why not just call\n<code>free()</code>? Later, this will help the VM track how much memory is still being\nused. If all allocation and freeing goes through <code>reallocate()</code>, it&rsquo;s easy to\nkeep a running count of the number of bytes of allocated memory.</p>\n</aside>\n<p>As usual, we need an include to wire everything together.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;common.h&quot;\n</pre><div class=\"source-file\"><em>memory.h</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;object.h&quot;</span>\n</pre><pre class=\"insert-after\">\n\n#define ALLOCATE(type, count) \\\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.h</em></div>\n\n<p>Then in the implementation file:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;memory.h&quot;\n</pre><div class=\"source-file\"><em>memory.c</em></div>\n<pre class=\"insert\"><span class=\"a\">#include &quot;vm.h&quot;</span>\n</pre><pre class=\"insert-after\">\n\nvoid* reallocate(void* pointer, size_t oldSize, size_t newSize) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>memory.c</em></div>\n\n<p>With this, our VM no longer leaks memory. 
Like a good C program, it cleans up\nits mess before exiting. But it doesn&rsquo;t free any objects while the VM is\nrunning. Later, when it&rsquo;s possible to write longer-running Lox programs, the VM\nwill eat more and more memory as it goes, not relinquishing a single byte until\nthe entire program is done.</p>\n<p>We won&rsquo;t address that until we&rsquo;ve added <a href=\"garbage-collection.html\">a real garbage collector</a>, but this\nis a big step. We now have the infrastructure to support a variety of different\nkinds of dynamically allocated objects. And we&rsquo;ve used that to add strings to\nclox, one of the most used types in most programming languages. Strings in turn\nenable us to build another fundamental data type, especially in dynamic\nlanguages: the venerable <a href=\"hash-tables.html\">hash table</a>. But that&rsquo;s for the next chapter<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Each string requires two separate dynamic allocations<span class=\"em\">&mdash;</span>one for the\nObjString and a second for the character array. Accessing the characters\nfrom a value requires two pointer indirections, which can be bad for\nperformance. A more efficient solution relies on a technique called\n<strong><a href=\"https://en.wikipedia.org/wiki/Flexible_array_member\">flexible array members</a></strong>. Use that to store the ObjString and its\ncharacter array in a single contiguous allocation.</p>\n</li>\n<li>\n<p>When we create the ObjString for each string literal, we copy the characters\nonto the heap. That way, when the string is later freed, we know it is safe\nto free the characters too.</p>\n<p>This is a simpler approach but wastes some memory, which might be a problem\non very constrained devices. 
Instead, we could keep track of which\nObjStrings own their character array and which are &ldquo;constant strings&rdquo; that\njust point back to the original source string or some other non-freeable\nlocation. Add support for this.</p>\n</li>\n<li>\n<p>If Lox was your language, what would you have it do when a user tries to use\n<code>+</code> with one string operand and the other some other type? Justify your\nchoice. What do other languages do?</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: String Encoding</a></h2>\n<p>In this book, I try not to shy away from the gnarly problems you&rsquo;ll run into in\na real language implementation. We might not always use the most <em>sophisticated</em>\nsolution<span class=\"em\">&mdash;</span>it&rsquo;s an intro book after all<span class=\"em\">&mdash;</span>but I don&rsquo;t think it&rsquo;s honest to\npretend the problem doesn&rsquo;t exist at all. However, I did skirt around one really\nnasty conundrum: deciding how to represent strings.</p>\n<p>There are two facets to a string encoding:</p>\n<ul>\n<li>\n<p><strong>What is a single &ldquo;character&rdquo; in a string?</strong> How many different values are\nthere and what do they represent? The first widely adopted standard answer\nto this was <a href=\"https://en.wikipedia.org/wiki/ASCII\">ASCII</a>. It gave you 128 different character values and\nspecified what they were. It was great<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>if you only ever cared about\nEnglish. While it has weird, mostly forgotten characters like &ldquo;record\nseparator&rdquo; and &ldquo;synchronous idle&rdquo;, it doesn&rsquo;t have a single umlaut, acute,\nor grave. 
It can&rsquo;t represent &ldquo;jalapeño&rdquo;, &ldquo;naïve&rdquo;, <span\nname=\"gruyere\">&ldquo;Gruyère&rdquo;</span>, or &ldquo;Mötley Crüe&rdquo;.</p>\n<aside name=\"gruyere\">\n<p>It goes without saying that a language that does not let one discuss Gruyère\nor Mötley Crüe is a language not worth using.</p>\n</aside>\n<p>Next came <a href=\"https://en.wikipedia.org/wiki/Unicode\">Unicode</a>. Initially, it supported 16,384 different characters\n(<strong>code points</strong>), which fit nicely in 16 bits with a couple of bits to\nspare. Later that grew and grew, and now there are well over 100,000\ndifferent code points including such vital instruments of human\ncommunication as 💩 (Unicode Character &lsquo;PILE OF POO&rsquo;, <code>U+1F4A9</code>).</p>\n<p>Even that long list of code points is not enough to represent each possible\nvisible glyph a language might support. To handle that, Unicode also has\n<strong>combining characters</strong> that modify a preceding code point. For example,\n&ldquo;a&rdquo; followed by the combining character &ldquo;¨&rdquo; gives you &ldquo;ä&rdquo;. (To make things\nmore confusing Unicode <em>also</em> has a single code point that looks like &ldquo;ä&rdquo;.)</p>\n<p>If a user accesses the fourth &ldquo;character&rdquo; in &ldquo;naïve&rdquo;, do they expect to get\nback &ldquo;v&rdquo; or &ldquo;¨&rdquo;? The former means they are thinking of each code\npoint and its combining character as a single unit<span class=\"em\">&mdash;</span>what Unicode calls an\n<strong>extended grapheme cluster</strong><span class=\"em\">&mdash;</span>the latter means they are thinking in\nindividual code points. Which do your users expect?</p>\n</li>\n<li>\n<p><strong>How is a single unit represented in memory?</strong> Most systems using ASCII\ngave a single byte to each character and left the high bit unused. Unicode\nhas a handful of common encodings. UTF-16 packs most code points into 16\nbits. 
That was great when every code point fit in that size. When that\noverflowed, they added <em>surrogate pairs</em> that use multiple 16-bit code units\nto represent a single code point. UTF-32 is the next evolution of\nUTF-16<span class=\"em\">&mdash;</span>it gives a full 32 bits to each and every code point.</p>\n<p>UTF-8 is more complex than either of those. It uses a variable number of\nbytes to encode a code point. Lower-valued code points fit in fewer bytes.\nSince each character may occupy a different number of bytes, you can&rsquo;t\ndirectly index into the string to find a specific code point. If you want,\nsay, the 10th code point, you don&rsquo;t know how many bytes into the string that\nis without walking and decoding all of the preceding ones.</p>\n</li>\n</ul>\n<p>Choosing a character representation and encoding involves fundamental\ntrade-offs. Like many things in engineering, there&rsquo;s no <span\nname=\"python\">perfect</span> solution:</p>\n<aside name=\"python\">\n<p>An example of how difficult this problem is comes from Python. The achingly long\ntransition from Python 2 to 3 is painful mostly because of its changes around\nstring encoding.</p>\n</aside>\n<ul>\n<li>\n<p>ASCII is memory efficient and fast, but it kicks non-Latin languages to the\nside.</p>\n</li>\n<li>\n<p>UTF-32 is fast and supports the whole Unicode range, but wastes a lot of\nmemory given that most code points do tend to be in the lower range of\nvalues, where a full 32 bits aren&rsquo;t needed.</p>\n</li>\n<li>\n<p>UTF-8 is memory efficient and supports the whole Unicode range, but its\nvariable-length encoding makes it slow to access arbitrary code points.</p>\n</li>\n<li>\n<p>UTF-16 is worse than all of them<span class=\"em\">&mdash;</span>an ugly consequence of Unicode\noutgrowing its earlier 16-bit range. It&rsquo;s less memory efficient than UTF-8\nbut is still a variable-length encoding thanks to surrogate pairs. Avoid it\nif you can. 
Alas, if your language needs to run on or interoperate with the\nbrowser, the JVM, or the CLR, you might be stuck with it, since those all\nuse UTF-16 for their strings and you don&rsquo;t want to have to convert every\ntime you pass a string to the underlying system.</p>\n</li>\n</ul>\n<p>One option is to take the maximal approach and do the &ldquo;rightest&rdquo; thing. Support\nall the Unicode code points. Internally, select an encoding for each string\nbased on its contents<span class=\"em\">&mdash;</span>use ASCII if every code point fits in a byte, UTF-16 if\nthere are no surrogate pairs, etc. Provide APIs to let users iterate over both\ncode points and extended grapheme clusters.</p>\n<p>This covers all your bases but is really complex. It&rsquo;s a lot to implement,\ndebug, and optimize. When serializing strings or interoperating with other\nsystems, you have to deal with all of the encodings. Users need to understand\nthe two indexing APIs and know which to use when. This is the approach that\nnewer, big languages tend to take<span class=\"em\">&mdash;</span>like Raku and Swift.</p>\n<p>A simpler compromise is to always encode using UTF-8 and only expose an API that\nworks with code points. For users that want to work with grapheme clusters, let\nthem use a third-party library for that. This is less Latin-centric than ASCII\nbut not much more complex. You lose fast direct indexing by code point, but you\ncan usually live without that or afford to make it <em>O(n)</em> instead of <em>O(1)</em>.</p>\n<p>If I were designing a big workhorse language for people writing large\napplications, I&rsquo;d probably go with the maximal approach. 
For my little embedded\nscripting language <a href=\"http://wren.io\">Wren</a>, I went with UTF-8 and code points.</p>\n</div>\n\n<footer>\n<a href=\"hash-tables.html\" class=\"next\">\n  Next Chapter: &ldquo;Hash Tables&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/style.css",
    "content": "@charset \"UTF-8\";\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-roman.woff\") format(\"woff\");\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-italic.woff\") format(\"woff\");\n  font-style: italic;\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-semibold.woff\") format(\"woff\");\n  font-weight: 600;\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-semibolditalic.woff\") format(\"woff\");\n  font-style: italic;\n  font-weight: 600;\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-bold.woff\") format(\"woff\");\n  font-weight: bold;\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-bolditalic.woff\") format(\"woff\");\n  font-style: italic;\n  font-weight: bold;\n}\nbody, h1, h2, h3, h4, p, blockquote, code, ul, ol, dl, dd, img {\n  margin: 0;\n}\n\nimg {\n  outline: none;\n}\n\nimg.arrow {\n  width: auto;\n  height: 11px;\n}\n\nimg.dot {\n  width: auto;\n  height: 18px;\n  vertical-align: text-bottom;\n}\n\nbody {\n  color: #222;\n  font: normal 16px/24px \"Crimson\", Georgia, serif;\n}\n\narticle.chapter h2 {\n  font: 600 30px/24px \"Crimson\", Georgia, serif;\n  margin: 69px 0 0 0;\n  padding-bottom: 3px;\n}\narticle.chapter h2 small {\n  font: 800 22px/24px \"Crimson\", Georgia, serif;\n  float: right;\n}\narticle.chapter h3 {\n  font: italic 24px/24px \"Crimson\", Georgia, serif;\n  margin: 71px 0 0 0;\n  padding-bottom: 1px;\n}\narticle.chapter h3 small {\n  font: 600 16px/24px \"Crimson\", Georgia, serif;\n  float: right;\n}\narticle.chapter h2 a, article.chapter h3 a {\n  color: #222;\n  border-bottom: none;\n}\narticle.chapter h2 a:hover, article.chapter h3 a:hover {\n  border-bottom: none;\n  color: inherit;\n}\narticle.chapter h2 a::before, article.chapter h3 a::before {\n  position: absolute;\n  left: -48px;\n  width: 48px;\n  content: \"§\";\n  color: #fff;\n  transition: color 0.2s ease;\n  
text-align: center;\n}\narticle.chapter h2 a:hover::before, article.chapter h3 a:hover::before {\n  color: #ddd;\n}\narticle.chapter .challenges, article.chapter .design-note {\n  border-radius: 3px;\n  padding: 12px;\n  margin: -2px -12px 26px -12px;\n  font: normal 16px/24px \"Source Sans Pro\", sans-serif;\n  color: #444;\n}\narticle.chapter .challenges h2, article.chapter .design-note h2 {\n  margin: 0 0 -12px 0;\n  padding: 0;\n  font: 600 16px/24px \"Source Sans Pro\", sans-serif;\n  text-transform: uppercase;\n  letter-spacing: 1px;\n}\narticle.chapter .challenges h2 a, article.chapter .design-note h2 a {\n  color: inherit;\n}\narticle.chapter .challenges h2 a::before, article.chapter .design-note h2 a::before {\n  content: none;\n}\narticle.chapter .challenges ol, article.chapter .design-note ol {\n  padding: 0 0 0 18px;\n}\narticle.chapter .challenges ol li, article.chapter .design-note ol li {\n  padding: 0 0 0 6px;\n  font-weight: 600;\n}\narticle.chapter .challenges ol li p, article.chapter .design-note ol li p {\n  font-weight: 400;\n}\narticle.chapter .challenges pre, article.chapter .design-note pre {\n  margin: 0;\n}\narticle.chapter .challenges > blockquote p, article.chapter .design-note > blockquote p {\n  margin: 0 24px;\n  font: italic 16px/24px \"Source Sans Pro\", sans-serif;\n  color: #444;\n}\narticle.chapter .challenges > blockquote::before, article.chapter .challenges > blockquote::after, article.chapter .design-note > blockquote::before, article.chapter .design-note > blockquote::after {\n  content: none;\n}\narticle.chapter .challenges aside code, article.chapter .challenges aside .codehilite, article.chapter .design-note aside code, article.chapter .design-note aside .codehilite {\n  color: #595959;\n  background: #faf8f5;\n}\narticle.chapter .challenges *:last-child, article.chapter .design-note *:last-child {\n  margin-bottom: 0;\n}\narticle.chapter .challenges .codehilite,\narticle.chapter .design-note .codehilite {\n  margin: -12px 
0 -12px 0;\n}\narticle.chapter .challenges {\n  background: #eef4f7;\n}\narticle.chapter .challenges code, article.chapter .challenges .codehilite {\n  background: #e4eef1;\n}\narticle.chapter .design-note {\n  background: #f6f8f2;\n}\narticle.chapter .design-note code, article.chapter .design-note .codehilite {\n  background: #eef1ea;\n}\narticle.chapter table {\n  width: 100%;\n  border-collapse: collapse;\n}\narticle.chapter table thead {\n  font: 700 15px \"Crimson\", Georgia, serif;\n}\narticle.chapter table td {\n  border-bottom: solid 1px #dee9ed;\n  line-height: 22px;\n  padding: 3px 0 0 0;\n  margin: 0;\n}\narticle.chapter table td + td {\n  padding-left: 12px;\n}\n\n@media only screen and (max-width: 960px) {\n  article.chapter .challenges aside, article.chapter .design-note aside {\n    font: normal 15px/24px \"Source Sans Pro\", sans-serif;\n    padding-bottom: 4px;\n  }\n  article.chapter .challenges aside code, article.chapter .challenges aside .codehilite {\n    background: #e4eef1;\n  }\n  article.chapter .design-note aside code, article.chapter .design-note aside .codehilite {\n    background: #eef1ea;\n  }\n}\n@media only screen and (max-width: 630px) {\n  article.chapter h2 a::before, article.chapter h3 a::before {\n    left: -24px;\n    width: 24px;\n  }\n}\n@media only screen and (max-width: 580px) {\n  article.chapter h2 {\n    margin-top: 64px;\n    padding-bottom: 2px;\n    font-size: 22px;\n    line-height: 22px;\n  }\n  article.chapter h3 {\n    margin-top: 64px;\n    padding-bottom: 0;\n    font-size: 20px;\n  }\n  article.chapter .challenges, article.chapter .design-note {\n    padding: 11px 11px 8px 11px;\n    margin: 25px 0 0 0;\n    font-size: 15px;\n    line-height: 22px;\n  }\n  article.chapter .challenges code, article.chapter .challenges .codehilite, article.chapter .design-note code, article.chapter .design-note .codehilite {\n    font-size: 14px;\n  }\n  article.chapter .challenges h2, article.chapter .design-note h2 {\n    
padding: 5px 0 4px 6px;\n    font-size: 17px;\n    line-height: 22px;\n  }\n  article.chapter .challenges aside, article.chapter .design-note aside {\n    line-height: 22px;\n  }\n}\narticle.contents h2 {\n  margin: 22px 0 6px 0;\n  font: 600 normal 18px/24px \"Source Sans Pro\", sans-serif;\n  text-transform: uppercase;\n  letter-spacing: 1px;\n}\narticle.contents h2 .num {\n  display: inline-block;\n  width: 36px;\n}\narticle.contents ul {\n  margin: -12px 0 0 0;\n  padding: 6px 0 14px 0;\n}\narticle.contents li {\n  padding: 12px 0 0 36px;\n  font: normal 16px/24px \"Source Sans Pro\", sans-serif;\n  color: #7aa0b8;\n  list-style-type: none;\n}\narticle.contents li .num {\n  display: inline-block;\n  letter-spacing: 1px;\n  width: 36px;\n}\narticle.contents li a {\n  font: 600 17px/24px \"Source Sans Pro\", sans-serif;\n}\narticle.contents li.design-note {\n  padding-top: 0;\n}\narticle.contents li.design-note a {\n  font: 400 16px/23px \"Source Sans Pro\", sans-serif;\n}\narticle.contents .chapters {\n  display: table;\n  width: 864px;\n}\narticle.contents .row {\n  display: table-row;\n}\narticle.contents .first, article.contents .second {\n  display: table-cell;\n  vertical-align: top;\n}\narticle.contents .second {\n  padding-left: 48px;\n}\narticle.contents footer {\n  width: 864px;\n}\n\n@media only screen and (max-width: 1344px) {\n  article.contents .chapters, article.contents .row, article.contents .first, article.contents .second {\n    display: block;\n    width: auto;\n  }\n  article.contents .second {\n    padding-left: 0;\n  }\n  article.contents footer {\n    width: inherit;\n  }\n}\n@media only screen and (max-width: 630px) {\n  article.contents h2 .num, article.contents li .num {\n    width: 28px;\n  }\n  article.contents ol, article.contents ul {\n    margin-left: 0;\n  }\n  article.contents li {\n    padding-left: 0;\n  }\n}\n@media only screen and (max-width: 580px) {\n  article.contents h2 {\n    margin: 19px 0 6px 0;\n    font-size: 17px;\n 
   line-height: 22px;\n  }\n  article.contents h3 {\n    padding: 1px 0 2px 0;\n    font-size: 17px;\n    line-height: 22px;\n  }\n  article.contents p {\n    font-size: 15px;\n    line-height: 22px;\n  }\n  article.contents ol, article.contents ul {\n    padding-bottom: 8px;\n  }\n  article.contents li {\n    font-size: 14px;\n    line-height: 22px;\n    padding: 4px 0 3px 0;\n  }\n}\n.sign-up {\n  padding: 12px;\n  margin: 24px 0 24px 0;\n  background: #fcf6e8;\n  color: #bf9540;\n  border-radius: 3px;\n}\n.sign-up form {\n  display: flex;\n}\n.sign-up input {\n  padding: 4px;\n  font: 16px \"Source Sans Pro\", sans-serif;\n  outline: none;\n  border-radius: 3px;\n  border: solid 2px #ffd580;\n  color: #825e17;\n  height: 32px;\n}\n.sign-up input.email {\n  display: block;\n  box-sizing: border-box;\n  width: 100%;\n}\n.sign-up input.button {\n  margin-left: 8px;\n  padding: 4px 8px;\n  font: 600 13px \"Source Sans Pro\", sans-serif;\n  text-transform: uppercase;\n  letter-spacing: 1px;\n  background: #ffbb33;\n  border: none;\n  transition: background-color 0.2s ease;\n}\n.sign-up input.button:hover {\n  background: #ffd580;\n}\n.sign-up input:focus {\n  border-color: #ffaa00;\n}\n\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-roman.woff\") format(\"woff\");\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-italic.woff\") format(\"woff\");\n  font-style: italic;\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-semibold.woff\") format(\"woff\");\n  font-weight: 600;\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-semibolditalic.woff\") format(\"woff\");\n  font-style: italic;\n  font-weight: 600;\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-bold.woff\") format(\"woff\");\n  font-weight: bold;\n}\n@font-face {\n  font-family: \"Crimson\";\n  src: url(\"font/crimson-bolditalic.woff\") format(\"woff\");\n  font-style: italic;\n  font-weight: 
bold;\n}\nbody, h1, h2, h3, h4, p, blockquote, code, ul, ol, dl, dd, img {\n  margin: 0;\n}\n\nimg {\n  outline: none;\n}\n\nimg.arrow {\n  width: auto;\n  height: 11px;\n}\n\nimg.dot {\n  width: auto;\n  height: 18px;\n  vertical-align: text-bottom;\n}\n\nbody {\n  color: #222;\n  font: normal 16px/24px \"Crimson\", Georgia, serif;\n}\n\n@media print {\n  body, a, code {\n    color: #000 !important;\n    background: none !important;\n  }\n\n  nav, .sign-up {\n    display: none;\n  }\n\n  .page {\n    margin: 0 !important;\n  }\n\n  .codehilite {\n    margin: 0 !important;\n    background: none !important;\n    border-radius: 0 !important;\n    border-left: solid 1px #dad8d6;\n    border-right: solid 1px #dad8d6;\n  }\n  .codehilite pre {\n    color: #000 !important;\n  }\n  .codehilite .insert {\n    border-left: solid 3px #dad8d6 !important;\n    border-right: solid 3px #dad8d6 !important;\n    background: none !important;\n  }\n  .codehilite .delete {\n    -webkit-print-color-adjust: exact;\n    color-adjust: exact;\n  }\n  .codehilite .insert-before span, .codehilite .insert-after span {\n    -webkit-print-color-adjust: exact;\n    color-adjust: exact;\n  }\n}\n.emdash {\n  white-space: nowrap;\n}\n\n.scrim {\n  position: absolute;\n  width: 100%;\n  height: 10000px;\n  z-index: 4;\n  background: url(\"rows.png\");\n}\n\n.small-caps {\n  font-weight: 600;\n  font-size: 13px;\n}\n\na {\n  color: #1481b8;\n  text-decoration: none;\n  border-bottom: solid 1px rgba(222, 233, 237, 0);\n  transition: color 0.2s ease, border-color 0.4s ease;\n}\n\na:hover {\n  color: #1481b8;\n  border-bottom: solid 1px #dee9ed;\n}\n\nnav {\n  font: 300 15px/24px \"Source Sans Pro\", sans-serif;\n  background: #29313d;\n  color: #4b6781;\n}\nnav a, nav h2 a {\n  color: #7aa0b8;\n  text-decoration: none;\n  border-bottom: none;\n}\nnav a:hover {\n  color: #dee9ed;\n  text-decoration: none;\n  border-bottom: none;\n}\nnav img {\n  box-sizing: border-box;\n  width: 100%;\n  padding: 55px 
48px 23px 48px;\n}\nnav h2 {\n  font: 400 16px/24px \"Source Sans Pro\", sans-serif;\n  text-transform: uppercase;\n  letter-spacing: 1px;\n  color: #7aa0b8;\n}\nnav h3 {\n  font: 400 18px/24px \"Source Sans Pro\", sans-serif;\n  color: #7aa0b8;\n}\nnav h2 small, nav h3 small {\n  float: right;\n  font-size: 16px;\n  color: #4b6781;\n}\nnav ol, nav ul {\n  margin: 6px 0 3px 0;\n  padding: 6px 0 4px 24px;\n  border-top: solid 1px #3b4b5e;\n  border-bottom: solid 1px #3b4b5e;\n}\nnav ul {\n  list-style-type: none;\n  padding-left: 0;\n}\nnav hr {\n  border: none;\n  border-top: solid 1px #3b4b5e;\n  margin: 6px 0 0 0;\n  padding: 0 0 3px 0;\n}\nnav li small {\n  float: right;\n  font-size: 14px;\n  color: #4b6781;\n}\nnav li.divider {\n  margin: 5px 0 7px 0;\n  border-top: solid 1px #3b4b5e;\n}\nnav li.end-part {\n  font-size: 12px;\n  font-weight: 400;\n  text-transform: uppercase;\n  letter-spacing: 1px;\n}\nnav li.end-part small {\n  font-weight: 300;\n  text-transform: none;\n  letter-spacing: 0;\n}\nnav .prev-next {\n  padding-top: 7px;\n  font: 400 12px/18px \"Source Sans Pro\", sans-serif;\n  text-align: center;\n  text-transform: uppercase;\n  letter-spacing: 1px;\n}\n\nnav.wide {\n  position: fixed;\n  width: 336px;\n  height: 100%;\n}\nnav.wide .contents {\n  margin: 24px 48px;\n}\n\n.nav-wrapper {\n  position: absolute;\n  right: 288px;\n}\n\nnav.floating {\n  display: none;\n  z-index: 2;\n  position: absolute;\n  width: 288px;\n  border-bottom-left-radius: 3px;\n  border-bottom-right-radius: 3px;\n}\nnav.floating #expand-nav {\n  padding: 0 0 4px 0;\n  display: block;\n  font-size: 20px;\n  text-align: center;\n  color: #4b6781;\n  cursor: pointer;\n  transition: padding 0.2s ease, margin 0.2s ease, color 0.2s ease;\n}\nnav.floating #expand-nav, nav.floating #expand-nav:hover {\n  border-bottom: none;\n}\nnav.floating #expand-nav:hover {\n  color: #dee9ed;\n}\nnav.floating .expandable {\n  overflow: hidden;\n  padding: 0 12px;\n  max-height: 0;\n  
transition: margin 0.2s ease, max-height 1s ease;\n}\nnav.floating .expandable .prev-next {\n  padding-bottom: 6px;\n}\nnav.floating .expandable.shown {\n  max-height: 550px;\n}\nnav.floating img {\n  padding: 110px 24px 23px 24px;\n}\n\nnav.floating.pinned {\n  position: fixed;\n  top: -85px;\n}\nnav.floating.pinned .expandable {\n  margin-top: -13px;\n}\nnav.floating.pinned #expand-nav {\n  margin-top: -14px;\n}\n\nnav.narrow {\n  display: none;\n  text-align: center;\n}\nnav.narrow img {\n  box-sizing: content-box;\n  padding: 11px 0 3px 0;\n  width: auto;\n  height: 27px;\n}\nnav.narrow .prev, nav.narrow .next {\n  font-size: 32px;\n  position: absolute;\n  top: 12px;\n  padding: 0 48px;\n}\nnav.narrow .prev {\n  left: 0;\n}\nnav.narrow .next {\n  right: 0;\n}\n\n.left {\n  float: left;\n}\n\n.right {\n  float: right;\n}\n\n.page {\n  position: relative;\n  width: 912px;\n  margin: 0 auto 0 384px;\n}\n\n.em {\n  padding: 0 0.1em;\n  white-space: nowrap;\n}\n\n.ellipse {\n  white-space: nowrap;\n}\n\ncode {\n  font: normal 16px \"Source Code Pro\", Menlo, Consolas, Monaco, monospace;\n  color: #717170;\n  white-space: nowrap;\n  padding: 2px;\n}\n\nstrong code {\n  font-weight: bold;\n  color: inherit;\n}\n\na code {\n  color: #1481b8;\n}\n\n.codehilite {\n  color: #595959;\n  background: #faf8f5;\n  border-radius: 3px;\n  padding: 12px;\n  margin: -12px;\n}\n\npre {\n  font: normal 13px/20px \"Source Code Pro\", Menlo, Consolas, Monaco, monospace;\n  margin: 0;\n  padding: 0;\n  white-space: pre-wrap;\n  overflow-wrap: anywhere;\n}\n\ndiv.codehilite + div.challenges {\n  margin-top: 24px;\n}\n\narticle {\n  position: relative;\n  width: 576px;\n}\narticle h1 {\n  position: relative;\n  font: 48px/48px \"Crimson\", Georgia, serif;\n  padding: 109px 0 19px 0;\n  z-index: 2;\n}\narticle h1.part {\n  font: 600 36px/48px \"Source Sans Pro\", sans-serif;\n  padding: 108px 0 20px 0;\n  text-transform: uppercase;\n  letter-spacing: 1px;\n}\narticle .number {\n  
position: absolute;\n  top: 50px;\n  left: 624px;\n  z-index: 1;\n  font: 300 96px \"Source Sans Pro\", sans-serif;\n  color: #dee9ed;\n}\narticle p {\n  margin: 24px 0;\n}\narticle ol, article ul {\n  margin: 24px 0;\n  padding: 0 0 0 24px;\n}\narticle img {\n  max-width: 100%;\n}\narticle img.wide {\n  max-width: none;\n  width: 912px;\n}\n\naside {\n  position: absolute;\n  right: -336px;\n  width: 288px;\n  font: normal 14px/20px \"Crimson\", Georgia, serif;\n  border-top: solid 1px #dee9ed;\n}\naside p {\n  margin: 20px 0;\n}\naside p:first-child,\naside img:first-child {\n  margin-top: 4px;\n}\naside p:last-child {\n  margin-bottom: 4px;\n}\naside code {\n  font-size: 14px;\n  border-radius: 2px;\n  padding: 1px 2px;\n}\naside .codehilite {\n  padding: 6px;\n  margin: -12px 0;\n}\naside .codehilite:last-child {\n  margin-bottom: 4px;\n}\naside img.above {\n  position: absolute;\n  bottom: 100%;\n  margin-bottom: 16px;\n}\naside blockquote {\n  margin: 20px 0;\n}\naside blockquote::before, aside blockquote::after {\n  content: none;\n}\naside blockquote p {\n  margin: 0 12px;\n  font: italic 15px/20px \"Crimson\", Georgia, serif;\n  color: inherit;\n}\n\naside.bottom {\n  border-top: none;\n  border-bottom: solid 1px #dee9ed;\n}\n\nblockquote {\n  position: relative;\n  margin: 29px 0 31px 0;\n}\nblockquote::before, blockquote::after {\n  position: absolute;\n  top: -20px;\n  font: italic 72px \"Crimson\", Georgia, serif;\n  color: #dee9ed;\n}\nblockquote::before {\n  content: \"“\";\n  left: -7px;\n}\nblockquote::after {\n  content: \"”\";\n  right: 8px;\n}\nblockquote p {\n  margin: 0 48px;\n  font: italic 24px/36px \"Crimson\", Georgia, serif;\n  color: #5985a6;\n}\nblockquote p em {\n  font-style: normal;\n}\nblockquote cite {\n  display: block;\n  text-align: right;\n  color: #7aa0b8;\n  font-style: normal;\n  font-size: 18px;\n}\nblockquote cite::before {\n  content: \"— \";\n  color: #dee9ed;\n}\nblockquote cite em {\n  font-style: italic;\n}\n\nfooter 
{\n  position: relative;\n  border-top: solid 1px #dee9ed;\n  color: #7aa0b8;\n  font: 400 15px \"Source Sans Pro\", sans-serif;\n  text-align: center;\n  margin: 48px 0;\n  padding-top: 48px;\n}\nfooter a, footer a:hover {\n  border: none;\n}\nfooter .next {\n  position: absolute;\n  right: 0;\n  top: -13px;\n  padding-left: 4px;\n  background: #fff;\n  font: 400 17px/24px \"Source Sans Pro\", sans-serif;\n  text-transform: uppercase;\n  letter-spacing: 1px;\n}\nfooter .next:hover {\n  color: #004466;\n  border: none;\n}\n\n.dedication {\n  margin: 96px 0 128px 0;\n  text-align: center;\n}\n.dedication img {\n  width: 50%;\n}\n\n.source-file, .source-file-narrow {\n  font: normal 11px/16px \"Source Code Pro\", Menlo, Consolas, Monaco, monospace;\n  color: #bab8b7;\n}\n.source-file em, .source-file-narrow em {\n  color: #999997;\n  font-style: normal;\n}\n\n.source-file-narrow {\n  display: none;\n  margin: 0px -12px 0 0;\n  padding: 14px 0 0 0;\n  text-align: right;\n}\n\n.source-file {\n  position: absolute;\n  right: -336px;\n  width: 288px;\n  padding: 2px 0 0 0;\n}\n.source-file::before {\n  content: \"<<\";\n  color: #dad8d6;\n  position: absolute;\n  left: -36px;\n  width: 36px;\n  text-align: center;\n}\n\n.codehilite pre {\n  color: #797978;\n}\n.codehilite .k {\n  color: #0099e6;\n}\n.codehilite .n {\n  color: #dd713c;\n}\n.codehilite .s {\n  color: #c38e22;\n}\n.codehilite .e {\n  color: #e8ba30;\n}\n.codehilite .c {\n  color: #aaa9a7;\n}\n.codehilite .a {\n  color: #9966cc;\n}\n.codehilite .i {\n  color: #1b6e98;\n}\n.codehilite .t {\n  color: #00a4b3;\n}\n.codehilite .insert {\n  margin: -2px -12px;\n  padding: 2px 10px;\n  border-left: solid 2px #dad8d6;\n  border-right: solid 2px #dad8d6;\n  background: #f5f3f0;\n}\n.codehilite .delete {\n  margin: -2px -12px;\n  padding: 2px 10px;\n  border-left: solid 2px #dad8d6;\n  border-right: solid 2px #dad8d6;\n  background: repeating-linear-gradient(-45deg, #dad8d6, #dad8d6 1px, rgba(0, 0, 0, 0) 1px, rgba(0, 
0, 0, 0) 6px);\n}\n.codehilite .delete span {\n  color: #bab8b7;\n}\n.codehilite .insert-before, .codehilite .insert-after {\n  color: #bab8b7;\n}\n.codehilite .insert-before .insert-comma {\n  margin: -2px -1px;\n  padding: 2px 1px;\n  border-radius: 2px;\n  background: #f5f3f0;\n  color: #595959;\n}\n\n@media only screen and (max-width: 1344px) {\n  nav.wide {\n    display: none;\n  }\n\n  nav.floating {\n    display: block;\n  }\n\n  body {\n    margin: 0 24px;\n  }\n\n  .page {\n    position: relative;\n    width: inherit;\n    max-width: 912px;\n    margin: 0 auto;\n  }\n\n  article {\n    width: inherit;\n    margin-right: 336px;\n  }\n  article .number {\n    top: 73px;\n    left: inherit;\n    right: 0;\n    font-size: 72px;\n  }\n  article h1 {\n    padding: 110px 0 18px 0;\n    font-size: 44px;\n  }\n}\n@media only screen and (max-width: 960px) {\n  body {\n    margin: 0;\n  }\n\n  nav.floating {\n    display: none;\n  }\n\n  nav.narrow {\n    display: block;\n  }\n\n  .page {\n    margin: 0 48px;\n    width: inherit;\n  }\n\n  article {\n    margin: 0;\n  }\n  article img.wide {\n    width: inherit;\n    max-width: 100%;\n  }\n\n  aside {\n    position: inherit;\n    right: inherit;\n    width: inherit;\n    border-bottom: solid 1px #dee9ed;\n  }\n  aside p:first-child {\n    margin-top: 8px;\n  }\n  aside p:last-child {\n    margin-bottom: 8px;\n  }\n  aside div.codehilite:last-child {\n    margin-bottom: 12px;\n  }\n  aside img {\n    display: block;\n    max-width: 288px;\n    margin: 0 auto;\n  }\n  aside img.above {\n    position: relative;\n  }\n\n  aside + div.codehilite {\n    margin-top: 12px;\n  }\n\n  div.codehilite + aside {\n    margin-top: 24px;\n  }\n\n  .source-file {\n    display: none;\n  }\n\n  .source-file-narrow {\n    display: block;\n  }\n}\n@media only screen and (max-width: 630px) {\n  .page {\n    margin: 0 24px;\n    width: inherit;\n  }\n\n  nav.narrow .prev, nav.narrow .next {\n    padding: 0 24px;\n  }\n}\n@media only screen 
and (max-width: 580px) {\n  body {\n    font-size: 15px;\n    line-height: 22px;\n  }\n\n  .small-caps {\n    font-size: 12px;\n  }\n\n  .scrim {\n    background: url(\"rows-22.png\");\n  }\n\n  nav.narrow img {\n    padding: 9px 0 1px 0;\n    height: 27px;\n  }\n  nav.narrow .prev, nav.narrow .next {\n    top: 11px;\n  }\n\n  article h1 {\n    font-size: 36px;\n    padding: 100px 0 14px 0;\n  }\n  article h1.part {\n    font-size: 30px;\n    padding: 97px 0 17px 0;\n  }\n  article .number {\n    top: 61px;\n    font-size: 72px;\n  }\n  article p {\n    margin: 22px 0;\n  }\n  article ol, article ul {\n    margin: 22px 0;\n    padding: 0 0 0 22px;\n  }\n\n  blockquote {\n    margin: 27px 0 28px 0;\n  }\n  blockquote::before, blockquote::after {\n    top: -17px;\n    font-size: 52px;\n  }\n  blockquote p {\n    margin: 0 22px;\n    font-size: 20px;\n    line-height: 33px;\n  }\n\n  footer .next {\n    font-size: 15px;\n  }\n}"
  },
  {
    "path": "site/superclasses.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Superclasses &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Superclasses<small>29</small></a></h3>\n\n<ul>\n    <li><a href=\"#inheriting-methods\"><small>29.1</small> Inheriting Methods</a></li>\n    <li><a href=\"#storing-superclasses\"><small>29.2</small> Storing Superclasses</a></li>\n    <li><a href=\"#super-calls\"><small>29.3</small> Super Calls</a></li>\n    <li><a href=\"#a-complete-virtual-machine\"><small>29.4</small> A Complete Virtual Machine</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a 
href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"methods-and-initializers.html\" title=\"Methods and Initializers\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"optimization.html\" title=\"Optimization\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"methods-and-initializers.html\" title=\"Methods and Initializers\" class=\"prev\">←</a>\n<a href=\"optimization.html\" title=\"Optimization\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Superclasses<small>29</small></a></h3>\n\n<ul>\n    <li><a href=\"#inheriting-methods\"><small>29.1</small> Inheriting Methods</a></li>\n    <li><a href=\"#storing-superclasses\"><small>29.2</small> Storing Superclasses</a></li>\n    <li><a href=\"#super-calls\"><small>29.3</small> Super Calls</a></li>\n    <li><a href=\"#a-complete-virtual-machine\"><small>29.4</small> A Complete Virtual Machine</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"methods-and-initializers.html\" title=\"Methods and Initializers\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"optimization.html\" title=\"Optimization\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">29</div>\n  <h1>Superclasses</h1>\n\n<blockquote>\n<p>You can choose 
your friends but you sho&rsquo; can&rsquo;t choose your family, an&rsquo; they&rsquo;re\nstill kin to you no matter whether you acknowledge &rsquo;em or not, and it\nmakes you look right silly when you don&rsquo;t.</p>\n<p><cite>Harper Lee, <em>To Kill a Mockingbird</em></cite></p>\n</blockquote>\n<p>This is the very last chapter where we add new functionality to our VM. We&rsquo;ve\npacked almost the entire Lox language in there already. All that remains is\ninheriting methods and calling superclass methods. We have <a href=\"optimization.html\">another\nchapter</a> after this one, but it introduces no new behavior. It\n<span name=\"faster\">only</span> makes existing stuff faster. Make it to the end\nof this one, and you&rsquo;ll have a complete Lox implementation.</p>\n<aside name=\"faster\">\n<p>That &ldquo;only&rdquo; should not imply that making stuff faster isn&rsquo;t important! After\nall, the whole purpose of our entire second virtual machine is better\nperformance over jlox. You could argue that <em>all</em> of the past fifteen chapters\nare &ldquo;optimization&rdquo;.</p>\n</aside>\n<p>Some of the material in this chapter will remind you of jlox. The way we resolve\nsuper calls is pretty much the same, though viewed through clox&rsquo;s more complex\nmechanism for storing state on the stack. But we have an entirely different,\nmuch faster, way of handling inherited method calls this time around.</p>\n<h2><a href=\"#inheriting-methods\" id=\"inheriting-methods\"><small>29&#8202;.&#8202;1</small>Inheriting Methods</a></h2>\n<p>We&rsquo;ll kick things off with method inheritance since it&rsquo;s the simpler piece. 
To\nrefresh your memory, Lox inheritance syntax looks like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Doughnut</span> {\n  <span class=\"i\">cook</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Dunk in the fryer.&quot;</span>;\n  }\n}\n\n<span class=\"k\">class</span> <span class=\"t\">Cruller</span> &lt; <span class=\"t\">Doughnut</span> {\n  <span class=\"i\">finish</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Glaze with icing.&quot;</span>;\n  }\n}\n</pre></div>\n<p>Here, the Cruller class inherits from Doughnut and thus, instances of Cruller\ninherit the <code>cook()</code> method. I don&rsquo;t know why I&rsquo;m belaboring this. You know how\ninheritance works. Let&rsquo;s start compiling the new syntax.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  currentClass = &amp;classCompiler;\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>classDeclaration</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_LESS</span>)) {\n    <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_IDENTIFIER</span>, <span class=\"s\">&quot;Expect superclass name.&quot;</span>);\n    <span class=\"i\">variable</span>(<span class=\"k\">false</span>);\n    <span class=\"i\">namedVariable</span>(<span class=\"i\">className</span>, <span class=\"k\">false</span>);\n    <span class=\"i\">emitByte</span>(<span class=\"a\">OP_INHERIT</span>);\n  }\n\n</pre><pre class=\"insert-after\">  namedVariable(className, false);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>classDeclaration</em>()</div>\n\n<p>After we compile the class name, if the next token is a <code>&lt;</code>, then we found a\nsuperclass clause. We consume the superclass&rsquo;s identifier token, then call\n<code>variable()</code>. 
That function takes the previously consumed token, treats it as a\nvariable reference, and emits code to load the variable&rsquo;s value. In other words,\nit looks up the superclass by name and pushes it onto the stack.</p>\n<p>After that, we call <code>namedVariable()</code> to load the subclass doing the inheriting\nonto the stack, followed by an <code>OP_INHERIT</code> instruction. That instruction\nwires up the superclass to the new subclass. In the last chapter, we defined an\n<code>OP_METHOD</code> instruction to mutate an existing class object by adding a method to\nits method table. This is similar<span class=\"em\">&mdash;</span>the <code>OP_INHERIT</code> instruction takes an\nexisting class and applies the effect of inheritance to it.</p>\n<p>In the previous example, when the compiler works through this bit of syntax:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Cruller</span> &lt; <span class=\"t\">Doughnut</span> {\n</pre></div>\n<p>The result is this bytecode:</p><img src=\"image/superclasses/inherit-stack.png\" alt=\"The series of bytecode instructions for a Cruller class inheriting from Doughnut.\" />\n<p>Before we implement the new <code>OP_INHERIT</code> instruction, we have an edge case to\ndetect.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    variable(false);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>classDeclaration</em>()</div>\n<pre class=\"insert\">\n\n    <span class=\"k\">if</span> (<span class=\"i\">identifiersEqual</span>(&amp;<span class=\"i\">className</span>, &amp;<span class=\"i\">parser</span>.<span class=\"i\">previous</span>)) {\n      <span class=\"i\">error</span>(<span class=\"s\">&quot;A class can&#39;t inherit from itself.&quot;</span>);\n    }\n\n</pre><pre class=\"insert-after\">    namedVariable(className, false);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>classDeclaration</em>()</div>\n\n<p><span 
name=\"cycle\">A</span> class cannot be its own superclass. Unless you have\naccess to a deranged nuclear physicist and a very heavily modified DeLorean, you\ncannot inherit from yourself.</p>\n<aside name=\"cycle\">\n<p>Interestingly, with the way we implement method inheritance, I don&rsquo;t think\nallowing cycles would actually cause any problems in clox. It wouldn&rsquo;t do\nanything <em>useful</em>, but I don&rsquo;t think it would cause a crash or infinite loop.</p>\n</aside>\n<h3><a href=\"#executing-inheritance\" id=\"executing-inheritance\"><small>29&#8202;.&#8202;1&#8202;.&#8202;1</small>Executing inheritance</a></h3>\n<p>Now onto the new instruction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_CLASS,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_INHERIT</span>,\n</pre><pre class=\"insert-after\">  OP_METHOD\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>There are no operands to worry about. The two values we need<span class=\"em\">&mdash;</span>superclass and\nsubclass<span class=\"em\">&mdash;</span>are both found on the stack. 
That means disassembling is easy.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return constantInstruction(&quot;OP_CLASS&quot;, chunk, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_INHERIT</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_INHERIT&quot;</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_METHOD:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>The interpreter is where the action happens.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        break;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_INHERIT</span>: {\n        <span class=\"t\">Value</span> <span class=\"i\">superclass</span> = <span class=\"i\">peek</span>(<span class=\"n\">1</span>);\n        <span class=\"t\">ObjClass</span>* <span class=\"i\">subclass</span> = <span class=\"a\">AS_CLASS</span>(<span class=\"i\">peek</span>(<span class=\"n\">0</span>));\n        <span class=\"i\">tableAddAll</span>(&amp;<span class=\"a\">AS_CLASS</span>(<span class=\"i\">superclass</span>)-&gt;<span class=\"i\">methods</span>,\n                    &amp;<span class=\"i\">subclass</span>-&gt;<span class=\"i\">methods</span>);\n        <span class=\"i\">pop</span>(); <span class=\"c\">// Subclass.</span>\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_METHOD:\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>From the top of the stack down, we have the subclass then the superclass. We\ngrab both of those and then do the inherit-y bit. 
This is where clox takes a\ndifferent path than jlox. In our first interpreter, each subclass stored a\nreference to its superclass. On method access, if we didn&rsquo;t find the method in\nthe subclass&rsquo;s method table, we recursed through the inheritance chain looking\nat each ancestor&rsquo;s method table until we found it.</p>\n<p>For example, calling <code>cook()</code> on an instance of Cruller sends jlox on this\njourney:</p><img src=\"image/superclasses/jlox-resolve.png\" alt=\"Resolving a call to cook() in an instance of Cruller means walking the superclass chain.\" />\n<p>That&rsquo;s a lot of work to perform during method <em>invocation</em> time. It&rsquo;s slow, and\nworse, the farther an inherited method is up the ancestor chain, the slower it\ngets. Not a great performance story.</p>\n<p>The new approach is much faster. When the subclass is declared, we copy all of\nthe inherited class&rsquo;s methods down into the subclass&rsquo;s own method table. Later,\nwhen <em>calling</em> a method, any method inherited from a superclass will be found\nright in the subclass&rsquo;s own method table. There is no extra runtime work needed\nfor inheritance at all. By the time the class is declared, the work is done.\nThis means inherited method calls are exactly as fast as normal method calls<span class=\"em\">&mdash;</span>a <span name=\"two\">single</span> hash table lookup.</p><img src=\"image/superclasses/clox-resolve.png\" alt=\"Resolving a call to cook() in an instance of Cruller which has the method in its own method table.\" />\n<aside name=\"two\">\n<p>Well, two hash table lookups, I guess. Because first we have to make sure a\nfield on the instance doesn&rsquo;t shadow the method.</p>\n</aside>\n<p>I&rsquo;ve sometimes heard this technique called &ldquo;copy-down inheritance&rdquo;. It&rsquo;s simple\nand fast, but, like most optimizations, you get to use it only under certain\nconstraints. It works in Lox because Lox classes are <em>closed</em>. 
Once a class\ndeclaration is finished executing, the set of methods for that class can never\nchange.</p>\n<p>In languages like Ruby, Python, and JavaScript, it&rsquo;s possible to <span\nname=\"monkey\">crack</span> open an existing class and jam some new methods into\nit or even remove them. That would break our optimization because if those\nmodifications happened to a superclass <em>after</em> the subclass declaration\nexecuted, the subclass would not pick up those changes. That breaks a user&rsquo;s\nexpectation that inheritance always reflects the current state of the\nsuperclass.</p>\n<aside name=\"monkey\">\n<p>As you can imagine, changing the set of methods a class defines imperatively at\nruntime can make it hard to reason about a program. It is a very powerful tool,\nbut also a dangerous tool.</p>\n<p>Those who find this tool maybe a little <em>too</em> dangerous gave it the unbecoming\nname &ldquo;monkey patching&rdquo;, or the even less decorous &ldquo;duck punching&rdquo;.</p><img src=\"image/superclasses/monkey.png\" alt=\"A monkey with an eyepatch, naturally.\" />\n</aside>\n<p>Fortunately for us (but not for users who like the feature, I guess), Lox\ndoesn&rsquo;t let you patch monkeys or punch ducks, so we can safely apply this\noptimization.</p>\n<p>What about method overrides? Won&rsquo;t copying the superclass&rsquo;s methods into the\nsubclass&rsquo;s method table clash with the subclass&rsquo;s own methods? Fortunately, no.\nWe emit the <code>OP_INHERIT</code> after the <code>OP_CLASS</code> instruction that creates the\nsubclass but before any method declarations and <code>OP_METHOD</code> instructions have\nbeen compiled. At the point that we copy the superclass&rsquo;s methods down, the\nsubclass&rsquo;s method table is empty. 
Any methods the subclass overrides will\noverwrite those inherited entries in the table.</p>\n<h3><a href=\"#invalid-superclasses\" id=\"invalid-superclasses\"><small>29&#8202;.&#8202;1&#8202;.&#8202;2</small>Invalid superclasses</a></h3>\n<p>Our implementation is simple and fast, which is just the way I like my VM code.\nBut it&rsquo;s not robust. Nothing prevents a user from inheriting from an object that\nisn&rsquo;t a class at all:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"t\">NotClass</span> = <span class=\"s\">&quot;So not a class&quot;</span>;\n<span class=\"k\">class</span> <span class=\"t\">OhNo</span> &lt; <span class=\"t\">NotClass</span> {}\n</pre></div>\n<p>Obviously, no self-respecting programmer would write that, but we have to guard\nagainst potential Lox users who have no self respect. A simple runtime check\nfixes that.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        Value superclass = peek(1);\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">        <span class=\"k\">if</span> (!<span class=\"a\">IS_CLASS</span>(<span class=\"i\">superclass</span>)) {\n          <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Superclass must be a class.&quot;</span>);\n          <span class=\"k\">return</span> <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>;\n        }\n\n</pre><pre class=\"insert-after\">        ObjClass* subclass = AS_CLASS(peek(0));\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>If the value we loaded from the identifier in the superclass clause isn&rsquo;t an\nObjClass, we report a runtime error to let the user know what we think of them\nand their code.</p>\n<h2><a href=\"#storing-superclasses\" id=\"storing-superclasses\"><small>29&#8202;.&#8202;2</small>Storing Superclasses</a></h2>\n<p>Did you notice that when we added method inheritance, we didn&rsquo;t actually add 
any\nreference from a subclass to its superclass? After we copy the inherited methods\nover, we forget the superclass entirely. We don&rsquo;t need to keep a handle on the\nsuperclass, so we don&rsquo;t.</p>\n<p>That won&rsquo;t be sufficient to support super calls. Since a subclass <span\nname=\"may\">may</span> override the superclass method, we need to be able to get\nour hands on superclass method tables. Before we get to that mechanism, I want \nto refresh your memory on how super calls are statically resolved.</p>\n<aside name=\"may\">\n<p>&ldquo;May&rdquo; might not be a strong enough word. Presumably the method <em>has</em> been\noverridden. Otherwise, why are you bothering to use <code>super</code> instead of just\ncalling it directly?</p>\n</aside>\n<p>Back in the halcyon days of jlox, I showed you <a href=\"inheritance.html#semantics\">this tricky example</a> to\nexplain the way super calls are dispatched:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">A</span> {\n  <span class=\"i\">method</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;A method&quot;</span>;\n  }\n}\n\n<span class=\"k\">class</span> <span class=\"t\">B</span> &lt; <span class=\"t\">A</span> {\n  <span class=\"i\">method</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;B method&quot;</span>;\n  }\n\n  <span class=\"i\">test</span>() {\n    <span class=\"k\">super</span>.<span class=\"i\">method</span>();\n  }\n}\n\n<span class=\"k\">class</span> <span class=\"t\">C</span> &lt; <span class=\"t\">B</span> {}\n\n<span class=\"t\">C</span>().<span class=\"i\">test</span>();\n</pre></div>\n<p>Inside the body of the <code>test()</code> method, <code>this</code> is an instance of C. If super\ncalls were resolved relative to the superclass of the <em>receiver</em>, then we would\nlook in C&rsquo;s superclass, B. 
But super calls are resolved relative to the\nsuperclass of the <em>surrounding class where the super call occurs</em>. In this case,\nwe are in B&rsquo;s <code>test()</code> method, so the superclass is A, and the program should\nprint &ldquo;A method&rdquo;.</p>\n<p>This means that super calls are not resolved dynamically based on the runtime\ninstance. The superclass used to look up the method is a static<span class=\"em\">&mdash;</span>practically\nlexical<span class=\"em\">&mdash;</span>property of where the call occurs. When we added inheritance to jlox,\nwe took advantage of that static aspect by storing the superclass in the same\nEnvironment structure we used for all lexical scopes. Almost as if the\ninterpreter saw the above program like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">A</span> {\n  <span class=\"i\">method</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;A method&quot;</span>;\n  }\n}\n\n<span class=\"k\">var</span> <span class=\"t\">Bs_super</span> = <span class=\"t\">A</span>;\n<span class=\"k\">class</span> <span class=\"t\">B</span> &lt; <span class=\"t\">A</span> {\n  <span class=\"i\">method</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;B method&quot;</span>;\n  }\n\n  <span class=\"i\">test</span>() {\n    <span class=\"i\">runtimeSuperCall</span>(<span class=\"t\">Bs_super</span>, <span class=\"s\">&quot;method&quot;</span>);\n  }\n}\n\n<span class=\"k\">var</span> <span class=\"t\">Cs_super</span> = <span class=\"t\">B</span>;\n<span class=\"k\">class</span> <span class=\"t\">C</span> &lt; <span class=\"t\">B</span> {}\n\n<span class=\"t\">C</span>().<span class=\"i\">test</span>();\n</pre></div>\n<p>Each subclass has a hidden variable storing a reference to its superclass.\nWhenever we need to perform a super call, we access the superclass from that\nvariable and tell the runtime to start looking for methods there.</p>\n<p>We&rsquo;ll take 
the same path with clox. The difference is that instead of jlox&rsquo;s\nheap-allocated Environment class, we have the bytecode VM&rsquo;s value stack and\nupvalue system. The machinery is a little different, but the overall effect is\nthe same.</p>\n<h3><a href=\"#a-superclass-local-variable\" id=\"a-superclass-local-variable\"><small>29&#8202;.&#8202;2&#8202;.&#8202;1</small>A superclass local variable</a></h3>\n<p>Our compiler already emits code to load the superclass onto the stack. Instead\nof leaving that slot as a temporary, we create a new scope and make it a local\nvariable.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    }\n\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>classDeclaration</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">beginScope</span>();\n    <span class=\"i\">addLocal</span>(<span class=\"i\">syntheticToken</span>(<span class=\"s\">&quot;super&quot;</span>));\n    <span class=\"i\">defineVariable</span>(<span class=\"n\">0</span>);\n\n</pre><pre class=\"insert-after\">    namedVariable(className, false);\n    emitByte(OP_INHERIT);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>classDeclaration</em>()</div>\n\n<p>Creating a new lexical scope ensures that if we declare two classes in the same\nscope, each has a different local slot to store its superclass. Since we always\nname this variable &ldquo;super&rdquo;, if we didn&rsquo;t make a scope for each subclass, the\nvariables would collide.</p>\n<p>We name the variable &ldquo;super&rdquo; for the same reason we use &ldquo;this&rdquo; as the name of\nthe hidden local variable that <code>this</code> expressions resolve to: &ldquo;super&rdquo; is a\nreserved word, which guarantees the compiler&rsquo;s hidden variable won&rsquo;t collide\nwith a user-defined one.</p>\n<p>The difference is that when compiling <code>this</code> expressions, we conveniently have a\ntoken sitting around whose lexeme is &ldquo;this&rdquo;. 
We aren&rsquo;t so lucky here. Instead,\nwe add a little helper function to create a synthetic token for the given <span\nname=\"constant\">constant</span> string.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>variable</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">Token</span> <span class=\"i\">syntheticToken</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">text</span>) {\n  <span class=\"t\">Token</span> <span class=\"i\">token</span>;\n  <span class=\"i\">token</span>.<span class=\"i\">start</span> = <span class=\"i\">text</span>;\n  <span class=\"i\">token</span>.<span class=\"i\">length</span> = (<span class=\"t\">int</span>)<span class=\"i\">strlen</span>(<span class=\"i\">text</span>);\n  <span class=\"k\">return</span> <span class=\"i\">token</span>;\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>variable</em>()</div>\n\n<aside name=\"constant\" class=\"bottom\">\n<p>I say &ldquo;constant string&rdquo; because tokens don&rsquo;t do any memory management of their\nlexeme. If we tried to use a heap-allocated string for this, we&rsquo;d end up leaking\nmemory because it never gets freed. 
But the memory for C string literals lives\nin the executable&rsquo;s constant data section and never needs to be freed, so we&rsquo;re\nfine.</p>\n</aside>\n<p>Since we opened a local scope for the superclass variable, we need to close it.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  emitByte(OP_POP);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>classDeclaration</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"k\">if</span> (<span class=\"i\">classCompiler</span>.<span class=\"i\">hasSuperclass</span>) {\n    <span class=\"i\">endScope</span>();\n  }\n</pre><pre class=\"insert-after\">\n\n  currentClass = currentClass-&gt;enclosing;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>classDeclaration</em>()</div>\n\n<p>We pop the scope and discard the &ldquo;super&rdquo; variable after compiling the class body\nand its methods. That way, the variable is accessible in all of the methods of\nthe subclass. It&rsquo;s a somewhat pointless optimization, but we create the scope\nonly if there <em>is</em> a superclass clause. Thus we need to close the scope only if\nthere is one.</p>\n<p>To track that, we could declare a little local variable in <code>classDeclaration()</code>.\nBut soon, other functions in the compiler will need to know whether the\nsurrounding class is a subclass or not. 
So we may as well give our future selves\na hand and store this fact as a field in the ClassCompiler now.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">typedef struct ClassCompiler {\n  struct ClassCompiler* enclosing;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin struct <em>ClassCompiler</em></div>\n<pre class=\"insert\">  <span class=\"t\">bool</span> <span class=\"i\">hasSuperclass</span>;\n</pre><pre class=\"insert-after\">} ClassCompiler;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in struct <em>ClassCompiler</em></div>\n\n<p>When we first initialize a ClassCompiler, we assume it is not a subclass.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  ClassCompiler classCompiler;\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>classDeclaration</em>()</div>\n<pre class=\"insert\">  <span class=\"i\">classCompiler</span>.<span class=\"i\">hasSuperclass</span> = <span class=\"k\">false</span>;\n</pre><pre class=\"insert-after\">  classCompiler.enclosing = currentClass;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>classDeclaration</em>()</div>\n\n<p>Then, if we see a superclass clause, we know we are compiling a subclass.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    emitByte(OP_INHERIT);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>classDeclaration</em>()</div>\n<pre class=\"insert\">    <span class=\"i\">classCompiler</span>.<span class=\"i\">hasSuperclass</span> = <span class=\"k\">true</span>;\n</pre><pre class=\"insert-after\">  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>classDeclaration</em>()</div>\n\n<p>This machinery gives us a mechanism at runtime to access the superclass object\nof the surrounding subclass from within any of the subclass&rsquo;s methods<span class=\"em\">&mdash;</span>simply\nemit code to load the variable named &ldquo;super&rdquo;. 
That variable is a local outside\nof the method body, but our existing upvalue support enables the VM to capture\nthat local inside the body of the method or even in functions nested inside that\nmethod.</p>\n<h2><a href=\"#super-calls\" id=\"super-calls\"><small>29&#8202;.&#8202;3</small>Super Calls</a></h2>\n<p>With that runtime support in place, we are ready to implement super calls. As\nusual, we go front to back, starting with the new syntax. A super call <span\nname=\"last\">begins</span>, naturally enough, with the <code>super</code> keyword.</p>\n<aside name=\"last\">\n<p>This is it, friend. The very last entry you&rsquo;ll add to the parsing table.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">  [TOKEN_RETURN]        = {NULL,     NULL,   PREC_NONE},\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_SUPER</span>]         = {<span class=\"i\">super_</span>,   <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_THIS]          = {this_,    NULL,   PREC_NONE},\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, replace 1 line</div>\n\n<p>When the expression parser lands on a <code>super</code> token, control jumps to a new\nparsing function which starts off like so:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>syntheticToken</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">super_</span>(<span class=\"t\">bool</span> <span class=\"i\">canAssign</span>) {\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_DOT</span>, <span class=\"s\">&quot;Expect &#39;.&#39; after &#39;super&#39;.&quot;</span>);\n  <span class=\"i\">consume</span>(<span class=\"a\">TOKEN_IDENTIFIER</span>, <span class=\"s\">&quot;Expect superclass method name.&quot;</span>);\n  <span 
class=\"t\">uint8_t</span> <span class=\"i\">name</span> = <span class=\"i\">identifierConstant</span>(&amp;<span class=\"i\">parser</span>.<span class=\"i\">previous</span>);\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>syntheticToken</em>()</div>\n\n<p>This is pretty different from how we compiled <code>this</code> expressions. Unlike <code>this</code>,\na <code>super</code> <span name=\"token\">token</span> is not a standalone expression.\nInstead, the dot and method name following it are inseparable parts of the\nsyntax. However, the parenthesized argument list is separate. As with normal\nmethod access, Lox supports getting a reference to a superclass method as a\nclosure without invoking it:</p>\n<aside name=\"token\">\n<p>Hypothetical question: If a bare <code>super</code> token <em>was</em> an expression, what kind of\nobject would it evaluate to?</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">A</span> {\n  <span class=\"i\">method</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;A&quot;</span>;\n  }\n}\n\n<span class=\"k\">class</span> <span class=\"t\">B</span> &lt; <span class=\"t\">A</span> {\n  <span class=\"i\">method</span>() {\n    <span class=\"k\">var</span> <span class=\"i\">closure</span> = <span class=\"k\">super</span>.<span class=\"i\">method</span>;\n    <span class=\"i\">closure</span>(); <span class=\"c\">// Prints &quot;A&quot;.</span>\n  }\n}\n</pre></div>\n<p>In other words, Lox doesn&rsquo;t really have super <em>call</em> expressions, it has super\n<em>access</em> expressions, which you can choose to immediately invoke if you want. So\nwhen the compiler hits a <code>super</code> token, we consume the subsequent <code>.</code> token and\nthen look for a method name. 
Methods are looked up dynamically, so we use\n<code>identifierConstant()</code> to take the lexeme of the method name token and store it\nin the constant table just like we do for property access expressions.</p>\n<p>Here is what the compiler does after consuming those tokens:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  uint8_t name = identifierConstant(&amp;parser.previous);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>super_</em>()</div>\n<pre class=\"insert\">\n\n  <span class=\"i\">namedVariable</span>(<span class=\"i\">syntheticToken</span>(<span class=\"s\">&quot;this&quot;</span>), <span class=\"k\">false</span>);\n  <span class=\"i\">namedVariable</span>(<span class=\"i\">syntheticToken</span>(<span class=\"s\">&quot;super&quot;</span>), <span class=\"k\">false</span>);\n  <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_GET_SUPER</span>, <span class=\"i\">name</span>);\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>super_</em>()</div>\n\n<p>In order to access a <em>superclass method</em> on <em>the current instance</em>, the runtime\nneeds both the receiver <em>and</em> the superclass of the surrounding method&rsquo;s class.\nThe first <code>namedVariable()</code> call generates code to look up the current receiver\nstored in the hidden variable &ldquo;this&rdquo; and push it onto the stack. The second\n<code>namedVariable()</code> call emits code to look up the superclass from its &ldquo;super&rdquo;\nvariable and push that on top.</p>\n<p>Finally, we emit a new <code>OP_GET_SUPER</code> instruction with an operand for the\nconstant table index of the method name. That&rsquo;s a lot to hold in your head. 
To\nmake it tangible, consider this example program:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Doughnut</span> {\n  <span class=\"i\">cook</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Dunk in the fryer.&quot;</span>;\n    <span class=\"k\">this</span>.<span class=\"i\">finish</span>(<span class=\"s\">&quot;sprinkles&quot;</span>);\n  }\n\n  <span class=\"i\">finish</span>(<span class=\"i\">ingredient</span>) {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Finish with &quot;</span> + <span class=\"i\">ingredient</span>;\n  }\n}\n\n<span class=\"k\">class</span> <span class=\"t\">Cruller</span> &lt; <span class=\"t\">Doughnut</span> {\n  <span class=\"i\">finish</span>(<span class=\"i\">ingredient</span>) {\n    <span class=\"c\">// No sprinkles, always icing.</span>\n    <span class=\"k\">super</span>.<span class=\"i\">finish</span>(<span class=\"s\">&quot;icing&quot;</span>);\n  }\n}\n</pre></div>\n<p>The bytecode emitted for the <code>super.finish(\"icing\")</code> expression looks and works\nlike this:</p><img src=\"image/superclasses/super-instructions.png\" alt=\"The series of bytecode instructions for calling super.finish().\" />\n<p>The first three instructions give the runtime access to the three pieces of\ninformation it needs to perform the super access:</p>\n<ol>\n<li>\n<p>The first instruction loads <strong>the instance</strong> onto the stack.</p>\n</li>\n<li>\n<p>The second instruction loads <strong>the superclass where the method is\nresolved</strong>.</p>\n</li>\n<li>\n<p>Then the new <code>OP_GET_SUPER</code> instruction encodes <strong>the name of the method to\naccess</strong> as an operand.</p>\n</li>\n</ol>\n<p>The remaining instructions are the normal bytecode for evaluating an argument\nlist and calling a function.</p>\n<p>We&rsquo;re almost ready to implement the new <code>OP_GET_SUPER</code> instruction in the\ninterpreter. 
But before we do, the compiler has some errors it is responsible\nfor reporting.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">static void super_(bool canAssign) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>super_</em>()</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">currentClass</span> == <span class=\"a\">NULL</span>) {\n    <span class=\"i\">error</span>(<span class=\"s\">&quot;Can&#39;t use &#39;super&#39; outside of a class.&quot;</span>);\n  } <span class=\"k\">else</span> <span class=\"k\">if</span> (!<span class=\"i\">currentClass</span>-&gt;<span class=\"i\">hasSuperclass</span>) {\n    <span class=\"i\">error</span>(<span class=\"s\">&quot;Can&#39;t use &#39;super&#39; in a class with no superclass.&quot;</span>);\n  }\n\n</pre><pre class=\"insert-after\">  consume(TOKEN_DOT, &quot;Expect '.' after 'super'.&quot;);\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>super_</em>()</div>\n\n<p>A super call is meaningful only inside the body of a method (or in a function\nnested inside a method), and only inside the method of a class that has a\nsuperclass. We detect both of these cases using the value of <code>currentClass</code>. If\nthat&rsquo;s <code>NULL</code> or points to a class with no superclass, we report those errors.</p>\n<h3><a href=\"#executing-super-accesses\" id=\"executing-super-accesses\"><small>29&#8202;.&#8202;3&#8202;.&#8202;1</small>Executing super accesses</a></h3>\n<p>Assuming the user didn&rsquo;t put a <code>super</code> expression where it&rsquo;s not allowed, their\ncode passes from the compiler over to the runtime. 
We&rsquo;ve got ourselves a new\ninstruction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_SET_PROPERTY,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_GET_SUPER</span>,\n</pre><pre class=\"insert-after\">  OP_EQUAL,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>We disassemble it like other opcodes that take a constant table index operand.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return constantInstruction(&quot;OP_SET_PROPERTY&quot;, chunk, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_GET_SUPER</span>:\n      <span class=\"k\">return</span> <span class=\"i\">constantInstruction</span>(<span class=\"s\">&quot;OP_GET_SUPER&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_EQUAL:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>You might anticipate something harder, but interpreting the new instruction is\nsimilar to executing a normal property access.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_GET_SUPER</span>: {\n        <span class=\"t\">ObjString</span>* <span class=\"i\">name</span> = <span class=\"a\">READ_STRING</span>();\n        <span class=\"t\">ObjClass</span>* <span class=\"i\">superclass</span> = <span class=\"a\">AS_CLASS</span>(<span class=\"i\">pop</span>());\n\n        <span class=\"k\">if</span> (!<span class=\"i\">bindMethod</span>(<span class=\"i\">superclass</span>, <span class=\"i\">name</span>)) {\n     
     <span class=\"k\">return</span> <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>;\n        }\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_EQUAL: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>As with properties, we read the method name from the\nconstant table. Then we pass that to <code>bindMethod()</code>, which looks up the method in\nthe given class&rsquo;s method table and creates an ObjBoundMethod to bundle the\nresulting closure to the current instance.</p>\n<p>The key <span name=\"field\">difference</span> is <em>which</em> class we pass to\n<code>bindMethod()</code>. With a normal property access, we use the ObjInstance&rsquo;s own\nclass, which gives us the dynamic dispatch we want. For a super call, we don&rsquo;t\nuse the instance&rsquo;s class. Instead, we use the statically resolved superclass of\nthe containing class, which the compiler has conveniently ensured is sitting on\ntop of the stack waiting for us.</p>\n<p>We pop that superclass and pass it to <code>bindMethod()</code>, which correctly skips over\nany overriding methods in any of the subclasses between that superclass and the\ninstance&rsquo;s own class. It also correctly includes any methods inherited by the\nsuperclass from any of <em>its</em> superclasses.</p>\n<p>The rest of the behavior is the same. Popping the superclass leaves the instance\nat the top of the stack. When <code>bindMethod()</code> succeeds, it pops the instance and\npushes the new bound method. Otherwise, it reports a runtime error and returns\n<code>false</code>. In that case, we abort the interpreter.</p>\n<aside name=\"field\">\n<p>Another difference compared to <code>OP_GET_PROPERTY</code> is that we don&rsquo;t try to look\nfor a shadowing field first. 
Fields are not inherited, so <code>super</code> expressions\nalways resolve to methods.</p>\n<p>If Lox were a prototype-based language that used <em>delegation</em> instead of\n<em>inheritance</em>, then instead of one <em>class</em> inheriting from another <em>class</em>,\ninstances would inherit from (&ldquo;delegate to&rdquo;) other instances. In that case,\nfields <em>could</em> be inherited, and we would need to check for them here.</p>\n</aside>\n<h3><a href=\"#faster-super-calls\" id=\"faster-super-calls\"><small>29&#8202;.&#8202;3&#8202;.&#8202;2</small>Faster super calls</a></h3>\n<p>We have superclass method accesses working now. And since the returned object is\nan ObjBoundMethod that you can then invoke, we&rsquo;ve got super <em>calls</em> working too.\nJust like last chapter, we&rsquo;ve reached a point where our VM has the complete,\ncorrect semantics.</p>\n<p>But, also like last chapter, it&rsquo;s pretty slow. Again, we&rsquo;re heap allocating an\nObjBoundMethod for each super call even though most of the time the very next\ninstruction is an <code>OP_CALL</code> that immediately unpacks that bound method, invokes\nit, and then discards it. In fact, this is even more likely to be true for\nsuper calls than for regular method calls. At least with method calls there is\na chance that the user is actually invoking a function stored in a field. With\nsuper calls, you&rsquo;re <em>always</em> looking up a method. The only question is whether\nyou invoke it immediately or not.</p>\n<p>The compiler can certainly answer that question for itself if it sees a left\nparenthesis after the superclass method name, so we&rsquo;ll go ahead and perform the\nsame optimization we did for method calls. 
Take out the two lines of code that\nload the superclass and emit <code>OP_GET_SUPER</code>, and replace them with this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  namedVariable(syntheticToken(&quot;this&quot;), false);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>super_</em>()<br>\nreplace 2 lines</div>\n<pre class=\"insert\">  <span class=\"k\">if</span> (<span class=\"i\">match</span>(<span class=\"a\">TOKEN_LEFT_PAREN</span>)) {\n    <span class=\"t\">uint8_t</span> <span class=\"i\">argCount</span> = <span class=\"i\">argumentList</span>();\n    <span class=\"i\">namedVariable</span>(<span class=\"i\">syntheticToken</span>(<span class=\"s\">&quot;super&quot;</span>), <span class=\"k\">false</span>);\n    <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_SUPER_INVOKE</span>, <span class=\"i\">name</span>);\n    <span class=\"i\">emitByte</span>(<span class=\"i\">argCount</span>);\n  } <span class=\"k\">else</span> {\n    <span class=\"i\">namedVariable</span>(<span class=\"i\">syntheticToken</span>(<span class=\"s\">&quot;super&quot;</span>), <span class=\"k\">false</span>);\n    <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_GET_SUPER</span>, <span class=\"i\">name</span>);\n  }\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>super_</em>(), replace 2 lines</div>\n\n<p>Now before we emit anything, we look for a parenthesized argument list. If we\nfind one, we compile that. Then we load the superclass. After that, we emit a\nnew <code>OP_SUPER_INVOKE</code> instruction. 
This <span\nname=\"superinstruction\">superinstruction</span> combines the behavior of\n<code>OP_GET_SUPER</code> and <code>OP_CALL</code>, so it takes two operands: the constant table index\nof the method name to look up and the number of arguments to pass to it.</p>\n<aside name=\"superinstruction\">\n<p>This is a particularly <em>super</em> superinstruction, if you get what I&rsquo;m saying.\nI<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>I&rsquo;m sorry for this terrible joke.</p>\n</aside>\n<p>Otherwise, if we don&rsquo;t find a <code>(</code>, we continue to compile the expression as a\nsuper access like we did before and emit an <code>OP_GET_SUPER</code>.</p>\n<p>Drifting down the compilation pipeline, our first stop is a new instruction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_INVOKE,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_SUPER_INVOKE</span>,\n</pre><pre class=\"insert-after\">  OP_CLOSURE,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>And just past that, its disassembler support.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      return invokeInstruction(&quot;OP_INVOKE&quot;, chunk, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_SUPER_INVOKE</span>:\n      <span class=\"k\">return</span> <span class=\"i\">invokeInstruction</span>(<span class=\"s\">&quot;OP_SUPER_INVOKE&quot;</span>, <span class=\"i\">chunk</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_CLOSURE: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>A super invocation instruction has the same set of operands as <code>OP_INVOKE</code>, so\nwe 
reuse the same helper to disassemble it. Finally, the pipeline dumps us into\nthe interpreter.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        break;\n      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_SUPER_INVOKE</span>: {\n        <span class=\"t\">ObjString</span>* <span class=\"i\">method</span> = <span class=\"a\">READ_STRING</span>();\n        <span class=\"t\">int</span> <span class=\"i\">argCount</span> = <span class=\"a\">READ_BYTE</span>();\n        <span class=\"t\">ObjClass</span>* <span class=\"i\">superclass</span> = <span class=\"a\">AS_CLASS</span>(<span class=\"i\">pop</span>());\n        <span class=\"k\">if</span> (!<span class=\"i\">invokeFromClass</span>(<span class=\"i\">superclass</span>, <span class=\"i\">method</span>, <span class=\"i\">argCount</span>)) {\n          <span class=\"k\">return</span> <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>;\n        }\n        <span class=\"i\">frame</span> = &amp;<span class=\"i\">vm</span>.<span class=\"i\">frames</span>[<span class=\"i\">vm</span>.<span class=\"i\">frameCount</span> - <span class=\"n\">1</span>];\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_CLOSURE: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>This handful of code is basically our implementation of <code>OP_INVOKE</code> mixed\ntogether with a dash of <code>OP_GET_SUPER</code>. There are some differences in how the\nstack is organized, though. With an unoptimized super call, the superclass is\npopped and replaced by the ObjBoundMethod for the resolved function <em>before</em> the\narguments to the call are executed. 
This ensures that by the time the <code>OP_CALL</code>\nis executed, the bound method is <em>under</em> the argument list, where the runtime\nexpects it to be for a closure call.</p>\n<p>With our optimized instructions, things are shuffled a bit:</p><img src=\"image/superclasses/super-invoke.png\" class=\"wide\" alt=\"The series of bytecode instructions for calling super.finish() using OP_SUPER_INVOKE.\" />\n<p>Now resolving the superclass method is part of the <em>invocation</em>, so the\narguments need to already be on the stack at the point that we look up the\nmethod. This means the superclass object is on top of the arguments.</p>\n<p>Aside from that, the behavior is roughly the same as an <code>OP_GET_SUPER</code> followed\nby an <code>OP_CALL</code>. First, we pull out the method name and argument count operands.\nThen we pop the superclass off the top of the stack so that we can look up the\nmethod in its method table. This conveniently leaves the stack set up just right\nfor a method call.</p>\n<p>We pass the superclass, method name, and argument count to our existing\n<code>invokeFromClass()</code> function. That function looks up the given method on the\ngiven class and attempts to create a call to it with the given arity. If a\nmethod could not be found, it returns <code>false</code>, and we bail out of the\ninterpreter. Otherwise, <code>invokeFromClass()</code> pushes a new CallFrame onto the call\nstack for the method&rsquo;s closure. That invalidates the interpreter&rsquo;s cached\nCallFrame pointer, so we refresh <code>frame</code>.</p>\n<h2><a href=\"#a-complete-virtual-machine\" id=\"a-complete-virtual-machine\"><small>29&#8202;.&#8202;4</small>A Complete Virtual Machine</a></h2>\n<p>Take a look back at what we&rsquo;ve created. By my count, we wrote around 2,500 lines\nof fairly clean, straightforward C. 
That little program contains a complete\nimplementation of the<span class=\"em\">&mdash;</span>quite high-level!<span class=\"em\">&mdash;</span>Lox language, with a whole\nprecedence table full of expression types and a suite of control flow\nstatements. We implemented variables, functions, closures, classes, fields,\nmethods, and inheritance.</p>\n<p>Even more impressive, our implementation is portable to any platform with a C\ncompiler, and is fast enough for real-world production use. We have a\nsingle-pass bytecode compiler, a tight virtual machine interpreter for our\ninternal instruction set, compact object representations, a stack for storing\nvariables without heap allocation, and a precise garbage collector.</p>\n<p>If you go out and start poking around in the implementations of Lua, Python, or\nRuby, you will be surprised by how much of it now looks familiar to you. You\nhave seriously leveled up your knowledge of how programming languages work,\nwhich in turn gives you a deeper understanding of programming itself. It&rsquo;s like\nyou used to be a race car driver, and now you can pop the hood and repair the\nengine too.</p>\n<p>You can stop here if you like. The two implementations of Lox you have are\ncomplete and full featured. You built the car and can drive it wherever you want\nnow. But if you are looking to have more fun tuning and tweaking for even\ngreater performance out on the track, there is one more chapter. We don&rsquo;t add\nany new capabilities, but we roll in a couple of classic optimizations to\nsqueeze even more perf out. If that sounds fun, <a href=\"optimization.html\">keep reading</a><span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>A tenet of object-oriented programming is that a class should ensure new\nobjects are in a valid state. 
In Lox, that means defining an initializer\nthat populates the instance&rsquo;s fields. Inheritance complicates invariants\nbecause the instance must be in a valid state according to all of the\nclasses in the object&rsquo;s inheritance chain.</p>\n<p>The easy part is remembering to call <code>super.init()</code> in each subclass&rsquo;s\n<code>init()</code> method. The harder part is fields. There is nothing preventing two\nclasses in the inheritance chain from accidentally claiming the same field\nname. When this happens, they will step on each other&rsquo;s fields and possibly\nleave you with an instance in a broken state.</p>\n<p>If Lox was your language, how would you address this, if at all? If you\nwould change the language, implement your change.</p>\n</li>\n<li>\n<p>Our copy-down inheritance optimization is valid only because Lox does not\npermit you to modify a class&rsquo;s methods after its declaration. This means we\ndon&rsquo;t have to worry about the copied methods in the subclass getting out of\nsync with later changes to the superclass.</p>\n<p>Other languages, like Ruby, <em>do</em> allow classes to be modified after the\nfact. How do implementations of languages like that support class\nmodification while keeping method resolution efficient?</p>\n</li>\n<li>\n<p>In the <a href=\"inheritance.html\">jlox chapter on inheritance</a>, we had a challenge to\nimplement the BETA language&rsquo;s approach to method overriding. Solve the\nchallenge again, but this time in clox. Here&rsquo;s the description of the\nprevious challenge:</p>\n<p>In Lox, as in most other object-oriented languages, when looking up a\nmethod, we start at the bottom of the class hierarchy and work our way up<span class=\"em\">&mdash;</span>a subclass&rsquo;s method is preferred over a superclass&rsquo;s. 
In order to get to the\nsuperclass method from within an overriding method, you use <code>super</code>.</p>\n<p>The language <a href=\"https://beta.cs.au.dk/\">BETA</a> takes the <a href=\"http://journal.stuffwithstuff.com/2012/12/19/the-impoliteness-of-overriding-methods/\">opposite approach</a>. When you call a\nmethod, it starts at the <em>top</em> of the class hierarchy and works <em>down</em>. A\nsuperclass method wins over a subclass method. In order to get to the\nsubclass method, the superclass method can call <code>inner</code>, which is sort of\nlike the inverse of <code>super</code>. It chains to the next method down the\nhierarchy.</p>\n<p>The superclass method controls when and where the subclass is allowed to\nrefine its behavior. If the superclass method doesn&rsquo;t call <code>inner</code> at all,\nthen the subclass has no way of overriding or modifying the superclass&rsquo;s\nbehavior.</p>\n<p>Take out Lox&rsquo;s current overriding and <code>super</code> behavior, and replace it with\nBETA&rsquo;s semantics. In short:</p>\n<ul>\n<li>\n<p>When calling a method on a class, the method <em>highest</em> on the\nclass&rsquo;s inheritance chain takes precedence.</p>\n</li>\n<li>\n<p>Inside the body of a method, a call to <code>inner</code> looks for a method with\nthe same name in the nearest subclass along the inheritance chain\nbetween the class containing the <code>inner</code> and the class of <code>this</code>. 
If\nthere is no matching method, the <code>inner</code> call does nothing.</p>\n</li>\n</ul>\n<p>For example:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Doughnut</span> {\n  <span class=\"i\">cook</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Fry until golden brown.&quot;</span>;\n    <span class=\"i\">inner</span>();\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Place in a nice box.&quot;</span>;\n  }\n}\n\n<span class=\"k\">class</span> <span class=\"t\">BostonCream</span> &lt; <span class=\"t\">Doughnut</span> {\n  <span class=\"i\">cook</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Pipe full of custard and coat with chocolate.&quot;</span>;\n  }\n}\n\n<span class=\"t\">BostonCream</span>().<span class=\"i\">cook</span>();\n</pre></div>\n<p>This should print:</p>\n<div class=\"codehilite\"><pre>Fry until golden brown.\nPipe full of custard and coat with chocolate.\nPlace in a nice box.\n</pre></div>\n<p>Since clox is about not just implementing Lox, but doing so with good\nperformance, this time around try to solve the challenge with an eye towards\nefficiency.</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"optimization.html\" class=\"next\">\n  Next Chapter: &ldquo;Optimization&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/the-lox-language.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>The Lox Language &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">The Lox Language<small>3</small></a></h3>\n\n<ul>\n    <li><a href=\"#hello-lox\"><small>3.1</small> Hello, Lox</a></li>\n    <li><a href=\"#a-high-level-language\"><small>3.2</small> A High-Level Language</a></li>\n    <li><a href=\"#data-types\"><small>3.3</small> Data Types</a></li>\n    <li><a href=\"#expressions\"><small>3.4</small> Expressions</a></li>\n    <li><a href=\"#statements\"><small>3.5</small> Statements</a></li>\n    <li><a href=\"#variables\"><small>3.6</small> Variables</a></li>\n    <li><a 
href=\"#control-flow\"><small>3.7</small> Control Flow</a></li>\n    <li><a href=\"#functions\"><small>3.8</small> Functions</a></li>\n    <li><a href=\"#classes\"><small>3.9</small> Classes</a></li>\n    <li><a href=\"#the-standard-library\"><small>3.10</small> The Standard Library</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Expressions and Statements</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"a-map-of-the-territory.html\" title=\"A Map of the Territory\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"welcome.html\" title=\"Welcome\">&uarr;&nbsp;Up</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"a-map-of-the-territory.html\" title=\"A Map of the Territory\" class=\"prev\">←</a>\n<a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">The Lox Language<small>3</small></a></h3>\n\n<ul>\n    <li><a href=\"#hello-lox\"><small>3.1</small> Hello, Lox</a></li>\n    <li><a href=\"#a-high-level-language\"><small>3.2</small> A High-Level Language</a></li>\n    <li><a href=\"#data-types\"><small>3.3</small> Data Types</a></li>\n    <li><a href=\"#expressions\"><small>3.4</small> Expressions</a></li>\n    <li><a href=\"#statements\"><small>3.5</small> Statements</a></li>\n    <li><a href=\"#variables\"><small>3.6</small> Variables</a></li>\n    <li><a href=\"#control-flow\"><small>3.7</small> Control Flow</a></li>\n    <li><a 
href=\"#functions\"><small>3.8</small> Functions</a></li>\n    <li><a href=\"#classes\"><small>3.9</small> Classes</a></li>\n    <li><a href=\"#the-standard-library\"><small>3.10</small> The Standard Library</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n    <li class=\"end-part\"><a href=\"#design-note\"><small>note</small>Expressions and Statements</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"a-map-of-the-territory.html\" title=\"A Map of the Territory\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"welcome.html\" title=\"Welcome\">&uarr;&nbsp;Up</a>\n    <a href=\"a-tree-walk-interpreter.html\" title=\"A Tree-Walk Interpreter\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">3</div>\n  <h1>The Lox Language</h1>\n\n<blockquote>\n<p>What nicer thing can you do for somebody than make them breakfast?</p>\n<p><cite>Anthony Bourdain</cite></p>\n</blockquote>\n<p>We&rsquo;ll spend the rest of this book illuminating every dark and sundry corner of\nthe Lox language, but it seems cruel to have you immediately start grinding out\ncode for the interpreter without at least a glimpse of what we&rsquo;re going to end\nup with.</p>\n<p>At the same time, I don&rsquo;t want to drag you through reams of language lawyering\nand specification-ese before you get to touch your text <span\nname=\"home\">editor</span>. So this will be a gentle, friendly introduction to\nLox. It will leave out a lot of details and edge cases. We&rsquo;ve got plenty of time\nfor those later.</p>\n<aside name=\"home\">\n<p>A tutorial isn&rsquo;t very fun if you can&rsquo;t try the code out yourself. Alas, you\ndon&rsquo;t have a Lox interpreter yet, since you haven&rsquo;t built one!</p>\n<p>Fear not. 
You can use <a href=\"https://github.com/munificent/craftinginterpreters\">mine</a>.</p>\n</aside>\n<h2><a href=\"#hello-lox\" id=\"hello-lox\"><small>3&#8202;.&#8202;1</small>Hello, Lox</a></h2>\n<p>Here&rsquo;s your very first taste of <span name=\"salmon\">Lox</span>:</p>\n<aside name=\"salmon\">\n<p>Your first taste of Lox, the language, that is. I don&rsquo;t know if you&rsquo;ve ever had\nthe cured, cold-smoked salmon before. If not, give it a try too.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"c\">// Your first Lox program!</span>\n<span class=\"k\">print</span> <span class=\"s\">&quot;Hello, world!&quot;</span>;\n</pre></div>\n<p>As that <code>//</code> line comment and the trailing semicolon imply, Lox&rsquo;s syntax is a\nmember of the C family. (There are no parentheses around the string because\n<code>print</code> is a built-in statement, and not a library function.)</p>\n<p>Now, I won&rsquo;t claim that <span name=\"c\">C</span> has a <em>great</em> syntax. If we\nwanted something elegant, we&rsquo;d probably mimic Pascal or Smalltalk. If we wanted\nto go full Scandinavian-furniture-minimalism, we&rsquo;d do a Scheme. Those all have\ntheir virtues.</p>\n<aside name=\"c\">\n<p>I&rsquo;m surely biased, but I think Lox&rsquo;s syntax is pretty clean. C&rsquo;s most egregious\ngrammar problems are around types. Dennis Ritchie had this idea called\n&ldquo;<a href=\"http://softwareengineering.stackexchange.com/questions/117024/why-was-the-c-syntax-for-arrays-pointers-and-functions-designed-this-way\">declaration reflects use</a>&rdquo;, where variable declarations mirror the\noperations you would have to perform on the variable to get to a value of the\nbase type. Clever idea, but I don&rsquo;t think it worked out great in practice.</p>\n<p>Lox doesn&rsquo;t have static types, so we avoid that.</p>\n</aside>\n<p>What C-like syntax has instead is something you&rsquo;ll often find more valuable\nin a language: <em>familiarity</em>. 
I know you are already comfortable with that style\nbecause the two languages we&rsquo;ll be using to <em>implement</em> Lox<span class=\"em\">&mdash;</span>Java and C<span class=\"em\">&mdash;</span>also inherit it. Using a similar syntax for Lox gives you one less thing to\nlearn.</p>\n<h2><a href=\"#a-high-level-language\" id=\"a-high-level-language\"><small>3&#8202;.&#8202;2</small>A High-Level Language</a></h2>\n<p>While this book ended up bigger than I was hoping, it&rsquo;s still not big enough to\nfit a huge language like Java in it. In order to fit two complete\nimplementations of Lox in these pages, Lox itself has to be pretty compact.</p>\n<p>When I think of languages that are small but useful, what comes to mind are\nhigh-level &ldquo;scripting&rdquo; languages like <span name=\"js\">JavaScript</span>, Scheme,\nand Lua. Of those three, Lox looks most like JavaScript, mainly because most\nC-syntax languages do. As we&rsquo;ll learn later, Lox&rsquo;s approach to scoping hews\nclosely to Scheme. The C flavor of Lox we&rsquo;ll build in <a href=\"a-bytecode-virtual-machine.html\">Part III</a> is heavily\nindebted to Lua&rsquo;s clean, efficient implementation.</p>\n<aside name=\"js\">\n<p>Now that JavaScript has taken over the world and is used to build ginormous\napplications, it&rsquo;s hard to think of it as a &ldquo;little scripting language&rdquo;. But\nBrendan Eich hacked the first JS interpreter into Netscape Navigator in <em>ten\ndays</em> to make buttons animate on web pages. JavaScript has grown up since then,\nbut it was once a cute little language.</p>\n<p>Because Eich slapped JS together with roughly the same raw materials and time as\nan episode of MacGyver, it has some weird semantic corners where the duct tape\nand paper clips show through. 
Things like variable hoisting, dynamically bound\n<code>this</code>, holes in arrays, and implicit conversions.</p>\n<p>I had the luxury of taking my time on Lox, so it should be a little cleaner.</p>\n</aside>\n<p>Lox shares two other aspects with those three languages:</p>\n<h3><a href=\"#dynamic-typing\" id=\"dynamic-typing\"><small>3&#8202;.&#8202;2&#8202;.&#8202;1</small>Dynamic typing</a></h3>\n<p>Lox is dynamically typed. Variables can store values of any type, and a single\nvariable can even store values of different types at different times. If you try\nto perform an operation on values of the wrong type<span class=\"em\">&mdash;</span>say, dividing a number by\na string<span class=\"em\">&mdash;</span>then the error is detected and reported at runtime.</p>\n<p>There are plenty of reasons to like <span name=\"static\">static</span> types, but\nthey don&rsquo;t outweigh the pragmatic reasons to pick dynamic types for Lox. A\nstatic type system is a ton of work to learn and implement. Skipping it gives\nyou a simpler language and a shorter book. We&rsquo;ll get our interpreter up and\nexecuting bits of code sooner if we defer our type checking to runtime.</p>\n<aside name=\"static\">\n<p>After all, the two languages we&rsquo;ll be using to <em>implement</em> Lox are both\nstatically typed.</p>\n</aside>\n<h3><a href=\"#automatic-memory-management\" id=\"automatic-memory-management\"><small>3&#8202;.&#8202;2&#8202;.&#8202;2</small>Automatic memory management</a></h3>\n<p>High-level languages exist to eliminate error-prone, low-level drudgery, and what\ncould be more tedious than manually managing the allocation and freeing of\nstorage? 
No one rises and greets the morning sun with, &ldquo;I can&rsquo;t wait to figure\nout the correct place to call <code>free()</code> for every byte of memory I allocate\ntoday!&rdquo;</p>\n<p>There are two main <span name=\"gc\">techniques</span> for managing memory:\n<strong>reference counting</strong> and <strong>tracing garbage collection</strong> (usually just called\n<strong>garbage collection</strong> or <strong>GC</strong>). Ref counters are much simpler to implement<span class=\"em\">&mdash;</span>I think that&rsquo;s why Perl, PHP, and Python all started out using them. But, over\ntime, the limitations of ref counting become too troublesome. All of those\nlanguages eventually ended up adding a full tracing GC, or at least enough of\none to clean up object cycles.</p>\n<aside name=\"gc\">\n<p>In practice, ref counting and tracing are more ends of a continuum than\nopposing sides. Most ref counting systems end up doing some tracing to handle\ncycles, and the write barriers of a generational collector look a bit like\nretain calls if you squint.</p>\n<p>For lots more on this, see &ldquo;<a href=\"https://researcher.watson.ibm.com/researcher/files/us-bacon/Bacon04Unified.pdf\">A Unified Theory of Garbage Collection</a>&rdquo; (PDF).</p>\n</aside>\n<p>Tracing garbage collection has a fearsome reputation. It <em>is</em> a little harrowing\nworking at the level of raw memory. Debugging a GC can sometimes leave you\nseeing hex dumps in your dreams. But, remember, this book is about dispelling\nmagic and slaying those monsters, so we <em>are</em> going to write our own garbage\ncollector. I think you&rsquo;ll find the algorithm is quite simple and a lot of fun to\nimplement.</p>\n<h2><a href=\"#data-types\" id=\"data-types\"><small>3&#8202;.&#8202;3</small>Data Types</a></h2>\n<p>In Lox&rsquo;s little universe, the atoms that make up all matter are the built-in\ndata types. 
There are only a few:</p>\n<ul>\n<li>\n<p><strong><span name=\"bool\">Booleans</span>.</strong> You can&rsquo;t code without logic and you\ncan&rsquo;t logic without Boolean values. &ldquo;True&rdquo; and &ldquo;false&rdquo;, the yin and yang of\nsoftware. Unlike some ancient languages that repurpose an existing type to\nrepresent truth and falsehood, Lox has a dedicated Boolean type. We may\nbe roughing it on this expedition, but we aren&rsquo;t <em>savages</em>.</p>\n<aside name=\"bool\">\n<p>Booleans are the only data type in Lox named after a person, George\nBoole, which is why &ldquo;Boolean&rdquo; is capitalized. He died in 1864, nearly a\ncentury before digital computers turned his algebra into electricity. I\nwonder what he&rsquo;d think to see his name all over billions of lines of Java\ncode.</p>\n</aside>\n<p>There are two Boolean values, obviously, and a literal for each one.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">true</span>;  <span class=\"c\">// Not false.</span>\n<span class=\"k\">false</span>; <span class=\"c\">// Not *not* false.</span>\n</pre></div>\n</li>\n<li>\n<p><strong>Numbers.</strong> Lox has only one kind of number: double-precision floating\npoint. Since floating-point numbers can also represent a wide range of\nintegers, that covers a lot of territory, while keeping things simple.</p>\n<p>Full-featured languages have lots of syntax for numbers<span class=\"em\">&mdash;</span>hexadecimal,\nscientific notation, octal, all sorts of fun stuff. 
We&rsquo;ll settle for basic\ninteger and decimal literals.</p>\n<div class=\"codehilite\"><pre><span class=\"n\">1234</span>;  <span class=\"c\">// An integer.</span>\n<span class=\"n\">12.34</span>; <span class=\"c\">// A decimal number.</span>\n</pre></div>\n</li>\n<li>\n<p><strong>Strings.</strong> We&rsquo;ve already seen one string literal in the first example.\nLike most languages, they are enclosed in double quotes.</p>\n<div class=\"codehilite\"><pre><span class=\"s\">&quot;I am a string&quot;</span>;\n<span class=\"s\">&quot;&quot;</span>;    <span class=\"c\">// The empty string.</span>\n<span class=\"s\">&quot;123&quot;</span>; <span class=\"c\">// This is a string, not a number.</span>\n</pre></div>\n<p>As we&rsquo;ll see when we get to implementing them, there is quite a lot of\ncomplexity hiding in that innocuous sequence of <span\nname=\"char\">characters</span>.</p>\n<aside name=\"char\">\n<p>Even that word &ldquo;character&rdquo; is a trickster. Is it ASCII? Unicode? A\ncode point or a &ldquo;grapheme cluster&rdquo;? How are characters encoded? Is each\ncharacter a fixed size, or can they vary?</p>\n</aside></li>\n<li>\n<p><strong>Nil.</strong> There&rsquo;s one last built-in value who&rsquo;s never invited to the party\nbut always seems to show up. It represents &ldquo;no value&rdquo;. It&rsquo;s called &ldquo;null&rdquo; in\nmany other languages. In Lox we spell it <code>nil</code>. (When we get to implementing\nit, that will help distinguish when we&rsquo;re talking about Lox&rsquo;s <code>nil</code> versus\nJava or C&rsquo;s <code>null</code>.)</p>\n<p>There are good arguments for not having a null value in a language since\nnull pointer errors are the scourge of our industry. If we were doing a\nstatically typed language, it would be worth trying to ban it. 
In a\ndynamically typed one, though, eliminating it is often more annoying\nthan having it.</p>\n</li>\n</ul>\n<h2><a href=\"#expressions\" id=\"expressions\"><small>3&#8202;.&#8202;4</small>Expressions</a></h2>\n<p>If built-in data types and their literals are atoms, then <strong>expressions</strong> must\nbe the molecules. Most of these will be familiar.</p>\n<h3><a href=\"#arithmetic\" id=\"arithmetic\"><small>3&#8202;.&#8202;4&#8202;.&#8202;1</small>Arithmetic</a></h3>\n<p>Lox features the basic arithmetic operators you know and love from C and other\nlanguages:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">add</span> + <span class=\"i\">me</span>;\n<span class=\"i\">subtract</span> - <span class=\"i\">me</span>;\n<span class=\"i\">multiply</span> * <span class=\"i\">me</span>;\n<span class=\"i\">divide</span> / <span class=\"i\">me</span>;\n</pre></div>\n<p>The subexpressions on either side of the operator are <strong>operands</strong>. Because\nthere are <em>two</em> of them, these are called <strong>binary</strong> operators. (It has nothing\nto do with the ones-and-zeroes use of &ldquo;binary&rdquo;.) Because the operator is <span\nname=\"fixity\">fixed</span> <em>in</em> the middle of the operands, these are also\ncalled <strong>infix</strong> operators (as opposed to <strong>prefix</strong> operators where the\noperator comes before the operands, and <strong>postfix</strong> where it comes after).</p>\n<aside name=\"fixity\">\n<p>There are some operators that have more than two operands and the operators are\ninterleaved between them. The only one in wide usage is the &ldquo;conditional&rdquo; or\n&ldquo;ternary&rdquo; operator of C and friends:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">condition</span> ? <span class=\"i\">thenArm</span> : <span class=\"i\">elseArm</span>;\n</pre></div>\n<p>Some call these <strong>mixfix</strong> operators. 
A few languages let you define your own\noperators and control how they are positioned<span class=\"em\">&mdash;</span>their &ldquo;fixity&rdquo;.</p>\n</aside>\n<p>One arithmetic operator is actually <em>both</em> an infix and a prefix one. The <code>-</code>\noperator can also be used to negate a number.</p>\n<div class=\"codehilite\"><pre>-<span class=\"i\">negateMe</span>;\n</pre></div>\n<p>All of these operators work on numbers, and it&rsquo;s an error to pass any other\ntypes to them. The exception is the <code>+</code> operator<span class=\"em\">&mdash;</span>you can also pass it two\nstrings to concatenate them.</p>\n<h3><a href=\"#comparison-and-equality\" id=\"comparison-and-equality\"><small>3&#8202;.&#8202;4&#8202;.&#8202;2</small>Comparison and equality</a></h3>\n<p>Moving along, we have a few more operators that always return a Boolean result.\nWe can compare numbers (and only numbers), using Ye Olde Comparison Operators.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">less</span> &lt; <span class=\"i\">than</span>;\n<span class=\"i\">lessThan</span> &lt;= <span class=\"i\">orEqual</span>;\n<span class=\"i\">greater</span> &gt; <span class=\"i\">than</span>;\n<span class=\"i\">greaterThan</span> &gt;= <span class=\"i\">orEqual</span>;\n</pre></div>\n<p>We can test two values of any kind for equality or inequality.</p>\n<div class=\"codehilite\"><pre><span class=\"n\">1</span> == <span class=\"n\">2</span>;         <span class=\"c\">// false.</span>\n<span class=\"s\">&quot;cat&quot;</span> != <span class=\"s\">&quot;dog&quot;</span>; <span class=\"c\">// true.</span>\n</pre></div>\n<p>Even different types.</p>\n<div class=\"codehilite\"><pre><span class=\"n\">314</span> == <span class=\"s\">&quot;pi&quot;</span>; <span class=\"c\">// false.</span>\n</pre></div>\n<p>Values of different types are <em>never</em> equivalent.</p>\n<div class=\"codehilite\"><pre><span class=\"n\">123</span> == <span class=\"s\">&quot;123&quot;</span>; <span 
class=\"c\">// false.</span>\n</pre></div>\n<p>I&rsquo;m generally against implicit conversions.</p>\n<h3><a href=\"#logical-operators\" id=\"logical-operators\"><small>3&#8202;.&#8202;4&#8202;.&#8202;3</small>Logical operators</a></h3>\n<p>The not operator, a prefix <code>!</code>, returns <code>false</code> if its operand is true, and vice\nversa.</p>\n<div class=\"codehilite\"><pre>!<span class=\"k\">true</span>;  <span class=\"c\">// false.</span>\n!<span class=\"k\">false</span>; <span class=\"c\">// true.</span>\n</pre></div>\n<p>The other two logical operators really are control flow constructs in the guise\nof expressions. An <span name=\"and\"><code>and</code></span> expression determines if two\nvalues are <em>both</em> true. It returns the left operand if it&rsquo;s false, or the\nright operand otherwise.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">true</span> <span class=\"k\">and</span> <span class=\"k\">false</span>; <span class=\"c\">// false.</span>\n<span class=\"k\">true</span> <span class=\"k\">and</span> <span class=\"k\">true</span>;  <span class=\"c\">// true.</span>\n</pre></div>\n<p>And an <code>or</code> expression determines if <em>either</em> of two values (or both) are true.\nIt returns the left operand if it is true and the right operand otherwise.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">false</span> <span class=\"k\">or</span> <span class=\"k\">false</span>; <span class=\"c\">// false.</span>\n<span class=\"k\">true</span> <span class=\"k\">or</span> <span class=\"k\">false</span>;  <span class=\"c\">// true.</span>\n</pre></div>\n<aside name=\"and\">\n<p>I used <code>and</code> and <code>or</code> for these instead of <code>&amp;&amp;</code> and <code>||</code> because Lox doesn&rsquo;t use\n<code>&amp;</code> and <code>|</code> for bitwise operators. 
It felt weird to introduce the\ndouble-character forms without the single-character ones.</p>\n<p>I also kind of like using words for these since they are really control flow\nstructures and not simple operators.</p>\n</aside>\n<p>The reason <code>and</code> and <code>or</code> are like control flow structures is that they\n<strong>short-circuit</strong>. Not only does <code>and</code> return the left operand if it is false,\nit doesn&rsquo;t even <em>evaluate</em> the right one in that case. Conversely\n(contrapositively?), if the left operand of an <code>or</code> is true, the right is\nskipped.</p>\n<h3><a href=\"#precedence-and-grouping\" id=\"precedence-and-grouping\"><small>3&#8202;.&#8202;4&#8202;.&#8202;4</small>Precedence and grouping</a></h3>\n<p>All of these operators have the same precedence and associativity that you&rsquo;d\nexpect coming from C. (When we get to parsing, we&rsquo;ll get <em>way</em> more precise\nabout that.) In cases where the precedence isn&rsquo;t what you want, you can use <code>()</code>\nto group stuff.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">average</span> = (<span class=\"i\">min</span> + <span class=\"i\">max</span>) / <span class=\"n\">2</span>;\n</pre></div>\n<p>Since they aren&rsquo;t very technically interesting, I&rsquo;ve cut the remainder of the\ntypical operator menagerie out of our little language. No bitwise, shift,\nmodulo, or conditional operators. I&rsquo;m not grading you, but you will get bonus\npoints in my heart if you augment your own implementation of Lox with them.</p>\n<p>Those are the expression forms (except for a couple related to specific features\nthat we&rsquo;ll get to later), so let&rsquo;s move up a level.</p>\n<h2><a href=\"#statements\" id=\"statements\"><small>3&#8202;.&#8202;5</small>Statements</a></h2>\n<p>Now we&rsquo;re at statements. 
Where an expression&rsquo;s main job is to produce a <em>value</em>,\na statement&rsquo;s job is to produce an <em>effect</em>. Since, by definition, statements\ndon&rsquo;t evaluate to a value, to be useful they have to otherwise change the world\nin some way<span class=\"em\">&mdash;</span>usually modifying some state, reading input, or producing output.</p>\n<p>You&rsquo;ve seen a couple of kinds of statements already. The first one was:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> <span class=\"s\">&quot;Hello, world!&quot;</span>;\n</pre></div>\n<p>A <span name=\"print\"><code>print</code> statement</span> evaluates a single expression\nand displays the result to the user. You&rsquo;ve also seen some statements like:</p>\n<aside name=\"print\">\n<p>Baking <code>print</code> into the language instead of just making it a core library\nfunction is a hack. But it&rsquo;s a <em>useful</em> hack for us: it means our in-progress\ninterpreter can start producing output before we&rsquo;ve implemented all of the\nmachinery required to define functions, look them up by name, and call them.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"s\">&quot;some expression&quot;</span>;\n</pre></div>\n<p>An expression followed by a semicolon (<code>;</code>) promotes the expression to\nstatement-hood. 
This is called (imaginatively enough), an <strong>expression\nstatement</strong>.</p>\n<p>If you want to pack a series of statements where a single one is expected, you\ncan wrap them up in a <strong>block</strong>.</p>\n<div class=\"codehilite\"><pre>{\n  <span class=\"k\">print</span> <span class=\"s\">&quot;One statement.&quot;</span>;\n  <span class=\"k\">print</span> <span class=\"s\">&quot;Two statements.&quot;</span>;\n}\n</pre></div>\n<p>Blocks also affect scoping, which leads us to the next section<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span></p>\n<h2><a href=\"#variables\" id=\"variables\"><small>3&#8202;.&#8202;6</small>Variables</a></h2>\n<p>You declare variables using <code>var</code> statements. If you <span\nname=\"omit\">omit</span> the initializer, the variable&rsquo;s value defaults to <code>nil</code>.</p>\n<aside name=\"omit\">\n<p>This is one of those cases where not having <code>nil</code> and forcing every variable to\nbe initialized to some value would be more annoying than dealing with <code>nil</code>\nitself.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">imAVariable</span> = <span class=\"s\">&quot;here is my value&quot;</span>;\n<span class=\"k\">var</span> <span class=\"i\">iAmNil</span>;\n</pre></div>\n<p>Once declared, you can, naturally, access and assign a variable using its name.</p>\n<p><span name=\"breakfast\"></span></p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">breakfast</span> = <span class=\"s\">&quot;bagels&quot;</span>;\n<span class=\"k\">print</span> <span class=\"i\">breakfast</span>; <span class=\"c\">// &quot;bagels&quot;.</span>\n<span class=\"i\">breakfast</span> = <span class=\"s\">&quot;beignets&quot;</span>;\n<span class=\"k\">print</span> <span class=\"i\">breakfast</span>; <span class=\"c\">// &quot;beignets&quot;.</span>\n</pre></div>\n<aside name=\"breakfast\">\n<p>Can you tell that I tend to work on this book 
in the morning before I&rsquo;ve had\nanything to eat?</p>\n</aside>\n<p>I won&rsquo;t get into the rules for variable scope here, because we&rsquo;re going to spend\na surprising amount of time in later chapters mapping every square inch of the\nrules. In most cases, it works like you would expect coming from C or Java.</p>\n<h2><a href=\"#control-flow\" id=\"control-flow\"><small>3&#8202;.&#8202;7</small>Control Flow</a></h2>\n<p>It&rsquo;s hard to write <span name=\"flow\">useful</span> programs if you can&rsquo;t skip\nsome code or execute some more than once. That means control flow. In addition\nto the logical operators we already covered, Lox lifts three statements straight\nfrom C.</p>\n<aside name=\"flow\">\n<p>We already have <code>and</code> and <code>or</code> for branching, and we <em>could</em> use recursion to\nrepeat code, so that&rsquo;s theoretically sufficient. It would be pretty awkward to\nprogram that way in an imperative-styled language, though.</p>\n<p>Scheme, on the other hand, has no built-in looping constructs. It <em>does</em> rely on\nrecursion for repetition. 
Smalltalk has no built-in branching constructs, and\nrelies on dynamic dispatch for selectively executing code.</p>\n</aside>\n<p>An <code>if</code> statement executes one of two statements based on some condition.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">if</span> (<span class=\"i\">condition</span>) {\n  <span class=\"k\">print</span> <span class=\"s\">&quot;yes&quot;</span>;\n} <span class=\"k\">else</span> {\n  <span class=\"k\">print</span> <span class=\"s\">&quot;no&quot;</span>;\n}\n</pre></div>\n<p>A <code>while</code> <span name=\"do\">loop</span> executes the body repeatedly as long as\nthe condition expression evaluates to true.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">1</span>;\n<span class=\"k\">while</span> (<span class=\"i\">a</span> &lt; <span class=\"n\">10</span>) {\n  <span class=\"k\">print</span> <span class=\"i\">a</span>;\n  <span class=\"i\">a</span> = <span class=\"i\">a</span> + <span class=\"n\">1</span>;\n}\n</pre></div>\n<aside name=\"do\">\n<p>I left <code>do while</code> loops out of Lox because they aren&rsquo;t that common and wouldn&rsquo;t\nteach you anything that you won&rsquo;t already learn from <code>while</code>. Go ahead and add\nit to your implementation if it makes you happy. It&rsquo;s your party.</p>\n</aside>\n<p>Finally, we have <code>for</code> loops.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">for</span> (<span class=\"k\">var</span> <span class=\"i\">a</span> = <span class=\"n\">1</span>; <span class=\"i\">a</span> &lt; <span class=\"n\">10</span>; <span class=\"i\">a</span> = <span class=\"i\">a</span> + <span class=\"n\">1</span>) {\n  <span class=\"k\">print</span> <span class=\"i\">a</span>;\n}\n</pre></div>\n<p>This loop does the same thing as the previous <code>while</code> loop. 
Most modern\nlanguages also have some sort of <span name=\"foreach\"><code>for-in</code></span> or\n<code>foreach</code> loop for explicitly iterating over various sequence types. In a real\nlanguage, that&rsquo;s nicer than the crude C-style <code>for</code> loop we got here. Lox keeps\nit basic.</p>\n<aside name=\"foreach\">\n<p>This is a concession I made because of how the implementation is split across\nchapters. A <code>for-in</code> loop needs some sort of dynamic dispatch in the iterator\nprotocol to handle different kinds of sequences, but we don&rsquo;t get that until\nafter we&rsquo;re done with control flow. We could circle back and add <code>for-in</code> loops\nlater, but I didn&rsquo;t think doing so would teach you anything super interesting.</p>\n</aside>\n<h2><a href=\"#functions\" id=\"functions\"><small>3&#8202;.&#8202;8</small>Functions</a></h2>\n<p>A function call expression looks the same as it does in C.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">makeBreakfast</span>(<span class=\"i\">bacon</span>, <span class=\"i\">eggs</span>, <span class=\"i\">toast</span>);\n</pre></div>\n<p>You can also call a function without passing anything to it.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">makeBreakfast</span>();\n</pre></div>\n<p>Unlike in, say, Ruby, the parentheses are mandatory in this case. If you leave them\noff, the name doesn&rsquo;t <em>call</em> the function, it just refers to it.</p>\n<p>A language isn&rsquo;t very fun if you can&rsquo;t define your own functions. In Lox, you do\nthat with <span name=\"fun\"><code>fun</code></span>.</p>\n<aside name=\"fun\">\n<p>I&rsquo;ve seen languages that use <code>fn</code>, <code>fun</code>, <code>func</code>, and <code>function</code>. 
I&rsquo;m still\nhoping to discover a <code>funct</code>, <code>functi</code>, or <code>functio</code> somewhere.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">printSum</span>(<span class=\"i\">a</span>, <span class=\"i\">b</span>) {\n  <span class=\"k\">print</span> <span class=\"i\">a</span> + <span class=\"i\">b</span>;\n}\n</pre></div>\n<p>Now&rsquo;s a good time to clarify some <span name=\"define\">terminology</span>. Some\npeople throw around &ldquo;parameter&rdquo; and &ldquo;argument&rdquo; like they are interchangeable\nand, to many, they are. We&rsquo;re going to spend a lot of time splitting the finest\nof downy hairs around semantics, so let&rsquo;s sharpen our words. From here on out:</p>\n<ul>\n<li>\n<p>An <strong>argument</strong> is an actual value you pass to a function when you call it.\nSo a function <em>call</em> has an <em>argument</em> list. Sometimes you hear <strong>actual\nparameter</strong> used for these.</p>\n</li>\n<li>\n<p>A <strong>parameter</strong> is a variable that holds the value of the argument inside\nthe body of the function. Thus, a function <em>declaration</em> has a <em>parameter</em>\nlist. Others call these <strong>formal parameters</strong> or simply <strong>formals</strong>.</p>\n</li>\n</ul>\n<aside name=\"define\">\n<p>Speaking of terminology, some statically typed languages like C make a\ndistinction between <em>declaring</em> a function and <em>defining</em> it. A declaration\nbinds the function&rsquo;s type to its name so that calls can be type-checked but does\nnot provide a body. A definition declares the function and also fills in the\nbody so that the function can be compiled.</p>\n<p>Since Lox is dynamically typed, this distinction isn&rsquo;t meaningful. A function\ndeclaration fully specifies the function including its body.</p>\n</aside>\n<p>The body of a function is always a block. 
Inside it, you can return a value\nusing a <code>return</code> statement.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">returnSum</span>(<span class=\"i\">a</span>, <span class=\"i\">b</span>) {\n  <span class=\"k\">return</span> <span class=\"i\">a</span> + <span class=\"i\">b</span>;\n}\n</pre></div>\n<p>If execution reaches the end of the block without hitting a <code>return</code>, it\n<span name=\"sneaky\">implicitly</span> returns <code>nil</code>.</p>\n<aside name=\"sneaky\">\n<p>See, I told you <code>nil</code> would sneak in when we weren&rsquo;t looking.</p>\n</aside>\n<h3><a href=\"#closures\" id=\"closures\"><small>3&#8202;.&#8202;8&#8202;.&#8202;1</small>Closures</a></h3>\n<p>Functions are <em>first class</em> in Lox, which just means they are real values that\nyou can get a reference to, store in variables, pass around, etc. This works:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">addPair</span>(<span class=\"i\">a</span>, <span class=\"i\">b</span>) {\n  <span class=\"k\">return</span> <span class=\"i\">a</span> + <span class=\"i\">b</span>;\n}\n\n<span class=\"k\">fun</span> <span class=\"i\">identity</span>(<span class=\"i\">a</span>) {\n  <span class=\"k\">return</span> <span class=\"i\">a</span>;\n}\n\n<span class=\"k\">print</span> <span class=\"i\">identity</span>(<span class=\"i\">addPair</span>)(<span class=\"n\">1</span>, <span class=\"n\">2</span>); <span class=\"c\">// Prints &quot;3&quot;.</span>\n</pre></div>\n<p>Since function declarations are statements, you can declare local functions\ninside another function.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">outerFunction</span>() {\n  <span class=\"k\">fun</span> <span class=\"i\">localFunction</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;I&#39;m local!&quot;</span>;\n  }\n\n  <span class=\"i\">localFunction</span>();\n}\n</pre></div>\n<p>If you 
combine local functions, first-class functions, and block scope, you run\ninto this interesting situation:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">fun</span> <span class=\"i\">returnFunction</span>() {\n  <span class=\"k\">var</span> <span class=\"i\">outside</span> = <span class=\"s\">&quot;outside&quot;</span>;\n\n  <span class=\"k\">fun</span> <span class=\"i\">inner</span>() {\n    <span class=\"k\">print</span> <span class=\"i\">outside</span>;\n  }\n\n  <span class=\"k\">return</span> <span class=\"i\">inner</span>;\n}\n\n<span class=\"k\">var</span> <span class=\"i\">fn</span> = <span class=\"i\">returnFunction</span>();\n<span class=\"i\">fn</span>();\n</pre></div>\n<p>Here, <code>inner()</code> accesses a local variable declared outside of its body in the\nsurrounding function. Is this kosher? Now that lots of languages have borrowed\nthis feature from Lisp, you probably know the answer is yes.</p>\n<p>For that to work, <code>inner()</code> has to &ldquo;hold on&rdquo; to references to any surrounding\nvariables that it uses so that they stay around even after the outer function\nhas returned. We call functions that do this <span\nname=\"closure\"><strong>closures</strong></span>. These days, the term is often used for <em>any</em>\nfirst-class function, though it&rsquo;s sort of a misnomer if the function doesn&rsquo;t\nhappen to close over any variables.</p>\n<aside name=\"closure\">\n<p>Peter J. Landin coined the term &ldquo;closure&rdquo;. Yes, he invented damn near half the\nterms in programming languages. Most of them came out of one incredible paper,\n&ldquo;<a href=\"https://homepages.inf.ed.ac.uk/wadler/papers/papers-we-love/landin-next-700.pdf\">The Next 700 Programming Languages</a>&rdquo;.</p>\n<p>In order to implement these kinds of functions, you need to create a data\nstructure that bundles together the function&rsquo;s code and the surrounding\nvariables it needs. 
He called this a &ldquo;closure&rdquo; because it <em>closes over</em> and\nholds on to the variables it needs.</p>\n</aside>\n<p>As you can imagine, implementing these adds some complexity because we can no\nlonger assume variable scope works strictly like a stack where local variables\nevaporate the moment the function returns. We&rsquo;re going to have a fun time\nlearning how to make these work correctly and efficiently.</p>\n<h2><a href=\"#classes\" id=\"classes\"><small>3&#8202;.&#8202;9</small>Classes</a></h2>\n<p>Since Lox has dynamic typing, lexical (roughly, &ldquo;block&rdquo;) scope, and closures,\nit&rsquo;s about halfway to being a functional language. But as you&rsquo;ll see, it&rsquo;s\n<em>also</em> about halfway to being an object-oriented language. Both paradigms have a\nlot going for them, so I thought it was worth covering some of each.</p>\n<p>Since classes have come under fire for not living up to their hype, let me first\nexplain why I put them into Lox and this book. There are really two questions:</p>\n<h3><a href=\"#why-might-any-language-want-to-be-object-oriented\" id=\"why-might-any-language-want-to-be-object-oriented\"><small>3&#8202;.&#8202;9&#8202;.&#8202;1</small>Why might any language want to be object oriented?</a></h3>\n<p>Now that object-oriented languages like Java have sold out and only play arena\nshows, it&rsquo;s not cool to like them anymore. Why would anyone make a <em>new</em>\nlanguage with objects? Isn&rsquo;t that like releasing music on 8-track?</p>\n<p>It is true that the &ldquo;all inheritance all the time&rdquo; binge of the &rsquo;90s produced\nsome monstrous class hierarchies, but <strong>object-oriented programming</strong> (<strong>OOP</strong>)\nis still pretty rad. Billions of lines of successful code have been written in\nOOP languages, shipping millions of apps to happy users. Likely a majority of\nworking programmers today are using an object-oriented language. 
They can&rsquo;t all\nbe <em>that</em> wrong.</p>\n<p>In particular, for a dynamically typed language, objects are pretty handy. We\nneed <em>some</em> way of defining compound data types to bundle blobs of stuff\ntogether.</p>\n<p>If we can also hang methods off of those, then we avoid the need to prefix all\nof our functions with the name of the data type they operate on to avoid\ncolliding with similar functions for different types. In, say, Racket, you end\nup having to name your functions like <code>hash-copy</code> (to copy a hash table) and\n<code>vector-copy</code> (to copy a vector) so that they don&rsquo;t step on each other. Methods\nare scoped to the object, so that problem goes away.</p>\n<h3><a href=\"#why-is-lox-object-oriented\" id=\"why-is-lox-object-oriented\"><small>3&#8202;.&#8202;9&#8202;.&#8202;2</small>Why is Lox object oriented?</a></h3>\n<p>I could claim objects are groovy but still out of scope for the book. Most\nprogramming language books, especially ones that try to implement a whole\nlanguage, leave objects out. To me, that means the topic isn&rsquo;t well covered.\nWith such a widespread paradigm, that omission makes me sad.</p>\n<p>Given how many of us spend all day <em>using</em> OOP languages, it seems like the\nworld could use a little documentation on how to <em>make</em> one. As you&rsquo;ll see, it\nturns out to be pretty interesting. Not as hard as you might fear, but not as\nsimple as you might presume, either.</p>\n<h3><a href=\"#classes-or-prototypes\" id=\"classes-or-prototypes\"><small>3&#8202;.&#8202;9&#8202;.&#8202;3</small>Classes or prototypes</a></h3>\n<p>When it comes to objects, there are actually two approaches to them, <a href=\"https://en.wikipedia.org/wiki/Class-based_programming\">classes</a>\nand <a href=\"https://en.wikipedia.org/wiki/Prototype-based_programming\">prototypes</a>. Classes came first, and are more common thanks to C++, Java,\nC#, and friends. 
Prototypes were a virtually forgotten offshoot until JavaScript\naccidentally took over the world.</p>\n<p>In class-based languages, there are two core concepts: instances and classes.\nInstances store the state for each object and have a reference to the instance&rsquo;s\nclass. Classes contain the methods and inheritance chain. To call a method on an\ninstance, there is always a level of indirection. You <span\nname=\"dispatch\">look</span> up the instance&rsquo;s class and then you find the method\n<em>there</em>:</p>\n<aside name=\"dispatch\">\n<p>In a statically typed language like C++, method lookup typically happens at\ncompile time based on the <em>static</em> type of the instance, giving you <strong>static\ndispatch</strong>. In contrast, <strong>dynamic dispatch</strong> looks up the class of the actual\ninstance object at runtime. This is how virtual methods in statically typed\nlanguages and all methods in a dynamically typed language like Lox work.</p>\n</aside><img src=\"image/the-lox-language/class-lookup.png\" alt=\"How fields and methods are looked up on classes and instances\" />\n<p>Prototype-based languages <span name=\"blurry\">merge</span> these two concepts.\nThere are only objects<span class=\"em\">&mdash;</span>no classes<span class=\"em\">&mdash;</span>and each individual object may contain\nstate and methods. Objects can directly inherit from each other (or &ldquo;delegate\nto&rdquo; in prototypal lingo):</p>\n<aside name=\"blurry\">\n<p>In practice the line between class-based and prototype-based languages blurs.\nJavaScript&rsquo;s &ldquo;constructor function&rdquo; notion <a href=\"http://gameprogrammingpatterns.com/prototype.html#what-about-javascript\">pushes you pretty hard</a>\ntowards defining class-like objects. 
Meanwhile, class-based Ruby is perfectly\nhappy to let you attach methods to individual instances.</p>\n</aside><img src=\"image/the-lox-language/prototype-lookup.png\" alt=\"How fields and methods are looked up in a prototypal system\" />\n<p>This means that in some ways prototypal languages are more fundamental than\nclasses. They are really neat to implement because they&rsquo;re <em>so</em> simple. Also,\nthey can express lots of unusual patterns that classes steer you away from.</p>\n<p>But I&rsquo;ve looked at a <em>lot</em> of code written in prototypal languages<span class=\"em\">&mdash;</span>including\n<a href=\"http://finch.stuffwithstuff.com/\">some of my own devising</a>. Do you know what people generally do with all\nof the power and flexibility of prototypes? <span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>They use them to reinvent\nclasses.</p>\n<p>I don&rsquo;t know <em>why</em> that is, but people naturally seem to prefer a class-based\n(Classic? Classy?) style. Prototypes <em>are</em> simpler in the language, but they\nseem to accomplish that only by <span name=\"waterbed\">pushing</span> the\ncomplexity onto the user. So, for Lox, we&rsquo;ll save our users the trouble and bake\nclasses right in.</p>\n<aside name=\"waterbed\">\n<p>Larry Wall, Perl&rsquo;s inventor/prophet, calls this the &ldquo;<a href=\"http://wiki.c2.com/?WaterbedTheory\">waterbed theory</a>&rdquo;. Some\ncomplexity is essential and cannot be eliminated. If you push it down in one\nplace, it swells up in another.</p>\n<p>Prototypal languages don&rsquo;t so much <em>eliminate</em> the complexity of classes as they\ndo make the <em>user</em> take that complexity by building their own class-like\nmetaprogramming libraries.</p>\n</aside>\n<h3><a href=\"#classes-in-lox\" id=\"classes-in-lox\"><small>3&#8202;.&#8202;9&#8202;.&#8202;4</small>Classes in Lox</a></h3>\n<p>Enough rationale, let&rsquo;s see what we actually have. 
Classes encompass a\nconstellation of features in most languages. For Lox, I&rsquo;ve selected what I think\nare the brightest stars. You declare a class and its methods like so:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Breakfast</span> {\n  <span class=\"i\">cook</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Eggs a-fryin&#39;!&quot;</span>;\n  }\n\n  <span class=\"i\">serve</span>(<span class=\"i\">who</span>) {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Enjoy your breakfast, &quot;</span> + <span class=\"i\">who</span> + <span class=\"s\">&quot;.&quot;</span>;\n  }\n}\n</pre></div>\n<p>The body of a class contains its methods. They look like function declarations\nbut without the <code>fun</code> <span name=\"method\">keyword</span>. When the class\ndeclaration is executed, Lox creates a class object and stores that in a\nvariable named after the class. Just like functions, classes are first class in\nLox.</p>\n<aside name=\"method\">\n<p>They are still just as fun, though.</p>\n</aside>\n<div class=\"codehilite\"><pre><span class=\"c\">// Store it in variables.</span>\n<span class=\"k\">var</span> <span class=\"i\">someVariable</span> = <span class=\"t\">Breakfast</span>;\n\n<span class=\"c\">// Pass it to functions.</span>\n<span class=\"i\">someFunction</span>(<span class=\"t\">Breakfast</span>);\n</pre></div>\n<p>Next, we need a way to create instances. We could add some sort of <code>new</code>\nkeyword, but to keep things simple, in Lox the class itself is a factory\nfunction for instances. 
Call a class like a function, and it produces a new\ninstance of itself.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">breakfast</span> = <span class=\"t\">Breakfast</span>();\n<span class=\"k\">print</span> <span class=\"i\">breakfast</span>; <span class=\"c\">// &quot;Breakfast instance&quot;.</span>\n</pre></div>\n<h3><a href=\"#instantiation-and-initialization\" id=\"instantiation-and-initialization\"><small>3&#8202;.&#8202;9&#8202;.&#8202;5</small>Instantiation and initialization</a></h3>\n<p>Classes that only have behavior aren&rsquo;t super useful. The idea behind\nobject-oriented programming is encapsulating behavior <em>and state</em> together. To\ndo that, you need fields. Lox, like other dynamically typed languages, lets you\nfreely add properties onto objects.</p>\n<div class=\"codehilite\"><pre><span class=\"i\">breakfast</span>.<span class=\"i\">meat</span> = <span class=\"s\">&quot;sausage&quot;</span>;\n<span class=\"i\">breakfast</span>.<span class=\"i\">bread</span> = <span class=\"s\">&quot;sourdough&quot;</span>;\n</pre></div>\n<p>Assigning to a field creates it if it doesn&rsquo;t already exist.</p>\n<p>If you want to access a field or method on the current object from within a\nmethod, you use good old <code>this</code>.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Breakfast</span> {\n  <span class=\"i\">serve</span>(<span class=\"i\">who</span>) {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;Enjoy your &quot;</span> + <span class=\"k\">this</span>.<span class=\"i\">meat</span> + <span class=\"s\">&quot; and &quot;</span> +\n        <span class=\"k\">this</span>.<span class=\"i\">bread</span> + <span class=\"s\">&quot;, &quot;</span> + <span class=\"i\">who</span> + <span class=\"s\">&quot;.&quot;</span>;\n  }\n\n  <span class=\"c\">// ...</span>\n}\n</pre></div>\n<p>Part of encapsulating data within an object is ensuring the object is in a valid\nstate 
when it&rsquo;s created. To do that, you can define an initializer. If your\nclass has a method named <code>init()</code>, it is called automatically when the object is\nconstructed. Any arguments passed to the class are forwarded to its\ninitializer.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Breakfast</span> {\n  <span class=\"i\">init</span>(<span class=\"i\">meat</span>, <span class=\"i\">bread</span>) {\n    <span class=\"k\">this</span>.<span class=\"i\">meat</span> = <span class=\"i\">meat</span>;\n    <span class=\"k\">this</span>.<span class=\"i\">bread</span> = <span class=\"i\">bread</span>;\n  }\n\n  <span class=\"c\">// ...</span>\n}\n\n<span class=\"k\">var</span> <span class=\"i\">baconAndToast</span> = <span class=\"t\">Breakfast</span>(<span class=\"s\">&quot;bacon&quot;</span>, <span class=\"s\">&quot;toast&quot;</span>);\n<span class=\"i\">baconAndToast</span>.<span class=\"i\">serve</span>(<span class=\"s\">&quot;Dear Reader&quot;</span>);\n<span class=\"c\">// &quot;Enjoy your bacon and toast, Dear Reader.&quot;</span>\n</pre></div>\n<h3><a href=\"#inheritance\" id=\"inheritance\"><small>3&#8202;.&#8202;9&#8202;.&#8202;6</small>Inheritance</a></h3>\n<p>Every object-oriented language lets you not only define methods, but reuse them\nacross multiple classes or objects. For that, Lox supports single inheritance.\nWhen you declare a class, you can specify a class that it inherits from using a less-than\n<span name=\"less\">(<code>&lt;</code>)</span> operator.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Brunch</span> &lt; <span class=\"t\">Breakfast</span> {\n  <span class=\"i\">drink</span>() {\n    <span class=\"k\">print</span> <span class=\"s\">&quot;How about a Bloody Mary?&quot;</span>;\n  }\n}\n</pre></div>\n<aside name=\"less\">\n<p>Why the <code>&lt;</code> operator? I didn&rsquo;t feel like introducing a new keyword like\n<code>extends</code>. 
Lox doesn&rsquo;t use <code>:</code> for anything else so I didn&rsquo;t want to reserve\nthat either. Instead, I took a page from Ruby and used <code>&lt;</code>.</p>\n<p>If you know any type theory, you&rsquo;ll notice it&rsquo;s not a <em>totally</em> arbitrary\nchoice. Every instance of a subclass is an instance of its superclass too, but\nthere may be instances of the superclass that are not instances of the subclass.\nThat means, in the universe of objects, the set of subclass objects is smaller\nthan the superclass&rsquo;s set, though type nerds usually use <code>&lt;:</code> for that relation.</p>\n</aside>\n<p>Here, Brunch is the <strong>derived class</strong> or <strong>subclass</strong>, and Breakfast is the\n<strong>base class</strong> or <strong>superclass</strong>.</p>\n<p>Every method defined in the superclass is also available to its subclasses.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">var</span> <span class=\"i\">benedict</span> = <span class=\"t\">Brunch</span>(<span class=\"s\">&quot;ham&quot;</span>, <span class=\"s\">&quot;English muffin&quot;</span>);\n<span class=\"i\">benedict</span>.<span class=\"i\">serve</span>(<span class=\"s\">&quot;Noble Reader&quot;</span>);\n</pre></div>\n<p>Even the <code>init()</code> method gets <span name=\"init\">inherited</span>. In practice,\nthe subclass usually wants to define its own <code>init()</code> method too. But the\noriginal one also needs to be called so that the superclass can maintain its\nstate. 
We need some way to call a method on our own <em>instance</em> without hitting\nour own <em>methods</em>.</p>\n<aside name=\"init\">\n<p>Lox is different from C++, Java, and C#, which do not inherit constructors, but\nsimilar to Smalltalk and Ruby, which do.</p>\n</aside>\n<p>As in Java, you use <code>super</code> for that.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">class</span> <span class=\"t\">Brunch</span> &lt; <span class=\"t\">Breakfast</span> {\n  <span class=\"i\">init</span>(<span class=\"i\">meat</span>, <span class=\"i\">bread</span>, <span class=\"i\">drink</span>) {\n    <span class=\"k\">super</span>.<span class=\"i\">init</span>(<span class=\"i\">meat</span>, <span class=\"i\">bread</span>);\n    <span class=\"k\">this</span>.<span class=\"i\">drink</span> = <span class=\"i\">drink</span>;\n  }\n}\n</pre></div>\n<p>That&rsquo;s about it for object orientation. I tried to keep the feature set minimal.\nThe structure of the book did force one compromise. Lox is not a <em>pure</em>\nobject-oriented language. In a true OOP language every object is an instance of\na class, even primitive values like numbers and Booleans.</p>\n<p>Because we don&rsquo;t implement classes until well after we start working with the\nbuilt-in types, that would have been hard. So values of primitive types aren&rsquo;t\nreal objects in the sense of being instances of classes. They don&rsquo;t have methods\nor properties. If I were trying to make Lox a real language for real users, I\nwould fix that.</p>\n<h2><a href=\"#the-standard-library\" id=\"the-standard-library\"><small>3&#8202;.&#8202;10</small>The Standard Library</a></h2>\n<p>We&rsquo;re almost done. 
That&rsquo;s the whole language, so all that&rsquo;s left is the &ldquo;core&rdquo;\nor &ldquo;standard&rdquo; library<span class=\"em\">&mdash;</span>the set of functionality that is implemented directly\nin the interpreter and that all user-defined behavior is built on top of.</p>\n<p>This is the saddest part of Lox. Its standard library goes beyond minimalism and\nveers close to outright nihilism. For the sample code in the book, we only need\nto demonstrate that code is running and doing what it&rsquo;s supposed to do. For\nthat, we already have the built-in <code>print</code> statement.</p>\n<p>Later, when we start optimizing, we&rsquo;ll write some benchmarks and see how long it\ntakes to execute code. That means we need to track time, so we&rsquo;ll define one\nbuilt-in function, <code>clock()</code>, that returns the number of seconds since the\nprogram started.</p>\n<p>And<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>that&rsquo;s it. I know, right? It&rsquo;s embarrassing.</p>\n<p>If you wanted to turn Lox into an actual useful language, the very first thing\nyou should do is flesh this out. String manipulation, trigonometric functions,\nfile I/O, networking, heck, even <em>reading input from the user</em> would help. But we\ndon&rsquo;t need any of that for this book, and adding it wouldn&rsquo;t teach you anything\ninteresting, so I&rsquo;ve left it out.</p>\n<p>Don&rsquo;t worry, we&rsquo;ll have plenty of exciting stuff in the language itself to keep\nus busy.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>Write some sample Lox programs and run them (you can use the implementations\nof Lox in <a href=\"https://github.com/munificent/craftinginterpreters\">my repository</a>). Try to come up with edge case behavior I\ndidn&rsquo;t specify here. Does it do what you expect? Why or why not?</p>\n</li>\n<li>\n<p>This informal introduction leaves a <em>lot</em> unspecified. 
List several open\nquestions you have about the language&rsquo;s syntax and semantics. What do you\nthink the answers should be?</p>\n</li>\n<li>\n<p>Lox is a pretty tiny language. What features do you think it is missing that\nwould make it annoying to use for real programs? (Aside from the standard\nlibrary, of course.)</p>\n</li>\n</ol>\n</div>\n<div class=\"design-note\">\n<h2><a href=\"#design-note\" id=\"design-note\">Design Note: Expressions and Statements</a></h2>\n<p>Lox has both expressions and statements. Some languages omit the latter.\nInstead, they treat declarations and control flow constructs as expressions too.\nThese &ldquo;everything is an expression&rdquo; languages tend to have functional pedigrees\nand include most Lisps, SML, Haskell, Ruby, and CoffeeScript.</p>\n<p>To do that, for each &ldquo;statement-like&rdquo; construct in the language, you need to\ndecide what value it evaluates to. Some of those are easy:</p>\n<ul>\n<li>\n<p>An <code>if</code> expression evaluates to the result of whichever branch is chosen.\nLikewise, a <code>switch</code> or other multi-way branch evaluates to whichever case\nis picked.</p>\n</li>\n<li>\n<p>A variable declaration evaluates to the value of the variable.</p>\n</li>\n<li>\n<p>A block evaluates to the result of the last expression in the sequence.</p>\n</li>\n</ul>\n<p>Some get a little stranger. What should a loop evaluate to? A <code>while</code> loop in\nCoffeeScript evaluates to an array containing each element that the body\nevaluated to. That can be handy, or a waste of memory if you don&rsquo;t need the\narray.</p>\n<p>You also have to decide how these statement-like expressions compose with other\nexpressions<span class=\"em\">&mdash;</span>you have to fit them into the grammar&rsquo;s precedence table. 
For\nexample, Ruby allows:</p>\n<div class=\"codehilite\"><pre><span class=\"i\">puts</span> <span class=\"n\">1</span> + <span class=\"k\">if</span> <span class=\"k\">true</span> <span class=\"k\">then</span> <span class=\"n\">2</span> <span class=\"k\">else</span> <span class=\"n\">3</span> <span class=\"k\">end</span> + <span class=\"n\">4</span>\n</pre></div>\n<p>Is this what you&rsquo;d expect? Is it what your <em>users</em> expect? How does this affect\nhow you design the syntax for your &ldquo;statements&rdquo;? Note that Ruby has an explicit\n<code>end</code> to tell when the <code>if</code> expression is complete. Without it, the <code>+ 4</code> would\nlikely be parsed as part of the <code>else</code> clause.</p>\n<p>Turning every statement into an expression forces you to answer a few hairy\nquestions like that. In return, you eliminate some redundancy. C has both blocks\nfor sequencing statements, and the comma operator for sequencing expressions. It\nhas both the <code>if</code> statement and the <code>?:</code> conditional operator. If everything were\nan expression in C, you could unify each of those.</p>\n<p>Languages that do away with statements usually also feature <strong>implicit returns</strong><span class=\"em\">&mdash;</span>a function automatically returns whatever value its body evaluates to without\nneed for some explicit <code>return</code> syntax. For small functions and methods, this is\nreally handy. In fact, many languages that do have statements have added syntax\nlike <code>=&gt;</code> to be able to define functions whose body is the result of evaluating\na single expression.</p>\n<p>But making <em>all</em> functions work that way can be a little strange. If you aren&rsquo;t\ncareful, your function will leak a return value even if you only intend it to\nproduce a side effect. In practice, though, users of these languages don&rsquo;t find\nit to be a problem.</p>\n<p>For Lox, I gave it statements for prosaic reasons. 
I picked a C-like syntax for\nfamiliarity&rsquo;s sake, and trying to take the existing C statement syntax and\ninterpret it like expressions gets weird pretty fast.</p>\n</div>\n\n<footer>\n<a href=\"a-tree-walk-interpreter.html\" class=\"next\">\n  Next Part: &ldquo;A Tree-Walk Interpreter&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/types-of-values.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Types of Values &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h3><a href=\"#top\">Types of Values<small>18</small></a></h3>\n\n<ul>\n    <li><a href=\"#tagged-unions\"><small>18.1</small> Tagged Unions</a></li>\n    <li><a href=\"#lox-values-and-c-values\"><small>18.2</small> Lox Values and C Values</a></li>\n    <li><a href=\"#dynamically-typed-numbers\"><small>18.3</small> Dynamically Typed Numbers</a></li>\n    <li><a href=\"#two-new-types\"><small>18.4</small> Two New Types</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a 
href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"compiling-expressions.html\" title=\"Compiling Expressions\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"strings.html\" title=\"Strings\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"compiling-expressions.html\" title=\"Compiling Expressions\" class=\"prev\">←</a>\n<a href=\"strings.html\" title=\"Strings\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h3><a href=\"#top\">Types of Values<small>18</small></a></h3>\n\n<ul>\n    <li><a href=\"#tagged-unions\"><small>18.1</small> Tagged Unions</a></li>\n    <li><a href=\"#lox-values-and-c-values\"><small>18.2</small> Lox Values and C Values</a></li>\n    <li><a href=\"#dynamically-typed-numbers\"><small>18.3</small> Dynamically Typed Numbers</a></li>\n    <li><a href=\"#two-new-types\"><small>18.4</small> Two New Types</a></li>\n    <li class=\"divider\"></li>\n    <li class=\"end-part\"><a href=\"#challenges\">Challenges</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"compiling-expressions.html\" title=\"Compiling Expressions\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"a-bytecode-virtual-machine.html\" title=\"A Bytecode Virtual Machine\">&uarr;&nbsp;Up</a>\n    <a href=\"strings.html\" title=\"Strings\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">18</div>\n  <h1>Types of Values</h1>\n\n<blockquote>\n<p>When you are a Bear of Very Little Brain, and you Think of 
Things, you find\nsometimes that a Thing which seemed very Thingish inside you is quite\ndifferent when it gets out into the open and has other people looking at it.</p>\n<p><cite>A. A. Milne, <em>Winnie-the-Pooh</em></cite></p>\n</blockquote>\n<p>The past few chapters were huge, packed full of complex techniques and pages of\ncode. In this chapter, there&rsquo;s only one new concept to learn and a scattering of\nstraightforward code. You&rsquo;ve earned a respite.</p>\n<p>Lox is <span name=\"unityped\">dynamically</span> typed. A single variable can\nhold a Boolean, number, or string at different points in time. At least, that&rsquo;s\nthe idea. Right now, in clox, all values are numbers. By the end of the chapter,\nit will also support Booleans and <code>nil</code>. While those aren&rsquo;t super interesting,\nthey force us to figure out how our value representation can dynamically handle\ndifferent types.</p>\n<aside name=\"unityped\">\n<p>There is a third category next to statically typed and dynamically typed:\n<strong>unityped</strong>. In that paradigm, all variables have a single type, usually a\nmachine register integer. Unityped languages aren&rsquo;t common today, but some\nForths and BCPL, the language that inspired C, worked like this.</p>\n<p>As of this moment, clox is unityped.</p>\n</aside>\n<h2><a href=\"#tagged-unions\" id=\"tagged-unions\"><small>18&#8202;.&#8202;1</small>Tagged Unions</a></h2>\n<p>The nice thing about working in C is that we can build our data structures from\nthe raw bits up. The bad thing is that we <em>have</em> to do that. C doesn&rsquo;t give you\nmuch for free at compile time and even less at runtime. As far as C is\nconcerned, the universe is an undifferentiated array of bytes. 
It&rsquo;s up to us to\ndecide how many of those bytes to use and what they mean.</p>\n<p>In order to choose a value representation, we need to answer two key questions:</p>\n<ol>\n<li>\n<p><strong>How do we represent the type of a value?</strong> If you try to, say, multiply a\nnumber by <code>true</code>, we need to detect that error at runtime and report it. In\norder to do that, we need to be able to tell what a value&rsquo;s type is.</p>\n</li>\n<li>\n<p><strong>How do we store the value itself?</strong> We need to not only be able to tell\nthat three is a number, but that it&rsquo;s different from the number four. I\nknow, seems obvious, right? But we&rsquo;re operating at a level where it&rsquo;s good\nto spell these things out.</p>\n</li>\n</ol>\n<p>Since we&rsquo;re not just designing this language but building it ourselves, when\nanswering these two questions we also have to keep in mind the implementer&rsquo;s\neternal quest: to do it <em>efficiently</em>.</p>\n<p>Language hackers over the years have come up with a variety of clever ways to\npack the above information into as few bits as possible. For now, we&rsquo;ll start\nwith the simplest, classic solution: a <strong>tagged union</strong>. A value contains two\nparts: a type &ldquo;tag&rdquo;, and a payload for the actual value. 
To store the value&rsquo;s\ntype, we define an enum for each kind of value the VM supports.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#include &quot;common.h&quot;\n\n</pre><div class=\"source-file\"><em>value.h</em></div>\n<pre class=\"insert\"><span class=\"k\">typedef</span> <span class=\"k\">enum</span> {\n  <span class=\"a\">VAL_BOOL</span>,\n  <span class=\"a\">VAL_NIL</span>,<span name=\"user-types\"> </span>\n  <span class=\"a\">VAL_NUMBER</span>,\n} <span class=\"t\">ValueType</span>;\n\n</pre><pre class=\"insert-after\">typedef double Value;\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em></div>\n\n<aside name=\"user-types\">\n<p>The cases here cover each kind of value that has <em>built-in support in the VM</em>.\nWhen we get to adding classes to the language, each class the user defines\ndoesn&rsquo;t need its own entry in this enum. As far as the VM is concerned, every\ninstance of a class is the same type: &ldquo;instance&rdquo;.</p>\n<p>In other words, this is the VM&rsquo;s notion of &ldquo;type&rdquo;, not the user&rsquo;s.</p>\n</aside>\n<p>For now, we have only a couple of cases, but this will grow as we add strings,\nfunctions, and classes to clox. In addition to the type, we also need to store\nthe data for the value<span class=\"em\">&mdash;</span>the <code>double</code> for a number, <code>true</code> or <code>false</code> for a\nBoolean. We could define a struct with fields for each possible type.</p><img src=\"image/types-of-values/struct.png\" alt=\"A struct with two fields laid next to each other in memory.\" />\n<p>But this is a waste of memory. A value can&rsquo;t simultaneously be both a number and\na Boolean. So at any point in time, only one of those fields will be used. C\nlets you optimize this by defining a <span name=\"sum\">union</span>. 
A union\nlooks like a struct except that all of its fields overlap in memory.</p>\n<aside name=\"sum\">\n<p>If you&rsquo;re familiar with a language in the ML family, structs and unions in C\nroughly mirror the difference between product and sum types, between tuples\nand algebraic data types.</p>\n</aside><img src=\"image/types-of-values/union.png\" alt=\"A union with two fields overlapping in memory.\" />\n<p>The size of a union is the size of its largest field. Since the fields all reuse\nthe same bits, you have to be very careful when working with them. If you store\ndata using one field and then access it using <span\nname=\"reinterpret\">another</span>, you will reinterpret what the underlying bits\nmean.</p>\n<aside name=\"reinterpret\">\n<p>Using a union to interpret bits as different types is the quintessence of C. It\nopens up a number of clever optimizations and lets you slice and dice each byte\nof memory in ways that memory-safe languages disallow. But it is also wildly\nunsafe and will happily saw your fingers off if you don&rsquo;t watch out.</p>\n</aside>\n<p>As the name &ldquo;tagged union&rdquo; implies, our new value representation combines these\ntwo parts into a single struct.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} ValueType;\n\n</pre><div class=\"source-file\"><em>value.h</em><br>\nadd after enum <em>ValueType</em><br>\nreplace 1 line</div>\n<pre class=\"insert\"><span class=\"k\">typedef</span> <span class=\"k\">struct</span> {\n  <span class=\"t\">ValueType</span> <span class=\"i\">type</span>;\n  <span class=\"k\">union</span> {\n    <span class=\"t\">bool</span> <span class=\"i\">boolean</span>;\n    <span class=\"t\">double</span> <span class=\"i\">number</span>;\n  } <span class=\"i\">as</span>;<span name=\"as\"> </span>\n} <span class=\"t\">Value</span>;\n</pre><pre class=\"insert-after\">\n\ntypedef struct {\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em>, add after enum <em>ValueType</em>, 
replace 1 line</div>\n\n<p>There&rsquo;s a field for the type tag, and then a second field containing the union\nof all of the underlying values. On a 64-bit machine with a typical C compiler,\nthe layout looks like this:</p>\n<aside name=\"as\">\n<p>A smart language hacker gave me the idea to use &ldquo;as&rdquo; for the name of the union\nfield because it reads nicely, almost like a cast, when you pull the various\nvalues out.</p>\n</aside><img src=\"image/types-of-values/value.png\" alt=\"The full value struct, with the type and as fields next to each other in memory.\" />\n<p>The four-byte type tag comes first, then the union. Most architectures prefer\nvalues be aligned to their size. Since the union field contains an eight-byte\ndouble, the compiler adds four bytes of <span name=\"pad\">padding</span> after\nthe type field to keep that double on the nearest eight-byte boundary. That\nmeans we&rsquo;re effectively spending eight bytes on the type tag, which only needs\nto represent a number between zero and three. We could stuff the enum in a\nsmaller size, but all that would do is increase the padding.</p>\n<aside name=\"pad\">\n<p>We could move the tag field <em>after</em> the union, but that doesn&rsquo;t help much\neither. Whenever we create an array of Values<span class=\"em\">&mdash;</span>which is where most of our\nmemory usage for Values will be<span class=\"em\">&mdash;</span>the C compiler will insert that same padding\n<em>between</em> each Value to keep the doubles aligned.</p>\n</aside>\n<p>So our Values are 16 bytes, which seems a little large. We&rsquo;ll improve it\n<a href=\"optimization.html\">later</a>. In the meantime, they&rsquo;re still small enough to store on\nthe C stack and pass around by value. Lox&rsquo;s semantics allow that because the\nonly types we support so far are <strong>immutable</strong>. 
If we pass a copy of a Value\ncontaining the number three to some function, we don&rsquo;t need to worry about the\ncaller seeing modifications to the value. You can&rsquo;t &ldquo;modify&rdquo; three. It&rsquo;s three\nforever.</p>\n<h2><a href=\"#lox-values-and-c-values\" id=\"lox-values-and-c-values\"><small>18&#8202;.&#8202;2</small>Lox Values and C Values</a></h2>\n<p>That&rsquo;s our new value representation, but we aren&rsquo;t done. Right now, the rest of\nclox assumes Value is an alias for <code>double</code>. We have code that does a straight C\ncast from one to the other. That code is all broken now. So sad.</p>\n<p>With our new representation, a Value can <em>contain</em> a double, but it&rsquo;s not\n<em>equivalent</em> to it. There is a mandatory conversion step to get from one to the\nother. We need to go through the code and insert those conversions to get clox\nworking again.</p>\n<p>We&rsquo;ll implement these conversions as a handful of macros, one for each type and\noperation. First, to promote a native C value to a clox Value:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} Value;\n</pre><div class=\"source-file\"><em>value.h</em><br>\nadd after struct <em>Value</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define BOOL_VAL(value)   ((Value){VAL_BOOL, {.boolean = value}})</span>\n<span class=\"a\">#define NIL_VAL           ((Value){VAL_NIL, {.number = 0}})</span>\n<span class=\"a\">#define NUMBER_VAL(value) ((Value){VAL_NUMBER, {.number = value}})</span>\n</pre><pre class=\"insert-after\">\n\ntypedef struct {\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em>, add after struct <em>Value</em></div>\n\n<p>Each one of these takes a C value of the appropriate type and produces a Value\nthat has the correct type tag and contains the underlying value. This hoists\nstatically typed values up into clox&rsquo;s dynamically typed universe. 
In order to\n<em>do</em> anything with a Value, though, we need to unpack it and get the C value\nback out.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} Value;\n</pre><div class=\"source-file\"><em>value.h</em><br>\nadd after struct <em>Value</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define AS_BOOL(value)    ((value).as.boolean)</span>\n<span class=\"a\">#define AS_NUMBER(value)  ((value).as.number)</span>\n</pre><pre class=\"insert-after\">\n\n#define BOOL_VAL(value)   ((Value){VAL_BOOL, {.boolean = value}})\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em>, add after struct <em>Value</em></div>\n\n<aside name=\"as-null\">\n<p>There&rsquo;s no <code>AS_NIL</code> macro because there is only one <code>nil</code> value, so a Value with\ntype <code>VAL_NIL</code> doesn&rsquo;t carry any extra data.</p>\n</aside>\n<p><span name=\"as-null\">These</span> macros go in the opposite direction. Given a\nValue of the right type, they unwrap it and return the corresponding raw C\nvalue. The &ldquo;right type&rdquo; part is important! These macros directly access the\nunion fields. If we were to do something like:</p>\n<div class=\"codehilite\"><pre><span class=\"t\">Value</span> <span class=\"i\">value</span> = <span class=\"a\">BOOL_VAL</span>(<span class=\"k\">true</span>);\n<span class=\"t\">double</span> <span class=\"i\">number</span> = <span class=\"a\">AS_NUMBER</span>(<span class=\"i\">value</span>);\n</pre></div>\n<p>Then we may open a smoldering portal to the Shadow Realm. 
It&rsquo;s not safe to use\nany of the <code>AS_</code> macros unless we know the Value contains the appropriate type.\nTo that end, we define a last few macros to check a Value&rsquo;s type.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} Value;\n</pre><div class=\"source-file\"><em>value.h</em><br>\nadd after struct <em>Value</em></div>\n<pre class=\"insert\">\n\n<span class=\"a\">#define IS_BOOL(value)    ((value).type == VAL_BOOL)</span>\n<span class=\"a\">#define IS_NIL(value)     ((value).type == VAL_NIL)</span>\n<span class=\"a\">#define IS_NUMBER(value)  ((value).type == VAL_NUMBER)</span>\n</pre><pre class=\"insert-after\">\n\n#define AS_BOOL(value)    ((value).as.boolean)\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em>, add after struct <em>Value</em></div>\n\n<p><span name=\"universe\">These</span> macros return <code>true</code> if the Value has that\ntype. Any time we call one of the <code>AS_</code> macros, we need to guard it behind a\ncall to one of these first. With these eight macros, we can now safely shuttle\ndata between Lox&rsquo;s dynamic world and C&rsquo;s static one.</p>\n<aside name=\"universe\"><img src=\"image/types-of-values/universe.png\" alt=\"The earthly C firmament with the Lox heavens above.\" />\n<p>The <code>_VAL</code> macros lift a C value into the heavens. The <code>AS_</code> macros bring it\nback down.</p>\n</aside>\n<h2><a href=\"#dynamically-typed-numbers\" id=\"dynamically-typed-numbers\"><small>18&#8202;.&#8202;3</small>Dynamically Typed Numbers</a></h2>\n<p>We&rsquo;ve got our value representation and the tools to convert to and from it. All\nthat&rsquo;s left to get clox running again is to grind through the code and fix every\nplace where data moves across that boundary. 
This is one of those sections of\nthe book that isn&rsquo;t exactly mind-blowing, but I promised I&rsquo;d show you every\nsingle line of code, so here we are.</p>\n<p>The first values we create are the constants generated when we compile number\nliterals. After we convert the lexeme to a C double, we simply wrap it in a\nValue before storing it in the constant table.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  double value = strtod(parser.previous.start, NULL);\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>number</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"i\">emitConstant</span>(<span class=\"a\">NUMBER_VAL</span>(<span class=\"i\">value</span>));\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>number</em>(), replace 1 line</div>\n\n<p>Over in the runtime, we have a function to print values.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void printValue(Value value) {\n</pre><div class=\"source-file\"><em>value.c</em><br>\nin <em>printValue</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\"> <span class=\"i\">printf</span>(<span class=\"s\">&quot;%g&quot;</span>, <span class=\"a\">AS_NUMBER</span>(<span class=\"i\">value</span>));\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, in <em>printValue</em>(), replace 1 line</div>\n\n<p>Right before we send the Value to <code>printf()</code>, we unwrap it and extract the\ndouble value. We&rsquo;ll revisit this function shortly to add the other types, but\nlet&rsquo;s get our existing code working first.</p>\n<h3><a href=\"#unary-negation-and-runtime-errors\" id=\"unary-negation-and-runtime-errors\"><small>18&#8202;.&#8202;3&#8202;.&#8202;1</small>Unary negation and runtime errors</a></h3>\n<p>The next simplest operation is unary negation. It pops a value off the stack,\nnegates it, and pushes the result. 
Now that we have other types of values, we\ncan&rsquo;t assume the operand is a number anymore. The user could just as well do:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> -<span class=\"k\">false</span>; <span class=\"c\">// Uh...</span>\n</pre></div>\n<p>We need to handle that gracefully, which means it&rsquo;s time for <em>runtime errors</em>.\nBefore performing an operation that requires a certain type, we need to make\nsure the Value <em>is</em> that type.</p>\n<p>For unary negation, the check looks like this:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_DIVIDE:   BINARY_OP(/); break;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_NEGATE</span>:\n        <span class=\"k\">if</span> (!<span class=\"a\">IS_NUMBER</span>(<span class=\"i\">peek</span>(<span class=\"n\">0</span>))) {\n          <span class=\"i\">runtimeError</span>(<span class=\"s\">&quot;Operand must be a number.&quot;</span>);\n          <span class=\"k\">return</span> <span class=\"a\">INTERPRET_RUNTIME_ERROR</span>;\n        }\n        <span class=\"i\">push</span>(<span class=\"a\">NUMBER_VAL</span>(-<span class=\"a\">AS_NUMBER</span>(<span class=\"i\">pop</span>())));\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      case OP_RETURN: {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 1 line</div>\n\n<p>First, we check to see if the Value on top of the stack is a number. If it&rsquo;s\nnot, we report the runtime error and <span name=\"halt\">stop</span> the\ninterpreter. Otherwise, we keep going. 
Only after this validation do we unwrap\nthe operand, negate it, wrap the result and push it.</p>\n<aside name=\"halt\">\n<p>Lox&rsquo;s approach to error-handling is rather<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span><em>spare</em>. All errors are fatal and\nimmediately halt the interpreter. There&rsquo;s no way for user code to recover from\nan error. If Lox were a real language, this is one of the first things I would\nremedy.</p>\n</aside>\n<p>To access the Value, we use a new little function.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>pop</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">Value</span> <span class=\"i\">peek</span>(<span class=\"t\">int</span> <span class=\"i\">distance</span>) {\n  <span class=\"k\">return</span> <span class=\"i\">vm</span>.<span class=\"i\">stackTop</span>[-<span class=\"n\">1</span> - <span class=\"i\">distance</span>];\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>pop</em>()</div>\n\n<p>It returns a Value from the stack but doesn&rsquo;t <span name=\"peek\">pop</span> it.\nThe <code>distance</code> argument is how far down from the top of the stack to look: zero\nis the top, one is one slot down, etc.</p>\n<aside name=\"peek\">\n<p>Why not just pop the operand and then validate it? We could do that. In later\nchapters, it will be important to leave operands on the stack to ensure the\ngarbage collector can find them if a collection is triggered in the middle of\nthe operation. 
I do the same thing here mostly out of habit.</p>\n</aside>\n<p>We report the runtime error using a new function that we&rsquo;ll get a lot of mileage\nout of over the remainder of the book.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>resetStack</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">runtimeError</span>(<span class=\"k\">const</span> <span class=\"t\">char</span>* <span class=\"i\">format</span>, ...) {\n  <span class=\"t\">va_list</span> <span class=\"i\">args</span>;\n  <span class=\"i\">va_start</span>(<span class=\"i\">args</span>, <span class=\"i\">format</span>);\n  <span class=\"i\">vfprintf</span>(<span class=\"i\">stderr</span>, <span class=\"i\">format</span>, <span class=\"i\">args</span>);\n  <span class=\"i\">va_end</span>(<span class=\"i\">args</span>);\n  <span class=\"i\">fputs</span>(<span class=\"s\">&quot;</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, <span class=\"i\">stderr</span>);\n\n  <span class=\"t\">size_t</span> <span class=\"i\">instruction</span> = <span class=\"i\">vm</span>.<span class=\"i\">ip</span> - <span class=\"i\">vm</span>.<span class=\"i\">chunk</span>-&gt;<span class=\"i\">code</span> - <span class=\"n\">1</span>;\n  <span class=\"t\">int</span> <span class=\"i\">line</span> = <span class=\"i\">vm</span>.<span class=\"i\">chunk</span>-&gt;<span class=\"i\">lines</span>[<span class=\"i\">instruction</span>];\n  <span class=\"i\">fprintf</span>(<span class=\"i\">stderr</span>, <span class=\"s\">&quot;[line %d] in script</span><span class=\"e\">\\n</span><span class=\"s\">&quot;</span>, <span class=\"i\">line</span>);\n  <span class=\"i\">resetStack</span>();\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>resetStack</em>()</div>\n\n<p>You&rsquo;ve certainly <em>called</em> variadic functions<span class=\"em\">&mdash;</span>ones that take a varying number\nof 
arguments<span class=\"em\">&mdash;</span>in C before: <code>printf()</code> is one. But you may not have <em>defined</em>\nyour own. This book isn&rsquo;t a C <span name=\"tutorial\">tutorial</span>, so I&rsquo;ll\nskim over it here, but basically the <code>...</code> and <code>va_list</code> stuff let us pass an\narbitrary number of arguments to <code>runtimeError()</code>. It forwards those on to\n<code>vfprintf()</code>, which is the flavor of <code>printf()</code> that takes an explicit\n<code>va_list</code>.</p>\n<aside name=\"tutorial\">\n<p>If you are looking for a C tutorial, I love <em><a href=\"https://www.cs.princeton.edu/~bwk/cbook.html\">The C Programming Language</a></em>,\nusually called &ldquo;K&amp;R&rdquo; in honor of its authors. It&rsquo;s not entirely up to date, but\nthe quality of the writing more than makes up for it.</p>\n</aside>\n<p>Callers can pass a format string to <code>runtimeError()</code> followed by a number of\narguments, just like they can when calling <code>printf()</code> directly. <code>runtimeError()</code>\nthen formats and prints those arguments. We won&rsquo;t take advantage of that in this\nchapter, but later chapters will produce formatted runtime error messages that\ncontain other data.</p>\n<p>After we show the hopefully helpful error message, we tell the user which <span\nname=\"stack\">line</span> of their code was being executed when the error\noccurred. Since we left the tokens behind in the compiler, we look up the line\nin the debug information compiled into the chunk. If our compiler did its job\nright, that corresponds to the line of source code that the bytecode was\ncompiled from.</p>\n<p>We look into the chunk&rsquo;s debug line array using the current bytecode instruction\nindex <em>minus one</em>. That&rsquo;s because the interpreter advances past each instruction\nbefore executing it. 
So, at the point that we call <code>runtimeError()</code>, the failed\ninstruction is the previous one.</p>\n<aside name=\"stack\">\n<p>Just showing the immediate line where the error occurred doesn&rsquo;t provide much\ncontext. Better would be a full stack trace. But we don&rsquo;t even have functions to\ncall yet, so there is no call stack to trace.</p>\n</aside>\n<p>In order to use <code>va_list</code> and the macros for working with it, we need to bring\nin a standard header.</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd to top of file</div>\n<pre class=\"insert\"><span class=\"a\">#include &lt;stdarg.h&gt;</span>\n</pre><pre class=\"insert-after\">#include &lt;stdio.h&gt;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add to top of file</div>\n\n<p>With this, our VM can not only do the right thing when we negate numbers (like\nit used to before we broke it), but it also gracefully handles erroneous\nattempts to negate other types (which we don&rsquo;t have yet, but still).</p>\n<h3><a href=\"#binary-arithmetic-operators\" id=\"binary-arithmetic-operators\"><small>18&#8202;.&#8202;3&#8202;.&#8202;2</small>Binary arithmetic operators</a></h3>\n<p>We have our runtime error machinery in place now, so fixing the binary operators\nis easier even though they&rsquo;re more complex. We support four binary operators\ntoday: <code>+</code>, <code>-</code>, <code>*</code>, and <code>/</code>. The only difference between them is which\nunderlying C operator they use. To minimize redundant code between the four\noperators, we wrapped up the commonality in a big preprocessor macro that takes\nthe operator token as a parameter.</p>\n<p>That macro seemed like overkill a <a href=\"a-virtual-machine.html#binary-operators\">few chapters ago</a>, but we get the benefit\nfrom it today. 
It lets us add the necessary type checking and conversions in one\nplace.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">#define READ_CONSTANT() (vm.chunk-&gt;constants.values[READ_BYTE()])\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 6 lines</div>\n<pre class=\"insert\"><span class=\"a\">#define BINARY_OP(valueType, op) \\</span>\n<span class=\"a\">    do { \\</span>\n<span class=\"a\">      if (!IS_NUMBER(peek(0)) || !IS_NUMBER(peek(1))) { \\</span>\n<span class=\"a\">        runtimeError(&quot;Operands must be numbers.&quot;); \\</span>\n<span class=\"a\">        return INTERPRET_RUNTIME_ERROR; \\</span>\n<span class=\"a\">      } \\</span>\n<span class=\"a\">      double b = AS_NUMBER(pop()); \\</span>\n<span class=\"a\">      double a = AS_NUMBER(pop()); \\</span>\n<span class=\"a\">      push(valueType(a op b)); \\</span>\n<span class=\"a\">    } while (false)</span>\n</pre><pre class=\"insert-after\">\n\n  for (;;) {\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 6 lines</div>\n\n<p>Yeah, I realize that&rsquo;s a monster of a macro. It&rsquo;s not what I&rsquo;d normally consider\ngood C practice, but let&rsquo;s roll with it. The changes are similar to what we did\nfor unary negate. First, we check that the two operands are both numbers. If\neither isn&rsquo;t, we report a runtime error and yank the ejection seat lever.</p>\n<p>If the operands are fine, we pop them both and unwrap them. Then we apply the\ngiven operator, wrap the result, and push it back on the stack. Note that we\ndon&rsquo;t wrap the result by directly using <code>NUMBER_VAL()</code>. Instead, the wrapper to\nuse is passed in as a macro <span name=\"macro\">parameter</span>. For our\nexisting arithmetic operators, the result is a number, so we pass in the\n<code>NUMBER_VAL</code> macro.</p>\n<aside name=\"macro\">\n<p>Did you know you can pass macros as parameters to macros? 
Now you do!</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()<br>\nreplace 4 lines</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_ADD</span>:      <span class=\"a\">BINARY_OP</span>(<span class=\"a\">NUMBER_VAL</span>, +); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"a\">OP_SUBTRACT</span>: <span class=\"a\">BINARY_OP</span>(<span class=\"a\">NUMBER_VAL</span>, -); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"a\">OP_MULTIPLY</span>: <span class=\"a\">BINARY_OP</span>(<span class=\"a\">NUMBER_VAL</span>, *); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"a\">OP_DIVIDE</span>:   <span class=\"a\">BINARY_OP</span>(<span class=\"a\">NUMBER_VAL</span>, /); <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      case OP_NEGATE:\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>(), replace 4 lines</div>\n\n<p>Soon, I&rsquo;ll show you why we made the wrapping macro an argument.</p>\n<h2><a href=\"#two-new-types\" id=\"two-new-types\"><small>18&#8202;.&#8202;4</small>Two New Types</a></h2>\n<p>All of our existing clox code is back in working order. Finally, it&rsquo;s time to\nadd some new types. We&rsquo;ve got a running numeric calculator that now does a\nnumber of pointless paranoid runtime type checks. We can represent other types\ninternally, but there&rsquo;s no way for a user&rsquo;s program to ever create a Value of\none of those types.</p>\n<p>Not until now, that is. We&rsquo;ll start by adding compiler support for the three new\nliterals: <code>true</code>, <code>false</code>, and <code>nil</code>. 
They&rsquo;re all pretty simple, so we&rsquo;ll do all\nthree in a single batch.</p>\n<p>With number literals, we had to deal with the fact that there are billions of\npossible numeric values. We attended to that by storing the literal&rsquo;s value in\nthe chunk&rsquo;s constant table and emitting a bytecode instruction that simply\nloaded that constant. We could do the same thing for the new types. We&rsquo;d store,\nsay, <code>true</code>, in the constant table, and use an <code>OP_CONSTANT</code> to read it out.</p>\n<p>But given that there are literally (heh) only three possible values we need to\nworry about with these new types, it&rsquo;s gratuitous<span class=\"em\">&mdash;</span>and <span\nname=\"small\">slow!</span><span class=\"em\">&mdash;</span>to waste a two-byte instruction and a constant\ntable entry on them. Instead, we&rsquo;ll define three dedicated instructions to push\neach of these literals on the stack.</p>\n<aside name=\"small\" class=\"bottom\">\n<p>I&rsquo;m not kidding about dedicated operations for certain constant values being\nfaster. A bytecode VM spends much of its execution time reading and decoding\ninstructions. The fewer, simpler instructions you need for a given piece of\nbehavior, the faster it goes. Short instructions dedicated to common operations\nare a classic optimization.</p>\n<p>For example, the Java bytecode instruction set has dedicated instructions for\nloading 0.0, 1.0, 2.0, and the integer values from -1 through 5. 
(This ends up\nbeing a vestigial optimization given that most mature JVMs now JIT-compile the\nbytecode to machine code before execution anyway.)</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_CONSTANT,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_NIL</span>,\n  <span class=\"a\">OP_TRUE</span>,\n  <span class=\"a\">OP_FALSE</span>,\n</pre><pre class=\"insert-after\">  OP_ADD,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>Our scanner already treats <code>true</code>, <code>false</code>, and <code>nil</code> as keywords, so we can\nskip right to the parser. With our table-based Pratt parser, we just need to\nslot parser functions into the rows associated with those keyword token types.\nWe&rsquo;ll use the same function in all three slots. Here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  [TOKEN_ELSE]          = {NULL,     NULL,   PREC_NONE},\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_FALSE</span>]         = {<span class=\"i\">literal</span>,  <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_FOR]           = {NULL,     NULL,   PREC_NONE},\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, replace 1 line</div>\n\n<p>Here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  [TOKEN_THIS]          = {NULL,     NULL,   PREC_NONE},\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_TRUE</span>]          = {<span class=\"i\">literal</span>,  <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_VAR]           = {NULL,     NULL,   PREC_NONE},\n</pre></div>\n<div 
class=\"source-file-narrow\"><em>compiler.c</em>, replace 1 line</div>\n\n<p>And here:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  [TOKEN_IF]            = {NULL,     NULL,   PREC_NONE},\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_NIL</span>]           = {<span class=\"i\">literal</span>,  <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_OR]            = {NULL,     NULL,   PREC_NONE},\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, replace 1 line</div>\n\n<p>When the parser encounters <code>false</code>, <code>nil</code>, or <code>true</code>, in prefix position, it\ncalls this new parser function:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>compiler.c</em><br>\nadd after <em>binary</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">void</span> <span class=\"i\">literal</span>() {\n  <span class=\"k\">switch</span> (<span class=\"i\">parser</span>.<span class=\"i\">previous</span>.<span class=\"i\">type</span>) {\n    <span class=\"k\">case</span> <span class=\"a\">TOKEN_FALSE</span>: <span class=\"i\">emitByte</span>(<span class=\"a\">OP_FALSE</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">case</span> <span class=\"a\">TOKEN_NIL</span>: <span class=\"i\">emitByte</span>(<span class=\"a\">OP_NIL</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">case</span> <span class=\"a\">TOKEN_TRUE</span>: <span class=\"i\">emitByte</span>(<span class=\"a\">OP_TRUE</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">default</span>: <span class=\"k\">return</span>; <span class=\"c\">// Unreachable.</span>\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, add after <em>binary</em>()</div>\n\n<p>Since <code>parsePrecedence()</code> has already consumed the keyword token, all we need to\ndo 
is output the proper instruction. We <span name=\"switch\">figure</span> that\nout based on the type of token we parsed. Our front end can now compile Boolean\nand nil literals to bytecode. Moving down the execution pipeline, we reach the\ninterpreter.</p>\n<aside name=\"switch\">\n<p>We could have used separate parser functions for each literal and saved\nourselves a switch but that felt needlessly verbose to me. I think it&rsquo;s mostly a\nmatter of taste.</p>\n</aside>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_CONSTANT: {\n        Value constant = READ_CONSTANT();\n        push(constant);\n        break;\n      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_NIL</span>: <span class=\"i\">push</span>(<span class=\"a\">NIL_VAL</span>); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"a\">OP_TRUE</span>: <span class=\"i\">push</span>(<span class=\"a\">BOOL_VAL</span>(<span class=\"k\">true</span>)); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"a\">OP_FALSE</span>: <span class=\"i\">push</span>(<span class=\"a\">BOOL_VAL</span>(<span class=\"k\">false</span>)); <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      case OP_ADD:      BINARY_OP(NUMBER_VAL, +); break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>This is pretty self-explanatory. Each instruction summons the appropriate value\nand pushes it onto the stack. 
We shouldn&rsquo;t forget our disassembler either.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case OP_CONSTANT:\n      return constantInstruction(&quot;OP_CONSTANT&quot;, chunk, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_NIL</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_NIL&quot;</span>, <span class=\"i\">offset</span>);\n    <span class=\"k\">case</span> <span class=\"a\">OP_TRUE</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_TRUE&quot;</span>, <span class=\"i\">offset</span>);\n    <span class=\"k\">case</span> <span class=\"a\">OP_FALSE</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_FALSE&quot;</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_ADD:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>With this in place, we can run this Earth-shattering program:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">true</span>\n</pre></div>\n<p>Except that when the interpreter tries to print the result, it blows up. We need\nto extend <code>printValue()</code> to handle the new types too:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">void printValue(Value value) {\n</pre><div class=\"source-file\"><em>value.c</em><br>\nin <em>printValue</em>()<br>\nreplace 1 line</div>\n<pre class=\"insert\">  <span class=\"k\">switch</span> (<span class=\"i\">value</span>.<span class=\"i\">type</span>) {\n    <span class=\"k\">case</span> <span class=\"a\">VAL_BOOL</span>:\n      <span class=\"i\">printf</span>(<span class=\"a\">AS_BOOL</span>(<span class=\"i\">value</span>) ? 
<span class=\"s\">&quot;true&quot;</span> : <span class=\"s\">&quot;false&quot;</span>);\n      <span class=\"k\">break</span>;\n    <span class=\"k\">case</span> <span class=\"a\">VAL_NIL</span>: <span class=\"i\">printf</span>(<span class=\"s\">&quot;nil&quot;</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">case</span> <span class=\"a\">VAL_NUMBER</span>: <span class=\"i\">printf</span>(<span class=\"s\">&quot;%g&quot;</span>, <span class=\"a\">AS_NUMBER</span>(<span class=\"i\">value</span>)); <span class=\"k\">break</span>;\n  }\n</pre><pre class=\"insert-after\">}\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, in <em>printValue</em>(), replace 1 line</div>\n\n<p>There we go! Now we have some new types. They just aren&rsquo;t very useful yet. Aside\nfrom the literals, you can&rsquo;t really <em>do</em> anything with them. It will be a while\nbefore <code>nil</code> comes into play, but we can start putting Booleans to work in the\nlogical operators.</p>\n<h3><a href=\"#logical-not-and-falsiness\" id=\"logical-not-and-falsiness\"><small>18&#8202;.&#8202;4&#8202;.&#8202;1</small>Logical not and falsiness</a></h3>\n<p>The simplest logical operator is our old exclamatory friend unary not.</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> !<span class=\"k\">true</span>; <span class=\"c\">// &quot;false&quot;</span>\n</pre></div>\n<p>This new operation gets a new instruction.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_DIVIDE,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_NOT</span>,\n</pre><pre class=\"insert-after\">  OP_NEGATE,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>We can reuse the <code>unary()</code> parser function we wrote for unary negation to\ncompile a not expression. 
We just need to slot it into the parsing table.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  [TOKEN_STAR]          = {NULL,     binary, PREC_FACTOR},\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_BANG</span>]          = {<span class=\"i\">unary</span>,    <span class=\"a\">NULL</span>,   <span class=\"a\">PREC_NONE</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_BANG_EQUAL]    = {NULL,     NULL,   PREC_NONE},\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, replace 1 line</div>\n\n<p>Because I knew we were going to do this, the <code>unary()</code> function already has a\nswitch on the token type to figure out which bytecode instruction to output. We\nmerely add another case.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (operatorType) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>unary</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">TOKEN_BANG</span>: <span class=\"i\">emitByte</span>(<span class=\"a\">OP_NOT</span>); <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    case TOKEN_MINUS: emitByte(OP_NEGATE); break;\n    default: return; // Unreachable.\n  }\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>unary</em>()</div>\n\n<p>That&rsquo;s it for the front end. 
Let&rsquo;s head over to the VM and conjure this\ninstruction into life.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_DIVIDE:   BINARY_OP(NUMBER_VAL, /); break;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_NOT</span>:\n        <span class=\"i\">push</span>(<span class=\"a\">BOOL_VAL</span>(<span class=\"i\">isFalsey</span>(<span class=\"i\">pop</span>())));\n        <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      case OP_NEGATE:\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>Like our previous unary operator, it pops the one operand, performs the\noperation, and pushes the result. And, as we did there, we have to worry about\ndynamic typing. Taking the logical not of <code>true</code> is easy, but there&rsquo;s nothing\npreventing an unruly programmer from writing something like this:</p>\n<div class=\"codehilite\"><pre><span class=\"k\">print</span> !<span class=\"k\">nil</span>;\n</pre></div>\n<p>For unary minus, we made it an error to negate anything that isn&rsquo;t a <span\nname=\"negate\">number</span>. But Lox, like most scripting languages, is more\npermissive when it comes to <code>!</code> and other contexts where a Boolean is expected.\nThe rule for how other types are handled is called &ldquo;falsiness&rdquo;, and we implement\nit here:</p>\n<aside name=\"negate\">\n<p>Now I can&rsquo;t help but try to figure out what it would mean to negate other types\nof values. 
<code>nil</code> is probably its own negation, sort of like a weird pseudo-zero.\nNegating a string could, uh, reverse it?</p>\n</aside>\n<div class=\"codehilite\"><div class=\"source-file\"><em>vm.c</em><br>\nadd after <em>peek</em>()</div>\n<pre><span class=\"k\">static</span> <span class=\"t\">bool</span> <span class=\"i\">isFalsey</span>(<span class=\"t\">Value</span> <span class=\"i\">value</span>) {\n  <span class=\"k\">return</span> <span class=\"a\">IS_NIL</span>(<span class=\"i\">value</span>) || (<span class=\"a\">IS_BOOL</span>(<span class=\"i\">value</span>) &amp;&amp; !<span class=\"a\">AS_BOOL</span>(<span class=\"i\">value</span>));\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, add after <em>peek</em>()</div>\n\n<p>Lox follows Ruby in that <code>nil</code> and <code>false</code> are falsey and every other value\nbehaves like <code>true</code>. We&rsquo;ve got a new instruction we can generate, so we also\nneed to be able to <em>un</em>generate it in the disassembler.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case OP_DIVIDE:\n      return simpleInstruction(&quot;OP_DIVIDE&quot;, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_NOT</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_NOT&quot;</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_NEGATE:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<h3><a href=\"#equality-and-comparison-operators\" id=\"equality-and-comparison-operators\"><small>18&#8202;.&#8202;4&#8202;.&#8202;2</small>Equality and comparison operators</a></h3>\n<p>That wasn&rsquo;t too bad. 
Let&rsquo;s keep the momentum going and knock out the equality\nand comparison operators too: <code>==</code>, <code>!=</code>, <code>&lt;</code>, <code>&gt;</code>, <code>&lt;=</code>, and <code>&gt;=</code>. That covers\nall of the operators that return Boolean results except the logical operators\n<code>and</code> and <code>or</code>. Since those need to short-circuit (basically do a little\ncontrol flow) we aren&rsquo;t ready for them yet.</p>\n<p>Here are the new instructions for those operators:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  OP_FALSE,\n</pre><div class=\"source-file\"><em>chunk.h</em><br>\nin enum <em>OpCode</em></div>\n<pre class=\"insert\">  <span class=\"a\">OP_EQUAL</span>,\n  <span class=\"a\">OP_GREATER</span>,\n  <span class=\"a\">OP_LESS</span>,\n</pre><pre class=\"insert-after\">  OP_ADD,\n</pre></div>\n<div class=\"source-file-narrow\"><em>chunk.h</em>, in enum <em>OpCode</em></div>\n\n<p>Wait, only three? What about <code>!=</code>, <code>&lt;=</code>, and <code>&gt;=</code>? We could create instructions\nfor those too. Honestly, the VM would execute faster if we did, so we <em>should</em>\ndo that if the goal is performance.</p>\n<p>But my main goal is to teach you about bytecode compilers. I want you to start\ninternalizing the idea that the bytecode instructions don&rsquo;t need to closely\nfollow the user&rsquo;s source code. The VM has total freedom to use whatever\ninstruction set and code sequences it wants as long as they have the right\nuser-visible behavior.</p>\n<p>The expression <code>a != b</code> has the same semantics as <code>!(a == b)</code>, so the compiler\nis free to compile the former as if it were the latter. 
Instead of a dedicated\n<code>OP_NOT_EQUAL</code> instruction, it can output an <code>OP_EQUAL</code> followed by an <code>OP_NOT</code>.\nLikewise, <code>a &lt;= b</code> is the <span name=\"same\">same</span> as <code>!(a &gt; b)</code> and <code>a &gt;= b</code> is <code>!(a &lt; b)</code>. Thus, we only need three new instructions.</p>\n<aside name=\"same\" class=\"bottom\">\n<p><em>Is</em> <code>a &lt;= b</code> always the same as <code>!(a &gt; b)</code>? According to <a href=\"https://en.wikipedia.org/wiki/IEEE_754\">IEEE 754</a>, all\ncomparison operators return false when an operand is NaN. That means <code>NaN &lt;= 1</code>\nis false and <code>NaN &gt; 1</code> is also false. But our desugaring assumes the latter is\nalways the negation of the former.</p>\n<p>For the book, we won&rsquo;t get hung up on this, but these kinds of details will\nmatter in your real language implementations.</p>\n</aside>\n<p>Over in the parser, though, we do have six new operators to slot into the parse\ntable. We use the same <code>binary()</code> parser function from before. 
Here&rsquo;s the row\nfor <code>!=</code>:</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  [TOKEN_BANG]          = {unary,    NULL,   PREC_NONE},\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nreplace 1 line</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_BANG_EQUAL</span>]    = {<span class=\"a\">NULL</span>,     <span class=\"i\">binary</span>, <span class=\"a\">PREC_EQUALITY</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_EQUAL]         = {NULL,     NULL,   PREC_NONE},\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, replace 1 line</div>\n\n<p>The remaining five operators are a little farther down in the table.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  [TOKEN_EQUAL]         = {NULL,     NULL,   PREC_NONE},\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nreplace 5 lines</div>\n<pre class=\"insert\">  [<span class=\"a\">TOKEN_EQUAL_EQUAL</span>]   = {<span class=\"a\">NULL</span>,     <span class=\"i\">binary</span>, <span class=\"a\">PREC_EQUALITY</span>},\n  [<span class=\"a\">TOKEN_GREATER</span>]       = {<span class=\"a\">NULL</span>,     <span class=\"i\">binary</span>, <span class=\"a\">PREC_COMPARISON</span>},\n  [<span class=\"a\">TOKEN_GREATER_EQUAL</span>] = {<span class=\"a\">NULL</span>,     <span class=\"i\">binary</span>, <span class=\"a\">PREC_COMPARISON</span>},\n  [<span class=\"a\">TOKEN_LESS</span>]          = {<span class=\"a\">NULL</span>,     <span class=\"i\">binary</span>, <span class=\"a\">PREC_COMPARISON</span>},\n  [<span class=\"a\">TOKEN_LESS_EQUAL</span>]    = {<span class=\"a\">NULL</span>,     <span class=\"i\">binary</span>, <span class=\"a\">PREC_COMPARISON</span>},\n</pre><pre class=\"insert-after\">  [TOKEN_IDENTIFIER]    = {NULL,     NULL,   PREC_NONE},\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, replace 5 lines</div>\n\n<p>Inside <code>binary()</code> we already have a switch to generate the right bytecode 
for\neach token type. We add cases for the six new operators.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">  switch (operatorType) {\n</pre><div class=\"source-file\"><em>compiler.c</em><br>\nin <em>binary</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">TOKEN_BANG_EQUAL</span>:    <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_EQUAL</span>, <span class=\"a\">OP_NOT</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">case</span> <span class=\"a\">TOKEN_EQUAL_EQUAL</span>:   <span class=\"i\">emitByte</span>(<span class=\"a\">OP_EQUAL</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">case</span> <span class=\"a\">TOKEN_GREATER</span>:       <span class=\"i\">emitByte</span>(<span class=\"a\">OP_GREATER</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">case</span> <span class=\"a\">TOKEN_GREATER_EQUAL</span>: <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_LESS</span>, <span class=\"a\">OP_NOT</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">case</span> <span class=\"a\">TOKEN_LESS</span>:          <span class=\"i\">emitByte</span>(<span class=\"a\">OP_LESS</span>); <span class=\"k\">break</span>;\n    <span class=\"k\">case</span> <span class=\"a\">TOKEN_LESS_EQUAL</span>:    <span class=\"i\">emitBytes</span>(<span class=\"a\">OP_GREATER</span>, <span class=\"a\">OP_NOT</span>); <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">    case TOKEN_PLUS:          emitByte(OP_ADD); break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>compiler.c</em>, in <em>binary</em>()</div>\n\n<p>The <code>==</code>, <code>&lt;</code>, and <code>&gt;</code> operators output a single instruction. The others output\na pair of instructions, one to evaluate the inverse operation, and then an\n<code>OP_NOT</code> to flip the result. Six operators for the price of three instructions!</p>\n<p>That means over in the VM, our job is simpler. 
Equality is the most general\noperation.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">      case OP_FALSE: push(BOOL_VAL(false)); break;\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_EQUAL</span>: {\n        <span class=\"t\">Value</span> <span class=\"i\">b</span> = <span class=\"i\">pop</span>();\n        <span class=\"t\">Value</span> <span class=\"i\">a</span> = <span class=\"i\">pop</span>();\n        <span class=\"i\">push</span>(<span class=\"a\">BOOL_VAL</span>(<span class=\"i\">valuesEqual</span>(<span class=\"i\">a</span>, <span class=\"i\">b</span>)));\n        <span class=\"k\">break</span>;\n      }\n</pre><pre class=\"insert-after\">      case OP_ADD:      BINARY_OP(NUMBER_VAL, +); break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>You can evaluate <code>==</code> on any pair of objects, even objects of different types.\nThere&rsquo;s enough complexity that it makes sense to shunt that logic over to a\nseparate function. That function always returns a C <code>bool</code>, so we can safely\nwrap the result in a <code>BOOL_VAL</code>. 
The function relates to Values, so it lives\nover in the &ldquo;value&rdquo; module.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">} ValueArray;\n\n</pre><div class=\"source-file\"><em>value.h</em><br>\nadd after struct <em>ValueArray</em></div>\n<pre class=\"insert\"><span class=\"t\">bool</span> <span class=\"i\">valuesEqual</span>(<span class=\"t\">Value</span> <span class=\"i\">a</span>, <span class=\"t\">Value</span> <span class=\"i\">b</span>);\n</pre><pre class=\"insert-after\">void initValueArray(ValueArray* array);\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.h</em>, add after struct <em>ValueArray</em></div>\n\n<p>And here&rsquo;s the implementation:</p>\n<div class=\"codehilite\"><div class=\"source-file\"><em>value.c</em><br>\nadd after <em>printValue</em>()</div>\n<pre><span class=\"t\">bool</span> <span class=\"i\">valuesEqual</span>(<span class=\"t\">Value</span> <span class=\"i\">a</span>, <span class=\"t\">Value</span> <span class=\"i\">b</span>) {\n  <span class=\"k\">if</span> (<span class=\"i\">a</span>.<span class=\"i\">type</span> != <span class=\"i\">b</span>.<span class=\"i\">type</span>) <span class=\"k\">return</span> <span class=\"k\">false</span>;\n  <span class=\"k\">switch</span> (<span class=\"i\">a</span>.<span class=\"i\">type</span>) {\n    <span class=\"k\">case</span> <span class=\"a\">VAL_BOOL</span>:   <span class=\"k\">return</span> <span class=\"a\">AS_BOOL</span>(<span class=\"i\">a</span>) == <span class=\"a\">AS_BOOL</span>(<span class=\"i\">b</span>);\n    <span class=\"k\">case</span> <span class=\"a\">VAL_NIL</span>:    <span class=\"k\">return</span> <span class=\"k\">true</span>;\n    <span class=\"k\">case</span> <span class=\"a\">VAL_NUMBER</span>: <span class=\"k\">return</span> <span class=\"a\">AS_NUMBER</span>(<span class=\"i\">a</span>) == <span class=\"a\">AS_NUMBER</span>(<span class=\"i\">b</span>);\n    <span class=\"k\">default</span>:         <span class=\"k\">return</span> 
<span class=\"k\">false</span>; <span class=\"c\">// Unreachable.</span>\n  }\n}\n</pre></div>\n<div class=\"source-file-narrow\"><em>value.c</em>, add after <em>printValue</em>()</div>\n\n<p>First, we check the types. If the Values have <span\nname=\"equal\">different</span> types, they are definitely not equal. Otherwise,\nwe unwrap the two Values and compare them directly.</p>\n<aside name=\"equal\">\n<p>Some languages have &ldquo;implicit conversions&rdquo; where values of different types may\nbe considered equal if one can be converted to the other&rsquo;s type. For example,\nthe number 0 is equivalent to the string &ldquo;0&rdquo; in JavaScript. This looseness was a\nlarge enough source of pain that JS added a separate &ldquo;strict equality&rdquo; operator,\n<code>===</code>.</p>\n<p>PHP considers the strings &ldquo;1&rdquo; and &ldquo;01&rdquo; to be equivalent because both can be\nconverted to equivalent numbers, though the ultimate reason is because PHP was\ndesigned by a Lovecraftian eldritch god to destroy the mind.</p>\n<p>Most dynamically typed languages that have separate integer and floating-point\nnumber types consider values of different number types equal if the numeric\nvalues are the same (so, say, 1.0 is equal to 1), though even that seemingly\ninnocuous convenience can bite the unwary.</p>\n</aside>\n<p>For each value type, we have a separate case that handles comparing the value\nitself. Given how similar the cases are, you might wonder why we can&rsquo;t simply\n<code>memcmp()</code> the two Value structs and be done with it. The problem is that\nbecause of padding and different-sized union fields, a Value contains unused\nbits. 
C gives no guarantee about what is in those, so it&rsquo;s possible that two\nequal Values actually differ in memory that isn&rsquo;t used.</p><img src=\"image/types-of-values/memcmp.png\" alt=\"The memory representations of two equal values that differ in unused bytes.\" />\n<p>(You wouldn&rsquo;t believe how much pain I went through before learning this fact.)</p>\n<p>Anyway, as we add more types to clox, this function will grow new cases. For\nnow, these three are sufficient. The other comparison operators are easier since\nthey work only on numbers.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">        push(BOOL_VAL(valuesEqual(a, b)));\n        break;\n      }\n</pre><div class=\"source-file\"><em>vm.c</em><br>\nin <em>run</em>()</div>\n<pre class=\"insert\">      <span class=\"k\">case</span> <span class=\"a\">OP_GREATER</span>:  <span class=\"a\">BINARY_OP</span>(<span class=\"a\">BOOL_VAL</span>, &gt;); <span class=\"k\">break</span>;\n      <span class=\"k\">case</span> <span class=\"a\">OP_LESS</span>:     <span class=\"a\">BINARY_OP</span>(<span class=\"a\">BOOL_VAL</span>, &lt;); <span class=\"k\">break</span>;\n</pre><pre class=\"insert-after\">      case OP_ADD:      BINARY_OP(NUMBER_VAL, +); break;\n</pre></div>\n<div class=\"source-file-narrow\"><em>vm.c</em>, in <em>run</em>()</div>\n\n<p>We already extended the <code>BINARY_OP</code> macro to handle operators that return\nnon-numeric types. Now we get to use that. We pass in <code>BOOL_VAL</code> since the\nresult value type is Boolean. 
Otherwise, it&rsquo;s no different from plus or minus.</p>\n<p>As always, the coda to today&rsquo;s aria is disassembling the new instructions.</p>\n<div class=\"codehilite\"><pre class=\"insert-before\">    case OP_FALSE:\n      return simpleInstruction(&quot;OP_FALSE&quot;, offset);\n</pre><div class=\"source-file\"><em>debug.c</em><br>\nin <em>disassembleInstruction</em>()</div>\n<pre class=\"insert\">    <span class=\"k\">case</span> <span class=\"a\">OP_EQUAL</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_EQUAL&quot;</span>, <span class=\"i\">offset</span>);\n    <span class=\"k\">case</span> <span class=\"a\">OP_GREATER</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_GREATER&quot;</span>, <span class=\"i\">offset</span>);\n    <span class=\"k\">case</span> <span class=\"a\">OP_LESS</span>:\n      <span class=\"k\">return</span> <span class=\"i\">simpleInstruction</span>(<span class=\"s\">&quot;OP_LESS&quot;</span>, <span class=\"i\">offset</span>);\n</pre><pre class=\"insert-after\">    case OP_ADD:\n</pre></div>\n<div class=\"source-file-narrow\"><em>debug.c</em>, in <em>disassembleInstruction</em>()</div>\n\n<p>With that, our numeric calculator has become something closer to a general\nexpression evaluator. Fire up clox and type in:</p>\n<div class=\"codehilite\"><pre>!(<span class=\"n\">5</span> - <span class=\"n\">4</span> &gt; <span class=\"n\">3</span> * <span class=\"n\">2</span> == !<span class=\"k\">nil</span>)\n</pre></div>\n<p>OK, I&rsquo;ll admit that&rsquo;s maybe not the most <em>useful</em> expression, but we&rsquo;re making\nprogress. We have one missing built-in type with its own literal form: strings.\nThose are much more complex because strings can vary in size. 
That tiny\ndifference turns out to have implications so large that we give strings <a href=\"strings.html\">their\nvery own chapter</a>.</p>\n<div class=\"challenges\">\n<h2><a href=\"#challenges\" id=\"challenges\">Challenges</a></h2>\n<ol>\n<li>\n<p>We could reduce our binary operators even further than we did here. Which\nother instructions can you eliminate, and how would the compiler cope with\ntheir absence?</p>\n</li>\n<li>\n<p>Conversely, we can improve the speed of our bytecode VM by adding more\nspecific instructions that correspond to higher-level operations. What\ninstructions would you define to speed up the kind of user code we added\nsupport for in this chapter?</p>\n</li>\n</ol>\n</div>\n\n<footer>\n<a href=\"strings.html\" class=\"next\">\n  Next Chapter: &ldquo;Strings&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "site/welcome.html",
    "content": "<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n<title>Welcome &middot; Crafting Interpreters</title>\n\n<!-- Tell mobile browsers we're optimized for them and they don't need to crop\n     the viewport. -->\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"style.css\" />\n\n<!-- Oh, God, Source Code Pro is so beautiful it makes me want to cry. -->\n<link href='https://fonts.googleapis.com/css?family=Source+Code+Pro:400|Source+Sans+Pro:300,400,600' rel='stylesheet' type='text/css'>\n\n<link rel=\"icon\" type=\"image/png\" href=\"image/favicon.png\" />\n<script src=\"jquery-3.4.1.min.js\"></script>\n<script src=\"script.js\"></script>\n\n<!-- Google analytics -->\n<script>\n  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){\n  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),\n  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)\n  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');\n\n  ga('create', 'UA-42804721-2', 'auto');\n  ga('send', 'pageview');\n</script>\n\n</head>\n<body id=\"top\">\n\n<!-- <div class=\"scrim\"></div> -->\n<nav class=\"wide\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"contents\">\n<h2><small>I</small>Welcome</h2>\n\n<ul>\n    <li><a href=\"introduction.html\"><small>1</small>Introduction</a></li>\n    <li><a href=\"a-map-of-the-territory.html\"><small>2</small>A Map of the Territory</a></li>\n    <li><a href=\"the-lox-language.html\"><small>3</small>The Lox Language</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"contents.html\" title=\"Table of Contents\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"contents.html\" title=\"Table of Contents\">&uarr;&nbsp;Up</a>\n    <a 
href=\"introduction.html\" title=\"Introduction\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n</nav>\n\n<nav class=\"narrow\">\n<a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n<a href=\"contents.html\" title=\"Table of Contents\" class=\"prev\">←</a>\n<a href=\"introduction.html\" title=\"Introduction\" class=\"next\">→</a>\n</nav>\n\n<div class=\"page\">\n<div class=\"nav-wrapper\">\n<nav class=\"floating\">\n  <a href=\"/\"><img src=\"image/logotype.png\" title=\"Crafting Interpreters\"></a>\n  <div class=\"expandable\">\n<h2><small>I</small>Welcome</h2>\n\n<ul>\n    <li><a href=\"introduction.html\"><small>1</small>Introduction</a></li>\n    <li><a href=\"a-map-of-the-territory.html\"><small>2</small>A Map of the Territory</a></li>\n    <li><a href=\"the-lox-language.html\"><small>3</small>The Lox Language</a></li>\n</ul>\n\n\n<div class=\"prev-next\">\n    <a href=\"contents.html\" title=\"Table of Contents\" class=\"left\">&larr;&nbsp;Previous</a>\n    <a href=\"contents.html\" title=\"Table of Contents\">&uarr;&nbsp;Up</a>\n    <a href=\"introduction.html\" title=\"Introduction\" class=\"right\">Next&nbsp;&rarr;</a>\n</div>  </div>\n  <a id=\"expand-nav\">≡</a>\n</nav>\n</div>\n\n<article class=\"chapter\">\n\n  <div class=\"number\">I</div>\n  <h1 class=\"part\">Welcome</h1>\n\n<p>This may be the beginning of a grand adventure. Programming languages encompass\na huge space to explore and play in. Plenty of room for your own creations to\nshare with others or just enjoy yourself. Brilliant computer scientists and\nsoftware engineers have spent entire careers traversing this land without ever\nreaching the end. If this book is your first entry into the country, welcome.</p>\n<p>The pages of this book give you a guided tour through some of the world of\nlanguages. But before we strap on our hiking boots and venture out, we should\nfamiliarize ourselves with the territory. 
The chapters in this part introduce\nyou to the basic concepts used by programming languages and how those concepts\nare organized.</p>\n<p>We will also get acquainted with Lox, the language we&rsquo;ll spend the rest of the\nbook implementing (twice).</p>\n\n<footer>\n<a href=\"introduction.html\" class=\"next\">\n  Next Chapter: &ldquo;Introduction&rdquo; &rarr;\n</a>\nHandcrafted by Robert Nystrom&ensp;&mdash;&ensp;<a href=\"https://github.com/munificent/craftinginterpreters/blob/master/LICENSE\" target=\"_blank\">&copy; 2015&hairsp;&ndash;&hairsp;2021</a>\n</footer>\n</article>\n\n</div>\n</body>\n</html>\n"
  },
  {
    "path": "test/assignment/associativity.lox",
    "content": "var a = \"a\";\nvar b = \"b\";\nvar c = \"c\";\n\n// Assignment is right-associative.\na = b = c;\nprint a; // expect: c\nprint b; // expect: c\nprint c; // expect: c\n"
  },
  {
    "path": "test/assignment/global.lox",
    "content": "var a = \"before\";\nprint a; // expect: before\n\na = \"after\";\nprint a; // expect: after\n\nprint a = \"arg\"; // expect: arg\nprint a; // expect: arg\n"
  },
  {
    "path": "test/assignment/grouping.lox",
    "content": "var a = \"a\";\n(a) = \"value\"; // Error at '=': Invalid assignment target.\n"
  },
  {
    "path": "test/assignment/infix_operator.lox",
    "content": "var a = \"a\";\nvar b = \"b\";\na + b = \"value\"; // Error at '=': Invalid assignment target.\n"
  },
  {
    "path": "test/assignment/local.lox",
    "content": "{\n  var a = \"before\";\n  print a; // expect: before\n\n  a = \"after\";\n  print a; // expect: after\n\n  print a = \"arg\"; // expect: arg\n  print a; // expect: arg\n}\n"
  },
  {
    "path": "test/assignment/prefix_operator.lox",
    "content": "var a = \"a\";\n!a = \"value\"; // Error at '=': Invalid assignment target.\n"
  },
  {
    "path": "test/assignment/syntax.lox",
    "content": "// Assignment on RHS of variable.\nvar a = \"before\";\nvar c = a = \"var\";\nprint a; // expect: var\nprint c; // expect: var\n"
  },
  {
    "path": "test/assignment/to_this.lox",
    "content": "class Foo {\n  Foo() {\n    this = \"value\"; // Error at '=': Invalid assignment target.\n  }\n}\n\nFoo();\n"
  },
  {
    "path": "test/assignment/undefined.lox",
    "content": "unknown = \"what\"; // expect runtime error: Undefined variable 'unknown'.\n"
  },
  {
    "path": "test/benchmark/binary_trees.lox",
    "content": "class Tree {\n  init(item, depth) {\n    this.item = item;\n    this.depth = depth;\n    if (depth > 0) {\n      var item2 = item + item;\n      depth = depth - 1;\n      this.left = Tree(item2 - 1, depth);\n      this.right = Tree(item2, depth);\n    } else {\n      this.left = nil;\n      this.right = nil;\n    }\n  }\n\n  check() {\n    if (this.left == nil) {\n      return this.item;\n    }\n\n    return this.item + this.left.check() - this.right.check();\n  }\n}\n\nvar minDepth = 4;\nvar maxDepth = 14;\nvar stretchDepth = maxDepth + 1;\n\nvar start = clock();\n\nprint \"stretch tree of depth:\";\nprint stretchDepth;\nprint \"check:\";\nprint Tree(0, stretchDepth).check();\n\nvar longLivedTree = Tree(0, maxDepth);\n\n// iterations = 2 ** maxDepth\nvar iterations = 1;\nvar d = 0;\nwhile (d < maxDepth) {\n  iterations = iterations * 2;\n  d = d + 1;\n}\n\nvar depth = minDepth;\nwhile (depth < stretchDepth) {\n  var check = 0;\n  var i = 1;\n  while (i <= iterations) {\n    check = check + Tree(i, depth).check() + Tree(-i, depth).check();\n    i = i + 1;\n  }\n\n  print \"num trees:\";\n  print iterations * 2;\n  print \"depth:\";\n  print depth;\n  print \"check:\";\n  print check;\n\n  iterations = iterations / 4;\n  depth = depth + 2;\n}\n\nprint \"long lived tree of depth:\";\nprint maxDepth;\nprint \"check:\";\nprint longLivedTree.check();\nprint \"elapsed:\";\nprint clock() - start;\n"
  },
  {
    "path": "test/benchmark/equality.lox",
    "content": "var i = 0;\n\nvar loopStart = clock();\n\nwhile (i < 10000000) {\n  i = i + 1;\n\n  1; 1; 1; 2; 1; nil; 1; \"str\"; 1; true;\n  nil; nil; nil; 1; nil; \"str\"; nil; true;\n  true; true; true; 1; true; false; true; \"str\"; true; nil;\n  \"str\"; \"str\"; \"str\"; \"stru\"; \"str\"; 1; \"str\"; nil; \"str\"; true;\n}\n\nvar loopTime = clock() - loopStart;\n\nvar start = clock();\n\ni = 0;\nwhile (i < 10000000) {\n  i = i + 1;\n\n  1 == 1; 1 == 2; 1 == nil; 1 == \"str\"; 1 == true;\n  nil == nil; nil == 1; nil == \"str\"; nil == true;\n  true == true; true == 1; true == false; true == \"str\"; true == nil;\n  \"str\" == \"str\"; \"str\" == \"stru\"; \"str\" == 1; \"str\" == nil; \"str\" == true;\n}\n\nvar elapsed = clock() - start;\nprint \"loop\";\nprint loopTime;\nprint \"elapsed\";\nprint elapsed;\nprint \"equals\";\nprint elapsed - loopTime;\n"
  },
  {
    "path": "test/benchmark/fib.lox",
    "content": "fun fib(n) {\n  if (n < 2) return n;\n  return fib(n - 2) + fib(n - 1);\n}\n\nvar start = clock();\nprint fib(35) == 9227465;\nprint clock() - start;\n"
  },
  {
    "path": "test/benchmark/instantiation.lox",
    "content": "// This benchmark stresses instance creation and initializer calling.\n\nclass Foo {\n  init() {}\n}\n\nvar start = clock();\nvar i = 0;\nwhile (i < 500000) {\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  Foo();\n  i = i + 1;\n}\n\nprint clock() - start;\n"
  },
  {
    "path": "test/benchmark/invocation.lox",
    "content": "// This benchmark stresses just method invocation.\n\nclass Foo {\n  method0() {}\n  method1() {}\n  method2() {}\n  method3() {}\n  method4() {}\n  method5() {}\n  method6() {}\n  method7() {}\n  method8() {}\n  method9() {}\n  method10() {}\n  method11() {}\n  method12() {}\n  method13() {}\n  method14() {}\n  method15() {}\n  method16() {}\n  method17() {}\n  method18() {}\n  method19() {}\n  method20() {}\n  method21() {}\n  method22() {}\n  method23() {}\n  method24() {}\n  method25() {}\n  method26() {}\n  method27() {}\n  method28() {}\n  method29() {}\n}\n\nvar foo = Foo();\nvar start = clock();\nvar i = 0;\nwhile (i < 500000) {\n  foo.method0();\n  foo.method1();\n  foo.method2();\n  foo.method3();\n  foo.method4();\n  foo.method5();\n  foo.method6();\n  foo.method7();\n  foo.method8();\n  foo.method9();\n  foo.method10();\n  foo.method11();\n  foo.method12();\n  foo.method13();\n  foo.method14();\n  foo.method15();\n  foo.method16();\n  foo.method17();\n  foo.method18();\n  foo.method19();\n  foo.method20();\n  foo.method21();\n  foo.method22();\n  foo.method23();\n  foo.method24();\n  foo.method25();\n  foo.method26();\n  foo.method27();\n  foo.method28();\n  foo.method29();\n  i = i + 1;\n}\n\nprint clock() - start;\n"
  },
  {
    "path": "test/benchmark/method_call.lox",
    "content": "class Toggle {\n  init(startState) {\n    this.state = startState;\n  }\n\n  value() { return this.state; }\n\n  activate() {\n    this.state = !this.state;\n    return this;\n  }\n}\n\nclass NthToggle < Toggle {\n  init(startState, maxCounter) {\n    super.init(startState);\n    this.countMax = maxCounter;\n    this.count = 0;\n  }\n\n  activate() {\n    this.count = this.count + 1;\n    if (this.count >= this.countMax) {\n      super.activate();\n      this.count = 0;\n    }\n\n    return this;\n  }\n}\n\nvar start = clock();\nvar n = 100000;\nvar val = true;\nvar toggle = Toggle(val);\n\nfor (var i = 0; i < n; i = i + 1) {\n  val = toggle.activate().value();\n  val = toggle.activate().value();\n  val = toggle.activate().value();\n  val = toggle.activate().value();\n  val = toggle.activate().value();\n  val = toggle.activate().value();\n  val = toggle.activate().value();\n  val = toggle.activate().value();\n  val = toggle.activate().value();\n  val = toggle.activate().value();\n}\n\nprint toggle.value();\n\nval = true;\nvar ntoggle = NthToggle(val, 3);\n\nfor (var i = 0; i < n; i = i + 1) {\n  val = ntoggle.activate().value();\n  val = ntoggle.activate().value();\n  val = ntoggle.activate().value();\n  val = ntoggle.activate().value();\n  val = ntoggle.activate().value();\n  val = ntoggle.activate().value();\n  val = ntoggle.activate().value();\n  val = ntoggle.activate().value();\n  val = ntoggle.activate().value();\n  val = ntoggle.activate().value();\n}\n\nprint ntoggle.value();\nprint clock() - start;\n"
  },
  {
    "path": "test/benchmark/properties.lox",
    "content": "// This benchmark stresses both field and method lookup.\n\nclass Foo {\n  init() {\n    this.field0 = 1;\n    this.field1 = 1;\n    this.field2 = 1;\n    this.field3 = 1;\n    this.field4 = 1;\n    this.field5 = 1;\n    this.field6 = 1;\n    this.field7 = 1;\n    this.field8 = 1;\n    this.field9 = 1;\n    this.field10 = 1;\n    this.field11 = 1;\n    this.field12 = 1;\n    this.field13 = 1;\n    this.field14 = 1;\n    this.field15 = 1;\n    this.field16 = 1;\n    this.field17 = 1;\n    this.field18 = 1;\n    this.field19 = 1;\n    this.field20 = 1;\n    this.field21 = 1;\n    this.field22 = 1;\n    this.field23 = 1;\n    this.field24 = 1;\n    this.field25 = 1;\n    this.field26 = 1;\n    this.field27 = 1;\n    this.field28 = 1;\n    this.field29 = 1;\n  }\n\n  method0() { return this.field0; }\n  method1() { return this.field1; }\n  method2() { return this.field2; }\n  method3() { return this.field3; }\n  method4() { return this.field4; }\n  method5() { return this.field5; }\n  method6() { return this.field6; }\n  method7() { return this.field7; }\n  method8() { return this.field8; }\n  method9() { return this.field9; }\n  method10() { return this.field10; }\n  method11() { return this.field11; }\n  method12() { return this.field12; }\n  method13() { return this.field13; }\n  method14() { return this.field14; }\n  method15() { return this.field15; }\n  method16() { return this.field16; }\n  method17() { return this.field17; }\n  method18() { return this.field18; }\n  method19() { return this.field19; }\n  method20() { return this.field20; }\n  method21() { return this.field21; }\n  method22() { return this.field22; }\n  method23() { return this.field23; }\n  method24() { return this.field24; }\n  method25() { return this.field25; }\n  method26() { return this.field26; }\n  method27() { return this.field27; }\n  method28() { return this.field28; }\n  method29() { return this.field29; }\n}\n\nvar foo = Foo();\nvar start = clock();\nvar i = 
0;\nwhile (i < 500000) {\n  foo.method0();\n  foo.method1();\n  foo.method2();\n  foo.method3();\n  foo.method4();\n  foo.method5();\n  foo.method6();\n  foo.method7();\n  foo.method8();\n  foo.method9();\n  foo.method10();\n  foo.method11();\n  foo.method12();\n  foo.method13();\n  foo.method14();\n  foo.method15();\n  foo.method16();\n  foo.method17();\n  foo.method18();\n  foo.method19();\n  foo.method20();\n  foo.method21();\n  foo.method22();\n  foo.method23();\n  foo.method24();\n  foo.method25();\n  foo.method26();\n  foo.method27();\n  foo.method28();\n  foo.method29();\n  i = i + 1;\n}\n\nprint clock() - start;\n"
  },
  {
    "path": "test/benchmark/string_equality.lox",
    "content": "var a1 = \"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1\";\nvar a2 = \"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa2\";\nvar a3 = \"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa3\";\nvar a4 = \"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4\";\nvar a5 = \"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa5\";\nvar a6 = \"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa6\";\nvar a7 = \"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa7\";\nvar a8 = \"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8\";\n\nvar i = 0;\n\nvar loopStart = clock();\n\nwhile (i < 100000) {\n  i = i + 1;\n\n  a1; a1; a1; a2; a1; a3; a1; a4; a1; a5; a1; a6; a1; a7; a1; a8;\n  a2; a1; a2; a2; a2; a3; a2; a4; a2; a5; a2; a6; a2; a7; a2; a8;\n  a3; a1; a3; a2; a3; a3; a3; a4; a3; a5; a3; a6; a3; a7; a3; a8;\n  a4; a1; a4; a2; a4; a3; a4; a4; a4; a5; a4; a6; a4; a7; a4; a8;\n  a5; a1; a5; a2; a5; a3; a5; a4; a5; a5; a5; a6; a5; a7; a5; a8;\n  a6; a1; a6; a2; a6; a3; a6; a4; a6; a5; a6; a6; a6; a7; a6; a8;\n  a7; a1; a7; a2; a7; a3; a7; a4; a7; a5; a7; a6; a7; a7; a7; a8;\n  a8; a1; a8; a2; a8; a3; a8; a4; a8; a5; a8; a6; a8; a7; a8; a8;\n\n  a1; a1; a1; a2; a1; a3; a1; a4; a1; a5; a1; a6; a1; a7; a1; a8;\n  a2; a1; a2; a2; a2; a3; a2; a4; a2; a5; a2; a6; a2; a7; a2; a8;\n  a3; a1; a3; a2; a3; a3; a3; a4; a3; a5; a3; a6; a3; a7; a3; a8;\n  a4; a1; a4; a2; a4; a3; a4; a4; a4; a5; a4; a6; a4; a7; a4; a8;\n  a5; a1; a5; a2; a5; a3; a5; a4; a5; a5; a5; a6; a5; a7; a5; a8;\n  a6; a1; a6; a2; a6; a3; a6; a4; a6; a5; a6; a6; a6; a7; a6; a8;\n  a7; a1; a7; a2; a7; a3; a7; a4; a7; a5; a7; a6; a7; a7; a7; a8;\n  a8; a1; a8; a2; a8; a3; a8; a4; a8; a5; a8; a6; a8; a7; a8; a8;\n\n  a1; a1; a1; a2; a1; a3; a1; a4; a1; a5; a1; a6; a1; a7; a1; a8;\n  a2; a1; a2; a2; a2; a3; a2; a4; a2; a5; a2; a6; a2; a7; a2; a8;\n  a3; a1; a3; a2; a3; a3; a3; a4; a3; a5; a3; a6; 
a3; a7; a3; a8;\n  a4; a1; a4; a2; a4; a3; a4; a4; a4; a5; a4; a6; a4; a7; a4; a8;\n  a5; a1; a5; a2; a5; a3; a5; a4; a5; a5; a5; a6; a5; a7; a5; a8;\n  a6; a1; a6; a2; a6; a3; a6; a4; a6; a5; a6; a6; a6; a7; a6; a8;\n  a7; a1; a7; a2; a7; a3; a7; a4; a7; a5; a7; a6; a7; a7; a7; a8;\n  a8; a1; a8; a2; a8; a3; a8; a4; a8; a5; a8; a6; a8; a7; a8; a8;\n\n  a1; a1; a1; a2; a1; a3; a1; a4; a1; a5; a1; a6; a1; a7; a1; a8;\n  a2; a1; a2; a2; a2; a3; a2; a4; a2; a5; a2; a6; a2; a7; a2; a8;\n  a3; a1; a3; a2; a3; a3; a3; a4; a3; a5; a3; a6; a3; a7; a3; a8;\n  a4; a1; a4; a2; a4; a3; a4; a4; a4; a5; a4; a6; a4; a7; a4; a8;\n  a5; a1; a5; a2; a5; a3; a5; a4; a5; a5; a5; a6; a5; a7; a5; a8;\n  a6; a1; a6; a2; a6; a3; a6; a4; a6; a5; a6; a6; a6; a7; a6; a8;\n  a7; a1; a7; a2; a7; a3; a7; a4; a7; a5; a7; a6; a7; a7; a7; a8;\n  a8; a1; a8; a2; a8; a3; a8; a4; a8; a5; a8; a6; a8; a7; a8; a8;\n\n  a1; a1; a1; a2; a1; a3; a1; a4; a1; a5; a1; a6; a1; a7; a1; a8;\n  a2; a1; a2; a2; a2; a3; a2; a4; a2; a5; a2; a6; a2; a7; a2; a8;\n  a3; a1; a3; a2; a3; a3; a3; a4; a3; a5; a3; a6; a3; a7; a3; a8;\n  a4; a1; a4; a2; a4; a3; a4; a4; a4; a5; a4; a6; a4; a7; a4; a8;\n  a5; a1; a5; a2; a5; a3; a5; a4; a5; a5; a5; a6; a5; a7; a5; a8;\n  a6; a1; a6; a2; a6; a3; a6; a4; a6; a5; a6; a6; a6; a7; a6; a8;\n  a7; a1; a7; a2; a7; a3; a7; a4; a7; a5; a7; a6; a7; a7; a7; a8;\n  a8; a1; a8; a2; a8; a3; a8; a4; a8; a5; a8; a6; a8; a7; a8; a8;\n\n  a1; a1; a1; a2; a1; a3; a1; a4; a1; a5; a1; a6; a1; a7; a1; a8;\n  a2; a1; a2; a2; a2; a3; a2; a4; a2; a5; a2; a6; a2; a7; a2; a8;\n  a3; a1; a3; a2; a3; a3; a3; a4; a3; a5; a3; a6; a3; a7; a3; a8;\n  a4; a1; a4; a2; a4; a3; a4; a4; a4; a5; a4; a6; a4; a7; a4; a8;\n  a5; a1; a5; a2; a5; a3; a5; a4; a5; a5; a5; a6; a5; a7; a5; a8;\n  a6; a1; a6; a2; a6; a3; a6; a4; a6; a5; a6; a6; a6; a7; a6; a8;\n  a7; a1; a7; a2; a7; a3; a7; a4; a7; a5; a7; a6; a7; a7; a7; a8;\n  a8; a1; a8; a2; a8; a3; a8; a4; a8; a5; a8; a6; a8; a7; a8; a8;\n\n  a1; a1; a1; a2; a1; a3; a1; 
a4; a1; a5; a1; a6; a1; a7; a1; a8;\n  a2; a1; a2; a2; a2; a3; a2; a4; a2; a5; a2; a6; a2; a7; a2; a8;\n  a3; a1; a3; a2; a3; a3; a3; a4; a3; a5; a3; a6; a3; a7; a3; a8;\n  a4; a1; a4; a2; a4; a3; a4; a4; a4; a5; a4; a6; a4; a7; a4; a8;\n  a5; a1; a5; a2; a5; a3; a5; a4; a5; a5; a5; a6; a5; a7; a5; a8;\n  a6; a1; a6; a2; a6; a3; a6; a4; a6; a5; a6; a6; a6; a7; a6; a8;\n  a7; a1; a7; a2; a7; a3; a7; a4; a7; a5; a7; a6; a7; a7; a7; a8;\n  a8; a1; a8; a2; a8; a3; a8; a4; a8; a5; a8; a6; a8; a7; a8; a8;\n\n  a1; a1; a1; a2; a1; a3; a1; a4; a1; a5; a1; a6; a1; a7; a1; a8;\n  a2; a1; a2; a2; a2; a3; a2; a4; a2; a5; a2; a6; a2; a7; a2; a8;\n  a3; a1; a3; a2; a3; a3; a3; a4; a3; a5; a3; a6; a3; a7; a3; a8;\n  a4; a1; a4; a2; a4; a3; a4; a4; a4; a5; a4; a6; a4; a7; a4; a8;\n  a5; a1; a5; a2; a5; a3; a5; a4; a5; a5; a5; a6; a5; a7; a5; a8;\n  a6; a1; a6; a2; a6; a3; a6; a4; a6; a5; a6; a6; a6; a7; a6; a8;\n  a7; a1; a7; a2; a7; a3; a7; a4; a7; a5; a7; a6; a7; a7; a7; a8;\n  a8; a1; a8; a2; a8; a3; a8; a4; a8; a5; a8; a6; a8; a7; a8; a8;\n\n  a1; a1; a1; a2; a1; a3; a1; a4; a1; a5; a1; a6; a1; a7; a1; a8;\n  a2; a1; a2; a2; a2; a3; a2; a4; a2; a5; a2; a6; a2; a7; a2; a8;\n  a3; a1; a3; a2; a3; a3; a3; a4; a3; a5; a3; a6; a3; a7; a3; a8;\n  a4; a1; a4; a2; a4; a3; a4; a4; a4; a5; a4; a6; a4; a7; a4; a8;\n  a5; a1; a5; a2; a5; a3; a5; a4; a5; a5; a5; a6; a5; a7; a5; a8;\n  a6; a1; a6; a2; a6; a3; a6; a4; a6; a5; a6; a6; a6; a7; a6; a8;\n  a7; a1; a7; a2; a7; a3; a7; a4; a7; a5; a7; a6; a7; a7; a7; a8;\n  a8; a1; a8; a2; a8; a3; a8; a4; a8; a5; a8; a6; a8; a7; a8; a8;\n\n  a1; a1; a1; a2; a1; a3; a1; a4; a1; a5; a1; a6; a1; a7; a1; a8;\n  a2; a1; a2; a2; a2; a3; a2; a4; a2; a5; a2; a6; a2; a7; a2; a8;\n  a3; a1; a3; a2; a3; a3; a3; a4; a3; a5; a3; a6; a3; a7; a3; a8;\n  a4; a1; a4; a2; a4; a3; a4; a4; a4; a5; a4; a6; a4; a7; a4; a8;\n  a5; a1; a5; a2; a5; a3; a5; a4; a5; a5; a5; a6; a5; a7; a5; a8;\n  a6; a1; a6; a2; a6; a3; a6; a4; a6; a5; a6; a6; a6; a7; a6; a8;\n  a7; a1; a7; 
a2; a7; a3; a7; a4; a7; a5; a7; a6; a7; a7; a7; a8;\n  a8; a1; a8; a2; a8; a3; a8; a4; a8; a5; a8; a6; a8; a7; a8; a8;\n}\n\nvar loopTime = clock() - loopStart;\n\nvar start = clock();\n\ni = 0;\nwhile (i < 100000) {\n  i = i + 1;\n\n  // 1 == 1; 1 == 2; 1 == nil; 1 == \"str\"; 1 == true;\n  // nil == nil; nil == 1; nil == \"str\"; nil == true;\n  // true == true; true == 1; true == false; true == \"str\"; true == nil;\n  // \"str\" == \"str\"; \"str\" == \"stru\"; \"str\" == 1; \"str\" == nil; \"str\" == true;\n\n  a1 == a1; a1 == a2; a1 == a3; a1 == a4; a1 == a5; a1 == a6; a1 == a7; a1 == a8;\n  a2 == a1; a2 == a2; a2 == a3; a2 == a4; a2 == a5; a2 == a6; a2 == a7; a2 == a8;\n  a3 == a1; a3 == a2; a3 == a3; a3 == a4; a3 == a5; a3 == a6; a3 == a7; a3 == a8;\n  a4 == a1; a4 == a2; a4 == a3; a4 == a4; a4 == a5; a4 == a6; a4 == a7; a4 == a8;\n  a5 == a1; a5 == a2; a5 == a3; a5 == a4; a5 == a5; a5 == a6; a5 == a7; a5 == a8;\n  a6 == a1; a6 == a2; a6 == a3; a6 == a4; a6 == a5; a6 == a6; a6 == a7; a6 == a8;\n  a7 == a1; a7 == a2; a7 == a3; a7 == a4; a7 == a5; a7 == a6; a7 == a7; a7 == a8;\n  a8 == a1; a8 == a2; a8 == a3; a8 == a4; a8 == a5; a8 == a6; a8 == a7; a8 == a8;\n\n  a1 == a1; a1 == a2; a1 == a3; a1 == a4; a1 == a5; a1 == a6; a1 == a7; a1 == a8;\n  a2 == a1; a2 == a2; a2 == a3; a2 == a4; a2 == a5; a2 == a6; a2 == a7; a2 == a8;\n  a3 == a1; a3 == a2; a3 == a3; a3 == a4; a3 == a5; a3 == a6; a3 == a7; a3 == a8;\n  a4 == a1; a4 == a2; a4 == a3; a4 == a4; a4 == a5; a4 == a6; a4 == a7; a4 == a8;\n  a5 == a1; a5 == a2; a5 == a3; a5 == a4; a5 == a5; a5 == a6; a5 == a7; a5 == a8;\n  a6 == a1; a6 == a2; a6 == a3; a6 == a4; a6 == a5; a6 == a6; a6 == a7; a6 == a8;\n  a7 == a1; a7 == a2; a7 == a3; a7 == a4; a7 == a5; a7 == a6; a7 == a7; a7 == a8;\n  a8 == a1; a8 == a2; a8 == a3; a8 == a4; a8 == a5; a8 == a6; a8 == a7; a8 == a8;\n\n  a1 == a1; a1 == a2; a1 == a3; a1 == a4; a1 == a5; a1 == a6; a1 == a7; a1 == a8;\n  a2 == a1; a2 == a2; a2 == a3; a2 == a4; a2 == a5; a2 == a6; a2 
== a7; a2 == a8;\n  a3 == a1; a3 == a2; a3 == a3; a3 == a4; a3 == a5; a3 == a6; a3 == a7; a3 == a8;\n  a4 == a1; a4 == a2; a4 == a3; a4 == a4; a4 == a5; a4 == a6; a4 == a7; a4 == a8;\n  a5 == a1; a5 == a2; a5 == a3; a5 == a4; a5 == a5; a5 == a6; a5 == a7; a5 == a8;\n  a6 == a1; a6 == a2; a6 == a3; a6 == a4; a6 == a5; a6 == a6; a6 == a7; a6 == a8;\n  a7 == a1; a7 == a2; a7 == a3; a7 == a4; a7 == a5; a7 == a6; a7 == a7; a7 == a8;\n  a8 == a1; a8 == a2; a8 == a3; a8 == a4; a8 == a5; a8 == a6; a8 == a7; a8 == a8;\n\n  a1 == a1; a1 == a2; a1 == a3; a1 == a4; a1 == a5; a1 == a6; a1 == a7; a1 == a8;\n  a2 == a1; a2 == a2; a2 == a3; a2 == a4; a2 == a5; a2 == a6; a2 == a7; a2 == a8;\n  a3 == a1; a3 == a2; a3 == a3; a3 == a4; a3 == a5; a3 == a6; a3 == a7; a3 == a8;\n  a4 == a1; a4 == a2; a4 == a3; a4 == a4; a4 == a5; a4 == a6; a4 == a7; a4 == a8;\n  a5 == a1; a5 == a2; a5 == a3; a5 == a4; a5 == a5; a5 == a6; a5 == a7; a5 == a8;\n  a6 == a1; a6 == a2; a6 == a3; a6 == a4; a6 == a5; a6 == a6; a6 == a7; a6 == a8;\n  a7 == a1; a7 == a2; a7 == a3; a7 == a4; a7 == a5; a7 == a6; a7 == a7; a7 == a8;\n  a8 == a1; a8 == a2; a8 == a3; a8 == a4; a8 == a5; a8 == a6; a8 == a7; a8 == a8;\n\n  a1 == a1; a1 == a2; a1 == a3; a1 == a4; a1 == a5; a1 == a6; a1 == a7; a1 == a8;\n  a2 == a1; a2 == a2; a2 == a3; a2 == a4; a2 == a5; a2 == a6; a2 == a7; a2 == a8;\n  a3 == a1; a3 == a2; a3 == a3; a3 == a4; a3 == a5; a3 == a6; a3 == a7; a3 == a8;\n  a4 == a1; a4 == a2; a4 == a3; a4 == a4; a4 == a5; a4 == a6; a4 == a7; a4 == a8;\n  a5 == a1; a5 == a2; a5 == a3; a5 == a4; a5 == a5; a5 == a6; a5 == a7; a5 == a8;\n  a6 == a1; a6 == a2; a6 == a3; a6 == a4; a6 == a5; a6 == a6; a6 == a7; a6 == a8;\n  a7 == a1; a7 == a2; a7 == a3; a7 == a4; a7 == a5; a7 == a6; a7 == a7; a7 == a8;\n  a8 == a1; a8 == a2; a8 == a3; a8 == a4; a8 == a5; a8 == a6; a8 == a7; a8 == a8;\n\n  a1 == a1; a1 == a2; a1 == a3; a1 == a4; a1 == a5; a1 == a6; a1 == a7; a1 == a8;\n  a2 == a1; a2 == a2; a2 == a3; a2 == a4; a2 == a5; a2 == a6; a2 
== a7; a2 == a8;\n  a3 == a1; a3 == a2; a3 == a3; a3 == a4; a3 == a5; a3 == a6; a3 == a7; a3 == a8;\n  a4 == a1; a4 == a2; a4 == a3; a4 == a4; a4 == a5; a4 == a6; a4 == a7; a4 == a8;\n  a5 == a1; a5 == a2; a5 == a3; a5 == a4; a5 == a5; a5 == a6; a5 == a7; a5 == a8;\n  a6 == a1; a6 == a2; a6 == a3; a6 == a4; a6 == a5; a6 == a6; a6 == a7; a6 == a8;\n  a7 == a1; a7 == a2; a7 == a3; a7 == a4; a7 == a5; a7 == a6; a7 == a7; a7 == a8;\n  a8 == a1; a8 == a2; a8 == a3; a8 == a4; a8 == a5; a8 == a6; a8 == a7; a8 == a8;\n\n  a1 == a1; a1 == a2; a1 == a3; a1 == a4; a1 == a5; a1 == a6; a1 == a7; a1 == a8;\n  a2 == a1; a2 == a2; a2 == a3; a2 == a4; a2 == a5; a2 == a6; a2 == a7; a2 == a8;\n  a3 == a1; a3 == a2; a3 == a3; a3 == a4; a3 == a5; a3 == a6; a3 == a7; a3 == a8;\n  a4 == a1; a4 == a2; a4 == a3; a4 == a4; a4 == a5; a4 == a6; a4 == a7; a4 == a8;\n  a5 == a1; a5 == a2; a5 == a3; a5 == a4; a5 == a5; a5 == a6; a5 == a7; a5 == a8;\n  a6 == a1; a6 == a2; a6 == a3; a6 == a4; a6 == a5; a6 == a6; a6 == a7; a6 == a8;\n  a7 == a1; a7 == a2; a7 == a3; a7 == a4; a7 == a5; a7 == a6; a7 == a7; a7 == a8;\n  a8 == a1; a8 == a2; a8 == a3; a8 == a4; a8 == a5; a8 == a6; a8 == a7; a8 == a8;\n\n  a1 == a1; a1 == a2; a1 == a3; a1 == a4; a1 == a5; a1 == a6; a1 == a7; a1 == a8;\n  a2 == a1; a2 == a2; a2 == a3; a2 == a4; a2 == a5; a2 == a6; a2 == a7; a2 == a8;\n  a3 == a1; a3 == a2; a3 == a3; a3 == a4; a3 == a5; a3 == a6; a3 == a7; a3 == a8;\n  a4 == a1; a4 == a2; a4 == a3; a4 == a4; a4 == a5; a4 == a6; a4 == a7; a4 == a8;\n  a5 == a1; a5 == a2; a5 == a3; a5 == a4; a5 == a5; a5 == a6; a5 == a7; a5 == a8;\n  a6 == a1; a6 == a2; a6 == a3; a6 == a4; a6 == a5; a6 == a6; a6 == a7; a6 == a8;\n  a7 == a1; a7 == a2; a7 == a3; a7 == a4; a7 == a5; a7 == a6; a7 == a7; a7 == a8;\n  a8 == a1; a8 == a2; a8 == a3; a8 == a4; a8 == a5; a8 == a6; a8 == a7; a8 == a8;\n\n  a1 == a1; a1 == a2; a1 == a3; a1 == a4; a1 == a5; a1 == a6; a1 == a7; a1 == a8;\n  a2 == a1; a2 == a2; a2 == a3; a2 == a4; a2 == a5; a2 == a6; a2 
== a7; a2 == a8;\n  a3 == a1; a3 == a2; a3 == a3; a3 == a4; a3 == a5; a3 == a6; a3 == a7; a3 == a8;\n  a4 == a1; a4 == a2; a4 == a3; a4 == a4; a4 == a5; a4 == a6; a4 == a7; a4 == a8;\n  a5 == a1; a5 == a2; a5 == a3; a5 == a4; a5 == a5; a5 == a6; a5 == a7; a5 == a8;\n  a6 == a1; a6 == a2; a6 == a3; a6 == a4; a6 == a5; a6 == a6; a6 == a7; a6 == a8;\n  a7 == a1; a7 == a2; a7 == a3; a7 == a4; a7 == a5; a7 == a6; a7 == a7; a7 == a8;\n  a8 == a1; a8 == a2; a8 == a3; a8 == a4; a8 == a5; a8 == a6; a8 == a7; a8 == a8;\n\n  a1 == a1; a1 == a2; a1 == a3; a1 == a4; a1 == a5; a1 == a6; a1 == a7; a1 == a8;\n  a2 == a1; a2 == a2; a2 == a3; a2 == a4; a2 == a5; a2 == a6; a2 == a7; a2 == a8;\n  a3 == a1; a3 == a2; a3 == a3; a3 == a4; a3 == a5; a3 == a6; a3 == a7; a3 == a8;\n  a4 == a1; a4 == a2; a4 == a3; a4 == a4; a4 == a5; a4 == a6; a4 == a7; a4 == a8;\n  a5 == a1; a5 == a2; a5 == a3; a5 == a4; a5 == a5; a5 == a6; a5 == a7; a5 == a8;\n  a6 == a1; a6 == a2; a6 == a3; a6 == a4; a6 == a5; a6 == a6; a6 == a7; a6 == a8;\n  a7 == a1; a7 == a2; a7 == a3; a7 == a4; a7 == a5; a7 == a6; a7 == a7; a7 == a8;\n  a8 == a1; a8 == a2; a8 == a3; a8 == a4; a8 == a5; a8 == a6; a8 == a7; a8 == a8;\n\n}\n\nvar elapsed = clock() - start;\nprint \"loop\";\nprint loopTime;\nprint \"elapsed\";\nprint elapsed;\nprint \"equals\";\nprint elapsed - loopTime;\n"
  },
  {
    "path": "test/benchmark/trees.lox",
    "content": "class Tree {\n  init(depth) {\n    this.depth = depth;\n    if (depth > 0) {\n      this.a = Tree(depth - 1);\n      this.b = Tree(depth - 1);\n      this.c = Tree(depth - 1);\n      this.d = Tree(depth - 1);\n      this.e = Tree(depth - 1);\n    }\n  }\n\n  walk() {\n    if (this.depth == 0) return 0;\n    return this.depth\n        + this.a.walk()\n        + this.b.walk()\n        + this.c.walk()\n        + this.d.walk()\n        + this.e.walk();\n  }\n}\n\nvar tree = Tree(8);\nvar start = clock();\nfor (var i = 0; i < 100; i = i + 1) {\n  if (tree.walk() != 122068) print \"Error\";\n}\nprint clock() - start;\n"
  },
  {
    "path": "test/benchmark/zoo.lox",
    "content": "class Zoo {\n  init() {\n    this.aarvark  = 1;\n    this.baboon   = 1;\n    this.cat      = 1;\n    this.donkey   = 1;\n    this.elephant = 1;\n    this.fox      = 1;\n  }\n  ant()    { return this.aarvark; }\n  banana() { return this.baboon; }\n  tuna()   { return this.cat; }\n  hay()    { return this.donkey; }\n  grass()  { return this.elephant; }\n  mouse()  { return this.fox; }\n}\n\nvar zoo = Zoo();\nvar sum = 0;\nvar start = clock();\nwhile (sum < 10000000) {\n  sum = sum + zoo.ant()\n            + zoo.banana()\n            + zoo.tuna()\n            + zoo.hay()\n            + zoo.grass()\n            + zoo.mouse();\n}\n\nprint sum;\nprint clock() - start;\n"
  },
  {
    "path": "test/benchmark/zoo_batch.lox",
    "content": "class Zoo {\n  init() {\n    this.aarvark  = 1;\n    this.baboon   = 1;\n    this.cat      = 1;\n    this.donkey   = 1;\n    this.elephant = 1;\n    this.fox      = 1;\n  }\n  ant()    { return this.aarvark; }\n  banana() { return this.baboon; }\n  tuna()   { return this.cat; }\n  hay()    { return this.donkey; }\n  grass()  { return this.elephant; }\n  mouse()  { return this.fox; }\n}\n\nvar zoo = Zoo();\nvar sum = 0;\nvar start = clock();\nvar batch = 0;\nwhile (clock() - start < 10) {\n  for (var i = 0; i < 10000; i = i + 1) {\n    sum = sum + zoo.ant()\n              + zoo.banana()\n              + zoo.tuna()\n              + zoo.hay()\n              + zoo.grass()\n              + zoo.mouse();\n  }\n  batch = batch + 1;\n}\n\nprint sum;\nprint batch;\nprint clock() - start;\n"
  },
  {
    "path": "test/block/empty.lox",
    "content": "{} // By itself.\n\n// In a statement.\nif (true) {}\nif (false) {} else {}\n\nprint \"ok\"; // expect: ok\n"
  },
  {
    "path": "test/block/scope.lox",
    "content": "var a = \"outer\";\n\n{\n  var a = \"inner\";\n  print a; // expect: inner\n}\n\nprint a; // expect: outer\n"
  },
  {
    "path": "test/bool/equality.lox",
    "content": "print true == true;    // expect: true\nprint true == false;   // expect: false\nprint false == true;   // expect: false\nprint false == false;  // expect: true\n\n// Not equal to other types.\nprint true == 1;        // expect: false\nprint false == 0;       // expect: false\nprint true == \"true\";   // expect: false\nprint false == \"false\"; // expect: false\nprint false == \"\";      // expect: false\n\nprint true != true;    // expect: false\nprint true != false;   // expect: true\nprint false != true;   // expect: true\nprint false != false;  // expect: false\n\n// Not equal to other types.\nprint true != 1;        // expect: true\nprint false != 0;       // expect: true\nprint true != \"true\";   // expect: true\nprint false != \"false\"; // expect: true\nprint false != \"\";      // expect: true\n"
  },
  {
    "path": "test/bool/not.lox",
    "content": "print !true;    // expect: false\nprint !false;   // expect: true\nprint !!true;   // expect: true\n"
  },
  {
    "path": "test/call/bool.lox",
    "content": "true(); // expect runtime error: Can only call functions and classes.\n"
  },
  {
    "path": "test/call/nil.lox",
    "content": "nil(); // expect runtime error: Can only call functions and classes.\n"
  },
  {
    "path": "test/call/num.lox",
    "content": "123(); // expect runtime error: Can only call functions and classes.\n"
  },
  {
    "path": "test/call/object.lox",
    "content": "class Foo {}\n\nvar foo = Foo();\nfoo(); // expect runtime error: Can only call functions and classes.\n"
  },
  {
    "path": "test/call/string.lox",
    "content": "\"str\"(); // expect runtime error: Can only call functions and classes.\n"
  },
  {
    "path": "test/class/empty.lox",
    "content": "class Foo {}\n\nprint Foo; // expect: Foo\n"
  },
  {
    "path": "test/class/inherit_self.lox",
    "content": "class Foo < Foo {} // Error at 'Foo': A class can't inherit from itself.\n"
  },
  {
    "path": "test/class/inherited_method.lox",
    "content": "class Foo {\n  inFoo() {\n    print \"in foo\";\n  }\n}\n\nclass Bar < Foo {\n  inBar() {\n    print \"in bar\";\n  }\n}\n\nclass Baz < Bar {\n  inBaz() {\n    print \"in baz\";\n  }\n}\n\nvar baz = Baz();\nbaz.inFoo(); // expect: in foo\nbaz.inBar(); // expect: in bar\nbaz.inBaz(); // expect: in baz\n"
  },
  {
    "path": "test/class/local_inherit_other.lox",
    "content": "class A {}\n\nfun f() {\n  class B < A {}\n  return B;\n}\n\nprint f(); // expect: B\n"
  },
  {
    "path": "test/class/local_inherit_self.lox",
    "content": "{\n  class Foo < Foo {} // Error at 'Foo': A class can't inherit from itself.\n}\n// [c line 5] Error at end: Expect '}' after block.\n"
  },
  {
    "path": "test/class/local_reference_self.lox",
    "content": "{\n  class Foo {\n    returnSelf() {\n      return Foo;\n    }\n  }\n\n  print Foo().returnSelf(); // expect: Foo\n}\n"
  },
  {
    "path": "test/class/reference_self.lox",
    "content": "class Foo {\n  returnSelf() {\n    return Foo;\n  }\n}\n\nprint Foo().returnSelf(); // expect: Foo\n"
  },
  {
    "path": "test/closure/assign_to_closure.lox",
    "content": "var f;\nvar g;\n\n{\n  var local = \"local\";\n  fun f_() {\n    print local;\n    local = \"after f\";\n    print local;\n  }\n  f = f_;\n\n  fun g_() {\n    print local;\n    local = \"after g\";\n    print local;\n  }\n  g = g_;\n}\n\nf();\n// expect: local\n// expect: after f\n\ng();\n// expect: after f\n// expect: after g\n"
  },
  {
    "path": "test/closure/assign_to_shadowed_later.lox",
    "content": "var a = \"global\";\n\n{\n  fun assign() {\n    a = \"assigned\";\n  }\n\n  var a = \"inner\";\n  assign();\n  print a; // expect: inner\n}\n\nprint a; // expect: assigned\n"
  },
  {
    "path": "test/closure/close_over_function_parameter.lox",
    "content": "var f;\n\nfun foo(param) {\n  fun f_() {\n    print param;\n  }\n  f = f_;\n}\nfoo(\"param\");\n\nf(); // expect: param\n"
  },
  {
    "path": "test/closure/close_over_later_variable.lox",
    "content": "// This is a regression test. There was a bug where if an upvalue for an\n// earlier local (here \"a\") was captured *after* a later one (\"b\"), then it\n// would crash because it walked to the end of the upvalue list (correct), but\n// then didn't handle not finding the variable.\n\nfun f() {\n  var a = \"a\";\n  var b = \"b\";\n  fun g() {\n    print b; // expect: b\n    print a; // expect: a\n  }\n  g();\n}\nf();\n"
  },
  {
    "path": "test/closure/close_over_method_parameter.lox",
    "content": "var f;\n\nclass Foo {\n  method(param) {\n    fun f_() {\n      print param;\n    }\n    f = f_;\n  }\n}\n\nFoo().method(\"param\");\nf(); // expect: param\n"
  },
  {
    "path": "test/closure/closed_closure_in_function.lox",
    "content": "var f;\n\n{\n  var local = \"local\";\n  fun f_() {\n    print local;\n  }\n  f = f_;\n}\n\nf(); // expect: local\n"
  },
  {
    "path": "test/closure/nested_closure.lox",
    "content": "var f;\n\nfun f1() {\n  var a = \"a\";\n  fun f2() {\n    var b = \"b\";\n    fun f3() {\n      var c = \"c\";\n      fun f4() {\n        print a;\n        print b;\n        print c;\n      }\n      f = f4;\n    }\n    f3();\n  }\n  f2();\n}\nf1();\n\nf();\n// expect: a\n// expect: b\n// expect: c\n"
  },
  {
    "path": "test/closure/open_closure_in_function.lox",
    "content": "{\n  var local = \"local\";\n  fun f() {\n    print local; // expect: local\n  }\n  f();\n}\n"
  },
  {
    "path": "test/closure/reference_closure_multiple_times.lox",
    "content": "var f;\n\n{\n  var a = \"a\";\n  fun f_() {\n    print a;\n    print a;\n  }\n  f = f_;\n}\n\nf();\n// expect: a\n// expect: a\n"
  },
  {
    "path": "test/closure/reuse_closure_slot.lox",
    "content": "{\n  var f;\n\n  {\n    var a = \"a\";\n    fun f_() { print a; }\n    f = f_;\n  }\n\n  {\n    // Since a is out of scope, the local slot will be reused by b. Make sure\n    // that f still closes over a.\n    var b = \"b\";\n    f(); // expect: a\n  }\n}\n"
  },
  {
    "path": "test/closure/shadow_closure_with_local.lox",
    "content": "{\n  var foo = \"closure\";\n  fun f() {\n    {\n      print foo; // expect: closure\n      var foo = \"shadow\";\n      print foo; // expect: shadow\n    }\n    print foo; // expect: closure\n  }\n  f();\n}\n"
  },
  {
    "path": "test/closure/unused_closure.lox",
    "content": "// This is a regression test. There was a bug where the VM would try to close\n// an upvalue even if the upvalue was never created because the codepath for\n// the closure was not executed.\n\n{\n  var a = \"a\";\n  if (false) {\n    fun foo() { a; }\n  }\n}\n\n// If we get here, we didn't segfault when a went out of scope.\nprint \"ok\"; // expect: ok\n"
  },
  {
    "path": "test/closure/unused_later_closure.lox",
    "content": "// This is a regression test. When closing upvalues for discarded locals, it\n// wouldn't make sure it discarded the upvalue for the correct stack slot.\n//\n// Here we create two locals that can be closed over, but only the first one\n// actually is. When \"b\" goes out of scope, we need to make sure we don't\n// prematurely close \"a\".\nvar closure;\n\n{\n  var a = \"a\";\n\n  {\n    var b = \"b\";\n    fun returnA() {\n      return a;\n    }\n\n    closure = returnA;\n\n    if (false) {\n      fun returnB() {\n        return b;\n      }\n    }\n  }\n\n  print closure(); // expect: a\n}\n"
  },
  {
    "path": "test/comments/line_at_eof.lox",
    "content": "print \"ok\"; // expect: ok\n// comment"
  },
  {
    "path": "test/comments/only_line_comment.lox",
    "content": "// comment"
  },
  {
    "path": "test/comments/only_line_comment_and_line.lox",
    "content": "// comment\n"
  },
  {
    "path": "test/comments/unicode.lox",
    "content": "// Unicode characters are allowed in comments.\n//\n// Latin 1 Supplement: £§¶ÜÞ\n// Latin Extended-A: ĐĦŋœ\n// Latin Extended-B: ƂƢƩǁ\n// Other stuff: ឃᢆ᯽₪ℜ↩⊗┺░\n// Emoji: ☃☺♣\n\nprint \"ok\"; // expect: ok\n"
  },
  {
    "path": "test/constructor/arguments.lox",
    "content": "class Foo {\n  init(a, b) {\n    print \"init\"; // expect: init\n    this.a = a;\n    this.b = b;\n  }\n}\n\nvar foo = Foo(1, 2);\nprint foo.a; // expect: 1\nprint foo.b; // expect: 2\n"
  },
  {
    "path": "test/constructor/call_init_early_return.lox",
    "content": "class Foo {\n  init() {\n    print \"init\";\n    return;\n    print \"nope\";\n  }\n}\n\nvar foo = Foo(); // expect: init\nprint foo.init(); // expect: init\n// expect: Foo instance\n"
  },
  {
    "path": "test/constructor/call_init_explicitly.lox",
    "content": "class Foo {\n  init(arg) {\n    print \"Foo.init(\" + arg + \")\";\n    this.field = \"init\";\n  }\n}\n\nvar foo = Foo(\"one\"); // expect: Foo.init(one)\nfoo.field = \"field\";\n\nvar foo2 = foo.init(\"two\"); // expect: Foo.init(two)\nprint foo2; // expect: Foo instance\n\n// Make sure init() doesn't create a fresh instance.\nprint foo.field; // expect: init\n"
  },
  {
    "path": "test/constructor/default.lox",
    "content": "class Foo {}\n\nvar foo = Foo();\nprint foo; // expect: Foo instance\n"
  },
  {
    "path": "test/constructor/default_arguments.lox",
    "content": "class Foo {}\n\nvar foo = Foo(1, 2, 3); // expect runtime error: Expected 0 arguments but got 3.\n"
  },
  {
    "path": "test/constructor/early_return.lox",
    "content": "class Foo {\n  init() {\n    print \"init\";\n    return;\n    print \"nope\";\n  }\n}\n\nvar foo = Foo(); // expect: init\nprint foo; // expect: Foo instance\n"
  },
  {
    "path": "test/constructor/extra_arguments.lox",
    "content": "class Foo {\n  init(a, b) {\n    this.a = a;\n    this.b = b;\n  }\n}\n\nvar foo = Foo(1, 2, 3, 4); // expect runtime error: Expected 2 arguments but got 4."
  },
  {
    "path": "test/constructor/init_not_method.lox",
    "content": "class Foo {\n  init(arg) {\n    print \"Foo.init(\" + arg + \")\";\n    this.field = \"init\";\n  }\n}\n\nfun init() {\n  print \"not initializer\";\n}\n\ninit(); // expect: not initializer\n"
  },
  {
    "path": "test/constructor/missing_arguments.lox",
    "content": "class Foo {\n  init(a, b) {}\n}\n\nvar foo = Foo(1); // expect runtime error: Expected 2 arguments but got 1.\n"
  },
  {
    "path": "test/constructor/return_in_nested_function.lox",
    "content": "class Foo {\n  init() {\n    fun init() {\n      return \"bar\";\n    }\n    print init(); // expect: bar\n  }\n}\n\nprint Foo(); // expect: Foo instance\n"
  },
  {
    "path": "test/constructor/return_value.lox",
    "content": "class Foo {\n  init() {\n    return \"result\"; // Error at 'return': Can't return a value from an initializer.\n  }\n}\n"
  },
  {
    "path": "test/empty_file.lox",
    "content": ""
  },
  {
    "path": "test/expressions/evaluate.lox",
    "content": "// Note: This is just for the expression evaluating chapter which evaluates an\n// expression directly.\n(5 - (3 - 1)) + -1\n// expect: 2\n"
  },
  {
    "path": "test/expressions/parse.lox",
    "content": "// Note: This is just for the expression parsing chapter which prints the AST.\n(5 - (3 - 1)) + -1\n// expect: (+ (group (- 5.0 (group (- 3.0 1.0)))) (- 1.0))\n"
  },
  {
    "path": "test/field/call_function_field.lox",
    "content": "class Foo {}\n\nfun bar(a, b) {\n  print \"bar\";\n  print a;\n  print b;\n}\n\nvar foo = Foo();\nfoo.bar = bar;\n\nfoo.bar(1, 2);\n// expect: bar\n// expect: 1\n// expect: 2\n"
  },
  {
    "path": "test/field/call_nonfunction_field.lox",
    "content": "class Foo {}\n\nvar foo = Foo();\nfoo.bar = \"not fn\";\n\nfoo.bar(); // expect runtime error: Can only call functions and classes.\n"
  },
  {
    "path": "test/field/get_and_set_method.lox",
    "content": "// Bound methods have identity equality.\nclass Foo {\n  method(a) {\n    print \"method\";\n    print a;\n  }\n  other(a) {\n    print \"other\";\n    print a;\n  }\n}\n\nvar foo = Foo();\nvar method = foo.method;\n\n// Setting a property shadows the instance method.\nfoo.method = foo.other;\nfoo.method(1);\n// expect: other\n// expect: 1\n\n// The old method handle still points to the original method.\nmethod(2);\n// expect: method\n// expect: 2\n"
  },
  {
    "path": "test/field/get_on_bool.lox",
    "content": "true.foo; // expect runtime error: Only instances have properties.\n"
  },
  {
    "path": "test/field/get_on_class.lox",
    "content": "class Foo {}\nFoo.bar; // expect runtime error: Only instances have properties.\n"
  },
  {
    "path": "test/field/get_on_function.lox",
    "content": "fun foo() {}\n\nfoo.bar; // expect runtime error: Only instances have properties.\n"
  },
  {
    "path": "test/field/get_on_nil.lox",
    "content": "nil.foo; // expect runtime error: Only instances have properties.\n"
  },
  {
    "path": "test/field/get_on_num.lox",
    "content": "123.foo; // expect runtime error: Only instances have properties.\n"
  },
  {
    "path": "test/field/get_on_string.lox",
    "content": "\"str\".foo; // expect runtime error: Only instances have properties.\n"
  },
  {
    "path": "test/field/many.lox",
    "content": "class Foo {}\n\nvar foo = Foo();\nfun setFields() {\n  foo.bilberry = \"bilberry\";\n  foo.lime = \"lime\";\n  foo.elderberry = \"elderberry\";\n  foo.raspberry = \"raspberry\";\n  foo.gooseberry = \"gooseberry\";\n  foo.longan = \"longan\";\n  foo.mandarine = \"mandarine\";\n  foo.kiwifruit = \"kiwifruit\";\n  foo.orange = \"orange\";\n  foo.pomegranate = \"pomegranate\";\n  foo.tomato = \"tomato\";\n  foo.banana = \"banana\";\n  foo.juniper = \"juniper\";\n  foo.damson = \"damson\";\n  foo.blackcurrant = \"blackcurrant\";\n  foo.peach = \"peach\";\n  foo.grape = \"grape\";\n  foo.mango = \"mango\";\n  foo.redcurrant = \"redcurrant\";\n  foo.watermelon = \"watermelon\";\n  foo.plumcot = \"plumcot\";\n  foo.papaya = \"papaya\";\n  foo.cloudberry = \"cloudberry\";\n  foo.rambutan = \"rambutan\";\n  foo.salak = \"salak\";\n  foo.physalis = \"physalis\";\n  foo.huckleberry = \"huckleberry\";\n  foo.coconut = \"coconut\";\n  foo.date = \"date\";\n  foo.tamarind = \"tamarind\";\n  foo.lychee = \"lychee\";\n  foo.raisin = \"raisin\";\n  foo.apple = \"apple\";\n  foo.avocado = \"avocado\";\n  foo.nectarine = \"nectarine\";\n  foo.pomelo = \"pomelo\";\n  foo.melon = \"melon\";\n  foo.currant = \"currant\";\n  foo.plum = \"plum\";\n  foo.persimmon = \"persimmon\";\n  foo.olive = \"olive\";\n  foo.cranberry = \"cranberry\";\n  foo.boysenberry = \"boysenberry\";\n  foo.blackberry = \"blackberry\";\n  foo.passionfruit = \"passionfruit\";\n  foo.mulberry = \"mulberry\";\n  foo.marionberry = \"marionberry\";\n  foo.plantain = \"plantain\";\n  foo.lemon = \"lemon\";\n  foo.yuzu = \"yuzu\";\n  foo.loquat = \"loquat\";\n  foo.kumquat = \"kumquat\";\n  foo.salmonberry = \"salmonberry\";\n  foo.tangerine = \"tangerine\";\n  foo.durian = \"durian\";\n  foo.pear = \"pear\";\n  foo.cantaloupe = \"cantaloupe\";\n  foo.quince = \"quince\";\n  foo.guava = \"guava\";\n  foo.strawberry = \"strawberry\";\n  foo.nance = \"nance\";\n  foo.apricot = \"apricot\";\n  foo.jambul = 
\"jambul\";\n  foo.grapefruit = \"grapefruit\";\n  foo.clementine = \"clementine\";\n  foo.jujube = \"jujube\";\n  foo.cherry = \"cherry\";\n  foo.feijoa = \"feijoa\";\n  foo.jackfruit = \"jackfruit\";\n  foo.fig = \"fig\";\n  foo.cherimoya = \"cherimoya\";\n  foo.pineapple = \"pineapple\";\n  foo.blueberry = \"blueberry\";\n  foo.jabuticaba = \"jabuticaba\";\n  foo.miracle = \"miracle\";\n  foo.dragonfruit = \"dragonfruit\";\n  foo.satsuma = \"satsuma\";\n  foo.tamarillo = \"tamarillo\";\n  foo.honeydew = \"honeydew\";\n}\n\nsetFields();\n\nfun printFields() {\n  print foo.apple; // expect: apple\n  print foo.apricot; // expect: apricot\n  print foo.avocado; // expect: avocado\n  print foo.banana; // expect: banana\n  print foo.bilberry; // expect: bilberry\n  print foo.blackberry; // expect: blackberry\n  print foo.blackcurrant; // expect: blackcurrant\n  print foo.blueberry; // expect: blueberry\n  print foo.boysenberry; // expect: boysenberry\n  print foo.cantaloupe; // expect: cantaloupe\n  print foo.cherimoya; // expect: cherimoya\n  print foo.cherry; // expect: cherry\n  print foo.clementine; // expect: clementine\n  print foo.cloudberry; // expect: cloudberry\n  print foo.coconut; // expect: coconut\n  print foo.cranberry; // expect: cranberry\n  print foo.currant; // expect: currant\n  print foo.damson; // expect: damson\n  print foo.date; // expect: date\n  print foo.dragonfruit; // expect: dragonfruit\n  print foo.durian; // expect: durian\n  print foo.elderberry; // expect: elderberry\n  print foo.feijoa; // expect: feijoa\n  print foo.fig; // expect: fig\n  print foo.gooseberry; // expect: gooseberry\n  print foo.grape; // expect: grape\n  print foo.grapefruit; // expect: grapefruit\n  print foo.guava; // expect: guava\n  print foo.honeydew; // expect: honeydew\n  print foo.huckleberry; // expect: huckleberry\n  print foo.jabuticaba; // expect: jabuticaba\n  print foo.jackfruit; // expect: jackfruit\n  print foo.jambul; // expect: jambul\n  print 
foo.jujube; // expect: jujube\n  print foo.juniper; // expect: juniper\n  print foo.kiwifruit; // expect: kiwifruit\n  print foo.kumquat; // expect: kumquat\n  print foo.lemon; // expect: lemon\n  print foo.lime; // expect: lime\n  print foo.longan; // expect: longan\n  print foo.loquat; // expect: loquat\n  print foo.lychee; // expect: lychee\n  print foo.mandarine; // expect: mandarine\n  print foo.mango; // expect: mango\n  print foo.marionberry; // expect: marionberry\n  print foo.melon; // expect: melon\n  print foo.miracle; // expect: miracle\n  print foo.mulberry; // expect: mulberry\n  print foo.nance; // expect: nance\n  print foo.nectarine; // expect: nectarine\n  print foo.olive; // expect: olive\n  print foo.orange; // expect: orange\n  print foo.papaya; // expect: papaya\n  print foo.passionfruit; // expect: passionfruit\n  print foo.peach; // expect: peach\n  print foo.pear; // expect: pear\n  print foo.persimmon; // expect: persimmon\n  print foo.physalis; // expect: physalis\n  print foo.pineapple; // expect: pineapple\n  print foo.plantain; // expect: plantain\n  print foo.plum; // expect: plum\n  print foo.plumcot; // expect: plumcot\n  print foo.pomegranate; // expect: pomegranate\n  print foo.pomelo; // expect: pomelo\n  print foo.quince; // expect: quince\n  print foo.raisin; // expect: raisin\n  print foo.rambutan; // expect: rambutan\n  print foo.raspberry; // expect: raspberry\n  print foo.redcurrant; // expect: redcurrant\n  print foo.salak; // expect: salak\n  print foo.salmonberry; // expect: salmonberry\n  print foo.satsuma; // expect: satsuma\n  print foo.strawberry; // expect: strawberry\n  print foo.tamarillo; // expect: tamarillo\n  print foo.tamarind; // expect: tamarind\n  print foo.tangerine; // expect: tangerine\n  print foo.tomato; // expect: tomato\n  print foo.watermelon; // expect: watermelon\n  print foo.yuzu; // expect: yuzu\n}\n\nprintFields();\n"
  },
  {
    "path": "test/field/method.lox",
    "content": "class Foo {\n  bar(arg) {\n    print arg;\n  }\n}\n\nvar bar = Foo().bar;\nprint \"got method\"; // expect: got method\nbar(\"arg\");          // expect: arg\n"
  },
  {
    "path": "test/field/method_binds_this.lox",
    "content": "class Foo {\n  sayName(a) {\n    print this.name;\n    print a;\n  }\n}\n\nvar foo1 = Foo();\nfoo1.name = \"foo1\";\n\nvar foo2 = Foo();\nfoo2.name = \"foo2\";\n\n// Store the method reference on another object.\nfoo2.fn = foo1.sayName;\n// Still retains original receiver.\nfoo2.fn(1);\n// expect: foo1\n// expect: 1\n"
  },
  {
    "path": "test/field/on_instance.lox",
    "content": "class Foo {}\n\nvar foo = Foo();\n\nprint foo.bar = \"bar value\"; // expect: bar value\nprint foo.baz = \"baz value\"; // expect: baz value\n\nprint foo.bar; // expect: bar value\nprint foo.baz; // expect: baz value\n"
  },
  {
    "path": "test/field/set_evaluation_order.lox",
    "content": "undefined1.bar // expect runtime error: Undefined variable 'undefined1'.\n  = undefined2;"
  },
  {
    "path": "test/field/set_on_bool.lox",
    "content": "true.foo = \"value\"; // expect runtime error: Only instances have fields.\n"
  },
  {
    "path": "test/field/set_on_class.lox",
    "content": "class Foo {}\nFoo.bar = \"value\"; // expect runtime error: Only instances have fields.\n"
  },
  {
    "path": "test/field/set_on_function.lox",
    "content": "fun foo() {}\n\nfoo.bar = \"value\"; // expect runtime error: Only instances have fields.\n"
  },
  {
    "path": "test/field/set_on_nil.lox",
    "content": "nil.foo = \"value\"; // expect runtime error: Only instances have fields.\n"
  },
  {
    "path": "test/field/set_on_num.lox",
    "content": "123.foo = \"value\"; // expect runtime error: Only instances have fields.\n"
  },
  {
    "path": "test/field/set_on_string.lox",
    "content": "\"str\".foo = \"value\"; // expect runtime error: Only instances have fields.\n"
  },
  {
    "path": "test/field/undefined.lox",
    "content": "class Foo {}\nvar foo = Foo();\n\nfoo.bar; // expect runtime error: Undefined property 'bar'.\n"
  },
  {
    "path": "test/for/class_in_body.lox",
    "content": "// [line 2] Error at 'class': Expect expression.\nfor (;;) class Foo {}\n"
  },
  {
    "path": "test/for/closure_in_body.lox",
    "content": "var f1;\nvar f2;\nvar f3;\n\nfor (var i = 1; i < 4; i = i + 1) {\n  var j = i;\n  fun f() {\n    print i;\n    print j;\n  }\n\n  if (j == 1) f1 = f;\n  else if (j == 2) f2 = f;\n  else f3 = f;\n}\n\nf1(); // expect: 4\n      // expect: 1\nf2(); // expect: 4\n      // expect: 2\nf3(); // expect: 4\n      // expect: 3\n"
  },
  {
    "path": "test/for/fun_in_body.lox",
    "content": "// [line 2] Error at 'fun': Expect expression.\nfor (;;) fun foo() {}\n"
  },
  {
    "path": "test/for/return_closure.lox",
    "content": "fun f() {\n  for (;;) {\n    var i = \"i\";\n    fun g() { print i; }\n    return g;\n  }\n}\n\nvar h = f();\nh(); // expect: i\n"
  },
  {
    "path": "test/for/return_inside.lox",
    "content": "fun f() {\n  for (;;) {\n    var i = \"i\";\n    return i;\n  }\n}\n\nprint f();\n// expect: i\n"
  },
  {
    "path": "test/for/scope.lox",
    "content": "{\n  var i = \"before\";\n\n  // New variable is in inner scope.\n  for (var i = 0; i < 1; i = i + 1) {\n    print i; // expect: 0\n\n    // Loop body is in second inner scope.\n    var i = -1;\n    print i; // expect: -1\n  }\n}\n\n{\n  // New variable shadows outer variable.\n  for (var i = 0; i > 0; i = i + 1) {}\n\n  // Goes out of scope after loop.\n  var i = \"after\";\n  print i; // expect: after\n\n  // Can reuse an existing variable.\n  for (i = 0; i < 1; i = i + 1) {\n    print i; // expect: 0\n  }\n}\n"
  },
  {
    "path": "test/for/statement_condition.lox",
    "content": "// [line 3] Error at '{': Expect expression.\n// [line 3] Error at ')': Expect ';' after expression.\nfor (var a = 1; {}; a = a + 1) {}\n"
  },
  {
    "path": "test/for/statement_increment.lox",
    "content": "// [line 2] Error at '{': Expect expression.\nfor (var a = 1; a < 2; {}) {}\n"
  },
  {
    "path": "test/for/statement_initializer.lox",
    "content": "// [line 3] Error at '{': Expect expression.\n// [line 3] Error at ')': Expect ';' after expression.\nfor ({}; a < 2; a = a + 1) {}\n"
  },
  {
    "path": "test/for/syntax.lox",
    "content": "// Single-expression body.\nfor (var c = 0; c < 3;) print c = c + 1;\n// expect: 1\n// expect: 2\n// expect: 3\n\n// Block body.\nfor (var a = 0; a < 3; a = a + 1) {\n  print a;\n}\n// expect: 0\n// expect: 1\n// expect: 2\n\n// No clauses.\nfun foo() {\n  for (;;) return \"done\";\n}\nprint foo(); // expect: done\n\n// No variable.\nvar i = 0;\nfor (; i < 2; i = i + 1) print i;\n// expect: 0\n// expect: 1\n\n// No condition.\nfun bar() {\n  for (var i = 0;; i = i + 1) {\n    print i;\n    if (i >= 2) return;\n  }\n}\nbar();\n// expect: 0\n// expect: 1\n// expect: 2\n\n// No increment.\nfor (var i = 0; i < 2;) {\n  print i;\n  i = i + 1;\n}\n// expect: 0\n// expect: 1\n\n// Statement bodies.\nfor (; false;) if (true) 1; else 2;\nfor (; false;) while (true) 1;\nfor (; false;) for (;;) 1;\n"
  },
  {
    "path": "test/for/var_in_body.lox",
    "content": "// [line 2] Error at 'var': Expect expression.\nfor (;;) var foo;\n"
  },
  {
    "path": "test/function/body_must_be_block.lox",
    "content": "// [line 3] Error at '123': Expect '{' before function body.\n// [c line 4] Error at end: Expect '}' after block.\nfun f() 123;\n"
  },
  {
    "path": "test/function/empty_body.lox",
    "content": "fun f() {}\nprint f(); // expect: nil\n"
  },
  {
    "path": "test/function/extra_arguments.lox",
    "content": "fun f(a, b) {\n  print a;\n  print b;\n}\n\nf(1, 2, 3, 4); // expect runtime error: Expected 2 arguments but got 4.\n"
  },
  {
    "path": "test/function/local_mutual_recursion.lox",
    "content": "{\n  fun isEven(n) {\n    if (n == 0) return true;\n    return isOdd(n - 1); // expect runtime error: Undefined variable 'isOdd'.\n  }\n\n  fun isOdd(n) {\n    if (n == 0) return false;\n    return isEven(n - 1);\n  }\n\n  isEven(4);\n}"
  },
  {
    "path": "test/function/local_recursion.lox",
    "content": "{\n  fun fib(n) {\n    if (n < 2) return n;\n    return fib(n - 1) + fib(n - 2);\n  }\n\n  print fib(8); // expect: 21\n}\n"
  },
  {
    "path": "test/function/missing_arguments.lox",
    "content": "fun f(a, b) {}\n\nf(1); // expect runtime error: Expected 2 arguments but got 1.\n"
  },
  {
    "path": "test/function/missing_comma_in_parameters.lox",
    "content": "// [line 3] Error at 'c': Expect ')' after parameters.\n// [c line 4] Error at end: Expect '}' after block.\nfun foo(a, b c, d, e, f) {}\n"
  },
  {
    "path": "test/function/mutual_recursion.lox",
    "content": "fun isEven(n) {\n  if (n == 0) return true;\n  return isOdd(n - 1);\n}\n\nfun isOdd(n) {\n  if (n == 0) return false;\n  return isEven(n - 1);\n}\n\nprint isEven(4); // expect: true\nprint isOdd(3); // expect: true\n"
  },
  {
    "path": "test/function/nested_call_with_arguments.lox",
    "content": "fun returnArg(arg) {\n  return arg;\n}\n\nfun returnFunCallWithArg(func, arg) {\n  return returnArg(func)(arg);\n}\n\nfun printArg(arg) {\n  print arg;\n}\n\nreturnFunCallWithArg(printArg, \"hello world\"); // expect: hello world\n"
  },
  {
    "path": "test/function/parameters.lox",
    "content": "fun f0() { return 0; }\nprint f0(); // expect: 0\n\nfun f1(a) { return a; }\nprint f1(1); // expect: 1\n\nfun f2(a, b) { return a + b; }\nprint f2(1, 2); // expect: 3\n\nfun f3(a, b, c) { return a + b + c; }\nprint f3(1, 2, 3); // expect: 6\n\nfun f4(a, b, c, d) { return a + b + c + d; }\nprint f4(1, 2, 3, 4); // expect: 10\n\nfun f5(a, b, c, d, e) { return a + b + c + d + e; }\nprint f5(1, 2, 3, 4, 5); // expect: 15\n\nfun f6(a, b, c, d, e, f) { return a + b + c + d + e + f; }\nprint f6(1, 2, 3, 4, 5, 6); // expect: 21\n\nfun f7(a, b, c, d, e, f, g) { return a + b + c + d + e + f + g; }\nprint f7(1, 2, 3, 4, 5, 6, 7); // expect: 28\n\nfun f8(a, b, c, d, e, f, g, h) { return a + b + c + d + e + f + g + h; }\nprint f8(1, 2, 3, 4, 5, 6, 7, 8); // expect: 36\n"
  },
  {
    "path": "test/function/print.lox",
    "content": "fun foo() {}\nprint foo; // expect: <fn foo>\n\nprint clock; // expect: <native fn>\n"
  },
  {
    "path": "test/function/recursion.lox",
    "content": "fun fib(n) {\n  if (n < 2) return n;\n  return fib(n - 1) + fib(n - 2);\n}\n\nprint fib(8); // expect: 21\n"
  },
  {
    "path": "test/function/too_many_arguments.lox",
    "content": "fun foo() {}\n{\n  var a = 1;\n  foo(\n     a, // 1\n     a, // 2\n     a, // 3\n     a, // 4\n     a, // 5\n     a, // 6\n     a, // 7\n     a, // 8\n     a, // 9\n     a, // 10\n     a, // 11\n     a, // 12\n     a, // 13\n     a, // 14\n     a, // 15\n     a, // 16\n     a, // 17\n     a, // 18\n     a, // 19\n     a, // 20\n     a, // 21\n     a, // 22\n     a, // 23\n     a, // 24\n     a, // 25\n     a, // 26\n     a, // 27\n     a, // 28\n     a, // 29\n     a, // 30\n     a, // 31\n     a, // 32\n     a, // 33\n     a, // 34\n     a, // 35\n     a, // 36\n     a, // 37\n     a, // 38\n     a, // 39\n     a, // 40\n     a, // 41\n     a, // 42\n     a, // 43\n     a, // 44\n     a, // 45\n     a, // 46\n     a, // 47\n     a, // 48\n     a, // 49\n     a, // 50\n     a, // 51\n     a, // 52\n     a, // 53\n     a, // 54\n     a, // 55\n     a, // 56\n     a, // 57\n     a, // 58\n     a, // 59\n     a, // 60\n     a, // 61\n     a, // 62\n     a, // 63\n     a, // 64\n     a, // 65\n     a, // 66\n     a, // 67\n     a, // 68\n     a, // 69\n     a, // 70\n     a, // 71\n     a, // 72\n     a, // 73\n     a, // 74\n     a, // 75\n     a, // 76\n     a, // 77\n     a, // 78\n     a, // 79\n     a, // 80\n     a, // 81\n     a, // 82\n     a, // 83\n     a, // 84\n     a, // 85\n     a, // 86\n     a, // 87\n     a, // 88\n     a, // 89\n     a, // 90\n     a, // 91\n     a, // 92\n     a, // 93\n     a, // 94\n     a, // 95\n     a, // 96\n     a, // 97\n     a, // 98\n     a, // 99\n     a, // 100\n     a, // 101\n     a, // 102\n     a, // 103\n     a, // 104\n     a, // 105\n     a, // 106\n     a, // 107\n     a, // 108\n     a, // 109\n     a, // 110\n     a, // 111\n     a, // 112\n     a, // 113\n     a, // 114\n     a, // 115\n     a, // 116\n     a, // 117\n     a, // 118\n     a, // 119\n     a, // 120\n     a, // 121\n     a, // 122\n     a, // 123\n     a, // 124\n     a, // 125\n     a, // 126\n     a, // 127\n     a, // 128\n     
a, // 129\n     a, // 130\n     a, // 131\n     a, // 132\n     a, // 133\n     a, // 134\n     a, // 135\n     a, // 136\n     a, // 137\n     a, // 138\n     a, // 139\n     a, // 140\n     a, // 141\n     a, // 142\n     a, // 143\n     a, // 144\n     a, // 145\n     a, // 146\n     a, // 147\n     a, // 148\n     a, // 149\n     a, // 150\n     a, // 151\n     a, // 152\n     a, // 153\n     a, // 154\n     a, // 155\n     a, // 156\n     a, // 157\n     a, // 158\n     a, // 159\n     a, // 160\n     a, // 161\n     a, // 162\n     a, // 163\n     a, // 164\n     a, // 165\n     a, // 166\n     a, // 167\n     a, // 168\n     a, // 169\n     a, // 170\n     a, // 171\n     a, // 172\n     a, // 173\n     a, // 174\n     a, // 175\n     a, // 176\n     a, // 177\n     a, // 178\n     a, // 179\n     a, // 180\n     a, // 181\n     a, // 182\n     a, // 183\n     a, // 184\n     a, // 185\n     a, // 186\n     a, // 187\n     a, // 188\n     a, // 189\n     a, // 190\n     a, // 191\n     a, // 192\n     a, // 193\n     a, // 194\n     a, // 195\n     a, // 196\n     a, // 197\n     a, // 198\n     a, // 199\n     a, // 200\n     a, // 201\n     a, // 202\n     a, // 203\n     a, // 204\n     a, // 205\n     a, // 206\n     a, // 207\n     a, // 208\n     a, // 209\n     a, // 210\n     a, // 211\n     a, // 212\n     a, // 213\n     a, // 214\n     a, // 215\n     a, // 216\n     a, // 217\n     a, // 218\n     a, // 219\n     a, // 220\n     a, // 221\n     a, // 222\n     a, // 223\n     a, // 224\n     a, // 225\n     a, // 226\n     a, // 227\n     a, // 228\n     a, // 229\n     a, // 230\n     a, // 231\n     a, // 232\n     a, // 233\n     a, // 234\n     a, // 235\n     a, // 236\n     a, // 237\n     a, // 238\n     a, // 239\n     a, // 240\n     a, // 241\n     a, // 242\n     a, // 243\n     a, // 244\n     a, // 245\n     a, // 246\n     a, // 247\n     a, // 248\n     a, // 249\n     a, // 250\n     a, // 251\n     a, // 252\n     a, // 253\n     
a, // 254\n     a, // 255\n     a); // Error at 'a': Can't have more than 255 arguments.\n}\n"
  },
  {
    "path": "test/function/too_many_parameters.lox",
    "content": "// 256 parameters.\nfun f(\n    a1,\n    a2,\n    a3,\n    a4,\n    a5,\n    a6,\n    a7,\n    a8,\n    a9,\n    a10,\n    a11,\n    a12,\n    a13,\n    a14,\n    a15,\n    a16,\n    a17,\n    a18,\n    a19,\n    a20,\n    a21,\n    a22,\n    a23,\n    a24,\n    a25,\n    a26,\n    a27,\n    a28,\n    a29,\n    a30,\n    a31,\n    a32,\n    a33,\n    a34,\n    a35,\n    a36,\n    a37,\n    a38,\n    a39,\n    a40,\n    a41,\n    a42,\n    a43,\n    a44,\n    a45,\n    a46,\n    a47,\n    a48,\n    a49,\n    a50,\n    a51,\n    a52,\n    a53,\n    a54,\n    a55,\n    a56,\n    a57,\n    a58,\n    a59,\n    a60,\n    a61,\n    a62,\n    a63,\n    a64,\n    a65,\n    a66,\n    a67,\n    a68,\n    a69,\n    a70,\n    a71,\n    a72,\n    a73,\n    a74,\n    a75,\n    a76,\n    a77,\n    a78,\n    a79,\n    a80,\n    a81,\n    a82,\n    a83,\n    a84,\n    a85,\n    a86,\n    a87,\n    a88,\n    a89,\n    a90,\n    a91,\n    a92,\n    a93,\n    a94,\n    a95,\n    a96,\n    a97,\n    a98,\n    a99,\n    a100,\n    a101,\n    a102,\n    a103,\n    a104,\n    a105,\n    a106,\n    a107,\n    a108,\n    a109,\n    a110,\n    a111,\n    a112,\n    a113,\n    a114,\n    a115,\n    a116,\n    a117,\n    a118,\n    a119,\n    a120,\n    a121,\n    a122,\n    a123,\n    a124,\n    a125,\n    a126,\n    a127,\n    a128,\n    a129,\n    a130,\n    a131,\n    a132,\n    a133,\n    a134,\n    a135,\n    a136,\n    a137,\n    a138,\n    a139,\n    a140,\n    a141,\n    a142,\n    a143,\n    a144,\n    a145,\n    a146,\n    a147,\n    a148,\n    a149,\n    a150,\n    a151,\n    a152,\n    a153,\n    a154,\n    a155,\n    a156,\n    a157,\n    a158,\n    a159,\n    a160,\n    a161,\n    a162,\n    a163,\n    a164,\n    a165,\n    a166,\n    a167,\n    a168,\n    a169,\n    a170,\n    a171,\n    a172,\n    a173,\n    a174,\n    a175,\n    a176,\n    a177,\n    a178,\n    a179,\n    a180,\n    a181,\n    a182,\n    a183,\n    a184,\n    a185,\n    a186,\n    a187,\n    
a188,\n    a189,\n    a190,\n    a191,\n    a192,\n    a193,\n    a194,\n    a195,\n    a196,\n    a197,\n    a198,\n    a199,\n    a200,\n    a201,\n    a202,\n    a203,\n    a204,\n    a205,\n    a206,\n    a207,\n    a208,\n    a209,\n    a210,\n    a211,\n    a212,\n    a213,\n    a214,\n    a215,\n    a216,\n    a217,\n    a218,\n    a219,\n    a220,\n    a221,\n    a222,\n    a223,\n    a224,\n    a225,\n    a226,\n    a227,\n    a228,\n    a229,\n    a230,\n    a231,\n    a232,\n    a233,\n    a234,\n    a235,\n    a236,\n    a237,\n    a238,\n    a239,\n    a240,\n    a241,\n    a242,\n    a243,\n    a244,\n    a245,\n    a246,\n    a247,\n    a248,\n    a249,\n    a250,\n    a251,\n    a252,\n    a253,\n    a254,\n    a255, a) {} // Error at 'a': Can't have more than 255 parameters.\n"
  },
  {
    "path": "test/if/class_in_else.lox",
    "content": "// [line 2] Error at 'class': Expect expression.\nif (true) \"ok\"; else class Foo {}\n"
  },
  {
    "path": "test/if/class_in_then.lox",
    "content": "// [line 2] Error at 'class': Expect expression.\nif (true) class Foo {}\n"
  },
  {
    "path": "test/if/dangling_else.lox",
    "content": "// A dangling else binds to the right-most if.\nif (true) if (false) print \"bad\"; else print \"good\"; // expect: good\nif (false) if (true) print \"bad\"; else print \"bad\";\n"
  },
  {
    "path": "test/if/else.lox",
    "content": "// Evaluate the 'else' expression if the condition is false.\nif (true) print \"good\"; else print \"bad\"; // expect: good\nif (false) print \"bad\"; else print \"good\"; // expect: good\n\n// Allow block body.\nif (false) nil; else { print \"block\"; } // expect: block\n"
  },
  {
    "path": "test/if/fun_in_else.lox",
    "content": "// [line 2] Error at 'fun': Expect expression.\nif (true) \"ok\"; else fun foo() {}\n"
  },
  {
    "path": "test/if/fun_in_then.lox",
    "content": "// [line 2] Error at 'fun': Expect expression.\nif (true) fun foo() {}\n"
  },
  {
    "path": "test/if/if.lox",
    "content": "// Evaluate the 'then' expression if the condition is true.\nif (true) print \"good\"; // expect: good\nif (false) print \"bad\";\n\n// Allow block body.\nif (true) { print \"block\"; } // expect: block\n\n// Assignment in if condition.\nvar a = false;\nif (a = true) print a; // expect: true\n"
  },
  {
    "path": "test/if/truth.lox",
    "content": "// False and nil are false.\nif (false) print \"bad\"; else print \"false\"; // expect: false\nif (nil) print \"bad\"; else print \"nil\"; // expect: nil\n\n// Everything else is true.\nif (true) print true; // expect: true\nif (0) print 0; // expect: 0\nif (\"\") print \"empty\"; // expect: empty\n"
  },
  {
    "path": "test/if/var_in_else.lox",
    "content": "// [line 2] Error at 'var': Expect expression.\nif (true) \"ok\"; else var foo;\n"
  },
  {
    "path": "test/if/var_in_then.lox",
    "content": "// [line 2] Error at 'var': Expect expression.\nif (true) var foo;\n"
  },
  {
    "path": "test/inheritance/constructor.lox",
    "content": "class A {\n  init(param) {\n    this.field = param;\n  }\n\n  test() {\n    print this.field;\n  }\n}\n\nclass B < A {}\n\nvar b = B(\"value\");\nb.test(); // expect: value\n"
  },
  {
    "path": "test/inheritance/inherit_from_function.lox",
    "content": "fun foo() {}\n\nclass Subclass < foo {} // expect runtime error: Superclass must be a class.\n"
  },
  {
    "path": "test/inheritance/inherit_from_nil.lox",
    "content": "var Nil = nil;\nclass Foo < Nil {} // expect runtime error: Superclass must be a class.\n"
  },
  {
    "path": "test/inheritance/inherit_from_number.lox",
    "content": "var Number = 123;\nclass Foo < Number {} // expect runtime error: Superclass must be a class.\n"
  },
  {
    "path": "test/inheritance/inherit_methods.lox",
    "content": "class Foo {\n  methodOnFoo() { print \"foo\"; }\n  override() { print \"foo\"; }\n}\n\nclass Bar < Foo {\n  methodOnBar() { print \"bar\"; }\n  override() { print \"bar\"; }\n}\n\nvar bar = Bar();\nbar.methodOnFoo(); // expect: foo\nbar.methodOnBar(); // expect: bar\nbar.override(); // expect: bar\n"
  },
  {
    "path": "test/inheritance/parenthesized_superclass.lox",
    "content": "class Foo {}\n\n// [line 4] Error at '(': Expect superclass name.\nclass Bar < (Foo) {}\n"
  },
  {
    "path": "test/inheritance/set_fields_from_base_class.lox",
    "content": "class Foo {\n  foo(a, b) {\n    this.field1 = a;\n    this.field2 = b;\n  }\n\n  fooPrint() {\n    print this.field1;\n    print this.field2;\n  }\n}\n\nclass Bar < Foo {\n  bar(a, b) {\n    this.field1 = a;\n    this.field2 = b;\n  }\n\n  barPrint() {\n    print this.field1;\n    print this.field2;\n  }\n}\n\nvar bar = Bar();\nbar.foo(\"foo 1\", \"foo 2\");\nbar.fooPrint();\n// expect: foo 1\n// expect: foo 2\n\nbar.bar(\"bar 1\", \"bar 2\");\nbar.barPrint();\n// expect: bar 1\n// expect: bar 2\n\nbar.fooPrint();\n// expect: bar 1\n// expect: bar 2\n"
  },
  {
    "path": "test/limit/loop_too_large.lox",
    "content": "var a = 0;\nwhile (false) {\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; 
nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n  nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil; nil;\n} // Error at '}': Loop body too large.\n"
  },
  {
    "path": "test/limit/no_reuse_constants.lox",
    "content": "fun f() {\n  0; 1; 2; 3; 4; 5; 6; 7;\n  8; 9; 10; 11; 12; 13; 14; 15;\n  16; 17; 18; 19; 20; 21; 22; 23;\n  24; 25; 26; 27; 28; 29; 30; 31;\n  32; 33; 34; 35; 36; 37; 38; 39;\n  40; 41; 42; 43; 44; 45; 46; 47;\n  48; 49; 50; 51; 52; 53; 54; 55;\n  56; 57; 58; 59; 60; 61; 62; 63;\n  64; 65; 66; 67; 68; 69; 70; 71;\n  72; 73; 74; 75; 76; 77; 78; 79;\n  80; 81; 82; 83; 84; 85; 86; 87;\n  88; 89; 90; 91; 92; 93; 94; 95;\n  96; 97; 98; 99; 100; 101; 102; 103;\n  104; 105; 106; 107; 108; 109; 110; 111;\n  112; 113; 114; 115; 116; 117; 118; 119;\n  120; 121; 122; 123; 124; 125; 126; 127;\n  128; 129; 130; 131; 132; 133; 134; 135;\n  136; 137; 138; 139; 140; 141; 142; 143;\n  144; 145; 146; 147; 148; 149; 150; 151;\n  152; 153; 154; 155; 156; 157; 158; 159;\n  160; 161; 162; 163; 164; 165; 166; 167;\n  168; 169; 170; 171; 172; 173; 174; 175;\n  176; 177; 178; 179; 180; 181; 182; 183;\n  184; 185; 186; 187; 188; 189; 190; 191;\n  192; 193; 194; 195; 196; 197; 198; 199;\n  200; 201; 202; 203; 204; 205; 206; 207;\n  208; 209; 210; 211; 212; 213; 214; 215;\n  216; 217; 218; 219; 220; 221; 222; 223;\n  224; 225; 226; 227; 228; 229; 230; 231;\n  232; 233; 234; 235; 236; 237; 238; 239;\n  240; 241; 242; 243; 244; 245; 246; 247;\n  248; 249; 250; 251; 252; 253; 254; 255;\n\n  1; // Error at '1': Too many constants in one chunk.\n}\n"
  },
  {
    "path": "test/limit/stack_overflow.lox",
    "content": "fun foo() {\n  var a1;\n  var a2;\n  var a3;\n  var a4;\n  var a5;\n  var a6;\n  var a7;\n  var a8;\n  var a9;\n  var a10;\n  var a11;\n  var a12;\n  var a13;\n  var a14;\n  var a15;\n  var a16;\n  foo(); // expect runtime error: Stack overflow.\n}\n\nfoo();\n"
  },
  {
    "path": "test/limit/too_many_constants.lox",
    "content": "fun f() {\n  0; 1; 2; 3; 4; 5; 6; 7;\n  8; 9; 10; 11; 12; 13; 14; 15;\n  16; 17; 18; 19; 20; 21; 22; 23;\n  24; 25; 26; 27; 28; 29; 30; 31;\n  32; 33; 34; 35; 36; 37; 38; 39;\n  40; 41; 42; 43; 44; 45; 46; 47;\n  48; 49; 50; 51; 52; 53; 54; 55;\n  56; 57; 58; 59; 60; 61; 62; 63;\n  64; 65; 66; 67; 68; 69; 70; 71;\n  72; 73; 74; 75; 76; 77; 78; 79;\n  80; 81; 82; 83; 84; 85; 86; 87;\n  88; 89; 90; 91; 92; 93; 94; 95;\n  96; 97; 98; 99; 100; 101; 102; 103;\n  104; 105; 106; 107; 108; 109; 110; 111;\n  112; 113; 114; 115; 116; 117; 118; 119;\n  120; 121; 122; 123; 124; 125; 126; 127;\n  128; 129; 130; 131; 132; 133; 134; 135;\n  136; 137; 138; 139; 140; 141; 142; 143;\n  144; 145; 146; 147; 148; 149; 150; 151;\n  152; 153; 154; 155; 156; 157; 158; 159;\n  160; 161; 162; 163; 164; 165; 166; 167;\n  168; 169; 170; 171; 172; 173; 174; 175;\n  176; 177; 178; 179; 180; 181; 182; 183;\n  184; 185; 186; 187; 188; 189; 190; 191;\n  192; 193; 194; 195; 196; 197; 198; 199;\n  200; 201; 202; 203; 204; 205; 206; 207;\n  208; 209; 210; 211; 212; 213; 214; 215;\n  216; 217; 218; 219; 220; 221; 222; 223;\n  224; 225; 226; 227; 228; 229; 230; 231;\n  232; 233; 234; 235; 236; 237; 238; 239;\n  240; 241; 242; 243; 244; 245; 246; 247;\n  248; 249; 250; 251; 252; 253; 254; 255;\n\n  \"oops\"; // Error at '\"oops\"': Too many constants in one chunk.\n}\n"
  },
  {
    "path": "test/limit/too_many_locals.lox",
    "content": "fun f() {\n  // var v00; First slot already taken.\n\n  var v01; var v02; var v03; var v04; var v05; var v06; var v07;\n  var v08; var v09; var v0a; var v0b; var v0c; var v0d; var v0e; var v0f;\n\n  var v10; var v11; var v12; var v13; var v14; var v15; var v16; var v17;\n  var v18; var v19; var v1a; var v1b; var v1c; var v1d; var v1e; var v1f;\n\n  var v20; var v21; var v22; var v23; var v24; var v25; var v26; var v27;\n  var v28; var v29; var v2a; var v2b; var v2c; var v2d; var v2e; var v2f;\n\n  var v30; var v31; var v32; var v33; var v34; var v35; var v36; var v37;\n  var v38; var v39; var v3a; var v3b; var v3c; var v3d; var v3e; var v3f;\n\n  var v40; var v41; var v42; var v43; var v44; var v45; var v46; var v47;\n  var v48; var v49; var v4a; var v4b; var v4c; var v4d; var v4e; var v4f;\n\n  var v50; var v51; var v52; var v53; var v54; var v55; var v56; var v57;\n  var v58; var v59; var v5a; var v5b; var v5c; var v5d; var v5e; var v5f;\n\n  var v60; var v61; var v62; var v63; var v64; var v65; var v66; var v67;\n  var v68; var v69; var v6a; var v6b; var v6c; var v6d; var v6e; var v6f;\n\n  var v70; var v71; var v72; var v73; var v74; var v75; var v76; var v77;\n  var v78; var v79; var v7a; var v7b; var v7c; var v7d; var v7e; var v7f;\n\n  var v80; var v81; var v82; var v83; var v84; var v85; var v86; var v87;\n  var v88; var v89; var v8a; var v8b; var v8c; var v8d; var v8e; var v8f;\n\n  var v90; var v91; var v92; var v93; var v94; var v95; var v96; var v97;\n  var v98; var v99; var v9a; var v9b; var v9c; var v9d; var v9e; var v9f;\n\n  var va0; var va1; var va2; var va3; var va4; var va5; var va6; var va7;\n  var va8; var va9; var vaa; var vab; var vac; var vad; var vae; var vaf;\n\n  var vb0; var vb1; var vb2; var vb3; var vb4; var vb5; var vb6; var vb7;\n  var vb8; var vb9; var vba; var vbb; var vbc; var vbd; var vbe; var vbf;\n\n  var vc0; var vc1; var vc2; var vc3; var vc4; var vc5; var vc6; var vc7;\n  var vc8; var vc9; var vca; var vcb; 
var vcc; var vcd; var vce; var vcf;\n\n  var vd0; var vd1; var vd2; var vd3; var vd4; var vd5; var vd6; var vd7;\n  var vd8; var vd9; var vda; var vdb; var vdc; var vdd; var vde; var vdf;\n\n  var ve0; var ve1; var ve2; var ve3; var ve4; var ve5; var ve6; var ve7;\n  var ve8; var ve9; var vea; var veb; var vec; var ved; var vee; var vef;\n\n  var vf0; var vf1; var vf2; var vf3; var vf4; var vf5; var vf6; var vf7;\n  var vf8; var vf9; var vfa; var vfb; var vfc; var vfd; var vfe; var vff;\n\n  var oops; // Error at 'oops': Too many local variables in function.\n}\n"
  },
  {
    "path": "test/limit/too_many_upvalues.lox",
    "content": "fun f() {\n  var v00; var v01; var v02; var v03; var v04; var v05; var v06; var v07;\n  var v08; var v09; var v0a; var v0b; var v0c; var v0d; var v0e; var v0f;\n\n  var v10; var v11; var v12; var v13; var v14; var v15; var v16; var v17;\n  var v18; var v19; var v1a; var v1b; var v1c; var v1d; var v1e; var v1f;\n\n  var v20; var v21; var v22; var v23; var v24; var v25; var v26; var v27;\n  var v28; var v29; var v2a; var v2b; var v2c; var v2d; var v2e; var v2f;\n\n  var v30; var v31; var v32; var v33; var v34; var v35; var v36; var v37;\n  var v38; var v39; var v3a; var v3b; var v3c; var v3d; var v3e; var v3f;\n\n  var v40; var v41; var v42; var v43; var v44; var v45; var v46; var v47;\n  var v48; var v49; var v4a; var v4b; var v4c; var v4d; var v4e; var v4f;\n\n  var v50; var v51; var v52; var v53; var v54; var v55; var v56; var v57;\n  var v58; var v59; var v5a; var v5b; var v5c; var v5d; var v5e; var v5f;\n\n  var v60; var v61; var v62; var v63; var v64; var v65; var v66; var v67;\n  var v68; var v69; var v6a; var v6b; var v6c; var v6d; var v6e; var v6f;\n\n  var v70; var v71; var v72; var v73; var v74; var v75; var v76; var v77;\n  var v78; var v79; var v7a; var v7b; var v7c; var v7d; var v7e; var v7f;\n\n  fun g() {\n    var v80; var v81; var v82; var v83; var v84; var v85; var v86; var v87;\n    var v88; var v89; var v8a; var v8b; var v8c; var v8d; var v8e; var v8f;\n\n    var v90; var v91; var v92; var v93; var v94; var v95; var v96; var v97;\n    var v98; var v99; var v9a; var v9b; var v9c; var v9d; var v9e; var v9f;\n\n    var va0; var va1; var va2; var va3; var va4; var va5; var va6; var va7;\n    var va8; var va9; var vaa; var vab; var vac; var vad; var vae; var vaf;\n\n    var vb0; var vb1; var vb2; var vb3; var vb4; var vb5; var vb6; var vb7;\n    var vb8; var vb9; var vba; var vbb; var vbc; var vbd; var vbe; var vbf;\n\n    var vc0; var vc1; var vc2; var vc3; var vc4; var vc5; var vc6; var vc7;\n    var vc8; var vc9; var vca; var vcb; 
var vcc; var vcd; var vce; var vcf;\n\n    var vd0; var vd1; var vd2; var vd3; var vd4; var vd5; var vd6; var vd7;\n    var vd8; var vd9; var vda; var vdb; var vdc; var vdd; var vde; var vdf;\n\n    var ve0; var ve1; var ve2; var ve3; var ve4; var ve5; var ve6; var ve7;\n    var ve8; var ve9; var vea; var veb; var vec; var ved; var vee; var vef;\n\n    var vf0; var vf1; var vf2; var vf3; var vf4; var vf5; var vf6; var vf7;\n    var vf8; var vf9; var vfa; var vfb; var vfc; var vfd; var vfe; var vff;\n\n    var oops;\n\n    fun h() {\n      v00; v01; v02; v03; v04; v05; v06; v07;\n      v08; v09; v0a; v0b; v0c; v0d; v0e; v0f;\n\n      v10; v11; v12; v13; v14; v15; v16; v17;\n      v18; v19; v1a; v1b; v1c; v1d; v1e; v1f;\n\n      v20; v21; v22; v23; v24; v25; v26; v27;\n      v28; v29; v2a; v2b; v2c; v2d; v2e; v2f;\n\n      v30; v31; v32; v33; v34; v35; v36; v37;\n      v38; v39; v3a; v3b; v3c; v3d; v3e; v3f;\n\n      v40; v41; v42; v43; v44; v45; v46; v47;\n      v48; v49; v4a; v4b; v4c; v4d; v4e; v4f;\n\n      v50; v51; v52; v53; v54; v55; v56; v57;\n      v58; v59; v5a; v5b; v5c; v5d; v5e; v5f;\n\n      v60; v61; v62; v63; v64; v65; v66; v67;\n      v68; v69; v6a; v6b; v6c; v6d; v6e; v6f;\n\n      v70; v71; v72; v73; v74; v75; v76; v77;\n      v78; v79; v7a; v7b; v7c; v7d; v7e; v7f;\n\n      v80; v81; v82; v83; v84; v85; v86; v87;\n      v88; v89; v8a; v8b; v8c; v8d; v8e; v8f;\n\n      v90; v91; v92; v93; v94; v95; v96; v97;\n      v98; v99; v9a; v9b; v9c; v9d; v9e; v9f;\n\n      va0; va1; va2; va3; va4; va5; va6; va7;\n      va8; va9; vaa; vab; vac; vad; vae; vaf;\n\n      vb0; vb1; vb2; vb3; vb4; vb5; vb6; vb7;\n      vb8; vb9; vba; vbb; vbc; vbd; vbe; vbf;\n\n      vc0; vc1; vc2; vc3; vc4; vc5; vc6; vc7;\n      vc8; vc9; vca; vcb; vcc; vcd; vce; vcf;\n\n      vd0; vd1; vd2; vd3; vd4; vd5; vd6; vd7;\n      vd8; vd9; vda; vdb; vdc; vdd; vde; vdf;\n\n      ve0; ve1; ve2; ve3; ve4; ve5; ve6; ve7;\n      ve8; ve9; vea; veb; vec; ved; vee; vef;\n\n      vf0; vf1; vf2; 
vf3; vf4; vf5; vf6; vf7;\n      vf8; vf9; vfa; vfb; vfc; vfd; vfe; vff;\n\n      oops; // Error at 'oops': Too many closure variables in function.\n    }\n  }\n}\n"
  },
  {
    "path": "test/logical_operator/and.lox",
    "content": "// Note: These tests implicitly depend on ints being truthy.\n\n// Return the first non-true argument.\nprint false and 1; // expect: false\nprint true and 1; // expect: 1\nprint 1 and 2 and false; // expect: false\n\n// Return the last argument if all are true.\nprint 1 and true; // expect: true\nprint 1 and 2 and 3; // expect: 3\n\n// Short-circuit at the first false argument.\nvar a = \"before\";\nvar b = \"before\";\n(a = true) and\n    (b = false) and\n    (a = \"bad\");\nprint a; // expect: true\nprint b; // expect: false\n"
  },
  {
    "path": "test/logical_operator/and_truth.lox",
    "content": "// False and nil are false.\nprint false and \"bad\"; // expect: false\nprint nil and \"bad\"; // expect: nil\n\n// Everything else is true.\nprint true and \"ok\"; // expect: ok\nprint 0 and \"ok\"; // expect: ok\nprint \"\" and \"ok\"; // expect: ok\n"
  },
  {
    "path": "test/logical_operator/or.lox",
    "content": "// Note: These tests implicitly depend on ints being truthy.\n\n// Return the first true argument.\nprint 1 or true; // expect: 1\nprint false or 1; // expect: 1\nprint false or false or true; // expect: true\n\n// Return the last argument if all are false.\nprint false or false; // expect: false\nprint false or false or false; // expect: false\n\n// Short-circuit at the first true argument.\nvar a = \"before\";\nvar b = \"before\";\n(a = false) or\n    (b = true) or\n    (a = \"bad\");\nprint a; // expect: false\nprint b; // expect: true\n"
  },
  {
    "path": "test/logical_operator/or_truth.lox",
    "content": "// False and nil are false.\nprint false or \"ok\"; // expect: ok\nprint nil or \"ok\"; // expect: ok\n\n// Everything else is true.\nprint true or \"ok\"; // expect: true\nprint 0 or \"ok\"; // expect: 0\nprint \"s\" or \"ok\"; // expect: s\n"
  },
  {
    "path": "test/method/arity.lox",
    "content": "class Foo {\n  method0() { return \"no args\"; }\n  method1(a) { return a; }\n  method2(a, b) { return a + b; }\n  method3(a, b, c) { return a + b + c; }\n  method4(a, b, c, d) { return a + b + c + d; }\n  method5(a, b, c, d, e) { return a + b + c + d + e; }\n  method6(a, b, c, d, e, f) { return a + b + c + d + e + f; }\n  method7(a, b, c, d, e, f, g) { return a + b + c + d + e + f + g; }\n  method8(a, b, c, d, e, f, g, h) { return a + b + c + d + e + f + g + h; }\n}\n\nvar foo = Foo();\nprint foo.method0(); // expect: no args\nprint foo.method1(1); // expect: 1\nprint foo.method2(1, 2); // expect: 3\nprint foo.method3(1, 2, 3); // expect: 6\nprint foo.method4(1, 2, 3, 4); // expect: 10\nprint foo.method5(1, 2, 3, 4, 5); // expect: 15\nprint foo.method6(1, 2, 3, 4, 5, 6); // expect: 21\nprint foo.method7(1, 2, 3, 4, 5, 6, 7); // expect: 28\nprint foo.method8(1, 2, 3, 4, 5, 6, 7, 8); // expect: 36\n"
  },
  {
    "path": "test/method/empty_block.lox",
    "content": "class Foo {\n  bar() {}\n}\n\nprint Foo().bar(); // expect: nil\n"
  },
  {
    "path": "test/method/extra_arguments.lox",
    "content": "class Foo {\n  method(a, b) {\n    print a;\n    print b;\n  }\n}\n\nFoo().method(1, 2, 3, 4); // expect runtime error: Expected 2 arguments but got 4.\n"
  },
  {
    "path": "test/method/missing_arguments.lox",
    "content": "class Foo {\n  method(a, b) {}\n}\n\nFoo().method(1); // expect runtime error: Expected 2 arguments but got 1.\n"
  },
  {
    "path": "test/method/not_found.lox",
    "content": "class Foo {}\n\nFoo().unknown(); // expect runtime error: Undefined property 'unknown'.\n"
  },
  {
    "path": "test/method/print_bound_method.lox",
    "content": "class Foo {\n  method() { }\n}\nvar foo = Foo();\nprint foo.method; // expect: <fn method>\n"
  },
  {
    "path": "test/method/refer_to_name.lox",
    "content": "class Foo {\n  method() {\n    print method; // expect runtime error: Undefined variable 'method'.\n  }\n}\n\nFoo().method();\n"
  },
  {
    "path": "test/method/too_many_arguments.lox",
    "content": "{\n  var a = 1;\n  true.method(\n     a, // 1\n     a, // 2\n     a, // 3\n     a, // 4\n     a, // 5\n     a, // 6\n     a, // 7\n     a, // 8\n     a, // 9\n     a, // 10\n     a, // 11\n     a, // 12\n     a, // 13\n     a, // 14\n     a, // 15\n     a, // 16\n     a, // 17\n     a, // 18\n     a, // 19\n     a, // 20\n     a, // 21\n     a, // 22\n     a, // 23\n     a, // 24\n     a, // 25\n     a, // 26\n     a, // 27\n     a, // 28\n     a, // 29\n     a, // 30\n     a, // 31\n     a, // 32\n     a, // 33\n     a, // 34\n     a, // 35\n     a, // 36\n     a, // 37\n     a, // 38\n     a, // 39\n     a, // 40\n     a, // 41\n     a, // 42\n     a, // 43\n     a, // 44\n     a, // 45\n     a, // 46\n     a, // 47\n     a, // 48\n     a, // 49\n     a, // 50\n     a, // 51\n     a, // 52\n     a, // 53\n     a, // 54\n     a, // 55\n     a, // 56\n     a, // 57\n     a, // 58\n     a, // 59\n     a, // 60\n     a, // 61\n     a, // 62\n     a, // 63\n     a, // 64\n     a, // 65\n     a, // 66\n     a, // 67\n     a, // 68\n     a, // 69\n     a, // 70\n     a, // 71\n     a, // 72\n     a, // 73\n     a, // 74\n     a, // 75\n     a, // 76\n     a, // 77\n     a, // 78\n     a, // 79\n     a, // 80\n     a, // 81\n     a, // 82\n     a, // 83\n     a, // 84\n     a, // 85\n     a, // 86\n     a, // 87\n     a, // 88\n     a, // 89\n     a, // 90\n     a, // 91\n     a, // 92\n     a, // 93\n     a, // 94\n     a, // 95\n     a, // 96\n     a, // 97\n     a, // 98\n     a, // 99\n     a, // 100\n     a, // 101\n     a, // 102\n     a, // 103\n     a, // 104\n     a, // 105\n     a, // 106\n     a, // 107\n     a, // 108\n     a, // 109\n     a, // 110\n     a, // 111\n     a, // 112\n     a, // 113\n     a, // 114\n     a, // 115\n     a, // 116\n     a, // 117\n     a, // 118\n     a, // 119\n     a, // 120\n     a, // 121\n     a, // 122\n     a, // 123\n     a, // 124\n     a, // 125\n     a, // 126\n     a, // 127\n     a, // 128\n     a, // 
129\n     a, // 130\n     a, // 131\n     a, // 132\n     a, // 133\n     a, // 134\n     a, // 135\n     a, // 136\n     a, // 137\n     a, // 138\n     a, // 139\n     a, // 140\n     a, // 141\n     a, // 142\n     a, // 143\n     a, // 144\n     a, // 145\n     a, // 146\n     a, // 147\n     a, // 148\n     a, // 149\n     a, // 150\n     a, // 151\n     a, // 152\n     a, // 153\n     a, // 154\n     a, // 155\n     a, // 156\n     a, // 157\n     a, // 158\n     a, // 159\n     a, // 160\n     a, // 161\n     a, // 162\n     a, // 163\n     a, // 164\n     a, // 165\n     a, // 166\n     a, // 167\n     a, // 168\n     a, // 169\n     a, // 170\n     a, // 171\n     a, // 172\n     a, // 173\n     a, // 174\n     a, // 175\n     a, // 176\n     a, // 177\n     a, // 178\n     a, // 179\n     a, // 180\n     a, // 181\n     a, // 182\n     a, // 183\n     a, // 184\n     a, // 185\n     a, // 186\n     a, // 187\n     a, // 188\n     a, // 189\n     a, // 190\n     a, // 191\n     a, // 192\n     a, // 193\n     a, // 194\n     a, // 195\n     a, // 196\n     a, // 197\n     a, // 198\n     a, // 199\n     a, // 200\n     a, // 201\n     a, // 202\n     a, // 203\n     a, // 204\n     a, // 205\n     a, // 206\n     a, // 207\n     a, // 208\n     a, // 209\n     a, // 210\n     a, // 211\n     a, // 212\n     a, // 213\n     a, // 214\n     a, // 215\n     a, // 216\n     a, // 217\n     a, // 218\n     a, // 219\n     a, // 220\n     a, // 221\n     a, // 222\n     a, // 223\n     a, // 224\n     a, // 225\n     a, // 226\n     a, // 227\n     a, // 228\n     a, // 229\n     a, // 230\n     a, // 231\n     a, // 232\n     a, // 233\n     a, // 234\n     a, // 235\n     a, // 236\n     a, // 237\n     a, // 238\n     a, // 239\n     a, // 240\n     a, // 241\n     a, // 242\n     a, // 243\n     a, // 244\n     a, // 245\n     a, // 246\n     a, // 247\n     a, // 248\n     a, // 249\n     a, // 250\n     a, // 251\n     a, // 252\n     a, // 253\n     a, // 
254\n     a, // 255\n     a); // Error at 'a': Can't have more than 255 arguments.\n}\n"
  },
  {
    "path": "test/method/too_many_parameters.lox",
    "content": "class Foo {\n  // 256 parameters.\n  method(\n    a1,\n    a2,\n    a3,\n    a4,\n    a5,\n    a6,\n    a7,\n    a8,\n    a9,\n    a10,\n    a11,\n    a12,\n    a13,\n    a14,\n    a15,\n    a16,\n    a17,\n    a18,\n    a19,\n    a20,\n    a21,\n    a22,\n    a23,\n    a24,\n    a25,\n    a26,\n    a27,\n    a28,\n    a29,\n    a30,\n    a31,\n    a32,\n    a33,\n    a34,\n    a35,\n    a36,\n    a37,\n    a38,\n    a39,\n    a40,\n    a41,\n    a42,\n    a43,\n    a44,\n    a45,\n    a46,\n    a47,\n    a48,\n    a49,\n    a50,\n    a51,\n    a52,\n    a53,\n    a54,\n    a55,\n    a56,\n    a57,\n    a58,\n    a59,\n    a60,\n    a61,\n    a62,\n    a63,\n    a64,\n    a65,\n    a66,\n    a67,\n    a68,\n    a69,\n    a70,\n    a71,\n    a72,\n    a73,\n    a74,\n    a75,\n    a76,\n    a77,\n    a78,\n    a79,\n    a80,\n    a81,\n    a82,\n    a83,\n    a84,\n    a85,\n    a86,\n    a87,\n    a88,\n    a89,\n    a90,\n    a91,\n    a92,\n    a93,\n    a94,\n    a95,\n    a96,\n    a97,\n    a98,\n    a99,\n    a100,\n    a101,\n    a102,\n    a103,\n    a104,\n    a105,\n    a106,\n    a107,\n    a108,\n    a109,\n    a110,\n    a111,\n    a112,\n    a113,\n    a114,\n    a115,\n    a116,\n    a117,\n    a118,\n    a119,\n    a120,\n    a121,\n    a122,\n    a123,\n    a124,\n    a125,\n    a126,\n    a127,\n    a128,\n    a129,\n    a130,\n    a131,\n    a132,\n    a133,\n    a134,\n    a135,\n    a136,\n    a137,\n    a138,\n    a139,\n    a140,\n    a141,\n    a142,\n    a143,\n    a144,\n    a145,\n    a146,\n    a147,\n    a148,\n    a149,\n    a150,\n    a151,\n    a152,\n    a153,\n    a154,\n    a155,\n    a156,\n    a157,\n    a158,\n    a159,\n    a160,\n    a161,\n    a162,\n    a163,\n    a164,\n    a165,\n    a166,\n    a167,\n    a168,\n    a169,\n    a170,\n    a171,\n    a172,\n    a173,\n    a174,\n    a175,\n    a176,\n    a177,\n    a178,\n    a179,\n    a180,\n    a181,\n    a182,\n    a183,\n    a184,\n    a185,\n    
a186,\n    a187,\n    a188,\n    a189,\n    a190,\n    a191,\n    a192,\n    a193,\n    a194,\n    a195,\n    a196,\n    a197,\n    a198,\n    a199,\n    a200,\n    a201,\n    a202,\n    a203,\n    a204,\n    a205,\n    a206,\n    a207,\n    a208,\n    a209,\n    a210,\n    a211,\n    a212,\n    a213,\n    a214,\n    a215,\n    a216,\n    a217,\n    a218,\n    a219,\n    a220,\n    a221,\n    a222,\n    a223,\n    a224,\n    a225,\n    a226,\n    a227,\n    a228,\n    a229,\n    a230,\n    a231,\n    a232,\n    a233,\n    a234,\n    a235,\n    a236,\n    a237,\n    a238,\n    a239,\n    a240,\n    a241,\n    a242,\n    a243,\n    a244,\n    a245,\n    a246,\n    a247,\n    a248,\n    a249,\n    a250,\n    a251,\n    a252,\n    a253,\n    a254,\n    a255, a) {} // Error at 'a': Can't have more than 255 parameters.\n}\n"
  },
  {
    "path": "test/nil/literal.lox",
    "content": "print nil; // expect: nil\n"
  },
  {
    "path": "test/number/decimal_point_at_eof.lox",
    "content": "// [line 2] Error at end: Expect property name after '.'.\n123."
  },
  {
    "path": "test/number/leading_dot.lox",
    "content": "// [line 2] Error at '.': Expect expression.\n.123;\n"
  },
  {
    "path": "test/number/literals.lox",
    "content": "print 123;     // expect: 123\nprint 987654;  // expect: 987654\nprint 0;       // expect: 0\nprint -0;      // expect: -0\n\nprint 123.456; // expect: 123.456\nprint -0.001;  // expect: -0.001\n"
  },
  {
    "path": "test/number/nan_equality.lox",
    "content": "var nan = 0/0;\n\nprint nan == 0; // expect: false\nprint nan != 1; // expect: true\n\n// NaN is not equal to self.\nprint nan == nan; // expect: false\nprint nan != nan; // expect: true\n"
  },
  {
    "path": "test/number/trailing_dot.lox",
    "content": "// [line 2] Error at ';': Expect property name after '.'.\n123.;\n"
  },
  {
    "path": "test/operator/add.lox",
    "content": "print 123 + 456; // expect: 579\nprint \"str\" + \"ing\"; // expect: string\n"
  },
  {
    "path": "test/operator/add_bool_nil.lox",
    "content": "true + nil; // expect runtime error: Operands must be two numbers or two strings.\n"
  },
  {
    "path": "test/operator/add_bool_num.lox",
    "content": "true + 123; // expect runtime error: Operands must be two numbers or two strings.\n"
  },
  {
    "path": "test/operator/add_bool_string.lox",
    "content": "true + \"s\"; // expect runtime error: Operands must be two numbers or two strings.\n"
  },
  {
    "path": "test/operator/add_nil_nil.lox",
    "content": "nil + nil; // expect runtime error: Operands must be two numbers or two strings.\n"
  },
  {
    "path": "test/operator/add_num_nil.lox",
    "content": "1 + nil; // expect runtime error: Operands must be two numbers or two strings.\n"
  },
  {
    "path": "test/operator/add_string_nil.lox",
    "content": "\"s\" + nil; // expect runtime error: Operands must be two numbers or two strings.\n"
  },
  {
    "path": "test/operator/comparison.lox",
    "content": "print 1 < 2;    // expect: true\nprint 2 < 2;    // expect: false\nprint 2 < 1;    // expect: false\n\nprint 1 <= 2;    // expect: true\nprint 2 <= 2;    // expect: true\nprint 2 <= 1;    // expect: false\n\nprint 1 > 2;    // expect: false\nprint 2 > 2;    // expect: false\nprint 2 > 1;    // expect: true\n\nprint 1 >= 2;    // expect: false\nprint 2 >= 2;    // expect: true\nprint 2 >= 1;    // expect: true\n\n// Zero and negative zero compare the same.\nprint 0 < -0; // expect: false\nprint -0 < 0; // expect: false\nprint 0 > -0; // expect: false\nprint -0 > 0; // expect: false\nprint 0 <= -0; // expect: true\nprint -0 <= 0; // expect: true\nprint 0 >= -0; // expect: true\nprint -0 >= 0; // expect: true\n"
  },
  {
    "path": "test/operator/divide.lox",
    "content": "print 8 / 2;         // expect: 4\nprint 12.34 / 12.34;  // expect: 1\n"
  },
  {
    "path": "test/operator/divide_nonnum_num.lox",
    "content": "\"1\" / 1; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/operator/divide_num_nonnum.lox",
    "content": "1 / \"1\"; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/operator/equals.lox",
    "content": "print nil == nil; // expect: true\n\nprint true == true; // expect: true\nprint true == false; // expect: false\n\nprint 1 == 1; // expect: true\nprint 1 == 2; // expect: false\n\nprint \"str\" == \"str\"; // expect: true\nprint \"str\" == \"ing\"; // expect: false\n\nprint nil == false; // expect: false\nprint false == 0; // expect: false\nprint 0 == \"0\"; // expect: false\n"
  },
  {
    "path": "test/operator/equals_class.lox",
    "content": "// Classes have identity equality.\nclass Foo {}\nclass Bar {}\n\nprint Foo == Foo; // expect: true\nprint Foo == Bar; // expect: false\nprint Bar == Foo; // expect: false\nprint Bar == Bar; // expect: true\n\nprint Foo == \"Foo\"; // expect: false\nprint Foo == nil;   // expect: false\nprint Foo == 123;   // expect: false\nprint Foo == true;  // expect: false\n"
  },
  {
    "path": "test/operator/equals_method.lox",
    "content": "// Bound methods have identity equality.\nclass Foo {\n  method() {}\n}\n\nvar foo = Foo();\nvar fooMethod = foo.method;\n\n// Same bound method.\nprint fooMethod == fooMethod; // expect: true\n\n// Different closurizations.\nprint foo.method == foo.method; // expect: false\n"
  },
  {
    "path": "test/operator/greater_nonnum_num.lox",
    "content": "\"1\" > 1; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/operator/greater_num_nonnum.lox",
    "content": "1 > \"1\"; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/operator/greater_or_equal_nonnum_num.lox",
    "content": "\"1\" >= 1; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/operator/greater_or_equal_num_nonnum.lox",
    "content": "1 >= \"1\"; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/operator/less_nonnum_num.lox",
    "content": "\"1\" < 1; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/operator/less_num_nonnum.lox",
    "content": "1 < \"1\"; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/operator/less_or_equal_nonnum_num.lox",
    "content": "\"1\" <= 1; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/operator/less_or_equal_num_nonnum.lox",
    "content": "1 <= \"1\"; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/operator/multiply.lox",
    "content": "print 5 * 3; // expect: 15\nprint 12.34 * 0.3; // expect: 3.702\n"
  },
  {
    "path": "test/operator/multiply_nonnum_num.lox",
    "content": "\"1\" * 1; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/operator/multiply_num_nonnum.lox",
    "content": "1 * \"1\"; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/operator/negate.lox",
    "content": "print -(3); // expect: -3\nprint --(3); // expect: 3\nprint ---(3); // expect: -3\n"
  },
  {
    "path": "test/operator/negate_nonnum.lox",
    "content": "-\"s\"; // expect runtime error: Operand must be a number.\n"
  },
  {
    "path": "test/operator/not.lox",
    "content": "print !true;     // expect: false\nprint !false;    // expect: true\nprint !!true;    // expect: true\n\nprint !123;      // expect: false\nprint !0;        // expect: false\n\nprint !nil;     // expect: true\n\nprint !\"\";       // expect: false\n\nfun foo() {}\nprint !foo;      // expect: false\n"
  },
  {
    "path": "test/operator/not_class.lox",
    "content": "class Bar {}\nprint !Bar;      // expect: false\nprint !Bar();    // expect: false\n"
  },
  {
    "path": "test/operator/not_equals.lox",
    "content": "print nil != nil; // expect: false\n\nprint true != true; // expect: false\nprint true != false; // expect: true\n\nprint 1 != 1; // expect: false\nprint 1 != 2; // expect: true\n\nprint \"str\" != \"str\"; // expect: false\nprint \"str\" != \"ing\"; // expect: true\n\nprint nil != false; // expect: true\nprint false != 0; // expect: true\nprint 0 != \"0\"; // expect: true\n"
  },
  {
    "path": "test/operator/subtract.lox",
    "content": "print 4 - 3; // expect: 1\nprint 1.2 - 1.2; // expect: 0\n"
  },
  {
    "path": "test/operator/subtract_nonnum_num.lox",
    "content": "\"1\" - 1; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/operator/subtract_num_nonnum.lox",
    "content": "1 - \"1\"; // expect runtime error: Operands must be numbers.\n"
  },
  {
    "path": "test/precedence.lox",
    "content": "// * has higher precedence than +.\nprint 2 + 3 * 4; // expect: 14\n\n// * has higher precedence than -.\nprint 20 - 3 * 4; // expect: 8\n\n// / has higher precedence than +.\nprint 2 + 6 / 3; // expect: 4\n\n// / has higher precedence than -.\nprint 2 - 6 / 3; // expect: 0\n\n// < has higher precedence than ==.\nprint false == 2 < 1; // expect: true\n\n// > has higher precedence than ==.\nprint false == 1 > 2; // expect: true\n\n// <= has higher precedence than ==.\nprint false == 2 <= 1; // expect: true\n\n// >= has higher precedence than ==.\nprint false == 1 >= 2; // expect: true\n\n// 1 - 1 is not space-sensitive.\nprint 1 - 1; // expect: 0\nprint 1 -1;  // expect: 0\nprint 1- 1;  // expect: 0\nprint 1-1;   // expect: 0\n\n// Using () for grouping.\nprint (2 * (6 - (2 + 2))); // expect: 4\n"
  },
  {
    "path": "test/print/missing_argument.lox",
    "content": "// [line 2] Error at ';': Expect expression.\nprint;\n"
  },
  {
    "path": "test/regression/394.lox",
    "content": "{\n  class A {}\n  class B < A {}\n  print B; // expect: B\n}\n"
  },
  {
    "path": "test/regression/40.lox",
    "content": "fun caller(g) {\n  g();\n  // g should be a function, not nil.\n  print g == nil; // expect: false\n}\n\nfun callCaller() {\n  var capturedVar = \"before\";\n  var a = \"a\";\n\n  fun f() {\n    // Commenting the next line out prevents the bug!\n    capturedVar = \"after\";\n\n    // Returning anything also fixes it, even nil:\n    //return nil;\n  }\n\n  caller(f);\n}\n\ncallCaller();\n"
  },
  {
    "path": "test/return/after_else.lox",
    "content": "fun f() {\n  if (false) \"no\"; else return \"ok\";\n}\n\nprint f(); // expect: ok\n"
  },
  {
    "path": "test/return/after_if.lox",
    "content": "fun f() {\n  if (true) return \"ok\";\n}\n\nprint f(); // expect: ok\n"
  },
  {
    "path": "test/return/after_while.lox",
    "content": "fun f() {\n  while (true) return \"ok\";\n}\n\nprint f(); // expect: ok\n"
  },
  {
    "path": "test/return/at_top_level.lox",
    "content": "return \"wat\"; // Error at 'return': Can't return from top-level code.\n"
  },
  {
    "path": "test/return/in_function.lox",
    "content": "fun f() {\n  return \"ok\";\n  print \"bad\";\n}\n\nprint f(); // expect: ok\n"
  },
  {
    "path": "test/return/in_method.lox",
    "content": "class Foo {\n  method() {\n    return \"ok\";\n    print \"bad\";\n  }\n}\n\nprint Foo().method(); // expect: ok\n"
  },
  {
    "path": "test/return/return_nil_if_no_value.lox",
    "content": "fun f() {\n  return;\n  print \"bad\";\n}\n\nprint f(); // expect: nil\n"
  },
  {
    "path": "test/scanning/identifiers.lox",
    "content": "andy formless fo _ _123 _abc ab123\nabcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890_\n\n// expect: IDENTIFIER andy null\n// expect: IDENTIFIER formless null\n// expect: IDENTIFIER fo null\n// expect: IDENTIFIER _ null\n// expect: IDENTIFIER _123 null\n// expect: IDENTIFIER _abc null\n// expect: IDENTIFIER ab123 null\n// expect: IDENTIFIER abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890_ null\n// expect: EOF  null\n"
  },
  {
    "path": "test/scanning/keywords.lox",
    "content": "and class else false for fun if nil or return super this true var while\n\n// expect: AND and null\n// expect: CLASS class null\n// expect: ELSE else null\n// expect: FALSE false null\n// expect: FOR for null\n// expect: FUN fun null\n// expect: IF if null\n// expect: NIL nil null\n// expect: OR or null\n// expect: RETURN return null\n// expect: SUPER super null\n// expect: THIS this null\n// expect: TRUE true null\n// expect: VAR var null\n// expect: WHILE while null\n// expect: EOF  null\n"
  },
  {
    "path": "test/scanning/numbers.lox",
    "content": "123\n123.456\n.456\n123.\n\n// expect: NUMBER 123 123.0\n// expect: NUMBER 123.456 123.456\n// expect: DOT . null\n// expect: NUMBER 456 456.0\n// expect: NUMBER 123 123.0\n// expect: DOT . null\n// expect: EOF  null\n"
  },
  {
    "path": "test/scanning/punctuators.lox",
    "content": "(){};,+-*!===<=>=!=<>/.\n\n// expect: LEFT_PAREN ( null\n// expect: RIGHT_PAREN ) null\n// expect: LEFT_BRACE { null\n// expect: RIGHT_BRACE } null\n// expect: SEMICOLON ; null\n// expect: COMMA , null\n// expect: PLUS + null\n// expect: MINUS - null\n// expect: STAR * null\n// expect: BANG_EQUAL != null\n// expect: EQUAL_EQUAL == null\n// expect: LESS_EQUAL <= null\n// expect: GREATER_EQUAL >= null\n// expect: BANG_EQUAL != null\n// expect: LESS < null\n// expect: GREATER > null\n// expect: SLASH / null\n// expect: DOT . null\n// expect: EOF  null\n"
  },
  {
    "path": "test/scanning/strings.lox",
    "content": "\"\"\n\"string\"\n\n// expect: STRING \"\" \n// expect: STRING \"string\" string\n// expect: EOF  null"
  },
  {
    "path": "test/scanning/whitespace.lox",
    "content": "space    tabs\t\t\t\tnewlines\n\n\n\n\nend\n\n// expect: IDENTIFIER space null\n// expect: IDENTIFIER tabs null\n// expect: IDENTIFIER newlines null\n// expect: IDENTIFIER end null\n// expect: EOF  null\n"
  },
  {
    "path": "test/string/error_after_multiline.lox",
    "content": "// Tests that we correctly track the line info across multiline strings.\nvar a = \"1\n2\n3\n\";\n\nerr; // // expect runtime error: Undefined variable 'err'."
  },
  {
    "path": "test/string/literals.lox",
    "content": "print \"(\" + \"\" + \")\";   // expect: ()\nprint \"a string\"; // expect: a string\n\n// Non-ASCII.\nprint \"A~¶Þॐஃ\"; // expect: A~¶Þॐஃ\n"
  },
  {
    "path": "test/string/multiline.lox",
    "content": "var a = \"1\n2\n3\";\nprint a;\n// expect: 1\n// expect: 2\n// expect: 3\n"
  },
  {
    "path": "test/string/unterminated.lox",
    "content": "// [line 2] Error: Unterminated string.\n\"this string has no close quote"
  },
  {
    "path": "test/super/bound_method.lox",
    "content": "class A {\n  method(arg) {\n    print \"A.method(\" + arg + \")\";\n  }\n}\n\nclass B < A {\n  getClosure() {\n    return super.method;\n  }\n\n  method(arg) {\n    print \"B.method(\" + arg + \")\";\n  }\n}\n\n\nvar closure = B().getClosure();\nclosure(\"arg\"); // expect: A.method(arg)\n"
  },
  {
    "path": "test/super/call_other_method.lox",
    "content": "class Base {\n  foo() {\n    print \"Base.foo()\";\n  }\n}\n\nclass Derived < Base {\n  bar() {\n    print \"Derived.bar()\";\n    super.foo();\n  }\n}\n\nDerived().bar();\n// expect: Derived.bar()\n// expect: Base.foo()\n"
  },
  {
    "path": "test/super/call_same_method.lox",
    "content": "class Base {\n  foo() {\n    print \"Base.foo()\";\n  }\n}\n\nclass Derived < Base {\n  foo() {\n    print \"Derived.foo()\";\n    super.foo();\n  }\n}\n\nDerived().foo();\n// expect: Derived.foo()\n// expect: Base.foo()\n"
  },
  {
    "path": "test/super/closure.lox",
    "content": "class Base {\n  toString() { return \"Base\"; }\n}\n\nclass Derived < Base {\n  getClosure() {\n    fun closure() {\n      return super.toString();\n    }\n    return closure;\n  }\n\n  toString() { return \"Derived\"; }\n}\n\nvar closure = Derived().getClosure();\nprint closure(); // expect: Base\n"
  },
  {
    "path": "test/super/constructor.lox",
    "content": "class Base {\n  init(a, b) {\n    print \"Base.init(\" + a + \", \" + b + \")\";\n  }\n}\n\nclass Derived < Base {\n  init() {\n    print \"Derived.init()\";\n    super.init(\"a\", \"b\");\n  }\n}\n\nDerived();\n// expect: Derived.init()\n// expect: Base.init(a, b)\n"
  },
  {
    "path": "test/super/extra_arguments.lox",
    "content": "class Base {\n  foo(a, b) {\n    print \"Base.foo(\" + a + \", \" + b + \")\";\n  }\n}\n\nclass Derived < Base {\n  foo() {\n    print \"Derived.foo()\"; // expect: Derived.foo()\n    super.foo(\"a\", \"b\", \"c\", \"d\"); // expect runtime error: Expected 2 arguments but got 4.\n  }\n}\n\nDerived().foo();\n"
  },
  {
    "path": "test/super/indirectly_inherited.lox",
    "content": "class A {\n  foo() {\n    print \"A.foo()\";\n  }\n}\n\nclass B < A {}\n\nclass C < B {\n  foo() {\n    print \"C.foo()\";\n    super.foo();\n  }\n}\n\nC().foo();\n// expect: C.foo()\n// expect: A.foo()\n"
  },
  {
    "path": "test/super/missing_arguments.lox",
    "content": "class Base {\n  foo(a, b) {\n    print \"Base.foo(\" + a + \", \" + b + \")\";\n  }\n}\n\nclass Derived < Base {\n  foo() {\n    super.foo(1); // expect runtime error: Expected 2 arguments but got 1.\n  }\n}\n\nDerived().foo();\n"
  },
  {
    "path": "test/super/no_superclass_bind.lox",
    "content": "class Base {\n  foo() {\n    super.doesNotExist; // Error at 'super': Can't use 'super' in a class with no superclass.\n  }\n}\n\nBase().foo();\n"
  },
  {
    "path": "test/super/no_superclass_call.lox",
    "content": "class Base {\n  foo() {\n    super.doesNotExist(1); // Error at 'super': Can't use 'super' in a class with no superclass.\n  }\n}\n\nBase().foo();\n"
  },
  {
    "path": "test/super/no_superclass_method.lox",
    "content": "class Base {}\n\nclass Derived < Base {\n  foo() {\n    super.doesNotExist(1); // expect runtime error: Undefined property 'doesNotExist'.\n  }\n}\n\nDerived().foo();\n"
  },
  {
    "path": "test/super/parenthesized.lox",
    "content": "class A {\n  method() {}\n}\n\nclass B < A {\n  method() {\n    // [line 8] Error at ')': Expect '.' after 'super'.\n    (super).method();\n  }\n}\n"
  },
  {
    "path": "test/super/reassign_superclass.lox",
    "content": "class Base {\n  method() {\n    print \"Base.method()\";\n  }\n}\n\nclass Derived < Base {\n  method() {\n    super.method();\n  }\n}\n\nclass OtherBase {\n  method() {\n    print \"OtherBase.method()\";\n  }\n}\n\nvar derived = Derived();\nderived.method(); // expect: Base.method()\nBase = OtherBase;\nderived.method(); // expect: Base.method()\n"
  },
  {
    "path": "test/super/super_at_top_level.lox",
    "content": "super.foo(\"bar\"); // Error at 'super': Can't use 'super' outside of a class.\nsuper.foo; // Error at 'super': Can't use 'super' outside of a class."
  },
  {
    "path": "test/super/super_in_closure_in_inherited_method.lox",
    "content": "class A {\n  say() {\n    print \"A\";\n  }\n}\n\nclass B < A {\n  getClosure() {\n    fun closure() {\n      super.say();\n    }\n    return closure;\n  }\n\n  say() {\n    print \"B\";\n  }\n}\n\nclass C < B {\n  say() {\n    print \"C\";\n  }\n}\n\nC().getClosure()(); // expect: A\n"
  },
  {
    "path": "test/super/super_in_inherited_method.lox",
    "content": "class A {\n  say() {\n    print \"A\";\n  }\n}\n\nclass B < A {\n  test() {\n    super.say();\n  }\n\n  say() {\n    print \"B\";\n  }\n}\n\nclass C < B {\n  say() {\n    print \"C\";\n  }\n}\n\nC().test(); // expect: A\n"
  },
  {
    "path": "test/super/super_in_top_level_function.lox",
    "content": "  super.bar(); // Error at 'super': Can't use 'super' outside of a class.\nfun foo() {\n}"
  },
  {
    "path": "test/super/super_without_dot.lox",
    "content": "class A {}\n\nclass B < A {\n  method() {\n    // [line 6] Error at ';': Expect '.' after 'super'.\n    super;\n  }\n}\n"
  },
  {
    "path": "test/super/super_without_name.lox",
    "content": "class A {}\n\nclass B < A {\n  method() {\n    super.; // Error at ';': Expect superclass method name.\n  }\n}\n"
  },
  {
    "path": "test/super/this_in_superclass_method.lox",
    "content": "class Base {\n  init(a) {\n    this.a = a;\n  }\n}\n\nclass Derived < Base {\n  init(a, b) {\n    super.init(a);\n    this.b = b;\n  }\n}\n\nvar derived = Derived(\"a\", \"b\");\nprint derived.a; // expect: a\nprint derived.b; // expect: b\n"
  },
  {
    "path": "test/this/closure.lox",
    "content": "class Foo {\n  getClosure() {\n    fun closure() {\n      return this.toString();\n    }\n    return closure;\n  }\n\n  toString() { return \"Foo\"; }\n}\n\nvar closure = Foo().getClosure();\nprint closure(); // expect: Foo\n"
  },
  {
    "path": "test/this/nested_class.lox",
    "content": "class Outer {\n  method() {\n    print this; // expect: Outer instance\n\n    fun f() {\n      print this; // expect: Outer instance\n\n      class Inner {\n        method() {\n          print this; // expect: Inner instance\n        }\n      }\n\n      Inner().method();\n    }\n    f();\n  }\n}\n\nOuter().method();\n"
  },
  {
    "path": "test/this/nested_closure.lox",
    "content": "class Foo {\n  getClosure() {\n    fun f() {\n      fun g() {\n        fun h() {\n          return this.toString();\n        }\n        return h;\n      }\n      return g;\n    }\n    return f;\n  }\n\n  toString() { return \"Foo\"; }\n}\n\nvar closure = Foo().getClosure();\nprint closure()()(); // expect: Foo\n"
  },
  {
    "path": "test/this/this_at_top_level.lox",
    "content": "this; // Error at 'this': Can't use 'this' outside of a class.\n"
  },
  {
    "path": "test/this/this_in_method.lox",
    "content": "class Foo {\n  bar() { return this; }\n  baz() { return \"baz\"; }\n}\n\nprint Foo().bar().baz(); // expect: baz\n"
  },
  {
    "path": "test/this/this_in_top_level_function.lox",
    "content": "fun foo() {\n  this; // Error at 'this': Can't use 'this' outside of a class.\n}\n"
  },
  {
    "path": "test/unexpected_character.lox",
    "content": "// [line 3] Error: Unexpected character.\n// [java line 3] Error at 'b': Expect ')' after arguments.\nfoo(a | b);\n"
  },
  {
    "path": "test/variable/collide_with_parameter.lox",
    "content": "fun foo(a) {\n  var a; // Error at 'a': Already a variable with this name in this scope.\n}\n"
  },
  {
    "path": "test/variable/duplicate_local.lox",
    "content": "{\n  var a = \"value\";\n  var a = \"other\"; // Error at 'a': Already a variable with this name in this scope.\n}\n"
  },
  {
    "path": "test/variable/duplicate_parameter.lox",
    "content": "fun foo(arg,\n        arg) { // Error at 'arg': Already a variable with this name in this scope.\n  \"body\";\n}\n"
  },
  {
    "path": "test/variable/early_bound.lox",
    "content": "var a = \"outer\";\n{\n  fun foo() {\n    print a;\n  }\n\n  foo(); // expect: outer\n  var a = \"inner\";\n  foo(); // expect: outer\n}\n"
  },
  {
    "path": "test/variable/in_middle_of_block.lox",
    "content": "{\n  var a = \"a\";\n  print a; // expect: a\n  var b = a + \" b\";\n  print b; // expect: a b\n  var c = a + \" c\";\n  print c; // expect: a c\n  var d = b + \" d\";\n  print d; // expect: a b d\n}\n"
  },
  {
    "path": "test/variable/in_nested_block.lox",
    "content": "{\n  var a = \"outer\";\n  {\n    print a; // expect: outer\n  }\n}"
  },
  {
    "path": "test/variable/local_from_method.lox",
    "content": "var foo = \"variable\";\n\nclass Foo {\n  method() {\n    print foo;\n  }\n}\n\nFoo().method(); // expect: variable\n"
  },
  {
    "path": "test/variable/redeclare_global.lox",
    "content": "var a = \"1\";\nvar a;\nprint a; // expect: nil\n"
  },
  {
    "path": "test/variable/redefine_global.lox",
    "content": "var a = \"1\";\nvar a = \"2\";\nprint a; // expect: 2\n"
  },
  {
    "path": "test/variable/scope_reuse_in_different_blocks.lox",
    "content": "{\n  var a = \"first\";\n  print a; // expect: first\n}\n\n{\n  var a = \"second\";\n  print a; // expect: second\n}\n"
  },
  {
    "path": "test/variable/shadow_and_local.lox",
    "content": "{\n  var a = \"outer\";\n  {\n    print a; // expect: outer\n    var a = \"inner\";\n    print a; // expect: inner\n  }\n}"
  },
  {
    "path": "test/variable/shadow_global.lox",
    "content": "var a = \"global\";\n{\n  var a = \"shadow\";\n  print a; // expect: shadow\n}\nprint a; // expect: global\n"
  },
  {
    "path": "test/variable/shadow_local.lox",
    "content": "{\n  var a = \"local\";\n  {\n    var a = \"shadow\";\n    print a; // expect: shadow\n  }\n  print a; // expect: local\n}\n"
  },
  {
    "path": "test/variable/undefined_global.lox",
    "content": "print notDefined;  // expect runtime error: Undefined variable 'notDefined'.\n"
  },
  {
    "path": "test/variable/undefined_local.lox",
    "content": "{\n  print notDefined;  // expect runtime error: Undefined variable 'notDefined'.\n}\n"
  },
  {
    "path": "test/variable/uninitialized.lox",
    "content": "var a;\nprint a; // expect: nil\n"
  },
  {
    "path": "test/variable/unreached_undefined.lox",
    "content": "if (false) {\n  print notDefined;\n}\n\nprint \"ok\"; // expect: ok\n"
  },
  {
    "path": "test/variable/use_false_as_var.lox",
    "content": "// [line 2] Error at 'false': Expect variable name.\nvar false = \"value\";\n"
  },
  {
    "path": "test/variable/use_global_in_initializer.lox",
    "content": "var a = \"value\";\nvar a = a;\nprint a; // expect: value\n"
  },
  {
    "path": "test/variable/use_local_in_initializer.lox",
    "content": "var a = \"outer\";\n{\n  var a = a; // Error at 'a': Can't read local variable in its own initializer.\n}\n"
  },
  {
    "path": "test/variable/use_nil_as_var.lox",
    "content": "// [line 2] Error at 'nil': Expect variable name.\nvar nil = \"value\";\n"
  },
  {
    "path": "test/variable/use_this_as_var.lox",
    "content": "// [line 2] Error at 'this': Expect variable name.\nvar this = \"value\";\n"
  },
  {
    "path": "test/while/class_in_body.lox",
    "content": "// [line 2] Error at 'class': Expect expression.\nwhile (true) class Foo {}\n"
  },
  {
    "path": "test/while/closure_in_body.lox",
    "content": "var f1;\nvar f2;\nvar f3;\n\nvar i = 1;\nwhile (i < 4) {\n  var j = i;\n  fun f() { print j; }\n\n  if (j == 1) f1 = f;\n  else if (j == 2) f2 = f;\n  else f3 = f;\n\n  i = i + 1;\n}\n\nf1(); // expect: 1\nf2(); // expect: 2\nf3(); // expect: 3\n"
  },
  {
    "path": "test/while/fun_in_body.lox",
    "content": "// [line 2] Error at 'fun': Expect expression.\nwhile (true) fun foo() {}\n"
  },
  {
    "path": "test/while/return_closure.lox",
    "content": "fun f() {\n  while (true) {\n    var i = \"i\";\n    fun g() { print i; }\n    return g;\n  }\n}\n\nvar h = f();\nh(); // expect: i\n"
  },
  {
    "path": "test/while/return_inside.lox",
    "content": "fun f() {\n  while (true) {\n    var i = \"i\";\n    return i;\n  }\n}\n\nprint f();\n// expect: i\n"
  },
  {
    "path": "test/while/syntax.lox",
    "content": "// Single-expression body.\nvar c = 0;\nwhile (c < 3) print c = c + 1;\n// expect: 1\n// expect: 2\n// expect: 3\n\n// Block body.\nvar a = 0;\nwhile (a < 3) {\n  print a;\n  a = a + 1;\n}\n// expect: 0\n// expect: 1\n// expect: 2\n\n// Statement bodies.\nwhile (false) if (true) 1; else 2;\nwhile (false) while (true) 1;\nwhile (false) for (;;) 1;\n"
  },
  {
    "path": "test/while/var_in_body.lox",
    "content": "// [line 2] Error at 'var': Expect expression.\nwhile (true) var foo;\n"
  },
  {
    "path": "tool/analysis_options.yaml",
    "content": "analyzer:\n strong-mode:\n# Close, but still false positives around clamp().\n  implicit-casts: false\n# Too many false positives.\n  implicit-dynamic: false\n  "
  },
  {
    "path": "tool/bin/benchmark.dart",
    "content": "import 'dart:convert';\nimport 'dart:io';\n\nimport 'package:path/path.dart' as p;\n\nvoid main(List<String> arguments) {\n  if (arguments.isEmpty) {\n    print('Usage: benchmark.py [interpreters...] <benchmark>');\n    exit(1);\n  }\n\n  var interpreters = ['build/clox'];\n  var benchmark = arguments.last;\n  if (arguments.length > 1) {\n    interpreters = arguments.sublist(0, arguments.length - 1);\n  }\n\n  if (interpreters.length > 1) {\n    runComparison(interpreters, benchmark);\n  } else {\n    runBenchmark(interpreters[0], benchmark);\n  }\n}\n\nvoid runBenchmark(String interpreter, String benchmark) {\n  var trial = 1;\n  var best = 9999.0;\n\n  for (;;) {\n    var elapsed = runTrial(interpreter, benchmark);\n    if (elapsed < best) best = elapsed;\n\n    var bestSeconds = best.toStringAsFixed(2);\n    print(\"trial #$trial  $interpreter   best ${bestSeconds}s\");\n    trial++;\n  }\n}\n\n/// Runs the benchmark once and returns the elapsed time.\ndouble runTrial(String interpreter, String benchmark) {\n  var result = Process.runSync(\n      interpreter, [p.join(\"test\", \"benchmark\", \"$benchmark.lox\")]);\n  var outLines = const LineSplitter().convert(result.stdout as String);\n\n  // Remove the trailing last empty line.\n  if (outLines.last == \"\") outLines.removeLast();\n\n  // The benchmark should print the elapsed time last.\n  return double.parse(outLines.last);\n}\n\nvoid runComparison(List<String> interpreters, String benchmark) {\n  var trial = 1;\n  var best = {for (var interpreter in interpreters) interpreter: 9999.0};\n\n  for (;;) {\n    for (var interpreter in interpreters) {\n      var elapsed = runTrial(interpreter, benchmark);\n      if (elapsed < best[interpreter]) best[interpreter] = elapsed;\n    }\n\n    var bestTime = 999.0;\n    var worstTime = 0.0;\n    String bestInterpreter;\n    for (var interpreter in interpreters) {\n      if (best[interpreter] < bestTime) {\n        bestTime = best[interpreter];\n        
bestInterpreter = interpreter;\n      }\n      if (best[interpreter] > worstTime) {\n        worstTime = best[interpreter];\n      }\n    }\n\n    // Turn the time measurement into an effort measurement in units where 1\n    // \"work\" is just the total thing the benchmark does.\n    var worstWork = 1.0 / worstTime;\n\n    print(\"trial #$trial\");\n    for (var interpreter in interpreters) {\n      String suffix;\n      if (interpreter == bestInterpreter) {\n        var bestWork = 1.0 / best[interpreter];\n        var workRatio = bestWork / worstWork;\n        var faster = 100 * (workRatio - 1.0);\n        suffix = \"${faster.toStringAsFixed(4)}% faster\";\n      } else {\n        var ratio = best[interpreter] / bestTime;\n        suffix = \"${ratio.toStringAsFixed(4)}x time of best\";\n      }\n      var bestString = best[interpreter].toStringAsFixed(4);\n      print(\"  ${interpreter.padRight(30)}   best ${bestString}s  $suffix\");\n    }\n\n    trial++;\n  }\n}\n"
  },
  {
    "path": "tool/bin/build.dart",
    "content": "import 'dart:io';\n\nimport 'package:glob/glob.dart';\nimport 'package:mime_type/mime_type.dart';\nimport 'package:path/path.dart' as p;\nimport 'package:sass/sass.dart' as sass;\nimport 'package:shelf/shelf.dart' as shelf;\nimport 'package:shelf/shelf_io.dart' as io;\n\nimport 'package:tool/src/book.dart';\nimport 'package:tool/src/format.dart';\nimport 'package:tool/src/markdown/markdown.dart';\nimport 'package:tool/src/mustache.dart';\nimport 'package:tool/src/page.dart';\nimport 'package:tool/src/term.dart' as term;\nimport 'package:tool/src/text.dart';\n\n/// Aside comment marker in highlighted code.\nfinal _asideHighlightedCommentPattern =\n    RegExp(r' ?<span class=\"c\">// \\[([-a-z0-9]+)\\] *</span>');\n\n/// Aside comment marker in highlighted code with a comment too.\nfinal _asideHighlightedWithCommentPattern =\n    RegExp(r' ?<span class=\"c\">// (.+) \\[([-a-z0-9]+)\\] *</span>');\n\n/// Aside comment marker in context lines which are not syntax highlighted.\nfinal _asideCommentPattern = RegExp(r' +// \\[([-a-z0-9]+)\\]');\n\n/// Aside comment marker in context lines which are not syntax highlighted with\n/// a comment too.\nfinal _asideWithCommentPattern = RegExp(r' +// (.+) \\[([-a-z0-9]+)\\]');\n\nFuture<void> main(List<String> arguments) async {\n  _buildSass();\n  _buildPages();\n\n  if (arguments.contains(\"--serve\")) {\n    await _runServer();\n  }\n}\n\n/// Process each Markdown file.\nvoid _buildPages({bool skipUpToDate = false}) {\n  var watch = Stopwatch()..start();\n  var book = Book();\n  var mustache = Mustache();\n\n  DateTime dependenciesModified;\n  if (skipUpToDate) {\n    dependenciesModified = _mostRecentlyModified(\n        [\"asset/mustache/*.html\", \"c/*.{c,h}\", \"java/**.java\"]);\n  }\n\n  var proseWords = 0;\n  var codeLines = 0;\n  var totalWords = 0;\n  for (var page in book.pages) {\n    var metrics = _buildPage(book, mustache, page,\n        dependenciesModified: dependenciesModified);\n    proseWords 
+= metrics[0];\n    codeLines += metrics[1];\n    totalWords += metrics[2];\n  }\n\n  if (totalWords > 0) {\n    var seconds = (watch.elapsedMilliseconds / 1000).toStringAsFixed(2);\n    print(\"Built ${term.green(proseWords.withCommas)} words and \"\n        \"${term.cyan(codeLines.withCommas)} lines of code \"\n        \"(${totalWords.withCommas} total words) in $seconds seconds\");\n  }\n}\n\nList<int> _buildPage(Book book, Mustache mustache, Page page,\n    {DateTime dependenciesModified}) {\n  // See if the HTML is up to date.\n  if (dependenciesModified != null &&\n      _isUpToDate(page.htmlPath, page.markdownPath, dependenciesModified)) {\n    return [0, 0, 0];\n  }\n\n  var proseCount = 0;\n  var codeLineCount = 0;\n  for (var line in page.lines) proseCount += line.wordCount;\n\n  var wordCount = proseCount;\n  for (var tag in page.codeTags) {\n    var snippet = book.findSnippet(tag);\n    if (snippet == null) {\n      print(\"No snippet for $tag\");\n      continue;\n    }\n\n    codeLineCount += snippet.added.length;\n    for (var line in snippet.added) wordCount += line.wordCount;\n    for (var line in snippet.contextBefore) wordCount += line.wordCount;\n    for (var line in snippet.contextAfter) wordCount += line.wordCount;\n  }\n\n  var body = renderMarkdown(book, page, page.lines, Format.web);\n  var output = mustache.render(book, page, body);\n\n  // Turn aside markers in code into spans. 
In the empty span case, insert a\n  // zero-width space because Chrome seems to lose the span's position if it has\n  // no content.\n  // <span class=\"c\">// [repl]</span>\n  // TODO: Do this directly in the syntax highlighter instead of after the fact.\n  output = output.replaceAllMapped(_asideHighlightedCommentPattern,\n      (match) => '<span name=\"${match[1]}\"> </span>');\n  output = output.replaceAllMapped(_asideHighlightedWithCommentPattern,\n      (match) => '<span class=\"c\" name=\"${match[2]}\">// ${match[1]}</span>');\n  output = output.replaceAllMapped(\n      _asideCommentPattern, (match) => '<span name=\"${match[1]}\"> </span>');\n  output = output.replaceAllMapped(_asideWithCommentPattern,\n      (match) => '<span name=\"${match[2]}\">// ${match[1]}</span>');\n\n  // Write the output.\n  File(page.htmlPath).writeAsStringSync(output);\n\n  var words = \"$wordCount words\";\n  if (codeLineCount > 0) words += \", $codeLineCount loc\";\n  words = term.gray(\"($words)\");\n\n  var number = \"\";\n  if (page.numberString.isNotEmpty) {\n    number = \"${page.numberString}. 
\";\n  }\n\n  if (page.isChapter) {\n    print(\"  ${term.green('✓')} $number${page.title} $words\");\n  } else {\n    print(\"${term.green('✓')} $number${page.title} $words\");\n  }\n\n  return [proseCount, codeLineCount, wordCount];\n}\n\n/// Process each SASS file.\nvoid _buildSass({bool skipUpToDate = false}) {\n  var moduleModified = _mostRecentlyModified([\"asset/sass/*.scss\"]);\n\n  for (var source in Glob(\"asset/*.scss\").listSync()) {\n    var scssPath = p.normalize(source.path);\n    var cssPath =\n        p.join(\"site\", p.basenameWithoutExtension(source.path) + \".css\");\n\n    if (skipUpToDate && _isUpToDate(cssPath, scssPath, moduleModified)) {\n      continue;\n    }\n\n    var output =\n        sass.compile(scssPath, color: true, style: sass.OutputStyle.expanded);\n    File(cssPath).writeAsStringSync(output);\n    print(\"${term.green('-')} $cssPath\");\n  }\n}\n\nFuture<void> _runServer() async {\n  Future<shelf.Response> handleRequest(shelf.Request request) async {\n    var filePath = p.normalize(p.fromUri(request.url));\n    if (filePath == \".\") filePath = \"index.html\";\n    var extension = p.extension(filePath).replaceAll(\".\", \"\");\n\n    // Refresh files that are being requested.\n    if (extension == \"html\") {\n      _buildPages(skipUpToDate: true);\n    } else if (extension == \"css\") {\n      _buildSass(skipUpToDate: true);\n    }\n\n    try {\n      var contents = await File(p.join(\"site\", filePath)).readAsBytes();\n      return shelf.Response.ok(contents, headers: {\n        HttpHeaders.contentTypeHeader: mimeFromExtension(extension)\n      });\n    } on FileSystemException {\n      print(\n          \"${term.red(request.method)} Not found: ${request.url} ($filePath)\");\n      return shelf.Response.notFound(\"Could not find '$filePath'.\");\n    }\n  }\n\n  var handler = const shelf.Pipeline().addHandler(handleRequest);\n\n  var server = await io.serve(handler, \"localhost\", 8000);\n  print(\"Serving at 
http://${server.address.host}:${server.port}\");\n}\n\n/// Returns `true` if [outputPath] was generated after [inputPath] and more\n/// recently than [dependenciesModified].\nbool _isUpToDate(\n    String outputPath, String inputPath, DateTime dependenciesModified) {\n  var outputModified = File(outputPath).lastModifiedSync();\n  var inputModified = File(inputPath).lastModifiedSync();\n  return outputModified.isAfter(dependenciesModified) &&\n      outputModified.isAfter(inputModified);\n}\n\n/// The most recently modified time of all files that match [globs].\nDateTime _mostRecentlyModified(List<String> globs) {\n  DateTime latest;\n  for (var glob in globs) {\n    for (var entry in Glob(glob).listSync()) {\n      if (entry is File) {\n        var modified = entry.lastModifiedSync();\n        if (latest == null || modified.isAfter(latest)) latest = modified;\n      }\n    }\n  }\n\n  return latest;\n}\n"
  },
  {
    "path": "tool/bin/build_xml.dart",
    "content": "import 'dart:io';\n\nimport 'package:path/path.dart' as p;\n\nimport 'package:tool/src/book.dart';\nimport 'package:tool/src/format.dart';\nimport 'package:tool/src/markdown/markdown.dart';\nimport 'package:tool/src/markdown/xml_renderer.dart';\nimport 'package:tool/src/mustache.dart';\nimport 'package:tool/src/page.dart';\nimport 'package:tool/src/term.dart' as term;\n\n/// Generate the XML used to import into InDesign.\n\nFuture<void> main(List<String> arguments) async {\n  var book = Book();\n  var mustache = Mustache();\n\n  await Directory(p.join(\"build\", \"xml\")).create(recursive: true);\n\n  for (var page in book.pages) {\n    if (!page.isChapter) continue;\n\n    if (arguments.isNotEmpty && page.fileName != arguments.first) continue;\n\n    _buildPage(book, mustache, page);\n  }\n\n  // Output a minimal XML file that contains all tags used in the book.\n  var allTagsPath = p.join(\"build\", \"xml\", \"all-tags.xml\");\n  File(allTagsPath)\n      .writeAsStringSync(\"<chapter>\\n${XmlRenderer.tagFileBuffer}\\n</chapter>\");\n}\n\nvoid _buildPage(Book book, Mustache mustache, Page page) {\n  var xml = renderMarkdown(book, page, page.lines, Format.print);\n\n  // Write the output.\n  var xmlPath = p.join(\"build\", \"xml\", \"${page.fileName}.xml\");\n  File(xmlPath).writeAsStringSync(xml);\n\n  print(\"${term.green('-')} ${page.numberString}. ${page.title}\");\n}\n"
  },
  {
    "path": "tool/bin/compile_snippets.dart",
    "content": "import 'dart:io';\n\nimport 'package:path/path.dart' as p;\nimport 'package:pool/pool.dart';\n\nimport 'package:tool/src/book.dart';\nimport 'package:tool/src/code_tag.dart';\nimport 'package:tool/src/page.dart';\nimport 'package:tool/src/split_chapter.dart';\nimport 'package:tool/src/term.dart' as term;\n\n/// Tests that various snippets in the middle of chapters can be compiled without\n/// error. Ensures that, as much as possible, we have a working program at\n/// multiple points throughout the chapter.\n\n// TODO: Do this for Java chapters.\n\nvar _chapterTags = <String, List<String>>{\n  \"Chunks of Bytecode\": [\n    \"free-array\",\n    \"main-include-chunk\",\n    \"simple-instruction\",\n    \"add-constant\",\n    \"return-after-operand\",\n  ],\n  \"A Virtual Machine\": [\n    \"main-include-vm\",\n    \"vm-include-debug\",\n    \"print-return\",\n    \"main-negate\",\n  ],\n  \"Scanning on Demand\": [\n    \"init-scanner\",\n    \"error-token\",\n    \"advance\",\n    \"match\",\n    \"newline\",\n    \"peek-next\",\n    \"string\",\n    \"number\",\n    \"identifier-type\",\n    \"check-keyword\",\n  ],\n  \"Compiling Expressions\": [\n    \"expression\",\n    \"forward-declarations\",\n    \"precedence-body\",\n    \"infix\",\n    \"define-debug-print-code\",\n    \"dump-chunk\"\n  ],\n  \"Types of Values\": [\n    \"op-arithmetic\",\n    \"print-value\",\n    \"disassemble-not\",\n    \"values-equal\",\n  ],\n  \"Strings\": [\n    // \"as-string\",\n    // We could get things working earlier by moving the \"Operations on Strings\"\n    // section before \"Strings\".\n    \"value-include-object\",\n    \"vm-include-object-memory\",\n  ],\n  \"Hash Tables\": [\n    \"free-table\",\n    \"hash-string\",\n    \"table-add-all\",\n    \"table-get\",\n    \"table-delete\",\n    \"resize-increment-count\",\n  ],\n  \"Global Variables\": [\n    \"disassemble-print\",\n    \"disassemble-pop\",\n    \"synchronize\",\n    \"define-global-op\",\n   
 \"disassemble-define-global\",\n    \"disassemble-get-global\",\n    \"disassemble-set-global\",\n  ],\n  \"Local Variables\": [\n    \"local-struct\",\n    \"compiler\",\n    \"end-scope\",\n    \"add-local\",\n    \"too-many-locals\",\n    \"pop-locals\",\n    \"interpret-set-local\",\n  ],\n  \"Jumping Back and Forth\": [\n    \"jump-if-false-op\",\n    \"compile-else\",\n    \"jump-op\",\n    \"pop-end\",\n    \"jump-instruction\",\n    \"and\",\n    \"or\",\n    \"while-statement\",\n    \"loop-op\",\n    \"disassemble-loop\",\n    \"for-statement\",\n  ],\n  \"Calls and Functions\": [\n    \"as-function\",\n    \"function-type-enum\",\n    \"init-compiler\",\n    \"init-function-slot\",\n    \"return-function\",\n    \"disassemble-end\",\n    \"runtime-error-temp\",\n    \"compile-function\",\n    \"init-function-name\",\n    \"call\",\n    \"interpret\",\n    \"disassemble-call\",\n    \"return-statement\",\n    \"runtime-error-stack\",\n    \"return-from-script\",\n    \"print-native\",\n    \"define-native\",\n    \"vm-include-time\"\n  ],\n  \"Closures\": [\n    \"obj-closure\",\n    \"new-closure-h\",\n    \"print-closure\",\n    \"closure-op\",\n    \"disassemble-closure\",\n    \"interpret-closure\",\n    \"runtime-error-function\",\n    \"interpret\",\n    \"upvalue-struct\",\n    \"resolve-upvalue-recurse\",\n    \"capture-upvalues\",\n    \"debug-include-object\",\n    \"obj-upvalue\",\n    \"new-upvalue-h\",\n    \"print-upvalue\",\n    \"upvalue-fields\",\n    \"allocate-upvalue-array\",\n    \"init-upvalue-fields\",\n    \"free-upvalues\",\n    \"capture-upvalue\",\n    \"interpret-get-upvalue\",\n    \"interpret-set-upvalue\",\n    \"is-captured-field\",\n    \"init-is-captured\",\n    \"init-zero-local-is-captured\",\n    \"mark-local-captured\",\n    \"close-upvalue-op\",\n    \"disassemble-close-upvalue\",\n    \"next-field\",\n    \"init-next\",\n    \"open-upvalues-field\",\n    \"init-open-upvalues\",\n    \"look-for-existing-upvalue\",\n 
   \"insert-upvalue-in-list\",\n    \"closed-field\",\n    \"init-closed\",\n    \"return-close-upvalues\",\n  ],\n  \"Garbage Collection\": [\n    \"collect-garbage-h\",\n    \"collect-garbage\",\n    \"define-stress-gc\",\n    \"call-collect\",\n    \"define-log-gc\",\n    \"debug-log-includes\",\n    \"log-before-collect\",\n    \"log-after-collect\",\n    \"debug-log-allocate\",\n    \"log-free-object\",\n    \"mark-value-h\",\n    \"mark-object-h\",\n    \"is-marked-field\",\n    \"init-is-marked\",\n    \"log-mark-object\",\n    \"mark-table-h\",\n    \"mark-table\",\n    \"mark-closures\",\n    \"mark-open-upvalues\",\n    \"memory-include-compiler\",\n    \"compiler-include-memory\",\n    \"vm-gray-stack\",\n    \"init-gray-stack\",\n    \"free-gray-stack\",\n    \"blacken-closure\",\n    \"log-blacken-object\",\n    \"check-is-marked\",\n    \"sweep\",\n    \"unmark\",\n    \"table-remove-white-h\",\n    \"table-remove-white\",\n    \"vm-fields\",\n    \"init-gc-fields\",\n    \"updated-bytes-allocated\",\n    \"collect-on-next\",\n    \"heap-grow-factor\",\n    \"log-before-size\",\n    \"log-collected-amount\",\n    \"chunk-include-vm\",\n    \"push-string\",\n    \"pop-string\",\n    \"concatenate-peek\",\n    \"concatenate-pop\",\n  ],\n  \"Classes and Instances\": [\n    \"obj-class\",\n    \"print-class\",\n    \"class-op\",\n    \"disassemble-class\",\n    \"interpret-class\",\n    \"object-include-table\",\n    \"print-instance\",\n    \"call-class\",\n    \"property-ops\",\n    \"disassemble-property-ops\",\n    \"interpret-get-property\",\n    \"get-undefined\",\n    \"get-not-instance\",\n    \"interpret-set-property\",\n    \"set-not-instance\",\n  ],\n  \"Methods and Initializers\": [\n    \"class-methods\",\n    \"init-methods\",\n    \"free-methods\",\n    \"mark-methods\",\n    \"method-op\",\n    \"disassemble-method\",\n    \"define-method\",\n    \"obj-bound-method\",\n    \"print-bound-method\",\n    \"bind-method\",\n    
\"call-bound-method\",\n    \"this\",\n    \"slot-zero\",\n    \"method-type-enum\",\n    \"method-type\",\n    \"store-receiver\",\n    \"class-compiler-struct\",\n    \"create-class-compiler\",\n    \"pop-enclosing\",\n    \"this-outside-class\",\n    \"vm-init-string\",\n    \"init-init-string\",\n    \"mark-init-string\",\n    \"null-init-string\",\n    \"clear-init-string\",\n    \"initializer-type-enum\",\n    \"return-this\",\n    \"return-from-init\",\n    \"invoke-op\",\n    \"invoke-instruction\",\n    \"invoke-from-class\",\n    \"invoke-field\",\n  ],\n  \"Superclasses\": [\n    \"inherit-op\",\n    \"disassemble-inherit\",\n    \"interpret-inherit\",\n    \"inherit-non-class\",\n    \"synthetic-token\",\n    \"has-superclass\",\n    \"init-has-superclass\",\n    \"set-has-superclass\",\n    \"get-super-op\",\n    \"disassemble-get-super\",\n    \"interpret-get-super\",\n    \"super-invoke-op\",\n    \"disassemble-super-invoke\",\n    \"interpret-super-invoke\",\n  ],\n  \"Optimization\": [\n    \"initial-index\",\n    \"next-index\",\n    \"adjust-alloc\",\n    \"adjust-init\",\n    \"re-hash\",\n    \"adjust-free\",\n    \"table-set-grow\",\n    \"init-capacity-mask\",\n    \"add-all-loop\",\n    \"find-string-index\",\n    \"find-string-next\",\n    \"mark-table\",\n    \"remove-white\",\n    \"free-table\",\n    \"define-nan-boxing\",\n    \"end-values-equal\",\n  ],\n};\n\nvar _allPassed = true;\n\nFuture<void> main(List<String> arguments) async {\n  var watch = Stopwatch()..start();\n  var book = Book();\n  var pool = Pool(Platform.numberOfProcessors);\n  var futures = <Future<void>>[];\n\n  for (var chapterName in _chapterTags.keys) {\n    var chapter = book.findChapter(chapterName);\n\n    var tags = chapter.codeTags;\n    var tagNames = _chapterTags[chapterName];\n    if (tagNames.isNotEmpty) {\n      tags = tagNames.map((name) => book.findTag(chapter, name));\n    } else {\n      print(\"Warning, no in-chapter snippets for '$chapterName'\");\n 
   }\n\n    for (var tag in tags) {\n      futures\n          .add(pool.withResource(() => _compileChapterTag(book, chapter, tag)));\n    }\n  }\n\n  await Future.wait(futures);\n\n  print(\"Done in ${watch.elapsedMilliseconds / 1000} seconds\");\n  if (!_allPassed) exit(1);\n}\n\nFuture<void> _compileChapterTag(Book book, Page chapter, CodeTag tag) async {\n  await splitChapter(book, chapter, tag);\n\n  var buildName = \"${chapter.shortName}-${tag.directory}\";\n  var sourceDir = p.join(\"gen\", \"snippets\", chapter.shortName, tag.directory);\n\n  var makeArguments = [\n    \"-f\",\n    \"util/c.make\",\n    \"NAME=$buildName\",\n    \"MODE=release\",\n    \"SOURCE_DIR=$sourceDir\",\n    \"SNIPPET=true\"\n  ];\n\n  var result = await Process.run(\"make\", makeArguments);\n  if (result.exitCode == 0) {\n    print(\"${term.green('PASS')} ${chapter.title} / ${tag.name}\");\n  } else {\n    print(\"${term.red('FAIL')} ${chapter.title} / ${tag.name}\");\n    print(result.stdout);\n    print(result.stderr);\n    print(\"\");\n    _allPassed = false;\n  }\n}\n"
  },
  {
    "path": "tool/bin/split_chapters.dart",
    "content": "import 'package:tool/src/book.dart';\nimport 'package:tool/src/split_chapter.dart';\n\nvoid main(List<String> arguments) {\n  var book = Book();\n  for (var page in book.pages) {\n    if (page.language == null) continue;\n    splitChapter(book, page);\n  }\n}\n"
  },
  {
    "path": "tool/bin/test.dart",
    "content": "import 'dart:convert';\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:glob/glob.dart';\nimport 'package:path/path.dart' as p;\n\nimport 'package:tool/src/term.dart' as term;\n\n/// Runs the tests.\n\nfinal _expectedOutputPattern = RegExp(r\"// expect: ?(.*)\");\nfinal _expectedErrorPattern = RegExp(r\"// (Error.*)\");\nfinal _errorLinePattern = RegExp(r\"// \\[((java|c) )?line (\\d+)\\] (Error.*)\");\nfinal _expectedRuntimeErrorPattern = RegExp(r\"// expect runtime error: (.+)\");\nfinal _syntaxErrorPattern = RegExp(r\"\\[.*line (\\d+)\\] (Error.+)\");\nfinal _stackTracePattern = RegExp(r\"\\[line (\\d+)\\]\");\nfinal _nonTestPattern = RegExp(r\"// nontest\");\n\nvar _passed = 0;\nvar _failed = 0;\nvar _skipped = 0;\nvar _expectations = 0;\n\nSuite _suite;\nString _filterPath;\nString _customInterpreter;\nList<String> _customArguments;\n\nfinal _allSuites = <String, Suite>{};\nfinal _cSuites = <String>[];\nfinal _javaSuites = <String>[];\n\nclass Suite {\n  final String name;\n  final String language;\n  final String executable;\n  final List<String> args;\n  final Map<String, String> tests;\n\n  Suite(this.name, this.language, this.executable, this.args, this.tests);\n}\n\nvoid main(List<String> arguments) {\n  _defineTestSuites();\n\n  var parser = ArgParser();\n\n  parser.addOption(\"interpreter\", abbr: \"i\", help: \"Path to interpreter.\");\n  parser.addMultiOption(\"arguments\",\n      abbr: \"a\", help: \"Additional interpreter arguments.\");\n\n  var options = parser.parse(arguments);\n\n  if (options.rest.isEmpty) {\n    _usageError(parser, \"Missing suite name.\");\n  } else if (options.rest.length > 2) {\n    _usageError(\n        parser, \"Unexpected arguments '${options.rest.skip(2).join(' ')}'.\");\n  }\n\n  var suite = options.rest[0];\n  if (options.rest.length == 2) _filterPath = arguments[1];\n\n  if (options.wasParsed(\"interpreter\")) {\n    _customInterpreter = options[\"interpreter\"] as String;\n  
}\n\n  if (options.wasParsed(\"arguments\")) {\n    _customArguments = options[\"arguments\"] as List<String>;\n\n    if (_customInterpreter == null) {\n      _usageError(parser,\n          \"Must pass an interpreter path if providing custom arguments.\");\n    }\n  }\n\n  if (suite == \"all\") {\n    _runSuites(_allSuites.keys.toList());\n  } else if (suite == \"c\") {\n    _runSuites(_cSuites);\n  } else if (suite == \"java\") {\n    _runSuites(_javaSuites);\n  } else if (!_allSuites.containsKey(suite)) {\n    print(\"Unknown interpreter '$suite'\");\n    exit(1);\n  } else if (!_runSuite(suite)) {\n    exit(1);\n  }\n}\n\nvoid _usageError(ArgParser parser, String message) {\n  print(message);\n  print(\"\");\n  print(\"Usage: test.dart <suites> [filter] [custom interpreter...]\");\n  print(\"\");\n  print(\"Optional custom interpreter options:\");\n  print(parser.usage);\n  exit(1);\n}\n\nvoid _runSuites(List<String> names) {\n  var anyFailed = false;\n  for (var name in names) {\n    print(\"=== $name ===\");\n    if (!_runSuite(name)) anyFailed = true;\n  }\n\n  if (anyFailed) exit(1);\n}\n\nbool _runSuite(String name) {\n  _suite = _allSuites[name];\n\n  _passed = 0;\n  _failed = 0;\n  _skipped = 0;\n  _expectations = 0;\n\n  for (var file in Glob(\"test/**.lox\").listSync()) {\n    _runTest(file.path);\n  }\n\n  term.clearLine();\n\n  if (_failed == 0) {\n    print(\"All ${term.green(_passed)} tests passed \"\n        \"($_expectations expectations).\");\n  } else {\n    print(\"${term.green(_passed)} tests passed. \"\n        \"${term.red(_failed)} tests failed.\");\n  }\n\n  return _failed == 0;\n}\n\nvoid _runTest(String path) {\n  if (path.contains(\"benchmark\")) return;\n\n  // Make a nice short path relative to the working directory. 
Normalize it to\n  // use \"/\" since the interpreters expect the argument to use that.\n  path = p.posix.normalize(path);\n\n  // Check if we are just running a subset of the tests.\n  if (_filterPath != null) {\n    var thisTest = p.posix.relative(path, from: \"test\");\n    if (!thisTest.startsWith(_filterPath)) return;\n  }\n\n  // Update the status line.\n  var grayPath = term.gray(\"($path)\");\n  term.writeLine(\"Passed: ${term.green(_passed)} \"\n      \"Failed: ${term.red(_failed)} \"\n      \"Skipped: ${term.yellow(_skipped)} $grayPath\");\n\n  // Read the test and parse out the expectations.\n  var test = Test(path);\n\n  // See if it's a skipped or non-test file.\n  if (!test.parse()) return;\n\n  var failures = test.run();\n\n  // Display the results.\n  if (failures.isEmpty) {\n    _passed++;\n  } else {\n    _failed++;\n    term.writeLine(\"${term.red(\"FAIL\")} $path\");\n    print(\"\");\n    for (var failure in failures) {\n      print(\"     ${term.pink(failure)}\");\n    }\n    print(\"\");\n  }\n}\n\nclass ExpectedOutput {\n  final int line;\n  final String output;\n\n  ExpectedOutput(this.line, this.output);\n}\n\nclass Test {\n  final String _path;\n\n  final _expectedOutput = <ExpectedOutput>[];\n\n  /// The set of expected compile error messages.\n  final _expectedErrors = <String>{};\n\n  /// The expected runtime error message or `null` if there should not be one.\n  String _expectedRuntimeError;\n\n  /// If there is an expected runtime error, the line it should occur on.\n  int _runtimeErrorLine = 0;\n\n  int _expectedExitCode = 0;\n\n  /// The list of failure message lines.\n  final _failures = <String>[];\n\n  Test(this._path);\n\n  bool parse() {\n    // Get the path components.\n    var parts = _path.split(\"/\");\n    var subpath = \"\";\n    String state;\n\n    // Figure out the state of the test. 
We don't break out of this loop because\n    // we want lines for more specific paths to override more general ones.\n    for (var part in parts) {\n      if (subpath.isNotEmpty) subpath += \"/\";\n      subpath += part;\n\n      if (_suite.tests.containsKey(subpath)) {\n        state = _suite.tests[subpath];\n      }\n    }\n\n    if (state == null) {\n      throw \"Unknown test state for '$_path'.\";\n    } else if (state == \"skip\") {\n      _skipped++;\n      return false;\n    }\n\n    var lines = File(_path).readAsLinesSync();\n    for (var lineNum = 1; lineNum <= lines.length; lineNum++) {\n      var line = lines[lineNum - 1];\n\n      // Not a test file at all, so ignore it.\n      var match = _nonTestPattern.firstMatch(line);\n      if (match != null) return false;\n\n      match = _expectedOutputPattern.firstMatch(line);\n      if (match != null) {\n        _expectedOutput.add(ExpectedOutput(lineNum, match[1]));\n        _expectations++;\n        continue;\n      }\n\n      match = _expectedErrorPattern.firstMatch(line);\n      if (match != null) {\n        _expectedErrors.add(\"[$lineNum] ${match[1]}\");\n\n        // If we expect a compile error, it should exit with EX_DATAERR.\n        _expectedExitCode = 65;\n        _expectations++;\n        continue;\n      }\n\n      match = _errorLinePattern.firstMatch(line);\n      if (match != null) {\n        // The two interpreters are slightly different in terms of which\n        // cascaded errors may appear after an initial compile error because\n        // their panic mode recovery is a little different. 
To handle that,\n        // the tests can indicate if an error line should only appear for a\n        // certain interpreter.\n        var language = match[2];\n        if (language == null || language == _suite.language) {\n          _expectedErrors.add(\"[${match[3]}] ${match[4]}\");\n\n          // If we expect a compile error, it should exit with EX_DATAERR.\n          _expectedExitCode = 65;\n          _expectations++;\n        }\n        continue;\n      }\n\n      match = _expectedRuntimeErrorPattern.firstMatch(line);\n      if (match != null) {\n        _runtimeErrorLine = lineNum;\n        _expectedRuntimeError = match[1];\n        // If we expect a runtime error, it should exit with EX_SOFTWARE.\n        _expectedExitCode = 70;\n        _expectations++;\n      }\n    }\n\n    if (_expectedErrors.isNotEmpty && _expectedRuntimeError != null) {\n      print(\"${term.magenta('TEST ERROR')} $_path\");\n      print(\"     Cannot expect both compile and runtime errors.\");\n      print(\"\");\n      return false;\n    }\n\n    // If we got here, it's a valid test.\n    return true;\n  }\n\n  /// Invoke the interpreter and run the test.\n  List<String> run() {\n    var args = [\n      if (_customInterpreter != null) ...?_customArguments else ..._suite.args,\n      _path\n    ];\n    var result = Process.runSync(_customInterpreter ?? 
_suite.executable, args);\n\n    // Normalize Windows line endings.\n    var outputLines = const LineSplitter().convert(result.stdout as String);\n    var errorLines = const LineSplitter().convert(result.stderr as String);\n\n    // Validate that an expected runtime error occurred.\n    if (_expectedRuntimeError != null) {\n      _validateRuntimeError(errorLines);\n    } else {\n      _validateCompileErrors(errorLines);\n    }\n\n    _validateExitCode(result.exitCode, errorLines);\n    _validateOutput(outputLines);\n    return _failures;\n  }\n\n  void _validateRuntimeError(List<String> errorLines) {\n    if (errorLines.length < 2) {\n      fail(\"Expected runtime error '$_expectedRuntimeError' and got none.\");\n      return;\n    }\n\n    if (errorLines[0] != _expectedRuntimeError) {\n      fail(\"Expected runtime error '$_expectedRuntimeError' and got:\");\n      fail(errorLines[0]);\n    }\n\n    // Make sure the stack trace has the right line.\n    RegExpMatch match;\n    var stackLines = errorLines.sublist(1);\n    for (var line in stackLines) {\n      match = _stackTracePattern.firstMatch(line);\n      if (match != null) break;\n    }\n\n    if (match == null) {\n      fail(\"Expected stack trace and got:\", stackLines);\n    } else {\n      var stackLine = int.parse(match[1]);\n      if (stackLine != _runtimeErrorLine) {\n        fail(\"Expected runtime error on line $_runtimeErrorLine \"\n            \"but was on line $stackLine.\");\n      }\n    }\n  }\n\n  void _validateCompileErrors(List<String> error_lines) {\n    // Validate that every compile error was expected.\n    var foundErrors = <String>{};\n    var unexpectedCount = 0;\n    for (var line in error_lines) {\n      var match = _syntaxErrorPattern.firstMatch(line);\n      if (match != null) {\n        var error = \"[${match[1]}] ${match[2]}\";\n        if (_expectedErrors.contains(error)) {\n          foundErrors.add(error);\n        } else {\n          if (unexpectedCount < 10) {\n            
 fail(\"Unexpected error:\");\n            fail(line);\n          }\n          unexpectedCount++;\n        }\n      } else if (line != \"\") {\n        if (unexpectedCount < 10) {\n          fail(\"Unexpected output on stderr:\");\n          fail(line);\n        }\n        unexpectedCount++;\n      }\n    }\n\n    if (unexpectedCount > 10) {\n      fail(\"(truncated ${unexpectedCount - 10} more...)\");\n    }\n\n    // Validate that every expected error occurred.\n    for (var error in _expectedErrors.difference(foundErrors)) {\n      fail(\"Missing expected error: $error\");\n    }\n  }\n\n  void _validateExitCode(int exitCode, List<String> errorLines) {\n    if (exitCode == _expectedExitCode) return;\n\n    if (errorLines.length > 10) {\n      errorLines = errorLines.sublist(0, 10);\n      errorLines.add(\"(truncated...)\");\n    }\n\n    fail(\"Expected return code $_expectedExitCode and got $exitCode. Stderr:\",\n        errorLines);\n  }\n\n  void _validateOutput(List<String> outputLines) {\n    // Remove the trailing last empty line.\n    if (outputLines.isNotEmpty && outputLines.last == \"\") {\n      outputLines.removeLast();\n    }\n\n    var index = 0;\n    for (; index < outputLines.length; index++) {\n      var line = outputLines[index];\n      if (index >= _expectedOutput.length) {\n        fail(\"Got output '$line' when none was expected.\");\n        continue;\n      }\n\n      var expected = _expectedOutput[index];\n      if (expected.output != line) {\n        fail(\"Expected output '${expected.output}' on line ${expected.line} \"\n            \"and got '$line'.\");\n      }\n    }\n\n    while (index < _expectedOutput.length) {\n      var expected = _expectedOutput[index];\n      fail(\"Missing expected output '${expected.output}' on line \"\n          \"${expected.line}.\");\n      index++;\n    }\n  }\n\n  void fail(String message, [List<String> lines]) {\n    _failures.add(message);\n    if (lines != null) _failures.addAll(lines);\n  
}\n}\n\nvoid _defineTestSuites() {\n  void c(String name, Map<String, String> tests) {\n    var executable = name == \"clox\" ? \"build/cloxd\" : \"build/$name\";\n    _allSuites[name] = Suite(name, \"c\", executable, [], tests);\n    _cSuites.add(name);\n  }\n\n  void java(String name, Map<String, String> tests) {\n    var dir = name == \"jlox\" ? \"build/java\" : \"build/gen/$name\";\n    _allSuites[name] = Suite(name, \"java\", \"java\",\n        [\"-cp\", dir, \"com.craftinginterpreters.lox.Lox\"], tests);\n    _javaSuites.add(name);\n  }\n\n  // These are just for earlier chapters.\n  var earlyChapters = {\n    \"test/scanning\": \"skip\",\n    \"test/expressions\": \"skip\",\n  };\n\n  // JVM doesn't correctly implement IEEE equality on boxed doubles.\n  var javaNaNEquality = {\n    \"test/number/nan_equality.lox\": \"skip\",\n  };\n\n  // No hardcoded limits in jlox.\n  var noJavaLimits = {\n    \"test/limit/loop_too_large.lox\": \"skip\",\n    \"test/limit/no_reuse_constants.lox\": \"skip\",\n    \"test/limit/too_many_constants.lox\": \"skip\",\n    \"test/limit/too_many_locals.lox\": \"skip\",\n    \"test/limit/too_many_upvalues.lox\": \"skip\",\n\n    // Rely on JVM for stack overflow checking.\n    \"test/limit/stack_overflow.lox\": \"skip\",\n  };\n\n  // No classes in Java yet.\n  var noJavaClasses = {\n    \"test/assignment/to_this.lox\": \"skip\",\n    \"test/call/object.lox\": \"skip\",\n    \"test/class\": \"skip\",\n    \"test/closure/close_over_method_parameter.lox\": \"skip\",\n    \"test/constructor\": \"skip\",\n    \"test/field\": \"skip\",\n    \"test/inheritance\": \"skip\",\n    \"test/method\": \"skip\",\n    \"test/number/decimal_point_at_eof.lox\": \"skip\",\n    \"test/number/trailing_dot.lox\": \"skip\",\n    \"test/operator/equals_class.lox\": \"skip\",\n    \"test/operator/equals_method.lox\": \"skip\",\n    \"test/operator/not_class.lox\": \"skip\",\n    \"test/regression/394.lox\": \"skip\",\n    \"test/super\": \"skip\",\n    
\"test/this\": \"skip\",\n    \"test/return/in_method.lox\": \"skip\",\n    \"test/variable/local_from_method.lox\": \"skip\",\n  };\n\n  // No functions in Java yet.\n  var noJavaFunctions = {\n    \"test/call\": \"skip\",\n    \"test/closure\": \"skip\",\n    \"test/for/closure_in_body.lox\": \"skip\",\n    \"test/for/return_closure.lox\": \"skip\",\n    \"test/for/return_inside.lox\": \"skip\",\n    \"test/for/syntax.lox\": \"skip\",\n    \"test/function\": \"skip\",\n    \"test/operator/not.lox\": \"skip\",\n    \"test/regression/40.lox\": \"skip\",\n    \"test/return\": \"skip\",\n    \"test/unexpected_character.lox\": \"skip\",\n    \"test/while/closure_in_body.lox\": \"skip\",\n    \"test/while/return_closure.lox\": \"skip\",\n    \"test/while/return_inside.lox\": \"skip\",\n  };\n\n  // No resolution in Java yet.\n  var noJavaResolution = {\n    \"test/closure/assign_to_shadowed_later.lox\": \"skip\",\n    \"test/function/local_mutual_recursion.lox\": \"skip\",\n    \"test/variable/collide_with_parameter.lox\": \"skip\",\n    \"test/variable/duplicate_local.lox\": \"skip\",\n    \"test/variable/duplicate_parameter.lox\": \"skip\",\n    \"test/variable/early_bound.lox\": \"skip\",\n\n    // Broken because we haven\"t fixed it yet by detecting the error.\n    \"test/return/at_top_level.lox\": \"skip\",\n    \"test/variable/use_local_in_initializer.lox\": \"skip\",\n  };\n\n  // No control flow in C yet.\n  var noCControlFlow = {\n    \"test/block/empty.lox\": \"skip\",\n    \"test/for\": \"skip\",\n    \"test/if\": \"skip\",\n    \"test/limit/loop_too_large.lox\": \"skip\",\n    \"test/logical_operator\": \"skip\",\n    \"test/variable/unreached_undefined.lox\": \"skip\",\n    \"test/while\": \"skip\",\n  };\n\n  // No functions in C yet.\n  var noCFunctions = {\n    \"test/call\": \"skip\",\n    \"test/closure\": \"skip\",\n    \"test/for/closure_in_body.lox\": \"skip\",\n    \"test/for/return_closure.lox\": \"skip\",\n    \"test/for/return_inside.lox\": 
\"skip\",\n    \"test/for/syntax.lox\": \"skip\",\n    \"test/function\": \"skip\",\n    \"test/limit/no_reuse_constants.lox\": \"skip\",\n    \"test/limit/stack_overflow.lox\": \"skip\",\n    \"test/limit/too_many_constants.lox\": \"skip\",\n    \"test/limit/too_many_locals.lox\": \"skip\",\n    \"test/limit/too_many_upvalues.lox\": \"skip\",\n    \"test/regression/40.lox\": \"skip\",\n    \"test/return\": \"skip\",\n    \"test/unexpected_character.lox\": \"skip\",\n    \"test/variable/collide_with_parameter.lox\": \"skip\",\n    \"test/variable/duplicate_parameter.lox\": \"skip\",\n    \"test/variable/early_bound.lox\": \"skip\",\n    \"test/while/closure_in_body.lox\": \"skip\",\n    \"test/while/return_closure.lox\": \"skip\",\n    \"test/while/return_inside.lox\": \"skip\",\n  };\n\n  // No classes in C yet.\n  var noCClasses = {\n    \"test/assignment/to_this.lox\": \"skip\",\n    \"test/call/object.lox\": \"skip\",\n    \"test/class\": \"skip\",\n    \"test/closure/close_over_method_parameter.lox\": \"skip\",\n    \"test/constructor\": \"skip\",\n    \"test/field\": \"skip\",\n    \"test/inheritance\": \"skip\",\n    \"test/method\": \"skip\",\n    \"test/number/decimal_point_at_eof.lox\": \"skip\",\n    \"test/number/trailing_dot.lox\": \"skip\",\n    \"test/operator/equals_class.lox\": \"skip\",\n    \"test/operator/equals_method.lox\": \"skip\",\n    \"test/operator/not.lox\": \"skip\",\n    \"test/operator/not_class.lox\": \"skip\",\n    \"test/regression/394.lox\": \"skip\",\n    \"test/return/in_method.lox\": \"skip\",\n    \"test/super\": \"skip\",\n    \"test/this\": \"skip\",\n    \"test/variable/local_from_method.lox\": \"skip\",\n  };\n\n  // No inheritance in C yet.\n  var noCInheritance = {\n    \"test/class/local_inherit_other.lox\": \"skip\",\n    \"test/class/local_inherit_self.lox\": \"skip\",\n    \"test/class/inherit_self.lox\": \"skip\",\n    \"test/class/inherited_method.lox\": \"skip\",\n    \"test/inheritance\": \"skip\",\n    
\"test/regression/394.lox\": \"skip\",\n    \"test/super\": \"skip\",\n  };\n\n  java(\"jlox\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...javaNaNEquality,\n    ...noJavaLimits,\n  });\n\n  java(\"chap04_scanning\", {\n    // No interpreter yet.\n    \"test\": \"skip\",\n    \"test/scanning\": \"pass\"\n  });\n\n  // No test for chapter 5. It just has a hardcoded main() in AstPrinter.\n\n  java(\"chap06_parsing\", {\n    // No real interpreter yet.\n    \"test\": \"skip\",\n    \"test/expressions/parse.lox\": \"pass\"\n  });\n\n  java(\"chap07_evaluating\", {\n    // No real interpreter yet.\n    \"test\": \"skip\",\n    \"test/expressions/evaluate.lox\": \"pass\"\n  });\n\n  java(\"chap08_statements\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...javaNaNEquality,\n    ...noJavaLimits,\n    ...noJavaFunctions,\n    ...noJavaResolution,\n    ...noJavaClasses,\n\n    // No control flow.\n    \"test/block/empty.lox\": \"skip\",\n    \"test/for\": \"skip\",\n    \"test/if\": \"skip\",\n    \"test/logical_operator\": \"skip\",\n    \"test/while\": \"skip\",\n    \"test/variable/unreached_undefined.lox\": \"skip\",\n  });\n\n  java(\"chap09_control\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...javaNaNEquality,\n    ...noJavaLimits,\n    ...noJavaFunctions,\n    ...noJavaResolution,\n    ...noJavaClasses,\n  });\n\n  java(\"chap10_functions\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...javaNaNEquality,\n    ...noJavaLimits,\n    ...noJavaResolution,\n    ...noJavaClasses,\n  });\n\n  java(\"chap11_resolving\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...javaNaNEquality,\n    ...noJavaLimits,\n    ...noJavaClasses,\n  });\n\n  java(\"chap12_classes\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...noJavaLimits,\n    ...javaNaNEquality,\n\n    // No inheritance.\n    \"test/class/local_inherit_other.lox\": \"skip\",\n    \"test/class/local_inherit_self.lox\": \"skip\",\n    
\"test/class/inherit_self.lox\": \"skip\",\n    \"test/class/inherited_method.lox\": \"skip\",\n    \"test/inheritance\": \"skip\",\n    \"test/regression/394.lox\": \"skip\",\n    \"test/super\": \"skip\",\n  });\n\n  java(\"chap13_inheritance\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...javaNaNEquality,\n    ...noJavaLimits,\n  });\n\n  c(\"clox\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n  });\n\n  c(\"chap17_compiling\", {\n    // No real interpreter yet.\n    \"test\": \"skip\",\n    \"test/expressions/evaluate.lox\": \"pass\",\n  });\n\n  c(\"chap18_types\", {\n    // No real interpreter yet.\n    \"test\": \"skip\",\n    \"test/expressions/evaluate.lox\": \"pass\",\n  });\n\n  c(\"chap19_strings\", {\n    // No real interpreter yet.\n    \"test\": \"skip\",\n    \"test/expressions/evaluate.lox\": \"pass\",\n  });\n\n  c(\"chap20_hash\", {\n    // No real interpreter yet.\n    \"test\": \"skip\",\n    \"test/expressions/evaluate.lox\": \"pass\",\n  });\n\n  c(\"chap21_global\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...noCControlFlow,\n    ...noCFunctions,\n    ...noCClasses,\n\n    // No blocks.\n    \"test/assignment/local.lox\": \"skip\",\n    \"test/variable/in_middle_of_block.lox\": \"skip\",\n    \"test/variable/in_nested_block.lox\": \"skip\",\n    \"test/variable/scope_reuse_in_different_blocks.lox\": \"skip\",\n    \"test/variable/shadow_and_local.lox\": \"skip\",\n    \"test/variable/undefined_local.lox\": \"skip\",\n\n    // No local variables.\n    \"test/block/scope.lox\": \"skip\",\n    \"test/variable/duplicate_local.lox\": \"skip\",\n    \"test/variable/shadow_global.lox\": \"skip\",\n    \"test/variable/shadow_local.lox\": \"skip\",\n    \"test/variable/use_local_in_initializer.lox\": \"skip\",\n  });\n\n  c(\"chap22_local\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...noCControlFlow,\n    ...noCFunctions,\n    ...noCClasses,\n  });\n\n  c(\"chap23_jumping\", {\n    \"test\": \"pass\",\n 
   ...earlyChapters,\n    ...noCFunctions,\n    ...noCClasses,\n  });\n\n  c(\"chap24_calls\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...noCClasses,\n\n    // No closures.\n    \"test/closure\": \"skip\",\n    \"test/for/closure_in_body.lox\": \"skip\",\n    \"test/for/return_closure.lox\": \"skip\",\n    \"test/function/local_recursion.lox\": \"skip\",\n    \"test/limit/too_many_upvalues.lox\": \"skip\",\n    \"test/regression/40.lox\": \"skip\",\n    \"test/while/closure_in_body.lox\": \"skip\",\n    \"test/while/return_closure.lox\": \"skip\",\n  });\n\n  c(\"chap25_closures\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...noCClasses,\n  });\n\n  c(\"chap26_garbage\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...noCClasses,\n  });\n\n  c(\"chap27_classes\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...noCInheritance,\n\n    // No methods.\n    \"test/assignment/to_this.lox\": \"skip\",\n    \"test/class/local_reference_self.lox\": \"skip\",\n    \"test/class/reference_self.lox\": \"skip\",\n    \"test/closure/close_over_method_parameter.lox\": \"skip\",\n    \"test/constructor\": \"skip\",\n    \"test/field/get_and_set_method.lox\": \"skip\",\n    \"test/field/method.lox\": \"skip\",\n    \"test/field/method_binds_this.lox\": \"skip\",\n    \"test/method\": \"skip\",\n    \"test/operator/equals_class.lox\": \"skip\",\n    \"test/operator/equals_method.lox\": \"skip\",\n    \"test/return/in_method.lox\": \"skip\",\n    \"test/this\": \"skip\",\n    \"test/variable/local_from_method.lox\": \"skip\",\n  });\n\n  c(\"chap28_methods\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n    ...noCInheritance,\n  });\n\n  c(\"chap29_superclasses\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n  });\n\n  c(\"chap30_optimization\", {\n    \"test\": \"pass\",\n    ...earlyChapters,\n  });\n}\n"
  },
  {
    "path": "tool/bin/tile_pages.dart",
    "content": "import 'dart:io';\n\nimport 'package:image/image.dart';\nimport 'package:path/path.dart' as p;\n\n/// Convert a PDF to a tiled PNG image of all of the pages.\n///\n/// Requires `pdftoppm` which can be installed on Mac with:\n///\n///     brew install poppler\nFuture<void> main(List<String> arguments) async {\n  print('Exporting PDF pages to PNG...');\n  var tempDir = await Directory('.').createTemp('pages');\n\n  // The `-r` argument is DPI.\n  var result = await Process.run('pdftoppm',\n      ['-png', '-r', '40', arguments[0], p.join(tempDir.path, 'page')]);\n  if (result.exitCode != 0) {\n    print('Could not export pages:\\n${result.stdout}\\n${result.stderr}');\n  }\n\n  var pages = <Image>[];\n  var imageFiles = tempDir\n      .listSync()\n      .whereType<File>()\n      .where((entry) => entry.path.endsWith('.png'))\n      .toList();\n  imageFiles.sort((a, b) => a.path.compareTo(b.path));\n\n  for (var imageFile in imageFiles) {\n    print('Reading ${imageFile.path}...');\n    var bytes = await imageFile.readAsBytes();\n    pages.add(decodePng(bytes));\n  }\n\n  const columns = 36;\n  const rows = 18;\n  const border = 4;\n\n  var pageWidth = pages.first.width;\n  var pageHeight = pages.first.height;\n\n  var tiled = Image.rgb((pageWidth + border) * columns + border,\n      (pageHeight + border) * rows + border);\n  tiled.fill(Color.fromRgb(0, 0, 0));\n\n  for (var i = 0; i < pages.length; i++) {\n    var x = i % columns;\n    var y = i ~/ columns;\n    print('Tiling page ${i + 1} ($x, $y)...');\n    copyInto(tiled, pages[i],\n        dstX: x * (pageWidth + border) + border,\n        dstY: y * (pageHeight + border) + border);\n  }\n\n  print('Writing pages.png...');\n  await File('pages.png').writeAsBytes(encodePng(tiled));\n\n  await tempDir.delete(recursive: true);\n}\n"
  },
  {
    "path": "tool/lib/src/book.dart",
    "content": "import 'code_tag.dart';\nimport 'location.dart';\nimport 'page.dart';\nimport 'snippet.dart';\nimport 'source_file_parser.dart';\nimport 'text.dart';\n\nimport 'package:glob/glob.dart';\nimport 'package:path/path.dart' as p;\n\nconst _tableOfContents = {\n  '': [\n    'Crafting Interpreters',\n    'Dedication',\n    'Acknowledgements',\n    'Table of Contents',\n  ],\n  'Welcome': [\n    'Introduction',\n    'A Map of the Territory',\n    'The Lox Language',\n  ],\n  'A Tree-Walk Interpreter': [\n    'Scanning',\n    'Representing Code',\n    'Parsing Expressions',\n    'Evaluating Expressions',\n    'Statements and State',\n    'Control Flow',\n    'Functions',\n    'Resolving and Binding',\n    'Classes',\n    'Inheritance',\n  ],\n  'A Bytecode Virtual Machine': [\n    'Chunks of Bytecode',\n    'A Virtual Machine',\n    'Scanning on Demand',\n    'Compiling Expressions',\n    'Types of Values',\n    'Strings',\n    'Hash Tables',\n    'Global Variables',\n    'Local Variables',\n    'Jumping Back and Forth',\n    'Calls and Functions',\n    'Closures',\n    'Garbage Collection',\n    'Classes and Instances',\n    'Methods and Initializers',\n    'Superclasses',\n    'Optimization',\n  ],\n  'Backmatter': [\n    'Appendix I',\n    'Appendix II',\n  ],\n};\n\n/// The contents of the Markdown and source files for the book, loaded and\n/// parsed.\nclass Book {\n  final List<Page> parts = [];\n  final List<Page> frontmatter = [];\n  final List<Page> pages = [];\n\n  final Map<CodeTag, Snippet> _snippets = {};\n\n  Book() {\n    var partIndex = 1;\n    var chapterIndex = 1;\n    var inMatter = false;\n\n    // Load the pages.\n    for (var part in _tableOfContents.keys) {\n      // Front- and backmatter have no names, pages, or numbers.\n      var partNumber = \"\";\n      inMatter = part == \"\" || part == \"Backmatter\";\n      if (!inMatter) {\n        partNumber = partIndex.roman;\n        partIndex += 1;\n      }\n\n      // There is no part 
page for the frontmatter.\n      Page partPage;\n      if (part != \"\") {\n        partPage = Page(part, null, partNumber, pages.length);\n        pages.add(partPage);\n        parts.add(partPage);\n      }\n\n      for (var chapter in _tableOfContents[part]) {\n        var chapterNumber = \"\";\n        if (inMatter) {\n          // Front- and backmatter chapters are specially numbered.\n          if (chapter == \"Appendix I\") {\n            chapterNumber = \"A1\";\n          } else if (chapter == \"Appendix II\") {\n            chapterNumber = \"A2\";\n          }\n        } else {\n          chapterNumber = chapterIndex.toString();\n          chapterIndex++;\n        }\n\n        var page = Page(chapter, partPage, chapterNumber, pages.length);\n        pages.add(page);\n        if (partPage != null) {\n          partPage.chapters.add(page);\n        } else {\n          frontmatter.add(page);\n        }\n      }\n    }\n\n    // Load the source files.\n    for (var language in [\"java\", \"c\"]) {\n      for (var file in Glob(\"$language/**.{c,h,java}\").listSync()) {\n        var shortPath = p.relative(file.path, from: language);\n        var sourceFile = SourceFileParser(this, file.path, shortPath).parse();\n\n        // Create snippets from the lines in the file.\n        var lineIndex = 0;\n        for (var line in sourceFile.lines) {\n          var snippet = _snippets.putIfAbsent(\n              line.start, () => Snippet(sourceFile, line.start));\n          snippet.addLine(lineIndex, line);\n\n          if (line.end != null) {\n            var endSnippet = _snippets.putIfAbsent(\n                line.end, () => Snippet(sourceFile, line.end));\n            endSnippet.removeLine(lineIndex, line);\n          }\n\n          lineIndex++;\n        }\n      }\n    }\n\n    for (var snippet in _snippets.values) {\n      if (snippet.tag.name == \"not-yet\") continue;\n      if (snippet.tag.name == \"omit\") continue;\n      snippet.calculateContext();\n    }\n  
}\n\n  /// Looks for a page with [title].\n  Page findChapter(String title) =>\n      pages.firstWhere((page) => page.title == title);\n\n  /// Looks for a page with [number].\n  Page findNumber(String number) =>\n      pages.firstWhere((page) => page.numberString == number);\n\n  /// Gets the [Page] [offset] pages before or after this one.\n  Page adjacentPage(Page start, int offset) {\n    var index = pages.indexOf(start) + offset;\n    if (index < 0 || index >= pages.length) return null;\n    return pages[index];\n  }\n\n  Snippet findSnippet(CodeTag tag) => _snippets[tag];\n\n  /// Gets the last snippet that appears in [page].\n  ///\n  /// Note: Not very fast.\n  Snippet lastSnippet(Page page) {\n    Snippet last;\n    for (var snippet in _snippets.values) {\n      if (snippet.tag.chapter != page) continue;\n      if (last == null || snippet.tag > last.tag) last = snippet;\n    }\n\n    return last;\n  }\n\n  /// Finds the [CodeTag] with [name] on [page].\n  ///\n  /// Note: Not very fast.\n  CodeTag findTag(Page page, String name) {\n    for (var tag in _snippets.keys) {\n      if (tag.chapter != page) continue;\n      if (tag.name == name) return tag;\n    }\n\n    throw ArgumentError(\"Could not find tag '$name' in '${page.title}'.\");\n  }\n}\n\n/// A single source file whose code is included in the book.\nclass SourceFile {\n  final String path;\n  final List<SourceLine> lines = [];\n\n  SourceFile(this.path);\n\n  String get language => path.endsWith(\"java\") ? 
\"java\" : \"c\";\n\n  String get nicePath => path.replaceAll(\"com/craftinginterpreters/\", \"\");\n}\n\n/// A line of code in a [SourceFile] and the metadata for it.\nclass SourceLine {\n  final String text;\n  final Location location;\n\n  /// The first snippet where this line appears in the book.\n  final CodeTag start;\n\n  /// The last snippet where this line is removed, or null if the line reaches\n  /// the end of the book.\n  final CodeTag end;\n\n  SourceLine(this.text, this.location, this.start, this.end);\n\n  /// Returns true if this line exists by the time we reach [tag].\n  bool isPresent(CodeTag tag) {\n    // If we haven't reached this line's snippet yet.\n    if (tag < start) return false;\n\n    // If we are past the snippet where it is removed.\n    if (end != null && tag >= end) return false;\n\n    return true;\n  }\n\n  String toString() {\n    var result = \"${text.padRight(72)} // $start\";\n    if (end != null) result += \" < $end\";\n    return result;\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/code_tag.dart",
    "content": "import 'page.dart';\n\nclass CodeTag with Ordering<CodeTag> implements Comparable<CodeTag> {\n  final Page chapter;\n  final String name;\n\n  /// The zero-based index of the tag in the order that it appears on the page.\n  final int _index;\n\n  /// Number of preceding lines of context to show.\n  final int beforeCount;\n\n  /// Number of trailing lines of context to show.\n  final int afterCount;\n\n  /// Whether to show location information.\n  final bool showLocation;\n\n  factory CodeTag(Page chapter, String name, int index, int beforeCount,\n      int afterCount, bool showLocation) {\n    // Hackish. Always want \"not-yet\" to be the last tag even if it appears\n    // before a real tag. That ensures we can push it for other tags that have\n    // been named.\n    if (name == \"not-yet\") index = 9999;\n\n    return CodeTag._(\n        chapter, name, index, beforeCount, afterCount, showLocation);\n  }\n\n  CodeTag._(this.chapter, this.name, this._index, this.beforeCount,\n      this.afterCount, this.showLocation);\n\n  /// Gets the name of the directory used for this tag when the code is split\n  /// at this tag's snippet.\n  String get directory {\n    var index = _index.toString().padLeft(2, \"0\");\n    return \"$index-$name\";\n  }\n\n  int compareTo(CodeTag other) {\n    if (chapter.ordinal != other.chapter.ordinal) {\n      return chapter.ordinal.compareTo(other.chapter.ordinal);\n    }\n\n    return _index.compareTo(other._index);\n  }\n\n  String toString() => \"Tag(${chapter.ordinal}|$_index: $chapter $name)\";\n}\n\n/// Implements the comparison operators in terms of [compareTo()].\nmixin Ordering<T> implements Comparable<T> {\n  bool operator <(T other) => compareTo(other) < 0;\n  bool operator <=(T other) => compareTo(other) <= 0;\n  bool operator >(T other) => compareTo(other) > 0;\n  bool operator >=(T other) => compareTo(other) >= 0;\n}\n"
  },
  {
    "path": "tool/lib/src/format.dart",
    "content": "/// The book format being rendered to.\nenum Format {\n  /// HTML for the web.\n  web,\n\n  /// XML for importing into InDesign.\n  print,\n}\n\nextension FormatExtension on Format {\n  bool get isWeb => this == Format.web;\n  bool get isPrint => this == Format.print;\n}\n"
  },
  {
    "path": "tool/lib/src/location.dart",
    "content": "/// The context in which a line of code appears. The chain of types and\n/// functions it's in.\nclass Location {\n  final Location parent;\n  final String kind;\n  String _name;\n  final String signature;\n\n  /// If [kind] is \"method\" or \"function\" then this tracks where we are\n  /// declaring or defining the function.\n  final bool isFunctionDeclaration;\n\n  Location(this.parent, this.kind, this._name,\n      {this.signature, this.isFunctionDeclaration = false});\n\n  String get name => _name;\n\n  set name(String value) {\n    // Can only set the name if it's an unnamed typedef.\n    assert(_name == null);\n    _name = value;\n  }\n\n  bool get isFile => kind == \"file\";\n\n  bool get isFunction =>\n      const {\"constructor\", \"function\", \"method\"}.contains(kind);\n\n  int get depth {\n    var current = this;\n    var result = 0;\n    while (current != null) {\n      result++;\n      current = current.parent;\n    }\n    return result;\n  }\n\n  String toString() {\n    var result = \"$kind $name\";\n    if (signature != null) result += \"($signature)\";\n    if (parent != null) result = \"$parent > $result\";\n    return result;\n  }\n\n  /// Generates a string of HTML that describes a snippet at this location,\n  /// when following the [preceding] location.\n  String toHtml(Location preceding, List<String> removed) {\n    if (kind == \"new\") return \"create new file\";\n    if (kind == \"top\") return \"add to top of file\";\n\n    // Note: The order of these is highly significant.\n    if (kind == \"class\" && parent?.kind == \"class\") {\n      return \"nest inside class <em>${parent.name}</em>\";\n    }\n\n    if (isFunction && preceding == this) {\n      //  Hack. There's one place where we add a new overload and that shouldn't\n      //  be treated as in the same function. But we can't always look at the\n      //  signature because there's another place where a change signature would\n      //  confuse the build script. 
So just check for the one-off case here.\n      if (name == \"resolve\" && signature == \"Expr expr\") {\n        return \"add after <em>${preceding.name}</em>(${preceding.signature})\";\n      }\n\n      // We're still inside a function.\n      return \"in <em>$name</em>()\";\n    }\n\n    if (isFunction && removed.isNotEmpty) {\n      // Hack. We don't appear to be in the middle of a function, but we are\n      // replacing lines, so assume we're replacing the entire function.\n      return \"$kind <em>$name</em>()\";\n    }\n\n    if (parent == preceding && !preceding.isFile) {\n      // We're nested inside a type.\n      return \"in ${preceding.kind} <em>${preceding.name}</em>\";\n    }\n\n    if (preceding == this && !isFile) {\n      // We're still inside a type.\n      return \"in $kind <em>$name</em>\";\n    }\n\n    if (preceding.isFunction) {\n      // We aren't inside a function, but we do know the preceding one.\n      return \"add after <em>${preceding.name}</em>()\";\n    }\n\n    if (!preceding.isFile) {\n      // We aren't inside any function, but we do know what we follow.\n      return \"add after ${preceding.kind} <em>${preceding.name}</em>\";\n    }\n\n    // If we get here, there isn't a useful location to show. The snippet will\n    // have enough surrounding context to make it clear. 
This is usually stuff\n    // like imports or includes near the top of the file.\n    return null;\n  }\n\n  /// Generates a string of InDesign XML that describes a snippet at this\n  /// location, when following the [preceding] location.\n  ///\n  /// This is similar to [toHtml] but uses different tags and places the\n  /// signatures inside the tags instead of outside.\n  String toXml(Location preceding, List<String> removed) {\n    if (kind == \"new\") return \"create new file\";\n    if (kind == \"top\") return \"add to top of file\";\n\n    // Note: The order of these is highly significant.\n    if (kind == \"class\" && parent?.kind == \"class\") {\n      return \"nest inside class <location-type>${parent.name}</location-type>\";\n    }\n\n    if (isFunction && preceding == this) {\n      //  Hack. There's one place where we add a new overload and that shouldn't\n      //  be treated as in the same function. But we can't always look at the\n      //  signature because there's another place where a changed signature\n      //  would confuse the build script. So just check for the one-off case\n      //  here.\n      if (name == \"resolve\" && signature == \"Expr expr\") {\n        return \"add after <location-fn>${preceding.name}\"\n            \"(${preceding.signature})</location-fn>\";\n      }\n\n      // We're still inside a function.\n      return \"in <location-fn>$name()</location-fn>\";\n    }\n\n    if (isFunction && removed.isNotEmpty) {\n      // Hack. 
We don't appear to be in the middle of a function, but we are\n      // replacing lines, so assume we're replacing the entire function.\n      return \"$kind <location-fn>$name()</location-fn>\";\n    }\n\n    if (parent == preceding && !preceding.isFile) {\n      // We're nested inside a type.\n      return \"in ${preceding.kind} \"\n          \"<location-type>${preceding.name}</location-type>\";\n    }\n\n    if (preceding == this && !isFile) {\n      // We're still inside a type.\n      return \"in $kind <location-type>$name</location-type>\";\n    }\n\n    if (preceding.isFunction) {\n      // We aren't inside a function, but we do know the preceding one.\n      return \"add after <location-fn>${preceding.name}()</location-fn>\";\n    }\n\n    if (!preceding.isFile) {\n      // We aren't inside any function, but we do know what we follow.\n      if (preceding.isFunction) {\n        return \"add after ${preceding.kind} \"\n            \"<location-fn>${preceding.name}()</location-fn>\";\n      } else {\n        return \"add after ${preceding.kind} \"\n            \"<location-type>${preceding.name}</location-type>\";\n      }\n    }\n\n    // If we get here, there isn't a useful location to show. The snippet will\n    // have enough surrounding context to make it clear. This is usually stuff\n    // like imports or includes near the top of the file.\n    return null;\n  }\n\n  bool operator ==(Object other) {\n    // Note: Signature is deliberately not considered part of equality. 
There's\n    // a case in calls-and-functions where the signature of a function changes\n    // and it confuses the build script if we treat the signatures as\n    // significant.\n    return other is Location && kind == other.kind && name == other.name;\n  }\n\n  int get hashCode => kind.hashCode ^ name.hashCode;\n\n  /// Discard as many children as needed to get to [depth] parents.\n  Location popToDepth(int depth) {\n    var current = this;\n    var locations = <Location>[];\n    while (current != null) {\n      locations.add(current);\n      current = current.parent;\n    }\n\n    // If we are already shallower, there is nothing to pop.\n    if (locations.length < depth + 1) return this;\n\n    return locations[locations.length - depth - 1];\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/markdown/block_syntax.dart",
    "content": "import 'package:markdown/markdown.dart';\n\nimport '../format.dart';\nimport '../page.dart';\n\n/// Parses atx-style headers like `## Header` and gives them the book's special\n/// handling:\n///\n/// - Generates anchor links.\n/// - Includes the section numbers.\nclass BookHeaderSyntax extends BlockSyntax {\n  /// Leading `#` define atx-style headers.\n  static final _headerPattern = RegExp(r'^(#{1,6}) (.*)$');\n\n  final Page _page;\n  final Format _format;\n\n  RegExp get pattern => _headerPattern;\n\n  BookHeaderSyntax(this._page, this._format);\n\n  Node parse(BlockParser parser) {\n    var header = _page.headers[parser.current];\n    parser.advance();\n\n    if (_format.isPrint) {\n      return Element(\"h${header.level}\", [UnparsedContent(header.name)]);\n    }\n\n    var number = \"\";\n    if (!header.isSpecial) {\n      number = \"${_page.numberString}&#8202;.&#8202;${header.headerIndex}\";\n      if (header.subheaderIndex != null) {\n        number += \"&#8202;.&#8202;${header.subheaderIndex}\";\n      }\n    }\n\n    var link = Element(\"a\", [\n      if (!header.isSpecial) Element(\"small\", [Text(number)]),\n      UnparsedContent(header.name)\n    ]);\n    link.attributes[\"href\"] = \"#${header.anchor}\";\n    link.attributes[\"id\"] = header.anchor;\n\n    return Element(\"h${header.level}\", [link]);\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/markdown/code_syntax.dart",
    "content": "import 'package:markdown/markdown.dart';\n\nimport '../book.dart';\nimport '../code_tag.dart';\nimport '../format.dart';\nimport '../page.dart';\nimport '../snippet.dart';\nimport '../syntax/highlighter.dart';\nimport '../text.dart';\n\n/// Custom code block formatter that uses our syntax highlighter.\nclass HighlightedCodeBlockSyntax extends BlockSyntax {\n  static final _codeFencePattern = RegExp(r'^(\\s*)```(.*)$');\n\n  final Format _format;\n\n  RegExp get pattern => _codeFencePattern;\n\n  HighlightedCodeBlockSyntax(this._format);\n\n  bool canParse(BlockParser parser) =>\n      pattern.firstMatch(parser.current) != null;\n\n  List<String> parseChildLines(BlockParser parser) {\n    var childLines = <String>[];\n    parser.advance();\n\n    while (!parser.isDone) {\n      var match = pattern.firstMatch(parser.current);\n      if (match == null) {\n        childLines.add(parser.current);\n        parser.advance();\n      } else {\n        parser.advance();\n        break;\n      }\n    }\n\n    return childLines;\n  }\n\n  Node parse(BlockParser parser) {\n    // Get the syntax identifier, if there is one.\n    var match = pattern.firstMatch(parser.current);\n    var indent = match[1].length;\n    var language = match[2];\n\n    var childLines = parseChildLines(parser);\n\n    String code;\n    if (language == \"text\") {\n      // Don't syntax highlight text.\n      var buffer = StringBuffer();\n      if (!_format.isPrint) {\n        buffer.write(\"<pre>\");\n\n        // The HTML spec mandates that a leading newline after '<pre>' is\n        // ignored.\n        // https://html.spec.whatwg.org/#element-restrictions\n        // Some snippets deliberately start with a newline which needs to be\n        // preserved, so output an extra (discarded) newline in that case.\n        if (_format.isWeb && childLines.first.isEmpty) buffer.writeln();\n      }\n\n      for (var line in childLines) {\n        // Strip off any leading indentation.\n        
if (line.length > indent) line = line.substring(indent);\n        checkLineLength(line);\n\n        buffer.write(line.escapeHtml);\n        if (_format.isPrint) {\n          // Soft break, so that the code stays one paragraph.\n          buffer.write(\"&#x2028;\");\n        } else {\n          buffer.writeln();\n        }\n      }\n\n      if (!_format.isPrint) buffer.write(\"</pre>\");\n\n      code = buffer.toString();\n    } else {\n      code = formatCode(language, childLines, _format, indent: indent);\n    }\n\n    if (_format.isPrint) {\n      // Remove the trailing newline since we'll write a newline after the\n      // \"</pre>\" and we don't want InDesign to insert a blank paragraph.\n      code = code.trimTrailingNewline();\n\n      // Replace newlines with soft breaks so that InDesign treats the entire\n      // snippet as a single paragraph and keeps it together.\n      code = code.replaceAll(\"\\n\", \"&#x2028;\");\n\n      // Don't wrap in a div for XML.\n      return Element.text(\"pre\", code);\n    }\n\n    var element = Element.text(\"div\", code);\n    element.attributes[\"class\"] = \"codehilite\";\n    return element;\n  }\n}\n\n/// Recognizes `^code` tags and inserts the relevant snippet.\nclass CodeTagBlockSyntax extends BlockSyntax {\n  static final _startPattern = RegExp(r'\\^code ([a-z0-9-]+)');\n\n  final Book _book;\n  final Page _page;\n  final Format _format;\n\n  CodeTagBlockSyntax(this._book, this._page, this._format);\n\n  RegExp get pattern => _startPattern;\n\n  bool canParse(BlockParser parser) =>\n      pattern.firstMatch(parser.current) != null;\n\n  Node parse(BlockParser parser) {\n    var match = pattern.firstMatch(parser.current);\n    var name = match[1];\n    parser.advance();\n\n    var codeTag = _page.findCodeTag(name);\n    String snippet;\n    if (_format.isPrint) {\n      snippet = _buildSnippetXml(codeTag, _book.findSnippet(codeTag));\n    } else {\n      snippet = _buildSnippet(_format, codeTag, 
_book.findSnippet(codeTag));\n    }\n    return Text(snippet);\n  }\n}\n\nString _buildSnippet(Format format, CodeTag tag, Snippet snippet) {\n  // NOTE: If you change this, be sure to update the baked in example snippet\n  // in introduction.md.\n\n  if (snippet == null) {\n    print(\"Undefined snippet ${tag.name}\");\n    return \"<strong>ERROR: Missing snippet ${tag.name}</strong>\\n\";\n  }\n\n  var location = <String>[];\n  if (tag.showLocation) location = snippet.locationHtmlLines;\n\n  var buffer = StringBuffer();\n  buffer.write('<div class=\"codehilite\">');\n\n  if (snippet.contextBefore.isNotEmpty) {\n    _writeContextHtml(format, buffer, snippet.contextBefore,\n        cssClass: snippet.added.isNotEmpty ? \"insert-before\" : null);\n  }\n\n  if (snippet.addedComma != null) {\n    var commaLine = formatCode(\n        snippet.file.language, [snippet.addedComma], format,\n        preClass: \"insert-before\");\n    var comma = commaLine.lastIndexOf(\",\");\n    buffer.write(commaLine.substring(0, comma));\n    buffer.write('<span class=\"insert-comma\">,</span>');\n    buffer.write(commaLine.substring(comma + 1));\n  }\n\n  if (tag.showLocation) {\n    var lines = location.join(\"<br>\\n\");\n    buffer.writeln('<div class=\"source-file\">$lines</div>');\n  }\n\n  if (snippet.added != null) {\n    var added = formatCode(snippet.file.language, snippet.added, format,\n        preClass: tag.beforeCount > 0 || tag.afterCount > 0 ? \"insert\" : null);\n    buffer.write(added);\n  }\n\n  if (snippet.contextAfter.isNotEmpty) {\n    _writeContextHtml(format, buffer, snippet.contextAfter,\n        cssClass: snippet.added.isNotEmpty ? 
\"insert-after\" : null);\n  }\n\n  buffer.writeln('</div>');\n\n  if (tag.showLocation) {\n    var lines = location.join(\", \");\n    buffer.writeln('<div class=\"source-file-narrow\">$lines</div>');\n  }\n\n  return buffer.toString();\n}\n\nString _buildSnippetXml(CodeTag tag, Snippet snippet) {\n  var buffer = StringBuffer();\n\n  if (tag.showLocation) buffer.writeln(snippet.locationXml);\n\n  if (snippet.contextBefore.isNotEmpty) {\n    _writeContextXml(buffer, snippet.contextBefore, \"before\");\n  }\n\n  if (snippet.addedComma != null) {\n    // TODO: How should this look in print?\n    buffer.write(\"TODO added comma\");\n//    var commaLine = formatCode(snippet.file.language, [snippet.addedComma],\n//        preClass: \"insert-before\", xml: true);\n//    var comma = commaLine.lastIndexOf(\",\");\n//    buffer.write(commaLine.substring(0, comma));\n//    buffer.write('<span class=\"insert-comma\">,</span>');\n//    buffer.write(commaLine.substring(comma + 1));\n  }\n\n  if (snippet.added != null) {\n    // Use different tags based on whether there is context before, after,\n    // neither, or both.\n    String insertTag;\n    if (tag.beforeCount > 0) {\n      if (tag.afterCount > 0) {\n        insertTag = \"interpreter-between\";\n      } else {\n        insertTag = \"interpreter-after\";\n      }\n    } else {\n      if (tag.afterCount > 0) {\n        insertTag = \"interpreter-before\";\n      } else {\n        insertTag = \"interpreter\";\n      }\n    }\n\n    if (snippet.contextBefore.isNotEmpty) buffer.writeln();\n    buffer.write(\"<$insertTag>\");\n\n    var code = formatCode(snippet.file.language, snippet.added, Format.print);\n    // Discard the trailing newline so we don't end up with a blank paragraph\n    // in InDesign.\n    code = code.trimTrailingNewline();\n\n    // Replace newlines with soft breaks so that InDesign treats the entire\n    // snippet as a single paragraph and keeps it together.\n    code = code.replaceAll(\"\\n\", 
\"&#x2028;\");\n\n    buffer.write(code);\n    buffer.write(\"</$insertTag>\");\n  }\n\n  if (snippet.contextAfter.isNotEmpty) {\n    buffer.writeln();\n    _writeContextXml(buffer, snippet.contextAfter, \"after\");\n  }\n\n  return buffer.toString();\n}\n\nvoid _writeContextHtml(Format format, StringBuffer buffer, List<String> lines,\n    {String cssClass}) {\n  buffer.write(\"<pre\");\n  if (cssClass != null) buffer.write(' class=\"$cssClass\"');\n  buffer.write(\">\");\n\n  // The HTML spec mandates that a leading newline after '<pre>' is ignored.\n  // https://html.spec.whatwg.org/#element-restrictions\n  // Some snippets deliberately start with a newline which needs to be\n  // preserved, so output an extra (discarded) newline in that case.\n  if (format.isWeb && lines.first.isEmpty) buffer.writeln();\n\n  for (var line in lines) {\n    buffer.writeln(line.escapeHtml);\n  }\n\n  buffer.write(\"</pre>\");\n}\n\nvoid _writeContextXml(StringBuffer buffer, List<String> lines, String tag) {\n  if (lines.isEmpty) return;\n\n  buffer.write(\"<context-$tag>\");\n  var first = true;\n  for (var line in lines) {\n    // Soft break, so that the context stays one paragraph.\n    if (!first) buffer.write(\"&#x2028;\");\n    first = false;\n    buffer.write(line.escapeHtml);\n  }\n  buffer.write(\"</context-$tag>\");\n}\n"
  },
  {
    "path": "tool/lib/src/markdown/html_renderer.dart",
    "content": "import 'package:markdown/markdown.dart';\n\n/// Custom Markdown to HTML renderer with some tweaks for the output we want.\nclass HtmlRenderer implements NodeVisitor {\n  static const _blockTags = {\n    \"blockquote\",\n    \"div\",\n    \"h1\",\n    \"h2\",\n    \"h3\",\n    \"h4\",\n    \"h5\",\n    \"h6\",\n    \"hr\",\n    \"li\",\n    \"ol\",\n    \"p\",\n    \"pre\",\n    \"ul\",\n  };\n\n  StringBuffer buffer;\n\n  final _elementStack = <Element>[];\n  String _lastVisitedTag;\n\n  String render(List<Node> nodes) {\n    buffer = StringBuffer();\n\n    for (final node in nodes) {\n      node.accept(this);\n    }\n\n    buffer.writeln();\n\n    return buffer.toString();\n  }\n\n  void visitText(Text text) {\n    var content = text.text;\n\n    // Put a newline before inline HTML markup for block-level tags.\n    if (content.startsWith(\"<aside\") ||\n        content.startsWith(\"</aside\") ||\n        content.startsWith(\"<div\") ||\n        content.startsWith(\"</div\")) {\n      buffer.writeln();\n    }\n\n    if (const ['p', 'li'].contains(_lastVisitedTag)) {\n      content = content.trimLeft();\n    }\n    buffer.write(content);\n\n    _lastVisitedTag = null;\n  }\n\n  bool visitElementBefore(Element element) {\n    // Separate block-level elements with newlines.\n    if (buffer.isNotEmpty && _blockTags.contains(element.tag)) {\n      buffer.writeln();\n    }\n\n    buffer.write('<${element.tag}');\n\n    for (var entry in element.attributes.entries) {\n      buffer.write(' ${entry.key}=\"${entry.value}\"');\n    }\n\n    _lastVisitedTag = element.tag;\n\n    if (element.isEmpty) {\n      // Empty element like <hr/>.\n      buffer.write(' />');\n      if (element.tag == 'br') buffer.write('\\n');\n      return false;\n    } else {\n      _elementStack.add(element);\n      buffer.write('>');\n      return true;\n    }\n  }\n\n  void visitElementAfter(Element element) {\n    assert(identical(_elementStack.last, element));\n\n    if 
(element.children != null &&\n        element.children.isNotEmpty &&\n        _blockTags.contains(_lastVisitedTag) &&\n        _blockTags.contains(element.tag)) {\n      buffer.writeln();\n    } else if (element.tag == 'blockquote') {\n      buffer.writeln();\n    }\n    buffer.write('</${element.tag}>');\n\n    _lastVisitedTag = _elementStack.removeLast().tag;\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/markdown/inline_syntax.dart",
    "content": "import 'package:charcode/ascii.dart';\nimport 'package:markdown/markdown.dart';\n\nimport '../format.dart';\n\nclass EllipseSyntax extends InlineSyntax {\n  final Format _format;\n\n  EllipseSyntax(this._format) : super(r\"\\.\\.\\. ?\", startCharacter: $dot);\n\n  bool onMatch(InlineParser parser, Match match) {\n    // A Unicode ellipsis doesn't have as much space between the dots as\n    // Chicago style mandates so do our own thing.\n    parser.addNode(Text(_format.isPrint\n        ? \"&thinsp;.&thinsp;.&thinsp;.&thinsp;\"\n        : '<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.&nbsp;</span>'));\n    return true;\n  }\n}\n\nclass ApostropheSyntax extends InlineSyntax {\n  final Format _format;\n\n  ApostropheSyntax(this._format) : super(r\"'\", startCharacter: $apostrophe);\n\n  bool onMatch(InlineParser parser, Match match) {\n    var before = -1;\n    if (parser.pos > 0) {\n      before = parser.charAt(parser.pos - 1);\n    }\n    var after = -1;\n    if (parser.pos < parser.source.length - 1) {\n      after = parser.charAt(parser.pos + 1);\n    }\n\n    var isRight = _isRight(before, after);\n    String quote;\n    if (_format.isPrint) {\n      quote = isRight ? \"#8217\" : \"#8216\";\n    } else {\n      quote = isRight ? 
\"rsquo\" : \"lsquo\";\n    }\n    parser.addNode(Text(\"&$quote;\"));\n    return true;\n  }\n\n  bool _isRight(int before, int after) {\n    // Years like \"the '60s\".\n    if (before == $space && after >= $0 && after <= $9) return true;\n\n    // Possessive after code.\n    if (before == $backquote && after == $s) return true;\n\n    if (before == $space) return false;\n    if (before == $lf) return false;\n\n    // Default to right.\n    return true;\n  }\n}\n\nclass SmartQuoteSyntax extends InlineSyntax {\n  final Format _format;\n\n  SmartQuoteSyntax(this._format) : super(r'\"', startCharacter: $double_quote);\n\n  bool onMatch(InlineParser parser, Match match) {\n    var before = -1;\n    if (parser.pos > 0) {\n      before = parser.charAt(parser.pos - 1);\n    }\n    var after = -1;\n    if (parser.pos < parser.source.length - 1) {\n      after = parser.charAt(parser.pos + 1);\n    }\n\n    var isRight = _isRight(before, after);\n    String quote;\n    if (_format.isPrint) {\n      quote = isRight ? \"#8221\" : \"#8220\";\n    } else {\n      quote = isRight ? \"rdquo\" : \"ldquo\";\n    }\n\n    parser.addNode(Text(\"&$quote;\"));\n    return true;\n  }\n\n  bool _isRight(int before, int after) {\n    if (after == $space) return true;\n    if (before >= $a && before <= $z) return true;\n    if (before >= $A && before <= $Z) return true;\n    if (before >= $0 && before <= $9) return true;\n    if (before == $dot) return true;\n    if (before == $question) return true;\n    if (before == $exclamation) return true;\n\n    if (after == $colon) return true;\n    if (after == $comma) return true;\n    if (after == $dot) return true;\n\n    // Default to left.\n    return false;\n  }\n}\n\nclass EmDashSyntax extends InlineSyntax {\n  final Format _format;\n\n  EmDashSyntax(this._format) : super(r\"\\s--\\s\");\n\n  bool onMatch(InlineParser parser, Match match) {\n    parser.addNode(\n        Text(_format.isPrint ? 
'—' : '<span class=\"em\">&mdash;</span>'));\n    return true;\n  }\n}\n\n/// Remove newlines in paragraphs and turn them into spaces since InDesign\n/// treats them as line breaks.\nclass NewlineSyntax extends InlineSyntax {\n  NewlineSyntax() : super(\"\\n\", startCharacter: $lf);\n\n  bool onMatch(InlineParser parser, Match match) {\n    parser.addNode(Text(\" \"));\n    return true;\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/markdown/markdown.dart",
    "content": "import 'package:markdown/markdown.dart' hide HtmlRenderer;\n\nimport '../book.dart';\nimport '../format.dart';\nimport '../page.dart';\nimport 'block_syntax.dart';\nimport 'code_syntax.dart';\nimport 'html_renderer.dart';\nimport 'inline_syntax.dart';\nimport 'xml_renderer.dart';\n\nString renderMarkdown(Book book, Page page, List<String> lines, Format format) {\n  var document = Document(blockSyntaxes: [\n    BookHeaderSyntax(page, format),\n    CodeTagBlockSyntax(book, page, format),\n    HighlightedCodeBlockSyntax(format),\n  ], inlineSyntaxes: [\n    // Put inline Markdown code syntax before our smart quotes so that\n    // quotes inside `code` spans don't get smartened.\n    CodeSyntax(),\n    EllipseSyntax(format),\n    ApostropheSyntax(format),\n    SmartQuoteSyntax(format),\n    EmDashSyntax(format),\n    if (format.isPrint) NewlineSyntax(),\n  ], extensionSet: ExtensionSet.gitHubFlavored);\n\n  var ast = document.parseLines(lines);\n  if (format.isPrint) {\n    return XmlRenderer().render(ast);\n  } else {\n    return HtmlRenderer().render(ast);\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/markdown/xml_renderer.dart",
    "content": "import 'package:markdown/markdown.dart';\n\nfinal _imagePathPattern = RegExp(r'\"([^\"]+.png)\"');\n\n/// Matches opening XML tag names.\nfinal _tagPattern = RegExp(r\"<([a-z-_0-9]+)\");\n\nfinal _spanPattern = RegExp(r'<span\\s+name=\"[^\"]+\">');\n\nfinal _smallCapsPattern = RegExp(r'<span\\s+class=\"small-caps\">([A-Z]+)</span>');\n\nclass XmlRenderer implements NodeVisitor {\n  /// While building, also fill a StringBuffer with the minimal set of\n  /// paragraphs needed to cover all tags in the book.\n  static final tagFileBuffer = StringBuffer();\n\n  /// Keeps track of which XML tags [tagFileBuffer] contains.\n  static final allTags = <String>{};\n\n  /// The list of paragraph-level tags.\n  final List<_Paragraph> _paragraphs = [];\n\n  /// Whether we need to create a new paragraph before appending the next text.\n  bool _pendingParagraph = true;\n\n  /// The nested stack of current inline tags.\n  final List<_Inline> _inlineStack = [];\n\n  /// The stack tracking where we are in the document.\n  _Context _context = _Context(\"main\");\n\n  String render(List<Node> nodes) {\n    for (final node in nodes) {\n      node.accept(this);\n    }\n\n    var buffer = StringBuffer();\n    buffer.writeln(\"<chapter>\");\n\n    _Paragraph previousMain;\n    _Paragraph previousAside;\n\n    for (var paragraph in _paragraphs) {\n      String text;\n\n      if (paragraph.context.has(\"aside\")) {\n        text = paragraph.prettyPrint(previousAside);\n        previousAside = paragraph;\n      } else {\n        text = paragraph.prettyPrint(previousMain);\n        previousMain = paragraph;\n\n        // Reached the end of an aside.\n        previousAside = null;\n      }\n\n      buffer.write(text);\n\n      // Only add the paragraph to the tag file buffer if it has a unique tag.\n      var tags = _tagPattern.allMatches(text).map((match) => match[1]).toSet();\n      if (tags.difference(allTags).isNotEmpty) {\n        tagFileBuffer.write(text);\n        
allTags.addAll(tags);\n      }\n    }\n\n    buffer.writeln(\"</chapter>\");\n    return buffer.toString();\n  }\n\n  void visitText(Text node) {\n    var text = node.text;\n\n    if (text.isEmpty) return;\n\n    // There are a couple of hand-coded HTML ellipses inside an HTML table.\n    text = text.replaceAll(\n        '<span class=\"ellipse\">&thinsp;.&thinsp;.&thinsp;.</span>', \"&#8230;\");\n\n    // Convert the small-caps bitwise operator spans in \"Optimization\" to\n    // custom tags.\n    text = text.replaceAllMapped(\n        _smallCapsPattern, (match) => \"<bitwise>${match[1]}</bitwise>\");\n\n    text = text\n        .replaceAll(\"&eacute;\", \"&#233;\")\n        .replaceAll(\"&ensp;\", \"&#8194;\")\n        .replaceAll(\"&ldquo;\", \"&#8220;\")\n        .replaceAll(\"&nbsp;\", \"&#160;\")\n        .replaceAll(\"&rdquo;\", \"&#8221;\")\n        .replaceAll(\"&rsquo;\", \"&#8217;\")\n        .replaceAll(\"&rarr;\", \"&#8594;\")\n        .replaceAll(\"&sect;\", \"&#167;\")\n        .replaceAll(\"&thinsp;\", \"&#8201;\")\n        .replaceAll(\"&times;\", \"&#215;\")\n        .replaceAll(\"<br>\", \"<br/>\");\n\n    // Don't send tables to InDesign as XML.\n    text = text\n        .replaceAll(\"<table>\", \"[table]\")\n        .replaceAll(\"</table>\", \"[/table]\")\n        .replaceAll(\"<thead>\", \"[thead]\")\n        .replaceAll(\"</thead>\", \"[/thead]\")\n        .replaceAll(\"<tbody>\", \"[tbody]\")\n        .replaceAll(\"</tbody>\", \"[/tbody]\")\n        .replaceAll(\"<tr>\", \"[tr]\")\n        .replaceAll(\"</tr>\", \"[/tr]\")\n        .replaceAll(\"<td>\", \"[td]\")\n        .replaceAll(\"</td>\", \"[/td]\");\n\n    // Turn aside span locators into little visible markers.\n    text = text\n        .replaceAll(_spanPattern, \"<mark>@</mark>\")\n        .replaceAll(\"</span>\", \"\");\n\n    // Discard the challenge and design note divs.\n    if (text.startsWith(\"<div\") || text.startsWith(\"</div>\")) return;\n\n    // Convert image tags to 
just their paths.\n    if (text.startsWith(\"<img\")) {\n      var imagePath = _imagePathPattern.firstMatch(text)[1];\n\n      // The GC chapter has a couple of tiny inline images that happen to be in\n      // an unordered list. Don't create paragraphs for them.\n      var isInline = _context.has(\"unordered\");\n\n      // Put main column images in their own paragraph.\n      if (!isInline) _push(\"image\");\n      _addText(imagePath);\n      if (!isInline) _pop();\n      return;\n    }\n\n    // Include code snippet XML as-is.\n    if (text.startsWith(\"<location-file>\") ||\n        // \"Representing Code\" has a few inserted snippets with no location tag.\n        text.startsWith(\"<context-before>\") ||\n        text.startsWith(\"<interpreter>\") ||\n        text.startsWith(\"<interpreter-between>\")) {\n      _push(\"xml\");\n      _addText(text);\n      _pop();\n      return;\n    }\n\n    // Since aside tags appear in the Markdown as literal HTML, they are parsed\n    // as text, not Markdown elements.\n    if (text.startsWith(\"<aside\")) {\n      _push(\"aside\");\n      return;\n    }\n\n    if (text.startsWith(\"</aside>\")) {\n      _pop();\n      return;\n    }\n\n    if (text.trimLeft().startsWith(\"<cite>\")) {\n      _push(\"xml\");\n\n      // Use a custom inline style for cite emphasis.\n      text = text\n          .replaceAll(\"<em>\", \"<cite-em>\")\n          .replaceAll(\"</em>\", \"</cite-em>\");\n\n      _addText(text.trimLeft());\n    } else if (_inlineStack.isNotEmpty) {\n      // We're in an inline tag, so add it to that.\n      _inlineStack.last.text += text;\n    } else {\n      if (_context.name == \"xml\") {\n        // Hackish. 
Assume the only <em> tags inside XML blocks are in cites.\n        text = text\n            .replaceAll(\"<em>\", \"<cite-em>\")\n            .replaceAll(\"</em>\", \"</cite-em>\");\n      }\n\n      _addText(text);\n    }\n\n    if (text.endsWith(\"</cite>\")) _pop();\n  }\n\n  bool visitElementBefore(Element element) {\n    switch (element.tag) {\n      case \"p\":\n        _resetParagraph();\n        break;\n\n      case \"blockquote\":\n        _push(\"quote\");\n        break;\n\n      case \"h2\":\n        var text = element.textContent;\n        if (text == \"Challenges\") {\n          _context = _Context(\"challenges\");\n        } else if (text.contains(\"Design Note\")) {\n          _context = _Context(\"design\");\n        }\n        _push(\"heading\");\n        break;\n\n      case \"h3\":\n        _push(\"subheading\");\n        break;\n\n      case \"ol\":\n        _push(\"ordered\");\n\n        // Immediately push a subcontext to mark the first list item.\n        _push(\"first\");\n        break;\n\n      case \"pre\":\n        _push(\"pre\");\n        break;\n\n      case \"ul\":\n        _push(\"unordered\");\n        break;\n\n      case \"li\":\n        // If we're on the first item, discard it and replace it with the next\n        // item. The first item restarts numbering but later ones don't.\n        if (_context.name != \"first\") _push(\"item\");\n        break;\n\n      case \"a\":\n        // TODO: What do we want to do with links? Highlight them somehow so\n        // that I decide if the surrounding text needs tweaking?\n        break;\n\n      case \"code\":\n      case \"em\":\n      case \"small\":\n      case \"strong\":\n        // Inline tags.\n\n        // If we're in inline tags already, flatten them by emitting inline\n        // segments for any text they have. 
Leave them on the stack so that\n        // they get resumed when the nested inline tags end.\n        var tagParts = [element.tag];\n        for (var i = 0; i < _inlineStack.length; i++) {\n          var inline = _inlineStack[i];\n          if (inline.text.isNotEmpty) {\n            _addInline(inline);\n            _inlineStack[i] = _Inline(inline.tag);\n          }\n\n          tagParts.add(inline.tag);\n        }\n\n        String tag;\n        if (tagParts.contains(\"code\")) {\n          // Code formatting wipes out italics or bold.\n          tag = \"code\";\n        } else {\n          tagParts.sort();\n          tag = tagParts.join(\"-\");\n        }\n        // Make a tag name that includes all nested tags. We'll define separate\n        // styles for each combination.\n        _inlineStack.add(_Inline(tag));\n        break;\n\n      default:\n        print(\"Unexpected open tag ${element.tag}.\");\n    }\n\n    return !element.isEmpty;\n  }\n\n  void visitElementAfter(Element element) {\n    switch (element.tag) {\n      case \"blockquote\":\n      case \"h2\":\n      case \"h3\":\n      case \"pre\":\n        _pop();\n        break;\n\n      case \"ol\":\n      case \"ul\":\n        // If we still have a context for the item, it means we have a Markdown\n        // list with no paragraph tags inside the items. There are a couple of\n        // those in the book.\n        if (_context.name == \"first\" || _context.name == \"item\") _pop();\n\n        // Pop the list itself.\n        _pop();\n        break;\n\n      case \"a\":\n        // Nothing to do.\n        break;\n\n      case \"li\":\n      case \"p\":\n        // The first paragraph in each list item has a special style so that\n        // we apply the bullet or number. 
Later paragraphs in the same list item\n        // do not.\n\n        // We match both <p> and <li> so that lists without paragraphs inside\n        // don't leave lingering item contexts.\n        if (_context.name == \"first\" || _context.name == \"item\") _pop();\n        break;\n\n      case \"code\":\n      case \"em\":\n      case \"small\":\n      case \"strong\":\n        // Inline tags.\n        _addInline(_inlineStack.removeLast());\n        break;\n\n      default:\n        print(\"Unexpected close tag ${element.tag}.\");\n    }\n  }\n\n  void _push(String name) {\n    _context = _Context(name, _context);\n    _resetParagraph();\n  }\n\n  void _pop() {\n    _context = _context.parent;\n    _resetParagraph();\n  }\n\n  void _addText(String text) {\n    _flushParagraph();\n\n    // Discard any leading whitespace at the beginning of list items.\n    var paragraph = _paragraphs.last;\n    if (paragraph.contents.isEmpty &&\n        (_context.has(\"ordered\") || _context.has(\"unordered\"))) {\n      text = text.trimLeft();\n    }\n\n    paragraph.contents.add(_Inline(null, text));\n  }\n\n  void _addInline(_Inline inline) {\n    _flushParagraph();\n    _paragraphs.last.contents.add(inline);\n  }\n\n  void _resetParagraph() {\n    _pendingParagraph = true;\n  }\n\n  void _flushParagraph() {\n    if (!_pendingParagraph) return;\n    _paragraphs.add(_Paragraph(_context));\n    _pendingParagraph = false;\n  }\n}\n\nclass _Context {\n  final String name;\n  final _Context parent;\n\n  _Context(this.name, [this.parent]);\n\n  /// Whether any of the contexts in this chain are [name].\n  bool has(String name) {\n    var context = this;\n    while (context != null) {\n      if (context.name == name) return true;\n      context = context.parent;\n    }\n\n    return false;\n  }\n\n  /// Whether [parent] has [name].\n  bool isIn(String name) => parent != null && parent.has(name);\n\n  /// How many levels of list nesting this context contains.\n  int get listDepth {\n    
var depth = 0;\n\n    for (var context = this; context != null; context = context.parent) {\n      if (context.name == \"ordered\" || context.name == \"unordered\") {\n        depth++;\n      } else if (context.name == \"aside\") {\n        // Content inside an aside inside a list item isn't really part of the\n        // list.\n        break;\n      }\n    }\n\n    return depth;\n  }\n\n  String get paragraphTag {\n    var tag = name;\n    var depth = listDepth;\n    if (depth > 2) print(\"Unexpected deep list nesting $this.\");\n\n    switch (tag) {\n      case \"main\":\n        return \"p\";\n      case \"challenges\":\n        // There's only one paragraph of non-list prose text and that's also\n        // indented like a list (so that it lines up with the heading), so just\n        // use the same style for both.\n        return \"challenges-list-p\";\n      case \"design\":\n        return \"design-p\";\n      case \"aside\":\n        return \"aside\";\n      case \"xml\":\n        return \"xml\";\n\n      case \"first\":\n      case \"item\":\n        tag = \"${parent.name}-$tag\";\n        if (depth > 1) tag = \"sublist-$tag\";\n        break;\n\n      case \"ordered\":\n      case \"unordered\":\n        tag = \"list-p\";\n        if (depth > 1) tag = \"sublist-$tag\";\n        break;\n\n      default:\n        if (depth > 1) {\n          tag = \"sublist-$tag\";\n        } else if (depth > 0) {\n          tag = \"list-$tag\";\n        }\n    }\n\n    if (isIn(\"aside\")) {\n      tag = \"aside-$tag\";\n    } else if (isIn(\"challenges\")) {\n      tag = \"challenges-$tag\";\n    } else if (isIn(\"design\")) {\n      tag = \"design-$tag\";\n    }\n\n    return tag;\n  }\n\n  /// The prefix to apply to inline tags within this context or the empty string\n  /// if none should be added.\n  String get inlinePrefix {\n    if (has(\"aside\")) return \"aside\";\n    if (has(\"challenges\")) return \"challenges\";\n    if 
(has(\"design\")) return \"design\";\n    if (has(\"quote\")) return \"quote\";\n\n    return \"\";\n  }\n\n  String toString() {\n    if (parent == null) return name;\n    return \"$parent > $name\";\n  }\n}\n\n/// A paragraph-level tag that contains text and inline tags.\nclass _Paragraph {\n  final _Context context;\n\n  final List<_Inline> contents = [];\n\n  _Paragraph(this.context);\n\n  bool _isNext(String tag, String previousTag) {\n    const nextTags = {\n      \"aside\",\n      \"challenges-p\",\n      \"challenges-list-p\",\n      \"design-p\",\n      \"design-list-p\",\n      \"list-p\",\n      \"p\"\n    };\n\n    if (tag == previousTag) return nextTags.contains(tag);\n\n    // The paragraph after a bullet item is also a next.\n    if (tag.endsWith(\"list-p\")) {\n      // This includes both \"unordered\" and \"ordered\", tags that start with\n      // \"challenges\" or \"design\", and ones that end with \"first\" or \"item\".\n      return previousTag.contains(\"ordered-\");\n    }\n\n    return false;\n  }\n\n  String prettyPrint(_Paragraph previous) {\n    var buffer = StringBuffer();\n    var tag = context.paragraphTag;\n\n    if (previous != null && _isNext(tag, previous.context.paragraphTag)) {\n      tag += \"-next\";\n    }\n\n    if (tag != \"xml\") buffer.write(\"<$tag>\");\n\n    for (var inline in contents) {\n      inline.prettyPrint(buffer, context);\n    }\n\n    if (tag != \"xml\") buffer.write(\"</$tag>\");\n    buffer.writeln();\n    return buffer.toString();\n  }\n}\n\n/// An inline tag or plain text.\nclass _Inline {\n  /// The tag name if this is an inline tag or `null` if it is text.\n  final String tag;\n\n  String text;\n\n  _Inline(this.tag, [this.text = \"\"]);\n\n  bool get isText => tag == null;\n\n  void prettyPrint(StringBuffer buffer, _Context context) {\n    if (tag == null) {\n      buffer.write(text);\n      return;\n    }\n\n    var fullTag = tag;\n    var prefix = context.inlinePrefix;\n    if (prefix != \"\") 
fullTag = \"$prefix-$fullTag\";\n\n    buffer.write(\"<$fullTag>$text</$fullTag>\");\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/mustache.dart",
    "content": "/// Creates the data map and renders the Mustache templates to HTML.\nimport 'dart:io';\n\nimport 'package:mustache_template/mustache_template.dart';\nimport 'package:path/path.dart' as p;\n\nimport 'book.dart';\nimport 'page.dart';\nimport 'text.dart';\n\n/// Maintains the cache of loaded partials and allows rendering templates.\nclass Mustache {\n  /// The directory where template files can be found.\n  final String _templateDirectory;\n\n  final Map<String, Template> _templates = {};\n\n  Mustache([String templateDirectory])\n      : _templateDirectory = templateDirectory ?? p.join(\"asset\", \"mustache\");\n\n  String render(Book book, Page page, String body, {String template}) {\n    var part = page.part?.title;\n\n    var up = \"Table of Contents\";\n    if (part != null) {\n      up = part;\n    } else if (page.title == \"Table of Contents\") {\n      up = \"Crafting Interpreters\";\n    }\n\n    var previousPage = book.adjacentPage(page, -1);\n    var nextPage = book.adjacentPage(page, 1);\n    String nextType;\n    if (nextPage != null && nextPage.isChapter) {\n      nextType = \"Chapter\";\n    } else if (nextPage != null && nextPage.isPart) {\n      nextType = \"Part\";\n    }\n\n    List<Map<String, dynamic>> chapters;\n    if (page.isPart) {\n      chapters = _makeChapterList(page);\n    }\n\n    var isFrontmatter = const {\n      \"Acknowledgements\",\n      \"Dedication\",\n    }.contains(page.title);\n\n    var data = <String, dynamic>{\n      \"is_chapter\": part != null,\n      \"is_part\": part == null && page.title != null && !isFrontmatter,\n      \"is_frontmatter\": isFrontmatter,\n      \"title\": page.title,\n      \"part\": part,\n      \"body\": body,\n      \"sections\": _makeSections(page),\n      \"chapters\": chapters,\n      \"design_note\": page.designNote,\n      \"has_design_note\": page.designNote != null,\n      \"has_challenges\": page.hasChallenges,\n      \"has_challenges_or_design_note\":\n          
page.hasChallenges || page.designNote != null,\n      \"has_number\": page.numberString != \"\",\n      \"number\": page.numberString,\n      // Previous page.\n      \"has_prev\": previousPage != null,\n      \"prev\": previousPage?.title,\n      \"prev_file\": previousPage?.fileName,\n      // Next page.\n      \"has_next\": nextPage != null,\n      \"next\": nextPage?.title,\n      \"next_file\": nextPage?.fileName,\n      \"next_type\": nextType,\n      \"has_up\": up != null,\n      \"up\": up,\n      \"up_file\": up != null ? toFileName(up) : null,\n      // TODO: Only need this for contents page.\n      \"part_1\": _makePartData(book, 0),\n      \"part_2\": _makePartData(book, 1),\n      \"part_3\": _makePartData(book, 2),\n    };\n\n    return _load(template ?? page.template).renderString(data);\n  }\n\n  Map<String, dynamic> _makePartData(Book book, int partIndex) {\n    var partPage = book.parts[partIndex];\n    return <String, dynamic>{\n      \"title\": partPage.title,\n      \"number\": partPage.numberString,\n      \"file\": partPage.fileName,\n      \"chapters\": _makeChapterList(partPage)\n    };\n  }\n\n  List<Map<String, dynamic>> _makeChapterList(Page part) {\n    return [\n      for (var chapter in part.chapters)\n        <String, dynamic>{\n          \"title\": chapter.title,\n          \"number\": chapter.numberString,\n          \"file\": chapter.fileName,\n          \"design_note\": chapter.designNote?.replaceAll(\"'\", \"&rsquo;\"),\n        }\n    ];\n  }\n\n  List<Map<String, dynamic>> _makeSections(Page page) {\n    var sections = <Map<String, dynamic>>[];\n\n    for (var header in page.headers.values) {\n      if (!header.isSpecial && header.level == 2) {\n        sections.add(<String, dynamic>{\n          \"name\": header.name,\n          \"anchor\": header.anchor,\n          \"index\": header.headerIndex\n        });\n      }\n    }\n\n    return sections;\n  }\n\n  Template _load(String name) {\n    return 
_templates.putIfAbsent(name, () {\n      var path = p.join(_templateDirectory, \"$name.html\");\n      return Template(File(path).readAsStringSync(),\n          name: path, partialResolver: _load);\n    });\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/page.dart",
    "content": "import 'package:path/path.dart' as p;\n\nimport 'code_tag.dart';\nimport 'page_parser.dart';\nimport 'text.dart';\n\n/// One page (in the HTML sense) of the book.\n///\n/// Each chapter, part introduction, and backmatter section is a page.\nclass Page {\n  /// The title of this page.\n  final String title;\n\n  /// The chapter or part number, like \"12\", \"II\", or \"\".\n  final String numberString;\n\n  /// The numeric index of the page in chapter order.\n  ///\n  /// Used to determine which order snippets appear in the book.\n  final int ordinal;\n\n  /// If this page is a part page, the list of chapter pages it contains.\n  final List<Page> chapters = [];\n\n  /// If this page is a chapter page, the part that contains this page.\n  final Page part;\n\n  PageFile _file;\n\n  Page(this.title, this.part, this.numberString, this.ordinal);\n\n  /// The base file path and URI for the page, without any extension.\n  String get fileName => toFileName(title);\n\n  /// The path to this page's Markdown source file.\n  String get markdownPath => p.join(\"book\", \"$fileName.md\");\n\n  /// The path to this page's generated HTML file.\n  String get htmlPath => p.join(\"site\", \"$fileName.html\");\n\n  /// Whether this page is a chapter page, as opposed to a part.\n  bool get isChapter => part != null;\n\n  /// Whether this page is a part page, as opposed to a chapter.\n  bool get isPart => part == null;\n\n  /// The code language used for this chapter page or `null` if this isn't one\n  /// of the main chapter pages.\n  String get language {\n    if (isPart) return null;\n    if (part.title == \"A Tree-Walk Interpreter\") return \"java\";\n    if (part.title == \"A Bytecode Virtual Machine\") return \"c\";\n    return null;\n  }\n\n  String get shortName {\n    var number = numberString.padLeft(2, \"0\");\n\n    var words = title.split(\" \");\n    var word = words.first.toLowerCase();\n    if (word == \"a\" || word == \"the\") word = 
words[1].toLowerCase();\n\n    return \"chap${number}_$word\";\n  }\n\n  List<String> get lines => _ensureFile().lines;\n\n  String get template {\n    if (title == \"Crafting Interpreters\") return \"index\";\n    if (title == \"Table of Contents\") return \"contents\";\n    return \"page\";\n  }\n\n  Map<String, Header> get headers => _ensureFile().headers;\n\n  bool get hasChallenges => _ensureFile().hasChallenges;\n\n  String get designNote => _ensureFile().designNote;\n\n  Iterable<CodeTag> get codeTags => _ensureFile().codeTags.values;\n\n  CodeTag findCodeTag(String name) {\n    // Return fake tags for the placeholders.\n    if (name == \"omit\") return CodeTag(this, \"omit\", 9998, 0, 0, false);\n    if (name == \"not-yet\") return CodeTag(this, \"not-yet\", 9999, 0, 0, false);\n\n    var codeTag = _ensureFile().codeTags[name];\n    if (codeTag != null) return codeTag;\n\n    throw ArgumentError(\"Could not find code tag '$name'.\");\n  }\n\n  String toString() => title;\n\n  /// Lazily parse the Markdown file for the page.\n  PageFile _ensureFile() => _file ??= parsePage(this);\n}\n\n/// The data for a page parsed from the Markdown source.\nclass PageFile {\n  final List<String> lines;\n  final Map<String, Header> headers;\n  final bool hasChallenges;\n\n  /// The name of the design note in this page, or `null` if there is none.\n  final String designNote;\n\n  final Map<String, CodeTag> codeTags;\n\n  PageFile(this.lines, this.headers, this.hasChallenges, this.designNote,\n      this.codeTags);\n}\n\n/// A section header in a page.\nclass Header {\n  /// The header depth: 1 is the page title, 2 header, 3 subheader.\n  final int level;\n  final int headerIndex;\n  final int subheaderIndex;\n  final String name;\n\n  Header(this.level, this.headerIndex, this.subheaderIndex, this.name);\n\n  /// Whether this header is for the special \"Challenges\" or \"Design Note\"\n  /// sections.\n  bool get isSpecial => isChallenges || isDesignNote;\n\n  bool get 
isChallenges {\n    // Check for a subheader because there is a \"Challenges\" *subheader* in\n    // the Introduction.\n    return name == \"Challenges\" && level == 2;\n  }\n\n  bool get isDesignNote => name.startsWith(\"Design Note:\");\n\n  String get anchor {\n    if (isChallenges) return \"challenges\";\n    if (isDesignNote) return \"design-note\";\n    return toFileName(name);\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/page_parser.dart",
    "content": "import 'dart:io';\n\nimport 'code_tag.dart';\nimport 'page.dart';\nimport 'text.dart';\n\nfinal _codePattern = RegExp(r\"^\\^code ([-a-z0-9]+)( \\(([^)]+)\\))?$\");\nfinal _headerPattern = RegExp(r\"^(#{1,3}) \");\nfinal _beforePattern = RegExp(r\"(\\d+) before\");\nfinal _afterPattern = RegExp(r\"(\\d+) after\");\n\n/// Parses the contents of the Markdown file for [page] to extract its metadata,\n/// code tags, section headers, etc.\nPageFile parsePage(Page page) {\n  var headers = <String, Header>{};\n  var codeTagsByName = <String, CodeTag>{};\n  String designNote;\n  var hasChallenges = false;\n\n  var headerIndex = 0;\n  var subheaderIndex = 0;\n\n  var lines = File(page.markdownPath).readAsLinesSync();\n  for (var i = 0; i < lines.length; i++) {\n    var line = lines[i];\n\n    var match = _codePattern.firstMatch(line);\n    if (match != null) {\n      var codeTag =\n          _createCodeTag(page, codeTagsByName.length, match[1], match[3]);\n      codeTagsByName[codeTag.name] = codeTag;\n      continue;\n    }\n\n    match = _headerPattern.firstMatch(line);\n    if (match != null) {\n      // Keep track of the headers so we can add section navigation for them.\n      var headerType = match[1];\n      var level = headerType.length;\n      var name = line.substring(level).trim().pretty;\n\n      if (level == 2) {\n        headerIndex += 1;\n        subheaderIndex = 0;\n      } else if (level == 3) {\n        subheaderIndex += 1;\n      }\n\n      var header =\n          Header(level, headerIndex, level == 3 ? 
subheaderIndex : null, name);\n\n      if (header.isChallenges) hasChallenges = true;\n      if (header.isDesignNote) {\n        designNote = header.name.substring(\"Design Note: \".length);\n      }\n\n      headers[line] = header;\n    }\n  }\n\n//  # Validate that every snippet for the chapter is included.\n//  for name, snippet in snippets.items():\n//    if name != 'not-yet' and name != 'omit' and snippet != False:\n//      errors.append(\"Unused snippet {}\".format(name))\n//\n//  # Show any errors at the top of the file.\n//  if errors:\n//    error_markdown = \"\"\n//    for error in errors:\n//      error_markdown += \"**Error: {}**\\n\\n\".format(error)\n//    contents = error_markdown + contents\n//\n  return PageFile(lines, headers, hasChallenges, designNote, codeTagsByName);\n}\n\nCodeTag _createCodeTag(Page page, int index, String name, String options) {\n  // Parse the location annotations after the name, if present.\n  var showLocation = true;\n  var beforeCount = 0;\n  var afterCount = 0;\n\n  if (options != null) {\n    for (var option in options.split(\", \")) {\n      if (option == \"no location\") {\n        showLocation = false;\n        continue;\n      }\n\n      var match = _beforePattern.firstMatch(option);\n      if (match != null) {\n        beforeCount = int.parse(match[1]);\n        continue;\n      }\n\n      match = _afterPattern.firstMatch(option);\n      if (match != null) {\n        afterCount = int.parse(match[1]);\n        continue;\n      }\n\n      throw \"Unknown code option '$option'\";\n    }\n  }\n\n  return CodeTag(page, name, index, beforeCount, afterCount, showLocation);\n}\n"
  },
  {
    "path": "tool/lib/src/snippet.dart",
    "content": "import 'book.dart';\nimport 'code_tag.dart';\nimport 'location.dart';\nimport 'text.dart';\n\n/// A snippet of source code that is inserted in the book.\nclass Snippet {\n  final SourceFile file;\n  final CodeTag tag;\n\n  Location _location;\n\n  int _firstLine;\n  int _lastLine;\n\n  Location get precedingLocation => _precedingLocation;\n  Location _precedingLocation;\n\n  /// If the snippet replaces a line with the same line but with a trailing\n  /// comma, this is that line (with the comma).\n  String get addedComma => _addedComma;\n  String _addedComma;\n\n  final List<String> added = [];\n  final List<String> removed = [];\n\n  final List<String> contextBefore = [];\n  final List<String> contextAfter = [];\n\n  Snippet(this.file, this.tag);\n\n  void addLine(int lineIndex, SourceLine line) {\n    if (added.isEmpty) {\n      _location = line.location;\n      _firstLine = lineIndex;\n    }\n    added.add(line.text);\n\n    // Assume that we add the removed lines in order.\n    _lastLine = lineIndex;\n  }\n\n  void removeLine(int lineIndex, SourceLine line) {\n    removed.add(line.text);\n\n    // Assume that we add the removed lines in order.\n    _lastLine = lineIndex;\n  }\n\n  /// Describes where in the file this snippet appears. 
Returns a list of HTML\n  /// strings.\n  List<String> get locationHtmlLines {\n    var result = [\"<em>${file.nicePath}</em>\"];\n\n    var html = _location.toHtml(precedingLocation, removed);\n    if (html != null) result.add(html);\n\n    if (removed.isNotEmpty && added.isNotEmpty) {\n      result.add(\"replace ${removed.length} line${pluralize(removed)}\");\n    } else if (removed.isNotEmpty && added.isEmpty) {\n      result.add(\"remove ${removed.length} line${pluralize(removed)}\");\n    }\n\n    if (addedComma != null) {\n      result.add(\"add <em>&ldquo;,&rdquo;</em> to previous line\");\n    }\n\n    return result;\n  }\n\n  /// Describes where in the file this snippet appears.\n  String get locationXml {\n    var result = StringBuffer();\n    result.write(\"<location-file>${file.nicePath}</location-file>\");\n\n    var xml = _location.toXml(precedingLocation, removed);\n    var changes = [\n      if (xml != null) xml,\n      if (removed.isNotEmpty && added.isNotEmpty)\n        \"replace ${removed.length} line${pluralize(removed)}\"\n      else if (removed.isNotEmpty && added.isEmpty)\n        \"remove ${removed.length} line${pluralize(removed)}\",\n      if (addedComma != null)\n        \"add <location-comma>&ldquo;,&rdquo;</location-comma> to previous line\"\n    ].map((change) => \"<location-change>$change</location-change>\");\n\n    if (changes.isNotEmpty) {\n      result.writeln();\n      result.writeAll(changes, \"\\n\");\n    }\n\n    return result.toString();\n  }\n\n  String toString() => \"${file.nicePath} ${tag.name}\";\n\n  /// Calculate the surrounding context information for this snippet.\n  void calculateContext() {\n    // Get the preceding lines.\n    for (var i = _firstLine - 1;\n        i >= 0 && contextBefore.length < tag.beforeCount;\n        i--) {\n      var line = file.lines[i];\n      if (!line.isPresent(tag)) continue;\n      contextBefore.insert(0, line.text);\n    }\n\n    // Get the following lines.\n    for (var i = 
_lastLine + 1;\n        i < file.lines.length && contextAfter.length < tag.afterCount;\n        i++) {\n      var line = file.lines[i];\n      if (line.isPresent(tag)) contextAfter.add(line.text);\n    }\n\n    // Get the preceding location.\n    // TODO: This constant is somewhat arbitrary. Come up with a more precise\n    // way to track the preceding location.\n    int checkedLines = 0;\n    for (var i = _firstLine - 1; i >= 0 && checkedLines <= 4; i--) {\n      var line = file.lines[i];\n      if (!line.isPresent(tag)) continue;\n      checkedLines++;\n\n      // Store the most precise preceding location we find.\n      if (_precedingLocation == null ||\n          line.location.depth > _precedingLocation.depth) {\n        _precedingLocation = line.location;\n      }\n    }\n\n    // Update the current location based on surrounding lines.\n    var hasCodeBefore = contextBefore.isNotEmpty;\n    var hasCodeAfter = contextAfter.isNotEmpty;\n    for (var i = _firstLine - 1; !hasCodeBefore && i >= 0; i--) {\n      hasCodeBefore = file.lines[i].isPresent(tag);\n    }\n\n    for (var i = _lastLine + 1; !hasCodeAfter && i < file.lines.length; i++) {\n      hasCodeAfter = file.lines[i].isPresent(tag);\n    }\n\n    if (!hasCodeBefore) {\n      _location = Location(null, hasCodeAfter ? \"top\" : \"new\", null);\n    }\n\n    // Find line changes that just add a trailing comma.\n    if (added.isNotEmpty &&\n        removed.isNotEmpty &&\n        added.first == \"${removed.last},\") {\n      _addedComma = added.first;\n      added.removeAt(0);\n      removed.removeLast();\n    }\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/source_file_parser.dart",
    "content": "import 'dart:io';\n\nimport 'book.dart';\nimport 'code_tag.dart';\nimport 'location.dart';\nimport 'page.dart';\n\nfinal _blockPattern = RegExp(\n    r\"^/\\* ([A-Z][A-Za-z\\s]+) ([-a-z0-9]+) < ([A-Z][A-Za-z\\s]+) ([-a-z0-9]+)$\");\nfinal _blockSnippetPattern = RegExp(r\"^/\\* < ([-a-z0-9]+)$\");\nfinal _beginSnippetPattern = RegExp(r\"^//> ([-a-z0-9]+)$\");\nfinal _endSnippetPattern = RegExp(r\"^//< ([-a-z0-9]+)$\");\nfinal _beginChapterPattern = RegExp(r\"^//> ([A-Z][A-Za-z\\s]+) ([-a-z0-9]+)$\");\nfinal _endChapterPattern = RegExp(r\"^//< ([A-Z][A-Za-z\\s]+) ([-a-z0-9]+)$\");\n\n// Hacky regexes that match various declarations.\nfinal _constructorPattern = RegExp(r\"^  ([A-Z][a-z]\\w+)\\(\");\nfinal _functionPattern = RegExp(r\"(\\w+)>*\\*? (\\w+)\\(([^)]*)\");\nfinal _variablePattern = RegExp(r\"^\\w+\\*? (\\w+)(;| = )\");\nfinal _structPattern = RegExp(r\"^struct (\\w+)? {$\");\nfinal _typePattern =\n    RegExp(r\"(public )?(abstract )?(class|enum|interface) ([A-Z]\\w+)\");\nfinal _namedTypedefPattern = RegExp(r\"^typedef (enum|struct|union) (\\w+) {$\");\nfinal _unnamedTypedefPattern = RegExp(r\"^typedef (enum|struct|union) {$\");\nfinal _typedefNamePattern = RegExp(r\"^\\} (\\w+);$\");\n\n/// Reserved words that can appear like a return type in a function declaration\n/// but shouldn't be treated as one.\nconst _keywords = {\"new\", \"return\", \"throw\"};\n\nclass SourceFileParser {\n  final Book _book;\n  final SourceFile _file;\n  final List<String> _lines;\n  final List<_ParseState> _states = [];\n\n  Location _unnamedTypedef;\n\n  Location _location;\n  Location _locationBeforeBlock;\n\n  SourceFileParser(this._book, String path, String relative)\n      : _file = SourceFile(relative),\n        _lines = File(path).readAsLinesSync() {\n    _location = Location(null, \"file\", _file.nicePath);\n  }\n\n  SourceFile parse() {\n//  line_num = 1\n//  handled = False\n//\n//  def error(message):\n//    print(\"Error: {} line {}: 
{}\".format(relative, line_num, message),\n//        file=sys.stderr)\n//    source_code.errors[state.start.chapter].append(\n//        \"{} line {}: {}\".format(relative, line_num, message))\n//\n    // Split the source file into lines.\n//    printed_file = False\n//    line_num = 1\n    for (var i = 0; i < _lines.length; i++) {\n      var line = _lines[i].trimRight();\n//      handled = False\n//\n//      # Report any lines that are too long.\n//      trimmed = re.sub(r'// \\[([-a-z0-9]+)\\]', '', line)\n//      if len(trimmed) > 72 and not '/*' in trimmed:\n//        if not printed_file:\n//          print(\"Long line in {}:\".format(file.path))\n//          printed_file = True\n//        print(\"{0:4} ({1:2} chars): {2}\".format(line_num, len(trimmed), trimmed))\n//\n\n      _updateLocationBefore(line, i);\n\n      if (!_updateState(line)) {\n        var sourceLine =\n            SourceLine(line, _location, _currentState.start, _currentState.end);\n        _file.lines.add(sourceLine);\n      }\n\n      _updateLocationAfter(line);\n//\n//      line_num += 1\n    }\n\n//    # \".parent.parent\" because there is always the top \"null\" state.\n//    if state.parent != None and state.parent.parent != None:\n//      print(\"{}: Ended with more than one state on the stack.\".format(relative),\n//          file=sys.stderr)\n//      s = state\n//      while s.parent != None:\n//        print(\"  {}\".format(s.start), file=sys.stderr)\n//        s = s.parent\n//      sys.exit(1)\n//\n\n    // TODO: Validate that we don't define two snippets with the same chapter\n    // and number. 
A snippet may end up in disjoint lines in the final output\n    // because a later snippet is inserted in it, but it shouldn't be explicitly\n    // authored that way.\n    return _file;\n  }\n\n  /// Keep track of the current location where the parser is in the source file.\n  void _updateLocationBefore(String line, int lineIndex) {\n    // See if we reached a new function or method declaration.\n    var match = _functionPattern.firstMatch(line);\n    if (match != null &&\n        !line.contains(\"#define\") &&\n        !_keywords.contains(match[1])) {\n      // Hack. Don't get caught by comments or string literals.\n      if (!line.contains(\"//\") && !line.contains('\"')) {\n        var isFunctionDeclaration = line.endsWith(\";\");\n\n        // Hack: Handle multi-line declarations. Check the bounds first so a\n        // trailing comma on the last line of the file doesn't throw.\n        if (line.endsWith(\",\") &&\n            lineIndex + 1 < _lines.length &&\n            _lines[lineIndex + 1].endsWith(\";\")) {\n          isFunctionDeclaration = true;\n        }\n\n        _location = Location(_location,\n            _file.language == \"java\" ? \"method\" : \"function\", match[2],\n            signature: match[3], isFunctionDeclaration: isFunctionDeclaration);\n        return;\n      }\n    }\n\n    match = _constructorPattern.firstMatch(line);\n    if (match != null) {\n      _location = Location(_location, \"constructor\", match[1]);\n      return;\n    }\n\n    match = _typePattern.firstMatch(line);\n    if (match != null) {\n      // Hack. 
Don't get caught by comments or string literals.\n      if (!line.contains(\"//\") && !line.contains('\"')) {\n        var kind = match[3];\n        var name = match[4];\n        _location = Location(_location, kind, name);\n      }\n      return;\n    }\n\n    match = _structPattern.firstMatch(line);\n    if (match != null) {\n      _location = Location(_location, \"struct\", match[1]);\n      return;\n    }\n\n    match = _namedTypedefPattern.firstMatch(line);\n    if (match != null) {\n      _location = Location(_location, match[1], match[2]);\n      return;\n    }\n\n    match = _unnamedTypedefPattern.firstMatch(line);\n    if (match != null) {\n      // We don't know the name of the typedef yet.\n      _location = Location(_location, match[1], null);\n      _unnamedTypedef = _location;\n      return;\n    }\n\n    match = _variablePattern.firstMatch(line);\n    if (match != null) {\n      _location = Location(_location, \"variable\", match[1]);\n      return;\n    }\n  }\n\n  void _updateLocationAfter(String line) {\n    var match = _typedefNamePattern.firstMatch(line);\n    if (match != null) {\n      // Now we know the typedef name.\n      _unnamedTypedef?.name = match[1];\n      _unnamedTypedef = null;\n      _location = _location.parent;\n    }\n\n    // Use \"startsWith\" to include lines like \"} [aside-marker]\".\n    if (line.startsWith(\"}\")) {\n      _location = _location.popToDepth(0);\n    } else if (line.startsWith(\"  }\")) {\n      _location = _location.popToDepth(1);\n    } else if (line.startsWith(\"    }\")) {\n      _location = _location.popToDepth(2);\n    }\n\n    // If we reached a function declaration, not a definition, then it's done\n    // after one line.\n    if (_location.isFunctionDeclaration) {\n      _location = _location.parent;\n    }\n\n    // Module variables are only a single line.\n    if (_location.kind == \"variable\") {\n      _location = _location.parent;\n    }\n\n    // Hack. 
There is a one-line class in Parser.java.\n    if (line.contains(\"class ParseError\")) {\n      _location = _location.parent;\n    }\n  }\n\n  /// Processes any [line] that changes what snippet the parser is currently in.\n  ///\n  /// Returns `true` if the line contained a snippet annotation.\n  bool _updateState(String line) {\n    var match = _blockPattern.firstMatch(line);\n    if (match != null) {\n      _push(\n          startChapter: _book.findChapter(match[1]),\n          startName: match[2],\n          endChapter: _book.findChapter(match[3]),\n          endName: match[4]);\n      _locationBeforeBlock = _location;\n      return true;\n    }\n\n    match = _blockSnippetPattern.firstMatch(line);\n    if (match != null) {\n      _push(endChapter: _currentState.start.chapter, endName: match[1]);\n      _locationBeforeBlock = _location;\n      return true;\n    }\n\n    if (line.trim() == \"*/\" && _currentState.end != null) {\n      _location = _locationBeforeBlock;\n      _pop();\n      return true;\n    }\n\n    match = _beginSnippetPattern.firstMatch(line);\n    if (match != null) {\n      var name = match[1];\n//        var tag = source_code.find_snippet_tag(state.start.chapter, name);\n//        if tag < state.start:\n//          error(\"Can't push earlier snippet {} from {}.\".format(name, state.start.name))\n//        elif tag == state.start:\n//          error(\"Can't push to same snippet {}.\".format(name))\n      _push(startName: name);\n      return true;\n    }\n\n    match = _endSnippetPattern.firstMatch(line);\n    if (match != null) {\n//      var name = match[1];\n//        if name != state.start.name:\n//          error(\"Expecting to pop {} but got {}.\".format(state.start.name, name))\n//        if state.parent.start.chapter == None:\n//          error('Cannot pop last state {}.'.format(state.start))\n      _pop();\n      return true;\n    }\n\n    match = _beginChapterPattern.firstMatch(line);\n    if (match != null) {\n      var chapter = 
_book.findChapter(match[1]);\n      var name = match[2];\n\n//        if state.start != None:\n//          old_chapter = book.chapter_number(state.start.chapter)\n//          new_chapter = book.chapter_number(chapter)\n//\n//          if chapter == state.start.chapter and name == state.start.name:\n//            error('Pushing same snippet \"{} {}\"'.format(chapter, name))\n//          if chapter == state.start.chapter:\n//            error('Pushing same chapter, just use \"//>> {}\"'.format(name))\n//          if new_chapter < old_chapter:\n//            error('Can\\'t push earlier chapter \"{}\" from \"{}\".'.format(\n//                chapter, state.start.chapter))\n      _push(startChapter: chapter, startName: name);\n      return true;\n    }\n\n    match = _endChapterPattern.firstMatch(line);\n    if (match != null) {\n//      var chapter = match[1];\n//      var name = match[2];\n//        if chapter != state.start.chapter or name != state.start.name:\n//          error('Expecting to pop \"{} {}\" but got \"{} {}\".'.format(\n//              state.start.chapter, state.start.name, chapter, name))\n//        if state.start.chapter == None:\n//          error('Cannot pop last state \"{}\".'.format(state.start))\n      _pop();\n      return true;\n    }\n\n    return false;\n  }\n\n  _ParseState get _currentState => _states.last;\n\n  void _push(\n      {Page startChapter, String startName, Page endChapter, String endName}) {\n    startChapter ??= _currentState.start.chapter;\n\n    CodeTag start;\n    if (startName != null) {\n      start = startChapter.findCodeTag(startName);\n    } else {\n      start = _currentState.start;\n    }\n\n    CodeTag end;\n    if (endChapter != null) {\n      end = endChapter.findCodeTag(endName);\n    }\n\n    _states.add(_ParseState(start, end));\n  }\n\n  void _pop() {\n    _states.removeLast();\n  }\n}\n\nclass _ParseState {\n  final CodeTag start;\n  final CodeTag end;\n\n  _ParseState(this.start, [this.end]);\n\n  String 
toString() {\n    if (end != null) return \"_ParseState($start > $end)\";\n    return \"_ParseState($start)\";\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/split_chapter.dart",
    "content": "import 'dart:io';\n\nimport 'package:glob/glob.dart';\nimport 'package:path/path.dart' as p;\nimport 'package:pool/pool.dart';\n\nimport 'package:tool/src/book.dart';\nimport 'package:tool/src/code_tag.dart';\nimport 'package:tool/src/page.dart';\nimport 'package:tool/src/source_file_parser.dart';\n\n/// Don't do too many file operations at once or we risk running out of file\n/// descriptors.\nvar _filePool = Pool(200);\n\nFuture<void> splitChapter(Book book, Page chapter, [CodeTag tag]) async {\n  var futures = <Future<void>>[];\n\n  for (var file in Glob(\"${chapter.language}/**.{c,h,java}\").listSync()) {\n    futures.add(_splitSourceFile(book, chapter, file.path, tag));\n  }\n\n  await Future.wait(futures);\n}\n\nFuture<void> _splitSourceFile(Book book, Page chapter, String sourcePath,\n    [CodeTag tag]) async {\n  var relative = p.relative(sourcePath, from: chapter.language);\n\n  // Don't split the generated files.\n  if (relative == \"com/craftinginterpreters/lox/Expr.java\") return;\n  if (relative == \"com/craftinginterpreters/lox/Stmt.java\") return;\n\n  var package = chapter.shortName;\n  if (tag != null) {\n    package = p.join(\"snippets\", package, tag.directory);\n  }\n\n  // If we're generating the split for an entire chapter, include all its\n  // snippets.\n  tag ??= book.lastSnippet(chapter).tag;\n\n  var outputFile = File(p.join(\"gen\", package, relative));\n\n  var resource = await _filePool.request();\n  try {\n    var output = _generateSourceFile(book, chapter, sourcePath, tag);\n    if (output.isNotEmpty) {\n      // Don't overwrite the file if it didn't change, so the makefile doesn't\n      // think it was touched.\n      if (await outputFile.exists()) {\n        var previous = await outputFile.readAsString();\n        if (previous == output) return;\n      }\n\n      // Write the changed output.\n      await Directory(p.dirname(outputFile.path)).create(recursive: true);\n      await outputFile.writeAsString(output);\n  
  } else {\n      // Remove it since it's supposed to be nonexistent.\n      if (await outputFile.exists()) await outputFile.delete();\n    }\n  } finally {\n    resource.release();\n  }\n}\n\n/// Gets the code for [sourceFilePath] as it appears at [tag] of [chapter].\nString _generateSourceFile(\n    Book book, Page chapter, String sourcePath, CodeTag tag) {\n  var shortPath = p.relative(sourcePath, from: chapter.language);\n  var sourceFile = SourceFileParser(book, sourcePath, shortPath).parse();\n\n  var buffer = StringBuffer();\n  for (var line in sourceFile.lines) {\n    if (line.isPresent(tag)) {\n      // Hack. In generate_ast.java, we split up a parameter list among\n      // multiple chapters, which leads to hanging commas in some cases.\n      // Remove them.\n      if (line.text.trim().startsWith(\")\")) {\n        var text = buffer.toString();\n        if (text.endsWith(\",\\n\")) {\n          buffer.clear();\n          buffer.writeln(text.substring(0, text.length - 2));\n        }\n      }\n\n      buffer.writeln(line.text);\n    }\n  }\n\n  return buffer.toString();\n}\n"
  },
  {
    "path": "tool/lib/src/syntax/grammar.dart",
    "content": "import 'language.dart';\nimport 'rule.dart';\n\nfinal languages = {\n  \"c\": c,\n  \"c++\": cpp,\n  \"ebnf\": ebnf,\n  \"java\": java,\n  \"js\": js,\n  \"lisp\": lisp,\n  \"lox\": lox,\n  // TODO: This is just enough for the one line in \"scanning\". Do more if\n  // needed.\n  \"lua\": Language(rules: _commonRules),\n  \"python\": python,\n  \"ruby\": ruby,\n};\n\nfinal c = Language(\n  keywords: _cKeywords,\n  types: \"bool char double FILE int size_t uint16_t uint32_t uint64_t uint8_t \"\n      \"uintptr_t va_list void\",\n  rules: _cRules,\n);\n\nfinal cpp = Language(\n  keywords: _cKeywords,\n  types: \"vector string\",\n  rules: _cRules,\n);\n\nfinal ebnf = Language(\n  rules: [\n    // Color ALL_CAPS terminals like types to make them distinct.\n    Rule(r\"[A-Z][A-Z0-9_]+\", \"t\"),\n    ..._commonRules\n  ],\n);\n\nfinal java = Language(\n  keywords: \"abstract assert break case catch class const continue default do \"\n      \"else enum extends false final finally for goto if implements import \"\n      \"instanceof interface native new null package private protected public \"\n      \"return static strictfp super switch synchronized this throw throws \"\n      \"transient true try volatile while\",\n  types: \"boolean byte char double float int long short void\",\n  rules: [\n    // Import.\n    Rule.capture(r\"(import)(\\s+)(\\w+(?:\\.\\w+)*)(;)\", [\"k\", \"\", \"i\", \"\"]),\n    // Static import.\n    Rule.capture(r\"(import\\s+static?)(\\s+)(\\w+(?:\\.\\w+)*(?:\\.\\*)?)(;)\",\n        [\"k\", \"\", \"i\", \"\"]),\n    // Package.\n    Rule.capture(r\"(package)(\\s+)(\\w+(?:\\.\\w+)*)(;)\", [\"k\", \"\", \"i\", \"\"]),\n    // Annotation.\n    Rule(r\"@[a-zA-Z_][a-zA-Z0-9_]*\", \"a\"),\n\n    // ALL_CAPS constant names are colored like normal identifiers. 
We give\n    // them their own rule so that it matches before the capitalized type name\n    // rule.\n    Rule(r\"[A-Z][A-Z0-9_]+\\b\", \"i\"),\n\n    ..._commonRules,\n    _characterRule,\n  ],\n);\n\nfinal js = Language(\n  keywords: \"break case catch class const continue debugger default delete do \"\n      \"else export extends finally for function if import in instanceof let \"\n      \"new return super switch this throw try typeof var void while with yield\",\n  rules: _commonRules,\n);\n\nfinal lisp = Language(\n  rules: [\n    // TODO: Other punctuation characters.\n    Rule(r\"[a-zA-Z0-9_-]+\", \"i\"),\n  ],\n);\n\nfinal lox = Language(\n  keywords: \"and class else false fun for if nil or print return super this \"\n      \"true var while\",\n  rules: _commonRules,\n);\n\nfinal python = Language(\n  keywords: \"and as assert break class continue def del elif else except \"\n      \"exec finally for from global if import in is lambda not or pass \"\n      \"print raise range return try while with yield\",\n  rules: _commonRules,\n);\n\nfinal ruby = Language(\n  keywords: \"__LINE__ __ENCODING__ __FILE__ BEGIN END alias and begin break \"\n      \"case class def defined? 
do else elsif end ensure false for if in lambda \"\n      \"module next nil not or redo rescue retry return self super then true \"\n      \"undef unless until when while yield\",\n  rules: _commonRules,\n);\n\nfinal _cKeywords =\n    \"break case const continue default do else enum extern false for goto if \"\n    \"inline return sizeof static struct switch true typedef union while\";\n\nfinal _cRules = [\n  // Preprocessor with comment.\n  Rule.capture(r\"(#.*?)(//.*)\", [\"a\", \"c\"]),\n\n  // Preprocessor.\n  Rule(r\"#.*\", \"a\"),\n\n  // ALL_CAPS preprocessor macro use.\n  Rule(r\"[A-Z][A-Z0-9_]+\", \"a\"),\n\n  ..._commonRules,\n  _characterRule,\n];\n\n// TODO: Multi-character escapes?\nfinal _characterRule = Rule(r\"'\\\\?.'\", \"s\");\n\nfinal _commonRules = [\n  StringRule(),\n\n  Rule(r\"[0-9]+\\.[0-9]+f?\", \"n\"), // Float.\n  Rule(r\"0x[0-9a-fA-F]+\", \"n\"), // Hex integer.\n  Rule(r\"[0-9]+[Lu]?\", \"n\"), // Integer.\n\n  Rule(r\"//.*\", \"c\"), // Line comment.\n\n  // Capitalized type name.\n  Rule(r\"[A-Z][A-Za-z0-9_]*\", \"t\"),\n\n  // Other identifiers or keywords.\n  IdentifierRule(),\n];\n"
  },
  {
    "path": "tool/lib/src/syntax/highlighter.dart",
    "content": "import 'package:charcode/ascii.dart';\nimport 'package:string_scanner/string_scanner.dart';\n\nimport '../format.dart';\nimport '../term.dart' as term;\nimport 'grammar.dart' as grammar;\nimport 'language.dart';\n\nconst _maxLineLength = 67;\n\n/// Takes a string of source code and returns a block of HTML with spans for\n/// syntax highlighting.\n///\n/// Wraps the result in a <pre> tag with the given [preClass].\nString formatCode(String language, List<String> lines, Format format,\n    {String preClass, int indent = 0}) {\n  return Highlighter(language, format)._highlight(lines, preClass, indent);\n}\n\nvoid checkLineLength(String line) {\n  final asideCommentPattern = RegExp(r' +// \\[([-a-z0-9]+)\\]');\n  final asideWithCommentPattern = RegExp(r' +// (.+) \\[([-a-z0-9]+)\\]');\n\n  line = line.replaceAll(asideCommentPattern, '');\n  line = line.replaceAll(asideWithCommentPattern, '');\n\n  if (line.length <= _maxLineLength) return;\n\n  print(line.substring(0, _maxLineLength) +\n      term.red(line.substring(_maxLineLength)));\n}\n\nclass Highlighter {\n  final Format _format;\n  final StringBuffer _buffer = StringBuffer();\n  StringScanner scanner;\n  final Language language;\n\n  /// Whether we are in a multi-line macro started on a previous line.\n  bool _inMacro = false;\n\n  Highlighter(String language, this._format)\n      : language = grammar.languages[language] ??\n            (throw \"Unknown language '$language'.\");\n\n  String _highlight(List<String> lines, String preClass, int indent) {\n    if (!_format.isPrint) {\n      _buffer.write(\"<pre\");\n      if (preClass != null) _buffer.write(' class=\"$preClass\"');\n      _buffer.write(\">\");\n\n      // The HTML spec mandates that a leading newline after '<pre>' is ignored.\n      // https://html.spec.whatwg.org/#element-restrictions\n      // Some snippets deliberately start with a newline which needs to be\n      // preserved, so output an extra (discarded) newline in that case.\n 
     if (_format.isWeb && lines.first.isEmpty) _buffer.writeln();\n    }\n\n    for (var line in lines) {\n      _scanLine(line, indent);\n    }\n\n    if (!_format.isPrint) _buffer.write(\"</pre>\");\n\n    return _buffer.toString();\n  }\n\n  void _scanLine(String line, int indent) {\n    if (line.trim().isEmpty) {\n      _buffer.writeln();\n      return;\n    }\n\n    // If the entire code block is indented, remove that indentation from the\n    // code lines.\n    if (line.length > indent) line = line.substring(indent);\n\n    checkLineLength(line);\n\n    // Hackish. If the line ends with `\\`, then it is a multi-line macro\n    // definition and we want to highlight subsequent lines like preprocessor\n    // code too.\n    if (language == grammar.c && line.endsWith(\"\\\\\")) _inMacro = true;\n\n    if (_inMacro) {\n      writeToken(\"a\", line);\n    } else {\n      scanner = StringScanner(line);\n      while (!scanner.isDone) {\n        var found = false;\n        for (var rule in language.rules) {\n          if (rule.apply(this)) {\n            found = true;\n            break;\n          }\n        }\n\n        if (!found) _writeChar(scanner.readChar());\n      }\n    }\n\n    if (_inMacro && !line.endsWith(\"\\\\\")) _inMacro = false;\n\n    _buffer.writeln();\n  }\n\n  void writeToken(String type, [String text]) {\n    text ??= scanner.lastMatch[0];\n\n    if (_format.isPrint) {\n      // Only highlight keywords and comments in XML.\n      var tag = {\"k\": \"keyword\", \"c\": \"comment\"}[type];\n\n      if (tag != null) _buffer.write(\"<$tag>\");\n      writeText(text);\n      if (tag != null) _buffer.write(\"</$tag>\");\n    } else {\n      _buffer.write('<span class=\"$type\">');\n      writeText(text);\n      _buffer.write('</span>');\n    }\n  }\n\n  void writeText(String string) {\n    for (var i = 0; i < string.length; i++) {\n      _writeChar(string.codeUnitAt(i));\n    }\n  }\n\n  void _writeChar(int char) {\n    switch (char) {\n      case 
$less_than:\n        _buffer.write(\"&lt;\");\n        break;\n      case $greater_than:\n        _buffer.write(\"&gt;\");\n        break;\n      case $single_quote:\n        _buffer.write(\"&#39;\");\n        break;\n      case $double_quote:\n        _buffer.write(\"&quot;\");\n        break;\n      case $ampersand:\n        _buffer.write(\"&amp;\");\n        break;\n      default:\n        _buffer.writeCharCode(char);\n    }\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/syntax/language.dart",
    "content": "import 'rule.dart';\n\n/// Defines the syntax rules for a single programming language.\nclass Language {\n  final Map<String, String> words = {};\n  final List<Rule> rules;\n\n  Language({String keywords, String types, this.rules}) {\n    void keywordType(String wordList, String type) {\n      if (wordList == null) return;\n      for (var word in wordList.split(\" \")) {\n        words[word] = type;\n      }\n    }\n\n    keywordType(keywords, \"k\");\n    keywordType(types, \"t\");\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/syntax/rule.dart",
    "content": "import 'package:charcode/ascii.dart';\n\nimport 'highlighter.dart';\n\nabstract class Rule {\n  final RegExp pattern;\n\n  factory Rule(String pattern, String tokenType) =>\n      SimpleRule(pattern, tokenType);\n\n  factory Rule.capture(String pattern, List<String> tokenTypes) =>\n      CaptureRule(pattern, tokenTypes);\n\n  Rule._(String pattern) : pattern = RegExp(pattern);\n\n  bool apply(Highlighter highlighter) {\n    if (!highlighter.scanner.scan(pattern)) return false;\n    applyRule(highlighter);\n    return true;\n  }\n\n  void applyRule(Highlighter highlighter);\n}\n\n/// Parses a single regex and outputs the entire matched text as a single token\n/// with the given [tokenType].\nclass SimpleRule extends Rule {\n  final String tokenType;\n\n  SimpleRule(String pattern, this.tokenType) : super._(pattern);\n\n  void applyRule(Highlighter highlighter) {\n    highlighter.writeToken(tokenType);\n  }\n}\n\n/// Parses a single regex where each capture group has a corresponding token\n/// type. 
If the type is `\"\"` for some group, the matched string text is output\n/// as plain text.\nclass CaptureRule extends Rule {\n  final List<String> tokenTypes;\n\n  CaptureRule(String pattern, this.tokenTypes) : super._(pattern);\n\n  void applyRule(Highlighter highlighter) {\n    var match = highlighter.scanner.lastMatch;\n    for (var i = 0; i < tokenTypes.length; i++) {\n      var type = tokenTypes[i];\n      if (type.isNotEmpty) {\n        highlighter.writeToken(type, match[i + 1]);\n      } else {\n        highlighter.writeText(match[i + 1]);\n      }\n    }\n  }\n}\n\n/// Parses string literals and the escape codes inside them.\nclass StringRule extends Rule {\n  static final _escapePattern = RegExp(r\"\\\\.\");\n\n  StringRule() : super._('\"');\n\n  void applyRule(Highlighter highlighter) {\n    var scanner = highlighter.scanner;\n    var start = scanner.position - 1;\n\n    while (!scanner.isDone) {\n      if (scanner.scan(_escapePattern)) {\n        if (scanner.position > start) {\n          highlighter.writeToken(\n              \"s\", scanner.substring(start, scanner.position - 2));\n        }\n        highlighter.writeToken(\"e\");\n        start = scanner.position;\n      } else if (scanner.scanChar($double_quote)) {\n        highlighter.writeToken(\"s\", scanner.substring(start, scanner.position));\n        return;\n      } else {\n        scanner.position++;\n      }\n    }\n\n    // Error: Unterminated string.\n    highlighter.writeToken(\"err\", scanner.substring(start, scanner.position));\n  }\n}\n\n/// Parses an identifier and resolves keywords for their token type.\nclass IdentifierRule extends Rule {\n  IdentifierRule() : super._(r\"[a-zA-Z_][a-zA-Z0-9_]*\");\n\n  void applyRule(Highlighter highlighter) {\n    var identifier = highlighter.scanner.lastMatch[0];\n    var type = highlighter.language.words[identifier] ?? \"i\";\n    highlighter.writeToken(type);\n  }\n}\n"
  },
  {
    "path": "tool/lib/src/term.dart",
    "content": "/// Utilities for printing to the terminal.\nimport 'dart:io';\n\nfinal _cyan = _ansi('\\u001b[36m');\nfinal _gray = _ansi('\\u001b[1;30m');\nfinal _green = _ansi('\\u001b[32m');\nfinal _magenta = _ansi('\\u001b[35m');\nfinal _pink = _ansi('\\u001b[91m');\nfinal _red = _ansi('\\u001b[31m');\nfinal _yellow = _ansi('\\u001b[33m');\nfinal _none = _ansi('\\u001b[0m');\nfinal _resetColor = _ansi('\\u001b[39m');\n\nString cyan(Object message) => \"$_cyan$message$_none\";\nString gray(Object message) => \"$_gray$message$_none\";\nString green(Object message) => \"$_green$message$_resetColor\";\nString magenta(Object message) => \"$_magenta$message$_resetColor\";\nString pink(Object message) => \"$_pink$message$_resetColor\";\nString red(Object message) => \"$_red$message$_resetColor\";\nString yellow(Object message) => \"$_yellow$message$_resetColor\";\n\nvoid clearLine() {\n  if (_allowAnsi) {\n    stdout.write(\"\\u001b[2K\\r\");\n  } else {\n    print(\"\");\n  }\n}\n\nvoid writeLine([String line]) {\n  clearLine();\n  if (line != null) stdout.write(line);\n}\n\nbool get _allowAnsi =>\n    !Platform.isWindows && stdioType(stdout) == StdioType.terminal;\n\nString _ansi(String special, [String fallback = '']) =>\n    _allowAnsi ? special : fallback;\n"
  },
  {
    "path": "tool/lib/src/text.dart",
    "content": "import 'dart:convert';\nimport 'dart:math' as math;\n\n/// Punctuation characters removed from file names and anchors.\nfinal _punctuation = RegExp(r'[,.?!:' \"'\" '/\"()]');\n\nfinal _whitespace = RegExp(r\"\\s+\");\n\n/// Converts [text] to a string suitable for use as a file or anchor name.\nString toFileName(String text) {\n  if (text == \"Crafting Interpreters\") return \"index\";\n  if (text == \"Table of Contents\") return \"contents\";\n\n  // Hack. The introduction has a *subheader* named \"Challenges\" distinct from\n  // the challenges section. This function here is also used to generate the\n  // anchor names for the links, so handle that one specially so it doesn't\n  // collide with the real \"Challenges\" section.\n  if (text == \"Challenges\") return \"challenges_\";\n\n  return text.toLowerCase().replaceAll(\" \", \"-\").replaceAll(_punctuation, \"\");\n}\n\n/// Returns the length of the longest line in [lines], or [longest], whichever\n/// is longer.\nint longestLine(int longest, Iterable<String> lines) {\n  for (var line in lines) {\n    longest = math.max(longest, line.length);\n  }\n  return longest;\n}\n\nString pluralize<T>(Iterable<T> sequence) {\n  if (sequence.length == 1) return \"\";\n  return \"s\";\n}\n\nextension IntExtensions on int {\n  /// Converts this number to Roman numerals. Only supports values below 10.\n  String get roman {\n    if (this <= 3) return \"I\" * this;\n    if (this == 4) return \"IV\";\n    if (this == 9) return \"IX\";\n    if (this < 10) return \"V\" + \"I\" * (this - 5);\n\n    throw ArgumentError(\"Can't convert $this to Roman.\");\n  }\n\n  /// Formats the number with a comma as the thousands separator. Only handles\n  /// values below one million.\n  String get withCommas {\n    if (this >= 1000) {\n      // Zero-pad the remainder so that 1005 becomes \"1,005\", not \"1,5\".\n      return \"${this ~/ 1000},${(this % 1000).toString().padLeft(3, '0')}\";\n    }\n    return toString();\n  }\n}\n\nextension StringExtensions on String {\n  /// Use nicer HTML entities and special characters.\n  String get pretty {\n    return this\n        .replaceAll(\"à\", \"&agrave;\")\n        .replaceAll(\"ï\", \"&iuml;\")\n        .replaceAll(\"ø\", \"&oslash;\")\n        
.replaceAll(\"æ\", \"&aelig;\");\n  }\n\n  String get escapeHtml =>\n      const HtmlEscape(HtmlEscapeMode.attribute).convert(this);\n\n  int get wordCount => split(_whitespace).length;\n\n  /// Removes a single newline from the end of the string.\n  String trimTrailingNewline() {\n    if (endsWith(\"\\n\")) return substring(0, length - 1);\n    return this;\n  }\n}\n"
  },
  {
    "path": "tool/pubspec.yaml",
    "content": "name: tool\npublish_to: none\nenvironment:\n  sdk: '>2.11.0 <3.0.0'\ndependencies:\n  args: ^1.6.0\n  charcode: ^1.1.3\n  glob: ^1.2.0\n  image: ^2.1.19\n  markdown: ^2.1.3\n  mime_type: ^0.3.0\n  mustache_template: ^1.0.0\n  path: ^1.7.0\n  pool: ^1.4.0\n  sass: ^1.26.5\n  shelf: ^0.7.5\n  string_scanner: ^1.0.5\n"
  },
  {
    "path": "util/c.make",
    "content": "# Makefile for building a single configuration of the C interpreter. It expects\n# variables to be passed in for:\n#\n# MODE         \"debug\" or \"release\".\n# NAME         Name of the output executable (and object file directory).\n# SOURCE_DIR   Directory where source files and headers are found.\n\nifeq ($(CPP),true)\n\t# Ideally, we'd add -pedantic-errors, but the use of designated initializers\n\t# means clox relies on some GCC/Clang extensions to compile as C++.\n\tCFLAGS := -std=c++11\n\tC_LANG := -x c++\nelse\n\tCFLAGS := -std=c99\nendif\n\nCFLAGS += -Wall -Wextra -Werror -Wno-unused-parameter\n\n# If we're building at a point in the middle of a chapter, don't fail if there\n# are functions that aren't used yet.\nifeq ($(SNIPPET),true)\n\tCFLAGS += -Wno-unused-function\nendif\n\n# Mode configuration.\nifeq ($(MODE),debug)\n\tCFLAGS += -O0 -DDEBUG -g\n\tBUILD_DIR := build/debug\nelse\n\tCFLAGS += -O3 -flto\n\tBUILD_DIR := build/release\nendif\n\n# Files.\nHEADERS := $(wildcard $(SOURCE_DIR)/*.h)\nSOURCES := $(wildcard $(SOURCE_DIR)/*.c)\nOBJECTS := $(addprefix $(BUILD_DIR)/$(NAME)/, $(notdir $(SOURCES:.c=.o)))\n\n# Targets ---------------------------------------------------------------------\n\n# Link the interpreter.\nbuild/$(NAME): $(OBJECTS)\n\t@ printf \"%8s %-40s %s\\n\" $(CC) $@ \"$(CFLAGS)\"\n\t@ mkdir -p build\n\t@ $(CC) $(CFLAGS) $^ -o $@\n\n# Compile object files.\n$(BUILD_DIR)/$(NAME)/%.o: $(SOURCE_DIR)/%.c $(HEADERS)\n\t@ printf \"%8s %-40s %s\\n\" $(CC) $< \"$(CFLAGS)\"\n\t@ mkdir -p $(BUILD_DIR)/$(NAME)\n\t@ $(CC) -c $(C_LANG) $(CFLAGS) -o $@ $<\n\n.PHONY: default\n"
  },
  {
    "path": "util/intellij/chap04_read.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$/../../gen/chap04_framework\">\n      <sourceFolder url=\"file://$MODULE_DIR$/../../gen/chap04_framework\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/chap05_scanning.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$/../../gen/chap05_scanning\">\n      <sourceFolder url=\"file://$MODULE_DIR$/../../gen/chap05_scanning\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/chap06_representing.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$/../../gen/chap06_representing\">\n      <sourceFolder url=\"file://$MODULE_DIR$/../../gen/chap06_representing\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/chap07_parsing.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$/../../gen/chap07_parsing\">\n      <sourceFolder url=\"file://$MODULE_DIR$/../../gen/chap07_parsing\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/chap08_evaluating.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$/../../gen/chap08_evaluating\">\n      <sourceFolder url=\"file://$MODULE_DIR$/../../gen/chap08_evaluating\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/chap09_statements.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$/../../gen/chap09_statements\">\n      <sourceFolder url=\"file://$MODULE_DIR$/../../gen/chap09_statements\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/chap10_control.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$/../../gen/chap10_control\">\n      <sourceFolder url=\"file://$MODULE_DIR$/../../gen/chap10_control\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/chap11_functions.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$/../../gen/chap11_functions\">\n      <sourceFolder url=\"file://$MODULE_DIR$/../../gen/chap11_functions\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/chap12_resolving.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$/../../gen/chap12_resolving\">\n      <sourceFolder url=\"file://$MODULE_DIR$/../../gen/chap12_resolving\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/chap13_classes.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$/../../gen/chap13_classes\">\n      <sourceFolder url=\"file://$MODULE_DIR$/../../gen/chap13_classes\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/chap14_inheritance.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$/../../gen/chap14_inheritance\">\n      <sourceFolder url=\"file://$MODULE_DIR$/../../gen/chap14_inheritance\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/intellij.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$\">\n      <sourceFolder url=\"file://$MODULE_DIR$/src\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/jlox.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" LANGUAGE_LEVEL=\"JDK_1_7\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$/../../java\">\n      <sourceFolder url=\"file://$MODULE_DIR$/../../java\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/section_test.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"false\">\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/intellij/snippet_test.iml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<module type=\"JAVA_MODULE\" version=\"4\">\n  <component name=\"NewModuleRootManager\" inherit-compiler-output=\"true\">\n    <exclude-output />\n    <content url=\"file://$MODULE_DIR$/../../gen/snippet_test\">\n      <sourceFolder url=\"file://$MODULE_DIR$/../../gen/snippet_test\" isTestSource=\"false\" />\n    </content>\n    <orderEntry type=\"inheritedJdk\" />\n    <orderEntry type=\"sourceFolder\" forTests=\"false\" />\n  </component>\n</module>"
  },
  {
    "path": "util/java.make",
    "content": "# Makefile for building a single directory of Java source files. It requires\n# a DIR variable to be set.\n\nBUILD_DIR := build\n\nSOURCES := $(wildcard $(DIR)/com/craftinginterpreters/$(PACKAGE)/*.java)\nCLASSES := $(addprefix $(BUILD_DIR)/, $(SOURCES:.java=.class))\n\nJAVA_OPTIONS := -Werror\n\ndefault: $(CLASSES)\n\t@: # Don't show \"Nothing to be done\" output.\n\n# Compile a single .java file to .class.\n$(BUILD_DIR)/$(DIR)/%.class: $(DIR)/%.java\n\t@ mkdir -p $(BUILD_DIR)/$(DIR)\n\t@ javac -cp $(DIR) -d $(BUILD_DIR)/$(DIR) $(JAVA_OPTIONS) -implicit:none $<\n\t@ printf \"%8s %-60s %s\\n\" javac $< \"$(JAVA_OPTIONS)\"\n\n.PHONY: default\n"
  }
]