Notes on Crafting Interpreters: Welcome
Table of Contents
Introduction
What exactly are dynamic arrays, hash tables and bytecodes?
Challenges
6 DSLs are used for the production of this book. What are they?
There’s a Makefile. Markdown is definitely used, as are XML, Yaml, SCSS and CSS. I think hard-coded HTML makes an appearance too.
TODO: Write hello world in Java.
C pointers. Define doubly linked list of heap-allocated strings. Write functions to insert, find and delete items from it.
Above my current abilities. Geeks For Geeks appear to have a decent explanation which I maybe can return to once I get to part 3 of the book.
A map of the territory
scanning = lexing = lexical analysis
Scanner = lexer takes characters and chunks them into tokens.
Parser takes tokens and builds grammar using tree structures called (abstract) syntax trees, ASTs or just trees. Informs of syntax errors.
Binding = resolution. Identifier. Scope. For what variable refers to. Leads to type error, if statically typed.
All this semantic insight has to go somehwere, either as attributes in the syntax tree, stored away in a symbol table or by transforming the tree completely.
So far all front end. Next: middle end.
Code stored in intermediate representation, where it can then be optimised by constant folding for example before then generating code = code gen.
Now the back end. What should the end code be? p-code = bytecode? What does runtime look like? Do you write loads of native compilers or a virtual machine?
Different implementations include single-pass compilers, tree-walk interpreters, source-to-source compilers = transcompilers = transpiler, just-in-time (JIT) compilation.
Challenges
Look around an open source implementation of a language for the scanner and parser.
The ones for CPython are easy enough to find.
What reasons are there to not JIT?
Start-up delay, memory overload, security vulnerabilities.
Why do most LISP implementations also contain an interpreter?
Culture. And maybe because it is simple to do?
The lox language
Dynamically typed. 2 methods for memory management: reference counting and (tracing) garbage collection (GC). Lox uses GC.
Data types: Booleans, doubles, strings, nil.
Expression statements, which end in ;, can be wrapped up in a block with
{}.
Nystrom is particular on arguments vs parameters:
argument: the value passed to the function,
parameter: the variable holding the value of the argument.
Classes vs prototypes. In class-based languages: instances and classes. In prototype-based languages these are merged, there are only objects.
Inheritance done with <, which creates a derived class = subclass of the
base class = superclass.
Challenges
TODO: Write some sample lox programs and run them. Try to come up with edge cases.
List open questions on language’s syntax and semantics.
Escape characters for strings? Numbers other than doubles such as integers?
What missing features are most annoying?
Lack of proper data structures, e.g. lists, dicts and sets. Lack of comments.
Question for self
Does Robert Nystrom write about how we wrote the book?
Yes he does. Main take away is to write every day. Main two articles describing the process of writing are here: https://journal.stuffwithstuff.com/2014/04/22/zero-to-95688-how-i-wrote-game-programming-patterns/ and https://journal.stuffwithstuff.com/2020/04/05/crafting-crafting-interpreters/