Teaching a Machine to Understand: Inside a Custom Programming Language


Following Thorsten Ball's Writing An Interpreter In Go

I've always been drawn to books as a way to learn, and at some point I found myself wanting a serious project involving parsers — something that wasn't going to feel like a toy, something that would demand real focus over a long period of time. So I did what felt natural: I searched for a book about parsers and Go. Writing An Interpreter In Go by Thorsten Ball was the first result that came up. I picked it up, and about six months later I had built a fully functional programming language from scratch — no libraries, no shortcuts, just Go and a much deeper understanding of how interpreters work under the hood. This is the story of that project: dyxgou/interpreter.

The Monkey Programming Language

The language implemented here is Monkey, a small, expressive language designed by Thorsten Ball for the book. It supports a rich set of features:

  1. Integers and booleans
  2. String manipulation
  3. Variables and bindings (let statements)
  4. First-class and higher-order functions
  5. Closures
  6. Arrays and hash maps
  7. A built-in library (e.g., len, print, first, last, push, pop)

Here's a taste of what Monkey code looks like:

Monkey Program Example
let fibonacci = fn(x) {
  if (x == 0) {
    0
  } else {
    if (x == 1) {
      return 1;
    } else {
      fibonacci(x - 1) + fibonacci(x - 2);
    }
  }
};

print(fibonacci(10)) // 55

You can find more examples in the example/ folder of the repository.

Architecture: How the Interpreter Works

The interpreter is structured as a classic pipeline: source code enters one end, and computed results come out the other. There are four core components, each with a clearly defined responsibility, followed by a REPL that ties them together and a test suite that covers them.

  1. The Lexer (src/lexer)

    The Lexer is the very first stage of interpretation. Its job is to read raw source code and convert it into a stream of tokens. A token is a meaningful unit of the language: a keyword like let or fn, an operator like + or ==, an integer, an identifier, or a delimiter like { and }. For example, the code let x = 5 + 3; gets transformed into a sequence like:

    Let statement Tokens
    [LET] [IDENT:"x"] [ASSIGN] [INT:5] [PLUS] [INT:3] [SEMICOLON]

    The Lexer doesn't care whether the code makes sense — that's someone else's problem. It only cares about recognizing and categorizing individual pieces of text.
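
    To make the scanning step concrete, here is a minimal sketch of a lexer that handles only the tokens in the example above. The names TokenType, Token, and Tokenize are assumptions made for illustration and may not match the repository's src/lexer package; a full Monkey lexer also recognizes more operators, keywords, and strings.

```go
package main

import "fmt"

// TokenType and Token are hypothetical names used for this sketch.
type TokenType string

const (
	LET       TokenType = "LET"
	IDENT     TokenType = "IDENT"
	INT       TokenType = "INT"
	ASSIGN    TokenType = "ASSIGN"
	PLUS      TokenType = "PLUS"
	SEMICOLON TokenType = "SEMICOLON"
	ILLEGAL   TokenType = "ILLEGAL"
)

type Token struct {
	Type    TokenType
	Literal string
}

func isLetter(c byte) bool { return c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z' || c == '_' }
func isDigit(c byte) bool  { return c >= '0' && c <= '9' }

// Tokenize scans the input once, left to right, and returns its tokens.
func Tokenize(input string) []Token {
	var tokens []Token
	i := 0
	for i < len(input) {
		c := input[i]
		switch {
		case c == ' ' || c == '\t' || c == '\n' || c == '\r':
			i++ // whitespace only separates tokens
		case c == '=':
			tokens = append(tokens, Token{ASSIGN, "="})
			i++
		case c == '+':
			tokens = append(tokens, Token{PLUS, "+"})
			i++
		case c == ';':
			tokens = append(tokens, Token{SEMICOLON, ";"})
			i++
		case isLetter(c):
			// Read the whole word, then decide: keyword or identifier.
			start := i
			for i < len(input) && (isLetter(input[i]) || isDigit(input[i])) {
				i++
			}
			word := input[start:i]
			if word == "let" {
				tokens = append(tokens, Token{LET, word})
			} else {
				tokens = append(tokens, Token{IDENT, word})
			}
		case isDigit(c):
			start := i
			for i < len(input) && isDigit(input[i]) {
				i++
			}
			tokens = append(tokens, Token{INT, input[start:i]})
		default:
			// Unknown characters become ILLEGAL tokens; the lexer never
			// judges whether the program as a whole makes sense.
			tokens = append(tokens, Token{ILLEGAL, string(c)})
			i++
		}
	}
	return tokens
}

func main() {
	for _, tok := range Tokenize("let x = 5 + 3;") {
		fmt.Printf("[%s:%q] ", tok.Type, tok.Literal)
	}
	fmt.Println()
}
```

    Run on let x = 5 + 3; this yields seven tokens in the same order as the sequence shown earlier. Note that an input such as let = = ; also lexes without complaint, which is exactly the point: validity is the Parser's concern.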

  2. The AST (src/ast)

    Before the Parser can do anything useful, we need a data structure to represent the shape of the program. That's the Abstract Syntax Tree (AST). It's a tree of nodes where each node represents a syntactic construct: a Statement, an Expression, a Function, an infix operation, and so on.

    For example, the expression 5 + 3 * 2 doesn't get treated as a flat list — it gets structured as a tree that encodes operator precedence:

    AST for 5 + 3 * 2
    InfixExpression(+)
    ├── IntegerLiteral(5)
    └── InfixExpression(*)
        ├── IntegerLiteral(3)
        └── IntegerLiteral(2)

    The AST package defines all the Node types and ensures every node in the tree implements a common interface, making it easy for the Parser and Evaluator to work with them uniformly.
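
    As a rough illustration of the common-interface idea, the sketch below defines two node types and builds the tree for 5 + 3 * 2 by hand. The type and method names follow the node names used above, but their exact fields and methods are assumptions and may differ from the repository's src/ast package.

```go
package main

import "fmt"

// Node is the common interface every AST node implements in this sketch.
type Node interface {
	String() string
}

// Expression marks nodes that produce a value.
type Expression interface {
	Node
	expressionNode()
}

// IntegerLiteral is a leaf node holding a number.
type IntegerLiteral struct{ Value int64 }

func (il *IntegerLiteral) expressionNode() {}
func (il *IntegerLiteral) String() string  { return fmt.Sprintf("%d", il.Value) }

// InfixExpression is an inner node with an operator and two operands.
type InfixExpression struct {
	Left     Expression
	Operator string
	Right    Expression
}

func (ie *InfixExpression) expressionNode() {}
func (ie *InfixExpression) String() string {
	// Parentheses make the tree's grouping visible in the output.
	return "(" + ie.Left.String() + " " + ie.Operator + " " + ie.Right.String() + ")"
}

func main() {
	// The tree for 5 + 3 * 2: the multiplication sits deeper in the
	// tree, so it is evaluated before the addition.
	tree := &InfixExpression{
		Left:     &IntegerLiteral{Value: 5},
		Operator: "+",
		Right: &InfixExpression{
			Left:     &IntegerLiteral{Value: 3},
			Operator: "*",
			Right:    &IntegerLiteral{Value: 2},
		},
	}
	fmt.Println(tree.String()) // (5 + (3 * 2))
}
```

    Because both node types satisfy Expression, an InfixExpression can hold either one as an operand without knowing which it is.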

  3. The Parser (src/parser)

    The Parser takes the Token stream from the Lexer and builds the AST. This is the most intellectually interesting component of the whole project.

    The implementation follows a Pratt parser (also called a top-down operator precedence parser), a technique the book explains clearly. The key insight of a Pratt parser is that each token type can have a prefix parse function, an infix parse function, or both, along with a precedence. When the parser encounters a token, it calls the matching parse function, which returns an AST node; comparing precedences decides how far each expression extends. This handles operator precedence naturally, without a separate grammar rule for every precedence level.

    The parser also performs syntax validation — if the token stream doesn't form a valid Monkey program, the parser reports errors rather than silently producing garbage.
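
    The core loop of a Pratt parser is short enough to sketch in full. The version below is a simplified illustration, not the repository's code: it takes whitespace-separated tokens as plain strings and returns a parenthesized string instead of AST nodes, and every name in it (Parser, prefixFns, infixFns, Parse) is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Precedence levels, lowest to highest.
const (
	LOWEST  = iota
	SUM     // + -
	PRODUCT // * /
	PREFIX  // -x
)

var precedences = map[string]int{"+": SUM, "-": SUM, "*": PRODUCT, "/": PRODUCT}

type Parser struct {
	tokens []string
	pos    int
	errors []string
}

type (
	prefixFn func(p *Parser) string
	infixFn  func(p *Parser, left string) string
)

// One prefix and one infix table, keyed by token kind.
var (
	prefixFns map[string]prefixFn
	infixFns  map[string]infixFn
)

func init() {
	prefixFns = map[string]prefixFn{
		"INT": func(p *Parser) string { return p.next() },
		"-":   func(p *Parser) string { p.next(); return "(-" + p.parseExpression(PREFIX) + ")" },
		"(": func(p *Parser) string {
			p.next()
			inner := p.parseExpression(LOWEST)
			if p.next() != ")" {
				p.errors = append(p.errors, "expected )")
			}
			return inner
		},
	}
	binary := func(p *Parser, left string) string {
		op := p.next()
		right := p.parseExpression(precedences[op])
		return "(" + left + " " + op + " " + right + ")"
	}
	infixFns = map[string]infixFn{"+": binary, "-": binary, "*": binary, "/": binary}
}

// peek returns the current token, or "" at the end of input.
func (p *Parser) peek() string {
	if p.pos < len(p.tokens) {
		return p.tokens[p.pos]
	}
	return ""
}

func (p *Parser) next() string { tok := p.peek(); p.pos++; return tok }

// kind maps a token to its table key: all integers share the key "INT".
func kind(tok string) string {
	if tok != "" && tok[0] >= '0' && tok[0] <= '9' {
		return "INT"
	}
	return tok
}

// parseExpression keeps extending `left` while the next operator binds
// more tightly than the precedence the caller passed in.
func (p *Parser) parseExpression(precedence int) string {
	prefix, ok := prefixFns[kind(p.peek())]
	if !ok {
		p.errors = append(p.errors, fmt.Sprintf("no prefix parse function for %q", p.peek()))
		return "?"
	}
	left := prefix(p)
	for {
		infix, ok := infixFns[p.peek()]
		if !ok || precedence >= precedences[p.peek()] {
			return left
		}
		left = infix(p, left)
	}
}

// Parse returns the grouped expression plus any syntax errors found.
func Parse(input string) (string, []string) {
	p := &Parser{tokens: strings.Fields(input)}
	result := p.parseExpression(LOWEST)
	if p.peek() != "" {
		p.errors = append(p.errors, fmt.Sprintf("unexpected token %q", p.peek()))
	}
	return result, p.errors
}

func main() {
	fmt.Println(Parse("5 + 3 * 2"))     // (5 + (3 * 2)) []
	fmt.Println(Parse("( 5 + 3 ) * 2")) // ((5 + 3) * 2) []
}
```

    The single comparison inside parseExpression is where precedence is decided: in 5 + 3 * 2 the right-hand side of + keeps absorbing the higher-precedence *, while in 5 * 3 + 2 it stops at the lower-precedence +. Incomplete input such as 5 + produces an entry in the error list rather than a silent result.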

  4. The Evaluator (src/evaluator)

    The Evaluator is where the magic happens. It walks the AST recursively — a tree-walking interpreter — and computes the value of every Node it visits.

    1. An IntegerLiteral node evaluates to its numeric value.
    2. An InfixExpression node evaluates both sides and applies the operator.
    3. An IfExpression evaluates the condition, then takes the appropriate branch.
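    4. A FunctionLiteral evaluates to a function object that captures the environment in which it was defined, which is what makes closures possible.
    5. A CallExpression evaluates its arguments, creates a new environment that extends the function's captured one, binds the arguments to the parameters, and evaluates the function body.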

    The Evaluator also manages the Object system — every value in Monkey (integers, strings, booleans, arrays, hashes, functions, null) is represented as a Go struct implementing a common Object interface. This makes it easy to pass values around and pattern-match on their types.
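
    The rules above can be sketched as a single recursive function. This is a deliberately reduced illustration under several assumptions: the node and object types are redefined locally with hypothetical fields, only integers and functions exist, and error handling is minimal. The repository's src/evaluator package covers far more cases.

```go
package main

import "fmt"

// Simplified AST nodes (hypothetical shapes for this sketch).
type Node interface{}

type IntegerLiteral struct{ Value int64 }
type Identifier struct{ Name string }
type InfixExpression struct {
	Left     Node
	Operator string
	Right    Node
}
type FunctionLiteral struct {
	Params []string
	Body   Node
}
type CallExpression struct {
	Function Node
	Args     []Node
}

// Object is the common interface for runtime values.
type Object interface{ Inspect() string }

type Integer struct{ Value int64 }
type Error struct{ Message string }
type Function struct {
	Params []string
	Body   Node
	Env    *Environment // the environment captured at definition time
}

func (i *Integer) Inspect() string  { return fmt.Sprint(i.Value) }
func (e *Error) Inspect() string    { return "ERROR: " + e.Message }
func (f *Function) Inspect() string { return "fn(...)" }

// Environment maps names to values and falls back to an outer scope.
type Environment struct {
	store map[string]Object
	outer *Environment
}

func NewEnv(outer *Environment) *Environment {
	return &Environment{store: map[string]Object{}, outer: outer}
}

func (e *Environment) Get(name string) (Object, bool) {
	obj, ok := e.store[name]
	if !ok && e.outer != nil {
		return e.outer.Get(name)
	}
	return obj, ok
}

func (e *Environment) Set(name string, val Object) { e.store[name] = val }

// Eval walks the tree recursively and returns the value of each node.
func Eval(node Node, env *Environment) Object {
	switch n := node.(type) {
	case *IntegerLiteral:
		return &Integer{Value: n.Value}
	case *Identifier:
		if val, ok := env.Get(n.Name); ok {
			return val
		}
		return &Error{"identifier not found: " + n.Name}
	case *InfixExpression:
		left, lok := Eval(n.Left, env).(*Integer)
		right, rok := Eval(n.Right, env).(*Integer)
		if !lok || !rok {
			return &Error{"operands must be integers"}
		}
		switch n.Operator {
		case "+":
			return &Integer{left.Value + right.Value}
		case "*":
			return &Integer{left.Value * right.Value}
		}
		return &Error{"unknown operator: " + n.Operator}
	case *FunctionLiteral:
		// Capturing env here is what turns a function into a closure.
		return &Function{Params: n.Params, Body: n.Body, Env: env}
	case *CallExpression:
		fn, ok := Eval(n.Function, env).(*Function)
		if !ok {
			return &Error{"not a function"}
		}
		if len(n.Args) != len(fn.Params) {
			return &Error{"wrong number of arguments"}
		}
		callEnv := NewEnv(fn.Env) // extend the captured environment
		for i, param := range fn.Params {
			callEnv.Set(param, Eval(n.Args[i], env))
		}
		return Eval(fn.Body, callEnv)
	}
	return &Error{fmt.Sprintf("unknown node type %T", node)}
}

func main() {
	// Equivalent to: let addTwo = fn(x) { fn(y) { x + y } }(2); addTwo(3)
	env := NewEnv(nil)
	newAdder := &FunctionLiteral{Params: []string{"x"}, Body: &FunctionLiteral{
		Params: []string{"y"},
		Body:   &InfixExpression{&Identifier{"x"}, "+", &Identifier{"y"}},
	}}
	env.Set("addTwo", Eval(&CallExpression{Function: newAdder, Args: []Node{&IntegerLiteral{2}}}, env))
	result := Eval(&CallExpression{Function: &Identifier{"addTwo"}, Args: []Node{&IntegerLiteral{3}}}, env)
	fmt.Println(result.Inspect()) // 5
}
```

    The inner function still sees x after the outer call has returned because its Env field points at the environment created for that call. One simplification to be aware of: an error inside an operand is reported here as a generic operand error rather than being propagated unchanged.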

  5. The REPL (src/repl)

    All of this comes together in a Read-Eval-Print Loop. The REPL reads a line of code, runs it through the full pipeline (lexer → parser → evaluator), and prints the result. You can launch it with a single make command:

    Launch the REPL of the Monkey Programming Language
    git clone https://github.com/dyxgou/interpreter
    cd interpreter
    make

    Or execute a Monkey source file directly:

    Execute a File
    make execute FILE=example/fibonacci.lang
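
    The loop itself is simple. Here is a minimal sketch of a REPL in Go, assuming a hypothetical eval callback that stands in for the lexer, parser, and evaluator; the function names Start and RunScript and the ">> " prompt are illustrative choices, not taken from the repository's src/repl package.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

const prompt = ">> "

// Start reads lines from in until EOF, passes each non-empty line to
// eval, and writes the result to out.
func Start(in io.Reader, out io.Writer, eval func(line string) string) {
	scanner := bufio.NewScanner(in)
	for {
		fmt.Fprint(out, prompt)
		if !scanner.Scan() {
			return // EOF or read error ends the session
		}
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue
		}
		fmt.Fprintln(out, eval(line))
	}
}

// RunScript drives the same loop with an in-memory script, which is
// convenient for tests.
func RunScript(script string, eval func(line string) string) string {
	var out strings.Builder
	Start(strings.NewReader(script), &out, eval)
	return out.String()
}

func main() {
	// Placeholder for lexer -> parser -> evaluator: it only echoes input.
	echo := func(line string) string { return "you typed: " + line }
	Start(os.Stdin, os.Stdout, echo)
}
```

    Taking an io.Reader and io.Writer instead of using os.Stdin and os.Stdout directly keeps the loop testable without a terminal.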

  6. Testing

    This project was built following Test-Driven Development (TDD), as guided by the book itself. Every component was written test-first: before a feature was implemented, a test describing its expected behavior was written. The result is an extensive test suite in which every major component (the Lexer, Parser, AST, and Evaluator) has dedicated tests that verify correct behavior across a wide range of inputs, including edge cases and error conditions.

    Run the Test Suite
    go test ./... -v

  1. Repository: github.com/dyxgou/interpreter
  2. Book: Writing An Interpreter In Go by Thorsten Ball