Andrey Listopadov

Thoughts on Crafting Interpreters - Part 2

@Programming @Language-Design Zig ~113 minutes read

This is a second post about the Crafting Interpreters book by Robert Nystrom.

In the first post, I’ve described my experience with the first half of the book, and the challenges of using a different language with different idioms and practices. This post will be no different, although I have a bit more to discuss, and the contents aren’t actually ~2-year-old weak impressions and remembrances. That being said, this is a second iteration both on this post and on the book so a lot of frustration I had with the book the first time I went through is coming from memory. Well, these memories aren’t as old, as I did the project more recently, but still, I’ll point out when the complaints come from the old version or the new one.

Same as before, I decided to choose a different implementation language, rather than using the one suggested by the author. Please note, that this decision isn’t done for the sake of just using a different language - I have pretty practical reasons to do that. Even more so than in the first half of the book.

For the bytecode VM, Robert chooses the C programming language. It is again, a reasonable choice, and I agree, that for a book on a language runtime, there are probably no other valid choices. For a book, that is.

I’m not writing a book, however, so I felt that C is a rather poor choice for an actual implementation. Not that I’m suggesting that C is a bad language for this kind of task - it has several advantages over most others, but in general, C isn’t the best pick today. C demands a lot of knowledge of how computers work, and it is the least forgiving language when it comes to fixing bugs down the line. We’ve been writing C since the 1970s and haven’t perfected it by any means. Memory errors are everywhere, programs segfault, and a lot of projects dedicated to finding errors in C code are basically required for any modern work done in C because humans are, in fact, humans.

If someone claims that they know C, they are either lying or are extremely rare cases of people who actually know what they’re doing. I don’t think I have ever publicly said that I know C, but I did work with C professionally in the bare metal context, and it’s a whole different story to working with C in the context of the operating system. So even there we can’t say that we really know C if we only use it in one of the many possible contexts where you can use C. Well, at least I worked with ASM as well, so when I saw the C code I could roughly see what ASM would correspond to it.

Yes, C is still good for teaching the concepts behind data structures, used for language implementation, as you do have to implement them for yourself a lot. The context that C is used in often requires this, because libraries might not be available for your specific use case. So implementing stuff in C, especially given that this language just steps aside and says: “go”, not holding your hand, not preventing you from doing crazy shit like many modern languages often do. Oh, so you want to create an array, fill it with random numbers, cast it to the function, and run your machine code you’ve just put in - be my guest, that’s what we call meta-programming! As ridiculous as it is, I have done that in the past.

But, I don’t want to write in C anymore for many reasons. So, what language did I pick?

Well, since you have seen the tags for this post, you already know that it is the Zig programming language. However, I have considered a number of other languages. Namely Nim, Hare, Zig, Ada, and even OCaml.

I have already tried Rust for this kind of project before, so I knew that I didn’t want to do it again with the same language. I realized that I don’t like Rust in general. The idea is nice on paper, but I believe it’s overly fixated. And I have my thoughts on the Rust community, so this language isn’t really for me.

But I wanted a language that can be considered a system language, much like C is. And while Rust is indeed used for Linux kernel, I, myself, can’t consider it a system programming language, much like I can’t consider C++ to be one, even though you can write a system in it, for example Haiku. You can write systems in all kinds of languages, but I think that a system programming language has to have some pretty specific features and these are usually related to the ability to understand code, not to write one. Rust and many others failed this for me.

So, I quickly decided that I didn’t want to use Nim because of its Python-like syntax which I don’t like at all. It may be a decent language, but the other point I don’t like about it in comparison to the other is that it is transpiled to C and then compiled, correct me if I’m wrong and things have changed since when it was like that. I don’t think that this approach is bad, but I think that C is a poor target to compile to, and I can’t be sure that everything I write clearly translates to C without weird shenanigans that can have weird or unexpected behavior. Although more languages do that, namely ECL and Gambit Scheme, it’s not impossible to do this well. Sadly I think Scheme is a better language than Nim.

OCaml is often brought up when discussing a language for writing compilers. There are valid reasons to use it, but I’m quite unfamiliar with the ML family of languages, so I decided to not use it. Also, OCaml is often used for its ability to work with LLVM, which isn’t what we’re doing in the book.

Hare is a relatively new language by Drew DeVault. From what I’ve seen about this language it seems interesting, and it has a nice article listing the advantages over C. However, the documentation was almost non-existent at the moment I started this project, and I didn’t want to dive into a language that was in so early stages of its development. I have vague plans on maybe doing a Forth implementation in Hare in the future, so I’ll get to try it eventually. I would say that Hare was a close second to me.

Ada seems like an interesting language, and I have read the hellscape of a code from one of the projects done by a member of a community in which I hang out. While it seems like a capable language, I didn’t like it that much to try it myself.

There’s also Odin… I have never heard of this language and it’s being used anywhere, but according to their web page, it’s basically everywhere. That seems like a stretch for a project that started in 2016. Zig started in 2015 and had no stable release yet, and Odin claims to be professionally used in so many game studios and other companies, and yet I haven’t heard of it at all! I consider myself a language enthusiast, yet somehow this language completely dodged my field of view. Maybe they’re that good, I don’t know, but I decided that I’ll pass.

Jai, a language by Jonathan Blow, also seemed interesting to try out, but I feel that its aim at being a replacement for C++ in the game programming field is not exactly what I’m looking for. At the time of writing, it’s still in a closed beta test, so it’s hard to get hands on it, and judging from what I have seen in the videos, it’s not like I would use this language for this particular project either.

Before you ask, I haven’t considered Go for this project, because I simply don’t like Go. There’s the interpreterbook book which is oddly similarly named to the book discussed in this post, and it is written in Go, so you don’t even have to go through translating C to Go if you want to learn how to design a language if Go is your thing. Well, I have my reasons to dislike Go, but I don’t want to go into them here. Maybe next time.

And yes, there are many more languages that can be used for this kind of task - obviously, I can’t compare all of them. So I had to stop somewhere.

The Zig programming language caught my eye a few years ago after it was pointed out to me as a response to my rant about C. It is more mature than Hare, and as far as I know, the 1.0.0 release is not as far away. And for what it’s worth, it is closest to C in many ways that Ada or Rust are not. So I decided that it would be a proper pick for the second half of the book.

Zig

So I chose Zig. At the moment of writing this post, the latest release is 0.11.0, which is important to note as Zig is still constantly changing. While this is a good thing, and I understand that Zig had no stable release yet, I am spoiled. I am spoiled by languages with the stable core, like Clojure, where you can find a 13-year-old code and run it with the modern version of Clojure. That’s not how it works with Zig. A lot of code I’ve tried during this experiment did not work because it was written for an older version of Zig.

As you may remember from part 1 - this is a second try on writing the second half of the book in Zig. The first try wasn’t successful, and I’m going to discuss this separately later in this post. During the first iteration, I had to learn many things about the language and its build system. After I hit the point where my implementation did not work (for reasons described later), I decided to delete everything and start over (for reasons described later).

So I took a considerable amount of time to rest from this project and in the meantime, Zig 0.11.0 was released. This is important because the previous run was done using Zig 0.10.0, and I deleted everything, except for the build.zig that I had spent a lot of time creating to fulfill my needs. And now it doesn’t work.

That made me realize that I didn’t actually spend any time in the previous post talking about the build system. So let’s do that right now!

Clojure build system

So, in the Clojure version of the Lox language, I opted to use the deps.edn which is pretty much the standard way to create projects in Clojure. There are other systems that can help manage projects, notably Leiningen, but I like to do things manually, for better understanding, so when I can I use stuff like deps. And unlike Leiningen, deps doesn’t come with the automatic building of your project - you have to make it by yourself.

And that’s actually great! I like the idea that builds are programs because you can write custom builds that are specific to your project’s needs. Of course, most projects don’t need anything fancy, so there is a library that provides the basic stuff you’ll probably need.

Here’s what the build.clj looks like for CljLox:

(ns build
  (:require [clojure.tools.build.api :as b]))

(def lib 'lox)
(def version (format "0.0.%s" (b/git-count-revs nil)))
(def class-dir "target/clojure/classes")
(def basis (b/create-basis {:project "deps.edn"}))
(def uber-file (format "target/clojure/%s-%s-standalone.jar" (name lib) version))

(defn clean [_]
  (b/delete {:path "target/clojure"}))

(defn uber [_]
  (clean nil)
  (b/copy-dir {:src-dirs ["src"]
               :target-dir class-dir})
  (b/compile-clj {:basis basis
                  :src-dirs ["src"]
                  :class-dir class-dir})
  (b/uber {:class-dir class-dir
           :uber-file uber-file
           :basis basis
           :main 'lox.core}))

That’s it!

I like the idea of using the number of commits since the repository creation as the last component of the version, and the build library provides a function for that: b/git-count-revs. We do have to specify some paths, create a basis, whatever it is, and just define the compilation functions. We then execute it like clojure -T:build uber, and it creates the lox-0.0.123-standalone.jar file that we can later run with the java -jar command.

Simple and effective. If my project required additional steps I could just define more functions to do these steps. Let’s compare that to Zig, shall we?

Zig build

Here’s the entire build.zig for the previous version of ZigLox:

const std = @import("std");
const Version = std.builtin.Version;
const Builder = std.build.Builder;
const print = std.debug.print;
const LibExeObjStep = std.build.LibExeObjStep;
const Mode = std.builtin.Mode;
const CrossTarget = std.zig.CrossTarget;
const Step = std.build.Step;
const trimRight = std.mem.trimRight;
const parseInt = std.fmt.parseInt;
const exit = std.process.exit;
const fs = std.fs;

const MAJOR = 0;
const MINOR = 1;

fn gitCountRevs(b: *Builder) u32 {
    var code: u8 = undefined;
    b: {
        const commits_num = trimRight(
            u8,
            b.execAllowFail(
                &[_][]const u8{ "git", "-C", b.build_root, "rev-list", "HEAD", "--count" },
                &code,
                .Ignore,
            ) catch break :b,
            "\n",
        );
        return parseInt(u32, commits_num, 10) catch break :b;
    }
    print("unable to count number of revs\n", .{});
    exit(1);
}

fn getVersion(b: *Builder) Version {
    return Version{
        .major = MAJOR,
        .minor = MINOR,
        .patch = gitCountRevs(b),
    };
}

fn isZigFile(name: []const u8) bool {
    return std.mem.eql(u8, name[name.len - 4 ..], ".zig");
}

/// Automatically resolves packages under the source directory.  Adds
/// packages to the `exe`.
fn resolveAndAddPackages(
    src_dir: []const u8,
    allocator: std.mem.Allocator,
    exe: *LibExeObjStep,
) void {
    const dir = std.fs.cwd().openIterableDir(src_dir, .{}) catch exit(2);
    var iterator = dir.iterate();

    while (iterator.next() catch null) |path| {
        if (isZigFile(path.name)) {
            const package_name = path.name[0 .. path.name.len - 4];
            const src_path = fs.path.join(
                allocator,
                &[_][]const u8{ src_dir, path.name },
            ) catch exit(3);
            exe.addPackagePath(package_name, src_path);
        }
    }
}

/// Automatically finds tests under the test directory, adds the test
/// to the build `step`, and adds all necessary packages for test to
/// run.
fn resolveAndAddTests(
    b: *Builder,
    test_dir: []const u8,
    allocator: std.mem.Allocator,
    mode: Mode,
    target: CrossTarget,
    step: *Step,
) void {
    const dir = std.fs.cwd().openIterableDir(test_dir, .{}) catch exit(2);
    var iterator = dir.iterate();

    while (iterator.next() catch null) |path| {
        if (isZigFile(path.name)) {
            const test_path = fs.path.join(
                allocator,
                &[_][]const u8{ test_dir, path.name },
            ) catch exit(3);
            const tests = b.addTest(test_path);
            resolveAndAddPackages("src/zig/lox", allocator, tests);
            tests.setTarget(target);
            tests.setBuildMode(mode);
            step.dependOn(&tests.step);
        }
    }
}

pub fn build(b: *Builder) void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const allocator = arena.allocator();

    const version = getVersion(b);
    const version_string = b.fmt("{[major]d}.{[minor]d}.{[patch]d}", version);

    const target = b.standardTargetOptions(.{});
    const mode = b.standardReleaseOptions();

    const exe = b.addExecutable(
        b.fmt("lox-{s}", .{version_string}),
        "src/zig/lox/main.zig",
    );
    exe.setTarget(target);
    exe.setBuildMode(mode);
    exe.install();

    const run_cmd = exe.run();
    run_cmd.step.dependOn(b.getInstallStep());
    if (b.args) |args| {
        run_cmd.addArgs(args);
    }

    const run_step = b.step("run", "Run the app");
    run_step.dependOn(&run_cmd.step);

    const test_step = b.step("test", "Run unit tests");
    const test_dir = fs.path.join(
        allocator,
        &[_][]const u8{ "test", "zig", "lox" },
    ) catch exit(1);

    resolveAndAddTests(b, test_dir, allocator, mode, target, test_step);
}

It’s not that big, but the amount of boilerplate you have to do, even though the language has a build library is far greater than in Clojure. I do understand that these things pursue different goals, but does it really need to be that complicated?

Actually, let me explain why it is so complicated. You see, I have a quite specific project layout in mind:

Lox
├── build.clj
├── build.zig
├── deps.edn
├── src
│   ├── clojure
│   │   └── lox
│   │       └── sources.clj
│   └── zig
│       └── lox
│           └── sources.zig
├── target
│   ├── clojure
│   └── zig
└── test
    ├── clojure
    │   └── lox
    │       └── test_sources.clj
    ├── data
    │   └── various_test_files
    └── zig
        └── lox
            └── test_sources.zig

So the sources are located in the src directory, and the tests are located in the test directory. And each implementation has a language directory in there, so Clojure sources are in the src/clojure and the Zig sources are in src/zig. The same goes for tests.

Unfortunately, I couldn’t find a way to have tests outside of the source directory, so I had to dance around, creating my own solution for this self-introduced problem. Is it that uncommon to put tests into a separate directory, outside of the source directory? Let’s look at other languages.

In Java, this is the norm, same in Clojure. I like this actually, tests are separate entities, they don’t have anything to do with the source code. I mean, the source is for the actual application or the library you ship to the consumer. The tests are for you, the developer, not for the consumer.

In Rust, the tests usually are in the same file as the source code. When I was writing in Rust I kinda liked it, but only, seriously, only because before Rust I was writing C professionally, and tests in C are, well, they’re not standardized, to say the least. So having Rust giving you tools to write and run tests was a huge step up from C, but that was the only step up.

In Go it is possible to put tests in a different directory, and I’ve seen projects doing it, so I guess that’s a fine part of Go.

What else can we look at? Nim? Seems that putting tests into a separate directory is possible there. Hare? From what I can see, the projects have either +test.ha or just test.ha files in the same place where the sources are - not sure if this is the only way it can work, the hare docs are sparse on the details here.

Well, I guess there’s no single answer to this question, but I don’t see why a language would forbid putting tests whenever the project author sees fit. And in Zig, well, while it was hard to make it, and make it fully automatic, remember - the build is a program. You can do whatever you want, as long as the tooling gives you the ability, and in the case of Zig, the build system can do that. Well, at least it could in 0.10.0, but now I have to rewrite it for 0.11.0.

I had a few hiccups and managed to do that, so I’m not going to complain that much here, but I still think it’s too complicated. I mean, it’s not as complicated as it was in 0.10.0, so that’s good, but it is still a lot of manual work. Here’s a diff with the previous version so you can feel the amount of subtle changes I had to make to make this thing compile again:

@@ -1,11 +1,10 @@
 const std = @import("std");
-const Version = std.builtin.Version;
-const Builder = std.build.Builder;
+const Builder = std.Build;
+const Step = Builder.Step;
 const print = std.debug.print;
-const LibExeObjStep = std.build.LibExeObjStep;
+const CompStep = Step.Compile;
 const Mode = std.builtin.Mode;
 const CrossTarget = std.zig.CrossTarget;
-const Step = std.build.Step;
 const trimRight = std.mem.trimRight;
 const parseInt = std.fmt.parseInt;
 const exit = std.process.exit;
@@ -14,6 +13,12 @@ const fs = std.fs;
 const MAJOR = 0;
 const MINOR = 1;

+const Version = struct {
+    major: u32,
+    minor: u32,
+    commits: u32,
+};
+
 fn gitCountRevs(b: *Builder) u32 {
     var code: u8 = undefined;
     b: {
@@ -46,10 +51,10 @@ fn isZigFile(name: []const u8) bool {

 /// Automatically resolves packages under the source directory.  Adds
 /// packages to the `exe`.
-fn resolveAndAddPackages(
+fn resolveAndAddModules(
+    b: *Builder,
     src_dir: []const u8,
-    allocator: std.mem.Allocator,
-    exe: *LibExeObjStep,
+    exe: *CompStep,
 ) void {
     const dir = std.fs.cwd().openIterableDir(src_dir, .{}) catch exit(2);
     var iterator = dir.iterate();
@@ -57,11 +62,8 @@ fn resolveAndAddPackages(
     while (iterator.next() catch null) |path| {
         if (isZigFile(path.name)) {
             const package_name = path.name[0 .. path.name.len - 4];
-            const src_path = fs.path.join(
-                allocator,
-                &[_][]const u8{ src_dir, path.name },
-            ) catch exit(3);
-            exe.addPackagePath(package_name, src_path);
+            const src_path = b.pathJoin(&.{ src_dir, path.name });
+            exe.addModule(package_name, b.createModule(.{ .source_file = .{ .path = src_path } }));
         }
     }
 }
@@ -71,9 +73,9 @@ fn resolveAndAddPackages(
 /// run.
 fn resolveAndAddTests(
     b: *Builder,
+    src_dir: []const u8,
     test_dir: []const u8,
-    allocator: std.mem.Allocator,
-    mode: Mode,
+    optimize: Mode,
     target: CrossTarget,
     step: *Step,
 ) void {
@@ -82,52 +84,41 @@ fn resolveAndAddTests(

     while (iterator.next() catch null) |path| {
         if (isZigFile(path.name)) {
-            const test_path = fs.path.join(
-                allocator,
-                &[_][]const u8{ test_dir, path.name },
-            ) catch exit(3);
-            const tests = b.addTest(test_path);
-            resolveAndAddPackages("src/zig/lox", allocator, tests);
-            tests.setTarget(target);
-            tests.setBuildMode(mode);
-            step.dependOn(&tests.step);
+            const tests = b.addTest(.{
+                .name = path.name,
+                .root_source_file = .{ .path = b.pathJoin(&.{ test_dir, path.name }) },
+                .target = target,
+                .optimize = optimize,
+            });
+            resolveAndAddModules(b, src_dir, tests);
+            step.dependOn(&b.addRunArtifact(tests).step);
         }
     }
 }

 pub fn build(b: *Builder) void {
-    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
-    defer arena.deinit();
-    const allocator = arena.allocator();
-
-    const version = getVersion(b);
-    const version_string = b.fmt("{[major]d}.{[minor]d}.{[patch]d}", version);
+    const version = b.fmt("{[major]d}.{[minor]d}.{[commits]d}", getVersion(b));

     const target = b.standardTargetOptions(.{});
-    const mode = b.standardReleaseOptions();
-
-    const exe = b.addExecutable(
-        b.fmt("lox-{s}", .{version_string}),
-        "src/zig/lox/main.zig",
-    );
-    exe.setTarget(target);
-    exe.setBuildMode(mode);
-    exe.install();
-
-    const run_cmd = exe.run();
-    run_cmd.step.dependOn(b.getInstallStep());
-    if (b.args) |args| {
-        run_cmd.addArgs(args);
-    }
+    const optimize = b.standardOptimizeOption(.{});
+    const src_dir = b.pathJoin(&.{ "src", "zig", "lox" });
+    const main = b.pathJoin(&.{ src_dir, "main.zig" });
+
+    const exe = b.addExecutable(.{
+        .name = b.fmt("lox-{s}", .{version}),
+        .root_source_file = .{ .path = main },
+        .target = target,
+        .optimize = optimize,
+    });
+
+    b.installArtifact(exe);
+    const run_cmd = b.addRunArtifact(exe);

     const run_step = b.step("run", "Run the app");
     run_step.dependOn(&run_cmd.step);

     const test_step = b.step("test", "Run unit tests");
-    const test_dir = fs.path.join(
-        allocator,
-        &[_][]const u8{ "test", "zig", "lox" },
-    ) catch exit(1);
+    const test_dir = b.pathJoin(&.{ "test", "zig", "lox" });

-    resolveAndAddTests(b, test_dir, allocator, mode, target, test_step);
+    resolveAndAddTests(b, src_dir, test_dir, optimize, target, test_step);
 }

Yet, this isn’t a final version of the file1, just the first working one. Gotta say, I had to spend a few hours figuring out why tests don’t actually run, only to notice a line in the changelog:

addTest No Longer Runs It

Before, addTest created and ran a test. Now you need to use b.addRunArtifact to run your test executable.

That’s on me, of course, cuz I haven’t taken my time to read the whole changelog. But it’s like 24k words without the code, so can you blame me? I just had a stroke of passion to restart this project, and I didn’t really want to kill it by reading a huge changelog of the language I already don’t remember almost completely. So for me, it is not even a change log, it’s just a log.

But it now works and that’s the only thing that matters. Hopefully, I’ll finish this project before 0.12.0 comes out, and I’ll reimplement everything again. Oh well, I can just not update, but the language is being improved so I’d rather not miss on these improvements.

With that covered, I think we can actually start with the book!

Chunks of Bytecode

There’s going to be an ongoing theme with this book: I’m not going to re-implement low-level machinery for the sake of implementing it. It’s something I need to say early on, as the book is full of this, and while I understand the author’s intent, I’m not the type of reader the book really aims for.

What I’m talking about is implementing such things as dynamic arrays. Zig already comes with the dynamic array, called the ArrayList. If you’re curious enough, you can check how it is implemented, and it is almost exactly the same as what is described in the book. I implemented it during the first run, looked at it, looked at the standard library implementation, said to myself “I got it”, and decided not to reinvent the wheel. These tricks, while neat, are only meaningful to C - in Zig, we can and should use the standard library.

Even if we don’t re-implement all this stuff, there are a lot of differences in implementing things with Zig. Let’s have a look at the code for the chunk:

const std = @import("std");
const ArrayList = std.ArrayList;

const common = @import("common.zig");
const Code = common.Code;
const OPCode = common.OPCode;
const Allocator = common.Allocator;
const exit = common.exit;
const print = common.print;
const str = common.str;

pub const Chunk = struct {
    const Self = @This();
    code: ArrayList(Code),

    pub fn init(allocator: Allocator) Chunk {
        return Chunk{
            .code = ArrayList(Code).init(allocator),
        };
    }

    pub fn deinit(self: *Self) void {
        self.code.deinit();
    }

    pub fn write(self: *Self, byte: OPCode) void {
        self.code.append(@intFromEnum(byte)) catch
            exit(1, "can't allocate memory for the instruction");
    }

    fn simpleInstruction(name: str, offset: u32) u32 {
        std.debug.print("{s: <16}\n", .{name});
        return offset + 1;
    }

    fn disassembleInstruction(self: *Self, offset: u32) u32 {
        print("{d:0>4} ", .{offset});

        var opcode = self.code.items[offset];
        return switch (@as(OPCode, @enumFromInt(opcode))) {
            .Ret => |op| simpleInstruction(@tagName(op), offset),
            else => |op| {
                print("unknown opcode {d}\n", .{op});
                return offset + 1;
            },
        };
    }

    pub fn disassemble(self: *Self, name: str) void {
        print("==== {s} ====\n", .{name});
        var offset: u32 = 0;
        while (offset < self.code.items.len) {
            offset = self.disassembleInstruction(offset);
        }
    }
};
Code Snippet 1: src/zig/lox/chunk.zig

Do you see that barrage of const definitions at the top? There will be lots of these things during this post. As you can actually notice, a dozen came from the common.zig module:

const std = @import("std");
pub const Allocator = std.mem.Allocator;

pub const str = []const u8;
pub const Code = u8;
pub const print = std.debug.print;

pub const OPCode = enum(Code) {
    Ret,
    _,
};

pub fn exit(code: u8, msg: str) noreturn {
    print("{s}\n", .{msg});
    std.process.exit(code);
}
Code Snippet 2: src/zig/lox/common.zig

Zig claims that the top-level constants don’t have to be ordered, which while convenient, I don’t think is practically good. I do prefer a top-down approach, where the most specific things appear at the top, and more general stuff is at the bottom. So if this isn’t your cup of tea, well, sorry, I guess?

Anyhow, here’s the main.zig:

const std = @import("std");
const Chunk = @import("chunk.zig").Chunk;
const common = @import("common.zig");
const OPCode = common.OPCode;

pub fn main() !void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const allocator = arena.allocator();

    var chunk = Chunk.init(allocator);
    defer chunk.deinit();

    chunk.write(OPCode.Ret);
    chunk.disassemble("test chunk");
}
Code Snippet 3: src/zig/lox/main.zig

If we run it with zig build run, we get a lovely:

$ zig build run
==== test chunk ====
0000 Ret

So far so good. Note, this is code up to the part where we are supposed to add a value array for constants, so it’s not like it’s the whole code for the chapter. For now, I won’t go into code explanations, it’s not that hard right now, you can get the idea just by reading it, and by referring to the book. But there is a reason why I decided to show it right now.

Adding constants

Let’s have a look at how the book suggests to add the constants.

First, we introduce the addConstant function, which adds a Value type to the dynamic array of constants. Next, we use writeChunk, or in my case chunk.write to write the constant index in the array to the code array of the chunk. Do you see the issue here? Let’s look at the code then.

The first thing we do is add a new function:

pub fn addConstant(self: *Self, value: Value) Code {
    self.constants.append(value) catch
        exit(1, "can't allocate memory for the constant");
    return @truncate(self.constants.items.len - 1);
}

You may wonder, why am I using @truncate on the return value, and the return type is Code? Especially since the code in the book tells us that it should return an int:

int addConstant(Chunk* chunk, Value value) {
  writeValueArray(&chunk->constants, value);
  return chunk->constants.count - 1;
}

Now, remember, our write function (writeChunk in the book) accepts the chunk and an opcode to write. Initially, I’ve made it so that the signature expects the OPCode enum itself:

    pub fn write(self: *Self, byte: OPCode) void {
        self.code.append(@intFromEnum(byte)) catch
            exit(1, "can't allocate memory for the instruction");
    }

However, that would mean, that if we were to pass the result of addConstant to it, we would need to convert it to the enum tag, and then immediately convert it back to the Code type, that the ArrayList uses to store code in the chunk. So, I’ve changed it:

    pub fn write(self: *Self, byte: Code) void {
        self.code.append(byte) catch
            exit(1, "can't allocate memory for the instruction");
    }

Now, we can add constants in the main:

pub fn main() !void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const allocator = arena.allocator();

    var chunk = Chunk.init(allocator);
    defer chunk.deinit();

    const constant = chunk.addConstant(1.2);
    chunk.write(@intFromEnum(OPCode.Constant));
    chunk.write(constant);
    chunk.disassemble("test chunk");
}

Still don’t see the issue? Well, @truncate kinda spoiled it, so I’ll get straight to the point - we can only have 256 constants in the Chunk. The rest will overflow our Code type, which is, in fact, just a u8 in disguise.

The book overlooks this for now, as in C truncation happens automatically, but comes back to it in the “Challenges” section. And, as you may remember I’ve promised that I have a lot to say about the “Challenges”.

Here’s the “Challenge”:

Because OP_CONSTANT uses only a single byte for its operand, a chunk may only contain up to 256 different constants. That’s small enough that people writing real-world code will hit that limit. We could use two or more bytes to store the operand, but that makes every constant instruction take up more space. Most chunks won’t need that many unique constants, so that wastes space and sacrifices some locality in the common case to support the rare case.

To balance those two competing aims, many instruction sets feature multiple instructions that perform the same operation but with operands of different sizes. Leave our existing one-byte OP_CONSTANT instruction alone, and define a second OP_CONSTANT_LONG instruction. It stores the operand as a 24-bit number, which should be plenty.

Implement this function:

void writeConstant(Chunk* chunk, Value value, int line) {
  // Implement me...
}

It adds value to chunk’s constant array and then writes an appropriate instruction to load the constant. Also add support to the disassembler for OP_CONSTANT_LONG instructions.

Defining two instructions seems to be the best of both worlds. What sacrifices, if any, does it force on us?

So here’s the problem - I had implemented this in the first run and regretted it many, many times. So for this run, I’m not going to, sorry Robert.

I mean, the challenge itself isn’t hard, however, the book assumes that you don’t do them because it is a challenge, not homework that you have to do. So the latter chapters continue as if no challenges were even there - the code in the book still uses the naive truncated approach.

I do understand that this book is about teaching how to implement a language and not a step-by-step guide. But later, these challenges made my code so different from what the book gives you that I had several hours-long sessions of looking through the book, figuring out where the particular piece of code came from, and what the intention there was. All because I had a bug somewhere and the reason was usually that differences, introduced by the challenges, piled up and combined introducing a subtle bug that made itself visible only after several chapters.

So, no challenges this time. All I care about this time is to finish the book and have a working language written in Zig.

Line information

Sike! Let’s do a challenge!

The book suggests that we can store lines in a dynamic array, for each index of the chunk:

    chunk->lines[chunk->count] = line;

However, this is quite inefficient in terms of memory. And the book mentions that in the “Challenges” section. It suggests using a so-called Run-length encoding.

After scratching my head for a moment, I came up with this solution:

    fn writeLine(self: *Self, line: u32) void {
        const len = self.lines.items.len;
        if (len > 1 and self.lines.items[len - 2] == line) {
            self.lines.items[len - 1] = self.lines.items[len - 1] + 1;
        } else {
            self.lines.append(line) catch
                exit(1, "can't allocate memory for line information");
            self.lines.append(1) catch
                exit(1, "can't allocate memory for line information");
        }
    }
Code Snippet 4: a method in the Chunk struct

It’s a rather simple algorithm - we check if the last line is the same as the new one, and if it is, we increase the count, that is stored right next to it. Otherwise, we append a new line and set its count to 1.

But now we need to extract line information by uncompressing this data:

    fn getLine(self: *const Self, offset: u32) u32 {
        var cnt: u32 = 0;
        var off: u32 = offset;
        while (off > 0) loop: {
            var n: u32 = self.lines.items[cnt + 1];
            while (n > 0) {
                if (off == 0)
                    break :loop;
                off = off - 1;
                n = n - 1;
            }
            cnt = cnt + 2;
        }
        return self.lines.items[cnt];
    }
Code Snippet 5: a method in the Chunk struct

Simply put, we’re decreasing the counter while the offset is greater than zero while increasing the counter each time we’ve exhausted the line counter n. Sure, it’s not a very fast algorithm, and we can probably do it without using loops if I think hard enough, but I’m OK with this solution. It isn’t used until the chunk gets disassembled, and this only happens when we debug.

This challenge I like. It doesn’t affect the overall code of the VM and is more of a lesson in optimization. We saved some memory, at the expense of a bit slower disassembler, which is fine by me.

However, seriously, this is probably the last challenge I’ll tackle for a long time, so don’t get excited.

Tests

As a final thing in this chapter, the book suggests testing your code. And here’s the thing - it’s a great suggestion, only this book will make you hate your decision to test the code. I’m a bit exaggerating here, but keep in mind, during this chapter we’ve already changed some parts of the Chunk code several times.

First, we introduced the write method, and it accepted the OPCode enum. Next, we introduced the addConstant method, and it required us to change the write to accept the Code instead. Finally, we’ve added support for line information, changing the write method once again.

You may think that that’s fine, as we did it during one chapter, and the tests should be written after we complete it. Well, TDD people may disagree with you. I’m not a TDD person, I usually test things as I go in the REPL, and then, once I’m satisfied I write proper tests to prevent later breakage. But Zig doesn’t have a REPL, so testing it this way won’t work.

So, while Robert claims that: “I wrote a test suite for Lox before I wrote a single word of this book.”, I’d be cautious when writing tests for this book’s code. Probably his claim is true, especially if he knew what the result would be and wrote a test for it. Or, maybe the book was simply written after the implementation of the language became a thing, which then doesn’t matter if the tests were written upfront or not.

I tend to write tests either during the chapter or at the end of it. And at the first run, I did exactly that. But, as we’ll see later, you’ll have to discard or rework a lot of tests due to changes in the code. And often the changes are dramatic, so I’m not sure if it makes sense to write too many tests early on.

A Virtual Machine

This chapter is mostly straightforward, except for a few shenanigans we have to make because Zig is different. The first thing I would like to talk about is pointers in various programming languages.

In C we basically have one kind of pointer:

SomeType * variable;

This really can be anything. It can be a pointer to a single location, or it can be a pointer to an array, or even to a place somewhere in a set array. Well, there’s also a function pointer in C, but we’re not going to look at it now.

Zig, on the other hand, has quite a few types of pointers:

  • *T - single-item pointer to exactly one item.
    • Supports deref syntax: ptr.*
  • [*]T - many-item pointer to an unknown number of items.
    • Supports index syntax: ptr[i]
    • Supports slice syntax: ptr[start..end]
    • Supports pointer arithmetic: ptr + x, ptr - x
    • T must have a known size
  • *[N]T - pointer to N items, same as a single-item pointer to an array.
    • Supports index syntax: array_ptr[i]
    • Supports slice syntax: array_ptr[start..end]
    • Supports len property: array_ptr.len
  • []T - is a slice (a fat pointer, which contains a pointer of type [*]T and a length).
    • Supports index syntax: slice[i]
    • Supports slice syntax: slice[start..end]
    • Supports len property: slice.len

As you can see, we can’t just write variable: *SomeType in Zig and use it like we can in C. I believe this is a good thing, and in practice, it does eliminate bugs related to accessing ordinary pointers as arrays. On the other hand, it is quite tedious to write code with such a system.

To get what I mean, let’s have a look at how the instruction pointer concept is implemented by the book:

typedef struct {
  Chunk* chunk;
  uint8_t* ip;
} VM;

InterpretResult interpret(Chunk* chunk) {
  vm.chunk = chunk;
  vm.ip = vm.chunk->code;
  return run();
}

You’re looking at the VM struct that holds the currently executed chunk and an instruction pointer ip. In the interpret function we set the VM’s current chunk, and its instruction pointer to point to the code from the chunk. As you may remember the Chunk struct contains the code field which is a dynamic array: uint8_t * code. So by saying vm.ip = vm.chunk->code we essentially say that ip now holds the same address that was in the code field. Hopefully, it’s the start of the array!

With that, we can now inspect the run function. For now, all we really care about is how it uses the instruction pointer:

static InterpretResult run() {
#define READ_BYTE() (*vm.ip++)
        // ...
    switch (instruction = READ_BYTE()) {/* ... */}
        // ...
#undef READ_BYTE
}

Aha, C tricks! The READ_BYTE is a substitution macro that simply increments the pointer with ++, dereferencing the resulting one with *. It’s a clever trick - too clever, even, because of the operator precedence rules in C. The ++ has to happen before * because otherwise, we would increment the dereferenced value. But the effect of ++ has to happen after we dereference the pointer, otherwise, we would dereference the next address. This code can be rewritten into two separate expressions, like uint8_t val = *ip; ip = ip + 1; but it’s hard to use that in a macro, that itself is used in the expression context. So the *ptr++ form is a quite common trick.

But here’s the thing - the ip is not an array, it just happens to hold an address to an array. Uh oh.

Well, that’s a common thing in C, so let’s look at Zig. Now, I’m not really experienced in Zig’s pointers but after poking around I think we have several options:

pub const VM = struct {
    chunk: *Chunk, // store a pointer to the chunk
    ip: []Code, // and a fat pointer to the array

    // methods ...
};
Code Snippet 6: Option №1

Here I have the [*]T type of pointer because we’re going to point to an array because the Chunk stores items in a dynamic array. We can implement the interpret and run functions as follows:

    pub fn interpret(self: *VM, chunk: *Chunk) Result {
        self.chunk = chunk;
        self.ip = chunk.code.items;
        return self.run();
    }

    pub fn run(self: *VM) Result {
        // ...
        const instr = self.ip[0]; self.ip += 1;
        switch (instr) {
            // ...
        }
    }

We can put the pointer reading into an inline function, but that’s beside the point. Look at how we have to dereference this kind of pointer: ip[0]. Unlike in C, we can’t use a single-item pointer dereference syntax here, because it’s not a single-item pointer!

Let’s try a single-item pointer then:

pub const VM = struct {
    chunk: *Chunk, // store a pointer to the chunk
    ip: *Code, // and a pointer to a place in the array

    // methods ...
};
Code Snippet 7: Option №2

Now we have a single-item pointer for the instruction pointer (makes sense), but let’s look at how we have to use it:

    pub fn interpret(self: *VM, chunk: *Chunk) Result {
        self.chunk = chunk;
        self.ip = &chunk.code.items[0];
        return self.run();
    }

    pub fn run(self: *VM) Result {
        // ...
        const instr = self.ip.*;
        self.ip = @ptrCast(@as([*]Code, @ptrCast(self.ip)) + 1);
        switch (instr) {
            // ...
        }
    }

Ugh, that’s looking nasty. We have to do two pointer casts to do a simple thing like that. Looking at it now the ip[0] doesn’t look that out of place.

But then the book does this kind of thing:

disassembleInstruction(vm.chunk, (int)(vm.ip - vm.chunk->code));

If you recall the disassembleInstruction signature, the second argument is an integer offset. So we do all this pointer arithmetic here and there just to convert it back to an index in the array. And that’s our option three:

pub const VM = struct {
    chunk: *Chunk, // store a pointer to the chunk
    ip: []Code,
    ic: u32, // and an index!

    // methods ...
};
Code Snippet 8: Option №3

This does require a bit different approach in the rest of the code:

pub fn interpret(self: *VM, chunk: *Chunk) Result {
    self.chunk = chunk;
    self.ip = chunk.code.items;
    self.ic = 0;
    return self.run();
}

pub fn run(self: *VM) Result {
    // ...
    const instr = self.ip[self.ic]; self.ic += 1;
    switch (instr) {
        // ...
    }
}

We can simply use the vm.disassembleInstruction(self.ip) to disassemble the instruction avoiding all of the pointer arithmetic altogether. Now, I’m sure the raw pointer for this kind of task is more appealing, and probably it is slightly faster. But to me, this looks like a premature optimization, at least for now. So that’s the way I decided to do it for my implementation.

Another thing I want to touch here is the return type of the interpret function. In the book, the return type is called InterpretResult and it is a enumeration with simple tags: INTERPRET_OK, INTERPRET_COMPILE_ERROR, INTERPRET_RUNTIME_ERROR. When we reach the OP_RETURN the result of the run function is the INTERPRET_OK tag. At the end of the chapter, after we’ve implemented the stack and basic arithmetic operations we read the last value on the stack and print it before we exit:

      // ...
      case OP_RETURN: {
        printValue(pop());
        printf("\n");
        return INTERPRET_OK;
      }
      // ...

While this is in the spirit of C, we can do better. And in fact, we have to do better, as we need to write tests, as the book suggested. It’s hard to write tests if you don’t have access to the result of interpretation, so my Result type is defined as union(enum):

pub const Result = union(enum) {
    OK: Value,
    CompileError,
    RuntimeError,
};

This way we can return the actual value instead of printing it, and indicate that everything is OK at the same time, kinda like Rust’s Option type:

    // switch(opcode) {
    // ...
           .Ret => return .{ .OK = self.pop() },
    // }

This way we can check the results in our VM tests.

I’ve omitted a bunch of stuff from this chapter, but simply because there aren’t many interesting things there when it comes to Zig vs C. But one thing I would like to return to is metaprogramming, because this book does it frequently enough to be a hassle. We’ve already seen a bunch of #define directives in the C examples above, and the book goes pretty far with them, defining a polymorphic BINARY_OP macro:

#define BINARY_OP(op) \
    do { \
      double b = pop(); \
      double a = pop(); \
      push(a op b); \
    } while (false)

The do {...} while(false) is a common trick in C macros. For those unfamiliar with how do while works, as it is rarely seen in other languages, it executes the body first, then checks the condition and repeats if it is true. Classical while checks the condition first. Most languages omit do while because it is essentially the same as writing the body of the while loop two times, e.g.:

foo();           // do {
while (test()) { //    foo();
    foo();       // } while (test());
}

Of course, there are other ways to rewrite this code, and do while looks clearer, but it’s quite rare. In our case though, it is used as a way to introduce a valid language construct that allows for a semicolon to be inserted after it. Because of that, it uses while(false) i.e. it never loops.

So when you write BINARY_OP(+); (mind the semicolon) it expands to the correct code with a semicolon after while (false); which is required in the case of this language construct, unlike with other loops. Yep, that’s the sole reason it is used here, and compilers are smart enough to detect that the loop will never happen, so they omit the loop entirely. A truly sad state of metaprogramming, and expression statements in C.

However, Zig doesn’t have macros. Instead, it has the comptime keyword, which marks the expression or function argument to be used only at compilation time. Unfortunately, there doesn’t seem to be a way to write an inline comptime function that will accept an operator and return some kind of AST, or other code to be present in the expression. We can work around it, though:

    inline fn binaryOp(op: OPCode, a: Value, b: Value) Value {
        return switch (op) {
            .Add => a + b,
            .Sub => a - b,
            .Mul => a * b,
            .Div => a / b,
            else => unreachable,
        };
    }
   // ---8<---
   //       switch(opcode) {
                .Add, .Sub, .Mul, .Div => |instr| {
                    const b = self.pop();
                    var sp = self.stack_top - 1;
                    sp[0] = binaryOp(instr, sp[0], b);
                },
   // }

Not as pretty, but works, I guess. We do have a cost of doing branching twice, e.g. first in the run function, then in the binaryOp function, even though by the time we’re in binaryOp we already know what operation we’re dealing with. C’s macros are giving us the ability to construct code without explicit copy-pasting and without explicit re-branching. We can avoid re-branching by writing it like this, of course:

    inline fn add(self: *Self) void {
        const b = self.pop();
        var sp = self.stack_top - 1;
        sp[0] = sp[0] + b;
    }

    inline fn sub(self: *Self) void {
        const b = self.pop();
        var sp = self.stack_top - 1;
        sp[0] = sp[0] - b;
    }

    inline fn div(self: *Self) void {
        const b = self.pop();
        var sp = self.stack_top - 1;
        sp[0] = sp[0] / b;
    }

    inline fn mul(self: *Self) void {
        const b = self.pop();
        var sp = self.stack_top - 1;
        sp[0] = sp[0] * b;
    }

    pub fn run(self: *Self) Result {
        while (true) {
            switch (opcode) {
                // ...
                .Add => self.add(),
                .Sub => self.sub(),
                .Mul => self.mul(),
                .Div => self.div(),
                // ...
            }
        }
    }

But now you see clearly what parts of the code can be easily generated, all these functions only differ by their name. Unfortunately, Zig’s comptime isn’t capable of treating operators as values, so we can’t write something like self.binaryOp(-) and handle it at compilation time to produce the body. Oh well, Lisp macros have spoiled me, I guess.

That’s like the only part I have an issue with in this language. The author is very vocal about macros, but after speaking with them on IRC a few times it seems to me that they only saw C macros and, maybe, C++ templates, and were like: “Yikes, macros are bad!”. We’ve been manipulating code as data with macros since the 60s, people did a lot of research creating safer ways to write macros, then the C preprocessor comes along and now everybody shits on macros because text substitutions are bad. Of course, they are! What baffles me the most is that among lispers there are people who are equally vocal about how bad macros are and that you should never write ones. I guess I’ll write another post about that, adding another hit to the horse.

Scanning on Demand

This chapter starts painfully. Well, painfully isn’t the right word really, but it introduces a lot of inconvenience by breaking our main function.

At the start of the chapter, we’re introduced to two basic workflows our executable will support - run a “REPL” or run a file. And that brings me to my pain point - why doesn’t Zig have any kind of a REPL? Again, I’m spoiled by lisps, but it’s so liberating to have a broken program, yet still be able to call functions from the REPL, gradually fixing the program. Here, I’m basically stripped of any way to run any code, until I make the whole source work again. Well, that’s the downside of the edit compile run cycle, but lisps are compiled languages that don’t have such a problem. The compilation2 just happens in the REPL.

As a matter of fact, the main function will be broken for the entirety of this chapter.

This nitpick aside, we’re finally starting to work on the parser. Unfortunately, this chapter isn’t interesting on its own, when it comes to implementing this in Zig. Same tricks with pointers replaced by integers, and pretty much the same code otherwise.

Still not sure why the chapter required to break main if we never actually used it. It will make more sense as the part of second chapter.

Compiling Expressions

In the spirit of the previous chapter, the breakage continues. Remember how I said that writing tests during this book will result in lots of test rewriting due to constant refactoring? This is probably the first, but not the last instance of this - we’ve changed the interpret signature to accept the source code instead of a chunk.

This broke all of my VM tests, though, thankfully, I only had just one at this point. Actually, I don’t like this change, to be honest, I’d much rather preserve the old signature, as the VM interprets the bytecode, not the source code. The bytecode doesn’t have to come from the source code, it can come from anywhere, and hence it should be what the function accepts. Instead, we create the chunk inside the interpret function and fill it with the compiler. Oh well, I’m not a language designer proper, so I guess I should not complain.

Anyway, this chapter is interesting. We’re actually starting to compile our language from source code into bytecode, and that’s cool. The parser itself is quite concise, and I like how it is implemented so far. One interesting thing I’d like to point out is how the book defines the rules:

ParseRule rules[] = {
  [TOKEN_LEFT_PAREN]    = {grouping, NULL,   PREC_NONE},
  [TOKEN_RIGHT_PAREN]   = {NULL,     NULL,   PREC_NONE},
  [TOKEN_LEFT_BRACE]    = {NULL,     NULL,   PREC_NONE},
  [TOKEN_RIGHT_BRACE]   = {NULL,     NULL,   PREC_NONE},
  [TOKEN_COMMA]         = {NULL,     NULL,   PREC_NONE},
  [TOKEN_DOT]           = {NULL,     NULL,   PREC_NONE},
  [TOKEN_MINUS]         = {unary,    binary, PREC_TERM},
  [TOKEN_PLUS]          = {NULL,     binary, PREC_TERM},
  [TOKEN_SEMICOLON]     = {NULL,     NULL,   PREC_NONE},
  [TOKEN_SLASH]         = {NULL,     binary, PREC_FACTOR},
  [TOKEN_STAR]          = {NULL,     binary, PREC_FACTOR},
  [TOKEN_BANG]          = {NULL,     NULL,   PREC_NONE},
  [TOKEN_NUMBER]        = {number,   NULL,   PREC_NONE},
  // ...
  [TOKEN_ERROR]         = {NULL,     NULL,   PREC_NONE},
  [TOKEN_EOF]           = {NULL,     NULL,   PREC_NONE},
};

static ParseRule* getRule(TokenType type) {
  return &rules[type];
}

This is when C looks nice and beautiful (if you can say so). The fact that enumerations in C are just numbers has a lot of problems, but sometimes it’s cool because you can use them as array indexes.

As for Zig, I couldn’t find if it has a way to do the same thing. Mainly because you can’t initialize an array using enumeration tags as indexes.

Using zig translate-c on this simplified example:

typedef enum {
    TOKEN_RIGHT_PAREN,
    TOKEN_LEFT_BRACE,
    TOKEN_RIGHT_BRACE,
    TOKEN_COMMA,
    TOKEN_DOT,
} TokenType;

typedef enum {
    PREC_A,
    PREC_B,
    PREC_C,
    PREC_D,
} Precedence;

typedef struct {
    Precedence precedence;
} ParseRule;

ParseRule rules[] = {
    [TOKEN_LEFT_BRACE]    = {PREC_A},
    [TOKEN_DOT]           = {PREC_B},
    [TOKEN_RIGHT_BRACE]   = {PREC_C},
    [TOKEN_RIGHT_PAREN]   = {PREC_D},
};

We get the following Zig code:

// .. tons of definitions
pub const TOKEN_RIGHT_PAREN: c_int = 0;
pub const TOKEN_LEFT_BRACE: c_int = 1;
pub const TOKEN_RIGHT_BRACE: c_int = 2;
pub const TOKEN_COMMA: c_int = 3;
pub const TOKEN_DOT: c_int = 4;
pub const TokenType = c_uint;
pub const PREC_A: c_int = 0;
pub const PREC_B: c_int = 1;
pub const PREC_C: c_int = 2;
pub const PREC_D: c_int = 3;
pub const Precedence = c_uint;
pub const ParseRule = extern struct {
    precedence: Precedence,
};
pub export var rules: [5]ParseRule = [5]ParseRule{
    ParseRule{
        .precedence = @as(c_uint, @bitCast(PREC_D)),
    },
    ParseRule{
        .precedence = @as(c_uint, @bitCast(PREC_A)),
    },
    ParseRule{
        .precedence = @as(c_uint, @bitCast(PREC_C)),
    },
    @import("std").mem.zeroes(ParseRule),
    ParseRule{
        .precedence = @as(c_uint, @bitCast(PREC_B)),
    },
};
// .. tons of more definitions

Looks like Zig doesn’t have a syntax like that, suggested by the fact that it sorted elements of the array based on the enumeration tag values. But that practically removes any benefits of using enumerations in such a way, as we’re forced to keep the order ourselves. Perhaps, there’s a way to do that using comptime, but I’m not that good with Zig. So, I did it like that:

fn makeRules() []ParseRule {
    var r = [_]ParseRule{undefined} ** @typeInfo(TokenType).Enum.fields.len;
    r[@intFromEnum(TokenType.LEFT_PAREN)] = .{ .prefix = Parser.grouping, .infix = null, .precedence = .CALL };
    r[@intFromEnum(TokenType.RIGHT_PAREN)] = .{ .prefix = null, .infix = null, .precedence = .NONE };
    r[@intFromEnum(TokenType.LEFT_BRACE)] = .{ .prefix = null, .infix = null, .precedence = .NONE };
    r[@intFromEnum(TokenType.RIGHT_BRACE)] = .{ .prefix = null, .infix = null, .precedence = .NONE };
    r[@intFromEnum(TokenType.COMMA)] = .{ .prefix = null, .infix = null, .precedence = .NONE };
    r[@intFromEnum(TokenType.DOT)] = .{ .prefix = null, .infix = null, .precedence = .NONE };
    r[@intFromEnum(TokenType.MINUS)] = .{ .prefix = Parser.unary, .infix = Parser.binary, .precedence = .TERM };
    r[@intFromEnum(TokenType.PLUS)] = .{ .prefix = null, .infix = Parser.binary, .precedence = .TERM };
    r[@intFromEnum(TokenType.SEMICOLON)] = .{ .prefix = null, .infix = null, .precedence = .NONE };
    r[@intFromEnum(TokenType.SLASH)] = .{ .prefix = null, .infix = Parser.binary, .precedence = .FACTOR };
    r[@intFromEnum(TokenType.STAR)] = .{ .prefix = null, .infix = Parser.binary, .precedence = .FACTOR };
    r[@intFromEnum(TokenType.BANG)] = .{ .prefix = Parser.unary, .infix = null, .precedence = .NONE };
    r[@intFromEnum(TokenType.NUMBER)] = .{ .prefix = Parser.number, .infix = null, .precedence = .NONE };
    // ...
    r[@intFromEnum(TokenType.ERROR)] = .{ .prefix = null, .infix = null, .precedence = .NONE };
    r[@intFromEnum(TokenType.EOF)] = .{ .prefix = null, .infix = null, .precedence = .NONE };
    return &r;
}

const rules = makeRules();

fn getRule(t: TokenType) ParseRule {
    return rules[@intFromEnum(t)];
}

It works, looks awful.

Apart from that, the chapter is pretty straightforward. Another thing I want to point out is the fact that C supports compilation flags out of the box, and Zig doesn’t.

Well, in C it is a hot mess, but it works reliably enough to see it as a feature in my opinion:

#define DEBUG_PRINT_CODE

// later in the code

#ifdef DEBUG_PRINT_CODE
    // compile-time code injection
#endif

Next, when compiling the code, you can supply -DDEBUG_PRINT_CODE to the compiler and it will set this to something so the #ifdef directive contents will be included in the resulting source code.

In Zig, there’s again comptime and I guess writing if (comptime DEBUG_PRINT_CODE) { ... } would include/eliminate the code at compile time, but how do we supply it to the compiler? The answer is the build system of Zig, which itself is just a program written in Zig. And again, I feel that this is way too complicated than it should be, especially since I have tests in a separate directory, and I have to pass compilation options to tests as well as to the main executable. I didn’t bother with that in the end.

Finally, challenges. This chapter has a decent set of challenges3, except for the last one:

  1. You might be wondering about complex “mixfix” expressions that have more than two operands separated by tokens. C’s conditional or “ternary” operator, ?:, is a widely known one.

    Add support for that operator to the compiler. You don’t have to generate any bytecode, just show how you would hook it up to the parser and handle the operands.

First, no - why would you need to implement ?: before we have implemented if itself? Second - why implement ?: at all if we’re implementing the new language and can make if to be an expression? I understand that this is a small and simple language, but the if expressions can be a great way to introduce people to the concept of familiar statements being expressions as well.

Types of Values

This chapter begins with a paragraph I can connect with:

The nice thing about working in C is that we can build our data structures from the raw bits up. The bad thing is that we have to do that. C doesn’t give you much for free at compile time and even less at runtime. As far as C is concerned, the universe is an undifferentiated array of bytes. It’s up to us to decide how many of those bytes to use and what they mean.

Indeed, when I was working with C this was always the case. On one hand, it’s problematic, on the other, it is empowering. But I’m doing this book’s project in Zig and I can’t say that Zig is the same as C in this regard. Instead of doing things the C way, Zig provides more high-level concepts to us, like union enums, and such. So let’s try to use these instead.

Here’s how we implement the Value:

pub const Value = union(enum) {
    Number: f64,
    Bool: bool,
    Nil,

    pub fn print(self: Value) void {
        switch (self) {
            .Number => |value| std.debug.print("{d}", .{value}),
            .Bool => |value| std.debug.print("{}", .{value}),
            .Nil => std.debug.print("nil", .{}),
        }
    }
}

In its essence, it’s similar to the union in the book:

typedef enum {
  VAL_BOOL,
  VAL_NIL,
  VAL_NUMBER,
} ValueType;

typedef struct {
  ValueType type;
  union {
    bool boolean;
    double number;
  } as;
} Value;

However, as an added bonus, we can add methods to our Value type, the same as with other struct’s. As you can see, the print function is a part of the union(enum) and can be invoked as val.print() later in the code.

Once again, we have to change parts of our code that are quite basic, but nothing too breaking this time. All I had to do was wrap numbers into the Value constructor in a few tests. This is just the beginning though, we have to make more Value sub-types for strings, and objects.

Strings

Now, this is where my previous attempt at this book in Zig train-wrecked. Or, should I say, it was the beginning of the end for that implementation. You see, til now, most concepts introduced by the book were easy enough to translate from C to Zig, because most of the time the stuff was either the same or already backed by Zig’s standard library. For example, at the start of the book, I decided not to implement dynamic arrays, because:

  1. I already made my fair share of dynamic arrays in C back when I was a C programmer
  2. After looking at what the book suggests, and the implementation in Zig’s standard library, they’re practically the same, with some small changes that are irrelevant for the matter.

Other stuff was pretty much the same, except for, perhaps, how pointers are used - instead I used offsets most of the time, as it is easier to do. Unions and enumerations are again similar, though with some small differences, like being able to use union(enum) and add methods onto it, but again, C code kinda does the same thing, just the C way. Strings, however, are a different beast. They’re objects, and hence they’re dynamically typed because objects kinda are. E.g. in Lox string is an object, a function is also an object, and a class also is an object, and we have to express this idea somehow in C/Zig. Here’s how the book does it in C:

typedef enum {
  OBJ_STRING,
} ObjType;

struct Obj {
  ObjType type;
};

This is only a struct definition for the object type, but there’s a certain trick that we can do in C by abusing the fact that structure layouts are consistent. We can create a second structure like this one:

struct ObjString {
  Obj obj;
  int length;
  char* chars;
};

Now, here’s an interesting part - we can convert a pointer between these two structs at will! It is possible because the memory layout between the first field in the ObjString and the entirety of Obj struct is the same. Now, since the first field in the struct has the same address as the struct itself, i.e. an offset of 0, and the struct field size is the same as its type, given the pointer to ObjString we can cast it to Obj and read its fields as normal.

But how do we do that in Zig? Zig’s structs are much more than C structs, as they can have methods, we can manipulate field types at compile time, and so on. My first approach was like this:

pub const LoxString = []const u8;

pub const LoxObject = struct {
    next: ?*LoxObject,
    data: union(enum) {
        String: LoxString,
        _,
    },

    pub fn objType(self: LoxObject) Tag(LoxObject) {
        return activeTag(self.data);
    }

    pub fn isString(self: LoxObject) bool {
        return switch (self.data) {
            .String => true,
            else => false,
        };
    }

    pub fn allocString(s: LoxString, vm: *VM(Code), allocator: Allocator) *LoxObject {
        var obj = allocator.create(LoxObject) catch exit(123, "Not enough memory to allocate object");
        switch (obj.*.data) {
            .String => |*value| value.* = s,
            else => unreachable,
        }
        obj.next = vm.objects;
        vm.objects = obj;
        return obj;
    }

    pub fn freeObject(obj: *LoxObject, allocator: Allocator) void {
        switch (obj.*.data) {
            .String => |s| allocator.free(s),
            else => unreachable,
        }
        allocator.destroy(obj);
    }
};

As you can see, I decided to store the underlying data of the object in the union(enum), similarly to how I do that for Value:

pub const Value = union(enum) {
    Number: f64,
    Bool: bool,
    Nil,
    Obj: *Obj,
    // ...
};

Later, when we added more object types, this became a problem. This layout means that I can’t cast the LoxObject to a LoxString, and I can’t really access it without going through the union:

test "string concatenation" {
    var vm = VM(Code).init(allocator);
    defer vm.deinit();

    try switch (interpret("\"a\" + \"b\"", &vm).OK) {
        .Obj => |obj| switch (obj.*.data) {
            .String => |s| expectEqualSlices(u8, s, "ab"),
            else => unreachable,
        },
        else => unreachable,
    };
}

The code was filled with these obj.*.data switches, that extracted the string. And later on, at some point, I had really weird bugs, like when my isString function returns true, but when I try to access the string, I get the error “trying to access the inactive field of enumeration”. Be it a bug in Zig itself or in my code I don’t care.

Since then I changed the implementation to be more like in the book:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
pub const Type = enum {
    String,
};

pub const Obj = struct {
    objType: Type,
    next: ?*Obj,

    pub fn allocate(vm: *VM, comptime T: type, objType: Type) *Obj {
        const ptr = vm.allocator.create(T) catch
            Common.exit(200, "OOME: can't allocate object", .{});
        ptr.obj = Obj{ .objType = objType, .next = vm.objects };
        vm.objects = &ptr.obj;
        return &ptr.obj;
    }

    pub fn asString(self: *Obj) *String {
        return @fieldParentPtr(String, "obj", self);
    }

    pub fn destroy(self: *Obj, vm: *VM) void {
        switch (self.objType) {
            .String => self.asString().destroy(vm),
        }
    }
};

pub const String = struct {
    obj: Obj,
    bytes: []const u8,

    pub fn create(vm: *VM, bytes: str) *String {
        const obj = Obj.allocate(vm, String, .String);
        const out = obj.asString();
        out.* = String{ .obj = obj.*, .bytes = bytes };
        return out;
    }

    pub fn copy(vm: *VM, source: str) *String {
        const buffer = vm.allocator.alloc(u8, source.len) catch
            Common.exit(200, "OOME: can't allocate string", .{});
        std.mem.copy(u8, buffer, source);
        return String.create(vm, buffer);
    }

    pub fn destroy(self: *String, vm: *VM) void {
        vm.allocator.free(self.bytes);
        vm.allocator.destroy(self);
    }
};

Now, this @fieldParentPtr feels sketchy with its "obj" parameter. It is supposed to be a field name, and I find string a weird choice as strings can contain arbitrary data, while field names can’t. But that’s beside the point, let’s look at it a bit closer:

17
18
19
    pub fn asString(self: *Obj) *String {
        return @fieldParentPtr(String, "obj", self);
    }

The signature of this builtin is as follows: comptime ParentType: type, comptime field_name: []const u8, field_ptr: *T, and the return type is a pointer to the ParentType. As you can see, the usage is interesting - we pass the desired type String which is a different struct. Then we pass the "obj" field name, and the last parameter should be a pointer to a field, for which we use the object pointer itself.

Presumably, Zig knows its memory layout and can calculate the offset to the base struct given a field pointer. Indeed, knowing the layout and giving a pointer to a field, it is possible to calculate the start of the struct, however, in this case, we do it a bit weird, as we get a pointer to the Obj structure, which doesn’t have the obj field at all. We then use this pointer as a third parameter, i.e. as a pointer to the field of the String structure. Therefore, this built-in calculates the width of the obj field, subtracts it from the pointer address, and returns the pointer to the base struct but as the String type.

Overall, this looks like what we do in C, except we use @fieldParentPtr to do the pointer casting. We can see it being used in the create function:

32
33
34
35
36
37
    pub fn create(vm: *VM, bytes: str) *String {
        const obj = Obj.allocate(vm, String, .String);
        const out = obj.asString();
        out.* = String{ .obj = obj.*, .bytes = bytes };
        return out;
    }

We have the object allocated as a String, ensuring the memory layout, but then we assign the .obj field an Obj structure, finally returning a pointer to this field:

 9
10
11
12
13
14
15
    pub fn allocate(vm: *VM, comptime T: type, objType: Type) *Obj {
        const ptr = vm.allocator.create(T) catch
            Common.exit(200, "OOME: can't allocate object", .{});
        ptr.obj = Obj{ .objType = objType, .next = vm.objects };
        vm.objects = &ptr.obj;
        return &ptr.obj;
    }

Everything should now make sense - we indeed have the pointer to the obj field by the time we get to use the asString function, but we always have the pointer to the Obj at hand.

In other words, it’s not as in C where we literally convert a single pointer between two different layouts, abusing the fact that the memory layout allows for it. Instead, we can have a pointer to anywhere inside the struct and still get the correct parent pointer thanks to the known memory layout. We could do the same in C but it would be more work, and far more error-prone, as we would need to compute this offset manually. Here, Zig has us covered.

Overall, I like this approach more, I guess. C’s way of doing this feels like a clever memory trick, while Zig’s way of doing this is more of a compiler-supported way.

Hash Tables

Now, in my previous attempt I thought to myself: “Well, I skipped the dynamic array, let’s at least implement a hash table”. As a result, I ended up looking at how the hash table is implemented in Zig’s standard library. Now, I’m not sure if my implementation was correct, but it seemed that way.

Looking at it now, I’m not sure if I could use a hash table from Zig’s standard library, as it needs custom hashing. I probably can implement some kind of interface for the objects that would allow me to hash them. So let’s look into that:

/// General purpose hash table.
/// No order is guaranteed and any modification invalidates live iterators.
/// It provides fast operations (lookup, insertion, deletion) with quite high
/// load factors (up to 80% by default) for low memory usage.
/// For a hash map that can be initialized directly that does not store an Allocator
/// field, see `HashMapUnmanaged`.
/// If iterating over the table entries is a strong usecase and needs to be fast,
/// prefer the alternative `std.ArrayHashMap`.
/// Context must be a struct type with two member functions:
///   hash(self, K) u64
///   eql(self, K, K) bool
/// Adapted variants of many functions are provided.  These variants
/// take a pseudo key instead of a key.  Their context must have the functions:
///   hash(self, PseudoKey) u64
///   eql(self, PseudoKey, K) bool
pub fn HashMap(
    comptime K: type,
    comptime V: type,
    comptime Context: type,
    comptime max_load_percentage: u64,
) type {
    // ...
}

OK, so the documentation of std.hash_map.HashMap says that we can pass a Context that has the hash and eql functions. That should do it.

Speaking of reusing the standard library - Zig already has the “FNV-1a” algorithm the book implements as a part of std.hash.Fnv1a_32. So we’re going to reuse it too.

After walking through the chapter, here’s the final table:

const table_max_load = 75;

pub const Table = struct {
    hm: HashMap(*String, Value, Ctx, table_max_load),

    const Ctx = struct {
        pub fn hash(_: Ctx, key: *String) u64 {
            return @intCast(key.*.hash);
        }

        pub fn eql(_: Ctx, key_a: *String, key_b: *String) bool {
            return key_a.*.hash == key_b.*.hash and
                std.mem.eql(u8, key_a.*.bytes, key_b.*.bytes);
        }
    };

    pub fn init(allocator: Allocator) Table {
        return Table{ .hm = HashMap(*String, Value, Ctx, table_max_load).init(allocator) };
    }

    pub fn deinit(self: *Table) void {
        self.hm.deinit();
    }

    pub fn findString(self: *Table, chars: str, hash: u32) ?*String {
        return self.hm.getKey(@constCast(&String{ .obj = undefined, .hash = hash, .bytes = chars }));
    }

    pub fn set(self: *Table, key: *String, value: Value) bool {
        return if (self.hm.fetchPut(key, value) catch
            common.exit(205, "can't put the key to a table", .{})) |_|
            false // key is not new
        else
            true;
    }

    pub fn get(self: *Table, key: *String) ?Value {
        return self.hm.get(key);
    }

    pub fn delete(self: *Table, key: *String) bool {
        self.hm.remove(key);
    }
};

Someone would say that that’s cheating.

While yes, I'm not doing what the book suggests, I already have done that before.
const table_max_load = 0.75;

const Entry = struct {
    key: ?*LoxObject,
    value: Value,
};

inline fn growCapacity(capacity: usize) usize {
    return if ((capacity) < 8) 8 else capacity * 2;
}

pub const Table = struct {
    entries: []Entry,
    capacity: usize,
    allocator: Allocator,
    count: usize,

    pub fn init(allocator: Allocator) Table {
        return Table{
            .allocator = allocator,
            .entries = &[_]Entry{},
            .capacity = 0,
            .count = 0,
        };
    }

    fn allocatedSlice(self: *Table) []Entry {
        return self.entries.ptr[0..self.capacity];
    }

    pub fn deinit(self: *Table) void {
        self.allocator.free(self.allocatedSlice());
    }

    fn findEntry(entries: []Entry, capacity: usize, key: *LoxObject) *Entry {
        const hash = switch (key.data) {
            .String => |s| s.hash,
            else => unreachable,
        };
        var index = hash % capacity;
        var tombstone: ?*Entry = null;

        while (true) {
            const entry = &entries[index];
            if (entry.key == null) {
                if (entry.value.isNil()) {
                    return if (tombstone) |t| t else entry;
                } else {
                    if (tombstone == null) tombstone = entry;
                }
            } else if (entry.key == key) {
                return entry;
            }
            index = (index + 1) % capacity;
        }
    }

    pub fn findString(self: *Table, chars: []const u8, hash: u32) ?*LoxObject {
        if (self.count == 0) return null;

        var index = hash % self.capacity;
        const entries = self.entries;
        while (true) {
            const entry = entries[index];
            if (entry.key) |k| {
                if (switch (k.data) {
                    .String => |s| s.hash == hash and
                        std.mem.eql(u8, s.chars, chars),
                    else => false,
                })
                    return entry.key;
            } else {
                if (entry.value.isNil()) return null;
            }

            index = (index + 1) % self.capacity;
        }
    }

    fn adjustCapacity(self: *Table, capacity: usize) void {
        const new_memory = self.allocator.alloc(Entry, capacity) catch
            common.exit(48, "can't allocate memory for the hash table");

        for (new_memory) |*entry| {
            entry.key = null;
            entry.value = Nil;
        }

        self.count = 0;
        for (self.entries) |*entry| {
            if (entry.key) |key| {
                var dest = findEntry(new_memory, capacity, key);
                dest.key = entry.key;
                dest.value = entry.value;
                self.count += 1;
            }
        }

        self.allocator.free(self.allocatedSlice());

        self.entries = new_memory;
        self.capacity = capacity;
    }

    fn tableAddAll(from: *Table, to: *Table) void {
        var i = 0;
        while (i < from.capacity) : (i += 1) {
            const entry = &from.entries[i];
            if (entry.key != null) {
                to.tableSet(entry.key, entry.value);
            }
        }
    }

    pub fn set(self: *Table, key: *LoxObject, value: Value) bool {
        if (self.count + 1 > @floatToInt(usize, @intToFloat(f32, self.capacity) * table_max_load)) {
            const capacity = growCapacity(self.capacity);
            self.adjustCapacity(capacity);
        }
        var entry = findEntry(self.entries, self.capacity, key);
        const is_new = entry.key == null;
        if (is_new and entry.value.isNil()) self.count += 1;
        entry.key = key;
        entry.value = value;
        return is_new;
    }

    pub fn get(self: *Table, key: *LoxObject) ?Value {
        if (self.count == 0) return null;

        const entry = findEntry(self.entries, self.capacity, key);
        if (entry.key == null) return null;

        return entry.value;
    }

    pub fn delete(self: *Table, key: *LoxObject) bool {
        if (self.count == 0) return false;

        const entry = findEntry(self.entries, self.capacity, key);
        if (entry.key == null) return false;

        entry.key = null;
        entry.value = True;
        return true;
    }
};

I just don’t feel like doing it again.

And, I would like to learn how to use Zig’s own hash tables, so this is a perfect opportunity. The use of Context is interesting - thanks to it we can implement the hashing however we want.

The only thing I don’t like in the new solution is the @constCast:

25
26
27
    pub fn findString(self: *Table, chars: str, hash: u32) ?*String {
        return self.hm.getKey(@constCast(&String{ .obj = undefined, .hash = hash, .bytes = chars }));
    }

This function itself is a bit hacky, as we only mimic the key object by creating it on a stack. Alternatively, I could use the getKeyAdapted, and provide it a context that does a lookup of a fake key:

17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
    // after 'const Ctx = struct { ... };'

    const AdaptedCtx = struct {
        h: u32,
        b: str,

        pub fn init(b: str, h: u32) AdaptedCtx {
            return AdaptedCtx{ .b = b, .h = h };
        }

        pub fn hash(self: AdaptedCtx, comptime _: @TypeOf(null)) u64 {
            return @intCast(self.h);
        }

        pub fn eql(self: AdaptedCtx, comptime _: @TypeOf(null), key: *String) bool {
            return self.h == key.*.hash and
                std.mem.eql(u8, self.b, key.*.bytes);
        }
    };

Now we can change findString to:

43
44
45
    pub fn findString(self: *Table, chars: str, hash: u32) ?*String {
        return self.hm.getKeyAdapted(null, AdaptedCtx.init(chars, hash));
    }

I don’t really like it either, and I don’t know which one is better/more idiomatic. But, other than that, the new version is nice and clear.

Global Variables

Here we go again. By the end of this chapter, all tests will be broken. Again.

I mean, I’m glad I write tests, as Zig privies a test allocator that acts like Valgrind and I had noticed a couple of memory leaks. Not because the book has them, but just because I was careless and forgot to clean up some values.

However, now no VM test is passing. Why is that? Because this chapter introduced statements, and now if you write any expression you must end it with a semicolon, thus turning it into a statement. And statements are known for not returning anything. Hence the interpret function now can’t return anything, and therefore I can’t test for anything. Amazing!

Now, there are ways of working this around but all of these are hacky. I think the proper solution would be to introduce a top-level return statement, so we could return any expression from the top level. And it’s a shame - the book mentions the REPL four times during this chapter, saying how useful something would be in the context of one. However, it completely disregards the fact that REPL stands for Read Eval Print Loop, and in our case, we can’t really print anything in the REPL, because now everything is a statement. Welp.

What I like about Lox is that the syntax for declaring a variable requires a keyword. This is one of the reasons I dislike languages like Python - looking at the assignment you can’t guess if you’re creating a new variable or assigning an existing one. Paired with the absence of scopes, we get this horrible thing:

def foo():
    if True:
        a = 10
    print(a)

foo()

Why is a available outside the scope of the if statement? Because there’s only one scope4 - the function scope.

Lua is also guilty of this - assigning a non-existent variable creates a global:

function foo()
   if true then
      a = 10
   end
   print(a)
end

foo()

But at least it is global, and you can avoid this behavior by using a local keyword. Something that all of the Lua code should really do, that’s why Fennel uses local by default.

JavaScript also had this problem, but then they introduced the let keyword. It’s still weird because let variables are “hoisted” - their logical position of definition is the top of their enclosing scope. It’s not as bad as it sounds, e.g. you can’t do this:

function f() {
  {
    console.log(x) // Error: can't access lexical declaration 'x' before initialization
  }
  let x = 1
}

But you can do this:

function f() {
  function g() {
    console.log(x) // a closure? to what?
  }
  let x = 1
  g()
}
f() // prints 1

But I’m getting ahead of the book. The next chapter is on local variables and scope, so let’s see how Lox handles them.

Local Variables

And to be honest, it’s not bad. Lox handles local variables pretty much the same way as most other languages I have used. The only nitpick I have is this:

{
  var a = 20;
  var a = 30; // Error at 'a': Already a variable with this name in this scope.
}

I don’t like this. In Clojure, that’s the norm, because there’s no way to set a new value to an already established binding. So you can overshadow it with a different value.

Rust also allows shadowing, and I think that’s a good feature to have. Sometimes you want to keep the name but change the type. The Rust book has this example:

let spaces = "   ";
let spaces = spaces.len();

Note, this can’t be done using mutability, because of the type system:

let mut spaces = "   ";
spaces = spaces.len();
The error says we’re not allowed to mutate a variable’s type:

$ cargo run
   Compiling variables v0.1.0 (file:///projects/variables)
error[E0308]: mismatched types
 --> src/main.rs:3:14
  |
2 |     let mut spaces = "   ";
  |                      ----- expected due to this value
3 |     spaces = spaces.len();
  |              ^^^^^^^^^^^^ expected `&str`, found `usize`

For more information about this error, try `rustc --explain E0308`.
error: could not compile `variables` due to previous error

Lox, however, is dynamically typed, so this can be done using mutability, and that may be a reason to disallow overshadowing. Still, I would prefer variables that are constant, and overshadowing instead of assignments.

Jumping Back and Forth

Oh, boy.

This is a big chapter. It introduces most of the control flow primitives to the language, in terms of branching, looping, and short-circuiting. I have a lot to say.

If statements

First things first, the book introduces the if statement. To which I ask - why not introduce an if expression instead? if expressions are very useful, and not that hard to implement, especially in a small language like Lox. I guess, it’s a bit late to do so, as almost everything in Lox grammar builds from the statement rule, so if expressions would be out of place. It’s a shame though, as a lot of people don’t even know that if can be an expression, without being a ternary operator ?:.

Anyway, the if expression is implemented in an interesting way. We parse the if and the condition, and emit a jump instruction. Emitting is a bit special we also added a placeholder that we will later patch:

    fn emitJump(self: *Self, instruction: OPCode) u32 {
        self.emitByte(@intFromEnum(instruction));
        self.emitByte(0xff);
        self.emitByte(0xff);
        return @as(u32, @intCast(self.currentChunk().code.items.len)) - 2;
    }

These two emitByte calls with 0xff as an argument are exactly for that.

After the emitJump step, we parse a statement and then emit another jump instruction. And right after that, we patch the original jump:

    fn ifStatement(self: *Self) void {
        self.consume(.LEFT_PAREN, "Expect '(' after 'if'.");
        self.expression();
        self.consume(.RIGHT_PAREN, "Expect ')' after condition.");

        const thenJump = self.emitJump(.JumpIfFalse);
        self.emitByte(@intFromEnum(OPCode.Pop));
        self.statement();

        const elseJump = self.emitJump(.Jump);
        self.patchJump(thenJump);
        self.emitByte(@intFromEnum(OPCode.Pop));

        if (self.match(.ELSE)) self.statement();
        self.patchJump(elseJump);
    }

The same happens for the elseJump at the end of the function, if the else branch is present.

So, how it works? The patchJump function looks like this:

    fn patchJump(self: *Self, offset: u32) void {
        // -2 to adjust for the bytecode for the jump offset itself.
        var jump = self.currentChunk().code.items.len - offset - 2;

        if (jump > std.math.maxInt(u16)) {
            self.error_("Too much code to jump over.", .{});
        }

        self.currentChunk().code.items[offset] = @intCast((jump >> 8) & 0xff);
        self.currentChunk().code.items[offset + 1] = @intCast(jump & 0xff);
    }

Basically, we count how much code we generated by reading the statement after the condition, and patch in the address we need to jump to if the condition was false. It’s a clever trick. Notice, the if statement itself doesn’t know anything about scopes - all that is handled by calling the expression function. It will kinda be important later.

Aside from not being an expression, I have no objections to implementing the if statement this way. Well, the limitation of 65535 opcodes is a bit weird, but if you may remember, the book earlier suggested implementing ways to have more than 256 constants by introducing more instructions that know how to deal with that. Here, we could also introduce a LongJump and LongJumpIfFalse instructions that would handle 4294967295 opcodes instead if the body is longer than 65535 opcodes. But this complicates things, and previously it resulted in a lot of misunderstanding where I should and should not take care of such long instructions while I did a first run of the book. So I won’t do it here.

Now, then there’s a “Challenge”:

  1. In addition to if statements, most C-family languages have a multi-way switch statement. Add one to clox. The grammar is:

       switchStmt      "switch" "(" expression ")"
                        "{" switchCase* defaultCase? "}" ;
       switchCase      "case" expression ":" statement* ;
       defaultCase     "default" ":" statement* ;
    

    To execute a switch statement, first evaluate the parenthesized switch value expression. Then walk the cases. For each case, evaluate its value expression. If the case value is equal to the switch value, execute the statements under the case and then exit the switch statement. Otherwise, try the next case. If no case matches and there is a default clause, execute its statements.

    To keep things simpler, we’re omitting fallthrough and break statements. Each case automatically jumps to the end of the switch statement after its statements are done.

Uh, this isn’t how switch is supposed to work. The swich statement is a constant dispatch, meaning cases aren’t tried in order, instead, the right one is picked based on the condition. It can be done with a jump table, or by computing offsets at compile time, and allowing only integer switching and then jumping to the given branch. That’s also a reason why C implements break, without it the program will continue executing to the next label because labels don’t really exist in the program. This allows for clever tricks like the Duff’s device:

send(to, from, count)
register short *to, *from;
register count;
{
    register n = (count + 7) / 8;
    switch (count % 8) {
    case 0: do { *to = *from++;
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
            } while (--n > 0);
    }
}

I do understand that the book tries to keep things easy, but this kind of switch statement is a useless addition in my opinion. If Lox had macros, such switch could be easily implemented in terms of chained if else.

Loops

I’m skipping over logical operators, as I don’t have anything to say about them - they’re OK. Loops, however, are, again, weird.

First, we have a while loop, which is probably the only kind of loop any language needs if it doesn’t implement tail call elimination/optimization technique. It is implemented in pretty much the same way as the if statement, except we emit a Loop instruction instead. Again, the design is OK, it is very bare-bones and clean:

    fn emitLoop(self: *Self, loopStart: u32) void {
        self.emitByte(@intFromEnum(OPCode.Loop));

        const offset = self.currentChunk().code.items.len - loopStart + 2;
        if (offset > std.math.maxInt(u16)) self.error_("Loop body too large.", .{});

        self.emitByte(@intCast((offset >> 8) & 0xff));
        self.emitByte(@intCast(offset & 0xff));
    }

    fn whileStatement(self: *Self) void {
        const loopStart: u32 = @intCast(self.currentChunk().code.items.len);
        self.consume(.LEFT_PAREN, "Expect '(' after 'while'.");
        self.expression();
        self.consume(.RIGHT_PAREN, "Expect ')' after condition.");

        const exitJump = self.emitJump(.JumpIfFalse);
        self.emitByte(@intFromEnum(OPCode.Pop));
        self.statement();
        self.emitLoop(loopStart);

        self.patchJump(exitJump);
        self.emitByte(@intFromEnum(OPCode.Pop));
    }

Then, however, we have a for loop. Not only the implementation is much more involved, as it is implemented in terms of the while loop, but it is also quite useless. Why? Because it is a plain old C-style numeric loop. We already have while, why do we need for? I mean, it’s not too hard to translate this into while loop:

for (var i = 0; i < 10; i = i + 1) {
  print i;
}

Here:

{ var i = 0; while (i < 10) {
    print i;
  i = i + 1; } }

That’s about the same amount of code. Yes, I’ve used weird formatting on the while loop example, but only to show that the actual amount of code is about the same. And that’s exactly what the book does in the implementation of the for loop, except it’s this amount of code:

    fn forStatement(self: *Self) void {
        self.beginScope();
        self.consume(.LEFT_PAREN, "Expect '(' after 'for'.");
        if (self.match(.SEMICOLON)) {
            // No initializer.
        } else if (self.match(.VAR)) {
            self.varDeclaration();
        } else {
            self.expressionStatement();
        }

        var loopStart: u32 = @intCast(self.currentChunk().code.items.len);
        var exitJump: i32 = -1;
        if (!self.match(.SEMICOLON)) {
            self.expression();
            self.consume(.SEMICOLON, "Expect ';' after loop condition.");
            // Jump out of the loop if the condition is false.
            exitJump = @intCast(self.emitJump(.JumpIfFalse));
            self.emitByte(@intFromEnum(OPCode.Pop)); // Condition.
        }

        if (!self.match(.RIGHT_PAREN)) {
            const bodyJump = self.emitJump(.Jump);
            const incrementStart: u32 = @intCast(self.currentChunk().code.items.len);
            self.expression();
            self.emitByte(@intFromEnum(OPCode.Pop));
            self.consume(.RIGHT_PAREN, "Expect ')' after for clauses.");

            self.emitLoop(loopStart);
            loopStart = incrementStart;
            self.patchJump(bodyJump);
        }

        self.statement();
        self.emitLoop(loopStart);

        if (exitJump != -1) {
            self.patchJump(@intCast(exitJump));
            self.emitByte(@intFromEnum(OPCode.Pop)); // Condition.
        }

        self.endScope();
    }

Again, if Lox had macros, the language could define for in terms of while without all this code. Macros are special because they allow your language to operate on your code. which may sound complicated, but in actuality is much easier than fiddling with this bytecode generation, as it is more high-level. Here’s an example of a for macro for Emacs Lisp:

(defmacro for (initializer condition step &rest body)
  `(let (,initializer)
     (while ,condition
       (progn ,@body)
       (setq ,(car initializer) ,step))))

(macroexpand-1
 '(for (a 0) (< a 10) (+ a 1)
    (print a)))
;; (let ((a 0))
;;   (while (< a 10)
;;     (progn (print a))
;;     (setq a (+ a 1))))

Yes, it will get more involved if we would like to support a clause without an initializer, or with no explicit step, but in general, manipulating your language with your language is more concise. Again, I know why the book teaches it this way, and it is good to show that the for loop can be implemented in terms of a while loop, but here’s a second question - what the for loop is for in Lox?

There are no arrays in Lox, and you can’t index strings either. The language will add objects later to teach how to implement object-oriented programming, but that’s a hard sell for me and doesn’t make for more useful. Because for doesn’t support object traversal either. Well, theoretically, we can implement the linked list using objects in Lox, and then traverse them using the for loop like so:

for (var pair = ll.rest(); pair.tail != nil; pair = ll.rest()) {
  var val = pair.head;
  // do stuff
}

But that’s not a great interface for linked lists either.

OK, I am nitpicking here. But wait, there’s one more thing - where’s a break statement? Oh, there’s another “Challenge”:

  1. In jlox, we had a challenge to add support for break statements. This time, let’s do continue:

       continueStmt    "continue" ";" ;
    

    A continue statement jumps directly to the top of the nearest enclosing loop, skipping the rest of the loop body. Inside a for loop, a continue jumps to the increment clause, if there is one. It’s a compile-time error to have a continue statement not enclosed in a loop.

    Make sure to think about scope. What should happen to local variables declared inside the body of the loop or in blocks nested inside the loop when a continue is executed?

Wait, it’s not about implementing break? It’s about implementing continue?! Then, what happened to break?

The break is completely omitted from clox. Well, it was a challenge in jlox, and even back then I considered it to be a weird choice. Loops without the break statement are even more useless. Again, you can overcome this by modifying the condition, but that’s extremely ugly. The book has this example in the “Design Note” section:

bool found = false;
for (int x = 0; x < xSize; x++) {
  for (int y = 0; y < ySize; y++) {
    for (int z = 0; z < zSize; z++) {
      if (matrix[x][y][z] == 0) {
        printf("found");
        found = true;
        break;
      }
    }
    if (found) break;
  }
  if (found) break;
}

They suggest that this can be rewritten using goto:

for (int x = 0; x < xSize; x++) {
  for (int y = 0; y < ySize; y++) {
    for (int z = 0; z < zSize; z++) {
      if (matrix[x][y][z] == 0) {
        printf("found");
        goto done;
      }
    }
  }
}
done:

Alternatively, this can be done like this:

bool found = false;
for (int x = 0; !found && x < xSize; x++) {
  for (int y = 0; !found && y < ySize; y++) {
    for (int z = 0; !found && z < zSize; z++) {
      if (matrix[x][y][z] == 0) {
        printf("found");
        found = true;
      }
    }
  }
}

Each for loop will stop once found is set to true. While works, this is clumsy when you need to do an early break:

for (int x = 0; !done && x < 10; x++) {
  if (x == 7) break;
  // rest of the body sits here
}

bool done = false;
for (int x = 0; !done && x < 10; x++) {
  if (x == 7) {
    done = true;
  } else {
    // rest of the body has to be in the else clause
  }
}

While works, this isn’t particularly great. I work with a language that has neither break nor continue - Fennel. It doesn’t have an early return either. It works, however, because Fennel is a lisp-like language, even though it just compiles to Lua. Its syntax is concise, and that allows us to write things with less indentation and nesting than in most other languages. Still, sometimes it is clumsy to write a certain algorithm without an early break.

continue and break

I’m not going to implement continue, I just don’t feel like it, and certainly haven’t seen continue used in any of the code in ten years. But let’s just think how we could do that.

Here’s an example code:

for (var i = 0; i < 10; i = i + 1) {
  if (i == 5) continue;
  print i;
}

As you may remember, this loop really compiles to this equivalent while loop:

{
  var i = 0;
  while (i < 10) {
    if (i == 5) continue;
    print i;
    i = i + 1;
  }
}

Well, not quite - I lied to you. In the bytecode this loops looks nothing like that, instead it is structured kinda like this:

{
  var i = 0;

  condition:
  if (i < 10)
    goto body:

  increment:
  i = i + 1;
  goto condition:

  body:
  if (i == 5) continue;
  print i;
  goto increment:
}

Upon initialization, we create a condition: label, followed by the if statement. If the if clause is true we goto (or rather use the JUMP instruction) to the body: label. Then, at the end of the body, we goto to the increment: label, which finally brings us back to the condition: label.

Convoluted, huh? Well, the VM doesn’t really have a goto, instead it uses the jumping instructions, but the idea is the same. Now we need to somehow incorporate continue in that.

Semantically, continue should just go to the increment: label, right? Well, yes, however, consider this loop:

{
  var i = 0;
  while (i < 10) {
    if (i == 5) {
      var random = random_int(10);
      // assume we can concatenate strings with numbers using +
      print "The loop will continue from" + random;
      i = random - 1;
      continue;
    };
    print i;
    i = i + 1;
  }
}

Now, this is no different, we still can jump to the increment: label, but what happens to the random variable? Correct, it leaks because by jumping we don’t properly close the scope. So, we need to track how much scopes we jump over as well.

And, mind that, continue may not be the last expression in the scope, so we can’t simply close the scope on it. It can be nested in multiple scopes too.

Implementing break is even harder, because it has to jump over the loop, but we can’t know the loop size, because our compiler is single-pass. Storing the marker and patching it later made harder because break is part of the expression inside the loop, and so we need to have a stack of break labels somewhere.

On the first run I tried some approaches, but failed. None of my solutions worked, and I honestly don’t know why. I consider this partly a book’s failure - implementing break and continue is non-trivial, and just throwing it as a “Challenge” is cheap. I’ve searched for different implementations of the Lox on the web, and I can’t remember if I found any with the continue statement implemented. Maybe, that’s because continue is pretty much useless. Maybe, because it’s hard and unclear how to do so.

I would appreciate if the book had two versions of CLox implemented - one is a bare minimum by-the-book, the other one with all challenges completed. I wonder, if this would change the contents of the book, because when I did the challenges, a lot of later material became less clear, because the code differed a lot by that time. Anyhow, there’s the third “Challenge” wich would be interesting to see implemented officially:

  1. Control flow constructs have been mostly unchanged since Algol 68. Language evolution since then has focused on making code more declarative and high level, so imperative control flow hasn’t gotten much attention.

    For fun, try to invent a useful novel control flow feature for Lox. It can be a refinement of an existing form or something entirely new. In practice, it’s hard to come up with something useful enough at this low expressiveness level to outweigh the cost of forcing a user to learn an unfamiliar notation and behavior, but it’s a good chance to practice your design skills.

What’s the point of this “Challenge”? I don’t know - maybe you would came up with something like COMEFROM, or maybe you’ll be able to create something that will change computing forever. I don’t want to waste time on pretty much useless stuff though. If it’s meant to be a joke, then OK, but I didn’t get it.

Calls and Functions

Finally, some real action. Defining functions is pretty straightforward, and I don’t really have anything to say about this chapter…

EXCEPT:

We’re not totally done, though. The new return statement gives us a new compile error to worry about. Returns are useful for returning from functions but the top level of a Lox program is imperative code too. You shouldn’t be able to return from there.

return "What?!";

We’ve specified that it’s a compile error to have a return statement outside of any function, which we implement like so:

static void returnStatement() {

compiler.c
in returnStatement()

  if (current->type == TYPE_SCRIPT) {
    error("Can't return from top-level code.");
  }

  if (match(TOKEN_SEMICOLON)) {

This is one of the reasons we added that FunctionType enum to the compiler.

Why? I mean, even the book kinda mentions that it can be useful:

Allowing return at the top level isn’t the worst idea in the world. It would give you a natural way to terminate a script early. You could maybe even use a returned number to indicate the process’s exit code.

The author seems to be familiar with Lua, so I don’t really get why they discard top-level = return statements. In my implementation, I left this check out. Especially because it helps with making testing possible again.

As for performance… Well, the book mentions that the current by-the-book implementation of CLox is faster than Python at computing the factorial of a number:

> fun fib(n) { if (n < 2) return n; return fib(n - 2) + fib(n - 1); }
> var start = clock(); print fib(35); print clock() - start;
9.22746e+06
1.35232

I’ve checked this Python implementation:

Python 3.11.5 (main, Aug 28 2023, 00:00:00) [GCC 13.2.1 20230728 (Red Hat 13.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> def fib(n):
      if (n < 2):
        return n;
      return fib(n - 2) + fib(n - 1)
... ... ... ...
>>> import time
>>> start = time.time(); print(fib(35)); print(time.time() - start)
9227465
1.7716176509857178

It is certainly slower than CLox, what about my implementation in Zig?

./target/zig/bin/lox-0.1.73
ziglox> fun fib(n) { if (n < 2) return n; return fib(n - 2) + fib(n - 1); }
nil
ziglox> var start = clock(); print fib(35); print clock() - start;
9227465
28.784000158309937
nil

Almost 30 seconds?! I’m not even sure where the problem is, the speed difference is drastic. And I’m unsure how to diagnose this too, perhaps there’s a way to profile Zig code? Apparently, there’s this project, and there are bindings available, but I couldn’t make it build.

Wait… Zig does all these safety checks by default, right? And the default optimization target is “Debug”. Let’s try compiling with other optimization levels:

rm -rf ./target
zig build -Doptimize=ReleaseSafe -p ./target/zig --cache-dir ./target/zig/.cache
./target/zig/bin/lox-0.1.73
ziglox> fun fib(n) { if (n < 2) return n; return fib(n - 2) + fib(n - 1); }
nil
ziglox> var start = clock(); print fib(35); print clock() - start;
9227465
3.015000104904175
nil

OK, ReleaseSafe is ten times faster! Let’s try ReleaseFast then:

rm -rf ./target
zig build -Doptimize=ReleaseFast -p ./target/zig --cache-dir ./target/zig/.cache
./target/zig/bin/lox-0.1.73
ziglox> fun fib(n) { if (n < 2) return n; return fib(n - 2) + fib(n - 1); }
nil
ziglox> var start = clock(); print fib(35); print clock() - start;
9227465
1.7879998683929443
nil

Now we’re as fast as Python, yay… Let’s throw in Lua for comparison:

Lua 5.4.4  Copyright (C) 1994-2022 Lua.org, PUC-Rio
> function fib(n) if (n < 2) then return n end return fib(n - 2) + fib(n - 1) end
> socket = require"socket"
> time = socket.gettime
> start = time(); print(fib(35)); print(time() - start)
9227465
1.0116229057312

Well, Lua is often called one of the fastest scripting languages, so I guess that statement holds up well. A complete CLox implementation by the author of the book is also fast:

> fun fib(n) { if (n < 2) return n; return fib(n - 2) + fib(n - 1); }
> var start = clock(); print fib(35); print clock() - start;
9.22746e+06
1.3271

However, the way the result is printed isn’t particularly helpful.

Well, now, that is sorted out, I think we can conclude that the implementation is fast enough. I will keep the debug optimization target for now, as all these checks are helpful, but I guess I need a separate make rule for a release build.

There’s one “Challenge” in this chapter I’d like to tackle:

  1. Add some more native functions to do things you find useful. Write some programs using those. What did you add? How do they affect the feel of the language and how practical it is?

Here are two functions that I would consider useful.

The first one is length:

fn lengthNative(n: u8, args: [*]Value, vm: *VM) Result {
    if (n > 0) {
        const s = args[0];
        if (s.isString())
            return .{ .OK = Value.from(s.asString().bytes.len) };
    }
    return vm.runtimeError("Expected a string as an argument.", .{});
}

Simply put, it’s handy to be able to know the length of the string. It’s not generic, because there are no other countable data structures in Lox.

As you may noticed, the return value of this function is not of a Value type but an Result one. That’s because we have to signal a runtime error if we get something unexpected as an argument. It plays nicely with the previous Challenge:

  1. Right now, there’s no way for a native function to signal a runtime error. In a real implementation, this is something we’d need to support because native functions live in the statically typed world of C but are called from dynamically typed Lox land. If a user, say, tries to pass a string to sqrt(), that native function needs to report a runtime error.

    Extend the native function system to support that. How does this capability affect the performance of native calls?

As for performance - we’re wrapping the result, and Zig is quite good at unwrapping these with the switch statement. Here’s how we handle native calls now:

The other one is tostring:

fn tostringNative(nargs: u8, args: [*]Value, vm: *VM) Result {
    if (nargs == 0)
        return vm.runtimeError("Expected an argument", .{});

    var array_list = std.ArrayList(u8).init(vm.allocator);
    defer array_list.deinit();

    args[0].format("{}", .{}, array_list.writer()) catch
        common.exit(200, "OOME: can't tostring a value", .{});

    const s = array_list.toOwnedSlice() catch
        common.exit(200, "OOME: can't tostring a value", .{});
    defer vm.allocator.free(s);

    return .{ .OK = (try String.copy(vm, s)).obj.asValue() };
}

Lox allows us to concatenate strings using the + operator. However, this is almost useless, because we can’t concatenate anything else to it. Formatting a message is pretty much impossible.

With tostring we can use + with anything Lox supports:

ziglox> fun countdown(n) {
          for (var i = n; i > 0; i = i - 1)
            print tostring(i) + "...";
          print "liftoff!!";
        }
nil
ziglox> countdown(3);
3...
2...
1...
liftoff!!

I would also consider adding sleep to the language, as this function is very useful when you need to wait, but don’t want to waste CPU. However, Lox is more of a toy language, so having sleep in it isn’t necessary. Anyway, here’s sleep for completeness:

fn sleepNative(nargs: u8, args: [*]Value, vm: *VM) Result {
    if (nargs == 0)
        return vm.runtimeError("Expected an argument", .{});

    const arg = args[0];
    if (!arg.is(.Number))
        return vm.runtimeError("Expected an integer argument", .{});

    const f = arg.asNumber();
    const i: i64 = @intFromFloat(f);

    if (f != @as(f64, @floatFromInt(i)))
        return vm.runtimeError("Expected an integer argument", .{});

    if (i > 0) sleep(@intCast(i * 1000000));
    return .{.OK = Nil };
}

It doesn’t have much utility, though, but having both the clock and sleep functions can work wonders.

Closures

This is getting out of hand - we’ve been adding new object types, and each one requires a method to test if it’s an instance of a particular object:

pub const Obj = struct {
    objType: Type,
    next: ?*Obj,

    // ...
    pub fn isString(self: *Obj) bool {
        return self.objType == .String;
    }

    pub fn asString(self: *Obj) *String {
        return @fieldParentPtr(String, "obj", self);
    }

    pub fn isFunction(self: *Obj) bool {
        return self.objType == .Function;
    }

    pub fn asFunction(self: *Obj) *Function {
        return @fieldParentPtr(Function, "obj", self);
    }

    pub fn isNativeFunction(self: *Obj) bool {
        return self.objType == .NativeFunction;
    }

    pub fn asNativeFunction(self: *Obj) *NativeFunction {
        return @fieldParentPtr(NativeFunction, "obj", self);
    }
};

Now we need to add one more for Closure, and then a bunch of others too. You can see that the only difference between these methods is the enumeration tag or type. So, instead of adding new ones, let’s delete all of these and use Zig’s comptime:

pub const Obj = struct {
    objType: Type,
    next: ?*Obj,

    // ...
    pub fn is(self: *Obj, objType: Type) bool {
        return self.objType == objType;
    }

    pub fn as(self: *Obj, comptime T: type) *T {
        return @fieldParentPtr(T, "obj", self);
    }
}

Now we only need to update all of our code from x.isFunction() to x.is(Function) which is a simple enough substitution. Sadly, we can’t do the same for our evergrowing Value struct:

pub const Value = union(enum) {
    Number: f64,
    Bool: bool,
    Nil,
    Obj: *Obj,

    pub inline fn isNumber(self: Value) bool {
        return self == .Number;
    }

    pub inline fn isBool(self: Value) bool {
        return self == .Bool;
    }

    pub inline fn isObj(self: Value) bool {
        return self == .Obj;
    }
    // ...
};

It may seem that we can do the same, except use the enumeration literal as the parameter to is:

    pub inline fn is(self: Value, comptime target: @TypeOf(.enum_literal)) bool {
        return self == target;
    }

However, we also have methods like this:

    pub inline fn isString(self: Value) bool {
        return self == .Obj and self.Obj.objType == .String;
    }

Unfortunately, here we can’t use this kind of signature. We could do:

    pub inline fn is(self: Value, comptime target: @TypeOf(.enum_literal), objType: ObjType) bool {
        if (comptime target == .Obj) {
            return self == target and self.Obj.objType == objType;
        } else {
            return self == target;
        }
    }

But then it would require us to write x.is(.Number, undefined) and y.is(.Obj, .String), which is ugly. So instead, we can do this comptime check:

    pub inline fn is(self: Value, comptime target: @TypeOf(.enum_literal)) bool {
        if (comptime @hasField(ObjType, @tagName(target))) {
            return self == .Obj and self.Obj.objType == target;
        } else {
            return self == target;
        }
    }

This way we can write both x.is(.Number) and y.is(.String) and it will generate an appropriate code at compile time. This is as close as we can get to macros, I guess. It gets harder from there, though.

Now we need to get rid of these methods:

pub const Value = union(enum) {
    Number: f64,
    Bool: bool,
    Nil,
    Obj: *Obj,
    // ...
    pub inline fn asNumber(self: Value) f64 {
        return self.Number;
    }

    pub inline fn asBool(self: Value) bool {
        return self.Bool;
    }

    pub inline fn asObj(self: Value) *Obj {
        return self.Obj;
    }

    pub inline fn asString(self: Value) *String {
        return self.Obj.as(String);
    }

    pub inline fn asFunction(self: Value) *Function {
        return self.Obj.as(Function);
    }

    pub inline fn asNativeFunction(self: Value) *NativeFunction {
        return self.Obj.as(NativeFunction);
    }
};

As you can see, we again have two kinds of actions to perform - one is to return the desired enumeration tag value, and the other is to call a method. The first few are easy to convert by using the same trick with comptime target: @TypeOf(.enum_literal) but we can’t do the same for the last few, because Obj.as expects a type. On the other hand, we know all of the types we’re gonna use with as, so we can do this:

pub const Value = union(enum) {
    Number: f64,
    Bool: bool,
    Nil,
    Obj: *Obj,
    // ...
    fn typeNameUnqualified(comptime T: type) []const u8 {
        const name = @typeName(T);
        return name[if (std.mem.lastIndexOfAny(u8, name, ".")) |i| i + 1 else 0..];
    }

    pub inline fn as(self: Value, comptime T: type) T {
        return if (comptime @hasField(ObjType, typeNameUnqualified(T)))
            self.Obj.as(switch (@typeInfo(T)) {
                .Pointer => |p| p.child,
                else => T,
            })
        else switch (T) {
            f64 => self.Number,
            bool => self.Bool,
            void => self.Nil,
            *Obj => self.Obj,
            else => unreachable,
        };
    }
};

Not pretty, eh? Well, that’s what you get, when you don’t have macros that operate on actual code.

We call this function for basic types like this:

a.as(f64); // Value{.Number=42} => 42

Transforming the Value into an actual number. And for object types, we do it like this:

// s is a `Value{.Obj=&{objType=.String,next=null}}`
s.as(*String); // => &String{.Obj={objType=.String,next=null},.hash=1337,.bytes="ваыв"}

Because we want a pointer to the String type we pass it as *String. Seems logical, but there’s a problem.

First, we need to check if the ObjType enum has the String field, but here we have a type. More than that, it is actually an *src.object.String type, not just String, so we need a way to transform this type into a tag name. The typeNameUnqualified function takes care of it, but then there’s another problem - we need to pass the correct type to the Object.as method, and *String is not a correct type - it needs a String! So we have to do a little dance with @typeInfo and check that we got a Pointer type, that has a child field, containing the type we’re interested in. Then we can pass it to the as method of the object.

Phew! We’re almost done with the refactoring, the only thing left is to remove these methods:

pub const Value = union(enum) {
    Number: f64,
    Bool: bool,
    Nil,
    Obj: *Obj,

    // ..
    pub inline fn fromNumber(x: f64) Value {
        return Value{ .Number = x };
    }

    pub inline fn fromBool(x: bool) Value {
        return Value{ .Bool = x };
    }

    pub inline fn fromObj(x: *Obj) Value {
        return Value{ .Obj = x };
    }
};

Well, that’s easy:

pub inline fn from(x: anytype) Value {
    return switch (@TypeOf(x)) {
        f64, comptime_float => Value{ .Number = x },
        bool => Value{ .Bool = x },
        void => Nil,
        *Obj => Value{ .Obj = x },
        else => unreachable,
    };
}

Now we can do Value.from(12.3) and it will do it’s job smoothly.

Look at the before and after:

|-----------------------------Before------------------------------------+----------------------------------------After----------------------------------------|
|pub const Value = union(enum) {                                        | pub const Value = union(enum) {                                                     |
|    Number: f64,                                                       |     Number: f64,                                                                    |
|    Bool: bool,                                                        |     Bool: bool,                                                                     |
|    Nil,                                                               |     Nil,                                                                            |
|    Obj: *Obj,                                                         |     Obj: *Obj,                                                                      |
|    // ...                                                             |     // ...                                                                          |
|    pub inline fn isNumber(self: Value) bool {                         |     pub inline fn is(self: Value, comptime target: @TypeOf(.enum_literal)) bool {   |
|        return self == .Number;                                        |         if (comptime @hasField(ObjType, @tagName(target))) {                        |
|    }                                                                  |             return self == .Obj and self.Obj.objType == target;                     |
|                                                                       |         } else {                                                                    |
|    pub inline fn fromNumber(x: f64) Value {                           |             return self == target;                                                  |
|        return Value{ .Number = x };                                   |         }                                                                           |
|    }                                                                  |     }                                                                               |
|                                                                       |                                                                                     |
|    pub inline fn asNumber(self: Value) f64 {                          |     fn typeNameUnqualified(comptime T: type) []const u8 {                           |
|        return self.Number;                                            |         const name = @typeName(T);                                                  |
|    }                                                                  |         return name[if (std.mem.lastIndexOfAny(u8, name, ".")) |i| i + 1 else 0..]; |
|                                                                       |     }                                                                               |
|    pub inline fn isBool(self: Value) bool {                           |                                                                                     |
|        return self == .Bool;                                          |     pub inline fn as(self: Value, comptime T: type) T {                             |
|    }                                                                  |         return if (comptime @hasField(ObjType, typeNameUnqualified(T)))             |
|                                                                       |             self.Obj.as(switch (@typeInfo(T)) {                                     |
|    pub inline fn fromBool(x: bool) Value {                            |                 .Pointer => |p| p.child,                                            |
|        return Value{ .Bool = x };                                     |                 else => T,                                                          |
|    }                                                                  |             })                                                                      |
|                                                                       |         else switch (T) {                                                           |
|    pub inline fn asBool(self: Value) bool {                           |             f64 => self.Number,                                                     |
|        return self.Bool;                                              |             bool => self.Bool,                                                      |
|    }                                                                  |             void => self.Nil,                                                       |
|                                                                       |             *Obj => self.Obj,                                                       |
|    pub inline fn isObj(self: Value) bool {                            |             else => unreachable,                                                    |
|        return self == .Obj;                                           |         };                                                                          |
|    }                                                                  |     }                                                                               |
|                                                                       |                                                                                     |
|    pub inline fn fromObj(x: *Obj) Value {                             |     pub inline fn from(x: anytype) Value {                                          |
|        return Value{ .Obj = x };                                      |         return switch (@TypeOf(x)) {                                                |
|    }                                                                  |             f64, comptime_float => Value{ .Number = x },                            |
|                                                                       |             bool => Value{ .Bool = x },                                             |
|    pub inline fn asObj(self: Value) *Obj {                            |             void => Nil,                                                            |
|        return self.Obj;                                               |             *Obj => Value{ .Obj = x },                                              |
|    }                                                                  |             else => unreachable,                                                    |
|                                                                       |         };                                                                          |
|    pub inline fn isString(self: Value) bool {                         |     }                                                                               |
|        return self == .Obj and self.Obj.objType == .String;           | };                                                                                  |
|    }                                                                  |                                                                                     |
|                                                                       |                                                                                     |
|    pub inline fn asString(self: Value) *String {                      |                                                                                     |
|        return self.Obj.asString();                                    |                                                                                     |
|    }                                                                  |                                                                                     |
|                                                                       |                                                                                     |
|    pub inline fn isFunction(self: Value) bool {                       |                                                                                     |
|        return self == .Obj and self.Obj.objType == .Function;         |                                                                                     |
|    }                                                                  |                                                                                     |
|                                                                       |                                                                                     |
|    pub inline fn asFunction(self: Value) *Function {                  |                                                                                     |
|        return self.Obj.asFunction();                                  |                                                                                     |
|    }                                                                  |                                                                                     |
|                                                                       |                                                                                     |
|    pub inline fn isNativeFunction(self: Value) bool {                 |                                                                                     |
|        return self == .Obj and self.Obj.objType == .NativeFunction;   |                                                                                     |
|    }                                                                  |                                                                                     |
|                                                                       |                                                                                     |
|    pub inline fn asNativeFunction(self: Value) *NativeFunction {      |                                                                                     |
|        return self.Obj.asNativeFunction();                            |                                                                                     |
|    }                                                                  |                                                                                     |
|};                                                                     |                                                                                     |
|-----------------------------------------------------------------------+-------------------------------------------------------------------------------------|

I guess, you can do many things that are done with macros in C using the comptime in Zig, although mostly when it is related to types. Can’t say that I like this more than macros, but it gets the job done. Now we can start this chapter proper!

In the last chapter, we added functions and had them working, but you can do with just functions only so much. In this chapter, we’re adding closures, which take our functions to a whole other level.

Honestly, this chapter, while having mostly straightforward stuff, gave me a lot of trouble. You see, there’s a certain problem with how the book gives you the material. It’s subtle, but when the time comes it’s hard to ignore.

The book gives you information bit by bit while trying to have a working interpreter at all times. Yes, each chapter breaks the interpreter in the beginning, but by the end, you have something that works. And often the chapter progressively gives you more examples and explains why they should work and why they don’t right now. On its own, it’s a great way to show you your progress, but on the other hand, if a small mistake sneaks somewhere in the process, you won’t know till much later. That’s exactly what happened to me.

So, during this chapter, I tried most examples the book featured, and all of them seemed to work. Satisfied, I proceeded to the “Challenges” section, and because I despise OOP much more than said “Challenges”, I decided to do this particular one:

  1. A famous koan teaches us that “objects are a poor man’s closure” (and vice versa). Our VM doesn’t support objects yet, but now that we have closures we can approximate them. Using closures, write a Lox program that models two-dimensional vector “objects”. It should:
    • Define a “constructor” function to create a new vector with the given x and y coordinates.
    • Provide “methods” to access the x and y coordinates of values returned from that constructor.
    • Define an addition “method” that adds two vectors and produces a third.

And it threw a weird error - it couldn’t find some closures.

Now, because Lox doesn’t have any data structures except for strings yet, I decided to implement a linked list, as closures are a perfect solution for that. You don’t need any kind of data structures for linked lists as long as your language has closures. Observe:

fun cons(head, tail) {
    fun cell(isHead) {
        if (isHead)
            return head;
        return tail;
    }
    return cell;
}

fun first(cell) {
    return cell(true);
}

fun rest(cell) {
    return cell(false);
}

With these three functions, we should be able to create a list like this:

var list = cons(1, cons(2, cons(3, nil)));

And we should be able to traverse it with for loop, albeit in a bit ugly way:

for (var head = list; head != nil; head = rest(head)) {
	print first(head);
}

Now, if we try to execute this script, we get

$ ziglox list.lox
Undefined variable 'head'.
[line 1] in cell
[line 1] in first
[line 1] in script
Interpret Runtime Error

An error. I then made sure that the head variable is not related to the loop, by trying to get the head of the list with just first(list), but still got the same error. Then I spent a day looking for what I did wrong.

During this, I decided to try out all of the examples in the chapter. I remember that all of them have worked before, and if they are then the problem is much deeper than the book accounts for. So I started, and everything worked until I hit this one:

fun outer() {
  var x = "value";
  fun middle() {
    fun inner() {
      print x;
    }

    print "create inner closure";
    return inner;
  }

  print "return from outer";
  return middle;
}

var mid = outer();
var in = mid();
in();

Strangely enough, all later examples worked fine. I’ll explain why in a moment. For now, let it sink in - I had a working interpreter the whole time, and it could do everything the book suggested I try, yet there was a bug somewhere.

So, here’s the thing. For some reason, I decided to use a statically allocated stack for locals, back in the chapter where we added support for locals. It worked well until we were introduced to the concept of compiler stacks. However, Since I created the stack way back, I kept using it, and it basically meant that every compiler shared the same stack of locals. So when we were searching for upvalues we saw the same upvalues on all levels.

So you should now understand why it worked for all of the examples, except for the one above - it has one more compiler level. After fixing this by using a proper dynamic array, all of the examples in this chapter work as expected.

The linked list, however, still doesn’t work. This time around, I decided that I would automate this process a bit, and put all of the examples as test cases. I had to adjust them to actually return values instead of printing them, and what do you know - I found another broken case:

fun outer() {
  var a = 1;
  var b = 2;
  fun middle() {
    var c = 3;
    var d = 4;
    fun inner() {
      return a + c + b + d;
    }
	return inner;
  }
  return middle;
}

return outer()()();

It should return 10, but instead it returns 12. How? Well, if we print all of the values that went into the inner function we’ll see that a is 3, b is 2, c is 2, and d is 4 and the answer changes to 14. So while only the d upvalue was correct, the introduction of print still changed something.

There’s another bug. And the same happens in my linked list implementation - first returns itself. Perhaps something to do with how we capture upvalues then?

During these days of debugging, I tried everything I could - I moved readByte to CallFrame struct, I changed the instruction counter to the instruction pointer, as the book does, I also tried allocating stacks on the heap instead of the static memory. I poked the reference implementation by Robert - the bytecode is exactly the same as in my implementation, so it has to be something in the VM. I printed pointer addresses, analyzing if there is a different between my implementation and C, and comparing relative offsets - again, they’re the same. Nothing helped. Reverting to my previous implementation revealed that this case wasn’t handled properly as well - I somehow missed it back then too.

And then I found it. After 4 days of debugging the reason was that I SOMEHOW missed this line in the captureUpvalue function:

createdUpvalue.next = upvalue;

Finally, everything works:

// cons, first, rest...

var list = cons(1, cons(2, cons(3, nil)));

for (var head = list; head != nil; head = rest(head)) {
    print first(head);
}
1
2
3

And, guess what? During all this debugging, and trying different things I managed to improve the speed of the interpreter a bit:

ziglox> fun fib(n) { if (n < 2) return n; return fib(n - 2) + fib(n - 1); }
nil
ziglox> var s = clock(); print fib(35); print clock() - s;
9227465
1.4010000228881836

Here’s the same code when checked out to the previous commit at chapter 24:

ziglox> fun fib(n) { if (n < 2) return n; return fib(n - 2) + fib(n - 1); }
nil
ziglox> var s = clock(); print fib(35); print clock() - s;
9227465
1.815000057220459

Nice! Going back to the third “Challenge”, here’s my attempt at an object via closures:

fun vec2(x, y) {
  var _x = x; // storage
  var _y = y;
  fun obtainer(what) {
    if (what == "x") return _x;
    else if (what == "y") return _y;
  }
  fun establisher(what, val) {
    if (what == "x") _x = val;
    else if (what == "y") _y = val;
  }
  fun add(other) {
    var ox = other("obtain")("x");
    var oy = other("obtain")("y");
    _x = _x + ox;
    _y = _y + oy;
  }
  fun clss (action) {
    if (action == "obtain") return obtainer;
    else if (action == "establish") return establisher;
    else if (action == "add") return add;
  }
  return clss;
}

var a = vec2(3, nil);
var b = vec2(22, 10);

a("establish")("y", 7);
a("add")(b);
print a("obtain")("x") + a("obtain")("y"); // 42

It’s a vec2 structure representing this kind of data {.x = 1, .y = 2}. Although it’s a bit rough to use, it is a proper data structure. And yes - I like obtain and establish better than get and set. What?

Garbage Collection

Hooo…

Now we get to the interesting stuff. The book implements this language in C, and C is pretty straightforward when it comes to memory management. Well, straightforward might not be a proper word, but by default it’s quite basic - you have malloc, free, and realloc as a base.

Zig, on the other hand, doesn’t have a default allocator, which, in theory, should make it more appropriate for embedded systems, where there may not be any way to actually allocate memory. This, in turn, affects the language design - every Zig function that wants to allocate some memory, accepts an allocator (or accesses it from an enclosing struct if that’s a method). Now you can understand why all of the code in this post had something like Allocator in function arguments or structures.

This meant that we didn’t have to implement most of the book’s memory.c and memory.h because Zig’s allocators already can do all of that. However, now we can’t just use an existing allocator, as we need to implement garbage collection mid-allocation. That’s why we need a custom allocator.

Thankfully, we don’t really have to do the allocation ourselves. Instead, we can create an allocator that just wraps another allocator, like a decorator. All we need to do is to implement these three functions:

pub const GCAllocator = struct {
    parentAllocator: Allocator,
    bytesAllocated: usize,

    pub fn init(parentAllocator: Allocator) GCAllocator {
        return .{
            .parentAllocator = parentAllocator,
            .bytesAllocated = 0,
        };
    }

    pub fn allocator(self: *GCAllocator) Allocator {
        return .{
            .ptr = self,
            .vtable = &.{ .alloc = alloc, .resize = resize, .free = free },
        };
    }

    fn alloc(ctx: *anyopaque, len: usize, log2_ptr_align: u8, ra: usize) ?[*]u8 {
        const self: *GCAllocator = @ptrCast(@alignCast(ctx));
        const result = self.parentAllocator.rawAlloc(len, log2_ptr_align, ra);
        if (result != null) self.bytesAllocated += len;
        return result;
    }

    fn resize(ctx: *anyopaque, buf: []u8, log2_buf_align: u8, new_len: usize, ra: usize) bool {
        const self: *GCAllocator = @ptrCast(@alignCast(ctx));
        if (self.parentAllocator.rawResize(buf, log2_buf_align, new_len, ra)) {
            if (new_len <= buf.len) {
                self.bytesAllocated -= buf.len - new_len;
            } else {
                self.bytesAllocated += new_len - buf.len;
            }

            return true;
        }
        std.debug.assert(new_len > buf.len);
        return false;
    }

    fn free(ctx: *anyopaque, buf: []u8, log2_buf_align: u8, ra: usize) void {
        const self: *GCAllocator = @ptrCast(@alignCast(ctx));
        self.parentAllocator.rawFree(buf, log2_buf_align, ra);
        self.bytesAllocated -= buf.len;
    }
};

Now we should be able to implement the rest of this chapter in terms of this allocator. And, thanks to the fact that all our functions already accept an allocator, we can just plug this one in, and it should work!

I called it GCAllocator, although right now it is more like a tracking allocator. It tracks how many bytes were allocated and freed, and that’s basically it. With that, we can implement the rest of the interface.

The book suggests to implement the “Mark and Sweep” algorithm. While not great, it is perfect for teaching. Moreover, there’s still software that uses this technique, for example, one that I’m writing this blog in - Emacs! It was always an active debate whether the GC in Emacs is outdated or not, but I can say that Emacs performs fine for my daily tasks, so it’s probably OK. Something like JVM would never settle on something simple like this though.

I won’t go into the source code of the collector itself, as the translation from C is pretty much 1:1 here. One thing to point out though is the use of grayStack in the VM. The book uses an array of pointers to the objects, and we do the same by using ArrayList(*Obj). However, using append to add things to this array can throw an out-of-memory error. So when we do the marking of objects we have to handle that.

Between these two chapters, I’ve refactored the code to use Zig’s errors. So handling allocation failures is simple enough now, but I don’t really know if we should handle out-of-memory errors. On one hand, JVM doesn’t recommend handling the OOME, and the most common practice I know of is to do a heap dump and die. On the other hand, we’re dealing with OOME inside the virtual machine, not inside the language itself, as Lox doesn’t really have exceptions or other kinds of error handling. I chose to kill the VM if the OOME ever occurs, but in a real language, I would probably not do it this way. And I think in the language that runs on the VM it should be possible to handle OOM too.

Classes and Instances

Now, I’m a bit torn apart here. I don’t want to implement classes.

I mean, seriously, this language doesn’t have any data structures yet, and we’re already in classes? Where are arrays, at the very least? Plus, we already did classes in cljlox, so it’s boring to do it again.

Instead, I think it would be better to go the Lua route and give users only a basic set of data structures, like hash tables. Except, actually give users data structures, and not a mixed table/array thingy. I’m not sure, however, if it will affect the last chapter that talks about optimizations.

Yet, there are benefits in the following chapters. For example, if I were to introduce a hash table, accessing keys from it can be made with a special syntax like a.b. Assigning a value to the key can also use this syntax, and we’re going to handle it in the next chapter.

I guess, classes are more involved, and thus a bit better fit for teaching how to make a language, but on the other hand, class systems can be a fully runtime thing. Sparkle some macros on top of it, and you can hack your own object system in a weekend. That’s what a lot of Lisp/Scheme projects did back in the day.

Anyway, after adding the Class opcode and some other stuff around it the book says:

There you have it, our VM supports classes now. You can run this:

class Brioche {}
print Brioche;

Unfortunately, printing is about all you can do with classes, so next is making them more useful.

Unfortunately, we have classes before arrays.

Yes, I’m negative, because come on, arrays are such a basic data structure that not having them is a bummer. How would you implement anything without an array? Well, technically, we can use our closure-based linked lists as a substitute, but it has horrible performance in many algorithms where constant indexing is required.

Anyway, after implementing field access the chapter is suddenly over. That’s it? Well, right now it’s just a hash table with inconvenient syntax. I mean, yes, classes are just a special case of a hash table in some languages, and hash tables are a special case of classes in some other broken languages. But here, why didn’t we introduce a basic hash table with an open set of keys, even though our runtime already supports one?

Onto “Challenges”.

  1. Trying to access a non-existent field on an object immediately aborts the entire VM. The user has no way to recover from this runtime error, nor is there any way to see if a field exists before trying to access it. It’s up to the user to ensure on their own that only valid fields are read.

    How do other dynamically typed languages handle missing fields? What do you think Lox should do? Implement your solution.

Yeah, that was a strange decision. I decided to just return nil.

  1. Fields are accessed at runtime by their string name. But that name must always appear directly in the source code as an identifier token. A user program cannot imperatively build a string value and then use that as the name of a field. Do you think they should be able to? Devise a language feature that enables that and implement it.

Well, first of all, a user program cannot imperatively build a string value at all, because Lox has no string library. A few chapters ago I added tostring and with that, it is possible to build strings using +, but not in vanilla Lox.

A user should be able to do that, but most languages that allow that it does require a special syntax, and I don’t want to deal with that now. If I did, though, I would probably introduce something like foo[bar], but instead we can always use a native function:

    fn getNative(nargs: u8, args: [*]Value, vm: *VM) !Value {
        if (nargs == 0)
            return vm.runtimeError("Expected an argument", .{});
        const instance = args[0];
        if (!instance.is(.Instance))
            return vm.runtimeError("Expected a class instance as the first argument", .{});
        const name = args[1];
        if (!name.is(.String))
            return vm.runtimeError("Expected a string as the second argument", .{});

        return instance.as(*Instance).fields.get(name.as(*String)) orelse Nil;
    }

    fn setNative(nargs: u8, args: [*]Value, vm: *VM) !Value {
        if (nargs == 0)
            return vm.runtimeError("Expected an argument", .{});

        const instance = args[0];
        if (!instance.is(.Instance))
            return vm.runtimeError("Expected a class instance as the first argument", .{});

        const name = args[1];
        if (!name.is(.String))
            return vm.runtimeError("Expected a string as the second argument", .{});

        if (nargs >= 3 and !args[2].is(.Nil)) {
            const value = args[2];
            _ = try instance.as(*Instance).fields.set(name.as(*String), value);
            return value;
        } else {
            _ = instance.as(*Instance).fields.delete(name.as(*String));
            return Nil;
        }
    }

With these we can do set(foo, "vaiv", 42) and get(foo, "vaiv"). These native functions are bloating the VM struct, so I should probably move them out.

As you can see, I don’t care if the native function has more arguments than it really needs. The only thing I care about is the required arguments. I like this design more, and in fact, doing things this way should allow variadic functions naturally. Maybe I should implement them in the future, alongside arrays.

Lastly:

  1. Conversely, Lox offers no way to remove a field from an instance. You can set a field’s value to nil, but the entry in the hash table is still there. How do other languages handle this? Choose and implement a strategy for Lox.

But as you may have noticed, I already implemented this. Setting a field to nil removes it from the hash table. This is how it works in Lua, and while I strongly dislike this particular part of Lua, I don’t have any problems with this in Lox. Why? Because Lua has a weird mix of tables and arrays, setting a value in the array to nil also removes it from the array, as if the array were a hash table with integer keys.

So by doing arr[3] = nil you effectively transform {"a", "b", "c", "d"} into {1 = "a", 2 = "b", 4 = "d"}, or rather {"a", "b", 4 = "d"}, as Lua views it. This breaks iteration on arrays, as no the iteration will only see the first two fields (and yes, arrays in Lua start from 1, which is great):

arr = {"a", "b", "c", "d"}
print "-- before set --"
for i,v in ipairs(arr) do
   print(i, v)
end
arr[3] = nil;
print "-- after set --"
for i,v in ipairs(arr) do
   print(i, v)
end
-- before set --
1	a
2	b
3	c
4	d
-- after set --
1	a
2	b

If Lua did not remove things when you set them to nil and instead provided a contains and remove functions to check for key presence, and removal of combined array/table would be fine, but in its current state, it’s too easy to trip on this. I know that because I teach people how to write Lua, and this always makes people angry.

If the language provides separate array and table objects, setting table keys to nil for removal is fine - it should not matter if the key is present and nil or just absent. Arrays, however, should be able to have nil in them, and removal is a whole different operation, that shifts all other elements, so setting array elements to nil for removal doesn’t really make sense. Anyway, that’s my reasoning behind this, I’m sure you can disagree, but it’s OK.

Finally:

  1. Because fields are accessed by name at runtime, working with instance state is slow. It’s technically a constant-time operation—thanks, hash tables—but the constant factors are relatively large. This is a major component of why dynamic languages are slower than statically typed ones.

    How do sophisticated implementations of dynamically typed languages cope with and optimize this?

Ah, if only we had arrays - we could easily implement records that have constant time access to fields by field index determined at compile time. Alas.

Methods and Initializers

I don’t have anything to say about this chapter. Mostly, because it is straightforward and concise.

Still, I would prefer not having methods at all, but there were interesting bits to tackle with this, so I guess it was worth it.

Superclasses

// ...return constantInstruction("OP_CLASS", chunk, offset);
    case OP_INHERIT:
      return simpleInstruction("OP_INHERIT", offset);
//  case OP_METHOD: ...

Simple instruction, how ironic.

The chapter proceeds adding tons of new stuff in the compiler, and then in the VM. While straightforward, I can’t help but notice how things got more complicated since we added classes. Honestly, I have never seen a program where inheritance made things easier.

  1. A tenet of object-oriented programming is that a class should ensure new objects are in a valid state. In Lox, that means defining an initializer that populates the instance’s fields. Inheritance complicates invariants because …

    If Lox was your language, how would you address this, if at all? If you would change the language, implement your change.

If Lox was my language it wouldn’t have classes at all. And it would have arrays.

Optimization

The final chapter. Honestly, didn’t think that I’d make it through. It’s 4 AM right now, as I’m writing this, and perhaps I should go to sleep, but I want to finish this stuff quicker, so…

The chapter starts with a benchmark:

class Zoo {
  init() {
    this.aardvark = 1;
    this.baboon   = 1;
    this.cat      = 1;
    this.donkey   = 1;
    this.elephant = 1;
    this.fox      = 1;
  }
  ant()    { return this.aardvark; }
  banana() { return this.baboon; }
  tuna()   { return this.cat; }
  hay()    { return this.donkey; }
  grass()  { return this.elephant; }
  mouse()  { return this.fox; }
}

var zoo = Zoo();
var sum = 0;
var start = clock();
while (sum < 100000000) {
  sum = sum + zoo.ant()
            + zoo.banana()
            + zoo.tuna()
            + zoo.hay()
            + zoo.grass()
            + zoo.mouse();
}

print clock() - start;
print sum;

Running this on the reference implementation takes 3.87092 seconds and the result is 1e+08 or 100000000. Running on the release build of my language takes 8.065999984741211, and the result is 100000002. Don’t mind the difference, it’s an IEEE 754 quirk - if we change how clox prints it’s value from scientific notation to plain float, we’ll get the same thing.

Now, my implementation is twice as slow. Thankfully, the loop itself is fast - if we just do this:

ziglox> var sum = 0;
ziglox> var start = clock(); while (sum < 100000000) { sum = sum + 1 + 1 + 1 + 1 + 1 + 1; } print clock() - start;
2.2915e+01
nil
ziglox>  print sum;
1.00000002e+08
nil

The speed is comparable.

Unfortunately, I am lazy. The book proceeds at optimizing table lookups, by using bit-masking instead of the modulo operator. I’m not using the table implementation from the book itself - instead, I’m wrapping Zig’s std.hash_map.HashMap. Fortunately, Zig’s HashMap already uses bit-masking:

/// Find the index containing the data for the given key.
/// Whether this function returns null is almost always
/// branched on after this function returns, and this function
/// returns null/not null from separate code paths.  We
/// want the optimizer to remove that branch and instead directly
/// fuse the basic blocks after the branch to the basic blocks
/// from this function.  To encourage that, this function is
/// marked as inline.
inline fn getIndex(self: Self, key: anytype, ctx: anytype) ?usize {
    comptime verifyContext(@TypeOf(ctx), @TypeOf(key), K, Hash, false);

    if (self.size == 0) {
        return null;
    }

    // If you get a compile error on this line, it means that your generic hash
    // function is invalid for these parameters.
    const hash = ctx.hash(key);
    // verifyContext can't verify the return type of generic hash functions,
    // so we need to double-check it here.
    if (@TypeOf(hash) != Hash) {
        @compileError("Context " ++ @typeName(@TypeOf(ctx)) ++ " has a generic hash function that returns the wrong type! " ++ @typeName(Hash) ++ " was expected, but found " ++ @typeName(@TypeOf(hash)));
    }
    const mask = self.capacity() - 1;
    const fingerprint = Metadata.takeFingerprint(hash);
    // Don't loop indefinitely when there are no empty slots.
    var limit = self.capacity();
    var idx = @as(usize, @truncate(hash & mask));

    var metadata = self.metadata.? + idx;
    while (!metadata[0].isFree() and limit != 0) {
        if (metadata[0].isUsed() and metadata[0].fingerprint == fingerprint) {
            const test_key = &self.keys()[idx];
            // If you get a compile error on this line, it means that your generic eql
            // function is invalid for these parameters.
            const eql = ctx.eql(key, test_key.*);
            // verifyContext can't verify the return type of generic eql functions,
            // so we need to double-check it here.
            if (@TypeOf(eql) != bool) {
                @compileError("Context " ++ @typeName(@TypeOf(ctx)) ++ " has a generic eql function that returns the wrong type! bool was expected, but found " ++ @typeName(@TypeOf(eql)));
            }
            if (eql) {
                return idx;
            }
        }

        limit -= 1;
        idx = (idx + 1) & mask;
        metadata = self.metadata.? + idx;
    }

    return null;
}

Assuming that it is as fast, let’s look at another optimization: NaN Boxing.

Now, this is something completely new to me - I’m not sure if I ever heard of it. Anyway, now we need to re-implement the Value struct:

pub const NanBoxedValue = packed struct {
    data: u64,

    const SIGN_BIT: u64 = 0x8000000000000000;
    const QNAN: u64 = 0x7ffc000000000000;

    pub const Nil = NanBoxedValue{ .data = QNAN | 1 };
    pub const True = NanBoxedValue{ .data = QNAN | 3 };
    pub const False = NanBoxedValue{ .data = QNAN | 2 };
};

We start with defining some constants. Previously, Nil, True, and False were top-level constants, but I’ve moved them into the struct, so then we could get the correct ones via a compile-time alias. I did the same with the UnionValue struct, previously named Value.

Next, we need to reimplement the is, as, and from methods. I’m glad that I made them generic:

pub const NanBoxedValue = packed struct {
    // ...

    pub inline fn is(self: NanBoxedValue, comptime target: @TypeOf(.enum_literal)) bool {
        if (comptime @hasField(ObjType, @tagName(target))) {
            if (self.is(.Obj)) {
                const o = @as(*Obj, @ptrFromInt(@as(usize, @intCast(self.data & ~(SIGN_BIT | QNAN)))));
                return o.objType == target;
            } else return false;
        } else {
            return switch (target) {
                .Bool => (self.data & False.data) == False.data,
                .Nil => self.data == Nil.data,
                .Number => (self.data & QNAN) != QNAN,
                .Obj => (self.data & (QNAN | SIGN_BIT)) == (QNAN | SIGN_BIT),
                else => unreachable,
            };
        }
    }

    pub inline fn as(self: NanBoxedValue, comptime T: type) T {
        return if (comptime @hasField(ObjType, typeNameUnqualified(T)))
            @as(*Obj, @ptrFromInt(@as(usize, @intCast(self.data & ~(SIGN_BIT | QNAN))))).as(switch (@typeInfo(T)) {
                .Pointer => |p| p.child,
                else => T,
            })
        else switch (T) {
            f64 => @as(f64, @bitCast(self.data)),
            bool => self.data == True.data,
            void => Nil.data,
            *Obj => @as(*Obj, @ptrFromInt(@as(usize, @intCast(self.data & ~(SIGN_BIT | QNAN))))),
            else => unreachable,
        };
    }

    pub inline fn from(x: anytype) NanBoxedValue {
        return switch (@TypeOf(x)) {
            i32, comptime_int => NanBoxedValue{ .data = @as(u64, @bitCast(@as(f64, @floatFromInt(x)))) },
            f64, comptime_float => NanBoxedValue{ .data = @as(u64, @bitCast(@as(f64, x))) },
            bool => if (x) True else False,
            void => Nil,
            *Obj => NanBoxedValue{ .data = SIGN_BIT | QNAN | @intFromPtr(x) },
            else => unreachable,
        };
    }
};

Lastly, we need the remaining isFalse, format, equal, and mark:

pub const NanBoxedValue = packed struct {
    // ...

    pub fn isFalsey(self: NanBoxedValue) bool {
        if (self.is(.Bool)) return !self.as(bool);
        if (self.is(.Nil)) return true;
        return false;
    }

    pub fn equal(self: NanBoxedValue, other: NanBoxedValue) bool {
        if (self.is(.Number) and other.is(.Number)) return self.as(f64) == other.as(f64);
        return self.data == other.data;
    }

    pub fn mark(self: NanBoxedValue, vm: *VM) !void {
        if (self.is(.Obj))
            try self.as(*Obj).mark(vm);
    }

    pub fn format(
        self: NanBoxedValue,
        comptime fmt: []const u8,
        options: std.fmt.FormatOptions,
        out_stream: anytype,
    ) !void {
        _ = fmt;
        _ = options;
        if (self.is(.Number)) {
            try out_stream.print("{d}", .{self.as(f64)});
        } else if (self.is(.Bool)) {
            try out_stream.print("{}", .{self.as(bool)});
        } else if (self.is(.Nil)) {
            try out_stream.print("nil", .{});
        } else {
            try self.as(*Obj).print(out_stream);
        }
    }
};

Notice that we’re no longer using a constant dispatch on the enum tag - instead, it’s a chain of if else. Well, that is a small price to pay, as we rarely print things, compared to overall execution.

After implementing this type of Value struct, we can tell the rest of the implementation to use it like this:

pub const Value = if (NAN_BOXING) NanBoxedValue else UnionValue;

And the rest of the code, hopefully, works. I had to make some small adjustments though.

Now, running the benchmark shows takes 5.4649999141693115 seconds. A great improvement!

Thoughts on the book

The book is great. The writing style is engaging, I really liked the diagrams, and the material is properly structured. Would recommend. This is my second read, mind you.

Perhaps, I was too harsh at times, but I just was frustrated with some things and let it out.

Now, I do think that the lack of arrays, and plain hash tables is a bit of a bummer. I get it - it is a learning exercise, but, it would be great to have just a bit more from actual languages. Also, it would be really interesting to look at JIT and FFI, but I guess it deserves its own book already.

All in all, I’m glad that I read this book. It allowed me to tighten my Clojure skills in the past, and learn an awesome language today - Zig.

Thoughts on Zig

I really like Zig.

At times I thought to myself - I am enjoying this way too much. As you may know, I was a C programmer before, and I did some pretty low-level stuff in it. I got tired of C and decided that I didn’t want to write in these kinds of languages. Rust gave me hope, but it fell apart before my eyes.

Zig, however, is a C I always wanted. With a type system that actually helps reading and writing code, and not intended to do some convoluted nonsense. I’ve written all of the code without an LSP, and it was a breeze. The only tooling I used, besides the compiler, was Emacs tags. Generating tags isn’t supported, and their precision is limited, but it was enough for this project. If you’re interested, here’s how I generated the TAGS file.

First, I made these regex rules for etags:

/.*\<fn +\([a-zA-Z0-9_]+\) *(/\1/
/.*\<\(var\|const\) *\([a-zA-Z0-9_]+\) *= *\(extern\|packed\)? *\(struct\|enum\|union\)/\2/
/.*\<error +\([a-zA-Z0-9_]+\)/\1/
Code Snippet 1: zig.etags

And added this to Makefile:

ZIG_STD ?= ~/.zig/lib/std
ZIG_STD_SOURCES := $(shell find $(ZIG_STD) -type f -name '*.zig')
ZIG_SOURCES := $(shell find ./src/zig -type f -name '*.zig')

tags:
	etags --language=none --regex=@zig.etags $(ZIG_SOURCES) $(ZIG_STD_SOURCES)

It works well enough and allows jumping to the standard library. I don’t use automatic completion that much even in other languages, and error checking is fine when I compile code. Like old times.

Sometimes the errors are weird:

zig build -Doptimize=Debug -p ./target/zig --cache-dir ./target/zig/.cache
zig build-exe lox-0.1.77 Debug native: error: the following command failed with 1 compilation errors:
~/Downloads/zig-linux-x86_64-0.11.0/zig build-exe ~/Git/lox/src/zig/lox/main.zig --cache-dir ./target/zig/.cache --global-cache-dir ~/.cache/zig --name lox-0.1.77 --listen=-
Build Summary: 0/3 steps succeeded; 1 failed (disable with --summary none)
install transitive failure
+- install lox-0.1.77 transitive failure
   +- zig build-exe lox-0.1.77 Debug native 1 errors
~/Downloads/zig-linux-x86_64-0.11.0/lib/std/fmt.zig:87:9: error: expected tuple or struct argument, found void

Like, OK, where exactly is the error though?

Here’s the source code that generates the message above. Can you spot the problem?

const std = @import("std");

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();
    try stdout.print("Hello, world!\n", {});
}

Yup, I forgot the ..

There are other such errors, I just didn’t bother to save more.

Zig’s build system is something else. Can’t say I like it, but I don’t dislike it either. I guess I need to spend a bit more time with it to say for sure. The inability to set the cache and target directories from build.zig is a bummer though.

Also, the language name is, well, weird. The Crafting Interpreters book starts by explaining why the language is called Lox. In the design note they mention Nim:

  1. It doesn’t have negative connotations across a number of cultures. This is hard to be on guard for, but it’s worth considering. The designer of Nimrod ended up renaming his language to “Nim” because too many people remember that Bugs Bunny used “Nimrod” as an insult. (Bugs was using it ironically.)

Well, the name “Zig”, in my country at least, sounds a lot like the beginning of a Nazi salute. And when translated and spelled it has literally the same letters. I’m sure it is not intentional, but I had a hard time talking about this language to basically anyone. Literally, everyone was like: “name’s weird”. So I started calling it ZigLang instead.

Apart from that, I really enjoy this language, and maybe will use it for other projects in the future.

Closing thought

Thank you for reading through this insanely enormous flow of consciousness. And if you did not read it fully, I appreciate that you still got to the end. Even if you just scrolled past everything.

This post was about two years in the making. I’m really glad that I’ve finally finished it, and I hope it is interesting and informative.

If you’re interested in the project’s source code, you can find it here: lox. And if you’re a seasoned Zig developer, I would appreciate it if you reviewed the code. I am by no means an expert in Zig, so I probably made a lot of stupid decisions.

Let me know what you think!


  1. because even this didn’t work later and I just symlinked the source directory into tests, because of course ↩︎

  2. Common Lisp is compiled to native code and loaded without requiring application restart. ↩︎

  3. Just call them the homework. ↩︎

  4. Technically, there’s more, but there are no sub-scopes like in other languages. ↩︎