Implementing a protocol-based Fennel REPL and Emacs client

Recently I read a post by @nikitonsky about writing a custom REPL for Clojure and Sublime Text. What got my attention was a way of implementing a protocol over a plain Clojure REPL. In the post, Nikita describes a way to “upgrade” the connection by sending code that basically implements a new REPL to the main REPL process. I liked the idea and decided that two can play this game, and started working on a similar thing, but for Fennel.

A few months ago I proposed the idea of a simple protocol for the Fennel REPL. The idea was met with a certain amount of skepticism because there’s already an implementation of such a protocol that is language-agnostic (in theory) and battle-tested. The protocol in question is called nREPL and it is widely used in the Clojure ecosystem. There are several client implementations for various editors, and the protocol is extensible enough to support other languages and various clients.

In fact, Fennel already has an implementation of the nREPL protocol, called jeejah. However, it has problems.

First of all, nREPL is mainly designed for Clojure, tested against Clojure, and implements features that Clojure needs in a pretty specific Clojure way. It’s not impossible to implement support for other languages, and there have been attempts of varying completeness.

Another problem is that nREPL has “n” in its name for a reason. It’s a Network REPL, which means that the protocol is designed around network communication pipelines. In principle, it is possible to implement a standard-input/-output or a pipe-based version, but probably no client would support it. And Fennel, running on Lua, has some problems with networking. The luasocket library becomes a dependency and there’s no clear way of implementing asynchronous communication. Again, it’s not impossible, JeeJah does it, but it’s hard to do properly.

There’s also bencode, the encoding used by torrents. Encoding and decoding messages may be slow, especially in a single-threaded REPL. And we’d need to re-implement half of fennel.view, the serialization function, to support all kinds of Lua quirks.

Nikita also has some thoughts on nREPL:

You need to add nREPL server dependency to your app. It also has a noticeable startup cost (~500ms on my machine).

Which is a good point, although it’s not completely true. Some existing nREPL clients, such as CIDER, can inject the nREPL dependency into your project, so you don’t need to actually ship it with your application. You develop with nREPL, and use all of its goodies, yet your app has no nREPL once it’s shipped. So you only need to include the nREPL dependency if you want to ship your application with capabilities for remote code execution.

This is not really possible with Fennel, however. There’s no package manager for Fennel, nor is there a way to dynamically inject an entire nREPL library from the client. But what about injecting a smaller and simpler protocol?

Why, yes! That’s exactly what Nikita is doing in their Clojure-Sublimed plugin, and I’ve decided to do the same in Emacs.

The protocol

If you’ve read the mailing list discussion I linked earlier, you’ve seen that I envisioned line-based, tab-separated messages. I’ve scrapped that idea and went for data-based communication: table in, plist out. Tables can be interpreted by Fennel, we can pattern-match on them, and they can store arbitrary amounts of data. Plists are natively understood by Emacs Lisp, which is the implementation language of our client.

So my idea was to send something like this to the Fennel REPL:

{:id 1 :eval "(+ 1 2 3)"}

And get back something like:

(:id 1 :op "accept")
(:id 1 :op "eval" :values ("6"))
(:id 1 :op "done")

I’ll explain why there are three outgoing messages for one incoming, but a bit later. Right now we need to address the main problem: how do we teach Fennel to work in terms of our protocol?

When I first thought about the protocol, I thought that it would be part of the Fennel REPL by default. However, Fennel is quite minimalist, and the choice was not to include anything like that. The REPL is already better than the Lua one. This is true when we’re talking about interactive usage, but not when we’re talking about machine interaction.

So the answer is - REPL upgrade.

Upgrading the REPL

This technique uses the ability to evaluate code in the existing REPL, basically implementing a different REPL at runtime.

For example, here’s the simplest REPL:

(let [fennel (require :fennel)]
  (while true
    (io.write "repl>> ")
    (io.flush)
    (print (fennel.eval (io.read)))))

Sending it to Fennel REPL starts our own REPL inside of it:

Welcome to Fennel 1.3.0 on PUC Lua 5.4!
Use ,help to see available commands.
Try installing readline via luarocks for a better repl experience.
>> (let [fennel (require :fennel)] (while true (io.write "repl>> ") (io.flush) (print (fennel.eval (io.read)))))
repl>> (+ 1 2 3)
6

However, it’s more of a downgrade, rather than an upgrade. Our prompt doesn’t support unfinished expressions, comma-commands no longer work, and so on. The purpose of this example is to give you a general idea of how the upgrade process works.

Reimplementing the REPL from scratch is a tough task, but we don’t even have to do it! Fennel comes with an extensible REPL already: the fennel.repl function accepts a table with the following callbacks:

readChunk - polls for user input,
onValues - called when the REPL returns the result of the evaluation,
onError - called when an error occurs,
pp - the pretty-printer function.

If we implement these functions we’ll get a REPL that works the way we need it to.
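To get a feel for these callbacks before wiring in the protocol, here is a minimal sketch that drives fennel.repl from a list of strings instead of standard input. The inputs and results tables and the position counter are made up for this illustration. The sketch assumes that a readChunk returning nil is treated as end of input, and that onValues receives a sequential table of already pretty-printed strings, which matches how the callbacks are used later in this post.

```fennel
(local fennel (require :fennel))

;; Hypothetical scripted input: each entry plays the role of one line typed by a user.
(local inputs ["(+ 1 2 3)\n" "(.. :foo :bar)\n"])
(local results [])
(var position 0)

(fennel.repl
 {;; Hand out the next scripted line; nil after the last one ends the REPL.
  :readChunk (fn []
               (set position (+ position 1))
               (. inputs position))
  ;; Collect results instead of printing them.
  :onValues (fn [vals]
              (table.insert results (table.concat vals "\t")))
  :onError (fn [errtype message]
             (table.insert results (.. errtype ": " (tostring message))))
  :pp fennel.view})

(each [_ result (ipairs results)]
  (print result))
```

If those assumptions hold, results ends up with one entry per evaluated expression, and nothing is read from or written to the terminal by the REPL itself.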

Implementing the Proto(col|type) REPL

First things first, we need fennel, the library. We’ll use some of its public API later on, but the feature we’re most interested in right now is fennel.repl:

(local {: repl : view : eval : parser &as fennel} (require :fennel))
(local {:read io/read :write io/write} io)
(local protocol {})

I’m caching fennel and io functions to avoid some repeated table lookups. That’s only part of the reason, though; more on that later.

(fn proto-repl []
  (protocol.message [[:id 0] [:op "init"] [:data "done"]])
  (repl {:readChunk protocol.read-chunk
         :onValues protocol.on-values
         :onError protocol.on-error
         :pp (fn [data] (view (view data)))}))

I’ll go through the implementation of each callback, but I will not implement the whole protocol as part of this post. The protocol, while small, still has a lot of code, so I’ll only do the parts that pose interesting challenges. It will still be a functional protocol, just not as feature-full as the one I’ve actually implemented for Emacs integration.

First things first, let’s make the read-chunk callback. In the built-in REPL, it is used to print the prompt and handle user input. We don’t need the prompt, as this will not be an interactive REPL, but we do need to poll for user input. Not just that, our input comes in the form of a message, so we also need to parse it.

(fn protocol.read-chunk [parser-state]
  (io/write ">> ") (io.flush)
  (let [message (io/read :l)
        (ok? message) (pcall eval message)]
    (if ok?
        (case message
          {: id :eval data} (protocol.accept id :eval data)
          _ (error "message did not conform to protocol"))
        (error (.. "malformed input: " (tostring message))))))

In the read-chunk function we read a line from the client and eval it. I could have used just the parser here, but you’ll see later why having eval here is useful. If the evaluation was successful, we call case and do pattern matching on the message. If the message matches any pattern (right now there is only one), we know how to process it. The processing is handled by the accept function, so let’s implement it:

(set protocol.current-id -1)

(fn protocol.accept [id op data]
  (protocol.message [[:id id] [:op "accept"]])
  (set protocol.current-id id)
  (case op
    :eval (.. data "\n")
    _ (error (.. "unsupported op: " op))))

The code is quite simple. The message function is what sends a message like (:id 1 :op "accept") to the client. Here’s how we implement it:

(fn protocol.message [kvs]
  (io/write (protocol.format kvs) "\n")
  (io.flush))

(fn protocol.format [kvs]
  (.. "("
      (table.concat
       (icollect [_ [k v] (ipairs kvs)]
         (: ":%s %s" :format k
            (if (= :table (type v))
                (.. "(" (table.concat v " ") ")")
                (view v))))
       " ")
      ")"))

The message function simply prints messages to standard out, while format handles transforming our sequence of pairs into an Emacs Lisp plist.
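One detail that is easy to miss is why the pp callback applies view twice. Table values are joined with table.concat without any further quoting, so each result has to arrive already quoted for the Emacs reader. A small sketch of this, using a standalone copy of the formatter (format-plist and pp are names made up for this illustration):

```fennel
(local {: view} (require :fennel))

;; Standalone copy of the plist formatter, so this snippet runs on its own.
(fn format-plist [kvs]
  (.. "("
      (table.concat
       (icollect [_ [k v] (ipairs kvs)]
         (string.format ":%s %s" k
                        (if (= :table (type v))
                            ;; table items are inserted verbatim, without quoting
                            (.. "(" (table.concat v " ") ")")
                            (view v))))
       " ")
      ")"))

;; The same double view as in the :pp callback: the first call serializes the
;; value, the second one wraps the serialized text in quotes.
(fn pp [data]
  (view (view data)))

(print (format-plist [[:id 1] [:op "eval"] [:values [(pp 6) (pp "foo")]]]))
```

With a single view the number would be emitted as a bare 6 and the client would read it as an integer instead of the printed representation of the result.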

Believe it or not, we’re almost done. What’s left are on-values and on-error:

(fn protocol.on-values [data]
  (protocol.data protocol.current-id data)
  (protocol.done protocol.current-id))

(fn protocol.data [id data]
  (protocol.message [[:id id] [:op "eval"] [:values data]]))

(fn protocol.on-error [_ data]
  (protocol.error protocol.current-id data)
  (protocol.done protocol.current-id))

(fn protocol.error [id data]
  (protocol.message [[:id id] [:op "error"] [:message data]]))

The done function is the final piece of the puzzle:

(fn protocol.done [id]
  (protocol.message [[:id id] [:op "done"]]))

With all of this, we should get our protocol-based REPL up and running after calling (proto-repl):

>> (proto-repl)
(:id 0 :op "init" :data "done")
>> {:id 1 :eval "(+ 1 2 3)"}
(:id 1 :op "accept")
(:id 1 :op "eval" :values ("6"))
(:id 1 :op "done")

Nice! Now, obviously, this omits a lot of plumbing, error handling, other operations, and so forth, but it should give you the idea. Also, there are a lot of unnecessary table lookups; I wrote the code this way so that you can evaluate the blocks one after another without forward declarations. But, as a result, sending each of those code blocks to a plain Fennel REPL gives us a new REPL that acts in terms of a specified protocol. Now we’ve really upgraded the REPL.

Unfortunately, this will not work as is, and not because we’re missing ops or error handling. There’s an elephant in the room. Consider this message: {:id 1 :eval "(io.write :foo)"}. What would happen if we send this message to our REPL?

>> {:id 1 :eval "(io.write :foo)"}
(:id 1 :op "accept")
foo(:id 1 :op "eval" :values ("#<file (0x7feff913f780)>"))
(:id 1 :op "done")

Right, the output is mixed with the protocol messages. Here’s another one:

>> {:id 1 :eval "(io.read :l)"}
(:id 1 :op "accept")

Now our REPL suddenly waits for user input midway. If another protocol message comes in, it will be consumed by this read, so we have to handle this too.

Handling IO

Unfortunately, Lua doesn’t have any way of redirecting output from standard out to something else. Clojure kinda does:

(with-out-str
  (println "foo"))

Instead of printing, this expression returns the string "foo\n", because println writes to the dynamically bound *out* stream, which with-out-str temporarily replaces. The JVM also lets you swap System/out itself via System/setOut. I’ve tried using io.output in Fennel, but it can only be set to a file, we can’t pass it a table that implements the necessary methods:

;; naive approach
(let [orig-out (io.output)
      fake-file {:data ""
                 :write (fn [t s] (set t.data (.. t.data s)))}]
  (io.output fake-file) ; runtime error: bad argument #1 to 'output' (FILE* expected, got table)
  (print :foo)
  (io.output orig-out)
  fake-file.data)

So we need to set up IO in such a way that it works as expected when we communicate through a protocol, yet wraps the IO inside of the user’s code. Fortunately, we can do that:

(fn set-io [env]
  (let [{: stdin : stderr : stdout} io
        {:write fd/write
         :read fd/read
         &as fd} (. (getmetatable stdin) :__index)]
    (fn env.print [...]
      (env.io.write (.. (table.concat [...] "\t") "\n"))
      nil)
    (fn env.io.write [...]
      (: (env.io.output) :write ...))
    (fn env.io.read [mode]
      (let [input (env.io.input)]
        (if (= input stdin)
            (protocol.read mode)
            (input:read mode))))
    (fn fd.write [fd ...]
      (if (or (= fd stdout) (= fd stderr))
          (protocol.message [[:id protocol.current-id]
                             [:op "print"]
                             [:data (table.concat [...] "")]])
          (fd/write fd ...))
      fd)
    (fn fd.read [fd ...]
      (if (= fd stdin)
          (env.io.read ...)
          (fd/read fd ...)))))

It’s a bit of code, but what it essentially does is this:

store the original values of stdin, stdout, and stderr,
define a bunch of replacement functions, like env.print and env.io.write,
capture the original FILE* metatable in fd and override its methods.

Unlike tables, all file objects in Lua share the same metatable, so we only need to redefine metatable entries for the stdin file. Though it may depend on the implementation of the Lua runtime being used.
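The shared metatable is easy to check in isolation. The sketch below assumes a PUC Lua style runtime where the methods of file handles live in the __index table of that metatable; the captured table and the original-write name are made up for this illustration. Note that only method calls such as (io.stdout:write ...) go through this table. The built-in print and io.write are implemented in C and do not look the method up, which is why set-io replaces them in env separately.

```fennel
(local {: stdout : stdin} io)

;; Assumption: every FILE* handle shares one metatable in this runtime.
(local shared? (= (getmetatable stdout) (getmetatable stdin)))

(local fd (. (getmetatable stdout) :__index))
(local original-write fd.write)
(local captured [])

;; Intercept method-style writes to stdout, pass everything else through.
(fn fd.write [file ...]
  (if (= file stdout)
      (table.insert captured (table.concat [...] ""))
      (original-write file ...))
  file)

(stdout:write "foo" "bar")    ; intercepted, nothing reaches the terminal

(set fd.write original-write) ; restore the real method
(stdout:write (.. "captured: " (. captured 1) "\n"))
```

Because the override is global for the whole process, restoring the original method matters in a throwaway experiment like this one; the protocol keeps the override for as long as the REPL runs.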

The env argument here is a table that will be used in the fennel.repl, so we need to change that bit of code. Because we don’t want to actually modify the real _G table, we’ll need to copy it first:

(fn copy-table [t]
  (collect [k v (pairs t)]
    k v))
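Keep in mind that this copy is shallow: top-level keys such as print can be replaced without touching _G, but nested tables such as io are still shared between the copy and the original. A quick sketch with a made-up table shows the difference:

```fennel
(fn copy-table [t]
  (collect [k v (pairs t)]
    k v))

;; Hypothetical data standing in for _G and one of its nested tables.
(local original {:name "orig" :nested {:value 1}})
(local copy (copy-table original))

(set copy.name "copy")     ; only the copy changes
(set copy.nested.value 2)  ; the nested table is shared, so both change

(print original.name original.nested.value)
(print copy.name copy.nested.value)
```

This suggests that assigning env.io.write also replaces io.write for the rest of the process, which is presumably one more reason why the protocol keeps the original functions cached as io/write and io/read.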

Here’s our new REPL:

(fn proto-repl []
  (protocol.message [[:id 0] [:op "init"] [:data "done"]])
  (repl {:env (doto (copy-table _G) set-io)
         :readChunk protocol.read-chunk
         :onValues protocol.on-values
         :onError protocol.on-error
         :pp (fn [data] (view (view data)))}))

Now, when we send the {:id 1 :eval "(io.write :foo)"} message, we get proper IO handling:

>> (proto-repl)
(:id 0 :op "init" :data "done")
>> {:id 1 :eval "(io.write :foo)"}
(:id 1 :op "accept")
(:id 1 :op "print" :data "foo")
(:id 1 :op "eval" :values ("#<file (0x7f0e234e7780)>"))
(:id 1 :op "done")

Great, now we get a message with the print OP and the data to be printed. The client can then detect such a message and act accordingly. We can even add support for different descriptors, hide stderr messages, or color them differently. But what about the opposite operation? How do we read input from the user?

This is a bit tricky. The general idea is the same: we need to tell the client that the REPL is awaiting input, but we also need to wait in such a way that other messages don’t get confused with user input. You can see that in the set-io function, io.read calls protocol.read if the descriptor is stdin. Let’s implement protocol.read now:

(fn protocol.read [mode]
  (let [tmpname (with-open [p (io.popen "mktemp -u /tmp/proto-repl.XXXXXXXX 2>/dev/null")]
                  (p:read :l))]
    (: (io.popen (.. "mkfifo '" tmpname "' 2>/dev/null")) :close)
    (protocol.message [[:id protocol.current-id] [:op "read"] [:data tmpname]])
    (let [data (with-open [f (io.open tmpname)]
                 (f:read mode))]
      (: (io.popen (.. "rm -f '" tmpname "'")) :close)
      data)))

Now, if we call io.read as part of our code, we should get back a message:

>> {:id 1 :eval "(io.read :n)"}
(:id 1 :op "accept")
(:id 1 :op "read" :data "/tmp/proto-repl.adTbqVI2")

If we write data, in this case a number, to that file, the REPL should process it:

echo 27 > "/tmp/proto-repl.adTbqVI2"

And indeed we get the usual messages we get from eval OP:

(:id 1 :op "eval" :values ("27"))
(:id 1 :op "done")

Unfortunately, Lua doesn’t have a built-in filesystem library, so this method is not cross-platform. It will probably work on BSD systems and Macs, but not on Windows, unless you’re using WSL, and even then it’s not guaranteed. I don’t have a Windows PC, so I can’t test it, but if you know a cross-platform way, or even just a special case for Windows, of creating a named FIFO pipe and attaching to it, it’d be great if you reached out to me.

This pretty much concludes our very basic, yet fully functional protocol!

The IO part is the trickiest part because there may be specifics to how files are read, but again, we can’t expect that the target application will have luafilesystem installed. But there’s a way to work around this, so we do have a final small change to make in our read-chunk function.

Making the protocol extensible at runtime

You may remember that I’ve chosen eval over parser in the REPL to work with incoming messages. There was a specific reason for it - the message can actually be not for the target application, but for the REPL itself! And we don’t even need any kind of special-casing for it, only a nop OP:

(fn protocol.read-chunk [parser-state]
  (io/write ">> ") (io.flush)
  (let [message (io/read :l)
        (ok? message) (pcall eval message)]
    (if ok?
        (case message
          {: id :eval data} (protocol.accept id :eval data)
          {: id :nop ""} "\n"
          _ (error "message did not conform to protocol"))
        (error (.. "malformed input: " (tostring message))))))
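To make the difference from plain parsing concrete, here is a small sketch of what evaluating the incoming line buys us. The classify function and the _G.greeting field are made up for this illustration; the patterns are the same ones read-chunk uses.

```fennel
(local fennel (require :fennel))

;; Hypothetical helper: evaluate one incoming line and report what kind of
;; message it produced, without dispatching it anywhere.
(fn classify [line]
  (let [(ok? message) (pcall fennel.eval line)]
    (if (not ok?)
        :malformed
        (case message
          {: id :eval code} :eval
          {: id :nop ""} :nop
          _ :unknown))))

;; A plain data message evaluates to itself.
(print (classify "{:id 1 :eval \"(+ 1 2 3)\"}"))

;; Arbitrary code may run first, as long as the last value is a valid message.
(print (classify "(do (set _G.greeting \"hi!\") {:id 2 :nop \"\"})"))
(print _G.greeting)

;; Unbalanced input fails inside pcall instead of breaking the REPL loop.
(print (classify "{:id 3"))
```

The second call is the interesting one: the side effect happens in the REPL’s own process before the message is even matched, and the nop message that comes out of it leaves nothing for the target application to evaluate.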

We also need to expose the protocol table in the _G table, and fennel for convenience:

(set _G.proto-repl protocol)
(set _G.fennel fennel)

With these changes, we can send a message like (do (print "hi!") {:id 1 :nop ""}) and it will be processed by the eval, and the print call will be evaluated outside of the main protocol handling. Thus, we can get fancy! Here, for example, is how we can make our protocol respond in JSON:

((fn to-json-protocol []
   (fn _G.proto-repl.format [data]
     (.. "{" (table.concat
              (icollect [_ [k v] (ipairs data)]
                (let [v (if (= :table (type v))
                            (.. "[" (table.concat v ", ") "]")
                            (fennel.view v))]
                  (.. (fennel.view k) ": " v)))
              ", ")
         "}"))
   {:id 15 :nop ""}))

Now, if we send this to the REPL, we will get nothing back, since the OP in question is nop. However, the format function will be redefined, so if we send one more message, the usual eval one, we’ll get back JSON messages in response:

>> {:id 1 :eval "(+ 1 2 3)"}
{"id": 1, "op": "accept"}
{"id": 1, "op": "eval", "values": ["6"]}
{"id": 1, "op": "done"}

This is why languages with the capability to run any code at runtime are cool, in my opinion. We can build our application while it is running. Any protocol method that we’ve defined can be re-implemented at runtime. So if you’re running Windows, you can redefine protocol.read to support your particular platform without even asking me to update the code. (But please, if you know a more portable way of handling input, send me a message.)

As a final change, let’s remove the >> prompt from the read-chunk, and we’re done. The prompt is not needed, it was only to help you differentiate what messages we’ve sent, and what messages we receive. For the machine the prompt will only get in the way:

(fn protocol.read-chunk [parser-state]
  (let [message (io/read :l)
        (ok? message) (pcall eval message)]
    (if ok?
        (case message
          {: id :eval data} (protocol.accept id :eval data)
          {: id :nop ""} "\n"
          _ (error "message did not conform to protocol"))
        (error (.. "malformed input: " (tostring message))))))

Finally, we can add the call to proto-repl to our protocol code, so right after we’ve sent it the new REPL is started automatically:

(proto-repl)

Now, for real, this concludes the basic protocol implementation. This is, in fact, what I started with, and then gradually made it more mature by extending the number of operations the protocol supports, and making it more robust. Not every protocol method is exposed to be changed in my implementation though. This was more like a demo that you can follow, but in the real version of the protocol, it is a single giant function that gets sent to the REPL in one go.

Now, we can talk about the client, because without the client our protocol has no value.

The client

My editor of choice is Emacs for many reasons. And this is one of them - we can make all kinds of applications in Emacs, including a custom client for a custom protocol. We’re like full-stack developers now, but our backend is written in Fennel and our frontend will be written in Emacs Lisp.

As I mentioned at the beginning of the post, I’ve chosen plists because Emacs understands them natively. Still, our messages come in as strings, so we’ll need to parse them. But how will we organize our client to begin with?

Comint

The current implementation of the Fennel REPL in the fennel-mode package uses the inbuilt comint package to do all of the heavy lifting. Comint is Emacs’ generic interface for providing user interaction to asynchronous processes. The selling point is that you can start almost any interactive program in Comint, set the prompt regexp and it will work. However, the problems start when we begin building a machine-to-machine interface over a human-to-machine interface.

As with any kind of interface it involves parsing the output. Our protocol is no different here, we’re still going to parse output, so what’s wrong with doing it via comint, especially since it already knows how to do the majority of things? Comint even has the redirection facility to deal with such tasks specifically.

The answer is - there’s no message queue that spans over both comint and the target process.

For example, you can send long-running code to the REPL that will print something after a long period of time, like this:

(do (for [i 1 1000000] nil) (print "foo" "bar" "baz"))

Once the loop is completed the stdout of the process will contain a string like this one: foo\tbar\tbaz. Comint will grab it and print it to the REPL. All good.

But if we set up a redirect while this loop runs that, say, queries the REPL for completions, we can get into a funny situation. Imagine the user typed f and pressed the Tab key. The current implementation of completion support in the fennel-mode package uses the comint redirect mechanism, and in this case it will send the expression ,complete f to the REPL. The output of the ,complete f command will be fn\tfor\tfcollect\tfaccumulate, another tab-separated line, but our REPL is busy at the moment, running the loop. What can go wrong?

What happens is that ,complete f is buffered in the process stdin and not processed until the REPL is able to read the message. The comint redirect, however, is waiting for output from the process and grabs whatever appears there first. So it grabs the output from the print function and uses it as the data for the completion engine, while the REPL buffer gets the output from the ,complete command.

This is a race, and the last request from comint wins the first output from the process. It just so happens that both messages used a format similar enough for completion to work. In most cases, completion will silently fail, and the REPL will just lose the results of the expression.

Comint on its own is not suited for working with a protocol like ours, but we can still reuse a fair bit of its features in our client. To do that, we’ll have two separate processes - one for the server that implements the protocol, and one for the REPL that acts as a fake process with all the comint goodies.

Here’s what our architecture will look like:

By separating input from output, and effects from parsing, we avoid all problems with comint racing over process output and misparsing different commands. And the input doesn’t have to come from comint, we can send messages via the input sender from anywhere, so code interaction in the buffer is easy to implement too.

Notice how the ID comes into play here - we read user input, format it as a message, and assign it an ID, which we send to the server. While doing so, we register a callback for the message in a hash table that uses message IDs as keys and callbacks as values. Once the server has processed the message, it responds with several messages of its own, all of which include the same ID. We then parse these messages in the output filter, which looks up the callback in the hash table. If the callback is found, it is called with the data from the message according to the protocol. The callback can then print the result back to the REPL, pass it to a completion engine, or send it to any other target buffer.
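The bookkeeping itself is language-independent, so here is a rough sketch of it in Fennel rather than in the client’s Emacs Lisp. The names callbacks, next-id, send, and handle are made up for this illustration, and messages are represented as already parsed tables:

```fennel
;; Registry of pending callbacks, keyed by message ID.
(local callbacks {})
(var next-id 0)

(fn send [code callback]
  "Register CALLBACK under a fresh ID and return the message to send."
  (set next-id (+ next-id 1))
  (tset callbacks next-id callback)
  {:id next-id :eval code})

(fn handle [{: id : op :values data}]
  "Dispatch one parsed server message to the callback registered for its ID."
  (let [callback (. callbacks id)]
    (when callback
      (case op
        :eval (callback data)
        ;; the message is fully processed, release its callback
        :done (tset callbacks id nil)))))

(local results [])
(local request (send "(+ 1 2 3)"
                     (fn [vals]
                       (table.insert results (table.concat vals "\t")))))

(handle {:id request.id :op "accept"})
(handle {:id request.id :op "eval" :values ["6"]})
(handle {:id request.id :op "done"})
;; Arrives after done: there is no callback anymore, so it is ignored.
(handle {:id request.id :op "eval" :values ["late"]})

(print (table.concat results ", "))
```

Since every response carries the ID of the request that caused it, output for one request can never be handed to the callback of another, no matter in which order the responses arrive.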

This is also the reason why our protocol answers with the special messages accept and done before and after the data-oriented ones, like eval, print, read, and such. All protocol operations require a callback to reach the user, but once the message is fully processed its callback can be released. This is what the done OP is for in our protocol, but what about accept?

Right now in my design, accept is a no-op, but I can see some potential later use cases. One of them is implementing asynchronous timeouts. For example, when we register a callback we can store the maximum amount of time that is meaningful for such a callback to exist. Once the server answers with the accept OP, we can check if the time since callback registration exceeds the lifespan of the callback. If not, we continue waiting for other messages; otherwise, we unassign it.

Another possible use for the accept OP is to cancel previous callbacks. This is not particularly useful in the context of a single-threaded Lua application, as it will always process messages sequentially, but nothing prevents us from implementing an asynchronous REPL. In an asynchronous context, it is quite possible that a message will be accepted before the previous one was fully processed. And in cases like when we query for completions, we are interested only in the latest results, so we can use accept to cancel the previous completion callback.

But enough talk, let’s write some code!

Server process

Now, I must say that in the same way that we didn’t implement a full-blown protocol, we won’t implement a full-blown client. However, I will try to provide a complete enough example that you’ll be able to make your own client like this in the future if you want to.

We start simple - we need a way to start the server process that will act as a pipe:

(defvar proto-repl--upgrade-code "...whole protocol code...")
(defvar proto-repl-process " *proto-repl*")
(defvar proto-repl--message-callbacks nil)
(defvar proto-repl--message-id 0)

(defun proto-repl--start-server (command)
  "Start the Fennel REPL process.
COMMAND is used to start the Fennel REPL.  Sends the upgrade code
to the REPL and waits for completion via a callback."
  (let ((proc (make-process
               :name proto-repl-process
               :buffer proto-repl-process
               :command (split-string-shell-command command)
               :connection-type 'pipe
               :filter #'proto-repl--process-filter)))
    (buffer-disable-undo proto-repl-process)
    (with-current-buffer proto-repl-process
      (setq mode-line-process '(":%s")))
    (message "Waiting for Fennel REPL initialization...")
    (setq proto-repl--message-callbacks nil)
    (setq proto-repl--message-id 0)
    (proto-repl--assign-callback #'proto-repl--start-repl)
    (process-send-string proc (format "%s\n" proto-repl--upgrade-code))))

First of all, let’s address the elephant in the room - the protocol code. Our package needs to somehow obtain the protocol code and send it to the REPL process. I’ve chosen to store it as a string, even though it is a very long string. The reasoning behind this is that I want this file to be self-contained, and because package management in Emacs is a mess, there’s no reliable way to obtain a file that is located in the same package unless they’re both in the same directory, which may not be the case¹. Not the most elegant solution, but it works.

Next up, we define a few more variables: one for the process buffer name, one for storing callbacks, and one for the message ID counter. For simplicity’s sake, I’ll use an association list for callbacks, but in the real client, a hash table with a fast equality function should be used.

And the main piece of code - the function that starts the process. The code should be mostly self-explanatory, but in case it’s not, the main idea is to start the process and assign a process filter to it. The process filter will be the next thing we’ll implement, so I’ll save the explanation for later. After that, we do a few setup steps and send the message with the upgrade code to the process. Sending messages is another big part of the client, and we’ll look into it as well.

Process filtering

The process filter is a function that accepts two arguments, the process being filtered and the data that was read from it. By specifying a custom process filter we actually prevent any output from the process from appearing in the process buffer, so some kind of logging should be implemented. And here’s our first challenge.

Our protocol works on a one-message-per-line basis, yet the output from the process is received in chunks instead of lines. It is quite possible to receive a chunk that contains the start of the next message without its end. In a case like this, a custom encoding would be a much better choice than using a line-based protocol, but we can work around the problem by implementing buffering.

(defvar proto-repl--message-buf nil)

(defun proto-repl--buffered-split-string (string)
  "Split STRING on newlines.
If the string doesn't end with a newline character, the last (or
the only) line of the string is buffered and excluded from the
result."
  (let ((strings (string-lines string t)))
    (when proto-repl--message-buf
      (setcar strings (concat proto-repl--message-buf (car strings)))
      (setq proto-repl--message-buf nil))
    (if (string-suffix-p "\n" string)
        strings
      (setq proto-repl--message-buf (car (last strings)))
      (nbutlast strings))))

First, we split the string on newlines, removing empty lines, as they’re irrelevant to us. Next, if we have something in our buffer, we concatenate it with the first line, as it is a leftover from the previous chunk. Then we check if the string ends with a newline. If it does, we return the lines as is. If not, we set the buffer to the last, incomplete line and return everything except for that line.
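The same buffering idea, sketched in Fennel to keep the examples in one language (the client itself stays in Emacs Lisp). The pending variable and the buffered-split function are made up for this illustration, and the sketch prepends the leftover to the whole chunk before splitting, which is a slightly different order of steps than in the function above:

```fennel
;; Leftover text from the previous chunk that did not end with a newline.
(var pending "")

(fn buffered-split [chunk]
  "Return the complete, non-empty lines of CHUNK, keeping a trailing partial line."
  (let [data (.. pending chunk)
        lines (icollect [line (string.gmatch data "([^\n]*)\n")]
                (when (not= line "") line))]
    ;; Whatever follows the last newline is incomplete; keep it for the next call.
    (set pending (string.match data "[^\n]*$"))
    lines))

;; A message split across two chunks, followed by the start of a third message.
(local first-batch (buffered-split "(:id 1 :op \"acc"))
(local second-batch (buffered-split "ept\")\n(:id 1 :op \"done\")\n(:id 2"))

(print (length first-batch))
(each [_ line (ipairs second-batch)]
  (print line))
(print pending)
```

The first call returns no lines because nothing is complete yet; the second call returns the two finished messages and leaves the beginning of the third one in pending.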

This way, the protocol is much more resilient to process buffering, and the fact that Emacs doesn’t read line by line on most systems. Yes, our protocol is line-based and it’s not exactly machine-friendly, but it is much easier than using some kind of encoding, like bencode.

Once we have a list of lines, we can begin processing them. Here’s our process filter:

(defun proto-repl--process-filter (_ message)
  "Parse the MESSAGE and process it with the callback handler."
  (dolist (message (proto-repl--buffered-split-string message))
    (let ((message (substring message (string-match-p "(:id " message))))
      (when-let ((data (condition-case nil
                           (car (read-from-string message))
                         (error nil))))
        (when (plistp data)
          (proto-repl--handle-protocol-op data)))))
  (with-current-buffer proto-repl-process
    (goto-char (point-max))
    (insert message)))

In this function, we’re going to work on a line-by-line basis because the first thing we do is split the incoming message. However, the input massaging doesn’t stop there - for each message we want to strip everything that is not part of it. If you remember, before we had our IO wrapped, the output from io.write was right before one of the messages. This is partly why we do it. In fact, this should never happen, but we can’t control it before the protocol is initialized, especially because by default Fennel REPL is quite verbose. If we send a partially complete expression to the base Fennel REPL it will display a .. prompt, so while sending our protocol line by line we’ll get back a lot of dots preceding the initialization message:

...............................................(:id 0 :op "init" :data "done")

So we have to deal with it. This function is also the right place to add logging, but I’m omitting it for the sake of simplicity and just emitting everything to the process buffer.

After we’ve split the input and truncated it, we parse it with the read-from-string function. This again is where some errors might lurk - if our protocol ever returned two messages in a single line, we’d lose the second one. So you might want to check for that. If the message was read successfully, we also check if it is a plist with the plistp function, and pass it to the protocol handler.
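
If you do want to guard against several messages on one line, one possible approach is to keep reading from where the previous read stopped. This is only a sketch with a hypothetical demo-read-all name, not something the client in this post does:

```elisp
(defun demo-read-all (line)
  "Return a list of every Lisp object that can be read from LINE."
  (let ((pos 0)
        (objects '()))
    (condition-case nil
        (while (< pos (length line))
          ;; `read-from-string' returns (OBJECT . END-POSITION), so the
          ;; next read can continue right after the previous object.
          (let ((parsed (read-from-string line pos)))
            (push (car parsed) objects)
            (setq pos (cdr parsed))))
      ;; Trailing whitespace or malformed input signals an error,
      ;; which simply ends the loop.
      (error nil))
    (nreverse objects)))
```

With this, a line such as (:id 1 :op "print") (:id 1 :op "done") yields both plists instead of only the first one.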

The protocol handler is the meat of our client. It manages callbacks for each message and handles all supported protocol OPs.

(defun proto-repl--handle-protocol-op (message)
  "Handle protocol MESSAGE.
Message contains an id, operation to execute, and any additional
data related to the operation."
  (let ((id (plist-get message :id))
        (op (plist-get message :op)))
    (when-let ((callback (proto-repl--get-callback id)))
      (pcase op
        ("accept" nil)
        ("done"
         (proto-repl--unassign-callback id)
         (unless (proto-repl--callbacks-pending?)
           (proto-repl--display-prompt)))
        ("eval"
         (let ((values (plist-get message :values)))
           (funcall callback (format "%s\n" (string-join values "\t")))))
        ("print" (proto-repl--print (plist-get message :data)))
        ("read"
         (let ((inhibit-message t))
           (write-region
            (read-string "stdin: ") nil
            (plist-get message :data))))
        ("error"
         (proto-repl--display-error
          (plist-get message :type)
          (plist-get message :data)
          (plist-get message :traceback)))
        ("init"
         (proto-repl--unassign-callback 0)
         (funcall callback nil))))))

Even though our protocol is quite small, we do have a few OPs to support. As I’ve mentioned, the accept OP is a no-op, but the done OP is quite important. You can see that we unassign the callback for the message and then check if there are pending callbacks. We draw the prompt only when none are left, because pending callbacks mean that the REPL is still busy.

Another special OP is the init one. It has a special callback with the ID 0, and this callback is strictly for the REPL initialization. The other OPs should be pretty much self-explanatory.
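
To illustrate how the OPs play together, here is a small simulation that only records what the client would do. Everything in it is hypothetical: the demo- names, the log format, and in particular the order of messages, which is my assumption about a typical evaluation of an expression that prints something.

```elisp
(require 'subr-x) ; for `string-join' on older Emacs versions

(defvar demo-pending (list 1)
  "IDs still waiting for a \"done\" message.")
(defvar demo-log nil
  "Events recorded by `demo-handle', newest first.")

(defun demo-handle (message)
  "Record what the client would do for MESSAGE."
  (let ((id (plist-get message :id)))
    (pcase (plist-get message :op)
      ("accept" nil)
      ("print" (push (list 'print (plist-get message :data)) demo-log))
      ("eval"
       (push (list 'result (string-join (plist-get message :values) "\t"))
             demo-log))
      ("done"
       (setq demo-pending (delq id demo-pending))
       ;; The prompt is drawn only when nothing else is pending.
       (unless demo-pending
         (push '(prompt) demo-log))))))

;; Assumed message sequence for a single evaluation.
(mapc #'demo-handle
      '((:id 1 :op "accept")
        (:id 1 :op "print" :data "hi\n")
        (:id 1 :op "eval" :values ("nil"))
        (:id 1 :op "done")))
```

After this runs, (reverse demo-log) contains a print event, a result event, and finally a prompt event, and the prompt appears only because the done message emptied the list of pending IDs.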

Let’s write functions for dealing with callbacks:

(defun proto-repl--get-callback (id)
  "Get a callback for a message with this ID."
  (cdr (assoc id proto-repl--message-callbacks)))

(defun proto-repl--unassign-callback (id)
  "Remove callback assigned to a message with this ID."
  (setq proto-repl--message-callbacks
        (assoc-delete-all id proto-repl--message-callbacks)))

(defun proto-repl--assign-callback (callback)
  "Assign CALLBACK and return the id it was assigned to."
  (let ((id proto-repl--message-id))
    (add-to-list 'proto-repl--message-callbacks (cons id callback))
    (setq proto-repl--message-id (1+ id))
    id))

(defun proto-repl--callbacks-pending? ()
  "Check for callbacks that are still waiting for the DONE message."
  proto-repl--message-callbacks)

With these, we can do all operations on callbacks that our client requires. The proto-repl--assign-callback is an interesting one. It increments the ID after it assigned the callback, and returns the ID of the callback itself. This is our main interface for sending messages to the REPL.
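
The lifecycle of a callback can be tried out in isolation with a stand-alone re-creation of this bookkeeping. The demo- names are hypothetical, and starting the counter at 1 is my assumption, based on ID 0 being reserved for initialization:

```elisp
(defvar demo-callbacks nil
  "Alist of (ID . CALLBACK) pairs for requests that are still pending.")
(defvar demo-next-id 1
  "Next free message ID (assumed to start at 1, as 0 is used by init).")

(defun demo-assign (callback)
  "Register CALLBACK and return the ID it was registered under."
  (let ((id demo-next-id))
    (push (cons id callback) demo-callbacks)
    (setq demo-next-id (1+ id))
    id))

(defun demo-unassign (id)
  "Forget the callback registered under ID."
  ;; IDs are small integers, so comparing with `eq' is sufficient here.
  (setq demo-callbacks (assq-delete-all id demo-callbacks)))

(let ((id-a (demo-assign #'ignore))
      (id-b (demo-assign #'ignore)))
  (demo-unassign id-a)
  ;; Only the second request is still pending at this point.
  (list id-a id-b (mapcar #'car demo-callbacks)))
```

The final form evaluates to (1 2 (2)): two IDs were handed out, and only the second one still has a callback, so a client in this state would not draw the prompt yet.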

User interaction

With the server part mostly done, we can make the comint part that is an actual user interface to our protocol-based REPL. Let’s start with the mode for the REPL:

(require 'fennel-mode) ;; for font-lock

(defvar proto-repl-buffer "*Fennel Proto REPL*")
(defvar proto-repl-prompt ">> ")

(define-derived-mode proto-repl-mode comint-mode "Fennel Proto REPL"
  "Major mode for Fennel REPL.

\\{proto-repl-mode-map}"
  (setq comint-prompt-regexp (format "^%s" proto-repl-prompt))
  (setq comint-prompt-read-only t)
  (setq comint-input-sender 'proto-repl--input-sender)
  (setq mode-line-process '(":%s"))
  (setq-local comment-end "")
  (fennel-font-lock-setup)
  (set-syntax-table fennel-mode-syntax-table)
  (unless (comint-check-proc (current-buffer))
    (let ((proc (start-process proto-repl-buffer (current-buffer) nil)))
      (add-hook 'kill-buffer-hook
                (lambda ()
                  (when-let ((proc (get-buffer-process proto-repl-process)))
                    (delete-process proc)))
                nil t)
      (insert ";; Welcome to the Fennel Proto REPL\n")
      (set-marker (process-mark proc) (point))
      (proto-repl--display-prompt))))

There’s a lot to dig into, but the main part is where we start a new process with the start-process function. This process is what comint will use for its internal implementation of input handling and such. Let’s look at the proto-repl--input-sender:

(defun proto-repl--input-sender (_ input)
  "Sender for INPUT from the REPL buffer to REPL process."
  (let* ((id (proto-repl--assign-callback #'proto-repl--print))
         (mesg (format "{:id %s :%s %S}\n" id "eval" (substring-no-properties input))))
    (process-send-string proto-repl-process mesg)))

This function is responsible for sending the input to the server as a message, so it formats user input like one. Before sending the message it assigns the callback, and then the rest of our system should just work.
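
To see what actually goes over the wire, the formatting step can be pulled out into a tiny helper. The demo-format-eval name is hypothetical, and the format string is the same one the sender uses:

```elisp
(defun demo-format-eval (id input)
  "Format INPUT as an eval message with ID, the way the input sender does."
  ;; %S prints INPUT as a Lisp string literal, so embedded double quotes
  ;; and backslashes are escaped.
  (format "{:id %s :%s %S}\n" id "eval" input))
```

For example, (demo-format-eval 1 "(+ 1 2)") produces the text {:id 1 :eval "(+ 1 2)"} followed by a newline, and any double quotes inside the input come out escaped with a backslash.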

The proto-repl--display-prompt is a simple function that just prints the prompt to the REPL buffer:

(defun proto-repl--display-prompt ()
  "Display prompt."
  (let ((proc (get-buffer-process proto-repl-buffer)))
    (comint-output-filter proc proto-repl-prompt)))

And the proto-repl--print is a default callback for printing anything to the REPL buffer:

(defun proto-repl--print (message)
  "Print MESSAGE to the REPL buffer."
  (let ((proc (get-buffer-process proto-repl-buffer)))
    (comint-output-filter proc message)))

This function is almost identical to the one that displays the prompt, but that is mostly for simplicity’s sake. In fact, the real one is a bit more complicated because we want to print output before the prompt when code is sent from a source buffer rather than from the REPL. I’ve left this out as it’s not that hard to implement.

So the last thing we need is to actually start the REPL in a buffer:

(defun proto-repl--start-repl (_)
  "Start the REPL."
  (message "Fennel Proto REPL initialized")
  (with-current-buffer (get-buffer-create proto-repl-buffer)
    (proto-repl-mode)
    (pop-to-buffer (current-buffer))))

This one is the callback we used when starting the server. And this one is the function for the end user:

(defun proto-repl ()
  "Fennel Proto REPL."
  (interactive)
  (if-let ((repl-buffer (get-buffer proto-repl-buffer)))
      (if (process-live-p (get-buffer-process repl-buffer))
          (pop-to-buffer repl-buffer)
        (proto-repl--start-server "fennel --repl"))
    (proto-repl--start-server "fennel --repl"))
  (get-buffer proto-repl-buffer))

And the client is done! Let’s try it:

On the left is the REPL buffer, which is responsible for input handling, and sending messages. On the right is the server buffer, which shows the process log. As you can see, we’ve got the usual messages for all our expressions.

New Fennel REPL integration for Emacs

Sure, the implementation of the client and the protocol has a lot of room for improvement, but it is enough for the purpose of this post. As a matter of fact, I’m almost done working on a proper implementation, and most of the code here is a greatly simplified version of what I’ve already made for the fennel-mode package. For now it lives in a separate module, but I have high hopes of completely replacing the built-in comint-based client with this protocol-based one once I test it more. It will probably stay in a separate module for a while even after I consider it mostly complete, so that users can try it in their environments first.

There is one problem though - this protocol implementation may not work in some contexts. For instance, you may already be providing a custom REPL in your application that is based on fennel.repl and implements its own readChunk, one that doesn’t read from stdin. One such example is the LÖVE game engine, for which there’s an implementation of the REPL that polls for input on a separate thread. If we send our protocol-based REPL to such a custom REPL, we’ll negate all the effort made to make the REPL work in a non-blocking way. For Clojure, this works because the REPL already lives in its own thread, but there are no threads in Lua apart from what coroutines provide.

So I’m still thinking about how to handle the situation when the upgrade is impossible. This is why the official protocol is much better than such a custom one, but alas.

You can already try this REPL if you check out the proto-repl branch in the fennel-mode project. If you’re using straight.el, or Quelpa, or an Emacs version recent enough to have the package-vc-install function, you can try installing fennel-mode and supplying the proto-repl branch to the recipe.
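
With package-vc-install, that could look roughly like the following. Treat it as a sketch: the repository URL is my assumption about where fennel-mode is hosted, so check the project page, and the branch name is passed as the revision to check out.

```elisp
;; Assumes an Emacs version that ships `package-vc-install'.
;; The URL below is an assumption; verify it against the project page.
(package-vc-install "https://git.sr.ht/~technomancy/fennel-mode"
                    "proto-repl")
```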

If you want to experiment with the code provided in this article, for example, to build a similar protocol for another language, this very article is written in the literate style, and you can grab it here. Run org-babel-tangle on it, and you should get the protocol.fnl file with all the necessary functions, and the proto-repl.el file with the Emacs Lisp code. You’ll need to put the protocol code into the proto-repl--upgrade-code variable, as I’ve excluded it from the article’s text since it was making that particular code block too long.

Thank you for reading! I hope this was interesting and useful. As always, if you have any thoughts you want to discuss on the matter, feel free to contact me via one of the ways mentioned on the about page.

Package managers like straight.el or quelpa use recipes to specify what files to use during the build process. It is possible that users of the package may not notice that the recipe needs to be changed. This can be handled by the package archive like MELPA, but I’d rather not bother with it right now.
