Andrey Listopadov

Pretty-printing for Fennel Language

@programming fennel lisp ~11 minutes read

Update:

All the patches1​, 2​ has been merged into the main branch of Fennel language, so expect to see improved fennelview in next stable release! Some semantics have been altered, so I’ve updated the post a bit to reflect the changes.


Pretty-printing in Lisp is a way to represent data structures that we operate in our program in a human-readable way. This technique is quite old and was part of many Lisp languages for a very long time. Because Lisps usually are homoiconic, defining a pretty printer for Lisp’s main data structure in terms of the syntax expected by the reader, we get a way to pretty-print Lisp’s code, thus making pretty-printer act as a formatter. And as far as I know, this was originally the main purpose for pretty printer - code was written in a “compressed” form, e.g. no line breaks to save space, and then pretty-printed on some kind of display. This was also handy to review outputs of programs that generate programs, another common property of Lisp languages.

Fennel is a relatively young Lisp-like language, that compiles to Lua. It uses Lua’s main data structure - a table, as its main data structure, and has its own pretty printer, called fennelview. It’s a pretty simple pretty printer, that is mostly based on another pretty printer for Lua, called inspect.lua.

Since inspect.lua is oriented at producing Lua-style tables, fennelview.fnl was altered to make it output tables in a Fennel’s syntax, so printed tables could be read back by the reader. It is important to note, that inspect.lua is not oriented at table serialization, hence this is not well suited for a Lisp-like language.

fennelview.lua also shared some problems, most of which were inherited from inspect.lua, and one of which is indentation handling. In Fennel v0.7.0, tables were printed like this in the REPL:

>> {:a 1 :b 2 :c 3}
{
  :a 1
  :b 2
  :c 3
}

Indentation seems to be correct from a C-like language standpoint, but Lisp hackers usually expect the following notation:

{:a 1
 :b 2
 :c 3}

This is more concise and takes less horizontal space. It may seem like not a big deal, however, the main issue is that fennelview can’t properly deal with nested tables. For example, if we put the table from above into a sequential table with some elements, and possibly other tables, we’ll get an unpleasant result:

>> [1 {:a 1 :b 2} 2 3 {:c 4 :d 5 :e 6 :f 7 :g 8} 9]
[1 {
    :a 1
    :b 2
  } 2 3 {
    :c 4
    :d 5
    :e 6
    :f 7
    :g 8
  } 6]

Indentation no longer makes any sense, and while this is not an issue for Fennel’s reader, it is hard to read such data structure from a human perspective. There are two other Lisps languages with quite a similar syntax - Janet and Clojure, each of which formats the table from above like this:

[1 {:a 1, :b 2} 2 3 {:c 4, :d 5, :e 6, :f 7, :g 8} 9]

(Except in Janet parentheses are used in place of square brackets, and there are no commas in the table literal.)

This is much better than the current behavior of fennelview, however, I still find this kind of representation a bit hard to read, as we need to keep an eye on the inner object’s boundaries. So I’ve decided to re-implement fennelview from scratch, to make it smarter about indentation, and when to produce multi-line representations of the data structure it was passed.

Lua tables, and Fennel reader

Tables in Lua are implemented as a combination of associative tables, and sequential array-like parts. Lua actually has a single syntax for defining both kinds of tables, which uses curly braces:

-- sequential table or array
local t1 = {1, 2, 3}
-- associative table
local t2 = {a = 1, b = 2}
-- sequential table with nested tables
local t3 = {1, t2, 2, 3, {c = 4, d = 5}, 6}
-- a combination of associative and sequential tables
local t4 = {1, 1, 2, 3, 5, kind = "Fibonacci"}

Fennel defines different syntaxes for different table kinds - sequential tables are defined with [], and associative tables are defined with {}:

(local t1 [1 2 3])
(local t2 {:a 1 :b 2})
(local t3 [1 t2 2 3 {:c 4 :d 5} 6])

However, combined tables are created as associative tables, by providing indexes as keys:

(local t4 {1 1 2 1 3 2 4 3 5 5 :kind "Fibonacci"})
;; you can do the same in Lua, but it's more verbose:
;; local t4 = {[1] = 1, [2] = 1, [3] = 2, [4] = 3, [5] = 5, kind = "Fibonacci"}

Another way of defining such a table is by using set and mutating the existing sequential table:

>> (local t4 [1 1 2 3 5])
>> t4
[1 1 2 3 5]
>> (set t4.kind "Fibonacci")
>> t4
{
    1 1
    2 1
    3 2
    4 3
    5 5
    :kind "Fibonacci"
}

Note that until we’ve specified the non-sequential key in our table it was printed as a sequence. However, when a table has non-sequential keys, it is printed as an associative table. This has deep implications since the ipairs function will only traverse the sequential part of the table, so by printing a table this way we also indicate that it has more to it than just the sequential part.

But the indentation is not right. I’ve thought about a small set of rules, which could be used to decide how to print tables. This is what I got, and I’m pretty much pleased with the results:

  1. Associative tables are printed within {}
    1. Opened and closed curly braces must stay at the same lines as the first and last key-value pair in the table respectively.
    2. Associative tables printed as multi-line if a certain amount of key-value pairs are present. Currently the limit is 4.
    3. If any key or value itself is a table, multi-line output is produced.
    4. Strings in key position printed as colon-strings if possible.
  2. Sequential tables are printed within
    1. Opened and closed square brackets must stay at the same lines as the first and last values in the table respectively.
    2. Sequential tables printed as multi-line if a certain amount of key-value pairs is present. Optimal limit is 10.
    3. If any value itself is a table, multi-line output is produced.
  3. If the single-line representation of the current data structure and its indentation is greater than a certain amount, a multi-line representation is printed. Optimal width is 80 characters.
  4. Nested tables share parent's indentation level.

Note:

The scratched rules above were considered initially but changed in the process of integrating the new fennelview into the language. Therefore, the next part of this post is now irrelevant, because most of these tables were examples of that scratched rule, but the current implementation only checks for the outermost third rule, e.g. for the resulting line width. Most of these examples would be single-line ones now, the comparison section has a more up-to-date variant of the formatting. I’ve highlighted irrelevant code with cursive.

Let’s manually apply these rules to the table that I’ve used before to visualize fennelview’s problems:

[1
 {:a 1 :b 2}
 2
 3
 {:c 4
  :d 5
  :e 6
  :f 7
  :g 8}
 9]

We can see that the sequential table contains less than 10 elements, but since the second associative table has more than 4 key-value pairs, we use multi-line output. On the other hand, the first associative table has only two key-value pairs in each, therefore is printed as a single line. And if the second associative table were to have fewer elements, the result would be printed as a one-liner:

[1 {:a 1 :b 2} 2 3 {:c 4 :d 5 :e 6 :f 7} 9]

Because neither of the inner tables is multiline, and the total line length is less than 80.

In my opinion, this is a very readable form for such kind of data structure - we can immediately see all sequential parts, and associative parts still stand out. I’ve re-implemented fenneview with these rules applied to sequential and associative tables, and with these changes whenever we pretty-print anything in the REPL we get a correctly indented result. For example, here’s how the configuration file for Fenneldoc is now printed:

>> (fennel.dofile ".fenneldoc")
{:fennel-path ["cljlib/?.fnl" "src/?.fnl"]
 :function-signatures true
 :insert-comment true
 :insert-copyright true
 :insert-license true
 :insert-version true
 :keys {:copyright "_COPYRIGHT"
        :description "_DESCRIPTION"
        :doc-order "_DOC_ORDER"
        :license "_LICENSE"
        :module-name "_MODULE_NAME"
        :version "_VERSION"}
 :order "aplhabetic"
 :out-dir "./doc"
 :silent false
 :toc true}

This is the exact way I would write it by hand (and I actually did), and I think this way it looks and reads much better than with default fennelview implementation.

However there are some quirks in Lua VM related to tables, and we also need a way to produce correctly indented user-defined data structures, that can be nested in other data structures.

Cyclic tables

In Lua, it is possible to define a table that contains itself, and Fennel is no exception here. Current implementation of fennelview has an option named detect-cycles, which is set to true by default. This means that when we try to print a table that contains itself, we will not go into infinity recursion:

>> (local t1 [])
>> (table.insert t1 t1)
>> t1
@1[#<table @1>]

I’m not a big fan of this #<table @1> thing, It mimics Lua’s internal string representation for a table for no reason. More than that, if the table is associative, we’ll see exact same #<table @id> notation, which is not ideal too. I’ve re-implemented parts of the cycle detection algorithm, and now it can differentiate what table type the cycle appears in:

>> (local t1 [])
>> (table.insert t1 t1)
>> t1
@1[@1[...]]

Cycles are printed either as @id[...] for sequential tables, or as @id{...} for associative tables:

>> (local t {})
>> (local v [t])
>> (set t.v v)
>> t
@1{:v [@1{...}]}
>> v
@1[{:v @1[...]}]

We can disable this behavior by setting :detect-cycles? option to false, and if we encounter a cycle, we will go in until the depth limit is reached. The depth limit is set to 128 by default.

This can also be done for user-defined data structures based on tables. For example, I’ve implemented this for hash-set from Cljlib, and now sets with cycles are printed as @set1{...}, or with other IDs, depending on context. This is done by specifying the custom implementation of __fennelview metamethod in the table’s metatable.

__fennelview metamethod

Set implementation from Cljlib is a bit overwhelming to explain here, so let’s instead define another non-existing data structure. For example, Lua does not have linked lists, so we can implement cons cells in terms of tables:

(fn recur-print-list [[head tail] options view indent elements]
  (when head
    (table.insert elements (view head options indent)))
  (match (type tail)
    (:table ? (= :cons (-?> tail getmetatable (. :__type))))
    (recur-print-list tail options view indent elements)
    :nil nil
    _ (table.insert elements (.. ". " (view tail options (+ indent 2)))))
  elements)

(fn pp-list [list view options indent]
  (let [elements (recur-print-list list options view (+ indent 1) [])
        lines (icollect [i line (ipairs elements)]
                (if (= i 1) line (.. " " line)))]
    (doto lines
      (tset 1 (.. "(" (or (. lines 1) "")))
      (tset (length lines) (.. (. lines (length lines)) ")")))))

(local cons-mt {:__fennelview pp-list :__type :cons})
(fn cons [h t] (setmetatable [h t] cons-mt))

This is a rather basic implementation for cons list pretty printer, but it already allows us to print lists with a traditional notation:

>> (cons 1 2)
(1 . 2)
>> (cons 1 (cons 2 3))
(1 2 . 3)
>> (cons 1 (cons 2 (cons 3)))
(1 2 3)

When we encounter multiline tables we force multi-line output for the whole list. And we can actually see, that the rules of the indentation that we’ve defined for tables still apply:

>> (cons 1 (cons 2 (cons {:some-long-key 1 :some-other-key 2
                          :another-long-key 3 :and-yet-another-one 4})))
(1
 2
 {:and-yet-another-one 4
  :another-long-key 3
  :some-long-key 1
  :some-other-key 2})

The associative table is correctly indented, because we manipulate the indent argument inside recur-print-list, and this even works correctly when our list itself is inside of another table:

>> {:list (cons 1 (cons 2 (cons 3 {:some-long-key 1 :some-other-key 2
                                   :another-long-key 3 :and-yet-another-one 4})))}
{:list (1
        2
        3
        . {:and-yet-another-one 4
           :another-long-key 3
           :some-long-key 1
           :some-other-key 2})}

More complex data structures may require additional logic put in the pretty-printer function, but this example illustrates the idea of control you can have.

Comparison to other Lisp pretty-printers

This implementation may not be as general as the one available in, say, Clojure. For example, I’ve mentioned a rule, that if any nested element of a table is multiline, fennelview will output multi-line representation. This rule makes it a lot easier to reason about indentation. Consider:

user=> (clojure.pprint/pprint {:seq [1 2 3 {:some-long-key 1 :some-other-key 2
                                            :another-long-key 3}]})
{:seq
 [1 2 3 {:some-long-key 1, :some-other-key 2, :another-long-key 3}]}
user=> (clojure.pprint/pprint {:seq [1 2 3 {:some-long-key 1 :some-other-key 2
                                            :another-long-key 3}]})
{:seq
 [1
  2
  3
  {:some-long-key 1,
   :some-other-key 2,
   :another-long-key 3,
   :and-yet-another-one 4}]}

Clojure’s pprint is smart enough to see that it can just put the second table on the second line, and not trigger the multi-line output of a vector. My implementation of fennelview will print everything on a new line, because the length of single-line representation is big enough, and we can benefit from vertical alignment:

>> {:seq [1 2 3 {:some-long-key 1 :some-other-key 2 :another-long-key 3}]}
{:seq [1 2 3 {:another-long-key 3 :some-long-key 1 :some-other-key 2}]}
>> {:seq [1 2 3 {:some-long-key 1 :some-other-key 2
                 :another-long-key 3 :and-yet-another-one 4}]}
{:seq [1
       2
       3
       {:and-yet-another-one 4
        :another-long-key 3
        :some-long-key 1
        :some-other-key 2}]}

In practice, I find this easier to read, but this actually may make it a bit harder to implement code formatter on top of pretty-printer. Also, I think that separating keys and values in associative data structure with newlines is a bad decision to make. Clojure can use commas in printed representation, since they are completely ignored, which I think is a great way to indicate pairs, but does not justify newlines in this particular case at least for me.

The source code of the formatter is available here, and I’ve submitted a patch to the Fennel language, so hopefully, we’ll see this implementation by default. Thanks for reading!