Extending JSON encoder with custom types
One thing I like about Lua is that it has a limited number of built-in types and data structures. There are no objects, just tables. And if you need custom behavior, a metatable can be attached to any table to extend it with custom methods that integrate with the rest of the Lua runtime.
However, this is both a blessing and a curse. And unfortunately, the Lua developers made a huge mistake by breaking this API¹ once already, in a minor release even.
As you may have noticed, `fnl-http` contains a lot of stuff. It has asynchronous Channels and various Readers, which act like objects but are still ordinary tables underneath. And this is true for a lot of Lua projects: tables are flexible enough to implement many cool things, so people do.
However, this poses an interesting problem: because objects are not a thing in Lua, and the set of default metamethods is quite limited, there’s no standard way of extending objects with custom behavior. In Java, you would specify which interfaces an object must implement, and if something doesn’t implement them, a wrapper object is often needed. In Clojure, we can get away with protocols and multimethods, but Lua has neither of those. In Lua, and subsequently Fennel, we have to be creative, because even though we can add behavior through metatables, not every value can have one.
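For instance, plain Lua only allows attaching a metatable to a table; trying to do so with, say, a number fails outright (a REPL sketch; the exact error wording may vary between Lua versions):

```
>> (setmetatable 42 {:__index {}})
runtime error: bad argument #1 to 'setmetatable' (table expected, got number)
```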
While working on `fnl-http`, I added a basic JSON parser and encoder. The parser part is quite simple, as JSON has a specification and a fixed set of data types to worry about. It’s not like EDN or XML, for example, which are both extensible data notations that can represent pretty much anything. In JSON you get only these data types:
- Number
- String
- Array
- Object
- Boolean
- Null
The only difference from Lua here is that `null` is `nil`, and Lua doesn’t have arrays as a separate type. So implementing a parser was easy, and everything pretty much maps to Lua data structures one-to-one.
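To illustrate the mapping, here is a hypothetical decoding call (`json.decode` is an assumed name for the parsing entry point; the actual API may differ):

```fennel
;; `json.decode` is a hypothetical entry point name;
;; the JSON-to-Lua mapping is what matters here
(json.decode "{\"nums\": [1, 2, 3], \"ok\": true, \"missing\": null}")
;; => {:nums [1 2 3] :ok true}
;; `null` became `nil`, so the `missing` key doesn't exist in the result
```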
Implementing the JSON encoder, however, was not as easy. For basic data types it’s easy enough, but once we touch custom objects created with metatables, things get messy.
For example, let’s say we need a set data structure. A set is similar to a table, except the values are the same as the keys. But we may want to distinguish sets from ordinary tables if, say, we’re making a library with functions for working with mathematical sets, like `intersection`, `difference`, etc. Here’s a set constructor:
```fennel
(fn Set [...]
  (let [data (faccumulate [res {}
                           i 1 (select :# ...)]
               (let [k (select i ...)]
                 (doto res (tset k k))))]
    (setmetatable
     {}
     {:__index data
      :__newindex (fn [_ k _] (tset data k k))})))
```
As you can see, we create an empty table and set its metatable to a bunch of functions that close over the local `data`. This technique is quite common in Lua and is called proxy tables. By using the empty table as a base, we ensure that any table access triggers the `__index` or `__newindex` metamethod, depending on the action.
However, there’s a problem. If we try to print such an object in the REPL, we don’t get any information about it, even though it has values:
```
>> (local s (Set :a :b :c))
nil
>> (. s :a)
"a"
>> (set (. s :d) 42)
nil
>> (. s :d)
"d"
>> s
{}
```
In addition to that, we have no means of iterating over this data structure, as the default `next` function will just return `nil`:

```
>> (next s)
nil
```
We can fix this by adding a `__pairs` metamethod:
```fennel
(fn Set [...]
  (let [data (faccumulate [res {}
                           i 1 (select :# ...)]
               (let [k (select i ...)]
                 (doto res (tset k k))))]
    (setmetatable
     {}
     {:__index data
      :__newindex (fn [_ k _] (tset data k k))
      :__pairs (fn [_] #(pick-values 1 (next data $2)))})))
```
Now printing works, albeit in a somewhat strange way:

```
>> (Set :a :b :c)
{:a nil :b nil :c nil}
```
A few years ago, I remade the pretty-printer for Fennel, and support for custom objects was a huge part of its design. In short, we can fix our object’s textual representation by using the `__fennelview` metamethod.
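For instance, here’s a minimal sketch of what that could look like for our Set (the metamethod receives the object, a `view` callback for rendering nested values, the pretty-printer options, and the current indentation):

```fennel
(fn Set [...]
  (let [data (faccumulate [res {}
                           i 1 (select :# ...)]
               (let [k (select i ...)]
                 (doto res (tset k k))))]
    (setmetatable
     {}
     {:__index data
      :__newindex (fn [_ k _] (tset data k k))
      :__pairs (fn [_] #(pick-values 1 (next data $2)))
      ;; render as (Set :a :b :c) instead of {};
      ;; the extra 5 accounts for the "(Set " prefix
      :__fennelview (fn [_ view options indent]
                      (.. "(Set "
                          (table.concat
                           (icollect [k (pairs data)]
                             (view k options (+ indent 5)))
                           " ")
                          ")"))})))
```

With this change, `(Set :a :b :c)` prints as `(Set :a :b :c)` in the REPL.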
Now, `__fennelview` is fine and all, but it is not exactly meant for extending existing objects. Of course, given a Lua library that implements these proxy tables, you can add Fennel support this way, but it’s more of an ad-hoc solution. It’s meant for library authors to make the Fennel REPL experience more user-friendly, and to make sure that the printed representation can be read back by Fennel. And it’s not always possible to add `__fennelview` to existing objects.
For example, one common complaint about Lua is that arrays start from `1` and not `0`. We can “fix” that by implementing our own Array type:
```fennel
(fn Array [...]
  (let [vals [...]]
    (setmetatable
     []
     ;; shift indices by one, so arr[0] refers to vals[1]
     {:__index (fn [_ i]
                 (. vals (+ i 1)))
      :__newindex (fn [_ i val]
                    (tset vals (+ i 1) val))
      :__len #(length vals)
      :__pairs (fn [_] #(next vals $2))})))
```
Right now, the object above prints like this:
```
>> (Array 1 2 3)
[1 2 3]
```
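Indexing, meanwhile, is now zero-based, as a quick REPL check shows:

```
>> (local arr (Array :a :b :c))
nil
>> (. arr 0)
"a"
>> (length arr)
3
```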
Let’s assume this came from a library, and we wish to print it differently, e.g. as `(Array 1 2 3)` instead of `[1 2 3]`, to make sure that such arrays are read back as 0-based arrays, and not as ordinary Lua tables. We can do that, but it is tricky, because we don’t have any means of accessing the `vals` closure:
```fennel
(fn wrap [arr]
  (tset (getmetatable arr)
        :__fennelview
        (fn pp-doc-example [t view options indent]
          (let [lines (icollect [i v (pairs t)]
                        (let [v (view v options (+ 7 indent))]
                          (if (= i 1) v
                              (.. " " v))))]
            (doto lines
              (tset 1 (.. "(Array " (or (. lines 1) "")))
              (tset (length lines) (.. (. lines (length lines)) ")"))))))
  arr)
```
However, as you can see, we have to use `wrap` on every `Array`, because each instance comes with its own metatable, and setting it once wouldn’t work. And if we try to read back the printed representation, we’ll get the old representation:
```
>> (wrap (Array 1 2 3))
(Array 1 2 3)
>> (Array 1 2 3)
[1 2 3]
```
So we can make another wrapper:
```fennel
(local Array* Array)

(fn Array [...]
  (wrap (Array* ...)))
```

```
>> (Array 1 2 3)
(Array 1 2 3)
```
So you can see how this is a bit problematic.
Custom JSON encoders
Now, this was a bit of a tangent, so let’s get back to the main topic: JSON encoding.
Because I can’t ask everyone to do this kind of wrapping, I decided to implement a different way of extending the `json.encode` function. Thus, the library now features two more functions: `register-encoder` and `unregister-encoder`. The second one is for fixing mistakes and probably won’t be used much.
So, let’s see how we can add support for custom types. First, we need a function that will tell if the given object is an instance of something we want to encode:
```fennel
(fn Array? [x]
  (and (= :table (type x))
       (not= nil (. x 0))
       Array))
```
A library may provide a custom predicate to check if a given object is of a particular type, like `chan?` for asynchronous channels. However, such a predicate can’t be used as is, because to register an encoder, the function needs to return a singleton that can be compared with `=`. That’s why our example function returns `Array` instead of `true`.
Next, we can implement the encoder:
```fennel
(fn encode-array [arr encode]
  (.. "["
      (-> (fcollect [i 0 (- (length arr) 1)]
            (encode (. arr i)))
          (table.concat ", "))
      "]"))
```
Finally, we can register it:
```fennel
(json.register-encoder object object? encode-object)
```
The first argument is the object itself, and the second one is a function to obtain the object’s type. Internally, `register-encoder` calls the `object?` function on the first argument, which provides a bit more flexibility in case `object?` returns different values for different objects or based on the object’s properties. The third argument is a function that performs the encoding; it accepts a callback for encoding nested values as its second argument.
With that, we can register the encoder and encode our array as JSON by calling `(json (Array 1 2 3))`:

```
>> (json.register-encoder (Array) Array? encode-array)
nil
>> (json (Array 1 2 3))
"[1, 2, 3]"
```
This should provide enough flexibility to support any proxy-based objects.
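As a sketch of how this extends to other types, here’s what registering an encoder for the Set from earlier could look like. Note that `Set?` is hypothetical: it assumes the Set constructor is amended to put a `:set true` marker into each metatable, since a real set library would export its own predicate:

```fennel
;; hypothetical predicate: assumes each Set metatable carries :set true;
;; like Array?, it must return the Set singleton for registration
(fn Set? [x]
  (and (= :table (type x))
       (= true (?. (getmetatable x) :set))
       Set))

;; encode a set as a JSON array of its elements;
;; element order is unspecified, as with any Lua table iteration
(fn encode-set [s encode]
  (.. "["
      (-> (icollect [k (pairs s)]
            (encode k))
          (table.concat ", "))
      "]"))

(json.register-encoder (Set) Set? encode-set)
```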
Custom decoders
I decided not to add custom decoder support to this library yet. JSON doesn’t have any custom types, so it doesn’t make sense to parse its values to anything but the default Lua types.
One could argue that it would be nice to have a way to parse arrays into zero-based ones, given that we can convert zero-based arrays to JSON. While true, I think this would only add unnecessary complexity to the parser.
One reason registering an encoder requires a singleton as the type designator is to ensure that encoders can’t be mixed up. If it were possible to register encoders via string type names, two libraries could both register an `"Array"` type, making it ambiguous which encoder would be used, or requiring a name change in case of a conflict.
JSON has a fixed set of types, so registering a decoder for, say, Array would mean that all arrays get parsed into this custom type in every part of the project. That means if you use a different library that also relies on this JSON parser and internally expects one-based arrays, it will break.
I’m open to changing my mind on this, but for now, I feel that custom encoders are sufficient to make this library more usable.
As a part of this change, I’m moving this library out of the `fnl-http` repo and into its own repository, making `fnl-http` depend on it instead, and making it possible for other projects to use this parser more easily.
You can find the code here.
Thanks for reading, and I hope this library will be useful!
Footnotes

1. The deprecation of the `__ipairs` metamethod in Lua 5.3 resulted in an inability to implement custom iteration over custom sequential data structures. For example, the zero-indexed array example would benefit from `__ipairs`, because we could specify that the iteration starts from `0`. With the default `ipairs` we would always miss the first value. It’s not hard to fix manually, by overriding `ipairs` with a custom function that checks for the `__ipairs` metamethod:

   ```fennel
   (local ipairs* ipairs)

   (fn _G.ipairs [tbl ...]
     (case (getmetatable tbl)
       {:__ipairs ip} (ip tbl ...)
       _ (ipairs* tbl ...)))
   ```

   However, it’s not practical to intrude into Lua’s standard library in such a way. ↩︎