Implementing dynamic scope for Fennel and Lua
I’m continuing my work on fennel-cljlib, my port of clojure.core
and some other core libraries, focusing on porting missing functions and features to it.
One such feature, which I sometimes miss in Lua and Fennel, is dynamic binding.
The Lua VM doesn’t provide dynamic scoping as a language feature, and Fennel itself doesn’t introduce any concepts like Clojure’s Var.
However, we can still implement dynamic scoping that works similarly to Clojure and other Lisps using the debug library.
Most of the ideas are based on information from the guides at leafo.net.
There’s even a “Dynamic scoping in Lua” guide that implements a slightly different version of this feature, requiring variables to be referenced by name via a dynamic("var_name") call.
While this approach is feasible, I wanted something more in line with how other Lisps work, so let’s explore advancing it further.
Luckily for us, Leafo already has all the necessary guides!
But first things first. I wanted to delay working on dynamic scoping as much as possible because it is a feature that’s hard to get right. I already have some experience with implementing dynamic scoping for one of my older libraries that implemented a condition system from Common Lisp in Fennel. This library, however, required special syntax to access all of the dynamically bound symbols and thus did not actually require anything fancy for it to work.
So what does dynamic binding/scoping mean in a language? If you know about lexical and dynamic scoping and wish to skip this tangent, feel free to do so.
In short, lexically scoped variables exist only where their lexical scope allows them to. For example, a variable defined in a block of code will only exist in that block because it is its lexical scope.
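As a minimal sketch of lexical scope in Lua (the variable name `secret` is made up for illustration), a local declared inside a `do` block is invisible once the block ends:

```lua
do
  local secret = 42
  print(secret) --> 42
end
-- Outside the block, `secret` is looked up as a global, which is unset.
print(secret) --> nil
```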
When working with languages that have higher-order functions, I often find myself in a situation where I want to refactor some code that uses anonymous functions by moving them out and giving them a name. Sometimes it’s possible; sometimes it’s not.
For example, imagine I wanted to move this function out of map:
(fn some-func [messages]
  (let [extra-data (other-func)]
    (map
     (fn [message]
       (do-stuff message extra-data))
     messages)))
If I were to do so, we would have a problem:
(fn process-message [message]
  ;; oops, extra-data is now an unknown variable
  (do-stuff message extra-data))

(fn some-func [messages]
  (let [extra-data (other-func)]
    (map process-message messages)))
So extra-data is bound lexically, and if we look at the lexical scope of the process-message function, we’ll see that it tries to use extra-data where it’s not defined.
If extra-data were a global variable, it wouldn’t be problematic, but it is a local variable with a lexical scope.
We could move the entire let that binds extra-data to the result of calling other-func, but let’s say we don’t want to call it on each iteration of map because it’s slow and will do the same work repeatedly.
So, what are our options here?
Well, we can make it a closure!
(fn make-message-processor [extra-data]
  (fn process-message [message]
    (do-stuff message extra-data)))

(fn some-func [messages]
  (let [extra-data (other-func)]
    (map (make-message-processor extra-data) messages)))
Now, we pass extra-data only once to the make-message-processor function, and it returns a function that has this variable stored in a closure.
However, this is still a lexical scope because, as you can see, extra-data is present there.
In a language that uses dynamic scoping, this could be a whole different story. Let’s look at Clojure; although I wouldn’t recommend it, it is possible to do it this way:
(defn process-message [message]
  (do-stuff message extra-data))

(defn some-func [messages]
  (binding [extra-data (other-func)]
    (mapv process-message messages)))
Here, I assume that extra-data is a dynamic variable that obeys the rules of dynamic scoping.
The binding call introduces a dynamic scope within which extra-data is set to the value of (other-func).
It acts more like a scoped global variable, or at least you can think of it that way.
To introduce a dynamic scope, Clojure uses binding.
Let’s look at it briefly:
(defmacro binding
  "binding => var-symbol init-expr
  Creates new bindings for the (already-existing) vars, with the
  supplied initial values, executes the exprs in an implicit do, then
  re-establishes the bindings that existed before. The new bindings
  are made in parallel (unlike let); all init-exprs are evaluated
  before the vars are bound to their new values."
  {:added "1.0"}
  [bindings & body]
  (assert-args
    (vector? bindings) "a vector for its binding"
    (even? (count bindings)) "an even number of forms in binding vector")
  (let [var-ize (fn [var-vals]
                  (loop [ret [] vvs (seq var-vals)]
                    (if vvs
                      (recur (conj (conj ret `(var ~(first vvs))) (second vvs))
                             (next (next vvs)))
                      (seq ret))))]
    `(let []
       (push-thread-bindings (hash-map ~@(var-ize bindings)))
       (try
         ~@body
         (finally
           (pop-thread-bindings))))))
It’s a simple idea: a try block without catch clauses, only a finally clause.
Before we enter the try block, we set all mentioned variables to their new values, and after we’re done with the body, we restore the previous values.
In Clojure, dynamic bindings work really well, but this is due to a combination of factors.
First, the try support in the JVM is excellent, ensuring that finally will perform its intended function.
Additionally, the JVM supports thread-local bindings, so even in a multithreaded context, binding still works.
Finally, heh, Clojure has Vars, which makes it all possible.
Now, let’s try to implement the same concept in Fennel!
Implementing dynamic scope in Fennel
Before I descend into madness, I would say that we could do the same thing as in Clojure: set some variables, run code in a protected call, and reset the variables afterward. While this approach would indeed work, I wanted to tidy up my understanding of function environments in Lua. It’s a neat concept that Lua and a few other languages have, but Lua is one of the few languages that actually allows users to manipulate function environments. So, let’s explore this idea.
First, we need a way to forcefully set a function’s environment.
This could be done in Lua 5.1 via setfenv; however, it was removed in Lua 5.2.
For newer versions, it can be implemented like this:
(local setfenv
  (or _G.setfenv
      (fn setfenv [f env i]
        (let [i (or i 1)]
          (case (debug.getupvalue f i)
            :_ENV (doto f (debug.upvaluejoin i (fn [] env) 1))
            nil f
            _ (setfenv f env (+ i 1)))))))
Now, we can set an environment for any function.
But what exactly is this function environment? I’ve realized that I never explained that, so here we go.
In Lua, the environment is a table that maps the names of global variables to their values.
Starting from Lua 5.2, the environment is represented by a variable called _ENV, which is what we’re testing for in setfenv above.
By default, _ENV has the same value as _G, a table that contains all global variables.
However, we can change the function’s environment by modifying the value of _ENV.
For instance, in Lua, we can set _ENV to a table, and all global definitions would end up in that table:
a = 0
local function f (t)
  _ENV = t
  a = 42
  b = 322
  return t
end
local t = {}
f(t)
print(a, b) -- 0, nil
print(t.a, t.b) -- 42, 322
This is a cool feature, and we can actually use environments for sandboxing code, but that’s a story for another time. Let’s return to dynamic scoping.
Looking at this, you might get the idea that if we change the function’s environment and set our dynamic variables in it specifically, once we leave the lexical scope of that _ENV, all changes revert to normal because they never happened in a global environment!
Unfortunately, it’s not that simple.
Yes, we can change the function’s environment, but it will only affect that specific function.
Moreover, this change is permanent, meaning that we’ll have to reset the function back to its original environment.
So, it’s not as straightforward as just changing _ENV around the code we want to run.
Of course, we could also write getfenv, wrap the entire thing in a pcall, and safely restore the environment once the work is done.
However, it’s not enough to set the environment of just the function we’re calling.
_ENV is stored as an upvalue in each closure, so we’d need to change all of the functions called by the function we wish to invoke with a custom environment.
This makes undoing changes trickier to implement.
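To make the limitation concrete, here is a hedged Lua 5.2+ sketch. `setfenv_compat` is a hypothetical Lua port of the shim above, and `inner` and `outer` are made-up names. Only the function whose environment was swapped sees the new value; the function it calls still reads the real global:

```lua
-- Assumes Lua 5.2+, where _ENV is an ordinary upvalue.
local function setfenv_compat(f, env)
  local i = 1
  while true do
    local name = debug.getupvalue(f, i)
    if name == "_ENV" then
      -- Point f's _ENV slot at a fresh upvalue that holds `env`.
      debug.upvaluejoin(f, i, function() return env end, 1)
      return f
    elseif name == nil then
      return f -- f uses no globals, so there is nothing to swap
    end
    i = i + 1
  end
end

x = "global"
local function inner() return x end
local function outer() return x, inner() end

setfenv_compat(outer, {x = "shadowed"})
print(outer()) --> shadowed    global
```

The swap also sticks: every later call to `outer` keeps using the replacement table until its environment is set back explicitly.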
Luckily for us, we can bypass the need to roll back the changes to the function’s environment completely! Instead, we can simply clone the function and set its environment as we wish! Here’s an implementation:
(fn clone-function-with-env [f env]
  "Recursively clones the function `f`, and any subsequent functions that it
might call via upvalues. Sets `env` as environment for the cloned function."
  (let [dumped (string.dump f)
        cloned (load dumped)]
    (var (done? i) (values false 1))
    (while (not done?)
      (case (debug.getupvalue f i)
        (where (name val) (= :function (type val)))
        (let [subf (clone-function-with-env val env)]
          (debug.setupvalue cloned i subf))
        name
        (debug.upvaluejoin cloned i f i)
        nil (set done? true))
      (set i (+ i 1)))
    (setfenv cloned env)))
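The reason every upvalue has to be re-attached is that a function rebuilt from `string.dump` gets fresh upvalues of its own. A minimal sketch of that single step, assuming Lua 5.2+ (the `greet` function is made up for illustration):

```lua
local greeting = "hello"
local function greet(name)
  return greeting .. ", " .. name
end

-- Rebuild the function from its bytecode; the copy has its own,
-- fresh upvalue slots and no longer sees `greeting`.
local clone = load(string.dump(greet))

-- Share greet's first upvalue (`greeting`) with the clone.
debug.upvaluejoin(clone, 1, greet, 1)
print(clone("Lua")) --> hello, Lua
```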
Finally, we write the function that will call a given function f in a context where the given bindings are dynamically bound:
(fn dynamic-call [bindings f ...]
  "Calls `f` with `bindings` as its root environment."
  (let [new-env (setmetatable bindings {:__index _ENV})
        f* (clone-function-with-env f new-env)]
    (f* ...)))
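The metatable is what makes the new bindings shadow globals instead of replacing them. A small Lua sketch of the same lookup chain (the variable names are illustrative, and `_G` stands in for the `_ENV` used above):

```lua
bar = 73
local env = setmetatable({foo = 42}, {__index = _G})

-- Reads fall through to the globals when `env` has no such key.
print(env.foo, env.bar)  --> 42    73
-- Writes land in `env`, so the real globals stay untouched.
env.baz = 1
print(rawget(_G, "baz")) --> nil
```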
There’s also a convenience macro for using it like a let, but with dynamic binding:
(macro binding [bindings ...]
  (assert-compile (sequence? bindings) "expected a sequence of bindings" bindings)
  (assert-compile
   (= 0 (% (length bindings) 2))
   "expected an even number of forms in binding sequence"
   bindings)
  `(dynamic-call
    ,(faccumulate [res {} i 1 (length bindings) 2]
       (doto res
         (tset (tostring (. bindings i)) (. bindings (+ i 1)))))
    (fn [] ,...)))
Now we can use dynamic binding in Fennel!
Usage example
To illustrate, let’s create some variables that we wish to treat dynamically:
(global foo 21)
(global bar 73)
(print foo bar) ;; 21 73
With globals in place, we can try our binding macro:
(binding [foo 42]
  (print foo bar))
;; 42 73
As can be seen, foo no longer refers to 21, but now it is 42.
However, if we try to print foo outside of binding’s scope, we will again get 21:
(binding [foo 42]
  (print foo))
;; 42
(print foo)
;; 21
A keen reader would mention that this example is not so different from using a plain let:
(let [foo 42]
  (print foo))
;; 42
(print foo)
;; 21
And you’d be right! However, where it’s going to be different is when we put functions into the mix:
(fn f []
  (print "f:" foo bar))
Now, if we were to call it inside of let, the bindings introduced by it won’t affect the function, because neither foo nor bar is lexically present in f:
(let [foo 42
      bar 1337]
  (f))
;; still prints:
;; f: 21 73
This is where binding jumps in.
Instead of following lexical binding rules, which are natural for most languages, we now introduce dynamic binding of foo and bar:
(binding [foo 42
          bar 1337]
  (f))
;; prints:
;; f: 42 1337
(f)
;; prints:
;; f: 21 73
And, as can be seen above, inside binding’s scope, f sees foo as 42 and bar as 1337, while outside of it, the values are still 21 and 73.
So, we never actually changed the values of foo and bar; they’re still 21 and 73, respectively.
Instead, in the scope of binding, we changed how f accesses these variables.
This also works with functions that call other functions:
(fn f []
  (print "f:" foo))

(fn g []
  (f)
  (print "g:" bar))

(fn h []
  ((fn [] (print "h:" foo bar))))

(binding [foo 42
          bar 322]
  (f)
  ;; prints:
  ;; f: 42
  (g)
  ;; prints:
  ;; f: 42
  ;; g: 322
  (h)
  ;; prints:
  ;; h: 42 322
  )
That’s pretty much it! This approach has a lot of flaws though.
First, it will only work on ordinary functions, so neither tricks with the __call metamethod nor native functions are supported.
Second, it won’t work with coroutines either.
You can’t use string.dump on something like coroutine.resume directly, so we won’t be able to do (dynamic-call {:foo 42} coroutine.resume some-coroutine).
It won’t even work if we wrap coroutine.resume into an anonymous function like (dynamic-call {:foo 42} (fn [] (coroutine.resume coro))), because coro here, while being an upvalue, is not a function, so we can’t clone it.
Finally, it relies on the debug library and recursive function dumping, which itself is already pretty crazy.
There are probably more things that can go wrong with this.
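The first two limitations come down to `string.dump` accepting only functions written in Lua. A quick Lua check, assuming the standard library where `print` is a native function (the `callable` table is made up for illustration):

```lua
-- string.dump raises an error for anything that is not a Lua function,
-- so pcall is used here to observe the failure safely.
local ok_native = pcall(string.dump, print)          -- native function
local callable = setmetatable({}, {__call = function() return 1 end})
local ok_callable = pcall(string.dump, callable)     -- table with __call
local ok_lua = pcall(string.dump, function() end)    -- plain Lua function

print(ok_native, ok_callable, ok_lua) --> false    false    true
```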
So why not just set the globals temporarily?
First of all, yes, we could just set the globals temporarily, and restore their values later:
(fn set-globals [globals]
  (collect [name new-val (pairs globals)]
    (let [old-val (. _G name)]
      (set (. _G name) new-val)
      (values name old-val))))

(fn close-handler [old-vals ok? ...]
  (each [name val (pairs old-vals)]
    (set (. _G name) val))
  (if ok?
      ...
      (error ... 0)))

(fn call-with-temp-globals [globals f ...]
  (-> globals
      set-globals
      (close-handler (pcall f ...))))
(call-with-temp-globals {:foo 123 :bar 456} g)
;; prints:
;; f: 123
;; g: 456
(g)
;; prints:
;; f: 21
;; g: 73
While this works, I don’t like the idea that we’re actually changing the values instead of shadowing them in the environment, though this is more akin to the original Clojure implementation.
Since Lua is single-threaded, this should not be problematic; however, I think it can still mess things up if we were to introduce some kind of asynchronous scheduler, like the one in my async.fnl library.
This also messes up stacktraces in cases where an error occurs as a result of calling f because you can’t re-throw errors in Lua like in other languages.
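One possible mitigation, sketched here in Lua 5.2+ and not taken from the original implementation, is to capture the traceback with `xpcall` and `debug.traceback` before the stack unwinds. The function name mirrors the Fennel version above but is otherwise hypothetical:

```lua
-- Assumes Lua 5.2+ (xpcall with extra arguments, table.pack/unpack).
local function call_with_temp_globals(globals, f, ...)
  local old = {}
  for name, value in pairs(globals) do
    old[name] = _G[name]
    _G[name] = value
  end
  -- The message handler runs before the stack unwinds, so
  -- debug.traceback can still see the frames inside f.
  local results = table.pack(xpcall(f, debug.traceback, ...))
  -- Iterating over `globals` also resets names that were nil before.
  for name in pairs(globals) do
    _G[name] = old[name]
  end
  if not results[1] then
    error(results[2], 0) -- message now carries the original traceback
  end
  return table.unpack(results, 2, results.n)
end
```

The error is still re-raised from a different place, but for string errors the message itself now contains the traceback of the original failure.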
There’s also the potential to use the <close> marker introduced in Lua 5.4 to avoid pcall altogether:
local function set_globals(globals)
  local old_values = {}
  for name, new_val in pairs(globals) do
    old_values[name] = _G[name]
    _G[name] = new_val
  end
  return old_values
end

local function close_handler(old_values)
  for name, val in pairs(old_values) do
    _G[name] = val
  end
end

local function dynamic_call_close(globals, f, ...)
  local old_values <close> =
    setmetatable(set_globals(globals), {__close = close_handler})
  return f(...)
end
This would keep stack traces intact, and values would be restored right when we exit dynamic_call_close, but it will only work in Lua 5.4.
Thus, while doing it with pcall is more generic, I wanted to explore the environment approach first since Lua already provides a mechanism for working with function environments.
But for now, I think I’ll leave dynamic scoping out of cljlib, as I’m not really sold on any of the ways of doing it that I’ve come up with so far.