What if structural editing was a mistake?

Not so long ago I wrote about Paredit and its quirks. I’ve been happily using Smartparens ever since that post, but something still bugged me. I kept thinking about the fact that Smartparens has numerous quirks in various languages and some known bugs that are unlikely to be fixed in the foreseeable future, given that the main maintainer doesn’t have a lot of spare time. I’m not blaming anyone: maintaining open-source projects is hard, and unfortunately I can’t fix most of the bugs myself because I’m not familiar enough with the codebase. So I went searching for new developments in the structural editing space. I already knew about Parinfer, but unfortunately it’s not that great to use in a team, as it generates a lot of small formatting changes upon opening a file, which results in a lot of noise in Git afterward. What I was mainly interested in is parser-based structural editing - something like what Smartparens does, but with an actual parser, like Tree-sitter. There are a few such packages for Emacs, namely tree-edit and combobulate, but the selection of languages they support is very small. Having structural editing in Python is cool, I guess, but I don’t write Python, and since the languages I use are not yet supported, these packages are not an option, at least for me.

Shortly after that, I found a nice post about structural editing in vanilla Emacs¹. It describes how one can get some of the facilities of structural editing in Emacs without using Paredit, Smartparens, or any external packages for that matter. Another post on that blog mentions a particularly interesting Reddit thread, which has some opinions about the strict paren checking that Paredit and Smartparens introduce.

Which got me thinking - maybe structural editing really isn’t something I should pursue. The title says “mistake”, but that’s obviously clickbait. I don’t think structural editing is a mistake; it’s a pretty useful tool with its own set of compromises. Parser-based structural editing can lift some of those compromises and fix some bugs, but some languages may not be supported, or may have only partial support, which comes with a set of downsides of its own. I mostly want this for Lisps, and Emacs is already pretty good at inferring the structure of Lisp code, so for such languages a parser would probably be overkill. Still, I’d like to have everything structure-related in a single package, if possible. On the other hand, thinking about it right now, there aren’t that many features I need from a structural editing package that I can’t do by hand or automate. So maybe I just value those too much?

With that in mind, I’ve decided to try living some time without any structural editing package, only relying on what’s available in Emacs or what I’ve written myself. I’m feeling pretty confident in using structural editing by now - I often slurp, splice, and convolute expressions without thinking about what I need to do, it just comes off my fingertips. Taking this away will mess with my muscle memory a lot, and I’ll probably become less efficient, but maybe I’ll learn something new.

This post will be about the experience I had during this experiment. I’ve written this over time, and different sections were written pretty much right after I felt some difference.

Initial struggle

After disabling Smartparens I quickly understood the pain of writing Lisp without automatic parenthesis insertion and balance checking. I hadn’t seen balance-related errors for years, since I’ve pretty much always used something that tried its best to keep parentheses balanced. But now that nothing like that is enabled, I started getting them whenever I evaluated an expression or tried to compile a file. So the first thing I did was add check-parens to a buffer-local write-file-functions hook (the older local-write-file-hooks variable is obsolete):

(defun setup-before-save-hooks ()
  ;; Run `check-parens' before each save, in this buffer only.
  ;; It signals an error on unbalanced input and returns nil otherwise,
  ;; so a balanced buffer is saved as usual.
  (add-hook 'write-file-functions #'check-parens nil t))
(add-hook 'common-lisp-modes-mode-hook #'setup-before-save-hooks)

With this hook, I can at least immediately get a warning that I’ve messed up.

The common-lisp-modes-mode is a package I’ve created in my init file to enable common things across all Lisps. Such things included structural editing, automatic indentation on typing, and so forth. Most of these things are now disabled, but the mode is still quite useful, as can be seen above.

Another thing that bothered me is that without the Smartparens bindings I’ve lost some pretty handy things, such as, well, slurping and convoluting S-expressions. Slurping and barfing, though, are easy enough to implement, or even to do by hand.
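To give an idea of how little code forward slurping and barfing need, here is a minimal sketch built only on the built-in list-motion commands. The names my/forward-slurp and my/forward-barf are hypothetical, this is not the code from my configuration, and it assumes single-character delimiters and point somewhere inside the list, outside of strings and comments.

```elisp
(defun my/forward-slurp ()
  "Extend the enclosing list to include the expression that follows it."
  (interactive)
  (save-excursion
    (up-list)                          ; just after the closing delimiter
    (let ((close-pos (point))
          (close-char (char-before)))
      (forward-sexp)                   ; scan error if the parent list ends here
      (insert close-char)              ; new delimiter after the slurped sexp
      (goto-char close-pos)
      (delete-char -1))))              ; remove the old delimiter

(defun my/forward-barf ()
  "Push the last expression of the enclosing list out of it."
  (interactive)
  (save-excursion
    (up-list)
    (let ((close-char (char-before))
          ;; Marker sitting right before the closing delimiter; it moves
          ;; along when text is inserted earlier in the buffer.
          (old-close (copy-marker (1- (point)))))
      (goto-char old-close)
      (backward-sexp)                  ; scan error on an empty list, before any edit
      (skip-chars-backward " \t\n")
      (insert close-char)              ; new delimiter before the barfed sexp
      (goto-char old-close)
      (delete-char 1)                  ; remove the old delimiter
      (set-marker old-close nil))))
```

With point inside (foo bar) baz, my/forward-slurp gives (foo bar baz), and my/forward-barf turns it back into (foo bar) baz. Both functions find their target positions before modifying the buffer, so a scan error leaves the text untouched.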

Another small annoyance is that since I don’t use transient mark mode², I now have to use a numeric prefix argument in even more cases. For example:

Wrapping the next expression in parentheses:
- Smartparens: M-(
- Vanilla, transient-mark-mode on: C-M-SPC, M-(
- Vanilla, transient-mark-mode off: M-1, M-(

Raising an S-expression and all following expressions:
- Smartparens: M-<up>
- Vanilla, transient-mark-mode on: repeat C-M-SPC till the last expression, M-x raise-sexp RET
- Vanilla, transient-mark-mode off: count expressions by hand, M-<N> where <N> is the number of expressions, M-x raise-sexp RET

Killing till the end of the expression:
- Smartparens: C-k
- Vanilla, transient-mark-mode on: repeat C-M-SPC till the last expression, C-w
- Vanilla, transient-mark-mode off: count expressions, M-<N>, C-M-k

As can be seen, some quality-of-life improvements are definitely lost. This is of course because, without any guarantee that the input is balanced, it’s hard to automate things. In strict mode, Smartparens takes great care to keep expressions balanced, so things like killing till the end of the sexp always work as expected. Without strict mode, we need to manually count or select the needed number of expressions before pressing C-M-k. I’m kinda OK with that, but it’s still a bit tricky for me.
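The counting can be avoided when the enclosing list happens to be balanced, because up-list can find its end for us. Below is a rough sketch of two such helpers. The names my/kill-to-end-of-list and my/raise-rest-of-list are made up for illustration, and both assume single-character delimiters. If the surrounding code is unbalanced, up-list signals an error before anything is changed.

```elisp
(defun my/kill-to-end-of-list ()
  "Kill from point up to, but not including, the enclosing closing delimiter."
  (interactive)
  (let ((end (save-excursion
               (up-list)            ; just after the closing delimiter
               (1- (point)))))      ; step back so the delimiter is kept
    (kill-region (point) end)))

(defun my/raise-rest-of-list ()
  "Replace the enclosing list with everything from point to its end."
  (interactive)
  (let* ((end (save-excursion (up-list) (1- (point))))
         (text (buffer-substring (point) end)))
    (backward-up-list)                              ; onto the opening delimiter
    (delete-region (point) (scan-sexps (point) 1))  ; remove the whole list
    (save-excursion (insert text))))                ; tail goes back, point before it
```

With point before bar in (foo bar baz), the first command leaves (foo ) and puts bar baz on the kill ring. With point before c in (a (b c d) e), the second one produces (a c d e).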

Adoption period

After some time I noticed that I was no longer fighting my habits, and unbalanced code was not as much of an issue as it initially was. I’ve definitely started using more numeric arguments. But some things I still had to implement.

One such thing was a function like sp-rewrap-sexp, which I’ve mapped to M-r. With it, I can quickly wrap something like █foo bar (█ marks the point) in any kind of delimiters: M-2 M-( M-r [ produces [foo bar]. It also escapes strings automatically, so █foo "bar" baz followed by M-3 M-( M-r " becomes █"foo \"bar\" baz".
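The core of such a rewrap command is small. The sketch below is a simplified, hypothetical version (my/rewrap-sexp is not the function from my configuration): it only swaps between (), [] and {}, and leaves out the string case with its escaping entirely.

```elisp
(defun my/rewrap-sexp (open)
  "Replace the delimiters of the enclosing list with OPEN and its closer."
  (interactive "cRewrap with: ")
  (let ((close (or (cdr (assq open '((?\( . ?\)) (?\[ . ?\]) (?\{ . ?\}))))
                   (user-error "Unsupported delimiter: %c" open))))
    (save-excursion
      (backward-up-list)                    ; onto the opening delimiter
      (let ((start (point))
            (end (scan-sexps (point) 1)))   ; just after the closing delimiter
        ;; Replace the closing delimiter first, so START stays valid.
        (goto-char end)
        (delete-char -1)
        (insert close)
        (goto-char start)
        (delete-char 1)
        (insert open)))))
```

Called as (my/rewrap-sexp ?\[) with point inside (foo bar), it rewrites the list to [foo bar]. Both delimiters are located before the first change, so an unbalanced list results in a scan error rather than a half-finished edit.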

Slurping and barfing don’t seem to be going away any time soon, but I’ve actually never found a use case for the backward versions of these commands. Maybe that’s because of the prefix nature of Lisps: arguments are more often appended to a list than prepended. So there are no such variants in my small structural-mode package.

Thoughts on structural editing in general

I think a lot of people have been misled by Paredit’s notion of structural editing. Think of it like this - most such packages infer the structure from the code, not the other way around. Lisps are naturally good at this, because all you need to do to map code to its structure is keep the parentheses balanced. This doesn’t work for other languages, as their structure almost always has nothing to do with their syntax. So a structural editor for, say, Python, shouldn’t be built on top of non-structural, syntax-inferred information to begin with.

My point is that we need a way to infer the structure from existing code only once, and then work only with that structure, not with the code. But right now, most structural editing packages that integrate with an editor are based on constantly re-inferring the structure, either via an external parser or via very clever use of regular expressions with a custom parser built around them. Again, given that Lisps are written directly in the data structure they operate on, you can pretty much infer the structure on every keystroke; the only requirement is that the parentheses are balanced. This will not work as smoothly for other languages. Projects like Tree-sitter can narrow the gap, but they are not a solution to the problem, because it is still inference.
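For Lisp buffers, this kind of on-demand inference is what the built-in syntax-ppss function provides: it parses from a cached position up to point using the syntax table and reports the context there. A small sketch follows. The wrapper name my/structure-at-point and the shape of the returned plist are my own choices for illustration.

```elisp
(defun my/structure-at-point ()
  "Return a plist describing the syntactic context at point."
  (let ((state (syntax-ppss)))
    (list :depth (nth 0 state)        ; number of lists enclosing point
          :list-start (nth 1 state)   ; innermost opening delimiter, or nil
          :in-string (and (nth 3 state) t)
          :in-comment (and (nth 4 state) t))))
```

With point before c in (a (b c)), this reports a depth of 2 and a list start at position 4. Nothing is stored between calls except the parser’s cache, which is the constant re-inference described above.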

I know of some projects that implement this idea in a way that’s closer to what I described at the start of the previous paragraph. One such example is Vlojure. Note that there’s no actual code, only its structure, represented as circles. The fact that you can freely move these circles around, copy them, descend into them, and so on makes it a real structure-based editor. Underneath, the code is generated from these structures, and such code can be stored in a file and converted back into this form. But working with circles alone may be a bit hard, so people have made hybrid editors that still use code as a basis but put it into structural blocks that can be arbitrarily composed. However, to my knowledge, no such system has seen significant adoption, and the only one that may be considered a success is Scratch. Yes, I consider Scratch’s IDE a structural editor, because you can manipulate the code structure directly. Of course, there are more structural editors out there, but I think they are considered experiments.

Such editors come with one obvious downside: they’re limited and require specific tooling. By specific tooling, I mean things like diffing, code search, etc. - all of that needs to be built separately for the specific structure implementation the editor uses. As a result, they impose a certain workflow you must follow, because there is no other way of doing things in such editors.

This is similar to the strictness that Paredit and Smartparens rely on to provide their features, and perhaps that’s why a lot of people consider Paredit a package for structural editing. Making other languages’ syntax strict is quite hard, so error-tolerant parsers like Tree-sitter will work, but the moment the structure is broken there’s no guarantee of correctness, and that’s a problem. So maybe structural editing was a mistake?

Certainly not. When it works, it really works. Emacs has a lot of built-in stuff for structure-based navigation with its thingatpt library, so it would be a waste not to use it, and things like Paredit exist for a reason. And sometimes, leveraging language-specific tooling can work pretty well, as demonstrated by this package for OCaml. Parinfer is another beast here. It is Lisp-exclusive structure inference taken to the next level, as it both infers structure and maintains strictness at the same time. I actually think that Parinfer is the best structural editing package for Lisps, because it’s transparent and doesn’t require strict balance, although it does require correct indentation. And you can use all the structure-based edits like slurping and barfing with it with no problem. So this idea as a whole still needs exploration, in my opinion.

Ideally, I’d like a less buggy and more robust version of Smartparens, but it worked pretty well for my use cases, and for a lot of other people’s, so by all means use what works best for you. That’s all I have on this topic, but maybe I’ll return to it in the future once I decide whether to give Smartparens or Paredit another go.

¹ I guess we’ve come full circle :)
² You can use both methods regardless of whether transient-mark-mode is enabled, though you may need to temporarily enable it with C-SPC C-SPC if it’s disabled.
