Andrey Listopadov

Compiling Clojure projects in Emacs - Jumping into dependencies

In the previous post on the subject, I described how to create a custom compilation mode for any language. In the "Dynamically extracting filenames from compiler output" section, I talked about various issues with how Clojure reports problem locations. The main one arises when the problem is inside a dependency: whether it’s your own library or a third-party one, it’s equally tedious to go and look into it, because the dependency is usually a jar file somewhere in the ~/.m2 directory. Worse, the error message doesn’t tell you which dependency to look into; it just uses a path to the file as if it were a file in the project. We can always ignore such files, but that’s not great, as we’re effectively closing our eyes to existing problems in our codebase.

Since the previous post I’ve tweaked the function that gets the filename for the given compilation output line - now it can find files in the project’s dependencies as well. It’s still not entirely reliable, as theoretically there might be some name clashes, but I have yet to see any actual problems with this approach. This approach involves dealing with the classpath.

Updated define-project-compilation-mode macro

I had to update the macro I’m using to generate these helper modes. The main change is that the supplementary compile-add-error-syntax function now adds a HIGHLIGHT element to the entry in compilation-error-regexp-alist-alist, so the face is applied only to the given match group (in the rules below, the same group as the hyperlink). The face is chosen automatically based on the level:

(cl-defun compile-add-error-syntax
    (mode name regexp &key file line col (level 'error) hyperlink highlight)
  "Register new compilation error syntax.

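Add NAME symbol to `compilation-error-regexp-alist', and then add
REGEXP FILE LINE and optional COL LEVEL HYPERLINK HIGHLIGHT info
to `compilation-error-regexp-alist-alist'.  When MODE is non-nil,
the MODE-prefixed variants of these variables are used instead."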
  (or file (error "Missing value for :file keyword"))
  (or line (error "Missing value for :line keyword"))
  (let ((faces '(compilation-info-face
                 compilation-warning-face
                 compilation-error-face))
        (level (cond ((eq level 'info) 0)
                     ((eq level 'warn) 1)
                     ((eq level 'error) 2)
                     (t (error "Unsupported level type: %S" level))))
        (mode (symbol-name (or mode 'compilation))))
    (add-to-list (intern (concat mode "-error-regexp-alist")) name)
    (add-to-list (intern (concat mode "-error-regexp-alist-alist"))
                 (append (list name regexp file line col level hyperlink)
                         ;; Add a HIGHLIGHT entry only when a group is given.
                         (when highlight
                           (list (list highlight (nth level faces))))))))

Changes to the macro itself are quite small:

(defmacro define-project-compilation-mode (base-name &rest body)
  (declare (indent 1))
  (let* ((name (symbol-name base-name))
         (doc-name (capitalize (replace-regexp-in-string "-compilation$" "" name)))
         (current-project-root (intern (concat name "-current-project")))
         (current-project-files (intern (concat name "-current-project-files")))
         (compilation-mode-name (intern (concat name "-mode"))))
    `(progn
       (defvar ,(intern (concat name "-error-regexp-alist")) nil
         ,(concat "Alist that specifies how to match errors in " doc-name " compiler output.
See `compilation-error-regexp-alist' for more information."))
       (defvar ,(intern (concat name "-error-regexp-alist-alist")) nil
         ,(concat "Alist of values for `" (downcase doc-name) "-compilation-error-regexp-alist'.
See `compilation-error-regexp-alist-alist' for more information."))
       (defvar-local ,current-project-root nil
         ,(concat "Current root of the project being compiled.
Set automatically by the `" (symbol-name compilation-mode-name) "'."))
       (defvar-local ,current-project-files nil
         ,(concat "Current list of files belonging to the project being compiled.
Set automatically by the `" (symbol-name compilation-mode-name) "'."))
       (define-compilation-mode ,compilation-mode-name
         ,(concat doc-name " Compilation")
         ,(concat "Compilation mode for " doc-name " output.")
         (setq-local ,current-project-root (project-current t))
         (setq-local ,current-project-files (project-files ,current-project-root))
         ,@body)
       (provide ',compilation-mode-name))))

I’ve moved ,@body inside the call to define-compilation-mode as it really was a mistake in the previous version of the macro. This way we can extend the initialization step of the mode with additional expressions in the body.

Working with project’s classpath

First things first, we need to query our project for its classpath. This can be done with lein classpath, or with clojure -Spath if you’re using deps. For this post, I’m going to continue with lein, as it is what I use at work, and I’m unfamiliar with most deps commands. For example, I don’t know what the analog to lein check is.

The code below defines some functions that in the end will return a list of strings. These strings are paths to .jar archives being used by our project:

(defun clojure-compilation--split-classpath (classpath)
  "Split the CLASSPATH string."
  (split-string classpath ":" t "[[:space:]\n]+"))

(defun clojure-compilation--get-project-dependencies* (command _deps-file _mod-time)
  "Call COMMAND to obtain the classpath string.
DEPS-FILE and MOD-TIME are used for memoization."
  (thread-last
    command
    shell-command-to-string
    clojure-compilation--split-classpath
    (seq-filter (lambda (s) (string-suffix-p ".jar" s)))))

(fset 'clojure-compilation--get-project-dependencies-memo
      (memoize #'clojure-compilation--get-project-dependencies*))

(defun clojure-compilation--get-lein-project-dependencies (root)
  "Obtain classpath from lein for ROOT."
  (let* ((project-file (expand-file-name "project.clj" root))
         (mod-time (file-attribute-modification-time (file-attributes project-file))))
    (clojure-compilation--get-project-dependencies-memo
     "lein classpath" project-file mod-time)))

(defun clojure-compilation--get-deps-project-dependencies (root)
  "Obtain classpath from deps for ROOT."
  (let* ((project-file (expand-file-name "deps.edn" root))
         (mod-time (file-attribute-modification-time (file-attributes project-file))))
    (clojure-compilation--get-project-dependencies-memo
     "clojure -Spath" project-file mod-time)))

(defun clojure-compilation-get-project-dependencies (project)
  "Get dependencies of the given PROJECT.
Returns a list of all jar archives."
  (when (bound-and-true-p tramp-gvfs-enabled)
    (let ((root (project-root project)))
      (cond ((file-exists-p (expand-file-name "deps.edn" root))
             (clojure-compilation--get-deps-project-dependencies root))
            ((file-exists-p (expand-file-name "project.clj" root))
             (clojure-compilation--get-lein-project-dependencies root))))))
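
To make the shape of the data concrete, here is a minimal sketch of what the splitting and filtering steps do to a classpath string. The function name my/classpath-jars and the paths are made up for illustration; a real lein classpath output is much longer:

```elisp
(require 'seq)

(defun my/classpath-jars (classpath)
  "Return the .jar entries of the colon-separated CLASSPATH string.
Hypothetical helper that combines the split and filter steps above."
  (seq-filter (lambda (s) (string-suffix-p ".jar" s))
              ;; Same arguments as in `clojure-compilation--split-classpath':
              ;; split on \":\", drop empty strings, trim whitespace.
              (split-string classpath ":" t "[[:space:]\n]+")))

;; A made-up classpath with source directories, jars, and a trailing newline.
(my/classpath-jars
 "/proj/src:/proj/resources:/m2/instaparse-1.4.8.jar:/m2/fs-1.4.6.jar\n")
;; => ("/m2/instaparse-1.4.8.jar" "/m2/fs-1.4.6.jar")
```

Source directories are dropped at this stage because files from them are already covered by the list of project files.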

Using the define-project-compilation-mode macro that I’ve defined above we can create a clojure-compilation-mode:

(defvar-local clojure-compilation-project-deps nil
  "List of project's dependencies")

(defvar-local clojure-compilation-project-deps-mod-time nil
  "Accumulated modification time of all project's libraries")

(define-project-compilation-mode clojure-compilation
  (require 'tramp-gvfs)
  (setq-local clojure-compilation-project-deps
              (clojure-compilation-get-project-dependencies
               clojure-compilation-current-project))
  (setq-local clojure-compilation-project-deps-mod-time
              (seq-reduce #'+ (mapcar (lambda (f)
                                        (time-to-seconds
                                         (file-attribute-modification-time
                                          (file-attributes f))))
                                      clojure-compilation-project-deps)
                          0)))

Upon initializing, it will query lein (or deps) for the project’s classpath and store the result in the clojure-compilation-project-deps var. In addition, we store the accumulated modification time of all of our dependencies. Spinning up lein classpath every time the project is re-compiled is quite slow, and we only really need to do it when project.clj has changed, so I’m using a simple memoization function:

(defun memoize (fn)
  "Create a storage for FN's args.
Checks if FN was called with set args before.  If so, return the
value from the storage and don't call FN.  Otherwise calls FN,
and saves its result in the storage.  FN must be referentially
transparent."
  (let ((memo (make-hash-table :test 'equal)))
    (lambda (&rest args)
      (let ((value (gethash args memo)))
        (or value (puthash args (apply fn args) memo))))))
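
One thing to keep in mind: because the lookup goes through or, a nil result is indistinguishable from a missing entry, so calls that return nil are re-computed every time (for example, a lookup for a file that is not found in any jar). If that matters, a variant along these lines could cache nil as well. This is only a sketch, not the version used in this configuration; the name memoize-with-nil is made up, and like memoize itself it assumes lexical binding:

```elisp
;;; -*- lexical-binding: t -*-

(defun memoize-with-nil (fn)
  "Like `memoize', but also remember calls to FN that returned nil.
A hypothetical variant, shown only for illustration."
  (let ((memo (make-hash-table :test 'equal))
        ;; A fresh uninterned symbol marks a missing entry; FN can never
        ;; return this exact object.
        (missing (make-symbol "missing")))
    (lambda (&rest args)
      (let ((value (gethash args memo missing)))
        (if (eq value missing)
            (puthash args (apply fn args) memo)
          value)))))

;; Usage: the wrapped function runs once per distinct argument list,
;; even though it returns nil.
(let* ((calls 0)
       (f (memoize-with-nil (lambda (_x) (setq calls (1+ calls)) nil))))
  (funcall f 1)
  (funcall f 1)
  calls)
;; => 1
```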

We memoize the call to clojure-compilation--get-project-dependencies* by its command, filename, and the file’s modification timestamp. So, if the filename or timestamp changes, we re-compute the dependencies. Now we look at the function that is used in the compilation-error-regexp-alist-alist.

But before that, let’s define some rules:

(compile-add-error-syntax
 'clojure-compilation 'some-warning
 "^\\([^:[:space:]]+\\):\\([0-9]+\\) "
 :file #'clojure-compilation-filename
 :line 2 :level 'warn :hyperlink 1 :highlight 1)
(compile-add-error-syntax
 'clojure-compilation 'clj-kondo-warning
 "^\\(/[^:]+\\):\\([[:digit:]]+\\):\\([[:digit:]]+\\): warning"
 :file 1 :line 2 :col 3 :level 'warn :hyperlink 1 :highlight 1)
(compile-add-error-syntax
 'clojure-compilation 'clj-kondo-error
 "^\\(/[^:]+\\):\\([[:digit:]]+\\):\\([[:digit:]]+\\): error"
 :file 1 :line 2 :col 3 :hyperlink 1 :highlight 1)
(compile-add-error-syntax
 'clojure-compilation 'kaocha-tap
 "^not ok.*(\\([^:]*\\):\\([0-9]*\\))"
 :file #'clojure-compilation-filename
 :line 2 :hyperlink 1 :highlight 1)
(compile-add-error-syntax
 'clojure-compilation 'clojure-fail
 "^.*\\(?:FAIL\\|ERROR\\) in.*(\\([^:]*\\):\\([0-9]*\\))"
 :file #'clojure-compilation-filename
 :line 2 :hyperlink 1 :highlight 1)
(compile-add-error-syntax
 'clojure-compilation 'clojure-reflection-warning
 "^Reflection warning,[[:space:]]*\\([^:]+\\):\\([0-9]+\\):\\([0-9]+\\)"
 :file #'clojure-compilation-filename
 :line 2 :col 3
 :level 'warn :hyperlink 1 :highlight 1)
(compile-add-error-syntax
 'clojure-compilation 'clojure-performance-warning
 "^Performance warning,[[:space:]]*\\([^:]+\\):\\([0-9]+\\):\\([0-9]+\\)"
 :file #'clojure-compilation-filename
 :line 2 :col 3
 :level 'warn :hyperlink 1 :highlight 1)
(compile-add-error-syntax
 'clojure-compilation 'clojure-syntax-error
 "^Syntax error .* at (\\([^:]+\\):\\([0-9]+\\):\\([0-9]+\\))"
 :file #'clojure-compilation-filename
 :line 2 :col 3)
(compile-add-error-syntax
 'clojure-compilation 'kaocha-unit-error
 "^ERROR in unit (\\([^:]+\\):\\([0-9]+\\))"
 :file #'clojure-compilation-filename
 :line 2 :hyperlink 1 :highlight 1)
(compile-add-error-syntax
 'clojure-compilation 'eastwood-warning
 "^\\([^:[:space:]]+\\):\\([0-9]+\\):\\([0-9]+\\):"
 :file #'clojure-compilation-filename
 :line 2 :col 3 :level 'warn :hyperlink 1 :highlight 1)

These are the rules I’m using at work. It’s quite handy to be able to jump from the compilation buffer straight to the offending file and see if the problem is fixable. The clojure-compilation-filename function is defined as follows:

(defun clojure-compilation-filename ()
  "Function that gets filename from the error message.
If the filename comes from a dependency, try to guess the
dependency artifact based on the project's dependencies."
  (when-let ((filename (substring-no-properties (match-string 1))))
    (or (clojure-compilation--find-file-in-project filename)
        (when-let ((dep (clojure-compilation--find-dep filename)))
          (concat (expand-file-name dep) "/" filename)))))

It splits the task into two parts. First, it checks if the file is part of the project:

(defun clojure-compilation--find-file-in-project (file)
  "Check if FILE is part of the currently compiled project."
  (seq-find
   (lambda (s) (string-suffix-p file s))
   clojure-compilation-current-project-files))

It’s a rather simple filter of the clojure-compilation-current-project-files var we create when the clojure-compilation-mode starts.
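
Note that a plain suffix match can, in principle, pick the wrong file when one path happens to end with another. The sketch below shows such a clash and a stricter match that requires the reported path to start at a directory boundary. The helper name my/find-file-on-boundary and the file names are made up, and this is not part of the configuration described here:

```elisp
(require 'seq)

(defun my/find-file-on-boundary (file files)
  "Return the first element of FILES that ends with the path FILE.
Unlike a plain suffix match, FILE must start at a directory boundary.
Hypothetical helper, shown only for illustration."
  (seq-find (lambda (s)
              (or (string= file s)
                  (string-suffix-p (concat "/" file) s)))
            files))

(let ((files '("/proj/src/myapp/core.clj" "/proj/src/app/core.clj")))
  (list
   ;; Plain suffix match: \"myapp/core.clj\" also ends with \"app/core.clj\".
   (seq-find (lambda (s) (string-suffix-p "app/core.clj" s)) files)
   ;; Boundary-aware match: only the \"app\" directory qualifies.
   (my/find-file-on-boundary "app/core.clj" files)))
;; => ("/proj/src/myapp/core.clj" "/proj/src/app/core.clj")
```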

The second part is similar, but it tries to find a matching dependency. Unfortunately, knowing the file name doesn’t mean that we can derive the dependency from it, as the artifact doesn’t have to be named after the files it contains. So instead of guessing, we look inside the archives:

(defun clojure-compilation--file-exists-jar-p (jar file)
  "Check if FILE is present in the JAR archive."
  (with-temp-buffer
    (when (zerop (call-process "jar" nil (current-buffer) nil "-tf" jar))
      (goto-char (point-min))
      (save-match-data
        (re-search-forward (format "^%s$" (regexp-quote file)) nil t)))))

(defun clojure-compilation--find-dep* (file _project _deps-mod-time)
  "Find FILE in current project dependency list.
PROJECT and DEPS-MOD-TIME are used for memoizing the call."
  (when (not (string-empty-p file))
    (seq-find (lambda (d)
                (clojure-compilation--file-exists-jar-p d file))
              clojure-compilation-project-deps)))

(fset 'clojure-compilation--find-dep-memo
      (memoize #'clojure-compilation--find-dep*))

(defun clojure-compilation--find-dep (file)
  "Find FILE in current project dependency list."
  (clojure-compilation--find-dep-memo
   file
   clojure-compilation-current-project
   clojure-compilation-project-deps-mod-time))

There’s a lot going on, but the idea is basically the same as for the project’s dependencies. We memoize clojure-compilation--find-dep* by the file, current project, and accumulated modification time of all dependencies. If any of the dependencies changes we’ll re-compute the whole thing. Otherwise, if the file is present multiple times in the compilation output we will avoid searching for it multiple times thanks to the memoization.
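
The search inside the listing is worth a closer look: the file name is quoted with regexp-quote and anchored on both sides, so only a whole line of the jar -tf output counts as a match. Here is a small sketch of the same check that runs on a string instead of calling jar. The helper name my/listing-contains-p and the listing are made up for illustration:

```elisp
(defun my/listing-contains-p (listing file)
  "Return t if FILE is a full line of LISTING, nil otherwise.
LISTING stands in for the output of `jar -tf'.  Hypothetical helper."
  (with-temp-buffer
    (insert listing)
    (goto-char (point-min))
    (and (re-search-forward (format "^%s$" (regexp-quote file)) nil t)
         t)))

;; A made-up archive listing, one entry per line.
(let ((listing "META-INF/MANIFEST.MF\nme/raynes/fs.clj\nme/raynes/fs/compression.clj\n"))
  (list (my/listing-contains-p listing "me/raynes/fs.clj")  ; whole line
        (my/listing-contains-p listing "raynes/fs.clj")     ; only a suffix
        (my/listing-contains-p listing "me/raynes/fs")))    ; only a prefix
;; => (t nil nil)
```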

For example, here’s a log from lein check. I have omitted any lines that are related to the project itself, so we’re only looking at warnings inside the dependencies:

-*- mode: clojure-compilation; default-directory: "~/some/project/" -*-
Clojure Compilation started at Mon Oct  2 15:29:30

lein do clean, check
Reflection warning, me/raynes/fs.clj:517:42 - reference to field getName can't be resolved.
Reflection warning, clojure/data/xml.clj:337:17 - call to method createXMLStreamReader can't be resolved (target class is unknown).
Reflection warning, instaparse/util.clj:5:3 - call to java.lang.RuntimeException ctor can't be resolved.
Reflection warning, instaparse/util.clj:11:3 - call to java.lang.IllegalArgumentException ctor can't be resolved.
Reflection warning, ring/util/servlet.clj:88:24 - call to method write on javax.servlet.ServletOutputStream can't be resolved (argument types: unknown).
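
To see what the reflection warning rule extracts from such a line, the regexp can be tried on the first warning directly. This is a minimal sketch; the function name my/parse-reflection-warning is made up, while the regexp is the one from the clojure-reflection-warning rule:

```elisp
(defun my/parse-reflection-warning (line)
  "Return (FILE LINE COLUMN) parsed from a reflection warning LINE.
Return nil if LINE doesn't match.  Hypothetical helper."
  (let ((regexp "^Reflection warning,[[:space:]]*\\([^:]+\\):\\([0-9]+\\):\\([0-9]+\\)"))
    (when (string-match regexp line)
      (list (match-string 1 line)
            (string-to-number (match-string 2 line))
            (string-to-number (match-string 3 line))))))

(my/parse-reflection-warning
 "Reflection warning, me/raynes/fs.clj:517:42 - reference to field getName can't be resolved.")
;; => ("me/raynes/fs.clj" 517 42)
```

The first group is what clojure-compilation-filename receives via match-string, and it is a path relative to some classpath root rather than to the project.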

So, given a filename of me/raynes/fs.clj and a dependency list like:

(...
 "~/.m2/repository/org/clojure/data.xml/0.0.8/data.xml-0.0.8.jar"
 "~/.m2/repository/instaparse/instaparse/1.4.8/instaparse-1.4.8.jar"
 "~/.m2/repository/me/raynes/fs/1.4.6/fs-1.4.6.jar"
 "~/.m2/repository/ch/qos/logback/logback-core/1.2.11/logback-core-1.2.11.jar"
 "~/.m2/repository/ch/qos/logback/logback-classic/1.2.11/logback-classic-1.2.11.jar"
 "~/.m2/repository/ring-logger/ring-logger/1.1.1/ring-logger-1.1.1.jar"
 "~/.m2/repository/ring/ring-servlet/1.10.0/ring-servlet-1.10.0.jar"
 "~/.m2/repository/ring/ring-core/1.10.0/ring-core-1.10.0.jar"
 ...)

Because me/raynes/fs.clj is not part of the project, we have to go through every .jar archive, get its file listing, and check whether it contains this file. I’m not sure if it is possible for two archives in the classpath list to contain the same file, but it has never happened to me, so I assume this is a reliable enough way of doing it. Thus, the result of clojure-compilation-filename will be "~/.m2/repository/me/raynes/fs/1.4.6/fs-1.4.6.jar/me/raynes/fs.clj". But what should we do with it?

Emacs can actually open this kind of path with TRAMP if gvfs is present on the system, hence the check in the clojure-compilation-get-project-dependencies function. If gvfs is not available, there’s no point in analyzing dependencies, because without it Emacs can’t go inside such an archive. I’m not sure why that is: Emacs can open archives much like Dired opens directories, and we can then open files from these archives. It just has to be done in two steps, as there’s no special path syntax to open an archive and jump to a file inside it immediately. With gvfs, however, Emacs can mount the archive and jump to the me/raynes/fs.clj file inside of it in one go, thus highlighting the problem in the dependency:

;; ---8<---
(defn find-files
  "Find files matching given pattern."
  [path pattern]
  (find-files* path #(re-matches pattern (.getName %))))
;; ---8<---
Code Snippet 1: warning in the ~/.m2/repository/me/raynes/fs/1.4.6/fs-1.4.6.jar/me/raynes/fs.clj file.

The only downside of this approach is that, for some reason, Emacs takes a very long time to unmount these archives when I close it. This has nothing to do with the method itself, and is probably an issue with the gvfs support in general.

If gvfs isn’t available, it’s probably possible to use the jarchive package instead. However, the filename format returned by the clojure-compilation-filename function has to be changed to include the jar:file:// scheme and use ! as a separator:

(defun clojure-compilation-filename ()
;; ---8<---
    (when-let ((dep (clojure-compilation--find-dep filename)))
      (concat "jar:file://" (expand-file-name dep) "!" filename))
;; ---8<---
)

Unfortunately, I couldn’t make it work with jarchive, because for some reason Emacs transforms jar:file:///foo/bar to jar:file:/foo/bar right before a file is opened, and jarchive specifically looks for jar:file:/// as a prefix. Maybe there’s some kind of setting for that. If you know of one, let me know!

With everything in place, Clojure warnings are fully interactive from the compilation buffer. Even though Clojure is a language where we rarely use the edit-compile-check cycle, I find it tremendously useful to be able to call lein check and other tools like clj-kondo or eastwood. The same goes for the kaocha test runner.

Emacs is very configurable. I would say that the compilation-error-regexp-alist isn’t the most straightforward interface for configuring how errors are parsed, but it gets the job done and is very versatile. This is a general pattern in Emacs: many such configurations accept functions in arbitrary places, allowing users to extend the interface even more than is possible with just regular parameters. Because of that, Emacs is also an infinite time sink, and seeing how my configuration grows by the day even after using Emacs for many years is both inspiring and scary. Inspiring because Emacs shows that such a configurable system is possible. Scary because I don’t know if the process will ever stop.

I hope this post was useful and gave you an idea of how you can create custom handlers for the compilation buffer. The whole configuration for Clojure can be found here. Macros for defining language compilation modes are here. The required advice for the compilation-start function is available here. Let me know if you have any problems with this code, or ideas for possible improvements!