@nimaai
Last active February 4, 2024 21:37

Lisp macro pitfalls

Macro pitfalls

Summary of and excerpts from chapters 9 and 10 of On Lisp. Examples are mainly in Common Lisp.

Variable capture

Variable capture occurs when macroexpansion causes a name clash: when some symbol ends up referring to a variable from another context. Inadvertent variable capture can cause extremely subtle bugs.

Macro argument capture

(defmacro for ((var start stop) &body body) ; wrong
  `(do ((,var ,start (1+ ,var))
        (limit ,stop))
       ((> ,var limit))
     ,@body))

(macroexpand-1 '(for (limit 1 5) (princ limit)))

yields

(do ((limit 1 (1+ limit))
     (limit 5))
    ((> limit limit))
  (princ limit))

There is a name clash between a symbol local to the macro expansion and a symbol passed as an argument to the macro. The macroexpansion captures limit. It ends up occurring twice in the same do, which is illegal.

Free symbol capture

The macro definition itself contains a symbol which inadvertently refers to a binding in the environment where the macro is expanded.

(defvar w nil)

(defmacro gripe (warning) ; wrong
  `(progn (setq w (nconc w (list ,warning)))
          nil))

Someone else then wants to write a function sample-ratio, to return the ratio of the lengths of two lists. If either of the lists has fewer than two elements, the function is to return nil instead, also issuing a warning that it was called on a statistically insignificant case.

(defun sample-ratio (v w)
  (let ((vn (length v)) (wn (length w)))
    (if (or (< vn 2) (< wn 2))
        (gripe "sample < 2")
        (/ vn wn))))

Once the call to gripe is macroexpanded, this definition is equivalent to

(defun sample-ratio (v w)
  (let ((vn (length v)) (wn (length w)))
    (if (or (< vn 2) (< wn 2))
        (progn (setq w (nconc w (list "sample < 2")))
               nil)
        (/ vn wn))))

The problem here is that gripe is used in a context where w has its own local binding. The warning, instead of being saved in the global warning list, will be nconc'ed onto the end of one of the parameters of sample-ratio. Not only is the warning lost, but the list passed as w, which is probably used as data elsewhere in the program, will have an extraneous string appended to it.
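The effect can be observed directly. The following is a minimal sketch that repeats the definitions above and calls sample-ratio with a freshly consed one-element list. The variable lst and the helper demo-capture are assumptions made for this illustration.

```lisp
(defvar w nil)                          ; global warning list

(defmacro gripe (warning)               ; wrong: W is a free symbol
  `(progn (setq w (nconc w (list ,warning)))
          nil))

(defun sample-ratio (v w)
  (let ((vn (length v)) (wn (length w)))
    (if (or (< vn 2) (< wn 2))
        (gripe "sample < 2")            ; expands into code that uses the parameter W
        (/ vn wn))))

;; Hypothetical helper: returns the caller's list and the global W
;; after one call that triggers the warning.
(defun demo-capture ()
  (let ((lst (list 'b)))                ; fresh list, so modifying it is safe
    (sample-ratio nil lst)
    (list lst w)))

(demo-capture) ; => ((B "sample < 2") NIL)
```

The caller's list has grown by one string, while the global warning list is still empty.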

Solution

There are several solutions to the problem of variable capture:

  • obfuscation or "better names"
  • prior evaluation
  • temporary symbol creation aka gensyms
  • read-time uninterned symbol
  • packages / capture in other namespaces
  • hygienic transformation
  • literal objects
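As an illustration of the first item, here is a hedged sketch of gripe rewritten with a "better name". It assumes the usual Common Lisp convention of surrounding global variable names with asterisks; the name *warnings* and the helper demo-no-capture are chosen for this example. The convention makes an accidental clash with a local variable unlikely, but it does not rule one out.

```lisp
(defvar *warnings* nil)                 ; asterisks mark a global (special) variable

(defmacro gripe (warning)
  `(progn (setq *warnings* (nconc *warnings* (list ,warning)))
          nil))

(defun sample-ratio (v w)
  (let ((vn (length v)) (wn (length w)))
    (if (or (< vn 2) (< wn 2))
        (gripe "sample < 2")            ; no local variable is named *WARNINGS*
        (/ vn wn))))

;; Hypothetical helper: resets the warning list, triggers one warning,
;; and returns the caller's list together with the warning list.
(defun demo-no-capture ()
  (setq *warnings* nil)
  (let ((lst (list 'b)))
    (sample-ratio nil lst)
    (list lst *warnings*)))

(demo-no-capture) ; => ((B) ("sample < 2"))
```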

In Common Lisp and Clojure the most common way is to use gensym for every symbol newly introduced by the macro:

(defmacro for ((var start stop) &body body)
  (let ((gstop (gensym)))
    `(do ((,var ,start (1+ ,var))
          (,gstop ,stop))
         ((> ,var ,gstop))
       ,@body)))
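As a quick check, the gensym version can be called with limit as the loop variable, the case that broke the first definition. The sketch below repeats the macro so that it is self-contained. Capturing the output with with-output-to-string is only a convenience for this illustration.

```lisp
(defmacro for ((var start stop) &body body)
  (let ((gstop (gensym)))               ; fresh, uninterned symbol per expansion
    `(do ((,var ,start (1+ ,var))
          (,gstop ,stop))
         ((> ,var ,gstop))
       ,@body)))

;; LIMIT no longer clashes with any symbol introduced by the expansion.
(with-output-to-string (*standard-output*)
  (for (limit 1 5)
    (princ limit)))
;; => "12345"
```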

In Clojure there is even a handy shorthand, the auto-gensym: appending # to a symbol name inside a syntax-quoted form, e.g. new-var#, is equivalent to having written (let [new-var (gensym)] ...), and without having to unquote new-var in the macro body:

; Clojure example
(defmacro foo [bar]
  `(let [new-var# "Oh, big deal!"]
     (list new-var# ~bar)))

Other pitfalls

Multiple evaluation

In writing macros one must remember that the arguments to a macro are forms, not values. Depending on where they appear in the expansion, they could be evaluated more than once. One common solution is to bind a variable to the value returned by the particular form and refer to the variable further down in the expansion.

In the example below, the stop form is evaluated once and its value is bound to a variable named by a fresh gensym. Otherwise the stop form would appear inside the test (> ,var ,stop) and be evaluated again on every iteration.

(defmacro for ((var start stop) &body body)
  (let ((gstop (gensym)))
    `(do ((,var ,start (1+ ,var))
          (,gstop ,stop))
         ((> ,var ,gstop))
       ,@body)))

Unless they are clearly intended for iteration, macros should ensure that expressions are evaluated exactly as many times as they appear in the macro call.
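To make the difference visible, the following sketch counts how often the stop form is evaluated. The names for-multi, for-once, and the two counting helpers are hypothetical and exist only for this comparison; for-once is the gensym definition from above under a different name.

```lisp
(defmacro for-multi ((var start stop) &body body) ; wrong: STOP sits in the test
  `(do ((,var ,start (1+ ,var)))
       ((> ,var ,stop))
     ,@body))

(defmacro for-once ((var start stop) &body body)
  (let ((gstop (gensym)))
    `(do ((,var ,start (1+ ,var))
          (,gstop ,stop))               ; STOP evaluated once, at loop entry
         ((> ,var ,gstop))
       ,@body)))

;; Each helper loops from 1 to 3 with a stop form that counts its own evaluations.
(defun stop-evaluations-multi ()
  (let ((count 0))
    (for-multi (i 1 (progn (incf count) 3)))
    count))

(defun stop-evaluations-once ()
  (let ((count 0))
    (for-once (i 1 (progn (incf count) 3)))
    count))

(stop-evaluations-multi) ; => 4 (one evaluation per end test)
(stop-evaluations-once)  ; => 1
```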

Order of evaluation

In Common Lisp function calls, arguments are evaluated left-to-right and it is good practice for macros to do the same. Macros should usually ensure that expressions are evaluated in the same order that they appear in the macro call.

For an example of this case, refer to [chapter 10.2](http://www.bookshelf.jp/cgi-bin/goto.cgi?file=onlisp&node=Order+of+Evaluation).
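A small sketch of the problem, under the assumption that a version of for evaluates its stop form before its start form. The names for-reversed and demo-order are hypothetical.

```lisp
(defmacro for-reversed ((var start stop) &body body) ; wrong: STOP evaluated before START
  (let ((gstop (gensym)))
    `(do ((,gstop ,stop)
          (,var ,start (1+ ,var)))
         ((> ,var ,gstop))
       ,@body)))

;; The stop form has a side effect on X, which the start form reads.
(defun demo-order ()
  (with-output-to-string (*standard-output*)
    (let ((x 1))
      (for-reversed (i x (setq x 3))
        (princ i)))))

(demo-order) ; => "3"
```

A reader of the call would expect the loop to start at 1 and print 123. Because the stop form runs first, x is already 3 when the start form is evaluated.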

Non-functional expanders

Lisp expects code which generates macro expansions to be purely functional. Expander code should depend on nothing but the forms passed to it as arguments, and should not try to have an effect on the world except by returning values.

As a general rule, expander code shouldn't depend on anything except its arguments. So any macro which builds its expansion out of strings, for example, should be careful not to assume anything about what the package will be at the time of expansion. This concise but rather pathological example,

(defmacro string-call (opstring &rest args) ; wrong
  `(,(intern opstring) ,@args))

defines a macro which takes the print name of an operator and expands into a call to it:

> (defun our+ (x y) (+ x y))
OUR+
> (string-call "OUR+" 2 3)
5

The call to intern takes a string and returns the corresponding symbol. However, if we omit the optional package argument, it does so in the current package. The expansion will thus depend on the package at the time the expansion is generated, and unless our+ is visible in that package, the expansion will be a call to an unknown function.
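The dependence on the current package can be shown by expanding the same call with *package* rebound. This is a sketch: the helper expand-in and the variant string-call-pinned are hypothetical, and pinning to the home package of the macro's own name is only one possible choice, which assumes that symbol was first interned in the package where the macro is defined.

```lisp
(defmacro string-call (opstring &rest args) ; wrong: depends on *PACKAGE*
  `(,(intern opstring) ,@args))

;; Variant that names the package explicitly instead of using the current one.
(defmacro string-call-pinned (opstring &rest args)
  `(,(intern opstring (symbol-package 'string-call-pinned)) ,@args))

;; Hypothetical helper: expand FORM once while *PACKAGE* is the named package.
(defun expand-in (package-name form)
  (let ((*package* (find-package package-name)))
    (values (macroexpand-1 form))))

(expand-in :keyword '(string-call "OUR+" 2 3)) ; => (:OUR+ 2 3)
```

With the keyword package current, string-call expands into a call to :OUR+, which is not the function defined above. The expansion of string-call-pinned does not change with *package*.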

Recursion

In general, it's fine for macros to contain references to other macros, so long as expansion terminates somewhere. The trouble with a recursive macro such as nthb, a direct macro translation of a recursive nth function, is that every expansion contains a reference to the macro itself. The function version terminates because it recurses on a value that changes on each call. But macroexpansion only has access to forms, not to their values. When the compiler tries to macroexpand, say, (nthb x y), the first expansion will yield

(if (= x 0)
  (car y)
  (nthb (- x 1) (cdr y)))

which will in turn expand into:

(if (= x 0)
  (car y)
  (if (= (- x 1) 0)
    (car (cdr y))
    (nthb (- (- x 1) 1) (cdr (cdr y)))))

and so on into an infinite loop. It's fine for a macro to expand into a call to itself, just so long as it doesn't always do so.
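For reference, a definition of nthb consistent with the expansions shown above might look like the following sketch. A single macroexpand-1 step is safe to run. Compiling or evaluating an actual call to nthb is not, since full expansion never terminates.

```lisp
(defmacro nthb (n lst) ; wrong: every expansion contains another NTHB call
  `(if (= ,n 0)
       (car ,lst)
       (nthb (- ,n 1) (cdr ,lst))))

;; One expansion step only; the result still contains a call to NTHB.
(macroexpand-1 '(nthb x y))
;; => (IF (= X 0) (CAR Y) (NTHB (- X 1) (CDR Y)))
```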

Depending on what you need a macro for, you may find it sufficient to use instead a combination of macro and function. The first strategy is simply to make the macro expand into a call to a recursive function. If, for example, you need a macro only to save users the trouble of quoting arguments, then this approach should suffice.
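A minimal sketch of this first strategy, with illustrative names nth-fn and nthd. The recursion happens at run time inside the function, so the macro expands exactly once.

```lisp
;; Ordinary recursive function: terminates because N decreases at run time.
(defun nth-fn (n lst)
  (if (= n 0)
      (car lst)
      (nth-fn (- n 1) (cdr lst))))

;; The macro only expands into a call to the function.
(defmacro nthd (n lst)
  `(nth-fn ,n ,lst))

(nthd 2 '(a b c)) ; => C
```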

If you need a macro because you want its whole expansion to be inserted into the lexical environment of the macro call, then you would more likely want to define a local function.
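The second strategy could be sketched with labels, so that the recursive function is defined locally inside the expansion. The name nthe is illustrative. Note that the local function name nth-fn is visible to the argument forms, so a gensym would be needed for it if that capture mattered.

```lisp
(defmacro nthe (n lst)
  `(labels ((nth-fn (n lst)             ; local recursive function
              (if (= n 0)
                  (car lst)
                  (nth-fn (- n 1) (cdr lst)))))
     (nth-fn ,n ,lst)))                 ; arguments evaluated once, in order

(nthe 1 '(a b c)) ; => B
```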

Although you can't translate a recursive function directly into a macro, you can write a macro whose expansion is recursively generated. The expansion function of a macro is a regular Lisp function, and can of course be recursive.
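As a sketch of a recursively generated expansion, here is a simplified version of or. The names ora and or-expand are illustrative. The recursion runs over the list of argument forms at expansion time, so it terminates when the list is empty.

```lisp
;; Expansion function: recurses on the argument forms, not on run-time values.
;; With compile-file, it must be available at macroexpansion time
;; (for example by wrapping it in eval-when).
(defun or-expand (args)
  (if (null args)
      nil
      (let ((sym (gensym)))
        `(let ((,sym ,(car args)))      ; evaluate each form once
           (if ,sym
               ,sym
               ,(or-expand (cdr args)))))))

(defmacro ora (&rest args)
  (or-expand args))

(ora nil nil 'x) ; => X
```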

For related examples see chapter 10.4 of On Lisp.

@jefftrull

The link at the bottom points to a site that is no longer in existence, apparently (not 404 but the domain is for sale and just has ads)

@nimaai
Author

nimaai commented Sep 27, 2023

@jefftrull Thanks. Updated.
