I’d like to be able to write functions using pattern-matching, something like so:
(defun factorial (n)
(pattern-case n
(0 1)
(n (* n (factorial (1- n))))))
I had also considered the name dispatch-on-pattern
.
There’s no implementation for pattern-case
yet but let’s write a stub so the editor knows how to indent it and the compiler doesn’t complain.
(defmacro pattern-case (keyform &body cases))
Patterns are a statement about the structure and (possibly) content of an expression.
They also provide the names you want values to be bound to.
Imagine writing a sum
function, which takes a list and returns evaluates to the sum of all its members.
The savvy lisper would (defun sum (numbers) (apply #'+ numbers))
and be done with it, but that doesn’t suit our didactic purpose.
(defun sum (numbers)
(pattern-case numbers
(nil 0)
((n . ns) (+ n (sum ns)))))
Assuming there is a list to sum, we bind the first element to n
and the rest of them to ns
.
This is exactly like destructuring-bind
.
In fact, this almost seems like it could be solved with a combination of an as-yet unwritten pattern-equal
and destructuring-bind
.
It is, alas, not the case; see if you can figure out why.
Pattern-matching also lets us specify the values we’re interested in.
Consider a pattern-matched version of nth
.
(defun nth (n lst)
(pattern-case (n lst)
((0 (x . xs)) x)
((n (x . xs)) (enth (1- n) xs))))
This works, but obscures the fact that in each of those cases, there’s one value we don’t care about:
if n
is 0, it doesn’t matter what xs
is, just give x
;
otherwise, it doesn’t matter what x
is, we recurse on n-1
and xs
.
The conventional way of writing this (with cond
or if
) would in fact be clearer in that it only asks about the relevant bits.
(defun nth (n lst)
(cond ((zerop n) (car lst))
(t (enth (cdr lst)))))
Not to fear, there is a solution.
A common convention is to mark values we’re not interested in with _
.
Let’s try it out:
(defun nth (n lst)
(pattern-case (n lst)
((0 (x . _)) x)
((n (_ . xs)) (enth (1- n) xs))))
Nice.
Much cleaner, and syntactically valid since _
is a valid symbol and can be bound to anything.
Generally, symbols match (and bind to) anything, other atoms match themselves, and lists match things that have the same shape. Specifically:
- A symbol that can be bound (e.g.
foo
qualifies,:foo
doesn’t) matches anything t
matches a generalized boolean, i.e. anything that’s notnil
- Strings match with
string=
- Other arrays match with
equal
- Other atomic values match with
eql
- Lists match another list of like structure, provided their members match as per the above
(0 . ns)
matches any list whose first member is0
(fname lname)
matches("Johnny" "Goode")
but not("Johnny" "B." "Goode")
nor("Johnny")
(first second third . rest)
matches any list with length ≥ 3
Let’s try that. Conditionals need to be written with the special cases first so the order will be a little different.
(defun pattern-equal (pattern expression)
(cond
;; Symbols that can't be bound first
((eq pattern t) expression)
((eq pattern nil) (not expression))
((keywordp pattern) (eql pattern expression))
;; Symbol that can be bound
((symbolp pattern) t)
;; Lists
((listp pattern) (and (pattern-equal (car pattern)
(car expression))
(pattern-equal (cdr pattern)
(cdr expression))))
;; Strings, arrays
;; string= for case-sensitivity, or string-equal for insensitivity?
((stringp pattern) (string= pattern expression))
((arrayp pattern) (equal pattern expression))
;; Anything else
(t (eql pattern expression))
))
Remember what I said about destructuring-bind
not working for our purpose?
Let’s see what happens by manually expanding one of the examples above.
(defun factorial (n)
(cond ((pattern-equal 0 n) 1)
((pattern-equal 'n n)
(destructuring-bind (n) (list n) (* n (factorial (1- n)))))))
;; My favorite factorial
(factorial 6)
720
Wait, that worked. Let’s try another one.
(defun enth (n lst)
(cond ((pattern-equal '(_ nil) (list n lst))
(error "Bad N for LST"))
((pattern-equal '(0 (x . xs)) (list n lst))
(destructuring-bind (0 (x . xs)) (list n lst)
x))
((pattern-equal '(n (_ . xs)) (list n lst))
(destructuring-bind (n (_ . xs)) (list n lst)
(enth (1- n) xs)))))
; in: DEFUN ENTH
; (DESTRUCTURING-BIND (0 (X . XS)) (LIST N LST) X)
;
; caught ERROR:
; during macroexpansion of (DESTRUCTURING-BIND (0 #) (LIST N LST) ...). Use
; *BREAK-ON-SIGNALS* to intercept.
;
; non-symbol in lambda-list: 0
Whoops.
We can’t pass a pattern into destructuring-bind
because patterns may contain constants and lambda-lists may not.
We should also do something about silencing the warnings for unused variables.
What if we substitute constants for variables?
While we’re at it we’ll do the same to _
so we can have several
(defun substitute-constants (pattern)
(cond ((or (eq '_ pattern ) (constantp pattern))
(gensym))
((listp pattern) (cons (substitute-constants (car pattern))
(substitute-constants (cdr pattern))))
(t pattern)))
(substitute-constants '(0 (x . _)))
(G1093 (X . G1094) . G1095)
That looks strange, but is structurally and syntactically the same.
Let’s try using that, using the reader-macro #.
to evaluate the substitution (otherwise destructuring-bind
would capture the whole (substitute-constants ...)
form)
(destructuring-bind #.(substitute-constants '(0 (x . _)))
'(0 (first second third))
x)
; in: DESTRUCTURING-BIND (#:G1097 (X . #:G1098) . #:G1099)
; (LET* ((#:G1097 (CAR #:WHOLE1100))
; (X (CAR #:REQUIRED-1102))
; (#:G1098 (CDR #:REQUIRED-1102))
; (#:G1099 (CDR (CDR #:WHOLE1100))))
; X)
;
; caught STYLE-WARNING:
; The variable #:G1097 is defined but never used.
;
; caught STYLE-WARNING:
; The variable #:G1098 is defined but never used.
;
; caught STYLE-WARNING:
; The variable #:G1099 is defined but never used.
;
; compilation unit finished
; caught 3 STYLE-WARNING conditions
FIRST
Better!
We have warnings about unused variables, but they can be (declare (ignore ...))
‘ed away, as long as we keep track of the generated symbols.
(defun make-registrar ()
"MAKE-REGISTRAR => REGISTRAR
REGISTRAR is a function:
REGISTRAR :create => new symbol (ala GENSYM)
REGISTRAR :list => list of all symbols generated
"
(let (register)
(lambda (arg)
(ecase arg
(:create (push (gensym) register) (car register))
(:list register)))))
(defun substitute-constants (pattern generator)
"Replace constants and _ in PATTERN with symbols from GENERATOR"
(cond ((or (constantp pattern) (eq '_ pattern)) (funcall generator))
((listp pattern) (cons (substitute-constants (car pattern)
generator)
(substitute-constants (cdr pattern)
generator)))
(t pattern)))
(defmacro pattern-bind (pattern expression &body body)
"Bind variables in PATTERN to values in EXPRESSION, then evaluate BODY"
(let* ((registrar (make-registrar))
(generator (lambda () (funcall registrar :create)))
(safe-pattern (substitute-constants pattern generator))
(gensyms (funcall registrar :list)))
`(destructuring-bind
;; Wrap these up so atomic patterns work
(,safe-pattern) (list ,expression)
(declare (ignore ,@gensyms))
,@body)))
(pattern-bind (0 (x . _)) '(0 (1 2 3)) x)
1
Woohoo!
We now have the pieces to construct the final macro. Let’s do so.
(defmacro pattern-case (keyform &body cases)
"PATTERN-CASE KEYFORM (PATTERN FORM*)*
Evaluate FORM* whose PATTERN matches KEYFORM
EX:
(defun factorial (n)
(pattern-case n
(0 1)
(n (* n (factorial (1- n))))))
(defun kth (k list)
(pattern-case (list k list)
((_ nil) (error \"bad K for LIST\"))
((0 (x . _)) x)
((k (_ . xs)) (kth (1- k) xs))))
"
;; Evaluate KEYFORM exactly once
(let ((keyvalue (gensym)))
`(let ((,keyvalue ,keyform))
(cond ,@(loop :for case in cases
:for pattern = (first case)
:for body = (rest case)
:collect `((pattern-equal ',pattern ,keyvalue)
(pattern-bind ,pattern ,keyvalue
,@body)))))))
(defun kth (k lst)
(pattern-case (list k lst)
((_ nil) (error "much k. little list. such fail. wow."))
((0 (x . _)) x)
((k (_ . xs)) (kth (1- k) xs))))
(handler-case (kth 5 '(earth wind fire water))
(t (e) e))
#<SIMPLE-ERROR "much k. little list. such fail. wow." {10052F1443}>
(kth 1 '(moe larry curly))
LARRY
(defun rotate-left (tree)
(pattern-case tree
((x l nil) `(,x ,l nil))
((x l (r rl rr)) `(,r (,x ,l ,rl) ,rr))))
(rotate-left '(4 (2 (1) (3)) (6 (5) (7))))
(6 (4 (2 (1) (3)) (5)) (7))