Skip to content

Instantly share code, notes, and snippets.

@andreareina
Last active November 10, 2023 01:17
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save andreareina/b8a9cb27cadf7a0860619a006dd8c31b to your computer and use it in GitHub Desktop.
Save andreareina/b8a9cb27cadf7a0860619a006dd8c31b to your computer and use it in GitHub Desktop.

Pattern Matching

The problem

I’d like to be able to write functions using pattern-matching, something like so:

(defun factorial (n)
  (pattern-case n
    (0 1)
    (n (* n (factorial (1- n))))))

I had also considered the name dispatch-on-pattern.

There’s no implementation for pattern-case yet but let’s write a stub so the editor knows how to indent it and the compiler doesn’t complain.

(defmacro pattern-case (keyform &body cases))

What’s a pattern?

Patterns are a statement about the structure and (possibly) content of an expression. They also provide the names you want values to be bound to. Imagine writing a sum function, which takes a list and returns evaluates to the sum of all its members. The savvy lisper would (defun sum (numbers) (apply #'+ numbers)) and be done with it, but that doesn’t suit our didactic purpose.

(defun sum (numbers)
  (pattern-case numbers
    (nil 0)
    ((n . ns) (+ n (sum ns)))))

Assuming there is a list to sum, we bind the first element to n and the rest of them to ns. This is exactly like destructuring-bind. In fact, this almost seems like it could be solved with a combination of an as-yet unwritten pattern-equal and destructuring-bind. It is, alas, not the case; see if you can figure out why.

Pattern-matching also lets us specify the values we’re interested in. Consider a pattern-matched version of nth.

(defun nth (n lst)
  (pattern-case (n lst)
    ((0 (x . xs)) x)
    ((n (x . xs)) (enth (1- n) xs))))

This works, but obscures the fact that in each of those cases, there’s one value we don’t care about: if n is 0, it doesn’t matter what xs is, just give x; otherwise, it doesn’t matter what x is, we recurse on n-1 and xs.

The conventional way of writing this (with cond or if) would in fact be clearer in that it only asks about the relevant bits.

(defun nth (n lst)
  (cond ((zerop n) (car lst))
        (t (enth (cdr lst)))))

Not to fear, there is a solution. A common convention is to mark values we’re not interested in with _. Let’s try it out:

(defun nth (n lst)
  (pattern-case (n lst)
    ((0 (x . _)) x)
    ((n (_ . xs)) (enth (1- n) xs))))

Nice. Much cleaner, and syntactically valid since _ is a valid symbol and can be bound to anything.

What does a pattern match?

Generally, symbols match (and bind to) anything, other atoms match themselves, and lists match things that have the same shape. Specifically:

  • A symbol that can be bound (e.g. foo qualifies, :foo doesn’t) matches anything
  • t matches a generalized boolean, i.e. anything that’s not nil
  • Strings match with string=
  • Other arrays match with equal
  • Other atomic values match with eql
  • Lists match another list of like structure, provided their members match as per the above
    • (0 . ns) matches any list whose first member is 0
    • (fname lname) matches ("Johnny" "Goode") but not ("Johnny" "B." "Goode") nor ("Johnny")
    • (first second third . rest) matches any list with length ≥ 3

Let’s try that. Conditionals need to be written with the special cases first so the order will be a little different.

(defun pattern-equal (pattern expression)
  (cond
    ;; Symbols that can't be bound first
    ((eq pattern t) expression)
    ((eq pattern nil) (not expression))
    ((keywordp pattern) (eql pattern expression))
    ;; Symbol that can be bound
    ((symbolp pattern) t)
    ;; Lists
    ((listp pattern) (and (pattern-equal (car pattern)
                                         (car expression))
                          (pattern-equal (cdr pattern)
                                         (cdr expression))))
    ;; Strings, arrays
    ;; string= for case-sensitivity, or string-equal for insensitivity?
    ((stringp pattern) (string= pattern expression))
    ((arrayp pattern) (equal pattern expression))
    ;; Anything else
    (t (eql pattern expression))
    ))

The patterns that bind

Remember what I said about destructuring-bind not working for our purpose? Let’s see what happens by manually expanding one of the examples above.

(defun factorial (n)
  (cond ((pattern-equal 0 n) 1)
        ((pattern-equal 'n n)
         (destructuring-bind (n) (list n) (* n (factorial (1- n)))))))

;; My favorite factorial
(factorial 6)
720

Wait, that worked. Let’s try another one.

(defun enth (n lst)
  (cond ((pattern-equal '(_ nil) (list n lst))
         (error "Bad N for LST"))
        ((pattern-equal '(0 (x . xs)) (list n lst))
         (destructuring-bind (0 (x . xs)) (list n lst)
           x))
        ((pattern-equal '(n (_ . xs)) (list n lst))
         (destructuring-bind (n (_ . xs)) (list n lst)
           (enth (1- n) xs)))))
; in: DEFUN ENTH
;     (DESTRUCTURING-BIND (0 (X . XS)) (LIST N LST) X)
; 
; caught ERROR:
;   during macroexpansion of (DESTRUCTURING-BIND (0 #) (LIST N LST) ...). Use
;   *BREAK-ON-SIGNALS* to intercept.
;   
;    non-symbol in lambda-list: 0

Whoops. We can’t pass a pattern into destructuring-bind because patterns may contain constants and lambda-lists may not. We should also do something about silencing the warnings for unused variables. What if we substitute constants for variables? While we’re at it we’ll do the same to _ so we can have several

(defun substitute-constants (pattern)
  (cond ((or (eq '_ pattern ) (constantp pattern))
         (gensym))
        ((listp pattern) (cons (substitute-constants (car pattern))
                               (substitute-constants (cdr pattern))))
        (t pattern)))
(substitute-constants '(0 (x . _)))
(G1093 (X . G1094) . G1095)

That looks strange, but is structurally and syntactically the same. Let’s try using that, using the reader-macro #. to evaluate the substitution (otherwise destructuring-bind would capture the whole (substitute-constants ...) form)

(destructuring-bind #.(substitute-constants '(0 (x . _)))
  '(0 (first second third))
  x)
; in: DESTRUCTURING-BIND (#:G1097 (X . #:G1098) . #:G1099)
;     (LET* ((#:G1097 (CAR #:WHOLE1100))
;            (X (CAR #:REQUIRED-1102))
;            (#:G1098 (CDR #:REQUIRED-1102))
;            (#:G1099 (CDR (CDR #:WHOLE1100))))
;       X)
; 
; caught STYLE-WARNING:
;   The variable #:G1097 is defined but never used.
; 
; caught STYLE-WARNING:
;   The variable #:G1098 is defined but never used.
; 
; caught STYLE-WARNING:
;   The variable #:G1099 is defined but never used.
; 
; compilation unit finished
;   caught 3 STYLE-WARNING conditions
FIRST

Better! We have warnings about unused variables, but they can be (declare (ignore ...))‘ed away, as long as we keep track of the generated symbols.

(defun make-registrar ()
  "MAKE-REGISTRAR => REGISTRAR

REGISTRAR is a function:
  REGISTRAR :create => new symbol (ala GENSYM)
  REGISTRAR :list => list of all symbols generated
"
  (let (register)
    (lambda (arg)
      (ecase arg
        (:create (push (gensym) register) (car register))
        (:list register)))))

(defun substitute-constants (pattern generator)
  "Replace constants and _ in PATTERN with symbols from GENERATOR"
  (cond ((or (constantp pattern) (eq '_ pattern)) (funcall generator))
        ((listp pattern) (cons (substitute-constants (car pattern)
                                                     generator)
                               (substitute-constants (cdr pattern)
                                                     generator)))
        (t pattern)))

(defmacro pattern-bind (pattern expression &body body)
  "Bind variables in PATTERN to values in EXPRESSION, then evaluate BODY"
  (let* ((registrar (make-registrar))
         (generator (lambda () (funcall registrar :create)))
         (safe-pattern (substitute-constants pattern generator))
         (gensyms (funcall registrar :list)))
    `(destructuring-bind
         ;; Wrap these up so atomic patterns work
         (,safe-pattern) (list ,expression)
       (declare (ignore ,@gensyms))
       ,@body)))
(pattern-bind (0 (x . _)) '(0 (1 2 3)) x)
1

Woohoo!

Putting it together

We now have the pieces to construct the final macro. Let’s do so.

(defmacro pattern-case (keyform &body cases)
  "PATTERN-CASE KEYFORM (PATTERN FORM*)*

Evaluate FORM* whose PATTERN matches KEYFORM
EX:
(defun factorial (n)
  (pattern-case n
    (0 1)
    (n (* n (factorial (1- n))))))

(defun kth (k list)
  (pattern-case (list k list)
    ((_ nil) (error \"bad K for LIST\"))
    ((0 (x . _)) x)
    ((k (_ . xs)) (kth (1- k) xs))))
"
  ;; Evaluate KEYFORM exactly once
  (let ((keyvalue (gensym)))
    `(let ((,keyvalue ,keyform))
       (cond ,@(loop :for case in cases
                     :for pattern = (first case)
                     :for body = (rest case)
                     :collect `((pattern-equal ',pattern ,keyvalue)
                                (pattern-bind ,pattern ,keyvalue
                                  ,@body)))))))
(defun kth (k lst)
  (pattern-case (list k lst)
    ((_ nil) (error "much k. little list. such fail. wow."))
    ((0 (x . _)) x)
    ((k (_ . xs)) (kth (1- k) xs))))

(handler-case (kth 5 '(earth wind fire water))
  (t (e) e))
#<SIMPLE-ERROR "much k. little list. such fail. wow." {10052F1443}>
(kth 1 '(moe larry curly))
LARRY
(defun rotate-left (tree)
  (pattern-case tree
    ((x l nil) `(,x ,l nil))
    ((x l (r rl rr)) `(,r (,x ,l ,rl) ,rr))))

(rotate-left '(4 (2 (1) (3)) (6 (5) (7))))
(6 (4 (2 (1) (3)) (5)) (7))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment