{{ message }}

Instantly share code, notes, and snippets.

jcoglan/lambda-the-ultimate.md

Created Jul 30, 2014

Lambda the Ultimate

On page 137 of The Little Schemer, this rather convoluted function appears:

```(define multirember&co
(lambda (a lat col)
(cond
((null? lat)
(col (quote ()) (quote ())))
((eq? (car lat) a)
(multirember&co a
(cdr lat)
(lambda (newlat seen)
(col newlat
(cons (car lat) seen)))))
(else
(multirember&co a
(cdr lat)
(lambda (newlat seen)
(col (cons (car lat) newlat)
seen)))))))```

I’m going to rewrite this in slightly more idiomatic style, replacing `(define foo (lambda (x y) ...` with `(define (foo x y) ...`, replacing `(quote x)` with `'x`, and aligning function arguments to make things clearer:

```(define (multirember&co a lat col)
(cond ((null? lat)
(col '() '()))
((eq? (car lat) a)
(multirember&co a
(cdr lat)
(lambda (newlat seen)
(col newlat
(cons (car lat) seen)))))
(else
(multirember&co a
(cdr lat)
(lambda (newlat seen)
(col (cons (car lat) newlat)
seen))))))```

This function takes an atom `a`, a list of atoms `lat`, and a function `col` (short for ‘collector’), and invokes the function with two lists: one is the original list with all occurrences of `a` removed, and the other is all the occurrences of `a` we found. It combines it with this function:

```(define a-friend
(lambda (x y)
(null? y)))

; or, in modern notation

(define (a-friend x y)
(null? y))```

In the example this returns `#f`, since `tuna` appears in the list and so the second argument to `a-friend` is not empty.

```(define foods '(strawberries tuna and swordfish))

(multirember&co 'tuna foods a-friend)
; => #f```

Some discussion of this leads up to the tenth commandment:

Build functions to collect more than one value at a time.

The question is, why? We spent some time on this question, since it was not clear why this function should emit its filtered list and the list of occurrences via a callback, rather than simply returning them as a pair of values in a list. We weren’t sure what the commandment was teaching us, or why later examples of this pattern tightly couple multiple operations up in the same function. They seem to be hinting at a need for `fold`, but never get there. The callback is simply used to emit the final result of an operation.

It also is not clear whether the output of `col` is intended to be the return value of `multirember&co`, i.e. that `col` is a data constructor to be consumed by `multirember&co`’s caller, or whether `col` is intended only to deliver the final lists to a callback as a side-effect, and the output of `col` should be ignored. The use of the term ‘continuation’ suggests the latter, although the purely functional nature of the rest of the book suggests that this is really an unsual method of composition rather than a ‘side effect’ per se.

Recursing with multiple values

One explanation we came up with is that, even though `multirember&co` could return a list containing two lists, getting those inner lists back out again on recursion would be annoying. Here’s another version that doesn’t use a callback:

```(define (partition-eq? a lat)
(cond ((null? lat)
'(() ()))
((eq? (car lat) a)
(let ((pair (partition-eq? a (cdr lat))))
(list (car pair)
(cons (car lat)
(else
(let ((pair (partition-eq? a (cdr lat))))
(list (cons (car lat)
(car pair))

(partition-eq? 'tuna foods)
; => ((strawberries and swordfish) (tuna))```

(In modern Scheme, `(cadr x)` is a shorthand for `(car (cdr x))`, i.e. the second element of `x`.)

Instead of calling a callback, the base case of this function returns `'(() ())`, a pair of lists that are added to via recursion. This makes the data flow a bit more obvious: we’re back to the common pattern of recurring on the ‘rest’ of the list and then adding to the result. But destructuring the return value at every turn is annoying. Since we need to refer to the list `pair` generated by recursion multiple times -- to get its components out -- we must store it in a local variable using `let` or else risk making unnecessary function calls. This concept of local variable binding is never discussed in the book, and without it handling the destructuring becomes painful (`let` desugars to a `lambda` that is immediately applied to the inputs).

We can clean things up even further with more local variables:

```(define (partition-eq? a lat)
(cond ((null? lat)
'(() ()))
((eq? (car lat) a)
(let* ((pair   (partition-eq? a (cdr lat)))
(newlat (car pair))
(list newlat
(cons (car lat) seen))))
(else
(let* ((pair   (partition-eq? a (cdr lat)))
(newlat (car pair))
(list (cons (car lat) newlat)
seen)))))```

Now we can see there's a lot of duplicated logic between two branches of the `cond`. We can fix that by doing the recursion, then figuring out how to push our current element onto the result:

```(define (partition-eq? a lat)
(cond ((null? lat)
'(() ()))
(else
(let* ((pair   (partition-eq? a (cdr lat)))
(newlat (car pair))
(cond ((eq? (car lat) a)
(list newlat (cons (car lat) seen)))
(else
(list (cons (car lat) newlat) seen)))))))```

This keeps the recursive flow clear, and it makes the construction of the next value in the loop clearer.

Composing functions with multiple values

So, it is not very difficult to write `multirember&co` in way that uses a more natural recursion pattern, assuming you have `let` available. But, it’s still hard to compose this with other functions: `partition-eq?` returns one list, but `a-friend` takes two values, so this would be an error:

`(a-friend (partition-eq? 'tuna foods))`

```(let ((pair (partition-eq? 'tuna foods)))

; => #f```

Actual Scheme has a function called `apply` that calls a function with its arguments supplied from a list:

```(apply a-friend (partition-eq? 'tuna foods))
; => #f```

Or, we can use macros; a macro is an instruction for how to rewrite a source expression in terms of lower-level features. Macro systems were first proposed in a 1963 paper by Hart. Modern Scheme (since 1998) provides the pattern-matching `syntax-rules` macro system for declaring transformations; Common Lisp uses the more procedural `defmacro` system where you directly manipulate source trees using the built-in functions.

We can write a macro for destructuring assignment:

```(define-syntax let-elements (syntax-rules (in)
((let-elements (a b ...) expr in body ...)
(let* ((args expr)
(a (car args)))
(let-elements (b ...) (cdr args) in body ...)))
((let-elements () _ in body ...)
(begin body ...))))```

This macro can be used to destructure a list result and hand the pieces off to another function; `x` is bound to the first element of the result, `y` to the second.

```(let-elements (x y) (partition-eq? 'tuna foods) in (a-friend x y))
; => #f```

This expression is rewritten into a series of `let*` expressions:

```(let* ((args (partition-eq? 'tuna foods))
(x (car args)))
(let* ((args (cdr args))
(y (car args)))
(begin
(a-friend x y))))```

And `let*` is itself a macro that desugars to lambda application:

```((lambda (args)
((lambda (x)
((lambda (args)
((lambda (y)
(a-friend x y))
(car args)))
(cdr args)))
(car args)))
(partition-eq? 'tuna foods))

; => #f```

Which brings us back to the limited set of concepts used in the book. This is an obtuse way to write the function, and in the absence of a built-in `let` or a macro system, it seems reasonable to look for other patterns for using `lambda` as a convenient variable binding mechanism.

However, given that `apply` appears in McCarthy’s original 1960 paper, its omission from The Little Schemer seems odd. It greatly simplifies the problem of composing functions that return multiple values.

It’s instructive to rewrite these examples in Haskell to see which things stay the same, and which things recommended by the book are there due to the limitations of the Scheme language. Here’s `multirember&co` and `a-friend`:

```multiremberCo                 :: (Eq a) => a -> [a] -> ([a] -> [a] -> b) -> b
multiremberCo _ []     col    =  col [] []
multiremberCo a (x:xs) col
| a == x                  =  multiremberCo a xs colSeen
| otherwise               =  multiremberCo a xs colRest
where
colSeen newlat seen   =  col newlat (x:seen)
colRest newlat seen   =  col (x:newlat) seen

aFriend         :: [a] -> [a] -> Bool
aFriend xs ys   =  null ys```

Here we use a combination of pattern matching, guards and `where` clauses to define helper functions. This is literally the same algorithm, but expressed differently. To explain what’s going on in each line:

```-- This is a type signature. It says that multiremberCo takes three arguments:
-- one of type `a`, one of type `[a]` (a list of values of type `a`), and one of
-- type `([a] -> [a] -> b)`: a function that takes two lists of `a` and returns
-- a `b`. `a` is a 'type variable', it means the value can take any type so long
-- all the places that `a` appears in the signature are the same type. You can
-- pass an `Int` and a `[Int]`, but not a `Bool` and a `[Int]`. The `(Eq a) =>`
-- part means that whatever type `a` is, it must implement the equality
-- interface, since we're going to pass `a` to the `==` function.
multiremberCo                 :: (Eq a) => a -> [a] -> ([a] -> [a] -> b) -> b

-- This is a pattern-match for the case where the list is empty. We use `_` for
-- the first parameter because we don't care about its value in this case.
multiremberCo _ []     col    =  col [] []

-- Now we handle the general case. The pipes present different boolean checks on
-- the function arguments and say what to do in each case. The `(x:xs)`
-- expression is destructuring the list argument into its first element `x` and
-- the rest of the list `xs`. `where` lets us set the values of internal
-- variables; these bindings apply across all the guard clauses.
multiremberCo a (x:xs) col
| a == x                  =  multiremberCo a xs colSeen
| otherwise               =  multiremberCo a xs colRest
where
colSeen newlat seen   =  col newlat (x:seen)
colRest newlat seen   =  col (x:newlat) seen```

For example:

```let foods = ["strawberries", "tuna", "and", "swordfish"]
multiremberCo "tuna" foods aFriend
-- => False```

This is where the type signature helps you to see how data flows: the fact that `col` takes two `[a]` and returns a `b`, and `multiremberCo` takes an `a` and `[a]` and returns a `b`, means that `multiremberCo` must produce its result by handing some derivative of its arguments off to `col` and returning the result.

We could bring our original Scheme function a little closer by extracting some helper functions:

```(define (multirember&co a lat col)
(let ((col-seen (lambda (newlat seen)
(col newlat (cons (car lat) seen))))
(col-rest (lambda (newlat seen)
(col (cons (car lat) newlat) seen))))
(cond ((null? lat)
(col '() '()))
((eq? (car lat) a)
(multirember&co a (cdr lat) col-seen))
(else
(multirember&co a (cdr lat) col-rest)))))```

This helps separate the decision-making in the `cond` from the logic for how to build the next value, which is wrapped up in the helper functions. But again, this relies on `let`, which the book does not use.

Currying

Let’s rewrite our function one final time to remove the hard-coded `eq?`. Instead, we will let `partition` take a predicate `p` and a list `lat`, and filter out the elements for which `(p x)` is `#t`.

```(define (partition p lat)
(cond ((null? lat)
'(() ()))
(else
(let* ((pair   (partition p (cdr lat)))
(newlat (car pair))
(cond ((p (car lat))
(list newlat (cons (car lat) seen)))
(else
(list (cons (car lat) newlat) seen)))))))```

We can filter out the elements matching `tuna`:

```(partition (lambda (a) (eq? 'tuna a)) foods)
; => ((strawberries and swordfish) (tuna))```

Writing out lambdas creates a lot of line-noise, so let’s extract that function and give it a name:

```(define (is-tuna? a)
(eq? 'tuna a))

(partition is-tuna? foods)
; => ((strawberries and swordfish) (tuna))```

But it would be even better if we didn’t have to write specialised functions to check for particular values at all. If we had a curried version of `eq?`, we could produce specialised versions of it to check for particular values:

```(define (is? a)
(lambda (b)
(eq? a b)))

(partition (is? 'tuna) foods)
; => ((strawberries and swordfish) (tuna))```

Again, Haskell provides an interesting example. Here’s `partition`; it takes a function `a -> Bool` and a list `[a]` and returns a tuple contain two lists of `a`.

```partition             :: (a -> Bool) -> [a] -> ([a], [a])
partition _ []        =  ([], [])
partition p (x:xs)
| p x             =  (rest, x:seen)
| otherwise       =  (x:rest, seen)
where
(rest, seen)  =  partition p xs```

This time, the base case returns a tuple containing two empty lists. Haskell has built-in destructuring; `(rest, seen) = partition p xs` takes the tuple returned by the recursive call and assigns its first member to `rest` and the second to `seen`. `x:y` is how Haskell says `(cons x y)`, which hugely reduces line noise and makes the structure more apparent.

We could call this by passing in a lambda:

```partition (\x -> x == "tuna") foods
-- => (["strawberries", "and", "swordfish"], ["tuna"])```

But this is not necessary. `==` is a function; symbolic functions in Haskell are written infix rather than prefix. All functions are curried by default, so you can pass any function fewer arguments than it expects and get an intermediate function. Infix functions can be partially applied on both their left and right operands, so for example `== "tuna"` is equivalent to the lambda we used above.

```partition (== "tuna") foods
-- => (["strawberries", "and", "swordfish"], ["tuna"])```

We can do the same with other operators:

```partition (> 5) [1..10]
-- => ([1,2,3,4,5], [6,7,8,9,10])```

This massively cuts down on the amount of boilerplate needed to write higher-order functions. But what about composition? `aFriend` takes two lists, not a tuple, so it doesn’t compose with `partition`. There are a couple of solutions to this; we can either use `let` to destructure:

```let (xs, ys) = partition (== "tuna") foods in aFriend xs ys
-- => False```

(This is what I modelled the `let-elements` macro on.)

Or, Haskell has a function called `uncurry`:

`uncurry :: (a -> b -> c) -> ((a, b) -> c)`

`uncurry` takes a function that takes two arguments, and returns a function that takes both arguments as a tuple.

```uncurry aFriend (partition (== "tuna") foods)
-- => False```

Could we do this in Scheme?

```(define (uncurry f)
(lambda (tuple)

((uncurry a-friend) (partition (is? 'tuna) foods))
; => #f```

We haven’t used any concepts outside the book to define this helper function, and it seems to lead to better separation of concerns.

Tail recursion

I can only think of one other reason that `multirember&co` is presented in the way that it is. Here it is again:

```(define (multirember&co a lat col)
(cond ((null? lat)
(col '() '()))
((eq? (car lat) a)
(multirember&co a
(cdr lat)
(lambda (newlat seen)
(col newlat
(cons (car lat) seen)))))
(else
(multirember&co a
(cdr lat)
(lambda (newlat seen)
(col (cons (car lat) newlat)
seen))))))```

In this definition, the last expression in any branch of this function is a call to `multirember&co` itself: the value of a call to `multirember&co` is the value of the recusrive call to itself. Similarly, the internal lambdas that wrap `col` have a call to `col` as their very last expression. This means this definition is tail recursive, so it can be applied to an indefinitely long input without blowing the stack. The same is not true of the `partition` definition, although it would be possible to rewrite it to be tail-recursive.

However, the book doesn’t seem to be making an argument about implementation details of the platform, and tail calls were initially described in a 1974 paper. It seems to be telling you that writing functions in this style is a good idea as part of generally good programming style, but given even the limited tools available in the book, which includes higher-order functions, I’m not convinced.