Skip to content

Instantly share code, notes, and snippets.

@mrange
Last active March 26, 2023 11:58
Show Gist options
  • Star 4 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save mrange/19c032b335bbeabae6fe9e6a4a4b626f to your computer and use it in GitHub Desktop.
Save mrange/19c032b335bbeabae6fe9e6a4a4b626f to your computer and use it in GitHub Desktop.
F# Advent 2020, Dec 6 - Church Encoded Lists

F# Advent 2020, Dec 6 - Church Encoded Lists

Introduction

The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise

Edsger W. Dijkstra

This is my contribution to F# Advent Calendar in English. Thanks to Sergey Tihon for making sure this happens every year.

An abstraction most .NET developers are familiar with is IEnumerable (aka Seq in F#). IEnumerable is an abstraction over a set of values in a particular order and it looks roughly like this:

type IEnumerator<'T> =
  interface
    inherit System.IDisposable
    abstract Current: 'T
    abstract MoveNext: unit -> bool
  end

type IEnumerable<'T> =
  interface
    abstract GetEnumerator: unit -> IEnumerator<'T>
  end

On top of this abstraction .NET developers have implemented the operations filter, map and so on that enables us to build pipelines where values through flow. To visit all values in an IEnumerable there are various options such as using for, Seq.fold or Seq.iter, however internally they boil down to this code pattern.

let s: 'T seq = ???
let e = s.GetEnumerator ()
try
  while e.MoveNext () do
    doStuff e.Current
finally
  e.Dispose ()

This is what is sometimes known as a pull pipeline, that is we try to “pull” one value out of the pipeline by calling MoveNext, if that was successful, we “doStuff” on the pulled value. To support abstracting over a set of values generated from resources that need disposing we need to dispose the enumerator. There are few drawbacks with IEnumerable

  1. You need to implement two interfaces, IEnumerableand IEnumerator
  2. Implementing IEnumerator can feel a bit awkward as work is performed in MoveNext but then value is retrieved using Current.
  3. Each step in the pipeline essentially reimplements the visitation pattern above meaning there are two virtual calls per value and a test, all wrapped in a try…finally

Push pipelines

A benefit of IEnumerable is that each step in the pipeline does not have to materialize the entire collection as is the case when you use Array or List in F#. However, there is an obvious alternative to pull pipelines in push pipelines. In a push pipeline the values are pushed from the source through the pipeline rather than the receiver pull the values through the pipeline.

A push pipeline tends to have a:

  1. More elegant implementation
  2. More efficient implementation by reducing virtual calls and unnecessary checks

There are drawbacks such that once the pipeline is started it will push all data through the pipeline until stopped or reaching the end where IEnumerable by its design is yieldable. Nessos Streams pulls most of the benefits of Pull and Push pipelines by combining them into a “pump pipeline” at the expense of increased implementation complexity.

Church Encoded Lists

One of my favorite Push pipelines is church encoded lists, as described here as an example of Rank N types in Haskell. A Church Encoded List is an abstraction over a set of values that exposes a single method runList which behaves like foldBack except the collection is given by the abstraction.

The name Church Encoded Lists is likely derived from how natural numbers are commonly encoded in lambda calculus.

0 := \fx. x
1 := \fx. f x
2 := \fx. f (f x)

A church encoded numeral n is a function that applies f n times over x. A church encoded list in F# does the same for a list of length n except it also passes the values of the list through f as well.

In F# we could try to model the Church Encoded List like this:

type Cel<'T> = ('T -> 'S -> 'S) -> 'S -> 'S

Assuming above was legal in F# we would sum all numbers in a Church Encoded List like this:

let vs: Cel<int> = ???
let sum = vs (+) 0 // Folds all values in vs using + and 0

The type definition isn’t legal F# though because F# don’t support Rank-N types for function types (The folding functions should work for all ‘S?) and including ‘S in the type definition quickly feels clumsy and awkward. .NET do support Rank-2 types in generic methods in generic classes though so an alternative definition for a Church Encoded List could look like this:

type [<AbstractClass>] Cel<'T>() =
  abstract RunList: ('T -> 'S -> 'S) -> 'S -> 'S

In this case there isn’t a problem as RunList is now a generic method over ´S.

Let us define a few simple functions:

module Cel =
  // Helper method to run a list over a folder function and a zero value
  let inline runList (c: Cel<'T>) folder (zero: 'S) = c.RunList folder zero

  // The empty list
  [<GeneralizableValue>]
  let nil<'T> = 
    { new Cel<'T>() with
      override _.RunList f z = z
    }
    
  // Appends a value
  let inline cons x cel = 
    { new Cel<'T>() with
      override _.RunList f z = f x (runList cel f z)
    }

The empty list contains no values so just returns the “zero” value. For the single value list run the folder function once over the value and the zero value.

Implementing an Church Encoded List over an ordinary F# list is easy enough

  let inline fromList xs =
    { new Cel<'T>() with
      override x.RunList f z = List.foldBack f xs z
    }

As runList should behave as foldBack we just forward the folder function and the zero value to List.foldBack.

Going from a Church Encoded List to an F# list is easy as well

  let inline toList cel = runList cel (fun v s -> v::s) []

The folder function prepends v on list s and the zero value is the empty list. As runList is reverse folding there is no need to reverse the list at the end.

Laws of Church Encoded Lists

The Haskell page on Rank-N types list 3 laws for the operations we defined so far:

  1. runList xs cons nil == xs
  2. runList (fromList xs) f z == foldr f z xs
  3. foldr f z (toList xs) == runList xs f z

First law states that given any church encoded list xs when we fold with cons and nil the resulting list shall be identical, essentially a copy list operation.

Second law states that given any F# list xs, any folder function f and any initial value z when we runList over a church encoded list created by fromList xs that is identical to when we do reverse fold over the given list (List.foldBack in F#).

Third law states that given any church encoded list xs, any folder function f and any initial value z when we do reverse fold over the given list created by toList xs that is identical to when we runList over the given list.

If these 3 laws hold for all values then the implementation so far is sound.

The best thing would be to prove our implementation but that can be difficult. Often in the software industry we use testing as a compromise. One way to test is to generate random input values to make sure the laws hold for all given values.

This is sometimes called property-based testing and for F# there is an excellent library called FsCheck that enables property-based testing.

Using FsCheck it’s possible to write the tests of the 3 laws as this:

  type Properties =
    // Church Encoded List Laws
    static member ``For all xs: runList xs cons nil == xs`` (xs: int Cel) =
      runList xs cons nil == xs

    static member ``For all xs, f and z: runList (fromList xs) f z == foldr f z xs`` (xs: int list) (f: int -> int -> int) (z: int) =
      runList (fromList xs) f z == List.foldBack f xs z

    static member ``For all cs, f and z: foldr f z (toList cs) == runList cs f z`` (xs: int Cel) (f: int -> int -> int) (z: int) =
      List.foldBack f (toList xs) z == runList xs f z

Each test function is called with randomly generated data and returns true if the law holds true for the data given. This is not proof that the laws always hold but tests never provide proofs, good tests just builds confidence. Property based testing properly done can give a high degree of confidence with little effort.

The testing code also looks like the original laws which is a nice touch.

Collect, filter and map

Common operations for data pipelines are collect, filter and map. Let’s look at how they can be implemented for Church Encoded Lists.

  let inline collect collector cel =
    { new Cel<'U>() with
      override _.RunList f z = 
        let ff v s = runList (collector v) f s
        runList cel ff z
    }

  let inline filter filterer cel =
    { new Cel<'T>() with
      override _.RunList f z = runList cel (fun v s -> if filterer v then f v s else s) z
    }

  let inline map mapper cel =
    { new Cel<'U>() with
      override _.RunList f z = runList cel (fun v s -> f (mapper v) s) z
    }

The implementations are short but might be a bit tricky to follow. Let us start with filter:

      override _.RunList f z = runList cel (fun v s -> if filterer v then f v s else s) z

What happens here is that we runList on the input cel but we pass it a filter fold function. This function checks if v passed the condition and if so applies f v s to get a new state, otherwise it passes the existing state.

map is similar:

      override _.RunList f z = runList cel (fun v s -> f (mapper v) s) z

Here a map folder function is passed to runList that for each value in cel maps it to the new value domain and applies it to f to get a new state.

Finally let us look at collect:

      override _.RunList f z = 
        let ff v s = runList (collector v) f s
        runList cel ff z

collect aka flatMap conceptually calls collector for each value in cel to get a Church Encoded List for each value. These lists are concatenated to the final Church Encoded List. The implementation is simpler though as we create a fold function ff that given a value v does runList on the Church Encoded List created from the value.

Testing collect, filter and map

The definitions of collect, filter and map look to me close to how they would be defined mathematically given foldr function.

As we already tested the church encoded laws in terms of foldr for lists in order to test collector, filter and map we can test them by introducing laws saying that for all values they should behave as collect, filter and map for lists.

  1. filter f xs == (fromList >> filter f >> toList) xs
  2. map f xs == (fromList >> map f >> toList) xs
  3. collect f xxs == (fromList >> collect (f >> fromList) >> toList) xxs

fromList >> g >> toList is a pattern that maps the input list into a Church Encoded list, applies g to the it and then maps the result back to list. This is because ’T list and ’T cel aren’t directly comparable.

Given xs is a ’T list, f is any function and xxs is ’T list list we can read the rules such as that for all possible combinations of xs, f and xxs

  1. List filter produce the same result as filter for Church Encoded lists
  2. List map produce the same result as map for Church Encoded lists
  3. List collect produce the same result as collect for Church Encoded lists

Using FsCheck to check the laws it could look like this.

    static member ``For all f and xxs: List.collect and collect produce equal results`` (f: int list -> int list) (xxs: int list list) =
      List.collect f xxs == (fromList >> collect (f >> fromList) >> toList) xxs

    static member ``For all f and xs: List.filter and filter produce equal results`` (xs: int list) (f: int -> bool) =
      List.filter f xs == (fromList >> filter f >> toList) xs

    static member ``For all f and xs: List.map and map produce equal results`` (xs: int list) (f: int -> int) =
      List.map f xs == (fromList >> map f >> toList) xs

Testing collect, filter and map in terms of runList

runList is essentially a foldBack so here's a suggestion for what map could be in terms of runList

let map f xs = runList xs (fun v s -> cons (f v) s) nil

We say that map should be the same as folding back over xs and cons all mapped values f v on the empty list nil.

From this we can then define a property test:

    static member ``For all f and cs: map produce equal results to runList concatenating the results of f`` (xs: int Cel) (f: int -> int) =
      map f xs == runList xs (fun v s -> cons (f v) s) nil

Filter can also be expressed in terms of runList with little effort:

let filter f xs = runList xs (fun v s -> if f v then cons v s else s) nil

We say that filter should be the same as folding back over xs and cons all values where f v is true on the empty list nil.

From this we can then define a property test:

    static member ``For all f and cs: map produce equal results to runList concatenating the true results for f`` (xs: int Cel) (f: int -> bool) =
      filter f xs == runList xs (fun v s -> if f v then cons v s else s) nil

Collect aka flatmap can also be defined in terms of runList as it's conceptually a mapping operation that concatenates the results.

let collect f xs = runList xs (fun v s -> runList (f v) cons s) nil

We have a list that through the function f generates lists to be concatenated so it makes sense to have two nested runList. The first folding function runs f v to get the list from v and then concatenates on s using cons.

The property test:

    static member ``For all f and ccs: collect produce equal results to runList that concatenates the results of the lists of f`` (xs: int Cel Cel) =
      // FsCheck generating: int Cel -> int Cel function results in a quite slow function
      //  so use id for now
      let f = id
      collect f xs == runList xs (fun v s -> runList (f v) cons s) nil

Testing filter and map in terms of collect

We can simplify our definitions further for map and filter by defining them in terms of collect

let filter  f xs = collect (fun v -> if f v then cons v nil else nil) xs
let map     f xs = collect (fun v -> cons (f v) nil) xs

cons v nil creates a single element list which we can use for both filter and map when defining them in terms of collect.

The property tests will be:

    // filter and map laws defined in terms of collect
    static member ``For all f and cs: filter produce equal results to collect concatenating singletons list for all true results for f`` (xs: int Cel) (f: int -> bool) =
      filter f xs == collect (fun v -> if f v then cons v nil else nil) xs

    static member ``For all f and cs: map produce equal results to collect concatenating singletons list with results for f`` (xs: int Cel) (f: int -> int) =
      map f xs == collect (fun v -> cons (f v) nil) xs

The different tests for collect, filter and map is redundant and which style you prefer is up to you. Using a well-known library as an oracle is a quick way to achieve property tests that are easy to understand.

Taking the extra time to define the property tests in terms of simpler primitives gives the writer more insight into what is actually a filter function? Coming back a few months later to the tests it might not feel as obvious though.

Wrapping up with fromRange and sum

Finally, we can implement fromRange and sum such as this:

  let inline fromRange b e =
    { new Cel<int>() with
      override _.RunList f z = 
        let rec loop f s b i = 
          if i >= b then
            loop f (f i s) b (i - 1)
          else
            s
        loop f z b e
    }
  let inline sum cel    = runList cel (+) LanguagePrimitives.GenericZero

While possible to implement fromRange using F# for I personally like tail recursion as it allows me to avoid mutable state, note that the loop goes backwards because runList is essentially a foldBack. sum is trivial as it just forwards a folder function + and the generic zero to runList.

The laws for these two functions could be something like this:

  1. [x..y] == toList (fromRange x y)
  2. (n*n + n)/2 == sum (fromRange 0 n)

The first law states that given any integer x and y the list generated from the comprehension [x..y] should always be identical to the list created using toList (fromRange x y)

The second law says that given any integer n (which is greater or equal to 0) (n*n + n)/2 should always be equal to the sum of all integers between 0 and n inclusive. (n*n + n)/2 is a known expression, the sum of integers from 0 to n inclusive, a sum sequence is known as an arithmetic sequence (https://en.wikipedia.org/wiki/Arithmetic_progression).

The property tests could look like this:

    static member ``For all xs: x..y is equal to list created from fromRange x y`` (x: int) (y: int) =
      [x..y] == toList (fromRange x y)

    static member ``For all n: (n*n + n)/2 is equal to sum (fromRange 0 n)`` (n: int) =
      let n = abs n
      (n*n + n)/2 == sum (fromRange 0 n)

The Finale

As they say, I leave it as an exercise to the reader to implement the functions above for IEnumerable. I think you will find that exercise somewhat more cumbersome than implementing them for a Church Encoded List.

Even though the Church Encoded List definition might seem non-obvious the functions we create become almost the mathematical definitions of what they should be. Testing almost seems unnecessary but if we derive laws and then we can then test these laws using FsCheck and with little effort reach a high degree of confidence.

Working further and changing the definition to use fold rather than foldRight the Church Encoded List is quite performant abstraction as well, often out-performing Linq. Due to technicalities around how tail recursive functions are implemented in .NET to get predictable performance one has to apply some ugly hacks to avoid the function calls from being invoked as tail calls.

In conclusion I hope you can agree with me that the Church Encoded List is an elegant abstraction and testing with FsCheck is just awesome.

Merry Christmas, Mårten

References

  1. Rank-N types in Haskell - https://wiki.haskell.org/Rank-N_types
  2. Church numerals - https://en.wikipedia.org/wiki/Church_encoding
  3. FsCheck - https://fscheck.github.io/FsCheck/
  4. Arithmetic progression - https://en.wikipedia.org/wiki/Arithmetic_progression
  5. Algebraic driven design - https://algebradriven.design/
// Requires the nuget package: FsCheck
// F# don't support Rank-N types for function types
// But .NET supports Rank-2 types in generic methods in generic classes
type [<AbstractClass>] Cel<'T>() =
abstract RunList: ('T -> 'S -> 'S) -> 'S -> 'S
override x.ToString () = "cel " + (x.RunList (fun v s -> v::s) []).ToString()
module Cel =
// Helper method to run a list over a folder function and a zero value
let inline runList (c: 'T Cel) folder (zero: 'S) = c.RunList folder zero
// The empty list
[<GeneralizableValue>]
let nil<'T> =
{ new Cel<'T>() with
override _.RunList f z = z
}
// Appends a value
let inline cons x cel =
{ new Cel<'T>() with
override _.RunList f z = f x (runList cel f z)
}
let inline fromList xs =
{ new Cel<'T>() with
override _.RunList f z = List.foldBack f xs z
}
let inline fromRange b e =
{ new Cel<int>() with
override _.RunList f z =
let rec loop f s b i =
if i >= b then
loop f (f i s) b (i - 1)
else
s
loop f z b e
}
// Like runList is conceptually like a foldBack, buildList is conceptually like unfoldBack
let inline buildList b e =
{ new Cel<'T>() with
override _.RunList f z =
let rec loop u f s i =
match u i with
| None -> s
| Some (v, j) -> loop u f (f v s) j
loop b f z e
}
// Pipes
let inline collect collector cel =
{ new Cel<'U>() with
override _.RunList f z =
let ff v s = runList (collector v) f s
runList cel ff z
}
let inline filter filterer cel =
{ new Cel<'T>() with
override _.RunList f z = runList cel (fun v s -> if filterer v then f v s else s) z
}
let inline map mapper cel =
{ new Cel<'U>() with
override _.RunList f z = runList cel (fun v s -> f (mapper v) s) z
}
// Sinks
let inline sum cel = runList cel (+) LanguagePrimitives.GenericZero
let inline toList cel = runList cel (fun v s -> v::s) []
module CelTest =
open Cel
open FsCheck
type Generators =
// Custom Arbitrary values for 'T Cel
static member Cel() =
let fsList = Arb.Default.FsList()
{ new Arbitrary<'T Cel>() with
override _.Generator = fsList.Generator |> Gen.map fromList
override _.Shrinker t = (t |> toList) |> fsList.Shrinker |> Seq.map fromList
}
// Testing equalness for various types
type Eq = Eq with
static member inline (?<-) (Eq, a: int Cel , b: int Cel ) = toList a = toList b
static member inline (?<-) (Eq, a: int list , b: int list ) = a = b
static member inline (?<-) (Eq, a: int , b: int ) = a = b
// This is "trick" to get function overloading for functions defined with let
let inline (==) a b =
let r = (?<-) Eq a b
#if DEBUG
if r then true
else printfn "EQ failed: %A = %A" a b; false
#else
r
#endif
type Properties =
// For all laws these conventions:
// xs - any 'T list
// xxs - any 'T list list
// x,y,z - any scalar 'T
// cs - any 'T Cel
// ccs - any 'T Cel Cel
// f - any function
// n - any positive int
// Because of technical reasons FsCheck don't allow Generic Properties
// thus 'T in all tests is an int
// Church Encoded List Laws
static member ``For all xs: runList xs cons nil == xs`` (xs: int Cel) =
runList xs cons nil == xs
static member ``For all xs, f and z: runList (fromList xs) f z == foldr f z xs`` (xs: int list) (f: int -> int -> int) (z: int) =
runList (fromList xs) f z == List.foldBack f xs z
static member ``For all cs, f and z: foldr f z (toList cs) == runList cs f z`` (cs: int Cel) (f: int -> int -> int) (z: int) =
List.foldBack f (toList cs) z == runList cs f z
// collect, filter, map and sum laws defined in terms of List operations
static member ``For all f and xxs: List.collect and collect produce equal results`` (f: int list -> int list) (xxs: int list list) =
List.collect f xxs == (fromList >> collect (f >> fromList) >> toList) xxs
static member ``For all f and xs: List.filter and filter produce equal results`` (xs: int list) (f: int -> bool) =
List.filter f xs == (fromList >> filter f >> toList) xs
static member ``For all f and xs: List.map and map produce equal results`` (xs: int list) (f: int -> int) =
List.map f xs == (fromList >> map f >> toList) xs
static member ``For all xs: List.sum and sum produce equal results`` (xs: int list) =
List.sum xs == (fromList >> sum) xs
// collect, filter, map and sum laws defined in terms of runList
static member ``For all f and ccs: collect produce equal results to runList that concatenates the results of the lists of f`` (ccs: int Cel Cel) =
// FsCheck generating: int Cel -> int Cel function results in a quite slow function
// so use id for now
let f = id
collect f ccs == runList ccs (fun v s -> runList (f v) cons s) nil
static member ``For all f and cs: map produce equal results to runList concatenating the true results for f`` (cs: int Cel) (f: int -> bool) =
filter f cs == runList cs (fun v s -> if f v then cons v s else s) nil
static member ``For all f and cs: map produce equal results to runList concatenating the results of f`` (cs: int Cel) (f: int -> int) =
map f cs == runList cs (fun v s -> cons (f v) s) nil
static member ``For all f and cs: sum produce equal results to runList that folds with + and 0`` (cs: int Cel) =
sum cs == runList cs (+) 0
// filter and map laws defined in terms of collect
static member ``For all f and cs: filter produce equal results to collect concatenating singletons list for all true results for f`` (cs: int Cel) (f: int -> bool) =
filter f cs == collect (fun v -> if f v then cons v nil else nil) cs
static member ``For all f and cs: map produce equal results to collect concatenating singletons list with results for f`` (cs: int Cel) (f: int -> int) =
map f cs == collect (fun v -> cons (f v) nil) cs
// fromRange laws
static member ``For all xs: x..y is equal to list created from fromRange x y`` (x: int) (y: int) =
[x..y] == toList (fromRange x y)
static member ``For all n: (n*n + n)/2 is equal to sum (fromRange 0 n)`` (n: int) =
let n = abs n
(n*n + n)/2 == sum (fromRange 0 n)
static member ``For all xs: fromRange x y is equal to a buildList operation that generates the sequence x..y`` (x: int) (y: int)=
fromRange x y == buildList (fun i -> if i >= x then Some (i, i - 1) else None) y
let run () =
let config = Config.Quick
let config =
{ config with
Arbitrary = (typeof<Generators>)::config.Arbitrary
}
#if RELEASE
let config =
{ config with
StartSize = config.StartSize*10
MaxTest = 1000
MaxFail = 1000
}
#endif
Check.All<Properties> config
[<EntryPoint>]
let main argv =
CelTest.run ()
0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment