Skip to content

Instantly share code, notes, and snippets.

@tromp
Last active January 24, 2025 21:14

Revisions

  1. tromp revised this gist Jan 19, 2024. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion Binary_Lambda_Calculus.md
    Original file line number Diff line number Diff line change
    @@ -229,7 +229,7 @@ proves that
    $$KP(x) \leq \ell(\mbox{code}(x))+338$$

    where $\mbox{code}(x)$ is the
    [Levenstein code](https://en.wikipedia.org/wiki/Levenstein_coding) for *x* defined by
    [Levenshtein code](https://en.wikipedia.org/wiki/Levenshtein_coding) for *x* defined by

    $$\mbox{code}(0) = 0$$

  2. tromp revised this gist Jan 14, 2024. 1 changed file with 3 additions and 4 deletions.
    7 changes: 3 additions & 4 deletions Binary_Lambda_Calculus.md
    Original file line number Diff line number Diff line change
    @@ -56,12 +56,11 @@ Instead, BLC requires translating bitstrings into lambda terms, to which the mac
    (itself a lambda term) can be readily applied.

    Bits 0 and 1 are translated into the standard
    [lambda booleans](https://en.wikipedia.org/wiki/Lambda_calculus#Logic_and_predicates)
    B<sub>0</sub> = True and B<sub>1</sub> = False:
    [lambda booleans](https://en.wikipedia.org/wiki/Lambda_calculus#Logic_and_predicates) :

    $$\mbox{True} = \lambda x_0 \lambda x_1. x_0 $$
    $$ B_0 = \lambda x_0 \lambda x_1. x_0 = \mbox{True} $$

    $$\mbox{False} = \lambda x_0 \lambda x_1. x_1 $$
    $$ B_1 = \lambda x_0 \lambda x_1. x_1 = \mbox{False}$$

    which can be seen to directly implement the
    [if-then-else](https://en.wikipedia.org/wiki/If-then-else) operator.
  3. tromp revised this gist Jan 14, 2024. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions Binary_Lambda_Calculus.md
    Original file line number Diff line number Diff line change
    @@ -59,9 +59,9 @@ Bits 0 and 1 are translated into the standard
    [lambda booleans](https://en.wikipedia.org/wiki/Lambda_calculus#Logic_and_predicates)
    B<sub>0</sub> = True and B<sub>1</sub> = False:

    $$\mbox{True} = \lambda x0 \lambda x1. x0 $$
    $$\mbox{True} = \lambda x_0 \lambda x_1. x_0 $$

    $$\mbox{False} = \lambda x0 \lambda x1. x1 $$
    $$\mbox{False} = \lambda x_0 \lambda x_1. x_1 $$

    which can be seen to directly implement the
    [if-then-else](https://en.wikipedia.org/wiki/If-then-else) operator.
  4. tromp revised this gist Jan 14, 2024. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions Binary_Lambda_Calculus.md
    Original file line number Diff line number Diff line change
    @@ -59,9 +59,9 @@ Bits 0 and 1 are translated into the standard
    [lambda booleans](https://en.wikipedia.org/wiki/Lambda_calculus#Logic_and_predicates)
    B<sub>0</sub> = True and B<sub>1</sub> = False:

    $$\mbox{True} = \lambda x \lambda y. x$$
    $$\mbox{True} = \lambda x0 \lambda x1. x0 $$

    $$\mbox{False} = \lambda x \lambda y. y$$
    $$\mbox{False} = \lambda x0 \lambda x1. x1 $$

    which can be seen to directly implement the
    [if-then-else](https://en.wikipedia.org/wiki/If-then-else) operator.
  5. tromp revised this gist Aug 6, 2023. 1 changed file with 7 additions and 3 deletions.
    10 changes: 7 additions & 3 deletions Binary_Lambda_Calculus.md
    Original file line number Diff line number Diff line change
    @@ -262,7 +262,7 @@ $$KP(\mbox{Church numeral } n) \leq \ell(\mbox{code}(n))+139$$

    ## Complexity of Sets
    The complexity of a set of natural numbers is the complexity of its characteristic sequence,
    an infinite lambda term in the [[Infinitary Lambda Calculus]].
    an infinite lambda term in the Infinitary Lambda Calculus [[3](#3)].

    The program

    @@ -473,5 +473,9 @@ John Tromp, Binary Lambda Calculus and Combinatory Logic,
    [(pdf version)](http://tromp.github.io/cl/LC.pdf)

    ### 2
    G J Chaitin, ''Algorithmic information theory'', Volume I in Cambridge Tracts in Theoretical Computer Science, Cambridge University Press, October 1987,
    ([pdf version](http://www.cs.auckland.ac.nz/~chaitin/cup.pdf))
    G J Chaitin, ''Algorithmic information theory'', Volume I in Cambridge Tracts in Theoretical Computer Science, Cambridge University Press, October 1987.
    ([pdf version](http://www.cs.auckland.ac.nz/~chaitin/cup.pdf))

    ### 3
    J.R. Kennaway, J.W. Klop, M.R. Sleep, F.J. de Vries, ''Infinitary lambda calculus'', Theoretical Computer Science, Volume 175, Issue 1, 30 March1997, Pages 93-125.
    ([pdf version](https://core.ac.uk/download/pdf/301634883.pdf))
  6. tromp revised this gist Jul 23, 2023. 1 changed file with 76 additions and 69 deletions.
    145 changes: 76 additions & 69 deletions Binary_Lambda_Calculus.md
    Original file line number Diff line number Diff line change
    @@ -3,7 +3,7 @@
    The Binary Lambda Calculus (BLC) is a minimal,
    [pure](https://en.wikipedia.org/wiki/Purely_functional_programming)
    [functional programming language](https://en.wikipedia.org/wiki/Functional_programming)
    invented by John Tromp in 2004 [1]
    invented by John Tromp in 2004 [[1](#1)]
    based on a [binary encoding](https://en.wikipedia.org/wiki/Binary_code)
    of the untyped [lambda calculus](https://en.wikipedia.org/wiki/Lambda_calculus) in
    [De Bruijn index](https://en.wikipedia.org/wiki/De_Bruijn_index) notation.
    @@ -143,7 +143,7 @@ face the opposite problem: how to encode lambda terms into bitstrings?
    ## Lambda encoding
    First we need to write our lambda terms in a particular notation using what is known as
    [De Bruijn indices](https://en.wikipedia.org/wiki/De_Bruijn_indices).
    Our encoding blc(M) of term M is then defined recursively as follows
    Our encoding blc(*M*) of term *M* is then defined recursively as follows

    $$\mbox{blc}(\lambda M) = 00\ \mbox{blc}(M)$$

    @@ -152,27 +152,28 @@ $$\mbox{blc}(M\ N) = 01\ \mbox{blc}(M)\ \mbox{blc}(N)$$
    $$\mbox{blc}(i) = 1^i0$$

    For instance, the pairing function $\lambda x \lambda y \lambda z. z\ x\ y$
    is written $\lambda \lambda \lambda. 1\ 3\ 2$ in De Bruijn format, with blc code
    <code>00 00 00 01 01 10 1110 110</code>.
    is written $\lambda \lambda \lambda 1\ 3\ 2$ in De Bruijn format, with blc code
    `00 00 00 01 01 10 1110 110`.

    A **closed** lambda term is one in which all variables are bound, i.e. without any free variables.
    In De Bruijn format, this means that an index *i* can only appear
    within at least *i* nested lambdas.
    The number of closed terms of size *n* bits is given by sequence [A114852](https://oeis.org/A114852)
    of the [On-Line Encyclopedia of Integer Sequences](https://en.wikipedia.org/wiki/On-Line_Encyclopedia_of_Integer_Sequences).

    The shortest possible closed term is the identity function $\mbox{blc}(\lambda 1) = 0010$.
    The shortest possible closed term is the identity function, with code $\mbox{blc}(\lambda\ 1)$ = `0010`.
    In delimited mode, this machine just copies its input to its output.
    In undelimited mode, it copies an infinite stream of bits.

    So what is this [universal machine](https://en.wikipedia.org/wiki/Universal_machine) *U*?
    Here it is, in De Bruijn format:

    $$(\lambda\ 1\ 1) (\lambda\ \lambda\ \lambda\ 1\ (\lambda\ \lambda\ \lambda\ \lambda\ 3\
    (\lambda\ 5\ (3\ (\lambda\ 2\ (3\ (\lambda\ \lambda\ 3\ (\lambda\ 1\ 2\ 3)))
    $$(\lambda\ 1\ 1)\ (\lambda\ \lambda\ \lambda\ 1\ (\lambda\ \lambda\ \lambda\ \lambda\ 3\
    (\lambda\ 5\ (3\ (\lambda\ 2\ (3\ (\lambda\ \lambda\ 3\ (\lambda\ 1\ 2\ 3)))\
    (4\ (\lambda\ 4\ (\lambda\ 3\ 1\ (2\ 1))))))$$

    $$(1\ (2\ (\lambda\ 1\ 2))(\lambda\ 4\ (\lambda\ 4\ (\lambda\ 2\ (1\ 4)))\ 5))))
    (3\ 3)\ 2) (\lambda\ 1 ((\lambda\ 1\ 1) (\lambda\ 1\ 1)))$$
    (3\ 3)\ 2)\ (\lambda\ 1\ ((\lambda\ 1\ 1)\ (\lambda\ 1\ 1)))$$

    This is in binary:
    ```
    @@ -206,23 +207,23 @@ $$\lambda\ \lambda\ 1\ ((\lambda\ 1\ 1) (\lambda\ \lambda\ \lambda\ \lambda\ 2\
    (\lambda\ \lambda\ 3\ 2\ (3\ 2\ (2\ (5\ 1\ (2\ 1))))))) (\lambda\ \lambda\ 1)$$

    $$(\lambda\ \lambda\ \lambda\ 1\ (\lambda\ 4\ (\lambda\ 4\ (\lambda\ 1\ 3\ 2))))
    (\lambda\ \lambda\ \lambda\ 1\ (3\ (\lambda\ \lambda 1)) 2) (\lambda\ 1)\ 2$$
    (\lambda\ \lambda\ \lambda\ 1\ (3\ (\lambda\ \lambda 1))\ 2) (\lambda\ 1)\ 2$$

    proves that

    $$KP(x|\ell(x)) \leq \ell(x)+188$$

    The program

    $$(\lambda 1 1) (\lambda \lambda \lambda 1 (\lambda 1 (3 (\lambda \lambda 1))
    (4\ 4 (\lambda 1 (\lambda \lambda \lambda 1 (\lambda 4 (\lambda \lambda 5\ 2
    (5\ 2 (3\ 1 (2\ 1)))))) 4 (\lambda 1))))) (\lambda \lambda \lambda 1 (3
    ((\lambda 1 1)$$
    $$(\lambda\ 1\ 1) (\lambda\ \lambda\ \lambda\ 1\ (\lambda\ 1\ (3\ (\lambda\ \lambda\ 1))
    (4\ 4\ (\lambda\ 1 (\lambda\ \lambda\ \lambda\ 1\ (\lambda\ 4\ (\lambda\ \lambda 5\ 2\
    (5\ 2\ (3\ 1\ (2\ 1))))))\ 4\ (\lambda\ 1))))) (\lambda\ \lambda\ \lambda\ 1\ (3\
    ((\lambda\ 1\ 1)$$

    $$(\lambda \lambda \lambda \lambda 1 (\lambda 5\ 5
    (\lambda \lambda 3\ 5\ 6 (\lambda 1 (\lambda \lambda 6\ 1\ 2) 3))
    (\lambda \lambda 5 (\lambda 1\ 4\ 3))) (3\ 1)) (\lambda \lambda 1
    (\lambda \lambda 2) 2) (\lambda 1)) (\lambda \lambda 1)) 2)$$
    $$(\lambda\ \lambda\ \lambda\ \lambda\ 1\ (\lambda\ 5\ 5\
    (\lambda\ \lambda\ 3\ 5\ 6\ (\lambda\ 1\ (\lambda\ \lambda\ 6\ 1\ 2) 3))
    (\lambda\ \lambda\ 5\ (\lambda\ 1\ 4\ 3))) (3\ 1)) (\lambda\ \lambda 1\
    (\lambda\ \lambda\ 2)\ 2)\ (\lambda\ 1))\ (\lambda\ \lambda\ 1))\ 2)$$

    proves that

    @@ -265,11 +266,11 @@ an infinite lambda term in the [[Infinitary Lambda Calculus]].

    The program

    $$\lambda (\lambda 1 (1 ((\lambda 1 1) (\lambda \lambda \lambda 1 (\lambda \lambda 1)
    ((\lambda 4 4 1 ((\lambda 1 1) (\lambda 2 (1 1))))$$
    $$\lambda\ (\lambda\ 1\ (1\ ((\lambda\ 1\ 1)\ (\lambda\ \lambda\ \lambda\ 1\ (\lambda\ \lambda\ 1)
    ((\lambda\ 4\ 4\ 1\ ((\lambda\ 1\ 1)\ (\lambda\ 2\ (1\ 1))))$$

    $$(\lambda \lambda \lambda \lambda
    1 3 (2 (6 4))))) (\lambda \lambda \lambda 4 (1 3))))) (\lambda \lambda 1 (\lambda \lambda 2) 2)$$
    $$(\lambda\ \lambda\ \lambda\ \lambda\ 1\ 3\ (2\ (6\ 4)))))\ (\lambda\ \lambda\ \lambda\ 4\
    (1\ 3)))))\ (\lambda\ \lambda\ 1\ (\lambda\ \lambda\ 2)\ 2)$$

    whose first 100 bits of output are

    @@ -293,26 +294,26 @@ This is even shorter than Haskell's 23 byte long
    ## Symmetry of information
    The program

    $$(\lambda 1 (\lambda \lambda 2) (\lambda 1 (\lambda \lambda 1))
    (\lambda \lambda \lambda (\lambda 1 (3 2) (\lambda \lambda 3 1
    (6 1 1 (4 ((\lambda 1 1) (\lambda \lambda \lambda \lambda \lambda
    1 2 (\lambda 1 (\lambda \lambda 1) (\lambda \lambda \lambda 8 2)
    (\lambda \lambda 1) (4 (6 6)$$
    $$(\lambda\ 1\ (\lambda\ \lambda\ 2) (\lambda\ 1\ (\lambda\ \lambda\ 1))
    (\lambda\ \lambda\ \lambda\ (\lambda\ 1\ (3\ 2) (\lambda\ \lambda\ 3\ 1\
    (6\ 1\ 1\ (4\ ((\lambda\ 1\ 1) (\lambda\ \lambda\ \lambda\ \lambda\ \lambda\
    1\ 2\ (\lambda\ 1\ (\lambda\ \lambda\ 1)\ (\lambda\ \lambda\ \lambda\ 8\ 2)
    (\lambda\ \lambda\ 1)\ (4\ (6\ 6)$$

    $$(\lambda 4 (\lambda 1 7 2)))))) 5)
    (\lambda \lambda 1) (\lambda \lambda \lambda \lambda 1)
    (\lambda \lambda 2) (\lambda \lambda 1)) (\lambda \lambda \lambda
    1 (\lambda 1 6 4) 2))) (4 (\lambda 1) (\lambda 1) (\lambda 1) 1))
    (\lambda 1))$$
    $$(\lambda\ 4\ (\lambda\ 1\ 7\ 2))))))\ 5)
    (\lambda\ \lambda\ 1) (\lambda\ \lambda\ \lambda\ \lambda\ 1)
    (\lambda\ \lambda\ 2) (\lambda\ \lambda\ 1)) (\lambda\ \lambda\ \lambda\
    1\ (\lambda\ 1\ 6\ 4)\ 2)))\ (4\ (\lambda\ 1) (\lambda\ 1) (\lambda\ 1)\ 1))
    (\lambda\ 1))$$

    $$(\lambda \lambda (\lambda 1 1) (\lambda \lambda \lambda
    \lambda 1 (\lambda (\lambda \lambda \lambda \lambda 3 (\lambda 6
    (3 (\lambda 2 (3 (\lambda 14 (\lambda 3 (\lambda 1 2 3)))) (4
    (\lambda 4 (\lambda 14 (3 1) (2 1)))))) (1 (2 (\lambda 1 2))$$
    $$(\lambda\ \lambda\ (\lambda\ 1\ 1) (\lambda\ \lambda\ \lambda\
    \lambda\ 1\ (\lambda\ (\lambda\ \lambda\ \lambda\ \lambda\ 3\ (\lambda\ 6\
    (3\ (\lambda\ 2\ (3\ (\lambda\ 14 (\lambda\ 3\ (\lambda\ 1\ 2\ 3)))) (4\
    (\lambda\ 4\ (\lambda\ 14\ (3\ 1)\ (2\ 1))))))\ (1\ (2\ (\lambda\ 1\ 2))$$

    $$
    (\lambda \lambda 5 (\lambda 5 (\lambda 2 (1 5))) 7 6)) (\lambda
    6 (\lambda 1 3 2)))) (\lambda 4 (\lambda 1 3 2))) (4 4) 3))$$
    (\lambda\ \lambda\ 5 (\lambda\ 5 (\lambda\ 2\ (1\ 5)))\ 7\ 6)) (\lambda\
    6\ (\lambda\ 1\ 3\ 2)))) (\lambda\ 4\ (\lambda\ 1\ 3\ 2)))\ (4\ 4)\ 3))$$

    proves that

    @@ -325,15 +326,16 @@ Proving the converse

    $$KP(y|x^{\ast}) \leq KP(x,y) - KP(x) + O(1)$$

    involves simulating infinitely many programs in [https://en.wikipedia.org/wiki/Dovetailing_(computer_science) dovetailing] fashion,
    seeing which ones halt and output the pair of ''x'' (for which a shortest program is given) and any ''y'',
    and for each such program ''p'', requesting a new codeword for ''y'' of length <math>\ell(p)-KP(x)</math>.
    The [https://en.wikipedia.org/wiki/Kraft_inequality Kraft inequality] ensures that this infinite enumeration of requests can be fulfilled,
    and in fact codewords for ''y'' can be decoded on the fly, in tandem with the above enumeration.
    Details of this fundamental result by Chaitin can be found in.<ref>G J Chaitin, ''Algorithmic information theory'', Volume I in Cambridge Tracts in Theoretical Computer Science, Cambridge University Press, October 1987, [http://www.cs.auckland.ac.nz/~chaitin/cup.pdf (pdf version)]</ref>
    involves simulating infinitely many programs in [dovetailing](https://en.wikipedia.org/wiki/Dovetailing_(computer_science)) fashion,
    seeing which ones halt and output the pair of *x* (for which a shortest program is given) and any *y*,
    and for each such program *p*, requesting a new codeword for *y* of length $\ell(p)-KP(x)$.
    The [Kraft inequality](https://en.wikipedia.org/wiki/Kraft_inequality) ensures that this infinite enumeration of requests can be fulfilled,
    and in fact codewords for *y* can be decoded on the fly, in tandem with the above enumeration.
    Details of this fundamental result by Chaitin can be found in
    [[2](#2)].

    ## A quine
    The term $Q=\lambda 1 ((\lambda 1 1) (\lambda \lambda \lambda \lambda \lambda 1 4 (3 (5 5) 2))) 1$
    The term $Q=\lambda 1\ ((\lambda\ 1\ 1)\ (\lambda\ \lambda\ \lambda\ \lambda\ \lambda\ 1\ 4\ (3\ (5\ 5)\ 2)))\ 1$
    concatenates two copies of its input, proving that

    $$KS(x x) \leq \ell(x) + 66$$
    @@ -349,7 +351,7 @@ For $\ell(x)>66$ though, we do actually compress $xx$ by $\ell(x)-66$ bits.

    What could be the shortest program that produces a larger output?
    The following is a good candidate: the program
    $(\lambda 1 1 1 1 (\lambda \lambda 1 (\lambda \lambda 1) 2)) (\lambda \lambda 2 (2 1))$,
    $(\lambda\ 1\ 1\ 1\ 1\ (\lambda\ \lambda\ 1\ (\lambda\ \lambda\ 1)\ 2))\ (\lambda\ \lambda\ 2\ (2\ 1))$,
    of size 55 bits, uses [Church numerals](https://en.wikipedia.org/wiki/Church_numerals)
    to output exactly $2^{2^{2^2}}=65536$ ones.
    That beats both [gzip](https://en.wikipedia.org/wiki/Gzip) and [bzip2](https://en.wikipedia.org/wiki/Bzip2),
    @@ -377,13 +379,13 @@ where each byte is represented as a delimited list of 8 bits in

    BLC8 requires a more complicated universal machine:

    $$U8 = \lambda 1 ((\lambda 1\ 1) (\lambda (\lambda \lambda \lambda 1 (\lambda \lambda
    \lambda 2 (\lambda \lambda \lambda (\lambda 7 (10\ (\lambda 5\ (2 (\lambda \lambda 3
    (\lambda\ 1\ 2\ 3))) (11 (\lambda 3 (\lambda 3\ 1\ (2\ 1)))))\ 3)$$
    $$U8 = \lambda\ 1\ ((\lambda\ 1\ 1) (\lambda\ (\lambda\ \lambda\ \lambda\ 1\ (\lambda\ \lambda\
    \lambda\ 2\ (\lambda\ \lambda\ \lambda\ (\lambda\ 7\ (10\ (\lambda\ 5\ (2\ (\lambda\ \lambda\ 3\
    (\lambda\ 1\ 2\ 3))) (11\ (\lambda\ 3\ (\lambda\ 3\ 1\ (2\ 1)))))\ 3)$$

    $$(4\ (1\ (\lambda 1\ 5) 3)
    (10\ (\lambda 2 (\lambda 2\ (1\ 6))) 6))) 8) (\lambda 1 (\lambda\ 8\ 7 (\lambda\ 1\ 6\ 2))))
    (\lambda\ 1\ (4\ 3))) (1\ 1)) (\lambda \lambda\ 2\ ((\lambda\ 1\ 1) (\lambda\ 1\ 1))))$$
    $$(4\ (1\ (\lambda 1\ 5)\ 3)
    (10\ (\lambda\ 2\ (\lambda 2\ (1\ 6)))\ 6)))\ 8) (\lambda\ 1\ (\lambda\ 8\ 7\ (\lambda\ 1\ 6\ 2))))
    (\lambda\ 1\ (4\ 3)))\ (1\ 1))\ (\lambda\ \lambda\ 2\ ((\lambda\ 1\ 1)\ (\lambda\ 1\ 1))))$$

    The parser in U8 keeps track of both remaining bytes, and remaining bits in the current byte,
    discarding the latter when parsing is completed. Thus the size of U8, which is 355 bits as a
    @@ -392,27 +394,27 @@ BLC program, is 45 bytes in BLC8.
    ## Brainfuck
    The following BLC8 program

    $$(\lambda\ 1\ 1) (\lambda (\lambda \lambda \lambda 1 (\lambda (\lambda\ 2\ 1\ 1\ 1
    (\lambda \lambda\ 1\ 3\ 3 (\lambda \lambda 1 (\lambda \lambda (\lambda 7 (1 (3
    (\lambda \lambda \lambda \lambda \lambda 10 (1 (\lambda\ 6\ 1\ 4\ 3)) (\lambda\ 1\ 5
    $$(\lambda\ 1\ 1) (\lambda\ (\lambda\ \lambda\ \lambda\ 1\ (\lambda\ (\lambda\ 2\ 1\ 1\ 1
    (\lambda\ \lambda\ 1\ 3\ 3 (\lambda\ \lambda\ 1\ (\lambda\ \lambda\ (\lambda\ 7\ (1\ (3\
    (\lambda\ \lambda\ \lambda\ \lambda\ \lambda\ 10\ (1\ (\lambda\ 6\ 1\ 4\ 3)) (\lambda\ 1\ 5
    (6\ 5\ 4\ 3\ 2)))$$

    $$(\lambda \lambda 2 ((\lambda\ 1\ 1) (\lambda \lambda \lambda 2
    (\lambda \lambda \lambda\ 6\ 6\ 2 (\lambda \lambda 6 (\lambda 1 (2\ 6) 3) (15 (5\ 1
    (\lambda 1)) (5 (\lambda 1) 1)))) (1\ 2\ (\lambda \lambda \lambda\ 3\ 1\ 2)))$$
    $$(\lambda\ \lambda\ 2\ ((\lambda\ 1\ 1) (\lambda\ \lambda\ \lambda 2\
    (\lambda\ \lambda\ \lambda\ 6\ 6\ 2\ (\lambda\ \lambda\ 6\ (\lambda\ 1\ (2\ 6) 3)\ (15\ (5\ 1\
    (\lambda\ 1))\ (5\ (\lambda\ 1)\ 1)))) (1\ 2\ (\lambda\ \lambda\ \lambda\ 3\ 1\ 2)))$$

    $$1 (\lambda \lambda 2))))) (3 (1 (\lambda \lambda \lambda \lambda 9 (1 (\lambda\ 5\ 1
    (\lambda\ 1\ 5\ 4))) (2\ 4 (\lambda\ 1\ 4\ 2)))) (5 (11 (\lambda 1)) (12 (\lambda 2
    $$1 (\lambda\ \lambda\ 2))))) (3\ (1\ (\lambda\ \lambda\ \lambda\ \lambda\ 9\ (1\ (\lambda\ 5\ 1\
    (\lambda\ 1\ 5\ 4)))\ (2\ 4\ (\lambda\ 1\ 4\ 2)))) (5\ (11\ (\lambda 1))\ (12\ (\lambda 2\
    ((\lambda\ 1\ 1)$$

    $$(\lambda \lambda \lambda 1 ((\lambda\ 1\ 1) (\lambda \lambda \lambda
    2 (1 (3\ 3)) (\lambda 8 (7\ 7\ 1)))) 2\ 1))))))) (\lambda 12 (\lambda 12 (\lambda 3
    (2\ 1)))))))) (\lambda \lambda 1))) (1\ 1))$$
    $$(\lambda\ \lambda\ \lambda\ 1\ ((\lambda\ 1\ 1) (\lambda\ \lambda \lambda\
    2\ (1\ (3\ 3)) (\lambda\ 8\ (7\ 7\ 1)))) 2\ 1))))))) (\lambda\ 12\ (\lambda 12\ (\lambda 3\
    (2\ 1)))))))) (\lambda\ \lambda 1)))\ (1\ 1))$$

    $$(\lambda (\lambda\ 1\ 1) (\lambda \lambda
    1 ((\lambda 1 (1 (1 (\lambda \lambda 1 (\lambda \lambda 2) 2)))) (\lambda \lambda
    2 (2\ 1)) (\lambda \lambda 1)) (2\ 2)) (1 (\lambda \lambda \lambda \lambda \lambda
    \lambda 1)) 1)$$
    $$(\lambda\ (\lambda\ 1\ 1) (\lambda\ \lambda\
    1\ ((\lambda\ 1\ (1\ (1\ (\lambda\ \lambda\ 1 (\lambda\ \lambda\ 2)\ 2)))) (\lambda\ \lambda\
    2\ (2\ 1)) (\lambda\ \lambda\ 1)) (2\ 2)) (1\ (\lambda\ \lambda\ \lambda\ \lambda\ \lambda\
    \lambda\ 1))\ 1)$$

    is an unbounded tape [Brainfuck](https://en.wikipedia.org/wiki/Brainfuck) interpreter in 829 bits
    (under 104 bytes).
    @@ -426,7 +428,7 @@ Consuming more input than is available results in an error (the non-list result

    This interpreter is a rough translation of the following version written in
    [Haskell](https://en.wikipedia.org/wiki/Haskell_(programming_language))
    ```
    ```haskell
    import System.Environment(getArgs)
    import Control.Monad.State
    import Control.Monad.Writer
    @@ -463,8 +465,13 @@ main = do [name] <- getArgs

    ## References

    [1] John Tromp, Binary Lambda Calculus and Combinatory Logic,
    ### 1
    John Tromp, Binary Lambda Calculus and Combinatory Logic,
    in ''Randomness And Complexity, from Leibniz To Chaitin'', ed. Cristian S. Calude,
    World Scientific Publishing Company, October 2008.
    (The last reference, to an initial Haskell implementation, is dated 2004)
    [(pdf version)](http://tromp.github.io/cl/LC.pdf)
    [(pdf version)](http://tromp.github.io/cl/LC.pdf)

    ### 2
    G J Chaitin, ''Algorithmic information theory'', Volume I in Cambridge Tracts in Theoretical Computer Science, Cambridge University Press, October 1987,
    ([pdf version](http://www.cs.auckland.ac.nz/~chaitin/cup.pdf))
  7. tromp created this gist Jul 22, 2023.
    470 changes: 470 additions & 0 deletions Binary_Lambda_Calculus.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,470 @@
    ## Binary Lambda Calculus

    The Binary Lambda Calculus (BLC) is a minimal,
    [pure](https://en.wikipedia.org/wiki/Purely_functional_programming)
    [functional programming language](https://en.wikipedia.org/wiki/Functional_programming)
    invented by John Tromp in 2004 [1]
    based on a [binary encoding](https://en.wikipedia.org/wiki/Binary_code)
    of the untyped [lambda calculus](https://en.wikipedia.org/wiki/Lambda_calculus) in
    [De Bruijn index](https://en.wikipedia.org/wiki/De_Bruijn_index) notation.


    - [Background](#background)
    - [Binary I/O](#binary-io)
    - [Delimited versus undelimited](#delimited-versus-undelimited)
    - [Universality](#universality)
    - [Lambda encoding](#lambda-encoding)
    - [Complexity, concretely](#complexity-concretely)
    - [Theorems, concretely](#theorems-concretely)
    - [Complexity of Sets](#complexity-of-sets)
    - [Symmetry of information](#symmetry-of-information)
    - [A quine](#a-quine)
    - [Compression](#compression)
    - [Halting probability](#halting-probability)
    - [BLC8: byte sized I/O](#blc8-byte-sized-io)
    - [Brainfuck](#brainfuck)
    - [External links](#external-links)
    - [References](#references)

    ## Background
    BLC is designed to provide a very simple and elegant concrete definition of descriptional complexity
    ([Kolmogorov complexity](https://en.wikipedia.org/wiki/Kolmogorov_complexity)).

    Roughly speaking, the complexity of an object is the length of its shortest description.

    To make this precise, we take descriptions to be bitstrings, and identify a description
    method with some computational device, or machine, that transforms descriptions into
    objects. Objects are usually also just bitstrings, but can have additional structure as well,
    e.g., pairs of strings.

    Originally, [Turing machine](https://en.wikipedia.org/wiki/Turing_machine)s, the most well known formalism for computation,
    were used for this purpose. But they are somewhat lacking in ease of
    construction and composability. Another classical computational formalism,
    the [lambda calculus](https://en.wikipedia.org/wiki/Lambda_calculus), offers distinct advantages in ease of use.
    The lambda calculus doesn't incorporate any notion of I/O though,
    so one needs to be designed.

    ## Binary I/O
    Adding a readbit primitive function to lambda calculus, as
    [Chaitin](https://en.wikipedia.org/wiki/Gregory_Chaitin) did for [LISP](https://en.wikipedia.org/wiki/LISP),
    would destroy its [referential transparency](https://en.wikipedia.org/wiki/Referential_transparency_(computer_science)),
    unless one distinguishes between an I/O action and its result, as [Haskell](https://en.wikipedia.org/wiki/Haskell_(programming_language)) does
    with its [monadic](https://en.wikipedia.org/wiki/Monad_(functional_programming)) I/O.
    But that requires a type system, an unnecessary complication.

    Instead, BLC requires translating bitstrings into lambda terms, to which the machine
    (itself a lambda term) can be readily applied.

    Bits 0 and 1 are translated into the standard
    [lambda booleans](https://en.wikipedia.org/wiki/Lambda_calculus#Logic_and_predicates)
    B<sub>0</sub> = True and B<sub>1</sub> = False:

    $$\mbox{True} = \lambda x \lambda y. x$$

    $$\mbox{False} = \lambda x \lambda y. y$$

    which can be seen to directly implement the
    [if-then-else](https://en.wikipedia.org/wiki/If-then-else) operator.

    Now consider the pairing function

    $$\langle , \rangle = \lambda x \lambda y \lambda z. z\ x\ y $$

    applied to two terms $M$ and $N$

    $$\langle M, N \rangle = \lambda z. z\ M\ N$$

    Applying this to a boolean yields the desired component of choice, e.g.

    $$\langle M, N \rangle \mbox{ True } = (\lambda z. z\ M\ N) \mbox{ True } = \mbox{ True } M\ N = M$$

    A string s = b<sub>0</sub>b<sub>1</sub>…b<sub>n&minus;1</sub> is represented by repeated pairing as

    $$\langle B_{b_0}, \langle B_{b_1} \ldots \langle B_{b_{n-1}}, z\rangle \ldots \rangle \rangle$$

    which is denoted as s:z.

    The placeholder z works as a list continuation, that could be a nil list (to end the string)
    or another string (that would be appended to the original string).

    ## Delimited versus undelimited
    Descriptional complexity actually comes in two distinct flavors,
    depending on whether the input is considered to be delimited.

    Knowing the end of your input makes it easier to describe objects.
    For instance, you can just copy the whole input to output.
    This flavor is called *plain* or *simple* complexity.

    But in a sense it is additional information. A file system for instance needs to separately
    store the length of files. The [C language](https://en.wikipedia.org/wiki/C_(programming_language))
    uses the [null character](https://en.wikipedia.org/wiki/Null_character) to denote the end of a string,
    but this comes at the cost of not having that character available within strings.

    The other flavor is called *prefix* complexity, named after
    [prefix code](https://en.wikipedia.org/wiki/Prefix_code)s,
    where the machine needs to figure out, from the input read so far, whether it needs
    to read more bits. We say that the input is self-delimiting.
    This works better for communication channels, since one can send
    multiple descriptions, one after the other, and still tell them apart.

    In the I/O model of BLC, the flavor is dictated by the choice of placeholder z.
    If we set it to the unsolvable term

    $$\Omega = (\lambda x. x\ x)(\lambda x. x\ x)$$

    then, to avoid getting stuck in a loop, the machine must avoid examining it,
    by working in a self-delimiting manner.
    If on the other hand we use a lambda term specifically designed to be easy
    to distinguish from any pairing, then the input becomes delimited.
    BLC chooses <code>False</code> for this purpose but gives it the more descriptive alternative
    name of <code>Nil</code>.
    Dealing with lists that may be <code>Nil</code> is straightforward: since

    $$\langle x, y \rangle\ M\ N = M\ x\ y\ N, \mbox{and}$$

    $$\mbox{Nil } M\ N = N$$

    one can write functions M and N to deal with the two cases, the only caveat being that
    N will be passed to M as its third argument.

    ## Universality
    We can find a description method *U* such that for any other description method *D*,
    there is a constant *c* (depending only on *D*) such that no object takes more than *c* extra
    bits to describe with method *U* than with method *D*.
    The expressivenes of BLC allows these constants to be relatively small.
    In fact the constant will be the length of a binary encoding of a *D*-interpreter written in BLC,
    and *U* will be a lambda term that parses this encoding and runs this decoded interpreter on the
    rest of the input. *U* won't even have to know whether the description is delimited or not;
    it works the same either way.

    Having already solved the problem of translating bitstrings into lambda calculus, we now
    face the opposite problem: how to encode lambda terms into bitstrings?

    ## Lambda encoding
    First we need to write our lambda terms in a particular notation using what is known as
    [De Bruijn indices](https://en.wikipedia.org/wiki/De_Bruijn_indices).
    Our encoding blc(M) of term M is then defined recursively as follows

    $$\mbox{blc}(\lambda M) = 00\ \mbox{blc}(M)$$

    $$\mbox{blc}(M\ N) = 01\ \mbox{blc}(M)\ \mbox{blc}(N)$$

    $$\mbox{blc}(i) = 1^i0$$

    For instance, the pairing function $\lambda x \lambda y \lambda z. z\ x\ y$
    is written $\lambda \lambda \lambda. 1\ 3\ 2$ in De Bruijn format, with blc code
    <code>00 00 00 01 01 10 1110 110</code>.

    A **closed** lambda term is one in which all variables are bound, i.e. without any free variables.
    In De Bruijn format, this means that an index *i* can only appear
    within at least *i* nested lambdas.
    The number of closed terms of size *n* bits is given by sequence [A114852](https://oeis.org/A114852)
    of the [On-Line Encyclopedia of Integer Sequences](https://en.wikipedia.org/wiki/On-Line_Encyclopedia_of_Integer_Sequences).

    The shortest possible closed term is the identity function $\mbox{blc}(\lambda 1) = 0010$.
    In delimited mode, this machine just copies its input to its output.

    So what is this [universal machine](https://en.wikipedia.org/wiki/Universal_machine) *U*?
    Here it is, in De Bruijn format:

    $$(\lambda\ 1\ 1) (\lambda\ \lambda\ \lambda\ 1\ (\lambda\ \lambda\ \lambda\ \lambda\ 3\
    (\lambda\ 5\ (3\ (\lambda\ 2\ (3\ (\lambda\ \lambda\ 3\ (\lambda\ 1\ 2\ 3)))
    (4\ (\lambda\ 4\ (\lambda\ 3\ 1\ (2\ 1))))))$$

    $$(1\ (2\ (\lambda\ 1\ 2))(\lambda\ 4\ (\lambda\ 4\ (\lambda\ 2\ (1\ 4)))\ 5))))
    (3\ 3)\ 2) (\lambda\ 1 ((\lambda\ 1\ 1) (\lambda\ 1\ 1)))$$

    This is in binary:
    ```
    0101000110100000000101011000000000011110000101111110011110
    0001011100111100000011110000101101101110011111000011111000
    0101111010011101001011001110000110110000101111100001111100
    0011100110111101111100111101110110000110010001101000011010
    ```
    a mere 232 bits (29 bytes) long.
    A detailed analysis of machine *U* may be found in the first reference.

    ## Complexity, concretely
    In general, we can make complexity of an object conditional on several other objects
    that are provided as additional argument to the universal machine.
    Plain (or simple) complexity *KS*, prefix complexity *KP*, and an older form of prefix complexity now obsoleted by KP, are defined by

    $$KS(x|y_1,\ldots,y_{k}) = \min \\{ \ell(p)\ |\ U\ (p:Nil)\ y_1\ \ldots\ y_{k}\ = \ x \\} $$

    $$KP(x|y_1,\ldots,y_{k}) = \min \\{ \ell(p)\ |\ U\ (p:\ \Omega\ )\ y_1\ \ldots\ y_{k}\ = \ x \\}$$

    $$Kp(x|y_1,\ldots,y_{k}) = \min \\{ \ell(p)\ |\ U\ (p:\ z\ )\ y_1\ \ldots\ y_{k}\ = \langle x,z \rangle \\}$$

    ## Theorems, concretely
    The identity program $\lambda\ 1$ proves that

    $$KS(x) \leq \ell(x) + 4$$

    The program

    $$\lambda\ \lambda\ 1\ ((\lambda\ 1\ 1) (\lambda\ \lambda\ \lambda\ \lambda\ 2\ (4\ 4)
    (\lambda\ \lambda\ 3\ 2\ (3\ 2\ (2\ (5\ 1\ (2\ 1))))))) (\lambda\ \lambda\ 1)$$

    $$(\lambda\ \lambda\ \lambda\ 1\ (\lambda\ 4\ (\lambda\ 4\ (\lambda\ 1\ 3\ 2))))
    (\lambda\ \lambda\ \lambda\ 1\ (3\ (\lambda\ \lambda 1)) 2) (\lambda\ 1)\ 2$$

    proves that

    $$KP(x|\ell(x)) \leq \ell(x)+188$$

    The program

    $$(\lambda 1 1) (\lambda \lambda \lambda 1 (\lambda 1 (3 (\lambda \lambda 1))
    (4\ 4 (\lambda 1 (\lambda \lambda \lambda 1 (\lambda 4 (\lambda \lambda 5\ 2
    (5\ 2 (3\ 1 (2\ 1)))))) 4 (\lambda 1))))) (\lambda \lambda \lambda 1 (3
    ((\lambda 1 1)$$

    $$(\lambda \lambda \lambda \lambda 1 (\lambda 5\ 5
    (\lambda \lambda 3\ 5\ 6 (\lambda 1 (\lambda \lambda 6\ 1\ 2) 3))
    (\lambda \lambda 5 (\lambda 1\ 4\ 3))) (3\ 1)) (\lambda \lambda 1
    (\lambda \lambda 2) 2) (\lambda 1)) (\lambda \lambda 1)) 2)$$

    proves that

    $$KP(x) \leq \ell(\mbox{code}(x))+338$$

    where $\mbox{code}(x)$ is the
    [Levenstein code](https://en.wikipedia.org/wiki/Levenstein_coding) for *x* defined by

    $$\mbox{code}(0) = 0$$

    $$\mbox{code}(n+1) = 1\ \mbox{code}(\ell(n))\ n$$

    in which we identify numbers and bitstrings according to lexicographic order.
    This code has the nice property that for all *k*,

    $$\ell(\mbox{code}(n)) \leq \ell(n)+\ell(\ell(n))+\cdots+ \ell^{k-1}(n) + O(\ell^k(n))$$

    Furthermore, it makes lexicographic order of delimited numbers coincide with numeric order.

    Number | String | Delimited
    ------ | ------ | ---------
    0 | | 0
    1 | 0 | 10
    2 | 1 | 110 0
    3 | 00 | 110 1
    4 | 01 | 1110 0 00
    5 | 10 | 1110 0 01
    6 | 11 | 1110 0 10
    7 | 000 | 1110 0 11
    8 | 001 | 1110 1 000
    9 | 010 | 1110 1 001

    The prefix complexity of Church numerals themselves is even better:

    $$KP(\mbox{Church numeral } n) \leq \ell(\mbox{code}(n))+139$$

    ## Complexity of Sets
    The complexity of a set of natural numbers is the complexity of its characteristic sequence,
    an infinite lambda term in the [[Infinitary Lambda Calculus]].

    The program

    $$\lambda (\lambda 1 (1 ((\lambda 1 1) (\lambda \lambda \lambda 1 (\lambda \lambda 1)
    ((\lambda 4 4 1 ((\lambda 1 1) (\lambda 2 (1 1))))$$

    $$(\lambda \lambda \lambda \lambda
    1 3 (2 (6 4))))) (\lambda \lambda \lambda 4 (1 3))))) (\lambda \lambda 1 (\lambda \lambda 2) 2)$$

    whose first 100 bits of output are

    ```
    00110101000101000101000100000101000001000101000100
    00010000010100000100010100000100010000010000000100
    ```
    proves that

    $$\mathit{KS}(\mathit{PRIMES}) \leq 167$$

    (a prime) while a simple variation proves

    $$\mathit{KP}(\mathit{PRIMES}) \leq 176$$

    This is even shorter than Haskell's 23 byte long

    <code>nubBy(((>1).).gcd)[2..]
    </code>

    ## Symmetry of information
    The program

    $$(\lambda 1 (\lambda \lambda 2) (\lambda 1 (\lambda \lambda 1))
    (\lambda \lambda \lambda (\lambda 1 (3 2) (\lambda \lambda 3 1
    (6 1 1 (4 ((\lambda 1 1) (\lambda \lambda \lambda \lambda \lambda
    1 2 (\lambda 1 (\lambda \lambda 1) (\lambda \lambda \lambda 8 2)
    (\lambda \lambda 1) (4 (6 6)$$

    $$(\lambda 4 (\lambda 1 7 2)))))) 5)
    (\lambda \lambda 1) (\lambda \lambda \lambda \lambda 1)
    (\lambda \lambda 2) (\lambda \lambda 1)) (\lambda \lambda \lambda
    1 (\lambda 1 6 4) 2))) (4 (\lambda 1) (\lambda 1) (\lambda 1) 1))
    (\lambda 1))$$

    $$(\lambda \lambda (\lambda 1 1) (\lambda \lambda \lambda
    \lambda 1 (\lambda (\lambda \lambda \lambda \lambda 3 (\lambda 6
    (3 (\lambda 2 (3 (\lambda 14 (\lambda 3 (\lambda 1 2 3)))) (4
    (\lambda 4 (\lambda 14 (3 1) (2 1)))))) (1 (2 (\lambda 1 2))$$

    $$
    (\lambda \lambda 5 (\lambda 5 (\lambda 2 (1 5))) 7 6)) (\lambda
    6 (\lambda 1 3 2)))) (\lambda 4 (\lambda 1 3 2))) (4 4) 3))$$

    proves that

    $$KP(x,y) \leq KP(x) + KP(y|x^{\ast}) + 657$$

    where $x^{\ast}$ is a shortest program for ''x''.

    This inequality is the easy half of an equality (up to a constant) known as **Symmetry of information**.
    Proving the converse

    $$KP(y|x^{\ast}) \leq KP(x,y) - KP(x) + O(1)$$

    involves simulating infinitely many programs in [https://en.wikipedia.org/wiki/Dovetailing_(computer_science) dovetailing] fashion,
    seeing which ones halt and output the pair of ''x'' (for which a shortest program is given) and any ''y'',
    and for each such program ''p'', requesting a new codeword for ''y'' of length <math>\ell(p)-KP(x)</math>.
    The [https://en.wikipedia.org/wiki/Kraft_inequality Kraft inequality] ensures that this infinite enumeration of requests can be fulfilled,
    and in fact codewords for ''y'' can be decoded on the fly, in tandem with the above enumeration.
    Details of this fundamental result by Chaitin can be found in.<ref>G J Chaitin, ''Algorithmic information theory'', Volume I in Cambridge Tracts in Theoretical Computer Science, Cambridge University Press, October 1987, [http://www.cs.auckland.ac.nz/~chaitin/cup.pdf (pdf version)]</ref>

    ## A quine
    The term $Q=\lambda 1 ((\lambda 1 1) (\lambda \lambda \lambda \lambda \lambda 1 4 (3 (5 5) 2))) 1$
    concatenates two copies of its input, proving that

    $$KS(x x) \leq \ell(x) + 66$$

    Applying it to its own encoding gives a 132 bit [quine](https://en.wikipedia.org/wiki/Quine_(computing)):

    $$U (\mbox{blc}(Q)\ \mbox{blc}(Q):\mathrm{Nil}) = \mbox{blc}(Q)\ \mbox{blc}(Q)$$

    ## Compression
    So far, we've seen surprisingly little in the way of actually compressing an object into a
    shorter description; in the last example, we were only breaking even.
    For $\ell(x)>66$ though, we do actually compress $xx$ by $\ell(x)-66$ bits.

    What could be the shortest program that produces a larger output?
    The following is a good candidate: the program
    $(\lambda 1 1 1 1 (\lambda \lambda 1 (\lambda \lambda 1) 2)) (\lambda \lambda 2 (2 1))$,
    of size 55 bits, uses [Church numerals](https://en.wikipedia.org/wiki/Church_numerals)
    to output exactly $2^{2^{2^2}}=65536$ ones.
    That beats both [gzip](https://en.wikipedia.org/wiki/Gzip) and [bzip2](https://en.wikipedia.org/wiki/Bzip2),
    compressors that need 344 and 352 bits respectively, to describe the same output
    (as an 8192 byte file with a single letter name).

    ## Halting probability
    The [halting probability](https://en.wikipedia.org/wiki/Halting_probability)
    of the prefix universal machine is defined as the probability it will output
    any term that has a closed normal form (this includes all translated strings):

    $$\Omega_{\lambda} = \sum_{U (p:z) \in NF}\ 2^{-\ell(p)}$$

    With some effort, we can determine the first 3 bits of this particular number of wisdom:

    $$\Omega_{\lambda} = .000\ldots_2$$

    and either the 4th or 5th bit must be 1.

    ## BLC8: byte sized I/O
    While bit streams are nice in theory, they fare poorly in interfacing with the real world.
    The language BLC8 is a more practical variation on BLC in which programs operate on a stream of bytes,
    where each byte is represented as a delimited list of 8 bits in
    [big-endian](https://en.wikipedia.org/wiki/Big-endian) order.

    BLC8 requires a more complicated universal machine:

    $$U8 = \lambda 1 ((\lambda 1\ 1) (\lambda (\lambda \lambda \lambda 1 (\lambda \lambda
    \lambda 2 (\lambda \lambda \lambda (\lambda 7 (10\ (\lambda 5\ (2 (\lambda \lambda 3
    (\lambda\ 1\ 2\ 3))) (11 (\lambda 3 (\lambda 3\ 1\ (2\ 1)))))\ 3)$$

    $$(4\ (1\ (\lambda 1\ 5) 3)
    (10\ (\lambda 2 (\lambda 2\ (1\ 6))) 6))) 8) (\lambda 1 (\lambda\ 8\ 7 (\lambda\ 1\ 6\ 2))))
    (\lambda\ 1\ (4\ 3))) (1\ 1)) (\lambda \lambda\ 2\ ((\lambda\ 1\ 1) (\lambda\ 1\ 1))))$$

    The parser in U8 keeps track of both remaining bytes, and remaining bits in the current byte,
    discarding the latter when parsing is completed. Thus the size of U8, which is 355 bits as a
    BLC program, is 45 bytes in BLC8.

    ## Brainfuck
    The following BLC8 program

    $$(\lambda\ 1\ 1) (\lambda (\lambda \lambda \lambda 1 (\lambda (\lambda\ 2\ 1\ 1\ 1
    (\lambda \lambda\ 1\ 3\ 3 (\lambda \lambda 1 (\lambda \lambda (\lambda 7 (1 (3
    (\lambda \lambda \lambda \lambda \lambda 10 (1 (\lambda\ 6\ 1\ 4\ 3)) (\lambda\ 1\ 5
    (6\ 5\ 4\ 3\ 2)))$$

    $$(\lambda \lambda 2 ((\lambda\ 1\ 1) (\lambda \lambda \lambda 2
    (\lambda \lambda \lambda\ 6\ 6\ 2 (\lambda \lambda 6 (\lambda 1 (2\ 6) 3) (15 (5\ 1
    (\lambda 1)) (5 (\lambda 1) 1)))) (1\ 2\ (\lambda \lambda \lambda\ 3\ 1\ 2)))$$

    $$1 (\lambda \lambda 2))))) (3 (1 (\lambda \lambda \lambda \lambda 9 (1 (\lambda\ 5\ 1
    (\lambda\ 1\ 5\ 4))) (2\ 4 (\lambda\ 1\ 4\ 2)))) (5 (11 (\lambda 1)) (12 (\lambda 2
    ((\lambda\ 1\ 1)$$

    $$(\lambda \lambda \lambda 1 ((\lambda\ 1\ 1) (\lambda \lambda \lambda
    2 (1 (3\ 3)) (\lambda 8 (7\ 7\ 1)))) 2\ 1))))))) (\lambda 12 (\lambda 12 (\lambda 3
    (2\ 1)))))))) (\lambda \lambda 1))) (1\ 1))$$

    $$(\lambda (\lambda\ 1\ 1) (\lambda \lambda
    1 ((\lambda 1 (1 (1 (\lambda \lambda 1 (\lambda \lambda 2) 2)))) (\lambda \lambda
    2 (2\ 1)) (\lambda \lambda 1)) (2\ 2)) (1 (\lambda \lambda \lambda \lambda \lambda
    \lambda 1)) 1)$$

    is an unbounded tape [Brainfuck](https://en.wikipedia.org/wiki/Brainfuck) interpreter in 829 bits
    (under 104 bytes).

    This provides a nice example of the property that universal description methods differ by at most
    a constant in complexity. Writing a BLC8 interpreter in Brainfuck, which would provide a
    matching upper bound in the other direction, is left as an exercise for die-hard Brainfuck programmers.

    The interpreter expects its input to consist of a Brainfuck program followed by a <code>]</code> followed by the input for the Brainfuck program. The interpreter only looks at bits 0,1,4 of each character to determine which of <code>,-.+&lt;&gt;][</code> it is, so any characters other than those 8 should be stripped from a Brainfuck program.
    Consuming more input than is available results in an error (the non-list result $\lambda x.x$).

    This interpreter is a rough translation of the following version written in
    [Haskell](https://en.wikipedia.org/wiki/Haskell_(programming_language))
    ```
    import System.Environment(getArgs)
    import Control.Monad.State
    import Control.Monad.Writer
    import Control.Applicative hiding ((<|>),many)
    import Text.ParserCombinators.Parsec
    putc = do ( _, _,b, _) <- get; tell [b]
    getc = do ( left, right,_,b:input) <- get; put ( left, right, b,input)
    prev = do (l:left, right,b, input) <- get; put ( left,b:right, l,input)
    next = do ( left,r:right,b, input) <- get; put (b:left, right, r,input)
    decr = do ( left, right,b, input) <- get; put ( left, right,pred b,input)
    incr = do ( left, right,b, input) <- get; put ( left, right,succ b,input)
    loop body = do (_,_,b,_) <- get; when (b /= '\0') (body >> loop body)
    parseInstr = liftM loop (between (char '[') (char ']') parseInstrs)
    <|> prev <$ char '<'
    <|> next <$ char '>'
    <|> decr <$ char '-'
    <|> incr <$ char '+'
    <|> putc <$ char '.'
    <|> getc <$ char ','
    <|> return () <$ noneOf "]"
    parseInstrs = liftM sequence_ (many parseInstr)
    main = do [name] <- getArgs
    source <- readFile name
    input <- getContents
    let init = ("", repeat '\0', '\0', input)
    putStrLn $ either show (execWriter . (`evalStateT` init)) (parse parseInstrs name source)
    ```

    ## External links
    * [John's Lambda Calculus and Combinatory Logic Playground](http://tromp.github.io/cl/cl.html)
    * [A Binary Lambda Calculus interpreter in C for the IOCCC](http://www.ioccc.org/2012/tromp/hint.html)
    * [Counting Terms in the Binary Lambda Calculus](http://arxiv.org/abs/1401.0379)

    ## References

    [1] John Tromp, Binary Lambda Calculus and Combinatory Logic,
    in ''Randomness And Complexity, from Leibniz To Chaitin'', ed. Cristian S. Calude,
    World Scientific Publishing Company, October 2008.
    (The last reference, to an initial Haskell implementation, is dated 2004)
    [(pdf version)](http://tromp.github.io/cl/LC.pdf)