{{ message }}

Instantly share code, notes, and snippets.

# jdh30/rewrite.md

Last active Apr 9, 2020
Introduction to term rewriting

# Creating a new term-rewrite language

This article walks through the design and implementation of a simple but remarkably powerful new programming language that uses term rewriting as an evaluation model and, consequently, is well suited to computer algebra.

## The term rewriter

We begin with the definition of an expression type `Expr`:

``````  type Expr =
| Int of System.Numerics.BigInteger
| Symbol of string
| Variable of string
| Apply of Expr * Expr []

override expr.ToString() =
match expr with
| Int n -> n.ToString()
| Symbol s | Variable s -> s
| Apply(f, xs) ->
seq { for x in xs -> x.ToString() }
|> String.concat ", "
|> sprintf "%O(%s)" f
``````

The `Int` union case represents a (big) integer. The `Symbol` case represents a symbol such as `x`. The `Variable` case represents a variable bound in a pattern match. Finally, the `Apply` case represents one expression applied to another.

We replace the `Apply` constructor with a function that reduces arithmetic expressions when possible:

``````  let Apply(f, args) =
match f, args with
| Symbol "+", [|Int m; Int n|] -> Int(m + n)
| Symbol "-", [|Int m; Int n|] -> Int(m - n)
| Symbol "*", [|Int m; Int n|] -> Int(m * n)
| Symbol "/", [|Int m; Int n|] when m % n = 0I ->
Int(m / n)
| Symbol "%", [|Int m; Int n|] -> Int(m % n)
| Symbol "^", [|Int m; Int n|] when n >= 0I ->
Int(pown m (int n))
| f, args -> Apply(f, args)
``````

Term rewriting is based upon the idea of applying a rewrite rule to an expression repeatedly until it stops changing. The point where a value stops changing is called the fixed point. The following function repeatedly applies `f` to `x` until the result stops changing and returns the result:

``````  let rec fixedPoint1 f x =
match f x with
| None -> x
| Some x -> fixedPoint1 f x
``````

Note that `f x` returns an option. The `None` case is used to convey that `f` did nothing. This will be used throughout our program to track whether or not an expression has been altered and, if so, to keep rewriting it.

The following function is similar to the previous one but it returns `None` if the first application of `f x` returns `None`:

``````  let fixedPoint f x = Option.map (fixedPoint1 f) (f x)
``````

The following function maps `f` over an array of expressions `exprs`, returning `Some` array of results if `f` returned `Some` at least once or `None` if `f` returned `None` in every case:

``````  let map f (exprs: _ []) =
let mutable results = null
for i in 0..exprs.Length-1 do
match f exprs.[i] with
| None -> ()
| Some expr ->
if isNull results then
results <- Array.copy exprs
results.[i] <- expr
if isNull results then None else Some results
``````

Note that this function is most easily written imperatively, mutating array elements. However, this function's API is pure because callers are unaware of the impurity encapsulated within this function.

In order to provide a simple but powerful language we shall use a nasty hack. Variables in patterns will begin with a lowercase letter. Rather than adding the complexity of annotations to say that we want to match an integer or a non-integer we shall use the name of a pattern variable to convey this information. Specifically, we shall match integers with `m`, `n`, `p` and `q` and non-integers with `x`, `y` and `z` and any other expression using any other symbol:

``````  let (|IntVar|NonIntVar|AnyVar|) s =
match s with
| "m" | "n" | "p" | "q" -> IntVar
| "x" | "y" | "z" -> NonIntVar
| _ -> AnyVar
``````

The core of our term rewriting is a set of four mutually-recursive functions. The first is a `replaceOnce` function that rewrites all subexpressions to fixed point:

``````  let rec replaceOnce rule expr =
match expr with
| Int _
| Symbol _
| Variable _ as expr -> rule expr
| Apply(Symbol "->", [|_; _|]) -> None
| Apply(Symbol ("If" | "&&" | "||") as op, args) as expr
when args.Length > 0 ->
| Apply(f, args) as expr ->
``````

Note that we have special exemptions for `If` and boolean operators. Specifically, we only evaluate their first argument. This allows us to ensure that one branch of an `If` can be ignored without being evaluated and to implement short-circuit evaluation of boolean operators.

The `replaceHead` function replaces just the first argument:

``````  and replaceHead rule expr op (args: _ []) =
match replaceAll rule args.[0] with
| None -> rule expr
| Some p -> Some(Apply(op, Array.append [|p|] args.[1..]))
``````

The `replaceHeadAndRest` replaces both the head of an `Apply` expression and all arguments:

``````  and replaceHeadAndRest rule expr f args =
match replaceAll rule f, map (replaceAll rule) args with
| None, None -> rule expr
| None, Some args -> Some(Apply(f, args))
| Some f, None -> Some(Apply(f, args))
| Some f, Some args -> Some(Apply(f, args))
``````

Finally, the `replaceAll` function rewrites an expression, including all of its subexpressions, to fixed point:

``````  and replaceAll rule expr =
fixedPoint (replaceOnce rule) expr
``````

Our language is based upon pattern matching. This works in a somewhat similar way to pattern matching in F#. One important difference is that whereas F# prohibits pattern variables from appearing more than once in a pattern our languge allows this and uses it to convey that two subexpressions must be identical.

The following `pmatch` function implements our pattern matcher. Variable bindings are accumulated in `bs`, the pattern expression is `patt` and the expression being matched against is `expr`:

``````  let rec pmatch bs (patt, expr) =
match patt, expr with
``````

If the pattern requires a literal integer then the bindings `bs` are returned if the expression is this integer or `None` is returned if the integers are not the same, indicating to the caller that there was no match (and, hence, there are no bound variables):

``````    | Int m, Int n -> if m=n then Some bs else None
``````

Similar when the pattern is a literal symbol:

``````    | Symbol x, Symbol y ->
if x=y then Some bs else None
``````

Matching pattern variables is more involved. If the pattern variable requires an integer then the expression must be an integer:

``````    | (Variable(IntVar as v) as patt), (Int _ as expr)
``````

Equivalently for non-integers:

``````    | (Variable(NonIntVar as v) as patt),
(Symbol _ | Variable _ | Apply _ as expr)
``````

And pattern variables that match any expression:

``````    | (Variable(AnyVar as v) as patt), expr ->
``````

Check to see if this pattern variable has already been bound to another expression:

``````        match Map.tryFind v bs with
``````

If the pattern is literally identical then the bindings are kept as-is. Otherwise the binding is added to `bs`:

``````        | None ->
Some(if patt=expr then bs else Map.add v expr bs)
``````

If `expr2` is already bound to this pattern variable then we only return the bindings (indicating a successful match) otherwise we return `None` to convey failure to match:

``````        | Some expr2 -> if expr=expr2 then Some bs else None
``````

An `Apply` pattern and `Apply` expression are matched component-wise:

``````    | Apply(p, ps), Apply(e, es) ->
pmatch bs (p, e)
|> Option.bind (fun bs -> pmatches bs (ps, es))
``````

Any other combination fails to match:

``````    | _ -> None
``````

The `pmatches` function matches an array of patterns against an array of expressions:

``````  and pmatches bs (ps: _ [], es: _ []) =
``````

If the arrays are different lengths then the match fails:

``````    if ps.Length <> es.Length then None else
``````

Otherwise a local recursive function is used to loop over the elements in the arrays, matching each one in turn and stopping prematurely if a pattern-expression pair fails to match, otherwise the variable bindings are accumulated and returned:

``````      let rec loop i bs =
if i = ps.Length then Some bs else
match pmatch bs (ps.[i], es.[i]) with
| None -> None
| Some bs -> loop (i+1) bs
loop 0 bs
``````

When a rule matches we need to be able to substitute the appearances of pattern variables in the expression with the expression they have been bound to without doing any other term rewriting. This `subst` function does this:

``````  let rec subst bs expr =
match expr with
| Int _ -> None
| Symbol _ -> None
| Variable v as e1 ->
match Map.tryFind v bs with
| Some e2 when e1<>e2 -> Some e2
| _ -> None
| Apply(f, xs) as expr ->
match subst bs f, map (subst bs) xs with
| None, None -> Some expr
| None, Some xs -> Some(Apply(f, xs))
| Some f, None -> Some(Apply(f, xs))
| Some f, Some xs -> Some(Apply(f, xs))
``````

The core of our language is then an `exec` function that term rewrites a given expression and, if it results in a rule then that rule is prepended onto the rules that will be applied in subsequent term rewriting:

``````  let exec rule expr =
let expr =
match replaceAll rule expr with
| None -> expr
| Some expr -> expr
match expr with
| Apply(Symbol "->", [|patt; action|]) as expr ->
let rule expr =
match pmatch Map.empty (patt, expr) with
| None -> rule expr
| Some bs ->
match subst bs action with
| None -> Some action
| action -> action
rule, expr
| expr -> rule, expr
``````

Note that our `rule` is a function because this is simple. A production solution would most likely keep the rules to be applied as a data structure that could be optimised. For example, the head of the expression being matched could be used to dramatically reduce the number of rules being matched.

## Parsing

In this case we shall use the FParsec library to help us with parsing. FParsec is now readily available via Nuget.

``````  open FParsec
``````

We can parse a big integer as follows:

``````  let pInt : Parser<_, unit> =
puint32 .>> spaces
|>> (System.Numerics.BigInteger >> Int)
``````

A symbol or pattern variable may be parsed as an alphanumeric sequence:

``````  let pSymbol =
let isFst c = isLetter c
let isAlphanum c = isLetter c || isDigit c
many1Satisfy2L isFst isAlphanum "symbol" .>> spaces
|>> fun s ->
if System.Char.IsLower s.[0] then Variable s else Symbol s
``````

A string followed by whitespace:

``````  let pStr s = pstring s .>> spaces
``````

We shall support a variety of interesting operator:

``````  let opp = OperatorPrecedenceParser<Expr,unit,unit>()
let pExpr = opp.ExpressionParser .>> spaces
``````

A term is an int, symbol or bracketed expression:

``````  opp.TermParser <-
choice
[ pInt
pSymbol .>>.
many (pStr "(" >>. sepBy pExpr (pStr ",") .>> pStr ")")
|>> fun (f, argss) ->
Seq.fold (fun f args -> Apply(f, Array.ofList args))
f argss
pStr "(" >>. pExpr .>> pStr ")" ]
``````

We shall support the following operators (although we have only implemented a few):

``````  let ops =
[ Associativity.Left, ["->"]
Associativity.Left, ["&&"; "||"]
Associativity.Left, ["<"; "<="; "="; "<>"; ">="; ">"]
Associativity.Left, ["+"; "-"]
Associativity.Left, ["*"; "/"; "%"]
Associativity.Right, ["^"] ]
|> Seq.indexed
``````

Each operator is added along with unary minus and a postfix factorial operator:

``````  do
for prec, (assoc, ops) in ops do
for op in ops do
let f f g = Apply(Symbol op, [|f; g|])
InfixOperator(op, spaces, prec+1, assoc, f)
let f f = Apply(Symbol "-", [|Int 0I; f|])
PrefixOperator("-", spaces, 10, false, f)
let f f = Apply(Symbol "Factorial", [|f|])
PostfixOperator("!", spaces, 11, false, f)
``````

## Main program

The main program begins with a single rule that provides our language with the ability to print an expression using the expression `Print(e)`:

``````  do
let rule expr =
match expr with
| Apply(Symbol "Print", [|x|]) ->
printfn "%O" x
Some(Symbol "Null")
| _ -> None
``````

This is particularly useful for determining which subexpressions have been ignored without having been rewritten first. For example, we shall check that an `If` expression ignores one of its branches.

The following function parses a line of input into an expression and rewrites it using the `exec` function, accumulating the rewrite rule that applies all rules defined so far:

``````    let f rule (input: string) =
if input.Trim() = "" then rule else
match run (spaces >>. pExpr) input with
| Success(expr,_,_) ->
let rule, expr = exec rule expr
printfn "%O" expr
rule
| Failure(msg, _, _) ->
printfn "Parse error: %s" msg
rule
``````

We shall execute a pre-defined set of rules from a text file before allowing user input:

``````    let path =
[|__SOURCE_DIRECTORY__; "Bootstrap.txt"|]
|> System.IO.Path.Combine
seq { for line in System.IO.File.ReadLines path do
printfn "> %s" line
yield line
while true do
printf "> "
|> Seq.fold f rule
|> ignore
``````

## Examples

Although the design and implementation of this language is minimalistic it actually proved to be a very powerful language indeed.

Let us begin by defining a rule that says any two expressions are not equal regardless of whether or not they are:

``````  > f = g -> False
->(=(f, g), False)
``````

For example, three is not four:

``````  > 3 = 4
False
``````

But also three is not three:

``````  > 3 = 3
False
``````

Now we can use our language's ability to use a pattern variable multiple times in order to match an expression being compared with itself:

``````  > f = f -> True
->(=(f, f), True)
``````

Note that later rule definitions override earlier definitions (i.e. match cases appear in the opposite order in our language compared to F#).

Now three equals three:

``````  > 3 = 3
True
``````

We can define the `Not` function:

``````  > Not(True) -> False
->(Not(True), False)
> Not(False) -> True
->(Not(False), True)
``````

We can even define the semantics of `If` expressions in our language using the language itself:

``````  > If(True, t, f) -> t
->(If(True, t, f), t)
> If(False, t, f) -> f
->(If(False, t, f), f)
``````

The following test would print `ERROR` were it not for the fact that our evaluator has a special case for `If` that rewrites only the first argument (the predicate):

``````  > If(True, Print(OK), Print(ERROR))
OK
Null
``````

Similarly, we can define boolean operators:

``````  > True && f -> f
->(&&(True, f), f)
> False && f -> False
->(&&(False, f), False)
> True || f -> True
->(||(True, f), True)
> False || f -> f
->(||(False, f), f)
``````

These are short circuit evaluated so the following does not print `ERROR`:

``````  > False && Print(ERROR)
False
``````

We can define the factorial function using our postfix `!` operator:

``````  > n! -> n * (n-1)!
->(Factorial(n), *(n, Factorial(-(n, 1))))
> 0! -> 1
->(Factorial(0), 1)
``````

For example, computing 17!:

``````  > 17!
355687428096000
``````

As we were careful to use a pattern variable called `n` in order to match only an integer argument our factorial rewrite rules are not applied to a symbolic argument such as the symbol `x`:

``````  > x!
Factorial(x)
``````

The Fibonacci function may be defined as follows:

``````  > Fib(n) -> Fib(n-2) + Fib(n-1)
->(Fib(n), +(Fib(-(n, 2)), Fib(-(n, 1))))
> Fib(0) -> 0
->(Fib(0), 0)
> Fib(1) -> 1
->(Fib(1), 1)
``````

For example, the tenth Fibonacci number is 55:

``````  > Fib(10)
55
``````

Let's do some computer algebra. We can simplify things by rewriting subtraction in terms of addition and multiplication by minus one:

``````  > f - g -> f + g*-1
->(-(f, g), +(f, *(g, -1)))
``````

Similar for division and power:

``````  > f / g -> f * g^-1
->(/(f, g), *(f, ^(g, -1)))
``````

Let's rotate associative operators to the right:

``````  > (f + g) + h -> f + (g + h)
->(+(+(f, g), h), +(f, +(g, h)))
> (f * g) * h -> f * (g * h)
->(*(*(f, g), h), *(f, *(g, h)))
``````

For example, `(a+b)+(c+d)` is rewritten into `a+(b+(c+d))`:

``````  > (a+b)+(c+d)
+(a, +(b, +(c, d)))
``````

This example has some opportunity for simplification:

``````  > (x+1)+(x+2)
+(x, +(1, +(x, 2)))
``````

Let's rotate integers left:

``````  > x + n -> n + x
->(+(x, n), +(n, x))
> x * n -> n * x
->(*(x, n), *(n, x))
``````

Note how concise our rules are because `n` matches an integer and `x` matches a non-integer.

The integers are being rotated further left:

``````  > (x+1)+(x+2)
+(1, +(x, +(2, x)))
``````

Let's complete the rules for rotation:

``````  > x + (n + y) -> n + (x + y)
->(+(x, +(n, y)), +(n, +(x, y)))
> x * (n * y) -> n * (x * y)
->(*(x, *(n, y)), *(n, *(x, y)))
``````

Now integers are always rotated to the far left:

``````  > (x+1)+(x+2)
+(2, +(1, +(x, x)))
``````

A couple more rules and we can perform arithmetic operations on those adjacent integers on the left:

``````  > m + (n + x) -> (m + n) + x
->(+(m, +(n, x)), +(+(m, n), x))
> m * (n * x) -> (m * n) * x
->(*(m, *(n, x)), *(*(m, n), x))
``````

Now our example is simplified even more:

``````  > (x+1)+(x+2)
+(3, +(x, x))
``````

We might also want to factor our duplicates:

``````  > f + f -> 2 * f
->(+(f, f), *(2, f))
> f * f -> f ^ 2
->(*(f, f), ^(f, 2))
``````

Now our example `(x+1)+(x+2)` is rewritten into `3 + 2*x`:

``````  > (x+1)+(x+2)
+(3, *(2, x))
``````

We might also want to multiply out brackets when the factor is an integer:

``````  > n * (x + y) -> n * x + n * y
->(*(n, +(x, y)), +(*(n, x), *(n, y)))
``````

We can add some rules for arithmetic. Adding zero is a no-op:

``````  > f + 0 -> f
->(+(f, 0), f)
> 0 + f -> f
->(+(0, f), f)
``````

A production system would want a more feature-rich pattern matcher that automatically tries all combinations of terms in a sum or factors in a product but ours does not so we list `f+0` and `0+f` separately by hand.

Multiplying by zero returns zero:

``````  > f * 0 -> 0
->(*(f, 0), 0)
> 0 * f -> 0
->(*(0, f), 0)
``````

Multiplying by one is a no-op:

``````  > f * 1 -> f
->(*(f, 1), f)
> 1 * f -> f
->(*(1, f), f)
``````

Another multiplication-based simplification:

``````  > f + g * f -> (1 + g)*f
->(+(f, *(g, f)), *(+(1, g), f))
``````

Similarly for power, we can add a rule that zero to the power of anything is zero:

``````  > 0 ^ f -> 0
->(^(0, f), 0)
``````

Which is actually wrong for `0^0=1` (at least in the limit!) so we can override it for that case:

``````  > f ^ 0 -> 1
->(^(f, 0), 1)
``````

Similarly for a unit exponent:

``````  > f ^ 1 -> f
->(^(f, 1), f)
``````

More simplifications around powers:

``````  > f * f^g -> f^(g+1)
->(*(f, ^(f, g)), ^(f, +(g, 1)))
> f^g * f -> f^(g+1)
->(*(^(f, g), f), ^(f, +(g, 1)))
> f^g * f^h -> f^(g+h)
->(*(^(f, g), ^(f, h)), ^(f, +(g, h)))
> (f^x) ^ -1 -> f^-x
->(^(^(f, x), -1), ^(f, -(0, x)))
``````

We can also teach our term rewriter some rules for logarithms:

``````  > Log(x) + Log(y) -> Log(x * y)
->(+(Log(x), Log(y)), Log(*(x, y)))
> Log(x ^ y) -> y * Log(x)
->(Log(^(x, y)), *(y, Log(x)))
> Log(1) -> 0
->(Log(1), 0)
``````

And relationships between logarithms and exponents:

``````  > Log(E^f) -> f
->(Log(^(E, f)), f)
> E^Log(f) -> f
->(^(E, Log(f)), f)
``````

For example, `Log(x^x)/x` is simplified to `x*Log(x)*x^-1`:

``````  > Log(x^x) / x
*(x, *(Log(x), ^(x, -1)))
``````

and `E^x/E^-x` to `E^x*(E^-x)^-1` to `E^x*E^x` to `E^(x+x)` to `E^(2*x)`:

``````  > E^x / E^-x
^(E, *(2, x))
``````

Let's teach our computer algebra system how to compute the derivative of any symbolic expression composed of our arithmetic operators and logarithms:

``````  > D(n, y) -> 0
->(D(n, y), 0)
> D(x, x) -> 1
->(D(x, x), 1)
> D(f + g, x) -> D(f, x) + D(g, x)
->(D(+(f, g), x), +(D(f, x), D(g, x)))
> D(f * g, x) -> f * D(g, x) + g * D(f, x)
->(D(*(f, g), x), +(*(f, D(g, x)), *(g, D(f, x))))
> D(f ^ g, x) ->
f^(g - 1) * (g * D(f, x) + f * Log(f) * D(g, x))
->(D(^(f, g), x), *(^(f, -(g, 1)), +(*(g, D(f, x)),
*(*(f, Log(f)), D(g, x)))))
> D(Log(f), x) -> D(f, x) / f
->(D(Log(f), x), /(D(f, x), f))
``````

For example, we can compute the derivative of `x^x`:

``````  > D(x^x, x)
*(^(x, +(-1, x)), +(x, *(x, Log(x))))
``````

Let's define a function that nests `n` applications of a function `f` around an expression `x`:

``````  > Nest(n, f, x) -> Nest(n-1, f, f(x))
->(Nest(n, f, x), Nest(-(n, 1), f, f(x)))
> Nest(0, f, x) -> x
->(Nest(0, f, x), x)
``````

Note that we have been rather lax with our naming of pattern variables here. There is no need for `x` to match only non-numbers.

Let's define a univariate function that computes the derivative of the given expression with respect to `x`:

``````  > Dx(f) -> D(f, x)
->(Dx(f), D(f, x))
``````

Now we can easily compute higher derivatives of an expression:

``````  > Nest(2, Dx, x^x)
+(*(^(x, +(-1, x)), +(2, Log(x))), *(+(x, *(x, Log(x))),
*(^(x, +(-2, x)), +(-1, +(x, *(x, Log(x)))))))
``````

## Summary

This article has walked through the design and implementation of a new programming language based upon term rewriting. Although the language is very simple it is remarkably powerful: able to implement its own `If` expressions and short-circuit-evaluated boolean operators as well as perform a multitude of computer algebra tasks including differentiating non-trivial expressions symbolically.

Future F# Journal articles will revisit the subject of metaprogramming.