Skip to content

Instantly share code, notes, and snippets.

@jdh30

jdh30/rewrite.md

Last active Apr 9, 2020
Embed
What would you like to do?
Introduction to term rewriting

Creating a new term-rewrite language

This article walks through the design and implementation of a simple but remarkably powerful new programming language that uses term rewriting as an evaluation model and, consequently, is well suited to computer algebra.

The term rewriter

We begin with the definition of an expression type Expr:

  type Expr =
    | Int of System.Numerics.BigInteger
    | Symbol of string
    | Variable of string
    | Apply of Expr * Expr []

    override expr.ToString() =
      match expr with
      | Int n -> n.ToString()
      | Symbol s | Variable s -> s
      | Apply(f, xs) ->
          seq { for x in xs -> x.ToString() }
          |> String.concat ", "
          |> sprintf "%O(%s)" f

The Int union case represents a (big) integer. The Symbol case represents a symbol such as x. The Variable case represents a variable bound in a pattern match. Finally, the Apply case represents one expression applied to another.

We replace the Apply constructor with a function that reduces arithmetic expressions when possible:

  let Apply(f, args) =
    match f, args with
    | Symbol "+", [|Int m; Int n|] -> Int(m + n)
    | Symbol "-", [|Int m; Int n|] -> Int(m - n)
    | Symbol "*", [|Int m; Int n|] -> Int(m * n)
    | Symbol "/", [|Int m; Int n|] when m % n = 0I ->
        Int(m / n)
    | Symbol "%", [|Int m; Int n|] -> Int(m % n)
    | Symbol "^", [|Int m; Int n|] when n >= 0I ->
        Int(pown m (int n))
    | f, args -> Apply(f, args)

Term rewriting is based upon the idea of applying a rewrite rule to an expression repeatedly until it stops changing. The point where a value stops changing is called the fixed point. The following function repeatedly applies f to x until the result stops changing and returns the result:

  let rec fixedPoint1 f x =
    match f x with
    | None -> x
    | Some x -> fixedPoint1 f x

Note that f x returns an option. The None case is used to convey that f did nothing. This will be used throughout our program to track whether or not an expression has been altered and, if so, to keep rewriting it.

The following function is similar to the previous one but it returns None if the first application of f x returns None:

  let fixedPoint f x = Option.map (fixedPoint1 f) (f x)

The following function maps f over an array of expressions exprs, returning Some array of results if f returned Some at least once or None if f returned None in every case:

  let map f (exprs: _ []) =
    let mutable results = null
    for i in 0..exprs.Length-1 do
      match f exprs.[i] with
      | None -> ()
      | Some expr ->
          if isNull results then
            results <- Array.copy exprs
          results.[i] <- expr
    if isNull results then None else Some results

Note that this function is most easily written imperatively, mutating array elements. However, this function's API is pure because callers are unaware of the impurity encapsulated within this function.

In order to provide a simple but powerful language we shall use a nasty hack. Variables in patterns will begin with a lowercase letter. Rather than adding the complexity of annotations to say that we want to match an integer or a non-integer we shall use the name of a pattern variable to convey this information. Specifically, we shall match integers with m, n, p and q and non-integers with x, y and z and any other expression using any other symbol:

  let (|IntVar|NonIntVar|AnyVar|) s =
    match s with
    | "m" | "n" | "p" | "q" -> IntVar
    | "x" | "y" | "z" -> NonIntVar
    | _ -> AnyVar

The core of our term rewriting is a set of four mutually-recursive functions. The first is a replaceOnce function that rewrites all subexpressions to fixed point:

  let rec replaceOnce rule expr =
    match expr with
    | Int _
    | Symbol _
    | Variable _ as expr -> rule expr
    | Apply(Symbol "->", [|_; _|]) -> None
    | Apply(Symbol ("If" | "&&" | "||") as op, args) as expr
          when args.Length > 0 ->
        replaceHead rule expr op args
    | Apply(f, args) as expr ->
        replaceHeadAndRest rule expr f args

Note that we have special exemptions for If and boolean operators. Specifically, we only evaluate their first argument. This allows us to ensure that one branch of an If can be ignored without being evaluated and to implement short-circuit evaluation of boolean operators.

The replaceHead function replaces just the first argument:

  and replaceHead rule expr op (args: _ []) =
    match replaceAll rule args.[0] with
    | None -> rule expr
    | Some p -> Some(Apply(op, Array.append [|p|] args.[1..]))

The replaceHeadAndRest replaces both the head of an Apply expression and all arguments:

  and replaceHeadAndRest rule expr f args =
    match replaceAll rule f, map (replaceAll rule) args with
    | None, None -> rule expr
    | None, Some args -> Some(Apply(f, args))
    | Some f, None -> Some(Apply(f, args))
    | Some f, Some args -> Some(Apply(f, args))

Finally, the replaceAll function rewrites an expression, including all of its subexpressions, to fixed point:

  and replaceAll rule expr =
    fixedPoint (replaceOnce rule) expr

Our language is based upon pattern matching. This works in a somewhat similar way to pattern matching in F#. One important difference is that whereas F# prohibits pattern variables from appearing more than once in a pattern our languge allows this and uses it to convey that two subexpressions must be identical.

The following pmatch function implements our pattern matcher. Variable bindings are accumulated in bs, the pattern expression is patt and the expression being matched against is expr:

  let rec pmatch bs (patt, expr) =
    match patt, expr with

If the pattern requires a literal integer then the bindings bs are returned if the expression is this integer or None is returned if the integers are not the same, indicating to the caller that there was no match (and, hence, there are no bound variables):

    | Int m, Int n -> if m=n then Some bs else None

Similar when the pattern is a literal symbol:

    | Symbol x, Symbol y ->
        if x=y then Some bs else None

Matching pattern variables is more involved. If the pattern variable requires an integer then the expression must be an integer:

    | (Variable(IntVar as v) as patt), (Int _ as expr)

Equivalently for non-integers:

    | (Variable(NonIntVar as v) as patt),
      (Symbol _ | Variable _ | Apply _ as expr)

And pattern variables that match any expression:

    | (Variable(AnyVar as v) as patt), expr ->

Check to see if this pattern variable has already been bound to another expression:

        match Map.tryFind v bs with

If the pattern is literally identical then the bindings are kept as-is. Otherwise the binding is added to bs:

        | None ->
            Some(if patt=expr then bs else Map.add v expr bs)

If expr2 is already bound to this pattern variable then we only return the bindings (indicating a successful match) otherwise we return None to convey failure to match:

        | Some expr2 -> if expr=expr2 then Some bs else None

An Apply pattern and Apply expression are matched component-wise:

    | Apply(p, ps), Apply(e, es) ->
        pmatch bs (p, e)
        |> Option.bind (fun bs -> pmatches bs (ps, es))

Any other combination fails to match:

    | _ -> None

The pmatches function matches an array of patterns against an array of expressions:

  and pmatches bs (ps: _ [], es: _ []) =

If the arrays are different lengths then the match fails:

    if ps.Length <> es.Length then None else

Otherwise a local recursive function is used to loop over the elements in the arrays, matching each one in turn and stopping prematurely if a pattern-expression pair fails to match, otherwise the variable bindings are accumulated and returned:

      let rec loop i bs =
        if i = ps.Length then Some bs else
          match pmatch bs (ps.[i], es.[i]) with
          | None -> None
          | Some bs -> loop (i+1) bs
      loop 0 bs

When a rule matches we need to be able to substitute the appearances of pattern variables in the expression with the expression they have been bound to without doing any other term rewriting. This subst function does this:

  let rec subst bs expr =
    match expr with
    | Int _ -> None
    | Symbol _ -> None
    | Variable v as e1 ->
        match Map.tryFind v bs with
        | Some e2 when e1<>e2 -> Some e2
        | _ -> None
    | Apply(f, xs) as expr ->
        match subst bs f, map (subst bs) xs with
        | None, None -> Some expr
        | None, Some xs -> Some(Apply(f, xs))
        | Some f, None -> Some(Apply(f, xs))
        | Some f, Some xs -> Some(Apply(f, xs))

The core of our language is then an exec function that term rewrites a given expression and, if it results in a rule then that rule is prepended onto the rules that will be applied in subsequent term rewriting:

  let exec rule expr =
    let expr =
      match replaceAll rule expr with
      | None -> expr
      | Some expr -> expr
    match expr with
    | Apply(Symbol "->", [|patt; action|]) as expr ->
        let rule expr =
          match pmatch Map.empty (patt, expr) with
          | None -> rule expr
          | Some bs ->
              match subst bs action with
              | None -> Some action
              | action -> action
        rule, expr
    | expr -> rule, expr

Note that our rule is a function because this is simple. A production solution would most likely keep the rules to be applied as a data structure that could be optimised. For example, the head of the expression being matched could be used to dramatically reduce the number of rules being matched.

Parsing

In this case we shall use the FParsec library to help us with parsing. FParsec is now readily available via Nuget.

  open FParsec

We can parse a big integer as follows:

  let pInt : Parser<_, unit> =
    puint32 .>> spaces
    |>> (System.Numerics.BigInteger >> Int)

A symbol or pattern variable may be parsed as an alphanumeric sequence:

  let pSymbol =
    let isFst c = isLetter c
    let isAlphanum c = isLetter c || isDigit c
    many1Satisfy2L isFst isAlphanum "symbol" .>> spaces
    |>> fun s ->
      if System.Char.IsLower s.[0] then Variable s else Symbol s

A string followed by whitespace:

  let pStr s = pstring s .>> spaces

We shall support a variety of interesting operator:

  let opp = OperatorPrecedenceParser<Expr,unit,unit>()
  let pExpr = opp.ExpressionParser .>> spaces

A term is an int, symbol or bracketed expression:

  opp.TermParser <-
    choice
      [ pInt
        pSymbol .>>.
          many (pStr "(" >>. sepBy pExpr (pStr ",") .>> pStr ")")
        |>> fun (f, argss) ->
          Seq.fold (fun f args -> Apply(f, Array.ofList args))
            f argss
        pStr "(" >>. pExpr .>> pStr ")" ]

We shall support the following operators (although we have only implemented a few):

  let ops =
    [ Associativity.Left, ["->"]
      Associativity.Left, ["&&"; "||"]
      Associativity.Left, ["<"; "<="; "="; "<>"; ">="; ">"]
      Associativity.Left, ["+"; "-"]
      Associativity.Left, ["*"; "/"; "%"]
      Associativity.Right, ["^"] ]
    |> Seq.indexed

Each operator is added along with unary minus and a postfix factorial operator:

  do
    for prec, (assoc, ops) in ops do
      for op in ops do
        let f f g = Apply(Symbol op, [|f; g|])
        InfixOperator(op, spaces, prec+1, assoc, f)
        |> opp.AddOperator
    let f f = Apply(Symbol "-", [|Int 0I; f|])
    PrefixOperator("-", spaces, 10, false, f)
    |> opp.AddOperator
    let f f = Apply(Symbol "Factorial", [|f|])
    PostfixOperator("!", spaces, 11, false, f)
    |> opp.AddOperator

Main program

The main program begins with a single rule that provides our language with the ability to print an expression using the expression Print(e):

  do
    let rule expr =
      match expr with
      | Apply(Symbol "Print", [|x|]) ->
          printfn "%O" x
          Some(Symbol "Null")
      | _ -> None

This is particularly useful for determining which subexpressions have been ignored without having been rewritten first. For example, we shall check that an If expression ignores one of its branches.

The following function parses a line of input into an expression and rewrites it using the exec function, accumulating the rewrite rule that applies all rules defined so far:

    let f rule (input: string) =
      if input.Trim() = "" then rule else
        match run (spaces >>. pExpr) input with
        | Success(expr,_,_) ->
            let rule, expr = exec rule expr
            printfn "%O" expr
            rule
        | Failure(msg, _, _) ->
            printfn "Parse error: %s" msg
            rule

We shall execute a pre-defined set of rules from a text file before allowing user input:

    let path =
      [|__SOURCE_DIRECTORY__; "Bootstrap.txt"|]
      |> System.IO.Path.Combine
    seq { for line in System.IO.File.ReadLines path do
            printfn "> %s" line
            yield line
          while true do
            printf "> "
            yield stdin.ReadLine() }
    |> Seq.fold f rule
    |> ignore

Examples

Although the design and implementation of this language is minimalistic it actually proved to be a very powerful language indeed.

Let us begin by defining a rule that says any two expressions are not equal regardless of whether or not they are:

  > f = g -> False
  ->(=(f, g), False)

For example, three is not four:

  > 3 = 4
  False

But also three is not three:

  > 3 = 3
  False

Now we can use our language's ability to use a pattern variable multiple times in order to match an expression being compared with itself:

  > f = f -> True
  ->(=(f, f), True)

Note that later rule definitions override earlier definitions (i.e. match cases appear in the opposite order in our language compared to F#).

Now three equals three:

  > 3 = 3
  True

We can define the Not function:

  > Not(True) -> False
  ->(Not(True), False)
  > Not(False) -> True
  ->(Not(False), True)

We can even define the semantics of If expressions in our language using the language itself:

  > If(True, t, f) -> t
  ->(If(True, t, f), t)
  > If(False, t, f) -> f
  ->(If(False, t, f), f)

The following test would print ERROR were it not for the fact that our evaluator has a special case for If that rewrites only the first argument (the predicate):

  > If(True, Print(OK), Print(ERROR))
  OK
  Null

Similarly, we can define boolean operators:

  > True && f -> f
  ->(&&(True, f), f)
  > False && f -> False
  ->(&&(False, f), False)
  > True || f -> True
  ->(||(True, f), True)
  > False || f -> f
  ->(||(False, f), f)

These are short circuit evaluated so the following does not print ERROR:

  > False && Print(ERROR)
  False

We can define the factorial function using our postfix ! operator:

  > n! -> n * (n-1)!
  ->(Factorial(n), *(n, Factorial(-(n, 1))))
  > 0! -> 1
  ->(Factorial(0), 1)

For example, computing 17!:

  > 17!
  355687428096000

As we were careful to use a pattern variable called n in order to match only an integer argument our factorial rewrite rules are not applied to a symbolic argument such as the symbol x:

  > x!
  Factorial(x)

The Fibonacci function may be defined as follows:

  > Fib(n) -> Fib(n-2) + Fib(n-1)
  ->(Fib(n), +(Fib(-(n, 2)), Fib(-(n, 1))))
  > Fib(0) -> 0
  ->(Fib(0), 0)
  > Fib(1) -> 1
  ->(Fib(1), 1)

For example, the tenth Fibonacci number is 55:

  > Fib(10)
  55

Let's do some computer algebra. We can simplify things by rewriting subtraction in terms of addition and multiplication by minus one:

  > f - g -> f + g*-1
  ->(-(f, g), +(f, *(g, -1)))

Similar for division and power:

  > f / g -> f * g^-1
  ->(/(f, g), *(f, ^(g, -1)))

Let's rotate associative operators to the right:

  > (f + g) + h -> f + (g + h)
  ->(+(+(f, g), h), +(f, +(g, h)))
  > (f * g) * h -> f * (g * h)
  ->(*(*(f, g), h), *(f, *(g, h)))

For example, (a+b)+(c+d) is rewritten into a+(b+(c+d)):

  > (a+b)+(c+d)
  +(a, +(b, +(c, d)))

This example has some opportunity for simplification:

  > (x+1)+(x+2)
  +(x, +(1, +(x, 2)))

Let's rotate integers left:

  > x + n -> n + x
  ->(+(x, n), +(n, x))
  > x * n -> n * x
  ->(*(x, n), *(n, x))

Note how concise our rules are because n matches an integer and x matches a non-integer.

The integers are being rotated further left:

  > (x+1)+(x+2)
  +(1, +(x, +(2, x)))

Let's complete the rules for rotation:

  > x + (n + y) -> n + (x + y)
  ->(+(x, +(n, y)), +(n, +(x, y)))
  > x * (n * y) -> n * (x * y)
  ->(*(x, *(n, y)), *(n, *(x, y)))

Now integers are always rotated to the far left:

  > (x+1)+(x+2)
  +(2, +(1, +(x, x)))

A couple more rules and we can perform arithmetic operations on those adjacent integers on the left:

  > m + (n + x) -> (m + n) + x
  ->(+(m, +(n, x)), +(+(m, n), x))
  > m * (n * x) -> (m * n) * x
  ->(*(m, *(n, x)), *(*(m, n), x))

Now our example is simplified even more:

  > (x+1)+(x+2)
  +(3, +(x, x))

We might also want to factor our duplicates:

  > f + f -> 2 * f
  ->(+(f, f), *(2, f))
  > f * f -> f ^ 2
  ->(*(f, f), ^(f, 2))

Now our example (x+1)+(x+2) is rewritten into 3 + 2*x:

  > (x+1)+(x+2)
  +(3, *(2, x))

We might also want to multiply out brackets when the factor is an integer:

  > n * (x + y) -> n * x + n * y
  ->(*(n, +(x, y)), +(*(n, x), *(n, y)))

We can add some rules for arithmetic. Adding zero is a no-op:

  > f + 0 -> f
  ->(+(f, 0), f)
  > 0 + f -> f
  ->(+(0, f), f)

A production system would want a more feature-rich pattern matcher that automatically tries all combinations of terms in a sum or factors in a product but ours does not so we list f+0 and 0+f separately by hand.

Multiplying by zero returns zero:

  > f * 0 -> 0
  ->(*(f, 0), 0)
  > 0 * f -> 0
  ->(*(0, f), 0)

Multiplying by one is a no-op:

  > f * 1 -> f
  ->(*(f, 1), f)
  > 1 * f -> f
  ->(*(1, f), f)

Another multiplication-based simplification:

  > f + g * f -> (1 + g)*f
  ->(+(f, *(g, f)), *(+(1, g), f))

Similarly for power, we can add a rule that zero to the power of anything is zero:

  > 0 ^ f -> 0
  ->(^(0, f), 0)

Which is actually wrong for 0^0=1 (at least in the limit!) so we can override it for that case:

  > f ^ 0 -> 1
  ->(^(f, 0), 1)

Similarly for a unit exponent:

  > f ^ 1 -> f
  ->(^(f, 1), f)

More simplifications around powers:

  > f * f^g -> f^(g+1)
  ->(*(f, ^(f, g)), ^(f, +(g, 1)))
  > f^g * f -> f^(g+1)
  ->(*(^(f, g), f), ^(f, +(g, 1)))
  > f^g * f^h -> f^(g+h)
  ->(*(^(f, g), ^(f, h)), ^(f, +(g, h)))
  > (f^x) ^ -1 -> f^-x
  ->(^(^(f, x), -1), ^(f, -(0, x)))

We can also teach our term rewriter some rules for logarithms:

  > Log(x) + Log(y) -> Log(x * y)
  ->(+(Log(x), Log(y)), Log(*(x, y)))
  > Log(x ^ y) -> y * Log(x)
  ->(Log(^(x, y)), *(y, Log(x)))
  > Log(1) -> 0
  ->(Log(1), 0)

And relationships between logarithms and exponents:

  > Log(E^f) -> f
  ->(Log(^(E, f)), f)
  > E^Log(f) -> f
  ->(^(E, Log(f)), f)

For example, Log(x^x)/x is simplified to x*Log(x)*x^-1:

  > Log(x^x) / x
  *(x, *(Log(x), ^(x, -1)))

and E^x/E^-x to E^x*(E^-x)^-1 to E^x*E^x to E^(x+x) to E^(2*x):

  > E^x / E^-x
  ^(E, *(2, x))

Let's teach our computer algebra system how to compute the derivative of any symbolic expression composed of our arithmetic operators and logarithms:

  > D(n, y) -> 0
  ->(D(n, y), 0)
  > D(x, x) -> 1
  ->(D(x, x), 1)
  > D(f + g, x) -> D(f, x) + D(g, x)
  ->(D(+(f, g), x), +(D(f, x), D(g, x)))
  > D(f * g, x) -> f * D(g, x) + g * D(f, x)
  ->(D(*(f, g), x), +(*(f, D(g, x)), *(g, D(f, x))))
  > D(f ^ g, x) ->
      f^(g - 1) * (g * D(f, x) + f * Log(f) * D(g, x))
  ->(D(^(f, g), x), *(^(f, -(g, 1)), +(*(g, D(f, x)),
  *(*(f, Log(f)), D(g, x)))))
  > D(Log(f), x) -> D(f, x) / f
  ->(D(Log(f), x), /(D(f, x), f))

For example, we can compute the derivative of x^x:

  > D(x^x, x)
  *(^(x, +(-1, x)), +(x, *(x, Log(x))))

Let's define a function that nests n applications of a function f around an expression x:

  > Nest(n, f, x) -> Nest(n-1, f, f(x))
  ->(Nest(n, f, x), Nest(-(n, 1), f, f(x)))
  > Nest(0, f, x) -> x
  ->(Nest(0, f, x), x)

Note that we have been rather lax with our naming of pattern variables here. There is no need for x to match only non-numbers.

Let's define a univariate function that computes the derivative of the given expression with respect to x:

  > Dx(f) -> D(f, x)
  ->(Dx(f), D(f, x))

Now we can easily compute higher derivatives of an expression:

  > Nest(2, Dx, x^x)
  +(*(^(x, +(-1, x)), +(2, Log(x))), *(+(x, *(x, Log(x))),
*(^(x, +(-2, x)), +(-1, +(x, *(x, Log(x)))))))

Summary

This article has walked through the design and implementation of a new programming language based upon term rewriting. Although the language is very simple it is remarkably powerful: able to implement its own If expressions and short-circuit-evaluated boolean operators as well as perform a multitude of computer algebra tasks including differentiating non-trivial expressions symbolically.

Future F# Journal articles will revisit the subject of metaprogramming.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.