Skip to content

Instantly share code, notes, and snippets.

@amnn
Last active September 26, 2020 18:03
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save amnn/559551517d020dbb6588 to your computer and use it in GitHub Desktop.
Save amnn/559551517d020dbb6588 to your computer and use it in GitHub Desktop.
Huet's Circular Unification Algorithm

Based on Huet 1976, §5.7.2

Preliminaries

From earlier sections

  • V is a countably infinite set of variables.
  • C a finite set of constructors.
  • a : C -> Nat an arity function.
  • R : V -/-> Typ a partial environment function.

where Typ = { c(v_1,...,v_{a(c)}) | c <- C, v_i <- V, 1 <= i <= a(c) }

In this representation, we deconstruct compound types, by replacing sub-structure with variables, and then using R to "tie the knot". We are essentially using variables as pointers to compound types. If they point to nothing, then they are seen as referring to themselves.

5.7.2 Algorithmic Description of RATIO and CYCLE

I've omitted CYCLE because I actually want cyclic types.

A unificand N will now be a finite subset of V x V. Initially, N contains the names of pairs of terms we wish to unify.

In the pseudo-code below, the unificand is represented by a list of (V,V) pairs.

We associate with each variable x in V a variable x* in V. Initially x* = x for all x in V.

NB. In the algorithm below, I use dom to refer to the domain of a function, and in as a function to check for set inclusion.

isVar :: V -> Bool
isVar x = not (x* `in` dom R)

unify :: [(V,V)] -> ()
unify []      = return ()
unify ((x,y):N*)
  | x* == y*  = unify N*
  | isVar x*  = { x* := y*; unify N* }
  | isVar y*  = { y* := x*; unify N* }
  | F /= G    = fail
  | otherwise =
    { y* := x*                         -- ***
    ; unify (N ++ xs `zip` ys)
    }
  where
    F(xs) = R(x*)
    G(ys) = R(y*)

The step highlighted by *** in the above is what is different from most implementations of Robinson's unification algorithm. Even when unifying compound types, one type ends up pointing to the other. If you imagine variables as being pointers, then we are forwarding y to point to x. The reason why this is necessary is that the next time x* and y* are compared, we will not recurse into their substructure again: We have made x* = y* true, so the first guard condition fires.

This property, coupled with the fact that a regular tree has a finite number of subtrees means that there only a finite number of pairs (x,y) that the algorithm can see, and it sees each pair at most once, which gives a sketch of the termination argument.

@ezyang
Copy link

ezyang commented Jan 24, 2016

A simple counter-example which shows that the *** line is necessary is to unify any infinite tree with itself. (There are some examples involving infinite trees which will terminate without the *** line!)

@amnn
Copy link
Author

amnn commented Jan 24, 2016

Indeed, if you tried to unify the type a -> fix x. (F x) with (fix x. (F x) -> b it would terminate no problem.

A pathological example I found instructive was trying to unify fix x. F (F x) with fix y. F (F (F y)). Which when encoded as an environment, becomes:

R = { x :-> F x', x' :-> F x
    , y :-> F y', y' :-> F y'', y'' :-> F y 
    }

When you run the algorithm without *** on N = [(x,y)], it does not terminate, but with *** it terminates after inspecting every pair from
(x, y), (x', y'), (x, y''), (x', y), (x, y'), (x', y'').

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment