frenchy64/comp.md

## comp.md

      
    Raw
  

              comp.md
            
          
    Occurrence Typing is Compositional

A few months ago, Sam Tobin-Hochstadt explained to me how occurrence typing is compositional. While I had taken this for granted previously, comparing occurrence typing to other systems makes it clear that this is an important property to have.
First of all, what is occurrence typing? In dynamically typed programs, there are often type invariants enforced at program branches.
(fn [a :- (U nil String)]
  (if a
     (count a)
     0))
For example, the previous code assumes down the then branch that 'a' is String, and down the else branch 'a' is nil. This is achieved by attaching extra information to each expression that records two statements that are true if the expression is used as a test:

the 'then' proposition - a proposition that can be assumed true when checking the then branch
the 'else' propostion - a proposition that can be assumed true when checking the else branch.

In this case, the test is a variable. If a variable is used as a test, intuitively we can assume it is non-nil/non-false in the 'then' branch, and nil/false down the 'else' branch.
Concretely, 'a' has type (U nil String), 'then' proposition (! a (U nil false)) and 'else' proposition (is a (U nil false)).
So why is occurrence typing compositional? Well in other systems, this kind of flow analysis might be implemented as a set of special cases. For example, in Ceylon, an expression such as
if (a extends Integer) then {
...
} else {
...
}
has a special interpretation for extends in an if expression.
If we refactored the test into a function, Ceylon would not be able to perform the flow analysis, because there is no standard way to propagate this flow information.
On the other hand, in occurrence typing, where each expression has flow information attached as propositions, we can refactor the test as we like.
(if (identity a)
  <then>
  <else>)
works exactly the same as if 'a' were the test.
The main way to extend occurrence typing to understand other kinds of programming idioms is the 'path element'. In Typed Racket, a 'car' and 'cdr' are path elements. Now we can reason about "sub"-parts of our current environment:
The test '(car a)' has the 'then' proposition (! a #f car) and the 'else' proposition (is a #f car). We can combine any path elements we please, to be able to check the idioms of the language we are supporting.
For example, in Clojure heterogeneous maps can be looked up on a particular key.
(if (:bar (:foo a))
  <then>
  <else>)
In conclusion, occurrence typing gives a general framework for complicated control flow. Without this, control flow is usually implemented as a series of hard-coded rules. In fact, this is exactly the problem that occurrence typing was design to solve: Typed Racket was initially implemented to follow control flow with complicated rules, which quickly broke down with macros such as and and or. Because these expand to normal if statements, they can be automatically handled by occurrence typing.