gvolpe/laws.md

## laws.md

      
    Raw
  

              laws.md
            
          
    Laws

A law is a group of two or more expressions which are required to be
the same. The expressions will usually involve one or more typed
holes ("inputs") which vary.
Some examples:
x.map(id)                 === x
x.map(f).map(g)           === x.map(f andThen g)
(expr, expr)              === { val y = expr; (y, y) }
x |+| y                   === y |+| x
x.flatMap(f).flatMap(g)   === x.flatMap(a => f(a).flatMap(g))
Sameness

In general we don't have a decidable way to prove that any two Scala
values are (or are not) the same. Specifically, if x and y are the
same, then for any pure, total function f we require that f(x) and
f(y) are also the same. This definition is recursive, and ultimately
relies on the fact that for particular types we can find decidable
ways to compute sameness.
In general, we can't rely on universal equality (which in some cases
says values are equal when they have observable differences), and we
can't rely on having an instance of Eq (which doesn't exist for some
types). In general, we have to rely on informal reasoning.
Pure, total functions

The Scala type Function1 permits partiality (exceptions),
reading/modifying mutable state, infinite loops/non-termination, and
other side-effects. If we require our laws to be true for any possible
Function1 value in Scala then we won't have any laws. Thus, in the
previous definition we have to mandate that the functions used must be
total and pure. We can only enforce this requirement via care and
informal reasoning.
Relaxing this requirement will tend to break things like referential
transparency. For example:
def f(x: Int): Int = sys.error("🕱" * x)
val lhs = { val y = f(2); val x = f(1); (x, y) }
val rhs = { val x = f(1); val y = f(2); (x, y) }
lhs === rhs // false due to side-effecting throw.
// even under very relaxed sameness criteria,
// sys.error("🕱") is not sys.error("🕱🕱").
Universe of values (and expressions)

For any law with typed holes, we require that any valid value of the
correct type must pass the law when used. What does validity mean?
Similar to the previous point, we can create values which are impure,
which read or modify global state, have side-effects, etc. These values
can be used to break most (or all) laws. Thus, we must exclude
these kinds of values and expressions when analyzing our laws.
For example:
// given these inputs to our law:
type F[_]
type A
type B
type C
val functor: Functor[F]
val fa: F[A]
val f: A => B
val g: B => C

// and a sameness criteria:
val sameness: Eq[F[C]]

// we require that lhs is the same as rhs:
val lhs = fa.map(f).map(g)
val rhs = fa.map(f andThen g)
lhs === rhs
Here are some invalid values which, if valid, would demonstrate that
this law was false:
val fa: F[A] = null
// lhs and rhs both produce NPEs not values

def f(a: A): B = if (scala.util.Random.nextBoolean) b0 else b1
// f is not a function, so sameness criteria will often fail.

var counter = 0
def f(a: Int): Int = { counter += 1; counter }
val lhs = List(1,2,3).map(f).map(f)    // List(4,5,6)
val rhs = List(1,2,3).map((f _) andThen (f _)) // List(2,4,6)
lhs =!= rhs
// mutable state makes order of function application visible.
// the goal of our law is to allow maps to be rewritten, which
// can't be done in the presence of observable effects.

def f(a: Int): Option[Int] = Some(a)
def g(b: Option[Int]): Int = System.identityHashCode(b)
val fa = List(1,2,3)
val lhs = fa.map(f).map(g)    // List(1790453435, 563984469, 922931437)
val rhs = fa.map(f andThen g) // List(618864390, 1562772628, 222556677)
lhs != rhs
// System.identityHashCode is a deterministic function, but it
// observes runtime information about values which we normally don't
// consider when thinking about "sameness". If we use this method then
// we will need an Eq[C] which uses eq, eliminating almost any possible
// rewriting we can do.
(The reason we consider these values invalid is that the types and
type class instances being used are widely considered to be
law-abiding -- it is more useful to constrain a universe of values
than to throw out otherwise reasonable type class instances.)
There is no requirement that all laws use the same universe of values.
We might consider certain side-effects invisible (or visible) for the
purposes of particular laws. This is fine, but should be explicitly
stated. The "default" universe we inhabit (when reasoning using
intersubstitutability and referential transparency) is the universe of
pure, effect-free values (and functions operating on these values).
The same argument applies for expressions: instead of applying a
single value to a law, we can apply a valid expression from our
universe which evaluates to a valid value. The expression must not
contain any invalid sub-expressions (e.g. those which produce
side-effects).
Law violations

A law violation occurs when values from the law's universe are found
that cause the law's expressions to evaluate to different values (as
determined by the sameness criteria). When a law violation occurs,
there are several possible responses:

Observing that the law is false (the most common case).
Observing that the law could be true for a narrower universe.
Observing that the value(s) are not valid.
Observing that the sameness criterion is not valid.

For example, two values which we might consider the same could look
different under reflection, or via sun.misc.Unsafe. This doesn't
disprove our law, it just illustrates that our "function" for sameness
is invalid.
Similarly, we might want to write a law about asynchronous operations
(ensuring that both ultimately complete, possibly in a certain order).
This might require allowing a specific kind of side-effect into our
universe of expressions, so that we can write laws which observe order
of effects. In these cases, we need to be explicit that we are no
longer inhabiting our usual universe, and be explicit about the fact
that these expressions are not referentially-transparent to begin with
(i.e. for many other kinds of important laws they will be invalid).
Parametricity and other restrictions

Counter-intuitively, we may be forbidden to use certain methods on a
type when considering "sameness". One example is eq, which we can use
to observe whether two values point to the same memory address or not.
Very few laws can be proven if we require their results to be eq, and
many effects which we would prefer to be unobservable (e.g. memoization
of immutable values, reference sharing, etc.) are observable with eq.
Similarly, we can use runtime reflection or reflective pattern-matching
to "sniff out" the erased type of a value at run time. This allows us
to define functions which can violate parametricity and break laws. We
usually don't consider these functions as valid either.
Non-violations

Notably, if we can prove (or convince ourselves) that a law is true for
its universe of valid types and expressions, we can't say anything
about its behavior for invalid values (those outside the universe). The
kind of equational reasoning laws grant us are inherently constrained
to the universe of values and expressions we used to prove those laws.
Similarly, from the point of view of lawfulness there isn't a good
reason to believe that one behavior is better than another when it
comes to invalid inputs or functions. A type whose map method could
magically "sniff out" (and throw) when given an impure function isn't
any more (or less) lawful than one which memoizes the function and only
runs it once, or one which runs the function N times, or whatever.
We may desire other behavior from our types, but that analysis occurs
outside the realm of the laws we define for the type. If we wanted to
write laws in terms of the universe of all possible Scala values and
expressions, we would have to accept that our toolbox for informal
reasoning would be almost totally empty.
Conclusion

Laws allow us to reason with rigor about code which we prove (or at
least believe) to be lawful. However, in order to be sure we are
reasoning about laws correctly, we must be rigorous about what laws
require, as well as distinguishing things laws can tell us versus
things about which they must remain silent.
See Also

The discussion here is indebted to Proofs and Refutations (Lakatos, 1976) [1].
That text introduces the idea that mathematical objects (e.g. a polyhedron)
and proofs involving those objects (e.g. the Euler characteristic of a
polyhedron) co-evolve based on the prior commitments and philosophical views of
mathematicians.
When a proof fails, we have to decide whether to amend the terms the
proof uses (i.e. narrow its scope), look for a better proof, or give
up and declare the proof (and the underlying idea of the proof)
invalid.
[1] https://en.wikipedia.org/wiki/Proofs_and_Refutations