VictorTaelin/evaluation_strategies.md Secret

## evaluation_strategies.md

      
    Raw
  

              evaluation_strategies.md
            
          
    In most programming languages, functions are evaluated eagerly. That means that, in an expression such as f(g(),h()), the language first computes g(), then it computes h(), and only then it executes the body of f. That's bad, because sometimes the result is not needed. For example, if f(x,y) => x, then computing h() is wasteful.
Haskell fixes that problem by being lazy, i.e., it starts by executing the body of f, and only if g() and h() are actually needed is that they are computed. But that brings other problems. If f(g,h) = [g, g, g], for example, then g() will be evaluated 3 times! Fortunately, Haskell solves that too by memoizing g(). That's very clever, works well in practice and is the reason Haskell is side-effect free; but, in some cases, memoizing isn't possible. For example, if g() itself returns another function which is then applied to some arguments; e.g., if f(g,h) = [g(4), g(7), g(3)]; then g must be duplicated, because its body must be evaluated with 3 different substitutions (4, 7 and 3). Confused? Let's visualize the evolution of not_10_times(True):
-- Input
(\ f x -> 
  let f1 = not in
  let f2 = f1 . f1 in
  let f4 = f2 . f2 in
  let f8 = f4 . f4 in
  f8 x)
  $ f8 True

-- Step 1
(\ x -> 
  let f1 = not in
  let f2 = f1 . f1 in
  let f4 = f2 . f2 in
  let f8 = f4 . f4 in
  f8 x)
  $ True

-- Step 2
let f1 = not in
let f2 = f1 . f1 in
let f4 = f2 . f2 in
let f8 = f4 . f4 in
f8 True

-- Step 3
let f1 = not in
let f2 = f1 . f1 in
let f4 = f2 . f2 in
(f4 . f4) True

-- Step 4
let f1 = not in
let f2 = f1 . f1 in
(f2 . f2 . f2 . f2) True

-- Step 5
let f1 = not in
(f1 . f1 . f1 . f1 . f1 . f1 . f1 . f1) True

-- Step 6
(not . not . not . not . not . not . not . not) True

-- Step 7
not (not (not (not (not (not (not (not True)))))))

-- Step 8
not (not (not (not (not (not (not False))))))

-- Step 9
not (not (not (not (not (not True)))))

-- Step 10
not (not (not (not (not False))))

-- Step 11
not (not (not (not True)))

-- Step 12
not (not (not False))

-- Step 13
not (not True)

-- Step 14
not False

-- Output
True
As expected, this reduction was performed lazily: not and True were not evaluated before substituting them in the function body. But, in that case, that doesn't do anything, because not and True were fully computed already. Moreover, as expected, Haskell memoizes the result of evaluating not and True. But that also doesn't do anything here, because both variables only occur once, and again, they were in normal form anyway.
Seems like Haskell used all its tricks and now it must proceed, but a tragedy happens in Step 2. Here, f8 must be applied to True. But what is f8? It is just a symbol, not a lambda that can be applied. In order to proceed, Haskell must unfold the expression by rewritting let f8 = f4 . f4 in (f8 x) as ((f4 . f4) x); and then it unfold it further, until the whole expression becomes not (not (not ... x)). Notice how this removed all the sharing, and now not must be applied N times, as we dreaded. Oh, no! But what if we unrolled the sharing upside-down, like this:
-- Step 2
let f2 = not . not in
let f4 = f2 . f2 in
let f8 = f4 . f4 in
f8 True

-- Step 3
let f2 = id in
let f4 = f2 . f2 in
let f8 = f4 . f4 in
f8 True

-- Step 4
let f4 = id . id in
let f8 = f4 . f4 in
f8 True

-- Step 4
let f4 = id in
let f8 = f4 . f4 in
f8 True

-- Step 5
let f8 = id in
f8 True

-- Step 6
id True

-- Step 7
True

That's much better! What do we need for that? Simple: execute arguments before function calls. For example, if, when executing comp recursively, we evaluated intermediate arguments, then comp(4, not . not, x) would have been evaluated as comp(4, id, x), which is equivalent to upside-down reduction above. But we know Haskell never evaluates arguments of a function call until they are necessary, and, when they are, they could've been duplicated several times. In other words, fusing intermediate values requires some eagerness, and Haskell's laziness prevents it from doing that.
Does that mean JavaScript was right all that time? Is lazy evaluation an error? No! We already observed that such strategy is wasteful when arguments are discarded, such as in the f(x,y) = x case. Not only that, but there are also terms on which eager evaluation performs worse, even when there is no discarding involved. The λ-term (λx.x (λa.a)) (λy.(λx.x x) (y z)) is an example (feel encouraged to try evaluating it!). This is the moment you realize that, no matter what evaluation strategy you use, you'll always waste work in some case.