
Monads and delimited control are very closely related, so it isn’t too hard to understand them in terms of one another. From a monadic point of view, the big idea is that if you have the computation m >>= f, then f is m’s continuation. It’s the function that is called with m’s result to continue execution after m returns.

If you have a long chain of binds, the continuation is just the composition of all of them. So, for example, if you have

m >>= f >>= g >>= h

then the continuation of m is f >=> g >=> h. Likewise, the continuation of m >>= f is g >=> h.
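
This is just the monad associativity law at work. Here is a quick check, using Maybe as the monad and some made-up functions (my choices, purely for illustration):

import Control.Monad ((>=>))

f, g, h :: Int -> Maybe Int
f n = Just (n + 1)
g n = Just (n * 2)
h n = if n > 0 then Just n else Nothing

-- By associativity, these are the same computation; the second form
-- makes the continuation of Just 1 explicit as a single function.
lhs, rhs :: Maybe Int
lhs = Just 1 >>= f >>= g >>= h     -- Just 4
rhs = Just 1 >>= (f >=> g >=> h)   -- Just 4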

Continuation monads allow you to grab hold of the continuation at any point in the computation. So, for example, if you’re executing

foo >>= \n -> bar (n + 1)

then foo could say “give me the continuation, please,” and it would be given the function k n = bar (n + 1). Now foo can, if it so wishes, do fancy things like call the continuation multiple times—it can do something like sequence [k 1, k 2, k 3], and it will do what you’d expect.
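
Here is that idea written out with the standard Cont from Control.Monad.Cont (a minimal sketch; foo and demo are my names, and I've used concat rather than sequence because the raw captured continuation here returns a plain list):

import Control.Monad.Cont (Cont, cont, runCont)

-- foo asks for its continuation k and calls it three times, once
-- for each of 1, 2, and 3, collecting all of the results.
foo :: Cont [Int] Int
foo = cont (\k -> concat [k 1, k 2, k 3])

demo :: [Int]
demo = runCont (foo >>= \n -> pure (n + 1)) (\x -> [x])
-- demo == [2,3,4]: the continuation \n -> [n + 1] ran three times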

Continuation monads also give you the ability to say “please replace the current continuation with a completely different one.” So for example, you might have a computation that looks like

m >>= f >=> pure . Right

or something like that. m might say “actually I want to abort this computation early,” and it would replace the f >=> pure . Right continuation with pure . Left, yielding m >>= pure . Left and terminating.
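
Sketched with Cont (the names and the Either result type are mine), m aborts simply by never calling the continuation it was handed:

import Control.Monad.Cont (Cont, cont, runCont)

abortDemo :: Either String Int
abortDemo = runCont (m >>= f) Right
  where
    -- m drops its continuation k on the floor and produces a Left
    -- directly, so f never runs.
    m :: Cont (Either String Int) Int
    m = cont (\_k -> Left "aborted early")

    f :: Int -> Cont (Either String Int) Int
    f n = pure (n * 2)
-- abortDemo == Left "aborted early"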

when you say "continuation monads" in the plural, what does that mean?

Well, there are different ways to implement continuations, as there are different models of higher-order control. The Cont monad is the most standard one, where type Cont r a = (a -> r) -> r. This allows you to implement callCC, the original higher-order control operator.
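
For reference, here is the whole construction; it is essentially what Control.Monad.Cont in transformers does, minus the transformer plumbing:

newtype Cont r a = Cont { runCont :: (a -> r) -> r }

instance Functor (Cont r) where
  fmap f (Cont c) = Cont (\k -> c (k . f))

instance Applicative (Cont r) where
  pure a = Cont (\k -> k a)
  Cont cf <*> Cont ca = Cont (\k -> cf (\f -> ca (k . f)))

instance Monad (Cont r) where
  Cont c >>= f = Cont (\k -> c (\a -> runCont (f a) k))

-- callCC hands the computation its own continuation k, reified as a
-- function. Invoking that function throws away the local continuation
-- (the ignored \_ below) and jumps straight back to k; this is exactly
-- the "replace the entire remaining computation" behavior.
callCC :: ((a -> Cont r b) -> Cont r a) -> Cont r a
callCC f = Cont (\k -> runCont (f (\a -> Cont (\_ -> k a))) k)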

But callCC has problems. It’s very global—when you call callCC, you capture the entire remaining computation, and when you invoke a captured continuation, you replace the entire remaining computation.

This turns out to not be very useful in practice. You want to be able to capture or abort “slices” of the computation. For example, you might use continuations to model throwing an exception, but with callCC, throwing an exception would just terminate your program. What you really want is to be able to say “I want to terminate up to the nearest catch,” and delimited continuations let you do that.

The most well-known operators for delimited control are named shift and reset. The idea is that reset “delimits” a section of the continuation, and shift is like callCC, but only up to the nearest reset. To make that more concrete, if you have

reset (m >>= f >>= g) >>= h

then if m captures the continuation, it will capture f >=> g, but not h.

(It turns out there are actually techniques for implementing delimited continuations in terms of non-delimited continuations, so it’s not a fundamental difference in ability, but doing that is complicated and assumes you have control over the entire computation.)

so just to make sure I'm keeping up: being able to implement only callCC = non-delimited continuations

Right!

is it possible to write out something like type signatures for shift and reset ?

reset :: m a -> m a and shift :: ((a -> m b) -> m a) -> m a ?

It’s possible. They’re twisty, though. Say we have a delimited continuation monad DCont. Here are the types of reset and shift:

reset :: DCont a a -> DCont r a
shift :: ((a -> DCont r r) -> DCont r r) -> DCont r a

is r universally quantified in reset, or is it some specific r ?

It is universally quantified, but it will be enforced by the enclosing context (since it has to match up with whatever you >>= the result of reset to). It’s initialized at the start of the computation by runDCont :: DCont a a -> a.
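
To make those signatures concrete, here is one standard CPS implementation (a sketch; the representation is the same as Cont, and this simple version doesn't support answer-type modification):

newtype DCont r a = DCont { unDCont :: (a -> r) -> r }

instance Functor (DCont r) where
  fmap f (DCont c) = DCont (\k -> c (k . f))

instance Applicative (DCont r) where
  pure a = DCont (\k -> k a)
  df <*> da = DCont (\k -> unDCont df (\f -> unDCont da (k . f)))

instance Monad (DCont r) where
  m >>= f = DCont (\k -> unDCont m (\a -> unDCont (f a) k))

runDCont :: DCont a a -> a
runDCont m = unDCont m id

-- reset runs its argument to completion with the identity
-- continuation, so nothing captured inside can see past it.
reset :: DCont a a -> DCont r a
reset m = DCont (\k -> k (unDCont m id))

-- shift reifies the continuation up to the nearest reset (k) as an
-- ordinary monadic function its body may call zero or more times.
shift :: ((a -> DCont r r) -> DCont r r) -> DCont r a
shift f = DCont (\k -> unDCont (f (\a -> DCont (\k' -> k' (k a)))) id)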

also I just realized I don't know what role r plays in DCont (I'm assuming this is not our naive Cont )

It plays the same role it does in Cont—it tracks the “result type” of the current computation. The difference is that the r in Cont never changes during a single computation, it’s always the result of the entire thing. But with DCont, r changes when you enter a reset, since r tracks the result type of the nearest reset.

Reasoning about this using the types alone is, in my opinion, pretty confusing. This is one of the situations where I think it’s genuinely more useful to think about the terms first. The most useful reasoning tool here is so-called reduction semantics, a particular kind of operational semantics that is especially useful for reasoning about programs with complex control.

ok, that makes sense. i'm having a hard time imagining how reset could be universally quantified in its "result" type

Well, reset is universally quantified in its result type, it’s true, but that doesn’t mean the caller of reset gets to choose it in practice. Suppose you’re writing a function:

f :: Int -> DCont r String
f n = ...

Inside f, if you call reset, r is still forced to be the same r as the one in the type signature, since that’s the type you wrote down! You can’t write (reset a :: DCont A Foo) >> (reset b :: DCont B Bar) because A and B are different, and >> requires both arguments to have exactly the same type except in their last parameter.

It’s like the s in State s a. Sure, the type of pure for State s is pure :: a -> State s a, and the s is universally quantified there. But the type is still ultimately mandated by its enclosing context.

should I read DCont a b as a computation that's "stuck" until some continuation is provided that consumes a b and produces an a ? and then it will itself resolve to an a

Yes, that’s one way of looking at it (and it’s the literal technique used by CPS implementations). A DCont a b is something that knows how to produce a b, but it won’t give it to you directly—you have to give it a function b -> a, and it will call your function.

There are other ways of thinking about continuations, though, which I think are generally more helpful when it comes to actually doing useful things with them.

what are the other ways?

I'm thinking about the type of reset and it makes a little bit of sense to me now

There are a handful of other ways, but I think the most useful ones are the closely-related ideas of reduction semantics (which is a model) and evaluation stack manipulation (which is an implementation).

Reduction semantics is a way of talking about programs the way you were probably taught Algebra I: a series of incremental simplifications. Normally, when we think about evaluation, we think about it in a “big-step” way. For example, if we think about evaluation of the expression

square (1 + 2) * 5

we imagine evaluating from the “inside out.” We start by evaluating 1 + 2 to 3, then evaluate square 3 to 9, then evaluate 9 * 5 to 45. Reduction semantics instead views the evaluation of a program as a series of steps where each intermediate step is a valid program. So the steps here would be:

  1. square (1 + 2) * 5
  2. square 3 * 5
  3. 9 * 5
  4. 45

This difference might seem trivial, but it’s a meaningful distinction if you stop to really think about it. When we think about evaluation of a function like

f x = do
  y <- g x
  z <- h y
  pure (y + z)

we often imagine evaluation happening in a pseudo-imperative way. We think of g x being evaluated, and its result being “stored” in y, then control “entering” h. There’s this implicit idea of a “call stack,” or something like it.

But reduction semantics imagines evaluation as if it were a transformation on the program itself. So, for example, the sequence of evaluation steps might look like this:

  1. f 42
  2. do { y <- g 42; z <- h y; pure (y + z) } (substitute f)
  3. do { y <- pure (42 - 10); z <- h y; pure (y + z) } (substitute g)
  4. do { y <- pure 32; z <- h y; pure (y + z) } (subtract)
  5. do { let y = 32; z <- h y; pure (y + z) } (>>= / return law)

and so on. The interesting idea here is that there is no concept of a separate “evaluation stack”; the current context of the program is always 100% captured in the syntax of the programming language!

One thing that’s interesting about this way of thinking about programs is it gives you some way to think about something like an “evaluation stack” in a way that makes sense for expression-oriented languages like Haskell. If you have an expression like

square (1 + 2) * 5

you can split it apart into “evaluation frames”:

  1. ● * 5
  2. square ●
  3. 1 + 2

(where the ● represents the “hole” of each frame.)

Why is any of this useful for reasoning about continuations? Because it makes it very clear what the meaning of “the current continuation” is while evaluating an expression! Suppose the program were actually

f (reset (let x = (shift f) + 1 in x * 10)) / 2

(imagining for a moment a language that has reset and shift as built-in operations and doesn’t require them to be used monadically). It’s possible to split this program into evaluation frames like this:

  1. f ● / 2
  2. reset ●
  3. let x = ● + 1 in x * 10
  4. shift f

Now the “captured portion of the computation” is just the frame in between the shift and the reset, which in this case is let x = ● + 1 in x * 10. So you can just put that whole thing under a lambda and replace the ● with the lambda’s binder to get k y = let x = y + 1 in x * 10, and that’s the captured continuation!

This might feel like a lot of complicated machinery to keep in your head, so it might not seem like it’s “simpler” at all. 🙂 But in my experience, most programmers already think in ways along these lines in some situations, they just don’t think about it so formally. Reduction semantics just gives some structure and precision to that style of reasoning.

The fact that reduction semantics represents intermediate states of computation using the syntax of the programming language is especially valuable for equational reasoning, since that’s what equational reasoning is. (Reduction semantics just happens to be directed from “less evaluated” to “more evaluated,” while equational reasoning can go backwards, too.)

…anyway you can adapt this to monadic code, too, you just have evaluation frames connected together using monadic binds rather than simply returning a value. So you might have

reset (do { x <- shift f <&> (+ 1); pure (x * 10) }) >>= f >>= pure . (/ 2)

and the frames become

  1. ● >>= f >>= pure . (/ 2)
  2. reset ●
  3. do { x <- ● <&> (+ 1); pure (x * 10) }
  4. shift f

With that interpretation, the meaning of the r parameter in DCont r a becomes a little more clear: it corresponds to the expected return value of the most recent reset frame.
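
With the DCont sketch from earlier, a close cousin of this example can actually be run (I've dropped the free-floating f and (/ 2), since they were placeholders, and given shift a body that calls the captured continuation twice):

-- The captured continuation k corresponds to the frame
--   \y -> do { x <- pure (y + 1); pure (x * 10) }
-- i.e. k y = (y + 1) * 10 up to the enclosing reset.
example :: Int
example =
  runDCont (reset (shift (\k -> k 3 >>= k) >>= \y -> pure ((y + 1) * 10)))
-- k 3 == pure 40, then k 40 == pure 410, so example == 410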

thanks for the detailed explanation, the "evaluation frame" thing really helps with thinking about it

I think it’s a useful reasoning tool generally, but Haskellers historically seem to have been dismissive of operational semantics in favor of denotational semantics, so I don’t think it gets discussed very often. 😕

How do we tie this back to your earlier advice:

That said, I think a better way to reason about algebraic effects is to forget about monads and think about delimited continuations and evaluation contexts

E.g. what should I do to think about the maybe monad or the list monad in terms of delimited continuations and evaluation contexts?

Good question! Using all this context, I think I can now answer that question directly. 😄 Let’s talk about the either monad and the list monad, since the either monad is slightly more general, and it will make things slightly clearer, I think.

The way we talk about these things using reduction semantics is to present reduction rules, which are the axioms of evaluation for a language. For example, here’s a reduction rule for function application:

(\x -> e) v ⟶ e[x := v]

(Where e[x := v] means “e but with all occurrences of x substituted with v”.)

We can present reduction rules for throw and catch in Either like this

catch (... (throw e) ...) f ⟶ f e

but this isn’t terribly precise. What is the ...? If it could be anything at all, then we could claim that

catch (let m = throw e in pure 42) f

should reduce to f e, but clearly that does not make sense.

So instead we introduce the idea of “evaluation contexts.” An “evaluation context” is written E[e], where the big E is an evaluation context and the little e is an expression. An evaluation context is… an evaluation frame. So, for example, we might take E = (● * 3) + 2 and e = 5, and E[e] would represent “plugging the hole” to get (5 * 3) + 2.

This allows us to express catch and throw using two rules:

  • catch (pure v) f ⟶ pure v
  • catch E[throw e] f ⟶ f e
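
The Either monad satisfies exactly these rules. A small transcription (the E[·] part is implicit, because >>= already short-circuits past Left):

throw :: e -> Either e a
throw = Left

catch :: Either e a -> (e -> Either e a) -> Either e a
catch (Right v) _ = Right v   -- catch (pure v) f ⟶ pure v
catch (Left e)  f = f e       -- catch E[throw e] f ⟶ f e

-- E[throw e] collapses to Left e on its own, since
--   Left e >>= g  ==  Left e
-- discards the surrounding context g, no matter what g is.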

is an evaluation context a purely syntactic construct?

Everything in reduction semantics is a purely syntactic construct. 🙂

To answer your question more usefully, yes, usually an evaluation context is just a syntactic construct. But some languages allow you to have first-class evaluation contexts, where you can get ahold of an evaluation context as a value! What is a “first-class evaluation context”? Another name for a continuation.

how about if E = throw f; ●

Ah, good catch! Evaluation contexts aren’t allowed to be just anything. If I were being formal, I would have defined the grammar of evaluation contexts precisely.

The grammar of evaluation contexts for a call-by-value lambda calculus with addition looks like this:

v ::= (\x -> e) | <number>
e ::= v | e e | e + e
E ::= ● | E e | v E | E + e | v + E

Values are represented by v.
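
One way to make the grammar concrete is to transcribe it into Haskell data types, one constructor per production (the names are mine):

data Val  = Lam String Expr    -- (\x -> e)
          | Num Int            -- <number>

data Expr = V Val              -- v
          | App Expr Expr      -- e e
          | Add Expr Expr      -- e + e

data Ctx  = Hole               -- ●
          | AppL Ctx Expr      -- E e
          | AppR Val Ctx       -- v E
          | AddL Ctx Expr      -- E + e
          | AddR Val Ctx       -- v + E

-- plug c e computes E[e], filling the hole.
plug :: Ctx -> Expr -> Expr
plug Hole        e = e
plug (AppL c e2) e = App (plug c e) e2
plug (AppR v c)  e = App (V v) (plug c e)
plug (AddL c e2) e = Add (plug c e) e2
plug (AddR v c)  e = Add (V v) (plug c e)

-- Note that AppR/AddR demand a Val on the left: the hole may only
-- move into the right operand once the left side is already a value.
-- That restriction makes evaluation call-by-value and left-to-right,
-- and it's why not just any expression-with-a-hole counts as an
-- evaluation context.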

hmm, I'm struggling to wrap my head around that

Yes, this is rather twisty. Let me give some examples.

(Discussion continues a little bit more here: https://funprog.srid.ca/haskell/delimited-continuations-for-modeling-effectful-computations.html)
