FSCD2019 author response
We thank the reviewers for their work on our submission and their
thoughtful comments. Our response is in two parts: discussion of
high-level questions, then more reviewer-specific comments. The
document is designed to have the most important points first, so that
you can stop reading at any point.
# High-level questions
First, we note why we think it would be nice to see this paper at
FSCD.
FSCD is a conference that is dear to us (not least for its continuous
Open Access policy, which is unfortunately a rarity in this area), and
we think that this work could nicely support the general direction of
reasonably broadening the original TLCA+RTA scope: our work justifies
a real-world improvement in a real-world programming language through
a simple but non-routine meta-theoretical development -- yet another
argument in favor of rigorous formalism in programming language
design.
Second, we think that there is an insight in this work that relates to
the discussion with reviewer 2 on "declarativeness". Existing work
(and earlier iterations of the present work) on expressing program
analyses as type systems tends to propose judgments of the form
Γ ⊢ e : τ ▹ ϕ
or
Γ ⊢ e : τ ▹ e' : τ', ϕ
where Γ, e, τ are inputs, ϕ describes the analysis results, and (e' :
τ') in the second formulation describes re-annotations of the inputs
e, τ refined with the analysis results. Instead, we present our static
system as
Γ ⊢ e : τ
and we use the freedom of deciding that Γ may be an output instead of
an input (as reviewer 2 remarks, our system is uni-typed; if we wanted
more types for better precision, Γ would contain both inputs and
outputs) to get a system that expresses the same algorithmic content
with a simpler, more declarative, presentation. Our meta-theoretic
evaluation (proving soundness with respect to the operational
semantics of [Nordlander, Carlsson and Gill, 2008]) validates the idea
that this presentation is reasonably easy to reason about (note in
particular how the subject-reduction proof is completely oblivious to
the input/output distinctions in the system), and our programming
experience validates the idea that it is still conducive to an
implementation (which was merged into the OCaml compiler).
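To make this information flow concrete, here is a minimal OCaml
sketch written for this response (the mode discipline below is a
deliberately crude stand-in for the rules of the paper; only the
direction of the information flow is the point): the term and the
mode at which it is used are inputs, and the context is the output.

```ocaml
module Ctx = Map.Make (String)

(* Declaration order gives Ignore < Delay < Guard < Return < Dereference
   under the structural comparison used by [max]. *)
type mode = Ignore | Delay | Guard | Return | Dereference

type term =
  | Var of string
  | Lam of string * term
  | App of term * term
  | Cons of term * term  (* a constructor, to illustrate guardedness *)

(* Contexts of sibling subterms are merged by keeping the stronger mode. *)
let join = Ctx.union (fun _ m1 m2 -> Some (max m1 m2))

(* Right-to-left reading: [infer t m] returns the context Γ such that
   Γ ⊢ t : m, instead of checking t against a given Γ. *)
let rec infer (t : term) (m : mode) : mode Ctx.t =
  match t with
  | Var x -> Ctx.singleton x m
  | Lam (x, body) ->
      (* crude: free variables of a function body are only needed once
         the function is applied, so their uses are delayed *)
      Ctx.remove x (infer body Delay)
  | App (f, arg) ->
      (* both sides of an application are Dereference contexts,
         as discussed below for (□ t) and (f □) *)
      join (infer f Dereference) (infer arg Dereference)
  | Cons (hd, tl) ->
      (* constructor arguments are merely stored: guarded *)
      join (infer hd Guard) (infer tl Guard)
```

For example, `infer (Cons (Var "x", Var "xs")) Return` maps both `x`
and `xs` to Guard, which is how a definition such as
`let rec xs = x :: xs` gets accepted.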
Let us now answer the salient comments and questions from our reviewers.
## Bugs avoided in the OCaml compiler (review 1)
Reviewer 1 asks about OCaml compiler bugs that this work eliminates.
We offer the following list:
- [#7231](https://github.com/ocaml/ocaml/issues/7231): let-rec
well-formedness check too permissive with nested recursive bindings
(this issue is directly solved by our more modular/composable
approach of a type system rather than a list of allowed/disallowed
scenarios)
- [#7215](https://github.com/ocaml/ocaml/issues/7215): Unsoundness
with GADTs and let rec
(This issue, discussed in §6.4, justifies analysing a typed
representation of the language, rather than an intermediate
representation that does not represent GADT type equalities.)
- [#4989](https://github.com/ocaml/ocaml/issues/4989): Compiler
rejects recursive definitions of values
`let rec f = let g = fun x -> f x in g`
The other bugs in this list allow invalid programs that
segfault. This bug is a case of lack of expressivity of the
previous check; it is trivially solved by our type-system approach.
- [#6939](https://github.com/ocaml/ocaml/issues/6939): Segfault
with improper use of let-rec and float arrays
Our §6.2 describes this issue.
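To give a concrete sense of the new check, here are illustrative
OCaml snippets (ours, not reproductions of the bug reports):

```ocaml
(* Accepted by the new check (and wrongly rejected before, see #4989):
   the recursive name only occurs under a function abstraction, so it
   is never inspected while being defined. *)
let rec f = let g = fun x -> f x in g

(* Accepted: the recursive name is guarded by a constructor. *)
let rec ones = 1 :: ones

(* Rejected: computing the right-hand side would have to inspect [x]
   before it is defined.
   let rec x = x + 1 *)
```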
Besides fixing (at least) these bugs, checking recursive definitions
at the typing phase rather than on an intermediate representation made
life much easier for tools that reuse the OCaml front-end, e.g.
* MetaOCaml (which previously disallowed value recursion within
quotations, as it could not check them)
* Merlin (an IDE tool providing edit-time type checking, which did
not report value recursion errors).
## Declarative vs. Algorithmic presentations (review 2)
We agree with Reviewer 2 that the presentation is neither "fully
declarative" nor "fully algorithmic", and that having separate
declarative and algorithmic versions would be nice. But we don't think
this is reasonable within the constrained FSCD space budget. Our
system takes one half-page figure; having two systems, plus
explanations to relate them, would take another page, and we don't see
any portion of the present article that is less important than having
those two presentations instead of one reasonable compromise.
Also, in the details we respectfully disagree with reviewer 2 on
the placement of the boundary between "declarative" and "algorithmic",
which is somewhat subjective. For example, we would defend the choice
of using a multiplicative style, explicitly merging the contexts of
separate subexpressions. We view this as being mindful about the
respective contributions of subterms, rather than a
redundancy-introducing algorithmic choice. In fact, reference [2] uses
this same style for the application rule, which supports our intuition
that it is a natural presentation choice for this family of type
systems.
The one rule that we agree is objectively algorithmic in its
presentation is the rule for mutual recursion:
```
(G_i, (x_j : m_i,j)^j |- t_i : Return)^i
(m_i,j <= Guard)^i,j
forall i, (G'_i = G_i + Σ_j m_i,j[G'_j])
-------------------------------------------
(x_i : G'_i)^i |- rec (x_i = t_i)^i
```
Here, the G' are defined from the G as fixpoints of a system of
equations, while it would be enough (less constrained) to specify,
through inequalities, a correctness condition on a single family of
contexts:
```
(G_i, (x_j : m_i,j)^j |- t_i : Return)^i
(m_i,j <= Guard)^i,j
forall i,j, G_i <= m_i,j[G_j]
----------------------------------
(x_i : G_i)^i |- rec (x_i = t_i)^i
```
This formulation is more declarative, but we found it less intuitive /
readable: it would have been the only rule where constraints on the
context in a sub-goal would flow from the rule around it, instead of
the other way around. So we made the conscious choice to be more
algorithmic there.
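For concreteness, here is a minimal OCaml sketch of this fixpoint
computation, written for this response. To keep it short, each
context G_i is collapsed to a single mode, and the composition m[m']
and the sum + of the rule are taken as parameters rather than
spelled out:

```ocaml
type mode = Ignore | Delay | Guard | Return | Dereference

(* Kleene iteration for  G'_i = G_i + Σ_j m_ij[G'_j].
   [join] stands for + and [compose] for the composition m[m'];
   the mode lattice is finite, so if both are monotone the loop
   terminates. *)
let fixpoint ~(join : mode -> mode -> mode)
             ~(compose : mode -> mode -> mode)
             (g : mode array) (m : mode array array) : mode array =
  let n = Array.length g in
  let g' = Array.copy g in
  let changed = ref true in
  while !changed do
    changed := false;
    for i = 0 to n - 1 do
      let acc = ref g.(i) in
      for j = 0 to n - 1 do
        acc := join !acc (compose m.(i).(j) g'.(j))
      done;
      if !acc <> g'.(i) then begin
        g'.(i) <- !acc;
        changed := true
      end
    done
  done;
  g'
```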
As researchers we value declarative presentations because they are (1)
easier for humans to understand and (2) (often) easier to reason about
in the meta-theory. This particular type system is currently never
shown to human users of OCaml (the check either passes or reports an
error for the offending variable), and its meta-theoretic analysis is
complete. Finally, our reference implementation serves as the ultimate
algorithmic presentation. This explains why exploring the
declarative/algorithmic design continuum was not a priority in our
presentation of the work.
## Related work on modal type systems (review 2)
We thank reviewer 2 for the extra references (we only knew of [1],
[3] and work related to [4] by the same authors) and will try to
extend the Related Work section as suggested.
> "apply [our technique to other systems - type parameter variance"
>
> Yes, and there is related literature, you may start looking at [1,2,5].
By "our technique" we mean "presenting a backward-analysis algorithm
as a right-to-left type system". [1], [2] and [3] demonstrate the
flexibility of modal type systems (in complement to the work on
Nakano-style modal calculi), but they do not propose this technique in
the context of a program analysis.
Relatedly, [4] structures the evidence of a backward analysis as
a modal type system, but it does not use a "right-to-left type
system" presentation; it returns the result of the analysis as
a "demand" on the right of the judgment. As mentioned
earlier, we propose to represent this syntactically as just the
context of a type judgment. (This is a presentation technique rather
than a deep difference in what is being computed, but we think that it
does make a nice difference, and in a sense that's what is really
"declarative" about our work.)
## Comparison with call-by-need (review 2)
> Shouldn't be there similarities with call-by-need? The concept of
> "needed redex" seems related to your redexes in forced evaluation
> contexts. There are differences as well, since you consider a term
> vicious not only when it is stuck...
>
> Anyway this is just a hunch; a comment on the relation to
> call-by-need in the related work section would be useful.
Don Syme proposed a translation of value-letrec into lazy thunks for
F#. Using a source-level call-by-need operational semantics was our
first idea to specify the dynamics to compare our type system against,
but it results in a more complex presentation than the one we adapted
from [Nordlander, Carlsson and Gill, 2008].
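For readers unfamiliar with that translation, here is a hedged OCaml
rendering of the idea (the names are ours): wrap each right-hand side
in a thunk and each occurrence of a recursive name in a force, which
trades the static check for a dynamic one.

```ocaml
(* Source:  let rec fact = fun n -> if n = 0 then 1 else n * fact (n - 1)
   Thunked: the recursion goes through a lazy cell, forced at each use. *)
let rec fact' =
  lazy (fun n -> if n = 0 then 1 else n * Lazy.force fact' (n - 1))
let fact : int -> int = Lazy.force fact'
(* fact 5 = 120: this recursion is benign. *)

(* Source:  let rec x = x + 1  (vicious).  The translation turns the
   static error into a dynamic one: *)
let rec x' = lazy (Lazy.force x' + 1)
(* Lazy.force x' raises Lazy.Undefined at run time. *)
```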
In call-by-need calculi, there is also a store (typically for
not-yet-evaluated terms), and indeed it has been presented as a global
thunk store (as implemented in programming languages). But the nicer
presentations for reasoning, starting with "The call-by-need
lambda-calculus" by Ariola, Maraist, Odersky, Felleisen, Wadler, use
local stores / explicit substitutions. We have local stores as well
("let rec"), not to delay computations (we are strictly
call-by-value), but to represent cycles through the memory. Note that
explicit substitutions are a common meta-theoretical technique in
lambda-calculi that is not specific to call-by-need (call-by-need,
being more complex to describe than call-by-{value,name}, tends to
require fancier techniques).
The notion of "forcing context" does sound related to notions that
occur in call-by-need, but it is also not uncommon in call-by-value
settings when you want to precisely study errors. For example, our
definition of Mismatch (using E[H] as a notion of forcing context)
was inspired by the following work, agnostic to the evaluation
strategy:
System F with Coercion Constraints
Julien Crétin and Didier Rémy, 2014
http://gallium.inria.fr/~remy/coercions/Cretin-Remy!fcc@lics2014.pdf
first paragraph of the right column of page 3
> What is the relation between "forcing contexts" and needed redexes?
They are quite different notions, because we are in a call-by-value
setting. A context is "forcing" when (1) it is in evaluation position
and (2) its evaluation requires the value of its hole. For example:
- `(□, u)` is neither needed in call-by-need nor forcing for us
  (by "needed" we mean that a redex in the hole of this context would
  be a needed redex)
- `□` is needed but not forcing (in terms of our type-system it is a Return
position, not a Dereference)
- `(1, □+2)` is forcing (with strict pairs) but not needed
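Since OCaml pairs are strict, the last example also has a direct
source-level reading (illustrative snippets, ours):

```ocaml
(* (1, □ + 2) is forcing: building the pair evaluates the hole, so the
   check rejects
     let rec p = (1, fst p + 2)                                       *)

(* Delaying the use of the recursive name makes it acceptable: *)
let rec q = (1, fun () -> fst q + 2)
```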
## Details on the operational semantics (review 3)
It was indeed our intention to avoid an "assoc" rule, so that the
example of review 3
```
let rec x = (let rec y = 1 :: x in 2 :: y) in 3 :: x
```
is precisely a normal form that describes a value containing cycles,
with each cycle defined as locally as possible. (This is just a
presentation choice and an "assoc" rule would work just as well; in
fact it would probably have been wiser to directly reuse this aspect
of the presentation of [Nordlander, Carlsson and Gill, 2008], which
uses a "merge" rule for that purpose. The rest of this response
describes how the calculus works without needing to explicitly merge
let-rec blocks.)
Note that our "lookup" rule may traverse the whole term from the
lookup point to the root, in search of the value binding that defines
this variable -- we don't need to merge value bindings to find the
value in outer/older bindings.
That said, there were two small mistakes in the rules described in our
submission, related to the lack of a grammar for contexts built only
from fully-evaluated `let rec` blocks:
```
L[□] ::= □ | let rec B in L[□]
```
(With this L[□] defined, our forcing contexts E_f[□] can be defined
from forcing frames F_f[□] in a simpler way: E_f ::= E[F_f[L]].)
Mistake 1: the grammar of values should allow L[v] as a strong value,
and L[w] as a weak value.
Mistake 2: the rules for head reductions should allow for L-contexts
in the middle of the redex, corresponding to a somewhat standard
"reduction at a distance" rule in explicit-substitution calculi:
```
L[λx.t] v -{hd}-> L[t[v/x]]
(match L[K (wᵢ)ⁱ] with h, K (xᵢ)ⁱ → t, h') -{hd}-> L[t[wᵢ/xᵢ]ⁱ]
```
(For a reference on "reduction at a distance", see for example the
Linear Substitution Calculus introduced in "An abstract factorization
theorem for explicit substitutions", Beniamino Accattoli, 2012.)
With these two fixes in place, let us describe a more interesting
example by defining
```
B₀ := rec hd = λli.
          match li with Nil -> None | Cons (x, xs) -> Some x
      and cycle = λx.
          let rec li = Cons (x, li) in li

t  := let rec B₀ in hd (cycle 1)
```
We have, as one possible reduction path for t:
```
let rec B₀ in hd (cycle 1)

  -{lookup hd}->

let rec B₀ in
  (λli. match li with Nil -> None | Cons (x, xs) -> Some x)
  (cycle 1)

  -{lookup cycle}->

let rec B₀ in
  (λli. match li with Nil -> None | Cons (x, xs) -> Some x)
  ((λx. let rec li = Cons (x, li) in li) 1)

  -{app}->

let rec B₀ in
  (λli. match li with Nil -> None | Cons (x, xs) -> Some x)
  (let rec li = Cons (1, li) in li)

  -{lookup li}->

let rec B₀ in
  (λli. match li with Nil -> None | Cons (x, xs) -> Some x)
  (let rec li = Cons (1, li) in Cons (1, li))

  -{app}->

let rec B₀ in
  (match (let rec li = Cons (1, li) in Cons (1, li)) with
   | Nil -> None
   | Cons (x, xs) -> Some x)

  -{match}->

let rec B₀ in
let rec li = Cons (1, li) in
Some 1
```
# Reviewer-specific questions
## Review 1
> You basically want to say that information flows from the
> expression to the variables. This is usually referred to as the
> "checking" direction in bidirectional typing, as opposed to the
> "inference"/"synthesis" direction. Thus, you could also say "A
> type checking system for...".
In the bidirectional type systems that we are familiar with, the
context is known, and the "checking" direction moves inwards inside
terms by following constructors, and stops before reaching neutrals
(in particular variables) which are inferred, not checked. It is thus
not clear that "checking" would carry the right intuition here.
> p7 Why is "(λx.□)t" a deref context and not a return context? Should
> it not be equal to "let x = t in □" ?
Reduction can lower mode constraints, so anti-reduction can strengthen
them. With our mode language, (□ t) is a Dereference context that is
indistinguishable, from the modes alone, from (f □); it is natural that
(f (λx.□)) be a Dereference context (f may call the function and inspect
the result), and ((λx.□) t) is handled in the exact same way.
If we wanted to refine the mode-checking of this particular example
(we do not), we could introduce a new mode "Call" that sits between
Return and Dereference, is only used on the left of function
applications, and satisfies Call[Delay] = Return.
> Maybe a semantics using a heap more explicitly would make the
> presentation clearer!? See, e.g., [4], or related machines for
> call-by-need, like the Sestoft machine.
We think that there is conceptual value in staying as close to the
source calculus as possible. Of course, "clearer" depends on reader
familiarity, and some readers are already more familiar with
abstract-machine presentations. In terms of economy of concepts, we
would argue that this presentation is better.
> Since you have only shallow matching, you could have instead
> required disjointness of patterns in the syntax of "match".
Thanks for the suggestion.
## Review 2
> (One approach to remedy the situation would be to prove that the
> paper's semantics is correct with respect to a more conventional
> semantics. But this would be beyond the scope of the paper.)
Yes, we asked ourselves the same question and have written down a
simple compilation scheme for our source language into a λ-calculus
with a global store, and a sketch of a bisimulation argument between a
source program and its compiled form. (It doesn't fit in the paper.)
For the record, our target calculus is defined as follows:
```
t ::= x, y, z
    | new (xᵢ)ⁱ; u
    | (xᵢ←tᵢ)ⁱ; u
    | λx.t
    | t u
    | K (tᵢ)ⁱ
    | match t with h

v ::= K (wᵢ)ⁱ | λx.t
w ::= x | v

[[let rec (xᵢ = tᵢ)ⁱ in u]] = new (xᵢ)ⁱ; (xᵢ ← tᵢ)ⁱ; u
```
In this calculus, store locations are just variables (we don't have
first-class reference cells that can be returned from a function, so
no need for a separate class of runtime locations), and (xᵢ←tᵢ)ⁱ
represents parallel, unordered writes: reducing any of the terms being
written and committing a fully-evaluated write (x←v) to memory may
happen in any order.
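A minimal OCaml sketch of this compilation scheme, written for this
response (constructors and match are elided for brevity):

```ocaml
type var = string

type src =
  | SVar of var
  | SLam of var * src
  | SApp of src * src
  | SLetRec of (var * src) list * src

type tgt =
  | TVar of var
  | TLam of var * tgt
  | TApp of tgt * tgt
  | TNew of var list * tgt            (* new (x_i)^i; u    *)
  | TWrite of (var * tgt) list * tgt  (* (x_i <- t_i)^i; u *)

(* [[let rec (x_i = t_i)^i in u]] = new (x_i)^i; (x_i <- t_i)^i; u *)
let rec compile : src -> tgt = function
  | SVar x -> TVar x
  | SLam (x, t) -> TLam (x, compile t)
  | SApp (t, u) -> TApp (compile t, compile u)
  | SLetRec (bindings, u) ->
      TNew (List.map fst bindings,
            TWrite (List.map (fun (x, t) -> (x, compile t)) bindings,
                    compile u))
```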