AndrasKovacs/NbESharedQuote.hs

## NbESharedQuote.hs
{-
At ICFP 2022 I attended a talk given by Tomasz Drab, about this paper:

    "A simple and efficient implementation of strong call by need by an abstract machine"
    https://dl.acm.org/doi/10.1145/3549822

This is right up my alley since I've implemented strong call-by-need evaluation
quite a few times (without ever doing more formal analysis of it) and I'm also
interested in performance improvements. Such evaluation is required in
conversion checking in dependently typed languages.

I was surprised to hear in the talk that the following can be normalized in
linear time:

    λ x. N ω x

where `ω = λ x. x x` and `N` is a Church numeral. Normalizing this expression
iterates self-application N times on `x`, and in the end we get a tree of
self-applications with depth N.

The familiar NbE (normalization-by-evaluation) has a lot of implicit sharing in
runtime values (which is one reason why it's better than naive term
substitution), but when we "quote" values back to terms we usually don't get any
implicit sharing, because we create fresh term copies from values. For example,
when we evaluate

    let x = expensive in cons x (cons x (cons x nil))

the value of `x` is shared across the three copies in the list, but when we
quote the value, the implicit sharing is discarded and we get three copies
of `x` as a term.

The idea of the paper/talk is to *also* have sharing when we compute
terms. In this file I present a small Haskell implementation for an NbE
algorithm in this style.

Regarding applications: in short, I don't think I would use this algorithm in
practical elaborators, but I do think that it's important to be aware of it.

The algorithm has two primary design requirements.

1. We must use a fresh name convention in terms, we can't use De Bruijn indices
   or levels. The reason is that we need term weakening for free.
2. We must represent neutral values as embedded terms. This is essential for
   avoiding fresh copying in quotation.

(1) is not too problematic; it's only a small overhead and it's not too
annoying. However, (2) is not really workable in elaboration, because we really
want neutral values with carefully chosen structure, for convenience and
efficiency in unification. Also:

- We get *non-observable sharing* from the algorithm. Such sharing is destroyed
  when we copy the resulting terms! If we want to e.g. serialize terms, we need
  to convert sharing to observable sharing, which is usually not cheap.
- In elaboration we very rarely quote to actual beta-normal terms. Instead, we
  get the greatest amount of size compression and observable sharing from
  *preserving definition unfoldings*.

-}

-- Mind the Strict pragma.
{-# language Strict, LambdaCase #-}
{-# options_ghc -Wincomplete-patterns #-}

{-
One important point is that we *can't* use De Bruijn convention, IIUC. The `Int`
variables here behave like names. The reason is that weakening for *terms* must
be for free in this algorithm, or else we don't get the nice asymptotics.
-}
data Tm = Var Int | App Tm Tm | Lam Int Tm deriving (Show)

-- Association lists of names and lazy values.
data Env = Nil | Def Env Int ~Val

-- Name of binder, env, closure body.
data Closure = Cl Int Env Tm

{-
Here's the main departure from vanilla NbE: neutrals are literally terms.
Also, VLam is annotated with a lazy term which is just the normal form
of the value.
-}
data Val = Ne Tm | VLam {-# unpack #-} Closure ~Tm

-- This means that quotation is not even recursive! It's just a projection.
-- This is pretty much the main point: quotation does not copy anything.
quote :: Val -> Tm
quote = \case
  VLam _ t -> t
  Ne t     -> t
{-# inline quote #-}


-- evaluation
------------------------------------------------------------

{-
We pass the next fresh variable along with the environment. The fresh variable
must not appear free in any of the values in the env. This is IMO the most
efficient way to implement fresh name generation for our purposes. We don't
really need global state.
-}

var :: Env -> Int -> Val
var e x = case e of
  Def e x' v | x == x' -> v
             | True    -> var e x
  _ -> undefined

nf :: Int -> Env -> Tm -> Tm
nf fresh e t = quote (eval fresh e t)
{-# inline nf #-}

-- apply a closure to a value.
appCl :: Int -> Closure -> Val -> Val
appCl fresh (Cl x e t) ~u = eval fresh (Def e x u) t
{-# inline appCl #-}

eval :: Int -> Env -> Tm -> Val
eval fresh e = \case
  Var x   -> var e x
  App t u -> case eval fresh e t of
               VLam t _ -> appCl fresh t (eval fresh e u)
               Ne t     -> Ne (App t (nf fresh e u))
  Lam x t -> VLam (Cl x e t)
                  (Lam fresh (nf (fresh+1) (Def e x (Ne (Var fresh))) t))

{- In eval, in the neutral application case we immediately normalize the
   argument and build the normal term.
   In the lambda case we save a closure and compute a lazy normal form.
-}

-- Normalize a closed term.
nf0 :: Tm -> Tm
nf0 = nf 0 Nil

-- Syntactic sugar for variables.
instance Num Tm where
  fromInteger = Var . fromInteger
  (+) = undefined; (*) = undefined; abs = undefined
  signum = undefined; negate = undefined

($$) = App
infixl 7 $$

-- Church natural number literal.
cnat :: Int -> Tm
cnat n = Lam 0 $ Lam 1 (go n) where
  go n | n <= 0 = 1
       | True   = 0 $$ go (n - 1)

-- This one is the `λ x. N ω x` expression that I mentioned before. It runs
-- reasonably quickly in ghci even for args around one million.
test :: Int -> ()
test n = seq (nf0 (Lam 0 $ cnat n $$ (Lam 0 $ 0 $$ 0) $$ 0)) ()
	{-
	At ICFP 2022 I attended a talk given by Tomasz Drab, about this paper:

	"A simple and efficient implementation of strong call by need by an abstract machine"
	https://dl.acm.org/doi/10.1145/3549822

	This is right up my alley since I've implemented strong call-by-need evaluation
	quite a few times (without ever doing more formal analysis of it) and I'm also
	interested in performance improvements. Such evaluation is required in
	conversion checking in dependently typed languages.

	I was surprised to hear in the talk that the following can be normalized in
	linear time:

	λ x. N ω x

	where `ω = λ x. x x` and `N` is a Church numeral. Normalizing this expression
	iterates self-application N times on `x`, and in the end we get a tree of
	self-applications with depth N.

	The familiar NbE (normalization-by-evaluation) has a lot of implicit sharing in
	runtime values (which is one reason why it's better than naive term
	substitution), but when we "quote" values back to terms we usually don't get any
	implicit sharing, because we create fresh term copies from values. For example,
	when we evaluate

	let x = expensive in cons x (cons x (cons x nil))

	the value of `x` is shared across the three copies in the list, but when we
	quote the value, the implicit sharing is discarded and we get three copies
	of `x` as a term.

	The idea of the paper/talk is to also have sharing when we compute
	terms. In this file I present a small Haskell implementation for an NbE
	algorithm in this style.

	Regarding applications: in short, I don't think I would use this algorithm in
	practical elaborators, but I do think that it's important to be aware of it.

	The algorithm has two primary design requirements.

	1. We must use a fresh name convention in terms, we can't use De Bruijn indices
	or levels. The reason is that we need term weakening for free.
	2. We must represent neutral values as embedded terms. This is essential for
	avoiding fresh copying in quotation.

	(1) is not too problematic; it's only a small overhead and it's not too
	annoying. However, (2) is not really workable in elaboration, because we really
	want neutral values with carefully chosen structure, for convenience and
	efficiency in unification. Also:

	- We get non-observable sharing from the algorithm. Such sharing is destroyed
	when we copy the resulting terms! If we want to e.g. serialize terms, we need
	to convert sharing to observable sharing, which is usually not cheap.
	- In elaboration we very rarely quote to actual beta-normal terms. Instead, we
	get the greatest amount of size compression and observable sharing from
	preserving definition unfoldings.

	-}

	-- Mind the Strict pragma.
	{-# language Strict, LambdaCase #-}
	{-# options_ghc -Wincomplete-patterns #-}

	{-
	One important point is that we can't use De Bruijn convention, IIUC. The `Int`
	variables here behave like names. The reason is that weakening for terms must
	be for free in this algorithm, or else we don't get the nice asymptotics.
	-}
	data Tm = Var Int \| App Tm Tm \| Lam Int Tm deriving (Show)

	-- Association lists of names and lazy values.
	data Env = Nil \| Def Env Int ~Val

	-- Name of binder, env, closure body.
	data Closure = Cl Int Env Tm

	{-
	Here's the main departure from vanilla NbE: neutrals are literally terms.
	Also, VLam is annotated with a lazy term which is just the normal form
	of the value.
	-}
	data Val = Ne Tm \| VLam {-# unpack #-} Closure ~Tm

	-- This means that quotation is not even recursive! It's just a projection.
	-- This is pretty much the main point: quotation does not copy anything.
	quote :: Val -> Tm
	quote = \case
	VLam _ t -> t
	Ne t -> t
	{-# inline quote #-}


	-- evaluation
	------------------------------------------------------------

	{-
	We pass the next fresh variable along with the environment. The fresh variable
	must not appear free in any of the values in the env. This is IMO the most
	efficient way to implement fresh name generation for our purposes. We don't
	really need global state.
	-}

	var :: Env -> Int -> Val
	var e x = case e of
	Def e x' v \| x == x' -> v
	\| True -> var e x
	_ -> undefined

	nf :: Int -> Env -> Tm -> Tm
	nf fresh e t = quote (eval fresh e t)
	{-# inline nf #-}

	-- apply a closure to a value.
	appCl :: Int -> Closure -> Val -> Val
	appCl fresh (Cl x e t) ~u = eval fresh (Def e x u) t
	{-# inline appCl #-}

	eval :: Int -> Env -> Tm -> Val
	eval fresh e = \case
	Var x -> var e x
	App t u -> case eval fresh e t of
	VLam t _ -> appCl fresh t (eval fresh e u)
	Ne t -> Ne (App t (nf fresh e u))
	Lam x t -> VLam (Cl x e t)
	(Lam fresh (nf (fresh+1) (Def e x (Ne (Var fresh))) t))

	{- In eval, in the neutral application case we immediately normalize the
	argument and build the normal term.
	In the lambda case we save a closure and compute a lazy normal form.
	-}

	-- Normalize a closed term.
	nf0 :: Tm -> Tm
	nf0 = nf 0 Nil

	-- Syntactic sugar for variables.
	instance Num Tm where
	fromInteger = Var . fromInteger
	(+) = undefined; (*) = undefined; abs = undefined
	signum = undefined; negate = undefined

	($$) = App
	infixl 7 $$

	-- Church natural number literal.
	cnat :: Int -> Tm
	cnat n = Lam 0 $ Lam 1 (go n) where
	go n \| n <= 0 = 1
	\| True = 0 $$ go (n - 1)

	-- This one is the `λ x. N ω x` expression that I mentioned before. It runs
	-- reasonably quickly in ghci even for args around one million.
	test :: Int -> ()
	test n = seq (nf0 (Lam 0 $ cnat n $$ (Lam 0 $ 0 $$ 0) $$ 0)) ()