Skip to content

Instantly share code, notes, and snippets.

@AndrasKovacs
Last active January 14, 2024 17:39
Show Gist options
  • Star 4 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save AndrasKovacs/f268c0311437f7a8759d5bac57495f8b to your computer and use it in GitHub Desktop.
Save AndrasKovacs/f268c0311437f7a8759d5bac57495f8b to your computer and use it in GitHub Desktop.
{-# language
BlockArguments
, ConstraintKinds
, ImplicitParams
, LambdaCase
, OverloadedStrings
, PatternSynonyms
, Strict
, UnicodeSyntax
#-}
{-# options_ghc -Wincomplete-patterns -Wunused-imports #-}
{-
-- Practical eta conversion for the unit type. Also eta-long normalization.
-------------------------------------------------------------------------------
This a small demo of doing eta-conversion for the unit type in the practically
nicest way that I know. I don't think it's widely known.
This requires retaining types in values, since purely syntax-directed conversion
isn't possible for the unit type. The setup also makes it possible to do
eta-long normalization in an easy way.
There's a significant difference between this and the nicer formal NbE
definitions.
In formal NbE, there's a quote and an unquote operation (also called: reify and
reflect), both of which are type-directed. We decide conversion by first
computing eta-long normal forms, then comparing them.
In practice we want to avoid computing types and normal forms as much as
possible.
What we have here:
- Neutrals and binders are annotated with lazily computed types.
- Conversion is by recursion on values.
- Deciding eta for Pi and Sigma is purely syntax-directed.
- Conversion only computes types of neutrals, and only when necessary for
deciding unit eta. If a conversion problem does not actually depend on unit
eta, *no* types are computed, and we only have a modest constant overhead
compared to purely syntax-directed conversion.
- There is *no* unquote/reflect operation that maps neutrals to values.
Instead, we only ever eta-expand neutrals directly to terms, during quotation.
- Switching between eta-long quotation and eta-ignoring quotation is very
easy with a single branching on some flag.
- Eta-long quotation also only computes types of neutrals.
- There's a large amount of computation sharing in types.
Let's compare Agda as well:
- Conversion is type-directed, i.e. a type is passed to conversion at all times.
So, a head-normal type is computed for everything, including canonical values.
- For record types, the syntax-directed eta-expansion rule is being used, since
eager type-directed expansion would be too explosive.
- When comparing neutrals, types are recomputed for each value in a spine,
starting from the type of the head variable. There is no sharing when the same
neutrals are being compared multiple times.
I also demonstrate usage of ImplicitParams here, a rarely used GHC feature
that's nonetheless very useful in type theory implementations.
-}
import Prelude hiding (pi)
import Control.Monad
import Data.String
import GHC.Stack
-- import Debug.Trace
-- We use plain String names everywhere. No De Bruijn.
-- This not best practice, just laziness.
type Name = String
-- "Raw terms" are the input of elaboration. They have type-annotated let-s but
-- no annotation on lambda binders. Elaboration annotates the lambdas.
data RTm
= RU
| RPi Name RTm RTm
| RLam Name RTm
| RApp RTm RTm
| RVar Name
| RUnit
| RTt
| RLet Name RTm RTm RTm
deriving Show
type Ty = Tm
-- Core terms have type-annotated lambdas.
data Tm
= U
| Pi Name Ty Ty
| Lam Name ~Ty Tm
| App Tm Tm
| Var Name
| Unit
| Tt
| Let Name Ty Tm Tm
deriving Show
type VTy = Val
data Val
= VNe Ne ~VTy -- note the lazy type
| VPi Name VTy (Val -> VTy)
| VLam Name ~VTy (Val -> Val) -- also here
| VU
| VUnit
| VTt
data Ne = NVar String | NApp Ne Val
pattern VVar x a = VNe (NVar x) a
-- Here's a bit of lesser-known GHC magic that can be tremendously useful.
-- Below I define synonyms for *implicit parameter* constraints.
-- https://downloads.haskell.org/ghc/latest/docs/users_guide/exts/implicit_parameters.html
-- These are arguments that are passed implicitly like instances, but which are
-- passed in a lexical manner instead of via instance resolution: the lexically
-- innermost "let ?name" binding or passed (?name :: ...) constraint determines
-- the value of ?name.
-- The list of variables that can possibly occur freely in a semantic value.
-- It's called "domain" because it's the domain of the value environment in NbE.
-- Operationally, it's used in fresh name generation.
type Domain = (?domain :: [Name])
-- A mapping from variables which can occur in *terms*, to values. We use this in evaluation
type Env = (?env :: [(Name, Val)])
-- A mapping from variables in *raw terms* to types and non-shadowing names (we get rid of shadowing
-- during elaboration).
type Cxt = (?cxt :: [(Name, (Name, VTy))])
type Elab = (Domain, Env, Cxt)
-- Create a fresh variable as value, while extending the domain with the fresh name.
fresh :: Name -> VTy -> (Domain => Name -> Val -> a) -> Domain => a
fresh x ~a act =
let go x | x == "_" = x
| elem x ?domain = go (x ++ "'")
| otherwise = x
x' = go x
v = VVar x' a in
let ?domain = x' : ?domain in
act x' v
defineEnv :: Name -> Val -> (Env => a) -> Env => a
defineEnv x ~v act = let ?env = (x, v) : ?env in act
defineElab :: Name -> Val -> VTy -> (Elab => Name -> a) -> Elab => a
defineElab x a ~v act =
fresh x a \x' _ -> defineEnv x' v $ let ?cxt = (x, (x', a)) : ?cxt in act x'
bindElab :: Name -> VTy -> (Elab => Name -> Val -> a) -> Elab => a
bindElab x ~a act =
fresh x a \x' v -> defineEnv x' v $ let ?cxt = (x, (x', a)) : ?cxt in act x' v
impossible :: HasCallStack => a
impossible = error "impossible"
vapp :: Val -> Val -> Val
vapp t u = case t of
-- the type is lazy, so the expression after the $ is a thunk at this point that captures "a" and "u"
VNe n a -> VNe (NApp n u) $ case a of
VPi _ _ b -> b u
_ -> impossible
VLam _ _ t -> t u
_ -> impossible
eval :: Env => Tm -> Val
eval = \case
Var x -> maybe (error x) id $ lookup x ?env
App t u -> vapp (eval t) (eval u)
Let x a t u -> defineEnv x (eval t) (eval u)
U -> VU
Pi x a b -> let va = eval a in VPi x va \v -> defineEnv x v (eval b)
Lam x a t -> let ~va = eval a in VLam x va \v -> defineEnv x v (eval t)
Unit -> VUnit
Tt -> VTt
-- Are all values of a type definitionally equal? Here the irrelevant types
-- are just Unit and the functions returning in Unit.
isIrrelevant :: Domain => VTy -> Bool
isIrrelevant = \case
VUnit -> True
VPi x a b -> fresh x a \_ v -> isIrrelevant (b v)
_ -> False
convNe :: Domain => Ne -> Ne -> Bool
convNe n n' = case (n, n') of
(NVar x , NVar x' ) -> x == x'
(NApp t u, NApp t' u') -> convNe t t' && conv u u'
_ -> False
-- When comparing neutrals, we first try to recursively compare them, and if
-- that fails we check whether the type is irrelevant.
-- Alternatively, we could test relevance first.
-- Actually relying on unit eta is relatively rare, so it makes sense to only check
-- for it on demand.
-- The story could be different in a system with strict Prop, where heavy use of Prop
-- may warrant more eagerness in computing relevance.
conv :: Domain => Val -> Val -> Bool
conv t t' = case (t, t') of
(VU , VU ) -> True
(VUnit , VUnit ) -> True
(VPi x a b , VPi _ a' b' ) -> conv a a' && fresh x a \_ v -> conv (b v) (b' v)
(VLam x a t, VLam x' a' t') -> fresh x a \_ v -> conv (t v) (t' v)
(VNe n a , VNe n' _ ) -> convNe n n' || isIrrelevant a
(VTt , _ ) -> True
(_ , VTt ) -> True
(VLam x a t, t' ) -> fresh x a \_ v -> conv (t v) (vapp t' v)
(t , VLam x' a' t') -> fresh x' a' \_ v -> conv (vapp t v) (t' v)
_ -> False
data EtaOpt = Eta | NoEta deriving (Eq, Show)
type Eta = (?eta :: EtaOpt)
quoteNe :: Domain => Eta => Ne -> Tm
quoteNe = \case
NVar x -> Var x
NApp t u -> App (quoteNe t) (quote u)
etaExpand :: Domain => Eta => Ne -> VTy -> Tm
etaExpand n = \case
VUnit -> Tt
VPi y a b -> let qa = quote a in
fresh y a \y v -> Lam y qa (etaExpand (NApp n v) (b v))
_ -> quoteNe n
quote :: Domain => Eta => Val -> Tm
quote = \case
VNe n a | ?eta == Eta -> etaExpand n a
| otherwise -> quoteNe n
VPi x a b -> let qa = quote a in fresh x a \x v -> Pi x qa (quote (b v))
VLam x a t -> let qa = quote a in fresh x a \x v -> Lam x qa (quote (t v))
VU -> U
VUnit -> Unit
VTt -> Tt
quoteNoEta :: Domain => Val -> Tm
quoteNoEta = let ?eta = NoEta in quote
type ElabM = Either String
check :: Elab => RTm -> VTy -> ElabM Tm
check t a = case (t, a) of
(RLam x t, VPi _ a b) -> do
let ~qa = quoteNoEta a
bindElab x a \x v ->
Lam x qa <$> check t (b v)
(RLet x a t u, b) -> do
a <- check a VU
let va = eval a
t <- check t va
defineElab x va (eval t) \x -> do
u <- check u b
pure (Let x a t u)
(t, a) -> do
(t, a') <- infer t
unless (conv a a') $ do
Left ("type mismatch, expected:\n\n "
++ show (quoteNoEta a)
++ "\n\ninferred:\n\n " ++ show (quoteNoEta a')
++ "\n\nwhile inferring type for"
++ "\n\n " ++ show t)
pure t
infer :: Elab => RTm -> ElabM (Tm, VTy)
infer = \case
RVar x ->
case lookup x ?cxt of
Just (x', a) -> pure (Var x', a)
Nothing -> Left $ "name not in scope: " ++ x
RU ->
pure (U, VU)
RPi x a b -> do
a <- check a VU
bindElab x (eval a) \x _-> do
b <- check b VU
pure (Pi x a b, VU)
t@RLam{} ->
Left "can't infer type for lambda"
RApp t u -> do
(t, a) <- infer t
case a of
VPi x a b -> do
u <- check u a
pure (App t u, b (eval u))
a ->
Left $ show $ quoteNoEta a
RUnit ->
pure (Unit, VU)
RTt ->
pure (Tt, VUnit)
RLet x a t u -> do
a <- check a VU
let va = eval a
t <- check t va
defineElab x va (eval t) \x -> do
(u, b) <- infer u
pure (Let x a t u, b)
nf :: EtaOpt -> RTm -> IO ()
nf eta t = do
let ?env = []
let ?domain = []
let ?eta = eta
let ?cxt = []
case infer t of
Left e -> do
putStrLn "ERROR"
putStrLn e
Right (t, _) -> do
t <- pure $ quote $ eval t
print t
-- Testing
--------------------------------------------------------------------------------
instance IsString RTm where
fromString = RVar
(∙) = RApp; infixl 8
(==>) = RPi "_"; infixr 4 ==>
λ = RLam
u = RU
let_ = RLet
pi = RPi
unit = RUnit
tt = RTt
t1 :: RTm
t1 =
-- Type annotations
let_ "the" (pi "A" u $ "A" ==> "A") (λ "A" $ λ "x" "x") $
-- -- Church numbers
let_ "nat" u (pi "N" u $ ("N" ==> "N") ==> "N" ==> "N") $
-- let_ "zero" "nat" (λ "N" $ λ "s" $ λ "z" "z") $
let_ "suc" ("nat" ==> "nat")
"n" $ λ "N" $ λ "s" $ λ "z" $ "s" ("n" "N" "s" "z")) $
let_ "add" ("nat" ==> "nat" ==> "nat")
"n" $ λ "m" $ λ "N" $ λ "s" $ λ "z"
$ "n" "N" "s" ("m" "N" "s" "z")) $
let_ "mul" ("nat" ==> "nat" ==> "nat")
"n" $ λ "m" $ λ "N" $ λ "s" $ λ "z" $
"n" "N" ("m" "N" "s") "z") $
let_ "n5" "nat""N" $ λ "s" $ λ "z" $
"s" ("s" ("s" ("s" ("s" "z"))))) $
let_ "n2" "nat""N" $ λ "s" $ λ "z" $
("s" ("s" "z"))) $
let_ "n10" "nat" ("mul" "n5" "n2") $
let_ "n100" "nat" ("mul" "n10" "n10") $
-- Leibniz equality
let_ "Id" (pi "A" u $ "A" ==> "A" ==> u)
"A" $ λ "x" $ λ "y" $ pi "P" ("A" ==> u) $ "P" "x" ==> "P" "y") $
let_ "refl" (pi "A" u $ pi "x" "A" $ "Id" "A" "x" "x")
"A" $ λ "x" $ λ "P" $ λ "px" "px") $
-- Testing eta conversion
-- (f : U → U) → Id f (λ x. f x)
let_ "eta1" (pi "f" (u ==> u) $ "Id" (u ==> u) "f" "x" $ "f" "x"))
"f" $ "refl" (u ==> u) "f") $
let_ "ty" u (u ==> (u ==> u) ==> unit) $
-- (f g : U → U → ⊤) → Id f g
let_ "eta2" (pi "f" "ty" $ pi "g" "ty" $ "Id" "ty" "f" "g")
"f" $ λ "g" $ "refl" "ty" "f") $
-- eta-long nf, to be printed with "nf Eta t"
let_ "ty" u (pi "A" u $ "A" ==> (u ==> (unit ==> unit) ==> u) ==> u) $
let_ "etanf" ("ty" ==> "ty") (λ "f" "f") $
"etanf"
t2 :: RTm
t2 =
let_ "the" (pi "A" u $ "A" ==> "A") (λ "A" $ λ "x" "x") $
"the" (pi "A" u $ pi "A" u $ pi "a" "A" $ "A") "A" $ λ "A" $ λ "a" "a")
main :: IO ()
main = nf Eta t1
@ChesterJFGould
Copy link

There's an interesting bug which I ran into in my own language. Type checking then normalizing the following term causes the program to crash due to an unbound variable introduced by check.

b :: RTm
b = 
    let_ "the" (pi "A" u $ "A" ==> "A") (λ "A" $ λ "x" "x") $
    "the" ∙ (pi "A" u $ pi "A" u $ pi "a" "A" $ "A") ∙ (λ "A" $ λ "A" $ λ "a" "a")

-- this will crash
-- nf Eta b

The problem stems from lambdas which don't have a type annotation on their argument being given one by quoting a value. Specifically here the problem is on line 245-246 where a is quoted then added as an annotation to the Lam.

I think the best fix is to rename the argument of the lambda to the generated one, then keep track of the new name in a new field of the Env and replace all references to the argument with the new name while type checking. So for example (λ x. λx. x) : Unit -> Unit -> Unit would become λ x : Unit. λ x' : Unit. x'.

@AndrasKovacs
Copy link
Author

Good point. I usually use de Bruijn where it's harder to trip up on fresh naming. I'll fix this.

@AndrasKovacs
Copy link
Author

Updated, the same way as you suggested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment