LightAndLight/Rename.hs

## Rename.hs
{- | Calculating injective renamings

Say you have typing environment like [x : Int, y : Bool]. In your AST,
variables are represented by natural numbers that index into the environment.
So under this environment, `x` is represented by `0` and `y` is represented
by `1`. During type checking, information about a variable can be found by
indexing into that environment. Similarly, the environment of a
closure can be compiled to array, with the variable lookup being compiled to
array indexing. This approach is known as 'de bruijn indices'.

A renaming is a function on variable indices that allows us to reorder or
enlarge an environment while preserving the semantics of the program. For
example, if a program is typable under [x : Int, y : Bool], then it should
also be typable under any permutation of [x : Int, y : Bool, z : String].
Most permutations will require the variable indices to be shuffled in order
to achieve this. If variables `x : Int` and `y : Bool` were represented by
`0` and `1` respectively, then to be compatible with the environment
[y : Bool, z : String, x : Int] `x` needs to be represented by `2`, and `y`
by `1`. So the renaming that embeds [x : Int, y : Bool] into
[y : Bool, z : String, x : Int] is defined by `\case; 0 -> 2, 1 -> 0`.

This snippet calculates such renamings.

Why is this important to me?

Metaprogramming can be done in a well-typed fashion. The core idea is to
have some type `Code a` which represents 'code for a value of type `a`'.
Values of type `a` can be quoted to produces values of type `Code a`, and
values of type `Code a` can be spliced into places expecting values of type
`a`.

Template Haskell and MetaOCaml are the two most prominent systems that let
you do this. Here's a Haskell example:

```
type Code a = Q (TExp a)

two :: Code Int
two = [|| 2 ||] -- quote

three :: Int
three = 1 + $$two -- splice
```

I have two issues with these systems:

1. `Code` should be `Applicative`

   This style of metaprogramming is incredibly promising, especially when
   used to perform staged computation. But I think that the really killer
   applications require a *function* called `reify : a -> Code a`.

   Haskell effectively has `reify : Lift a => a -> Code a`, which isn't
   powerful enough because it can't reify functions. There are good reasons
   why this is the case for Haskell, but it means that progress in staged
   metaprogramming will hit a roadblock in Haskell.

   With a true `reify` function, `Code` becomes `Applicative` with
   `pure = reify`.

   I don't how much effort people have put into achieving this.

   `reify` should also have an inverse, `eval : Code a -> a`, but that leads
   me to the second issue.

2. Scope correctness is difficult

   There are times where it makes sense to be able "quote 'open' code", i.e.
   write a form such as `[|| x ||]`. Here is a contrived example:

   ```
   idCode :: Code (a -> a)
   idCode = [|| \x -> $$(id [|| x ||]) ||]
   ```

   The quoted `x` is bound by the quoted lambda, so this example will result
   in well-scoped code. Since `[|| x ||]` is still a normal value (and Q
   is MonadIO), we can help `x` escape:

   ```
   bad :: Int
   bad = $$(do
     ref <- liftIO $ newIORef [|| 0 ||]
     _ <- [||
       \x -> $$(do
         let xCode = [|| x ||]
         liftIO $ writeIORef ref xCode
         xCode
       )
       ||]
     xCode <- liftIO $ readIORef ref
     xCode
     )
  ```

  Another sort of problem arises from open code when we introduce
  `eval : Code a -> a`. Assuming that Haskell had this function, how would
  the following code evaluate?

  ```
  [|| \x -> $$(let !y = eval [|| x + 1 ||] in [|| y ||]) ||]
  ```

  The splice `let !y = eval [|| x + 1 ||] in [|| y ||]` needs to reduce first,
  but `y` gets stuck: we can't compute `eval [|| x + 1 ||]` because we don't
  know what `x` is.

  The problem of open code has a well-understood solution, using contextual
  modal type theory (CMTT).

Taking inspiration from CMTT, `Code` will now also be parameterised over
its context, or environment. It will have kind `Code : Row -> Type -> Type`,
where the environment is represented by a row (inspired by the row
polymorphism literature). As an example, `[|| \x -> x + y ||]` would have
type `Code {y : Int} (Int -> Int)`.

`reify` now has type `a -> Code {} a`, and `eval : Code {} a -> a`. They both
operate on *closed* terms; terms containing no free variables. But this has
a consequence on the `Applicative` instance.

Now, we have `instance Applicative (Code env)`. The first inconvenience
is that `reify` has the wrong type for `pure`. The second is that now
`(<*>)` only works on code with the same environment. For example, I might
combine `[|| f ||] : Code {f : Int -> Int} (Int -> Int)` and
`[|| x ||] : Code {x : Int} Int` into
`[|| f ||] <*> [|| x ||] : Code {f : Int -> Int, x : Int} Int`.

This is where renamings come in. Enter
`weaken : Renaming env env' => Code env a -> Code env' a`. `weaken`
calculates and applies the relevant renaming to its argument, which embeds
the argument's environment in some larger/permuted environment. For every
environment `env`, there exists a unique renaming `{}` to `env`. This means
we can take `pure = weaken . reify`. To combine code with different
environment we can use `weaken` to embed each environment into a larger,
common one. So given `[|| f ||] : Code {f : Int -> Int} (Int -> Int)` and
`[|| x ||] : Code {x : Int} Int`,
`weaken [|| f ||] <*> weaken [|| x ||] : Code {f : Int -> Int, x : Int} Int`
will compile.

To resolve the `Renaming` constraint, the compiler would do a calculation
similar to the `rename` function in this module, but without the fancy types.
When the constraint can be satisfied, the algorithm constructs a function
that can rename the variable indices appropriately.

-}

{-# language
  AllowAmbiguousTypes, DataKinds, FlexibleInstances, GADTs, KindSignatures,
  MultiParamTypeClasses, ScopedTypeVariables, TypeApplications,
  TypeFamilies, TypeOperators, UndecidableInstances, UndecidableSuperClasses
#-}

import Control.Applicative
import Data.Bifunctor
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Kind
import Data.Proxy
import Data.Type.Equality
import GHC.TypeLits
import Type.Reflection

data Row = Nil | Cons Symbol Type Row

data SRow :: Row -> Type where
  SNil :: SRow 'Nil
  SCons ::
    (KnownSymbol s, NotInRow s r, WellFormed r) =>
    Proxy s ->
    TypeRep t ->
    SRow r ->
    SRow ('Cons s t r)

class WellFormed r => KnownRow (r :: Row) where
  rowVal :: SRow r

instance KnownRow 'Nil where
  rowVal = SNil

instance
  (KnownSymbol s, Typeable t, KnownRow rest, NotInRow s rest) =>
  KnownRow ('Cons s t rest) where
  rowVal = SCons Proxy typeRep rowVal

type family NotInRow (n :: Symbol) (r :: Row) :: Constraint where
  NotInRow n 'Nil = ()
  NotInRow n (Cons n t rest) = TypeError ('Text "Repeated name " :<>: 'ShowType n)
  NotInRow n (Cons n' t rest) = NotInRow n rest

data SomeTypeRep' where
  SomeTypeRep' :: TypeRep (a :: *) -> SomeTypeRep'

class WellFormed (r :: Row) where
  positions :: Map String (SomeTypeRep', Int)

instance WellFormed 'Nil where
  positions = mempty

instance
  (NotInRow n rest, KnownSymbol n, Typeable t, WellFormed rest) =>
  WellFormed ('Cons n t rest) where
  positions =
    Map.insert
      (symbolVal (Proxy :: Proxy n))
      (SomeTypeRep' (typeRep @t), 0)
      (second (1 +) <$> positions @rest)

data RenameError where
  NotInjective :: RenameError
  TypeMismatch :: TypeRep t -> TypeRep t' -> RenameError

rename_ ::
  forall frees frees'.
  (WellFormed frees, WellFormed frees') =>
  SRow frees -> SRow frees' ->
  Either RenameError (Int -> Int)
rename_ r _ = go r (positions @frees')
  where
    go ::
      forall fs.
      WellFormed fs =>
      SRow fs -> Map String (SomeTypeRep', Int) ->
      Either RenameError (Int -> Int)
    go SNil _ = Right $ \_ -> error "unexpected variable"
    go (SCons n t frees) frees'Positions =
      case Map.lookup (symbolVal n) frees'Positions of
        Nothing -> Left NotInjective
        Just (SomeTypeRep' t', ix') ->
          case testEquality t t' of
            Nothing -> Left $ TypeMismatch t t'
            Just Refl -> do
              f <- go frees frees'Positions
              Right $ \ix -> if ix == 0 then ix' else f (ix-1)

rename ::
  forall frees frees'.
  (KnownRow frees, KnownRow frees') =>
  Either RenameError (Int -> Int)
rename = rename_ (rowVal @frees) (rowVal @frees')

main = do
  let
    from1 = SCons (Proxy :: Proxy "x") (typeRep @Int) SNil
    to1 =
      SCons (Proxy :: Proxy "y") (typeRep @Bool) $
      SCons (Proxy :: Proxy "x") (typeRep @Int) $
      SNil
    Right rn1 = rename_ from1 to1
  putStrLn . unlines $
    [ "0 -> " <> show (rn1 0)
    ]

  putStrLn "---"

  let
    from2 = SCons (Proxy :: Proxy "x") (typeRep @Int) SNil
    to2 =
      SCons (Proxy :: Proxy "x") (typeRep @Int) $
      SCons (Proxy :: Proxy "y") (typeRep @Bool) $
      SNil
    Right rn2 = rename_ from2 to2
  putStrLn . unlines $
    [ "0 -> " <> show (rn2 0)
    ]

  putStrLn "---"

  let
    from3 =
      SCons (Proxy :: Proxy "x") (typeRep @Int) $
      SCons (Proxy :: Proxy "y") (typeRep @Bool) $
      SNil
    to3 =
      SCons (Proxy :: Proxy "y") (typeRep @Bool) $
      SCons (Proxy :: Proxy "x") (typeRep @Int) $
      SNil
    Right rn3 = rename_ from3 to3
  putStrLn . unlines $
    [ "0 -> " <> show (rn3 0)
    , "1 -> " <> show (rn3 1)
    ]

  putStrLn "---"

  let
    Left NotInjective =
      rename
        @(Cons "x" Int (Cons "y" Bool Nil))
        @(Cons "y" Bool Nil)
  putStrLn "NotInjective"

  putStrLn "---"

  let
    Left (TypeMismatch _ _) =
      rename
        @(Cons "x" Int (Cons "y" Bool Nil))
        @(Cons "y" Bool (Cons "x" Bool Nil))
  putStrLn "TypeMismatch"

  putStrLn "---"

  let
    Right rn4 =
      rename
        @(Cons "x" Int (Cons "y" Bool (Cons "z" [Int] Nil)))
        @(Cons "z" [Int] (Cons "x" Int (Cons "w" [Bool] (Cons "y" Bool Nil))))
  putStrLn . unlines $
    [ "0 -> " <> show (rn4 0)
    , "1 -> " <> show (rn4 1)
    , "2 -> " <> show (rn4 2)
    ]
	{- \| Calculating injective renamings

	Say you have typing environment like [x : Int, y : Bool]. In your AST,
	variables are represented by natural numbers that index into the environment.
	So under this environment, `x` is represented by `0` and `y` is represented
	by `1`. During type checking, information about a variable can be found by
	indexing into that environment. Similarly, the environment of a
	closure can be compiled to array, with the variable lookup being compiled to
	array indexing. This approach is known as 'de bruijn indices'.

	A renaming is a function on variable indices that allows us to reorder or
	enlarge an environment while preserving the semantics of the program. For
	example, if a program is typable under [x : Int, y : Bool], then it should
	also be typable under any permutation of [x : Int, y : Bool, z : String].
	Most permutations will require the variable indices to be shuffled in order
	to achieve this. If variables `x : Int` and `y : Bool` were represented by
	`0` and `1` respectively, then to be compatible with the environment
	[y : Bool, z : String, x : Int] `x` needs to be represented by `2`, and `y`
	by `1`. So the renaming that embeds [x : Int, y : Bool] into
	[y : Bool, z : String, x : Int] is defined by `\case; 0 -> 2, 1 -> 0`.

	This snippet calculates such renamings.

	Why is this important to me?

	Metaprogramming can be done in a well-typed fashion. The core idea is to
	have some type `Code a` which represents 'code for a value of type `a`'.
	Values of type `a` can be quoted to produces values of type `Code a`, and
	values of type `Code a` can be spliced into places expecting values of type
	`a`.

	Template Haskell and MetaOCaml are the two most prominent systems that let
	you do this. Here's a Haskell example:

	```
	type Code a = Q (TExp a)

	two :: Code Int
	two = [\|\| 2 \|\|] -- quote

	three :: Int
	three = 1 + $$two -- splice
	```

	I have two issues with these systems:

	1. `Code` should be `Applicative`

	This style of metaprogramming is incredibly promising, especially when
	used to perform staged computation. But I think that the really killer
	applications require a function called `reify : a -> Code a`.

	Haskell effectively has `reify : Lift a => a -> Code a`, which isn't
	powerful enough because it can't reify functions. There are good reasons
	why this is the case for Haskell, but it means that progress in staged
	metaprogramming will hit a roadblock in Haskell.

	With a true `reify` function, `Code` becomes `Applicative` with
	`pure = reify`.

	I don't how much effort people have put into achieving this.

	`reify` should also have an inverse, `eval : Code a -> a`, but that leads
	me to the second issue.

	2. Scope correctness is difficult

	There are times where it makes sense to be able "quote 'open' code", i.e.
	write a form such as `[\|\| x \|\|]`. Here is a contrived example:

	```
	idCode :: Code (a -> a)
	idCode = [\|\| \x -> $$(id [\|\| x \|\|]) \|\|]
	```

	The quoted `x` is bound by the quoted lambda, so this example will result
	in well-scoped code. Since `[\|\| x \|\|]` is still a normal value (and Q
	is MonadIO), we can help `x` escape:

	```
	bad :: Int
	bad = $$(do
	ref <- liftIO $ newIORef [\|\| 0 \|\|]
	_ <- [\|\|
	\x -> $$(do
	let xCode = [\|\| x \|\|]
	liftIO $ writeIORef ref xCode
	xCode
	)
	\|\|]
	xCode <- liftIO $ readIORef ref
	xCode
	)
	```

	Another sort of problem arises from open code when we introduce
	`eval : Code a -> a`. Assuming that Haskell had this function, how would
	the following code evaluate?

	```
	[\|\| \x -> $$(let !y = eval [\|\| x + 1 \|\|] in [\|\| y \|\|]) \|\|]
	```

	The splice `let !y = eval [\|\| x + 1 \|\|] in [\|\| y \|\|]` needs to reduce first,
	but `y` gets stuck: we can't compute `eval [\|\| x + 1 \|\|]` because we don't
	know what `x` is.

	The problem of open code has a well-understood solution, using contextual
	modal type theory (CMTT).

	Taking inspiration from CMTT, `Code` will now also be parameterised over
	its context, or environment. It will have kind `Code : Row -> Type -> Type`,
	where the environment is represented by a row (inspired by the row
	polymorphism literature). As an example, `[\|\| \x -> x + y \|\|]` would have
	type `Code {y : Int} (Int -> Int)`.

	`reify` now has type `a -> Code {} a`, and `eval : Code {} a -> a`. They both
	operate on closed terms; terms containing no free variables. But this has
	a consequence on the `Applicative` instance.

	Now, we have `instance Applicative (Code env)`. The first inconvenience
	is that `reify` has the wrong type for `pure`. The second is that now
	`(<*>)` only works on code with the same environment. For example, I might
	combine `[\|\| f \|\|] : Code {f : Int -> Int} (Int -> Int)` and
	`[\|\| x \|\|] : Code {x : Int} Int` into
	`[\|\| f \|\|] <*> [\|\| x \|\|] : Code {f : Int -> Int, x : Int} Int`.

	This is where renamings come in. Enter
	`weaken : Renaming env env' => Code env a -> Code env' a`. `weaken`
	calculates and applies the relevant renaming to its argument, which embeds
	the argument's environment in some larger/permuted environment. For every
	environment `env`, there exists a unique renaming `{}` to `env`. This means
	we can take `pure = weaken . reify`. To combine code with different
	environment we can use `weaken` to embed each environment into a larger,
	common one. So given `[\|\| f \|\|] : Code {f : Int -> Int} (Int -> Int)` and
	`[\|\| x \|\|] : Code {x : Int} Int`,
	`weaken [\|\| f \|\|] <*> weaken [\|\| x \|\|] : Code {f : Int -> Int, x : Int} Int`
	will compile.

	To resolve the `Renaming` constraint, the compiler would do a calculation
	similar to the `rename` function in this module, but without the fancy types.
	When the constraint can be satisfied, the algorithm constructs a function
	that can rename the variable indices appropriately.

	-}

	{-# language
	AllowAmbiguousTypes, DataKinds, FlexibleInstances, GADTs, KindSignatures,
	MultiParamTypeClasses, ScopedTypeVariables, TypeApplications,
	TypeFamilies, TypeOperators, UndecidableInstances, UndecidableSuperClasses
	#-}

	import Control.Applicative
	import Data.Bifunctor
	import Data.Map (Map)
	import qualified Data.Map as Map
	import Data.Kind
	import Data.Proxy
	import Data.Type.Equality
	import GHC.TypeLits
	import Type.Reflection

	data Row = Nil \| Cons Symbol Type Row

	data SRow :: Row -> Type where
	SNil :: SRow 'Nil
	SCons ::
	(KnownSymbol s, NotInRow s r, WellFormed r) =>
	Proxy s ->
	TypeRep t ->
	SRow r ->
	SRow ('Cons s t r)

	class WellFormed r => KnownRow (r :: Row) where
	rowVal :: SRow r

	instance KnownRow 'Nil where
	rowVal = SNil

	instance
	(KnownSymbol s, Typeable t, KnownRow rest, NotInRow s rest) =>
	KnownRow ('Cons s t rest) where
	rowVal = SCons Proxy typeRep rowVal

	type family NotInRow (n :: Symbol) (r :: Row) :: Constraint where
	NotInRow n 'Nil = ()
	NotInRow n (Cons n t rest) = TypeError ('Text "Repeated name " :<>: 'ShowType n)
	NotInRow n (Cons n' t rest) = NotInRow n rest

	data SomeTypeRep' where
	SomeTypeRep' :: TypeRep (a :: *) -> SomeTypeRep'

	class WellFormed (r :: Row) where
	positions :: Map String (SomeTypeRep', Int)

	instance WellFormed 'Nil where
	positions = mempty

	instance
	(NotInRow n rest, KnownSymbol n, Typeable t, WellFormed rest) =>
	WellFormed ('Cons n t rest) where
	positions =
	Map.insert
	(symbolVal (Proxy :: Proxy n))
	(SomeTypeRep' (typeRep @t), 0)
	(second (1 +) <$> positions @rest)

	data RenameError where
	NotInjective :: RenameError
	TypeMismatch :: TypeRep t -> TypeRep t' -> RenameError

	rename_ ::
	forall frees frees'.
	(WellFormed frees, WellFormed frees') =>
	SRow frees -> SRow frees' ->
	Either RenameError (Int -> Int)
	rename_ r _ = go r (positions @frees')
	where
	go ::
	forall fs.
	WellFormed fs =>
	SRow fs -> Map String (SomeTypeRep', Int) ->
	Either RenameError (Int -> Int)
	go SNil _ = Right $ \_ -> error "unexpected variable"
	go (SCons n t frees) frees'Positions =
	case Map.lookup (symbolVal n) frees'Positions of
	Nothing -> Left NotInjective
	Just (SomeTypeRep' t', ix') ->
	case testEquality t t' of
	Nothing -> Left $ TypeMismatch t t'
	Just Refl -> do
	f <- go frees frees'Positions
	Right $ \ix -> if ix == 0 then ix' else f (ix-1)

	rename ::
	forall frees frees'.
	(KnownRow frees, KnownRow frees') =>
	Either RenameError (Int -> Int)
	rename = rename_ (rowVal @frees) (rowVal @frees')

	main = do
	let
	from1 = SCons (Proxy :: Proxy "x") (typeRep @Int) SNil
	to1 =
	SCons (Proxy :: Proxy "y") (typeRep @Bool) $
	SCons (Proxy :: Proxy "x") (typeRep @Int) $
	SNil
	Right rn1 = rename_ from1 to1
	putStrLn . unlines $
	[ "0 -> " <> show (rn1 0)
	]

	putStrLn "---"

	let
	from2 = SCons (Proxy :: Proxy "x") (typeRep @Int) SNil
	to2 =
	SCons (Proxy :: Proxy "x") (typeRep @Int) $
	SCons (Proxy :: Proxy "y") (typeRep @Bool) $
	SNil
	Right rn2 = rename_ from2 to2
	putStrLn . unlines $
	[ "0 -> " <> show (rn2 0)
	]

	putStrLn "---"

	let
	from3 =
	SCons (Proxy :: Proxy "x") (typeRep @Int) $
	SCons (Proxy :: Proxy "y") (typeRep @Bool) $
	SNil
	to3 =
	SCons (Proxy :: Proxy "y") (typeRep @Bool) $
	SCons (Proxy :: Proxy "x") (typeRep @Int) $
	SNil
	Right rn3 = rename_ from3 to3
	putStrLn . unlines $
	[ "0 -> " <> show (rn3 0)
	, "1 -> " <> show (rn3 1)
	]

	putStrLn "---"

	let
	Left NotInjective =
	rename
	@(Cons "x" Int (Cons "y" Bool Nil))
	@(Cons "y" Bool Nil)
	putStrLn "NotInjective"

	putStrLn "---"

	let
	Left (TypeMismatch _ _) =
	rename
	@(Cons "x" Int (Cons "y" Bool Nil))
	@(Cons "y" Bool (Cons "x" Bool Nil))
	putStrLn "TypeMismatch"

	putStrLn "---"

	let
	Right rn4 =
	rename
	@(Cons "x" Int (Cons "y" Bool (Cons "z" [Int] Nil)))
	@(Cons "z" [Int] (Cons "x" Int (Cons "w" [Bool] (Cons "y" Bool Nil))))
	putStrLn . unlines $
	[ "0 -> " <> show (rn4 0)
	, "1 -> " <> show (rn4 1)
	, "2 -> " <> show (rn4 2)
	]