Skip to content

Instantly share code, notes, and snippets.

@gelisam
Created January 6, 2022 05:19
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save gelisam/176181aacda7f41fa643b50965ff2424 to your computer and use it in GitHub Desktop.
Save gelisam/176181aacda7f41fa643b50965ff2424 to your computer and use it in GitHub Desktop.
Defining a recursive datatype by its differences with another recursive datatype
-- | In response to https://twitter.com/_gilmi/status/1478846678409035779
--
-- The challenge is three-fold:
--
-- 1. define the type 'Expr2' as "the same as 'Expr1' but with this constructor
-- instead of that one", without having to repeat all the other constructors.
-- 2. convert from 'Expr1' to 'Expr2' by only providing a translation for the
-- one constructor which is different.
-- 3. write a polymorphic function which works with both 'Expr1' and 'Expr2'
-- because it doesn't touch the constructor which differs.
--
-- As you can guess from the list of language extensions, this is going to be a
-- wild ride.
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
-- Before I dive into the dark magic which makes this possible, let me show the
-- end result.
-- 1. define the type 'Expr2' as "the same as 'Expr1' but with this constructor
-- instead of that one", without having to repeat all the other constructors.
data SingleLamF r = SingleLam String r deriving Functor
data MultiLamF r = MultiLam [String] r deriving Functor
data AppF r = App r [r] deriving Functor
type Expr1 = Fix '[SingleLamF, AppF]
type Expr2 = ReplaceCtor SingleLamF MultiLamF Expr1
-- Obviously, this is a very unconventional way to define 'Expr1' and 'Expr2',
-- but the result is morally equivalent to this:
--
-- > data Expr1
-- > = SingleLam String Expr1
-- > | App Expr1 [Expr1]
-- >
-- > data Expr2
-- > = MultiLam [String] Expr2
-- > | App Expr2 [Expr2]
-- 2. convert from 'Expr1' to 'Expr2' by only providing a translation for the
-- one constructor which is different.
oneToTwo
:: Expr1 -> Expr2
oneToTwo
= replaceCtor @SingleLamF @MultiLamF $ \(SingleLam x e)
-> MultiLam [x] e
-- Pretty self-explanatory: we convert from 'Expr1' to 'Expr2' by converting
-- the 'SingleLam' constructor to the 'MultiLam' constructor.
-- 3. write a polymorphic function which works with both 'Expr1' and 'Expr2'
-- because it doesn't touch the constructor which differs.
reverseArgs
:: HasCtor fix AppF
=> fix -> fix
reverseArgs
= editCtor @AppF $ \(App e args)
-> App e (reverse args)
-- Pretty self-explanatory: this polymorphic function works with any of our
-- fancily-defined @Fix '[...]@ datatypes, as long as that list contains the
-- constructor 'AppF'.
-- Let's demonstrate that it works with both 'Expr1' and 'Expr2':
reverseArgs1
:: Expr1 -> Expr1
reverseArgs1
= reverseArgs
reverseArgs2
:: Expr2 -> Expr2
reverseArgs2
= reverseArgs
-- All right, now that we're suitably impressed, let's take a look under the
-- hood.
-- First, note that even though the 'App' constructor is ostensibly unchanged,
-- @App Expr1 [Expr1]@ is actually different from @App Expr2 [Expr2]@ in that
-- the recursive type is different. We thus need to parameterize our
-- constructors by the type 'r' of the recursive datatype. This is the same
-- idea as a "base functor" in recursion-schemes.
--
-- > data AppF r = App r [r] deriving Functor
-- Just like in recursion-schemes, we can then turn a base functor into a
-- recursive datatype by defining a 'Fix' datatype which replaces that 'r' with
-- itself.
--
-- > newtype Fix f = Fix
-- > { unFix :: f (Fix f) }
-- But wait! We don't want a recursive datatype which only has 'App' as its
-- sole constructor, we want to allow any constructor from a given list. We
-- thus use an extensible sum, 'CoRec', to turn our list of functors into a
-- single functor.
newtype Fix ctors = Fix
{ unFix :: CoRec ctors (Fix ctors)
}
-- @CoRec '[f, g, h] a@ is isomorphic to @Either (f a) (Either (g a) (h a))@.
data CoRec ctors r where
Here
:: ctor r -> CoRec (ctor ': ctors) r
There
:: CoRec ctors r -> CoRec (ctor ': ctors) r
-- Now that our types are defined via a type-level list of constructors,
-- defining one type in terms of another type is pretty straightforward: just
-- write a type family which transforms one list into a different list.
type family ReplaceCtor needle replacement fixCtors where
ReplaceCtor needle replacement (Fix ctors)
= Fix (Replace (Find needle ctors) replacement ctors)
type family Find (needle :: k)
(xs :: [k])
:: Nat
where
Find needle (needle ': xs)
= 'Z
Find needle (x ': xs)
= 'S (Find needle xs)
type family Replace (i :: Nat)
(replacement :: k)
(xs :: [k])
:: [k]
where
Replace 'Z replacement (_ ': xs)
= replacement ': xs
Replace ('S i) replacement (x ': xs)
= x ': Replace i replacement xs
-- Note that we're using an inductively-defined 'Nat', we're not using the
-- builtin type-level 'Nat' from "GHC.TypeLits". This is because I'll write
-- instances for @Z@ and @S i@ in a moment, and writing instances for @0@ and
-- @i + 1@ is more difficult because @i + 1@ is an expression.
data Nat
= Z
| S Nat
-- The instances in question are for 'replaceCtor', which has a pretty long
-- type I'll explain in a moment.
replaceCtor
:: forall needle replacement fix1 fix2 ctors1 i
. ( fix1 ~ Fix ctors1
, fix2 ~ ReplaceCtor needle replacement fix1
, i ~ Find needle ctors1
, ReplaceCtorImpl i needle replacement ctors1 fix1 fix2
)
=> (needle fix2 -> replacement fix2)
-> fix1 -> fix2
replaceCtor fNeedle
= fFix
where
fFix :: fix1 -> fix2
fFix (Fix corec1)
= Fix (replaceCtorImpl @i @needle @replacement fFix fNeedle corec1)
-- Recall that we called 'replaceCtor' as
--
-- > replaceCtor @SingleLamF @MultiLamF ...
--
-- Thus the first two type variables at the beginning of the 'forall', 'needle'
-- and 'replacement', are respectively the constructor we're replacing and the
-- constructor we're replacing it with. If we knew that the only other
-- constructor was 'AppF', the type of 'replaceCtor' would be a lot simpler:
--
-- > replaceCtor
-- > :: forall needle replacement
-- > . (needle (Fix '[needle, App]) -> replacement (Fix '[replacement, App]))
-- > -> Fix '[needle, App] -> Fix '[replacement, App]
--
-- which can be further simplified by introducing temporary type synonyms:
--
-- > replaceCtor
-- > :: forall needle replacement fix1 fix2
-- > . ( fix1 ~ Fix '[needle, App]
-- > , fix2 ~ Fix '[replacement, App]
-- > )
-- > => (needle fix1 -> replacement fix2)
-- > -> fix1 -> fix2
--
-- Of course, we want to support a lot more constructor lists than that;
-- namely, any two lists 'ctors1' and 'ctors2' which differ only at one
-- location 'i', where 'ctors1' contains 'needle' at that index and 'ctors2'
-- contains 'replacement' instead. We can already express this using our
-- existing type families, without even having to name 'ctors2'.
--
-- > replaceCtor
-- > :: forall needle replacement fix1 fix2 ctors1
-- > . ( fix1 ~ Fix ctors1
-- > , fix2 ~ ReplaceCtor needle replacement fix1
-- > )
-- > => (needle fix2 -> replacement fix2)
-- > -> fix1 -> fix2
--
-- Finally, the last two constraints are needed by the implementation.
--
-- > replaceCtor
-- > :: forall needle replacement fix1 fix2 ctors1 i
-- > . ( ...
-- > , i ~ Find needle ctors1
-- > , ReplaceCtorImpl i needle replacement ctors1 fix1 fix2
-- > ) ...
--
-- 'i' is again a temporary type synonym, while 'ReplaceCtorImpl' is a
-- typeclass which recursively constructs an implementation by using two
-- non-overlapping instances to match on the 'Z' and 'S' constructors of 'i'.
-- Let's look at that 'ReplaceCtorImpl' typeclass next. It it ostensibly a
-- multi-parameter class, but in practice it is only the 'i' parameter which is
-- used to select the instance.
class ReplaceCtorImpl i needle replacement ctors1 fix1 fix2 where
replaceCtorImpl
:: ( i ~ Find needle ctors1
, ctors2 ~ Replace i replacement ctors1
)
=> (fix1 -> fix2)
-> (needle fix2 -> replacement fix2)
-> CoRec ctors1 fix1
-> CoRec ctors2 fix2
-- The method 'replaceCtorImpl' again defines temporary type synonyms, 'i' and
-- 'ctors2'. It takes a @fix1 -> fix2@ function for recursively transforming
-- the subtrees (it is 'replaceCtor' which ties the knot by defining 'fFix' in
-- terms of 'fFix'), and a function @needle fix2 -> replacement fix2@ to
-- replace the constructor after its recursive occurrences of 'fix1' have
-- already been replaced by 'fix2'.
-- There are two instances of 'ReplaceCtorImpl' one for 'Z' and one for 'S'.
instance ( ctors1 ~ (needle ': xs)
, Functor needle
)
=> ReplaceCtorImpl 'Z needle replacement ctors1 fix1 fix2
where
replaceCtorImpl fFix fNeedle (Here needle)
= Here $ fNeedle $ fmap fFix $ needle
instance ( ctors1 ~ (x ': xs)
, Functor x
, i ~ Find needle xs
, ReplaceCtorImpl i needle replacement xs fix1 fix2
)
=> ReplaceCtorImpl ('S i) needle replacement ctors1 fix1 fix2
where
replaceCtorImpl fFix _fNeedle (Here x)
= Here $ fmap fFix $ x
replaceCtorImpl fFix fNeedle (There y)
= There $ replaceCtorImpl @i @needle @replacement fFix fNeedle y
-- The body of the 'replaceCtorImpl' implementations are fairly
-- straightforward: 'fix1' and 'needle' are converted to 'fix2' and
-- 'replacement' as needed, and we recur on a smaller list of constructors.
-- It's the long list of constraints which is less straightforward.
--
-- Most of those constraints repeat the implementation of 'Find' and 'Replace',
-- and are guaranteed to hold because of the way in which we have computed 'i'.
-- It's just not obvious to ghc that they are guaranteed to hold, so we have to
-- write them down explicitly, but they will be discharged automatically once
-- ghc is given concrete constructor lists, because ghc will then be able to
-- run the type families and confirm that the equalities hold. See [1] for more
-- details about this limitation, and a very long list of workarounds.
--
-- The only constraints which are not guaranteed to hold are the Functor
-- constraints. Thus, in practice, a 'ReplaceCtorImpl' constraint asks that all
-- the base functors in the list of constructors have a Functor instance, which
-- seems reasonable.
--
-- [1] https://github.com/gelisam/typelevel-rewrite-rules#the-problem
-- Finally, let's look at 'editCtor'. This one is really easy:
--
-- > editCtor @App
--
-- is simply
--
-- > replaceCtor @App @App
--
-- But in order to hide 'replaceCtor''s pile of constraints from the API, I use
-- a variant of the "constraint synonym trick":
--
-- > class (Eq a, Ord a) => EqAndOrd a
-- > instance (Eq a, Ord a) => EqAndOrd a
--
-- Except that I only need the synonym to go from 'HasCtor' to its definition,
-- not the other way around, so I only need to put the constraints on the
-- instance.
class HasCtor fix needle where
editCtorImpl
:: (needle fix -> needle fix)
-> fix -> fix
instance ( fix ~ Fix ctors
, i ~ Find needle ctors
, Replace i needle ctors ~ ctors
, ReplaceCtorImpl i needle needle ctors fix fix
)
=> HasCtor fix needle
where
editCtorImpl
= replaceCtor
editCtor
:: forall needle fix
. HasCtor fix needle
=> (needle fix -> needle fix)
-> fix -> fix
editCtor
= editCtorImpl @fix @needle
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment