Skip to content

Instantly share code, notes, and snippets.

@WillNess
Last active July 21, 2017 18:16
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save WillNess/a55aedde9c4e2a76cef0b07d573f40d6 to your computer and use it in GitHub Desktop.
Save WillNess/a55aedde9c4e2a76cef0b07d573f40d6 to your computer and use it in GitHub Desktop.
my edit for
https://stackoverflow.com/questions/45225142/the-pattern-with-functions-like-bool-either-etc/45238164#45238164
by https://stackoverflow.com/users/3234959/chi
First, the types you mention are not really GADTs, they are plain ADTs, since the return type of each constructor is always `T a`. A proper GADT would be something like
data T a where
K1 :: T Bool -- not T a
Anyway, the technique you mention is a well known method to encode algebraic data types into (polymorphic) functions. It goes under many names, like Church encoding, Boehm-Berarducci encoding, encoding as a catamorphism, etc. Sometimes the Yoneda lemma is used to justify this approach, but there's no need to understand the category-theoretic machinery to understand the method.
Basically, the idea is the following. All the ADTs can be generated by
- *product* types `(,) a b`
- *sum* types `Either a b`
- *arrow* types `a -> b`
- *unit* type `()`
- *void* type `Void` (rarely used in Haskell, but theoretically nice)
- *variables* (if the type being defined has parameters)
- possibly, *basic* types (`Integer`, ...)
- type level *recursion*
Type level recursion is used when some value constructor takes the type which is being defined as an argument. The classic example is Peano-style naturals:
data Nat where
O :: Nat
S :: Nat -> Nat
-- ^^^ recursive!
Lists are also common:
data List a where
Nil :: List a
Cons :: a -> List a -> List a
-- ^^^^^^ recursive!
Types like `Maybe a`, pairs, etc. are non recursive.
Note that each ADT, recursive or not, can be reduced to a single constructor with a single argument, by summing (`Either`) over all the constructors, while multiplying (`(,)`) all the arguments of each constructor. For instance, `Nat` is isomorphic to
data Nat1 where
O1 :: () -> Nat
S1 :: Nat -> Nat
which is isomorphic to
data Nat2 where K2 :: (Either () Nat) -> Nat
Lists become
data List1 a where K1 :: (Either () (a, List a)) -> List a
The steps above make use of the algebra of types, which makes the sum and products of types obey the same rules as high school algebra, while `a -> b` behaves like the exponential `b^a`<sup>1</sup>.
Hence, we can write any ADT as a pair of `T` and `F`, in the form
-- pseudo code
data T where
K :: F T -> T
type F r = .....
For instance
type F_Nat r = Either () r -- for T = Nat,
-- K_Nat :: F_Nat Nat -> Nat
type F_List_a r = Either () (a, r) -- for T = List a,
-- K_List_a :: F_List_a (List a) -> List a
(Note that the latter type function `F_List_a` depends on `a`, but it's not important right now.)
Non recursive types will not use `r`:
type F_Maybe_a r = Either () a -- for T = Maybe a,
-- K_Maybe_a :: F_Maybe_a (Maybe a) -> Maybe a
Note that constructor `K` above makes the type `T` isomorphic to `F T` (let's ignore the lifting / extra bottom introduced by it). Essentially, we have that
Nat ~= F_Nat Nat = Either () Nat
List a ~= F_List_a (List a) = Either () (a, List a)
Maybe a ~= F_Maybe_a (Maybe a) = Either () a
We can even formalize this further by abstracting from `F`s,
newtype Fix f = Fix { unFix :: f (Fix f) }
By definition `Fix F` will now be isomorphic to `F (Fix F)`. We could let
type Nat = Fix F_Nat -- ~ F_Nat (Fix F_Nat) = F_Nat Nat
type (List a) = Fix F_List_a -- ~ F_List_a (Fix F_List_a) = F_List_a (List a)
type (Maybe a) = Fix F_Maybe_a -- ~ F_Maybe_a (Fix F_Maybe_a) = F_Maybe_a (Maybe a)
(In Haskell, we need newtype wrappers around them, which I omit for clarity.)
Finally, the general encoding, or catamorphism, is:
cata :: (F r -> r) -> Fix F -> r
This assumes that `F` is a functor.
For `Nat ~= Fix F_Nat`, we get
cata :: (Either () r -> r) -> Nat -> r
-- isomorphic to
cata :: (() -> r, r -> r) -> Nat -> r
-- isomorphic to
cata :: (r, r -> r) -> Nat -> r
-- isomorphic to
cata :: r -> (r -> r) -> Nat -> r
Note the "high school albegra" steps, where `r^(1+r) = r^1 * r^r`, hence `Either () r -> r ~= (() -> r, r -> r)`.
Note that we get two arguments, `r` and `r -> r`, which correspond to `O` and `S`. This is not a coincidence -- we summed over all the constructors. This means that `cata` expects to be passed the value of type `r` which "plays the role of `O`" there, and then the value of type `r -> r` which plays the role of `S`.
More informally, `cata` is telling us that, if we want to map a natural in `r`, we only have to state what is the "zero" inside `r` and how to take the "successor" in `r`, and then every `Nat` can be mapped consequently.
For lists we get:
cata :: (Either () (a, r) -> r) -> List a -> r
-- isomorphic to
cata :: (() -> r, (a, r) -> r) -> List a -> r
-- isomorphic to
cata :: (r, a -> r -> r) -> List a -> r
-- isomorphic to
cata :: r -> (a -> r -> r) -> List a -> r
which is `foldr`.
Again, this is `cata` telling us that, if we state how to take the "empty list" in `r` and to "cons" `a` and `r` inside `r`, we can map any list in `r`.
`Maybe a` is the same:
cata :: (Either () a -> r) -> Maybe a -> r
-- isomorphic to
cata :: (() -> r, a -> r) -> Maybe a -> r
-- isomorphic to
cata :: (r, a -> r) -> Maybe a -> r
-- isomorphic to
cata :: r -> (a -> r) -> Maybe a -> r
If we can map `Nothing` in `r`, and perform `Just` mapping `a` in `r`, the we can map any `Maybe a` in `r`.
If we try to apply the same approach to `Bool` and `(a,b)` we reach the functions which were posted in the question.
More advanced theoretical topics to look up:
- (initial) F-algebras in category theory
- eliminators / recursors / induction principles in type theory (these can be applied to GADTs as well)
----
<sup>1</sup> for the intuition about this, think *"how many elements (values) there are in the type"*, e.g. for `Bool -> Either () (Either () ())`, also written as `2 -> 3` in shorthand -- enumerated as a full table, each such function takes one of three values for each of its two possible inputs; thus there are <i>3*3 = 3^2 = 9</i> such functions possible: `9 = length [(x,y) | x<-[1..3], y<-[1..3]]` where each `(x,y)` encodes a function `f_x_y False = x; f_x_y True = y`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment