Last active
July 21, 2017 18:16
-
-
Save WillNess/a55aedde9c4e2a76cef0b07d573f40d6 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
my edit for | |
https://stackoverflow.com/questions/45225142/the-pattern-with-functions-like-bool-either-etc/45238164#45238164 | |
by https://stackoverflow.com/users/3234959/chi | |
First, the types you mention are not really GADTs, they are plain ADTs, since the return type of each constructor is always `T a`. A proper GADT would be something like | |
data T a where | |
K1 :: T Bool -- not T a | |
Anyway, the technique you mention is a well known method to encode algebraic data types into (polymorphic) functions. It goes under many names, like Church encoding, Boehm-Berarducci encoding, encoding as a catamorphism, etc. Sometimes the Yoneda lemma is used to justify this approach, but there's no need to understand the category-theoretic machinery to understand the method. | |
Basically, the idea is the following. All the ADTs can be generated by | |
- *product* types `(,) a b` | |
- *sum* types `Either a b` | |
- *arrow* types `a -> b` | |
- *unit* type `()` | |
- *void* type `Void` (rarely used in Haskell, but theoretically nice) | |
- *variables* (if the type being defined has parameters) | |
- possibly, *basic* types (`Integer`, ...) | |
- type level *recursion* | |
Type level recursion is used when some value constructor takes the type which is being defined as an argument. The classic example is Peano-style naturals: | |
data Nat where | |
O :: Nat | |
S :: Nat -> Nat | |
-- ^^^ recursive! | |
Lists are also common: | |
data List a where | |
Nil :: List a | |
Cons :: a -> List a -> List a | |
-- ^^^^^^ recursive! | |
Types like `Maybe a`, pairs, etc. are non recursive. | |
Note that each ADT, recursive or not, can be reduced to a single constructor with a single argument, by summing (`Either`) over all the constructors, while multiplying (`(,)`) all the arguments of each constructor. For instance, `Nat` is isomorphic to | |
data Nat1 where | |
O1 :: () -> Nat | |
S1 :: Nat -> Nat | |
which is isomorphic to | |
data Nat2 where K2 :: (Either () Nat) -> Nat | |
Lists become | |
data List1 a where K1 :: (Either () (a, List a)) -> List a | |
The steps above make use of the algebra of types, which makes the sum and products of types obey the same rules as high school algebra, while `a -> b` behaves like the exponential `b^a`<sup>1</sup>. | |
Hence, we can write any ADT as a pair of `T` and `F`, in the form | |
-- pseudo code | |
data T where | |
K :: F T -> T | |
type F r = ..... | |
For instance | |
type F_Nat r = Either () r -- for T = Nat, | |
-- K_Nat :: F_Nat Nat -> Nat | |
type F_List_a r = Either () (a, r) -- for T = List a, | |
-- K_List_a :: F_List_a (List a) -> List a | |
(Note that the latter type function `F_List_a` depends on `a`, but it's not important right now.) | |
Non recursive types will not use `r`: | |
type F_Maybe_a r = Either () a -- for T = Maybe a, | |
-- K_Maybe_a :: F_Maybe_a (Maybe a) -> Maybe a | |
Note that constructor `K` above makes the type `T` isomorphic to `F T` (let's ignore the lifting / extra bottom introduced by it). Essentially, we have that | |
Nat ~= F_Nat Nat = Either () Nat | |
List a ~= F_List_a (List a) = Either () (a, List a) | |
Maybe a ~= F_Maybe_a (Maybe a) = Either () a | |
We can even formalize this further by abstracting from `F`s, | |
newtype Fix f = Fix { unFix :: f (Fix f) } | |
By definition `Fix F` will now be isomorphic to `F (Fix F)`. We could let | |
type Nat = Fix F_Nat -- ~ F_Nat (Fix F_Nat) = F_Nat Nat | |
type (List a) = Fix F_List_a -- ~ F_List_a (Fix F_List_a) = F_List_a (List a) | |
type (Maybe a) = Fix F_Maybe_a -- ~ F_Maybe_a (Fix F_Maybe_a) = F_Maybe_a (Maybe a) | |
(In Haskell, we need newtype wrappers around them, which I omit for clarity.) | |
Finally, the general encoding, or catamorphism, is: | |
cata :: (F r -> r) -> Fix F -> r | |
This assumes that `F` is a functor. | |
For `Nat ~= Fix F_Nat`, we get | |
cata :: (Either () r -> r) -> Nat -> r | |
-- isomorphic to | |
cata :: (() -> r, r -> r) -> Nat -> r | |
-- isomorphic to | |
cata :: (r, r -> r) -> Nat -> r | |
-- isomorphic to | |
cata :: r -> (r -> r) -> Nat -> r | |
Note the "high school albegra" steps, where `r^(1+r) = r^1 * r^r`, hence `Either () r -> r ~= (() -> r, r -> r)`. | |
Note that we get two arguments, `r` and `r -> r`, which correspond to `O` and `S`. This is not a coincidence -- we summed over all the constructors. This means that `cata` expects to be passed the value of type `r` which "plays the role of `O`" there, and then the value of type `r -> r` which plays the role of `S`. | |
More informally, `cata` is telling us that, if we want to map a natural in `r`, we only have to state what is the "zero" inside `r` and how to take the "successor" in `r`, and then every `Nat` can be mapped consequently. | |
For lists we get: | |
cata :: (Either () (a, r) -> r) -> List a -> r | |
-- isomorphic to | |
cata :: (() -> r, (a, r) -> r) -> List a -> r | |
-- isomorphic to | |
cata :: (r, a -> r -> r) -> List a -> r | |
-- isomorphic to | |
cata :: r -> (a -> r -> r) -> List a -> r | |
which is `foldr`. | |
Again, this is `cata` telling us that, if we state how to take the "empty list" in `r` and to "cons" `a` and `r` inside `r`, we can map any list in `r`. | |
`Maybe a` is the same: | |
cata :: (Either () a -> r) -> Maybe a -> r | |
-- isomorphic to | |
cata :: (() -> r, a -> r) -> Maybe a -> r | |
-- isomorphic to | |
cata :: (r, a -> r) -> Maybe a -> r | |
-- isomorphic to | |
cata :: r -> (a -> r) -> Maybe a -> r | |
If we can map `Nothing` in `r`, and perform `Just` mapping `a` in `r`, the we can map any `Maybe a` in `r`. | |
If we try to apply the same approach to `Bool` and `(a,b)` we reach the functions which were posted in the question. | |
More advanced theoretical topics to look up: | |
- (initial) F-algebras in category theory | |
- eliminators / recursors / induction principles in type theory (these can be applied to GADTs as well) | |
---- | |
<sup>1</sup> for the intuition about this, think *"how many elements (values) there are in the type"*, e.g. for `Bool -> Either () (Either () ())`, also written as `2 -> 3` in shorthand -- enumerated as a full table, each such function takes one of three values for each of its two possible inputs; thus there are <i>3*3 = 3^2 = 9</i> such functions possible: `9 = length [(x,y) | x<-[1..3], y<-[1..3]]` where each `(x,y)` encodes a function `f_x_y False = x; f_x_y True = y`. | |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment