Instantly share code, notes, and snippets.

# gregberns/the-algebra-of-algebraic-data-types.md

Created December 13, 2018 05:57
Show Gist options
• Save gregberns/5e9da0c95a9a8d2b6338afe69310b945 to your computer and use it in GitHub Desktop.
The Algebra of Algebraic Data Types, Part 1, by Chris Taylor

NOTE: This is an article by Chris Taylor, originally posted on his blog, but the blog was removed at some point. I resurected it from the WayBack Machine without the authors permission, but post it here for posterity's sake, so others may be able to find it.

# The Algebra of Algebraic Data Types, Part 1

by Chris Taylor

FEB 10TH, 2013

I gave a talk on this subject at London Haskell in November 2012. A video of the talk is on YouTube and slides are on GitHub.

In this series of posts I’ll explain why Haskell’s data types are called algebraic - without mentioning category theory or advanced math.

The algebra you learned in high school starts with numbers (e.g. 1, 2, 3 …) and operators (e.g. addition and multiplication). The operators give you a way to combine numbers and make new numbers from them. For example, combining 1 and 2 with the operation of addition gives you another number, 3 - a fact that we normally express as

``````1+2=3
``````

When you get a little older you are introduced to variables (e.g. x, y, z …) which can stand for numbers. Further still, and you learn about the laws that algebra obeys. Laws like

``````0+x=x
1⋅x=x
``````

which hold for all values of x. There are other laws as well, which define properties of numbers or of operations.

When mathematicans talk about algebra, they mean something more general than this. A mathematical algebra has three parts:

• Objects are the ‘things’ of the algebra. The collection of objects defines what we’re talking about.
• Operations give us ways to combine old things to make new things.
• Laws are relationships between the objects and the operations.

In high school algebra the objects are numbers and the operations are addition, multiplication and friends.

## The algebra of Haskell types

In the algebra of Haskell types, the objects are types, for example `Bool` and `Int`. The operators take types that already exist, and generate new types from them. An example is the type constructor `Maybe`. It’s not a type itself, but you use it to create types - for example `Maybe Bool` and `Maybe Int`, which are types. Another example is `Either`, which creates a new type from two old types - for example `Either Int Bool`.

### Counting

A connection to the more familiar algebra of numbers can be seen by counting the possible values that a type has. Take `Bool`, defined by

``````data Bool = False | True
``````

There are two values that an object of type `Bool` can have – it is either `False` or `True` (technically it could also be `undefined` – a fact that I’m going to ignore for the rest of the post). Loosely, the type `Bool` corresponds to the number ‘2’ in the algebra of numbers.

If `Bool` is 2, then what is 1? It should be a type with only one value. In the computer science literature such a type is often called `Unit` and defined as

``````data Unit = Unit
``````

In Haskell there is already a type with only one value - it’s called `()` (pronounced “unit”). You can’t define it yourself, but if you could it would look like

``````data () = ()
``````

Using this counting analogy, `Int` corresponds to the number 2^32, as this is the number of values of type `Int` (at least, it is on my machine).

In principle we could types corresponding to 3, 4, 5 and so on. Sometimes we might have a genuine need to do this - for example, the type corresponding to 7 is useful for encoding days of the week. But it would be nicer if we could build up new types from old. This is where the operators of the algebra come in.

A type corresponding to addition is

``````data Add a b = AddL a | AddR b
``````

That is, the type `a + b` is a tagged union, holding either an `a` or a `b`. To see why this corresponds to addition, we can revisit the counting argument. Let’s say that `a` is `Bool` and `b` is `()`, so that there are 2 values `a` and 1 value for `b`. How many values of type `Add Bool ()` are there? We can list them out:

``````addValues = [AddL False, AddL True, AddR ()]
``````

There are three values, and 3 = 2 + 1. This is often called a sum type. In Haskell the sum type is often called `Either`, defined as

``````data Either a b = Left a | Right b
``````

but I’ll stick with `Add`.

### Multiplication

A type corresponding to multiplication is

``````data Mul a b = Mul a b
``````

That is, the type `a · b` is a container holding both an `a` and a `b`. The counting argument justifies the correspondence with multiplication - if we fix `a` and `b` to both be `Bool`, the possible values of the type `Mul Bool Bool` are

``````mulValues = [Mul False False, Mul False True, Mul True False, Mul True True]
``````

There are four values, and 4 = 2 x 2. This is often called a product type. In Haskell the product is the pair type:

``````data (,) a b = (a, b)
``````

but I’ll stick with Mul.

### Zero

Using addition and multiplication we can generate types corresponding to all the numbers from 1 upwards - but what about 0? That would be a type with no values. It sounds odd, but you can define such a type:

``````data Void
``````

Notice that there are no constructors in the data definition, so you can’t ever construct a value of type Void – it has zero values, just as we wanted!

## Laws in the algebra of Haskell types

What are the laws for the types we’ve just defined? Just like in the algebra of numbers, a law will assert the equality of two objects - in our case, the objects will be types.

However, when I talk about equality, I don’t mean Haskell equality, in the sense of the `(==)` function. Instead, I mean that the two types are in one-to-one correspondence – that is, when I say that two types a and b are equal, I mean that you could write two functions

``````from :: a -> b
to   :: b -> a
``````

that pair up values of a with values of b, so that the following equations always hold (here the `==` is genuine, Haskell-flavored equality):

``````to (from a) == a
from (to b) == b
``````

For example, I contend that the types `Bool` and `Add () ()` are equivalent. I can demonstrate the equivalence with the following functions:

``````to :: Bool -> Add () ()
``````
``````from :: Add () () -> Bool
``````

I’ll use the triple equality symbol, `===`, to denote this kind of equivalence between types.

## Laws for sum types

Here are two laws for addition:

``````Add Void a === a
``````

which says that there are as many values of type `Add Void a` as there are of type a, and

``````Add a b === Add b a
``````

which says that it doesn’t matter which order you add things in. These laws are probably more familiar to you in the algebra of numbers as

``````0 + x = x
x + y = y + x
``````

If you fancy an exercise, you can demonstrate the correctness of the laws in the Haskell algebra – either with a counting argument, or by writing the functions from and to.

## Laws for product types

There are three useful laws for multiplication:

``````Mul Void a === Void
``````

which says that if you multiply anything by Void, you get Void back,

``````Mul () a === a
``````

which says that if you multiply by () you don’t change anything, and

``````Mul a b === Mul b a
``````

which says that it doesn’t matter which order you multiply in. The more familiar forms of these laws are

``````0 ⋅ x = 0
1 ⋅ x = x
x ⋅ y = y ⋅ x
``````

Two more exercises: (i) prove the validity of these laws in the Haskell algebra, and (ii) explain why we don’t need laws of the form

``````Mul a Void === Void
Mul a ()   === a
``````

There’s also a law that relates the addition and multiplication operators:

``````Mul a (Add b c) === Add (Mul a b) (Mul a c)
``````

This one is a bit trickier to reason about, but writing the corresponding from and to functions isn’t too hard. The arithmetic version of this law is the friendlier-looking

``````a ⋅ (b + c) = a ⋅ b + a ⋅ c
``````

called the distributive law.

## Function types

As well as concrete types like `Int` and `Bool`, in Haskell you also have function types, like `Int -> Bool` or `Double -> String`. How do these fit into the algebra?

To figure this out we can go back to the counting argument. How many functions of type `a → b` are there?

Let’s be concrete, and fix a and b to both be `Bool`. The value `False` can map to either `False` or `True`, and similarly for the value `True` - thus there are `2 ⋅ 2 = 2^2 = 4` possible functions `Bool -> Bool`. To be really explicit, we could enumerate them:

``````f1 :: Bool -> Bool -- equivalent to 'id'
f1 True  = True
f1 False = False

f2 :: Bool -> Bool -- equivalent to 'const False'
f2 _     = False

f3 :: Bool -> Bool -- equivalent to 'const True'
f3 _     = True

f4 :: Bool -> Bool -- equivalent to 'not'
f4 True  = False
f4 False = True
``````

What happens if b is still `Bool` (with two values) and a is a type with three values, say

``````data Trio = First | Second | Third
``````

Then each of `First`, `Second` and `Third` can map to two possible values, and in total there are `2 ⋅ 2 ⋅ 2 = 2^3 = 8` functions of type `Trio -> Bool`.

The same argument holds in general. If there are A values of type a, and B values of type b, then the number of values of type `a → b` is

``````B^A
``````

This justifies the common terminology for function types as exponential types.

## Laws for functions

There are two laws for function types that involve the unit type. They are

``````() -> a === a
``````

which says that there are as many functions `() -> a` as there are values of type a, and

``````a -> () === ()
``````

which says that there is only one function `a -> ()` – in particular, it is `const ()`. The arithmetic versions of these laws are

``````a^1 = a
1^a = 1
``````

There is also a law that allows factoring out of common arguments:

``````(a -> b, a -> c) === a -> (b,c)
``````

whose arithmetic form is

``````b^a ⋅ c^a = (bc)^a
``````

and a law about functions that return other functions:

``````a -> (b -> c) === (b,a) -> c
``````

whose arithmetic form is

``````(c^b)^a = c^(b ⋅ a)
``````

This last law may be more familiar when the order of the variables in the pair on the right-hand side is switched, and the parens on the left hand side are removed:

``````a -> b -> c === (a,b) -> c
``````

which just says that we can curry and uncurry functions. Again, it’s an interesting exercise to prove all of these laws by writing the corresponding to and from functions.

## Next time

In the next post I’ll look at recursive types, like lists and binary trees, and show how you can abuse the algebra of types in all kinds of ways to deduce interesting facts about types.

### chris-taylor commented Oct 12, 2022

i retrospectively give permission for posting this :)