jaspervdj-luminal/CyclicHash.lhs

## CyclicHash.lhs
Cryptographic hashes and dependency cycles
==========================================

Introduction
------------

_"Much more than encryption algorithms, one-way hash functions are the workhorses
of modern cryptography." - Bruce Schneier_

NOTE: The above should be formatted as a quote. I don't know if we have anything
special for that.

Cryptographic hashes (or _one-way hash functions_) allow us to compute a digest
that uniquely identifies a resource.  If we would make a small change anywhere
in a resource, its digest also changes (drastically, because of the [Avalanche]
effect).

[Avalanche]: https://en.wikipedia.org/wiki/Avalanche_effect

This makes them very practical for detecting changes in applications that deal
with dependency trees.  If we include the cryptographic hashes of the
dependencies of a resource in its cryptographic hash, we have a cheap way to
check if a resource _or any of its dependencies_ was changed.

This is very commonly used in the software world.  For example, think of git
hashes, blockchains, and the [nix] build system.

[nix]: https://nixos.org/nix/

Implementing this is theoretically not that hard, given a good cryptographic
library.  However, it gets trickier when we want to allow dependency cycles.  At
that point, we get a bit of a chicken-and-egg probblem: we need resource _A_s
hash to compute resource _B_s hash, but that in turn depends on _B_s hash again!

Cyclical cloud resources
------------------------

Let's start by giving a more concrete example of how we encountered this
problem.  In [Fugue], we are dealing with constructing complicated cloud
infrastructure.  Cyclical dependencies between resources inevitably pop up once
in a while.

[Fugue]: https://fugue.co/

For example, it is not uncommon two have two [Security Groups] that refer to
each other.  Imagine you are have a _default_ security group, which contains the
core of your application, and a _monitoring_ security group which performs
health checks.

[Security Groups]: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html

- You want to allow traffic from _monitoring_ to _default_ in order to perform
  external sanity checks;
- but you also want to allow traffic from _default_ to _monitoring_, in order to
  perform heartbeat-like checks, or to have your application send detailed
  information.

Configuring these in mainstream configuration languages often leads to annoying
cyclic dependency issues, which in turn require annoying and verbose
workarounds.

In Fugue, we use a statically typed, declarative programming language called
_Ludwig_ to configure infrastructure.  This allows us to solve this problem in a
neat way.  In this blogpost, we study how we can solved part of the problem:
detecting changes in these possibly cyclical dependency graphs.

Literate Haskell
----------------

This blogpost is written in Literate Haskell.  This is an implementation of
[literate programming] for the [Haskell] programming language.  The idea of
literate crogramming was coined a long time ago by Dr. Donald Knuth, who cited
[CWEB] as his favorite programming language.

[literate programming]: https://en.wikipedia.org/wiki/Literate_programming
[Haskell]: http://haskell.org
[CWEB]: http://www-cs-faculty.stanford.edu/~uno/cweb.html

In literate programming, we communicate an idea by presenting a human-readable
text interspersed with snippets of code.  This can be considered as a sort of
dual to _"normal"_ programming, where we intersperse our code with comment
snippets.

This is perfect for technical blogposts (and of course [we really like
Haskell]).

[we really like Haskell]: https://blog.fugue.co/2016-01-27-our-functional-future-or-how-i-learned-to-stop-worrying-and-love-haskell.html

As a result, you can compile this blogpost easily with a Haskell compiler, or
even better, load it into the [GHCi REPL] and play around with it.  In order to
do that, download the [code to this blogpost] and load it in GHCi:

    $ ghci cyclic-hash.lhs
    GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
    [1 of 1] Compiling Data.CyclicHash  ( cyclic-hash.lhs, interpreted )
    Ok, modules loaded: Data.CyclicHash.
    *Data.CyclicHash>

[GHCi REPL]: https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/ghci.html
[code to this blogpost]: TODO

In order for this to blogpost work as a standalone Haskell module, we need to do
some setup first.  We start with a reasonably standard Haskell module header:
first we declare language pragmas we need.

> {-# LANGUAGE BangPatterns          #-}
> {-# LANGUAGE FlexibleContexts      #-}
> {-# LANGUAGE GADTs                 #-}
> {-# LANGUAGE MultiParamTypeClasses #-}
> {-# LANGUAGE TypeFamilies          #-}

Then we describe the exported interface of this module.  Following good
technical design standards, we try to keep this to a minimum.

> module Data.CyclicHash
>     ( CyclicHash (..)
>     , HashField (..)
>     , cyclicHashes
>

We also have some exports for testing:

>     , Username
>     , Person (..)
>     , people
>     , showCyclicHashes
>     ) where

Lastly we have to import a number of other modules we rely on.  Most of these
modules ship with the base library provided by the compiler, and the others are
provided by three very commonly used libraries: [containers], [hashable] and
[unordered-containers].

[containers]: https://hackage.haskell.org/package/containers
[hashable]: https://hackage.haskell.org/package/hashable
[unordered-containers]: https://hackage.haskell.org/package/unordered-containers

> import           Data.Hashable       (Hashable, hashWithSalt)
> import           Data.List           (foldl')
> import           Data.Map.Strict     (Map)
> import qualified Data.Map.Strict     as Map
> import           Data.Maybe          (fromMaybe, maybeToList)
> import           Text.Printf         (printf)
> import qualified Data.Graph          as G

Problem definition
------------------

In this blogpost, we will look at a simple example of cyclical dependencies.  We
are modeling a trust graph, and we want to compute cryptographic hashes for all
the people in this trust graph.  Because trust can be mutual, we quickly
encounter the problem of dependency cycles here.

Let's look at the datatype we will use to model our problem:

> type Username = String

> data Person = Person
>     { pUserName :: Username
>     , pFullName :: String
>     , pTrusts   :: [Username]
>     }

A person has a `Username` (which we will use as key) and a full name. A person
also has a list of other people he or she *trusts*. The cryptographic hash of a
`Person` consists of the username, the full name, and *the hashes of the people
this person trusts*.

The CyclicHash class
--------------------

We start out with a class that captures the essence of computing hashes for
cyclic data structures.

We need an associated type synonym, `Key`, which gives us a unique key for the
value. We also have a corresponding `key` function to obtain this key.

Usually the hash of a datatype is a hash of all the fields in that
datastructure. This is why we have the second important function, `fields`.  It
returns a list of all the fields we want to hash.

> class (Hashable (Key a), Ord (Key a)) => CyclicHash a where
>     type Key a
>     key    :: a -> Key a
>     fields :: a -> [HashField a]

The fields we want to hash are represented using an intermediate datatype.

We make a distinction between two kinds of fields:

- Primitive fields (`Int`, `String`, and other things that have a reasonable
  `Hashable` instance). These just contain the value that needs to be hashed.
- Dependency fields which refer to other values. These are the ones we need to
  be careful about, since they can introduce dependency cycles.

> data HashField a where
>     PrimitiveField  :: Hashable b => b -> HashField a
>     DependencyField :: Key a -> HashField a

We can now write down the `CyclicHash` instance for `Person`. It is fairly
straightforward: `pUserName` and `pFullName` contribute to the hash as primitive
fields, and we create a `DependencyField` for every person in `pTrusts`.

> instance CyclicHash Person where
>     type Key Person = Username

>     key    p = pUserName p
>     fields p =
>         [ PrimitiveField (pUserName p)
>         , PrimitiveField (pFullName p)
>         ] ++
>         map DependencyField (pTrusts p)

Lastly, let's add an auxiliary method that gives simply us the dependencies of a
node. We can do this by finding all the `DependencyField` fields and
concatenating their keys in a list.

> dependencies :: CyclicHash a => a -> [Key a]
> dependencies node = [k | DependencyField k <- fields node]

Aside: the Hashable library
---------------------------

In this blogpost, we use the [hashable] library, because of its easy interface
and wide adoption. However, we must keep in mind that this library does not
provide us with cryptographically strong hash functions.

In production code, it would be better to use an algorithm like SHA-2. The idea
behind the cyclic hashing stays the same though, so updating this code to use
such an algorithm is not a hard task.

A primitive hash
----------------

We can start by defining a "primitive" hash function, which does not look at the
dependencies of a node.

> primitiveHash :: CyclicHash a => Int -> a -> Int
> primitiveHash salt =
>     foldl' hashField salt . fields
>   where
>     hashField s (PrimitiveField  x) =
>         s `hashWithSalt` (0 :: Int) `hashWithSalt` x
>     hashField s (DependencyField k) =
>         s `hashWithSalt` (1 :: Int) `hashWithSalt` k

A component hash
----------------

Now, we want some way to take the hashes of the dependencies into account. But
that is impossible to do if we have the following trust graph like this:

    Mary  ←- David
    |  ↑        |
    ↓  |        ↓
    Rob      Susan

In Haskell, this looks like:

> people :: [Person]
> people =
>     [ Person "rob"   "Robert Smith"    ["mary"]
>     , Person "mary"  "Mary Johnson"    ["rob"]
>     , Person "susan" "Susan Williams"  []
>     , Person "david" "David Brown"     ["mary", "susan"]
>     ]

Mary's hash depends on Rob's hash and vice versa! An elegant way to solve this
is provided to us by computing the [Strongly connected components] of the graph.

[Strongly connected components]: https://en.wikipedia.org/wiki/Strongly_connected_component

We will be able to compute the strongly connected components using the vastly
underrated [Data.Graph] module. The output for the example graph we saw above
is:

[Data.Graph]: https://hackage.haskell.org/package/containers/docs/Data-Graph.html

    B. Mary  | C. ←- David | A. +
       |  ↑  |             |    |
       ↓  |  |             |    ↓
       Rob   |             | Susan

A very important property of the resulting components is that if we look at the
graph of components, **there cannot be any cycles**. This is easy to see: if
there was a cycle between two or more components, they would belong to the same
cyclic component!

Now, we can compute a hash *per component*. The simplest strongly connected
component is a single vertex (represented by the `G.AcyclicSCC` constructor). In
this case, we can rely on the `primitiveHash` function we saw above. Otherwise,
we have a list of vertices (`G.CyclicSCC`). We can hash these together by using
a *fold*.

Note that this hash also contains the *shape* of the component.

> componentHash :: CyclicHash a => Int -> G.SCC a -> Int
> componentHash salt0 (G.AcyclicSCC x) =
>     salt0 `hashWithSalt` (0 :: Int) `primitiveHash` x
> componentHash salt0 (G.CyclicSCC xs0) =
>     salt0 `hashWithSalt` (1 :: Int) `hashPrimitives` xs0
>   where
>     -- | Chain of primitiveHash.
>     hashPrimitives salt = foldl' primitiveHash salt

This gives us something like the following for our example (we use `<+>`) as an
_informal_ operator to denote a combination of hashes:

    componentHash(A) = 0 <+> primitiveHash(susan)
    componentHash(B) = 1 <+> primitiveHash(rob) <+> primitiveHash(mary)
    componentHash(C) = 0 <+> primitiveHash(david)

Note that the the `<+>` operator is _not_ commutative, so the order in which we
do the fold is important. Fortunately, the `Data.Graph` library takes care of
that for us: the nodes returned in a component are sorted in the order we need.

A component hash per node
-------------------------

With the help of the `componentHash` function, we can define another function
which gives us the map containing the hash for each component, by node.

> componentHashes :: CyclicHash a => Int -> [G.SCC a] -> Map (Key a) Int
> componentHashes salt sccs = Map.fromList $ do
>     scc  <- sccs
>     let !hash = componentHash salt scc
>     node <- G.flattenSCC scc
>     return (key node, hash)

For the example graph, this gives us the following mapping:

    rob   => componentHash(B)
    david => componentHash(C)
    susan => componentHash(A)
    mary   => componentHash(B)

The cyclic hash
---------------

Now, we have the almost all the ingredients to compute the full cyclic hash for
a value. What we want to do is easy to informally define but a bit harder to
implement.

Since we know there are no cyclic dependencies between the strongly connected
components of the graph, we can run through them _in order_ now.

1. Let's start with component _A_ which is the simplest. Since the nodes in this
   component have no dependencies whatsoever, we get:

        cyclicHash(susan) = primitiveHash(susan)

2. For component _B_, it's a bit more complicated. Ideally we would have
   something like this, which is of course impossible:

        cyclicHash(rob) = primitiveHash(rob) <+> cyclicHash(mary)
        cyclicHash(mary) = primitiveHash(mary) <+> cyclicHash(rob)

    But we can fix this by, instead of taking the cyclic hashes of nodes in the
    same component, taking the component hashes, since these hashes also contain
    the whole component! This gives us:

        cyclicHash(rob) = primitiveHash(rob) <+> componentHash(B)
        cyclicHash(mary) = primitiveHash(mary) <+> componentHash(B)

3. When we arrive at component _C_, we have already computed the cyclic hashes
   of its dependencies, so we simply get:

        cyclicHash(david) =
            primitiveHash(david) <+>
            cyclicHash(mary)      <+>
            cyclicHash(susan)

The implementation of `cyclicHash` is similar to `primitiveHash`, except we now
pass in an argument `dependencyHashes`. The idea is that this will contain the
`cyclicHash`es of actual dependencies, and the `componentHash`es of cyclic
dependencies (which are in the same component). Fortunately, we don't really
need to check which of these cases we are in -- we can just grab the hash from
the map.

> cyclicHash :: CyclicHash a => Map (Key a) Int -> Int -> a -> Int
> cyclicHash dependencyHashes salt0 = foldl' hashField salt0 . fields
>   where
>     hashField !s (PrimitiveField  x) =
>         s `hashWithSalt` (0 :: Int) `hashWithSalt` x
>     hashField !s (DependencyField k) =
>         let !x = fromMaybe 0 $ Map.lookup k dependencyHashes in
>         s `hashWithSalt` (1 :: Int) `hashWithSalt` x

Tying it all together
---------------------

Finally, it's time to put everything together in a single function that users
can call. In fact, this is the only top-level function exported from this
module!

The function takes a number of values, and returns the values paired with their
hashes. Of course, the interesting bit is how we construct `fullHashes`.

> cyclicHashes :: CyclicHash a => Int -> [a] -> [(a, Int)]
> cyclicHashes salt0 nodes =
>     [ (n, h)
>     | n <- nodes
>     , h <- maybeToList $ Map.lookup (key n) fullHashes
>     ]

We start of by calculating the strongly connected components using the
`Data.Graph` library. Then we calculate a hash for every component, using the
`componentHashes` we defined before.

>   where
>     comps      = G.stronglyConnComp [(n, key n, dependencies n) | n <- nodes]
>     compHashes = componentHashes salt0 comps

It is important to note that `G.stronglyConnComp` returns the components in
[topological order]. For our use case, this means that the components are in the
"right" order with regards to dependencies: if _y_ depends on _x_, _x_ will
always be placed before _y_ in the list.

[topological order]: https://en.wikipedia.org/wiki/Topological_sorting

This means that we can just fold through the `comps` components. The accumulator
we use is the map we calculated using `componentHashes`. Initially, this map
contains the `componentHash` for every node, but as we fold through the list, we
update it to contain the full hash for every node. This is exactly what
`cyclicHash` expects!

>     fullHashes = foldl'
>         (\acc scc ->
>             let !newHashes = Map.fromList
>                       [ (key x, cyclicHash acc salt0 x)
>                       | x <- G.flattenSCC scc
>                       ]
>             in Map.union newHashes acc) compHashes comps

Full example code
-----------------

Let's add a quick printing function to check the results:

> showCyclicHashes :: (CyclicHash a, Show (Key a)) => [a] -> String
> showCyclicHashes values = unlines
>     [ printf "%s: %x" (show (key x)) hash
>     | (x, hash) <- cyclicHashes 42 values
>     ]


    *Data.CyclicHash> putStr $ showCyclicHashes people
    "rob": 9923cc533bb3432
    "mary": e8c6608ddd959521
    "susan": 6dd70fb3a3e127ed
    "david": 86b49a12d999cc35

And then, after we edit e.g. `mary`'s name, we get:

    *Data.CyclicHash> putStr $ showCyclicHashes people
    "rob": 201403e3ca292874
    "mary": 8370448e0b8598cf
    "susan": 6dd70fb3a3e127ed
    "david": 853624228d76fe7f

Seems like it is behaving as expected! [^test]

[^test]: Of course, it goes without saying that algorithms like this need more
    rigorous testing.
	Cryptographic hashes and dependency cycles
	==========================================

	Introduction
	------------

	_"Much more than encryption algorithms, one-way hash functions are the workhorses
	of modern cryptography." - Bruce Schneier_

	NOTE: The above should be formatted as a quote. I don't know if we have anything
	special for that.

	Cryptographic hashes (or _one-way hash functions_) allow us to compute a digest
	that uniquely identifies a resource. If we would make a small change anywhere
	in a resource, its digest also changes (drastically, because of the [Avalanche]
	effect).

	[Avalanche]: https://en.wikipedia.org/wiki/Avalanche_effect

	This makes them very practical for detecting changes in applications that deal
	with dependency trees. If we include the cryptographic hashes of the
	dependencies of a resource in its cryptographic hash, we have a cheap way to
	check if a resource _or any of its dependencies_ was changed.

	This is very commonly used in the software world. For example, think of git
	hashes, blockchains, and the [nix] build system.

	[nix]: https://nixos.org/nix/

	Implementing this is theoretically not that hard, given a good cryptographic
	library. However, it gets trickier when we want to allow dependency cycles. At
	that point, we get a bit of a chicken-and-egg probblem: we need resource _A_s
	hash to compute resource _B_s hash, but that in turn depends on _B_s hash again!

	Cyclical cloud resources
	------------------------

	Let's start by giving a more concrete example of how we encountered this
	problem. In [Fugue], we are dealing with constructing complicated cloud
	infrastructure. Cyclical dependencies between resources inevitably pop up once
	in a while.

	[Fugue]: https://fugue.co/

	For example, it is not uncommon two have two [Security Groups] that refer to
	each other. Imagine you are have a _default_ security group, which contains the
	core of your application, and a _monitoring_ security group which performs
	health checks.

	[Security Groups]: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html

	- You want to allow traffic from _monitoring_ to _default_ in order to perform
	external sanity checks;
	- but you also want to allow traffic from _default_ to _monitoring_, in order to
	perform heartbeat-like checks, or to have your application send detailed
	information.

	Configuring these in mainstream configuration languages often leads to annoying
	cyclic dependency issues, which in turn require annoying and verbose
	workarounds.

	In Fugue, we use a statically typed, declarative programming language called
	_Ludwig_ to configure infrastructure. This allows us to solve this problem in a
	neat way. In this blogpost, we study how we can solved part of the problem:
	detecting changes in these possibly cyclical dependency graphs.

	Literate Haskell
	----------------

	This blogpost is written in Literate Haskell. This is an implementation of
	[literate programming] for the [Haskell] programming language. The idea of
	literate crogramming was coined a long time ago by Dr. Donald Knuth, who cited
	[CWEB] as his favorite programming language.

	[literate programming]: https://en.wikipedia.org/wiki/Literate_programming
	[Haskell]: http://haskell.org
	[CWEB]: http://www-cs-faculty.stanford.edu/~uno/cweb.html

	In literate programming, we communicate an idea by presenting a human-readable
	text interspersed with snippets of code. This can be considered as a sort of
	dual to _"normal"_ programming, where we intersperse our code with comment
	snippets.

	This is perfect for technical blogposts (and of course [we really like
	Haskell]).

	[we really like Haskell]: https://blog.fugue.co/2016-01-27-our-functional-future-or-how-i-learned-to-stop-worrying-and-love-haskell.html

	As a result, you can compile this blogpost easily with a Haskell compiler, or
	even better, load it into the [GHCi REPL] and play around with it. In order to
	do that, download the [code to this blogpost] and load it in GHCi:

	$ ghci cyclic-hash.lhs
	GHCi, version 7.10.3: http://www.haskell.org/ghc/ :? for help
	[1 of 1] Compiling Data.CyclicHash ( cyclic-hash.lhs, interpreted )
	Ok, modules loaded: Data.CyclicHash.
	*Data.CyclicHash>

	[GHCi REPL]: https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/ghci.html
	[code to this blogpost]: TODO

	In order for this to blogpost work as a standalone Haskell module, we need to do
	some setup first. We start with a reasonably standard Haskell module header:
	first we declare language pragmas we need.

	> {-# LANGUAGE BangPatterns #-}
	> {-# LANGUAGE FlexibleContexts #-}
	> {-# LANGUAGE GADTs #-}
	> {-# LANGUAGE MultiParamTypeClasses #-}
	> {-# LANGUAGE TypeFamilies #-}

	Then we describe the exported interface of this module. Following good
	technical design standards, we try to keep this to a minimum.

	> module Data.CyclicHash
	> ( CyclicHash (..)
	> , HashField (..)
	> , cyclicHashes
	>

	We also have some exports for testing:

	> , Username
	> , Person (..)
	> , people
	> , showCyclicHashes
	> ) where

	Lastly we have to import a number of other modules we rely on. Most of these
	modules ship with the base library provided by the compiler, and the others are
	provided by three very commonly used libraries: [containers], [hashable] and
	[unordered-containers].

	[containers]: https://hackage.haskell.org/package/containers
	[hashable]: https://hackage.haskell.org/package/hashable
	[unordered-containers]: https://hackage.haskell.org/package/unordered-containers

	> import Data.Hashable (Hashable, hashWithSalt)
	> import Data.List (foldl')
	> import Data.Map.Strict (Map)
	> import qualified Data.Map.Strict as Map
	> import Data.Maybe (fromMaybe, maybeToList)
	> import Text.Printf (printf)
	> import qualified Data.Graph as G

	Problem definition
	------------------

	In this blogpost, we will look at a simple example of cyclical dependencies. We
	are modeling a trust graph, and we want to compute cryptographic hashes for all
	the people in this trust graph. Because trust can be mutual, we quickly
	encounter the problem of dependency cycles here.

	Let's look at the datatype we will use to model our problem:

	> type Username = String

	> data Person = Person
	> { pUserName :: Username
	> , pFullName :: String
	> , pTrusts :: [Username]
	> }

	A person has a `Username` (which we will use as key) and a full name. A person
	also has a list of other people he or she trusts. The cryptographic hash of a
	`Person` consists of the username, the full name, and *the hashes of the people
	this person trusts*.

	The CyclicHash class
	--------------------

	We start out with a class that captures the essence of computing hashes for
	cyclic data structures.

	We need an associated type synonym, `Key`, which gives us a unique key for the
	value. We also have a corresponding `key` function to obtain this key.

	Usually the hash of a datatype is a hash of all the fields in that
	datastructure. This is why we have the second important function, `fields`. It
	returns a list of all the fields we want to hash.

	> class (Hashable (Key a), Ord (Key a)) => CyclicHash a where
	> type Key a
	> key :: a -> Key a
	> fields :: a -> [HashField a]

	The fields we want to hash are represented using an intermediate datatype.

	We make a distinction between two kinds of fields:

	- Primitive fields (`Int`, `String`, and other things that have a reasonable
	`Hashable` instance). These just contain the value that needs to be hashed.
	- Dependency fields which refer to other values. These are the ones we need to
	be careful about, since they can introduce dependency cycles.

	> data HashField a where
	> PrimitiveField :: Hashable b => b -> HashField a
	> DependencyField :: Key a -> HashField a

	We can now write down the `CyclicHash` instance for `Person`. It is fairly
	straightforward: `pUserName` and `pFullName` contribute to the hash as primitive
	fields, and we create a `DependencyField` for every person in `pTrusts`.

	> instance CyclicHash Person where
	> type Key Person = Username

	> key p = pUserName p
	> fields p =
	> [ PrimitiveField (pUserName p)
	> , PrimitiveField (pFullName p)
	> ] ++
	> map DependencyField (pTrusts p)

	Lastly, let's add an auxiliary method that gives simply us the dependencies of a
	node. We can do this by finding all the `DependencyField` fields and
	concatenating their keys in a list.

	> dependencies :: CyclicHash a => a -> [Key a]
	> dependencies node = [k \| DependencyField k <- fields node]

	Aside: the Hashable library
	---------------------------

	In this blogpost, we use the [hashable] library, because of its easy interface
	and wide adoption. However, we must keep in mind that this library does not
	provide us with cryptographically strong hash functions.

	In production code, it would be better to use an algorithm like SHA-2. The idea
	behind the cyclic hashing stays the same though, so updating this code to use
	such an algorithm is not a hard task.

	A primitive hash
	----------------

	We can start by defining a "primitive" hash function, which does not look at the
	dependencies of a node.

	> primitiveHash :: CyclicHash a => Int -> a -> Int
	> primitiveHash salt =
	> foldl' hashField salt . fields
	> where
	> hashField s (PrimitiveField x) =
	> s `hashWithSalt` (0 :: Int) `hashWithSalt` x
	> hashField s (DependencyField k) =
	> s `hashWithSalt` (1 :: Int) `hashWithSalt` k

	A component hash
	----------------

	Now, we want some way to take the hashes of the dependencies into account. But
	that is impossible to do if we have the following trust graph like this:

	Mary ←- David
	\| ↑ \|
	↓ \| ↓
	Rob Susan

	In Haskell, this looks like:

	> people :: [Person]
	> people =
	> [ Person "rob" "Robert Smith" ["mary"]
	> , Person "mary" "Mary Johnson" ["rob"]
	> , Person "susan" "Susan Williams" []
	> , Person "david" "David Brown" ["mary", "susan"]
	> ]

	Mary's hash depends on Rob's hash and vice versa! An elegant way to solve this
	is provided to us by computing the [Strongly connected components] of the graph.

	[Strongly connected components]: https://en.wikipedia.org/wiki/Strongly_connected_component

	We will be able to compute the strongly connected components using the vastly
	underrated [Data.Graph] module. The output for the example graph we saw above
	is:

	[Data.Graph]: https://hackage.haskell.org/package/containers/docs/Data-Graph.html

	B. Mary \| C. ←- David \| A. +
	\| ↑ \| \| \|
	↓ \| \| \| ↓
	Rob \| \| Susan

	A very important property of the resulting components is that if we look at the
	graph of components, there cannot be any cycles. This is easy to see: if
	there was a cycle between two or more components, they would belong to the same
	cyclic component!

	Now, we can compute a hash per component. The simplest strongly connected
	component is a single vertex (represented by the `G.AcyclicSCC` constructor). In
	this case, we can rely on the `primitiveHash` function we saw above. Otherwise,
	we have a list of vertices (`G.CyclicSCC`). We can hash these together by using
	a fold.

	Note that this hash also contains the shape of the component.

	> componentHash :: CyclicHash a => Int -> G.SCC a -> Int
	> componentHash salt0 (G.AcyclicSCC x) =
	> salt0 `hashWithSalt` (0 :: Int) `primitiveHash` x
	> componentHash salt0 (G.CyclicSCC xs0) =
	> salt0 `hashWithSalt` (1 :: Int) `hashPrimitives` xs0
	> where
	> -- \| Chain of primitiveHash.
	> hashPrimitives salt = foldl' primitiveHash salt

	This gives us something like the following for our example (we use `<+>`) as an
	_informal_ operator to denote a combination of hashes:

	componentHash(A) = 0 <+> primitiveHash(susan)
	componentHash(B) = 1 <+> primitiveHash(rob) <+> primitiveHash(mary)
	componentHash(C) = 0 <+> primitiveHash(david)

	Note that the the `<+>` operator is _not_ commutative, so the order in which we
	do the fold is important. Fortunately, the `Data.Graph` library takes care of
	that for us: the nodes returned in a component are sorted in the order we need.

	A component hash per node
	-------------------------

	With the help of the `componentHash` function, we can define another function
	which gives us the map containing the hash for each component, by node.

	> componentHashes :: CyclicHash a => Int -> [G.SCC a] -> Map (Key a) Int
	> componentHashes salt sccs = Map.fromList $ do
	> scc <- sccs
	> let !hash = componentHash salt scc
	> node <- G.flattenSCC scc
	> return (key node, hash)

	For the example graph, this gives us the following mapping:

	rob => componentHash(B)
	david => componentHash(C)
	susan => componentHash(A)
	mary => componentHash(B)

	The cyclic hash
	---------------

	Now, we have the almost all the ingredients to compute the full cyclic hash for
	a value. What we want to do is easy to informally define but a bit harder to
	implement.

	Since we know there are no cyclic dependencies between the strongly connected
	components of the graph, we can run through them _in order_ now.

	1. Let's start with component _A_ which is the simplest. Since the nodes in this
	component have no dependencies whatsoever, we get:

	cyclicHash(susan) = primitiveHash(susan)

	2. For component _B_, it's a bit more complicated. Ideally we would have
	something like this, which is of course impossible:

	cyclicHash(rob) = primitiveHash(rob) <+> cyclicHash(mary)
	cyclicHash(mary) = primitiveHash(mary) <+> cyclicHash(rob)

	But we can fix this by, instead of taking the cyclic hashes of nodes in the
	same component, taking the component hashes, since these hashes also contain
	the whole component! This gives us:

	cyclicHash(rob) = primitiveHash(rob) <+> componentHash(B)
	cyclicHash(mary) = primitiveHash(mary) <+> componentHash(B)

	3. When we arrive at component _C_, we have already computed the cyclic hashes
	of its dependencies, so we simply get:

	cyclicHash(david) =
	primitiveHash(david) <+>
	cyclicHash(mary) <+>
	cyclicHash(susan)

	The implementation of `cyclicHash` is similar to `primitiveHash`, except we now
	pass in an argument `dependencyHashes`. The idea is that this will contain the
	`cyclicHash`es of actual dependencies, and the `componentHash`es of cyclic
	dependencies (which are in the same component). Fortunately, we don't really
	need to check which of these cases we are in -- we can just grab the hash from
	the map.

	> cyclicHash :: CyclicHash a => Map (Key a) Int -> Int -> a -> Int
	> cyclicHash dependencyHashes salt0 = foldl' hashField salt0 . fields
	> where
	> hashField !s (PrimitiveField x) =
	> s `hashWithSalt` (0 :: Int) `hashWithSalt` x
	> hashField !s (DependencyField k) =
	> let !x = fromMaybe 0 $ Map.lookup k dependencyHashes in
	> s `hashWithSalt` (1 :: Int) `hashWithSalt` x

	Tying it all together
	---------------------

	Finally, it's time to put everything together in a single function that users
	can call. In fact, this is the only top-level function exported from this
	module!

	The function takes a number of values, and returns the values paired with their
	hashes. Of course, the interesting bit is how we construct `fullHashes`.

	> cyclicHashes :: CyclicHash a => Int -> [a] -> [(a, Int)]
	> cyclicHashes salt0 nodes =
	> [ (n, h)
	> \| n <- nodes
	> , h <- maybeToList $ Map.lookup (key n) fullHashes
	> ]

	We start of by calculating the strongly connected components using the
	`Data.Graph` library. Then we calculate a hash for every component, using the
	`componentHashes` we defined before.

	> where
	> comps = G.stronglyConnComp [(n, key n, dependencies n) \| n <- nodes]
	> compHashes = componentHashes salt0 comps

	It is important to note that `G.stronglyConnComp` returns the components in
	[topological order]. For our use case, this means that the components are in the
	"right" order with regards to dependencies: if _y_ depends on _x_, _x_ will
	always be placed before _y_ in the list.

	[topological order]: https://en.wikipedia.org/wiki/Topological_sorting

	This means that we can just fold through the `comps` components. The accumulator
	we use is the map we calculated using `componentHashes`. Initially, this map
	contains the `componentHash` for every node, but as we fold through the list, we
	update it to contain the full hash for every node. This is exactly what
	`cyclicHash` expects!

	> fullHashes = foldl'
	> (\acc scc ->
	> let !newHashes = Map.fromList
	> [ (key x, cyclicHash acc salt0 x)
	> \| x <- G.flattenSCC scc
	> ]
	> in Map.union newHashes acc) compHashes comps

	Full example code
	-----------------

	Let's add a quick printing function to check the results:

	> showCyclicHashes :: (CyclicHash a, Show (Key a)) => [a] -> String
	> showCyclicHashes values = unlines
	> [ printf "%s: %x" (show (key x)) hash
	> \| (x, hash) <- cyclicHashes 42 values
	> ]


	*Data.CyclicHash> putStr $ showCyclicHashes people
	"rob": 9923cc533bb3432
	"mary": e8c6608ddd959521
	"susan": 6dd70fb3a3e127ed
	"david": 86b49a12d999cc35

	And then, after we edit e.g. `mary`'s name, we get:

	*Data.CyclicHash> putStr $ showCyclicHashes people
	"rob": 201403e3ca292874
	"mary": 8370448e0b8598cf
	"susan": 6dd70fb3a3e127ed
	"david": 853624228d76fe7f

	Seems like it is behaving as expected! [^test]

	[^test]: Of course, it goes without saying that algorithms like this need more
	rigorous testing.