emilaxelsson/SmartFix.lhs

## SmartFix.lhs
(This post is literate Haskell, [available here](https://gist.github.com/emilaxelsson/81d2774f0c96750378bf).)

This post will show a trick for making it easier to work with terms representated as fixed-points of functors in Haskell. It is assumed that the reader is familiar with this representation.

First some extensions:

\begin{code}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeFamilies #-}
\end{code}


Representing terms as fixed-points
--------------------------------------------------------------------------------

A common generic term representation is as a fixed-point of a functor:

\begin{code}
newtype Term f = In { out :: f (Term f) }
\end{code}

A simple example functor is the following, representing arithmetic expressions:

\begin{code}
data Arith a
    = Num Int
    | Add a a
  deriving (Functor)
\end{code}

One problem with the `Term` representation is that it's quite cumbersome to construct with it. For example, the following is the representation of the expression $(1+2)+3$:

\begin{code}
term1 :: Term Arith
term1 =
    In (Add
      (In (Add
        (In (Num 1))
        (In (Num 2))
      ))
      (In (Num 3))
    )
\end{code}

This post will show a simple trick for inserting the `In` constructors automatically, allowing us to write simply:

\begin{code}
term2 :: Term Arith
term2 = deepIn10 $ Add (Add (Num 1) (Num 2)) (Num 3)
\end{code}


Deep conversion of terms
--------------------------------------------------------------------------------

First, let's have a look at the type of the above expression:

\begin{code}
_ = Add (Add (Num 1) (Num 2)) (Num 3) :: Arith (Arith (Arith a))
\end{code}

We see that the type has three levels of `Arith` because the tree has depth three. So in general, we need a function with the following type scheme:

~~~~~~~~~~~~~~~~~~~~{.haskell}
f (f (f ... (f (Term f)))) -> Term f
~~~~~~~~~~~~~~~~~~~~

The purpose of this function is to insert an `In` at each level, from bottom to top. The reason for having `Term f` in the argument type (rather than just `a`) is that `In` expects an argument of type `Term f`.

This function has to be overloaded on the argument type, so we put it in a type class:

\begin{code}
class Smart f a
  where
    deepIn  :: f a -> Term f
    deepOut :: Term f -> f a
\end{code}

Here `f` is the functor and `a` can be `Term f`, `f (Term f)`, `f (f (Term f))`, etc. The class also contains the inverse function `deepOut` for reasons that will be explained later.

First we make the base instance for when we only have one level:

\begin{code}
instance Smart f (Term f)
  where
    deepIn  = In
    deepOut = out
\end{code}

Next, we make a recusive instance for when there is more than one level:

\begin{code}
instance (Smart f a, Functor f) => Smart f (f a)
  where
    deepIn  = In . fmap deepIn
    deepOut = fmap deepOut . out
\end{code}

And that's it!


Ambiguous types
--------------------------------------------------------------------------------

Now there's only one problem. If we try to use `deepIn $ Add (Add (Num 1) (Num 2)) (Num 3)`, GHC will complain:

    No instance for (Smart Arith a0) arising from a use of ‘deepIn’

As said earlier, the type of the given expression is `Arith (Arith (Arith a))`, which means that we get an ambiguous type `a` which means that it's not possible to know which of the above instances to pick for this `a`.

So we need to constrain the type before calling `deepIn`. One idea might be to use a type function for constraining the `a`. Unfortunately, as far as I know, it's not possible to make a type function that maps e.g.

~~~~~~~~~~~~~~~~~~~~{.haskell}
Arith (Arith ... (Arith a))
~~~~~~~~~~~~~~~~~~~~

to

~~~~~~~~~~~~~~~~~~~~{.haskell}
Arith (Arith ... (Arith (Term Arith)))
~~~~~~~~~~~~~~~~~~~~

This is because `a` is a polymorphic variable, and type functions cannot case split on whether or not a type is polymorphic.

But there is one thing we can do. The type `Arith (Arith (Arith a))` shows that the expression has (at most) three levels. But there is nothing stopping us from adding more levels in the type.

Why not make a version of `deepIn10` that forces 10 levels in the type?

\begin{code}
type Iter10 f a = f (f (f (f (f (f (f (f (f (f a)))))))))

deepIn10 :: (Functor f, a ~ Iter10 f (Term f)) => f a -> Term f
deepIn10 = deepIn
\end{code}

This function can be used to insert `In` constructors in terms of at most depth 10.


Constructing terms from existing terms
--------------------------------------------------------------------------------

The above example, `term2`, was constructed from scratch. We might also want to construct terms that contain other, already constructed, terms. This is where `deepOut` comes into play:

\begin{code}
term3 = deepIn10 $ Add (deepOut term2) (Num 4)
\end{code}
	(This post is literate Haskell, [available here](https://gist.github.com/emilaxelsson/81d2774f0c96750378bf).)

	This post will show a trick for making it easier to work with terms representated as fixed-points of functors in Haskell. It is assumed that the reader is familiar with this representation.

	First some extensions:

	\begin{code}
	{-# LANGUAGE DeriveFunctor #-}
	{-# LANGUAGE FlexibleInstances #-}
	{-# LANGUAGE MultiParamTypeClasses #-}
	{-# LANGUAGE TypeFamilies #-}
	\end{code}



	Representing terms as fixed-points
	--------------------------------------------------------------------------------

	A common generic term representation is as a fixed-point of a functor:

	\begin{code}
	newtype Term f = In { out :: f (Term f) }
	\end{code}

	A simple example functor is the following, representing arithmetic expressions:

	\begin{code}
	data Arith a
	= Num Int
	\| Add a a
	deriving (Functor)
	\end{code}

	One problem with the `Term` representation is that it's quite cumbersome to construct with it. For example, the following is the representation of the expression $(1+2)+3$:

	\begin{code}
	term1 :: Term Arith
	term1 =
	In (Add
	(In (Add
	(In (Num 1))
	(In (Num 2))
	))
	(In (Num 3))
	)
	\end{code}

	This post will show a simple trick for inserting the `In` constructors automatically, allowing us to write simply:

	\begin{code}
	term2 :: Term Arith
	term2 = deepIn10 $ Add (Add (Num 1) (Num 2)) (Num 3)
	\end{code}



	Deep conversion of terms
	--------------------------------------------------------------------------------

	First, let's have a look at the type of the above expression:

	\begin{code}
	_ = Add (Add (Num 1) (Num 2)) (Num 3) :: Arith (Arith (Arith a))
	\end{code}

	We see that the type has three levels of `Arith` because the tree has depth three. So in general, we need a function with the following type scheme:

	~~~~~~~~~~~~~~~~~~~~{.haskell}
	f (f (f ... (f (Term f)))) -> Term f
	~~~~~~~~~~~~~~~~~~~~

	The purpose of this function is to insert an `In` at each level, from bottom to top. The reason for having `Term f` in the argument type (rather than just `a`) is that `In` expects an argument of type `Term f`.

	This function has to be overloaded on the argument type, so we put it in a type class:

	\begin{code}
	class Smart f a
	where
	deepIn :: f a -> Term f
	deepOut :: Term f -> f a
	\end{code}

	Here `f` is the functor and `a` can be `Term f`, `f (Term f)`, `f (f (Term f))`, etc. The class also contains the inverse function `deepOut` for reasons that will be explained later.

	First we make the base instance for when we only have one level:

	\begin{code}
	instance Smart f (Term f)
	where
	deepIn = In
	deepOut = out
	\end{code}

	Next, we make a recusive instance for when there is more than one level:

	\begin{code}
	instance (Smart f a, Functor f) => Smart f (f a)
	where
	deepIn = In . fmap deepIn
	deepOut = fmap deepOut . out
	\end{code}

	And that's it!



	Ambiguous types
	--------------------------------------------------------------------------------

	Now there's only one problem. If we try to use `deepIn $ Add (Add (Num 1) (Num 2)) (Num 3)`, GHC will complain:

	No instance for (Smart Arith a0) arising from a use of ‘deepIn’

	As said earlier, the type of the given expression is `Arith (Arith (Arith a))`, which means that we get an ambiguous type `a` which means that it's not possible to know which of the above instances to pick for this `a`.

	So we need to constrain the type before calling `deepIn`. One idea might be to use a type function for constraining the `a`. Unfortunately, as far as I know, it's not possible to make a type function that maps e.g.

	~~~~~~~~~~~~~~~~~~~~{.haskell}
	Arith (Arith ... (Arith a))
	~~~~~~~~~~~~~~~~~~~~

	to

	~~~~~~~~~~~~~~~~~~~~{.haskell}
	Arith (Arith ... (Arith (Term Arith)))
	~~~~~~~~~~~~~~~~~~~~

	This is because `a` is a polymorphic variable, and type functions cannot case split on whether or not a type is polymorphic.

	But there is one thing we can do. The type `Arith (Arith (Arith a))` shows that the expression has (at most) three levels. But there is nothing stopping us from adding more levels in the type.

	Why not make a version of `deepIn10` that forces 10 levels in the type?

	\begin{code}
	type Iter10 f a = f (f (f (f (f (f (f (f (f (f a)))))))))

	deepIn10 :: (Functor f, a ~ Iter10 f (Term f)) => f a -> Term f
	deepIn10 = deepIn
	\end{code}

	This function can be used to insert `In` constructors in terms of at most depth 10.



	Constructing terms from existing terms
	--------------------------------------------------------------------------------

	The above example, `term2`, was constructed from scratch. We might also want to construct terms that contain other, already constructed, terms. This is where `deepOut` comes into play:

	\begin{code}
	term3 = deepIn10 $ Add (deepOut term2) (Num 4)
	\end{code}