TimWSpence/having_your_cake_and_eating_it.md Secret

## having_your_cake_and_eating_it.md

      
    Raw
  

              having_your_cake_and_eating_it.md
            
          
    Having your cake and eating it

At Permutive, we're very much committed to functional programming. That
typically means committing to immutable data structures (and this is a good
thing!). However, there are times when an algorithm would run much faster or
more space efficiently if it could update state in place.
Consider for example the classic implementation of quick sort:
qsort :: Ord a => [a] -> [a]
qsort [] = []
qsort (h:t) = [x | x <- t, x <= h] ++ [h] ++ [x | x <- t, x > h]
Whilst conceptually elegant, this is hugely space efficient (and slow) due to
repeated list concatenation. It should be possible to sort in-place, as in the
classical implementation you will find in any imperative language. But to do so
requires us to have mutable state!
One option for mutable state would be to store our states inside an IORef.
This provides mutable memory protected by sequenced IO actions. This would run
in constant space but is not very satisfactory as our types become polluted with
IO. Clearly this should not be necessary as our qsort function does not want
to modify any non-local state.
This in fact reveals something about the nature of the problem. The reason we're
required to introduce IO is because the compiler has no way to guarantee that
we won't share our mutable array with another thread and hence to ensure
referential transparency we must work within the IO monad.
So can we construct a type which permits in-place mutation but which cannot be
shared between threads?
Yes! :) (Many thanks to John Launchbury and Simon Peyton Jones - cf their 1994
paper Lazy Functional State Threads)
The ST Monad

Haskell defines types ST s a and STRef s a which support (amongst others) the following operations:
newSTRef :: a -> ST s (STRef s a)
readSTRef :: STRef s a -> ST s a
writeSTRef :: STRef s a -> a -> ST s ()
modifySTRef :: STRef s a -> (a -> a) -> ST s ()
which means that we can write (pseudo-imperative) code like:
increment :: STRef s Int -> ST s ()
increment ref = do
  current <- readSTRef ref
  writeSTRef ref (current + 1)
  
prog :: ST s Int
prog = do
  ref <- newSTRef 0
  increment ref
  increment ref
  readSTRef ref
We can also extract a value from ST using
{-# LANGUAGE RankNTypes #-}

runST :: (forall s. ST s a) -> a
eg
runST prog :: Int

(runST prog) == 2 -- not haskell but you get the picture!
An STRef is backed by a piece of mutable memory. How can the compiler be sure
that this is referentially transparent and doesn't have the same potential
problems as IORef above?
Let's try to 'leak' the ref from an ST value to see what happens
runST $ newSTRef 0 -- attempt to leak the ref from a runST call
                   -- does not compile!!
The compiler will error with the message
Couldn't match type ‘a’ with ‘STRef s Integer’ because type variable ‘s’ would escape its scope

Why is this? Consider the types:
newSTRef 0 :: ST s (STRef s Int)
runST :: (forall s. ST s a) -> a
Suppose we could apply runST to newSTRef 0. I guess we would have:
runST $ newSTRef 0 :: STRef s Int
But this contradicts the type of runST as the return type is dependent on s, whereas
the return type of runST is independent of s
Similarly we can show that the compiler will not allow us to pass in an STRef to be evaluated
by runST:
let v = runST (newVar True)
  in runST (readVar v)
If this were legal, v would have type ST s (STRef s Bool) and hence readVar v :: ST s Bool.
Again, this does not match the type (forall s. ST s a) required by runST - s is free in the
latter but not in the former.
TLDR; the compiler will in fact prohibit us from passing an STRef as a
parameter to runST or returning an STRef via runST, hence we can guarantee
that the mutable state is never shared! (For a more rigorous proof see the original paper)
Back to sorting

What has the ST monad bought us? We now have a way to perform in-place mutations whilst maintaining
referential transparency! Here is qsort running in constant space as promised (using STArray which
is built on top of ST and works much the same way)
qort :: Ord a => [a] -> [a]
qort items = runST $ do
    let len = length items
    arr <- newListArray (1, len) items
    qsort' arr 1 len
    getElems arr
  where
    qsort' :: Ord a => (STArray s Int a) -> Int -> Int -> ST s ()
    qsort' arr low high = do
      when (low < high) $ do
        q <- partition arr low high
        qsort' arr low (q - 1)
        qsort' arr (q + 1) high

    partition arr low high = do
      i <- newSTRef (low - 1)
      pivot <- readArray arr high
      forM_ [low .. pred high] $ \j -> do
        jj <- readArray arr j
        when (jj <= pivot) $ do
          modifySTRef i (+1)
          readSTRef i >>= \ii -> swap arr ii j
      ii <- readSTRef i
      swap arr (ii + 1) high
      return (ii + 1)

    swap arr i j = do
      v <- readArray arr i
      w <- readArray arr j
      writeArray arr i w
      writeArray arr j v
Note that we have not had to sacrifice referential transparency to obtain this -
the signature of the function is unchanged. It is also an almost direct translation
from the corresponding C implementation, but guarantees that our mutable state cannot
escape the scope of the qsort function - something that no imperative language can
do.
TLDR

We can use the ST monad to implement functions which use mutable state for efficiency, whilst
having the compiler guarantee that we won't accidentally share that mutable state :)