Skip to content

Instantly share code, notes, and snippets.

@crdrost
Created July 13, 2018 19:51
Show Gist options
  • Save crdrost/b33a009ecfbad3982b8c3d83a547f8bf to your computer and use it in GitHub Desktop.
Save crdrost/b33a009ecfbad3982b8c3d83a547f8bf to your computer and use it in GitHub Desktop.
Paper of the day: Hughes Lists
-- paper of the day
-- RJM Hughes, A Novel Representation of Lists and its Application to the Function "Reverse"
-- Information Processing Letters 22:141-144 (1986)
-- https://www.cs.tufts.edu/~nr/cs257/archive/john-hughes/lists.pdf
{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}
module HughesLists where
import Data.Monoid
import Data.Foldable
import Control.Monad.State
-- some helper functions to extract the first and last elements:
genHead :: Foldable t => t x -> Maybe x
genHead = getFirst . foldMap (First . Just)
genLast :: Foldable t => t x -> Maybe x
genLast = getLast . foldMap (Last . Just)
-- we start with spine-strict lists,
infixr 5 :>
data StrictList x = Nil | x :> !(StrictList x) deriving (Eq, Ord, Functor, Foldable)
empty :: StrictList x -> Bool
empty Nil = True
empty _ = False
singleton :: x -> StrictList x
singleton = (:> Nil)
toStrictList :: Foldable t => t x -> StrictList x
toStrictList = foldr (:>) Nil
parenthesizeIf :: Bool -> (String -> String) -> String -> String
parenthesizeIf True f = ('(' :) . f . (')' :)
parenthesizeIf False f = f
instance Show x => Show (StrictList x) where
showsPrec p Nil = parenthesizeIf (p > 10) ("Nil" ++)
showsPrec p t@(_ :> _) = parenthesizeIf (p > 5) $ foldr listStr id t . ("Nil" ++)
where listStr item rest = showsPrec 6 item . (" :> " ++) . rest
-- since it's strict this concat is O(|xs|)!
instance Monoid (StrictList x) where
mempty = Nil
mappend Nil ys = ys
mappend (x :> xs) ys = x :> mappend xs ys
-- which causes this to be slow:
reverseNaive :: StrictList x -> StrictList x
reverseNaive Nil = Nil
reverseNaive (x :> xs) = reverseNaive xs <> singleton x
-- in fact we know the iterorecursive algorithm to do this:
reverseIterative :: StrictList x -> StrictList x
reverseIterative list = getDone $ runState loop (list, Nil)
where
loop :: State (StrictList x, StrictList x) ()
loop = while notDone $ do
(x :> xs, done) <- get
put (xs, x :> done)
notDone :: State (StrictList x, StrictList x) Bool
notDone = not . empty . fst <$> get
getDone :: ((), (StrictList x, StrictList x)) -> StrictList x
getDone (_, (_, done)) = done
while :: Monad m => m Bool -> m () -> m ()
while test action = do
continue <- test;
if continue then action >> while test action
else return ()
-- and we can avoid the state monad by writing that itero-recursively:
reverseCorrect :: StrictList x -> StrictList x
reverseCorrect list = go list Nil where
go Nil done = done
go (x :> todo) done = go todo (x :> done)
-- but that is hard to reason out, can we improve the monoid?
-- instead we use difference lists in the following form:
newtype SLDifference x = SLD { runSLD :: StrictList x -> StrictList x }
-- intriguingly you cannot define Show for this in general... e.g. you can have permutation difference-lists or
-- for example `SLD (\case Nil -> Nil; _ :> xs -> xs)`...
-- but it's a monoid:
instance Monoid (SLDifference x) where
mempty = SLD id
mappend (SLD f) (SLD g) = SLD (f . g)
-- now let's use this "under the hood"
reverseWithDL :: StrictList x -> StrictList x
reverseWithDL list = runSLD (reversing list) Nil where
reversing :: StrictList x -> SLDifference x
reversing Nil = SLD id
reversing (x :> xs) = reversing xs <> SLD (x :>)

Running this in GHCI gives:

GHCi, version 8.2.1: http://www.haskell.org/ghc/  :? for help
Prelude> :load hughes-lists.hs
[1 of 1] Compiling HughesLists      ( hughes-lists.hs, interpreted )
Ok, 1 module loaded.
(0.19 secs,)
*HughesLists> x = toStrictList [1..10000] :: StrictList Int
(0.00 secs, 0 bytes)
*HughesLists> genLast $ reverse [1..10000]
Just 1
(0.01 secs, 2,013,288 bytes)
*HughesLists> genLast $ reverseNaive x
Just 1
(9.15 secs, 10,165,531,624 bytes)
*HughesLists> genLast $ reverseIterative x
Just 1
(0.02 secs, 12,746,016 bytes)
*HughesLists> genLast $ reverseCorrect x
Just 1
(0.01 secs, 3,306,824 bytes)
*HughesLists> genLast $ reverseWithDL x
Just 1
(0.01 secs, 5,146,448 bytes)

This probably doesn't take advantage of all of the optimizations that might be available to GHC directly but it's enough to show the O(n2) behavior that Hughes is complaining about. There is nontrivial overhead in doing it this way but it comes within a factor of 2.5 of the native lazy-list-reverse code shipping in GHC and within a factor of 1.5 of the "correct" reverse, while also showing that just writing reverse in the iterorecursive "correct" style allows GHC to improve the runtime by almost 4x.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment