Created
September 17, 2011 20:17
-
-
Save Tener/1224319 to your computer and use it in GitHub Desktop.
Quite efficient generation "look and say" sequence in Haskell
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{- | |
ghc --make -Odph -fllvm -optlo-globalopt -optlo-loop-unswitch -optlo-mem2reg -optlo-prune-eh -O2 -fllvm -optl-O3 -rtsopts lookandsay.hs | |
time ./lookandsay > /dev/null | |
./lookandsay > /dev/null 6,02s user 0,24s system 99% cpu 6,284 total | |
-} | |
import System.IO | |
import Control.DeepSeq | |
data S = S1 | S2 | S3 -- deriving (Eq,Show,Read,Ord,Enum) | |
instance NFData S where | |
look_and_say :: [[S]] | |
look_and_say = iterate carryOn [S1] | |
carryOn :: [S] -> [S] | |
carryOn (S1 : S1 : S1 : rest) = S3 : S1 : carryOn rest | |
carryOn (S1 : S1 : rest) = S2 : S1 : carryOn rest | |
carryOn (S1 : rest) = S1 : S1 : carryOn rest | |
carryOn (S2 : S2 : S2 : rest) = S3 : S2 : carryOn rest | |
carryOn (S2 : S2 : rest) = S2 : S2 : carryOn rest | |
carryOn (S2 : rest) = S1 : S2 : carryOn rest | |
carryOn (S3 : S3 : S3 : rest) = S3 : S3 : carryOn rest | |
carryOn (S3 : S3 : rest) = S2 : S3 : carryOn rest | |
carryOn (S3 : rest) = S1 : S3 : carryOn rest | |
carryOn [] = [] | |
{-# INLINE render #-} | |
render (x,y) = do | |
let r S1 = '1' | |
r S2 = '2' | |
r S3 = '3' | |
fi :: S -> Int | |
fi S1 = 1 | |
fi S2 = 2 | |
fi S3 = 3 | |
-- hPrint stderr (deepseq y x) | |
hPrint stderr (x, sum $ map fi y) | |
print (x,map r y) | |
-- print (deepseq y x) | |
look_and_say' :: [[S]] | |
look_and_say' = take n1 look_and_say | |
main :: IO () | |
main = do | |
mapM_ render (zip [1..n1] look_and_say') | |
-- print $ look_and_say !! n2 | |
n1 = 60 |
Do you know why this is so much faster? I pattern matching faster than using length
? Or is it the absence of show
and read
that does the speed difference?
I think your analysis is quite right. The pattern matching is indeed very fast, while length, show and read are understandably quite slow. I said "quite efficient" because I think one can do even better. For example C++ version is bound to be faster.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The difference in performance is huge, and hence hard to measure accurately. The slow algorithm takes a lot of time to finish printing the first 50 items. With fast it's almost instant. Furthermore the fast one is actually GC-bound. If you provide the right RTS flags it runs even faster.
The actual numbers:
slow - 117.34 sec.
above, fast - 0.40 sec.
So generallly it's like 300x faster.