The input used here was the first 1000 lines of http://www.w3.org/html/wg/drafts/html/master/single-page.html. Running on the whole page causes a stack overflow in the specLex case :(
Last active
December 15, 2015 15:09
% ./speclex +RTS -N2 -RTS
warming up
estimating clock resolution...
mean is 2.471314 us (320001 iterations)
found 58216 outliers among 319999 samples (18.2%)
  92 (2.9e-2%) low severe
  58124 (18.2%) high severe
estimating cost of a clock call...
mean is 92.79044 ns (18 iterations)
found 1 outliers among 18 samples (5.6%)
  1 (5.6%) high mild

benchmarking lexing/seq
mean: 1.232473 ms, lb 1.192931 ms, ub 1.298411 ms, ci 0.950
std dev: 256.2373 us, lb 172.0983 us, ub 358.9553 us, ci 0.950
found 17 outliers among 100 samples (17.0%)
  4 (4.0%) high mild
  13 (13.0%) high severe
variance introduced by outliers: 94.674%
variance is severely inflated by outliers

benchmarking lexing/spec
collecting 100 samples, 1 iterations each, in estimated 10.07850 s
mean: 65.31114 ms, lb 62.94200 ms, ub 69.55120 ms, ci 0.950
std dev: 15.86399 ms, lb 9.789240 ms, ub 25.18797 ms, ci 0.950
found 8 outliers among 100 samples (8.0%)
  4 (4.0%) high mild
  4 (4.0%) high severe
variance introduced by outliers: 95.756%
variance is severely inflated by outliers
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as S8
import Control.Concurrent.Speculation
import Control.DeepSeq
import Criterion.Main

data Lexeme
  = STag     -- <[^>]*>
  | Content  -- [^<]+
  | ETag     -- </[^>]*>
  deriving (Show, Eq)

-- Nullary constructors only, so evaluating to WHNF is already full evaluation.
-- Written out explicitly because newer deepseq versions need Generic for the default.
instance NFData Lexeme where
  rnf x = x `seq` ()

data State
  = Init
  | Open
  | InSTag
  | InContent
  | InETag
  deriving (Show, Eq)

instance NFData State where
  rnf x = x `seq` ()

-- One transition of the lexer: the lexeme completed by this character (if any)
-- and the next state.
step :: State -> Char -> (Maybe Lexeme, State)
step Init      '<' = (Nothing, Open)
step Init      _   = (Nothing, InContent)
step InContent '<' = (Just Content, Open)
step InContent _   = (Nothing, InContent)
step Open      '/' = (Nothing, InETag)
step Open      _   = (Nothing, InSTag)
step InSTag    '>' = (Just STag, Init)
step InSTag    _   = (Nothing, InSTag)
step InETag    '>' = (Just ETag, Init)
step InETag    _   = (Nothing, InETag)

-- Sequential lexer: fold the transition function over the input.
seqLex :: ByteString -> State
seqLex = S8.foldl go Init
  where
    -- Ignore output for simplicity
    go s i = snd $ step s i

-- Speculative lexer: at each position, guess the incoming state and let
-- `spec` run the step on the guess while the real state is being computed.
specLex :: ByteString -> State
specLex input = foldlBS guess go Init input
  where
    -- Size of the lookback window used for the guess
    k = 8
    -- Guess the state at idx by lexing a short window before it, starting from Init.
    -- With this slice the window is bytes [idx - k, idx - 1), so the byte at
    -- idx - 1 itself is not part of the guess.
    guess idx = seqLex $ slice (idx - k) (idx - 1) input
    go s i = snd $ step s i

-- Left fold over a ByteString that speculates on the accumulator:
-- `g n` is the guess for the accumulator before the n-th character.
foldlBS :: Eq b => (Int -> b) -> (b -> Char -> b) -> b -> ByteString -> b
foldlBS g f z = extractAcc . S8.foldl mf (Acc 0 z)
  where
    mf (Acc n a) b = Acc (n + 1) (spec (g n) (`f` b) a)

-- Accumulator paired with the current position
data Acc a = Acc {-# UNPACK #-} !Int a

extractAcc :: Acc a -> a
extractAcc (Acc _ a) = a
{-# INLINE extractAcc #-}

-- Bytes in the half-open range [from, to)
slice :: Int -> Int -> ByteString -> ByteString
slice from to = S8.drop (fromIntegral from) . S8.take (fromIntegral to)

main :: IO ()
main = do
  html <- S8.readFile "foo.html"
  defaultMain
    [ bgroup "lexing"
        [ bench "seq" $ nf seqLex html
        , bench "spec" $ nf specLex html
        ]
    ]