Skip to content

Instantly share code, notes, and snippets.

@nikita-volkov
Last active December 14, 2015 18:58
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save nikita-volkov/5132774 to your computer and use it in GitHub Desktop.
Save nikita-volkov/5132774 to your computer and use it in GitHub Desktop.
Haskell list items equality algorithms benchmark for this StackOverflow answer: http://stackoverflow.com/a/15319373/485115.
import Data.List
import qualified Data.Set as Set
import Criterion.Main
import System.Random
import Control.Applicative
import qualified Data.MultiSet as MultiSet
absDiffImpl :: (Eq a) => [a] -> [a] -> Bool
absDiffImpl a b = null $ absDiff a b
absDiff :: (Eq a) => [a] -> [a] -> [a]
absDiff [] [] = []
absDiff [] b = b
absDiff a [] = a
absDiff (aHead:aTail) b = absDiff aTail $ delete aHead b
sortImpl :: (Eq a, Ord a) => [a] -> [a] -> Bool
sortImpl a b = sort a == sort b
-- Note: produces incorrect results for lists with duplicate items
setImpl :: (Eq a, Ord a) => [a] -> [a] -> Bool
setImpl a b = Set.fromList a == Set.fromList b
multiSetImpl :: (Ord a) => [a] -> [a] -> Bool
multiSetImpl a b = MultiSet.fromList a == MultiSet.fromList b
generateEqualInput :: Int -> IO ([Int], [Int])
generateEqualInput amount = do
input <- take amount . randoms <$> newStdGen
return (input, input)
generateUnequalInput :: Int -> IO ([Int], [Int])
generateUnequalInput amount = do
gen1 <- newStdGen
gen2 <- newStdGen
return (take amount $ randoms gen1, take amount $ randoms gen2)
generateShuffledInput :: Int -> IO ([Int], [Int])
generateShuffledInput amount = do
input <- take amount . randoms <$> newStdGen
shuffledInput <- shuffle input
return (input, shuffledInput)
where
shuffle :: [a] -> IO [a]
shuffle l = shuffle' l []
where
shuffle' [] acc = return acc
shuffle' l acc =
do k <- randomRIO (0, length l - 1)
let (lead, x:xs) = splitAt k l
shuffle' (lead ++ xs) (x:acc)
generateShuffledAndResizedInput :: Int -> IO ([Int], [Int])
generateShuffledAndResizedInput amount = do
(a, b) <- generateShuffledInput amount
amount' <- randomRIO (0, amount)
return (take amount' a, b)
main :: IO ()
main = defaultMain [
bgroup "absDiffImpl" [
bench "equal 1000" $ nfIO $
uncurry absDiffImpl <$> generateEqualInput 1000,
bench "unequal 1000" $ nfIO $
uncurry absDiffImpl <$> generateUnequalInput 1000,
bench "shuffled 500" $ nfIO $
uncurry absDiffImpl <$> generateShuffledInput 500,
bench "shuffled 1000" $ nfIO $
uncurry absDiffImpl <$> generateShuffledInput 1000,
bench "resized 1000" $ nfIO $
uncurry absDiffImpl <$> generateShuffledAndResizedInput 1000
],
bgroup "sortImpl" [
bench "equal 1000" $ nfIO $
uncurry sortImpl <$> generateEqualInput 1000,
bench "unequal 1000" $ nfIO $
uncurry sortImpl <$> generateUnequalInput 1000,
bench "shuffled 500" $ nfIO $
uncurry sortImpl <$> generateShuffledInput 500,
bench "shuffled 1000" $ nfIO $
uncurry sortImpl <$> generateShuffledInput 1000,
bench "resized 1000" $ nfIO $
uncurry sortImpl <$> generateShuffledAndResizedInput 1000
],
bgroup "multiSetImpl" [
bench "equal 1000" $ nfIO $
uncurry multiSetImpl <$> generateEqualInput 1000,
bench "unequal 1000" $ nfIO $
uncurry multiSetImpl <$> generateUnequalInput 1000,
bench "shuffled 500" $ nfIO $
uncurry multiSetImpl <$> generateShuffledInput 500,
bench "shuffled 1000" $ nfIO $
uncurry multiSetImpl <$> generateShuffledInput 1000,
bench "resized 1000" $ nfIO $
uncurry multiSetImpl <$> generateShuffledAndResizedInput 1000
],
bgroup "setImpl" [
bench "equal 1000" $ nfIO $
uncurry setImpl <$> generateEqualInput 1000,
bench "unequal 1000" $ nfIO $
uncurry setImpl <$> generateUnequalInput 1000,
bench "shuffled 500" $ nfIO $
uncurry setImpl <$> generateShuffledInput 500,
bench "shuffled 1000" $ nfIO $
uncurry setImpl <$> generateShuffledInput 1000,
bench "resized 1000" $ nfIO $
uncurry setImpl <$> generateShuffledAndResizedInput 1000
]
]
estimating clock resolution...
mean is 1.011339 us (640001 iterations)
found 1131755 outliers among 639999 samples (176.8%)
509406 (79.6%) low severe
622349 (97.2%) high severe
estimating cost of a clock call...
mean is 40.46522 ns (8 iterations)
benchmarking absDiffImpl/equal 1000
mean: 924.5801 us, lb 924.1164 us, ub 925.0713 us, ci 0.950
std dev: 2.444467 us, lb 2.123007 us, ub 3.005146 us, ci 0.950
benchmarking absDiffImpl/unequal 1000
mean: 989.7161 us, lb 984.8095 us, ub 998.9906 us, ci 0.950
std dev: 33.48594 us, lb 18.08658 us, ub 51.08539 us, ci 0.950
found 4 outliers among 100 samples (4.0%)
4 (4.0%) high severe
variance introduced by outliers: 29.688%
variance is moderately inflated by outliers
benchmarking absDiffImpl/shuffled 500
mean: 4.044514 ms, lb 4.016478 ms, ub 4.074533 ms, ci 0.950
std dev: 148.3493 us, lb 135.8523 us, ub 161.6944 us, ci 0.950
variance introduced by outliers: 33.577%
variance is moderately inflated by outliers
benchmarking absDiffImpl/shuffled 1000
mean: 15.22935 ms, lb 15.14198 ms, ub 15.32655 ms, ci 0.950
std dev: 472.2023 us, lb 411.1463 us, ub 547.9943 us, ci 0.950
variance introduced by outliers: 25.815%
variance is moderately inflated by outliers
benchmarking absDiffImpl/resized 1000
mean: 10.63822 ms, lb 10.54257 ms, ub 10.75604 ms, ci 0.950
std dev: 544.6949 us, lb 440.9155 us, ub 718.2932 us, ci 0.950
found 3 outliers among 100 samples (3.0%)
2 (2.0%) high mild
1 (1.0%) high severe
variance introduced by outliers: 49.450%
variance is moderately inflated by outliers
benchmarking sortImpl/equal 1000
mean: 1.799238 ms, lb 1.788926 ms, ub 1.811898 ms, ci 0.950
std dev: 58.71677 us, lb 49.42973 us, ub 69.80589 us, ci 0.950
found 12 outliers among 100 samples (12.0%)
9 (9.0%) high mild
3 (3.0%) high severe
variance introduced by outliers: 27.766%
variance is moderately inflated by outliers
benchmarking sortImpl/unequal 1000
mean: 2.281759 ms, lb 2.270899 ms, ub 2.295485 ms, ci 0.950
std dev: 62.62989 us, lb 52.34079 us, ub 77.79304 us, ci 0.950
found 16 outliers among 100 samples (16.0%)
13 (13.0%) high mild
3 (3.0%) high severe
variance introduced by outliers: 21.903%
variance is moderately inflated by outliers
benchmarking sortImpl/shuffled 500
mean: 3.461132 ms, lb 3.434782 ms, ub 3.489642 ms, ci 0.950
std dev: 140.0275 us, lb 125.2981 us, ub 156.4142 us, ci 0.950
variance introduced by outliers: 37.548%
variance is moderately inflated by outliers
benchmarking sortImpl/shuffled 1000
mean: 11.61127 ms, lb 11.55885 ms, ub 11.66647 ms, ci 0.950
std dev: 276.9058 us, lb 246.0557 us, ub 314.8248 us, ci 0.950
variance introduced by outliers: 17.110%
variance is moderately inflated by outliers
benchmarking sortImpl/resized 1000
mean: 10.99581 ms, lb 10.95493 ms, ub 11.03970 ms, ci 0.950
std dev: 217.1699 us, lb 186.7764 us, ub 259.2101 us, ci 0.950
found 5 outliers among 100 samples (5.0%)
4 (4.0%) high mild
variance introduced by outliers: 12.323%
variance is moderately inflated by outliers
benchmarking multiSetImpl/equal 1000
mean: 1.885414 ms, lb 1.873198 ms, ub 1.900768 ms, ci 0.950
std dev: 69.92698 us, lb 57.56888 us, ub 83.43766 us, ci 0.950
found 21 outliers among 100 samples (21.0%)
4 (4.0%) low mild
2 (2.0%) high mild
15 (15.0%) high severe
variance introduced by outliers: 33.609%
variance is moderately inflated by outliers
benchmarking multiSetImpl/unequal 1000
mean: 2.976697 ms, lb 2.963476 ms, ub 2.995110 ms, ci 0.950
std dev: 79.20024 us, lb 62.12011 us, ub 102.5614 us, ci 0.950
found 25 outliers among 100 samples (25.0%)
5 (5.0%) low mild
19 (19.0%) high severe
variance introduced by outliers: 20.932%
variance is moderately inflated by outliers
benchmarking multiSetImpl/shuffled 500
mean: 3.433142 ms, lb 3.413308 ms, ub 3.453713 ms, ci 0.950
std dev: 103.4538 us, lb 92.82392 us, ub 116.3661 us, ci 0.950
variance introduced by outliers: 24.837%
variance is moderately inflated by outliers
benchmarking multiSetImpl/shuffled 1000
mean: 11.31875 ms, lb 11.28396 ms, ub 11.35565 ms, ci 0.950
std dev: 184.2387 us, lb 162.6675 us, ub 216.0683 us, ci 0.950
found 1 outliers among 100 samples (1.0%)
variance introduced by outliers: 9.410%
variance is slightly inflated by outliers
benchmarking multiSetImpl/resized 1000
mean: 11.36131 ms, lb 11.32126 ms, ub 11.40203 ms, ci 0.950
std dev: 206.0579 us, lb 181.6311 us, ub 236.6741 us, ci 0.950
variance introduced by outliers: 11.316%
variance is moderately inflated by outliers
benchmarking setImpl/equal 1000
mean: 1.600115 ms, lb 1.590495 ms, ub 1.614578 ms, ci 0.950
std dev: 59.82304 us, lb 44.12002 us, ub 78.91178 us, ci 0.950
found 9 outliers among 100 samples (9.0%)
9 (9.0%) high severe
variance introduced by outliers: 33.631%
variance is moderately inflated by outliers
benchmarking setImpl/unequal 1000
mean: 2.674131 ms, lb 2.662926 ms, ub 2.689107 ms, ci 0.950
std dev: 65.68530 us, lb 52.39455 us, ub 79.87830 us, ci 0.950
found 14 outliers among 100 samples (14.0%)
2 (2.0%) high mild
12 (12.0%) high severe
variance introduced by outliers: 18.065%
variance is moderately inflated by outliers
benchmarking setImpl/shuffled 500
mean: 3.388221 ms, lb 3.369279 ms, ub 3.406866 ms, ci 0.950
std dev: 96.33406 us, lb 87.24324 us, ub 107.8311 us, ci 0.950
variance introduced by outliers: 22.887%
variance is moderately inflated by outliers
benchmarking setImpl/shuffled 1000
mean: 11.58844 ms, lb 11.54014 ms, ub 11.63710 ms, ci 0.950
std dev: 248.5703 us, lb 217.6913 us, ub 291.6066 us, ci 0.950
found 3 outliers among 100 samples (3.0%)
2 (2.0%) high mild
variance introduced by outliers: 14.236%
variance is moderately inflated by outliers
benchmarking setImpl/resized 1000
mean: 11.39132 ms, lb 11.35020 ms, ub 11.43506 ms, ci 0.950
std dev: 217.9452 us, lb 193.3982 us, ub 250.0571 us, ci 0.950
variance introduced by outliers: 12.279%
variance is moderately inflated by outliers
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment