Skip to content

Instantly share code, notes, and snippets.

@oisdk
Last active October 22, 2022 16:46
Show Gist options
  • Save oisdk/0822477aaced58a5ba937c3d11c19639 to your computer and use it in GitHub Desktop.
Save oisdk/0822477aaced58a5ba937c3d11c19639 to your computer and use it in GitHub Desktop.
import Data.List (unfoldr, partition)
import Data.Maybe (catMaybes)
import Criterion.Main (defaultMain, env, bgroup, bench, nf)
import System.Random (randomIO)
import Control.Monad (replicateM)
groupOn :: Eq k => (a -> k) -> [a] -> [(k, [a])]
groupOn k = unfoldr f . map (\x -> (k x, x))
where
f [] = Nothing
f ((k,x):xs) = Just ((k , x : map snd ys), zs)
where
(ys,zs) = partition ((k==) . fst) xs
groupOnOrd :: Ord k => (a -> k) -> [a] -> [(k,[a])]
groupOnOrd k = catMaybes . go . map (\x -> (k x, x))
where
go [] = []
go ((k,x):xs) = Just (k, x : e) : merge m (go l) (go g)
where
(e, m, l, g) = foldr split ([],[],[],[]) xs
split ky@(k',y) ~(e, m, l, g) = case compare k' k of
LT -> ( e, LT : m, ky : l, g)
EQ -> (y:e, EQ : m, l, g)
GT -> ( e, GT : m, l, ky : g)
merge [] lt gt = []
merge (EQ : xs) lt gt = Nothing : merge xs lt gt
merge (LT : xs) (l:lt) gt = l : merge xs lt gt
merge (GT : xs) lt (g:gt) = g : merge xs lt gt
main =
defaultMain
[ env (replicateM m randomIO) $ \xs ->
bgroup (show m)
(
[ bgroup "id"
[ bench "groupOn" $ nf (groupOn id) xs
, bench "groupOnOrd" $ nf (groupOnOrd id) xs
]
] ++
[ bgroup (show (n :: Word))
[ bench "groupOn" $ nf (groupOn (`rem` n)) xs
, bench "groupOnOrd" $ nf (groupOnOrd (`rem` n)) xs
]
| n <- [2,3,100,1000], n < toEnum m ]
)
| p <- [2,3,4], let m = 10 ^ p ]
@oisdk
Copy link
Author

oisdk commented Oct 22, 2022

The maybe isn't there to improve laziness really, it's there to make sure that the lists match up when zipped back together.

split 2 [1,2,3,1,2,3] ~= ([2,2],[Just False, Nothing, Just True, Just False, Nothing, Just True],[1,1],[3,3])

When reconstructing the list at the end you need to know when to not pull an element from either list, and you need to preserve the Nothings in the generated list at each level so the zipped lists match up in length.

@oisdk
Copy link
Author

oisdk commented Oct 22, 2022

Although it probably is clearer to use Ord rather than Maybe Bool

@anka-213
Copy link

But Nothing only happens when the elements are equal, in which case there is nothing to match up, so it just get's thrown away immediately. Am I missing something?

@anka-213
Copy link

Although it probably is clearer to use Ord rather than Maybe Bool

Yes, that does look a lot nicer!

@anka-213
Copy link

anka-213 commented Oct 22, 2022

Riight, I see now. It's the recursive case where it breaks down, since if the child calls would remove an element of the list, it would no longer line up for the parent. Thanks for clearing it up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment