A question on how to write a "groupBy" machine using Edward Kmett's machines library
{-# LANGUAGE FlexibleContexts #-}
module Main where
import Prelude hiding (id, (.))
import Control.Category
import Control.Applicative
import Control.Monad
import Control.Monad.Trans
import Control.Monad.IO.Class
import Data.Machine
import Data.Machine.Plan
import Data.Machine.Process
import Data.Machine.Source
-- I asked this question on Reddit a while ago:
-- http://www.reddit.com/r/haskell/comments/2ejzst/streaming_tabseparated_logfile_analysis_with/
-- I am interested in trying to do it with Machines
-- (given that the Plan language feels similar to Python)
-- Machines: https://hackage.haskell.org/package/machines
-- I want to write a "groupBy" function similar to Data.List.groupBy
-- that doesn't emit lists (and so doesn't hold each group in memory). I figure it should
-- have a type signature something like the below, i.e. a list of 'iterators'
-- to be executed sequentially
groupBy :: Monad m => (a -> Bool) -> SourceT m (ProcessT m a a)
groupBy = undefined -- I don't know how to write this
-- | Instead, here's a simpler process, in which each group is just a single
-- element.
singletons :: Monad m => SourceT m (ProcessT m a a)
singletons = repeatedly (yield . construct $ await >>= yield)
-- I also want to concatenate my iterators back together.
-- I'm not sure how to do this without using runT: is there a better
-- way?
-- Also, I copied the type from GHCi; I couldn't get a simpler one to work.
concatenating :: (MonadTrans (PlanT (k (MachineT m k1 o)) o), Category k, Monad m)
              => MachineT m (k (MachineT m k1 o)) o
concatenating = repeatedly $ do
  m  <- await
  xs <- (lift . runT) m
  mapM_ yield xs
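-- A minimal usage sketch for 'concatenating', assuming the imports at the top
-- of this file. 'demoConcatenating' is a hypothetical name used only for
-- illustration. It feeds a source of two small sources through 'concatenating'
-- and collects the flattened output with 'run'.
-- Note that 'runT' gathers each inner machine's whole output into a list
-- before anything is yielded, so this version buffers each group in memory
-- rather than streaming it.
demoConcatenating :: [Int]
demoConcatenating = run (source [source [1, 2], source [3]] ~> concatenating)
-- Should evaluate to [1, 2, 3].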
-- What is the best way to write these functions? Do they already exist in Machines?
-- Additionally, would the semantics be the same as Python's itertools.groupby, where evaluating
-- the next group will force the previous group's iterator to be exhausted (i.e., you
-- don't get bits of the previous group in the current group?)