Last active
August 29, 2015 14:05
A question on how to write a "groupBy" machine using Edward Kmett's machines library
{-# LANGUAGE FlexibleContexts #-}
module Main where

import Prelude hiding (id, (.))
import Control.Category
import Control.Applicative
import Control.Monad
import Control.Monad.Trans
import Control.Monad.IO.Class
import Data.Machine
import Data.Machine.Plan
import Data.Machine.Process
import Data.Machine.Source

-- I asked this question on Reddit a while ago:
-- http://www.reddit.com/r/haskell/comments/2ejzst/streaming_tabseparated_logfile_analysis_with/
-- I am interested in trying to do it with Machines
-- (given that the Plan language feels similar to Python).
-- Machines: https://hackage.haskell.org/package/machines

-- I want to write a "groupBy" function similar to Data.List.groupBy
-- that doesn't emit lists (and so doesn't hold each group in memory). I figure
-- it should have a type signature something like the one below, i.e. a list of
-- 'iterators' to be executed sequentially.
groupBy :: Monad m => (a -> Bool) -> SourceT m (ProcessT m a a)
groupBy = undefined -- I don't know how to write this
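
-- One possible workaround, sketched under assumptions rather than offered as
-- an answer: instead of nesting machines, keep the stream flat and tag each
-- element with a flag saying whether it starts a new group. 'markGroups' is a
-- hypothetical name, not a function from Machines. It compares each element
-- with the first element of the current group, as Data.List.groupBy does,
-- and takes a binary predicate rather than the (a -> Bool) guessed above.
markGroups :: (a -> a -> Bool) -> Process a (Bool, a)
markGroups eq = construct $ do
    x <- await
    yield (True, x)          -- the first element always opens a group
    go x
  where
    -- 'first' is the element that opened the current group
    go first = do
      x <- await
      if eq first x
        then yield (False, x) >> go first  -- same group continues
        else yield (True, x) >> go x       -- x opens a new group

-- Only one element is held at a time, so no group is buffered. For example,
--   run (source [1, 1, 2, 3, 3] ~> markGroups (==))
-- gives [(True,1),(False,1),(True,2),(True,3),(False,3)].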

-- | Instead, here's a simpler process, in which each group is just a single
-- element.
singletons :: Monad m => SourceT m (ProcessT m a a)
singletons = repeatedly (yield . construct $ await >>= yield)

-- I also want to concatenate my iterators back together.
-- I'm not sure how to do this without using runT: is there a better
-- way?
-- Also, I copied the type from GHCi; I couldn't get a simpler one to work.
concatenating :: (MonadTrans (PlanT (k (MachineT m k1 o)) o), Category k, Monad m)
              => MachineT m (k (MachineT m k1 o)) o
concatenating = repeatedly $ do
  m <- await
  xs <- (lift . runT) m
  mapM_ yield xs
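
-- A sketch of the same machine with a shorter signature, assuming the input
-- side is fixed to 'Is' (i.e. a ProcessT) instead of an arbitrary Category.
-- 'concatenating'' is a hypothetical name used only for this illustration.
-- PlanT already has a MonadTrans instance, so that constraint can be dropped.
concatenating' :: Monad m => ProcessT m (MachineT m k o) o
concatenating' = repeatedly $ do
  m <- await            -- next inner machine
  xs <- lift (runT m)   -- run it to completion in the base monad
  mapM_ yield xs        -- re-emit its outputs one by one

-- Note that runT returns the inner machine's whole output as a list, so each
-- group is still held in memory before any of it is yielded. For example,
--   run (source [source [1, 2], source [3]] ~> concatenating')
-- gives [1,2,3].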

-- What is the best way to write these functions? Do they already exist in Machines?
-- Additionally, would the semantics be the same as Python's groupby, where evaluating
-- the next group forces the previous group's iterator to be exhausted (i.e., you
-- don't get bits of the previous group in the current group)?