Florian Verhein florianverhein

florianverhein / gist:2ed965bde7324cb73325
Last active September 7, 2015 00:02
Drive scalaz.stream.Process externally and expose as Iterator
/**
* Turn a Process[Task,O] into an Iterator[O].
*
* Uses the toTask trick discussed here: https://groups.google.com/forum/#!topic/scalaz/gx0eXHpQN48
* Note: "It's a hack because it's not resource safe - if you stop examining the `Task` before
* it completes, finalizers for the stream are not guaranteed to be run".
* Hence, the iterator should always be completely consumed.
*
* An earlier attempt at tackling this problem is kept below.
*/
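To make the idea in the comment above concrete, here is a minimal sketch of one way to expose a `Process[Task, O]` as an `Iterator[O]`. It is not the gist's code and does not use the toTask trick; it swaps in a different technique, running the process on a background thread and handing elements over through a bounded blocking queue. The names `ProcessIteratorSketch`, `toIterator`, and `bufferSize` are hypothetical, and the sketch assumes a scalaz-stream 0.7-style API (`evalMap`, `run`, `Task.fork`, `runAsync`).

```scala
import java.util.concurrent.LinkedBlockingQueue

import scalaz.{-\/, \/-}
import scalaz.concurrent.Task
import scalaz.stream.Process

object ProcessIteratorSketch {

  // Hypothetical helper (an assumption, not part of scalaz-stream or the gist).
  // Queue protocol: Right(Some(o)) = element, Right(None) = end of stream,
  // Left(t) = the process failed with t.
  def toIterator[O](p: Process[Task, O], bufferSize: Int = 16): Iterator[O] = {
    val queue = new LinkedBlockingQueue[Either[Throwable, Option[O]]](bufferSize)

    // Fork so that a purely synchronous process does not run on (and block)
    // the caller's thread once the bounded queue fills up.
    Task.fork(p.evalMap(o => Task.delay(queue.put(Right(Some(o))))).run)
      .runAsync {
        case -\/(t) => queue.put(Left(t))
        case \/-(_) => queue.put(Right(None))
      }

    new Iterator[O] {
      private var nextItem: Option[O] = None
      private var done = false

      def hasNext: Boolean = {
        if (nextItem.isEmpty && !done) {
          // Blocks until the producer delivers an element or a terminal signal.
          queue.take() match {
            case Right(Some(o)) => nextItem = Some(o)
            case Right(None)    => done = true
            case Left(t)        => done = true; throw t
          }
        }
        nextItem.isDefined
      }

      def next(): O = {
        if (!hasNext) throw new NoSuchElementException("next on empty iterator")
        val o = nextItem.get
        nextItem = None
        o
      }
    }
  }
}
```

The same caveat applies as in the comment above: if the iterator is abandoned before it is exhausted, the producer stays blocked on the full queue and the stream's finalizers are not run, so the iterator should always be consumed completely.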
florianverhein / spark_scalaz-stream
Last active August 29, 2015 14:15
Example: running a scalaz-stream Process inside Spark
import org.apache.spark._
import scalaz.stream._
/**
* Simple proof of concept - fill an RDD from files that have been
* processed by a scalaz-stream Process (in parallel).
*/
object SparkScalazStream {
  def main(args: Array[String]): Unit = {