/**
 * Turn a Process[Task,O] into an Iterator[O].
 *
 * Uses the toTask trick discussed here: https://groups.google.com/forum/#!topic/scalaz/gx0eXHpQN48
 * Note: "It's a hack because it's not resource safe - if you stop examining the `Task` before
 * it completes, finalizers for the stream are not guaranteed to be run".
 * Hence, the iterator should always be completely consumed.
 *
 * An earlier attempt at tackling this problem is kept below.
 */
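The shape of the trick can be sketched in plain Scala. This is a simplified stand-in, not the scalaz-stream API: the hypothetical `step` thunk plays the role of the repeated `toTask` run, handing back the next element until the stream is exhausted. It also shows why the iterator must be fully consumed: abandoning it means `step` is never driven to completion, so any cleanup tied to the final step never happens.

```scala
// Sketch only; `step` is a hypothetical stand-in for running the
// Process's toTask repeatedly. Each call yields Some(next element),
// or None once the underlying stream is exhausted.
def toIterator[O](step: () => Option[O]): Iterator[O] =
  new Iterator[O] {
    // Pull eagerly so hasNext can answer without side effects later.
    private var nextElem: Option[O] = step()
    def hasNext: Boolean = nextElem.isDefined
    def next(): O = {
      val o = nextElem.getOrElse(throw new NoSuchElementException)
      nextElem = step() // advance the underlying stream by one step
      o
    }
  }
```

Usage with any element source, for example wrapping a plain Scala iterator:

```scala
val src = (1 to 3).iterator
val it  = toIterator(() => if (src.hasNext) Some(src.next()) else None)
it.toList // List(1, 2, 3)
```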
import org.apache.spark._
import scalaz.stream._

/**
 * Simple proof of concept - fill an RDD from files that have been
 * processed by a scalaz-stream Process (in parallel).
 */
object SparkScalazStream {
  def main(args: Array[String]) {