Tristan Reid tristanreid

tristanreid / KMeans2Caller.scala
Last active October 2, 2015 05:30
Scalding ExecutionApp for KMeans example
/*
To execute https://github.com/twitter/scalding/blob/master/scalding-core/src/main/scala/com/twitter/scalding/examples/KMeans.scala
and wrap results in a file.
Execute locally like this:
scala -classpath target/project-0.0.1-jar-with-dependencies.jar com.mycompany.project.KMeans2Caller \
--local \
--clusters <num clusters> \
--input ../work/kmeansData.tsv \
--output kout.tsv \
tristanreid / ExecutionAppExamples.scala
Last active October 2, 2015 06:05
Examples of how (and how not) to call Scalding's ExecutionApp
import com.twitter.scalding.typed.TypedPipe
import com.twitter.scalding._
import scala.util.{Failure, Success}
// If you want to understand what's going on, read the source of Execution and ExecutionApp (it's not long).
// Here are the highlights: the main of ExecutionApp executes the job for you, as long as you return an Execution[Unit].
// Execution: the combinators to consider are flatMap, zip, and unit.
// unit: since you have to return an Execution[Unit] at some point, this is handy. It works nicely with .zip, for example.
// zip: combine Executions so they can execute in parallel, for fun and profit.
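
The comments above name flatMap, zip, and unit without showing them together, so here is a minimal sketch of how they might combine inside an ExecutionApp. It is not taken from the gist: the object name `ExecutionAppSketch`, the in-memory data, and the `println` are all assumptions made for illustration, and it assumes a Scalding version where `TypedPipe#toIterableExecution` is available.

```scala
import com.twitter.scalding._
import com.twitter.scalding.typed.TypedPipe

// Hypothetical example object; the name and the data are illustrative only.
object ExecutionAppSketch extends ExecutionApp {

  // Two independent Executions, each built from a small in-memory pipe.
  val evens: Execution[Iterable[Int]] =
    TypedPipe.from(List(1, 2, 3, 4)).filter(_ % 2 == 0).toIterableExecution

  val scaled: Execution[Iterable[Int]] =
    TypedPipe.from(List(1, 2, 3, 4)).map(_ * 10).toIterableExecution

  // zip: pair the two Executions so they can be scheduled together.
  // map: sort the results so the output does not depend on ordering.
  val combined: Execution[(List[Int], List[Int])] =
    evens.zip(scaled).map { case (e, s) => (e.toList.sorted, s.toList.sorted) }

  // flatMap: use the combined result, then finish with Execution.unit so
  // that job has the Execution[Unit] type that ExecutionApp's main expects.
  def job: Execution[Unit] =
    combined.flatMap { case (e, s) =>
      println(s"evens = $e, scaled = $s")
      Execution.unit
    }
}
```

Assuming the object is on the classpath, it could be launched the same way as the KMeans caller above, passing `--local` to run in local mode. Returning `Execution.unit` from the final `flatMap` is just one way to reach `Execution[Unit]`; ending with `.map(_ => ())` or `.unit` would serve equally well.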