Issac Buenrostro ibuenros

ibuenros / SparkUtils.scala
Created June 29, 2014 17:12
Spark productionizing utilities developed by Ooyala, shown in Spark Summit 2014
//==================================================================
// SPARK INSTRUMENTATION
//==================================================================
import com.codahale.metrics.{MetricRegistry, Meter, Gauge}
import org.apache.spark.{SparkEnv, Accumulator}
import org.apache.spark.metrics.source.Source
import org.joda.time.DateTime
import scala.collection.mutable
# Job metadata
job.name=PullFromWikipediaToKafka
job.group=Wikipedia
job.description=Pull from Wikipedia and write to Kafka
# Schedule (Quartz cron expression: fire at second 0 of every 2nd minute)
job.schedule=0 0/2 * * * ?
# Source configuration
extract.namespace=gobblin.example.wikipedia