@jlewi
Created May 24, 2014 02:16
Snippet showing how I try to set up a Spark context using my custom Kryo registrator for Avro generics.
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
val conf = new SparkConf().setMaster("spark://spark-master:7077").setAppName("myapp")
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
conf.set("spark.kryo.registrator", "contrail.AvroGenericRegistrator")
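// Assumes spark-shell, where a context named sc already exists; stop it before creating one with the Kryo settings.
sc.stop()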
val sc = new SparkContext(conf)
val contigFile = contrail.AvroHelper.readAvro(sc,"hdfs://hadoop-nn//tmp/contrail.stages.CompressAndCorrect/part-*.avro")
val datums = contigFile.map(r => r._1.datum)
val keyedById = datums.map(r => (r.get("node_id").toString, r))
keyedById.cache()
keyedById.lookup("4fw3YAOX8-lvVjIWgNPGYR3gc5CvsxI")