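// Input: a local file, or a file on HDFS (uncomment the one that applies)
val inputPath = "/Users/itversity/Research/data/wordcount.txt"
// val inputPath = "/public/randomtextwriter/part-m-00000"
// Output: a local directory, or a directory on HDFS
val outputPath = "/Users/itversity/Research/data/wordcount"
// val outputPath = "/user/dgadiraju/wordcount"
// saveAsTextFile fails if outputPath already exists, so make sure it does not exist before running the save step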
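// Optional sketch, not part of the original example: remove an existing output
// directory so that saveAsTextFile does not fail. It assumes a spark-shell session
// where `sc` and `outputPath` are already defined; the names `out` and `fs` are
// illustrative. Careful: this deletes the directory and everything under it.
import org.apache.hadoop.fs.Path
val out = new Path(outputPath)
// Resolve the file system (local or HDFS) that owns this path
val fs = out.getFileSystem(sc.hadoopConfiguration)
if (fs.exists(out)) fs.delete(out, true)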
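// Word count: split each line on spaces, pair every word with 1,
// sum the 1s per word, then print the first 100 (word, count) pairs
sc.textFile(inputPath).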
flatMap(_.split(" ")).
map((_, 1)).
reduceByKey(_ + _).
take(100).
foreach(println)
// Alternative: build the (word, 1) pairs inside flatMap instead of a separate map step
sc.textFile(inputPath).
flatMap(line => line.split(" ").map(rec => (rec, 1))).
reduceByKey(_ + _).
take(100).
foreach(println)
// Saving to file: write each count as a tab-separated line (word<TAB>count) under outputPath
sc.textFile(inputPath).
flatMap(_.split(" ")).
map((_, 1)).
reduceByKey(_ + _).
map(rec => rec._1 + "\t" + rec._2).
saveAsTextFile(outputPath)