@girisandeep
Last active May 21, 2021 05:56
An example of using an accumulator in Spark with Scala
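sc.setLogLevel("ERROR")
val file = sc.textFile("/data/mr/wordcount/input/")
// sc.accumulator(0) is deprecated since Spark 2.0; longAccumulator is its replacement.
val numBlankLines = sc.longAccumulator("numBlankLines")
// Split a line into words, counting blank lines as a side effect.
def toWords(line: String): Array[String] = {
  if (line.isEmpty) numBlankLines.add(1)
  line.split(" ")
}
val words = file.flatMap(toWords)
// The accumulator is only updated once an action (here saveAsTextFile) runs the job.
words.saveAsTextFile("words3")
printf("Blank lines: %d", numBlankLines.value)
//Blank lines: 24858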
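
The snippet above updates the accumulator inside a transformation (flatMap). Spark applies accumulator updates exactly once only when they are made inside an action. Updates made in a transformation can be applied again if a task or stage is re-executed. The following is a minimal sketch of the same count done inside an action, with a filter/count cross-check. It assumes a spark-shell session where `sc` is already defined and reuses the input path from the snippet. The names `lines`, `blankLines`, and `blankCount` are illustrative, not part of the original gist.

```scala
// Assumption: running in spark-shell, so `sc` (SparkContext) already exists.
val lines = sc.textFile("/data/mr/wordcount/input/")

// Illustrative accumulator name; longAccumulator is available from Spark 2.0 onward.
val blankLines = sc.longAccumulator("blankLines")

// foreach is an action, so each task's update is applied exactly once,
// even if a task has to be retried.
lines.foreach(line => if (line.isEmpty) blankLines.add(1))
println(s"Blank lines (accumulator): ${blankLines.value}")

// Cross-check without an accumulator: count the empty lines directly.
val blankCount = lines.filter(_.isEmpty).count()
println(s"Blank lines (filter/count): $blankCount")
```

If the two numbers differ when the update lives in a transformation instead, a re-executed task is the likely cause.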