@imhoffd
Created August 27, 2014 16:28
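// Load the hourly pagecounts dump as an RDD of lines.
val file = sc.textFile("/temp/pagecounts-20100212-050000")
// Keep only the lines that mention Main_Page.
val filteredFile = file.filter(line => line.contains("Main_Page"))
// Key each line by its fourth space-separated field (index 3), parsed as an arbitrary-precision integer.
val keyedFile = filteredFile.keyBy(line => BigInt(line.split(" ")(3)))
// Sort by key in descending order and return the results to the driver.
keyedFile.sortByKey(false).collect()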