@agrison
Created October 30, 2016 16:23
Spark word count (Java 8)
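import java.util.Arrays;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import scala.Tuple2;

// sc is assumed to be an existing JavaSparkContext.
JavaRDD<String> textFile = sc.textFile("hdfs://...");
JavaPairRDD<String, Integer> counts = textFile
    .flatMap(line -> Arrays.asList(line.split(" ")).iterator()) // Spark 2.x+ expects an Iterator; on 1.x drop .iterator()
    .mapToPair(w -> new Tuple2<>(w, 1))   // pair each word with a count of 1
    .reduceByKey((x, y) -> x + y);        // sum the counts for each word
counts.saveAsTextFile("hdfs://...");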
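
To check what the pipeline computes without a cluster, here is a minimal in-memory sketch of the same three steps using plain Java 8 streams. It is not part of the gist and does not use Spark: `LocalWordCount` and `count` are hypothetical names introduced only for illustration, and the input is a hard-coded list rather than an HDFS file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class LocalWordCount {
    // Hypothetical helper: mirrors flatMap -> mapToPair -> reduceByKey in memory.
    static Map<String, Integer> count(List<String> lines) {
        return lines.stream()
                // Split every line on single spaces, one stream element per word.
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toMap(
                        w -> w,          // key: the word itself
                        w -> 1,          // value: an initial count of 1
                        Integer::sum,    // merge duplicates by summing, like reduceByKey
                        TreeMap::new));  // sorted keys, so the printed order is stable
    }

    public static void main(String[] args) {
        // prints {a=2, b=2, c=1}
        System.out.println(count(Arrays.asList("a b a", "b c")));
    }
}
```

As in the Spark version, splitting on a single space means consecutive spaces produce empty-string "words" that get counted too; splitting on `"\\s+"` is one way to avoid that if it matters for your data.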