@adekunleba
Created March 23, 2019 19:08
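import org.apache.spark.rdd.RDD

// Stop words to drop; built once instead of on every filter call
val stopwords = Set("and", "the", "is", "to", "she", "he")
val split: RDD[String] = rdd.flatMap(_.split("\\s+")) // tokenize each line on whitespace
val trim: RDD[String] = split.map(_.trim.toLowerCase) // normalize case
val stopwordsRemoved: RDD[String] = trim.filter(w => w.nonEmpty && !stopwords.contains(w)) // drop empty tokens and stop words
val assignOne: RDD[(String, Int)] = stopwordsRemoved.map((_, 1)) // pair each word with a count of 1
val counts: RDD[(String, Int)] = assignOne.reduceByKey(_ + _) // sum the counts per word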
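
// A minimal local sketch, not part of the original gist: the same word-count steps on a
// plain Scala collection, which may help when checking the expected output without a
// Spark context. `sampleLines`, `localStopwords`, and `localCounts` are hypothetical
// names introduced only for illustration.
val sampleLines = Seq("She is here and he is there", "The cat ran to the door")
val localStopwords = Set("and", "the", "is", "to", "she", "he")
val localCounts: Map[String, Int] = sampleLines
  .flatMap(_.split(" "))                                   // tokenize on spaces
  .map(_.trim.toLowerCase)                                 // normalize case
  .filter(w => w.nonEmpty && !localStopwords.contains(w))  // drop empties and stop words
  .groupBy(identity)                                       // local stand-in for reduceByKey
  .map { case (word, occurrences) => (word, occurrences.size) }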