@mwinkle
Created January 5, 2016 14:20
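# Word count in PySpark; `sc` is the SparkContext that the pyspark shell provides.
text_file = sc.textFile("hdfs://sandbox.hortonworks.com/user/guest/install.log")
# Split each line on whitespace, pair each word with 1, then sum the counts per word.
counts = text_file.flatMap(lambda line: line.split()) \
    .map(lambda word: (word, 1)) \
    .reduceByKey(lambda a, b: a + b)
# saveAsTextFile writes a directory of part files and fails if the path already exists.
counts.saveAsTextFile("hdfs://sandbox.hortonworks.com/user/guest/output1.txt")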
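
To check what the flatMap, map, and reduceByKey steps compute without a cluster, the pipeline can be imitated in plain Python. This is only a rough single-process sketch, not how Spark executes the job: the `word_count` helper and the `sample_lines` input are made up for illustration, and the sketch splits on any whitespace.

```python
from functools import reduce
from itertools import chain


def word_count(lines):
    """Local imitation of the flatMap -> map -> reduceByKey pipeline (hypothetical helper)."""
    # flatMap: split every line into words and flatten the results into one sequence
    words = chain.from_iterable(line.split() for line in lines)
    # map: pair each word with an initial count of 1
    pairs = ((word, 1) for word in words)

    # reduceByKey: combine the values of equal keys with a + b
    def merge(acc, pair):
        word, n = pair
        acc[word] = acc[word] + n if word in acc else n
        return acc

    return reduce(merge, pairs, {})


# Made-up sample input standing in for the contents of install.log
sample_lines = ["installing package foo", "installing package bar"]
counts = word_count(sample_lines)
print(counts)
```

Unlike this sketch, Spark partitions the data and applies the merge function both within and across partitions, which is why the function passed to reduceByKey should be associative and commutative.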