@changkun
Created June 5, 2017 22:29
Word Count MapReduce
# wordcount
import string
from functools import reduce
text = """\
How much ground would a groundhog hog, \
if a groundhog could hog ground? \
A groundhog would hog all the ground he could hog, \
if a groundhog could hog ground. \
"""
# 1. partition: lowercase, strip punctuation, split on whitespace
table = str.maketrans('', '', string.punctuation)
word_list = text.lower().translate(table).split()
# 2. map
word_map = map(lambda x: (x, 1), word_list)
# 3. group by
group_by_word = {}
for word, value in word_map:
    try:
        group_by_word[word].append(value)
    except KeyError:
        # first occurrence of this word: start a new value list
        group_by_word[word] = [value]
# 4. reduce
word_count = {}
for key, key_list in group_by_word.items():
    word_count[key] = reduce(lambda x, y: x + y, key_list)
print(word_count)
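
The same four steps (partition, map, group by, reduce) can also be wrapped in a reusable function. What follows is a minimal sketch, not part of the original gist: the function name `word_count_mapreduce` and the `sample` string are made up for illustration, and it assumes that lowercasing and stripping ASCII punctuation is the desired normalization. `collections.Counter` is used only as a cross-check.

```python
import string
from collections import Counter, defaultdict
from functools import reduce
from operator import add


def word_count_mapreduce(text):
    """Count words via explicit partition, map, group-by, and reduce steps."""
    # 1. partition: lowercase, strip punctuation, split on whitespace
    table = str.maketrans('', '', string.punctuation)
    words = text.lower().translate(table).split()
    # 2. map: emit a (word, 1) pair for every word
    pairs = map(lambda w: (w, 1), words)
    # 3. group by: collect the emitted values under each key
    groups = defaultdict(list)
    for word, value in pairs:
        groups[word].append(value)
    # 4. reduce: sum the values collected for each key
    return {word: reduce(add, values) for word, values in groups.items()}


# Hypothetical sample input, chosen only for illustration
sample = "if a groundhog could hog ground? A groundhog would hog."
counts = word_count_mapreduce(sample)
print(counts)

# Cross-check against the standard library's Counter
table = str.maketrans('', '', string.punctuation)
expected = dict(Counter(sample.lower().translate(table).split()))
assert counts == expected
```

Using `defaultdict(list)` removes the need for the `try`/`except` in the group-by step. For real workloads `Counter` alone does the whole job, but the explicit version keeps the MapReduce stages visible.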