@k-fujikawa
k-fujikawa / mapper.py
Last active December 25, 2015 06:18
Word count using Hadoop Streaming
#!/usr/local/bin/python
# coding: utf-8
import sys, json, codecs
dic = json.load(codecs.open('dic.json', 'r', 'utf-8'))
# input comes from STDIN (standard input)
for line in sys.stdin:
    # Split the line into words
    words = line.strip().split()