@mchung
Created November 13, 2008 23:12
# Count how often each search query occurs across the input files.
inputs = args
spec = Gisting::Spec.new

inputs.each do |file_input|
  input = spec.add_input
  input.file_pattern = file_input
  input.map do |map_input|
    # Sample input lines (tab-separated; the query is the second field):
    # 2722 mailbox 2006-05-23 00:08:39
    # 217 - 2006-05-23 15:41:48
    # 1326 www.crazyradiodeals.com 2006-05-23 18:00:30
    # 2722 mailbox 2006-05-23 00:08:39
    # 2722 mailbox 2006-05-23 00:08:42
    # 2722 jc whitney 2006-05-23 00:25:47 1 http://www.jcwhitney.com
    words = map_input.strip.split("\t")
    # Emit the query as the key with a count of "1".
    Emit(words[1], "1")
  end
end

output = spec.output
output.filebase = "/Volumes/gisting/datasets/output"
output.num_tasks = 2
output.reduce do |reduce_input|
  # Sum the "1" values emitted for a single key.
  count = 0
  reduce_input.each do |value|
    count += value.to_i
  end
  Emit(count)
end

result = MapReduce(spec)
pp result
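
For comparison, here is a minimal plain-Ruby sketch of the same map, group, and reduce steps, runnable without Gisting. It is an illustration only, not how Gisting executes a job. The names `SAMPLE_LINES` and `count_queries` are hypothetical, and the sample records assume the fields are tab-separated, as the `split("\t")` call above implies.

```ruby
# Hypothetical sample records, copied from the comments in the map block.
# Assumption: fields are separated by tabs.
SAMPLE_LINES = [
  "2722\tmailbox\t2006-05-23 00:08:39",
  "217\t-\t2006-05-23 15:41:48",
  "1326\twww.crazyradiodeals.com\t2006-05-23 18:00:30",
  "2722\tmailbox\t2006-05-23 00:08:39",
  "2722\tmailbox\t2006-05-23 00:08:42",
  "2722\tjc whitney\t2006-05-23 00:25:47\t1\thttp://www.jcwhitney.com"
].freeze

# Hypothetical helper: counts occurrences of the second field of each line.
def count_queries(lines)
  # Map step: one [query, "1"] pair per record.
  pairs = lines.map { |line| [line.strip.split("\t")[1], "1"] }

  # Group step: collect the emitted values under their key.
  grouped = pairs.group_by { |query, _| query }

  # Reduce step: sum the values for each key.
  grouped.each_with_object({}) do |(query, kvs), totals|
    totals[query] = kvs.inject(0) { |sum, (_, value)| sum + value.to_i }
  end
end

counts = count_queries(SAMPLE_LINES)
counts.each { |query, count| puts "#{query}\t#{count}" }
```

With these six records, `mailbox` is counted three times and every other query once.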