@mrflip
Created Jun 23, 2009
# Three more ways to cheat:

# Stack up all the values in a list, then sum them at once:
require 'active_support/core_ext/enumerable'
class Reducer1 < Wukong::Streamer::ListReducer
  def finalize
    yield [ key, values.map(&:last).map(&:to_i).sum ]
  end
end
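#
# A minimal plain-Ruby sketch of the same "list, then sum" idea, runnable
# without Wukong. `sum_by_listing` is a hypothetical helper for illustration,
# not part of Wukong's API; it assumes each record is a [key, value] pair.
#
```ruby
# Hypothetical helper (not a Wukong method): collect every record for a key,
# then sum the values in one pass at the end.
def sum_by_listing(records)
  records.group_by(&:first).map do |key, recs|
    # Array#sum is built in from Ruby 2.4; older Rubies need ActiveSupport.
    [key, recs.map(&:last).map(&:to_i).sum]
  end
end

records = [['a', '1'], ['a', '1'], ['b', '1']]
sum_by_listing(records).each { |key, total| puts "#{key}\t#{total}" }
```
#
# The trade-off: every value for a key is held in memory until the sum is taken.
#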
#
# ... this is common enough that it's already included
#
require 'wukong/streamer/count_keys'
class Reducer3 < Wukong::Streamer::CountKeys
end
#
# or really cheat: shell out to `uniq -c`, which reads the sorted keys from
# the inherited stdin and prints one "count key" line per distinct key
#
class Reducer4 < Wukong::Streamer::Base
  def stream
    puts `uniq -c`
  end
end
#
# Accumulate the sum record-by-record:
#
class Reducer2 < Wukong::Streamer::AccumulatingReducer
  attr_accessor :key_count
  def start!(*args)     self.key_count =  0 end
  def accumulate(*args) self.key_count += 1 end
  def finalize
    yield [ key, key_count ]
  end
end
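#
# A rough plain-Ruby sketch of what record-by-record accumulation amounts to,
# assuming the keys arrive already sorted (as they do at the reduce step).
# `count_sorted_keys` is a hypothetical stand-in for the start!/accumulate/
# finalize cycle, not Wukong's implementation.
#
```ruby
# Hypothetical helper (not a Wukong method): walk a sorted list of keys,
# keeping only a running count for the current key.
def count_sorted_keys(keys)
  counts  = []
  current = nil
  count   = 0
  keys.each do |key|
    if key != current
      # Key changed: emit the finished group (the "finalize" step) ...
      counts << [current, count] unless current.nil?
      # ... and reset the counter for the new key (the "start!" step).
      current = key
      count   = 0
    end
    count += 1 # the "accumulate" step
  end
  counts << [current, count] unless current.nil?
  counts
end

count_sorted_keys(%w[a a b c c c]).each { |key, n| puts "#{key}\t#{n}" }
```
#
# Only one counter is live at a time, so memory use does not grow with the
# number of records per key.
#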
module WordCount
  class Mapper < Wukong::Streamer::LineStreamer
    # Split a string into its constituent words.
    def tokenize str
      str.downcase.
        strip.
        split(/\s+/).
        reject(&:blank?)
    end

    # Emit each word in each line.
    def process line
      tokenize(line).each{|word| yield [word, 1] }
    end
  end

  # Conceptually: reduce with uniq -c
end
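#
# To see what the mapper emits, here is a standalone sketch of the tokenizer
# that runs without Wukong. It assumes plain `empty?` in place of `blank?`
# (which is not core Ruby), and `word_pairs` is a hypothetical helper that
# returns the pairs instead of yielding them.
#
```ruby
# Same splitting logic as the Mapper above, using only core Ruby.
def tokenize(str)
  str.downcase.strip.split(/\s+/).reject(&:empty?)
end

# Hypothetical helper: the [word, 1] pairs the mapper would yield for a line.
def word_pairs(line)
  tokenize(line).map { |word| [word, 1] }
end

word_pairs("  The quick  the ").each { |word, n| puts "#{word}\t#{n}" }
```
#
# Punctuation is not stripped, so "the" and "the," count as different words.
#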