@masayuki5160
Created July 23, 2013 00:55
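A reducer for Hadoop streaming. The referers and so on are handed over already shuffled, so it just counts them. The key,value pairs are 'link source', 'link destination'. Not really sure this is the right way to use it, though, lol.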
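#!/usr/bin/ruby
# Hadoop streaming reducer: counts how often each (referer, link) pair occurs.
# Each input line is "referer<TAB>link". This assumes identical pairs arrive
# on consecutive lines, i.e. the shuffle sorted on both fields.
pre_referer = nil
pre_link = nil
count = 0
ARGF.each do |log|
  log.chomp!
  next if log.empty?
  referer, link = log.split(/\t/)
  if referer == pre_referer && link == pre_link
    # Same pair as the previous line: keep counting.
    count += 1
  else
    # A new pair starts: emit the finished group, then reset the counter.
    puts "from #{pre_referer} to #{pre_link} #{count}" if count > 0
    pre_referer = referer
    pre_link = link
    count = 1
  end
end
# Emit the final group, which no later line would otherwise flush.
puts "from #{pre_referer} to #{pre_link} #{count}" if count > 0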
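To try the counting logic without a cluster, something like the following might help. It is only a sketch: `count_pairs` is a hypothetical helper that is not part of the gist, the sample lines are made up, and a plain `sort` stands in for Hadoop's shuffle phase.

```ruby
# Hypothetical helper (not in the original gist): counts runs of identical
# "referer<TAB>link" lines, the way the reducer does on sorted input.
def count_pairs(lines)
  results = []
  previous = nil
  count = 0
  lines.each do |line|
    pair = line.chomp.split(/\t/, 2)
    next if pair.length < 2 # skip blank or malformed lines
    if pair == previous
      count += 1
    else
      # A new pair starts: record the finished group first.
      results << [*previous, count] if previous
      previous = pair
      count = 1
    end
  end
  # Record the last group as well.
  results << [*previous, count] if previous
  results
end

# Made-up sample data; sorting here imitates the shuffle phase.
sample = [
  "/top\t/news",
  "/news\t/article",
  "/top\t/news",
  "/top\t/about",
]
count_pairs(sample.sort).each do |referer, link, n|
  puts "from #{referer} to #{link} #{n}"
end
```

Returning an array instead of printing directly keeps the logic easy to check in isolation. The same grouping only works on a real job if identical pairs reach the reducer next to each other.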