@igaiga
Last active March 16, 2017 04:40
## Analyze Wikipedia access data
# Data format
# https://wikitech.wikimedia.org/wiki/Analytics/Data/Pageviews
# Files
# https://dumps.wikimedia.org/other/pageviews/
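# Assumes the hourly dump has already been downloaded and decompressed.
filename = "pageviews-20170101-000000"
access_data = []
# Each line: domain_code page_title count_views total_response_size
# Read as UTF-8 so the regexp below matches regardless of the locale.
File.foreach(filename, encoding: "UTF-8") do |text|
  data = text.split
  # Keep Japanese Wikipedia ("ja") pages whose title ends with 駅 (station)
  if data[0] == "ja" && data[1] =~ /駅\z/
    access_data.push({title: data[1], count: data[2].to_i})
  end
end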
# Sort by count (ascending)
result = access_data.sort_by do |i|
  i[:count].to_i
end
# Show the top 10 (highest count first)
result.reverse.first(10).each do |i|
  puts i
end
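
The same filter-and-rank logic can also be sketched as two small functions that work on any list of lines, which makes it easy to try without downloading a dump. This is only a sketch: `parse_pageview_line`, `top_pages`, and `SAMPLE_LINES` are hypothetical names that do not appear in the script above, and the sample lines are made up for illustration, not real pageview counts. It assumes Ruby 2.7 or later for `filter_map`.

```ruby
# Hypothetical helper: parse one line of the form
# "domain_code page_title count_views total_response_size".
# Returns nil for lines that do not have at least three fields.
def parse_pageview_line(line)
  domain, title, count = line.split(" ")
  return nil if domain.nil? || title.nil? || count.nil?

  { domain: domain, title: title, count: count.to_i }
end

# Hypothetical helper: the `limit` most-viewed pages of `domain`
# whose title matches `pattern`, highest count first.
def top_pages(lines, domain:, pattern:, limit: 10)
  lines.filter_map { |line| parse_pageview_line(line) }
       .select { |entry| entry[:domain] == domain && entry[:title].match?(pattern) }
       .sort_by { |entry| -entry[:count] }
       .first(limit)
end

# Made-up sample lines for illustration only (not real data).
SAMPLE_LINES = [
  "ja 東京駅 120 0",
  "ja 大阪駅 80 0",
  "ja 駅弁 500 0",
  "en Tokyo_Station 300 0",
  "ja 京都駅 95 0",
]

top_pages(SAMPLE_LINES, domain: "ja", pattern: /駅\z/, limit: 2).each do |entry|
  puts "#{entry[:title]}\t#{entry[:count]}"
end
```

To run it against a real dump, one option is to pass `File.foreach(filename, encoding: "UTF-8")` as `lines`, since `top_pages` only needs an enumerable of strings.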