@hachy
Created November 21, 2018 04:00
Nogizaka46 Buzzword Awards 2018
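# frozen_string_literal: true

# Tally the entries saved in word.txt (one per line), then print every entry
# that shares its first three characters with one of the 20 most frequent.

# chomp: true strips the line endings up front. The original String#chomp!
# returns nil when a line has no trailing newline, which would yield a nil key.
words = File.readlines("word.txt", chomp: true)

word_count = words.each_with_object(Hash.new(0)) do |word, hash|
  hash[word] += 1
end

# The 20 most frequent entries, highest count first.
top = word_count.sort_by { |_, count| -count }.take(20)

# Unique three-character prefixes of those entries.
initial = top.map { |t| t[0].slice(0..2) }.uniq

initial.each do |i|
  word_count.each do |word, count|
    if word.start_with?(i)
      puts [count.to_s.rjust(4), word].join(" | ")
    end
  end
end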
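
# frozen_string_literal: true

# Scrape the .vcard text from each page of the blog post's comments
# and write the results to word.txt, one entry per line.
require "nokogiri"
require "open-uri"

max = 1000
# Comment pages are addressed by offset in steps of 50: 0, 50, ..., 1000.
page = 0.step(max, 50).to_a
words = []

page.each do |offset| # renamed from p to avoid shadowing Kernel#p
  url = "http://blog.nogizaka46.com/karin.itou/2018/11/047843.php?cp=#{offset}#comments"
  doc = Nokogiri::HTML(URI.open(url, "User-Agent" => "firefox"))
  # concat keeps words flat, so a page with no matches adds nothing
  # (puts on an empty nested array would write a blank line).
  words.concat(doc.css(".vcard").map { |i| i.text.gsub(/\s+/, "") })
end

File.open("word.txt", "w") do |f|
  words.each { |w| f.puts(w) }
end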
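
The tallying and prefix-grouping steps of the counting script can be tried without scraping anything. What follows is only a minimal sketch: the sample words, the `top_n` cutoff, and the `grouped` variable are made up for illustration and are not part of the gist.

```ruby
# frozen_string_literal: true

# Hypothetical sample data standing in for the lines of word.txt.
sample = %w[apple apple applesauce banana banana banana cherry]

# Tally how often each word occurs, using a hash that defaults to 0.
word_count = sample.each_with_object(Hash.new(0)) { |word, hash| hash[word] += 1 }

# Take the most frequent words and reduce them to unique 3-character prefixes.
top_n = 2 # assumed cutoff for this sketch; the gist uses 20
prefixes = word_count.sort_by { |_, count| -count }
                     .take(top_n)
                     .map { |word, _| word[0, 3] }
                     .uniq

# Collect every word that shares one of those prefixes, with its count.
grouped = prefixes.flat_map do |prefix|
  word_count.select { |word, _| word.start_with?(prefix) }
            .map { |word, count| [count, word] }
end

grouped.each { |count, word| puts [count.to_s.rjust(4), word].join(" | ") }
```

Grouping by prefix rather than by exact match lets near-duplicate entries (here `apple` and `applesauce`) show up next to each other, which is presumably why the gist does the same with the first three characters.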