@phaedryx
Last active December 16, 2015 02:49
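require 'nokogiri'
require 'open-uri'

# Count how often each word appears across all General Conference talks.
words = Hash.new(0)

(1971..2013).each do |year|
  %w[04 10].each do |month| # conferences are held in April and October
    # URI.open is used because Kernel#open no longer opens URLs on Ruby 3.0+.
    talk_list = Nokogiri::HTML(URI.open("http://www.lds.org/general-conference/sessions/#{year}/#{month}?lang=eng"))
    talk_list.css('a.print').each do |link|
      # Each "print" link is fetched and parsed as one talk.
      talk = Nokogiri::HTML(URI.open(link['href']))
      # Replace punctuation with spaces, split on whitespace, and tally lowercase words.
      talk.text.gsub(/[[:punct:]]/, ' ').split.each { |word| words[word.downcase] += 1 }
    end
  end
end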
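
The script above fills `words` but never reports anything. The following is a minimal sketch of how the tally could be tested offline and summarized. Both `count_words` and `top_words` are hypothetical helpers, not part of the original gist, and the sample sentence is made up for illustration.

```ruby
# Hypothetical helper (not in the original script): tally the words in one
# string using the same punctuation-stripping and downcasing steps.
def count_words(text, counts = Hash.new(0))
  text.gsub(/[[:punct:]]/, ' ').split.each { |word| counts[word.downcase] += 1 }
  counts
end

# Hypothetical helper: the n most frequent words, highest count first,
# with ties broken alphabetically so the order is deterministic.
def top_words(counts, n = 10)
  counts.sort_by { |word, count| [-count, word] }.first(n)
end

# Made-up sample text standing in for a downloaded talk.
sample = count_words("Faith, hope, and charity. Faith and hope!")
top_words(sample, 3).each { |word, count| puts "#{word}: #{count}" }
```

Calling `top_words(words, 25)` after the scraping loop would presumably list the most common words overall. Because punctuation is replaced with spaces, contractions such as "don't" are counted as two separate tokens.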