@june29
Created November 27, 2008 23:58
require "rubygems"
require "open-uri"
require "nokogiri"
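# Returns [word, count] pairs for the page at url, most frequent first.
def count_word(url)
  # URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs.
  text = fetch_text(Nokogiri::HTML(URI.open(url)))
  text.scan(/\w+/).inject(Hash.new(0)) { |count, word|
    count[word.downcase] += 1
    count
  }.sort { |a, b|
    b[1] <=> a[1]
  }
end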
# Recursively concatenates the text nodes under e, adding a newline after each child.
def fetch_text(e)
  if e.is_a? Nokogiri::XML::Text
    return e.text
  end
  e.children.inject(String.new) { |text, child|
    text << fetch_text(child)
    text << "\n"
    text
  }
end
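A minimal usage sketch of the counting step, assuming you want to try it without network access or Nokogiri. The helper `count_words_in` and the sample sentence are hypothetical and not part of the original gist; the helper applies the same scan, tally, and sort steps as `count_word` to a plain string.

```ruby
# Hypothetical helper (not in the original gist): the counting step of
# count_word, applied to a plain string instead of a fetched page.
def count_words_in(text)
  text.scan(/\w+/).inject(Hash.new(0)) { |count, word|
    count[word.downcase] += 1   # case-insensitive tally
    count
  }.sort { |a, b| b[1] <=> a[1] } # highest count first
end

sample = "The cat and the dog and the bird"

# Print the two most frequent words.
count_words_in(sample).first(2).each do |word, n|
  puts "#{word}: #{n}"
end
# prints:
#   the: 3
#   and: 2
```

With the gist's own code loaded, `count_word(url).first(10)` would give the ten most frequent words of a page in the same `[word, count]` form.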