@samsm
Created January 16, 2010 21:22
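require 'rubygems'
require 'open-uri'
require 'hpricot'

# Fetch and parse the dictionary page. On Ruby 3.0+ Kernel#open no longer
# fetches URLs; use URI.open(...) (available since Ruby 2.5) instead.
doc = Hpricot(open('http://www.haitisurf.com/dictionary.shtml'))

# Keep only <font> text of the form "word - translation"
# (the pattern cannot match an empty string, so blanks are dropped too).
text_array = (doc/'font').
  collect { |elem| elem.to_plain_text }.
  select { |txt| txt =~ /.+-.+/ }

# Print each pair and collect them into a { haitian_word => translation } hash.
dictionary = text_array.inject({}) do |memo, txt|
  # Split on the first hyphen only, so hyphens in the translation survive.
  haitian_word, translation = txt.split(/\s*-\s*/, 2)
  puts "H: #{haitian_word} ; Translation: #{translation}"
  memo[haitian_word] = translation
  memo # inject requires the block to return the accumulator
end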