@scepion1d
Created January 20, 2012 09:15
Ruby class for downloading a web page and converting it to UTF-8 encoding
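require "net/http"
require "uri"
require "ensure"

# Downloads a web page, following redirects, and returns its body as UTF-8.
class Downloader
  # Expected encodings, tried in order ('KOI8-R' is the name Ruby recognizes for KOI-8)
  ENCODINGS = ['UTF-8', 'WINDOWS-1251', 'KOI8-R']
  # Upper bound on redirects, so a redirect loop cannot hang the download
  MAX_REDIRECTS = 10

  def get_page(url)
    uri = URI.parse(url)
    response = Net::HTTP.get_response(uri)
    redirects = 0
    while response.kind_of?(Net::HTTPRedirection)
      raise "Too many redirects for #{url}" if (redirects += 1) > MAX_REDIRECTS
      # Location may be relative, so resolve it against the current URI
      uri = URI.join(uri.to_s, response['Location'])
      response = Net::HTTP.get_response(uri)
    end
    translate_to_utf(response.body)
  end

  # Guesses the page encoding and converts the page to UTF-8 if needed
  def translate_to_utf(page)
    encoding = Ensure::Encoding.guess_encoding(page, ENCODINGS).name
    if encoding != 'UTF-8'
      page.ensure_encoding('UTF-8', :external_encoding => encoding)
    else
      page
    end
  end
end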
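
The guess-then-convert step in translate_to_utf can also be sketched with Ruby's standard library alone. This is a minimal illustration, not the gist author's method or the behavior of the ensure gem: the constant CANDIDATE_ENCODINGS and the helpers guess_encoding and to_utf8 are hypothetical names introduced here. It assumes the first candidate under which the bytes are valid is the right one, so the order of the list matters (single-byte encodings accept almost any input, so later candidates are rarely reached).

```ruby
# Hypothetical stdlib-only helpers; not part of the original gist or the ensure gem.
CANDIDATE_ENCODINGS = ['UTF-8', 'Windows-1251', 'KOI8-R'].freeze

# Returns the name of the first candidate under which the bytes are valid, or nil.
def guess_encoding(bytes, candidates = CANDIDATE_ENCODINGS)
  candidates.find do |name|
    # Work on a copy so the caller's string keeps its original encoding tag
    bytes.dup.force_encoding(name).valid_encoding?
  end
end

# Returns a UTF-8 copy of the bytes, transcoding from the guessed encoding.
def to_utf8(bytes, candidates = CANDIDATE_ENCODINGS)
  name = guess_encoding(bytes, candidates)
  raise ArgumentError, 'no candidate encoding matched' if name.nil?
  bytes.dup.force_encoding(name).encode('UTF-8')
end

# Usage: simulate a response body that arrived as Windows-1251 bytes
body = "\u041F\u0440\u0438\u0432\u0435\u0442".encode('Windows-1251').b
page = to_utf8(body)
puts page.encoding
```

Checking validity before transcoding keeps pages that are already UTF-8 untouched, which mirrors the `encoding != 'UTF-8'` branch in the class above.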