@CountCulture
Created November 2, 2009 10:13
require 'yaml'
require 'digest/md5'
require 'httpclient'
require 'nokogiri'

# Fetches each dataset page listed in db/ons_data/current_datasets.yml,
# follows its "Comma separated values" link and saves the response as a .zip file.
class OnsReader
  BaseUrl = "http://neighbourhood.statistics.gov.uk/dissemination/"

  def self.process
    data_dir = File.join(RAILS_ROOT, "db/ons_data")
    # Only the first 101 entries (indices 0..100) are processed
    sets = YAML.load_file(File.join(data_dir, "current_datasets.yml"))[0..100]
    client = HTTPClient.new
    sets.each do |set|
      download_page = client.get_content(set[:url])
      # Look for the anchor whose text mentions the CSV download
      download_link = Nokogiri::HTML(download_page).at('a[text()*="Comma separated values"]')
      if download_link
        resp = client.get(download_link[:href])
        puts "Response header: #{resp.header}"
        file_name = "#{Digest::MD5.hexdigest(set[:title])}.zip"
        # The "wb" mode is an argument to File.open, not part of the path
        File.open(File.join(data_dir, file_name), "wb") do |file|
          file.write(resp.body)
          puts "written file for #{set[:title]} (#{file_name})"
        end
      else
        puts "Failed to find download link for #{set[:title]} (#{set[:url]})"
      end
    end
  end
end
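
The class defines BaseUrl but never uses it. If the download links on the dataset pages turn out to be relative (an assumption; the gist does not say), they would need resolving against that base before being passed to the HTTP client. A minimal sketch of that step, using only the standard library; `BASE_URL` and `resolve_download_url` are hypothetical names introduced here for illustration, not part of the original class:

```ruby
require 'uri'

# Hypothetical constant mirroring OnsReader::BaseUrl from the gist
BASE_URL = "http://neighbourhood.statistics.gov.uk/dissemination/"

# Hypothetical helper: returns an absolute URL string for a link's href.
# Relative hrefs are resolved against BASE_URL; absolute hrefs pass through unchanged.
def resolve_download_url(href, base = BASE_URL)
  URI.join(base, href).to_s
end
```

Inside `process` this could be used as `client.get(resolve_download_url(download_link[:href]))`, assuming the hrefs are valid URI references.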