@breckenedge
Created March 29, 2011 15:19
Used to download PDFs from the OSTI.GOV Information Bridge.
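require 'rubygems'
require 'mechanize'
require 'progressbar'

# URLS.txt holds one document URL per line; blank lines are skipped.
docs = File.readlines('URLS.txt').map(&:strip).reject(&:empty?)
pb = ProgressBar.new('downloading', docs.length)

docs.each do |url|
  filename = File.basename(url)
  # Skip files already downloaded, so the script can be re-run safely.
  unless File.exist?(filename)
    agent = Mechanize.new
    agent.get(url) # get a session cookie
    body = agent.get(url).body # download the file
    # Write only after the download succeeds, so a failed request does
    # not leave an empty file that later runs would skip.
    File.open(filename, 'wb') { |f| f << body }
    sleep rand(5) # pause 0-4 seconds between downloads
  end
  pb.inc
end
pb.finish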
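The script decides what to skip by checking whether a file named after the last path segment of each URL already exists. A minimal sketch of that check as a standalone, standard-library-only helper follows; `pending_downloads` and its `dir` parameter are hypothetical names introduced here for illustration and are not part of the original script.

```ruby
# Hypothetical helper (not in the original gist): given the raw lines of
# URLS.txt, return the URLs whose target file does not yet exist in `dir`.
# The target name is File.basename(url), the same rule the script uses.
def pending_downloads(lines, dir = '.')
  lines.map(&:strip)
       .reject(&:empty?)
       .reject { |url| File.exist?(File.join(dir, File.basename(url))) }
end

# Possible usage: list what is still left to fetch before starting a run.
#   puts pending_downloads(File.readlines('URLS.txt'))
```

Running this before a long batch gives a rough idea of how many requests remain, without touching the network.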