@mattdsteele
Created September 3, 2011 15:06
Triathlon scrapin'
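#!/usr/bin/ruby
# Downloads the full-size race photos for one bib number from a backprint.com event page.
require 'net/http'

# Builds the event URL for the bib number given as the first command-line argument.
# Prints a usage message and exits with status 1 if no bib number was supplied.
def get_url
  page = 'http://www.backprint.com/view_user_event.asp?PID=bp%18yG&EVENTID=81207&PWD=&BIB='
  bib_number = ARGV[0]
  if bib_number.nil?
    puts "usage: scraper.rb bib_number"
    exit 1
  end
  page + bib_number
end

# Fetch the event page and save it to a temporary file for the shell pipeline below.
url = get_url
html = Net::HTTP.get_response(URI.parse(url))
homepage = "homepage.html"
File.open(homepage, 'w') { |f| f.write(html.body) }

# Pull the thumbnail image URLs out of the page, keep the brightroom ones, swap the
# "t.jpg" (thumbnail) suffix for "f.jpg" (full size), and download each with wget.
# system is used instead of exec: exec replaces the Ruby process, so the cleanup
# on the last line would never run.
system 'for i in `grep "thumbnail" homepage.html | sed "s/.*<img class=\"thumbnail\" src=\"\(.*\)\".*/\1/g" | grep brightroom | sed "s/t\.jpg/f\.jpg/g"` ; do wget $i ; done'
File.delete(homepage)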