Skip to content

Instantly share code, notes, and snippets.

@emarthinsen
Created November 1, 2012 20:00
Show Gist options
  • Save emarthinsen/3996079 to your computer and use it in GitHub Desktop.
Save emarthinsen/3996079 to your computer and use it in GitHub Desktop.
URL Scraper
def scrape_url(url)
url_found = false
twitter_name = nil
begin
agent = Mechanize.new do |a|
a.follow_meta_refresh = true
end
agent.get(normalize_url(url)) do |page
url_found = true
twitter_name = find_twitter_name(page)
end
@err << "[#{@current_record}] SUCCESS\n"
rescue Exception => e
@err << "[#{@current_record}] ERROR (#{url}): "
@err << e.message
@err << "\n"
end
[url_found, twitter_name]
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment