@mehdi-farsi
Created November 5, 2014 23:05
[Web Scraping/Crawling] Following the redirect URL on a 301/302 HTTP status code
# Concrete case: Google search result URLs
#
# Requirement:
# gem install nokogiri
require 'nokogiri'
require 'net/http'
require 'uri'
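
# Fetches url and returns the parsed HTML, following at most one 301/302 redirect.
def get(url)
  uri = URI.parse(url) # URI.escape was removed in Ruby 3.0; pass an already percent-encoded URL
  response = Net::HTTP.get_response(uri) # get headers + body
  if ['301', '302'].include?(response.code) # response.code is a String, hence the quoted codes
    # The redirect target is in the 'Location' header; it may be relative, so resolve it against uri.
    # It is already encoded by the server, so it must not be escaped again.
    redirect_uri = URI.join(uri.to_s, response['Location'])
    Nokogiri::HTML(Net::HTTP.get(redirect_uri)) # get body of the redirect target
  else
    Nokogiri::HTML(response.body) # save a call by reusing the already downloaded response
  end
end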
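
The snippet above follows a single redirect only, so a chain such as 301 then 302 would end with the body of an intermediate redirect page being parsed. A minimal sketch of one way to follow a chain with an upper bound is shown below. The name `get_following_redirects`, the `max_requests` parameter, the `fetcher` argument, and the wider list of 3xx codes are assumptions for illustration and are not part of the original gist. It returns the final response rather than a parsed document, so it depends only on the standard library.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not from the original gist): issues at most
# max_requests requests, following redirects, and returns the final response.
# fetcher is an assumption added so the HTTP call can be replaced in tests;
# by default it is Net::HTTP.get_response.
def get_following_redirects(url, max_requests = 5, fetcher = Net::HTTP.method(:get_response))
  uri = URI.parse(url)
  max_requests.times do
    response = fetcher.call(uri)
    # Status codes are Strings; anything outside this list is returned as-is.
    return response unless %w[301 302 303 307 308].include?(response.code)

    location = response['Location']
    raise ArgumentError, "redirect without Location header from #{uri}" if location.nil?

    # Location may be relative, so resolve it against the current URI.
    uri = URI.join(uri.to_s, location)
  end
  raise "too many redirects (max_requests: #{max_requests})"
end
```

With Nokogiri installed, the result can be parsed the same way as in `get`, for example `Nokogiri::HTML(get_following_redirects(url).body)`. Bounding the number of requests keeps a redirect loop from running forever.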