A small Ruby script that demonstrates how to use Mechanize to scrape product details from an array of Zappos.com product URLs and write them to a CSV file.
# http://nokogiri.org/Nokogiri/XML/Node.html#method-i-css
require 'mechanize'
require 'csv'
puts "Product Scraper!!!"
puts ' '
urls = [
  "http://www.zappos.com/seavees-teva-universal-sandal-concrete",
  "http://www.zappos.com/teva-bomber-sandal-dark-olive",
  "http://www.zappos.com/teva-jetter-cigar"
]
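file = "product_data.csv"
header = %w[title sku image alt_images]

# Reuse one agent for every request instead of creating one per URL.
agent = Mechanize.new

# CSV.open (rather than File.open) quotes each field and ends each row itself.
CSV.open(file, "w") do |csv|
  csv << header

  urls.each do |url|
    puts url
    page = agent.get(url)

    # Keep only the part of the page title before the first " - ".
    title = page.title.to_s.split(' - ', 2).first.to_s.strip

    # Drop the first four characters of the #sku text (assumed to be a label).
    sku = page.search("#sku").inner_text[4..-1].to_s

    prod_image = page.search("#detailImage img").first
    image_src = prod_image ? prod_image[:src] : nil # nil if the selector matches nothing

    # Alternate image URLs go into a single field, separated by "|".
    alt_images = page.search("#productImages ul li a img").map { |img| img[:src] }.join("|")

    # Read but not written to the CSV; the header has no brand column.
    brand_text = page.search("#brandText").inner_text

    csv << [title, sku, image_src, alt_images]
  end
end

2.times { puts "" }
puts "Done!"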
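
The string cleanup and row formatting can be tried without hitting the network. The sketch below is not part of the original script: `clean_title`, `clean_sku`, and `product_row` are hypothetical helper names, and the sample title, SKU text, and example.com URLs are made-up inputs assumed to resemble what the pages return. It uses only Ruby's `csv` library to show how a title containing a comma ends up quoted in the output row.

```ruby
require 'csv'

# Hypothetical helpers (not in the original script) that mirror its string cleanup.

# Keep only the part of a page title before the first " - ".
def clean_title(raw)
  raw.to_s.split(' - ', 2).first.to_s.strip
end

# Drop the first four characters, as the script does, assuming they hold a label.
def clean_sku(raw)
  raw.to_s[4..-1].to_s
end

# Build one CSV row; alternate image URLs are joined with "|" into a single field.
def product_row(title, sku, image, alt_images)
  [clean_title(title), clean_sku(sku), image, alt_images.join('|')]
end

# Made-up sample values, for illustration only.
row = product_row(
  'Example Sandal, Olive - Example Store',
  'SKU 12345',
  'http://example.com/main.jpg',
  ['http://example.com/a.jpg', 'http://example.com/b.jpg']
)

# CSV.generate_line applies the same quoting that `csv << row` uses inside CSV.open.
line = CSV.generate_line(row)
puts line
```

Writing rows through the CSV library instead of a plain file handle matters as soon as a title contains a comma or a quote: the library wraps that field in quotes so the row still has four columns when it is read back.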