@benjamintanweihao
Created February 12, 2014 08:55
# encoding: utf-8
require 'parallel'  # gem: runs the lookups concurrently
require 'cgi'       # CGI.unescapeHTML
require 'open-uri'  # HTTP fetching; also loads 'uri'
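# Fetches the Amazon product page for an ASIN and returns the cleaned-up
# page title, or "" if the request or the parsing fails.
def product_name(asin)
  url = "http://www.amazon.com/dp/#{asin}"
  begin
    # URI.open is the open-uri entry point; bare Kernel#open no longer
    # accepts URLs on Ruby 3.0+.
    title = URI.open(url).read[/<title>(.*?)<\/title>/, 1]
    title = title.to_s.gsub("Amazon.com: ", "").gsub(": Video Games", "")
    CGI.unescapeHTML(title)
  rescue StandardError
    ""
  end
end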
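# Append to the output file; sync so each line is flushed as it is written.
file = File.open("asin_product_name.txt", "a")
file.sync = true

# Read one ASIN per line. Use strip, not strip!: strip! returns nil when
# there is nothing to strip (e.g. a last line with no trailing newline).
asins = File.readlines("ASIN").map(&:strip).reject(&:empty?)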
# Look up the names on 10 threads and record every ASIN that resolved.
Parallel.map(asins, in_threads: 10) do |asin|
  name = product_name(asin)
  puts "[#{asin}] #{name}"
  file.puts [asin, name].join("\t") unless name.empty?
end
file.close