@clvv
Created August 23, 2011 07:00
Mechanize Scraping Script for HP SMB Order Status Checking Page
require 'mechanize'

agent = Mechanize.new
agent.max_history = 1 # keep only the current page in memory

result = {} # PO number => order status text

# Resume from the last shipped PO number recorded for each model.
# Both files must already exist and contain an integer.
latest_shipped_16 = File.read('latest_shipped_16').to_i
latest_shipped_32 = File.read('latest_shipped_32').to_i

([latest_shipped_16, latest_shipped_32].min..4400000).each do |i|
  # Query the order status page for PO number i.
  agent.get "https://h20497.www2.hp.com/os/public.tcl?po=#{i}&orderno=&webno=+&attn=&delvno=&reqField=po_date&podate=20-Aug-2011&display.x=0&display.y=0&sort=a-po"
  status = agent.page.search('td:nth-child(4) .udrline').text
  link = agent.page.search('td td:nth-child(2) .udrline').first
  next unless link # no order listed for this PO number

  # Follow the order detail link and detect the model from the page body.
  agent.get link['href']
  body = agent.page.body
  model = if body =~ /16G/
            '16G'
          elsif body =~ /32G/
            '32G'
          else
            'NONE'
          end

  # puts i, status
  result[i] = status
  next unless status == 'Shipped'

  # Record this PO number as the latest shipped order for its model.
  if model == '32G'
    File.open('latest_shipped_32', 'w') { |file| file.write i }
  elsif model == '16G'
    File.open('latest_shipped_16', 'w') { |file| file.write i }
  end

  puts i
  puts model
  puts
end
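
The script above does its model detection and checkpoint handling inline, which makes those steps hard to try without hitting the HP server. The following is a minimal offline sketch of the same two steps as standalone helpers. The names `detect_model`, `read_checkpoint`, and `write_checkpoint` are hypothetical and not part of the original gist, and the fallback to a default value when a checkpoint file is missing is an assumption (the original script raises in that case).

```ruby
# Hypothetical helper: classify an order page body by the model string it
# mentions. Like the script, '16G' is checked first, so a page mentioning
# both strings is reported as '16G'.
def detect_model(body)
  case body
  when /16G/ then '16G'
  when /32G/ then '32G'
  else 'NONE'
  end
end

# Hypothetical helper: read a saved PO number from a checkpoint file.
# Assumption: return `default` when the file does not exist yet.
def read_checkpoint(path, default = 0)
  File.exist?(path) ? File.read(path).to_i : default
end

# Hypothetical helper: overwrite the checkpoint file with a PO number.
def write_checkpoint(path, po_number)
  File.open(path, 'w') { |file| file.write(po_number.to_s) }
end
```

With helpers like these, the loop body could call `detect_model(agent.page.body)` and `write_checkpoint("latest_shipped_32", i)` instead of repeating the file handling for each model.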