@gmarik
Created October 16, 2010 18:35
#!/usr/bin/env ruby
#
# http://gmarik.info/blog/2010/10/16/scraping-asp-net-site-with-mechanize
require 'rubygems'
require 'mechanize'
require 'logger'
user = 'user'
pass = 'pass'
login_url = 'https://asp.net.site'
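class Mechanize::Page::Link
  # Emulate clicking an ASP.NET postback link: fill the hidden event fields
  # and submit the page's main form. If action_arg is given, it is the index
  # of the postback argument to use as the form's action URL.
  def asp_click(action_arg = nil)
    args = asp_link_args
    etarget, earg = args.values_at(0, 1)
    f = page.form_with(:name => 'aspnetForm')
    # Use a single element here: values_at returns an Array, not a String.
    f.action = args[action_arg] if action_arg
    f['__EVENTTARGET'] = etarget
    f['__EVENTARGUMENT'] = earg
    f.submit
  end

  # Extract the arguments of the JavaScript call in the link's href,
  # with surrounding quotes stripped. Returns nil if the href has no call.
  def asp_link_args
    href = attributes['href']
    href =~ /\(([^()]+)\)/ && $1.split(/\W?\s*,\s*\W?/).map(&:strip).map { |i| i.gsub(/^['"]|['"]$/, '') }
  end
end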
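
# Illustration only: parse_postback_args is a hypothetical helper, not part of
# the original gist and not used by the script below. It is a minimal sketch of
# what asp_link_args extracts, assuming the href has one of the usual ASP.NET
# WebForms postback shapes (the control names and Page.aspx are made up).
def parse_postback_args(href)
  # Take the innermost parenthesised argument list, if any.
  inner = href[/\(([^()]+)\)/, 1]
  return nil unless inner
  # Split on commas and strip the quotes around each argument.
  inner.split(/\W?\s*,\s*\W?/).map { |arg| arg.strip.gsub(/^['"]|['"]$/, '') }
end
# parse_postback_args("javascript:__doPostBack('ctl00$Main$lnkDownload','')")
#   # => ["ctl00$Main$lnkDownload", ""]   (event target, event argument)
# parse_postback_args('javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("ctl00$x", "", true, "", "Page.aspx", false, true))')
#   # => ["ctl00$x", "", "true", "", "Page.aspx", "false", "true"]
# In the second shape the action URL sits at index 4, which is presumably why
# the script calls asp_click(4) for that kind of link.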
agent = Mechanize.new do |a|
  a.log = Logger.new($stdout)
  a.log.level = Logger::INFO
  a.user_agent_alias = 'Mac Safari'
end
page = agent.get(login_url)
f = page.form_with(:name => 'LgForm')
f['lgLg$User'] = user
f['lgLg$Pass'] = pass
page = agent.submit(f, f.buttons.first)
# Follow links
file = page.link_with(:text => 'Journall').asp_click.
  link_with(:text => 'All Downloads').asp_click(4).
  link_with(:text => 'Download').asp_click
file.save_as(file.filename)
puts 'Done!'
puts "\nFile: #{file.filename}"