@danielwellman
Created July 20, 2012 19:29
Scrape Payment Receipts from the EasyPayMetroCard website
#! /usr/bin/env ruby
# Downloads "Payment Received" transactions from the EasyPayMetroCard
# website (www.easypaymetrocard.com).
#
# Requires capybara and capybara-webkit for headless JavaScript execution -
# the EasyPayMetroCard site requires JavaScript to obtain account activity.
#
# To install capybara-webkit, you must first install Qt, a cross-platform
# development kit. For instructions, see
# https://github.com/thoughtbot/capybara-webkit/wiki/Installing-Qt-and-compiling-capybara-webkit
require 'rubygems'
require 'capybara'
require 'capybara/dsl'
require 'capybara-webkit'
class Scraper
  Capybara.run_server = false
  Capybara.default_driver = :webkit
  Capybara.app_host = 'http://www.easypaymetrocard.com'

  include Capybara::DSL

  def run(account_number, password, start_date, end_date)
    # Login
    visit("/")
    fill_in("iAccountNumber", :with => account_number)
    fill_in("iPassword", :with => password)
    click_button("Signin")

    # Account Summary Page
    click_link("Account Activity")

    # Account Activity Page
    fill_in("HStartDate", :with => start_date)
    fill_in("HEndDate", :with => end_date)
    find("#Go1").click

    # Print the header row
    puts find(:xpath, "//table[@id='StatementTable']/tbody/tr").text

    # Print every "Payment Received" row, following the "Next" link until
    # the last page of results is reached
    begin
      payment_rows = all(:xpath, "//table[@id='StatementTable']/tbody/tr[contains(., 'Payment Received')]")
      payment_rows.each { |tr| puts tr.text }

      if page.has_link?("Next")
        click_link("Next")
      else
        @done = true
      end
    end until @done
  end
end
if ARGV.size != 4
  puts "Usage: ruby easypaymetrocard_charges.rb <account_number> <password> <start_date> <end_date>"
  puts
  puts "- Date format is MM/DD/YYYY"
  exit(1)
end
scraper = Scraper.new
scraper.run(ARGV[0], ARGV[1], ARGV[2], ARGV[3])
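The row-filtering XPath above (`tr[contains(., 'Payment Received')]`) can be exercised offline against a static snippet using only Ruby's standard-library REXML; the table markup below is illustrative, not the site's actual HTML:

```ruby
require 'rexml/document'

# Made-up statement table in the same shape the XPath above expects
html = <<XML
<table id='StatementTable'>
  <tbody>
    <tr><td>Date</td><td>Description</td><td>Amount</td></tr>
    <tr><td>07/01/2012</td><td>Payment Received</td><td>$30.00</td></tr>
    <tr><td>07/05/2012</td><td>Fare Deducted</td><td>$2.25</td></tr>
  </tbody>
</table>
XML

doc = REXML::Document.new(html)

# contains(., '...') matches rows whose concatenated text includes the phrase
payment_rows = REXML::XPath.match(doc,
  "//table[@id='StatementTable']/tbody/tr[contains(., 'Payment Received')]")

payment_rows.each { |tr| puts tr.elements.to_a.map(&:text).join(" ") }
```

This selects only the payment row, skipping the header and fare rows, which is the same filtering the scraper applies per page before clicking "Next".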