@leehambley
Created December 13, 2012 15:11
Example Class to wrap a sanitised Capybara job running on a remote host.

Design Goals:

  • Usability as a class
  • Queueable
  • Expressive and safe API
  • Easily toggleable between JS and non-JS mode
  • Use Capybara to do all the heavy lifting

## Usage

A worker that uses Capybara to open remote pages, click links, and otherwise interact with the page. Usage:

`Resque.enqueue(Crawl, 123, {enable_javascript: false})` # Disable JS, and enqueue via Resque.
`Resque.enqueue(Crawl, 123, {enable_javascript: true})`  # Enable JS (this is the default, so the option can be omitted) and enqueue via Resque.

`Crawl.new(123).run` # JavaScript enabled by default; runs without the background queue.
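
Resque expects a job class to expose a queue name and a class-level `perform` method, and it serialises job arguments as JSON, so option keys arrive as strings rather than symbols. A minimal sketch of a wrapper that bridges that gap is shown here; `CrawlJob`, `DEFAULTS`, and `normalize_options` are hypothetical names that are not part of the gist, and the sketch assumes the `Crawl` class is loaded when `perform` is called.

```ruby
# Hypothetical wrapper: CrawlJob, DEFAULTS and normalize_options are
# illustrative names, not part of the gist.
class CrawlJob
  # Resque reads the queue name from this class-level instance variable.
  @queue = :crawl

  DEFAULTS = { enable_javascript: true }.freeze

  # Resque serialises job arguments as JSON, so option keys arrive as
  # strings; convert them back to symbols and apply the defaults.
  def self.normalize_options(options)
    symbolized = (options || {}).each_with_object({}) do |(key, value), memo|
      memo[key.to_sym] = value
    end
    DEFAULTS.merge(symbolized)
  end

  # Resque invokes .perform with the decoded arguments.
  # Assumes the Crawl class from this gist is loaded.
  def self.perform(job_id, options = {})
    Crawl.run(job_id, normalize_options(options))
  end
end
```

With Resque loaded, enqueueing could then look like `Resque.enqueue(CrawlJob, 123, enable_javascript: false)`. Keeping the normalisation in one place means the rest of the class can keep using symbol keys.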
# encoding: utf-8
require 'capybara'
require 'capybara/poltergeist' # registers the :poltergeist (JavaScript-capable) driver

class Crawl
  attr_reader :job_id, :options

  #
  # Called at the class level, always by the queue backend,
  # so we instantiate and run one!
  #
  def self.run(job_id, options = {})
    new(job_id, options).run
  end

  # options: {enable_javascript: true|false}; JavaScript is enabled by default.
  def initialize(job_id, options = {})
    @job_id, @options = job_id, options
  end
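
  def run
    # Load the start page first; the scripted steps assume a page is open.
    capybara_session.visit(job_host)

    safe_job_parts(fake_user_script_from_the_db).each do |step|
      case step[:action]
      when :click_link
        capybara_session.click_link(step[:target])
      when :fill_in
        # Capybara's fill_in takes the value via the :with option.
        capybara_session.fill_in(step[:target], with: step[:arguments])
      else
        raise ArgumentError, "Unsupported action: #{step[:action].inspect}"
      end
    end

    #
    # !!!! At this point we should be where the user wanted to be when
    # they were navigating. It's time to save the page to a tempfile
    # and process it with the next step (or maybe don't use a tempfile,
    # I don't mind either way :-D)
    #
    capybara_session.save_page # returns the path of the saved HTML file
  end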

  private

  #
  # This method should return something like:
  #
  # [
  #   {action: :click_link, target: "Login", arguments: nil},
  #   {action: :fill_in, target: "#username", arguments: "example@gmail.com"},
  #   {action: :fill_in, target: "#password", arguments: "correct horse battery staple"}
  # ]
  #
  def safe_job_parts(script)
    # This should load the job (a Capybara-ish script from the database)
    # and apply the same regexp magic to it.
    script.lines.map(&:strip).reject(&:empty?).map do |line|
      # Full magic: http://rubular.com/r/T6GEaIsxrq
      md = line.match(/(?<action>[^\(]*)\((?<target>"[^"]*")(?:,?\s?(?<argument>"[^"]*")|)\)/)
      raise ArgumentError, "Unparseable step: #{line.inspect}" unless md

      # The captures include the surrounding double quotes, so remove them.
      {action: md[:action].to_sym, target: md[:target].delete('"'), arguments: md[:argument]&.delete('"')}
    end
  end
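
  def job_host
    # Should be grabbed from the database
    "http://www.example.com/"
  end

  def capybara_session
    @_capybara_session ||= begin
      # The target is a remote host, not a local Rack app, so don't boot a
      # server; point Capybara at the host instead.
      Capybara.run_server = false
      Capybara.app_host = job_host
      Capybara::Session.new(driver)
    end
  end

  def driver
    # NOTE: Capybara ships no :webrat driver; a non-JS driver able to reach
    # remote hosts must be registered under this name (or substituted here).
    options.fetch(:enable_javascript, true) ? :poltergeist : :webrat
  end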

  # This is too naïve
  def fake_user_script_from_the_db
    <<~EOSCRIPT
      click_link("Login")
      fill_in("#username", "example@gmail.com")
      fill_in("#password", "correcthorsebatterystaple")
    EOSCRIPT
  end
end
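
The parse-then-dispatch idea in the class above can be tried without Capybara installed. The following is a rough sketch under stated assumptions, not the gist's implementation: `STEP_PATTERN` is a stricter, anchored variant of the regexp above, and `ALLOWED_ACTIONS`, `parse_script`, `run_steps`, and `RecordingSession` are hypothetical names introduced only for illustration. `RecordingSession` stands in for a real Capybara session and merely records the calls it receives.

```ruby
# Self-contained sketch; every name below is hypothetical and not part of
# the gist or of Capybara.
STEP_PATTERN = /\A(?<action>\w+)\("(?<target>[^"]*)"(?:,\s*"(?<argument>[^"]*)")?\)\z/

# Only these actions may ever reach the session.
ALLOWED_ACTIONS = %i[click_link fill_in].freeze

# Turn script text into an array of {action:, target:, arguments:} hashes,
# rejecting lines that do not parse or that name a non-whitelisted action.
def parse_script(script)
  script.lines.map(&:strip).reject(&:empty?).map do |line|
    md = STEP_PATTERN.match(line)
    raise ArgumentError, "Unparseable step: #{line.inspect}" unless md

    action = md[:action].to_sym
    raise ArgumentError, "Action not allowed: #{action}" unless ALLOWED_ACTIONS.include?(action)

    { action: action, target: md[:target], arguments: md[:argument] }
  end
end

# Stand-in for a Capybara session that just records the calls it receives.
class RecordingSession
  attr_reader :calls

  def initialize
    @calls = []
  end

  def click_link(target)
    @calls << [:click_link, target]
  end

  def fill_in(target, with:)
    @calls << [:fill_in, target, with]
  end
end

# Dispatch each parsed step explicitly rather than via a dynamic send.
def run_steps(session, steps)
  steps.each do |step|
    case step[:action]
    when :click_link then session.click_link(step[:target])
    when :fill_in    then session.fill_in(step[:target], with: step[:arguments])
    end
  end
  session
end

script = <<~SCRIPT
  click_link("Login")
  fill_in("#username", "example@gmail.com")
SCRIPT

session = run_steps(RecordingSession.new, parse_script(script))
```

Checking the action against a whitelist at parse time, and calling the session methods by name instead of through `send`, keeps untrusted script text from reaching arbitrary methods on the session.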