@kcurtin
Created October 19, 2012 16:38
Threading in Ruby: the threaded implementation first, followed by the original sequential one
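# Threaded version: each job page is fetched and parsed in its own thread.
require "nokogiri"
require "open-uri"

def scrape_away(args)
  threads = self.job_url_collection.map do |job_url|
    Thread.new do
      # Strip before fetching so stray whitespace cannot break the request.
      job_url = job_url.strip
      # URI.open comes from open-uri; plain Kernel#open no longer opens URLs on Ruby 3.0+.
      job_doc = Nokogiri::HTML(URI.open(job_url))
      title = job_doc.css(args[:title_selector]).inner_text.strip
      company = job_doc.css(args[:company_selector]).inner_text.strip
      source = self.source
      location = job_doc.css(args[:location_selector]).inner_text.strip
      job_type = job_doc.css(args[:job_selector]).inner_text.strip
      telecommute = "Empty for now"
      description = job_doc.css(args[:description_selector]).inner_text.strip
      # Assumes insert_row is safe to call from several threads at once.
      self.job_database.insert_row([title, source, job_url, company, location, job_type, telecommute, description])
    end
  end
  # Wait for every thread to finish before returning.
  threads.each(&:join)
end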
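
The threaded version above starts one thread per URL, which can open a large number of connections at once when the collection is long. A minimal sketch of one way to cap concurrency, assuming a hypothetical helper named each_in_pool (not part of the original gist) and a Mutex guarding shared state in case the database object is not thread-safe:

```ruby
# Hypothetical helper: runs the block for every item using at most
# pool_size worker threads. Items must not be nil or false, because a
# nil from Queue#pop is what tells a worker the queue is drained.
def each_in_pool(items, pool_size: 4, &block)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # after close, pop returns nil once the queue is empty

  workers = Array.new(pool_size) do
    Thread.new do
      while (item = queue.pop)
        block.call(item)
      end
    end
  end
  workers.each(&:join)
end

# Usage with stand-in work instead of real fetching and parsing.
rows = []
lock = Mutex.new
each_in_pool(%w[a b c d e], pool_size: 2) do |url|
  row = [url.upcase] # placeholder for the scraped fields
  lock.synchronize { rows << row } # serialize writes to shared state
end
```

In scrape_away, the block body would hold the fetch and parse steps, and the call to insert_row would sit inside lock.synchronize.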
# Original version: job pages are fetched and parsed one after another.
require "nokogiri"
require "open-uri"

def scrape_away(args)
  self.job_url_collection.each do |job_url|
    # Strip before fetching so stray whitespace cannot break the request.
    job_url = job_url.strip
    # URI.open comes from open-uri; plain Kernel#open no longer opens URLs on Ruby 3.0+.
    job_doc = Nokogiri::HTML(URI.open(job_url))
    title = job_doc.css(args[:title_selector]).inner_text.strip
    company = job_doc.css(args[:company_selector]).inner_text.strip
    source = self.source
    location = job_doc.css(args[:location_selector]).inner_text.strip
    job_type = job_doc.css(args[:job_selector]).inner_text.strip
    telecommute = "Empty for now"
    description = job_doc.css(args[:description_selector]).inner_text.strip
    self.job_database.insert_row([title, source, job_url, company, location, job_type, telecommute, description])
  end
end
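
To see why threading helps with this kind of I/O-bound work, the two approaches can be timed side by side. This is only a sketch: fake_fetch and the example.com URLs are made up for illustration, and sleep stands in for network latency (like blocking I/O, it lets other Ruby threads run in the meantime).

```ruby
require "benchmark"

# Hypothetical stand-in for fetching a page: waits briefly, then returns the URL.
def fake_fetch(url)
  sleep 0.05
  url.strip
end

urls = Array.new(5) { |i| " http://example.com/jobs/#{i} " }

# One fetch after another: the waits add up.
sequential = Benchmark.realtime do
  urls.each { |url| fake_fetch(url) }
end

# One thread per fetch: the waits overlap.
threaded = Benchmark.realtime do
  urls.map { |url| Thread.new { fake_fetch(url) } }.each(&:join)
end

puts format("sequential: %.2fs, threaded: %.2fs", sequential, threaded)
```

Actual gains for the scraper depend on the network and the remote server, so real timings will differ from this simulation.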