@saasindustries
Created January 25, 2021 13:10
require 'csv'
require 'kimurai'

# Scrapes accountant job listings for Washington, DC from Indeed
# and writes them to jobData.csv.
class JobScraper < Kimurai::Base
  @name = 'acc_job_scraper'
  @start_urls = ["https://www.indeed.com/jobs?q=accountant&l=Washington%2C+DC"]
  @engine = :mechanize

  # Collected [title, company, salary] rows, shared across the crawl.
  @@jobs = []

  # Extract title, company and salary from each job card on the current page.
  def scrape_job_details
    web_page = browser.current_response
    job_list = web_page.css('td#resultsCol')
    job_list.css('div.jobsearch-SerpJobCard').each do |job_card|
      title = job_card.css('h2 a')[0].attributes["title"].value.gsub(/\n/, "")
      company = job_card.css('span.company').text.gsub(/\n/, "")
      salary = job_card.css('div.salarySnippet').text.gsub(/\n/, "")
      job_details = [title, company, salary]
      @@jobs << job_details unless @@jobs.include?(job_details)
    end
  end

  # Kimurai calls parse with the response for each start URL.
  def parse(response, url:, data: {})
    scrape_job_details
    CSV.open('jobData.csv', "w") do |csv|
      @@jobs.each { |job| csv << job }
    end
    @@jobs
  end
end

JobScraper.crawl!