@rergw
Last active September 15, 2018 15:14
General purpose scraper adapted to angel.co
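/*
General purpose scraper adapted to angel.co.
Does not support pagination: only items currently present in the page are collected.
Usage:
1. For other pages, change `map` (column name -> CSS selector, or
   [selector, property] to read a property such as `href`) and `item`
   (the selector matching each result row).
2. Paste the entire code into the browser's DevTools console and run it.
3. The results are copied to the clipboard as tab-separated values and can be pasted into a spreadsheet.
*/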
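// Collected rows: as objects (JSONresults) and as tab-separated lines (TSVresults).
// `var` keeps these global and lets the snippet be pasted into the console repeatedly.
var JSONresults = []
var TSVresults = []
// Column name -> CSS selector (reads innerText), or [selector, property].
var map = {
  name: '.startup-link',
  URL: ['.startup-link', 'href'],
  title: '.collapsed-title',
  compensation: '.collapsed-compensation',
  tags: '.collapsed-tags',
  active: '.tag.active',
  applicants: '.tag.applicants',
  locations: '.tag.locations',
  employees: '.tag.employees'
}
// Selector matching one result row; the `map` selectors are resolved inside each row.
var item = '.header-info'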
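// Header row: one column per key in `map`.
TSVresults.push(Object.keys(map).join("\t"))
document
  .querySelectorAll(item)
  .forEach(function (node) {
    var result = {}
    // `key` rather than `name`: an undeclared `name` would overwrite window.name.
    for (var key in map) {
      // A bare string selector reads innerText; [selector, property] reads that property.
      var spec = Array.isArray(map[key]) ? map[key] : [map[key]]
      var found = node.querySelector(spec[0])
      // A missing element yields an empty cell instead of throwing on null.
      var value = found ? found[spec[1] || 'innerText'] : ''
      // Collapse tabs and line breaks so each value stays within one TSV cell.
      result[key] = String(value == null ? '' : value).replace(/\s+/g, ' ').trim()
    }
    JSONresults.push(result)
    TSVresults.push(Object.values(result).join("\t"))
  })
TSVresults = TSVresults.join("\n")
// copy() is a DevTools console utility; it is not available to ordinary page scripts.
copy(TSVresults)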
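
// A possible extension, sketched under assumptions: `JSONresults` is filled
// above but never exported. The hypothetical helper below (`rowsToTSV` is not
// part of the original script) turns such an array of row objects into TSV,
// taking an explicit column list so that rows with missing keys stay aligned.
function rowsToTSV(rows, columns) {
  // Tabs and line breaks inside a value would split the cell when pasted,
  // so runs of whitespace are collapsed to single spaces.
  function cell(value) {
    return String(value == null ? '' : value).replace(/\s+/g, ' ').trim()
  }
  var lines = [columns.join('\t')]
  rows.forEach(function (row) {
    lines.push(columns.map(function (column) { return cell(row[column]) }).join('\t'))
  })
  return lines.join('\n')
}
// Example use in the DevTools console: copy(rowsToTSV(JSONresults, Object.keys(map)))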