@shiondev
shiondev / SampleScraper.js
Last active March 1, 2024 15:29
SampleScraper
// Sample code for building a web scraper.
//
// The listing scraped in this sample is
// http://www.citysearch.com/profile/10192700/lockhart_tx/black_s_barbecue.html
//
// The full crawler is assumed to start its crawl
// from http://www.houzz.com/professionals/
var EightyApp = function() {
@shiondev
shiondev / download_data
Last active August 29, 2015 14:02
Download updated data
> curl -X GET "https://your_token:@api.datafiniti.net/v2/data/locations/download?view=location_json&q=postalcode:78701%20AND%20dateUpdated:%5B2014-01-01%20TO%20*%5D"
@shiondev
shiondev / download_updated_data
Created May 30, 2014 20:33
Download updated data
> curl -X GET "https://your_token:@api.datafiniti.net/v2/data/locations/download?view=location_json&q=postalcode:78701%20AND%20dateUpdated:%5B2014-01-01%20TO%20*%5D"
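
The `q` parameter holds a query with spaces and square brackets, which have to be percent-encoded before they can appear in a URL. The following is a minimal sketch of building the download URL programmatically. The helper name `buildDownloadUrl` is hypothetical and `your_token` is a placeholder; the endpoint and `view` value are copied from the command above. `encodeURIComponent` also escapes the colons, which should decode to the same query on the server side.

```javascript
// Hypothetical helper: builds the download URL with a percent-encoded query.
// The endpoint and view parameter are taken from the curl command in this gist.
function buildDownloadUrl(token, query) {
  const base = `https://${token}:@api.datafiniti.net/v2/data/locations/download`;
  return `${base}?view=location_json&q=${encodeURIComponent(query)}`;
}

// "your_token" is a placeholder, not a real credential.
const url = buildDownloadUrl(
  "your_token",
  "postalcode:78701 AND dateUpdated:[2014-01-01 TO *]"
);

console.log(url);
```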
@shiondev
shiondev / tutorials_downloadHTML
Created May 22, 2014 17:17
Tutorials > Downloading the HTML For Every Page on a Group of Websites
> curl -X PUT https://your_user_token:@api.80legs.com/v2/urllists/downloadwebsites-1 -H "Content-Type: application/octet-stream" --data-binary "[\"http://www.80legs.com\",\"https://www.datafiniti.net\",\"http://www.harkavagrant.com\"]" -i
> curl -X PUT https://your_user_token:@api.80legs.com/v2/crawls/downloadwebsites -H "Content-Type: application/json" -d "{\"app\": \"CrawlInternalLinks.js\", \"urllist\": \"downloadwebsites-1\", \"max_depth\": 20, \"max_urls\": 100000 }" -i
> curl -X GET https://your_user_token:@api.80legs.com/v2/results/downloadwebsites
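
The hand-escaped request bodies in the first two commands are easy to get wrong. As a sketch, they could be generated with `JSON.stringify` instead. The variable names here are illustrative assumptions; the URLs, app name, URL list name, and limits are the ones used in the commands above.

```javascript
// URL list uploaded in the first command.
const urlList = [
  "http://www.80legs.com",
  "https://www.datafiniti.net",
  "http://www.harkavagrant.com",
];

// Crawl settings sent in the second command.
const crawlConfig = {
  app: "CrawlInternalLinks.js",
  urllist: "downloadwebsites-1",
  max_depth: 20,
  max_urls: 100000,
};

// Serialize both payloads; these strings are the request bodies.
const urlListBody = JSON.stringify(urlList);
const crawlBody = JSON.stringify(crawlConfig);

console.log(urlListBody);
console.log(crawlBody);
```

The resulting strings can be saved to files and passed to curl with `--data-binary @file`, the same form the upload command for `full_page_content.js` uses, which avoids shell quoting altogether.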
@shiondev
shiondev / tutorials_countURLs
Last active August 29, 2015 14:01
Tutorials > Counting How Many URLs a Website Has
> curl -X PUT https://your_user_token:@api.80legs.com/v2/urllists/crawlURLs-1 -H "Content-Type: application/octet-stream" --data-binary "[\"http://www.80legs.com\"]" -i
> curl -X PUT https://your_user_token:@api.80legs.com/v2/crawls/countURLs -H "Content-Type: application/json" -d "{\"app\": \"CrawlInternalLinks.js\", \"urllist\": \"crawlURLs-1\", \"max_depth\": 20, \"max_urls\": 100000 }" -i
> curl -X GET https://your_user_token:@api.80legs.com/v2/crawls/countURLs
@shiondev
shiondev / datafiniti_download
Created May 20, 2014 14:40
Sample Datafiniti download
> curl -X GET "https://your_token:@api.datafiniti.net/v2/data/locations/download?view=location_json&q=postalcode:78701"
@shiondev
shiondev / datafiniti_preview
Created May 20, 2014 14:37
Sample Datafiniti preview call
> curl -X GET "https://your_token:@api.datafiniti.net/v2/data/locations/preview?view=location_json&q=name:%22Kagan%20Creative%22"
@shiondev
shiondev / 80legs_result_sample.json
Last active August 23, 2019 15:50
80legs result sample
[
{
"url": "http://www.80legs.com/",
"result": "your_data_here"
},
{
"url": "http://www.techcrunch.com/",
"result": "your_data_here"
}
]
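
A result file of this shape is an array of objects with a `url` and a `result` field. The sketch below shows one way to load such a file and index it by URL. The helper name `indexByUrl` is hypothetical, and the inline string stands in for the contents of a downloaded result file.

```javascript
// Stand-in for the contents of a downloaded result file (same shape as the sample).
const sample = `[
  { "url": "http://www.80legs.com/", "result": "your_data_here" },
  { "url": "http://www.techcrunch.com/", "result": "your_data_here" }
]`;

// Hypothetical helper: maps each crawled URL to its result payload.
function indexByUrl(results) {
  const byUrl = new Map();
  for (const { url, result } of results) {
    byUrl.set(url, result);
  }
  return byUrl;
}

const byUrl = indexByUrl(JSON.parse(sample));

// Number of distinct URLs in the result file.
console.log(byUrl.size);
```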
@shiondev
shiondev / sample_datafiniti_business_json.json
Last active August 29, 2015 13:58
A sample business record from Datafiniti in JSON
{
"total": 19894,
"records": [
{
"address": "222 Merchandise Mart Plz Ste 111",
"categories": [
"Coffeehouses",
"Grocery Stores",
"Coffee & Tea Shops",
"Restaurants"
@shiondev
shiondev / upload_80app_sample
Last active August 29, 2015 13:57
Upload 80app sample
> curl -X PUT https://your_user_token:@api.80legs.com/v2/apps/full_page_content.js -H "Content-Type: application/octet-stream" --data-binary @/path/to/full_page_content.js -i