@seripap
Created December 18, 2015 16:08
Simple NodeJS Link Crawler
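'use strict';
import Crawler from "simplecrawler";
import fs from 'fs';
import path from 'path';

// Extensions treated as images; matched against the file name, not the whole URL.
const IMAGE_EXTENSIONS = ['.png', '.jpg', '.jpeg', '.gif'];

Crawler.crawl("http://www.google.com").on("fetchcomplete", (queueItem, responseBuffer, response) => {
  console.log("Completed fetching resource:", queueItem.url);
  // Drop the query string and fragment so "logo.png?v=2" is saved as "logo.png".
  const filename = path.basename(queueItem.url.split(/[?#]/)[0]);
  if (IMAGE_EXTENSIONS.indexOf(path.extname(filename).toLowerCase()) !== -1) {
    // The assets/images directory must already exist next to this script.
    fs.writeFile(path.join(__dirname, "assets", "images", filename), responseBuffer, (err) => {
      if (err) console.error(err); // only report real failures
    });
  }
  // Record every fetched URL, one per line.
  fs.appendFileSync(path.join(__dirname, 'results.txt'), queueItem.url + '\r\n');
});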
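
The crawler above decides whether a resource is an image by looking at its URL. As a rough illustration of that check in isolation, here is a minimal sketch of a standalone helper. The name `imageFilename` and the `IMAGE_EXTENSIONS` list are hypothetical and not part of the gist or of simplecrawler, and the sketch uses `require` rather than `import` so that it runs in plain Node.js without a transpiler.

```javascript
const path = require('path');

// Hypothetical helper for illustration; not part of the gist or of simplecrawler.
const IMAGE_EXTENSIONS = ['.png', '.jpg', '.jpeg', '.gif'];

// Returns the file name to save an image under, or null if the URL
// does not look like an image by its extension.
function imageFilename(resourceUrl) {
  // pathname excludes the query string and fragment.
  const pathname = new URL(resourceUrl).pathname;
  const filename = path.posix.basename(pathname);
  const extension = path.posix.extname(filename).toLowerCase();
  return IMAGE_EXTENSIONS.includes(extension) ? filename : null;
}

console.log(imageFilename('http://example.com/img/logo.png?v=2')); // logo.png
console.log(imageFilename('http://example.com/png-guide.html'));   // null
```

Matching on the extension of the path, rather than searching the whole URL for "png" or "jpg", avoids treating pages such as `png-guide.html` as images. It is still only a guess based on the URL; checking the response's Content-Type header would likely be more reliable.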