@mhkeller
Created January 1, 2014 20:30

Alphabits

Alphabits combines the Cheerio and Request libraries into a Node.js scraper that avoids the memory-leak issues of jsdom-based scrapers such as node-scraper. It also supports rate limiting; the rate-limit interval is given in milliseconds.
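To make the idea concrete, here is a minimal sketch of how Request and Cheerio can be wired together. It is not alphabits' actual source: `makeScraper` and `scrape` are hypothetical names, and the two libraries are passed in as arguments purely so the wiring is visible. The only library calls assumed are `request(url, callback)` and `cheerio.load(html)`.

```javascript
// Hypothetical sketch, not alphabits' actual source.
// `request` and `cheerio` are injected; in a real project they would
// come from require('request') and require('cheerio').
function makeScraper(request, cheerio) {
  return function scrape(url, callback) {
    // request calls back with (err, response, body); body is the HTML string.
    request(url, function (err, response, body) {
      if (err) return callback(err);
      // cheerio.load parses the HTML and returns a jQuery-like $ function.
      callback(null, cheerio.load(body));
    });
  };
}

// Possible usage (requires both packages to be installed):
// var scrape = makeScraper(require('request'), require('cheerio'));
// scrape('http://america.aljazeera.com', function (err, $) { /* ... */ });
```

Because Cheerio only parses markup and never builds a full browser DOM, a wrapper like this stays lightweight, which is the property the description above relies on.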

Normal

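// Load alphabits from a local checkout; adjust the path for your machine.
var alphabits = require('/Users/mike/code/alphabits/lib/alphabits.js');

alphabits('http://america.aljazeera.com', function(err, $){
  // Stop early if the request failed; $ is not usable in that case.
  if (err) return console.error(err);
  var headline = $('h1.topStories-headline a').html();
  console.log(headline);
});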

Rate limited

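// Same module, configured with a rate limit of 5000 milliseconds.
var alphabits = require('/Users/mike/code/alphabits/lib/alphabits.js').rateLimit(5000);

for (var i = 0; i < 5; i++){
  alphabits('http://america.aljazeera.com', function(err, $){
    // Stop early if the request failed; $ is not usable in that case.
    if (err) return console.error(err);
    var headline = $('h1.topStories-headline a').html();
    console.log(headline);
  });
}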
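
For readers curious how a millisecond-based rate limit like the one above can be implemented, the following is a rough sketch of one common approach: queue the calls and start them at least `intervalMs` apart. This is an assumption about the general technique, not alphabits' real implementation; the standalone `rateLimit(fn, intervalMs)` helper here is hypothetical and has a different signature from the library's `.rateLimit(5000)`.

```javascript
// Hypothetical helper, not alphabits' actual source: wraps any function
// so that successive calls start at least `intervalMs` milliseconds apart.
function rateLimit(fn, intervalMs) {
  var queue = [];      // pending argument lists, oldest first
  var running = false; // true while the timer chain is active

  function next() {
    var args = queue.shift();
    if (!args) { running = false; return; } // queue drained, go idle
    fn.apply(null, args);
    setTimeout(next, intervalMs); // start the following call after the interval
  }

  return function () {
    queue.push(Array.prototype.slice.call(arguments));
    if (!running) { running = true; next(); }
  };
}

// Usage with a stand-in for the scraper function (no network needed).
var started = [];
var limited = rateLimit(function (url) { started.push(url); }, 50);
for (var i = 0; i < 3; i++) limited('http://example.com/' + i);
```

The first call starts immediately and the rest wait in the queue, so a loop that fires several requests at once is spread out over time instead of hitting the site in a burst.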