@gornostal
Last active December 23, 2015 02:59
Scrape audible search results
/**
 * To run, use pjscrape: http://nrabinowitz.github.com/pjscrape/
 * 1. $ phantomjs pjscrape.js audible.js
 * 2. Register on mongolab.com
 * 3. $ mongoimport -h ds0466148.mongolab.com:45598 -d <dbname> -c <collection> -u <user> -p <password> --file audible.json --jsonArray
 * 4. Install Greasemonkey (Firefox) or Tampermonkey (Chrome)
 * 5. Add this userscript: https://gist.github.com/gornostal/6570526
 * 6. Open https://mongolab.com/databases/audible/collections/fiction_books (or your own database and collection)
 */
pjs.config({
    log: 'stdout',
    format: 'json',
    writer: 'file',
    outFile: 'audible.json'
});

pjs.addSuite({
    url: 'http://www.audible.com/search/ref=sr_pg__1?field_subjectbin=2226688011&searchSize=50',
    // Follow the "Next" pagination link, up to 30 pages deep
    moreUrls: '.adbl-page-next a:first:contains("Next")',
    maxDepth: 30,
    scraper: function() {
        // Build one record per search result on the current page
        return $('li.adbl-result-item').map(function() {
            var link = $('.adbl-prod-title a:first', this),
                name = link.text(),
                url = _pjs.toFullUrl(link.attr('href')),
                image = $('a img.adbl-prod-image', this),
                imageUrl = _pjs.toFullUrl(image.attr('src')),
                author = $('.adbl-prod-author a:first', this).text(),
                releaseDate = $('.adbl-label:contains("Release Date")', this).next().text(),
                ratingEl = $('.boldrating', this),
                rating = parseFloat(ratingEl.text().trim()),
                ratingsMatch = ratingEl.parent().text().match(/\((\d+) ratings\)/),
                // Declared inside the var list so it does not leak as a global;
                // defaults to 0 when no ratings count is found
                ratings = parseInt((ratingsMatch && ratingsMatch[1]) || 0, 10);
            return {
                name: name,
                link: url,
                image: imageUrl,
                releaseDate: releaseDate,
                author: author,
                rating: rating,
                ratings: ratings
            };
        }).toArray();
    }
});
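
The scraper reads the ratings count with a pattern that only matches plain digits followed by the plural "ratings". As a minimal sketch, and assuming the page may also render counts such as "(1,234 ratings)" or "(1 rating)" (the gist does not confirm this), a more tolerant parser could look like the following. `parseRatingsCount` is a hypothetical helper, not part of the original gist or of pjscrape.

```javascript
// Hypothetical helper, not part of the original gist.
// Assumption: the text next to the rating looks like "(87 ratings)",
// "(1,234 ratings)" or "(1 rating)".
function parseRatingsCount(text) {
    // Capture digits with optional thousands separators; accept "rating" or "ratings"
    var match = /\(([\d,]+)\s+ratings?\)/.exec(text || '');
    if (!match) {
        return 0; // no count found, same default as the scraper
    }
    // Strip the separators before converting to an integer
    return parseInt(match[1].replace(/,/g, ''), 10);
}
```

Inside the scraper it could replace the `ratingsMatch` / `ratings` pair, for example `ratings = parseRatingsCount(ratingEl.parent().text())`, provided the function is defined within the scraper function so that it is available in the page context.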