@jellis
Created August 19, 2015 15:27
This is the code I used to scrape the ninemsn home page with PhantomJS. PhantomJS is easy to install.
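// PhantomJS script: load the ninemsn home page and save its "fast" stories as JSON.
var page = require('webpage').create(),
    fs = require('fs'),
    url = 'http://www.ninemsn.com.au/',
    lastError = null;

// Abort every request except the page itself (images, CSS, external scripts).
page.onResourceRequested = function (requestData, request) {
    if (requestData.url !== url) {
        console.log('Aborting ' + requestData.url);
        request.abort();
    }
};

// PhantomJS does not set page.reason or page.reason_url on its own,
// so record the error for the main page here and report it below.
page.onResourceError = function (resourceError) {
    if (resourceError.url === url) {
        lastError = resourceError;
    }
};

page.open(url, function (status) {
    if (status !== 'success') {
        console.log('Error opening url ' + url +
            (lastError ? ' : ' + lastError.errorString : ''));
    } else {
        // Runs inside the page; MI9_NH is a global the page is expected to define.
        var fast = page.evaluate(function () {
            return window.MI9_NH.stories.fast;
        });
        var filename = 'mi9-fast.json';
        // fs.write does not create a missing directory.
        if (!fs.isDirectory('data')) {
            fs.makeDirectory('data');
        }
        fs.write('data/' + filename, JSON.stringify(fast), 'w');
    }
    phantom.exit();
});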
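
The request filter in the script compares each URL to the page URL with strict equality, so a request for the same page without the trailing slash, or with a fragment, would also be aborted. A minimal sketch of a looser check follows, assuming that ignoring the trailing slash and the fragment is acceptable for this site. `normalise` and `isMainDocument` are hypothetical helpers, not part of PhantomJS.

```javascript
// Hypothetical helper: normalise a URL for comparison by dropping the
// fragment and any trailing slashes. Plain ES5, so it also runs in PhantomJS.
function normalise(u) {
    return u.split('#')[0].replace(/\/+$/, '');
}

// Hypothetical helper: true when requestUrl refers to the page itself,
// ignoring trailing-slash and fragment differences.
function isMainDocument(requestUrl, pageUrl) {
    return normalise(requestUrl) === normalise(pageUrl);
}
```

Inside `page.onResourceRequested`, the condition would then read `if (!isMainDocument(requestData.url, url))`.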