@ryanburgess
Last active March 9, 2017 15:51
An example using Cheerio to scrape HTML
var cheerio = require('cheerio');
var request = require('request');

request({
  method: 'GET',
  url: 'http://www.wordthink.com'
}, function(err, response, body) {
  if (err) return console.error(err);

  // Parse the fetched HTML so it can be queried with jQuery-style selectors.
  var $ = cheerio.load(body);
  var post = $('#content .singlemeta:first-child .post');

  // trim() strips the leading and trailing \r\n\t padding around the text.
  var word = post.find('.title').eq(0).text().trim();
  var definition = post.find('p').eq(0).text().trim();

  console.log(word);
  console.log(definition);
});
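
Text scraped from HTML often carries runs of \r, \n and \t left over from the page's indentation. Below is a minimal sketch of one way to clean it up, assuming a hypothetical helper named cleanText (it is not part of cheerio or of the original gist). It uses a global regular expression because String.prototype.replace with a plain string pattern only replaces the first occurrence.

```javascript
// Hypothetical helper for illustration; not part of cheerio or the gist.
// Collapses every run of whitespace (spaces, tabs, CR, LF) into a single
// space, then strips whatever whitespace is left at either end.
function cleanText(text) {
  return text.replace(/\s+/g, ' ').trim();
}

// Example input shaped like the padded text a scraped title might have.
var raw = '\r\n\t\t\t\t\tEphemeral\r\n\t\t\t\t';
console.log(cleanText(raw)); // prints "Ephemeral"
```

The result of any .text() call could be passed through such a helper before logging, which avoids hard-coding the exact number of tabs the page happens to use.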
@kunKun-tx

It might be a better idea to use normalizeWhitespace: true to clean up the \r\n\t.
