@mrxf
Created July 28, 2017 02:32
Extracting web page content with article-parser
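const ArticleParser = require('article-parser');
const he = require('he');

const url = 'http://mp.weixin.qq.com/s/JJbZkcBoSMW0zPzmepUr1A';

// Keep only the tags and attributes needed for the article body.
// 'data-src' is allowed on <img> so the image URL stored there survives sanitizing.
ArticleParser.configure({
  htmlRules: {
    allowedTags: ['pre', 'p', 'img'],
    allowedAttributes: {
      pre: ['style'],
      p: ['style'],
      img: ['src', 'data-src']
    }
  }
});

ArticleParser.extract(url)
  .then((article) => {
    // Decode HTML entities (e.g. &amp;, &#x4E2D;) back into plain characters.
    // Fall back to an empty string so a missing field does not throw.
    article.content = he.decode(article.content || '');
    article.description = he.decode(article.description || '');
    return article;
  })
  .then((article) => console.log(article))
  .catch((err) => {
    console.error(err);
  });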
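
The configuration above keeps the `data-src` attribute on `img` tags. Pages that lazy-load images often put the real image URL there and leave `src` empty or pointing at a placeholder, so the extracted HTML may not display images as-is. The following is a minimal sketch of one way to handle that, assuming double-quoted attributes; `promoteLazyImages` is a hypothetical helper and not part of article-parser or he.

```javascript
// Hypothetical helper (not part of article-parser): copy each <img>'s
// data-src value into src so lazy-loaded images render in the extracted HTML.
// Assumes attributes are double-quoted; a real HTML parser is more robust.
function promoteLazyImages(html) {
  return html.replace(/<img\b[^>]*>/gi, (tag) => {
    const lazy = tag.match(/\sdata-src="([^"]*)"/i);
    if (!lazy) return tag; // no data-src attribute: leave the tag untouched
    // Drop any existing src attribute, then add one built from data-src.
    const withoutSrc = tag.replace(/\ssrc="[^"]*"/i, '');
    return withoutSrc.replace(/^<img/i, () => `<img src="${lazy[1]}"`);
  });
}
```

It could be applied inside the first `.then` callback, for example `article.content = promoteLazyImages(article.content);` after the entity decoding step.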