@kimihito
Created September 7, 2018 05:39
Sample: extracting a page's main content with mozilla/readability
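// Load the dependencies with require() throughout so the script runs as plain CommonJS.
const puppeteer = require('puppeteer');
const Readability = require('readability');
const { JSDOM } = require('jsdom');

// Placeholder: replace with the address of the page to extract.
const url = 'url';

(async () => {
  // Launch headless Chrome without the sandbox, preferring Japanese content.
  const browser = await puppeteer.launch({
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--lang=ja,en-US,en'
    ]
  });
  const page = await browser.newPage();
  await page.goto(url);
  // Grab the markup of <body> from the live page.
  const html = await page.evaluate(() => {
    return document.body.innerHTML;
  });
  // Rebuild a DOM in Node and let Readability extract the article from it.
  const dom = new JSDOM(html);
  const content = new Readability(dom.window.document).parse();
  console.log(content);
  await browser.close();
})();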
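The script above logs whatever `parse()` returns, which can be a large object or `null` when Readability finds no article. Below is a minimal sketch of condensing that result before logging it. `summarizeArticle` and `maxChars` are hypothetical names, not part of Readability, and the sketch assumes the result carries `title`, `excerpt`, and `textContent` string fields.

```javascript
// Hypothetical helper: condense a Readability parse() result for logging.
// Assumes `article` is either null or an object with optional `title`,
// `excerpt` and `textContent` string fields.
function summarizeArticle(article, maxChars = 200) {
  if (!article) {
    // No article could be extracted from the page.
    return null;
  }
  // Collapse runs of whitespace so the length and the fallback excerpt are meaningful.
  const text = (article.textContent || '').replace(/\s+/g, ' ').trim();
  return {
    title: article.title || '',
    // Fall back to the start of the body text when no excerpt is present.
    excerpt: article.excerpt || text.slice(0, maxChars),
    length: text.length,
  };
}

// Usage with a hand-written stand-in for a parse() result:
const sample = { title: 'Hello', textContent: '  Some   body\n text ' };
console.log(summarizeArticle(sample));
console.log(summarizeArticle(null));
```

In the script above, this would replace `console.log(content)` with `console.log(summarizeArticle(content))`.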