@kimihito kimihito/index.js
Created Sep 7, 2018

Sample: extracting a page's main content with mozilla/readability
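// Render a page with Puppeteer, then extract its main article with Readability.
import puppeteer from 'puppeteer';
import { JSDOM } from 'jsdom';
// mozilla/readability is published on npm as @mozilla/readability.
import { Readability } from '@mozilla/readability';

// Placeholder: replace with the address of the page to extract.
const url = 'url';

(async () => {
  const browser = await puppeteer.launch({
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--lang=ja,en-US,en'
    ]
  });
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // Serialize the whole rendered document so <title> and <meta> tags survive.
    const html = await page.content();
    // Passing the final URL lets jsdom resolve relative links and images.
    const dom = new JSDOM(html, { url: page.url() });
    // parse() returns an article object, or null if nothing could be extracted.
    const article = new Readability(dom.window.document).parse();
    console.log(article);
  } finally {
    // Close the browser even if navigation or parsing throws.
    await browser.close();
  }
})();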
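
The script logs the whole object returned by parse(), which can be large and is null when Readability finds no article. As a minimal sketch of handling that result, the helper below reduces it to a title and a one-line preview. The name summarizeArticle and the maxChars parameter are hypothetical and not part of Readability; the sketch only assumes the title and textContent fields of the result object, and it uses a hand-written stand-in object so it runs without a browser.

```javascript
// Hypothetical helper (not part of Readability): condense a parse() result.
function summarizeArticle(article, maxChars = 200) {
  // parse() returns null when no article could be extracted.
  if (article === null || article === undefined) {
    return { ok: false, title: '', preview: '', length: 0 };
  }
  // Collapse runs of whitespace so the preview fits on one line.
  const text = (article.textContent || '').replace(/\s+/g, ' ').trim();
  const preview =
    text.length > maxChars ? text.slice(0, maxChars) + '...' : text;
  return { ok: true, title: article.title || '', preview, length: text.length };
}

// Hand-written stand-in for a parse() result, used only for illustration.
const sample = {
  title: 'Hello',
  textContent: '  First line.\n\n  Second   line. '
};

console.log(summarizeArticle(sample));
console.log(summarizeArticle(null));
```

In the script, the same call could replace the final log, for example console.log(summarizeArticle(article)).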