@sojohnnysaid
Created October 27, 2019 00:30
scrapping
const puppeteer = require('puppeteer');
const fs = require('fs');
const HTMLParser = require('node-html-parser');
const urls = ['https://www.tripadvisor.com/Airline_Review-d8729099-Reviews-JetBlue',
'https://www.tripadvisor.com/Airline_Review-d8729099-Reviews-or5-JetBlue#REVIEWS',
'https://www.tripadvisor.com/Airline_Review-d8729099-Reviews-or10-JetBlue#REVIEWS',
'https://www.tripadvisor.com/Airline_Review-d8729099-Reviews-or15-JetBlue#REVIEWS',
'https://www.tripadvisor.com/Airline_Review-d8729099-Reviews-or20-JetBlue#REVIEWS',
'https://www.tripadvisor.com/Airline_Review-d8729099-Reviews-or25-JetBlue#REVIEWS',
'https://www.tripadvisor.com/Airline_Review-d8729099-Reviews-or30-JetBlue#REVIEWS',
'https://www.tripadvisor.com/Airline_Review-d8729099-Reviews-or35-JetBlue#REVIEWS',
'https://www.tripadvisor.com/Airline_Review-d8729099-Reviews-or40-JetBlue#REVIEWS',
'https://www.tripadvisor.com/Airline_Review-d8729099-Reviews-or45-JetBlue#REVIEWS'];
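(async () => {
  // Launch a single browser and reuse it for every URL. Declaring it inside
  // the loop left `browser` out of scope at browser.close() and leaked one
  // browser process per page.
  const browser = await puppeteer.launch();
  try {
    for (let i = 0; i < urls.length; i++) {
      const page = await browser.newPage();
      // Wait until the network is idle so script-rendered content is present.
      await page.goto(urls[i], {waitUntil: 'networkidle0'});
      const html = await page.content();
      // Save the rendered HTML as page<i>.html and wait for the write to finish.
      await fs.promises.writeFile('page' + i + '.html', html);
      console.log('HTML saved');
      await page.close();
    }
  } finally {
    // Close the browser even if one of the pages fails to load.
    await browser.close();
  }
})();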
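
The entries in the `urls` array differ only in an `-orN-` offset that grows in steps of 5, so the list could probably be generated instead of typed out by hand. The following is a minimal sketch under that assumption. `buildReviewUrls`, `pageCount` and `pageSize` are hypothetical names that do not appear in the original script, and whether the site keeps this URL scheme for later pages is not confirmed here.

```javascript
// Hypothetical helper: rebuilds the hand-written URL list from its pattern.
// Assumes the first page has no offset segment and each later page adds
// `-or<offset>-` with the offset growing by `pageSize` (5 in the list above).
function buildReviewUrls(pageCount, pageSize = 5) {
  const base = 'https://www.tripadvisor.com/Airline_Review-d8729099-Reviews';
  const result = [];
  for (let page = 0; page < pageCount; page++) {
    const offset = page * pageSize;
    result.push(
      offset === 0
        ? `${base}-JetBlue`
        : `${base}-or${offset}-JetBlue#REVIEWS`
    );
  }
  return result;
}

// Ten pages cover offsets 0 through 45, like the original list.
const generatedUrls = buildReviewUrls(10);
console.log(generatedUrls.length);
```

Generating the list this way avoids copy-paste slips such as a repeated offset, and fetching more pages only means changing `pageCount`.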