@schollz
Created August 27, 2017 15:16
Use Puppeteer to download a webpage after it's been processed by JavaScript
// save as index.js
// npm install puppeteer
// node index.js URL
const puppeteer = require('puppeteer');

(async () => {
  const url = process.argv[2];
  const browser = await puppeteer.launch();
  // use tor
  //const browser = await puppeteer.launch({args:['--proxy-server=socks5://127.0.0.1:9050']});
  const page = await browser.newPage();
  // Current Puppeteer releases require interception to be enabled
  // before request.continue() may be called.
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    // method() and url() are functions in current releases.
    // Log to stderr so that stdout carries only the page HTML.
    console.error(`Intercepting: ${request.method()} ${request.url()}`);
    request.continue();
  });
  await page.goto(url, {waitUntil: 'load'});
  //const title = await page.title();
  //console.log(title);
  await page.screenshot({path:'example.png'});
  // page.content() returns the serialized DOM after scripts have run.
  const html = await page.content();
  console.log(html);
  await browser.close();
})();
@yarekc

yarekc commented Dec 3, 2017

When I try it with Tor, I get:
Error: net::ERR_NO_SUPPORTED_PROXIES

@finaldzn

Thank you, you saved me! :)

@Winstone-Were

I thought this would download the webpage as an HTML file 😞😞

@schollz
Author

schollz commented Aug 26, 2020

@Winstone-Were It prints the page's HTML to the console. You can redirect the output to a file, or change console.log(html) to write to a file of your choice.

@Winstone-Were

Winstone-Were commented Aug 26, 2020 via email

@Winstone-Were

redirect

Thanks, fellow developer.
