@tegansnyder
Created February 23, 2018 02:41
Preventing Puppeteer Detection

I’m looking for any tips or tricks for making Chrome's headless mode less detectable. Here is what I’ve done so far:

Set my args as follows:

const fs = require('fs');
const puppeteer = require('puppeteer');

const run = (async () => {

    const args = [
        '--no-sandbox',
        '--disable-setuid-sandbox',
        '--disable-infobars',
        '--window-position=0,0',
        '--ignore-certificate-errors',
        '--ignore-certificate-errors-spki-list',
        // no inner quotes: args reach Chrome without a shell, so quotes would become part of the UA string
        '--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3312.0 Safari/537.36'
    ];

    const options = {
        args,
        headless: true,
        ignoreHTTPSErrors: true,
        userDataDir: './tmp'
    };

    const browser = await puppeteer.launch(options);

I’m loading in a preload file that overrides some window.navigator globals:

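    const page = await browser.newPage();
    // read the override script and register it to run before any page script
    const preloadFile = fs.readFileSync('./preload.js', 'utf8');
    await page.evaluateOnNewDocument(preloadFile);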
preload.js
// overwrite the `languages` property to use a custom getter
Object.defineProperty(navigator, "languages", {
  get: function() {
    return ["en-US", "en"];
  },
});

// overwrite the `plugins` property to use a custom getter
Object.defineProperty(navigator, 'plugins', {
  get: function() {
    // this just needs to have `length > 0`, but we could mock the plugins too
    return [1, 2, 3, 4, 5];
  },
});

I see some other things suggested here https://intoli.com/blog/making-chrome-headless-undetectable/ but I'm not 100% certain how to implement them in puppeteer. Any ideas, tips, or tricks?
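
One possible way to add further overrides without a separate preload file is to pass a function to page.evaluateOnNewDocument instead of a string. This is only a sketch: applyOverrides and installOverrides are hypothetical names, and the overrides shown are the same two as in preload.js above.

```javascript
// Runs inside the page, so `navigator` here is the browser's global.
// Further overrides from the linked post would be added in this function.
function applyOverrides() {
  Object.defineProperty(navigator, 'languages', {
    get: () => ['en-US', 'en'],
  });
  Object.defineProperty(navigator, 'plugins', {
    // only needs `length > 0`, as in preload.js
    get: () => [1, 2, 3, 4, 5],
  });
}

// Hypothetical helper: registers the overrides on a Puppeteer page so they
// are evaluated in every new document before the site's own scripts.
async function installOverrides(page) {
  await page.evaluateOnNewDocument(applyOverrides);
}
```

Usage would be `await installOverrides(page);` right after `browser.newPage()` and before `page.goto(...)`.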

@vndevil

vndevil commented May 25, 2020

you might want to check out

https://www.npmjs.com/package/puppeteer-extra-plugin-stealth

It's not working with goat.com and stockx.com. They are protected by perimeterx.com

@qo4on

qo4on commented May 26, 2020

It's not working with goat.com and stockx.com. They are protected by perimeterx.com

This thing works with all of them.

@seahindeniz

you might want to check out
https://www.npmjs.com/package/puppeteer-extra-plugin-stealth

It's not working with goat.com and stockx.com. They are protected by perimeterx.com

@vndevil I have just run a local test and I think it works

@shi-yan

shi-yan commented Aug 7, 2020

https://www.houzz.com/ can detect puppeteer

@xjurko

xjurko commented Aug 28, 2020

you might want to check out
https://www.npmjs.com/package/puppeteer-extra-plugin-stealth

Doesn't work when trying to scrape youtube videos in headless mode.

Are you trying to scrape anything besides the actual media content? If not I'd recommend youtube-dl with some ip rotation (might not be necessary)

@leehuwuj

Magic!!!! Can you explain what each parameter does and which case it covers?

@aalfiann

aalfiann commented Dec 5, 2020

This approach doesn't work for https://imgfo.com.

You can try it against their demo.

@BensimonSamy

@vndevil did you find a solution for Goat and StockX, please? Goat banned my browser's signature...

@NikolaiT

NikolaiT commented Jan 13, 2021

Hey guys, I am trying to give back a bit to the community.

https://bot.sannysoft.com/ is a bit old, isn't it?

I found a couple of new ways to detect the latest puppeteer.

Check your bot here: https://incolumitas.com/pages/BotOrNot/

Best,
Nikolai

@rafakwolf

Akamai's anti-bot protection is still blocking me, even with these techniques :|

@vndevil

vndevil commented May 19, 2021

@vndevil did you find a solution for Goat and StockX, please? Goat banned my browser's signature...

This is my solution. You can see it on my website now: https://shoegameviet.com/all-air-jordan-shoes/air-jordan-1/air-jordan-1-high (I get the latest price from Goat/StockX/MonoKabu/snkrDunk every time the page loads).
My solution:

  1. Use Tor browser as a service on the server to change the IP automatically on each connection to the StockX/Goat API when crawling data
  2. For StockX, plain axios is enough
  3. For Goat, use puppeteer

@micha1333

micha1333 commented May 31, 2021

Hi,
It seems LinkedIn detects puppeteer-extra-plugin-stealth.
Has anyone managed to get around LinkedIn's anti-bot detection?
Please help me.

@alpharameeztech

Hi all,
It doesn't work with Kickstarter.

@lakpahana

you might want to check out

https://www.npmjs.com/package/puppeteer-extra-plugin-stealth

This worked for me

@123fischer

@rafakwolf have you found a solution for the akamai protection? Have been trying for a while now, but to no real avail

@anhnt2310

Hi folks, are there any ways to prevent Nordstrom's detection?

@betogzo

betogzo commented Mar 25, 2022

worked for me, now recaptcha isn't bothering me anymore. thanks!

@uzair004

In the gist above, some of the arguments no longer work because they are deprecated. For example, --disable-infobars no longer hides the "Chrome is being controlled by automated test software" info bar, because the Chrome team removed the flag as a security bug.
Instead, pass another array to the launch method:
ignoreDefaultArgs: ["--enable-automation"]
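
A minimal sketch of how that option might sit next to the gist's other launch options. buildLaunchOptions is a hypothetical helper name, not part of Puppeteer; ignoreDefaultArgs is the launch option named in this comment, and in its array form it filters out only the listed default arguments.

```javascript
// Hypothetical helper: builds launch options that drop Puppeteer's default
// --enable-automation flag instead of relying on --disable-infobars.
function buildLaunchOptions(extraArgs = []) {
  return {
    headless: true,
    ignoreHTTPSErrors: true,
    args: ['--no-sandbox', '--disable-setuid-sandbox', ...extraArgs],
    // Array form: only these default arguments are removed.
    ignoreDefaultArgs: ['--enable-automation'],
  };
}

// Usage (assumes puppeteer is installed and required):
//   const browser = await puppeteer.launch(buildLaunchOptions(['--window-position=0,0']));
```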

@lifeboatpres

lifeboatpres commented Sep 22, 2022

"Object.defineProperty(navigator, 'webdriver', { get: () => false, });" doesn't work. It is not enough, because after running it, "'webdriver' in navigator" still evaluates to true, so it will still be detected.

Better to use:

const newProto = navigator.__proto__;
delete newProto.webdriver;
navigator.__proto__ = newProto;
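
To illustrate the difference outside a browser, here is a small sketch that uses a plain object as a stand-in for navigator and its prototype (makeFakeNavigator is a hypothetical mock, not a browser API). It shows why a shadowing getter can still be caught by an `in` check, while deleting the property from the prototype removes it entirely.

```javascript
// Hypothetical mock: an object whose prototype exposes a `webdriver` getter,
// loosely mirroring how navigator inherits the property from its prototype.
function makeFakeNavigator() {
  const proto = {};
  Object.defineProperty(proto, 'webdriver', {
    get: () => true,
    configurable: true,
    enumerable: true,
  });
  return Object.create(proto);
}

// Approach 1: shadow the property with a getter that returns false.
const shadowed = makeFakeNavigator();
Object.defineProperty(shadowed, 'webdriver', { get: () => false });

// Approach 2: delete the property from the prototype.
const deleted = makeFakeNavigator();
delete Object.getPrototypeOf(deleted).webdriver;

const results = {
  shadowedValue: shadowed.webdriver,    // false
  shadowedIn: 'webdriver' in shadowed,  // true: the property still exists
  deletedValue: deleted.webdriver,      // undefined
  deletedIn: 'webdriver' in deleted,    // false: the property is gone
};
console.log(results);
```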

@Vordlex

Vordlex commented Mar 20, 2023

"Object.defineProperty(navigator, 'webdriver', { get: () => false, });" doesn't work. It is not enough, because after running it, "'webdriver' in navigator" still evaluates to true, so it will still be detected.

Better to use:

const newProto = navigator.__proto__;
delete newProto.webdriver;
navigator.__proto__ = newProto;

This solved the WebDriver (NEW) check for me.

@IggsGrey

Works on localhost for me, fails on a remote VPS.
