Preventing Puppeteer Detection

I’m looking for any tips or tricks for making Chrome’s headless mode less detectable. Here is what I’ve done so far:

Set my args as follows:

const fs = require('fs');
const puppeteer = require('puppeteer');

const run = (async () => {

    const args = [
        '--no-sandbox',
        '--disable-setuid-sandbox',
        '--disable-infobars',
        '--window-position=0,0',
        '--ignore-certificate-errors',
        '--ignore-certificate-errors-spki-list',
        '--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3312.0 Safari/537.36'
    ];

    const options = {
        args,
        headless: true,
        ignoreHTTPSErrors: true,
        userDataDir: './tmp'
    };

    const browser = await puppeteer.launch(options);

I’m loading in a preload file that overrides some window.navigator globals:

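    const page = await browser.newPage();
    // Inject preload.js into every new document before the page's own scripts run.
    const preloadFile = fs.readFileSync('./preload.js', 'utf8');
    await page.evaluateOnNewDocument(preloadFile);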
preload.js
// overwrite the `languages` property to use a custom getter
Object.defineProperty(navigator, "languages", {
  get: function() {
    return ["en-US", "en"];
  },
});

// overwrite the `plugins` property to use a custom getter
Object.defineProperty(navigator, 'plugins', {
  get: function() {
    // this just needs to have `length > 0`, but we could mock the plugins too
    return [1, 2, 3, 4, 5];
  },
});

I see there are some other things suggested here https://intoli.com/blog/making-chrome-headless-undetectable/ but I'm not 100% certain how to implement them in Puppeteer. Any ideas, tips, or tricks?
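
As a starting point, here is a minimal sketch of two of the checks I recall that article describing: the missing window.chrome object, and the mismatch between Notification.permission and navigator.permissions.query. The applyEvasions name and its win parameter are hypothetical (they are not part of Puppeteer), and the exact shape of a real window.chrome object is richer than this stub, so treat it as an illustration rather than a complete fix.

```javascript
// Sketch only: applyEvasions is a hypothetical helper, not a Puppeteer API.
// `win` defaults to the global object, which is `window` inside a page.
function applyEvasions(win = globalThis) {
  // Headless Chrome has historically lacked window.chrome; add a minimal stub.
  if (!win.chrome) {
    win.chrome = { runtime: {} };
  }

  // Make permissions.query agree with Notification.permission
  // when a script asks about the "notifications" permission.
  const permissions = win.navigator && win.navigator.permissions;
  if (permissions && typeof permissions.query === 'function') {
    const originalQuery = permissions.query.bind(permissions);
    permissions.query = (parameters) =>
      parameters && parameters.name === 'notifications'
        ? Promise.resolve({
            state: win.Notification ? win.Notification.permission : 'default',
          })
        : originalQuery(parameters);
  }
}
```

Because the function takes no required arguments, it should be possible to pass it straight to page.evaluateOnNewDocument(applyEvasions) before calling page.goto, so that it runs in each new document ahead of the site's own scripts.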

getuliojr commented Oct 22, 2018

Have you had any success with it? Is it still being detected?

Raidus commented Dec 15, 2018

This might help to get "into the page", but crawling/scraping the same domain frequently is another topic in itself. Depending on the use case, I would also intercept certain files (especially JS files that are used for user tracking). Here is an example:

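  // Interception must be enabled before abort()/continue() can be called.
  await page.setRequestInterception(true);
  page.on("request", r => {
    if (["image", "stylesheet", "font", "script"].includes(r.resourceType())) {
      r.abort();
    } else {
      r.continue();
    }
  });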
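
Blocking every script can break pages that need JavaScript to render, so a narrower variant may be worth trying: block static assets by type, but only block scripts whose URL matches a known tracker. The sketch below assumes the handler is registered with page.on("request", handleRequest) after page.setRequestInterception(true). The names shouldBlock, handleRequest, BLOCKED_TYPES and TRACKER_PATTERNS are hypothetical, and the two hostnames are only illustrative examples, not a vetted blocklist.

```javascript
// Hypothetical helpers; only resourceType(), url(), abort() and continue()
// are methods of Puppeteer's request object.
const BLOCKED_TYPES = new Set(['image', 'stylesheet', 'font']);

// Illustrative tracker hostnames (an assumption, adjust for your target site).
const TRACKER_PATTERNS = [/google-analytics\.com/, /googletagmanager\.com/];

// Decide whether a request should be dropped.
function shouldBlock(resourceType, url) {
  if (BLOCKED_TYPES.has(resourceType)) {
    return true;
  }
  // Scripts are only blocked when they look like trackers.
  return resourceType === 'script' && TRACKER_PATTERNS.some((re) => re.test(url));
}

// Request handler: abort blocked requests, let everything else through.
function handleRequest(request) {
  if (shouldBlock(request.resourceType(), request.url())) {
    request.abort();
  } else {
    request.continue();
  }
}
```

Keeping the decision in a pure function makes it easy to check the rules against sample URLs without launching a browser.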

Cphilo commented Jan 5, 2019

I found that it works. With this code, I got past the verification on a big site. Thanks.

berstend commented Feb 2, 2019

I'm collecting various tricks to make puppeteer detection harder here, feel free to contribute if you have additional ones!

coder77 commented Feb 15, 2019

This code example and the article link are very very very useful for me. Thank You!

asifmai commented Jun 18, 2019

In your preload.js you should also add this to pass the webdriver test:
Object.defineProperty(navigator, 'webdriver', { get: () => false, });

arif-bannehasan commented Jun 23, 2019

Thanks a lot, your stuff has helped me solve a browser detection issue. I had a tough time figuring out how to prevent detection. There aren't enough sources that help with this issue. Luckily, I stumbled upon this page.

pencilcheck commented Jun 23, 2019

Seems to work for me, but I'm not sure: I still get blocked on some sites.

djilousp commented Aug 4, 2019

Does anyone have an equivalent for pyppeteer in Python? I'm getting detected by reCAPTCHA v3.

pichardonuba commented Aug 7, 2019

Thanks, this was very useful for me.

Uukins commented Aug 20, 2019

"Object.defineProperty(navigator, 'webdriver', { get: () => false, });" doesn't work. It is not enough, because after that, evaluating "'window' in navigator" still returns true, so it will still be detected.

Uukins commented Aug 20, 2019

Sorry, it's not "'window' in navigator", it's "'webdriver' in navigator".
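
To make the point above concrete, here is a small sketch that runs in plain Node. FakeNavigator is a hypothetical stand-in for the browser's Navigator, on the assumption that webdriver is a getter defined on the prototype. It compares overriding the property on the instance with deleting it from the prototype, which is one workaround people have tried.

```javascript
// Stand-in for the browser's Navigator: `webdriver` lives on the prototype.
class FakeNavigator {
  get webdriver() {
    return true;
  }
}

// Approach 1: shadow the getter on the instance (what the gist suggests).
function hideByOverride(nav) {
  Object.defineProperty(nav, 'webdriver', { get: () => false, configurable: true });
}

// Approach 2: remove the getter from the prototype entirely.
function hideByDelete(nav) {
  delete Object.getPrototypeOf(nav).webdriver;
}

const overridden = new FakeNavigator();
hideByOverride(overridden);
console.log(overridden.webdriver, 'webdriver' in overridden); // false true

const deleted = new FakeNavigator();
hideByDelete(deleted);
console.log(deleted.webdriver, 'webdriver' in deleted); // undefined false
```

In a page, the second approach would presumably look like page.evaluateOnNewDocument(() => { delete Object.getPrototypeOf(navigator).webdriver; }). One caveat I have not verified per browser version: newer browsers may expose navigator.webdriver as false even when not automated, in which case a missing property could itself stand out.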

debaosuidecl commented Aug 27, 2019

How, then, do we avoid webdriver detection?

borispov commented Oct 23, 2019

For those who are still interested:
https://antoinevastel.com/bot%20detection/2018/01/17/detect-chrome-headless-v2.html

It covers a few ways that websites detect headless browsers. A few of them are covered in this gist and thread; a few, however, are still unsolved (here, at least).
