@alexwilson
Last active September 3, 2021 17:23
/**
* This is a project designed to get around sites using Cloudflare's "I'm under attack" mode.
* Using the PhantomJS headless browser, it queries a site given to it as the second parameter,
* waits six seconds and returns the cookies required to continue using this site. With this,
* it is possible to automate scrapers or spiders that would otherwise be thwarted by Cloudflare's
* anti-bot protection.
*
* To run this: phantomjs cloudflare-challenge.js http://www.example.org/
*
* Copyright © 2015 by Alex Wilson <antoligy@antoligy.com>
*
* Permission to use, copy, modify, and/or distribute this software for
* any purpose with or without fee is hereby granted, provided that the
* above copyright notice and this permission notice appear in all
* copies.
*
* THE SOFTWARE IS PROVIDED "AS IS" AND ISC DISCLAIMS ALL WARRANTIES WITH
* REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
* MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL ISC BE LIABLE FOR ANY
* SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
* WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
* ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
* OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*/
/**
* Namespaced object.
* @type {Object}
*/
var antoligy = antoligy || {};
/**
* Simple wrapper to retrieve Cloudflare's 'solved' cookie.
* @type {Object}
*/
antoligy.cloudflareChallenge = {
    webpage: false,
    system: false,
    page: false,
    url: false,
    userAgent: false,
    timeout: false,

    /**
     * Initiate object.
     */
    init: function() {
        this.webpage = require('webpage');
        this.system = require('system');
        this.page = this.webpage.create();
        // args[0] is the script name, so args[1] is the URL given on the command line.
        this.url = this.system.args[1];
        this.userAgent = 'Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0';
        // Milliseconds to wait after the page opens before reading the cookies.
        this.timeout = 6000;
    },

    /**
     * "Solve" Cloudflare's challenge using PhantomJS's engine.
     * Prints a JSON string containing our cookies to stdout, then exits.
     */
    solve: function() {
        var self = this;
        this.page.settings.userAgent = this.userAgent;
        this.page.open(this.url, function(status) {
            setTimeout(function() {
                console.log(JSON.stringify(phantom.cookies));
                phantom.exit();
            }, self.timeout);
        });
    }
};
/**
* In order to carry on making requests, both the user agent and the IP address must match what is used here.
*/
antoligy.cloudflareChallenge.init();
antoligy.cloudflareChallenge.solve();
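
The script prints its cookies as a JSON array on stdout. As a minimal sketch of how a caller might consume that output, the helper below joins the array into a single `Cookie` header value. The name `toCookieHeader` and the sample cookie values are hypothetical and not part of the gist; the sketch only assumes that each cookie object carries `name` and `value` fields, as PhantomJS cookie objects do.

```javascript
// Hypothetical helper (not part of the gist): converts the JSON cookie array
// printed by the script into a "name=value; name=value" Cookie header string.
function toCookieHeader(cookiesJson) {
    var cookies = JSON.parse(cookiesJson);
    return cookies.map(function (cookie) {
        return cookie.name + '=' + cookie.value;
    }).join('; ');
}

// Made-up sample shaped like the script's output; real values will differ.
var sample = JSON.stringify([
    { name: 'a', value: '1', domain: '.example.org', path: '/' },
    { name: 'b', value: '2', domain: '.example.org', path: '/' }
]);

console.log(toCookieHeader(sample));
```

As the comment above notes, any follow-up request would also have to use the same user agent and IP address as the PhantomJS run.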
@mhouriet

Very useful, thanks a lot!

@joshbode

joshbode commented May 7, 2016

Thanks for this!
I was trying the same, but didn't have a timeout :)

@ezako2

ezako2 commented Feb 3, 2017

Interested. How can this method be used with CasperJS?

@GeneratorEVil

GeneratorEVil commented Jan 16, 2019

Hi all! I don't understand how to use it. This is what I'm doing:

use JonnyW\PhantomJs\Client;

$cookie = \json_decode(exec(base_path().'//bin/'.$phantomjs.' '.base_path().'//bin/cloudflare-challenge.js '.$url), true);
$request = $client->getMessageFactory()->createRequest($url, 'GET');
for ($i = 0; $i < count($cookie); $i++) {
    $request->addCookie(
        $cookie[$i]['name'],
        $cookie[$i]['value'],
        $cookie[$i]['path'],
        $cookie[$i]['domain'],
        $cookie[$i]['httponly'],
        $cookie[$i]['secure'],
        $cookie[$i]['expires'],
        $cookie[$i]['expiry']);
}

$response = $client->getMessageFactory()->createResponse();
$client->send($request, $response);
if ($response->getStatus() === 200) {
    // Dump the requested page content
    echo $response->getContent();
} else {
    echo $response->getStatus();
}

But I get NULL for every key of the object.
