#!/usr/bin/php
<?php

use GuzzleHttp\Client;
use Psr\Http\Message\ResponseInterface;

error_reporting(E_ALL);
ini_set('display_errors', 1);
ini_set('error_log', null); // error_log to stderr

require_once 'vendor/autoload.php';

function output_file_descriptors(string $label)
{
    usleep(50000);
    $pid = getmypid();
    $dir = sprintf('/proc/%d/fd/*', $pid);
    $files = glob($dir);
    error_log(sprintf("*** IN %s: OPEN FDs %d", $label, count($files)));
    usleep(50000);
}

/**
 * Make a bunch of concurrent curl requests to a single domain on different paths
 * @return array Map of settled promise results
 */
function run_script_chunk()
{
    $client = new Client([
        'base_uri' => 'https://66.media.tumblr.com/',
        'sink' => '/dev/null',
        'proxy' => null,
    ]);

    $paths = [
        '/ca13692b4d21ac673e6c4eea232ceee1/tumblr_pm43a9jP7o1thjq1e_640.jpg',
        '/4ceb016f5174868bf90ce97ed80a979e/tumblr_po9kpyv1Zc1xoyw8po1_640.jpg',
        '/605dfc39f03599e04a05d6bee72da699/tumblr_poa05jz0CB1w7u8nmo1_1280.jpg',
        '/11e37ea7e1be01f08fcccb859a1076f6/tumblr_pirts4a9HG1sn3ne4o1_1280.jpg',
        '/077b41778bceaebe5e4c15f48be65677/tumblr_po9vj74Irk1rsezm9o1_1280.jpg',
        '/6ec1d80af8c181d3e597f1b1b0ae681f/tumblr_po9zb72brS1r7b7ly_540.jpg',
        '/e869e2e1ab780e9a39b297f4ff08d3e5/tumblr_po9zkwyoTB1vz5npso1_1280.png',
        '/cbba1fb23610cca96275c4d3f4719f6c/tumblr_pk4pwgvPyU1qbtrumo1_1280.jpg',
        '/c7e0b3c226674c2381f9494761a2834c/tumblr_pjyk879oq91qdj91io2_540.png',
        '/c7e0b3c2266731194761a2834c/tumblr_pjyk879oq91qdj91io2_540.png', // Cause Exception!
        '/a08955930aed7ab081700efc44a26b05/tumblr_pllaadvGbz1thzx08o1_500.gif',
        '/98e382314d57a49380bac5662d9b6e18/tumblr_po9pul2Iyv1xoyw8po1_1280.png',
        '/8830736f845a7ce0d0e6f46c73fe77df/tumblr_osw9itaD1n1tyhiwmo1_1280.png',
        '/b5ba6a7e12b6c2e76431e8fd47d08a8e/tumblr_poa8cf4rwS1y72ak6o1_540.gif',
        '/82929a643f1c31b8e9b32781fa06d25a/tumblr_po6gt1nw331r3yybh_400.jpg',
        '/0ba4821cb51a7bac6a9e388014a7f879/tumblr_pmtv2w62hP1r4q6y5_400.jpg',
        '/2c95416f8ac98e0f83ca780afdb9ec59/tumblr_pmy0n9bqwL1qjhjrd_400.jpg',
    ];

    $promises = [];
    foreach ($paths as $index => $path) {
        // Initiate each request but do not block
        $promises['image' . $index] = $client->requestAsync('GET', $path)->then(
            function (ResponseInterface $res) {
                echo 'Got response! Code ' . $res->getStatusCode() . PHP_EOL;
            },
            function (\GuzzleHttp\Exception\RequestException $exception) {
                echo 'Got exception: ' . $exception->getMessage() . PHP_EOL;
            }
        );
    }

    // $results = \GuzzleHttp\Promise\unwrap($promises);
    $results = \GuzzleHttp\Promise\settle($promises)->wait();

    return $results;
}

function run_outer_loop()
{
    $responses = run_script_chunk();
}

// MAIN *******
output_file_descriptors('BEGIN');

// Run the complete thing a few times
foreach (range(1, 4) as $iter) {
    run_outer_loop();
    output_file_descriptors("LOOP");
}

output_file_descriptors('FINAL');
With the 404 URL commented out, the same script shows no FD problems, and the destructors are called when the objects go out of scope:
*** IN BEGIN: OPEN FDs 6
CONSTRUCT CURLFACTORY CALLED!
CONSTRUCT CURLFACTORY CALLED!
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
HandlerStack Destructor Called. ===
CURL MULTI DESTRUCTED!!!
DESTRUCT CURLFACTORY CALLED!
DESTRUCT CURLFACTORY CALLED!
*** IN LOOP: OPEN FDs 8
CONSTRUCT CURLFACTORY CALLED!
CONSTRUCT CURLFACTORY CALLED!
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
HandlerStack Destructor Called. ===
CURL MULTI DESTRUCTED!!!
DESTRUCT CURLFACTORY CALLED!
DESTRUCT CURLFACTORY CALLED!
*** IN LOOP: OPEN FDs 8
CONSTRUCT CURLFACTORY CALLED!
CONSTRUCT CURLFACTORY CALLED!
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
HandlerStack Destructor Called. ===
CURL MULTI DESTRUCTED!!!
DESTRUCT CURLFACTORY CALLED!
DESTRUCT CURLFACTORY CALLED!
*** IN LOOP: OPEN FDs 8
CONSTRUCT CURLFACTORY CALLED!
CONSTRUCT CURLFACTORY CALLED!
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
HandlerStack Destructor Called. ===
CURL MULTI DESTRUCTED!!!
DESTRUCT CURLFACTORY CALLED!
DESTRUCT CURLFACTORY CALLED!
*** IN LOOP: OPEN FDs 8
*** IN FINAL: OPEN FDs 8
@michaelbutler, should the destructors be called after each promise is fulfilled? Let's say the maximum number of open files the OS can handle is 4000, and opening any more than that results in guzzle/guzzle#1927. Given that scenario, even with no exceptions like you've demonstrated above, wouldn't you still run into the issue if you're processing over 4000 concurrent requests at a time (in a single promise generation loop)? If so, do you see the solution as being a proper use of maxHandles (once it is fixed), or is the only solution to break up the requests so that only 4000 or fewer run in a given loop?
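To make the "break up the requests" option concrete, here is a minimal sketch of running work in slices so that no more than a fixed number of requests are in flight at once. The helper name `run_in_batches`, its parameters, and the `$fakeRunner` stand-in are hypothetical and not part of Guzzle or of the script above. The stand-in only records batch sizes instead of doing any I/O, so the sketch says nothing about whether batching avoids the FD growth seen when a request fails.

```php
<?php
declare(strict_types=1);

/**
 * Run $runBatch over $paths in slices of at most $maxConcurrent items.
 * $runBatch is a hypothetical callable that starts the requests for one
 * slice and blocks until all of them have settled.
 *
 * @param string[] $paths
 * @return array Results of every batch, merged in order
 */
function run_in_batches(array $paths, int $maxConcurrent, callable $runBatch): array
{
    if ($maxConcurrent < 1) {
        throw new InvalidArgumentException('maxConcurrent must be at least 1');
    }

    $results = [];
    foreach (array_chunk($paths, $maxConcurrent) as $batch) {
        // Only count($batch) requests are handed to the runner at a time
        foreach ($runBatch($batch) as $result) {
            $results[] = $result;
        }
    }

    return $results;
}

// Stand-in for a real batch runner: records the batch size instead of doing I/O.
$sizes = [];
$fakeRunner = function (array $batch) use (&$sizes): array {
    $sizes[] = count($batch);
    return array_map('strtoupper', $batch);
};

$out = run_in_batches(['/a', '/b', '/c', '/d', '/e'], 2, $fakeRunner);

echo implode(',', $sizes), PHP_EOL; // 2,2,1
echo implode(',', $out), PHP_EOL;   // /A,/B,/C,/D,/E
```

In the script above, the runner could build the promises for one slice with `requestAsync` and then call `settle(...)->wait()` before returning. Guzzle also provides `GuzzleHttp\Pool`, whose `concurrency` option bounds the number of in-flight requests; it may be a better fit than manual slicing.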
Here's my output from running the above (the extra error logs come from lines I added inside GuzzleHttp to track when various objects are constructed and destructed; they do not otherwise change its behavior):