@michaelbutler
Created March 13, 2019 02:47
This test script leaves file descriptors open until the process exits, even though all local variables should have gone out of scope by then.
#!/usr/bin/php
<?php
use GuzzleHttp\Client;
use Psr\Http\Message\ResponseInterface;
error_reporting(E_ALL);
ini_set('display_errors', 1);
ini_set('error_log', null); // error_log to stderr
require_once 'vendor/autoload.php';
function output_file_descriptors(string $label)
{
    usleep(50000);
    $pid = getmypid();
    $dir = sprintf('/proc/%d/fd/*', $pid);
    $files = glob($dir);
    error_log(sprintf("*** IN %s: OPEN FDs %d", $label, count($files)));
    usleep(50000);
}
/**
 * Make a bunch of concurrent curl requests to a single domain on different paths
 * @return array Map of promise key to its settled result
 */
function run_script_chunk()
{
    $client = new Client([
        'base_uri' => 'https://66.media.tumblr.com/',
        'sink' => '/dev/null',
        'proxy' => null,
    ]);
    $paths = [
        '/ca13692b4d21ac673e6c4eea232ceee1/tumblr_pm43a9jP7o1thjq1e_640.jpg',
        '/4ceb016f5174868bf90ce97ed80a979e/tumblr_po9kpyv1Zc1xoyw8po1_640.jpg',
        '/605dfc39f03599e04a05d6bee72da699/tumblr_poa05jz0CB1w7u8nmo1_1280.jpg',
        '/11e37ea7e1be01f08fcccb859a1076f6/tumblr_pirts4a9HG1sn3ne4o1_1280.jpg',
        '/077b41778bceaebe5e4c15f48be65677/tumblr_po9vj74Irk1rsezm9o1_1280.jpg',
        '/6ec1d80af8c181d3e597f1b1b0ae681f/tumblr_po9zb72brS1r7b7ly_540.jpg',
        '/e869e2e1ab780e9a39b297f4ff08d3e5/tumblr_po9zkwyoTB1vz5npso1_1280.png',
        '/cbba1fb23610cca96275c4d3f4719f6c/tumblr_pk4pwgvPyU1qbtrumo1_1280.jpg',
        '/c7e0b3c226674c2381f9494761a2834c/tumblr_pjyk879oq91qdj91io2_540.png',
        '/c7e0b3c2266731194761a2834c/tumblr_pjyk879oq91qdj91io2_540.png', // Cause Exception!
        '/a08955930aed7ab081700efc44a26b05/tumblr_pllaadvGbz1thzx08o1_500.gif',
        '/98e382314d57a49380bac5662d9b6e18/tumblr_po9pul2Iyv1xoyw8po1_1280.png',
        '/8830736f845a7ce0d0e6f46c73fe77df/tumblr_osw9itaD1n1tyhiwmo1_1280.png',
        '/b5ba6a7e12b6c2e76431e8fd47d08a8e/tumblr_poa8cf4rwS1y72ak6o1_540.gif',
        '/82929a643f1c31b8e9b32781fa06d25a/tumblr_po6gt1nw331r3yybh_400.jpg',
        '/0ba4821cb51a7bac6a9e388014a7f879/tumblr_pmtv2w62hP1r4q6y5_400.jpg',
        '/2c95416f8ac98e0f83ca780afdb9ec59/tumblr_pmy0n9bqwL1qjhjrd_400.jpg',
    ];
    $promises = [];
    foreach ($paths as $index => $path) {
        // Initiate each request but do not block
        $promises['image' . $index] = $client->requestAsync('GET', $path)->then(
            function (ResponseInterface $res) {
                echo 'Got response! Code ' . $res->getStatusCode() . PHP_EOL;
            },
            function (\GuzzleHttp\Exception\RequestException $exception) {
                echo 'Got exception: ' . $exception->getMessage() . PHP_EOL;
            }
        );
    }
    // $results = \GuzzleHttp\Promise\unwrap($promises);
    // Wait for every promise to settle, whether fulfilled or rejected
    $results = \GuzzleHttp\Promise\settle($promises)->wait();
    return $results;
}
function run_outer_loop()
{
    $responses = run_script_chunk();
}
// MAIN *******
output_file_descriptors('BEGIN');
// Run the complete thing a few times
foreach (range(1, 4) as $iter) {
    run_outer_loop();
    output_file_descriptors("LOOP");
}
output_file_descriptors('FINAL');
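
The script above logs absolute descriptor counts. When hunting a leak it can be easier to look at the change across one unit of work. The following is a minimal sketch of that idea, not part of the original gist: `count_open_fds()` and `fd_delta()` are hypothetical helper names, and the approach assumes a Linux-style `/proc/self/fd` directory (it returns `null` elsewhere).

```php
<?php
/**
 * Count this process's open file descriptors by listing /proc/self/fd.
 * Linux only: returns null when the directory cannot be read.
 * The listing may include the descriptor used to read the directory itself,
 * so only differences between two counts are meaningful.
 */
function count_open_fds(): ?int
{
    $entries = @scandir('/proc/self/fd');
    if ($entries === false) {
        return null;
    }
    // Drop the "." and ".." directory entries
    return count(array_diff($entries, ['.', '..']));
}

/**
 * Run $work and report how many descriptors were gained (or lost) across it.
 */
function fd_delta(callable $work): ?int
{
    $before = count_open_fds();
    $work();
    $after = count_open_fds();
    if ($before === null || $after === null) {
        return null;
    }
    return $after - $before;
}
```

Under these assumptions, wrapping a call such as `run_outer_loop` in `fd_delta()` would report a positive number for each iteration that leaves descriptors open, and zero for one that cleans up after itself.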
@michaelbutler
Author

Here's my output from running the above. The extra log lines come from statements I added inside GuzzleHttp to track when constructors and destructors are called on various objects; they do not otherwise change its behavior:

*** IN BEGIN: OPEN FDs 6
CONSTRUCT CURLFACTORY CALLED!
CONSTRUCT CURLFACTORY CALLED!
Got exception: Client error: `GET https://66.media.tumblr.com/c7e0b3c2266731194761a2834c/tumblr_pjyk879oq91qdj91io2_540.png` resulted in a `404 Not Found` response
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
HandlerStack Destructor Called. ===
*** IN LOOP: OPEN FDs 26
CONSTRUCT CURLFACTORY CALLED!
CONSTRUCT CURLFACTORY CALLED!
Got exception: Client error: `GET https://66.media.tumblr.com/c7e0b3c2266731194761a2834c/tumblr_pjyk879oq91qdj91io2_540.png` resulted in a `404 Not Found` response
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
HandlerStack Destructor Called. ===
*** IN LOOP: OPEN FDs 44
CONSTRUCT CURLFACTORY CALLED!
CONSTRUCT CURLFACTORY CALLED!
Got exception: Client error: `GET https://66.media.tumblr.com/c7e0b3c2266731194761a2834c/tumblr_pjyk879oq91qdj91io2_540.png` resulted in a `404 Not Found` response
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
HandlerStack Destructor Called. ===
*** IN LOOP: OPEN FDs 62
CONSTRUCT CURLFACTORY CALLED!
CONSTRUCT CURLFACTORY CALLED!
Got exception: Client error: `GET https://66.media.tumblr.com/c7e0b3c2266731194761a2834c/tumblr_pjyk879oq91qdj91io2_540.png` resulted in a `404 Not Found` response
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
HandlerStack Destructor Called. ===
*** IN LOOP: OPEN FDs 80
*** IN FINAL: OPEN FDs 80
CURL MULTI DESTRUCTED!!!
DESTRUCT CURLFACTORY CALLED!
DESTRUCT CURLFACTORY CALLED!
DESTRUCT CURLFACTORY CALLED!
CURL MULTI DESTRUCTED!!!
DESTRUCT CURLFACTORY CALLED!
DESTRUCT CURLFACTORY CALLED!
DESTRUCT CURLFACTORY CALLED!
CURL MULTI DESTRUCTED!!!
DESTRUCT CURLFACTORY CALLED!
CURL MULTI DESTRUCTED!!!
DESTRUCT CURLFACTORY CALLED!

@michaelbutler
Author

Here is the same script with the 404 URL commented out. No FD problems occur, and the destructors are called as soon as the objects go out of scope:

*** IN BEGIN: OPEN FDs 6
CONSTRUCT CURLFACTORY CALLED!
CONSTRUCT CURLFACTORY CALLED!
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
HandlerStack Destructor Called. ===
CURL MULTI DESTRUCTED!!!
DESTRUCT CURLFACTORY CALLED!
DESTRUCT CURLFACTORY CALLED!
*** IN LOOP: OPEN FDs 8
CONSTRUCT CURLFACTORY CALLED!
CONSTRUCT CURLFACTORY CALLED!
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
HandlerStack Destructor Called. ===
CURL MULTI DESTRUCTED!!!
DESTRUCT CURLFACTORY CALLED!
DESTRUCT CURLFACTORY CALLED!
*** IN LOOP: OPEN FDs 8
CONSTRUCT CURLFACTORY CALLED!
CONSTRUCT CURLFACTORY CALLED!
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
HandlerStack Destructor Called. ===
CURL MULTI DESTRUCTED!!!
DESTRUCT CURLFACTORY CALLED!
DESTRUCT CURLFACTORY CALLED!
*** IN LOOP: OPEN FDs 8
CONSTRUCT CURLFACTORY CALLED!
CONSTRUCT CURLFACTORY CALLED!
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
Got response! Code 200
HandlerStack Destructor Called. ===
CURL MULTI DESTRUCTED!!!
DESTRUCT CURLFACTORY CALLED!
DESTRUCT CURLFACTORY CALLED!
*** IN LOOP: OPEN FDs 8
*** IN FINAL: OPEN FDs 8
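
The contrast between the two runs (destructors firing at the end of each loop versus only at process exit) is consistent with objects being kept alive by a reference cycle, although nothing above proves that this is what happens inside Guzzle when a request fails; treat that as an assumption. What is certain is how PHP itself behaves: an object caught in a cycle is not destroyed when it goes out of scope, only when the cycle collector runs. The sketch below shows this with a hypothetical `Holder` class that stands in for anything owning a resource.

```php
<?php
// Hypothetical stand-in for an object that owns a resource such as a curl handle.
final class Holder
{
    public $peer = null;
    public static $destroyed = 0;

    public function __destruct()
    {
        self::$destroyed++;
    }
}

function make_plain(): void
{
    $a = new Holder(); // no cycle: destroyed when the function returns
}

function make_cycle(): void
{
    $a = new Holder();
    $b = new Holder();
    $a->peer = $b;
    $b->peer = $a; // cycle: refcounts never reach zero on their own
}

make_plain();
echo Holder::$destroyed, PHP_EOL; // 1: freed as soon as the scope ended

make_cycle();
echo Holder::$destroyed, PHP_EOL; // still 1: both objects are stuck in the cycle

gc_collect_cycles();
echo Holder::$destroyed, PHP_EOL; // 3: the collector ran both destructors
```

If a cycle is the cause here, calling `gc_collect_cycles()` after each loop iteration would be a cheap experiment: the FD count dropping back to its baseline would support the hypothesis, and no change would rule it out.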

@arderyp

arderyp commented Mar 13, 2019

@michaelbutler, should the destructors be called after each promise is fulfilled? Let's say the maximum number of open files the OS can handle is 4000, and opening any more than that results in guzzle/guzzle#1927. Given that scenario, even with no exceptions like the one you've demonstrated above, wouldn't you still run into the issue if you're processing over 4000 concurrent requests at a time (in a single promise generation loop)? If so, do you see the solution as being proper use of maxHandles (once it is fixed), or is the only solution to split up the requests so that only 4000 or fewer run in a given loop?
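
For illustration, the second option in the question above (splitting the work so that only a bounded number of requests run per loop) could look roughly like the sketch below. `process_in_batches()` and the `$runBatch` callback are hypothetical names, not Guzzle APIs. In the gist's setting the callback would presumably build the promises for one batch and wait for them to settle before returning, but that wiring is an assumption and is left out so the example runs on its own.

```php
<?php
/**
 * Feed $items to $runBatch in slices of at most $batchSize, so that no more
 * than $batchSize units of work are in flight at once.
 *
 * $runBatch receives one slice (original keys preserved) and must return an
 * array of results keyed the same way.
 */
function process_in_batches(array $items, int $batchSize, callable $runBatch): array
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1');
    }
    $results = [];
    foreach (array_chunk($items, $batchSize, true) as $batch) {
        // Keys are unique across batches, so array union keeps every result
        $results += $runBatch($batch);
    }
    return $results;
}

// Usage: ten items, at most four handled at a time.
$sizes = [];
$doubled = process_in_batches(range(1, 10), 4, function (array $batch) use (&$sizes) {
    $sizes[] = count($batch);
    return array_map(function ($n) {
        return $n * 2;
    }, $batch);
});
echo implode(',', $sizes), PHP_EOL; // 4,4,2
```

Batching like this only bounds the descriptor count if each batch actually releases its descriptors before the next one starts, which is exactly what the first run above fails to do.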
