Last active March 17, 2022 07:09
Comparison of PHP `stream_get_line` and `fgets` for processing large files

Regarding Leigh Purdie's comment (from 4 years ago) that stream_get_line is better for large files: I decided to test this in case it had been optimized since then, and I found that Leigh's comment is simply incorrect. fgets actually performs slightly better, but the test Leigh ran was not set up to produce good results.

The suggested test was:

$ time yes "This is a test line" | head -1000000 | php -r '$fp=fopen("php://stdin","r"); while($line=stream_get_line($fp,65535,"\n")) { 1; } fclose($fp);'

$ time yes "This is a test line" | head -1000000 | php -r '$fp=fopen("php://stdin","r"); while($line=fgets($fp,65535)) { 1; } fclose($fp);'


The reason this is invalid is that the buffer size of 65535 is completely unnecessary. Piping the output of yes "This is a test line" into PHP makes each line 19 characters plus the delimiter. I don't know why stream_get_line performs better with an oversized buffer, but if both buffer sizes are correct, or left at the default, the performance difference is negligible, although notably stream_get_line is consistent.

If you're thinking of switching, be aware of the difference between the two functions: stream_get_line does NOT append the delimiter, and fgets DOES append the delimiter.
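To make the delimiter difference concrete, here is a minimal sketch that reads the same line with both functions. It is not part of the original test; the php://memory stream, the sample text, and the variable names are assumptions chosen only for illustration.

```php
<?php
// Write two lines into an in-memory stream so the sketch needs no real file.
$fp = fopen('php://memory', 'w+');
fwrite($fp, "first\nsecond\n");

// fgets keeps the trailing newline on the returned line.
rewind($fp);
$viaFgets = fgets($fp);

// stream_get_line consumes the delimiter but does not return it.
rewind($fp);
$viaStreamGetLine = stream_get_line($fp, 8192, "\n");

fclose($fp);

var_dump(strlen($viaFgets));          // 6 bytes: "first" plus "\n"
var_dump(strlen($viaStreamGetLine));  // 5 bytes: "first" only

// Trimming the line ending makes the fgets result match stream_get_line.
var_dump(rtrim($viaFgets, "\r\n") === $viaStreamGetLine);
```

Code that compares or concatenates lines needs to account for this when moving from one function to the other.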

Here are the results on one of my servers:

Buffer size of 65535:
stream_get_line: 0.340s
fgets: 2.392s

Buffer size of 1024:
stream_get_line: 0.348s
fgets: 0.404s

Buffer size of 8192 (the default for both):
stream_get_line: 0.348s
fgets: 0.552s

Buffer size of 100:
stream_get_line: 0.332s
fgets: 0.368s
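
For anyone who wants to repeat the comparison across several buffer sizes, the following is a rough sketch of a timing script. It is not the script used for the numbers above: it reads a temporary file instead of a pipe from yes, which may change the timings, and the time_reader helper, the $lineCount variable, and the file name prefix are hypothetical. It assumes PHP 7.4 or later for arrow functions.

```php
<?php
// Hypothetical helper: time one full pass over $path using the given reader.
function time_reader(string $path, callable $readLine): array
{
    $fp = fopen($path, 'r');
    $lines = 0;
    $start = microtime(true);
    // Compare against false so a line such as "0" does not end the loop early.
    while (($line = $readLine($fp)) !== false) {
        $lines++;
    }
    $elapsed = microtime(true) - $start;
    fclose($fp);
    return [$lines, $elapsed];
}

// Build input similar to: yes "This is a test line" | head -1000000
$lineCount = 1000000;
$path = tempnam(sys_get_temp_dir(), 'bench');
file_put_contents($path, str_repeat("This is a test line\n", $lineCount));

$results = [];
foreach ([65535, 8192, 1024, 100] as $size) {
    $results[$size] = [
        'stream_get_line' => time_reader($path, fn($fp) => stream_get_line($fp, $size, "\n")),
        'fgets'           => time_reader($path, fn($fp) => fgets($fp, $size)),
    ];
    foreach ($results[$size] as $name => [$lines, $elapsed]) {
        printf("size %5d  %-16s %d lines  %.3fs\n", $size, $name, $lines, $elapsed);
    }
}

unlink($path);
```

Timings will vary by machine and PHP version, so treat any single run as indicative only.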

hAbd0u commented Apr 10, 2021

I was about to run some benchmarks because I have large files to process, so you just saved me some time ;). However, when someone reads your results, it seems stream_get_line is the winner here, but you are saying the opposite. Would you explain this to me?


InfinitumForm commented Mar 17, 2022

This is a perfect explanation. Let me give you something great. If you need to read a file as fast as possible, this is the right solution:

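$path = 'some/large/file.json';
$data = '';
$chunk_length = 1024;
$fh = fopen($path, 'r');
// Read the file in chunks of up to $chunk_length bytes until EOF.
while (($line = stream_get_line($fh, $chunk_length)) !== false) {
	// Loop body missing in the original; appending to $data is the presumed intent.
	$data .= $line;
}
fclose($fh); unset($fh);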

I use this in a loop to read around 300 JSON files in some cases and merge the data into one array after json_decode. I get results in around 1 second.

NOTE: If you read files in a foreach loop, you don't need unset($fh); on every iteration. Just unset it once the loop has finished.
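
As a rough sketch of that loop, the following reads every JSON file in a directory in chunks and merges the decoded arrays. The read_file_chunked helper, the demo directory, and the two sample files are hypothetical and exist only to make the example self-contained; merging with array_merge is one possible reading of "merge data to one array".

```php
<?php
// Hypothetical helper: read a whole file in fixed-size chunks.
function read_file_chunked(string $path, int $chunkLength = 1024): string
{
    $fh = fopen($path, 'r');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $data = '';
    // Without a delimiter, stream_get_line returns up to $chunkLength bytes per call.
    while (($chunk = stream_get_line($fh, $chunkLength)) !== false) {
        $data .= $chunk;
    }
    fclose($fh);
    return $data;
}

// Demo input: two small JSON files in a temporary directory (illustrative only).
$dir = sys_get_temp_dir() . '/json_demo_' . uniqid();
mkdir($dir);
file_put_contents("$dir/a.json", json_encode(['a' => 1]));
file_put_contents("$dir/b.json", json_encode(['b' => 2]));

$merged = [];
foreach (glob("$dir/*.json") as $path) {
    $decoded = json_decode(read_file_chunked($path), true);
    if (is_array($decoded)) {
        $merged = array_merge($merged, $decoded);
    }
}
print_r($merged);

// Clean up the demo files.
array_map('unlink', glob("$dir/*.json"));
rmdir($dir);
```

Because the handle lives inside the helper function, it goes out of scope on return, so no unset($fh) is needed at all in this variant.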
