Performance comparison: in_array vs. isset vs. array_key_exists
<?php declare(strict_types = 1);

// Runs $closure $runs times and prints the total wall-clock time.
function testPerformance(string $name, Closure $closure, int $runs = 1000000): void
{
    $start = microtime(true);
    for (; $runs > 0; $runs--) {
        $closure();
    }
    $end = microtime(true);
    printf("Function call %s took %.5f seconds\n", $name, $end - $start);
}

// Build a shuffled list of unique values that is guaranteed to contain 1111111.
$items = [1111111];
for ($i = 0; $i < 100000; $i++) {
    $items[] = rand(0, 1000000);
}
$items = array_unique($items);
shuffle($items);

// The same values stored as keys of a hash map, each with a dummy value.
$assocItems = array_combine($items, array_fill(0, count($items), true));

$in_array = function () use ($items) {
    in_array(1111111, $items);
};
$isset = function () use ($assocItems) {
    // Key lookup on the hash map; only $assocItems is imported into this closure.
    isset($assocItems[1111111]);
};
$array_key_exists = function () use ($assocItems) {
    array_key_exists(1111111, $assocItems);
};

testPerformance('in_array', $in_array, 100000);
testPerformance('isset', $isset, 100000);
testPerformance('array_key_exists', $array_key_exists, 100000);
$ php in_array_vs_isset_vs_array_key_exists.php 
Function call in_array took 1.51871 seconds
Function call isset took 0.14684 seconds
Function call array_key_exists took 0.22123 seconds
@geoglis

geoglis commented Apr 5, 2019

Attention!

"in_array" searches for a value in an array, whereas 'isset' and 'array_key_exists' search for a key inside an array!

Everybody has to keep this in mind. You cannot just replace 'in_array' with the other ones.

@nicolaspernot

> Attention!
>
> "in_array" searches for a value in an array, whereas 'isset' and 'array_key_exists' search for a key inside an array!
>
> Everybody has to keep this in mind. You cannot just replace 'in_array' with the other ones.

The question is: is it better to look for a value among an array's values or among its keys?
When you build your array, do you construct it this way?


 [0] => 'my_value_0'
 [1] => 'my_value_1'
 ...
 [152364] => 'my_value_152364'

Or this way?


 ['my_value_0'] => 'my_value_0'
 ['my_value_1'] => 'my_value_1'
 ...
 ['my_value_152364'] => 'my_value_152364'

Based on the script from alcaeus, I understand that the second way is a lot faster for checking whether a value exists. But obviously this method only suits appropriate cases, meaning it can't work all the time for all types of values.
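One reason it cannot work for every type of value is that PHP array keys can only be integers or strings, and numeric strings are cast to integers when used as keys (arrays and objects cannot be keys at all). A minimal sketch of that effect, using made-up sample values:

```php
<?php declare(strict_types = 1);

// Hypothetical sample values: an integer, a numeric string and a plain string.
$values = [42, '42', 'foo'];

// Store each value as a key with a dummy value.
$set = [];
foreach ($values as $value) {
    $set[$value] = true;
}

// PHP casts the numeric string '42' to the integer key 42,
// so the two distinct values collapse into a single key.
var_dump(array_keys($set));                     // keys: 42 and 'foo'

// A strict in_array() can still tell 42 and '42' apart; a key lookup cannot.
$strictHit = in_array('42', [42, 'foo'], true); // false
$keyHit    = isset($set['42']);                 // true
var_dump($strictHit, $keyHit);
```

So the keyed layout is a safe swap mainly when the values are all strings or all integers to begin with.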

@alcaeus
Author

alcaeus commented Sep 18, 2019

> Attention!
>
> "in_array" searches for a value in an array, whereas 'isset' and 'array_key_exists' search for a key inside an array!
>
> Everybody has to keep this in mind. You cannot just replace 'in_array' with the other ones.

Only saw this comment now. Obviously you can't replace them one for one. This gist shows a comparison I did when we were deciding whether to store a list of values or whether it would be faster to store them in a hash map with a dummy value. As you can see, isset is faster than in_array by an order of magnitude, so if you can choose how you store the data, you may very well use a hash map instead of an array.

@michaelbutler

One More Note!

It is true that isset on a hash map is faster than searching through an array for a value (in_array), but keep in mind that converting an array of values, ["foo", "bar", "baz"], to a hash map, ["foo" => true, "bar" => true, "baz" => true], incurs a memory cost (as well as potentially the cost of constructing the hash map, depending on how and when you do it). As with all things, you'll have to weigh the pros and cons for each case to determine whether a hash map or an array (list) of values works best for your needs. This isn't specific to PHP but more of a general problem space of computer science.
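The conversion described above could be sketched like this. Using array_fill_keys() is just one way to do it (array_flip() would also work, storing the old indexes as values), and the variable names are only illustrative:

```php
<?php declare(strict_types = 1);

$list = ['foo', 'bar', 'baz'];

// Turn the list into a hash map whose keys are the former values.
// $map becomes ['foo' => true, 'bar' => true, 'baz' => true].
$map = array_fill_keys($list, true);

// Both lookups answer the same question for string values like these.
$inList = in_array('bar', $list, true); // linear scan over the values
$inMap  = isset($map['bar']);           // hash lookup on the keys
$miss   = isset($map['qux']);           // a value that is not stored

var_dump($inList, $inMap, $miss);
```

While both $list and $map are alive, the data is held twice, which is the memory cost in question.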

@kiatng

kiatng commented Jun 2, 2020

PHP Version 7.4.2

Run #1
Function call in_array took 4.25003 seconds
Function call isset took 0.00490 seconds
Function call array_key_exists took 0.00473 seconds

Run #2
Function call in_array took 3.42650 seconds
Function call isset took 0.00491 seconds
Function call array_key_exists took 0.00493 seconds

Run #3
Function call in_array took 5.46345 seconds
Function call isset took 0.00492 seconds
Function call array_key_exists took 0.00473 seconds

@fefas

fefas commented Feb 25, 2021

PHP Version 8.0.1 (inside docker)

Run #1

Function call in_array took 12.75440 seconds
Function call isset took 0.00333 seconds
Function call array_key_exists took 0.00318 seconds

Run #2

Function call in_array took 3.31103 seconds
Function call isset took 0.00338 seconds
Function call array_key_exists took 0.00323 seconds

Run #3

Function call in_array took 3.78831 seconds
Function call isset took 0.00336 seconds
Function call array_key_exists took 0.00346 seconds

@Webist

Webist commented May 8, 2021

PHP 8.0.3 (cli)
MacOS 2015
2,7 GHz Dual-Core Intel Core i5
8 GB RAM

class Foo{};  
$foo = new Foo();  
foreach($items as $item){  
    $foo->{$item} = null;  
}  
$property_exists = function () use ($foo) {  
    property_exists($foo, '1111111');  
};
testPerformance('in_array', $in_array, 100000);                // 15.41009 seconds  
testPerformance('isset', $isset, 100000);                      //  0.00544 seconds  
testPerformance('array_key_exists', $array_key_exists, 100000);//  0.00600 seconds  
testPerformance('property_exists', $property_exists, 100000);  //  0.01304 seconds
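
A side note on semantics rather than speed: the snippet above assigns null to every property, and isset() treats a null value as "not set", while property_exists() and array_key_exists() only check that the name or key exists. A small sketch of the difference, using stdClass and made-up keys instead of the Foo class:

```php
<?php declare(strict_types = 1);

// Illustrative array with one key whose value is null.
$assoc = ['present' => true, 'nullable' => null];

$issetArray  = isset($assoc['nullable']);            // false: the value is null
$existsArray = array_key_exists('nullable', $assoc); // true: the key exists

// Illustrative object with one property set to null.
$obj = new stdClass();
$obj->nullable = null;

$issetObject  = isset($obj->nullable);               // false
$existsObject = property_exists($obj, 'nullable');   // true

var_dump($issetArray, $existsArray, $issetObject, $existsObject);
```

This is why the benchmark stores true as the dummy value: with null, the isset check would always report false.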

@faustoFF

Guys, you have an error on line 29: the isset closure only imports $assocItems, so $items is always NULL there.

@JonasKraska

Just discussed the topic of this benchmark with a colleague the other day. He pointed out that, due to the nature of these data structures, you most likely pay for the fast lookup with the extra time and memory needed to build the data structure in the first place (as @michaelbutler mentioned before). So whether you can capitalize on the differences suggested by this benchmark comes down to very specific use cases (read-heavy vs. write-heavy access). To bring numbers to the table, I measured the time and memory needed to build the generic item array vs. the associative version in a comparable fashion:

PHP 8.0.14 (cli)
MacOS 11.4
2,6 GHz 6-Core Intel Core i7
32 GB 2667 MHz DDR4

Time used to build the item array: 0.00682 seconds
Memory used for item array: 4198480 bytes
Time used to build the associative item array: 0.01031 seconds (1.48x)
Memory used for associative item array: 5242960 bytes (1.25x)

Function call in_array took 3.74565 seconds
Function call isset took 0.04157 seconds
Function call array_key_exists took 0.04312 seconds

Code for reference (quick and dirty):

// ...

$base = [1111111];
for ($i = 0; $i < 100000; $i++) {
    $base[] = rand(0, 1000000);
}
$base = array_unique($base);
shuffle($base);

// Measure time and memory needed to build the plain list.
$startMemItems = memory_get_usage(true);
$startTimeItems = microtime(true);
$items = [];
foreach ($base as $k => $v) {
    $items[] = $v;
}
$endTimeItems = microtime(true);
$endMemItems = memory_get_usage(true);

printf("Time used to build the item array: %.5f seconds\n", $endTimeItems - $startTimeItems);
printf("Memory used for item array: %d bytes\n", $endMemItems - $startMemItems);

// Measure time and memory needed to build the associative version.
$startMemAssocItems = memory_get_usage(true);
$startTimeAssocItems = microtime(true);
$assocItems = [];
foreach ($base as $k => $v) {
    $assocItems[$v] = true;
}
$endTimeAssocItems = microtime(true);
$endMemAssocItems = memory_get_usage(true);

// The value in parentheses is a ratio relative to the plain list, not a percentage.
printf(
    "Time used to build the associative item array: %.5f seconds (%.2fx)\n",
    $endTimeAssocItems - $startTimeAssocItems,
    ($endTimeAssocItems - $startTimeAssocItems) / ($endTimeItems - $startTimeItems)
);
printf(
    "Memory used for associative item array: %d bytes (%.2fx)\n",
    $endMemAssocItems - $startMemAssocItems,
    ($endMemAssocItems - $startMemAssocItems) / ($endMemItems - $startMemItems)
);

// ...
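
To make the read-heavy vs. write-heavy trade-off concrete, here is a rough break-even estimate based only on the figures quoted above. It assumes the associative array is built solely to speed up lookups and that the averaged timings are representative, which they may not be for other data or machines:

```php
<?php declare(strict_types = 1);

// Figures copied from the measurements quoted above (100000 lookups each).
$lookups        = 100000;
$inArrayTotal   = 3.74565; // seconds for in_array
$issetTotal     = 0.04157; // seconds for isset
$buildAssocTime = 0.01031; // seconds to build the associative array

// Average time saved per lookup by using isset instead of in_array.
$savedPerLookup = ($inArrayTotal - $issetTotal) / $lookups;

// Number of lookups after which building the map has paid for itself.
$breakEven = (int) ceil($buildAssocTime / $savedPerLookup);

printf("Saved per lookup: %.2f microseconds\n", $savedPerLookup * 1e6);
printf("Break-even after about %d lookups\n", $breakEven);
```

With these particular numbers the map pays off after a few hundred lookups; for a handful of lookups on data that arrives as a list, in_array may still be the cheaper choice.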

@AnrDaemon

This comes down to what you have on hand.
If you have a static structure you compare against (e.g. filtering out sensitive content before passing data to the client), you can have a map prepared in code for the fastest possible lookup.
When you have unknown data to search for keys of interest, it's most likely not worth rebuilding it, as that could cost more than what you save on the lookup itself.
So, choose wisely!
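
A minimal sketch of the static-structure case, assuming a hypothetical set of sensitive field names and a made-up record (none of these names come from the thread):

```php
<?php declare(strict_types = 1);

// Hypothetical sensitive field names, written directly as a map
// so no conversion is needed at runtime.
const SENSITIVE_KEYS = [
    'password'  => true,
    'api_token' => true,
];

// Hypothetical record about to be sent to a client.
$record = ['id' => 7, 'name' => 'Ada', 'password' => 'secret', 'api_token' => 'abc'];

// array_diff_key() keeps the entries of $record whose keys are not in the map.
$public = array_diff_key($record, SENSITIVE_KEYS);

var_dump($public); // only 'id' and 'name' remain
```

Because the map is a constant in the source code, there is no build step to pay for on each request.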
