https://www.infoq.com/articles/N-Plus-1/
- n+1 and API Composition
- n+1 and Distributed caching
https://www.infoq.com/articles/N-Plus-1/
The N+1 query problem is a common performance antipattern. It looks like this:
$cats = load_cats();
foreach ($cats as $cat) {
$cats_hats = load_hats_for_cat($cat);
// ...
}
Assuming load_cats() has an implementation that boils down to:
SELECT * FROM cat WHERE ...
..and load_hats_for_cat($cat) has an implementation something like this:
SELECT * FROM hat WHERE catID = ...
..you will issue "N+1" queries when the code executes, where N is the number of cats:
SELECT * FROM cat WHERE ...
SELECT * FROM hat WHERE catID = 1
SELECT * FROM hat WHERE catID = 2
SELECT * FROM hat WHERE catID = 3
SELECT * FROM hat WHERE catID = 4
SELECT * FROM hat WHERE catID = 5
...
The problem with this is that each query has quite a bit of overhead. It is much faster to issue 1 query which returns 100 results than to issue 100 queries which each return 1 result. This is particularly true if your database is on a different machine which is, say, 1-2ms away on the network. In this case, issuing 100 queries serially has a minimum cost of 100-200ms, even if they can be satisfied instantly by MySQL. This is far higher than the entire server-side generation cost for most Phabricator pages should be.
Fix the N+1 query problem by batching queries. Load all your data before iterating through it (this is oversimplified and omits error checking):
$cats = load_cats();
$hats = load_all_hats_for_these_cats($cats);
foreach ($cats as $cat) {
$cats_hats = $hats[$cat->getID()];
}
That is, issue these queries:
SELECT * FROM cat WHERE ...
SELECT * FROM hat WHERE catID IN (1, 2, 3, 4, 5, ...)