Skip to content

Instantly share code, notes, and snippets.

@parhamr
Last active November 29, 2019 09:46
Show Gist options
  • Star 17 You must be signed in to star a gist
  • Fork 3 You must be signed in to fork a gist
  • Save parhamr/10091093 to your computer and use it in GitHub Desktop.
Save parhamr/10091093 to your computer and use it in GitHub Desktop.

Magento Scaling Hypotheses

Magento’s checkout throughput can increase to at least 8 times its current capacity and up to 26 times more in ideal conditions.

  1. The current checkouts per hour limit for large, real-world Magento stores is 4,500
  2. This limit cannot effectively be increased with more and/or better hardware
  3. The improper type handling in Magento’s SQL code is the cause of the current limit
  4. If one SQL query is fixed, large Magento stores can scale to a new, real-world limit of 120,000 checkouts per hour
  5. For commodity hardware, this new limit might be 36,000 checkouts per hour

Background Assumptions

Many load and performance tests for Magento have shown impressive numbers for checkouts per hour, but the tests have been performed with invalid conditions. The 2012 publication Optimizing Magento for Peak Performance from Magento lists 487,000 orders per day as the framework’s capacity. This means Magento’s least performant SQL query under test conditions is only 177 ms. Multiple independent parties have seen this query require approximately 800 ms in production for large Magento stores.

Magento’s read operations (SQL SELECT) can scale to nearly as many threads as you’re able to afford to run. The framework’s most expensive write operations—finalizing and saving an order in checkout—are fully and absolutely limited to a single thread.

The fundamental bottleneck is SQL INSERT, UPDATE, and DELETE commands against tables like sales_flat_order_grid and catalog_product_index_price. Only one thread should write to these tables at any given moment, so it is practically impossible for Magento’s checkout throughput to exceed the rate of the slowest average write operation on either of these tables. (Technically it can be done but it violates ACID compliance and threatens referential integrity.)

Some Numbers

1 hour = 3600 seconds

4500 checkouts per hour is the maximum known real world vanilla example in a large store (Source: Magento Expert Consulting Group)

700–900 ms is commonly the slowest transaction (catalog_product_index_price - DELETE) in Magento’s checkout for a large store (Source: https://gist.github.com/parhamr/6177160)

3600 / 4500 = 0.8

This is a really good fit between the 4500 checkout limit and the 800 ms query, but this correlation has not been verified as causally linked.


Assuming the above query is improved…

33 ms is the average execution time for sales_flat_order_grid - INSERT — the next-slowest, single-threaded bottleneck during checkout (source: 3 months of New Relic data for a large cluster)

3600 / 0.03 = 120000 checkouts per hour


To be pessimistic…

100 ms as the new, single-threaded bottleneck should be easily attainable in less than ideal conditions, assuming the above query is fixed

3600 / 0.1 = 36000 checkouts per hour

Four Slow Checkout Queries

table operation average response time
catalog_product_index_price DELETE 806 ms
sales_shipping_aggregated_order DELETE 95.6 ms
catalog_product_index_price INSERT 35.2 ms
sales_flat_order_grid INSERT 33.4 ms
@astorm
Copy link

astorm commented Apr 8, 2014

This is deeper water Magento technical waters than I've dived in a long time, but the assumptions seems reasonable. The key problem to getting traction on this right now will be, as always the unclear direction from Magento on the future of their 1.x platform. Magento 2 has enough changes that I imagine the profiling you've done no longer applies (there might be new, weirder edge cases), and Magento's been mum on what (if anything) they're doing for a 1.8.2 CE/ 1.14 EE release.

@choukalos
Copy link

Thanks Reid! I've passed along the analysis to our 1.x product lead and our 2.x performance lead.

@davidalger
Copy link

Interesting thoughts and results… kind of a side-note, but a confidential white-paper which I can't share had tests showing EE 1.11 was capable of handling 2,152 orders per minute (3.1 million per day) on some very high-end HP ProLiant Converged Infrastructure.

That said, I've seen cases of improper type handling cause major performance issues. Definitely an issue which should be addressed.

@aschroder
Copy link

Very interesting numbers, also in your scalable cluster config gist (https://gist.github.com/parhamr/6177160).

One thought while reading this, has anyone experimented with making orders asynchronous? i.e when the customer checks out - add the order to a queue and then have one or more backends taking orders off the queue and processing them.

The obvious risk is that by the time an order processes, the item that was in stock at the time the order was placed, is now out of stock - there'd be ways to mitigate that, or in some stores, simply handle it as an exception (email customer etc).

@lukenm
Copy link

lukenm commented Apr 9, 2014

Hi Reid. Very interesting.
This made me wonder whether table partitioning could be used to increase write throughput. Something like this: http://www.slideshare.net/bluesmoon/scaling-mysql-writes-through-partitioning-3397422. I wonder if this could be applied to Magento's checkout tables.

@jonpday
Copy link

jonpday commented Apr 9, 2014

@aschroder there is this PoC from @punkstar that I've been meaning to test for a while (https://github.com/punkstar/magento-delayed-events).

@davidalger and @parhamr Seems crazy that type handling could be so expensive, but certainly feasible!

@tim-bezhashvyly
Copy link

I join all people above. Nothing more to add.

Well why it's always that the theory is not meeting practice.

@SchumacherFM
Copy link

@aschroder Here is a PoC for async orders https://github.com/magento-hackathon/MongoDB-OrderTransactions

The PoC from @punkstar does not relate to orders. Only for some custom observers.

The risk with the stock amount can kill your business. It must be for 100% sure that the customers get their ordered quantity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment