Last active May 30, 2024 06:40
Latency Numbers Every Programmer Should Know
Latency Comparison Numbers (~2012)

L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD
Read 1 MB sequentially from memory     250,000   ns      250 us
Round trip within same datacenter      500,000   ns      500 us
Read 1 MB sequentially from SSD*     1,000,000   ns    1,000 us    1 ms  ~1GB/sec SSD, 4X memory
Disk seek                           10,000,000   ns   10,000 us   10 ms  20x datacenter roundtrip
Read 1 MB sequentially from disk    20,000,000   ns   20,000 us   20 ms  80x memory, 20X SSD
Send packet CA->Netherlands->CA    150,000,000   ns  150,000 us  150 ms

1 ns = 10^-9 seconds
1 us = 10^-6 seconds = 1,000 ns
1 ms = 10^-3 seconds = 1,000 us = 1,000,000 ns
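
The unit conversions and the "Nx" notes in the table can be reproduced with a few lines of Python. This is only a sketch: the dictionary copies the nanosecond column from the table, and the names `LATENCY_NS`, `humanize`, and `ratio` are made up for illustration.

```python
# Latencies in nanoseconds, copied from the table above (~2012 figures).
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Branch mispredict": 5,
    "L2 cache reference": 7,
    "Mutex lock/unlock": 25,
    "Main memory reference": 100,
    "Compress 1K bytes with Zippy": 3_000,
    "Send 1K bytes over 1 Gbps network": 10_000,
    "Read 4K randomly from SSD": 150_000,
    "Read 1 MB sequentially from memory": 250_000,
    "Round trip within same datacenter": 500_000,
    "Read 1 MB sequentially from SSD": 1_000_000,
    "Disk seek": 10_000_000,
    "Read 1 MB sequentially from disk": 20_000_000,
    "Send packet CA->Netherlands->CA": 150_000_000,
}


def humanize(ns: float) -> str:
    """Format a duration in the largest unit (ms, us, ns) that keeps it >= 1."""
    if ns >= 1_000_000:       # 1 ms = 1,000,000 ns
        return f"{ns / 1_000_000:g} ms"
    if ns >= 1_000:           # 1 us = 1,000 ns
        return f"{ns / 1_000:g} us"
    return f"{ns:g} ns"


def ratio(slower: str, faster: str) -> float:
    """How many times longer `slower` takes than `faster`."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


if __name__ == "__main__":
    for name, ns in LATENCY_NS.items():
        print(f"{name:<36} {humanize(ns):>8}")
    print(ratio("Disk seek", "Round trip within same datacenter"))
```

Computing `ratio("Main memory reference", "L2 cache reference")` this way gives about 14.3 rather than the 20x in the notes column, so the notes are best read as rough multiples.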
By Jeff Dean:
Originally by Peter Norvig:
'Humanized' comparison:
Visual comparison chart:

nking commented Jun 8, 2023

Thanks for sharing your updates.

You could consider adding a context switch for threads right under disk seek:
computer context switches: 1e7 ns

I see "Read 1 MB sequentially from disk", but how about disk write?

SergeSEA commented Dec 20, 2023

The numbers come from Dr. Dean of Google and describe the length of typical computer operations in 2010. I hope someone can update them, since it's now 2023.

The numbers should still be quite similar.

These numbers are based on physical limitations; only a significant technological leap can make a difference.

In any case, these are for estimates, not exact calculations. For example, a 1 MB read from SSD is different for each SSD, but it should be somewhere around the millisecond range.
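
As a sketch of that kind of back-of-the-envelope estimate, the per-MB figures from the table can be scaled linearly to larger reads. The linear scaling, the `SEQ_READ_US_PER_MB` dictionary, and the `estimate_read_ms` function are assumptions for illustration; real devices will differ, which is the point of treating the result as an order of magnitude only.

```python
# Sequential read cost per 1 MB, in microseconds, taken from the gist's table.
SEQ_READ_US_PER_MB = {
    "memory": 250,
    "ssd": 1_000,
    "disk": 20_000,
}


def estimate_read_ms(megabytes: float, medium: str) -> float:
    """Rough sequential read time in milliseconds.

    Assumes cost grows linearly with size, which ignores caching,
    queueing, and device differences. Treat the result as an estimate.
    """
    return megabytes * SEQ_READ_US_PER_MB[medium] / 1_000


if __name__ == "__main__":
    # Estimate reading 1,000 MB from each medium.
    for medium in SEQ_READ_US_PER_MB:
        print(f"{medium:<6} ~{estimate_read_ms(1_000, medium):,.0f} ms")
```

Under these assumptions, 1,000 MB comes out at roughly a quarter of a second from memory, about a second from SSD, and about twenty seconds from disk.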

xealits commented Jan 31, 2024

It could be useful to add a column with the sizes at each level of the hierarchy, and another with the minimal memory unit sizes, such as the cache line size. You could then divide the sizes by the latencies, which would give a rough upper limit on the throughput of a simple algorithm. Not really sure if this is useful, though.
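
A minimal sketch of that size-divided-by-latency idea, using only the rows of the table that already state a transfer size. It assumes decimal units (1 MB = 10^6 bytes, 1K = 1,000 bytes, 4K = 4,000 bytes); the `TRANSFERS` list and `throughput_mb_per_s` function are hypothetical names.

```python
# (operation, bytes transferred, latency in ns) from the gist's table.
# Sizes assume decimal units: 1K = 1,000 bytes, 1 MB = 1,000,000 bytes.
TRANSFERS = [
    ("Send 1K bytes over 1 Gbps network", 1_000, 10_000),
    ("Read 4K randomly from SSD", 4_000, 150_000),
    ("Read 1 MB sequentially from memory", 1_000_000, 250_000),
    ("Read 1 MB sequentially from SSD", 1_000_000, 1_000_000),
    ("Read 1 MB sequentially from disk", 1_000_000, 20_000_000),
]


def throughput_mb_per_s(size_bytes: int, latency_ns: int) -> float:
    """Implied throughput in MB/s.

    bytes per ns equals GB per s, so multiplying by 1,000 gives MB/s.
    """
    return size_bytes * 1_000 / latency_ns


if __name__ == "__main__":
    for name, size, ns in TRANSFERS:
        print(f"{name:<36} {throughput_mb_per_s(size, ns):>8,.1f} MB/s")
```

The sequential SSD row works out to 1,000 MB/s, which matches the "~1GB/sec SSD" note in the table, while the random 4K SSD read implies a far lower figure.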

As an updated point of reference for the first few numbers, Apple give a table in their Apple Silicon CPU Optimization guide. You can see they are extremely similar to the original figures:

[Image: Apple Silicon CPU latency]
