
@hellerbarde
Forked from jboner/latency.txt
Created May 31, 2012 13:16
Latency numbers every programmer should know

L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns             
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs
Read 1 MB sequentially from memory ..... 250,000 ns  = 250 µs
Round trip within same datacenter ...... 500,000 ns  = 0.5 ms
Read 1 MB sequentially from SSD* ..... 1,000,000 ns  =   1 ms
Disk seek ........................... 10,000,000 ns  =  10 ms
Read 1 MB sequentially from disk .... 20,000,000 ns  =  20 ms
Send packet CA->Netherlands->CA .... 150,000,000 ns  = 150 ms

* Assuming ~1 GB/s SSD
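
These numbers are fun to sanity-check. Here is a minimal Python sketch, my own addition rather than part of the original list, that ballparks one entry, the 1 MB sequential read from memory, by timing a 1 MB buffer copy. Interpreter overhead and hardware differences make this an order-of-magnitude check, not a measurement:

    import time

    MB = 1024 * 1024
    buf = bytearray(MB)              # 1 MB of zeroed memory

    # Warm up caches and the allocator before timing.
    for _ in range(10):
        _ = bytes(buf)

    N = 1000
    start = time.perf_counter_ns()
    for _ in range(N):
        _ = bytes(buf)               # copies 1 MB (memcpy under the hood)
    elapsed = time.perf_counter_ns() - start

    print(f"~{elapsed / N / 1000:.0f} us per 1 MB sequential copy")

On recent hardware this tends to land in the tens to hundreds of microseconds, the same ballpark as the table's 250 µs.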

Visual representation of latencies

Visual chart provided by ayshen

Data by Jeff Dean

Originally by Peter Norvig

Let's multiply all these durations by a billion (a small script that performs this conversion is sketched after the list below):

Magnitudes:

Minute:

L1 cache reference                  0.5 s         One heartbeat (0.5 s)
Branch mispredict                   5 s           Yawn
L2 cache reference                  7 s           Long yawn
Mutex lock/unlock                   25 s          Making a coffee

Hour:

Main memory reference               100 s         Brushing your teeth
Compress 1K bytes with Zippy        50 min        One episode of a TV show (including ad breaks)

Day:

Send 2K bytes over 1 Gbps network   5.5 hr        From lunch to end of work day

Week:

SSD random read                     1.7 days      A normal weekend
Read 1 MB sequentially from memory  2.9 days      A long weekend
Round trip within same datacenter   5.8 days      A medium vacation
Read 1 MB sequentially from SSD    11.6 days      Waiting for almost 2 weeks for a delivery

Year:

Disk seek                           16.5 weeks    A semester in university
Read 1 MB sequentially from disk    7.8 months    Almost producing a new human being
The above 2 together                1 year

Decade:

Send packet CA->Netherlands->CA     4.8 years     Average time it takes to complete a bachelor's degree
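
For the curious, the conversion is mechanical: N nanoseconds multiplied by a billion is exactly N seconds. Here is a small Python sketch that regenerates the list above; the humanize helper and its unit cutoffs are my own arbitrary choices, not part of the original gist:

    LATENCIES_NS = {
        "L1 cache reference": 0.5,
        "Branch mispredict": 5,
        "L2 cache reference": 7,
        "Mutex lock/unlock": 25,
        "Main memory reference": 100,
        "Compress 1K bytes with Zippy": 3_000,
        "Send 2K bytes over 1 Gbps network": 20_000,
        "SSD random read": 150_000,
        "Read 1 MB sequentially from memory": 250_000,
        "Round trip within same datacenter": 500_000,
        "Read 1 MB sequentially from SSD": 1_000_000,
        "Disk seek": 10_000_000,
        "Read 1 MB sequentially from disk": 20_000_000,
        "Send packet CA->Netherlands->CA": 150_000_000,
    }

    def humanize(seconds: float) -> str:
        # Pick the largest unit that yields a value >= 1.
        units = [("years", 365 * 24 * 3600), ("weeks", 7 * 24 * 3600),
                 ("days", 24 * 3600), ("hours", 3600), ("minutes", 60)]
        for name, size in units:
            if seconds >= size:
                return f"{seconds / size:.1f} {name}"
        return f"{seconds:.1f} seconds"

    for name, ns in LATENCIES_NS.items():
        # N ns * 1e9 = N s, so the nanosecond value doubles as seconds.
        print(f"{name:<36} {humanize(ns)}")
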
@b1nary

b1nary commented Dec 8, 2014

This is a great collection. I just don't get where or how I am able to make coffee in just 25 s.

@stultus

stultus commented Dec 8, 2014

Agreed, @b1nary. If someone knows how to do that, please share the source code 😄

@jeveloper

It would be a shocker if a devops status page turned into humanized numbers one day (sometime in April).
We should all start working harder to improve our numbers, and enjoy more round trips within the same datacenter 😃

@benibela

benibela commented Dec 8, 2014

Do not forget:

3 ms: Time until a wrongly configured sendmail times out and fails to deliver a mail. Roughly corresponds to mail servers within a ~500 km radius (about a 3 millilightsecond round trip)

6 h: Time to send a mail across those 500 km via RFC 1149 (IP over avian carriers)
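
The 500 km figure works out if the 3 ms budget covers a round trip, since light travels roughly 300 km per millisecond. A one-off check in Python; the arithmetic is mine, not the comment's:

    C_KM_PER_MS = 299_792.458 / 1_000     # speed of light, ~299.8 km per millisecond
    radius_km = 3 * C_KM_PER_MS / 2       # 3 ms of light travel, halved for the round trip
    print(f"~{radius_km:.0f} km radius")  # ~450 km, close to the quoted 500 km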

@caimaoy

caimaoy commented Jan 5, 2015

cool

@hellerbarde (Author)

@stultus @b1nary we have a coffee machine that makes coffee. Ta-Dah! 😄

@GreatmanBill

good, it's cool!

@villadora

cool! great summary

@susingha

susingha commented Oct 9, 2016

this is awesome. Thank you

@marianposaceanu

marianposaceanu commented Oct 9, 2016

hmm:

branch misprediction penalty on Haswell came out to ~1500 ns vs the 5 ns in the gist. That's almost three orders of magnitude of error

EDIT:

I used ticks from Windows (there are 10K ticks in a ms), which doesn't match the units in the gist.

If the Haswell CPU is running at 3.6 GHz, one cycle is about 0.28 ns, so a branch miss of ~15 cycles would be about 4 ns. That seems about right now.
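
Spelling out that corrected arithmetic in a few lines of Python; the ~15-cycle penalty is inferred from the numbers above, not stated explicitly:

    FREQ_HZ = 3.6e9                # Haswell at 3.6 GHz
    cycle_ns = 1e9 / FREQ_HZ       # ~0.28 ns per cycle
    print(cycle_ns)                # 0.2777...
    print(15 * cycle_ns)           # ~4.2 ns for a ~15-cycle branch miss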

@rr-paras-patel

cool..... thank you

@Kevin-Hamilton

Multiplying by a billion stretches the timescales out too much for my taste. So I came up with an alternate list based on multiplying by only 22,000:

L1 cache reference ..................  0.000011 seconds (SR-71 travels 1cm)
Branch mispredict ...................  0.000110 sec (Bullet travels 4cm)
L2 cache reference ..................  0.000154 sec (Boeing 777 travels 4cm)
Mutex lock/unlock ...................  0.00055 sec (Time before you hear a fingersnap made in front of your face [speed of sound across 19cm])
Main memory reference ...............  0.0022 sec (Camera shutter on a sunny day [1/400 - 1/500 shutter speed])
Compress 1K bytes with Zippy ........  0.066 sec (Lightning bolt travels 4km from cloud to ground)
Send 2K bytes over 1 Gbps network ...  0.44 sec (Fastball from pitcher to home plate)
SSD random read .....................  3.3 sec (SR-71 travels 3.1km)
Read 1 MB sequentially from memory ..  5.5 sec (Yawn)
Round trip within same datacenter ... 11.0 sec (A Cheetah runs 200m)
Read 1 MB sequentially from SSD* .... 22.0 sec (Usain Bolt runs 200m)
Disk seek ...........................  3.6 minutes (Brewing coffee in a French Press)
Read 1 MB sequentially from disk ....  7.3 min (A performance of the first movement of Beethoven's 5th Symphony)
Send packet CA->Netherlands->CA ..... 55.0 min (Going for a brisk 5km walk)
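
As a quick sanity check of the 22,000x factor, here is a two-entry spot check in Python; the choice of entries is arbitrary:

    SCALE = 22_000
    for name, ns in [("Disk seek", 10_000_000),
                     ("Send packet CA->Netherlands->CA", 150_000_000)]:
        minutes = ns * SCALE / 1e9 / 60
        print(f"{name}: {minutes:.1f} minutes")  # ~3.7 and 55.0 min, matching the list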

@MartyGentillon

@Kevin-Hamilton There is a reason to stretch it out that much. From a human perspective, it is really hard to do anything in less than a second. As such, the ridiculously long times give you a better idea of what a computer might be able to do during that disk seek, if it weren't waiting for that disk seek.

Because of this, most of the similar pages I have seen use something like 1 second for a clock cycle (so multiply everything there by 3 or 4). It gives a really good sense of machine sympathy.

@mpron

mpron commented Oct 12, 2016

Last year, I came up with this concept for an infographic illustrating these latency numbers with time analogies (if 1 CPU cycle = 1 second). Here was the result (attached, and here's a link: http://imgur.com/8LIwV4C).

@cth027

cth027 commented Nov 19, 2016

Excellent idea! Great page!
Perhaps an interesting comparison:

@MAZHARMIK

Cool. Loved it.

@hhimanshu

very interesting!

@imonti

imonti commented Mar 31, 2017

Excellent Gist.

@LeonZhu1981

great!!!

@YLD10

YLD10 commented Jul 9, 2019

Thanks ^o^

@vinaypuranik

Awesome gist! Thanks

@xenowits

wowww!!


@jiteshk23

These numbers seem old. This page is kept up to date: https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html

@Code2Life

cool!

@eduard93

eduard93 commented Jan 3, 2022

What about register access timings?

@hellerbarde (Author)

hellerbarde commented Jan 6, 2022

@eduard93 I think register access happens within one CPU cycle, which at 2.4 GHz would be 0.417 nanoseconds, very similar to the L1 cache reference. I'm not sure if that's true, because I'm not incredibly familiar with modern CPUs. Feel free to fact check this.

@Yougigun

Yougigun commented Nov 7, 2022

thanks

@zhangchiisgy

nice

@sitansu04

that's cool!
