Latency numbers every programmer should know

L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns             
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs
Read 1 MB sequentially from memory ..... 250,000 ns  = 250 µs
Round trip within same datacenter ...... 500,000 ns  = 0.5 ms
Read 1 MB sequentially from SSD* ..... 1,000,000 ns  =   1 ms
Disk seek ........................... 10,000,000 ns  =  10 ms
Read 1 MB sequentially from disk .... 20,000,000 ns  =  20 ms
Send packet CA->Netherlands->CA .... 150,000,000 ns  = 150 ms

* Assuming ~1GB/sec SSD
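
Some of the entries above follow from plain throughput arithmetic. The sketch below is a rough check rather than the source of the original numbers: it assumes decimal units (1 MB = 1,000,000 bytes, 2K = 2,000 bytes), ignores protocol and device overhead, and uses a hypothetical helper named `transfer_time_ns`.

```python
def transfer_time_ns(num_bytes: int, bytes_per_second: float) -> float:
    """Time to move num_bytes at a sustained rate, in nanoseconds."""
    # Multiply before dividing so the common cases stay exact in floating point.
    return num_bytes * 1e9 / bytes_per_second

# Assumption from the footnote: the SSD sustains ~1 GB/sec.
ssd_1mb_ns = transfer_time_ns(1_000_000, 1e9)

# Assumption: a 1 Gbps link carries 1e9 bits/s, i.e. 1.25e8 bytes/s.
net_2k_ns = transfer_time_ns(2_000, 1e9 / 8)

print(f"1 MB from SSD:   {ssd_1mb_ns:,.0f} ns")
print(f"2K over 1 Gbps:  {net_2k_ns:,.0f} ns")
```

Under these assumptions the SSD read comes out at the 1 ms listed in the table, and the raw wire time for 2K bytes is in the same ballpark as the 20 µs listed. The same helper scales linearly, so a larger sequential read costs proportionally more.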

Visual representation of latencies

Visual chart provided by ayshen

Data by Jeff Dean

Originally by Peter Norvig

Let's multiply all these durations by a billion:

Magnitudes:

Minute:

L1 cache reference                  0.5 s         One heart beat (0.5 s)
Branch mispredict                   5 s           Yawn
L2 cache reference                  7 s           Long yawn
Mutex lock/unlock                   25 s          Making a coffee

Hour:

Main memory reference               100 s         Brushing your teeth
Compress 1K bytes with Zippy        50 min        One episode of a TV show (including ad breaks)

Day:

Send 2K bytes over 1 Gbps network   5.5 hr        From lunch to end of work day

Week:

SSD random read                     1.7 days      A normal weekend
Read 1 MB sequentially from memory  2.9 days      A long weekend
Round trip within same datacenter   5.8 days      A medium vacation
Read 1 MB sequentially from SSD    11.6 days      Waiting for almost 2 weeks for a delivery

Year:

Disk seek                           16.5 weeks    A semester in university
Read 1 MB sequentially from disk    7.8 months    Almost producing a new human being
The above 2 together                1 year

Decade:

Send packet CA->Netherlands->CA     4.8 years     Average time it takes to complete a bachelor's degree
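
The scaled values above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the method used to build the table: the function names `humanize` and `scaled` are hypothetical, a year is assumed to be 365 days, and the unit choices and rounding can differ slightly from the hand-picked values above.

```python
SCALE = 1_000_000_000  # the "multiply by a billion" factor

def humanize(seconds: float) -> str:
    """Render a duration in the largest unit that keeps the value >= 1."""
    units = [("years", 365 * 86400), ("days", 86400), ("hr", 3600), ("min", 60)]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds:g} s"

def scaled(latency_ns: float, scale: int = SCALE) -> str:
    """Scale a latency given in nanoseconds and format it for humans."""
    return humanize(latency_ns * scale / 1e9)

for label, ns in [
    ("L1 cache reference", 0.5),
    ("Compress 1K bytes with Zippy", 3_000),
    ("SSD random read", 150_000),
    ("Send packet CA->Netherlands->CA", 150_000_000),
]:
    print(f"{label:35s} {scaled(ns)}")
```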

toelke commented May 31, 2012

25s is making one coffee. 7.8 months is a tad faster than producing a new human being. One disk-seek (16 weeks) and then reading 1MiB (7.8 months) is one year.

Owner

hellerbarde commented May 31, 2012

thx, will incorporate those!

Smerity commented May 31, 2012

16.5 weeks ~= four months ~= full semester at university ~= summer break ?

drbawb commented May 31, 2012

branch mis-predict: a yawn.

Such an accurate portrayal. "yawwwn aw damn it fell through to the else again."

dleonard0 Jun 1, 2012

4.8 years is very roughly two round trips to Mars. Or a one-way trip to Europa.

mshock commented Jun 1, 2012

neat, thanks!

henrik commented Jun 1, 2012

4.8 years is about how long a bachelor's degree takes on average (4.7 years in 2008 according to this page).

stephan-buckmaster Jun 4, 2012

Should add date of creation to text itself (as of ...)

bartmcleod Jun 26, 2012

If enough people start reading this, the last one will have to wait several years

@colin-scott

Here's a tool to visualize these numbers over time: http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html

miraculixx Dec 27, 2012

Great interactive chart! I suggest adding CPU cycle counts for simple single instructions such as ADD, MOV, MULT (e.g. "multiplying two 16-bit integers").

legrady commented Jan 10, 2013

I'd like to see a duration for function-call overhead. In the late 80s, when I was in school, it was several microseconds, enough that people had concerns about unnecessary function calls. I'm sure that doesn't apply any more, but I don't have real numbers.

milesrout Jun 19, 2014

@legrady it depends on a lot of factors. Is it a virtual function call?

@coolearn

Legend!

AdamBSteele Dec 8, 2014

If reading 1MB from an SSD costs 1ms, what would the cost be to read 10MB sequentially from an SSD?

b1nary commented Dec 8, 2014

This is a great collection. I just don't get where or how I am able to make coffee in just 25s.

stultus commented Dec 8, 2014

Agreed, @b1nary. If someone knows how to do that, please share the source code 😄

jeveloper Dec 8, 2014

It would be a shocker if a devops status page turned into humanized numbers one day (sometime in April).
We should all start working harder to improve our numbers, and enjoy more "Round trip within same datacenter" 😃

benibela commented Dec 8, 2014

Do not forget:

3ms: Time until a wrongly configured sendmail times out and fails to deliver a mail. Roughly corresponds to mail servers in a 500km (3 millilightseconds) radius

6h: Time to send a mail across those 500km via RFC 1149

caimaoy commented Jan 5, 2015

cool

Owner

hellerbarde commented Apr 24, 2015

@stultus @b1nary we have a coffee machine that makes coffee. Ta-Dah! 😄

GreatmanBill Apr 28, 2015

good, it's cool!

villadora Apr 15, 2016

cool! great summary

susingha commented Oct 9, 2016

this is awesome. Thank you

marianposaceanu commented Oct 9, 2016

hmm:

branch misprediction penalty on Haswell ~ 1500 ns vs 5 ns in the gist. That's three orders of magnitude of error

EDIT:

I used Windows ticks (10K per ms), which is the wrong unit for comparing with the gist.

If the Haswell CPU is running at 3.6 GHz, one cycle equals about 0.27 ns, so a branch miss of roughly 15 cycles would be about 4.05 ns. Seems about right now.
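
The unit mix-up described in this comment is easy to check numerically. A minimal sketch, assuming a fixed 3.6 GHz clock and a misprediction penalty of about 15 cycles (the value implied by 4.05 ns / 0.27 ns, not a measured figure); the function and constant names are hypothetical.

```python
def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a cycle count to nanoseconds (at 1 GHz, one cycle is 1 ns)."""
    return cycles / clock_ghz

# 10,000 Windows ticks per millisecond means each tick is 100 ns.
WINDOWS_TICK_NS = 1_000_000 // 10_000

penalty_cycles = 15  # assumed value, see the lead-in
as_cycles_ns = cycles_to_ns(penalty_cycles, 3.6)
as_ticks_ns = penalty_cycles * WINDOWS_TICK_NS

print(f"read as cycles at 3.6 GHz: {as_cycles_ns:.2f} ns")
print(f"read as Windows ticks:     {as_ticks_ns} ns")
```

Reading the same count as ticks instead of cycles inflates the result by a factor of a few hundred, which accounts for the 1500 ns figure.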

PatelParas Oct 11, 2016

cool..... thank you

Kevin-Hamilton Oct 11, 2016

Multiplying by a billion stretches the timescales out too much for my taste. So I came up with an alternate list based on multiplying by only 22,000:

L1 cache reference ..................  0.000011 seconds (SR-71 travels 1cm)
Branch mispredict ...................  0.000110 sec (Bullet travels 4cm)
L2 cache reference ..................  0.000154 sec (Boeing 777 travels 4cm)
Mutex lock/unlock ...................  0.00055 sec (Time before you hear a fingersnap made in front of your face [speed of sound across 19cm])
Main memory reference ...............  0.0022 sec (Camera shutter on a sunny day [1/400 - 1/500 shutter speed])
Compress 1K bytes with Zippy ........  0.066 sec (Lightning bolt travels 4km from cloud to ground)
Send 2K bytes over 1 Gbps network ...  0.44 sec (Fastball from pitcher to home plate)
SSD random read .....................  3.3 sec (SR-71 travels 3.1km)
Read 1 MB sequentially from memory ..  5.5 sec (Yawn)
Round trip within same datacenter ... 11.0 sec (A Cheetah runs 200m)
Read 1 MB sequentially from SSD* .... 22.0 sec (Usain Bolt runs 200m)
Disk seek ...........................  3.7 minutes (Brewing coffee in a French Press)
Read 1 MB sequentially from disk ....  7.3 min (A performance of the first movement of Beethoven's 5th Symphony)
Send packet CA->Netherlands->CA ..... 55.0 min (Going for a brisk 5km walk)

MartyGentillon Oct 12, 2016

@Kevin-Hamilton There is a reason to stretch it out that much. From a human perspective, it is really hard to do anything in less than a second. As such, the ridiculously long times give you a better idea of what a computer might be able to do during that disk seek, if it weren't waiting for that disk seek.

Because of this, most of the similar pages I have seen use something like 1 second for a clock cycle (so multiply everything there by 3 or 4). It gives a really good sense of machine sympathy.
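
The "1 second per clock cycle" convention amounts to taking the scale factor from the clock rate. A rough sketch, assuming a 3 GHz clock (an illustrative figure, not from the thread) and a hypothetical helper name:

```python
def scaled_seconds(latency_ns: float, clock_hz: float) -> float:
    """Scale a latency so that one clock cycle (1/clock_hz s) lasts one second.

    Stretching 1/clock_hz seconds to 1 second is a multiplication by clock_hz.
    """
    return latency_ns * clock_hz / 1e9

CLOCK_HZ = 3e9  # assumed 3 GHz clock, i.e. 3x the gist's factor of a billion

memory_ref_s = scaled_seconds(100, CLOCK_HZ)                # main memory reference
disk_seek_days = scaled_seconds(10_000_000, CLOCK_HZ) / 86400  # disk seek

print(f"main memory reference: {memory_ref_s:.0f} s")
print(f"disk seek:             {disk_seek_days:.0f} days")
```

On this scale a main memory reference takes minutes and a disk seek takes most of a year, which is the point about how much work the machine could do while waiting.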

mpron commented Oct 12, 2016

Last year, I came up with this concept for an infographic illustrating these latency numbers with time analogies (if 1 CPU cycle = 1 second). Here was the result (attached, and here's a link: http://imgur.com/8LIwV4C)

cth027 commented Nov 19, 2016

Excellent idea! Great page!
Perhaps an interesting comparison:

MAZHARMIK Dec 30, 2016

Cool. Loved it.

hhimanshu Jan 22, 2017

very interesting!

imonti commented Mar 31, 2017

Excellent Gist.
