Full source code can be found here
It is well-known that a hard disk has a long delay from that we request the data to that we get the data. Usually we measure the hard disk latency in milliseconds which is an eternity for a CPU. The bandwidth of a hard disk is decent good as SSD:s today can reach 1 GiB/second.
What is less known is that RAM has the same characteristics, bad latency with good bandwidth.
You can measure RAM latency and badndwidth using Intel® Memory Latency Checker. On my machine the RAM latency under semi-high load is ~120 ns (The 3r:1w
bandwidth is 16GiB/second). This means that the CPU on my machine has to wait for ~400 cycles for data, an eternity.