Do you use syscall.Mmap in Go? Chances are high that the answer is yes, even if you are unaware of it. Your app's direct or indirect dependencies may use syscall.Mmap because of a widespread belief: "mmap is faster than plain old file I/O". Let's try to understand whether this is true.
mmap is a system call for mapping file contents into memory. After the mapping, you can read and/or write file contents by just accessing the memory region returned by the syscall. Convenient, isn't it? There is no need for heavy system calls for reading and/or writing the file contents. Absolute victory? No!
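
For illustration, here is a minimal sketch of mapping a file with syscall.Mmap and reading it through the returned byte slice. It assumes a Unix-like OS; the helper names mmapFile and demo and the temporary file are hypothetical and exist only for this example.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// mmapFile maps the whole file at path into memory for reading.
// The caller must release the mapping with syscall.Munmap.
func mmapFile(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	// The mapping stays valid after the file descriptor is closed.
	defer f.Close()

	fi, err := f.Stat()
	if err != nil {
		return nil, err
	}
	if fi.Size() == 0 {
		return nil, fmt.Errorf("cannot mmap empty file %q", path)
	}
	return syscall.Mmap(int(f.Fd()), 0, int(fi.Size()), syscall.PROT_READ, syscall.MAP_SHARED)
}

// demo writes a small temporary file (hypothetical test data), maps it
// and returns a copy of the mapped bytes as a string.
func demo() (string, error) {
	f, err := os.CreateTemp("", "mmap-demo-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	if _, err := f.WriteString("hello, mmap"); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}

	data, err := mmapFile(f.Name())
	if err != nil {
		return "", err
	}
	defer syscall.Munmap(data)

	// Reading data is a plain memory access: no read syscall is issued here.
	// The string conversion copies the bytes before the mapping is released.
	return string(data), nil
}

func main() {
	s, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(s)
}
```

Note that nothing in the line that reads data tells the program or the Go runtime whether the bytes are already in memory.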
What happens when a program accesses a memory address inside the region returned by mmap? There are two cases:
- The given memory address points to hot data already present in memory. Such memory is known as the page cache. In this case the access may indeed be faster than the access via read / write syscalls.
- The given memory address points to cold data missing from the page cache. In this case the operating system (OS) intercepts the memory access via a major page fault, loads the requested data from the mmap'ed file into the page cache and then returns control to the program. All this magic is invisible to the program: it just accesses data at the specified memory location as usual. But it comes at a very high price: cold data access requires 100000x more time than hot data access. Why? See Latency Numbers Every Programmer Should Know.
I hear your voices: "read / write syscalls have almost the same price for cold data access as major page faults for an mmap'ed file; the implicit memory interception is just substituted by an explicit system call". Yes. But let's take a closer look at the Go runtime.
The Go runtime runs goroutines on OS threads. GOMAXPROCS goroutines can be executed simultaneously by OS threads. Other ready goroutines wait for their turn until the currently running goroutines block, yield or get stuck in a cgo call / syscall. Goroutines may block on I/O, a channel or a mutex. Goroutines may yield on a function call, a memory allocation or an explicit runtime.Gosched call. Goroutines don't block on a major page fault! Again: goroutines don't block and don't yield on a major page fault, since it is invisible to the Go runtime.
What happens when a goroutine accesses cold data in an mmap'ed file? It gets stuck for a looooong time. During this time it occupies one of the GOMAXPROCS OS threads, so other ready goroutines have fewer threads to run on. This leads to CPU underutilization. What happens if GOMAXPROCS goroutines concurrently access cold data regions in an mmap'ed file? A complete stall of the whole program until the OS resolves the major page faults caused by these goroutines!
To detect such stalls, monitor request latencies and CPU usage:
- latencies for ALL the requests usually increase during stalls;
- user CPU share drops, since the program performs less work during stalls;
- system and iowait CPU shares increase because the OS handles major page faults.
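
In addition to the signals above, the number of major page faults can be read from inside the process. The following is a rough sketch, assuming Linux or another Unix-like OS where syscall.Getrusage reports the Majflt counter; the helpers majorFaults and faultsDuring are hypothetical names for this example.

```go
package main

import (
	"fmt"
	"syscall"
)

// majorFaults returns the number of major page faults incurred
// by the whole process so far, as reported by getrusage.
func majorFaults() (int64, error) {
	var ru syscall.Rusage
	if err := syscall.Getrusage(syscall.RUSAGE_SELF, &ru); err != nil {
		return 0, err
	}
	return int64(ru.Majflt), nil
}

// faultsDuring reports how many major page faults the process incurred
// while fn was running. The counter is process-wide, so faults caused
// by other goroutines during that time are included as well.
func faultsDuring(fn func()) (int64, error) {
	before, err := majorFaults()
	if err != nil {
		return 0, err
	}
	fn()
	after, err := majorFaults()
	if err != nil {
		return 0, err
	}
	return after - before, nil
}

func main() {
	n, err := majorFaults()
	if err != nil {
		fmt.Println("getrusage failed:", err)
		return
	}
	fmt.Println("major page faults so far:", n)
}
```

Exporting this counter as a metric and comparing it against latency spikes is one possible way to confirm that stalls coincide with major page faults.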
There are a few ways to deal with such stalls:
- Increase GOMAXPROCS to N x NumCPU. This reduces the chances of CPU underutilization during major page faults at the cost of higher CPU usage, since each CPU core now deals with multiple OS threads.
- Access mmap'ed data only via cgo calls. Go launches an additional OS thread for each goroutine stuck inside a cgo call. This prevents CPU underutilization at the cost of higher CPU usage, since cgo calls are expensive.
- Do not use mmap in Go programs. This solution has no drawbacks.
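
As a sketch of the last option, a region of a file can be read with os.File.ReadAt, which issues an explicit read syscall instead of touching mapped memory. Because the runtime sees the syscall, it can run other goroutines while this one waits for the disk. The names readRange and demo and the temporary file are hypothetical and used only for this example.

```go
package main

import (
	"fmt"
	"os"
)

// readRange reads exactly n bytes starting at offset off.
// ReadAt returns an error if fewer than n bytes are available.
func readRange(f *os.File, off int64, n int) ([]byte, error) {
	buf := make([]byte, n)
	if _, err := f.ReadAt(buf, off); err != nil {
		return nil, err
	}
	return buf, nil
}

// demo writes a small temporary file (hypothetical test data)
// and reads 4 bytes from the middle of it.
func demo() (string, error) {
	f, err := os.CreateTemp("", "readat-demo-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if _, err := f.WriteString("0123456789"); err != nil {
		return "", err
	}
	b, err := readRange(f, 3, 4)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(s)
}
```

The trade-off is an extra copy into buf and a syscall per read, in exchange for a blocked goroutine that no longer holds on to one of the GOMAXPROCS execution slots unnoticed.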
Go programs that use mmap may work without issues as long as the accessed data from the mmap'ed file fits in the page cache. Page cache size is limited by RAM size, so these programs are likely to experience stalls when mmap'ed files exceed RAM size. Program stalls may go unnoticed under low load and on faster storage (SSDs instead of HDDs or network storage).
The program may experience stalls even if the mmap'ed file is smaller than the RAM size in the following cases:
- On the first access to mmap'ed data if it isn't present in the page cache yet. Such stalls usually occur during program warmup. Note that stalls induced by major page faults increase latencies for ALL the requests, including requests that don't touch mmap'ed data.
- On the first access to mmap'ed data after its eviction from the page cache. The eviction may be caused by a third-party app running on the same OS. For instance, an innocent grep over a big log file quickly evicts useful data from the page cache.
Avoid mmap inside Go programs, since it may cause stalls.
Read how we were creating the best remote storage for Prometheus. It is written in Go and it doesn't use mmap :)