We're seeing issues where cAdvisor spends a lot of CPU time.
The following should run on any system with systemd (code is in this gist):
go run main.go
The output looks something like this:
completed: 4.32s [manual / mem-stat + cpu-stat]
completed: 4.25s [manual / cpu-stat + mem-stat]
completed: 0.47s [manual / mem-stat]
completed: 0.04s [manual / cpu-stat]
completed: 5.17s [manual / mem-stat + cpu-stat]
completed: 5.59s [manual / cpu-stat + mem-stat]
completed: 0.52s [manual / mem-stat]
completed: 0.04s [manual / cpu-stat]
The first four runs are a warm-up, and the second four are the meat and potatoes.
In cpu-stat + mem-stat we effectively do the following:
for _ in $(seq 1 1000); do cat /sys/fs/cgroup/system.slice/memory.stat /sys/fs/cgroup/system.slice/cpu.stat > /dev/null; done
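For reference, a minimal Go sketch of what the combined case boils down to (the actual harness is main.go in this gist; the paths, label, and iteration count below just mirror the shell loop above):

package main

import (
	"fmt"
	"os"
	"time"
)

// Read memory.stat and cpu.stat for the same cgroup, interleaved, 1000 times,
// and report how long the whole loop took.
func main() {
	start := time.Now()
	for i := 0; i < 1000; i++ {
		for _, f := range []string{
			"/sys/fs/cgroup/system.slice/memory.stat",
			"/sys/fs/cgroup/system.slice/cpu.stat",
		} {
			if _, err := os.ReadFile(f); err != nil {
				panic(err)
			}
		}
	}
	fmt.Printf("completed: %.2fs [manual / mem-stat + cpu-stat]\n", time.Since(start).Seconds())
}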
In cpu-stat it's just the cpu.stat file:
for _ in $(seq 1 1000); do cat /sys/fs/cgroup/system.slice/cpu.stat > /dev/null; done
In mem-stat it's just the memory.stat file:
for _ in $(seq 1 1000); do cat /sys/fs/cgroup/system.slice/memory.stat > /dev/null; done
The weird thing is that running the latter two loops back to back is much faster than the single combined loop: there's easily a 10x difference in time spent in the kernel between them.
Unfortunately, the former is how monitoring software normally works: all metrics are fetched for one service before proceeding to the next one. One prominent example that prompted this examination is cAdvisor.
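To illustrate that access pattern (this is not cAdvisor's actual code, only a sketch of the ordering it ends up with): for every cgroup, all of its stat files are read back to back before moving on to the next cgroup, i.e. the slow combined case above repeated across services.

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	root := "/sys/fs/cgroup/system.slice"
	entries, err := os.ReadDir(root)
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		cg := filepath.Join(root, e.Name())
		// memory.stat and cpu.stat are read for the same cgroup before we
		// touch the next one, which is the pattern that is slow on busy machines.
		for _, f := range []string{"memory.stat", "cpu.stat"} {
			if _, err := os.ReadFile(filepath.Join(cg, f)); err != nil {
				fmt.Fprintf(os.Stderr, "skipping %s: %v\n", cg, err)
				break
			}
		}
	}
}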
Flamegraphs for a single combined loop (one-loop.svg) and for two loops back to back (two-loops.svg) are attached to this gist.
This doesn't happen on a mostly idle machine, but we do see this on loaded servers (128 busy logical CPUs, lots of memory churn).
Possible fix: bobrik/linux@50b6278