The Linux kernel has a garbage collection mechanism for inflight unix sockets, that is, unix sockets that have been passed over other unix sockets. This mechanism can be used to put excessive load on well-behaved processes that use regular unix sockets without any fd passing, because garbage collection is called in the socket write path.
The attached reproduction code in Go can be used to illustrate this:
$ go build -o /tmp/derp main.go && /tmp/derp
What it does:
- Makes a unix connection to itself in a loop, writing some bytes and closing it every 50ms. This is the legitimate, well-behaved load.
- Makes a unix connection to itself and puts 16.1k unix sockets into it. This is what forces unix_gc to run a lot more for the well-behaved connection that has nothing to do with this.
With fewer than 16k file descriptors inflight, there's some gc, but not much:
$ sudo funclatency-bpfcc -uTi 1 unix_gc
Tracing 1 functions for "unix_gc"... Hit Ctrl-C to end.
22:08:34
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 2        |*****                                   |
      1024 -> 2047       : 12       |**********************************      |
      2048 -> 4095       : 14       |****************************************|
      4096 -> 8191       : 6        |*****************                       |
avg = 2535 usecs, total: 86194 usecs, count: 34
This is triggered from unix_release_sock as long as there are any inflight sockets present at all (no matter how many).
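The two trigger conditions discussed in this report can be modelled roughly as below. This is a simplified Go model for illustration, not kernel code: the function names are invented here, and the exact threshold value of 16000 is an assumption based on the "16k" figure in the text.

```go
package main

import "fmt"

// inflightTriggerGC stands in for the "16k" threshold from the report; the
// exact value is an assumption of this sketch.
const inflightTriggerGC = 16000

// gcOnRelease models the close path: closing any unix socket runs the
// collector as soon as at least one socket is inflight system-wide.
func gcOnRelease(totalInflight int) bool {
	return totalInflight > 0
}

// gcOnSend models the write path: a write on any unix socket runs the
// collector only once the system-wide inflight count exceeds the threshold.
func gcOnSend(totalInflight int) bool {
	return totalInflight > inflightTriggerGC
}

func main() {
	for _, n := range []int{0, 1, 15900, 16100} {
		fmt.Printf("inflight=%-6d gc on close=%-5v gc on write=%v\n",
			n, gcOnRelease(n), gcOnSend(n))
	}
}
```

Neither condition looks at the socket being closed or written to. Both depend only on a global count, which is why an unrelated connection ends up paying for the collection.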
If you cross the threshold for the number of inflight sockets, it gets worse:
$ sudo funclatency-bpfcc -uTi 1 unix_gc
Tracing 1 functions for "unix_gc"... Hit Ctrl-C to end.
22:09:15
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 456      |****************************************|
      1024 -> 2047       : 48       |****                                    |
      2048 -> 4095       : 2        |                                        |
avg = 979 usecs, total: 495498 usecs, count: 506
You can observe a lot more calls to unix_gc and far more time spent in it overall, even though each call is shorter on average.
Most of the calls here are from unix_stream_sendmsg: