A docker-compose
and tc
tutorial to reproduce container deadlocks.
Docker containers would block on logging operations when the upstream fluentd server(s) experience performance problems and network slowdowns. If the upstream servers are fully unavailable then docker fails to forward logs and moves on as expected.
The fluentd go client has a lock contention issue between appending logs to the intenral buffer and flusing the buffer:
- https://github.com/fluent/fluent-logger-golang/blob/a8dfe4adfeaf7b985acb486f6b060ff2f6a17e91/fluent/fluent.go#L241
- https://github.com/fluent/fluent-logger-golang/blob/a8dfe4adfeaf7b985acb486f6b060ff2f6a17e91/fluent/fluent.go#L312
When the write
called takes long to return the append path is unable to
aquire the lock and the docker daemon is unable to process stdout/stderr
from containers which eventually block on write
s to stdout/stderr.
This gist includes a docker-compose
project and tc
commands that can
be used to validate the issue with logs locking and network slowdowns.
The test was run with the following configuration:
- Docker (17.06.2)
- Docker Compose (1.13.0)
- Linux Kernel (4.12.9)
netem
kernel module andtc
installed.
The docker-compose
project sets up a container generating a lot of logs
connected to a fluentd server in the same network.
Because the logging
section of a docker-compose
file can't resolve the
IP address of another container started by the same compose project the
IP of the fluentd server is hardcoded in the compose file.
It is assumed that the network for the compose project is 172.18.0.0/16
If that is not the case (you will see no logs in fluentd output) update
the IP of the fluentd container in docker-compose.yaml
.
On a host that fits the prerequisites and where the assumtions hold:
- Download all the files in this gist (zip archive is the easiest way).
- Start the logging client and server:
docker-compose up --build
. - Open
http://localhost:5000/
to see a json object as in the example. - Manipulate the network with
tc
to add delays (see below). - Check the size of the TCP send queue with
netstat --tcp | grep 24224
. - As the queue gets full refresh the web page to see slow log requests.
# Shell 1: start the client and server to see the logs.
$ docker-compose up --build
# Hit Ctrl-C to stop.
# Shell 2: check the client works.
$ curl --silent http://localhost:5000 | jq .
{
"duration": 4.792213439941406e-05,
"end": 1505311218.493932,
"message": "Testing stdout/stderr deadlock",
"start": 1505311218.493884
}
# Find the interface for the fluentd server container (see below for details).
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 94:de:80:7d:0e:87 brd ff:ff:ff:ff:ff:ff
inet 10.1.100.9/24 brd 10.1.100.255 scope global dynamic eno1
valid_lft 5574sec preferred_lft 5574sec
inet6 fe80::96de:80ff:fe7d:e87/64 scope link
valid_lft forever preferred_lft forever
3: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether 52:54:00:1a:44:db brd ff:ff:ff:ff:ff:ff
inet 192.168.124.1/24 brd 192.168.124.255 scope global virbr0
valid_lft forever preferred_lft forever
4: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
link/ether 52:54:00:1a:44:db brd ff:ff:ff:ff:ff:ff
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:89:e1:87:30 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 scope global docker0
valid_lft forever preferred_lft forever
inet6 fe80::42:89ff:fee1:8730/64 scope link
valid_lft forever preferred_lft forever
85: br-ef2c5d6b67d5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:72:f1:12:ca brd ff:ff:ff:ff:ff:ff
inet 172.18.0.1/16 scope global br-ef2c5d6b67d5
valid_lft forever preferred_lft forever
inet6 fe80::42:72ff:fef1:12ca/64 scope link
valid_lft forever preferred_lft forever
87: vethfbe9bc4@if86: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-ef2c5d6b67d5 state UP group default
link/ether 02:30:e0:5d:20:44 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet6 fe80::30:e0ff:fe5d:2044/64 scope link
valid_lft forever preferred_lft forever
89: veth2434c1b@if88: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-ef2c5d6b67d5 state UP group default
link/ether 16:04:ae:d6:15:f2 brd ff:ff:ff:ff:ff:ff link-netnsid 1
inet6 fe80::1404:aeff:fed6:15f2/64 scope link
valid_lft forever preferred_lft forever
# Add a 2s delay to the fluentd container and wait for the TCP send queue to fill.
$ sudo tc qdisc add dev vethfbe9bc4 root netem delay 2000ms
$ netstat --tcp | grep 24224
tcp 0 3398456 slait:34414 172.18.0.2:24224 ESTABLISHED
$ curl --silent http://localhost:5000 | jq .
{
"duration": 5.294352054595947,
"end": 1505311293.442004,
"message": "Testing stdout/stderr deadlock",
"start": 1505311288.147652
}
# Increase the delay to 10s.
$ sudo tc qdisc del dev vethfbe9bc4 root netem
$ sudo tc qdisc add dev vethfbe9bc4 root netem delay 10000ms
$ netstat --tcp | grep 24224
tcp 0 3776324 slait:34414 172.18.0.2:24224 ESTABLISHED
$ curl --silent http://localhost:5000 | jq .
{
"duration": 20.779669046401978,
"end": 1505311634.759668,
"message": "Testing stdout/stderr deadlock",
"start": 1505311613.979999
}
Docker compose creates a bridged network for the project plus two network interfaces (one for each container).
Mapping interfaces to containers is not the easiest thing:
- Use
docker network ls
to find the newwork ID. - The short NetID is the name of the bridge device (i.e,
ef2c5d6b67d5 => br-ef2c5d6b67d5
). - The containers interfaces will have the bridge device as the
master
. - Because the fluentd server container is started first the device in the network with the lowest interface index (87 above) is the one you want.
- Verify the interface index with the container itself:
docker-compose exec fluentd cat /sys/class/net/eth0/iflink