
Locking containers with slow fluentd

A docker-compose and tc tutorial to reproduce container deadlocks.

The problem

Docker containers block on logging operations when the upstream fluentd server(s) experience performance problems or network slowdowns. If the upstream servers are fully unavailable, docker fails to forward logs and moves on as expected.

The (suspected) reason

The fluentd go client has a lock contention issue between appending logs to the internal buffer and flushing that buffer:

When the write call takes a long time to return, the append path is unable to acquire the lock and the docker daemon is unable to process stdout/stderr from containers, which eventually block on writes to stdout/stderr.
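
As an illustration of that pattern (a toy Python model only, not the actual fluent logger-golang code), a single lock guarding both the append path and the flush path means a write stalled on the network holds up every subsequent append:

# Toy model of the suspected contention (illustration only): one lock
# serialises appends and flushes, so a slow socket write makes every appender
# wait behind it. In docker's case the stalled appender is the daemon reading
# the container's stdout/stderr.
import threading
import time

class ToyFluentLogger(object):
    def __init__(self):
        self.lock = threading.Lock()
        self.buffer = []

    def append(self, record):
        # Blocks while flush() holds the lock during a slow write.
        with self.lock:
            self.buffer.append(record)

    def flush(self):
        with self.lock:
            time.sleep(2)  # stand-in for a send() delayed by netem
            del self.buffer[:]

logger = ToyFluentLogger()
threading.Thread(target=logger.flush).start()
time.sleep(0.1)            # let the flush grab the lock first
start = time.time()
logger.append('hello')     # waits ~2s for flush() to release the lock
print('append took %.1fs' % (time.time() - start))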

Validating the reason

This gist includes a docker-compose project and tc commands that can be used to validate the log-locking issue under network slowdowns.

Prerequisites

The test was run with the following configuration:

  • Docker (17.06.2)
  • Docker Compose (1.13.0)
  • Linux Kernel (4.12.9)
  • netem kernel module and tc installed.

Assumptions

The docker-compose project sets up a container that generates a lot of logs, connected to a fluentd server on the same network.

Because the logging section of a docker-compose file can't resolve the IP address of another container started by the same compose project, the IP of the fluentd server is hardcoded in the compose file.

It is assumed that the network for the compose project is 172.18.0.0/16. If that is not the case (you will see no logs in the fluentd output), update the IP of the fluentd container in docker-compose.yaml.
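
If in doubt, the subnet and container addresses can be read back from docker. A minimal sketch (the network name is a placeholder; compose names it <project>_default, check docker network ls):

# Sketch: print the compose network subnet and each attached container's
# address, so the hardcoded fluentd-address in docker-compose.yaml can be
# adjusted if needed. NETWORK is a placeholder name.
import json
import subprocess

NETWORK = 'myproject_default'  # adjust to the name shown by `docker network ls`

net = json.loads(subprocess.check_output(['docker', 'network', 'inspect', NETWORK]))[0]
for config in net['IPAM']['Config']:
    print('subnet:', config.get('Subnet'))
for info in net['Containers'].values():
    print(info['Name'], info['IPv4Address'])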

Usage

On a host that fits the prerequisites and where the assumptions hold:

  • Download all the files in this gist (zip archive is the easiest way).
  • Start the logging client and server: docker-compose up --build.
  • Open http://localhost:5000/ to see a JSON object like the one in the example below.
  • Manipulate the network with tc to add delays (see below).
  • Check the size of the TCP send queue with netstat --tcp | grep 24224.
  • As the queue fills up, refresh the web page to see log requests slow down (or use the polling sketch below).
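
To watch the effect without refreshing by hand, a small polling loop works too (a sketch, assuming the 5000:5000 port mapping from docker-compose.yaml and Python 3 on the host):

# Sketch: poll the test endpoint once a second and print the server-reported
# duration, which grows as the fluentd link is delayed with tc.
import json
import time
import urllib.request

while True:
    with urllib.request.urlopen('http://localhost:5000/') as resp:
        body = json.loads(resp.read().decode())
    print('duration: %.3fs' % body['duration'])
    time.sleep(1)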

Example

# Shell 1: start the client and server to see the logs.
$ docker-compose up --build
# Hit Ctrl-C to stop.
# Shell 2: check the client works.
$ curl --silent http://localhost:5000 | jq .
{
  "duration": 4.792213439941406e-05,
  "end": 1505311218.493932,
  "message": "Testing stdout/stderr deadlock",
  "start": 1505311218.493884
}

# Find the interface for the fluentd server container (see below for details).
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 94:de:80:7d:0e:87 brd ff:ff:ff:ff:ff:ff
    inet 10.1.100.9/24 brd 10.1.100.255 scope global dynamic eno1
       valid_lft 5574sec preferred_lft 5574sec
    inet6 fe80::96de:80ff:fe7d:e87/64 scope link 
       valid_lft forever preferred_lft forever
3: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:1a:44:db brd ff:ff:ff:ff:ff:ff
    inet 192.168.124.1/24 brd 192.168.124.255 scope global virbr0
       valid_lft forever preferred_lft forever
4: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:1a:44:db brd ff:ff:ff:ff:ff:ff
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:89:e1:87:30 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:89ff:fee1:8730/64 scope link 
       valid_lft forever preferred_lft forever
85: br-ef2c5d6b67d5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:72:f1:12:ca brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/16 scope global br-ef2c5d6b67d5
       valid_lft forever preferred_lft forever
    inet6 fe80::42:72ff:fef1:12ca/64 scope link 
       valid_lft forever preferred_lft forever
87: vethfbe9bc4@if86: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-ef2c5d6b67d5 state UP group default 
    link/ether 02:30:e0:5d:20:44 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::30:e0ff:fe5d:2044/64 scope link 
       valid_lft forever preferred_lft forever
89: veth2434c1b@if88: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-ef2c5d6b67d5 state UP group default 
    link/ether 16:04:ae:d6:15:f2 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::1404:aeff:fed6:15f2/64 scope link 
       valid_lft forever preferred_lft forever

# Add a 2s delay to the fluentd container and wait for the TCP send queue to fill.
$ sudo tc qdisc add dev vethfbe9bc4 root netem delay 2000ms
$ netstat --tcp | grep 24224
tcp        0 3398456 slait:34414             172.18.0.2:24224        ESTABLISHED

$ curl --silent http://localhost:5000 | jq .
{
  "duration": 5.294352054595947,
  "end": 1505311293.442004,
  "message": "Testing stdout/stderr deadlock",
  "start": 1505311288.147652
}

# Increase the delay to 10s.
$ sudo tc qdisc del dev vethfbe9bc4 root netem
$ sudo tc qdisc add dev vethfbe9bc4 root netem delay 10000ms
$ netstat --tcp | grep 24224
tcp        0 3776324 slait:34414             172.18.0.2:24224        ESTABLISHED

$ curl --silent http://localhost:5000 | jq .
{
  "duration": 20.779669046401978,
  "end": 1505311634.759668,
  "message": "Testing stdout/stderr deadlock",
  "start": 1505311613.979999
}

Find the device for a container

Docker compose creates a bridged network for the project plus two network interfaces (one for each container).

Mapping interfaces to containers is not the easiest thing (a scripted version of these steps follows the list):

  1. Use docker network ls to find the network ID.
  2. The short network ID is the name of the bridge device (e.g. ef2c5d6b67d5 => br-ef2c5d6b67d5).
  3. The containers' interfaces will have the bridge device as their master.
  4. Because the fluentd server container is started first, the device in the network with the lowest interface index (87 above) is the one you want.
  5. Verify the interface index with the container itself: docker-compose exec fluentd cat /sys/class/net/eth0/iflink.
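
The lookup in step 5 can be scripted; a rough sketch (assumes the service is named fluentd, as in docker-compose.yaml, and Python 3 on the host):

# Sketch: ask the fluentd container for the host-side ifindex of its eth0
# (/sys/class/net/eth0/iflink is the peer veth's index, 87 in the example
# above), then look that index up in `ip -o link` on the host.
import subprocess

iflink = subprocess.check_output(
    ['docker-compose', 'exec', '-T', 'fluentd',
     'cat', '/sys/class/net/eth0/iflink']).decode().strip()

for line in subprocess.check_output(['ip', '-o', 'link']).decode().splitlines():
    index, name = line.split(':', 2)[:2]
    if index.strip() == iflink:
        print('fluentd container device:', name.strip().split('@')[0])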


docker-compose.yaml

version: '3'
services:
  fluentd:
    image: fluent/fluentd:v0.12.40
    volumes:
      - './fluent.conf:/fluentd/etc/fluent.conf'
  output-gen:
    build: .
    depends_on:
      - fluentd
    logging:
      driver: fluentd
      options:
        fluentd-address: '172.18.0.2:24224'
        fluentd-async-connect: 'true'
    ports:
      - '5000:5000'
Dockerfile

FROM debian:jessie
RUN apt-get update \
    && apt-get install -y python-pip \
    && pip install --upgrade pip flask
COPY ./server.py /server.py
EXPOSE 5000
ENV FLASK_APP=server.py
CMD ["flask", "run", "--host=0.0.0.0"]
fluent.conf

<source>
  @type forward
  @id input1
  @label @mainstream
  port 24224
</source>

<filter **>
  @type stdout
</filter>

<label @mainstream>
  <match **>
    @type stdout
    @id output1
  </match>
</label>
server.py

import logging
from time import sleep
from timeit import default_timer
from threading import Thread

from flask import Flask
from flask import jsonify


app = Flask(__name__)
log = logging.getLogger('test.server')
logging.basicConfig(
    level=logging.INFO,
    format='[%(asctime)s]:%(name)s:%(levelname)s:%(message)s'
)


@app.route('/')
def index():
    start = default_timer()
    log.warn('This should block for a while')
    end = default_timer()
    duration = end - start
    return jsonify(
        message='Testing stdout/stderr deadlock',
        start=start,
        end=end,
        duration=duration
    )


class SpammerThread(Thread):
    def __init__(self, *args, **kwargs):
        super(SpammerThread, self).__init__(*args, **kwargs)
        self.daemon = True

    def run(self):
        while True:
            log.info('Clogging the logs')
            # Slow emit rate for playing around.
            #sleep(0.001)


def init_clogger():
    thread = SpammerThread()
    thread.start()


init_clogger()