@lloesche
Created November 5, 2015 17:54
When haproxy is reloaded too quickly on EL7 (Red Hat, CentOS), the
system fills up with orphaned haproxy processes.
I encountered this problem on CentOS 7 with
haproxy-1.5.4-4.el7_1.x86_64, but I expect it to exist on all systems
using haproxy-systemd-wrapper, not just those based on Fedora.
Steps to reproduce:
1) haproxy is running normally.
[root@localhost ~]# ps ax | grep haproxy
3140 ? Ss 0:00 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
3141 ? S 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
3142 ? Ss 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
2) Several reloads are executed in quick succession. The problem gets
worse when several processes happen to trigger a reload in parallel.
[root@localhost ~]# while :; do systemctl reload haproxy; done
^C
3) There are now multiple haproxy processes running that will never
exit. As you can see, the same PIDs appear more than once as the -sf
argument. This may be caused by a race between haproxy-systemd-wrapper
reading the pid file and the new haproxy process writing its PID to it.
[root@localhost ~]# ps ax | grep haproxy
423 ? S 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 419
429 ? S 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 425
430 ? Ss 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 419
431 ? Ss 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 425
31833 ? Ss 0:01 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
36593 ? S 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 36587
36600 ? Ss 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 36587
38316 ? S 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 38311
38324 ? Ss 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 38311
38344 ? S 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 38325
38350 ? Ss 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 38325
...
...
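To make the duplicated -sf arguments easier to spot than by reading the ps listing, something like the following could be used. It is only a rough sketch: the function name dup_sf_pids is made up for this example, and it assumes one process per input line with a single PID following -sf.

```shell
# dup_sf_pids (hypothetical helper): reads process command lines on stdin
# and prints "<count> <pid>" for every PID that appears after -sf on more
# than one line.
dup_sf_pids() {
    awk '{
        # Scan the fields of each line for "-sf" and count the PID after it.
        for (i = 1; i < NF; i++)
            if ($i == "-sf") count[$(i + 1)]++
    }
    END {
        # Report only PIDs that were targeted by more than one process.
        for (pid in count)
            if (count[pid] > 1) print count[pid], pid
    }' | sort -k2,2n
}
```

It could be fed with "ps -eo args= | dup_sf_pids". Any output at all means that more than one haproxy process was started to replace the same old PID.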
I believe the problem is a race in haproxy-systemd-wrapper.c line 98,
where it is missing a
} else if (nb_pid > 0) { ... branch that blocks until nb_pid is no
longer found in the pidfile, or something similarly blocking.
Without it, the parent accepts new SIGUSR2/SIGHUP reloads before the
new haproxy process that was spawned in line 96 has written its pid
file.
Also note the following from the systemd.service manpage:
"It is strongly recommended to set ExecReload= to a command that not
only triggers a configuration reload of the daemon, but also
synchronously waits for it to complete."
That's currently not the case.
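The kind of blocking meant here can be illustrated in shell. This is not a patch for the wrapper, which is written in C, just a sketch of the idea: remember the pid file contents, trigger the reload, then wait until the pid file has been rewritten. The function name wait_for_pidfile_change, the 0.1 s poll interval and the default of 50 tries are assumptions for this example, and the fractional sleep relies on GNU coreutils.

```shell
# wait_for_pidfile_change PIDFILE OLD_CONTENT [TRIES]  (hypothetical helper)
# Polls PIDFILE every 0.1 s until it is non-empty and its contents differ
# from OLD_CONTENT. Returns 0 once it changed, 1 if TRIES polls pass first.
wait_for_pidfile_change() {
    pidfile=$1
    old=$2
    tries=${3:-50}
    while [ "$tries" -gt 0 ]; do
        # -s skips the moment where the file exists but is still empty.
        if [ -s "$pidfile" ] && [ "$(cat "$pidfile")" != "$old" ]; then
            return 0
        fi
        sleep 0.1
        tries=$((tries - 1))
    done
    return 1
}
```

A reload command built on it might run old=$(cat /run/haproxy.pid), then kill -USR2 "$MAINPID", then wait_for_pidfile_change /run/haproxy.pid "$old". That would not fix the race inside the wrapper itself, but it would keep systemctl reload from returning before the new process has written its pid file.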