Created
November 5, 2015 17:54
When haproxy is reloaded too quickly on EL7 (RedHat, CentOS), the system
fills up with orphaned processes.

I encountered this problem on CentOS 7 with
haproxy-1.5.4-4.el7_1.x86_64, but I expect it to exist on all systems
using haproxy-systemd-wrapper, not just those based on Fedora.

Steps to reproduce:

1) haproxy is running normally.
[root@localhost ~]# ps ax | grep haproxy
 3140 ?  Ss  0:00 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
 3141 ?  S   0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
 3142 ?  Ss  0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds

2) Several reloads are executed in quick succession. The problem gets
worse when several processes happen to trigger a reload in parallel.

[root@localhost ~]# while :; do systemctl reload haproxy; done
^C
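
To provoke the parallel case mentioned in step 2, something like the
following sketch could be used. The helper name run_parallel is made up
for illustration and is not part of haproxy or systemd; it only starts
the given command several times in the background and waits for all of
the copies to finish.

```shell
#!/bin/sh
# Hypothetical helper: run a command $1 times concurrently, then wait
# for every background copy to finish.
run_parallel() {
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" &
        i=$((i + 1))
    done
    wait
}

# Example (needs root and a running haproxy unit):
#   run_parallel 8 systemctl reload haproxy
```

The number of parallel reloads is arbitrary; I have not measured how many
are needed to trigger the problem reliably.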

3) There are now multiple haproxy processes running that will never
exit. As you can see, the same PIDs show up more than once as the -sf
argument. This may be caused by a race between haproxy-systemd-wrapper
reading the pidfile and the new haproxy process writing its PID to it.

[root@localhost ~]# ps ax | grep haproxy
  423 ?  S   0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 419
  429 ?  S   0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 425
  430 ?  Ss  0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 419
  431 ?  Ss  0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 425
31833 ?  Ss  0:01 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
36593 ?  S   0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 36587
36600 ?  Ss  0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 36587
38316 ?  S   0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 38311
38324 ?  Ss  0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 38311
38344 ?  S   0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 38325
38350 ?  Ss  0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 38325
...
...
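
A listing like the one above is easier to read when grouped by -sf
target. The following is only a rough sketch: count_sf_targets is a name
I made up, and it assumes each command line carries a single PID after
-sf, as in the output shown here (haproxy accepts several).

```shell
#!/bin/sh
# Hypothetical helper: read `ps ax` style lines on stdin and print, for
# each PID used as an -sf target, "<target> <number of processes naming it>".
# Assumes only one PID follows -sf on each command line.
count_sf_targets() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "-sf") count[$(i + 1)]++
    }
    END { for (pid in count) print pid, count[pid] }' | sort -n
}

# Example:
#   ps ax | grep '[h]aproxy' | count_sf_targets
```

The wrapper itself has no -sf argument, so it is not counted.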

I believe the problem is a race in haproxy-systemd-wrapper.c at line 98,
where it is missing a

    } else if (nb_pid > 0) { ... }

branch that blocks until the old PIDs are no longer found in the
pidfile, or something similarly blocking.
Otherwise the parent will accept new SIGUSR2/SIGHUP reloads before the
new haproxy process that was spawned in line 96 has written its
pidfile.

Also note the following from the systemd.service manpage:

"It is strongly recommended to set ExecReload= to a command that not
only triggers a configuration reload of the daemon, but also
synchronously waits for it to complete."

That's currently not the case.
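
As a rough illustration of what a synchronous reload could look like
from the outside, here is a sketch of a helper script. It is not how
haproxy or the wrapper work today. The function names are made up, and
it assumes that the reload is triggered by sending SIGUSR2 to the
wrapper and that the pidfile holds one PID per line.

```shell
#!/bin/sh
# Hypothetical helper: return 0 once the pidfile $1 is non-empty and lists
# none of the PIDs in $2 (whitespace separated); return 1 if that has not
# happened after roughly $3 seconds (default 10).
wait_until_pids_gone() {
    pidfile=$1
    old_pids=$2
    timeout=${3:-10}
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        still_listed=0
        for pid in $old_pids; do
            # -x matches whole lines only, so 41 does not match 419
            if grep -qxF "$pid" "$pidfile" 2>/dev/null; then
                still_listed=1
            fi
        done
        if [ "$still_listed" -eq 0 ] && [ -s "$pidfile" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Hypothetical synchronous reload: remember the current PIDs, signal the
# wrapper (assumed to reload on SIGUSR2), then wait for the pidfile to change.
reload_and_wait() {
    wrapper_pid=$1
    pidfile=$2
    old_pids=$(cat "$pidfile")
    kill -USR2 "$wrapper_pid" || return 1
    wait_until_pids_gone "$pidfile" "$old_pids"
}

# Example:
#   reload_and_wait "$MAINPID" /run/haproxy.pid
```

This only narrows the window from the caller's side. Two callers running
it at the same time can still race, so the proper fix probably belongs in
the wrapper itself.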