@sts
Last active November 2, 2021 15:06
Percona XtraDB Cluster clustercheck with systemd
# /etc/systemd/system/clustercheck.socket
[Unit]
Description=MySQL Clustercheck Socket
[Socket]
ListenStream=9200
Accept=true
[Install]
WantedBy=sockets.target

# /etc/systemd/system/clustercheck@.service
[Unit]
Description=MySQL Clustercheck
After=network.target
[Service]
User=nobody
ExecStart=-/usr/bin/clustercheck clustercheck AoWdTTItrUqNo9aNY
StandardInput=socket

# /etc/rsyslog.d/mysql_clustercheck.conf
#
# systemd log fixup - no, we don't want to log the systemd messages for
# MySQL Clustercheck every time haproxy connects (every 500ms per LB in our case)
if $programname == 'systemd' and $msg contains 'Starting MySQL Clustercheck' then stop
if $programname == 'systemd' and $msg contains 'Started MySQL Clustercheck' then stop
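For a sense of scale, the polling rate in the comment above can be turned into a rough daily line count. This is only a back-of-the-envelope sketch: the helper name `daily_log_lines`, the number of load balancers, and the figure of two systemd messages per connection (one per filter rule above) are assumptions, not measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_log_lines(interval_ms: int, lbs: int, lines_per_check: int = 2) -> int:
    """Estimate systemd log lines per day produced by health checks.

    interval_ms:     polling interval of each load balancer
    lbs:             number of load balancers polling this node (assumed)
    lines_per_check: systemd messages per connection (assumed: Starting + Started)
    """
    checks_per_day = SECONDS_PER_DAY * 1000 // interval_ms
    return checks_per_day * lbs * lines_per_check


print(daily_log_lines(500, lbs=1))  # 345600
print(daily_log_lines(500, lbs=2))  # 691200
```

Even with a single load balancer this is several hundred thousand lines per day, which is why the two rules drop the messages before they reach disk.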

Systemd Clustercheck

Clustercheck is a script that lets a proxy (e.g. HAProxy) monitor Percona XtraDB Cluster nodes properly. This config runs clustercheck/mysqlchk from systemd instead of xinetd.
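The proxy only sees an HTTP status code, so the core of the check is a mapping from Galera node state to that code. The sketch below illustrates the idea, assuming the stock script's behaviour of answering 200 when `wsrep_local_state` is 4 (Synced) and 503 otherwise, with an optional "available when donor" mode. The function name `http_status_for` is hypothetical and this is not the script's actual code.

```python
# Galera wsrep_local_state values
WSREP_DONOR = 2   # Donor/Desynced
WSREP_SYNCED = 4  # Synced


def http_status_for(wsrep_local_state: int, available_when_donor: bool = False) -> int:
    """Map a node state to the HTTP status a load balancer would see.

    Assumption: only Synced nodes (and optionally donors) count as healthy.
    """
    if wsrep_local_state == WSREP_SYNCED:
        return 200
    if available_when_donor and wsrep_local_state == WSREP_DONOR:
        return 200
    return 503
```

Because the decision depends on cluster state rather than on whether mysqld is running, knowing the process is alive is not enough for this kind of health check.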

INSTALL

cp clustercheck.socket /etc/systemd/system
cp clustercheck@.service /etc/systemd/system
systemctl enable clustercheck.socket
systemctl start clustercheck.socket
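Once the socket is started, it can be probed the same way the load balancer would. The following is a minimal sketch, assuming the node answers with a plain HTTP response on port 9200 as configured above. The function names `probe` and `parse_status` and the `OPTIONS` request line are illustrative choices, not part of the gist.

```python
import socket


def parse_status(raw: bytes) -> int:
    """Extract the numeric status code from a raw HTTP response."""
    status_line = raw.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = status_line.split()
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1])


def probe(host: str, port: int = 9200, timeout: float = 2.0) -> int:
    """Connect like a load balancer health check and return the HTTP status."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"OPTIONS / HTTP/1.0\r\n\r\n")
        while True:
            try:
                data = sock.recv(4096)
            except ConnectionResetError:
                # The per-connection script may exit without reading the request.
                break
            if not data:
                break
            chunks.append(data)
    return parse_status(b"".join(chunks))
```

Calling `probe("127.0.0.1")` on a cluster node should then return the status code for that node, provided the check script can log in to MySQL.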
sts commented Aug 14, 2015

Don't do this at home kids!

09:38 < sts> hello folks. I'm getting thousands of these log entries per day, does anyone know how to turn it off? https://gist.github.com/sts/6a22221a9b86e296c3ef
09:39 < twb> sts: do you want to stop just the logging or also stop the cluster checking
09:39 < sts> twb: the logging
09:39 < twb> sts: I don't know if you can do that
09:39 < ohsix> sts: disable the timer that is doing that
09:40 < twb> ohsix: he wants it to still run the check, just not write down that it started & stopped
09:40 < ohsix> sts: the log entries don't take up that much space or anything, having thousands a day isn't bad per se
09:41 < twb> sts: I deal with that kind of thing by teaching my log reading stuff to ignore whitelisted messages
09:42 < sts> ohsix: about 140MB/day
09:42 < twb> haha
09:43 < ohsix> nice
09:44 < twb> I don't have a good answer for that, because it's systemd logging it.
09:44 < sts> So, I'd rather say that in regards to using systemd as xinetd replacement, that is a bug?
09:45 < twb> Ah it's not a timer, it's starting one per socket connection?
09:45 < ohsix> what is related to that unit, is it a socket? if it is happening all the time it's probably a timer, and you can just disable the timer
09:45 < sts> twb: yes
09:45 < sts> No, it's my load balancer checking the mysql connection every 500ms.
09:45 < ohsix> and you're starting and stopping mysql every 500ms?
09:45 < twb> The way I dealt with that was to say anything getting that many connections needed to be its own daemon
09:46 < sts> There is a clustercheck.socket, which starts a script via clustercheck@.service.
09:46 < twb> sts: please pastebin both of those
09:46 < sts> ohsix: just checking connectivity
09:46 < ohsix> why doesn't the service keep running, that's essentially what sockets are for, for transparent handoff, the services can exit on their own but they generally do that 'later', when they're idle or some other criteria
09:46 < twb> Ah, it's availability monitoring
09:48 < sts> https://gist.github.com/sts/6a17d4af6f905cbc28a6
09:48 < sts> twb: yes
09:48 < twb> ohsix: he said he's doing it as an xinetd replacement, so it spawns once per connection, handles it, then hangs up. As opposed to e.g. an apache worker thread which will handle say 100 requests, then hang up
09:48 < sts> ohsix: the service was just a simple script, which was always run by xinetd.
09:49 < twb> sts: I think the short answer is: don't do it that way
09:50 < ohsix> yea i think even with xinetd that was kind of a poor way to go, though you could suppress some logging more easily
09:50 < twb> Or reduce the polling interval, or build heavier logging infrastructure that can deal with being spammed that much
09:50 < sts> twb: well, then using systemd as xinetd replacement is broken...
09:50 < twb> sts: yes
09:50 < ohsix> and no!
09:50 < sts> ohsix: it is
09:50 < ohsix> doing what you did with xinetd was probably not a good idea either
09:50 < twb> sts: the systemd people will advocate you change your script a bit, so that this doesn't happen
09:51 < ohsix> you know xinetd is a bit of a hack that lets you handle requests on demand and have someone else own the socket right
09:51 < ohsix> like a service multiplexer
09:51 < ohsix> well f sophistry, if this happened once every 20 minutes it wouldn't matter
09:52 < ohsix> if it happens every 500ms it matters a lot, if it happened every 1ms it'd matter a lot more
09:52 < ohsix> there's the same sort of problem with xinetd, and the answer is 'don't do that'
09:52 < ohsix> systemd is definitely not an xinetd replacement if you consider the ability to easily DoS xinetd as a feature
09:53 < sts> xinetd wasn't logging, so you can turn it as you want, either using systemd as replacement is not supported, or it's broken.
09:53 < ohsix> or having no control over the behaviour of the service it starts aside from mainly killing the one pid
09:54 < ohsix> one thing that was kind of nice before there were cgroups and everything, xinetd was the one that held any privileges needed for the resources, and the programs it launched never had to have them, only to drop them later
09:55 < ohsix> what behaviour do you expect when the unit related to the socket actually fails?
09:55 < twb> sts: do you actually need to poll twice a second?
09:55 < ohsix> because that's going to be pretty different from what xinetd does too
09:56 < ohsix> i think what's at issue is 'can', in whatever or whomever told you it was able to act as an inetd replacement
09:57 < ohsix> you still need a lot of discretion to choose when to use inetd for something, like you wouldn't for ssh, generally
09:57 < sts> twb: yes, otherwise the application fails over too slow.
09:57 < ohsix> honestly people just thought it was cool to implement an echo service with 'cat'
09:58 < ohsix> lets address the elephant in the room
09:59 < ohsix> your load balancer is requesting the status of something systemd will know is actually running or not
09:59 < ohsix> to a useful degree
09:59 < twb> TBH I'm surprised mysql doesn't already have a turnkey solution for this
09:59 < ohsix> you'll still have to check that it is actually serving the resource you need, but systemd knows about the process state
10:00 < ohsix> oh, there's that too
10:00 < twb> Like "apt-get install mysql-cluster-node" or something
10:00 < ohsix> mysql doesn't really try to do anything great vis a vis systemd
10:00 < ohsix> i think you posted that link a few days ago, systemd support is basically similar to how they supported sysv
10:00 < twb> https://en.wikipedia.org/wiki/MySQL_Cluster
10:00 < ohsix> 'support'
10:00 < richard_maw> "We implemented Type=forking! We really know systemd guys!"
10:01 < ohsix> hey i keep mentioning it because i think they deserve the benefit of doubt, i think i agree with their concept of 'support'
10:01 < twb> ohsix: er, to be fair, they seem to have fixing that as a goal for the next release
10:01 < ohsix> there's no parity for what they could potentially do on systemd hosts that they can do on others, they just need to get it started
10:01 < sts> ohsix: its not about whether the process is running, its checking variables of a multi-master mysql layer (galera).
10:02 < richard_maw> yeah, props to mysql for making it start up better, but it's not really embracing what systemd could provide
10:02 < ohsix> sts: yes, the point is that it is extra information, you can be told exactly when it happens for one failover scenario without polling
10:02 < ohsix> you should do all the things
10:02 < twb> sts: to be honest if I was you I'd probably say "fuck it" and just keep using xinetd for this because the pubs are open and xinetd is near enough
10:03 < ohsix> it's a contrived reason to say that even if 500ms failover is fast enough, it still isn't anything good compared to what you can do now with !xinetd and some newer reliable information
10:03 < ohsix> yea if it is just an issue of the messages, use xinetd or whatever you were using before
10:04 < ohsix> i know there's stuff xinetd does that systemd (.socket/.service) doesn't but i'm scared to go read about ancient history, and the computer in my closet, it is scary old
10:04 < twb> You could make it way sexier, but you probably have more critical work to do
10:05 < ohsix> if you want to hang a flag on systemd not being an xinetd replacement, be sure to tell the guy that told you it was how you disagree
10:06 < ohsix> meanwhile check out the shiny stuff .services can do! system call filters! forbidding access to paths! private temporary directories! capabilities! and lots more!
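
The suggestion in the log to make the check "its own daemon" could look roughly like the sketch below: one long-lived process that takes the listening socket from systemd and answers every connection itself, so systemd does not start (and log) a unit per connection. Assumptions: the socket unit would use Accept=false and activate a plain, non-template service running this script. `is_synced` is a hypothetical placeholder for the real cluster query. The response body is simplified.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first file descriptor systemd passes to an activated service


def is_synced() -> bool:
    """Hypothetical placeholder: a real check would query wsrep_local_state."""
    return True


def build_response(synced: bool) -> bytes:
    """Build a minimal HTTP response for the load balancer."""
    status = "200 OK" if synced else "503 Service Unavailable"
    body = "synced\r\n" if synced else "not synced\r\n"
    head = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/plain\r\n"
        "Connection: close\r\n"
        f"Content-Length: {len(body)}\r\n\r\n"
    )
    return (head + body).encode("ascii")


def handle(conn: socket.socket, check=is_synced) -> None:
    """Read (and ignore) the request, then send the health status."""
    conn.settimeout(1.0)
    try:
        conn.recv(4096)  # drain the request so closing does not reset the peer
    except OSError:
        pass  # timeout or reset: still try to answer
    conn.sendall(build_response(check()))


def serve(listener: socket.socket) -> None:
    """Answer connections one at a time from a single long-lived process."""
    while True:
        conn, _ = listener.accept()
        with conn:
            try:
                handle(conn)
            except OSError:
                continue  # a broken client must not kill the daemon


def main() -> None:
    # With Accept=false, systemd hands over the listening socket itself.
    if os.environ.get("LISTEN_PID") != str(os.getpid()) or os.environ.get("LISTEN_FDS") != "1":
        raise SystemExit("expected exactly one socket from systemd")
    serve(socket.socket(fileno=SD_LISTEN_FDS_START))


# Only run the accept loop when actually socket-activated.
if __name__ == "__main__" and os.environ.get("LISTEN_FDS"):
    main()
```

The loop is deliberately single-threaded; at two short checks per second per load balancer that is likely enough, but a slow client can hold up the next check for up to the one-second timeout.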
