@scyto
Last active March 25, 2024 13:48

Using keepalived for node ingress and DNS reliability

This assumes you have already set up a Docker Swarm

Introduction

When you have a Docker Swarm, a container running on any node in the swarm can be accessed using the IP address of any swarm member.

For example, if you had a single web server running on port 80 on one node of the swarm, you could access the web server with any of the following IP addresses:

  • server1-ip:80
  • server2-ip:80
  • serverN-ip:80

Because you want to reach the app even if one swarm node is down, folks typically use round-robin DNS to try each of the IPs in sequence; this has the disadvantage of failed requests while a node is down. This gist shows how I chose to implement a single virtual IP (VIP) and DNS name to improve reachability and consistency.

EDITS 9/28/2023

  • Added dockerd health check - keepalived will now move the VIP to another node if dockerd is stopped on that node
  • Added swarm node health check - checks that the node is Active; if not, the VIP won't start on that node
  • Made all weights equal - no idea why I had them unequal; one wants the VIP to roam!
  • Removed the SMTP stuff, just not needed unless you really really want notifications

Install keepalived on all nodes

run the following on each docker node

sudo apt-get install keepalived

Add a script user with sudo useradd -r -s /sbin/nologin -M keepalived_script. Note this is not used yet (I need to figure out how to let it run docker; one possible approach is sketched below).
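One untested possibility (not part of the original gist, and an assumption on my part): add the script user to the docker group so it can run the docker CLI, assuming the docker group was created when Docker was installed.

# assumption: the docker group exists; membership grants access to the docker socket,
# which is effectively root-equivalent on that host
sudo usermod -aG docker keepalived_script

keepalived.conf would then need script_user keepalived_script instead of script_user root for the health checks to run as that account.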

create health check scripts

To ensure the swarm node is Active

execute sudo nano /usr/local/bin/node_active_check.sh

add the following contents

#!/bin/bash
docker node ls -f name=$(hostname) | grep Active > /dev/null 2>&1

save

then sudo chmod 755 /usr/local/bin/node_active_check.sh

To ensure the swarm node is Ready

execute sudo nano /usr/local/bin/node_ready_check.sh

add the following contents

#!/bin/bash
docker node ls -f name=$(hostname) | grep Ready > /dev/null 2>&1

save

then sudo chmod 755 /usr/local/bin/node_ready_check.sh
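To sanity-check both scripts (a quick manual test, not part of the original gist), run each one and inspect its exit code. As noted in the comments below, docker node ls only works on swarm manager nodes, so run this on a manager.

# exit code 0 means the grep matched; non-zero means keepalived would treat the check as failed
sudo /usr/local/bin/node_active_check.sh; echo "active check exit code: $?"
sudo /usr/local/bin/node_ready_check.sh; echo "ready check exit code: $?"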

create the keepalived.conf file on all nodes

sudo nano /etc/keepalived/keepalived.conf

paste in the following

! Configuration File for keepalived

global_defs {
    vrrp_startup_delay 5
    enable_script_security
    max_auto_priority
    script_user root
}

vrrp_track_process track_docker {
    process dockerd
    weight 10
}

vrrp_script node_active_check {
    script "/usr/local/bin/node_active_check.sh"
    interval 2
    timeout 5
    rise 3
    fall 3
}

vrrp_script node_ready_check {
    script "/usr/local/bin/node_ready_check.sh"
    interval 2
    timeout 5
    rise 3
    fall 3
}

vrrp_instance docker_swarm {
    state MASTER
    interface eth0
    virtual_router_id 10
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.45/24
    }
    track_process {
        track_docker
    }
    track_script {
        node_active_check
    }
    track_script {
        node_ready_check
    }
}

Note you may want to:

  • change the auth_pass (1111) to your preferred password
  • change the IP to the IP you want
  • change eth0 if your adapter has a different name
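As discussed in the comments below, on the other nodes the instance state should be BACKUP so that only one node starts out holding the VIP. A minimal sketch of the difference on those nodes (everything else stays the same; whether to also lower the priority is debated in the comments):

vrrp_instance docker_swarm {
    state BACKUP    # per the comment thread: only one node starts as MASTER
    # ... all other settings identical to the config above ...
}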

Once you have created the file, save and exit

Then start the service

sudo systemctl start keepalived
sudo systemctl enable keepalived
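To confirm which node currently holds the VIP (a quick check, not part of the original gist), look at the service state and at the interface you configured:

sudo systemctl status keepalived
ip addr show eth0 | grep 192.168.1.45    # the VIP appears here only on the node currently acting as MASTER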

Add a local DNS entry to your internal DNS server

for example

swarm.mydomain.com A 192.168.1.45

use this name when you want to address any container in the swarm

Testing

If you want a simple test, ping the VIP (e.g. 192.168.1.45) and see what happens when you shut down each of the nodes!
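A slightly fuller test sketch (assumes the single web server on port 80 from the introduction and the DNS entry above):

ping 192.168.1.45    # should keep answering as you shut the nodes down one at a time
curl -s http://swarm.mydomain.com/    # assumes a service published on port 80 behind the VIP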

@joshuacurtiss

For the "other nodes", should the state be BACKUP?

These are epic gists by the way, thank you very much for them.

@scyto (Author) commented Sep 20, 2023

For the "other nodes", should the state be BACKUP?

These are epic gists by the way, thank you very much for them.

@joshuacurtiss
Thanks, and thanks for finding my deliberate mistake to see if anyone was reading these.

changed it to backup - doh will teach me to cut and paste too much

@Drallas commented Sep 21, 2023

@scyto i assumed the mistake was by design, thank you for fixing the Gist.

@scyto (Author) commented Sep 29, 2023

well i just found something interesting....

  1. unsurprisingly if you stop docker on the master keepalived node then all ingress to the swarm mesh stops - this can be fixed by making keepalived dependent on docker (one possible sketch is at the end of this comment) - i need to figure that out. @Drallas you seem to be an expert on writing systemd service definitions - i assume i just edit something - but i can't find a keepalived file...

  2. bigger issue - when a docker node is in drain mode it rejects all traffic, as such if the master keepalived node is up, docker is running, keepalived is running - nothing can use the cluster ip address.....

(today was the first time i ever used drain)
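One way to express that dependency (a hedged sketch, not something the gist ended up using - the track_process approach added later supersedes it): a systemd drop-in so keepalived is stopped when docker.service is explicitly stopped.

sudo systemctl edit keepalived    # opens an override file for the keepalived unit

# contents of the override (sketch only):
[Unit]
Requires=docker.service
After=docker.service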

@scyto (Author) commented Sep 29, 2023

ahh it looks like one is supposed to track a process, script or file https://www.redhat.com/sysadmin/advanced-keepalived

so for issue #1 i could track if dockerd is present i guess?
for #2 not so sure

@scyto (Author) commented Sep 29, 2023

ok, edited example to add docker health check - if dockerd stops for some reason the swarm IP will move to another node!

still need to think about drain....

@scyto (Author) commented Sep 29, 2023

and now added the swarm check - this checks the node is in Active state; if it isn't, the VIP should move to another node.
hmm, doesn't work, but does no harm.... will look at this over the weekend

@scyto (Author) commented Sep 29, 2023

ah, i am a dumbass, my grep is wrong, will fix tomorrow

fixed it already, now i really can go to bed

@Drallas commented Sep 29, 2023

@scyto I first implemented it for my Pi-holes (pi-hole-failover-using-keepalived-and-orbital-sync). I'm not sure if I would use it for Docker Swarm nodes; I guess I prefer to use it between services, with a check that the service is active.

@Drallas commented Sep 29, 2023

Ps, vrrp_track_process (track_docker) probably would not work on LXC containers, but is fine on VMs or bare metal..

@scyto (Author) commented Sep 29, 2023

probably would not work on LXC containers

putting my opinions about docker on lxc to one side ;-) I thought that in an LXC, say, sudo ps aux would only show the processes for that LXC and not for other containers - wouldn't the dockerd process in that list be contained (ahem) in the container? At least that's what my postfix LXC container shows - just the processes for it...

@Drallas commented Sep 29, 2023

I tried to use vrrp_track_process but failed. I didn't go any further down that rabbit hole. After reading this, back to the vrrp_track_docker script - works perfectly!

@scyto (Author) commented Sep 29, 2023

got it - a kernel issue on certain platforms back when you did it, which doesn't imply it's broken on all LXC platforms.... wonder if they fixed that in the later pve kernels

(also, the person who filed that bug with a -10 weight - yeah, that's not going to do what they think it does either - the weight needs to be +10) i.e. all nodes need to have a +10 weight on the process check, and then when the process fails the system will decrement 10 from that node's priority, causing the failover... hmm, now i type that.... i guess if all nodes are -10 when healthy, what happens when the 10 is added back - oh yeah, no, that's not going to work, it will move the VIP to the failed node, lol
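For reference, a hedged illustration of how the positive weight plays out (based on keepalived's documented behaviour, not part of the original gist): a positive track weight is added to a node's effective priority while the tracked item is healthy, and that bonus disappears when it fails.

# all nodes: base priority 100, vrrp_track_process weight +10
# dockerd running -> effective priority 100 + 10 = 110
# dockerd stopped -> effective priority 100 (the +10 bonus is removed)
# a healthy peer still at 110 now outranks this node, so the VIP moves to it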

@Drallas commented Sep 29, 2023

putting my opinions about docker on lxc to oneside ;-)

Noted:😝 and my Docker Swarm in LXC Containers Setup is already stable for almost a week. 😇

But I plan on rebuilding it anyway, and putting Docker Swarm & Gluster in proper VMs..

@scyto (Author) commented Sep 30, 2023

Setup is already stable for almost a week.

cool, did you ever retry the keepalived process tracking (i don't know when you originally tried that)? if the latest PVE kernels don't have that feature enabled, as listed in the github issue, it would seem like a reasonable feature to ask for....

@Drallas commented Sep 30, 2023

Setup is already stable for almost a week.

cool, did you ever retry the keepalived process tracking (i don't know when you originally tried that)? if the latest PVE kernels don't have that feature enabled, as listed in the github issue, it would seem like a reasonable feature to ask for....

Tried that yesterday with +10 not -10 but it doesn’t work for me.

For now I’m happy with the tracking script.

Next week I move on to building my media and fileshare nodes..

@scyto (Author) commented Oct 3, 2023

For now I’m happy with the tracking script.

hey, if it's working, don't move - all that matters is it works :-)

I am pretty pumped my config above detects swarm nodes in drain mode and does the right thing.
I guess next up - adding health checks to all my containers.....

@Drallas commented Oct 3, 2023

Yes, Keepalived works like a charm between my Pi-holes.
So nice not to worry about complaining users, when restarting the Master Node! 😎

@scyto (Author) commented Oct 4, 2023

I need to add another check sudo docker node ls -f name=$(hostname) | grep Ready > /dev/null 2>&1 - seems one can have a totally hung docker VM that says it is Active but Down!

@scyto (Author) commented Oct 4, 2023

I need to add another check

done and increased interval to 2 seconds

@Drallas commented Oct 8, 2023

I made my own implementation of this based on this Gist.

The command docker node ls -f name=$(hostname) | grep Active can only be run on Swarm managers - docker node ls is a manager-only command!

  • Only included the node_check.sh in the Keepalived MASTER.conf
  • Gave the 2nd, 3rd and 4th nodes different prios, to control how they fail over
  • Added a unicast configuration (see the sketch below)

Seems to work fine while doing some basic tests.
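For readers who want the unicast variation mentioned above, a minimal hedged sketch (placeholder addresses, not Drallas's actual config): keepalived can be told to send VRRP adverts directly to named peers instead of using multicast.

vrrp_instance docker_swarm {
    # ... existing settings from the gist ...
    unicast_src_ip 192.168.1.11    # this node's own address (placeholder)
    unicast_peer {
        192.168.1.12               # the other keepalived nodes (placeholders)
        192.168.1.13
    }
}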

@scyto (Author) commented Oct 8, 2023

Aye, this sequence of gists is about a swarm, so the tests are for a swarm. Also, in a small swarm, make them all managers. There is no good reason not to… (oh, there is one: if you have an even number of nodes… clusters should never have an even number imo.)

@scyto (Author) commented Oct 8, 2023

Dunno why you would give them prios unless each node is different hardware?
Why the unicast?

@scyto (Author) commented Oct 8, 2023

Given your use of ceph, you could run a script on a dedicated master node (that's only a master) that writes to the ceph storage a file for each node with a 1 or a 0 in it, then use the file check method of keepalived? I do worry that's a lot of moving pieces….
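A hedged sketch of that file-tracking idea (the path, file name and weight are placeholders, and this is not part of the setup above): keepalived can track a file whose integer contents are multiplied by the weight and added to the node's priority.

# sketch only, in keepalived.conf on each node
vrrp_track_file node_flag {
    file "/mnt/cephfs/keepalived/node1.flag"    # placeholder: one flag file per node, written by the monitoring script
    weight 10                                   # a file containing 1 adds 10 to priority, 0 adds nothing
}

vrrp_instance docker_swarm {
    # ... existing settings from the gist ...
    track_file {
        node_flag
    }
}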

@Drallas commented Oct 8, 2023

Yes, I have an even number of nodes and only one manager; never thought about more, perhaps I'll try with 3 managers some day.

Prios make it easier for me to know what is going on…
