Docker Swarm Keepalived Configuration

Part of collection: Hyper-converged Homelab with Proxmox

Keepalived is a load balancer that adds high availability to Linux systems. See the Keepalived documentation for more background information.

This setup builds on High Available Pi-hole failover cluster using Keepalived and Orbital Sync.

Setup Keepalived

This setup uses a virtual IP address, 192.168.1.4, which is the only address needed to access applications on the Docker Swarm: http://192.168.1.4:<port-number>
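For example, assuming a service is published on port 8080 (a placeholder, any published Swarm port works the same way), a request through the virtual IP looks like this:

# Hypothetical example; 8080 stands in for the application's published port
curl http://192.168.1.4:8080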

All nodes

sudo apt-get install keepalived -y

Docker Manager Nodes

Add the Script and the Master Node configuration to the Docker Manager servers:

# node_active_ready_check.sh
sudo mkdir -p /etc/scripts   # make sure the target directory exists
sudo curl -fsSL https://gist.github.com/Drallas/4b965da52d259f0125f18bca39ffc8a3/raw/1774c4fad8783c02d0803b58ad1e6f250a432533/script-node_active_ready_check.sh -o /etc/scripts/node_active_ready_check.sh
sudo chmod +x /etc/scripts/node_active_ready_check.sh

# keepalived.conf
sudo curl -fsSL https://gist.github.com/Drallas/4b965da52d259f0125f18bca39ffc8a3/raw/9368ec523fda134b68e66ce857b607c93b3678e7/script-keepalived-master.conf -o /etc/keepalived/keepalived.conf

Docker Worker Nodes

Add the Keepalived configuration to the server:

# keepalived.conf
sudo curl -fsSL https://gist.github.com/Drallas/4b965da52d259f0125f18bca39ffc8a3/raw/9368ec523fda134b68e66ce857b607c93b3678e7/script-keepalived-slave.conf -o /etc/keepalived/keepalived.conf

On slave nodes there can't be a script that checks docker node ls (that command only works on manager nodes), hence keepalived only monitors the status of the Docker service. Each slave has its own unique priority value and its own unicast_src_ip & unicast_peer configuration.

Edit /etc/keepalived/keepalived.conf (nano /etc/keepalived/keepalived.conf) on each slave node and change the priority and the unicast blocks; a sketch of such a block follows the priority list below:

Docker Manager 1

priority 165

Docker Manager 2

priority 155

Docker Manager 3

priority 155

Docker Worker

priority 145
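For reference, a sketch of what the per-slave unicast settings could look like, here for Docker Manager 2. The unicast_src_ip and unicast_peer addresses are assumptions based on the node IPs mentioned below; the worker's address is not given in this Gist:

# Hypothetical fragment of /etc/keepalived/keepalived.conf on Docker Manager 2
vrrp_instance docker_swarm {
    priority 155
    unicast_src_ip 192.168.1.112   # this node's own address (assumed)
    unicast_peer {
        192.168.1.111              # Docker Manager 1 (assumed)
        192.168.1.113              # Docker Manager 3 (assumed)
        # plus the worker node's address
    }
    # remaining settings as in the configuration files below
}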

If the node with IP 192.168.1.111 fails, 192.168.1.112 becomes MASTER; if that one fails too, 192.168.1.113 takes over, and so on.
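To verify which node currently holds the virtual IP, check whether the VIP is bound on its interface (eth0, as in the configuration files below):

ip addr show eth0 | grep 192.168.1.4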

Start keepalived

sudo systemctl enable --now keepalived && sudo systemctl status keepalived

DNS

Set a DNS A record for each application: <appname>.<domain>.<countrycode> A 192.168.1.4
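To confirm the record resolves to the virtual IP (the hostname below is the same placeholder as above):

dig +short <appname>.<domain>.<countrycode>
# expected output: 192.168.1.4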

Testing

Stop the Docker service on the MASTER node: sudo systemctl stop docker.socket && sudo systemctl status docker.service

See the test section of High Available Pi-hole failover cluster using Keepalived and Orbital Sync for more details on how to test this.

When done, start Docker again with sudo systemctl start docker.service and monitor sudo systemctl status keepalived to see the node assume the MASTER state again.
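Two generic commands that help while watching a failover (not specific to this setup):

# Follow keepalived state transitions live on a node
sudo journalctl -u keepalived -f

# Watch the virtual IP appear or disappear on the interface
watch -n 1 'ip addr show eth0 | grep 192.168.1.4'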

script-keepalived-master.conf

! Configuration File for keepalived

global_defs {
    vrrp_startup_delay 5
    enable_script_security
    max_auto_priority
    script_user root
}

vrrp_track_process track_docker {
    process dockerd
    weight 10
}

vrrp_script node_active_ready_check {
    script "/etc/scripts/node_active_ready_check.sh"
    interval 5
}

vrrp_instance docker_swarm {
    state MASTER
    interface eth0
    virtual_router_id 10
    priority 160
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 8kgEDPp3
    }
    virtual_ipaddress {
        192.168.1.4/24
    }
    track_process {
        track_docker
    }
    track_script {
        node_active_ready_check
    }
}

script-keepalived-slave.conf

! Configuration File for keepalived

global_defs {
    vrrp_startup_delay 5
    enable_script_security
    max_auto_priority
    script_user root
}

vrrp_track_process track_docker {
    process dockerd
    weight 10
}

vrrp_instance docker_swarm {
    state BACKUP
    interface eth0
    virtual_router_id 10
    priority 145
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 8kgEDPp3
    }
    virtual_ipaddress {
        192.168.1.4/24
    }
    track_process {
        track_docker
    }
}

script-node_active_ready_check.sh

#!/bin/bash
# Healthy (exit 0) only when this Swarm node reports Ready and Active.
# The --filter limits docker node ls to this node; without it the substring
# test would pass as soon as any node in the swarm is Ready and Active.
status=$(docker node ls --filter "name=$(hostname)" --format "{{.Status}} {{.Availability}}")
if [[ "$status" == *"Ready"* && "$status" == *"Active"* ]]; then
    echo "Node is active and ready."
    exit 0
else
    echo "Node is not active or not ready."
    # Log the reason to a file
    exit 1
fi
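To sanity-check the script by hand on a manager node, run it and inspect its exit code (0 means active and ready, 1 means not):

sudo /etc/scripts/node_active_ready_check.sh
echo $?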
@scyto commented Oct 12, 2023

Removed Unicast

Weird, I wouldn't have thought it would harm anything... what was the logic for going unicast in the first place?

@Drallas (Author) commented Oct 12, 2023

Just trying stuff, but I didn't think that one through. Perhaps I'll revisit it some day; for now I'm happy it's stable. Migrating data now to CephFS, so far 350 GB via rsync over VirtioFS without any issue!

@scyto commented Oct 12, 2023

Just trying stuff

Got it, and that's the best way to learn; I only used what I did originally as that's what all the examples had :-)

for now I'm happy it's stable.

That's all that matters.

350 GB via rsync via VirtioFS without any issue

Sweet - not sure when I will get to VirtioFS or the Docker plugins for Ceph... my folks are coming to visit from the UK for 3 weeks and I have a week to prep the house (it needs a lot of work), so I may not get to look at either until mid to late Nov.

@Drallas (Author) commented Oct 12, 2023

Let me know how it goes; I've played a lot with it over the past weeks. Use the updated script from my Gist; it fixes some stuff, see the Proxmox Forum.
