@Drallas
Last active May 1, 2024 20:20

Docker Swarm in LXC Containers

Part of collection: Hyper-converged Homelab with Proxmox

After struggling for some days, and since I really needed this to work (ignoring the "it can't be done" vibe everywhere), I managed to get Docker to work reliably in privileged Debian 12 LXC containers on Proxmox 8.

(Unfortunately, I couldn't get anything to work in unprivileged LXC Containers)

There are NO modifications required on the Proxmox host or in the /etc/pve/lxc/xxx.conf file; everything is done on the Docker Swarm host. So the only obvious thing that could break this setup is a future Docker Engine update!

Host Setup

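My hosts are Debian 12 LXC containers, installed via tteck's Proxmox VE Helper Scripts. As dlasher notes in the comments, accepting the script's defaults creates an unprivileged container, so make sure to select a privileged one.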

Install the LXC via the Proxmox VE Helper Script

bash -c "$(wget -qLO - https://github.com/tteck/Proxmox/raw/main/ct/debian.sh)"
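Since Docker only worked for me in privileged containers, it can help to confirm what the helper script actually created before going further. The following is a minimal sketch, run inside the container. It assumes that a privileged container shows the identity mapping `0 0 4294967295` in /proc/self/uid_map, while an unprivileged one maps UID 0 to a high host UID. The function name `is_privileged_container` is made up for this example.

```shell
#!/bin/bash
# Sketch: report whether this container runs without UID remapping.
# Assumption: privileged LXC => uid_map is the identity map "0 0 4294967295".
# is_privileged_container is a hypothetical helper, not part of any tool.
is_privileged_container() {
    local map_file="${1:-/proc/self/uid_map}"
    local inside="" outside="" count=""
    # First line holds three fields: UID inside, UID outside, range length.
    read -r inside outside count < "$map_file" || [ -n "$inside" ] || return 2
    [ "$inside" = "0" ] && [ "$outside" = "0" ] && [ "$count" = "4294967295" ]
}
```

Usage: `is_privileged_container && echo "privileged" || echo "unprivileged, or map not readable"`.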

Backing filesystems

`docker info` shows I'm using overlay2, which is the recommended storage driver for Debian. This storage driver requires XFS or EXT4 as the backing filesystem.

docker info | grep -A 7 "Storage Driver:"

 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
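The backing filesystem can also be checked before Docker is installed. This is only a sketch: `fs_supports_overlay2` and `backing_fs_type` are hypothetical helper names, and the first one encodes nothing more than the "XFS or EXT4" rule stated above. The lookup relies on `findmnt` from util-linux being available in the container.

```shell
#!/bin/bash
# Hypothetical helper: encodes only the "XFS or EXT4" rule from this guide.
fs_supports_overlay2() {
    case "$1" in
        ext4|xfs) return 0 ;;
        *)        return 1 ;;
    esac
}

# Print the filesystem type backing a path (defaults to Docker's data root).
# If the path does not exist yet, pass an existing parent such as /var/lib.
backing_fs_type() {
    findmnt -n -o FSTYPE -T "${1:-/var/lib/docker}"
}
```

Usage: `fs_supports_overlay2 "$(backing_fs_type /var/lib)" || echo "overlay2 may need extra work on this filesystem"`.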

Set userns-remap

As Neuer_User pointed out, running the Docker containers unprivileged on a privileged LXC seems the best compromise to run the containers in a relatively secure way.

To do so, add a daemon.json on the Docker Servers that are part of the Swarm.

mkdir -p /etc/docker
nano /etc/docker/daemon.json
{
  "userns-remap": "root"
}

Then reboot the Docker host.

(This moves everything below /var/lib/docker/ into the folder /var/lib/docker/0.0/, so existing workloads disappear; hence this is a step to take before installing Docker!)
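For more than one Swarm node, the same file can be written non-interactively. A minimal sketch follows; `write_userns_config` is a hypothetical helper, and it deliberately refuses to overwrite an existing daemon.json, since merging settings is better done by hand.

```shell
#!/bin/bash
# Hypothetical helper: writes the userns-remap setting from this guide.
# The target directory is a parameter so it can be tried in a scratch dir.
write_userns_config() {
    local dir="${1:-/etc/docker}"
    local file="$dir/daemon.json"
    mkdir -p "$dir" || return 1
    # Do not clobber an existing config; merge by hand in that case.
    if [ -e "$file" ]; then
        echo "$file already exists, not overwriting" >&2
        return 1
    fi
    printf '{\n  "userns-remap": "root"\n}\n' > "$file"
}
```

Usage: `write_userns_config` as root on each node, followed by the reboot described above.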

Install Docker

The get-docker.sh script is the most convenient way to quickly install the latest Docker-CE release!

curl -fsSL https://get.docker.com -o get-docker.sh
chmod +x get-docker.sh
./get-docker.sh

Create / Join the Docker Swarm

Without this step, the next step(s) fail, because the ingress_sbox network namespace only exists once the node is part of a Swarm!

# Manager Node
docker swarm init

# Add Node
docker swarm join --token <some-very-long-token> <manager-ip>:2377

# Display Join token again
docker swarm join-token worker
docker swarm join-token manager
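Because the following steps depend on the node being part of a Swarm, a script can check that first. This sketch uses the `.Swarm.LocalNodeState` field of `docker info`; the function name `swarm_is_active` is an assumption made for this example.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if this node is an active Swarm member.
swarm_is_active() {
    local state
    # LocalNodeState is e.g. "active" or "inactive".
    state=$(docker info --format '{{.Swarm.LocalNodeState}}' 2>/dev/null) || return 2
    [ "$state" = "active" ]
}
```

Usage: `swarm_is_active || echo "run docker swarm init or docker swarm join first"`.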

Enable IPv4 forwarding for ingress_sbox

For Docker in LXC to work, the only thing needed is to execute:

nsenter --net=/run/docker/netns/ingress_sbox sysctl -w net.ipv4.ip_forward=1

on the Docker Swarm Servers

Make it permanent

This doesn't survive reboots, so I created a oneshot systemd service that applies the setting after each reboot.

Create a Bash Script

First, we need a Bash script to be executed by the service.

nano /usr/local/bin/ipforward.sh

#!/bin/bash
nsenter --net=/run/docker/netns/ingress_sbox sysctl -w net.ipv4.ip_forward=1

Make it executable

chmod +x /usr/local/bin/ipforward.sh

Create a Systemd Service

This service is of type oneshot. During startup it waits for docker.service to be started, and then another 10 seconds for run-docker-netns-ingress_sbox.mount to be loaded. Only after that can net.ipv4.ip_forward=1 be applied.

nano /etc/systemd/system/ingress-sbox-ipforward.service
[Unit]
Description = Set net.ipv4.ip_forward for ingress_sbox namespace
After = docker.service
Wants = docker.service

[Service]
Type = oneshot
RemainAfterExit = yes
ExecStartPre = /bin/sleep 10
ExecStart = /usr/local/bin/ipforward.sh

[Install]
WantedBy = multi-user.target

Start the service and check if it's healthy

systemctl daemon-reload
systemctl enable ingress-sbox-ipforward.service
systemctl start ingress-sbox-ipforward.service
systemctl status ingress-sbox-ipforward.service
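The fixed 10-second sleep in the unit is a guess at how long the namespace takes to appear. If the service turns out to be flaky, a polling wait could replace it. The sketch below is one way to do that; `wait_for_path` is a hypothetical helper, and the 60-second default is an arbitrary choice, not something Docker guarantees.

```shell
#!/bin/bash
# Hypothetical helper: wait until a path exists, or give up after a timeout.
# $1 = path to wait for, $2 = timeout in seconds (default 60).
wait_for_path() {
    local path="$1" timeout="${2:-60}" waited=0
    until [ -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Usage inside ipforward.sh: `wait_for_path /run/docker/netns/ingress_sbox 60 || exit 1` before the `nsenter` line, in which case the `ExecStartPre` sleep should no longer be needed.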

Final Checks

Without net.ipv4.ip_forward set to 1, the ingress networking of the Docker Swarm is not active, so it's important to verify that the value has been applied successfully.

Manually check whether net.ipv4.ip_forward is set to 1

systemctl status ingress-sbox-ipforward.service | grep ipforward.sh

# Or in a script via:

current_value=$(nsenter --net=/run/docker/netns/ingress_sbox sysctl -n net.ipv4.ip_forward)
echo "$current_value"
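For monitoring or health checks, an exit status is often handier than a printed value. A small sketch along the same lines; `ingress_forwarding_enabled` is a hypothetical name, and the namespace path is the one used throughout this guide.

```shell
#!/bin/bash
# Hypothetical helper: exit status 0 if forwarding is on in ingress_sbox,
# 1 if it is off, 2 if the namespace could not be entered.
ingress_forwarding_enabled() {
    local netns="${1:-/run/docker/netns/ingress_sbox}"
    local value
    value=$(nsenter --net="$netns" sysctl -n net.ipv4.ip_forward 2>/dev/null) || return 2
    [ "$value" = "1" ]
}
```

Usage: `ingress_forwarding_enabled || systemctl restart ingress-sbox-ipforward.service`.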

(Now, Docker in LXC seems to behave as Docker in a VM.)

Issues

  1. Service in docker-compose resolves to the wrong IP

To fix this, add a hostname entry for each Swarm service. To make it more logical, I also add the prefix service_ to the service names.

services:
  service_nginx: # Prefix service_
    image: nginx
    hostname: nginx


@scyto

scyto commented Sep 24, 2023

There is something very wrong in docker on lxc if that ip issue is an issue. Reading the linked gist issue everything is working as it should - the service name resolves to the docker VIP - I think the issue is people using weird network approaches - like host networking (don't do in swarm) and using same ranges on VIP network and host network....

Also not sure why you have to create a system service - isn't needed on real Debian? It all makes me nervous docker on lxc is very fragile...

@Drallas
Author

Drallas commented Sep 24, 2023

This was pre my Virtiofs discovery, now I can move Docker Swarm to VMs. 😀

The service is needed to set net.ipv4.ip_forward=1 which only can be done after run-docker-netns-ingress_sbox.mount becomes active.

Overall this approach is pretty ok, no weird host config, but only one simple setting inside the docker host.

I couldn’t find anyone with a better solution I could work off.

@scyto

scyto commented Sep 24, 2023

oh to be clear, i am darn impressed, reading all the horror stories on the forum around docker in LXC made me assume it wasn't really possible
it is the beauty of linux that it is so customizable

@dlasher

dlasher commented Jan 27, 2024

So there's a couple of subtle things:

  1. The VE helper scripts above, if you accept defaults, set up UNPRIV LXC containers - which make docker inside fail in unpretty ways. You mention it both ways, wouldn't hurt to put something in bold/red. I went through this a dozen times and missed that point. (And if you use the docker scripts from https://tteck.github.io/Proxmox/ - they are all UNPRIV as well)

  2. I wrote a little startup script to make sure the ingress_sbox is active, then sets net.ipv4.ip_forward=1. (Posted on proxmox forum, but will share it here)

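#!/bin/bash
# Wait up to 60 seconds for the ingress_sbox namespace, then enable forwarding.
for lp in {1..60}; do
        if [ -e /run/docker/netns/ingress_sbox ]; then
                nsenter --net=/run/docker/netns/ingress_sbox sysctl -w net.ipv4.ip_forward=1
                exit
        else
                echo "waiting $lp/60 - ingress_sbox does not exist"
                sleep 1
        fi
done
# The namespace never appeared: report failure to the caller.
exit 1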

  3. You can use other backing storage than XFS/ZFS, but it takes a little more work, and some help from fuse-overlayfs. Using your guide, I got docker swarm happy on a full proxmox 8.1.x cluster, with CEPH as the backing store. (https://c-goes.github.io/posts/proxmox-lxc-docker-fuse-overlayfs/)

Thanks for putting this together - it got me 99% of the way there, much appreciated.
