My Docker Swarm Architecture

This (and related gists) captures how I created my docker swarm architecture. This is intended mostly as my own notes in case I need to re-create anything later! As such, expect some typos and possibly even an error...

Installation Step-by-Step

Each major task has its own gist; this is to help with long-term maintainability.

  1. Install Debian VM for each docker host
  2. Install Docker
  3. Configure Docker Swarm
  4. Install Portainer
  5. Install KeepaliveD
  6. GlusterFS disk prep, install & config
  7. GlusterFS plugin for Docker (optional)
  8. Example stack templates

More Details on What and Why

design goals:

  • ensure every container stays running if any of the following fail (one VM, one hypervisor, one docker service)
  • remove chance of blackhole requests (aka eliminate the use of DNS round robin to address the service)
  • enable the use of replicated state so any container can start on any docker swarm node, fail over between nodes, and still see the data it needs
  • enable a safe, replicated shared volume across all nodes, so state is replicated and accessible from every node and databases like mariadb can be used (these will corrupt if placed on NFS or CIFS/SMB shares across the network)
  • make it easy to back up with my Synology (this model enabled me to easily back up using Active Backup for Business)

current state

  • all seems to be functioning
  • I am building an NPM (Nginx Proxy Manager) node to test whether this really works (small database). Nothing critical, and it is easy to switch between my existing nginx and NPM; it is just taking a lot of work to convert all my nginx customizations, as NPM is not vanilla nginx under the hood
  • I still have an NFS volume and an iSCSI volume (on my Synology) mapped into docker01 for containers bound to that node

Architecture

[image: architecture diagram]

Design Assumptions

  • I wanted to continue to use docker, docker-compose, docker swarm & portainer due to existing skills
  • I have no interest at this time in k8s (I don't use it at work and never will)
  • Start simple, even if that means I do what I shouldn't (this is just a home network)
  • This is small. The containers include nginx reverse proxy, oauth2-proxy, wordpress site + database, mqtt, upoller and cloudflare ddns, so bear that in mind: this isn't designed for super throughput or scale - it's designed for some resiliency.
  • I want to deploy all services (containers) with stack templates and possibly contribute back to portainer template repo
  • The clustered file system must support databases on it (like mariadb)

Design Decisions

  • Debian for my docker host VMs - I seem to gel with Debian, and it (and other Debian derivatives) seems to play nice with most containers
  • I will only use package versions included in the Debian distribution (bullseye stable)
  • I chose glusterfs as my clustered, replicated file system
  • Gluster volumes will be deployed in dispersed mode
  • I mapped separate VHDs into the docker hosts, one for the OS and one for gluster - this is to prevent the risk of infinite boot loops
  • my gluster service will be installed on the docker host VMs. Best practice dictates they should be separate VMs for scale, but as all VMs share the same host CPU this really gives no benefit. If this turns out to be a bad decision I will change it.
  • I won't tear down my current NFS and iSCSI mapped volumes (not shown) until glusterfs has been shown to run ok and survive reboots etc.

A note on docker swarm and state (assumes you know docker already)

Docker containers are ephemeral and generally lose all their data when they are stopped. For most docker containers there is some level of configuration state you need to pass to the container (variables, files, folders of data). Similarly, many containers want to persist data state (databases, files etc).

On a single-node docker host, most people map a directory or file on the host into the container as a volume or bind mount. We also see the following more advanced techniques used:

  1. mount a shared CIFS or NFS volume at boot time on the docker hosts
  2. define a CIFS volume and map it into the container at runtime (this avoids editing fstab on the host)
  3. same as above but with NFS
  4. use configs - if you have just a single, read-only config file that needs to be read, this can be defined.
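
As a rough sketch, techniques 2 to 4 might look like this in a stack file. The server address 192.0.2.10, the export path, the share name, the credentials and the config name mosquitto_conf are all placeholders for illustration, not values from this setup, and the config is assumed to have been created in the swarm beforehand.

```yaml
version: "3.8"

services:
  mqtt:
    image: eclipse-mosquitto:2
    volumes:
      # NFS volume defined below; mounted on demand on whichever node runs the task
      - mqtt_data:/mosquitto/data
    configs:
      # single read-only file that swarm makes available on any node
      - source: mosquitto_conf
        target: /mosquitto/config/mosquitto.conf

volumes:
  # technique 3: NFS volume via the local driver (placeholder address and path)
  mqtt_data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=192.0.2.10,rw,nfsvers=4"
      device: ":/volume1/docker/mqtt"
  # technique 2: same idea with CIFS (placeholder share and credentials,
  # declared here only for comparison and not attached to a service)
  cifs_share:
    driver: local
    driver_opts:
      type: cifs
      o: "addr=192.0.2.10,username=CHANGEME,password=CHANGEME,vers=3.0"
      device: "//192.0.2.10/docker"

configs:
  # technique 4: assumed to exist already as a swarm config
  mosquitto_conf:
    external: true
```

Neither volume definition touches fstab on the host; the mount only happens when a container that uses the volume starts on a node.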

In a swarm where you want a container to run on any node you need to find a way to make the data available on all nodes in a safe effective way.

If you have a simple container that only needs environment variables to be configured, you can do that directly when you deploy the portainer template as a portainer stack. See this cloudflare dynamic dns updater as an example.
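
A minimal sketch of such an environment-only stack is below. The image name and the variable names are hypothetical stand-ins, not the actual updater referenced above; the point is only that no volume or config is needed, so the task can land on any node.

```yaml
version: "3.8"

services:
  ddns:
    image: example/ddns-updater:latest   # hypothetical image name
    environment:
      # hypothetical variable names; check the real image's documentation
      API_TOKEN: "changeme"
      ZONE: "example.com"
      SUBDOMAIN: "home"
    deploy:
      replicas: 1
      restart_policy:
        condition: any
```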

  • Only #4 offers a safe way to make this happen (the 'config' is available to all nodes) - but this is super restrictive and doesn't help with containers that need to store more state and read/write that state. See this mosquitto mqtt example
  • #1 - this can work, and you can mount the shares to multiple nodes via fstab. Typically databases cannot be placed on these shares and will ultimately corrupt. You do have to be careful to only have one container writing to any given file to avoid potential issues.
  • #2 and #3 - this has the advantage of not being generally mounted to the host OS, but mounted on demand by the container. This reduces all the tedious mucking about in fstab. You do need to use the volumes UI in portainer for this.

And for most folks, NFS/CIFS shares are not replicated for high availability.

This is why in this architecture I have chosen to see if I can overcome these limitations using glusterfs.

@ociotec

ociotec commented Sep 3, 2022

Hi, did you consider CephFS as shared storage solution?

@scyto
Author

scyto commented Sep 3, 2022

Hi, did you consider CephFS as shared storage solution?

Yes. I hadn't used either, so I looked for 1) the simplest tutorial I could find and 2) something that would let me have one VHD disk per node that I could dedicate to clustered storage.

I chose glusterfs just because it looked simpler. I might look at ceph down the line; for now this has been rock solid (fatal last words I am sure, lol)

@BritHefty

This is quite literally the same path I'm learning my way through. I had recommendations to use Ceph but didn't have raw disks to throw at it; Gluster worked as it sits on top of the existing filesystem. keepalived is the MVP for handling inbound requests to anywhere in the cluster. Thank you for putting this together.
