sudo usermod -aG docker $USER
Adds your user to the docker group. NOTE: Log out and log back in for the change to take effect.
NOTE: If you have network issues in the swarm, try rebooting the managers one at a time before you try anything else. Rebooting the managers carries a smaller risk of complications than most other fixes.
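A quick way to confirm that the new session picked up the group is to look at what `id -nG` prints. The helper name `has_group` below is made up for illustration and is not a Docker command; treat it as a minimal sketch.

```shell
# has_group GROUP: succeeds if GROUP appears in the space-separated
# group list read from stdin (hypothetical helper, not part of Docker).
has_group() {
  tr ' ' '\n' | grep -qxF -- "$1"
}

# Usage, after logging out and back in:
#   id -nG | has_group docker && echo "docker group is active in this session"
```

Running `id -nG` without a user name reports the groups of the current session, which is the thing that matters here.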
docker node ls
ID                            HOSTNAME     STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
b7t0j9rs63go8l1wa8z24e9tl *   manager-01   Ready    Active         Leader           18.09.3
fg992na75y3tmbiei2o6tc6l8     manager-02   Ready    Active         Reachable        18.09.3
fmg39j952a4lnm5o3uapux3g2     manager-03   Ready    Active         Reachable        18.09.3
um3j5adapapux3g27nmldb9pf     worker-01    Ready    Active                          18.09.3
y0pcyjb6vlpqz1x62bbsazxp3     worker-02    Down     Drain                           19.03.15
vzpgz07b64f712t0kam1kfymq     worker-03    Ready    Active                          19.03.1
oivlajzn6dgbi7nk2osejx0hx     worker-04    Down     Drain                           18.09.3
32ohhrmdage2rl637ckywsapj     worker-05    Ready    Active                          20.10.6
...
Shows which nodes are down or active. Look for nodes that are down when you do not expect them to be.
If you see an error saying that the swarm has no leader:
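On a large swarm it can be easier to let the shell pick out the nodes that are not Ready. The sketch below assumes the `--format` option of `docker node ls` with the `.Hostname` and `.Status` fields; the helper name `down_nodes` is made up for illustration.

```shell
# down_nodes: reads "HOSTNAME STATUS" lines on stdin and prints the
# hostnames whose status is anything other than "Ready".
# Hypothetical helper, not part of Docker.
down_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Usage on a manager:
#   docker node ls --format '{{.Hostname}} {{.Status}}' | down_nodes
```

With the listing above, this would be expected to print worker-02 and worker-04.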
Error response from daemon: rpc error: code = Unknown desc = The swarm does not have a leader. It's possible that too few managers are online. Make sure more than half of the managers are online.
...and you need to reclaim the cluster from a node that is already expected to be a manager of the cluster, re-initializing the swarm on that node with the --force-new-cluster flag should solve the issue.
docker swarm init --force-new-cluster
You may need to specify which IP to advertise with the --advertise-addr flag if the host has more than one IP address. It tells the joining nodes how to connect to the leader.
docker service ls
ID             NAME                       MODE         REPLICAS   IMAGE                                                    PORTS
7bmj6rbsclh1   acl                        replicated   3/3        registry.example.com:5/acl:latest
9ev0xlyq4fx2   acl-staging                replicated   3/3        registry.example.com:5/acl-staging:latest
ykccgabqlu1o   acl-uat                    replicated   3/3        registry.example.com:5/acl-uat:latest
2z62o15rvzoe   api                        replicated   3/3        registry.example.com:5/api:latest
n28wor05ekb0   b2b-notification-tool      replicated   1/1        registry.example.com:5/b2b-notification-tool:latest
9ikwwcots325   b2b-notification-tool-ui   replicated   1/1        registry.example.com:5/b2b-notification-tool-ui:latest
8pnvjoonjucf   bss-adaptor                replicated   1/1        mrsuperhero/bss-adaptor:latest
11deiin32cez   bss-adaptor-staging        replicated   1/1        registry.example.com:5/bss-adaptor-staging:latest
eh0lcm9jsg76   bss-adaptor-uat            replicated   1/1        mrsuperhero/bss-adaptor-uat:latest
8pjil6qbw7lb   cadmin-repository          replicated   1/1        registry.example.com:5/cadmin-repository:latest
m7oycnd82sme   calamares-adaptor          replicated   1/1        registry.adamo.es:5000/calamares-adaptor:latest
...
Shows which services are running in the cluster. Look for services with fewer running replicas than desired, such as 0/1 or 1/3.
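Spotting a 1/3 in a long list by eye is error-prone, so a small filter can help. This is a sketch that assumes the `--format` option of `docker service ls` with the `.Name` and `.Replicas` fields; `incomplete_services` is a hypothetical helper name.

```shell
# incomplete_services: reads "NAME RUNNING/DESIRED" lines on stdin and
# prints the services where fewer replicas are running than desired.
# Hypothetical helper, not part of Docker.
incomplete_services() {
  awk '{ split($2, r, "/"); if (r[1] + 0 < r[2] + 0) print $1, $2 }'
}

# Usage on a manager:
#   docker service ls --format '{{.Name}} {{.Replicas}}' | incomplete_services
```

An empty result means every service reports as many running replicas as it asked for.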
docker node ps worker-05
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
vyzeoi0f86b6 proxy.32ohhrmdage2rl637ckywsapj mrsuperhero/proxy:latest worker-05 Running Running 39 minutes ago
ndkkx374oxu1 \_ proxy.32ohhrmdage2rl637ckywsapj mrsuperhero/proxy:latest worker-05 Shutdown Shutdown 39 minutes ago
l5x3yz12nhf5 find-ont-ui.32ohhrmdage2rl637ckywsapj registry.example.com:5/find-ont-ui:latest worker-05 Running Running 41 minutes ago
po8kb12p3u5b \_ find-ont-ui.32ohhrmdage2rl637ckywsapj registry.example.com:5/find-ont-ui:latest worker-05 Shutdown Shutdown 41 minutes ago
e4rd2fylnto0 eurologistica-adapter-staging.32ohhrmdage2rl637ckywsapj registry.example.com:5/eurologistica-adapter-staging:latest worker-05 Running Running 42 minutes ago
touqd4u8gdd5 \_ eurologistica-adapter-staging.32ohhrmdage2rl637ckywsapj registry.example.com:5/eurologistica-adapter-staging:latest worker-05 Shutdown Shutdown 41 minutes ago
hcc20ynvvgvv eurologistica-adapter.32ohhrmdage2rl637ckywsapj registry.example.com:5/eurologistica-adapter:latest worker-05 Running Running 42 minutes ago
vbsykzk7e26j \_ eurologistica-adapter.32ohhrmdage2rl637ckywsapj registry.example.com:5/eurologistica-adapter:latest worker-05 Shutdown Shutdown 42 minutes ago
uc34coz95lub \_ eurologistica-adapter.32ohhrmdage2rl637ckywsapj registry.example.com:5/eurologistica-adapter:latest worker-05 Shutdown Shutdown 42 minutes ago
nq4xrgr62ak2 proxy.32ohhrmdage2rl637ckywsapj mrsuperhero/proxy worker-05 Shutdown Shutdown 39 minutes ago
02epczf81udm eurologistica-adapter-staging.32ohhrmdage2rl637ckywsapj registry.example.com:5/eurologistica-adapter-staging:latest worker-05 Shutdown Shutdown 42 minutes ago
7babrn6xa4fv find-ont-ui.32ohhrmdage2rl637ckywsapj registry.example.com:5/find-ont-ui:latest worker-05 Shutdown Shutdown 41 minutes ago
y7nqf3n7n7bv proxy.32ohhrmdage2rl637ckywsapj mrsuperhero/proxy worker-05 Shutdown Failed about an hour ago "error while removing network:…"
8kmk37626i5t find-ont-ui.32ohhrmdage2rl637ckywsapj registry.example.com:5/find-ont-ui:latest worker-05 Shutdown Failed about an hour ago "error while removing network:…"
s4i2dkz6p742 eurologistica-adapter-staging.32ohhrmdage2rl637ckywsapj registry.example.com:5/eurologistica-adapter-staging:latest worker-05 Shutdown Shutdown about an hour ago
...
Shows which service tasks are running on the specified node, together with their recent history.
docker service ps acl
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
uwwrptqqwmvj acl.1 registry.example.com:5/acl:latest worker-05 Running Running about an hour ago
i9b8pv6em9rl \_ acl.1 registry.example.com:5/acl:latest worker-01 Shutdown Shutdown about an hour ago
hwx8vo9yagls \_ acl.1 registry.example.com:5/acl:latest worker-01 Shutdown Shutdown about an hour ago
eg89rm0htcaw \_ acl.1 registry.example.com:5/acl:latest worker-01 Shutdown Complete 4 hours ago
wh0fx31ky1g1 \_ acl.1 registry.example.com:5/acl yn3r1lamszgq6ub06tg3ahe6c Shutdown Running 11 hours ago
tdl29w1g12y5 acl.2 registry.example.com:5/acl:latest worker-05 Running Running about an hour ago
9hrf14vx0m8d \_ acl.2 registry.example.com:5/acl:latest worker-05 Shutdown Shutdown about an hour ago
8hmdu51vuykg \_ acl.2 registry.example.com:5/acl:latest worker-01 Shutdown Shutdown about an hour ago
t5qp12m23yot \_ acl.2 registry.example.com:5/acl:latest worker-01 Shutdown Complete 2 hours ago
tsbam1hr2b9c \_ acl.2 registry.example.com:5/acl yn3r1lamszgq6ub06tg3ahe6c Shutdown Running 11 hours ago
pbq6ew1wv3pn acl.3 registry.example.com:5/acl:latest worker-01 Running Running about an hour ago
qagmo6uqrrdc \_ acl.3 registry.example.com:5/acl:latest worker-05 Shutdown Shutdown about an hour ago
7865qajaj2jq \_ acl.3 registry.example.com:5/acl:latest worker-01 Shutdown Shutdown about an hour ago
cms2tpm2lrvl \_ acl.3 registry.example.com:5/acl:latest worker-01 Shutdown Complete 4 hours ago
yjrba4scedzc \_ acl.3 registry.example.com:5/acl yn3r1lamszgq6ub06tg3ahe6c Shutdown Running 11 hours ago
Shows the task history of the service and which nodes the tasks run on. Look for errors in the ERROR column.
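The ERROR column is truncated by default (note the trailing … in the node listing earlier). A sketch for listing only the tasks that report an error, with the full message, assuming the `--no-trunc` and `--format` options of `docker service ps`; `failed_tasks` is a hypothetical helper name.

```shell
# failed_tasks: reads tab-separated "NAME<TAB>NODE<TAB>ERROR" lines on
# stdin and prints only the lines whose ERROR field is not empty.
# Hypothetical helper, not part of Docker.
failed_tasks() {
  awk -F '\t' '$3 != "" { print }'
}

# Usage on a manager (--no-trunc keeps the full error text):
#   docker service ps --no-trunc \
#     --format '{{.Name}}\t{{.Node}}\t{{.Error}}' acl | failed_tasks
```

Tabs are used as the separator because error messages usually contain spaces.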
docker node update --availability drain worker-05
worker-05
Stops scheduling tasks on a node that you have identified as corrupt or suspect to be a problem. Tasks running on the node are moved to other nodes.
docker node update --availability active worker-05
worker-05
Re-activates a drained node so that it can receive tasks again.
docker swarm join-token worker
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-4y2e0g4dfqtr8xwbxsajzorll9vnzyotkij1p97j07vqtor5rg-2shevw4iur544eauhs556rabf 10.156.0.2:2377
Prints the command for joining a node to the cluster as a worker.
NOTE: The docker swarm join --token ... command must be copied and run in the terminal of the node that you want to join.
docker swarm join-token manager
Same as for a worker, but prints the command for joining a new manager to the cluster.
docker node update --label-add production=true worker-05
worker-05
Adding a label to a newly joined node can be necessary if the services have placement constraints that restrict which nodes they can be deployed on.
docker service update ping-pong --force
ping-pong
overall progress: 3 out of 3 tasks
1/3: running [==================================================>]
2/3: running [==================================================>]
3/3: running [==================================================>]
verify: Service converged
Rebalances a service across the cluster after a new node has been added to the swarm.
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a998b75af5a3 registry.example.com:5/zoho-adaptor-uat:latest "npm start" About an hour ago Up About an hour zoho-adaptor-uat.1.y0omz07xqf5k3sch384rjzg4s
3f6f7ce7f18c registry.example.com:5/zoho-adaptor-staging:latest "npm start" About an hour ago Up About an hour zoho-adaptor-staging.1.ui3i86so1uxh8x6k2guq9rw01
92b61a1ab0a0 registry.example.com:5/zoho-adaptor:latest "npm start" About an hour ago Up About an hour zoho-adaptor.1.p9q9oybrewa692zbs8z5fd4kb
90d2a771ef0e mrsuperhero/wholesale-sit:latest "docker-php-entrypoi…" About an hour ago Up About an hour 80/tcp wholesale-sit.1.medddy0amddy61xjfyj9i9lqu
13fa4a01bf92 registry.example.com:5/wholesale-ftp-manager:latest "docker-entrypoint.s…" About an hour ago Up About an hour wholesale-ftp-manager.1.moryl65emlvwdoo2pa3p4b3m2
5dec02d416a3 mrsuperhero/wholesale-dev:latest "docker-php-entrypoi…" About an hour ago Exited (137) About an hour ago wholesale-dev.1.tdm04n6es078c514iye5fo5gs
c5a23521d1ec mrsuperhero/wholesale:latest "docker-php-entrypoi…" About an hour ago Up About an hour 80/tcp wholesale.1.k9ddmnl1utqyxni8dso40mk6k
bb2139a6f5ca registry.example.com:5/ui-voip-numberpool-staging:latest "docker-entrypoint.s…" About an hour ago Up About an hour ui-voip-numberpool-staging.1.lgmne591oyvvbf6g2tpyoklpz
7aebe7b1e1a8 registry.example.com:5/ui-voip-numberpool:latest "docker-entrypoint.s…" About an hour ago Up About an hour ui-voip-numberpool.1.8c7idw7wyfshwbcmec7hbd7eb
0ee7cc0d839e registry.example.com:5/sys02-rebuild-ui-uat:latest "npm start" About an hour ago Up About an hour sys02-rebuild-ui-uat.1.v29st7d0goddys6iyz57t2jjq
0938774e5430 registry.example.com:5/sys02-rebuild-ui-staging:latest "npm start" About an hour ago Up About an hour sys02-rebuild-ui-staging.1.p8cvlgms7723m4nnmaox5jrtw
69544e725437 mrsuperhero/sys02-rebuild-ui:latest "npm start" About an hour ago Up About an hour sys02-rebuild-ui.1.lschjl7x7jqi5vd3haqr7jo2p
12f863d15f6d registry.example.com:5/sys02-rebuild-uat:latest "npm start" About an hour ago Created sys02-rebuild-uat.1.hd2bmecvj0fkye5twmey39flv
...
Lists the containers on the worker node where the command is run. With the -a flag, the list also includes containers that are not running, such as exited or created ones.
docker stats
CONTAINER ID   NAME                                               CPU %    MEM USAGE / LIMIT     MEM %   NET I/O           BLOCK I/O         PIDS
a998b75af5a3   zoho-adaptor-uat.1.y0omz07xqf5k3sch384rjzg4s       0.00%    38.83MiB / 23.55GiB   0.16%   2.27MB / 400kB    94.2kB / 16.4kB   23
063de48800d3   sms.1.o510t7owmi3s09skp1xiauls1                    0.00%    39.34MiB / 23.55GiB   0.16%   1.64MB / 15.6kB   0B / 16.4kB       23
2486af68cf3a   sim-delivery-ui.1.2jjkka517nlk43d0dhiagqphh        0.00%    40.25MiB / 23.55GiB   0.17%   1.63MB / 8.44kB   0B / 16.4kB       19
fff833b0d21d   sim-delivery-staging.1.occufvt6om4zrm817zqobl7k0   0.00%    37MiB / 23.55GiB      0.15%   212MB / 742kB     28.7kB / 16.4kB   23
9b7fc3d89708   sim-delivery.1.6x4hsoiisx8095lortwgkzkg1           12.56%   100.9MiB / 23.55GiB   0.42%   2.9GB / 1.18GB    0B / 16.4kB       23
1a61621ff4c5   redis-global.1.oypej8vmpzsa0vxs9tt0lpxtk           9.62%    10.82MiB / 23.55GiB   0.04%   4.06GB / 4.25GB   0B / 1.09MB       4
...
Prints live resource usage statistics for the containers running on the worker. Look for containers that use more resources than average. NOTE: Too much RAM usage could be because of a memory leak.
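To pick out heavy containers without watching the live view, a one-shot snapshot can be filtered. This sketch assumes the `--no-stream` and `--format` options of `docker stats` with the `.Name` and `.MemPerc` fields; `mem_hogs` and the threshold value are made up for illustration.

```shell
# mem_hogs LIMIT: reads "NAME MEM%" lines on stdin and prints those
# whose memory percentage is above LIMIT (a plain number, e.g. 5).
# Hypothetical helper, not part of Docker.
mem_hogs() {
  awk -v limit="$1" '{
    pct = $2
    sub(/%$/, "", pct)              # strip the trailing percent sign
    if (pct + 0 > limit + 0) print $1, $2
  }'
}

# Usage on the worker node:
#   docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | mem_hogs 5
```

A single snapshot cannot prove a leak; a container whose number keeps growing across repeated runs is the one to look at.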
docker network inspect <network-name>
Inspects a network, for example to see whether any of the container IPs listed in the network do not match the actual containers. ...or whatever else you are looking for.
docker inspect <container-name>
Inspects a container to show its details.
docker swarm leave -f
Node left the swarm.
Forces the node to leave the swarm. Look for error output, for example:
Error response from daemon: context deadline exceeded
sudo service docker stop
sudo rm -rf /var/lib/docker/swarm
sudo service docker start
Removes the corrupted data. This can help if your CLI commands throw errors or freeze. This is a drastic measure, so chill with this set of commands and only use it if necessary. It is good to follow up by removing the worker from the swarm and restarting the managers before re-adding the node to the swarm with a join token.
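Because these commands are destructive, it can help to wrap them so that they only print what they would do unless you explicitly ask otherwise. The sketch below uses the same commands as above; the `run` and `reset_swarm_state` names and the DRY_RUN variable are assumptions made for illustration.

```shell
# run CMD...: prints the command when DRY_RUN is 1 (the default),
# executes it when DRY_RUN is 0. Hypothetical helper.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# reset_swarm_state: the leave / stop / wipe / start sequence for a
# worker whose local swarm data is corrupted. Hypothetical wrapper.
reset_swarm_state() {
  run docker swarm leave -f
  run sudo service docker stop
  run sudo rm -rf /var/lib/docker/swarm
  run sudo service docker start
}

# Usage on the broken worker:
#   reset_swarm_state              # only prints the four commands
#   DRY_RUN=0 reset_swarm_state    # actually runs them
```

The follow-up steps (removing the old node entry on a manager, restarting the managers, and re-joining with a join token) are left out on purpose, since they run on other machines.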