One of the main advantages of running services in Docker containers is the relative ease of reverting to the last-known-good container.
One of the biggest problems with reverting to the last-known-good container arises if you don't already know the tag of the image you want to revert to. Sometimes it can be a bit tricky to figure it out. You usually need to go to the container's "tags" page on DockerHub, sort through the update history for your architecture, pick the appropriate tag, edit your compose file to specify that tag, and then, in the words of the old Knight in the third Indiana Jones movie, hope that you have "chosen wisely".
Your chances of choosing wisely are inversely proportional to the amount of panic you're feeling after you realise a key container isn't working and reverting is your best strategy. Panic is the enemy of clear thinking and reasoning.
There is, however, another way. The method described here means you retain the last-known-good image on your local system, ready to use if you need it. No need to search DockerHub. No filtering by architecture. No comparison of SHA256 signatures to figure out what corresponds with what. No need to wait for the older image to be re-downloaded. No risk of "choosing poorly".
I'm going to use WireGuard as my example. I've selected it for no reason other than I know an update is available on DockerHub so it's a good opportunity for capturing commands and responses.
To begin with, I'll figure out a version number for the instance of WireGuard that is running on my Pi at the moment. This is not really part of the procedure. I'm only doing it as a means of showing what's actually running at various stages.
Every container has a different approach to reporting a version number. I couldn't find any way of asking the WireGuard container to show a version number. Fortunately, WireGuard images come with a suitable label so I'm going to use that as a proxy.
```
$ docker container inspect wireguard | jq .[0].Config.Labels.build_version
"Linuxserver.io version:- v1.0.20210914-ls92 Build-date:- 2022-11-26T20:58:43-06:00"
```
The version that is running is `v1.0.20210914-ls92`.
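If you want just the bare version token rather than the whole label, it can be teased out with a little text processing. This is only a sketch: the regular expression assumes the linuxserver.io `v<date>-ls<build>` labelling convention shown above and nothing else.

```shell
# Extract the "vX.Y-lsNN" token from a linuxserver.io build_version label.
# The label text is the sample captured above; in practice you would feed in
# the output of the "docker container inspect ... | jq ..." command.
label="Linuxserver.io version:- v1.0.20210914-ls92 Build-date:- 2022-11-26T20:58:43-06:00"
version=$(printf '%s\n' "$label" | grep -oE 'v[0-9][^ ]*-ls[0-9]+')
echo "$version"   # prints: v1.0.20210914-ls92
```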
Although a general "pull" would work, I'm going to keep things simple by limiting the "pull" to just WireGuard:
```
$ cd ~/IOTstack
$ docker-compose pull wireguard
[+] Running 1/1
 ⠿ wireguard Pulled                                     56.3s
   ⠿ 7ac8d1043976 Pull complete                          9.5s
   ⠿ cbba887b2540 Pull complete                          9.6s
   ⠿ 1662be267264 Pull complete                          9.8s
   ⠿ 9241a1203bfd Pull complete                         12.4s
   ⠿ e73540acc7bb Pull complete                         12.6s
   ⠿ 3d6fa3a44c12 Pull complete                         53.8s
   ⠿ a4466cab64d6 Pull complete                         54.3s
```
```
$ docker images | grep -e "REPOSITORY" -e "wireguard"
REPOSITORY                      TAG      IMAGE ID       CREATED        SIZE
ghcr.io/linuxserver/wireguard   latest   35808ba8d486   23 hours ago   918MB
ghcr.io/linuxserver/wireguard   <none>   d49946f30f14   7 days ago     918MB
ghcr.io/linuxserver/wireguard   prior    0dd790bba6bc   2 weeks ago    467MB
```
Breaking that down:
- `latest` (ID="35808ba8d486") is the image which has just been downloaded with the "pull". This image is not yet "running" because I haven't done an "up".
- `<none>` (ID="d49946f30f14") is the image currently instantiated as the running container. It's important to understand that this container has been running successfully for the last week so, in that sense, it is a "known good" image.
- `prior` (ID="0dd790bba6bc") is the image that was the "last known good" version the last time WireGuard was updated (7 days ago). You won't see a `prior` image on your system until you have been through this process at least once.
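The `<none>` (dangling) image can be picked out of that listing mechanically. This is a sketch over the sample output captured above; `sample` and `dangling` are illustrative names, and in real life you would pipe straight from `docker images ghcr.io/linuxserver/wireguard`:

```shell
# Isolate the ID of the <none> (dangling) image from "docker images" output.
# "sample" reproduces the listing shown above.
sample='REPOSITORY                      TAG      IMAGE ID       CREATED        SIZE
ghcr.io/linuxserver/wireguard   latest   35808ba8d486   23 hours ago   918MB
ghcr.io/linuxserver/wireguard   <none>   d49946f30f14   7 days ago     918MB
ghcr.io/linuxserver/wireguard   prior    0dd790bba6bc   2 weeks ago    467MB'

# Column 2 is the tag; print column 3 (the ID) for the untagged row.
dangling=$(printf '%s\n' "$sample" | awk '$2 == "<none>" { print $3 }')
echo "$dangling"   # prints: d49946f30f14
```

Docker can also do this filtering natively: `docker images --filter dangling=true --quiet` prints just the IDs of dangling images.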
- "up" the container. That instantiates `latest` (the image just pulled) as the running container.

    ```
    $ docker-compose up -d wireguard
    [+] Running 2/2
     ⠿ Container pihole     Running   0.0s
     ⠿ Container wireguard  Started   5.7s
    ```
PiHole gets a mention because my WireGuard container has a dependency on PiHole (I want remote clients to use PiHole for DNS).
I can confirm that the version number for the running container has changed:
```
$ docker container inspect wireguard | jq .[0].Config.Labels.build_version
"Linuxserver.io version:- v1.0.20210914-ls93 Build-date:- 2022-12-03T20:43:47-06:00"
```
The running version is `v1.0.20210914-ls93`.

A side-effect of "upping" the container with the new image is that `<none>` (ID="d49946f30f14") is released. If I ran `docker system prune` right now, that image would be removed. I'm about to prevent that from happening.
- If a `prior` image exists, remove it. That makes the `prior` tag available for the next step.

    ```
    $ docker rmi ghcr.io/linuxserver/wireguard:prior
    Untagged: ghcr.io/linuxserver/wireguard:prior
    Untagged: ghcr.io/linuxserver/wireguard@sha256:a44a4a2db71b6ad940b050dd32e7128906d4b7e7be8296e95d98510e44890457
    Deleted: sha256:0dd790bba6bcff66b95290e2db19000d58729138667ec4bf885cbc51ad594a52
    Deleted: sha256:d4ae179d2289cc36b2cf1689638b226c1c73b7940c3162757703f29a3f06a599
    Deleted: sha256:43ced15ba9be6273085d75888ab1f5f30ea42f8a333f0c0b14cdb7732b536e1c
    Deleted: sha256:542a58acdb4081c642642c2ecc11db70ff4a23cefca03fac704d019eb3243c98
    Deleted: sha256:8d3a647dfde7fa9cfe17df1c707a536a0a8317a855151ce16485919effc03a8a
    Deleted: sha256:2e57328ea8fd522d5c9260290a2960f1ee0e16ef4eed693e801a75f89c88098a
    Deleted: sha256:53039589157c1efa7611dd293fcb1d909729455dc3f441a137ee66c73f88683e
    Deleted: sha256:45041837a4550f15db78bb4eca5c07f5f031d0dd5cc0fd41dc99b91333fcdbff
    ```
What just happened? Well, the two-week-old `prior` (ID="0dd790bba6bc") image, which was the last-known-good image before the "pull", has gone away. I'm left with:

```
$ docker images | grep -e "REPOSITORY" -e "wireguard"
REPOSITORY                      TAG      IMAGE ID       CREATED        SIZE
ghcr.io/linuxserver/wireguard   latest   35808ba8d486   23 hours ago   918MB
ghcr.io/linuxserver/wireguard   <none>   d49946f30f14   7 days ago     918MB
```
- Tag `<none>` as `prior`:

    ```
    $ docker tag d49946f30f14 ghcr.io/linuxserver/wireguard:prior
    ```
Notice that I referred to the `<none>` image by its ID. That's needed here.

There is nothing magical about the name "prior" as a tag. You could use "previous" or your own initials. Just be consistent and avoid names that are already used as tags on DockerHub.
The important thing about this step is that adding a tag to an image marked `<none>` is what prevents the image from being removed.
- Prune to remove unused layers (pruning is "best practice"):

    ```
    $ docker system prune -f
    Total reclaimed space: 0B
    ```

    In this case there was no work for prune to do.
My system now shows:
```
$ docker images | grep -e "REPOSITORY" -e "wireguard"
REPOSITORY                      TAG      IMAGE ID       CREATED        SIZE
ghcr.io/linuxserver/wireguard   latest   35808ba8d486   23 hours ago   918MB
ghcr.io/linuxserver/wireguard   prior    d49946f30f14   7 days ago     918MB
```
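The whole sequence (pull, up, release the old `prior` tag, re-tag the dangling image, prune) can be collapsed into a small shell function. This is a sketch, not IOTstack code: the function name and the overridable `DOCKER`/`COMPOSE` variables are my own invention, and it assumes you run it from the compose project directory.

```shell
# Sketch: the "update but keep the last-known-good image" sequence as one
# function. DOCKER and COMPOSE are hypothetical overridable indirections;
# they default to the real commands used throughout this article.
DOCKER="${DOCKER:-docker}"
COMPOSE="${COMPOSE:-docker-compose}"

update_keeping_prior() {
    image="$1"      # e.g. ghcr.io/linuxserver/wireguard
    service="$2"    # e.g. wireguard

    $COMPOSE pull "$service"                        # fetch the new "latest"
    $COMPOSE up -d "$service"                       # old "latest" becomes <none>
    $DOCKER rmi "$image:prior" 2>/dev/null || true  # release "prior" if it exists
    # find the now-dangling image and preserve it under the "prior" tag
    dangling=$($DOCKER images "$image" | awk '$2 == "<none>" { print $3 }')
    if [ -n "$dangling" ]; then
        $DOCKER tag "$dangling" "$image:prior"
    fi
    $DOCKER system prune -f                         # tidy unused layers
}
```

Invoked as `update_keeping_prior ghcr.io/linuxserver/wireguard wireguard` from `~/IOTstack`, this reproduces the steps above in order.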
If you "join the dots" it should be apparent that:

- `latest` is instantiated as the running container and has version `v1.0.20210914-ls93`;
- `prior` is the last-known-good image. This is what was running immediately before the "pull" and "up".
Let's suppose `latest` turns out to be a disaster and I need to revert to `prior`.
The first few lines of WireGuard's service definition in my compose file look like this:
```
$ grep -A 2 "^  wireguard:" docker-compose.yml
  wireguard:
    container_name: wireguard
    image: ghcr.io/linuxserver/wireguard
```
I can revert to the `prior` image just by adding that tag to the `image:` directive:

```
    image: ghcr.io/linuxserver/wireguard:prior
```
and then executing:
```
$ docker-compose up -d wireguard
[+] Running 2/2
 ⠿ Container pihole     Running   0.0s
 ⠿ Container wireguard  Started   5.7s
```
I can confirm that `v1.0.20210914-ls92` is running again:
```
$ docker container inspect wireguard | jq .[0].Config.Labels.build_version
"Linuxserver.io version:- v1.0.20210914-ls92 Build-date:- 2022-11-26T20:58:43-06:00"
```
If I try to "pull" WireGuard, I'll get an error:
```
$ docker-compose pull wireguard
[+] Running 0/1
 ⠿ wireguard Error   1.3s
Error response from daemon: manifest unknown
```
That's because docker-compose is searching DockerHub for a newer image with a `prior` tag. It doesn't find any mention of `prior` on DockerHub so it complains.
WireGuard will remain pinned to my local copy of `v1.0.20210914-ls92` until I remove the `:prior` tag in the compose file.
But let's suppose I'm doing this "for real". I've pulled down `latest`, found that it didn't work, and reverted to `prior`. What I'm waiting for now is for a new image to become available on DockerHub. Once a new candidate image appears, I will proceed like this:
- Edit the compose file to remove the `prior` tag.

- Do a "pull" of the affected container:

    - The new image will come down from DockerHub and will be tagged `latest`.
    - The failing image, tagged `latest` before the "pull", will be tagged `<none>` after the "pull".

- "up" the container. That will instantiate the newly-pulled `latest`.

- Remove the `<none>` image via:

    ```
    $ docker system prune -f
    ```

    It can also be removed by using `docker rmi` plus its image ID.

- The state of play will be:

    - `latest` will be running and is the version I need to test; but
    - `prior` will still be the last-known-good, available for reversion if this newer version of `latest` still has problems.
One unfortunate side-effect of pinning any container to `prior` is that it also stops other containers from being processed in a general "pull":
```
$ docker-compose pull
[+] Running 2/8
 ⠿ mosquitto        Skipped - No image to be pulled   0.0s
 ⠿ nodered          Skipped - No image to be pulled   0.0s
 ⠿ grafana          Error                             1.3s
 ⠿ zerotier-router  Error                             1.3s
 ⠿ pihole           Error                             1.3s
 ⠿ portainer-ce     Error                             1.3s
 ⠿ influxdb         Error                             1.3s
 ⠿ wireguard        Error                             1.3s
Error response from daemon: manifest unknown
```
The reason Mosquitto and Node-RED were "skipped" is that those containers are built using local Dockerfiles. The failing containers are the ones pulled from DockerHub.
The workaround is to pull by name:
```
$ docker-compose pull grafana zerotier-router pihole portainer-ce influxdb
[+] Running 5/5
 ⠿ zerotier-router  Pulled   2.4s
 ⠿ grafana          Pulled   2.5s
 ⠿ pihole           Pulled   2.4s
 ⠿ portainer-ce     Pulled   2.5s
 ⠿ influxdb         Pulled   2.4s
```
This is not as elegant as a general "pull" but it works. It's also a constant reminder to check to see if whatever problem created the need to revert has been resolved in a later version.
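With a large stack, the list of services that are safe to pull can be derived from the compose file itself. The fragment below is a rough sketch, not IOTstack tooling: it does naive line-oriented matching rather than real YAML parsing, assumes service names are indented by exactly two spaces and `image:` directives by four, and the sample compose text is illustrative.

```shell
# Sketch: list services that can be pulled - i.e. services with an image:
# directive that is NOT pinned to :prior. Locally-built services (build:)
# have no image: line and drop out automatically.
compose='services:
  nodered:
    build: ./services/nodered/.
  wireguard:
    image: ghcr.io/linuxserver/wireguard:prior
  grafana:
    image: grafana/grafana'

pullable=$(printf '%s\n' "$compose" | awk '
    /^  [a-zA-Z0-9_-]+:$/ { svc = $1; sub(":", "", svc); next }
    /^    image:/ && $2 !~ /:prior$/ { print svc }')
echo "$pullable"   # prints: grafana
```

The resulting list could then be handed to `docker-compose pull`. For a complete (unfiltered) list of services, `docker-compose config --services` is the built-in alternative.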
For the sake of completeness, I could have reverted to the last-known-good image by changing WireGuard's service definition to use this tag:
```
    image: ghcr.io/linuxserver/wireguard:v1.0.20210914-ls92
```
When I "up" the container, docker-compose will pull that tag from DockerHub. The result would be:
```
$ docker images | grep -e "REPOSITORY" -e "wireguard"
REPOSITORY                      TAG                  IMAGE ID       CREATED        SIZE
ghcr.io/linuxserver/wireguard   latest               35808ba8d486   47 hours ago   918MB
ghcr.io/linuxserver/wireguard   prior                d49946f30f14   8 days ago     918MB
ghcr.io/linuxserver/wireguard   v1.0.20210914-ls92   d49946f30f14   8 days ago     918MB
```
I've left `prior` in the list because the image IDs match, and that proves I chose wisely when I selected `v1.0.20210914-ls92` on DockerHub! I can also confirm that from the running container:
```
$ docker container inspect wireguard | jq .[0].Config.Labels.build_version
"Linuxserver.io version:- v1.0.20210914-ls92 Build-date:- 2022-11-26T20:58:43-06:00"
```
Using a tag that actually appears on DockerHub means a general "pull" will also work:
```
$ docker-compose pull
[+] Running 8/8
 ⠿ nodered          Skipped - No image to be pulled   0.0s
 ⠿ mosquitto        Skipped - No image to be pulled   0.0s
 ⠿ zerotier-router  Pulled                            2.4s
 ⠿ wireguard        Pulled                            1.0s
 ⠿ portainer-ce     Pulled                            2.4s
 ⠿ influxdb         Pulled                            2.4s
 ⠿ pihole           Pulled                            2.4s
 ⠿ grafana          Pulled                            2.4s
```
But consider:

- I had to know, in advance of the need, that this was the "magic incantation" for determining the correct last-known-good tag for WireGuard.
- Had I not had the `prior` image on my system and discovered its tag, I would have had to search DockerHub to find it. There are so many image variants that I had to go to the second page of WireGuard's tags list to find it. Not really what I want to be doing in "panic mode".
- This style of `v1.0.20210914-ls92` tag is specific to WireGuard. Other containers have different schemes for their historic tags.
- Not every container maintains historic tags on DockerHub. Most of the "major" containers do but niche containers can be hit and miss.
- Not every container is actually hosted on DockerHub. If an image comes from another repository, finding the equivalent of its "tags" page can be … tricky.
The approach of tagging `<none>` as `prior` after any "pull" is independent of all that. It's a strategy that will work for any container where the image comes straight from any image repository. That's another way of saying it has an `image:` directive in its service definition.
What about containers like Mosquitto and Node-RED, which are built from local Dockerfiles? The strategy works there too. You just do things in a slightly different order. Here I'll use Node-RED as the example. I start with the running image:
```
$ docker images | grep -e "REPOSITORY" -e "nodered"
REPOSITORY         TAG      IMAGE ID       CREATED        SIZE
iotstack-nodered   latest   ee9dd58b0e2c   27 hours ago   556MB
```
As mentioned above, adding a tag to an existing image is what prevents it from being removed. I know I'm about to rebuild so I need to tag the existing image as `prior` to stop it from being auto-removed.
If there was a `prior` in the list above, I would need to remove it to release the tag, as in:

```
$ docker rmi iotstack-nodered:prior
```
Now I can tag the current image as `prior`:

```
$ docker tag ee9dd58b0e2c iotstack-nodered:prior
```
The state of play is (same image tagged twice):
```
$ docker images | grep -e "REPOSITORY" -e "nodered"
REPOSITORY         TAG      IMAGE ID       CREATED        SIZE
iotstack-nodered   latest   ee9dd58b0e2c   27 hours ago   556MB
iotstack-nodered   prior    ee9dd58b0e2c   27 hours ago   556MB
```
Now I force a container rebuild:
```
$ docker-compose build --no-cache --pull nodered
[+] Building 75.5s (7/7) FINISHED
 => [internal] load build definition from Dockerfile                    0.0s
 => => transferring dockerfile: 978B                                    0.0s
 => [internal] load .dockerignore                                       0.0s
 => => transferring context: 2B                                         0.0s
 => [internal] load metadata for docker.io/nodered/node-red:latest-14   1.8s
 => CACHED [1/3] FROM docker.io/nodered/node-red:latest-14@sha256:ea23400feef32a04af8c28245895b8c9b2ea10cb2b1f1bc12f2   0.0s
 => [2/3] RUN apk update && apk add --no-cache eudev-dev mosquitto-clients bind-tools tcpdump tree   7.0s
 => [3/3] RUN npm install node-red-node-pi-gpiod node-red-dashboard node-red-contrib-influxdb node-red-contr   62.8s
 => exporting to image                                                  3.5s
 => => exporting layers                                                 3.5s
 => => writing image sha256:40cb7d8ba6fe7dde71071163b4995f073e959c3103f679b666a1449ceeccdf04   0.0s
 => => naming to docker.io/library/iotstack-nodered                     0.0s
```
The revised state of play:
```
$ docker images | grep -e "REPOSITORY" -e "nodered"
REPOSITORY         TAG      IMAGE ID       CREATED         SIZE
iotstack-nodered   latest   40cb7d8ba6fe   3 minutes ago   556MB
iotstack-nodered   prior    ee9dd58b0e2c   5 weeks ago     556MB
```
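For a locally-built service, the "tag then rebuild" bookkeeping can also be sketched as one function. As with everything here, this is a sketch: the function name and the overridable `DOCKER`/`COMPOSE` variables are hypothetical, and tagging via `$image:latest` (rather than by ID, as above) assumes `latest` still points at the image that is currently running.

```shell
# Sketch: preserve the current image as "prior", then force a rebuild.
# DOCKER and COMPOSE are hypothetical overridable names defaulting to the
# real commands used in this article.
DOCKER="${DOCKER:-docker}"
COMPOSE="${COMPOSE:-docker-compose}"

rebuild_keeping_prior() {
    image="$1"      # e.g. iotstack-nodered
    service="$2"    # e.g. nodered

    $DOCKER rmi "$image:prior" 2>/dev/null || true  # release "prior" if it exists
    $DOCKER tag "$image:latest" "$image:prior"      # preserve the current image
    $COMPOSE build --no-cache --pull "$service"     # force a full rebuild
    $COMPOSE up -d "$service"                       # run the new build
}
```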
The simplest way to actually revert to `prior` as the running container is to manipulate the tags.
Given the state of play above as my starting point, make these assumptions:

- `latest` (ID=40cb7d8ba6fe) is a failing image;
- `prior` (ID=ee9dd58b0e2c) is the last-known-good image.
I can proceed like this:
- If the container is running, terminate it:

    ```
    $ docker-compose rm --force --stop -v nodered
    ```

    Termination is needed so the `latest` (failing) image can be removed in the next step.

- Remove the failing image:

    ```
    $ docker rmi 40cb7d8ba6fe
    ```

- Re-tag the last-known-good as `latest`:

    ```
    $ docker tag ee9dd58b0e2c iotstack-nodered:latest
    ```

- Optionally, remove the `prior` tag:

    ```
    $ docker rmi iotstack-nodered:prior
    ```

    The state of play is:

    ```
    $ docker images | grep -e "REPOSITORY" -e "nodered"
    REPOSITORY         TAG      IMAGE ID       CREATED       SIZE
    iotstack-nodered   latest   ee9dd58b0e2c   5 weeks ago   556MB
    ```

    This step is optional. If you are doing experimental builds, you may want to hang onto a local image tagged `prior` so you have an easy way of reverting.

- Bring up the container:

    ```
    $ docker-compose up -d nodered
    ```
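The reversion steps above can likewise be sketched as a single function. Again, the function name and the `DOCKER`/`COMPOSE` variables are my own hypothetical names; this version keeps the `prior` tag in place (the optional removal step is left to the caller).

```shell
# Sketch: revert a locally-built service to its "prior" image by tag
# manipulation, mirroring the steps above.
DOCKER="${DOCKER:-docker}"
COMPOSE="${COMPOSE:-docker-compose}"

revert_to_prior() {
    image="$1"      # e.g. iotstack-nodered
    service="$2"    # e.g. nodered

    $COMPOSE rm --force --stop -v "$service"    # terminate the running container
    $DOCKER rmi "$image:latest"                 # drop the failing image
    $DOCKER tag "$image:prior" "$image:latest"  # promote the last-known-good
    $COMPOSE up -d "$service"                   # bring it back up
}
```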