@bric3
Last active July 9, 2021 13:22
Group all docker tags by image
#!/usr/bin/env bash
set -euo pipefail
#set -o xtrace
image=${1:?"usage: $0 IMAGE [TAG]"};
wanted_tag=${2:-""};
max_pages=10;
grouped_tags_tmp=$(mktemp);
function finish {
  rm -rf "${grouped_tags_tmp}";
}
trap finish EXIT;
(
  url="https://registry.hub.docker.com/v2/repositories/${image}/tags/?page_size=100"
  counter=1
  while [ $counter -le $max_pages ] && [ -n "${url}" ]; do
    >&2 echo -n ".";
    content=$(curl -s "${url}");
    ((counter++));
    url=$(jq -r '.next // empty' <<< "${content}");
    echo "${content}";
  done;
  >&2 echo;
) | jq -s '[.[].results[]]' \
  | jq 'map({tag: .name, digest: .images[].digest}) | unique | group_by(.digest) | map(select(.[].digest) | {(.[0].digest): [.[].tag]}) | unique' \
  > "${grouped_tags_tmp}";
# jq -s '[.[].results[]]' slurps (-s) the results of all pages into a single big array, somewhat like a flatMap
# jq 'map({tag: .name, digest: .images[].digest}) | unique | group_by(.digest) | map(select(.[].digest) | {(.[0].digest): [.[].tag]})'
# - map({tag: .name, digest: .images[].digest}) keeps only the tag name and the image digests
# - unique keeps unique objects, as the same image digest can be listed for several architectures or OSes
# - group_by(.digest) groups tags by image digest, which creates an array of arrays of objects
# - map(select(.[].digest) | {(.[0].digest): [.[].tag]}) changes the structure to an array of objects shaped this way
# {
#   "sha256:518f6c2137b7463272cb1f52488e914b913b92bfe0783acb821c216987959971": [
#     "11",
#     "11-buster",
#     "11-jdk",
#     "11-jdk-buster",
#     "11.0",
#     "11.0-buster",
#     "11.0-jdk",
#     "11.0-jdk-buster",
#     "11.0.8",
#     "11.0.8-buster",
#     "11.0.8-jdk",
#     "11.0.8-jdk-buster"
#   ]
# }
if [ -n "${wanted_tag}" ]; then
  jq --arg wanted_tag "${wanted_tag}" 'map(to_entries | map(select(.value | index($wanted_tag))) | from_entries | select(length > 0)) | unique' "${grouped_tags_tmp}"
  # 'map(to_entries | map(select(.value | index($wanted_tag))) | from_entries | select(length > 0)) | unique'
  # - the first map(...) works on each entry of the input array
  # - to_entries | map(select(.value | index($wanted_tag))) | from_entries | select(length > 0)
  #   - to_entries turns each object into a list of entries with key (the digest) and value (the tags array)
  #   - map(select(.value | index($wanted_tag))) selects the entries whose tags array contains the wanted tag
  #   - from_entries reassembles the entries into an object {digest: tags array}
  #   - select(length > 0) removes empty objects
  # - unique removes possible duplicates
else
  jq '.' "${grouped_tags_tmp}"
fi
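
To see what the tag lookup does without calling Docker Hub, the same filter can be run on a small hand-written sample. This is only a sketch: the digests `sha256:aaa` and `sha256:bbb` and the helper name `find_tag` are made up for illustration, and it assumes `jq` is installed.

```shell
set -eu

# Hypothetical sample in the shape the script writes to its temp file:
# an array of {digest: [tags]} objects. The digests are invented.
sample='[{"sha256:aaa":["11","11-jdk"]},{"sha256:bbb":["8","8-jdk"]}]'

# find_tag is an illustrative helper, not part of the gist.
# It keeps only the objects whose tags array contains the wanted tag.
find_tag() {
  printf '%s' "${sample}" | jq -c --arg wanted_tag "$1" \
    'map(to_entries | map(select(.value | index($wanted_tag))) | from_entries | select(length > 0)) | unique'
}

find_tag "11-jdk"
```

An unknown tag leaves every object empty, so `select(length > 0)` drops them all and the result is an empty array.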
@adrienaury

adrienaury commented Apr 17, 2021

Thank you for this.

I had to adapt line 18 to make it work:

-  url="https://registry.hub.docker.com/v2/repositories/${image}/tags/?page_size=100"
+  url="https://registry.hub.docker.com/v2/repositories/library/${image}/tags/?page_size=100"
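
Hard-coding `library/` works for official images such as `alpine`, but it would produce a wrong path for an image that already carries a namespace. A minimal sketch of handling both cases, assuming that a name without a `/` refers to an official image under the `library` namespace; `normalize_image` is a hypothetical helper name, not part of the gist:

```shell
# normalize_image is an illustrative helper (not in the original script).
# Assumption: names without a slash are official images under "library/".
normalize_image() {
  case "$1" in
    */*) printf '%s\n' "$1" ;;          # already namespaced, e.g. "someuser/someimage"
    *)   printf 'library/%s\n' "$1" ;;  # official image, e.g. "alpine"
  esac
}

image=$(normalize_image "alpine")
url="https://registry.hub.docker.com/v2/repositories/${image}/tags/?page_size=100"
echo "${url}"
```

With this in place, the script could be called with either `alpine` or `library/alpine` and build the same URL.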

On the alpine repository, the result contains multiple entries for a single tag because of the multiple architectures:

$ ./docker-tag-group.sh alpine | grep -A 2 -B 5 latest
.
  {
    "sha256:5de788243acadd50526e70868b86d12ad79f3793619719ae22e0d09e8c873a66": [
      "3",
      "3.13",
      "3.13.5",
      "latest"
    ]
  },
--
  {
    "sha256:827525365ff718681b0688621e09912af49a17611701ee4d421ba50d57c13f7e": [
      "3",
      "3.13",
      "3.13.5",
      "latest"
    ]
  },
--
  {
    "sha256:8f18fae117ec6e5777cc62ba78cbb3be10a8a38639ccfb949521abd95c8301a4": [
      "3",
      "3.13",
      "3.13.5",
      "latest"
    ]
  },
--
  {
    "sha256:9663906b1c3bf891618ebcac857961531357525b25493ef717bca0f86f581ad6": [
      "3",
      "3.13",
      "3.13.5",
      "latest"
    ]
  },
--
  {
    "sha256:a090d7c93c8e9ab88946367500756c5f50cd660e09deb4c57494989c1f23fa5a": [
      "3",
      "3.13",
      "3.13.5",
      "latest"
    ]
  },
--
  {
    "sha256:def822f9851ca422481ec6fee59a9966f12b351c62ccb9aca841526ffaa9f748": [
      "3",
      "3.13",
      "3.13.5",
      "latest"
    ]
  },
--
  {
    "sha256:ea73ecf48cd45e250f65eb731dd35808175ae37d70cca5d41f9ef57210737f04": [
      "3",
      "3.13",
      "3.13.5",
      "latest"
    ]
  },

@adrienaury

To select only the amd64 architecture, I changed line 29 like this:

-  | jq 'map({tag: .name, digest: .images[].digest}) | unique | group_by(.digest) | map(select(.[].digest) | {(.[0].digest): [.[].tag]}) | unique' \
+  | jq 'map({tag: .name, image: .images[] | select(.architecture == "amd64")}) | map({tag: .tag, digest: .image.digest}) | unique | group_by(.digest) | map(select(.[].digest) | {(.[0].digest): [.[].tag]}) | unique' \
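
The effect of the architecture filter can be checked offline on a hand-written sample. This is a sketch under assumptions: the two-tag sample and its digests are invented, `group_amd64` is a hypothetical helper name, the filter is a condensed variant of the changed line rather than the exact one, and `jq` must be installed.

```shell
# Invented sample mimicking the "results" entries of the tags API:
# each tag lists one image per architecture.
sample='[
  {"name":"latest","images":[
    {"architecture":"amd64","digest":"sha256:aaa"},
    {"architecture":"arm64","digest":"sha256:bbb"}]},
  {"name":"3","images":[
    {"architecture":"amd64","digest":"sha256:aaa"},
    {"architecture":"arm64","digest":"sha256:bbb"}]}
]'

# group_amd64 is an illustrative helper: keep the amd64 digests only,
# then group the (sorted) tag names under each digest.
group_amd64() {
  printf '%s' "${sample}" | jq -c '
    map({tag: .name, digest: (.images[] | select(.architecture == "amd64") | .digest)})
    | unique
    | group_by(.digest)
    | map({(.[0].digest): (map(.tag) | sort)})'
}

group_amd64
```

On this sample both tags end up under the single amd64 digest, instead of appearing once per architecture.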

@bric3
Author

bric3 commented Apr 17, 2021

Thanks, I'll try to come up with something to select the architecture. Thanks for the suggestion!

@adrienaury

adrienaury commented Apr 19, 2021

For information, I ended up creating my own function: https://gist.github.com/adrienaury/caa582f17edb399acfa7635afc9d435d
I credited you in the help message :)

$ dtags

Usage:   dtags [OPTIONS] IMAGE

Retrieve a list of tags from dockerhub

Options:
  -r, --repository string     Name of a repository, default: library
  -a, --architecture string   Filter on a specific architecture (e.g.: amd64), this can be a regex expression, default: all architectures
  -o, --os string             Filter on a specific operating system (e.g.: linux), this can be a regex expression, default: linux
  -l, --limit integer         Limit the number of results, this number is counted in hundreds (e.g.: a limit of 1 will return a maximum of 100 results), default to 10
  -c, --cache integer         Cache curl result, default no caching

Example: dtags -l1 -a amd64 alpine

Inspired by https://gist.github.com/bric3
Adapted by https://gist.github.com/adrienaury

You can combine it with this one to cache results: https://gist.github.com/adrienaury/4ec61ae619ec7b68e03cf4ee603a0645

$ cache -- dtags alpine -a amd64 | jq 'select(.tag == "latest")'
{
  "tag": "latest",
  "date": "2021-04-14T19:39:25.049993Z",
  "os": "linux",
  "architecture": "amd64",
  "digest": "sha256:def822f9851ca422481ec6fee59a9966f12b351c62ccb9aca841526ffaa9f748"
}

@bric3
Author

bric3 commented Jul 9, 2021

Thank you! Your version is really neat!
