@Smasherr
Last active October 10, 2023 07:33
This script can be used to copy all docker images from one GitLab-project to another. It was created because in GitLab it's not possible to move projects that contain docker images.
#!/usr/bin/env bash
# This script can be used to copy all docker images from one GitLab-project to another.
# It was created because in GitLab it's not possible to move projects that contain docker images.
#
# Related issue: https://gitlab.com/gitlab-org/gitlab/-/issues/18383
#
# Author: Daniel Estermann <estermad@apache.org>
# Thanks to komar <komar@core.org.ua> for his brilliant support

ME=$(basename "$0")
DEPENDS="docker curl jq tac"

# Print an error message to stderr (in red if stderr is a terminal) and exit.
die() {
    if [[ "$*" && -t 2 ]]; then
        printf "\e[31;1m%s\e[0m\n" "$*" >&2
    else
        printf "%s\n" "$*" >&2
    fi
    exit 1
}

usage() {
    cat <<- EOM
Usage: $ME <gitlab-server> <path-to-project> <new-path-to-project>
Example: $ME gitlab.com johndoe/foo johndoe/bar
EOM
    exit 1
}

# Check the token, the required tools, the docker engine and the arguments.
[[ ${GITLAB_TOKEN} ]] || die "Please provide an access token via the environment variable GITLAB_TOKEN"
for CMD in $DEPENDS; do
    command -v $CMD >/dev/null 2>&1 || die "This script requires $CMD but it's not installed. Aborting."
done
docker ps >/dev/null 2>&1 || die "This script requires docker but its engine is unreachable. Aborting."
[[ ${1} && ${2} && ${3} ]] || usage

# URL-encode the slashes of the project path and list its image repositories.
PROJECT_NS=${2//\//%2F}
REPOSITORY_IDS=$(curl -s -H "PRIVATE-TOKEN:${GITLAB_TOKEN}" \
    "https://$1/api/v4/projects/${PROJECT_NS}/registry/repositories/" |
    jq -r '.[].id' 2>/dev/null)
[[ ${REPOSITORY_IDS} ]] || die "$1/$2 has no docker images"

TMPFILE=$(mktemp -u /tmp/$ME.XXXXXX)

# For every tag of every image repository, generate one "docker pull && docker tag" line.
for REPOSITORY_ID in $REPOSITORY_IDS; do
    TAGS_URL="https://$1/api/v4/projects/${PROJECT_NS}/registry/repositories/${REPOSITORY_ID}/tags?per_page=100"
    # The number of result pages is read from the X-Total-Pages response header.
    TOTAL_PAGES=$(curl -sI -X HEAD -H "PRIVATE-TOKEN:${GITLAB_TOKEN}" ${TAGS_URL} |
        sed -nE 's/X-Total-Pages:[[:space:]]*([^'$'\r'']*)'$'\r''?/\1/ip')
    [[ $TOTAL_PAGES =~ ^[0-9]+$ ]] || die "Error getting total page number using HTTP HEAD on $TAGS_URL (got $TOTAL_PAGES)"
    for PAGE in $(seq 1 $TOTAL_PAGES); do
        curl -s -H "PRIVATE-TOKEN:${GITLAB_TOKEN}" "${TAGS_URL}&page=${PAGE}" |
            jq -r '.[].location'
    done |
        sed -E "s#(.*)#docker pull \1 \&\& docker tag \1 \1#g
s#(.*)($2)(.*:.*)#\1$3\3#i" |
        tee -a $TMPFILE
done

# Append a 7-line confirmation prompt to the generated script.
cat << EOM >> $TMPFILE
while true; do
read -p "Do you want to push the images now? [Y/n] " yn
case \${yn:-Y} in
[Yy]* ) break;;
* ) echo "Abort."; exit;;
esac
done
EOM

# Skip the 7 prompt lines again and turn the target of every tag command into a push command.
tac $TMPFILE | sed '1,7d' | tac |
    sed -E "s#.* (.*)#docker push \1#" |
    tee -a $TMPFILE

[[ -f $TMPFILE ]] &&
    chmod +x $TMPFILE &&
    echo "You can now review the generated script in stdout or in $TMPFILE and if it looks fine for you just execute it." ||
    die "Something went wrong, sorry :("
@xenomachina

I ran into a little issue with this script. If I run it like this:

./copy_images.sh gitlab.com mygroup/oldname mygroup/newname

the generated pull commands will look like:

docker pull registry.gitlab.com/mygroup/oldname:1.2.3 && docker tag registry.gitlab.com/mygroup/oldname:1.2.3 registry.gitlab.com/mygroup/newname:1.2.3

but the generated push commands will look like:

docker push gitlab.com/mygroup/newname:1.2.3

This causes the pushes to fail, as registry.gitlab.com is not the same as gitlab.com.

If I specify registry.gitlab.com on the command line I get an error "registry.gitlab.com/mygroup/oldname has no docker images".

My quick and dirty workaround is to just edit the generated script and change the push commands to say registry.gitlab.com, which works.

@Smasherr
Author

Smasherr commented Jun 4, 2020

Hey @xenomachina, thanks a lot for the feedback! I see what the issue is. To fix it, I have published a new revision of the script with a small change. Does it work as expected for you?
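
For readers following along: in the revision shown at the top, the push commands are derived from the target of each generated `docker tag` command, so the registry host returned by the API (here `registry.gitlab.com`) carries over to the push. A minimal sketch of that step, where `push_lines` is a hypothetical helper name and the sample line is taken from the report above:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the gist): reads generated
# "docker pull ... && docker tag SRC DST" lines on stdin and prints
# one "docker push DST" line for each of them.
push_lines() {
    # ".* " greedily consumes everything up to the last space,
    # so \1 is the final word of the line, i.e. the tag target.
    sed -E 's#.* (.*)#docker push \1#'
}

printf '%s\n' \
    'docker pull registry.gitlab.com/mygroup/oldname:1.2.3 && docker tag registry.gitlab.com/mygroup/oldname:1.2.3 registry.gitlab.com/mygroup/newname:1.2.3' |
    push_lines
```

Because the push target is copied from the tag line, the host no longer depends on the `<gitlab-server>` argument.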

@xenomachina

Hi @Smasherr, thanks for the update! That fixes that problem.

However, I've discovered another issue. I was moving a project from mygroup/myproject to mygroup/mysubgroup/myproject, and the project has multiple image repositories: mygroup/myproject/foo-image and mygroup/myproject/bar-image. The resulting script left out the image repository names in the tag and push commands, so it looked like:

docker pull registry.gitlab.com/mygroup/myproject/foo-image:latest && docker tag registry.gitlab.com/mygroup/myproject/foo-image:latest registry.gitlab.com/mygroup/mysubgroup/myproject:latest
docker pull registry.gitlab.com/mygroup/myproject/bar-image:latest && docker tag registry.gitlab.com/mygroup/myproject/bar-image:latest registry.gitlab.com/mygroup/mysubgroup/myproject:latest
...
docker push registry.gitlab.com/mygroup/mysubgroup/myproject:latest
docker push registry.gitlab.com/mygroup/mysubgroup/myproject:latest

instead of:

docker pull registry.gitlab.com/mygroup/myproject/foo-image:latest && docker tag registry.gitlab.com/mygroup/myproject/foo-image:latest registry.gitlab.com/mygroup/mysubgroup/myproject/foo-image:latest
docker pull registry.gitlab.com/mygroup/myproject/bar-image:latest && docker tag registry.gitlab.com/mygroup/myproject/bar-image:latest registry.gitlab.com/mygroup/mysubgroup/myproject/bar-image:latest
...
docker push registry.gitlab.com/mygroup/mysubgroup/myproject/foo-image:latest
docker push registry.gitlab.com/mygroup/mysubgroup/myproject/bar-image:latest

@Smasherr
Author

Smasherr commented Aug 5, 2020

Hey @xenomachina, unfortunately, I cannot reproduce that issue on my machine. Can you tell me which shell you are running?

Also, I've just updated the script to make it more compatible with different versions of sed. Perhaps that will fix the issue on your side.

@xenomachina

Hi @Smasherr. Sorry for the late reply.

I'm using bash 4.4.20 on Linux (Ubuntu 18.04). A sed incompatibility (e.g. GNU vs. BSD) sounds plausible.

I don't have any projects to move at the moment, but if I do in the future I'll try out the update. Thanks!

@salim-runsafe

@Smasherr, I had the same issue as @xenomachina and was able to fix it by swapping the regex within the sed statement to "s#(.*)($2)(.*:.*)#\1$3\3#". The current script will grab more than just $2 (the previous project path), clobbering any sub-paths in the Docker image (like @xenomachina's foo-image and bar-image under myproject). This change moves those sub-paths into match group 3 so that they are preserved.
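
To make the effect of that regex concrete, here is a minimal sketch that applies it to one of the image locations from the report above. The function name `retag_lines` is hypothetical; inside it, `$1` and `$2` are the old and new project paths (the script's `$2` and `$3`):

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the gist): reads image locations on
# stdin and prints one "docker pull ... && docker tag ..." line each.
# $1 = old project path, $2 = new project path.
retag_lines() {
    # First command: duplicate the location into pull, tag source and tag target.
    # Second command: the greedy leading (.*) makes ($1) match the LAST
    # occurrence of the old path, and (.*:.*) keeps whatever follows it
    # (image sub-path and tag) as group 3.
    sed -E "s#(.*)#docker pull \1 \&\& docker tag \1 \1#
s#(.*)($1)(.*:.*)#\1$2\3#"
}

printf '%s\n' registry.gitlab.com/mygroup/myproject/foo-image:latest |
    retag_lines mygroup/myproject mygroup/mysubgroup/myproject
```

Only the tag target is rewritten, and the `/foo-image:latest` suffix survives because it is captured in group 3. Note that the old path is interpolated into the pattern unescaped, so a `.` in it acts as a regex wildcard.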

Thanks for the awesome script, this saved us a lot of trouble while we restructured our repos and made it possible to put things just the way we needed them.

@Smasherr
Author

Hey @salim-runsafe, now I am also able to see the issue that was initially discovered by @xenomachina. I very much appreciate your feedback and your fix for the regex.
Furthermore, I've discovered an issue in line 79: there was a while read command, which just ate the first line and was totally superfluous. I've got rid of it.
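
The removed code is not shown in this thread, so the following is only a hypothetical reproduction of the general effect: a stray `read` placed in front of another command in a pipeline consumes one line of input before that command sees anything. The function name `with_stray_read` is made up for this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical reproduction, NOT the gist's original code:
# "read" takes the first line from stdin, "cat" only gets the rest.
with_stray_read() {
    { read -r discarded; cat; }
}

# Prints "two" and "three"; "one" was swallowed by read.
printf '%s\n' one two three | with_stray_read
```

If the stream had been reversed with `tac` beforehand, the swallowed first line would correspond to the last generated command, which is an assumption about how a missing final line could arise rather than a confirmed diagnosis.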

@salim-runsafe

> Hey @salim-runsafe, now I also able to see this issue that was initially discovered by @xenomachina. I appreciate very much your feedback and your fix of the regex.
> Furthermore, I've discovered an issue in line 79 - there was a while read command, which just ate the first line and was totally superfluous. I've got rid of it.

@Smasherr would this fix the bug (which I'm just now reporting) that the last image to be pushed would be left out of the generated script? To clarify, if you had a repo with only one image it would pull it down correctly, but there would be no line to push it. If you had a repo with 100 images it would pull 100 and only push the first 99.

@salim-runsafe

I just tested it and it does seem to fix that. Thanks!

@Smasherr
Author

Fixing bugs before they are reported is the best way to run it 😁

@Smasherr
Author

I moved to OSX and realized that this script doesn't work there because many of its tools are the BSD rather than the GNU variants. Hence today's update with improved compatibility. I also tested it with busybox.

@JanWendler

Hi @Smasherr
FYI, I tried to use your script on a repository with fewer than 100 tags and it failed to set TOTAL_PAGES. I had to manually set TOTAL_PAGES to 1 for it to work. Thank you for this very helpful script! It saved us a lot of time.

@Smasherr
Author

Smasherr commented Oct 9, 2023

Thanks for your feedback, @JanWendler. Perhaps the API has changed; I'm going to check this.

@Smasherr
Author

Smasherr commented Oct 9, 2023

@JanWendler I just tested the script with our internal GitLab installation and with gitlab.com. It works with both, so the HTTP header X-Total-Pages is available. That means there might be an incompatibility with the sed expression, as it is quite complicated.

Do you know if at least the validation in line 57 worked for you? What did you get as an output there?

To investigate your problem further, it would be helpful to see the output of the script in tracing mode. To do this, simply append -x to the shebang in line 1. And don't forget to mask sensitive data if you want to paste the logs ;-)

@JanWendler

JanWendler commented Oct 10, 2023

@Smasherr This is the output:

$ ./copy_images.sh <our_server> <original_repo> jan.wendler/temp
++ basename ./copy_images.sh
+ ME=copy_images.sh
+ DEPENDS='docker curl jq tac'
+ [[ -n <Token> ]]
+ for CMD in $DEPENDS
+ command -v docker
+ for CMD in $DEPENDS
+ command -v curl
+ for CMD in $DEPENDS
+ command -v jq
+ for CMD in $DEPENDS
+ command -v tac
+ docker ps
+ [[ -n <our_server> ]]
+ [[ -n <original_repo> ]]
+ [[ -n jan.wendler/temp ]]
+ PROJECT_NS=<original_repo_%2F>
++ curl -s -H PRIVATE-TOKEN:<Token> https://<our_server>/api/v4/projects/<original_repo_%2F>/registry/repositories/
++ jq -r '.[].id'
+ REPOSITORY_IDS=1702
+ [[ -n 1702 ]]
++ mktemp -u /tmp/copy_images.sh.XXXXXX
+ TMPFILE=/tmp/copy_images.sh.AIoZz4
+ for REPOSITORY_ID in $REPOSITORY_IDS
+ TAGS_URL='https://<our_server>/api/v4/projects/<original_repo_%2F>/registry/repositories/1702/tags?per_page=100'
++ curl -sI -X HEAD -H PRIVATE-TOKEN:<Token> 'https://<our_server>/api/v4/projects/<original_repo_%2F>/registry/repositories/1702/tags?per_page=100'
++ sed -nE 's/X-Total-Pages:[[:space:]]*([^]*)?/\1/ip'
sed: -e expression #1, char 41: unterminated `s' command
+ TOTAL_PAGES=
+ [[ '' =~ ^[0-9]+$ ]]
+ die 'Error getting total page number using HTTP HEAD on https://<our_server>/api/v4/projects/<original_repo_%2F>/registry/repositories/1702/tags?per_page=100 (got )'
+ [[ -n Error getting total page number using HTTP HEAD on https://<our_server>/api/v4/projects/<original_repo_%2F>/registry/repositories/1702/tags?per_page=100 (got ) ]]
+ [[ -t 2 ]]
+ printf '\e[31;1m%s\e[0m\n' 'Error getting total page number using HTTP HEAD on https://<our_server>/api/v4/projects/<original_repo_%2F>/registry/repositories/1702/tags?per_page=100 (got )'
Error getting total page number using HTTP HEAD on https://<our_server>/api/v4/projects/<original_repo_%2F>/registry/repositories/1702/tags?per_page=100 (got )
+ exit 1

I'm working on Windows 10 with an elevated Git Bash console. I hope this helps : )
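
The trace shows sed rejecting the expression before any header is parsed, which suggests (as an assumption, not a confirmed diagnosis) that the literal carriage return embedded in the sed expression is what trips it up under Git Bash. One possible workaround is to strip carriage returns first and match the header name case-insensitively with awk. The helper name `total_pages` is hypothetical, and this sketch has not been tested on Git Bash:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the gist): reads raw HTTP response
# headers on stdin and prints the value of X-Total-Pages, if present.
total_pages() {
    # tr removes the CR of each CRLF line ending, so no carriage return
    # has to appear inside a sed or awk expression.
    tr -d '\r' |
        awk -F': *' 'tolower($1) == "x-total-pages" { print $2; exit }'
}

# Offline demonstration with canned headers; prints 1.
printf 'HTTP/1.1 200 OK\r\nX-Total-Pages: 1\r\nX-Per-Page: 100\r\n\r\n' | total_pages

# In the script, the assignment could then look like this (assumption):
# TOTAL_PAGES=$(curl -sI -H "PRIVATE-TOKEN:${GITLAB_TOKEN}" "${TAGS_URL}" | total_pages)
```

The existing `[[ $TOTAL_PAGES =~ ^[0-9]+$ ]]` check would still catch the case where the header is missing, since the helper then prints nothing.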
