Source: https://gitlab.com/gitlab-org/gitlab-runner/issues/1583#note_93170156
OK, I've experimented a lot getting this going with the `docker+machine` executor (specifically with the `amazonec2` driver, which I suspect is quite common for people looking at this thread!). These notes may also be helpful to others when debugging what's going on for them.
`docker+machine` is interesting because it has several relevant contexts (i.e. a file system and environment variables), which I shall refer to as:
- "runner": what is running the `gitlab-runner` binary - in my case this is an ECS-managed docker container for the `gitlab/gitlab-runner` image on docker hub, but it could be the `systemd` service configuration if you're running directly on the machine.
- "job host": the docker-machine created machine (e.g. EC2 instance) that runs the docker daemon
- "job container": the docker container for the image specified in the project `.gitlab-ci.yml` (or the default in `config.toml`)
Of course, if you're not using the docker machine (or ssh?) executor, then the runner and job host contexts are on the same physical machine.
With some experimenting, and spelunking through this project, I found out the following:
- The `gitlab-runner` binary is what calls `docker-credential-ecr-login`, so make sure `docker-credential-ecr-login version` in the runner context succeeds, and that the runner context is the one with IAM permissions for ECR
- `gitlab-runner` uses the docker go client library to talk to the docker daemon, not the `docker` CLI, so it must re-implement configuration parsing and authentication. In particular, this means that `credsStore` is implemented (by !501 (merged)), but not `credHelpers`
- `DOCKER_AUTH_CONFIG` is defined and used by `gitlab-runner`, not by docker, so don't expect setting that to make the `docker` CLI work.
- `DOCKER_AUTH_CONFIG` should still be specified as a job-visible environment variable, e.g. in `config.toml` `environment`, or pipeline secret variables etc., even though it's actually read by `gitlab-runner` in the runner context, not the job container. That one is weird. I suspect using `engine-env` in `MachineOptions` to set this would not work because of this?
- `gitlab-runner` uses the provided credsStore `list` command for... some reason? Unfortunately, at some point AWS added the requirement to `docker-credential-ecr-login list` that the AWS region is provided. The simplest way to do this is to set the `AWS_REGION` environment variable - but unlike `DOCKER_AUTH_CONFIG` this must be in the runner context
- Test the final call that actually gets the token with `echo $REGISTRY_NAME | docker-credential-ecr-login get`, where `$REGISTRY_NAME` should look like `123456789012.dkr.ecr.my-region-1.amazonaws.com` (the part of the repository name before the first `/`)
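
The last check in the list can be scripted. This is only a minimal sketch, assuming a POSIX shell in the runner context; `registry_of` is a hypothetical helper name and the image reference is made up for illustration:

```shell
#!/bin/sh
# registry_of is a hypothetical helper: it prints everything before the
# first "/" of an image reference, i.e. the registry host name that
# docker-credential-ecr-login expects on stdin.
registry_of() {
    printf '%s\n' "${1%%/*}"
}

# Example image reference (made up for illustration).
IMAGE="123456789012.dkr.ecr.my-region-1.amazonaws.com/my-group/my-image:latest"
REGISTRY_NAME="$(registry_of "$IMAGE")"
echo "$REGISTRY_NAME"

# Only attempt the real lookup when the helper is installed.
# The output is discarded because it contains the registry password.
if command -v docker-credential-ecr-login >/dev/null 2>&1; then
    echo "$REGISTRY_NAME" | docker-credential-ecr-login get >/dev/null \
        && echo "token lookup OK" \
        || echo "token lookup failed"
fi
```

Run it in the same context as `gitlab-runner` (same user, same environment), otherwise a passing result says little about what the runner itself will see.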
Unrelated to gitlab, but also:
- By default the EC2 instance profile is exposed to docker containers that are run in it. You can test this with `curl http://169.254.169.254/latest/meta-data/iam/security-credentials/<iam-role-name>`, which will return the access key id and secret key along with other metadata. You can lock this down further with ECS task roles, but I haven't looked into that myself. This applies both to running gitlab-runner as a docker container, and to `docker-machine` created EC2 instances with the `amazonec2-iam-instance-profile` machine option.
- The only relevant ECR permission when actually using docker is `ecr:GetAuthorizationToken`, which doesn't distinguish between read and write, nor to individual repositories (only at the registry level), so don't bother trying to lock down permission to push to ECR.
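
To see whether an instance profile is visible from a given context (runner, job host or job container), the metadata request can be wrapped in a small function. A rough sketch: `list_instance_roles` is a hypothetical name, and it assumes the instance still accepts unauthenticated (IMDSv1-style) metadata requests; if session tokens are enforced the request will fail even though a role is attached.

```shell
#!/bin/sh
# list_instance_roles is a hypothetical helper. It asks the instance metadata
# service for the name(s) of the IAM role attached to the instance profile.
# It returns non-zero when the endpoint is unreachable or rejects the request.
# Assumption: unauthenticated (IMDSv1-style) requests are allowed.
list_instance_roles() {
    base_url="${1:-http://169.254.169.254}"
    curl --fail --silent --max-time 2 \
        "$base_url/latest/meta-data/iam/security-credentials/"
}
```

For example, `list_instance_roles || echo "no instance profile visible from here"` inside a CI job shows whether the job container can reach the job host's credentials. The optional argument only exists so the function can be pointed at a different base URL when testing.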
In summary, to pull an ECR image as the job image:
- ensure the runner context has credentials with ECR permissions - including via IAM profiles if it's on EC2, but the default profile in `~/.aws/config` / `~/.aws/credentials` should also work?
- put `docker-credential-ecr-login` on the PATH for `gitlab-runner` (and don't forget to +x, of course)
- set `AWS_REGION` to the region of your ECR repository (don't think it's possible to be cross-region yet)
- `config.toml` should have `environment = ["DOCKER_AUTH_CONFIG={\"credsStore\":\"ecr-login\"}"]` in `[[runners]]`, or if you have multiple private registries(?), as a runner pipeline variable or in `.gitlab-ci.yml` `variables`.
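
The runner-side items in this checklist can be verified before registering any jobs. A minimal sketch, assuming a POSIX shell; `check_runner_context` is a hypothetical name, and it only checks the two things that are visible from the shell (the helper on `PATH` and `AWS_REGION`), not the IAM permissions themselves:

```shell
#!/bin/sh
# check_runner_context is a hypothetical preflight check. It prints one line
# per missing prerequisite and returns non-zero if anything is missing.
# It must be run in the same environment as the gitlab-runner process.
check_runner_context() {
    rc=0
    if ! command -v docker-credential-ecr-login >/dev/null 2>&1; then
        echo "missing: docker-credential-ecr-login is not on PATH"
        rc=1
    fi
    if [ -z "${AWS_REGION:-}" ]; then
        echo "missing: AWS_REGION is not set"
        rc=1
    fi
    return "$rc"
}
```

Calling `check_runner_context && echo "runner context looks OK"` from the runner's own environment (for example via `docker exec` into the runner container) avoids the common mistake of testing in a login shell that has a different `PATH` or region.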
This won't get you the ability to use ECR in your CI job scripts, though. For that you have a few options, but it's easy enough to extend the solution:
- grant the docker client in the job container access to the docker daemon on the job host (installed by docker-machine) by sharing `/var/run/docker.sock`
- make sure that in the job container `/root/.docker/config.json` (remember, `DOCKER_AUTH_CONFIG` is not read by the `docker` CLI) has `{"credsStore":"ecr-login"}`, and that `docker-credential-ecr-login` is on the path.
- make sure that the job container context has AWS credentials with ECR permissions, so `docker-credential-ecr-login` can get the token, same as above.
- make sure that you have the `docker` client binary, of course! You can use the `docker` image, or also mount the job host `docker` binary.
Note that `docker` doesn't require `AWS_REGION`; it only uses `get` with the registry that is actually being accessed.
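
If mounting a config directory from the job host is not an option, the `config.json` step could instead be done at the start of the job script. This is a sketch of that alternative, not what I actually run; `write_ecr_docker_config` is a hypothetical name, and it assumes the job runs as a user whose home directory is where the `docker` CLI looks for its config:

```shell
#!/bin/sh
# write_ecr_docker_config is a hypothetical helper. It creates a docker CLI
# config that delegates credential lookups to docker-credential-ecr-login.
# It leaves an existing config.json untouched rather than overwriting it.
write_ecr_docker_config() {
    config_dir="${1:-$HOME/.docker}"
    if [ -e "$config_dir/config.json" ]; then
        echo "keeping existing $config_dir/config.json" >&2
        return 0
    fi
    mkdir -p "$config_dir" &&
        printf '%s\n' '{"credsStore":"ecr-login"}' > "$config_dir/config.json"
}
```

Calling `write_ecr_docker_config` with no argument writes to `$HOME/.docker`; the helper binary still has to be present in the job container for `docker push` to an ECR registry to work.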
The way I did this is to update `config.toml` to have:

    [[runners]]
      [runners.docker]
        volumes = [
          "/cache",
          # So the 'docker' client works in CI
          "/var/run/docker.sock:/var/run/docker.sock",
          # So 'docker push <ECR image>' works in CI
          "/root/.docker:/root/.docker",
          "/usr/local/bin/docker-credential-ecr-login:/usr/local/bin/docker-credential-ecr-login"
        ]
      [runners.machine]
        MachineOptions = [
          "amazonec2-iam-instance-profile=RUNNER_INSTANCE_PROFILE_NAME",
          "amazonec2-userdata=/path/to/userdata"
        ]
where `/path/to/userdata` contains something like:

    #!/bin/bash
    set -eu
    # Fetch the credential helper onto the job host
    curl --fail \
      https://MY_BUCKET.s3-MY_REGION.amazonaws.com/SOME_PREFIX/docker-credential-ecr-login \
      -o /usr/local/bin/docker-credential-ecr-login
    chmod +x /usr/local/bin/docker-credential-ecr-login
    # Tell the docker CLI to use the helper for credentials
    mkdir -p ~/.docker
    echo '{ "credsStore": "ecr-login" }' > ~/.docker/config.json
The URL to `docker-credential-ecr-login` works because the object was uploaded with `--acl public-read`.
Thanks to all the above commenters for helping me nail this down!