Consuming Azure Pipelines Python artifact feeds in Docker
Recently, I was building out a set of Python packages and services and needed to find a way to pull down packages from an Azure Artifacts feed into a Docker image. It was straightforward to use the tasks to package an artifact, authenticate to the feed, and publish.
I had to do a bit more digging to piece together a flow I was comfortable with for building container images. This post describes some of the challenges involved and how I solved them.
What's the problem?
The PipAuthenticate task is great: it authenticates with your artifacts feed and, per the docs, stores the location of a config file that can be used to connect in the PYPIRC_PATH environment variable.
That said, containers run in an isolated environment by design, so we can't directly access that config file on the build host while building a container image. We need a way to get that config inside the build phase so that our calls to python -m pip install are successful. You are using a virtual environment and python -m pip install to install packages, right?
Challenge 1: No volumes at build time!
Docker doesn't currently support* mounting volumes at build time, so we can't just mount our PYPIRC_PATH file from the Azure Pipelines host into the build.
It would be much easier to pass a string as a --build-arg to Docker and then consume it. Azure Pipelines tasks are open source on GitHub, so I thought I'd take a look to see how the task worked and possibly extend it. It turns out that the PipAuthenticate task has some undocumented behavior (ahem, bonus features), and it already does what I want! It populates the PIP_EXTRA_INDEX_URL environment variable, which is automatically picked up by pip.
*Well, sort of! You can solve this with --mount=type=secret when you enable BuildKit. If this were a personal project, I'd have stopped there and said #shipit! In this case, I was looking for something that works for all users and isn't explicitly marked "experimental".
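For readers who can rely on BuildKit, the footnote's alternative might look roughly like the sketch below. It is not the approach this post settles on, and the base image, the secret id `pip_extra_index_url`, and the `requirements.txt` file are assumptions made for illustration only.

```dockerfile
# syntax=docker/dockerfile:1
# Sketch only: requires BuildKit. Base image, secret id, and file names are hypothetical.
#
# Example build command (assumes a Docker version that accepts env= as a secret source):
#   DOCKER_BUILDKIT=1 docker build \
#     --secret id=pip_extra_index_url,env=PIP_EXTRA_INDEX_URL .
FROM python:3.11-slim

COPY requirements.txt .

# The secret is mounted at /run/secrets/<id> for this RUN instruction only.
# It is exported to pip as PIP_EXTRA_INDEX_URL for the duration of this one
# command and is not written into the resulting image layer.
RUN --mount=type=secret,id=pip_extra_index_url \
    PIP_EXTRA_INDEX_URL="$(cat /run/secrets/pip_extra_index_url)" \
    python -m pip install --no-cache-dir -r requirements.txt
```

Because the value only exists while that single RUN instruction executes, there is no ARG or ENV to clean up afterwards.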
Challenge 2: Keep it secret, keep it safe!
Great! We pass in our build arg, set ENV PIP_EXTRA_INDEX_URL=$PIP_EXTRA_INDEX_URL, and call it a day, right? Right...?
Not so fast. We want PIP_EXTRA_INDEX_URL available when we pull packages, but we don't want secret environment variables baked into any of the layers of a runtime image. So we'll combine what we've learned so far with a multi-stage build, where the credentials stay in the build stage and only the installed packages are copied into the final image, and we're off to the races!
In my real container build, I needed to install python3-dev and a bunch of other things to pull down my dependencies and build wheels, so a multi-stage build drops my final image size from >1GB down to ~100MB anyway.
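A minimal sketch of that multi-stage layout follows. It is an illustration under stated assumptions rather than my actual pipeline files: the base image, the apt package list, the /opt/venv path, requirements.txt, and the my_service module name are all hypothetical placeholders.

```dockerfile
# --- Build stage: sees the feed credentials and carries the build toolchain ---
FROM python:3.11-slim AS builder

# Passed in from the pipeline after PipAuthenticate has run, for example:
#   docker build --build-arg PIP_EXTRA_INDEX_URL="$PIP_EXTRA_INDEX_URL" .
# An ARG is visible to RUN instructions in this stage as an environment
# variable, so pip picks it up without any ENV instruction.
ARG PIP_EXTRA_INDEX_URL

# Toolchain for building wheels (hypothetical list; adjust to your dependencies).
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*

# Install into a virtual environment so it can be copied as a single directory.
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

COPY requirements.txt .
RUN python -m pip install --no-cache-dir -r requirements.txt

# --- Runtime stage: no ARG, no toolchain, only the installed packages ---
FROM python:3.11-slim

COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

WORKDIR /app
COPY . .

# Hypothetical entry point.
CMD ["python", "-m", "my_service"]
```

Only the layers of the last stage end up in the final image, so the build arg used in the builder stage is not part of the runtime image's history. The builder stage and its cache still contain it, so treat the build host accordingly.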
I've attached a few sample files that I pulled from my working pipeline to get you started with this approach. I hope this helps, and I plan for this post to become obsolete soon, once I complete a few pull requests to the Microsoft docs!