Docker CLI basics
The Docker CLI is usually referred to as the "docker engine"; see the docs at https://docs.docker.com/engine/reference/commandline/cli/
docker has plenty of subcommands, which are listed when you just run docker without arguments.
Let's see what's happening in our docker environment
$ docker info
gives information about our docker daemon (server), and from this info we can see our current server version
Server Version: 17.09.1-ce
and that we have something like
CPUs: 1
Total Memory: 1.952GiB
available for our containers. On Linux these numbers match the host machine, but on Mac/Win they reflect the CPUs/RAM allocated to the virtual machine.
Before moving forward, let's change the defaults: open Docker preferences from the top menu icon. Then, under "advanced", set the number of CPUs to max or max-1 and hit apply & restart. While the machine is restarting, the
docker info will return:
Error response from daemon: Bad response from Docker engine
Which means that the docker CLI cannot connect to the server. After the machine has restarted, we can see that the
docker info will match our settings.
Now let's enter the virtual machine either with:
$ screen ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/tty
and hit enter to get something on the screen.
Or, we can also get in with a docker container from the internets:
$ docker run --rm -it --privileged --pid=host walkerlee/nsenter -t 1 -m -u -i -n sh
and then we can see with
$ uname -a
Linux moby 4.9.49-moby #1 SMP Fri Dec 8 19:40:02 UTC 2017 x86_64 Linux
that this host "moby" is based on Alpine Linux, as we can see from (DEPRECATED IN BETA):
$ cat /etc/alpine-release
3.5.0
Enough of this VM stuff, let's get started!
$ docker images
lists the images we have. The list should include hello-world, which was run with docker run during the install.
Let's run it a couple more times
$ docker run hello-world
$ docker run hello-world
Just like docker info, the docker images command connected to the docker server. Let's test that connection quickly:
On a mac we can see that
/var/run/docker.sock is linked to the VM's socket:
$ ls -l /var/run/docker.sock
lrwxr-xr-x 1 root daemon 56 Dec 13 15:24 /var/run/docker.sock -> /Users/mpa/Library/Containers/com.docker.docker/Data/s60
Luckily curl supports unix sockets nowadays. So to do what
docker images just did we can:
$ curl --unix-socket /var/run/docker.sock http://localhost/images/json
we can also list
/containers/json etc, see https://docs.docker.com/engine/api/v1.35/#operation/ContainerList
$ curl --unix-socket /var/run/docker.sock http://localhost/containers/json
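The same socket trick works for the other endpoints of the Engine API too; for instance, a quick sketch querying the daemon version over the socket:

$ curl --unix-socket /var/run/docker.sock http://localhost/version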
Let's use proper tools instead of
curl and try removing the image by...
$ docker rmi hello-world
...which should fail with an error
Error response from daemon: conflict: unable to remove repository reference "hello-world" (must force) - container 3d4bab29dd67 is using its referenced image f2a91732366c
because the container created from the image is still present as you can (not) see in
$ docker ps
which lists, by default, only the running containers. Since that container is no longer running (it's designed to just print its message and exit), we need to say
$ docker ps -a
to show all containers in the daemon. Notice that containers have a container ID and a container name; the name is autogenerated to be something like "confident_golick".
When we have a lot of different containers, we can naturally filter the list with grep or such
$ docker ps -a | grep hello-world
or by docker ps filters (https://docs.docker.com/engine/reference/commandline/ps/#filtering)
$ docker ps -a -f "ancestor=hello-world"
Now we could remove the image by force with
docker rmi --force hello-world, but it would not remove our containers. If we run
$ docker ps -a | grep hello-world
$ docker ps -a -f "ancestor=hello-world"
It just appears that the containers were removed: with grep we are matching "hello-world", and our filter uses ancestor matching, but that ancestor was removed with
rmi --force. If you check
docker ps -a, or grep with docker ps -a | grep hello, you can see that our containers are still there.
We could remove the containers one by one with
docker rm <container name or id>, but let's train our xargs skills first instead, because you're gonna need those with the Docker CLI later on. If you ran the
rmi --force, then re-run
docker run hello-world again.
$ docker ps -a -q -f "ancestor=hello-world"
-q will output only the numeric container IDs, which we can then use with xargs like:
$ docker ps -aq -f "ancestor=hello-world" | xargs docker rm
This works because the container was stopped. If the container was still running it wouldn't work.
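If some of the containers were still running, one way to handle it is to stop them in the same style of pipeline before removing; a minimal sketch:

$ docker ps -q -f "ancestor=hello-world" | xargs docker stop
$ docker ps -aq -f "ancestor=hello-world" | xargs docker rm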
Let's start a container that doesn't exit immediately, for example
$ docker run -d nginx
This will download the nginx image and start a container from it in the background with
-d (detach), which can be seen with
$ docker ps
Now if we try to
rm it, it will fail:
$ docker rm $(docker ps -q)
Error response from daemon: You cannot remove a running container f72c583c982ca686b0826fdc447f04710e78ff6c25dc1ddc7c427cc35eadf5f0. Stop the container before attempting removal or force remove
Now we can either
docker rm --force it or
docker stop <container id or name> and then
docker rm it.
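For example, using the (shortened) container ID from the error above:

$ docker stop f72c583c982c
$ docker rm f72c583c982c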
As the xargs example showed, docker can consume multiple arguments, so you can also
docker rm id1 id2 id3
It's common that over time the docker daemon gets clogged with images and containers lying around, because it's not natural to clean up everything all the time.
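As a sidenote, recent Docker versions ship prune subcommands for exactly this; a minimal sketch (these delete stopped containers and dangling images, so use with care):

$ docker container prune   # remove all stopped containers
$ docker image prune       # remove dangling (untagged) images
$ docker system prune      # both of the above, plus unused networks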
Where do the images come from?
We can search for the images with
$ docker search hello-world
that searches images from https://hub.docker.com/
We get plenty of results like
hello-world
kitematic/hello-world-nginx
tutum/hello-world
...
hello-world image has a web page at https://hub.docker.com/_/hello-world/ - these images without a prefix (aka org/user) are built from git repositories in https://github.com/docker-library
We really can't know where
kitematic/hello-world-nginx is built, since this page https://hub.docker.com/r/kitematic/hello-world-nginx/ has no links to any repos. The only thing we know is that the image is 3 years old.
Also notice that there are no visible guarantees that https://hub.docker.com/_/hello-world/ comes from https://github.com/docker-library/hello-world. The "Full Description" has links to that repo, but they may not be true.
In the third result,
tutum/hello-world, you can see that it's
Automated, so in https://hub.docker.com/r/tutum/hello-world/ the "Source Repository" is linked AND in the "Build Details" tab we can see what actually happened during the builds: https://hub.docker.com/r/tutum/hello-world/builds/
There are also other Docker registries, such as https://quay.io/ that competes with Docker Hub. Naturally
docker search cannot be used to search these registries, so we have to use the site https://quay.io/search?q=hello and select a result like https://quay.io/repository/nordstrom/hello-world where it's shown how to pull from this registry:
$ docker pull quay.io/nordstrom/hello-world
So by default if the host (here:
quay.io) is omitted, it will pull from Docker Hub. From the docker engine/daemon/server/CLI's point of view it doesn't matter where the image comes from; it just needs to be stored (pulled) locally.
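To illustrate, these two commands should pull the exact same image, the latter just spelling out the default registry and namespace:

$ docker pull ubuntu
$ docker pull docker.io/library/ubuntu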
Let's move on to inspect something more relevant than 'hello-world', for example Ubuntu: https://hub.docker.com/r/library/ubuntu/ - that is one of the most common Docker images to use as a base for your own image.
The description/readme says:
What's in this image? This image is built from official rootfs tarballs provided by Canonical (specifically, https://partner-images.canonical.com/core/).
From the links we can guess (not truly know) that the image is built from https://github.com/tianon/docker-brew-ubuntu-core - so by a guy named "Tianon Gravi", who describes himself with "bash, debian, father, gentoo, go, perl, tron, vim, vw; basically nine years old" in his GitHub profile.
In that git repository's README in https://github.com/tianon/docker-brew-ubuntu-core/tree/1637ff264a1654f77807ce53522eff7f6a57b773#scripts-to-prepare-updates-to-the-ubuntu-official-docker-images it says:
Some more Jenkins happens
which means that the images are built somewhere on a Jenkins server.
Anyway, let's pull this beast:
$ docker pull ubuntu
Using default tag: latest
latest: Pulling from library/ubuntu
Since we didn't specify a tag, we got
latest, which is usually the most recently built and pushed image in the registry, but in this case the repo readme says that
The ubuntu:latest tag points to the "latest LTS", since that's the version recommended for general use.
From https://hub.docker.com/r/library/ubuntu/tags/ we can see that there are tags like
16.04 which (should) give us the guarantee that the image is based on Ubuntu 16.04. Let's pull that now:
$ docker pull ubuntu:16.04
16.04: Pulling from library/ubuntu
c2ca09a1934b: Downloading [============================================>      ]  34.25MB/38.64MB
d6c3619d2153: Download complete
0efe07335a04: Download complete
6b1bb01b3a3b: Download complete
43a98c187399: Download complete
Images are composed of different layers that are downloaded in parallel to speed up the download.
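We can peek at the layer digests of a pulled image locally; a quick sketch, assuming this docker version exposes the RootFS field in inspect:

$ docker inspect ubuntu:16.04 -f '{{.RootFS.Layers}}'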
(PRO-TIP: For command line fetching of all available tags we can do something like https://stackoverflow.com/questions/28320134/how-to-list-all-tags-for-a-docker-image-on-a-remote-registry)
We can tag images locally if we wish, for example
$ docker tag ubuntu:16.04 ubuntu:best_version
But actually tagging is also a way to "rename" the image:
$ docker tag ubuntu:16.04 best_distro:best_version
Now we create a new container with
uptime as the command by saying
$ docker run best_distro:best_version uptime
 18:09:26 up 55 min,  0 users,  load average: 0.00, 0.01, 0.00
Mac/win only: Again, notice how the uptime is the uptime of your moby virtual machine.
Let's see how our image was really built from https://hub.docker.com/r/_/ubuntu/ by clicking our 16.04 Dockerfile link: https://github.com/tianon/docker-brew-ubuntu-core/blob/85822fe532df3854da30b4829c31878ac51bcb91/xenial/Dockerfile
We get to the
Dockerfile that specifies all the commands that were used to create this image.
The first line states that the image starts from a special image "scratch" that is just empty. Then a file
ubuntu-xenial-core-cloudimg-amd64-root.tar.gz is added to the root from the same directory: https://github.com/tianon/docker-brew-ubuntu-core/tree/85822fe532df3854da30b4829c31878ac51bcb91/xenial
This file should be the "..official rootfs tarballs provided by Canonical" mentioned earlier, but it's not actually coming from https://partner-images.canonical.com/core/xenial/current/, it's copied to the repo owned by "tianon". We could verify the checksums of the file if we are interested.
Notice how the file is not extracted at any point. This is because the
ADD documentation at https://docs.docker.com/engine/reference/builder/#add states that "If <src> is a local tar archive in a recognized compression format (identity, gzip, bzip2 or xz) then it is unpacked as a directory."
We can be pretty sure that the
ubuntu:16.04 that we just downloaded is this image, because
$ docker history --no-trunc best_distro:best_version
matches the directives specified in the
Dockerfile. We could also build the image ourselves if we really wanted - there is nothing special in the "official" image and the build process is, as we saw, truly open.
Let's run a container in the background
$ docker run -d --name looper ubuntu:16.04 sh -c 'while true; do date; sleep 1; done'
2a49df3ba735c8a9b813c11f1c842606c1e94a6265c7c0bd5bd988cf942b8149
And check that it's running
$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
2a49df3ba735        ubuntu:16.04        "sh -c 'while true..."   6 seconds ago       Up 1 second                             looper
Because we gave
--name looper to the container, we can now reference it easily:
$ docker logs -f looper
Mon Jan 15 19:25:53 UTC 2018
Mon Jan 15 19:25:54 UTC 2018
Mon Jan 15 19:25:55 UTC 2018
...
Now, in another terminal try
$ docker pause looper
And see how the logs -f output pauses. Then unpause with
$ docker unpause looper
Attach to the container:
$ docker attach looper
Mon Jan 15 19:26:54 UTC 2018
Mon Jan 15 19:26:55 UTC 2018
...
Now you have the logs (STDOUT) running in two terminals. In the attach window, press control+c. The container is killed, because the ^C is passed on to the process with pid 1 (the sh running our while loop).
Start the container again and attach to it with
--sig-proxy=false, which disables signal proxying. Then when you hit ^C ...
$ docker start looper
$ docker attach --sig-proxy=false looper
Mon Jan 15 19:27:54 UTC 2018
Mon Jan 15 19:27:55 UTC 2018
^C
The container stays running; you are just disconnected from the STDOUT.
To enter our container, we can start a new process in it.
$ docker exec -it looper bash
root@2a49df3ba735:/# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   4496  1716 ?        Ss   10:31   0:00 sh -c while true; do date; sleep 1; done
root       271  0.0  0.0   4496   704 ?        Ss   10:33   0:00 sh
root       300  0.0  0.0  18380  3364 pts/0    Ss   10:33   0:00 bash
root       386  0.0  0.0   4368   672 ?        S    10:33   0:00 sleep 1
root       387  0.0  0.0  36836  2900 pts/0    R+   10:34   0:00 ps aux
In our command, -it is short for -i and -t, where -i means "interactive, connect STDIN" and -t "allocate a pseudo-TTY". From the
ps aux listing we can see that our
bash process got pid 300. We can terminate the container with
kill 1 inside the container, or exit the shell and then run
$ docker kill looper
$ docker rm looper
The previous two commands would be basically the same as
docker rm --force looper
Let's start another process with
-it and also with
--rm to remove it automatically after it has exited. This means that there are no garbage containers left behind, but also that
docker start can not be used to start the container after it has exited.
$ docker run -d --rm -it --name looper-it ubuntu:16.04 sh -c 'while true; do date; sleep 1; done'
7d4b4e097931e2aafc62ee9be31bbc58f47b631ead04bd7d2c8dba3abc148137
Now let's attach to the container and hit control+p, control+q, which detaches us from the STDOUT. This detach sequence can be changed with the --detach-keys option.
$ docker attach looper-it
Mon Jan 15 19:50:42 UTC 2018
Mon Jan 15 19:50:43 UTC 2018
^P^Qread escape sequence
Note that hitting
^C would still kill (and remove due to
--rm) the process because the
docker attach was done without --sig-proxy=false.
My first image
Let's create a file called Dockerfile with the following contents:
FROM ubuntu:16.04

WORKDIR /mydir
RUN touch hello.txt
COPY local.txt .
RUN wget http://example.com/index.html
- WORKDIR will create and set the current working directory to /mydir after this directive
- RUN will execute a command with a /bin/sh -c prefix - because of WORKDIR this is essentially the same as RUN touch /mydir/hello.txt
- COPY copies a local file (first argument) to the path given as the second argument. It's preferred to use COPY instead of ADD when you are just adding files (ADD has all kinds of magic behaviour attached to it)
Then we'll build it by running the build command with the context argument ., which means that we have to be in the same directory (we could run the build from another directory and then give the path here)
$ docker build .
This fails in the
COPY because the
local.txt doesn't exist. Fix that and build again to see the next error.
Before fixing the next error, notice how all the steps that haven't changed say
---> Using cache - this is because the Docker daemon caches each operation (layer) for speed. Changing any build directive invalidates all the caches after that line.
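To illustrate the cache behaviour, here's a hypothetical ordering (the comments are ours):

RUN apt-get update                      # cached on rebuilds
RUN apt-get install -y wget             # cached on rebuilds
COPY local.txt .                        # editing local.txt invalidates the cache from here on
RUN wget http://example.com/index.html  # re-executed whenever the line above changed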
Now we will find out that
wget doesn't exist in the Ubuntu base image. We'll need to add it with
apt-get as this is Ubuntu. But, if we just add:
RUN apt-get install -y wget
It will fail, because the apt sources are not included in the image, to bring down its size (and they would be old anyway). When we add the lines
RUN apt-get update
RUN apt-get install -y wget
the image should build nicely and at the end it will say something like
Successfully built 66b527252f32, where 66b527252f32 is the ID of our image. This is not ideal, because now we would need a separate
docker tag 66b527252f32 myfirst to have a sensible name for it, so let's run the build again to also tag it:
$ docker build -t myfirst .
Before running our image we have a looming problem ahead of us: apt-get update is run in a separate, cached step. If we add another package to the apt-get install -y line some other day, the sources might have changed and the install will fail. When a command depends on another command, it's best practice to run them together, like this:
RUN apt-get update && apt-get install -y wget
Now let's run our image - note that we don't have to give a command (to be run in the container) after the image since the ubuntu base image sets it to
bash at the last line: https://github.com/tianon/docker-brew-ubuntu-core/blob/1637ff264a1654f77807ce53522eff7f6a57b773/artful/Dockerfile#L47
$ docker run -it myfirst
root@accf99660aeb:/mydir# ls
hello.txt  index.html  local.txt
WORKDIR was last set to
/mydir so our inherited
bash command is started in that directory. Also note how our hostname
accf99660aeb equals the container ID. Before exiting the container, let's create one file manually (in addition to the files created by our Dockerfile):
$ touch manually.txt
$ exit
Now we can use docker diff to compare the changes between our image
myfirst and container:
$ docker diff accf99660aeb
C /mydir
A /mydir/manually.txt
C /root
A /root/.bash_history
What we discover is that, in addition to our manually.txt file, bash "secretly" created a history file. We could create a new image from these changes (
myfirst + changes = newimage) with
$ docker commit accf99660aeb myfirst-pluschanges
Let's try creating a new container from the new image, this time by setting the command to "ls -l". Also notice how we don't have to allocate a pseudo-TTY or connect STDIN, since our command is not interactive (and will exit immediately anyway)
$ docker run myfirst-pluschanges ls -l
total 4
-rw-r--r-- 1 root root    0 Jan  5 11:59 hello.txt
-rw------- 1 root root 1270 Aug  9  2013 index.html
-rw-r--r-- 1 root root    0 Jan  5 12:18 manually.txt
And as expected, our
manually.txt file is now in the image.
Now let's start moving towards a more meaningful image.
youtube-dl is a program that downloads youtube videos: https://rg3.github.io/youtube-dl/download.html Let's add it to the image - but this time, instead of doing it directly in the
Dockerfile, let's try another approach that is sometimes easier than our current add-and-rebuild loop. We'll open up an interactive session, test things out first, and then "store" the working commands in our Dockerfile. By following the youtube-dl install instructions blindly we'll see that...
$ docker run -it myfirst
root@8c587232a608:/mydir# sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
bash: sudo: command not found
sudo is not installed, but since we are
root we don't need it now, so let's try again without...
root@8c587232a608:/mydir# curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
bash: curl: command not found
..and we see that curl is not installed either - we could just revert to using
wget, but as an exercise, let's install curl with
apt-get, since we already have the apt sources in our image (which hopefully are still valid)
$ apt-get install -y curl
$ curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
Then we'll add permissions and run it:
$ chmod a+rx /usr/local/bin/youtube-dl
$ youtube-dl
/usr/bin/env: 'python': No such file or directory
Okay - at the top of the
youtube-dl download page we'll notice that
Remember youtube-dl requires Python version 2.6, 2.7, or 3.2+ to work except for Windows exe.
So let's add python
$ apt-get install -y python
And let's run it again
$ youtube-dl
WARNING: Assuming --restrict-filenames since file system encoding cannot encode all characters. Set the LC_ALL environment variable to fix this.
Usage: youtube-dl [OPTIONS] URL [URL...]

youtube-dl: error: You must provide at least one URL.
Type youtube-dl --help to see a list of all options.
It works (we just need to give it a URL), but we notice that it outputs a warning about
LC_ALL. In a regular Ubuntu desktop/server install the localization settings are (usually) set, but in this image they are not set, as we can see by running
env in our container. So according to https://unix.stackexchange.com/questions/87745/what-does-lc-all-c-do just setting it to
LC_ALL=C might be a good fix, so let's try that.
$ LC_ALL=C youtube-dl
Nope, same error. By Googling around you might end up in this thread: https://stackoverflow.com/questions/28405902/how-to-set-the-locale-inside-a-docker-container/41648500, but the best answer is not the most upvoted. To fix this without installing additional locales, see this: https://stackoverflow.com/a/41648500
$ LC_ALL=C.UTF-8 youtube-dl
And it works! Let's persist it for our session and try downloading a video:
$ export LC_ALL=C.UTF-8
$ youtube-dl https://www.youtube.com/watch?v=UFLCdmfGs7E
So now that we know what to do, let's add these to the bottom of our
Dockerfile - by adding the instructions to the bottom we preserve our cached layers. This is a handy practice to speed up creating the initial version of a Dockerfile when it has time-consuming operations like downloads.
...
RUN apt-get install -y curl python
RUN curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
RUN chmod a+x /usr/local/bin/youtube-dl
ENV LC_ALL=C.UTF-8
CMD ["/usr/local/bin/youtube-dl"]
- Instead of using RUN export LC_ALL=C.UTF-8 we store the env directly in the image with the ENV directive
- We also override bash as our image command (set in the base image) with youtube-dl itself. This won't work quite as expected, but let's see why.
When we build this as
$ docker build -t youtube-dl .
And run it:
$ docker run youtube-dl https://www.youtube.com/watch?v=UFLCdmfGs7E
Usage: youtube-dl [OPTIONS] URL [URL...]

youtube-dl: error: You must provide at least one URL.
Type youtube-dl --help to see a list of all options.
So far so good, but now the natural way to use this image would be to give the URL as an argument:
$ docker run youtube-dl http://www.youtube.com
/usr/local/bin/docker: Error response from daemon: OCI runtime create failed: container_linux.go:296: starting container process caused "exec: \"http://www.youtube.com\": stat http://www.youtube.com: no such file or directory": unknown.
ERRO error waiting for container: context canceled
Now our URL became the command (CMD). Luckily we have another way to do this: we can use ENTRYPOINT to define the main executable, and docker will then append the run arguments to it.
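So at the bottom of the Dockerfile we swap the CMD line for:

ENTRYPOINT ["/usr/local/bin/youtube-dl"]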
And now it works like it should:
$ docker build -t youtube-dl .
$ docker run youtube-dl https://www.youtube.com/watch\?v\=UFLCdmfGs7E
[youtube] UFLCdmfGs7E: Downloading webpage
[youtube] UFLCdmfGs7E: Downloading video info webpage
[youtube] UFLCdmfGs7E: Extracting video information
[download] Destination: Short introduction to Docker (Scribe)-UFLCdmfGs7E.mp4
[download] 100% of 3.02MiB in 00:0072MiB/s ETA 00:003
Now there's one more thing about CMD (and ENTRYPOINT) that might be confusing - there are two ways to set them: exec form and shell form. We've been using the exec form, where the command itself is executed. In the shell form the command is wrapped with /bin/sh -c - this is useful when you need to evaluate environment variables in the command, like $MYSQL_PASSWORD or similar.
In the shell form the command is provided as a string without brackets. In the exec form the command and its arguments are provided as a list (with brackets); see the table below:
| Dockerfile | Resulting command |
| --- | --- |
| ENTRYPOINT /bin/ping -c 3 <br> CMD localhost | /bin/sh -c '/bin/ping -c 3' /bin/sh -c localhost |
| ENTRYPOINT ["/bin/ping","-c","3"] <br> CMD localhost | /bin/ping -c 3 /bin/sh -c localhost |
| ENTRYPOINT /bin/ping -c 3 <br> CMD ["localhost"] | /bin/sh -c '/bin/ping -c 3' localhost |
| ENTRYPOINT ["/bin/ping","-c","3"] <br> CMD ["localhost"] | /bin/ping -c 3 localhost |
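To see why the shell form exists, here's a small sketch (the variable is made up for illustration): the shell form lets /bin/sh expand the variable at runtime, while the exec form passes the literal string through:

# shell form: /bin/sh -c expands $MYSQL_PASSWORD when the container starts
CMD echo "password is $MYSQL_PASSWORD"

# exec form: no shell involved, so the literal text "$MYSQL_PASSWORD" is printed
CMD ["echo", "password is $MYSQL_PASSWORD"]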
Now we have two problems:
- Minor: Our container build process creates many layers resulting in increased image size
- Major: The downloaded files stay in the container
Let's fix the major issue first.
With docker ps -a we can see all our previous runs. When we limit this list with
$ docker ps -a --last 3
CONTAINER ID        IMAGE               COMMAND                   CREATED                  STATUS                          PORTS               NAMES
be9fdbcafb23        youtube-dl          "/usr/local/bin/yout…"    Less than a second ago   Exited (0) About a minute ago                       determined_elion
b61e4029f997        f2210c2591a1        "/bin/sh -c \"/usr/lo…"   Less than a second ago   Exited (2) About a minute ago                       vigorous_bardeen
326bb4f5af1e        f2210c2591a1        "/bin/sh -c \"/usr/lo…"   About a minute ago       Exited (2) 3 minutes ago                            hardcore_carson
We'll see that the latest container is be9fdbcafb23, or determined_elion for us humans.
$ docker diff determined_elion
C /mydir
A /mydir/Short introduction to Docker (Scribe)-UFLCdmfGs7E.mp4
Let's use the docker cp command to copy the file out (notice the quotes, because our filename has spaces):
$ docker cp "determined_elion:/mydir/Short introduction to Docker (Scribe)-UFLCdmfGs7E.mp4" .
And now we have our file locally. This doesn't really fix our issue, so let's continue:
Volumes: bind mount
By bind mounting a host (our machine) folder to the container we can get the file directly to our machine. Let's start another run with
-v option, which requires an absolute path. We mount our current folder as
/mydir in our container, overwriting everything that we have put in that folder in our Dockerfile.
$ docker run -v $(pwd):/mydir youtube-dl https://www.youtube.com/watch\?v\=UFLCdmfGs7E
Note: the Docker for Mac/Win has some magic so that the directories from our host become available for the
moby virtual machine allowing our command to work as it would on a Linux machine.
Optimizing the Dockerfile
Now we'll fix the minor problem: our Dockerfile is non-logical and not very space efficient. In the first version we just rearrange the commands so that the build process is logical:
FROM ubuntu:16.04

ENV LC_ALL=C.UTF-8

RUN apt-get update && apt-get install -y \
    curl python

RUN curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl

RUN chmod a+x /usr/local/bin/youtube-dl

WORKDIR /app

ENTRYPOINT ["/usr/local/bin/youtube-dl"]
We have also changed the
WORKDIR to be
/app, as it's a fairly common convention to put your own stuff in /app in public docker images. For this image, where we essentially download videos, a
WORKDIR /videos or similar might also make sense.
In the next phase we'll glue all
RUN commands together to reduce the number of layers we are making in our image.
FROM ubuntu:16.04

ENV LC_ALL=C.UTF-8

RUN apt-get update && apt-get install -y \
    curl python && \
    curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl && \
    chmod a+x /usr/local/bin/youtube-dl

WORKDIR /app

ENTRYPOINT ["/usr/local/bin/youtube-dl"]
As a sidenote not directly related to docker: remember that, if needed, it is possible to pin packages to versions with
curl=1.2.3 - this ensures that if the image is built at a later date it is more likely to work, because the versions are exact. On the other hand, the packages will be old and have security issues.
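A hedged sketch of pinning (the version strings below are made up for illustration; on Ubuntu you can discover real ones with apt-cache madison <package>):

# hypothetical pinned versions - replace with output of `apt-cache madison curl` etc.
RUN apt-get update && apt-get install -y \
    curl=7.47.0-1ubuntu2 \
    python=2.7.11-1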
With docker history we can see that our single
RUN layer adds 85.2 megabytes to the image:
$ docker history youtube-dl
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
295b16d6560a        30 minutes ago      /bin/sh -c #(nop)  ENTRYPOINT ["/usr/local...   0B
f65f66bbae17        30 minutes ago      /bin/sh -c #(nop) WORKDIR /app                  0B
89592bae75a8        30 minutes ago      /bin/sh -c apt-get update && apt-get insta...   85.2MB
...
The next step is to remove everything that is not needed in the final image. We don't need the apt source lists anymore, so we'll glue the next lines onto our single RUN:
.. && \
    rm -rf /var/lib/apt/lists/*
Now when we build, we'll see that the size of the layer is 45.6 megabytes. We can optimize even further by removing curl and all the dependencies it installed with
.. && \
    apt-get purge -y --auto-remove curl && \
    rm -rf /var/lib/apt/lists/*
..which brings us down to 34.9MB
Now our slimmed down container should work, but:
$ docker run -v "$(pwd):/app" youtube-dl https://www.youtube.com/watch\?v\=EUHcNeg_e9g
[youtube] EUHcNeg_e9g: Downloading webpage
ERROR: Unable to download webpage: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:661)> (caused by URLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:661)'),))
This is because --auto-remove also removed dependencies, like:
Removing ca-certificates (20170717~16.04.1) ...
We can now see that our
youtube-dl previously worked only because of curl's dependencies. If
youtube-dl had been installed as a package, it would have declared
ca-certificates as its dependency.
Now we could first
purge --auto-remove and then add
ca-certificates back with
apt-get install, or just install
ca-certificates along with the other packages before removing curl:
FROM ubuntu:16.04

ENV LC_ALL=C.UTF-8

RUN apt-get update && apt-get install -y \
    curl python ca-certificates && \
    curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl && \
    chmod a+x /usr/local/bin/youtube-dl && \
    apt-get purge -y --auto-remove curl && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

ENTRYPOINT ["/usr/local/bin/youtube-dl"]
From the build output we can see that
ca-certificates also adds
The following additional packages will be installed:
  openssl
The following NEW packages will be installed:
  ca-certificates openssl
and this brings us to 36.4 megabytes in our
RUN layer (from the original 85.2 megabytes)
Our process (youtube-dl) could in theory escape the container due to a bug in docker/kernel. To mitigate this we'll add a non-root user to our container and run the process as that user. Another option would be to map the root user to a high, non-existing user id on the host with https://docs.docker.com/engine/security/userns-remap/, but this is a fairly new feature and not enabled by default.
.. && \
    useradd -m app
And then we change the user with the directive
USER app - so all commands after this line will be executed as our new user, including the ENTRYPOINT:
FROM ubuntu:16.04

ENV LC_ALL=C.UTF-8

RUN apt-get update && apt-get install -y \
    curl python ca-certificates && \
    curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl && \
    chmod a+x /usr/local/bin/youtube-dl && \
    apt-get purge -y --auto-remove curl && \
    rm -rf /var/lib/apt/lists/* && \
    useradd -m app

USER app

WORKDIR /app

ENTRYPOINT ["/usr/local/bin/youtube-dl"]
When we run this image without bind mounting our local directory:
$ docker run youtube-dl https://www.youtube.com/watch\?v\=UFLCdmfGs7E
[youtube] UFLCdmfGs7E: Downloading webpage
[youtube] UFLCdmfGs7E: Downloading video info webpage
[youtube] UFLCdmfGs7E: Extracting video information
ERROR: unable to open for writing: [Errno 13] Permission denied: 'Short introduction to Docker (Scribe)-UFLCdmfGs7E.mp4.part'
We'll see that our
app user cannot write to
/app - this can be fixed with
chown, or not fixed at all if the intended usage is to always have
/app mounted from the host.
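A minimal sketch of the chown fix, assuming we want the image to be writable on its own (the directory is created and handed to the user before USER switches to it):

.. && \
    useradd -m app && \
    mkdir /app && \
    chown app /app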
Publishing to Docker Hub
If we want to share our image publicly, we need to tag it as
our_dockerhub_username/youtube-dl and log in to Docker Hub with docker login. Then we can tag and push:
$ docker tag youtube-dl mattipaksula/youtube-dl
$ docker push mattipaksula/youtube-dl
The push refers to a repository [docker.io/mattipaksula/youtube-dl]
582af28a5d38: Pushed
22c7a6ee7548: Pushed
3ff70ce53dac: Mounted from library/ubuntu
b8e5935ae7cc: Mounted from library/ubuntu
ba76b502dc9b: Mounted from library/ubuntu
803030df23c1: Mounted from library/ubuntu
db8686e0ca43: Mounted from library/ubuntu
latest: digest: sha256:ad1038acd11ed87ec013b5b7251a02ce4c0e9e8c08acd7070458cc085f3f53ee size: 1775
From the output we can see that the existing shared Ubuntu layers are re-used (mounted) from the ubuntu base image.
Alpine Linux variant
Our Ubuntu base image adds the most megabytes to our image (approx 113MB). Alpine Linux provides a popular alternative base at https://hub.docker.com/_/alpine/ that is around 4 megabytes. It's based on the musl libc (an alternative to glibc) and busybox binaries, so not all software runs well (or at all) with it, but our python container should run just fine. We'll create the following Dockerfile.alpine:
FROM alpine:3.7

ENV LC_ALL=C.UTF-8

RUN apk add --no-cache curl python ca-certificates && \
    curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl && \
    chmod a+x /usr/local/bin/youtube-dl && \
    apk del curl && \
    adduser -D app

USER app

WORKDIR /app

ENTRYPOINT ["/usr/local/bin/youtube-dl"]
- The package manager is apk, and with the --no-cache option it can work without downloading the sources (caches) first
- useradd is missing, but adduser exists
- Most of the package names are the same - there's a good package browser at https://pkgs.alpinelinux.org/packages
Now when we build this file with
:alpine-3.7 as the tag:
$ docker build -t youtube-dl:alpine-3.7 -f Dockerfile.alpine .
It seems to run fine:
$ docker run -v "$(pwd):/app" youtube-dl:alpine-3.7 https://www.youtube.com/watch\?v\=EUHcNeg_e9g
From the history we can see that our single RUN layer size is 41.1MB:
$ docker history youtube-dl:alpine-3.7
IMAGE...
...
14cfb0b531fb        20 seconds ago      /bin/sh -c apk add --no-cache curl python ca…   41.1MB
...
<missing>           3 weeks ago         /bin/sh -c #(nop) ADD file:093f0723fa46f6cdb…   4.15MB
So in total our Alpine variant is about 45 megabytes, significantly less than our Ubuntu based image.
We can publish both variants by publishing this tag as well:
$ docker tag youtube-dl:alpine-3.7 mattipaksula/youtube-dl:alpine-3.7
$ docker push mattipaksula/youtube-dl:alpine-3.7
OR, we could just replace our Ubuntu image, affecting everybody who might be depending on it being Ubuntu:
$ docker tag youtube-dl:alpine-3.7 mattipaksula/youtube-dl
Also remember that, unless otherwise specified, the
:latest tag will always just refer to the latest image built & pushed - which can basically contain anything.
Even with a simple image, we've already been dealing with plenty of command line options in both building+pushing and running the image:
$ docker build -t youtube-dl:alpine-3.7 -f Dockerfile.alpine .
$ docker run -v "$(pwd):/app" youtube-dl:alpine-3.7 https://youtube...
Now we'll switch to a tool called
docker-compose to manage these with YAML. We'll create a file called docker-compose.yml:
version: '3.4'

services:
  youtube-dl-ubuntu:
    image: mattipaksula/youtube-dl:ubuntu-16.04
    build: .
  youtube-dl-alpine:
    image: mattipaksula/youtube-dl:alpine-3.7
    build:
      context: .
      dockerfile: Dockerfile.alpine
The version setting is not very strict; it just needs to be above 2, because otherwise the syntax is significantly different. See https://docs.docker.com/compose/compose-file/ for more info. The key
build: can be set to a path (as for ubuntu) or to an object with context and dockerfile keys (as for alpine).
Now we can build and push both variants with just these commands:
$ docker-compose build $ docker-compose push
To run the image as we did previously, we'll need to add the volume bind mounts. Compose can work without an absolute path:
version: '3.4'

services:
  youtube-dl-ubuntu:
    image: mattipaksula/youtube-dl:ubuntu-16.04
    build: .
    volumes:
      - .:/app
  youtube-dl-alpine:
    image: mattipaksula/youtube-dl:alpine-3.7
    build:
      context: .
      dockerfile: Dockerfile.alpine
    volumes:
      - .:/app
Now we can run it:
$ docker-compose run youtube-dl-ubuntu https://www.youtube.com/watch\?v\=EUHcNeg_e9g
Compose is really meant for running web services, so let's move from simple binary wrappers to running an HTTP service.
https://github.com/jwilder/whoami is a simple service that prints the current container id (hostname).
$ docker run -d -p 8000:8000 jwilder/whoami
736ab83847bb12dddd8b09969433f3a02d64d5b0be48f7a5c59a594e3a6a3541
$ curl localhost:8000
I'm 736ab83847bb
Take down the container so that it's not blocking port 8000:
$ docker rm -f 736ab83847bb
Let's create a whoami/docker-compose.yml from the command line options (you can also use something like https://github.com/magicmark/composerize):
version: '3.4'

services:
  whoami:
    image: jwilder/whoami
    ports:
      - 8000:8000
$ docker-compose up -d
$ curl localhost:8000
Compose can scale the service to run multiple instances:
$ docker-compose up --scale whoami=3
WARNING: The "whoami" service specifies a port on the host. If multiple containers for this service are created on a single host, the port will clash.
Starting whoami_whoami_1 ... done
Creating whoami_whoami_2 ... error
Creating whoami_whoami_3 ... error
But it will fail with a port clash. If we don't specify the host port, a free port will be allocated:
$ docker-compose port --index 1 whoami 8000
0.0.0.0:32770
$ docker-compose port --index 2 whoami 8000
0.0.0.0:32769
$ docker-compose port --index 3 whoami 8000
0.0.0.0:32768
We can curl from these ports:
$ curl 0.0.0.0:32769
I'm 536e11304357
$ curl 0.0.0.0:32768
I'm 1ae20cd990f7
In a server environment you'd normally have a load balancer in front of the service. For a local environment (or a single server), one good solution is https://github.com/jwilder/nginx-proxy, which configures nginx from the docker daemon as containers are started and stopped.
Let's add the proxy to our compose file and remove the port bindings from the whoami service. We'll mount our
docker.sock inside the container in
:ro read-only mode.
version: '3.4'

services:
  whoami:
    image: jwilder/whoami
  proxy:
    image: jwilder/nginx-proxy
    volumes:
      - /var/run/docker.sock:/tmp/docker.sock:ro
    ports:
      - 80:80
When we start this and test
$ docker-compose up -d --scale whoami=3
$ curl localhost:80
<html>
<head><title>503 Service Temporarily Unavailable</title></head>
<body bgcolor="white">
<center><h1>503 Service Temporarily Unavailable</h1></center>
<hr><center>nginx/1.13.8</center>
</body>
</html>
It's "working", but nginx just doesn't know which service we want. The
nginx-proxy works with two environment variables: VIRTUAL_HOST and VIRTUAL_PORT.
VIRTUAL_PORT is not needed if the service has
EXPOSE in its docker image. We can see that
jwilder/whoami sets it: https://github.com/jwilder/whoami/blob/master/Dockerfile#L9
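If an image didn't EXPOSE its port, a hedged sketch of setting both variables explicitly would look like:

  whoami:
    image: jwilder/whoami
    environment:
      - VIRTUAL_HOST=whoami.localtest.me
      - VIRTUAL_PORT=8000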
The address localtest.me is configured so that all subdomains point to
127.0.0.1 (at least at the time of writing) - let's use that:
version: '3.4'

services:
  whoami:
    image: jwilder/whoami
    environment:
      - VIRTUAL_HOST=whoami.localtest.me
  proxy:
    image: jwilder/nginx-proxy
    volumes:
      - /var/run/docker.sock:/tmp/docker.sock:ro
    ports:
      - 80:80
Now the proxy works:
$ docker-compose up -d --scale whoami=3
$ curl whoami.localtest.me
I'm f6f85f4848a8
$ curl whoami.localtest.me
I'm 740dc0de1954
Let's add a couple more containers behind the same proxy. We can use the official
nginx image to serve a simple static web page. We don't even have to build the container images; we can just mount the content into the image. Let's prepare some content for two services called "hello" and "world":
$ echo "hello" > hello.html
$ echo "world" > world.html
Then add these services to the
docker-compose.yml file, where we mount just the content as
index.html in the default nginx path:
  hello:
    image: nginx
    volumes:
      - ./hello.html:/usr/share/nginx/html/index.html:ro
    environment:
      - VIRTUAL_HOST=hello.localtest.me
  world:
    image: nginx
    volumes:
      - ./world.html:/usr/share/nginx/html/index.html:ro
    environment:
      - VIRTUAL_HOST=world.localtest.me
Now let's test:
$ docker-compose up -d --scale whoami=3
$ curl hello.localtest.me
hello
$ curl world.localtest.me
world
$ curl whoami.localtest.me
I'm f6f85f4848a8
$ curl whoami.localtest.me
I'm 740dc0de1954
Now we have a basic single machine hosting setup up and running.
Test updating the
hello.html without restarting the container - does it work?
Next we'll set up Wordpress, which requires MySQL and a persisted volume.
At https://hub.docker.com/_/wordpress/ there is a massive list of different variants under
"Supported tags and respective Dockerfile links" - most likely we can use any of the images for this testing. From "How to use this image" we can see that all variants require a
WORDPRESS_DB_HOST that needs to be MySQL. So before moving forward, let's set that up.
In https://hub.docker.com/_/mysql/ there's a sample compose file under "via docker stack deploy or docker-compose" - Let's strip that down to
version: '3.4'

services:
  db:
    image: mysql
    restart: unless-stopped
    environment:
      MYSQL_ROOT_PASSWORD: example
- Version was updated to 3.4 - but that doesn't change anything in this case
- restart: always was changed to unless-stopped, which will keep the container running unless it's explicitly stopped. With always, a stopped container would be started after a reboot, for example.
Under "Caveats - Where to Store Data" we can see that
/var/lib/mysql needs to be mounted separately to preserve data, so that the container can be recreated. We could use a bind mount like previously, but this time let's create a separate volume for the data:
version: '3.4'

services:
  mysql:
    image: mysql
    restart: unless-stopped
    environment:
      - MYSQL_ROOT_PASSWORD=example
    volumes:
      - mysql-data:/var/lib/mysql

volumes:
  mysql-data:
$ docker-compose up
Creating network "wordpress_default" with the default driver
Creating volume "wordpress_mysql-data" with default driver
Creating wordpress_mysql_1 ...
Creating wordpress_mysql_1 ... done
Attaching to wordpress_mysql_1
mysql_1  | Initializing database
...
mysql_1  | 2018-02-01T19:48:20.660859Z 0 [Warning] 'tables_priv' entry 'sys_config mysql.sys@localhost' ignored in --skip-name-resolve mode.
mysql_1  | 2018-02-01T19:48:20.664811Z 0 [Note] Event Scheduler: Loaded 0 events
mysql_1  | 2018-02-01T19:48:20.665236Z 0 [Note] mysqld: ready for connections.
mysql_1  | Version: '5.7.21'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
The image initializes the data files on the first start. Let's terminate the container with ^C:
^CGracefully stopping... (press Ctrl+C again to force)
Stopping wordpress_mysql_1 ... done
Compose uses the current directory name as a prefix for container and volume names so that different projects don't clash. The prefix can be overridden with the
COMPOSE_PROJECT_NAME environment variable if needed.
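For example, to run the same compose file under a different prefix (the project name myblog is made up for illustration):

$ COMPOSE_PROJECT_NAME=myblog docker-compose up -d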
Now that MySQL is running, let's add the actual Wordpress. The container seems to require just two environment variables:
  wordpress:
    image: wordpress:4.9.1-php7.1-apache
    environment:
      - WORDPRESS_DB_HOST=mysql
      - WORDPRESS_DB_PASSWORD=example
    ports:
      - 9999:80
    depends_on:
      - mysql
With depends_on we also declare that the
mysql service should be started first and that the wordpress container can connect to it - the MySQL server is accessible with the DNS name "mysql" from the Wordpress service.
Now when you run it:
$ docker-compose up -d
$ docker-compose logs wordpress
Attaching to wordpress_wordpress_1
wordpress_1  | WordPress not found in /var/www/html - copying now...
wordpress_1  | Complete! WordPress has been successfully copied to /var/www/html
...
We see that the Wordpress image creates files at startup in
/var/www/html, which also need to be persisted. The Dockerfile has this line https://github.com/docker-library/wordpress/blob/6a085d90853b8baffadbd3f0a41d6814a2513c11/php7.1/apache/Dockerfile#L44 where it declares that a volume should be created. Docker will create the volume, but it will be handled as an anonymous volume that is not managed by compose, so it's better to be explicit about the volume. With that in mind, our final file should look like this:
version: '3.4'

services:
  mysql:
    image: mysql
    restart: unless-stopped
    environment:
      - MYSQL_ROOT_PASSWORD=example
    volumes:
      - mysql-data:/var/lib/mysql
  wordpress:
    image: wordpress:4.9.1-php7.1-apache
    environment:
      - WORDPRESS_DB_HOST=mysql
      - WORDPRESS_DB_PASSWORD=example
    volumes:
      - wordpress-data:/var/www/html
    ports:
      - 9999:80
    depends_on:
      - mysql

volumes:
  mysql-data:
  wordpress-data:
Now open and configure the installation at http://localhost:9999
We can inspect the changes that happened in the image and ensure that no extra meaningful files got written to the container:
$ docker diff $(docker-compose ps -q wordpress)
C /run/apache2
A /run/apache2/apache2.pid
C /run/lock/apache2
C /tmp
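We can also sanity-check that the wordpress container resolves the mysql name; a hedged sketch (getent should be available since the image is Debian-based):

$ docker-compose exec wordpress getent hosts mysql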
Since plugins and image uploads will by default write to local disk at
/var/www/html, Wordpress cannot be scaled across multiple machines in a real production deployment without somehow sharing this path. Some possible solutions:
- A shared filesystem like NFS or AWS EFS
- Something like https://www.gluster.org/ or http://ceph.com/
- A two-way syncing daemon like https://www.cis.upenn.edu/~bcpierce/unison/index.html, https://syncthing.net/ or https://www.resilio.com - see http://blog.kontena.io/how-to-build-high-availability-wordpress-site-with-docker/
- User space FUSE solutions like https://github.com/kahing/goofys or https://github.com/googlecloudplatform/gcsfuse - see https://lemag.sfeir.com/wordpress-cluster-docker-google-cloud-platform/
Backups and restore
We can test backing up:
$ docker-compose exec mysql mysqldump wordpress -uroot -pexample | less
Where we see that the first line is unexpected:
mysqldump: [Warning] Using a password on the command line interface can be insecure.
This is because docker-compose's exec has a bug (docker/compose#5207) where STDERR gets printed to STDOUT. As a workaround we can skip compose and use docker exec directly:
$ docker exec -i $(docker-compose ps -q mysql) mysqldump wordpress -uroot -pexample > dump.sql
mysqldump: [Warning] Using a password on the command line interface can be insecure.
Now STDERR is correctly printed to the terminal and the dump ends up in dump.sql. Next, let's test restoring; first take everything down:
$ docker-compose down
Stopping wordpress_wordpress_1 ... done
Stopping wordpress_mysql_1 ... done
Removing wordpress_wordpress_1 ... done
Removing wordpress_mysql_1 ... done
Removing network wordpress_default
As our volumes are managed separately by docker-compose, that command didn't remove them, to prevent mistakes. Let's remove them explicitly:
$ docker-compose down --volumes
Removing network wordpress_default
WARNING: Network wordpress_default not found.
Removing volume wordpress_mysql-data
Removing volume wordpress_wordpress-data
Then start the mysql service again (with fresh volumes), without the wordpress service:
$ docker-compose up -d mysql
Since the dumping with
docker-compose exec did not work, let's see if importing would:
$ docker-compose exec mysql mysql -uroot -pexample < dump.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
Traceback (most recent call last):
  File "docker-compose", line 6, in <module>
  File "compose/cli/main.py", line 71, in main
  File "compose/cli/main.py", line 124, in perform_command
  File "compose/cli/main.py", line 467, in exec_command
  File "site-packages/dockerpty/pty.py", line 338, in start
  File "site-packages/dockerpty/io.py", line 32, in set_blocking
ValueError: file descriptor cannot be a negative integer (-1)
Failed to execute script docker-compose
...and no, because of another bug (docker/compose#3352) - we'll bypass compose again with:
$ docker exec -i $(docker-compose ps -q mysql) mysql -uroot -pexample wordpress < dump.sql
And then start the wordpress service:
$ docker-compose up -d wordpress
And our old site is back!