@ZapDos7
Last active February 15, 2024 20:30
Docker notes

Docker

LinkedIn Learning course by Arthur Ulfeldt

Introduction

Docker "carves up" a system into sealed containers that run your code, each with its own processes etc. These containers are portable (so code can be executed anywhere). Docker builds these containers & offers a social platform to exchange these containers.

Docker is:

  • a client program (a command we run)
  • a server program (that listens for such commands and controls the Linux VM)
  • a program that builds the containers from the code
  • a service that distributes these containers online
  • a company that makes these

Docker needs a Linux server to manage.

A container is a self-contained, sealed unit of software that contains everything required to run the code, e.g. the OS, configs, processes, networking, dependencies. Multiple containers can coexist on the same machine and they do not affect each other as long as the OS is a Linux distro.

Image: every file that makes up just enough of the operating system to do what you need to do.

Multiple containers can derive from the same image but each one holds its own world, so if I made 2 containers from 1 image, and in 1 of these containers I create a file, that file does not exist in the other container or the image.
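
The isolation described above can be tried out with a small sketch like the one below. The function name isolation_demo and the container name box-one are made up for illustration, and the commands assume a working Docker installation.

```shell
# Sketch: two containers created from the same image do not share files.
# "box-one" is a hypothetical container name.
isolation_demo() {
  # First container: create a file, then list it.
  docker run --name box-one ubuntu bash -c "touch /my-file; ls /my-file"
  # Second container from the same image: the file is not there.
  docker run --rm ubuntu bash -c "ls /my-file || echo 'no /my-file here'"
  # Remove the first (now stopped) container.
  docker rm box-one
}
```

Calling isolation_demo should show the file in the first container only; the ubuntu image itself is never modified.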

Installation notes

Options:

Note: Boot2Docker is obsolete and thus should be uninstalled from any systems.

Commands

Image to Container Commands

Verify the installation and print simple Docker info

docker run hello-world

Run a container from an image

docker run

# e.g.
docker run -ti ubuntu:latest bash   # creates a container
                                    # from the ubuntu:latest image
                                    # -ti: terminal interactive, i.e.
                                    # attach an interactive terminal
                                    # to the bash process it runs
                                    # if we omit the tag,
                                    # docker fills it in
                                    # as `latest` by default

docker run --rm             # we wish to run something in a container 
                            # but not keep the container
                            # after we're done
docker run ... bash -c "foo"  # run the command(s) `foo` within the container
                            # (-c is an option of bash, not of docker run)
                            # e.g.
docker run -ti ubuntu bash -c "sleep 3; echo all done"  # run an ubuntu container, sleep for 3 seconds,
                                                        # then print "all done" to the terminal

docker run -d ...           # -d for detached
                            # runs in the background
                            # prints the container id; it can also be found with docker ps

We can exit with Ctrl + D

Docker Images Commands

Display all images

docker images               # displays all docker images
                            # existing locally:
                            # repository: where the image comes from
                            # tag: version number
                            # image id: internal Docker representation
                            # (an image always has an id but does not
                            # need a name, so we can refer to it by id
                            # or by repository:tag)
                            # created: date it was created
                            # size: image size in kB/MB

Display all running containers

docker ps                   # NOTE: image id != container id
docker ps -a                # displays all containers,
                            # even stopped ones
docker ps -l                # displays the most recently created container (running or stopped)

Container to Image Commands

docker commit <container-id>                  # takes a container (usually a stopped one) and creates
                                              # a new image from it, returning:
                                              # sha256:<image-id>
                                              # we can then do this:
docker tag <image-id> <custom-image-name>     # give a human-readable name to an image
                                              # since the above process is common, 
                                              # we can instead do it in one command:
docker commit <container-name> <custom-image-name>

Running things in Docker

  • containers have a main process
  • the container stops when the main process stops
  • containers have names, if we do not give them one, Docker will generate one for us
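
A minimal sketch of the points above, assuming a working Docker installation; the function name main_process_demo and the container name short-lived are hypothetical.

```shell
# Sketch: the container lives exactly as long as its main process.
main_process_demo() {
  # The main process here is `sleep 2`; we name the container ourselves.
  docker run -d --name short-lived ubuntu sleep 2
  # Give the main process time to finish.
  sleep 3
  # -a includes stopped containers; the status should now start with "Exited".
  docker ps -a --filter name=short-lived --format '{{.Names}}: {{.Status}}'
  # Remove the stopped container so that the name becomes free again.
  docker rm short-lived
}
```

If we drop --name, Docker generates a name, which shows up in the NAMES column of docker ps -a.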

Detached Containers Commands

# start detached
docker run -d -ti ubuntu bash   # prints the container id
# reattach to it
docker attach <container-name>  # or its id

To detach from a container and leave it running, we can use the keys Ctrl + P, Ctrl + Q; this works even if we started the container attached.

Running more things in a container

docker exec -ti <cont-name> <process>   # start another process 
                                        # in an existing container

Useful for debugging and DB administration.

It cannot add ports, volumes etc. to the container.

This gives us multiple interaction points into the same container. When the container's main process exits, the exec'd process dies alongside it.

Container Output Commands

# we run a container but have a typo in `ls` as `lose`
docker run --name example -d ubuntu bash -c "lose /etc/password"
# we check logs
docker logs example
# this returns:
# bash: lose: command not found

Keep container output from growing too large: Docker stores it so that docker logs can show it.

Stop & Remove Containers

docker kill <container-name-or-id> # stops a container
docker rm   <container-name-or-id> # removes a container

It's good to keep track of unused containers and rm them: a stopped container still holds its name, so the name cannot be reused until the container is removed, which can lead to pseudo-conflicts.
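
One possible way to do this clean-up in bulk is sketched below; the helper name remove_exited is made up, and it assumes we really want to delete every exited container.

```shell
# Sketch: remove all containers whose main process has exited.
remove_exited() {
  local ids
  # -a: include stopped containers, -q: print only the ids.
  ids="$(docker ps -aq --filter status=exited)"
  if [ -z "$ids" ]; then
    echo "nothing to remove"
    return 0
  fi
  # Word splitting is intended here: one id per argument.
  docker rm $ids
}
```

Docker also ships a built-in command with a similar effect, docker container prune, which removes all stopped containers after asking for confirmation.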

Constrain Resources Commands

# memory
docker run --memory <max-allowed-memory> <image-name> <command>
# CPU
docker run --cpu-shares # relative to other containers
docker run --cpu-quota  # hard limit in general
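
A sketch with concrete values, to make the flags above less abstract. The wrapper name run_limited and the chosen numbers are assumptions; --cpu-period is an additional docker run flag that the quota is measured against.

```shell
# Sketch: run a throwaway container with memory and CPU limits.
run_limited() {
  # --memory 256m: at most 256 MiB of memory.
  # --cpu-shares 512: half the default weight (1024) relative to other containers.
  # --cpu-quota 50000 per --cpu-period 100000 (microseconds): at most half a CPU.
  docker run --rm \
    --memory 256m \
    --cpu-shares 512 \
    --cpu-period 100000 --cpu-quota 50000 \
    "$@"
}
```

Usage: run_limited ubuntu bash -c "echo limited", where everything after the function name is the image and command as usual.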

Tips & Lessons 1

  1. Don't let containers fetch dependencies when they start.

If, for example, a Node.js container fetches its dependencies every time it starts, then some day somebody will remove a library from the Node repositories and, all of a sudden, containers across your whole system stop starting. Make your containers include their dependencies inside the image itself; it saves a lot of pain.

  2. Don't leave important things in unnamed stopped containers.
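
For the first tip, a rough sketch of what "dependencies inside the image" can look like for a Node.js project (Dockerfiles are covered further down in these notes). The file names Dockerfile.deps-example, package.json, package-lock.json and server.js, and the tag example/node-app, are assumptions about a hypothetical project.

```shell
# Sketch: write a Dockerfile that installs dependencies at build time.
cat > Dockerfile.deps-example <<'EOF'
FROM node:20
WORKDIR /app
# Dependencies are installed while building and become part of the image.
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
# Starting the container only runs the app; nothing is fetched here.
CMD ["node", "server.js"]
EOF
# Build with: docker build -f Dockerfile.deps-example -t example/node-app .
```

A library disappearing from the package repository then breaks the next build, not the containers that are already deployed.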

Container Networking Commands

Programs in containers are isolated from the outside network by default: nothing can connect to them until we publish a port.

We can group containers into "private" networks so they can interact with each other via network, but not with the rest of the world.

We can expose as many ports as we want to let connections in from the rest of the world. This requires coordination between containers, since two containers cannot publish the same host port.

Example: A server container passing info between two clients

# terminal 1
docker run --rm -ti -p 45678:45678 -p 45679:45679 --name echo-server ubuntu:14.04 bash 
# -p: publish 
# <host-port>:<container-port>; here each container port is published on the same port number of the host

# inside we will use netcat to simply use the networking facilities
$ nc -lp 45678 | nc -lp 45679
# -lp: listen on port
# listen on port 45678 and pipe whatever arrives to a second listener on port 45679
# terminal 2
nc localhost 45678
# terminal 3
nc localhost 45679
# Now, whatever we write from terminal 2 is printed onto terminal 3.
# if we don't have nc installed, we can execute it within a container
docker run --rm -ti ubuntu:14.04 bash
$ nc host.docker.internal 45678

The above example works on Mac, where host.docker.internal resolves to the host; to specify the host when using multiple containers on other operating systems, see here.

If we need to set more ports

We do it dynamically:

  • the port inside the container is fixed
  • the port on the host is chosen from unused ports
  • this allows many containers running programs with fixed ports, without us having to set each and every one.

This is useful in service discovery programs like Kubernetes.

e.g.:

# terminal 1
docker run --rm -ti -p 45678 -p 45679 --name echo-server ubuntu:14.04 bash 
$ nc -lp 45678 | nc -lp 45679
# terminal 2
docker port echo-server # it's easier to look up a named container 
# prints
45678/tcp -> 0.0.0.0:32768
45679/tcp -> 0.0.0.0:32769
# (example values: Docker picks the host ports, so they will differ; a port number is at most 65535)
# so now we can
nc localhost 32768
# terminal 3
nc localhost 32769

And it works like before.
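
Since the host ports are not known in advance, scripts usually have to look them up. A small sketch that extracts the host port from docker port output; the helper name host_port_for is made up, and the sample lines in the usage note are illustrative values only.

```shell
# Sketch: print the host port that a given container port was published on.
# Reads the output of `docker port <container>` on stdin.
host_port_for() {
  # Input lines look like: 45678/tcp -> 0.0.0.0:32768
  # The host port is whatever follows the last colon of the third field.
  awk -v want="$1" '$1 == want { n = split($3, parts, ":"); print parts[n]; exit }'
}
```

Usage: docker port echo-server | host_port_for 45678/tcp. For a single port, docker port echo-server 45678/tcp also works and prints only that mapping.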

Exposing UDP Ports

# terminal 1
docker run -p <outside-port>:<inside-port>/<protocol>   # protocol: tcp or udp
# e.g.
docker run -p 1234:1234/udp
# we then need to alter the nc command
$ nc -ulp 1234
# terminal 2
nc -u localhost 1234

Docker's Network Features

Show existing networks

docker network ls
# prints:
NETWORK ID     NAME    DRIVER    SCOPE
17776875348d   bridge  bridge    local # default
2eefced1748c   host    host      local # for containers that should not be
                                       # isolated from the host's network
03dc35a5003d   none    null      local # no networking
...

Create a new network

docker network create <some-name>

Connect to a created network

docker run --rm -ti --net <some-name> --name <server-name> ...
# e.g.
docker network create learn
docker run --rm -ti --net learn --name catserver ubuntu:14.04 bash
$ ping catserver # verify we can send packets to ourselves

We can also add more servers in the same network

docker run --rm -ti --net learn --name dogserver ubuntu:14.04 bash

Now, catserver and dogserver can communicate.

Multiple networks

Expanding the above example we create a new network:

docker network create catsonly
docker network connect catsonly catserver

We now start a new container on this second network:

docker run --rm -ti --net catsonly --name bobcatserver ubuntu:14.04 bash
$ ping catserver # works
$ ping dogserver # traffic not allowed

This does not stop catserver from pinging both bobcatserver and dogserver, as it belongs to both networks. dogserver can still only reach catserver.

Legacy Linking

  • one way links
  • secret environment variables shared only one way
  • startup order is important
  • restarts only sometimes break the links

Example

Like above, we start catserver, this time also creating an environment variable named FOO.

docker run --rm -ti -e FOO=bar --name catserver ubuntu:14.04 bash
$ env # returns FOO=bar
$ nc -lp 1234

Now we connect to this server like so

docker run --rm -ti --link catserver --name dogserver ubuntu:14.04 bash
$ env # returns a copy of this variable, named CATSERVER_ENV_FOO
$ nc catserver 1234

Now our servers can exchange messages. We cannot, however, connect from catserver to dogserver, as this functionality is only one way.

Images

Note: Summing up the SIZE column does not return the sum of MB used by Docker, as some images share bytes.

Commit Images

docker commit <sha> my-image        # auto tags as latest
docker commit <sha> my-image:v2.1   # tags as v2.1; creates new image

Full name structure

registry.example.com:port/organization/image-name:version-tag
# we can leave out the parts we don't need
# usually just organization/image-name is enough

Getting Images

docker pull # run automatically by docker run, useful for offline work

Cleaning Up Images

docker rmi image-name:tag # or
docker rmi image-id

Sharing data between containers or between containers and hosts: Volumes

Volumes act like shared folders, virtual discs to store and share data.

Two types:

  • persistent (when the container goes away, the data remains)
  • ephemeral (exist as long as a container is using them)

They are not part of images; they're for our own local data (local to this host)

Sharing Data with the Host

For a folder:

[outside docker] mkdir example
[outside docker] docker run -ti -v /path/to/example:/shared-folder ubuntu bash
# -v: specify path to folder we wish to share, colon, path within the container where we wish to store this folder.
[within docker] $ ls /shared-folder/
# yields nothing
[within docker] $ touch /shared-folder/my-data
# exit
[outside docker] ls example/
# returns my-data

For a file, we perform the same process, but the file must exist before we start the container, or it will be assumed to be a directory.

Sharing Data between Containers

We use volumes-from:

  • shared discs that exist only as long as they are being used
  • can be shared between containers
# terminal 1 #
docker run -ti -v /shared-data ubuntu bash # create a volume for container not shared with host
$ echo hello > /shared-data/data-file
# store data into it
# terminal 2
docker run -ti --volumes-from <container-name> ubuntu bash
$ ls /shared-data/
# returns the data-file
# we can add more here and it will be visible on terminal 1's container

After exiting the original container, while keeping the second one, we still have access to our data. We can add another container in a similar manner and it'll also be able to view these files, even though the original container is gone. When all containers using the volume are gone, this data is gone.

Docker Registries

What are they?

  • They manage & distribute images
  • Docker (the company) offers these for free, we can also run our own if we desire.

How to Find Images

docker search <term>

e.g.:

$ docker search ubuntu
NAME                             DESCRIPTION                                     STARS     OFFICIAL   AUTOMATED
ubuntu                           Ubuntu is a Debian-based Linux operating sys…   15370     [OK]       
websphere-liberty                WebSphere Liberty multi-architecture images …   290       [OK]       
ubuntu-upstart                   DEPRECATED, as is Upstart (find other proces…   112       [OK]       
neurodebian                      NeuroDebian provides neuroscience research s…   97        [OK]       
ubuntu/nginx                     Nginx, a high-performance reverse proxy & we…   71                   
open-liberty                     Open Liberty multi-architecture images based…   56        [OK]       
ubuntu/apache2                   Apache, a secure & extensible open-source HT…   51                   
ubuntu-debootstrap               DEPRECATED; use "ubuntu" instead                49        [OK]       
ubuntu/squid                     Squid is a caching proxy for the Web. Long-t…   46                   
ubuntu/mysql                     MySQL open source fast, stable, multi-thread…   40                   
ubuntu/bind9                     BIND 9 is a very flexible, full-featured DNS…   34                   
ubuntu/prometheus                Prometheus is a systems and service monitori…   33                   
ubuntu/postgres                  PostgreSQL is an open source object-relation…   22                   
ubuntu/kafka                     Apache Kafka, a distributed event streaming …   17                   
ubuntu/redis                     Redis, an open source key-value store. Long-…   15                   
ubuntu/prometheus-alertmanager   Alertmanager handles client alerts from Prom…   8                    
ubuntu/grafana                   Grafana, a feature rich metrics dashboard & …   6                    
ubuntu/memcached                 Memcached, in-memory keyvalue store for smal…   5                    
ubuntu/zookeeper                 ZooKeeper maintains configuration informatio…   5                    
ubuntu/dotnet-runtime            Chiselled Ubuntu runtime image for .NET apps…   5                    
ubuntu/dotnet-deps               Chiselled Ubuntu for self-contained .NET & A…   5                    
ubuntu/telegraf                  Telegraf collects, processes, aggregates & w…   4                    
ubuntu/cortex                    Cortex provides storage for Prometheus. Long…   3                    
ubuntu/dotnet-aspnet             Chiselled Ubuntu runtime image for ASP.NET a…   3                    
ubuntu/cassandra                 Cassandra, an open source NoSQL distributed …   2

There is also the OFFICIAL flag which helps differentiate between images.

We can also visit the Docker Hub Webpage, containing even more info for each image.

If we already have an account, we can do the following

docker login                                            # username, password prompts
docker pull debian:sid                                  # pull an image
docker tag debian:sid username/<name>:<tag-version>     # add tag
docker push username/<name>:<tag-version>               # push to repo

Tips & Lessons 2

  1. Don't push images containing passwords or other sensitive info to Docker Hub
  2. Clean up images regularly (helps keep track of obsolete dependencies, too)
  3. Be careful what you trust!

Building Docker Images

What are Dockerfiles?

  • Small programs that describe the creation of an image.
  • We run these with
docker build -t <name-of-result> <path/to/directory/containing/Dockerfile>
  • When it finishes, the result will be in our local image store (listed by docker images)
  • Each line takes the image from the previous line & makes another image; the previous image is left unchanged, i.e. a line does not edit the state produced by the previous line
  • We do not want large files to persist from one line to the next (e.g. downloaded in one line and deleted in a later one), lest our image be huge eventually: the earlier image still contains them.
  • Each step is cached, so watch the build output for "Using cache". Docker can skip lines that have remained unchanged since the last build.
  • TIP: Put the parts that change most often at the end of the Dockerfile.
  • They are NOT shell scripts, albeit looking like them: a process started on one line is not running on the next line.
  • If we want to keep environment variables from one line to the next, we have to set them with the ENV command; the value is then available from the next line on.
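
A sketch of a Dockerfile that follows the points above. The file name Dockerfile.layers-example, the download URL and the paths are placeholders, not taken from the course.

```shell
# Sketch: write a Dockerfile that is friendly to layering and caching.
cat > Dockerfile.layers-example <<'EOF'
FROM ubuntu:16.04
# Rarely changes, so it goes first and is usually served from the cache.
RUN apt-get update && apt-get -y install curl
# Download, unpack and delete the archive in ONE step, so the archive
# never ends up stored in an image of its own.
RUN curl -fsSL -o /tmp/big.tar.gz https://example.com/big.tar.gz \
 && tar -xzf /tmp/big.tar.gz -C /opt \
 && rm /tmp/big.tar.gz
# ENV survives into later lines; a variable exported inside a RUN would not.
ENV APP_HOME=/opt/app
# Changes most often, so it goes last.
COPY . ${APP_HOME}
EOF
```

With this ordering, editing the project files only invalidates the final COPY step; everything before it can be reused from the cache.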

Further reading

How to Create a Dockerfile?

A Simple Example

  1. Create a file named Dockerfile in its own directory. (why?)
  2. Put this in it:
FROM busybox
RUN echo "building simple docker image."
CMD echo "Hello Container"
  3. Run:
docker build -t hello .

This prints something like:

Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM busybox
latest: Pulling from library/busybox
2123501b93d4: Pull complete 
Digest: sha256:05a79c7279f71f86a2a0d05eb72fcb56ea36139150f0a75cd87e80a4272e4e39
Status: Downloaded newer image for busybox:latest
 ---> 827365c7baf1
Step 2/3 : RUN echo "building simple docker image."
 ---> Running in 12470d849d80
building simple docker image.
Removing intermediate container 12470d849d80
 ---> c7ebfbb9c45e
Step 3/3 : CMD echo "Hello Container"
 ---> Running in cc339230761d
Removing intermediate container cc339230761d
 ---> 57a23769c50f
Successfully built 57a23769c50f
Successfully tagged hello:latest
  4. Now run:
docker run --rm hello

This prints:

Hello Container

Installing a Program with Docker Build

In your Dockerfile, put:

FROM debian:sid
RUN apt-get -y update 
RUN apt-get install -y nano
CMD ["/bin/nano", "/tmp/notes"]
# CMD "nano" "/tmp/notes"

Then run

docker build -t example/nanoer .

Adding a File through Docker Build

In your Dockerfile, put:

FROM example/nanoer
ADD notes.txt /notes.txt
CMD ["nano", "/notes.txt"]

Then:

nano notes.txt
# add something in the file

Lastly:

docker build -t example/notes .
docker run -ti --rm example/notes

We can now access notes.txt

Syntax

The FROM statement

  • It dictates which image to download & start from.
  • It must be the first command in the file (only ARG may come before it).
  • Multiple FROM statements can exist in the same file; each one starts a new image.
  • syntax: FROM java:8

The MAINTAINER statement

  • Defines the author of the file (deprecated in current Docker versions in favour of a LABEL)
  • syntax: MAINTAINER fname lname <email@email.com>

The RUN statement

  • Runs the command line, waits for it to finish & saves the result
  • syntax: RUN unzip install.zip -d /opt/install/
  • or RUN echo hello docker

The ADD statement

  • Adds local files e.g. ADD run.sh /run.sh
  • Decompresses tar contents to directory e.g. ADD foo.tar.gz /install/
  • Downloads from URL and places in directory e.g. ADD https://url/to/file /folder/

The ENV statement

  • Sets environment variables both during the build and when running the result
  • syntax: ENV FOO=bar

The ENTRYPOINT statement

  • ENTRYPOINT specifies the start of the command to run (arguments given to docker run are added to it)

CMD statement

  • CMD specifies the entire command to run (arguments given to docker run replace it)

ENTRYPOINT & CMD

  • These two can coexist.
  • If your container acts like a command-line program, you can use ENTRYPOINT
  • If you're unsure, use CMD

Shell Form vs Exec Form

Shell form:

nano /notes.txt

Exec form: (a bit more efficient as it directly runs the command without the need for a shell)

["nano", "/notes.txt"]
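
How ENTRYPOINT and CMD combine is easiest to see in a tiny example. This is a sketch; the file name Dockerfile.entrypoint-example and the tag example/greeter are made up. Note that CMD is only appended when ENTRYPOINT is written in exec form.

```shell
# Sketch: ENTRYPOINT fixes the start of the command, CMD supplies default arguments.
cat > Dockerfile.entrypoint-example <<'EOF'
FROM busybox
# The fixed start of the command line (exec form).
ENTRYPOINT ["echo", "Hello"]
# Default arguments, appended to the ENTRYPOINT.
CMD ["Container"]
EOF
# docker build -f Dockerfile.entrypoint-example -t example/greeter .
# docker run --rm example/greeter          # runs: echo Hello Container
# docker run --rm example/greeter World    # runs: echo Hello World
```

Arguments given after the image name replace CMD but leave ENTRYPOINT in place, which is what makes the container behave like a command-line program.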

The EXPOSE statement

  • Declares a port that the container listens on; to reach it from the host, the port still has to be published with -p at run time
  • syntax: EXPOSE 8080

The VOLUME statement

  • Defines shared or ephemeral volumes
  • syntax: VOLUME ["/shared-data/"], which defines a volume that can be inherited later by containers
  • Note: a Dockerfile cannot map a host path. VOLUME ["/host/path/", "/container/path"] simply declares two volumes inside the container; host paths are mapped at run time with docker run -v /host/path:/container/path
  • It's good to avoid defining shared folders in Dockerfiles, so that the image stays usable on many computers.

The WORKDIR statement

  • Sets both the working directory for the rest of the Dockerfile as well as for the resulting container
  • syntax: WORKDIR /dir/

The USER statement

  • Sets which user the container will run as
  • syntax: USER arthur or USER 1000
  • Useful when shared network directories expect a specific user name or ID.

There's more!

Check it out.

Multiple-project Docker files

Dockerfile:

FROM ubuntu:16.04
RUN apt-get update
RUN apt-get -y install curl
RUN curl https://google.com | wc -c > google-size
# this simply calculates google's home page character count
ENTRYPOINT echo google is this big; cat google-size

Run:

docker build -t tooo-big .

Now:

docker run tooo-big

This image is about 170MB. This can be improved. Therefore we can split the Dockerfile like so:

# this changed: the first stage now has a name
FROM ubuntu:16.04 as builder
RUN apt-get update
RUN apt-get -y install curl
RUN curl https://google.com | wc -c > google-size
# add this
FROM alpine
COPY --from=builder /google-size /google-size
# up to here
ENTRYPOINT echo google is this big; cat google-size

Rebuild (with a new name), run anew & check again: the new image size is 4.41MB.

Tips & Lessons 3

Prevent the "Golden Image" problem:

  • Include installers in your project.
  • Have a canonical build system that builds everything completely from scratch.
  • Tag your builds with the git hash of the code that built it.
  • Use small images, like Alpine.
  • Always build images you share publicly from Dockerfiles.
  • Do not share passwords! Delete any file that might contain sensitive data in the same step that created it; deleting it in a later step leaves it in the earlier image.
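
Tagging builds with the git hash can be scripted in a couple of lines. A sketch, assuming it is run from inside a git checkout that contains the Dockerfile; the function name build_with_git_tag and the image name in the usage note are made up.

```shell
# Sketch: build an image tagged with the short hash of the current commit.
build_with_git_tag() {
  local image="$1"
  local hash
  # Stop here if we are not inside a git repository.
  hash="$(git rev-parse --short HEAD)" || return 1
  docker build -t "$image:$hash" .
}
```

Usage: build_with_git_tag example/app produces an image such as example/app:<short-hash>, so any running container can be traced back to the code that built it.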

Delve into the core

Kernels:

  • Respond to messages from hardware
  • Start & schedule programs
  • Control & organize devices & storage units
  • Pass messages between programs
  • Allocate resources (memory, CPU etc)

The kernel being handled by Docker:

  • Docker is written in Go
  • It manages kernel features
    • uses "cgroups" to contain processes
    • uses "namespaces" to contain networks
    • uses "copy-on-write" filesystems to build images

Docker does not introduce new features, it simplifies scripting distributed systems.

Docker is divided into the client & the server. The server receives commands over a socket (either over a network or through a "file"). Therefore, the client can run within the server, as well.

If we wish to identify the root (main) process of a container, we inspect the container:

docker inspect <container-name-or-id>
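
docker inspect prints a large JSON document, so it helps to ask for specific fields. A sketch using the --format option with the State.Pid and Path fields of the inspect output; the function name main_process is made up.

```shell
# Sketch: print the PID and the program of a container's main process.
main_process() {
  # .State.Pid: PID of the main process as seen by the Docker host
  # .Path: the program that was started as the main process
  docker inspect --format '{{.State.Pid}} {{.Path}}' "$1"
}
```

Usage: main_process <container-name-or-id>. When Docker runs inside a VM (as on Mac), the PID belongs to that VM rather than to the machine we are typing on.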

Registries

These can be run in Docker:

docker run -d -p 5000:5000 --restart=always --name registry registry:2
# --restart=always: always keep it running
# registry version 2 is current
docker tag ubuntu:14.04 localhost:5000/my-company/my-ubuntu:99
docker push localhost:5000/my-company/my-ubuntu:99
# pushed a copy of ubuntu we have locally 
# so even if the original is gone, we have this

Be mindful of authentication when deploying.

Storage options

Orchestration - Large Systems

  • We need multiple containers realistically
  • Therefore we need to orchestrate them & their communications
  • Service discovery
  • Resource allocation

Docker Compose

  • ideal for testing & development
  • single machine coordination
  • not good for scaling
  • brings everything up with one command: docker compose up

Kubernetes

  • containers run programs
  • pods group containers together (docker compose but dynamic)
  • services make pods discoverable by others
  • labels are used for very advanced service discovery
  • makes scripting large operations possible with the kubectl command
  • flexible overlaying system
  • runs on hardware, cloud etc
  • Kubernetes in AWS
  • Google Kubernetes Engine

Amazon ECS

  • works similarly
  • task definitions define a set of containers that always run together
  • tasks make a container run right now
  • services expose tasks to the Net and ensure that a task is running all the time
  • good integration with ELBs (Amazon Load Balancers)
  • you create your own host instances
  • make your instances start the agent & join the cluster
  • passes the docker control socket to the agent
  • provides docker repos
  • it's easy to run your own repo alongside these
  • tasks can be part of CloudFormation stacks (easier deployment)

Utilities

reformat.sh

This script reformats the output of docker ps vertically (only works on bash):

export FORMAT="\nID\t{{.ID}}\nIMAGE\t{{.Image}}\nCOMMAND\t{{.Command}}\nCREATED\t{{.RunningFor}}\nSTATUS\t{{.Status}}\nPORTS\t{{.Ports}}\nNAMES\t{{.Names}}\n"

Use:

docker ps --format "$FORMAT"

Docker Quick Notes

[Terminal 1]
sudo apt install docker.io        # package name of the Docker engine on Ubuntu/Debian
sudo apt install docker-compose

[Terminal 2]
docker compose up   # based on docker-compose.yml in the current folder
                    # (with the standalone binary installed above: docker-compose up)

## 1
systemctl start docker
# check whether the docker group exists
grep docker /etc/group
# if the group doesn't exist
sudo groupadd docker

## 2
# add user to group
sudo usermod -aG docker $USER
# log out, log in

## 3
newgrp docker
# activates the new group in the current shell

## 4
docker run hello-world
# verify

To run:

docker ps                        # get container's name e.g. postgresql
docker exec -it postgresql bash  # opens bash
$ psql -U username dbname        # opens postgres env (logged in based on credentials we gave), and we can perform the actions we need
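
The quick notes above assume a docker-compose.yml that defines a container called postgresql. A minimal sketch of such a file follows; the image version, password, published port and volume name are assumptions, while username and dbname mirror the psql command above.

```shell
# Sketch: write a minimal docker-compose.yml with a single PostgreSQL service.
cat > docker-compose.yml <<'EOF'
services:
  db:
    image: postgres:16
    # Fixed name, so that `docker exec -it postgresql bash` works.
    container_name: postgresql
    environment:
      POSTGRES_USER: username
      POSTGRES_PASSWORD: change-me
      POSTGRES_DB: dbname
    ports:
      - "5432:5432"
    volumes:
      # Named volume: the data survives removal of the container.
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
EOF
# Start it from the same folder with: docker compose up
```

The password here is a placeholder; a real setup should not keep credentials in a file that gets committed or shared.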