LinkedIn Learn course by Arthur Ulfeldt
Docker "carves up" a system into sealed containers that run your code, each with its own processes etc. These containers are portable (so code can be executed anywhere). Docker builds these containers & offers a social platform to exchange these containers.
Docker is:
- a client program (a command we run)
- a server program (that listens for such commands and controls the Linux VM)
- a program that builds the containers from the code
- a service that distributes these containers online
- a company that makes these
Docker needs a Linux server to manage.
A container is a self-contained, sealed unit of software that holds everything required to run the code, e.g. the OS, configs, processes, networking, dependencies. Multiple containers can coexist on the same machine without affecting each other, as long as the OS is a Linux distro.
Image: every file that makes up just enough of the operating system to do what you need to do.
Multiple containers can derive from the same image but each one holds its own world, so if I made 2 containers from 1 image, and in 1 of these containers I create a file, that file does not exist in the other container or the image.
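A minimal sketch of the point above, assuming Docker is installed and the ubuntu:latest image can be pulled. The function name show_container_isolation and the file /my-file are made up for illustration.

```shell
# Hypothetical helper: shows that containers made from one image do not share files.
show_container_isolation() {
  local image="${1:-ubuntu:latest}"
  # Container 1: create a file and confirm it exists inside this container.
  docker run --rm "$image" bash -c 'touch /my-file && ls /my-file'
  # Container 2: same image, fresh filesystem, so the file is not there.
  docker run --rm "$image" bash -c 'ls /my-file 2>/dev/null || echo "no /my-file here"'
}
```

Calling show_container_isolation should list /my-file from the first container and report it missing from the second; the image itself is never modified.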
Options:
Note: Boot2Docker is obsolete and thus should be uninstalled from any systems.
docker run hello-world
docker run
# e.g.
docker run -ti ubuntu:latest bash # creates a container
# from the ubuntu:latest image
# -ti: terminal interactive, i.e.
# with an interactive terminal,
# and runs bash in it
# if we omit the tag,
# docker will fill it in
# as `latest` by default
docker run --rm # we wish to run something in a container
# but not keep the container
# after we're done
docker run ... bash -c "foo" # run the command `foo` through bash
# within the container (-c is a bash option, not a docker one)
# e.g.
docker run -ti ubuntu bash -c "sleep 3; echo all done" # run ubuntu container, sleep for 3 seconds,
# then print "all done" to terminal
docker run -d ... # -d for detached
# runs in the background
# prints the container id, otherwise use docker ps
We can exit with Ctrl + D
docker images # displays all docker images
# existing locally:
# repository: where the image comes from
# tag: version number
# image id: internal Docker representation
# (we do not need to use the
# id; instead we might refer to images
# using repository:tag)
# created: date it was created
# size: image size in kB/MB
docker ps # NOTE: image id != container id
docker ps -a # displays all containers,
# even stopped ones
docker ps -l # displays the most recently created container (running or stopped)
docker commit <container-id> # takes a stopped container and creates
# a new image from it, returning:
# sha256:<image-id>
# we can then do this:
docker tag <image-id> <custom-image-name> # give a human-readable name to an image
# since the above process is common,
# we can instead do it in one command:
docker commit <container-name> <custom-image-name>
- containers have a main process
- the container stops when the main process stops
- containers have names, if we do not give them one, Docker will generate one for us
# start detached
docker run -d -ti ubuntu bash # returns detached id
# reattach to it
docker attach <container-name> # or its id
To detach again while leaving the container running, we can use the keys Ctrl + P, Ctrl + Q, which work even if we started with an attached container.
docker exec -ti <cont-name> <process> # start another process
# in an existing container
Useful for debugging and DB administration.
Does not allow adding ports, volumes etc.
This allows us to have multiple interaction points into the same container. When the original container exits, the exec'd process dies alongside it.
# we run a container but have a typo in `ls` as `lose`
docker run --name example -d ubuntu bash -c "lose /etc/password"
# we check logs
docker logs example
# this returns:
# bash: lose: command not found
Docker keeps this output for as long as the container exists, so the output should not be too large.
docker kill <container-name-or-id> # stops a container
docker rm <container-name-or-id> # removes a container
It's good to keep track of unused containers and rm them, as sometimes names get reused and that can lead to pseudo-conflicts.
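The cleanup advice can be sketched as a small function, assuming Docker is installed. The name remove_exited_containers is made up for illustration; it relies on the standard docker ps flags -a, -q and --filter.

```shell
# Hypothetical helper: remove every container whose status is "exited".
remove_exited_containers() {
  local ids
  # -a: include stopped containers, -q: print only the container ids
  ids="$(docker ps -a -q --filter status=exited)"
  if [ -z "$ids" ]; then
    echo "no exited containers"
    return 0
  fi
  # Intentionally unquoted so that each id becomes its own argument.
  docker rm $ids
}
```

Review the output of docker ps -a first if any stopped container might still hold data you need.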
# memory
docker run --memory <max-allowed-memory> <image-name> <command>
# CPU
docker run --cpu-shares # relative to other containers
docker run --cpu-quota # hard limit in general
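As an illustration of the limits above, here is a hedged sketch of a wrapper with concrete values. The name run_limited and the numbers 256m and 512 are assumptions chosen only for the example.

```shell
# Hypothetical wrapper: run a command with a memory cap and a reduced CPU weight.
run_limited() {
  # --memory 256m: the container may use at most 256 MB of memory
  # --cpu-shares 512: half the default weight of 1024, relative to other containers
  docker run --rm --memory 256m --cpu-shares 512 "$@"
}
```

For example, run_limited ubuntu bash -c "echo hi" starts a throwaway ubuntu container under those limits.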
- Don't let containers fetch dependencies when they start.
Say you're using something like Node.js and your program fetches its dependencies when the container starts. Then, some day, somebody is going to remove a library from the Node repos and, all of a sudden, all your containers stop starting throughout your whole system. Make your containers include their dependencies inside the image itself; it saves a lot of pain.
- Don't leave important things in unnamed stopped containers.
Programs in containers are isolated from incoming internet connections by default.
We can group containers into "private" networks so they can interact with each other via network, but not with the rest of the world.
We can expose ports (as many as we want) to let connections in in order to connect with the rest of the world. This requires coordination between containers.
# terminal 1
docker run --rm -ti -p 45678:45678 -p 45679:45679 --name echo-server ubuntu:14.04 bash
# -p: publish
# <port1>:<port1> means we expose the same port number from inside the container onto the outside world
# inside we will use netcat to simply use the networking facilities
$ nc -lp 45678 | nc -lp 45679
# -lp: listen on port
# listen to port 45678 and write to 45679
# terminal 2
nc localhost 45678
# terminal 3
nc localhost 45679
# Now, whatever we write from terminal 2 is printed onto terminal 3.
# if we don't have nc installed, we can execute it within a container
docker run --rm -ti ubuntu:14.04 bash
$ nc host.docker.internal 45678
The above example works on Mac; in order to specify the host when using multiple containers in other OS, see here.
We do it dynamically:
- the port inside the container is fixed
- the port on the host is chosen from unused ports
- this allows many containers running programs with fixed ports, without us having to set each and every one.
This is useful in service discovery programs like Kubernetes.
e.g.:
# terminal 1
docker run --rm -ti -p 45678 -p 45679 --name echo-server ubuntu:14.04 bash
$ nc -lp 45678 | nc -lp 45679
# terminal 2
docker port echo-server # it's easier to look up a named container
# prints something like (Docker picks the host ports, so yours will differ):
45678/tcp -> 0.0.0.0:32769
45679/tcp -> 0.0.0.0:32768
# so now we can
nc localhost 32769
# terminal 3
nc localhost 32768
And it works like before.
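Looking up the dynamically chosen port by hand can be scripted. The sketch below assumes the echo-server container from the example is running; the function name host_port is made up for illustration.

```shell
# Hypothetical helper: print the host port Docker picked for a container port.
host_port() {
  local container="$1" port="$2"
  # `docker port <container> <port/protocol>` prints lines such as 0.0.0.0:49153;
  # keep the first line and strip everything up to the last colon.
  docker port "$container" "$port" | head -n 1 | sed 's/.*://'
}
```

With it, terminal 2 could run nc localhost "$(host_port echo-server 45678/tcp)" without copying the number manually.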
# terminal 1
docker run -p <outside-port>:<inside-port>/protocol (tcp/udp)
# e.g.
docker run -p 1234:1234/udp
# we then need to alter the nc command
$ nc -ulp 1234
# terminal 2
nc -u localhost 1234
docker network ls
# prints:
NETWORK ID NAME DRIVER SCOPE
17776875348d bridge bridge local # default
2eefced1748c host host local # for containers that need to
# not be isolated via network
03dc35a5003d none null local # no networking
...
docker network create <some-name>
docker run --rm -ti --net <some-name> --name <server-name> ...
# e.g.
docker network create learn
docker run --rm -ti --net learn --name catserver ubuntu:14.04 bash
$ ping catserver # verify we can send packets to ourselves
We can also add more servers in the same network
docker run --rm -ti --net learn --name dogserver ubuntu:14.04 bash
Now, catserver and dogserver can communicate.
Expanding the above example we create a new network:
docker network create catsonly
docker network connect catsonly catserver
We now connect anew
docker run --rm -ti --net catsonly --name bobcatserver ubuntu:14.04 bash
$ ping catserver # works
$ ping dogserver # traffic not allowed
This does not stop catserver from pinging both bobcatserver and dogserver, as it belongs to both networks. dogserver can still only connect to catserver.
- one way links
- secret environment variables shared only one way
- startup order is important
- restarts only sometimes break the links
Like above, we start catserver and we also create an environment variable named FOO.
docker run --rm -ti -e FOO=bar --name catserver ubuntu:14.04 bash
$ env # returns FOO=bar
$ nc -lp 1234
Now we connect to this server like so
docker run --rm -ti --link catserver --name dogserver ubuntu:14.04 bash
$ env # returns a copy of this variable, named CATSERVER_ENV_FOO
$ nc catserver 1234
Now our servers can exchange messages. We cannot, however, connect from catserver to dogserver, as this functionality is only one way.
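The name of the copied variable follows a visible pattern in the example: the linked container's name in upper case, then _ENV_, then the original variable name. A small sketch of that pattern follows; the function name link_env_name is hypothetical and it only covers simple names like the ones used here.

```shell
# Hypothetical helper: build the variable name seen in the linking container,
# following the CATSERVER_ENV_FOO pattern from the example above.
link_env_name() {
  local alias_upper
  alias_upper="$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
  printf '%s_ENV_%s\n' "$alias_upper" "$2"
}

link_env_name catserver FOO   # prints CATSERVER_ENV_FOO
```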
Note: Summing up the SIZE column does not return the total MB used by Docker, as some images share bytes.
docker commit <sha> my-image # auto tags as latest
docker commit <sha> my-image:v2.1 # tags as v2.1; creates new image
registry.example.com:port/organization/image-name:version-tag
# we can leave out the parts we don't need
# usually just organization/image-name is enough
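To make the naming structure concrete, here is a sketch that extracts the tag from such a name and falls back to latest when none is given. The function name image_tag is made up for illustration, and references by digest (name@sha256:...) are not handled.

```shell
# Hypothetical helper: print the tag part of an image name, or "latest" if absent.
image_tag() {
  local ref="$1" last
  # Only look after the final slash, so a registry port (host:5000/...) is not
  # mistaken for a tag.
  last="${ref##*/}"
  case "$last" in
    *:*) printf '%s\n' "${last##*:}" ;;
    *)   printf 'latest\n' ;;
  esac
}

image_tag registry.example.com:5000/organization/image-name:v2.1   # prints v2.1
image_tag organization/image-name                                  # prints latest
```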
docker pull # run automatically by docker run, useful for offline work
docker rmi image-name:tag # or
docker rmi image-id
Volumes act like shared folders, virtual discs to store and share data.
Two types:
- persistent (when the container goes away, the data remains)
- ephemeral (exist as long as a container is using them)
They are not part of images; they're for our own local data (local to this host)
For a folder:
[outside docker] mkdir example
[outside docker] docker run -ti -v /path/to/example:/shared-folder ubuntu bash
# -v: specify path to folder we wish to share, colon, path within the container where we wish to store this folder.
[within docker] $ ls /shared-folder/
# yields nothing
[within docker] $ touch /shared-folder/my-data
# exit
[outside docker] ls example/
# returns my-data
For a file, we perform the same process, but the file must exist before we start the container, or it will be assumed to be a directory.
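A sketch of the single-file case, assuming Docker is installed. The helper name run_with_shared_file and the container path /shared-file are assumptions for illustration; the point is that the file is created on the host before the container starts.

```shell
# Hypothetical helper: share a single host file with a container at /shared-file.
run_with_shared_file() {
  local host_file="$1" dir
  shift
  # Create the file first; if it is missing, Docker assumes a directory.
  [ -e "$host_file" ] || touch "$host_file"
  # Use an absolute host path; a bare relative name would be taken as a volume name.
  dir="$(cd "$(dirname "$host_file")" && pwd)"
  docker run --rm -ti -v "$dir/$(basename "$host_file"):/shared-file" "$@"
}
```

For example, run_with_shared_file example/my-data ubuntu bash opens a shell where /shared-file is the host's my-data file.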
We use volumes-from:
- shared discs that exist only as long as they are being used
- can be shared between containers
# terminal 1 #
docker run -ti -v /shared-data ubuntu bash # create a volume for container not shared with host
$ echo hello > /shared-data/data-file
# store data into it
# terminal 2
docker run -ti --volumes-from <container-name> ubuntu bash
$ ls /shared-data/
# returns the data-file
# we can add more here and it will be visible on terminal 1's container
After exiting the original container while keeping the second one, we still have access to our data. We can add another container in a similar manner and it will also be able to view these files, even though the original container is gone. When all containers using the volume are gone, this data is gone.
- They manage & distribute images
- Docker (the company) offers these for free, we can also run our own if we desire.
docker search <term>
e.g.:
$ docker search ubuntu
NAME DESCRIPTION STARS OFFICIAL AUTOMATED
ubuntu Ubuntu is a Debian-based Linux operating sys… 15370 [OK]
websphere-liberty WebSphere Liberty multi-architecture images … 290 [OK]
ubuntu-upstart DEPRECATED, as is Upstart (find other proces… 112 [OK]
neurodebian NeuroDebian provides neuroscience research s… 97 [OK]
ubuntu/nginx Nginx, a high-performance reverse proxy & we… 71
open-liberty Open Liberty multi-architecture images based… 56 [OK]
ubuntu/apache2 Apache, a secure & extensible open-source HT… 51
ubuntu-debootstrap DEPRECATED; use "ubuntu" instead 49 [OK]
ubuntu/squid Squid is a caching proxy for the Web. Long-t… 46
ubuntu/mysql MySQL open source fast, stable, multi-thread… 40
ubuntu/bind9 BIND 9 is a very flexible, full-featured DNS… 34
ubuntu/prometheus Prometheus is a systems and service monitori… 33
ubuntu/postgres PostgreSQL is an open source object-relation… 22
ubuntu/kafka Apache Kafka, a distributed event streaming … 17
ubuntu/redis Redis, an open source key-value store. Long-… 15
ubuntu/prometheus-alertmanager Alertmanager handles client alerts from Prom… 8
ubuntu/grafana Grafana, a feature rich metrics dashboard & … 6
ubuntu/memcached Memcached, in-memory keyvalue store for smal… 5
ubuntu/zookeeper ZooKeeper maintains configuration informatio… 5
ubuntu/dotnet-runtime Chiselled Ubuntu runtime image for .NET apps… 5
ubuntu/dotnet-deps Chiselled Ubuntu for self-contained .NET & A… 5
ubuntu/telegraf Telegraf collects, processes, aggregates & w… 4
ubuntu/cortex Cortex provides storage for Prometheus. Long… 3
ubuntu/dotnet-aspnet Chiselled Ubuntu runtime image for ASP.NET a… 3
ubuntu/cassandra Cassandra, an open source NoSQL distributed … 2
There is also the OFFICIAL flag which helps differentiate between images.
We can also visit the Docker Hub Webpage, containing even more info for each image.
If we already have an account, we can do the following
docker login # username, password prompts
docker pull debian:sid # pull an image
docker tag debian:sid username/<name>:<tag-version> # add tag
docker push username/<name>:<tag-version> # push to repo
- Don't push images containing passwords or other sensitive info to Docker Hub
- Cleanup images regularly (helps keep track with obsolete dependencies, too)
- Be careful what you trust!
- Small programs that describe the creation of an image.
- We run these with
docker build -t <name-of-result> <path/to/directory/containing/the/Dockerfile>
- When it finishes, the result will be in our local docker registry
- Each line takes the image from the previous line & makes another image; the previous image is unchanged, i.e. a line does not edit the state from the previous line
- We do not want large files to span lines, lest our image be huge eventually.
- Since each step is being cached, watch the build output for "Using cache". So, Docker can skip lines that have remained unchanged since the last build.
- TIP: Put the parts that change at the end of the Dockerfile.
- They are NOT shell scripts, albeit looking like them.
- If we want to keep environment variables, we have to use the ENV command; the variable will then be set from the next line on.
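The caching tip can be illustrated with a Dockerfile whose stable lines come first. This is only a sketch: the directory name cache-demo is made up, and the file contents reuse the nano and notes.txt idea from these notes.

```shell
# Write a hypothetical Dockerfile into ./cache-demo to illustrate line ordering.
mkdir -p cache-demo
cat > cache-demo/Dockerfile <<'EOF'
FROM debian:sid
# Rarely changes: these lines stay cached between builds.
RUN apt-get -y update
RUN apt-get install -y nano
# Changes often: keep it last so only this line and the ones after it are rebuilt.
ADD notes.txt /notes.txt
CMD ["nano", "/notes.txt"]
EOF
```

After placing a notes.txt next to it, running docker build -t cache-demo cache-demo twice should show the RUN steps being served from the cache on the second build, even if notes.txt was edited in between.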
- Create a file named Dockerfile in its own directory (why? because everything in that directory is sent to the Docker daemon as the build context).
- Put this in it:
FROM busybox
RUN echo "building simple docker image."
CMD echo "Hello Container"
- Run:
docker build -t hello .
This prints something like:
Sending build context to Docker daemon 2.048kB
Step 1/3 : FROM busybox
latest: Pulling from library/busybox
2123501b93d4: Pull complete
Digest: sha256:05a79c7279f71f86a2a0d05eb72fcb56ea36139150f0a75cd87e80a4272e4e39
Status: Downloaded newer image for busybox:latest
---> 827365c7baf1
Step 2/3 : RUN echo "building simple docker image."
---> Running in 12470d849d80
building simple docker image.
Removing intermediate container 12470d849d80
---> c7ebfbb9c45e
Step 3/3 : CMD echo "Hello Container"
---> Running in cc339230761d
Removing intermediate container cc339230761d
---> 57a23769c50f
Successfully built 57a23769c50f
Successfully tagged hello:latest
- Now run:
docker run --rm hello
This prints:
Hello Container
In your Dockerfile, put:
FROM debian:sid
RUN apt-get -y update
RUN apt-get install -y nano
CMD ["/bin/nano", "/tmp/notes"]
# CMD "nano" "/tmp/notes"
Then run
docker build -t example/nanoer .
In your Dockerfile, put:
FROM example/nanoer
ADD notes.txt /notes.txt
CMD ["nano", "/notes.txt"]
Then:
nano notes.txt
# add something in the file
Lastly:
docker build -t example/notes .
docker run -ti --rm example/notes
We can now access notes.txt
- It dictates which image to download & start from.
- It must be the first command in the file.
- Multiple can exist in the same file (produces multiple images).
- syntax:
FROM java:8
- Defines the author of the file
- syntax:
MAINTAINER fname lname <email@email.com>
- Runs the command line, waits for it to finish & saves the result
- syntax:
RUN unzip install.zip -d /opt/install/
- or
RUN echo hello docker
- Adds local files e.g.
ADD run.sh /run.sh
- Decompresses tar contents to directory e.g.
ADD foo.tar.gz /install/
- Downloads from URL and places in directory e.g.
ADD https://url/to/file /folder/
- Sets environment variables both during the build and when running the result
- syntax:
ENV FOO=bar
- ENTRYPOINT specifies the start of the command to run (arguments given at run time are added to it)
- CMD specifies the entire command to run (arguments given at run time replace it)
- These two can coexist.
- If your container acts like a command-line program, you can use ENTRYPOINT
- If you're unsure, use CMD
Shell form:
nano notes.txt
Exec form: (a bit more efficient as it directly runs the command without the need for a shell)
["nano", "/notes.txt"]
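How the two combine can be tried with a tiny image. This is a sketch under assumptions: the directory and image name entrypoint-demo and the helper try_entrypoint_demo are made up, and Docker must be installed to call the helper.

```shell
# Write a hypothetical Dockerfile that uses ENTRYPOINT and CMD together.
mkdir -p entrypoint-demo
cat > entrypoint-demo/Dockerfile <<'EOF'
FROM busybox
# ENTRYPOINT is the fixed start of the command.
ENTRYPOINT ["echo"]
# CMD supplies default arguments, which `docker run` arguments replace.
CMD ["Hello Container"]
EOF

# Hypothetical helper: build the image, then run it with and without arguments.
try_entrypoint_demo() {
  docker build -t entrypoint-demo entrypoint-demo
  docker run --rm entrypoint-demo           # runs: echo "Hello Container"
  docker run --rm entrypoint-demo Goodbye   # runs: echo Goodbye
}
```

Both instructions use the exec form here, so no shell is involved when the container starts.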
- Declares a port that the container listens on (it still has to be published with -p or -P when the container is run)
- syntax:
EXPOSE 8080
- Defines shared or ephemeral volumes
- syntax:
VOLUME ["/shared-data/"]
, which defines a volume that can be inherited later by containers
- Every argument is a path inside the container, so
VOLUME ["/data-1/", "/data-2/"]
defines two volumes; a host path cannot be mapped from a Dockerfile, only at run time with docker run -v /host/path:/container/path
- It's good to avoid defining shared folders in Dockerfiles, in order to be able to share them between many computers.
- Sets both the working directory for the rest of the Dockerfile as well as for the resulting container
- syntax:
WORKDIR /dir/
- Sets which user the container will run as
- syntax:
USER arthur
- or
USER 1000
- Useful for shared networks with specific user names.
Dockerfile:
FROM ubuntu:16.04
RUN apt-get update
RUN apt-get -y install curl
RUN curl https://google.com | wc -c > google-size
# this simply calculates google's home page character count
ENTRYPOINT echo google is this big; cat google-size
Run:
docker build -t tooo-big .
Now:
docker run tooo-big
This image is about 170MB. This can be improved. Therefore we can split the Dockerfile like so:
# this line changed: the first stage is now named "builder"
FROM ubuntu:16.04 AS builder
RUN apt-get update
RUN apt-get -y install curl
RUN curl https://google.com | wc -c > google-size
# add this: a second, small stage that copies only the result
FROM alpine
COPY --from=builder /google-size /google-size
# up to here
ENTRYPOINT echo google is this big; cat google-size
Rebuild (with a new name), run anew & check again: the new image size is 4.41MB.
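One way to compare the sizes is to ask docker images for just the name and size columns. A minimal sketch, assuming Docker is installed; the helper name image_sizes is made up, and the image names passed to it are whatever you used when building.

```shell
# Hypothetical helper: print "repository:tag size" for each given image name.
image_sizes() {
  local image
  for image in "$@"; do
    docker images --format '{{.Repository}}:{{.Tag}} {{.Size}}' "$image"
  done
}
```

For example, image_sizes tooo-big <new-name> prints one line per image, so the two builds can be compared side by side.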
Prevent the "Golden Image" problem:
- Include installers in your project.
- Have a canonical build system that builds everything completely from scratch.
- Tag your builds with the git hash of the code that built it.
- Use small images, like Alpine.
- Always build images you share publicly from Dockerfiles.
- Do not share passwords! Delete any file that might contain sensitive data in the same step that created it, since a file deleted in a later step still exists in the earlier image.
Kernels:
- Respond to messages from hardware
- Start & schedule programs
- Control & organize devices & storage units
- Pass messages between programs
- Allocate resources (memory, CPU etc)
The kernel being handled by Docker:
- Docker is written in Go
- It manages kernel features
- uses "cgroups" to contain processes
- uses "namespaces" to contain networks
- uses "copy-on-write" filesystems to build images
Docker does not introduce new features, it simplifies scripting distributed systems.
Docker is divided into the client & the server. The server receives commands over a socket (either over a network or through a "file"). Therefore, the client can run within the server, as well.
If we wish to identify the name of the root process inside of a container, we run:
docker inspect
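docker inspect prints a large JSON document, so a format template helps to pull out only the main process. A sketch, assuming Docker is installed; the helper name container_main_process is made up, while .State.Pid and .Path are fields of the inspect output.

```shell
# Hypothetical helper: print the main process of a container as "<host pid> <command path>".
container_main_process() {
  # .State.Pid: process id as seen by the Docker host (0 if the container is stopped)
  # .Path: the executable the container was started with
  docker inspect --format '{{.State.Pid}} {{.Path}}' "$1"
}
```

For example, container_main_process <container-name> on a container started with bash should report bash as the command path.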
These can be run in Docker:
docker run -d -p 5000:5000 --restart=always --name registry registry:2
# --restart=always: always keep it running
# registry version 2 is current
docker tag ubuntu:14.04 localhost:5000/my-company/my-ubuntu:99
docker push localhost:5000/my-company/my-ubuntu:99
# pushed a copy of ubuntu we have locally
# so even if the original is gone, we have this
Be mindful of authentication when deploying.
- locally (backup!)
- Docker Trusted Registry
- Elastic Container Registry
- Google Cloud Container Registry
- Azure Container Registry
- docker save & docker load (also useful for migration)
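A sketch of a local backup with docker save and docker load, assuming Docker is installed. The helper names backup_image and restore_image are made up for illustration.

```shell
# Hypothetical helpers: back up an image to a tar archive and restore it later.
backup_image() {
  local image="$1" archive="$2"
  docker save -o "$archive" "$image"   # -o: write to a file instead of stdout
}

restore_image() {
  docker load -i "$1"                  # -i: read from a file instead of stdin
}
```

For example, backup_image ubuntu:14.04 ubuntu-14.04.tar writes the archive, which can be copied to another machine and brought back with restore_image ubuntu-14.04.tar.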
- We need multiple containers realistically
- Therefore we need to orchestrate them & their communications
- Service discovery
- Resource allocation
- ideal for testing & development
- single machine coordination
- not good for scaling
- brings everything with one command:
docker compose up
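For a sense of what docker compose up reads, here is a sketch that writes a minimal docker-compose.yml. The directory name compose-demo, the service names and the sleep command are assumptions for illustration, and the file assumes a Compose version that accepts a top-level services key without a version line.

```shell
# Write a hypothetical two-service Compose file into ./compose-demo.
mkdir -p compose-demo
cat > compose-demo/docker-compose.yml <<'EOF'
services:
  catserver:
    image: ubuntu:14.04
    # Keep the container alive so there is something to connect to.
    command: sleep 3600
  dogserver:
    image: ubuntu:14.04
    command: sleep 3600
EOF
```

Running docker compose up inside compose-demo should start both containers on a network created for the project, where each service is reachable by its service name.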
- containers run programs
- pods group containers together (docker compose but dynamic)
- services make pods discoverable by others
- labels are used for very advanced service discovery
- makes scripting large operations possible with the kubectl command
- flexible overlaying system
- runs on hardware, cloud etc
- Kubernetes in AWS
- Google Kubernetes Engine
- works similarly
- task definitions define a set of containers that always run together
- tasks make a container right now
- services expose tasks to the Net (and ensure that a task is running all the time)
- good integration with ELBs (Amazon Load Balancers)
- you create your own host instances
- make your instances start the agent & join the cluster
- passes the docker control socket to the agent
- provides docker repos
- it's easy to run your own repo alongside these
- tasks can be part of CloudFormation stacks (easier deployment)
This script formats the output of docker ps vertically (only works on bash):
export FORMAT="\nID\t{{.ID}}\nIMAGE\t{{.Image}}\nCOMMAND\t{{.Command}}\nCREATED\t{{.RunningFor}}\nSTATUS\t{{.Status}}\nPORTS\t{{.Ports}}\nNAMES\t{{.Names}}\n"
Use:
docker ps --format "$FORMAT"
[Terminal 1]
sudo apt install docker.io # on Debian/Ubuntu the engine package is named docker.io
sudo apt install docker-compose
[Terminal 2]
docker compose up # based on docker-compose.yml in corresponding folder
## 1
systemctl start docker
# check whether the docker group exists
grep docker /etc/group
# if group doesn't exist
sudo groupadd docker
## 2
# if group exists
# add user to group
sudo usermod -aG docker $USER
# log out, log in
## 3
# or pick up the new group in the current shell
newgrp docker
## 4
# verify changes
docker run hello-world
To run:
docker ps # get container's name e.g. postgresql
docker exec -it postgresql bash # opens bash
$ psql -U username dbname # opens postgres env (logged in based on credentials we gave), and we can perform the actions we need