ZapDos7/00_intro_docker.md

## 00_intro_docker.md

      
    Docker Notes


## 01_docker.md

      
    Docker

LinkedIn Learning course by Arthur Ulfeldt
Introduction

Docker "carves up" a system into sealed containers that run your code, each with its own processes etc. These containers are portable (so code can be executed anywhere). Docker builds these containers & offers a social platform to exchange these containers.
Docker is:

a client program (a command we run)
a server program (that listens for such commands and controls the Linux VM)
a program that builds the containers from the code
a service that distributes these containers online
a company that makes these

Docker needs a Linux server to manage.
A container is a self-contained, sealed unit of software & contains everything required to run the code, e.g. the OS, configs, processes, networking, dependencies. Multiple containers can coexist on the same machine and they do not affect each other as long as the OS is a Linux distro.
Image: every file that makes up just enough of the operating system to do what you need to do.
Multiple containers can derive from the same image but each one holds its own world, so if I made 2 containers from 1 image, and in 1 of these containers I create a file, that file does not exist in the other container or the image.
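The isolation described above can be pictured with a small Python model. This is only a sketch of the idea, not how Docker stores files: the `image` dictionary and the `new_container` helper are hypothetical names made up for the illustration.

```python
from collections import ChainMap

# Hypothetical stand-in for an image: path -> content, treated as read-only.
image = {"/bin/bash": "<binary>", "/etc/os-release": "ubuntu"}

def new_container(image_files):
    """Return a view whose writes land in a private layer above the image."""
    return ChainMap({}, image_files)

container_a = new_container(image)
container_b = new_container(image)

# Creating a file in one container only touches that container's own layer.
container_a["/tmp/my-file"] = "hello"

print("/tmp/my-file" in container_a)  # True
print("/tmp/my-file" in container_b)  # False
print("/tmp/my-file" in image)        # False
print("/bin/bash" in container_b)     # True, still read from the image
```

Both containers see every file of the image, but the new file exists only in the container that created it.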
Installation notes

Options:

Docker Desktop (for Linux, Windows or Mac)
Install Docker on Ubuntu

Note: Boot2Docker is obsolete and thus should be uninstalled from any systems.
Commands

Image to Container Commands

Verify the installation and print simple Docker info

docker run hello-world
Run a container from an image

docker run

# e.g.
docker run -ti ubuntu:latest bash   # creates a container
                                    # from the ubuntu:latest image
                                    # -ti: terminal interactive, i.e.
                                    # with an interactive terminal
                                    # that is running bash in it
                                    # if we omit the tag,
                                    # docker will autofill it 
                                    # as `latest` by default

docker run --rm             # we wish to run something in a container 
                            # but not keep the container
                            # after we're done
docker run ... bash -c "foo"  # run the command `foo`
                            # within the container
                            # (-c is an option of bash, not of docker)
                            # e.g.
docker run -ti ubuntu bash -c "sleep 3; echo all done"  # run an ubuntu container, sleep for 3 seconds,
                                                        # then print "all done" to the terminal

docker run -d ...           # -d for detached
                            # runs in the background
                            # prints the container id; otherwise use docker ps
We can exit with Ctrl + D
Docker Images Commands

Display all images

docker images               # displays all docker images
                            # existing locally:
                            # repository: where the image comes from
                            # tag: version number
                            # image id: internal Docker representation
                            # (an image may have no name,
                            # so we can refer to it by id 
                            # instead of repository:tag)
                            # created: date it was created
                            # size: image size in kB/MB
Display all running containers

docker ps                   # NOTE: image id != container id
docker ps -a                # displays all containers,
                            # even stopped ones
docker ps -l                # displays the last created container
Container to Image Commands

docker commit <container-id>                  # takes a stopped container and creates
                                              # a new image from it, returning:
                                              # sha256:<image-id>
                                              # we can then do this:
docker tag <image-id> <custom-image-name>     # give a human-readable name to an image
                                              # since the above process is common, 
                                              # we can instead do it in one command:
docker commit <container-name> <custom-image-name>
Running things in Docker


containers have a main process
the container stops when the main process stops
containers have names, if we do not give them one, Docker will generate one for us

Detached Containers Commands

# start detached
docker run -d -ti ubuntu bash   # prints the container id
# reattach to it
docker attach <container-name>  # or its id
To detach from a container and leave it running, we press Ctrl + P, then Ctrl + Q; this works even if we started the container attached.
Running more things in a container

docker exec -ti <cont-name> <process>   # start another process 
                                        # in an existing container
Useful for debugging and DB administration.
It cannot add ports, volumes etc.
This allows us to have multiple interaction points onto the same container. When the original container exits, the exec'd process dies alongside it.
Container Output Commands

# we run a container but have a typo in `ls` as `lose`
docker run --name example -d ubuntu bash -c "lose /etc/password"
# we check logs
docker logs example
# this returns:
# bash: lose: command not found
Output should not be too large.
Stop & Remove Containers

docker kill <container-name-or-id> # stops a container
docker rm   <container-name-or-id> # removes a container
It's good to keep track and rm unused containers as sometimes the names can be reused and it can lead to pseudo-conflicts.
Constrain Resources Commands

# memory
docker run --memory <max-allowed-memory> <image-name> <command>
# CPU
docker run --cpu-shares # relative to other containers
docker run --cpu-quota  # hard limit in general
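The difference between the relative and the hard CPU limit comes down to a little arithmetic, sketched here in Python. The container names are made up, and the defaults used (1024 shares per container, a 100000 microsecond period for `--cpu-period`) are Docker's documented defaults; treat the functions as an illustration, not as Docker's scheduler.

```python
def cpu_fractions(shares):
    """Split CPU time between busy containers in proportion to --cpu-shares."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}

def cpus_from_quota(quota_us, period_us=100_000):
    """Hard cap in CPUs: --cpu-quota microseconds out of every --cpu-period."""
    return quota_us / period_us

# Two hypothetical containers competing for the CPU:
# "web" gets about 2/3 of the time and "batch" about 1/3.
print(cpu_fractions({"web": 1024, "batch": 512}))

# A quota of 50000 with the default period caps a container at half a CPU.
print(cpus_from_quota(50_000))  # 0.5
```

Shares only matter when containers actually compete for the CPU, whereas the quota applies even on an otherwise idle machine.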
Tips & Lessons 1


Don't let containers fetch dependencies when they start.


If you're using something like Node.js and your container fetches its dependencies when it starts, then some day somebody is going to remove a library from the Node repos, and all of a sudden all your containers stop working throughout your whole system. Make your containers include their dependencies inside the image itself; it saves a lot of pain.


Don't leave important things in unnamed stopped containers.

Container Networking Commands

Programs in containers are isolated from the internet by default.
We can group containers into "private" networks so they can interact with each other via network, but not with the rest of the world.
We can expose ports (as many as we want) to let connections in, so that containers can connect with the rest of the world. This requires coordination between containers.
Example: A server container passing info between two clients

# terminal 1
docker run --rm -ti -p 45678:45678 -p 45679:45679 --name echo-server ubuntu:14.04 bash 
# -p: publish 
# -p <outside-port>:<inside-port> publishes the port inside the container on the given port of the host; here both numbers are the same

# inside we will use netcat to simply use the networking facilities
$ nc -lp 45678 | nc -lp 45679
# -lp: listen on port
# listen to port 45678 and write to 45679
# terminal 2
nc localhost 45678
# terminal 3
nc localhost 45679
# Now, whatever we write from terminal 2 is printed onto terminal 3.
# if we don't have nc installed, we can execute it within a container
docker run --rm -ti ubuntu:14.04 bash
$ nc host.docker.internal 45678
The above example works on Mac; to specify the host when using multiple containers on other operating systems, see here.
If we need to set more ports

We do it dynamically:

the port inside the container is fixed
the port on the host is chosen from unused ports
this allows many containers running programs with fixed ports, without us having to set each and every one.

This is useful in service discovery programs like Kubernetes.
e.g.:
# terminal 1
docker run --rm -ti -p 45678 -p 45679 --name echo-server ubuntu:14.04 bash 
$ nc -lp 45678 | nc -lp 45679
# terminal 2
docker port echo-server # it's easier to look up a named container 
# prints something like (the host ports are picked by Docker):
45678/tcp -> 0.0.0.0:32768
45679/tcp -> 0.0.0.0:32769
# so now we can
nc localhost 32768
# terminal 3
nc localhost 32769
And it works like before.
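When the host ports are assigned dynamically, a script has to look them up before it can connect. A minimal Python sketch of that lookup, assuming the output of `docker port` has the `container-port/protocol -> host-address:host-port` shape shown above; the sample text and its port numbers are invented for the illustration.

```python
def parse_port_output(text):
    """Map 'container-port/protocol' to the host port from `docker port` lines."""
    mapping = {}
    for line in text.strip().splitlines():
        container_side, host_side = line.split(" -> ")
        # The host side looks like '0.0.0.0:32768'; the port follows the last colon.
        mapping[container_side] = int(host_side.rsplit(":", 1)[1])
    return mapping

# Invented sample output; real host ports differ on every run.
sample = """
45678/tcp -> 0.0.0.0:32768
45679/tcp -> 0.0.0.0:32769
"""

ports = parse_port_output(sample)
print(ports["45678/tcp"])  # 32768
print(ports["45679/tcp"])  # 32769
```

In practice the text would come from running `docker port echo-server`, for example through `subprocess.run`.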
Exposing UDP Ports

# terminal 1
docker run -p <outside-port>:<inside-port>/<protocol>   # protocol: tcp or udp
# e.g.
docker run -p 1234:1234/udp
# we then need to alter the nc command
$ nc -ulp 1234
# terminal 2
nc -u localhost 1234
Docker's Network Features

Show existing networks

docker network ls
# prints:
NETWORK ID     NAME    DRIVER    SCOPE
17776875348d   bridge  bridge    local # default
2eefced1748c   host    host      local # for containers that need to
                                       # not be isolated via network
03dc35a5003d   none    null      local # no networking
...
Create a new network

docker network create <some-name>
Connect to a created network

docker run --rm -ti --net <some-name> --name <server-name> ...
# e.g.
docker network create learn
docker run --rm -ti --net learn --name catserver ubuntu:14.04 bash
$ ping catserver # verify we can send packets to ourselves
We can also add more servers in the same network
docker run --rm -ti --net learn --name dogserver ubuntu:14.04 bash
Now, catserver and dogserver can communicate.
Multiple networks

Expanding the above example we create a new network:
docker network create catsonly
docker network connect catsonly catserver
We now connect anew
docker run --rm -ti --net catsonly --name bobcatserver ubuntu:14.04 bash
$ ping catserver # works
$ ping dogserver # traffic not allowed
This does not stop catserver from pinging both bobcatserver and dogserver, as it belongs to both networks. dogserver can still only connect to catserver.
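The reachability rules of this example can be summarised as set membership. A small Python sketch, assuming the only rule is that two containers can talk when they share at least one network; the dictionary mirrors the `learn` and `catsonly` networks created above.

```python
# Network name -> containers attached to it, as set up in the example.
networks = {
    "learn": {"catserver", "dogserver"},
    "catsonly": {"catserver", "bobcatserver"},
}

def can_reach(a, b):
    """Two containers can talk if at least one network contains both."""
    return any(a in members and b in members for members in networks.values())

print(can_reach("bobcatserver", "catserver"))  # True
print(can_reach("bobcatserver", "dogserver"))  # False
print(can_reach("catserver", "dogserver"))     # True
```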
Legacy Linking


one way links
secret environment variables shared only one way
startup order is important
restarts only sometimes break the links

Example

Like above, we connect to catserver and we also create an environment variable named FOO.
docker run --rm -ti -e FOO=bar --name catserver ubuntu:14.04 bash
$ env # returns FOO=bar
$ nc -lp 1234
Now we connect to this server like so
docker run --rm -ti --link catserver --name dogserver ubuntu:14.04 bash
$ env # returns a copy of this variable, named CATSERVER_ENV_FOO
$ nc catserver 1234
Now our servers can exchange messages. We cannot, however, connect from catserver to dogserver, as this functionality is only one way.
Images

Note: Summing up the SIZE column does not return the sum of MB used by Docker, as some images share bytes.
Commit Images

docker commit <sha> my-image        # auto tags as latest
docker commit <sha> my-image:v2.1   # tags as v2.1; creates new image
Full name structure

registry.example.com:port/organization/image-name:version-tag
# we can leave out the parts we don't need
# usually just organization/image-name is enough
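How the optional parts of a name are told apart can be sketched in Python. This is a simplified illustration rather than Docker's own parser: it assumes the first component is a registry only when it looks like a host (it contains a dot or a colon, or is `localhost`), it defaults the tag to `latest`, and it ignores digests. The function name `parse_image_name` is made up.

```python
def parse_image_name(name):
    """Split an image name into (registry, repository, tag), simplified."""
    registry = None
    first, slash, rest = name.partition("/")
    # Assumption: the first component is a registry if it looks like a host.
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, name = first, rest
    repository, colon, tag = name.rpartition(":")
    if not colon:
        # No tag given: the whole remainder is the repository.
        repository, tag = name, "latest"
    return registry, repository, tag

print(parse_image_name("registry.example.com:5000/organization/image-name:v2.1"))
# ('registry.example.com:5000', 'organization/image-name', 'v2.1')
print(parse_image_name("organization/image-name"))
# (None, 'organization/image-name', 'latest')
print(parse_image_name("ubuntu:14.04"))
# (None, 'ubuntu', '14.04')
```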
Getting Images

docker pull # run automatically by docker run, useful for offline work
Cleaning Up Images

docker rmi image-name:tag # or
docker rmi image-id
Sharing data between containers or between containers and hosts: Volumes

Volumes act like shared folders, virtual discs to store and share data.
Two types:

persistent (when the container goes away, the data remains)
ephemeral (exist as long as a container is using them)

They are not part of images; they're for our own local data (local to this host)
Sharing Data with the Host

For a folder:
[outside docker] mkdir example
[outside docker] docker run -ti -v /path/to/example:/shared-folder ubuntu bash
# -v: specify path to folder we wish to share, colon, path within the container where we wish to store this folder.
[within docker] $ ls /shared-folder/
# yields nothing
[within docker] $ touch /shared-folder/my-data
# exit
[outside docker] ls example/
# returns my-data
For a file, we perform the same process, but the file must exist before we start the container, or it will be assumed to be a directory.
Sharing Data between Containers

We use --volumes-from:

shared discs that exist only as long as they are being used
can be shared between containers

# terminal 1 #
docker run -ti -v /shared-data ubuntu bash # create a volume for container not shared with host
$ echo hello > /shared-data/data-file
# store data into it
# terminal 2
docker run -ti --volumes-from <container-name> ubuntu bash
$ ls /shared-data/
# returns the data-file
# we can add more here and it will be visible on terminal 1's container
After exiting the original container, while keeping the second one, we still have access to our data. We can add another container in a similar manner and it'll also be able to view these files, even though the original container is gone. When all containers are gone, this data is gone.
Docker Registries

What are they?


They manage & distribute images
Docker (the company) offers these for free, we can also run our own if we desire.

How to Find Images

docker search <term>
e.g.:
$ docker search ubuntu
NAME                             DESCRIPTION                                     STARS     OFFICIAL   AUTOMATED
ubuntu                           Ubuntu is a Debian-based Linux operating sys…   15370     [OK]       
websphere-liberty                WebSphere Liberty multi-architecture images …   290       [OK]       
ubuntu-upstart                   DEPRECATED, as is Upstart (find other proces…   112       [OK]       
neurodebian                      NeuroDebian provides neuroscience research s…   97        [OK]       
ubuntu/nginx                     Nginx, a high-performance reverse proxy & we…   71                   
open-liberty                     Open Liberty multi-architecture images based…   56        [OK]       
ubuntu/apache2                   Apache, a secure & extensible open-source HT…   51                   
ubuntu-debootstrap               DEPRECATED; use "ubuntu" instead                49        [OK]       
ubuntu/squid                     Squid is a caching proxy for the Web. Long-t…   46                   
ubuntu/mysql                     MySQL open source fast, stable, multi-thread…   40                   
ubuntu/bind9                     BIND 9 is a very flexible, full-featured DNS…   34                   
ubuntu/prometheus                Prometheus is a systems and service monitori…   33                   
ubuntu/postgres                  PostgreSQL is an open source object-relation…   22                   
ubuntu/kafka                     Apache Kafka, a distributed event streaming …   17                   
ubuntu/redis                     Redis, an open source key-value store. Long-…   15                   
ubuntu/prometheus-alertmanager   Alertmanager handles client alerts from Prom…   8                    
ubuntu/grafana                   Grafana, a feature rich metrics dashboard & …   6                    
ubuntu/memcached                 Memcached, in-memory keyvalue store for smal…   5                    
ubuntu/zookeeper                 ZooKeeper maintains configuration informatio…   5                    
ubuntu/dotnet-runtime            Chiselled Ubuntu runtime image for .NET apps…   5                    
ubuntu/dotnet-deps               Chiselled Ubuntu for self-contained .NET & A…   5                    
ubuntu/telegraf                  Telegraf collects, processes, aggregates & w…   4                    
ubuntu/cortex                    Cortex provides storage for Prometheus. Long…   3                    
ubuntu/dotnet-aspnet             Chiselled Ubuntu runtime image for ASP.NET a…   3                    
ubuntu/cassandra                 Cassandra, an open source NoSQL distributed …   2
There is also the OFFICIAL flag which helps differentiate between images.
We can also visit the Docker Hub Webpage, containing even more info for each image.
If we already have an account, we can do the following
docker login                                            # username, password prompts
docker pull debian:sid                                  # pull an image
docker tag debian:sid username/<name>:<tag-version>     # add tag
docker push username/<name>:<tag-version>               # push to repo
Tips & Lessons 2


Don't push images containing passwords or other sensitive info to Docker Hub
Clean up images regularly (helps keep track of obsolete dependencies, too)
Be careful what you trust!

Building Docker Images

What are Dockerfiles?


Small programs that describe the creation of an image.
We run these with

docker build -t <name-of-result> <path/to/directory-containing-the-Dockerfile>

When it finishes, the result will be in our local docker registry
Each line takes the image from the previous line & makes another image; the previous image is unchanged, so a line does not edit the state left by the previous line.
We do not want large files to survive from one line to the next (download, use and delete them within a single line), lest our image be huge eventually.
Each step is cached, so watch the build output for "Using cache".
So, Docker can skip lines that have remained unchanged since the last build.
TIP: Put the parts that change at the end of the Dockerfile.
They are NOT shell scripts, albeit looking like them.
If we want to keep environment variables, we have to use the ENV statement; they will then be set on the following lines.
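Why the changing parts belong at the end of the Dockerfile becomes clearer with a toy model of the build cache, sketched in Python. It assumes a layer is identified only by its parent layer plus the text of the instruction; the real cache also looks at other things, such as the contents of added files. The `build` function and the file names are hypothetical.

```python
import hashlib

def build(lines, cache):
    """Return (layer_id, reused_from_cache) for each Dockerfile line."""
    parent, results = "", []
    for line in lines:
        # A layer is identified by its parent layer plus the instruction text.
        layer_id = hashlib.sha256((parent + "\n" + line).encode()).hexdigest()[:12]
        results.append((layer_id, layer_id in cache))
        cache.add(layer_id)
        parent = layer_id
    return results

cache = set()
build(["FROM debian:sid", "RUN apt-get -y update", "ADD notes.txt /notes.txt"], cache)

# Changing only the last line keeps the earlier layers cached.
late_change = build(
    ["FROM debian:sid", "RUN apt-get -y update", "ADD other.txt /other.txt"], cache)
print([hit for _, hit in late_change])   # [True, True, False]

# Changing the first line invalidates every line after it.
early_change = build(
    ["FROM debian:stable", "RUN apt-get -y update", "ADD notes.txt /notes.txt"], cache)
print([hit for _, hit in early_change])  # [False, False, False]
```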

Further reading
How to Create a Dockerfile?

A Simple Example


Create a file named Dockerfile in its own directory. (why?)
Put this in it:

FROM busybox
RUN echo "building simple docker image."
CMD echo "Hello Container"


Run:

docker build -t hello .
This prints something like:
Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM busybox
latest: Pulling from library/busybox
2123501b93d4: Pull complete 
Digest: sha256:05a79c7279f71f86a2a0d05eb72fcb56ea36139150f0a75cd87e80a4272e4e39
Status: Downloaded newer image for busybox:latest
 ---> 827365c7baf1
Step 2/3 : RUN echo "building simple docker image."
 ---> Running in 12470d849d80
building simple docker image.
Removing intermediate container 12470d849d80
 ---> c7ebfbb9c45e
Step 3/3 : CMD echo "Hello Container"
 ---> Running in cc339230761d
Removing intermediate container cc339230761d
 ---> 57a23769c50f
Successfully built 57a23769c50f
Successfully tagged hello:latest


Now run:

docker run --rm hello
This prints:
Hello Container
Installing a Program with Docker Build

In your Dockerfile, put:
FROM debian:sid
RUN apt-get -y update 
RUN apt-get -y install nano
CMD ["/bin/nano", "/tmp/notes"]
# CMD "nano" "/tmp/notes"

Then run
docker build -t example/nanoer .
Adding a File through Docker Build

In your Dockerfile, put:
FROM example/nanoer
ADD notes.txt /notes.txt
CMD ["nano", "/notes.txt"]

Then:
nano notes.txt
# add something in the file
Lastly:
docker build -t example/notes .
docker run -ti --rm example/notes
We can now access notes.txt
Syntax

The FROM statement


It dictates which image to download & start from.
It must be the first command in the file.
Multiple can exist in the same file (produces multiple images).
syntax: FROM java:8

The MAINTAINER statement


Defines the author of the file
syntax: MAINTAINER fname lname <email@email.com>

The RUN statement


Runs the command line, waits for it to finish & saves the result
syntax: RUN unzip install.zip -d /opt/install/
or RUN echo hello docker

The ADD statement


Adds local files e.g. ADD run.sh /run.sh
Decompresses tar contents to directory e.g. ADD foo.tar.gz /install/
Downloads from URL and places in directory e.g. ADD https://url/to/file /folder/

The ENV statement


Sets environment variables both during the build and when running the result
syntax: ENV FOO=bar

The ENTRYPOINT statement


ENTRYPOINT specifies the start of the command to run (arguments given to docker run are added to it)

CMD statement


CMD specifies the entire command to run (arguments given to docker run replace it)

ENTRYPOINT & CMD


These two can coexist.
If your container acts like a command-line program, you can use ENTRYPOINT.
If you're unsure, use CMD.
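How the two statements combine with the arguments of `docker run` can be sketched in Python. The sketch covers the exec form only and assumes no `--entrypoint` override; `final_command` is a made-up helper, not part of Docker.

```python
def final_command(entrypoint, cmd, run_args):
    """Compose the process a container starts (exec form only)."""
    # Arguments given to `docker run <image> ...` replace CMD, never ENTRYPOINT.
    return list(entrypoint) + list(run_args if run_args else cmd)

# Only CMD: it is the whole command, and run arguments replace it.
print(final_command([], ["nano", "/notes.txt"], []))           # ['nano', '/notes.txt']
print(final_command([], ["nano", "/notes.txt"], ["ls", "/"]))  # ['ls', '/']

# ENTRYPOINT plus CMD: CMD supplies default arguments that can be swapped out.
print(final_command(["nano"], ["/notes.txt"], []))              # ['nano', '/notes.txt']
print(final_command(["nano"], ["/notes.txt"], ["/tmp/other"]))  # ['nano', '/tmp/other']
```

With an ENTRYPOINT the image behaves like the program itself, so the arguments of `docker run` read like arguments to that program.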

Shell Form vs Exec Form

Shell form:
nano notes.txt
Exec form: (a bit more efficient as it directly runs the command without the need for a shell)
["nano", "/notes.txt"]
The EXPOSE statement


Declares a port that the container listens on (it still has to be published with -p when running the container)
syntax: EXPOSE 8080

The VOLUME statement


Defines shared or ephemeral volumes
syntax: VOLUME ["/shared-data/"], which defines a volume that can be inherited later by containers
A host path cannot be given here: VOLUME ["/path/one", "/path/two"] defines two separate volumes, and mapping a host path to a container path is done at run time with docker run -v /host/path:/container/path
It's good to avoid defining shared folders in Dockerfiles, so that the image stays usable on many computers.

The WORKDIR statement


Sets both the working directory for the rest of the Dockerfile and for the resulting container
syntax: WORKDIR /dir/

The USER statement


Sets which user the container will run as
syntax: USER arthur or USER 1000
Useful for shared networks with specific user names.

There's more!

Check it out.
Multiple-project Docker files

Dockerfile:
FROM ubuntu:16.04
RUN apt-get update
RUN apt-get -y install curl
RUN curl https://google.com | wc -c > google-size
# this simply calculates google's home page character count
ENTRYPOINT echo google is this big; cat google-size

Run:
docker build -t tooo-big .
Now:
docker run tooo-big
This image is about 170MB. This can be improved. Therefore we can split the Dockerfile like so:
# this line changed: the first stage is now named "builder"
FROM ubuntu:16.04 as builder
RUN apt-get update
RUN apt-get -y install curl
RUN curl https://google.com | wc -c > google-size
# add this
FROM alpine
COPY --from=builder /google-size /google-size
# up to here
ENTRYPOINT echo google is this big; cat google-size

Rebuild (with a new name), run anew & check again: the new image size is 4.41MB.
Tips & Lessons 3

Prevent the "Golden Image" problem:

Include installers in your project.
Have a canonical build system that builds everything completely from scratch.
Tag your builds with the git hash of the code that built it.
Use small images, like Alpine.
Always build images you share publicly from Dockerfiles.
Do not share passwords! Delete any file that might contain sensitive data in the same step that creates it, since the layers of earlier steps stay in the image.

Delve into the core

Kernels:

Respond to messages from hardware
Start & schedule programs
Control & organize devices & storage units
Pass messages between programs
Allocate resources (memory, CPU etc)

The kernel being handled by Docker:

Docker is written in Go
It manages kernel features

uses "cgroups" to contain processes
uses "namespaces" to contain networks
uses "copy-on-write" filesystems to build images


Docker does not introduce new features, it simplifies scripting distributed systems.
Docker is divided into the client & the server. The server receives commands over a socket (either over a network or through a "file"). Therefore, the client can run within the server, as well.
If we wish to identify the name of the root process inside of a container, we run:
docker inspect <container-name-or-id>
Registries


The official Python Docker Registry
Nexus

These can be run in Docker:
docker run -d -p 5000:5000 --restart=always --name registry registry:2
# --restart=always: always keep it running
# registry version 2 is current
docker tag ubuntu:14.04 localhost:5000/my-company/my-ubuntu:99
docker push localhost:5000/my-company/my-ubuntu:99
# pushed a copy of ubuntu we have locally 
# so even if the original is gone, we have this
Be mindful of authentication when deploying.
Storage options


locally (backup!)
Docker Trusted Registry
Elastic Container Registry
Google Cloud Container Registry
Azure Container Registry
docker save & docker load (also useful for migration)

Orchestration - Large Systems


We need multiple containers realistically
Therefore we need to orchestrate them & their communications
Service discovery
Resource allocation

Docker Compose


ideal for testing & development
single machine coordination
not good for scaling
brings everything with one command: docker compose up

Kubernetes


containers run programs
pods group containers together (docker compose but dynamic)
services make pods discoverable by others
labels are used for very advanced service discovery
makes scripting large operations possible with the kubectl command
flexible overlaying system
runs on hardware, cloud etc
Kubernetes in AWS
Google Kubernetes Engine

Amazon EC2 Container Service (ECS)


works similarly
task definitions define a set of containers that always run together
tasks run the containers of a task definition right now
services take tasks and expose them to the Net (ensuring that a task is running all the time)
good integration with ELBs (Amazon's Elastic Load Balancers)
you create your own host instances
make your instances start the agent & join the cluster
passes the docker control socket to the agent
provides docker repos
it's easy to run your own repo alongside these
tasks can be part of CloudFormation stacks (easier deployment)

Utilities

reformat.sh

This script formats the output of docker ps vertically (only works in bash):
export FORMAT="\nID\t{{.ID}}\nIMAGE\t{{.Image}}\nCOMMAND\t{{.Command}}\nCREATED\t{{.RunningFor}}\nSTATUS\t{{.Status}}\nPORTS\t{{.Ports}}\nNAMES\t{{.Names}}\n"
Use:
docker ps --format "$FORMAT"

  
## 02_quick.md

      
    Docker Quick Notes

[Terminal 1]
sudo apt install docker.io
sudo apt install docker-compose

[Terminal 2]
docker-compose up   # based on the docker-compose.yml in the corresponding folder

## 1
sudo systemctl start docker
# check whether the docker group exists
grep docker /etc/group
# if the group doesn't exist, create it
sudo groupadd docker

## 2
# add the user to the group
sudo usermod -aG docker $USER
# log out, log in

## 3
# or apply the new group in the current shell without logging out
newgrp docker

## 4
# verify the changes
docker run hello-world
To run:
docker ps                        # get container's name e.g. postgresql
docker exec -it postgresql bash  # opens bash
$ psql -U username dbname        # opens the postgres env (logged in with the credentials we gave), where we can perform the actions we need

  
## 99_sources.md

      
    Further reading...


Docker Containers commands
The Docker Handbook
Docker Extensions
Cheat Sheet