Learning Docker

Orchestration: Building systems with Docker

Registries

  • is a program that keeps your docker images / layers safe
    • metadata around images
  • listens on port 5000 (usually)
  • maintains an index and searches tags
  • authorizes and authenticates connections
  • there are a few choices
    • official Python Docker registry
    • Nexus
  • docker makes installing network services reasonably easy; the registry itself runs as a docker service
  • allows you to save images locally and no longer depend on those in the cloud
    • could be lost or changed
  • must set up authentication prior to exposing it to any network
  • example: docker run -d -p 5000:5000 --restart=always --name registry registry:2
    • --restart=always: if the container dies, it will restart
    • -p 5000:5000 the port exposed to the host machine
    • registry:2 use the image registry version 2
  • then tag an image to push example: docker tag ubuntu:14.04 localhost:5000/mycompany/my-ubuntu:99
    • ubuntu:14.04 use this image to tag
    • localhost:5000/mycompany/my-ubuntu:99 tagged with name of server running registry / name for organization / image name
  • finally push it example: docker push localhost:5000/mycompany/my-ubuntu:99
  • storage options for preserving image
    • locally
    • docker trusted registry
    • amazon elastic container registry
    • google cloud container registry
    • azure container registry

saving and loading containers

  • docker save
    • saving images locally is good for when traveling, making backups, or when shipping images to customers
    • docker save -o my-images.tar.gz image-1 image-2 image-3
  • docker load
  • good for migrating between storage types
  • good way to get images to other people
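  • a minimal sketch of the save / load round trip (image names here are just examples):
docker save -o my-images.tar.gz ubuntu:16.04 busybox:latest
docker load -i my-images.tar.gz
    • docker load -i reads the tarball back into the local image store, e.g. on another machine or after a backup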

Under the Hood

Docker the program

  • the kernel runs directly on the hardware
    • responds to messages from the hardware
    • starts and schedules programs
    • controls and organizes storage
    • passes messages between programs
    • allocates resources, memory, CPU, network, etc
  • docker manages the kernel
    • program written in Go
    • manages kernel features
    • uses "control groups" to contain processes
    • uses "namespaces" to contain networks
    • uses "copy-on-write" filesystems to build images
    • used for years before docker
  • makes scripting distributed systems "easy"
  • give docker access to its own server socket:
    • docker run -ti --rm -v /var/run/docker.sock:/var/run/docker.sock docker sh
    • once inside, do docker run -ti --rm ubuntu bash
    • will create a new container from a client within a container, but is not a container within a container
    • is a client within a docker container controlling a server that is outside that container
    • flexibility on where you can control docker from!

Networking and Namespace

  • networking is divided into many layers
    • ethernet: moves frames on a wire
    • IP layer: moves packets on a local network
    • Routing: forwards packets between networks
    • Ports: address particular programs on a computer

bridges

  • docker uses bridges to create virtual networks inside the computer
  • when you create a private network in docker it creates a bridge
  • function like software switches, used to control the ethernet layer
  • example: docker run -ti --rm --net=host ubuntu:16.04 bash
    • --net=host gives full access to the host's networking stack and turns off all protections
    • run apt-get update && apt-get install bridge-utils
    • run brctl show to display the network bridges within docker
    • in another terminal run docker network create new-network and look at the output id
    • run brctl show again and see the first part of that network string in a new bridge name
  • docker doesn't magically move packets; it creates bridges to move packets around the system
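  • putting the steps above together, a rough sequence to watch a bridge appear (the network name is just an example):
docker run -ti --rm --net=host ubuntu:16.04 bash
apt-get update && apt-get -y install bridge-utils   # inside the container
brctl show                                          # list the bridges currently on the host
docker network create my-new-network                # in another terminal, on the host
brctl show                                          # a new bridge appears, named after the network id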

routing

  • creates firewall rules to move packets between sockets
    • uses built-in firewall features of the linux kernel
  • called NAT (network address translation)
    • change the source address on the way out
    • change the destination address on the way back in
    • can look at these for a docker container with sudo iptables -n -L -t nat
  • example: docker run -ti --rm --net=host --privileged=true ubuntu bash
    • --privileged=true further turns off safeties; lets the container have full privileges over the machine hosting it
  • exposing ports in docker is actually just port forwarding at the networking layer
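  • a hedged example of seeing that in action (image, name, and ports are illustrative): publish a port, then dump the nat table:
docker run -d -p 8080:80 --name web nginx
sudo iptables -n -L -t nat
    • the DOCKER chain should show a DNAT rule forwarding host port 8080 to port 80 inside the container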

namespaces

  • allow processes to be attached to private network segments
  • private networks are bridged into a shared network with the rest of the containers
  • containers have virtual network "cards"
  • containers get their own copy of the networking stack
    • isolated to each container so they cannot reach in and reconfigure other containers

Processes and cgroups

  • processes come from other processes
    • parent-child relationship
  • when a child process exits, it returns an exit code to the parent
  • the init process (PID 1) is special; it is the process that starts the rest
  • in docker, a container starts with an init process and vanishes when that process exits
    • it can start with that and divide into other processes that do any number of things
    • will often start with a shell, and that shell splits off and runs other commands and processes
  • you can check out the processes inside a container using: docker inspect --format '{{.State.Pid}}' [container_name]
    • inspect gives you the ability to look at any piece of information in a program, programmatically
    • provides the jQuery-like syntax to drill down through the infrastructure of that container to extract information
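  • a small sketch of using that (the container name sleeper is just an example):
docker run -d --name sleeper ubuntu sleep 1000
docker inspect --format '{{.State.Pid}}' sleeper              # prints the container's init process id as seen by the host
ps -fp $(docker inspect --format '{{.State.Pid}}' sleeper)    # view that same process with the host's ps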

resource limiting

  • schedule CPU time, memory allocation limits, inherited limitations and quotas
  • a process cannot escape its cgroup to consume more resources than it was allotted

Storage

  • actual storage devices
  • local storage devices
    • layers to partition drives into groups, then partition groups
    • docker uses this extensively
  • file systems
    • track which bits on the drive are part of which file
  • FUSE filesystems and network filesystems
    • programs that pretend to be file systems

Copy on Write

  • secret to docker images
  • can piece together separate pieces of an image to build a container that sees a whole image
  • allows flexibility in using images that have / lack certain things until brought together in containers
    • building blocks
  • instead of writing directly to an image, write to new layer
  • moving
    • contents of layers are moved between containers in gzip files
    • containers are independent of the storage engine
    • any container can be loaded anywhere
    • it is possible to run out of layers on some storage engines

volumes and bind mounting

  • linux VFS (virtual file system)
  • mounting devices on the VFS
    • all start with / as the root
    • attach devices along the root of that tree
  • can mount directories as well
    • this will temporarily cover up part of the FS
  • use mount -o bind [source-directory] [target-directory]
    • will take the contents of source and place them over target
    • can undo with umount [target-directory] (see the sketch after this list)
  • mount order matters
    • must do folder first, then file within the folder
  • mounting volumes - always mounts the host's filesystem over the guest's
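  • a quick sketch of the bind mount above on a linux host (paths are examples):
mkdir /tmp/source /tmp/target
touch /tmp/source/hello.txt
sudo mount -o bind /tmp/source /tmp/target    # /tmp/target now shows hello.txt
sudo umount /tmp/target                       # undo; /tmp/target shows its original contents again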

docker control socket

  • docker is two programs: client and a server
  • the server receives commands over a socket (either network or through a file)
  • client can even run inside docker itself

running docker locally

  • traditionally have a docker client program and a docker server program with a socket communicating between them
  • all inside of a linux docker host
  • these client and server programs can create and delete containers
  • can also have the docker client program run inside of a container, have the socket go into that container and out again for the docker server program

Building Docker Images

What are dockerfiles?

  • small program that describes how to build a docker image
  • run them with docker build -t [name] [location_of_dockerfile]
  • when it finishes building, the result will be in the local docker registry
  • each line in this file takes the image from the previous line and makes another image
    • previous image is unchanged
  • state is not carried from line to line
    • if you start a program in one line, will only run for the duration of that line
  • don't want large files to span lines or your image will be huge
    • careful with operations on large files spanning docker file lines
  • each step is cached, will use cache if is available
    • saves huge amount of time
  • place the parts of the code that are changed the most often at the end of the dockerfile
  • dockerfiles are not shell scripts
    • processes started on one line don't continue onto the next
  • ENV command will preserve environment variables
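  • a tiny illustration of both points, per-line state and ENV (my own example, not from the course):
FROM busybox
RUN export MESSAGE=hello        # gone once this line finishes; each RUN is its own process
ENV GREETING=hello              # preserved in the image and at runtime
CMD echo "$GREETING, container" # prints "hello, container"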

Build dockerfile

  • example: simple dockerfile
 FROM busybox
 RUN echo "building simple docker image."
 CMD echo "hello container"
  • running docker build -t hello . results in:
Step 1/3 : FROM busybox
latest: Pulling from library/busybox
61c5ed1cbdf8: Pull complete
Digest: sha256:4f47c01fa91355af2865ac10fef5bf6ec9c7f42ad2321377c21e844427972977
Status: Downloaded newer image for busybox:latest
 ---> 018c9d7b792b
Step 2/3 : RUN echo "building simple docker image."
 ---> Running in 8da204a982bb
building simple docker image.
Removing intermediate container 8da204a982bb
 ---> a962b3a3aeea
Step 3/3 : CMD echo "hello container"
 ---> Running in 417eb2dcefb5
Removing intermediate container 417eb2dcefb5
 ---> c09da6cf06c2
Successfully built c09da6cf06c2
Successfully tagged hello:latest
  • install a program with docker build:
FROM debian:sid
RUN apt-get -y update
RUN apt-get -y install nano
CMD "nano" "/tmp/notes"
  • creates an image that will start you off inside the nano editor

dockerfile syntax

  • FROM
    • which image to download and start from
    • must be the first command in file
  • MAINTAINER
    • who to contact if there are issues / bugs
  • RUN
    • runs the command line, waits for it to finish, and saves result
  • ADD
    • adds local files, the contents of tar archives, or URLs
  • ENV
    • sets env variables during build and when running the resulting image
  • ENTRYPOINT
    • specifies start of command to run
    • gets added to when people run containers from this image
    • if you want container to act like a command-line program
  • CMD
    • specifies the whole command to run
    • sets the program to run when the container starts
    • gets replaced when people run containers from this image
  • if you have both CMD and ENTRYPOINT they get strung together (see the sketch after this list)
  • most times will want CMD
  • EXPOSE
    • maps port into a container
  • VOLUME
    • defines a shared or ephemeral volume
    • avoid defining shared folders in dockerfiles
  • WORKDIR
    • changes the directory both for the rest of the dockerfile and in the finished image
    • sets the directory the container starts in
    • it's like typing cd
  • USER
    • set commands to be run as a particular user
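  • a small sketch of ENTRYPOINT and CMD strung together (my own example):
FROM ubuntu:16.04
ENTRYPOINT ["ls"]        # fixed start of the command
CMD ["-l", "/tmp"]       # default arguments; replaced by whatever is passed to docker run
    • docker run [image] runs ls -l /tmp, while docker run [image] /etc runs ls /etc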

Multi-project docker files

  • complete vs small
  • multistage dockerfiles help with this
  • example:
FROM ubuntu:16.04
RUN apt-get update
RUN apt-get -y install curl
RUN curl https://google.com | wc -c > google-size
ENTRYPOINT echo google is this big; cat google-size
  • then run in directory docker build -t tooo-big . followed by docker run tooo-big
    • the image size is 171MB, pretty large, can be helped with splitting up the build
  • example:
FROM ubuntu:16.04 as builder
RUN apt-get update
RUN apt-get -y install curl
RUN curl https://google.com | wc -c > google-size
# we don't actually need ubuntu in the end, so copy that work into a minimal base image
FROM alpine
COPY --from=builder /google-size /google-size
ENTRYPOINT echo google is this big; cat google-size
  • image size is now 5.57MB!

prevent golden image

  • a golden image is one that replaces a canonical build with a locally-modified revision; prevent them:
  • include installer for project
  • have canonical build that builds everything from scratch
  • tag the build with the git hash of the code that built it (see the sketch after this list)
  • use small base images, such as alpine
  • build something from dockerfiles, always
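  • one possible way to do the git-hash tagging (a sketch, not the course's exact commands; registry and image names are illustrative):
docker build -t registry.example.com:5000/mycompany/myapp:$(git rev-parse --short HEAD) .
docker push registry.example.com:5000/mycompany/myapp:$(git rev-parse --short HEAD)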

Using Docker

What is docker?

  • a self-contained unit of software
  • has everything inside it to run code
  • is not a virtual machine; containers share one machine (the linux host)
  • OS and batteries included
    • code, configs, processes, networking, dependencies
  • each container is its own world
  • docker manages: sets up / tears down these containers
  • is a client program used in the terminal
  • a server program that manages a linux system
  • a service that distributes these containers

Using Docker

  • docker flow: images -> containers

images

  • run docker images to look at them
  • repository: where it came from
  • tag: the version
  • image ID: internal docker representation of the image
  • a fixed point you can always start from

running an image

  • run docker run -ti ubuntu:latest bash
    • ti = terminal interactive
    • ubuntu:latest latest ubuntu env image
    • bash start with bash
  • run exit to leave the container
  • run docker ps to see the running containers
  • id shows the container id
  • image is the image it came from
  • command any command it was created with

containers

  • run docker ps -a to show all containers, including stopped ones
  • images and container ids are different, not interchangeable
  • when a container is made from the image, the image is not changed
    • can muck up the container without changing the image
  • run docker commit [containerId] if want to save the container to a new image
    • will return a commit sha with an id of the new image sha256:93e9830c76720584d780f1d4b230082ae93019255162642e6b0582c52473a1c6
    • can take that id from the sha and create a tag to name the image
    • docker tag [imageId] [tag-name]
    • will then show up on docker images
  • a shortcut for above would be docker commit [container-name] [new-image-name]

Running Processes in containers

  • docker run
    • containers have main processes
    • container stops when that process stops
  • --rm will remove the container after process exits
  • docker run -ti ubuntu bash -c "sleep 3; echo all done"
    • -c a command to use
    • sleep hangs out for given seconds
    • ; tells bash: when done with the first command, move on to the next
  • -d can leave containers running in the background
  • docker attach [container-name] will go into a detached container
  • control + p, control + q will detach from a running container
  • docker exec [container-name]
    • starts another process in an existing container
    • good for debugging and DB admin
    • cant add ports, volumes, etc
    • when the original container stops, so does the exec'd process
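  • a small sketch of exec-ing into a running container (names are examples):
docker run -d --name background-worker ubuntu sleep 1000
docker exec -ti background-worker bash    # a second process (a shell) inside the existing container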

Managing Containers

  • docker logs [container-name]
    • keeps output of containers
    • don't let logs get too large
  • docker kill [container-name] stops the container
    • will still exist unless removed
  • docker rm [container-name] removes the container
  • resource constraint
    • docker run --memory [maximum-allowed-memory] [image-name] [command] can place memory limits
    • docker run --cpu-shares sets CPU priority relative to other containers
    • docker run --cpu-quota sets a hard limit on CPU use in general
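  • illustrative values only, assuming the current docker run flags:
docker run -ti --memory 512m ubuntu bash        # hard memory cap of 512 MB
docker run -ti --cpu-shares 512 ubuntu bash     # CPU weight relative to other containers (default 1024)
docker run -ti --cpu-quota 50000 ubuntu bash    # roughly half of one CPU (quota is microseconds per 100ms period)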

general lessons

  • don't let containers fetch dependencies when they start
    • include dependencies inside the container!
  • don't leave important things in unnamed stopped containers

Exposing Ports

  • programs in containers are isolated from the internet by default
  • can group containers into "private" networks
  • can explicitly choose who can connect to whom
  • exposed ports allow connections in
  • private networks to connect between containers
  • explicitly specify the port inside and outside the container to listen on
  • expose as many ports as desired
  • requires coordination between containers
  • makes it easy to find exposed ports
  • docker run --rm -ti -p 45678:45678 -p 45679:45679 --name echo-server ubuntu:14.04 bash
    • will publish two ports 45678 and 45679 that link to the same inside and out
    • --name gives it a name
    • nc -lp 45678 | nc -lp 45679 will use netcat to listen to the port 45678 and pipe the contents over to port 45679
    • can then use nc localhost 45679 and nc localhost 45678 to begin passing around contents between terminal panes
  • containers are not allowed to directly address another container by ip address
    • given a special name, host.docker.internal (on macOS), that resolves to the host, so containers can reach one another through the host's published ports
    • nc host.docker.internal 45678 inside a docker container to listen to an exposed port on another container

dynamically exposing ports

  • port inside a container is fixed
  • port on the host is chosen from unused ports
  • allows many containers running programs with fixed ports
  • often used with a service discovery program such as kubernetes
  • docker run --rm -ti -p 45678 -p 45679 --name echo-server ubuntu:14.04 bash
    • exposed fixed port on inside, dynamic on outside
  • docker port [container-name] shows the dynamically assigned ports
docker port echo-server                                                                     
45678/tcp -> 0.0.0.0:32769
45679/tcp -> 0.0.0.0:32768
  • can also use other protocols like udp
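  • for example, a udp port can be published by appending /udp (values are illustrative):
docker run --rm -ti -p 45678:45678/udp --name echo-server-udp ubuntu:14.04 bash
    • inside, nc -ulp 45678 listens on udp; from the host, nc -u localhost 45678 connects to it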

Container Networking

  • when we expose a port in docker it creates a network path through the Docker host from the outside of that machine through the network layers back into the container
    • there are more efficient ways of doing this
    • other containers can connect by going out to the host, turning around, and going back along that path
  • can connect directly between containers with virtual networking
  • docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
e9123c871d78        bridge              bridge              local
322307a8309e        host                host                local
e48bf2be1246        meue_default        bridge              local
a4fd9cdcdfe6        none                null                local
  • bridge - used by containers that don't specify a preference to be put into any other network
  • host - no network isolation at all; does have security concerns
  • none - no networking
  • docker network create [network-name]
    • creates a network for containers to use
  • docker run --rm -ti --net learning --name catserver ubuntu:14.04 bash
    • giving the server a name makes it much easier to reach over the network (full sequence sketched below)
  • ping [container-name] from another container on the same network to check that it is reachable
  • docker network connect [network-name] [docker-container-server-name]
    • will connect a server container you created to a network you created
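  • a sketch of the whole flow (the catserver / dogserver names mirror the bullet above; the image choice is an assumption):
docker network create learning
docker run --rm -ti --net learning --name catserver ubuntu:14.04 bash
docker run --rm -ti --net learning --name dogserver ubuntu:14.04 bash   # in a second terminal
ping catserver    # from inside dogserver; containers on the same network resolve each other by name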

Images

  • docker images lists images already downloaded
  • the sizes shown may not add up to be the total space actually used
    • docker images may share a lot of the same resources, space efficient
  • tagging images gives them names
  • docker commit will tag an image for you with latest, or you can give it a tag with docker commit [container-id] [imagename]:[tag]
  • example structure: registry.example.com:portNumber/organization/imageName:version-tag
  • images come from docker pull
    • run automatically by docker run
  • images can build up quickly
  • docker rmi [imageName]:[tag] removes it from the system

Volumes

  • virtual discs to store and share data between containers and hosts
  • two varieties
    • persistent
    • ephemeral: exist only as long as a container is using them
  • not part of images, your local host data

sharing between host & container

  • shared folders / single file with host
    • a single file must exist before starting the container, or it will be assumed to be a directory
  • docker run -ti -v [path/to/shared/files]:[path/inside/container/to/file] ubuntu bash
    • you can then add things into this folder while in the container and the same contents will appear in the shared location
  • works with single file as well

sharing between containers

  • volumes-from argument
  • shared discs only so long as they are being used, ephemeral
  • docker run -ti --volumes-from [container_name] [image] [command] (see the sketch below)
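  • a short sketch (the container name and path are examples):
docker run -ti --name vol-source -v /shared-data ubuntu bash   # creates an ephemeral volume at /shared-data
docker run -ti --volumes-from vol-source ubuntu bash           # in another terminal; sees the same /shared-data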

docker registries

  • manage and distribute images
  • docker makes these free
  • can run your own private company registry
  • docker login and docker push to get images up there
  • docker search [phrase] to search from the commandline
  • docker push [image_name]:[version_tag] is a way to share images to the registry (see the sketch below)
  • don't push images containing passwords
  • clean images regularly
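  • a hedged sketch of the push workflow to Docker Hub (names and tags are examples):
docker login
docker tag my-image:v1.0 mycompany/my-image:v1.0
docker push mycompany/my-image:v1.0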