notes from Gobridge-sponsored Docker workshop
Taught by Jerome Petazzoni, 7 November 2015
>docker version #returns both the client and server versions of docker, Go, git, and OS
docker daemon and docker engine mean the same thing
membership in the docker group is root-equivalent, so you should restrict access to it, e.g.
>sudo groupadd docker
>sudo gpasswd -a $USER docker
>sudo service docker restart
>docker run busybox echo hello world
#busybox is ~ 2 MB and you'll find it on all kinds of embedded systems, like on wifi routers
>docker run -it ubuntu bash #-i means interactive, -t means terminal
>dpkg -l | wc -l #gives you a count of packages installed
>df -h #gives you info on how space is allocated
note: it can look like your container has more disk space than your host VM. This is an illusion! You'll only
find out the disk is full when you try to write to it. You can always add more to the thin-provisioning
system to get it to resume. (that's not specific to docker)
>apt-get update #get the updated packages list
>apt-get install curl
>curl canihazip.com/s #cute
>exit #container is in a stopped state
(but running the image again gives you a fresh container without your changes, so it's not like vagrant;
the stopped container itself keeps them until you remove it.)
pets vs. cattle:
a pet is like a named server that needs special care
cattle is like a bunch of servers with numbers instead of names, all treated the same
>docker run -d jpetazzo/clock #-d means daemonize, returns a unique id
>docker ps #status, name, created
#-q gives only the id
#-l gives only the last container
>docker logs <id> #gives you all the output of that container.
similar to git commit ids, can just do the first few characters, and that works.
>docker logs --tail <#lines> <id>
#it won't give you a useful error if you forget to add the # lines you want, it will just fail
>docker stop #graceful shut down
(if it doesn't work after 10 seconds, it will kill it.)
>docker kill #brute force, immediate termination.
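Putting the last few commands together (a sketch using the clock image from above):
>docker run -d jpetazzo/clock
>docker logs --tail 5 $(docker ps -lq) #last 5 lines from the most recent container
>docker stop $(docker ps -lq) #graceful; falls back to kill after 10 seconds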
Background vs. Foreground containers:
Docker considers them all the same, whether they are connected to anything or not
Foreground just means you're attached to it
If attached in -it mode, can detach using an escape sequence:
> control-P control-Q (control will show up as a caret symbol)
(doesn't work if you're not in -it mode)
NOTE! If you just want to see what's going on, use docker logs, not docker attach.
If you're attached and send control-C to kill the view, you also kill the container.
workflow:
1. find the container with what you want by doing docker ps -a
2. docker start <id>
3. docker attach <id>
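For example, to get back into the ubuntu container you just exited (a sketch; $(docker ps -lq) is the most recent container):
> docker start $(docker ps -lq)
> docker attach $(docker ps -lq)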
Image: a read-only set of files that form the root filesystem of a container, + metadata.
Similar to classes in OOP.
Containers are instances.
Images are made of layers. Layers are similar to inheritance.
Each layer can add, change and remove files.
Images can share layers to optimize disk space, transfers, and memory use.
You never change an image: you create a container from it, make whatever changes you want,
commit those changes as a new layer,
and that gives you a new image with the new layer on top.
> docker commit #creates a new layer and a new image from a container
> docker build
> docker import #load a standalone base layer to bootstrap a new image
3 namespaces for images:
1) root-like (official images, e.g. ubuntu)
2) user (e.g. jpetazzo/clock)
3) self-hosted, e.g. localhost:5000/imagename
There are official images for various distros and packages, as well as language images
> docker images #will list the ones on your machine
> docker search <thing you want>
#automated images are always available, built by Docker hub
Explicitly install:
> docker pull <imagename> #also how you update images. defaults to latest.
> docker pull ubuntu:12.04 #can optionally add the version you want.
#can check docker hub to see all available tags.
Implicit installation happens if you try to run an image you don't already have.
> docker diff #to see how your image differs from the base image
> docker commit <id> <newname>
> docker tag <id> <newname> #if you forget to add a new name when you commit,
or want to change the name, use this to create a human-readable identifier
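A sketch of that flow, reusing the curl install from earlier ('mycurl' is a made-up image name):
>docker run -it ubuntu bash #inside: apt-get update && apt-get install -y curl, then exit
>docker commit $(docker ps -lq) mycurl
>docker run -it mycurl bash #curl is already there in the new image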
A Dockerfile is mostly just a list of shell commands (run via RUN),
often with an extra -y argument for 'yes' so installs continue automatically:
1) must be in its own directory: this is how we ensure isolation of the 'build context'.
Will only use what's in that directory.
2) must be named Dockerfile
3) must have a FROM statement, and it must be the first thing that's not a comment
# the only time you'd want more than one FROM statement
is for unittests using a database or other dependencies
The second FROM instruction marks the beginning of building a new image.
If the tests fail, the run command will fail.
4) note that RUN is not for starting services.
for that you want CMD and/or ENTRYPOINT. RUN is for installing libraries.
Can string them together with &&.
In the beginning, better to do one per line b/c it's faster for rebuilding as you iterate.
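For example, once a Dockerfile has stabilized you might collapse the install steps into a single RUN (sketch):
RUN apt-get update && apt-get install -y curl wget
#during development, separate lines mean only the changed step and the ones after it get rebuilt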
tip: he recommends the alpine image when you're developing, b/c it's so fast
optional:
5) EXPOSE any ports you need
6) ADD any remote files you need <- this is cached
7) VOLUME any data you might need
8) WORKDIR
9) ENV <-- can overwrite this with docker run -e
10) USER <-- if you want someone other than root
11) CMD <-- default command run when the container starts up
12) ONBUILD <-- sets instructions to be executed when this image is used as a template for another image.
Note that ONBUILD can't be used in a chain, and can't be used to trigger FROM or MAINTAINER instructions.
ex. usage would be a vulnerability check, or to check that the image is still up to date
(I didn't know that python, ruby, and golang all have 'onbuild' equivalents)
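Rough sketch of what an ONBUILD base image might look like (names are hypothetical; this mirrors what the official 'onbuild' variants do):
FROM python:2.7
ONBUILD COPY requirements.txt /app/
ONBUILD RUN pip install -r /app/requirements.txt
ONBUILD COPY . /app/
A child image then only needs FROM <that base image>, and these steps run automatically when the child is built.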
basic Dockerfile example:
FROM ubuntu
RUN apt-get update
RUN apt-get install -y wget
> docker build -t <tag> . #do this in the same directory where the Dockerfile lives
Docker uses the exact strings defined in your Dockerfile, so:
RUN apt-get install -y wget curl *is different from*
RUN apt-get install -y curl wget
The example we did actually builds three images:
each line is one image, with the differences layered on top of each other
You can force a rebuild with:
> docker build --no-cache
To see the layers in your image, sort of like git log of commits:
> docker history <myimagename>
>wget -O- #second dash means write to standard output
CMD in your Dockerfile will run when you run the image,
if you don't specify something else to run
e.g.
FROM ubuntu
RUN apt-get update
RUN apt-get install -y wget
CMD wget -O- -q http://canihazip.com/s
both CMD and ENTRYPOINT can take a JSON list of strings (exec form);
the command then runs directly, with each string as one argument, instead of going through a shell.
For ENTRYPOINT, the list syntax is mandatory if you want CMD arguments appended to it.
FROM ubuntu
RUN apt-get update
RUN apt-get install -y wget
ENTRYPOINT ["wget", "-O-", "-q"]
Can use them together,
i.e. ENTRYPOINT can specify the base command, and CMD can define the default parameters
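A sketch of the two together, building on the wget example (the URL in CMD is just a default, replaced by anything you pass to docker run):
FROM ubuntu
RUN apt-get update
RUN apt-get install -y wget
ENTRYPOINT ["wget", "-O-", "-q"]
CMD ["http://canihazip.com/s"]
> docker run <image> #fetches the default URL
> docker run <image> http://example.com #replaces only the CMD part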
____
On Debian and Ubuntu, the build-essential package gets you a compiler
COPY a script and run it:
FROM ubuntu
RUN apt-get update
RUN apt-get install -y build-essential
COPY hello.c /
RUN make hello
CMD /hello
It's very smart, if you rebuild,
it will compare the hash for the file it's calling with COPY, to see if it has changed.
original:
#include <stdio.h>
int main () {
    puts("Hello, world!");
    return 0;
}
modified:
#include <stdio.h>
int main () {
    puts("Hello, San Francisco!");
    return 0;
}
> docker build -t hello .
> docker run hello
You can COPY whole directories recursively
Older Dockerfiles can also have ADD (similar, but can automatically extract archives)
If we actually wanted to compile C code,
we would put it in a different directory with the WORKDIR instruction.
There is also a gcc official image.
Docker Hub is the registry + some fancy features
you can always have it locally:
> docker run registry #but it will be very raw
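A rough sketch of pushing the hello image built above to a local registry (assuming the default registry port 5000):
> docker run -d -p 5000:5000 registry
> docker tag hello localhost:5000/hello
> docker push localhost:5000/hello
> docker pull localhost:5000/hello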
Naming containers is good for referencing and for ensuring that it's unique.
If you don't specify a name, docker will create one from 1) a mood 2) a famous inventor.
unicity: when you want to be sure that a specific container is running only once.
> docker run -d --name ticktock jpetazzo/clock #-d means 'detached', i.e. run in the background
You can also rename a container
> docker rename
> docker inspect <container> #returns JSON with all kinds of log info
> docker inspect <container> | jq . #cute JSON colored layout, among other things (see man jq for more info)
> docker inspect --format '{{ json .Created }}' <container> #using Go's text/template package
#the dot refers to the current object, here the container's inspect data
#json is an optional keyword that forces the output to be valid JSON
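A couple more format strings that might be handy (same inspect JSON, different fields):
> docker inspect --format '{{ .NetworkSettings.IPAddress }}' <container> #just the container's IP
> docker inspect --format '{{ .State.Running }}' <container> #true or false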
> docker run -d -P jpetazzo/web #-P means publish-all, accessible to the outside world
0.0.0.0:32768->8000/tcp
#if I go to (or curl) the address on the left,
I'm actually talking to port 8000 inside the container
(note: use the host's real IP address instead of 0.0.0.0 if the Docker host is remote!)
Can't hand out public IPv4 addresses, there aren't enough.
Have to have private addresses and expose services on specific ports.
Q: why not just pass the ports straight through 8000-> 8000?
A: forcing specific ports avoids conflicts
to find the port number it's allocating (it's pseudo-random):
> docker port <container> 8000
If you want to specify a port yourself:
> docker run -t -p 80:8000 jpetazzo/web #80 on the host, 8000 in the container
#lowercase p to explicitly map a port,
uppercase P if you want to publish
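Putting the -p form together ('web1' and host port 8080 are just picked for this sketch):
> docker run -d --name web1 -p 8080:8000 jpetazzo/web
> curl localhost:8080 #hits port 8000 inside the container (use the host's IP instead of localhost if remote)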
Three options for connecting containers:
1) retrieve assigned port and feed it to your configuration.
2) pick a fixed port and set it manually when you start your container.
3) use an overlay network, like a VPN
Someone asked about user namespaces and security.
Inside the container, docker thinks it's root, but it doesn't really have
full privileges. Jerome says he has other presentations on security re: container isolation etc.
4 kinds of network models:
1) default network model for a container is the 'bridge' model:
a virtual switch; the container gets only eth0 and lo, the loopback interface
2) the 'none' model means 'no network connection', e.g. for unittests:
> docker run --net none -it ubuntu
3) 'host' model: shares all the network interfaces of the host,
good for really high traffic applications where you can't have any delays
another time you might want that is if you need to expose a bunch of ports while the application is running
4) 'container' model: demo on this went by kind of fast and I couldn't quite get it to work (rough sketch below)
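Rough sketch of the 'container' model (the name 'web' is made up): the second container reuses the first one's network stack, so localhost inside it reaches the web server.
> docker run -d --name web jpetazzo/web
> docker run -it --net container:web ubuntu bash
#inside, after apt-get update && apt-get install -y curl:
#curl localhost:8000 answers from the web container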
_____
If you want to stop all running containers:
> docker kill $(docker ps -q)
Use -v to make a directory available from the host to the container (NOT A COPY!):
>docker run -d \
-v $(pwd):/opt/namer \
-p 80:9292 \
training/namer
#it's using a bind mount, sort of like a symlink but even more direct
(just as fast as if you accessed from the host)
Pseudo-ssh to be used in emergencies if you need to run a new process
inside a container which is already running (gives you a shell prompt):
> docker exec -ti <container> bash #ns-enter
When you do docker commit, what's inside the volume is left out.
Volumes are what you want for sharing a file or a directory between multiple containers.
for that you'd use the option --volumes-from
example to check logs instead of doing docker exec to get logs,
e.g. if you want to do analytics on them:
> docker run -it --name alpha -v /var/log ubuntu
> docker run --volumes-from alpha ubuntu cat /var/log/now
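(for that cat to return anything, the file has to be created first, e.g. from the shell you get in the alpha container:)
date > /var/log/now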
This seems like a clever trick to keep data separate and avoid bloating the container's writable layer.
In addition to the -v flag, you can also do a VOLUME line in the Dockerfile
VOLUME /var/lib/postgresql
or
> docker run -d -v /var/lib/postgresql training/postgresql
New in docker 1.9: To see what volumes you have:
> docker volume ls #have to use inspect if you don't know the ID
To name a volume so it's easier to find again:
> docker volume create --name preciousredis
#named volumes actually live under Docker's data directory (e.g. /var/lib/docker/volumes), not in your current directory
> docker run -v preciousredis:/data -d -p 2222:6379 redis
can get rid of the volume:
> docker volume rm <full id>
there's also a plugin that lets you control the path and move the volume around
(https://github.com/ClusterHQ/flocker)
https://docs.clusterhq.com/en/1.6.1/ <-- includes tutorials
you can chain volumes together to reuse
When you remove containers, volumes stay behind,
but if you remove all the containers that access a volume,
it will be orphaned and you won't be able to access it anymore.
>docker pull docker
>docker run -ti -v /var/run/docker.sock:/var/run/docker.sock docker sh
Now you're in a docker container, talking to docker outside, using the docker CLI
you can also share a device: --device
clean up all containers:
>docker ps -q | xargs docker rm -f
create a data container:
>docker pull redis:latest
>docker run -d --name myredis redis
Share the data container:
> docker run -it --link myredis:redis ubuntu #syntax is name:alias
> docker run -d -P --link myredis:redis nathanleclaire/redisonrails
> docker ps #to figure out what the port is
then go to IP:port number to see the app is running.
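To see what the link actually gives you inside the consuming container (a sketch; the alias 'redis' comes from name:alias above):
> docker run -it --link myredis:redis ubuntu bash
#inside: cat /etc/hosts shows an entry for 'redis' pointing at myredis's IP
#inside: env | grep REDIS shows variables like REDIS_PORT_6379_TCP_ADDR and REDIS_PORT_6379_TCP_PORT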
The downside of links is that they only work between containers on the same host.
Ambassadors: for service portability
Allows you to manage services without hard-coding connection information
Instead of actually connecting containers directly,
you connect to ambassadors and they talk to each other
(love the diagram for this in the slides)
Ambassadors keep track of where things are,
so even if the database container is moved, the web application container
can still connect, without reconfiguration
it's service discovery - they have a separate workshop on how to set this all up
1) start all the containers
2) start all the ambassadors
3) inject the configuration in all the ambassadors
Once it's all set up, you run 3 scripts and it takes 10 seconds to do it all.
You can use something like Zookeeper to track container locations
and generate HAproxy configurations
Alternatively, you can put TLS everywhere
Or you can use a master-less discovery/broadcast protocol like avahi (similar to apple's Bonjour)
Or, you can just configure all the containers and then do docker inspect to see where everything is.
Can set that up with no extra stuff, but harder to maintain.
Ambassadors are good for failover so you can switch to replicas
Dockerfiles are what you use to configure a single container.
Docker Compose is what you use if you want a bunch of containers to be connected
(and they can span multiple machines).
it's an external tool written in python, formerly known as 'fig' (a two-person UK company that Docker acquired)
YAML file called docker-compose.yml
if you look in an example docker-compose.yml file,
sometimes it says 'build', other times it just says 'image'
(you have to have one of these in each section of the file)
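A minimal sketch of such a file (hypothetical 'web' service built from the local Dockerfile, plus a stock redis image; compose v1 syntax from that era):
web:
  build: .
  ports:
    - "80:9292"
  links:
    - redis
redis:
  image: redis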
see e.g.
>git clone git://github.com/jpetazzo/dockercoins
>cd dockercoins/
>docker-compose up
see http://docs.docker.com/compose/yml/
his example jpetazzo/trainingwheels app is good for checking requests from different backends,
e.g. for checking on your load-balancing
note that 'docker-compose scale' will create a bunch of copies,
but doesn't do any load-balancing (you have to add that)
example of a bad Dockerfile pattern:
don't RUN pip install -r requirements.txt without first using COPY to provide the requirements.txt file by itself,
so Docker can check whether that one file has changed and reuse the cached layer
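The cache-friendly pattern looks something like this (app.py and the /app paths are hypothetical):
FROM python:2.7
COPY requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt
COPY . /app
WORKDIR /app
CMD python app.py
#editing application code only invalidates the layers from the second COPY onward; the pip install layer stays cached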
I asked re: separate git branches, he said:
copy the repo into a new folder
do docker-compose up -d
it will automatically prefix the container names with the folder name
and then do docker-compose ps to see which port is mapped where
(or, maybe even better, just specify the ports in the YAML file)