This guide takes an incremental approach to running a full R project with Docker, from installing Docker to creating your own containerized R project.
This guide doesn't go in depth on how Docker works. We'll cover a concept only when it's needed for a particular step. To learn more about Docker, see the Docker Getting Started Overview.
- Install via your system's package manager if possible; Docker is popular enough to be widely packaged.
- Docker can also be installed via script, following the instructions at docker/docker-install.
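For the script route, a minimal sketch might look like the following. It assumes a Linux host with curl and sudo available, and it assumes the convenience script is served at https://get.docker.com as described in the docker/docker-install README; the function name is hypothetical. Reading the downloaded script before running it is a sensible precaution.

```shell
# Download Docker's convenience install script, then run it as root.
# Assumption: the script is published at https://get.docker.com.
install_docker_via_script() {
  curl -fsSL https://get.docker.com -o get-docker.sh &&
    sudo sh get-docker.sh
}
```

Nothing happens until `install_docker_via_script` is called, which leaves room to inspect get-docker.sh first.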
Let's verify that the Docker command-line interface (CLI) is installed. Run:
docker --help
Expected output:
Usage: docker [OPTIONS] COMMAND
A self-sufficient runtime for containers
Options:
--config string Location of client config files (default
"/Users/georgep/.docker")
-c, --context string Name of the context to use to connect to the
daemon (overrides DOCKER_HOST env var and
default context set with "docker context use")
-D, --debug Enable debug mode
-H, --host list Daemon socket(s) to connect to
-l, --log-level string Set the logging level
("debug"|"info"|"warn"|"error"|"fatal")
(default "info")
--tls Use TLS; implied by --tlsverify
--tlscacert string Trust certs signed only by this CA (default
"/Users/georgep/.docker/ca.pem")
--tlscert string Path to TLS certificate file (default
"/Users/georgep/.docker/cert.pem")
--tlskey string Path to TLS key file (default
"/Users/georgep/.docker/key.pem")
--tlsverify Use TLS and verify the remote
-v, --version Print version information and quit
Management Commands:
builder Manage builds
config Manage Docker configs
container Manage containers
context Manage contexts
image Manage images
network Manage networks
node Manage Swarm nodes
plugin Manage plugins
secret Manage Docker secrets
service Manage services
stack Manage Docker stacks
swarm Manage Swarm
system Manage Docker
trust Manage trust on Docker images
volume Manage volumes
Commands:
attach Attach local standard input, output, and error streams to a running container
build Build an image from a Dockerfile
commit Create a new image from a container's changes
cp Copy files/folders between a container and the local filesystem
create Create a new container
diff Inspect changes to files or directories on a container's filesystem
events Get real time events from the server
exec Run a command in a running container
export Export a container's filesystem as a tar archive
history Show the history of an image
images List images
import Import the contents from a tarball to create a filesystem image
info Display system-wide information
inspect Return low-level information on Docker objects
kill Kill one or more running containers
load Load an image from a tar archive or STDIN
login Log in to a Docker registry
logout Log out from a Docker registry
logs Fetch the logs of a container
pause Pause all processes within one or more containers
port List port mappings or a specific mapping for the container
ps List containers
pull Pull an image or a repository from a registry
push Push an image or a repository to a registry
rename Rename a container
restart Restart one or more containers
rm Remove one or more containers
rmi Remove one or more images
run Run a command in a new container
save Save one or more images to a tar archive (streamed to STDOUT by default)
search Search the Docker Hub for images
start Start one or more stopped containers
stats Display a live stream of container(s) resource usage statistics
stop Stop one or more running containers
tag Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE
top Display the running processes of a container
unpause Unpause all processes within one or more containers
update Update configuration of one or more containers
version Show the Docker version information
wait Block until one or more containers stop, then print their exit codes
Run 'docker COMMAND --help' for more information on a command.
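If the full help text is more than you need, a shorter check is possible. The sketch below only tests whether a `docker` executable is on the PATH and prints a one-line version; the variable name `docker_cli_status` is hypothetical.

```shell
# Check whether a docker executable is on the PATH.
if command -v docker >/dev/null 2>&1; then
  docker_cli_status="installed"
  # Print a single version line instead of the full help text
  docker --version
else
  docker_cli_status="missing"
fi
echo "docker CLI: ${docker_cli_status}"
```

This only confirms that the client is installed; it says nothing about whether the daemon is running.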
TODO: See "Step 2 — Executing the Docker Command Without Sudo (Optional)" on How To Install and Use Docker on Ubuntu 18.04.
By default, the docker command can only be run by the root user or by a user in the docker group, which is automatically created during Docker’s installation process. If you attempt to run the docker command without prefixing it with sudo or without being in the docker group, you’ll get [an error].
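A quick way to see which case applies to you is to check your own user and group membership. This is a sketch for a Linux install (Docker Desktop on macOS does not rely on a docker group); the function name is hypothetical.

```shell
# Succeeds if the current user is root or belongs to the docker group.
can_run_docker_without_sudo() {
  [ "$(id -u)" -eq 0 ] && return 0
  # id -nG prints the user's group names separated by spaces
  id -nG | tr ' ' '\n' | grep -qx docker
}

if can_run_docker_without_sudo; then
  echo "docker should work without sudo"
else
  echo "prefix docker commands with sudo, or join the docker group"
  # A common way to join the group (assumes usermod is available;
  # takes effect after logging out and back in):
  #   sudo usermod -aG docker "$USER"
fi
```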
Daemon = Technical jargon term for a long-running background process.
The Docker daemon is controlled via the dockerd command (docs).
But to simply check whether the Docker daemon is running, we don't need dockerd. Any Docker action that relies on the daemon will do.
Let's verify the daemon is running by listing the currently running containers:
docker container ls
If the Docker daemon is running, we should see an empty container listing: just the column headings of a table that currently has no rows.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
If the docker daemon isn't running, we will see the following error.
Error response from daemon: dial unix docker.raw.sock: connect: connection refused
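The same check can be scripted. The sketch below relies on `docker info`, which needs a reachable daemon, and treats a non-zero exit status as "not running". That reading of the exit status is an assumption worth confirming on your Docker version, and the function name is hypothetical.

```shell
# Print "running" if the Docker daemon answers, otherwise "not running".
docker_daemon_status() {
  if docker info >/dev/null 2>&1; then
    echo "running"
  else
    echo "not running"
  fi
}

docker_daemon_status
```

A missing docker executable also reports "not running", so this is best used after the CLI itself has been verified.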
TODO: How to start the Docker daemon via the command line? The docs say to just run dockerd, but on my Mac, possibly because I'm using Docker via the Docker Desktop installation rather than a command-line installation, dockerd isn't available in my command line, and the only way to start Docker is by running the desktop application. Need to experiment with how to run the Docker daemon from the command line.
Let's pull the hello-world image from Docker Hub (a public Docker image registry) to your environment. Run the following:
docker pull hello-world
The output from pulling should look similar to this:
Using default tag: latest
latest: Pulling from library/hello-world
0e03bdcc26d7: Pull complete
Digest: sha256:d58e752213a51785838f9eed2b7a498ffa1cb3aa7f946dda11af39286c3db9a9
Status: Downloaded newer image for hello-world:latest
docker.io/library/hello-world:latest
Now let's run the hello-world image. Run:
docker run hello-world
The output from running hello-world as a container should look like this:
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
Now is a good time to explain two terms:
- Image: A Docker image is the static, packaged set of files from which a container is started. It is just files on disk and does not run on its own.
- Container: A Docker container is a running instance of an image: a process started from that image. Containers imply a runtime; images don't.
The relationship between images and containers is one-to-many: any number of containers can be started from a single image.
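The one-to-many relationship can be seen directly by starting two containers from hello-world and then listing them. This is a sketch: the container names hello-a and hello-b and the function name are hypothetical, and it assumes the daemon is running.

```shell
# Start two containers from the same image, then list what was created.
demo_one_image_many_containers() {
  docker run --name hello-a hello-world >/dev/null
  docker run --name hello-b hello-world >/dev/null
  # -a includes stopped containers; hello-world exits right after printing
  docker ps -a --filter ancestor=hello-world
  # A single image entry backs both containers
  docker image ls hello-world
  # Clean up the two demo containers
  docker rm hello-a hello-b
}
```

Calling `demo_one_image_many_containers` should list two containers in the first table and one image in the second.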
- r-base - Docker Official Image for R
- or rocker/r-ver - Reproducible builds pinned to fixed versions of R
Let's start with R-Base as a proof of concept for running R as a docker process.
docker pull r-base
docker run -ti --rm r-base
NB: The run options -ti --rm are used when running a Docker process in the foreground.
- --rm = Clean up: automatically remove the container, along with any anonymous volumes it created, when the container exits.
- -t and -i = For interactive processes (like a shell), you must use -i -t together in order to allocate a tty for the container process. For more info, see the docs on running foreground processes. (Note that the order doesn't matter; it could be -it instead.)
What we want to see here is an interactive R shell (interpreter/REPL). In the R shell, let's evaluate some R code to verify everything is working.
print("Hello World!")
R.Version()
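The same check can also be run without opening the interactive shell, by passing the R code on the command line. This sketch assumes the r-base image ships Rscript and accepts a command in place of its default; the function name `run_r_expr` is hypothetical.

```shell
# Evaluate one R expression in a throwaway container.
# No -t/-i here: nothing is typed into the container.
run_r_expr() {
  docker run --rm r-base Rscript -e "$1"
}

# Usage (requires a running daemon):
#   run_r_expr 'print("Hello World!")'
```

This form is convenient in scripts, where no terminal is attached.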
A container is really just:
- A process / something that runs
- Everything that is needed for that process to run / ALL of its dependencies.
So let's write an R script that we want to run. Remember to think of this R script as a process.
Let's start simple. This will be my-script.r.
print("Hello World!")
R.Version()
2+2
Now let's containerize this script and run it through Docker.
Let's create a Dockerfile, which is the main mechanism for describing the image we are about to build. In the Dockerfile, we will name the base image (r-base or rocker/r-ver) and run the R script we've written. Docker builds everything described in the Dockerfile into an image, from which containers are started.
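Dockerfile
# Use the official R base image for now
FROM r-base
# Copy the script into the image
COPY my-script.r /my-script.r
# Run the script when a container starts
CMD ["Rscript", "/my-script.r"]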
NOTE: A base image is simply the image named in the FROM declaration at the beginning of the Dockerfile. It is the image upon which we build our own. So when we say FROM r-base, r-base is our base image. We inherit everything that r-base has (a Linux userland and R itself) and build on top of that with our own additions.
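To go from these files to a running container, the image has to be built and then run. The sketch below is self-contained: it writes the script and a Dockerfile into a directory, builds an image, and runs it. It assumes the Dockerfile copies the script into the image and runs it with Rscript; the image tag my-r-script and the function name are hypothetical.

```shell
# Write the script and a Dockerfile, build the image, then run it.
build_and_run_r_script() {
  project_dir="$1"
  mkdir -p "$project_dir"

  # The R script from earlier in this guide
  cat > "$project_dir/my-script.r" <<'EOF'
print("Hello World!")
R.Version()
2+2
EOF

  # COPY puts the script in the image; CMD runs it when a container starts
  cat > "$project_dir/Dockerfile" <<'EOF'
FROM r-base
COPY my-script.r /my-script.r
CMD ["Rscript", "/my-script.r"]
EOF

  # -t names the image; the directory is the build context
  docker build -t my-r-script "$project_dir" &&
    docker run --rm my-r-script
}
```

For example, `build_and_run_r_script ./my-r-project` creates the directory if needed and runs the script inside a fresh container.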