Incremental Steps to Running a Docker R Project

This guide takes an incremental approach to running a full R project with Docker, from installing Docker to creating your own containerized R project.

About Docker

This guide isn't going to go in-depth on how Docker works. We'll only cover a concept when it's needed at a particular step. To learn more about Docker, see the Docker Getting Started Overview.

Diagram of Docker Overview

Install Docker

Install Docker.

  • Install via your package manager if possible. Docker is a popular software package, so most package managers carry it.
  • Docker can also be installed via script by following the instructions at docker/docker-install

Let's verify that the docker command line interface (CLI) is installed. Run:

docker --help
Expected output:
Usage:	docker [OPTIONS] COMMAND

A self-sufficient runtime for containers

Options:
      --config string      Location of client config files (default
                           "/Users/georgep/.docker")
  -c, --context string     Name of the context to use to connect to the
                           daemon (overrides DOCKER_HOST env var and
                           default context set with "docker context use")
  -D, --debug              Enable debug mode
  -H, --host list          Daemon socket(s) to connect to
  -l, --log-level string   Set the logging level
                           ("debug"|"info"|"warn"|"error"|"fatal")
                           (default "info")
      --tls                Use TLS; implied by --tlsverify
      --tlscacert string   Trust certs signed only by this CA (default
                           "/Users/georgep/.docker/ca.pem")
      --tlscert string     Path to TLS certificate file (default
                           "/Users/georgep/.docker/cert.pem")
      --tlskey string      Path to TLS key file (default
                           "/Users/georgep/.docker/key.pem")
      --tlsverify          Use TLS and verify the remote
  -v, --version            Print version information and quit

Management Commands:
  builder     Manage builds
  config      Manage Docker configs
  container   Manage containers
  context     Manage contexts
  image       Manage images
  network     Manage networks
  node        Manage Swarm nodes
  plugin      Manage plugins
  secret      Manage Docker secrets
  service     Manage services
  stack       Manage Docker stacks
  swarm       Manage Swarm
  system      Manage Docker
  trust       Manage trust on Docker images
  volume      Manage volumes

Commands:
  attach      Attach local standard input, output, and error streams to a running container
  build       Build an image from a Dockerfile
  commit      Create a new image from a container's changes
  cp          Copy files/folders between a container and the local filesystem
  create      Create a new container
  diff        Inspect changes to files or directories on a container's filesystem
  events      Get real time events from the server
  exec        Run a command in a running container
  export      Export a container's filesystem as a tar archive
  history     Show the history of an image
  images      List images
  import      Import the contents from a tarball to create a filesystem image
  info        Display system-wide information
  inspect     Return low-level information on Docker objects
  kill        Kill one or more running containers
  load        Load an image from a tar archive or STDIN
  login       Log in to a Docker registry
  logout      Log out from a Docker registry
  logs        Fetch the logs of a container
  pause       Pause all processes within one or more containers
  port        List port mappings or a specific mapping for the container
  ps          List containers
  pull        Pull an image or a repository from a registry
  push        Push an image or a repository to a registry
  rename      Rename a container
  restart     Restart one or more containers
  rm          Remove one or more containers
  rmi         Remove one or more images
  run         Run a command in a new container
  save        Save one or more images to a tar archive (streamed to STDOUT by default)
  search      Search the Docker Hub for images
  start       Start one or more stopped containers
  stats       Display a live stream of container(s) resource usage statistics
  stop        Stop one or more running containers
  tag         Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE
  top         Display the running processes of a container
  unpause     Unpause all processes within one or more containers
  update      Update configuration of one or more containers
  version     Show the Docker version information
  wait        Block until one or more containers stop, then print their exit codes

Run 'docker COMMAND --help' for more information on a command.
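If reading through the full help text is more than we need, a shorter check is possible. The following is a minimal sketch, assuming a POSIX shell; have_command is a hypothetical helper name, not part of Docker. It only reports whether the docker CLI is on the PATH and, if so, prints its version.

```shell
#!/bin/sh
# Succeeds if the named command can be found on the PATH.
# (have_command is a hypothetical helper name, not part of Docker.)
have_command() {
  command -v "$1" >/dev/null 2>&1
}

if have_command docker; then
  # --version is listed in the options of the help output above
  docker --version
else
  echo "docker CLI not found on PATH" >&2
fi
```

Note that this only tells us the CLI is installed. It says nothing about whether the daemon is running.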

TODO: See "Step 2 — Executing the Docker Command Without Sudo (Optional)" on How To Install and Use Docker on Ubuntu 18.04.

By default, the docker command can only be run by the root user or by a user in the docker group, which is automatically created during Docker’s installation process. If you attempt to run the docker command without prefixing it with sudo or without being in the docker group, you’ll get [an error].
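On Linux, we can check whether the current user is already in the docker group before deciding whether sudo is needed. This is a rough sketch, assuming the standard id utility is available; user_in_group is a hypothetical helper name. It does not apply to Docker Desktop setups, which may not use a docker group at all.

```shell
#!/bin/sh
# Succeeds if user $1 is a member of group $2.
# (user_in_group is a hypothetical helper name.)
user_in_group() {
  # id -nG prints the user's group names separated by spaces
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -Fqx -- "$2"
}

if user_in_group "$(id -un)" docker; then
  echo "current user is in the docker group"
else
  echo "current user is not in the docker group; docker may need sudo"
fi
```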

Make Sure Docker Daemon Is Running

Daemon = Technical jargon term for a long-running background process.

The docker daemon is controlled via the dockerd command (docs).

But to simply check if the docker daemon is running at all, we don't need to use dockerd. We can just do a simple docker action which relies on the daemon to work.

Let's verify the daemon is running by listing the currently running containers:

docker container ls

If the docker daemon is running, the expected output is an empty container listing: just the column headings of a table that currently has no rows.

CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

If the docker daemon isn't running, we will see the following error.

Error response from daemon: dial unix docker.raw.sock: connect: connection refused
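The same check can be wrapped so that a script can branch on it. Below is a minimal sketch; docker_daemon_running is a hypothetical helper name. It relies only on the exit status of docker container ls, so it cannot tell a stopped daemon apart from a missing CLI or a permissions problem.

```shell
#!/bin/sh
# Succeeds if a simple docker action that needs the daemon works.
# Also fails if the CLI is missing or the user lacks permission.
# (docker_daemon_running is a hypothetical helper name.)
docker_daemon_running() {
  docker container ls >/dev/null 2>&1
}

if docker_daemon_running; then
  echo "docker daemon is running"
else
  echo "docker daemon is not reachable"
fi
```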

TODO: How do we start the docker daemon from the command line? The docs say to just run dockerd, but on my Mac I have no access to dockerd in my command line, possibly because I'm using Docker via the Docker Desktop installation rather than the command line installation. The only way I can start Docker is by running the desktop application, so I need to experiment with running the docker daemon from the command line.

Docker Hello World

Let's pull the hello-world image from Docker Hub (a public registry of Docker images) to your environment. Run the following:

docker pull hello-world

The output from pulling should look similar to this:

Using default tag: latest
latest: Pulling from library/hello-world
0e03bdcc26d7: Pull complete
Digest: sha256:d58e752213a51785838f9eed2b7a498ffa1cb3aa7f946dda11af39286c3db9a9
Status: Downloaded newer image for hello-world:latest
docker.io/library/hello-world:latest
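To confirm the pull left an image in the local image store, we can ask docker about it by name. This is a small sketch, assuming the daemon is running; image_present is a hypothetical helper name, and it relies on docker image inspect exiting with a non-zero status when the image is unknown.

```shell
#!/bin/sh
# Succeeds if image $1 exists in the local image store.
# (image_present is a hypothetical helper name.)
image_present() {
  docker image inspect "$1" >/dev/null 2>&1
}

# Usage (needs a running docker daemon):
#   image_present hello-world && echo "hello-world is available locally"
```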

Now let's run the hello-world image. Run:

docker run hello-world

The output from running hello-world as a container should look like this:

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/

Explanation about Containers and Images

Now seems like a good time to explain this:

  • Image: A docker image is the file from which a container can be started. The image is static: it is just a file and does not itself run.
  • Container: A docker container is a runnable instance of an image. A running container is a process that was started from the image. Remember that containers imply some kind of runtime.

There is a one-to-many relationship between an image and the containers started from it: one can create many containers from just one image.
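The one-to-many relationship can be observed directly. Here is a rough sketch, assuming the daemon is running; count_containers_from_image is a hypothetical helper name. It uses the ancestor filter of docker container ls, and --all so that containers that have already exited (as hello-world containers do almost immediately) are still counted.

```shell
#!/bin/sh
# Print how many containers, running or stopped, were created from image $1.
# (count_containers_from_image is a hypothetical helper name.)
count_containers_from_image() {
  # --quiet prints one container ID per line; awk counts the non-empty lines
  docker container ls --all --quiet --filter "ancestor=$1" |
    awk 'NF { n++ } END { print n + 0 }'
}

# Usage (needs a running docker daemon):
#   docker run hello-world
#   docker run hello-world
#   count_containers_from_image hello-world
```

Each docker run creates a new container from the same image, so the count should grow by one per run until the containers are removed.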

Docker R Interpreter

  • r-base - Docker Official Image for R
  • or, rocker/r-ver Reproducible builds to fixed versions of R

Let's start with r-base as a proof of concept for running R as a docker process.

docker pull r-base
docker run -ti --rm r-base

NB: The run options -ti --rm are used when running a docker process in the foreground.

  • --rm = Clean up: automatically remove the container (and any anonymous volumes associated with it) when the container exits.
  • -t and -i = For interactive processes (like a shell), use -i and -t together: -i keeps STDIN open and -t allocates a tty for the container process. For more info, see the docs on running foreground processes. (Note that the order doesn't matter; it could be -it instead.)

What we want to see here is that we enter an interactive R shell/interpreter/REPL. In the R shell, let's evaluate some R code to verify everything is working.

print("Hello World!")
R.Version()
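The same kind of check can be done without entering the REPL. This is a minimal sketch, assuming the r-base image provides Rscript on its PATH; r_eval is a hypothetical helper name. Because nothing interactive happens, -t and -i are left out and only --rm is kept.

```shell
#!/bin/sh
# Evaluate one R expression in a throwaway r-base container.
# (r_eval is a hypothetical helper name.)
r_eval() {
  docker run --rm r-base Rscript -e "$1"
}

# Usage (needs a running docker daemon):
#   r_eval 'print("Hello World!")'
```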

Containerize our own R Script

A container is really just:

  1. A process / something that runs
  2. Everything that is needed for that process to run / ALL of its dependencies.

So let's write an R script that we want to run. Let's remember to think of this R script as a process.

Let's start simple. This will be my-script.r.

print("Hello World!")
R.Version()
2+2
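Before building anything, one intermediate step is to run the script inside a plain r-base container by mounting the current directory into it. This is a sketch under assumptions: my-script.r is in the current directory, the mount point /work is an arbitrary path chosen for illustration, and run_r_script is a hypothetical helper name.

```shell
#!/bin/sh
# Run an R script from the current directory inside a throwaway r-base container.
# -v mounts the current directory at /work (an arbitrary path chosen here),
# and -w makes /work the working directory inside the container.
# (run_r_script is a hypothetical helper name.)
run_r_script() {
  docker run --rm -v "$PWD":/work -w /work r-base Rscript "$1"
}

# Usage (needs a running docker daemon):
#   run_r_script my-script.r
```

This is not yet a containerized project, since the script still lives outside the image, but it confirms the script runs under the image's R before we bake it in.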

Now containerize the running of this script, and run it through docker.

Let's create a dockerfile, which is the main mechanism for describing the image we are about to build. In the dockerfile, we will declare the base image (r-base or rocker/r-ver) and we will run the R script file we've written. Docker will package everything we describe in the dockerfile into an image, and running that image starts a container.

dockerfile

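# Use the official R base image for now
FROM r-base

# Copy the script from the build context into the image
COPY my-script.r /my-script.r

# Run the script with Rscript when a container starts
CMD ["Rscript", "/my-script.r"]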

NOTE: A base image is simply the image named in the FROM declaration at the beginning of the dockerfile. It is the image upon which we build our own image. So when we say FROM r-base, r-base is our base image. We inherit everything that r-base has (a Linux environment and R itself) and build on top of that with our own stuff.
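With the dockerfile and my-script.r side by side in one directory, the remaining step is to build an image and start a container from it. This is a minimal sketch, assuming the dockerfile builds successfully and ends by running my-script.r. The tag my-r-script and the helper name build_and_run are hypothetical, and -f dockerfile names the file explicitly because it is spelled in lowercase here.

```shell
#!/bin/sh
# Build an image tagged $1 from ./dockerfile, then run it in a throwaway container.
# The container is only started if the build succeeds.
# (build_and_run is a hypothetical helper name.)
build_and_run() {
  docker build -t "$1" -f dockerfile . && docker run --rm "$1"
}

# Usage (needs a running docker daemon), from the directory holding both files:
#   build_and_run my-r-script
```

The trailing dot is the build context, that is, the directory whose files the dockerfile is allowed to copy into the image.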

Containerize Installed Packages

All Together Now: The Full R-Project Ensemble
