@julianpistorius, forked from ashiklom/docker.org. Created August 26, 2019.
PNNL PEcAn setup

PIC setup notes

VM instance setup

Start at cloud management console (https://dashboard.cloud.pnnl.gov).

Go to “Instances”, then “Launch Instance”.

Details: Give it a name (e.g. “pecan1”). Availability zone: “nova” (the only option). Count: 1 (for now; can increase if I want multiple instances with the same configuration).

Source: Pick base image. I chose pnnl-ubuntu-18.04.

Flavor: “Power” of system. For now, using one m1.xlarge, the most powerful configuration.

Note that you can also add additional storage to the image on this screen by adding a volume.

Networks: Use the defaults.

Network ports: No customization options. Continue.

Security groups: No options. Continue.

Key pair: Create SSH key. Save private and public key to ~/.ssh/<key-name> and ~/.ssh/<key-name>.pub, respectively.
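
SSH will refuse to use a private key whose permissions are too open, so tighten them after saving. A minimal sketch, using a temporary file to stand in for the real key path:

```shell
# Stand-in for ~/.ssh/<key-name>; substitute your actual key file.
key="$(mktemp)"
chmod 600 "$key"        # owner read/write only; ssh rejects looser modes
stat -c '%a' "$key"     # prints 600
rm -f "$key"
```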

Configuration: Skip for now. Continue.

Server Groups: No options. Continue.

Scheduler hints: Skip for now. Continue.

Metadata: Skip for now. Later, can set environment variables (?) this way. May be useful for identifying specific PEcAn nodes.

Now, click “Launch instance”.

Once the instance launches, you can access it from a console in the browser. However, the default IP address is accessible only from the browser console and from other nodes on the cloud. To access the machine from a terminal, you need to allocate a floating IP, which is the IP that can be used to reach the instance remotely. From the instance “Actions” menu, select “Allocate a Floating IP” and click through the process, ending by clicking “Allocate”. At that point, assuming the instance is running, you should be able to connect to it via SSH, with a command like:

ssh -i /path/to/private-key.pem <username>@<ip address>

This can be automated by adding an entry like this to ~/.ssh/config (replacing <host name> with anything you want):

Host <host name>
	HostName <ip address>
	User <user name>
	IdentityFile /path/to/private-key.pem

Then, you will be able to log in by simply typing: ssh <host name>.

Additional configuration

To remove the annoying “group not found” message on login, add a line like the following to /etc/group:

<groupname>:x:<groupid>:<username>
# E.g.
homegroup:x:1234567:ashiklom
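
If you are unsure which group ID the message refers to, you can check whether your current group resolves to a name. A sketch using the standard id and getent tools:

```shell
# Print the current numeric group ID; if getent cannot resolve it to
# a name, that ID is the one missing from /etc/group.
gid="$(id -g)"
echo "gid=$gid"
getent group "$gid" >/dev/null 2>&1 || echo "group $gid has no name"
```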

To remove the “unable to resolve host <hostname>” message, add <hostname> to the /etc/hosts file next to localhost.
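
For example, if the instance’s hostname were pecan1 (a hypothetical name; use your own), the relevant /etc/hosts line would become:

```
127.0.0.1 localhost pecan1
```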

Additional storage via volumes

The basic cloud units do not come with much storage. The way to add large quantities of storage is via “Volumes”.

To create a volume, in OpenStack, go to “Volumes” and click “+ Create Volume”. Then, give the volume a name and a storage allocation.

To attach the volume to an instance, go to “Instances”, go to the drop-down menu of your instance, and click “Attach volume”. Select the volume from the drop-down menu and click “Attach” to attach the volume.

Attaching the volume just makes the device available to the image. The volume still needs to be formatted and mounted to actually be accessible by the instance. The following instructions create a volume with filesystem ext4 and mount it to /data on the image.

First, identify the name of the volume.

sudo fdisk -l

Most likely, the device will be something like /dev/vdb (/dev/vda is the image’s own storage).

Next, open the device for editing with fdisk.

sudo fdisk /dev/vdb

This should drop you into fdisk’s interactive prompt. To see a help menu, type m and hit enter. Note that when running interactively, most of the commands will not actually make any changes until you specifically ask fdisk to *w*rite the changes to disk with w. You can quit fdisk without saving with q.

Within fdisk, first create a partition table. Since we only have one partition, it doesn’t matter much what kind of table is used – here, we will use the DOS (MBR) partition table. To create this table, use the o command.

Now that the disk has a partition table, we need to add an actual partition. The disk can be partitioned however we want, but for simplicity, we will just create one partition that takes up the entire disk. To begin creating a *n*ew partition, enter the n command. This will drop you into an interactive prompt for creating the partition. The prompt allows you to modify the start and end location (i.e. size) of the partition, and other options, but for our purposes, selecting all the defaults (create a partition that takes up the entire disk) is fine.

Once you’ve completed the partition creating menu, write the changes to disk with w.

If that is successful, running sudo fdisk -l should show that you now have a new device like /dev/vdb1 (partition 1 on device /dev/vdb).

There is one more step before the device is usable: creating a filesystem. The filesystem creation commands on Linux all start with mkfs. To see all of the options, you can run compgen -c | grep mkfs. Since we are creating an ext4 filesystem, we will use mkfs.ext4.

sudo mkfs.ext4 /dev/vdb1

Once that finishes, you should be able to mount the device to your desired location. Here, we will use /data. First create the directory…

sudo mkdir /data

…and then mount the device onto it.

sudo mount /dev/vdb1 /data

Right now, because this folder was created by sudo, it will not be writable by the user. There are several ways this can be remedied, depending on how concerned you are about security. The simplest thing to do is give all users read, write, and execute access to the directory:

sudo chmod a+rwx /data

If you want to give only specific users access, you can use setfacl. For instance, to give user myuser read (r), write (w), and execute (x) access, run the following:

sudo setfacl -m u:myuser:rwx /data

You can do the same thing for groups by replacing u with g. For instance, to give all members of the docker group full access:

sudo setfacl -m g:docker:rwx /data

Docker setup

Follow the instructions on the Docker website. Note that you should not install the default docker package from Ubuntu’s repositories, but rather Docker CE from Docker’s own repository.

First, install some system dependencies:

sudo apt install apt-transport-https ca-certificates curl software-properties-common

Add Docker’s official GPG key.

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

Add the stable repository.

sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

Install.

sudo apt update
sudo apt install docker-ce

The Docker daemon (service) should start automatically. You can verify it with systemctl status docker.

On PNNL’s network, you have to use a different address range for the Docker bridge network. More details are here, but in a nutshell, you need to edit the /etc/docker/daemon.json file to contain the following:

{"bip": "10.17.129.1/24"}

To confirm that this worked, run the command ip a and look for the docker0 device. If the inet IP matches the one above, then you’re all set. Otherwise, stop the Docker service,…

sudo systemctl stop docker

…delete the bridge,…

sudo ip link del docker0

…and then restart Docker.

sudo systemctl start docker
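
The confirmation step above can be sketched as a one-liner: it prints docker0’s IPv4 address (expected to fall in 10.17.129.0/24 once the setting takes effect), or a fallback message if the bridge does not exist yet.

```shell
# Show docker0's IPv4 address, or report that the bridge is absent.
ip -4 addr show docker0 2>/dev/null | awk '/inet /{print $2}' | grep . \
  || echo "docker0 bridge not found"
```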

Note that these settings will not affect docker-compose, so you may have to add blocks like the following to docker-compose.yml files:

networks:
  default:
    ipam:
      config:
        - subnet: 10.17.1.0/24   # or some other subnet that doesn't conflict

To run Docker without sudo, add your user to the docker group:

sudo usermod -aG docker $USER

For this to take effect, you’ll need to log out and then back in.

Finally, to test that everything works as expected, run the hello-world container:

docker run hello-world

Before installing docker-compose, you should log in to Docker Hub with the docker login command. You will be prompted for your Docker Hub username (note that this is not the same as your email) and password. If you get an error like “Cannot autolaunch D-Bus without X11 $DISPLAY”, it may be due to a conflict between docker-compose and one of its dependencies (see Docker issue #6023). Try temporarily removing (sudo apt remove ...) docker-compose and its dependency golang-docker-credential-helpers, then logging in, then, if successful, re-installing docker-compose.

For PEcAn work, you will also need docker-compose, which can be installed through apt:

sudo apt install docker-compose

PEcAn setup

Details are in the PEcAn documentation section “Quickstart for Docker and PEcAn”. Basically, first clone PEcAn and cd into the directory:

git clone https://github.com/pecanproject/pecan
cd pecan

Then, run the following commands to initialize PEcAn…

docker-compose -p pecan up -d postgres
docker run -ti --rm --network pecan_pecan pecan/bety:latest initialize
docker run -ti --rm --network pecan_pecan --volume pecan_pecan:/data pecan/data:develop

Finally, assuming the above commands executed successfully, run the following to start PEcAn:

docker-compose -p pecan up -d

To access PEcAn, you will need to use port forwarding. First, open a new terminal instance and run the following to connect localhost port 8005 to the remote’s port 8000 (replacing <connection> with either the host name if you have an ~/.ssh/config entry, or -i /path/to/private-key.pem username@ip.address otherwise):

ssh -L 8005:localhost:8000 <connection>

Once connected, make sure that PEcAn’s docker instances are running (cd /path/to/pecan && docker-compose -p pecan up -d should tell you). Then, in a browser on your local machine, you should be able to browse to localhost:8005 to connect to the instance of PEcAn running on the VM.

NOTE: You can replace 8005 with any port of your choosing, and then use that port in the localhost URL. However, 8000 is required by the docker-compose file – if you want it to be something else, you’ll have to change it in docker-compose as well.
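
The port forward can also be made automatic by adding a LocalForward line to the ~/.ssh/config entry from earlier:

```
Host <host name>
	HostName <ip address>
	User <user name>
	IdentityFile /path/to/private-key.pem
	LocalForward 8005 localhost:8000
```

Then a plain ssh <host name> also sets up the tunnel.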

NOTE: To run BETY with the new attributes table, currently need to do the following:

docker run -it --rm --network <stack>_pecan pecan/bety:latest initialize
docker run -it --rm --network <stack>_pecan pecan/bety:develop migrate

AltPIC setup

Perform the steps described in Additional configuration.

Update system packages – sudo apt update and sudo apt upgrade.

(Optional, but recommended) Reboot the instance.

Install R:

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
sudo apt install apt-transport-https
sudo add-apt-repository 'deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu xenial-cran35/'
sudo apt update
sudo apt install r-base r-base-dev

Install some additional dependency libraries for my R packages:

sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable
sudo apt install \
  libpq-dev \
  libudunits2-dev \
  libgdal-dev \
  libgeos-dev \
  libproj-dev \
  libopenmpi-dev \
  libhdf5-openmpi-dev \
  librdf0-dev

Docker swarm setup

Make sure Docker is installed on both the manager and worker nodes.

The following instructions are based on this guide.

First, on the manager node, initialize the Docker swarm and advertise its IP. (Fill in the IP with the node’s IP address.)

docker swarm init --advertise-addr XXX.XXX.XXX.XXX

Note the output – it gives a command that can be run for other nodes to join the swarm. Run that command on another machine to join the swarm as a worker.

On the manager node, push the images to a registry so they are available to all nodes. Standard image tags (image:tag) are pushed to Docker Hub by default. To push images to a local registry instead, first create a registry on the swarm:

docker service create --name registry --publish published=5000,target=5000 registry:2 

Then, in the compose file specify the image as XXX.XXX.XXX.XXX:5000/image:tag (where the IP address is that of the manager; note that the port corresponds to the published and target options).
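
For example, a locally built service might be pointed at the swarm registry like this (executor is a hypothetical service name here; the IP placeholder is the manager’s address):

```yaml
services:
  executor:
    image: XXX.XXX.XXX.XXX:5000/pecan/executor:latest
    build: .
```

With the image named this way, docker-compose build followed by docker-compose push builds it and publishes it to the local registry.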

This registry is insecure (plain HTTP) by default, which will result in an error like the following:

Error response from daemon: Get https://XXX.XXX.XXX.XXX:5000/v2/: http: server gave HTTP response to HTTPS client

If you see this error, add the following to the /etc/docker/daemon.json file:

"insecure-registries": ["XXX.XXX.XXX.XXX:5000"]
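
If you are also using the custom bip setting from the Docker setup section above, the combined /etc/docker/daemon.json would look like this (IP placeholder as before):

```json
{
  "bip": "10.17.129.1/24",
  "insecure-registries": ["XXX.XXX.XXX.XXX:5000"]
}
```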

Then, restart the Docker daemon (sudo service docker restart).

To start the stack, building all the images to the right specifications, run the following:

docker-compose -p pecanswarm -f docker-compose.yml -f production.yml up --build -d

To deploy a stack to a swarm, use docker stack deploy with the --compose-file/-c flag. Note that compose file overrides can be applied by combining multiple -c flags.

docker stack deploy --compose-file docker-compose.yml --compose-file docker-compose.overrides.yml mystack

Alternatively, you can access the full range of docker-compose pre-processing capabilities by first performing all pre-processing steps with docker-compose config and then piping that output to docker stack deploy:

docker-compose config | docker stack deploy --compose-file - pecanswarm

Check that the stack is running with:

docker stack services mystack

Docker Swarm and PEcAn

The standard PEcAn stack uses the following volumes: traefik, postgres, rabbitmq, pecan, and portainer. For conceptual simplicity, we will restrict some of the corresponding services to run only on the manager node, with Compose file syntax like the following:

services:
  postgres:
    image: postgres
    deploy:
      placement:
        constraints:
          - node.role == manager

More information on constraint syntax and the Compose deploy field.

Applying that syntax to the traefik, postgres, rabbitmq, and portainer services means that the only volume we need to share across nodes is pecan. We will use the NFS storage driver to make the volume available across nodes:

volumes:
  pecan:
    driver_opts:
      type: nfs
      o: "addr=172.20.234.6,rw,nolock,soft"
      device: ":/public/shared-docker-volumes/pecan_data"

This might not be necessary, but the following container volume configuration is recommended:

services:
  executor:
    volumes:
      - type: volume
        source: pecan
        target: /data
        volume:
          nocopy: true

Since I will often be using freshly built images, I need to create a registry service on the swarm, as described above. Also, in the final production YAML file, I need to tag the images with the registry’s IP address.

Docker tricks

Update R packages in the depends container, and commit the changes.

docker run -it --name r-update pecan/depends:latest Rscript -e "update.packages(ask = FALSE, checkBuilt = TRUE)"
docker commit -m "Update R packages" r-update pecan/depends:latest
docker rm r-update

You can also do this with a running container. For instance, to install a local package into a running container:

docker cp /path/to/local/package <container ID>:/tmp/package
docker exec -it <container ID> Rscript -e "devtools::install('/tmp/package')"

# Clean up
docker exec -it <container ID> rm -rf /tmp/package

List all the versions of the pecan/depends image:

docker image ls pecan/depends

Execute a shell in a running container (for debugging).

docker container ls
# Copy container ID
docker exec -it <container ID> /bin/bash

Restart a running service with the same configuration.

docker service update --force <service>