@Paraphraser
Last active April 1, 2024 23:06
IOTstack tutorial - adding a container

This gist is a response to a Discord question asking:

is it possible to add my own container to IOTStack? for example: couchdb

The answer is "yes" and CouchDB is quite a good example to use for a tutorial.

basic research

It is a good idea to approach the problem as a research project. Keep track of all your sources so you can cite them later, particularly when it comes to writing good documentation for your container.

Conceptually, a Docker container is like a small self-contained computer dedicated to running a single process. If you think about how you would do this in reality, you would:

  1. Pick appropriate hardware (eg a Raspberry Pi);
  2. Choose an operating system (eg Raspberry Pi OS);
  3. Install the operating system on the hardware;
  4. Install the packages needed to support your service;
  5. Figure out the glue to make your service start when the system boots up.

This approach would be called a native installation.

With Docker, your hardware platform is probably a given (eg your Raspberry Pi) but steps 2…5 are defined in a Dockerfile which, in essence, allows you to start from a base image (eg a minimal Debian system), run apt commands to install the packages you need, and then define how your service starts when the container is instantiated.

A Docker image is the equivalent of taking a backup of your completed system so you can restore it later. Unlike a native installation where you would only restore from the backup to solve a problem, with Docker your service is instantiated from the image every time the container starts.

In some ways the relationship between an image and a container is closer to the idea of a student lab where PCs net-boot each time someone logs out.

While you can roll your own Docker images from scratch, it's usually best to at least check whether someone else has already done the work.

The old programmer's adage of "the easiest line of code to maintain is the one you never write" applies here.

DockerHub

The best place to start is the search field at DockerHub.

In this tutorial we will be searching for CouchDB. It's quite common to get more than one hit and CouchDB is no exception. Figuring out which hit is "best" comes from experience. A good rule of thumb is to look for an image with a lot of downloads. CouchDB has 100 million downloads so it is a reasonable candidate.

The next thing to check is whether there is an image variant matching your architecture. Sometimes, the overview tab will list the supported architectures but, if not, you can figure it out by going to the tags tab.

In the case of CouchDB, the supported architectures at the time of writing were:

  • linux/amd64
  • linux/arm64/v8
  • linux/ppc64le
  • linux/s390x

The most likely target environments for IOTstack users are real Raspberry Pis and Debian systems running on Intel chips (either natively or via Proxmox). Both of those are supported but with the caveat that you need a full 64-bit system and, if it's a Raspberry Pi, a model 4.

If DockerHub doesn't include an image variant for your hardware then just go back and search from the DockerHub main page and see if another hit will do the job. Worst case is you will need to roll your own container from scratch (reasonably easy but out of scope for this gist).

While you are nosing around DockerHub, make a note of any other references you can find. In the case of CouchDB, the source code for building the image is on GitHub. Sometimes, there will be links to documentation wikis too.

It is also a good idea to walk through the examples to harvest any useful facts such as environment variables, network ports, and paths to persistent storage. Knowing exactly what to look for is a product of experience. These are the things I noted for CouchDB:

  • environment variables mentioned:

    • COUCHDB_USER and COUCHDB_PASSWORD
    • COUCHDB_SECRET
    • NODENAME (with the note that it is used for clustering and can be ignored for single-node setups)
    • ERL_FLAGS documented here
  • paths mentioned:

    • /opt/couchdb/etc/local.d
    • /opt/couchdb/etc/local.d/docker.ini
    • /opt/couchdb/etc/vm.args
    • /opt/couchdb/data
  • ports mentioned:

    • 5984
  • the image name. The best place to get this is from the tags tab. Examples include:

     $ docker pull couchdb
     $ docker pull couchdb:latest
     $ docker pull couchdb:3.3.2

    From those commands, the image names are everything to the right of docker pull:

     couchdb
     couchdb:latest
     couchdb:3.3.2
    

    This is commonly referred to as "«image»:«tag»" format. In general, it is better to select a tagged image, defaulting to "latest" unless you have a good reason for pinning to a specific version. You can use the untagged form but it isn't always clear which version that corresponds with.

GitHub

The DockerHub overview tab for an image is often drawn from the associated GitHub README.md but it's still always worth perusing.

Sometimes, you will find examples of docker-compose service definitions from which you can cherry-pick but CouchDB doesn't seem to have one.

What CouchDB does have is releases folders and, within those, the structures used to build the images which were deployed on DockerHub.

The 3.3.2 and 3.2.3 folder was the most recent at the time of writing.

"3.2.3" is possibly a typo - it likely means "3.3.3" but that doesn't affect this discussion.

The Dockerfile is a useful source of information:

  1. The base image was Debian (which, among other things, tells you that you can use apt if you need to add any packages to a custom image);
  2. The VOLUME /opt/couchdb/data statement confirms that this is the internal path where CouchDB expects its persistent data to be stored; and
  3. The exposed ports are 5984, 4369 and 9100.
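If you would rather not read a Dockerfile by eye, a few lines of shell can pull out the same facts. This is only a sketch: the function name dockerfile_facts is my own invention, the sample file is an illustrative stand-in (not the real couchdb Dockerfile), and it assumes the simple space-separated forms of EXPOSE and VOLUME rather than the JSON-array form.

```shell
# List the ports and volumes a Dockerfile declares.
# Usage: dockerfile_facts PATH_TO_DOCKERFILE
dockerfile_facts() {
  awk '
    toupper($1) == "EXPOSE" { for (i = 2; i <= NF; i++) print "port " $i }
    toupper($1) == "VOLUME" { for (i = 2; i <= NF; i++) print "volume " $i }
  ' "$1"
}

# Illustrative sample only; this is NOT the real couchdb Dockerfile.
sample=$(mktemp)
cat > "$sample" <<'EOF'
FROM debian
VOLUME /opt/couchdb/data
EXPOSE 5984 4369 9100
EOF

dockerfile_facts "$sample"
rm -f "$sample"
```

Lines are reported in the order they appear in the file, so the volume comes before the ports in this sample.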

preparing a service definition

template with placeholders

This is a good starting point for a service definition which adheres to IOTstack conventions:

«service»:
  container_name: «service»
  image: «image»{:«tag»}
  restart: unless-stopped
  environment:
    - TZ=${TZ:-Etc/UTC}
  ports:
    - "«external»:«internal»"
  volumes:
    - ./volumes/«service»/«path»:«internalpath»

container name substitution

Replace each instance of «service» with the name of your container:

couchdb:
  container_name: couchdb
  image: «image»{:«tag»}
  restart: unless-stopped
  environment:
    - TZ=${TZ:-Etc/UTC}
  ports:
    - "«external»:«internal»"
  volumes:
    - ./volumes/couchdb/«path»:«internalpath»

choose an image

Replace «image»{:«tag»} with the image you selected earlier:

couchdb:
  container_name: couchdb
  image: couchdb:latest
  restart: unless-stopped
  environment:
    - TZ=${TZ:-Etc/UTC}
  ports:
    - "«external»:«internal»"
  volumes:
    - ./volumes/couchdb/«path»:«internalpath»

define environment variables

Environment variables typically separate into two basic groups:

  • those which have an effect each time the container starts; and
  • those which only have an effect when the container is run for the first time.

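The TZ variable is a good example of something which has an effect each time the container starts. The syntax used here tells docker-compose to substitute the value of TZ, falling back to Etc/UTC if TZ is not defined. One place docker-compose searches for TZ is the file: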

~/IOTstack/.env
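The ${TZ:-Etc/UTC} form follows the same rules as shell parameter expansion, so you can get a feel for it at the command line. In this sketch, resolve_tz is a made-up helper and Australia/Sydney is just an example value:

```shell
# Same "value or fallback" syntax as the service definition uses.
resolve_tz() {
  echo "${TZ:-Etc/UTC}"
}

unset TZ
resolve_tz            # TZ is not defined, so the fallback is used

TZ=Australia/Sydney   # example value only
resolve_tz            # TZ is defined, so its value wins
```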

If that file does not exist on your system, you should initialise it like this:

$ cd ~/IOTstack
$ echo "TZ=$(cat /etc/timezone)" > .env

The information at Configuring CouchDB says that only COUCHDB_USER and COUCHDB_PASSWORD are relevant in non-cluster settings. It isn't clear whether the variables are enforced on every launch or if they only have an effect on first launch.

There are several approaches to adding environment variables but the simplest is to use inline values:

couchdb:
  container_name: couchdb
  image: couchdb:latest
  restart: unless-stopped
  environment:
    - TZ=${TZ:-Etc/UTC}
    - COUCHDB_USER=myAdminName
    - COUCHDB_PASSWORD=myPassword
  ports:
    - "«external»:«internal»"
  volumes:
    - ./volumes/couchdb/«path»:«internalpath»

One common mistake with environment variables is to wrap the right hand sides in quotes. For example:

    - COUCHDB_PASSWORD="myPassword"

Key point:

  • The value passed into the container via an environment variable includes the quotes so the password will be 12 characters rather than 10. In other words, don't use quotes!
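To see where the extra two characters come from, this sketch mimics the list form by treating everything after the first "=" as the value. The helper value_of is hypothetical and is not how docker-compose is implemented; it only illustrates the arithmetic:

```shell
# Mimic the list form: everything after the first "=" is the value.
value_of() {
  entry=$1
  printf '%s' "${entry#*=}"
}

quoted=$(value_of 'COUCHDB_PASSWORD="myPassword"')
plain=$(value_of 'COUCHDB_PASSWORD=myPassword')

echo "quoted: ${#quoted} characters"   # the two quote marks are counted
echo "plain:  ${#plain} characters"
```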

define ports

The exposed ports listed in the Dockerfile were 5984, 4369 and 9100, whereas the documented ports only mentioned 5984.

You have two options. You can either assume the documented port is the only one you need, or you can "fail safe" by covering the field.

The first option is probably appropriate if you run Docker containers in isolation. In the IOTstack context where there are over 50 containers, the rule on ports is "first come, first served" so it's better to claim all the ports the container knows about, even if they are not going to be used.

This is the starting position:

  ports:
    - "5984:5984"
    - "4369:4369"
    - "9100:9100"

Note:

  • Port-pairs should always be quoted. See also:

    Please don't get hung up on whether any particular port number is or isn't at risk of misinterpretation. Always quote your port-pairs!

Next, search all the existing IOTstack templates to see if any other container is using any one of those ports:

$ cd ~/IOTstack/.templates
$ git grep -e "5984" -e "4369" -e "9100"
prometheus-nodeexporter/service.yml:    - "9100"
prometheus/iotstack_defaults/config.yml:        - prometheus-nodeexporter:9100

This tells us that Prometheus and its associated Node-Exporter have some knowledge of port 9100. In this case, it doesn't actually matter because the usage is internal. Any number of containers can use the same internal port. It's no different to any number of independent computers listening for HTTP on port 80. The only thing that matters is external ports.

But let's assume, for the moment, that external port 9100 had indeed been claimed by Prometheus. All you do is repeat the git grep search, adding 1 to the port number (9101, then 9102, then 9103) until you don't find any hits.
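The "add 1 and search again" loop is easy to hand over to the shell. This is a rough sketch under a couple of assumptions: next_free_port is my own name for it, it uses plain grep rather than git grep, and (like the manual search) it treats any mention of the number as a hit, so it can be more cautious than strictly necessary.

```shell
# Print the first port, starting at $1, that no file under directory $2 mentions.
next_free_port() {
  port=$1
  dir=$2
  # -w matches whole words, so 9100 does not match 19100
  while grep -rqw -e "$port" "$dir"; do
    port=$((port + 1))
  done
  echo "$port"
}
```

Calling it as next_free_port 9100 ~/IOTstack/.templates prints the first candidate that produces no hits.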

There are some IOTstack conventions concerning containers that run in host mode but I'll treat those as out-of-scope for this tutorial.

The upshot is that we don't need to change anything. We can use our starting position on ports without any modifications:

couchdb:
  container_name: couchdb
  image: couchdb:latest
  restart: unless-stopped
  environment:
    - TZ=${TZ:-Etc/UTC}
    - COUCHDB_USER=myAdminName
    - COUCHDB_PASSWORD=myPassword
  ports:
    - "5984:5984"
    - "4369:4369"
    - "9100:9100"
  volumes:
    - ./volumes/couchdb/«path»:«internalpath»

define persistent storage

The paths we discovered earlier were:

  • /opt/couchdb/etc/local.d
  • /opt/couchdb/etc/local.d/docker.ini
  • /opt/couchdb/etc/vm.args
  • /opt/couchdb/data

These are internal paths (inside the container) and we have to decide how to map those to external paths.

Rule 1:

  • Only map paths to folders. Do not map paths to files!!

Why? Well, the first time docker-compose launches your container, it runs down the list of external paths and does the equivalent of:

$ sudo mkdir -p «externalPath»

If the path to what the container expects to be a file is created as a folder, the result is a mess and the container crashes. The only way to avoid this is to pre-create the external path before your container is brought up for the first time. You also need to remember to do this if you ever decide to erase your persistent store to give your container a clean slate.

Key point:

  • There is no way to automate this. You just have to remember to do it.
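You can see the effect without involving Docker at all. This sketch uses a temporary directory as a stand-in for the persistent store and repeats the mkdir that docker-compose would do for a mapping that ends in docker.ini:

```shell
# Simulate a file mapping whose external path does not exist yet.
store=$(mktemp -d)                        # stand-in for ./volumes/couchdb
mkdir -p "$store/etc/local.d/docker.ini"  # what the first launch would do

if [ -d "$store/etc/local.d/docker.ini" ]; then
  kind=folder
else
  kind=file
fi
echo "docker.ini was created as a $kind"
rm -rf "$store"
```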

So, let's get rid of the paths to the .ini and .args files, leaving us with:

/opt/couchdb/etc/local.d
/opt/couchdb/data

Given that vm.args and local.d share the same parent folder, it's probably better to just map etc (a judgement that comes from experience), as in:

/opt/couchdb/etc
/opt/couchdb/data

The IOTstack convention is to construct volume mappings by starting with:

./volumes/«container»

and then repeating the unique portion of the internal path. In other words:

  volumes:
    - ./volumes/couchdb/etc:/opt/couchdb/etc
    - ./volumes/couchdb/data:/opt/couchdb/data
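The convention is mechanical enough to express as a small function. This is a sketch only: volume_mapping is a hypothetical helper, and you still have to decide for yourself what the common prefix is (here /opt/couchdb).

```shell
# Build "./volumes/«service»/«unique»:«internal»" from an internal path.
# $1 = service name, $2 = internal path, $3 = common prefix to strip
volume_mapping() {
  service=$1
  internal=$2
  prefix=$3
  unique=${internal#"$prefix"/}
  echo "./volumes/$service/$unique:$internal"
}

volume_mapping couchdb /opt/couchdb/etc  /opt/couchdb
volume_mapping couchdb /opt/couchdb/data /opt/couchdb
```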

the candidate service definition

couchdb:
  container_name: couchdb
  image: couchdb:latest
  restart: unless-stopped
  environment:
    - TZ=${TZ:-Etc/UTC}
    - COUCHDB_USER=myAdminName
    - COUCHDB_PASSWORD=myPassword
  ports:
    - "5984:5984"
    - "4369:4369"
    - "9100:9100"
  volumes:
    - ./volumes/couchdb/etc:/opt/couchdb/etc
    - ./volumes/couchdb/data:/opt/couchdb/data

By convention, this YAML is written to a file named service.yml. IOTstack refers to this as the service definition template. If you decide to turn your service into a Pull Request so that it can be added to the IOTstack repository on GitHub, the path to that file will be:

~/IOTstack/.templates/couchdb/service.yml

For now, service.yml can be anywhere.

A couple of tips:

  1. Make sure the last line of the file ends with an end-of-line marker. The convention is to include one blank line at the end of the definition (but nothing enforces this).

  2. If you use a Windows-based text editor, you need to make sure all the DOS/Windows CR+LF line endings are changed to Unix LF line endings. One very quick way to do that (on your Raspberry Pi) is:

    $ dos2unix service.yml

    It reads the file, fixes any problematic line-endings, and updates the file in place. No muss, no fuss!
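If you want to check a file for both problems before using it, something like the following would do. It is a minimal sketch and check_template is a name I made up; it prints nothing when the file is clean.

```shell
# Report CR characters and a missing final newline in a template file.
check_template() {
  file=$1
  cr=$(printf '\r')
  if grep -q "$cr" "$file"; then
    echo "CRLF line endings found"
  fi
  # tail prints the last byte; a newline vanishes inside $(...)
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    echo "no newline at end of file"
  fi
}
```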

testing your candidate

IOTstack templates are left-aligned. To use them in your compose file, each line needs to have two spaces stuck on the beginning. One way to do that is:

$ sed -e "s/^/  /" service.yml >>~/IOTstack/docker-compose.yml
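Because that sed command appends, running it twice leaves you with two copies of the service in your compose file. A slightly more defensive sketch (append_service is a hypothetical helper, and it assumes services are indented by two spaces as described above) checks first:

```shell
# Append an indented copy of a template unless the service is already defined.
# $1 = service name, $2 = template file, $3 = compose file
append_service() {
  name=$1
  template=$2
  compose=$3
  if grep -q "^  $name:" "$compose" 2>/dev/null; then
    echo "$name is already defined in $compose" >&2
    return 1
  fi
  sed -e "s/^/  /" "$template" >> "$compose"
}
```

Usage would be along the lines of append_service couchdb service.yml ~/IOTstack/docker-compose.yml.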

Now you are ready:

$ cd ~/IOTstack
$ docker-compose up -d couchdb

debugging

DPS is an alias for the docker ps command, focusing on some interesting columns:

$ DPS couchdb
NAMES     CREATED          STATUS
couchdb   11 seconds ago   Restarting (1) 1 second ago

The container has gone into a restart loop. Why?

$ docker logs couchdb
touch: cannot touch '/opt/couchdb/etc/local.d/docker.ini': No such file or directory

The container can't find the folder where it expects to write its configuration and, on top of that, we have a permissions problem. Without going into a detailed explanation, it's because, at launch time, docker-compose:

  1. did the equivalent of:

    $ sudo mkdir -p ~/IOTstack/volumes/couchdb/etc
  2. mounted the external path ~/IOTstack/volumes/couchdb/etc at the internal path /opt/couchdb/etc.

The effect of the mount is the same as any other kind of Unix mount. The external folder and its contents (currently empty) mask whatever is at the internal path.

This inability to self-initialize from scratch is (pretty much) the signature of what I call a non-well-behaved container. For background on what I mean by a non-well-behaved container, please see Issue 331.

Note:

  • My characterisation of CouchDB as non-well-behaved could easily be misplaced. I have never used CouchDB and have no plans to try it. It is perfectly possible that the vast majority of users would be able to run CouchDB with its default configuration, in which case the candidate service definition should be amended to remove the /etc mapping. It is just that, in the IOTstack world, containers with internal-only configurations invariably produce questions which include the words, "how can I change the config?".

The basic problem is compounded by CouchDB downgrading its privileges to run as user ID 5984. Inside the container, that ID maps to the username "couchdb". In other words, the situation is akin to the user "pi" trying to create a file in /etc but without having the ability to use sudo.

As far as I can tell, the good folks at couchdb-docker have left resolution of these problems as an exercise for the reader.

For now, this is how to work around the problem:

  1. Be in the correct directory:

    $ cd ~/IOTstack
  2. Stop the container:

    $ docker-compose down couchdb

    Note:

    • You must be using docker-compose version 2.19.0 or later for the down command to be able to terminate a container. If you are using an older version, you really should upgrade.
  3. Edit your docker-compose.yml to comment-out the volume mapping for the etc folder, as in:

      volumes:
        # - ./volumes/couchdb/etc:/opt/couchdb/etc
        - ./volumes/couchdb/data:/opt/couchdb/data
  4. Erase the container's persistent store:

    $ sudo rm -rf ./volumes/couchdb
  5. Start the container:

    $ docker-compose up -d couchdb

    This re-initialises the persistent store for the data directory but the etc directory is not mapped so, inside the container, it exists and has the expected structure and content.

  6. Make a copy of the container's etc directory:

    $ docker cp -a couchdb:/opt/couchdb/etc ./etc

    This command is cloning the read-only files from inside the container so you have a read-write copy outside the container.

  7. Change ownership on the copy so it is what the container expects:

    $ sudo chown -R 5984:5984 etc
  8. Move the etc folder into the persistent store:

    $ sudo mv etc ./volumes/couchdb/.
  9. Edit your docker-compose.yml to undo the change in step 3, as in:

      volumes:
        - ./volumes/couchdb/etc:/opt/couchdb/etc
        - ./volumes/couchdb/data:/opt/couchdb/data
  10. Start the container again (yes, it is already running, that's OK):

    $ docker-compose up -d couchdb

    docker-compose notices the change to the service definition and recreates the container automatically.

"Are we there yet?"

$ DPS couchdb
NAMES     CREATED              STATUS
couchdb   About a minute ago   Up About a minute

The signs are good. Now for the acid test:

$ curl -G http://localhost:5984
{"couchdb":"Welcome","version":"3.3.2","git_sha":"11a234070","uuid":"4b85649a2fd9072af3f6de4c2327707f","features":["access-ready","partitioned","pluggable-storage-engines","reshard","scheduler"],"vendor":{"name":"The Apache Software Foundation"}}

Qapla' !!

comments

CouchDB set up as above will work. You will be able to edit the configuration files in:

$ tree -apug ~/IOTstack/volumes/couchdb/etc/
/home/pi/IOTstack/volumes/couchdb/etc/
├── [drwxr-xr-x 5984     5984    ]  default.d
│   ├── [-rw-r--r-- 5984     5984    ]  10-docker-default.ini
│   └── [-rw-r--r-- 5984     5984    ]  README
├── [-rw-r--r-- 5984     5984    ]  default.ini
├── [drwxr-xr-x 5984     5984    ]  local.d
│   ├── [-rw-r--r-- 5984     5984    ]  docker.ini
│   └── [-rw-r--r-- 5984     5984    ]  README
├── [-rw-r--r-- 5984     5984    ]  local.ini
└── [-rw-r--r-- 5984     5984    ]  vm.args

2 directories, 7 files

Having edited a file, you can tell CouchDB to adopt your new settings with a restart:

$ docker-compose restart couchdb

However, if anything inside the etc directory disappears or gets mucked up, recovery involves repeating the steps explained in debugging. This is sub-optimal.

In addition, if you go to the next logical step of preparing a Pull Request to add CouchDB to IOTstack, the documentation will need to explain all the initialisation steps. This, too, is sub-optimal (along with any hope that every user naturally reads the IOTstack documentation before trying out a new container).

A better approach is to file an issue (or a Pull Request) upstream. Rather than populating /opt/couchdb/etc directly inside the container, it is better to pre-populate another location (eg /couchdb/etc) as the "master version" and then use a tool like rsync inside the entry-point script to selectively replace any files that go missing from the "working" version. Something like:

$ rsync -arpv --ignore-existing /couchdb/etc/ /opt/couchdb/etc

Given a "clean slate" or "first run" situation (ie no persistent store) the destination path is fully populated from the template. The same happens if CouchDB is instantiated via a docker run rather than docker-compose, so it really makes no difference to anyone.
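If rsync isn't available in the image, the same "copy only what is missing" idea can be sketched with standard tools. This is not how couchdb-docker works today; restore_missing is a hypothetical function and the master/working split is the proposal described above.

```shell
# Copy files from a master tree into a working tree, never overwriting.
# $1 = master directory, $2 = working directory
restore_missing() {
  master=$1
  working=$2
  (cd "$master" && find . -type f) | while IFS= read -r rel; do
    if [ ! -e "$working/$rel" ]; then
      mkdir -p "$(dirname "$working/$rel")"
      cp -p "$master/$rel" "$working/$rel"
      echo "restored ${rel#./}"
    fi
  done
}
```

In an entry-point script the call would be something like restore_missing /couchdb/etc /opt/couchdb/etc, run before CouchDB itself is started. Files you have edited are left alone; only files that have gone missing are put back.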

My own view of "best practice" is that containers should always at least start and not go into a restart loop. Releasing a container where you assume every potential user necessarily has sufficient knowledge to be able to recognise, diagnose and fix the underlying causes of restart loops is, I think, a bit much.

the next step

If you want to propose a Pull Request, there are instructions at:

Although it is written with IOTstack in mind, the same basic principles apply if you want to propose improvements to couchdb-docker.
