Persisting & Sharing Data in Docker

In this post we will look at the ways storage from the host machine can be mounted into containers. These mounts can also serve as a communication channel when networking is disabled for your containers.

A quick intro

Docker is a popular containerization tool used for packaging, deploying, and running applications.

Containers are supposed to be lightweight, but by default all files created inside a container are stored on its writable layer, so the container grows as data is written to it.

What do we miss here?

  • Persistence: data on the writable layer is wiped out when the container no longer exists.

  • Sharing: because the writable layer is isolated for each container, sharing data between containers becomes difficult. Or consider a scenario where some other process on the host wants to use data from the container; the container will surely disappoint that needy process.

What Docker offers!

Docker offers the following options for containers to persist and share data on Linux-based systems:

  • volumes, stored in a part of the host filesystem that is managed by Docker
  • bind mounts, which can be stored anywhere on the host and are managed by the host system
  • tmpfs mounts, stored in the host system’s RAM.

Note: only volumes and bind mounts provide data persistence.

Volume

  • Volumes are the preferred mechanism for persisting data generated and used by Docker containers. A volume does not increase the size of the containers using it, and its contents exist outside the lifecycle of any given container.
  • A single volume can be mounted into multiple containers simultaneously, and Docker itself manages it.
  • When no running container is using a volume, the volume is still available to Docker and is not removed automatically.
  • Let's create and inspect a volume
     $ docker volume create myvol1
     
     $ docker volume inspect myvol1
     //result of inspect
     {
         "CreatedAt": "2020-04-25T12:13:06+05:30",
         "Driver": "local",
         "Labels": {},
         "Mountpoint": "/var/lib/docker/volumes/myvol1/_data",
         "Name": "myvol1",
         "Options": {},
         "Scope": "local"
     }
    
  • Volumes can be named or anonymous, depending on whether you name them explicitly, and can be easily managed via Docker CLI commands or the Docker API.
  • Volume drivers let you store volumes on remote hosts or cloud providers, encrypt the contents of volumes, or add other functionality. The default driver is local.
  • Now let's share the volume between two containers running nginx
$ docker run -d -p 8091:80 --name=nginx1 --mount source=myvol1,destination=/usr/share/nginx/html nginx:latest

$ docker run -d -p 8092:80 --name=nginx2 --mount source=myvol1,destination=/usr/share/nginx/html nginx:latest
  • Now update the index.html page inside either container, and you will see the change in both containers (ports 8091 and 8092 on the host).
  • A volume can be mounted read-write in some containers and read-only in others at the same time.
//use the --mount flag in this way
 
--mount source=myvol1,destination=/app/,readonly
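
The points above (a volume outlives the containers that use it, can be mounted read-only, and is not removed automatically) can be sketched as one short script. This is a minimal sketch under stated assumptions: the run wrapper, the DRY_RUN variable, the volume name myvol-demo and the file /data/greeting.txt are illustrative choices, not part of Docker or of the examples above. By default the script only prints the commands; set DRY_RUN=0 on a machine with Docker to execute them.

```shell
# Illustrative wrapper (not part of Docker): prints each command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical demo: write through a read-write mount, read the same file
# from a second container through a read-only mount, then remove the volume.
volume_lifecycle_demo() {
  vol=$1
  run docker volume create "$vol"
  # First container writes a file and exits (--rm removes the container).
  run docker run --rm --mount "source=$vol,destination=/data" \
    alpine:latest sh -c 'echo hello > /data/greeting.txt'
  # Second container sees the file even though the first one is gone.
  run docker run --rm --mount "source=$vol,destination=/data,readonly" \
    alpine:latest cat /data/greeting.txt
  # The volume is not removed automatically, so remove it explicitly.
  run docker volume rm "$vol"
}

volume_lifecycle_demo myvol-demo
```

With DRY_RUN=0, the second container should print the text written by the first one, which shows the data living outside either container's lifecycle.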

Bind mount

  • A bind mount is a file or directory on the host machine that is mounted into a container.
  • The file or directory is referenced by its full path and relies on the host machine’s filesystem for its directory structure.
  • It can be used by both Docker and other processes side by side.
  • Let's see how to use a local directory on your system inside a container.
//create a file inside the data folder
$ mkdir -p /home/knoldus/Desktop/data && cd /home/knoldus/Desktop/data
$ echo "Hello i am file on desktop" > file.txt
$ cat file.txt 
Hello i am file on desktop

//Now bind this dir with container
$ docker run -it --name=myLocalData --mount type=bind,source=/home/knoldus/Desktop/data,target=/LocalData ubuntu:18.04 /bin/bash

//Now print the data
root@0bc54fe572cb:/# cat LocalData/file.txt 
Hello i am file on desktop

  • Note also that when you bind-mount into a non-empty directory in the container, the directory’s existing contents are obscured by the bind mount.
  • A bind mount can also be read-only, since some containers only need to read from it.
//create a container and bind mount in readonly mode
$ docker run -d --name=myLocalDataRo --mount type=bind,source=/home/knoldus/Desktop/data,target=/LocalData,readonly alpine:latest

//inspect to verify
$ docker inspect myLocalDataRo

//you will get a similar result in mounts section of the output
"Mounts": [
                {
                    "Type": "bind",
                    "Source": "/home/knoldus/Desktop/data",
                    "Target": "/LocalData",
                    "ReadOnly": true
                }
            ],

Note: in the examples above we bind-mounted the same source into two different containers, myLocalData (read-write) and myLocalDataRo (read-only).
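
Because a bind mount is referenced by its full path, it can help to check the source before handing it to Docker; with --mount, Docker reports an error when the source path does not exist instead of creating it. The helper below is a minimal sketch of such a check. The name bind_mount_arg and its ro/rw argument are hypothetical, introduced only for illustration.

```shell
# Hypothetical helper: validate a bind source, then print a value for --mount.
# Usage: bind_mount_arg SOURCE TARGET [ro|rw]
bind_mount_arg() {
  src=$1
  target=$2
  mode=${3:-rw}
  # The source must be an absolute path on the host.
  case "$src" in
    /*) ;;
    *) echo "bind source must be an absolute path: $src" >&2; return 1 ;;
  esac
  # The source must already exist on the host.
  if [ ! -e "$src" ]; then
    echo "bind source does not exist: $src" >&2
    return 1
  fi
  arg="type=bind,source=$src,target=$target"
  if [ "$mode" = ro ]; then
    arg="$arg,readonly"
  fi
  printf '%s\n' "$arg"
}

# Print the --mount value for the current directory, mounted read-only.
bind_mount_arg "$PWD" /LocalData ro

# Possible use with Docker (not executed here):
# docker run -it --mount "$(bind_mount_arg "$PWD" /LocalData ro)" ubuntu:18.04 /bin/bash
```

The sketch does not handle paths that contain commas, since a comma separates the fields of a --mount value.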

tmpfs mount

  • A tmpfs mount allows the container to create files outside the writable layer, but the data is not persisted on disk, either on the Docker host or within the container.

  • It lasts only for the lifetime of the container and is suited to storing non-persistent state or sensitive information.

  • Let's see how to use a tmpfs mount

//via --tmpfs flag

$ docker run -it --rm --name tmpfstest1 --tmpfs /myTmpData alpine:latest

//to see how much memory it occupies 
/ # /bin/df -h | grep myTmpData
Filesystem                Size      Used Available Use% Mounted on

tmpfs                     3.8G         0      3.8G   0% /myTmpData

  • The --tmpfs flag as used above sets no options, so the mount gets the default size (3.8G here). A container that writes heavily to it can consume host memory and degrade your system's performance.
  • It's good to use the --mount flag, as it is more explicit and verbose and makes options such as tmpfs-size easy to set.
//via --mount flag
$ docker run --rm -it --name=tmpfstest2 --mount type=tmpfs,target=/tmpData,tmpfs-size=5m alpine:latest


//to see how much memory it occupies 
/ # /bin/df -h | grep tmpData
Filesystem                Size      Used Available Use% Mounted on

tmpfs                     5.0M         0      5.0M   0% /tmpData
  • A tmpfs mount can be very useful when the containerized application needs to process large data quickly, because the mount is backed by RAM.
  • As soon as the container stops, the tmpfs mount is removed, and files written there won’t be persisted.
  • Also, this functionality is only available if you’re running Docker on Linux.
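
The df output above shows the size of the mount. To confirm from inside the container that a path really is backed by tmpfs, one option is to read /proc/mounts, which exists on Linux. The function below is a minimal sketch: is_tmpfs_mount is a hypothetical name, and its optional second argument (an alternative mounts file) is an assumption added only to make the function easy to try out.

```shell
# Hypothetical helper: succeeds if MOUNT_POINT is listed as a tmpfs mount.
# /proc/mounts columns: device, mount point, filesystem type, options, ...
# Usage: is_tmpfs_mount MOUNT_POINT [MOUNTS_FILE]
is_tmpfs_mount() {
  mount_point=$1
  mounts=${2:-/proc/mounts}   # override only for experimentation
  # Return 2 when the mounts file is unavailable (for example, not on Linux).
  [ -r "$mounts" ] || return 2
  awk -v p="$mount_point" \
    '$2 == p && $3 == "tmpfs" { found = 1 } END { exit !found }' "$mounts"
}

# Run inside the container started with --tmpfs /myTmpData.
if is_tmpfs_mount /myTmpData; then
  echo "/myTmpData is a tmpfs mount"
else
  echo "/myTmpData is not a tmpfs mount"
fi
```

Anywhere other than inside that container, the script is expected to report that /myTmpData is not a tmpfs mount.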

Let's summarise

  • Volumes are the best and preferred way of sharing data among multiple running containers, and are managed by Docker itself.
  • Bind mounts are best used when you want to share source code or build artifacts between a development environment on the Docker host and a container.
  • tmpfs mounts are useful when your application needs to write a large volume of non-persistent state data.

Thanks for Keeping Up...
