Docker on BTRFS is very buggy and can leave a system completely unusable. It can mangle the underlying BTRFS filesystem so that it uses far more disk space than it needs, and it can reach a state where it cannot delete any image at all. Recovering may require drastic actions, up to and including reformatting the entire affected BTRFS root file system.
According to the official Docker documentation:
btrfs requires a dedicated block storage device such as a physical disk. This block device must be formatted for Btrfs and mounted into /var/lib/docker/.
In my experience, you will still run into issues even if you use a dedicated partition. It seems to require a standalone hard drive, which is a luxury many computers simply cannot afford.
See "Docker gradually exhausts disk space on BTRFS #27653" for details of exactly what I have run into. See also "docker does not remove btrfs subvolumes when destroying container".
A pseudo filesystem is a filesystem that is contained inside an otherwise-ordinary file and mounted by the OS. This guide will show you how to set one up and use it exclusively for Docker images and containers in a way that will NOT cripple your BTRFS file system, while still allowing you to store it in normal BTRFS subvolume snapshots.
Steps to migrate /var/lib/docker from a subdirectory to a dedicated pseudo filesystem.
- BACKUP ANY IMPORTANT self-made Docker images! This guide will destroy all of your existing images and containers.
docker save image/name -o image_name.docker; bzip2 image_name.docker
- Open up a terminal and run the command
sudo watch -n10 df /var/lib/docker
Pay attention to the total space available. Because BTRFS deletes files from the system only when the disk is inactive, it is important to know when certain processes have really finished, or if they are even happening. In a BTRFS file system that is corrupted by Docker, often no file will actually be removed from the underlying file system. If this happens, refer to the Drastic Actions section.
- Make a BTRFS volume snapshot!! We are messing with your core file system. It is important to make a snapshot. If all goes to Hell, refer to the Drastic Actions section for how to restore the snapshot and get quickly back to work.
sudo mkdir /snaps
sudo btrfs subvolume snapshot / /snaps/root-$(date '+%Y-%m-%d')-pre
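The backup command above handles one image at a time. As a minimal sketch, assuming you want every tagged local image saved before the cleanup starts, the loop below derives a file name from each image reference. The names `image_to_filename`, `backup_all_images` and the `BACKUP_DIR` variable are hypothetical and exist only for this illustration; `docker images --format`, `docker save -o` and `bzip2` are the only real commands involved.

```shell
#!/bin/sh
# Hypothetical helper: turn an image reference such as "repo/name:tag"
# into a flat file name by replacing "/" and ":" with "_".
image_to_filename() {
    printf '%s.docker\n' "$(printf '%s' "$1" | tr '/:' '__')"
}

# Save and compress every tagged image into BACKUP_DIR (an assumed location).
backup_all_images() {
    backup_dir="${BACKUP_DIR:-$HOME/docker-backup}"
    mkdir -p "$backup_dir" || return 1
    docker images --format '{{.Repository}}:{{.Tag}}' |
        grep -v '<none>' |
        while read -r image; do
            out="$backup_dir/$(image_to_filename "$image")"
            docker save -o "$out" "$image" && bzip2 "$out"
        done
}

# Nothing runs until you call the function yourself:
# backup_all_images
```

Untagged images are skipped on purpose, since they cannot be saved by name; back those up by ID if you still need them.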
Clean Up Docker /var/lib/docker files.
Delete all of the docker containers:
docker rm $(docker ps -aq)
docker ps -aq should return nothing.
Delete all of the docker images.
docker rmi -f $(docker images -q)
NOTE: If you do not see any activity for several minutes, it is indicative of a BTRFS meltdown. To verify, run
sudo du -hs /var/lib/docker
If it is still running after 3-5 minutes, refer to the Drastic Actions section.
docker images -q should return nothing.
sudo systemctl stop docker
NOTE: When docker has butchered the BTRFS file system, docker often CANNOT be stopped via this step. Fortunately, a simple system reboot resolves this issue. Do that now if you encounter this problem.
3b. Ensure that docker is completely stopped.
ps aux | grep docker
4. Explore the
sudo -s
cd /var/lib/docker
du -h --max-depth=1 | sort -h
Because you have deleted literally 100% of the files which docker stores, your /var/lib/docker should be virtually empty, maybe a few MB max. However, if Docker has been abusing the underlying root BTRFS system, many times many GBs will still
5. Attempt to remove all of the files manually:
DO NOT USE THE rm COMMAND! This will not work, and if it does, you will have irreversibly corrupted your BTRFS system. Go immediately to the Drastic Actions section if you have accidentally done so.
As discussed in "nuking old and broken /var/lib/docker directories is non-trivial", the only safe way to remove broken /var/lib/docker files on BTRFS is to do the following:
for subvolume in /var/lib/docker/btrfs/subvolumes/*; do
    btrfs subvolume delete "$subvolume"
done
- Ensure that all docker BTRFS subvolumes have been destroyed:
btrfs subvolume list /
You should not see any entries with the path
- Manually remove all the other files in
rm -r /var/lib/docker/*
Ensure that it is empty by running both
du -h ., both of which should report 0 disk space used.
If all has gone well, you now have a BTRFS file system that is devoid of all docker-related images, containers and various metadata and caches. Congratulations!
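Because deleting subvolumes is the riskiest part of the cleanup, it can help to preview exactly what would be deleted first. The following is a minimal sketch, not part of the original procedure: the function name `delete_docker_subvolumes` and the `DRY_RUN` variable are hypothetical, and the default directory is the one used in the loop above. Only `btrfs subvolume delete` is a real command here.

```shell
#!/bin/sh
# Delete every Docker-created subvolume under a directory.
# With DRY_RUN=1 (the default in this sketch) the commands are only printed.
delete_docker_subvolumes() {
    dir="${1:-/var/lib/docker/btrfs/subvolumes}"
    for subvolume in "$dir"/*; do
        # Skip the unexpanded glob when the directory is empty.
        [ -e "$subvolume" ] || continue
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "btrfs subvolume delete $subvolume"
        else
            btrfs subvolume delete "$subvolume"
        fi
    done
}

# Preview first, then run for real as root:
# delete_docker_subvolumes
# DRY_RUN=0 delete_docker_subvolumes
```

Quoting `"$subvolume"` keeps paths with unusual characters intact, and the `-e` check avoids passing a literal `*` to btrfs when nothing is left to delete.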
Create the pseudo file system
- Ensure that you are the root user.
- Create the pseudo filesystem:
The best place to store file-based pseudo filesystems is in
Estimate how much space you will need, or want to reserve, for Docker images. I find that 10-20 GB is far more than enough for properly functioning systems.
cd /media
fallocate -l 10G docker-volume.img
mkfs.ext4 docker-volume.img
mount -o loop -t ext4 /media/docker-volume.img /var/lib/docker
df -h
# You should see: /dev/loop0 9.8G 37M 9.3G 1% /var/lib/docker
umount /var/lib/docker
- Add the pseudo filesystem to the "mount on boot" config.
echo "/media/docker-volume.img /var/lib/docker ext4 defaults 0 0" >> /etc/fstab
- Test mount it:
- Restart docker and confirm that it is using the pseudo filesystem:
systemctl start docker
systemctl stop docker
cd /media
ls /var/lib/docker
# You should see many subdirectories.
du -h /var/lib/docker
# It should report approximately 35 directories, and about 256 KB of space used.
# You should NOT see any mention of BTRFS subvolumes.
umount /var/lib/docker
du -h /var/lib/docker
# You should see: 0 /var/lib/docker/
- Now reboot the system and confirm that the volume has auto-mounted and that docker is using it.
Congratulations! You have now moved Docker volumes from BTRFS to a pseudo ext4 file system, which docker supports much better!
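Two of the steps above are easy to get subtly wrong: appending to /etc/fstab more than once, and confirming after a reboot that /var/lib/docker really sits on the ext4 image. Here is a minimal sketch of both checks. The function names `ensure_fstab_entry` and `mounted_fstype` are hypothetical, and both take the file to read as a parameter so you can try them on a copy before touching the real files.

```shell
#!/bin/sh
# Append an fstab line only if an identical line is not already present.
# The file path is a parameter so the function can be tried on a copy first.
ensure_fstab_entry() {
    entry="$1"
    fstab="${2:-/etc/fstab}"
    grep -qxF -e "$entry" "$fstab" 2>/dev/null || printf '%s\n' "$entry" >> "$fstab"
}

# Print the filesystem type mounted at a mount point, reading a
# /proc/mounts-style table (fields: device mountpoint fstype options ...).
# If the mount point is listed several times, the last entry wins.
mounted_fstype() {
    mountpoint="$1"
    table="${2:-/proc/mounts}"
    awk -v mp="$mountpoint" \
        '$2 == mp { fstype = $3 } END { if (fstype != "") print fstype }' "$table"
}

# Example usage (as root):
# ensure_fstab_entry "/media/docker-volume.img /var/lib/docker ext4 defaults 0 0"
# mounted_fstype /var/lib/docker
```

After the reboot, `mounted_fstype /var/lib/docker` should print ext4 if the image was mounted. Running `docker info --format '{{.Driver}}'` is another way to check that Docker is no longer using its btrfs storage driver.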
- IMPORTANT: Take a new snapshot of the fixed system and remove the one we made at the beginning of this guide.
sudo btrfs subvolume snapshot / /snaps/root-$(date '+%Y-%m-%d')
sudo btrfs subvolume del /snaps/root-$(date '+%Y-%m-%d')-pre
If you ever run into a corrupted /var/lib/docker in the future, simply sudo rm /media/docker-volume.img and repeat this guide. It is much better than risking your entire BTRFS file system to docker's buggy implementation!
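Repeating the guide for an existing image file boils down to a handful of commands already shown above. As a rough sketch, assuming the image lives at /media/docker-volume.img, is 10G in size, and already has its /etc/fstab entry, the reset could be scripted as follows. The `step` and `reset_docker_volume` functions and the `RUN`, `IMG` and `SIZE` variables are hypothetical; by default the commands are only printed so you can review them.

```shell
#!/bin/sh
# Recreate the Docker image file from scratch.
# Commands are printed by default; set RUN=1 (as root) to execute them.
IMG="${IMG:-/media/docker-volume.img}"
SIZE="${SIZE:-10G}"

# Run a command when RUN=1, otherwise just print it.
step() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# The chain stops at the first failing command.
# "mount /var/lib/docker" relies on the /etc/fstab entry added earlier.
reset_docker_volume() {
    step systemctl stop docker &&
    step umount /var/lib/docker &&
    step rm "$IMG" &&
    step fallocate -l "$SIZE" "$IMG" &&
    step mkfs.ext4 "$IMG" &&
    step mount /var/lib/docker &&
    step systemctl start docker
}

# Preview:      reset_docker_volume
# Run for real: RUN=1 reset_docker_volume
```

Remember that this destroys every image and container stored in the file, so back up anything important first.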
Attempt a BTRFS restore
Things didn't go so well? Unfortunately, this happens.
First things first, attempt to restore an older snapshot that may not be corrupted.
Follow the guide here: Using Btrfs for Easy Backup and Rollback
If that fails, restore the snapshot taken in the prep stage of this guide. That will at least get you back to the same state your system was in before you started all of this.
Attempt via a rescue disk
Mount the partition while inside a recovery system like System Rescue CD and reattempt this guide from the very beginning.
When I was in a total desperate situation where Docker had consumed so much of the file system that basic commands would not run, this method saved me.
Back up and Reformat the entire system.
In early 2017, no matter what I tried, nothing worked. If you find yourself in this unfortunate state, back up all of your important files, maybe via a recovery system, and reformat the machine. I still recommend BTRFS as it is vastly superior to all other mainstream file systems. Just don't use it with docker!
Be sure to leave your horror story on the official Docker bug reports for this issue:
I would like to ask if it's a real problem or people are just confused (including me). I have OpenSuse Leap 15.5 beta.
What I found is that /var/lib/docker/btrfs occupied 43GB of space and went down to 38GB when I deleted all containers and unused images.
df -h reports 52GB used total. Later, after reading this and a few horror stories, I found that in my home directory I have an old nextcloud volume and its tarball backup, which are about 2x17GB. 38+2x17 is 72GB, which is already 20GB more than what df reports for the whole disk, and likely 10GB is in other directories.
docker system df says I have 4.1GB in 5 images (yes, one of them has 2GB), 0GB in containers, and 600MB in volumes (nextcloud is archived), which is realistic.
Now this is a bit weird. First, the image dir is nearly empty. Second, the btrfs directory says that "Set shared" is 3.85GB, which matches almost exactly the 4.1GB reported by the docker system df command. 35.23GiB is 37.8GB, which matches the 38G reported by the df command. I guess btrfs contains both real data and snapshots.
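As a quick check of the unit conversion above, here is a small sketch that converts GiB (2^30 bytes) to GB (10^9 bytes). The function name `gib_to_gb` is made up for this example.

```shell
#!/bin/sh
# Convert GiB (2^30 bytes) to GB (10^9 bytes), rounded to two decimals.
gib_to_gb() {
    awk -v gib="$1" 'BEGIN { printf "%.2f\n", gib * 1073741824 / 1000000000 }'
}

# The figure quoted above:
gib_to_gb 35.23
```

This prints 37.83, so the 35.23GiB figure and the roughly 38G from df describe the same amount of data in different units.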
I'm very new to using docker, but my rough understanding is that every subvolume/id is a snapshot created during the image build process.