Snapshot space accounting is tricky. This document demonstrates that.
If there is not enough available space to meet the snapshot's space requirements, zfs snapshot [-r] <dataset>@<snapname> will fail with ENOSPC. Simple, right? Not so much.
It is relatively easy to determine how much space is available. The following shows 990 MiB available.
[root@buglets ~]# zfs list -o name,available zones/2925dec4-ba6d-cb5a-c41e-84c7e3c08d3e
NAME AVAIL
zones/2925dec4-ba6d-cb5a-c41e-84c7e3c08d3e 990M
To be more precise, use the -p option.
[root@buglets ~]# zfs list -Hpo available zones/2925dec4-ba6d-cb5a-c41e-84c7e3c08d3e
1038022144
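For scripting, the parseable form lends itself to a pre-flight check. The following is a sketch; the helper name and the required byte count are placeholders, not part of any existing tooling.

```shell
# Sketch (hypothetical helper): fail early if a dataset's available space,
# in bytes, is below a required amount.
has_space() {
    ds=$1      # dataset name
    need=$2    # required bytes
    avail=$(zfs list -Hpo available "$ds")
    [ "$avail" -ge "$need" ]
}

# Hypothetical usage: require 100 MiB of headroom before snapshotting.
# has_space zones/a/vol $(( 100 * 1024 * 1024 )) && zfs snapshot zones/a/vol@snap1
```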
The amount of space available to a zone may be less than the amount available in the pool.
[root@buglets ~]# zfs list -o name,available zones
NAME AVAIL
zones 16.1G
Available space must not be conflated with free space. A pool may have lots of free space that is nonetheless not available for use, due to redundancy (mirrors, raidz) or because it is reserved through the reservation or refreservation properties.
[root@buglets ~]# zpool get free zones
NAME PROPERTY VALUE SOURCE
zones free 49.9G -
However, when refquota is smaller than quota, be aware that available will never be larger than refquota. This results in under-reporting the amount of space that is available to descendants.
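When refquota caps the available column this way, a rough estimate of descendant headroom can be derived from quota and used instead. The helper below is my own sketch, not existing tooling; it assumes that with -p, quota=none reports as 0.

```shell
# Sketch (hypothetical helper): approximate the space still available to
# descendants when a refquota on the parent makes "available" misleading.
descendant_avail() {
    ds=$1
    quota=$(zfs get -Hpo value quota "$ds")
    used=$(zfs get -Hpo value used "$ds")
    if [ "$quota" -eq 0 ]; then
        # No quota set: fall back to the dataset's available figure.
        zfs get -Hpo value available "$ds"
    else
        echo $(( quota - used ))
    fi
}
```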
For a non-recursive snapshot, the amount of space required is the amount of space referenced only by the dataset that is being snapshotted. Simple enough? Nope! This is really thorny.
XXX Maybe not so thorny. See the written property. Still, if we are planning for consistent behavior, relying on this may not be the best plan.
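A pre-flight estimate based on the written property (bytes modified since the most recent snapshot; equal to referenced when no snapshot exists) might look like the following. Per the caveat above, treat this as a heuristic, not a guarantee; the helper name is hypothetical.

```shell
# Sketch (heuristic, per the XXX note above): guess whether a snapshot of a
# dataset is likely to fit by comparing "written" against "available".
snapshot_fits() {
    ds=$1
    written=$(zfs get -Hpo value written "$ds")
    avail=$(zfs get -Hpo value available "$ds")
    [ "$avail" -ge "$written" ]
}

# Hypothetical usage:
# snapshot_fits zones/a/vol && zfs snapshot zones/a/vol@snap1
```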
Let's simulate a bhyve zone's hierarchy, initially ignoring the fact that the boot disk should be a clone.
[root@buglets ~]# zfs create zones/a
[root@buglets ~]# zfs create -V 100m zones/a/vol
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/a
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/a - 105M 23K none none none none 105M 23K 0 0 15.9G
zones/a/vol 100M 105M 12K - - none 105M 105M 12K 0 105M 16.0G
Initially zones/a/vol references 12 KiB, which is all metadata. When a snapshot is taken, the snapshot will refer to 12 KiB, but the snapshot uses no space (used is zero).
[root@buglets ~]# zfs snapshot zones/a/vol@snap1
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/a
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/a - 105M 23K none none none none 105M 23K 0 0 15.9G
zones/a/vol 100M 105M 12K - - none 105M 105M 12K 0 105M 16.0G
zones/a/vol@snap1 - 0 12K - - - - 0 - - - -
Let's write 50 MiB of non-zero data to the volume and try the snapshot again.
[root@buglets ~]# zfs destroy zones/a/vol@snap1
[root@buglets ~]# openssl rand $(( 1024 * 1024 * 50 )) > /dev/zvol/rdsk/zones/a/vol
[root@buglets ~]# zfs snapshot zones/a/vol@snap1
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/a
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/a - 156M 23K none none none none 156M 23K 0 0 15.9G
zones/a/vol 100M 156M 50.5M - - none 105M 156M 50.5M 0 105M 16.0G
zones/a/vol@snap1 - 0 50.5M - - - - 0 - - - -
Again, the snapshot refers to the same amount of data as the volume and uses no space of its own.
If we overwrite that data, the snapshot consumes a bunch of space.
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/a
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/a - 156M 23K none none none none 156M 23K 0 0 15.9G
zones/a/vol 100M 156M 50.5M - - none 105M 156M 50.5M 50.5M 54.7M 15.9G
zones/a/vol@snap1 - 50.5M 50.5M - - - - 50.5M - - - -
Let's try that again with a quota. First we clean up what was done above.
[root@buglets ~]# zfs destroy zones/a/vol@snap1
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/a
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/a - 105M 23K none none none none 105M 23K 0 0 15.9G
zones/a/vol 100M 105M 50.5M - - none 105M 105M 50.5M 0 54.7M 16.0G
A bhyve zone uses refquota to restrict the amount of data that can be written to the zone's dataset, ignoring its descendant snapshots and volumes. By setting refquota, we've reduced the amount of space that the filesystem can use.
[root@buglets ~]# zfs set refquota=50m zones/a
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/a
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/a - 105M 23K none 50M none none 105M 23K 0 0 50.0M
zones/a/vol 100M 105M 50.5M - - none 105M 105M 50.5M 0 54.7M 16.0G
Now, zones/a can only use 50 MiB, but zones/a/vol can still use up to 16 GiB. Let's set a limit on the amount of space that the zone's entire hierarchy can use by setting a quota (150 MiB, per the listing below) on zones/a.
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/a
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/a - 105M 23K 150M 50M none none 105M 23K 0 0 44.7M
zones/a/vol 100M 105M 50.5M - - none 105M 105M 50.5M 0 54.7M 99.5M
Now there's 44.7 MiB available and the volume references 50.5 MiB. That's not enough space for a snapshot.
[root@buglets ~]# zfs snapshot zones/a/vol@snap1
cannot create snapshot 'zones/a/vol@snap1': out of space
To test the assertion that this failure is caused by the 44.7 MiB available being less than the 50.5 MiB referenced, let's increase the quota. To simplify things, we'll remove the refquota for a minute to make the available column meaningful.
[root@buglets ~]# zfs set refquota=none zones/a
[root@buglets ~]# zfs set quota=156m zones/a
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/a
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/a - 105M 23K 156M none none none 105M 23K 0 0 50.7M
zones/a/vol 100M 105M 50.5M - - none 105M 105M 50.5M 0 54.7M 105M
Now a snapshot can be created, with 224 KiB (0.2 MiB) to spare.
[root@buglets ~]# zfs snapshot zones/a/vol@snap1
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/a
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/a - 156M 23K 156M none none none 156M 23K 0 0 224K
zones/a/vol 100M 156M 50.5M - - none 105M 156M 50.5M 0 105M 105M
zones/a/vol@snap1 - 0 50.5M - - - - 0 - - - -
To maximize confusion, we can create another snapshot now too. This works because both snapshots refer to the same blocks.
[root@buglets ~]# zfs snapshot zones/a/vol@snap2
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/a
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/a - 156M 23K 156M none none none 156M 23K 0 0 224K
zones/a/vol 100M 156M 50.5M - - none 105M 156M 50.5M 0 105M 105M
zones/a/vol@snap1 - 0 50.5M - - - - 0 - - - -
zones/a/vol@snap2 - 0 50.5M - - - - 0 - - - -
While it is really nice that there is room for this second snapshot, there is no way to predict from user space whether enough space exists.
Let's get rid of the second snapshot and try writing some data to the volume.
[root@buglets ~]# openssl rand $(( 1024 * 1024 * 50 )) > /dev/zvol/rdsk/zones/a/vol
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/a
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/a - 156M 23K 156M none none none 156M 23K 0 0 224K
zones/a/vol 100M 156M 50.5M - - none 105M 156M 50.5M 50.5M 54.7M 55.0M
zones/a/vol@snap1 - 50.5M 50.5M - - - - 50.5M - - - -
Notice that this write caused usedrefreserv (used by refreservation) to decrease by the amount (subject to rounding) that usedsnap (used by snapshots) increased, keeping the value of used constant. In fact, we can do a full overwrite of the volume.
[root@buglets ~]# openssl rand $(( 1024 * 1024 * 100 )) > /dev/zvol/rdsk/zones/a/vol
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/a
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/a - 156M 23K 156M none none none 156M 23K 0 0 224K
zones/a/vol 100M 156M 101M - - none 105M 156M 101M 50.5M 4.23M 4.45M
zones/a/vol@snap1 - 50.5M 50.5M - - - - 50.5M - - - -
Even though the volume was fully overwritten, there is still some space available. This is because the refreservation=auto value accounts for raidz variants that are less efficient at storing metadata. To ensure that volumes can be sent to other pools, it is best not to optimize this slop away.
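The columns in these listings obey the identity used = usedbydataset + usedbysnapshots + usedbyrefreservation (plus usedbychildren, which is zero for a volume). A quick check with the rounded figures for zones/a/vol in the last listing:

```shell
# Check the accounting identity with the rounded MiB figures shown above.
# awk is used because the values are fractional.
awk 'BEGIN {
    usedds        = 101     # referenced by the volume itself
    usedsnap      = 50.5    # pinned by snap1
    usedrefreserv = 4.23    # remaining refreservation charge
    # Sums to ~156M, matching the used column within rounding.
    printf "%.1fM\n", usedds + usedsnap + usedrefreserv
}'
```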
Now let's look at a scenario with a boot disk, a data disk, and [ref]quota properties that are representative of what a typical guest would look like.
[root@buglets ~]# zfs create -o refquota=10m -o quota=220m zones/b
[root@buglets ~]# zfs create -V 100m zones/b/boot
[root@buglets ~]# zfs create -V 100m zones/b/data
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/b
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/b - 211M 23K 220M 10M none none 211M 23K 0 0 9.48M
zones/b/boot 100M 105M 12K - - none 105M 105M 12K 0 105M 115M
zones/b/data 100M 105M 12K - - none 105M 105M 12K 0 105M 115M
When we create a snapshot, we will want a recursive snapshot that captures all disks in a single transaction group. There may be some configuration data stored in /zones/b/config that should come along with this snapshot. Before we write to anything, we can create such a snapshot:
[root@buglets ~]# zfs snapshot -r zones/b@snap1
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/b
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/b - 211M 23K 220M 10M none none 211M 23K 0 0 9.45M
zones/b@snap1 - 0 23K - - - - 0 - - - -
zones/b/boot 100M 105M 12K - - none 105M 105M 12K 0 105M 115M
zones/b/boot@snap1 - 0 12K - - - - 0 - - - -
zones/b/data 100M 105M 12K - - none 105M 105M 12K 0 105M 115M
zones/b/data@snap1 - 0 12K - - - - 0 - - - -
However, writing just a little bit of data (5% of the total allocated to the disks) makes it so that we can't create a snapshot.
[root@buglets ~]# zfs destroy -r zones/b@snap1
[root@buglets ~]# openssl rand $(( 1024 * 1024 * 10 )) > /dev/zvol/rdsk/zones/b/data
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/b
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/b - 211M 23K 220M 10M none none 211M 23K 0 0 9.48M
zones/b/boot 100M 105M 12K - - none 105M 105M 12K 0 105M 115M
zones/b/data 100M 105M 10.1M - - none 105M 105M 10.1M 0 95.1M 105M
[root@buglets ~]# zfs snapshot -r zones/b@snap1
cannot create snapshot 'zones/b/data@snap1': out of space
no snapshots were created
Taken to the extreme, if both disks were fully written, the package would need more than double the combined size of the two vdisks to be able to create a snapshot. Assuming quota is the means by which we ensure that customers don't exceed the space reserved by their allocation, they will need to use only half the space advertised by the package to ensure that snapshots are possible.
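That rule of thumb can be sketched numerically. The 5% slop figure below is my own estimate, derived from the ~105M refreservation observed on the 100M volumes above; it is not a ZFS constant.

```shell
# Back-of-the-envelope sizing (an assumption, not a ZFS formula): to leave
# room for one full snapshot cycle, budget roughly twice each volume's size,
# plus ~5% slop per volume for refreservation=auto metadata overhead.
boot_mib=100
data_mib=100
slop_pct=5

vol_total=$(( boot_mib + data_mib ))                    # 200 MiB of vdisks
reserved=$(( vol_total + vol_total * slop_pct / 100 ))  # ~210 MiB reserved
snapshot_safe_quota=$(( reserved * 2 ))                 # ~420 MiB quota needed
echo "${snapshot_safe_quota}M"
```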
This is likely to drive the need for temporarily moving to a larger package.
Since we don't reserve space equal to the refquota on the zone's dataset, it is possible to run out of space in the pool or within the zone's dataset prior to reaching that refquota.
[root@buglets ~]# zfs destroy -r zones/b
[root@buglets ~]# zfs create -o refquota=10m -o quota=220m zones/b
[root@buglets ~]# zfs create -V 100m zones/b/boot
[root@buglets ~]# zfs create -V 100m zones/b/data
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/b
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/b - 211M 23K 220M 10M none none 211M 23K 0 0 9.48M
zones/b/boot 100M 105M 12K - - none 105M 105M 12K 0 105M 115M
zones/b/data 100M 105M 12K - - none 105M 105M 12K 0 105M 115M
[root@buglets ~]# openssl rand $(( 1024 * 1024 * 9 )) > /dev/zvol/rdsk/zones/b/data
[root@buglets ~]# zfs snapshot -r zones/b@snap1
[root@buglets ~]# zfs list -r -t all -o name,volsize,used,refer,quota,refquota,reserv,refreserv,used,usedds,usedsnap,usedrefreserv,available zones/b
NAME VOLSIZE USED REFER QUOTA REFQUOTA RESERV REFRESERV USED USEDDS USEDSNAP USEDREFRESERV AVAIL
zones/b - 220M 23K 220M 10M none none 220M 23K 0 0 372K
zones/b@snap1 - 0 23K - - - - 0 - - - -
zones/b/boot 100M 105M 12K - - none 105M 105M 12K 0 105M 106M
zones/b/boot@snap1 - 0 12K - - - - 0 - - - -
zones/b/data 100M 114M 9.10M - - none 105M 114M 9.10M 0 105M 106M
zones/b/data@snap1 - 0 9.10M - - - - 0 - - - -
[root@buglets ~]# openssl rand $(( 1024 * 1024 )) > /zones/b/file1
4274973380:error:02012031:system library:fflush:Disc quota exceeded:bss_file.c:434:fflush()
4274973380:error:20074002:BIO routines:FILE_CTRL:system lib:bss_file.c:436:
[root@buglets ~]# ls -ld /zones/b/file1
-rw-r--r-- 1 root root 393216 Jul 16 22:32 /zones/b/file1
This becomes more of an issue when snapshots are used, as snapshots will try to make use of this unreserved space.
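One possible mitigation, sketched below, would be to give the zone root a refreservation equal to its refquota, trading quota headroom for a guarantee that writes below the refquota do not fail for lack of space elsewhere. This is not something the platform does today, and the helper name is hypothetical.

```shell
# Hypothetical mitigation: reserve space for the zone root dataset equal to
# its refquota, so writes under the refquota cannot fail because snapshots
# or volumes consumed the shared space. This charges against quota up front.
guarantee_refquota() {
    ds=$1
    rq=$(zfs get -Hpo value refquota "$ds")
    zfs set "refreservation=$rq" "$ds"
}

# Hypothetical usage: guarantee_refquota zones/b
```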