@stephanGarland
Last active April 1, 2023 15:30
Examining ZFS storage of files and snapshots with ZDB
# create a simple dataset with a small record size, and chown its mountpoint (assumed here to be the default, /tank/foobar)
❯ sudo zfs create -o recordsize=512 tank/foobar && sudo chown $YOUR_USER:$YOUR_GROUP /tank/foobar
❯ cd /tank/foobar
# make a 1K file filled with hex FF (pull from /dev/zero, then use tr to translate to FF, which is 377 in octal)
# if it's just zeros, there isn't much to look at with zdb
❯ dd if=/dev/zero bs=1k count=1 | tr "\000" "\377" >file.txt
1+0 records in
1+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 4.2977e-05 s, 23.8 MB/s
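# as a quick sanity check, here's a minimal sketch that rebuilds the same 1K of 0xFF in a scratch file and inspects it
# (the mktemp scratch file and the variable names are mine, not part of the walkthrough)

```shell
# hypothetical scratch file, so the real file.txt is left alone
tmp=$(mktemp)
dd if=/dev/zero bs=1k count=1 2>/dev/null | tr '\000' '\377' > "$tmp"

# byte count of the result
size=$(wc -c < "$tmp" | tr -d ' ')

# od -v dumps every byte as two hex digits; fold puts one byte per line,
# and sort -u collapses them to the set of distinct byte values
distinct=$(od -An -v -tx1 "$tmp" | tr -d ' \n' | fold -w 2 | sort -u)

echo "size=$size distinct_bytes=$distinct"
rm -f "$tmp"
```

# if the tr translation worked, the only distinct byte value reported is ff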
# snapshot the dataset with a useful name
❯ zfs snapshot tank/foobar@file_created
# get the file's object id with zdb
❯ sudo zdb -O tank/foobar file.txt
    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
         2    2   128K    512  10.5K     512     1K  100.00  ZFS plain file
# use that to look at the snapshot - a lot of information here, but I'll focus on the blocks section at the end
❯ sudo zdb -ddddd tank/foobar@file_created 2
Dataset tank/foobar@file_created [ZPL], ID 90609, cr_txg 10929325, 139K, 7 objects, rootbp DVA[0]=<0:14b2e8f84000:2000> DVA[1]=<1:156b1b290000:2000> [L0 DMU objset] fletcher4 uncompressed unencrypted LE contiguous unique double size=1000L/1000P birth=10929325L/10929325P fill=7 cksum=11c5b1442c:2f8bb58e2b94:432a86ae0a6b2a:42b84b723b0e81fd
    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
         2    2   128K    512  10.5K     512     1K  100.00  ZFS plain file
176 bonus System attributes
dnode flags: USED_BYTES USERUSED_ACCOUNTED USEROBJUSED_ACCOUNTED
dnode maxblkid: 1
path /file.txt
uid 1000
gid 1000
atime Sat Apr 1 10:35:05 2023
mtime Sat Apr 1 10:35:05 2023
ctime Sat Apr 1 10:35:05 2023
crtime Sat Apr 1 10:35:05 2023
gen 10929325
mode 100644
size 1024
parent 34
links 1
pflags 840800000004
Indirect blocks:
0 L1 2:1516995c6000:2000 20000L/1000P F=2 B=10929325/10929325 cksum=81f40d8556:1e1597a3af44b:37dcef96d097702:555c1ed1d41c8604
0 L0 EMBEDDED et=0 200L/10P B=10929325
200 L0 EMBEDDED et=0 200L/10P B=10929325
segment [0000000000000000, 0000000000000400) size 1K
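# there are two levels of block pointers here: one L1 (indirect) and two L0 (data)
# the L1 line starts with the DVA, written as vdev:offset:asize
# so this block sits at offset 0x1516995c6000 on vdev id 2, with 0x2000 bytes (8 KiB) allocated
# 20000L/1000P gives its logical and physical sizes: 0x20000 bytes (128 KiB, matching iblk) stored in 0x1000 bytes (4 KiB)
# the L0 pointers represent the actual data: a 1024 byte file with a 512 byte record size needs exactly two records
# each record of 0xFF bytes compresses from 0x200 to 0x10 bytes (200L/10P), small enough to be stored inside the block pointer itself (hence EMBEDDED)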
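# the hex fields are easier to reason about in decimal, so here's a small sketch that splits the L1 DVA and sizes from the output above
# (the variable names are mine; the values are copied from the L1 line)

```shell
# values copied from the L1 line in the zdb output
dva='2:1516995c6000:2000'
sizes='20000L/1000P'

# split vdev:offset:asize on the colons
vdev=${dva%%:*}
rest=${dva#*:}
offset_hex=${rest%%:*}
asize_hex=${rest#*:}

# split logical/physical sizes and drop the L and P suffixes
lsize_hex=${sizes%%L*}
psize_hex=${sizes#*/}
psize_hex=${psize_hex%P}

# convert the hex sizes to decimal bytes
asize=$((0x$asize_hex))
lsize=$((0x$lsize_hex))
psize=$((0x$psize_hex))

echo "vdev=$vdev offset=0x$offset_hex"
echo "allocated=$asize logical=$lsize physical=$psize"
```

# this prints allocated=8192 logical=131072 physical=4096, i.e. 8 KiB allocated for a 128 KiB indirect block that occupies 4 KiB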
# now let's add a newline to the file
❯ file file.txt
file.txt: ISO-8859 text, with very long lines, with no line terminators
❯ printf '\n' >> file.txt
❯ file file.txt
file.txt: ISO-8859 text, with very long lines
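# a note on appending that newline: echo -n '\n' only writes a newline in shells whose echo interprets backslash escapes (zsh does, bash by default does not)
# printf behaves the same everywhere, so here's a minimal sketch on a hypothetical scratch file to confirm exactly one 0x0a byte gets appended

```shell
tmp=$(mktemp)               # hypothetical scratch file
printf '\377\377' > "$tmp"  # two 0xFF bytes, no line terminator
printf '\n' >> "$tmp"       # append a single newline byte

# total size, and the last byte rendered as hex
size=$(wc -c < "$tmp" | tr -d ' ')
last_byte=$(tail -c 1 "$tmp" | od -An -tx1 | tr -d ' \n')

echo "size=$size last_byte=$last_byte"
rm -f "$tmp"
```

# size=3 last_byte=0a means one byte was added and it's a newline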
# now snapshot again
❯ zfs snapshot tank/foobar@file_modified
# let's look at blocks again, by piping zdb output through sed
❯ sudo zdb -ddddd tank/foobar@file_modified 2 | sed -n '/Indirect/,$p'
Indirect blocks:
0 L1 2:1516996a8000:2000 20000L/1000P F=3 B=10929381/10929381 cksum=844f3c5dfd:1e7189908ff54:3831289a28780b8:5556f83f0ca17acd
0 L0 EMBEDDED et=0 200L/10P B=10929325
200 L0 EMBEDDED et=0 200L/10P B=10929325
400 L0 EMBEDDED et=0 200L/11P B=10929381
segment [0000000000000000, 0000000000000600) size 1.50K
# now that we've added another byte to the file, it no longer fits into two L0 pointers, so a third record appears at offset 0x400
# additionally, of course, the L1 block in the new snapshot lives at a new location on disk
# 0x1516996a8000, which is 0xe2000 bytes (925696 bytes in decimal) past the L1 block seen in the first snapshot
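# the distance between the two offsets can be checked with shell arithmetic; a short sketch, with variable names of my own choosing

```shell
# L1 offsets copied from the two snapshots' zdb output
first=0x1516995c6000
second=0x1516996a8000

# shell arithmetic understands the 0x prefix
delta=$(( $second - $first ))
delta_hex=$(printf '%x' "$delta")

echo "$delta bytes (0x$delta_hex)"
```

# this prints 925696 bytes (0xe2000)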
# if we look at the file itself, not the snapshot, we see that it's identical to the latest snapshot
❯ sudo zdb -ddddd tank/foobar 2 | sed -n '/Indirect/,$p'
Indirect blocks:
0 L1 2:1516996a8000:2000 20000L/1000P F=3 B=10929381/10929381 cksum=844f3c5dfd:1e7189908ff54:3831289a28780b8:5556f83f0ca17acd
0 L0 EMBEDDED et=0 200L/10P B=10929325
200 L0 EMBEDDED et=0 200L/10P B=10929325
400 L0 EMBEDDED et=0 200L/11P B=10929381
segment [0000000000000000, 0000000000000600) size 1.50K
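# about the sed filter used in these commands: -n suppresses normal printing, and '/Indirect/,$p' prints from the first line matching Indirect through the last line
# here's a sketch on a trimmed sample of the earlier zdb output (the trimming and the variable names are mine)

```shell
# a few lines taken from the first zdb dump, with most of it left out
sample='size 1024
parent 34
Indirect blocks:
0 L0 EMBEDDED et=0 200L/10P B=10929325
segment [0000000000000000, 0000000000000400) size 1K'

# keep everything from the "Indirect blocks:" line to the end
filtered=$(printf '%s\n' "$sample" | sed -n '/Indirect/,$p')

first_line=$(printf '%s\n' "$filtered" | head -n 1)
line_count=$(printf '%s\n' "$filtered" | wc -l | tr -d ' ')

printf '%s\n' "$filtered"
```

# the size and parent lines are dropped, leaving the three lines that start at Indirect blocks: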
# let's remove that newline by deleting the last byte
❯ truncate -s -1 file.txt
❯ file file.txt
file.txt: ISO-8859 text, with very long lines, with no line terminators
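# truncate -s takes a relative size when it has a leading minus, so -s -1 shrinks the file by one byte
# a minimal sketch on a hypothetical scratch file, assuming GNU coreutils truncate as used above

```shell
tmp=$(mktemp)   # hypothetical scratch file
dd if=/dev/zero bs=1 count=1025 2>/dev/null > "$tmp"
before=$(wc -c < "$tmp" | tr -d ' ')

# a leading '-' on the size means "reduce by this many bytes"
truncate -s -1 "$tmp"
after=$(wc -c < "$tmp" | tr -d ' ')

echo "before=$before after=$after"
rm -f "$tmp"
```

# this prints before=1025 after=1024, the same one-byte shrink applied to file.txt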
# now if we look at the file itself, it's changed
❯ sudo zdb -ddddd tank/foobar 2 | sed -n '/Indirect/,$p'
Indirect blocks:
0 L1 2:1516997e0000:2000 20000L/1000P F=2 B=10929495/10929495 cksum=81da6d9399:1df1d46667870:3763ae11a707082:47bdc1527673d2c7
0 L0 EMBEDDED et=0 200L/10P B=10929325
200 L0 EMBEDDED et=0 200L/10P B=10929325
400 L0 0:0:0 200L B=10929495
segment [0000000000000000, 0000000000000400) size 1K
# the L1 pointer is at a new place entirely (Copy On Write), and the third L0 pointer is no longer embedded: it reads 0:0:0 with no physical size
# since the modification left that record empty, ZFS records a hole there instead of storing any data
# if we wrote data instead
❯ echo -n '!' >> file.txt
# the result would look like this
❯ sudo zdb -ddddd tank/foobar 2 | sed -n '/Indirect/,$p'
Indirect blocks:
0 L1 0:14a6b7a68000:2000 20000L/1000P F=3 B=10929558/10929558 cksum=83dd557fd2:1e50cfa1218cc:37e8b291062468f:4ed219ccf78c7aae
0 L0 EMBEDDED et=0 200L/10P B=10929325
200 L0 EMBEDDED et=0 200L/10P B=10929325
400 L0 EMBEDDED et=0 200L/11P B=10929558
segment [0000000000000000, 0000000000000600) size 1.50K
# we'll remove that last byte again
❯ truncate -s -1 file.txt
# and look at the file once more
❯ sudo zdb -ddddd tank/foobar 2 | sed -n '/Indirect/,$p'
Indirect blocks:
0 L1 2:151699916000:2000 20000L/1000P F=2 B=10929568/10929568 cksum=81da6ddc99:1df1d47863170:3763ae352046182:47bdc43da03413c7
0 L0 EMBEDDED et=0 200L/10P B=10929325
200 L0 EMBEDDED et=0 200L/10P B=10929325
400 L0 0:0:0 200L B=10929568
segment [0000000000000000, 0000000000000400) size 1K
# note that the location on disk has again moved, due to CoW, but the layout is the same
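# the B= field is the birth txg, the transaction group in which each pointer was last written, so it shows which records CoW actually touched
# here's a sketch that pulls it out of the L0 lines above with awk (the variable names are mine; the lines are copied from the output)

```shell
# Indirect blocks section copied from the zdb output above
blocks='0 L1 2:151699916000:2000 20000L/1000P F=2 B=10929568/10929568 cksum=81da6ddc99:1df1d47863170:3763ae352046182:47bdc43da03413c7
0 L0 EMBEDDED et=0 200L/10P B=10929325
200 L0 EMBEDDED et=0 200L/10P B=10929325
400 L0 0:0:0 200L B=10929568'

# for every L0 line, print its file offset and the value after "B="
births=$(printf '%s\n' "$blocks" | awk '$2 == "L0" {
    for (i = 3; i <= NF; i++)
        if ($i ~ /^B=/) print $1, substr($i, 3)
}')

printf '%s\n' "$births"
```

# records 0 and 200 still carry txg 10929325 from the original write; only the record at 400 was rewritten in 10929568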
# now, let's delete the first snapshot
❯ zfs destroy tank/foobar@file_created
# and look at the file - it hasn't moved
❯ sudo zdb -ddddd tank/foobar 2 | sed -n '/Indirect/,$p'
Indirect blocks:
0 L1 2:151699916000:2000 20000L/1000P F=2 B=10929568/10929568 cksum=81da6ddc99:1df1d47863170:3763ae352046182:47bdc43da03413c7
0 L0 EMBEDDED et=0 200L/10P B=10929325
200 L0 EMBEDDED et=0 200L/10P B=10929325
400 L0 0:0:0 200L B=10929568
segment [0000000000000000, 0000000000000400) size 1K
# let's roll back to our latest snapshot
❯ sudo zfs rollback tank/foobar@file_modified
# and then view the file again
# it's identical to the snapshot, which makes sense, since we're just pointing it back to that location on disk
❯ sudo zdb -ddddd tank/foobar 2 | sed -n '/Indirect/,$p'
Indirect blocks:
0 L1 2:1516996a8000:2000 20000L/1000P F=3 B=10929381/10929381 cksum=844f3c5dfd:1e7189908ff54:3831289a28780b8:5556f83f0ca17acd
0 L0 EMBEDDED et=0 200L/10P B=10929325
200 L0 EMBEDDED et=0 200L/10P B=10929325
400 L0 EMBEDDED et=0 200L/11P B=10929381
segment [0000000000000000, 0000000000000600) size 1.50K
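# to confirm by machine rather than by eye that the rolled-back file points at the snapshot's block, compare the DVA fields of the two L1 lines
# a sketch using the lines copied from the snapshot and post-rollback output (variable names are mine)

```shell
# L1 line from tank/foobar@file_modified
snapshot_l1='0 L1 2:1516996a8000:2000 20000L/1000P F=3 B=10929381/10929381 cksum=844f3c5dfd:1e7189908ff54:3831289a28780b8:5556f83f0ca17acd'
# L1 line from tank/foobar after the rollback
live_l1='0 L1 2:1516996a8000:2000 20000L/1000P F=3 B=10929381/10929381 cksum=844f3c5dfd:1e7189908ff54:3831289a28780b8:5556f83f0ca17acd'

# the third whitespace-separated field is the DVA (vdev:offset:asize)
snap_dva=$(printf '%s\n' "$snapshot_l1" | awk '{print $3}')
live_dva=$(printf '%s\n' "$live_l1" | awk '{print $3}')

if [ "$snap_dva" = "$live_dva" ]; then
    result=same
else
    result=moved
fi
echo "snapshot=$snap_dva live=$live_dva result=$result"
```

# result=same here: the rollback didn't copy anything, it just made the live dataset reference the snapshot's block again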
# finally, let's again modify the file by removing the last byte
❯ truncate -s -1 file.txt
❯ file file.txt
file.txt: ISO-8859 text, with very long lines, with no line terminators
# then view the file
❯ sudo zdb -ddddd tank/foobar 2 | sed -n '/Indirect/,$p'
Indirect blocks:
0 L1 0:14a6b7d5e000:2000 20000L/1000P F=2 B=10929656/10929656 cksum=81da6e3499:1df1d48e10970:3763ae5fe387982:47bdc7c250c7abc7
0 L0 EMBEDDED et=0 200L/10P B=10929325
200 L0 EMBEDDED et=0 200L/10P B=10929325
400 L0 0:0:0 200L B=10929656
segment [0000000000000000, 0000000000000400) size 1K
# now, let's delete the only remaining snapshot
❯ zfs destroy tank/foobar@file_modified
# and view the file again
# it didn't move: destroying a snapshot only frees blocks that nothing else references, and the live file's blocks are still in use
❯ sudo zdb -ddddd tank/foobar 2 | sed -n '/Indirect/,$p'
Indirect blocks:
0 L1 0:14a6b7d5e000:2000 20000L/1000P F=2 B=10929656/10929656 cksum=81da6e3499:1df1d48e10970:3763ae5fe387982:47bdc7c250c7abc7
0 L0 EMBEDDED et=0 200L/10P B=10929325
200 L0 EMBEDDED et=0 200L/10P B=10929325
400 L0 0:0:0 200L B=10929656
segment [0000000000000000, 0000000000000400) size 1K
alfonsrv commented Apr 1, 2023

Super interesting – thanks!

Also learned a new thing or two about Linux 😼