Preparation
truncate -s20G d1.img
truncate -s20G d2.img
truncate -s20G d3.img
truncate -s20G d4.img
set ld1 (sudo losetup --show --find d1.img)
set ld2 (sudo losetup --show --find d2.img)
set ld3 (sudo losetup --show --find d3.img)
set ld4 (sudo losetup --show --find d4.img)
sudo mkfs.btrfs -d single -m raid1c3 "$ld1" "$ld2" "$ld3" "$ld4"
sudo mkdir -p /mnt/loop
sudo mount "$ld1" /mnt/loop
sudo dd if=/dev/zero of=/mnt/loop/file bs=1M count=500
I also copied a bunch of videos:
sudo cp -r ~/d/70_Now_Watching/ /mnt/loop/
First, I check the distribution of data:
sudo btrfs device usage /mnt/loop
/dev/loop0, ID: 1
   Device size:            20.00GiB
   Device slack:              0.00B
   Data,single:            19.00GiB
   Unallocated:             1.00GiB

/dev/loop1, ID: 2
   Device size:            20.00GiB
   Device slack:              0.00B
   Data,single:            18.00GiB
   Metadata,RAID1C3:        1.00GiB
   System,RAID1C3:          8.00MiB
   Unallocated:          1016.00MiB

/dev/loop2, ID: 3
   Device size:            20.00GiB
   Device slack:              0.00B
   Data,single:            18.00GiB
   Metadata,RAID1C3:        1.00GiB
   System,RAID1C3:          8.00MiB
   Unallocated:          1016.00MiB

/dev/loop3, ID: 4
   Device size:            20.00GiB
   Device slack:              0.00B
   Data,single:            18.00GiB
   Metadata,RAID1C3:        1.00GiB
   System,RAID1C3:          8.00MiB
   Unallocated:          1016.00MiB
Seems to be distributed pretty evenly. Now let's fuck shit up!!
sudo dd if=/dev/random of="$ld3"
dd: writing to '/dev/loop2': No space left on device
41943041+0 records in
41943040+0 records out
21474836480 bytes (21 GB, 20 GiB) copied, 101.421 s, 212 MB/s
sudo btrfs scrub start /mnt/loop/
ERROR: there are uncorrectable errors
UUID: b4ade67a-8c7b-45c3-b747-8280d9504714
Scrub started: Sat Jan 21 23:16:09 2023
Status: finished
Duration: 0:00:25
Total to scrub: 72.80GiB
Rate: 2.91GiB/s
Error summary: super=2 csum=4698471
Corrected: 5662
Uncorrectable: 4692809
Unverified: 0
Now we have a little script to check our files:
from pathlib import Path

error_count = 0
success_count = 0
for file_path in Path('/mnt/loop/').rglob('*'):
    try:
        if file_path.is_file():
            with open(file_path, 'rb') as f:
                # read the entire contents of the file
                file_contents = f.read()
            success_count += 1
    except IOError:
        error_count += 1

print(f'Number of successful reads: {success_count}')
print(f'Number of IO errors: {error_count}')
And the results are... drumroll please......... no? ok fine.
Number of successful reads: 119
Number of IO errors: 66
Interesting, but I wonder if there is any variation in the size of those files, or if we could simulate heavy file extent fragmentation.
Successful read files size: min 0 average 241136404 max 2397645276 sum 28695232117
IO error files size: min 1888612 average 745390515 max 4884066696 sum 49195774012
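The size breakdown above can be produced with a small extension of the read-check script. Here is a sketch of that approach (my reconstruction, not necessarily the exact script used): the same directory walk, but each file's size is bucketed by whether a full read succeeds.

```python
from pathlib import Path

def summarize(label, sizes):
    """Format min/average/max/sum for a list of byte sizes."""
    if not sizes:
        return None
    return (f'{label}: min {min(sizes)} average {sum(sizes) // len(sizes)} '
            f'max {max(sizes)} sum {sum(sizes)}')

ok_sizes, err_sizes = [], []
for file_path in Path('/mnt/loop/').rglob('*'):
    if not file_path.is_file():
        continue
    size = file_path.stat().st_size  # size comes from metadata, which is RAID1C3
    try:
        with open(file_path, 'rb') as f:
            f.read()  # force every data extent to actually be read
        ok_sizes.append(size)
    except OSError:
        err_sizes.append(size)

for line in (summarize('Successful read files size', ok_sizes),
             summarize('IO error files size', err_sizes)):
    if line:
        print(line)
```

Note that stat() should still succeed even for unreadable files, since metadata survives on the intact RAID1C3 copies; only the full read trips over the destroyed single-profile data extents.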
Interesting... maybe I need to run a bigger test, but being able to read a 2.4 GB file is probably somewhat surprising to the person who said only KBs would be accessible. That said, only 37% of the data was still accessible, and this is a very small, contrived simulation with only one process writing data.
It does seem like smaller files are going to be more likely to survive, but that can be expected. The gods of bits only make guillotines so big.
This result does make me feel a little bit better, though I will still investigate MergerFS a little bit more. I really like btrfs, and switching to MergerFS seems like a lot of work...
A script to simulate this test using the output of btrfs inspect-internal dump-tree --extents
might be interesting. I wonder how much file extent fragmentation my drives actually have.
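A rough sketch of what parsing that output could look like. Caveat: the dump-tree text format is not a stable interface and varies between btrfs-progs versions, so the sample text and regexes below are assumptions; inline, hole, and prealloc extents would also need handling in a real version.

```python
import re

# Assumed shape of `btrfs inspect-internal dump-tree --extents` output for
# regular file extents (sample text, not captured from a real run).
SAMPLE = """\
\titem 6 key (257 EXTENT_DATA 0) itemoff 15811 itemsize 53
\t\tgeneration 7 type 1 (regular)
\t\textent data disk byte 13631488 nr 134217728
\titem 7 key (257 EXTENT_DATA 134217728) itemoff 15758 itemsize 53
\t\tgeneration 7 type 1 (regular)
\t\textent data disk byte 147849216 nr 8192
"""

def parse_extents(dump):
    """Yield (inode, file_offset, disk_byte, length) per regular extent."""
    inode = offset = None
    for line in dump.splitlines():
        m = re.search(r'key \((\d+) EXTENT_DATA (\d+)\)', line)
        if m:
            inode, offset = int(m.group(1)), int(m.group(2))
            continue
        m = re.search(r'extent data disk byte (\d+) nr (\d+)', line)
        if m and inode is not None:
            yield inode, offset, int(m.group(1)), int(m.group(2))

extents = list(parse_extents(SAMPLE))
```

From there, a simulation would map each extent's disk byte range through the chunk tree to a device, mark the extents on the "failed" device as lost, and count which files still have all of their extents intact.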
sudo umount /mnt/loop
sudo losetup -d "$ld1" "$ld2" "$ld3" "$ld4"
rm d1.img d2.img d3.img d4.img
uname -a
Linux pakon 6.1.6-200.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Jan 14 16:55:06 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
This is very good advice. I did the same preparation; here is the distribution of files before the degraded state:
Then I unmounted the fs, deleted disk 2, ran echo 3 > /proc/sys/vm/drop_caches, and remounted the fs.
I am surprised that mounting worked without error, but I guess the device is still active via losetup. I'm assuming this would be similar to an actual disk failure, though: if the device weren't there, btrfs would probably complain and ask to be mounted with the
-o degraded
flag. There was nothing exciting in dmesg.
Oohh weird...
Okay, turns out the deleted file is still connected to the loopback device.
Now we get some interesting stuff in dmesg
But we can still mount it as read-only
And the results are
In this test, about 26% of the data is still fully readable (21798190683 / (21798190683 + 60850112364)).
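Sanity-checking that figure from the raw byte counts:

```python
# Readable vs. unreadable bytes from the degraded-mount run above.
readable_bytes = 21_798_190_683
unreadable_bytes = 60_850_112_364
fraction = readable_bytes / (readable_bytes + unreadable_bytes)
print(f'{fraction:.0%}')  # prints "26%"
```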
I also tried another variant of the experiment where I did all of the above but ran this command before removing the disk:
and the results are not much better... in fact they are worse: 20% lol