
@masonmark
Last active November 25, 2023 05:54
zfs_bug_15526_reproducer.sh
#!/bin/bash
# masonmark 2023-11-25
#
# This is my modified reproducer.sh for https://github.com/openzfs/zfs/issues/15526
# Original: https://gist.github.com/tonyhutter/d69f305508ae3b7ff6e9263b22031a84#file-reproducer-sh
#
# The only changes are a fix for the typo in the example command and
# commenting out the initial if block. I did not need zfs_bclone_enabled
# to be set (the original script got a "file not found" on my system),
# which is a physical machine running TrueNAS-SCALE-23.10.0.1 from a
# consumer SATA SSD. The machine is based on my kids' old gaming PC and
# has an Intel Core i9 9900K CPU, 3 NVMe SSDs, 5 SATA SSDs, and
# 7 spinning magnetic disks.
#
# This script reproduced the bug on the first try, and on every try so
# far, within a few seconds, when run from the home directory of the
# admin user, which is located on the default boot ZFS filesystem
# created by the TrueNAS SCALE installer.
#
# Run this script multiple times in parallel inside your pool's mount
# to reproduce https://github.com/openzfs/zfs/issues/15526. Like:
#
# ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & wait
#
#if [ $(cat /sys/module/zfs/parameters/zfs_bclone_enabled) != "1" ] ; then
# echo "please set /sys/module/zfs/parameters/zfs_bclone_enabled = 1"
# exit
#fi
#
# I commented the above out because this TrueNAS SCALE system is unmodified
# from what the installer configured, and I want to keep it that way for
# my testing. However, the bug apparently reproduces regardless of whether
# block cloning is enabled: I could reproduce it within the home directory
# of the admin user, which is on boot-pool, where feature@block_cloning
# is disabled.
#
# root@truenas:/mnt/slow# zpool get all | grep clon
# big        bcloneused             0         -
# big        bclonesaved            0         -
# big        bcloneratio            1.00x     -
# big        feature@block_cloning  enabled   local
# boot-pool  bcloneused             0         -
# boot-pool  bclonesaved            0         -
# boot-pool  bcloneratio            1.00x     -
# boot-pool  feature@block_cloning  disabled  local
# fast       bcloneused             1.71G     -
# fast       bclonesaved            3.41G     -
# fast       bcloneratio            2.99x     -
# fast       feature@block_cloning  active    local
# medium     bcloneused             0         -
# medium     bclonesaved            0         -
# medium     bcloneratio            1.00x     -
# medium     feature@block_cloning  enabled   local
# slow       bcloneused             1.46G     -
# slow       bclonesaved            2.92G     -
# slow       bcloneratio            2.99x     -
# slow       feature@block_cloning  active    local
# root@truenas:/mnt/slow#
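# Each run uses its own file prefix so parallel runs do not collide.
prefix="reproducer_${BASHPID}_"

# Seed file: 1 MiB of random data. Every later file should be identical to it.
dd if=/dev/urandom of="${prefix}0" bs=1M count=1 status=none

echo "writing files"
end=1000
h=0
for i in $(seq 1 2 "$end"); do
    j=$((i + 1))
    cp "${prefix}$h" "${prefix}$i"                   # plain cp, default reflink behavior
    cp --reflink=never "${prefix}$i" "${prefix}$j"   # forced full data copy
    h=$((h + 1))
done

echo "checking files"
for i in $(seq 1 "$end"); do
    diff "${prefix}0" "${prefix}$i"
done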
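
The commented-out check in the script fails with "file not found" when the module parameter file does not exist. A minimal sketch of a more tolerant check follows, assuming only that the parameter lives at the path used in the original check. The function name `bclone_state` is hypothetical and not part of the original reproducer.

```bash
#!/bin/bash
# Hypothetical helper, not part of the original reproducer.
# Prints "enabled", "disabled", or "unavailable" for the given
# parameter file (default: the path used by the original check).
bclone_state() {
    local param="${1:-/sys/module/zfs/parameters/zfs_bclone_enabled}"
    if [ ! -r "$param" ]; then
        # The file is missing or unreadable, as on the system described above.
        echo "unavailable"
    elif [ "$(cat "$param")" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}
```

Calling `bclone_state` with no argument reports the state without aborting, so the reproducer can log it and carry on either way.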

masonmark commented Nov 25, 2023

This is how easy it was to trigger for me. Notes: fast is a single-device fast NVMe SSD, slow is a 6-spinning-disk RAIDZ2, and the home directory of admin is on the boot device (a single-device SATA SSD).

Linux truenas 6.1.55-production+truenas #2 SMP PREEMPT_DYNAMIC Tue Oct 31 16:07:08 UTC 2023 x86_64

        TrueNAS (c) 2009-2023, iXsystems, Inc.
        All rights reserved.
        TrueNAS code is released under the modified BSD license with some
        files copyrighted by (c) iXsystems, Inc.

        For more information, documentation, help or support, go here:
        http://truenas.com

Welcome to TrueNAS
Last login: Sat Nov 25 13:25:20 JST 2023 on pts/3

Warning: the supported mechanisms for making configuration changes
are the TrueNAS WebUI, CLI, and API exclusively. ALL OTHERS ARE
NOT SUPPORTED AND WILL RESULT IN UNDEFINED BEHAVIOR AND MAY
RESULT IN SYSTEM FAILURE.

admin@truenas[~]$ ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & wait
[1] 108017
[2] 108018
[3] 108019
[4] 108020
writing files
writing files
writing files
writing files
checking files
checking files
checking files
checking files
[3]  - done       ./reproducer.sh
[1]    done       ./reproducer.sh
[4]  + done       ./reproducer.sh
[2]  + done       ./reproducer.sh
admin@truenas[~]$ ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & wait
[1] 116081
[2] 116082
[3] 116083
[4] 116084
writing files
writing files
writing files
writing files
checking files
checking files
checking files
checking files
[2]    done       ./reproducer.sh
[4]  + done       ./reproducer.sh
[1]  - done       ./reproducer.sh
[3]  + done       ./reproducer.sh
admin@truenas[~]$ ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & wait
[1] 124218
[2] 124219
[3] 124220
[4] 124221
writing files
writing files
writing files
writing files
checking files
checking files
checking files
checking files
Binary files reproducer_124219_0 and reproducer_124219_670 differ
Binary files reproducer_124218_0 and reproducer_124218_834 differ
[3]  - done       ./reproducer.sh
Binary files reproducer_124219_0 and reproducer_124219_966 differ
[1]    done       ./reproducer.sh
[2]  - done       ./reproducer.sh
[4]  + done       ./reproducer.sh
admin@truenas[~]$ zfs list
zsh: command not found: zfs
admin@truenas[~]$ sudo bash
[sudo] password for admin: 
Sorry, try again.
[sudo] password for admin: 
root@truenas:/home/admin# zfs list
NAME                                                    USED  AVAIL  REFER  MOUNTPOINT
big                                                     648K  7.14T    96K  /mnt/big
boot-pool                                              21.5G   413G    96K  none
boot-pool/ROOT                                         21.5G   413G    96K  none
boot-pool/ROOT/23.10.0.1                               21.5G   413G  21.5G  legacy
boot-pool/ROOT/Initial-Install                            8K   413G  2.33G  /
boot-pool/grub                                         8.22M   413G  8.22M  legacy
fast                                                    339M  3.51T    96K  /mnt/fast
fast/.system                                            334M  3.51T   128K  legacy
fast/.system/configs-ae32c386e13840b2bf9c0083275e7941   700K  3.51T   700K  legacy
fast/.system/cores                                       96K  1024M    96K  legacy
fast/.system/ctdb_shared_vol                             96K  3.51T    96K  legacy
fast/.system/glusterd                                   104K  3.51T   104K  legacy
fast/.system/netdata-ae32c386e13840b2bf9c0083275e7941   333M  3.51T   333M  legacy
fast/.system/rrd-ae32c386e13840b2bf9c0083275e7941        96K  3.51T    96K  legacy
fast/.system/samba4                                     256K  3.51T   256K  legacy
fast/.system/services                                    96K  3.51T    96K  legacy
fast/.system/webui                                       96K  3.51T    96K  legacy
medium                                                  785K   871G   140K  /mnt/medium
slow                                                   2.99T  4.15T   208K  /mnt/slow
slow/stud-backup-time-machine                          2.75T  4.15T  2.75T  /mnt/slow/stud-backup-time-machine
slow/vm-backups                                         245G  4.15T   320K  /mnt/slow/vm-backups
slow/vm-backups/fed                                    43.5G  4.15T  43.5G  /mnt/slow/vm-backups/fed
slow/vm-backups/win                                     202G  4.15T   202G  /mnt/slow/vm-backups/win
root@truenas:/home/admin# cd /mnt/fast/
root@truenas:/mnt/fast# /home/admin/reproducer.sh 
writing files
checking files
root@truenas:/mnt/fast# /home/admin/reproducer.sh & /home/admin/reproducer.sh & /home/admin/reproducer.sh & /home/admin/reproducer.sh & /home/admin/reproducer.sh & /home/admin/reproducer.sh & wait
[1] 134412
[2] 134413
[3] 134414
[4] 134415
[5] 134416
[6] 134417
writing files
writing files
writing files
writing files
writing files
writing files
checking files
checking files
checking files
checking files
checking files
checking files
Binary files reproducer_134414_0 and reproducer_134414_784 differ
[1]   Done                    /home/admin/reproducer.sh
[2]   Done                    /home/admin/reproducer.sh
[3]   Done                    /home/admin/reproducer.sh
[5]-  Done                    /home/admin/reproducer.sh
[6]+  Done                    /home/admin/reproducer.sh
[4]+  Done                    /home/admin/reproducer.sh
root@truenas:/mnt/fast# 
root@truenas:/mnt/fast# 
root@truenas:/mnt/fast# 
root@truenas:/mnt/fast# 
root@truenas:/mnt/fast# 
root@truenas:/mnt/fast# cd /mnt/slow
root@truenas:/mnt/slow# 
root@truenas:/mnt/slow# 
root@truenas:/mnt/slow# 
root@truenas:/mnt/slow# /home/admin/reproducer.sh & /home/admin/reproducer.sh & /home/admin/reproducer.sh & /home/admin/reproducer.sh & /home/admin/reproducer.sh & /home/admin/reproducer.sh & wait
[1] 146493
[2] 146494
[3] 146495
[4] 146496
[5] 146497
[6] 146498
writing files
writing files
writing files
writing files
writing files
writing files
checking files
checking files
checking files
Binary files reproducer_146495_0 and reproducer_146495_720 differ
checking files
checking files
checking files
Binary files reproducer_146498_0 and reproducer_146498_590 differ
[1]   Done                    /home/admin/reproducer.sh
[3]   Done                    /home/admin/reproducer.sh
[2]   Done                    /home/admin/reproducer.sh
[4]   Done                    /home/admin/reproducer.sh
[5]-  Done                    /home/admin/reproducer.sh
[6]+  Done                    /home/admin/reproducer.sh
root@truenas:/mnt/slow# 
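
In the runs above, corruption shows up only as scattered `Binary files ... differ` lines mixed in with the job-control output. A rough sketch of a wrapper that launches several copies and prints just the number of mismatching files might look like this. The function names `count_mismatches` and `run_parallel` are hypothetical, and the pattern assumes the diff message format seen in the log above.

```bash
#!/bin/bash
# Hypothetical helpers, not part of the original gist.

# Counts "Binary files A and B differ" lines read from stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps
# the function's exit status clean while the "0" is still printed.
count_mismatches() {
    grep -c '^Binary files .* differ$' || true
}

# Launches several copies of a reproducer script in parallel, captures
# their combined output, and prints the number of mismatching files.
# $1 = path to the reproducer script, $2 = number of copies (default 4).
run_parallel() {
    local script="$1" copies="${2:-4}" log
    log="$(mktemp)"
    for _ in $(seq 1 "$copies"); do
        "$script" >>"$log" 2>&1 &
    done
    wait
    count_mismatches <"$log"
    rm -f "$log"
}

# Example (run inside the pool's mount point):
#   run_parallel /home/admin/reproducer.sh 6
```

A result of 0 only means that this particular run found no mismatch. As the first two runs in the home directory show, a clean run does not prove the system is unaffected.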
