vega david ~ $ cat /tmp/mail.msg
From david+ml@madore.org Wed Sep 30 02:53:09 2020
Date: Wed, 30 Sep 2020 02:53:09 +0200
From: David Madore <david+ml@madore.org>
To: Linux Kernel mailing-list <linux-kernel@vger.kernel.org>
Subject: RAID5->RAID6 reshape remains stuck at 0% (does nothing, not even
 start)
Message-ID: <20200930005309.cl5ankdzfe6pxkgq@achernar.gro-tsen.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: NeoMutt/20170113 (1.7.2)
Status: RO
Content-Length: 4689
Lines: 128

Dear list,

I'm trying to reshape a 3-disk RAID5 array to a 4-disk RAID6 array (of
the same total size and per-device size) using Linux kernel 4.9.237 on
x86_64.  I understand that this reshaping operation is supposed to be
supported.  But it appears perpetually stuck at 0% with no operation
taking place whatsoever (the slices are unchanged apart from their
metadata, the backup file contains only zeroes, and nothing happens).
I wonder if this is a known kernel bug, or what else could explain it,
and I have no idea how to debug this sort of thing.

Here are some details on exactly what I've been doing.  I'll be using
loopbacks to illustrate, but I've done this on real partitions and
there was no difference.

## Create some empty loop devices:
for i in 0 1 2 3 ; do dd if=/dev/zero of=test-${i} bs=1024k count=16 ; done
for i in 0 1 2 3 ; do losetup /dev/loop${i} test-${i} ; done

## Make a RAID array out of the first three:
mdadm --create /dev/md/test --level=raid5 --chunk=256 --name=test \
  --metadata=1.0 --raid-devices=3 /dev/loop{0,1,2}

## Populate it with some content, just to see what's going on:
for i in $(seq 0 63) ; do printf "This is chunk %d (0x%x).\n" $i $i \
  | dd of=/dev/md/test bs=256k seek=$i ; done

## Now try to reshape the array from 3-way RAID5 to 4-way RAID6:
mdadm --manage /dev/md/test --add-spare /dev/loop3
mdadm --grow /dev/md/test --level=6 --raid-devices=4 \
  --backup-file=test-reshape.backup

...and then nothing happens.  /proc/mdstat reports no progress
whatsoever:

md112 : active raid6 loop3[4] loop2[3] loop1[1] loop0[0]
      32256 blocks super 1.0 level 6, 256k chunk, algorithm 18 [4/3] [UUU_]
      [>....................]  reshape =  0.0% (1/16128) finish=1.0min speed=244K/sec

The loop file contents are unchanged except for the metadata
superblock, the backup file is entirely empty, and no activity
whatsoever is happening.
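
One way to confirm that the counter really is frozen, rather than just slow, might be to sample the progress line twice and compare. The following is only a sketch: the function names reshape_counter and reshape_is_stalled are made up for illustration, the pattern assumes the progress line has the format quoted above, and only the first reshape line found is considered (so it assumes a single array is reshaping).

```shell
#!/bin/sh
# Print the "done/total" block counter from the first reshape progress
# line found on standard input (text in the format of /proc/mdstat).
reshape_counter() {
    sed -n 's|.*reshape *= *[0-9.]*% *(\([0-9]*/[0-9]*\)).*|\1|p' | head -n 1
}

# Hypothetical helper: sample the counter twice, some seconds apart, and
# report whether it moved.  $1 is the file to read (default /proc/mdstat),
# $2 is the delay in seconds between the two samples (default 10).
reshape_is_stalled() {
    file=${1:-/proc/mdstat}
    delay=${2:-10}
    first=$(reshape_counter < "$file")
    sleep "$delay"
    second=$(reshape_counter < "$file")
    if [ -z "$first" ]; then
        echo "no reshape in progress"
    elif [ "$first" = "$second" ]; then
        echo "stalled at $first"
    else
        echo "progressing: $first -> $second"
    fi
}
```

Calling reshape_is_stalled /proc/mdstat 30 would compare two samples taken 30 seconds apart. An unchanged counter only shows that no blocks were processed in that interval; it says nothing about why.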

Actually, further investigation shows that the array is in fact
operational as a RAID6 array, but one where the Q-syndrome is stuck in
the last device: writing data to the md device (e.g., by repopulating
it with the same command as above) does cause loop3 to be updated as
expected for such a layout.  It's just the reshaping which doesn't
take place (or indeed begin).

For completeness, here's what mdadm --detail /dev/md/test looks like
before the reshape, in my example:

/dev/md/test:
        Version : 1.0
  Creation Time : Wed Sep 30 02:42:30 2020
     Raid Level : raid5
     Array Size : 32256 (31.50 MiB 33.03 MB)
  Used Dev Size : 16128 (15.75 MiB 16.52 MB)
   Raid Devices : 3
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Wed Sep 30 02:44:21 2020
          State : clean
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 256K

           Name : vega.stars:test  (local to host vega.stars)
           UUID : 30f40e34:b9a52ff0:75c8b063:77234832
         Events : 20

    Number   Major   Minor   RaidDevice State
       0       7        0        0      active sync   /dev/loop0
       1       7        1        1      active sync   /dev/loop1
       3       7        2        2      active sync   /dev/loop2

       4       7        3        -      spare   /dev/loop3

- and here's what it looks like after the attempted reshape has
started (or rather, refused to start):

/dev/md/test:
        Version : 1.0
  Creation Time : Wed Sep 30 02:42:30 2020
     Raid Level : raid6
     Array Size : 32256 (31.50 MiB 33.03 MB)
  Used Dev Size : 16128 (15.75 MiB 16.52 MB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Wed Sep 30 02:44:54 2020
          State : clean, degraded, reshaping
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric-6
     Chunk Size : 256K

 Reshape Status : 0% complete
     New Layout : left-symmetric

           Name : vega.stars:test  (local to host vega.stars)
           UUID : 30f40e34:b9a52ff0:75c8b063:77234832
         Events : 22

    Number   Major   Minor   RaidDevice State
       0       7        0        0      active sync   /dev/loop0
       1       7        1        1      active sync   /dev/loop1
       3       7        2        2      active sync   /dev/loop2
       4       7        3        3      spare rebuilding   /dev/loop3

I also tried writing "frozen" and then "resync" to the
/sys/block/md112/md/sync_action file with no further results.
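
Besides sync_action, the same md sysfs directory exposes several other attributes that describe the state of a resync or reshape, and dumping them together may help narrow things down. The following is only a sketch: md_sync_state is a made-up helper name, and the attribute list is taken from the kernel's md documentation, so some entries may be absent on a given kernel (the sketch reports those as not available). sync_max, for instance, is documented as an upper bound on how far a resync or reshape is allowed to proceed.

```shell
#!/bin/sh
# Print selected md sysfs attributes for one array.
# $1 is the array's md directory, e.g. /sys/block/md112/md.
md_sync_state() {
    dir=$1
    for attr in sync_action sync_completed sync_min sync_max \
                sync_speed reshape_position suspend_lo suspend_hi; do
        if [ -r "$dir/$attr" ]; then
            # Attribute exists and is readable: show its current value.
            printf '%s: %s\n' "$attr" "$(cat "$dir/$attr")"
        else
            printf '%s: (not available)\n' "$attr"
        fi
    done
}
```

For the array above, this would be called as md_sync_state /sys/block/md112/md. The function only reads the attributes and never writes to them.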

I welcome any suggestions on how to investigate, work around, or fix
this problem.

Happy hacking,

-- 
     David A. Madore
   ( http://www.madore.org/~david/ )