How to Replace a Faulty Hard Disk in a Software RAID1 Array

General info.

This guide shows how to remove a failed hard drive from a Linux software RAID1 array and how to add a new hard disk to the array without losing data.

Preliminary Note.

In this example I have two hard drives, /dev/sda and /dev/sdb, with the partitions /dev/sda1 and /dev/sda2 as well as /dev/sdb1 and /dev/sdb2.

/dev/sda1 and /dev/sdb1 make up the RAID1 array /dev/md125
/dev/sda2 and /dev/sdb2 make up the RAID1 array /dev/md126

/dev/sdb has failed, and we want to replace it.

How Do You Know That /dev/sdb Has Failed?

If a disk has failed, you will probably find a lot of error messages in the log files, e.g. /var/log/messages or /var/log/syslog.

You can also check the drive's SMART error log:

smartctl -l error /dev/sdb

In addition, run

cat /proc/mdstat

and instead of the string [UU] you will see [U_] if you have a degraded RAID1 array.
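The [U_] check can also be scripted. The sketch below scans /proc/mdstat-formatted text and prints the name of every array whose status bracket contains an underscore (each `_` is a missing member); the helper name and the optional file argument are my own additions for illustration:

```shell
# List degraded md arrays: an underscore in the [UU]-style status field
# of /proc/mdstat means a member device is missing or failed.
degraded_arrays() {
    awk '/^md/ { md=$1 }
         /\[[U_]+\]/ { if ($NF ~ /_/) print md }' "${1:-/proc/mdstat}"
}
# degraded_arrays            # on the degraded system above, prints: md125
```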

Steps

Removing The Failed Disk /dev/sdb

To remove /dev/sdb, we will mark /dev/sdb1 and /dev/sdb2 as failed and remove them from their respective RAID arrays (/dev/md125 and /dev/md126).

First we mark /dev/sdb1 as failed:

mdadm --manage /dev/md125 --fail /dev/sdb1

Then check the array status:

cat /proc/mdstat

The output should look like this:

Govind:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid5] [raid4] [raid6] [raid10]
md125 : active raid1 sda1[0] sdb1[2](F)
      24418688 blocks [2/1] [U_]
 
md126 : active raid1 sda2[0] sdb2[1]
      24418688 blocks [2/2] [UU]
 
unused devices: <none>

Then we remove /dev/sdb1 from /dev/md125:

mdadm --manage /dev/md125 --remove /dev/sdb1

The output should be like this:

Govind:~# mdadm --manage /dev/md125 --remove /dev/sdb1
mdadm: hot removed /dev/sdb1

And

Govind:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid5] [raid4] [raid6] [raid10]
md125 : active raid1 sda1[0]
      24418688 blocks [2/1] [U_]
 
md126 : active raid1 sda2[0] sdb2[1]
      24418688 blocks [2/2] [UU]
 
unused devices: <none>

Now we do the same steps again for /dev/sdb2 (which is part of /dev/md126):

mdadm --manage /dev/md126 --fail /dev/sdb2

Then check the status:

Govind:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid5] [raid4] [raid6] [raid10]
md125 : active raid1 sda1[0]
      24418688 blocks [2/1] [U_]
 
md126 : active raid1 sda2[0] sdb2[2](F)
      24418688 blocks [2/1] [U_]
 
unused devices: <none>

Then remove /dev/sdb2 from /dev/md126:

mdadm --manage /dev/md126 --remove /dev/sdb2

Govind:~# mdadm --manage /dev/md126 --remove /dev/sdb2
mdadm: hot removed /dev/sdb2

Govind:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid5] [raid4] [raid6] [raid10]
md125 : active raid1 sda1[0]
      24418688 blocks [2/1] [U_]
 
md126 : active raid1 sda2[0]
      24418688 blocks [2/1] [U_]
 
unused devices: <none>
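The fail/remove steps above can be derived from /proc/mdstat instead of typed per partition. The sketch below is deliberately a dry run: it only prints the mdadm commands for every partition of the failed disk so you can review them before pasting the ones you approve; the helper name and the optional file argument are assumptions for illustration:

```shell
# Print (do not run) the mdadm --fail/--remove commands for every partition
# of a failed disk, derived from /proc/mdstat-formatted input.
mdadm_removal_cmds() {   # usage: mdadm_removal_cmds <disk> [mdstat-file]
    awk -v d="$1" '
        /^md/ {
            md=$1
            for (i=5; i<=NF; i++) {          # member fields: sda1[0], sdb1[2](F), ...
                split($i, a, "[")
                if (a[1] ~ "^"d) {
                    printf "mdadm --manage /dev/%s --fail /dev/%s\n", md, a[1]
                    printf "mdadm --manage /dev/%s --remove /dev/%s\n", md, a[1]
                }
            }
        }' "${2:-/proc/mdstat}"
}
# mdadm_removal_cmds sdb     # review the output, then run the commands by hand
```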

Then power down the system: 🔌

shutdown -h now

and 👍 replace the old /dev/sdb hard drive with a new one 💽 (it must be at least as large as the old one; if it is even a few MB smaller, rebuilding the arrays will fail).
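You can verify the size requirement before rebuilding: `blockdev --getsize64` prints a device's size in bytes (run as root). The comparison is wrapped in a small helper here so it can be shown without touching real devices; the helper name is my own, and the device names are the ones from this guide:

```shell
# Succeed only if the new disk is at least as large as the old one.
new_disk_big_enough() {   # usage: new_disk_big_enough <old-bytes> <new-bytes>
    [ "$2" -ge "$1" ]
}
# Real invocation, as root, once the replacement disk is attached:
# new_disk_big_enough "$(blockdev --getsize64 /dev/sda)" \
#                     "$(blockdev --getsize64 /dev/sdb)" \
#     && echo "OK to rebuild" || echo "new disk is too small"
```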


Adding The New Hard Disk Into RAID 1

After you have replaced the hard disk /dev/sdb, boot the system.

The first thing we must do now is to create the exact same partitioning as on /dev/sda. We can do this with one simple command:

sfdisk -d /dev/sda | sfdisk /dev/sdb

Run

fdisk -l

to check that both hard drives now have the same partitioning.

Next we add /dev/sdb1 to /dev/md125:

mdadm --manage /dev/md125 --add /dev/sdb1

The output should look like this:

Govind:~# mdadm --manage /dev/md125 --add /dev/sdb1
mdadm: re-added /dev/sdb1

Now we do the same steps for /dev/sdb2 (which is part of /dev/md126):

mdadm --manage /dev/md126 --add /dev/sdb2

The output should look like this:

Govind:~# mdadm --manage /dev/md126 --add /dev/sdb2
mdadm: re-added /dev/sdb2

Now both arrays (/dev/md125 and /dev/md126) will be synchronized. Check the status:

cat /proc/mdstat
Govind:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid5] [raid4] [raid6] [raid10]
md125 : active raid1 sda1[0] sdb1[1]
      24418688 blocks [2/1] [U_]
      [=>...................]  recovery =  9.9% (2423168/24418688) finish=2.8min speed=127535K/sec
 
md126 : active raid1 sda2[0] sdb2[1]
      24418688 blocks [2/1] [U_]
      [=>...................]  recovery =  6.4% (1572096/24418688) finish=1.9min speed=196512K/sec
 
unused devices: <none>
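While the rebuild runs you can simply re-read the file in a loop (e.g. `watch -n2 cat /proc/mdstat`). If you only want the percentage per array, the sketch below pulls the `recovery = N%` figure out of mdstat-formatted input; the helper name and optional file argument are my own additions:

```shell
# Print "array percent" for every md array currently recovering,
# e.g. "md125 9.9%", by picking the %-terminated field from the
# recovery progress line of /proc/mdstat.
recovery_progress() {
    awk '/^md/ { md=$1 }
         /recovery *=/ {
             for (i=1; i<=NF; i++)
                 if ($i ~ /%$/) print md, $i
         }' "${1:-/proc/mdstat}"
}
# recovery_progress
```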

When the synchronization is finished, the output will look like this:

Govind:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid5] [raid4] [raid6] [raid10]
md125 : active raid1 sda1[0] sdb1[1]
      24418688 blocks [2/2] [UU]
 
md126 : active raid1 sda2[0] sdb2[1]
      24418688 blocks [2/2] [UU]
 
unused devices: <none>

That's it, you have successfully replaced /dev/sdb!
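One step worth adding if the system boots from these disks: the brand-new drive has no boot loader yet, so the machine could fail to boot if /dev/sda dies later. On a BIOS system with GRUB2 you would typically reinstall the boot loader on the new disk as root; the exact invocation can differ per distribution, so the sketch below only prints the command (a dry run) for you to review and run deliberately:

```shell
# Print the boot-loader reinstall step for the new disk (dry run).
# On a BIOS/GRUB2 system, run the printed command as root.
NEW_DISK=/dev/sdb                  # device name used throughout this guide
echo "grub-install $NEW_DISK"
```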

Contact

Created by @Govind0229 - feel free to contact me!
