@JPvRiel
Last active January 7, 2019 09:28
Linux MD software raid inspection

inspect_raid.sh

A bash script that collects some useful commands to check Linux MD software RAID arrays and their component devices for issues, e.g.

  • A RAID component device has SMART warnings or errors.
  • A RAID that mixes HDD and SSD doesn't have the HDD components set to write-mostly, or has the SSD set to write-mostly.

It can also display info. Set the env var SHOW_INFO='y' to show info as well as issues; any other value, e.g. 'n' (the default), reports issues only.
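
The switch relies on bash default-value parameter expansion. A minimal sketch of that pattern follows; `info_enabled` is a hypothetical helper that is not part of inspect_raid.sh, and the sketch uses `:-` (read-only default) where the script uses `:=` (default plus assignment).

```shell
#!/usr/bin/env bash
# Sketch: read an on/off switch from the environment.
# 'info_enabled' is a hypothetical helper, not part of inspect_raid.sh.

# Use the caller's SHOW_INFO if it is set and non-empty, otherwise fall back to 'n'.
show_info=${SHOW_INFO:-n}

info_enabled() {
  # Only the exact value 'y' enables informational output.
  [[ $1 == 'y' ]]
}

if info_enabled "$show_info"; then
  echo "info output enabled"
else
  echo "reporting issues only"
fi
```

Treating everything except 'y' as "off" keeps the check to a single string comparison, at the cost of silently ignoring values such as 'yes' or 'Y'.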

Usage

Needs to run with root privileges.

Only errors:

sudo bash -c "SHOW_INFO='n' ./inspect_raid.sh"

Info and errors:

sudo bash -c "SHOW_INFO='y' ./inspect_raid.sh"
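
The script itself does not verify that it is running as root. If an early, explicit failure is preferred, a guard along these lines could be used; `require_root` is a hypothetical helper and its optional argument exists only to make the check testable.

```shell
#!/usr/bin/env bash
# Sketch: fail early when not running as root.
# 'require_root' is a hypothetical helper, not part of inspect_raid.sh.

# require_root [uid]
# Succeeds when the given uid (default: the effective uid of this shell) is 0.
require_root() {
  local uid=${1:-$EUID}
  if (( uid != 0 )); then
    echo "ERROR: root privileges are required" >&2
    return 1
  fi
}

# Typical use at the top of a script:
#   require_root || exit 1
```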

Write-mostly setting and SSD + HDD mixed array assumptions

When mixing SSD and HDD in a RAID1 array, it's likely beneficial for the SSD to serve the read operations, while the HDD devices are probably better left in write-mostly mode so that they don't slow down reads.

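A rarer exception would be a RAID1 with many HDD members, only one SSD, and a sequential IO workload, where the combined read throughput of the HDDs could exceed that of the single SSD (using the figures below, three HDDs at 200MB/s each already outpace one SSD at 500MB/s, provided reads are spread across members). The general rule assumes that:

  • SSD is >= 2x faster at sequential read (500MB/s vs 200MB/s)
  • SSD is >= 100x faster at random read (~20000 IOPS vs 200 IOPS for 4K random read)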

Another exception might be a non-rotational device like a flash memory card or USB flash device on a slow interface where read speeds can be slow (e.g. 30MB/s sequential read and only 2000 4K random read IOPS). Again, the assumption is that mixing USB flash or micro SD with an HDD in RAID1 is a far out edge case.
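
The write-mostly flag discussed above can be toggled at runtime through the same md sysfs `state` file that the script reads for each member. The following is a sketch only: `set_writemostly` is a hypothetical helper, and its optional sysfs root argument is an assumption added so the function can be tried against a scratch directory before touching a real array.

```shell
#!/usr/bin/env bash
# Sketch: set or clear write-mostly on one RAID member via md sysfs.
# 'set_writemostly' is a hypothetical helper, not part of inspect_raid.sh.

# set_writemostly <md-name> <member> <on|off> [sysfs-root]
#   md-name    e.g. md127
#   member     e.g. sdc3 (the part after 'dev-' in the md sysfs directory)
#   sysfs-root defaults to /sys/block; override it for dry runs
set_writemostly() {
  local md=$1 member=$2 mode=$3 root=${4:-/sys/block}
  local state_file="$root/$md/md/dev-$member/state"
  local keyword

  case $mode in
    on)  keyword='writemostly' ;;
    off) keyword='-writemostly' ;;
    *)   echo "mode must be 'on' or 'off'" >&2; return 2 ;;
  esac

  if [[ ! -w $state_file ]]; then
    echo "cannot write '$state_file'" >&2
    return 1
  fi

  # printf avoids any chance of the leading '-' being read as an option.
  printf '%s\n' "$keyword" > "$state_file"
}

# Example (needs root on a real system):
#   set_writemostly md127 sdc3 on
#   set_writemostly md127 sdb3 off
```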

Example output for checking write-mostly and the array

A RAID1 device used for root was built with 1x SSD and 4x HDD, with the HDDs set to writemostly. In an array that mixes SSD and HDD, the members whose disk reports ROTA (rotational) as 1 should usually be the ones marked writemostly.

INFO: # Inspecting virtual block raid device 'md127'

/dev/md127:
        Version : 1.2
  Creation Time : Tue Feb 21 18:01:21 2017
     Raid Level : raid1
     Array Size : 17809408 (16.98 GiB 18.24 GB)
  Used Dev Size : 17809408 (16.98 GiB 18.24 GB)
   Raid Devices : 5
  Total Devices : 5
    Persistence : Superblock is persistent

    Update Time : Mon Jan  7 10:33:53 2019
          State : clean 
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

           Name : biscuit:os_raid1_hybrid  (local to host biscuit)
           UUID : 8602fd17:d885d3ac:c490e8da:8c56cbbf
         Events : 553

    Number   Major   Minor   RaidDevice State
       0       8       19        0      active sync   /dev/sdb3
       6       8       35        1      active sync writemostly   /dev/sdc3
       2       8       51        2      active sync writemostly   /dev/sdd3
       7       8       67        3      active sync writemostly   /dev/sde3
       5       8       83        4      active sync writemostly   /dev/sdf3

INFO: ## Inspecting 5 members for raid device 'md127'

NAME    TYPE  ROTA SCHED RQ-SIZE TRAN     SIZE VENDOR  MODEL         SERIAL
md127   raid1    1           128           17G                       
├─sdf3  part     1 cfq       128           17G                       
│ └─sdf disk     1 cfq       128 sata     3.7T ATA     ST4000VN008-2 ZDH11XD0
├─sde3  part     1 cfq       128           17G                       
│ └─sde disk     1 cfq       128 sata     2.7T ATA     ST3000DM001-1 W1F3LWTP
├─sdd3  part     1 cfq       128           17G                       
│ └─sdd disk     1 cfq       128 sata     2.7T ATA     ST3000DM001-9 S1F0J0T2
├─sdc3  part     1 cfq       128           17G                       
│ └─sdc disk     1 cfq       128 sata     3.7T ATA     ST4000VN008-2 ZGY03PCV
└─sdb3  part     0 cfq       128           17G                       
  └─sdb disk     0 cfq       128 sata   238.5G ATA     SAMSUNG SSD 8 S0XZNEAC602

INFO: raid member component composition: mixed
INFO: raid member HDD component count: 4/5 (/dev/sdc /dev/sdd /dev/sde /dev/sdf)
INFO: raid member SSD component count: 1/5 (/dev/sdb)
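
The rule that the output above illustrates (ROTA 1 should go with writemostly, ROTA 0 should not) can be sketched as a small check. `check_member` is a hypothetical helper, and the state strings in the example calls are assumed sample values; note that the sysfs state file spells the flag `write_mostly`, whereas mdadm prints `writemostly`.

```shell
#!/usr/bin/env bash
# Sketch: compare a member's rotational flag with its write-mostly state.
# 'check_member' is a hypothetical helper, not part of inspect_raid.sh.

# check_member <name> <rota: 0|1> <sysfs state string>
check_member() {
  local name=$1 rota=$2 state=$3
  local wm=0
  if [[ $state == *write_mostly* ]]; then
    wm=1
  fi

  if (( rota == 1 && wm == 0 )); then
    echo "WARN: $name is rotational but not write-mostly"
  elif (( rota == 0 && wm == 1 )); then
    echo "WARN: $name is non-rotational but write-mostly"
  else
    echo "OK: $name"
  fi
}

# Assumed sample values modelled on the example array:
check_member sdb3 0 "in_sync"
check_member sdc3 1 "in_sync,write_mostly"
```

This check only makes sense for arrays that mix SSD and HDD; in an all-HDD or all-SSD array no member needs the flag.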

Bash scripting features used

This script can provide examples of using bash arrays, functions, terminal colouring, and regular expressions.
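
As an illustration of the regular expression feature, the sketch below splits an md sysfs member entry such as `dev-sdb3` into its partition and disk names using nested capture groups and `BASH_REMATCH`. `split_member` is a hypothetical helper, and the pattern is an assumption that only covers sdX-style names; names such as `nvme0n1p1` are deliberately rejected.

```shell
#!/usr/bin/env bash
# Sketch: nested regex groups with BASH_REMATCH.
# 'split_member' is a hypothetical helper, not part of inspect_raid.sh.

# split_member <entry>
# Prints "<partition> <disk>" for entries like 'dev-sdb3' or 'dev-sdc'.
split_member() {
  local entry=$1
  # Group 1 (outer): disk plus optional partition number.
  # Group 2 (inner): disk name only.
  if [[ $entry =~ ^dev-(([a-z]+)[0-9]*)$ ]]; then
    echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
  else
    return 1
  fi
}

split_member dev-sdb3
split_member dev-sdc
```

Groups are numbered by the position of their opening parenthesis, so the outer group is index 1 and the inner group is index 2.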

Array examples

The script demonstrates (ab)using bash arrays to enumerate raid device components and partitions to inspect which partitions lie on devices with SMART errors or rotational (HDD) properties and check various combinations.

Declare an array for rotational devices:

d_list_rotational=()

Append a device to the array:

d_list_rotational+=("/dev/$d")

Count the number of devices in the array:

d_list_n_rotational=${#d_list_rotational[@]}
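
Put together, the three array operations above can be tried without a real array. The sketch below uses hard-coded sample data in place of sysfs, and `classify_composition` is a hypothetical helper that uses integer comparisons as an alternative to computing a ratio with bc.

```shell
#!/usr/bin/env bash
# Sketch: declare, append to, and count bash arrays, then classify the result.
# 'classify_composition' is a hypothetical helper, not part of inspect_raid.sh.

# classify_composition <total> <rotational-count>
classify_composition() {
  local total=$1 rotational=$2
  if (( total == 0 )); then
    echo unknown
  elif (( rotational == 0 )); then
    echo ssd
  elif (( rotational == total )); then
    echo hdd
  else
    echo mixed
  fi
}

d_list_rotational=()
d_list_nonrotational=()

# Assumed sample data: "<disk>:<rotational flag>" pairs modelled on the example array.
for entry in sdb:0 sdc:1 sdd:1 sde:1 sdf:1; do
  d=${entry%%:*}   # text before the colon
  r=${entry##*:}   # text after the colon
  if [[ $r != 0 ]]; then
    d_list_rotational+=("/dev/$d")
  else
    d_list_nonrotational+=("/dev/$d")
  fi
done

d_list_n_rotational=${#d_list_rotational[@]}
d_list_n=$(( d_list_n_rotational + ${#d_list_nonrotational[@]} ))

echo "HDD: $d_list_n_rotational/$d_list_n (${d_list_rotational[*]})"
echo "composition: $(classify_composition "$d_list_n" "$d_list_n_rotational")"
```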
#!/usr/bin/env bash

# env switches (too lazy to parse parameters)
show_info=${SHOW_INFO:='n'}

# Term colour escape codes
T_DEFAULT='\e[0m'
T_RED_BOLD='\e[1;31m'
T_YELLOW_BOLD='\e[1;33m'
T_BLUE='\e[0;34m'

function report_info() {
  echo -e "${T_BLUE}INFO:${T_DEFAULT} $*"
}

function report_warning() {
  echo -e "${T_YELLOW_BOLD}WARN:${T_DEFAULT} $*" >&2
}

function report_error() {
  echo -e "${T_RED_BOLD}ERROR:${T_DEFAULT} $*" >&2
}

# Inspect RAID and member devices
d_list_all=()
vd_list=()
for vd in /sys/devices/virtual/block/md*/md; do
  if [[ $show_info == 'y' ]]; then
    echo
  fi
  if [[ $vd =~ ^/sys/devices/virtual/block/(md[0-9]+)/md$ ]]; then
    md="${BASH_REMATCH[1]}"
    vd_list+=("/dev/$md")
    if [[ $show_info == 'y' ]]; then
      echo "--------------------------------------------------------------------------------"
      report_info "# Inspecting virtual block raid device '$md'"
      echo
      sudo mdadm --detail "/dev/$md"
    fi
    if [[ -e "$vd/degraded" && $(cat "$vd/degraded") -gt 0 ]]; then
      echo
      report_warning "'$md' is degraded"
    fi
    d_list=()
    p_list=()
    d_list_rotational=()
    d_list_nonrotational=()
    p_list_writemostly=()
    d_list_composition='unknown'
    d_list_n_rotational=0
    for f in "$vd"/dev-*; do
      # [0-9]* also accepts multi-digit partition numbers (e.g. sda10);
      # names with embedded digits such as nvme0n1p1 are still not matched
      if [[ $f =~ .*/dev-(([^/0-9]+)[0-9]*)$ ]]; then
        # 1st match is the outer regex group (device + partition number)
        p=${BASH_REMATCH[1]}
        p_list+=("/dev/$p")
        # 2nd match is the inner regex group (just device)
        d=${BASH_REMATCH[2]}
        d_list+=("/dev/$d")
        # append new devices to overall device list (whole-word match, so sda != sdaa)
        if ! [[ " ${d_list_all[*]} " == *" /dev/$d "* ]]; then
          d_list_all+=("/dev/$d")
        fi
        r=$(cat "/sys/block/$d/queue/rotational")
        if [[ $r != 0 ]]; then
          d_list_rotational+=("/dev/$d")
        else
          d_list_nonrotational+=("/dev/$d")
        fi
        s=$(cat "$f/state")
        if [[ $s == *write_mostly* ]]; then
          p_list_writemostly+=("/dev/$p")
        fi
      fi
    done
    d_list_n=${#d_list[@]}
    p_list_n=${#p_list[@]}
    d_list_n_rotational=${#d_list_rotational[@]}
    d_list_n_nonrotational=${#d_list_nonrotational[@]}
    # show devices
    if [[ $show_info == 'y' ]]; then
      echo
      report_info "## Inspecting $p_list_n members for raid device '$md'"
      echo
      lsblk -s -o NAME,TYPE,ROTA,SCHED,RQ-SIZE,TRAN,SIZE,VENDOR,MODEL,SERIAL "/dev/$md"
    fi
    # compare SSD vs HDD composition
    # bc -l prints 0 for 0/n, 1.000... for n/n and .xxx... for anything in between
    d_list_n_rotational_ratio=$(bc -l <<< "$d_list_n_rotational / $d_list_n")
    case $d_list_n_rotational_ratio in
      0)
        d_list_composition='ssd'
        ;;
      1.0*)
        d_list_composition='hdd'
        ;;
      .*)
        d_list_composition='mixed'
        ;;
    esac
    #echo "d_list_n=$d_list_n, d_list_n_rotational=$d_list_n_rotational, d_list_n_rotational_ratio=$d_list_n_rotational_ratio, d_list_composition=$d_list_composition"
    if [[ $show_info == 'y' ]]; then
      echo
      report_info "raid member component composition: $d_list_composition"
    fi
    # If the array has both SSD and HDD devices, warn when HDD devices are not writemostly
    if [[ "$d_list_composition" == 'mixed' ]]; then
      if [[ $show_info == 'y' ]]; then
        report_info "raid member HDD component count: $d_list_n_rotational/$p_list_n (${d_list_rotational[*]})"
        report_info "raid member SSD component count: $d_list_n_nonrotational/$p_list_n (${d_list_nonrotational[*]})"
      fi
      # HDD not writemostly? (substring match: /dev/sdc matches partition /dev/sdc3)
      for d in "${d_list_rotational[@]}"; do
        if ! [[ ${p_list_writemostly[*]} == *"$d"* ]]; then
          echo
          report_warning "'$d' HDD (rotational) IS NOT set to writemostly"
          echo "'$d' is used by '$md' which mixes SSD and HDD"
          echo "HDD devices: ${d_list_rotational[*]}"
          echo "devices/partitions set to writemostly were: ${p_list_writemostly[*]}"
        fi
      done
      # SSD writemostly?
      for d in "${d_list_nonrotational[@]}"; do
        if [[ ${p_list_writemostly[*]} == *"$d"* ]]; then
          echo
          report_warning "'$d' SSD (non rotational) IS set to writemostly"
          echo "'$d' is used by '$md' which mixes SSD and HDD"
          echo "SSD devices: ${d_list_nonrotational[*]}"
          echo "devices/partitions set to writemostly were: ${p_list_writemostly[*]}"
        fi
      done
    fi
  else
    echo
    report_error "'$vd' did not match expected regex"
  fi
done

# look for smart errors
echo
for d in "${d_list_all[@]}"; do
  if [[ $show_info == 'y' ]]; then
    echo "--------------------------------------------------------------------------------"
    report_info "# smart errors and attribute check for raid member device $d"
    echo
    sudo smartctl --health --attributes --log=selftest --format=brief "$d"
  fi
  if ! sudo smartctl --health --attributes --log=selftest --quietmode=silent "$d"; then
    echo
    report_warning "'$d' is a member of a raid virtual block device and has one or more SMART errors"
    echo
    sudo smartctl --health --attributes --log=selftest --quietmode=errorsonly --format=brief "$d"
  fi
done

exit

# set writemostly for devices that are rotational (assumes root)
#for f in /sys/block/md127/md/dev-sd{c,d,e,f}3/state; do
#  echo writemostly > "$f"
#done

# remove writemostly for SSD
#echo -writemostly > /sys/block/md127/md/dev-sdb3/state