@ctrahey
Last active March 22, 2021 18:44
Cleanup Ceph Disks from cluster.yaml

Cleanup Ceph Disks

This script is designed to work with yq to take arguments directly from a CephCluster CRD (cluster.yaml) and zap all matching disks on the hosts.

Caveats/Assumptions:

Danger: This is (currently) a wildly destructive script.

  1. Requires named hosts with specific config
  2. Assumes each node sets devicePathFilter
  3. Assumes the devicePathFilter regex treats /dev/disk/by-path as a base
  4. !!DANGER!! Assumes that your filters DO NOT select ANY disks that aren't for Ceph.

To expand on #4: you may be relying on the Ceph (or Rook?) behavior that safely skips disks with filesystems on them, which lets you use a broader device filter in your Rook cluster.yaml. If you rely on this, the script will RUIN your system and DESTROY YOUR DATA. If you don't 100% follow what I'm saying, DO NOT use this script. @todo: Check for Ceph remnants on disk before executing the cleanup.
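The @todo above could be approached along these lines. This is only a sketch, not part of the script: `has_ceph_remnants` is a hypothetical helper, and it assumes that the whole-disk signature (for example from `lsblk -dno FSTYPE "$DISK"`) reads `ceph_bluestore` for raw-mode OSDs, and that LVM-based OSDs live in volume groups named `ceph-*` (for example from `pvs --noheadings -o vg_name "$DISK"`, with whitespace trimmed). Both assumptions should be verified on your own hosts.

```shell
# Hypothetical helper (not in the original script): succeed only when the
# given signature looks like a Ceph leftover.
#   $1 = filesystem type of the whole disk (assumed: lsblk -dno FSTYPE)
#   $2 = LVM volume group name, if any (assumed: pvs --noheadings -o vg_name)
has_ceph_remnants() {
  fstype=$1
  vgname=$2
  case "$fstype" in
    ceph_bluestore)
      return 0 ;;                # assumed signature of a raw-mode OSD
    LVM2_member)
      case "$vgname" in
        ceph-*) return 0 ;;      # assumed naming of Ceph-created VGs
        *)      return 1 ;;      # some other LVM disk: leave it alone
      esac ;;
    *)
      return 1 ;;                # ext4, xfs, empty, unknown: not Ceph
  esac
}
```

A guard such as `has_ceph_remnants "$FSTYPE" "$VG" || continue` placed before the zap step would then skip anything that does not look like Ceph. Note that it inspects only the whole-disk signature, not partitions.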

These assumptions are not strictly necessary; they just fit my environment when I needed this script.

This is what my cluster.yaml spec.storage.nodes looks like, for reference:

    nodes:
    - name: "dl380p-g8-01"
      devicePathFilter: "pci-0000:02:00.0-sas-|nvme-1"
    - name: "dl380p-g8-02"
      devicePathFilter: "pci-0000:0a:00.0-sas-exp0x500a098000d7223f-|nvme-1"
    - name: "dl380p-g8-03"
      devicePathFilter: "pci-0000:02:00.0-sas-0x3001438025a76544-lun-|nvme-1"
    - name: "dl380p-g8-04"
      devicePathFilter: "pci-0000:02:00.0-sas-0x5"
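
Before wiping anything, it may help to see which by-path entries a given devicePathFilter would select. The sketch below is a dry run only: `list_matches` is a hypothetical helper, and it uses `grep -E` instead of the script's `[[ =~ ]]` so that it also runs under plain sh. Both treat the filter as an unanchored extended regex applied to the full path.

```shell
# Hypothetical helper (not in the original script): print the entries in a
# directory that a devicePathFilter would select. Nothing is modified.
#   $1 = directory to scan (normally /dev/disk/by-path)
#   $2 = devicePathFilter regex
list_matches() {
  for f in "$1"/*; do
    # Unanchored extended-regex match against the full path
    if printf '%s\n' "$f" | grep -Eq -- "$2"; then
      printf '%s\n' "$f"
    fi
  done
}
```

For example, `list_matches /dev/disk/by-path 'pci-0000:02:00.0-sas-|nvme-1'` run on the target host lists the candidates. Partition entries that match the pattern are listed too, so check the output carefully.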

Usage

Base script:

cleanup.sh <hostname> <devicePathFilter>

With yq, to pull arguments directly from cluster.yaml:

cat cluster.yaml | yq e '.spec.storage.nodes[]| .name + " " + .devicePathFilter' - | xargs -n2 ./cleanup.sh
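
Since the pipeline above runs the cleanup on every listed host at once, a preview step can be useful first. The following is a minimal sketch with a hypothetical helper, `preview_cleanup`, which reads the same "<hostname> <devicePathFilter>" lines and only prints what would be run.

```shell
# Hypothetical helper (not in the original gist): read
# "<hostname> <devicePathFilter>" lines on stdin and print the command that
# would be run for each one, without connecting to any host.
preview_cleanup() {
  while read -r host filter; do
    # Skip blank lines
    [ -n "$host" ] || continue
    printf 'would run: ./cleanup.sh %s "%s"\n' "$host" "$filter"
  done
}
```

To use it, replace the final `xargs -n2 ./cleanup.sh` stage of the pipeline with `preview_cleanup` and review the output before running the real command.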

#!/usr/bin/env bash
HOST=$1
# Quote the pattern, as it may include e.g. pipe | chars
QUOTED_PATTERN=$(cat <<EOF
'$2'
EOF
)
echo "Connecting to $HOST with pattern $2"
ssh -o ConnectTimeout=2 $HOST sudo PATTERN=$QUOTED_PATTERN bash <<'EOF'
echo "Running on $(hostname) with $PATTERN"
for f in /dev/disk/by-path/*
do
  # Only act on by-path entries that match the filter
  if [[ $f =~ $PATTERN ]]; then
    DISK=$(realpath "$f")
    # Strip the /dev/ prefix to get the kernel device name, e.g. sda
    DEVNAME="${DISK#/dev/}"
    ROTATING=$(cat "/sys/block/${DEVNAME}/queue/rotational")
    echo "Running cleaning operations on $DISK"
    sgdisk --zap-all "$DISK"
    if [[ $ROTATING -eq 1 ]]; then
      # Rotational disk: zero the first 100 MiB
      dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
    else
      # Non-rotational disk: discard all blocks
      blkdiscard "$DISK"
    fi
    echo "Done"
  fi
done
echo "Running host level cleaning"
ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
rm -rf /dev/ceph-*
EOF
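
The script wraps the pattern in single quotes with a heredoc so that characters like | reach the remote shell intact. That approach would not survive a pattern that itself contains a single quote. A minimal sketch of a sturdier alternative follows; `shell_quote` is a hypothetical helper, not part of the original script.

```shell
# Hypothetical helper (not in the original script): wrap a string in single
# quotes for a remote shell, rewriting each embedded ' as '\''.
shell_quote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}
```

It could replace the heredoc like this: `ssh "$HOST" sudo PATTERN="$(shell_quote "$2")" bash`, with the remote script still supplied on stdin as before.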