Skip to content

Instantly share code, notes, and snippets.

@palma21
Forked from cpuguy83/README.md
Last active January 20, 2020 15:41
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save palma21/803c904c7cd128f48822a18904df6bb0 to your computer and use it in GitHub Desktop.
Save palma21/803c904c7cd128f48822a18904df6bb0 to your computer and use it in GitHub Desktop.

This script helps to clean up vmss nodes that may have Azure disks attached to it that should not be due to bugs in Kubernetes. The specific case for this is that a disk has been re-attached to by Kubernetes when it should not have been.

This DOES NOT detect a bad node/disk, only assists in cleaning it up.

Usage

$ ./vmssfix.sh NODE_NAME PV_NAME

Where Node_NAME is the AKS Node name and PV_NAME is the Name of the PV having issues (normally from the logs).

This will trigger a detach of the pvc from the vmss instance in Azure. It will not perform any destructive action without user confirmation. You can run the script from any machine, including your local machine, that has the azure CLI installed and logged in.

#!/bin/bash
set -e -o pipefail
node="$1"
pvc="$2"
usage() {
echo $0 NODE PVC
}
if [ -z "${node}" ]; then
echo "node must not be empty"
usage
exit 1
fi
if [ -z "${pvc}" ]; then
echo "pvc must not be empty"
usage
exit 1
fi
provider_id="$(kubectl get node -o json ${node} | jq -r .spec.providerID | awk -F'azure://' '{ print $2 }')"
instance_id="$(basename ${provider_id})"
if [ -z "${instance_id}" ]; then
echo "could not find vmss instance id, bailing out"
exit 1
fi
rg="$(echo ${provider_id} | awk -F'/resourceGroups/' '{ print $2 }' | awk -F'/' '{ print $1 }')"
if [ -z "${rg}" ]; then
echo "could not find resource group name, bailing out"
exit 1
fi
vmss_name="$(echo ${provider_id} | awk -F'/virtualMachineScaleSets/' '{ print $2 }' | awk -F '/' '{ print $1 }')"
if [ -z "${vmss_name}" ]; then
echo "could not find vmss name, bailing out"
exit 1
fi
disks="$(az vmss show -g ${rg} -n ${vmss_name} --instance-id ${instance_id} -o json | jq -r '.storageProfile.dataDisks[] | "\(.managedDisk.id) \(.lun)"')"
echo
IFS=$'\n' x=($disks)
for i in ${x[@]}; do
if [[ ! "${i}" =~ "${pvc}" ]]; then
continue
fi
lun="$(echo $i | awk '{ print $2}' )"
echo az vmss disk detach --ids ${provider_id} --lun ${lun}
read -r -p "Are you sure? [y/N] " response
case "$response" in
[yY][eE][sS]|[yY])
az vmss disk detach --ids ${provider_id} --lun ${lun}
;;
*)
echo skipping
continue
;;
esac
done
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment