@coresolve
Last active January 12, 2018 08:48
A small shell script that replaces controller-manager and scheduler.

This script can be used to recover kube-controller-manager and kube-scheduler when both of the deployed pods have failed to start up.

It assumes a Kubernetes 1.5 cluster. The script tries to identify a node to place the rescue pod on by selecting on the label master. In 1.6 this label is node-role.kubernetes.io/master, so the lookup finds nothing there.

If you want to use the script with a 1.6 cluster, just provide the nodeName manually.

We have documentation around this process: https://coreos.com/tectonic/docs/latest/troubleshooting/controller-recovery.html

If you want to specify a host to use for the nodeName, pass it as the second argument, for example:

bash rescue-pod.sh kube-scheduler $HOSTNAME
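
For a cluster where you are not sure which master label is in use, the node lookup can be sketched as a small helper that tries both labels mentioned above. This is only an illustration: the function name pick_master is hypothetical and not part of the gist, and it assumes kubectl is already pointed at the right cluster.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original gist.
# Prints the name of the first node matching a master label.
# Tries the 1.6 label first, then falls back to the 1.5 label.
pick_master() {
  local label name
  for label in node-role.kubernetes.io/master master; do
    # An empty result or a kubectl error both mean "no node for this label".
    name=$(kubectl get node -l "$label" \
      -o jsonpath='{.items[0].metadata.name}' 2>/dev/null) || name=""
    if [ -n "$name" ]; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  return 1
}
```

With that function defined, something like bash rescue-pod.sh kube-scheduler "$(pick_master)" would pass the discovered node as the second argument.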

Note that the rescue pods must be deployed onto a node that has flannel running. These are part of the overlay by default, so in some cases you may want to deploy them to a worker.

There is also work being done to get this functionality built into bootkube: kubernetes-retired/bootkube#491

#!/bin/bash
# Usage: rescue-pod.sh [label] [nodeName]
# label defaults to kube-scheduler; nodeName defaults to the first node labelled "master".
label=${1:-"kube-scheduler"}
# On a 1.6 cluster the "master" label matches nothing, so pass nodeName as the second argument.
master0=$(kubectl get node -l master -o jsonpath='{.items[0].metadata.name}')
nodeName=${2:-$master0}
namespace="kube-system"
echo "** Warning: this script assumes a whole bunch of stuff, including that you
have kubectl configured to point at the Kubernetes cluster you want to operate on"
# Take the pod template from the matching deployment and turn it into a
# standalone pod pinned to $nodeName.
kubectl get deploy --namespace="$namespace" -l k8s-app="${label}" -o json --export | \
jq --arg namespace "$namespace" \
--arg name "${label}-rescue" \
--arg node "$nodeName" \
'.items[0].spec.template
| .kind = "Pod"
| .apiVersion = "v1"
| del(.metadata, .spec.nodeSelector)
| .metadata.namespace = $namespace
| .metadata.name = $name
| .spec.containers[0].name = $name
| .spec.nodeName = $node
| .spec.serviceAccount = "default"
| .spec.serviceAccountName = "default"' \
| kubectl convert -f- > "${label}-rescue.yaml"
echo "A file called ${label}-rescue.yaml has been created
If you want to recover the controller-manager type:
$0 kube-controller-manager"
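
The script only writes the manifest and does not create the pod. A minimal sketch of the follow-up steps is below. The helper names deploy_rescue and remove_rescue are hypothetical and not part of the gist; they assume the generated file is in the current directory and that the pod keeps the ${label}-rescue name and kube-system namespace set by the script.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original gist.

# deploy_rescue LABEL: create the pod described by LABEL-rescue.yaml.
# The manifest already carries the kube-system namespace.
deploy_rescue() {
  local label=${1:-kube-scheduler}
  kubectl create -f "${label}-rescue.yaml"
}

# remove_rescue LABEL: delete the rescue pod once the regular pods are healthy again.
remove_rescue() {
  local label=${1:-kube-scheduler}
  kubectl delete pod "${label}-rescue" --namespace=kube-system
}
```

For example, deploy_rescue kube-controller-manager would create the controller-manager rescue pod, and remove_rescue kube-controller-manager would clean it up afterwards.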