Skip to content

Instantly share code, notes, and snippets.

@patrick0057
patrick0057 / README.md
Last active October 6, 2020 13:23
Restoring RKE cluster with incorrect or missing rkestate file

Overview

When using RKE 0.2.0 and newer, if you have restored a cluster with the incorrect rkestate file you will end up a state where your infrastructure pods will not start. This includes all pods in kube-system, cattle-system and ingress-nginx. As a result of these core pods not starting, all of your workload pods will be unable to function correctly. If you find yourself in this situation you can use the directions below to fix the cluster.

Recovery

  1. Delete all service-account-token secrets in kube-system, cattle-system and ingress-nginx namespaces.
{
kubectl get secret -n cattle-system | awk '{ if ($2 == "kubernetes.io/service-account-token") system("kubectl -n cattle-system delete secret " $1) }'
kubectl get secret -n kube-system | awk '{ if ($2 == "kubernetes.io/service-account-token") system("kubectl -n kube-system delete secret " $1) }'
@superseb
superseb / restore-rkestate-file.md
Last active December 6, 2023 06:31
Recover cluster.rkestate file from controlplane node

Recover cluster.rkestate file from controlplane node

RKE

Run on controlplane node, uses any found hyperkube image

k8s 1.19 and higher

docker run --rm --net=host -v $(docker inspect kubelet --format '{{ range .Mounts }}{{ if eq .Destination "/etc/kubernetes" }}{{ .Source }}{{ end }}{{ end }}')/ssl:/etc/kubernetes/ssl:ro --entrypoint bash $(docker inspect $(docker images -q --filter=label=org.opencontainers.image.source=https://github.com/rancher/hyperkube.git) --format='{{index .RepoTags 0}}' | tail -1) -c 'kubectl --kubeconfig /etc/kubernetes/ssl/kubecfg-kube-node.yaml -n kube-system get configmap full-cluster-state -o json | jq -r .data.\"full-cluster-state\" | jq -r .' > cluster.rkestate