@psxvoid
Created August 6, 2018 14:41
Delete evicted pods from all namespaces (also ImagePullBackOff and ErrImagePull)
#!/bin/sh
# based on https://gist.github.com/ipedrazas/9c622404fb41f2343a0db85b3821275d
# delete all evicted pods from all namespaces
# (xargs -n 2 runs one "kubectl delete pod <name> --namespace=<ns>" per pod;
# without it, every pod would be looked up in the last namespace on the line)
kubectl get pods --all-namespaces | grep Evicted | awk '{print $2 " --namespace=" $1}' | xargs -n 2 kubectl delete pod
# delete all pods in ImagePullBackOff state from all namespaces
kubectl get pods --all-namespaces | grep 'ImagePullBackOff' | awk '{print $2 " --namespace=" $1}' | xargs -n 2 kubectl delete pod
# delete all pods in ImagePullBackOff, ErrImagePull, or Evicted state from all namespaces
kubectl get pods --all-namespaces | grep -E 'ImagePullBackOff|ErrImagePull|Evicted' | awk '{print $2 " --namespace=" $1}' | xargs -n 2 kubectl delete pod
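
A minimal sketch of the same idea that prints the delete commands instead of running them, so the list can be reviewed first. It assumes the default column layout of `kubectl get pods --all-namespaces` (STATUS in the fourth column) and matches that column exactly rather than grepping the whole line. The function names and the sample rows are made up for illustration.

```shell
#!/bin/sh
# Print one "kubectl delete pod" command per pod whose STATUS column
# (4th column of `kubectl get pods --all-namespaces`) matches exactly.
print_delete_commands() {
  awk 'NR > 1 && ($4 == "Evicted" || $4 == "ImagePullBackOff" || $4 == "ErrImagePull") {
    print "kubectl delete pod " $2 " --namespace=" $1
  }'
}

# Made-up sample output, used here instead of a live cluster.
sample_output() {
  cat <<'EOF'
NAMESPACE   NAME    READY   STATUS             RESTARTS   AGE
default     web-1   0/1     Evicted            0          3d
default     web-2   1/1     Running            0          3d
staging     api-1   0/1     ImagePullBackOff   0          1h
EOF
}

sample_output | print_delete_commands
```

For the sample rows this prints two commands, one for web-1 and one for api-1. Against a real cluster the equivalent would be `kubectl get pods --all-namespaces | print_delete_commands`, piped to `sh` only once the output looks right.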
@binoysankar

Hi,

Thanks for these commands. I have an AKS cluster, and after it had been running for a long time I saw more than 50 pods in the Failed state along with the error "The node was low on resource: [Disk Pressure]". After I deleted all the failed pods, the error went away.
I would like to know how we can automate this script in Kubernetes so that it runs once a month, when the number of Failed pods gets very high, or when I receive the above error. Is there any way to do this?

Looking forward to your reply.

Thanks,
Binoy

@psxvoid
Author

psxvoid commented Aug 22, 2019

Hi,

It may depend on the specifics of your cluster, but here are some options that may give you ideas:

  1. Run a script once per month with a CronJob (see "Running Automated Tasks with a CronJob" in the Kubernetes documentation)
  2. Deploy your own custom service/monitoring pod
  3. Use an external solution. For example, for running jobs on particular events you can try Brigade
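
As a rough sketch of the script such a scheduled job might run, the functions below delete Failed pods only when their number exceeds a threshold. The function names and the `FAILED_POD_THRESHOLD` variable are hypothetical, and the sketch assumes a kubectl version whose `delete` command accepts `--field-selector` together with `--all-namespaces`.

```shell
#!/bin/sh
# Hypothetical threshold; tune it for your cluster.
FAILED_POD_THRESHOLD="${FAILED_POD_THRESHOLD:-50}"

# Count pods in the Failed phase across all namespaces.
# kubectl prints "No resources found" to stderr, so an empty result counts as 0.
count_failed_pods() {
  kubectl get pods --all-namespaces --field-selector=status.phase=Failed --no-headers 2>/dev/null \
    | wc -l | tr -d ' '
}

# Delete Failed pods only when their number exceeds the threshold.
cleanup_if_needed() {
  failed=$(count_failed_pods)
  if [ "$failed" -gt "$FAILED_POD_THRESHOLD" ]; then
    echo "Found $failed failed pods; deleting them"
    kubectl delete pods --all-namespaces --field-selector=status.phase=Failed
  else
    echo "Found $failed failed pods; nothing to do"
  fi
}
```

Calling `cleanup_if_needed` at the end of the script makes it suitable as the command of a CronJob, provided the job's service account is allowed to list and delete pods in every namespace.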

@binoysankar

binoysankar commented Oct 21, 2019

Hi,

Thanks for your reply. I have created a CronJob to delete all the Failed/Evicted pods, scheduled to run every 59 minutes, as shown below.
In the "default" namespace, more and more jobs and pods keep getting created and are not deleted. How can I solve this issue?

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: delete-failed-pods
spec:
  schedule: "*/59 * * * *"
  failedJobsHistoryLimit: 1
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: kubectl-runner
            image: wernight/kubectl
            command: ["sh", "-c", "kubectl get pods --all-namespaces | grep Evicted | awk '{print $2 \" --namespace=\" $1}' | xargs -n 2 kubectl delete pod"]
          restartPolicy: OnFailure

@yeouchien

The command to delete all evicted pods from all namespaces does not work for me on macOS.

I use this instead:

kubectl get pods --all-namespaces | grep Evicted | awk '{print $2 " --namespace=" $1}' | xargs -I '{}' bash -c 'kubectl delete pods {}'

@ntsh999

ntsh999 commented Apr 15, 2020

What should be the command to delete evicted pods from a given namespace?

@igor9silva

@ntsh999 you can adapt the above command from @yeouchien. Change --all-namespaces to -n your_namespace and, because the NAMESPACE column is no longer printed, take the pod name from the first column and pass the namespace explicitly.

E.g.

kubectl get pods -n default | grep Evicted | awk '{print $1}' | xargs -I '{}' bash -c 'kubectl delete pods -n default {}'

@DheerajJoshi

@ntsh999 you can use the command below. It deletes every pod in the Failed phase, which includes Evicted pods.

kubectl get pods --namespace <your_namespace> --field-selector 'status.phase==Failed' -o json | kubectl delete -f -
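
A small sketch that wraps the same field selector in two functions, one to list the Failed pods with their reason (for example Evicted) and one to delete them. The function names are hypothetical, and the sketch assumes kubectl's `custom-columns` output and a `delete` command that accepts `--field-selector`.

```shell
#!/bin/sh
# List pods in the Failed phase in one namespace, with the reason for each.
# Usage: list_failed_pods <namespace>
list_failed_pods() {
  kubectl get pods --namespace "$1" --field-selector=status.phase=Failed \
    -o custom-columns=NAME:.metadata.name,REASON:.status.reason
}

# Delete pods in the Failed phase in one namespace.
# Usage: delete_failed_pods <namespace>
delete_failed_pods() {
  namespace="${1:-}"
  if [ -z "$namespace" ]; then
    echo "usage: delete_failed_pods <namespace>" >&2
    return 1
  fi
  kubectl delete pods --namespace "$namespace" --field-selector=status.phase=Failed
}
```

Running `list_failed_pods default` first shows what `delete_failed_pods default` would remove.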

@MathiasVandePol

MathiasVandePol commented Sep 15, 2021

apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa-cronjob-runner
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cronjob-runner
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - watch
  - list
  - delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cronjob-runner
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cronjob-runner
subjects:
- kind: ServiceAccount
  name: sa-cronjob-runner
  namespace: default
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: delete-failed-pods
  namespace: default
spec:
  concurrencyPolicy: Allow
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      template:
        metadata:
          creationTimestamp: null
        spec:
          serviceAccountName: sa-cronjob-runner
          containers:
          - command:
            - sh
            - -c
            - kubectl get pods --all-namespaces | grep Evicted | awk '{print $2 "
              --namespace=" $1}' | xargs -n 2 kubectl delete pod
            image: wernight/kubectl
            imagePullPolicy: Always
            name: kubectl-runner
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: OnFailure
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
  schedule: '*/59 * * * *'
  successfulJobsHistoryLimit: 1
  suspend: false
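
One way to check a setup like this, sketched under the assumption that the manifest above has been applied unchanged: ask the API server whether the service account may list and delete pods cluster-wide, then start a one-off Job from the CronJob instead of waiting for the schedule. The function names and the Job name `delete-failed-pods-manual` are hypothetical.

```shell
#!/bin/sh
# ServiceAccount from the manifest above, in the form expected by --as.
SERVICE_ACCOUNT="system:serviceaccount:default:sa-cronjob-runner"

# Ask the API server whether the ServiceAccount may perform each verb
# on pods in all namespaces; kubectl answers "yes" or "no".
check_permissions() {
  for verb in list delete; do
    printf '%s pods: ' "$verb"
    kubectl auth can-i "$verb" pods --all-namespaces --as="$SERVICE_ACCOUNT"
  done
}

# Start a one-off Job from the CronJob to test it immediately.
# "delete-failed-pods-manual" is a hypothetical Job name.
run_once() {
  kubectl create job delete-failed-pods-manual \
    --from=cronjob/delete-failed-pods --namespace default
}
```

If `check_permissions` reports "no" for either verb, the job's pods are likely to fail and be retried, which may explain jobs and pods piling up in the default namespace.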

@andreyev

What about this?

kubectl get pods -A --field-selector=status.phase!=Running --template '{{range .items}}kubectl delete pod -n {{.metadata.namespace}} {{.metadata.name}}{{"\n"}}{{end}}' | sh
