#!/bin/sh
# based on https://gist.github.com/ipedrazas/9c622404fb41f2343a0db85b3821275d

# delete all evicted pods from all namespaces
kubectl get pods --all-namespaces | grep Evicted | awk '{print $2 " --namespace=" $1}' | xargs kubectl delete pod

# delete all pods stuck in ImagePullBackOff from all namespaces
kubectl get pods --all-namespaces | grep 'ImagePullBackOff' | awk '{print $2 " --namespace=" $1}' | xargs kubectl delete pod

# delete all pods in ImagePullBackOff, ErrImagePull, or Evicted state from all namespaces
kubectl get pods --all-namespaces | grep -E 'ImagePullBackOff|ErrImagePull|Evicted' | awk '{print $2 " --namespace=" $1}' | xargs kubectl delete pod
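As a minimal alternative sketch (not part of the original gist), failed pods can also be selected server-side instead of grepping the human-readable output; note that this catches Evicted pods (whose phase is Failed) but not ImagePullBackOff pods, which usually stay in the Pending phase:

# delete all pods in the Failed phase (including Evicted ones) from all namespaces
kubectl delete pods --all-namespaces --field-selector=status.phase==Failed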
Hi,
It may depend on the specifics of your cluster, but here are some options that may give you ideas:
- Run a script once per month with a CronJob (see the Kubernetes docs page "Running Automated Tasks with a CronJob" and the sketch after this list)
- Deploy your own custom service/monitoring pod
- Use an external solution; for example, for running jobs on particular events you can try Brigade
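As a minimal sketch of the CronJob option (the job name and schedule below are placeholders I am making up; the image is the one used later in this thread), a cleanup CronJob can even be created imperatively:

# create a cleanup CronJob that runs on the 1st of every month at midnight
kubectl create cronjob evicted-pod-cleanup \
  --image=wernight/kubectl \
  --schedule="0 0 1 * *" \
  -- sh -c "kubectl delete pods --all-namespaces --field-selector=status.phase==Failed"

Note that the pod running kubectl still needs a ServiceAccount with permission to list and delete pods, like the RBAC manifest shared further down in this thread.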
Thanks for your reply. I have created a CronJob to delete all the Failed/Evicted pods, running every 59 minutes, as shown below.
In the "default" namespace more and more jobs and pods keep getting created and are not being deleted. How can I solve this issue?
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: delete-failed-pods
spec:
  schedule: "*/59 * * * *"
  failedJobsHistoryLimit: 1
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: kubectl-runner
            image: wernight/kubectl
            command: ["sh", "-c", "kubectl get pods --all-namespaces | grep Evicted | awk '{print $2 \" --namespace=\" $1}' | xargs kubectl delete pod --all"]
          restartPolicy: OnFailure
The "delete all evicted pods from all namespaces" command is not working for me on macOS.
I use this instead:
kubectl get pods --all-namespaces | grep Evicted | awk '{print $2 " --namespace=" $1}' | xargs -I '{}' bash -c 'kubectl delete pods {}'
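As a portable alternative sketch (my addition, not from the thread), the same loop can avoid xargs entirely, which behaves the same with BSD (macOS) and GNU tools:

# read the namespace and pod name columns and delete each Evicted pod
kubectl get pods --all-namespaces | grep Evicted | while read -r ns pod rest; do
  kubectl delete pod "$pod" --namespace="$ns"
done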
What should be the command to delete evicted pods from a given namespace?
@ntsh999 you can use the above command from @yeouchien. Just change --all-namespaces to -n your_namespace.
E.g. (note that without --all-namespaces the pod name is the first column):
kubectl get pods -n default | grep Evicted | awk '{print $1 " --namespace=default"}' | xargs -I '{}' bash -c 'kubectl delete pods {}'
@ntsh999 you can use the command below. This will delete both Evicted and Failed pods, since Evicted pods are in the Failed phase:
kubectl get pods --namespace <your_namespace> --field-selector 'status.phase==Failed' -o json | kubectl delete -f -
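As a small variation (my addition, not part of the original comment), the same pattern extends to every namespace with -A:

# delete all pods in the Failed phase across all namespaces
kubectl get pods -A --field-selector 'status.phase==Failed' -o json | kubectl delete -f -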
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa-cronjob-runner
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cronjob-runner
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - watch
  - list
  - delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cronjob-runner
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cronjob-runner
subjects:
- kind: ServiceAccount
  name: sa-cronjob-runner
  namespace: default
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: delete-failed-pods
  namespace: default
spec:
  concurrencyPolicy: Allow
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      template:
        metadata:
          creationTimestamp: null
        spec:
          serviceAccountName: sa-cronjob-runner
          containers:
          - command:
            - sh
            - -c
            - kubectl get pods --all-namespaces | grep Evicted | awk '{print $2 " --namespace=" $1}' | xargs kubectl delete pod --all
            image: wernight/kubectl
            imagePullPolicy: Always
            name: kubectl-runner
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: OnFailure
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
  schedule: '*/59 * * * *'
  successfulJobsHistoryLimit: 1
  suspend: false
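A minimal way to apply and test the manifest above (assuming it is saved as delete-failed-pods.yaml, a file name I am making up here):

kubectl apply -f delete-failed-pods.yaml
kubectl get cronjob delete-failed-pods -n default
# trigger a one-off run instead of waiting for the schedule
kubectl create job --from=cronjob/delete-failed-pods delete-failed-pods-manual -n default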
What about this?
kubectl get pods -A --field-selector=status.phase!=Running --template '{{range .items}}kubectl delete pod -n {{.metadata.namespace}} {{.metadata.name}}{{"\n"}}{{end}}' | sh
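One caveat worth noting (my observation, not the commenter's): status.phase!=Running also matches Pending and Succeeded (Completed) pods, so dropping the trailing | sh first gives a harmless preview of what would be deleted:

# preview the generated delete commands without executing them
kubectl get pods -A --field-selector=status.phase!=Running --template '{{range .items}}kubectl delete pod -n {{.metadata.namespace}} {{.metadata.name}}{{"\n"}}{{end}}'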
Hi,
Thanks for these commands. I have an AKS cluster, and after running for a long time I saw that more than 50 pods were in the Failed state, with the exception "The node was low on resource: [Disk Pressure]". With the command below I deleted all the failed pods and the above error is gone.
I would like to know how we can automate this script in Kubernetes, so that it runs once every month, or when the number of Failed pods is very high, or when I receive the above error. Is there any way to do this?
Looking forward to your reply.
Thanks,
Binoy