OpenShift v4.x - ImagePrunerDegraded: Job has reached the specified backoff limit

This issue occurs when the Operator of the internal registry, image-registry, goes into the Degraded state.

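According to the bug reports listed in section 5, the same problem also occurred in OCP v4.4.
It appears when the image-registry operator's managementState is Removed and enough time passes for a pruner job to be created. (In OCP v4.6, managementState currently defaults to Removed on platforms that do not provide shareable object storage, such as bare metal.)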

1. Issue

[root@bastion ~]# oc describe clusteroperators image-registry | grep 'ImagePrunerDegraded'
    Message:               ImagePrunerDegraded: Job has reached the specified backoff limit

ImagePrunerDegraded: Job has reached the specified backoff limit

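2. Cause

This occurs when no storage is available for image-registry.
That is, when managementState is Removed, the operator does not create the image-registry pod,
but the Image Pruner still creates a cronjob and runs it, and my understanding is that the failure happens in that run.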
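
The cause described above can be checked on a cluster before changing anything. The following is a minimal sketch, assuming `oc` is logged in with sufficient privileges; the `NS` variable name is only illustrative, and the `command -v` guard simply skips the calls where `oc` is not installed.

```shell
# Namespace used by the image registry operator (taken from the commands in this note).
NS=openshift-image-registry

if command -v oc >/dev/null 2>&1; then
  # Print the operator's managementState; "Removed" means no registry pod is deployed.
  oc get configs.imageregistry.operator.openshift.io cluster \
    -o jsonpath='{.spec.managementState}{"\n"}'

  # List the pruner cronjob and the jobs it has spawned so far.
  oc get cronjob,jobs -n "$NS"

  # List the pods, including any failed pruner pods whose logs may explain the failure.
  oc get pods -n "$NS"
fi
```

If the first command prints `Removed` while failed pruner jobs are listed, the situation matches the cause described here.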

3. Resolution

3.1. Change the Image Pruner state

Change suspend from false to true.

[root@bastion ~]# oc edit imagepruner.imageregistry/cluster
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: imageregistry.operator.openshift.io/v1
kind: ImagePruner
metadata:
  creationTimestamp: "2021-03-12T06:53:03Z"
  generation: 2
  name: cluster
  resourceVersion: "117054"
  selfLink: /apis/imageregistry.operator.openshift.io/v1/imagepruners/cluster
  uid: 83526cbc-642b-4140-9146-921486bf24c6
spec:
  failedJobsHistoryLimit: 3
  ignoreInvalidImageReferences: true
  keepTagRevisions: 3
  logLevel: Normal
  schedule: ""
  successfulJobsHistoryLimit: 3
  suspend: true
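
As a non-interactive alternative to `oc edit`, the same field can likely be set with a merge patch. This is a sketch under the assumption that `oc` is logged in with cluster-admin rights; the `PATCH` variable name is illustrative and not part of the original procedure.

```shell
# Merge patch that sets spec.suspend to true on the ImagePruner resource.
PATCH='{"spec":{"suspend":true}}'

if command -v oc >/dev/null 2>&1; then
  # Apply the patch to the cluster-scoped ImagePruner object named "cluster".
  oc patch imagepruner.imageregistry/cluster --type merge -p "$PATCH"

  # Read the field back to confirm the change was stored.
  oc get imagepruner.imageregistry/cluster -o jsonpath='{.spec.suspend}{"\n"}'
else
  echo "oc not found; patch that would be applied: $PATCH"
fi
```

A patch touches only `spec.suspend`, so the other pruner settings shown in the object above stay as they are.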

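3.2. Delete the cronjob's jobs

Delete the jobs that the image-pruner cronjob has already created. Note that the command below deletes every job in the openshift-image-registry namespace.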

[root@bastion ~]# oc delete jobs --all -n openshift-image-registry

4. Verification

[root@bastion ~]# oc get clusteroperators image-registry
NAME               VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
image-registry     4.6.21    True        False         False      40h

[root@bastion ~]# oc describe clusteroperators image-registry  | grep Message
    Message:               Available: The registry is ready
    Message:               Progressing: The registry is ready

Available: The registry is ready ImagePrunerAvailable: Pruner CronJob has been created
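
For a scripted check, the Degraded condition can be read directly instead of grepping the describe output. This is a minimal sketch, assuming a logged-in `oc`; the `DEGRADED_PATH` variable name and the 300-second timeout are arbitrary choices made for illustration.

```shell
# JSONPath expression selecting the status of the Degraded condition.
DEGRADED_PATH='{.status.conditions[?(@.type=="Degraded")].status}{"\n"}'

if command -v oc >/dev/null 2>&1; then
  # Prints the current status of the Degraded condition (True or False).
  oc get clusteroperator image-registry -o jsonpath="$DEGRADED_PATH"

  # Block until the operator reports Degraded=False, or give up after the timeout.
  oc wait clusteroperator/image-registry --for=condition=Degraded=False --timeout=300s
fi
```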

5. References

[1]: Red Hat Knowledgebase - ImagePrunerDegraded error stalling upgrade
[2]: Red Hat Knowledgebase - Pruner degrades image registry operator if the registry is removed
[3]: Red Hat Knowledgebase - The image-registry operator is in degraded state (no managed image found).
[4]: Bugzilla - Pruner degrades image registry operator if the latter is removed
[5]: Bugzilla - Imagepruner met error "Job has reached the specified backoff limit" which causes image registry degraded
