This article describes how to back up and restore etcd in an OpenShift v3.11 environment, in case the cluster becomes unhealthy or object resources are damaged by a user.
Note that OpenShift v3.11 has reached End of Life (EoL) and End of Service (EoS), so Red Hat no longer supports it.
etcd is the key/value database in which all state used by Kubernetes is stored.
There are two main ways to back up etcd.
On OpenShift v3.11, a snapshot backup can be taken on the Control Plane (master nodes) with the etcdctl command.
etcdctl becomes available after installing the etcd package;
run the command on each master node, specifying the path and file name for the backup file.
[root@master ~]# yum install etcd
Because etcd runs as a static pod container on each master, the snapshot destination must exist inside that container;
it is therefore best to back up under a subdirectory of /var/lib/etcd/, which is mounted from the host.
[root@master ~]# cat /etc/origin/node/pods/etcd.yaml | sed -n '41,65p'
    name: etcd
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /etc/etcd/
      name: master-config
      readOnly: true
    - mountPath: /var/lib/etcd/
      name: master-data
    - mountPath: /etc/localtime
      name: host-localtime
    workingDir: /var/lib/etcd
  hostNetwork: true
  priorityClassName: system-node-critical
  restartPolicy: Always
  volumes:
  - hostPath:
      path: /etc/etcd/
    name: master-config
  - hostPath:
      path: /var/lib/etcd
    name: master-data
  - hostPath:
      path: /etc/localtime
    name: host-localtime
[root@master ~]# mkdir -p /var/lib/etcd/backup/
[root@master ~]# chown etcd:etcd -R /var/lib/etcd/backup/
[root@master ~]# restorecon -Rv /var/lib/etcd/backup/
[root@master ~]# etcdctl3 snapshot save /var/lib/etcd/backup/snapshot-$(date +%Y-%m-%d).db
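The backup command above can be wrapped in a small script that also verifies the snapshot after saving it. This is a hedged sketch: the `snapshot status` check is a suggestion, not part of the original procedure, and `etcdctl3` is assumed to be the etcd v3 alias used in the transcript above.

```shell
#!/bin/bash
# Build the date-stamped snapshot path used throughout this guide.
snapshot_path() {
  local root=$1
  echo "${root}/snapshot-$(date +%Y-%m-%d).db"
}

# Save a snapshot, then verify it (prints hash, revision, key count, size).
# Requires the etcd package / etcdctl3 alias from the steps above.
backup_etcd() {
  local snap
  snap=$(snapshot_path /var/lib/etcd/backup)
  etcdctl3 snapshot save "$snap" \
    && ETCDCTL_API=3 etcdctl snapshot status "$snap" --write-out=table
}
```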
From the bastion, copy the backed-up snapshot file down. (Copy the backup file from only one of the master01/02/03 nodes.)
[root@bastion ~]# scp root@master01:/var/lib/etcd/backup/snapshot-*.db /opt/
This approach runs the steps from "1.1. Command Backup" above automatically with a Kubernetes CronJob.
It consists of four object resources created in the kube-system namespace.
Before configuring them, create the directory that will hold the etcd snapshots on every master node.
[root@master ~]# mkdir -p /var/lib/etcd/backup/
[root@master ~]# chown etcd:etcd -R /var/lib/etcd/backup/
[root@master ~]# restorecon -Rv /var/lib/etcd/backup/
Create the Service Account that will run the CronJob.
[root@bastion ~]# vi 00_service-account.yaml
kind: ServiceAccount
apiVersion: v1
metadata:
  name: cluster-backup
  namespace: kube-system
  labels:
    cluster-backup: "true"
[root@bastion ~]# oc create -f 00_service-account.yaml
Grant cluster permissions to the Service Account.
[root@bastion ~]# vi 01_cluster-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-backup
rules:
- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - '*'
- nonResourceURLs:
  - '*'
  verbs:
  - '*'
[root@bastion ~]# oc create -f 01_cluster-role.yaml
Create a ClusterRoleBinding so the ClusterRole applies to the Service Account.
[root@bastion ~]# vi 02_cluster-role-binding.yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: cluster-backup
  labels:
    cluster-backup: "true"
subjects:
- kind: ServiceAccount
  name: cluster-backup
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-backup
[root@bastion ~]# oc create -f 02_cluster-role-binding.yaml
Every Sunday at 00:30, the job creates a directory named after the run date, performs the etcd backup, and keeps only the last 7 days of backup directories.
[root@bastion ~]# vi 03_cronjobs-etcd-backup.yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  # Sunday, 00:30
  schedule: "30 0 * * 0"
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 5
  concurrencyPolicy: Forbid
  suspend: false
  jobTemplate:
    metadata:
      creationTimestamp: null
      labels:
        etcd-backup: "true"
    spec:
      backoffLimit: 0
      template:
        metadata:
          creationTimestamp: null
          labels:
            etcd-backup: "true"
        spec:
          containers:
          - name: etcd-backup
            args:
            - "-c"
            - oc get pod -n kube-system -o name | cut -d '/' -f '2' | grep 'master-etcd' | xargs -I {} -- oc exec {} -n kube-system -c etcd -- bash -c "mkdir -p /var/lib/etcd/backup/$(date +%Y-%m-%d)/ && ETCDCTL_API=3 etcdctl --cert /etc/etcd/peer.crt --key /etc/etcd/peer.key --cacert /etc/etcd/ca.crt --endpoints https://$(oc get node -l node-role.kubernetes.io/master --no-headers -o name | cut -d '/' -f '2' | sed -n 1p):2379,https://$(oc get node -l node-role.kubernetes.io/master --no-headers -o name | cut -d '/' -f '2' | sed -n 2p):2379,https://$(oc get node -l node-role.kubernetes.io/master --no-headers -o name | cut -d '/' -f '2' | sed -n 3p):2379 snapshot save /var/lib/etcd/backup/$(date +%Y-%m-%d)/snapshot.db && find /var/lib/etcd/backup/ -type d -ctime +'7' -delete"
            command:
            - "/bin/bash"
            image: "registry.ocp3.local:5000/openshift3/ose-cli:v3.11"
            imagePullPolicy: IfNotPresent
            resources:
              requests:
                cpu: 100m
                memory: 256Mi
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: FallbackToLogsOnError
            securityContext:
              privileged: true
              runAsUser: 0
          tolerations:
          - operator: Exists
          nodeSelector:
            node-role.kubernetes.io/master: 'true'
          dnsPolicy: ClusterFirst
          restartPolicy: Never
          schedulerName: default-scheduler
          serviceAccount: cluster-backup
          serviceAccountName: cluster-backup
          terminationGracePeriodSeconds: 30
          activeDeadlineSeconds: 500
[root@bastion ~]# oc create -f 03_cronjobs-etcd-backup.yaml
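The long one-liner in `args` above does three things per etcd pod: create a date-named directory, save a snapshot into it, and prune old directories. A hedged sketch of the pruning step as a standalone function, with two assumptions: the CronJob uses `find -ctime`, while this sketch uses `-mtime` so it can be exercised locally; and it deletes with `-exec rm -rf` because `find -type d -delete` cannot remove a directory that still contains a snapshot file.

```shell
#!/bin/bash
# prune_backups ROOT DAYS
# Delete date-named backup directories under ROOT older than DAYS days.
prune_backups() {
  local root=$1 days=$2
  find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -exec rm -rf {} +
}

# Inside the CronJob, the equivalent sequence per run is roughly:
#   mkdir -p /var/lib/etcd/backup/$(date +%Y-%m-%d)/
#   etcdctl ... snapshot save /var/lib/etcd/backup/$(date +%Y-%m-%d)/snapshot.db
#   prune_backups /var/lib/etcd/backup 7
```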
[root@bastion ~]# oc get cronjobs -n kube-system
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
etcd-backup 30 0 * * 0 False 1 2s 25s
[root@bastion ~]# oc get jobs -n kube-system
NAME DESIRED SUCCESSFUL AGE
etcd-backup-1675047600 1 1 22s
[root@bastion ~]# oc get pod -l etcd-backup
NAME READY STATUS RESTARTS AGE
etcd-backup-1675047600-6nq9z 0/1 Completed 0 30s
The backup is performed by the ose-cli pod, which accesses the etcd static pods running on the master nodes.
[root@bastion ~]# oc logs -f etcd-backup-1675047600-6nq9z
Snapshot saved at /var/lib/etcd/backup/2023-01-30/snapshot.db
Snapshot saved at /var/lib/etcd/backup/2023-01-30/snapshot.db
Snapshot saved at /var/lib/etcd/backup/2023-01-30/snapshot.db
Restore is performed on each master node from the snapshot file created during backup.
On all master nodes, temporarily move the static pod manifests etcd.yaml, apiserver.yaml, and controller.yaml out of the pods directory.
[root@bastion ~]# for masters in {master01,master02,master03}; do
ssh root@$masters.ocp3.local "mkdir -p /etc/origin/node/pods-stopped";
done
[root@bastion ~]# for masters in {master01,master02,master03}; do
ssh root@$masters.ocp3.local "mv /etc/origin/node/pods/* /etc/origin/node/pods-stopped";
done
Designate one of the master nodes as the recovery host and copy its etcd backup file to all of the other master nodes.
In this guide, master01 is the recovery host.
[root@master01 ~]# cp /var/lib/etcd/backup/2023-01-30/snapshot.db /opt/snapshot-2023-01-30.db
[root@master01 ~]# scp /opt/snapshot-2023-01-30.db root@master02.ocp3.local:/opt/snapshot-2023-01-30.db
[root@master01 ~]# scp /opt/snapshot-2023-01-30.db root@master03.ocp3.local:/opt/snapshot-2023-01-30.db
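After copying, it is worth confirming the snapshot arrived intact on every master. This check is an addition, not part of the original procedure; the helper below compares sha256 checksums of two local files, and the commented ssh loop sketches how it could be applied across the masters (hostnames as used in this guide).

```shell
#!/bin/bash
# same_checksum FILE1 FILE2 — succeed when both files hash identically.
same_checksum() {
  [ "$(sha256sum < "$1")" = "$(sha256sum < "$2")" ]
}

# Across the masters (run from master01), the same idea is roughly:
#   sha256sum /opt/snapshot-2023-01-30.db
#   for m in master02 master03; do
#     ssh root@$m.ocp3.local sha256sum /opt/snapshot-2023-01-30.db
#   done
# and compare that all three hashes match.
```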
Delete the existing etcd data so that the restore is based on the recovery host's (master01) snapshot.
(If the old data is not removed, the restore fails because of data inconsistency.)
[root@bastion ~]# for masters in {master01,master02,master03}; do
ssh root@$masters.ocp3.local "rm -rf /var/lib/etcd";
done
Run the following commands on every master node to restore etcd.
[root@master ~]# source /etc/etcd/etcd.conf
[root@master ~]# export ETCDCTL_API=3
[root@master ~]# etcdctl snapshot restore /opt/snapshot-2023-01-30.db \
--name $ETCD_NAME \
--initial-cluster $ETCD_INITIAL_CLUSTER \
--initial-cluster-token $ETCD_INITIAL_CLUSTER_TOKEN \
--initial-advertise-peer-urls $ETCD_INITIAL_ADVERTISE_PEER_URLS \
--data-dir /var/lib/etcd
[root@master ~]# chown etcd:etcd -R /var/lib/etcd
[root@master ~]# restorecon -Rv /var/lib/etcd
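The restore command relies on four variables sourced from /etc/etcd/etcd.conf. A small pre-flight check (a suggestion, not part of the original procedure) can catch a missing or incomplete etcd.conf before the data directory is rebuilt:

```shell
#!/bin/bash
# Verify that the etcd.conf variables the restore command consumes are set.
check_restore_env() {
  for v in ETCD_NAME ETCD_INITIAL_CLUSTER \
           ETCD_INITIAL_CLUSTER_TOKEN ETCD_INITIAL_ADVERTISE_PEER_URLS; do
    eval "val=\${$v}"
    if [ -z "$val" ]; then
      echo "missing $v (is /etc/etcd/etcd.conf sourced?)" >&2
      return 1
    fi
  done
}

# Usage on a master node:
#   source /etc/etcd/etcd.conf && check_restore_env \
#     && etcdctl snapshot restore /opt/snapshot-2023-01-30.db ...
```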
On all master nodes, move the static pod manifests etcd.yaml, apiserver.yaml, and controller.yaml back to their original directory.
[root@bastion ~]# for masters in {master01,master02,master03}; do
ssh root@$masters.ocp3.local "mv /etc/origin/node/pods-stopped/* /etc/origin/node/pods/";
done
Depending on the environment, the recovery completes within roughly 10 minutes.
[root@bastion ~]# oc get pod -o wide -n kube-system | grep etcd
master-etcd-master01.ocp3.local 1/1 Running 0 13h 172.16.45.20 master01.ocp3.local <none>
master-etcd-master02.ocp3.local 1/1 Running 0 13h 172.16.45.21 master02.ocp3.local <none>
master-etcd-master03.ocp3.local 1/1 Running 0 13h 172.16.45.22 master03.ocp3.local <none>
[root@master01 ~]# ETCD_ALL_ENDPOINTS=` etcdctl3 --write-out=fields member list | awk '/ClientURL/{printf "%s%s",sep,$3; sep=","}'`
[root@master01 ~]# etcdctl3 --endpoints=$ETCD_ALL_ENDPOINTS endpoint status --write-out=table
+----------------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://master01.ocp3.local:2379 | 57b1603f55394850 | 3.2.32 | 30 MB | false | 2 | 174176 |
| https://172.16.45.20:2379 | 57b1603f55394850 | 3.2.32 | 30 MB | false | 2 | 174176 |
| https://172.16.45.21:2379 | 6b0db1f119b9991e | 3.2.32 | 30 MB | true | 2 | 174176 |
| https://172.16.45.22:2379 | 7899562af965edcd | 3.2.32 | 30 MB | false | 2 | 174176 |
+----------------------------------+------------------+---------+---------+-----------+-----------+------------+
[root@master02 ~]# ETCD_ALL_ENDPOINTS=` etcdctl3 --write-out=fields member list | awk '/ClientURL/{printf "%s%s",sep,$3; sep=","}'`
[root@master02 ~]# etcdctl3 --endpoints=$ETCD_ALL_ENDPOINTS endpoint status --write-out=table
+----------------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://master02.ocp3.local:2379 | 6b0db1f119b9991e | 3.2.32 | 30 MB | true | 2 | 174494 |
| https://172.16.45.20:2379 | 57b1603f55394850 | 3.2.32 | 30 MB | false | 2 | 174494 |
| https://172.16.45.21:2379 | 6b0db1f119b9991e | 3.2.32 | 30 MB | true | 2 | 174494 |
| https://172.16.45.22:2379 | 7899562af965edcd | 3.2.32 | 30 MB | false | 2 | 174494 |
+----------------------------------+------------------+---------+---------+-----------+-----------+------------+
[root@master03 ~]# ETCD_ALL_ENDPOINTS=` etcdctl3 --write-out=fields member list | awk '/ClientURL/{printf "%s%s",sep,$3; sep=","}'`
[root@master03 ~]# etcdctl3 --endpoints=$ETCD_ALL_ENDPOINTS endpoint status --write-out=table
+----------------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://master03.ocp3.local:2379 | 7899562af965edcd | 3.2.32 | 30 MB | false | 2 | 174499 |
| https://172.16.45.20:2379 | 57b1603f55394850 | 3.2.32 | 30 MB | false | 2 | 174499 |
| https://172.16.45.21:2379 | 6b0db1f119b9991e | 3.2.32 | 30 MB | true | 2 | 174499 |
| https://172.16.45.22:2379 | 7899562af965edcd | 3.2.32 | 30 MB | false | 2 | 174499 |
+----------------------------------+------------------+---------+---------+-----------+-----------+------------+
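The `ETCD_ALL_ENDPOINTS` one-liner used above builds a comma-separated endpoint list by pulling the `ClientURL` fields out of `member list --write-out=fields` output. A sketch of that awk join against sample input (the member lines are simplified, unquoted examples of the fields format):

```shell
#!/bin/bash
# Join the third field of every ClientURL line with commas,
# exactly as the one-liner above does.
extract_endpoints() {
  awk '/ClientURL/{printf "%s%s", sep, $3; sep=","}'
}

extract_endpoints <<'EOF'
ClientURL : https://172.16.45.20:2379
ClientURL : https://172.16.45.21:2379
ClientURL : https://172.16.45.22:2379
EOF
```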
[root@bastion ~]# oc get node
NAME STATUS ROLES AGE VERSION
infra01.ocp3.local Ready infra 13h v1.11.0+d4cacc0
infra02.ocp3.local Ready infra 13h v1.11.0+d4cacc0
infra03.ocp3.local Ready infra 13h v1.11.0+d4cacc0
logging01.ocp3.local Ready logging 13h v1.11.0+d4cacc0
logging02.ocp3.local Ready logging 13h v1.11.0+d4cacc0
logging03.ocp3.local Ready logging 13h v1.11.0+d4cacc0
master01.ocp3.local Ready master 13h v1.11.0+d4cacc0
master02.ocp3.local Ready master 13h v1.11.0+d4cacc0
master03.ocp3.local Ready master 13h v1.11.0+d4cacc0
router01.ocp3.local Ready router 13h v1.11.0+d4cacc0
router02.ocp3.local Ready router 13h v1.11.0+d4cacc0
worker01.ocp3.local Ready worker 13h v1.11.0+d4cacc0
worker02.ocp3.local Ready worker 13h v1.11.0+d4cacc0
worker03.ocp3.local Ready worker 13h v1.11.0+d4cacc0
[root@bastion ~]# oc get pod -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
default docker-registry-2-22qzg 1/1 Running 0 13h 10.128.5.11 infra02.ocp3.local <none>
default docker-registry-2-cn5xr 1/1 Running 0 13h 10.128.7.13 infra03.ocp3.local <none>
default docker-registry-2-kn2p4 1/1 Running 0 13h 10.128.6.9 infra01.ocp3.local <none>
default logging-eventrouter-1-2wjz5 1/1 Running 0 11h 10.128.8.20 logging01.ocp3.local <none>
default logging-eventrouter-1-v7z5f 1/1 Running 0 11h 10.128.9.18 logging02.ocp3.local <none>
default logging-eventrouter-1-xrv7v 1/1 Running 0 11h 10.128.10.17 logging03.ocp3.local <none>
default router-1-bbvs4 1/1 Running 0 13h 172.16.45.31 router02.ocp3.local <none>
default router-1-tjtqp 1/1 Running 0 13h 172.16.45.30 router01.ocp3.local <none>
kube-system master-api-master01.ocp3.local 1/1 Running 0 13h 172.16.45.20 master01.ocp3.local <none>
kube-system master-api-master02.ocp3.local 1/1 Running 0 13h 172.16.45.21 master02.ocp3.local <none>
kube-system master-api-master03.ocp3.local 1/1 Running 0 13h 172.16.45.22 master03.ocp3.local <none>
kube-system master-controllers-master01.ocp3.local 1/1 Running 0 13h 172.16.45.20 master01.ocp3.local <none>
kube-system master-controllers-master02.ocp3.local 1/1 Running 0 13h 172.16.45.21 master02.ocp3.local <none>
kube-system master-controllers-master03.ocp3.local 1/1 Running 0 13h 172.16.45.22 master03.ocp3.local <none>
kube-system master-etcd-master01.ocp3.local 1/1 Running 0 13h 172.16.45.20 master01.ocp3.local <none>
kube-system master-etcd-master02.ocp3.local 1/1 Running 0 13h 172.16.45.21 master02.ocp3.local <none>
kube-system master-etcd-master03.ocp3.local 1/1 Running 0 13h 172.16.45.22 master03.ocp3.local <none>
openshift-console console-5f5bb7f877-842hc 1/1 Running 0 13h 10.128.2.7 master03.ocp3.local <none>
openshift-console console-5f5bb7f877-nbwcz 1/1 Running 0 13h 10.128.1.9 master02.ocp3.local <none>
openshift-console console-5f5bb7f877-pzld7 1/1 Running 0 13h 10.128.0.10 master01.ocp3.local <none>
openshift-infra bootstrap-autoapprover-0 1/1 Running 0 13h 10.128.0.9 master01.ocp3.local <none>
openshift-infra hawkular-cassandra-1-xqrtq 1/1 Running 0 12h 10.128.5.12 infra02.ocp3.local <none>
openshift-infra hawkular-metrics-75vsp 1/1 Running 0 12h 10.128.5.13 infra02.ocp3.local <none>
openshift-infra hawkular-metrics-schema-zsxxx 0/1 Completed 0 12h 10.128.12.2 worker02.ocp3.local <none>
openshift-infra heapster-x5tql 1/1 Running 0 12h 10.128.7.10 infra03.ocp3.local <none>
openshift-logging logging-curator-1675017000-8klvn 0/1 Completed 0 9h 10.128.9.21 logging02.ocp3.local <none>
openshift-logging logging-curator-ops-1675017000-rrhfj 0/1 Completed 0 9h 10.128.8.23 logging01.ocp3.local <none>
openshift-logging logging-es-data-master-47t2x0lo-1-5fhsv 2/2 Running 0 11h 10.128.10.18 logging03.ocp3.local <none>
openshift-logging logging-es-data-master-mp39sg6z-1-l5cr9 2/2 Running 0 11h 10.128.9.16 logging02.ocp3.local <none>
openshift-logging logging-es-data-master-npezejqj-1-dhn5q 2/2 Running 0 11h 10.128.8.18 logging01.ocp3.local <none>
openshift-logging logging-es-ops-data-master-pei9e4d3-1-g5dfp 2/2 Running 0 11h 10.128.10.19 logging03.ocp3.local <none>
openshift-logging logging-fluentd-5wjt2 1/1 Running 0 10h 10.128.3.5 router01.ocp3.local <none>
openshift-logging logging-fluentd-8drkd 1/1 Running 0 10h 10.128.13.10 worker03.ocp3.local <none>
openshift-logging logging-fluentd-bvxb2 1/1 Running 0 10h 10.128.10.22 logging03.ocp3.local <none>
openshift-logging logging-fluentd-cczv2 1/1 Running 0 10h 10.128.2.9 master03.ocp3.local <none>
openshift-logging logging-fluentd-gzd56 1/1 Running 0 10h 10.128.12.12 worker02.ocp3.local <none>
openshift-logging logging-fluentd-hvl25 1/1 Running 0 10h 10.128.7.11 infra03.ocp3.local <none>
openshift-logging logging-fluentd-lmd9v 1/1 Running 0 10h 10.128.1.8 master02.ocp3.local <none>
openshift-logging logging-fluentd-m6flf 1/1 Running 0 10h 10.128.9.19 logging02.ocp3.local <none>
openshift-logging logging-fluentd-sz958 1/1 Running 0 10h 10.128.4.6 router02.ocp3.local <none>
openshift-logging logging-fluentd-t8kxr 1/1 Running 0 10h 10.128.5.10 infra02.ocp3.local <none>
openshift-logging logging-fluentd-tnpgr 1/1 Running 0 10h 10.128.6.10 infra01.ocp3.local <none>
openshift-logging logging-fluentd-v55kk 1/1 Running 0 10h 10.128.8.21 logging01.ocp3.local <none>
openshift-logging logging-fluentd-vg2qd 1/1 Running 0 10h 10.128.11.10 worker01.ocp3.local <none>
openshift-logging logging-fluentd-z89m9 1/1 Running 0 10h 10.128.0.11 master01.ocp3.local <none>
openshift-logging logging-kibana-1-58l5l 2/2 Running 0 11h 10.128.10.20 logging03.ocp3.local <none>
openshift-logging logging-kibana-1-vdlpt 2/2 Running 0 11h 10.128.9.20 logging02.ocp3.local <none>
openshift-logging logging-kibana-1-vl2fx 2/2 Running 0 11h 10.128.8.19 logging01.ocp3.local <none>
openshift-logging logging-kibana-ops-1-kdzlw 2/2 Running 0 11h 10.128.10.21 logging03.ocp3.local <none>
openshift-logging logging-kibana-ops-1-n8tnx 2/2 Running 0 11h 10.128.9.17 logging02.ocp3.local <none>
openshift-logging logging-kibana-ops-1-vfpvk 2/2 Running 0 11h 10.128.8.22 logging01.ocp3.local <none>
openshift-metrics-server metrics-server-7cb48555f7-jhr89 1/1 Running 0 12h 10.128.7.12 infra03.ocp3.local <none>
openshift-monitoring alertmanager-main-0 3/3 Running 0 11h 10.128.12.11 worker02.ocp3.local <none>
openshift-monitoring alertmanager-main-1 3/3 Running 0 11h 10.128.11.11 worker01.ocp3.local <none>
openshift-monitoring alertmanager-main-2 3/3 Running 0 11h 10.128.13.11 worker03.ocp3.local <none>
openshift-monitoring cluster-monitoring-operator-79c559d786-z7ghz 1/1 Running 0 11h 10.128.11.8 worker01.ocp3.local <none>
openshift-monitoring grafana-784cbccb8f-zl66b 2/2 Running 0 11h 10.128.12.10 worker02.ocp3.local <none>
openshift-monitoring kube-state-metrics-6cf558b5f4-dsmlk 3/3 Running 0 11h 10.128.12.9 worker02.ocp3.local <none>
openshift-monitoring node-exporter-2lp9b 2/2 Running 0 11h 172.16.45.52 logging03.ocp3.local <none>
openshift-monitoring node-exporter-62xh8 2/2 Running 0 11h 172.16.45.30 router01.ocp3.local <none>
openshift-monitoring node-exporter-7v5lr 2/2 Running 0 11h 172.16.45.42 infra03.ocp3.local <none>
openshift-monitoring node-exporter-99h4f 2/2 Running 0 11h 172.16.45.41 infra02.ocp3.local <none>
openshift-monitoring node-exporter-9jprb 2/2 Running 0 11h 172.16.45.21 master02.ocp3.local <none>
openshift-monitoring node-exporter-cwb4p 2/2 Running 0 11h 172.16.45.51 logging02.ocp3.local <none>
openshift-monitoring node-exporter-ffmqx 2/2 Running 0 11h 172.16.45.20 master01.ocp3.local <none>
openshift-monitoring node-exporter-jj8hw 2/2 Running 0 11h 172.16.45.62 worker03.ocp3.local <none>
openshift-monitoring node-exporter-kftw5 2/2 Running 0 11h 172.16.45.22 master03.ocp3.local <none>
openshift-monitoring node-exporter-mp8kc 2/2 Running 0 11h 172.16.45.31 router02.ocp3.local <none>
openshift-monitoring node-exporter-ql62l 2/2 Running 0 11h 172.16.45.61 worker02.ocp3.local <none>
openshift-monitoring node-exporter-spglb 2/2 Running 0 11h 172.16.45.40 infra01.ocp3.local <none>
openshift-monitoring node-exporter-vqnjh 2/2 Running 0 11h 172.16.45.50 logging01.ocp3.local <none>
openshift-monitoring node-exporter-zs82p 2/2 Running 0 11h 172.16.45.60 worker01.ocp3.local <none>
openshift-monitoring prometheus-k8s-0 4/4 Running 1 2h 10.128.11.12 worker01.ocp3.local <none>
openshift-monitoring prometheus-k8s-1 4/4 Running 1 2h 10.128.12.13 worker02.ocp3.local <none>
openshift-monitoring prometheus-operator-57548d4b75-422r7 1/1 Running 0 11h 10.128.13.8 worker03.ocp3.local <none>
openshift-node sync-2nkt7 1/1 Running 0 13h 172.16.45.20 master01.ocp3.local <none>
openshift-node sync-4vwcm 1/1 Running 0 13h 172.16.45.52 logging03.ocp3.local <none>
openshift-node sync-6279f 1/1 Running 0 13h 172.16.45.42 infra03.ocp3.local <none>
openshift-node sync-6vbn2 1/1 Running 0 13h 172.16.45.50 logging01.ocp3.local <none>
openshift-node sync-7zpj8 1/1 Running 0 13h 172.16.45.60 worker01.ocp3.local <none>
openshift-node sync-bm47j 1/1 Running 0 13h 172.16.45.62 worker03.ocp3.local <none>
openshift-node sync-gnlz8 1/1 Running 0 13h 172.16.45.31 router02.ocp3.local <none>
openshift-node sync-hrmxz 1/1 Running 0 13h 172.16.45.22 master03.ocp3.local <none>
openshift-node sync-k8nq9 1/1 Running 0 13h 172.16.45.21 master02.ocp3.local <none>
openshift-node sync-m768f 1/1 Running 0 13h 172.16.45.61 worker02.ocp3.local <none>
openshift-node sync-mstzd 1/1 Running 0 13h 172.16.45.41 infra02.ocp3.local <none>
openshift-node sync-n9wzg 1/1 Running 0 13h 172.16.45.51 logging02.ocp3.local <none>
openshift-node sync-sfgns 1/1 Running 0 13h 172.16.45.40 infra01.ocp3.local <none>
openshift-node sync-xz92j 1/1 Running 0 13h 172.16.45.30 router01.ocp3.local <none>
openshift-sdn ovs-24n89 1/1 Running 0 13h 172.16.45.21 master02.ocp3.local <none>
openshift-sdn ovs-4qjt8 1/1 Running 0 13h 172.16.45.31 router02.ocp3.local <none>
openshift-sdn ovs-62pqj 1/1 Running 0 13h 172.16.45.62 worker03.ocp3.local <none>
openshift-sdn ovs-62ssg 1/1 Running 0 13h 172.16.45.41 infra02.ocp3.local <none>
openshift-sdn ovs-6fqwt 1/1 Running 0 13h 172.16.45.42 infra03.ocp3.local <none>
openshift-sdn ovs-7f55g 1/1 Running 0 13h 172.16.45.40 infra01.ocp3.local <none>
openshift-sdn ovs-7jcjs 1/1 Running 0 13h 172.16.45.61 worker02.ocp3.local <none>
openshift-sdn ovs-7m4z5 1/1 Running 0 13h 172.16.45.30 router01.ocp3.local <none>
openshift-sdn ovs-d9hs5 1/1 Running 0 13h 172.16.45.51 logging02.ocp3.local <none>
openshift-sdn ovs-k9lhn 1/1 Running 0 13h 172.16.45.60 worker01.ocp3.local <none>
openshift-sdn ovs-nrd5r 1/1 Running 0 13h 172.16.45.50 logging01.ocp3.local <none>
openshift-sdn ovs-t8527 1/1 Running 0 13h 172.16.45.20 master01.ocp3.local <none>
openshift-sdn ovs-wlk2k 1/1 Running 0 13h 172.16.45.22 master03.ocp3.local <none>
openshift-sdn ovs-x5ftt 1/1 Running 0 13h 172.16.45.52 logging03.ocp3.local <none>
openshift-sdn sdn-2dkcv 1/1 Running 0 13h 172.16.45.52 logging03.ocp3.local <none>
openshift-sdn sdn-45fhn 1/1 Running 0 13h 172.16.45.20 master01.ocp3.local <none>
openshift-sdn sdn-78xwx 1/1 Running 0 13h 172.16.45.40 infra01.ocp3.local <none>
openshift-sdn sdn-7f26d 1/1 Running 0 13h 172.16.45.21 master02.ocp3.local <none>
openshift-sdn sdn-j2zmx 1/1 Running 0 13h 172.16.45.41 infra02.ocp3.local <none>
openshift-sdn sdn-k6mhw 1/1 Running 0 13h 172.16.45.22 master03.ocp3.local <none>
openshift-sdn sdn-k826k 1/1 Running 0 13h 172.16.45.62 worker03.ocp3.local <none>
openshift-sdn sdn-lhv8t 1/1 Running 0 13h 172.16.45.30 router01.ocp3.local <none>
openshift-sdn sdn-m89v8 1/1 Running 0 13h 172.16.45.31 router02.ocp3.local <none>
openshift-sdn sdn-mffs2 1/1 Running 0 13h 172.16.45.50 logging01.ocp3.local <none>
openshift-sdn sdn-rhs7f 1/1 Running 0 13h 172.16.45.60 worker01.ocp3.local <none>
openshift-sdn sdn-sb4fc 1/1 Running 0 13h 172.16.45.61 worker02.ocp3.local <none>
openshift-sdn sdn-szvb5 1/1 Running 0 13h 172.16.45.51 logging02.ocp3.local <none>
openshift-sdn sdn-vtvsd 1/1 Running 0 13h 172.16.45.42 infra03.ocp3.local <none>
openshift-web-console webconsole-55ccd559bb-27gtn 1/1 Running 0 13h 10.128.0.8 master01.ocp3.local <none>
openshift-web-console webconsole-55ccd559bb-2jbng 1/1 Running 0 13h 10.128.2.8 master03.ocp3.local <none>
openshift-web-console webconsole-55ccd559bb-4d97f 1/1 Running 0 13h 10.128.1.7 master02.ocp3.local <none>
Reboot all OpenShift cluster nodes to complete the procedure.
[root@bastion ~]# for node in $(oc get node -o name | cut -d '/' -f '2'); do
ssh root@$node "systemctl reboot";
done
[1]: RedHat Knowledge-Centered Support - How to restore etcd on OpenShift 3.11 with 2 etcd members in error state
[2]: RedHat Knowledge-Centered Support - How do I restore from an etcd backup in OpenShift 3.9 and older?
[3]: Gist - OpenShift v4.x - etcd backup and restore
[4]: Gist - OpenShift v3.11 - 2022-03-28 issue summary