Metric | Description | New name as of Kubernetes 1.14 (OpenShift 4.2) |
---|---|---|
kubelet_pleg_relist_interval_microseconds | Interval in microseconds between "relist" calls | kubelet_pleg_relist_interval_seconds |
kubelet_pleg_relist_latency_microseconds | Latency in microseconds for "relist" | kubelet_pleg_relist_duration_seconds |
kubelet_runtime_operations | Cumulative number of runtime operations by operation type | kubelet_runtime_operations_total |
kubelet_runtime_operations_latency_microseconds | Latency in microseconds of runtime operations. Broken down by operation type | kubelet_runtime_operations_duration_seconds |
# HELP kubelet_pleg_relist_interval_microseconds Interval in microseconds between relisting in PLEG.
# TYPE kubelet_pleg_relist_interval_microseconds summary
kubelet_pleg_relist_interval_microseconds{quantile="0.5"} 1.054052e+06
kubelet_pleg_relist_interval_microseconds{quantile="0.9"} 1.074873e+06
kubelet_pleg_relist_interval_microseconds{quantile="0.99"} 1.126039e+06
kubelet_pleg_relist_interval_microseconds_count 5146
# HELP kubelet_pleg_relist_latency_microseconds Latency in microseconds for relisting pods in PLEG.
# TYPE kubelet_pleg_relist_latency_microseconds summary
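The renamed metrics in the table report seconds instead of microseconds, so values read from the old metrics have to be divided by 1,000,000 before they can be compared with the new ones. The following is only a minimal sketch of that conversion; `us_to_seconds` is a hypothetical helper name, not part of the kubelet or of any CLI.

```shell
#!/bin/sh
# Hypothetical helper: convert one sample of an old *_microseconds metric
# into the unit used by the renamed *_seconds metrics.
us_to_seconds() {
  awk -v us="$1" 'BEGIN { printf "%.6f\n", us / 1000000 }'
}

# Median relist interval (quantile="0.5") taken from the sample output above.
us_to_seconds 1.054052e+06
```

With the sample value above, the relist interval works out to roughly one second.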
OpenShift version: 3.11
etcd information:
- cluster id: aaaaaaaaaaaaaaaa
- 111111111111111: master1.ocp.example.com:https://10.0.1.10:2380:https://10.0.1.10:2379 (this member has failed)
- 222222222222222: master2.ocp.example.com:https://10.0.1.20:2380:https://10.0.1.20:2379
- 333333333333333: master3.ocp.example.com:https://10.0.1.30:2380:https://10.0.1.30:2379
# oc get pod -n kube-system
NAME                                  READY     STATUS             RESTARTS   AGE
:
master-etcd-master1.ocp.example.com   0/1       CrashLoopBackOff   10         15m
master-etcd-master2.ocp.example.com   1/1       Running            1          226d
master-etcd-master3.ocp.example.com   1/1       Running            1          226d
# oc logs master-etcd-master1.ocp.example.com
:
2019-12-25 10:15:24.291020 C | raft: tocommit(18928) is out of range [lastIndex(13100)]. Was the raft log corrupted, truncated, or lost?
sh-4.2# etcdctl --cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE --endpoints=$ETCD_LISTEN_CLIENT_URLS member remove 111111111111111
Member 111111111111111 removed from cluster aaaaaaaaaaaaaaaa
sh-4.2# etcdctl --cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE --endpoints=$ETCD_LISTEN_CLIENT_URLS --write-out=table member list
+-----------------+---------+-------------------------+------------------------+------------------------+
|       ID        | STATUS  |          NAME           |       PEER ADDRS       |      CLIENT ADDRS      |
+-----------------+---------+-------------------------+------------------------+------------------------+
| 222222222222222 | started | master2.ocp.example.com | https://10.0.1.20:2380 | https://10.0.1.20:2379 |
| 333333333333333 | started | master3.ocp.example.com | https://10.0.1.30:2380 | https://10.0.1.30:2379 |
+-----------------+---------+-------------------------+------------------------+------------------------+
# ssh master1
master1 ~# mv /etc/origin/node/pods/etcd.yaml .
master1 ~# oc get pod -n kube-system
NAME                                  READY     STATUS    RESTARTS   AGE
:
master-etcd-master2.ocp.example.com   1/1       Running   2          1m
master-etcd-master3.ocp.example.com   1/1       Running   1          226d
master1 ~# mv /var/lib/etcd /var/lib/etcd_bak
master1 ~# mkdir /var/lib/etcd
# oc rsh master-etcd-master2.ocp.example.com
sh-4.2# source /etc/etcd/etcd.conf
sh-4.2# export ETCDCTL_API=3
sh-4.2# etcdctl --cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE --endpoints=$ETCD_LISTEN_CLIENT_URLS member add master1.ocp.example.com --peer-urls https://10.0.1.10:2380
Member 444444444444444 added to cluster aaaaaaaaaaaaaaaa
ETCD_NAME="master1.ocp.example.com"
ETCD_INITIAL_CLUSTER="master1.ocp.example.com=https://10.0.1.10:2380,master2.ocp.example.com=https://10.0.1.20:2380,master3.ocp.example.com=https://10.0.1.30:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
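The `ETCD_INITIAL_CLUSTER` value printed by `member add` is a comma-separated list of `name=peer-URL` pairs, one per member. If the output is no longer on screen, the same string can be assembled by hand. This is a rough sketch only; `build_initial_cluster` is a hypothetical helper, not an etcd or OpenShift command.

```shell
#!/bin/sh
# Hypothetical helper: join "name=peer-url" arguments with commas, which is
# the form ETCD_INITIAL_CLUSTER expects.
build_initial_cluster() {
  result=""
  for pair in "$@"; do
    if [ -z "$result" ]; then
      result="$pair"
    else
      result="$result,$pair"
    fi
  done
  printf '%s\n' "$result"
}

# Member names and peer URLs as used in this walkthrough.
build_initial_cluster \
  "master1.ocp.example.com=https://10.0.1.10:2380" \
  "master2.ocp.example.com=https://10.0.1.20:2380" \
  "master3.ocp.example.com=https://10.0.1.30:2380"
```

The printed line should match the `ETCD_INITIAL_CLUSTER` value shown above, and it is the value that goes into `/etc/etcd/etcd.conf` on the member being re-added.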
# ssh master1
master1 ~# vim /etc/etcd/etcd.conf
ETCD_NAME="master1.ocp.example.com"
:
ETCD_INITIAL_CLUSTER="master1.ocp.example.com=https://10.0.1.10:2380,master2.ocp.example.com=https://10.0.1.20:2380,master3.ocp.example.com=https://10.0.1.30:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
master1 ~# mv etcd.yaml /etc/origin/node/pods/
master1 ~# oc get pod -n kube-system
NAME                                  READY     STATUS    RESTARTS   AGE
:
master-etcd-master1.ocp.example.com   1/1       Running   0          5s
master-etcd-master2.ocp.example.com   1/1       Running   2          17m
master-etcd-master3.ocp.example.com   1/1       Running   1          226d
master1 ~# oc logs master-etcd-master1.ocp.example.com
:
# oc rsh master-etcd-master2.ocp.example.com
sh-4.2# source /etc/etcd/etcd.conf
sh-4.2# export ETCDCTL_API=3
sh-4.2# ETCD_ALL_ENDPOINTS=`etcdctl --cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE --endpoints=$ETCD_LISTEN_CLIENT_URLS --write-out=fields member list | awk '/ClientURL/{printf "%s%s",sep,$3; sep=","}'`
sh-4.2# etcdctl --cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE --endpoints=$ETCD_LISTEN_CLIENT_URLS --write-out=table member list
+-----------------+---------+-------------------------+------------------------+------------------------+
|       ID        | STATUS  |          NAME           |       PEER ADDRS       |      CLIENT ADDRS      |
+-----------------+---------+-------------------------+------------------------+------------------------+
| 444444444444444 | started | master1.ocp.example.com | https://10.0.1.10:2380 | https://10.0.1.10:2379 |