Ok. Here is one failing Pod:
k get pods -o yaml | grep bemdb
openshift.io/deployment-config.name: bemdb
openshift.io/deployment.name: bemdb-5
generateName: bemdb-5-
app: bemdb
deployment: bemdb-5
deploymentconfig: bemdb
name: bemdb
name: bemdb-5-k5zdl
name: bemdb-5
selfLink: /api/v1/namespaces/bem/pods/bemdb-5-k5zdl
name: bemdb
name: bemdb
name: bemdb
name: bemdb-data
- name: bemdb-data
claimName: bemdb
openebs.io/persistent-volume-claim: bemdb
openebs.io/persistent-volume-claim: bemdb
openebs.io/persistent-volume-claim: bemdb
openebs.io/persistent-volume-claim: bemdb
From the Pod event log:
Warning FailedMount 14s (x3559 over 5d) kubelet, staging-okd-worker01.okd-staging.crosscan.com MountVolume.WaitForAttach failed for volume "pvc-281e2fad-cfd0-11e9-a9e0-525400d479ca" : Heuristic determination of mount point failed:stat /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/iscsi/iface-default/172.30.220.203:3260-iqn.2016-09.com.openebs.jiva:pvc-281e2fad-cfd0-11e9-a9e0-525400d479ca-lun-0: input/output error
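The last path component in that event encodes the iSCSI portal, target IQN and LUN the kubelet is trying to stat. Here is a small sketch for pulling the three fields apart, in case it helps when comparing against the iSCSI sessions on the worker node. The function name parse_iscsi_dir is my own, and the `<ip>:<port>-<iqn>-lun-<n>` layout is only inferred from the path in the event above, not taken from any documentation.

```shell
#!/usr/bin/env bash
# Sketch: split a kubelet iSCSI mount directory name into its parts.
# Assumed layout (inferred from the event above): <ip>:<port>-<iqn>-lun-<n>
# parse_iscsi_dir is a hypothetical helper name, not part of any tool.
parse_iscsi_dir() {
  local name="${1##*/}"            # keep only the last path component
  local portal="${name%%-iqn.*}"   # everything before "-iqn."
  local rest="${name#"$portal"-}"  # "iqn....-lun-<n>"
  local iqn="${rest%-lun-*}"       # strip the trailing "-lun-<n>"
  local lun="${rest##*-lun-}"      # keep only "<n>"
  printf 'portal=%s\niqn=%s\nlun=%s\n' "$portal" "$iqn" "$lun"
}
```

Called with the path from the event (without the trailing ": input/output error"), it prints one portal=, iqn= and lun= line each. Those values can then be compared by hand with what `iscsiadm -m session` reports on the node.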
Storage pods:
k get pod | grep pvc-281e2fad-cfd0-11e9-a9e0-525400d479ca
pvc-281e2fad-cfd0-11e9-a9e0-525400d479ca-ctrl-6f6d8bf7f-sjs44 2/2 Running 2 11d
pvc-281e2fad-cfd0-11e9-a9e0-525400d479ca-rep-7c97b7c4c-52j6v 1/1 Running 1 11d
pvc-281e2fad-cfd0-11e9-a9e0-525400d479ca-rep-7c97b7c4c-5k2j8 1/1 Running 2 11d
pvc-281e2fad-cfd0-11e9-a9e0-525400d479ca-rep-7c97b7c4c-gjbg4 1/1 Running 1 11d
Logs of the controller container (k logs pvc-281e2fad-cfd0-11e9-a9e0-525400d479ca-ctrl-6f6d8bf7f-sjs44 pvc-281e2fad-cfd0-11e9-a9e0-525400d479ca-ctrl-con):
[...]
time="2019-09-17T07:32:20Z" level=warning msg="opcode: a3h err: check condition"
time="2019-09-17T07:34:22Z" level=warning msg="opcode: a3h err: check condition"
time="2019-09-17T07:36:25Z" level=warning msg="opcode: a3h err: check condition"
Logs of the maya-exporter container:
I0912 04:12:45.356223 1 command.go:119] Starting maya-exporter ...
I0912 04:12:45.356542 1 command.go:125] Initialising maya-exporter for the jiva
I0912 04:12:45.356950 1 command.go:152] Registered maya exporter for jiva
I0912 04:12:45.356989 1 server.go:41] Starting http server....
Logs of one random replica pod:
time="2019-09-12T04:13:32Z" level=info msg="PrepareRebuild tcp://10.129.2.242:9502"
time="2019-09-12T04:13:32Z" level=info msg="GetReplica for id 1"
time="2019-09-12T04:13:32Z" level=info msg="Running ssync [ssync -timeout 7 -port 9700 -daemon volume-head-003.img.meta]"
time="2019-09-12T04:13:32Z" level=info msg="Creating Ssync service"
time="2019-09-12T04:13:32Z" level=info msg="open: receiving fileSize: 164, setting up directIo: false"
time="2019-09-12T04:13:32Z" level=info msg="Ssync server opened and ready"
time="2019-09-12T04:13:32Z" level=info msg="Ssync server exit(0)"
time="2019-09-12T04:13:32Z" level=info msg="Done running ssync [ssync -timeout 7 -port 9700 -daemon volume-head-003.img.meta]"
time="2019-09-12T04:13:33Z" level=info msg="GetReplica for id 1"
time="2019-09-12T04:13:33Z" level=info msg="reloadAndVerify tcp://10.129.2.242:9502"
time="2019-09-12T04:13:33Z" level=info msg="Reload Replica"
time="2019-09-12T04:13:33Z" level=info msg="Reloading volume"
time="2019-09-12T04:13:33Z" level=info msg="Start reading extents"
time="2019-09-12T04:13:33Z" level=info msg="Read extents successful"
10.129.2.242 - - [12/Sep/2019:04:13:33 +0000] "POST /v1/replicas/1?action=reload HTTP/1.1" 200 2492
time="2019-09-12T04:13:33Z" level=info msg="GetReplica for id 1"
10.128.5.35 - - [12/Sep/2019:04:13:33 +0000] "POST /v1/replicas/1?action=setreplicamode HTTP/1.1" 200 2492
time="2019-09-12T04:13:33Z" level=info msg="SetReplicaMode to RW"
time="2019-09-12T04:13:33Z" level=info msg="SetRevisionCounter to 114279"
10.128.5.35 - - [12/Sep/2019:04:13:33 +0000] "POST /v1/replicas/1?action=setrevisioncounter HTTP/1.1" 200 2492
time="2019-09-12T04:13:33Z" level=info msg="GetReplica for id 1"
time="2019-09-12T04:13:33Z" level=info msg="SetRebuilding to false"
10.129.2.242 - - [12/Sep/2019:04:13:33 +0000] "POST /v1/replicas/1?action=setrebuilding HTTP/1.1" 200 2776
time="2019-09-12T04:13:34Z" level=info msg="GetReplica for id 1"
time="2019-09-12T04:13:34Z" level=info msg="SnapshotReplica name: f8910506-b6be-49ef-ab79-d4f1fde4feef created: 2019-09-12T04:13:34Z"
time="2019-09-12T04:13:34Z" level=info msg="Snapshotting [f8910506-b6be-49ef-ab79-d4f1fde4feef] volume, user created false, created time 2019-09-12T04:13:34Z"
10.128.5.35 - - [12/Sep/2019:04:13:34 +0000] "POST /v1/replicas/1?action=snapshot HTTP/1.1" 200 3162
time="2019-09-12T04:13:34Z" level=info msg="Set clone status as NA"
time="2019-09-12T04:13:53Z" level=info msg="GetReplica for id 1"