@a1994sc
Created March 15, 2024 21:19
2024-03-15 00:48:47.404259 I | op-osd: ceph osd status in namespace "rook-ceph" check interval "1m0s"
2024-03-15 00:48:47.404266 I | ceph-cluster-controller: enabling ceph osd monitoring goroutine for cluster "rook-ceph"
2024-03-15 00:48:48.396763 I | op-k8sutil: CSI_PROVISIONER_TOLERATIONS="" (default)
2024-03-15 00:48:58.783679 I | op-k8sutil: CSI_PLUGIN_TOLERATIONS="- key: node-role.kubernetes.io/control-plane\n operator: Exists\n- key: node-role.kubernetes.io/infrastructure\n operator: Exists" (configmap)
2024-03-15 00:48:58.783755 I | op-k8sutil: CSI_RBD_PLUGIN_TOLERATIONS="" (default)
2024-03-15 00:48:59.415188 I | op-k8sutil: CSI_RBD_PROVISIONER_TOLERATIONS="" (default)
2024-03-15 00:49:01.990334 I | op-k8sutil: CSI_CEPHFS_PLUGIN_TOLERATIONS="" (default)
2024-03-15 00:49:02.757884 I | op-k8sutil: CSI_CEPHFS_PROVISIONER_TOLERATIONS="" (default)
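The CSI_PLUGIN_TOLERATIONS value above is the one setting read from the operator ConfigMap rather than a default. A minimal sketch of where that value lives, assuming the standard rook-ceph-operator-config ConfigMap name (the ConfigMap name itself is not shown in the log; the toleration list is copied from the log line above):

apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-ceph-operator-config   # assumed standard name, not shown in this log
  namespace: rook-ceph
data:
  # Applied to the CSI plugin daemonsets so they also run on tainted nodes.
  CSI_PLUGIN_TOLERATIONS: |
    - key: node-role.kubernetes.io/control-plane
      operator: Exists
    - key: node-role.kubernetes.io/infrastructure
      operator: Exists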
2024-03-15 00:49:10.717449 I | op-osd: stopping monitoring of OSDs in namespace "rook-ceph"
   Tolerations: []v1.Toleration{
   TolerationSeconds: nil,
   Tolerations: []v1.Toleration{
   TolerationSeconds: nil,
   Tolerations: []v1.Toleration{
   TolerationSeconds: nil,
2024-03-15 00:49:14.988431 I | op-osd: ceph osd status in namespace "rook-ceph" check interval "1m0s"
2024-03-15 00:49:14.988436 I | ceph-cluster-controller: enabling ceph osd monitoring goroutine for cluster "rook-ceph"
2024-03-15 00:54:22.428182 I | op-config: deleting "global" "ms_osd_compress_mode" option from the mon configuration database
2024-03-15 00:54:23.150740 I | op-config: successfully deleted "ms_osd_compress_mode" option from the mon configuration database
2024-03-15 00:55:18.285382 I | op-osd: start running osds in namespace "rook-ceph"
2024-03-15 00:55:18.285457 I | op-osd: wait timeout for healthy OSDs during upgrade or restart is "10m0s"
2024-03-15 00:55:18.311306 I | op-osd: start provisioning the OSDs on PVCs, if needed
2024-03-15 00:55:18.314544 I | op-osd: no storageClassDeviceSets defined to configure OSDs on PVCs
2024-03-15 00:55:18.315526 I | op-osd: start provisioning the OSDs on nodes, if needed
2024-03-15 00:55:18.356191 I | op-k8sutil: skipping creation of OSDs on nodes [machine-kc1 machine-kc2 machine-kc3]: placement settings do not match
2024-03-15 00:55:18.356359 I | op-osd: 5 of the 8 storage nodes are valid
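The operator skips the machine-kc* nodes because they do not satisfy the OSD placement in the CephCluster spec, and with no storageClassDeviceSets defined it provisions OSDs on nodes only. A hedged sketch of the kind of spec that would produce the messages above (field names are from the CephCluster CRD; the node-affinity label is purely illustrative and not taken from this log):

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  waitTimeoutForHealthyOSDInMinutes: 10    # matches the "10m0s" healthy-OSD wait logged above
  storage:
    useAllNodes: true                      # illustrative; no storageClassDeviceSets, node-based OSDs only
    useAllDevices: true
  placement:
    osd:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: example.com/storage     # hypothetical label; the machine-kc* nodes would lack it
              operator: In
              values: ["true"]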
2024-03-15 00:55:18.487034 I | op-k8sutil: Removing previous job rook-ceph-osd-prepare-machine-ki2 to start a new one
2024-03-15 00:55:18.556406 I | op-k8sutil: batch job rook-ceph-osd-prepare-machine-ki2 still exists
2024-03-15 00:55:19.467825 I | op-config: setting "global"="mon_pg_warn_min_per_osd"="0" option to the mon configuration database
2024-03-15 00:55:20.153573 I | op-config: successfully set "global"="mon_pg_warn_min_per_osd"="0" option to the mon configuration database
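The operator manages these options itself, deleting ms_osd_compress_mode and pinning mon_pg_warn_min_per_osd=0 directly in the mon configuration database. For comparison only, user-supplied Ceph options normally travel through the rook-config-override ConfigMap instead; a minimal sketch, assuming the documented "config" data key (the option value mirrors the log line above):

apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-config-override    # user-facing override ConfigMap; not part of this log
  namespace: rook-ceph
data:
  config: |
    [global]
    mon_pg_warn_min_per_osd = 0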
2024-03-15 00:55:21.559716 I | op-k8sutil: batch job rook-ceph-osd-prepare-machine-ki2 deleted
2024-03-15 00:55:21.933324 I | op-osd: started OSD provisioning job for node "machine-ki2"
2024-03-15 00:55:21.953498 I | op-k8sutil: Removing previous job rook-ceph-osd-prepare-machine-ki3 to start a new one
2024-03-15 00:55:22.002946 I | op-k8sutil: batch job rook-ceph-osd-prepare-machine-ki3 still exists
2024-03-15 00:55:25.005131 I | op-k8sutil: batch job rook-ceph-osd-prepare-machine-ki3 deleted
2024-03-15 00:55:26.335548 I | op-osd: started OSD provisioning job for node "machine-ki3"
2024-03-15 00:55:26.351833 I | op-k8sutil: Removing previous job rook-ceph-osd-prepare-machine-kw1 to start a new one
2024-03-15 00:55:26.450902 I | op-k8sutil: batch job rook-ceph-osd-prepare-machine-kw1 still exists
2024-03-15 00:55:29.454864 I | op-k8sutil: batch job rook-ceph-osd-prepare-machine-kw1 deleted
2024-03-15 00:55:29.924247 I | op-osd: started OSD provisioning job for node "machine-kw1"
2024-03-15 00:55:29.939785 I | op-k8sutil: Removing previous job rook-ceph-osd-prepare-machine-kw2 to start a new one
2024-03-15 00:55:30.091316 I | op-k8sutil: batch job rook-ceph-osd-prepare-machine-kw2 still exists
2024-03-15 00:55:33.097057 I | op-k8sutil: batch job rook-ceph-osd-prepare-machine-kw2 deleted
2024-03-15 00:55:33.619510 I | op-osd: started OSD provisioning job for node "machine-kw2"
2024-03-15 00:55:33.640956 I | op-k8sutil: Removing previous job rook-ceph-osd-prepare-machine-ki1 to start a new one
2024-03-15 00:55:33.826396 I | op-k8sutil: batch job rook-ceph-osd-prepare-machine-ki1 still exists
2024-03-15 00:55:36.832605 I | op-k8sutil: batch job rook-ceph-osd-prepare-machine-ki1 deleted
2024-03-15 00:55:38.248324 I | op-osd: started OSD provisioning job for node "machine-ki1"
2024-03-15 00:55:38.256661 I | op-osd: OSD orchestration status for node machine-ki1 is "starting"
2024-03-15 00:55:38.256730 I | op-osd: OSD orchestration status for node machine-ki2 is "completed"
2024-03-15 00:55:38.264281 I | op-osd: OSD orchestration status for node machine-ki3 is "completed"
2024-03-15 00:55:38.276026 I | op-osd: OSD orchestration status for node machine-kw1 is "starting"
2024-03-15 00:55:38.276061 I | op-osd: OSD orchestration status for node machine-kw2 is "completed"
2024-03-15 00:55:41.220349 I | op-osd: updating OSD 0 on node "machine-ki1"
2024-03-15 00:55:43.659058 I | clusterdisruption-controller: reconciling osd pdb reconciler as the allowed disruptions in default pdb is 0
2024-03-15 00:55:46.646253 I | clusterdisruption-controller: osd "rook-ceph-osd-0" is down but no node drain is detected
2024-03-15 00:55:47.572463 I | clusterdisruption-controller: osd is down in failure domain "machine-ki1". pg health: "cluster is not fully clean. PGs: [{StateName:active+clean Count:111} {StateName:stale+active+clean Count:58}]"
2024-03-15 00:55:47.573498 I | clusterdisruption-controller: creating temporary blocking pdb "rook-ceph-osd-host-machine-ki2" with maxUnavailable=0 for "host" failure domain "machine-ki2"
2024-03-15 00:55:47.585072 I | clusterdisruption-controller: creating temporary blocking pdb "rook-ceph-osd-host-machine-ki3" with maxUnavailable=0 for "host" failure domain "machine-ki3"
2024-03-15 00:55:47.601689 I | clusterdisruption-controller: deleting the default pdb "rook-ceph-osd" with maxUnavailable=1 for all osd
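While OSD 0 on machine-ki1 is down, the clusterdisruption-controller swaps the single default PDB for per-host blocking PDBs, so no other failure domain can be drained until the PGs are clean again; once everything is back to active+clean (00:58:18 below) the blocking PDBs are removed and the default is recreated. A hedged sketch of the two shapes involved (names and maxUnavailable values are from the log; the pod selector labels are illustrative and not verified against the controller's actual selectors):

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: rook-ceph-osd-host-machine-ki2    # temporary blocking PDB from the log
  namespace: rook-ceph
spec:
  maxUnavailable: 0
  selector:
    matchLabels:
      app: rook-ceph-osd                   # illustrative selector
      topology-location-host: machine-ki2  # illustrative; pins the PDB to one host failure domain
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: rook-ceph-osd                      # default PDB, restored once PGs are active+clean
  namespace: rook-ceph
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: rook-ceph-osd                   # illustrative selector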
2024-03-15 00:56:13.671472 I | clusterdisruption-controller: osd "rook-ceph-osd-0" is down but no node drain is detected
2024-03-15 00:56:14.613862 I | clusterdisruption-controller: osd is down in failure domain "machine-ki1". pg health: "cluster is not fully clean. PGs: [{StateName:active+undersized Count:74} {StateName:active+undersized+degraded Count:73} {StateName:active+clean Count:22}]"
2024-03-15 00:56:28.717552 I | op-osd: OSD orchestration status for node machine-ki1 is "orchestrating"
2024-03-15 00:56:28.717930 I | op-osd: OSD orchestration status for node machine-ki1 is "completed"
2024-03-15 00:56:28.730983 I | op-osd: OSD orchestration status for node machine-kw1 is "orchestrating"
2024-03-15 00:56:28.731243 I | op-osd: OSD orchestration status for node machine-kw1 is "completed"
2024-03-15 00:56:31.655878 I | op-osd: updating OSD 1 on node "machine-ki2"
2024-03-15 00:56:44.638232 I | clusterdisruption-controller: osd "rook-ceph-osd-1" is down but no node drain is detected
2024-03-15 00:56:45.543533 I | clusterdisruption-controller: osd is down in failure domain "machine-ki2". pg health: "cluster is not fully clean. PGs: [{StateName:active+undersized Count:76} {StateName:active+undersized+degraded Count:74} {StateName:active+clean Count:19}]"
2024-03-15 00:56:45.545641 I | clusterdisruption-controller: creating temporary blocking pdb "rook-ceph-osd-host-machine-ki1" with maxUnavailable=0 for "host" failure domain "machine-ki1"
2024-03-15 00:57:15.580612 I | clusterdisruption-controller: osd "rook-ceph-osd-1" is down but no node drain is detected
2024-03-15 00:57:16.494551 I | clusterdisruption-controller: osd is down in failure domain "machine-ki2". pg health: "all PGs in cluster are clean"
2024-03-15 00:57:16.498679 I | clusterdisruption-controller: deleting temporary blocking pdb with "rook-ceph-osd-host-machine-ki2" with maxUnavailable=0 for "host" failure domain "machine-ki2"
2024-03-15 00:57:16.737450 I | op-osd: waiting... 5 of 5 OSD prepare jobs have finished processing and 2 of 3 OSDs have been updated
2024-03-15 00:57:19.738451 I | op-osd: updating OSD 2 on node "machine-ki3"
2024-03-15 00:57:46.541424 I | clusterdisruption-controller: osd "rook-ceph-osd-2" is down but no node drain is detected
2024-03-15 00:57:47.480913 I | clusterdisruption-controller: osd is down in failure domain "machine-ki3". pg health: "cluster is not fully clean. PGs: [{StateName:active+undersized+degraded Count:74} {StateName:active+undersized Count:72} {StateName:active+clean Count:23}]"
2024-03-15 00:57:47.485172 I | clusterdisruption-controller: creating temporary blocking pdb "rook-ceph-osd-host-machine-ki2" with maxUnavailable=0 for "host" failure domain "machine-ki2"
2024-03-15 00:58:06.430001 I | cephclient: successfully disallowed pre-reef osds and enabled all new reef-only functionality
2024-03-15 00:58:07.260214 I | op-osd: finished running OSDs in namespace "rook-ceph"
2024-03-15 00:58:18.412110 I | clusterdisruption-controller: all PGs are active+clean. Restoring default OSD pdb settings
2024-03-15 00:58:18.412126 I | clusterdisruption-controller: creating the default pdb "rook-ceph-osd" with maxUnavailable=1 for all osd
2024-03-15 00:58:18.426279 I | clusterdisruption-controller: deleting temporary blocking pdb with "rook-ceph-osd-host-machine-ki1" with maxUnavailable=0 for "host" failure domain "machine-ki1"
2024-03-15 00:58:18.443908 I | clusterdisruption-controller: deleting temporary blocking pdb with "rook-ceph-osd-host-machine-ki2" with maxUnavailable=0 for "host" failure domain "machine-ki2"
2024-03-15 00:58:18.458473 I | clusterdisruption-controller: deleting temporary blocking pdb with "rook-ceph-osd-host-machine-ki3" with maxUnavailable=0 for "host" failure domain "machine-ki3"