The cluster was in HEALTH_WARN
state with backfill errors, so I followed the advice from https://centosquestions.com/how-to-resolve-ceph-pool-getting-activeremappedbackfill_toofull/.
See the health:
# ceph health detail
HEALTH_WARN 1 backfillfull osd(s); 1 pool(s) backfillfull
OSD_BACKFILLFULL 1 backfillfull osd(s)
osd.8 is backfill full
POOL_BACKFILLFULL 1 pool(s) backfillfull
pool 'replicapool' is backfillfull
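For context, an OSD is flagged backfillfull when its used space crosses the cluster's backfillfull ratio, at which point it refuses to accept backfill (recovery/rebalance) data even though normal client writes still work. A minimal sketch of that classification, using the ratio defaults from recent Ceph releases (0.85 nearfull, 0.90 backfillfull, 0.95 full; your cluster's values may have been changed by an operator):

```python
# Sketch of how Ceph classifies an OSD's fill state against the cluster
# ratios. These are the documented defaults, not values read from a cluster.
NEARFULL_RATIO = 0.85      # warning threshold
BACKFILLFULL_RATIO = 0.90  # OSD stops accepting backfill data
FULL_RATIO = 0.95          # cluster blocks writes involving this OSD

def fill_state(used_fraction):
    """Classify an OSD by the fraction of its space that is used."""
    if used_fraction >= FULL_RATIO:
        return "full"
    if used_fraction >= BACKFILLFULL_RATIO:
        return "backfillfull"
    if used_fraction >= NEARFULL_RATIO:
        return "nearfull"
    return "ok"

print(fill_state(0.92))  # backfillfull: backfill onto this OSD stalls
```

This is why the PGs stay in backfill_toofull: the data has somewhere to go in principle, but osd.8 is past the backfill threshold and won't take it.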
You can see in the following image that some of the OSDs still have free space, but the data is not rebalancing onto them:
Now I trigger the rebalancing:
# ceph osd reweight-by-utilization
moved 4 / 384 (1.04167%)
avg 25.6
stddev 10.5249 -> 10.5502 (expected baseline 4.88808)
min osd.4 with 8 -> 8 pgs (0.3125 -> 0.3125 * mean)
max osd.1 with 39 -> 37 pgs (1.52344 -> 1.44531 * mean)
oload 120
max_change 0.05
max_change_osds 4
average_utilization 0.6327
overload_utilization 0.7592
osd.8 weight 1.0000 -> 0.9500
osd.1 weight 1.0000 -> 0.9500
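The output above can be read back from the command's parameters: with `oload 120`, any OSD more than 20% above the average utilization (0.6327 × 1.2 ≈ 0.7592, the `overload_utilization` line) is a candidate, and each candidate's reweight is lowered by at most `max_change 0.05`, for at most `max_change_osds 4` OSDs per run. A simplified sketch of that selection logic (not Ceph's actual implementation; the utilization numbers below are illustrative, chosen so the average roughly matches the 0.6327 from the run above):

```python
# Simplified sketch of the selection done by `ceph osd reweight-by-utilization`.
def reweight_candidates(utilizations, weights, oload=120, max_change=0.05,
                        max_osds=4):
    """Propose new reweight values for overloaded OSDs.

    utilizations: dict osd_id -> fraction of space used (0.0 - 1.0)
    weights:      dict osd_id -> current reweight (1.0 = full weight)
    """
    avg = sum(utilizations.values()) / len(utilizations)
    threshold = avg * oload / 100.0  # the "overload_utilization" line
    # Most-overloaded OSDs first, capped at max_osds per invocation.
    overloaded = sorted(
        (osd for osd, u in utilizations.items() if u > threshold),
        key=lambda osd: utilizations[osd], reverse=True)[:max_osds]
    return {osd: round(weights[osd] - max_change, 4) for osd in overloaded}

utils = {"osd.8": 0.88, "osd.1": 0.80, "osd.4": 0.22}  # illustrative
weights = {osd: 1.0 for osd in utils}
print(reweight_candidates(utils, weights))
# Both osd.8 and osd.1 sit above the threshold, so both drop to 0.95,
# matching the two "weight 1.0000 -> 0.9500" lines in the real output.
```

Lowering the reweight makes CRUSH place proportionally fewer PGs on those OSDs, which is what frees osd.8 enough for backfill to resume.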
Now they are all rebalanced:
And HEALTH_OK again:
# ceph health detail
HEALTH_OK