/ceph/teuthology-archive/teuthology-2021-02-03_03:31:04-rados-pacific-distro-basic-smithi/5852263/remote/smithi191/log/ceph-osd.11.log
first observation of state: active+recovery_unfound+undersized+degraded+remapped
2021-02-05T17:07:20.059+0000 7f05d5a29700 10 osd.11 pg_epoch: 14974 pg[3.1as0( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) async=[1(1),10(2)] r=0 lpr=542 pi=[418,542)/2 crt=854'670 lcod 854'669 mlcod 463'631 active+recovery_unfound+undersized+degraded+remapped mbc={0={(1+0)=2},1={(0+0)=2},2={(0+0)=2},3={(1+0)=2},4={(0+0)=2},5={(1+0)=2}} trimq=209] handle_advance_map: 14975
no prior crt reported; but since this is the epoch where state went wrong and crt still got updated, this should confirm crt handling needs update.
op_wq happened: 2 epoch acting set changes had,
2021-02-05T17:07:20.059+0000 7f05d5a29700 20 osd.11 op_wq(2) _process OpSchedulerItem(3ed: 14975 MNotifyRec 3.1as0 from 8(2) notify: (query:14975 sent:14975 3.1as2( v 453'629 (0'0,453'629] local-lis/les=418/419 n=10 ec=248/27 lis/c=537/418 les/c/f=538/419/0 sis=542) 2->0 ([418,541] all_participants=0(4),1(1),5(3),7(1),7(3),8(2),9(5),10(2),11(0) intervals=([456,460] acting 0(4),5(3),7(1),9(5),11(0)),([468,522] acting 1(1),5(3),9(5),10(2),11(0)))) features: 0x3f01cfb9fffdffff +create_info) prio 255 cost 10 e14975) pg 0x5590faa9d000.1as0 PGPeeringEvent(epoch_sent: 14975 epoch_request
roll_forward_to=823'668 penaltimate to the dirty crt
<pre>
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.204+0000 7f9bbf1d3700 10 osd.5 854 dequeue_op 0x56294738caa0 prio 127 cost 0 latency 0.000166 MOSDECSubOpWrite(3.1as3 854/542 ECSubWrite(tid=5606, reqid=client.4548.0:9098, at_version=854'670, trim_to=0'0, roll_forward_to=823'668)) v2 pg pg[3.1as3( v 854'669 (0'0,854'669] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=823'668 lcod 823'668 mlcod 823'668 active+remapped mbc={}]
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.204+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 854 pg[3.1as3( v 854'669 (0'0,854'669] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=823'668 lcod 823'668 mlcod 823'668 active+remapped mbc={}] _handle_message: MOSDECSubOpWrite(3.1as3 854/542 ECSubWrite(tid=5606, reqid=client.4548.0:9098, at_version=854'670, trim_to=0'0, roll_forward_to=823'668)) v2
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.204+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 854 pg[3.1as3( v 854'669 (0'0,854'669] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=823'668 lcod 823'668 mlcod 823'668 active+remapped mbc={}] append_log log((0'0,854'669], crt=823'668) [854'670 (854'669) modify 3:5a96ee3e:::smithi05121792-2:head by client.4548.0:9098 2021-02-05T04:53:51.202789+0000 0 ObjectCleanRegions clean_offsets: [0~3358720,3358720~18446744073706192895], clean_omap: 1, new_object: 0]
log((0'0,854'669] crt=823'668
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.204+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 854 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 lua=854'669 crt=823'668 lcod 823'668 mlcod 823'668 active+remapped mbc={}] add_log_entry 854'670 (854'669) modify 3:5a96ee3e:::smithi05121792-2:head by client.4548.0:9098 2021-02-05T04:53:51.202789+0000 0 ObjectCleanRegions clean_offsets: [0~3358720,3358720~18446744073706192895], clean_omap: 1, new_object: 0
in append_log >> add_log_entry >> didn't check transaction_applied which skips roll_forward from docs ObjectCleanRegions plays with partial recovery
add_log_entry 854'670
means we added pglog entry tells that we are not going
in PGLog append_missing instead log add happens from :
void PeeringState::add_log_entry(const pg_log_entry_t& e, bool applied)
{
// raise last_complete only if we were previously up to date
if (info.last_complete == info.last_update)
info.last_complete = e.version;
// raise last_update.
ceph_assert(e.version > info.last_update);
info.last_update = e.version;
// raise user_version, if it increased (it may have not get bumped
// by all logged updates)
if (e.user_version > info.last_user_version)
info.last_user_version = e.user_version;
// log mutation
pg_log.add(e, applied);
psdout(10) << "add_log_entry " << e << dendl;
}
we should add should_rollforward here as well. let's see what went wrong...
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.204+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 854 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 lua=854'669 crt=823'668 lcod 823'668 mlcod 823'668 active+remapped mbc={}] append_log approx pg log length = 670
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.204+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 854 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 lua=854'669 crt=823'668 lcod 823'668 mlcod 823'668 active+remapped mbc={}] append_log transaction_applied = 1
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.204+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 854 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 lua=854'669 crt=823'668 lcod 823'668 mlcod 823'668 active+remapped mbc={}] op_applied version 854'670
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.236+0000 7f9bbf1d3700 10 osd.5 854 dequeue_op 0x56294738d440 prio 127 cost 0 latency 0.000095 MOSDECSubOpWrite(3.1as3 854/542 ECSubWrite(tid=5607, reqid=client.4548.0:9098, at_version=0'0, trim_to=0'0, roll_forward_to=854'670)) v2 pg pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=823'668 mlcod 823'668 active+remapped mbc={}]
**updated: roll_forward_to=854'670 crt=823'668
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.236+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 854 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=823'668 mlcod 823'668 active+remapped mbc={}] _handle_message: MOSDECSubOpWrite(3.1as3 854/542 ECSubWrite(tid=5607, reqid=client.4548.0:9098, at_version=0'0, trim_to=0'0, roll_forward_to=854'670)) v2
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.236+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 854 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=823'668 mlcod 823'668 active+remapped mbc={}] scrubber pg(3.1as3) unreg_next_scrub existing-0.000000. was registered? 0
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.236+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 854 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=24
8/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=823'668 mlcod 823'668 active+remapped mbc={}] append_log log((0'0,854'670], crt=823'668) []
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.236+0000 7f9bbf1d3700 20 osd.5 pg_epoch: 854 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=854'670 mlcod 823'668 active+remapped mbc={}] rollforward: entry=854'669 (823'668) modify 3:5a96ee3e:::smithi05121792-2:head by client.4548.0:9097 2021-02-05T04:53:51.202769+0000 0 ObjectCleanRegions clean_offsets: [3358720~18446744073706192895], clean_omap: 0, new_object: 0
this point we updated crt value
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.236+0000 7f9bbf1d3700 20 osd.5 pg_epoch: 854 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=854'670 mlcod 823'668 active+remapped mbc={}] rollforward: entry=854'670 (854'669) modify 3:5a96ee3e:::smithi05121792-2:head by client.4548.0:9098 2021-02-05T04:53:51.202789+0000 0 ObjectCleanRegions clean_offsets: [0~3358720,3358720~18446744073706192895], clean_omap: 1, new_object: 0
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.236+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 854 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=854'670 mlcod 823'668 active+remapped mbc={}] append_log approx pg log length = 670
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:51.236+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 854 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=854'670 mlcod 823'668 active+remapped mbc={}] append_log transaction_applied = 1
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:52.532+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 854 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=854'670 mlcod 854'670 active+remapped mbc={}] handle_advance_map: 855
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:52.532+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 854 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=854'670 mlcod 854'670 active+remapped mbc={}] handle_advance_map [11,1,10,5,0,9]/[11,2147483647,2147483647,5,0,9] -- 11/11
./remote/smithi134/log/ceph-osd.5.log.gz:2021-02-05T04:53:52.532+0000 7f9bbf1d3700 10 osd.5 pg_epoch: 855 pg[3.1as3( v 854'670 (0'0,854'670] local-lis/les=542/543 n=22 ec=248/27 lis/c=542/418 les/c/f=543/419/0 sis=542) [11,1,10,5,0,9]/[11,NONE,NONE,5,0,9]p11(0) r=3 lpr=542 pi=[418,542)/2 luod=0'0 crt=854'670 mlcod 854'670 active+remapped mbc={}
see teuthology: /home/ideepika/crt.log
lowest epoch where the state was registered