Matan Breizman Matan-B

Reproduce the scenario (Steps 1-3) and apply the fix (Step 4):

1) Create clone object:

Create pool (single pg, no autoscale):

ceph osd pool create <pool_id> 1 1 --autoscale-mode=off

Put an object:

Matan-B / 57628.md
Last active September 19, 2023 13:23
same_interval_since != 0
Matan-B / Create corrupted snap mapper key.md
Last active August 17, 2023 07:52
Simplified version: 1 OSD, 1 pg, 1 object.

Create pool (single pg, no autoscale):

ceph osd pool create <pool_id> 1 1 --autoscale-mode=off

Put an object:

rados -p <pool_id> put objectone <obj_1>

Make a snapshot:

  • Before the change:

    Handled osd_map messages may have overlapping epoch ranges.
    While we try to avoid processing the same map more than once,
    we can be sure that it is safe to skip an already handled osd map
    *only* after it has been processed (and written to the superblock).
    
  • Logs:

INFO  2023-05-15 12:28:50,249 [shard 0] osd - handle_osd_map epochs [1..5], i have 0, src has [1..5]                                                                       
INFO  2023-05-15 12:28:51,245 [shard 0] osd - handle_osd_map epochs [5..6], i have 5, src has [1..6]                                                                       
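
The overlap in the two log lines above can be made concrete with a small sketch. It parses each `handle_osd_map` line and lists the epochs that are newer than the `i have` value. The regular expression, the `new_epochs` helper, and the rule "new means greater than `i have`" are illustrative assumptions based on these two samples, not the OSD's actual code.

```python
import re

# Pattern for the log lines shown above (assumed from those two samples only).
LOG_RE = re.compile(r"handle_osd_map epochs \[(\d+)\.\.(\d+)\], i have (\d+)")


def new_epochs(line):
    """Return the epochs in the message that are newer than what the OSD has."""
    m = LOG_RE.search(line)
    if m is None:
        return []
    first, last, have = (int(g) for g in m.groups())
    # Epochs at or below `have` were already handled, so start just past it.
    return list(range(max(first, have + 1), last + 1))


lines = [
    "handle_osd_map epochs [1..5], i have 0, src has [1..5]",
    "handle_osd_map epochs [5..6], i have 5, src has [1..6]",
]
for line in lines:
    print(new_epochs(line))
```

The second message overlaps the first at epoch 5, so only epoch 6 is new. Skipping epoch 5 is safe here only because `i have 5` reflects a map that was already processed and written to the superblock.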
Backport PRs:
https://github.com/ceph/ceph/pulls?q=is%3Apr+label%3A%22crimson+backport+reef%22
Test Build:
https://shaman.ceph.com/builds/ceph/wip-matanb-crimson-testing-21.5-reef
Test Run:
https://pulpito.ceph.com/matan-2023-05-22_07:51:45-crimson-rados-wip-matanb-crimson-testing-21.5-reef-distro-crimson-smithi/
https://pulpito.ceph.com/sjust-2023-05-21_21:38:43-crimson-rados-wip-matanb-crimson-testing-21.5-reef-distro-default-smithi/

Tracker: https://tracker.ceph.com/issues/59165

Note: The following run had ceph/ceph#51425 applied.

Step 1: Replacing the crash by skipping

With the new patch applied, we skip PGAdvanceMap ops whose to epoch is earlier than their from epoch (previously we crashed instead).

There are a few PGAdvanceMap events with a to epoch of 375 while from is already at epoch 376.
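
The skip condition described above can be sketched as a single comparison. The function name `should_skip_advance` and its parameters are hypothetical and chosen for illustration; they are not taken from the patch itself.

```python
def should_skip_advance(from_epoch, to_epoch):
    """Return True when the advance target is older than the epoch already reached."""
    return to_epoch < from_epoch


# Values from the run described above: `from` is already 376 while `to` is 375.
print(should_skip_advance(376, 375))
# A normal forward advance is not skipped.
print(should_skip_advance(375, 376))
```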

Matan-B / Balanced.md
Last active May 21, 2023 11:38
3329: oid 103 version is 35 and expected 381

Testing:

ceph_test_rados --balance-reads --max-ops 400000 --objects 1024 --max-in-flight 64 --size 4000000 --min-stride-size 400000 --max-stride-size 800000 --max-seconds 600 --op read 100 --op write 50 --op delete 50 --op snap_create 50 --op snap_remove 0 --op rollback 0 --op setattr 25 --op rmattr 25 --op copy_from 0 --op write_excl 50 --pool unique_pool_0
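
To get a rough feel for the operation mix this command requests, the sketch below normalizes the `--op` values into shares. It assumes the values act as relative weights, which is an assumption worth checking against the test tool's source; the `op_weights` dictionary and `op_shares` helper are illustrative names.

```python
# Weights copied from the --op arguments of the command above.
op_weights = {
    "read": 100, "write": 50, "delete": 50, "snap_create": 50,
    "snap_remove": 0, "rollback": 0, "setattr": 25, "rmattr": 25,
    "copy_from": 0, "write_excl": 50,
}


def op_shares(weights):
    """Return each op's fraction of the total weight (assumes relative weights)."""
    total = sum(weights.values())
    return {op: w / total for op, w in weights.items()}


shares = op_shares(op_weights)
for op, share in shares.items():
    print(f"{op}: {share:.3f}")
```

Under that assumption, ops with a weight of 0 (`snap_remove`, `rollback`, `copy_from`) are never selected.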

Failure traceback:

update_object_version oid 785 v 593 (ObjNum 2048 snap 305 seq_num 2048) dirty exists
3327:  left oid 785 (ObjNum 2048 snap 305 seq_num 2048)
3330:  finishing write tid 2 to folio031017823-831
3329: oid 103 version is 35 and expected 381
../src/test/osd/RadosModel.h: In function 'virtual void ReadOp::_finish(TestOp::CallbackInfo*)' thread 7efd96ad4700 time 2023-02-13T15:26:02.548504+0000                                     

Resolution:

/etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="nomodeset"
GRUB_GFXPAYLOAD=1024x768

Font:

sudo dpkg-reconfigure console-setup

Confs: