@liu-chunmei
Last active August 31, 2021 02:16
fio test for crimson-osd

fio: Rough perf testing of crimson-osd with interruptible future and new LBA tree

Scenario

  • 4 KB random reads and writes from one fio instance, with numjobs set to 1, 2, or 4 depending on the run
  • single OSD instance
  • deployed on a Purley server with vstart.sh
  • crimson-seastore

Version

master (commit ID 7e7ba588ccab0918b7579d8ca67bbd49b2a81a22) + PR #42901

The cycles-per-op metric

crimson-osd (new) with interruptible future and new LBA tree

iodepth=2, numjobs=1                                       iodepth=16, numjobs=1
write: 849,380,915,123 / 97103 = 8,747,215                 write: 853,644,231,165 / 85582 = 9,974,576
read:  816,828,521,188 / 1156323 = 706,401                 read:  845,300,737,306 / 3949506 = 214,026

iodepth=32, numjobs=2                                      iodepth=64, numjobs=4
write: 863,330,801,891 / 76547 = 11,278,440                write: 872,813,077,856 / 65806 = 13,263,427
read:  828,604,482,838 / 3018234 = 274,532                 read:  806,917,132,774 / 4216684 = 191,362

ceph-osd

iodepth=2, numjobs=1                                       iodepth=16, numjobs=1
write: 116,427,446,191 / 106601 = 1,092,179                write: 251,035,889,726 / 524288 = 478,812
read:  212,223,757,354 / 1563896 = 135,701                 read:  391,659,644,461 / 2326574 = 168,341

iodepth=32, numjobs=2                                      iodepth=64, numjobs=4
write: 517,119,671,802 / 1048576 = 493,163                 write: 1,060,218,192,398 / 2097152 = 505,551
read:  439,450,395,960 / 2507796 = 175,233                 read:  509,714,046,081 / 2847645 = 178,994

crimson-osd (old) without interruptible future and with old LBA tree

iodepth=2, numjobs=1
write: 839,463,751,717 / 73053 = 11,491,160
read:  844,600,463,654 / 1507616 = 560,222
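
Each cycles-per-op figure above is the `cycles` count reported by `perf stat` for the OSD process, divided by the number of I/Os fio issued (the `issued rwts` total), with the fractional part dropped. The following is a minimal Python sketch of that arithmetic using three rows from the tables. The function name `cycles_per_op` is illustrative and does not come from the original test scripts.

```python
def cycles_per_op(cycles: int, ops: int) -> int:
    """CPU cycles spent by the OSD process per fio I/O, truncated to an integer."""
    if ops <= 0:
        raise ValueError("ops must be positive")
    return cycles // ops


# crimson-osd (new), iodepth=2, numjobs=1:
# cycles come from perf stat, ops from fio's "issued rwts" line.
crimson_write = cycles_per_op(849_380_915_123, 97_103)
crimson_read = cycles_per_op(816_828_521_188, 1_156_323)

# ceph-osd, same iodepth and numjobs.
ceph_write = cycles_per_op(116_427_446_191, 106_601)

print(f"crimson-osd write: {crimson_write:,}")  # 8,747,215
print(f"crimson-osd read:  {crimson_read:,}")   # 706,401
print(f"ceph-osd write:    {ceph_write:,}")     # 1,092,179
```

Integer division is used here because the published figures match the truncated quotient rather than a rounded one.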

The avg-lat metric

crimson-osd (new) with interruptible future and new LBA tree

iodepth=2, numjobs=1            iodepth=16, numjobs=1           iodepth=32, numjobs=2                  iodepth=64, numjobs=4
write: 4941.41(us)              write: 44921.91(us)             write: 201076.00(us)                   write: 941174.55(us)
read:  413.39(us)               read: 971.33(us)                read: 5087.98(us)                      read: 14572.29(us)

ceph-osd

iodepth=2, numjobs=1            iodepth=16, numjobs=1           iodepth=32, numjobs=2                  iodepth=64, numjobs=4
write: 4496.84(us)              write: 5311.12(us)              write: 7757.64(us)                     write: 27790(us)
read:  305.30(us)               read:  1649.07(us)              read:  6123.16(us)                     read:  21574(us)

crimson-osd (old) without interruptible future and with old LBA tree

iodepth=2, numjobs=1   
write:  6569.57 (us)
read:   317.26 (us)
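
The avg-lat values are the `avg=` field of the total-latency `lat` line in each fio report (submission plus completion latency). fio picks the unit per run (nsec, usec, or msec), so a helper that normalizes to microseconds makes runs easier to compare. The sketch below is one possible way to extract the value; `avg_lat_usec` is a hypothetical helper, not part of fio or of the original test scripts.

```python
import re

# Conversion factors from fio's reported unit to microseconds.
_TO_USEC = {"nsec": 1e-3, "usec": 1.0, "msec": 1e3}

# Matches the total-latency line, e.g.
#   "     lat (usec): min=288, max=141322, avg=4941.41, stdev=12971.97"
# It skips "slat"/"clat" lines and the "lat (usec)   : 500=..." histogram lines.
_LAT_RE = re.compile(
    r"^[ \t]*lat \((nsec|usec|msec)\):.*?avg=\s*([0-9.]+)", re.MULTILINE
)


def avg_lat_usec(fio_output: str) -> float:
    """Return the average total latency from a fio report, in microseconds."""
    match = _LAT_RE.search(fio_output)
    if match is None:
        raise ValueError("no total-latency line found")
    unit, avg = match.groups()
    return float(avg) * _TO_USEC[unit]


sample = """\
    slat (usec): min=2, max=197, avg=12.64, stdev= 5.26
    clat (usec): min=279, max=141313, avg=4928.76, stdev=12971.49
     lat (usec): min=288, max=141322, avg=4941.41, stdev=12971.97
  lat (usec)   : 500=41.13%, 750=29.08%, 1000=9.41%
"""
print(avg_lat_usec(sample))  # 4941.41
```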

crimson Build configuration

./do_cmake.sh -DWITH_SEASTAR=ON -DWITH_MGR_DASHBOARD_FRONTEND=OFF -DCMAKE_BUILD_TYPE=RelWithDebInfo 

crimson Deployment

[build]$ MDS=0 MGR=1 OSD=1 MON=1 ../src/vstart.sh -n --without-dashboard --seastore -X --crimson --seastore-devs /dev/nvme1n1

[build]$ bin/ceph osd pool create rbd 128 128 && bin/ceph osd pool set --yes-i-really-mean-it rbd size 1 && bin/ceph osd pool --yes-i-really-mean-it set rbd min_size 1 

[build]$ bin/rbd create fio_test --size 2G --image-format=2 --rbd_default_features=3
...

crimson 4 KB random write, iodepth=2, numjobs=1

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
[1] 731768
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=2
fio-3.26-46-g30be
Starting 1 process
Jobs: 1 (f=1): [w(1)][100.0%][w=740KiB/s][w=185 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=1): err= 0: pid=731857: Thu Aug 26 20:58:41 2021
  write: IOPS=404, BW=1618KiB/s (1657kB/s)(379MiB/240065msec); 0 zone resets
    slat (usec): min=2, max=197, avg=12.64, stdev= 5.26
    clat (usec): min=279, max=141313, avg=4928.76, stdev=12971.49
     lat (usec): min=288, max=141322, avg=4941.41, stdev=12971.97
    clat percentiles (usec):
     |  1.00th=[   322],  5.00th=[   347], 10.00th=[   359], 20.00th=[   388],
     | 30.00th=[   461], 40.00th=[   494], 50.00th=[   562], 60.00th=[   635],
     | 70.00th=[   742], 80.00th=[  1020], 90.00th=[ 16581], 95.00th=[ 41157],
     | 99.00th=[ 60031], 99.50th=[ 63701], 99.90th=[ 77071], 99.95th=[ 88605],
     | 99.99th=[102237]
   bw (  KiB/s): min=  545, max=13024, per=100.00%, avg=1620.97, stdev=1695.99, samples=479
   iops        : min=  136, max= 3256, avg=405.14, stdev=424.03, samples=479
  lat (usec)   : 500=41.13%, 750=29.08%, 1000=9.41%
  lat (msec)   : 2=5.17%, 4=2.19%, 10=1.84%, 20=1.85%, 50=6.67%
  lat (msec)   : 100=2.66%, 250=0.01%
  cpu          : usr=0.88%, sys=0.48%, ctx=96922, majf=0, minf=927
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,97103,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
  WRITE: bw=1618KiB/s (1657kB/s), 1618KiB/s-1618KiB/s (1657kB/s-1657kB/s), io=379MiB (398MB), run=240065-240065msec

Disk stats (read/write):
    dm-0: ios=1/7213, merge=0/0, ticks=0/1468, in_queue=1468, util=0.95%, aggrios=1/3509, aggrmerge=0/3704, aggrticks=1/976, aggrin_queue=0, aggrutil=0.94%
  sda: ios=1/3509, merge=0/3704, ticks=1/976, in_queue=0, util=0.94%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '731575':

        237,546.19 msec task-clock                #    0.985 CPUs utilized
             2,732      context-switches          #    0.012 K/sec
                 0      cpu-migrations            #    0.000 K/sec
             4,485      page-faults               #    0.019 K/sec
   849,380,915,123      cycles                    #    3.576 GHz
 1,587,003,778,529      instructions              #    1.87  insn per cycle
   299,570,132,610      branches                  # 1261.103 M/sec
     2,280,886,622      branch-misses             #    0.76% of all branches

     241.057670733 seconds time elapsed
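
The two inputs to the cycles-per-op figure for this run can be read straight from the log above: the `cycles` counter from `perf stat` and the `issued rwts` total from fio (fields are read, write, trim, sync). Below is a rough Python sketch of that extraction. The function names are hypothetical, and it assumes a single-direction run where reads plus writes equals the op count.

```python
import re


def parse_cycles(perf_output: str) -> int:
    """Extract the 'cycles' counter from perf stat output."""
    match = re.search(r"^\s*([\d,]+)\s+cycles\b", perf_output, re.MULTILINE)
    if match is None:
        raise ValueError("no cycles counter found")
    return int(match.group(1).replace(",", ""))


def parse_issued_ios(fio_output: str) -> int:
    """Sum the read and write counts from fio's 'issued rwts' line."""
    match = re.search(r"issued rwts: total=(\d+),(\d+),", fio_output)
    if match is None:
        raise ValueError("no 'issued rwts' line found")
    reads, writes = int(match.group(1)), int(match.group(2))
    return reads + writes


# Excerpt of the combined fio + perf stat output for the run above.
log = """\
     issued rwts: total=0,97103,0,0 short=0,0,0,0 dropped=0,0,0,0
   849,380,915,123      cycles                    #    3.576 GHz
 1,587,003,778,529      instructions              #    1.87  insn per cycle
"""

cycles = parse_cycles(log)
ios = parse_issued_ios(log)
print(cycles // ios)  # 8747215
```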

...

crimson 4 KB random read, iodepth=2, numjobs=1

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
[1] 731875
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=2
fio-3.26-46-g30be
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=19.1MiB/s][r=4879 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=1): err= 0: pid=731964: Thu Aug 26 21:05:37 2021
  read: IOPS=4817, BW=18.8MiB/s (19.7MB/s)(4517MiB/240044msec)
    slat (usec): min=2, max=872, avg= 9.06, stdev= 2.09
    clat (usec): min=122, max=49081, avg=404.33, stdev=491.22
     lat (usec): min=129, max=49092, avg=413.39, stdev=491.27
    clat percentiles (usec):
     |  1.00th=[  186],  5.00th=[  192], 10.00th=[  198], 20.00th=[  265],
     | 30.00th=[  281], 40.00th=[  297], 50.00th=[  371], 60.00th=[  424],
     | 70.00th=[  461], 80.00th=[  482], 90.00th=[  562], 95.00th=[  644],
     | 99.00th=[  848], 99.50th=[ 2442], 99.90th=[ 5604], 99.95th=[ 5735],
     | 99.99th=[ 6063]
   bw (  KiB/s): min=14616, max=23992, per=100.00%, avg=19292.62, stdev=1300.37, samples=479
   iops        : min= 3654, max= 5998, avg=4823.03, stdev=325.08, samples=479
  lat (usec)   : 250=17.27%, 500=66.16%, 750=14.79%, 1000=1.00%
  lat (msec)   : 2=0.09%, 4=0.35%, 10=0.34%, 20=0.01%, 50=0.01%
  cpu          : usr=5.92%, sys=5.77%, ctx=1149534, majf=0, minf=664
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=1156323,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
   READ: bw=18.8MiB/s (19.7MB/s), 18.8MiB/s-18.8MiB/s (19.7MB/s-19.7MB/s), io=4517MiB (4736MB), run=240044-240044msec

Disk stats (read/write):
    dm-0: ios=0/7204, merge=0/0, ticks=0/1248, in_queue=1248, util=0.96%, aggrios=0/3411, aggrmerge=0/3799, aggrticks=0/993, aggrin_queue=36, aggrutil=0.97%
  sda: ios=0/3411, merge=0/3799, ticks=0/993, in_queue=36, util=0.97%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '731575':

        238,080.54 msec task-clock                #    0.988 CPUs utilized
             4,860      context-switches          #    0.020 K/sec
                 0      cpu-migrations            #    0.000 K/sec
             6,932      page-faults               #    0.029 K/sec
   816,828,521,188      cycles                    #    3.431 GHz
 1,014,637,665,743      instructions              #    1.24  insn per cycle
   195,172,317,521      branches                  #  819.774 M/sec
     1,973,657,343      branch-misses             #    1.01% of all branches

     241.025207329 seconds time elapsed

...

crimson 4 KB random write, iodepth=16, numjobs=1

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=16
fio-3.26-46-g30be
Starting 1 process
Jobs: 1 (f=1): [w(1)][100.0%][w=604KiB/s][w=151 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=1): err= 0: pid=734840: Thu Aug 26 22:05:35 2021
  write: IOPS=356, BW=1424KiB/s (1459kB/s)(334MiB/240319msec); 0 zone resets
    slat (usec): min=2, max=886, avg=16.46, stdev= 9.39
    clat (usec): min=312, max=5851.9k, avg=44905.45, stdev=190418.73
     lat (usec): min=323, max=5851.9k, avg=44921.91, stdev=190419.06
    clat percentiles (usec):
     |  1.00th=[    388],  5.00th=[    498], 10.00th=[    570],
     | 20.00th=[    717], 30.00th=[    889], 40.00th=[   1156],
     | 50.00th=[   1467], 60.00th=[   2073], 70.00th=[   3359],
     | 80.00th=[  14615], 90.00th=[  89654], 95.00th=[ 206570],
     | 99.00th=[ 876610], 99.50th=[1283458], 99.90th=[2499806],
     | 99.95th=[3103785], 99.99th=[4462740]
   bw (  KiB/s): min=  280, max=12296, per=100.00%, avg=1426.47, stdev=1401.86, samples=480
   iops        : min=   70, max= 3074, avg=356.53, stdev=350.50, samples=480
  lat (usec)   : 500=5.24%, 750=16.52%, 1000=13.96%
  lat (msec)   : 2=23.40%, 4=13.09%, 10=6.46%, 20=2.18%, 50=4.53%
  lat (msec)   : 100=5.35%, 250=5.06%, 500=2.12%, 750=0.82%, 1000=0.44%
  lat (msec)   : 2000=0.64%, >=2000=0.18%
  cpu          : usr=1.00%, sys=0.69%, ctx=69877, majf=0, minf=616
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,85582,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=1424KiB/s (1459kB/s), 1424KiB/s-1424KiB/s (1459kB/s-1459kB/s), io=334MiB (351MB), run=240319-240319msec

Disk stats (read/write):
    dm-0: ios=0/7197, merge=0/0, ticks=0/1256, in_queue=1256, util=0.95%, aggrios=0/3385, aggrmerge=0/3811, aggrticks=0/960, aggrin_queue=0, aggrutil=0.95%
  sda: ios=0/3385, merge=0/3811, ticks=0/960, in_queue=0, util=0.95%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '734557':

        238,363.54 msec task-clock                #    0.988 CPUs utilized
             1,814      context-switches          #    0.008 K/sec
                 0      cpu-migrations            #    0.000 K/sec
             4,252      page-faults               #    0.018 K/sec
   853,644,231,165      cycles                    #    3.581 GHz
 1,847,516,249,229      instructions              #    2.16  insn per cycle
   342,609,446,702      branches                  # 1437.340 M/sec
     1,734,187,807      branch-misses             #    0.51% of all branches

     241.317734095 seconds time elapsed

...

crimson 4 KB random read, iodepth=16, numjobs=1

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
[1] 734846
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=16
fio-3.26-46-g30be
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=65.5MiB/s][r=16.8k IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=1): err= 0: pid=734935: Thu Aug 26 22:13:58 2021
  read: IOPS=16.5k, BW=64.3MiB/s (67.4MB/s)(15.1GiB/240045msec)
    slat (nsec): min=673, max=1168.1k, avg=5827.12, stdev=3687.20
    clat (usec): min=133, max=58194, avg=965.50, stdev=369.13
     lat (usec): min=135, max=58203, avg=971.33, stdev=369.08
    clat percentiles (usec):
     |  1.00th=[  461],  5.00th=[  545], 10.00th=[  619], 20.00th=[  701],
     | 30.00th=[  750], 40.00th=[  807], 50.00th=[  881], 60.00th=[  971],
     | 70.00th=[ 1106], 80.00th=[ 1237], 90.00th=[ 1401], 95.00th=[ 1582],
     | 99.00th=[ 1975], 99.50th=[ 2147], 99.90th=[ 3130], 99.95th=[ 5669],
     | 99.99th=[ 6325]
   bw (  KiB/s): min=54328, max=69763, per=100.00%, avg=65893.30, stdev=2005.11, samples=479
   iops        : min=13582, max=17440, avg=16473.18, stdev=501.22, samples=479
  lat (usec)   : 250=0.01%, 500=2.67%, 750=27.25%, 1000=32.84%
  lat (msec)   : 2=36.36%, 4=0.79%, 10=0.06%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%
  cpu          : usr=15.56%, sys=10.65%, ctx=2117761, majf=0, minf=436
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=3949506,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=64.3MiB/s (67.4MB/s), 64.3MiB/s-64.3MiB/s (67.4MB/s-67.4MB/s), io=15.1GiB (16.2GB), run=240045-240045msec

Disk stats (read/write):
    dm-0: ios=0/7316, merge=0/0, ticks=0/1256, in_queue=1256, util=0.96%, aggrios=0/3500, aggrmerge=0/3821, aggrticks=0/1027, aggrin_queue=28, aggrutil=0.96%
  sda: ios=0/3500, merge=0/3821, ticks=0/1027, in_queue=28, util=0.96%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '734557':

        239,811.00 msec task-clock                #    0.995 CPUs utilized
             2,857      context-switches          #    0.012 K/sec
                 0      cpu-migrations            #    0.000 K/sec
             6,812      page-faults               #    0.028 K/sec
   845,300,737,306      cycles                    #    3.525 GHz
   914,508,718,913      instructions              #    1.08  insn per cycle
   177,108,687,729      branches                  #  738.534 M/sec
     1,477,158,821      branch-misses             #    0.83% of all branches

     241.038307869 seconds time elapsed

...

crimson 4 KB random write, iodepth=32, numjobs=2

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
[1] 732740
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=32
...
fio-3.26-46-g30be
Starting 2 processes
Jobs: 2 (f=2): [w(2)][100.0%][w=524KiB/s][w=131 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=732847: Thu Aug 26 21:19:47 2021
  write: IOPS=318, BW=1273KiB/s (1304kB/s)(299MiB/240521msec); 0 zone resets
    slat (nsec): min=1755, max=183782, avg=21054.45, stdev=10139.99
    clat (usec): min=290, max=13937k, avg=201054.95, stdev=731352.12
     lat (usec): min=303, max=13937k, avg=201076.00, stdev=731352.77
    clat percentiles (usec):
     |  1.00th=[     457],  5.00th=[     668], 10.00th=[     824],
     | 20.00th=[    1156], 30.00th=[    1516], 40.00th=[    1942],
     | 50.00th=[    2671], 60.00th=[    4490], 70.00th=[   24249],
     | 80.00th=[   91751], 90.00th=[  392168], 95.00th=[ 1061159],
     | 99.00th=[ 4076864], 99.50th=[ 5268046], 99.90th=[ 7549748],
     | 99.95th=[ 8355054], 99.99th=[10267657]
   bw (  KiB/s): min=   96, max=13936, per=100.00%, avg=1274.95, stdev=769.15, samples=960
   iops        : min=   24, max= 3484, avg=318.69, stdev=192.29, samples=960
  lat (usec)   : 500=1.50%, 750=6.05%, 1000=8.89%
  lat (msec)   : 2=24.74%, 4=17.05%, 10=8.28%, 20=2.78%, 50=5.60%
  lat (msec)   : 100=5.72%, 250=6.60%, 500=4.18%, 750=2.06%, 1000=1.29%
  lat (msec)   : 2000=2.45%, >=2000=2.81%
  cpu          : usr=0.59%, sys=0.43%, ctx=63311, majf=0, minf=1553
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=99.9%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,76547,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
  WRITE: bw=1273KiB/s (1304kB/s), 1273KiB/s-1273KiB/s (1304kB/s-1304kB/s), io=299MiB (314MB), run=240521-240521msec

Disk stats (read/write):
    dm-0: ios=0/7317, merge=0/0, ticks=0/1216, in_queue=1216, util=0.95%, aggrios=0/3529, aggrmerge=0/3824, aggrticks=0/990, aggrin_queue=0, aggrutil=0.96%
  sda: ios=0/3529, merge=0/3824, ticks=0/990, in_queue=0, util=0.96%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '732546':

        240,156.96 msec task-clock                #    0.994 CPUs utilized
             1,500      context-switches          #    0.006 K/sec
                 0      cpu-migrations            #    0.000 K/sec
             4,088      page-faults               #    0.017 K/sec
   863,330,801,891      cycles                    #    3.595 GHz
 1,991,186,784,477      instructions              #    2.31  insn per cycle
   366,302,053,004      branches                  # 1525.261 M/sec
     1,404,455,941      branch-misses             #    0.38% of all branches

     241.543718876 seconds time elapsed


...

crimson 4 KB random read, iodepth=32, numjobs=2

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=32
...
fio-3.26-46-g30be
Starting 2 processes
Jobs: 2 (f=0): [f(2)][100.0%][r=47.9MiB/s][r=12.3k IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=733080: Thu Aug 26 21:34:41 2021
  read: IOPS=12.6k, BW=49.1MiB/s (51.5MB/s)(11.5GiB/240031msec)
    slat (nsec): min=1220, max=807470, avg=7418.98, stdev=4721.60
    clat (usec): min=161, max=55441, avg=5080.56, stdev=2467.13
     lat (usec): min=170, max=55445, avg=5087.98, stdev=2467.06
    clat percentiles (usec):
     |  1.00th=[ 1020],  5.00th=[ 1942], 10.00th=[ 2474], 20.00th=[ 3064],
     | 30.00th=[ 3654], 40.00th=[ 4080], 50.00th=[ 4555], 60.00th=[ 5145],
     | 70.00th=[ 5932], 80.00th=[ 6718], 90.00th=[ 8029], 95.00th=[ 9503],
     | 99.00th=[13042], 99.50th=[14746], 99.90th=[20055], 99.95th=[22676],
     | 99.99th=[30540]
   bw (  KiB/s): min=35741, max=56064, per=100.00%, avg=50350.54, stdev=730.34, samples=958
   iops        : min= 8935, max=14016, avg=12587.36, stdev=182.56, samples=958
  lat (usec)   : 250=0.01%, 500=0.08%, 750=0.16%, 1000=0.62%
  lat (msec)   : 2=5.21%, 4=30.30%, 10=59.67%, 20=3.87%, 50=0.10%
  lat (msec)   : 100=0.01%
  cpu          : usr=7.21%, sys=5.04%, ctx=1773849, majf=0, minf=636
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=3018234,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=49.1MiB/s (51.5MB/s), 49.1MiB/s-49.1MiB/s (51.5MB/s-51.5MB/s), io=11.5GiB (12.4GB), run=240031-240031msec

Disk stats (read/write):
    dm-0: ios=0/7436, merge=0/0, ticks=0/1240, in_queue=1240, util=0.97%, aggrios=0/3450, aggrmerge=0/3992, aggrticks=0/981, aggrin_queue=16, aggrutil=0.97%
  sda: ios=0/3450, merge=0/3992, ticks=0/981, in_queue=16, util=0.97%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '732546':

        239,905.76 msec task-clock                #    0.995 CPUs utilized
             5,275      context-switches          #    0.022 K/sec
                 0      cpu-migrations            #    0.000 K/sec
             5,423      page-faults               #    0.023 K/sec
   828,604,482,838      cycles                    #    3.454 GHz
   794,052,342,764      instructions              #    0.96  insn per cycle
   150,969,903,452      branches                  #  629.288 M/sec
     2,131,108,010      branch-misses             #    1.41% of all branches

     241.116795147 seconds time elapsed

...

crimson 4 KB random write, iodepth=64, numjobs=4

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
[1] 733768
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=64
...
fio-3.26-46-g30be
Starting 4 processes
Jobs: 4 (f=4): [w(4)][3.1%][w=16KiB/s][w=4 IOPS][eta 02h:07m:43s]
rbd_iodepth32: (groupid=0, jobs=4): err= 0: pid=733911: Thu Aug 26 21:45:53 2021
  write: IOPS=271, BW=1088KiB/s (1114kB/s)(257MiB/241978msec); 0 zone resets
    slat (usec): min=2, max=145, avg=23.88, stdev= 9.25
    clat (usec): min=319, max=33879k, avg=941150.68, stdev=2717540.49
     lat (usec): min=333, max=33879k, avg=941174.55, stdev=2717541.28
    clat percentiles (usec):
     |  1.00th=[     783],  5.00th=[    1287], 10.00th=[    1631],
     | 20.00th=[    2180], 30.00th=[    2933], 40.00th=[    4146],
     | 50.00th=[   10552], 60.00th=[   67634], 70.00th=[  252707],
     | 80.00th=[  843056], 90.00th=[ 2499806], 95.00th=[ 5066720],
     | 99.00th=[15099495], 99.50th=[17112761], 99.90th=[17112761],
     | 99.95th=[17112761], 99.99th=[17112761]
   bw (  KiB/s): min=   32, max=12880, per=100.00%, avg=1090.89, stdev=362.38, samples=1923
   iops        : min=    8, max= 3220, avg=272.71, stdev=90.59, samples=1923
  lat (usec)   : 500=0.26%, 750=0.58%, 1000=1.71%
  lat (msec)   : 2=14.50%, 4=22.14%, 10=10.56%, 20=2.57%, 50=4.75%
  lat (msec)   : 100=5.86%, 250=7.02%, 500=5.50%, 750=3.46%, 1000=2.76%
  lat (msec)   : 2000=6.43%, >=2000=11.90%
  cpu          : usr=0.32%, sys=0.20%, ctx=55806, majf=0, minf=2253
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.2%, >=64=99.6%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,65806,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=1088KiB/s (1114kB/s), 1088KiB/s-1088KiB/s (1114kB/s-1114kB/s), io=257MiB (270MB), run=241978-241978msec

Disk stats (read/write):
    dm-0: ios=0/7390, merge=0/0, ticks=0/1044, in_queue=1044, util=0.95%, aggrios=0/3452, aggrmerge=0/3938, aggrticks=0/976, aggrin_queue=0, aggrutil=0.95%
  sda: ios=0/3452, merge=0/3938, ticks=0/976, in_queue=0, util=0.95%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '733571':

        241,896.62 msec task-clock                #    0.995 CPUs utilized
             1,331      context-switches          #    0.006 K/sec
                 0      cpu-migrations            #    0.000 K/sec
             3,978      page-faults               #    0.016 K/sec
   872,813,077,856      cycles                    #    3.608 GHz
 2,122,692,945,307      instructions              #    2.43  insn per cycle
   387,959,046,622      branches                  # 1603.822 M/sec
     1,093,863,866      branch-misses             #    0.28% of all branches

     243.055255400 seconds time elapsed

...

crimson 4 KB random read, iodepth=64, numjobs=4

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
[1] 733921
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=64
...
fio-3.26-46-g30be
Starting 4 processes
Jobs: 4 (f=4): [r(4)][100.0%][r=64.8MiB/s][r=16.6k IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=4): err= 0: pid=734064: Thu Aug 26 21:56:28 2021
  read: IOPS=17.6k, BW=68.6MiB/s (71.9MB/s)(16.1GiB/240060msec)
    slat (nsec): min=629, max=1973.0k, avg=7791.30, stdev=6630.43
    clat (usec): min=445, max=487018, avg=14564.50, stdev=16908.89
     lat (usec): min=472, max=487027, avg=14572.29, stdev=16908.91
    clat percentiles (msec):
     |  1.00th=[    6],  5.00th=[    8], 10.00th=[    9], 20.00th=[   10],
     | 30.00th=[   11], 40.00th=[   11], 50.00th=[   12], 60.00th=[   12],
     | 70.00th=[   13], 80.00th=[   15], 90.00th=[   18], 95.00th=[   27],
     | 99.00th=[  102], 99.50th=[  132], 99.90th=[  194], 99.95th=[  222],
     | 99.99th=[  388]
   bw (  KiB/s): min=45855, max=92116, per=100.00%, avg=70357.03, stdev=683.47, samples=1916
   iops        : min=11463, max=23029, avg=17589.13, stdev=170.88, samples=1916
  lat (usec)   : 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.03%, 4=0.35%, 10=30.82%, 20=61.18%, 50=4.82%
  lat (msec)   : 100=1.77%, 250=1.00%, 500=0.03%
  cpu          : usr=5.91%, sys=3.69%, ctx=2308146, majf=0, minf=1204
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=4216684,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=68.6MiB/s (71.9MB/s), 68.6MiB/s-68.6MiB/s (71.9MB/s-71.9MB/s), io=16.1GiB (17.3GB), run=240060-240060msec

Disk stats (read/write):
    dm-0: ios=0/7352, merge=0/0, ticks=0/1188, in_queue=1188, util=0.97%, aggrios=0/3501, aggrmerge=0/3900, aggrticks=0/1008, aggrin_queue=20, aggrutil=0.97%
  sda: ios=0/3501, merge=0/3900, ticks=0/1008, in_queue=20, util=0.97%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '733571':

        239,953.06 msec task-clock                #    0.995 CPUs utilized
             1,314      context-switches          #    0.005 K/sec
                 0      cpu-migrations            #    0.000 K/sec
             3,001      page-faults               #    0.013 K/sec
   806,917,132,774      cycles                    #    3.363 GHz
   882,138,570,396      instructions              #    1.09  insn per cycle
   171,159,546,368      branches                  #  713.304 M/sec
     1,274,932,151      branch-misses             #    0.74% of all branches

     241.217857774 seconds time elapsed

...

ceph options configuration

ms_async_op_threads = 1
osd_op_num_threads_per_shard = 1
osd_op_num_shards = 1

ceph Build configuration

./do_cmake.sh -DWITH_MGR_DASHBOARD_FRONTEND=OFF -DCMAKE_BUILD_TYPE=RelWithDebInfo 

ceph Deployment

[build]$ MDS=0 MGR=1 OSD=1 MON=1 ../src/vstart.sh -n --without-dashboard --bluestore -X --bluestore-devs /dev/nvme1n1            

[build]$ bin/ceph osd pool create rbd 128 128 && bin/ceph osd pool set --yes-i-really-mean-it rbd size 1 && bin/ceph osd pool --yes-i-really-mean-it set rbd min_size 1 

[build]$ bin/rbd create fio_test --size 2G --image-format=2 --rbd_default_features=3

[build]$ taskset -cp 0 <ceph-osd PID>
...

ceph 4 KB random write, iodepth=2, numjobs=1

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
[1] 790090
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=2
fio-3.26-46-g30be
Starting 1 process
Jobs: 1 (f=1): [w(1)][100.0%][w=808KiB/s][w=202 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=1): err= 0: pid=790177: Mon Aug 30 17:28:07 2021
  write: IOPS=444, BW=1777KiB/s (1819kB/s)(416MiB/240008msec); 0 zone resets
    slat (usec): min=3, max=772, avg=25.93, stdev= 7.99
    clat (usec): min=1885, max=206522, avg=4470.91, stdev=2780.41
     lat (usec): min=1914, max=206554, avg=4496.84, stdev=2782.62
    clat percentiles (usec):
     |  1.00th=[ 2311],  5.00th=[ 2671], 10.00th=[ 2802], 20.00th=[ 2966],
     | 30.00th=[ 3097], 40.00th=[ 3261], 50.00th=[ 3392], 60.00th=[ 3589],
     | 70.00th=[ 3884], 80.00th=[ 5669], 90.00th=[ 8029], 95.00th=[10814],
     | 99.00th=[12649], 99.50th=[14222], 99.90th=[23725], 99.95th=[31327],
     | 99.99th=[42206]
   bw (  KiB/s): min=  456, max= 2632, per=100.00%, avg=1779.88, stdev=696.53, samples=479
   iops        : min=  114, max=  658, avg=444.92, stdev=174.15, samples=479
  lat (msec)   : 2=0.04%, 4=72.41%, 10=20.49%, 20=6.94%, 50=0.11%
  lat (msec)   : 100=0.01%, 250=0.01%
  cpu          : usr=2.24%, sys=1.42%, ctx=106614, majf=0, minf=860
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,106601,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
  WRITE: bw=1777KiB/s (1819kB/s), 1777KiB/s-1777KiB/s (1819kB/s-1819kB/s), io=416MiB (437MB), run=240008-240008msec

Disk stats (read/write):
    dm-0: ios=0/1434598, merge=0/0, ticks=0/241752, in_queue=241752, util=100.00%, aggrios=0/744369, aggrmerge=0/690611, aggrticks=0/193901, aggrin_queue=10920, aggrutil=100.00%
  sda: ios=0/744369, merge=0/690611, ticks=0/193901, in_queue=10920, util=100.00%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '789517':

        107,559.51 msec task-clock                #    0.446 CPUs utilized
         1,341,429      context-switches          #    0.012 M/sec
               463      cpu-migrations            #    0.004 K/sec
         1,003,097      page-faults               #    0.009 M/sec
   116,427,446,191      cycles                    #    1.082 GHz
    58,882,325,404      instructions              #    0.51  insn per cycle
    11,471,641,006      branches                  #  106.654 M/sec
       935,763,084      branch-misses             #    8.16% of all branches

     241.011065909 seconds time elapsed

...

ceph 4 KB random read, iodepth=2, numjobs=1

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
[1] 790187
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=2
fio-3.26-46-g30be
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=25.4MiB/s][r=6512 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=1): err= 0: pid=790274: Mon Aug 30 17:37:53 2021
  read: IOPS=6516, BW=25.5MiB/s (26.7MB/s)(6109MiB/240001msec)
    slat (nsec): min=1037, max=2194.6k, avg=10164.98, stdev=4370.38
    clat (usec): min=169, max=22889, avg=295.13, stdev=103.77
     lat (usec): min=179, max=22897, avg=305.30, stdev=103.95
    clat percentiles (usec):
     |  1.00th=[  239],  5.00th=[  251], 10.00th=[  265], 20.00th=[  281],
     | 30.00th=[  281], 40.00th=[  285], 50.00th=[  289], 60.00th=[  293],
     | 70.00th=[  302], 80.00th=[  310], 90.00th=[  326], 95.00th=[  347],
     | 99.00th=[  392], 99.50th=[  404], 99.90th=[  586], 99.95th=[  709],
     | 99.99th=[ 5735]
   bw (  KiB/s): min=19528, max=31048, per=100.00%, avg=26093.05, stdev=1813.46, samples=479
   iops        : min= 4882, max= 7762, avg=6523.20, stdev=453.36, samples=479
  lat (usec)   : 250=4.96%, 500=94.84%, 750=0.16%, 1000=0.02%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  cpu          : usr=8.18%, sys=8.64%, ctx=1507590, majf=0, minf=857
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=1563896,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
   READ: bw=25.5MiB/s (26.7MB/s), 25.5MiB/s-25.5MiB/s (26.7MB/s-26.7MB/s), io=6109MiB (6406MB), run=240001-240001msec

Disk stats (read/write):
    dm-0: ios=0/7114, merge=0/0, ticks=0/1148, in_queue=1148, util=1.01%, aggrios=0/3494, aggrmerge=0/3645, aggrticks=0/1007, aggrin_queue=4, aggrutil=1.01%
  sda: ios=0/3494, merge=0/3645, ticks=0/1007, in_queue=4, util=1.01%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '789517':

        206,471.25 msec task-clock                #    0.857 CPUs utilized
         3,135,538      context-switches          #    0.015 M/sec
               335      cpu-migrations            #    0.002 K/sec
         1,671,043      page-faults               #    0.008 M/sec
   212,223,757,354      cycles                    #    1.028 GHz
   166,044,005,403      instructions              #    0.78  insn per cycle
    33,350,937,922      branches                  #  161.528 M/sec
       410,678,098      branch-misses             #    1.23% of all branches

     241.019824574 seconds time elapsed
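
As a rough sanity check on the fio numbers for this run, Little's law says throughput should be close to the number of outstanding I/Os divided by the mean completion latency. The sketch below applies that to the reported `lat` average; the function name `expected_iops` is an assumption for illustration, and the check is an approximation, not something the original test performs.

```python
def expected_iops(iodepth: int, numjobs: int, mean_lat_usec: float) -> float:
    """Little's law estimate: outstanding I/Os divided by mean latency in seconds."""
    outstanding = iodepth * numjobs
    return outstanding / (mean_lat_usec * 1e-6)


# Values from the ceph-osd 4 KB random read run, iodepth=2, numjobs=1:
# "lat (usec): ... avg=305.30" and "read: IOPS=6516".
estimate = expected_iops(iodepth=2, numjobs=1, mean_lat_usec=305.30)
reported = 6516
print(f"estimated {estimate:.0f} IOPS, reported {reported} IOPS")
```

The estimate lands within about one percent of the reported IOPS, which suggests the queue stayed full for the whole run.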


...

ceph 4 KB random write, iodepth=16, numjobs=1

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
[1] 792217
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=16
fio-3.26-46-g30be
Starting 1 process
Jobs: 1 (f=1): [w(1)][100.0%][w=14.3MiB/s][w=3670 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=1): err= 0: pid=792304: Mon Aug 30 17:46:16 2021
  write: IOPS=3011, BW=11.8MiB/s (12.3MB/s)(2048MiB/174096msec); 0 zone resets
    slat (nsec): min=1069, max=136442, avg=8394.18, stdev=8544.49
    clat (usec): min=1419, max=74423, avg=5302.73, stdev=2597.73
     lat (usec): min=1442, max=74427, avg=5311.12, stdev=2597.58
    clat percentiles (usec):
     |  1.00th=[ 1811],  5.00th=[ 2376], 10.00th=[ 2933], 20.00th=[ 3818],
     | 30.00th=[ 4293], 40.00th=[ 4555], 50.00th=[ 4752], 60.00th=[ 4948],
     | 70.00th=[ 5473], 80.00th=[ 6521], 90.00th=[ 7963], 95.00th=[10028],
     | 99.00th=[14091], 99.50th=[16450], 99.90th=[28705], 99.95th=[36439],
     | 99.99th=[42206]
   bw (  KiB/s): min= 3056, max=26624, per=100.00%, avg=12064.78, stdev=3890.00, samples=347
   iops        : min=  764, max= 6656, avg=3016.10, stdev=972.46, samples=347
  lat (msec)   : 2=3.06%, 4=20.45%, 10=71.50%, 20=4.73%, 50=0.25%
  lat (msec)   : 100=0.01%
  cpu          : usr=4.58%, sys=2.24%, ctx=113985, majf=0, minf=1654
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,524288,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=11.8MiB/s (12.3MB/s), 11.8MiB/s-11.8MiB/s (12.3MB/s-12.3MB/s), io=2048MiB (2147MB), run=174096-174096msec

Disk stats (read/write):
    dm-0: ios=0/1051447, merge=0/0, ticks=0/162716, in_queue=162716, util=99.92%, aggrios=0/531406, aggrmerge=0/521357, aggrticks=0/127486, aggrin_queue=6836, aggrutil=99.90%
  sda: ios=0/531406, merge=0/521357, ticks=0/127486, in_queue=6836, util=99.90%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '791646':

        187,388.98 msec task-clock                #    1.070 CPUs utilized
         1,538,603      context-switches          #    0.008 M/sec
               361      cpu-migrations            #    0.002 K/sec
         4,352,335      page-faults               #    0.023 M/sec
   251,035,889,726      cycles                    #    1.340 GHz
   182,465,265,414      instructions              #    0.73  insn per cycle
    36,319,735,776      branches                  #  193.820 M/sec
     1,105,983,281      branch-misses             #    3.05% of all branches

     175.097180385 seconds time elapsed
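
The right-hand column of the perf output can be rederived from the raw counters, which is handy when comparing runs of different lengths. A small sketch, assuming the usual definitions (CPUs utilized = task-clock / elapsed time, GHz = cycles / task-clock, IPC = instructions / cycles); the function name `derived_metrics` is illustrative.

```python
def derived_metrics(task_clock_msec: float, elapsed_sec: float,
                    cycles: int, instructions: int) -> dict:
    """Recompute perf stat's derived columns from its raw counters."""
    task_clock_sec = task_clock_msec / 1000.0
    return {
        "cpus_utilized": task_clock_sec / elapsed_sec,
        "ghz": cycles / task_clock_sec / 1e9,
        "insn_per_cycle": instructions / cycles,
    }


# Counters from the ceph-osd 4 KB random write run, iodepth=16, numjobs=1.
metrics = derived_metrics(
    task_clock_msec=187_388.98,
    elapsed_sec=175.097180385,
    cycles=251_035_889_726,
    instructions=182_465_265_414,
)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

These agree with the 1.070 CPUs utilized, 1.340 GHz and 0.73 insn per cycle that perf printed for this run.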


...

ceph 4 KB random read, iodepth=16, numjobs=1

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
[1] 792313
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=16
fio-3.26-46-g30be
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=39.6MiB/s][r=10.1k IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=1): err= 0: pid=792400: Mon Aug 30 17:57:26 2021
  read: IOPS=9693, BW=37.9MiB/s (39.7MB/s)(9088MiB/240002msec)
    slat (nsec): min=1017, max=5106.9k, avg=9135.02, stdev=5416.13
    clat (usec): min=306, max=13337, avg=1639.93, stdev=652.11
     lat (usec): min=316, max=13348, avg=1649.07, stdev=652.74
    clat percentiles (usec):
     |  1.00th=[  627],  5.00th=[  848], 10.00th=[  979], 20.00th=[ 1139],
     | 30.00th=[ 1270], 40.00th=[ 1385], 50.00th=[ 1516], 60.00th=[ 1663],
     | 70.00th=[ 1860], 80.00th=[ 2114], 90.00th=[ 2474], 95.00th=[ 2737],
     | 99.00th=[ 3261], 99.50th=[ 3523], 99.90th=[ 7439], 99.95th=[ 9241],
     | 99.99th=[10552]
   bw (  KiB/s): min=20208, max=65442, per=100.00%, avg=38817.42, stdev=10531.87, samples=479
   iops        : min= 5052, max=16360, avg=9704.23, stdev=2632.97, samples=479
  lat (usec)   : 500=0.21%, 750=2.58%, 1000=8.39%
  lat (msec)   : 2=64.56%, 4=23.98%, 10=0.26%, 20=0.02%
  cpu          : usr=11.49%, sys=10.43%, ctx=1904670, majf=0, minf=1054
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=2326574,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=37.9MiB/s (39.7MB/s), 37.9MiB/s-37.9MiB/s (39.7MB/s-39.7MB/s), io=9088MiB (9530MB), run=240002-240002msec

Disk stats (read/write):
    dm-0: ios=0/6949, merge=0/0, ticks=0/1260, in_queue=1260, util=1.02%, aggrios=0/3439, aggrmerge=0/3524, aggrticks=0/990, aggrin_queue=0, aggrutil=1.01%
  sda: ios=0/3439, merge=0/3524, ticks=0/990, in_queue=0, util=1.01%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '791646':

        291,673.08 msec task-clock                #    1.210 CPUs utilized
         4,778,100      context-switches          #    0.016 M/sec
               430      cpu-migrations            #    0.001 K/sec
         6,024,880      page-faults               #    0.021 M/sec
   391,659,644,461      cycles                    #    1.343 GHz
   264,861,567,234      instructions              #    0.68  insn per cycle
    53,041,921,612      branches                  #  181.854 M/sec
       720,525,099      branch-misses             #    1.36% of all branches

     240.990189165 seconds time elapsed


...

ceph 4 KB random write, iodepth=32, numjobs=2

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
[1] 794341
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=32
...
fio-3.26-46-g30be
Starting 2 processes
Jobs: 1 (f=1): [w(1),_(1)][100.0%][w=22.2MiB/s][w=5691 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=794444: Mon Aug 30 18:07:22 2021
  write: IOPS=8233, BW=32.2MiB/s (33.7MB/s)(4096MiB/127349msec); 0 zone resets
    slat (nsec): min=1992, max=281766, avg=9239.18, stdev=9017.15
    clat (usec): min=2004, max=88095, avg=7748.40, stdev=3494.78
     lat (usec): min=2016, max=88102, avg=7757.64, stdev=3494.27
    clat percentiles (usec):
     |  1.00th=[ 4146],  5.00th=[ 4817], 10.00th=[ 5407], 20.00th=[ 5866],
     | 30.00th=[ 6128], 40.00th=[ 6390], 50.00th=[ 6718], 60.00th=[ 7177],
     | 70.00th=[ 7767], 80.00th=[ 8717], 90.00th=[11207], 95.00th=[14484],
     | 99.00th=[21890], 99.50th=[26346], 99.90th=[40109], 99.95th=[42730],
     | 99.99th=[52691]
   bw (  KiB/s): min= 9160, max=43776, per=100.00%, avg=33041.20, stdev=3938.51, samples=507
   iops        : min= 2290, max=10944, avg=8260.05, stdev=984.60, samples=507
  lat (msec)   : 4=0.56%, 10=85.89%, 20=12.02%, 50=1.52%, 100=0.01%
  cpu          : usr=7.01%, sys=3.30%, ctx=338569, majf=0, minf=1909
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1048576,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
  WRITE: bw=32.2MiB/s (33.7MB/s), 32.2MiB/s-32.2MiB/s (33.7MB/s-33.7MB/s), io=4096MiB (4295MB), run=127349-127349msec

Disk stats (read/write):
    dm-0: ios=0/621584, merge=0/0, ticks=0/113472, in_queue=113472, util=99.89%, aggrios=0/411285, aggrmerge=0/210551, aggrticks=0/98207, aggrin_queue=6728, aggrutil=99.88%
  sda: ios=0/411285, merge=0/210551, ticks=0/98207, in_queue=6728, util=99.88%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '793773':

        228,932.47 msec task-clock                #    1.783 CPUs utilized
         2,780,808      context-switches          #    0.012 M/sec
               290      cpu-migrations            #    0.001 K/sec
         3,261,322      page-faults               #    0.014 M/sec
   517,119,671,802      cycles                    #    2.259 GHz
   353,387,674,486      instructions              #    0.68  insn per cycle
    70,796,256,700      branches                  #  309.245 M/sec
     1,442,024,657      branch-misses             #    2.04% of all branches

     128.400388292 seconds time elapsed

...

ceph 4 KB random read, iodepth=32, numjobs=2

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
[1] 794559
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=32
...
fio-3.26-46-g30be
Starting 2 processes
Jobs: 2 (f=2): [r(2)][100.0%][r=45.1MiB/s][r=11.6k IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=794662: Mon Aug 30 18:17:45 2021
  read: IOPS=10.4k, BW=40.8MiB/s (42.8MB/s)(9796MiB/240006msec)
    slat (nsec): min=765, max=1102.8k, avg=9705.11, stdev=5920.11
    clat (usec): min=576, max=25028, avg=6113.45, stdev=2486.63
     lat (usec): min=589, max=25034, avg=6123.16, stdev=2487.73
    clat percentiles (usec):
     |  1.00th=[ 1663],  5.00th=[ 2180], 10.00th=[ 3261], 20.00th=[ 4080],
     | 30.00th=[ 4686], 40.00th=[ 5211], 50.00th=[ 5735], 60.00th=[ 6259],
     | 70.00th=[ 6980], 80.00th=[ 8455], 90.00th=[ 9765], 95.00th=[10552],
     | 99.00th=[11731], 99.50th=[12256], 99.90th=[18220], 99.95th=[19268],
     | 99.99th=[21365]
   bw (  KiB/s): min=23088, max=130104, per=100.00%, avg=41836.73, stdev=9126.06, samples=958
   iops        : min= 5772, max=32526, avg=10458.91, stdev=2281.50, samples=958
  lat (usec)   : 750=0.01%, 1000=0.01%
  lat (msec)   : 2=3.58%, 4=14.98%, 10=73.00%, 20=8.40%, 50=0.04%
  cpu          : usr=7.03%, sys=6.19%, ctx=1979368, majf=0, minf=725
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=2507796,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=40.8MiB/s (42.8MB/s), 40.8MiB/s-40.8MiB/s (42.8MB/s-42.8MB/s), io=9796MiB (10.3GB), run=240006-240006msec

Disk stats (read/write):
    dm-0: ios=1/8186, merge=0/0, ticks=0/1208, in_queue=1208, util=1.05%, aggrios=1/4142, aggrmerge=0/4055, aggrticks=1/1401, aggrin_queue=0, aggrutil=1.05%
  sda: ios=1/4142, merge=0/4055, ticks=1/1401, in_queue=0, util=1.05%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '793773':

        305,994.61 msec task-clock                #    1.270 CPUs utilized
         3,989,239      context-switches          #    0.013 M/sec
               533      cpu-migrations            #    0.002 K/sec
         5,466,383      page-faults               #    0.018 M/sec
   439,450,395,960      cycles                    #    1.436 GHz
   265,393,016,497      instructions              #    0.60  insn per cycle
    53,275,240,050      branches                  #  174.105 M/sec
       719,967,315      branch-misses             #    1.35% of all branches

     241.027460490 seconds time elapsed

...

ceph 4 KB random write, iodepth=64, numjobs=4

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
[1] 796827
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=64
...
fio-3.26-46-g30be
Starting 4 processes
Jobs: 4 (f=4): [w(4)][100.0%][w=41.8MiB/s][w=10.7k IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=4): err= 0: pid=796962: Mon Aug 30 18:33:46 2021
  write: IOPS=9209, BW=36.0MiB/s (37.7MB/s)(8192MiB/227723msec); 0 zone resets
    slat (usec): min=2, max=449, avg=10.32, stdev= 9.75
    clat (msec): min=5, max=104, avg=27.78, stdev= 6.10
     lat (msec): min=5, max=104, avg=27.79, stdev= 6.10
    clat percentiles (usec):
     |  1.00th=[18482],  5.00th=[21627], 10.00th=[23200], 20.00th=[24773],
     | 30.00th=[25297], 40.00th=[26084], 50.00th=[26608], 60.00th=[27132],
     | 70.00th=[27919], 80.00th=[29230], 90.00th=[31851], 95.00th=[41681],
     | 99.00th=[52691], 99.50th=[55313], 99.90th=[65274], 99.95th=[69731],
     | 99.99th=[82314]
   bw (  KiB/s): min=31493, max=47237, per=100.00%, avg=36880.47, stdev=480.20, samples=1820
   iops        : min= 7871, max=11807, avg=9219.11, stdev=119.99, samples=1820
  lat (msec)   : 10=0.03%, 20=2.40%, 50=95.62%, 100=1.95%, 250=0.01%
  cpu          : usr=4.33%, sys=2.30%, ctx=780131, majf=0, minf=4539
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,2097152,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=36.0MiB/s (37.7MB/s), 36.0MiB/s-36.0MiB/s (37.7MB/s-37.7MB/s), io=8192MiB (8590MB), run=227723-227723msec

Disk stats (read/write):
    dm-0: ios=0/1037872, merge=0/0, ticks=0/198292, in_queue=198292, util=100.00%, aggrios=0/813814, aggrmerge=0/224559, aggrticks=0/183479, aggrin_queue=11288, aggrutil=100.00%
  sda: ios=0/813814, merge=0/224559, ticks=0/183479, in_queue=11288, util=100.00%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '796254':

        450,611.05 msec task-clock                #    1.969 CPUs utilized
         5,955,962      context-switches          #    0.013 M/sec
               512      cpu-migrations            #    0.001 K/sec
         4,036,376      page-faults               #    0.009 M/sec
 1,060,218,192,398      cycles                    #    2.353 GHz
   701,343,050,871      instructions              #    0.66  insn per cycle
   140,799,881,435      branches                  #  312.464 M/sec
     2,753,635,925      branch-misses             #    1.96% of all branches

     228.827405765 seconds time elapsed


...

ceph 4 KB random read, iodepth=64, numjobs=4

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
[1] 796971
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=64
...
fio-3.26-46-g30be
Starting 4 processes
Jobs: 4 (f=0): [f(4)][100.0%][r=70.9MiB/s][r=18.1k IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=4): err= 0: pid=797106: Mon Aug 30 18:39:58 2021
  read: IOPS=11.9k, BW=46.3MiB/s (48.6MB/s)(10.9GiB/240012msec)
    slat (nsec): min=764, max=699076, avg=10684.89, stdev=8026.80
    clat (usec): min=2353, max=76918, avg=21563.84, stdev=12281.03
     lat (usec): min=2365, max=76939, avg=21574.53, stdev=12283.45
    clat percentiles (usec):
     |  1.00th=[ 9372],  5.00th=[10421], 10.00th=[11207], 20.00th=[12518],
     | 30.00th=[13435], 40.00th=[14353], 50.00th=[15795], 60.00th=[18482],
     | 70.00th=[23462], 80.00th=[31589], 90.00th=[42730], 95.00th=[49021],
     | 99.00th=[55313], 99.50th=[57934], 99.90th=[63177], 99.95th=[65799],
     | 99.99th=[72877]
   bw (  KiB/s): min=16840, max=103064, per=100.00%, avg=47465.51, stdev=5997.11, samples=1916
   iops        : min= 4210, max=25766, avg=11865.57, stdev=1499.29, samples=1916
  lat (msec)   : 4=0.01%, 10=2.94%, 20=60.75%, 50=32.13%, 100=4.18%
  cpu          : usr=4.47%, sys=3.77%, ctx=2266326, majf=0, minf=1231
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=2847645,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=46.3MiB/s (48.6MB/s), 46.3MiB/s-46.3MiB/s (48.6MB/s-48.6MB/s), io=10.9GiB (11.7GB), run=240012-240012msec

Disk stats (read/write):
    dm-0: ios=0/7818, merge=0/0, ticks=0/2480, in_queue=2480, util=1.60%, aggrios=0/3945, aggrmerge=0/3902, aggrticks=0/2588, aggrin_queue=492, aggrutil=1.60%
  sda: ios=0/3945, merge=0/3902, ticks=0/2588, in_queue=492, util=1.60%
root@otccldstore05:/home/chunmei/ceph/build#
 Performance counter stats for process id '796254':

        304,880.89 msec task-clock                #    1.265 CPUs utilized
         4,351,492      context-switches          #    0.014 M/sec
               836      cpu-migrations            #    0.003 K/sec
         5,479,561      page-faults               #    0.018 M/sec
   509,714,046,081      cycles                    #    1.672 GHz
   301,135,685,379      instructions              #    0.59  insn per cycle
    60,452,359,986      branches                  #  198.282 M/sec
       825,345,030      branch-misses             #    1.37% of all branches

     241.105620573 seconds time elapsed

...