Skip to content

Instantly share code, notes, and snippets.

@liu-chunmei
Last active February 10, 2021 06:04
Show Gist options
  • Save liu-chunmei/f696b9c4f31b123fb223cdd47f13c8ea to your computer and use it in GitHub Desktop.
Save liu-chunmei/f696b9c4f31b123fb223cdd47f13c8ea to your computer and use it in GitHub Desktop.
crimson-osd performance testing---fio

fio: Rough perf testing of crimson-osd: ceph-osd

Scenario

  • 4 KB random reads and writes with one fio instance and two job number
  • single OSD instance
  • deployed on purley server by vstart.sh
  • with minimal content stored in memstore / cyanstore.

version

master (commit ID caa58fce7a3c6a16d86ec2bf075a204e1e2b836c)

The cycles-per-op metric

ceph-osd

write: 135,556,125,646 /317739 = 426627.2810
>>>
read:  180,596,074,788 /1231875 = 146602.5974
>>>

crimson-osd

write: 92,465,570,052 /304513 = 303650.6489
>>> 
read: 100,894,866,318 /1517528 = 66486.3292
>>>

The avg-lat metric

ceph-osd

write: 12.09(ms)
read: 3117.05 (us)

crimson-osd

write: 12627.25(us)
read:  2533.58(us)

crimson Build configuration

./do_cmake.sh -DWITH_SEASTAR=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CXX_FLAGS="-fno-omit-frame-pointer" -DWITH_TESTS=OFF

crimson Deployment

[build]$ MDS=0 MGR=0 OSD=1 MON=1 ../src/vstart.sh -n --without-dashboard --memstore -X -o "memstore_device_bytes=2516582400" --nolockdep --crimson --nodaemon --redirect-output            

[build]$ bin/ceph osd pool create rbd 128 128 && bin/ceph osd pool set --yes-i-really-mean-it rbd size 1 && bin/ceph osd pool --yes-i-really-mean-it set rbd min_size 1 

[build]$ bin/rbd create fio_test --size 2G --image-format=2 --rbd_default_features=3
...

crimson 4 KB random write, two fio job number instances

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
[1] 38206
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=64
...
fio-3.1
Starting 2 processes
Jobs: 2 (f=2): [w(2)][100.0%][r=0KiB/s,w=23.8MiB/s][r=0,w=6102 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=38319: Wed Feb  3 00:32:24 2021
  write: IOPS=10.1k, BW=39.6MiB/s (41.5MB/s)(1190MiB/30044msec)
    slat (nsec): min=842, max=114432, avg=5931.96, stdev=5059.90
    clat (usec): min=1028, max=169210, avg=12621.32, stdev=12856.93
     lat (usec): min=1034, max=169222, avg=12627.25, stdev=12857.12
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    4], 20.00th=[    5],
     | 30.00th=[    6], 40.00th=[    7], 50.00th=[    9], 60.00th=[   11],
     | 70.00th=[   13], 80.00th=[   16], 90.00th=[   29], 95.00th=[   48],
     | 99.00th=[   56], 99.50th=[   59], 99.90th=[   88], 99.95th=[   99],
     | 99.99th=[  126]
   bw (  KiB/s): min= 4745, max=76288, per=50.06%, avg=20294.77, stdev=10959.40, samples=120
   iops        : min= 1186, max=19072, avg=5073.65, stdev=2739.87, samples=120
  lat (msec)   : 2=0.18%, 4=14.95%, 10=44.01%, 20=28.57%, 50=8.97%
  lat (msec)   : 100=3.28%, 250=0.04%
  cpu          : usr=50.75%, sys=49.02%, ctx=29182, majf=0, minf=1221
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwt: total=0,304513,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=39.6MiB/s (41.5MB/s), 39.6MiB/s-39.6MiB/s (41.5MB/s-41.5MB/s), io=1190MiB (1247MB), run=30044-30044msec

Disk stats (read/write):
    dm-1: ios=0/115, merge=0/0, ticks=0/60, in_queue=60, util=0.19%, aggrios=0/71, aggrmerge=0/56, aggrticks=0/56, aggrin_queue=56, aggrutil=0.19%
  sda: ios=0/71, merge=0/56, ticks=0/56, in_queue=56, util=0.19%
chunmei@otccldstore05:~/ceph/build$
 Performance counter stats for process id '38096':

      28052.782758      task-clock (msec)         #    0.896 CPUs utilized
           114,012      context-switches          #    0.004 M/sec
                 0      cpu-migrations            #    0.000 K/sec
           237,692      page-faults               #    0.008 M/sec
    92,465,570,052      cycles                    #    3.296 GHz
   101,483,088,881      instructions              #    1.10  insn per cycle
    20,126,489,495      branches                  #  717.451 M/sec
       137,456,674      branch-misses             #    0.68% of all branches

      31.305689522 seconds time elapsed

...

crimson 4 KB random read, two fio job number instances

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
[1] 38323
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=64
...
fio-3.1
Starting 2 processes
Jobs: 2 (f=2): [r(2)][100.0%][r=201MiB/s,w=0KiB/s][r=51.4k,w=0 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=38436: Wed Feb  3 00:36:52 2021
   read: IOPS=50.5k, BW=197MiB/s (207MB/s)(5928MiB/30044msec)
    slat (nsec): min=196, max=807057, avg=2071.82, stdev=2064.28
    clat (usec): min=455, max=46061, avg=2531.51, stdev=663.56
     lat (usec): min=459, max=46061, avg=2533.58, stdev=663.56
    clat percentiles (usec):
     |  1.00th=[ 1532],  5.00th=[ 1844], 10.00th=[ 2024], 20.00th=[ 2278],
     | 30.00th=[ 2409], 40.00th=[ 2442], 50.00th=[ 2507], 60.00th=[ 2540],
     | 70.00th=[ 2606], 80.00th=[ 2704], 90.00th=[ 2999], 95.00th=[ 3326],
     | 99.00th=[ 3916], 99.50th=[ 4621], 99.90th=[ 8455], 99.95th=[11207],
     | 99.99th=[19268]
   bw (  KiB/s): min=81088, max=116888, per=50.08%, avg=101175.18, stdev=3626.79, samples=120
   iops        : min=20272, max=29222, avg=25293.76, stdev=906.71, samples=120
  lat (usec)   : 500=0.01%, 750=0.01%, 1000=0.03%
  lat (msec)   : 2=9.16%, 4=89.93%, 10=0.81%, 20=0.06%, 50=0.01%
  cpu          : usr=53.70%, sys=46.11%, ctx=26674, majf=0, minf=462
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwt: total=1517528,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=197MiB/s (207MB/s), 197MiB/s-197MiB/s (207MB/s-207MB/s), io=5928MiB (6216MB), run=30044-30044msec

Disk stats (read/write):
    dm-1: ios=0/268, merge=0/0, ticks=0/256, in_queue=256, util=0.65%, aggrios=0/177, aggrmerge=0/127, aggrticks=0/192, aggrin_queue=192, aggrutil=0.64%
  sda: ios=0/177, merge=0/127, ticks=0/192, in_queue=192, util=0.64%
chunmei@otccldstore05:~/ceph/build$
 Performance counter stats for process id '38096':

      29642.606495      task-clock (msec)         #    0.947 CPUs utilized
           120,225      context-switches          #    0.004 M/sec
                 0      cpu-migrations            #    0.000 K/sec
               523      page-faults               #    0.018 K/sec
   100,894,866,318      cycles                    #    3.404 GHz
    76,282,985,821      instructions              #    0.76  insn per cycle
    15,146,255,994      branches                  #  510.962 M/sec
        55,043,674      branch-misses             #    0.36% of all branches

      31.317012681 seconds time elapsed

...

ceph options configuration

ms_async_op_threads = 1
osd_op_num_threads_per_shard = 1
osd_op_num_shards =1

ceph Build configuration

./do_cmake.sh -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CXX_FLAGS="-fno-omit-frame-pointer" -DWITH_TESTS=OFF

ceph Deployment

[build]$ MDS=0 MGR=0 OSD=1 MON=1 ../src/vstart.sh -n --without-dashboard --memstore -X -o "memstore_device_bytes=2516582400" --nolockdep --nodaemon --redirect-output            

[build]$ bin/ceph osd pool create rbd 128 128 && bin/ceph osd pool set --yes-i-really-mean-it rbd size 1 && bin/ceph osd pool --yes-i-really-mean-it set rbd min_size 1 

[build]$ bin/rbd create fio_test --size 2G --image-format=2 --rbd_default_features=3

[build]$ taskset -cp 0 <ceph_osd Pid> 
...

ceph 4 KB random write, two fio job number instances

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
[1] 37526
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=64
...
fio-3.1
Starting 2 processes
Jobs: 2 (f=2): [w(2)][100.0%][r=0KiB/s,w=25.4MiB/s][r=0,w=6506 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=37639: Wed Feb  3 00:22:20 2021
  write: IOPS=10.6k, BW=41.3MiB/s (43.4MB/s)(1241MiB/30021msec)
    slat (nsec): min=1296, max=131535, avg=5758.49, stdev=1934.07
    clat (msec): min=3, max=234, avg=12.09, stdev= 7.13
     lat (msec): min=3, max=234, avg=12.09, stdev= 7.13
    clat percentiles (msec):
     |  1.00th=[    5],  5.00th=[    5], 10.00th=[    6], 20.00th=[    7],
     | 30.00th=[    9], 40.00th=[   10], 50.00th=[   12], 60.00th=[   13],
     | 70.00th=[   15], 80.00th=[   17], 90.00th=[   18], 95.00th=[   20],
     | 99.00th=[   31], 99.50th=[   36], 99.90th=[  103], 99.95th=[  129],
     | 99.99th=[  163]
   bw (  KiB/s): min= 2104, max=49960, per=50.03%, avg=21178.27, stdev=9811.25, samples=120
   iops        : min=  526, max=12490, avg=5294.51, stdev=2452.83, samples=120
  lat (msec)   : 4=0.01%, 10=41.92%, 20=53.28%, 50=4.49%, 100=0.20%
  lat (msec)   : 250=0.11%
  cpu          : usr=50.76%, sys=49.18%, ctx=3249, majf=0, minf=1422
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwt: total=0,317739,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=41.3MiB/s (43.4MB/s), 41.3MiB/s-41.3MiB/s (43.4MB/s-43.4MB/s), io=1241MiB (1301MB), run=30021-30021msec

Disk stats (read/write):
    dm-1: ios=0/129, merge=0/0, ticks=0/60, in_queue=60, util=0.20%, aggrios=0/77, aggrmerge=0/65, aggrticks=0/60, aggrin_queue=60, aggrutil=0.20%
  sda: ios=0/77, merge=0/65, ticks=0/60, in_queue=60, util=0.20%

 Performance counter stats for process id '37380':

      53161.843754      task-clock (msec)         #    1.699 CPUs utilized
           720,762      context-switches          #    0.014 M/sec
                18      cpu-migrations            #    0.000 K/sec
         1,805,282      page-faults               #    0.034 M/sec
   135,556,125,646      cycles                    #    2.550 GHz
    90,259,333,420      instructions              #    0.67  insn per cycle
    17,410,244,625      branches                  #  327.495 M/sec
       100,216,022      branch-misses             #    0.58% of all branches

      31.285592179 seconds time elapsed
...

ceph 4 KB random read, two fio job number instances

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
[1] 37644
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=64
...
fio-3.1
Starting 2 processes
Jobs: 2 (f=2): [r(2)][100.0%][r=180MiB/s,w=0KiB/s][r=45.0k,w=0 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=37757: Wed Feb  3 00:26:27 2021
   read: IOPS=41.1k, BW=160MiB/s (168MB/s)(4812MiB/30003msec)
    slat (nsec): min=238, max=737432, avg=3485.81, stdev=1930.53
    clat (usec): min=1238, max=15937, avg=3113.57, stdev=464.04
     lat (usec): min=1239, max=15956, avg=3117.05, stdev=464.12
    clat percentiles (usec):
     |  1.00th=[ 2212],  5.00th=[ 2606], 10.00th=[ 2671], 20.00th=[ 2769],
     | 30.00th=[ 2868], 40.00th=[ 2999], 50.00th=[ 3195], 60.00th=[ 3261],
     | 70.00th=[ 3294], 80.00th=[ 3359], 90.00th=[ 3425], 95.00th=[ 3490],
     | 99.00th=[ 5211], 99.50th=[ 5735], 99.90th=[ 6980], 99.95th=[ 8455],
     | 99.99th=[11469]
   bw (  KiB/s): min=67144, max=114864, per=50.01%, avg=82137.21, stdev=8147.17, samples=120
   iops        : min=16786, max=28716, avg=20534.26, stdev=2036.81, samples=120
  lat (msec)   : 2=0.64%, 4=97.85%, 10=1.49%, 20=0.02%
  cpu          : usr=52.61%, sys=47.33%, ctx=10007, majf=0, minf=465
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwt: total=1231875,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=160MiB/s (168MB/s), 160MiB/s-160MiB/s (168MB/s-168MB/s), io=4812MiB (5046MB), run=30003-30003msec

Disk stats (read/write):
    dm-1: ios=0/271, merge=0/0, ticks=0/288, in_queue=288, util=0.68%, aggrios=0/181, aggrmerge=0/126, aggrticks=0/204, aggrin_queue=204, aggrutil=0.68%
  sda: ios=0/181, merge=0/126, ticks=0/204, in_queue=204, util=0.68%

 Performance counter stats for process id '37380':

      55431.466486      task-clock (msec)         #    1.772 CPUs utilized
           645,319      context-switches          #    0.012 M/sec
                18      cpu-migrations            #    0.000 K/sec
         1,132,945      page-faults               #    0.020 M/sec
   180,596,074,788      cycles                    #    3.258 GHz
    90,476,210,908      instructions              #    0.50  insn per cycle
    17,630,567,178      branches                  #  318.061 M/sec
       147,929,888      branch-misses             #    0.84% of all branches

      31.279219246 seconds time elapsed
...

fio write configuration

[global] ioengine=rbd clientname=admin pool=rbd rbdname=fio_test rw=randwrite bs=4K runtime=30 numjobs=2 direct=1 group_reporting

[rbd_iodepth32] iodepth=64 ...

fio read configuration

[global] ioengine=rbd clientname=admin pool=rbd rbdname=fio_test rw=randread bs=4K time_based=1 runtime=30 group_reporting

[rbd_iodepth32] iodepth=64 numjobs=2

fio write configuration

[global] ioengine=rbd clientname=admin pool=rbd rbdname=fio_test rw=randwrite bs=4K runtime=30 numjobs=2 direct=1 group_reporting

[rbd_iodepth32] iodepth=2 ...

fio read configuration

[global] ioengine=rbd clientname=admin pool=rbd rbdname=fio_test rw=randread bs=4K time_based=1 runtime=30 group_reporting

[rbd_iodepth32] iodepth=2 numjobs=2 ...

The cycles-per-op metric

ceph-osd

write: 131,354,778,365 /297594 = 441389.2026
>>>
read:  118,108,326,710 /728138 = 162205.9646
>>>

crimson-osd

write: 99,797,508,296 /296447 = 336645.3642
>>> 
read: 101,895,820,665/623251 = 163490.8257
>>>

The avg-lat metric

ceph-osd

write: 402.78(us)
read: 164.43(us)

crimson-osd

write: 404.52(us)
read:  192.28(us)

crimson 4 KB random write, two fio job number instances

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
[1] 38912
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=2
...
fio-3.1
Starting 2 processes
Jobs: 2 (f=2): [w(2)][100.0%][r=0KiB/s,w=26.1MiB/s][r=0,w=6681 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=39025: Wed Feb  3 00:45:48 2021
  write: IOPS=9867, BW=38.5MiB/s (40.4MB/s)(1158MiB/30042msec)
    slat (nsec): min=1140, max=49358, avg=6243.33, stdev=2168.75
    clat (usec): min=128, max=43944, avg=398.28, stdev=291.83
     lat (usec): min=135, max=43948, avg=404.52, stdev=291.80
    clat percentiles (usec):
     |  1.00th=[  186],  5.00th=[  200], 10.00th=[  210], 20.00th=[  229],
     | 30.00th=[  269], 40.00th=[  322], 50.00th=[  375], 60.00th=[  420],
     | 70.00th=[  465], 80.00th=[  510], 90.00th=[  570], 95.00th=[  668],
     | 99.00th=[ 1074], 99.50th=[ 1418], 99.90th=[ 2966], 99.95th=[ 3523],
     | 99.99th=[ 5145]
   bw (  KiB/s): min= 4352, max=36688, per=50.21%, avg=19818.80, stdev=6729.06, samples=119
   iops        : min= 1088, max= 9172, avg=4954.68, stdev=1682.27, samples=119
  lat (usec)   : 250=27.49%, 500=50.05%, 750=18.92%, 1000=2.24%
  lat (msec)   : 2=1.04%, 4=0.24%, 10=0.02%, 20=0.01%, 50=0.01%
  cpu          : usr=51.35%, sys=48.55%, ctx=18687, majf=0, minf=360
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,296447,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
  WRITE: bw=38.5MiB/s (40.4MB/s), 38.5MiB/s-38.5MiB/s (40.4MB/s-40.4MB/s), io=1158MiB (1214MB), run=30042-30042msec

Disk stats (read/write):
    dm-1: ios=0/117, merge=0/0, ticks=0/64, in_queue=64, util=0.19%, aggrios=0/77, aggrmerge=0/57, aggrticks=0/56, aggrin_queue=56, aggrutil=0.19%
  sda: ios=0/77, merge=0/57, ticks=0/56, in_queue=56, util=0.19%
chunmei@otccldstore05:~/ceph/build$
 Performance counter stats for process id '38802':

      29615.127680      task-clock (msec)         #    0.945 CPUs utilized
           120,404      context-switches          #    0.004 M/sec
                 0      cpu-migrations            #    0.000 K/sec
           217,649      page-faults               #    0.007 M/sec
    99,797,508,296      cycles                    #    3.370 GHz
   102,418,795,042      instructions              #    1.03  insn per cycle
    20,274,589,218      branches                  #  684.602 M/sec
       144,595,430      branch-misses             #    0.71% of all branches

      31.326414195 seconds time elapsed

...

crimson 4 KB random read, two fio job number instances

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
[1] 39030
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=2
...
fio-3.1
Starting 2 processes
Jobs: 2 (f=2): [r(2)][100.0%][r=81.0MiB/s,w=0KiB/s][r=20.0k,w=0 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=39143: Wed Feb  3 00:49:45 2021
   read: IOPS=20.7k, BW=81.0MiB/s (84.0MB/s)(2435MiB/30044msec)
    slat (nsec): min=579, max=47607, avg=3324.34, stdev=769.23
    clat (usec): min=95, max=43095, avg=188.96, stdev=78.68
     lat (usec): min=98, max=43098, avg=192.28, stdev=78.70
    clat percentiles (usec):
     |  1.00th=[  161],  5.00th=[  163], 10.00th=[  167], 20.00th=[  176],
     | 30.00th=[  182], 40.00th=[  188], 50.00th=[  190], 60.00th=[  192],
     | 70.00th=[  196], 80.00th=[  200], 90.00th=[  208], 95.00th=[  215],
     | 99.00th=[  225], 99.50th=[  229], 99.90th=[  265], 99.95th=[  359],
     | 99.99th=[  502]
   bw (  KiB/s): min=36384, max=48504, per=50.08%, avg=41553.93, stdev=3041.38, samples=120
   iops        : min= 9096, max=12126, avg=10388.47, stdev=760.35, samples=120
  lat (usec)   : 100=0.01%, 250=99.87%, 500=0.11%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 50=0.01%
  cpu          : usr=50.57%, sys=49.44%, ctx=982, majf=0, minf=361
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=623251,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
   READ: bw=81.0MiB/s (84.0MB/s), 81.0MiB/s-81.0MiB/s (84.0MB/s-84.0MB/s), io=2435MiB (2553MB), run=30044-30044msec

Disk stats (read/write):
    dm-1: ios=0/281, merge=0/0, ticks=0/320, in_queue=320, util=0.69%, aggrios=0/181, aggrmerge=0/136, aggrticks=0/204, aggrin_queue=204, aggrutil=0.68%
  sda: ios=0/181, merge=0/136, ticks=0/204, in_queue=204, util=0.68%
chunmei@otccldstore05:~/ceph/build$
 Performance counter stats for process id '38802':

      29692.487796      task-clock (msec)         #    0.948 CPUs utilized
           120,205      context-switches          #    0.004 M/sec
                 0      cpu-migrations            #    0.000 K/sec
               578      page-faults               #    0.019 K/sec
   101,895,820,665      cycles                    #    3.432 GHz
    69,445,577,382      instructions              #    0.68  insn per cycle
    13,379,368,207      branches                  #  450.598 M/sec
        89,268,986      branch-misses             #    0.67% of all branches

      31.322221984 seconds time elapsed

...

ceph 4 KB random write, two fio job number instances

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
[1] 39651
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=2
...
fio-3.1
Starting 2 processes
Jobs: 2 (f=2): [w(2)][100.0%][r=0KiB/s,w=25.6MiB/s][r=0,w=6554 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=39764: Wed Feb  3 00:54:32 2021
  write: IOPS=9919, BW=38.7MiB/s (40.6MB/s)(1162MiB/30001msec)
    slat (nsec): min=1761, max=43911, avg=6779.81, stdev=2029.58
    clat (usec): min=159, max=14924, avg=396.00, stdev=283.55
     lat (usec): min=166, max=14929, avg=402.78, stdev=283.49
    clat percentiles (usec):
     |  1.00th=[  204],  5.00th=[  221], 10.00th=[  233], 20.00th=[  258],
     | 30.00th=[  289], 40.00th=[  318], 50.00th=[  355], 60.00th=[  396],
     | 70.00th=[  441], 80.00th=[  490], 90.00th=[  545], 95.00th=[  586],
     | 99.00th=[ 1598], 99.50th=[ 1958], 99.90th=[ 4228], 99.95th=[ 5211],
     | 99.99th=[ 7439]
   bw (  KiB/s): min= 2016, max=33696, per=50.31%, avg=19961.54, stdev=6893.34, samples=118
   iops        : min=  504, max= 8424, avg=4990.36, stdev=1723.34, samples=118
  lat (usec)   : 250=17.34%, 500=64.74%, 750=16.23%, 1000=0.23%
  lat (msec)   : 2=0.99%, 4=0.35%, 10=0.12%, 20=0.01%
  cpu          : usr=50.89%, sys=49.01%, ctx=12474, majf=0, minf=364
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,297594,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
  WRITE: bw=38.7MiB/s (40.6MB/s), 38.7MiB/s-38.7MiB/s (40.6MB/s-40.6MB/s), io=1162MiB (1219MB), run=30001-30001msec

Disk stats (read/write):
    dm-1: ios=0/123, merge=0/0, ticks=0/52, in_queue=52, util=0.17%, aggrios=0/75, aggrmerge=0/60, aggrticks=0/56, aggrin_queue=56, aggrutil=0.19%
  sda: ios=0/75, merge=0/60, ticks=0/56, in_queue=56, util=0.19%
chunmei@otccldstore05:~/ceph/build$
 Performance counter stats for process id '39505':

      52386.375095      task-clock (msec)         #    1.677 CPUs utilized
           726,252      context-switches          #    0.014 M/sec
                39      cpu-migrations            #    0.001 K/sec
         1,706,724      page-faults               #    0.033 M/sec
   131,354,778,365      cycles                    #    2.507 GHz
    83,694,944,543      instructions              #    0.64  insn per cycle
    16,132,260,159      branches                  #  307.948 M/sec
       104,244,466      branch-misses             #    0.65% of all branches

      31.238484082 seconds time elapsed

...

ceph 4 KB random read, two fio job number instances

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
[1] 39769
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=2
...
fio-3.1
Starting 2 processes
Jobs: 2 (f=2): [r(2)][100.0%][r=96.4MiB/s,w=0KiB/s][r=24.7k,w=0 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=39883: Wed Feb  3 00:58:07 2021
   read: IOPS=24.3k, BW=94.8MiB/s (99.4MB/s)(2844MiB/30001msec)
    slat (nsec): min=241, max=167581, avg=3350.41, stdev=1706.38
    clat (usec): min=100, max=5895, avg=161.08, stdev=50.33
     lat (usec): min=103, max=5897, avg=164.43, stdev=50.35
    clat percentiles (usec):
     |  1.00th=[  124],  5.00th=[  133], 10.00th=[  139], 20.00th=[  145],
     | 30.00th=[  149], 40.00th=[  153], 50.00th=[  159], 60.00th=[  163],
     | 70.00th=[  169], 80.00th=[  176], 90.00th=[  186], 95.00th=[  196],
     | 99.00th=[  219], 99.50th=[  231], 99.90th=[  302], 99.95th=[  416],
     | 99.99th=[ 2638]
   bw (  KiB/s): min=41848, max=50808, per=50.00%, avg=48537.78, stdev=1400.59, samples=118
   iops        : min=10462, max=12702, avg=12134.42, stdev=350.18, samples=118
  lat (usec)   : 250=99.78%, 500=0.18%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.03%, 10=0.01%
  cpu          : usr=51.45%, sys=48.38%, ctx=23214, majf=0, minf=461
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=728138,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
   READ: bw=94.8MiB/s (99.4MB/s), 94.8MiB/s-94.8MiB/s (99.4MB/s-99.4MB/s), io=2844MiB (2982MB), run=30001-30001msec

Disk stats (read/write):
    dm-1: ios=0/271, merge=0/0, ticks=0/252, in_queue=252, util=0.65%, aggrios=0/179, aggrmerge=0/128, aggrticks=0/196, aggrin_queue=196, aggrutil=0.65%
  sda: ios=0/179, merge=0/128, ticks=0/196, in_queue=196, util=0.65%

 Performance counter stats for process id '39505':

      37205.669563      task-clock (msec)         #    1.190 CPUs utilized
         1,087,191      context-switches          #    0.029 M/sec
                33      cpu-migrations            #    0.001 K/sec
           703,764      page-faults               #    0.019 M/sec
   118,108,326,710      cycles                    #    3.174 GHz
    59,256,590,289      instructions              #    0.50  insn per cycle
    11,491,475,799      branches                  #  308.864 M/sec
       121,731,989      branch-misses             #    1.06% of all branches

      31.269378318 seconds time elapsed

fio write configuration

[global] ioengine=rbd clientname=admin pool=rbd rbdname=fio_test rw=randwrite bs=4K runtime=30 numjobs=2 direct=1 group_reporting

[rbd_iodepth32] iodepth=32 ...

fio read configuration

[global] ioengine=rbd clientname=admin pool=rbd rbdname=fio_test rw=randread bs=4K time_based=1 runtime=30 group_reporting

[rbd_iodepth32] iodepth=32 numjobs=2 ...

The cycles-per-op metric

ceph-osd

write: 135,557,028,177 /295242 = 459138.7003
>>>
read:  186,093,071,128 / 1318768 = 141111.3032
>>>

crimson-osd

write: 94,629,585,102 /310273 = 304988.1398
>>> 
read: 99,869,175,993 / 1532969  = 65147.5509
>>>

The avg-lat metric

ceph-osd

write: 6504.78(us)
read: 1455.56 (us)

crimson-osd

write: 6199.67(us)
read:  1252.12(us)

crimson 4 KB random write, two fio job number instances

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
[1] 49723
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=32
...
fio-3.1
Starting 2 processes
Jobs: 2 (f=2): [r(2)][100.0%][r=170MiB/s,w=0KiB/s][r=43.6k,w=0 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=49836: Wed Feb  3 14:25:34 2021
   read: IOPS=43.0k, BW=172MiB/s (180MB/s)(5151MiB/30002msec)
    slat (nsec): min=239, max=51418, avg=3632.22, stdev=1794.78
    clat (usec): min=458, max=12421, avg=1451.93, stdev=224.48
     lat (usec): min=461, max=12423, avg=1455.56, stdev=224.48
    clat percentiles (usec):
     |  1.00th=[ 1123],  5.00th=[ 1270], 10.00th=[ 1319], 20.00th=[ 1352],
     | 30.00th=[ 1385], 40.00th=[ 1418], 50.00th=[ 1434], 60.00th=[ 1467],
     | 70.00th=[ 1483], 80.00th=[ 1516], 90.00th=[ 1582], 95.00th=[ 1631],
     | 99.00th=[ 1827], 99.50th=[ 2638], 99.90th=[ 4490], 99.95th=[ 4817],
     | 99.99th=[ 5735]
   bw (  KiB/s): min=76408, max=109224, per=50.01%, avg=87934.46, stdev=4504.50, samples=120
   iops        : min=19102, max=27306, avg=21983.60, stdev=1126.12, samples=120
  lat (usec)   : 500=0.01%, 750=0.12%, 1000=0.45%
  lat (msec)   : 2=98.74%, 4=0.48%, 10=0.21%, 20=0.01%
  cpu          : usr=52.62%, sys=47.31%, ctx=11843, majf=0, minf=389
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwt: total=1318768,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=172MiB/s (180MB/s), 172MiB/s-172MiB/s (180MB/s-180MB/s), io=5151MiB (5402MB), run=30002-30002msec

Disk stats (read/write):
    dm-1: ios=0/274, merge=0/0, ticks=0/256, in_queue=256, util=0.67%, aggrios=0/182, aggrmerge=0/128, aggrticks=0/200, aggrin_queue=200, aggrutil=0.66%
  sda: ios=0/182, merge=0/128, ticks=0/200, in_queue=200, util=0.66%

 Performance counter stats for process id '49414':

      57398.992807      task-clock (msec)         #    1.835 CPUs utilized
           514,295      context-switches          #    0.009 M/sec
                47      cpu-migrations            #    0.001 K/sec
         1,160,810      page-faults               #    0.020 M/sec
   186,093,071,128      cycles                    #    3.242 GHz
    95,721,598,532      instructions              #    0.51  insn per cycle
    18,618,111,072      branches                  #  324.363 M/sec
       151,239,795      branch-misses             #    0.81% of all branches

      31.285291582 seconds time elapsed

[1] 60683
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 409                                                                                                             6B-4096B, ioengine=rbd, iodepth=32
...
fio-3.1
Starting 2 processes
Jobs: 2 (f=2): [w(2)][10.0%][r=0KiB/s,w=86.2MiB/s][r=0,w=22.1k IOPS][eta 00m:27s                                                                                                             Jobs: 2 (f=2): [w(2)][13.3%][r=0KiB/s,w=85.4MiB/s][r=0,w=21.8k IOPS][eta 00m:26s                                                                                                             Jobs: 2 (f=2): [w(2)][16.7%][r=0KiB/s,w=68.3MiB/s][r=0,w=17.5k IOPS][eta 00m:25s                                                                                                             Jobs: 2 (f=2): [w(2)][20.0%][r=0KiB/s,w=58.8MiB/s][r=0,w=15.0k IOPS][eta 00m:24s                                                                                                             Jobs: 2 (f=2): [w(2)][23.3%][r=0KiB/s,w=54.1MiB/s][r=0,w=13.8k IOPS][eta 00m:23s                                                                                                             Jobs: 2 (f=2): [w(2)][26.7%][r=0KiB/s,w=45.6MiB/s][r=0,w=11.7k IOPS][eta 00m:22s                                                                                                             Jobs: 2 (f=2): [w(2)][30.0%][r=0KiB/s,w=46.6MiB/s][r=0,w=11.9k IOPS][eta 00m:21s                                                                                                             Jobs: 2 (f=2): [w(2)][33.3%][r=0KiB/s,w=43.8MiB/s][r=0,w=11.2k IOPS][eta 00m:20s                                                                                                             Jobs: 2 (f=2): [w(2)][36.7%][r=0KiB/s,w=39.7MiB/s][r=0,w=10.2k IOPS][eta 00m:19s                                                                                                             Jobs: 2 (f=2): [w(2)][40.0%][r=0KiB/s,w=38.6MiB/s][r=0,w=9883 IOPS][eta 00m:18s]                                                                                                             Jobs: 2 (f=2): [w(2)][100.0%][r=0KiB/s,w=27.0MiB/s][r=0,w=6920 IOPS][eta 00m:00s                                                                                                             ]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=60796: Wed Feb  3 13:52:55 2021
  write: IOPS=10.3k, BW=40.3MiB/s (42.2MB/s)(1212MiB/30082msec)
    slat (nsec): min=914, max=1065.1k, avg=6125.97, stdev=5997.43
    clat (usec): min=630, max=120768, avg=6193.55, stdev=6810.69
     lat (usec): min=640, max=120772, avg=6199.67, stdev=6810.76
    clat percentiles (usec):
     |  1.00th=[ 1221],  5.00th=[ 1631], 10.00th=[ 2114], 20.00th=[ 2868],
     | 30.00th=[ 3359], 40.00th=[ 3884], 50.00th=[ 4490], 60.00th=[ 5407],
     | 70.00th=[ 6456], 80.00th=[ 7570], 90.00th=[ 9503], 95.00th=[13042],
     | 99.00th=[43779], 99.50th=[46924], 99.90th=[52691], 99.95th=[56361],
     | 99.99th=[84411]
   bw (  KiB/s): min= 3416, max=53000, per=50.13%, avg=20682.73, stdev=10209.56,                                                                                                              samples=120
   iops        : min=  854, max=13250, avg=5170.65, stdev=2552.35, samples=120
  lat (usec)   : 750=0.01%, 1000=0.09%
  lat (msec)   : 2=7.85%, 4=35.67%, 10=48.40%, 20=4.14%, 50=3.65%
  lat (msec)   : 100=0.20%, 250=0.01%
  cpu          : usr=51.81%, sys=47.91%, ctx=30962, majf=2, minf=920
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,310273,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
  WRITE: bw=40.3MiB/s (42.2MB/s), 40.3MiB/s-40.3MiB/s (42.2MB/s-42.2MB/s), io=12                                                                                                             12MiB (1271MB), run=30082-30082msec

Disk stats (read/write):
    dm-1: ios=6/115, merge=0/0, ticks=4/92, in_queue=96, util=0.21%, aggrios=6/7                                                                                                             5, aggrmerge=0/52, aggrticks=4/60, aggrin_queue=64, aggrutil=0.21%
  sda: ios=6/75, merge=0/52, ticks=4/60, in_queue=64, util=0.21%

 Performance counter stats for process id '60534':

      28598.545752      task-clock (msec)         #    0.910 CPUs utilized                                                                                                                   
           116,226      context-switches          #    0.004 M/sec                                                                                                                           
                 0      cpu-migrations            #    0.000 K/sec                                                                                                                           
           227,249      page-faults               #    0.008 M/sec                                                                                                                           
    94,629,585,102      cycles                    #    3.309 GHz                                                                                                                             
   104,342,228,089      instructions              #    1.10  insn per cycle                                                                                                                  
    20,695,145,660      branches                  #  723.643 M/sec                                                                                                                           
       144,943,108      branch-misses             #    0.70% of all branches                                                                                                                 

      31.409968835 seconds time elapsed


...

crimson 4 KB random read, two fio job number instances

[build]$ perf stat -p `pgrep -u ${UID} crimson-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
[1] 60802
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=32
...
fio-3.1
Starting 2 processes
Jobs: 2 (f=2): [r(2)][100.0%][r=200MiB/s,w=0KiB/s][r=51.3k,w=0 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=60915: Wed Feb  3 13:57:02 2021
   read: IOPS=51.1k, BW=200MiB/s (209MB/s)(5988MiB/30002msec)
    slat (nsec): min=208, max=86509, avg=2060.61, stdev=1790.40
    clat (usec): min=372, max=11026, avg=1250.06, stdev=216.63
     lat (usec): min=373, max=11027, avg=1252.12, stdev=216.59
    clat percentiles (usec):
     |  1.00th=[  783],  5.00th=[  955], 10.00th=[ 1004], 20.00th=[ 1074],
     | 30.00th=[ 1156], 40.00th=[ 1205], 50.00th=[ 1237], 60.00th=[ 1287],
     | 70.00th=[ 1336], 80.00th=[ 1401], 90.00th=[ 1483], 95.00th=[ 1565],
     | 99.00th=[ 1827], 99.50th=[ 1958], 99.90th=[ 2540], 99.95th=[ 2999],
     | 99.99th=[ 5932]
   bw (  KiB/s): min=82696, max=123664, per=50.01%, avg=102210.81, stdev=3780.23, samples=120
   iops        : min=20674, max=30916, avg=25552.63, stdev=945.05, samples=120
  lat (usec)   : 500=0.01%, 750=0.71%, 1000=8.77%
  lat (msec)   : 2=90.11%, 4=0.38%, 10=0.02%, 20=0.01%
  cpu          : usr=52.78%, sys=47.08%, ctx=24975, majf=0, minf=383
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwt: total=1532969,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=200MiB/s (209MB/s), 200MiB/s-200MiB/s (209MB/s-209MB/s), io=5988MiB (6279MB), run=30002-30002msec

Disk stats (read/write):
    dm-1: ios=0/212, merge=0/0, ticks=0/224, in_queue=224, util=0.56%, aggrios=0/158, aggrmerge=0/114, aggrticks=0/188, aggrin_queue=188, aggrutil=0.62%
  sda: ios=0/158, merge=0/114, ticks=0/188, in_queue=188, util=0.62%

 Performance counter stats for process id '60534':

      29624.249555      task-clock (msec)         #    0.947 CPUs utilized
           120,274      context-switches          #    0.004 M/sec
                 0      cpu-migrations            #    0.000 K/sec
               526      page-faults               #    0.018 K/sec
    99,869,175,993      cycles                    #    3.371 GHz
    77,820,708,858      instructions              #    0.78  insn per cycle
    15,450,187,652      branches                  #  521.539 M/sec
        55,823,834      branch-misses             #    0.36% of all branches

      31.272487415 seconds time elapsed


...

ceph 4 KB random write, two fio job number instances

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_write.fio; sleep 1; killall -INT perf
[1] 49604
rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=32
...
fio-3.1
Starting 2 processes
Jobs: 2 (f=2): [w(2)][100.0%][r=0KiB/s,w=25.3MiB/s][r=0,w=6470 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=49718: Wed Feb  3 14:22:10 2021
  write: IOPS=9837, BW=38.4MiB/s (40.3MB/s)(1153MiB/30011msec)
    slat (nsec): min=1193, max=145706, avg=6339.22, stdev=1960.48
    clat (usec): min=1639, max=130910, avg=6498.44, stdev=3920.04
     lat (usec): min=1653, max=130916, avg=6504.78, stdev=3920.50
    clat percentiles (usec):
     |  1.00th=[ 2245],  5.00th=[ 2507], 10.00th=[ 2835], 20.00th=[ 3458],
     | 30.00th=[ 4178], 40.00th=[ 4948], 50.00th=[ 5932], 60.00th=[ 7177],
     | 70.00th=[ 8094], 80.00th=[ 8848], 90.00th=[10028], 95.00th=[11338],
     | 99.00th=[16909], 99.50th=[20579], 99.90th=[52167], 99.95th=[64750],
     | 99.99th=[84411]
   bw (  KiB/s): min= 2136, max=50824, per=50.02%, avg=19681.83, stdev=9943.27, samples=120
   iops        : min=  534, max=12706, avg=4920.40, stdev=2485.83, samples=120
  lat (msec)   : 2=0.02%, 4=27.38%, 10=62.72%, 20=9.35%, 50=0.41%
  lat (msec)   : 100=0.12%, 250=0.01%
  cpu          : usr=51.13%, sys=48.85%, ctx=1765, majf=0, minf=627
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,295242,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
  WRITE: bw=38.4MiB/s (40.3MB/s), 38.4MiB/s-38.4MiB/s (40.3MB/s-40.3MB/s), io=1153MiB (1209MB), run=30011-30011msec

Disk stats (read/write):
    dm-1: ios=0/281, merge=0/0, ticks=0/244, in_queue=244, util=0.63%, aggrios=0/188, aggrmerge=0/130, aggrticks=0/188, aggrin_queue=188, aggrutil=0.62%
  sda: ios=0/188, merge=0/130, ticks=0/188, in_queue=188, util=0.62%

 Performance counter stats for process id '49414':

      52931.246938      task-clock (msec)         #    1.692 CPUs utilized
           668,005      context-switches          #    0.013 M/sec
                60      cpu-migrations            #    0.001 K/sec
         1,804,535      page-faults               #    0.034 M/sec
   135,557,028,177      cycles                    #    2.561 GHz
    82,839,677,408      instructions              #    0.61  insn per cycle
    15,965,395,745      branches                  #  301.625 M/sec
        90,191,371      branch-misses             #    0.56% of all branches

      31.276089819 seconds time elapsed

...

ceph 4 KB random read, two fio job number instances

[build]$ perf stat -p `pgrep -u ${UID} ceph-osd` & fio ../rbd_read.fio; sleep 1; killall -INT perf
[1] 49723
rbd_iodepth32: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=32
...
fio-3.1
Starting 2 processes
Jobs: 2 (f=2): [r(2)][100.0%][r=170MiB/s,w=0KiB/s][r=43.6k,w=0 IOPS][eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=2): err= 0: pid=49836: Wed Feb  3 14:25:34 2021
   read: IOPS=43.0k, BW=172MiB/s (180MB/s)(5151MiB/30002msec)
    slat (nsec): min=239, max=51418, avg=3632.22, stdev=1794.78
    clat (usec): min=458, max=12421, avg=1451.93, stdev=224.48
     lat (usec): min=461, max=12423, avg=1455.56, stdev=224.48
    clat percentiles (usec):
     |  1.00th=[ 1123],  5.00th=[ 1270], 10.00th=[ 1319], 20.00th=[ 1352],
     | 30.00th=[ 1385], 40.00th=[ 1418], 50.00th=[ 1434], 60.00th=[ 1467],
     | 70.00th=[ 1483], 80.00th=[ 1516], 90.00th=[ 1582], 95.00th=[ 1631],
     | 99.00th=[ 1827], 99.50th=[ 2638], 99.90th=[ 4490], 99.95th=[ 4817],
     | 99.99th=[ 5735]
   bw (  KiB/s): min=76408, max=109224, per=50.01%, avg=87934.46, stdev=4504.50, samples=120
   iops        : min=19102, max=27306, avg=21983.60, stdev=1126.12, samples=120
  lat (usec)   : 500=0.01%, 750=0.12%, 1000=0.45%
  lat (msec)   : 2=98.74%, 4=0.48%, 10=0.21%, 20=0.01%
  cpu          : usr=52.62%, sys=47.31%, ctx=11843, majf=0, minf=389
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwt: total=1318768,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=172MiB/s (180MB/s), 172MiB/s-172MiB/s (180MB/s-180MB/s), io=5151MiB (5402MB), run=30002-30002msec

Disk stats (read/write):
    dm-1: ios=0/274, merge=0/0, ticks=0/256, in_queue=256, util=0.67%, aggrios=0/182, aggrmerge=0/128, aggrticks=0/200, aggrin_queue=20
  sda: ios=0/182, merge=0/128, ticks=0/200, in_queue=200, util=0.66%
chunmei@otccldstore05:~/ceph/build$
 Performance counter stats for process id '49414':

      57398.992807      task-clock (msec)         #    1.835 CPUs utilized
           514,295      context-switches          #    0.009 M/sec
                47      cpu-migrations            #    0.001 K/sec
         1,160,810      page-faults               #    0.020 M/sec
   186,093,071,128      cycles                    #    3.242 GHz
    95,721,598,532      instructions              #    0.51  insn per cycle
    18,618,111,072      branches                  #  324.363 M/sec
       151,239,795      branch-misses             #    0.81% of all branches

      31.285291582 seconds time elapsed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment