vstart

setup

  • start:
MON=1 OSD=1 MDS=0 MGR=0 RGW=1 ../src/vstart.sh -n -d -o rgw_max_objs_per_shard=50 -o rgw_reshard_thread_interval=60
  • verify the values were set and that dynamic resharding is enabled (see the loop sketch at the end of this list):
bin/ceph -c ceph.conf daemon out/radosgw.8000.asok config get rgw_max_objs_per_shard
bin/ceph -c ceph.conf daemon out/radosgw.8000.asok config get rgw_dynamic_resharding
bin/ceph -c ceph.conf daemon out/radosgw.8000.asok config get rgw_reshard_thread_interval
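  • the same checks as a loop (a sketch; assumes the same ceph.conf and admin socket as above):
# each call should print the option with its current value as JSON, e.g. {"rgw_max_objs_per_shard": "50"}
for opt in rgw_max_objs_per_shard rgw_dynamic_resharding rgw_reshard_thread_interval; do
  bin/ceph -c ceph.conf daemon out/radosgw.8000.asok config get "$opt"
done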

test

  • create the bucket:
hsbench -a 0555b35654ad1656d804 -s h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q== -u http://localhost:8000 -bp my-bucket -b 1 -r default -m i
  • upload fewer than 550 objects (50 objects per shard × the default 11 bucket index shards = 550, the reshard threshold):
hsbench -a 0555b35654ad1656d804 -s h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q== -u http://localhost:8000 -bp my-bucket -b 1 -r default -m p -z 4K -d 10 -op obj1
  • verify there was no reshard:
bin/radosgw-admin -c ceph.conf bucket limit check
  • upload more than 550 objects:
hsbench -a 0555b35654ad1656d804 -s h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q== -u http://localhost:8000 -bp my-bucket -b 1 -r default -m p -z 4K -d 10 -op obj2

verify that a reshard happened (this may take a while).
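a sketch of the check (assumes hsbench created the bucket as my-bucket000000000000, the name used for the same -bp prefix later in these notes, and that bucket stats reports num_shards on this version):

bin/radosgw-admin -c ceph.conf reshard list
bin/radosgw-admin -c ceph.conf bucket stats --bucket my-bucket000000000000 | grep num_shards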

multisite

setup

  • start:
MON=1 OSD=1 MDS=0 MGR=0 ../src/test/rgw/test-rgw-multisite.sh 2 --rgw_max_objs_per_shard=50 \
    --rgw_reshard_thread_interval=60 --rgw_user_quota_bucket_sync_interval=90 \
    --rgw_data_notify_interval_msec=0 --rgw_data_log_num_shards=1 --rgw_sync_log_trim_interval=0
  • verify the values were set and that dynamic resharding is enabled (see the loop sketch at the end of this list):
bin/ceph -c run/c2/ceph.conf daemon run/c2/out/radosgw.8002.asok config get rgw_max_objs_per_shard
bin/ceph -c run/c2/ceph.conf daemon run/c2/out/radosgw.8002.asok config get rgw_dynamic_resharding
bin/ceph -c run/c2/ceph.conf daemon run/c2/out/radosgw.8002.asok config get rgw_reshard_thread_interval
bin/ceph -c run/c2/ceph.conf daemon run/c2/out/radosgw.8002.asok config get rgw_user_quota_bucket_sync_interval
bin/ceph -c run/c2/ceph.conf daemon run/c2/out/radosgw.8002.asok config get rgw_data_notify_interval_msec
bin/ceph -c run/c2/ceph.conf daemon run/c2/out/radosgw.8002.asok config get rgw_data_log_num_shards
bin/ceph -c run/c2/ceph.conf daemon run/c2/out/radosgw.8002.asok config get rgw_sync_log_trim_interval
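  • the same checks on both gateways as a loop (a sketch; assumes cluster 1's gateway exposes run/c1/out/radosgw.8001.asok, matching the run/c2 paths above):
for c in 1 2; do
  for opt in rgw_max_objs_per_shard rgw_dynamic_resharding rgw_reshard_thread_interval \
             rgw_user_quota_bucket_sync_interval rgw_data_notify_interval_msec \
             rgw_data_log_num_shards rgw_sync_log_trim_interval; do
    bin/ceph -c run/c$c/ceph.conf daemon run/c$c/out/radosgw.800$c.asok config get "$opt"
  done
done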

test dynamic reshard

  • create the bucket:
hsbench -a 1234567890 -s pencil -u http://localhost:8001 -bp my-bucket -b 1 -r zg1 -m i
  • upload fewer than 550 objects:
hsbench -a 1234567890 -s pencil -u http://localhost:8001 -bp my-bucket -b 1 -r zg1 -m p -z 4K -d 10 -op obj1
  • verify there was no reshard (and that sync worked fine):
bin/radosgw-admin -c run/c1/ceph.conf bucket limit check
bin/radosgw-admin -c run/c2/ceph.conf bucket limit check
  • upload more than 550 objects:
hsbench -a 1234567890 -s pencil -u http://localhost:8001 -bp my-bucket -b 1 -r zg1 -m p -z 4K -d 10 -op obj2

verify that the reshard and the sync happened (the reshard should take ~1min).
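a sketch of the check (bucket name as used in the status commands below; assumes bucket stats reports num_shards on this version):

bin/radosgw-admin -c run/c1/ceph.conf bucket stats --bucket my-bucket000000000000 | grep num_shards
bin/radosgw-admin -c run/c2/ceph.conf bucket stats --bucket my-bucket000000000000 | grep num_shards
bin/radosgw-admin -c run/c2/ceph.conf bucket sync status --bucket my-bucket000000000000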

test manual reshard

  • create the bucket:
hsbench -a 1234567890 -s pencil -u http://localhost:8001 -bp another-bucket -b 1 -r zg1 -m i
  • upload fewer than 550 objects:
hsbench -a 1234567890 -s pencil -u http://localhost:8001 -bp another-bucket -b 1 -r zg1 -m p -z 4K -d 10 -op obj1
  • verify there was no reshard (and that sync worked fine):
bin/radosgw-admin -c run/c1/ceph.conf bucket limit check
bin/radosgw-admin -c run/c2/ceph.conf bucket limit check
  • manually change the number of shards:
bin/radosgw-admin -c run/c1/ceph.conf reshard add --bucket another-bucket000000000000 --num-shards=21

verify reshard happened after ~1min.
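a sketch of the check (num_shards should read 21 once the reshard ran; assumes bucket stats reports num_shards on this version):

bin/radosgw-admin -c run/c1/ceph.conf reshard status --bucket another-bucket000000000000
bin/radosgw-admin -c run/c1/ceph.conf bucket stats --bucket another-bucket000000000000 | grep num_shards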

status commands

bin/radosgw-admin -c run/c1/ceph.conf reshard list
bin/radosgw-admin -c run/c1/ceph.conf reshard status --bucket my-bucket000000000000

multisite (manual setup)

setup

follow the instructions from here, then set the following conf parameters (a loop version is sketched after the commands):

bin/ceph -c run/cluster1/ceph.conf daemon run/cluster1/out/radosgw.8001.asok config set rgw_max_objs_per_shard 50
bin/ceph -c run/cluster2/ceph.conf daemon run/cluster2/out/radosgw.8002.asok config set rgw_max_objs_per_shard 50
bin/ceph -c run/cluster1/ceph.conf daemon run/cluster1/out/radosgw.8001.asok config set rgw_reshard_thread_interval 60
bin/ceph -c run/cluster2/ceph.conf daemon run/cluster2/out/radosgw.8002.asok config set rgw_reshard_thread_interval 60
bin/ceph -c run/cluster1/ceph.conf daemon run/cluster1/out/radosgw.8001.asok config set rgw_data_notify_interval_msec 0
bin/ceph -c run/cluster2/ceph.conf daemon run/cluster2/out/radosgw.8002.asok config set rgw_data_notify_interval_msec 0
bin/ceph -c run/cluster1/ceph.conf daemon run/cluster1/out/radosgw.8001.asok config set rgw_data_log_num_shards 1
bin/ceph -c run/cluster2/ceph.conf daemon run/cluster2/out/radosgw.8002.asok config set rgw_data_log_num_shards 1
bin/ceph -c run/cluster1/ceph.conf daemon run/cluster1/out/radosgw.8001.asok config set rgw_sync_log_trim_interval 0
bin/ceph -c run/cluster2/ceph.conf daemon run/cluster2/out/radosgw.8002.asok config set rgw_sync_log_trim_interval 0
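the same settings can be applied in a loop (a sketch; assumes the cluster and admin socket paths used above):

for c in 1 2; do
  conf=run/cluster$c/ceph.conf
  sock=run/cluster$c/out/radosgw.800$c.asok
  bin/ceph -c $conf daemon $sock config set rgw_max_objs_per_shard 50
  bin/ceph -c $conf daemon $sock config set rgw_reshard_thread_interval 60
  bin/ceph -c $conf daemon $sock config set rgw_data_notify_interval_msec 0
  bin/ceph -c $conf daemon $sock config set rgw_data_log_num_shards 1
  bin/ceph -c $conf daemon $sock config set rgw_sync_log_trim_interval 0
done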

then restart both clusters (some of the changes only take effect on restart).

test dynamic reshard

  • create the bucket:
hsbench -a "$access_key" -s "$secret_key" -u http://localhost:8001 -bp my-bucket -b 1 -r mygroup -m i
  • upload objects at a high rate, causing a dynamic reshard mid-sync:
hsbench -a "$access_key" -s "$secret_key" -u http://localhost:8001 -bp my-bucket -t 2 -b 1 -r mygroup -m p -z 4K -d 200 -op obj1

check status

  • overall sync status:
bin/radosgw-admin -c run/cluster1/ceph.conf sync status
bin/radosgw-admin -c run/cluster2/ceph.conf sync status
  • bucket sync status:
bin/radosgw-admin -c run/cluster1/ceph.conf bucket sync status --bucket my-bucket000000000000
bin/radosgw-admin -c run/cluster2/ceph.conf bucket sync status --bucket my-bucket000000000000
  • compare the actual object listings on both sides (a count check follows the diff):
AWS_ACCESS_KEY_ID="$access_key" AWS_SECRET_ACCESS_KEY="$secret_key" aws --endpoint-url http://localhost:8001 s3 ls s3://my-bucket000000000000 > rgw1
AWS_ACCESS_KEY_ID="$access_key" AWS_SECRET_ACCESS_KEY="$secret_key" aws --endpoint-url http://localhost:8002 s3 ls s3://my-bucket000000000000 > rgw2
diff rgw1 rgw2
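  • if sync has fully caught up the diff should be empty; a quick count comparison (sketch):
wc -l rgw1 rgw2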