Bucket notifications are an important building block in the object storage ecosystem, and persistent bucket notifications in particular, since they let the system overcome broker outages. However, because persistent notifications are backed by a RADOS queue, they come at a cost: extra load on the RADOS cluster, and the inability to operate in environments where there is no RADOS backend. In this project, we would like to implement persistent bucket notifications in the RADOS Gateway using a Redis queue. Combined with the "zipper" project, this would let us enjoy bucket notifications with backends like posix, dbstore, daos, etc.
Note that, on top of using RADOS for the notification queue, our code depends on RADOS for its implementation of a distributed lock (to make sure that one and only one RGW serves a queue at any given point in time), as well as for storing the topic and notification configuration inside RADOS objects.
The first step would be to set up a Linux-based development environment; as a minimum you would need a 4-CPU machine with 8GB RAM and a 50GB disk. Unless you already have a Linux distro you like, I would recommend choosing from:
- Fedora (38/39) - my favorite!
- Ubuntu (22.04 LTS)
- WSL (Windows Subsystem for Linux), though it would probably take much longer...
- RHEL9/Centos9
- Other Linux distros - try at your own risk :-)
Once you have that up and running, you should clone the Ceph repo from github (https://github.com/ceph/ceph). If you don’t know what github and git are, this is the right time to close these gaps :-) And yes, you should have a github account, so you can later share your work on the project.
To install any missing system dependencies, use:
./install-deps.sh
Note that the first build may take a long time, so the following cmake parameters could be used to minimize the build time.
With a fresh Ceph clone, use the following:
./do_cmake.sh -DBOOST_J=$(nproc) -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DWITH_MGR_DASHBOARD_FRONTEND=OFF \
-DWITH_DPDK=OFF -DWITH_SPDK=OFF -DWITH_SEASTAR=OFF -DWITH_CEPHFS=OFF -DWITH_RBD=OFF -DWITH_KRBD=OFF -DWITH_CCACHE=OFF
If the build directory already exists, you can rebuild the ninja files by using (from the build directory):
cmake -DBOOST_J=$(nproc) -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DWITH_MGR_DASHBOARD_FRONTEND=OFF \
-DWITH_DPDK=OFF -DWITH_SPDK=OFF -DWITH_SEASTAR=OFF -DWITH_CEPHFS=OFF -DWITH_RBD=OFF -DWITH_KRBD=OFF -DWITH_CCACHE=OFF ..
Then invoke the build process (using ninja) from within the build directory (created by do_cmake.sh).
Assuming the build was completed successfully, you can run the unit tests (see: https://github.com/ceph/ceph#running-unit-tests).
Now you are ready to run the Ceph processes, as explained here: https://github.com/ceph/ceph#running-a-test-cluster. You would probably also like to check the developer guide (https://docs.ceph.com/docs/master/dev/developer_guide/) and learn more about how to build Ceph and run it locally (https://docs.ceph.com/docs/master/dev/quick_guide/). Ceph's bucket notification documentation:
- https://docs.ceph.com/en/latest/radosgw/notifications/
- notification as part of the bucket operations API: https://docs.ceph.com/en/latest/radosgw/s3/bucketops/#create-notification
- S3 compatibility: https://docs.ceph.com/en/latest/radosgw/s3-notification-compatibility/
Run bucket notification tests for persistent notifications using an HTTP endpoint:
- start the vstart cluster:
$ MON=1 OSD=1 MDS=0 MGR=0 RGW=1 ../src/vstart.sh -n -d
- on a separate terminal start an HTTP endpoint:
$ wget https://gist.githubusercontent.com/mdonkers/63e115cc0c79b4f6b8b3a6b797e485c7/raw/a6a1d090ac8549dac8f2bd607bd64925de997d40/server.py
$ python server.py 10900
- install the aws cli tool (https://github.com/aws/aws-cli)
- configure the tool according to the access and secret keys shown in the output of the vstart.sh command
- set the region to default
- create a persistent topic pointing to the above HTTP endpoint:
$ aws --endpoint-url http://localhost:8000 sns create-topic --name=fishtopic \
--attributes='{"push-endpoint": "http://localhost:10900", "persistent": "true"}'
- create a bucket:
$ aws --endpoint-url http://localhost:8000 s3 mb s3://fish
- create a notification on that bucket, pointing to the above topic:
$ aws --endpoint-url http://localhost:8000 s3api put-bucket-notification-configuration --bucket fish \
--notification-configuration='{"TopicConfigurations": [{"Id": "notif1", "TopicArn": "arn:aws:sns:default::fishtopic", "Events": []}]}'
Leaving the event list empty is equivalent to setting it to ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"].
- create a file, and upload it:
$ head -c 512 </dev/urandom > myfile
$ aws --endpoint-url http://localhost:8000 s3 cp myfile s3://fish
- on the HTTP terminal, see the JSON output of the notifications
Using the boost Redis client:
- write a standalone client that pushes to the Redis Queue (input could be stdin, or any other option); a minimal sketch is shown after this list
- write a standalone client that pulls from the Redis Queue (output could be stdout, or any other option)
Note that currently Ceph has a Redis cpp client as a submodule, but this will soon be changed in favor of boost Redis.
- optional: using gtest, write a unit test that exercises the above clients
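To make the first task concrete, here is a minimal sketch of such a push client using Boost.Redis and C++20 coroutines. It is only an illustration: the list name "rgw-notify-queue" is an arbitrary placeholder (not an existing RGW name), and the connection uses the Boost.Redis defaults (127.0.0.1:6379).

// queue_push.cpp - hypothetical standalone client that reads lines from stdin
// and pushes each of them onto a Redis list acting as the notification queue.
#include <boost/redis/connection.hpp>
#include <boost/asio/awaitable.hpp>
#include <boost/asio/co_spawn.hpp>
#include <boost/asio/consign.hpp>
#include <boost/asio/deferred.hpp>
#include <boost/asio/detached.hpp>
#include <boost/asio/io_context.hpp>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

namespace asio = boost::asio;
using boost::redis::config;
using boost::redis::connection;
using boost::redis::request;
using boost::redis::response;

asio::awaitable<void> push_entries(std::shared_ptr<connection> conn,
                                   std::vector<std::string> entries) {
  for (const auto& entry : entries) {
    request req;
    req.push("LPUSH", "rgw-notify-queue", entry);  // enqueue one entry
    response<long long> resp;                      // LPUSH returns the new list length
    co_await conn->async_exec(req, resp, asio::deferred);
    std::cout << "queue length: " << std::get<0>(resp).value() << std::endl;
  }
  conn->cancel();  // stop the connection so io_context::run() can return
}

int main() {
  // read stdin up front so the io_context thread is never blocked on input
  std::vector<std::string> entries;
  for (std::string line; std::getline(std::cin, line);)
    entries.push_back(line);

  asio::io_context ioc;
  auto conn = std::make_shared<connection>(ioc.get_executor());
  config cfg;  // defaults to 127.0.0.1:6379
  conn->async_run(cfg, {}, asio::consign(asio::detached, conn));
  asio::co_spawn(ioc, push_entries(conn, std::move(entries)), asio::detached);
  ioc.run();
}

The pull client would be symmetric: RPOP (or a blocking BRPOP) from the same list in a loop, writing each entry to stdout.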
- create an abstraction layer that could be implemented using cls_2pc_queue or a Redis client (a possible interface is sketched after this list)
- use this abstraction layer in the bucket notification code and break its dependency on RADOS
- implement using the boost redis client (which will soon be incorporated as a submodule in the Ceph source tree)
- add a config option to select the implementation used at runtime
- add test and setup instructions when using RADOS backend
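As a rough illustration of what such an abstraction layer might look like (the names and signatures below are hypothetical, not the actual RGW interface), the notification code could program against a small queue interface with a two-phase-commit flavor, with one implementation wrapping cls_2pc_queue over RADOS and another wrapping a Redis list:

// hypothetical queue abstraction - names and signatures are illustrative only
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

namespace rgw_notify_queue {

using reservation_id = std::string;  // opaque handle for a pending reservation

class Queue {
 public:
  virtual ~Queue() = default;

  // producer side, two-phase: reserve space, then commit (or abort) the entries
  virtual int reserve(std::size_t size, reservation_id& res) = 0;
  virtual int commit(const reservation_id& res,
                     const std::vector<std::string>& entries) = 0;
  virtual int abort(const reservation_id& res) = 0;

  // consumer side: list a batch of entries, then ack the ones already delivered
  virtual int list(std::size_t max_entries, std::vector<std::string>& entries,
                   std::string& marker, bool& truncated) = 0;
  virtual int ack(const std::string& end_marker) = 0;
};

// factory selected by a runtime config option, e.g. "rados" or "redis"
std::unique_ptr<Queue> make_queue(const std::string& backend,
                                  const std::string& queue_name);

}  // namespace rgw_notify_queue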
- create an abstraction layer for cls lock
- use this abstraction layer in the bucket notification code
- implement using our cpp redis submodule; see: redis distributed lock (a minimal sketch follows this list)
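For the lock, here is a minimal single-Redis sketch of the common SET NX PX / compare-and-delete pattern, expressed with Boost.Redis rather than the current cpp redis submodule. The lock name and cookie are illustrative, and a real replacement for cls lock would also need lock renewal while the queue is being served (and possibly a multi-node scheme):

// hypothetical lock helpers, assuming an already-running boost::redis::connection
#include <boost/redis/connection.hpp>
#include <boost/asio/awaitable.hpp>
#include <boost/asio/deferred.hpp>
#include <optional>
#include <string>

namespace asio = boost::asio;
using boost::redis::connection;
using boost::redis::request;
using boost::redis::response;

// try to take ownership of lock_name for ttl_ms; cookie identifies this RGW instance
asio::awaitable<bool> try_lock(connection& conn, const std::string& lock_name,
                               const std::string& cookie, int ttl_ms) {
  request req;
  // SET key value NX PX ttl -- succeeds only if the key does not already exist
  req.push("SET", lock_name, cookie, "NX", "PX", std::to_string(ttl_ms));
  response<std::optional<std::string>> resp;  // "OK" on success, nil otherwise
  co_await conn.async_exec(req, resp, asio::deferred);
  co_return std::get<0>(resp).value().has_value();
}

// release only if we still own the lock (compare the stored cookie atomically)
asio::awaitable<void> unlock(connection& conn, const std::string& lock_name,
                             const std::string& cookie) {
  static const std::string script =
      "if redis.call('get', KEYS[1]) == ARGV[1] then "
      "  return redis.call('del', KEYS[1]) "
      "else return 0 end";
  request req;
  req.push("EVAL", script, "1", lock_name, cookie);
  response<long long> resp;  // 1 if the lock was released, 0 otherwise
  co_await conn.async_exec(req, resp, asio::deferred);
}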
- move rgw_notify.cc outside of the driver/rados directory
- test and setup instructions when using a non-RADOS backend (e.g. posix)
Hello there, this is Adarsh here. You mentioned downloading the awc cli tool as part of the steps to run bucket notification tests for persistent notifications using an HTTP endpoint, but the closest I could find on Google is the aws-cli tool (https://github.com/aws/aws-cli). Could you please confirm if this is what I'd have to download?