Setup

start a vstart cluster with RGW

Alternative 1

Add object locking to all bucket creations via a Lua script.

  • upload the following script in prerequest context:
-- enable object lock on bucket creation

radosgw-admin UX and documentation improvements

Background

Currently, documenting radosgw-admin commands is a manual and error-prone process. After implementing a new command, the "usage" string has to be updated accordingly in the code, and what is documented there can easily drift from the actual command and its arguments. After that, the man page needs to be updated manually, as well as the admin guide. Any references to this command in other places in our documentation also need to be manually updated.

We would like to solve this problem with a more programmatic approach:

  • Declare command & argument semantics explicitly in code using a cli/args framework that supports auto-generation of context-aware "usage" docs
  • Investigate how this can then be used to auto-generate the man page, admin guides and any other related documentation (maybe using some python code)
  • See if we can easily reference these command descriptions in other places in our documentation
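
As a rough illustration of the first two bullets, here is a minimal sketch in Python using the standard `argparse` module. It is only a stand-in for whatever cli/args framework ends up being used in radosgw-admin itself. The program name `rgw-admin-sketch`, the `bucket-list` command and its help strings are hypothetical and not taken from the real usage text. The commands are declared once, and both the "usage" text and a plain-text doc section are generated from that single declaration:

```python
import argparse


def build_parser():
    """Declare commands and their arguments in one place."""
    parser = argparse.ArgumentParser(prog="rgw-admin-sketch")
    subparsers = parser.add_subparsers(dest="command")
    commands = {}

    # hypothetical command and option, for illustration only
    bucket_list = subparsers.add_parser("bucket-list", help="list buckets")
    bucket_list.add_argument("--uid", help="list only buckets owned by this user")
    commands["bucket-list"] = bucket_list

    return parser, commands


def render_docs(commands):
    """Generate one doc section per command from the same declarations."""
    return "\n".join(sub.format_help() for _, sub in sorted(commands.items()))


if __name__ == "__main__":
    _, declared = build_parser()
    print(render_docs(declared))
```

The output of `render_docs` could then be fed into a man page or admin guide template, so the docs cannot go out of sync with the declared arguments.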

Kafka Security

Background

Bucket notification integration with Kafka is a very useful feature in the RGW. However, some security features needed for such integrations are missing, so in this project we will try to make bucket notifications over Kafka more secure. The following features are missing:

The main challenge in the above would be in automating the tests, so they could easily run locally,

# code
  - you can and SHOULD use AI when writing code
  - please avoid some of the things that AI likes to do in code:
    - write long, unnecessary comments on self explanatory code
    - use non-ascii characters in comments
    - generate redundant (even though it may be correct) code
    - write repetitive code that could be easily refactored
    - reimplement functionality that can be taken from a library
  - in short, PRs that are AI generated without human guidance tend to be unnecessarily long

RGW tcmalloc Profiling

Background

All daemons in Ceph use tcmalloc as the memory allocator to achieve better performance. A recent PR added the ability to get information on how tcmalloc performs in the RGW. In this project, we should use the profiling information from RGW runs to tune the tcmalloc parameters so that they are better suited to the memory use of the RGW.

Evaluation Stage

Step 1 - Build Ceph and Run Basic Tests

First, set up a Linux based development environment. As a minimum you would need a machine with 4 CPUs, 8GB RAM and a 50GB disk. Unless you already have a Linux distro you like, I would recommend choosing from:

Instructions for Fedora 36

Java

install java. currently gradle does not work with jdk versions higher than 11, so we would need to:

sudo dnf install java-11-openjdk-devel.x86_64

if another version is already installed, use:

sudo alternatives --config java
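
To double check which JDK is active after switching, something like the following Python sketch could be used. It assumes the usual `java -version` output format (a quoted version string such as "11.0.18", or the legacy "1.8.0_362" form), and the helper names are made up for this example:

```python
import re
import subprocess

# from the note above: gradle does not work with a jdk higher than 11
MAX_SUPPORTED_MAJOR = 11


def parse_java_major(version_output: str) -> int:
    """Extract the major version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a version string")
    major = int(match.group(1))
    # legacy scheme: "1.8.0_362" means java 8
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def active_java_major() -> int:
    """Run `java -version` (which prints to stderr) and parse the result."""
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    return parse_java_major(result.stderr)
```

Calling `active_java_major() <= MAX_SUPPORTED_MAJOR` then tells whether the currently selected java should work with gradle.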
# channel for rgw to send notifications
apiVersion: messaging.knative.dev/v1
kind: InMemoryChannel
metadata:
  name: text-channel
---
# subscription for the python-ceph-vectordb app which listens for notifications from the channel
apiVersion: messaging.knative.dev/v1
kind: Subscription
metadata:
Setup

  • start k8s with minikube with an extra disk:
minikube start --extra-disks=1 --driver=kvm2
  • install rook:
kubectl create -f https://raw.githubusercontent.com/rook/rook/refs/heads/master/deploy/examples/crds.yaml
kubectl create -f https://raw.githubusercontent.com/rook/rook/refs/heads/master/deploy/examples/common.yaml
kubectl create -f https://raw.githubusercontent.com/rook/rook/refs/heads/master/deploy/examples/operator.yaml

Basic Bucket Logging Testing

  • to enable our extension to the API when using python (boto3 or aws CLI), the following file has to be placed under ~/.aws/models/s3/2006-03-01/ (the directory should be created if it does not exist)
  • currently there is no generic solution for other client SDKs
  • start a vstart cluster
  • create a bucket:
aws --endpoint-url http://localhost:8000 s3 mb s3://fish
  • create a log bucket:
yuvalif / zshrc
Last active January 19, 2026 09:13
# zshrc
# set path
export PATH=$HOME/bin:/usr/local/bin:$HOME/go/bin:$HOME/.local/bin:/app/bin/:$PATH
# golang
export GOPROXY=https://proxy.golang.org,direct
export GOBIN=$HOME/go/bin
# for other themes, see: https://github.com/ohmyzsh/ohmyzsh/wiki/Themes