RFC: Lease-based quotad
Current Design
-------------------------
Quota translator: Enforces the limits. To learn the current size of a directory, it sends a lookup to quotad for that directory.
The values are cached until the soft/hard timeout expires (to reduce IPC).
Quotad: Separate process that aggregates the size of a directory from all the brick server processes.
Marker translator: A server translator that accounts for and maintains the size of each directory.
In the current design the quota translator has to make IPC calls periodically, based on timeouts configured at the volume level.
The timeouts also leave room for overshooting the quota limit, because writes are checked against the cached size until it is refreshed.
Secondly, we use a single quotad on each node in the trusted storage pool, so all volumes with quota enabled share it.
In a configuration with a large number of volumes this could cause performance delays.
Thirdly, in a configuration where a single volume has a large number of bricks (say, a distribute-only volume with 20 bricks),
quotad creates as many lookup requests as there are bricks (to service one request, we end up generating 20 lookup requests).
This introduces a scaling limit on the number of bricks in a quota-enabled volume.
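The first and third problems can be illustrated with a small simulation. This is only a sketch under stated assumptions: `CachedQuotaCheck`, `simulate` and all parameter values are hypothetical names and numbers chosen for illustration, time is modelled as integer ticks, and a single timeout stands in for the soft/hard timeouts. It is not GlusterFS code.

```python
class CachedQuotaCheck:
    """Hypothetical model of the quota translator's cached size check."""

    def __init__(self, hard_limit, timeout, lookup):
        self.hard_limit = hard_limit
        self.timeout = timeout    # ticks for which a cached size stays valid
        self.lookup = lookup      # stands in for the lookup sent to quotad
        self.cached_size = None
        self.cached_at = None
        self.refreshes = 0        # number of IPC round trips to quotad

    def allow_write(self, now):
        expired = self.cached_at is None or now - self.cached_at >= self.timeout
        if expired:
            self.cached_size = self.lookup()
            self.cached_at = now
            self.refreshes += 1
        # Between refreshes the decision is made on a possibly stale size.
        return self.cached_size < self.hard_limit


def simulate(hard_limit=80, timeout=5, write_size=10, ticks=12, bricks=20):
    brick_sizes = [0] * bricks
    stats = {"brick_lookups": 0}

    def quotad_lookup():
        # quotad asks every brick for its part of the directory size.
        stats["brick_lookups"] += len(brick_sizes)
        return sum(brick_sizes)

    check = CachedQuotaCheck(hard_limit, timeout, quotad_lookup)
    for now in range(ticks):
        if check.allow_write(now):
            brick_sizes[now % bricks] += write_size
    return sum(brick_sizes), check.refreshes, stats["brick_lookups"]


print(simulate())           # (100, 3, 60): 20 units over the limit of 80
print(simulate(timeout=1))  # (80, 12, 240): no overshoot, four times the lookups
```

In this toy model a longer timeout saves IPC but lets usage run past the limit, while a shorter timeout holds the limit at the cost of one lookup per brick on every refresh.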
Alternative approach
-------------------------------
Change the functionality of quotad to be that of a grantor/revoker/redistributor of leases.
Initially quotad would distribute the lease (i.e. sublease it to the bricks).
Marker translator functionality remains as is; it would account for the current usage.
Quota translator would permit IO as long as the sublease has not expired. When the sublease expires, it makes
a request to quotad to extend the lease. Quotad can do this by contacting just a subset of the bricks in the volume.
If quotad fails to extend the lease, the hard limit has been reached.
This leads to a significant reduction in IPC, since we do IPC only at lease expiry.
Secondly, during such IPC we don't always have to talk to all bricks in a volume.
We will have to work through the details of how the lease is managed, though.
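One possible shape of the lease flow, as a minimal sketch. It rests on assumptions the RFC does not settle: a sublease is treated as a byte budget per brick, "expired" means that budget is used up, the limit is split evenly at the start, and quotad extends a sublease by reclaiming unused budget from other bricks one at a time. `Brick`, `LeaseQuotad` and `write` are hypothetical names, not existing GlusterFS interfaces.

```python
class Brick:
    def __init__(self, name):
        self.name = name
        self.used = 0      # what the marker translator would account for
        self.sublease = 0  # budget this brick may consume without IPC

    def spare(self):
        return self.sublease - self.used


class LeaseQuotad:
    """Hypothetical quotad acting as grantor/revoker/redistributor."""

    def __init__(self, hard_limit, bricks):
        self.bricks = bricks
        self.bricks_contacted = 0
        share = hard_limit // len(bricks)  # assumption: even initial split
        for brick in bricks:
            brick.sublease = share

    def extend(self, requester, amount):
        """Move `amount` of unused budget to `requester`.

        Returns False when the other bricks cannot cover it, which is
        treated as the hard limit being reached.
        """
        donors, found = [], 0
        for brick in self.bricks:
            if brick is requester:
                continue
            self.bricks_contacted += 1
            take = min(brick.spare(), amount - found)
            if take > 0:
                donors.append((brick, take))
                found += take
            if found == amount:
                break  # enough budget found: remaining bricks are not contacted
        if found < amount:
            return False
        for brick, take in donors:
            brick.sublease -= take  # revoke from donors
        requester.sublease += amount
        return True


def write(quotad, brick, size):
    """Quota translator check: IPC happens only when the sublease runs out."""
    if brick.spare() < size:
        if not quotad.extend(brick, size - brick.spare()):
            return False
    brick.used += size
    return True
```

With four bricks and a limit of 100, a brick can write 25 units with no IPC at all, and a small extension is served by contacting a single donor brick. Because the subleases never sum to more than the limit, this model cannot overshoot, but it also shows why the soft limit is harder: no component holds the aggregated size.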
Another advantage here is that we could have events for sublease expiry on bricks (as opposed to the total limit over the volume).
The admin could use these events to, say, extend the LVM volume group underneath.
One issue here is that supporting the soft limit will be tricky (since we don't aggregate sizes).