[TOC]
## Problem Statement
At present, Gluster for Swift (G4S) allows only one account to reside in a Gluster volume. An account maps to a single Gluster volume, i.e., the root directory of the Gluster volume mount point (`/mnt/gluster-object/$vol_name`) serves as the account.
G4S needs to allow multiple accounts to reside in the same Gluster volume. To enable such a feature, an account would need to be implemented as a subdirectory (child) of a Gluster volume mount point's root directory (`/mnt/gluster-object/$vol_name/$acc_name`).
## Proposed Design and Implementation
The ring files "map the names of entities stored on disk and their physical location" [1]. At present, only the volume name is stored.
One of the fields of the ring data structure is a list of devices in the cluster. Each element of the device list is a dictionary that identifies the drive where the data actually resides. The 'device' key of the dictionary corresponds to the "on-disk name of the device on the server" [2]. G4S currently sets the 'device' key to the Gluster volume name. The idea is to use the dictionary to also store the account name, by setting the pre-existing but currently unused 'meta' key [3] of the dictionary to the account name.
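For illustration, a single entry in the ring's device list under the proposed scheme might look as follows (the field values are hypothetical examples; the key names are those used by Swift's ring data structure):

```python
# A sketch of one entry in the ring's device list under the proposed
# scheme: 'device' still names the Gluster volume, while the previously
# unused 'meta' field now carries the account name.
device_entry = {
    "id": 0,
    "region": 1,
    "zone": 1,
    "ip": "127.0.0.1",
    "port": 6012,
    "device": "vol1",   # GlusterFS volume name (on-disk device name)
    "meta": "acc1",     # account residing in that volume (new usage)
    "weight": 100.0,
}
```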
The above would be done by modifying the gluster-swift-gen-builders script, which builds the ring files for G4S. The script would now take both volume names and the accounts (to be created in the volumes) as input parameters.
E.g., pass the volume names and account names to the ring builder script as follows:

```
# gluster-swift-gen-builders acc1[:vol1] acc2[:vol2] ...
```
whereby
- the volume part of the argument (after the colon) is optional; if not given, it is assumed to be the same as the account name (thus providing backward compatibility for the CLI);
- account names are required to be pairwise distinct (as the account → volume mapping should be well defined).
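The argument handling described above can be sketched as follows (the function name and structure are illustrative, not the actual script code):

```python
def parse_account_args(args):
    """Parse 'acc[:vol]' arguments into an account -> volume mapping.

    If the volume part is omitted, it defaults to the account name
    (preserving the old CLI behavior). Duplicate accounts are rejected
    so that the account -> volume mapping stays well defined.
    """
    mapping = {}
    for arg in args:
        account, sep, volume = arg.partition(":")
        if not sep:  # no colon given: volume defaults to the account name
            volume = account
        if account in mapping:
            raise ValueError("duplicate account: %s" % account)
        mapping[account] = volume
    return mapping

# parse_account_args(["acc1:vol1", "acc2"])
#   -> {"acc1": "vol1", "acc2": "acc2"}
```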
In the gluster-swift-gen-builders script the devices would be added to the cluster using the following command,
```
swift-ring-builder <builder_file> add [--region <region>] --zone <zone> --ip <ip> \
    --port <port> --replication-ip <r_ip> --replication-port <r_port> \
    --device <GlusterFS volume name> --meta <account name> --weight <weight>
```
The present gluster-swift-gen-builders script creates new builder files each time it is run. This means that every time a G4S user wants to add a device to a cluster, she would also have to pass the previously existing devices in the cluster as command line arguments to the script. A simple solution would be to first create an interface to obtain the list of existing devices in the cluster, and then pass it along with the new additional device as the command line arguments. Note that the devices in the new design could also be account:device pairs, where device is a Gluster volume.
```
# acc2v_old=`gluster-swiftlist-accounts`; gluster-swift-gen-builders $acc2v_old newacc1:vol1
```
- The REST client makes a request of the form /account[/container[/object]]. [4]
- The request is intercepted by the Swift proxy server, which looks up the device (G4S: volume) corresponding to this URL [5] and passes the request on to the appropriate internal server (account / container / object).
In detail:

1. The main Swift routine that takes care of routing is `swift.proxy.controllers.base.Controller.GETorHEAD_base`.
2. It fetches the backend parameters from the ring in the form of `node` dicts via the `iter_nodes` method (in the case of G4S there is just a single `node`).
3. That is passed down to the `http_connect` method, which sends the request to the appropriate internal server using the `/device/partition/account[/container[/object]]` address format.
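The address construction in the last step can be sketched as follows (a simplified illustration; in real Swift the node dict is handed to `http_connect`, which assembles the request path):

```python
def backend_path(node, partition, account, container=None, obj=None):
    """Build the internal-server request path
    /device/partition/account[/container[/object]]
    from a ring node dict (simplified sketch).
    """
    parts = ["", node["device"], str(partition), account]
    if container is not None:
        parts.append(container)
        if obj is not None:
            parts.append(obj)
    return "/".join(parts)

# With a G4S node {"device": "vol1"} and the synthetic partition 0:
# backend_path(node, 0, "acc1", "cont", "obj") -> "/vol1/0/acc1/cont/obj"
```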
In the case of G4S, step 2 is monkey-patched to use the `gluster.swift.common.ring.Ring._get_part_nodes` method, which currently looks up the node whose 'device' value is the same as the requested account. We would instead look up the node whose 'meta' matches the account; its 'device' thus becomes an independent specifier for the Gluster volume of the storage backend.
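The change to the lookup can be sketched as follows (heavily simplified from the monkey-patched `_get_part_nodes`; the surrounding class machinery is omitted):

```python
def get_part_nodes_current(devs, account):
    """Current G4S behavior: the account *is* the volume name,
    so the lookup matches on the 'device' key."""
    return [d for d in devs if d["device"] == account]

def get_part_nodes_proposed(devs, account):
    """Proposed behavior: match on 'meta', leaving 'device' free to
    name the Gluster volume independently of the account."""
    return [d for d in devs if d["meta"] == account]

devs = [
    {"device": "vol1", "meta": "acc1"},
    {"device": "vol1", "meta": "acc2"},  # second account, same volume
]
# get_part_nodes_proposed(devs, "acc2")
#   -> [{"device": "vol1", "meta": "acc2"}]
```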
G4S monkey-patches the internal account/container/object servers' GET methods with routines that map the `/device/partition/account[/container[/object]]` address to filesystem paths according to the layout used on the Gluster storage node.
The internal servers instantiate G4S-specific classes:

- account server: `gluster.swift.common.DiskAccount`,
- container server: `gluster.swift.common.DiskDir`,
- object server: `gluster.swift.obj.DiskFile`,

whereby the first two are application-specific customizations of a common base class, `gluster.swift.common.DiskCommon`.
These classes implement the request → filesystem path mapping and interact with the local filesystem. All of them are instantiated with components of the REST request, including the two initialization parameters device and account. The account, however, is silently ignored, as in the current model it carries no additional information.
As discussed above, this will no longer be the case: the account will be an independent piece of information, introducing a new layer in the path hierarchy. The disk utility classes should therefore also take the account into consideration and perform path manipulations accordingly. In particular,
- as of now, `DiskAccount` is path agnostic, "since accounts should always exist (given an account maps to a gluster volume directly, and the mount has already been checked at the beginning of the REST API handling)" – it has to be changed to actively look up the account;
- `DiskDir`'s and `DiskFile`'s path construction routines should take care to insert the account into the path component chain.
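The resulting change to path construction can be sketched as follows (the mount prefix comes from the text; the helper names are illustrative, not the actual G4S routines):

```python
MOUNT_ROOT = "/mnt/gluster-object"  # G4S mount prefix

def object_path_current(device, container, obj):
    # Today: account == volume, so the account adds no path component.
    return "/".join([MOUNT_ROOT, device, container, obj])

def object_path_proposed(device, account, container, obj):
    # Proposed: the account becomes a subdirectory of the volume root.
    return "/".join([MOUNT_ROOT, device, account, container, obj])

# object_path_proposed("vol1", "acc1", "cont", "obj")
#   -> "/mnt/gluster-object/vol1/acc1/cont/obj"
```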
## Open Issues

- How do we make sure that a user cannot accidentally or intentionally access another user's account, i.e., access other directories in the GlusterFS volume?
- How can we limit the storage usage of an account/user? Maybe we can use Gluster-Quota's CLI to enforce a storage limit for an account/user?
- How do we delete accounts?
- Can we allow the previous users of gluster-swift to smoothly upgrade to the revised gluster-swift that would allow multiple accounts to reside in a GlusterFS volume?
## Footnotes

- ibid.
- OpenStack Object Storage API v1 Reference: API for accounts, containers, objects.
- also the partition number, but in case of G4S that's a synthetic dummy value