# Multiple accounts per volume for _Gluster for Swift_
[TOC]
## Problem Statement
At present, Gluster for Swift (G4S) allows only one account to reside in a
Gluster volume: an account maps to a single Gluster volume, i.e., the root
directory of the Gluster volume's mount point serves as the account
(*/mnt/gluster-object/$vol\_name*).

G4S needs to allow multiple accounts to reside in the same Gluster volume. To
enable such a feature, an account would need to be implemented as a
subdirectory (child) of the Gluster volume mount point's root directory
(*/mnt/gluster-object/$vol\_name/$acc\_name*).
## Proposed Design and Implementation
### Store the account name and volume name (Gluster volume) in the ring files
The ring files "map the names of entities stored on disk and their physical
location"[^1]. At present only the volume name is stored.
One of the fields of the ring data structure is a list of devices in the
cluster. An element of the list of devices is a dictionary, which helps
identify the drive where the data actually resides. The 'device' key of the
dictionary corresponds to the "on disk name of the device on the
server"[^2]. G4S currently sets the 'device' key to the Gluster
volume name. The idea is to use the dictionary to also store the account name.
This would be done by setting the pre-existing but currently unused 'meta'
key[^3] of the dictionary to the account name.
These changes would be made in the _gluster-swift-gen-builders_ script,
which builds the ring files for G4S. The script would now take both the
account names and the names of the volumes in which the accounts are to be
created as input parameters.
E.g., pass the account and volume names to the ring builder script as follows:

<pre>
# gluster-swift-gen-builders acc1[:vol1] acc2[:vol2] ...
</pre>
whereby

- the volume part of the argument (after the colon) is optional; if not
given, it is assumed to be the same as the account name (thus keeping the
CLI backward compatible);
- the account names are required to be pairwise distinct, as the
_account → volume_ mapping should be well defined (see the sketch below).
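A minimal sketch of this argument handling, in Python for illustration (the
actual _gluster-swift-gen-builders_ is a shell script, and `parse_args` is a
hypothetical helper, not part of the codebase):

```python
def parse_args(args):
    """Map each 'acc[:vol]' argument to an account -> volume entry."""
    mapping = {}
    for arg in args:
        acc, sep, vol = arg.partition(':')
        if not sep:
            # no colon given: the volume defaults to the account name
            vol = acc
        if acc in mapping:
            # account names must be pairwise distinct
            raise ValueError("duplicate account name: %s" % acc)
        mapping[acc] = vol
    return mapping

print(parse_args(['acc1:vol1', 'acc2']))
# {'acc1': 'vol1', 'acc2': 'acc2'}
```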
In the _gluster-swift-gen-builders_ script, the devices would be added to the
cluster using the following command:
<pre>
swift-ring-builder &lt;builder_file&gt; add [--region &lt;region&gt;] --zone &lt;zone&gt; --ip &lt;ip&gt; \
--port &lt;port&gt; --replication-ip &lt;r_ip&gt; --replication-port &lt;r_port&gt; \
--device <strong>&lt;GlusterFS volume name&gt;</strong> --meta <strong>&lt;account name&gt;</strong> --weight &lt;weight&gt;
</pre>
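The resulting device entry in the ring would then carry both names. A minimal
sketch of such an entry (all field values are illustrative):

```python
# A device entry in the ring's list of devices, after the change:
dev = {
    'id': 0,
    'region': 1,
    'zone': 1,
    'ip': '127.0.0.1',
    'port': 6012,
    'device': 'vol1',  # GlusterFS volume name, as today
    'meta': 'acc1',    # account name, stored in the previously unused key
    'weight': 100.0,
}
```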
[^1]: [Swift Documentation, "Swift Architectural Overview"](http://docs.openstack.org/developer/swift/overview_architecture.html#the-ring)
[^2]: [Swift Documentation, "The Rings"](http://docs.openstack.org/developer/swift/overview_ring.html#list-of-devices)
[^3]: ibid.
### Modify the _account → device (volume)_ lookup {#acc2dev}
- The REST client makes a request of the form _/account[/container[/object]]_.[^api]
- The request is intercepted by the _swift proxy server_, which looks up the
device (G4S: volume) corresponding to this URL[^part-dummy] and passes it on
to the appropriate internal server (account / container / object).

In detail:
1. the main Swift routine that takes care of routing is
[`swift.proxy.controllers.base.Controller.GETorHEAD_base`][getorhead]
2. it [fetches the backend parameters][fetchbackend] from the ring in the form of
`node` dicts via the `iter_nodes` method (in case of G4S there is just a single `node`)
3. the result is [passed down to the `http_connect` method][http_connect],
which sends the request to the appropriate internal server using
the [_/device/partition/account[/container[/object]\]_][address format]
address format (illustrated below).
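For illustration, under the proposed scheme a request to account `acc1` served
from volume `vol1` would be forwarded roughly as follows (all values are made
up; the partition is the synthetic dummy mentioned above):

```python
# Illustrative mapping of a client request onto the internal
# /device/partition/account[/container[/object]] address format.
device, partition = 'vol1', 0
account, container, obj = 'acc1', 'photos', 'cat.jpg'
internal_path = '/%s/%s/%s/%s/%s' % (device, partition, account, container, obj)
print(internal_path)  # /vol1/0/acc1/photos/cat.jpg
```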
In case of G4S, step 2 above is monkey-patched to use the
[`gluster.swift.common.ring.Ring._get_part_nodes`][_get_part_nodes] method, which
currently looks up the node whose 'device' value is the same as the _account_
requested. We would instead look up the node whose 'meta' value matches the
_account_; its 'device' value would thus become an independent specifier for the
Gluster volume of the storage backend.
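A minimal sketch of the revised lookup (this is not the actual gluster-swift
code; `self._devs` and the `acc` parameter are assumptions made for
illustration):

```python
def _get_part_nodes(self, part, acc):
    # Select the device entries whose 'meta' (account name) matches the
    # requested account; 'device' is left to name the GlusterFS volume.
    seen_ids = set()
    nodes = []
    for dev in self._devs:
        if dev['meta'] == acc and dev['id'] not in seen_ids:
            seen_ids.add(dev['id'])
            nodes.append(dev)
    return nodes
```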
[^part-dummy]: also the partition number, but in case of G4S that's a synthetic dummy value
[getorhead]: https://github.com/openstack/swift/blob/1.9.1/swift/proxy/controllers/base.py#L918
[fetchbackend]: https://github.com/openstack/swift/blob/1.9.1/swift/proxy/controllers/base.py#L935
[http_connect]: https://github.com/openstack/swift/blob/1.9.1/swift/proxy/controllers/base.py#L941
[address format]: https://github.com/openstack/swift/blob/1.9.1/swift/common/bufferedhttp.py#L135
[_get_part_nodes]: https://github.com/gluster/gluster-swift/blob/0f90d1db18/gluster/swift/common/ring.py#L73
### Modify the storage backend layout and the _internal REST API → path_ mapping
G4S monkey-patches the internal account/container/object servers' `GET` methods
with routines that map the _/device/partition/account[/container[/object]]_
address to filesystem paths according to the layout used on the Gluster storage
node.
The internal servers instantiate G4S-specific classes, respectively:

- account server: [`gluster.swift.common.DiskDir.DiskAccount`][DiskAccount],
- container server: [`gluster.swift.common.DiskDir.DiskDir`][DiskDir],
- object server: [`gluster.swift.obj.diskfile.DiskFile`][DiskFile],

whereby the first two are application-specific customizations of a common base
class, [`gluster.swift.common.DiskDir.DiskCommon`][DiskCommon].
These classes implement the _request → filesystem path_ mapping and interact with
the local filesystem. All of them are instantiated with components of the REST
request, among them the two parameters _device_ and _account_. The _account_,
however, is silently ignored, as in the current model it carries no additional
information.
As discussed [above](#acc2dev), this will no longer be the case: _account_ will
carry independent information and will constitute a new layer in the path
hierarchy. The disk utility classes should therefore also consider _account_
and perform path manipulations accordingly. In particular (see the sketch
after this list),

- as of now, `DiskAccount` is path agnostic,
["since accounts should always exist (given an account maps to a
gluster volume directly, and the mount has already been checked at
the beginning of the REST API handling)"][DiskAccount_comment];
it has to be changed to actively look up the _account_;
- `DiskDir`'s and `DiskFile`'s path construction routines should take
care to insert _account_ into the path component chain.
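A minimal sketch of the resulting path construction, assuming the usual
*/mnt/gluster-object* mount root (`object_path` is a hypothetical helper, not
an actual G4S routine):

```python
import os

MOUNT_ROOT = '/mnt/gluster-object'

def object_path(volume, account, container, obj):
    # old layout: <MOUNT_ROOT>/<volume>/<container>/<object>
    # new layout: <MOUNT_ROOT>/<volume>/<account>/<container>/<object>
    return os.path.join(MOUNT_ROOT, volume, account, container, obj)

print(object_path('vol1', 'acc1', 'photos', 'cat.jpg'))
# /mnt/gluster-object/vol1/acc1/photos/cat.jpg
```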
[DiskAccount]: https://github.com/gluster/gluster-swift/blob/0f90d1db18/gluster/swift/common/DiskDir.py#L495
[DiskDir]: https://github.com/gluster/gluster-swift/blob/0f90d1db18/gluster/swift/common/DiskDir.py#L208
[DiskFile]: https://github.com/gluster/gluster-swift/blob/0f90d1db18/gluster/swift/obj/diskfile.py#L429
[DiskCommon]: https://github.com/gluster/gluster-swift/blob/0f90d1db18/gluster/swift/common/DiskDir.py#L149
[DiskAccount_comment]: https://github.com/gluster/gluster-swift/blob/0f90d1db18/gluster/swift/common/DiskDir.py#L538
## Design Concerns
- How do we make sure that a user cannot accidentally or intentionally access
another user's account, i.e., access other directories in the GlusterFS volume?
- How can we limit the storage usage of an account/user?
Perhaps the GlusterFS quota CLI could be used to enforce a storage limit per
account/user (see the sketch after this list)?
- How do we delete accounts?
- Can existing users of gluster-swift smoothly upgrade to the revised
gluster-swift that allows multiple accounts to reside in a GlusterFS volume?
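Regarding the quota question, one possible approach is sketched below; it
assumes quota has already been enabled on the volume (`gluster volume quota
<volume> enable`) and that each account maps to a subdirectory of the volume
root, as proposed above:

```python
import subprocess

def set_account_quota(volume, account, limit='10GB'):
    # Cap the account's subdirectory using the GlusterFS quota CLI.
    subprocess.check_call(['gluster', 'volume', 'quota', volume,
                           'limit-usage', '/' + account, limit])

set_account_quota('vol1', 'acc1', '10GB')
```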
[^api]: [OpenStack Object Storage API v1 Reference](http://docs.openstack.org/api/openstack-object-storage/1.0/content/): API for
[accounts](http://docs.openstack.org/api/openstack-object-storage/1.0/content/storage-account-services.html),
[containers](http://docs.openstack.org/api/openstack-object-storage/1.0/content/storage-container-services.html),
[objects](http://docs.openstack.org/api/openstack-object-storage/1.0/content/storage-object-services.html).