Data Protection Middleware for OpenStack Swift

This project provides several middlewares that add data-protection capabilities to Swift. The goal is to let cluster operators automatically guard against accidental or malicious overwrites and deletions. This in turn lets IT administrators feel comfortable giving workstations easy, usable, direct access to Swift (e.g. by mounting a container as though it were a network drive) without worrying about malware or disgruntled users.

Changes and Features

  1. The existing versioned_writes middleware now has the concept of a "versioning mode". Previously, it would always behave as a stack, with PUTs pushing a new version onto the stack and DELETEs popping the most recent version off. Now, there is a new option to behave as a history, with PUTs and DELETEs behaving normally while recording objects' previous states.

  2. A new defaulter middleware is introduced to allow operators and users to specify default header values to set (if not already present) during PUTs.

    • Containers may set defaults for objects.
    • Accounts may set defaults for containers and objects.
    • The filter config may set defaults for accounts, containers, and objects. This allows the operator to automatically enable versioning on all new containers, and to do so with the new "history" mode. Note that this required changes to versioned_writes so that subrequests would have their defaults populated.
  3. The existing versioned_writes middleware will now attempt to auto-vivify the versions container if it does not exist. Otherwise, users would still need to manually create the versions container for their primary containers; with this, they don't even have to know about it.

  4. A new data_protection middleware is introduced to guard against unsafe actions (PUTs, POSTs, DELETEs) in versions locations, as well as attempts to modify the versioning status of containers. This ensures that malware, etc. cannot truly destroy data, only move it to the versions container.

  5. Since versions containers would otherwise grow without bound, the data_protection middleware may also be used to specify a default retention window for new versions containers. This uses the defaulter infrastructure to add X-Delete-After headers to the objects copied into versions containers.

  6. Since the defaulter infrastructure may otherwise be used to subvert the protection, the data_protection middleware prevents (non-admin) users from being able to set the following headers on their accounts:

    • X-Default-Container-X-Data-Protection
    • X-Default-Container-X-Versions-Location
    • X-Default-Container-X-Versions-Mode

     Note that X-Default-Object-X-Delete-At and X-Default-Object-X-Delete-After are fine, as they would be overridden by the container-level X-Default-Object-X-Delete-After (and X-Delete-After takes precedence over X-Delete-At).
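The defaulting rules in items 2 and 6 can be illustrated with a small stand-alone sketch. This is not the middleware's code: the function names (`config_to_headers`, `defaults_for`, `apply_defaults`) are hypothetical, and the precedence order between levels (filter config lowest, then account, then container) is an assumption drawn from the note above that container-level defaults override account-level ones. The use of `str.format` with a `container` key is likewise only a guess at how use_formatting behaves.

```python
# Hypothetical sketch of the defaulter rules; not the middleware's actual code.

def config_to_headers(conf):
    """Turn filter-config keys (default-container-...) into header-style
    keys (x-default-container-...) so every level can be handled alike."""
    return {'x-' + key.lower(): value for key, value in conf.items()}


def defaults_for(target, sources, use_formatting=False, **names):
    """Collect default headers for `target` ('container' or 'object').

    `sources` is ordered from lowest to highest precedence, for example
    [config, account_headers, container_headers]; later entries win.
    That ordering is an assumption, not something the project documents.
    """
    prefix = 'x-default-%s-' % target
    found = {}
    for source in sources:
        for key, value in source.items():
            key = key.lower()
            if key.startswith(prefix):
                # Strip the prefix to get the header that will be set.
                found[key[len(prefix):]] = value
    if use_formatting:
        # Assumed behaviour: fill placeholders such as {container}.
        found = {key: value.format(**names) for key, value in found.items()}
    return found


def apply_defaults(request_headers, defaults):
    """Set each default only if the request did not already supply it."""
    merged = {key.lower(): value for key, value in request_headers.items()}
    for key, value in defaults.items():
        merged.setdefault(key, value)
    return merged


if __name__ == '__main__':
    conf = {
        'default-container-x-versions-location': '.trash-{container}',
        'default-container-x-versions-mode': 'history',
    }
    defaults = defaults_for('container', [config_to_headers(conf)],
                            use_formatting=True, container='photos')
    # A container PUT without versioning headers picks up both defaults.
    print(apply_defaults({'X-Container-Meta-Owner': 'alice'}, defaults))
```

Without formatting, every container would receive the same literal versions location, which is why the placeholder matters in this sketch.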

Caveats

The versioned_writes filter config must include use = egg:swift_data_protection#versioned_writes (the same egg name used in the example proxy-server.conf); using paste.filter_factory = ... will cause Swift to auto-insert its own versioned_writes, which will likely lead to unpredictable behavior.


The example proxy-server.conf describes a recommended setup, not the defaults of the middlewares. In particular, operators should be sure to:

  • Enable use_formatting in the defaulter filter config. Otherwise, all object versions for all containers will be stored in a single container.
  • Include default-container-x-versions-mode = history in the defaulter filter config. Otherwise, Swift will default to the stack-based versioning, where DELETEs actually destroy data.
  • Configure the auto_enable_prefix in the data_protection filter config and use that prefix when configuring default-container-x-versions-location. Otherwise, users may create the versions container before it is auto-vivified, and it won't have the protection flag set.
  • Choose an appropriate value for default_versions_retention; by default, all versions are retained indefinitely.
  • Disable the owner_can_protect option in the data_protection filter config. This is enabled by default in hopes of later submitting the middleware upstream, where account owners are expected to have full control over all data within the account.
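To show why the history mode matters for the recommendations above, here is a toy in-memory model of the two versioning modes. It is a sketch only: the `VersionedContainer` class is hypothetical, it keeps versions in a Python dict rather than a separate versions container, and it ignores details of the real middleware such as how deletions are recorded.

```python
# Toy model of "stack" versus "history" versioning; not the middleware itself.

class VersionedContainer:
    """One container plus its (simulated) versions container."""

    def __init__(self, mode):
        if mode not in ('stack', 'history'):
            raise ValueError('unknown versioning mode: %r' % mode)
        self.mode = mode
        self.current = {}   # object name -> current data
        self.versions = {}  # object name -> previous states, oldest first

    def put(self, name, data):
        # In both modes an overwrite first archives the existing object.
        if name in self.current:
            self.versions.setdefault(name, []).append(self.current[name])
        self.current[name] = data

    def delete(self, name):
        if name not in self.current:
            raise KeyError(name)
        if self.mode == 'stack':
            # Pop: the current data is destroyed and the most recent
            # archived version (if any) becomes current again.
            older = self.versions.get(name)
            if older:
                self.current[name] = older.pop()
            else:
                del self.current[name]
        else:
            # History: record the state being deleted, then delete normally.
            self.versions.setdefault(name, []).append(self.current.pop(name))


if __name__ == '__main__':
    for mode in ('stack', 'history'):
        box = VersionedContainer(mode)
        box.put('report', 'v1')
        box.put('report', 'v2')
        box.delete('report')
        print(mode, box.current, box.versions)
```

In the stack model the deleted 'v2' no longer exists anywhere after the DELETE, while in the history model both 'v1' and 'v2' survive in the archived versions.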

The recommended setup restricts account owners' ability to manage the data within their account, including their own data usage. This may be tolerable in private deployments, but would be wholly inappropriate for public clouds.

The owner_can_protect option may make the data_protection middleware more appropriate for public clouds (and allow account owners to protect against accidental data loss from read/write users), but it remains largely untested.

[pipeline:main]
# A few notes on the pipeline and pipeline placement:
#
# * defaulter should be as far left as possible while still right of our
#   sane-WSGI-environment middlewares (gatekeeper, proxy-logging, cache).
#
# * versioned_writes must be explicitly put into the pipeline; if you allow
#   Swift to insert it, it won't be the history-capable fork.
#
# * data_protection must be after versioned_writes; they go hand-in-hand.
pipeline = catch_errors gatekeeper healthcheck proxy-logging cache defaulter
    container_sync bulk tempurl ratelimit tempauth container-quotas account-quotas
    slo dlo versioned_writes data_protection proxy-logging proxy-server

[filter:defaulter]
use = egg:swift_data_protection#defaulter
use_formatting = true
default-container-x-versions-location = .trash-{container}
default-container-x-versions-mode = history

[filter:versioned_writes]
use = egg:swift_data_protection#versioned_writes
allow_versioned_writes = true

[filter:data_protection]
use = egg:swift_data_protection#data_protection
auto_enable_prefix = .trash-
owner_can_protect = false
# 90 days, in seconds (90 * 86400)
default_versions_retention = 7776000
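
The caveats listed earlier can be checked mechanically before deploying a config. The following is a minimal sketch using Python's standard configparser; the `check_recommended` function and its messages are hypothetical and not part of the project, the set of accepted "true" spellings is an assumption, and the embedded sample is a shortened version of the example above.

```python
import configparser

# Assumed spellings of "true"; adjust to whatever your deployment accepts.
TRUTHY = {'true', '1', 'yes', 'on', 't', 'y'}


def check_recommended(conf_text):
    """Return a list of deviations from the recommended setup."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)

    def get(section, option):
        return parser.get(section, option, fallback=None)

    problems = []
    pipeline = (get('pipeline:main', 'pipeline') or '').split()
    if 'versioned_writes' not in pipeline:
        problems.append('versioned_writes is not explicitly in the pipeline')
    elif ('data_protection' not in pipeline or
          pipeline.index('data_protection') < pipeline.index('versioned_writes')):
        problems.append('data_protection must come after versioned_writes')

    if (get('filter:defaulter', 'use_formatting') or '').lower() not in TRUTHY:
        problems.append('use_formatting is not enabled')
    if get('filter:defaulter', 'default-container-x-versions-mode') != 'history':
        problems.append('default versioning mode is not history')

    prefix = get('filter:data_protection', 'auto_enable_prefix')
    location = get('filter:defaulter',
                   'default-container-x-versions-location') or ''
    if not prefix or not location.startswith(prefix):
        problems.append('versions location does not use auto_enable_prefix')

    retention = get('filter:data_protection', 'default_versions_retention') or ''
    if not retention.isdigit():
        problems.append('default_versions_retention is missing or not an integer')

    owner = get('filter:data_protection', 'owner_can_protect')
    if owner is None or owner.lower() in TRUTHY:
        problems.append('owner_can_protect is not disabled')
    return problems


RECOMMENDED = """
[pipeline:main]
pipeline = catch_errors gatekeeper healthcheck proxy-logging cache defaulter
    versioned_writes data_protection proxy-logging proxy-server

[filter:defaulter]
use_formatting = true
default-container-x-versions-location = .trash-{container}
default-container-x-versions-mode = history

[filter:data_protection]
auto_enable_prefix = .trash-
owner_can_protect = false
default_versions_retention = 7776000
"""

if __name__ == '__main__':
    for problem in check_recommended(RECOMMENDED):
        print('WARNING:', problem)
```

Because configparser does not strip trailing `#` comments by default, a value written as `7776000 # 90 days` would be flagged by the integer check in this sketch.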