@krisis
Last active July 28, 2017 22:35

Multipart upload backend format proposal

Both schemas below have the following properties:

  1. Crash-consistent during a completeMultipartUpload/abortMultipartUpload
  2. Concurrency-safe in the presence of multiple concurrent uploads

These properties allow us to avoid fcntl(2)-based locking in shared mode with the FS backend.

Schema-1

.minio.sys.tmp
    ├── <uploadId> -----------------> created on initMultipartUpload
        ├── <eTag>.0 ---------------> created on initMultipartUpload
        ├── <eTag>.<partNumber> ----> created on putObjectPart

where, <uploadId>/<eTag>.0 contains the following json object

{
    "Bucket": "bucketName",
    "Object": "objectName"
}

Example

The following example shows 2 concurrent uploads to mybucket/myobject, 3 parts each, with the uploadIds

  1. 6e463bb8-35bd-4408-809e-78f509f558b3
  2. 7bed54f8-ad71-48b1-a4f8-bd2e2a8efbda

.minio.sys.tmp
    ├── 6e463bb8-35bd-4408-809e-78f509f558b3
    │   ├── 467886be95c8ecfd71a2900e3f461b4f.0
    │   ├── 467886be95c8ecfd71a2900e3f461b4f.1
    │   ├── 467886be95c8ecfd71a2900e3f461b4f.2
    │   └── 467886be95c8ecfd71a2900e3f461b4f.3
    └── 7bed54f8-ad71-48b1-a4f8-bd2e2a8efbda
        ├── 467886be95c8ecfd71a2900e3f461b4f.0
        ├── 467886be95c8ecfd71a2900e3f461b4f.1
        ├── 467886be95c8ecfd71a2900e3f461b4f.2
        └── 467886be95c8ecfd71a2900e3f461b4f.3

6e463bb8-35bd-4408-809e-78f509f558b3/467886be95c8ecfd71a2900e3f461b4f.0 would contain

{
    "Bucket": "mybucket",
    "Object": "myobject"
}

Pros

  • Easier to support/debug since each uploadId directory will have at most 10000 part entries
  • The uploadId directory's modTime will reflect the most recently updated part for a given upload; this simplifies detection of 'stale' uploads
  • The number of entries per uploadId directory is limited, so listing during cleanup of stale upload parts will be faster

Schema-2 (flat namespace)

.minio.sys.tmp
    ├── <uploadId>.<eTag>.0 ---------------> created on initMultipartUpload
    ├── <uploadId>.<eTag>.<partNumber> ----> created on putObjectPart

where, <uploadId>.<eTag>.0 contains the following json object

{
    "Bucket": "bucketName",
    "Object": "objectName"
}
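
As a rough illustration of the flat naming scheme, the sketch below builds and parses `<uploadId>.<eTag>.<partNumber>` names. It assumes that neither the uploadId nor the eTag contains a '.' (true for the UUIDs and hex digests used in the examples here); the function names are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// partFileName (hypothetical) builds the flat-namespace entry name.
// partNumber 0 denotes the metadata file created on initMultipartUpload.
func partFileName(uploadID, eTag string, partNumber int) string {
	return fmt.Sprintf("%s.%s.%d", uploadID, eTag, partNumber)
}

// parsePartFileName (hypothetical) splits an entry name back into its fields.
// Assumption: uploadID and eTag never contain '.', so exactly 3 fields exist.
func parsePartFileName(name string) (uploadID, eTag string, partNumber int, err error) {
	fields := strings.Split(name, ".")
	if len(fields) != 3 {
		return "", "", 0, fmt.Errorf("malformed part file name %q", name)
	}
	partNumber, err = strconv.Atoi(fields[2])
	if err != nil || partNumber < 0 {
		return "", "", 0, fmt.Errorf("malformed part number in %q", name)
	}
	return fields[0], fields[1], partNumber, nil
}
```

Because every upload shares one directory in this schema, finding the parts of a single upload means filtering directory entries by their uploadId field rather than listing a dedicated directory.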

Example

The following example shows 2 concurrent uploads to mybucket/myobject, 3 parts each, with the uploadIds

  1. 6e463bb8-35bd-4408-809e-78f509f558b3
  2. 7bed54f8-ad71-48b1-a4f8-bd2e2a8efbda

.minio.sys.tmp
    ├── 6e463bb8-35bd-4408-809e-78f509f558b3.467886be95c8ecfd71a2900e3f461b4f.0
    ├── 6e463bb8-35bd-4408-809e-78f509f558b3.467886be95c8ecfd71a2900e3f461b4f.1
    ├── 6e463bb8-35bd-4408-809e-78f509f558b3.467886be95c8ecfd71a2900e3f461b4f.2
    ├── 6e463bb8-35bd-4408-809e-78f509f558b3.467886be95c8ecfd71a2900e3f461b4f.3
    ├── 7bed54f8-ad71-48b1-a4f8-bd2e2a8efbda.467886be95c8ecfd71a2900e3f461b4f.0
    ├── 7bed54f8-ad71-48b1-a4f8-bd2e2a8efbda.467886be95c8ecfd71a2900e3f461b4f.1
    ├── 7bed54f8-ad71-48b1-a4f8-bd2e2a8efbda.467886be95c8ecfd71a2900e3f461b4f.2
    └── 7bed54f8-ad71-48b1-a4f8-bd2e2a8efbda.467886be95c8ecfd71a2900e3f461b4f.3

6e463bb8-35bd-4408-809e-78f509f558b3.467886be95c8ecfd71a2900e3f461b4f.0 contains

{
    "Bucket": "mybucket",
    "Object": "myobject"
}

Pros

  • A single directory holds all ongoing uploads, including single PUT object uploads

Cons

  • The maximum supportable object name length is limited by platform-specific path segment length limits. On GNU/Linux the limit is 255 bytes.

krisis commented Jul 28, 2017

To keep concurrent completeMultipart, abortMultipart and putObjectPart requests from adding/removing entries of the form uploadId/eTag.partNumber while an upload is being finalized, we can rename the uploadId directory to (say) uploadId-1 during a completeMultipart/abortMultipart request. This ensures that the first completeMultipart/abortMultipart proceeds successfully while subsequent/concurrent requests fail with NoSuchUploadId.