@pauldardeau
Last active September 13, 2016 00:23
Swift data file path decoded
Example of decoded full data file path in Swift
===============================================
/srv/node/sdb1/objects/884/963/dd001dbc81e220b9368528fc70c7b963/1471554845.30599.data
\___a___/ \_b/ \__d__/ \e/ \f/ \______________g_______________/ \______h_______/\_i_/
\_____c______/

a = devices dir
b = device
c = device path
d = DATADIR_BASE (diskfile.py)
e = partition
f = name hash suffix
g = name hash
h = timestamp.internal (see Timestamp class in utils.py)
i = extension ('.data' for data, '.ts' for tombstone, '.meta' for metadata)

devices dir = '/srv/node'
device = 'sdb1'
device path = '/srv/node/sdb1'
DATADIR_BASE = 'objects' (constant in diskfile.py)
partition = '884'
name hash = 'dd001dbc81e220b9368528fc70c7b963'
timestamp = '1471554845.30599'
extension = '.data'
NOTE: '963' is the suffix, i.e. the last 3 characters of the name hash
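
The breakdown above can be reversed in a few lines. This is a minimal sketch, not Swift code: decode_data_file_path is a hypothetical helper, and it assumes the devices dir is '/srv/node' and that the path has exactly the six components shown in the example.

```python
import posixpath


def decode_data_file_path(path, devices='/srv/node'):
    """Split a full data file path into its named components.

    Hypothetical helper for illustration only; it is not part of Swift.
    """
    # Strip the devices dir, leaving device/datadir/partition/suffix/hash/file
    rel = posixpath.relpath(path, devices)
    device, datadir, partition, suffix, name_hash, filename = rel.split('/')
    # splitext splits at the last dot, so the timestamp keeps its own dot
    timestamp, extension = posixpath.splitext(filename)
    if suffix != name_hash[-3:]:
        raise ValueError('suffix dir does not match the name hash')
    return {
        'devices_dir': devices,
        'device': device,
        'device_path': posixpath.join(devices, device),
        'datadir_base': datadir,
        'partition': partition,
        'suffix': suffix,
        'name_hash': name_hash,
        'timestamp': timestamp,
        'extension': extension,
    }


parts = decode_data_file_path(
    '/srv/node/sdb1/objects/884/963/'
    'dd001dbc81e220b9368528fc70c7b963/1471554845.30599.data')
print(parts['partition'], parts['timestamp'], parts['extension'])
```
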
Code flow to get path:
swift/obj/server.py: get_diskfile(device, partition, account, container, obj, policy, **kwargs)
  # calls get_diskfile on the router's manager for the policy (a BaseDiskFileManager subclass)
swift/obj/diskfile.py: get_diskfile(device, partition, account, container, obj, policy, **kwargs)
  # calls get_dev_path for device
  get_dev_path(device) returns os.path.join(self.devices, device) (self.devices defaults to '/srv/node')
  # calls diskfile_cls(dev_path, partition, account, container, obj, policy, use_splice, pipe_size, **kwargs)
  # ***TRICKY***: diskfile_cls is a class attribute that must be set by subclasses
  #   for a replication policy, this is DiskFile. Most functionality is in the base class BaseDiskFile.
  #   for an EC policy, this is ECDiskFile
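
The diskfile_cls indirection is easier to see in a stripped-down form. The sketch below is a toy model of the pattern, not Swift's implementation: the class names mirror the ones mentioned above, but the bodies are simplified assumptions (no mount checks, no use_splice or pipe_size), and the router dict with its keys is hypothetical.

```python
import posixpath


class BaseDiskFile(object):
    """Toy stand-in for the base class; it only records its arguments."""

    def __init__(self, mgr, device_path, partition, account, container, obj):
        self.manager = mgr
        self.device_path = device_path
        self.partition = partition
        self.name = '/' + '/'.join((account, container, obj))


class DiskFile(BaseDiskFile):
    pass


class ECDiskFile(BaseDiskFile):
    pass


class BaseDiskFileManager(object):
    diskfile_cls = None  # must be set by subclasses

    def __init__(self, devices='/srv/node'):
        self.devices = devices

    def get_dev_path(self, device):
        # Simplified: only joins the devices dir and the device name
        return posixpath.join(self.devices, device)

    def get_diskfile(self, device, partition, account, container, obj):
        dev_path = self.get_dev_path(device)
        # The subclass decides which DiskFile class gets instantiated
        return self.diskfile_cls(self, dev_path, partition,
                                 account, container, obj)


class DiskFileManager(BaseDiskFileManager):
    diskfile_cls = DiskFile


class ECDiskFileManager(BaseDiskFileManager):
    diskfile_cls = ECDiskFile


# Hypothetical router: one manager per policy type
router = {'replication': DiskFileManager(),
          'erasure_coding': ECDiskFileManager()}
df = router['replication'].get_diskfile('sdb1', 884, 'AUTH_test', 'c', 'o')
print(type(df).__name__, df.device_path)
```
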
in BaseDiskFile constructor:
  self._name = '/' + '/'.join((account, container, obj))
  name_hash = hash_path(account, container, obj)
  self._datadir = join(device_path, storage_directory(get_data_dir(policy), partition, name_hash))
in BaseDiskFileWriter _put:
  filename = self.manager.make_on_disk_filename(timestamp, extension, ctype_timestamp)
  (extension = '.data')
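
For the simple case in the example path, the filename is just the timestamp followed by the extension. The sketch below illustrates that case only: sketch_on_disk_filename is a hypothetical name, the '%016.05f' format is an assumption about how the Timestamp class renders a timestamp with no offset, and ctype_timestamp and EC-specific naming are ignored.

```python
def sketch_on_disk_filename(timestamp, ext='.data'):
    """Build '<timestamp><ext>' for a plain timestamp.

    Hypothetical simplification; Swift's make_on_disk_filename takes
    more arguments and handles more cases than this.
    """
    # Zero-padded to 16 characters with 5 decimal places (assumed format)
    return '%016.05f' % float(timestamp) + ext


filename = sketch_on_disk_filename('1471554845.30599')
tombstone = sketch_on_disk_filename('1471554845.30599', ext='.ts')
print(filename, tombstone)
```
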
get_data_dir --> get_policy_string(DATADIR_BASE)
utils.py:storage_directory(datadir, partition, name_hash)
  os.path.join(datadir, str(partition), name_hash[-3:], name_hash)
  # name_hash[-3:] is the last 3 chars of the name hash (the suffix)
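
Putting the pieces together for the example path: the sketch below re-implements storage_directory from the expression quoted above and joins it onto the device path. It assumes the 'objects' data dir used in the example, and it uses posixpath so the result is the same on any platform.

```python
import posixpath

DATADIR_BASE = 'objects'


def storage_directory(datadir, partition, name_hash):
    # datadir/partition/<last 3 chars of hash>/<full hash>
    return posixpath.join(datadir, str(partition), name_hash[-3:], name_hash)


device_path = posixpath.join('/srv/node', 'sdb1')
datadir = posixpath.join(
    device_path,
    storage_directory(DATADIR_BASE, 884, 'dd001dbc81e220b9368528fc70c7b963'))
print(datadir)
```
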
utils.py:hash_path(account, container, object, raw_digest)
  md5(HASH_PATH_PREFIX + '/' + account + '/' + container + '/' + object + HASH_PATH_SUFFIX)
  HASH_PATH_PREFIX - set in /etc/swift/swift.conf (swift_hash_path_prefix)
  HASH_PATH_SUFFIX - set in /etc/swift/swift.conf (swift_hash_path_suffix)
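
A minimal sketch of the hashing step, assuming the prefix is joined to the path with '/' and the suffix is appended directly. The prefix and suffix values here are made-up placeholders (real clusters read theirs from swift.conf), so the resulting hash will not match the one in the example path. Argument validation is omitted.

```python
from hashlib import md5

# Placeholder values for illustration; not taken from any real cluster
HASH_PATH_PREFIX = b'changeme_prefix'
HASH_PATH_SUFFIX = b'changeme_suffix'


def hash_path(account, container=None, obj=None, raw_digest=False):
    """Hash '/account[/container[/obj]]' between the prefix and suffix."""
    paths = [account]
    if container:
        paths.append(container)
    if obj:
        paths.append(obj)
    data = (HASH_PATH_PREFIX + b'/' +
            '/'.join(paths).encode('utf-8') + HASH_PATH_SUFFIX)
    digest = md5(data)
    return digest.digest() if raw_digest else digest.hexdigest()


name_hash = hash_path('AUTH_test', 'c', 'o')
print(name_hash, name_hash[-3:])
```
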