Summary
- Edge caching for gateway and server with a disk based cache
Requirements
- cache on GET, PUT and POST operations
- cache should be highly available. If a drive goes offline, objects should still be cached on any available online drive.
- when the backend is down, List, Get and Head operations should work seamlessly. Put operations will fail.
Assumptions:
- All drives have same capacity
- ATime support is enabled on the drives
- once set, the list of cache drives will not be changed.
External interface
- The user sets the list of drives to be used for the disk cache, the cache expiry duration and any wildcard patterns for excluding objects from the cache, either via the "cache" config settings in minio server's config.json or via the environment variables MINIO_CACHE_DRIVES, MINIO_CACHE_EXPIRY and MINIO_CACHE_EXCLUDE.
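A minimal sketch of reading the three environment variables is shown here. It assumes that lists are semicolon-separated and that the expiry is a whole number of days; neither detail is stated in this document. The type cacheConfig and the helpers splitList and parseCacheConfig are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cacheConfig holds the three settings named above (hypothetical type).
type cacheConfig struct {
	Drives     []string
	ExpiryDays int
	Exclude    []string
}

// splitList splits a delimited list and drops empty items.
// Assumption: list items are separated by ";".
func splitList(s string) []string {
	var out []string
	for _, item := range strings.Split(s, ";") {
		if item = strings.TrimSpace(item); item != "" {
			out = append(out, item)
		}
	}
	return out
}

// parseCacheConfig builds a cacheConfig from a lookup function such as os.Getenv.
// Assumption: MINIO_CACHE_EXPIRY is a positive integer number of days.
func parseCacheConfig(getenv func(string) string) (cacheConfig, error) {
	cfg := cacheConfig{
		Drives:  splitList(getenv("MINIO_CACHE_DRIVES")),
		Exclude: splitList(getenv("MINIO_CACHE_EXCLUDE")),
	}
	if len(cfg.Drives) == 0 {
		return cfg, fmt.Errorf("no cache drives configured")
	}
	if v := getenv("MINIO_CACHE_EXPIRY"); v != "" {
		days, err := strconv.Atoi(v)
		if err != nil || days <= 0 {
			return cfg, fmt.Errorf("invalid cache expiry %q", v)
		}
		cfg.ExpiryDays = days
	}
	return cfg, nil
}

func main() {
	// Example values only; a real caller would pass os.Getenv instead.
	env := map[string]string{
		"MINIO_CACHE_DRIVES":  "/mnt/drive1;/mnt/drive2",
		"MINIO_CACHE_EXPIRY":  "40",
		"MINIO_CACHE_EXCLUDE": "backups/*",
	}
	cfg, err := parseCacheConfig(func(k string) string { return env[k] })
	fmt.Println(cfg, err)
}
```

Passing the lookup as a function keeps the parser testable without touching the process environment.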
Design
Disk structure
fsObjects is leveraged to provide a cache backend. Disk cache uses a list of fsObjects, one for each drive specified in the cache drives setting. All operations are identical to fs operations, with the exception that the PutObject and NewMultipartUpload operations use the ETag generated at the backend for the cached entry.
Entry hash
Objects are deterministically hashed to one of the available cache drives using a hash index derived from the sha256sum of the bucket and object name. If an object hashes to an offline drive, the next available drive is used for caching.
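The drive selection described above can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: the document does not say how the SHA-256 digest is reduced to an index, so the sketch takes the first 8 bytes of the digest modulo the number of drives. The names hashIndex and pickDrive are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// hashIndex maps a bucket/object pair to a drive slot in [0, n); n must be > 0.
// Assumption: the index is the first 8 bytes of the SHA-256 digest, read as a
// big-endian integer, modulo n.
func hashIndex(bucket, object string, n int) int {
	sum := sha256.Sum256([]byte(bucket + "/" + object))
	return int(binary.BigEndian.Uint64(sum[:8]) % uint64(n))
}

// pickDrive returns the index of the drive to cache on: the hashed drive if it
// is online, otherwise the next online drive in order (wrapping around).
// It returns -1 when every drive is offline.
func pickDrive(bucket, object string, online []bool) int {
	n := len(online)
	if n == 0 {
		return -1
	}
	start := hashIndex(bucket, object, n)
	for i := 0; i < n; i++ {
		idx := (start + i) % n
		if online[idx] {
			return idx
		}
	}
	return -1
}

func main() {
	online := []bool{true, true, false, true} // drive 2 is offline
	fmt.Println(pickDrive("photos", "2018/cat.jpg", online))
}
```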
Consistency checking
On every Get operation, the ETag of the object in the cache is verified against the backend for consistency. In the event of an ETag mismatch, the cached entry is deleted and a new cache entry is created.
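A minimal sketch of this check, with in-memory maps standing in for the backend and the cache. The types object and store and the function getObject are hypothetical. The sketch also drops the cached entry when the backend no longer has the object, as described under Limitations.

```go
package main

import "fmt"

// object is a hypothetical cached or backend object.
type object struct {
	ETag string
	Data []byte
}

// store is a hypothetical stand-in for both the backend and the cache.
type store map[string]object

// getObject serves key in online mode. It reports whether the object was
// served from the cache and whether it exists at the backend.
func getObject(cache, backend store, key string) (obj object, fromCache, found bool) {
	remote, ok := backend[key]
	if !ok {
		// Backend reports the object as missing: clear any stale cache entry.
		delete(cache, key)
		return object{}, false, false
	}
	if cached, hit := cache[key]; hit && cached.ETag == remote.ETag {
		return cached, true, true
	}
	// Cache miss or ETag mismatch: replace the cached entry with the backend copy.
	delete(cache, key)
	cache[key] = remote
	return remote, false, true
}

func main() {
	backend := store{"bucket/object": {ETag: "v1", Data: []byte("one")}}
	cache := store{}
	_, fromCache, _ := getObject(cache, backend, "bucket/object")
	fmt.Println("first read from cache:", fromCache)
	_, fromCache, _ = getObject(cache, backend, "bucket/object")
	fmt.Println("second read from cache:", fromCache)
}
```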
Implementation
Caching
Objects will be cached if disk caching is enabled and the following conditions apply:
- Sufficient disk space exists to save 100 times the size of the current object. This is a crude heuristic to ensure that cache eviction keeps pace with the rate at which the cache fills up.
- The object does not match any wildcard pattern specified in the exclude option of the cache config.
- If all cache drives are full (up to usable capacity) or offline, the object is not cached and requests are served from the backend.
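The first two conditions above can be sketched as a single predicate. This is an illustration only: it uses the standard library's path.Match for wildcards, in which "*" does not match across "/", and it matches patterns against "bucket/object". The actual wildcard semantics are not specified in this document. The names excluded and shouldCache are hypothetical.

```go
package main

import (
	"fmt"
	"path"
)

// excluded reports whether bucket/object matches any exclude pattern.
// Assumption: patterns follow path.Match syntax and are matched against the
// full "bucket/object" name. Malformed patterns are ignored.
func excluded(patterns []string, bucket, object string) bool {
	name := bucket + "/" + object
	for _, p := range patterns {
		if ok, err := path.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

// shouldCache applies the 100x free-space heuristic and the exclude patterns.
// The division avoids overflow for very large object sizes.
func shouldCache(freeBytes, objectSize int64, patterns []string, bucket, object string) bool {
	if freeBytes/100 < objectSize {
		return false // not enough room for 100 objects of this size
	}
	return !excluded(patterns, bucket, object)
}

func main() {
	patterns := []string{"backups/*"}
	fmt.Println(shouldCache(1<<30, 1<<20, patterns, "photos", "cat.jpg"))
	fmt.Println(shouldCache(1<<30, 1<<20, patterns, "backups", "db.dump"))
}
```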
Lookup
The hash index for an object is used as a hint to look up the cache entry. If the entry is not found on the hinted drive, the list of cache drives is treated as a circular buffer and searched until the object is found.
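The circular search can be sketched as below. The type drive and the function lookup are hypothetical, and each drive is modelled as an in-memory set of keys rather than a real filesystem. The hint is assumed to be the hash index of the object, already reduced to the range [0, number of drives).

```go
package main

import "fmt"

// drive is a hypothetical cache drive: the set of keys it holds.
// A nil drive models an offline drive; reading from it finds nothing.
type drive map[string]struct{}

// lookup starts at the hinted drive and walks the drive list as a circular
// buffer. It returns the index of the first drive holding key, or -1 if no
// drive has it. hint must be in [0, len(drives)).
func lookup(drives []drive, hint int, key string) int {
	n := len(drives)
	for i := 0; i < n; i++ {
		idx := (hint + i) % n
		if _, ok := drives[idx][key]; ok {
			return idx
		}
	}
	return -1
}

func main() {
	drives := []drive{
		{"photos/cat.jpg": {}},
		nil, // offline
		{"docs/readme.txt": {}},
	}
	// The hinted drive (1) is offline, so the search wraps around to drive 0.
	fmt.Println(lookup(drives, 1, "photos/cat.jpg"))
}
```

When the hint is correct the entry is found on the first probe; the full scan happens only when the entry was placed on a fallback drive or is not cached at all.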
Eviction
Maximum usable disk space for caching is 80% of drive capacity. When disk usage hits 80% of drive capacity, eviction automatically kicks in and the purge routine uses the expiry duration as a hint to evict as many stale entries as needed to clear enough space. The purge loop continues to binomially reduce the expiry duration if not enough space can be cleared in one iteration.
Purging entries
- happens automatically when there is no free space in the cache, or at 30-minute intervals when no entries were deleted in the previous purge cycle.
- temporary directories and entries created during multipart operation are cleaned up automatically by fs backend
Data Integrity
- Content is served from cache only upon ETag consistency with backend in online mode. In offline mode, cached entries are served as is.
- Object listing in offline mode is partially consistent.
Limitations
- Policy is not cached currently, so anonymous operations are not allowed in offline mode. In online mode, anonymous operations revert to the backend for gateway.
- A cached entry is not cleared on a delete operation. It is cleared only when a subsequent GET operation returns ObjectNotFound. Stale entries can therefore build up without being cleared.
- If the list of cache drives is altered from the initial configuration, or if one or more drives go offline, the deterministic hash lookup could deteriorate to a linear lookup and degrade performance. Data integrity would not be affected, though this could result in duplicate copies in the cache and in cache misses.
Not implemented:
- Metrics to measure cache hits/misses to be implemented in mc admin
- anonymous operations in offline mode will be possible if gateway policies are cached locally in memory