Author: Javier Ron
The project consisted of creating a client-side cache for ZLog. Project description
The following code files were created:
- cache.h
- cache.cc
- eviction.h
- lru.h
- lru.cc
- arc.h
- arc.cc
- cache_bench.cc (based on existing test)
These files, along with modifications to existing files and the import of several open source tools, are referenced in the following commits:
- https://github.com/cruzdb/zlog/commit/0110297e480537851b07278a42e642c9b1b90861
- https://github.com/cruzdb/zlog/commit/7150ad9f99f6d99e3c56930cdda27d270a22eec1
- https://github.com/cruzdb/zlog/commit/c0e0f667582e45f144ee16383d8f63dcad14536e
- https://github.com/cruzdb/zlog/commit/9db01af738e056cc5d47d950cf2a1c83d3981ad6
- https://github.com/cruzdb/zlog/commit/9b9b02371e9f17368cac4a36a7d8ee7a2a8c2238
- https://github.com/cruzdb/zlog/commit/274f4448674129dfb45adaad84e200d34b236f47
- https://github.com/cruzdb/zlog/commit/6927cdf174dca6ed3aeb3605a19a4601adb7dc37
- https://github.com/cruzdb/zlog/commit/dc3f298660ed78ab4820d685cc63c4d3680db664
All the work was rebased into single-patch commits to ease merging with the master branch.
The options structure, defined in options.h, contains the configurable variables of the cache:
Options:
- Statistics
  - A pointer to a cache statistics object, created with zlog::CreateCacheStatistics()
- Http
  - A vector of strings containing the configuration used to expose the cache statistics through an embedded HTTP server
- Eviction
  - An enumeration that selects the eviction policy used by the cache
- Cache size
  - The maximum number of entries that the cache will hold
Types and defaults:
std::shared_ptr<Statistics> statistics = nullptr;
std::vector<std::string> http;
zlog::Eviction::Eviction_Policy eviction = zlog::Eviction::Eviction_Policy::LRU;
size_t cache_size = 1024 * 1024 * 1;
The cache currently implements two eviction policies:
- LRU (Least Recently Used)
options.eviction = zlog::Eviction::Eviction_Policy::LRU;
- ARC (Adaptive Replacement Cache; don't tell IBM)
options.eviction = zlog::Eviction::Eviction_Policy::ARC;
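To illustrate what the LRU policy does, here is a minimal stand-alone sketch of a least-recently-used cache keyed by log position. It is not the code in lru.cc; the class name LruSketch, the get/put methods, and the use of std::string for entry data are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-alone LRU cache keyed by log position (not ZLog's lru.cc).
class LruSketch {
 public:
  explicit LruSketch(std::size_t capacity) : capacity_(capacity) {}

  // On a hit, copies the entry into *out, marks it most recently used,
  // and returns true. On a miss, returns false and leaves *out untouched.
  bool get(std::uint64_t pos, std::string* out) {
    auto it = index_.find(pos);
    if (it == index_.end()) return false;
    order_.splice(order_.begin(), order_, it->second);  // move to front
    *out = it->second->second;
    return true;
  }

  // Inserts or updates an entry; evicts the least recently used one when full.
  void put(std::uint64_t pos, std::string data) {
    auto it = index_.find(pos);
    if (it != index_.end()) {
      it->second->second = std::move(data);
      order_.splice(order_.begin(), order_, it->second);
      return;
    }
    if (capacity_ == 0) return;  // a zero-capacity cache stores nothing
    if (order_.size() >= capacity_) {
      index_.erase(order_.back().first);  // back = least recently used
      order_.pop_back();
    }
    order_.emplace_front(pos, std::move(data));
    index_[pos] = order_.begin();
  }

  std::size_t size() const { return order_.size(); }

 private:
  using Entry = std::pair<std::uint64_t, std::string>;
  std::size_t capacity_;
  std::list<Entry> order_;  // front = most recently used
  std::unordered_map<std::uint64_t, std::list<Entry>::iterator> index_;
};
```

The list keeps recency order and the map gives constant-time lookup, so both get and put run in constant time on average. ARC refines this idea by also tracking how often entries are reused, which this sketch does not attempt.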
The eviction policies are built on top of an abstract layer, so you can add your own eviction policy by implementing the abstract interface:
virtual int cache_get_hit(uint64_t* pos) = 0;
virtual int cache_get_miss(uint64_t pos) = 0;
virtual int cache_put_miss(uint64_t pos) = 0;
virtual uint64_t get_evicted() = 0;
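As a rough sketch of what a custom policy could look like, the following implements a FIFO policy against a stand-in base class. Only the four method signatures come from the interface above; the class names EvictionSketch and FifoSketch, the meaning of a zero return value (success), and the assumption that each hook is called on the matching cache event are hypothetical, so check eviction.h before relying on any of them.

```cpp
#include <cstdint>
#include <deque>

// Hypothetical stand-in for the abstract interface in eviction.h. Only the
// four method signatures are taken from the document; the name is assumed.
class EvictionSketch {
 public:
  virtual ~EvictionSketch() = default;
  virtual int cache_get_hit(std::uint64_t* pos) = 0;
  virtual int cache_get_miss(std::uint64_t pos) = 0;
  virtual int cache_put_miss(std::uint64_t pos) = 0;
  virtual std::uint64_t get_evicted() = 0;
};

// FIFO policy: evicts positions in insertion order, ignoring read recency.
class FifoSketch : public EvictionSketch {
 public:
  // A hit does not change FIFO order. Returning 0 is assumed to mean success.
  int cache_get_hit(std::uint64_t* /*pos*/) override { return 0; }

  // A read miss needs no bookkeeping in this policy.
  int cache_get_miss(std::uint64_t /*pos*/) override { return 0; }

  // A new entry is appended to the back of the queue.
  int cache_put_miss(std::uint64_t pos) override {
    queue_.push_back(pos);
    return 0;
  }

  // Assumed contract: called only while at least one position is tracked.
  std::uint64_t get_evicted() override {
    std::uint64_t victim = queue_.front();
    queue_.pop_front();
    return victim;
  }

 private:
  std::deque<std::uint64_t> queue_;  // front = oldest inserted position
};
```

Under these assumptions, the cache would call cache_put_miss when it stores a new position and get_evicted when it needs to free a slot, and the policy object never touches the entry data itself.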
The size of the cache can be configured by modifying the cache_size field:
options.cache_size = 1024;
Note that this is not the size of the cache in bytes, but the maximum number of entries that will be stored in the cache at any given time.
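Because the limit counts entries rather than bytes, the memory actually used depends on how large the log entries are. The helper below is a back-of-the-envelope sketch, not part of ZLog: the function name and the average entry size are assumptions, and it ignores any per-entry bookkeeping overhead.

```cpp
#include <cstddef>
#include <cstdint>

// Rough payload footprint: entry count times an assumed average entry size.
// Ignores per-entry bookkeeping overhead, which depends on the implementation.
constexpr std::uint64_t estimate_payload_bytes(std::size_t cache_size,
                                               std::size_t avg_entry_bytes) {
  return static_cast<std::uint64_t>(cache_size) * avg_entry_bytes;
}
```

For example, with the default of 1024 * 1024 entries and an assumed average entry of 4096 bytes, the payload alone could reach 4 GiB, while cache_size = 1024 would cap it at about 4 MiB.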
Note
The cache is only available if zlog is built with the WITH_CACHE macro defined.
You can define it in the CMake configuration with add_definitions(-DWITH_CACHE).
Currently, ZLog can report the following cache statistics in real time:
- Number of cache requests
- Number of cache hits
Set up options.statistics:
options.statistics = zlog::CreateCacheStatistics();
Set up the http options to expose the statistics:
options.http = std::vector<std::string>({"listening_ports", "0.0.0.0:8080", "num_threads", "1"});
You can then read the current statistics by opening localhost:8080 in a browser.
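The two counters listed above (total cache requests and cache hits) are enough to derive a hit ratio. The helper below is a small sketch of that arithmetic; the function name is hypothetical and it is not part of the ZLog statistics API.

```cpp
#include <cstdint>

// Hit ratio from the request and hit counters. Returns 0.0 when there have
// been no requests, to avoid dividing by zero.
inline double hit_ratio(std::uint64_t requests, std::uint64_t hits) {
  if (requests == 0) return 0.0;
  return static_cast<double>(hits) / static_cast<double>(requests);
}
```

For instance, 150 hits out of 200 requests gives a ratio of 0.75, meaning three out of four reads were served from the cache.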
Note
The cache statistics are only available if zlog is built with the WITH_STATS macro defined.
You can define it in the CMake configuration with add_definitions(-DWITH_STATS).
The version of ZLog with the cache passes all the tests in ZLog's test suite. The following is a link to the CI environment. Link to TravisCI
A performance test was run to measure the throughput and latency of the cache and to validate its functionality in a scenario similar to a real-world ZLog deployment. The following is a link to a brief report of the test. Link to report