Skip to content

Instantly share code, notes, and snippets.

@kassane
Last active May 21, 2024 03:31
Show Gist options
  • Save kassane/1db2446307743c0a7c896715923a901e to your computer and use it in GitHub Desktop.
Save kassane/1db2446307743c0a7c896715923a901e to your computer and use it in GitHub Desktop.
About SIEVE-cache :en:

Original post: kubo39/gist 🇯🇵


About SIEVE Cache

A cache algorithm called SIEVE was announced in 2023. It claims to be simpler than LRU, and indeed it is a fairly simple algorithm, but it is said to perform as well as existing superior cache algorithms such as S3-FIFO and TinyLFU.

Algorithm

You can see the animation of How does SIEVE work?

It is a simple algorithm, but it is difficult to explain it in a few words like LRU. As mentioned in the paper, it is rather similar to cache algorithms such as CLOCK and second-chance FIFO-Reinsertion used in page replacement algorithm than LRU.

Data Structure

The SIEVE cache has internally a maximum capacity, cache size, a single two-way concatenated list and a hash map that refers to it by key. It also has a hand, which is a reference to the node referenced at the beginning and end of the concatenation list and at the end of each operation (get/insert). Each node in the linked list has a key/value pair, references to the previous and next nodes, and a “visited” flag that records the access status of each node.

Get

The value held by the node corresponding to the key is retrieved via the internal hash map. The node updates the “visited” flag, but unlike LRU, it does not perform any operation to rearrange the order of the linked list.

Insert

There are two types of insert operations: adding a new entry and updating an existing key and its corresponding value.

  1. when a new entry is added

    • When the cache has reached its capacity limit, it is necessary to delete one of the entries. The algorithm here is the heart of SIEVE. This operation is performed by traversing the entries from the hand, which is a reference to the most recently operated node, to the top of the concatenated list. If a node with a “visited” flag is found, the “visited” flag is set to “false,” and if a node without a “visited” flag is found, it is deleted and the process is terminated. If a node is not found after traversing to the top, it is moved to the bottom and traversed toward the top again. Since the “visited” flag is set to “false” in advance, the node to be deleted always appears in the second week. The paper states that it is important that the old surviving entries are gathered toward the end of the list by this hand holding position.
  2. when updating the value of an existing entry: In this case, the value is updated and visited.

    • In this case, just update the value and set the VISITED flag.

Deletion

The delete operation is not described in the PAPER, but basically it just deletes the entry in question from the hashmap and the concatenated list.

However, if the hand points to the entry to be deleted, it is recommended to point to the entry before the one to be deleted, or to the end if the deletion target is at the beginning.

Implementations in each language

For convenience, we write “thread-safe implementation” as one that obtains locks internally for each operation, but even if it is not a thread-safe implementation, it is sufficient to lock operations on the cache by yourself.

C

  • 1a1a11a/libCacheSim
    • Paper author's reference implementation
    • In addition to the paper's implementation, it is possible to customize the expire setting, whether or not to remove from the hashmap, etc.

Go

  • scalalang2/golang-fifo
    • Thread-safe implementation
      • Locking on all operations to the cache
    • Also provides S3FIFO cache
    • Expire mechanism added to the original algorithm
    • High number of stars
  • opencoff/go-sieve
    • Thread-safe implementation
      • All operations on the cache are locked

JavaScript(TypeScript)

Rust

Zig.

Java.

C++

D language

  • kubo39/sieve-cache-d
    • Provides thread-safe and non-thread-safe implementations
    • Separates implementations that acquire locks from implementations that do not, depending on shared type
    • Single-threaded implementation prevents unnecessary lock operations

Nim.

Ruby.

Python

  • daipham3213/sieve.cache
    • Thread-safe implementation
    • Other implementations are mostly MIT licensed, but this one is Apache 2.0.

Swift.

C#

  • lostmsu/SIEVE
    • Internally non-locking, non-thread-safe implementation

Elixir


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment