TiKV engine abstraction status update
TiKV is in the process of encapsulating RocksDB in a family of generic traits, with the intent to add support for more storage engines.
The tracking issue for this effort is
This is a status update on what has been done so far, what the plan is going forward, and a description of how to help. Help is wanted.
Thanks to @5kbpers, @aknuds1, and @hicqu for help.
The engine_traits crate contains a new set of abstract key/value storage
engine traits, implemented by
engine_rocks. Relatively soon, TiKV will
interact with the storage engine entirely through these traits, and have no
direct dependencies on RocksDB.
Enough work has been completed to have some confidence in the process, but there is much more to be done.
You can help by claiming tasks on the tracking issue.
TiKV developers should read on to understand what is happening and should read
the engine_traits crate docs for details.
As of today there is a crate,
engine_traits that defines a large family of
storage engine traits and their associated types. These traits closely
mirror the design of the RocksDB engine wrappers defined in
the engine crate, with the primary exception that the crate has no
dependencies on any RocksDB code.
There is another crate,
engine_rocks, that implements these traits for RocksDB.
Neither is complete.
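To make the shape of this split concrete, here is a minimal sketch of a small trait family connected through an associated type, plus one implementation. The method names (`put`, `get`, `snapshot`) and the in-memory `MemEngine` and `MemSnapshot` types are simplified stand-ins invented for illustration; they are not the actual engine_traits or engine_rocks API, which is described in the crate docs.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

/// Hypothetical, simplified read-only view of an engine at a point in time.
pub trait Snapshot {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
}

/// Hypothetical, simplified engine trait. It names its snapshot type
/// through an associated type, so the traits form a connected family.
pub trait KvEngine: Clone {
    type Snapshot: Snapshot;

    fn put(&self, key: &[u8], value: &[u8]);
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn snapshot(&self) -> Self::Snapshot;
}

/// Toy in-memory engine, standing in for an implementing crate.
/// Nothing in the trait definitions above knows this type exists.
#[derive(Clone, Default)]
pub struct MemEngine {
    data: Arc<RwLock<BTreeMap<Vec<u8>, Vec<u8>>>>,
}

/// Snapshot of the toy engine: a private copy of the data.
pub struct MemSnapshot {
    data: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl Snapshot for MemSnapshot {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.data.get(key).cloned()
    }
}

impl KvEngine for MemEngine {
    type Snapshot = MemSnapshot;

    fn put(&self, key: &[u8], value: &[u8]) {
        self.data
            .write()
            .unwrap()
            .insert(key.to_vec(), value.to_vec());
    }

    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.data.read().unwrap().get(key).cloned()
    }

    fn snapshot(&self) -> Self::Snapshot {
        // Copy the current contents so later writes are not visible.
        MemSnapshot {
            data: self.data.read().unwrap().clone(),
        }
    }
}
```

In this sketch the traits could live in one crate and `MemEngine` in another, with only the second crate depending on a concrete storage library.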
The engine crate contains similar traits, with wrappers around
rust-rocksdb, but they are not isolated from the underlying engine. The first
phase of the current engine abstraction effort is to migrate callers of
We have been slowly migrating TiKV to use
learning how to do so in the process. There have been a lot of false starts
and backtracking, but today we have completed the following:
- Redefined most of the APIs from
- Migrated the
sst_importer crate completely to
engine_traits, removing the concrete
- Migrated all
- "Pulled up" generic
Snapshots through parts of the TiKV codebase
I'm not going to describe the detailed design here. Instead, look at the crate
docs for engine_traits. They describe the design, the porting
process, and tips for how to use the new APIs successfully, particularly during
the transition to the new abstractions.
Note that at this time we are not attempting to redesign the storage engine abstraction. That we will do in the future. For now we are simply trying to eliminate TiKV's direct dependency on RocksDB by adding an intermediate abstraction layer.
What to expect in the future
The TiKV codebase is going to begin to carry more generic type parameters.
Essentially any code that transitively depends on the storage engine will carry
one extra type parameter, usually
E: KvEngine, but sometimes over some other
The engine crate is going to quickly disappear. If you find it is missing APIs
you expect, look instead in
engine_traits. For now it is possible in many
parts of TiKV to use these traits concretely through the
Rocks* types in
engine_rocks, but in the future they will only be available through
bounded type parameters.
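As a rough illustration of what "carrying one extra type parameter" looks like, the sketch below makes a small component generic over `E: KvEngine`. The trait here is a cut-down, hypothetical version with only two methods, and `Storage` and `ToyEngine` are invented names; real TiKV components and the real trait are considerably larger.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical, cut-down engine trait used only for this example.
pub trait KvEngine {
    fn put(&self, key: &[u8], value: &[u8]);
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
}

/// A component that depends on the storage engine only through a
/// bounded type parameter, never through a concrete engine type.
pub struct Storage<E: KvEngine> {
    engine: E,
}

impl<E: KvEngine> Storage<E> {
    pub fn new(engine: E) -> Self {
        Storage { engine }
    }

    /// Writes `value` only if `key` has no value yet.
    /// Returns true if the write happened.
    pub fn put_if_absent(&self, key: &[u8], value: &[u8]) -> bool {
        if self.engine.get(key).is_some() {
            return false;
        }
        self.engine.put(key, value);
        true
    }

    pub fn read(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.engine.get(key)
    }
}

/// Toy single-threaded engine, invented for the example.
#[derive(Default)]
pub struct ToyEngine {
    data: RefCell<HashMap<Vec<u8>, Vec<u8>>>,
}

impl KvEngine for ToyEngine {
    fn put(&self, key: &[u8], value: &[u8]) {
        self.data.borrow_mut().insert(key.to_vec(), value.to_vec());
    }

    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.data.borrow().get(key).cloned()
    }
}
```

Every type or function that holds or calls a `Storage<E>` then needs its own `E` parameter, which is why the parameter tends to spread through everything that transitively touches the engine.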
During the transition you will see
.c() methods temporarily sprinkled through
the TiKV codebase. These are making conversions like from
RocksEngine, and will go away soon.
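For readers who have not seen this pattern, the following is a hedged sketch of how a short conversion method like `.c()` might be provided during a migration. The `Compat` trait, `RawDb`, and `WrappedEngine` below are hypothetical and exist only for this example; the real conversions and their exact source and target types are defined in the TiKV crates and may work differently.

```rust
use std::sync::Arc;

/// Hypothetical concrete database handle that old code passes around.
pub struct RawDb {
    pub name: String,
}

/// Hypothetical wrapper type that new, trait-based code expects.
#[derive(Clone)]
pub struct WrappedEngine(Arc<RawDb>);

impl WrappedEngine {
    pub fn name(&self) -> &str {
        &self.0.name
    }
}

/// Hypothetical transitional trait: a deliberately short method name
/// that converts an old-style handle into the new wrapper.
pub trait Compat {
    type Other;
    fn c(&self) -> Self::Other;
}

impl Compat for Arc<RawDb> {
    type Other = WrappedEngine;

    fn c(&self) -> WrappedEngine {
        // Share the same underlying handle; no data is copied.
        WrappedEngine(Arc::clone(self))
    }
}

/// New-style code written against the wrapper only.
pub fn describe(engine: &WrappedEngine) -> String {
    format!("engine:{}", engine.name())
}
```

Old code that still holds an `Arc<RawDb>` can then call `describe(&raw.c())`. Once every caller holds the wrapper directly, the trait and all `.c()` calls can be deleted.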
Please resist adding new dependencies on
if you find yourself working on code that is already using
engine_rocks, try to extend those instead.
How to help
We need help.
See the engine_traits crate docs for design guidelines,
porting guidelines, and refactoring tips.
Coordinate on https://github.com/tikv/tikv/issues/4184
The issue is updated to contain a checklist of tasks that definitely need to happen, along with the name of whoever is working on each. I do not know, though, in what order they need to happen. Instead, I have documented my own efforts here
When you decide to take a task, say so on the issue before attempting it so that someone else doesn't duplicate your work.
Some porting principles
I know some people are eager for this work to be done so that they can begin introducing new storage engines.
Unfortunately, I cannot easily identify exactly what needs to happen beyond one or two steps. But I can offer some direction, advice, and coordination.
In the effort so far I have consistently found that I only discovered the next step after trying the wrong next step, often several times. This is a big refactoring effort, so that's not surprising.
I have two guidelines that I follow to decide what to do next:
- Use the crate system to forcibly break dependencies. Do this by completely
  removing access to APIs that break the desired abstraction boundaries.
  - e.g. define the engine traits in their own crate that does not list any concrete RocksDB dependencies in the manifest, so it is impossible to break the abstraction.
  - e.g. when porting the sst_importer crate, completely remove any concrete RocksDB dependencies, so that once the work is done it can't be accidentally undone.
  - e.g. when abstracting Snapshot, migrate all callers at once and delete the old Snapshot API, so that it can't be reintroduced.
- Never duplicate code. When code is duplicated in an active codebase, one of those duplications will end up wrong. We migrate an entire subsystem at a time without leaving the possibility of accessing the feature any way but through the abstraction.
Based on what I've learned, the port is happening in several phases:
1) Migrating the `engine` abstractions
2) Eliminating direct use of `rocksdb` re-exports
3) "Pulling up" the generic abstractions through TiKV
4) Isolating test cases from RocksDB
These are described in the
engine_traits crate docs. Those
who would like to contribute should read them.
These phases need to happen more-or-less in sequence, but can be partially
completed in parallel. For example, now that
Snapshot has been migrated from
engine_traits, with TiKV depending on concrete
types, efforts can begin to "pull up" generic type parameters through TiKV for
code that uses snapshots, eliminating the dependency on the concrete
Obstacles encountered so far
Almost every patch I begin turns up other work that must be done first. Every patch ends up being a process of hacking, stashing, hacking on a prerequisite, stashing, hacking on a prerequisite of a prerequisite, etc. No patch so far has ended with me actually accomplishing what I set out to. I've thrown away a lot of code.
Associated types can't be parameterized by lifetimes. That requires "generic associated types", which are not implemented in Rust as of this writing. To compensate, some abstractions need to be modified so that they carry reference-counted pointers in order to hold resources open.
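A minimal sketch of that workaround follows, assuming a hypothetical `Db` handle and simplified `KvEngine` and `Snapshot` traits that are not the real engine_traits definitions. Because the associated `Snapshot` type has no lifetime parameter, the snapshot cannot borrow from the engine, so in this sketch it holds its own `Arc` to the underlying handle instead.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

/// Hypothetical stand-in for an underlying database handle.
pub struct Db {
    data: BTreeMap<Vec<u8>, Vec<u8>>,
}

pub trait Snapshot {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
}

pub trait KvEngine {
    // A borrowing snapshot would need something like `type Snapshot<'a>`,
    // i.e. a generic associated type. A plain associated type has no
    // lifetime parameter, so the snapshot must own what it needs.
    type Snapshot: Snapshot + 'static;

    fn snapshot(&self) -> Self::Snapshot;
}

/// Hypothetical engine that shares its handle through an `Arc`.
pub struct Engine {
    db: Arc<Db>,
}

impl Engine {
    pub fn new(data: BTreeMap<Vec<u8>, Vec<u8>>) -> Self {
        Engine {
            db: Arc::new(Db { data }),
        }
    }
}

/// Snapshot that keeps the database alive by holding a
/// reference-counted pointer rather than a borrow.
pub struct OwnedSnapshot {
    db: Arc<Db>,
}

impl Snapshot for OwnedSnapshot {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.db.data.get(key).cloned()
    }
}

impl KvEngine for Engine {
    type Snapshot = OwnedSnapshot;

    fn snapshot(&self) -> Self::Snapshot {
        OwnedSnapshot {
            db: Arc::clone(&self.db),
        }
    }
}
```

The cost of this design is a reference count bump per snapshot and the loss of compile-time proof that a snapshot does not outlive its engine; in exchange, the trait stays expressible with plain associated types.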
I have recently started the generics "pull up" phase, that is, abstracting TiKV
over engines. In the process I have run into a new problem that needs to be
solved soon: an inability to specify that two different
associated types must be the same type.
engine_traits is made up of a family
of traits, connected through associated types. Sometimes these associated types
identify the same type. e.g.
SstWriterBuilder::KvEngine. Sometimes callers need to know that these two
associated types are the same type in order to compile, but Rust can't express
The engine_traits crate makes it quite easy to understand the
complex API surface of TiKV's storage engine, since it is mostly API,
little implementation, all in one place. This should be a boon for
understanding and contributing to TiKV.