Skip to content

Instantly share code, notes, and snippets.

@quake
Last active April 24, 2020 17:46
Show Gist options
  • Save quake/5d5f9d32909a5574e1c9300f76f0baf8 to your computer and use it in GitHub Desktop.
Save quake/5d5f9d32909a5574e1c9300f76f0baf8 to your computer and use it in GitHub Desktop.

KVStore 杂谈

Secondary Index

RDBMS and SQL

Employees table structure:

+-----+----------------+------+-------------+--------+
| Key | EmployeeNumber | Name | DateOfBirth | Gender |
+-----+----------------+------+-------------+--------+
|   1 | E0001          | Zhao |    19900101 | M      |
|   2 | E0002          | Qian |    19800301 | F      |
|   3 | C0001          | Sun  |    19850315 | F      |
|   4 | T0001          | Li   |    19900101 | M      |
+-----+----------------+------+-------------+--------+

SQL:

select * from employees where employee_number = 'E0002' limit 1;
select * from employees where gender = 'M';
select * from employees where date_of_birth >= '19800101' and date_of_birth <= '19891231';

KVStore (similar to BTreeMap, not HashMap)

+-----+-----------------------------+
| Key |            Value            |
+-----+-----------------------------+
|   1 | E0001 | Zhao | 19900101 | M |
|   2 | E0002 | Qian | 19800301 | F |
|   3 | C0001 | Sun | 19850315 | F  |
|   4 | T0001 | Li | 19900101 | M   |
+-----+-----------------------------+

unique secondary index, employee_number:

+-------+-------+
|  Key  | Value |
+-------+-------+
| E0001 |     1 |
| E0002 |     2 |
| C0001 |     3 |
| T0001 |     4 |
+-------+-------+

query:

id = store('employee_number').get('E0002')
store('employee').get(id)

non-unique, gender:

+-------+-------+
|  Key  | Value |
+-------+-------+
| M | 1 |       |
| F | 2 |       |
| F | 3 |       |
| M | 4 |       |
+-------+-------+

range query:

ids = store('gender').range('M |'..).take_while(key.starts_with('M |'))
store('employee').get(ids)

non-unique, date_of_birth:

+----------+-------+
|   Key    | Value |
+----------+-------+
| 19900101 | 1 | 4 |
| 19800301 | 2     |
| 19850315 | 3     |
+----------+-------+

ckb-indexer

https://github.com/quake/ckb-indexer

KVStore Structure

+--------------+--------------------+--------------------------+
| KeyPrefix::  | Key::              | Value::                  |
+--------------+--------------------+--------------------------+
| 0            | OutPoint           | Cell                     |
| 32           | ConsumedOutPoint   | Cell                     | * rollback and prune
| 64           | CellLockScript     | TxHash                   |
| 96           | CellTypeScript     | TxHash                   |
| 128          | TxLockScript       | TxHash                   |
| 160          | TxTypeScript       | TxHash                   |
| 192          | TxHash             | TransactionInputs        | * rollback and prune
| 224          | Header             | Transactions             |
+--------------+--------------------+--------------------------+

primary key: OutPoint => Cell secondary index: CellLockScript / CellTypeScript => TxHash

Key::OutPoint           OutPoint序列化                                                            36 byte
Key::CellLockScript     Lock Script序列化 | BlockNumber | TxIndex | OutputIndex                   57 + 8 + 4 + 4 = 73 byte
Key::CellTypeScript     Type Script序列化 | BlockNumber | TxIndex | OutputIndex                   37 + 8 + 4 + 4 = 53 byte

Value::Cell             BlockNumber | TxIndex | Output | OutputData                              8 + 4 + 97 + 4 = 113 byte + extra OutputData len

query live cells:

lock_script = 9bd7e06f3ecf4be0f2fcd2188b23f1b9fcc88e5d4b65a8637b17723bbda3cce8 | 1 | 20 | 88020da54bb8c6211dc44635996ff9ce7605e2a1
out_points = store('CellLockScript').range(lock_script..).take_while(key.starts_with(lock_script))
store('OutPoint').get(out_points)

性能和空间

  1. append block / query 查询次数不超过 N + 1
  2. batch prune

坑和尝试

坑:

  1. parent tx and child tx in same block
  2. rocksdb reverse iterator ( https://github.com/quake/ckb-indexer/blob/9087f395cea0e7c2dbe94f03e6ca012f80eb2ae9/src/service.rs#L298 )

尝试

  1. rocksdb 压缩
  2. batch rpc
  3. ipc vs http
  4. 切换不同的 KVStore

TODO:

  1. 按短地址规范对 lock_script 序列化进行压缩存储: 9bd7e06f3ecf4be0f2fcd2188b23f1b9fcc88e5d4b65a8637b17723bbda3cce8 | 1 => 0100
  2. CKB new block notify
  3. Rocksdb 的各种参数优化

KVStore bench

https://github.com/nervosnetwork/rust-kvstore-bench

rocksdb: Log Structured Merge Tree lmdb: MMAP copy on write B+Tree sled: BwTree

fixed key size: 32 bytes

# value size: 16 KB, batch: 3 puts, batch nums: 30000, read nums: 5000
./example-bench.sh 16384 30000 5000

+---------+--------+----------+
|   db    | Write  |   Read   |
+---------+--------+----------+
| rocksdb | 62.356 |  26.321  |
| lmdb    | 26.056 |   1.241  |
| sled    | 70.590 | 160.049  |
+---------+--------+----------+
./example-bench.sh 16384 60000 5000

+---------+--------+----------+
|   db    | Write  |   Read   |
+---------+--------+----------+
| rocksdb | 83.011 |  540.281 |
| lmdb    |414.162 |   68.651 |
| sled    |344.217 |  268.279 |
+---------+--------+----------+
./example-bench.sh 8192 60000 5000
+---------+--------+----------+
|   db    | Write  |   Read   |
+---------+--------+----------+
| rocksdb | 37.078 |   9.009  |
| lmdb    | 62.782 |   1.524  |
| sled    | 17.649 |  65.462  |
+---------+--------+----------+
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment