@ripatel-fd
Last active April 14, 2024 18:06
Informal Guide to Solana Snapshots

Consider this an informal guide to reading the Solana snapshot format. This guide is written for Solana Labs versions v1.14 through v1.17.

You are probably reading this because you want to read the accounts in a snapshot without going through the pain of interfacing with the Solana Labs codebase.

Terminology

We assume general familiarity with the Solana ledger. Let's start by clarifying some less obvious terminology.

  • Solana Runtime: The deterministic state machine that executes transactions and manages all Solana accounts. When data enters the runtime/blockchain, it is often referred to as "confirmed". Every piece of runtime data can be validated by re-executing (replaying) the blockchain from genesis.
  • Implicit state: A lot of runtime data (ca. 300 MB) is not stored in any accounts and only partially exposed via RPC. In this document, we will refer to this kind of data as "implicit state". Some implicit state is periodically copied to sysvars. In the Solana Labs client, the implicit state is managed by a structure called Bank.
  • AppendVec: A file format containing multiple accounts. The term AppendVec originates from the Solana Labs code. It should probably have been called "accounts vec".
  • AppendVec length: The current version of the AppendVec format has a major design flaw: an AppendVec file cannot be read without external information, because the true length of its contents is not stored in the file itself.
  • Manifest: The manifest is a large binary file containing structured data serialized via Bincode. It contains implicit state as well as all AppendVec lengths.
  • Bincode: A binary serialization format.

Further terminology will be introduced along the way. We will revisit each component in detail.

High-level Structure

First, we need to understand how to get to the data stored inside a full snapshot. (Incremental snapshots will be explained at the end of this document.)

A snapshot consists of the following conceptual layers:

+-----------------------------+
| Zstandard compressed stream |
+-----------------------------+
| TAR file stream             |
+-----------------------------+
| Files                       |
+-----------------------------+
| Accounts & Implicit State   |
+-----------------------------+

Starting from the bottom of the stack:

  • First, the information is serialized into bytes
  • The serialized data is packed into files
  • Multiple files are packed together into a TAR stream (OLDGNU format).
  • The TAR stream is compressed using the Zstandard (zstd) compression format.

Hence, a snapshot uses the .tar.zst file extension.

File Stream

Consuming the .tar.zst stream is straightforward. Both TAR and Zstandard are widely adopted. A naive approach is to decompress and extract the entire archive to the file system. However, it is also possible to read, decompress, and process a snapshot in a single pass using tar/zstd streaming APIs.
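The single-pass approach can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: `iter_snapshot_files` is a hypothetical helper name, and it assumes the caller has already wrapped the .tar.zst file in a zstd decompressor that exposes a file-like `read()` (for example a stream reader from the third-party `zstandard` package; that part is not shown).

```python
import io
import tarfile

def iter_snapshot_files(decompressed_stream):
    """Yield (name, file_object) for each regular file in a TAR stream.

    `decompressed_stream` is any binary file-like object producing the
    uncompressed TAR bytes. Mode "r|" makes tarfile read strictly forward,
    so the stream never needs to support seek().
    """
    with tarfile.open(fileobj=decompressed_stream, mode="r|") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directory entries such as "accounts/"
            # The returned file object is only valid until the next
            # iteration, so the caller must consume it immediately.
            yield member.name, tar.extractfile(member)
```

Because the archive is read in stream mode, each file has to be processed before advancing to the next entry; files cannot be revisited later without re-reading the snapshot.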

A typical snapshot results in a list of files like so.

version
snapshots/
snapshots/status_cache
snapshots/196493007
snapshots/196493007/196493007
accounts/
accounts/196487562.2062643
accounts/196487862.2062575
accounts/196486291.2059029
accounts/196489838.2066997
...

The file names ending with / indicate directories and can be ignored.

If you are just looking for accounts, it is tempting to try to parse the accounts/196487562.2062643 files (AppendVec file format) directly. As mentioned earlier, it is impossible to parse these files on their own. The AppendVec lengths are hidden deep inside the snapshots/196493007/196493007 manifest file, which requires a deserializer worth thousands of lines of code.
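To route each archive entry to the right parser, the paths in the listing above can be classified by pattern. The following is a small sketch; the `classify` function and its return tuples are made up for illustration, and it assumes the path layout shown in the listing.

```python
import re

_APPENDVEC = re.compile(r"^accounts/(\d+)\.(\d+)$")
_MANIFEST = re.compile(r"^snapshots/(\d+)/(\d+)$")

def classify(path):
    """Return a tuple describing what kind of snapshot entry `path` is."""
    if path.endswith("/"):
        return ("directory",)
    m = _APPENDVEC.match(path)
    if m:
        # accounts/<slot>.<id>
        return ("appendvec", int(m.group(1)), int(m.group(2)))
    m = _MANIFEST.match(path)
    if m and m.group(1) == m.group(2):
        # snapshots/<slot>/<slot>
        return ("manifest", int(m.group(1)))
    if path == "version":
        return ("version",)
    if path == "snapshots/status_cache":
        return ("status_cache",)
    return ("unknown",)
```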

So without further ado ...

File: version

The version file contains the 5-byte text string 1.2.0. It identifies the version of the snapshot format.

File: status_cache

This appears to hold the serialized transaction status cache (results of recently processed transactions). It is not needed for reading accounts.

File: snapshots/SLOT/SLOT

This file contains the snapshot manifest.

Bincode

To understand how to parse the manifest, one must be able to parse the Bincode serialization format. Quick recap of the second worst serialization format known to man:

Bincode operates on the following data types:

  • Scalar types (bool, u8, u16, u32, u64, u128, i8, i16, i32, i64, i128, float, double)
  • Composite types
    • Structs
    • Tuples
    • Enums (Tagged Unions)
  • Collections
    • Options: Like Rust's Option<T>
    • Arrays: Like Rust's [T; count]
    • Vectors: Like Rust's Vec<T>
    • Maps: Like Rust's BTreeMap<K, V>

Encoding rules:

  • The bool type is a u8 that is either 0 or 1
  • A scalar type is encoded in little-endian byte order
  • A struct is encoded by encoding each of the struct's fields
  • A tuple is encoded by encoding each of the tuple's fields
  • An enum is the concatenation of ...
    • the variant's ID encoded as a u32
    • the encoded variant's data (if applicable)
  • An option is the concatenation of ...
    • whether the optional value is unset or set, as an encoded bool
    • the encoded value, if set
  • An array is the concatenation of each item encoded
  • A vector is the concatenation of ...
    • the number of items encoded as a u64
    • each item encoded
  • A map is the concatenation of ...
    • the number of entries encoded as a u64
    • each key-value tuple encoded

A Bincode blob can only be deserialized if the data type of that blob is known.
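The encoding rules above can be sketched as a small cursor-based reader. The class and method names (`BincodeReader`, `vector`, `option`, and so on) are illustrative and not part of any existing library; the sketch assumes the fixed-width little-endian integer encoding described above.

```python
import struct

class BincodeReader:
    """Reads Bincode-encoded values from a bytes object, front to back."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def _unpack(self, fmt):
        # struct.unpack_from raises struct.error if the buffer is too short
        (value,) = struct.unpack_from(fmt, self.data, self.pos)
        self.pos += struct.calcsize(fmt)
        return value

    # Scalars are little-endian ("<" prefix)
    def u8(self):  return self._unpack("<B")
    def u32(self): return self._unpack("<I")
    def u64(self): return self._unpack("<Q")
    def f64(self): return self._unpack("<d")

    def boolean(self):
        v = self.u8()
        if v not in (0, 1):
            raise ValueError(f"invalid bool byte {v}")
        return v == 1

    def option(self, read_item):
        # one bool (set/unset), then the value if set
        return read_item() if self.boolean() else None

    def array(self, count, read_item):
        # fixed length known from the type, so no length prefix
        return [read_item() for _ in range(count)]

    def vector(self, read_item):
        # u64 item count, then the items
        return self.array(self.u64(), read_item)

    def map(self, read_key, read_value):
        # encoded like a vector of (key, value) pairs, with a u64 count prefix
        return [(read_key(), read_value()) for _ in range(self.u64())]

    def enum(self, variants):
        # `variants` is a list indexed by variant ID; each entry is a
        # callable that reads the variant's data, or None if it has none
        tag = self.u32()
        reader = variants[tag]
        return tag, (reader() if reader else None)
```

A struct or tuple is then simply a sequence of such calls in field order, for example `(r.u64(), r.u64())` for a pair of u64 values.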

Manifest Data Type

The top level data type of the manifest is as follows.

{
  "name": "solana_manifest",
  "type": "struct",
  "fields": [
    { "name": "bank",                     "type": "deserializable_versioned_bank" },
    { "name": "accounts_db",              "type": "solana_accounts_db_fields" },
    { "name": "lamports_per_signature",   "type": "ulong" }
  ]
}

If you are only interested in accounts, you might again be tempted to parse only the part that actually contains the AppendVec lengths.

Thanks to Bincode's thoughtful design, it is not possible to selectively parse fields. You have to parse all of it.

So here are the full definitions:

{
  "name": "hash",
  "type": "array",
  "length": 32,
  "element": "uchar"
}
{
  "name": "pubkey",
  "type": "array",
  "length": 32,
  "element": "uchar"
}
{
  "name": "deserializable_versioned_bank",
  "type": "struct",
  "fields": [
    { "name": "blockhash_queue",       "type": "block_hash_queue" },
    { "name": "ancestors",             "type": "vector", "element": "slot_pair" },
    { "name": "hash",                  "type": "hash" },
    { "name": "parent_hash",           "type": "hash" },
    { "name": "parent_slot",           "type": "ulong" },
    { "name": "hard_forks",            "type": "hard_forks" },
    { "name": "transaction_count",     "type": "ulong" },
    { "name": "tick_height",           "type": "ulong" },
    { "name": "signature_count",       "type": "ulong" },
    { "name": "capitalization",        "type": "ulong" },
    { "name": "max_tick_height",       "type": "ulong" },
    { "name": "hashes_per_tick",       "type": "option", "element": "ulong" },
    { "name": "ticks_per_slot",        "type": "ulong" },
    { "name": "ns_per_slot",           "type": "uint128" },
    { "name": "genesis_creation_time", "type": "ulong" },
    { "name": "slots_per_year",        "type": "double" },
    { "name": "accounts_data_len",     "type": "ulong" },
    { "name": "slot",                  "type": "ulong" },
    { "name": "epoch",                 "type": "ulong" },
    { "name": "block_height",          "type": "ulong" },
    { "name": "collector_id",          "type": "pubkey" },
    { "name": "collector_fees",        "type": "ulong" },
    { "name": "fee_calculator",        "type": "fee_calculator" },
    { "name": "fee_rate_governor",     "type": "fee_rate_governor" },
    { "name": "collected_rent",        "type": "ulong" },
    { "name": "rent_collector",        "type": "rent_collector" },
    { "name": "epoch_schedule",        "type": "epoch_schedule" },
    { "name": "inflation",             "type": "inflation" },
    { "name": "stakes",                "type": "stakes" },
    { "name": "unused_accounts",       "type": "unused_accounts" },
    { "name": "epoch_stakes",          "type": "vector", "element": "epoch_epoch_stakes_pair" },
    { "name": "is_delta",              "type": "char" }
  ],
}
{
  "name": "block_hash_queue",
  "type": "struct",
  "fields": [
    { "name": "last_hash_index", "type": "ulong" },
    { "name": "last_hash", "type": "option", "element": "hash" },
    { "name": "ages", "type": "vector", "element": "hash_hash_age_pair" },
    { "name": "max_age", "type": "ulong" }
  ]
}
{
  "name": "hash_hash_age_pair",
  "type": "struct",
  "fields": [
    { "name": "key", "type": "hash" },
    { "name": "val", "type": "hash_age" }
  ]
}
{
  "name": "hash_age",
  "type": "struct",
  "fields": [
    { "name": "fee_calculator", "type": "fee_calculator" },
    { "name": "hash_index", "type": "ulong" },
    { "name": "timestamp", "type": "ulong" }
  ]
}
{
  "name": "fee_calculator",
  "type": "struct",
  "fields": [
    { "name": "lamports_per_signature", "type": "ulong" }
  ]
}
{
  "name": "slot_pair",
  "type": "struct",
  "fields": [
    { "name": "slot", "type": "ulong" },
    { "name": "val", "type": "ulong" }
  ]
}
{
  "name": "hard_forks",
  "type": "struct",
  "fields": [
    { "name": "hard_forks", "type": "vector", "element": "slot_pair" }
  ]
}
{
  "name": "fee_rate_governor",
  "type": "struct",
  "fields": [
    { "name": "target_lamports_per_signature", "type": "ulong" },
    { "name": "target_signatures_per_slot", "type": "ulong" },
    { "name": "min_lamports_per_signature", "type": "ulong" },
    { "name": "max_lamports_per_signature", "type": "ulong" },
    { "name": "burn_percent", "type": "uchar" }
  ]
}
{
  "name": "rent_collector",
  "type": "struct",
  "fields": [
    { "name": "epoch", "type": "ulong" },
    { "name": "epoch_schedule", "type": "epoch_schedule" },
    { "name": "slots_per_year", "type": "double" },
    { "name": "rent", "type": "rent" }
  ]
}
{
  "name": "epoch_schedule",
  "type": "struct",
  "fields": [
    { "name": "slots_per_epoch", "type": "ulong" },
    { "name": "leader_schedule_slot_offset", "type": "ulong" },
    { "name": "warmup", "type": "uchar" },
    { "name": "first_normal_epoch", "type": "ulong" },
    { "name": "first_normal_slot", "type": "ulong" }
  ]
}
{
  "name": "rent",
  "type": "struct",
  "fields": [
    { "name": "lamports_per_uint8_year", "type": "ulong" },
    { "name": "exemption_threshold", "type": "double" },
    { "name": "burn_percent", "type": "uchar" }
  ]
}
{
  "name": "inflation",
  "type": "struct",
  "fields": [
    { "name": "initial", "type": "double" },
    { "name": "terminal", "type": "double" },
    { "name": "taper", "type": "double" },
    { "name": "foundation", "type": "double" },
    { "name": "foundation_term", "type": "double" },
    { "name": "__unused", "type": "double" }
  ]
}
{
  "name": "stakes",
  "type": "struct",
  "fields": [
    { "name": "vote_accounts", "type": "vote_accounts" },
    { "name": "stake_delegations", "type": "map", "element": "delegation_pair", "key": "account" },
    { "name": "unused", "type": "ulong" },
    { "name": "epoch", "type": "ulong" },
    { "name": "stake_history", "type": "stake_history" }
  ]
}
{
  "name": "vote_accounts",
  "type": "struct",
  "fields": [
    { "name": "vote_accounts", "type": "map", "element": "vote_accounts_pair", "key": "key" }
  ]
}
{
  "name": "vote_accounts_pair",
  "type": "struct",
  "fields": [
    { "name": "key", "type": "pubkey" },
    { "name": "stake", "type": "ulong" },
    { "name": "value", "type": "solana_account" }
  ]
}
{
  "name": "solana_account",
  "type": "struct",
  "fields": [
    { "name": "lamports", "type": "ulong" },
    { "name": "data", "type": "vector", "element": "uchar" },
    { "name": "owner", "type": "pubkey" },
    { "name": "executable", "type": "uchar" },
    { "name": "rent_epoch", "type": "ulong" }
  ]
}
{
  "name": "delegation_pair",
  "type": "struct",
  "fields": [
    { "name": "account", "type": "pubkey" },
    { "name": "delegation", "type": "delegation" }
  ]
}
{
  "name": "delegation",
  "type": "struct",
  "fields": [
    { "name": "voter_pubkey", "type": "pubkey" },
    { "name": "stake", "type": "ulong" },
    { "name": "activation_epoch", "type": "ulong" },
    { "name": "deactivation_epoch", "type": "ulong" },
    { "name": "warmup_cooldown_rate", "type": "double" }
  ]
}
{
  "name": "unused_accounts",
  "type": "struct",
  "fields": [
    { "name": "unused1", "type": "vector", "element": "pubkey" },
    { "name": "unused2", "type": "vector", "element": "pubkey" },
    { "name": "unused3", "type": "vector", "element": "pubkey_u64_pair" }
  ]
}
{
  "name": "pubkey_u64_pair",
  "type": "struct",
  "fields": [
    { "name": "_0", "type": "pubkey" },
    { "name": "_1", "type": "ulong" }
  ]
}
{
  "name": "epoch_epoch_stakes_pair",
  "type": "struct",
  "fields": [
    { "name": "key", "type": "ulong" },
    { "name": "value", "type": "epoch_stakes" }
  ]
}
{
  "name": "epoch_stakes",
  "type": "struct",
  "fields": [
    { "name": "stakes", "type": "stakes" },
    { "name": "total_stake", "type": "ulong" },
    { "name": "node_id_to_vote_accounts", "type": "vector", "element": "pubkey_node_vote_accounts_pair" },
    { "name": "epoch_authorized_voters", "type": "vector", "element": "pubkey_pubkey_pair" }
  ]
}
{
  "name": "pubkey_node_vote_accounts_pair",
  "type": "struct",
  "fields": [
    { "name": "key", "type": "pubkey" },
    { "name": "value", "type": "node_vote_accounts" }
  ]
}
{
  "name": "node_vote_accounts",
  "type": "struct",
  "fields": [
    { "name": "vote_accounts", "type": "vector", "element":"pubkey" },
    { "name": "total_stake", "type": "ulong" }
  ]
}
{
  "name": "pubkey_pubkey_pair",
  "type": "struct",
  "fields": [
    { "name": "key", "type": "pubkey" },
    { "name": "value", "type": "pubkey" }
  ]
}
{
  "name": "solana_accounts_db_fields",
  "type": "struct",
  "fields": [
    { "name": "storages", "type": "vector", "element": "snapshot_slot_acc_vecs" },
    { "name": "version", "type": "ulong" },
    { "name": "slot", "type": "ulong" },
    { "name": "bank_hash_info", "type": "bank_hash_info" },
    { "name": "historical_roots", "type": "vector", "element": "ulong" },
    { "name": "historical_roots_with_hash", "type": "vector", "element": "slot_map_pair" }
  ]
}
{
  "name": "snapshot_slot_acc_vecs",
  "type": "struct",
  "fields": [
    { "name": "slot", "type": "ulong" },
    { "name": "account_vecs", "type": "vector", "element": "snapshot_acc_vec" }
  ]
}
{
  "name": "snapshot_acc_vec",
  "type": "struct",
  "fields": [
    { "name": "id", "type": "ulong" },
    { "name": "file_sz", "type": "ulong" }
  ]
}
{
  "name": "bank_hash_info",
  "type": "struct",
  "fields": [
    { "name": "hash", "type": "hash" },
    { "name": "snapshot_hash", "type": "hash" },
    { "name": "stats", "type": "bank_hash_stats" }
  ]
}
{
  "name": "bank_hash_stats",
  "type": "struct",
  "fields": [
    { "name": "num_updated_accounts", "type": "ulong" },
    { "name": "num_removed_accounts", "type": "ulong" },
    { "name": "num_lamports_stored", "type": "ulong" },
    { "name": "total_data_len", "type": "ulong" },
    { "name": "num_executable_accounts", "type": "ulong" }
  ]
}
{
  "name": "slot_map_pair",
  "type": "struct",
  "fields": [
    { "name": "slot", "type": "ulong" },
    { "name": "hash", "type": "hash" }
  ]
}
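Since the definitions above are plain data, one way to avoid hand-writing a parser per struct is to interpret them generically. The following is a rough sketch under several assumptions: the definitions have been loaded into a dict `types` keyed by their "name", the scalar names map to the widths shown in `SCALARS`, and maps carry a u64 entry count like vectors. The `decode` function itself is hypothetical.

```python
import struct

# struct format codes for the scalar type names used in the definitions
SCALARS = {"uchar": "<B", "char": "<b", "ulong": "<Q", "double": "<d"}

def decode(types, spec, buf, off=0):
    """Decode one value described by `spec` from `buf`, starting at `off`.

    `types` maps a type name to its definition dict. `spec` is either a
    definition or a field entry such as {"type": "vector", "element": ...}.
    Returns (value, new_offset).
    """
    kind = spec["type"]
    if kind in SCALARS:
        fmt = SCALARS[kind]
        return struct.unpack_from(fmt, buf, off)[0], off + struct.calcsize(fmt)
    if kind == "uint128":
        chunk = buf[off:off + 16]
        if len(chunk) != 16:
            raise ValueError("truncated uint128")
        return int.from_bytes(chunk, "little"), off + 16
    if kind == "struct":
        out = {}
        for field in spec["fields"]:
            out[field["name"]], off = decode(types, field, buf, off)
        return out, off
    if kind == "option":
        tag, off = buf[off], off + 1
        if tag == 0:
            return None, off
        return decode(types, {"type": spec["element"]}, buf, off)
    if kind in ("array", "vector", "map"):
        if kind == "array":
            count = spec["length"]  # fixed size, no length prefix
        else:
            count, off = struct.unpack_from("<Q", buf, off)[0], off + 8
        items = []
        for _ in range(count):
            item, off = decode(types, {"type": spec["element"]}, buf, off)
            items.append(item)
        return items, off
    # Anything else is the name of another definition
    return decode(types, types[kind], buf, off)
```

With all definitions loaded, `decode(types, types["solana_manifest"], manifest_bytes)` would walk the whole manifest. Maps come back as lists of pair structs, and byte arrays as lists of integers; a real implementation would likely special-case those for speed.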

By the way, these data structures have been extended several times. This means that you will need to build a deserializer that handles older versions of the data structure definitions if you want to unpack old snapshots.

To get to the AppendVec lengths, select

manifest.accounts_db.storages[].account_vecs[].file_sz

The corresponding file names are derived from

  • <slot>: manifest.accounts_db.storages[].slot
  • <id>: manifest.accounts_db.storages[].account_vecs[].id

Resulting in file name accounts/<slot>.<id>.
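Putting the selectors above together, a lookup table from file name to length can be built as in this sketch. It assumes the manifest has already been decoded into nested dicts and lists that mirror the field names of the definitions; `appendvec_lengths` is a hypothetical helper.

```python
def appendvec_lengths(manifest):
    """Map "accounts/<slot>.<id>" to file_sz for every AppendVec listed."""
    lengths = {}
    for storage in manifest["accounts_db"]["storages"]:
        for vec in storage["account_vecs"]:
            name = f"accounts/{storage['slot']}.{vec['id']}"
            lengths[name] = vec["file_sz"]
    return lengths
```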

AppendVec file format

In snapshots, the AppendVec file format looks somewhat like this.

+------------------+
| Account Header 0 |
+------------------+
| Account Data   0 |
|                  |
+------------------+
| Padding          |
+------------------+
| Account Header 1 |
+------------------+
| Account Data   1 |
+------------------+
| Padding          |
+------------------+
| Account Header N |
+------------------+
| Account Data   N |
|                  |
|                  |
+------------------+  <--- file_sz
| Random Garbage   |
+------------------+

It contains repeated instances of (Account Header, Account Data, Padding). Padding contains arbitrary bytes (usually zero) used to align the next account header such that its file offset is a multiple of 8. If the file offset is already aligned to a multiple of 8 after reading the account data, the padding is omitted.

file_sz (obtained from the manifest above) indicates where the random garbage starts. The technical reasons for the random garbage involve the way account data is allocated internally within Solana Labs.

If you just tried to parse the entire file without respecting file_sz, you would eventually end up misinterpreting an account where none exists.

The definition of the Account Header is as follows (C code). It amounts to a size of 136 bytes.

struct __attribute__((packed)) solana_account_hdr {
  /* 0x00 */ uint64_t write_version;
  /* 0x08 */ uint64_t data_len;
  /* 0x10 */ uchar    pubkey[32];
  /* 0x30 */ uint64_t lamports;
  /* 0x38 */ uint64_t rent_epoch;
  /* 0x40 */ uchar    owner[32];
  /* 0x60 */ uchar    executable;
  /* 0x61 */ uchar    padding[7];
  /* 0x68 */ uchar    hash[32];
  /* 0x88 */
};

The Account Data appears as-is directly after the header. The length is controlled by the solana_account_hdr data_len field.
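Walking an AppendVec according to the layout above might look like the following sketch. The `iter_accounts` generator and the dict keys are illustrative; the format string mirrors the packed C struct, and the loop stops at file_sz rather than at the end of the file.

```python
import struct

# write_version, data_len, pubkey, lamports, rent_epoch, owner,
# executable, 7 padding bytes (skipped), hash
HEADER = struct.Struct("<QQ32sQQ32sB7x32s")
assert HEADER.size == 136

def iter_accounts(buf, file_sz):
    """Yield one dict per account stored in the first `file_sz` bytes of `buf`."""
    off = 0
    while off + HEADER.size <= file_sz:
        (write_version, data_len, pubkey, lamports, rent_epoch,
         owner, executable, acc_hash) = HEADER.unpack_from(buf, off)
        data_start = off + HEADER.size
        data_end = data_start + data_len
        if data_end > file_sz:
            raise ValueError("account data runs past file_sz")
        yield {
            "write_version": write_version,
            "pubkey": pubkey,
            "lamports": lamports,
            "rent_epoch": rent_epoch,
            "owner": owner,
            "executable": executable != 0,
            "hash": acc_hash,
            "data": bytes(buf[data_start:data_end]),
        }
        # Round up to the next multiple of 8 for the following header
        off = (data_end + 7) & ~7
```

For example, an account with 3 bytes of data ends at offset 139, so the next header starts at offset 144.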

Duplicate Accounts

While walking the accounts that appear in each AppendVec, you might encounter the same pubkey twice. In this case, compare the slot numbers of each AppendVec and choose the larger one. The case where an account appears twice with the same slot number is undefined.
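The rule above can be expressed as a small merge step. In this sketch, `latest_accounts` is a hypothetical helper and each account is assumed to be a dict with a "pubkey" key; when the same pubkey appears twice at the same slot, the first one seen is kept, which is an arbitrary choice since that case is undefined.

```python
def latest_accounts(appendvecs):
    """Pick, for each pubkey, the account from the AppendVec with the largest slot.

    `appendvecs` is an iterable of (slot, accounts) pairs, where `accounts`
    is an iterable of dicts that each have a "pubkey" key.
    """
    best = {}  # pubkey -> (slot, account)
    for slot, accounts in appendvecs:
        for acc in accounts:
            prev = best.get(acc["pubkey"])
            # Strictly greater: ties at the same slot keep the first seen
            if prev is None or slot > prev[0]:
                best[acc["pubkey"]] = (slot, acc)
    return {pubkey: acc for pubkey, (slot, acc) in best.items()}
```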

Incremental Snapshots

Incremental snapshots use exactly the same format as specified above.

However, it is assumed that the accounts database is pre-populated with a set of accounts loaded from a prior full snapshot. The accounts in the incremental snapshot then override any existing ones.

The incremental snapshot's implicit state also replaces the full snapshot's state.

Closing Thoughts

  • Stop doing bincode.
  • Stop adding random garbage to the end of AppendVec files so we can just skip the bincode blob.
  • The snapshot manifest needs to change from bincode to Protobuf ASAP to allow parsing with an incomplete schema definition.