@cc32d9
Last active December 28, 2021 23:21
How nodeos writes snapshots

How EOSIO nodeos writes snapshots

EOSIO commit c6a7ec0dd816f98a6840f59dca9fed04efd9f7a5 (v2.0.6).

plugins/producer_plugin/producer_plugin.cpp line 1156: producer_plugin::create_snapshot is invoked by an HTTP API call.

plugins/producer_plugin/producer_plugin.cpp line 1187: opens the output stream, creates ostream_snapshot_writer object, and calls controller::write_snapshot.

libraries/chain/controller.cpp line 2922: controller_impl::add_to_snapshot is called.

libraries/chain/controller.cpp line 843: add_to_snapshot starts writing the snapshot.

libraries/chain/include/eosio/chain/snapshot.hpp line 135 writes sections. Each section has a header with a string name.

libraries/chain/snapshot.cpp lines 137-154: two 64-bit integers are written as placeholders for the section size and the row count, followed by the null-terminated section name.

Lines 166-185: the stream is rewound to the start of the section, the section size (in bytes) and the row count are written over the placeholders, and the ostream position is set back to the end of the file.

Line 156: ostream_snapshot_writer::write_row calls the supplied row writer callback; its only other job is to count the rows.
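The section framing described above can be sketched as follows. This is a minimal illustration, not the nodeos class: section_writer_sketch and its method names are hypothetical, integers are assumed to be written in host byte order, and the size field is assumed to count every byte that follows it.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper, not the nodeos class: frames one section in a seekable stream.
// Assumptions: host byte order, and the size field counts every byte that
// follows it (row count, name, terminator, rows).
class section_writer_sketch {
public:
   explicit section_writer_sketch(std::ostream& out) : out_(out) {}

   void start_section(const std::string& name) {
      section_pos_ = out_.tellp();
      row_count_ = 0;
      const std::uint64_t placeholder = std::numeric_limits<std::uint64_t>::max();
      write_u64(placeholder);   // reserved for the section size
      write_u64(placeholder);   // reserved for the row count
      out_.write(name.data(), static_cast<std::streamsize>(name.size()));
      out_.put('\0');           // null terminator after the name
   }

   // The callback serializes one row; this function only counts it.
   void write_row(const std::function<void(std::ostream&)>& row_writer) {
      assert(static_cast<bool>(row_writer));
      row_writer(out_);
      ++row_count_;
   }

   void end_section() {
      const std::streampos end = out_.tellp();
      const std::uint64_t size =
         static_cast<std::uint64_t>(end - section_pos_) - sizeof(std::uint64_t);
      out_.seekp(section_pos_); // back to the start of the section
      write_u64(size);          // overwrite the first placeholder
      write_u64(row_count_);    // overwrite the second placeholder
      out_.seekp(end);          // restore the position at the end of the stream
   }

private:
   void write_u64(std::uint64_t v) {
      out_.write(reinterpret_cast<const char*>(&v), sizeof(v));
   }

   std::ostream& out_;
   std::streampos section_pos_{};
   std::uint64_t row_count_ = 0;
};
```

With a std::ostringstream as the target, calling start_section("demo"), two write_row calls and end_section leaves a buffer whose first 16 bytes hold the patched size and row count.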

libraries/chain/controller.cpp lines 843-876: sections are written as follows:

  • chain_snapshot_header is written (defined in libraries/chain/include/eosio/chain/chain_snapshot.hpp);

  • block_state is written (defined in libraries/chain/include/eosio/chain/block_state.hpp);

  • Internal multi-index data is written. The indexes are defined in libraries/chain/controller.cpp lines 41-54, and index_set is defined in libraries/chain/include/eosio/chain/database_utils.hpp. Each section is named after the index class, e.g. eosio::chain::account_metadata_index;

  • Line 872: contract tables are written to the section contract_tables;

  • authorization_manager::add_to_snapshot writes 3 sections related to the permissions system (defined in libraries/chain/authorization_manager.cpp lines 18-21);

  • resource_limits_manager::add_to_snapshot writes 4 sections related to resource limits (defined in libraries/chain/resource_limits.cpp lines 13-17).
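To check which sections a snapshot contains, the headers can be walked without decoding any rows. The sketch below is hypothetical: list_sections and section_info_sketch are illustrative names, the stream is assumed to be already positioned at the first section header, integers are assumed to be in host byte order, and the size field is assumed to count the bytes that follow it. Any framing that precedes or follows the sections in a real snapshot file is not handled here.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record describing one section header.
struct section_info_sketch {
   std::string   name;
   std::uint64_t size = 0;       // bytes following the size field (assumption)
   std::uint64_t row_count = 0;
};

// Walks section headers from the current stream position until a read fails.
std::vector<section_info_sketch> list_sections(std::istream& in) {
   std::vector<section_info_sketch> result;
   for (;;) {
      section_info_sketch info;
      if (!in.read(reinterpret_cast<char*>(&info.size), sizeof(info.size))) break;
      const std::streampos after_size = in.tellg();
      if (!in.read(reinterpret_cast<char*>(&info.row_count), sizeof(info.row_count))) break;
      if (!std::getline(in, info.name, '\0')) break;   // name ends at the null terminator
      result.push_back(info);
      // Skip the rows: the next section starts `size` bytes after the size field.
      in.seekg(after_size + static_cast<std::streamoff>(info.size));
   }
   return result;
}
```

Each returned entry carries the section name and its row count, which is enough to compare against the order listed above.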

Raw encoding format

The data structures are serialized as raw data, and the format is specified by the FC_REFLECT and FC_REFLECT_DERIVED macros in the header files.

FC_REFLECT is used if the structure or class has no parent.

FC_REFLECT_DERIVED names the parent structure, whose fields are serialized first, followed by the fields of the child structure.

Example:

// libraries/chain/include/eosio/chain/block_state.hpp
FC_REFLECT_DERIVED( eosio::chain::block_state, (eosio::chain::block_header_state), (block)(validated) )

// libraries/chain/include/eosio/chain/block_header_state.hpp
FC_REFLECT( eosio::chain::detail::block_header_state_common,
            (block_num)
            (dpos_proposed_irreversible_blocknum)
            (dpos_irreversible_blocknum)
            (active_schedule)
            (blockroot_merkle)
            (producer_to_last_produced)
            (producer_to_last_implied_irb)
            (valid_block_signing_authority)
            (confirm_count)
)


FC_REFLECT_DERIVED(  eosio::chain::block_header_state, (eosio::chain::detail::block_header_state_common),
                     (id)
                     (header)
                     (pending_schedule)
                     (activated_protocol_features)
                     (additional_signatures)
)

In encoded form, the field bytes are written as-is, in the sequence defined by these macros.
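The field ordering can be illustrated with a small sketch. The types header_common_sketch and header_sketch and the pack and put_raw helpers are hypothetical and only mimic the order that the reflection macros would define; they are not the fc implementation. Only fixed-size fields are used, written in host byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical types for illustration; they are not nodeos structures.
struct header_common_sketch {
   std::uint32_t block_num = 0;
   std::uint32_t irreversible_blocknum = 0;
};

struct header_sketch : header_common_sketch {
   std::uint64_t id = 0;
   bool          validated = false;
};

// Appends the in-memory bytes of a fixed-size value (host byte order assumed).
template <typename T>
void put_raw(std::string& out, const T& v) {
   out.append(reinterpret_cast<const char*>(&v), sizeof(T));
}

// Field order as FC_REFLECT( header_common_sketch, (block_num)(irreversible_blocknum) ) would list it.
void pack(std::string& out, const header_common_sketch& v) {
   put_raw(out, v.block_num);
   put_raw(out, v.irreversible_blocknum);
}

// Field order as FC_REFLECT_DERIVED( header_sketch, (header_common_sketch), (id)(validated) )
// would list it: parent fields first, then the child fields.
void pack(std::string& out, const header_sketch& v) {
   pack(out, static_cast<const header_common_sketch&>(v));
   put_raw(out, v.id);
   put_raw(out, v.validated);
}
```

Packing a header_sketch yields the two parent integers, then id, then the validated flag, with no padding or field names in between.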

Indexes serialization

Nodeos RAM consists of a series of multi-index tables. libraries/chain/controller.cpp lines 865-869 illustrate how such an index is walked, with each row appended to the snapshot. As a result, each index occupies one whole section. Only the content of each index object is stored, without any information about its keys, because the corresponding C++ header defines the indexes for each type. When the objects are restored from a snapshot, the multi-index definition recreates the primary and secondary indexes.

Smart contract tables serialization

Each smart contract table consists of contents and associated indices. The section contract_tables is written by the code at libraries/chain/controller.cpp line 788 and is built as follows:

  • table_id_object is written. libraries/chain/include/eosio/chain/contract_table_objects.hpp defines table_id_object, and line 296 defines its serialization: (code)(scope)(table)(payer)(count).

  • Six multi-index objects are dumped in sequential order, as defined in lines 56-64 of libraries/chain/controller.cpp:

   key_value_index,
   index64_index,
   index128_index,
   index256_index,
   index_double_index,
   index_long_double_index

Before each multi-index dump, the number of rows it holds for the current table is written as fc::unsigned_int. This is a variable-size integer: the number of bytes it occupies in the snapshot depends on its value. The encoding and decoding procedure is defined in libraries/fc/include/fc/io/raw.hpp lines 214-243.
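A variable-size integer of this kind can be sketched as below, assuming the usual scheme of 7 value bits per byte, least significant group first, with the high bit marking that more bytes follow. The function names encode_varuint32 and decode_varuint32 are hypothetical and not part of fc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encodes a 32-bit value 7 bits at a time, least significant group first.
// The high bit of each byte is set when more bytes follow.
std::vector<std::uint8_t> encode_varuint32(std::uint32_t value) {
   std::vector<std::uint8_t> out;
   do {
      std::uint8_t byte = static_cast<std::uint8_t>(value & 0x7f);
      value >>= 7;
      if (value != 0) byte |= 0x80;
      out.push_back(byte);
   } while (value != 0);
   return out;
}

// Decodes starting at `pos` and advances `pos` past the consumed bytes.
std::uint32_t decode_varuint32(const std::vector<std::uint8_t>& in, std::size_t& pos) {
   std::uint32_t value = 0;
   for (unsigned shift = 0; shift < 35; shift += 7) {
      assert(pos < in.size());   // truncated input is treated as a caller error in this sketch
      const std::uint8_t byte = in[pos++];
      value |= static_cast<std::uint32_t>(byte & 0x7f) << shift;
      if ((byte & 0x80) == 0) break;
   }
   return value;
}
```

Values below 128 take one byte, so the zero row count of an empty index costs a single 0x00 byte in the snapshot.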

The first index holds the contract table row contents together with their primary keys; the other five hold the secondary indices.

key_value_object is serialized as (primary_key)(payer)(value), where the first two are 64-bit integers, and the value is a shared_blob object.

shared_blob is defined in libraries/chain/include/eosio/chain/types.hpp as a class derived from basic_string. The serialization of basic_string is defined in libraries/fc/include/fc/io/raw.hpp lines 317-331: the string size comes first as fc::unsigned_int, followed by the string body.
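Putting the two paragraphs above together, one key_value_object row can be encoded and decoded roughly as follows. This is a sketch under assumptions: kv_row_sketch, encode_kv_row and decode_kv_row are hypothetical names, the 64-bit fields are read and written in host byte order, and the length prefix uses 7 value bits per byte with the high bit as a continuation flag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical plain struct mirroring the reflected order (primary_key)(payer)(value).
struct kv_row_sketch {
   std::uint64_t primary_key = 0;
   std::uint64_t payer = 0;
   std::string   value;
};

std::string encode_kv_row(const kv_row_sketch& row) {
   std::string out;
   out.append(reinterpret_cast<const char*>(&row.primary_key), sizeof(row.primary_key));
   out.append(reinterpret_cast<const char*>(&row.payer), sizeof(row.payer));
   std::uint32_t len = static_cast<std::uint32_t>(row.value.size());
   do {                                   // length prefix as a variable-size integer
      std::uint8_t byte = static_cast<std::uint8_t>(len & 0x7f);
      len >>= 7;
      if (len != 0) byte |= 0x80;
      out.push_back(static_cast<char>(byte));
   } while (len != 0);
   out += row.value;                      // string body, no terminator
   return out;
}

// Decodes one row starting at `pos` and advances `pos` past it.
kv_row_sketch decode_kv_row(const std::string& in, std::size_t& pos) {
   kv_row_sketch row;
   assert(pos + 16 <= in.size());
   std::memcpy(&row.primary_key, in.data() + pos, 8);
   std::memcpy(&row.payer, in.data() + pos + 8, 8);
   pos += 16;
   std::uint32_t len = 0;
   for (unsigned shift = 0; shift < 35; shift += 7) {
      assert(pos < in.size());
      const std::uint8_t byte = static_cast<std::uint8_t>(in[pos++]);
      len |= static_cast<std::uint32_t>(byte & 0x7f) << shift;
      if ((byte & 0x80) == 0) break;
   }
   assert(pos + len <= in.size());
   row.value.assign(in, pos, len);
   pos += len;
   return row;
}
```

Because rows carry no separator, a decoder has to consume them strictly in sequence, which is why decode_kv_row advances the position it is given.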

indexXXX_index objects have a similar structure; only the secondary key type differs. libraries/chain/include/eosio/chain/contract_table_objects.hpp line 300 defines their serialization as (primary_key)(payer)(secondary_key).

Once all six multi-index types are serialized (most of them are zero length), the next table_id_object is written.

The important part is how CDT maps C++ multi-index objects onto the nodeos index structures: each secondary index occupies a separate table_id_object whose table field is the original table name with its 4 least significant bits replaced by the positional number of the secondary index (0 to 15).
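The bit manipulation can be sketched as below. The helper names are hypothetical, and the table name is treated as an opaque 64-bit value.

```cpp
#include <cassert>
#include <cstdint>

// Table field for the secondary index at `position` (0 to 15) of `table`.
inline std::uint64_t secondary_index_table(std::uint64_t table, unsigned position) {
   assert(position < 16);
   return (table & 0xFFFFFFFFFFFFFFF0ULL) | static_cast<std::uint64_t>(position);
}

// Positional number stored in the 4 least significant bits.
inline unsigned index_position(std::uint64_t table_field) {
   return static_cast<unsigned>(table_field & 0x0FULL);
}

// Table field with the 4 least significant bits cleared.
inline std::uint64_t table_without_position(std::uint64_t table_field) {
   return table_field & 0xFFFFFFFFFFFFFFF0ULL;
}
```

When reading a snapshot, table_id_object entries can be grouped by table_without_position of their table field to relate the secondary index entries to one another.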
