This document is the design doc for harmony fastSync.
Note:
- This document inherits many ideas from Ethereum. Some knowledge of the Ethereum network implementation will help in understanding this document.
- Please use this document as a collection of potential ideas instead of the final design.
These are some basic assumptions that have been made for the stability of chain consensus.
- Shard committee is not corrupted in a single epoch.
- Header data can be verified with the combination of the header signature and the shard committee.
- The shard committee can be validated by syncing beacon chain header data.
- It could happen that all connected nodes are malicious. We need a way to verify all data based on the above three trust assumptions.
Following is a quick walkthrough of the overall fast sync logic. The network used for synchronization is libp2p streams (elaborated in section 2) and pubsub messages. The process of state synchronization is best implemented with the pipeline pattern to enable non-blocking processing of sync content and maximize performance.
- During handshake, the client picks a certain number of nodes from peer discovery in both the beacon chain and the side chain as host candidates.
- The client also listens to pubsub `node.sync` messages to get the latest blocks from both the beacon chain and the validating chain.
- In our current implementation, the states of both the beacon chain and the side chain are synchronized.
- But having a full state of the beacon chain is not necessary for consensus. The only purpose of the beacon chain is to provide a valid `shardState` with proof.
- As long as we hold the three assumptions in chapter 0, we can do validation on the side chain with the `shardState` for each epoch.
- This means we only need to synchronize the last beacon header of each epoch, from which the `shardState` data can be obtained with verification.
- We don't even need crosslinks and staking transactions to synchronize the side chain. Thus all we need from the beacon chain is one header per epoch.
- This will greatly alleviate the network stress of beacon chain synchronization.
Note: This is really a bold move. Need extensive review.
So the overall strategy is as follows:
- Sync the beacon header skeleton which is the last block header in each epoch. (See 1.3)
- Meanwhile, choose to run fastSync or fullSync for downloader synchronization
- If fastSync
- Set the pivot and fast sync the validating chain content. Pivot is a certain number of blocks before the latest block.
- For blocks before the pivot, do not process state, directly insert related data into db. (See 1.4.1)
- For block at the pivot, download state and other content and insert into db. (See 1.4.2)
- For blocks after the pivot, use
insertChain
to process blocks in EVM. (See 1.4.3)
- If fullSync
- Do the same logic as fastSync's post-pivot step (See 1.4.3) for all missing blocks
Note: Why need both fastSync and fullSync:
Because downloading a full state is costly. Sometimes processing all states is cheaper than downloading the whole state. The threshold needs to be decided based on experiment results.
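As a rough sketch of this decision (the threshold constant and the function shape are illustrative assumptions, not the actual implementation), the mode selection could look like:

```go
package main

import "fmt"

// SyncMode selection sketch. The threshold value is purely illustrative;
// as the note above says, the real cutoff should come from experiments.
type SyncMode int

const (
	FullSync SyncMode = iota
	FastSync
)

// chooseSyncMode picks fullSync when the node is only a few blocks behind,
// since processing a handful of blocks in the EVM can be cheaper than
// downloading the whole state at the pivot.
func chooseSyncMode(localHeight, remoteHeight uint64) SyncMode {
	const fastSyncThreshold = 2048 // hypothetical cutoff, tune experimentally
	if remoteHeight > localHeight && remoteHeight-localHeight > fastSyncThreshold {
		return FastSync
	}
	return FullSync
}

func main() {
	fmt.Println(chooseSyncMode(100, 110) == FullSync)    // small gap: process blocks
	fmt.Println(chooseSyncMode(100, 100000) == FastSync) // far behind: download state
}
```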
- Beacon Skeleton Header is defined as the last block header of an epoch, which contains the `shardState` data for header verification on both the side chain and the beacon chain.
- The client picks peers with a certain strategy to:
- Find the beacon epoch headers that it does not have itself (it cannot start from the epoch data it possesses since a hard fork might have happened)
- Sync the beacon skeleton headers
- Beacon skeleton headers are verified and written to rawdb. The `shardState` obtained from the beacon skeleton is also written to rawdb.
- Pivot is defined as the block which is a certain number of blocks before the latest block.
- For blocks before the pivot, do not process state. Instead,
- download related data from hosts directly
- verify the header against the obtained shard state, and verify the rest of the data against the verified header
- insert into rawdb directly.
- Data to be downloaded:
- Header
- Transactions and receipts
- Staking transaction and receipts
- Incoming CX Receipts
- Data to be written to rawdb:
- Header (Canonical chain, e.t.c)
- Crosslinks (from header)
- Slash records (from header)
- VRF, VDF (from header)
- Shard State (from header)
- Block commit sig (from header)
- Epoch Block Number (from header)
- Transactions and receipts (by request)
- Staking transaction and receipts (by request)
- Incoming CXReceipt (by request)
- All these data can be obtained concurrently from multiple peers.
- Pivot data is crucial. This is the first block which has state.
- It has more data to download compared to blocks before the pivot
- Data to be downloaded (4 of them are same as blocks before pivot):
- Header
- Transactions and receipts
- Staking transaction and receipts
- Incoming and outgoing CX receipts
- Full State
- ValidatorList (TBD)
- Accumulated distributed reward (TBD)
- ValidatorSnapshots with proof in last block at last epoch
- Data to be written to rawdb
- Header (Canonical chain, e.t.c)
- Crosslinks (from header)
- Slash records (from header)
- VRF, VDF (from header)
- Shard State (from header)
- Block commit sig (from header)
- Epoch Block Number (from header)
- Transactions and receipts (by request)
- Staking transaction and receipts (by request)
- Incoming and outgoing CX Receipts (by request)
- ValidatorList (by request)
- ValidatorStat (by computing the diff of the current validatorWrapper against validatorWrapper in last epoch)
- DelegationsByDelegator (by computing from current validatorWrapper)
- Full State must be committed to db.
- After the pivot, each block follows the full synchronization logic, which is to call `blockchain.InsertChain` for block processing.
- Data to be downloaded:
- Header
- Transactions
- Staking Transactions
- Incoming CX receipts
- Data to be written to rawdb:
- Execute the same logic as the current `InsertChain` logic.
Full sync is just the same process as fastSync's synchronization after the pivot (1.4.4).
- After the fastSync, the downloader will change the mode to backupSync.
- In this mode, block synchronization can be done in three cases:
- For a shard that is in the consensus committee, blocks are directly inserted by the consensus module.
- For beacon chain or non-validator nodes, the block message is received through pubsub. The pubsub message will be forwarded to the downloader and further inserted into the chain.
- In other cases where the node realizes some blocks are missing, the downloader starts the backupSync task to actively fetch the data and complete synchronization.
- On the other hand, the node always listens to the mainnet pubsub to get the latest committee from the beacon chain. If some header is missing from the beacon chain, it actively requests it from mainnet nodes.
Any better name for backupSync?
Here we have three options:
- Set the pivot as the block at a certain number before the latest block. E.g. 100 blocks.
- Set the pivot as the last block in the last epoch.
- Combine 1 and 2: set the pivot as the larger value of options 1 and 2.
For option 1, since we need the validator snapshot to compute APR, it's likely that we also need the validatorSnapshot data of the block in the last epoch. Since it's possible that we don't have the full state of that block, we need to request the code blob along with the Merkle proof from providers. It also means that we need to adopt option 2 in 1.6.2.
For option 2, the good thing is that it does not need to request additional validatorSnapshot data. But it has two drawbacks:
- The timing is crucial since it might happen that it does not have state data for enough blocks.
- It is also possible that we might need to compute states for the whole epoch. If each state processing takes 100ms, it will take hours to process the whole epoch's blocks. This is really costly.
Option 3 gets rid of the first drawback of option 2, but still has the second problem.
I would personally recommend option 1, which might leave us no choice for 1.6.2.
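The three pivot options can be sketched as follows (function names and parameters are hypothetical; `lastEpochEnd` stands for the number of the last block in the previous epoch):

```go
package main

import "fmt"

// pivotOption1: a fixed offset before the latest block (e.g. 100 blocks).
func pivotOption1(latest, offset uint64) uint64 {
	if latest < offset {
		return 0
	}
	return latest - offset
}

// pivotOption3 combines options 1 and 2: take the larger of the
// offset-based pivot and the last block of the previous epoch.
func pivotOption3(latest, offset, lastEpochEnd uint64) uint64 {
	p := pivotOption1(latest, offset)
	if lastEpochEnd > p {
		p = lastEpochEnd
	}
	return p
}

func main() {
	fmt.Println(pivotOption1(1000, 100))      // 900
	fmt.Println(pivotOption3(1000, 100, 950)) // 950: epoch boundary is later
}
```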
There are two options to compute the ValidatorList offchain data:
- Iterate over the state to get the validatorList.
- Add a hash of the validatorList to the header, so the client can download it by request.
The first solution doesn't need to change anything in code. But the iteration over the whole state will be costly as the state grows larger and larger.
The second solution needs a hard fork on the header. The good thing is that, once done, we don't need to iterate over the whole state and still have data that can be verified.
Accumulated (distributed) reward is used to determine the block reward given in beacon chain. It is recorded in offchain data. The data is not in header and can only be obtained from full state computation.
- Is it true that for newly produced blocks, we don't need the accumulated reward for reward computation?
- Is it true that this value is used in some historical blocks?
Based on the answer of the two questions, we have two scenarios:
- Block reward is not needed even for historical blocks.
This will be good. Just get rid of the accumulated reward from `Finalize`.
- We don't need it in the latest blocks, but it is still needed for historical blocks.
Thus some blocks cannot be fastSynced. If the pivot block does not support fastSync (though not likely to happen), we need to disable fastSync and start fullSync.
This is really important if we can only sync skeleton in beacon chain.
Is it true for all historical blocks that:
- In all historical header versions, committee data exists in the last header of the epoch.
- The last header of the epoch holds the new committee for the next epoch.
In this design, data in network are transfered in two channels:
- Libp2p pubsub
- Libp2p stream
In this part, we will discuss the details of the network data transfer.
Stream and pubsub serves different usage scenarios.
- Stream is used for peer-to-peer private messages (e.g. state sync for a newcomer - the message is needed only by the newcomer). No need to flood the whole network with data which is only meaningful for a small number of nodes.
- Pubsub is used for public messages which expects multiple receivers to handle this message. (E.g. Validator messages for the consensus layer)
Based on different roles, nodes need a different set of network data. Here is a full list of roles. Note that one node has exactly one role at a time.
- Leader
- Validator
- Synced node (non-validator)
- Non-synced node
Each shard node has the following role transitions:
The protocol is comprised of several types of sub-protocols
Pubsub protocols are just the current protocol running on pubsub. In this design, we will leave pubsub unchanged:
- Consensus.ViewChange / Prepare / Commit...
- Node.Transaction / CXReceipts / Slash...
Providers:
- Consensus messages are provided from validators and leaders
- Other messages are provided by nodes that receive the corresponding info in block messages.
Consumers:
- The Node.Sync message is used by all nodes to learn of the latest blocks of a shard.
- Beacon Node.Sync message is also used for other shard nodes to keep track of the latest block of beacon chain.
As described in chapter 1, the downloader module needs to fetch a number of blocks to get non-synced nodes synced, covering both fast sync and full sync.
Providers:
- All validators provide this service for the nodes of validating chain.
- The leader will not provide this service for best performance.
Consumers:
- Nodes that need to validate on a shard chain for which they do not have the state, or whose possessed state is out of date.
- This includes both freshly started nodes and nodes that are resharded to another shard.
Used for node sync when the node has arrived at the latest state. The sync serves as a backup in case some blocks are missing from pubsub.
Providers:
- All validators provide this service for the nodes of validating chain.
- The leader will not provide this service for best performance.
Consumers:
- Consumed by nodes that missed some pubsub messages and need to fetch the missed block messages.
Note that only the BackupSync protocol is bidirectional - each side can be both provider and consumer.
Used for beacon header sync. Only the last header of each epoch on the beacon chain is provided.
Providers:
- All validators on beacon chain
- Not beacon leader
Consumers:
- All nodes other than beacon leader.
- Pubsub * 4
- HeavySync Host * 4 shard
- HeavySync Client * 4 shard
- BackupSync * 4
- Ultra-light Host * 1
- Ultra-light Client * 1
This part defines the protocol strategy from the POV of each role so that the network reaches the expected topology. Note that the numbers of connections are an illustration. The final parameters need to be tuned according to performance in testing.
Also note that here we assume only validators will provide full sync & fast sync. This is optional for non-validators since other nodes may place more trust in validator nodes.
| | Pubsub (side chain) | Pubsub (beacon chain) | HeavySync | BackupSync | Ultra-light (on beacon) |
|---|---|---|---|---|---|
Provide Data | False | / | False | False | False |
Advertise | False | / | False | False | False |
Consume | True | True | True | False | True if validating side |
Num streams | N/A | N/A | 32-64 | 0 | 2~4 |
| | Pubsub (side chain) | Pubsub (beacon chain) | HeavySync | BackupSync | Ultra-light (on beacon) |
|---|---|---|---|---|---|
Provide Data | True | / | True | True | True if validating beacon |
Advertise | True | / | False | False | False |
Consume | True | True | False | True | True if validating side |
Num streams | N/A | N/A | 0~16 | 2~4 | 2-4 as consumer; 0-128 as provider |
Note: whether to advertise in side chain pubsub is an option here. Need to be determined.
| | Pubsub (side chain) | Pubsub (beacon chain) | HeavySync | BackupSync | Ultra-light (on beacon) |
|---|---|---|---|---|---|
Provide Data | True | / | True | True | True if validating beacon |
Advertise | True | / | True | True | True if validating beacon |
Consume | True | True | False | True | True if validating side |
Num streams | N/A | N/A | 0~16 | 2~4 | 2-4 as consumer; 0-128 as provider |
| | Pubsub (side chain) | Pubsub (beacon chain) | FastSync | FullSync | Ultra-light (on beacon) |
|---|---|---|---|---|---|
Provide Data | True | / | False | False | False |
Advertise | True | / | False | False | False |
Consume | True | True | False | False | True if validating side |
Num streams | N/A | N/A | 0 | 0 | 2~4 (only as consumer) |
Note that among these stream protocols, only the FullSync protocol is bi-directional - it has both incoming and outgoing requests. Other than that, there is always one provider and one consumer for each stream protocol.
- There cannot be multiple protocols running on a stream. There must be exactly one protocol per stream.
- The stream can switch type. E.g. a stream running fastSync can be transferred to the fullSync protocol when fastSync finishes.
- Why do we need to always keep 4~8 fullSync streams? Because it might happen that a node is out of sync and needs to get to the latest block. On the other hand, peer discovery and setting up stream connections is costly. So it is better to keep a minimum number of streams to rely on for fastSync content.
- Based on the expected network topology from the role POV, we need a `peerManager` to initiate and manage all peer connections with the complicated role-based stream policies. This will be discussed in later chapters.
The synchronization protocol sits on libp2p streams with asynchronous reads/writes to maximize bandwidth usage.
Following is a comparison between sync and async read/write.
We can see that if network latency is the major delay in the protocol, reading/writing asynchronously can fully utilize the bandwidth in both directions and achieve higher throughput. Thus async read/write through the stream is recommended.
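The gain from async read/write can be simulated with an in-memory duplex pipe: a writer goroutine keeps firing requests while a separate reader drains responses, instead of paying one round trip per request. This is an illustrative sketch only (the echo host and line-based framing are assumptions, not the actual wire protocol):

```go
package main

import (
	"bufio"
	"fmt"
	"net"
)

// pipelined sends all requests without waiting for individual responses,
// reading replies concurrently on the same duplex connection.
func pipelined(reqs []string) []string {
	client, server := net.Pipe()

	// Host side: echo each request line back as a response.
	go func() {
		defer server.Close()
		sc := bufio.NewScanner(server)
		for sc.Scan() {
			fmt.Fprintf(server, "resp:%s\n", sc.Text())
		}
	}()

	// Async write: a dedicated goroutine keeps the outbound side busy.
	go func() {
		for _, r := range reqs {
			fmt.Fprintln(client, r)
		}
	}()

	// Async read: drain responses as they arrive.
	out := make([]string, 0, len(reqs))
	sc := bufio.NewScanner(client)
	for len(out) < len(reqs) && sc.Scan() {
		out = append(out, sc.Text())
	}
	return out
}

func main() {
	fmt.Println(pipelined([]string{"header-1", "header-2"}))
}
```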
Compared options: 1) use RLP. 2) Use Protobuf.
In this synchronization protocol, RLP is preferred since:
- RLP can deal with nested structures. Some nested data structures need to be serialized in the protocol, e.g. TrieNode.
- Some RLP encoding/decoding rules have already been implemented in the codebase, e.g. Header, trie.Node, Block, Transactions, etc.
- Implementing RLP encoding/decoding rules in golang is simpler than using protobuf.
On the other side, Protobuf outweighs RLP in:
- Protobuf has better performance.
So for the underlying serialization algorithm of stream data, RLP has more advantages and is recommended in this scenario.
More readings: Ethereum rlp design idea.
When a new stream connection is set up from the libp2p host, we can run the stream handler function on the established stream. The stream handler does the following:
- After the stream is setup, it is first filtered by middleware, which is basically a blacklist to screen out blacklisted peers.
- Then the stream runs a handshake process - both sides send out version metadata, running protocols, etc., and verify them.
- Then check whether the given metadata is accepted.
- Run the corresponding protocol on the stream, which is a forever loop to handle messages. In each loop, the request is also filtered by middleware for rate limiting.
- If an error is detected when running the protocol, close the stream connection and return.
Next we will discuss the components of the stream handler.
Here the middleware serves three purposes:
- Blacklister to filter blacklisted peers.
- Limit the request calls from a certain peer.
- Access log recorder: update peer metadata (e.g last request for LRU replacement)
Blacklister:
- Blacklister is a module to read and write black listed peers.
- Blacklister manages entries indexed by `libp2p.PeerID`, each associated with a timestamp indicating the time when access for the PeerID is no longer forbidden (time out of jail).
- Blacklister will periodically remove stale entries and persist the data entries to a local persistence file.
- Blacklister provides an interface to check whether a given peerID is currently blacklisted.
- Blacklister is called in two cases. Each case should have a predefined jail period.
- Timeout of request from downloader module.
- Content error from downloader module.
- User-added nodes should be removed from the blacklist and shall not be blacklisted.
- Blacklister shall be thread-safe to use.
Limiter:
- Limit the request with some weight by request type.
- The limiter policy is customized for each sub-protocol.
(TODO) It would be great if libp2p has some middleware options. Explore the libp2p repo to find an authentication middleware.
The handshake process checks that the node on the new stream is compatible with the local configuration:
- Exchange Harmony stream protocol version and check compatibility.
- Only the overall protocol version is checked.
- Exchange protocol metadata and check compatibility
- Network ID
- Genesis block hash
- Running sub-protocol specifier, version and metadata
- Pass the stream to the peer manager to decide whether to run the sub-protocol. If the number of peers of the sub-protocol reaches the hard upper limit based on the peer management policy, the peer manager will immediately reject and close the stream.
- Run the corresponding sub-protocol to handle requests.
Peer management is to ensure that the network topology described in 2.1.4 is achieved.
-
When a new stream is set up, the peer manager determines whether to run the sub-protocol according to different peer management policies. If the number of peers running a certain sub-protocol reaches the hard limit, do not accept the connection. This is handled in the handshake process.
-
When a stream is closed, if the remaining number of nodes running the sub-protocol reaches the hard lower limit, spin up a maintenance thread.
-
Periodically spin up a maintenance thread to keep number of peers as expected. Illustrated in 2.2.7.
-
Dynamically register peer management policy. Provide public methods to be called to handle role change events. If a role change happens, discard the maintenance in progress and spin up a new maintenance loop according to the new peer management policy.
-
The different management policies for different roles should be implemented as an interface according to 2.1.4.
The maintenance loop can be awakened in three cases:
- A peer stream is closed (from 2.3.3).
- Periodically triggered every, say, 60 seconds.
- A role change event is received.
For each maintenance process, the following logic will be executed:
- For each subprotocol,
- Check whether the peer count is smaller than the soft lower limit. If true, execute a batch of peer discovery tasks to discover more peers to reach the expected connected peers.
- Check whether the peer count is greater than the soft upper limit. If true, remove peers from the head of the LRU entries to reach the expected connected peers.
All validation steps are done in upper function calls.
In this section, three examples will be given to illustrate the process of peer management:
- New node joining for validating shard 1.
- Node role switching from validator of shard 1 to leader of shard 1.
- Node receives a resharding task from shard 1 to shard 2.
- When a new node starts, a role change event (non-synced) is sent to the peer manager.
- The peer manager spins up the maintenance process.
- There are currently 0 connected peers, and the node needs 32-64 heavy sync connections in shard 1 and 2~4 ultra-light connections.
- Thus the maintenance will trigger discovery for a number of heavy sync and ultra-light connections.
- When a node stream is set up, emit the new node event to other modules.
- When a view change is triggered and the node finds itself to be the leader node, it informs the network peer manager of the event.
- The peer manager spins up the maintenance process.
- The node currently has 2 heavy sync streams on shard 1, 2 backup sync streams on shard 1 and 2 ultra-light sync streams. (Just for example)
- Checking the peer management policy of the leader, we need to remove 2 heavy sync and 2 backup sync streams.
- Send a close signal to the StreamHandler function of the 4 nodes in total.
- After the stream handler is closed, remove the peer from the peer manager and emit a stream close event.
- Resharding informs the network of the resharding event.
- The peer manager spins up the maintenance process.
- The node currently has 2 heavy sync streams, 2 backup sync streams on shard 1 and 2 ultra-light sync streams.
- Checking the new peer management policy, it needs 0 heavy sync on shard 1, 0 backup sync on shard 1, 0~16 heavy sync on shard 2, 2~4 backup sync on shard 2 and 2 ultra-light sync streams.
- Thus it will
- terminate 2 heavy sync streams
- terminate 2 backup sync streams
- run peer discovery of size 4 for backup sync on shard 2.
- Emit the new peer or peer closed events.
For the peer management policy of each sub-protocol, it is encouraged to have both soft limits and hard limits.
- The hard limit is the limit that shall not be reached. If the peer count falls outside of the hard limit, an action is required immediately.
- The soft limit is the limit set for periodic maintenance. It falls within the range of the hard limit. It could happen that a certain value has reached the soft limit but not the hard limit. If this happens, no immediate action is enforced, but it will be handled in the periodic maintenance process.
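These semantics can be captured in a small helper (the type and method names, and the "delta" convention of positive-to-add / negative-to-drop, are assumptions for illustration):

```go
package main

import "fmt"

// Limits sketch, with HardLow <= SoftLow <= SoftHigh <= HardHigh.
type Limits struct {
	HardLow, SoftLow, SoftHigh, HardHigh int
}

// needImmediateAction reports whether the peer count violates a hard
// limit, which requires an immediate reaction.
func (l Limits) needImmediateAction(count int) bool {
	return count < l.HardLow || count > l.HardHigh
}

// maintenanceDelta returns how many peers to add (positive) or drop
// (negative) during periodic maintenance; 0 while within soft limits.
func (l Limits) maintenanceDelta(count, target int) int {
	if count < l.SoftLow || count > l.SoftHigh {
		return target - count
	}
	return 0
}

func main() {
	l := Limits{HardLow: 2, SoftLow: 4, SoftHigh: 8, HardHigh: 16}
	fmt.Println(l.needImmediateAction(1)) // true: below hard lower limit
	fmt.Println(l.maintenanceDelta(3, 6)) // 3: discover 3 more peers
}
```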
This section will illustrate how sync requests are sent and responses received.
Pubsub should work just as the current logic:
- Node.Start will register the handler for pubsub messages.
- After validation, the information is dispatched to different modules.
- In this design, the Node.Sync message will also be delivered to the downloader module for further handling.
First layout the flow of synchronize requests:
- The downloader needs certain data to process for chain syncing. So it will call some method of `peer` in 2.3, which wraps Stream, to write request data of the corresponding type to the stream. This method is called in the downloader module as the initiator of a sync request.
- The request is read by the host `StreamHandler`. It will then fetch the related data from the blockchain and write the response to the stream.
- The response is read by the client. It will dispatch the response to the downloader module in a non-blocking way.
The ultra-light sync protocol is the simplest one - `getLastHeaderByEpoch`.
- Client send out requests with a list of epoch ids.
- Host retrieve the last header in the epoch, and send back to client
- Client receive the host's response. Directly deliver to the downloader module.
Backup Sync is also simple. It is needed when a client misses a pubsub message, and it supports only one request:
- The client sends out a request with block numbers to obtain the target blocks.
- The host reads the blocks from the blockchain and sends them via the stream.
- The client receives the blocks and delivers them to the downloader module directly.
HeavySync could be either fast sync or full sync from the client's view. But the protocols they run are actually exactly the same - they handle the same types of requests. The only difference is that there are some request types the client will never call in full sync mode. So all request types are listed together here.
Here defines all the request and corresponding handling logics over p2p stream connection. All responses received by client will be delivered to downloader module directly.
Parameters: StartBlockNumber, size, interval
Return: A list of headers of the target size, starting from StartBlockNumber and spaced by the given interval.
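Assuming `interval` means the spacing between requested block numbers, the host-side expansion of such a request could look like this sketch:

```go
package main

import "fmt"

// headerNumbers expands a (StartBlockNumber, size, interval) request into
// the block numbers whose headers the host should return.
func headerNumbers(start, size, interval uint64) []uint64 {
	if interval == 0 {
		interval = 1 // treat 0 as "consecutive blocks"
	}
	nums := make([]uint64, 0, size)
	for i := uint64(0); i < size; i++ {
		nums = append(nums, start+i*interval)
	}
	return nums
}

func main() {
	fmt.Println(headerNumbers(100, 3, 10)) // [100 110 120]
}
```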
Parameter: header hash
Return:
- Transactions of the block
- Staking transactions of block
- Incoming CX receipts of block
Parameter: header hash
Return:
- Transaction receipts
- Staking transaction receipts
Parameter: a list of node hashes
Return: A list of node RLP values corresponding to the node hashes in the request.
After the client receives the response, it directly delivers it to the downloader module.
Parameter: blockHash
Return: A list of BLS public keys of the `shardState` from the block
Client deliver to downloader module.
This is used to get the validatorSnapshot when the state does not exist.
Parameter: block number, list(address)
Return: A list of the following structure
- State Object {nonce, codeHash, storageHash, balance}
- code blob
- Merkle proof of the state object to the state root.
When the node starts up, it does not have the latest shard state, thus it is not able to validate any pubsub message against the shard state (there could be malicious messages in pubsub). It also shall not ask a random node for the latest shard state since this would make the node prone to eclipse attacks.
Option 1: A node only starts pubsub message handling when it has finished syncing the beacon skeleton. This also has some correlation with consensus initialization. I will leave these topics to the experts.
Option 2: Still use a centralized service - provide a set of trusted nodes for the latest shard state to get bootstrapped.
Looking for better names:
- Chain that only has the last headers of epochs. Candidates:
- Skeleton Chain (Sounds creepy)
- Sub-protocol used for synchronization in beacon headers
- Skeleton Sync (creepy)
- Ultra-light Sync (too long)
- Sub-protocol used once missing blocks from pubsub
- Backup Sync
Protobuf / RLP
If we are using Protobuf, most of the data fields are still RLP encoded data.
A node would naturally prefer a validator node over a non-validator node for synchronization. Thus we can enforce that only validators can provide HeavySync and Ultra-light sync.
The gain is huge:
This will make the eclipse attack super expensive and we can almost get rid of this attack approach.
In order to do this, we need some updates:
- (Sanity) Non-validators do not provide the heavySync or ultra-light sync service and do not advertise.
- Some BackupSync protocols might become a host/client model. Extra logic is needed to deal with this sync.
- Enforce a BLS key signature check for the client during the handshake process.
But the drawback is also huge:
- Large amount of work.
- A node has no way to know the latest shard state without ultra-light sync. Thus we cannot enforce validator identity on the initial ultra-light sync connections. (This is a chicken-and-egg paradox.)
- A node cannot set up the HeavySync streams until it has finished beacon syncing. Otherwise, these two could be done concurrently.
- The network layer also needs to deal with an additional event - epoch changes. If the shard state changes, some connected peers may become unelected and their streams need to be closed.
- An additional BLS sign and verify step is needed in stream setup, which opens some opportunity for CPU-intensive DDoS attacks.
... TO BE CONTINUED
- Chain logic change
3.1 Skeleton Chain
3.2 Header Chain
3.3 Receipt Chain
3.4 Offchain data
- Syncer
4.1 Overview
4.2 Skeleton Syncer
4.3 Full Syncer
4.3.1 Detailed process
4.3.2 header fetcher
In an effort to decouple modules.