by Jacky@Harmony 5/20/2020
This document analyzes the Ethereum synchronization logic. The analysis is based on go-ethereum at commit hash 65ce550b37670ce34aeaeaa6e66510028d2f7603, the latest commit at the time of writing.
For large batch block synchronization, Ethereum has three synchronization modes (a small sketch of the mode enum follows this list):
- FullSync: download block headers and bodies block by block, then insert them into the chain. For each inserted block, run the EVM over every transaction in the block. Transaction receipts are yielded as a natural result of the EVM processing the transactions.
- FastSync: sync only headers, bodies, and receipts up to the block at height (currentHeight - 64), without full state, and without processing transactions. At currentHeight - 64, fetch the full state from peers. Process the last 64 blocks as in FullSync. FastSync runs only once, when the node starts; after the first sync, the downloader switches to FullSync mode.
- LightSync: serves as the synchronization service for light clients. Only synchronizes headers, without transactions or state.
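For reference, the downloader encodes these modes as a small enum; below is a minimal sketch mirroring eth/downloader/modes.go at this commit (simplified, not the verbatim source):

package main

import "fmt"

// SyncMode enumerates the downloader's large-batch sync strategies.
type SyncMode int

const (
	FullSync  SyncMode = iota // headers + bodies, execute every transaction in the EVM
	FastSync                  // headers + bodies + receipts, pull state at the pivot, no EVM replay
	LightSync                 // headers only, for light clients
)

func main() {
	fmt.Println(FullSync, FastSync, LightSync) // 0 1 2
}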
For our implementation, we focus on FastSync to replace the rclone sync solution.
There are some very important assumptions that have been made for FastSync:
- Legitimacy of a block depends only on the hash puzzle (select the block with the highest Total Difficulty).
- No other data is needed to verify a block; the header data itself is sufficient for verification.
Unlike the current implementation of Harmony, Ethereum's P2P network is purely end-to-end; that is, each node keeps track of information about all connected peers in a structure holding the latest sync info (eth/peer.go:87), and communication is direct reads/writes on an encrypted channel over a TCP connection. So every sync message in Ethereum is associated with a specific peer.
In fact, in Ethereum the implementation of "subscribe" and "gossip" is not in the P2P layer, but in the eth protocol. Each node has the responsibility to validate a message before relaying it to the whole network. The downside is that this validation increases message delay on the network.
A Kademlia network topology ensures that a message can be broadcast to the whole network within O(log n) relay hops; for example, a network of roughly 65,000 nodes can be covered in about 16 hops.
Also note that there is no client/host role in an Ethereum p2p connection. All communication between two nodes is done asynchronously, which ensures the p2p connection has high throughput.
Protocol manager: Protocol manager (eth/handler.go:63) is the struct that defines all P2P message handling on a per-peer basis. All sync-related logic is triggered by, and depends on, the P2P message handling of this structure.
Peer: Protocol manager keeps track of each peer, with its latest header, in the structure peer (eth/peer.go:87); peers are managed as a group in peerSet (eth/peer.go:693).
Downloader: Downloader is the major sync module. It has three modes: FullSync, FastSync, and LightSync. On initial start it fetches historical data as FastSync, and afterwards it checks for and fetches the latest blocks every 10 seconds.
blockFetcher: fetches block bodies when receiving a block hash announcement. The difference between the fetcher and the downloader is that the fetcher is passive, only fetching data for the announced hash, while the downloader progressively fetches all the data it needs.
This is a simple walkthrough from three points of view:
- Miner
- Node with latest blocks
- Node without latest blocks
- After mining a block, a block broadcast is triggered: minedBroadcastLoop (eth/handler.go:883).
- A subset of the connected peers (sqrt(len(peers)) of them) receives the full block body message, as sketched below.
- The rest of the peers receive a new block hash message.
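A minimal sketch of this split (the sqrt-of-peers rule matches what BroadcastBlock in eth/handler.go does; the peer type and print statements are stand-ins):

package main

import (
	"fmt"
	"math"
)

type peer struct{ id string }

// broadcastBlock sends the full block to sqrt(len(peers)) peers and only
// the hash announcement to the rest.
func broadcastBlock(peers []*peer) {
	transfer := peers[:int(math.Sqrt(float64(len(peers))))]
	for _, p := range transfer {
		fmt.Println("full block ->", p.id) // stand-in for AsyncSendNewBlock
	}
	for _, p := range peers[len(transfer):] {
		fmt.Println("hash announce ->", p.id) // stand-in for AsyncSendNewBlockHash
	}
}

func main() {
	broadcastBlock([]*peer{{"a"}, {"b"}, {"c"}, {"d"}}) // 2 full blocks, 2 announcements
}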
- Node receives a new hash message or a new block message from other nodes.
- In either case, the block announce is sent to blockFetcher, which fetches the data and puts it in a queue.
- Block fetcher processes the announce message and sends a FetchBody request.
- Fetcher receives the block data and inserts it into the blockchain.
- Node discovers a new peer and starts a handshake (eth/peer.go:563).
- The new peer is registered in peerSet, and a peer change event is sent (eth/handler.go:338).
- The change is captured by chainSyncer.loop (eth/sync.go:215), which calls ProtocolManager.doSync.
- The Downloader.Synchronize method is called to synchronize with the peer (eth/sync.go:308).
- Downloader chooses one of the sync modes to progressively fetch the needed data and insert it into the chain.
- After insertion, the block hash is broadcast to peers to announce the node's latest header.
In this part, only the downloader's FastSync process is discussed. The synchronization process starts with a given peer as a "buddy". The buddy peer should be selected as the one with the highest block TD; it provides the target block number and the header "skeleton", while the other data can be retrieved from other peers.
Also note that the downloader module heavily uses the pipeline design pattern to achieve high performance with graceful coordination between different tasks.
How ugly the code looks is comparable to how well the code is optimized in performance. - by Jacky
In this part, the execution of the function downloader.syncWithPeer is elaborated (eth/downloader/downloader.go:416). Note that the following description does not follow the code structure; it just gives a sense of what FastSync is doing (a small pivot-split sketch follows the list).
1. Find the common ancestor through binary search by talking to the buddy peer. The ancestor block is the latest block for which the remote and local nodes share the same block header and block body.
2. Fetch the FastSync block data before the pivot (the pivot is defined as currentHeight - 64).
   2.1. Fetch the headers.
   2.2. Fetch block bodies from any available peer and store them in a result queue.
   2.3. Fetch block receipts from any available peer and store them in a result queue.
   2.4. Insert the headers, transactions, uncle blocks, and receipts into the blockchain (without EVM processing).
3. Fetch the pivot block data.
   3.1. Fetch the header.
   3.2. Fetch the block body and receipts from any available peer.
   3.3. Fetch and assemble the state data. State is fetched with trie nodes as the fundamental granularity.
   3.4. Insert the block data into the blockchain.
4. Fetch the block data after the pivot.
   4.1. Fetch the headers.
   4.2. Fetch block bodies from any available peer and store them in a result queue.
   4.3. Call insert chain to process the transactions in the EVM, which automatically yields receipts.
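As a quick illustration of the split, here is a hedged sketch of the pivot arithmetic (the constant 64 comes from the description above; the function name is illustrative, not geth's):

package main

import "fmt"

const fsMinFullBlocks = 64 // head blocks processed with the EVM, per the text above

// planFastSync splits the block range into the three phases: before the
// pivot (no EVM), the pivot itself (full state download), and after the
// pivot (full processing).
func planFastSync(ancestor, height uint64) (pre [2]uint64, pivot uint64, post [2]uint64) {
	pivot = height - fsMinFullBlocks
	return [2]uint64{ancestor + 1, pivot - 1}, pivot, [2]uint64{pivot + 1, height}
}

func main() {
	pre, pivot, post := planFastSync(1000, 5000)
	fmt.Println(pre, pivot, post) // [1001 4935] 4936 [4937 5000]
}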
This is the interpreted overall process of the downloader's FastSync mode. The actual code executes the above logic in a different way:
- Find the common ancestor (step 1)
- Spawn a loop to fetch headers (2.1, 3.1, 4.1)
- Spawn a loop to insert header data into chain (2.4)
- Spawn a loop to fetch bodies (2.2, 3.2, 4.2)
- Spawn a loop to fetch receipts (2.3, 3.3)
- Spawn a loop to assemble and process fast sync data (3.4, 4.3)
Note that in the code, the common-ancestor step executes first, and then the five loops run concurrently. The following sections illustrate these steps in detail.
- Fetch the height of the given peer (eth/downloader/downloader.go:437).
- Consider 90000 as the soft finality limit, and set currentHeight - 90000 as the starting point (eth/downloader/downloader.go:722).
- Fetch header data from the given peer to locate where the fork happens (eth/downloader/downloader.go:755).
- In the first iteration, divide the total search range into 16 spans and locate the fork (a 16-way binary search).
- In later iterations, use plain binary search to locate the fork.
Note that in later iterations we could also use a 16-way binary search to improve efficiency. A simplified model of the two phases is sketched below.
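The sketch assumes a hasBlock oracle that answers whether the local chain contains the remote header at a given height; the real code batches header requests instead of probing one by one:

package main

import "fmt"

// findAncestor sketches the two-phase search: probe 16 evenly spaced
// heights to bracket the fork, then binary-search inside the bracket.
func findAncestor(floor, height uint64, hasBlock func(uint64) bool) uint64 {
	// Phase 1: split [floor, height] into 16 spans and find the last match.
	step := (height - floor) / 16
	if step == 0 {
		step = 1
	}
	lo, hi := floor, height
	for n := floor; n <= height; n += step {
		if hasBlock(n) {
			lo = n
		} else {
			hi = n
			break
		}
	}
	// Phase 2: plain binary search between the bracketing probes.
	for lo+1 < hi {
		mid := (lo + hi) / 2
		if hasBlock(mid) {
			lo = mid
		} else {
			hi = mid
		}
	}
	return lo
}

func main() {
	fork := uint64(12345) // local and remote chains agree up to this height
	common := findAncestor(0, 90000, func(n uint64) bool { return n <= fork })
	fmt.Println(common) // 12345
}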
Code location: eth/downloader/downloader.go:912
This function aims to fetch all headers from the common ancestor to the pivot. The overall design of fetching headers is to create a goroutine that loops until finished, serving as a "task manager" responsible for:
- Send out the initial requests asynchronously.
- Receive results from the header channel.
- Based on the returned results, send out more requests for more data.
- Deliver the results to headerProcCh for the next goroutine to process, once enough data has accumulated.
- Listen for timeout events:
- If a timeout happens for the buddy peer, inform the other fetchers to stop.
- Return an error. The buddy peer will be dropped later because of the inconsistency in its behaviour.
The strategy for fetching headers:
- Request skeleton data from the buddy peer (the skeleton is headers at an interval of 192; each gap is filled by a batch of 192 headers). The shape of the skeleton request is sketched after this list.
- Fill in the missing headers of the skeleton from available peers concurrently with the fetchParts function.
- Assemble the data in the queue, and throttle it in order into headerProcCh for the next step.
- Continue to the next skeleton batch.
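A sketch of the skeleton geometry under these numbers (MaxHeaderFetch = 192 matches geth's constant; the function name here is illustrative):

package main

import "fmt"

const maxHeaderFetch = 192 // headers per gap; matches geth's MaxHeaderFetch

// skeletonHeights lists the header numbers requested as the skeleton for
// one batch: every 192nd header starting at from+191, so that each gap
// between consecutive skeleton headers is exactly one 192-header fill.
func skeletonHeights(from uint64, batch int) []uint64 {
	heights := make([]uint64, 0, batch)
	for i := 0; i < batch; i++ {
		heights = append(heights, from+uint64((i+1)*maxHeaderFetch)-1)
	}
	return heights
}

func main() {
	fmt.Println(skeletonHeights(1, 4)) // [192 384 576 768]
}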
- Receive in-order headers from headerProcCh (eth/downloader/downloader.go:1396).
- Insert them into the header chain, verifying one header in every 100 (eth/downloader/downloader.go:1470).
- Schedule the fetch-body and fetch-receipt tasks for each inserted header (eth/downloader/downloader.go:1495).
- If an error happens in verifyHeader, roll back 2048 blocks before the point of error, and then return an error so that the buddy peer is dropped later (eth/downloader/downloader.go:1481, eth/downloader/downloader.go:1363).
FetchHeaderSkeleton, FetchBodies and FetchReceipts all call the function fetchParts, and are quite alike in logic. Here FetchBodies is used as an example to walk through this very complicated function (eth/downloader/downloader.go:1186).
Notice there are three pools in the queue for task management (sketched with simplified types after this list):
- Task pool (request not sent)
- Pending pool (request sent, response not received)
- Done pool (response received, not processed)
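A sketch of these pools with simplified types (geth's queue.go uses a priority queue for the task pool and a fetchRequest struct per peer; this just shows the state transitions):

package main

import "fmt"

type hash string

// fetchRequest records an in-flight request: which peer it went to and
// which headers it asked bodies for.
type fetchRequest struct {
	peer    string
	headers []hash
}

// pools sketches the three body-fetch pools in the downloader queue.
type pools struct {
	taskPool map[hash]bool            // scheduled, request not sent yet
	pendPool map[string]*fetchRequest // request sent (keyed by peer), response pending
	donePool map[hash]bool            // response received, not yet processed
}

func main() {
	p := pools{
		taskPool: map[hash]bool{"h1": true, "h2": true},
		pendPool: map[string]*fetchRequest{},
		donePool: map[hash]bool{},
	}
	// Assign h1 to a peer: move it from the task pool to the pending pool.
	delete(p.taskPool, "h1")
	p.pendPool["peer-a"] = &fetchRequest{peer: "peer-a", headers: []hash{"h1"}}
	// Response for h1 arrives: move it from pending to done.
	delete(p.pendPool, "peer-a")
	p.donePool["h1"] = true
	fmt.Println(len(p.taskPool), len(p.pendPool), len(p.donePool)) // 1 0 1
}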
FetchBodies is a loop that runs until the fetching tasks are all done. The following is the logic of the FetchBodies function:
- The task is already scheduled when the corresponding header is processed successfully in step 3.4.3.
- Receive block results from d.bodyCh.
- Validate the results and write them to resultCache (eth/downloader/downloader.go:1209).
- Wake the next fetch by sending a signal to the update channel (eth/downloader/downloader.go:1231).
- If a result fails validation, just skip the result and continue to the next one (eth/downloader/downloader.go:1209).
- The fetching is awakened in three cases (see the select-loop sketch after this list):
- Results are received from d.bodyCh, as in the previous step.
- A signal is observed on the update channel (awakened by other fetchers, eth/downloader/downloader.go:1235).
- A 100ms tick fires (eth/downloader/downloader.go:1246).
- When the fetching is awakened, assign new fetch tasks if possible:
- Clear all expired requests and peers (eth/downloader/downloader.go:1259).
- If it is the buddy peer that expires, terminate the synchronisation immediately.
- Push the expired tasks back to the task pool and remove them from the pending pool (eth/downloader/queue.go:665).
- If there is no more data to fetch, and the other fetchers report no more tasks, finish fetching and return a nil error (eth/downloader/downloader.go:1295).
- For each idle peer, assign it data to fetch, if any (eth/downloader/downloader.go:1319).
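The three wake-up sources map naturally onto a single select loop; below is a minimal, self-contained sketch (channel names follow the description above, everything else is illustrative):

package main

import (
	"fmt"
	"time"
)

func main() {
	bodyCh := make(chan string, 1)   // delivered results
	update := make(chan struct{}, 1) // pokes from other fetchers
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()

	bodyCh <- "block-body" // simulate one delivery
	for i := 0; i < 3; i++ {
		select {
		case body := <-bodyCh:
			fmt.Println("woken by delivery:", body)
			// Non-blocking self-poke, like the fetcher does after a delivery.
			select {
			case update <- struct{}{}:
			default:
			}
		case <-update:
			fmt.Println("woken by update signal")
		case <-ticker.C:
			fmt.Println("woken by 100ms tick")
		}
		// ... assign new fetch tasks to idle peers here ...
	}
}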
Fetching receipts follows the same logic as fetching bodies, so it is skipped here.
This is another goroutine that runs a loop until everything is done. It does the following:
- Get in-order block results (bodies + receipts) from queue.resultCache, which is assembled by the fetch-receipts and fetch-bodies loops.
- Split the results by the pivot.
- For results before the pivot, insert transactions, uncles and tx receipts directly into the chain (eth/downloader/downloader.go:1630).
- For the result on the pivot, download the full state content, and then insert the body data (eth/downloader/downloader.go:1649).
- For blocks after the pivot, use InsertChain to process the block data in the EVM (eth/downloader/downloader.go:1558).
StateSync has three layers:
- trie.Sync: related to the trie data structure; responsible for telling the upper layer what data to retrieve, and for combining the received data pieces into a trie.
- state.Sync: defines the extra data structures to be pulled based on the struct stateObject (additionally pulls the code and storage trie at each leaf node).
- Downloader stateSync: responsible for distributing the tasks given by state.Sync to different peers and feeding the results into trie.Sync.
A trie consists of nodes, and each node missing in the synchronisation process is represented by the structure request:
type request struct {
hash common.Hash // Hash of the node data content to retrieve
data []byte // Data content of the node, cached until all subtrees complete
raw bool // Whether this is a raw entry (code) or a trie node
parents []*request // Parent state nodes referencing this entry (notify all upon completion)
depth int // Depth level within the trie the node is located to prioritise DFS
deps int // Number of dependencies before allowed to commit this node
callback LeafCallback // Callback to invoke if a leaf node is reached on this branch
}
The retrieval is done in DFS order, and the parents member reflects the hierarchical nature of the data structure.
Sync represents the task of syncing a whole trie, and is what the upper-layer functions call into:
type Sync struct {
database ethdb.KeyValueReader // Persistent database to check for existing entries
membatch *syncMemBatch // Memory buffer to avoid frequent database writes
requests map[common.Hash]*request // Pending requests pertaining to a key hash
queue *prque.Prque // Priority queue with the pending requests
bloom *SyncBloom // Bloom filter for fast node existence checks
}
- queue holds all unstarted requests.
- requests holds all pending requests.
- membatch holds the in-memory committed node data.
Trie.Sync methods (a driver sketch follows this list):
- AddSubTrie and AddRawEntry: the initial request to retrieve a trie or a blob (trie/sync.go:94). To sync a trie with a certain root, simply call AddSubTrie; to fetch a data blob by hash, call AddRawEntry. After the initial check, the function schedule (trie/sync.go:246) is called, which adds the requested root to s.queue.
- Missing: get the hashes of unsynced nodes (trie/sync.go:168). Missing pops and returns elements from s.queue. This method is called when there are idle peers ready to fetch data.
- Process: deal with node results (trie/sync.go:179). The result of a node request can be one of three node types:
  - leaf node (value node): this is a leaf, so call the callback function and continue processing its parent.
  - short node / full node: if its children have not been processed yet, add new requests to process its children; if all of its children have been processed, commit and continue processing its parent.
- Commit: commit the in-memory data to the database.
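Putting the four methods together, the upper layer drives a trie sync roughly as below (a hypothetical driver with deliberately simplified signatures; geth's actual Missing/Process/Commit signatures differ):

package main

import "fmt"

// syncer abstracts the trie.Sync operations described above, with
// simplified signatures for illustration only.
type syncer interface {
	Missing(max int) []string               // pop up to max unsynced node hashes
	Process(hash string, node []byte) error // feed one fetched node back in
	Pending() int                           // number of nodes still outstanding
	Commit() error                          // flush the in-memory batch to the database
}

// drive fetches missing nodes until the whole trie is assembled. The fetch
// callback stands in for asking peers over the network.
func drive(s syncer, fetch func(string) []byte) error {
	for s.Pending() > 0 {
		for _, h := range s.Missing(16) { // hand hashes to idle peers
			if err := s.Process(h, fetch(h)); err != nil {
				return err
			}
		}
		if err := s.Commit(); err != nil { // persist completed subtrees
			return err
		}
	}
	return nil
}

func main() { fmt.Println("see drive() above for the control flow") }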
State uses the trie data structure, and something more. State has stateObject as its leaf node, also known as an account, and within this structure code and storageTrie are stored. So when a stateObject is fetched, we also need to fetch the code data and the storageTrie under that account. Thus we have a trie.Sync with a special callback function:
// core/state/sync.go:31
callback := func(leaf []byte, parent common.Hash) error {
	var obj Account
	// Every leaf of the account trie is an RLP-encoded account.
	if err := rlp.Decode(bytes.NewReader(leaf), &obj); err != nil {
		return err
	}
	// Also schedule the account's storage trie and contract code.
	syncer.AddSubTrie(obj.Root, 64, parent, nil)
	syncer.AddRawEntry(common.BytesToHash(obj.CodeHash), 64, parent)
	return nil
}
downloader.stateSync sits on top of trie.Sync and calls its various methods. It is responsible for assigning the missing-node tasks to peers concurrently and feeding the results into trie.Sync.
- Start state sync.
  State sync is started in a forever loop stateFetcher (eth/downloader/statesync.go:75), which is started when the downloader module starts. It keeps reading state sync requests from the d.stateSyncStart channel. After dealing with the deliver channel and timeout events, received requests are handled in the stateSync.loop function (eth/downloader/statesync.go:278).
- Fetch results.
  This is done as the first step in stateSync.loop (eth/downloader/statesync.go:295). Call trie.Sync.Missing to collect all node hashes that need to be fetched, and distribute them to idle peers.
- Process requested results.
  - The node receives the response to the p2p request (eth/handler.go:625).
  - The response is sent to d.stateCh.
  - The response data is added to the in-memory queue cache finished (eth/downloader/statesync.go:142).
  - The queue cache is delivered to the inside loop (eth/downloader/statesync.go:135).
  - The inside loop receives the request and calls process to insert the node data into trie.Sync (eth/downloader/statesync.go:333).
  - The in-memory data in trie.Sync is committed (eth/downloader/statesync.go:292).
- Deal with timeout events (see the pointer-identity sketch after this list).
  - In step 2, when fetching data, each request is registered in the map active, which tracks active requests, along with a time.AfterFunc to be triggered upon timeout (eth/downloader/statesync.go:200).
  - After the timeout, the function is triggered and notifies the timeout channel (eth/downloader/statesync.go:202).
  - If the result has already been fetched, the request stored in the active map (written in step 1) will no longer be the same (nil once the response is received, a different pointer if the peer was assigned another task). In that case, do nothing and continue (eth/downloader/statesync.go:175).
  - If the pointer is still the same, a timeout has occurred. Add the request to the finished queue, where it waits to be picked up by the inside loop (eth/downloader/statesync.go:179).
  - The timeout request is noticed via the deliver channel (eth/downloader/statesync.go:307). If there is a timeout event, drop the peer (eth/downloader/statesync.go:319).
  - Failed fetches are placed in the retry queue (eth/downloader/statesync.go:462).
  - In the next fetch, the failed fetches are retried (eth/downloader/statesync.go:399).
- Errors or missing data in fetch results.
  - If a node returns wrong or missing data, the task remains in req.tasks (eth/downloader/statesync.go:431).
  - It is then added back to the fetch queue (eth/downloader/statesync.go:449).
  - It is fetched the next time around (eth/downloader/statesync.go:399).
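The pointer-identity trick in the timeout handling above can be sketched as follows (a simplified stand-in for eth/downloader/statesync.go:175-202; the request type and channels are illustrative):

package main

import (
	"fmt"
	"sync"
	"time"
)

type stateReq struct{ peer string }

func main() {
	var mu sync.Mutex
	active := map[string]*stateReq{} // in-flight requests, keyed by peer id

	req := &stateReq{peer: "peer-a"}
	mu.Lock()
	active[req.peer] = req
	mu.Unlock()

	// Arm the timeout: it only fires the timeout path if the very same
	// request pointer is still registered (nil after a response, a different
	// pointer if the peer was reassigned another task in the meantime).
	timeout := make(chan *stateReq, 1)
	time.AfterFunc(50*time.Millisecond, func() {
		mu.Lock()
		defer mu.Unlock()
		if active[req.peer] == req {
			delete(active, req.peer)
			timeout <- req
		}
	})

	// Simulate the response arriving first: unregister the request.
	mu.Lock()
	active[req.peer] = nil
	mu.Unlock()

	select {
	case r := <-timeout:
		fmt.Println("timed out:", r.peer)
	case <-time.After(100 * time.Millisecond):
		fmt.Println("response won the race; timeout ignored")
	}
}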
During synchronization, if any error happens with the buddy peer, it means the peer providing the target block is no longer stable. In this case, the synchronisation process is terminated and the buddy peer is dropped.
This is handled in the expire function inside fetchParts (eth/downloader/downloader.go:1186):
- The peer is dropped.
- The task is rescheduled (pushed back to the taskQueue) (eth/downloader/queue.go:630).
Two cases might happen:
- The data received is not consistent with the request:
  - The result cannot pass the reconstruct function (eth/downloader/queue.go:772).
  - A partial error is returned in the deliver function (eth/downloader/queue.go:870).
  - In the current code base, nothing is done with the peer (eth/downloader/downloader.go:1226).
- An error happens when inserting into the blockchain:
  - The error happens in processHeaders or processFastSyncContent; both return errInvalidChain.
  - If a header is invalid, roll back 2048 blocks (eth/downloader/downloader.go:1481).
  - If a body is invalid, there is no need to roll back (eth/downloader/downloader.go:1709).
  - The synchronization process is terminated (eth/downloader/downloader.go:545).
  - The buddy peer is dropped (eth/downloader/downloader.go:337).
- During synchronization, QoS is calculated on a per-peer basis whenever a certain task is finished (eth/downloader/peer.go:235).
- The throughput is used to calculate the capacity of a single request: the larger the throughput, the more data is queried in a single request (eth/downloader/peer.go:291). A sketch of this bookkeeping follows.
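A sketch of this QoS bookkeeping (the exponential-moving-average shape follows eth/downloader/peer.go in spirit; the 0.1 impact factor and the RTT target are illustrative assumptions):

package main

import (
	"fmt"
	"time"
)

const measurementImpact = 0.1 // weight of the newest measurement (assumed)

// updateThroughput folds one finished request into the peer's moving
// average, measured in items (headers/bodies/receipts) per second.
func updateThroughput(old float64, delivered int, elapsed time.Duration) float64 {
	measured := float64(delivered) / elapsed.Seconds()
	return (1-measurementImpact)*old + measurementImpact*measured
}

// capacity converts throughput into a per-request item count for a target
// round-trip time: faster peers get asked for more items per request.
func capacity(throughput float64, targetRTT time.Duration) int {
	if c := int(throughput * targetRTT.Seconds()); c > 1 {
		return c
	}
	return 1 // always ask for at least one item
}

func main() {
	tp := 100.0 // items/sec from earlier requests
	tp = updateThroughput(tp, 192, 800*time.Millisecond)
	fmt.Printf("throughput=%.1f capacity=%d\n", tp, capacity(tp, 3*time.Second))
}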
- In our implementation, can the legitimacy of a block be verified through the header data alone? Even for a view change block?
- EVM processing for staking relies heavily on off-chain data. Is there a way to verify the off-chain data, or a way to produce the off-chain data without doing EVM processing?
- Is there some coordination in verifying headers between the beacon chain and the side chains? If there is, what are the restrictions? (Crosslinks)
- Should the leader not provide the sync service, for better network performance? Should we offer a command-line flag for the user to choose whether to provide the sync service? Should there be a handshake before sync starts?