Skip to content

Instantly share code, notes, and snippets.

@zouhuan1215
Last active September 8, 2021 16:45
Show Gist options
  • Save zouhuan1215/e3f4298e9fc311924ecd784ae7d1119a to your computer and use it in GitHub Desktop.
Save zouhuan1215/e3f4298e9fc311924ecd784ae7d1119a to your computer and use it in GitHub Desktop.
Week 6 -- Data structure in ethereum


Core data structure in Ethereum | Go to the image source

As an overview of this post, following topics will be discussed:

  1. How does Ethereum find specific transaction details given the transaction hash?
  2. Properties of merkle patricia trie.


🌈 Transaction | Transaction hash | Transaction receipts


Lifecycle of transactions | Go to the image source

  • Transaction is a record of client transaction request compared to transaction receipt. There are two types of transactions: those which result in message calls and those which result in the creation of a new accounts with associated code. Both of them share some common fields like nonce, gasPrice, gaslimit, to, value, (v,r,s) with field init specifying EVM-code while field data speficying the input data of message call. For details, please refer to ethereum yellow paper section 4.2.
  • Transaction hash is the ECDSA signature of transaction content as shown in the above picture.
  • Transaction receipt is a record of the transaction outcome compared to transaction. In order to encode information about a transaction concerning which it may be useful to form a zero-knowledge proof, or index and search, Ethereum encode a receipt of each transaction containing certain information from its execution. Transaction receipt is generated after corresponding transaction has been executed by Ethereum node, not obtained from other peer node.


🌈 Merkle Patricia trie


Ethereum Block Header | Go to the image source

If you are quite unfamiliar with Merkle patricia trie used in Ethereum, then this post on medium Data structure in Ethereum Episode 3: Patricia trie is highly recommended. Before the page is redirected to medium, following concenpts should be made clear.

  1. MPT(merkle patricia trie) is just a logic structure of data storage. The only type of database in Ethereum is the key-value store database -- query key as input and returned value as output. MPT node is stored in levelDB indexed by the hash output of its value field. The value field typically contains keys pointed to other MPT nodes. It is these keys contained in value field that specify relationships between father and children in MPT. So there are two kinds of indexes. One is the key of levelDB which tells the database to fetch the content of specific MPT node. The other is the path of MPT specifying which MPT node is to be queried next step.
  2. Note that the key of MPT node is the hash output of its value field. And the value contains keys pointed to other MPT nodes. So the key of MPT node servers as two role -- merkle proof of its value and "entry" to its subtree. Merkle proof property can be used to implement simplified payment verification on light client while "entry" property can be used to implement version retrieving.

There are 3 Patricia trie roots in Ethereum block header.

  • stateRoot: path-- sha3(ethereumAddress) value -- a 4 item array of [nonce, balance, storage root, codehash]. storage root is another Patricia trie where all smart contract code related to this account lives. State tried records the global state. There is only one state trie being updated over time.
  • transactionsRoot: path -- transactionIndex. transactionIndex is its index within the block its mined. value -- transaction details. Each block has an independent transactionsTrie.
  • receiptsRoot: path -- transactionIndex. value -- output of executing corresponding transaction. Each block has an independent receiptsTrie.


🌈 LevelDB databases


Key types in levelDB | Go to the image source

Each transaction actually corresponds to two key/value pairs. One is the metaData of a transaction which tells where locates this transaction -- the i-th transaction within n-th block. The key form of such metaData prefixed with "l" is shown in the above snippets captured from Ethereum source code. Specifically, the hash refers to transactionHash. The other stores content of this transaction which can be regarded as a MPT node in a specific block.


🌈 Putting it together

So here shows how function getTransaction works:

  1. Find the position of queried transaction: Do database query[transactionHash --> transaction metaData]. Suppose queried transaction is located at n-th block with transactionIndex being i.
  2. Get transactionsTrie entry in n-th block: Do database query[blocknumber --> blockhash, blockhash --> blockheader]. Blockheader contains the transactionsRoot which is the key of root node in the database.
  3. Route from transactionsRoot with path i in the MPT: Downstream from transactionsRoot and query database until MPT node corresponding to path i is found. Extract transaction information from the value field and return it.
@tiedaoxiaotubie
Copy link

congrats!Vivela fanfan

@tete1030
Copy link

👍 🎉

@arijitAD
Copy link

arijitAD commented Jun 3, 2021

@zouhuan1215 Where can I get content for other weeks?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment