loredanacirstea/A Master Shard to Account for Ethereum 2.0 Global Scope.md Secret

## A Master Shard to Account for Ethereum 2.0 Global Scope.md

      
    tl;dr An expanded version of https://ethresear.ch/t/a-master-shard-to-account-for-ethereum-2-0-global-scope/5730, with a proposal for Ethereum development funding.
But … Ethereum 2.0 specs do not have a Master Shard!

Current Ethereum 2.0 specs mention the Beacon Chain, which will host Execution Environment (EE) code and (maybe) the last state root for each shard. We have Block Proposers and (maybe) Relayers, who can attest transaction results. We have Validators, part of shard committees, who validate proposed blocks and send the shard state root to the Beacon Chain.
One of the major pain points will be atomic cross-shard transactions. Mechanisms for asynchronous cross-shard transactions have been described, but they have some drawbacks: latency (expected ~6 min) and higher gas cost. Mechanisms for synchronous cross-shard transactions have also been described at a higher level, but there is no agreement on exact details or on whether this will be feasible in practice.
The current approach is:

- drawbacks will be solved by Layer 2 (TODO Rollup, ZKSnarks etc.)
- you can redeploy a contract on another shard that uses it frequently - ok for libraries, not great if you need the entire cross-shard storage

Global Scope for High-frequency Reads and Writes

We need sharding in Ethereum because it is decentralized and anyone can store whatever they want on-chain. And they don't need to worry about the storage. Heck, they don't even need to run a full node themselves.
You cannot say "well, I'll just keep in sync with this one shard, because it contains everything that I want", since you cannot stop people from writing to a shard. (Much awesome, by the way.)
What if you could keep in sync only with the highly used data and some additional shards where you have your services?
What if you could have a Global Scope with highly used libraries and registries from all shards, which structure projects/subsystems and make them interoperable?
The Global Scope is intended for public, highly used data, that would help create a unitary OS across shards and would help off-chain tools to analyze and understand on-chain data. It can, for example, include various ontologies used to label blockchain data or provide transaction metadata, very useful for global ML and AI systems or blockchain explorers that will be able to display chain data in rich formats.
Is there life after the Global Scope?
Of course! Anyone can ditch the Global Scope standardization and build whatever they want. We all like rebels, they bring innovation.
I'm In! Now What?
A Master Shard to Account for Global Scope

We propose the existence and functionality of a special shard that we call the Master Shard (MS). The main intent is for it to act as a cache for frequently-used contracts and (frequently-used & rarely-modified) data from all the normal shards and bring them into the "global scope", thus optimizing inter-shard operations.
Every node will always keep the Beacon Chain and the MS in sync. Calls to the MS will be faster and cheaper than calls to other shards.
One option that has been discussed is to keep the Beacon Chain EEs stateless. In that case, the MS could also hold the last state roots for all shards. This is similar in concept to a master database, where databases, tables, and fields are kept as records.
Master Shard Transactions

The MS only has three allowed methods:

- Add <shardId> <typeOfResource> <resource>, returns the <resourceId> corresponding to the stored resource
- Update <shardId> <typeOfResource> <resourceId1> <resource2>
- Get <shardId> <typeOfResource> <resourceId>, returns the cached <resource>

<resource> can be a contract or an instantiated value type - e.g. uintX, boolean, array, struct.
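The three methods above can be sketched as a small in-memory model. This is only an illustration: the class name `MasterShard`, the composite key format, and the sequential `resourceId` scheme are assumptions, not part of any spec.

```javascript
// Hypothetical in-memory model of the Master Shard's three methods.
// Class name, key format and id scheme are illustrative assumptions.
class MasterShard {
  constructor() {
    this.store = new Map(); // key: "<shardId>:<typeOfResource>:<resourceId>"
    this.nextId = 0;        // assumed: ids are handed out sequentially
  }

  static key(shardId, typeOfResource, resourceId) {
    return `${shardId}:${typeOfResource}:${resourceId}`;
  }

  // Add <shardId> <typeOfResource> <resource> -> <resourceId>
  add(shardId, typeOfResource, resource) {
    const resourceId = this.nextId++;
    this.store.set(MasterShard.key(shardId, typeOfResource, resourceId), resource);
    return resourceId;
  }

  // Update <shardId> <typeOfResource> <resourceId1> <resource2>
  update(shardId, typeOfResource, resourceId, resource) {
    const k = MasterShard.key(shardId, typeOfResource, resourceId);
    if (!this.store.has(k)) throw new Error(`unknown resource ${k}`);
    this.store.set(k, resource);
  }

  // Get <shardId> <typeOfResource> <resourceId> -> cached <resource>
  get(shardId, typeOfResource, resourceId) {
    return this.store.get(MasterShard.key(shardId, typeOfResource, resourceId));
  }
}

const ms = new MasterShard();
const id = ms.add(1, "uint256", 42n);
ms.update(1, "uint256", id, 43n);
console.log(ms.get(1, "uint256", id)); // 43n
```

Note that there is no delete method here, matching the three-method list; removal is handled separately.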
Writing to the Master Shard

Writing to the MS cannot be done through a normal, user-initiated transaction. This is a process controlled by the Beacon Chain's EEs.
Block Proposers (with the help of Relayers) and Validators run the EE scripts on the shard block data, in order to get the post-transition shard state root hash. The Validators then send this hash to the Beacon Chain.
A simplified EE can be viewed as a reducer function: it takes the previous shard state and the current transition and returns the post-transition shard state.
function reducer(shard_prev_state_root, block_data) {
    // reduce, hash etc.
    return shard_next_state_root;
}
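To make the reducer idea concrete, here is a minimal runnable sketch. It assumes Node.js, uses SHA-256 over a joined string as a stand-in for the real state-root computation (which is not specified here), and assumes a hypothetical block layout with a `transactions` array. The names `sha256Hex` and `reducerSketch` are illustrative.

```javascript
const crypto = require("crypto");

// Stand-in hash: the actual hashing/Merkleization scheme is not specified here.
function sha256Hex(...parts) {
  return crypto.createHash("sha256").update(parts.join("|")).digest("hex");
}

// Folds each transaction of the block into the running state root.
// Assumed block layout: { transactions: [...] }.
function reducerSketch(shardPrevStateRoot, blockData) {
  return blockData.transactions.reduce(
    (root, tx) => sha256Hex(root, JSON.stringify(tx)),
    shardPrevStateRoot
  );
}

const genesisRoot = sha256Hex("genesis");
const block = {
  transactions: [
    { to: "0xabc", value: 1 }, // hypothetical transactions
    { to: "0xdef", value: 2 },
  ],
};
const nextRoot = reducerSketch(genesisRoot, block);
console.log(nextRoot.length); // 64 (hex characters of a SHA-256 digest)
```

The same inputs always give the same root, which is the property Validators rely on when they re-run the EE.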
The output can also contain transaction receipts or other by-products returned by executing the transactions inside the VM (based on eWasm in Eth2).
The VM can determine if a resource (e.g. smart contract) from Shard1 has been frequently used by other shards and decide to move it to the Master Shard. It will build the necessary state transitions for this, run them and get the receipts.
Our simplified EE example becomes:
function reducer(shard_prev_state_root, ms_prev_state_root, block_data) {
    // get MS state transitions from block_data and add them to the previous MS state
    // reduce, hash etc.
    return [shard_next_state_root, ms_next_state_root];
}
The new MS state transitions, along with the previous MS state root hash, will go through the same process of being reduced. The new output will contain both final state root hashes: one for the originating shard and one for the MS. Additionally, it will hold all the transaction receipts.
Writing to the MS at this stage should not cost the transaction initiator more. Gas estimations are still an open subject.
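A runnable sketch of the extended reducer, returning both roots and the receipts, could look like the following. The block layout (`transactions`, `msTransitions`), the receipt shape, and the SHA-256 stand-in for the state-root computation are all assumptions made for illustration.

```javascript
const crypto = require("crypto");

// Stand-in for the real state-root computation, which is not specified here.
const hashStep = (root, item) =>
  crypto.createHash("sha256").update(root + JSON.stringify(item)).digest("hex");

// Assumed block layout: { transactions: [...], msTransitions: [...] }.
function reducerWithMs(shardPrevStateRoot, msPrevStateRoot, blockData) {
  const receipts = [];

  // Reduce the shard's own transactions into the next shard root.
  const shardNextStateRoot = blockData.transactions.reduce((root, tx) => {
    const next = hashStep(root, tx);
    receipts.push({ scope: "shard", tx, postRoot: next });
    return next;
  }, shardPrevStateRoot);

  // Reduce the MS state transitions into the next MS root, the same way.
  const msNextStateRoot = blockData.msTransitions.reduce((root, t) => {
    const next = hashStep(root, t);
    receipts.push({ scope: "ms", tx: t, postRoot: next });
    return next;
  }, msPrevStateRoot);

  return { roots: [shardNextStateRoot, msNextStateRoot], receipts };
}

const out = reducerWithMs("a".repeat(64), "b".repeat(64), {
  transactions: [{ to: "0xabc", value: 1 }],
  msTransitions: [{ op: "Add", shardId: 1, typeOfResource: "contract", resource: "0xabc" }],
});
console.log(out.receipts.length); // 2
```

If a block carries no MS transitions, the MS root comes back unchanged, so ordinary blocks do not touch the global scope.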
High-level sequences of what happens before and after adding a contract to the MS:


When Is a Resource Moved to the MS?

I mentioned that the VM looks at how frequently a Shard1 resource is used and determines if it should move it to the Master Shard.
This requires a counter for how many times the resource is used by other shards (reads or writes), and a cache_threshold: a global variable (per resource) holding the maximum number of uses before the resource is cached.
The cache_threshold should be adjustable over time: if the total number of Ethereum transactions increases substantially, the cache_threshold might need to be higher.
The resource counter can be stored on the MS itself, in a smart contract. It can be a key-value store, where the key is the resource address, which already contains the shard identifier, and the value is the current counter value. If it is possible to cache smart contract storage only partially (per data record), then the key might also contain the data record identifier.
The counter is increased by the EE every time the resource is read or written. This will happen with any cross-shard transaction. Gas costs remain unspecified, but counting should not be expensive, as it should be deterministic and part of the system (not initiated by the user).
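The counter and threshold described above can be sketched as a key-value store. The class name `UsageCounter`, the `address/recordId` key format, and the choice to trigger caching once the count reaches the threshold (rather than exceeds it) are assumptions for illustration.

```javascript
// Hypothetical usage counter kept on the MS; names and key format are assumptions.
class UsageCounter {
  constructor(cacheThreshold) {
    this.cacheThreshold = cacheThreshold; // adjustable over time
    this.counts = new Map();              // key: resource address (+ optional record id)
  }

  // Called by the EE on every cross-shard read or write of the resource.
  // Returns true once the resource has been used cacheThreshold times
  // (assumed boundary: "reaches", not "exceeds").
  recordUse(resourceAddress, recordId = null) {
    const key = recordId === null ? resourceAddress : `${resourceAddress}/${recordId}`;
    const count = (this.counts.get(key) || 0) + 1;
    this.counts.set(key, count);
    return count >= this.cacheThreshold;
  }
}

const counter = new UsageCounter(3);
const address = "shard1:0xabc"; // hypothetical address that embeds the shard identifier
console.log(counter.recordUse(address)); // false
console.log(counter.recordUse(address)); // false
console.log(counter.recordUse(address)); // true
```

Passing a `recordId` gives each data record its own counter, which covers the partial-caching case.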
Updating Resources on the Master Shard

If a transaction is sent to a smart contract on Shard1, triggering a change of a resource that is cached on the MS, the EE will also update the cached resource, using the Update MS transaction.
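As a rough sketch of this write path, assume shard storage and the MS cache are both modeled as plain `Map`s keyed by resource address. The function name `applyWrite` and the shape of the emitted Update transition are hypothetical.

```javascript
// Assumption: shard storage and MS cache are Maps keyed by resource address.
// Returns the MS transitions the EE would have to run for this write.
function applyWrite(shardStorage, msCache, resourceAddress, newValue) {
  shardStorage.set(resourceAddress, newValue); // the normal shard write

  const msTransitions = [];
  if (msCache.has(resourceAddress)) {
    // The resource is cached: mirror the change with an Update MS transaction.
    msCache.set(resourceAddress, newValue);
    msTransitions.push({ op: "Update", resourceAddress, resource: newValue });
  }
  return msTransitions;
}

const shardStorage = new Map([["shard1:0xabc", 1], ["shard1:0xdef", 1]]);
const msCache = new Map([["shard1:0xabc", 1]]); // only 0xabc is cached

console.log(applyWrite(shardStorage, msCache, "shard1:0xabc", 2).length); // 1
console.log(applyWrite(shardStorage, msCache, "shard1:0xdef", 2).length); // 0
```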
Removing Resources from the Master Shard

The easiest solution would be resetting the MS after a number of blocks (e.g. the equivalent of 1 year), removing the cached resources.
There are other solutions, more complex or computationally intensive, that we will propose.
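The periodic reset is simple enough to sketch in a few lines. The function name `maybeResetMs` and the `resetIntervalBlocks` parameter are hypothetical, and the interval in the usage example is deliberately tiny, since the real number of blocks per year depends on parameters not fixed here.

```javascript
// Clears the MS cache every resetIntervalBlocks blocks.
// Assumption: the cache is a Map and the reset is triggered by block number alone.
function maybeResetMs(msCache, blockNumber, resetIntervalBlocks) {
  if (blockNumber > 0 && blockNumber % resetIntervalBlocks === 0) {
    msCache.clear();
    return true;
  }
  return false;
}

const cache = new Map([["shard1:0xabc", "cached contract"]]);
console.log(maybeResetMs(cache, 99, 100));  // false
console.log(maybeResetMs(cache, 100, 100)); // true
console.log(cache.size);                    // 0
```

After a reset, resources that are still popular would be cached again through the usual counter mechanism.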
Reading from the Master Shard

Each time a transaction sent to a shard needs to read data from another shard, it will first look in the global scope (MS) to see if the data is cached. If it is not, the transaction will continue normally. If it is, it will use the global scope. The MS enables data memoization.
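The lookup order described above can be sketched as follows. The MS cache is modeled as a `Map`, and `crossShardRead` is a hypothetical callback standing in for the normal (slow) cross-shard read.

```javascript
// Check the global scope first; fall back to the normal cross-shard read.
// crossShardRead is a hypothetical stand-in for the asynchronous mechanism.
function readResource(msCache, resourceAddress, crossShardRead) {
  if (msCache.has(resourceAddress)) {
    return { value: msCache.get(resourceAddress), source: "ms" };
  }
  return { value: crossShardRead(resourceAddress), source: "shard" };
}

const globalScope = new Map([["shard2:0xlib", "library code"]]);
const slowRead = (address) => `value of ${address} fetched cross-shard`;

console.log(readResource(globalScope, "shard2:0xlib", slowRead).source);   // ms
console.log(readResource(globalScope, "shard2:0xother", slowRead).source); // shard
```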
Conclusions


- avoids redeployment of libraries and contracts to multiple shards
- cross-shard transactions will be faster if read data is cached
- there are gas costs to caching data which should be clarified in more detail

Master Shard and Funding Ethereum Development

The MS caching could also be used to fund Ethereum development. Part of the gas costs saved by caching into the global scope can be directed to an account used for this purpose.
This way, users would pay for an actual service: caching.
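A small arithmetic sketch of how such a split might work: the share directed to funding (`fundingShareBps`, in basis points) and all gas figures below are made-up assumptions, since actual gas costs are still unspecified.

```javascript
// Splits the gas saved by a cached read between the user and a funding account.
// fundingShareBps is a hypothetical parameter: 10000 bps = 100% of the savings.
// BigInt is used so that integer division is explicit.
function splitGasSavings(uncachedGasCost, cachedGasCost, fundingShareBps) {
  const saved = uncachedGasCost > cachedGasCost ? uncachedGasCost - cachedGasCost : 0n;
  const toFund = (saved * fundingShareBps) / 10000n;
  return {
    userPays: cachedGasCost + toFund, // still cheaper than the uncached cost
    toFund,                           // goes to the development funding account
    userSaves: saved - toFund,
  };
}

// Made-up numbers: 100000 gas uncached, 40000 gas cached, 20% of savings to funding.
const split = splitGasSavings(100000n, 40000n, 2000n);
console.log(split.toFund);    // 12000n
console.log(split.userPays);  // 52000n
console.log(split.userSaves); // 48000n
```

As long as the share is below 100%, the user always pays less than the uncached cost, so the caching service remains worth paying for.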