
## iroha_smart_contracts.md

      
Proposal for smart contracts in iroha

author: Bohdan Vanieiev

Judging by the list of commands and queries implemented in iroha, the current design of commands and queries is not good, because:

It lacks low-level commands from which arbitrary programs can be built, and these are required for a proper smart contract implementation. In other words, the system has only high-level commands such as transfer asset, so its feature set is limited.
It is hard to extend a running system with new commands and queries, because iroha has to be recompiled after every addition.
It is hard to add new commands or queries to the source code, because each addition requires changes in 7+ different parts of iroha.

Other blockchains

Bitcoin

Bitcoin uses a scripting system for transactions. Forth-like, Script is simple, stack-based, and processed from left to right. It is purposefully not Turing-complete, with no loops (to prevent infinite loops).
Scripting provides the flexibility to change the parameters of what's needed to spend transferred Bitcoins. For example, the scripting system could be used to require two private keys, or a combination of several, or even no keys at all.
A client creates a script and sends it to a Bitcoin node, whose internal interpreter executes and validates the script one instruction at a time.
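
To illustrate left-to-right, loop-free evaluation, here is a minimal Go sketch of a stack-based script interpreter. It is not Bitcoin's Script: the opcode names `ADD`, `EQUAL` and `DUP`, the `Eval` function and the integer-only stack are assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// Eval runs a whitespace-separated script from left to right on a stack.
// The opcodes (ADD, EQUAL, DUP) are a hypothetical, simplified subset.
// There are no jumps or loops, so a script halts after one pass over its tokens.
func Eval(script string) (bool, error) {
	var stack []int64
	pop := func() (int64, error) {
		if len(stack) == 0 {
			return 0, errors.New("stack underflow")
		}
		v := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		return v, nil
	}
	for _, tok := range strings.Fields(script) {
		switch tok {
		case "ADD", "EQUAL":
			b, err := pop()
			if err != nil {
				return false, err
			}
			a, err := pop()
			if err != nil {
				return false, err
			}
			if tok == "ADD" {
				stack = append(stack, a+b)
			} else if a == b {
				stack = append(stack, 1)
			} else {
				stack = append(stack, 0)
			}
		case "DUP":
			v, err := pop()
			if err != nil {
				return false, err
			}
			stack = append(stack, v, v)
		default:
			// Anything that is not an opcode must be an integer literal.
			n, err := strconv.ParseInt(tok, 10, 64)
			if err != nil {
				return false, fmt.Errorf("unknown opcode %q", tok)
			}
			stack = append(stack, n)
		}
	}
	// The script succeeds if the value left on top of the stack is non-zero.
	top, err := pop()
	if err != nil {
		return false, err
	}
	return top != 0, nil
}

func main() {
	ok, err := Eval("2 3 ADD 5 EQUAL")
	fmt.Println(ok, err) // prints: true <nil>
}
```

Because the interpreter only ever moves forward through the tokens, the running time is bounded by the script length, which is the property the Bitcoin design relies on.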
Ethereum

The code in Ethereum contracts is written in a low-level, stack-based bytecode language, referred to as "Ethereum virtual machine code" or "EVM code". The code consists of a series of bytes, where each byte represents an operation. In general, code execution is an infinite loop that consists of repeatedly carrying out the operation at the current program counter (which begins at zero) and then incrementing the program counter by one, until the end of the code is reached or an error or STOP or RETURN instruction is detected. The operations have access to three types of space in which to store data:

The stack, a last-in-first-out container to which values can be pushed and popped
Memory, an infinitely expandable byte array
The contract's long-term storage, a key/value store. Unlike stack and memory, which reset after computation ends, storage persists for the long term.

The code can also access the value, sender and data of the incoming message, as well as block header data, and the code can also return a byte array of data as an output.
The formal execution model of EVM code is surprisingly simple. While the Ethereum virtual machine is running, its full computational state can be defined by the tuple (block_state, transaction, message, code, memory, stack, pc, gas), where block_state is the global state containing all accounts and includes balances and storage. At the start of every round of execution, the current instruction is found by taking the pc-th byte of code, where pc is the program counter (or 0 if pc >= len(code)), and each instruction has its own definition in terms of how it affects the tuple.
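
The execution round described above can be sketched in Go as follows. This is a toy model, not the EVM: the state is reduced to (code, stack, pc, gas), the opcode byte values other than `OpStop = 0` are made up for the example, and charging one unit of gas per instruction is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical one-byte opcodes for this sketch (not real EVM values).
const (
	OpStop byte = 0x00
	OpPush byte = 0x01 // push the next byte of code onto the stack
	OpAdd  byte = 0x02 // pop two values, push their sum
	OpJump byte = 0x03 // set pc to the next byte of code
)

var (
	ErrOutOfGas = errors.New("out of gas")
	ErrStack    = errors.New("stack underflow")
)

// State is a reduced version of the tuple from the text.
type State struct {
	Code  []byte
	Stack []uint64
	PC    int
	Gas   uint64
}

// fetch returns the pc-th byte of code, or 0 (STOP) if pc >= len(code).
func (s *State) fetch() byte {
	if s.PC >= len(s.Code) {
		return OpStop
	}
	return s.Code[s.PC]
}

// Run repeats fetch and execute until STOP, an error, or gas exhaustion.
func (s *State) Run() error {
	for {
		op := s.fetch()
		if op == OpStop {
			return nil
		}
		if s.Gas == 0 {
			return ErrOutOfGas
		}
		s.Gas-- // assumption: every instruction costs one unit of gas
		switch op {
		case OpPush:
			if s.PC+1 >= len(s.Code) {
				return errors.New("truncated PUSH")
			}
			s.Stack = append(s.Stack, uint64(s.Code[s.PC+1]))
			s.PC += 2
		case OpAdd:
			n := len(s.Stack)
			if n < 2 {
				return ErrStack
			}
			s.Stack = append(s.Stack[:n-2], s.Stack[n-2]+s.Stack[n-1])
			s.PC++
		case OpJump:
			if s.PC+1 >= len(s.Code) {
				return errors.New("truncated JUMP")
			}
			s.PC = int(s.Code[s.PC+1])
		default:
			return fmt.Errorf("invalid opcode 0x%02x at pc=%d", op, s.PC)
		}
	}
}

func main() {
	add := &State{Code: []byte{OpPush, 2, OpPush, 3, OpAdd}, Gas: 10}
	fmt.Println(add.Run(), add.Stack) // prints: <nil> [5]

	loop := &State{Code: []byte{OpJump, 0}, Gas: 1000}
	fmt.Println(loop.Run()) // prints: out of gas
}
```

The second program jumps to itself forever, yet `Run` still returns, because the gas counter bounds the number of executed instructions.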
To deploy a smart contract to the Ethereum network, one needs to write the code in one of several languages. The most popular one is Solidity.
In terms of abstraction, a contract can be thought of as a C++ class, so it can:

inherit from another contract
communicate with other contracts via messages
contain public (visible to contract users and other contracts) member variables
contain private (visible only inside the contract, not in derived contracts) variables
contain public and private interface methods
contain external and internal interface methods.

Data stored in a contract is queried through the web3.js library. Basically, the contract user supplies the ABI of the particular contract and the contract's address.
Gas is the mechanism that prevents infinite loops in smart contracts. Every transaction is supplied with a certain amount of gas, which limits the amount of computation the interpreter can perform.
Hyperledger Fabric

Blockchain logic, often referred to as "smart contracts," consists of self-executing agreements between parties that have all relevant covenants spelled out in code, are settled automatically, and can be dependent upon future signatures or trigger events. In the Hyperledger project, they call this "chaincode" to help establish clarity between blockchain logic and the human-written contracts that it can sometimes represent.
The chaincode concept is more general than the smart contract concept. Chaincode can be written in any mainstream programming language, and executed in containers inside the Hyperledger context layer. Chaincode provides the capability to define smart contract templating language (similar to Velocity or Jade), and to restrict the functionality of the execution environment and the degree of computing flexibility to satisfy the legal contractual requirements.


A chaincode is a decentralized transactional program, running on the validating nodes. Chaincode Service uses Docker to host the chaincode without relying on any particular virtual machine or computer language. Docker provides a secured, lightweight method to sandbox chaincode execution. The environment is a "locked down" and secured container, along with a set of signed base images containing secure OS and chaincode language, runtime and SDK images for Golang.


Chaincode in Fabric consists of several important functions, provided by the SDK:

init() is executed when a chaincode is deployed to the blockchain for the first time.
invoke() is executed whenever the chaincode is called.
Dispatching to the different interface functions inside it is somewhat ugly:
```go
if function == "init" {
    // Init function is defined somewhere by chaincode programmer
    return t.Init(stub, "init", args)
} else if function == "write" {
    // as well as write function
    return t.write(stub, args)
}
```

query() is executed whenever the current state of the ledger is queried.
main() is executed when each peer deploys its instance of the chaincode.
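
One way to avoid the if / else-if chain is a dispatch table keyed by function name. The sketch below is self-contained and does not use the Fabric SDK: the `Stub` interface, the in-memory `memStub`, the `handler` type and the `read`/`write` handlers are hypothetical stand-ins chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Stub is a hypothetical, minimal stand-in for the ledger handle that an SDK
// passes to chaincode. It is an assumption of this sketch, not Fabric's type.
type Stub interface {
	PutState(key string, value []byte) error
	GetState(key string) ([]byte, error)
}

// memStub is an in-memory Stub so that the example runs without a ledger.
type memStub map[string][]byte

func (m memStub) PutState(key string, value []byte) error {
	m[key] = value
	return nil
}

func (m memStub) GetState(key string) ([]byte, error) {
	return m[key], nil
}

// handler is the common signature of every interface function.
type handler func(stub Stub, args []string) ([]byte, error)

// handlers maps a function name to its implementation, so that dispatch
// becomes a single lookup instead of a chain of string comparisons.
var handlers = map[string]handler{
	"write": write,
	"read":  read,
}

func write(stub Stub, args []string) ([]byte, error) {
	if len(args) != 2 {
		return nil, errors.New("write expects a key and a value")
	}
	return nil, stub.PutState(args[0], []byte(args[1]))
}

func read(stub Stub, args []string) ([]byte, error) {
	if len(args) != 1 {
		return nil, errors.New("read expects a key")
	}
	return stub.GetState(args[0])
}

// Invoke dispatches to the handler registered under the given name.
func Invoke(stub Stub, function string, args []string) ([]byte, error) {
	h, ok := handlers[function]
	if !ok {
		return nil, fmt.Errorf("unknown function %q", function)
	}
	return h(stub, args)
}

func main() {
	stub := memStub{}
	if _, err := Invoke(stub, "write", []string{"greeting", "hello"}); err != nil {
		panic(err)
	}
	out, err := Invoke(stub, "read", []string{"greeting"})
	fmt.Println(string(out), err) // prints: hello <nil>
}
```

Adding a new interface function then means adding one entry to `handlers`, and unknown names are rejected in one place.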

Ways to implement smart contracts in iroha

After analyzing existing smart contract implementations, I can identify only two models:
1. Develop a custom "script"/"bytecode" language which will be executed (interpreted) by our own VM/interpreter.

The intent is to implement a fetch-decode-execute cycle, as in any CPU.

A transaction/query in this model becomes just a container for a program: a set of instructions that is executed on the iroha VM.
Key tasks:

1. Specify a list of opcodes (commands) that the VM can execute. This list will contain opcodes similar to Ethereum's opcodes.
2. Develop a strategy to avoid infinite loops:
   2.1 use an instruction limit per transaction (e.g. "gas" in Ethereum)
   2.2 remove loops and jump instructions from the opcode list, as in Bitcoin (loops may be inlined into instructions)
3. Implement the opcodes in the interpreter.
4. Develop an easy way for users to create bytecode:
   4.1 develop our own language
   4.2 implement a compiler for an existing language or a subset of one.
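
The whitelist idea and strategy 2.2 can be sketched as a static check that runs before any instruction is executed. Everything named here is hypothetical: the `Opcode` values, the `MaxInstructions` limit, and the `Validate` and `Unroll` helpers are illustrations, not a proposed iroha instruction set.

```go
package main

import "fmt"

// Opcode identifies one VM instruction. The values are made up for the sketch.
type Opcode byte

const (
	OpPush Opcode = iota + 1
	OpAdd
	OpTransfer
	OpJump // defined, but deliberately absent from the whitelist below
)

// whitelist lists the only opcodes the VM agrees to execute.
// Leaving out OpJump means a program cannot contain loops.
var whitelist = map[Opcode]bool{
	OpPush:     true,
	OpAdd:      true,
	OpTransfer: true,
}

// MaxInstructions is an assumed per-transaction size limit.
const MaxInstructions = 1024

// Validate rejects programs that are too long or use non-whitelisted opcodes.
func Validate(program []Opcode) error {
	if len(program) > MaxInstructions {
		return fmt.Errorf("program has %d instructions, limit is %d",
			len(program), MaxInstructions)
	}
	for i, op := range program {
		if !whitelist[op] {
			return fmt.Errorf("instruction %d: opcode %d is not whitelisted", i, op)
		}
	}
	return nil
}

// Unroll inlines a loop of n iterations as n copies of its body,
// which is how a loop can be expressed without a jump instruction.
func Unroll(body []Opcode, n int) []Opcode {
	if n <= 0 {
		return nil
	}
	out := make([]Opcode, 0, len(body)*n)
	for i := 0; i < n; i++ {
		out = append(out, body...)
	}
	return out
}

func main() {
	prog := Unroll([]Opcode{OpPush, OpAdd}, 3)
	fmt.Println(len(prog), Validate(prog)) // prints: 6 <nil>

	fmt.Println(Validate([]Opcode{OpPush, OpJump}))
}
```

With jumps removed, the length check alone bounds execution time, so no run-time gas counter is needed. The cost is that loops must have a fixed iteration count known when the bytecode is produced.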

This approach opens the door to many execution optimizations that are applicable to CPUs, such as out-of-order instruction execution, out-of-order transaction execution, and others.
From a security standpoint, developing a VM is the most secure way (for peers) to execute smart contracts, because we whitelist the instructions that the VM can execute, thus reducing the attack surface.
From a programming standpoint, this approach allows for a good project structure and makes the system easy to extend.
Note: it would be good to have a hardcore CPU geek on the team who can optimize the interpreter.

2. Develop an SDK that can be used to create, invoke and query smart contracts, and use sandboxing to execute them.

We may follow Fabric's way. The intent here is to allow users to write contracts in an arbitrary programming language and then execute the code in a sandbox.
Sandboxing is required for peer security.
We reviewed a few ways to do it, and they are listed here:

1. Run every transaction (a set of instructions) in its own Docker container. This approach is very slow, since to run a single transaction iroha has to create a new container (takes 2 s), execute the chaincode (? s), and halt the container (less than 1 s).
2. Run every transaction in the same Docker container. This is faster than creating a new container every time, but it reduces overall security, since an attacker is able to modify the container's environment, which may influence the execution of other smart contracts.
3. Use LXC in the following way:
   3.1 create and run a "donor" or "base" container with an overlayfs filesystem (a fast copy-on-write fs).
   3.2 whenever we want to execute a transaction, clone the base container (takes around 50 ms), start it (takes from 10 ms to 1 s, depending on the container environment) and execute the contract inside.
4. Use other sandboxing solutions such as VMs, which are more secure but much slower (count how much time you need to execute one contract!).

Among all these approaches, LXC with a copy-on-write filesystem is the one that is justified.
However, there are still unsolved problems:

How to access the ledger from a chaincode (from inside the container)?
How to get smart contract execution artifacts (if any) out of the container?
How to manage the container's resources (CPU cycles, disk access, network access, etc.)?
How to restrict users to only a certain subset of CPU instructions (whitelist)?
How to achieve good security? Since we allow users to execute arbitrary code on peers, nothing prevents them from writing a botnet or something similar.
If the user's platform and the peer's platform are different, how do we execute machine code from smart contracts?