Skip to content

Instantly share code, notes, and snippets.

@anvacaru
Last active March 8, 2022 12:56
Show Gist options
  • Save anvacaru/47c89962579a78de2a9aa119ddbcae4e to your computer and use it in GitHub Desktop.
Save anvacaru/47c89962579a78de2a9aa119ddbcae4e to your computer and use it in GitHub Desktop.
INTRO
------
Hi, everyone, I am Andrei Vacaru, I am a K developer at Runtime Verification Inc, and today I will talk about how to do Smart Contract Verification. I will be using KEVM which is a K implementation of the EVM, fully faithfull with other implementations and it is also a K formal semantics, which means that we can use it to do formal verification of ethereum programs.
Real quick, We are RV, we are a formal methods company, we have serviced automotive and aerospace companies in the past with our services, now we are focused mostly on blockchain clients, our general mission is to improve basically the state of security and correctes n code across the board.
K - is a mathematical modelling language.
you build mathematical models in K, and one of the key features of K is that those models are executable. So, for example you can ake your test suite and execute it against KEVM which is the mathematical model of the ethereum virtual machine in K, or you can make proofs
So, if you were to say, wanting to make a new programming language, you could write it in C or in K. If you implement it in C, then what you get is faster execution. You get a very fast interpreter that executes your programs. But if you write it in K, you still get execution, just not as fast as if it was written in C, but you also get symbolic execution, which gives you the ability to write proofs. So that's the reason why you would choose to implement your language in K. And there's other frameworks like K out there like Otter, Verifast and so much more.There are a bunch of other frameworks which achieve the same goals as K, which is like give you a mathematical modelling framework, and from that framework you do proofs about programs.
What is formal verification?
------------------------------
Formal verification is the process of checking if a design satisfies its system requirements (properties). In other words, to check if the code behaves as expected, using some formalism.
Testing is good to show the presence of bugs, but you can't show the absence of bugs.
scenario: you write some tests, you find some bugs, you fix the bugs, but then you don't have any confidence that there are no other unknown bugs
To prove that there are no bugs, you have to do this mathematically, like a mathematical proof which you do with a pen and paper, but you have a computer to do it for us, because the proofs end up being pretty large.
First. You need that mathematical model of the programming language.
We'll be using KEVM, which is the mathematical model of the EVM written in K.
Then you need to build a specification of the program in the same language.
specification - manual process
- an auditor agrees with a code developer how about the behaviour of the code
So previously, we build these kinds of informal specifications which we translated into tests. Just kind of a blind translation process. And since I am a K expert I'm going to do the same thing, take the informal specification and translate it into a proof. So basically I'm gonna take the informal specification, or a semi-formal one that includes some mathematics and manually translate it into a K specification. And that proof language we're gonna use is the same language in which the math modelling language that we built the KEVM in. That's key, because now the specification is in the Same language as the mathematical model so they can work togheter. And then we press the go button on the proover, he grinds away on it, and after some time it shows us if the program implements the specification that we wrote.
So it's key to know what this gives you and what it doesn't give you. So people think like if the program is formally verified then it is 100% safe and this is not the case. Right? When you're doing formal verification, you verify that That specification of your contract is true. The same way like you do in testing like "That scenario works" for your contract, but you can't say anything else about other scenarios. If you don't write enough specifications, then your contract can have other behaviours that you didn't specify that could potentially cause problems. So you have to convince yourself that you wrote enough specifications to have a high confidence in the system.
So what you get when you have formal verification?
You have a proof that only states that your program has exactly that behaviour on exatcly that specific conditions. Not saying anything else about other conditions.
example: proof on transferFrom function it's not saying anything about the approve function.
So that's the thing with formal verification, it's the same thing as testing, were if you don't write a test for a certain case you won't catch the bug and if you don't write a proof for a certain case you're not gonna catch the bug.
Symbolic execution
------------------
With testing, you're picking specific scenarios and you're just doing sanity checks on those scenarios.
With FV you're symbolically checking all scenarios so you don't have to iterate through all the values. I just hold symbolic variable X(for example) and I do reasoning over that symbolic variable using an analitycal reasoning engine basically. So what you get is this entire space of execution paths that encompass all the cases that X could take. And instead of looking at each execution path here, we do it in a close form and analitical way.
At the end we're getting a single state description, but that state contains the symbolic value X and represents a bunch of possible states.
Also, Holding a symbolic value is a lot less resource intensive than running each execution path.
Again, testing can prove the presence of a bug, because you have the specific execution trace that shows the presence of the bug; but itr can't prove the absence of the bugs, because you're not checking any values there.
So we're doing formal v to get the entire diagram there. FV is not a magic bullet. It's not like you're doing it once and your contract secure. It will give you a high confidence only for the properties and symbolic tests which you covered.
What is KEVM?
-------------
KEVM is the K implementation of the Ethereum Virtual Machine. Repository. It's big, I won't go through it all. Full implementation of EVM. We tested it against the same test suite that geth tests it against, that ethereum-cpp, and ethereum-js tests it against. So we know that we have a faithful implementation of the EVM in our mathematical modelling language.
So we have to write down a specification in K about ethereum programs.
1. we're calling the transfer function on a symbolic "TO_ACCT" (some other account, but were not specifying which one) and a symbolic value.
2. start at PC 0, it wouldn't make sense to start at any other PC like 7
3. we're making assertions about the post state
specifically. it ends in evmc_success
we're also saying that the output returned by the function is one padded to 32 bytes meaning that the function was called and it returned True.
4. in the program cell we load the program bytecode that we care about. So this contract is some ERC20 , which we took from the dappsys library
There's a bunch of boiler plate code, which is not interesting and doesn't matter for the proof, but you have to have it because the proover cares for all these details.
5.
In the account storage, in the account which is executing, and in its storage, the key thing is that the balances are updated correctly.
6. We're going to look at the preconditions. So the code says that the transfer shouldn't happen if there is an overflow.
1. Setup KEVM
Install all the dependencies in the README and follow the instructions.
K-Teal for algorand K-Michelson for Tezos, EVM Semantics, WASM semantics, I think we are working on a PLUTUS semantics for CARDANO projects
DEMO:
compile smart contract
paste bin runtime in the boilerplate template
So, the erc20- we're workin with is not a correct implementation. It's a simplified version.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment