Skip to content

Instantly share code, notes, and snippets.

@aterga
Last active January 2, 2023 22:10
Show Gist options
  • Save aterga/fece1a685baa73abb0684ea69401ddac to your computer and use it in GitHub Desktop.
Save aterga/fece1a685baa73abb0684ea69401ddac to your computer and use it in GitHub Desktop.

A Taste of Verification-Assisted Development in Motoko

In this post, we demonstrate how Internet Computer developers can leverage automatic verification to build bug-free, Web3 applications in Motoko. We start by explaining why this is tricky. We then demonstrate an example smart contract (a.k.a. canister in Motoko), showing how to formally specify and (automatically) verify it. We conclude with a discussion of our experience and observations gathered while working on this (very experimental) verification technique.

We wrote this post because we wanted to make more of DFINITY's formal verification know-how available to developers outside the Foundation. While DFINITY employs verification experts to formally specify and verify some of our smart contracts, we believe that formal verification is beneficial to all Web3 projects, especially in high-stake applications, like DeFi. We thus hope that this post inspires the community to look more closely into code-level verification that should ultimately become accessible to all Web3 developers.

Is there really a problem?

Data from rekt.news shows that blockchain vulnerabilities are often exploited via smart contracts. The most widely-used approach to quality assurance in high-stake Web3 projects — code audits — is insufficient. Since September 2020, the industry has acknowledged over $1.15 billion in asset losses among users of audited blockchain projects. Most blockchain audits are limited to basic practices, such as manual code review and automatic code analyzers, which cannot ensure absence of possible exploits.

Actors can hide tricky bugs

Smart contracts are like games with an unknown number of players. To make sure that the game is sensible, we need to specify its rules in a way that is both formal (e.g., to avoid ambiguity) and concise (so that players can wrap their heads around the rules and play strategically). The rules also need to be verified, proving that the implementation (i.e., executable code) actually respects the game rules.

But formally verifying a system of rules is tricky. It constitutes mathematically proving that the rules of the game cannot be bypassed even under the assumption that the players are coordinating adversaries, trying to exploit any vulnerabilities. To make matters worse, since the number of players is unknown upfront (serving an arbitrary number of users is typically the point of a Web3 application), we must resort to protecting against infinitely-many adversaries.

Why is testing insufficient?

Coming back from the game analogy, we note that runtime validation techniques like testing cannot provide sufficient security guarantees for smart contracts. The reasons are well known: Testing validates only some execution scenarios, typically covering a tiny subset of all possible executions. Therefore, it is practically not possible to validate all execution scenarios of a smart contract simply by testing it.

Towards formal verification

In this post, we focus on formal verification, a compile-time technique that validates all possible execution scenarios.

The expressivity of Motoko's asynchronous, actor-based paradigm (async) complicates verification. Other common smart contract languages (e.g., Solidity and Vyper) do not support async, making them easier to verify — but this sync-only paradigm severely restricts their domain of use. The reason async complicates verification is because it involves dynamic interleaving of code execution that cannot be precisely predicted at compile time; this can make the behavior of async code counterintuitive even to the programmer who wrote it. The implicit assumptions made by programmers while writing async code are especially hard to communicate when multiple people are working on the same async smart contract. This often results in notorious bugs that can be exploited, such as in reentrancy attacks.

Despite all these challenges, recent progress in formal verification demonstrates that the right kind of tools can dramatically simplify the development of formally secure async code. We will demonstrate this point for the Motoko language, but the core principles are language-independent. To this end, let us port a toy smart contract, called ClaimReward, to Motoko, formalize the intension behind the code, and then verify that the code does what is intended.

Example smart contract: ClaimReward

The purpose of this canister is to reward the very first claim (e.g., by sending some crypto tokens). All subsequent claims shall not be rewarded. Similar logic is often used in practice in lottery and auction applications.

Consider the following Motoko implementation of ClaimReward:

actor ClaimReward {
  var claimed = false;
  var count = 0 : Int;
  public shared func claim() : async () {
    if (not claimed) {   // step 0
      count += 1;        // step 1
      // (send reward)      step 2
      claimed := true;   // step 3
    };
  };
}

The canister consists of a single actor, whose two fields are claimed (a flag indicating if the reward has already been claimed) and count (the number of times that the reward has been sent out). In Motoko, fields of an actor can be directly modified only by the functions of the actor; external callers (be those human users or other smart contracts) may only invoke the public functions, potentially resulting in certain field modifications.

The only public function that ClaimReward has is claim. This function does not take any arguments, nor does it return anything (note that all public functions in Motoko must have an async return type). Naturally, an invocation of claim may cause some field modifications. Let us see what this function is doing:

  • step 0: The claim function first checks if the claimed flag is not yet set (otherwise returning without any effect).
  • If the condition not claimed is met, the following happens:
    • step 1: The count is incremented by one.
    • step 2: The reward tokens are sent back to the caller. For now, we abbreviate this with the send reward comment.
    • step 3: The claimed flag is set.

Specifying a canister invariant

Recall that the purpose of the ClaimReward canister is to reward the very first claim, and no other claim. Clearly, the total number of rewards sent to users over the lifetime of this canister should not exceed one. This condition can be formalized as follows:

(not claimed implies count == 0)  // initially synchronized
and (count == 0 or count == 1)    // reward is unique

We call this condition the canister invariant of ClaimReward. Despite it's simplicity, this canister invariant is a useful specification for our code. From a user's perspective, this condition expresses the rules of the ClaimReward canister. Remember that the user may not be a programmer and might not be able to fully understanding the source code! Conversely, we can easily communicate the above canister invariant in plain English.

Canister invariants are also essential for defining what it means for an implementation of ClaimReward to be correct. In particular, it must be impossible for an adversary to use a canister in such a way that would violate the canister invariant. Conversely, if there does exist a scenario in which, by invoking some public function(s), an adversary could break the invariant, that would imply that the canister is not implemented in a correct manner.

Code-level specifications

Above, we pointed out two key advantages of our code-level specification approach, i.e., the practice of specifying a canister's intention (the rules of the game) through the same language that is used to implement this canister:

  1. Code-level specs are formal (the notion of correctness has meaning only if the code is formally specified)
  2. Code-level specs can be easily communicated to the end users.

Aren't specifications code, too?

Another practical advantage of code-level specifications is that the specifications can be written and version controlled alongside the executable code. This is not the case for back-of-an-envelope specification approaches, e.g., plain text (think, Javadoc) or dedicated software modeling languages (e.g., TLA, Coq, Isabelle). While the latter also formalize the specifications, it is hard for a smart contract developer to detect possible source code modifications that require manual model adjustments.

Below, we demonstrate how to write code-level specifications for the ClaimReward canister. The assert:invariant <exp>; syntax allows specifying a canister invariant <exp> that is type checked just like a normal Motoko expression — while not affecting the ClaimReward's runtime behavior in any way.

// @verify
actor ClaimReward {
  var claimed = false;
  var count = 0 : Int;
  
  assert:invariant 
   (not claimed implies count == 0) // initially synchronized
   and (count == 0 or count == 1);  // reward is unique

  public shared func claim() : async () {
    if (not claimed) {
      count += 1;
      // (send reward)
      claimed := true;
    };
  };
}

Verify your canister invariants automatically

We are now ready to verify the canister code against its invariant. Motoko-san is a tool that makes this an easy process (a very early prototype is available on the VS Code extension marketplace). The tool provides a familiar user experience similar to that of the Motoko type checker.

Disclaimer: DFINITY released Motoko-san as a community preview; no further updates are currently planned. Please use this tool at your own risk.

To ask Motoko-san to verify a file, we add // @verify as the first line. The tool will either show a green checkmark ✅ (indicating that the code is proven to comply with the formal specifications) or report an error (underlining the offending code with a red wave).

Pro tip: Motoko-san works well together with Error Lens, a VS Code extension that shows error messages right next to the line that caused it.

Opening our formally specified ClaimReward canister in VS Code with Motoko-san, we observe that it verifies. Great!

This is a good time to play a bit with the tool: For example, try changing the second initializer to var count = 1 : Int;, in which case the tool will report a verification error:

Canister invariant cannot be established after initialization

This error is of course expected, as a canister invariant must be established right after initialization.

Handling async code

Our current ClaimReward implementation does not show how the reward is actually sent. In practice, the code abbreviated (send reward) could, e.g., transfer some tokens from the ClaimReward canister's account back to the user's account. Transferring tokens would practically involve public functions of other canisters (e.g., Ledger.transfer). Since the result of such a call is an async object, it needs to be awaited for. We represent this logic as follows:

// @verify
actor ClaimReward {
  var claimed = false;
  var count = 0 : Int;
  
  assert:invariant 
   (not claimed implies count == 0) // initially synchronized
   and (count == 0 or count == 1);  // reward is unique

  public shared func claim() : async () {
    if (not claimed) {
      // yield control ...
      await async {
        // ... control gain

        count += 1;
        // (send reward)

        // yield control ...
      };
      // ... control gain
      claimed := true;
    };
  };
}

Note that we also put count += 1; into this new async block, ensuring that this counter represents the actual number of times that a reward has been sent. In practice, this can be implemented by mapping the async value returned by the transfer function such as Ledger.transfer, e.g.:

await Ledger.transfer(...).map(func (transfer_result) {
  count += 1;
  transfer_result
});

To simplify the discussion, we inline the transfer function and omit dealing with transfer_result:

await async {
  count += 1;
  // (send reward)
};

To explain how async code works in Motoko, we first need to define which code fragments are guaranteed to be executed immediately, i.e., without being interleaved with other code fragments.

What is an atomic code block?

An atomic block of code is the smallest portion of code that is guaranteed to be executed immediately.

Control-yielding semantics of await

When our new variant of ClaimReward.claim is executed, the await statement splits the function into several atomic blocks. In particular, the atomic block under await async is not necessarily executed immediately after the branch condition has been checked, but possibly with some other atomic blocks being executed in between. Similarly, the execution of claimed := true will not necessarily happen right after the async code execution finishes. Again, other atomic blocks may be executed in between.

Consider the following question. Let's abbreviate the branch statement as b, the code block under await async as a, and claimed := true as c. Which of the following executions are possible?

  • b, a, c
  • b, a, c, b
  • b, a, c, b, a, c
  • b, a, b, a, c, c
  • b, a, b, c, a, c

The first case in this list represents a single invocation of ClaimReward.claim. The second case is similar, but after the first invocation returns, another one occurs; however, the condition of the branch b is not satisfied, so the second invocation returns without a and c being executed again. The third case is identical to the first one, with one repetition.

What scenarios do the last two cases represent?

Reentrancy

Reentrancy is an interleaved execution of atomic code blocks. Since there is no other function in the ClaimReward canister except for claim, it might be unclear what atomic blocks can realistically be involved in reentrancy in this case. We will now show that even a canister with just one function may end up handling multiple concurrent invocations, resulting in reentrancy. We will then explain why reentrancy can be dangerous.

Consider a scenario in which the execution of one invocation of claim enters the branch. Eventually, the async block will be executed, increasing the value of count (now count equals 1). If the very next executed atomic block happens to be claimed := true, then all is fine since:

  1. The reached state complies with our canister invariant.
  2. The next invocation of claim will not cause any changes to the fields as the branch condition not claimed will no longer be satisfiable.

However, if another invocation of claim starts executing before the claimed := true atomic block, then we might observe that count equals 2! This violates our canister invariant that specifies only 0 and 1 as acceptable values of count. Practically, this means that the reward can be claimed more than once, and a simple reentrancy attack could drain the canister owned's account balance.

What are Reentrancy attacks?

Reentrancy attacks are attacks on smart contracts in which the canister's invariant is violated by an interleaved execution of async code. A canister is reentrancy-safe if there is no possible reentrancy attack. We want to verify that our canisters are reentrancy-safe.

A simple fix for our ClaimReward canister involves swapping the two statements under the branch. Concretely, a reentrancy-safe implementation would first set the claimed flag (before yielding the control for the first time), and only then await the execution of the async block. Indeed, this would prevent an attacker from gaining rewards multiple times as the very first reward will be sent only after the claimed flag, which guards the reward, is already set.

Automatic compile-time detection of reentrancy attacks

Wouldn't it be nice to have a tool that automatically detects and reports reetrancy attacks — while the code is being compiled? We demonstrate how this is done with Motoko-san. To this end, let us try to verify the (buggy) code from the previous listing.

Motoko-san reports the following verification error:

Canister invariant violated by async block

Naturally, this is expected, as we already know that there exists a reentrancy attack in our code.

Here is how we can fix the code:

// @verify
actor ClaimReward {
  var claimed = false;
  var count = 0 : Int;
  
  assert:invariant 
   (not claimed implies count == 0) // initially synchronized
   and (count == 0 or count == 1);  // reward is unique

  public shared func claim() : async () {
    if (not claimed) {
      claimed := true;

      // yield control ...
      await async {
        // ... control gain

        count += 1;
        // (send reward)

        // yield control ...
      };
      // ... control gain
    };
  };
}

Perhaps surprisingly, Motoko-san reports the same error again:

Canister invariant violated by async block

The reason Motoko-san reports an error in this case is that, unlike in our ad-hoc reentrancy argument, Motoko-san verifies canisters in a function-modular manner. In other words, each canister function needs to be verified against the canister invariant independently of the knowledge about all other functions. Even the fact that there are no other functions (as is the case in our latest variant of ClaimReward) is (purposefully) not taken into account by Motoko-san.

Function-modular verification

Yes, function-modular verification is potentially incomplete — i.e., there can be false-positives. However, this approach enables scaling verification efforts to applications with arbitrary-many functions, as we explain next.

While correct, the canister implementation that we have converged to so far is unsafe under code evolution. To illustrate this point, assume that the code so far has been written entirely by one person, Alice, who is unfortunately unaware of the benefits that Motoko-san has to offer. After publishing the code as it is, Alice's work is continued by Bob. Let's assume that the only knowledge that Bob has from Alice about this canister's original intention comes from two sources:

  1. The ClaimReward canister invariant.
  2. ClaimReward's public API, consisting of the function claim.

Now, Bob wants to extend the canister with another function, reset(), that will return the canister to its original state. Bob thinks that this extension will preserve backward compatibility, as he concluded that the reset function does not contradict the intention of the original canister. Indeed, the initial state of the canister must satisfy the canister invariant — no issue with that — and all pre-existing functions remain unchanged. Therefore, Bob commits the following code:

// @verify
actor ClaimReward {
  var claimed = false;
  var count = 0 : Int;
  
  assert:invariant 
   (not claimed implies count == 0) // initially synchronized
   and (count == 0 or count == 1);  // reward is unique

  public shared func claim() : async () {
    if (not claimed) {
      claimed := true;

      // yield control ...
      await async {
        // ... control gain

        count += 1;
        // (send reward)

        // yield control ...
      };
      // ... control gain
    };
  };

  public shared func reset() : async () {
    claimed := false;
    count := 0;
  };
}

It is now possible to violate ClaimReward's canister invariant using the following sequence of invocations: claim(), reset(), claim(). Indeed, this sequence can result in an effective reentrancy attack, since the execution of reset() may occur during the execution of claim() (between the first yield control / control gain pair). In this attack, both invocations of claim will increase count, so we will eventually observe count == 2.

Luckily, Bob is a Motoko-san user. He immediately gets a verification error, indicating that the async block in ClaimReward.claim is problematic. Although a reentrancy attack on ClaimReward exploits Bob's reset function, it is actually Alice's claim function that is vulnerable. Indeed, removing the claim function whatsoever would result in a reentrancy-safe (yet, practically useless) canister.

Explicitly specifying assumptions in async code

Arguably, we should blame neither Alice nor Bob for the vulnerability in the listing above. The problem is due to Alice's implicit assumption (that Bob didn't know about) of the fact that no code can mess up the intermediate state of claim's execution, namely, between the first yield control / control gain pair.

To avoid confusion, Alice decides to encode this assumptions explicitly, following the code-level specification approach. Concretely, she wants to express the following async constraints:

  1. The claimed flag must still be set when the first control gain is reached.
  2. The value of count must still be 0 when the first control gain is reached.
  3. At most one instance of this async block may be executed per invocation of claim.

Async constraints serve a dual purpose. On the one hand, they are assumptions while reasoning about the atomic block inside await async. Concluding that this block actually preserves the canister invariant requires that these assumptions are met. In particular, if assumption 2 were violated, then the value of count after the async block could be less than 0 or greater than 1, violating the canister invariant.

On the other hand, async constraints are also formal proof obligations that must be respected by all functions of the ClaimReward canister. For example, the reset in the listing above violates the proof obligation 1, as it sets claimed back to false (which can be exploited in a reentrancy attack).

The third async constraint above is a bit more subtle. Generally, atomic blocks to be executed are picked up from a runtime message queue, allowing for the same code block to be scheduled multiple times. For example, this is useful to process async calls of public function efficiently. However, the async block in our ClaimRewards canister is of course not supposed to be scheduled more than once per message queue (otherwise, the reward could be sent more than once).

Canister specifications = Canister invariant + Async constraints

Since async constraints restrict all functions of a canister, they are similar to the canister invariant. However, while the canister invariant typically contains information that is useful to the canister's users, async constraints are mostly useful for communicating design decisions (involving async code) among canister developers. In particular, async constraints typically describe intermediate states of execution of public functions.

Syntactically, it seems the easiest to write async constraints next to the atomic blocks that they refer to (note that canisters often have multiple async blocks that would potentially need different sets of async constraints).

Motoko-san provides the following experimental syntax for writing async constraints:

// @verify
actor ClaimReward {
  var claimed = false;
  var count = 0 : Int;
  
  assert:invariant 
   (not claimed implies count == 0) // initially synchronized
   and (count == 0 or count == 1);  // reward is unique

  public shared func claim() : async () {
    if (not claimed) {
      claimed := true;

      await async {
        assert:1:async claimed and count == 0;

        count += 1;
        // (send reward)
      };
    };
  };
}

Here, the keyword assert:1:async <exp>; allows specifying async constraints as a regular Motoko expression (e.g., it is type checked). The conjunction of claimed and count == 0 specify, correspondingly, the async constraints 1 and 2 of Alice's informal async specifications.

  1. The claimed flag must still be set when the first control gain is reached.

  2. The value of count must still be 0 when the first control gain is reached.

Finally, the digit 1 in assert:1:async specifies async constraint 3, namely:

  1. At most one instance of this async block may be executed per claim invocation.

Motoko-san is now able to automatically verify this code against its formal specifications. ✅

A safe and secure ClaimReward canister implementation

As a happy new Motoko-san user, Alice has now fully specified and (automatically) formally verified the ClaimReward canister against its invariant and the async constraints. Assuming that all future collaborators, like Bob, continue using Motoko-san to verify their code, no reentrancy-related vulnerabilities can arise, as the tool will be able to check that all the assumptions (which are now explicit) actually hold.

A natural next step is to integrate the code-level verifier into a continuous integration and deployment (CI/CD) pipeline. Such integration is feasible, as our canister's formal specification is now part of the codebase. But this is a story for another time.

Discussion

In this post, we demonstrated how Internet Computer developers can leverage automatic verification to build Web3 applications that are bug-free. In our technique, the programmer implements smart contracts and specifies their essential properties (i.e., what conditions would imply that there's a bug?) using that same language — Motoko — while our tool, Motoko-san, automatically checks whether the implementation always respects the specification.

It is essential to formally verify Web3 applications, which tend to promise a lot to their users, for example, in terms of security, privacy, and reliability. Unlike runtime techniques like testing, which validate only a subset of smart contract executions, formal verification validates all possible executions, ensuring that these promises are always met.

Recent data shows that smart contracts passing human audits can still hide bugs, resulting in huge money losses for their users. Formal verification is needed to rule out such bugs.

While ClaimReward, our example smart contract, is merely a toy, it demonstrates a hard problem of reentrancy safety that occurs in presence of async calls. Many smart contracts could practically avoid reentrancy problems by using append-only structures and monotonic functions, but some do rely on mutable structures or non-monotonic functions. Our technique can handle both of these cases in a modular way, that is, by specifying and verifying one function at a time.

We argued that both developers and users should be able to understand the formal specification of a smart contract. Our code-level specification approach meets this requirement.

A practically limiting factor of any verification technique is its usability. Our technique does not require the developer to learn how to use a new language or tool; reasoning is done using only the concepts that Motoko already has. Motoko-san demonstrates compile-time automation and IDE integration, making the overall technique comparable, in terms of the learning curve, to using a type checker.

Our experience shows that the Internet Computer — and, in particular, the Motoko programming language — are well-suited for building next-generation developer tooling, such as automatic code-level verifiers.

Further reading

Acknowledgements

We cordially thank Marco Eilers, Gabor Greif, Martin Raszyk, Claudio Russo, and Ryan Vandersmith for contributing to the design and implementation of the Motoko-san prototype tool. Thanks, Marco and Claudio, also for proofreading this text.

@crusso
Copy link

crusso commented Dec 30, 2022

Maybe one way to motivate this to start with (intentionally broken code, setting claimed := true too late):

actor ClaimReward {
  var claimed = false;
  var count = 0 : Int;

  private func reward() : async () {
       // (send reward)
      count := 1
  };

  public shared func claim() : async () {
    if (not claimed) {
      await reward();
      claimed := true;     // buggy!
    };
  };
}

And just say we simplify this, by inlining function reward() to:

actor ClaimReward {
  var claimed = false;
  var count = 0 : Int;

  public shared func claim() : async () {
    if (not claimed) {
      await async {
          // (send reward)
         count := 1
      };
      claimed := true;     // buggy!
    };
  };
}

(The other unsatisfactory thing is that you don't know whom to reward with any additional info about who the claimant is, but one could perhaps abuse the caller of claimed for that, though motoko-sans can't handle that yet)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment