WilfredTA/udt_proposal.md

## udt_proposal.md

      
UDT Proposal Draft

Introduction

This document puts forth a proposed standard for user-defined tokens (UDT) on Nervos Network’s Common Knowledge Base (CKB).
Transactions are the interface in CKB by which users perform operations (produce state changes). Transactions are composed of cells and cells can define type scripts that enforce constraints on any transaction in which the cell is included. Therefore descriptions of CKB operations are descriptions of transaction patterns. These patterns in turn describe specific input, output, and dependency cells, as well as the constraints the input & output cells introduce into a transaction (via their type & lock scripts). The operations and their corresponding transaction patterns are the first two aspects of the standard that this document defines. The third aspect that this draft defines is the UDT extension mechanism. The base standard is simple, defining only the bare minimum constraints for a fungible token. This will allow greater flexibility and opportunities to innovate on token models while still reaping the benefits of standard-compliance, namely: interoperability and predictable behavior. Extensions are added and defined in such a way that it is easy to see what additional logic and data exists for any UDT. The base standard defines the transaction structures and cell structures to accommodate the most common and elementary operations on fungible tokens as well as the cell structures to accommodate the most common and elementary queries for token data.
Expected Background & Preliminary Resources

To understand the material in this paper, readers should have an understanding of the programming model - “Cell” model - of Nervos Network’s CKB. More specifically, readers should have an understanding of the following concepts:

Cell model & fundamental CKB data structures
Type ID
Dep types
Hash types on scripts
Transaction verification workflow
Molecule serialization format

Design Concerns

The many design concerns can be classified into three types: usability concerns, resource cost concerns, and security concerns. The standard should be usable insofar as it should be pleasant and easy to develop standard-compliant tokens as well as extensions. It should not be too difficult to identify the right cells to use to generate transactions (i.e., perform an operation), and it should therefore also be easy for off-chain parties such as wallets and exchanges to query, parse, and understand on-chain data. It should be convenient for UDT holders/users to perform the operations supported by the base standard and any custom extensions. Additionally, since holders will probably use a wallet or other tool to interface with the token, it should be easy for interface tools to add support for new UDTs and support for new extensions to an already-supported UDT.
The cost of a UDT comes from the cost of operations (transaction size and the compute cycles consumed by the scripts themselves) and from the storage required for the on-chain metadata, UDT scripts, and UDT instances themselves, as described in the Architecture section below. These costs should be minimized.
Base Standard

Architecture

Components Overview


UDT Instance
UDT Definition Script
UDT Information Cell
UDT Info Type Script

UDT Instance

A UDT Instance is any cell that has a UDT type and contains an amount. So, holders or users of some UDT would be most often operating on UDT instances; they are the deliverable, so to speak, of UDT development.
UDT Definition Script & UDT Info Type Script

The UDT Definition Script (UDT Def) stores the executable that provides the core verification logic for any given UDT. The presence of a UDT Def reference on a cell is what signifies that the cell is a UDT instance as opposed to simply a cell with a number in its data field. A UDT Def can define exactly one UDT Type. Even if two UDT Defs have the exact same verification logic, they are unique and uniquely identifiable, as are the UDT instances that reference each respective UDT Def.
The UDT Info Type Script stores the executable that provides the core verification logic for any UDT Info cell. The presence of a UDT Info Type Script reference on a cell forms the logical connection between all UDT instances of a single type with the single UDT Info cell. This is because the info cell and the UDT instances possess the same UUID (which is the hash of the first input of the transaction in which these cells are created).
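The shared identifier described above can be illustrated with a minimal Python sketch. It assumes the first input is serialized as its previous transaction hash followed by a 4-byte little-endian output index; the proposal does not specify the serialization, so this layout and the helper name `udt_uuid` are hypothetical. The hash is blake2b with a 32-byte digest and CKB's `ckb-default-hash` personalization.

```python
import hashlib


def udt_uuid(prev_tx_hash: bytes, prev_index: int) -> bytes:
    """Derive a 32-byte identifier from the first input of the creating transaction.

    Assumption: the input is serialized as tx_hash || index (u32, little-endian).
    """
    if len(prev_tx_hash) != 32:
        raise ValueError("transaction hash must be 32 bytes")
    serialized = prev_tx_hash + prev_index.to_bytes(4, "little")
    return hashlib.blake2b(
        serialized, digest_size=32, person=b"ckb-default-hash"
    ).digest()


# The info cell and every instance created in the same transaction
# derive their identifier from the same first input, so the values match.
first_input = (bytes.fromhex("11" * 32), 0)
info_cell_id = udt_uuid(*first_input)
instance_id = udt_uuid(*first_input)
```

Because the identifier depends only on the first input, which can be spent once, two UDTs with identical verification logic still end up with different identifiers.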
UDT Information Cell

The UDT information cell is important, as it acts as an integration point by which the other components are tied together, including any additional extensions.
The purpose of this cell is to provide a single logical location where shared data - especially mutable data - between scripts can be accessed and updated. Extensions can also be added in this cell to implement custom validation rules. It can be thought of as a sophisticated reusable arg. See Appendix A for a discussion of the design thinking behind this component.
data: {
  total_supply: uint_x,
  symbol: char[4],
  extensions: FixVec<blake2bDigest>,
  formula: FixVec<numOfExtensions + 1>,
  additional_metadata: DynVec<>
}
lock:<>,
type:UDT Info Type Script
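As a rough illustration of the layout above, the data field can be modeled off-chain as a small Python record. This is a sketch under assumptions, not the on-chain Molecule encoding: the class name `UdtInfoData`, the `problems` helper, and the choice to hold each formula entry as a string of `0`/`1` characters are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UdtInfoData:
    total_supply: int
    symbol: bytes                # exactly 4 bytes, as in char[4]
    extensions: List[bytes]      # 32-byte blake2b type hashes
    formula: List[str]           # bit strings of length len(extensions) + 1
    additional_metadata: List[bytes] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return a list of structural problems; an empty list means well-formed."""
        found = []
        if len(self.symbol) != 4:
            found.append("symbol must be 4 bytes")
        if any(len(h) != 32 for h in self.extensions):
            found.append("every extension hash must be 32 bytes")
        width = len(self.extensions) + 1  # one extra bit for the base logic
        if not self.formula:
            found.append("formula needs at least one sub-formula")
        for sub in self.formula:
            if len(sub) != width or set(sub) - {"0", "1"}:
                found.append(f"sub-formula {sub!r} must be {width} bits")
        return found


base_only = UdtInfoData(1000, b"DEMO", [], ["1"])
malformed = UdtInfoData(1000, b"TOOLONG", [bytes(32)], ["1"])
```

A wallet or indexer could run a check of this kind before trusting the fields of an info cell it has just parsed.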


The entire data field is a dynamic vector.
Extensions is a fixed vector of blake2b type hashes of cells whose data field contains custom executable code. Extensions are expected to return one of two values: success (integer zero) or failure (any non-zero value).
The formula here is inspired by disjunctive normal form. Each element in the formula vector is a sequence of bits. The value and position of each bit within a sequence forms an assertion about the return value of an extension at the same position in the extensions vector.
The additional disjunct indicated by the “+1” in formula is reserved for the base verification logic. Subformula[0] should be reserved for the base verification logic (i.e., the verification logic in the UDT Def). There are cases where a developer may want to override this logic. Only a sub-portion of the verification logic in the UDT Definition is affected by the formula.
So, if I have 3 extensions, then each little-endian sequence of bits within the formula vector (each sequence is a sub-formula) has 4 bits.
The result of verification is the disjunction of every sequence. The value of a sequence itself is the conjunction of the assertions made by its bits. So, a sequence “101” would resolve to true if and only if the default logic returned success, the first extension returned failure, and the second extension returned success.
There should always be at least one bit sequence of at least one bit in the formula; when the default logic is the only logic present in the UDT implementation, that sequence defaults to 1.
Naively, the formula should be a full DNF formula. I.e., if I wanted to represent A || B in full DNF, I’d actually have to write:
(A & B) || (!A & B) || (A & !B)
The reason for this is that every bit sequence will include a negation or assertion for the return of each extension.
So, if I had one extension and I wanted to say that verification is successful when either the default logic is successful or the extension logic is successful, but not both (perhaps the rules are mutually exclusive; they can’t both be true), I would write:
(A & !B) || (!A & B)  where A is the default logic and B is the extension. This would translate to a formula of:
[01, 10]
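The evaluation rule described above can be sketched in Python. This is a minimal illustration rather than the reference implementation: it assumes each sub-formula is written as a string whose character at index 0 refers to the default logic and whose remaining characters follow the order of the extensions vector. The function names are hypothetical.

```python
from typing import Sequence


def subformula_holds(bits: str, results: Sequence[bool]) -> bool:
    """A '1' asserts success at that position, a '0' asserts failure."""
    if len(bits) != len(results):
        raise ValueError("sub-formula width must match the number of results")
    return all((bit == "1") == ok for bit, ok in zip(bits, results))


def formula_holds(formula: Sequence[str], results: Sequence[bool]) -> bool:
    """The formula is the disjunction of its sub-formulas."""
    return any(subformula_holds(bits, results) for bits in formula)


# (A & !B) || (!A & B): default logic A, one extension B.
xor_formula = ["01", "10"]
only_default = formula_holds(xor_formula, [True, False])
both_succeed = formula_holds(xor_formula, [True, True])
neither = formula_holds(xor_formula, [False, False])
```

With this formula, a transaction in which both the default logic and the extension succeed is rejected, which is the mutually exclusive behavior described above.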
Operations

Create

Transaction 1: Create Definition Script


Transaction 2: Create UDT Info Cell & Instances

Rules of Create Transactions & Notes
Transaction 1

The definition script (UDT Def) and Info Cell Type Script should themselves have type scripts that implement type ID functionality if the hash_type of UDT instances will be type instead of data.

Transaction 2

In Transaction 2, the lock scripts on the outputs are at the discretion of the creator. They can be anything, including the same script, so two lock scripts are included just for demonstration purposes.
The UDT instance in the output must contain an amount equal to the amount in the Info Cell’s total_supply field. The size of the number must also be the same. uint256 is simply an example here.
The hashes in the args of each type script must be included, as they form the UUID of the UDT implementation as well as the logical connection between UDT instances and the UDT Info cell.
The info cell type script should at least implement type ID functionality since it may be updateable.

UDT Instance Usage

Any transaction that includes a UDT instance as an input or output is considered a UDT Instance use.

System Logic (invariant properties necessary for all UDT implementations)

Amount must be the first x bytes in the instance data field, where x is the number of bytes in the amount from the info cell
Info cell with ID equal to UDT Instance ID must be included
ID cannot change
Deps with type hashes equal to each extension must be included
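The first invariant in the list above, reading the amount from the head of the data field, can be sketched in Python. The byte order is an assumption (little-endian, in line with the little-endian bit sequences used elsewhere in this draft), and `read_amount` is a hypothetical helper name.

```python
def read_amount(instance_data: bytes, amount_width: int) -> int:
    """Read the amount from the first `amount_width` bytes of an instance's data.

    Assumption: the amount is an unsigned little-endian integer.
    """
    if len(instance_data) < amount_width:
        raise ValueError("data field is shorter than the amount width")
    return int.from_bytes(instance_data[:amount_width], "little")


# A 32-byte (uint256-sized) amount followed by unrelated trailing data.
data = (250).to_bytes(32, "little") + b"extra"
amount = read_amount(data, 32)
```

Bytes after the amount are ignored here, which leaves room for instance-specific data that extensions may interpret.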


Default verification logic (default logic that defines UDT instance behavior. Can be customized & overridden with extensions)

The sum of the amounts in inputs must equal the sum of the amounts in outputs
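The default rule can be sketched as a Python function that follows the return convention used for extensions (zero for success, non-zero for failure). The function name and the use of already-parsed integer amounts are assumptions made for illustration.

```python
from typing import Iterable

SUCCESS = 0
FAILURE = 1


def default_sum_verification(input_amounts: Iterable[int],
                             output_amounts: Iterable[int]) -> int:
    """Succeed only when the total UDT amount is preserved by the transaction."""
    return SUCCESS if sum(input_amounts) == sum(output_amounts) else FAILURE


transfer = default_sum_verification([60, 40], [25, 75])   # amount preserved
inflating = default_sum_verification([60, 40], [25, 80])  # amount increased
```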


UDT Info Usage

Any transaction in which the Info cell appears as an input or output to the transaction is considered a UDT Info use.
Invariant Properties

Type hash cannot change
Total supply byte size cannot change
Symbol cannot change
There must be exactly one input and output with this ID unless the hash of the first input in the transaction matches the ID
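The invariants above can be sketched as a Python check over the info cells found in a transaction. The `InfoCell` record and `info_violations` function are hypothetical, and the sketch assumes that a creation transaction (one whose first input hash matches the ID) has no earlier info cell to compare against, so only the update case is checked.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InfoCell:
    type_hash: bytes
    supply_width: int   # number of bytes used by total_supply
    symbol: bytes


def info_violations(inputs: List[InfoCell], outputs: List[InfoCell],
                    is_creation: bool) -> List[str]:
    """Return the invariants violated by an info cell use (empty list if none)."""
    if is_creation:
        return []  # assumption: nothing to compare against on creation
    if len(inputs) != 1 or len(outputs) != 1:
        return ["expected exactly one info cell in inputs and in outputs"]
    old, new = inputs[0], outputs[0]
    violations = []
    if old.type_hash != new.type_hash:
        violations.append("type hash changed")
    if old.supply_width != new.supply_width:
        violations.append("total supply byte size changed")
    if old.symbol != new.symbol:
        violations.append("symbol changed")
    return violations


before = InfoCell(bytes(32), 32, b"DEMO")
renamed = InfoCell(bytes(32), 32, b"EVIL")
```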

Add Extension Operation


If the extensions field on the UDT Info cell’s input is different from the one on output, then there must be exactly one dep cell for each additional or changed type hash such that the type hash of those dep cells matches one of the hashes in the extensions field of the UDT Info cell.
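The dep-cell requirement above can be sketched in Python. Hashes are shortened to small byte strings for readability, and `extensions_backed_by_deps` is a hypothetical name; the sketch treats any hash present in the output but absent from the input as "additional or changed".

```python
from typing import List


def extensions_backed_by_deps(old_extensions: List[bytes],
                              new_extensions: List[bytes],
                              dep_type_hashes: List[bytes]) -> bool:
    """Every added or changed extension hash needs exactly one matching dep cell."""
    known = set(old_extensions)
    added = [h for h in new_extensions if h not in known]
    return all(dep_type_hashes.count(h) == 1 for h in added)


mint_hash, burn_hash = b"mint", b"burn"  # stand-ins for 32-byte type hashes
with_dep = extensions_backed_by_deps([mint_hash], [mint_hash, burn_hash], [burn_hash])
without_dep = extensions_backed_by_deps([mint_hash], [mint_hash, burn_hash], [])
```

Requiring the dep cell at the moment an extension is added gives some assurance that the hash refers to code that actually exists on chain.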

Extensions

Extension Execution Strategy

When are extensions executed?

Extensions are executed on every transaction that includes a UDT instance. This is because extensions implement additional verification rules. If extensions were not executed in this fashion, it would be difficult to determine which extensions to execute for a given transaction. Since extensions provide verification rules, they provide certain guarantees about UDT instances, and to ensure that these guarantees are enforced, the extensions should execute on every transaction.
Of course, some extensions will not be relevant in all transaction patterns because some guarantees are conditional: “If the transaction is an attempt to mint new UDT amount, then the transaction must satisfy x, y, and z properties”. Because of this, the inclusion and execution of the extension will cause transaction size to be larger and introduce additional compute cycles. Compute cycles can be minimized by returning early if the antecedent of the conditional is not met. Another possible way would be to create an extension that checks for a variety of conditions and executes the relevant code based on those conditions. The only drawback of this is that if there is a relationship between some routines in the “conditional execution” extension and some other extension, they will be difficult to express. The subsection “Formula field” below discusses this in more detail with examples.
Verification Steps

The UDT Def verifies UDT transactions in a series of phases.
The first phase is the “system verification” phase in which certain invariants that must exist for all UDT implementations are confirmed.
The second phase “extension verification” will execute the extensions and cache their return values.
The third phase “default verification” will perform any default verification introduced by the developer as well as the default rules described in this standard. The result of default verification will also be cached.
Finally, the result of all verifications will be checked against the formula field in the info cell to determine whether a valid set of conditions has occurred.
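The phases above can be sketched as a single Python function. This is an illustrative model, not the UDT Def itself: transactions are plain dictionaries, every check is a callable returning zero on success, and sub-formulas are strings whose first character refers to the default logic. All names are hypothetical.

```python
from typing import Callable, Dict, List

Check = Callable[[Dict], int]  # returns 0 on success, non-zero on failure


def verify_udt(tx: Dict, system_checks: List[Check], extensions: List[Check],
               default_logic: Check, formula: List[str]) -> int:
    # Phase 1: system verification. These invariants are not subject to the formula.
    if any(check(tx) != 0 for check in system_checks):
        return 1
    # Phase 2: extension verification; cache each return value.
    cached = [ext(tx) == 0 for ext in extensions]
    # Phase 3: default verification; its result takes position 0.
    cached.insert(0, default_logic(tx) == 0)
    # Final step: accept if any sub-formula matches the cached results exactly.
    for sub in formula:
        if len(sub) != len(cached):
            return 1
        if all((bit == "1") == ok for bit, ok in zip(sub, cached)):
            return 0
    return 1


has_info = lambda tx: 0 if tx["info_cell_present"] else 1
sums_equal = lambda tx: 0 if tx["in"] == tx["out"] else 1
burnable = lambda tx: 0 if tx["in"] > tx["out"] else 1
formula = ["10", "01"]  # default only, or burn only


def run(tx: Dict) -> int:
    return verify_udt(tx, [has_info], [burnable], sums_equal, formula)
```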
Formula field

As mentioned before, the formula within the info cell is a logical formula in DNF. Every logical formula can be converted into DNF. The reason for choosing DNF is that DNF formulas map well to the natural way we would describe the success conditions of some transaction based on verification logic. Further, the DNF formulas are expected to be simple even in sophisticated UDTs with many extensions because ultimately, every extension will return a success or failure value.
For example, imagine I am a UDT developer and I want to add mintable and burnable extensions to my UDT. The mintable extension will first check if the sum of UDT amount in the outputs of a transaction is greater than the sum of UDT amount in the inputs (to determine whether the transaction is equivalent to a mint operation) and then check if the signer of the transaction is authorized to mint. The mintable extension will only return success if the transaction is indeed a minting transaction and the signer is authorized; otherwise it will return a failure. Conversely, the burnable extension will check if the sum of UDT amount in inputs is greater than the sum in outputs, and will return success whenever this is the case (i.e., anyone can burn however many UDT they own). Further, recall that the default verification logic will ensure that the sums of UDT amount in inputs and outputs are exactly equal.
The two extensions and the default logic will never return success within the same transaction because the conditions for their success are mutually exclusive. However, I know that one of them should return success; if none of them do, then the transaction is not valid.
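The mutual exclusivity can be checked with a short Python sketch of the three checks. Signer handling is reduced to a name looked up in a set, which is an assumption made for illustration; the function names are hypothetical. Note that the mintable check returns early when the transaction is not a mint.

```python
SUCCESS, FAILURE = 0, 1


def mintable(inputs_sum: int, outputs_sum: int, signer: str, authorized: set) -> int:
    if outputs_sum <= inputs_sum:
        return FAILURE  # early return: not a mint, skip the authorization check
    return SUCCESS if signer in authorized else FAILURE


def burnable(inputs_sum: int, outputs_sum: int) -> int:
    return SUCCESS if inputs_sum > outputs_sum else FAILURE


def default_sum(inputs_sum: int, outputs_sum: int) -> int:
    return SUCCESS if inputs_sum == outputs_sum else FAILURE


def successes(inputs_sum: int, outputs_sum: int, signer: str) -> int:
    """Count how many of the three checks succeed for one transaction."""
    results = [
        mintable(inputs_sum, outputs_sum, signer, {"alice"}),
        burnable(inputs_sum, outputs_sum),
        default_sum(inputs_sum, outputs_sum),
    ]
    return results.count(SUCCESS)
```

For a transfer, a burn, or an authorized mint, exactly one check succeeds; for a mint by an unauthorized signer, none do, so the transaction is invalid.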
I can express this with a disjunction of conjunctions, where atoms and negation of atoms can exist within each conjunction:
M = mintable extension
B = burnable extension
D = default sum verification

(M & !B & !D) || (!M & B & !D) || (!M & !B & D)

In the info cell, each atom (i.e., the simplest formula; “A” is an atom, while “A & B” is a conjunction of atoms, where “A” and “B” are referred to as conjuncts) within a conjunction corresponds to the return value of one extension, while the first atom is reserved for the default verification logic of the UDT Def. Further, each conjunction is represented as a sequence of bits. So, the conversion would look like this:
extensions: [mintable_hash, burnable_hash]

formula: [001, 010, 100]
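The conversion can be automated by enumerating every combination of return values and keeping the accepted ones. The Python sketch below assumes each bit string lists the default logic first, followed by the extensions in order; `full_dnf` and the predicate style are hypothetical.

```python
from itertools import product
from typing import Callable, List, Tuple


def full_dnf(num_extensions: int,
             accept: Callable[[Tuple[bool, ...]], bool]) -> List[str]:
    """List every combination of results (default first) that `accept` allows."""
    width = num_extensions + 1
    return [
        "".join("1" if ok else "0" for ok in values)
        for values in product([False, True], repeat=width)
        if accept(values)
    ]


# Exactly one of default, mintable, burnable may succeed.
exactly_one = full_dnf(2, lambda values: sum(values) == 1)

# A || B in full DNF needs three conjunctions.
either = full_dnf(1, any)
```

Since the three conjunctions in the mint and burn example are symmetric, the resulting set of bit strings is the same whichever position is assigned to the default logic.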

A keen reader may notice that the functionality in the above example could be achieved simply by adding a third possible return value to extensions: a return value that indicates that the verification logic is irrelevant in the current transaction. The only hurdle is that we would have to find a new way to express that, in some transactions, the extension logic takes precedence over the default logic (the simplest way being that the default logic could be moved from the UDT Definition to an additional extension). In this case, each verification script would analyze the transaction to determine if a mint, burn, or regular operation were occurring. If a mint was not occurring, the mint logic would return this new value. Same for burn and same for regular operation in which sum of UDT must be preserved.
This is true in this particular case, but this approach would actually introduce a lot of difficulty and complexity for other types of extensions. While the above example showcases extensions that effectively add new possible operations to a UDT, this is not the only thing that extensions may be used for.
To illustrate this point, imagine I am building a token that can expire. Before the token expires, holders can perform one set of operations. After it expires, holders can perform a different set of operations, but not the ones available prior to expiration. Using the three-value approach to extensions, each operation that can be performed prior to expiration would have to check whether the operation is occurring, then check whether the token is expired, and only then complete verification. With DNF, this relationship between extensions can be expressed with additional conjunctions. Now imagine that I want to change the expiration check; e.g., extend the duration or even add logic that calculates the expiration based on certain conditions (if the token is used in a game, perhaps winning a round of the game pushes back the expiration time). The easiest way to do this is to make expirability an independent extension rather than integrating it into other extensions, and to express, via the extension formula, that a transaction is valid only if the operation is successful and the expirable extension returns success (i.e., the token is not yet expired). The formula-based approach to extensions makes it easier to express relationships between the conditions enforced among multiple extensions. Extensions can be leveraged to add new invariants, conditional properties, and operations on a UDT, and the relationships between these different types of extensions are easily captured by a DNF logical formula.
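As a sketch of the expiring token, suppose the UDT has two extensions: an expirable extension E that succeeds while the token is not yet expired, and a hypothetical post-expiry operation R (for instance, a redemption). With D as the default logic, the policy "transfers before expiry, redemption after expiry" is (D & E & !R) || (!D & !E & R). The Python below assumes bit strings ordered as D, E, R; the names are illustrative only.

```python
from typing import Sequence


def formula_holds(formula: Sequence[str], results: Sequence[bool]) -> bool:
    """True when any sub-formula matches the results position by position."""
    return any(
        len(sub) == len(results)
        and all((bit == "1") == ok for bit, ok in zip(sub, results))
        for sub in formula
    )


# Positions: default logic D, expirable E, post-expiry operation R.
expiring_token = ["110", "001"]

transfer_before_expiry = formula_holds(expiring_token, [True, True, False])
transfer_after_expiry = formula_holds(expiring_token, [True, False, False])
redeem_before_expiry = formula_holds(expiring_token, [False, True, True])
redeem_after_expiry = formula_holds(expiring_token, [False, False, True])
```

Changing how expiration is computed only touches extension E; neither the formula nor the other checks need to change.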
Conclusion & Future Steps

*A note on extension type hashes:* One point to consider is that the type hashes within the extensions field should be kept up to date in the event that an extension’s code is modified. Therefore, it would be good to use a similar mechanism when deploying extensions, enforcing something similar to a type ID on those as well.
Future Steps

Development

Lock Scripts & Reference Implementation
In the near future, I will provide a reference implementation to accompany this document as well as various extension examples.
Since lock scripts on UDT instances, Info cell, and UDT Def are not specified in this standard, I will soon publish another document describing uses of certain lock scripts. For example, facilitating transfers of UDT is an interesting topic to consider and is not as simple as setting a new lock on a UDT Instance cell (since this effectively transfers cell capacity as well).
Tooling
We should provide tooling support for UDT development including automatic conversion into formulas for most extension configurations as well as a debugger or IDE that makes it easier to develop UDT scripts & extensions and deploy all types of UDT components.
Research

Token Description Protocols
One challenge with extensions is that off-chain services such as wallets should be able to easily support new extensions. This process of adding support is entirely manual right now. As research and development of tokens continues to progress, a trend has emerged where sophisticated interaction with a token that has custom verification logic is handled by an off-chain service - usually a dapp. This also limits a person’s ability to take advantage of the token they own because they are dependent on a centralized provider to do so. I would like to explore the design of new protocols that allow tokens to communicate exactly the ways in which they can be used so that any wallet can automatically support new, custom logic. In this way, one would not need to depend on a dapp to make use of custom logic; any wallet that supports the token and follows this hypothetical protocol would be able to immediately present the token’s information and interact with it based on its custom logic.
Convenience without Counterparty Risk for UDT Updates
Although type ID functionality preserves the reference to a script cell across updates, it does not make any guarantees about the logic within the upgraded script cell itself. The problem this introduces I call an “invalidation attack”, although the situation could arise accidentally as well (which makes it even more dangerous). Essentially, if one party has control over some script cell X, and another party owns some cell Y whose type or lock script is a reference to X, then an update to X could render Y unusable.
The goal of building upgradeability into the standard is that history has taught us that it is a necessary feature, especially in cases where a security vulnerability has been disclosed and needs to be patched.
To this end, we can leverage different verification & testing approaches to ensure that certain properties of scripts are preserved across updates, as well as various governance schemes that ensure that the update is acceptable by the community.
Formula Forms
I would also like to explore ways to represent the verification logic of extensions other than DNF. DNF formulas, especially complete-DNF formulas, can get quite large pretty quickly depending on the specific formula one is trying to represent. Perhaps a decision tree or list, or a different normal form (such as ANF), would be better.
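To give a sense of the growth, the Python sketch below counts the conjunctions a complete DNF needs to express "at least one of n checks succeeds". The function name is hypothetical and the policy is chosen only as a simple example.

```python
from itertools import product


def complete_dnf_size(num_checks: int) -> int:
    """Number of conjunctions needed for 'at least one check succeeds'."""
    return sum(
        1 for values in product([False, True], repeat=num_checks) if any(values)
    )


sizes = {n: complete_dnf_size(n) for n in (2, 4, 8)}
```

The count is 2^n - 1, so a policy over eight checks already needs 255 sub-formulas of eight bits each.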
Appendix A: Upgradeability & Extensions

Possible Extension Mechanisms

Extend with Lock Script

There are multiple options for creating an extensible UDT Standard. The first is to offload all extensibility to custom lock script functionality. The problem with this approach is that extensions semantically apply state change rules to all UDT instances of the same type. Therefore, lock script based extensions would require that all UDT holders possess the same lock script. There are a couple problems with this. The first is that it is a priority that users can control their access to their UDT however they please. One huge benefit of the cell model and the UDT standard based on the cell model is that business, or verification, logic can be enforced by type scripts while access and ownership rules can be defined at the owner’s discretion via lock scripts.
The second reason for not using lock scripts is that leaving locks to the holder allows token holders to participate in different protocols or applications as they please. For example, if a developer used lock scripts to provide extensions that add certain features or properties to a UDT, those extensions would be either optional or required. If required, the developer would need to enforce that UDT Instances always possess the right lock script. Imagine many different protocols emerge that want to interoperate with a UDT, such as a UDT DAO. For UDT holders to be able to participate, the UDT developer would have to add an extension making it possible to deposit tokens into the DAO. However, if users were free to choose how their tokens were locked, they could simply deposit into the DAO without the developer having control over how the UDT can be stored. Perhaps some extension mechanism that primarily uses lock scripts could get around these types of issues, but it’s far simpler to reserve lock scripts for optional features determined by the user than to try to design an extension mechanism that can accommodate this complexity, especially when type scripts can suffice. Essentially, the UDT standard should not define how UDT instances must be stored; it should only define the valid ways UDT instances can undergo state changes.
If the extension was optional, however, then it is merely additional functionality that the user can opt into if they require it. This is perfectly acceptable and is exactly how UDT transfer would be implemented. Locks are appropriate for this use case: they allow the UDT holder to participate in other systems like DAOs or DEX with their UDT.
Extend with Type Script

The second option for extensibility is to rely on updates to the UDT Def script itself. This is ostensibly easy if UDT instances reference the script by type-hash: the reference doesn’t break when the logic is updated. There is another decision to make, though: where should metadata be stored? One possible approach is to store metadata as additional args to the type script or as hardcoded values. There are problems with this, though, best illustrated with an example.
Imagine if I were to add a mintability extension with a couple constraints that create a simple monetary policy:

A maximum 200 tokens can be minted at any one time
A list of authorized minters can perform the mint

So, I add this logic to the UDT Def and deploy it. The max mint amount and authorized minters are hardcoded. But:

I soon realize that hardcoding authorized minters makes adding or removing a minter pretty difficult and costly, since I have to update the UDT Def itself.
I can’t rely on args to track this system parameter either since UDT holders can change the contents of the arg. They could just make themselves an authorized minter.

Since verification scripts are stateless between executions, I essentially need a way to preserve potentially changing system parameters, or state, over time.
This is the purpose of the info cell. It stores system parameters that are used by one or more scripts within the system. Since it persists across transactions and blocks, it is also capable of providing scripts with information that would not be accessible from the script’s execution context. The other benefit is that changing system parameters and changing system verification logic are different actions with different risks, consequences, constraints and costs.
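The mintability example can be sketched in Python with the policy held as data, the way it would be held in the info cell, rather than hardcoded in the verification logic. `MintPolicy`, `mint_allowed`, and `add_minter` are hypothetical names, and minters are identified by plain strings for illustration.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet


@dataclass(frozen=True)
class MintPolicy:
    """System parameters as they might be stored in the info cell."""
    max_mint: int
    minters: FrozenSet[str]


def mint_allowed(policy: MintPolicy, amount: int, signer: str) -> bool:
    """The verification logic reads its parameters from the policy."""
    return 0 < amount <= policy.max_mint and signer in policy.minters


def add_minter(policy: MintPolicy, minter: str) -> MintPolicy:
    """Updating a parameter yields new info cell data; the logic is untouched."""
    return replace(policy, minters=policy.minters | {minter})


policy = MintPolicy(max_mint=200, minters=frozenset({"alice"}))
updated = add_minter(policy, "bob")
```

Adding a minter changes only the stored parameters, so the script that enforces the policy does not need to be redeployed.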
Extend with Extension Scripts

There is another improvement that we can make here. Instead of modifying UDT Def (type script of UDT Instances) themselves, we can separate the base logic from the extension logic. The info cell is an appropriate place to store extension data, which the UDT Def can examine to determine which extensions to execute and how to handle the return values of the additional verification logic given by the extensions.
The separation of system logic and extension logic is common practice where the system executes the extension logic as part of its lifecycle. In this UDT standard, the UDT Definition is logically split into two parts: system validation and base (default) verification logic. The former enforces invariants that must exist for all UDTs and cannot be modified, while the latter is simply a boilerplate default behavior for convenience. The base verification logic, or default verification logic, is similar to boilerplate behavior provided by a development framework.
Summary

Lock scripts are not an appropriate place for extension implementation. Said more accurately, extensions can be divided into two types:

Extensions that necessarily apply to all UDT instances of a certain type
Optional UDT functionality

The former requires a well-defined interface for adding extensions. The use case here is if the developer wishes to create a UDT that is, say, mintable. Extensions of this type operate at the UDT-system level to define specific and custom token types and properties. The latter provides UDT holders the freedom to opt into different functionality and interoperate with different systems. By reserving lock scripts for the latter, other dapps, such as a DEX or DAO, can implement lock scripts that various UDT holders can choose to use.
Type scripts are a more appropriate place for extension implementation. If the type script itself was modified to implement extensions, type scripts should be referenced by a universally unique type hash rather than data hash. This is because an update would consume the old cell containing type script code and produce a new one, thereby breaking references on UDT Instances. The alternative would be to create a second UDT Def while keeping the old one, and ask holders to manually migrate. This would cause something similar to a fork, leading to a host of additional challenges around coordination and compatibility, and place a lot of responsibility on UDT holders and additional effort on off-chain service providers that interface with the token. Especially when an update is for, say, patching a vulnerability, this friction is not ideal. This type of coordination problem makes all upgrades difficult, whereas ideally only poor quality updates are difficult. An extension mechanism that is separate from the base UDT logic is even better because it logically divides extension logic from system logic that enforces certain invariants on all UDT implementations.
Contributors

Author: Tannr Allard
Editors: Matt Quinn, Cipher Wang