george-hawkins/corda-notary.md

## corda-notary.md

      
Note: this page is heavily chopped down from multiple earlier sources, so it doesn't hang together very well at this stage. It's now just a collection of facts and thoughts about Corda notaries.
How is a Corda notary different to a centralized DB?

Distributed DBs (both SQL and NoSQL) that guarantee eventual consistency have been around for a long time outside the world of distributed ledger.
How is a Corda notary really different to a centralized DB?
Note: whether behind the scenes a notary or centralized DB is implemented as a single system or as a distributed cluster is largely irrelevant to the primary parties involved (in a Corda environment that's the nodes).
Update: initially I was under the impression that a notary was an optional element of a Corda system. It is not; however, its involvement is optional for certain types of transactions, e.g. those involving no input states.
Initially I assumed that when all parties reached consensus on a proposed transaction this meant more than that they'd all agreed to sign it. I thought it also meant that, once it was fully signed, they had reached consensus that all their ledgers could and would eventually become consistent with any agreed change. I.e. that some practical solution to the Byzantine Generals Problem came into play between the parties, whether a notary is present or not, regarding updates to their ledgers.
But it seems this isn't the case - without a notary any party that has a fully signed transaction will generally immediately persist it to their ledger without receiving any guarantees from the other parties that they will eventually become consistent with this change. If I want such a guarantee I need a notary (and even then the notary isn't really guaranteeing that consistency will be reached, it's guaranteeing that inconsistent facts won't be established, see the final section below).
So if all parties must communicate with a notary, how is this meaningfully different to dealing with a central DB (whether clustered or not), as one might do in any large multi-party system that exists without any distributed ledger component?
Is it the signing and trust element? I.e. in a centralized DB setup all parties have to trust a central authority, i.e. what it puts in the DB is the truth (with no in-built mechanism to prove after the fact that all parties agreed on this truth), however in a distributed-ledger setup like Corda the parties agree among themselves as to new shared facts and this can be proved (both at the time and later). The notary has no role in agreeing new facts, its role is just to ensure that all parties can agree on a consistent view of such facts (but it cannot produce new facts itself).
So this seems to be the core of the thing - while the notary does constitute shared infrastructure there is a fundamental difference in how changes to state are agreed.
Note: this Corda FAQ question mentions BFT but only in relation to notary cluster consistency. However one should be no more interested in the internals of how a notary cluster achieves eventual consistency than in how a distributed DB does. In fact one wonders why they didn't use an existing such system to provide this behavior.
Shared facts without (much) shared infrastructure

If a group of firms need to agree on and evolve a shared set of facts, they could create a cooperative (like the SWIFT network) that is owned and acts in the interest of all parties. Such an organization could manage all relevant information on its DBs, i.e. be a centralized point for the data involved, and it could manage all the related business processes.
E.g. if bond issuance requires a number of parties to agree, all parties could form a cooperative, bring in McKinsey to determine the processes involved and IBM to manage the implementation. McKinsey would then get to grips with these processes - the party that proposes a new issuance having its set of existing processes and the other parties, whose input is also needed before the issuance can be finalized, having their processes too. Once all are understood IBM can then set about creating a system that implements what were disparate processes as a single system. The parties involved would then abandon the existing internal systems that used to manage their parts of the old process and wire up their ancillary systems to the new shared system (and appropriately adjust the work patterns of the employees involved).
Aside: is this a straw man? It's clear it's up to the parties involved to draw the line between internal processes and shared infrastructure, but no matter where the line is drawn, it's still the case that some shared infrastructure needs to be developed.
Note that if there's no desire to be able to definitively say anything about shared facts then each party can develop their processes entirely in-house and just cooperate over the exchange of information. Defining the messages and message flows involved is clearly a case of shared development but not necessarily of shared infrastructure - the infrastructure required for message exchange can be purchased on-demand from an independent utility, i.e. a telecoms company.
Shared systems that attempt to capture the requirements of a large number of participants are famous for coming in late, running over budget or failing altogether.
A centralized system implicitly guarantees a consistent world view with no trust issues between the processes within the system, despite capturing what were once the disparate internal business processes of various different parties. It would be nice if you could guarantee a consistent world view while allowing the parties involved to continue much as they do currently, with their own internal processes and ledgers, and with interaction between the parties not going beyond the kind of message exchange that is already common. And it would be nice if this could be achieved without requiring any significant change in the degree of trust between the parties involved.
Currently it's obviously possible to send EDIFACT messages back and forth between parties and get them signed etc. However the signing just indicates that the parties agree to the actions involved, while much of the existing state involved remains implicit. It would be nearly impossible to reconstruct the set of effectively shared facts that are established just by examining the messages involved. The messages result in data evolving in various internal DBs as the result of various opaque system-specific processes. In contrast one can make definite statements about the facts that will end up on a given party's Corda ledger. And for any particular fact one can show that it is the result of a specific transaction, and one can prove that any counterparties involved agreed to establishing the fact and that it is valid in terms of the contract governing its creation (the contract itself also having been agreed between the parties).
If it's the case that Corda and similar distributed ledger systems provide a model that can be shown to fit many disparate environments then it would seem to make sense to use them, in terms of reducing implementation costs etc., even if everything is to hand, i.e. PKI frameworks etc., such that one could develop a similar solution oneself.
While the notary is a shared component it is an off-the-shelf solution - it does not necessitate the formation of a significant cooperative that e.g. contracts the services of companies like McKinsey and IBM to create a bespoke solution and itself requires a considerable ongoing commitment in terms of manpower etc. The notary is a standardized technical solution. Its job is to perform a well defined and fairly simple function - it ensures that an already established fact, e.g. that George has $100, cannot be used more than once as the basis for forming new facts, i.e. it solves the double-spend problem. It's a mechanical arbiter of truth.
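The double-spend check described above can be sketched as a toy uniqueness service. This is only an illustration and not Corda's API: `StateRef`, `ToyNotary` and the transaction ids are hypothetical names, and a real notary also checks signatures and other details that are left out here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StateRef:
    """Hypothetical pointer to one output of an earlier transaction."""
    tx_id: str
    index: int


@dataclass
class ToyNotary:
    """Remembers which states have been consumed and by which transaction."""
    consumed: dict = field(default_factory=dict)  # StateRef -> consuming tx id

    def notarise(self, tx_id: str, inputs: list) -> bool:
        # Refuse if any input was already consumed by a different transaction.
        if any(self.consumed.get(ref, tx_id) != tx_id for ref in inputs):
            return False
        # Otherwise record every input as consumed by this transaction.
        for ref in inputs:
            self.consumed[ref] = tx_id
        return True


notary = ToyNotary()
georges_100 = StateRef("tx-issue", 0)  # the fact "George has $100"

first = notary.notarise("tx-pay-alice", [georges_100])
second = notary.notarise("tx-pay-bob", [georges_100])  # attempted double spend
print(first, second)
```

Note that the toy notary never looks at what the state means. It only tracks whether a reference has already been used, which is what makes it possible to treat it as a utility.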
Some systems are largely symmetric, e.g. in a system of retail banks all are probably engaged in a similar set of operations: one day A will be the sender of a money transfer and B the receiver, and other times it'll be vice-versa. Other systems are more asymmetric, e.g. between wealth management (WM) entities and investment banks (IB) - the WM entities may initiate an FX transaction that will be carried out by one of the IBs but such a flow will never go the other way.
With a system like Corda the parties can develop the flows, i.e. the process via which new facts are established, in concert or largely independently. I suspect even for apparently symmetric systems this would happen largely independently as the internal processes involved, along with the interactions with secondary systems, will be entirely particular to individual institutions.
However it seems clear that all parties should agree on what constitutes a valid transaction, i.e. given current states what new states can be established. Currently all parties would come to some kind of written agreement that defined such things and then all individually implement the checks needed for their particular systems. Corda holds out the promise that this component at least can be standardized across the internal systems of different companies. As Corda defines a standardized way to describe these transformations on existing state, companies can share the code that verifies their contractual validity.
Whether the apparent advantages over existing solutions are compelling enough that many potential customers would see the switching cost as worthwhile is unclear. And in the case where they are enthusiastic to switch to something else (e.g. due to costs associated with their existing systems) it's unclear whether they would choose distributed-ledger technology over alternatives such as pooling resources with other parties in shared infrastructure. We can surely hope that there's a certain established scepticism regarding the development of shared monolithic systems and that each party will prefer the idea of just committing to delivering the part of the system that involves them and keeping this internal rather than contracting the whole system out to a new shared entity.
Non-validating notaries

In an ideal world the notary really would be a utility doing nothing more than preventing states being consumed more than once. It would just sign off on the fully signed transactions agreed elsewhere, persisting the new states involved and marking existing ones as consumed but treating those states as entirely opaque from its point of view.
The business processes involved are running elsewhere and the notary simply provides a mechanism by which the parties involved can be sure that if they persist a particular fact to their internal ledgers they will have a view which is consistent with that of the other parties.
As such the notary could manage its element of things for many different systems of interacting parties, much like a real world notary, without having to be updated as the business processes involved in these distinct systems change over time.
However this isn't the case where complete trust doesn't exist between the parties (and I would guess this is the common case). In this situation the notary does need access to the classes that define all relevant contract code and allow it to deserialize the states etc. that make up all the transactions it deals with.
A proposed transaction includes the set of parties that the proposer claims are needed to completely sign it. However it's up to all parties that receive the proposed transaction to verify that they agree with this claim. This verification is handled by the relevant Corda contract - it must confirm that the proposed set of signing parties is consistent with the rules encoded in it.
Corda supports the idea of validating and non-validating notaries. A non-validating notary does not need to know anything about the contracts involved. However without validation a party can accidentally or maliciously "wedge" a state, i.e. make an established fact unusable by the other parties to that fact. Say an established fact is known to A and B. A can include this fact as an input to a new proposed transaction along with the untrue claim that only it needs to sign off on the embodied change. It can then sign the transaction and send it to the notary. If the notary is non-validating it will not attempt to verify A's claim and will mark the input fact as consumed as part of adding the apparently fully signed transaction to its world view. Any subsequent transaction that B tries to create, involving the given fact as an input, will now be refused by the notary, even though B never got to see A's transaction and would have rejected it had it done so, on the basis that it did not match the conditions of the contract agreed between A and B.
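The wedge scenario can be played through in a small toy model. Everything here is an assumption made for illustration (`Tx`, `Notary`, `contract_verify` and the `PARTICIPANTS` table are hypothetical and are not Corda classes): the "contract" is reduced to a single rule that every participant of an input state must be among the claimed signers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    tx_id: str
    inputs: tuple               # ids of the states being consumed
    claimed_signers: frozenset  # who the proposer says must sign
    signatures: frozenset       # who actually signed


# Stand-in for contract code: all participants of every input must sign.
PARTICIPANTS = {"shared-fact": frozenset({"A", "B"})}


def contract_verify(tx: Tx) -> bool:
    required = frozenset().union(*(PARTICIPANTS[i] for i in tx.inputs))
    return required <= tx.claimed_signers


class Notary:
    def __init__(self, validating: bool):
        self.validating = validating
        self.consumed = {}  # state id -> consuming tx id

    def notarise(self, tx: Tx) -> bool:
        if not tx.claimed_signers <= tx.signatures:
            return False  # not fully signed, even by the proposer's own claim
        if self.validating and not contract_verify(tx):
            return False  # only a validating notary checks the claim itself
        if any(self.consumed.get(i, tx.tx_id) != tx.tx_id for i in tx.inputs):
            return False  # an input was already consumed elsewhere
        for i in tx.inputs:
            self.consumed[i] = tx.tx_id
        return True


# A falsely claims that only its own signature is needed.
rogue = Tx("tx-rogue", ("shared-fact",), frozenset({"A"}), frozenset({"A"}))
# A later, properly signed transaction involving both A and B.
honest = Tx("tx-honest", ("shared-fact",), frozenset({"A", "B"}), frozenset({"A", "B"}))

non_validating = Notary(validating=False)
rogue_accepted = non_validating.notarise(rogue)
honest_blocked = not non_validating.notarise(honest)

validating = Notary(validating=True)
rogue_rejected = not validating.notarise(rogue)
honest_accepted = validating.notarise(honest)
```

In this model the non-validating notary accepts the rogue transaction and then refuses the honest one, while the validating notary does the reverse. The price is that `contract_verify`, i.e. the contract code, has to be available to the notary.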
The notary only prevents inconsistent facts

The notary just seems to guarantee that contradictory facts should never end up being established by different parties within a given system but not that a particular ledger will become consistent in terms of having all, rather than just some, of the facts relevant to it.
A notary just means that only facts that are consistent with its view of the world are considered valid by all parties and only such facts can end up on their ledgers. But this is quite a different thing to all parties definitely being aware of all facts that are relevant to them, e.g. if a node has been destroyed and rebuilt from a backup it's not the job of the notary to provide the node with a mechanism to negotiate with it such that the node's view of the world, on facts that are relevant to it, becomes complete and consistent with the notary's.
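The backup example can be made concrete with another toy model. It is a sketch under stated assumptions rather than a description of Corda's behavior: `ToyNotary` and `ToyNode` are hypothetical, states are reduced to string ids, and the node's ledger is just its set of unconsumed states.

```python
class ToyNotary:
    def __init__(self):
        self.consumed = {}  # state id -> id of the consuming transaction

    def notarise(self, tx_id, inputs):
        # Refuse any input already consumed by a different transaction.
        if any(self.consumed.get(s, tx_id) != tx_id for s in inputs):
            return False
        for s in inputs:
            self.consumed[s] = tx_id
        return True


class ToyNode:
    def __init__(self):
        self.unconsumed = set()  # the node's view of currently usable facts

    def record(self, inputs, outputs):
        self.unconsumed -= set(inputs)
        self.unconsumed |= set(outputs)


notary = ToyNotary()
node = ToyNode()
node.record(inputs=[], outputs=["s1"])  # issuance, no inputs involved

backup = set(node.unconsumed)  # backup taken while the ledger is {"s1"}

# After the backup, s1 is consumed and replaced by s2.
assert notary.notarise("tx2", ["s1"])
node.record(inputs=["s1"], outputs=["s2"])

# The node is destroyed and rebuilt from the backup.
node.unconsumed = set(backup)

# The notary stops the stale ledger from establishing a contradictory fact...
stale_spend_accepted = notary.notarise("tx3", ["s1"])
# ...but nothing in this exchange tells the node that s2 exists.
knows_about_s2 = "s2" in node.unconsumed
```

Both `stale_spend_accepted` and `knows_about_s2` end up false: the rebuilt node is prevented from contradicting the notary's view but is left with an incomplete ledger, and the toy notary holds nothing it could hand back to repair it.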
The queues behind the RPC calls between nodes are transactional. Is it enough to use XA to coordinate read commits on these queues with the corresponding write commits to the ledger? If a party commits such a transaction must it also guarantee that it can recover all involved state following a later failure? Must it maintain its ledger on infrastructure that can survive failure? What is the current situation between non-trusting groups like banks? I guess it's that it is each party's job to make such guarantees for their own sake. You don't want to go around your partner banks saying "hey what do I owe you / you owe me?" I.e. you trust your counterparties to a degree but not to the extent that it would be acceptable to have a situation where you have to ask them what your view of the world should be.