On-chain contracting for privacy
(thanks to @fivepiece for significant contributions to these ideas)
"On chain contracting" is of course a very generic term; it applies to multisignature, coinjoin, coinswap or other exotic transactions that involve more than one party in one transaction (coinjoin, multisig) or multiple transactions (swaps with atomic-via-secret).
Here we're going to focus on a broader model that may allow more complex setups, with a focus on how they may apply to gaining privacy, although this model may well be useful in other ways too.
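Segwit enables pre-signing of not just individual transactions, but whole chains of them.
Each txid is now only dependent on transaction semantics (inputs, outputs, locktime and so on), not on signatures, so a child transaction can refer to its parent's txid before the parent has been signed.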
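To make the txid point concrete, here is a toy model in Python. It is only a sketch: `ToyTx`, `core` and `witness` are hypothetical stand-ins, not real Bitcoin serialization (the real wtxid format also includes marker and flag bytes). The one thing it borrows from Bitcoin is that a segwit txid is the double SHA-256 of the serialization without witness data.

```python
import hashlib
from dataclasses import dataclass


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for transaction ids."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


@dataclass(frozen=True)
class ToyTx:
    # Hypothetical simplification: `core` stands in for the non-witness
    # serialization (version, inputs, outputs, locktime); `witness` stands
    # in for the signature data that segwit keeps out of the txid.
    core: bytes
    witness: bytes = b""

    def txid(self) -> str:
        return sha256d(self.core)[::-1].hex()

    def wtxid(self) -> str:
        return sha256d(self.core + self.witness)[::-1].hex()


unsigned_tx1 = ToyTx(core=b"spend F:0 -> 22AB_1")
signed_tx1 = ToyTx(core=unsigned_tx1.core, witness=b"sigA|sigB")

# TX2 can commit to TX1's txid before anyone has signed TX1.
tx2 = ToyTx(core=b"spend " + unsigned_tx1.txid().encode() + b":0 -> 22AB_2")

txid_stable = unsigned_tx1.txid() == signed_tx1.txid()
wtxid_changes = unsigned_tx1.wtxid() != signed_tx1.wtxid()
```

Adding the signatures changes the toy wtxid but not the txid, which is what lets TX2 be built and signed on top of a still-unsigned TX1.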
To create a co-agreed contractual arrangement requires transferring ownership of coins into a shared control area; this will be one or more utxo outpoints under 2 of 2 or more generally N of N parties.
To make this trustless, there must be a backout mechanism or "refund". If a party has no refund clause agreed in advance, the other party (or parties) can hold up the funds deposited to the "shared control area" by invalidating the outgoing transactions, creating a deadlock and thus reverting to a MAD (mutually assured destruction) style negotiation.
This refund clause can take more than one form, the obvious cases being either (a) create separate refund transactions with a locktime to give a time period of shared ownership, or (b) separate conditional paths in Script using CLTV or CSV to achieve the same purpose.
The latter, (b), is arguably simpler, but the former is usually preferred if the focus is on privacy (so the scriptPubKey funded into does not have custom redemption paths; but see MAST).
Thus for the rest of this document, in which we focus on potential privacy applications of contracting on-chain, we will assume the use of (a) style refunds.
Based on this refund concept, along with the idea of pre-signing being safe in a segwit-enabled chain as we have today, we thus envisage a picture something like this (we work only with 2 party case; N party can be considered later):
                            +--> Refund locktime: M blocks -> pay out to A,B
                            |
A 1btc ---> | F (2,2,A,B) --+
B 1btc ---> |               |
                            +-->[Proposed transaction graph (PTG) e.g. ->TX1->TX2->TX3 ..]
The "proposed transaction graph" (hereafter PTG), can be a chain, or more generally a tree, of transactions, where each TX has F as ancestor - at minimum, one input which is 2-2-Alice-Bob controlled and is dependent on F.
Also note: the refund transaction from F may not be needed in all cases.
In human terms, you can envisage that: Alice and Bob would like to start to negotiate a set of conditional contracts about what happens to their money. Then they go through these steps:
1. One side proposes F (the funding transaction) and a full graph of unsigned transactions to fill out the PTG above; e.g. Alice proposes, then Bob and Alice share data (pubkeys, destination addresses). They sign the refund: an "I get my money back untouched, after a delay of M blocks, if we can't completely agree to terms" contract.
2. They exchange signatures on all transactions in the PTG, in either order.
3. With this in place (i.e. only after valid completion of (2)), they both sign (in either order) F.
4. Now both sides have a valid transaction set, starting with F. Either or both can broadcast them. The transactions are all guaranteed to occur as long as at least one of them wants it. Contrariwise, none of them is valid without F being broadcast.
5. If step (2) fails as seen by either party, both sides must wait M blocks before refunding their coins via backout.
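The ordering constraint in these steps (nothing funds F until every other transaction is signed by both sides) can be sketched as a small guard object. This is an illustrative sketch only: the `Negotiation` class and its method names are hypothetical, and "signature verified" is reduced to a boolean.

```python
from dataclasses import dataclass, field


@dataclass
class Negotiation:
    """Tracks which agreed transactions hold valid signatures from both parties."""
    ptg: list                                  # names of PTG transactions plus the refund
    signed: set = field(default_factory=set)
    funding_signed: bool = False

    def record_signatures(self, tx_name: str, alice_ok: bool, bob_ok: bool) -> None:
        if tx_name not in self.ptg:
            raise ValueError(f"{tx_name} is not part of the agreed graph")
        if alice_ok and bob_ok:
            self.signed.add(tx_name)

    def ready_to_fund(self) -> bool:
        return self.signed == set(self.ptg)

    def sign_funding(self) -> None:
        # F is signed last: until then neither party has committed any coins.
        if not self.ready_to_fund():
            missing = sorted(set(self.ptg) - self.signed)
            raise RuntimeError(f"refusing to sign F, still unsigned: {missing}")
        self.funding_signed = True


n = Negotiation(ptg=["refund", "TX1", "TX2", "TX3"])
for name in ["TX3", "TX2", "refund"]:          # order of exchange does not matter
    n.record_signatures(name, alice_ok=True, bob_ok=True)
blocked = not n.ready_to_fund()                # TX1 is still unsigned
n.record_signatures("TX1", alice_ok=True, bob_ok=True)
n.sign_funding()
```

The design choice worth noting is that the guard is all-or-nothing: a single missing or invalid signature anywhere in the graph keeps F unsigned.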
This construction works fine if all inputs used in transactions in the PTG are descendants of F; but this makes the construction very limited. So we'll immediately add more details to allow a more general use-case, in the next section.
More sophisticated construction allowing external inputs
In the 5 step sequence above, we tacitly assumed that each input to TXs in the PTG must be a descendant of F, the funding transaction. The reason for this is: if, say, TX2 had an input from Alice only, it could be pre-signed, but Alice could spend that utxo out from under the process, thus breaking the trustless chain.
This greatly limits the usefulness of the construction, especially in privacy applications, since the mixing effect that can be achieved is reduced if the coins all share a history in F. This can be addressed, if desired, by weakening the atomicity of the transactions in the PTG, as follows:
- One party (Alice) can add an extra input to transaction Tn in the chain that is under her exclusive control. By doing so, she has added extra risk to the counterparty (Bob) (who may find that this utxo is spent maliciously, preventing the rest of the transaction chain from completing even though all the transactions are signed).
- To defend against this possibility, Alice and Bob can create a presigned refund transaction from Tn-1 that refunds him all his remaining funds at that point (the same can be done for Alice for niceness). The timelock on this refund can be the same as the main backout for the whole PTG, as illustrated above.
In human terms: Alice can promise to make an extra payment as part of the set of conditional contracts; Bob can conditionally accept the promise but include a clause to redeem all his money at that point in the contract set if Alice does not deliver on that promise. (To get into the weeds, it's a little stronger than just an empty promise; Alice writes a cheque, not even forward-dated, and shows that her bank account is not empty right now; but Bob is not cashing it quite yet so there is a small window for reneging by spending the money on something else).
Illustration of a case with one external input from Alice (utxo A1); the additional refund ensures all fund transfers between the parties are entirely unwound in case of Alice's failure to deliver on her promise; note also that TX3 has an input from TX2 (2 of 2 signed); every transaction must include at least one such input:
                             +--> Refund not needed here (TX1 has no external inputs)
                             |
A 1btc ---> | F (2,2,A,B) ---
B 1btc ---> |                     +--> external payout 0.5 btc to Bob
                             |    |
                             +->[TX1 --> TX2 --> TX3 --> TX4]
                                  |       ^
                                  |       |
                                  |       |
                                  |       +--- utxo A1
                                  |
                                  +--> refund locktime M, pay out *remaining* funds to A: 1btc, B: 0.5btc
The above addresses the case of a single external input being included in a chain of transactions in the PTG (here, TX1,2,3,4). Extending this, and generalising to allowing external inputs in many transactions, is straightforward; we can add such in-PTG backouts at every step, redeeming all remaining funds to parties according to what they're owed.
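As a rough sketch of where such backouts attach in a simple chain (trees are not handled), the function below checks that every transaction has a 2 of 2 input descended from F, and lists the transactions that need a presigned backout because their successor consumes an external input. The `Tx` type and the outpoint labels like "TX1:0" are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    name: str
    shared_inputs: tuple     # outpoints descended from F, 2 of 2 controlled
    external_inputs: tuple   # outpoints controlled by one party alone


def backout_points(funding: str, chain: list) -> list:
    """Return names of transactions whose outputs need a presigned backout.

    A backout is attached to the predecessor of every transaction that
    consumes an external input, since that input might be spent elsewhere.
    """
    points = []
    previous = funding
    for tx in chain:
        if not tx.shared_inputs:
            raise ValueError(f"{tx.name} has no 2 of 2 input descended from {funding}")
        if tx.external_inputs:
            points.append(previous)
        previous = tx.name
    return points


# The illustrated case: only TX2 takes an external input (utxo A1).
chain = [
    Tx("TX1", ("F:0",), ()),
    Tx("TX2", ("TX1:0",), ("A1",)),
    Tx("TX3", ("TX2:0",), ()),
    Tx("TX4", ("TX3:0",), ()),
]
points = backout_points("F", chain)
```

For the illustrated chain this yields a single backout point, TX1, matching the diagram where no refund is needed directly from F.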
To summarize this section and how it differs from the original, simpler construction:
Alice and Bob have a choice:
1. They can set up a fully trustless PTG, without promises. They are then guaranteed to achieve "all or nothing": either all cooperative signing works, then all transactions can be broadcast (as long as at least one of them wants to), or nothing (including F) is broadcast at all.
2. They can set up a PTG including promises from one or both parties. Now they don't get "all or nothing" but only ensure that the transactions that complete are a subset, in order, from the start F. To achieve this they add presigned backouts at (probably every) step, so that if the chain "breaks" somewhere along the way, they will recover all the funds remaining that are owed to them.
The tradeoff is: (2) is not perfectly atomic, but it allows the transaction graph to include utxos that are not descendants of F, which is particularly useful for privacy applications. In a sequence of 10 coinjoins, you may be happy to risk that TXs 6-10 don't end up happening, if it doesn't cost you money. Case (2) is more likely to be of interest.
The next 5 subsections are commentary on the usefulness of the idea in practice; jump to "Concrete Example" at the end if you want a more complex example to look at.
Comparison with Lightning concept
Lightning leverages the idea of pre-signing sets of contracts in a (small) transaction graph, but it develops a much cleverer and more powerful trick added on to this: the ability to update a set of contracts by invalidating a previous set of contracts (see concepts like "breach remedy" and the sharing of a secret to invalidate a prior contract set).
Rather whimsically I propose to call Lightning (or any uni- or bi-directional channel idea really) "vertical contracting" (imagine new contracts "overlaying" previous; although I guess literally ripping up old contracts is a more accurate analogy), while what is being discussed here is more like "horizontal contracting"; preparing whole chains or trees of transactions, all of which will go on-chain. There is no updating here (at least, not in this proposal), there is just a primary backout of cancelling all contracts and reverting to a refund.
Advantages over contracting on chain without pre-signing
A good practical example of on-chain contracting today is Joinmarket, in which multiple parties engage in single-transaction coinjoin contracts. A user may often choose to do what is known as the "tumbler algorithm", running a whole set of coinjoins over a long period by finding counterparties to do the joins in a specifically arranged sequence to maximize the privacy effect. One of the big disadvantages of this is the long time frame interactivity (having to exchange messages with external counterparties in bursts, separated sometimes by several hours, over days in some cases). This makes things harder in terms of network connection management, running hot wallets etc.
So while the ability to do all the negotiation immediately, upfront is a big practicality win, it's not the only important thing here: what's also really important is that because the trustlessness is now spread out over an entire chain or tree of transactions, there is no problem with swapping control of coins in individual transactions. This can invalidate assumptions that an external observer makes about the meaning of transactions on chain (which is why I think it can be important for privacy applications, although I don't assume there are no other things you can do with it).
While there is no direct drop-in replacement for Joinmarket's "tumbler" in this proposal, in broad outline we can imagine the great advantage of being able to make a whole set of negotiations all at the start, including signing operations, so that direct over the wire interactivity is reduced to a very short time frame.
Anonymity set issues
As presented above, parties use a principal 2 of 2 destination for the funding transaction, and (with some possible exceptions) 2 of 2 shared control points as inputs to each transaction in the graph. In general terms this is a watermarking effect on these transactions. But consider these points:
- In a Schnorr enabled world, this watermarking disappears.
- In the absence of Schnorr and of any efforts to change this, it still isn't that bad:
  a. You still have the anonymity set of all 2 of 2 transactions.
  b. You can widen this to 2 of 2 plus 2 of 3 (the majority of multisig usage), since you can use fake 3rd keys.
  c. If thinking about coinjoins in particular, they are already entirely identifiable (except very unusual niche coinjoin usages).
It's worth thinking carefully about other watermarking issues, though. For timing correlation issues, we have:
- Voluntary cooperation on broadcast times (not a strong guarantee; "defection" here is less likely, but this is still a big issue).
- Locktimes; this watermarks transactions, although weakly.
- In some cases immediate multiple broadcasts aren't that rare, so it may be less of an issue.
The set of possible privacy use-cases
These are chains of connected transactions (or trees, but always with F at the root); so, it stands aside from ideas like "Coinswap" where there is more than one transaction flow involved. However, that doesn't mean that an atomic swap type construction couldn't be somehow blended into this model; but we defer that idea for another time.
It fits nicely along with Coinjoin-like ideas though. An example of using this to create multiple coinjoin transactions is shown in the next section ("Concrete example").
Let's remember the core concept of coinjoin: the Nakamoto/Meiklejohn assumption/heuristic states that "multiple inputs in one transaction implies common ownership", which is false in coinjoins. Usually coinjoins are made acceptable to participants by allowing all parties to achieve net zero transfer between ins and outs (in Joinmarket that is tweaked, but not by much). This gives extra ammunition to the blockchain analyst, who can perform subset sum analysis on the (ins, outs) sets, which re-links ins and outs up to the non-equal amounts (but no further). See "coinjoin sudoku".
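To illustrate what subset sum analysis does to an ordinary balance-preserving coinjoin, here is a brute-force sketch. The function name and the amounts are made up for illustration, fees are ignored, and a real analysis would have to allow for them.

```python
from itertools import combinations


def matching_subsets(inputs: list, outputs: list) -> list:
    """Find proper, non-empty input subsets whose sum equals an output subset's sum.

    Amounts are integers in arbitrary units; fees are ignored in this sketch.
    Returns (input_indices, output_indices) pairs.
    """
    matches = []
    for r in range(1, len(inputs)):
        for in_idx in combinations(range(len(inputs)), r):
            target = sum(inputs[i] for i in in_idx)
            for s in range(1, len(outputs)):
                for out_idx in combinations(range(len(outputs)), s):
                    if sum(outputs[j] for j in out_idx) == target:
                        matches.append((in_idx, out_idx))
    return matches


# Hypothetical 2-party coinjoin: inputs of 150 and 230, two equal outputs
# of 100, and change outputs of 50 and 130.
inputs = [150, 230]
outputs = [100, 100, 50, 130]
matches = matching_subsets(inputs, outputs)
```

Every match pairs input 0 with the 50 change output and input 1 with the 130 change output, so the change is re-linked to its owner, while either equal-sized output fits either input. That is the "up to the non-equal amounts (but no further)" result described above.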
With this new model - although as presented it's more restricted than a full network of coinjoin transactions - that assumption is not just unsafe, but very likely flat out wrong; there is no requirement for individual transactions in the PTG to preserve the balance of individual participants, because we've spread the trustlessness of a single transaction to the whole transaction graph in the PTG.
N of N vs 2 of 2
It's believed that basically all of the above constructions/descriptions carry over to N of N. The only problem is how much larger the signatures are in this case, and how that impacts the anonymity set. See the earlier comments about Schnorr.
Concrete example - a multisig-based coinjoin chain.
This example case is intended to flesh out some details that may be unclear in the high-level description above.
As before we start with Alice and Bob, 2 parties, who have deposited funds into F. In this case, they each contribute 1 btc and F has a single 2 btc output (note it is also possible for only one of them to be funding).
To reduce the size of the diagram, we have one line per transaction, detailing inputs and outputs. 22ABn means the nth created 2 of 2 shared control destination for Alice and Bob. An and Bn refer to numbered external utxos, i.e. utxos from other wallets that Alice and Bob own, that they control unilaterally (see previous section about this). X(i) will mean the i-th outpoint utxo from transaction X, with index starting at 0. B(n, m) will be a timelocked backout with output n btc to Alice and m btc to Bob. The locktime will be the same for all the backouts. Outputs labelled "Alice" or "Bob" refer to transfers to externally controlled addresses (so are "sinks" to this process).
    +--> B(1, 1)
    |
    |
(F) --> TX1(ins: F(0);
    |       outs: out(0): Alice 0.8 btc, out(1): 22AB_1, 0.8 btc, out(2): 22AB_2, 0.4 btc)
    |
    +--> B(0.2, 1.0)
    |
    +--> TX2(ins: TX1(1), TX1(2), A_1 0.3 btc;
    |        outs: out(0): Bob, 0.2 btc, out(1): Alice, 0.2 btc, out(2): 22AB_3, 1.1btc)
    |
    +--> B(0.3, 0.8)
    |
    +--> TX3(ins: TX2(2), B_1 0.5 btc; outs: out(0): Bob, 0.6 btc, out(1): 22AB_4, 1.0 btc)
    |
    |
    +--> TX4(ins: TX3(1); outs: out(0) Alice 0.3 btc, out(1) Bob 0.3 btc, out(2) Bob 0.4 btc)
(Technically the first B(1,1) is not needed here).
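The backout amounts in the diagram can be checked with some simple bookkeeping. The sketch below assumes one accounting rule, inferred from the figures rather than stated anywhere as a rule: each party is owed what they have put in so far (deposit into F plus external inputs) minus what they have already been paid out to external addresses. Fees are ignored, and the function name and data layout are hypothetical.

```python
from decimal import Decimal as D


def backout_amounts(deposits: dict, steps: list) -> dict:
    """Map each transaction name to the amounts still owed after it confirms.

    deposits: what each party paid into F.
    steps: (name, external_in, payouts) per transaction, where external_in
           and payouts map a party to an amount. Fees are ignored.
    """
    owed = dict(deposits)
    table = {"F": dict(owed)}            # backout directly from F
    for name, external_in, payouts in steps:
        for party, amount in external_in.items():
            owed[party] += amount        # party adds one of their own utxos
        for party, amount in payouts.items():
            owed[party] -= amount        # party is paid out to a sink address
        table[name] = dict(owed)
    return table


steps = [
    ("TX1", {}, {"A": D("0.8")}),
    ("TX2", {"A": D("0.3")}, {"A": D("0.2"), "B": D("0.2")}),
    ("TX3", {"B": D("0.5")}, {"B": D("0.6")}),
    ("TX4", {}, {"A": D("0.3"), "B": D("0.7")}),
]
table = backout_amounts({"A": D("1"), "B": D("1")}, steps)
```

Here `table["TX1"]` corresponds to B(0.2, 1.0) and `table["TX2"]` to B(0.3, 0.8), and after TX4 nothing is owed to either party.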
The level of obfuscation achieved by this is not particularly strong due to only mixing with 2 parties, but it's easy to see that it could achieve significant fungibility with small numbers of transactions and 2-4 counterparties even, partly because the patterning of such transactions can be far more complex than with traditional coinjoin setups. In the above example coinjoins are used, and also include equal output amounts, but the idea gets quite smeared out/generalized here.
And of course this is a very simple example compared to what is possible.
Sequence of actions
The two sides have calculated the unsigned serialization of each of these transactions, including backouts, in advance; they can then follow this sequence:
- For each of TX4, TX3, TX2, TX1 and all backouts, Alice and Bob can send their signatures in any order.
- Both sides verify the other side's signatures.
- They should verify the validity and unspent-ness of utxos owned by the other party (An, Bn).
- Both Bob and Alice sign F.
- Both Bob and Alice pass their F sigs to each other (in either order)
- Both Bob and Alice verify that F is validly signed.
- Now either Bob or Alice can choose to broadcast all transactions F, TX1, TX2, TX3, TX4 (with delays if they choose).
- If TX2 or TX3 fails to broadcast because one of the external utxos (An, Bn) has already been spent, then they can broadcast the appropriate B(n,m) after waiting for the necessary blockheight (the rest of the transaction chain is invalidated), and they will recover the remainder of their funds.
If any of the signing steps fail as seen by one party, they simply refuse to sign F and nothing happens.
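The recovery rule in the last step can be sketched as follows. The function name, the block heights and the mapping from each transaction to "its" backout are hypothetical, and the sketch assumes a backout becomes broadcastable once the chain tip reaches its locktime height.

```python
def recovery_action(confirmed: list, backouts: dict, locktime: int, height: int) -> str:
    """Pick the next step once the chain cannot progress any further.

    confirmed: names of transactions already on chain, in order, starting with "F".
    backouts:  maps a transaction name to the presigned backout that spends it.
    locktime:  block height locktime shared by all backouts.
    height:    current chain tip height.
    """
    last = confirmed[-1]
    if last not in backouts:
        raise LookupError(f"no presigned backout spends {last}")
    if height < locktime:
        return f"wait {locktime - height} more blocks"
    return f"broadcast {backouts[last]}"


backouts = {"F": "B(1, 1)", "TX1": "B(0.2, 1.0)", "TX2": "B(0.3, 0.8)"}

# TX2 could not be broadcast because A_1 was already spent; F and TX1 are confirmed.
early = recovery_action(["F", "TX1"], backouts, locktime=600100, height=600050)
later = recovery_action(["F", "TX1"], backouts, locktime=600100, height=600100)
```

With these made-up heights, the first call says to wait 50 more blocks and the second says to broadcast B(0.2, 1.0).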