Skip to content

Instantly share code, notes, and snippets.

@nothingmuch
Last active December 29, 2021 16:28
Show Gist options
  • Save nothingmuch/544cdd47dd18ef8fe923b54e0d5ee141 to your computer and use it in GitHub Desktop.
Save nothingmuch/544cdd47dd18ef8fe923b54e0d5ee141 to your computer and use it in GitHub Desktop.

Unequal Amount Mixing for ZeroLink using Preferred Value Series Fixed Denominations

This is a quick sketch of several modifications to zerolink. This document tries to articulate an as of yet unproven intuition is that combined together they can allow unequal input amounts as well as relaxation of the post-mix no linking restriction, while retaining the same conservative assumptions about mixed output indistinguishability.

Disallowing post-mix linking is arguably bad for fungibility, since users are likely to bypass this restriction by transferring to other wallets. Therefore, if I am able to justify this change this seems like a much more substantial contribution to usability and fungibility. That said even if it can't be shown to be reasonable to do so, some of these ideas still have merit on their own, so not all would be lost.

Proposed Protocol Changes

"Soft Fork" changes

  1. Enable additional fixed denomination mixing rounds based on preferred value series: i.e. instead of just rounds of 0.1 as is the case today with wasabi, there would be multiple independent rounds for e.g. using the 1-2-5 series, the round denominations could be { 1, 2, 5 } * 10^n for all non-dust values of n (unpopular denominations would simply take longer to mix). Even with increased adoption this would likely reduce the size of anonymity set in given wallclock time interval, but not in discrete time measured mixing rounds.

3 phase chaumian coinjoin "hard fork"

Instead of input registration submitting multiple inputs, a change output and a blinded mix output, input registration mints chaumian tokens corresponding to the input amount minus the per-input fees, at a rate corresponding to the mixing fees.

These tokens can then be unblinded and "spent", deducting the per output fee, to register outputs that are not linked to the inputs before the final coinjoin signing phase. The server could still enforce the restriction of only 1 mixed output per participant if inputs must be registered together, with a single "mix" token being issued for the round denomination, and the remaining amount issued as "change" tokens (thus limiting the number of mixed outputs that can be made from one or more inputs to at most one). This restriction could also be relaxed, to allow tumbling that preserves the fixed denomination (allowing mixing fees to be paid by seemingly unrelated inputs).

Token denominations and colours would be specified by a set of per-round keys, and are thus only valid for the duration of the round. To allow O(1) verification of tokens, the amount should also be specified when spending: (amount, serial, signature) (see also brands credentials, anonymous credentials light for how to encode amount information with blind signatures).

This revised protocol would allow the following changes:

  1. Implement group send proposal as a new round type (non mixing round). If post-mix linking is allowed, then values over 0.1 would be spendable by using multiple mixed txos. Assuming preferred value series denominations are used, payments should generally be possible without requiring a change output given enough values (e.g. 0.321 = 0.2 + 0.1 + 0.02 + 0.001).

  2. Implement dual of group send, i.e. non mixing rounds for breaking arbitrary sized inputs into fixed size, mixable denominations.

  3. When rounds of any kind coincide, be they splitting to fixed denominations, mixing rounds, or group sends, they can be merged by the server into a single coinjoin transaction (merged rounds end by signing the same transaction). The server should still enforce rules for different round types, i.e. unequal amounts only get broken up, normal mixing rounds take fixed denomination inputs and outputs, and group sends of unequal output values are are only constructed from mixed inputs.

  4. Inputs with good anonymity set sizes can be given a negative fees, covering the output creation amount (limited to the first registered output from a previous mixing transaction). This would incentivize "switching network" mixing topologies, allowing already mixed coins to contribute more to the anonymity sets of newly mixed coins.

Anonymity Set for Output Linking Transactions

TBD. Work is ongoing to provide an improved set of anonymity set estimation metrics, to analyze the feasibiltity of removing the post-mix linking restriction, and to address some of the computational complexity of Boltzmann.

See previous revisions of this document for early notes (mostly incorrect and incoherent)

Links & References

@nopara73
Copy link

This is a great set of ideas! So, this is basically making unequal input mixing unequal input mixing with fixed denomination. And in this case we could harness the benefits of groupsend, plus a privacy metric.

https://gist.github.com/nothingmuch/544cdd47dd18ef8fe923b54e0d5ee141#anonimity-set-for-output-linking-transactions

Enable additional fixed denomination mixing rounds

There's no need to divide it to multiple transactions/multiple rounds. It can be done in one transaction more effectively.

Implement group send proposal. If post-mix linking is allowed, then values over 0.1 would be spendable by using multiple mixed txos.

Well, we could put the groupsend proposal into the same mixing transaction again, this would be more economical. Of course this requires larger liquiditiy/fast rounds.

To allow repeated mixing rounds for a given fixed denomination...

Maybe it can work, not sure about this, there are many things to think through.

About the privacy metric calculation. That's too much math for me, I cannot spot the issues with it.

@Varunram
Copy link

Great writeup! A couple questions on the points nopara made above:

There's no need to divide it to multiple transactions/multiple rounds. It can be done in one transaction more effectively.

Pardon me for being a noob here, but how would you do multiple amounts in a single tx? Just with x1+x2+x3 inputs and outputs? (ix x1, x2 and x3 are different denominations)

Else, we could do recursive denomination rounds where if a 0.4 btc user wants to participate in as 0.1 btc pool, we could have him participate 4 times. This would affect the entropy though, since he can be seen participating four times. I don't know about other practicalities involved, someone experiences such as nopara would have to comment on the issues there.

Also a noob question, but what would happen if some group of users all with entropy zero plus a single user want to mix? Do we simply reject such low entropy rounds? Or can we have something like low entropy levels within fixed denominations and then combine them with some high entropy inputs?

We could also have a construction where if a user with an unpopular denomination (say 0.666 btc) has been waiting for > 1 day (can be specified by the user), we split 0.666 btc into 0.5+0.1+0.05 + 0.01 + 0.001*6 (where the first numbers are the other denominations available) and then run rounds for them. This would affect the fee the user pays (and maybe is willing to pay, which would be another option in itself), but I can see this being useful for users, without having to worry about other 0.666 btc people.

@nothingmuch
Copy link
Author

nothingmuch commented Nov 22, 2018

There's no need to divide it to multiple transactions/multiple rounds. It can be done in one transaction more effectively.

Pardon me for being a noob here, but how would you do multiple amounts in a single tx? Just with x1+x2+x3 inputs and outputs? (ix x1, x2 and x3 are different denominations)

I think the main difference is the UX, i.e. does each fixed denomination has a separate participant counter? (i think that's simplest, at least conceptually). What happens when a round has enough participants but another denomination is close to completion? (i think a small delay which would allow a different denomination round to complete, thus combining the two rounds is simplest)

The way the CJJ API works, participants register inputs along with a blinded output and a cleartext change output, and when the round is full, all participants submit unblinded outputs within some time window, after which the unsigned transaction is ready for everyone to sign. there's no technical limitation preventing multiple inputs and outputs by a single entity. to have {x1, x2, x3} as inputs and one output of amount x1+x2+x3 is not possible in the current API since all outputs in a round are assumed to have the same denomination, but the last proposed change would allow that (input registration would issue per-round tokens that can be spent to register blinded outputs)

Else, we could do recursive denomination rounds where if a 0.4 btc user wants to participate in as 0.1 btc pool, we could have him participate 4 times. This would affect the entropy though, since he can be seen participating four times. I don't know about other practicalities involved, someone experiences such as nopara would have to comment on the issues there.

If you mean in one round, then yes, this reduces the anonimity set for everyone, and currently is not allowed (though technically nothing prevents a single entity from splitting up coins into smaller inputs and participating multiple times, it's just not what the wallet software does without modified). If you mean over multiple rounds, then that's what's done now, and such outputs indistinguishable participating multiple times is fine, so long as mixed outputs are not linked.

This is also problematic if the post-mix no linking restriction isn't observed (for example by sending from the post-mix wallet to a wallet which doesn't follow the zerolink spec). For example, suppose you want to mix a large amount, and you participate in a round and then send your change to the next round, and so on. If a payment of this magnitude is later paid out of the transactions that mixed this amount, bin packing analysis can be made on the aggregate input amount, the actual anonymity set is substantially reduced. It's still plausible that some other user consolidated such an amount. Note that in that particular transaction some of the mixed outputs have been linked and identified as such by oxt.me. This observed behaviour is the main motivation for this writeup.

The logic I've been trying to formalize for this recursive calculation is that if the set of all inputs is increased, that becomes less of a problem. however, in the current approach this set's size is linear in the number of rounds. with additional mixing rounds the anonymity set for a given can grow quite large if taking into account all of its sister transactions, and this can compound (see remark about Clos/switching networks in the coinjoin bitcointalk post, several transactions forming a clos network is the ideal form, is essnetially equivalent to one large coinjoin transaction). This increase is should be estimated conservatively (under estimated), and entropy reduction of linking is also estimated conservatively (over estimated), in other words the recursive formula should pessimistically esitmate the per output entropy coming out of a Clos network transaction topology as though those transactions were one giant coinjoin.

Also a noob question, but what would happen if some group of users all with entropy zero plus a single user want to mix? Do we simply reject such low entropy rounds? Or can we have something like low entropy levels within fixed denominations and then combine them with some high entropy inputs?

Both right now and with what i'm proposing that's allowed, entropy is created by introducing indistinguishability among the outputs. Boltzmann's intrinsic entropy and linkability matrix metrics account for all plausible combinations, and zerolink's notion only counts fully indistinguishable inputs.

I think this should remain unchanged, but if the post-mix linking restriction is lifted, the pre-mix entropy can be taken into account as well, in order to justify doing that (and the goal of the recursive formula is to quantify whether or not the pre-mix entropy is sufficient to cover the entropy lost by post-mix linking).

We could also have a construction where if a user with an unpopular denomination (say 0.666 btc) has been waiting for > 1 day (can be specified by the user), we split 0.666 btc into 0.5+0.1+0.05 + 0.01 + 0.001*6 (where the first numbers are the other denominations available) and then run rounds for them. This would affect the fee the user pays (and maybe is willing to pay, which would be another option in itself), but I can see this being useful for users, without having to worry about other 0.666 btc people.

The main reason to use a preferred value series is to reduce the number of possible denominations over the entire range of plausible values, to avoid needing complex coordination for unpopular denominations (DoS protection also adds complexity there). This does create additional chain bloat and UTXO set bloat, and therefore increased the miner fees. The mixing fees scale with the amount mixed, not the data size to dissuate sybil attacks, so those fees should be unaffected.

If the ecash style tokens for output registration thing is implemented, this actually adds another possibility - users can pay extra fees, similar to the join market taker fee, incentivizing others to join a round. But this too adds some complexity to the protocol, and I haven't really thought through the implications

@nothingmuch
Copy link
Author

several transactions forming a clos network is the ideal form

correction: i meant switching networks in general are ideal topology for multi round mixing from a fungibility standpoint, a clos network is a special case of switching networks with 3 stages, which has benefits for a coordinated multi tx coinjoin, but actually results in more chain bloat.

however, in the current approach this set's size is linear in the number of rounds. with additional mixing rounds the anonymity set for a given can grow quite large if taking into account all of its sister transactions, and this can compound

the following scenario is also not discussed in the gist, but should be considered. suppose a user is in control of input 1 which is mixed and resulting in output a. assuming no fees and identical amounts for all inputs/outputs:

     +--+
1 +-->  +-----------> b
     |  |
2 +-->  +-\   +--+
     +--+  \-->  +--> a
              |  |
     +--+  /-->  +--> d
3 +-->  +-/   +--+
     |  |
4 +-->  +-----------> c
     +--+

In this scenario the single tx anonymity set size of a is 2 (assuming a and d are indistinguishable). However, since the inputs to the 2nd mixing transaction each have an anonymity set of size of 2 though, and any of the 4 inputs is a potential predecessor, that is the upper bound for the anonymity set size of a.

if another mixing transaction then applies to the other outputs:

     +--+        +--+
1 +-->  +-------->  +--> a
     |  |        |  |
2 +-->  +-\   /-->  +--> d
     +--+  \ /   +--+
            X
     +--+  / \   +--+
3 +-->  +-/   \-->  +--> b
     |  |        |  |
4 +-->  +-------->  +--> c
     +--+        +--+

how much closer does this bring a's actual anonymity set size to this upper bound? logically i can't convince myself that it makes any difference for a or d, but I feel like i'm missing something.

@nothingmuch
Copy link
Author

JFC when will i learn to spell anonymity o_O

@nothingmuch
Copy link
Author

@Shabnam-Mir shared the knapsack mixing paper with me:

Our algorithm tries to split outputs so that different output sets can be constructed for the same input set. It achieves this by splitting a single output when two transactions are merged. The difference between the sum of the two sets is calculated and one coin of the larger set is split to produce this difference. The new coin with the value of the difference can then be added to the smaller set to produce a new set with the same sum as the originally larger set.
...
However, we notice, that it only decreases the linkability of input-output pairs and not the linkability of input-input pairs. The reason for this is, that our algorithm only produces new output sets with the same sum than existing output sets. Hence, only input sets with the same sum than without mixing will be found. It can be easily fixed by randomizing the input sets before applying the output splitting algorithm

When splitting or consolidating fixed denominations from a preferred value series, this property of a mixing transaction arises naturally so long as the sum amounts are of roughly the same magnitude.

Furthermore, with at least one round of mixing per fixed denomination as an intermediate stage, the search space extends to the space of all partitions of all inputs of the mixing mixing transactions. In other words, their computational hardness argument is extended to a potentially significantly larger set of inputs, when analyzing these multi-round transactions as a switching network topology.

@nothingmuch
Copy link
Author

for anyone reading the comments, note that as of revision 6 the section about the anonymity set size and entropy calculation has been removed due to improved understanding of the problem, they are planned to be reinstated later when progress has been made to formalize improved metrics.

@Shabnam-Mir
Copy link

Shabnam-Mir commented Dec 16, 2018

this paper is about anonymity metrics. It may be helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment