@DanGould
Last active March 5, 2024 22:06
Unnecessary Input Heuristic (UIH) in Payjoin

Unnecessary Input Heuristic

PayJoins help bitcoin users preserve their privacy because they introduce ambiguity into blockchain analysis. Today's PayJoins are transactions with many inputs and two outputs. Because multiple users can contribute inputs to one transaction, PayJoin breaks the oldest assumption in blockchain analysis: that all inputs to a transaction belong to the same owner. Still, lesser-known "unnecessary input heuristics" (UIH) can identify potential PayJoins:

UIH 1: One output is smaller than any input. Assume that output is change.

Today's PayJoin clients avoid creating transactions that conform to UIH 1. The heuristic's assumption is usually right. It is wrong only when the change output is worth much more than the payment, which is unusual. BTCPayServer will not create transactions of this type.

UIH 2: The largest output can be funded without the smallest input.

Because there are enough funds to exclude the smallest input and still create the largest output, the smallest input is unnecessary. Some UIH 2 conforming transactions may be "internal address reuse, i.e., spending the UTXOs of the same address... indicating that they are not PayJoin transactions" [1]. Still more software creates UIH 2 conformers, perhaps to consolidate inputs when block space is cheap.
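The two definitions above can be written as simple predicates. This is a minimal sketch, not code from any PayJoin client: the function names `conforms_uih1` and `conforms_uih2` are hypothetical, amounts are plain integers (for example satoshis), and the `fee` parameter is an assumption added so the check can account for miner fees if desired.

```python
from typing import Sequence


def conforms_uih1(inputs: Sequence[int], outputs: Sequence[int]) -> bool:
    """UIH 1: some output is smaller than every input."""
    return min(outputs) < min(inputs)


def conforms_uih2(inputs: Sequence[int], outputs: Sequence[int], fee: int = 0) -> bool:
    """UIH 2: the inputs other than the smallest still cover the largest output."""
    if len(inputs) < 2:
        return False  # with a single input there is nothing to leave out
    return sum(inputs) - min(inputs) >= max(outputs) + fee


# Illustrative transaction (hypothetical amounts in BTC, fees ignored):
# inputs of 2, 3 and 5 funding outputs of 4 and 6.
inputs = [2, 3, 5]
outputs = [4, 6]
print(conforms_uih1(inputs, outputs))  # False: no output is below the 2 BTC input
print(conforms_uih2(inputs, outputs))  # True: 3 + 5 covers the 6 BTC output
```

Ignoring fees keeps the arithmetic readable. A real check would pass the actual fee, since a transaction that barely funds its largest output may stop conforming once fees are counted.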

There may be other reasons software creates UIH conforming transactions. Understanding them will help us understand how to defeat the heuristics. Fewer and fewer transactions in new blocks have only 2 outputs. These 2-output transactions provide an anonymity set of ~15% of all bitcoin transactions as of 2020 [1]. Beyond PayJoin, batched transactions with more than 2 outputs account for a greater share of block space over time. Therefore PayJoins that create more outputs might have a larger anonymity set to hide in and improve privacy for everyone else.

Avoiding UIH

Consider this bitcoin transaction. It has two inputs worth 2 BTC and 3 BTC and two outputs worth 4 BTC and 1 BTC.

// ⚠️ Conforms to UIH 1 & UIH2
2 btc --> 4 btc
3 btc     1 btc

Assume one of the outputs is change and the other output is the payment. The payment output is either the 4 BTC output or the 1 BTC output. But if the 1 BTC output is the payment, then the 3 BTC input is unnecessary: the 2 BTC input alone could have funded two 1 BTC outputs and paid lower miner fees for doing so. This indicates that the real payment output is 4 BTC and that 1 BTC is the change output. However, we can't rule out that 1 BTC was the payment and the 2 BTC and 3 BTC inputs were consolidated. Such consolidation is more likely in enterprise setups.
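The reasoning above can be sketched as a small check that tries each output as the payment and lists the inputs that could have been left out. The helper names `unnecessary_inputs` and `interpretations` are hypothetical, fees are ignored, and only single-input removals are considered, so this illustrates the argument rather than a complete analysis.

```python
from typing import List, Sequence, Tuple


def unnecessary_inputs(inputs: Sequence[int], payment: int) -> List[int]:
    """Inputs that could each be dropped while the rest still cover the payment."""
    total = sum(inputs)
    return [value for value in inputs if total - value >= payment]


def interpretations(
    inputs: Sequence[int], outputs: Sequence[int]
) -> List[Tuple[int, List[int]]]:
    """Treat each output in turn as the payment and report droppable inputs."""
    return [(payment, unnecessary_inputs(inputs, payment)) for payment in outputs]


# The 2 + 3 -> 4 + 1 transaction from the text (amounts in BTC).
print(interpretations([2, 3], [4, 1]))
# [(4, []), (1, [2, 3])]
```

Reading the result: if 4 BTC is the payment, every input is needed, while if 1 BTC is the payment, either input alone would have been enough. An analyst therefore favors the interpretation with no unnecessary inputs.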

Let's take another example.

2 btc --> 4 btc
3 btc     6 btc
5 btc

When another input is added, both payment interpretations have unnecessary inputs. While adding inputs costs more in miner fees, it can be a proactive way to consolidate UTXOs. Those who are sensitive to the fee market can consolidate during low-fee hours, which saves them fees during high-fee hours.

// 🛡 Conforms only to UIH 2
2 btc --> 3 btc
3 btc     1 btc
          1 btc

This is an issue for transactions which have more than one input. One way to fix this leak is to add more inputs until the change output is higher than any input, as in the example above with 2, 3, and 5 BTC inputs, where the 6 BTC output exceeds every input.
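Adding inputs until the change exceeds every input can be sketched as a loop. This is an illustration under stated assumptions, not how any wallet selects coins: the function name `add_inputs_until_change_dominates` is hypothetical, fees are ignored, the starting inputs are assumed to be non-empty and to cover the payment, and candidates are tried smallest first, which is an arbitrary choice.

```python
from typing import List, Optional, Sequence


def add_inputs_until_change_dominates(
    inputs: Sequence[int], payment: int, candidates: Sequence[int]
) -> Optional[List[int]]:
    """Add candidate UTXOs until change (inputs - payment) exceeds every input.

    Returns the final input list, or None if the candidates run out first.
    """
    selected = list(inputs)
    pool = sorted(candidates)  # smallest first; an arbitrary ordering
    while sum(selected) - payment <= max(selected):
        if not pool:
            return None  # no way to reach the goal with these candidates
        selected.append(pool.pop(0))
    return selected


# Starting from 2 + 3 BTC paying 4 BTC, with one spare 5 BTC UTXO available.
print(add_inputs_until_change_dominates([2, 3], 4, [5]))  # [2, 3, 5]
```

With the 5 BTC input added, the change becomes 6 BTC, which is larger than any single input. Each added input also raises the fee, so this trades cost for ambiguity.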

I posit that a PayJoin receiver could run the Knapsack paper's [2] dynamic output splitting algorithm, and perhaps the input selection algorithm, on the Original UTXO and their own inputs to create PayJoins with more outputs and more ambiguity. In cases where the Sender needs to merge multiple inputs to make a payment, dynamic output splitting, together with input shuffling, could prevent input-input links.

Knapsack assumes subtransactions that maintain equal input-output value. PayJoin, of course, involves a differing but balanced input-output value among subtransactions. In technical terms, every equal-output CoinJoin probably does include a small transfer from one party to another because of mining fees: some pay more than others. An open question is at what point that marginal difference has a material effect on transaction interpretation.
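To make the notion of subtransactions with equal input-output value concrete, the brute-force sketch below lists every way a proper subset of inputs sums to exactly the same value as a proper subset of outputs. It is not the Knapsack paper's algorithm, only an illustration of the balance property it assumes. The names `proper_subsets` and `balanced_subtransactions` are hypothetical, fees are ignored, and the search is exponential, so it only suits small transactions.

```python
from itertools import combinations
from typing import Iterator, List, Sequence, Tuple

Indexes = Tuple[int, ...]


def proper_subsets(n: int) -> Iterator[Indexes]:
    """Yield index tuples for every non-empty, non-full subset of range(n)."""
    for size in range(1, n):
        yield from combinations(range(n), size)


def balanced_subtransactions(
    inputs: Sequence[int], outputs: Sequence[int]
) -> List[Tuple[Indexes, Indexes]]:
    """Pairs of (input indexes, output indexes) whose values sum to the same amount."""
    found = []
    for in_idx in proper_subsets(len(inputs)):
        in_sum = sum(inputs[i] for i in in_idx)
        for out_idx in proper_subsets(len(outputs)):
            if in_sum == sum(outputs[j] for j in out_idx):
                found.append((in_idx, out_idx))
    return found


# 2 + 3 -> 3 + 1 + 1 splits cleanly: input 0 pays outputs 1 and 2, input 1 pays output 0.
print(balanced_subtransactions([2, 3], [3, 1, 1]))
# [((0,), (1, 2)), ((1,), (0,))]

# 2 + 3 -> 4 + 1 has no balanced split, because value moves between the parties.
print(balanced_subtransactions([2, 3], [4, 1]))
# []
```

The second result reflects the difference noted above: when a payment moves value from one party to another, no subset of inputs matches a subset of outputs exactly, so an equal-value model needs adapting before it can be applied to PayJoin.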

Output splitting

Could a dynamically split 3-way PayJoin plus dual-funded channel have effectively 0 toxic change? In this setup a PayJoin Sender funds a Receiver's channel funding transaction, which in turn responds to a channel counterparty's liquidity ad in order to dual fund. For the sake of change heuristics, the miner fee is effectively another possible subtransaction output.

Naive Merging

I've got a hypothesis regarding the merging of transactions. Take two transactions that defeat the Unnecessary Input Heuristic and merge them. Can the sum of the outputs be split using the same algorithm to defeat a macro-UIH that separates the links created by an unnecessary input cluster? I wonder if this is the same "you split, I choose" algorithm used for sharing a piece of cake.


  1. S. Ghesmati, A. Kern, A. Judmayer, N. Stifter and E. Weippl, "Unnecessary Input Heuristics and PayJoin Transactions," HCI International 2021 - Posters, 24-29 July 2021, Virtual Event, 2021.

  2. F. K. Maurer, T. Neudecker and M. Florian, "Anonymous CoinJoin Transactions with Arbitrary Values," 2017 IEEE Trustcom/BigDataSE/ICESS, 2017, pp. 522-529, doi: 10.1109/Trustcom/BigDataSE/ICESS.2017.280.

  3. https://blog.bitgo.com/utxo-management-for-enterprise-wallets-5357dad08dd1
