Receive private payments from anyone on a single static address without requiring any interaction or extra on-chain overhead.
The recipient generates a so-called silent payment address and makes it publicly known. The sender then takes a public key from one of their chosen inputs for the payment, and uses it to derive a shared secret that is then used to tweak the silent payment address. The recipient detects the payment by scanning every transaction in the blockchain.
Compared to previous schemes1, this scheme avoids using the Bitcoin blockchain as a messaging layer2 and requires no interaction between sender and recipient3 (other than needing to know the silent payment address). The main downsides are the scanning requirement, the lack of light client support, and the requirement to control your own input(s). An example use case would be private one-time donations.
While most of the individual parts of this idea aren't novel, the resulting protocol has never been seriously considered and may be reasonably viable, particularly if we limit ourselves to detecting only unspent payments by scanning the UTXO set. We'll start by describing a basic scheme, and then introduce a few improvements.
The recipient publishes their silent payment address, a single 32 byte public key:
X = x*G
The sender chooses an input containing a public key:
I = i*G
The sender tweaks the silent payment address with the private key that corresponds to their chosen input:
X' = hash(i*X)*G + X
i*X == x*I (Diffie-Hellman Key Exchange), the recipient can detect the payment by calculating
hash(x*I)*G + X for each input key
I in the blockchain and seeing if it matches an output in the corresponding transaction.
UTXO set scanning
If we forgo detection of historic transactions and only focus on the current balance, we can limit the protocol to only scanning the transactions that are part of the UTXO set when restoring from backup, which may be faster.
Jonas Nick was kind enough to go through the numbers and run a benchmark of
hash(x*I)*G + X on his 3.9GHz Intel(R) Core(TM) i7-7820HQ CPU, which took roughly 72 microseconds per calculation on a single core. The UTXO set currently has 80 million entries, the average transaction has 2.3 inputs, which puts us at
2.3*80000000*72/1000/1000/60 = 221 minutes for a single core (under 2 hours for two cores).
What these numbers do not take into account is database lookups. We need to fetch the transaction of every UTXO, as well as every transaction for every subsequent input in order to extract the relevant public key, resulting in
(1+2.3)*80000000 = 264 million lookups. How slow this is and what can be done to improve it is an open question.
Once we're at the tip, every new unspent output will have to be scanned. It's theoretically possible to scan e.g. once a day and skip transactions with fully spent outputs, but that would probably not be worth the added complexity. If we only scan transactions with taproot outputs, we can further limit our efforts, but this advantage is expected to dissipate once taproot use becomes more common.
Variant using all inputs
Instead of tweaking the silent payment address with one input, we could instead tweak it with the combination of all input keys of a transaction. The benefit is that this further lowers the scanning cost, since now we only need to calculate one tweak per transaction, instead of one tweak per input, which is roughly half the work, though database lookups remain unaffected.
The downside is that if you want to combine your inputs with those of others (i.e. coinjoin), every participant has to be willing to assist you in following the Silent Payment protocol in order to let you make your payment. There are also privacy considerations which are discussed in the "Preventing input linkage" section.
Concretely, if there are three inputs (
I1, I2, I3), the scheme becomes:
hash(i1*X + i2*X + i3*X)*G + X == hash(x*(I1+I2+I3))*G + X.
We can extend the silent payment address with a scanning key, which allows for separation of detecting and spending payments. We redefine the silent payment address as the concatenation of
X_scan, X_spend, and derivation becomes
X' = hash(i*X_scan)*G + X_spend. This allows your internet-connected node to hold the private key of
X_scan to detect incoming payments, while your hardware wallet controls
X_spend to make payments. If
X_scan is compromised, privacy is lost, but your funds are not.
Address reuse prevention
If the sender sends more than one payment, and the chosen input has the same key due to address reuse, then the recipient address will also be the same. To prevent this, we can hash the
index of the input, to ensure each address is unique, resulting in
X' = hash(i*X,txid,index)*G + X. Note this would make light client support harder (edit: not necessarily, see here).
Light clients cannot easily be supported due to the need for scanning. The best we could do is give up on address reuse prevention (so we don't require the
index), only consider unspent taproot outputs, and download a standardized list of relevant input keys for each block over wifi each night when charging. These input keys can then be tweaked, and the results can be matched against client-side block filters. Possible, but not simple. (edit: some more ideas how to do light client support here)
Effect on BIP32 HD keys
One side-benefit of silent payments is that BIP32 HD keys4 won't be needed for address generation, since every address will automatically be unique. This also means we won't have to deal with a gap limit.
While the simplest thing would be to only support one input type (e.g. taproot key spend), this would also mean only a subset of users can make payments to silent addresses, so this seems undesirable. The protocol should ideally support any input containing at least one public key, and simply pick the first key if more than one is present.
Pay-to-(witness-)public-key-hash inputs actually end up being easiest to scan, since the public key is present in the input script, instead of the output script of the previous transaction (which requires one extra transaction lookup).
Signature nonce instead of input key
Another consideration was to tweak the silent payment address with the signature nonce5, but unfortunately this breaks compatibility with MuSig2 and MuSig-DN, since in those schemes the signature nonce changes depending on the transaction hash. If we let the output address depend on the nonce, then the transaction hash will change, causing a circular reference.
Sending wallet compatibility
Any wallet that wants to support making silent payments needs to support a new address format, pick inputs for the payment, tweak the silent payment address using the private key of one of the chosen inputs, and then proceed to sign the transaction. The scanning requirement is not relevant to the sender, only the recipient.
Preventing input linkage
A potential weakness of Silent Payments is that the input is linked to the output. A coinjoin transaction with multiple inputs from other users can normally obfuscate the sender input from the recipient, but Silent Payments reveal that link. This weakness can be mitigated with the "variant using all inputs", but this variant introduces a different weakness – you now require all other coinjoin users to tweak the silent payment address, which means you're revealing the intended recipient to them.
Luckily, a blinding scheme6 exists that allows us to hide the silent payment address from the other participants. Concretely, let's say there are two inputs,
I2, and the latter one is ours. We add a secret blinding factor to the silent payment address,
X + blinding_factor*G = X', then we receive
X1' = i1*X' (together with a DLEQ to prove correctness, see full write-up6) from the owner of the first input and remove the blinding factor with
X1' - blinding_factor*I1 = X1 (which is equal to
i1*X). Finally, we calculate the tweaked address with
hash(X1 + i2*X)*G + X. The recipient can simply recognize the payment with
hash(x*(I1+I2))*G + X. Note that the owner of the first input cannot reconstruct the resulting address because they don't know
The blinding protocol above solves our coinjoin privacy concerns (at the expense of more interaction complexity), but we're left with one more issue – what if you want to make a silent payment, but you control none of the inputs (e.g. sending from an exchange)? In this scenario we can still utilize the blinding protocol, but now the third party sender can try to uncover the intended recipient by brute forcing their inputs on all known silent payment addresses (i.e. calculate
hash(i*X)*G + X for every publicly known
X). While this is computationally expensive, it's by no means impossible. No solution is known at this time, so as it stands this is a limitation of the protocol – the sender must control one of the inputs in order to be fully private.
These are the most important protocols that provide similar functionality with slightly different tradeoffs. All of them provide fresh address generation and are compatible with one-time seed backups. The main benefits of the protocols listed below are that there is no scanning requirement, better light client support, and they don't require control over the inputs of the transaction.
Payment code sharing
This is BIP472. An OP_RETURN message is sent on-chain to the recipient to establish a shared secret prior to making payments. Using the blockchain as a messaging layer like this is generally considered an inefficient use of on-chain resources. This concern can theoretically be alleviated by using other means of communicating, but data availability needs to be guaranteed to ensure the recipient doesn't lose access to the funds. Another concern is that the input(s) used to establish the shared secret may leak privacy if not kept separate.
Upon first payment, hand out a fresh xpub instead of an address in order to enable repeat payments. I believe Kixunil's recently published scheme3 is equivalent to this and could be implemented with relative ease. It's unclear how practical this protocol is, as it assumes sender and recipient are able to interact once, yet subsequent interaction is impossible.
Regular address sharing
This is how Bitcoin is commonly used today and may therefore be obvious, but it does satisfy similar privacy requirements. The sender interacts with the recipient each time they want to make a payment, and requests a new address. The main downside is that it requires interaction for every single payment.
- Exactly how slow are the required database lookups? Is there a better approach?
- Is there any way to make light client support more viable?
- What is preferred – single input tweaking (revealing an input to the recipient) or using all inputs (increased coinjoin complexity)?
- Are there any security issues with the proposed cryptography?
- In general, compared to alternatives, is this scheme worth the added complexity?
Thanks to Kixunil, Calvin Kim, and Jonas Nick, holihawt and Lloyd Fournier for their help/comments, as well as all the authors of previous schemes. Any mistakes are my own.
There is also a discussion of this scheme on the bitcoin-dev mailing list.
Stealth Payments, Peter Todd: https://github.com/genjix/bips/blob/master/bip-stealth.mediawiki
BIP47 payment codes, Justus Ranvier: https://github.com/bitcoin/bips/blob/master/bip-0047.mediawiki
Reusable taproot addresses, Kixunil: https://gist.github.com/Kixunil/0ddb3a9cdec33342b97431e438252c0a
BIP32 HD keys, Pieter Wuille: https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki
2020-01-23 ##taproot-bip-review, starting at 18:25: https://gnusha.org/taproot-bip-review/2020-01-23.log
Blind Diffie-Hellman Key Exchange, David Wagner: https://gist.github.com/RubenSomsen/be7a4760dd4596d06963d67baf140406
Thank you @RubenSomsen for the proposal and kudos @w0xlt for the implementation.
We were discussing this proposal with folks at Trezor and here are our takes:
Address format rough proposal:
00 || X_spend(don't use scanning key, don't use address reuse prevention)
01 || X_spend || X_scan(use scanning key, don't use address reuse prevention)
02 || X_spend(don't use scanning key, use address reuse prevention)
03 || X_spend || X_scan(use scanning key, use address reuse prevention)
For efficient implementation of scanning it would be great if we were able to come up with a scheme where only one input pubkey is required. (In this case we could just enrich each entry of the UTXO set with one particular pubkey). It could trivially be always the first input of the TX, but this unfortunately breaks the BIP69 lexicographical ordering of inputs. Need more thoughts on this.